An Experimental Framework for Measuring Model Fairness and Stability in the Machine Learning Lifecycle


dc.contributor.author Herasymuk, Denys
dc.date.accessioned 2024-02-14T14:00:56Z
dc.date.available 2024-02-14T14:00:56Z
dc.date.issued 2023
dc.identifier.citation Herasymuk, Denys. An Experimental Framework for Measuring Model Fairness and Stability in the Machine Learning Lifecycle / Denys Herasymuk; Supervisor: Julia Stoyanovich; Ukrainian Catholic University, Department of Computer Sciences. – Lviv: 2023. – 70 p.: ill. uk
dc.identifier.uri https://er.ucu.edu.ua/handle/1/4461
dc.language.iso en uk
dc.title An Experimental Framework for Measuring Model Fairness and Stability in the Machine Learning Lifecycle uk
dc.type Preprint uk
dc.status Published for the first time uk
dc.description.abstracten Model fairness and stability are tightly dependent both on the input data and on the choices and practices in the machine learning (ML) lifecycle. Errors in the ML pipeline can be classified as exogenous data errors (e.g., outliers, missing values, or incorrect labels) and endogenous modeling or processing errors (e.g., incorrect or suboptimal choices of outlier detection or missing value imputation techniques). To anticipate and mitigate detrimental downstream consequences, it is crucial to understand the impact of various exogenous data errors on model performance during model development. Moreover, these data errors may also occur in the model serving flow, so model developers need to know how their models behave under exogenous data errors in order to choose the one that remains robust and fair in the production setup. Thus, the focus of this work is on quantifying the impact of exogenous errors on model performance, both during development and post-deployment.

There is currently no open-source toolkit for measuring the impact of controlled data errors on model performance in terms of both fairness and stability. To address this gap, we develop two software libraries in this thesis, Virny and MLcF. Virny is a dedicated library for auditing model fairness and stability, and plugs into MLcF, which takes a broader lifecycle view of model performance under different errors (exogenous vs. endogenous). To showcase the utility of these libraries, we conduct extensive stress-testing of different model architectures by systematically injecting controlled data errors. These experiments allow us to draw interesting insights about the impact of data errors on classifier performance, in terms of accuracy, stability, and fairness. uk
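The abstract's central methodology, stress-testing a classifier by injecting controlled exogenous data errors and then measuring accuracy, stability, and fairness, can be illustrated with a short sketch. This is a minimal illustration and not the Virny or MLcF API: the function names, the mean-imputation step, the majority-vote stability measure, and the per-group accuracy gap are all hypothetical stand-ins, assuming a binary classification task with 0/1 labels and numeric features.

# Minimal sketch of controlled error injection and fairness/stability measurement.
# All names are hypothetical; this is not the Virny or MLcF API.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

def inject_missing(df: pd.DataFrame, column: str, rate: float) -> pd.DataFrame:
    """Exogenous error injection: set a random `rate` fraction of `column` to NaN."""
    corrupted = df.copy()
    mask = rng.random(len(corrupted)) < rate
    corrupted.loc[mask, column] = np.nan
    return corrupted

def stress_test(df, features, label, group_col, error_col, rate, n_boot=10):
    train, test = train_test_split(df, test_size=0.3, random_state=0)
    # Corrupt the held-out ("serving") data at a controlled error rate.
    test = inject_missing(test, error_col, rate)
    # Endogenous pipeline choice under study: naive mean imputation of the nulls.
    test[error_col] = test[error_col].fillna(train[error_col].mean())

    # Train one model per bootstrap sample to probe stability.
    preds = []
    for _ in range(n_boot):
        seed = int(rng.integers(1_000_000))
        boot = train.sample(frac=1.0, replace=True, random_state=seed)
        model = LogisticRegression(max_iter=1000).fit(boot[features], boot[label])
        preds.append(model.predict(test[features]))
    preds = np.array(preds)  # shape (n_boot, n_test); assumes 0/1 labels

    majority = (preds.mean(axis=0) >= 0.5).astype(int)
    # Stability proxy: mean agreement of bootstrap predictions with the majority vote.
    stability = float((preds == majority).mean())
    accuracy = accuracy_score(test[label], majority)
    # Fairness proxy: per-group accuracy of the majority-vote classifier.
    group_acc = {
        g: accuracy_score(sub[label], majority[(test[group_col] == g).to_numpy()])
        for g, sub in test.groupby(group_col)
    }
    return accuracy, stability, group_acc

Sweeping `rate` over a grid (e.g., 0.1 to 0.5) reproduces, in miniature, the kind of controlled stress-testing the thesis performs; Virny's actual metric definitions may differ from these simple proxies.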

