Data Models
This library uses two primary data models defined via pydantic:
fairical.Scores and fairical.Solutions. These models form the
backbone of the multi-objective fairness assessment workflow for machine learning (ML)
systems.
The fairical.Scores model holds the raw outputs of a machine learning
system, which you must provide to the library. Each set of scores represents one
operating mode of that system, that is, one utility-fairness trade-off.
Operating modes can be adjusted a posteriori, after the system has been trained.
Examples of such adjustments include the threshold at which samples are classified
as positive or negative, or any other selection mechanism that can be tuned
a posteriori to modify the system's behaviour (e.g. Pareto Hyper-networks,
Navon et al. [NSCF], or You-Only-Train-Once models,
Dosovitskiy and Djolonga [DD20]).
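To make the idea of an operating mode concrete, the sketch below sweeps a decision threshold over one fixed set of scores and reports a utility metric (accuracy) and a fairness gap (difference in positive-prediction rate between two groups) for each threshold. It is plain Python and does not use the fairical API; the toy data, the function names and the choice of metrics are assumptions made for illustration only.

```python
# Illustrative only: plain Python, not the fairical API.
# Toy scores (probability of the positive class), ground truth, and a
# binary protected attribute for eight samples.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 1, 0, 0]
groups = ["a", "a", "b", "a", "b", "b", "a", "b"]


def accuracy(predictions, labels):
    """Fraction of samples whose prediction matches the ground truth."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def positive_rate_gap(predictions, groups):
    """Absolute difference in positive-prediction rate between groups a and b."""
    rates = {}
    for name in ("a", "b"):
        members = [p for p, g in zip(predictions, groups) if g == name]
        rates[name] = sum(members) / len(members)
    return abs(rates["a"] - rates["b"])


def evaluate(threshold):
    """Return (utility, fairness gap) for one operating mode."""
    predictions = [int(s >= threshold) for s in scores]
    return accuracy(predictions, labels), positive_rate_gap(predictions, groups)


# Each threshold is one operating mode of the same trained system.
operating_modes = {t: evaluate(t) for t in (0.25, 0.5, 0.75)}
for threshold, (utility, gap) in operating_modes.items():
    print(f"threshold={threshold}: accuracy={utility:.3f}, gap={gap:.3f}")
```

On this toy data, the thresholds 0.25 and 0.5 reach the same accuracy (0.875) but differ in fairness gap (0.0 versus 0.5), while 0.75 lowers accuracy to 0.625. The system is never retrained; only the a posteriori setting changes.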
The fairical.Solutions model holds intermediary data produced by this
library. It carries information about two or more performance metrics (utility or
fairness) for each operating mode (utility/fairness trade-off) of the analysed ML
system.
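Once each operating mode has two or more metrics, a common way to compare modes in multi-objective analysis is to keep only those that no other mode beats on every metric. The sketch below shows that filter on a hypothetical list of (utility, fairness) tuples. The data layout and helper names are assumptions, not the fairical.Solutions schema, and the text above does not state whether the library performs this step in the same way.

```python
# Hypothetical stand-in for per-mode metrics; not the fairical.Solutions schema.
# Each tuple is (utility, fairness) for one operating mode, with both metrics
# oriented so that larger is better.
solutions = [
    (0.90, 0.60),
    (0.85, 0.80),
    (0.80, 0.75),  # worse than (0.85, 0.80) on both metrics
    (0.70, 0.95),
]


def dominates(p, q):
    """True if p is at least as good as q on all metrics and better on one."""
    at_least_as_good = all(a >= b for a, b in zip(p, q))
    strictly_better = any(a > b for a, b in zip(p, q))
    return at_least_as_good and strictly_better


def non_dominated(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


front = non_dominated(solutions)
print(front)
```

The filter drops the third mode, because the second one is better on both metrics, and keeps the other three as mutually incomparable trade-offs.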
This library assumes you can create a JSON or Python representation of
fairical.Scores directly from your ML framework. You can then use either the
Python API or the Command-line Apps to transform fairical.Scores into
fairical.Solutions, and then into tabulated results and plots, as discussed in
Using Fairical.
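As a rough illustration of exporting scores to JSON from an ML framework, the sketch below writes and re-reads a small record with the standard json module. The field names (modes, scores, ground_truth, protected) are placeholders invented for this example; the actual layout is whatever fairical.Scores defines, so consult that model before producing real files.

```python
import json

# Hypothetical layout: the field names below are placeholders chosen for this
# sketch. The real schema is whatever fairical.Scores defines.
raw = {
    "modes": [
        {
            "name": "mode-0",
            "scores": [0.9, 0.2, 0.7],
            "ground_truth": [1, 0, 1],
            "protected": {"gender": ["f", "m", "f"]},
        }
    ]
}


def check_lengths(mode):
    """Raise ValueError unless every per-sample list has the same length."""
    n = len(mode["scores"])
    lengths = [len(mode["ground_truth"])]
    lengths += [len(values) for values in mode["protected"].values()]
    if any(length != n for length in lengths):
        raise ValueError("per-sample lists differ in length")
    return n


# Validate before serialising, then round-trip through a JSON string.
sample_counts = [check_lengths(mode) for mode in raw["modes"]]
text = json.dumps(raw, indent=2)
restored = json.loads(text)
print(sample_counts, restored == raw)
```

Checking that scores, ground truth and protected attributes have one entry per sample before writing the file catches the most common export mistake early, independently of the final schema.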
The data model implemented in this package is summarized in the following figure:
In the next section, we explore use cases that show how to use the Python API and Command-line Apps to convert dataset scores, ground truth and protected attributes into summary tables and visualisations.