Data Models

This library uses two primary data models defined via pydantic: fairical.Scores and fairical.Solutions. These models form the backbone of the multi-objective fairness assessment workflow for machine learning (ML) systems.

The fairical.Scores model holds the raw outputs of a machine learning system, which you must provide to the library. Multiple sets of scores represent different operating modes (utility/fairness trade-offs) of the system. Operating modes correspond to utility-fairness trade-offs that can be adjusted a posteriori, after the system has been trained. Examples of such adjustments include the threshold at which samples are classified as positive or negative, or any other selection mechanism that can be tuned a posteriori to modify the system's behaviour (e.g. Pareto Hyper-networks, Navon et al. [NSCF], or You-Only-Train-Once models, Dosovitskiy and Djolonga [DD20]).

The fairical.Solutions model holds intermediary data produced by this library. It carries information about two or more performance metrics (utility or fairness) for each operating mode (utility/fairness trade-off) of the analysed ML system.
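To make the relationship between raw scores and per-mode metrics concrete, the following is a minimal, self-contained sketch in plain Python. It does not use fairical: the function names, the dictionary keys, the toy data, and the choice of accuracy and demographic parity difference as the utility and fairness metrics are all assumptions made for illustration. Each decision threshold is treated as one operating mode, and each mode yields one utility value and one fairness value.

```python
# Illustrative sketch only: none of these names come from fairical.
# Each decision threshold is treated as one operating mode.


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the ground truth."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def demographic_parity_difference(y_pred, groups):
    """Largest gap in positive-prediction rate between protected groups."""
    rates = []
    for group in sorted(set(groups)):
        members = [p for p, g in zip(y_pred, groups) if g == group]
        rates.append(sum(members) / len(members))
    return max(rates) - min(rates)


def evaluate_operating_modes(scores, y_true, groups, thresholds):
    """Return one (utility, fairness) record per threshold."""
    results = []
    for threshold in thresholds:
        # Binarise the raw scores at this operating mode's threshold.
        y_pred = [int(s >= threshold) for s in scores]
        results.append(
            {
                "threshold": threshold,
                "accuracy": accuracy(y_true, y_pred),
                "dp_difference": demographic_parity_difference(y_pred, groups),
            }
        )
    return results


# Toy data (assumed): raw scores, ground truth and one protected attribute.
scores = [0.9, 0.8, 0.55, 0.2, 0.6, 0.45, 0.3, 0.1]
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

modes = evaluate_operating_modes(scores, y_true, groups, [0.4, 0.5, 0.7])
for mode in modes:
    print(mode)
```

The list of records produced here plays a role loosely analogous to the intermediary data described above: one entry per operating mode, each carrying two metrics. The actual structure of fairical.Solutions may differ.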

This library assumes you can create a JSON or Python representation of fairical.Scores directly from your ML framework. You can then use either the Python API or the Command-line Apps to transform fairical.Scores into fairical.Solutions, and then into tabulated results and plots, as discussed in Using Fairical.
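As a rough illustration of exporting scores from an ML framework to JSON, the sketch below writes and re-reads a record using only the standard library. The field names ("scores", "ground_truth", "attributes") and the helper check_aligned are hypothetical and are not the fairical.Scores schema; consult the model definition for the actual field names and types.

```python
import json

# Hypothetical layout, for illustration only. These keys are NOT the
# fairical.Scores schema.
payload = {
    "scores": [0.9, 0.8, 0.55, 0.2],
    "ground_truth": [1, 1, 0, 0],
    "attributes": {"gender": ["f", "m", "f", "m"]},
}


def check_aligned(record):
    """Check that every per-sample list has the same length; return it."""
    n = len(record["scores"])
    if len(record["ground_truth"]) != n:
        raise ValueError("ground_truth length does not match scores")
    for name, values in record["attributes"].items():
        if len(values) != n:
            raise ValueError(f"attribute {name!r} length does not match scores")
    return n


# Serialise to a JSON string, then parse it back to confirm the round trip.
text = json.dumps(payload, indent=2)
restored = json.loads(text)
n_samples = check_aligned(restored)
```

Checking that scores, ground truth and protected attributes are aligned per sample before export catches a common source of silent errors; a pydantic model would typically enforce such constraints at validation time.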

The data model implemented in this package is summarized in the following figure:

In the next section, we explore use cases that illustrate how to use the Python API and the Command-line Apps to convert dataset scores, ground truth and protected attributes into summary tables and visualisations.