I/O Formats
Pydantic models for fairical.Scores and fairical.Solutions can
be created from, or saved to, JSON representations stored on disk. When
loaded, a JSON representation is validated against the schema defined for
the corresponding object.
Scores
As indicated in Data Models, fairical.Scores is the
primary input to fairical. A valid JSON representation of fairical.Scores
consists of a dictionary composed of 3 keys:
scores: A list of prediction scores for each Machine Learning (ML) model under consideration. Each model (or utility/fairness trade-off) is represented by a single list of floats. A system that has multiple models (utility/fairness trade-offs) is therefore represented by a list of lists of floats. An example of scores from a single-mode model, with predictions for 9 samples, is shown below:
{
"scores": [
[ 0.3970, 0.3434, 0.7074, 0.5787, 0.2451, 0.6383, 0.5937, 0.3629, 0.5526 ]
]
}
Caution
Even in a single-mode model, prediction scores must be encapsulated within a list, so that the structure is represented as "scores": [[...]].
Here is an example of a two-mode model, with predictions for 6 samples:
{
"scores": [
[ 0.646910, 0.442378, 0.342101, 0.457198, 0.549640, 0.331999 ],
[ 0.623593, 0.453462, 0.370829, 0.466496, 0.542477, 0.359879 ]
]
}
In this example, scores for the same underlying sample are paired across the two lists. The first entry of each list corresponds to the first sample in a list of test samples, the second to the second, and so on.
ground-truth: A list of ground-truth labels paired to each sample in the list or lists in scores. Given a binary classification task, an example of a ground-truth list with nine test samples is shown below:
{
"ground-truth": [ 1, 0, 0, 0, 0, 0, 0, 1, 0 ]
}
attributes: A dictionary of sensitive attributes, each mapped to a list of values composed of integers or strings. As with ground-truth, each list must be paired with the entries in scores and ground-truth. An example of an attributes dictionary for gender and race demographic groups with nine test samples is shown below:
{
"attributes": {
"gender": [ "f", "m", "f", "m", "f", "f", "f", "m", "m" ],
"race": [ 2, 0, 0, 1, 0, 0, 1, 0, 2 ]
}
}
This corresponds to a classification task where gender {female, male} is encoded as two categories {"f", "m"} and race {Asian, Black, White} as three categories {0, 1, 2}.
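Because the three keys are aligned by position, the entries found at a given index all describe the same test sample. The following is a minimal sketch of that pairing using only the Python standard library; it does not use fairical's API. Merging the three nine-sample fragments above into one document, and the `records` and `label` names, are assumptions made for illustration.

```python
import json

# Assumption: the three nine-sample fragments shown above, merged into one document.
raw = """
{
  "scores": [[0.3970, 0.3434, 0.7074, 0.5787, 0.2451, 0.6383, 0.5937, 0.3629, 0.5526]],
  "ground-truth": [1, 0, 0, 0, 0, 0, 0, 1, 0],
  "attributes": {
    "gender": ["f", "m", "f", "m", "f", "f", "f", "m", "m"],
    "race": [2, 0, 0, 1, 0, 0, 1, 0, 2]
  }
}
"""
data = json.loads(raw)

# Index i of every list refers to the same test sample, so one record per
# sample can be rebuilt by reading position i from each list.
records = [
    {
        "scores": [model_scores[i] for model_scores in data["scores"]],
        "label": data["ground-truth"][i],
        **{name: values[i] for name, values in data["attributes"].items()},
    }
    for i in range(len(data["ground-truth"]))
]

print(records[0])
```

Each record holds one score per model, so a two-mode system would yield two floats under `scores` for every sample.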
Optionally, an additional identifiers key can be added to give a name to each model
defined in the scores list:
{
"identifiers": [
"my-model-1"
]
}
A complete JSON representation for a single binary classifier analyzed for race with nine test samples is shown below:
{
"scores": [
[ 0.5077, 0.5165, 0.5073, 0.4777, 0.6062, 0.4830, 0.7178, 0.7152, 0.4331 ]
],
"identifiers": [
"my-model-1"
],
"ground-truth": [
0, 0, 0, 1, 1, 0, 0, 0, 1
],
"attributes": {
"race": [
1, 0, 2, 1, 0, 0, 2, 0, 0
]
}
}
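The structural rules described in this section (three required keys, scores nested as a list of lists, and every list paired sample by sample) can be checked before a file is handed to fairical. The sketch below uses only the standard library and is not fairical's own validation; `check_scores_document` is a hypothetical helper, and the rule that identifiers needs one entry per model is an assumption inferred from the description above.

```python
import json


def check_scores_document(doc):
    """Return a list of problems found in a parsed scores document (empty if none)."""
    problems = [
        f"missing key: {key}"
        for key in ("scores", "ground-truth", "attributes")
        if key not in doc
    ]
    if problems:
        return problems

    n_samples = len(doc["ground-truth"])

    # Every model must contribute its own list, one score per test sample.
    for i, model_scores in enumerate(doc["scores"]):
        if not isinstance(model_scores, list):
            problems.append(f"scores[{i}] is not a list; expected [[...]]")
        elif len(model_scores) != n_samples:
            problems.append(
                f"scores[{i}] has {len(model_scores)} entries, expected {n_samples}"
            )

    # Every sensitive attribute must have one value per test sample.
    for name, values in doc["attributes"].items():
        if len(values) != n_samples:
            problems.append(
                f"attributes[{name!r}] has {len(values)} entries, expected {n_samples}"
            )

    # Assumption: when present, identifiers names each model exactly once.
    if "identifiers" in doc and len(doc["identifiers"]) != len(doc["scores"]):
        problems.append("identifiers and scores differ in length")

    return problems


doc = json.loads("""
{
  "scores": [[0.5077, 0.5165, 0.5073, 0.4777, 0.6062, 0.4830, 0.7178, 0.7152, 0.4331]],
  "identifiers": ["my-model-1"],
  "ground-truth": [0, 0, 0, 1, 1, 0, 0, 0, 1],
  "attributes": {"race": [1, 0, 2, 1, 0, 0, 2, 0, 0]}
}
""")
print(check_scores_document(doc))
```

Passing a document whose scores entry is a flat list of floats instead of a list of lists makes the helper report one problem per misplaced score.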
Solutions
As indicated in Data Models, fairical.Solutions holds the
configurable utility and fairness performance metrics
computed at the operating points of the ML system being analyzed, considering all of its
operating modes (OMs, or trade-offs). fairical.Solutions are computed by this
library from fairical.Scores and can therefore be saved to disk
in JSON format. The user may equally provide a solutions JSON file with
pre-calculated OMs for analysis, bypassing the fairical solving step.
A valid JSON representation for fairical.Solutions consists of a dictionary composed of 2 keys:
points: A dictionary of utility and fairness metrics evaluated for all OMs of a model in consideration. An example JSON of an ML system with multiple OMs (trade-offs) with one non-dominated solution (NDS) and two dominated solutions (DS) is shown below:
{
"points": {
"eod+race": [ 0.104166, 1.0, 1.0 ],
"eod+gender": [ 0.398741, 1.0, 1.0 ],
"acc": [ 0.652702, 0.0, 0.0 ]
}
}
Here, two fairness metrics (Equalized Odds Difference for race and for gender) and one utility metric (accuracy) are considered.
Note
The dominated solutions in this example were chosen to take the worst value in every metric dimension (1.0 for eod and 0.0 for acc).
metadata: A dictionary containing information related to the points. The expected entries are:
thresholds: A list specifying the threshold at which each point was computed.
{ "thresholds": [0.1, 0.2, 0.3] }
identifier-names: A list of unique model names. If identifiers was not given in the score file, these are generated automatically as model-1, ..., model-n.
{ "identifier-names": [ "model-1" ] }
identifiers: A list of indexes into identifier-names indicating which model the points are taken from.
{ "identifiers": [ 0, 0, 0 ] }
nds-from: If the file contains a-priori solutions, holds the name of the system the thresholds were taken from.
{ "nds-from": null }
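Since identifiers stores indexes rather than names, reading the metadata involves one lookup per point. The following sketch shows that lookup with plain Python data; it uses the entry names described above and no fairical calls, and the `point_models` variable is illustrative only.

```python
# Metadata entries as described above, written as a Python dictionary
# (JSON null becomes None once parsed).
metadata = {
    "thresholds": [0.1, 0.2, 0.3],
    "identifier-names": ["model-1"],
    "identifiers": [0, 0, 0],
    "nds-from": None,
}

# Each entry of "identifiers" is an index into "identifier-names".
names = metadata["identifier-names"]
point_models = [names[index] for index in metadata["identifiers"]]

# Point i was computed for point_models[i] at thresholds[i].
for model, threshold in zip(point_models, metadata["thresholds"]):
    print(f"{model} @ threshold {threshold}")
```

Here all three points resolve to the same model, evaluated at three different thresholds.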
A complete JSON representation for a solutions file containing three points composed of two fairness metrics and one utility metric is shown below:
{
"points": {
"eod+race": [ 0.104166, 1.0, 1.0 ],
"eod+gender": [ 0.398741, 1.0, 1.0 ],
"acc": [ 0.652702, 0.0, 0.0 ]
},
"metadata": {
"thresholds": [ 0.1, 0.2, 0.3 ],
"identifier-names": [ "model-1" ],
"identifiers": [ 0, 0, 0 ],
"nds-from": null
}
}
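The split of these three points into one non-dominated and two dominated solutions can be reproduced with a small Pareto-dominance check. This is a sketch under stated assumptions, not fairical's solver: it assumes metrics whose name starts with eod are minimized and all other metrics are maximized, which matches the worst values quoted earlier (1.0 for eod, 0.0 for acc). The helpers `as_costs` and `dominates` are hypothetical names.

```python
points = {
    "eod+race": [0.104166, 1.0, 1.0],
    "eod+gender": [0.398741, 1.0, 1.0],
    "acc": [0.652702, 0.0, 0.0],
}


def as_costs(points):
    """Turn each point into a tuple where lower is better in every dimension."""
    n_points = len(next(iter(points.values())))
    return [
        tuple(
            # Assumption: "eod*" metrics are minimized, everything else maximized.
            values[i] if name.startswith("eod") else -values[i]
            for name, values in points.items()
        )
        for i in range(n_points)
    ]


def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


costs = as_costs(points)
non_dominated = [
    i
    for i, cost in enumerate(costs)
    if not any(dominates(other, cost) for j, other in enumerate(costs) if j != i)
]
print(non_dominated)  # [0]
```

Only the first point survives: it has lower eod and higher acc than the other two, which are identical to each other and therefore do not dominate one another.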