fairical

A library to assess adjustable demographically fair Machine Learning (ML) systems.

class fairical.Scores(**data)[source]

Bases: BaseModel

Data model representing raw machine learning score outputs.

It is composed of a set of scores, for one or more operating points (e.g. preference rays, or ratios between various optimisation objectives), ground-truth for the task being analyzed, as well as extra protected attributes that are relevant for, at least, demographic fairness analysis.

For the JSON representation, scores, ground-truth, and demographic attributes may be inlined or stored in an external file from which the data structure is loaded. Relative paths are resolved with respect to the location of the current file.

scores: list[list[Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Ge(ge=0.0), Le(le=1.0)])]] | Path]

Inline scores data or list of file paths. Each score must be a floating-point number between 0 and 1 inclusive.

identifiers: list[str] | None

Optional inline identifiers corresponding to each system in scores. Each identifier must be a string.

ground_truth: list[Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Ge(ge=0)])]] | Path

Inline ground-truth data or a single file path. Each ground-truth label must be an integer with a minimum value of 0.

attributes: dict[str, list[str | Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Ge(ge=0)])]]] | Path

Inline attributes data or a single file path. It is set up as a dictionary mapping attribute names to lists of demographic data, whose entries can be strings or non-negative integers.
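As a rough illustration of the layout described by the fields above, the following sketch builds an inline document with two systems and four samples, and checks the documented value ranges by hand. It assumes the JSON keys mirror the field names; the helper check_ranges and all sample values are made up for illustration, and the library itself validates through its pydantic model.

```python
import json

# Hypothetical inline document: 2 systems (operating points), 4 samples.
doc = {
    "scores": [
        [0.10, 0.80, 0.35, 0.90],  # scores of system 0, one per sample
        [0.20, 0.70, 0.55, 0.60],  # scores of system 1, one per sample
    ],
    "identifiers": ["system-0", "system-1"],
    "ground_truth": [0, 1, 0, 1],
    "attributes": {"gender": ["f", "m", "f", "m"], "age": [0, 1, 1, 0]},
}


def check_ranges(d):
    """Check the documented value ranges (illustrative helper, not library code)."""
    ok_scores = all(0.0 <= s <= 1.0 for row in d["scores"] for s in row)
    ok_labels = all(isinstance(y, int) and y >= 0 for y in d["ground_truth"])
    ok_attrs = all(
        isinstance(v, str) or (isinstance(v, int) and v >= 0)
        for values in d["attributes"].values()
        for v in values
    )
    return ok_scores and ok_labels and ok_attrs


# The document survives a JSON round trip unchanged.
round_tripped = json.loads(json.dumps(doc))
```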

maybe_load_members(info)[source]

Load all external files if needed.

Return type:

Self

check_consistent_num_samples()[source]

Ensure all sample-level lists have the same length.

Return type:

Self

check_identifiers_specified()[source]

Generate default identifier per system if not specified.

Return type:

Self

check_consistent_num_subsystems()[source]

Ensure all subsystem-level lists have the same length.

Return type:

Self
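The three consistency checks above can be pictured with a small plain-Python sketch. This is not the library's implementation: the function name, the error type, and in particular the default identifier naming scheme are assumptions made for illustration.

```python
def check_scores_document(scores, ground_truth, attributes, identifiers=None):
    """Illustrative re-implementation of the documented checks (not library code)."""
    n_samples = len(ground_truth)

    # Sample-level lists: every score row and attribute list matches ground truth.
    for row in scores:
        if len(row) != n_samples:
            raise ValueError("score row length differs from ground truth")
    for name, values in attributes.items():
        if len(values) != n_samples:
            raise ValueError(f"attribute {name!r} length differs from ground truth")

    # Default identifiers: the naming scheme here is a made-up placeholder.
    if identifiers is None:
        identifiers = [f"system-{i}" for i in range(len(scores))]

    # Subsystem-level lists: one identifier per set of scores.
    if len(identifiers) != len(scores):
        raise ValueError("number of identifiers differs from number of systems")
    return identifiers


ids = check_scores_document(
    scores=[[0.1, 0.9], [0.4, 0.6]],
    ground_truth=[0, 1],
    attributes={"gender": ["f", "m"]},
)
```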

classmethod load(source)[source]

Validate and load a JSON file into a raw data object.

This function is intended to validate and load the input in JSON format. It opens the given file path, parses its JSON content, and validates it against the defined data model.

Parameters:

source (Path | str | TextIO) – Source input where to read JSON from.

Return type:

Self

Returns:

Parsed and validated content as a Scores instance.

Raises:

pydantic_core.ValidationError – If the file contains invalid data.
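To illustrate how a member stored in an external file might be resolved relative to the JSON file that references it, here is a minimal sketch using only the standard library. The helper resolve_member is hypothetical, and the on-disk format of the external file is assumed to be JSON.

```python
import json
import tempfile
from pathlib import Path


def resolve_member(value, base_file):
    """Return inline data as-is, or load it from a path relative to base_file.

    Illustrative helper; the format of external members is assumed to be JSON.
    """
    if isinstance(value, str):
        target = Path(value)
        if not target.is_absolute():
            target = Path(base_file).parent / target
        return json.loads(target.read_text())
    return value


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "data").mkdir()
    (root / "data" / "labels.json").write_text(json.dumps([0, 1, 1]))
    main = root / "scores.json"
    main.write_text(json.dumps({"ground_truth": "data/labels.json"}))

    loaded = json.loads(main.read_text())
    ground_truth = resolve_member(loaded["ground_truth"], main)  # external
    inline = resolve_member([1, 0], main)  # inline data passes through
```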

save(dest, **args)[source]

Save contents to an external file.

Parameters:
Return type:

None

solutions_a_posteriori(metrics, thresholds=None)[source]

Calculate all solutions of a system a posteriori, given metrics and thresholds.

This method retrieves solutions that can be implemented by systems. For each set of scores in self.scores, it calculates all solutions of the system being analysed through simple thresholding, and then aggregates all solutions to construct all possible sets of solutions a system can implement.

Parameters:
  • metrics (Sequence[str]) – Metrics types to consider when evaluating solutions. Example: eod+age, eod+gender, or acc.

  • thresholds (list[float] | None) – List of thresholds to apply, as values within the interval \([0,1]\). If not provided, scikit-learn is used to extract meaningful scores from the system.

Return type:

Solutions

Returns:

All known solutions for the input system.
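The idea of obtaining solutions "through simple thresholding" can be sketched for a single set of scores as follows. The sketch is only indicative: the TPR gap between groups is a stand-in fairness measure chosen for brevity, and the ">=" decision rule and the function name are assumptions rather than the library's behaviour.

```python
def solutions_by_thresholding(scores, ground_truth, groups, thresholds):
    """Evaluate (accuracy, TPR gap) at each threshold for one set of scores.

    Illustrative only: the TPR gap is a stand-in fairness measure and the
    ">=" decision rule is an assumption.
    """
    points = {"acc": [], "tpr_gap": []}
    for t in thresholds:
        preds = [1 if s >= t else 0 for s in scores]
        correct = sum(p == y for p, y in zip(preds, ground_truth))
        points["acc"].append(correct / len(ground_truth))

        # True-positive rate per demographic group.
        tprs = []
        for g in sorted(set(groups)):
            hits = [
                p
                for p, y, a in zip(preds, ground_truth, groups)
                if a == g and y == 1
            ]
            tprs.append(sum(hits) / len(hits) if hits else 0.0)
        points["tpr_gap"].append(max(tprs) - min(tprs))
    return points


points = solutions_by_thresholding(
    scores=[0.2, 0.6, 0.4, 0.9],
    ground_truth=[0, 1, 1, 1],
    groups=["a", "a", "b", "b"],
    thresholds=[0.5, 0.3],
)
```

Each index across the returned lists corresponds to one threshold, so one solution is read by taking the same position in every list.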

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

solutions_a_priori(metrics, prior_solutions, dominated=None)[source]

Calculate all solutions of a system with settings (system and metrics) a priori.

This method retrieves solutions that can be implemented by systems. For each set of scores in self.scores, it calculates all solutions of the system being analysed through simple thresholding, and then aggregates all solutions to construct all possible sets of solutions a system can implement.

Parameters:
  • metrics (Sequence[str]) – Metrics types to consider when evaluating solutions. Example: eod+age, eod+gender, or acc.

  • prior_solutions (Solutions) – Solutions to use as the basis for the calculation of solutions with threshold and system picked a priori.

  • dominated (bool | None) –

    A tri-state boolean flag that defines which subset of prior_solutions to apply to the scores:

    • None: applies all solutions (default)

    • True: applies only dominated solutions

    • False: applies only non-dominated solutions

Return type:

Solutions

Returns:

All a priori known solutions for the input system.
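The tri-state dominated flag can be read as a simple selection over the prior solutions. The sketch below assumes, for illustration only, that prior solutions are already split into two plain lists of (system, threshold) pairs; the helper name and this representation are hypothetical.

```python
def select_prior(non_dominated, dominated_set, dominated=None):
    """Pick which prior solutions to apply, following the tri-state flag.

    Illustrative helper; inputs are plain lists of (system, threshold) pairs,
    which is an assumed representation.
    """
    if dominated is None:
        return non_dominated + dominated_set  # apply everything (default)
    return dominated_set if dominated else non_dominated


prior_nds = [("system-0", 0.5)]
prior_ds = [("system-1", 0.3), ("system-1", 0.7)]
```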

class fairical.Solutions(**data)[source]

Bases: BaseModel

Data model representing system solutions (or utility/fairness trade-offs).

Objects of this type carry information about two or more performance metrics (utility or fairness) for each operating mode (utility/fairness trade-off) of the analysed ML system.

It is a dictionary where keys correspond to utility or fairness metrics calculated for the whole system, and the values at a given index across the different keys together represent the performance at one operating mode (utility/fairness trade-off) that the system being analysed can potentially implement.

points: dict[str, list[Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Ge(ge=0.0), Le(le=1.0)])]]]

metadata: dict[str, Any]

n_metrics()

Return the number of metrics stored in the model.

Return type:

int

Returns:

The count of metric keys in the model.

n_solutions()[source]

Return the number of solutions stored in the model, across all metrics.

Return type:

int

Returns:

The number of solutions in the model, across all metrics.

items()[source]

Return a view of metric keys and their associated solution vectors.

Return type:

ItemsView[str, list[float]]

Returns:

A set-like view of (metric, vector) pairs.

keys()[source]

Return a view of the metric keys.

Return type:

KeysView[str]

Returns:

A set-like view of metric names.

values()[source]

Return a view of all solution vectors.

Return type:

ValuesView[list[float]]

Returns:

A view of all metric solution vectors in the model.

classmethod fromarray(points, metrics, thresholds, identifiers=None, identifier_names=None)[source]

Create a new instance from an array and names of metrics.

Parameters:
  • points (TypeAliasType) – 2-D array-like object with floating-point numbers organized as (n_solutions, n_metrics).

  • metrics (Sequence[str]) – A sequence of strictly valid and supported metrics, each naming one column of the input points array.

  • thresholds (Sequence[float]) – A list of thresholds that were used to compute the points.

  • identifiers (Optional[Sequence[int]]) – Optional list of identifiers, indicating which model each point corresponds to. Each identifier indexes into identifier_names. If not specified, all points will have the same identifier.

  • identifier_names (Optional[Sequence[str]]) – Optional list of identifier names.

Return type:

Self

Returns:

A newly created and validated object.

Raises:

AssertionError – If the number of columns in the input array-like object differs from the number of listed metrics.
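The relation between the (n_solutions, n_metrics) input layout and the per-metric dictionary stored in points can be sketched as below. The helper columns_from_rows is hypothetical and ignores thresholds and identifiers; it only shows the transposition and the column-count check.

```python
def columns_from_rows(rows, metrics):
    """Turn (n_solutions, n_metrics) rows into a metric -> values mapping."""
    for row in rows:
        # Mirrors the documented failure mode for a column/metric mismatch.
        assert len(row) == len(metrics), "column count differs from metric count"
    return {name: [row[j] for row in rows] for j, name in enumerate(metrics)}


rows = [[0.75, 0.50], [1.00, 0.00], [0.90, 0.20]]  # 3 solutions, 2 metrics
points = columns_from_rows(rows, ["acc", "eod+gender"])
```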

classmethod load(source)[source]

Validate and load a JSON file into a solution data object.

This function is intended to validate and load the input in JSON format. It opens the given file path, parses its JSON content, and validates it against the defined model.

Parameters:

source (Path | str | TextIO) – Source input where to read JSON from.

Return type:

Self

Returns:

Parsed and validated content as a Solutions instance.

Raises:

pydantic_core.ValidationError – If the file contains invalid data.

save(dest, **args)[source]

Save contents to an external file.

Parameters:
Return type:

None

check_metrics_validity()[source]

Ensure all metrics are valid.

Return type:

Self

check_identifiers_specified()[source]

Ensure solutions contain identifiers.

Return type:

Self

check_consistent_lengths()[source]

Ensure all solution lists have the same length.

Return type:

Self

filter_metadata_by_indices(indices)[source]

Filter metadata from provided indices.

Parameters:

indices – The indices used for filtering.

Returns:

Filtered metadata.

deduplicate(eps=1e-06)[source]

Filter solutions to remove duplicates within a certain epsilon.

Remove points in these solutions that lie within eps of another by clustering with sklearn.cluster.DBSCAN (min_samples=1) and keeping the first point in each cluster.

Parameters:

eps (float) – Maximum distance between points in the same cluster.

Return type:

Self

Returns:

Filtered solutions without duplicates, as a new object.
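For intuition, the clustering step can be reproduced without scikit-learn. With min_samples=1 every point is a core point, so the clusters are the connected components of the "distance at most eps" graph. The sketch below is a plain-Python stand-in under that reading, not the library's code, and it works on a list of coordinate tuples rather than on a Solutions object.

```python
import math


def deduplicate(points, eps=1e-6):
    """Keep the first point of every group of points chained within eps."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= eps:
                ri, rj = find(i), find(j)
                # Attach to the smaller index so each root is a "first" point.
                parent[max(ri, rj)] = min(ri, rj)

    return [p for i, p in enumerate(points) if find(i) == i]


unique = deduplicate([(0.5, 0.5), (0.5, 0.5 + 1e-9), (0.9, 0.1), (0.5, 0.5)])
```

The pairwise loop is quadratic in the number of points, which is acceptable for a sketch but not a substitute for an indexed neighbour search on large sets.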

non_dominated_solutions()[source]

Filter solutions from system that are non-dominated.

This is a thin wrapper around pymoo.util.nds.NonDominatedSorting that extracts the rank‑0 solutions (those that are not dominated by any other).

Definition: A point p is dominated if at least one other point is no worse in every objective and strictly better in at least one.

Parameters:

solutions – All solutions available in the current system.

Return type:

tuple[Self, Self]

Returns:

A tuple containing non-dominated and dominated solutions respectively. By definition, the sets are guaranteed to not overlap.
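The dominance definition above translates directly into a brute-force split. This sketch assumes, purely for illustration, that higher is better in every objective and that solutions are given as coordinate tuples; it does not use pymoo and is not the library's implementation.

```python
def split_non_dominated(points):
    """Split points into (non-dominated, dominated), assuming higher is better."""

    def dominates(q, p):
        no_worse = all(a >= b for a, b in zip(q, p))
        better = any(a > b for a, b in zip(q, p))
        return no_worse and better

    nds, ds = [], []
    for p in points:
        if any(dominates(q, p) for q in points):
            ds.append(p)
        else:
            nds.append(p)
    return nds, ds


nds, ds = split_non_dominated([(0.9, 0.2), (0.8, 0.6), (0.7, 0.5), (0.5, 0.9)])
```

Here (0.7, 0.5) is dominated by (0.8, 0.6), while the remaining three points trade one objective against the other and therefore stay on the front.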

indicators()[source]

Assess utility-fairness trade-off systems based on characteristics of the estimated Pareto front.

This method evaluates the trade-off between utility and fairness of adjustable systems using multi-objective performance indicators. It first estimates the set of non-dominated solutions.

Return type:

dict[Literal['hv', 'ud', 'os', 'as', 'onvg', 'onvgr', 'relative-onvg', 'area'], float]

Returns:

A dictionary that characterizes the (estimated Pareto) front composed of non-dominated solutions in nds. The dictionary contains the following keys:

  • hv: The hypervolume of the front.

    Higher is better. This indicator evaluates how the solution set covers the metric space in terms of diversity and proximity to the ideal. HV is formulated as:

    \[\begin{split}HV(S) = VOL\left(\bigcup_{\substack{x \in S \\ x \prec r}} \prod_{i=1}^{N}[x^{i},r^{i}]\right)\end{split}\]

    Where \(S\) is the solution set, \(x\) is a solution in \(S\), and \(r\) is the Nadir point.

  • ud: Uniformity of the distribution of nds points on the front.

    This indicator evaluates how uniformly the solution set is spread across the metric space, based on an upper-bound distance, \(\sigma\). UD is formulated as:

    \[UD(S,\sigma)=\frac{1}{1+D_{nc}(S, \sigma)}\]

    Where

    \[D_{nc}(S,\sigma)=\sqrt{\frac{1}{|X_n|-1} \sum_{i=1}^{|X_n|} \left(nc(x^i,\sigma)-\mu_{nc(x,\sigma)}\right)^2}\]

    and

    \[nc(x^i,\sigma)=|\{x \in X_n, \|x-x^i\|<\sigma\}|-1\]

    \(\sigma\) is the niche radius, which is problem dependent and can be adjusted based on the distribution of the candidate solutions in the space. \(\mu_{nc(x,\sigma)}\) is the mean of the niche counts, \(nc\), calculated as \(\mu_{nc(x,\sigma)}=\frac{1}{|X_n|} \sum_{j=1}^{|X_n|} nc(x^j,\sigma)\).

  • os: Overall spread of nds points with respect to extremities of the front.

    This indicator assesses how well the points from the candidate set spread towards the ideal of the optimal Pareto front (PF). OS is formulated as:

    \[OS(S,\mathcal{P})=\prod_{i=1}^{N}\left|\frac{\max\limits_{s \in S}s_i-\min\limits_{s \in S}s_i}{\max\limits_{p \in \mathcal{P}}p_{i}-\min\limits_{p \in \mathcal{P}}p_{i}}\right|\]

    Where the numerator and denominator are the absolute differences between the worst and best points for the candidate solution set \(S\) and the Pareto optimal set \(\mathcal{P}\), respectively.

  • as: Average spread of nds points with respect to extremities of the front.

    This indicator assesses how well the points from the candidate set spread towards the ideal of the optimal Pareto front (PF). AS is formulated as:

    \[AS(S,\mathcal{P})=\frac{1}{N}\sum_{i=1}^{N}\left|\frac{\max\limits_{s \in S}s_i-\min\limits_{s \in S}s_i}{\max\limits_{p \in \mathcal{P}}p_{i}-\min\limits_{p \in \mathcal{P}}p_{i}}\right|\]

    Where the numerator and denominator are the absolute differences between the worst and best points for the candidate solution set \(S\) and the Pareto optimal set \(\mathcal{P}\), respectively.

  • onvg: Overall Nondominated Vector Generation (ONVG) in the front (nds).

    Higher is better. This indicator evaluates how many optimal solutions are generated by the system. ONVG is formulated as:

    \[ONVG(S) = |X_n|\]

    Where \(|.|\) is the cardinality of the candidate solution set in the metric space.

  • onvgr: Ratio between number of solutions in nds and nds + ds.

    Higher is better. This indicator assesses the proportion of optimal solutions generated by the system. ONVGR is formulated as:

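    \[ONVGR(S) = \frac{|X_n|}{|S|}\]

    Where \(|.|\) is the cardinality of a set, \(X_n\) is the set of non-dominated solutions, and \(S\) is the set of all solutions (non-dominated and dominated).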
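The indicator formulas above can be sketched in plain Python to make the arithmetic concrete. These helpers are illustrative and are not the library's implementation. The hypervolume helper is restricted to two objectives and assumes both are minimised, as the boxes \([x^{i},r^{i}]\) in the formula suggest; the reference set passed to the spread helper is a made-up stand-in for \(\mathcal{P}\).

```python
import math
from statistics import mean, stdev


def hypervolume_2d(front, ref):
    """Area dominated by a 2-D front w.r.t. ref, with both objectives minimised."""
    pts = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    area, best_y = 0.0, ref[1]
    for x, y in pts:  # sweep along the first objective
        if y < best_y:
            area += (ref[0] - x) * (best_y - y)
            best_y = y
    return area


def uniformity(front, sigma):
    """UD = 1 / (1 + sample standard deviation of the niche counts)."""
    counts = [sum(math.dist(p, q) < sigma for q in front) - 1 for p in front]
    return 1.0 / (1.0 + stdev(counts))


def spreads(front, reference):
    """Return (OS, AS): product and mean of per-objective range ratios."""
    ratios = []
    for i in range(len(front[0])):
        num = max(p[i] for p in front) - min(p[i] for p in front)
        den = max(p[i] for p in reference) - min(p[i] for p in reference)
        ratios.append(abs(num / den))
    return math.prod(ratios), mean(ratios)


def onvg_and_ratio(nds, ds):
    """Return (ONVG, ONVGR) from non-dominated and dominated point lists."""
    return len(nds), len(nds) / (len(nds) + len(ds))


front = [(0.2, 0.8), (0.5, 0.4), (0.8, 0.1)]
hv = hypervolume_2d(front, ref=(1.0, 1.0))
ud = uniformity(front, sigma=0.1)
os_, as_ = spreads(front, reference=[(0.0, 1.0), (1.0, 0.0)])
onvg, onvgr = onvg_and_ratio(front, ds=[(0.9, 0.9)])
```

With a niche radius smaller than every pairwise distance, all niche counts are zero, their standard deviation vanishes, and UD reaches its maximum of 1.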

tabulate()[source]

Generate a table containing the given solutions.

Each table row contains the values for each metric, the threshold used to compute them, and the corresponding identifier name.

Return type:

str

Returns:

The generated table as a string.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

fairical.make_table(indicators, table_keys=['relative-onvg', 'onvgr', 'ud', 'as', 'hv'], fmt='simple')[source]

Extract and format table from pre-computed evaluation data.

Extracts the elements from the input data that can be displayed in a terminal-style table, formats them, and returns the resulting table.

Parameters:
  • indicators (dict[str, dict[Literal['hv', 'ud', 'os', 'as', 'onvg', 'onvgr', 'relative-onvg', 'area'], float]]) –

    Indicators organized in a dictionary of dictionaries where keys represent the labels of each system, and values, dictionaries that represent indicators for that system with at least keys listed in table_keys. We assume the following metrics are calculated for every system:

    • hv: the pareto estimate hypervolume (float)

    • onvg: the number of non-dominated solutions (int)

    • onvgr: the ratio between the number of non-dominated solutions and the total number of solutions (float)

    • ud: the uniformity of non-dominated solutions across the estimated front (float)

    • as: the average spread of non-dominated solutions across the estimated front (float)

  • table_keys (Sequence[Union[Literal['hv', 'ud', 'os', 'as', 'onvg', 'onvgr', 'relative-onvg', 'area'], str]]) – The indicator keys that will be tabulated in the table.

  • fmt (str) – One of the formats supported by python-tabulate. Default is “simple”.

Return type:

str

Returns:

A string representation of a table.
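The expected shape of indicators, and how table_keys selects columns from it, can be sketched as follows. The helper rows_for_table and the indicator values are hypothetical; the sketch stops at building the header and rows and does not reproduce the library's formatting.

```python
def rows_for_table(indicators, table_keys):
    """Build a header and one row per system (illustrative helper)."""
    header = ["system", *table_keys]
    rows = [
        [label, *(values[key] for key in table_keys)]
        for label, values in indicators.items()
    ]
    return header, rows


# Made-up indicator values for two systems.
indicators = {
    "a": {"hv": 0.42, "ud": 1.0, "onvgr": 0.75},
    "b": {"hv": 0.30, "ud": 0.5, "onvgr": 0.50},
}
header, rows = rows_for_table(indicators, ["hv", "ud"])
```

With python-tabulate installed, such a header and row list could be rendered through tabulate(rows, headers=header, tablefmt=fmt).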

fairical.pareto_plot(solutions, axes_labels={}, alpha=0.2, hide_ds=False)[source]

Generate a Pareto plot for all systems under comparison.

This function generates a Pareto plot from the solutions of the systems under comparison.

Parameters:
  • solutions (dict[str, tuple[Solutions, Solutions]]) – A dictionary where keys represent system names (that will be used as labels), and values are tuples with non-dominated (nds) and dominated solutions (ds) respectively.

  • axes_labels (dict[str, str]) – If specified, overwrites the default labels for dimensions in fairical.solutions.Solutions. Should be a dictionary that maps the keys in each fairical.solutions.Solutions object to a single label. If not set, then we use a default setup provided in the module.

  • alpha (float) – Alpha blend between non-dominated (fully opaque) and dominated solutions (partly transparent).

  • hide_ds (bool) – If true, hide the ds points for a-priori data from the plot.

Return type:

tuple[Figure, Axes]

Returns:

A tuple containing both the matplotlib figure and axes used to create the Pareto plot.

fairical.radar_chart(indicators, axes={'as': '$AS$', 'hv': '$HV$', 'onvgr': '$ONVGR$', 'relative-onvg': '$\\\\widehat{ONVG}$', 'ud': '$UD$'})[source]

Generate a radar chart for all systems under comparison.

This function generates a radar chart from the performance indicator values of the systems under comparison. It requires the presence of complement-ud and relative-onvg on indicators.

Parameters:
  • indicators (dict[str, dict[Literal['hv', 'ud', 'os', 'as', 'onvg', 'onvgr', 'relative-onvg', 'area'], float]]) – Indicators organized in a single dictionary where keys represent system labels, and values are dictionaries with at least the keys listed in axes.

  • axes (dict[Union[Literal['hv', 'ud', 'os', 'as', 'onvg', 'onvgr', 'relative-onvg', 'area'], str], str]) – A dictionary containing the indicator keys that will be drawn on the radar chart, and corresponding labels associated with each of those axes. You can use LaTeX symbols and notations on the values of the dictionary.

  • title – The plot title.

  • **kwargs – Additional keyword arguments for updating chart properties. Supported options:

    • linewidth: Line width

    • linestyle: Line style

Return type:

tuple[Figure, Axes]

Returns:

A tuple containing both the matplotlib figure and axes used to create the radar chart.

Modules

metrics

Helpers to evaluate scikit-learn metrics at arbitrary thresholds.

plot

Plotting utilities.

scores

Data model organizing scores of ML systems under multi-objective constraints.

scripts

Command-line interfaces.

solutions

Define the basic solution data model and functionality.

utils

Shared utilities.