fairical.solutions
Define the basic solution data model and functionality.
Classes
Solutions: Data model representing system solutions (or utility/fairness trade-offs).
- class fairical.solutions.Solutions(**data)
Bases: BaseModel
Data model representing system solutions (or utility/fairness trade-offs).
Objects of this type carry information about two or more performance metrics (utility or fairness) for each operating mode (utility/fairness trade-off) of the analysed ML system.
The points attribute is a dictionary whose keys are utility or fairness metrics calculated for the whole system. Each key maps to a list of values, and the values at the same position across all keys give the performance at one operating mode (utility/fairness trade-off) that the analysed system can implement.
- points: dict[str, list[Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Ge(ge=0.0), Le(le=1.0)])]]]
- n_metrics()
Return the number of metrics stored in the model.
- Return type:
int
- Returns:
The count of metric keys in the model.
- n_solutions()
Return the number of solutions stored in the model, across all metrics.
- Return type:
int
- Returns:
The number of solutions in the model, across all metrics.
- values()
Return a view of all solution vectors.
- Return type:
- Returns:
A view of all metric solution vectors in the model.
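The layout and accessors described above can be sketched with a plain dictionary. This is only an illustration of the data shape, not the library's implementation: the metric names `acc` and `dp`, the sample values, and the helper names are hypothetical.

```python
# Hypothetical data: two metrics, three operating modes.
points = {
    "acc": [0.90, 0.85, 0.80],
    "dp": [0.60, 0.75, 0.95],
}


def n_metrics(points):
    """Count the metric keys."""
    return len(points)


def n_solutions(points):
    """Count the operating modes, requiring the same count for every metric."""
    lengths = {len(values) for values in points.values()}
    if len(lengths) > 1:
        raise ValueError(f"inconsistent lengths: {sorted(lengths)}")
    return lengths.pop() if lengths else 0


def operating_mode(points, index):
    """Collect the value of every metric at one operating mode."""
    return {name: values[index] for name, values in points.items()}


def check_range(points):
    """Mirror the documented [0, 1] bound on every stored value."""
    for name, values in points.items():
        for value in values:
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name}: {value} is outside [0, 1]")


check_range(points)
print(n_metrics(points), n_solutions(points), operating_mode(points, 1))
```

Reading one index across all keys, as `operating_mode` does, recovers a single utility/fairness trade-off.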
- classmethod fromarray(points, metrics, thresholds, identifiers=None, identifier_names=None)
Create a new instance from an array and names of metrics.
- Parameters:
points (TypeAliasType) – 2-D array-like object with floating-point numbers organized as (n_solutions, n_metrics).
metrics (Sequence[str]) – A sequence of strictly valid and supported metrics, each naming one column of the input points array.
thresholds (Sequence[float]) – A list of thresholds that were used to compute the points.
identifiers (Optional[Sequence[int]]) – Optional list of identifiers, indicating which model each point corresponds to. Each identifier indexes into identifier_names. If not specified, all points will have the same identifier.
identifier_names (Optional[Sequence[str]]) – Optional list of identifier names.
- Return type:
Self
- Returns:
A newly created and validated object.
- Raises:
AssertionError – If the number of columns on the input array-like object is different than the number of listed metrics.
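The conversion from an `(n_solutions, n_metrics)` array to a per-metric mapping can be sketched as follows. This is a minimal stand-in written with plain lists, not the code behind `fromarray`; the helper name `columns_from_rows` and the metric names are hypothetical.

```python
from collections.abc import Sequence


def columns_from_rows(
    rows: Sequence[Sequence[float]], metrics: Sequence[str]
) -> dict:
    """Turn (n_solutions, n_metrics) rows into a metric -> values mapping."""
    for row in rows:
        # Mirrors the documented AssertionError on a column-count mismatch.
        assert len(row) == len(metrics), (
            f"expected {len(metrics)} columns, got {len(row)}"
        )
    return {
        name: [float(row[j]) for row in rows]
        for j, name in enumerate(metrics)
    }


rows = [[0.90, 0.60], [0.85, 0.75], [0.80, 0.95]]
by_metric = columns_from_rows(rows, ["acc", "dp"])
print(by_metric)
```

Each row is one operating mode and each column one metric, so the result has one list per metric with one entry per row.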
- classmethod load(source)
Validate and load a JSON file into a solution data object.
This method opens the given source, parses its JSON content, and validates it against the defined model.
- Parameters:
source (Path | str | TextIO) – Source input where to read JSON from.
- Return type:
Self
- Returns:
Parsed and validated content as a Solutions instance.
- Raises:
pydantic_core.ValidationError – If the file contains invalid data.
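The "path or open stream" handling described for `load` can be sketched with the standard library alone. This is an illustration under assumptions: the helper `read_points` is hypothetical, the JSON is assumed to hold a top-level `points` object, and a plain `ValueError` stands in for pydantic's validation error.

```python
import io
import json
from pathlib import Path
from typing import TextIO, Union


def read_points(source: Union[Path, str, TextIO]) -> dict:
    """Read a JSON object with a 'points' mapping and check value ranges."""
    if isinstance(source, (str, Path)):
        with open(source, encoding="utf-8") as handle:
            data = json.load(handle)
    else:
        data = json.load(source)  # any text stream with a read() method
    points = data["points"]
    for name, values in points.items():
        for value in values:
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name}: {value} is outside [0, 1]")
    return points


stream = io.StringIO('{"points": {"acc": [0.9, 0.8], "dp": [0.6, 0.95]}}')
loaded = read_points(stream)
print(loaded)
```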
- save(dest, **args)
Save contents to an external file.
- Parameters:
dest (Path | str | TextIO) – Destination where to save the contents. If not a path or str, then assumed to have a write method accepting strings.
args – Parameters further passed down to pydantic.BaseModel.model_dump_json().
- Return type:
- check_consistent_lengths()
Ensure all solution lists have the same length.
- Return type:
Self
- filter_metadata_by_indices(indices)
Filter metadata from provided indices.
- Parameters:
indices – The indices used for filtering.
- Returns:
Filtered metadata.
- deduplicate(eps=1e-06)
Filter solutions to remove duplicates within a certain epsilon.
Remove points in these solutions that lie within eps of another by clustering with sklearn.cluster.DBSCAN (min_samples=1) and keeping the first point in each cluster.
- Parameters:
eps (float) – Maximum distance between two points for them to be treated as neighbours and placed in the same cluster.
- Return type:
Self
- Returns:
Filtered solutions without duplicates, as a new object.
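To make the clustering rule concrete, here is a pure-Python stand-in that does not call scikit-learn. With min_samples=1 every point is a core point, so DBSCAN clusters are the connected components of the graph that links points at distance at most eps. The function name `deduplicate_points` and the sample data are hypothetical.

```python
import math


def deduplicate_points(points, eps=1e-6):
    """Keep the first point of every eps-connected group of points."""
    labels = [-1] * len(points)  # -1 means "not assigned to a cluster yet"
    kept = []
    for start in range(len(points)):
        if labels[start] != -1:
            continue
        # The lowest unassigned index opens a new cluster and is kept.
        kept.append(start)
        labels[start] = len(kept) - 1
        stack = [start]
        while stack:  # flood-fill the rest of the cluster
            i = stack.pop()
            for j in range(len(points)):
                if labels[j] == -1 and math.dist(points[i], points[j]) <= eps:
                    labels[j] = labels[start]
                    stack.append(j)
    return [points[i] for i in kept]


pts = [(0.50, 0.50), (0.505, 0.50), (0.90, 0.10), (0.51, 0.50)]
unique = deduplicate_points(pts, eps=0.006)
print(unique)
```

Note the chaining effect: the last point is farther than eps from the first, yet it joins the same cluster through the second point, so two points in one cluster may be more than eps apart.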
- non_dominated_solutions()
Filter solutions from system that are non-dominated.
This is a thin wrapper around pymoo.util.nds.NonDominatedSorting that extracts the rank-0 solutions (those that are not dominated by any other).
Definition: A point p is dominated if and only if a single competitor is no worse than p in every objective and strictly better in at least one.
- Parameters:
solutions – All solutions available in the current system.
- Return type:
tuple[Self,Self]
- Returns:
A tuple containing non-dominated and dominated solutions respectively. By definition, the sets are guaranteed to not overlap.
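The dominance definition above can be checked with a brute-force filter. This sketch does not use pymoo and assumes that every metric is to be maximized (higher is better), which the source does not state; the function names and sample points are hypothetical.

```python
def dominates(q, p):
    """True if q is no worse than p everywhere and strictly better somewhere.

    Assumes every objective is maximized.
    """
    no_worse = all(a >= b for a, b in zip(q, p))
    strictly_better = any(a > b for a, b in zip(q, p))
    return no_worse and strictly_better


def split_non_dominated(points):
    """Return (non-dominated, dominated) lists, preserving input order."""
    nds, ds = [], []
    for i, p in enumerate(points):
        dominated = any(
            dominates(q, p) for j, q in enumerate(points) if j != i
        )
        (ds if dominated else nds).append(p)
    return nds, ds


points = [(0.90, 0.60), (0.85, 0.75), (0.80, 0.70), (0.80, 0.95)]
nds, ds = split_non_dominated(points)
print(nds, ds)
```

Here `(0.80, 0.70)` is dominated by the single competitor `(0.85, 0.75)`, while the other three points trade one metric against the other and stay on the front. The pairwise check is quadratic in the number of points, which is fine for illustration.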
- indicators()
Assess utility-fairness trade-off systems based on characteristics of the estimated Pareto front.
This method evaluates the trade-off between utility and fairness of adjustable systems using multi-objective performance indicators. It first estimates the set of non-dominated solutions.
- Return type:
dict[Literal['hv','ud','os','as','onvg','onvgr','relative-onvg','area'],float]
- Returns:
A dictionary that characterizes the (estimated Pareto) front composed of non-dominated solutions in nds. The dictionary contains the following keys:
hv: The hypervolume of the front. Higher is better. This indicator evaluates how the solution set covers the metric space in terms of diversity and proximity to the ideal. HV is formulated as:
\[\begin{split}HV(S) = VOL\left(\bigcup_{\substack{x \in S \\ x \prec r}} \prod_{i=1}^{N}[x^{i},r^{i}]\right)\end{split}\]
where \(S\) is the solution set, \(x\) is a solution in \(S\), and \(r\) is the Nadir point.
ud: Uniformity of the distribution of nds points on the front. This indicator evaluates how uniformly the solution set is spread over the metric space, based on an upper-bound distance, \(\sigma\). UD is formulated as:
\[UD(S,\sigma)=\frac{1}{1+D_{nc}(S, \sigma)}\]
where
\[D_{nc}(S,\sigma)=\sqrt{\frac{1}{|X_n|-1} \sum_{i=1}^{|X_n|} \left(nc(x^i,\sigma)-\mu_{nc(x,\sigma)}\right)^2}\]
and
\[nc(x^i,\sigma)=|\{x \in X_n, \|x-x^i\|<\sigma\}|-1\]
\(\sigma\) is the niche radius, which is problem dependent and can be adjusted based on the distribution of the candidate solutions in the space. \(\mu_{nc(x,\sigma)}\) is the mean of the niche counts, \(nc\), calculated as \(\mu_{nc(x,\sigma)}=\frac{1}{|X_n|} \sum_{j=1}^{|X_n|} nc(x^j,\sigma)\).
os: Overall spread of nds points with respect to extremities of the front. This indicator assesses how well the points from the candidate set spread towards the ideal of the optimal Pareto front (PF). OS is formulated as:
\[OS(S,\mathcal{P})=\prod_{i=1}^{N}\left|\frac{\max\limits_{s \in S}s_i-\min\limits_{s \in S}s_i}{\max\limits_{p \in \mathcal{P}}p_{i}-\min\limits_{p \in \mathcal{P}}p_{i}}\right|\]
where the numerator and denominator are the absolute differences between the worst and best points for the candidate solution set \(S\) and the Pareto optimal set \(\mathcal{P}\), respectively.
as: Average spread of nds points with respect to extremities of the front. This indicator assesses how well the points from the candidate set spread towards the ideal of the optimal Pareto front (PF). AS is formulated as:
\[AS(S,\mathcal{P})=\frac{1}{N}\sum_{i=1}^{N}\left|\frac{\max\limits_{s \in S}s_i-\min\limits_{s \in S}s_i}{\max\limits_{p \in \mathcal{P}}p_{i}-\min\limits_{p \in \mathcal{P}}p_{i}}\right|\]
where the numerator and denominator are the absolute differences between the worst and best points for the candidate solution set \(S\) and the Pareto optimal set \(\mathcal{P}\), respectively.
onvg: Overall Nondominated Vector Generation (ONVG) in the front (nds). Higher is better. This indicator evaluates how many optimal solutions are generated by the system. ONVG is formulated as:
\[ONVG(S) = |X_n|\]
where \(|\cdot|\) denotes cardinality and \(X_n\) is the set of non-dominated solutions (nds) in the metric space.
onvgr: Ratio between the number of solutions in nds and in nds + ds. Higher is better. This indicator assesses the proportion of optimal solutions generated by the system. ONVGR is formulated as:
\[ONVGR(S) = \frac{|X_n|}{|S|}\]
where \(|\cdot|\) denotes cardinality, \(X_n\) is the set of non-dominated solutions (nds), and \(S\) is the full solution set (nds + ds).
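Some of the formulas above are easy to check by hand on a small front. The sketch below evaluates UD, OS, AS and ONVGR with the standard library only. It is not the code behind `indicators()`; the function names, the sample front, the niche radius, and the reference set standing in for \(\mathcal{P}\) are all assumptions made for illustration.

```python
import math
from statistics import mean, stdev


def uniform_distribution(front, sigma):
    """UD = 1 / (1 + D_nc), D_nc being the sample std of the niche counts."""
    # Niche count: points closer than sigma, minus the point itself.
    counts = [
        sum(1 for x in front if math.dist(x, xi) < sigma) - 1 for xi in front
    ]
    d_nc = stdev(counts) if len(counts) > 1 else 0.0
    return 1.0 / (1.0 + d_nc)


def extent_ratios(front, reference):
    """Per-metric extent of the front divided by the reference extent."""
    ratios = []
    for i in range(len(front[0])):
        num = max(s[i] for s in front) - min(s[i] for s in front)
        den = max(p[i] for p in reference) - min(p[i] for p in reference)
        ratios.append(abs(num / den))
    return ratios


def overall_spread(front, reference):
    """OS: product of the per-metric extent ratios."""
    return math.prod(extent_ratios(front, reference))


def average_spread(front, reference):
    """AS: mean of the per-metric extent ratios."""
    return mean(extent_ratios(front, reference))


def onvgr(front, all_solutions):
    """ONVGR: share of all solutions that are non-dominated."""
    return len(front) / len(all_solutions)


front = [(0.90, 0.60), (0.85, 0.75), (0.80, 0.95)]
everything = front + [(0.80, 0.70)]
reference = [(0.0, 1.0), (1.0, 0.0)]  # assumed extremes of the optimal front

print(uniform_distribution(front, sigma=0.2))
print(overall_spread(front, reference), average_spread(front, reference))
print(onvgr(front, everything))
```

With this data the niche counts are 1, 1 and 0, so \(D_{nc}=\sqrt{1/3}\); the extent ratios are 0.1 and 0.35, so OS is their product and AS their mean.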
- tabulate()
Generate a table containing the given solutions.
Each table row contains the values for each metric, the threshold used to compute them, and the corresponding identifier name.
- Return type:
str
- Returns:
The generated table as a string.
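The row layout described above (metric values, then threshold, then identifier name) can be sketched with plain string formatting. The actual output format of `tabulate()` is not shown in this page, so the column order, number formatting, and the helper name `render_table` are assumptions.

```python
def render_table(points, thresholds, names):
    """Render one row per solution: metric values, threshold, identifier."""
    headers = list(points) + ["threshold", "identifier"]
    rows = []
    for k in range(len(thresholds)):
        cells = [f"{points[m][k]:.3f}" for m in points]
        rows.append(cells + [f"{thresholds[k]:.3f}", names[k]])
    # Each column is as wide as its longest cell (header included).
    widths = [
        max([len(h)] + [len(r[c]) for r in rows])
        for c, h in enumerate(headers)
    ]
    lines = ["  ".join(h.ljust(w) for h, w in zip(headers, widths))]
    lines += ["  ".join(v.ljust(w) for v, w in zip(r, widths)) for r in rows]
    return "\n".join(line.rstrip() for line in lines)


table = render_table(
    {"acc": [0.90, 0.80], "dp": [0.60, 0.95]},
    thresholds=[0.30, 0.70],
    names=["model-a", "model-b"],
)
print(table)
```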
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.