explanations
desdeo.explanations.explainer
Explainers are defined here.
ShapExplainer
Defines a SHAP explainer for reference point based methods.
Source code in desdeo/explanations/explainer.py
__init__
Initialize the explainer.
Initializes the explainer with given data, and input and output symbols. The data should contain the columns listed in the input and output symbols. This data is then used to simulate the inputs and outputs of an (interactive) multiobjective optimization method, which is used to explain the relation of its inputs and outputs using SHAP values.
Note
The data can be generated for a reference point based method by, e.g.,
randomly sampling the input space and then evaluating the method with the
sampled inputs to generate outputs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| problem_data | DataFrame | the data to simulate the input and outputs of a multiobjective optimization method. | required |
| input_symbols | list[str] | the input symbols present in | required |
| output_symbols | list[str] | the output symbols present in | required |
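The sampling approach mentioned in the note can be sketched as follows. This is a minimal illustration only: `toy_method`, the column names `r_1`, `r_2`, `f_1`, `f_2`, and the sampling range are all hypothetical stand-ins and not part of DESDEO.

```python
import random

# Hypothetical column names: reference point components and objective values.
INPUT_SYMBOLS = ["r_1", "r_2"]
OUTPUT_SYMBOLS = ["f_1", "f_2"]


def toy_method(reference_point):
    # Hypothetical stand-in for a reference point based method:
    # clamps each aspiration level into the range [0.2, 0.8].
    return [min(max(value, 0.2), 0.8) for value in reference_point]


def sample_rows(n_samples, seed=0):
    # Randomly sample the input space and evaluate the method on each sample.
    rng = random.Random(seed)
    rows = []
    for _ in range(n_samples):
        reference_point = [rng.uniform(0.0, 1.0) for _ in INPUT_SYMBOLS]
        outputs = toy_method(reference_point)
        row = dict(zip(INPUT_SYMBOLS, reference_point))
        row.update(zip(OUTPUT_SYMBOLS, outputs))
        rows.append(row)
    return rows


rows = sample_rows(200)
```

Passing `rows` to `pandas.DataFrame(rows)` would give a dataframe with one column per symbol, which is the shape of data that `problem_data` is described as expecting.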
evaluate
Evaluates the multiobjective optimization method represented by the data.
Note
Evaluation happens by finding the closest matching input array in the
self.input_array and then using that value's corresponding output
as the evaluation result. Closest means lowest Euclidean distance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| evaluate_array | ndarray | the inputs to the method represented by the data. Can be either a single input, or an array of multiple inputs. Used mainly by | required |
Returns:
| Type | Description |
|---|---|
| ndarray | the evaluated output(s) corresponding to the input data. |
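The nearest-match lookup described in the note can be sketched with NumPy as below. The function name `nearest_output` and the example arrays are hypothetical, and the sketch always returns a 2D array, which may differ from what the actual method returns for a single input.

```python
import numpy as np


def nearest_output(input_array, output_array, evaluate_array):
    # Accept either a single input (1D) or several inputs (2D).
    queries = np.atleast_2d(np.asarray(evaluate_array, dtype=float))
    # Pairwise differences, shape (n_queries, n_samples, n_inputs).
    diffs = queries[:, None, :] - input_array[None, :, :]
    # Euclidean distance from every query to every stored input.
    distances = np.linalg.norm(diffs, axis=2)
    # Index of the closest stored input for each query.
    closest = np.argmin(distances, axis=1)
    return output_array[closest]


# Hypothetical stored inputs and their corresponding outputs.
input_array = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
output_array = np.array([[10.0], [20.0], [30.0]])

single = nearest_output(input_array, output_array, [0.9, 1.2])
batch = nearest_output(input_array, output_array, [[0.1, 0.0], [1.9, 2.1]])
```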
explain_input
Explains an input and produces SHAP values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| to_be_explained | DataFrame | the input to be explained. The dataframe must have the columns defined in | required |
Returns:
| Name | Type | Description |
|---|---|---|
| dict | dict | the key 'shaps' corresponds to the computed SHAP values for the input, the key 'base_values' is the baseline the SHAP values were computed against, and the key 'data' is the input the SHAP values were computed for. |
setup
Set up the explainer.
Sets up the SHAP explainer with the given background data. The
background data should have the columns self.input_symbols. The
background data is used as the background (or missing data) when
computing SHAP values. The mean (or expected values) of the background
data's output (self.output_symbols) will determine the baseline of the
SHAP values.
Note
To generate a dataset with meaningful expected values, e.g., in case
the SHAP values are better understood by relating them to a specific baseline,
see desdeo.explanations.generate_biased_mean_data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| background_data | DataFrame | the background data. | required |
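To illustrate how the background data sets the baseline, the sketch below computes exact Shapley values by enumerating all feature coalitions for a small toy function. This is an assumption-laden illustration and not the computation the explainer performs: `model`, the background rows, and the explained input are hypothetical, only a single output is considered, and exhaustive enumeration is only feasible for a handful of inputs.

```python
from itertools import combinations
from math import factorial


def model(x):
    # Hypothetical stand-in for the method being explained (single output).
    return 2.0 * x[0] + x[0] * x[1] - x[2]


def coalition_value(x, background, subset):
    # Features in `subset` are taken from x; the "missing" features are
    # filled in from each background row, and the outputs are averaged.
    total = 0.0
    for row in background:
        mixed = [x[i] if i in subset else row[i] for i in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(x, background):
    n = len(x)
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Standard Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = coalition_value(x, background, set(subset) | {i})
                without_i = coalition_value(x, background, set(subset))
                phi += weight * (with_i - without_i)
        values.append(phi)
    return values


# Hypothetical background data and input to be explained.
background = [[0.0, 1.0, 2.0], [2.0, 3.0, 0.0], [1.0, 2.0, 1.0]]
x = [3.0, 1.0, 2.0]

# The baseline is the mean output over the background data.
base_value = coalition_value(x, background, set())
shaps = shapley_values(x, background)
```

In this sketch the baseline plus the sum of the Shapley values equals `model(x)`, which is why the choice of background data changes how the values should be read.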
desdeo.explanations.lagrange
Utilities for working with Lagrange multipliers in explainable multiobjective optimization.
compute_tradeoffs
compute_tradeoffs(
filtered_multipliers: dict[str, float] | None,
) -> dict[str, dict[str, float]] | None
Compute a tradeoff matrix from filtered Lagrange multipliers.
Tradeoffs represent marginal rates of substitution between objectives.
For objectives i and j: tradeoff[i][j] = -lambda_j / lambda_i.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| filtered_multipliers | dict[str, float] \| None | Dict mapping objective keys to their multiplier values. Typically the output of | required |
Returns:
| Type | Description |
|---|---|
| dict[str, dict[str, float]] \| None | A nested dict where |
| dict[str, dict[str, float]] \| None | objective i on objective j. Diagonal entries are 1.0. |
| dict[str, dict[str, float]] \| None | Returns |
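The tradeoff formula can be illustrated with a short sketch. The function name `tradeoff_matrix` and the example multipliers are hypothetical. The sketch assumes every multiplier is nonzero and that a `None` input yields `None`; how the library handles zero multipliers is not covered here.

```python
def tradeoff_matrix(multipliers):
    # Assumption: a missing multiplier dict yields no tradeoff matrix.
    if multipliers is None:
        return None
    matrix = {}
    for i, lambda_i in multipliers.items():
        matrix[i] = {}
        for j, lambda_j in multipliers.items():
            # Diagonal entries are 1.0; off-diagonal entries follow
            # tradeoff[i][j] = -lambda_j / lambda_i.
            matrix[i][j] = 1.0 if i == j else -lambda_j / lambda_i
    return matrix


# Hypothetical filtered multipliers for two objectives.
tradeoffs = tradeoff_matrix({"f_1": 0.5, "f_2": 0.25})
```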
Source code in desdeo/explanations/lagrange.py
determine_active_objectives
determine_active_objectives(
lagrange_multipliers: list[dict[str, float] | None],
constraint_values: list[dict[str, float] | None]
| None = None,
objective_symbols: list[str] | None = None,
threshold: float = 1e-05,
) -> list[list[str]]
Determine which objectives are active (binding) for each solution.
An objective is considered active if its corresponding constraint is binding (constraint value >= 0) or, when constraint values are not available, if its multiplier magnitude exceeds a threshold.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| lagrange_multipliers | list[dict[str, float] \| None] | List of filtered multiplier dicts, one per solution. | required |
| constraint_values | list[dict[str, float] \| None] \| None | Optional list of filtered constraint value dicts. | None |
| objective_symbols | list[str] \| None | Optional list of known objective symbols. | None |
| threshold | float | Multiplier magnitude threshold for the fallback heuristic. | 1e-05 |
Returns:
| Type | Description |
|---|---|
| list[list[str]] | A list of lists, where each inner list contains the symbols of the active objectives for the corresponding solution. |
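The rule described above (constraint values when available, multiplier magnitude otherwise) can be sketched as follows. The function name `active_objectives` and the example values are hypothetical, and the sketch ignores the `objective_symbols` argument, so it should be read as an illustration of the rule rather than the library's implementation.

```python
def active_objectives(multipliers, constraint_values=None, threshold=1e-5):
    result = []
    for index, solution_multipliers in enumerate(multipliers):
        constraints = None
        if constraint_values is not None:
            constraints = constraint_values[index]
        if constraints is not None:
            # Constraint values available: active when the value is >= 0.
            active = [key for key, value in constraints.items() if value >= 0]
        elif solution_multipliers is not None:
            # Fallback heuristic: multiplier magnitude above the threshold.
            active = [
                key
                for key, value in solution_multipliers.items()
                if abs(value) > threshold
            ]
        else:
            active = []
        result.append(active)
    return result


# Hypothetical filtered multipliers and constraint values for two solutions.
example_multipliers = [{"f_1": 0.3, "f_2": 1e-9}, {"f_1": 0.0, "f_2": 0.2}]
example_constraints = [{"f_1": -0.1, "f_2": 0.0}, None]

by_multiplier = active_objectives(example_multipliers)
by_constraint = active_objectives(example_multipliers, example_constraints)
```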
filter_constraint_values
filter_constraint_values(
constraint_values: dict[str, float],
objective_symbols: list[str] | None = None,
) -> dict[str, float]
Filter constraint values to keep one representative per objective.
Uses the same grouping logic as filter_lagrange_multipliers, but applied
to constraint values. Used to determine which constraints are active.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| constraint_values | dict[str, float] | Raw constraint value dict from SolverResults. | required |
| objective_symbols | list[str] \| None | Optional list of objective symbols for grouping. | None |
Returns:
| Type | Description |
|---|---|
| dict[str, float] | A dict mapping each objective key to its filtered constraint value. |
filter_lagrange_multipliers
filter_lagrange_multipliers(
lagrange_multipliers: dict[str, float],
objective_symbols: list[str] | None = None,
) -> dict[str, float]
Filter raw Lagrange multipliers to keep one representative per objective.
Solver results may contain multiple multiplier entries per objective (e.g., from equality and inequality constraints). This function groups them and selects a preferred one (non-equality constraint preferred).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| lagrange_multipliers | dict[str, float] | Raw multiplier dict from SolverResults. | required |
| objective_symbols | list[str] \| None | If provided, use these to group multipliers by objective symbol. Otherwise, group by | None |
Returns:
| Type | Description |
|---|---|
| dict[str, float] | A dict mapping each objective key to its filtered multiplier value. Missing objectives are assigned 0.0. |
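The idea of keeping one representative per objective can be sketched as below. The key format is an assumption made purely for illustration: raw keys are taken to look like `<symbol>_<kind>`, with the hypothetical kinds `eq` (equality) and `lte` (inequality). The actual keys in solver results may follow a different convention, and the function name `pick_one_per_objective` is likewise hypothetical.

```python
def pick_one_per_objective(raw_multipliers, objective_symbols):
    filtered = {}
    for symbol in objective_symbols:
        # Group raw entries by objective symbol (hypothetical key format:
        # everything before the last underscore is the symbol).
        group = {
            key: value
            for key, value in raw_multipliers.items()
            if key.rpartition("_")[0] == symbol
        }
        # Prefer an entry that does not come from an equality constraint.
        preferred = [key for key in group if not key.endswith("_eq")]
        if preferred:
            filtered[symbol] = group[preferred[0]]
        elif group:
            filtered[symbol] = next(iter(group.values()))
        else:
            # Missing objectives are assigned 0.0.
            filtered[symbol] = 0.0
    return filtered


# Hypothetical raw multipliers with two entries for f_1 and none for f_3.
raw = {"f_1_eq": 0.7, "f_1_lte": 0.2, "f_2_eq": 0.4}
filtered = pick_one_per_objective(raw, ["f_1", "f_2", "f_3"])
```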
desdeo.explanations.utils
Utilities specific to explainable multiobjective optimization.
generate_biased_mean_data
generate_biased_mean_data(
data: ndarray,
target_means: ndarray,
min_size: int = 2,
max_size: int | None = None,
solver: str = "SCIP",
) -> list | None
Finds a subset of the provided data that has a mean value close to the provided target values.
Formulates a mixed-integer quadratic programming problem to find a subset of data
with a mean as close as possible to target_means and a size between min_size and
max_size. In other words, out of the \(n\) rows and \(m\) columns in data, the
function selects a subset of \(k\) rows whose column-wise mean is as close as
possible to the target means. Closeness to the target means is based on the
Euclidean distance.
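A formulation consistent with this description can be written as follows. The binary selection variables \(x_i\), the data entries \(d_{ij}\), and the target means \(t_j\) are notation assumed here. The exact form of the objective (for example, whether the distance is squared) and how the division by \(k\) is handled for the solver are not specified above.

```latex
\begin{aligned}
\min_{x \in \{0,1\}^n} \quad & \sqrt{\sum_{j=1}^{m} \left( \frac{1}{k} \sum_{i=1}^{n} x_i d_{ij} - t_j \right)^{2}} \\
\text{s.t.} \quad & k = \sum_{i=1}^{n} x_i, \\
& \text{min\_size} \le k \le \text{max\_size}.
\end{aligned}
```

Here \(x_i = 1\) means that row \(i\) of data belongs to the subset, so the inner sum divided by \(k\) is the mean of column \(j\) over the selected rows.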
Note
Be mindful that this function can take a long time with a very large data set and a large upper bound for the desired subset size.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | ndarray | the data from which to generate the subset with a biased mean. Should be a 2D array with each row representing a sample and each column the value of the variables in the sample. | required |
| target_means | ndarray | the target mean values, one for each column, that the generated subset's means should be close to. | required |
| min_size | int | the minimum size of the generated subset. Defaults to 2. | 2 |
| max_size | int \| None | the maximum size of the generated subset. If None, then the maximum size is bound by the number of rows in | None |
| solver | str | the selected solver to be used by cvxpy. The solver should support mixed-integer quadratic programming. Defaults to "SCIP". | 'SCIP' |
Returns:
| Type | Description |
|---|---|
| list \| None | the indices of the samples of the generated subset with respect to |
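For intuition about what is being optimized, the sketch below finds the same kind of subset by brute-force enumeration rather than by mixed-integer quadratic programming. It is a deliberately different technique, practical only for very small data sets. The function name `closest_mean_subset` and the example data are hypothetical.

```python
from itertools import combinations
from math import dist


def closest_mean_subset(data, target_means, min_size=2, max_size=None):
    n_rows = len(data)
    # Without an upper bound, the subset can be at most all rows.
    max_size = n_rows if max_size is None else min(max_size, n_rows)
    best_indices, best_distance = None, float("inf")
    for size in range(min_size, max_size + 1):
        for indices in combinations(range(n_rows), size):
            # Column-wise mean of the candidate subset.
            mean = [
                sum(data[i][j] for i in indices) / size
                for j in range(len(target_means))
            ]
            # Euclidean distance between the subset mean and the targets.
            distance = dist(mean, target_means)
            if distance < best_distance:
                best_indices, best_distance = list(indices), distance
    # None when no subset of an allowed size exists.
    return best_indices


# Hypothetical data: four samples with two columns each.
example_data = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [10.0, 10.0]]
subset = closest_mean_subset(example_data, [1.5, 1.5])
```

The number of candidate subsets grows combinatorially with the number of rows and the allowed subset size, which is in line with the warning above about long run times.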