xtalmet.evaluator module

This module contains the Evaluator class for VSUN calculation.

class xtalmet.evaluator.Evaluator(validity: list[str] | None = None, stability: Literal['continuous', 'binary', None] = None, uniqueness: bool = False, novelty: bool = False, distance: str | None = None, ref_xtals: list[Crystal | Structure] | Literal['mp20'] | str | None = None, agg_func: Literal['prod', 'ave'] = 'prod', weights: dict[str, float] | None = None, multiprocessing: bool = False, n_processes: int | None = None, **kwargs)

Bases: object

Class for evaluating a set of crystals.

The evaluation is based on a chosen combination of validity (V), stability (S), uniqueness (U), and novelty (N).

Initialize the Evaluator.

Parameters:
  • validity (list[str] | None) – Approaches to evaluating validity. The currently supported methods are shown in SUPPORTED_VALIDITY in constants.py. If set to None, validity is not evaluated. Default is None.

  • stability (Literal["continuous", "binary", None]) – Stability evaluation method. “continuous” or “binary” or None. If set to None, stability is not evaluated. Default is None.

  • uniqueness (bool) – Whether to evaluate uniqueness. Default is False.

  • novelty (bool) – Whether to evaluate novelty. Default is False.

  • distance (str | None) – Distance metric used for uniqueness and novelty evaluation. The currently supported distances are listed in SUPPORTED_DISTANCES in constants.py. For more detailed information about each distance metric, please refer to the tutorial notebook. If both uniqueness and novelty are False, this argument is ignored. Default is None.

  • ref_xtals (list[Crystal | Structure] | Literal["mp20"] | str | None) – Reference crystal structures (typically a training set) for novelty evaluation. This can be a list of crystal structures, dataset name, or a path to the file containing the pre-computed embeddings of the reference structures. If a dataset name is given, its training data will be downloaded from Hugging Face. If novelty is False, this argument is ignored. Default is None.

  • agg_func (Literal["prod", "ave"]) – Aggregation function for combining V, S, U, and N. “prod” means multiplication, and “ave” means (weighted) average. Default is “prod”.

  • weights (dict[str, float] | None) – Weights for V, S, U, and N when agg_func is “ave”. For example, {“validity”: 0.2, “stability”: 0.3, “uniqueness”: 0.2, “novelty”: 0.3}. You only need to provide weights for the metrics you choose to evaluate. If the weights are not normalized, they will be normalized internally. If None, equal weights are used. Default is None.

  • multiprocessing (bool) – Whether to use multiprocessing for computing the embeddings of reference crystals. This argument is only effective when novelty is True and ref_xtals is list[Crystal | Structure]. Default is False.

  • n_processes (int | None) – Maximum number of processes to use for multiprocessing. If None, the number of logical CPU cores - 1 will be used. We recommend setting this argument to a smaller number than the number of available CPU cores to avoid out-of-memory. If multiprocessing is False, this argument is ignored. Default is None.

  • **kwargs – Additional keyword arguments. They can contain four keys: “args_validity”, “args_stability”, “args_emb”, and “args_dist”. “args_validity” is for the validity evaluation, while “args_stability” is for the stability evaluation. “args_emb” and “args_dist” are for the distance metric used in uniqueness and novelty evaluation: the former is for the embedding calculation, and the latter is for the distance matrix calculation between embeddings. For more details, please refer to the tutorial notebook.
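The weight handling described for weights above can be pictured with a small sketch. It assumes that "normalized internally" means dividing each weight by the sum over the evaluated metrics. normalize_weights is a hypothetical helper written for illustration and is not part of xtalmet.

```python
def normalize_weights(metrics, weights=None):
    """Return weights over `metrics` that sum to 1.

    Hypothetical helper: assumes normalization divides by the total weight.
    """
    if weights is None:
        # Equal weights when none are given.
        return {m: 1.0 / len(metrics) for m in metrics}
    total = sum(weights[m] for m in metrics)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {m: weights[m] / total for m in metrics}


# Unnormalized weights for a two-metric evaluation.
print(normalize_weights(["stability", "novelty"], {"stability": 3.0, "novelty": 1.0}))
# {'stability': 0.75, 'novelty': 0.25}
```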

Examples

>>> # Evaluator for the conventional SUN metric against the MP20 dataset
>>> # using the StructureMatcher distance.
>>> evaluator = Evaluator(
...     stability="binary",
...     uniqueness=True,
...     novelty=True,
...     distance="smat",
...     ref_xtals="mp20",
...     agg_func="prod",
... )
>>> # Evaluator for the VSUN metric against a custom reference dataset using
>>> # the ElMD distance, with average aggregation.
>>> evaluator = Evaluator(
...     validity=["smact", "structure"],
...     stability="continuous",
...     uniqueness=True,
...     novelty=True,
...     distance="elmd",
...     ref_xtals=ref_xtals,  # list[Crystal | Structure]
...     agg_func="ave",
...     weights={
...         "validity": 0.25,
...         "stability": 0.25,
...         "uniqueness": 0.25,
...         "novelty": 0.25,
...     },
...     multiprocessing=True,
...     n_processes=10,
... )
>>> # Evaluator for the VSUN metric against the MP20 dataset using the AMD
>>> # distance, with custom kwargs.
>>> evaluator = Evaluator(
...     validity=["smact", "structure"],
...     stability="continuous",
...     uniqueness=True,
...     novelty=True,
...     distance="amd",
...     ref_xtals="mp20",
...     agg_func="prod",
...     **{
...         "args_validity": {
...             "structure": {
...                 "threshold_distance": 0.5,
...                 "threshold_volume": 0.1,
...             }
...         },
...         "args_stability": {
...             "diagram": "mp_250618",
...             "mace_model": "medium-mpa-0",
...             "intercept": 0.4289,
...         },
...         "args_emb": {"k": 100},
...         "args_dist": {"metric": "chebyshev", "low_memory": False},
...     }
... )
>>> # Evaluator for the SUN metric against the MP20 dataset using the
>>> # ElMD+AMD distance, with custom kwargs.
>>> evaluator = Evaluator(
...     stability="continuous",
...     uniqueness=True,
...     novelty=True,
...     distance="elmd+amd",
...     ref_xtals="mp20",
...     agg_func="prod",
...     **{
...         "args_validity": {
...             "structure": {
...                 "threshold_distance": 0.5,
...                 "threshold_volume": 0.1,
...             }
...         },
...         "args_stability": {
...             "diagram": "mp_250618",
...             "mace_model": "medium-mpa-0",
...             "intercept": 0.4289,
...         },
...         "args_emb": {"amd": {"k": 100}},
...         "args_dist": {
...             "elmd": {"metric": "mod_petti"},
...             "amd": {"metric": "chebyshev", "low_memory": False},
...             "coefs": {
...                 "elmd": float.fromhex("0x1.8d7d565a99f87p-1"),
...                 "amd": float.fromhex("0x1.ca0aa695981e5p-3"),
...             },
...         },
...     }
... )
evaluate(xtals: list[Crystal | Structure], dir_intermediate: str | None = None, multiprocessing: bool = False, n_processes: int | None = None) → tuple[float, ndarray, dict[str, float]]

Evaluate the given crystal structures.

Parameters:
  • xtals (list[Crystal | Structure]) – List of crystal structures to be evaluated.

  • dir_intermediate (str | None) – Directory to search for pre-computed intermediate results, such as validity scores, energy above hulls, embeddings, and distance matrices. If pre-computed files do not exist in the directory, they will be computed and saved to the directory for future use. If set to None, no files will be loaded or saved. It is recommended to set this argument when evaluating the same large set of crystal structures multiple times, for example trying different aggregation functions. The intermediate results can be shared as long as the same set of crystals is evaluated. Default is None.

  • multiprocessing (bool) – Whether to use multiprocessing for embedding and distance matrix computation. This argument is only effective when uniqueness or novelty evaluation is enabled. Default is False.

  • n_processes (int | None) – Maximum number of processes to use for multiprocessing. If None, the number of logical CPU cores - 1 will be used. We recommend setting this argument to a smaller number than the number of available CPU cores to avoid out-of-memory. If multiprocessing is False, this argument is ignored. Default is None.

Returns:

A tuple containing the overall score (float), individual scores for each crystal structure, and a dictionary of computation times for each evaluation component.

Return type:

tuple[float, np.ndarray, dict[str, float]]

Examples

>>> # Evaluate the conventional SUN metric using the StructureMatcher
>>> # distance ("smat") against the MP20 dataset.
>>> evaluator = Evaluator(
...     stability="binary",
...     uniqueness=True,
...     novelty=True,
...     distance="smat",
...     ref_xtals="mp20",
...     agg_func="prod",
... )
>>> evaluator.evaluate(
...     xtals=xtals,  # list[Crystal | Structure]
...     dir_intermediate="intermediate_results/",
...     multiprocessing=True,
...     n_processes=10,
... )
>>> 0.28, np.array([...]), {"aggregation": ..., ...}
>>> # Evaluate the VSUN metric using the ElMD distance against a custom
>>> # reference dataset, with average aggregation.
>>> evaluator = Evaluator(
...     validity=["smact", "structure"],
...     stability="continuous",
...     uniqueness=True,
...     novelty=True,
...     distance="elmd",
...     ref_xtals=ref_xtals,  # list[Crystal | Structure]
...     agg_func="ave",
...     weights={
...         "validity": 0.25,
...         "stability": 0.25,
...         "uniqueness": 0.25,
...         "novelty": 0.25,
...     },
...     multiprocessing=True,
...     n_processes=10,
... )
>>> evaluator.evaluate(
...     xtals=xtals,  # list[Crystal | Structure]
...     dir_intermediate=None,
...     multiprocessing=False,
...     n_processes=None,
... )
>>> 0.6119424269941065, np.array([...]), {"aggregation": ..., ...}
>>> # Evaluate the VSUN metric using the AMD distance against the MP20
>>> # dataset, with custom kwargs.
>>> evaluator = Evaluator(
...     validity=["smact", "structure"],
...     stability="continuous",
...     uniqueness=True,
...     novelty=True,
...     distance="amd",
...     ref_xtals="mp20",
...     agg_func="prod",
...     **{
...         "args_validity": {
...             "structure": {
...                 "threshold_distance": 0.5,
...                 "threshold_volume": 0.1,
...             }
...         },
...         "args_stability": {
...             "diagram": "mp_250618",
...             "mace_model": "medium-mpa-0",
...             "intercept": 0.4289,
...         },
...         "args_emb": {"k": 100},
...         "args_dist": {"metric": "chebyshev", "low_memory": False},
...     }
... )
>>> evaluator.evaluate(
...     xtals=xtals,  # list[Crystal | Structure]
...     dir_intermediate="intermediate_results/",
...     multiprocessing=True,
...     n_processes=10,
... )
>>> 0.019558713928249892, np.array([...]), {"aggregation": ..., ...}
>>> # Evaluate the SUN metric using the ElMD+AMD distance against the MP20
>>> # dataset, with custom kwargs.
>>> evaluator = Evaluator(
...     stability="continuous",
...     uniqueness=True,
...     novelty=True,
...     distance="elmd+amd",
...     ref_xtals="mp20",
...     agg_func="prod",
...     **{
...         "args_validity": {
...             "structure": {
...                 "threshold_distance": 0.5,
...                 "threshold_volume": 0.1,
...             }
...         },
...         "args_stability": {
...             "diagram": "mp_250618",
...             "mace_model": "medium-mpa-0",
...             "intercept": 0.4289,
...         },
...         "args_emb": {"amd": {"k": 100}},
...         "args_dist": {
...             "elmd": {"metric": "mod_petti"},
...             "amd": {"metric": "chebyshev", "low_memory": False},
...             "coefs": {
...                 "elmd": float.fromhex("0x1.8d7d565a99f87p-1"),
...                 "amd": float.fromhex("0x1.ca0aa695981e5p-3"),
...             },
...         },
...     }
... )
>>> evaluator.evaluate(
...     xtals=xtals,  # list[Crystal | Structure]
...     dir_intermediate="intermediate_results/",
...     multiprocessing=True,
...     n_processes=10,
... )
>>> 0.16403383975840835, np.array([...]), {"aggregation": ..., ...}

Note

This note describes how VSUN (or any subset of its components) is computed. The validity \(V(x)\) of each crystal \(x\) is 0 (invalid) or 1 (valid), depending on whether it passes the specified validity checks. If you specify multiple validity methods \(V_1(x)\), \(V_2(x)\), …, the overall validity is the product of the individual validity scores:

\[V(x) = \prod_i V_i(x)\]
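As a minimal sketch of the product rule above, the following combines per-method validity flags for one crystal. overall_validity is a hypothetical helper for illustration and is not the library's implementation. The flags stand in for the results of whichever validity methods were selected.

```python
import math


def overall_validity(checks):
    """Product of binary validity scores (each 0 or 1) for one crystal."""
    return math.prod(checks)


print(overall_validity([1, 1]))  # 1: passes every check
print(overall_validity([1, 0]))  # 0: fails at least one check
```

Because every factor is 0 or 1, the product acts as a logical AND: a single failed check makes the crystal invalid.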

Stability \(S(x)\) is computed from the energy above hull, and has two variants: binary and continuous. The binary stability score \(S_b(x)\) is defined as follows.

\[\begin{split}S_b(x) = \begin{cases} 1 & \text{if } E_\text{hull}(x) \le \text{threshold} \\ 0 & \text{otherwise} \end{cases}\end{split}\]

“threshold” can be specified in kwargs when initializing the Evaluator. The default value is 0.1 [eV/atom]. The continuous stability score \(S_c(x)\) is defined as follows.

\[\begin{split}S_c(x) = \begin{cases} 1 & \text{if } E_\text{hull}(x) \le 0 \\ 1 - \frac{E_\text{hull}(x)}{\text{intercept}} & \text{if } 0 < E_\text{hull}(x) \le \text{intercept} \\ 0 & \text{otherwise} \end{cases}\end{split}\]
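The two stability variants defined above can be sketched as plain functions of the energy above hull (in eV/atom). The function names are hypothetical. The default values mirror the documented defaults for “threshold” and “intercept”. This is an illustration of the formulas, not the library's code.

```python
def binary_stability(e_hull, threshold=0.1):
    """S_b: 1 if the energy above hull is within the threshold, else 0."""
    return 1.0 if e_hull <= threshold else 0.0


def continuous_stability(e_hull, intercept=0.4289):
    """S_c: 1 on or below the hull, decaying linearly to 0 at `intercept`."""
    if e_hull <= 0:
        return 1.0
    if e_hull <= intercept:
        return 1.0 - e_hull / intercept
    return 0.0


print(binary_stability(0.05))                      # 1.0
print(binary_stability(0.2))                       # 0.0
print(continuous_stability(0.25, intercept=0.5))   # 0.5
print(continuous_stability(0.8, intercept=0.5))    # 0.0
```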

“intercept” can be specified in kwargs when initializing the Evaluator. The default value is 0.4289 [eV/atom]. In both cases, a score closer to 1 indicates a more stable structure.

The definition of uniqueness \(U(x)\) depends on the chosen distance metric. For a binary distance \(d_b\), the uniqueness of the i-th crystal \(x_i\) in the set of crystals \(\{x_1, x_2, \ldots, x_n\}\) is defined as follows.

\[U_b(x_i) = I \left(\bigwedge_{j=1}^{i-1} d_b(x_i, x_j) \neq 0 \right),\]

where \(I\) is the indicator function. For a continuous distance \(d_c\), uniqueness is defined as follows.

\[U_c(x_i) = \frac{1}{n-1} \sum_{j=1}^{n} d_c(x_i, x_j).\]
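Both uniqueness definitions can be illustrated on a precomputed pairwise distance matrix. The sketch below uses nested lists and hypothetical function names. It assumes at least two crystals and a zero diagonal (each crystal is at distance 0 from itself), so the \(j = i\) term contributes nothing to the sum.

```python
def binary_uniqueness(d):
    """U_b: crystal i is unique if it differs from every earlier crystal j < i.

    d[i][j] is a binary distance: 0 means "same", 1 means "different".
    """
    return [float(all(d[i][j] != 0 for j in range(i))) for i in range(len(d))]


def continuous_uniqueness(d):
    """U_c: sum of distances from crystal i to all crystals, divided by n - 1."""
    n = len(d)
    return [sum(row) / (n - 1) for row in d]


# Crystal 2 duplicates crystal 0, so only the later copy is penalized.
d_b = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(binary_uniqueness(d_b))      # [1.0, 1.0, 0.0]

d_c = [[0.0, 0.5, 1.0], [0.5, 0.0, 0.5], [1.0, 0.5, 0.0]]
print(continuous_uniqueness(d_c))  # [0.75, 0.5, 0.75]
```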

In both cases, the score ranges from 0 to 1, since the binary distance takes values of 0 or 1, and the continuous distance is normalized to be between 0 and 1. A higher score indicates a more unique structure. Novelty \(N(x)\) is defined similarly to uniqueness, but against a reference set of crystals. For a binary distance \(d_b\), novelty of the i-th crystal \(x_i\) is defined as follows.

\[N_b(x_i) = I \left(\bigwedge_{j=1}^{m} d_b(x_i, y_j) \neq 0 \right),\]

where \(\{y_1, y_2, \ldots, y_m\}\) is the reference set of crystals. For a continuous distance \(d_c\), novelty is defined as follows.

\[N_c(x_i) = \min_{j=1 \ldots m} d_c(x_i, y_j).\]
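The novelty definitions admit a similar sketch. Each row of the input holds the distances from one generated crystal to every reference crystal. The function names are hypothetical, and the distance values are made up for illustration.

```python
def binary_novelty(d):
    """N_b: crystal i is novel if it differs from every reference crystal.

    d[i][j] is the binary distance between generated crystal i and reference j.
    """
    return [float(all(v != 0 for v in row)) for row in d]


def continuous_novelty(d):
    """N_c: distance from crystal i to its nearest reference crystal."""
    return [min(row) for row in d]


print(binary_novelty([[1, 1], [1, 0]]))              # [1.0, 0.0]
print(continuous_novelty([[0.3, 0.8], [0.0, 0.6]]))  # [0.3, 0.0]
```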

In both cases, the score ranges from 0 to 1, and a higher score indicates a more novel structure. Finally, the VSUN score of each crystal \(x\) is computed by aggregating the individual scores using either multiplication or (weighted) average:

\[\begin{split}\text{VSUN}(x) = \begin{cases} V(x) S(x) U(x) N(x) & \text{if agg_func = "prod"} \\ w_V V(x) + w_S S(x) + w_U U(x) + w_N N(x) & \text{if agg_func = "ave"} \end{cases}\end{split}\]

where \(w_V\), \(w_S\), \(w_U\), and \(w_N\) are the normalized weights for validity, stability, uniqueness, and novelty, respectively. If only a subset of VSUN is evaluated, each crystal’s score is computed in the same manner using only the specified components. The overall score for the entire set of crystals is then obtained by averaging the individual scores.

\[\text{VSUN} = \frac{1}{n} \sum_{i=1}^{n} \text{VSUN}(x_i)\]
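To make the aggregation concrete, the sketch below combines per-crystal component scores and then averages over the set. The function names and the dictionary layout are hypothetical, and weights are assumed to be normalized by dividing by their sum. It illustrates the formulas above rather than the library's internals.

```python
import math


def vsun_per_crystal(scores, agg_func="prod", weights=None):
    """Aggregate component scores per crystal.

    scores maps a metric name to a list of per-crystal scores in [0, 1].
    Only the metrics present in `scores` take part in the aggregation.
    """
    names = list(scores)
    n = len(scores[names[0]])
    if agg_func == "prod":
        return [math.prod(scores[m][i] for m in names) for i in range(n)]
    if agg_func == "ave":
        if weights is None:
            weights = {m: 1.0 for m in names}  # equal weights
        total = sum(weights[m] for m in names)
        return [
            sum(weights[m] / total * scores[m][i] for m in names)
            for i in range(n)
        ]
    raise ValueError(f"unknown agg_func: {agg_func!r}")


def vsun_overall(per_crystal):
    """Average of the per-crystal scores."""
    return sum(per_crystal) / len(per_crystal)


scores = {"stability": [1.0, 0.5], "novelty": [0.5, 0.0]}
prod = vsun_per_crystal(scores, "prod")
ave = vsun_per_crystal(scores, "ave")
print(prod, vsun_overall(prod))  # [0.5, 0.0] 0.25
print(ave, vsun_overall(ave))    # [0.75, 0.25] 0.5
```

With “prod”, a single zero component (for example, a duplicate under a binary distance) zeroes the crystal's score, whereas “ave” lets the other components still contribute.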