xtalmet package

Package containing a variety of crystal distance functions and evaluation metrics.

class xtalmet.Crystal(lattice: ArrayLike | Lattice, species: Sequence[str | Element | Species | DummySpecies | dict | Composition], coords: Sequence[ArrayLike], charge: float | None = None, validate_proximity: bool = False, to_unit_cell: bool = False, coords_are_cartesian: bool = False, site_properties: dict | None = None, labels: Sequence[str | None] | None = None, properties: dict | None = None)

Bases: Structure

Container for a single crystal structure.

Initialize a Crystal object.

The parameters are the same as those used in the __init__() method of the pymatgen.core.Structure class.

Parameters:
  • lattice (Lattice/3x3 array) – The lattice, either as a pymatgen.core.Lattice or simply as any 2D array. Each row should correspond to a lattice vector. e.g. [[10,0,0], [20,10,0], [0,0,30]] specifies a lattice with lattice vectors [10,0,0], [20,10,0] and [0,0,30].

  • species ([Species]) – Sequence of species on each site. Can take in flexible input, including:

    1. A sequence of elements/species specified either as string symbols, e.g. ["Li", "Fe2+", "P", …], as atomic numbers, e.g. (3, 56, …), or as actual Element or Species objects.

    2. A list of dicts of elements/species and occupancies, e.g. [{"Fe": 0.5, "Mn": 0.5}, …]. This allows the setup of disordered structures.

  • coords (Nx3 array) – list of fractional/Cartesian coordinates of each species.

  • charge (float | None) – Overall charge of the structure. Defaults to the behavior in SiteCollection, where the total charge is the sum of the oxidation states.

  • validate_proximity (bool) – Whether to check if there are sites that are less than 0.01 Ang apart. Defaults to False.

  • to_unit_cell (bool) – Whether to map all sites into the unit cell, i.e. fractional coords between 0 and 1. Defaults to False.

  • coords_are_cartesian (bool) – Set to True if you are providing coordinates in Cartesian coordinates. Defaults to False.

  • site_properties (dict) – Properties associated with the sites as a dict of sequences, e.g. {"magmom": [5, 5, 5, 5]}. The sequences have to be the same length as the atomic species and fractional_coords. Defaults to None for no properties.

  • labels (list[str]) – Labels associated with the sites as a list of strings, e.g. ["Li1", "Li2"]. Must have the same length as the species and fractional coords. Defaults to None for no labels.

  • properties (dict) – Properties associated with the whole structure. Will be serialized when writing the structure to JSON or YAML but is lost when converting to other formats.

Note

The argument descriptions are copied from the pymatgen.core.Structure class.

classmethod from_Structure(structure: Structure) → Crystal

Create a Crystal object from a pymatgen Structure object.

Parameters:

structure (Structure) – A pymatgen Structure object.

Returns:

A Crystal object created from the Structure.

Return type:

Crystal

get_embedding(distance: str, **kwargs) → tuple[tuple[str, int]] | tuple[int, tuple[str]] | list[float] | ndarray[float32 | float64] | str | tuple[str, ndarray[float32 | float64]]

Get the embedding of the crystal based on the specified distance metric.

Parameters:
  • distance (str) – The distance metric to use.

  • **kwargs – Additional arguments for embedding methods if needed.

Returns:

The embedding corresponding to the specified distance metric.

Return type:

TYPE_EMB_ALL

Raises:

ValueError – If an unsupported distance metric is provided.

Note

For “smat” distance, the embedding is the Crystal object itself.

get_composition_pymatgen() → Composition

Get the pymatgen composition of the crystal.

Called when screening using SMACT or E_hull.

Returns:

Pymatgen Composition object.

Return type:

Composition

get_ase_atoms() → Atoms

Get the ASE Atoms object of the crystal.

Called when screening using E_hull. Not influenced by oxidation states.

Returns:

ASE Atoms object.

Return type:

Atoms

xtalmet.distance(distance: str, xtal_1: Structure | Crystal | tuple[tuple[str, int]] | tuple[int, tuple[str]] | list[float] | ndarray[float32 | float64] | str | tuple[str, ndarray[float32 | float64]], xtal_2: Structure | Crystal | tuple[tuple[str, int]] | tuple[int, tuple[str]] | list[float] | ndarray[float32 | float64] | str | tuple[str, ndarray[float32 | float64]], normalize: bool = True, verbose: Literal[False] = False, **kwargs) → float
xtalmet.distance(distance: str, xtal_1: Structure | Crystal | tuple[tuple[str, int]] | tuple[int, tuple[str]] | list[float] | ndarray[float32 | float64] | str | tuple[str, ndarray[float32 | float64]], xtal_2: Structure | Crystal | tuple[tuple[str, int]] | tuple[int, tuple[str]] | list[float] | ndarray[float32 | float64] | str | tuple[str, ndarray[float32 | float64]], normalize: bool = True, verbose: Literal[True] = False, **kwargs) → tuple[float, tuple[tuple[str, int]] | tuple[int, tuple[str]] | list[float] | ndarray[float32 | float64] | str | tuple[str, ndarray[float32 | float64]], tuple[tuple[str, int]] | tuple[int, tuple[str]] | list[float] | ndarray[float32 | float64] | str | tuple[str, ndarray[float32 | float64]]]

Compute the distance between two crystals.

Parameters:
  • distance (str) – The distance metric to use. Currently supported metrics are listed in SUPPORTED_DISTANCES in constants.py. For more detailed information about each distance metric, please refer to the tutorial notebook.

  • xtal_1 (Structure | Crystal | TYPE_EMB_ALL) – pymatgen Structure or Crystal or an embedding.

  • xtal_2 (Structure | Crystal | TYPE_EMB_ALL) – pymatgen Structure or Crystal or an embedding.

  • normalize (bool) – Whether to normalize the distance d to [0, 1] by using d’ = d / (1 + d). This argument is only considered when d is a continuous distance that is not normalized to [0, 1]. Such distances are listed in CONTINUOUS_UNNORMALIZED_DISTANCES in constants.py. Default is True.

  • verbose (bool) – Whether to return intermediate embeddings. Default is False.

  • **kwargs – Additional keyword arguments for specific distance metrics. It can contain two keys: “args_emb” and “args_dist”. The value of “args_emb” is a dict of arguments for the calculation of embeddings, and the value of “args_dist” is a dict of arguments for the calculation of distance between the embeddings. If embeddings are pre-computed and provided as inputs, the “args_emb” will be ignored.

Returns:

Distance between the two crystals. If verbose is True, also returns the embeddings of xtal_1 and xtal_2.

Return type:

float | tuple[float, TYPE_EMB_ALL, TYPE_EMB_ALL]

Raises:

ValueError – If an unsupported distance metric is provided.
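
The normalization d' = d / (1 + d) described for the normalize argument can be sketched in a few lines. This is a minimal standalone illustration, not the library's implementation; the helper name normalize_distance is hypothetical.

```python
def normalize_distance(d: float) -> float:
    """Map a non-negative distance d to [0, 1) via d' = d / (1 + d)."""
    if d < 0:
        raise ValueError("distance must be non-negative")
    return d / (1.0 + d)


# The map is monotonic, so the ordering of distances is preserved.
print(normalize_distance(0.0))  # 0.0
print(normalize_distance(1.0))  # 0.5
print(normalize_distance(3.0))  # 0.75
```

Identical inputs stay at 0, and arbitrarily large raw distances approach but never reach 1.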

xtalmet.distance_matrix(distance: str, xtals_1: list[Structure | Crystal | tuple[tuple[str, int]] | tuple[int, tuple[str]] | list[float] | ndarray[float32 | float64] | str | tuple[str, ndarray[float32 | float64]]], xtals_2: list[Structure | Crystal | tuple[tuple[str, int]] | tuple[int, tuple[str]] | list[float] | ndarray[float32 | float64] | str | tuple[str, ndarray[float32 | float64]]] | None = None, normalize: bool = True, multiprocessing: bool = False, n_processes: int | None = None, verbose: bool = False, **kwargs) → ndarray | tuple[ndarray, list[tuple[tuple[str, int]] | tuple[int, tuple[str]] | list[float] | ndarray[float32 | float64] | str | tuple[str, ndarray[float32 | float64]]], dict[str, float]] | tuple[ndarray, list[tuple[tuple[str, int]] | tuple[int, tuple[str]] | list[float] | ndarray[float32 | float64] | str | tuple[str, ndarray[float32 | float64]]], list[tuple[tuple[str, int]] | tuple[int, tuple[str]] | list[float] | ndarray[float32 | float64] | str | tuple[str, ndarray[float32 | float64]]], dict[str, float]]

Compute the distance matrix between two sets of crystals.

If xtals_2 is None, compute the distance matrix within xtals_1.

Parameters:
  • distance (str) – The distance metric to use. Currently supported metrics are listed in SUPPORTED_DISTANCES in constants.py. For more detailed information about each distance metric, please refer to the tutorial notebook.

  • xtals_1 (list[Structure | Crystal | TYPE_EMB_ALL]) – A list of pymatgen Structures or Crystals or embeddings.

  • xtals_2 (list[Structure | Crystal | TYPE_EMB_ALL] | None) – A list of pymatgen Structures or Crystals or embeddings, or None. Default is None.

  • normalize (bool) – Whether to normalize the distances d to [0, 1] by using d’ = d / (1 + d). This argument is only considered when d is a continuous distance that is not normalized to [0, 1]. Such distances are listed in CONTINUOUS_UNNORMALIZED_DISTANCES in constants.py. Default is True.

  • multiprocessing (bool) – Whether to use multiprocessing. Default is False.

  • n_processes (int | None) – Maximum number of processes for multiprocessing. If multiprocessing is False, this argument will be ignored. Default is None.

  • verbose (bool) – Whether to return embeddings and the computing time. Default is False.

  • **kwargs – Additional keyword arguments for specific distance metrics. It can contain two keys: “args_emb” and “args_dist”. The value of “args_emb” is a dict of arguments for the calculation of embeddings, and the value of “args_dist” is a dict of arguments for the calculation of distance matrix using the embeddings. If embeddings are pre-computed and provided as inputs, the “args_emb” will be ignored.

Returns:

Distance matrix. If verbose is True, also returns the embeddings of xtals_1 (and of xtals_2 if xtals_2 is not None) and the computing time.

Return type:

TYPE_D_MTX_RETURN

Raises:

ValueError – If an unsupported distance metric is provided.
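
The role of xtals_2 can be illustrated with a toy pairwise matrix. The sketch below is only an assumption-laden stand-in: pairwise_matrix and toy_distance are hypothetical names, the items are plain floats rather than crystals or embeddings, and the result is a nested list rather than an ndarray.

```python
from typing import Callable, Optional, Sequence


def pairwise_matrix(
    dist: Callable[[float, float], float],
    items_1: Sequence[float],
    items_2: Optional[Sequence[float]] = None,
) -> list[list[float]]:
    """Entry [i][j] is dist(items_1[i], others[j]).

    When items_2 is None, distances are computed within items_1.
    """
    others = items_1 if items_2 is None else items_2
    return [[dist(a, b) for b in others] for a in items_1]


def toy_distance(a: float, b: float) -> float:
    """Stand-in for a real crystal distance (hypothetical)."""
    return abs(a - b)


within = pairwise_matrix(toy_distance, [0.0, 1.0, 3.0])
between = pairwise_matrix(toy_distance, [0.0, 1.0, 3.0], [2.0])
print(within)   # [[0.0, 1.0, 3.0], [1.0, 0.0, 2.0], [3.0, 2.0, 0.0]]
print(between)  # [[2.0], [1.0], [1.0]]
```

The within-set matrix is square with a zero diagonal, while the two-set matrix has one row per item of the first list and one column per item of the second.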

class xtalmet.Evaluator(validity: list[str] | None = None, stability: Literal['continuous', 'binary', None] = None, uniqueness: bool = False, novelty: bool = False, distance: str | None = None, ref_xtals: list[Crystal | Structure] | Literal['mp20'] | str | None = None, agg_func: Literal['prod', 'ave'] = 'prod', weights: dict[str, float] | None = None, multiprocessing: bool = False, n_processes: int | None = None, **kwargs)

Bases: object

Class for evaluating a set of crystals.

The evaluation is based on a chosen combination of validity (V), stability (S), uniqueness (U), and novelty (N).

Initialize the Evaluator.

Parameters:
  • validity (list[str] | None) – Approaches to evaluating validity. The currently supported methods are shown in SUPPORTED_VALIDITY in constants.py. If set to None, validity is not evaluated. Default is None.

  • stability (Literal["continuous", "binary", None]) – Stability evaluation method. “continuous” or “binary” or None. If set to None, stability is not evaluated. Default is None.

  • uniqueness (bool) – Whether to evaluate uniqueness. Default is False.

  • novelty (bool) – Whether to evaluate novelty. Default is False.

  • distance (str | None) – Distance metric used for uniqueness and novelty evaluation. The currently supported distances are listed in SUPPORTED_DISTANCES in constants.py. For more detailed information about each distance metric, please refer to the tutorial notebook. If both uniqueness and novelty are False, this argument is ignored. Default is None.

  • ref_xtals (list[Crystal | Structure] | Literal["mp20"] | str | None) – Reference crystal structures (typically a training set) for novelty evaluation. This can be a list of crystal structures, dataset name, or a path to the file containing the pre-computed embeddings of the reference structures. If a dataset name is given, its training data will be downloaded from Hugging Face. If novelty is False, this argument is ignored. Default is None.

  • agg_func (Literal["prod", "ave"]) – Aggregation function for combining V, S, U, and N. “prod” means multiplication, and “ave” means (weighted) average. Default is “prod”.

  • weights (dict[str, float] | None) – Weights for V, S, U, and N. For “ave”, weights are coefficients (normalized internally; equal weights if None). For “prod”, weights are exponents (not normalized; default 1.0 per active metric if None). Default is None.

  • multiprocessing (bool) – Whether to use multiprocessing for computing the embeddings of reference crystals. This argument is only effective when novelty is True and ref_xtals is list[Crystal | Structure]. Default is False.

  • n_processes (int | None) – Maximum number of processes to use for multiprocessing. If None, the number of logical CPU cores minus one is used. We recommend setting this argument below the number of available CPU cores to avoid out-of-memory errors. If multiprocessing is False, this argument is ignored. Default is None.

  • **kwargs – Additional keyword arguments. It can contain four keys: “args_validity”, “args_stability”, “args_emb”, and “args_dist”. “args_validity” is for the validity evaluation, while “args_stability” is for the stability evaluation. “args_emb” and “args_dist” are for the distance metric used in uniqueness and novelty evaluation: The former is for the embedding calculation, and the latter is for the distance matrix calculation between embeddings. For more details, please refer to the tutorial notebook.

Examples

>>> # Evaluator for the conventional SUN metric against the MP20 dataset
>>> # using the StructureMatcher distance.
>>> evaluator = Evaluator(
...     stability="binary",
...     uniqueness=True,
...     novelty=True,
...     distance="smat",
...     ref_xtals="mp20",
...     agg_func="prod",
... )
>>> # Evaluator for the VSUN metric against a custom reference dataset using
>>> # the ElMD distance, with average aggregation.
>>> evaluator = Evaluator(
...     validity=["smact", "structure"],
...     stability="continuous",
...     uniqueness=True,
...     novelty=True,
...     distance="elmd",
...     ref_xtals=ref_xtals,  # list[Crystal | Structure]
...     agg_func="ave",
...     weights={
...         "validity": 0.25,
...         "stability": 0.25,
...         "uniqueness": 0.25,
...         "novelty": 0.25,
...     },
...     multiprocessing=True,
...     n_processes=10,
... )
>>> # Evaluator for the VSUN metric against the MP20 dataset using the AMD
>>> # distance, with custom kwargs.
>>> evaluator = Evaluator(
...     validity=["smact", "structure"],
...     stability="continuous",
...     uniqueness=True,
...     novelty=True,
...     distance="amd",
...     ref_xtals="mp20",
...     agg_func="prod",
...     **{
...         "args_validity": {
...             "structure": {
...                 "threshold_distance": 0.5,
...                 "threshold_volume": 0.1,
...             }
...         },
...         "args_stability": {
...             "diagram": "mp_250618",
...             "mace_model": "medium-mpa-0",
...             "intercept": 0.4289,
...         },
...         "args_emb": {"k": 100},
...         "args_dist": {"metric": "chebyshev", "low_memory": False},
...     }
... )
>>> # Evaluator for the SUN metric against the MP20 dataset using the
>>> # ElMD+AMD distance, with custom kwargs.
>>> evaluator = Evaluator(
...     stability="continuous",
...     uniqueness=True,
...     novelty=True,
...     distance="elmd+amd",
...     ref_xtals="mp20",
...     agg_func="prod",
...     **{
...         "args_validity": {
...             "structure": {
...                 "threshold_distance": 0.5,
...                 "threshold_volume": 0.1,
...             }
...         },
...         "args_stability": {
...             "diagram": "mp_250618",
...             "mace_model": "medium-mpa-0",
...             "intercept": 0.4289,
...         },
...         "args_emb": {"amd": {"k": 100}},
...         "args_dist": {
...             "elmd": {"metric": "mod_petti"},
...             "amd": {"metric": "chebyshev", "low_memory": False},
...             "coefs": {
...                 "elmd": float.fromhex("0x1.8d7d565a99f87p-1"),
...                 "amd": float.fromhex("0x1.ca0aa695981e5p-3"),
...             },
...         },
...     }
... )

evaluate(xtals: list[Crystal | Structure], dir_intermediate: str | None = None, multiprocessing: bool = False, n_processes: int | None = None) → tuple[float, ndarray, dict[str, float]]

Evaluate the given crystal structures.

Parameters:
  • xtals (list[Crystal | Structure]) – List of crystal structures to be evaluated.

  • dir_intermediate (str | None) – Directory to search for pre-computed intermediate results, such as validity scores, energies above hull, embeddings, and distance matrices. If pre-computed files do not exist in the directory, they are computed and saved there for future use. If set to None, no files are loaded or saved. Setting this argument is recommended when evaluating the same large set of crystal structures multiple times, for example when trying different aggregation functions. The intermediate results can be shared as long as the same set of crystals is evaluated. Default is None.

  • multiprocessing (bool) – Whether to use multiprocessing for embedding and distance matrix computation. This argument is only effective when uniqueness or novelty evaluation is enabled. Default is False.

  • n_processes (int | None) – Maximum number of processes to use for multiprocessing. If None, the number of logical CPU cores minus one is used. We recommend setting this argument below the number of available CPU cores to avoid out-of-memory errors. If multiprocessing is False, this argument is ignored. Default is None.

Returns:

A tuple containing the overall score (float), individual scores for each crystal structure, and a dictionary of computation times for each evaluation component.

Return type:

tuple[float, np.ndarray, dict[str, float]]

Examples

>>> # Evaluate the conventional SUN metric using the StructureMatcher
>>> # distance ("smat") against the MP20 dataset.
>>> evaluator = Evaluator(
...     stability="binary",
...     uniqueness=True,
...     novelty=True,
...     distance="smat",
...     ref_xtals="mp20",
...     agg_func="prod",
... )
>>> evaluator.evaluate(
...     xtals=xtals,  # list[Crystal | Structure]
...     dir_intermediate="intermediate_results/",
...     multiprocessing=True,
...     n_processes=10,
... )
>>> 0.28, np.array([...]), {"aggregation": ..., ...}
>>> # Evaluate the VSUN metric using the ElMD distance against a custom
>>> # reference dataset, with average aggregation.
>>> evaluator = Evaluator(
...     validity=["smact", "structure"],
...     stability="continuous",
...     uniqueness=True,
...     novelty=True,
...     distance="elmd",
...     ref_xtals=ref_xtals,  # list[Crystal | Structure]
...     agg_func="ave",
...     weights={
...         "validity": 0.25,
...         "stability": 0.25,
...         "uniqueness": 0.25,
...         "novelty": 0.25,
...     },
...     multiprocessing=True,
...     n_processes=10,
... )
>>> evaluator.evaluate(
...     xtals=xtals,  # list[Crystal | Structure]
...     dir_intermediate=None,
...     multiprocessing=False,
...     n_processes=None,
... )
>>> 0.6119424269941065, np.array([...]), {"aggregation": ..., ...}
>>> # Evaluate the VSUN metric using the AMD distance against the MP20
>>> # dataset, with custom kwargs.
>>> evaluator = Evaluator(
...     validity=["smact", "structure"],
...     stability="continuous",
...     uniqueness=True,
...     novelty=True,
...     distance="amd",
...     ref_xtals="mp20",
...     agg_func="prod",
...     **{
...         "args_validity": {
...             "structure": {
...                 "threshold_distance": 0.5,
...                 "threshold_volume": 0.1,
...             }
...         },
...         "args_stability": {
...             "diagram": "mp_250618",
...             "mace_model": "medium-mpa-0",
...             "intercept": 0.4289,
...         },
...         "args_emb": {"k": 100},
...         "args_dist": {"metric": "chebyshev", "low_memory": False},
...     }
... )
>>> evaluator.evaluate(
...     xtals=xtals,  # list[Crystal | Structure]
...     dir_intermediate="intermediate_results/",
...     multiprocessing=True,
...     n_processes=10,
... )
>>> 0.019558713928249892, np.array([...]), {"aggregation": ..., ...}
>>> # Evaluate the SUN metric using the ElMD+AMD distance against the MP20
>>> # dataset, with custom kwargs.
>>> evaluator = Evaluator(
...     stability="continuous",
...     uniqueness=True,
...     novelty=True,
...     distance="elmd+amd",
...     ref_xtals="mp20",
...     agg_func="prod",
...     **{
...         "args_validity": {
...             "structure": {
...                 "threshold_distance": 0.5,
...                 "threshold_volume": 0.1,
...             }
...         },
...         "args_stability": {
...             "diagram": "mp_250618",
...             "mace_model": "medium-mpa-0",
...             "intercept": 0.4289,
...         },
...         "args_emb": {"amd": {"k": 100}},
...         "args_dist": {
...             "elmd": {"metric": "mod_petti"},
...             "amd": {"metric": "chebyshev", "low_memory": False},
...             "coefs": {
...                 "elmd": float.fromhex("0x1.8d7d565a99f87p-1"),
...                 "amd": float.fromhex("0x1.ca0aa695981e5p-3"),
...             },
...         },
...     }
... )
>>> evaluator.evaluate(
...     xtals=xtals,  # list[Crystal | Structure]
...     dir_intermediate="intermediate_results/",
...     multiprocessing=True,
...     n_processes=10,
... )
>>> 0.16403383975840835, np.array([...]), {"aggregation": ..., ...}

Note

This note describes how VSUN (or a subset of its components) is computed. Validity \(V(x)\) of each crystal \(x\) is 0 (invalid) or 1 (valid) depending on whether it passes the specified validity checks. If you specify multiple validity methods \(V_1(x)\), \(V_2(x)\), …, the overall validity is the product of the individual validity scores:

\[V(x) = \prod_i V_i(x)\]
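
The product rule for validity can be sketched as follows. This is a standalone illustration that assumes per-method scores are available as equal-length lists keyed by method name; the helper name overall_validity is hypothetical and not part of xtalmet.

```python
from math import prod


def overall_validity(scores_by_method: dict[str, list[float]]) -> list[float]:
    """Multiply the 0/1 scores of every validity method, crystal by crystal."""
    per_crystal = zip(*scores_by_method.values())
    return [prod(scores) for scores in per_crystal]


scores = {
    "smact": [1.0, 1.0, 0.0],
    "structure": [1.0, 0.0, 0.0],
}
print(overall_validity(scores))  # [1.0, 0.0, 0.0]
```

A crystal is valid overall only if every selected method marks it as valid.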

Stability \(S(x)\) is computed from the energy above hull, and has two variants: binary and continuous. The binary stability score \(S_b(x)\) is defined as follows.

\[\begin{split}S_b(x) = \begin{cases} 1 & \text{if } E_\text{hull}(x) \le \text{threshold} \\ 0 & \text{otherwise} \end{cases}\end{split}\]

“threshold” can be specified in kwargs when initializing the Evaluator. The default value is 0.1 [eV/atom]. The continuous stability score \(S_c(x)\) is defined as follows.

\[\begin{split}S_c(x) = \begin{cases} 1 & \text{if } E_\text{hull}(x) \le 0 \\ 1 - \frac{E_\text{hull}(x)}{\text{intercept}} & \text{if } 0 < E_\text{hull}(x) \le \text{intercept} \\ 0 & \text{otherwise} \end{cases}\end{split}\]

“intercept” can be specified in kwargs when initializing the Evaluator. The default value is 0.4289 [eV/atom]. In both cases, a score closer to 1 indicates a more stable structure. The definition of uniqueness \(U(x)\) depends on the chosen distance metric. For a binary distance \(d_b\), uniqueness of the i-th crystal \(x_i\) in the set of crystals \(\{x_1, x_2, \ldots, x_n\}\) is defined as follows.

\[U_b(x_i) = I \left(\land_{j=1}^{i-1} d_b(x_i, x_j) \neq 0 \right),\]

where \(I\) is the indicator function. For a continuous distance \(d_c\), uniqueness is defined as follows.

\[U_c(x_i) = \frac{1}{n-1} \sum_{j=1}^{n} d_c(x_i, x_j).\]
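
The two uniqueness definitions above can be sketched on a precomputed distance matrix. This is a minimal illustration under the assumption that d[i][j] holds the distance between crystals i and j as a nested list; the function names are hypothetical.

```python
def binary_uniqueness(d: list[list[float]]) -> list[float]:
    """1.0 if crystal i differs from every earlier crystal j < i, else 0.0."""
    return [
        1.0 if all(d[i][j] != 0 for j in range(i)) else 0.0
        for i in range(len(d))
    ]


def continuous_uniqueness(d: list[list[float]]) -> list[float]:
    """Sum of distances from crystal i to all crystals, divided by n - 1."""
    n = len(d)
    if n < 2:
        raise ValueError("at least two crystals are required")
    return [sum(row) / (n - 1) for row in d]


d_binary = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
d_continuous = [[0.0, 0.5, 0.25], [0.5, 0.0, 0.75], [0.25, 0.75, 0.0]]
print(binary_uniqueness(d_binary))          # [1.0, 1.0, 0.0]
print(continuous_uniqueness(d_continuous))  # [0.375, 0.625, 0.5]
```

In the binary case the third crystal duplicates the first, so only the earlier occurrence is counted as unique.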

In both cases, the score ranges from 0 to 1, since the binary distance takes values of 0 or 1, and the continuous distance is normalized to be between 0 and 1. A higher score indicates a more unique structure. Novelty \(N(x)\) is defined similarly to uniqueness, but against a reference set of crystals. For a binary distance \(d_b\), novelty of the i-th crystal \(x_i\) is defined as follows.

\[N_b(x_i) = I \left(\land_{j=1}^{m} d_b(x_i, y_j) \neq 0 \right),\]

where \(\{y_1, y_2, \ldots, y_m\}\) is the reference set of crystals. For a continuous distance \(d_c\), novelty is defined as follows.

\[N_c(x_i) = \min_{j=1 \ldots m} d_c(x_i, y_j).\]
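
Novelty can be sketched the same way, this time on a rectangular matrix. The sketch assumes d[i][j] is the distance between generated crystal i and reference crystal j, with a non-empty reference set; the function names are hypothetical.

```python
def binary_novelty(d: list[list[float]]) -> list[float]:
    """1.0 if crystal i differs from every reference crystal, else 0.0."""
    return [1.0 if all(v != 0 for v in row) else 0.0 for row in d]


def continuous_novelty(d: list[list[float]]) -> list[float]:
    """Distance from crystal i to its nearest reference crystal."""
    return [min(row) for row in d]


d_binary = [[1, 1], [1, 0]]
d_continuous = [[0.5, 0.25], [0.75, 0.0]]
print(binary_novelty(d_binary))          # [1.0, 0.0]
print(continuous_novelty(d_continuous))  # [0.25, 0.0]
```

The second generated crystal coincides with a reference crystal, so its novelty is 0 under both definitions.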

In both cases, the score ranges from 0 to 1, and a higher score indicates a more novel structure. Finally, the VSUN score of each crystal \(x\) is computed by aggregating the individual scores using either multiplication or (weighted) average:

\[\begin{split}\text{VSUN}(x) = \begin{cases} V(x) S(x) U(x) N(x) & \text{if agg_func = "prod"} \\ w_V V(x) + w_S S(x) + w_U U(x) + w_N N(x) & \text{if agg_func ="ave"} \end{cases}\end{split}\]

where \(w_V\), \(w_S\), \(w_U\), and \(w_N\) are the normalized weights for validity, stability, uniqueness, and novelty, respectively. If only a subset of VSUN is evaluated, each crystal’s score is computed in the same manner from the specified components only. The overall score for the entire set of crystals is then obtained by averaging the individual scores.

\[\text{VSUN} = \frac{1}{n} \sum_{i=1}^{n} \text{VSUN}(x_i)\]
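
The aggregation step can be sketched as below. This is a minimal illustration, not the Evaluator's code: the names aggregate and overall_score are hypothetical, and the handling of weights follows the description of the weights argument (exponents for "prod", internally normalized coefficients for "ave", all equal to 1.0 when None).

```python
from math import prod
from typing import Optional


def aggregate(
    scores: dict[str, float],
    agg_func: str = "prod",
    weights: Optional[dict[str, float]] = None,
) -> float:
    """Combine the active component scores of one crystal."""
    if weights is None:
        weights = {name: 1.0 for name in scores}
    if agg_func == "prod":
        # Weights act as exponents and are not normalized.
        return prod(s ** weights[name] for name, s in scores.items())
    if agg_func == "ave":
        # Weights act as coefficients and are normalized to sum to 1.
        total = sum(weights[name] for name in scores)
        return sum(weights[name] * s for name, s in scores.items()) / total
    raise ValueError(f"unsupported agg_func: {agg_func}")


def overall_score(per_crystal: list[float]) -> float:
    """Average the per-crystal scores into a single score for the set."""
    return sum(per_crystal) / len(per_crystal)


x = {"validity": 1.0, "stability": 0.5, "uniqueness": 0.5, "novelty": 0.25}
print(aggregate(x, "prod"))             # 0.0625
print(aggregate(x, "ave"))              # 0.5625
print(overall_score([0.0625, 0.5625]))  # 0.3125
```

With "prod", a single zero component drives the crystal's score to zero, whereas "ave" lets the other components compensate.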

class xtalmet.StabilityCalculator(diagram: Literal['mp_250618', 'mp'] | PatchedPhaseDiagram | str = 'mp_250618', mace_model: str = 'medium-mpa-0', binary=True, threshold: float = 0.1, intercept: float = 0.4289)

Bases: object

Class to calculate stability scores of crystal structures.

Initialize StabilityCalculator.

Parameters:
  • diagram (Literal["mp_250618", "mp"] | PatchedPhaseDiagram | str) – The phase diagram to use. If “mp_250618” is specified, the diagram constructed using this class from the MP entries on June 18, 2025, will be used. If “mp” is specified, the diagram will be constructed on the spot. You can also pass your own diagram or a path to it.

  • mace_model (str) – The MACE model to use for energy prediction. Default is “medium-mpa-0”.

  • binary (bool) – If True, compute binary stability scores (1 for stable, 0 for unstable). If False, compute continuous stability scores between 0 and 1. Default is True.

  • threshold (float) – Energy above hull threshold for stability in eV/atom. Only used if binary is True. Default is 0.1 eV/atom.

  • intercept (float) – Intercept for linear scaling of stability scores in eV/atom. Only used if binary is False. Default is 0.4289 eV/atom, which is the 99.9th percentile of the energy above hull values for the MP20 test data.
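
How threshold and intercept turn an energy above hull into a score can be sketched with two small functions that follow the piecewise definitions given in the Evaluator note. This is an illustration only; the function names are hypothetical, and the energy values are assumed to be in eV/atom.

```python
def binary_stability(e_hull: float, threshold: float = 0.1) -> float:
    """1.0 if the energy above hull is at most the threshold, else 0.0."""
    return 1.0 if e_hull <= threshold else 0.0


def continuous_stability(e_hull: float, intercept: float = 0.4289) -> float:
    """Linear ramp from 1.0 at the hull down to 0.0 at the intercept."""
    if e_hull <= 0:
        return 1.0
    if e_hull <= intercept:
        return 1.0 - e_hull / intercept
    return 0.0


print(binary_stability(0.05))        # 1.0
print(binary_stability(0.2))         # 0.0
print(continuous_stability(-0.1))    # 1.0
print(continuous_stability(0.4289))  # 0.0
print(continuous_stability(1.0))     # 0.0
```

The continuous variant rewards structures that are close to the hull even when they exceed the binary threshold.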

compute_stability_scores(xtals: list[Crystal], e_above_hulls_precomputed: ndarray[float] | None = None) → tuple[ndarray[float], ndarray[float], float]

Compute stability scores for a list of crystals.

Parameters:
  • xtals (list[Crystal]) – List of crystals to compute stability scores for.

  • e_above_hulls_precomputed (np.ndarray[float] | None) – Precomputed energy above hull values. If None, they will be computed internally.

Returns:

A tuple of stability scores, raw energy above hull values, and computation time in seconds.

Return type:

tuple[np.ndarray[float], np.ndarray[float], float]

class xtalmet.SMACTValidator

Bases: SingleValidator

Class to calculate validity of crystal structures using SMACT.

Initialize SMACTValidator.

validate(xtals: list[Crystal]) → ndarray[float]

Validate a list of crystals using SMACT.

Parameters:

xtals (list[Crystal]) – List of crystals to validate.

Returns:

Array of validity scores for each crystal. A value of 1.0 indicates that the crystal passed the SMACT screening, while 0.0 indicates that it failed.

Return type:

np.ndarray[float]

class xtalmet.StructureValidator(threshold_distance: float = 0.5, threshold_volume: float = 0.1)

Bases: SingleValidator

Class to calculate structure-based validity of crystal structures.

Initialize StructureValidator.

Parameters:
  • threshold_distance (float) – Minimum allowed distance between atoms.

  • threshold_volume (float) – Minimum allowed volume of the unit cell.

References

  • Xie et al. (2022). Crystal Diffusion Variational Autoencoder for Periodic Material Generation. In International Conference on Learning Representations.

validate(xtals: list[Crystal]) → ndarray[float]

Validate a list of crystals using the structure-based method.

Parameters:

xtals (list[Crystal]) – List of crystals to validate.

Returns:

Array of validity scores for each crystal. A value of 1.0 indicates that the crystal passed the structure-based screening, while 0.0 indicates that it failed.

Return type:

np.ndarray[float]

class xtalmet.Validator(methods: list[str], **kwargs)

Bases: object

Class to calculate validity of crystal structures.

Initialize Validator.

Parameters:
  • methods (list[str]) – List of validity evaluation methods to use. The currently supported methods are shown in SUPPORTED_VALIDITY in constants.py.

  • **kwargs – Additional keyword arguments for each validity evaluation method.

validate(xtals: list[Crystal], skip: list[str]) → tuple[dict[str, ndarray[float]], dict[str, float]]

Validate a list of crystals using the specified methods.

Parameters:
  • xtals (list[Crystal]) – List of crystals to validate.

  • skip (list[str]) – List of validity methods to skip.

Returns:

A dictionary of individual scores from each validator, and a dictionary of time taken for each validity method.

Return type:

tuple[dict[str, np.ndarray[float]], dict[str, float]]

Submodules