xtalmet.evaluator module
This module contains the Evaluator class for uniqueness and novelty calculation.
- class xtalmet.evaluator.Evaluator(gen_xtals: list[Crystal | Structure])
Bases: object
Class for storing and evaluating a set of crystals.
Initialize the Evaluator.
- Parameters:
gen_xtals (list[Crystal | Structure]) – Generated crystal structures.
- uniqueness(distance: Literal['smat', 'comp', 'wyckoff', 'magpie', 'pdd', 'amd'], screen: Literal[None, 'smact', 'ehull'] = None, dir_intermediate: str | None = None, return_time: bool = False, **kwargs) → float | tuple[float, dict[str, float]]
Evaluate the uniqueness of a set of crystals.
- Parameters:
distance (Literal) – Distance function used for uniqueness evaluation.
screen (Literal) – Method to screen the crystals.
dir_intermediate (str | None) – Directory to search for pre-computed embeddings, distance matrix, and screening results for faster computation. If pre-computed files do not exist in the directory, they will be saved to the directory for future use. If set to None, no files will be loaded or saved. It is recommended that you set this argument. This is especially important when evaluating a large number of generated crystals or when d_smat is used as the distance metric.
return_time (bool) – Whether to return the time taken for each step.
**kwargs – Additional keyword arguments for specific distance metrics and thermodynamic screening. It can contain three keys: “args_emb”, “args_mtx”, and “args_screen”. The value of “args_emb” is a dict of arguments for the calculation of embeddings, the value of “args_mtx” is a dict of arguments for the calculation of distance matrix using the embeddings, and the value of “args_screen” is a dict of arguments for the screening function.
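The load-or-save behaviour described for dir_intermediate can be pictured with the following minimal sketch. The helper name load_or_compute, the pickle format, and the file naming are all assumptions made for illustration; they are not taken from the xtalmet implementation.

```python
import pickle
from pathlib import Path
from typing import Callable, Optional


def load_or_compute(
    dir_intermediate: Optional[str],
    name: str,
    compute: Callable[[], object],
) -> object:
    """Return a cached result if one exists; otherwise compute and cache it.

    Hypothetical helper: the "<name>.pkl" layout is an assumption, not the
    file layout that xtalmet actually uses.
    """
    if dir_intermediate is None:
        # No directory given: nothing is loaded and nothing is saved.
        return compute()

    path = Path(dir_intermediate) / f"{name}.pkl"
    if path.exists():
        # A pre-computed file exists, so reuse it instead of recomputing.
        with path.open("rb") as f:
            return pickle.load(f)

    # First run: compute, then save the result for future calls.
    result = compute()
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(result, f)
    return result
```

With a pattern like this, the expensive step runs only on the first call with a given directory, and later calls read the saved result.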
Examples
>>> evaluator.uniqueness(
...     distance="smat",
...     screen=None,
...     dir_intermediate="./intermediate",
...     return_time=True,
... )
(
    0.9945,
    {
        "uni_emb": 0.003,
        "uni_d_mtx": 16953.978,
        "uni_metric": 0.152,
        "uni_total": 16954.133,
    },
)
>>> evaluator.uniqueness(
...     distance="amd",
...     screen="ehull",
...     dir_intermediate="./intermediate",
...     return_time=False,
...     args_emb={"k": 200},
...     args_mtx={"metric": "chebyshev", "low_memory": False},
...     args_screen={"diagram": "mp_250618"},
... )
0.0016
- Returns:
Uniqueness value or (uniqueness value, a dictionary of time taken for each step).
- Return type:
float | tuple
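Because the return type depends on return_time, calling code may want to normalize it before use. The sketch below is one way to do that, using the timing values from the first example above. The helper names split_result and time_shares are hypothetical and not part of xtalmet.

```python
from typing import Optional, Union

Result = Union[float, tuple[float, dict[str, float]]]


def split_result(result: Result) -> tuple[float, Optional[dict[str, float]]]:
    """Normalize the return value to (value, timings), with timings=None
    when only the bare float was returned (return_time=False)."""
    if isinstance(result, tuple):
        value, timings = result
        return float(value), dict(timings)
    return float(result), None


def time_shares(timings: dict[str, float], total_key: str) -> dict[str, float]:
    """Fraction of the total time spent in each step (total entry excluded)."""
    total = timings[total_key]
    return {k: v / total for k, v in timings.items() if k != total_key}


# Values copied from the documented example with return_time=True.
value, timings = split_result(
    (
        0.9945,
        {
            "uni_emb": 0.003,
            "uni_d_mtx": 16953.978,
            "uni_metric": 0.152,
            "uni_total": 16954.133,
        },
    )
)
shares = time_shares(timings, "uni_total")
print(f"uniqueness={value}, distance-matrix share={shares['uni_d_mtx']:.4f}")
```

In that example nearly all of the time goes to the distance matrix step, which is why reusing intermediate results via dir_intermediate matters most for the smat distance.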
- novelty(train_xtals: list[Crystal | Structure] | Literal['mp20'], distance: Literal['smat', 'comp', 'wyckoff', 'magpie', 'pdd', 'amd'], screen: Literal[None, 'smact', 'ehull'] = None, dir_intermediate: str | None = None, return_time: bool = False, **kwargs) → float | tuple[float, dict[str, float]]
Evaluate the novelty of a set of crystals.
- Parameters:
train_xtals (list[Crystal | Structure] | Literal["mp20"]) – List of training crystal structures or dataset name. If a dataset name is given, the embeddings of its training data will be downloaded from Hugging Face. The embeddings were computed using the _embed method above with no additional kwargs.
distance (Literal) – Distance used for novelty evaluation.
screen (Literal) – Method to screen the generated crystals.
dir_intermediate (str | None) – Directory to search for pre-computed embeddings, distance matrix, and screening results for faster computation. If pre-computed files do not exist in the directory, they will be saved to the directory for future use. If set to None, no files will be loaded or saved. It is recommended that you set this argument. This is especially important when evaluating a large number of generated crystals or when d_smat is used as the distance metric.
return_time (bool) – Whether to return the time taken for each step.
**kwargs – Additional keyword arguments for specific distance metrics and thermodynamic screening. It can contain three keys: “args_emb”, “args_mtx”, and “args_screen”. The value of “args_emb” is a dict of arguments for the calculation of embeddings, the value of “args_mtx” is a dict of arguments for the calculation of distance matrix using the embeddings, and the value of “args_screen” is a dict of arguments for the screening function.
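Since the extra keyword arguments are grouped under three fixed keys, it can help to assemble and check the bundle once before unpacking it into a call. The following is a minimal sketch; build_extra_kwargs is a hypothetical helper, and whether xtalmet itself rejects unknown keys is not stated here, so the validation is an assumption.

```python
ALLOWED_KEYS = ("args_emb", "args_mtx", "args_screen")


def build_extra_kwargs(**groups: dict) -> dict:
    """Check a bundle of per-step option dicts before unpacking it with **.

    Hypothetical helper: it only enforces the three documented key names
    and that each value is a dict.
    """
    unknown = sorted(set(groups) - set(ALLOWED_KEYS))
    if unknown:
        raise ValueError(f"unexpected keys: {unknown}; allowed: {ALLOWED_KEYS}")
    for key, value in groups.items():
        if not isinstance(value, dict):
            raise TypeError(f"{key} must be a dict, got {type(value).__name__}")
    return dict(groups)


# Option values taken from the documented examples.
extra = build_extra_kwargs(
    args_emb={"k": 200},
    args_mtx={"metric": "chebyshev", "low_memory": False},
    args_screen={"diagram": "mp_250618"},
)
# The bundle would then be unpacked into the call, for instance:
# evaluator.novelty(train_xtals="mp20", distance="amd", screen="ehull", **extra)
```

Catching a misspelled key such as args_embed up front avoids a long computation silently running with default settings.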
Examples
>>> evaluator.novelty(
...     train_xtals="mp20",
...     distance="smat",
...     screen=None,
...     dir_intermediate="./intermediate",
...     return_time=True,
... )
(
    0.9892,
    {
        "nov_emb_gen": 1.693,
        "nov_emb_train": 5.790,
        "nov_d_mtx": 42784.921,
        "nov_metric": 0.628,
        "nov_total": 42793.032,
    },
)
>>> evaluator.novelty(
...     train_xtals=list_of_train_xtals,
...     distance="amd",
...     screen="ehull",
...     dir_intermediate="./intermediate",
...     return_time=False,
...     args_emb={"k": 200},
...     args_mtx={"metric": "chebyshev", "low_memory": False},
...     args_screen={"diagram": "mp_250618"},
... )
0.0075
- Returns:
Novelty value or (novelty value, a dictionary of time taken for each step).
- Return type:
float | tuple