I/O API

File input/output and integration with external tools.

Readers

Functions for loading potential data from various file formats.

carriercapture.io.readers

File readers for various data formats.

Supports loading potential data, configuration files, and results from:

- Plain text (DAT, TXT)
- CSV
- JSON
- YAML
- NumPy NPZ

read_potential_data(filepath, delimiter=None, skip_header=0)

Read potential energy surface data from text file.

Expects two-column format: Q (amu^0.5·Å), E (eV)

Parameters:

Name Type Description Default
filepath str or Path

Path to data file

required
delimiter str

Column delimiter (auto-detected if None)

None
skip_header int

Number of header lines to skip

0

Returns:

Name Type Description
Q_data NDArray[float64]

Configuration coordinates (amu^0.5·Å)

E_data NDArray[float64]

Potential energies (eV)

Raises:

Type Description
ValueError

If file doesn't have exactly 2 columns

Examples:

>>> Q, E = read_potential_data("excited.dat")
>>> Q.shape, E.shape
((100,), (100,))
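
As a rough illustration of the two-column contract described above, the following sketch parses text with an auto-detected delimiter and raises ValueError when a row does not have exactly two columns. The helper name parse_two_column_text and its detection rule (comma if present, otherwise whitespace) are assumptions made for illustration; they are not taken from carriercapture's implementation.

```python
def parse_two_column_text(text, delimiter=None, skip_header=0):
    """Hypothetical helper: parse two-column Q-E text into two lists of floats."""
    Q_values, E_values = [], []
    lines = text.splitlines()[skip_header:]
    for line_number, line in enumerate(lines, start=skip_header + 1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # ignore blank lines and comment lines
        # Assumed auto-detection rule: comma if present, otherwise any whitespace.
        sep = delimiter
        if sep is None:
            sep = "," if "," in stripped else None
        fields = [field for field in stripped.split(sep) if field.strip()]
        if len(fields) != 2:
            raise ValueError(
                f"line {line_number}: expected 2 columns, found {len(fields)}"
            )
        Q_values.append(float(fields[0]))
        E_values.append(float(fields[1]))
    return Q_values, E_values


sample = "# Q  E\n0.0  0.10\n0.5  0.02\n1.0  0.00\n"
Q, E = parse_two_column_text(sample)
print(len(Q), len(E))
```

The sketch returns plain lists rather than NumPy arrays so that it runs with the standard library alone.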

read_json(filepath)

Read JSON file.

Parameters:

Name Type Description Default
filepath str or Path

Path to JSON file

required

Returns:

Name Type Description
data dict

Loaded JSON data

Examples:

>>> data = read_json("potential.json")
>>> pot = Potential.from_dict(data)

read_yaml(filepath)

Read YAML configuration file.

Parameters:

Name Type Description Default
filepath str or Path

Path to YAML file

required

Returns:

Name Type Description
config dict

Loaded YAML configuration

Examples:

>>> config = read_yaml("config.yaml")
>>> pot_config = config['potential_initial']

read_csv(filepath, has_header=True)

Read CSV file with potential data.

Parameters:

Name Type Description Default
filepath str or Path

Path to CSV file

required
has_header bool

Whether file has header row

True

Returns:

Name Type Description
Q_data NDArray[float64]

Configuration coordinates

E_data NDArray[float64]

Potential energies

Examples:

>>> Q, E = read_csv("data.csv")

read_npz(filepath)

Read NumPy NPZ file.

Parameters:

Name Type Description Default
filepath str or Path

Path to NPZ file

required

Returns:

Name Type Description
data dict

Dictionary of arrays from NPZ file

Examples:

>>> data = read_npz("results.npz")
>>> Q = data['Q_data']
>>> E = data['E_data']

load_potential_from_file(filepath, file_format=None)

Load potential from file (auto-detect format).

Supports: .json, .yaml, .yml, .npz, .dat, .txt, .csv

Parameters:

Name Type Description Default
filepath str or Path

Path to file

required
file_format str

Force a specific format (json, yaml, npz, dat, csv). If None, auto-detect from the file extension.

None

Returns:

Name Type Description
data dict

Potential data dictionary

Examples:

>>> data = load_potential_from_file("potential.json")
>>> pot = Potential.from_dict(data)
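
Extension-based auto-detection of this kind can be sketched as a lookup table keyed on the file suffix. The helper detect_format and the choice to map .txt onto the "dat" format are assumptions for illustration only; the actual dispatch inside load_potential_from_file may differ.

```python
from pathlib import Path

# Assumed mapping from the supported extensions listed above to format names.
EXTENSION_TO_FORMAT = {
    ".json": "json",
    ".yaml": "yaml",
    ".yml": "yaml",
    ".npz": "npz",
    ".dat": "dat",
    ".txt": "dat",  # assumption: plain text is handled like .dat
    ".csv": "csv",
}


def detect_format(filepath, file_format=None):
    """Hypothetical helper: return the forced format, or infer it from the suffix."""
    if file_format is not None:
        return file_format.lower()
    suffix = Path(filepath).suffix.lower()
    if suffix not in EXTENSION_TO_FORMAT:
        raise ValueError(f"Cannot infer format from extension {suffix!r}")
    return EXTENSION_TO_FORMAT[suffix]


print(detect_format("potential.json"))
print(detect_format("scan.out", file_format="dat"))
```

Passing file_format explicitly is the escape hatch for files whose extension does not match their content.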

Writers

Functions for exporting results to various formats.

carriercapture.io.writers

File writers for saving results.

Supports exporting to:

- JSON (human-readable, full precision)
- YAML (configuration files)
- CSV (tabular data)
- NumPy NPZ (compressed arrays)
- Plain text (simple Q-E data)

write_json(data, filepath, indent=2)

Write data to JSON file.

Parameters:

Name Type Description Default
data dict

Data to write

required
filepath str or Path

Output file path

required
indent int

Indentation level for pretty printing

2

Examples:

>>> pot_data = pot.to_dict()
>>> write_json(pot_data, "potential.json")
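
To show what an indent=2 JSON round trip looks like, here is a minimal sketch that uses only the standard-library json module. It assumes a plain dictionary of lists and floats in place of the output of pot.to_dict(); whether write_json wraps json.dump in exactly this way is not confirmed by this page.

```python
import json
import tempfile
from pathlib import Path

# Stand-in for pot.to_dict(): a hypothetical dictionary of plain Python values.
pot_data = {
    "name": "excited",
    "Q_data": [0.0, 0.1, 0.2],
    "E_data": [0.5, 0.123456789012345, 0.0],
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "potential.json"
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(pot_data, handle, indent=2)  # pretty-printed, one item per line
    with open(path, encoding="utf-8") as handle:
        restored = json.load(handle)

# Floats are written with their shortest round-tripping representation,
# so the reloaded dictionary compares equal to the original.
print(restored == pot_data)
```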

write_yaml(data, filepath)

Write data to YAML file.

Parameters:

Name Type Description Default
data dict

Data to write

required
filepath str or Path

Output file path

required

Examples:

>>> config = {'potential': {...}, 'capture': {...}}
>>> write_yaml(config, "config.yaml")

write_csv(Q_data, E_data, filepath, header=True)

Write Q-E data to CSV file.

Parameters:

Name Type Description Default
Q_data NDArray[float64]

Configuration coordinates

required
E_data NDArray[float64]

Potential energies

required
filepath str or Path

Output file path

required
header bool

Include header row

True

Examples:

>>> write_csv(Q, E, "potential.csv")

write_npz(data, filepath, compressed=True)

Write arrays to NumPy NPZ file.

Parameters:

Name Type Description Default
data dict

Dictionary of arrays to save

required
filepath str or Path

Output file path

required
compressed bool

Use compression

True

Examples:

>>> data = {'Q': Q_data, 'E': E_data, 'eigenvalues': eigs}
>>> write_npz(data, "results.npz")

write_potential_data(Q_data, E_data, filepath, header=None, fmt='%.10e')

Write Q-E data to plain text file.

Parameters:

Name Type Description Default
Q_data NDArray[float64]

Configuration coordinates

required
E_data NDArray[float64]

Potential energies

required
filepath str or Path

Output file path

required
header str

Header comment line

None
fmt str

Format string for numbers

"%.10e"

Examples:

>>> write_potential_data(Q, E, "potential.dat", header="Q(amu^0.5·Å)  E(eV)")
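
The effect of the fmt argument can be previewed without touching the file system. The sketch below renders Q-E pairs with printf-style formatting; the helper format_qe_lines, the "# " comment prefix, and the two-space column separator are assumptions for illustration rather than the library's documented output layout.

```python
import io


def format_qe_lines(Q_data, E_data, header=None, fmt="%.10e"):
    """Hypothetical helper: render Q-E pairs as the text of a two-column file."""
    if len(Q_data) != len(E_data):
        raise ValueError("Q_data and E_data must have the same length")
    buffer = io.StringIO()
    if header is not None:
        buffer.write(f"# {header}\n")  # assumed comment prefix
    for q, e in zip(Q_data, E_data):
        # printf-style formatting, e.g. "%.10e" % 1.5 -> "1.5000000000e+00"
        buffer.write(f"{fmt % q}  {fmt % e}\n")
    return buffer.getvalue()


text = format_qe_lines([0.0, 1.5], [0.25, 0.0], header="Q(amu^0.5·Å)  E(eV)")
print(text)
```

With the default "%.10e", each value carries eleven significant digits, which is normally enough to preserve DFT total-energy differences.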

write_capture_results(config_coord, filepath, file_format='json', include_partial=False)

Write capture coefficient results to file.

Parameters:

Name Type Description Default
config_coord ConfigCoordinate

Configuration coordinate with computed results

required
filepath str or Path

Output file path

required
file_format str

Output format: json, yaml, csv, npz

"json"
include_partial bool

Include partial capture coefficients (large arrays)

False

Examples:

>>> write_capture_results(cc, "results.json")
>>> write_capture_results(cc, "results.csv", file_format="csv")

save_potential(potential, filepath, file_format=None)

Save Potential object to file (auto-detect format).

Parameters:

Name Type Description Default
potential Potential

Potential object to save

required
filepath str or Path

Output file path

required
file_format str

Force a specific format (json, yaml, npz, dat, csv). If None, auto-detect from the file extension.

None

Examples:

>>> save_potential(pot, "potential.json")
>>> save_potential(pot, "potential.npz")

doped Interface

Integration with the doped package for defect calculations.

Optional Dependency

The doped interface requires the optional doped extra: pip install carriercapture[doped]

carriercapture.io.doped_interface

Interface for doped package integration.

Provides functions to load defect data from the doped package and convert it into CarrierCapture Potential objects for carrier capture rate calculations.

The doped package (https://github.com/SMTG-Bham/doped) automates defect calculations and provides tools for configuration coordinate diagram generation. This module enables seamless integration between doped and CarrierCapture.

load_defect_entry(file_path)

Load DefectEntry from doped JSON.GZ file.

Parameters:

Name Type Description Default
file_path str or Path

Path to DefectEntry JSON.GZ file saved by doped

required

Returns:

Type Description
DefectEntry

Loaded defect entry object

Raises:

Type Description
ImportError

If doped package is not installed

FileNotFoundError

If file does not exist

Examples:

>>> defect = load_defect_entry("v_O_defect.json.gz")
>>> print(defect.name)
v_O_0

get_available_charge_states(defect_entry)

Get list of available charge states from DefectEntry.

Parameters:

Name Type Description Default
defect_entry DefectEntry

Loaded defect entry

required

Returns:

Type Description
List[int]

Available charge states

Examples:

>>> charges = get_available_charge_states(defect)
>>> print(charges)
[-2, -1, 0, 1, 2]

validate_charge_states(defect_entry, charge_initial, charge_final)

Validate that requested charge states are available in DefectEntry.

Parameters:

Name Type Description Default
defect_entry DefectEntry

Loaded defect entry

required
charge_initial int

Initial charge state

required
charge_final int

Final charge state

required

Raises:

Type Description
ValueError

If requested charge states are not available
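
The check itself amounts to a membership test against the available charge states. The sketch below works on a plain list of integers instead of a DefectEntry; check_charge_states is a hypothetical stand-in, and the error message wording is an assumption.

```python
def check_charge_states(available, charge_initial, charge_final):
    """Hypothetical stand-in: raise ValueError if a requested charge is missing."""
    missing = [q for q in (charge_initial, charge_final) if q not in available]
    if missing:
        raise ValueError(
            f"Charge state(s) {missing} not available; choose from {sorted(available)}"
        )


available = [-2, -1, 0, 1, 2]
check_charge_states(available, 0, 1)  # both present, so nothing is raised

try:
    check_charge_states(available, 0, 3)
except ValueError as err:
    message = str(err)
    print(message)
```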

suggest_Q0(struct_initial, struct_final, align=True)

Suggest Q0 value based on structure displacement.

Q0 is typically chosen as a point along the configuration coordinate where the wavefunction overlap is significant. A common choice is Q0 ~ 0.5 * ΔQ (midpoint between initial and final structures).

Parameters:

Name Type Description Default
struct_initial Structure

Initial defect structure

required
struct_final Structure

Final defect structure

required
align bool

Whether to align structures before calculating displacement

True

Returns:

Type Description
float

Suggested Q0 value (amu^0.5·Å)

Examples:

>>> Q0 = suggest_Q0(struct_i, struct_f)
>>> print(f"Suggested Q0: {Q0:.2f} amu^0.5·Å")
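
The midpoint rule Q0 ~ 0.5 * ΔQ can be worked through by hand for a toy system. The sketch below uses the standard mass-weighted displacement, ΔQ = sqrt(Σ m_i |Δr_i|²), on Cartesian coordinates; it skips structure alignment and periodic boundary handling, so it is an illustration under those assumptions and not a reimplementation of suggest_Q0. The masses and coordinates are made-up values.

```python
import math


def mass_weighted_displacement(masses, coords_initial, coords_final):
    """Return ΔQ = sqrt(sum_i m_i * |r_final_i - r_initial_i|^2) in amu^0.5·Å.

    Assumes Cartesian coordinates in Å, matching atom order, and no
    periodic wrapping between the two structures.
    """
    total = 0.0
    for mass, r_i, r_f in zip(masses, coords_initial, coords_final):
        total += mass * sum((b - a) ** 2 for a, b in zip(r_i, r_f))
    return math.sqrt(total)


# Hypothetical two-atom example (masses in amu, positions in Å).
masses = [16.0, 16.0]
coords_initial = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
coords_final = [(0.1, 0.0, 0.0), (1.0, 0.2, 0.0)]

dQ = mass_weighted_displacement(masses, coords_initial, coords_final)
Q0 = 0.5 * dQ  # midpoint between the initial and final structures
print(f"dQ = {dQ:.4f} amu^0.5·Å, suggested Q0 = {Q0:.4f} amu^0.5·Å")
```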

load_path_calculations(path_dir, ref_structure=None, energy_key='energy', verbose=False)

Load completed VASP calculations along configuration coordinate path.

Expects directory structure:

path_dir/
    image_000/    (VASP calculation)
    image_001/
    ...
    image_NNN/

Parameters:

Name Type Description Default
path_dir str or Path

Directory containing image_XXX/ subdirectories with VASP calculations

required
ref_structure Structure

Reference structure for alignment (typically initial structure). If None, uses structure from image_000 as reference.

None
energy_key str

How to extract energy: "energy" (electronic), "energy_per_atom", or "free_energy"

"energy"
verbose bool

Print progress information

False

Returns:

Name Type Description
Q NDArray[float64]

Configuration coordinates (amu^0.5·Å), shape (n_images,)

E NDArray[float64]

Potential energies (eV), shape (n_images,)

Raises:

Type Description
FileNotFoundError

If path directory or image subdirectories not found

ValueError

If VASP calculations failed or are incomplete

Examples:

>>> Q, E = load_path_calculations("cc_path/")
>>> print(f"Loaded {len(Q)} images")
>>> print(f"ΔQ = {Q[-1]:.2f} amu^0.5·Å")
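
The expected image_XXX/ layout can be checked before running the loader. The sketch below collects the image subdirectories in numerical order and raises FileNotFoundError when the directory or its images are missing; find_image_dirs is a hypothetical helper, and it does not inspect the VASP outputs themselves.

```python
import re
import tempfile
from pathlib import Path

IMAGE_PATTERN = re.compile(r"^image_(\d+)$")


def find_image_dirs(path_dir):
    """Hypothetical helper: return image_XXX subdirectories sorted by index."""
    root = Path(path_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Path directory not found: {root}")
    images = []
    for child in root.iterdir():
        match = IMAGE_PATTERN.match(child.name)
        if child.is_dir() and match:
            images.append((int(match.group(1)), child))
    if not images:
        raise FileNotFoundError(f"No image_XXX subdirectories in {root}")
    images.sort(key=lambda item: item[0])  # numerical, not lexical, order
    return [path for _, path in images]


# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("image_002", "image_000", "image_001", "notes"):
        (Path(tmp) / name).mkdir()
    names = [path.name for path in find_image_dirs(tmp)]

print(names)
```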

extract_cc_data_from_structures(struct_initial, struct_final, energy_initial, energy_final, n_images=10, align=True, verbose=False)

Extract Q-E data for configuration coordinate diagram from two structures.

Generates linear interpolation path between initial and final structures, with energies linearly interpolated. This is useful for quick estimates or when full NEB/path calculations are not available.

Parameters:

Name Type Description Default
struct_initial Structure

Initial defect structure (e.g., charge state 0)

required
struct_final Structure

Final defect structure (e.g., charge state +1)

required
energy_initial float

Total energy of initial structure (eV)

required
energy_final float

Total energy of final structure (eV)

required
n_images int

Number of interpolation points

10
align bool

Whether to align structures before interpolation

True
verbose bool

Print information about displacement and energy

False

Returns:

Name Type Description
Q_initial NDArray[float64]

Configuration coordinates for initial potential (amu^0.5·Å)

E_initial NDArray[float64]

Energies for initial potential (eV), harmonic approximation around initial

E_final NDArray[float64]

Energies for final potential (eV), harmonic approximation around final

Notes

This function provides a simple linear interpolation. For accurate results, use NEB calculations or single-point calculations along the path.

Examples:

>>> Q_i, E_i, E_f = extract_cc_data_from_structures(
...     struct_0, struct_1, energy_0, energy_1, n_images=15
... )

create_potential_from_doped(defect_entry, charge_state, Q_data=None, E_data=None, name=None)

Create Potential object from doped DefectEntry and Q-E data.

Parameters:

Name Type Description Default
defect_entry DefectEntry

Loaded defect entry from doped

required
charge_state int

Charge state for this potential

required
Q_data NDArray[float64]

Configuration coordinates (amu^0.5·Å). If None, must be provided later before fitting.

None
E_data NDArray[float64]

Potential energies (eV). If None, must be provided later before fitting.

None
name str

Name for the potential. If None, generated from defect_entry name.

None

Returns:

Type Description
Potential

Potential object with Q_data and E_data set

Examples:

>>> pot_i = create_potential_from_doped(
...     defect, charge_state=0, Q_data=Q, E_data=E_initial
... )
>>> pot_i.fit(fit_type="spline", order=4)
>>> pot_i.solve(nev=60)

prepare_ccd_structures(defect_entry_initial, defect_entry_final, verbose=False)

Prepare structures for CCD calculation from two DefectEntry objects.

Uses doped's orient_s2_like_s1() to ensure structures are properly aligned for the shortest linear path between charge states, handling symmetry-equivalent configurations automatically.

Parameters:

Name Type Description Default
defect_entry_initial DefectEntry

DefectEntry for the initial charge state (e.g., excited state)

required
defect_entry_final DefectEntry

DefectEntry for the final charge state (e.g., ground state)

required
verbose bool

Print alignment information and dQ values

False

Returns:

Type Description
dict

Dictionary containing:

- 'struct_initial': Structure - Initial state structure
- 'struct_final': Structure - Final state structure (aligned)
- 'struct_final_original': Structure - Original final structure (before alignment)
- 'dQ': float - Mass-weighted displacement (amu^0.5·Å)
- 'energy_initial': float - Total energy of initial state (eV)
- 'energy_final': float - Total energy of final state (eV)
- 'dE': float - Energy difference E_final - E_initial (eV)
- 'charge_initial': int - Initial charge state
- 'charge_final': int - Final charge state
- 'defect_name': str - Name of the defect

Raises:

Type Description
ValueError

If defect entries are for different defects (by name)

ImportError

If doped package is not installed

Examples:

>>> entry_0 = load_defect_entry("v_O_0.json.gz")
>>> entry_1 = load_defect_entry("v_O_+1.json.gz")
>>> ccd_data = prepare_ccd_structures(entry_0, entry_1, verbose=True)
>>> print(f"dQ = {ccd_data['dQ']:.2f} amu^0.5*A")
>>> print(f"dE = {ccd_data['dE']:.3f} eV")

Notes

The alignment process uses doped's orient_s2_like_s1(), which:

- Finds the symmetry-equivalent orientation of struct_final that minimizes the linear path distance to struct_initial
- Ensures atomic indices match for proper displacement calculation
- Returns the shortest-path configuration for NEB/CCD calculations

generate_ccd_path(ccd_data, n_images=11, displacements=None, output_dir=None, write_vasp=False)

Generate interpolated structures along the configuration coordinate path.

Uses doped's get_path_structures() to create linearly interpolated structures between initial and final states for single-point calculations.

Parameters:

Name Type Description Default
ccd_data dict

Output from prepare_ccd_structures() containing aligned structures

required
n_images int

Number of images along the path (including endpoints). Recommended: odd number so midpoint is included.

11
displacements NDArray[float64]

Explicit fractional displacements (0.0 to 1.0) for path generation. If provided, overrides n_images. Example: np.linspace(0, 1, 11)

None
output_dir str or Path

Directory to write VASP input files. If None, only returns structures.

None
write_vasp bool

Whether to write VASP POSCAR files to output_dir. Requires output_dir to be set.

False

Returns:

Type Description
dict

Dictionary containing:

- 'path_structures': Dict[float, Structure] - Interpolated structures keyed by fractional displacement (0.0 to 1.0)
- 'Q_fractions': NDArray - Fractional positions along path (0 to 1)
- 'Q_values': NDArray - Actual Q values (amu^0.5·Å), 0 to dQ
- 'n_images': int - Number of images generated
- 'output_dir': Path or None - Where files were written (if any)

Examples:

>>> path_data = generate_ccd_path(ccd_data, n_images=11)
>>> print(f"Generated {path_data['n_images']} structures")
>>> print(f"Q range: 0 to {path_data['Q_values'][-1]:.2f}")
>>> # Write VASP inputs for single-point calculations
>>> path_data = generate_ccd_path(
...     ccd_data, n_images=11,
...     output_dir="ccd_path/", write_vasp=True
... )

Notes

For accurate CCD calculations, single-point DFT calculations should be performed at each interpolated structure. The resulting energies can be loaded using load_path_calculations() and used to create Potential objects.
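
The relationship between n_images, the fractional displacements, and the Q values can be sketched in a few lines. The helper path_fractions is hypothetical, and the dQ value is made up; the sketch assumes evenly spaced fractions with both endpoints included and Q = fraction * dQ, consistent with the returned 'Q_fractions' and 'Q_values' described above.

```python
def path_fractions(n_images=11, displacements=None):
    """Hypothetical helper: fractional positions (0.0 to 1.0) along the path."""
    if displacements is not None:
        # Explicit displacements take precedence over n_images.
        return [float(x) for x in displacements]
    if n_images < 2:
        raise ValueError("n_images must be at least 2 to include both endpoints")
    return [i / (n_images - 1) for i in range(n_images)]


dQ = 2.0  # made-up mass-weighted displacement in amu^0.5·Å
fractions = path_fractions(n_images=11)
Q_values = [fraction * dQ for fraction in fractions]

# With an odd number of images the midpoint (fraction 0.5) is on the grid.
print(fractions[5], Q_values[5], Q_values[-1])
```

An odd n_images places one image exactly at the midpoint of the path, which is why the parameter description recommends it.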

estimate_phonon_frequency(Q_data, E_data, Q0=None, method='curvature')

Estimate effective phonon frequency from Q-E data.

The phonon frequency hw is related to the curvature of the potential energy surface near the minimum. This function estimates hw from DFT single-point calculations along the CCD path.

Parameters:

Name Type Description Default
Q_data NDArray[float64]

Configuration coordinates (amu^0.5*Angstrom)

required
E_data NDArray[float64]

Potential energies (eV)

required
Q0 float

Equilibrium position. If None, uses the position of minimum E_data.

None
method (curvature, harmonic_fit)

Method for frequency estimation:

- "curvature": Finite difference second derivative at minimum
- "harmonic_fit": Least-squares fit of E = E0 + 0.5·k·(Q - Q0)^2

"curvature"

Returns:

Type Description
dict

Dictionary containing:

- 'hw': float - Phonon energy (eV)
- 'hw_meV': float - Phonon energy (meV)
- 'omega': float - Angular frequency (rad/fs)
- 'frequency_THz': float - Frequency (THz)
- 'Q0': float - Equilibrium position used
- 'E0': float - Minimum energy
- 'curvature': float - d²E/dQ² at minimum (eV/(amu·Å²))
- 'method': str - Method used for estimation

Examples:

>>> Q, E = load_path_calculations("ccd_path/")
>>> freq_data = estimate_phonon_frequency(Q, E)
>>> print(f"Estimated phonon energy: {freq_data['hw_meV']:.1f} meV")

Notes

The relationship between curvature and phonon frequency is: hw = hbar * sqrt(k / m_eff) where k = d²E/dQ² and m_eff = 1 amu (since Q is mass-weighted).

For accurate phonon frequencies, phonon calculations (DFPT/finite displacement) should be used. This estimate is useful for initial harmonic potential approximations.
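
The unit conversion behind hw = hbar * sqrt(k / m_eff) is easy to get wrong, so a worked sketch may help. It takes a curvature in eV/(amu·Å²), converts it to SI, and returns the phonon energy in eV, using a central finite difference at the lowest-energy grid point. The function names are hypothetical, and the sketch assumes a uniformly spaced grid with an interior minimum; it is not the library's implementation.

```python
import math

# CODATA 2018 constants.
EV_IN_JOULE = 1.602176634e-19
AMU_IN_KG = 1.66053906660e-27
ANGSTROM_IN_M = 1e-10
HBAR_EV_S = 6.582119569e-16


def phonon_energy_from_curvature(curvature):
    """Convert k = d²E/dQ² in eV/(amu·Å²) to a phonon energy in eV (m_eff = 1 amu)."""
    if curvature <= 0:
        raise ValueError("curvature must be positive at a minimum")
    omega_squared = curvature * EV_IN_JOULE / (AMU_IN_KG * ANGSTROM_IN_M ** 2)
    return HBAR_EV_S * math.sqrt(omega_squared)  # hbar * omega, omega in rad/s


def central_difference_curvature(Q_data, E_data):
    """Second derivative at the lowest-energy point of a uniformly spaced grid."""
    i = min(range(len(E_data)), key=E_data.__getitem__)
    if i == 0 or i == len(E_data) - 1:
        raise ValueError("minimum lies on the grid edge; extend the Q range")
    h = Q_data[i + 1] - Q_data[i]
    return (E_data[i - 1] - 2.0 * E_data[i] + E_data[i + 1]) / h ** 2


# Synthetic harmonic data with a known curvature of 0.5 eV/(amu·Å²).
k_true = 0.5
Q = [-1.0 + 0.25 * i for i in range(9)]
E = [0.5 * k_true * q * q for q in Q]

curvature = central_difference_curvature(Q, E)
hw = phonon_energy_from_curvature(curvature)
print(f"curvature = {curvature:.3f} eV/(amu·Å²), hw = {1000 * hw:.1f} meV")
```

A curvature of 1 eV/(amu·Å²) corresponds to roughly 65 meV, which gives a quick sanity check on fitted values.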

calculate_Q0_crossing(pot_initial, pot_final, method='crossing', search_range=None)

Calculate optimal Q0 for overlap integral evaluation.

Q0 determines where the (Q - Q0) operator is evaluated in the electron-phonon coupling matrix element. The optimal choice is typically near the crossing point of the two potential surfaces, where wavefunction overlap is maximized.

Parameters:

Name Type Description Default
pot_initial Potential

Fitted initial state potential (must have fit_func or E array)

required
pot_final Potential

Fitted final state potential (must have fit_func or E array)

required
method (crossing, midpoint, minimum_barrier)

Method for Q0 determination:

- "crossing": Q where E_initial(Q) = E_final(Q)
- "midpoint": Simple midpoint between Q0_initial and Q0_final
- "minimum_barrier": Q that minimizes max(E_i, E_f) along path

"crossing"
search_range tuple[float, float]

(Q_min, Q_max) range to search for crossing. If None, uses the overlap of the two potential Q grids.

None

Returns:

Type Description
dict

Dictionary containing:

- 'Q0': float - Recommended Q0 value (amu^0.5·Å)
- 'E_crossing': float - Energy at crossing point (eV), or None if no crossing
- 'barrier_initial': float - Barrier from initial minimum to Q0 (eV)
- 'barrier_final': float - Barrier from final minimum to Q0 (eV)
- 'method': str - Method used
- 'Q0_initial': float - Equilibrium Q of initial potential
- 'Q0_final': float - Equilibrium Q of final potential

Raises:

Type Description
ValueError

If no crossing point found in search range (for "crossing" method)

ValueError

If potentials don't have required data

Examples:

>>> Q0_data = calculate_Q0_crossing(pot_i, pot_f, method="crossing")
>>> print(f"Q0 = {Q0_data['Q0']:.2f} amu^0.5*A")
>>> print(f"Barrier = {Q0_data['barrier_initial']:.3f} eV")

Notes

The crossing point method is physically motivated: the electron-phonon coupling is strongest where the electronic states are degenerate. However, for strongly asymmetric potentials, "minimum_barrier" may give better numerical convergence.
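
The "crossing" method can be illustrated with two displaced parabolas and a bisection search on E_initial(Q) - E_final(Q). The helper find_crossing and the potential parameters are made up for this sketch; the root-finding algorithm that calculate_Q0_crossing actually uses is not stated on this page.

```python
def find_crossing(E_initial, E_final, Q_min, Q_max, tol=1e-10, max_iter=200):
    """Hypothetical helper: bisection on E_initial(Q) - E_final(Q) in [Q_min, Q_max]."""

    def diff(Q):
        return E_initial(Q) - E_final(Q)

    lo, hi = Q_min, Q_max
    f_lo, f_hi = diff(lo), diff(hi)
    if f_lo == 0.0:
        return lo
    if f_hi == 0.0:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("no crossing point bracketed by the search range")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = diff(mid)
        if f_mid == 0.0 or (hi - lo) < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid  # sign change lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # sign change lies in the upper half
    return 0.5 * (lo + hi)


# Two made-up harmonic potentials: minima at Q = 0 and Q = 2 amu^0.5·Å,
# with the final state 0.5 eV lower in energy.
def E_initial(Q):
    return 0.5 * Q ** 2


def E_final(Q):
    return 0.5 * (Q - 2.0) ** 2 - 0.5


Q0 = find_crossing(E_initial, E_final, 0.0, 2.0)
barrier_initial = E_initial(Q0) - E_initial(0.0)
print(f"Q0 = {Q0:.4f} amu^0.5·Å, barrier from initial minimum = {barrier_initial:.4f} eV")
```

For these parabolas the crossing can be verified analytically: the difference of the two energies is 2Q - 1.5, which vanishes at Q = 0.75.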

create_ccd_from_defect_entries(defect_entry_initial, defect_entry_final, path_dir_initial=None, path_dir_final=None, fit_type='spline', fit_kwargs=None, nev_initial=180, nev_final=60, W=None, degeneracy=1, Q0_method='auto', use_harmonic=False, hw=None, verbose=False)

Create ConfigCoordinate from two doped DefectEntry objects.

This is the main convenience function for the doped -> CarrierCapture workflow. It handles structure alignment, Q-E data loading or generation, potential fitting, Schrodinger equation solving, and ConfigCoordinate creation in a single call.

Parameters:

Name Type Description Default
defect_entry_initial DefectEntry or str or Path

DefectEntry for initial state, or path to JSON.GZ file

required
defect_entry_final DefectEntry or str or Path

DefectEntry for final state, or path to JSON.GZ file

required
path_dir_initial str or Path

Directory with VASP single-point calculations for initial state. If None, uses harmonic approximation based on endpoint energies.

None
path_dir_final str or Path

Directory with VASP single-point calculations for final state. If None, uses harmonic approximation based on endpoint energies.

None
fit_type str

Fitting method for potential: "spline", "harmonic", "morse", etc. Ignored if use_harmonic=True.

"spline"
fit_kwargs dict

Additional arguments for potential fitting (order, smoothness, etc.)

None
nev_initial int

Number of eigenvalues to compute for initial potential

180
nev_final int

Number of eigenvalues to compute for final potential

60
W float

Electron-phonon coupling matrix element (eV). If None, must be set later before calculating capture coefficient.

None
degeneracy int

Degeneracy factor for the capture process

1
Q0_method (crossing, midpoint, auto)

Method for determining Q0. "auto" tries "crossing" first and falls back to "midpoint" if no crossing is found.

"auto"
use_harmonic bool

Use simple harmonic potentials instead of fitting to path data. Useful for quick estimates or when path calculations unavailable.

False
hw float

Phonon energy for harmonic approximation (eV). If None and use_harmonic=True, estimates from structure displacement.

None
verbose bool

Print progress information

False

Returns:

Name Type Description
cc ConfigCoordinate

Configured ConfigCoordinate with potentials solved and ready for calculate_overlap() and calculate_capture_coefficient()

metadata dict

Dictionary with workflow details:

- 'ccd_data': Output from prepare_ccd_structures()
- 'Q0': float - Q0 value used/recommended
- 'Q0_method': str - Method used for Q0
- 'pot_initial': Potential - Reference to initial potential
- 'pot_final': Potential - Reference to final potential
- 'fit_type': str - Fitting type used
- 'hw_estimated': float or None - Estimated phonon energy if available

Examples:

>>> # Full workflow with path calculations
>>> cc, meta = create_ccd_from_defect_entries(
...     "v_O_0.json.gz", "v_O_+1.json.gz",
...     path_dir_initial="ccd_path_0/",
...     path_dir_final="ccd_path_+1/",
...     fit_type="spline",
...     W=0.068,
...     verbose=True
... )
>>> cc.calculate_overlap(Q0=meta['Q0'])
>>> cc.calculate_capture_coefficient(volume=1e-21, temperature=temps)
>>> # Quick harmonic estimate (no path calculations)
>>> cc, meta = create_ccd_from_defect_entries(
...     entry_0, entry_1,
...     use_harmonic=True,
...     hw=0.008,  # 8 meV phonon
...     W=0.068,
... )

Notes

Workflow steps performed:

1. Load DefectEntry objects (if paths provided)
2. Align structures using orient_s2_like_s1()
3. Load or generate Q-E data for both potentials
4. Fit potentials using specified method
5. Solve Schrodinger equation for both potentials
6. Calculate optimal Q0
7. Create ConfigCoordinate with all parameters

For production calculations, path_dir_initial and path_dir_final should contain single-point DFT calculations. The harmonic approximation is useful for quick screening but may not capture anharmonic effects.

Usage Examples

Loading Data

from carriercapture.io.readers import read_csv
from carriercapture.core import Potential

# Load Q-E data from a CSV file
Q_data, E_data = read_csv('potential_data.csv')

# Create potential from data
pot = Potential(Q_data=Q_data, E_data=E_data)
pot.fit(fit_type='spline', order=4, smoothness=0.001)

Saving Results

from carriercapture.io.writers import write_capture_results
from carriercapture.core import ConfigCoordinate

# After calculating capture coefficient
cc = ConfigCoordinate(...)
cc.calculate_capture_coefficient(...)

# Save to JSON
write_capture_results(cc, 'results.json', file_format='json')

# Save to NPZ
write_capture_results(cc, 'results.npz', file_format='npz')

doped Integration

from carriercapture.io.doped_interface import (
    load_defect_entry,
    create_potential_from_doped
)

# Load defect data from doped
defect = load_defect_entry('defect.json.gz')

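# Create potentials for the charge state transition.
# Q_data and E_data are omitted here, so they must be provided before fitting.
pot_initial = create_potential_from_doped(defect, charge_state=0)
pot_final = create_potential_from_doped(defect, charge_state=+1)

# Continue with the standard workflow once Q-E data is set
# and each potential has been fitted with fit()
pot_initial.solve(nev=180)
pot_final.solve(nev=60)
# ...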

File Formats

Supported Input Formats

Format Extension Description
CSV .csv, .dat Comma or space-separated values
JSON .json Structured JSON with metadata
NPZ .npz NumPy compressed arrays
HDF5 .h5, .hdf5 Hierarchical data format
doped .json.gz doped DefectEntry files

Supported Output Formats

Format Extension Use Case
JSON .json Human-readable, portable
NPZ .npz Fast, compact, NumPy-native
HDF5 .h5 Large datasets, hierarchical
CSV .csv Spreadsheet software

See Also