
Roadmap

TrainCraft is developed in phased, independently testable chunks. The dataset + selection layer is the spine; everything else builds on it.


✅ Phase 0 — Foundation

Done. A clean, installable, testable package with no globals.

  • pydantic v2 config models (discriminated unions, extra=forbid)
  • Structure (with content hash), registry, Workspace/Job, provenance
  • Geometry: file/scratch sources; nanotube/molecule builders; vacuum/supercell/perturb transforms
  • Calculators: emt, tblite/xtb, mace (MP0 + fine-tuned)
  • Sampling: md (Langevin NVT), rattle (HiPhive)
  • Selection funnel: physicality → dedup → diversity (FPS)
  • Dataset: extxyz IO with provenance; hash-dedup
  • CLI: run / validate / new / plugins
  • 6 annotated examples, 19 tests, CI on GitHub Actions
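
The diversity step at the end of the selection funnel can be illustrated with a minimal farthest-point sampling (FPS) sketch. This is not TrainCraft's implementation: the function name, the use of plain Euclidean distance, and the toy 2D descriptors are assumptions made for illustration, and the code uses only the standard library to stay dependency-free.

```python
import math
from typing import Sequence


def farthest_point_sampling(
    points: Sequence[Sequence[float]], k: int, start: int = 0
) -> list[int]:
    """Greedily pick k indices so each new point is as far as possible
    from everything already selected (hypothetical helper)."""
    n = len(points)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and len(points)")
    selected = [start]
    # min_dist[i] is the distance from point i to its nearest selected point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < k:
        # The next pick is the point farthest from the current selection.
        nxt = max(range(n), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


# Toy descriptors: three near-duplicates around the origin and two outliers.
descriptors = [(0.0, 0.0), (0.1, 0.0), (5.0, 0.0), (5.0, 5.0), (0.0, 0.1)]
picked = farthest_point_sampling(descriptors, k=3)
print(picked)
```

With these toy descriptors the two outliers are picked before either near-duplicate of the starting point, which is the behaviour a diversity filter needs after physicality checks and deduplication.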

✅ Phase 1, Chunk 1 — Molecules on Surfaces + Monte Carlo

Done. Fragment identity + surface adsorbate builders + Metropolis MC.

  • core/fragments.py — per-atom tc_fragment array; infer_fragments for reactive runs
  • smiles source — RDKit ETKDG + MMFF
  • surface_adsorbate builder — single adsorbate on crystalline slab
  • surface_packing builder — N-molecule coverage via Packmol
  • monte_carlo sampler — translate/rotate/conformer-swap with Metropolis acceptance
  • Examples 07–11 (CO on Cu, ethanol, butane)
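
The Metropolis acceptance rule used by a Monte Carlo sampler of this kind can be sketched as follows. The function name, the energy unit (eV) and the temperature handling are assumptions for illustration, not the monte_carlo sampler's actual interface.

```python
import math
import random

KB_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def metropolis_accept(delta_e: float, temperature: float, rng: random.Random) -> bool:
    """Accept a trial move with probability min(1, exp(-delta_e / kT)).

    delta_e is E(trial) - E(current) in eV (assumed unit); temperature is in K.
    """
    if delta_e <= 0.0:
        return True  # downhill or neutral moves are always accepted
    return rng.random() < math.exp(-delta_e / (KB_EV_PER_K * temperature))


rng = random.Random(0)
downhill = metropolis_accept(-0.05, 300.0, rng)  # always accepted
uphill_probability = math.exp(-0.05 / (KB_EV_PER_K * 300.0))
print(downhill, round(uphill_probability, 3))
```

The same test applies to every move type (translate, rotate, conformer swap); only the way the trial structure is generated differs.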

✅ Phase 1, Chunk 2 — Mechanical Geometry Breadth

Done. All common bulk, surface, and 2D structure types.

  • core/converter.py — ASE ↔ pymatgen ↔ RDKit bridge
  • url source — download → ASE read
  • crystal builder — bulk + supercell + vacancy/substitution/interstitial defects
  • slab builder — named facet or arbitrary Miller indices; all-framework
  • layered builder — graphene/hBN/MX₂; AA/AB stacking; twist (non-periodic moiré flake)
  • Transforms: strain (hydrostatic/Voigt), rotate, set_pbc
  • Examples 12–14, 36 additional tests
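
As an illustration of what a hydrostatic or Voigt strain transform does to a cell, here is a minimal sketch. It assumes the Voigt order (xx, yy, zz, yz, xz, xy), engineering shear strains (halved in the tensor), and cell vectors stored as rows; the strain transform's real conventions may differ, and the helper names are hypothetical.

```python
def voigt_to_deformation(voigt):
    """Build the 3x3 matrix I + eps from six Voigt strain components."""
    exx, eyy, ezz, eyz, exz, exy = voigt
    return [
        [1.0 + exx, 0.5 * exy, 0.5 * exz],
        [0.5 * exy, 1.0 + eyy, 0.5 * eyz],
        [0.5 * exz, 0.5 * eyz, 1.0 + ezz],
    ]


def apply_strain(cell, voigt):
    """Return cell @ (I + eps) for a cell given as three row vectors."""
    f = voigt_to_deformation(voigt)
    return [
        [sum(row[k] * f[k][j] for k in range(3)) for j in range(3)]
        for row in cell
    ]


def hydrostatic(eps):
    """Equal strain along all three axes, no shear."""
    return (eps, eps, eps, 0.0, 0.0, 0.0)


cell = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
strained = apply_strain(cell, hydrostatic(0.01))  # 1 % isotropic expansion
sheared = apply_strain(cell, (0.0, 0.0, 0.0, 0.0, 0.0, 0.02))  # xy shear only
```

In a real transform the atomic positions would be scaled together with the cell; the sketch only shows the cell update.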

✅ Phase 1, Chunk 3 — Database Providers, Liquids, Intercalation, Constraints

Done. Remaining geometry breadth (except the polymer builder).

  • Sources: materials_project (mp-api), optimade and pubchem (both dependency-free)
  • liquid builder — Packmol multi-species box (explicit cell or density-driven)
  • intercalation builder — guests per gallery of a planar layered host, with staging
  • constraints transform — FixAtoms on the final structure (fixes the legacy index-misalignment bug after reordering builders)
  • Examples 15–17, unit tests (network/Packmol paths skip when deps are absent)
  • Deferred: polymer (PySoftK) — dependency is unreliable; wrapper to be verified against the live API rather than guessed

🟡 Phase 2 — DFT Labeling

Label selected frames with energy, forces, stress, dipole, and polarizability.

  • calculators/dft.py — FHI-aims (fhi_aims) and Quantum ESPRESSO (qe) factories. FHI-aims polarizability is computed via DFPT (dielectric for periodic systems, polarizability for molecular ones, auto-selected). The run command is injected from the environment so the plugins stay container-agnostic. QE polarizability raises NotImplementedError (it needs a ph.x run).
  • ✅ Labeling stage ([labeling]): labels the selected frames, tags them dft_labeled with level of theory, writes labeled_dft/ (labeled.extxyz, manifest.json, per-frame work dirs). examples/18.
  • 🔜 Cost-aware labeling: polarizability flagged as the expensive task
  • 🔜 Production runs on any Slurm cluster via the DFT container (or runtime=native)

🟡 Cross-cutting — Packaging & HPC Deployment (any Slurm cluster)

Run the real workflow on any Slurm cluster via Apptainer (or the cluster's own binaries). Nothing is site-specific in the code. See DESIGN.md §20 and the Run on HPC (Slurm + Apptainer) guide.

  • ✅ Architecture + four Apptainer *.def files (containers/): traincraft-core (CPU orchestrator), traincraft-mlip (GPU MACE), traincraft-qe (QE, open source), traincraft-dft (FHI-aims — private, licensed). DFT images are compiled from source (self-contained UCX+PMIx+OpenMPI).
  • ✅ Resumable per-stage execution (traincraft stage) + a portable Slurm executor that renders dependency-chained sbatch scripts (traincraft submit, [orchestration] config) with two cluster-agnostic knobs: runtime (apptainer images | native host binaries) and mpi (pmix|cray_shasta|pmi2|none). examples/19 (Leonardo, apptainer+pmix), examples/20 (LUMI, native+cray_shasta).
  • 🔜 Build + validate the images on a real cluster (single-node DFT, then multi-node)
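
The idea of dependency-chained sbatch submission can be sketched as below. The sketch renders a small driver script in which each stage is submitted only after the previous one succeeds, using Slurm's `--parsable` and `--dependency=afterok:` options. The stage names, the `<stage>.sbatch` file names and the `render_chain` helper are hypothetical and do not describe what traincraft submit actually emits.

```python
def render_chain(stages: list[str]) -> str:
    """Render a bash driver that submits one sbatch script per stage,
    each depending on successful completion of the previous stage."""
    lines = ["#!/bin/bash", "set -euo pipefail"]
    prev_var = None
    for stage in stages:
        var = f"JOB_{stage.upper()}"
        # afterok: start only if the previous job finished with exit code 0.
        dep = f" --dependency=afterok:${{{prev_var}}}" if prev_var else ""
        # --parsable makes sbatch print the job ID so it can be captured.
        lines.append(f"{var}=$(sbatch --parsable{dep} {stage}.sbatch)")
        prev_var = var
    return "\n".join(lines) + "\n"


script = render_chain(["sample", "label", "train"])
print(script)
```

Because the chain is expressed purely through Slurm job dependencies, the same driver works whether the individual stage scripts call Apptainer images or native host binaries.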

🟡 Phase 3 — Training + Validation (in progress)

Train a multi-head MACE model and measure quality end-to-end. Delivered in chunks: training first (validation builds on it), then dataset health, then validation.

✅ Chunk 1 — Training (training/). MACE fine-tune / train-from-scratch wrapper over mace_run_train, as a pluggable trainer registry backend. Multi-head property targets (energy/forces/stress + dipole + polarizability) map onto MACE's model types and losses (AtomicDipolesMACE / EnergyDipolesMACE / AtomicDielectricMACE). The train stage consumes the dataset and emits a model tree (model/<name>.model + manifest); on HPC it runs as a GPU (--nv) step in traincraft-mlip.sif with the command injected from the environment. Fine-tuning defaults follow Tompa et al. (arXiv:2606.12704): foundation-consistent E0s, multihead replay against forgetting, weight_decay=0, high EMA, constant energy-prioritised loss weights. See Training; examples/21.

🔜 Chunk 2 — Dataset health tooling (datasets/). Composition/space/volume coverage maps, per-element force distributions with outlier flags, extrapolation grade, redundancy report.

🔜 Chunk 3 — Validation (validation/). Per-property parity + RMSE/MAE per element, learning curves, NVE/MD stability, EOS/phonons, and IR/Raman spectra reconstructed from MLIP-driven MD vs DFT/experiment.


🔜 Phase 4 — Active-Learning Loop

Close the loop: explore → select → label → retrain → converge.

  • selection/uncertainty.py — committee/ensemble uncertainty selector
  • active_learning/ — full loop with resume/idempotency
  • Convergence criteria: val force-RMSE + spectral error thresholds
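
A committee-based uncertainty selector could work along the following lines. This is a sketch under stated assumptions: the disagreement metric (maximum over atoms of the RMS deviation of force vectors from the committee mean), the threshold, and both function names are illustrative choices, not the planned selection/uncertainty.py design.

```python
import math


def force_disagreement(committee_forces):
    """committee_forces[m][a] is the (fx, fy, fz) force that model m predicts
    for atom a. Returns the largest per-atom RMS deviation from the mean."""
    n_models = len(committee_forces)
    n_atoms = len(committee_forces[0])
    worst = 0.0
    for a in range(n_atoms):
        mean = [
            sum(committee_forces[m][a][c] for m in range(n_models)) / n_models
            for c in range(3)
        ]
        variance = sum(
            sum((committee_forces[m][a][c] - mean[c]) ** 2 for c in range(3))
            for m in range(n_models)
        ) / n_models
        worst = max(worst, math.sqrt(variance))
    return worst


def select_uncertain(frames, threshold):
    """Return indices of frames whose committee disagreement exceeds threshold."""
    return [i for i, f in enumerate(frames) if force_disagreement(f) > threshold]


# Two single-atom frames, each predicted by a two-model committee.
frames = [
    [[(1.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)]],  # models agree
    [[(1.0, 0.0, 0.0)], [(0.0, 0.0, 0.0)]],  # models disagree
]
to_label = select_uncertain(frames, threshold=0.1)
print(to_label)
```

Frames flagged this way would be the ones sent to the labeling stage in the next loop iteration.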

🔜 Phase 5 — Orchestration

Parallel execution of the active-learning loop.

  • Local engine hardened: threadpool for independent jobs
  • QuACC adapter: explore + label stages as a parallel DAG
  • Identical science, swappable engine
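
The "identical science, swappable engine" idea can be sketched with a small engine interface and two interchangeable implementations. The `Engine` protocol, the class names and the `fake_label` job are hypothetical and only illustrate the pattern; a QuACC adapter would be a third implementation of the same interface.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Protocol


class Engine(Protocol):
    """Anything that can map a job function over independent inputs."""

    def map(self, fn: Callable, items: Iterable) -> list: ...


class SerialEngine:
    def map(self, fn, items):
        return [fn(x) for x in items]


class ThreadPoolEngine:
    def __init__(self, max_workers: int = 4):
        self.max_workers = max_workers

    def map(self, fn, items):
        # Executor.map returns results in input order, so the output
        # matches the serial engine regardless of completion order.
        with ThreadPoolExecutor(max_workers=self.max_workers) as pool:
            return list(pool.map(fn, items))


def fake_label(frame_id: int) -> dict:
    """Stand-in for an independent job such as labeling one frame."""
    return {"frame": frame_id, "energy": -1.5 * frame_id}


def run_labeling(engine: Engine, frame_ids: list) -> list:
    return engine.map(fake_label, frame_ids)


serial = run_labeling(SerialEngine(), list(range(8)))
threaded = run_labeling(ThreadPoolEngine(max_workers=4), list(range(8)))
```

The workflow code calls only `engine.map`, so switching the execution backend does not change the results it produces.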

🔜 Phase 6 — Polish & Extras

  • Full public API docs + library-usage tutorials (including Raman use case)
  • Additional MLIP backends: MatterSim, Orb, SevenNet, CHGNet

Agent workbench — a purpose-built web UI

A single browser app, served from the VM (WebGL rendering needs no X server), that combines a conversational agent (the workflow pattern is introduced in Tutorial 11) with tabbed views over one workflow. It is a front-end over the existing TOML spine — the same configs the CLI and agent already use, no parallel logic:

  • Chat — driven by Pi.dev as the agent backend (not a homegrown loop); Pi.dev does the agentic work — reading the schema, writing, validating and running configs — and the workbench renders the conversation and its results inline.
  • Geometry — interactive 3D of the structure the agent just built (weas-widget / py3Dmol), with natural-language edits round-tripping to the agent.
  • Workflow — the node-based editor: the pipeline DAG (geometry → sample → select → label → dataset → train) as nodes, edited visually and (de)serialised to/from the TOML the CLI runs.
  • Dataset — interactive exploration of the generated dataset with chemiscope (structure–property maps linked to per-frame structures, descriptors, energies and forces).

Likely stack: Streamlit or Gradio, stmol/py3Dmol for 3D rendering, and the chemiscope widget for dataset exploration; the node editor emits the serialised TOML DAG. Details TBD.


Dependency graph

graph TD
    P0["✅ Phase 0<br/>Foundation"]
    P1A["✅ Phase 1 Ch.1<br/>Surfaces + MC"]
    P1B["✅ Phase 1 Ch.2<br/>Geometry breadth"]
    P1C["✅ Phase 1 Ch.3<br/>providers, liquid, intercalation"]
    P2["🟡 Phase 2<br/>DFT labeling"]
    HPC["🟡 Containers + HPC<br/>any Slurm cluster"]
    P3["🟡 Phase 3<br/>Training + Validation"]
    P4["🔜 Phase 4<br/>Active Learning"]
    P5["🔜 Phase 5<br/>Orchestration"]
    P6["🔜 Phase 6<br/>Polish"]

    P0 --> P1A --> P1B --> P1C
    P0 --> P2
    P2 --> P3 --> P4 --> P5 --> P6
    HPC -.-> P2
    HPC -.-> P3