Train & Use Your Own MACE Model
TrainCraft both trains a MACE model (the [training] stage) and lets you
plug a trained model back in as the exploration engine for the next
active-learning iteration. This page is the quick task reference; for a guided,
illustrated walkthrough see Tutorial 10 · Training.
Train a model (the [training] stage)
Add a [training] section to any config that produces a dataset. The stage
splits the data, re-keys the labels for MACE, and runs mace_run_train for you:
[training]
type = "mace"
name = "my_model"
foundation_model = "medium" # small | medium | large | mace-mp0 | <path.model>
strategy = "multihead" # multihead replay (robust) | naive | scratch
heads = ["energy", "forces", "stress"]
e0s = "foundation"
pt_train_file = "mp" # replay data for multihead fine-tuning
device = "cuda" # "cpu" for tiny demos
pixi run -e mace traincraft run my_run.toml # whole pipeline, incl. training
# or just the training stage on an existing dataset:
pixi run -e mace traincraft stage train my_run.toml
The trained potential lands at runs/<run>/model/<name>.model (for example,
runs/my_run/model/my_model.model), alongside a manifest.json recording exactly
how it was trained. See the Config Schema for every [training] field and the
Training concept page for the rationale behind the defaults.
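The data split mentioned at the top of this section can be pictured with a minimal sketch. The function name `split_frames`, the 10% validation fraction, and the seeding scheme are all assumptions made for illustration; the values TrainCraft actually uses are documented in the Config Schema.

```python
import random


def split_frames(frames, valid_fraction=0.1, seed=0):
    """Shuffle deterministically, then split into (train, valid) lists.

    Hypothetical helper: valid_fraction and seed are illustrative defaults,
    not TrainCraft's documented ones.
    """
    if not 0.0 < valid_fraction < 1.0:
        raise ValueError("valid_fraction must be strictly between 0 and 1")
    indices = list(range(len(frames)))
    # A private Random instance keeps the split reproducible for a given seed.
    random.Random(seed).shuffle(indices)
    # Keep at least one validation frame whenever there is more than one frame.
    n_valid = max(1, round(len(frames) * valid_fraction)) if len(frames) > 1 else 0
    valid_idx = set(indices[:n_valid])
    train = [f for i, f in enumerate(frames) if i not in valid_idx]
    valid = [f for i, f in enumerate(frames) if i in valid_idx]
    return train, valid
```

Fixing the seed means that re-running the stage on the same dataset reproduces the same split, which is what makes validation errors comparable across runs.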
Multi-head for IR / Raman
Add dipole (IR) and polarizability (Raman) to heads — provided your
dataset was labelled with those properties. This selects MACE's dielectric
model types automatically.
Use a trained model as the exploration engine
Once you have a .model, point the [calculator] at it so the next round of
sampling is driven by your potential instead of the foundation:
[calculator]
type = "mace"
model = "mace-mp0" # (1) provenance label
model_path = "runs/my_run/model/my_model.model" # (2)
device = "cuda" # or "cpu"
default_dtype = "float32"
(1) model: kept as a provenance label when model_path is set.
(2) model_path: absolute or relative path to the .model from mace_run_train. When set, it overrides the foundation-model download.
In Python:
import traincraft as tc
from traincraft.config.models import MaceCalc
calc = tc.make_calculator(MaceCalc(
model_path="runs/my_run/model/my_model.model",
device="cuda",
))
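The override rule for model_path can be summarized in a few lines. This is a sketch of the documented behavior only: `CalcSettings` and `resolve_model_source` are hypothetical names, not part of TrainCraft's API, and the real MaceCalc model may carry more fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CalcSettings:
    """Hypothetical stand-in for the [calculator] table (not TrainCraft's MaceCalc)."""
    model: str = "mace-mp0"            # foundation name, or provenance label
    model_path: Optional[str] = None   # local .model file, if any


def resolve_model_source(settings):
    """Return (kind, value): the local file when model_path is set, else the foundation."""
    if settings.model_path:
        # model_path wins; `model` is then kept only as a provenance label.
        return "local", settings.model_path
    return "foundation", settings.model
```

For example, `resolve_model_source(CalcSettings(model_path="runs/my_run/model/my_model.model"))` selects the local file, while `resolve_model_source(CalcSettings())` falls back to the foundation model.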
Iterative active-learning workflow
Iteration 1
traincraft run seed_run.toml ← generate → select → label → dataset → TRAIN
→ runs/seed_run/model/model_v1.model
Iteration 2
set [calculator].model_path = .../model_v1.model ← explore with your model
traincraft run iter2.toml ← MACE-driven explore → select → label → TRAIN
→ model_v2.model
…repeat until validation error converges
Phase 4 will automate this loop (committee-based uncertainty selection plus convergence criteria); for now, you drive the iterations by hand. See the Roadmap.
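While the loop is driven by hand, two small helpers can keep the bookkeeping consistent. The sketch below is illustrative only: `calculator_section` and `has_converged` are hypothetical names, and the stopping rule (relative improvement below 5% for two consecutive iterations) is an assumed criterion, not the one planned for Phase 4.

```python
def calculator_section(model_path, device="cuda"):
    """Render a [calculator] table that points exploration at a trained model.

    Assumes a POSIX-style path; backslashes would need escaping in TOML.
    """
    lines = [
        "[calculator]",
        'type = "mace"',
        'model = "mace-mp0"',  # kept as a provenance label
        f'model_path = "{model_path}"',
        f'device = "{device}"',
    ]
    return "\n".join(lines) + "\n"


def has_converged(errors, rel_tol=0.05, window=2):
    """True when each of the last `window` iterations improved by less than rel_tol.

    `errors` holds one validation error per iteration, oldest first.
    """
    if len(errors) < window + 1:
        return False  # not enough history to judge
    recent = errors[-(window + 1):]
    for prev, curr in zip(recent, recent[1:]):
        if prev > 0 and (prev - curr) / prev >= rel_tol:
            return False  # still improving noticeably
    return True
```

After each iteration, append the new validation error to your list, paste the output of `calculator_section(".../model_v1.model")` into the next config, and stop once `has_converged` returns True.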
Under the hood
The [training] stage is a thin, reproducible front-end over mace_run_train.
It writes the train/valid splits with explicit reference keys (REF_energy,
REF_forces, REF_stress, REF_dipole, REF_polarizability) and passes them
via --energy_key/--forces_key/…, so labels are read correctly and a frame
missing one property is skipped for that head rather than trained on zeros. The
exact rendered command is saved in manifest.json (and viewable with a
dry_run=True call — see Tutorial 10).
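The re-keying and per-head skipping described above can be illustrated with plain dictionaries standing in for labelled frames. This is a simplified sketch, not TrainCraft's implementation: `rekey_labels` and `frames_per_head` are hypothetical helpers, and the on-disk frame representation is not modelled here.

```python
# Mapping from property name to the explicit reference key named above.
REF_KEYS = {
    "energy": "REF_energy",
    "forces": "REF_forces",
    "stress": "REF_stress",
    "dipole": "REF_dipole",
    "polarizability": "REF_polarizability",
}


def rekey_labels(labels, heads):
    """Copy requested labels under their REF_* names; missing ones are left out.

    Leaving a key out (rather than writing zeros) is what lets the trainer
    skip the frame for that property.
    """
    return {REF_KEYS[h]: labels[h] for h in heads if labels.get(h) is not None}


def frames_per_head(frames, heads):
    """Count how many frames carry a usable label for each requested head."""
    rekeyed = [rekey_labels(f, heads) for f in frames]
    return {h: sum(REF_KEYS[h] in r for r in rekeyed) for h in heads}
```

A count of zero for any head is a quick signal that the dataset was not labelled with that property and the head should be dropped from the config.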
Because the command is injected from TRAINCRAFT_MACE_TRAIN_COMMAND, the same
config trains locally or inside traincraft-mlip.sif on a GPU node with no change
— the HPC executor sets the variable for you (DESIGN §20.3).
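To make the injection mechanism concrete, here is a minimal sketch of how a launcher prefix taken from TRAINCRAFT_MACE_TRAIN_COMMAND could be combined with the training arguments. The function `render_train_command`, the fallback to a bare `mace_run_train`, the file names, and the container invocation in the usage note are assumptions for illustration; the command TrainCraft actually renders is the one recorded in manifest.json.

```python
import os
import shlex


def render_train_command(name, train_file, valid_file, extra_args=(), env=None):
    """Build the argv list for a training run (hypothetical helper).

    The launcher prefix comes from TRAINCRAFT_MACE_TRAIN_COMMAND; falling back
    to plain `mace_run_train` when it is unset is an assumption of this sketch.
    """
    env = os.environ if env is None else env
    launcher = shlex.split(env.get("TRAINCRAFT_MACE_TRAIN_COMMAND", "mace_run_train"))
    return [
        *launcher,
        f"--name={name}",
        f"--train_file={train_file}",
        f"--valid_file={valid_file}",
        "--energy_key=REF_energy",  # explicit reference keys, as described above
        "--forces_key=REF_forces",
        *extra_args,
    ]
```

With the variable unset, the list starts with `mace_run_train`. With a value such as `apptainer exec --nv traincraft-mlip.sif mace_run_train` (a hypothetical example of what an HPC executor might set), the same arguments are appended after the container prefix, so the config itself does not change.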