MDT YAML Configuration Format
MDT uses a structured YAML format to define verification workflows. Each section of the configuration defines a set of tasks that MDT will execute in order of their dependencies.
Top-Level Sections

| Section | Description |
|---|---|
| `data` | Defines model and observation datasets to load. |
| `pairing` | Specifies how datasets are matched in time and space. |
| `combine` | Merges multiple paired datasets for comparison. |
| `statistics` | Lists the verification metrics to compute. |
| `plots` | Configures visualizations. |
| `execution` | Sets up the compute environment (e.g., local or HPC). |
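
As a rough orientation, the skeleton below sketches how tasks in different sections might refer to one another by name, which is presumably how MDT works out the dependency order. All task names (`my_model`, `my_obs`, `my_pair`, `my_stats`, `my_plot`) and file names are hypothetical placeholders, and the optional `combine` and `execution` sections are left out for brevity.

```yaml
# Sketch only: task and file names are placeholders, not MDT defaults.
data:
  my_model:
    type: "cmaq"
    kwargs:
      fname: "model_output.nc"
  my_obs:
    type: "airnow"
    kwargs:
      fname: "observations.nc"

pairing:
  my_pair:
    source: "my_model"   # refers to a task under data
    target: "my_obs"     # refers to a task under data
    method: "interpolate"

statistics:
  my_stats:
    input: "my_pair"     # refers to a task under pairing
    metrics: ["rmse", "bias"]

plots:
  my_plot:
    input: "my_pair"     # refers to a task under pairing
    type: "scatter"
```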
1. data Section
The data section defines the sources of information for your verification.
- `type`: The `monetio` dataset name (e.g., `cmaq`, `wrfchem`, `aeronet`, `airnow`).
- `kwargs`: A dictionary of arguments passed to the `monetio` reader's open function.
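
A minimal sketch of a `data` section with one model and one observation dataset is given here. The task names `cmaq_output` and `airnow_obs` are chosen to match the names referenced by the pairing example in the next section; the file names are placeholders, and the `fname` keyword is taken from the advanced example at the end of this page, so which keywords are actually accepted depends on the `monetio` reader in use.

```yaml
data:
  # Model dataset; the task name is what later sections refer to
  cmaq_output:
    type: "cmaq"                 # monetio dataset name
    kwargs:
      fname: "cmaq_data.nc"      # placeholder path, forwarded to the reader

  # Observation dataset
  airnow_obs:
    type: "airnow"
    kwargs:
      fname: "airnow_data.nc"    # placeholder path
```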
2. pairing Section
Pairing tasks align two datasets in time and space.
- `source`: The name of the source dataset (usually a model).
- `target`: The name of the target dataset (usually observations).
- `method`: The pairing algorithm (`interpolate` or `regrid`).
- `kwargs`: Additional arguments for the pairing function.

For model-to-model comparison using `regrid`, set `merge_target: true` under `kwargs` to include the target model in the output.

    pairing:
      # Model vs Obs (Interpolation)
      cmaq_airnow:
        source: "cmaq_output"
        target: "airnow_obs"
        method: "interpolate"

      # Model vs Model (Regridding)
      cmaq_vs_wrf:
        source: "cmaq_output"
        target: "wrfchem_model"  # Target model defines the grid
        method: "regrid"
        kwargs:
          merge_target: true
          suffix_source: "_cmaq"
          suffix_target: "_wrf"

3. combine Section
The combine section (optional) merges multiple paired datasets into a single dataset. This is useful for comparing multiple model runs against the same observations.
- `sources`: A list of pairing task names to combine.
- `dim`: The name of the new dimension to create (defaults to `model`).
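
The two keys above might be used as follows. This is a sketch under assumptions: it presumes that `combine` tasks are named entries like those in the other sections, `all_models` is a hypothetical task name, and `pair_cmaq` and `pair_wrfchem` are the pairing tasks defined in the advanced example at the end of this page.

```yaml
combine:
  # Hypothetical task name for the merged dataset
  all_models:
    sources: ["pair_cmaq", "pair_wrfchem"]  # pairing tasks to merge
    dim: "model"                            # documented default, written out explicitly
```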
4. statistics Section
Compute verification metrics on paired data.
- `input`: The name of a pairing or data task.
- `metrics`: A list of metric names to compute (e.g., `rmse`, `bias`, `corr`).
- `kwargs`: Arguments passed to the metric functions (e.g., `obs_var`, `mod_var`).

    statistics:
      airnow_stats:
        input: "cmaq_airnow"
        metrics: ["rmse", "bias", "corr"]
        kwargs:
          obs_var: "OZONE"
          mod_var: "OZONE"

5. plots Section
Generate visualizations.
- `input`: The name of a pairing, statistics, or data task.
- `type`: The type of plot (e.g., `spatial`, `scatter`, `timeseries`).
- `kwargs`: Arguments passed to the plotting function.
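
A plot task could look like the sketch below. The task name `ozone_scatter` and the output file name are hypothetical, `cmaq_airnow` is the pairing task from the pairing section, and `savename` is borrowed from the spatial plots in the advanced example; whether the `scatter` plot type accepts the same keyword is an assumption.

```yaml
plots:
  # Hypothetical task name
  ozone_scatter:
    input: "cmaq_airnow"                   # a pairing task defined earlier
    type: "scatter"
    kwargs:
      savename: "cmaq_airnow_scatter.png"  # assumed to work as for spatial plots
```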
6. execution Section
Configure how and where tasks are executed.
- `default_cluster`: The name of the cluster to use for tasks (defaults to `compute`).
- `clusters`: A dictionary of cluster definitions.
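
The overall shape of this section is sketched below. Only the two documented keys are shown; the keys that make up a cluster definition are not listed in this section, so the definition is left as an empty mapping rather than guessed at.

```yaml
execution:
  default_cluster: "compute"   # documented default, written out explicitly
  clusters:
    # Cluster definition left empty in this sketch; its keys are not
    # described in this section and are therefore not shown here
    compute: {}
```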
Advanced Example: Multiple Models vs. Observations
This example shows how to compare two different models against a single set of observations.

    data:
      # Load two different model outputs
      cmaq_model:
        type: "cmaq"
        kwargs:
          fname: "cmaq_data.nc"
      wrfchem_model:
        type: "wrfchem"
        kwargs:
          fname: "wrfchem_data.nc"
      # Load observational data
      airnow_obs:
        type: "airnow"
        kwargs:
          fname: "airnow_data.nc"

    pairing:
      # Pair each model to the same observations
      pair_cmaq:
        source: "cmaq_model"
        target: "airnow_obs"
        method: "interpolate"
      pair_wrfchem:
        source: "wrfchem_model"
        target: "airnow_obs"
        method: "interpolate"

    statistics:
      # Compute stats for each model
      cmaq_stats:
        input: "pair_cmaq"
        metrics: ["rmse", "bias"]
        kwargs:
          obs_var: "OZONE"
          mod_var: "OZONE"
      wrfchem_stats:
        input: "pair_wrfchem"
        metrics: ["rmse", "bias"]
        kwargs:
          obs_var: "OZONE"
          mod_var: "OZONE"

    plots:
      # Create comparative spatial plots
      plot_cmaq:
        input: "pair_cmaq"
        type: "spatial"
        kwargs:
          savename: "cmaq_spatial.png"
      plot_wrfchem:
        input: "pair_wrfchem"
        type: "spatial"
        kwargs:
          savename: "wrfchem_spatial.png"