geoprior.cli.stage1
Stage-1: Zhongshan/Nansha preprocessing & sequence export for GeoPriorSubsNet
- This script runs Steps 1–6 of the NATCOM pipeline:
  1. Load dataset
  2. Clean & select features
  3. Encode & scale
  4. Define feature sets
  5. Split by year & build PINN sequences
  6. Build train/val tf.data and EXPORT all arrays & metadata
Outputs
CSVs: raw, cleaned, scaled
Joblib: one-hot encoder, main scaler, coord scaler
NPZ: train_inputs, train_targets, val_inputs, val_targets
JSON: manifest.json describing everything (paths, shapes, dims, columns, config), so Stage-2 can load without recomputing.
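The export pattern described above (arrays written to NPZ files, with a JSON manifest recording where they are and what shape they have) can be sketched as follows. This is a minimal illustration, not the Stage-1 implementation: the helper name `export_arrays`, the `"arrays"`/`"path"`/`"shapes"` manifest keys, and the idea that each split is a dict of named arrays are all assumptions made for the example.

```python
import json
from pathlib import Path

import numpy as np


def export_arrays(out_dir, splits):
    """Save each split as <name>.npz and describe it in manifest.json.

    `splits` maps a split name (e.g. "train_inputs") to a dict of arrays.
    The manifest layout used here is hypothetical, not the real Stage-1 schema.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    manifest = {"arrays": {}}
    for name, arrays in splits.items():
        path = out_dir / f"{name}.npz"
        # One NPZ archive per split; each keyword becomes a named array.
        np.savez(path, **arrays)
        manifest["arrays"][name] = {
            # Store a path relative to the manifest so the folder can be moved.
            "path": path.name,
            "shapes": {k: list(np.asarray(v).shape) for k, v in arrays.items()},
        }

    manifest_path = out_dir / "manifest.json"
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return manifest_path
```

Recording shapes and relative paths in the manifest lets a later stage validate what it loads without re-running any preprocessing.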
- Stage-2 only needs to:
  - read manifest.json
  - np.load(…) the NPZs
  - joblib.load(…) the scalers/encoders if needed
  - build/compile/tune/train the model
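The loading side of that hand-off might look like the sketch below. It assumes a hypothetical manifest layout in which an `"arrays"` entry maps each split name to a `"path"` relative to the manifest; the real keys written by Stage-1 may differ, so treat `load_stage1_arrays` and `load_scaler` as illustrative helpers rather than part of the package.

```python
import json
from pathlib import Path

import numpy as np


def load_stage1_arrays(manifest_path):
    """Read manifest.json and load every NPZ it lists.

    Assumes a hypothetical layout: manifest["arrays"][name]["path"] is a
    file path relative to the manifest's folder.
    """
    manifest_path = Path(manifest_path)
    manifest = json.loads(manifest_path.read_text())
    base = manifest_path.parent

    arrays = {}
    for name, entry in manifest["arrays"].items():
        # np.load on an .npz returns a lazy archive; copy the arrays out
        # before the context manager closes the underlying file.
        with np.load(base / entry["path"]) as npz:
            arrays[name] = {key: npz[key] for key in npz.files}
    return manifest, arrays


def load_scaler(path):
    """Load a fitted scaler or encoder saved with joblib, only when needed."""
    import joblib  # third-party; imported lazily so array loading works without it

    return joblib.load(path)
```

Keeping the joblib import inside `load_scaler` reflects the "if needed" wording above: training can start from the arrays alone, and the scalers are only required when predictions are mapped back to original units.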
Functions
| run_stage1 | Execute Stage-1 preprocessing pipeline. |
- geoprior.cli.stage1.run_stage1(overrides=None, *, config_root='nat.com', config_path=None)
Execute Stage-1 preprocessing pipeline.