Chapter 4

Validation and Release

This chapter captures which validation checks are enforced, how builds are produced, and how to diagnose common failures.

Validation Contract

This page separates what is enforced now from what is intended later.

Enforced Now

Current validation checks:

  • required market files exist
  • config.yml loads
  • required top-level config keys exist
  • week_start is one of the allowed values
  • required functions can be sourced
  • forbidden local-environment patterns are blocked in committed scripts
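The two config checks can be sketched in a few lines of base R. This is only an illustration: the key names (market, kpi, week_start) and the allowed week_start values are assumptions, not the repository's actual contract, and check_config is a hypothetical helper.

```r
# Hypothetical contract: the key names and allowed values below are
# illustrative assumptions, not the repository's actual rules.
required_keys <- c("market", "kpi", "week_start")
allowed_week_starts <- c("monday", "sunday")

# Return a character vector of problems; an empty vector means the config passes.
check_config <- function(config) {
  problems <- character(0)

  missing_keys <- setdiff(required_keys, names(config))
  if (length(missing_keys) > 0) {
    problems <- c(problems, paste("missing top-level key:", missing_keys))
  }

  week_start <- config[["week_start"]]
  if (!is.null(week_start)) {
    is_allowed <- length(week_start) == 1 && week_start %in% allowed_week_starts
    if (!is_allowed) {
      problems <- c(problems,
                    paste("week_start not allowed:",
                          paste(week_start, collapse = ", ")))
    }
  }

  problems
}

# In practice the list would come from a YAML reader such as
# yaml::read_yaml("config.yml"); plain lists keep this sketch dependency-free.
good <- list(market = "CN", kpi = "store_visits", week_start = "monday")
bad  <- list(market = "CN", week_start = "friday")

good_problems <- check_config(good)
bad_problems  <- check_config(bad)
```

Returning a vector of problems rather than stopping at the first one lets a validation run report everything that is wrong with a market in a single pass.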

Planned Next

Not fully enforced yet:

  • transform runs against real market input
  • transformed columns match the intended model contract
  • predict succeeds
  • decomp succeeds
  • build produces a traceable artefact
  • formula regression checks for cases like lead() and I(...)
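A formula regression check of the kind listed above might start from the difference between a formula's raw variables and its term labels. The sketch below uses only base R; the formula, the column names, and the shape of the check are hypothetical and only illustrate the idea.

```r
# A formula that wraps variables in lead() and I(...), as in the cases above.
# No data is needed: all.vars() and terms() only inspect the expression,
# so lead() does not have to be defined for this to run.
f <- visits ~ lead(promo, 1) + I(price^2)

# Raw variable names, with the wrapper calls stripped away.
raw_vars <- all.vars(f)

# Term labels keep the wrapper calls as text.
term_labels <- attr(terms(f), "term.labels")

# Hypothetical regression check: every raw variable must be present in the
# transformed data, whatever wrapper the formula applies to it.
transformed_cols <- c("visits", "promo", "price")
missing_vars <- setdiff(raw_vars, transformed_cols)
stopifnot(length(missing_vars) == 0)
```

Code that matches on term labels sees "lead(promo, 1)" and "I(price^2)", while code that matches on variables sees "promo" and "price". A regression check can pin down which of the two representations each step is expected to use.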

Why This Distinction Matters

The repo should not overstate its current level of technical assurance. Today, validation proves only that the market scaffold is structurally sound: the required files, config keys, and functions are present and load. It does not yet prove that every migrated market implementation runs end to end.

Validation and Build

Validation is not optional in this repository. Its purpose is to catch the kinds of failures that historically appeared only after packaging or deployment.

Run commands from the repository root.

Validation Entry Points

Validate all migrated markets:

Rscript scripts/validate_all.R

Validate one migrated market:

Rscript scripts/validate_market.R markets/CN/store_visits

Build Entry Point

Build one market/KPI:

Rscript scripts/build_market.R markets/CN/store_visits

Market-Specific Local Repros

Where a market investigation needs a tactical proof outside the main scaffold, the market folder may also carry a local repro bundle.

For China store visits, the current local repro can be run with:

./markets/CN/store_visits/local-repro/run_cn_repro.sh

This is not a long-term replacement for the repo-wide validation harness. It is a market-specific runner that demonstrates the local transform/refit/decomp path working from explicit artefacts.

What Validation Should Prove

Current enforced minimum:

  1. required files exist
  2. config loads
  3. forbidden local-environment patterns are absent
  4. required functions can be sourced
  5. the market contract is satisfied (required top-level config keys exist and week_start is an allowed value)
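As a rough illustration of the "required functions can be sourced" check, a script can be sourced into an isolated environment and inspected for the functions it defines. The required function names below are assumptions, and missing_functions is a hypothetical helper rather than the repository's implementation.

```r
# Source a script into an isolated environment and report which required
# functions it fails to define. The function names are hypothetical.
missing_functions <- function(script_path,
                              required = c("transform", "predict_market", "decomp")) {
  # The script's definitions land in `env`, not in the caller's workspace.
  env <- new.env(parent = globalenv())
  sys.source(script_path, envir = env)

  is_defined <- vapply(
    required,
    function(name) exists(name, envir = env, inherits = FALSE, mode = "function"),
    logical(1)
  )
  required[!is_defined]
}

# Usage with a throwaway script that defines only one of the three functions.
script <- tempfile(fileext = ".R")
writeLines("transform <- function(df) df", script)
not_defined <- missing_functions(script)
```

Using inherits = FALSE matters here: without it, a function of the same name on the search path (base R has its own transform) would make the check pass even though the script never defined it.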

Planned next-stage validation:

  1. transform runs in a clean session
  2. transformed output matches model requirements
  3. predict works
  4. decomp works
  5. build output is reproducible and traceable

Clean Environment Principle

If a market implementation only works on one analyst machine because of:

  • attached packages
  • hidden sourced files
  • local paths
  • interactive session state

then it is not ready.
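Some of these dependencies, such as local paths and attached packages, can be caught statically by scanning committed scripts. The following is a minimal sketch in base R; the pattern list is an assumption and the repository's real forbidden patterns may differ. Hidden sourced files and interactive session state are better caught by running in a fresh session.

```r
# Hypothetical forbidden patterns; the repository's real list may differ.
forbidden <- c(
  absolute_windows_path = "\\b[A-Za-z]:[/\\\\]",
  user_home_path        = "/Users/|/home/",
  working_dir_change    = "setwd\\(",
  attached_package      = "library\\(|require\\("
)

# Return one row per (rule, line number) hit in a single script.
scan_script <- function(path) {
  lines <- readLines(path, warn = FALSE)
  hits <- lapply(names(forbidden), function(rule) {
    idx <- grep(forbidden[[rule]], lines, perl = TRUE)
    data.frame(rule = rep(rule, length(idx)), line = idx,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, hits)
}

# Usage on a throwaway script: lines 1 and 2 are flagged, line 3 is clean.
script <- tempfile(fileext = ".R")
writeLines(c('library(dplyr)',
             'setwd("C:/Users/me/project")',
             'x <- dplyr::lead(1:3)'),
           script)
hits <- scan_script(script)
```

A line can trigger more than one rule: the setwd() line above is reported for the working-directory change, the Windows drive path, and the user home path.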

See Validation Contract for the distinction between enforced and planned checks.

Build and Release

Builds should be reproducible from the repository state.

Current Build Entry Point

Rscript scripts/build_market.R markets/CN/store_visits

Current State

The build scaffold exists, but real market build logic still needs to be migrated into each market folder.

Intended Build Outcome

Each build should eventually produce:

  • a market/KPI-specific upload artefact
  • a traceable artefact name
  • a link to the commit SHA that generated it

Intended Traceability

At a minimum, a future build path should capture:

  • market
  • KPI
  • build timestamp
  • commit SHA
  • validated runtime versions
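One possible shape for this metadata is sketched below. The field names, the artefact naming scheme, and the build_metadata helper are illustrative assumptions only; the repository has not defined them yet.

```r
# Assemble hypothetical build metadata. The field names and the artefact
# naming scheme are assumptions for illustration only.
build_metadata <- function(market, kpi, commit_sha, built_at = Sys.time()) {
  stamp <- format(built_at, "%Y%m%dT%H%M%SZ", tz = "UTC")
  list(
    market          = market,
    kpi             = kpi,
    build_timestamp = stamp,
    commit_sha      = commit_sha,
    r_version       = as.character(getRversion()),
    artefact_name   = sprintf("%s_%s_%s_%s", market, kpi, stamp,
                              substr(commit_sha, 1, 7))
  )
}

# In a real build the SHA could be read from git, for example with
#   system2("git", c("rev-parse", "HEAD"), stdout = TRUE)
# A fixed placeholder SHA and timestamp keep this sketch reproducible.
meta <- build_metadata(
  market     = "CN",
  kpi        = "store_visits",
  commit_sha = "0123456789abcdef0123456789abcdef01234567",
  built_at   = as.POSIXct("2024-01-07 12:00:00", tz = "UTC")
)
meta$artefact_name
```

Embedding the timestamp and a short SHA in the artefact name means an uploaded file can be traced back to its commit without opening it.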

CI Versus Local Build

Current scaffold:

  • local build entrypoint exists
  • CI validates structure only

Target state:

  • local build for iteration
  • CI-backed validation before merge
  • optional CI build publication once the workflow stabilises

Troubleshooting

This page captures common failure modes that this repo is intended to prevent.

Transform Works Locally But Fails In API

Typical causes:

  • hidden package dependencies
  • un-namespaced function calls
  • local helper scripts not committed
  • local data reads embedded in transform logic

Model And Transform Are Out Of Sync

Typical symptom:

  • transformed data does not contain the variables expected by the model object

Typical cause:

  • the analyst changed the transform logic or a variable's representation without rebuilding the model artefact to match
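This kind of drift can be detected by comparing the predictors a fitted model expects with the columns the transform actually produces. The sketch below uses a toy lm model and hypothetical column names; missing_model_vars is an illustrative helper, not part of the repository.

```r
# Predictors a fitted model expects but the transformed data lacks.
# delete.response() drops the response so future data without it still passes.
missing_model_vars <- function(model, transformed) {
  predictors <- all.vars(delete.response(terms(model)))
  setdiff(predictors, names(transformed))
}

# Toy example: fit on one representation of the data.
train <- data.frame(visits = c(10, 12, 15, 11, 14),
                    promo  = c(0, 1, 1, 0, 1),
                    price  = c(5, 4, 4, 5, 3))
model <- lm(visits ~ promo + I(price^2), data = train)

# The transform later renames price to price_index without a model rebuild.
drifted <- data.frame(promo = 1, price_index = 4)
aligned <- data.frame(promo = 1, price = 4)

drift_result   <- missing_model_vars(model, drifted)
aligned_result <- missing_model_vars(model, aligned)
```

Here the drifted data is reported as missing "price", while the aligned data passes with an empty result. Running the comparison before predict turns a late runtime error into an explicit validation failure.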

Decomp Fails In Packaged Runtime

Typical causes:

  • formula parsing edge cases such as lead() or I(...)
  • mismatched mapping table
  • model and decomp wrapper not using the same representation
  • hidden reliance on attached packages or local session state

Week Start / Calendar Problems

Typical symptom:

  • fixed-forecast or future data is shifted, duplicated, or partially NA

Typical cause:

  • one part of the workflow assumes Monday weeks while the market uses Sunday weeks, or vice versa
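The mismatch is easy to reproduce. The following base R sketch floors a date to the start of its week under each convention; week_floor is a hypothetical helper, and the "monday" and "sunday" labels are assumed spellings of the week_start values.

```r
# Floor dates to the start of their week for a given week_start convention.
# as.POSIXlt()$wday counts days since Sunday (0 = Sunday, ..., 6 = Saturday).
week_floor <- function(dates, week_start = c("monday", "sunday")) {
  week_start <- match.arg(week_start)
  dates <- as.Date(dates)
  wday <- as.POSIXlt(dates)$wday
  offset <- if (week_start == "monday") (wday + 6) %% 7 else wday
  dates - offset
}

d <- as.Date("2024-01-07")  # a Sunday

monday_week <- week_floor(d, "monday")  # week beginning 2024-01-01
sunday_week <- week_floor(d, "sunday")  # week beginning 2024-01-07
```

The same calendar day lands in two different weeks depending on the convention. If historical data is keyed one way and the forecast horizon the other, joins on the week key fail to match, which produces the shifted, duplicated, or partially NA rows described above.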

Local Script Drift

Typical symptom:

  • product team fixes do not match the analyst’s “real” script

Typical cause:

  • local SharePoint or Windows-drive copies acting as private forks

First Questions To Ask

  1. What is the authoritative Git version?
  2. What exact model object is being used for predict?
  3. Is the same model object used for decomp?
  4. Does transform() produce the variables the model actually expects?
  5. Are there any hard-coded local dependencies left?