Jul 3, 2025 · 12 min
An Opinionated Architecture for Modern ML Systems
A composable reference stack blending feature stores, vector databases, and orchestration best practices.
This is the reference stack I lean on when teams need a pragmatic ML platform without a 12-month rewrite. It's composable, cloud-agnostic, and optimizes for fast iteration cycles.
Feature plane as the source of truth
Raw events land in an immutable object store. A lightweight feature service (think Feast or a custom Redis + DuckDB combo) version-controls transformations and guarantees training/serving parity.
Every feature view ships with validation tests and ownership metadata so domain teams can self-serve without stepping on each other.
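The pattern above can be sketched in a few lines. This is not the Feast API; it is a minimal, stdlib-only illustration of a feature view that bundles a versioned transformation with ownership metadata and validation tests (all names here are hypothetical):

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class FeatureView:
    """Illustrative feature view: a named, versioned transformation
    plus the metadata that lets domain teams self-serve safely."""
    name: str
    version: int
    owner: str                       # team on the hook for this view
    transform: Callable[[dict], dict]
    validators: list = field(default_factory=list)

    def materialize(self, raw_event: dict) -> dict:
        features = self.transform(raw_event)
        for check in self.validators:    # validation tests run on every build
            assert check(features), f"{self.name} v{self.version}: validation failed"
        return features

# Example: a clickstream view owned by the growth team
clicks = FeatureView(
    name="session_clicks",
    version=2,
    owner="growth",
    transform=lambda e: {"clicks": e["clicks"], "clicks_per_min": e["clicks"] / e["minutes"]},
    validators=[lambda f: f["clicks"] >= 0],
)

print(clicks.materialize({"clicks": 12, "minutes": 4}))
# {'clicks': 12, 'clicks_per_min': 3.0}
```

Because the same `transform` runs at training time and serving time, training/serving parity comes for free; the `owner` field is what makes self-service auditable.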
Training and evaluation mesh
Orchestration runs on Dagster because typing + software-defined assets make lineage obvious. Jobs produce artifacts—models, metrics, explainer plots—pushed into MLflow.
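The core idea behind software-defined assets is that each asset declares its upstream dependencies, so lineage is a property of the code rather than a side effect of the scheduler. Here is a toy, stdlib-only sketch of that declaration style (not the real Dagster API; asset names and bodies are invented):

```python
# Minimal sketch of software-defined assets: each asset names its
# upstream dependencies, so lineage falls out of the declarations.
ASSETS = {}

def asset(deps=()):
    def register(fn):
        ASSETS[fn.__name__] = {"fn": fn, "deps": tuple(deps)}
        return fn
    return register

@asset()
def training_set():
    return [(x, 2 * x) for x in range(10)]   # toy data: y = 2x

@asset(deps=["training_set"])
def model():
    data = ASSETS["training_set"]["fn"]()
    # "fit" a slope by least squares on y = w * x
    w = sum(x * y for x, y in data) / sum(x * x for x, _ in data)
    return {"weight": w}

@asset(deps=["model"])
def metrics():
    # in the real stack this artifact would be pushed to MLflow
    return {"mae": 0.0}

def lineage(name):
    """Walk declared deps to show where an asset comes from."""
    return {name: {d: lineage(d)[d] for d in ASSETS[name]["deps"]}}

print(lineage("metrics"))
# {'metrics': {'model': {'training_set': {}}}}
```

A real orchestrator adds typing, partitioning, and persistence on top, but the lineage guarantee is exactly this: you can always walk from a metric back to the data that produced it.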
Evaluations are treated like first-class citizens: regression tests compare new models against production data slices before promotion.
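A promotion gate of this kind is small enough to show in full. The sketch below (all names and the threshold are illustrative) blocks a candidate model if it regresses beyond a tolerance on any production data slice:

```python
# Sketch of an evaluation gate: promote a candidate only if it does not
# regress on any production data slice. Threshold and slices are invented.
THRESHOLD = 0.01  # max tolerated accuracy drop per slice

def accuracy(model, rows):
    return sum(model(x) == y for x, y in rows) / len(rows)

def promote(candidate, incumbent, slices):
    """Return True only if the candidate holds up on every slice."""
    for name, rows in slices.items():
        drop = accuracy(incumbent, rows) - accuracy(candidate, rows)
        if drop > THRESHOLD:
            print(f"blocked: regression on slice {name!r} ({drop:.2%})")
            return False
    return True

slices = {
    "new_users": [(1, 1), (2, 0), (3, 1)],
    "power_users": [(4, 0), (5, 1)],
}
incumbent = lambda x: x % 2                        # current production model
candidate = lambda x: 1 if x in (1, 3, 5) else 0   # matches on these slices

print(promote(candidate, incumbent, slices))  # True
```

Running this in CI, with slices refreshed from production traffic, is what turns evaluation from an "optional demo" into a merge blocker.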
Inference, observability, and governance
Online services run as FastAPI containers with a shared inference SDK. Batch consumers use the same SDK inside Spark or Snowflake tasks, so guardrails are consistent.
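The point of the shared SDK is that guardrails live in one place. The sketch below (class and model are hypothetical, not a real library) shows one client used verbatim by both the online handler and a batch loop:

```python
# Sketch of a shared inference SDK: one client wraps validation and
# guardrails, so the FastAPI service and Spark/Snowflake batch jobs
# behave identically. The model stub and names are illustrative.
class InferenceClient:
    def __init__(self, model, feature_names):
        self.model = model
        self.feature_names = feature_names

    def predict(self, payload: dict) -> dict:
        missing = [f for f in self.feature_names if f not in payload]
        if missing:                        # same guardrail online and in batch
            raise ValueError(f"missing features: {missing}")
        score = self.model([payload[f] for f in self.feature_names])
        return {"score": max(0.0, min(1.0, score))}  # clamp to a valid range

client = InferenceClient(model=lambda xs: sum(xs) / 10, feature_names=["a", "b"])

# Online path: a FastAPI handler would just delegate to the client, e.g.
#   @app.post("/predict")
#   def predict(payload: dict): return client.predict(payload)
print(client.predict({"a": 3, "b": 4}))  # {'score': 0.7}

# Batch path: the same client maps over rows inside a Spark/Snowflake task.
rows = [{"a": 1, "b": 2}, {"a": 8, "b": 9}]
print([client.predict(r)["score"] for r in rows])  # [0.3, 1.0]
```

Because both paths call `client.predict`, a new guardrail (schema check, score clamp, rate limit) ships to online and batch consumers in one release.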
Telemetry feeds an OpenTelemetry collector that powers Grafana dashboards, drift alerts, and cost tracking. Access is enforced via short-lived tokens issued by the platform team.
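Short-lived tokens need nothing exotic: a signed subject-plus-expiry that services can verify offline. Here is a stdlib-only sketch of the issuance/verification flow (the secret, TTL, and subject are all invented; a real deployment would use a KMS-backed key and a standard format like JWT):

```python
# Sketch of short-lived access tokens: the platform team signs a
# subject + expiry with a shared secret; services reject expired
# or tampered tokens. Illustrative only.
import hashlib
import hmac
import time

SECRET = b"platform-signing-key"  # hypothetical; keep real keys in a KMS

def issue(subject: str, ttl_s: int = 900, now=None) -> str:
    exp = int(now if now is not None else time.time()) + ttl_s
    msg = f"{subject}.{exp}"
    sig = hmac.new(SECRET, msg.encode(), hashlib.sha256).hexdigest()
    return f"{msg}.{sig}"

def verify(token: str, now=None) -> bool:
    subject, exp, sig = token.rsplit(".", 2)
    expected = hmac.new(SECRET, f"{subject}.{exp}".encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False                       # tampered signature
    return (now if now is not None else time.time()) < int(exp)

tok = issue("drift-dashboard", ttl_s=900, now=1_000)
print(verify(tok, now=1_500))        # True: inside the 15-minute window
print(verify(tok, now=2_000))        # False: expired
print(verify(tok + "x", now=1_500))  # False: signature mismatch
```

Keeping the TTL short means a leaked token is worth minutes, not months, and revocation becomes "stop issuing" rather than "hunt down every copy".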
Key takeaways
- Version everything: features, models, prompts, docs
- Treat evaluations like CI—not optional demos
- Observability is as important as accuracy when scaling
Great architecture is boring by design. Aim for legibility, paved paths, and the ability to swap components without rewriting the world.