Keeping AI Assistants Aligned with Data Platform Architecture
Enforce warehouse standards, naming conventions, and pipeline constraints — before an AI assistant writes a single query that breaks your data contracts.
AI assistants treat your warehouse like a blank slate.
Your data platform has hard-won conventions: raw tables are append-only, reporting views are the only read layer, regional datasets must never be joined cross-region, and column naming follows a strict schema. None of that is visible to an AI assistant generating a dbt model or a Spark job.
The result is pipelines that write directly to raw tables, queries that bypass the semantic layer, and naming violations that break downstream consumers — discovered only after data quality alerts fire.
Rule: Raw tables are append-only via ingestion pipelines only.
Context: Direct writes corrupt audit trail and break CDC.
✗ FAIL decision/use-staging-layer-for-aggregations
Rule: All aggregations must land in staging.*, not raw.*
→ Surfaced 2 violations before code generation.
Data contracts and warehouse rules don't live in code.
| Approach | Limitation | With Mneme HQ |
|---|---|---|
| dbt tests | Catch violations after the pipeline runs; can't prevent bad schema design up front | Decisions enforced before a model is written |
| Data catalog | Documents what exists; doesn't prevent AI from ignoring it | Catalog knowledge encoded as enforceable decisions |
| Schema docs | Static; not surfaced at prompt time, and an AI assistant can ignore context that is merely attached | Relevant constraints retrieved and checked per prompt |
| Code review | Catches pipeline mistakes after they're coded; wastes data engineer time | Pre-flight check before the model exists |
Data architecture decisions, enforced at prompt time.
Encode warehouse standards as decisions
Capture rules like layer boundaries, naming conventions, regional constraints, and anti-patterns in structured YAML.
Check before generating pipeline code
Run mneme check against your intended change. Mneme HQ retrieves relevant data architecture decisions and flags violations.
Generate scoped rules for data tools
Mneme HQ can output rules files for tools like Cursor or Claude Code that surface the right constraints when working in dbt, Spark, or SQL files.
Gate pipeline PRs in CI
Add mneme check --mode strict to your data pipeline CI. Schema violations and layer boundary breaks fail before merge.
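As a rough illustration of the CI gate step, the sketch below wraps the strict-mode check in a small Python helper. The `mneme check`, `--mode strict`, and `--tags` spellings come from this page. Combining them in one call, the helper names `build_check_command` and `gate`, and the assumption that the CLI exits non-zero on violations are all illustrative guesses, not documented behavior.

```python
import subprocess
from typing import Callable, Sequence


def build_check_command(change: str, tags: Sequence[str] = ()) -> list[str]:
    """Assemble a strict-mode `mneme check` call as an argument list.

    Assumption: a change description, --mode and --tags can be combined
    in a single invocation; this page only shows them separately.
    """
    command = ["mneme", "check", change, "--mode", "strict"]
    if tags:
        # Tags are comma-separated, as in `--tags gdpr,eu-region`.
        command += ["--tags", ",".join(tags)]
    return command


def gate(change: str, tags: Sequence[str] = (), run: Callable = subprocess.run) -> int:
    """Run the check and return its exit code.

    Assumption: the CLI exits non-zero when a strict decision is violated.
    `run` is injectable so the gate can be exercised without the CLI installed.
    """
    completed = run(build_check_command(change, tags))
    return completed.returncode
```

A CI job could pass the PR description to `gate(...)` and hand the returned code to `sys.exit`, so a violation fails the job before merge.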
What data platform decisions look like.
id: no-writes-to-raw-tables
title: Raw tables are append-only via ingestion pipelines
status: accepted
rule: No pipeline or model may write directly to raw.* tables. All aggregations and transformations must land in staging.* or marts.*
rationale: Direct writes break CDC audit trail and corrupt source-of-truth. Enforced after data corruption incident Q2 2024.
enforcement: strict
tags: [data-platform, warehouse, layers, anti-pattern]
id: column-naming-snake-case
title: All warehouse columns must use snake_case
status: accepted
rule: Column names must be lowercase snake_case. No camelCase, no PascalCase. Event timestamps must end in _at. Dimension keys must end in _id.
rationale: Downstream BI tools and dbt macros depend on consistent naming.
enforcement: strict
tags: [data-platform, naming, conventions]
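To make the naming decision concrete, here is a minimal Python sketch of what the rule text constrains. It is not how Mneme HQ evaluates decisions. The function name `naming_violations` and the role labels `"timestamp"`, `"key"`, and `"other"` are hypothetical, introduced only so the `_at` and `_id` suffix rules have something to check against.

```python
import re

# Lowercase snake_case: starts with a letter, words joined by single underscores.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def naming_violations(columns: dict[str, str]) -> list[str]:
    """Return one message per column that breaks the naming rule.

    `columns` maps a column name to a role label. The labels
    ("timestamp", "key", "other") are hypothetical, not part of the rule.
    """
    problems = []
    for name, role in columns.items():
        if not SNAKE_CASE.fullmatch(name):
            problems.append(f"{name}: not lowercase snake_case")
        elif role == "timestamp" and not name.endswith("_at"):
            problems.append(f"{name}: event timestamp must end in _at")
        elif role == "key" and not name.endswith("_id"):
            problems.append(f"{name}: dimension key must end in _id")
    return problems
```

For example, a column named `eventType` would be reported for its casing, and a timestamp column named `signup_time` for its missing `_at` suffix.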
$ mneme check "create a DAU model that reads from raw events" --tags data-platform
Checking against 8 data-platform decisions...
✗ FAIL decision/no-writes-to-raw-tables
  Reason: Model reads from raw.* — should use staging.events source.
✗ FAIL decision/use-staging-layer-for-aggregations
  Reason: DAU aggregation should land in marts.*, not raw.*
✓ PASS decision/column-naming-snake-case
✓ PASS decision/regional-dataset-isolation
Result: FAIL (2 violations, strict mode)
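The rule text of `no-writes-to-raw-tables` can also be approximated as a plain static check on SQL. The sketch below is a separate illustration, not part of Mneme HQ, which checks the intent rather than the SQL. The name `writes_to_raw` and the list of statement types are assumptions, and a regex like this will miss quoted identifiers and dynamically built SQL.

```python
import re

# Common SQL write statements whose target is a table in the raw schema.
# The statement list is an illustrative assumption, not an exhaustive grammar.
WRITE_TO_RAW = re.compile(
    r"\b(?:insert\s+(?:into|overwrite(?:\s+table)?)"
    r"|update"
    r"|delete\s+from"
    r"|merge\s+into"
    r"|create\s+(?:or\s+replace\s+)?table"
    r"|truncate\s+table)"
    r"\s+raw\.\w+",
    re.IGNORECASE,
)


def writes_to_raw(sql: str) -> list[str]:
    """Return each matched write statement fragment that targets raw.*."""
    return [match.group(0) for match in WRITE_TO_RAW.finditer(sql)]
```

A model that selects from `staging.events` and creates `marts.dau` yields an empty list, while `INSERT INTO raw.events ...` is reported.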
What data teams see after enforcement goes live.
Common questions.
Does this work with dbt, Spark, and SQL-based pipelines?
mneme check runs against the intent before you write any code.
Can we enforce regional data isolation rules?
--tags gdpr,eu-region) and check with mneme check --tags gdpr to scope enforcement to region-specific rules.
How do we handle schema evolution (rules that change over time)?
status: deprecated. The history is preserved; enforcement is disabled. New rules take effect immediately on the next check.