Architecture
Formation is a property-market data platform: a single relational database of addresses, schemes, companies, investments, occupiers and portfolios, reached through a .NET REST/OData API and a SvelteKit web client. Reads are composable (OData filter / expand / search), writes are command-driven (CQRS with LiteBus), and list/search queries target a denormalised read model that the write path maintains.
This page is the orientation layer for the rest of the technical docs. Each subsystem has a deep-dive — links are inline.
Table of Contents
- Systems at a Glance
- Request Lifecycle
- The Data Split — [app] vs [query]
- Identity Model
- Background Jobs
- Frontend Model
- Deployment
- Where Things Live
- Reading Order for New Contributors
Systems at a Glance
```mermaid
flowchart TB
    subgraph Client
        web[SvelteKit 5 Web<br/>Svelte runes, Tailwind]
    end

    subgraph API[".NET 10 API<br/>Container App"]
        readCtrl["*Controller<br/>OData: GET / $filter / $expand / $search"]
        writeCtrl["*WriteController<br/>POST / PATCH / PUT / DELETE"]
        mediator["LiteBus CommandMediator<br/>+ EventMediator<br/>(OpenTelemetry-wrapped)"]
        handlers["Handlers<br/>Commands/ + Events/"]
        readCtrl -- EF Core --> sql
        writeCtrl --> mediator --> handlers
        handlers -- EF + raw SQL --> sql
        handlers -- events --> handlers
    end

    subgraph Data
        sql[("SQL Server<br/>app.* base tables<br/>query.*List read model")]
    end

    subgraph Jobs["Container App Jobs"]
        jobload["jobload<br/>queue-triggered data import"]
        jobscore["jobcompletionscore<br/>nightly completeness scoring"]
        jobrebuild["jobrebuildqueryviews<br/>batch query.*List rebuild"]
        jobcurimp["jobcurimp<br/>daily currency import<br/>(lakehouse source)"]
        jobdedup["jobdedup<br/>duplicate-detection pass"]
    end

    subgraph External["External"]
        lakehouse[("BI lakehouse<br/>ECB exchange rates")]
    end

    subgraph Obs["Observability"]
        appi["Application Insights<br/>+ Log Analytics"]
    end

    web -- HTTPS via Traefik --> readCtrl
    web -- HTTPS via Traefik --> writeCtrl
    jobload -- commands --> mediator
    jobscore -- writes --> sql
    jobrebuild -- writes --> sql
    jobcurimp -- reads --> lakehouse
    jobcurimp -- writes --> sql
    jobdedup -- reads / writes --> sql
    API -. OpenTelemetry .-> appi
```

Everything runs in a single Azure Container Apps environment per tier (dev, uat). The web client, API, docs site, and Traefik ingress are long-running Container Apps; the data jobs are Container App Jobs triggered on events or schedules. Full detail in Deployment Topology.
Request Lifecycle
```mermaid
sequenceDiagram
    participant Browser
    participant Traefik as Traefik<br/>(ca-ingrs-01)
    participant API as API<br/>(ca-api-01)
    participant EF as EF Core
    participant SQL as SQL Server

    Browser->>Traefik: GET Schemes with $filter, $expand, $search
    Traefik->>API: forward to API
    API->>API: Auth — JWT + role check
    API->>API: ApplyODataExpansions — $expand to Include
    API->>API: TryApplyEncodedIdFilter<br/>rewrite encoded Id filter
    API->>API: SearchService — field filters + FTS<br/>against query.SchemeList
    API->>API: options.ApplyTo with ignore mask<br/>Filter / Count / OrderBy only
    API->>EF: Skip/Take + AsSplitQuery
    EF->>SQL: Materialise base rows + includes
    API->>EF: LoadPolymorphicCollections — Notes / Tags / Links
    API->>EF: LoadLightweightMarketBoundaries
    API-->>Traefik: 200 JSON array
    Traefik-->>Browser: 200 JSON array
```

The read path is driven by ODataQueryOptions<T> but is not blindly delegated to options.ApplyTo: Formation applies $expand, $top, $skip and $search itself, hands Filter/Count/OrderBy back to OData, and then loads polymorphic collections (Notes/Tags/ExternalLinks) and heavy spatial columns separately. See Dual Controller Pattern for the detail, especially the AllowedQueryOptions ignore-list footgun, which trips up most first reads of the code.
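The ignore-mask step is easy to get backwards, so here is a minimal sketch of the idea in TypeScript. It assumes, as the diagram above describes, that the value passed to ApplyTo lists the options to skip rather than the options to apply. The flag names and numeric values are hypothetical stand-ins, not the OData library's enum.

```typescript
// Hypothetical bit flags standing in for OData's AllowedQueryOptions;
// the numeric values are illustrative, not the library's.
const Opt = {
  Filter: 1,
  Expand: 2,
  OrderBy: 4,
  Top: 8,
  Skip: 16,
  Count: 32,
  Search: 64,
} as const;

const ALL_OPTIONS =
  Opt.Filter | Opt.Expand | Opt.OrderBy | Opt.Top | Opt.Skip | Opt.Count | Opt.Search;

// The options this page says are handed back to OData.
const DELEGATED_TO_ODATA = Opt.Filter | Opt.Count | Opt.OrderBy;

// The mask names what ApplyTo should IGNORE, so it is the complement
// of the options we actually want OData to apply.
function ignoreMaskFor(delegated: number): number {
  return ALL_OPTIONS & ~delegated;
}

function isIgnored(mask: number, option: number): boolean {
  return (mask & option) !== 0;
}

const ignoreMask = ignoreMaskFor(DELEGATED_TO_ODATA);
```

Passing DELEGATED_TO_ODATA itself as the mask would ignore $filter, $count and $orderby, which is presumably the inversion the footgun refers to.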
Full-text search targets [query].*List tables only — never the base tables. See Query Views and search implementation.
Writes
```mermaid
sequenceDiagram
    participant Browser
    participant API as *WriteController<br/>(EntityWriteControllerBase)
    participant Med as InstrumentedCommandMediator
    participant Handler as CreateSchemeCommandHandler
    participant SQL as SQL Server
    participant EvMed as InstrumentedEventMediator
    participant EvH as SchemeCreatedEventHandler

    Browser->>API: POST Schemes with JSON body<br/>Prefer — return=representation
    API->>API: Auth + decode Prefer header
    API->>Med: SendAsync — CreateSchemeCommand
    Med->>Handler: OTel span start + dispatch
    Handler->>Handler: CreateFromBody builds Scheme entity
    Handler->>Handler: scheme.TryValidate — errors?
    Note right of Handler: Early return CommandResult.ValidationError<br/>400 ProblemDetails via ToErrorResult
    Handler->>SQL: BEGIN TRANSACTION
    Handler->>SQL: INSERT Scheme + collection ops
    Handler->>Handler: Domain rules<br/>unique unknown companies, share ≤ 100%
    Handler->>SQL: Synchronise SchemeMarketBoundaries
    Handler->>SQL: SaveChanges + COMMIT
    Handler->>EvMed: PublishAsync SchemeCreatedEvent
    EvMed->>EvH: OTel span start + dispatch
    EvH->>SQL: Upsert query.SchemeList row
    EvH->>SQL: Upsert query.AddressList row — counts
    EvH->>SQL: Upsert query.CompanyList rows — counts
    Handler->>SQL: Re-read Scheme for representation<br/>heavy Include graph
    Handler-->>Med: CommandResult.Ok — Scheme result
    Med-->>API: result — span end, histogram record
    API-->>Browser: 201 Created + Location + JSON body
```

Every write:
- Hits a thin *WriteController (usually ~30 lines, inheriting EntityWriteControllerBase<T>).
- Dispatches a typed command through ICommandMediator, which is decorated with an OpenTelemetry wrapper (InstrumentedCommandMediator).
- Runs in a handler that owns its transaction, validates domain rules, commits, and then publishes events.
- Returns a CommandResult<T>: success carries the entity; failure carries a typed error kind plus messages.
- Has its failures converted to RFC 7807 ProblemDetails by the controller, via CommandResultExtensions.ToErrorResult().
The “publish events after commit” ordering is deliberate — see CQRS Flow for why. Event handlers maintain the query-view read model; there’s no trigger-based magic, just explicit LiteBus subscribers.
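The ordering is easier to see in a stripped-down form. The sketch below is an in-memory illustration in TypeScript, not the .NET handler: the arrays and the onSchemeCreated subscriber are hypothetical stand-ins for the base table, the read model and a LiteBus event handler. It shows validation returning early, the commit happening next, and the read-model upsert only running after the commit.

```typescript
// Hypothetical in-memory stand-ins; the real handlers are .NET classes using LiteBus.
type CommandResult<T> =
  | { ok: true; value: T }
  | { ok: false; kind: "ValidationError"; messages: string[] };

interface Scheme {
  id: number;
  name: string;
}

const baseTable: Scheme[] = [];               // stands in for [app].Scheme
const readModel: Record<number, string> = {}; // stands in for [query].SchemeList
const trace: string[] = [];                   // records the order of side effects

// Event subscriber: maintains the read model after a scheme is created.
function onSchemeCreated(scheme: Scheme): void {
  readModel[scheme.id] = scheme.name;
  trace.push("read-model upsert");
}

function createScheme(name: string): CommandResult<Scheme> {
  // 1. Validate first; a failure returns early and touches nothing.
  if (name.trim() === "") {
    return { ok: false, kind: "ValidationError", messages: ["Name is required"] };
  }
  // 2. Write and commit to the base table.
  const scheme: Scheme = { id: baseTable.length + 1, name: name };
  baseTable.push(scheme);
  trace.push("commit");
  // 3. Publish only once the commit has succeeded.
  onSchemeCreated(scheme);
  return { ok: true, value: scheme };
}

const created = createScheme("Riverside Quarter");
const rejected = createScheme("   ");
```

Because the subscriber runs after the commit, a failure inside it can leave the read model stale but cannot roll back the base-table write.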
The Data Split — [app] vs [query]
Two SQL schemas, with different jobs:
| Schema | Role | Maintained by | Indexed for |
|---|---|---|---|
| [app].* | Normalised OLTP, the source of truth | EF Core writes inside command handlers | Relational joins, point lookups |
| [query].*List | Denormalised read model: one row per entity, aggregates + join-flattened | Event handlers (single upsert) + rebuild job (batch) | Full-text search (FTI), list pagination |
Key properties:
- Search always targets [query].*List. Base-table FTIs were removed in [#503]; list endpoints never read [app].* directly.
- [query].* is eventually consistent. Event-handler failures leave rows stale until the rebuild job runs. For Formation’s workload this is acceptable.
- Adding a searchable column has four edit sites: DACPAC table + FTI block + mapper + BulkUpsertSpec. All must line up or the column is silently unsearchable.
Detail: Query Views.
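To make the eventual-consistency point concrete, here is a small in-memory sketch in TypeScript. It assumes a read-model row that carries a derived count, a single-row upsert that an event handler would call, and a batch rebuild. All names and data are hypothetical, and the real implementation runs against SQL Server.

```typescript
interface Company { id: number; name: string }
interface CompanyListRow { id: number; name: string; schemeCount: number }

// Hypothetical in-memory stand-ins for [app] base tables.
const companies: Company[] = [
  { id: 1, name: "Acme" },
  { id: 2, name: "Globex" },
  { id: 3, name: "Initech" },
];
const schemeOwnerIds: number[] = [1, 1, 3]; // one entry per scheme: the owning company

// Stand-in for [query].CompanyList, keyed by company id.
let companyList: Record<number, CompanyListRow> = {};

function toRow(company: Company): CompanyListRow {
  const schemeCount = schemeOwnerIds.filter((id) => id === company.id).length;
  return { id: company.id, name: company.name, schemeCount: schemeCount };
}

// Event-handler path: refresh exactly one row.
function upsertOne(companyId: number): void {
  const match = companies.filter((c) => c.id === companyId)[0];
  if (match !== undefined) companyList[match.id] = toRow(match);
}

// Rebuild-job path: clear everything, then repopulate in batches.
function rebuild(batchSize: number): number {
  companyList = {};
  let batches = 0;
  for (let start = 0; start < companies.length; start += batchSize) {
    for (const company of companies.slice(start, start + batchSize)) {
      companyList[company.id] = toRow(company);
    }
    batches += 1;
  }
  return batches;
}
```

If a scheme is added to schemeOwnerIds without a matching upsertOne call (the equivalent of a failed event handler), the affected row keeps its old count until rebuild runs.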
Identity Model
Every entity carries two identifiers: the integer primary key in the database (SchemeId, AddressId, CompanyId, …) and the six-character encoded Id string exposed by the API ("SC1b2Cd", "AD03KwA", …). Clients — the frontend, URL bars, patch payloads, API consumers — only ever see the encoded form. The integer is an internal detail.
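```csharp
// Encoded identifier exposed by the API; computed on read, never stored.
[NotMapped]
public string Id => EncodeIdentifier(DbId, GetType().Name);

// Integer primary key, supplied by each concrete entity.
[NotMapped]
public abstract int DbId { get; }
```

| Layer | What it uses |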
|---|---|
| Database | Integer PKs / FKs (SchemeId, AddressId, …) |
| API requests | Encoded Id in URLs, JSON payloads, JSON-Patch paths |
| API responses | Encoded Id on every entity; FKs rendered as nested { Id } objects |
| Frontend | Encoded string exclusively — never touches the integer |
Why this pattern. Integer PKs leak information: sequential URLs enumerate the dataset (/Schemes/1, /Schemes/2, …), response IDs expose creation rate and total count, and every bookmarked URL couples to physical schema. Opaque encoded IDs remove that leak and buy URL stability across migrations. Compared with GUIDs they’re shorter (6 chars vs 36), type-tagged (the first chars are derived from the entity type name so a scheme ID plugged into an address route fails cleanly), and don’t fragment the clustered PK index — the database stays on integers, the encoding is a pure rendering concern. Full write-up with the encoding algorithm, rationale vs alternatives, and a list of gotchas: Entity Identifiers.
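The type-tagging behaviour can be illustrated with a deliberately simple stand-in. The TypeScript below is not Formation's encoding algorithm (that is documented in Entity Identifiers): it assumes a two-letter prefix taken from the entity type name followed by a plain base-62 rendering of the integer, which is enumerable and so would not give the opacity described above. It only shows how a round trip works and how an ID used with the wrong entity type fails cleanly.

```typescript
// NOT the real algorithm: a plain base-62 stand-in with a type prefix.
const ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
const BASE = ALPHABET.length; // 62

// Hypothetical tag rule: first two letters of the type name, upper-cased.
function typeTag(entityType: string): string {
  return entityType.slice(0, 2).toUpperCase();
}

function encodeId(dbId: number, entityType: string): string {
  if (dbId < 0 || Math.floor(dbId) !== dbId) {
    throw new RangeError("dbId must be a non-negative integer");
  }
  let remaining = dbId;
  let digits = "";
  do {
    digits = ALPHABET.charAt(remaining % BASE) + digits;
    remaining = Math.floor(remaining / BASE);
  } while (remaining > 0);
  return typeTag(entityType) + digits;
}

// Returns null when the tag belongs to another entity type or the digits are invalid.
function decodeId(encoded: string, entityType: string): number | null {
  const tag = typeTag(entityType);
  const digits = encoded.slice(tag.length);
  if (encoded.slice(0, tag.length) !== tag || digits.length === 0) return null;
  let value = 0;
  for (let i = 0; i < digits.length; i += 1) {
    const digit = ALPHABET.indexOf(digits.charAt(i));
    if (digit < 0) return null;
    value = value * BASE + digit;
  }
  return value;
}
```

With this stand-in, an ID encoded for a Scheme decodes to null when presented as an Address ID, which mirrors the "fails cleanly" property.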
JSON-Patch diffs that the frontend produces for nested objects are automatically rewritten to FK paths (e.g. /Address/Id → /AddressId), so the client edits readable shapes while the backend persists FKs. See the JSON Patch guide.
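As a rough illustration of that path rewrite, the TypeScript sketch below maps a nested "/<Navigation>/Id" path onto the flat "/<Navigation>Id" form and leaves every other path alone. The function name and the exact matching rule are assumptions for the example. The real rewrite factory is described in the JSON Patch guide and also has to turn the encoded value into an integer key, which is omitted here.

```typescript
interface PatchOperation {
  op: "add" | "replace" | "remove";
  path: string;
  value?: unknown;
}

// Matches exactly "/<Navigation>/Id" and captures the navigation name.
const NESTED_ID_PATH = /^\/([A-Za-z]+)\/Id$/;

// Hypothetical helper: rewrites a nested Id path to its foreign-key path.
function rewriteToForeignKeyPath(operation: PatchOperation): PatchOperation {
  const match = NESTED_ID_PATH.exec(operation.path);
  if (match === null) return operation; // not a nested Id edit: leave as-is
  return { op: operation.op, path: "/" + match[1] + "Id", value: operation.value };
}
```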
Consumers who need to filter by Id use $filter=Id eq '…', which Formation intercepts and rewrites internally to WHERE SchemeId = 42 — the encoded ID is [NotMapped], so EF can’t translate it directly. See controller pattern → encoded-Id filtering.
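A minimal TypeScript sketch of that interception follows. It assumes the simplest possible case, a filter that is exactly Id eq '<encoded>', and takes the decoder as a parameter so that nothing about the real encoding is implied. The function and type names are hypothetical and are not the API's TryApplyEncodedIdFilter.

```typescript
type IdDecoder = (encoded: string) => number | null;

interface KeyPredicate {
  column: string; // integer key column, e.g. "SchemeId"
  value: number;
}

// Recognises only the exact shape:  Id eq '<encoded>'
const ID_EQUALS = /^\s*Id\s+eq\s+'([^']+)'\s*$/;

// Returns a predicate on the integer key, or null when the filter
// is not an encoded-Id comparison or the Id does not decode.
function tryRewriteIdFilter(
  filter: string,
  keyColumn: string,
  decode: IdDecoder,
): KeyPredicate | null {
  const match = ID_EQUALS.exec(filter);
  if (match === null) return null;
  const dbId = decode(match[1] || "");
  return dbId === null ? null : { column: keyColumn, value: dbId };
}
```

A null result means the filter should continue through the normal OData path unchanged.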
Background Jobs
Five Container App Jobs run alongside the long-lived API and web apps. None accept HTTP — they’re queue-triggered, scheduled, or manually invoked, and they share the same SQL database and Key Vault as the API.
| Job | Trigger | Purpose |
|---|---|---|
| Data Load (jobload) | Queue (KEDA) | Ingest CSV / XLSX files dropped into the data-load blob container; dispatch the same LiteBus commands as the HTTP write path. |
| Completeness Score (jobcompscore) | Scheduled nightly | Compute per-entity completeness score; write directly to [query].*List.CompletenessScore. |
| Query View Rebuild (jobqueryvws) | Manual / scheduled | Truncate and rebuild [query].*List from source tables in batches. Used after schema / mapper changes or to recover from event-handler failures. |
| Currency Import (jobcurimp) | Scheduled daily | Pull ECB exchange rates from the BI lakehouse (ECBExchangeRates.CurrencyConversion) via the WarehouseDb connection string, pivot direct / cross-rate pairs, upsert into [app].CurrencyConversion. |
| Duplicate Detection (jobdedup) | Manual | Run per-entity-type duplicate strategies (Address, Company, Scheme); flag likely-duplicate pairs in [app].DuplicateCandidate for human review. |
All five share the same shape: IHostedService worker, progress tracked in [app].JobExecution via IJobProgressService, graceful cancellation, Key-Vault-backed connection strings, 4-hour replica timeout. Detailed per-job walk-through in Background Jobs.
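The shared shape (batched work, progress reporting, cancellation checked between batches) can be sketched generically. The TypeScript below is an illustration under assumptions: the token, the progress record and runBatchJob are hypothetical names, and the real jobs are .NET IHostedService workers that record progress through IJobProgressService.

```typescript
interface CancellationToken { cancelled: boolean }
type JobStatus = "Running" | "Completed" | "Cancelled";
interface JobProgress { processed: number; total: number; status: JobStatus }

// Processes items in batches, reporting after each batch and stopping
// cleanly at the next batch boundary once cancellation is requested.
function runBatchJob<T>(
  items: T[],
  batchSize: number,
  handleBatch: (batch: T[]) => void,
  token: CancellationToken,
  report: (progress: JobProgress) => void,
): JobProgress {
  if (batchSize < 1) throw new RangeError("batchSize must be at least 1");
  let processed = 0;
  let status: JobStatus = "Completed";
  for (let start = 0; start < items.length; start += batchSize) {
    if (token.cancelled) {
      status = "Cancelled";
      break;
    }
    const batch = items.slice(start, start + batchSize);
    handleBatch(batch);
    processed += batch.length;
    report({ processed: processed, total: items.length, status: "Running" });
  }
  const final: JobProgress = { processed: processed, total: items.length, status: status };
  report(final);
  return final;
}
```

Checking the token only between batches keeps each batch atomic from the job's point of view, which is one simple way to get graceful cancellation.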
Frontend Model
The web client is SvelteKit 5 with Svelte 5 runes:

```mermaid
flowchart LR
    Comp["Page component<br/>(List / Detail)"] --> Comp2["Shared composable<br/>(useListPage / useDetailPage)"]
    Comp2 --> Store["Reactive OData store<br/>(odataStoreFactory)"]
    Store -- "$filter, $expand, $search" --> API["Formation API /$odata"]

    Comp -->|form data| Form["*Field components"]
    Form -->|zod schema| Validation["Validation pipeline"]
    Comp --> Ops["Entity operations service<br/>(createEntity / updateEntity)"]
    Ops --> API
```

Layer cake:
- Page components (routes) own UI layout. Most of the boilerplate moved into composables.
- Composables (use-detail-page, use-list-page, use-flyout-create, …) own data loading, edit state, dirty-tracking. See component-patterns.md.
- OData stores (odataStoreFactory) are reactive, cached per entity type. See state-management.md and the OData guide.
- Form components (TextField, NumberField, EntityLink, Checkbox, …) enforce consistent styling, validation, and error handling. Never raw <input>.
- Entity operations service wraps create/update/delete with JSON Patch building and error translation.
Svelte 5 runes have a small set of sharp edges (mutating $derived, $effect loops) that Formation’s docs call out — see the Svelte 5 reactivity rules in CLAUDE.md.
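For orientation, this is roughly what the store layer ends up sending. The TypeScript sketch below builds an OData query string from a typed description. The ODataQuery shape, the buildODataUrl name and the "/Schemes" path in the usage note are assumptions for the example, not the project's query builder.

```typescript
interface ODataQuery {
  filter?: string;
  expand?: string[];
  search?: string;
  top?: number;
  skip?: number;
}

// Hypothetical helper: serialises the query options that are set,
// percent-encoding the values while keeping the $-prefixed option names readable.
function buildODataUrl(entitySetUrl: string, query: ODataQuery): string {
  const parts: string[] = [];
  if (query.filter) parts.push("$filter=" + encodeURIComponent(query.filter));
  if (query.expand && query.expand.length > 0) {
    parts.push("$expand=" + query.expand.map((name) => encodeURIComponent(name)).join(","));
  }
  if (query.search) parts.push("$search=" + encodeURIComponent(query.search));
  if (query.top !== undefined) parts.push("$top=" + query.top);
  if (query.skip !== undefined) parts.push("$skip=" + query.skip);
  return parts.length === 0 ? entitySetUrl : entitySetUrl + "?" + parts.join("&");
}
```

For example, buildODataUrl("/Schemes", { search: "canary wharf", top: 20 }) yields "/Schemes?$search=canary%20wharf&$top=20".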
Deployment
One Azure subscription, three environments per subscription (dev, uat, prod). Per environment:
- One resource group containing VNet, SQL server + database, Key Vault, Log Analytics, Application Insights, two storage accounts, and a Container Apps Environment.
- Long-running Container Apps: api, web, docs, ingrs (Traefik).
- Container App Jobs: load, completeness-score, rebuild-query-views, currency-import, dedup.
- One shared container registry in the common resource group (frmpmaukscommonacr01).
Public traffic flows through a Traefik ingress Container App, path-routed to /api/* → API, /docs/* → docs, /* → web. Everything else is internal-only.
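The three path rules overlap, since "/" matches everything, so some precedence rule has to pick the winner. The TypeScript sketch below uses longest-prefix-wins, one common way to resolve overlapping path rules. The actual Traefik configuration is not shown on this page, so treat the rule table and function name as illustrative.

```typescript
type Backend = "api" | "docs" | "web";

interface Route {
  prefix: string;
  backend: Backend;
}

// Illustrative rule table mirroring the routing described above.
const ROUTES: Route[] = [
  { prefix: "/api/", backend: "api" },
  { prefix: "/docs/", backend: "docs" },
  { prefix: "/", backend: "web" },
];

// Longest matching prefix wins, so "/" only catches what nothing else claims.
function backendFor(path: string): Backend {
  let best: Route | null = null;
  for (const route of ROUTES) {
    const matches = path.slice(0, route.prefix.length) === route.prefix;
    if (matches && (best === null || route.prefix.length > best.prefix.length)) {
      best = route;
    }
  }
  return best === null ? "web" : best.backend;
}
```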
Identity and secrets:
- Every app has a user-assigned managed identity with least-privilege RBAC.
- Key Vault holds app-registration IDs, DB connection strings, session keys, and AD group SIDs.
- SQL Server is AAD-only (no SQL logins); Container Apps connect via managed identity.
- JWT bearer auth on the API; two Formation roles (User, Admin) derived from AD group membership via GroupRoleClaimsTransformation.
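The group-to-role derivation amounts to a lookup from group identifiers to role names. GroupRoleClaimsTransformation is a .NET class in the API; the TypeScript below only mirrors the mapping idea, with hypothetical names and placeholder group identifiers, and makes no claim about whether Admin implies User in the real system.

```typescript
type Role = "User" | "Admin";

interface GroupRoleMapping {
  groupSid: string; // placeholder identifiers in this sketch, not real SIDs
  role: Role;
}

// Returns each role whose mapped group appears in the caller's memberships,
// without duplicates and in mapping order.
function rolesForGroups(memberOf: string[], mappings: GroupRoleMapping[]): Role[] {
  const roles: Role[] = [];
  for (const mapping of mappings) {
    const isMember = memberOf.indexOf(mapping.groupSid) >= 0;
    if (isMember && roles.indexOf(mapping.role) < 0) roles.push(mapping.role);
  }
  return roles;
}
```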
Observability:
- OpenTelemetry wired in the API (Formation.Handlers, Formation.Search, Formation.Browser sources; Formation.API meter).
- AddAzureMonitorProfiler in non-dev environments ships everything to Application Insights.
- Console exporter in dev so metrics are visible during dotnet run.
CI/CD is GitHub Actions with OIDC federation — no long-lived service-principal secrets in repo. Each service has validate (PR) + deploy (push to main) workflows. DACPAC publishes are manual-dispatch with environment selection. Full detail in Deployment Topology.
Where Things Live
```
src/
├── common/
│   ├── models/                      # BaseEntity, CommandResult, commands, events,
│   │                                # query-view DTOs + mappers (shared across services)
│   └── services/                    # Shared domain services
│
├── services/
│   ├── api/app/
│   │   ├── app.api/
│   │   │   ├── Controllers/         # *Controller (OData) + *WriteController
│   │   │   ├── Handlers/Commands/   # LiteBus command handlers
│   │   │   ├── Handlers/Events/     # Post-commit fan-out (query-view upserts)
│   │   │   ├── Services/
│   │   │   │   ├── QueryViews/      # Mapper-driven upsert services
│   │   │   │   ├── Search/          # FTS query composition per entity
│   │   │   │   ├── Model/           # Spatial / domain services
│   │   │   │   ├── Patching/        # JSON Patch + rewrite factory
│   │   │   │   └── Telemetry/       # OTel wrappers
│   │   │   ├── Extensions/          # OData apply helpers, ToErrorResult,
│   │   │   │                        # TryApplyEncodedIdFilter
│   │   │   └── Data/                # DbContext, interceptors, configurations
│   │   ├── app.unittests/           # xUnit + Moq
│   │   ├── app.e2etests/            # Integration tests (requires SQL)
│   │   └── app.loadtests/           # Load tests
│   │
│   ├── web/src/
│   │   └── lib/
│   │       ├── composables/         # use-detail-page, use-flyout-create, …
│   │       ├── services/            # Entity operations (CRUD wrappers)
│   │       ├── odata/               # Reactive store factory, query builder
│   │       ├── components/          # Forms, panels, tables, filters
│   │       ├── types/               # Split per-entity type definitions
│   │       ├── validation/          # Zod schemas
│   │       └── errors/              # ProblemDetails parser + error mapping
│   │
│   ├── ingrs/                       # Traefik config
│   ├── job/
│   │   ├── load/                    # CSV/XLSX import, queue-triggered
│   │   ├── completionscore/         # Nightly completeness-score recompute
│   │   ├── rebuildqueryviews/       # Batch rebuild of [query].*List
│   │   ├── currencyimport/          # Daily pull of ECB rates from the lakehouse
│   │   └── duplicatedetection/      # Per-entity duplicate-detection strategies
│   └── docs/                        # Static docs + OpenAPI + Storybook
│
├── data/app/                        # DACPAC
│   ├── app/Tables/                  # [app].* base tables
│   └── query/Tables/                # [query].*List denormalised tables + FTIs
│
└── infrastructure/                  # Bicep IaC
    ├── main.bicep                   # Subscription-scope entry point
    ├── modules/core.bicep           # All workload resources
    ├── modules/aca_job.bicep
    └── configuration/{dev,uat}/main.parameters.json
```

Reading Order for New Contributors
If you’re new to the repo and want to build a complete mental model, work through the deep-dives in this order:
- Dual Controller Pattern: how HTTP requests reach domain logic. Answers: “why are there two controllers per entity?” and “why isn’t $filter an allow-list?”
- CQRS Flow with LiteBus: follows one write end-to-end, including the transaction → commit → event publish ordering. Answers: “what happens after SendAsync?”
- Entity Identifiers: the encoded Id scheme, why Formation doesn’t expose integer PKs, and how /Address/Id gets rewritten to AddressId on the backend.
- Query Views: the [app] / [query] split, the write-path fan-out, and the rebuild job. Answers: “where does search actually happen?”
- JSON Patch: how nested-object edits from the frontend become FK-level SQL.
- Search Implementation: operators, tokenisation, and the query-composition pipeline that reads from [query].*List.
- Frontend State Management + Component Patterns: how the web client composes OData stores, composables, and form components.
- OData Guide: reference for building queries from the frontend.
- EF Core Interceptors: audit, soft-delete, enum-cache hooks attached to SaveChanges.
- Background Jobs: the five Container App Jobs (load, completeness-score, query-view rebuild, currency import, duplicate detection).
- Deployment Topology: how it runs in Azure.
If you’re making a specific change, the right entry point depends on the change:
- Adding a new entity end-to-end → controller pattern → CQRS flow → entity IDs → query views.
- Adding a searchable column → query views (four-edit-site checklist).
- Adding or modifying a job → background jobs page.
- Debugging a frontend validation message → JSON Patch + state management.
- Understanding why a deployment can’t reach Key Vault → deployment topology (RBAC gotchas).
See also
- CLAUDE.md — distilled rules and conventions for AI tools working in the repo
- docs/README.md — top-level docs index
- docs/database.md — schema-design summary
- docs/ci-cd.md — pipeline-level CI/CD detail
- docs/infrastructure.md — one-time environment provisioning