Architecture

Formation is a property-market data platform: a single relational database of addresses, schemes, companies, investments, occupiers and portfolios, reached through a .NET REST/OData API and a SvelteKit web client. Reads are composable (OData filter / expand / search), writes are command-driven (CQRS with LiteBus), and list/search queries target a denormalised read model that the write path maintains.

This page is the orientation layer for the rest of the technical docs. Each subsystem has a deep-dive — links are inline.

Systems at a Glance
Request Lifecycle
- Reads
- Writes
The Data Split — [app] vs [query]
Identity Model
Background Jobs
Frontend Model
Deployment
Where Things Live
Reading Order for New Contributors

Systems at a Glance

flowchart TB
    subgraph Client
        web[SvelteKit Web<br/>Svelte 5 runes, Tailwind]
    end

    subgraph API[".NET 10 API<br/>Container App"]
        readCtrl["*Controller<br/>OData: GET / $filter / $expand / $search"]
        writeCtrl["*WriteController<br/>POST / PATCH / PUT / DELETE"]
        mediator["LiteBus CommandMediator<br/>+ EventMediator<br/>(OpenTelemetry-wrapped)"]
        handlers["Handlers<br/>Commands/ + Events/"]
        readCtrl -- EF Core --> sql
        writeCtrl --> mediator --> handlers
        handlers -- EF + raw SQL --> sql
        handlers -- events --> handlers
    end

    subgraph Data
        sql[("SQL Server<br/>app.* base tables<br/>query.*List read model")]
    end

    subgraph Jobs["Container App Jobs"]
        jobload["jobload<br/>queue-triggered data import"]
        jobscore["jobcompletionscore<br/>nightly completeness scoring"]
        jobrebuild["jobrebuildqueryviews<br/>batch query.*List rebuild"]
        jobcurimp["jobcurimp<br/>daily currency import<br/>(lakehouse source)"]
        jobdedup["jobdedup<br/>duplicate-detection pass"]
    end

    subgraph External["External"]
        lakehouse[("BI lakehouse<br/>ECB exchange rates")]
    end

    subgraph Obs["Observability"]
        appi["Application Insights<br/>+ Log Analytics"]
    end

    web -- HTTPS via Traefik --> readCtrl
    web -- HTTPS via Traefik --> writeCtrl
    jobload -- commands --> mediator
    jobscore -- writes --> sql
    jobrebuild -- writes --> sql
    jobcurimp -- reads --> lakehouse
    jobcurimp -- writes --> sql
    jobdedup -- reads / writes --> sql
    API -. OpenTelemetry .-> appi

Everything runs in a single Azure Container Apps environment per tier (dev, uat). The web client, API, docs site, and Traefik ingress are long-running Container Apps; the data jobs are Container App Jobs triggered on events or schedules. Full detail in Deployment Topology.

Request Lifecycle

Reads

sequenceDiagram
    participant Browser
    participant Traefik as Traefik<br/>(ca-ingrs-01)
    participant API as API<br/>(ca-api-01)
    participant EF as EF Core
    participant SQL as SQL Server

    Browser->>Traefik: GET Schemes with $filter, $expand, $search
    Traefik->>API: forward to API
    API->>API: Auth — JWT + role check
    API->>API: ApplyODataExpansions — $expand to Include
    API->>API: TryApplyEncodedIdFilter<br/>rewrite encoded Id filter
    API->>API: SearchService — field filters + FTS<br/>against query.SchemeList
    API->>API: options.ApplyTo with ignore mask<br/>Filter / Count / OrderBy only
    API->>EF: Skip/Take + AsSplitQuery
    EF->>SQL: Materialise base rows + includes
    API->>EF: LoadPolymorphicCollections — Notes / Tags / Links
    API->>EF: LoadLightweightMarketBoundaries
    API-->>Traefik: 200 JSON array
    Traefik-->>Browser: 200 JSON array

The read path is driven by ODataQueryOptions<T> but not blindly delegated to options.ApplyTo — Formation applies $expand, $top, $skip, $search itself, hands Filter/Count/OrderBy back to OData, and then loads polymorphic collections (Notes/Tags/ExternalLinks) and heavy spatial columns separately. See Dual Controller Pattern for the detail, especially the AllowedQueryOptions ignore-list footgun which trips up most first reads of the code.
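From the client's side, a read is just a URL carrying the standard OData system query options. The sketch below shows one way such a URL could be composed. `buildODataUrl`, the `/api/Schemes` path and the `Name` / `Address` field names are illustrative assumptions, not Formation's actual client code.

```typescript
// Hypothetical helper: composes an OData query string from typed options.
interface ODataQuery {
  filter?: string;
  expand?: string[];
  search?: string;
  top?: number;
  skip?: number;
}

function buildODataUrl(basePath: string, query: ODataQuery): string {
  const parts: string[] = [];
  if (query.filter) {
    parts.push(`$filter=${encodeURIComponent(query.filter)}`);
  }
  if (query.expand && query.expand.length > 0) {
    const expanded = query.expand.map((e) => encodeURIComponent(e)).join(",");
    parts.push(`$expand=${expanded}`);
  }
  if (query.search) {
    parts.push(`$search=${encodeURIComponent(query.search)}`);
  }
  if (query.top !== undefined) {
    parts.push(`$top=${query.top}`);
  }
  if (query.skip !== undefined) {
    parts.push(`$skip=${query.skip}`);
  }
  // Omit the "?" entirely when no option was supplied.
  return parts.length > 0 ? `${basePath}?${parts.join("&")}` : basePath;
}
```

On the server, each of these options is then handled by the step shown in the sequence diagram rather than by a single `options.ApplyTo` call.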

Full-text search targets [query].*List tables only — never the base tables. See Query Views and search implementation.

Writes

sequenceDiagram
    participant Browser
    participant API as *WriteController<br/>(EntityWriteControllerBase)
    participant Med as InstrumentedCommandMediator
    participant Handler as CreateSchemeCommandHandler
    participant SQL as SQL Server
    participant EvMed as InstrumentedEventMediator
    participant EvH as SchemeCreatedEventHandler

    Browser->>API: POST Schemes with JSON body<br/>Prefer — return=representation
    API->>API: Auth + decode Prefer header
    API->>Med: SendAsync — CreateSchemeCommand
    Med->>Handler: OTel span start + dispatch
    Handler->>Handler: CreateFromBody builds Scheme entity
    Handler->>Handler: scheme.TryValidate — errors?
    Note right of Handler: Early return CommandResult.ValidationError<br/>400 ProblemDetails via ToErrorResult
    Handler->>SQL: BEGIN TRANSACTION
    Handler->>SQL: INSERT Scheme + collection ops
    Handler->>Handler: Domain rules<br/>unique unknown companies, share ≤ 100%
    Handler->>SQL: Synchronise SchemeMarketBoundaries
    Handler->>SQL: SaveChanges + COMMIT
    Handler->>EvMed: PublishAsync SchemeCreatedEvent
    EvMed->>EvH: OTel span start + dispatch
    EvH->>SQL: Upsert query.SchemeList row
    EvH->>SQL: Upsert query.AddressList row — counts
    EvH->>SQL: Upsert query.CompanyList rows — counts
    Handler->>SQL: Re-read Scheme for representation<br/>heavy Include graph
    Handler-->>Med: CommandResult.Ok — Scheme result
    Med-->>API: result — span end, histogram record
    API-->>Browser: 201 Created + Location + JSON body

Every write:

Hits a thin *WriteController (usually ~30 lines, inheriting EntityWriteControllerBase<T>).
Dispatches a typed command through ICommandMediator, which is decorated with an OpenTelemetry wrapper (InstrumentedCommandMediator).
Runs in a handler that owns its transaction, validates domain rules, commits, and then publishes events.
Returns a CommandResult<T> — success carries the entity; failure carries a typed error kind plus messages.
Has any failure converted by the controller to an RFC 7807 ProblemDetails response via CommandResultExtensions.ToErrorResult().

The “publish events after commit” ordering is deliberate — see CQRS Flow for why. Event handlers maintain the query-view read model; there’s no trigger-based magic, just explicit LiteBus subscribers.
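On the receiving end, a failed write reaches the client as a ProblemDetails body. The sketch below shows how such a body might be flattened into display messages. The core members come from RFC 7807. The `errors` dictionary is the extension used by ASP.NET Core's `ValidationProblemDetails`, and whether Formation emits it is an assumption. `toMessages` is a hypothetical helper.

```typescript
// RFC 7807 core members plus the ASP.NET Core validation extension.
interface ProblemDetails {
  type?: string;
  title?: string;
  status?: number;
  detail?: string;
  instance?: string;
  errors?: Record<string, string[]>; // assumed extension member
}

// Hypothetical helper: prefer per-field messages, fall back to detail/title.
function toMessages(problem: ProblemDetails): string[] {
  const messages: string[] = [];
  const errors = problem.errors ?? {};
  for (const field of Object.keys(errors)) {
    for (const message of errors[field] ?? []) {
      messages.push(`${field}: ${message}`);
    }
  }
  if (messages.length > 0) {
    return messages;
  }
  return [problem.detail ?? problem.title ?? "Request failed"];
}
```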

The Data Split — `[app]` vs `[query]`

Two SQL schemas, with different jobs:

Schema	Role	Maintained by	Indexed for
`[app].*`	Normalised OLTP — the source of truth	EF Core writes inside command handlers	Relational joins, point lookups
`[query].*List`	Denormalised read model — one row per entity, aggregates + join-flattened	Event handlers (single upsert) + rebuild job (batch)	Full-text search (FTI), list pagination

Key properties:

Search always targets [query].*List. Base-table FTIs were removed in [#503]; list endpoints never read [app].* directly.
[query].* is eventually consistent. Event-handler failures leave rows stale until the rebuild job runs. For Formation’s workload this is acceptable.
Adding a searchable column has four edit sites: DACPAC table + FTI block + mapper + BulkUpsertSpec. All must line up or the column is silently unsearchable.

Detail: Query Views.
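The eventual-consistency property can be illustrated with an in-memory model. In this sketch `base` stands in for an `[app]` table and `readModel` for the matching `[query].*List` table. All names are hypothetical, and the assumption that a failed upsert never undoes the committed write is inferred from the "rows stay stale until the rebuild job runs" behaviour described above.

```typescript
interface Scheme { id: number; name: string; companyIds: number[]; }
interface SchemeListRow { id: number; name: string; companyCount: number; }

interface Store {
  base: Map<number, Scheme>;             // stand-in for [app].Scheme
  readModel: Map<number, SchemeListRow>; // stand-in for [query].SchemeList
}

// Flattens one base entity into its denormalised row (the mapper's role).
function toListRow(scheme: Scheme): SchemeListRow {
  return { id: scheme.id, name: scheme.name, companyCount: scheme.companyIds.length };
}

// Write path: commit the base row first, then attempt the read-model upsert.
function createScheme(
  store: Store,
  scheme: Scheme,
  upsert: (row: SchemeListRow) => void
): void {
  store.base.set(scheme.id, scheme);
  try {
    upsert(toListRow(scheme));
  } catch {
    // Assumption: the write stays committed; the read-model row is simply stale.
  }
}

// Rebuild job: discard the read model and regenerate it from the base rows.
function rebuildReadModel(store: Store): void {
  store.readModel.clear();
  store.base.forEach((scheme) => store.readModel.set(scheme.id, toListRow(scheme)));
}
```

A failing `upsert` leaves `base` with the new scheme and `readModel` without it; calling `rebuildReadModel` brings the two back in line.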

Identity Model

Every entity carries two identifiers: the integer primary key in the database (SchemeId, AddressId, CompanyId, …) and the six-character encoded Id string exposed by the API ("SC1b2Cd", "AD03KwA", …). Clients — the frontend, URL bars, patch payloads, API consumers — only ever see the encoded form. The integer is an internal detail.

[NotMapped] public string Id => EncodeIdentifier(DbId, GetType().Name);
[NotMapped] public abstract int DbId { get; }

Layer	What it uses
Database	Integer PKs / FKs (`SchemeId`, `AddressId`, …)
API requests	Encoded `Id` in URLs, JSON payloads, JSON-Patch paths
API responses	Encoded `Id` on every entity; FKs rendered as nested `{ Id }` objects
Frontend	Encoded string exclusively — never touches the integer

Why this pattern. Integer PKs leak information: sequential URLs enumerate the dataset (/Schemes/1, /Schemes/2, …), response IDs expose creation rate and total count, and every bookmarked URL couples to physical schema. Opaque encoded IDs remove that leak and buy URL stability across migrations. Compared with GUIDs they’re shorter (6 chars vs 36), type-tagged (the first chars are derived from the entity type name so a scheme ID plugged into an address route fails cleanly), and don’t fragment the clustered PK index — the database stays on integers, the encoding is a pure rendering concern. Full write-up with the encoding algorithm, rationale vs alternatives, and a list of gotchas: Entity Identifiers.

On the frontend, JSON-Patch diffs of nested objects are automatically rewritten to FK paths (e.g. /Address/Id → /AddressId) so the client edits readable shapes while the backend persists FKs. See the JSON Patch guide.
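A minimal sketch of that path rewrite, assuming every nested reference follows the `/<Navigation>/Id` → `/<Navigation>Id` convention shown in the example. `rewriteToFkPath` is a hypothetical name, not the project's actual function.

```typescript
// Subset of RFC 6902 operations, enough for the illustration.
interface PatchOperation {
  op: "add" | "replace" | "remove";
  path: string;
  value?: unknown;
}

// Rewrites "/Address/Id" to "/AddressId"; other paths pass through unchanged.
function rewriteToFkPath(operation: PatchOperation): PatchOperation {
  const match = /^\/([A-Za-z]+)\/Id$/.exec(operation.path);
  if (match === null) {
    return operation;
  }
  return { ...operation, path: `/${match[1]}Id` };
}
```

The operation's `value` is left untouched, so it still carries the encoded Id string.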

Consumers who need to filter by Id use $filter=Id eq '…', which Formation intercepts and rewrites internally to WHERE SchemeId = 42 — the encoded ID is [NotMapped], so EF can’t translate it directly. See controller pattern → encoded-Id filtering.
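To make the type tag and the filter rewrite concrete, here is a deliberately simplified sketch. It is not Formation's encoding algorithm (that lives in the Entity Identifiers write-up): it assumes a two-letter prefix taken from the type name followed by a zero-padded base-62 rendering of the integer, and it does nothing to hide sequence, which the real encoding is meant to do. All function names are hypothetical.

```typescript
const ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
const BODY_LENGTH = 4; // assumed padding, giving six characters in total

function typePrefix(typeName: string): string {
  return typeName.slice(0, 2).toUpperCase();
}

function encodeId(dbId: number, typeName: string): string {
  if (dbId < 0 || Math.floor(dbId) !== dbId) {
    throw new RangeError("dbId must be a non-negative integer");
  }
  let n = dbId;
  let body = "";
  do {
    body = ALPHABET.charAt(n % 62) + body;
    n = Math.floor(n / 62);
  } while (n > 0);
  while (body.length < BODY_LENGTH) body = "0" + body;
  return typePrefix(typeName) + body;
}

// Returns null when the prefix does not match the expected entity type.
function decodeId(encoded: string, typeName: string): number | null {
  const prefix = typePrefix(typeName);
  if (encoded.length <= prefix.length || encoded.indexOf(prefix) !== 0) return null;
  let n = 0;
  for (let i = prefix.length; i < encoded.length; i++) {
    const digit = ALPHABET.indexOf(encoded.charAt(i));
    if (digit < 0) return null;
    n = n * 62 + digit;
  }
  return n;
}

// Rewrites "Id eq '<encoded>'" into a filter on the integer key column.
function rewriteIdFilter(filter: string, typeName: string): string | null {
  const match = /^Id eq '([^']+)'$/.exec(filter.trim());
  if (match === null) return null;
  const dbId = decodeId(match[1] ?? "", typeName);
  return dbId === null ? null : `${typeName}Id eq ${dbId}`;
}
```

Under these assumptions `encodeId(42, "Scheme")` gives `"SC000g"`, decoding that string as an `Address` yields `null`, and `rewriteIdFilter("Id eq 'SC000g'", "Scheme")` gives `"SchemeId eq 42"`.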

Background Jobs

Five Container App Jobs run alongside the long-lived API and web apps. None accept HTTP — they’re queue-triggered, scheduled, or manually invoked, and they share the same SQL database and Key Vault as the API.

Job	Trigger	Purpose
Data Load — `jobload`	Queue (KEDA)	Ingest CSV / XLSX files dropped into the `data-load` blob container; dispatch the same LiteBus commands as the HTTP write path.
Completeness Score — `jobcompscore`	Scheduled nightly	Compute per-entity completeness score; write directly to `[query].*List.CompletenessScore`.
Query View Rebuild — `jobqueryvws`	Manual / scheduled	Truncate and rebuild `[query].*List` from source tables in batches. Used after schema / mapper changes or to recover from event-handler failures.
Currency Import — `jobcurimp`	Scheduled daily	Pull ECB exchange rates from the BI lakehouse (`ECBExchangeRates.CurrencyConversion`) via the `WarehouseDb` connection string, pivot direct / cross-rate pairs, upsert into `[app].CurrencyConversion`.
Duplicate Detection — `jobdedup`	Manual	Run per-entity-type duplicate strategies (Address, Company, Scheme); flag likely-duplicate pairs in `[app].DuplicateCandidate` for human review.

All five share the same shape: IHostedService worker, progress tracked in [app].JobExecution via IJobProgressService, graceful cancellation, Key-Vault-backed connection strings, 4-hour replica timeout. Detailed per-job walk-through in Background Jobs.

Frontend Model

The web client is SvelteKit with Svelte 5 runes:

flowchart LR
    Comp["Page component<br/>(List / Detail)"] --> Comp2["Shared composable<br/>(useListPage / useDetailPage)"]
    Comp2 --> Store["Reactive OData store<br/>(odataStoreFactory)"]
    Store -- "$filter, $expand, $search" --> API["Formation API /$odata"]

    Comp -->|form data| Form["*Field components"]
    Form -->|zod schema| Validation["Validation pipeline"]
    Comp --> Ops["Entity operations service<br/>(createEntity / updateEntity)"]
    Ops --> API

Layer cake:

Page components (routes) own UI layout. Most of the boilerplate has moved into composables.
Composables (use-detail-page, use-list-page, use-flyout-create, …) own data loading, edit state, dirty-tracking. See component-patterns.md.
OData stores (odataStoreFactory) are reactive, cached per entity type. See state-management.md and the OData guide.
Form components (TextField, NumberField, EntityLink, Checkbox, …) enforce consistent styling, validation, and error handling. Pages never use a raw <input>.
Entity operations service wraps create/update/delete with JSON Patch building and error translation.
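As a rough illustration of the last layer, the sketch below builds a JSON Patch document from a shallow comparison of the loaded entity and its edited copy. `buildPatch` is a hypothetical name, and the real service may diff more deeply or emit other operation kinds.

```typescript
interface ReplaceOperation {
  op: "replace";
  path: string;
  value: unknown;
}

// Emits one "replace" per top-level property whose value changed.
function buildPatch(
  original: Record<string, unknown>,
  edited: Record<string, unknown>
): ReplaceOperation[] {
  const operations: ReplaceOperation[] = [];
  for (const key of Object.keys(edited)) {
    if (original[key] !== edited[key]) {
      operations.push({ op: "replace", path: `/${key}`, value: edited[key] });
    }
  }
  return operations;
}
```

An empty result means nothing is dirty, so the update call can be skipped.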

Svelte 5 runes have a small set of sharp edges (mutating $derived, $effect loops) that Formation’s docs call out — see the Svelte 5 reactivity rules in CLAUDE.md.

Deployment

One Azure subscription hosts three environments (dev, uat, prod). Per environment:

One resource group containing VNet, SQL server + database, Key Vault, Log Analytics, Application Insights, two storage accounts, and a Container Apps Environment.
Long-running Container Apps: api, web, docs, ingrs (Traefik).
Container App Jobs: load, completeness-score, rebuild-query-views, currency-import, dedup.
One shared container registry in the common resource group (frmpmaukscommonacr01).

Public traffic flows through a Traefik ingress Container App, path-routed to /api/* → API, /docs/* → docs, /* → web. Everything else is internal-only.
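The routing rule amounts to a most-specific-prefix-first match. The function below models that decision only. It is not Traefik configuration, and `routeFor` is a hypothetical name.

```typescript
type Backend = "api" | "docs" | "web";

// Checks the specific prefixes first; everything else falls through to web.
function routeFor(path: string): Backend {
  if (path.indexOf("/api/") === 0) return "api";
  if (path.indexOf("/docs/") === 0) return "docs";
  return "web";
}
```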

Identity and secrets:

Every app has a user-assigned managed identity with least-privilege RBAC.
Key Vault holds app-registration IDs, DB connection strings, session keys, and AD group SIDs.
SQL Server is AAD-only (no SQL logins); Container Apps connect via managed identity.
JWT bearer auth on the API; two Formation roles (User, Admin) derived from AD group membership via GroupRoleClaimsTransformation.
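Conceptually, the role derivation is a lookup from group identifiers to Formation roles. The sketch below shows that mapping in isolation. `rolesForGroups` and the group identifiers in the usage note are made up for illustration, and the real `GroupRoleClaimsTransformation` operates on claims inside the .NET API.

```typescript
type Role = "User" | "Admin";

// Maps group identifiers to roles, ignoring unknown groups and duplicates.
function rolesForGroups(groupIds: string[], mapping: Record<string, Role>): Role[] {
  const roles: Role[] = [];
  for (const groupId of groupIds) {
    const role = mapping[groupId];
    if (role !== undefined && roles.indexOf(role) < 0) {
      roles.push(role);
    }
  }
  return roles;
}
```

For example, with a mapping of `{ "group-users": "User", "group-admins": "Admin" }`, a token listing `group-users` and an unrelated group resolves to the single role `User`.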

Observability:

OpenTelemetry wired in the API (Formation.Handlers, Formation.Search, Formation.Browser sources; Formation.API meter).
AddAzureMonitorProfiler in non-dev environments ships everything to Application Insights.
Console exporter in dev so metrics are visible during dotnet run.

CI/CD is GitHub Actions with OIDC federation — no long-lived service-principal secrets in repo. Each service has validate (PR) + deploy (push to main) workflows. DACPAC publishes are manual-dispatch with environment selection. Full detail in Deployment Topology.

Where Things Live

src/
├── common/
│   ├── models/            # BaseEntity, CommandResult, commands, events,
│   │                      # query-view DTOs + mappers (shared across services)
│   └── services/          # Shared domain services
│
├── services/
│   ├── api/app/
│   │   ├── app.api/
│   │   │   ├── Controllers/         # *Controller (OData) + *WriteController
│   │   │   ├── Handlers/Commands/   # LiteBus command handlers
│   │   │   ├── Handlers/Events/     # Post-commit fan-out (query-view upserts)
│   │   │   ├── Services/
│   │   │   │   ├── QueryViews/      # Mapper-driven upsert services
│   │   │   │   ├── Search/          # FTS query composition per entity
│   │   │   │   ├── Model/           # Spatial / domain services
│   │   │   │   ├── Patching/        # JSON Patch + rewrite factory
│   │   │   │   └── Telemetry/       # OTel wrappers
│   │   │   ├── Extensions/          # OData apply helpers, ToErrorResult,
│   │   │   │                        # TryApplyEncodedIdFilter
│   │   │   └── Data/                # DbContext, interceptors, configurations
│   │   ├── app.unittests/           # xUnit + Moq
│   │   ├── app.e2etests/            # Integration tests (requires SQL)
│   │   └── app.loadtests/           # Load tests
│   │
│   ├── web/src/
│   │   └── lib/
│   │       ├── composables/         # use-detail-page, use-flyout-create, …
│   │       ├── services/            # Entity operations (CRUD wrappers)
│   │       ├── odata/               # Reactive store factory, query builder
│   │       ├── components/          # Forms, panels, tables, filters
│   │       ├── types/               # Split per-entity type definitions
│   │       ├── validation/          # Zod schemas
│   │       └── errors/              # ProblemDetails parser + error mapping
│   │
│   ├── ingrs/               # Traefik config
│   ├── job/
│   │   ├── load/                    # CSV/XLSX import, queue-triggered
│   │   ├── completionscore/         # Nightly completeness-score recompute
│   │   ├── rebuildqueryviews/       # Batch rebuild of [query].*List
│   │   ├── currencyimport/          # Daily pull of ECB rates from the lakehouse
│   │   └── duplicatedetection/      # Per-entity duplicate-detection strategies
│   └── docs/                # Static docs + OpenAPI + Storybook
│
├── data/app/                # DACPAC
│   ├── app/Tables/          # [app].* base tables
│   └── query/Tables/        # [query].*List denormalised tables + FTIs
│
└── infrastructure/          # Bicep IaC
    ├── main.bicep           # Subscription-scope entry point
    ├── modules/core.bicep   # All workload resources
    ├── modules/aca_job.bicep
    └── configuration/{dev,uat}/main.parameters.json

Reading Order for New Contributors

If you’re new to the repo and want to build a complete mental model, work through the deep-dives in this order:

Dual Controller Pattern — how HTTP requests reach domain logic. Answers: “why are there two controllers per entity?” and “why isn’t $filter an allow-list?”
CQRS Flow with LiteBus — follows one write end-to-end, including the transaction → commit → event publish ordering. Answers: “what happens after SendAsync?”
Entity Identifiers — the encoded Id scheme, why Formation doesn’t expose integer PKs, and how /Address/Id gets rewritten to AddressId on the backend.
Query Views — the [app] / [query] split, the write-path fan-out, and the rebuild job. Answers: “where does search actually happen?”
JSON Patch — how nested-object edits from the frontend become FK-level SQL.
Search Implementation — operators, tokenisation, and the query-composition pipeline that reads from [query].*List.
Frontend State Management + Component Patterns — how the web client composes OData stores, composables, and form components.
OData Guide — reference for building queries from the frontend.
EF Core Interceptors — audit, soft-delete, enum-cache hooks attached to SaveChanges.
Background Jobs — the five Container App Jobs (load, completeness-score, query-view rebuild, currency import, duplicate detection).
Deployment Topology — how it runs in Azure.

If you’re making a specific change, the right entry point depends on the change:

Adding a new entity end-to-end → controller pattern → CQRS flow → entity IDs → query views.
Adding a searchable column → query views (four-edit-site checklist).
Adding or modifying a job → background jobs page.
Debugging a frontend validation message → JSON Patch + state management.
Understanding why a deployment can’t reach Key Vault → deployment topology (RBAC gotchas).
