
Architecture

Formation is a property-market data platform: a single relational database of addresses, schemes, companies, investments, occupiers and portfolios, reached through a .NET REST/OData API and a SvelteKit web client. Reads are composable (OData filter / expand / search), writes are command-driven (CQRS with LiteBus), and list/search queries target a denormalised read model that the write path maintains.

This page is the orientation layer for the rest of the technical docs. Each subsystem has a deep-dive — links are inline.

flowchart TB
subgraph Client
web[SvelteKit Web<br/>Svelte 5 runes, Tailwind]
end
subgraph API[".NET 10 API<br/>Container App"]
readCtrl["*Controller<br/>OData: GET / $filter / $expand / $search"]
writeCtrl["*WriteController<br/>POST / PATCH / PUT / DELETE"]
mediator["LiteBus CommandMediator<br/>+ EventMediator<br/>(OpenTelemetry-wrapped)"]
handlers["Handlers<br/>Commands/ + Events/"]
readCtrl -- EF Core --> sql
writeCtrl --> mediator --> handlers
handlers -- EF + raw SQL --> sql
handlers -- events --> handlers
end
subgraph Data
sql[("SQL Server<br/>app.* base tables<br/>query.*List read model")]
end
subgraph Jobs["Container App Jobs"]
jobload["jobload<br/>queue-triggered data import"]
jobscore["jobcompletionscore<br/>nightly completeness scoring"]
jobrebuild["jobrebuildqueryviews<br/>batch query.*List rebuild"]
jobcurimp["jobcurimp<br/>daily currency import<br/>(lakehouse source)"]
jobdedup["jobdedup<br/>duplicate-detection pass"]
end
subgraph External["External"]
lakehouse[("BI lakehouse<br/>ECB exchange rates")]
end
subgraph Obs["Observability"]
appi["Application Insights<br/>+ Log Analytics"]
end
web -- HTTPS via Traefik --> readCtrl
web -- HTTPS via Traefik --> writeCtrl
jobload -- commands --> mediator
jobscore -- writes --> sql
jobrebuild -- writes --> sql
jobcurimp -- reads --> lakehouse
jobcurimp -- writes --> sql
jobdedup -- reads / writes --> sql
API -. OpenTelemetry .-> appi

Everything runs in a single Azure Container Apps environment per tier (dev, uat). The web client, API, docs site, and Traefik ingress are long-running Container Apps; the data jobs are Container App Jobs triggered on events or schedules. Full detail in Deployment Topology.

sequenceDiagram
participant Browser
participant Traefik as Traefik<br/>(ca-ingrs-01)
participant API as API<br/>(ca-api-01)
participant EF as EF Core
participant SQL as SQL Server
Browser->>Traefik: GET Schemes with $filter, $expand, $search
Traefik->>API: forward to API
API->>API: Auth — JWT + role check
API->>API: ApplyODataExpansions — $expand to Include
API->>API: TryApplyEncodedIdFilter<br/>rewrite encoded Id filter
API->>API: SearchService — field filters + FTS<br/>against query.SchemeList
API->>API: options.ApplyTo with ignore mask<br/>Filter / Count / OrderBy only
API->>EF: Skip/Take + AsSplitQuery
EF->>SQL: Materialise base rows + includes
API->>EF: LoadPolymorphicCollections — Notes / Tags / Links
API->>EF: LoadLightweightMarketBoundaries
API-->>Traefik: 200 JSON array
Traefik-->>Browser: 200 JSON array

The read path is driven by ODataQueryOptions<T> but is not blindly delegated to options.ApplyTo. Formation applies $expand, $top, $skip and $search itself, hands Filter/Count/OrderBy back to OData, and then loads polymorphic collections (Notes/Tags/ExternalLinks) and heavy spatial columns separately. See Dual Controller Pattern for the detail, especially the AllowedQueryOptions ignore-list footgun: the mask passed to ApplyTo names the options to skip, not the options to allow, which trips up most first reads of the code.
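
The ignore-mask idea can be sketched with plain bit flags. This is a minimal illustration, not the API's code: the `QueryOption` enum and its values are hypothetical stand-ins for the real AllowedQueryOptions enum, and `isIgnored` is a helper invented for the example.

```typescript
// Hypothetical flag values for illustration only; the real
// AllowedQueryOptions enum has more members and different values.
enum QueryOption {
  None = 0,
  Filter = 1 << 0,
  Expand = 1 << 1,
  OrderBy = 1 << 2,
  Top = 1 << 3,
  Skip = 1 << 4,
  Count = 1 << 5,
  Search = 1 << 6,
  All = (1 << 7) - 1,
}

// Options the framework should still apply for us.
const delegated: number =
  QueryOption.Filter | QueryOption.Count | QueryOption.OrderBy;

// An ApplyTo-style call takes the set to IGNORE, so the mask is the
// complement of what we want applied, not the set itself.
const ignoreMask: number = QueryOption.All & ~delegated;

// True when the framework will skip this option and the controller
// must handle it by hand.
function isIgnored(option: QueryOption): boolean {
  return (ignoreMask & option) !== 0;
}
```

Passing `delegated` where `ignoreMask` is expected would invert the behaviour: filtering, counting and ordering would silently stop being applied.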

Full-text search targets [query].*List tables only — never the base tables. See Query Views and search implementation.

sequenceDiagram
participant Browser
participant API as *WriteController<br/>(EntityWriteControllerBase)
participant Med as InstrumentedCommandMediator
participant Handler as CreateSchemeCommandHandler
participant SQL as SQL Server
participant EvMed as InstrumentedEventMediator
participant EvH as SchemeCreatedEventHandler
Browser->>API: POST Schemes with JSON body<br/>Prefer — return=representation
API->>API: Auth + decode Prefer header
API->>Med: SendAsync — CreateSchemeCommand
Med->>Handler: OTel span start + dispatch
Handler->>Handler: CreateFromBody builds Scheme entity
Handler->>Handler: scheme.TryValidate — errors?
Note right of Handler: Early return CommandResult.ValidationError<br/>400 ProblemDetails via ToErrorResult
Handler->>SQL: BEGIN TRANSACTION
Handler->>SQL: INSERT Scheme + collection ops
Handler->>Handler: Domain rules<br/>unique unknown companies, share ≤ 100%
Handler->>SQL: Synchronise SchemeMarketBoundaries
Handler->>SQL: SaveChanges + COMMIT
Handler->>EvMed: PublishAsync SchemeCreatedEvent
EvMed->>EvH: OTel span start + dispatch
EvH->>SQL: Upsert query.SchemeList row
EvH->>SQL: Upsert query.AddressList row — counts
EvH->>SQL: Upsert query.CompanyList rows — counts
Handler->>SQL: Re-read Scheme for representation<br/>heavy Include graph
Handler-->>Med: CommandResult.Ok — Scheme result
Med-->>API: result — span end, histogram record
API-->>Browser: 201 Created + Location + JSON body

Every write:

  1. Hits a thin *WriteController (usually ~30 lines, inheriting EntityWriteControllerBase<T>).
  2. Dispatches a typed command through ICommandMediator, which is decorated with an OpenTelemetry wrapper (InstrumentedCommandMediator).
  3. Runs in a handler that owns its transaction, validates domain rules, commits, and then publishes events.
  4. Returns a CommandResult<T> — success carries the entity; failure carries a typed error kind plus messages.
  5. Controller converts failures to RFC 7807 ProblemDetails via CommandResultExtensions.ToErrorResult().

The “publish events after commit” ordering is deliberate — see CQRS Flow for why. Event handlers maintain the query-view read model; there’s no trigger-based magic, just explicit LiteBus subscribers.
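
The ordering in the steps above can be sketched in a few lines. This is a TypeScript stand-in for the C# handler shape, under stated assumptions: every name here (`createScheme`, `toProblemDetails`, the error kinds other than validation, and the 404/409 status mapping) is hypothetical and chosen for illustration. Only the 400 for validation errors comes from the diagram above.

```typescript
// Illustrative stand-ins, not Formation's C# types.
type CommandResult<T> =
  | { ok: true; value: T }
  | { ok: false; kind: "validation" | "notFound" | "conflict"; messages: string[] };

interface Scheme { id: number; name: string }

const log: string[] = [];        // records the order of side effects
const committed: Scheme[] = [];  // stands in for the [app] base table
const subscribers: Array<(s: Scheme) => void> = [
  (s) => log.push(`upsert read model for ${s.id}`),
];

function createScheme(name: string): CommandResult<Scheme> {
  // 1. Validate before touching the database; fail with a typed error.
  if (name.trim() === "") {
    return { ok: false, kind: "validation", messages: ["Name is required"] };
  }
  // 2. The handler owns its transaction.
  log.push("begin");
  const scheme: Scheme = { id: committed.length + 1, name };
  committed.push(scheme);
  log.push("commit");
  // 3. Publish only after commit, so subscribers never see uncommitted data.
  subscribers.forEach((notify) => notify(scheme));
  return { ok: true, value: scheme };
}

// Controller-side translation of a failure into a ProblemDetails-style body.
function toProblemDetails(result: CommandResult<unknown>) {
  if (result.ok) return null;
  const status =
    result.kind === "validation" ? 400 : result.kind === "notFound" ? 404 : 409;
  return { status, title: result.kind, detail: result.messages.join("; ") };
}
```

A failed validation returns before "begin" is ever logged, which mirrors the early return in the sequence diagram.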

Two SQL schemas, with different jobs:

| Schema | Role | Maintained by | Indexed for |
| --- | --- | --- | --- |
| [app].* | Normalised OLTP, the source of truth | EF Core writes inside command handlers | Relational joins, point lookups |
| [query].*List | Denormalised read model: one row per entity, aggregates + join-flattened | Event handlers (single upsert) + rebuild job (batch) | Full-text search (FTI), list pagination |

Key properties:

  • Search always targets [query].*List. Base-table FTIs were removed in [#503]; list endpoints never read [app].* directly.
  • [query].* is eventually consistent. Event-handler failures leave rows stale until the rebuild job runs. For Formation’s workload this is acceptable.
  • Adding a searchable column has four edit sites: DACPAC table + FTI block + mapper + BulkUpsertSpec. All must line up or the column is silently unsearchable.

Detail: Query Views.
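
The two maintenance paths and the resulting eventual consistency can be sketched with in-memory maps. This is a minimal model under stated assumptions: the row shapes, `toListRow`, `onSchemeChanged` and `rebuild` are hypothetical names, and the real tables carry many more columns.

```typescript
// Illustrative shapes only.
interface SchemeRow { schemeId: number; name: string; addressId: number }
interface AddressRow { addressId: number; city: string }
interface SchemeListRow { schemeId: number; name: string; city: string }

const addresses = new Map<number, AddressRow>();      // stands in for [app].Address
const schemes = new Map<number, SchemeRow>();         // stands in for [app].Scheme
const schemeList = new Map<number, SchemeListRow>();  // stands in for [query].SchemeList

// Mapper: flatten the join into one denormalised row.
function toListRow(s: SchemeRow): SchemeListRow {
  const city = addresses.get(s.addressId)?.city ?? "";
  return { schemeId: s.schemeId, name: s.name, city };
}

// Event-handler path: upsert exactly one row (or drop it if the source is gone).
function onSchemeChanged(schemeId: number): void {
  const s = schemes.get(schemeId);
  if (s) schemeList.set(schemeId, toListRow(s));
  else schemeList.delete(schemeId);
}

// Rebuild-job path: truncate, then repopulate from source in batches.
// Returns the number of batches processed.
function rebuild(batchSize: number): number {
  schemeList.clear();
  const all = Array.from(schemes.values());
  let batches = 0;
  for (let i = 0; i < all.length; i += batchSize) {
    for (const s of all.slice(i, i + batchSize)) {
      schemeList.set(s.schemeId, toListRow(s));
    }
    batches++;
  }
  return batches;
}
```

If `onSchemeChanged` never runs for a row (a failed event handler), that row stays missing or stale in `schemeList` until `rebuild` is called.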

Every entity carries two identifiers: the integer primary key in the database (SchemeId, AddressId, CompanyId, …) and the seven-character encoded Id string exposed by the API ("SC1b2Cd", "AD03KwA", …). Clients (the frontend, URL bars, patch payloads, API consumers) only ever see the encoded form. The integer is an internal detail.

src/common/models/Models/BaseEntity.cs
[NotMapped] public string Id => EncodeIdentifier(DbId, GetType().Name);
[NotMapped] public abstract int DbId { get; }

| Layer | What it uses |
| --- | --- |
| Database | Integer PKs / FKs (SchemeId, AddressId, …) |
| API requests | Encoded Id in URLs, JSON payloads, JSON-Patch paths |
| API responses | Encoded Id on every entity; FKs rendered as nested { Id } objects |
| Frontend | Encoded string exclusively; never touches the integer |

Why this pattern. Integer PKs leak information: sequential URLs enumerate the dataset (/Schemes/1, /Schemes/2, …), response IDs expose creation rate and total count, and every bookmarked URL couples to physical schema. Opaque encoded IDs remove that leak and buy URL stability across migrations. Compared with GUIDs they are shorter (7 chars vs 36), type-tagged (the leading characters are derived from the entity type name, so a scheme ID plugged into an address route fails cleanly), and don’t fragment the clustered PK index: the database stays on integers and the encoding is a pure rendering concern. Full write-up with the encoding algorithm, rationale vs alternatives, and a list of gotchas: Entity Identifiers.

JSON-Patch diffs of nested objects produced by the frontend are automatically rewritten to FK paths on the backend (e.g. /Address/Id → /AddressId), so the client edits readable shapes while the backend persists FKs. See the JSON Patch guide.
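
A minimal sketch of that path rewrite, assuming the convention that a navigation property `X` maps to a foreign key named `XId`. The `navigations` set and `rewriteToFkPath` are hypothetical names for illustration; the real rewrite lives in the API's patching services and may handle more cases.

```typescript
interface PatchOp {
  op: "add" | "replace" | "remove";
  path: string;
  value?: unknown;
}

// Hypothetical: navigation properties whose nested Id maps to a "<Name>Id" FK.
const navigations = new Set(["Address", "Company"]);

// Rewrites "/<Navigation>/Id" to "/<Navigation>Id"; every other op passes
// through untouched. The value is still the encoded Id at this point and
// would be decoded to the integer key in a later step.
function rewriteToFkPath(op: PatchOp): PatchOp {
  const match = /^\/([A-Za-z]+)\/Id$/.exec(op.path);
  const name = match?.[1];
  if (!name || !navigations.has(name)) return op;
  return { ...op, path: `/${name}Id` };
}
```

For example, `{ op: "replace", path: "/Address/Id", value: "AD03KwA" }` comes out with path `/AddressId`, while a patch on `/Name` is returned unchanged.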

Consumers who need to filter by Id use $filter=Id eq '…', which Formation intercepts and rewrites internally to WHERE SchemeId = 42 — the encoded ID is [NotMapped], so EF can’t translate it directly. See controller pattern → encoded-Id filtering.
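
The interception can be sketched as decode-then-rewrite. The encoding below is NOT Formation's algorithm (that is documented in Entity Identifiers); it is a stand-in that uses a two-letter type tag plus a base-36 integer, chosen only so the example runs. `typeTags`, `encodeId`, `decodeId` and `rewriteIdFilter` are all hypothetical names.

```typescript
// Stand-in encoding for illustration: 2-letter type tag + 5 base-36 digits.
const typeTags: Record<string, string> = { Scheme: "SC", Address: "AD" };

// Sketch only handles ids below 36^5.
function encodeId(entity: string, dbId: number): string {
  const tag = typeTags[entity];
  if (!tag) throw new Error(`Unknown entity type: ${entity}`);
  return tag + ("00000" + dbId.toString(36)).slice(-5);
}

function decodeId(entity: string, id: string): number | null {
  const tag = typeTags[entity];
  if (!tag || !id.startsWith(tag)) return null; // wrong type tag fails cleanly
  const digits = id.slice(tag.length);
  if (!/^[0-9a-z]{5}$/.test(digits)) return null;
  return parseInt(digits, 36);
}

// Rewrites `Id eq '<encoded>'` to a predicate on the integer key column.
// Returns null when the filter is not a simple Id comparison or the Id does
// not decode, so the caller can fall back to normal OData handling.
function rewriteIdFilter(entity: string, filter: string): string | null {
  const match = /^Id eq '([^']+)'$/.exec(filter.trim());
  const encoded = match?.[1];
  if (!encoded) return null;
  const dbId = decodeId(entity, encoded);
  return dbId === null ? null : `${entity}Id = ${dbId}`;
}
```

Under this stand-in encoding, `rewriteIdFilter("Scheme", "Id eq 'SC00016'")` yields `SchemeId = 42`, and the same Id passed to an Address route decodes to null.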

Five Container App Jobs run alongside the long-lived API and web apps. None accept HTTP — they’re queue-triggered, scheduled, or manually invoked, and they share the same SQL database and Key Vault as the API.

| Job | Trigger | Purpose |
| --- | --- | --- |
| Data Load (jobload) | Queue (KEDA) | Ingest CSV / XLSX files dropped into the data-load blob container; dispatch the same LiteBus commands as the HTTP write path. |
| Completeness Score (jobcompscore) | Scheduled nightly | Compute per-entity completeness score; write directly to [query].*List.CompletenessScore. |
| Query View Rebuild (jobqueryvws) | Manual / scheduled | Truncate and rebuild [query].*List from source tables in batches. Used after schema / mapper changes or to recover from event-handler failures. |
| Currency Import (jobcurimp) | Scheduled daily | Pull ECB exchange rates from the BI lakehouse (ECBExchangeRates.CurrencyConversion) via the WarehouseDb connection string, pivot direct / cross-rate pairs, upsert into [app].CurrencyConversion. |
| Duplicate Detection (jobdedup) | Manual | Run per-entity-type duplicate strategies (Address, Company, Scheme); flag likely-duplicate pairs in [app].DuplicateCandidate for human review. |
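
The "pivot direct / cross-rate pairs" step in the currency import can be illustrated with simple arithmetic. This sketch assumes the input is quoted as units of each currency per 1 EUR, as ECB reference rates are; the actual shape of the lakehouse table is not described here, and `EurRate`, `Pair` and `pivotRates` are hypothetical names.

```typescript
// Assumption: perEur is the number of units of `code` that 1 EUR buys.
interface EurRate { code: string; perEur: number }
interface Pair { from: string; to: string; rate: number }

function pivotRates(rates: EurRate[]): Pair[] {
  // EUR itself is the implicit base with a rate of 1.
  const all: EurRate[] = [{ code: "EUR", perEur: 1 }, ...rates];
  const pairs: Pair[] = [];
  for (const from of all) {
    for (const to of all) {
      if (from.code === to.code) continue;
      // 1 `from` = (1 / from.perEur) EUR = (to.perEur / from.perEur) `to`.
      pairs.push({ from: from.code, to: to.code, rate: to.perEur / from.perEur });
    }
  }
  return pairs;
}
```

Pairs involving EUR are the direct rates; every other pair is a cross rate derived through EUR, so n quoted currencies produce (n + 1) × n pairs.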

All five share the same shape: IHostedService worker, progress tracked in [app].JobExecution via IJobProgressService, graceful cancellation, Key-Vault-backed connection strings, 4-hour replica timeout. Detailed per-job walk-through in Background Jobs.
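
The shared shape (work loop, progress reporting, cooperative cancellation) can be sketched as below. This is a synchronous TypeScript simplification of what would be an async hosted service in .NET; `JobProgress`, `CancellationFlag` and `runJob` are hypothetical names, and the `report` callback merely stands in for a progress service.

```typescript
interface JobProgress {
  processed: number;
  total: number;
  status: "running" | "cancelled" | "done";
}

// Hypothetical cooperative cancellation flag.
interface CancellationFlag { cancelled: boolean }

function runJob<T>(
  items: T[],
  handle: (item: T) => void,
  report: (progress: JobProgress) => void,
  flag: CancellationFlag,
): JobProgress {
  let processed = 0;
  for (const item of items) {
    // Check between items so the unit of work in flight always finishes.
    if (flag.cancelled) {
      const stopped: JobProgress = { processed, total: items.length, status: "cancelled" };
      report(stopped);
      return stopped;
    }
    handle(item);
    processed++;
    report({ processed, total: items.length, status: "running" });
  }
  const done: JobProgress = { processed, total: items.length, status: "done" };
  report(done);
  return done;
}
```

Checking the flag only at item boundaries is the design choice that makes cancellation graceful: progress recorded so far is accurate, and no item is left half-processed.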

The web client is SvelteKit with Svelte 5 runes:

flowchart LR
Comp["Page component<br/>(List / Detail)"] --> Comp2["Shared composable<br/>(useListPage / useDetailPage)"]
Comp2 --> Store["Reactive OData store<br/>(odataStoreFactory)"]
Store -- "$filter, $expand, $search" --> API["Formation API /$odata"]
Comp -->|form data| Form["*Field components"]
Form -->|zod schema| Validation["Validation pipeline"]
Comp --> Ops["Entity operations service<br/>(createEntity / updateEntity)"]
Ops --> API

Layer cake:

  • Page components (routes) own UI layout. Most of the boilerplate moved into composables.
  • Composables (use-detail-page, use-list-page, use-flyout-create, …) own data loading, edit state, dirty-tracking. See component-patterns.md.
  • OData stores (odataStoreFactory) are reactive, cached per entity type. See state-management.md and the OData guide.
  • Form components (TextField, NumberField, EntityLink, Checkbox, …) enforce consistent styling, validation, and error handling. Never raw <input>.
  • Entity operations service wraps create/update/delete with JSON Patch building and error translation.
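
What an OData store ultimately sends to the API can be sketched as a small query-string builder. This is not the odataStoreFactory implementation: `ListQuery` and `buildODataUrl` are hypothetical names, and the base URL used in the usage note is an assumption.

```typescript
interface ListQuery {
  filter?: string;
  expand?: string[];
  search?: string;
  top?: number;
  skip?: number;
}

// Builds "<baseUrl>/<entitySet>?$filter=...&$expand=..." and omits any
// option that was not supplied.
function buildODataUrl(baseUrl: string, entitySet: string, query: ListQuery): string {
  const parts: string[] = [];
  if (query.filter) parts.push(`$filter=${encodeURIComponent(query.filter)}`);
  if (query.expand && query.expand.length > 0) {
    parts.push(`$expand=${query.expand.map((e) => encodeURIComponent(e)).join(",")}`);
  }
  if (query.search) parts.push(`$search=${encodeURIComponent(query.search)}`);
  if (query.top !== undefined) parts.push(`$top=${query.top}`);
  if (query.skip !== undefined) parts.push(`$skip=${query.skip}`);
  const suffix = parts.length > 0 ? `?${parts.join("&")}` : "";
  return `${baseUrl}/${entitySet}${suffix}`;
}
```

Calling `buildODataUrl("/api", "Schemes", { filter: "Name eq 'Foo'", expand: ["Address"], top: 25 })` produces `/api/Schemes?$filter=Name%20eq%20'Foo'&$expand=Address&$top=25`.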

Svelte 5 runes have a small set of sharp edges (mutating $derived, $effect loops) that Formation’s docs call out — see the Svelte 5 reactivity rules in CLAUDE.md.

One Azure subscription, two environments (dev, uat). Per environment:

  • One resource group containing VNet, SQL server + database, Key Vault, Log Analytics, Application Insights, two storage accounts, and a Container Apps Environment.
  • Long-running Container Apps: api, web, docs, ingrs (Traefik).
  • Container App Jobs: load, completeness-score, rebuild-query-views, currency-import, dedup.
  • One shared container registry in the common resource group (frmpmaukscommonacr01).

Public traffic flows through a Traefik ingress Container App, path-routed to /api/* → API, /docs/* → docs, /* → web. Everything else is internal-only.
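
The routing rule amounts to first-match on an ordered prefix list. The sketch below models only that rule; it is not Traefik configuration, and Traefik's own matcher semantics may differ in edge cases. `Backend`, `routes` and `route` are hypothetical names.

```typescript
type Backend = "api" | "docs" | "web";

// Ordered most-specific first; the catch-all must come last.
const routes: Array<{ prefix: string; backend: Backend }> = [
  { prefix: "/api/", backend: "api" },
  { prefix: "/docs/", backend: "docs" },
  { prefix: "/", backend: "web" },
];

// Returns the backend for the first prefix that matches the request path.
function route(path: string): Backend {
  for (const r of routes) {
    if (path.startsWith(r.prefix)) return r.backend;
  }
  return "web";
}
```

Because the prefixes end in a slash, a path such as `/apiary` falls through to the web client instead of being sent to the API.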

Identity and secrets:

  • Every app has a user-assigned managed identity with least-privilege RBAC.
  • Key Vault holds app-registration IDs, DB connection strings, session keys, and AD group SIDs.
  • SQL Server is AAD-only (no SQL logins); Container Apps connect via managed identity.
  • JWT bearer auth on the API; two Formation roles (User, Admin) derived from AD group membership via GroupRoleClaimsTransformation.

Observability:

  • OpenTelemetry wired in the API (Formation.Handlers, Formation.Search, Formation.Browser sources; Formation.API meter).
  • UseAzureMonitor in non-dev environments ships everything to Application Insights.
  • Console exporter in dev so metrics are visible during dotnet run.

CI/CD is GitHub Actions with OIDC federation — no long-lived service-principal secrets in repo. Each service has validate (PR) + deploy (push to main) workflows. DACPAC publishes are manual-dispatch with environment selection. Full detail in Deployment Topology.

src/
├── common/
│ ├── models/ # BaseEntity, CommandResult, commands, events,
│ │ # query-view DTOs + mappers (shared across services)
│ └── services/ # Shared domain services
├── services/
│ ├── api/app/
│ │ ├── app.api/
│ │ │ ├── Controllers/ # *Controller (OData) + *WriteController
│ │ │ ├── Handlers/Commands/ # LiteBus command handlers
│ │ │ ├── Handlers/Events/ # Post-commit fan-out (query-view upserts)
│ │ │ ├── Services/
│ │ │ │ ├── QueryViews/ # Mapper-driven upsert services
│ │ │ │ ├── Search/ # FTS query composition per entity
│ │ │ │ ├── Model/ # Spatial / domain services
│ │ │ │ ├── Patching/ # JSON Patch + rewrite factory
│ │ │ │ └── Telemetry/ # OTel wrappers
│ │ │ ├── Extensions/ # OData apply helpers, ToErrorResult,
│ │ │ │ # TryApplyEncodedIdFilter
│ │ │ └── Data/ # DbContext, interceptors, configurations
│ │ ├── app.unittests/ # xUnit + Moq
│ │ ├── app.e2etests/ # Integration tests (requires SQL)
│ │ └── app.loadtests/ # Load tests
│ │
│ ├── web/src/
│ │ └── lib/
│ │ ├── composables/ # use-detail-page, use-flyout-create, …
│ │ ├── services/ # Entity operations (CRUD wrappers)
│ │ ├── odata/ # Reactive store factory, query builder
│ │ ├── components/ # Forms, panels, tables, filters
│ │ ├── types/ # Split per-entity type definitions
│ │ ├── validation/ # Zod schemas
│ │ └── errors/ # ProblemDetails parser + error mapping
│ │
│ ├── ingrs/ # Traefik config
│ ├── job/
│ │ ├── load/ # CSV/XLSX import, queue-triggered
│ │ ├── completionscore/ # Nightly completeness-score recompute
│ │ ├── rebuildqueryviews/ # Batch rebuild of [query].*List
│ │ ├── currencyimport/ # Daily pull of ECB rates from the lakehouse
│ │ └── duplicatedetection/ # Per-entity duplicate-detection strategies
│ └── docs/ # Static docs + OpenAPI + Storybook
├── data/app/ # DACPAC
│ ├── app/Tables/ # [app].* base tables
│ └── query/Tables/ # [query].*List denormalised tables + FTIs
└── infrastructure/ # Bicep IaC
├── main.bicep # Subscription-scope entry point
├── modules/core.bicep # All workload resources
├── modules/aca_job.bicep
└── configuration/{dev,uat}/main.parameters.json

If you’re new to the repo and want to build a complete mental model, work through the deep-dives in this order:

  1. Dual Controller Pattern — how HTTP requests reach domain logic. Answers: “why are there two controllers per entity?” and “why isn’t $filter an allow-list?”

  2. CQRS Flow with LiteBus — follows one write end-to-end, including the transaction → commit → event publish ordering. Answers: “what happens after SendAsync?”

  3. Entity Identifiers — the encoded Id scheme, why Formation doesn’t expose integer PKs, and how /Address/Id gets rewritten to AddressId on the backend.

  4. Query Views — the [app] / [query] split, the write-path fan-out, and the rebuild job. Answers: “where does search actually happen?”

  5. JSON Patch — how nested-object edits from the frontend become FK-level SQL.

  6. Search Implementation — operators, tokenisation, and the query-composition pipeline that reads from [query].*List.

  7. Frontend State Management + Component Patterns — how the web client composes OData stores, composables, and form components.

  8. OData Guide — reference for building queries from the frontend.

  9. EF Core Interceptors — audit, soft-delete, enum-cache hooks attached to SaveChanges.

  10. Background Jobs — the five Container App Jobs (load, completeness-score, query-view rebuild, currency import, duplicate detection).

  11. Deployment Topology — how it runs in Azure.

If you’re making a specific change, the right entry point depends on the change:

  • Adding a new entity end-to-end → controller pattern → CQRS flow → entity IDs → query views.
  • Adding a searchable column → query views (four-edit-site checklist).
  • Adding or modifying a job → background jobs page.
  • Debugging a frontend validation message → JSON Patch + state management.
  • Understanding why a deployment can’t reach Key Vault → deployment topology (RBAC gotchas).