Skip to content

Query Views

Formation keeps its normalised data in the [app].* schema and a denormalised read model in the [query].* schema. Every entity that appears in a list or search has a matching [query].*List table — CompanyList, SchemeList, AddressList, InvestmentEventList, OccupierEventList, PortfolioList. All full-text search targets those query tables; the base tables are never touched by list queries.

This page explains how query views are defined, how they’re populated from the write path, how the rebuild job keeps them in sync, and where the footguns are.

Full-text search (FTS) on base tables doesn’t scale in Formation’s workload for two reasons:

  1. Join cost on every query. A search for “acme” against Company has to materialise the name, the company number, the primary address, the country, and the parent company — all joins. Even with indexes, the join fan-out per query blows past the latency budget as the dataset grows.

  2. Double the index-maintenance cost on writes. Every UPDATE on [app].Company forces SQL Server to re-index the base-table FTI. If you also maintain a denormalised [query].CompanyList row with its own FTI, you pay that cost twice for no query benefit.

Before [#503] the schema had FTIs on both [app].Company and [query].CompanyList. #503 deleted the base-table FTIs after confirming the list/search path never queried them. The result: one FTI per entity, denormalised to the shape the search service actually wants, maintained by application-level event handlers.

Search always targets [query].*List views. Never [app].* base tables. Base-table FTIs were removed in [#503]. When adding a new searchable column, add it to the relevant CREATE FULLTEXT INDEX ON [query].[…List] block and the corresponding query-view mapper.

The DACPAC defines query tables under src/data/app/query/Tables/. Each matches a base entity plus the denormalised fields needed for list + search. Here’s CompanyList verbatim:

-- src/data/app/query/Tables/CompanyList.sql
CREATE TABLE [query].[CompanyList]
(
[Id] NVARCHAR(10) COLLATE Latin1_General_BIN2 NOT NULL,
[CompanyId] INT NOT NULL,
[Name] NVARCHAR(255) NOT NULL,
[CompanyNumber] NVARCHAR(255) NULL,
[Url] NVARCHAR(255) NULL,
[AddressLine] NVARCHAR(MAX) NULL,
[CompanyTypeCode] NVARCHAR(50) NULL,
[CompanyTypeName] NVARCHAR(255) NULL,
[CountryCode] NVARCHAR(50) NULL,
[CountryName] NVARCHAR(255) NULL,
[ParentCompanyId] NVARCHAR(10) NULL,
[ParentCompanyName] NVARCHAR(255) NULL,
[IsVerified] BIT NULL,
[CompletenessScore] DECIMAL (18, 8) NULL,
[SchemeCount] INT DEFAULT 0 NOT NULL,
[OccupierEventCount] INT DEFAULT 0 NOT NULL,
[InvestmentEventCount] INT DEFAULT 0 NOT NULL,
[PortfolioCount] INT DEFAULT 0 NOT NULL,
CONSTRAINT [PK_CompanyList] PRIMARY KEY CLUSTERED ([Id] ASC)
);
GO
CREATE FULLTEXT INDEX ON [query].[CompanyList]
(
[Name] LANGUAGE 1033,
[CompanyNumber] LANGUAGE 1033,
[Url] LANGUAGE 1033,
[AddressLine] LANGUAGE 1033,
[CompanyTypeCode] LANGUAGE 1033,
[CompanyTypeName] LANGUAGE 1033,
[CountryCode] LANGUAGE 1033,
[CountryName] LANGUAGE 1033,
[ParentCompanyName] LANGUAGE 1033
)
KEY INDEX PK_CompanyList
ON [FormationFullTextCatalog];
GO

Each column falls into one of four kinds:

KindExamplesSourceUpdated when
PassthroughCompanyId, Name, CompanyNumber, Url, IsVerified, CompletenessScoreBase-table columnThe base-table column changes
Denormalised joinAddressLine, CompanyTypeCode, CountryName, ParentCompanyNameRelated-entity columnThe source row changes
Encoded referenceParentCompanyId (string, not int)Base-table FK, encoded via BaseEntity.IdThe FK changes
AggregateSchemeCount, OccupierEventCount, InvestmentEventCount, PortfolioCountCOUNT() across a join tableAny row in the join table changes

The aggregates are what makes list views expensive to maintain. When a scheme is created, every company linked to that scheme has its SchemeCount off by one — they all need re-upserting. The Scheme event handler walks that graph explicitly.

Id is a string (NVARCHAR(10)) — the encoded form of the primary key. CompanyId is the integer PK. The split mirrors BaseEntity.Id on .NET entities:

src/common/models/Models/BaseEntity.cs
[NotMapped]
public string Id => EncodeIdentifier(DbId);

Id is used as the PK on the query table because every list / search / detail URL uses the encoded form. Indexing on Id avoids a post-query decode step. The Latin1_General_BIN2 collation matches the encoding character set so equality is exact and index seeks are stable.

The FTI is defined only on the query table, not on [app].Company. Columns are chosen for search relevance, not storage layout. The CompanyList FTI indexes nine columns; SchemeList indexes 18 (schemes have a wider surface area because users search by development details and market, not just name).

SQL Server’s FTI population is asynchronous but fast for individual row upserts. The CONTAINS(…) and FREETEXT(…) predicates used by CompanyListSearchService hit the FTI directly. See the search implementation guide for the full query composition.

FTI index lifetime:

Upsert row into [query].CompanyList
SQL Server records the change in the FTI change-tracking table
FTI population process (background, typically within seconds)
merges the change into the FTI structure
Next CONTAINS() query sees the change

For tests and imports, call ALTER FULLTEXT INDEX ON [query].CompanyList START FULL POPULATION or use the helper in app.e2etests to synchronously wait for FTI population before asserting on search results. Without this, freshly inserted rows can be absent from FTS results for a second or two.

Query-view writes are driven by domain events, not database triggers. After a successful command, the handler publishes an event; event handlers upsert the affected query rows.

For POST /Companies:

POST /Companies
CreateCompanyCommandHandler
├─ SaveChangesAsync → [app].[Company] INSERT
├─ COMMIT
└─ eventMediator.PublishAsync(new CompanyCreatedEvent(companyId))
CompanyCreatedEventHandler
└─ _companyListViewService.UpsertCompanyAsync(companyId)
├─ Load [app].[Company] with relations
├─ Map to CompanyListItemDto
├─ Populate counts (SchemeCount, etc.)
├─ DELETE FROM [query].CompanyList WHERE CompanyId = @id
└─ INSERT INTO [query].CompanyList (…) VALUES (…)
SQL Server schedules FTI population
Searchable

The event publishes after the main transaction commits. Event handler failures do not roll back the base-table insert — query-view divergence is recoverable via the rebuild job. See CQRS flow → Event publishing for the rationale.

Cross-entity fan-out. Many events invalidate multiple query views. The scheme-created handler (SchemeCreatedEventHandler.cs) upserts:

  • [query].SchemeList — the new scheme itself.
  • [query].AddressList — the parent address’s SchemeCount and sector counts changed.
  • [query].CompanyList — every linked company’s SchemeCount changed.

A “simple” scheme creation with five companies produces seven query-view upserts, each with its own counts pass. Budget accordingly when measuring write latency.

Each entity has a pure-function mapper in src/common/models/Services/QueryViews/:

// CompanyListViewMapper.cs (abridged)
public static class CompanyListViewMapper
{
public static IQueryable<Company> IncludeRelations(IQueryable<Company> q) =>
q.Include(c => c.Address)
.Include(c => c.CompanyType)
.Include(c => c.Country)
.Include(c => c.ParentCompany);
public static CompanyListItemDto MapToListItem(Company company) =>
new()
{
Id = company.Id,
CompanyId = company.CompanyId,
Name = company.CompanyName,
CompanyNumber = company.CompanyNumber,
Url = company.Url,
AddressLine = company.Address?.AddressLine,
CompanyTypeCode = company.CompanyType?.CompanyTypeCode,
CompanyTypeName = company.CompanyType?.CompanyTypeName,
CountryCode = company.Country?.CountryCode,
CountryName = company.Country?.CountryName,
ParentCompanyId = company.ParentCompany?.Id,
ParentCompanyName= company.ParentCompany?.CompanyName,
IsVerified = company.IsVerified,
CompletenessScore= company.CompletenessScore,
};
}

Mappers are split into IncludeRelations (query shape) and MapToListItem (projection) so the same includes can be re-used by single-entity upserts and batch rebuilds.

The service then takes the DTO, enriches it with counts, and writes it to the query table:

// src/services/api/app/app.api/Services/QueryViews/CompanyListViewService.cs:62-93
public async Task UpsertCompanyAsync(int companyId, CancellationToken ct = default)
{
var query = _context.Companies!.Where(c => c.CompanyId == companyId);
var company = await CompanyListViewMapper.IncludeRelations(query)
.AsNoTracking()
.FirstOrDefaultAsync(ct);
if (company == null)
{
_logger.LogWarning("Company {CompanyId} not found for upsert", companyId);
return;
}
// Delete existing row. Separate DELETE + INSERT is more robust than MERGE
// when the Id encoding format changes (stale rows get cleaned up).
await _context.Database.ExecuteSqlInterpolatedAsync(
$"DELETE FROM [query].[CompanyList] WHERE CompanyId = {companyId}",
ct);
var dto = CompanyListViewMapper.MapToListItem(company);
await PopulateCountsAsync(dto, companyId, ct);
await UpsertAsync(dto, ct);
}

The DELETE + INSERT pattern (rather than MERGE or UPDATE) has an important side effect: if the Id encoding changes (e.g. due to a prefix-table migration), the delete catches stale rows that no longer match the new encoding. MERGE on Id would leave them orphaned.

Bulk writes go through IBulkUpsertService with a declarative BulkUpsertSpec:

// CompanyListViewService.cs:26-47
private static readonly BulkUpsertSpec<CompanyListItemDto> UpsertSpec = new(
"query", "CompanyList", "Id",
[
new("Id", "NVARCHAR(10) COLLATE Latin1_General_BIN2 NOT NULL", typeof(string), d => d.Id),
new("CompanyId", "INT NOT NULL", typeof(int), d => d.CompanyId),
new("Name", "NVARCHAR(255) NOT NULL", typeof(string), d => d.Name),
new("CompanyNumber", "NVARCHAR(255) NULL", typeof(string), d => d.CompanyNumber),
new("Url", "NVARCHAR(255) NULL", typeof(string), d => d.Url),
// … 13 more columns, matching the DACPAC shape …
]) { BatchSize = 1000 };

BulkUpsertSpec is the single source of truth for the column-to-DTO mapping used in batch writes. When adding a column, this list must be updated alongside the DACPAC definition and the mapper — see Adding a Searchable Column.

The aggregate columns (SchemeCount, OccupierEventCount, InvestmentEventCount, PortfolioCount) come from COUNT(*) queries against join tables. Single-entity upserts do this naively:

// CompanyListViewService.PopulateCountsAsync (sketch)
dto.SchemeCount = await _context.Set<SchemeCompany>()
.CountAsync(sc => sc.CompanyId == companyId, ct);
dto.OccupierEventCount = await _context.Set<OccupierCompany>()
.CountAsync(oc => oc.CompanyId == companyId, ct);
dto.InvestmentEventCount = await _context.Set<InvestmentCompany>()
.CountAsync(ic => ic.CompanyId == companyId, ct);
dto.PortfolioCount = await _context.Set<PortfolioCompany>()
.CountAsync(pc => pc.CompanyId == companyId, ct);

That’s four additional round-trips per company. For single-entity upserts fired from event handlers this is fine — one company, four queries, done. For batch rebuild (see next section), the service replaces this with four GROUP BY CompanyId queries across the entire batch, so the four-queries cost amortises over thousands of rows rather than being paid per row.

RebuildQueryViewsWorker (src/services/job/rebuildqueryviews/) is a Container App Job that re-populates every query table from scratch. It’s used:

  • After deployments that change denormalised columns or mapper logic.
  • After data-load imports that bypass the event pipeline.
  • To recover from event-handler failures (e.g., an outage during batch writes).

The shape:

// RebuildQueryViewsWorker.cs (sketch)
await _companyListViewService .RebuildAsync(null, ct);
await _schemeListViewService .RebuildAsync(null, ct);
await _addressListViewService .RebuildAsync(null, ct);
await _investmentEventListViewSvc .RebuildAsync(null, ct);
await _occupierEventListViewSvc .RebuildAsync(null, ct);
await _portfolioListViewService .RebuildAsync(null, ct);

Each service’s RebuildAsync:

  1. DELETE FROM [query].CompanyList (full truncate if no companyIds subset provided).
  2. Selects all source PKs into a list.
  3. Loads in batches of 1000, applying IncludeRelations + FromSqlRaw("SELECT * FROM [app].Company WITH (NOLOCK)") to avoid blocking ongoing writes.
  4. Maps, populates batch counts (one GROUP BY per aggregate, not one query per row), and bulk upserts.
  5. Logs include/mapping/upsert stopwatches so slow phases are visible in Application Insights.

The worker uses NOLOCK hints deliberately — it’s acceptable for the rebuild to see a near-current snapshot rather than a strictly consistent one, because individual-row upserts from event handlers will correct any divergence the worker misses.

The job’s Container App has a 4-hour replicaTimeout. Full rebuilds against production-scale data typically complete in 20-40 minutes.

Making a column searchable requires edits in four places:

  1. DACPAC table definitionsrc/data/app/query/Tables/CompanyList.sql. Add the column.
  2. DACPAC FTI block — same file. Add the column to CREATE FULLTEXT INDEX ON [query].[CompanyList].
  3. Mappersrc/common/models/Services/QueryViews/CompanyListViewMapper.cs. Populate the new column in MapToListItem.
  4. BulkUpsertSpecsrc/services/api/app/app.api/Services/QueryViews/CompanyListViewService.cs. Add the column to the spec’s column array with its SQL type.

A unit test at src/common/models.unittests/QueryViews/FullTextIndexLocationTests.cs asserts that FTI columns exist on query tables only (not base tables); the test will fail if you accidentally put a new FTI on [app].*.

After merging, the DACPAC publish during deployment picks up the schema change; RebuildQueryViewsWorker should be triggered once to backfill the new column for existing rows (ongoing writes will populate it automatically via the mapper).

Deletions mirror creations. The delete handler (DeleteCompanyCommandHandler) publishes a CompanyDeletedEvent(companyId) after the transaction commits. CompanyDeletedEventHandler removes the row from [query].CompanyList, and — as with the create path — fans out to update counts on related query tables (schemes and portfolios that referenced the company).

Formation uses soft deletes on some entities (e.g. Address.IsDeleted). For those, the event handler updates the row in [query].*List with the new deleted flag rather than removing it; the search service filters deleted rows at query time. See EF Core interceptors for the soft-delete hook.

  1. Search targets [query].*List only. Don’t add CONTAINS() calls against [app].* tables — base tables have no FTIs. A query that compiles and runs at dev scale can silently stop matching in production where the base FTI no longer exists.

  2. Four places to edit when adding a searchable column. DACPAC + FTI + mapper + BulkUpsertSpec. Missing any one produces a silent runtime bug: the column is queryable, but returns nulls, or the column is populated but not indexed, or the column appears in bulk rebuilds but not in single-entity upserts.

  3. Event-handler failures don’t trigger retries. If a handler throws, the query row drifts out of sync with the base table until the next rebuild. Rely on RebuildQueryViewsWorker for recovery, not ad-hoc re-runs.

  4. FTI population is asynchronous. Tests that insert + search in the same transaction will get zero matches unless they wait for FTI population (or manually trigger it). See app.e2etests helpers.

  5. Do not assert ordering of FTS results in E2E tests. Ranking varies with index state, seed data, and term weights. Assert presence via Assert.Contains(results, r => r.Id == "…"), not Assert.Equal("…", results[0].Id). (This is CLAUDE.md pitfall #13.) Ranking-quality regressions belong in a dedicated relevance suite, not per-commit E2Es.

  6. [query].*List is not a queue. It’s a materialised view. Don’t read from it and the base table for the same entity in one operation; you’ll see inconsistent state if an event handler is mid-flight.

  7. Aggregates fan out widely. Creating one scheme with five companies upserts seven query rows. Batch operations (bulk import) should bypass the event pipeline and call RebuildQueryViewsWorker when done rather than firing thousands of individual events.

  8. DELETE + INSERT, not MERGE. The upsert pattern is deliberate — it handles Id-encoding changes and stale rows cleanly. A well-intentioned “optimise to MERGE” PR will reintroduce orphaned rows when the encoding format changes.

  9. WITH (NOLOCK) is used in batch rebuild queries. This is intentional — rebuilds are eventually-consistent anyway. Don’t remove the hint “for correctness” without changing the rebuild strategy (e.g. snapshot isolation).

  10. The completeness-score job writes CompletenessScore directly to [query].*List rows. It doesn’t go through the mapper. If you change how CompletenessScore is computed, check src/services/job/completenessscore/ and keep the two paths aligned.