
Bulk perf-test data seeder (sm seed) for Products, Orders, AuditLogs#118

Draft
antosubash wants to merge 2 commits into main from claude/perf-test-data-seeding-bktmE

@antosubash
Owner

Summary

Adds an sm seed CLI subcommand that populates the configured database with large volumes of realistic test data, so list endpoints, joins, and time-range queries can be benchmarked against populated tables instead of empty or near-empty ones.

Defaults: 1M Products, 100K Orders (~250K items), 500K AuditLogs, all overridable via --count. 1M rows is a reasonable target for the flat Products table on PostgreSQL or SQLite. Orders stay at 100K because each order carries 1–5 items in a composite-PK items table, so a larger order count would push the items table into the multi-millions.
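To sanity-check the item estimate, here is a small back-of-the-envelope sketch. It assumes the per-order item count is drawn uniformly from 1 to 5, which the description does not state, and compares that with the ratio observed in the smoke test from the test plan (1,473 items for 500 orders). The function name is illustrative only.

```python
from fractions import Fraction


def expected_items(orders: int, min_items: int = 1, max_items: int = 5) -> Fraction:
    """Expected item rows if each order draws its item count uniformly (assumption)."""
    mean_per_order = Fraction(min_items + max_items, 2)
    return orders * mean_per_order


# Assumption: uniform 1..5 items per order.
uniform_estimate = expected_items(100_000)

# Ratio taken from the smoke test in the test plan: 1,473 items for 500 orders.
smoke_ratio = Fraction(1473, 500)
smoke_estimate = 100_000 * smoke_ratio

print(int(uniform_estimate))  # 300000
print(int(smoke_estimate))    # 294600
```

Both estimates land a little above the ~250K quoted for the default run. If the real generator weights small orders more heavily, the quoted figure could still hold.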

What it does

  • New sm seed command in the existing Spectre.Console CLI.
  • Backed by a new console project tools/SimpleModule.PerfSeeder/ that references the three target modules directly. The CLI shells out to it so sm itself doesn't need module references.
  • Uses EF Core batched inserts (AutoDetectChangesEnabled=false, ChangeTracker.Clear per batch, single transaction wrapping each module phase). Provider-agnostic — works with the default SQLite config, PostgreSQL, or SQL Server.
  • SQLite fast path: sets PRAGMA journal_mode=WAL, synchronous=NORMAL, temp_store=MEMORY for the seeder's connection.
  • Deterministic data via Bogus with --seed (default 42).
  • Safety rails: --truncate preserves the 10 migration-seed Products (Id ≤ 10); truncating Orders deletes the child items first.
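The batching strategy above can be illustrated with a minimal Python sqlite3 sketch. This is an analogue, not the PR's EF Core code: the table layout, column names, and function names are assumptions made for illustration. It shows the three pragmas, a single transaction around the whole insert phase, fixed-size batches, a seeded generator for deterministic data, and a truncate that keeps rows with Id ≤ 10.

```python
import random
import sqlite3


def seed_products(conn: sqlite3.Connection, count: int,
                  batch_size: int = 5000, seed: int = 42) -> int:
    """Insert `count` synthetic rows in batches inside one transaction."""
    rng = random.Random(seed)  # same seed -> same generated data
    # Pragmas named in the PR; on an in-memory DB journal_mode stays "memory".
    for pragma in ("journal_mode=WAL", "synchronous=NORMAL", "temp_store=MEMORY"):
        conn.execute(f"PRAGMA {pragma}")
    # Hypothetical schema, for illustration only.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS Products "
        "(Id INTEGER PRIMARY KEY, Name TEXT NOT NULL, Price REAL NOT NULL)"
    )
    with conn:  # one transaction wraps the whole phase; commits on success
        for start in range(0, count, batch_size):
            size = min(batch_size, count - start)
            batch = [
                (f"Product {start + i}", round(rng.uniform(1, 500), 2))
                for i in range(size)
            ]
            conn.executemany(
                "INSERT INTO Products (Name, Price) VALUES (?, ?)", batch
            )
    return conn.execute("SELECT COUNT(*) FROM Products").fetchone()[0]


def truncate_products(conn: sqlite3.Connection, keep_up_to_id: int = 10) -> int:
    """Delete seeded rows but keep the first rows (Id <= keep_up_to_id)."""
    with conn:
        conn.execute("DELETE FROM Products WHERE Id > ?", (keep_up_to_id,))
    return conn.execute("SELECT COUNT(*) FROM Products").fetchone()[0]


conn = sqlite3.connect(":memory:")
print(seed_products(conn, 12_000))  # 12000
print(truncate_products(conn))      # 10
```

Building each batch as a plain list and handing it to executemany plays roughly the role that disabling change detection and clearing the change tracker play in EF Core: no per-row bookkeeping survives past the batch.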

Usage

# All three modules with defaults
sm seed

# Just products, custom count, truncate first
sm seed --module products --count 2000000 --truncate

# Fresh dev DB (calls EnsureCreated when migrations aren't applied)
sm seed --create-schema

# Override connection (e.g. against a local Postgres)
sm seed --connection "Host=localhost;Database=perf;Username=postgres;Password=x"

Options: --module (products|orders|auditlogs|all), --count, --connection, --provider, --batch-size (default 5000), --seed (default 42), --truncate, --create-schema.

Test plan

  • sm seed --help prints the full option set
  • Smoke test: 500 Products + 500 Orders + 500 AuditLogs against fresh SQLite — all succeed, Orders wrote 1,473 items (1–5 per order as designed)
  • Throughput check: 50K Products in Release against SQLite → ~12K rows/s (~85s projected for 1M)
  • Full-scale run: sm seed --truncate against a developer machine — verify 1M Products, 100K Orders, 500K AuditLogs land cleanly
  • End-to-end: start the host after a seed, hit GET /api/products and spot-check response times / EXPLAIN plans
  • Postgres sanity: run sm seed --provider PostgreSql --connection ... against a local PG (blocked on the pre-existing migration drift for non-SQLite providers — orthogonal to this PR)
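The throughput projection in the list above is simple division; a short sketch makes the rounding explicit. The ~12K rows/s figure comes from the 50K-row measurement, and the helper name is illustrative.

```python
def projected_seconds(rows: int, rows_per_second: float) -> float:
    """Linear projection: assumes throughput stays constant as the table grows."""
    return rows / rows_per_second


measured_rate = 12_000  # rows/s, rounded figure from the 50K Products run

print(round(projected_seconds(50_000, measured_rate), 1))     # 4.2
print(round(projected_seconds(1_000_000, measured_rate), 1))  # 83.3
```

The ~85s quoted for 1M rows corresponds to a rate slightly under 12K rows/s, which is consistent with the rounded measurement. The projection assumes insert cost does not grow with table size, so the full-scale run remains the real check.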

Notes / scope

  • The CLI is the user-facing entry point; the underlying SimpleModule.PerfSeeder console app can also be invoked directly (dotnet run --project tools/SimpleModule.PerfSeeder -- ...).
  • Did not add bulk seeders for Users / Permissions / OpenIddict — those need Identity / auth setup and are unsuited to hot-path bulk insertion.
  • No changes to existing modules, migrations, runtime code, or CI. The added project is excluded from dotnet test by convention (no Tests suffix).

Generated by Claude Code

claude added 2 commits April 23, 2026 07:36
New `sm seed` CLI command (and underlying `tools/SimpleModule.PerfSeeder`
console app) for populating the database with large volumes of realistic
data so list endpoints, joins, and time-range queries can be benchmarked
against realistic table sizes. Defaults to 1M products, 100K orders
(~250K items), 500K audit entries; all overridable via --count.

Uses EF Core batched inserts with AutoDetectChangesEnabled=false and
ChangeTracker.Clear per batch; wraps each module's insert phase in a
single transaction. Enables WAL + relaxed sync pragmas on SQLite for
bulk throughput. Provider-agnostic: works with the default SQLite setup
out of the box, also supports PostgreSQL and SQL Server.

Smoke-tested against fresh SQLite at ~12K rows/s for Products in Release
(projects to ~85s for 1M). Data is deterministic via --seed (default 42).

The CLI command shells out to the PerfSeeder project so `sm` doesn't
need to pull the Products/Orders/AuditLogs modules into its dependency
graph. --create-schema calls EnsureCreated when migrations aren't
available (e.g., a fresh dev DB).

Audit entries are emitted by the audit interceptor when entities change,
so seeding them directly would produce inconsistent data (decoupled from
any actual operation, Module/Path fields guessed). Drop the AuditLogs
target and scope the seeder to Products + Orders. Audit coverage now
accumulates naturally from real CRUD traffic post-seed.

Also fix multi-module --create-schema: EnsureCreated skips table
creation when any tables exist in the DB, so seeding Products then
Orders into the same file failed on the second module. Switch to the
IRelationalDatabaseCreator.CreateTables pattern (same as
SimpleModuleWebApplicationFactory.EnsureTablesCreated), swallowing
duplicate-table errors so each module context creates only its missing
tables.
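The create-only-what-is-missing pattern from this commit can be sketched in Python with sqlite3. This is an analogue of the idea, not the EF Core IRelationalDatabaseCreator code: the module-to-table mapping, the table definitions, and the items key columns are hypothetical. Each module issues its own CREATE TABLE statements and ignores only the "already exists" error, so a second module can add its tables to a database that already holds the first module's.

```python
import sqlite3

# Hypothetical per-module DDL, for illustration only.
MODULE_TABLES = {
    "products": [
        "CREATE TABLE Products (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL)",
    ],
    "orders": [
        "CREATE TABLE Orders (Id INTEGER PRIMARY KEY)",
        "CREATE TABLE OrderItems (OrderId INTEGER, ProductId INTEGER, "
        "PRIMARY KEY (OrderId, ProductId))",
    ],
}


def create_missing_tables(conn: sqlite3.Connection, statements: list[str]) -> int:
    """Run each CREATE TABLE, swallowing only duplicate-table errors."""
    created = 0
    for ddl in statements:
        try:
            conn.execute(ddl)
            created += 1
        except sqlite3.OperationalError as exc:
            # SQLite reports "table X already exists"; re-raise anything else.
            if "already exists" not in str(exc):
                raise
    return created


conn = sqlite3.connect(":memory:")
print(create_missing_tables(conn, MODULE_TABLES["products"]))  # 1
print(create_missing_tables(conn, MODULE_TABLES["orders"]))    # 2
print(create_missing_tables(conn, MODULE_TABLES["orders"]))    # 0
```

Matching on the error message is provider-specific, and an implementation spanning several providers would need a check per provider. The sketch only shows why per-table creation succeeds where an all-or-nothing check on an existing database does not.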
@cloudflare-workers-and-pages

Deploying simplemodule-website with Cloudflare Pages

Latest commit: 73044d2
Status: ✅  Deploy successful!
Preview URL: https://5afefc63.simplemodule-website.pages.dev
Branch Preview URL: https://claude-perf-test-data-seedin.simplemodule-website.pages.dev
