Commands

Reference for the igem-server db command group. All commands take a --db-uri option or read from IGEM_DB_URI / DATABASE_URL / .igem.toml — see Access for the resolution order.

Quick table

Command

Purpose

Read/Write

Modes accepted

db create

Create schema + seed initial data + stamp at head

write

SQL only

db upgrade

Apply pending migrations + re-seed idempotently

write

SQL only

db status

Report current vs head Alembic revision

read

SQL only

db stamp-head

Baseline an unversioned DB (insert alembic_version row, no DDL)

write

SQL only

db migrate-dry-run

Print SQL of pending migrations without executing

read

SQL only

db info

Show connection info, backend, schema version

read

SQL + Snapshot

db export

Export the database as a Parquet snapshot

read (source) + write (output)

SQL → Snapshot

db snapshot-download

Download a Parquet snapshot from an HTTP endpoint

write (output)

n/a

db snapshot-nlp

Build the pre-compiled NLP automaton inside a snapshot

write (snapshot dir)

n/a

Schema-management commands refuse to run against a snapshot URI — schema is a SQL-mode concept.

db create

Creates a new SQL database: ensures the database exists at the URI level (for PostgreSQL, runs CREATE DATABASE against the admin connection if needed), bootstraps the schema via SQLAlchemy metadata, seeds the canonical reference data, and stamps the new DB at the current Alembic head.

igem-server --db-uri postgresql://dev:dev@localhost/igem db create

Options:

  • --overwrite — skip the “database already exists” check and run the bootstrap flow anyway. Tables are created via CREATE TABLE ... IF NOT EXISTS (idempotent for new tables) and seeds re-run idempotently.

Output ends with Stamping schema revision... followed by Stamp complete. — the new DB is at head from creation.

db upgrade

Applies pending Alembic migrations and re-runs the idempotent seed inserts. This is the command your deploy pipeline calls.

igem-server db upgrade

Behaviour by DB state (see Alembic — Lifecycle states):

  • At head → no DDL runs, seeds re-applied idempotently

  • Behind head → migrations applied in order, then seeds

  • Not versioned → refuses with a clear error pointing to db stamp-head

Migrations run inside a transaction (in PostgreSQL — DDL is transactional), so a partial failure rolls back automatically.

db status

Reports current vs head revision. Read-only; safe anywhere a SQL connection is available.

igem-server db status
# Alembic status
#   DB URI         : postgresql://...
#   Repo head      : 7a8b9c0d1e2f
#   DB revision    : 7a8b9c0d1e2f
#   Versioned?     : True
#   → up-to-date.

Use it as a sanity check before/after a deploy, or to confirm a dev DB is at the same revision as production before generating a new migration.

db stamp-head

Records the current head revision in alembic_version without running any DDL. Use to baseline a database whose schema already matches the running package but lacks a version row.

igem-server db stamp-head

Options:

  • --force — clear the existing alembic_version row and re-stamp. Useful only when the existing row points at an unknown / corrupted revision. Have a backup ready.

Idempotent: if the DB is already at head, the command short-circuits with Already at head; nothing to do.

db migrate-dry-run

Prints the SQL that db upgrade would emit, without executing it.

igem-server db migrate-dry-run
# Dry-run (SQL only — no execution)
#   From: 7a8b9c0d1e2f
#   To  : head
# CREATE TABLE ...
# ALTER TABLE ... ADD COLUMN ...

Options:

  • --target <rev> — render SQL targeting a specific revision (e.g. for rollback review). Default is head.

Uses Alembic’s offline mode, which generates SQL from the start of the migration graph; treat the output as a preview of what migrations would run, not a literal statement-by-statement plan against the live DB.

Common pattern in deploy review:

igem-server db migrate-dry-run > pending.sql
git diff <prev-deploy-sql> pending.sql   # if you keep them under version control

db info

Shows connection-level information about the active backend. Read-only; works against both SQL and snapshot URIs.

igem-server db info
# Backend     : sql
# Read-only   : False
# URI         : postgresql:///igem?host=/var/run/postgresql
# Engine      : postgresql
# Host        : <unknown>
# Database    : igem
# Connected   : True

For a snapshot URI, the output also reports the snapshot version, schema version, table count, and export date pulled from the manifest.

db export

Exports the current SQL database as a versioned Parquet snapshot directory. See Snapshots for the full workflow.

igem-server db export \
    --output /snapshots/$(date +%Y-%m-%d)/ \
    --compression zstd

Key options:

  • --output <dir> (required) — output directory; created if missing.

  • --tables a,b,c — restrict to a subset of tables.

  • --exclude a,b — explicitly skip tables.

  • --chunksize 50000 — rows per Parquet write batch.

  • --compression zstd|snappy|gzip|none (default zstd).

  • --version <label> — snapshot label (default: today’s date).

  • --overwrite — write into a non-empty directory.

Output is a directory containing one .parquet per table plus a manifest.json with per-table sha256 / row count and global metadata (schema_version, schema_revision, snapshot_version, exported_at).

db snapshot-download

Downloads a snapshot directory from an HTTP endpoint that serves a manifest.json. Files are fetched in parallel and verified against the manifest’s sha256 hashes.

igem-server db snapshot-download \
    --url https://geneexposure.org/downloads/latest/ \
    --output /local/snapshot/ \
    --workers 4

Key options:

  • --url <base> — base URL serving the snapshot directory and its manifest.json at the root. Default: https://geneexposure.org/downloads/latest/.

  • --output <dir> (required) — local target directory.

  • --include-nlp — also download the pre-compiled NLP automaton cache (~3.5 GB; off by default).

  • --workers N — concurrent downloads (default 4).

  • --overwrite — re-download files even if they exist locally with matching sha256.

After completion, point IGEM-Server at the output directory with --db-uri <output> to consume the snapshot.

db snapshot-nlp

Builds the pre-compiled NLP automaton inside an existing snapshot directory. Reads the parquet files, builds the AliasDictionary + Aho-Corasick automaton, and serialises it to <snapshot>/nlp/. The automaton is then loaded instantly by the embedded IGEM-Server on container start, skipping the ~70s rebuild.

igem-server db snapshot-nlp /path/to/snapshot/

Options:

  • --overwrite — rebuild even if <snapshot>/nlp/ already exists.

Idempotent without --overwrite: refuses if a cache is already present.