Files
codit/docs/database-schema.md

753 lines
24 KiB
Markdown

# Database Schema
This document describes the Codit relational database schema defined by `backend/migrations/001_init.sql`.
The current baseline is a single clean initialization migration. Future schema changes should be added as new migration files, for example `002_some_change.sql`, while keeping this document updated.
## Conventions
- Most durable user-facing IDs are stored as `public_id` in the database and exposed as `id` in JSON APIs.
- Internal integer `id` columns are local database primary keys and should not be exposed unless a handler explicitly needs them.
- `created_at`, `updated_at`, `started_at`, `ended_at`, and similar integer timestamp columns store Unix seconds unless the schema explicitly uses `TIMESTAMP`.
- Boolean values are stored as `INTEGER` with `0`/`1` in SQLite.
- JSON arrays and objects are stored as text columns with names ending in `_json`.
- Secrets and private keys are currently stored in database text columns. Treat database backups as sensitive material.
## Relationship Diagrams
The diagrams below are intentionally split by domain. They show the main logical relationships; a few relationships are implemented through `public_id` string columns rather than strict integer foreign keys.
### Identity And Authorization
```mermaid
erDiagram
users ||--o{ sessions : owns
users ||--o| user_totp : configures
users ||--o{ api_keys : owns
users ||--o{ user_group_members : joins
user_groups ||--o{ user_group_members : contains
service_principals ||--o{ principal_api_keys : owns
service_principals ||--o{ cert_principal_bindings : maps
service_principals ||--o{ principal_project_roles : assigned
users ||..o{ subject_permissions : subject
user_groups ||..o{ subject_permissions : subject
service_principals ||..o{ subject_permissions : subject
```
### Projects And Repository Content
```mermaid
erDiagram
users ||--o{ projects : creates
users ||--o{ project_members : member
projects ||--o{ project_members : has
projects ||--o{ project_role_bindings : grants
projects ||--o{ repos : owns
projects ||--o{ project_repos : attaches
repos ||--o{ project_repos : visible_as
projects ||--o{ issues : tracks
issues ||--o{ issue_comments : has
projects ||--o{ wiki_pages : documents
projects ||--o{ uploads : stores
```
### PKI And ACME
```mermaid
erDiagram
pki_cas ||--o{ pki_cas : parent
pki_cas ||--o{ pki_certs : issues
pki_cas ||--o{ pki_client_profiles : backs
pki_client_profiles ||--o{ pki_client_profile_targets : targets
pki_client_profiles ||--o{ pki_client_issuances : records
pki_certs ||..o{ pki_client_issuances : cert_snapshot
acme_profiles ||--o{ acme_orders : creates
pki_certs ||..o{ acme_orders : result_cert
```
### SSH Certificates And Web SSH Broker
```mermaid
erDiagram
ssh_user_cas ||--o{ ssh_user_ca_issuances : signs
users ||..o{ ssh_user_ca_issuances : issuer
ssh_principal_grants ||--o{ ssh_principal_grant_targets : targets
ssh_principal_grants ||..o{ ssh_user_ca_issuances : authorizes
ssh_servers ||--o{ ssh_server_host_keys : pins
ssh_servers ||--o{ ssh_server_group_members : grouped
ssh_server_groups ||--o{ ssh_server_group_members : contains
ssh_servers ||..o{ ssh_access_profiles : exposes
ssh_server_groups ||..o{ ssh_access_profiles : exposes
ssh_secrets ||..o{ ssh_access_profiles : credential
ssh_credentials ||..o{ ssh_access_profiles : imported_key
ssh_user_cas ||..o{ ssh_access_profiles : managed_cert_ca
ssh_access_profiles ||--o{ ssh_access_profile_targets : targets
ssh_access_profiles ||--o{ ssh_sessions : starts
ssh_servers ||--o{ ssh_sessions : target
users ||--o{ ssh_sessions : opens
```
### TLS And RPM Mirror State
```mermaid
erDiagram
tls_auth_policies ||..o{ tls_listeners : referenced_by
pki_certs ||..o{ tls_listeners : server_cert
pki_cas ||..o{ tls_listeners : client_ca
repos ||--o{ rpm_repo_dirs : contains
rpm_repo_dirs ||--o{ rpm_mirror_runs : records
```
## Core Identity And Auth
### `users`
Stores human users.
| Column | Purpose |
| --- | --- |
| `id` | Internal numeric primary key. |
| `public_id` | Stable external user ID. |
| `username` | Unique login name. |
| `display_name` | Human-readable display name. |
| `email` | Email address. |
| `password_hash` | Password hash for database-backed users. |
| `is_admin` | Site-admin flag. |
| `auth_source` | Source of the user account, for example `db`, `ldap`, or OIDC-related sources. |
| `created_at`, `updated_at` | Account timestamps. |
| `disabled` | Disables login/use when set. |
| `totp_required` | Direct per-user TOTP enforcement flag. Group enforcement can also make TOTP effectively required. |
Indexes:
- `idx_users_disabled` on `disabled`.
### `sessions`
Stores browser login sessions created by `/api/login` or OIDC login.
| Column | Purpose |
| --- | --- |
| `id` | Session row primary key. |
| `user_id` | Internal user FK. |
| `token` | Session token stored in the `codit_session` cookie. |
| `expires_at` | Session expiration time. |
| `created_at` | Creation time. |
| `totp_verified` | Whether the interactive session has satisfied the current TOTP requirement. |
Notes:
- `Logout` deletes the row for the current cookie token.
- Expired rows are cleaned opportunistically when issuing a new session.
### `user_totp`
Stores per-user TOTP configuration.
| Column | Purpose |
| --- | --- |
| `user_id` | Internal user FK and primary key. |
| `secret_encrypted` | Enabled TOTP secret encrypted with the local secret key. |
| `pending_secret_encrypted` | Pending setup secret before verification. |
| `enabled` | Whether TOTP is enabled for the user. |
| `created_at`, `updated_at` | Lifecycle timestamps. |
### `api_keys`
Stores user-owned API keys.
| Column | Purpose |
| --- | --- |
| `public_id` | External API key ID. |
| `user_id` | Internal user owner FK. |
| `name` | User-visible key name. |
| `token_hash` | SHA-256 hash of the token. The raw token is shown only once. |
| `token_prefix` | Short prefix for display and identification. |
| `created_at`, `last_used_at`, `expires_at` | Lifecycle timestamps. `expires_at = 0` means no expiration. |
| `disabled` | Disables the key. |
Indexes:
- Unique `idx_api_keys_hash` on `token_hash`.
- Lookup indexes on `user_id`, `expires_at`, and `disabled`.
## Users, Groups, And Authorization
### `user_groups`
Stores named user groups.
| Column | Purpose |
| --- | --- |
| `public_id` | External group ID. |
| `name` | Unique group name. |
| `description` | Group description. |
| `disabled` | Marks the group inactive. |
| `totp_required` | Requires TOTP for interactive members. |
| `scope` | Group scope. `explicit` uses `user_group_members`; `all_users` applies to all users. |
| `created_at`, `updated_at` | Lifecycle timestamps. |
Indexes:
- Unique partial index `user_groups_one_all_users` ensures at most one `all_users` group.
### `user_group_members`
Join table between groups and users.
| Column | Purpose |
| --- | --- |
| `group_id` | Internal group FK. |
| `user_id` | Internal user FK. |
| `created_at` | Membership creation time. |
Primary key: `(group_id, user_id)`.
### `subject_permissions`
Stores global permissions assigned directly to users, groups, or service principals.
| Column | Purpose |
| --- | --- |
| `permission` | Permission string. |
| `subject_type` | `user`, `group`, or `principal`. |
| `subject_public_id` | External ID of the subject. |
| `created_at`, `updated_at` | Lifecycle timestamps. |
Unique key: `(permission, subject_type, subject_public_id)`.
### `service_principals`
Stores non-human principals for automation and certificate-based access.
| Column | Purpose |
| --- | --- |
| `public_id` | External principal ID. |
| `name` | Unique principal name. |
| `description` | Description. |
| `disabled` | Disables the principal. |
| `is_admin` | Grants site-admin behavior to the principal. |
| `created_at`, `updated_at` | Lifecycle timestamps. |
### `principal_api_keys`
Stores API keys owned by service principals.
| Column | Purpose |
| --- | --- |
| `public_id` | External API key ID. |
| `principal_id` | Internal service principal FK. |
| `name` | Key name. |
| `token_hash`, `token_prefix` | Stored token hash and display prefix. |
| `created_at`, `last_used_at`, `expires_at` | Lifecycle timestamps. |
| `disabled` | Disables the key. |
### `cert_principal_bindings`
Maps certificate fingerprints to service principals for certificate-authenticated access.
| Column | Purpose |
| --- | --- |
| `fingerprint` | Certificate fingerprint primary key. |
| `principal_id` | Internal service principal FK. |
| `enabled` | Enables/disables the binding. |
| `created_at`, `updated_at` | Lifecycle timestamps. |
### `principal_project_roles`
Assigns service principals to projects.
| Column | Purpose |
| --- | --- |
| `principal_id` | Internal service principal FK. |
| `project_id` | Internal project FK. |
| `role` | Project role. |
| `created_at` | Assignment timestamp. |
Primary key: `(principal_id, project_id)`.
## Projects And Content
### `projects`
Stores projects.
| Column | Purpose |
| --- | --- |
| `public_id` | External project ID. |
| `slug` | Unique project slug. |
| `name` | Display name. |
| `description` | Description. |
| `home_page` | Default project landing page. |
| `created_by`, `updated_by` | Internal user FKs. |
| `created_at`, `updated_at` | Timestamp columns. |
| `created_at_unix`, `updated_at_unix` | Unix timestamp mirrors used by newer code paths. |
Indexes:
- `idx_projects_name` on `name`.
### `project_members`
Assigns users directly to projects.
| Column | Purpose |
| --- | --- |
| `project_id` | Internal project FK. |
| `user_id` | Internal user FK. |
| `role` | Project role. |
| `created_at` | Assignment timestamp. |
Primary key: `(project_id, user_id)`.
### `project_role_bindings`
Assigns project roles to subjects such as groups or principals.
| Column | Purpose |
| --- | --- |
| `project_id` | Internal project FK. |
| `subject_type` | Subject kind. |
| `subject_public_id` | External subject ID. |
| `role` | Project role. |
| `created_at`, `updated_at` | Lifecycle timestamps. |
Unique key: `(project_id, subject_type, subject_public_id)`.
### `repos`
Stores repositories belonging to projects.
| Column | Purpose |
| --- | --- |
| `public_id` | External repository ID. |
| `project_id` | Internal owning project FK. |
| `name` | Repository name. |
| `type` | Repository type: usually `git`, `rpm`, or `docker`. |
| `path` | Storage path. |
| `created_by` | Internal user FK. |
| `created_at` | Creation time. |
Unique index:
- `idx_repos_project_name_type` on `(project_id, name, type)`.
### `project_repos`
Join table for project-to-repository visibility/attachment, including foreign repository attachment.
| Column | Purpose |
| --- | --- |
| `project_id` | Internal project FK. |
| `repo_id` | Internal repository FK. |
| `created_at` | Attachment timestamp. |
Primary key: `(project_id, repo_id)`.
### `issues` and `issue_comments`
Issue tracking tables.
`issues` stores project issues with `title`, `body`, `status`, creator, optional assignee, and timestamps.
`issue_comments` stores comments linked to an issue, with body, creator, and creation timestamp.
### `wiki_pages`
Stores project wiki pages.
Important fields:
- `project_id`
- `title`
- `slug`
- `body`
- `created_by`
- `updated_at`
Unique key: `(project_id, slug)`.
### `uploads`
Stores project file upload metadata.
Important fields:
- `project_id`
- `filename`
- `content_type`
- `size`
- `storage_path`
- `created_by`
- `created_at`
The file bytes live in the configured upload storage path, not in this table.
## PKI And Client Certificates
### `pki_cas`
Stores X.509 certificate authorities.
| Column | Purpose |
| --- | --- |
| `public_id` | External CA ID. |
| `name` | Unique CA name. |
| `parent_ca_id` | Optional internal parent CA FK. |
| `is_root` | Root CA flag. |
| `cert_pem` | CA certificate PEM. |
| `key_pem` | CA private key PEM. Sensitive. |
| `serial_counter` | Next serial counter. |
| `status` | CA status. |
| `created_at`, `updated_at` | Lifecycle timestamps. |
### `pki_certs`
Stores issued or imported X.509 certificates.
| Column | Purpose |
| --- | --- |
| `public_id` | External cert ID. |
| `ca_id` | Optional internal CA FK. |
| `created_by_kind`, `created_by_subject_id`, `created_by_subject_name` | Issuer attribution. |
| `issuance_source` | Source such as admin, client profile, ACME, or import. |
| `serial_hex` | Certificate serial number. |
| `common_name`, `san_dns`, `san_ips` | Subject/SAN data. |
| `is_ca` | Whether the cert is a CA cert. |
| `cert_pem`, `key_pem` | Certificate and private key PEM. Sensitive. |
| `not_before`, `not_after` | Validity period. |
| `status`, `revoked_at`, `revocation_reason` | Revocation/status data. |
| `created_at` | Creation time. |
Unique key: `(ca_id, serial_hex)`.
### `pki_client_profiles`
Defines self-service client certificate issuance profiles.
Important fields:
- `ca_id`: issuing CA.
- `subject_organization`, `san_uri_prefix`: subject defaults.
- `allow_server_auth`: whether issued certs can include server auth.
- `authz_permissions`, `authz_scope`: authorization data embedded/associated with issued certs.
- `default_valid_seconds`, `max_valid_seconds`: issuance validity bounds.
- `enabled`: availability flag.
### `pki_client_profile_targets`
Maps client certificate profiles to users or groups.
Primary key: `(profile_id, target_type, target_public_id)`.
### `pki_client_issuances`
Records self-service client certificate issuance history.
Important fields:
- `cert_public_id`: issued cert ID.
- `user_public_id`, `username`: requesting user.
- `profile_public_id`, `profile_name`: profile snapshot.
- `serial_hex`, `common_name`, `san_uri`: issued cert identity data.
- `authz_permissions`, `authz_scope`: authorization snapshot.
- `not_before`, `not_after`, `created_at`: validity/history timestamps.
## ACME
### `acme_profiles`
Stores ACME account/profile configuration.
Important fields:
- `directory_url`, `email`
- `account_url`, `account_key_pem`
- `solver_type`: `manual` or `acme_dns`
- `acme_dns_*`: ACME DNS API configuration
- `enabled`, `last_error`
### `acme_orders`
Tracks ACME orders and DNS challenges.
Important fields:
- `profile_public_id`
- `common_name`, `san_dns`
- `order_url`, `finalize_url`
- `status`, `last_error`
- `cert_public_id`
- `challenges_json`
- `csr_pem`, `key_pem`
## SSH Certificates And Broker
### `ssh_user_cas`
Stores SSH user certificate authorities.
| Column | Purpose |
| --- | --- |
| `public_id` | External SSH CA ID. |
| `name` | Unique CA name. |
| `algorithm` | Key algorithm. |
| `public_key` | SSH CA public key. |
| `private_key_pem` | SSH CA private key. Sensitive. |
| `fingerprint` | CA public key fingerprint. |
| `serial_counter` | Next certificate serial. |
| `enabled` | Availability flag. |
| `allow_user_sign` | Allows self-service user signing. |
| `max_user_valid_seconds` | Self-service max validity. |
### `ssh_user_ca_issuances`
Records SSH user certificate signing history.
Important fields:
- `ca_public_id`: SSH user CA ID.
- `issuer_user_public_id`, `issuer_username`, `issuer_kind`: issuer attribution.
- `source_public_key`, `source_public_key_fingerprint`: key that was signed.
- `certificate`: issued SSH certificate.
- `key_id`, `principals_json`: certificate identity data.
- `grant_ids_json`, `grant_names_json`: principal grants used to authorize signing.
- `valid_after`, `valid_before`, `serial`: certificate validity and serial.
- `remote_addr`, `user_agent`, `created_at`: audit metadata.
### `ssh_principal_grants`
Defines grant-only SSH principal authorization.
| Column | Purpose |
| --- | --- |
| `public_id` | External grant ID. |
| `principal` | Representative/legacy principal name. |
| `principals_json` | Principal names placed into the signed SSH certificate. |
| `valid_after`, `valid_before` | Grant validity window. `0` means unbounded. |
| `max_cert_valid_seconds` | Maximum issued cert validity. |
| `max_uses`, `used_count` | Use limit and usage count. |
| `disabled` | Disables the grant. |
| `reason` | Administrative note. |
| `created_by_user_public_id` | Creator user ID. |
| `last_used_at` | Last successful use. |
### `ssh_principal_grant_targets`
Maps SSH principal grants to users or groups.
Primary key: `(grant_public_id, target_type, target_public_id)`.
### `ssh_servers`
Stores SSH server definitions.
Important fields:
- `name`, `host`, `port`
- `description`, `tags_json`
- `enabled`
- `host_key_policy`: `strict` requires an existing pinned host key; `trust_on_first_use` allows the first connection to pin the observed host key.
- `created_by_kind`, `created_by_subject_id`, `created_by_subject_name`
`created_by_kind = 'user'` indicates a personal server; admin-created servers are shared through access profiles and targets.
When `host_key_policy = 'trust_on_first_use'`, the first successful SSH connection writes the observed key into `ssh_server_host_keys`; later connections use the pinned-key check.
### `ssh_server_host_keys`
Stores pinned/known SSH server host keys.
Important fields:
- `server_public_id`
- `algorithm`
- `public_key`
- `fingerprint`
Unique key: `(server_public_id, fingerprint)`.
### `ssh_server_groups`
Stores named groups of SSH servers.
Important fields:
- `name`, `description`
- `enabled`
- `created_by_kind`, `created_by_subject_id`, `created_by_subject_name`
- `created_at`, `updated_at`
Server groups let a single SSH access profile target multiple servers. Users choose the concrete server when connecting through a group-based profile.
### `ssh_server_group_members`
Maps SSH server groups to member servers.
Primary key: `(group_public_id, server_public_id)`.
### `ssh_credentials`
Stores reusable SSH credential metadata for imported private keys.
Important fields:
- `name`, `description`
- `type`: currently `private_key`.
- `secret_public_id`: encrypted secret reference in `ssh_secrets`.
- `public_key`, `fingerprint`: non-secret public key metadata shown in the UI/API.
- `enabled`: disabled credentials must not be used for new SSH connections.
- `owner_scope`: `admin` or `user`.
- `owner_user_public_id`: owner for user-owned credentials.
- creator attribution fields.
The private key PEM itself is not stored in this table and is not returned by the API. It is encrypted into the referenced `ssh_secrets` row.
### `ssh_secrets`
Stores SSH authentication secrets used by access profiles.
Important fields:
- `kind`: secret kind, for example private key or password.
- `payload`: secret payload such as private key PEM. Values generated by the API are encrypted and prefixed with `enc:v1:`.
- `password`: password/passphrase payload. Stored password auth may use this field.
- `metadata_json`: future metadata extension point.
- creator attribution fields.
This table is sensitive. Encrypted SSH secret payloads use an AES-GCM key stored outside the database at `<data-dir>/ssh-broker.secret`. The file is generated automatically as 32 random bytes with mode `0600` if it does not exist. Backup and restore must keep this file with the database; losing it makes encrypted SSH secrets undecryptable. Anyone with both the database and `ssh-broker.secret` can decrypt stored SSH private keys/passwords.
### `ssh_access_profiles`
Defines how a user may connect to an SSH server.
Important fields:
- `server_target_type`: `server` or `group`.
- `server_public_id`: target server when `server_target_type = 'server'`.
- `server_group_public_id`: target group when `server_target_type = 'group'`.
- `remote_username`: remote account name.
- `auth_method`: `managed_ssh_cert`, `prompted_password`, `stored_password`, or `stored_private_key`.
- `second_factor_mode`: `none` or `prompted_totp`.
- `owner_scope`: `admin_shared` or `user`.
- `owner_user_public_id`: owner for self-service profiles.
- `secret_public_id`: stored credential reference.
- `ssh_credential_public_id`: imported SSH credential reference when using a stored private key credential.
- `auth_public_key`, `auth_public_key_fingerprint`: public-key metadata.
- `ssh_user_ca_public_id`: SSH CA used for managed cert auth.
- `ssh_principal_grant_ids_json`: grants used for managed cert auth.
- `default_valid_seconds`, `max_valid_seconds`: cert/session validity bounds.
- `enabled`, `allow_user_edit`: availability/editing controls.
Stored private-key and stored-password profiles ultimately reference `ssh_secrets`. Imported SSH credential private keys are not returned by the API; only public key metadata and fingerprints are exposed.
### `ssh_access_profile_targets`
Maps admin-shared SSH access profiles to users or groups.
Unique key: `(profile_public_id, target_type, target_public_id)`.
### `ssh_sessions`
Stores SSH web terminal session history.
Important fields:
- `profile_public_id`, `server_public_id`, `user_public_id`: session ownership/context.
- `username`, `remote_username`: local and remote usernames.
- `auth_method`, `second_factor_mode`: auth method snapshot.
- `host`, `port`: target snapshot.
- `status`: session state.
- `host_key_fingerprint`: connected host key fingerprint.
- `requested_term`, `requested_cols`, `requested_rows`: terminal request parameters.
- `started_at`, `connected_at`, `ended_at`: lifecycle timestamps.
- `remote_addr`, `user_agent`: requester metadata.
- `error`: terminal/connection error text.
Transcripts are stored on disk under the configured data directory, not in this table.
## TLS
### `tls_listeners`
Stores additional HTTP/HTTPS listener configuration.
Important fields:
- `http_addrs`, `https_addrs`: serialized address lists.
- `endpoint_policies_json`: service-to-policy mapping.
- `tls_server_cert_source`, `tls_cert_file`, `tls_key_file`, `tls_pki_server_cert_id`: server cert configuration.
- `tls_client_auth`, `tls_client_ca_file`, `tls_pki_client_ca_id`: client cert authentication configuration.
- `tls_min_version`: minimum TLS version.
### `tls_auth_policies`
Defines read/write access policies for TLS-authenticated endpoints.
Important fields:
- `read_mode`, `write_mode`: access mode choices.
- `require_principal_for_write`: requires mapped service principal for writes.
- `allowed_pki_client_cert_ids`, `allowed_cert_fingerprints`: allowed cert constraints.
- `required_permissions`, `permission_match`, `required_scope`: authorization requirements.
## RPM Mirror State
### `rpm_repo_dirs`
Stores per-repository RPM directory and mirror configuration.
Important fields:
- `repo_id`, `path`: primary key identifying the RPM directory.
- `mode`: `local` or `mirror`.
- `allow_delete`: whether deletion is allowed from UI/API.
- `remote_url`, `connect_host`, `host_header`: mirror source/network settings.
- `tls_server_name`, `tls_insecure_skip_verify`: TLS settings for mirror sync.
- `sync_interval_sec`, `sync_enabled`: scheduling settings.
- `dirty`, `next_sync_at`, `sync_running`: sync state.
- `sync_status`, `sync_error`, `sync_step`: current status.
- `sync_total`, `sync_done`, `sync_failed`, `sync_deleted`: progress counters.
- `last_sync_*`, `last_synced_revision`: sync history.
### `rpm_mirror_runs`
Stores individual RPM mirror run history.
Important fields:
- `repo_id`, `path`: target RPM directory.
- `started_at`, `finished_at`
- `status`, `step`, `error`
- `total`, `done`, `failed`, `deleted`
- `revision`
Retention cleanup is handled by RPM mirror code.
## Application Settings
### `app_settings`
Key/value table for site configuration.
| Column | Purpose |
| --- | --- |
| `key` | Setting key. |
| `value` | Setting value. |
| `updated_at` | Last update timestamp. |
## Index Summary
The schema defines indexes for common lookup paths:
- Authentication: `api_keys`, `principal_api_keys`, `sessions` primary/unique token lookup.
- Project access: projects by name, repos by project, role bindings by project/subject.
- PKI: CAs by parent, certs by CA, client profiles by CA, client issuances by user/cert.
- SSH: CAs by fingerprint, issuances by CA/issuer, principal grants by principal/activity, servers by name/host, access profiles by server/owner/enabled, sessions by user/profile/status.
- RPM: mirror run relationships are enforced through repo/path foreign keys.
## Migration Notes
- `backend/migrations/001_init.sql` is the current clean baseline.
- The migration runner still supports future numbered migrations. Add new changes as `002_*.sql`, `003_*.sql`, and so on.
- Do not rewrite `001_init.sql` after a release that has existing deployed databases unless intentionally rebasing development data.
- When adding a table or column, update this document and any API/OpenAPI schemas that expose the data.