Skip to content

Improve database management documentation #593

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weโ€™ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 13 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions docs/contributing/database-migrations/_category_.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
label: "Database Migrations"
position: 2
12 changes: 11 additions & 1 deletion docs/contributing/database-migrations/edd.mdx
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
---
sidebar_position: 3
---

import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";

Expand Down Expand Up @@ -130,6 +134,8 @@ actions.
<Tabs>
<TabItem value="first" label="Initial Migration" default>

Located in `util/Migrator/DbScripts`.

```sql
-- Add Column
IF COL_LENGTH('[dbo].[Customer]', 'FirstName') IS NULL
Expand Down Expand Up @@ -177,6 +183,8 @@ END
</TabItem>
<TabItem value="data" label="Transition Migration">

Located in `util/Migrator/DbScripts_transition`.

```sql
UPDATE [dbo].Customer SET
FirstName=FName
Expand All @@ -186,6 +194,8 @@ WHERE FirstName IS NULL
</TabItem>
<TabItem value="second" label="Finalization Migration">

Located in `util/Migrator/DbScripts_finalization`.

```sql
-- Remove Column
IF COL_LENGTH('[dbo].[Customer]', 'FName') IS NOT NULL
Expand Down Expand Up @@ -239,7 +249,7 @@ There are some important constraints to the implementation of the process:

The process to support all of these constraints is a complex one. Below is an image of a state
machine that will hopefully help visualize the process and what it supports. It assumes that all
database changes follow the standards that are laid out in [Migrations](./).
database changes follow the standards that are laid out in [Migrations](./mssql.md).

---

Expand Down
31 changes: 31 additions & 0 deletions docs/contributing/database-migrations/ef.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
---
sidebar_position: 2
---

# Entity Framework

:::info

For instructions on how to apply database migrations, please refer to the
[Getting Started](../../getting-started/server/database/ef/index.mdx#updating-the-database)
documentation.

:::

If you alter the database schema, you must create an EF migration script to ensure that EF databases
keep pace with these changes. Developers must do this and include the migrations with their PR.

To create these scripts, you must first update your data model in `Core/Entities` as desired. This
will be used to generate the migrations for each of our EF targets. Additionally, for table changes
it is strongly recommended to define or update an `IEntityTypeConfiguration<T>` to accurately
represent any constraints needed on the data model.

Once the model is updated, navigate to the `dev` directory in the `server` repo and execute the
`ef_migrate.ps1` PowerShell command. You should provide a name for the migration as the only
parameter:

```bash
pwsh ef_migrate.ps1 [NAME_OF_MIGRATION]
```

This will generate the migrations, which should then be included in your PR.
104 changes: 0 additions & 104 deletions docs/contributing/database-migrations/index.md

This file was deleted.

98 changes: 98 additions & 0 deletions docs/contributing/database-migrations/mssql.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,98 @@
---
sidebar_position: 1
---

# MSSQL

:::info

For instructions on how to apply database migrations, please refer to the
[Getting Started](../../getting-started/server/database/mssql/index.md#updating-the-database)
documentation.

:::

:::tip

We recommend reading [T-SQL Code Style](../code-style/sql.md) since it has a major impact in how we
write migrations.

:::

## SQL database project

We use a
[SDK-style SQL project](https://learn.microsoft.com/en-us/sql/azure-data-studio/extensions/sql-database-project-extension-sdk-style-projects?view=sql-server-ver16)
(`.sqlproj`) to develop the database locally -- this means we have an up-to-date representation of
the database in `src/Sql`, and any modifications need to be represented there as well. SDK-style SQL
projects are available in Visual Studio Code that provides schema comparison and more. You may also
modify the `.sql` files directly with any text editor.

To make a database change, start by modifying the `.sql` files in `src/Sql/dbo`. These changes will
also need to be applied in a migration script. Migration scripts are located in
`util/Migrator/DbScripts`.

You can either generate the migration scripts automatically using the _Schema Comparison_
functionality or by manually writing them. Do note that the automatic method will only take you so
far and it will need to be manually edited to adhere to the code styles.

For added safeguards we have automated linting and validation to ensure the SQL project is always up
to date with the migrations.

The separate database definitions in `src/Sql/.../dbo` serve as a "master" reference for the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

โ›๏ธ No need for scare quotes here.

Suggested change
The separate database definitions in `src/Sql/.../dbo` serve as a "master" reference for the
The separate database definitions in `src/Sql/.../dbo` serve as the source of truth for the

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an analogy to a master copy of something, like an album recording. This content just moved its position by the way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I know. "The source of truth" is both more inclusive than "master reference", and more accurate as a description.

intended and final state of the database at that time. This is crucial because the state of database
definitions at the current moment may differ from when a migration was added in the past. These
definitions act as a lint and validation step to ensure that migrations work as expected, and the
separation helps maintain clarity and accuracy in database schema management and synchronization
processes.

## Modifying the database

In accordance with the tenets of [Evolutionary Database Design](./edd.mdx) every change must be
considered as split into two parts:

1. A backwards-compatible transition migration
2. A non-backwards-compatible final migration

Most changes are entirely backwards-compatible in their final form. If this is the case, only one
phase of changes is required. With the use of beta testing, partial roll-outs,
[feature flags](../feature-flags.md), etc. the often-chosen path is to spread a change across
several major releases with a calculated future state that can perform a "cleanup" migration that is
backwards-compatible but still represents an overall-_incompatible_ change beyond the boundaries of
what we need for individual release safety.

### Backwards compatible migration

1. Modify the source `.sql` files in `src/Sql/dbo`.
2. Write a migration script, and place it in `util/Migrator/DbScripts`. Each script must be prefixed
with the current date.

Tips to ensure backwards compatibility:

- any existing stored procedure accepts the same input parameters and that new parameters have
nullable defaults
- when a column is renamed the existing stored procedures first check (coalesce) the new location
before falling back to the old location
- continued updating of the old data columns since in case of a rollback no data should be lost

### Non-backwards compatible migration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

๐Ÿค” I think we should use affirmative language here. The 'non-backwards compatible migration' is talking about a 'breaking change'.

Suggested change
### Non-backwards compatible migration
### Breaking change migration

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this, but let's make sure to update everywhere, not just this heading

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I support the current term. The point of this documentation is that your change can't go backwards. It may be "breaking" to some, but the intention of the documentation is about direction.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use affirmative language to describe that. The EDD page uses "finalization migration" as its term for this phase.

Suggested change
### Non-backwards compatible migration
### Finalization migration


These changes should be written from the perspective of "all data has been migrated" and any old
stored procedures that were kept around for backwards compatibility should be removed. Any logic for
syncing old and new data should also be removed in this step.

Since the `dbo` schema represents the current state we need to introduce a "future" state that we
will call `dbo_finalization`.

1. Copy the relevant `.sql` files from `src/Sql/dbo` to `src/Sql/dbo_finalization`.
2. Remove the backwards compatibility logic that is no longer needed e.g. dual reads and writes to
columns.
3. Write a new Migration and place it in `src/Migrator/DbScripts_finalization`. Name it
`YYYY-0M-FinalizationMigration.sql`.
- Typically migrations are designed to be run in sequence. However since the migrations in
`DbScripts_finalization` can be run out of order, care must be taken to ensure they remain
compatible with the changes to `DbScripts`. In order to achieve this we only keep a single
migration, which executes all backwards incompatible schema changes.
Comment on lines +94 to +95
Copy link
Member

@eliykat eliykat May 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that DbScripts_finalization only ever contains 1 file that everyone appends to? I'm not necessarily opposed to the requirement but I don't recall seeing it before.

Is the finalization script added at the same time as the backwards-compatible change? So there is only 1 finalization script at a time, it is always moved to DbScripts after the next rc cut date? What if you need a longer timeframe on your finalization?

(not being overly critical - just trying to understand this process given we've never really done it)

Copy link
Member

@Hinton Hinton May 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. It's always been mentioned.

You break up a pure DB non backwards compatible change into two steps. An example would be the column rename example in https://contributing.bitwarden.com/contributing/database-migrations/edd#migrations-1.

Initial migration would:

  • Add a new column
  • Add logic that keeps old and new in sync.

Transition migration would:

  • Migrate all data from old to new column.

Finalization migration (run at next deploy):

  • Removes the old now unused column.
  • Updates sprocs to remove the fallback logic.

Copy link
Member

@Hinton Hinton May 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do note we technically never have to remove the old column but not doing so adds database debt since we have

  • Unnecessary data in our database.
  • More complex sprocs.

This is why we'd like to automate it since relying on human elements here is dangerous since I could easily be OOO for a release or two or forget about cleaning it up.

Copy link
Member

@audreyality audreyality May 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finalization migration (run at next deploy):

This actually needs to be run 2 deploys away from the initial migration to maintain our self-host server guarantees. We can apply it earlier on our infrastructure.

Do note we technically never have to remove the old column...

Unless doing so would cross the the 8kb maximum record length or 1,024 column limit. I'd expect something will cross the 8kb maximum record length if we never remove columns. It'd be much harder to hit the column limit.

Copy link
Member

@eliykat eliykat May 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could've been clearer in my question - my previous understanding was that you might have several scripts in DbScripts_finalization as different teams add different finalization scripts. But this way, you have 1 finalization script at a time, which all teams append to. It's been a while since I read about this process so just trying to get my head around it again.

That said, it is unusual that multiple teams would append to the same file so this might be worth stating more explicitly, e.g.

Add your migration to the file in DbScripts/finalization, or create the file if it doesn't exist. Name it YYYY-0M-FinalizationMigration.sql [what should this date be?]. There should only be 1 finalization file at a time.

Rather than "Write a new migration, put it here, name it this" but then saying "we only keep a single migration".

Or if I've misunderstood please let me know.


Upon execution any finalization scripts will be [automatically moved](./edd.mdx#online-environments)
for proper history.
5 changes: 5 additions & 0 deletions docs/getting-started/server/database/ef/index.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -263,3 +263,8 @@ existing databases if you have not already. If these settings are not present at
With connection strings applied to your projects: ensure your databases are all migrated using
`pwsh server/dev/migrate.ps1 --all`. Then you can run EF tests from the
`test/Infrastructure.IntegrationTest` folder using `dotnet test`.

# Modifying the database

The process for modifying the database is described in
[Migrations](./../../../../contributing/database-migrations/ef.md).
2 changes: 1 addition & 1 deletion docs/getting-started/server/database/mssql/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,4 +23,4 @@ of your database.
## Modifying the database

The process for modifying the database is described in
[Migrations](./../../../../contributing/database-migrations/).
[Migrations](./../../../../contributing/database-migrations/mssql.md).