Skip to content

Diff db-docs and ps/migrations #594

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weโ€™ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: ps/migrations
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 0 additions & 1 deletion custom-words.txt
Original file line number Diff line number Diff line change
Expand Up @@ -71,7 +71,6 @@ SDLC
Serilog
signtool
signup
sprocs
sqlcmd
struct
structs
Expand Down
7 changes: 5 additions & 2 deletions docs/contributing/database-migrations/edd.mdx
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
---
sidebar_position: 3
---

import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";

Expand Down Expand Up @@ -265,8 +269,7 @@ will not be run until the start of the next deploy when they are moved into `DbS
After this deploy, to prep for the next release, all migrations in `DbScripts_transition` are moved
to `DbScripts` and then all migrations in `DbScripts_finalization` are moved to `DbScripts`,
conserving their execution order for a clean install. For the current branching strategy, PRs will
be open against `main` when `rc` is cut to prep for this release. This PR automation will also
handle renaming the migration file and updating any reference of `[dbo_finalization]` to `[dbo]`.
be open against `main` when `rc` is cut to prep for this release.

The next deploy will pick up the newly added migrations in `DbScripts` and set the previously
repeatable _Transition_ migrations to no longer be repeatable, execute the _Finalization_
Expand Down
11 changes: 10 additions & 1 deletion docs/contributing/database-migrations/ef.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,11 +4,20 @@ sidebar_position: 2

# Entity Framework

:::info

For instructions on how to apply database migrations, please refer to the
[Getting Started](../../getting-started/server/database/ef/index.mdx) documentation.

:::

If you alter the database schema, you must create an EF migration script to ensure that EF databases
keep pace with these changes. Developers must do this and include the migrations with their PR.

To create these scripts, you must first update your data model in `Core/Entities` as desired. This
will be used to generate the migrations for each of our EF targets.
will be used to generate the migrations for each of our EF targets. Additionally, for table changes
it is strongly recommended to define or update an `IEntityTypeConfiguration<T>` to accurately
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems out of place, it's not directly related to migrations but general EF best practices.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where would it go then? I feel like the prior existing sentence is in the same camp, but both need to be modified for migrations so I put it here.

represent any constraints needed on the data model.

Once the model is updated, navigate to the `dev` directory in the `server` repo and execute the
`ef_migrate.ps1` PowerShell command. You should provide a name for the migration as the only
Expand Down
125 changes: 50 additions & 75 deletions docs/contributing/database-migrations/mssql.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,97 +11,72 @@ For instructions on how to apply database migrations, please refer to the

:::

## SQL database project
:::tip

We use a [SDK-style SQL project][MSBuildSQL] (`sqlproj`) to develop the database locally. This means
we have an up-to-date representation of the database in `src/Sql`, and any modifications needs to be
represented there as well. SDK-style SQL projects are still in preview the tooling is not yet
available in Visual Studio. However it is available in [Visual Studio Code][vscode] and [Azure Data
Studio][azureds] with the [SQL Database Projects][SDPE] extension, which provides schema comparison
and more. You may also modify the `.sql` files directly with any text editor.
We recommend reading [T-SQL Code Style](../code-style/sql.md) since it has a major impact in how we
write migrations.

To make a database change, start by modifying the `.sql` files in `src/Sql/dbo`. These changes will
also need to be applied in a migration script. Migration scripts are located in
`util/Migrator/DbScripts`.
:::

You can either generate the migration scripts automatically using the _Schema Comparison_
functionality or by manually writing them. Do note that the automatic method will only take you so
far and it will need to be manually edited to adhere to the code styles.
## SQL database project

For added safe guards we have automated linting and validation to ensure the SQL project is always
up to date with the migrations.
Comment on lines -16 to -32
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer my section it explains the SQL project with links to references and how they relate to DbScripts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grabbed most of it with some removals due to what exists today.

The separate database definitions in `src/Sql/.../dbo` serve as a "master" reference for the
intended and final state of the database at that time. This is crucial because the state of database
definitions at the current moment may differ from when a migration was added in the past. These
definitions act as a lint and validation step to ensure that migrations work as expected, and the
separation helps maintain clarity and accuracy in database schema management and synchronization
processes.

## Modifying the database
Additionally, a
[SQL database project](https://learn.microsoft.com/en-us/azure-data-studio/extensions/sql-database-project-extension-sdk-style-projects)
is in place; however, instead of using the auto-generated migrations from
[DAC](https://learn.microsoft.com/en-us/sql/relational-databases/data-tier-applications/data-tier-applications?view=sql-server-ver16),
we manually write migrations. This approach is chosen to enhance performance and prevent accidental
data loss, which is why we have both a `sqlproj` and standalone migrations.

Since we follow [Evolutionary Database Design _(EDD)_](./edd.mdx), any migration that modifies
existing columns most likely needs to be split into at least two parts: a backwards compatible
transition phase, and a non-backwards compatible phase.
## Modifying the database

### Best practices
In accordance with the tenets of [Evolutionary Database Design](./edd.mdx) every change must be
considered as split into two parts:

When writing a migration script there are a couple of best practices we follow. Please check the
[T-SQL Code Style][code-style-sql] for more details. But the most important aspect is ensuring the
script can be re-run on the database multiple times without producing any errors or data loss.
1. A backwards-compatible transition migration
2. A non-backwards-compatible final migration

### Backwards compatible
It is likely that a change does not require a non-backwards-compatible end phase e.g. all changes
may be backwards-compatible in their final form; in that case, only one phase of changes is
Comment on lines +45 to +46
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One issue with our existing DB model is that we've gravitated towards making every field optional. I don't see that as a. best practice and one of the dangers of almost always only using backwards compatible migrations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neither do I see it as one.

required. With the use of beta testing, partial roll-outs, [feature flags](../feature-flags.md),
etc. the often-chosen path is to spread a change across several major releases with a calculated
future state that can perform a "cleanup" migration that is backwards-compatible but still
represents an overall-_incompatible_ change beyond the boundaries of what we need for individual
release safety.

Since we follow _EDD_ the first migration needs to retain backwards compatibility with existing
production code.
### Backwards compatible migration

1. Modify the source `.sql` files in `src/Sql/dbo`.
2. Write a migration script, and place it in `util/Migrator/DbScripts`. Each script _must_ be
prefixed with the current date.

Please take care to ensure any existing _Stored Procedure_ accepts the same input parameters which
ensures backwards compatibility. In the case a column is renamed, moved care needs to be taken to
ensure the existing sprocs first checks the new location before falling back to the old location. We
also need to ensure we continue updating the old data columns, since in case a rollback is necessary
no data should be lost.
2. Write a migration script, and place it in `util/Migrator/DbScripts`. Each script must be prefixed
with the current date.

### Data migration
Please take care to ensure:

We now need to write a script that migrates any data from the old location to the new locations.
This script should ideally be written in a way that supports batching, i.e. execute for X number of
rows at a time. This helps avoiding locking the database. When running the scripts against the
server please keep running it until it affects `0 rows`.
- any existing stored procedure accepts the same input parameters and that new parameters have
nullable defaults
- when a column is renamed the existing stored procedures first check (coalesce) the new location
before falling back to the old location
- continued updating of the old data columns since in case of a rollback no data should be lost

### Non-backwards compatible
### Non-backwards compatible migration

These changes should be written from the perspective of "all data has been migrated". And any old
_Stored Procedures_ that were kept around for backwards compatibility should be removed. Any logic
for syncing old and new data should also be removed in this step.
These changes should be written from the perspective of "all data has been migrated" and any old
stored procedures that were kept around for backwards compatibility should be removed. Any logic for
syncing old and new data should also be removed in this step.

Since `Sql/dbo` represents the current state we need to introduce a "future" state which we will
call `dbo_finalization`.

1. Copy the relevant `.sql` files from `src/Sql/dbo` to `src/Sql/dbo_finalization`.
2. Remove the backwards compatibility which is no longer needed.
3. Write a new Migration and place it in `src/Migrator/DbScripts_finalization`, name it
1. Remove the backwards compatibility that is no longer needed.
2. Write a new Migration and place it in `src/Migrator/DbScripts_finalization`. Name it
`YYYY-0M-FinalizationMigration.sql`.
- Typically migrations are designed to be run in sequence. However since the migrations in
DbScripts_future can be run out of order, care must be taken to ensure they remain compatible
with the changes to DbScripts. In order to achieve this we only keep a single migration, which
executes all backwards incompatible schema changes.

### [Not Yet Implemented] Manual MSSQL migrations

There may be a need for a migration to be run outside of our normal update process. These types of
migrations should be saved for very exceptional purposes. One such reason could be an Index rebuild.

1. Write a new Migration with a prefixed current date and place it in
`src/Migrator/DbScripts_manual`
2. After it has been run against our Cloud environments and we are satisfied with the outcome,
create a PR to move it to `DbScripts`. This will enable it to be run by our Migrator processes in
self-host and clean installs of both cloud and self-host environments
Comment on lines -86 to -95
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we determined we will never do this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say yes. It's problematic for self-host. Things like this just aren't added to source control if they have no auditable change. A rebuild could be done, but that isn't necessary in the source as it is transient.


[repository]:
https://docs.microsoft.com/en-us/dotnet/architecture/microservices/microservice-ddd-cqrs-patterns/infrastructure-persistence-layer-design
[dapper]: https://github.com/DapperLib/Dapper
[code-style-sql]: ../code-style/index.md#t-sql
[MSBuildSQL]:
https://learn.microsoft.com/en-us/sql/azure-data-studio/extensions/sql-database-project-extension-sdk-style-projects?view=sql-server-ver16
[vscode]: https://code.visualstudio.com/
[azureds]:
https://docs.microsoft.com/en-us/sql/azure-data-studio/download-azure-data-studio?view=sql-server-ver16
[SDPE]:
https://docs.microsoft.com/en-us/sql/azure-data-studio/extensions/sql-database-project-extension?view=sql-server-ver16
`DbScripts_finalization` can be run out of order, care must be taken to ensure they remain
compatible with the changes to `DbScripts`. In order to achieve this we only keep a single
migration, which executes all backwards incompatible schema changes.

Upon execution any finalization scripts will be [automatically moved](./edd.mdx#online-environments)
for proper history.
15 changes: 10 additions & 5 deletions docs/getting-started/server/database/ef/index.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -40,9 +40,9 @@ each.

Our EF implementations currently support Postgres, MySQL, and SQLite3.

## Setting Up EF Databases
## Creating the database

The workflow here is broadly the same as with the normal MSSQL implementation: set up the docker
The workflow here is broadly the same as with the normal MSSQL implementation: set up the Docker
container, configure user secrets, and run migrations against their relating databases in
chronological order.

Expand Down Expand Up @@ -139,7 +139,7 @@ that the changes take effect.

:::

### Migrations
## Updating the database

<Tabs
groupId="provider"
Expand Down Expand Up @@ -185,7 +185,7 @@ You can also run migrations for all database providers at once using
pwsh migrate.ps1 -all
```

### Optional: Verify
### Verifying changes

If you would like to verify that everything worked correctly:

Expand All @@ -194,7 +194,7 @@ If you would like to verify that everything worked correctly:
- Note: this requires a configured MSSQL database. You may also need to set up other EF providers
for tests to pass.

## Testing EF Changes
## Testing changes

In your `server/dev/secrets.json` file find or add this block of secrets in the root of the json
structure:
Expand Down Expand Up @@ -230,3 +230,8 @@ existing databases if you have not already. If these settings are not present at
With connection strings applied to your projects: ensure your databases are all migrated using
`pwsh server/dev/migrate.ps1 --all`. Then you can run EF tests from the
`test/Infrastructure.IntegrationTest` folder using `dotnet test`.

## Modifying the database

The process for modifying the database is described in
[Migrations](./../../../../contributing/database-migrations/ef.md).