Django database migration tool: South, explained
If you are using Django for a production-level application, you will need South. Requirements change, and therefore your data model will change over time. South is a great tool, but it is complicated, and you do not want to make a mistake when migrating an application in production. This is a detailed look at how South interacts with your application, so that you can understand and use it better.
South interacts with (reads and writes) four different items in an application. This was the most confusing part for me. The four items are:
- models.py: South reads this to determine your current data model.
- migrations/*.py: South creates a subdirectory inside your app and writes one migration file there for each database migration generation. You can also create these files by hand, but normally you will let South create them for you.
- south_migrationhistory table in your database: when you install South, it creates its own table and uses it to maintain state. Specifically, it records the migration state of the database in this table. South assumes the application schema in the database is consistent with what this table records.
- your application’s schema in your database: South creates and updates the schema for you according to the database migration generation, which is the ultimate purpose of using South.
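To keep the four items straight, here is a minimal sketch that models them as plain Python data. This is only a toy stand-in, not South's real API: the `AppState` class, its field names, and the `pending` helper are all hypothetical names I made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AppState:
    """Toy stand-in for the four items South touches (not South's real API)."""
    models: dict                                          # models.py: model name -> field names
    migration_files: list = field(default_factory=list)   # migrations/*.py, in order
    history: list = field(default_factory=list)           # rows in south_migrationhistory
    db_schema: dict = field(default_factory=dict)         # tables actually in the database

    def pending(self):
        """Migration files that exist on disk but are not recorded as applied."""
        return [name for name in self.migration_files if name not in self.history]


# A migration file has been generated, but migrate has not been run yet.
state = AppState(models={"Book": ["id", "title"]},
                 migration_files=["0001_initial"])
print(state.pending())
```

The point of the sketch is that "what is pending" comes from comparing the migration files against the history table, not from looking at the schema itself.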
Different South commands interact with these items differently:
Let us start with a normal database schema migration, from generation N to generation N+1. There are three steps in the migration:
1. The developer (you) changes the models.py file, updating the application's data model.
2. Run manage.py schemamigration app_name --auto to create a migration file for generation N+1.
3. Run manage.py migrate app_name to update the database schema and migrationhistory table to generation N+1.
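The three steps above can be sketched as a toy simulation. This is not how South is implemented: the function names `schemamigration_auto` and `migrate`, the dict-based "migration files", and the `0002_auto` naming are assumptions for illustration, and the sketch only handles added columns.

```python
def schemamigration_auto(models, migration_files):
    """Step 2: diff models.py against what the existing migration files add up to."""
    known = {}
    for migration in migration_files:            # replay generations 1..N
        for table, columns in migration["add"].items():
            known.setdefault(table, []).extend(columns)
    added = {}
    for table, columns in models.items():
        new_columns = [c for c in columns if c not in known.get(table, [])]
        if new_columns:
            added[table] = new_columns
    name = "%04d_auto" % (len(migration_files) + 1)
    migration_files.append({"name": name, "add": added})
    return name


def migrate(migration_files, history, db_schema):
    """Step 3: apply every migration not yet recorded in the history table."""
    for migration in migration_files:
        if migration["name"] in history:
            continue
        for table, columns in migration["add"].items():
            db_schema.setdefault(table, []).extend(columns)
        history.append(migration["name"])


# Generation N: one migration already written and applied.
migration_files = [{"name": "0001_initial", "add": {"book": ["id", "title"]}}]
history = ["0001_initial"]
db_schema = {"book": ["id", "title"]}

# Step 1: the developer edits models.py (here, adds an "author" column).
models = {"book": ["id", "title", "author"]}

new_name = schemamigration_auto(models, migration_files)   # step 2
migrate(migration_files, history, db_schema)                # step 3
print(new_name, history, db_schema)
```

Notice that step 2 never looks at the database, and step 3 never looks at models.py. Each step only reads and writes its own subset of the four items.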
This diagram shows the inputs and outputs of each step. (Note: step 2 reads all of the migration files from all previous generations):
Let’s add the very first step when using South on a new application. The initial migration obviously does not have any previous South information. So there are still three steps, but the argument passed to schemamigration is a little different:
1. The developer (you) creates the first models.py file, defining the application's data model.
2. Run manage.py schemamigration app_name --initial to create a migration file for generation 1.
3. Run manage.py migrate app_name to create the database schema and bring the migrationhistory table to generation 1.
This diagram shows the inputs and outputs of each step, adding to the previous diagram:
Converting an Application
Converting an application is a little different because the database schema and the models are already in sync. Somehow we need to “trick” South into creating the other two items, the migration file and the migrationhistory table entry. We also need to handle the conversion on the first instance differently from the other instances. For the first instance you will create the migration file, and for all other instances you will use that migration file simply to create the migrationhistory table entry.
On the first app instance:
1. Run manage.py convert_to_south app_name to create the migration file for generation 1, and also to create the migrationhistory entry.
On other app instances:
1. Run manage.py migrate app_name 0001 --fake to create the migrationhistory entry without changing the database schema.
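Why the other instances need --fake can be shown with a small toy model. This is a sketch under my own assumptions, not South's code: the `migrate` function, the tuple-based migration files, and the `RuntimeError` are hypothetical stand-ins for "the database refuses to create a table that already exists".

```python
def migrate(migration_files, history, db_tables, fake=False):
    """Toy migrate: create each pending migration's tables, then record it."""
    for name, tables in migration_files:
        if name in history:
            continue
        if not fake:
            for table in tables:
                if table in db_tables:
                    raise RuntimeError("table %r already exists" % table)
                db_tables.add(table)
        history.append(name)


# A second instance: the schema predates South, so the history is empty.
migration_files = [("0001_initial", ["book", "author"])]
history = []
db_tables = {"book", "author"}

try:
    migrate(migration_files, history, db_tables)          # a real run collides
except RuntimeError as error:
    print("real migrate failed:", error)

# Like: manage.py migrate app_name 0001 --fake
migrate(migration_files, history, db_tables, fake=True)
print(history)
```

With fake=True only the history changes, which is exactly the "trick" needed when the schema is already in place.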
Testing and Trying Migrations Beforehand
There are several different things you can do to “dry run” your migrations.
Schema Migration Generation Test Run
./manage.py schemamigration my_app_name --auto --stdout
The --stdout argument makes South generate the migration code and print it to the console instead of writing it to the migrations directory. You can do this before you generate the actual migration file. Note that schemamigration only creates the next migration script in the migrations directory; it does not touch the database. In the worst case, you can just delete that file if you decide the migration is not needed.
Migration Dry Run
The migration itself can also be tested before you actually run it against a database:
./manage.py migrate myapp --db-dry-run
This runs the migration, except that the database is not changed and the migration is not recorded in the history table in the database.
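A dry run can be pictured like this. It is a toy sketch, not South's implementation: the `migrate` function and its `db_dry_run` parameter are hypothetical names that only mirror the behaviour described above.

```python
def migrate(pending, history, db_schema, db_dry_run=False):
    """Toy migrate: walk through pending migrations, optionally without side effects."""
    walked = []
    for name, new_tables in pending:
        walked.append(name)
        if db_dry_run:
            continue              # nothing is written to the schema or the history
        db_schema.extend(new_tables)
        history.append(name)
    return walked


history, db_schema = ["0001_initial"], ["book"]
pending = [("0002_add_author", ["author"])]

walked = migrate(pending, history, db_schema, db_dry_run=True)
print(walked)
print(history, db_schema)   # both unchanged after the dry run
```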
Another useful command is the list command. It shows all the migrations defined in the migration files, and whether they have been applied (by reading the migrationhistory table entries):
manage.py migrate --list
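Conceptually, the list command just joins two of the four items: the migration files and the migrationhistory entries. Here is a minimal sketch of that join. The `render_list` function is hypothetical, and the "(*)" and "( )" markers are my approximation of the layout, so treat the exact formatting as an assumption.

```python
def render_list(app_name, migration_files, history):
    """Toy 'migrate --list': mark each migration file as applied or not."""
    lines = [" " + app_name]
    for name in migration_files:
        marker = "(*)" if name in history else "( )"
        lines.append("  %s %s" % (marker, name))
    return "\n".join(lines)


listing = render_list("my_app_name",
                      ["0001_initial", "0002_add_author"],
                      ["0001_initial"])
print(listing)
```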
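Sometimes I need to reset a migrated environment. This is particularly useful during initial development of an app. This special command, migrating to the “zero” state, unapplies every migration for the app: it removes the app's migration history from the database and, because each migration is run backwards, it also reverts the schema changes those migrations made. The migration files are left intact.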
manage.py migrate my_app_name zero
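As far as I understand it, migrating to zero runs each applied migration backwards, newest first, so the tables those migrations created are dropped along with the history entries. A toy sketch of that idea follows. The `migrate_to_zero` function and the tuple-based migration files are hypothetical, and a real backwards migration can do more than drop tables.

```python
def migrate_to_zero(migration_files, history, db_tables):
    """Toy 'migrate app zero': run applied migrations backwards, newest first."""
    for name, tables in reversed(migration_files):
        if name not in history:
            continue
        for table in tables:
            db_tables.discard(table)   # the backwards step drops what forwards created
        history.remove(name)


migration_files = [("0001_initial", ["book"]), ("0002_add_author", ["author"])]
history = ["0001_initial", "0002_add_author"]
db_tables = {"book", "author"}

migrate_to_zero(migration_files, history, db_tables)
print(history, db_tables, len(migration_files))
```

After the call, the history and the tables are gone but both migration files are still there, so running migrate again rebuilds everything from generation 1.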
This is my initial sketch for my explanation diagrams: