Designing Uniqueness

Not enough people give careful thought about ensuring uniqueness across systems when they design a solution. This goes doubly so for off-the-shelf (OTS) software packages. And if the design isn't well thought out, it is going to be difficult to ensure uniqueness when doing any data migrations.

Granted, data migrations are an art more than they are a science. There are always edge cases and invariably something goes wrong that nobody thought about. This is why testing a data migration multiple times should be built into any project plan (and given an extremely large buffer in the budget).

I'm currently working with a client that needs to migrate data between one OTS software package to another OTS software package. I am running into a major blocking issue simply due to the fact that nobody thinks about the impact of database primary keys during a migration.

During the original customization, users were allowed to login to the product using an SSO-like system, which offloads the authentication to another server, then redirects the user back to the product. When the product sees the authenticated user for the first time, it creates a record in its database. This is fairly standard behaviour and customization for an OTS product. The product will still need a way to reference the user between disparate systems or else it will continue thinking it is the first time it has seen this user.

Now here's where it goes wrong. Everything works great because nobody thought about what will happen come time to migrate to another product (this is called future-proofing, or forward-thinking). In this case, the developer used the unique id from the authentication system and set it as the primary key in the OTS product's users table. This is wrong, but I'll get to that in a bit.

Typically, when you want to add a reference into an OTS product, you create a new field in the users table that will hold the reference to the external system. Then, during authentication or any kind of user lookup, the system uses this newly created field as a reference point.

By using the user table's primary key field as the external reference, the system can no longer be migrated to another system. The problem resides in that primary keys are expected to be valid only within the context of its own database, and migration tools will NEVER migrate primary keys into a new system. This would be a terrible idea on many levels, because:

  • the new system could have a different data type for primary key (UUIDs)
  • the new system could already have data in its users table, causing primary key conflicts.
  • the new system may not allow programmatic access to its primary key 1

The result of this design mistake is that the data migration to the new system strips off the primary keys when inserting user records. I am left with no way to reference a user in the external systems.

The lesson today from this database architect is: DO NOT USE PRIMARY KEYS FOR ANYTHING OTHER THAN LOCAL DATABASE USE.

  1. In other words, you simply create the record in the database using the fields you have, and the return value from the database call is the primary key that was saved into the database.