Boolean Columns Answer Too Few Questions

One of the things that I dislike seeing in an application's database architecture is the use of booleans. They do not convey enough information to any party to be meaningful, other than to ask a question that expects a yes/no answer. Often, questions such as those as a follow-up question that the field cannot answer.

For example, in many databases I see the following fields on a User model:

User
====
id                    primary key
username              the user's username credential
password_encrypted    authenticates the username
... other fields ...
enabled               true if the user is active, false otherwise

When a user has been banned for doing something inappropriate, an application simply sends this query:

UPDATE users
SET enabled = true
WHERE id = 1;   -- username: bob

This looks simple and it makes the application easy to use. You can even query on it. But what if a manager comes along, notices the data, and asks the following question: "When did Bob get disabled?" Now, you'd hope that the model has timestamp to show the last updated date, but even then is the timestamp in that field showing when the user was disabled, or when another field was change (perhaps Bob changed his password).

To get around this issue, I've seen data architects employ an audit table to keep track of every change on the table. Something similar to this:

Audit
=====
id
changed_at              timestamp when the change occurred
field                   the column being changed
previous_value          the value before the change
new_value               the value after the change

And this works and is nice. But now there's two operations (or one, if you use a complex database trigger) to every UPDATE operation on the user table.

Let's employ a different strategy and get rid of the boolean altogether. When I come across a data model that uses booleans, I convert all of them to timestamps. When the value is set, it means the column is active and when it is nil, it says that it is inactive. The audit table is no longer needed (in this contrived case only, there are other legitimate uses for audit tables), and the user model is rewritten like so:

User
====
id
username              the user's username credential
password_encrypted    authenticates the username
... other fields ...
disabled_at           timestamp when the user became inactive

When the application wants to disable a user, it adds a timestamp to the field.

UPDATE users
SET disabled_at = now()
WHERE id = 1;   -- username: bob

and conversely when the user is reactivated the application simply nulls the value.

UPDATE users
SET disabled_at = null
WHERE id = 1;   -- username: bob

For the read-only path, the application can use user.enabled() and if the value is set, return a true result.

This answers two questions:

  1. is the user enabled or disabled?
  2. when was this user disabled?

The question you are probably asking now is why I changed the attribute to a negative (disabled_at) instead of using the positive (enabled_at). I expect all my users to be active in the system (it is the default state of a user) and, in the rarest cases that they are not active, I want to know when they became inactive. I don't much care when they became active again. Your decision to use the negative or positive naming scheme on your columns will depend on the way your application is designed.