Timestamps over Booleans

In the previous article I showed how a model can be disabled. One of the things that you probably noticed is that I'm using a timestamp (DateTime) field instead of a Boolean field. This is a trick I learn years ago after working on a large database (>2B rows) for an Enterprise client. It has been useful in so many ways that I put it into every database design I do, yet I've never written about it before and only talked to developers in person about the purpose of the design.

A Boolean is a simple concept: it represents two states, either true or false. Yes or no. On or off. In a many respects, this is a great data type to have when you need to toggle the state of a database record (or, really, any kind of noun) between two states. But when you work in production environments, inevitably data corruption issues will crop up and having a Boolean data type can hinder the investigation. When I say "data corruption" I'm using the most general term used by non-technical people: there is a mistake in the data and nobody is sure how it happened.

When data corruption strikes, it is imperative to investigate both the scope and severity of the issue first. However, what comes next in the investigation is the timeline. If hundreds of thousands of rows in a database have a state toggled to off — perhaps customers were disabled by accident — then you want to quickly know if this is a single abnormal event or a hourly/daily/weekly occurrence.

Switching Boolean data types to timestamps means that you get to have the same two states, but now one of the states (the "on" state) carries more information with it. The information it carries is when it was turned on. The "off" or false state normally represented by a Boolean is simply the NULL value in the timestamp column. It also means you don't need to add extra records into a table to record both the state and when that state happened.

Now, at this point, developers usually perk up and say that this is the purpose of an "last_updated" timestamp field in a database table. Yes, this is true, but in a production setting a customer's record can change for any reason and unless you have a robust audit table — one that records not just when the change happened but also the before/after state of each value changed — then this column is not going to tell you anything during investigations. Another criticism of my approach is that there is no timestamp for recording when the record was put into the "off" or false state. This is true, so one has to use this type of pseudo-Boolean data type carefully, only the right database columns will receive it where they only care about the "on" state.

Using timestamps for Boolean fields has helped me immensely. When performing an investigation, one can develop a timeline with a single SQL statement, like how many customers were mistakenly disabled on an hourly basis:

-- Postgres
SELECT to_char(disabled_at, 'YYYY-MM-DD HH24'), count(1)
FROM customers
WHERE 1=1
  AND disabled_at IS NOT NULL
GROUP BY to_char(disabled_at, 'YYYY-MM-DD HH24');

When you want to use this design concept in a Rails application, you need to add some fields to your model since Rails won't understand that you are using a timestamp data type in place of a Boolean data type:

class Customer < ApplicationRecord
  def self.enabled
    where(disabled_at: nil)
  end

  def enabled?
    disabled_at.nil?
  end

  def enable!
    update!(disabled_at: nil)
  end

  def self.disabled
    where.not(disabled_at: nil)
  end

  def disabled?
    !disabled_at.nil?
  end

  def disable!
    update!(disabled_at: DateTime.now.utc)
  end
end

While the examples in this article only talk about disabling a record, I have used this successfully for other types of states:

  • Confirming a user (confirmed_at)
  • Locking a user after too many incorrect password attempts (locked_at)
  • Verifying a customer's email address (email_verified_at)

What do you think? Is it something that you could use in your designs? I hope this piques your interest and gets you thinking about what help you, as a developer, can provide to production operations people that have to perform investigations.