LABS
Fact-based state in Rails

Maybe you learned this from experience or you joined a Rich Hickey-esque immutability craze, but one way or another you know state machines are complicated. They couple together a variety of meanings and easily grow out of control as your application gets more and more interesting. What can be done?

Let’s examine a class from RailsCasts #392 that declares an Order’s state with the aasm gem and its DSL syntax.

class Order < ActiveRecord::Base
  include AASM

  scope :open_orders, -> { where(aasm_state: "open") }
  attr_accessor :invalid_payment

  aasm do
    state :incomplete, initial: true
    state :open
    state :canceled
    state :shipped

    event :purchase, before: :process_purchase do
      transitions from: :incomplete, to: :open, guard: :valid_payment?
    end

    event :cancel do
      transitions from: :open, to: :canceled
    end

    event :resume do
      transitions from: :canceled, to: :open
    end

    event :ship do
      transitions from: :open, to: :shipped
    end
  end

  def process_purchase
    # process order ...
  end

  def valid_payment?
    !invalid_payment
  end
end

Terse and syntax-sugary, but what’s completely missing from the code? A strategy to address:

  • Correcting and understanding unexpected states. How exactly did an object get into its state? And if the state is unexpected, what were its previous states? The situation is much worse in production, where an inappropriate state goes by the name of data corruption and is among the most costly failures, with only a slight chance of recovery.
  • Change. A new business requirement splits one old state into two, then groups two other states into one. What if you can now capture more data and introduce states such as ‘packaged’ or ‘returned’? Or simply rename a state? These are surprisingly nerve-racking changes to the system, necessitating, at worst, a live update of entire database tables every time they happen.

State as a function of facts

To address both of the above issues with mutable state, turn the state machine inside-out and emphasize the state’s complement: transitions. Transitions have a time-based nature: they are ordered, and each happened at a particular time. So they may also be known as events. Let’s look at the above state machine to gather the states:

incomplete, open, canceled, shipped

And emphasize the transition verbs:

purchase, cancel, resume, ship

Given a sequence of events, we can always produce a state. So let’s gather events as nouns:

Purchase, Cancellation, Resumption, Shipment

class OrderEvent < ActiveRecord::Base
  belongs_to :order

  class Purchase < OrderEvent; end
  class Shipment < OrderEvent; end
  class Resumption < OrderEvent; end
  class Cancellation < OrderEvent; end
end

class Order < ActiveRecord::Base
  has_many :events, class_name: "OrderEvent"

  def state
    # Play through the events via state.transition(event).
    events.reduce(OrderState::Incomplete.new, &:transition)
  end

end

class OrderState
  def transition(event)
    # Returning `self` would be another choice,
    # but failing early can be safer.
    Invalid.new
  end

  class Incomplete < OrderState
    def transition(event)
      case event
      when OrderEvent::Purchase; Open.new
      else; Invalid.new
      end
    end
  end

  class Open < OrderState
    def transition(event)
      case event
      when OrderEvent::Cancellation; Cancelled.new
      when OrderEvent::Shipment;     Shipped.new
      else; Invalid.new
      end
    end
  end

  class Cancelled < OrderState
    def transition(event)
      case event
      when OrderEvent::Resumption; Open.new
      else; Invalid.new
      end
    end
  end

  class Shipped < OrderState
  end
  class Invalid < OrderState
  end
end

Verbose? Yes, but this system of three major classes and inheritance is a much more flexible and error-proof solution to the state machine problem. Why does it work? Because it decouples incoming data (events) from our business interpretation of it. Events that happened, happened. It is the code of the system that makes sense of the past and guides the entry of new data. The state is thus replayed from the history of past events:

o = Order.new
o.events << OrderEvent::Purchase.new
o.events << OrderEvent::Cancellation.new
o.events << OrderEvent::Resumption.new
o.events << OrderEvent::Cancellation.new
o.events << OrderEvent::Resumption.new
o.events << OrderEvent::Shipment.new
o.state # => #<OrderState::Shipped:0x007facfa8a20d8>

No data migrations

You can introduce, remove and change states by changing only code. This is a significant improvement because each data migration is risky, difficult to debug, and only gets worse with larger volumes of production data.
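As a rough illustration of a code-only change, the sketch below replays stored event logs through two versions of the business rules. It is plain Ruby, with symbols and hashes standing in for the OrderEvent records and OrderState classes above. The names `RULES_V1`, `RULES_V2` and `replay`, and the `:packaging` event with its `:packaged` state, are hypothetical and only loosely follow the ‘packaged’ example mentioned earlier.

```ruby
# Plain-Ruby stand-in (an assumption, not the article's classes):
# events and states are symbols, and each version of the rules is a
# table of { state => { event => next_state } }.
RULES_V1 = {
  incomplete: { purchase: :open },
  open:       { cancellation: :canceled, shipment: :shipped },
  canceled:   { resumption: :open }
}.freeze

# Version 2 inserts a hypothetical :packaged state between :open and :shipped.
RULES_V2 = RULES_V1.merge(
  open:     { cancellation: :canceled, packaging: :packaged },
  packaged: { shipment: :shipped }
).freeze

# Replay an event log under the given rules, starting from :incomplete.
# Any move the rules do not allow yields :invalid, and stays :invalid.
def replay(events, rules)
  events.reduce(:incomplete) do |state, event|
    rules.fetch(state, {}).fetch(event, :invalid)
  end
end

old_log = [:purchase, :cancellation, :resumption]
new_log = [:purchase, :packaging, :shipment]

replay(old_log, RULES_V1) # => :open
replay(old_log, RULES_V2) # => :open
replay(new_log, RULES_V2) # => :shipped
```

The stored events are never rewritten; only the table that interprets them changes. One caveat of this sketch: old histories are reinterpreted too, so a log that shipped straight from open would replay as :invalid under `RULES_V2` unless that rule is kept.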

Debugging unexpected states

There’s now a full history of how an order got into a certain state, so an unexpected state can be understood and debugged with a much larger hope of recovery.

And some would consider it a benefit to forgo the DSL and metaprogramming introduced by AASM or similar gems.
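One way to use that history when debugging is to replay the events step by step and record the state after each one. The sketch below is plain Ruby, with symbols standing in for the event and state classes above; `TRANSITIONS`, `state_trace` and `first_invalid_step` are hypothetical names chosen for illustration.

```ruby
# Plain-Ruby stand-in for the state classes above (an assumption):
# { state => { event => next_state } }.
TRANSITIONS = {
  incomplete: { purchase: :open },
  open:       { cancellation: :canceled, shipment: :shipped },
  canceled:   { resumption: :open }
}.freeze

# Returns [[event, state_after_event], ...] for the whole history.
def state_trace(events, initial = :incomplete)
  state = initial
  events.map do |event|
    state = TRANSITIONS.fetch(state, {}).fetch(event, :invalid)
    [event, state]
  end
end

# Index of the first event that produced :invalid, or nil for a clean history.
def first_invalid_step(events)
  state_trace(events).index { |_event, state| state == :invalid }
end

history = [:purchase, :shipment, :cancellation]

state_trace(history)
# => [[:purchase, :open], [:shipment, :shipped], [:cancellation, :invalid]]
first_invalid_step(history) # => 2
```

Here the trace shows that the order was already shipped when the cancellation arrived, which is the kind of answer a single mutable state column cannot give.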

What do you think?

Concerned that storing a little extra state will hurt runtime performance? Unconvinced of the simplicity of the “state = f(past)” premise? Have you taken this approach further? Let us know in the comments.

Comments
  1. Theo Mills says:

    I like this concept quite a bit, but I always needs to filter a set of records based on their current state (pending, complete, etc).

    Would persisting the current state once it’s determined break your model, e.g. in a before_save callback?

  2. Theo Mills says:

    Oops, when I said “break your model” I really meant “break your technique”.

  3. Serguei Filimonov says:

    Hi Theo,

    I think it’s quite ok to memoize the state computation into a column. But it relies on team’s understanding of the above technique, so that we don’t wind up at square 1 with other things writing into that column. I feel “before_save” and similar callbacks become a lot safer given the immutability above. So it’s probably a good candidate for invalidating the cached cell and writing a new value.

    Thanks for the suggestion, I like it.
