Serializing ActiveRecord objects
One neat feature of Rails’ ActiveRecord is the #serialize method, which lets you store an object in the database as YAML. For example, you could have a User and a Profile, where the Profile is a hash containing a bunch of values that you want to manage but that don’t necessarily need their own database table.
class User < ActiveRecord::Base
  serialize :profile
end
u = User.find 1
u.profile = {"url" => "http://www.rubyonrails.com", :nickname =>"Ninja master"}
u.save
u = User.find 1
u.profile.class
=> Hash
The #serialize method takes the given object, serializes it to YAML, stores the YAML in the database, and then deserializes it when you retrieve the record.
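The round trip that #serialize performs can be approximated with Ruby's standard yaml library. This is only a minimal sketch outside of Rails, assuming a profile hash with string keys; it is not the code ActiveRecord runs internally.

```ruby
require "yaml"

# A profile hash like the one above (string keys only, for simplicity).
profile = { "url" => "http://www.rubyonrails.com", "nickname" => "Ninja master" }

# Roughly what happens on save: the object becomes YAML text for a text column.
yaml_text = profile.to_yaml

# Roughly what happens on load: the YAML text becomes a Ruby object again.
restored = YAML.load(yaml_text)

puts restored.class
puts restored == profile
```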
This is all well and good, and I thought I would take advantage of this to help me easily cache some data. I have a system in which I have tasks, and a task belongs to a Service which contains the rate we charge for the task. Now, I really want to be able to store the service name and rate on the task when it’s assigned so that the task won’t be affected when I change my rates in the future.
“Aha!” I thought, “I can just use serialize and store the Service object right on the task!” I created a migration that added a service_data field to my database:
./script/generate migration AddServiceDataToTasks
class AddServiceDataToTasks < ActiveRecord::Migration
  def self.up
    add_column :tasks, :service_data, :text
  end

  def self.down
    remove_column :tasks, :service_data
  end
end
I wrote a quick unit test which I knew would come in handy later.
def test_saves_service_when_task_is_created
  @service = Service.find_by_name "Rails development"
  task = Task.create :name => "Create user registration site", :esthours => 5, :service => @service
  t = Task.find task.id
  assert_not_nil t.service_data
  assert_kind_of Service, t.service_data
end
Then I modified my Task model
class Task < ActiveRecord::Base
  belongs_to :service
  serialize :service_data
  after_create :sync_service_data!

  def sync_service_data!
    self.service_data = self.service
    self.save!
  end
end
That seemed simple enough. However, when I tried it, I got a nasty surprise…
t = Task.find 1
t.service_data
=> nil
No matter what I tried, the service data always came back empty.
Running my unit test proved that something was definitely wrong, as I kept seeing “nil expected to not be nil”.
After searching and experimenting, I concluded that #serialize was just not capable of serializing ActiveRecord objects. To get around this, I changed my code slightly. I knew that #serialize can handle Hashes, so I stored just the attributes hash. Then I redefined the #service_data getter to build a new instance of Service from that hash.
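class Task < ActiveRecord::Base
  belongs_to :service
  serialize :service_data
  after_create :sync_service_data!

  # Store only the service's attributes hash, which #serialize can handle.
  def sync_service_data!
    self.service_data = self.service.attributes
    self.save!
  end

  # Rebuild a Service instance from the stored attributes hash.
  def service_data
    Service.new(self.attributes["service_data"])
  end
end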
A quick run of the tests showed that I was now getting what I wanted.
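The same idea, storing a plain hash of attributes and rebuilding an object from it on read, can be tried outside of Rails. The sketch below is an illustration only: ServiceStandIn is a hypothetical plain-Ruby class standing in for the Service model, and the rate value is made up.

```ruby
require "yaml"

# Hypothetical stand-in for the Service model (not an ActiveRecord class).
class ServiceStandIn
  attr_reader :name, :rate

  # Build an instance from an attributes hash with string keys.
  def initialize(attrs)
    @name = attrs["name"]
    @rate = attrs["rate"]
  end

  # Return the attributes as a plain hash, similar in spirit to #attributes.
  def attributes
    { "name" => name, "rate" => rate }
  end
end

service = ServiceStandIn.new("name" => "Rails development", "rate" => 100)

# Store only the hash, as YAML text.
stored = service.attributes.to_yaml

# On read, turn the stored hash back into an object.
rebuilt = ServiceStandIn.new(YAML.load(stored))

puts rebuilt.class
puts rebuilt.name
```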
Shortly after I discovered this solution, Jon Garvin offered a much cleaner one: don’t use #serialize. He found that #serialize does some strange magical things that often get in the way of the intended results. He proposed that I try:
class Task < ActiveRecord::Base
  belongs_to :service
  after_create :sync_service_data!

  def sync_service_data!
    self.service_data = self.service
    self.save!
  end

  def service_data
    self[:service_data] ? Marshal.load(self[:service_data]) : nil
  end

  def service_data=(service)
    self[:service_data] = Marshal.dump(service)
  end
end
This approach simply defines a setter that manually marshals the object into the database column, and a getter that loads it back out. It works great, and I thank Jon for his quick solution!
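The Marshal round trip behind that getter and setter can be seen in isolation. This is a small sketch rather than the model code itself: ServiceSnapshot is a hypothetical Struct used in place of the real Service, and the rates are invented for the example.

```ruby
# Hypothetical stand-in for the Service model.
ServiceSnapshot = Struct.new(:name, :rate)

service = ServiceSnapshot.new("Rails development", 100)

# What the setter does: turn the object into a binary String for storage.
blob = Marshal.dump(service)

# Changing the live object later does not touch the stored copy.
service.rate = 150

# What the getter does: rebuild the object from the stored String.
restored = Marshal.load(blob)

puts restored.name
puts restored.rate
```

The restored copy keeps the rate it had when it was dumped, which is the behavior I wanted for tasks. Keep in mind that Marshal.load should only be given data you trust, since it can instantiate arbitrary objects.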

That’s cool. I’m looking to do something similar in one of my models, but I have a question: doesn’t that cause two database hits (one for the create, one for the sync)? I’m on a production app and have to worry about such things.
Thanks for the tip.
@Jeff:
Sure does. In this case, I am going to do things like calculate the total bill for a project, and I’d rather do 2 hits on creation rather than end up with bad calculations later on (like if I change a task,
But what if services were stored on a different database server? Then my calculation routine would have to do a remote request for that service info as I loop over the tasks. Ugh!