February 2007 Archives

Complex SQL Queries with Rails

February 18, 2007 1:39 PM

Rails is a powerful framework - it abstracts many of the underlying database concepts so that you never have to think about them. Of course, sometimes it abstracts away too many.

Frequently I run into a case where I want more data in an ActiveRecord than is actually available from my SQL schema. For instance, I often want to include data from another table through a JOIN or sub-SELECT, or maybe I want to be able to ORDER BY data in a table that's in a belongs_to relationship with the current model.

To illustrate why this is useful, let's assume that we have a table, Pet, which has a PetType. Our schema looks something like this:

pet_types Table:

idname
1Dog
2Platypus
3Cat

pets Table:

idnamepet_type_id
1Santa's Little Helper1
2Snowball3
3Brian1
4Fritz3
5Felix3

Given this schema, there are two things I want to be able to do:

  1. Get a list of pets, ordered by their type. But I want to order on pet_types.name, not pet_types.id. (That is, I want them in order of "Cats", "Dogs", "Platypi", not whatever order their ids are.)
  2. Get a list of pet types, including a count of the number of pets which are of that type.

Admittedly, in this contrived example, I could do these both in rails without making any changes to the model and without too much pain. But with more advanced queries, it's much easier to let your database engine do the work. And rails is relatively graceful about allowing you to do this -- you do need to know a little SQL, but you only need to change a single method in your model.

Ordering by a foreign column
Let's say you wanted to display a list of pets ordered by type. We'd like our database engine to do this for us, since this is one of the things it's optimized at. Since ActiveRecord::Base's various find_* methods all ultimately call find_every(), we need only override that method to add a more complex SQL query. We just need to make one change in our model to support this:

In models/pet.rb:

def Pet.find_every(args = { })
  sql = "SELECT pets.*, pet_types.name AS pet_type_name " +
      "FROM pets, pet_types " +
      "WHERE pets.pet_type_id = pet_types.ID"

  sql += " AND " + sanitize_sql(args[:conditions])
      if(args[:conditions])
  sql += " ORDER BY " + sanitize_sql(args[:order])
      if(args[:order])
  sql += " LIMIT " + sanitize_sql(args[:limit].to_s)
      if(args[:limit])
  sql += " OFFSET " + sanitize_sql(args[:offset].to_s)
      if(args[:offset])

  find_by_sql(sql)
end

Now we can say Pets.find(:all, :order => 'pet_types.name') to sort by pet type. This will give us the following:

idnamepet_type_idpet_type_name
2Snowball1Cat
4Fritz1Cat
5Felix1Cat
1Santa's Little Helper1Dog
3Brian1Dog

Including data from another table
Let's say that we wanted to provide a list of pet types and inc lude the number of pets of that type. We could certainly load each pet type, then do a Pet.count() for each type, but it may be advantageous to let the database engine do this for us.

In models/pet_type.rb:

def PetTypes.find_every(args = { })
  sql = "SELECT pet_types.*, " +
      "(SELECT COUNT(pets.id) FROM pets " +
      "WHERE pets.pet_type_id = pet_type.id) AS pet_type_count " +
      "FROM pet_types "

  sql += " AND " + sanitize_sql(args[:conditions])
      if(args[:conditions])
  sql += " ORDER BY " + sanitize_sql(args[:order])
      if(args[:order])
  sql += " LIMIT " + sanitize_sql(args[:limit].to_s)
      if(args[:limit])
  sql += " OFFSET " + sanitize_sql(args[:offset].to_s)
      if(args[:offset])

  find_by_sql(sql)
end

Now, we get a new column - pet_type_count which we can read (but not write to, of course) as if it were part of the table. We can use it just for display, or we can sort on it just like above. PetType.find(:all, :order => 'pet_type_count') will produce:

idnamepet_type_count
2Platypus0
1Dog2
3Cat3

While these are trivial examples, this sort of overriding can become exceptionally useful when you have a lot of data being computed by the SQL server. If you have a model with several belongs_to or has_many relationships, it's a lot easier to let the SQL server compute these sort of details itself, rather than trying to do it in each method of your model.

Ultimately, you can put any arbitrary SQL into your find_every() method, and rails will continue to work as usual, provided you follow these three simple rules:

  1. Don't exclude columns, only include new ones. Rails examines your table's columns the first time you use a model, and it will expect those columns to be in the results of any query.
  2. Fully qualify conflicting column names. That is, say: :order => 'pets.name' instead of just 'name'.
  3. Ensure that you handle the arguments to find_every(). That is, remember to handle :conditions, :order, :limit and :offset. Otherwise you run the risk of breaking other finds.
Edward Thomson is a Software Engineer at Teamprise, where he develops cross-platform client solutions for Microsoft Team Foundation Server, with an emphasis on Macintosh compatibility and IDE integration.