Archive for December, 2007

The (insane) Importance Of Database Indices

Thursday, December 20th, 2007

One of the great things about Rails (especially for beginners) is that you need to know very little about database.

One of the bad things about Rails (especially for beginners) is that you need to know very little about database.

Through all of the literature I read while learning rails, I only have a vague recollection of databases indexes being mentioned. I started learning more about them recently because the performance of one of my apps was degrading over time, and literally, in about 2 hours, I increased the performance of my backend processing by 5x!!!

Intrigued? First – some background:

Most tables in rails (anything that has a model really) has a field ‘id’. This is the primary key of the database, and an index is created on this. I am a database rookie, but I know that an index is just like an index in a book. It’s extra info that the database stores so that it can find things more quickly.

Indices are not a concern when you get started, nor should they be. When your table gets to 1000s of entries, then they CAN become more important. It largely depends on how your app works.

Lets take a simplified example. Lets say you have an app that has sites and pages:


create_table "sites", :force => true do |t|
  t.column "permalink", :string
end

create_table "pages", :force => true do |t|
  t.column "site_id", :integer
end

Someone can view a site, and each site has pages:


class Site < ActiveRecord::Base
  has_many :pages
end

Because you want pretty urls you want someone to be able to go to www.myapp.com/<site.name>/<page.id>. The first thing you need to do that is


@site = Site.find_by_permalink(params[:id])

Of course, you are also going to need to have navigation for that site, so while rendering you have something like this


<div id="nav">
  <% for page in @site.pages %>
     <%# (do something with the page here) %>
  <% end %>
</div>

So what happens, when your someone tries to view a site?

Step 1: find_by_permalink: The database has to open every single ’site’ and check if the permalink matches.

Step 2: @site.pages: The database has to open every single page and check if the event_id field matches the id of the event that it is looking for.

When you have small numbers of rows, this isn’t a big problem. When you get 1000s of rows, looking through every site and every page gets slow, quickly.

So, the solution is to add an index. You add indices in Rails using migrations. You can create a migration like this:


class AddIndices < ActiveRecord::Migration
  def self.up
    add_index :sites, :permalink
    add_index :pages, :site_id
  end

  def self.down
    remove_index :sites, :permalink
    remove_index :pages, :site_id
  end
end

When you add those 2 indices, the database can now much more quickly find and retrieve the single Site object by permalink, and find and retrieve only the required Page objects rather than having to search the entire database for a match.

Doing this got my a 5x improvement in the speed of some of my more common actions, and literally took 2 hours.

You do need to be careful. Adding indices can slow down your performance because every time you insert, update, or delete rows, the database must also alter the index.

Some useful links:

http://therailsway.com/2006/11/21/tracks-part-4

http://weblog.jamisbuck.org/2006/10/23/indexing-for-db-performance

http://www.websitedatabases.com/database-index.html

There are probably good ways of benchmarking this. After testing locally, I copied my live data to my staging area, added the indices, ran a few actions, and compared the logs from before/after the migration. In my case, some actions got a bitter slower, but those were uncommon ones. My most common actions got 5x faster!! Have a look at the following graph. The red lines show the % of time for an action after adding the indices. Above 100% means the action got slower. Below means it got faster. You an see most actions got a lot faster!

Effect of Database Index

1
Tags: , , ,
Posted in Uncategorized |