
I have a query that runs very fast in the MySQL console but very slowly when I run it through Rails Active Record. This is the query, which runs against a table of 7 million records:

select broker_id,count(abserror),avg(abserror) from fc_estimates where ( fpe > '2000-05-28') and ( fpe < '2003-06-30') group by broker_id order by broker_id;

That takes 3 minutes to run.

Then I run this query in Rails Active Record:

stats = Estimate.
  select(["broker_id", "count(abserror) as abserror_count", "avg(abserror) as abserror_avg"]).
  where(:fpe => ((fpe-1098).to_date..(fpe+30).to_date)).
  group("broker_id").
  order("broker_id")

which generates this SQL (output from to_sql):

SELECT broker_id, count(abserror) as abserror_count, avg(abserror) as abserror_avg FROM fc_estimates WHERE (fc_estimates.fpe BETWEEN '2000-05-28' AND '2003-06-30') GROUP BY broker_id ORDER BY broker_id

and takes 1 hour 40 minutes to run. It returns 250 records.

I am using Windows 7, MySQL 5.1, Ruby 1.8.7, ActiveRecord 3.04, mysql2 gem 0.2.6.

These are InnoDB tables and I have increased innodb_buffer_pool_size to 480M (which did help with other queries). One thing I do observe is that MySQL's memory use builds up to about 500M and then there is a lot of disk activity (page swapping), which does explain something.

But why am I still getting such poor performance when the same query run in the MySQL console takes just 3 minutes? Thanks for any ideas, or to anyone who has experienced a similar situation.

UPDATE 2011-02-24

I updated to MySQL 5.5. Now my query in the console runs in about 1min 40secs, and using ActiveRecord it takes about 40mins.

  • You may be seeing the results of the MySQL query cache. Try timing your query again in MySQL, but put SQL_NO_CACHE after SELECT to disable the query cache. Commented Feb 15, 2011 at 13:29
  • Yes I think I understand that. The second time I run a query in the console it runs very fast. So my query that runs in 3 minutes - the second time will run very quickly. I don't think that's it but I will check. Commented Feb 15, 2011 at 13:38
  • 3 mins against such a small number of rows is a little worrying - you could be getting sub 1 second runtimes if you take advantage of your InnoDB clustered index. Commented Feb 15, 2011 at 13:47
  • Just ran the query again. In the MySQL console the query took 4mins, ran again it took 1min 14s, ran again it took 3sec, and again 3sec. So I guess that is the query cache taking effect. I then ran it with SELECT SQL_NO_CACHE ..., and the query took 2mins; and again with SELECT SQL_NO_CACHE, it took 3min 30sec. Actually my problem is that the disk is getting thrashed (there are a lot of writes to c:\pagefile.sys). But why is that happening? Commented Feb 15, 2011 at 13:56
  • 3mins is fine for me - this is not a web application. It's the 1 hour 40 mins when I use ActiveRecord that is a pain. Commented Feb 15, 2011 at 13:58

1 Answer


There's much more running in your Ruby code than just an SQL query. I'm not a Ruby Jedi, but I can point out a few things.

Windows is not the best place to work with the MRI. Maybe you should try out 1.9.2 or JRuby - or even switching to some *nix OS.

(fpe-1098).to_date..(fpe+30).to_date builds a Range instance for the date interval. Maybe you should try a different syntax, i.e. ['fpe > ? AND fpe < ?', (fpe-1098), (fpe+30)], so fewer objects will be created.
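The two ways of writing the condition are not quite the same test, which a small pure-Ruby sketch can show. A Range passed to where is rendered as BETWEEN (see the to_sql output in the question), and BETWEEN includes both endpoints, while the console query uses strict > and <. The helper names below are made up for illustration, and fpe = 2003-05-31 is an assumption: the question never states it, but that value reproduces the bounds shown there.

```ruby
require 'date'

# Returns [lower, upper] for the window used in the question.
# Date - Integer and Date + Integer shift by whole days.
def date_bounds(fpe)
  [fpe - 1098, fpe + 30]
end

# Inclusive test, like SQL "fpe BETWEEN lower AND upper".
def inclusive?(date, lower, upper)
  date >= lower && date <= upper
end

# Exclusive test, like SQL "fpe > lower AND fpe < upper".
def exclusive?(date, lower, upper)
  date > lower && date < upper
end

# Assumption: fpe is 2003-05-31 (not given in the question).
lower, upper = date_bounds(Date.new(2003, 5, 31))
puts "window: #{lower} .. #{upper}"
puts "upper bound matched by BETWEEN: #{inclusive?(upper, lower, upper)}"
puts "upper bound matched by > and <: #{exclusive?(upper, lower, upper)}"
```

If the table holds rows dated exactly on the upper bound, the BETWEEN form aggregates them and the strict form does not, so the two queries may not be doing identical work.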

Since you're not retrieving Estimate instances, instead of running the query through the model class you can pass the generated SQL to ActiveRecord::Base.connection.execute. That may reduce memory usage and the number of objects created.
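A minimal sketch of that idea, under some assumptions: the helper names broker_stats_sql and broker_stats are invented here, and the connection is passed in as an argument so the sketch does not depend on a live database. It uses select_rows, which the ActiveRecord connection adapters provide alongside execute, to get plain arrays back instead of model objects.

```ruby
require 'date'

# Builds the aggregate SQL from the question for a given date window.
# Date#strftime with this format yields only digits and hyphens, so the
# values are safe to interpolate; arbitrary strings would need quoting.
def broker_stats_sql(from, to)
  "SELECT broker_id, count(abserror) AS abserror_count, " \
  "avg(abserror) AS abserror_avg " \
  "FROM fc_estimates " \
  "WHERE fpe > '#{from.strftime('%Y-%m-%d')}' " \
  "AND fpe < '#{to.strftime('%Y-%m-%d')}' " \
  "GROUP BY broker_id ORDER BY broker_id"
end

# Runs the query and returns one plain array per result row.
# Assumption: `connection` responds to select_rows(sql), as the
# ActiveRecord connection adapters do. No Estimate instances are built.
def broker_stats(connection, from, to)
  connection.select_rows(broker_stats_sql(from, to))
end
```

With ActiveRecord loaded, this would be called as broker_stats(ActiveRecord::Base.connection, fpe - 1098, fpe + 30). Whether it helps here is untested; it only removes the model layer from the picture.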


2 Comments

Thanks, all good comments. I've tried using both Ruby 1.8.7 and Ruby 1.9.2 and it doesn't make much difference. JRuby I could try, I suppose.
But as for the query itself, yes, it can be made better. I am just trying to calculate averages over a date range, but the dates are actually just end-of-month dates, so for a three-year period I will have just 36 distinct dates. So now I am doing it in two stages. First, calculate the count and average for each single date. Second, use these stats (counts and averages) to calculate the overall average over the three-year (36-month) period.
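The second stage could look roughly like this sketch. It assumes the first stage yields one [count, average] pair per month-end date for a given broker; that shape and the name combine_stats are hypothetical. The overall average has to be weighted by the counts, since a plain mean of the monthly averages is only correct when every month has the same number of rows.

```ruby
# Combines per-date [count, average] pairs into an overall [count, average].
# Entries with a zero count are skipped, because SQL AVG yields NULL (nil)
# when there are no rows to average.
def combine_stats(monthly)
  rows = monthly.reject { |count, _avg| count.zero? }
  total_count = rows.inject(0) { |sum, (count, _avg)| sum + count }
  return [0, nil] if total_count.zero?

  # Each month's sum of abserror is count * avg; add them up, then divide.
  weighted_sum = rows.inject(0.0) { |sum, (count, avg)| sum + count * avg }
  [total_count, weighted_sum / total_count]
end

# Two months: 2 rows averaging 1.0 and 6 rows averaging 3.0.
p combine_stats([[2, 1.0], [6, 3.0]])   # → [8, 2.5]
```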
