[LRUG] Request for assistance with Rails performance oddity

Levent Ali lebreeze at gmail.com
Fri Oct 29 23:17:02 PDT 2010


Could activerecord be trying to load existing records into memory?

On 29 Oct 2010, at 21:24, John Winters <john at sinodun.org.uk> wrote:

> On 29/10/10 20:32, Michael Pavling wrote:
>> On 29 October 2010 20:09, John Winters <john at sinodun.org.uk> wrote:
>>> When I first run the script with empty database tables it runs at a
>>> perfectly satisfactory rate.  Loading a total of 105171 records takes
>>> 1662s, or a rate of 63.3 records/second.  Further, it doesn't slow down
>>> as it loads - the rate of loading is the same when there are 100,000
>>> records loaded as it was at the beginning.
>>> 
>>> If however I let the script stop and then start it again to load a
>>> second batch it runs far more slowly.  A small further batch of 1384
>>> records takes 584s, or about 2.4 records/second.
>> 
>> What happens if you load the second batch first? Do they load at the
>> same, slow speed if the DB starts off empty?
> 
> No.  If you load the second batch first then it is fast.  I have yet to
> try to see what happens if you load 50,000 records in one run followed
> by another batch.  I could understand it if it got gradually slower as
> the records were loaded, but it doesn't.  There is no perceptible
> slowdown over the course of 100,000 records, as long as the script runs
> continuously.  What seems to cause a massive slow down is if lots of
> records already exist when the script is started.
> 
>> Any chance you can post or link to some (sanitized/anonymised) code
>> and a source examples?
> 
> Here are a couple of what I think are the relevant chunks.  What it's
> doing is to load a school's timetable into a calendar.  The same weekly
> pattern of lessons is loaded for a range of dates.  An "Event" is
> basically a lesson, and each one involves up to 3 resources - a teacher,
> a group of pupils and a room.  All the resources are already loaded into
> the system and do not change during the load.  All that happens is Event
> records are created, and then Commitment records are added to link each
> Event to its resources.
> 
> My first run loads the autumn term, repeating the same weekly pattern of
> lessons over about 14 weeks.  The second run then tries to load one more
> day, but it happens at a glacial speed.  Exactly the same weekly data
> are being used in the second run as in the first.  I've also tried
> loading the spring term (in case there was something funny about loading
> just one day) but it was equally slow.
> 
> It seems I can load as many weeks as I like at full speed, as long as
> they are done in a single run of the script.  If I load some in one run
> and then try to load some more, it's ridiculously slow, and yet the only
> thing which has changed is that the Event and Commitment tables are no
> longer empty.
> 
> All these Events belong to the same Eventcategory.  Other
> Eventcategories will be used for things other than timetabled lessons.
> 
> app/model/event.rb
> 
> # == Schema Information
> # Schema version: 20100930151421
> #
> # Table name: events
> #
> #  id               :integer(4)      not null, primary key
> #  body             :text
> #  eventcategory_id :integer(4)
> #  on               :date
> #  starts           :time
> #  ends             :time
> #  created_at       :datetime
> #  updated_at       :datetime
> #  autoloaded       :boolean(1)
> #
> 
> class Event < ActiveRecord::Base
> 
>  belongs_to :eventcategory
>  has_many   :commitments, :dependent => :destroy
> 
>  validates_presence_of :eventcategory, :body
>  validates_associated  :eventcategory, :message => "Event category does
> not exist"
> 
>  def durationtext
>    "#{self.starts.strftime("%H:%M")} - #{self.ends.strftime("%H:%M")}"
>  end
> 
> end
> 
> 
> app/model/commitment.rb
> 
> # == Schema Information
> # Schema version: 20100930151421
> #
> # Table name: commitments
> #
> #  id          :integer(4)      not null, primary key
> #  event_id    :integer(4)
> #  resource_id :integer(4)
> #  covering_id :integer(4)
> #  tentative   :boolean(1)
> #  voluntary   :boolean(1)
> #  created_at  :datetime
> #  updated_at  :datetime
> #
> 
> class Commitment < ActiveRecord::Base
> 
>  belongs_to :event
>  belongs_to :resource
> 
>  validates_presence_of :event, :resource
>  validates_associated  :event,    :message => "Event does not exist"
>  validates_associated  :resource, :message => "Resource does not exist"
> 
>  # Note naming here.  If this commitment is covering another commitment
>  # then we point at it.  Code can read:
>  #
>  #  if commitment.covering
>  #
>  # If this commitment is being covered then the coverer should point at
>  # us.  Code can read:
>  #
>  #  if commitment.covered
>  #
>  belongs_to :covering, :class_name => 'Commitment'
> 
>  # If this commitment is being covered and this commitment gets deleted
>  # then the covering commitment should be deleted too.
>  has_one    :covered,
>             :class_name  => 'Commitment',
>             :foreign_key => :covering_id,
>             :dependent   => :destroy
> 
> end
> 
> 
> and then the loop which does all the work.  The timetable (a week's
> worth of lessons) has already been loaded into memory and then we are
> iterating through a period of time (typically one term) creating dated
> event records for each lesson.  In the in-memory form, a week is an
> array of days, a day is an array of periods, and a period is an array of
> events (because more than one lesson happens in any given period).
> 
>      eventcount = 0
>      commitmentcount = 0
>      @@interval.starts_at.upto(@@interval.ends_at) do |tdate|
>        puts "Processing #{tdate}"
>        # Ruby days are indexed from 1 = Monday.  Unfortunately, the
>        # timetables which I'm loading start from Monday so for them,
>        # 0 = Monday.  I therefore have to adjust a bit.
>        wday = tdate.cwday - 1
>        ipday = schooltt.week[wday]
>        # Note that ipday is an array of periods but it's potentially
>        # possible for there to be gaps - nil periods.  This would
>        # happen if no-one has anything scheduled in that period.
>        ipday.each do |ipperiod|
>          if ipperiod
> #           puts "Starting period #{ipperiod.periodindex + 1}"
>            periodeventcount = 0
>            starttime = periodstarttime(wday, ipperiod.periodindex)
>            ipperiod.each do |ipevent|
>              # If this is the 2nd half of a double then don't load it.
>              if !ipevent.previous_event
>                if ipevent.following_event
>                  increment = 1
>                else
>                  increment = 0
>                end
>                endtime   = periodendtime(wday,
>                                          ipperiod.periodindex + increment)
>                # Make sure we can find the relevant resources first.
>                if ipevent.teacher
> #                 puts "Finding teacher"
>                  dbteacher =
> Teacher.find_by_initials(ipevent.teacher.initials)
>                end
>                if ipevent.group
> #                 puts "Finding group"
>                  dbmgroup  = Mtgroup.find_by_year_id_and_name(@@year.id,
> 
> ipevent.group.name)
>                  if !dbmgroup
>                    if ipevent.group.instance_of?(Consolidate::TutorGroup)
>                      dbmgroup =
> Tutorgroup.find_by_name(ipevent.group.tutor.initials)
>                    end
>                  end
>                end
>                if ipevent.room
> #                 puts "Finding room"
>                  dbroom = Location.find_by_name(ipevent.room.name)
>                end
>                if (dbteacher || !ipevent.teacher) &&
>                   (dbmgroup  || !ipevent.group) &&
>                   (dbroom    || !ipevent.room)
>                  # Need to create this event in the database.
> #                 puts "Creating event"
>                  dbevent = Event.new
>                  dbevent.body          = "Hello there"
>                  dbevent.on            = tdate
>                  dbevent.starts        = starttime
>                  dbevent.ends          = endtime
>                  dbevent.autoloaded    = true
>                  dbevent.eventcategory = @@eventcategory
>                  dbevent.save!
>                  eventcount += 1
>                  periodeventcount += 1
>                  # And now add all the resources which this event uses.
>                  if dbteacher
> #                   puts "Linking teacher"
>                    dbcommitment = Commitment.new
>                    dbcommitment.event = dbevent
>                    dbcommitment.resource = dbteacher.resource
>                    dbcommitment.tentative = false
>                    dbcommitment.voluntary = false
>                    dbcommitment.save!
>                    commitmentcount += 1
>                  end
>                  if dbmgroup
> #                   puts "Linking group"
>                    dbcommitment = Commitment.new
>                    dbcommitment.event = dbevent
>                    dbcommitment.resource = dbmgroup.resource
>                    dbcommitment.tentative = false
>                    dbcommitment.voluntary = false
>                    dbcommitment.save!
>                    commitmentcount += 1
>                  end
>                  if dbroom
> #                   puts "Linking room"
>                    dbcommitment = Commitment.new
>                    dbcommitment.event = dbevent
>                    dbcommitment.resource = dbroom.resource
>                    dbcommitment.tentative = false
>                    dbcommitment.voluntary = false
>                    dbcommitment.save!
>                    commitmentcount += 1
>                  end
> #                 puts "Finished event"
>                else
>                  puts "Couldn't find database resources for event."
>                  puts "Teacher: #{ipevent.teacher ?
>                                   ipevent.teacher.initials :
>                                   "none" }"
>                  puts "Group:   #{ipevent.group ? ipevent.group.name :
> "none" }"
>                  puts "Room:    #{ipevent.room ? ipevent.room.name :
> "none" }"
>                end
>              end
>            end
>            puts "Period #{ipperiod.periodindex + 1} has
> #{periodeventcount} events."
>          end
>        end
>      end
>      puts "#{eventcount} events and #{commitmentcount} commitments
> added to the database."
> 
> 
> _______________________________________________
> Chat mailing list
> Chat at lists.lrug.org
> http://lists.lrug.org/listinfo.cgi/chat-lrug.org



More information about the Chat mailing list