[LRUG] Request for assistance with Rails performance oddity

John Winters john at sinodun.org.uk
Fri Oct 29 13:24:15 PDT 2010


On 29/10/10 20:32, Michael Pavling wrote:
> On 29 October 2010 20:09, John Winters <john at sinodun.org.uk> wrote:
>> When I first run the script with empty database tables it runs at a
>> perfectly satisfactory rate.  Loading a total of 105171 records takes
>> 1662s, or a rate of 63.3 records/second.  Further, it doesn't slow down
>> as it loads - the rate of loading is the same when there are 100,000
>> records loaded as it was at the beginning.
>>
>> If however I let the script stop and then start it again to load a
>> second batch it runs far more slowly.  A small further batch of 1384
>> records takes 584s, or about 2.4 records/second.
> 
> What happens if you load the second batch first? Do they load at the
> same, slow speed if the DB starts off empty?

No.  If you load the second batch first then it is fast.  I have yet to
try to see what happens if you load 50,000 records in one run followed
by another batch.  I could understand it if it got gradually slower as
the records were loaded, but it doesn't.  There is no perceptible
slowdown over the course of 100,000 records, as long as the script runs
continuously.  What seems to cause a massive slow down is if lots of
records already exist when the script is started.

> Any chance you can post or link to some (sanitized/anonymised) code
> and a source examples?

Here are a couple of what I think are the relevant chunks.  What it's
doing is to load a school's timetable into a calendar.  The same weekly
pattern of lessons is loaded for a range of dates.  An "Event" is
basically a lesson, and each one involves up to 3 resources - a teacher,
a group of pupils and a room.  All the resources are already loaded into
the system and do not change during the load.  All that happens is Event
records are created, and then Commitment records are added to link each
Event to its resources.

My first run loads the autumn term, repeating the same weekly pattern of
lessons over about 14 weeks.  The second run then tries to load one more
day, but it happens at a glacial speed.  Exactly the same weekly data
are being used in the second run as in the first.  I've also tried
loading the spring term (in case there was something funny about loading
just one day) but it was equally slow.

It seems I can load as many weeks as I like at full speed, as long as
they are done in a single run of the script.  If I load some in one run
and then try to load some more, it's ridiculously slow, and yet the only
thing which has changed is that the Event and Commitment tables are no
longer empty.

All these Events belong to the same Eventcategory.  Other
Eventcategories will be used for things other than timetabled lessons.

app/model/event.rb

# == Schema Information
# Schema version: 20100930151421
#
# Table name: events
#
#  id               :integer(4)      not null, primary key
#  body             :text
#  eventcategory_id :integer(4)
#  on               :date
#  starts           :time
#  ends             :time
#  created_at       :datetime
#  updated_at       :datetime
#  autoloaded       :boolean(1)
#

class Event < ActiveRecord::Base

  belongs_to :eventcategory
  has_many   :commitments, :dependent => :destroy

  validates_presence_of :eventcategory, :body
  validates_associated  :eventcategory, :message => "Event category does
not exist"

  def durationtext
    "#{self.starts.strftime("%H:%M")} - #{self.ends.strftime("%H:%M")}"
  end

end


app/model/commitment.rb

# == Schema Information
# Schema version: 20100930151421
#
# Table name: commitments
#
#  id          :integer(4)      not null, primary key
#  event_id    :integer(4)
#  resource_id :integer(4)
#  covering_id :integer(4)
#  tentative   :boolean(1)
#  voluntary   :boolean(1)
#  created_at  :datetime
#  updated_at  :datetime
#

class Commitment < ActiveRecord::Base

  belongs_to :event
  belongs_to :resource

  validates_presence_of :event, :resource
  validates_associated  :event,    :message => "Event does not exist"
  validates_associated  :resource, :message => "Resource does not exist"

  # Note naming here.  If this commitment is covering another commitment
  # then we point at it.  Code can read:
  #
  #  if commitment.covering
  #
  # If this commitment is being covered then the coverer should point at
  # us.  Code can read:
  #
  #  if commitment.covered
  #
  belongs_to :covering, :class_name => 'Commitment'

  # If this commitment is being covered and this commitment gets deleted
  # then the covering commitment should be deleted too.
  has_one    :covered,
             :class_name  => 'Commitment',
             :foreign_key => :covering_id,
             :dependent   => :destroy

end


and then the loop which does all the work.  The timetable (a week's
worth of lessons) has already been loaded into memory and then we are
iterating through a period of time (typically one term) creating dated
event records for each lesson.  In the in-memory form, a week is an
array of days, a day is an array of periods, and a period is an array of
events (because more than one lesson happens in any given period).

      eventcount = 0
      commitmentcount = 0
      @@interval.starts_at.upto(@@interval.ends_at) do |tdate|
        puts "Processing #{tdate}"
        # Ruby days are indexed from 1 = Monday.  Unfortunately, the
        # timetables which I'm loading start from Monday so for them,
        # 0 = Monday.  I therefore have to adjust a bit.
        wday = tdate.cwday - 1
        ipday = schooltt.week[wday]
        # Note that ipday is an array of periods but it's potentially
        # possible for there to be gaps - nil periods.  This would
        # happen if no-one has anything scheduled in that period.
        ipday.each do |ipperiod|
          if ipperiod
#           puts "Starting period #{ipperiod.periodindex + 1}"
            periodeventcount = 0
            starttime = periodstarttime(wday, ipperiod.periodindex)
            ipperiod.each do |ipevent|
              # If this is the 2nd half of a double then don't load it.
              if !ipevent.previous_event
                if ipevent.following_event
                  increment = 1
                else
                  increment = 0
                end
                endtime   = periodendtime(wday,
                                          ipperiod.periodindex + increment)
                # Make sure we can find the relevant resources first.
                if ipevent.teacher
#                 puts "Finding teacher"
                  dbteacher =
Teacher.find_by_initials(ipevent.teacher.initials)
                end
                if ipevent.group
#                 puts "Finding group"
                  dbmgroup  = Mtgroup.find_by_year_id_and_name(@@year.id,

ipevent.group.name)
                  if !dbmgroup
                    if ipevent.group.instance_of?(Consolidate::TutorGroup)
                      dbmgroup =
Tutorgroup.find_by_name(ipevent.group.tutor.initials)
                    end
                  end
                end
                if ipevent.room
#                 puts "Finding room"
                  dbroom = Location.find_by_name(ipevent.room.name)
                end
                if (dbteacher || !ipevent.teacher) &&
                   (dbmgroup  || !ipevent.group) &&
                   (dbroom    || !ipevent.room)
                  # Need to create this event in the database.
#                 puts "Creating event"
                  dbevent = Event.new
                  dbevent.body          = "Hello there"
                  dbevent.on            = tdate
                  dbevent.starts        = starttime
                  dbevent.ends          = endtime
                  dbevent.autoloaded    = true
                  dbevent.eventcategory = @@eventcategory
                  dbevent.save!
                  eventcount += 1
                  periodeventcount += 1
                  # And now add all the resources which this event uses.
                  if dbteacher
#                   puts "Linking teacher"
                    dbcommitment = Commitment.new
                    dbcommitment.event = dbevent
                    dbcommitment.resource = dbteacher.resource
                    dbcommitment.tentative = false
                    dbcommitment.voluntary = false
                    dbcommitment.save!
                    commitmentcount += 1
                  end
                  if dbmgroup
#                   puts "Linking group"
                    dbcommitment = Commitment.new
                    dbcommitment.event = dbevent
                    dbcommitment.resource = dbmgroup.resource
                    dbcommitment.tentative = false
                    dbcommitment.voluntary = false
                    dbcommitment.save!
                    commitmentcount += 1
                  end
                  if dbroom
#                   puts "Linking room"
                    dbcommitment = Commitment.new
                    dbcommitment.event = dbevent
                    dbcommitment.resource = dbroom.resource
                    dbcommitment.tentative = false
                    dbcommitment.voluntary = false
                    dbcommitment.save!
                    commitmentcount += 1
                  end
#                 puts "Finished event"
                else
                  puts "Couldn't find database resources for event."
                  puts "Teacher: #{ipevent.teacher ?
                                   ipevent.teacher.initials :
                                   "none" }"
                  puts "Group:   #{ipevent.group ? ipevent.group.name :
"none" }"
                  puts "Room:    #{ipevent.room ? ipevent.room.name :
"none" }"
                end
              end
            end
            puts "Period #{ipperiod.periodindex + 1} has
#{periodeventcount} events."
          end
        end
      end
      puts "#{eventcount} events and #{commitmentcount} commitments
added to the database."





More information about the Chat mailing list