Opened 14 years ago

Closed 14 years ago

Last modified 14 years ago

#494 closed patch (fixed)

Make EIT crawl really low impact.

Reported by: Stuart Auchterlonie
Owned by: danielk
Priority: minor
Milestone: 0.19
Component: mythtv
Version: head
Severity: low
Keywords: eit crawl db
Cc:
Ticket locked: no

Description

I've tuned the EIT crawl so it isn't as hard on the database.

1) Spend 10 minutes rather than 5 minutes on a channel (channels often need 7-8 minutes to get EIT data)
2) Call reschedule only once all the events in a list have been inserted.
3) Change EITScanner::RunEventLoop to have a 1000 ms wait (no hurry for EIT :)
4) Update the VERBOSE logs to reflect more accurately what is going on.

2 patches in this quilt.

1) eitcrawl-10min-perchan.diff (for number 1)
2) eitcrawl-lowerdb-load.diff (for the rest)

Attachments (5)

eitcrawl-10min-perchan.diff (676 bytes) - added by Stuart Auchterlonie 14 years ago.
Spend 10mins on a channel for EIT Crawl.
eitcrawl-lowerdb-load.diff (1.5 KB) - added by Stuart Auchterlonie 14 years ago.
Lower DB load
scheduler-lowDB-load.diff (1.7 KB) - added by Stuart Auchterlonie 14 years ago.
Make background reschedules nicer.
eitcrawl-lowerdb-load.2.diff (2.3 KB) - added by Stuart Auchterlonie 14 years ago.
Updated patch to only reschedule on channel change
eitcrawl-num-events-added.diff (2.6 KB) - added by Stuart Auchterlonie 14 years ago.
Print summary of events added once they've all been added.

Change History (13)

Changed 14 years ago by Stuart Auchterlonie

Attachment: eitcrawl-10min-perchan.diff added

Spend 10mins on a channel for EIT Crawl.

Changed 14 years ago by Stuart Auchterlonie

Attachment: eitcrawl-lowerdb-load.diff added

Lower DB load

Changed 14 years ago by Stuart Auchterlonie

Attachment: scheduler-lowDB-load.diff added

Make background reschedules nicer.

comment:1 Changed 14 years ago by Stuart Auchterlonie

I've attached a patch which should lighten the load on the DB when the EIT crawl is asking for a reschedule.

Now EITHelper requests a reschedule with id -2 rather than -1. When this recordid is seen in UpdateMatch in the scheduler, it puts a 1 second sleep in between each DB query, but otherwise it is treated just like a normal -1 reschedule.
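
The pacing idea in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the attached patch: `RunRescheduleQueries`, the two constants, and the `Query`/`SleepFn` types are hypothetical names, and the queries are stand-in callables rather than real scheduler DB queries.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical constants mirroring the ids mentioned in the comment.
constexpr int kRescheduleAll       = -1;  // normal full reschedule
constexpr int kRescheduleAllGentle = -2;  // same work, paced for background use

using Query   = std::function<void()>;
using SleepFn = std::function<void(std::chrono::milliseconds)>;

// Runs every query in order. For the gentle id, pauses one second between
// consecutive queries; otherwise runs them back to back.
// Returns the number of pauses taken.
int RunRescheduleQueries(
    int recordid, const std::vector<Query> &queries,
    const SleepFn &sleepFn = [](std::chrono::milliseconds d) {
        std::this_thread::sleep_for(d);
    })
{
    int pauses = 0;
    for (std::size_t i = 0; i < queries.size(); ++i)
    {
        if (recordid == kRescheduleAllGentle && i > 0)
        {
            sleepFn(std::chrono::milliseconds(1000));
            ++pauses;
        }
        queries[i]();
    }
    return pauses;
}
```

The sleep function is passed in so the pacing can be exercised without actually waiting; in normal use the default argument sleeps on the calling thread.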

comment:2 Changed 14 years ago by Isaac Richards

I don't like the scheduler patch. It should just be fixed to not send multiple requests.

comment:3 Changed 14 years ago by tino.keitel@…

I tried 7530 without the patches, and it still bangs on mysqld. After rebuilding with eitcrawl-10min-perchan.diff and eitcrawl-lowerdb-load.diff, there is just a short phase with ~50% CPU load, so it looks a lot better now.

comment:4 Changed 14 years ago by Stuart Auchterlonie

wrt. the scheduler patch.

With the first two patches requests are sent only after the complete insertion of a whole stream of events. This is better but still sends a few reschedules per channel.

I can / will make another patch to only send reschedules on channel change for the active scan, but it still leaves the problem that the reschedule itself is hammering the database, creating pauses for Wendy et al.

If you have other ideas on how to make the reschedule itself low impact, let us know, and I'll bash out something to do that.

Changed 14 years ago by Stuart Auchterlonie

Attachment: eitcrawl-lowerdb-load.2.diff added

Updated patch to only reschedule on channel change

comment:5 Changed 14 years ago by Stuart Auchterlonie

I've added an updated eitcrawl-lowerdb-load.2.diff

This only does a reschedule on changing channel, rather than after each batch of events. So when the EIT crawl decides to change channel, a reschedule is done first to process any changes from the previous channel's EIT.

The patch quilt is now

  1. eitcrawl-10min-perchan.diff
  2. eitcrawl-lowerdb-load.2.diff

Don't use the scheduler patch unless you want to prove to yourself (and us) that the reschedule itself is a major DB impact.

It really needs a rethink on how to perform a nice reschedule.

Changed 14 years ago by Stuart Auchterlonie

Attachment: eitcrawl-num-events-added.diff added

Print summary of events added once they've all been added.

comment:6 Changed 14 years ago by danielk

(In [7558]) References #494.

This moves reschedule out of eithelper into eitscanner. It also only reschedules when we have new info and either we change channels or haven't seen new info in over a minute.

If we wanted to be even kinder to the DB we could also track what time period these EIT events cover and only call reschedule if they are for the next 3 hours, or we haven't rescheduled in over 3 hours...
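
The first rule in this comment (reschedule only when there is new info, and then only on a channel change or after more than a minute without new info) could look something like the sketch below. It is a guess at the shape of the logic, not the code committed in [7558]; the class `RescheduleGate` and its method names are hypothetical.

```cpp
#include <chrono>

// Hypothetical helper; names are illustrative, not MythTV's actual API.
class RescheduleGate
{
  public:
    using Clock = std::chrono::steady_clock;

    // Record that new EIT events were inserted at time `now`.
    void NoteEventsInserted(Clock::time_point now)
    {
        m_pending    = true;
        m_lastInsert = now;
    }

    // Returns true when a reschedule should be requested, and clears the
    // pending flag so the same batch does not trigger a second request.
    bool ShouldReschedule(Clock::time_point now, bool channelChanging)
    {
        if (!m_pending)
            return false;  // nothing new since the last reschedule

        const bool quiet = (now - m_lastInsert) > std::chrono::minutes(1);
        if (!channelChanging && !quiet)
            return false;  // events may still be arriving on this channel

        m_pending = false;
        return true;
    }

  private:
    bool              m_pending = false;
    Clock::time_point m_lastInsert{};
};
```

Time is passed in as an argument rather than read inside the class, which keeps the rule easy to check in isolation; a scanner loop would call `ShouldReschedule(Clock::now(), ...)` on each iteration.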

comment:7 Changed 14 years ago by danielk

Resolution: fixed
Status: new → closed

(In [7559]) Closes #494.

I'm rejecting the 10-minute-per-channel patch; it is much better to scan the channel a few times. If you leave a long time between scans of a channel, you are more likely to miss a programming change in the next couple of hours.

I've also changed the eitcrawl-num-events-added patch. The "EITHelper Added X events" line is a debugging message that tells you something is going on, so it shouldn't be disabled when events are inserted. If you want to total the inserts, it should be done in the scanner, maybe when it changes channels. To make that possible I changed EITHelper::ProcessEvents() to return the number of events inserted into the DB.
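
As a rough illustration of totalling the inserts in the scanner: the sketch below uses stand-in classes (`HelperSketch`, `ScannerSketch`, `Event`) that are assumptions for this example only. The one detail taken from the comment is that `ProcessEvents()` returns the number of events inserted; the DB insert itself is replaced by clearing a queue.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct Event { std::string title; };  // placeholder for a parsed EIT event

// Stand-in for the helper: queues events, then "inserts" them.
class HelperSketch
{
  public:
    void Queue(Event e) { m_queue.push_back(std::move(e)); }

    // Inserts all queued events and returns how many were inserted.
    // Clearing the queue stands in for the real DB inserts.
    std::size_t ProcessEvents()
    {
        const std::size_t inserted = m_queue.size();
        m_queue.clear();
        return inserted;
    }

  private:
    std::vector<Event> m_queue;
};

// Stand-in for the scanner: accumulates a per-channel total.
class ScannerSketch
{
  public:
    explicit ScannerSketch(HelperSketch &helper) : m_helper(helper) {}

    // Called from the event loop; adds this batch to the running total.
    void Poll() { m_total += m_helper.ProcessEvents(); }

    // Called on channel change; returns the total for the channel being
    // left (for a summary log line) and resets the counter.
    std::size_t ChangeChannel()
    {
        const std::size_t total = m_total;
        m_total = 0;
        return total;
    }

  private:
    HelperSketch &m_helper;
    std::size_t   m_total = 0;
};
```

With this split, the helper can keep logging each batch as a debugging message while the scanner prints one summary per channel.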

comment:8 Changed 14 years ago by Stuart Auchterlonie

We should make the time to spend on a channel a configuration item that defaults to 5 minutes. I have channels here that take 7-8 minutes to start delivering events.

While it may be better to scan the channel a few more times, it is also pointless to dump a half-collected event carousel and move on.
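
A minimal sketch of the configuration item proposed in this comment, assuming settings arrive as a simple name-to-integer map. The setting name `EITCrawlDwellSecs` and the function `ChannelDwellTime` are hypothetical and are not existing MythTV settings or API.

```cpp
#include <chrono>
#include <map>
#include <string>

// Returns how long the crawl should stay on one channel.
// "EITCrawlDwellSecs" is a made-up setting name for illustration.
// Missing or non-positive values fall back to the 5 minute default.
std::chrono::seconds ChannelDwellTime(
    const std::map<std::string, int> &settings)
{
    const int defaultSecs = 5 * 60;

    const auto it = settings.find("EITCrawlDwellSecs");
    const int secs =
        (it != settings.end() && it->second > 0) ? it->second : defaultSecs;

    return std::chrono::seconds(secs);
}
```

A site whose channels need 7-8 minutes to start delivering events could then set the value to 600 without changing the default for everyone else.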
