Opened 12 years ago

Closed 12 years ago

#10533 closed Developer Task (Fixed)

Optimize scheduler for duplicate checking and EIT updates

Reported by: gigem
Owned by: gigem
Priority: minor
Milestone: 0.26
Component: MythTV - Scheduling
Version: Unspecified
Severity: medium
Keywords:
Cc:
Ticket locked: no

Description

This patch includes several scheduler changes. I'll detail all of them when I commit it. The two major ones are described below. Please note that a small schema update must be done by hand and there is also a protocol change that is not yet advertised. Be careful not to mix this test version with unpatched versions.

As noted, a small schema update is needed. To make that change, please run the following MySQL commands.

alter table recordmatch add column findid int not null default 0;
alter table recordmatch add index (recordid, findid);

The first major change in this patch is that the scheduler only rechecks matching programs for duplicates in the recorded and oldrecorded tables if their status might have changed. Previously, the scheduler rechecked all matched programs every time it ran. Because checking the oldrecorded table can be expensive, this change should provide a big improvement.

The second major change is that the scheduler can be told to rematch and recheck only a portion of the program table. This is intended for use with EIT, where portions of the program table are updated frequently in an orderly manner. This change should provide a big improvement in those cases. Please note the EIT scanner has not been updated yet. I hope Daniel will be able to do that, but if he can't, I will try myself.

To test this patch, please run the following cases with and without the patch. When doing so, please save the "Reschedule requested for" and "Scheduled <n> items in" log lines from the master mythbackend and send them to me.
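For collecting those two kinds of log lines, a small filter along the following lines may help. It is only a sketch: the two marker strings are taken from the paragraph above, while the function name and the idea of matching on plain substrings are assumptions, and nothing is assumed about the rest of the log line layout or about where the mythbackend log lives.

```python
from typing import Iterable, List


def scheduler_timing_lines(lines: Iterable[str]) -> List[str]:
    """Keep only the log lines requested in the ticket.

    Hypothetical helper: it matches on the quoted substrings only and
    makes no assumption about timestamps or other log fields.
    """
    kept = []
    for line in lines:
        text = line.rstrip("\n")
        if "Reschedule requested for" in text:
            kept.append(text)
        elif "Scheduled " in text and " items in" in text:
            kept.append(text)
    return kept
```

The function accepts any iterable of lines, so an open log file can be passed directly and the result saved separately for the patched and unpatched runs.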

Test 1. Run "mythutil --resched". This forces a complete reschedule. There shouldn't be any significant change between versions for this test.

Test 2. Go to the EPG in mythfrontend and choose a program not currently covered by a recording rule. Press R/RECORD to create a new kSingleRecord rule. After the scheduler runs and the EPG updates, press DELETE and then OK to delete the rule. There should be a noticeable change between versions for this test.

Test 3. Go to ViewScheduled in mythfrontend and choose a program with a recurring rule that matches several programs and has many entries in oldrecorded. Press E/EDIT to bring up the schedule editor. Change the rule type, change it back and then press OK to "dirty" the rule and force an update. There should be a big change between versions for this test.

Test 4. Run "mythutil --event 'RESCHEDULE_RECORDINGS' --event 'MATCH 0 0 0 YYYY-MM-DDTHH:MM:SS'", replacing the placeholder with the time 3 hours from the current time. This simulates a reschedule caused by EIT where anything in the next 3 hours might have changed. Please note this can't be run on the unpatched version because it only supports the coarser rescheduling covered by "mythutil --resched".

Test 5. Using mythutil, run other limited reschedules that might be requested by EIT. The full syntax of the MATCH reschedule request is "MATCH <recordid> <sourceid> <mplexid> <maxstarttime>". A non-zero value for <recordid>, <sourceid> or <mplexid> or a valid time for <maxstarttime> limits the reschedule to programs matching those criteria.

Attachments (2)

schedopt2.patch (67.1 KB) - added by gigem 12 years ago.
schedopt3.patch (73.7 KB) - added by gigem 12 years ago.


Change History (4)

Changed 12 years ago by gigem

Attachment: schedopt2.patch added

Changed 12 years ago by gigem

Attachment: schedopt3.patch added

comment:1 Changed 12 years ago by David Engel <dengel@…>

In cbb8eb1ee32a658a519d2d5fb751ace114f63bf9/mythtv:

Add duplicate checking and limited matching optimizations and other
scheduler related changes.

The three major changes are as follows.

Split the checks against the oldrecorded and recorded tables for
previous recordings away from the "place" phase of scheduling and call
it the "check" phase. This makes it easier to report how much time is
spent doing this and leads to the next major change.

Drastically reduce the number of checks against the oldrecorded and
recorded tables for previous recordings. Historically, this has been
the most significant contributor to long scheduler runs. Not every
unnecessary check is eliminated, but the vast majority of them are.
Trying to remove the few remaining ones would probably take longer
than simply rechecking them.

Allow for checking a subset of the program table when new guide data
is available. This is primarily aimed at EIT scanning where typically
only a few hours of guide data for specific channels is updated in an
orderly fashion. Not rechecking the unchanged guide data greatly
reduces the amount of work the scheduler has to do. Please note the
EIT scanner has not been updated yet to take advantage of this change.

The other changes are as follows.

Use more expressive reschedule requests. This is primarily to support
the changes listed above. It also makes it easier to understand why
reschedules occurred when reading the logs and might lead to the
elimination of unnecessary reschedules in the future.

MATCH reschedule requests should be used when the guide data or a
specific recording rule is changed. The syntax is as follows.

RESCHEDULE_RECORDINGS
MATCH <recordid> <sourceid> <mplexid> <maxstarttime> <reason>

maxstarttime should be in ISO format. If recordid, sourceid or
mplexid are non-zero or maxstarttime is a valid time, the scheduler
will restrict itself to recording rules and guide data matching
those parameters. reason is purely for purposes of logging.
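
As an illustration of the rule just described (not the scheduler's actual code), the sketch below parses a MATCH line and reports which restrictions would apply. The class and function names are hypothetical, and treating everything after the fourth field as the free-form reason is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class MatchRequest:
    recordid: int
    sourceid: int
    mplexid: int
    maxstarttime: Optional[datetime]  # None when the time is not valid
    reason: str

    def restrictions(self) -> List[str]:
        """Names of the fields that narrow the reschedule."""
        active = [name for name in ("recordid", "sourceid", "mplexid")
                  if getattr(self, name) != 0]
        if self.maxstarttime is not None:
            active.append("maxstarttime")
        return active


def parse_match(line: str) -> MatchRequest:
    """Parse 'MATCH <recordid> <sourceid> <mplexid> <maxstarttime> <reason>'."""
    parts = line.strip().split(" ", 5)
    if len(parts) < 5 or parts[0] != "MATCH":
        raise ValueError("not a MATCH request: %r" % line)
    recordid, sourceid, mplexid = (int(p) for p in parts[1:4])
    try:
        limit = datetime.fromisoformat(parts[4])  # ISO format, Python 3.7+
    except ValueError:
        limit = None  # any non-ISO token counts as "no time limit" here
    reason = parts[5] if len(parts) > 5 else ""
    return MatchRequest(recordid, sourceid, mplexid, limit, reason)
```

An empty restrictions() list corresponds to a request that does not limit the rematch at all.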

CHECK reschedule requests should be used when the status of a
specific episode is affected such as when "never record" or "allow
re-record" are selected or a recording finishes or is deleted. The
syntax is as follows.

RESCHEDULE_RECORDINGS
CHECK <recstatus> <recordid> <findid> <reason>
<title>
<subtitle>
<description>
<programid>

recordid should be the parent recordid, when applicable, otherwise,
the normal recordid. Setting programid to "any" and title to ""
is a special case used to emulate an old reschedule request for
recordid 0. Setting programid to "any" and title to other than
"" is another special case used when all entries for a title are
deleted from oldrecorded. recstatus and reason are purely for
purposes of logging.
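
The CHECK layout and its two special cases can be sketched as follows. This is an illustration only: the function names are hypothetical, and it assumes each line of the syntax above is sent as one element of a string list and that recstatus can be rendered with str().

```python
from typing import List


def check_request(recstatus, recordid: int, findid: int, reason: str,
                  title: str, subtitle: str, description: str,
                  programid: str) -> List[str]:
    """Build the CHECK request as a list of strings, one per syntax line."""
    return [
        "RESCHEDULE_RECORDINGS",
        f"CHECK {recstatus} {recordid} {findid} {reason}",
        title,
        subtitle,
        description,
        programid,
    ]


def special_case(title: str, programid: str) -> str:
    """Name the special case, if any, described in the commit message."""
    if programid != "any":
        return "specific episode"
    if title == "":
        return "emulates old reschedule request for recordid 0"
    return "all entries for title deleted from oldrecorded"
```

For example, check_request(0, 0, 0, "Example", "", "", "", "any") produces the request for the first special case.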

PLACE reschedule requests should be used in all other cases. The
syntax is as follows.

RESCHEDULE_RECORDINGS
PLACE <reason>

reason is purely for purposes of logging.

Update the PHP and Python bindings to use the new reschedule requests.
Please note the bindings API has not changed to make full use of the
new requests. I'm leaving it to the bindings maintainers to decide
how best to do that.

Clear record.duplicate when a recording is queued for deletion. It
should have been this way all along and avoids a redundant reschedule
when the file is actually unlinked.

Lower the very detailed scheduler placement logging to LOG_DEBUG.
This makes VB_SCHEDULE/LOG_INFO more useful for less voluminous
purposes.

Add findid to the recordmatch table. This avoids having to calculate
it in multiple places.

Tighten the window for rescheduling long running programs from "24
hours ago" to "~8 hours ago".

Avoid creating unnecessary RecordingInfo objects that are going to get
deleted later anyway.

Fixes #10533

comment:2 Changed 12 years ago by gigem

Resolution: Fixed
Status: new → closed

Fixed in [cbb8eb1e].
