Opened 14 years ago

Closed 13 years ago

#3113 closed enhancement (fixed)

HouseKeeper - Add cleanup of recordedmarkup and recordedseek

Reported by: sphery <mtdean@…> Owned by: cpinkham
Priority: minor Milestone: unknown
Component: mythtv Version: head
Severity: medium Keywords:
Cc: Ticket locked: no

Description

The attached patch (inspired by gnome42) adds code to clean up orphaned records from recordedmarkup and recordedseek tables in addition to the existing code that cleans recordedprogram, recordedrating, and recordedcredits. Several recent posts on the lists seem to indicate concerns about orphaned records in these tables. Although it's unlikely that users will have orphaned records from more than a few recordings, there will be > 7200 records in recordedseek for a single one-hour MPEG-2 recording with 15 frames/GOP, so it's probably worthwhile to clean them up.

Although the existing code would have worked for cleaning up the two new tables by simply adding the table names to tables[], performing a JOIN on recordedseek is not acceptable--my recordedseek table has > 2.8M records (for ~400 MPEG-2 recordings totaling ~376 hours) and MySQL takes nearly 7 minutes to do the JOIN (when not recording/database is otherwise idle). However, since it's extremely quick (< 1 sec) to find the DISTINCT chanid's and starttime's in the table (resulting in approximately the same number of rows as are in recorded), the patch creates a temporary table with just this info (takes ~3 sec to copy the data to a temporary table) and JOINs the temporary table with recorded (the JOIN takes less than 1 sec). The entire process required to find orphaned records in recordedseek takes less than 4 secs on my system.

The temporary table is only really necessary for the recordedseek table, but it seemed more elegant to use it for all the tables. If you prefer, I can instead create a temporary table for recordedseek before the loop and then use 2 arrays (one specifying the table to use for the JOIN and the other specifying the table to use for the DELETE), where the arrays are identical except for recordedseek (one array will specify "recordedseek" and the other will specify "temprecordedseek"). I decided against this approach because of the need to keep the two arrays in sync--and the potential badness that could occur if they fall out of sync.

Attachments (3)

mythtv-CleanupRecordedTables-recordedmarkup_and_recordedseek.patch (2.5 KB) - added by sphery <mtdean@…> 14 years ago.
mythtv-CleanupRecordedTables-recordedmarkup_and_recordedseek-20070221.patch (2.8 KB) - added by sphery <mtdean@…> 14 years ago.
Drop the temp table when done since we're using connection pooling; otherwise we end up with a bunch of little temprecordedcleanup tables. Replaces mythtv-CleanupRecordedTables?-recordedmarkup_and_recordedseek.patch .
mythtv-CleanupRecordedTables-recordedmarkup_and_recordedseek-20070307.patch (3.6 KB) - added by sphery <mtdean@…> 14 years ago.
Updated patch

Download all attachments as: .zip

Change History (7)

Changed 14 years ago by sphery <mtdean@…>

Changed 14 years ago by sphery <mtdean@…>

Drop the temp table when done since we're using connection pooling; otherwise we end up with a bunch of little temprecordedcleanup tables. Replaces mythtv-CleanupRecordedTables?-recordedmarkup_and_recordedseek.patch .

comment:1 Changed 14 years ago by sphery <mtdean@…>

Please hold off on applying this. The semantics of starttime are different in recordedmarkup and recordedseek compared to recordedprogram, recordedcredits, and recordedrating. In recorded{markup,seek}, starttime is recorded.starttime; whereas, in recorded{program,credits,rating}, starttime is recorded.progstart. I'll re-work the patch when I return home next weekend.

comment:2 Changed 14 years ago by cpinkham

Owner: changed from Isaac Richards to cpinkham

Changed 14 years ago by sphery <mtdean@…>

Updated patch

comment:3 Changed 14 years ago by sphery <mtdean@…>

mythtv-CleanupRecordedTables?-recordedmarkup_and_recordedseek-20070307.patch is an updated patch that handles the different meaning of starttime in recorded{markup,seek}. It does so by specifying both tablenames and join columns in the tables array.

If you would prefer leaving HouseKeeper::CleanupRecordedTables?() as is and adding a different function for cleaning recorded{markup,seek} (or if you have a cleaner approach for adding the new tables to the existing function/specifying the join column), please let me know and I'll code it that way. Really, the only difference between the existing code and the new code provided by the patch is adding two new tables to the array and specifying a join column. All the other code is "scalability" changes to handle the recordedseek table (which can have millions of rows, so should not be joined with recorded).

comment:4 Changed 13 years ago by cpinkham

Resolution: fixed
Status: newclosed

(In [13570]) Add recodedmarkup and recordedseek tables to the list of tables cleaned up by the HouseKeeper::CleanupRecordedTables?() task. Also rework this method so that it does not directly join against the recorded table since these new tables are very large and a join would lock the recorded table for update for an extended period of time which is undesirable.

Closes #3113 using patch by Michael T. Dean.

Note: See TracTickets for help on using tickets.