Opened 9 years ago

Closed 9 years ago

#8889 closed defect (Works for me)

[Trunk] Runaway mythcommflag can't be stopped

Reported by: databubble Owned by: beirdo
Priority: minor Milestone: 0.24
Component: MythTV - Mythcommflag Version: Master Head
Severity: medium Keywords:
Cc: Ticket locked: no

Description

Trunk 26169:

I have a 1 hour HD-PVR recording which takes hours just searching for logo and seems to generate an endless stream of the following to the mythbackend.log

2010-09-08 15:55:26.183 [h264 @ 0x7f49722a96c0]decode_slice_header error
2010-09-08 15:55:26.260 [h264 @ 0x7f49722a96c0]no frame!
2010-09-08 15:55:26.333 AFD Error: Unknown decoding error
2010-09-08 15:55:26.467 [h264 @ 0x7f49722a96c0]get_buffer() failed (stride changed)

This is with the experimental speed-up enabled (all methods, strict detection.) I'm testing with this disabled now to see if it makes any difference.

The recording plays fine, but even left over-night, commflag never gets past the searching for logo stage.

Further, there doesn't seem to be any way to stop mythcommflag. "Pause" and "Stop" in the job queue does nothing. It's not possible to delete the job from the job queue.... and worse, it's not possible to delete the recording from within mythfrontend as it says it's in use by commercial flagging. Killing the process, or mythbackend, just re-starts the job the next time it comes up.

Now I have a queue of 26 commercial flagging jobs backed up after "Top Gear" and no way to stop the offending job apart from removing the recording from the filesystem.

I can upload a portion of the recording if required, just let me know how much and where.

Attachments (1)

byteswap.patch (898 bytes) - added by beirdo 9 years ago.

Download all attachments as: .zip

Change History (10)

comment:1 Changed 9 years ago by beirdo

Owner: changed from cpinkham to beirdo
Status: newassigned

If you can dig up in the logs the exit codes of the mythcommflag run when you kill it, and provide that, it would be helpful. I see you are already trying with the speed-ups disabled, which is a good step.

If you really want to delete the recording, one thing that has worked for me is to start playing it back, and hit delete during playback. However, it would be good to track down the core of the issue.

If a commflag process is killed, the intent is to set the job to errored, and then it should not rerun it.

The lines in particular I would like to see from the backend log are like (should be only one of them for the PID of the commflag run):

PID xxxxx: exited
PID xxxxx: signal
PID xxxxx: other
Last edited 9 years ago by beirdo (previous) (diff)

comment:2 Changed 9 years ago by databubble

Turning off the experimental speed-up makes no difference. However, I killed the mythcommflag process and it did move on to the next commercial flagging job in the queue (another Top Gear episode recorded from the same channel)...and seems to be working fine for that recording. Here's a snippet of the log at the time the process was killed:

[endless repeats of the same four errors:]
2010-09-08 16:18:33.285 [h264 @ 0x7f20393286c0]decode_slice_header error
2010-09-08 16:18:33.360 [h264 @ 0x7f20393286c0]no frame!
2010-09-08 16:18:33.460 AFD Error: Unknown decoding error
2010-09-08 16:18:33.510 [h264 @ 0x7f20393286c0]get_buffer() failed (stride changed)
2010-09-08 16:18:33.577 [h264 @ 0x7f20393286c0]decode_slice_header error
2010-09-08 16:18:33.660 [h264 @ 0x7f20393286c0]no frame!
2010-09-08 16:18:33.710 AFD Error: Unknown decoding error
2010-09-08 16:18:33.802 [h264 @ 0x7f20393286c0]get_buffer() failed (stride changed)
Terminated
2010-09-08 16:18:34.520 PID 18026: exited: status=36608, result=143
2010-09-08 16:18:34.589 Launching: /usr/bin/mythpreviewgen --size 0x0 --chanid 2087 --starttime 20100907040000  > /dev/null
2010-09-08 16:18:34.688 PID 18215: launched
2010-09-08 16:18:35.030 MainServer::ANN Monitor
2010-09-08 16:18:35.092 adding: family as a client (events: 0)
2010-09-08 16:18:35.201 MainServer::ANN Monitor
2010-09-08 16:18:35.279 adding: family as a client (events: 1)
2010-09-08 16:19:00.410 PID 18215: exited: status=0, result=0
2010-09-08 16:19:00.463 commflag: Commercial Detection Finished: "Top Gear" recorded from channel 2087 at 2010-09-07T04:00:00 (143 commercial break(s))
2010-09-08 16:19:00.617 JobQueue: Commercial Detection Starting for "Top Gear" recorded from channel 2087 at 2010-09-07T05:00:00
2010-09-08 16:19:00.667 commflag: Commercial Detection Starting: "Top Gear" recorded from channel 2087 at 2010-09-07T05:00:00
2010-09-08 16:19:00.749 Launching: /usr/bin/mythcommflag -j 7044 -V 3
2010-09-08 16:19:00.827 PID 18224: launched
2010-09-08 16:19:01.092 Using runtime prefix = /usr
2010-09-08 16:19:01.174 Using configuration directory = /home/mythtv/.mythtv
2010-09-08 16:19:01.279 Using localhost value of family
2010-09-08 16:19:01.413 New DB connection, total: 1
2010-09-08 16:19:01.544 Connected to database 'mythconverg' at host: 192.168.1.110
2010-09-08 16:19:01.624 Closing DB connection named 'DBManager0'
2010-09-08 16:19:01.666 Current locale en_CA
2010-09-08 16:19:01.764 No locale defaults file for en_CA, skipping
2010-09-08 16:19:01.858 Connected to database 'mythconverg' at host: 192.168.1.110
2010-09-08 16:19:01.903 Loading en_us translation for module mythfrontend
2010-09-08 16:19:02.184 MythCoreContext: Connecting to backend server: 192.168.1.110:6543 (try 1 of 1)
2010-09-08 16:19:02.258 Using protocol version 61
2010-09-08 16:19:02.342 MainServer::ANN Monitor
2010-09-08 16:19:02.417 adding: family as a client (events: 0)
2010-09-08 16:19:02.726 MainServer::ANN Monitor
2010-09-08 16:19:02.769 adding: family as a client (events: 1)
2010-09-08 16:19:02.839 CommFlagFast: 0
2010-09-08 16:19:02.929 special_decode: 0
2010-09-08 16:19:03.137 AFD: Opened codec 0x19316c0, id(H264) type(Video)
2010-09-08 16:19:03.209 AFD: codec AC3 has 2 channels
2010-09-08 16:19:03.260 AFD: Opened codec 0x1933830, id(AC3) type(Audio)

comment:3 Changed 9 years ago by beirdo

2010-09-08 16:18:34.520 PID 18026: exited: status=36608, result=143

Wonderful... This is the same nonsense I get, but most people don't seem to get. I'm willing to venture a guess that your platform is very similar to mine... Ubuntu 10.04?

This line should have been:

2010-09-08 16:18:34.520 PID 18026: signal: status=143, result=242, signal=15

I have a horrible hack that I use to fix this, but it's not useful for what seems to be the majority of systems out there, and I haven't gotten enough data yet to determine what is causing this. What seems to be happening: normal exits work fine. Exits by signal (i.e. killed) have the status byte swapped. How it's happening is a mystery to me.

However, at least you can kill... it thinks you found 143 commercial flags, but the job is stopped.

comment:4 Changed 9 years ago by beirdo

Oh, as for the spew of messages... I've only seen this on horribly damaged H.264 files lately. Since the ffmpeg resync, I haven't seen this often if at all.

Changed 9 years ago by beirdo

Attachment: byteswap.patch added

comment:5 Changed 9 years ago by beirdo

The attachment is the patch I use locally such that my box sees signals as signals. This is not likely to ever be committed as such, but it would be good to track down why that part is funky.

comment:6 Changed 9 years ago by databubble

Thanks Beirdo, I'll give it a try and report back (probably Friday). My platform is actually (k)ubuntu 10.10 beta. I know.... glutten for punishment.

comment:7 Changed 9 years ago by databubble

re: spew of messages.... these are recent recordings, post ffmpeg re-sync. I'm seeing them now on two recordings from the HD-PVR. My log file from the last 18 hours is over 31 megabytes. Should I open a separate ticket for this, as the root cause of my "runaway commflag"?

Without the patch but commflagging worked it's way through another dozen recordings, and then the same issue came up on another recording, from yesterday. I've now applied the patch, as we'll see how things go.

2010-09-09 03:42:32.645 [h264 @ 0x7fa0d3f956c0]decode_slice_header error
2010-09-09 03:42:32.720 [h264 @ 0x7fa0d3f956c0]no frame!
2010-09-09 03:42:32.804 AFD Error: Unknown decoding error
2010-09-09 03:42:32.879 [h264 @ 0x7fa0d3f956c0]get_buffer() failed (stride changed)
Terminated
2010-09-09 03:42:33.630 PID 20413: exited: status=36608, result=143
2010-09-09 03:42:33.705 Launching: /usr/bin/mythpreviewgen --size 0x0 --chanid 2053 --starttime 20100908143000  > /dev/null
2010-09-09 03:42:33.757 PID 26699: launched
2010-09-09 03:42:34.370 MainServer::ANN Monitor
2010-09-09 03:42:34.512 adding: family as a client (events: 0)
2010-09-09 03:42:34.563 MainServer::ANN Monitor
2010-09-09 03:42:34.646 adding: family as a client (events: 1)
2010-09-09 03:42:54.077 PID 26699: exited: status=0, result=0
2010-09-09 03:42:54.128 commflag: Commercial Detection Finished: "Teen Cribs" recorded from channel 2053 at 2010-09-08T14:30:00 (143 commercial break(s))

comment:8 Changed 9 years ago by beirdo

That doesn't look like the patch is applied to me. Looks the same as before (maybe you forgot to restart the backend?)

Also: every time I have seen those errors in the past, the recording has been corrupted. It could be as simple as your HD-PVR may be wanting a power cycle (i.e. unplug it, wait 10s, plug it back in). They tend to be finnicky at times. Do you notice any noticable issues with the files (video glitches, etc) on playback? If the file's corrupt, there's not much we can do about it. You can always locally hack your code to disable those log messages from being printed, but that's about it.

comment:9 Changed 9 years ago by beirdo

Milestone: unknown0.24
Resolution: Works for me
Status: assignedclosed

As this seems to be the result of a pathologically bad file, I don't think there's anything else we can do here. I'm closing this.

Note: See TracTickets for help on using tickets.