Opened 10 years ago

Closed 9 years ago

#8199 closed defect (invalid)

backend lockup on recording expiry

Reported by: michael_zanetti@… Owned by: danielk
Priority: trivial Milestone: unknown
Component: MythTV - Recording Version: 0.23-fixes
Severity: medium Keywords: backend lockup deadlock
Cc: Ticket locked: yes

Description

Since the upgrade to 0.22 my backend keeps locking up on expiring old shows. This happens about once a day if my disk space is filled up and myth needs to expire recordings constantly. Mostly it happens when mythtv should start a new recording and tries to expire an old one to free enough disk space.

If my disk has still plenty of space left, this doesn't happen really often but still happens. In this case it happens most often when I exit LiveTV and (I guess) the currently watched show is about to expire immediately.

When the lockup happens, the frontend times out on the next action it would require backend connectivity.

The backend process still keeps running. It is just not responding to any requests any more and stops logging. I have to restart it manually to resume normal operation.

Attachments (5)

mythbackend.log.1 (1.7 KB) - added by michael_zanetti@… 10 years ago.
backend lockup on exit from live tv
mythbackend.log.2 (11.0 KB) - added by michael_zanetti@… 10 years ago.
backend lockup on autoexpiry event
mythbackend.log.3 (11.0 KB) - added by michael_zanetti@… 10 years ago.
backend lockup
mythbackend.log.4 (25.3 KB) - added by michael_zanetti@… 10 years ago.
And again... This time the log contains some warnings about "Waiting for a process request thread..." Maybe this has to something with the lockup?
mythbackend-lockup.log (27.0 KB) - added by michael_zanetti@… 9 years ago.

Download all attachments as: .zip

Change History (23)

comment:1 Changed 10 years ago by Dibblah

Milestone: 0.23unknown
Priority: criticaltrivial

Please read the ticket howto before creating a ticket.

comment:2 Changed 10 years ago by Dibblah

Status: newinfoneeded_new

Please post the logs of the backend and check the end of dmesg for any related issues (running out of RAM, etc)

Changed 10 years ago by michael_zanetti@…

Attachment: mythbackend.log.1 added

backend lockup on exit from live tv

Changed 10 years ago by michael_zanetti@…

Attachment: mythbackend.log.2 added

backend lockup on autoexpiry event

comment:3 Changed 10 years ago by anonymous

I check the backend log on nearly every lockup. So far I can tell you that there is nothing suspicious in there. But I ran the backend with the default loglevel. The backend just stops logging at some point. Sometimes the logs stop completely after the hangup. Sometimes there are still some messages logged.

I have attached two example logs:

mythbackend.log.1 shows the last lockup I had. It happened directly after exiting LiveTV. mythbackend.log.2 shows a somewhat older failure. It happened on an autoexpire event

Please note that in mythbackend.log.2 there are still some frontend connect events from a frontend logged. Those entries happened _after_ the backend locked up. The entries are there because I started and killed the frontend a few times, which for obvious reasons hung too. The last entry while the backend was working was the autoexpire one.

Those logs are created using the default log level. I just have reconfigured my backend to log "all" messages and I will post future lockups with hopefully more detailed informations.

comment:4 in reply to:  1 Changed 10 years ago by anonymous

Replying to Dibblah:

Please read the ticket howto before creating a ticket.

Sorry for that... Won't change priority etc again.

comment:5 Changed 10 years ago by michael_zanetti@…

It happened again... The machine woke up itself to start a recording and locked up. When starting the fronend I still can see the logs "Adding xxx as a client" but I'm unable to enter the watch recordings screen or live TV. The recording that should be started is nowhere. I need to restart the backend to resume operation including the self-shutdown of the backend in idle state. Otherwise it just would sit here for ages accepting frontend connections but doing nothing else.

Log: 2010-04-07 09:07:02.789 mythbackend version: branches/release-0-22-fixes [Unknown] www.mythtv.org 2010-04-07 09:07:02.844 Using runtime prefix = /usr 2010-04-07 09:07:02.845 Using configuration directory = /home/mythtv/.mythtv 2010-04-07 09:07:02.908 Empty LocalHostName?. 2010-04-07 09:07:02.909 Using localhost value of Knuckles 2010-04-07 09:07:02.938 Configuration::Load - Error parsing: /home/mythtv/.mythtv/config.xml at line: 1 column: 1 2010-04-07 09:07:02.939 Configuration::Load - Error Msg: unexpected end of file 2010-04-07 09:07:03.273 New DB connection, total: 1 2010-04-07 09:07:03.480 Connected to database 'mythconverg' at host: localhost 2010-04-07 09:07:03.509 Closing DB connection named 'DBManager0' 2010-04-07 09:07:03.510 Connected to database 'mythconverg' at host: localhost 2010-04-07 09:07:03.658 Current MythTV Schema Version (DBSchemaVer): 1244 2010-04-07 09:07:03.667 MythBackend?: Starting up as the master server. 2010-04-07 09:07:03.877 New DB connection, total: 2 2010-04-07 09:07:03.882 Connected to database 'mythconverg' at host: localhost 2010-04-07 09:07:03.928 New DB connection, total: 3 2010-04-07 09:07:03.931 Connected to database 'mythconverg' at host: localhost 2010-04-07 09:07:05.836 New DB scheduler connection 2010-04-07 09:07:06.034 Connected to database 'mythconverg' at host: localhost 2010-04-07 09:07:06.916 Enabling Upnpmedia rebuild thread. 2010-04-07 09:07:08.589 Main::Registering HttpStatus? Extension 2010-04-07 09:07:08.589 Enabled verbose msgs: important general 2010-04-07 09:07:08.636 AutoExpire?: CalcParams?(): Max required Free Space: 5.0 GB w/freq: 15 min 2010-04-07 09:07:09.326 Reschedule requested for id -1. 2010-04-07 09:07:12.735 MainServer::ANN Monitor 2010-04-07 09:07:12.735 adding: Knuckles as a client (events: 0) 2010-04-07 09:07:12.736 MainServer::ANN Monitor 2010-04-07 09:07:12.737 adding: Knuckles as a client (events: 1) 2010-04-07 09:07:16.559 Scheduled 399 items in 7.2 = 0.66 match + 6.53 place 2010-04-07 09:07:16.564 AUTO-Startup assumed 2010-04-07 09:07:18.007 UPnpMedia: BuildMediaMap? VIDEO scan starting in :/mnt/data/Videos: 2010-04-07 09:07:19.288 UPnpMedia: BuildMediaMap? Done. Found 426 objects 2010-04-07 09:08:00.648 Reschedule requested for id 0. 2010-04-07 09:08:12.870 Scheduled 399 items in 12.0 = 0.01 match + 12.04 place 2010-04-07 09:08:30.468 AutoExpire?: CalcParams?(): Max required Free Space: 5.0 GB w/freq: 15 min 2010-04-07 09:08:31.006 TVRec(4): ASK_RECORDING 4 29 0 0 2010-04-07 09:08:31.032 TVRec(5): ASK_RECORDING 5 29 0 0 2010-04-07 09:08:31.070 TVRec(6): ASK_RECORDING 6 29 0 0 2010-04-07 09:09:01.824 TVRec(4): Changing from None to Watching RecordingOnly? 2010-04-07 09:09:01.827 TVRec(4): HW Tuner: 4->4 2010-04-07 09:09:01.949 AutoExpire?: CalcParams?(): Max required Free Space: 7.0 GB w/freq: 15 min 2010-04-07 09:09:01.965 Started recording: Sturm der Liebe: channel 32107 on cardid 4, sourceid 4 2010-04-07 09:09:31.632 New DB connection, total: 4 2010-04-07 09:09:31.633 Connected to database 'mythconverg' at host: localhost 2010-04-07 09:12:21.400 New DB connection, total: 5 2010-04-07 09:12:21.401 Connected to database 'mythconverg' at host: localhost 2010-04-07 09:19:06.758 MainServer::ANN Playback 2010-04-07 09:19:06.759 adding: Knuckles as a client (events: 0) 2010-04-07 09:19:06.760 MainServer::ANN Monitor 2010-04-07 09:19:06.760 adding: Knuckles as a client (events: 1) 2010-04-07 09:19:11.057 New DB connection, total: 6 2010-04-07 09:19:11.059 Connected to database 'mythconverg' at host: localhost 2010-04-07 09:19:41.063 MainServer::ANN Playback 2010-04-07 09:19:41.063 adding: Knuckles as a client (events: 0) 2010-04-07 09:20:33.578 MainServer::ANN Playback 2010-04-07 09:20:33.578 adding: Knuckles as a client (events: 0) 2010-04-07 09:21:03.607 MainServer::ANN Playback 2010-04-07 09:21:03.607 adding: Knuckles as a client (events: 0) 2010-04-07 09:21:33.622 MainServer::ANN Playback 2010-04-07 09:21:33.622 adding: Knuckles as a client (events: 0)

Here is a shot of "top" just to prove that there are now large I/O waits pending:

top - 09:28:38 up 22 min, 1 user, load average: 0.14, 0.11, 0.12 Tasks: 116 total, 1 running, 115 sleeping, 0 stopped, 0 zombie Cpu(s): 1.3%us, 0.7%sy, 0.0%ni, 97.7%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st Mem: 1539664k total, 863080k used, 676584k free, 12188k buffers Swap: 987988k total, 0k used, 987988k free, 542044k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

1871 mythtv 20 0 404m 83m 26m S 0.7 5.5 0:09.56 mythwelcome 1481 root 20 0 24192 1268 1076 S 0.3 0.1 0:00.42 hald-addon-stor 2910 root 20 0 10680 1176 876 R 0.3 0.1 0:00.02 top

1 root 20 0 3820 632 532 S 0.0 0.0 0:00.19 init 2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd 3 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0 4 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0 5 root RT 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/0 6 root 20 0 0 0 0 S 0.0 0.0 0:00.00 events/0 7 root 20 0 0 0 0 S 0.0 0.0 0:00.00 khelper 8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 async/mgr 9 root 20 0 0 0 0 S 0.0 0.0 0:00.00 pm

10 root 20 0 0 0 0 S 0.0 0.0 0:00.00 sync_supers 11 root 20 0 0 0 0 S 0.0 0.0 0:00.00 bdi-default 12 root 20 0 0 0 0 S 0.0 0.0 0:00.08 kblockd/0 13 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kacpid 14 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kacpi_notify 15 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kacpi_hotplug 16 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kseriod

Changed 10 years ago by michael_zanetti@…

Attachment: mythbackend.log.3 added

backend lockup

Changed 10 years ago by michael_zanetti@…

Attachment: mythbackend.log.4 added

And again... This time the log contains some warnings about "Waiting for a process request thread..." Maybe this has to something with the lockup?

comment:6 Changed 10 years ago by robertm

Status: infoneeded_newnew

comment:7 Changed 10 years ago by sphery

Status: newinfoneeded_new

There's nothing useful in the logs. Please see http://www.gossamer-threads.com/lists/mythtv/dev/427721#427721 . If this still happens regardless of whether you've set "Delete files slowly" or not, please upgrade to 0.23-fixes and see if it still happens.

If it still happens with 0.23-fixes, we're going to need a backtrace as described at http://www.mythtv.org/wiki/Debugging . Thanks.

comment:8 Changed 10 years ago by anonymous

I actually had replied to this mail. Seems the list moderator must have missed it as I'm not subscribed to the dev lists. Anyways, here is the content of that reply:


I had "Delete files slowly" enabled before. While hunting this issue I turned it off. It makes no difference.

I doubt it's only I/O waits, as sometimes happens a lot more on this machine without any problem. The lockups appears also when the system is idle. I have also waited for a long time before restarting the backend, reading logs etc... I believe I would have noticed such large I/O waits. Sometimes, when I come home and don't turn the TV on, I notice the lockup only because the machine is running and the HDD LED isn't doing anything. Usually the machine shuts itself down when idle.


I will try to upgrade to 0.23 and report back.

comment:9 Changed 10 years ago by dude71@…

Same thing here... I'm on mythtv 0.23.0+fixes24486, backend locks up either on expiring livetv or on starting second recording on second dvb card. Using EXT/4 with slow delete (also tried without; same result). In log file I sometimes get "waiting for a process" or "waiting for threads" but usually nothing special when this happens. On the frontend I cannot get into the recordings screen. It just waits and after that displays empty recordings screen. If I restart the backend process everything works fine again.

This is by far the biggest issue i have with the 0.23 version. Hope this gets resolved soon.

Thanks and keep up the good work, The Dude

comment:10 Changed 9 years ago by danielk

Resolution: invalid
Status: infoneeded_newclosed

The i/o waits indicate an OS problem. ext2/3 are known to have serious problems handling large files. And ext4 is known to have serious problems as a system disk. Absent further evidence I'm assuming you are using one of these filesystems in an application for which it is not suited and am closing as invalid.

comment:11 in reply to:  10 Changed 9 years ago by michael_zanetti@…

Resolution: invalid
Status: closednew

Replying to danielk:

The i/o waits indicate an OS problem. ext2/3 are known to have serious problems handling large files. And ext4 is known to have serious problems as a system disk. Absent further evidence I'm assuming you are using one of these filesystems in an application for which it is not suited and am closing as invalid.

There are no i/o waits... It happens also when the system is idle.

Nevertheless, I have changed my recording partitions filesystems to XFS as suggested by the wiki. The problem still persists.

Btw. I have upgraded to 0.23-fixes in the meantime... Still the same.

Today, when it locked up again I had a look at the logs and hope I have a more detailed description now. Please see the attached log "mythbackend-lockup.log" in combination with this comments:

  • The backend should have started to record a show starting at 15:10. I have set it up to start 3 minutes earlier for every show plus 3 minutes for this specific one. So the actual start should have been at 15:04. The machine woke up on RTC alarm and freshly bootet up from poweroff. At 15:02:02.493 the backend came alive for the first time.
  • The recording was successfully started. However, it is only 4 seconds long!!
  • Then, at 15:18:36.054 the backend seems to have restarted itself. I haven't been at home so no one could have restarted the backend. Is this possible? I haven't set up a watchdog to do such tasks.
  • At 15:18:38.840, there is again this message. "~MythContext waiting for threads to exit." After that there isn't much activity from the backend any more. There are still a few messages generated to prove that the process is still alive.
  • The (now locked up) backend missed a recording at 18:00 and another one at 18:25 without any notice in the logs.
  • About 19:30 I turned on TV, started the frontend out of mythwelcome and got to the system information screen. The first page, the one about EPG data, has been shown successfully. When I pressed "down" to go to the schedules-page also the frontend locked up. I first killed the frontend, then restarted also mythwelcome. This produced the few last log entries starting at 19:33.

Then, at 19:35 I restarted the backend. Its running again fine since then.

Changed 9 years ago by michael_zanetti@…

Attachment: mythbackend-lockup.log added

comment:12 Changed 9 years ago by dude71@…

I also changed my recordings partition from EXT4 to XFS. Now on XFS the same thing happens when on EXT4; incomplete recordings, hanging backend process (frontends see empty recordings screen, cannot start livetv). This happens especially when a recording starts on second DVB card while watching livetv on the first. My recordings partition is fine, I can read and write to it no problem (when backend process in bad state) with more than enough free space.

Then I started experimenting with the signal/channel timeout and dvb tuning delay settings. I upped those values (can't give you the exact values right now i'm not at home) and things improved a bit. It still hangs sometimes but less often. Also when I'm watching livetv normally and I exit to the main menu I sometimes can't restart livetv or get in the recordings screen. Again restarting the backend process (on separate server) does the trick. But at least I'm now able to record of multiple cards at once without hanging up the backend process all the time.

Somehow the backend process gets into a state where it doesn't handle frontend connections normally anymore and is not able to record of the DVB cards while the rest of the system is performing normally. Even if the DVB part is to blame somehow, the frontends should be able to get into recordings and stream a recording. Still looks like a bug to me.

Grntx,

comment:13 Changed 9 years ago by stuartm

Status: newinfoneeded_new

The wide range of unrelated symptoms you are reporting, combined with the fact that no-one else is reporting anything similar tends to suggest a problem with your hardware or distro and not MythTV itself. It's rather hard to point to anything specific, bad disk controllers, HDDs that is just too slow, bad stick of RAM, network that is dropping packets or a mis-configured firewall ... there are countless possibilities.

If you want us to continue debugging all the possible MythTV angles, then we need a backtrace of frozen backend process.

http://www.mythtv.org/docs/mythtv-HOWTO-22.html#ss22.2

comment:14 Changed 9 years ago by stuartm

Version: 0.22-fixes0.23-fixes

comment:15 Changed 9 years ago by dude71@…

No one else reporting anything similar ? Michael_zanetti's post sounds pretty similar to me. What i still don't understand is how a bad disk controller only effects the mythbackend process (we are talking RAID-0 2x 7200 sata disks here; not really slow either). I can SSH the backend just fine right after this "lockup" and after restarting the backend process it works normal again so I would exclude network dropping packets as well. I don't use a firewall internally so i'll exclude that as well.

I would say that nothing you mention here or what I could think of would justify any service from getting into an undefined state (not serving recordings or streaming livetv in this case). It's only the mythbackend process that doesn't work as it should the rest of the machine is completely responsive. No errors in syslog/dmesg. Bad RAM stick would show up earlier I guess.. but I'll check that when I've the opportunity.

I understand that it's hard to debug this without a logfile/backtrace. If this problem shows up again I'll try to post one. The line "~MythContext waiting for threads to exit." that Michael_zanetti reports I also have seen. Apart from that nothing special in the mythtv-backend.log.

Grtnx,

comment:16 Changed 9 years ago by stuartm

Right, sorry I didn't notice that there were reports from two different people. In that case which distro are you both using and which kernel?

Two users out of tens of thousands is still a small number, so there has to be something common to both your systems/setup.

comment:17 Changed 9 years ago by dude71@…

I agree 2 isn't much. I'm using Ubuntu 10.04, kernel 2.6.32.22. My DVB setup might be a bit special. I use the S2-liplianin drivers (budget-dvb and Mantis) for my Terratec 1200 DVB-C and TechniSat? CableStar? HD DVB-C cards. On top of that I use opensasc-ng to do the descrambling.

I have to state that when my backend gets into the "locked" state only restarting the mythtv-backend process solves my problems (for a while). Even when somewhere in the DVB chain (sasc-ng -> dvbloopback -> dvb driver) something gets fucked up (without hanging the machine) mythbackend should still be serving recordings i would say. LiveTV or new recordings not working in that specific case I won't blame on mythbackend.

I'm wondering what Michael_zanetti uses.

Cheers,

comment:18 Changed 9 years ago by danielk

Resolution: invalid
Status: infoneeded_newclosed
Ticket locked: set

There are bugs in MythTV, and there are bugs reports against MythTV, but this is not a valid bug report by any measure.

Note: See TracTickets for help on using tickets.