Opened 11 years ago

Closed 11 years ago

Last modified 10 years ago

#5268 closed defect (fixed)

mythtv-backend seems to go into endless loop, pegs cpu

Reported by: eharris@… Owned by: Anduin Withers
Priority: major Milestone: unknown
Component: mythtv Version: 0.21-fixes
Severity: medium Keywords: backend reschedule loop SCHEDULE_CHANGE
Cc: Ticket locked: no

Description

For some reason I haven't been able to narrow down yet, mythtv-backend sometimes seems to go into a fairly hard loop and peg the cpu until it is restarted.

The backend still seems to be servicing other requests, as frontends using it seem to be able to operate, but are slower apparently due to the load on the backend.

As an additional piece of info, this backend currently has no encoder cards in it, it is merely a storage backend for players, so I'm not sure why it appears to be continually calling for a reschedule.

Here's a small snippet of output of "strace -s 64 -t" of the backend process in this state:

14:38:19 write(14, "45      BACKEND_MESSAGE[]:[]SCHEDULE_CHANGE[]:[]empty", 53) = 53
14:38:19 select(15, [14], NULL, NULL, {0, 0}) = 0 (Timeout)
14:38:19 ioctl(14, FIONREAD, [0])       = 0
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(19, "45      BACKEND_MESSAGE[]:[]SCHEDULE_CHANGE[]:[]empty", 53) = 53
14:38:19 select(20, [19], NULL, NULL, {0, 0}) = 0 (Timeout)
14:38:19 ioctl(19, FIONREAD, [0])       = 0
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(14, "45      BACKEND_MESSAGE[]:[]SCHEDULE_CHANGE[]:[]empty", 53) = 53
14:38:19 select(15, [14], NULL, NULL, {0, 0}) = 0 (Timeout)
14:38:19 ioctl(14, FIONREAD, [0])       = 0
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(19, "45      BACKEND_MESSAGE[]:[]SCHEDULE_CHANGE[]:[]empty", 53) = 53
14:38:19 select(20, [19], NULL, NULL, {0, 0}) = 0 (Timeout)
14:38:19 ioctl(19, FIONREAD, [0])       = 0
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(14, "45      BACKEND_MESSAGE[]:[]SCHEDULE_CHANGE[]:[]empty", 53) = 53
14:38:19 select(15, [14], NULL, NULL, {0, 0}) = 0 (Timeout)
14:38:19 ioctl(14, FIONREAD, [0])       = 0
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(19, "45      BACKEND_MESSAGE[]:[]SCHEDULE_CHANGE[]:[]empty", 53) = 53
14:38:19 select(20, [19], NULL, NULL, {0, 0}) = 0 (Timeout)
14:38:19 ioctl(19, FIONREAD, [0])       = 0
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(14, "45      BACKEND_MESSAGE[]:[]SCHEDULE_CHANGE[]:[]empty", 53) = 53
14:38:19 select(15, [14], NULL, NULL, {0, 0}) = 0 (Timeout)
14:38:19 ioctl(14, FIONREAD, [0])       = 0
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(19, "45      BACKEND_MESSAGE[]:[]SCHEDULE_CHANGE[]:[]empty", 53) = 53
14:38:19 select(20, [19], NULL, NULL, {0, 0}) = 0 (Timeout)
14:38:19 ioctl(19, FIONREAD, [0])       = 0
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(13, "0", 1)              = 1

Here's version info:

MythTV Version   : 16473M
MythTV Branch    : branches/release-0-21-fixes
Library API      : 0.21.20080304-1
Network Protocol : 40
Options compiled in:
 linux release using_oss using_alsa using_arts using_jack using_backend using_dbox2 using_directfb using_dvb using_firewire using_frontend using_hdhomerun using_iptv using_ivtv using_joystick_menu using_lirc using_opengl_vsync using_v4l using_x11 using_xrandr using_xv using_xvmc using_xvmcw using_xvmc_vld using_bindings_perl using_bindings_python using_opengl using_ffmpeg_threads using_libavc_5_3 using_live

Attachments (2)

mythcontext-endless-loop.diff (1.1 KB) - added by Shane Shrybman <gnome42@…> 11 years ago.
mythcontext endless loop debug
5757-ick.patch (727 bytes) - added by Anduin Withers 11 years ago.

Download all attachments as: .zip

Change History (13)

comment:1 Changed 11 years ago by Joe Ripley <vitaminjoe@…>

Is this ticket worth investigating? The reporter mentions that the backend doesn't have any tuner cards installed or configured. According to historical discussion (http://www.gossamer-threads.com/lists/mythtv/users/310919#310919) this is an unsupported feature (i.e. tunerless backend).

-- Joe Ripley vitaminjoe@…

comment:2 in reply to:  1 Changed 11 years ago by anonymous

Replying to Joe Ripley <vitaminjoe@gmail.com>:

Is this ticket worth investigating? The reporter mentions that the backend doesn't have any tuner cards installed or configured. According to historical discussion

How do you know the problem has anything at all to do with the lack of tuners before investigating it? Now if it was investigated and determined that that is the sole cause of the problem, then assigning the bug a "wontfix" status may be understandable/justified. But it needs to be investigated before that can be determined.

comment:3 Changed 11 years ago by paulh

Status: newinfoneeded_new

If you could get a backtrace from mythbackend when it gets stuck that would help to find the problem.

http://www.mythtv.org/docs/mythtv-HOWTO-22.html#ss22.2

Press CTRL-C to capture the backtrace when the BE appears to get stuck.

comment:4 Changed 11 years ago by Shane Shrybman <gnome42@…>

I saw a problem with similar symptoms, hard loop and repeating messages going around. Here's a small debug patch that should show if it is a similar problem of the backend sending messages to itself.

Changed 11 years ago by Shane Shrybman <gnome42@…>

mythcontext endless loop debug

comment:5 Changed 11 years ago by stuartm

Resolution: invalid
Status: infoneeded_newclosed

Requested information not provided.

comment:6 Changed 11 years ago by Anduin Withers

Owner: changed from Isaac Richards to Anduin Withers

comment:7 Changed 11 years ago by Anduin Withers

Resolution: invalid
Status: closednew

comment:8 Changed 11 years ago by Anduin Withers

Status: newassigned

Changed 11 years ago by Anduin Withers

Attachment: 5757-ick.patch added

comment:9 Changed 11 years ago by Anduin Withers

I’ve attached a patch which should work around one possible route to this. It is currently specific to a slave backend being the cause and will not be committed until I find a better fix.

What can happen:

MBE: scheduler posts a MythEvent?("SCHEDULE_CHANGE")

Loop: MBE: MainServer::customEvent() dispatches MythEvent?("SCHEDULE_CHANGE") as BACKEND_MESSAGE to anyone listening. SBE: Receives BACKEND_MESSAGE and resurrects MythEvent?("SCHEDULE_CHANGE") SBE: MainServer::customEvent() dispatches MythEvent?("SCHEDULE_CHANGE") as BACKEND_MESSAGE to anyone listening (including the connection to the master backend) see Loop

comment:10 Changed 11 years ago by Anduin Withers

Resolution: fixed
Status: assignedclosed

(In [18792]) Closes #5268, #5319

  • Allows for checkTimeZone() to not establish a connection to the MBE.
  • Changes the way checkTimeZone() works for SBEs
    • Create a dummy Monitor connection to the MBE which doesn't accept events and is closed after checkTimeZone() is done.
    • Only perform checkTimeZone() if the MBE is running, previously it enforced a startup order we've never done before.
  • Some minor cleanup I've had sitting around.

What I didn't do, make the tz checking happen on each SBE reconnect, doing so would now be easy but seems a little overboard.

Note: This doesn't fix the issue originally reported in #5268. I haven't found the other cause but it is likely the same issue (some function in MythContext being called that establishes an event connection to the MBE).

comment:11 Changed 10 years ago by eharris@…

In case it is helpful, I've narrowed down what seems to be the cause of the problem. Whenever the backend actually deletes a recording due to the backend getting below the minimum free space limit, shortly thereafter the system goes into the highly loaded continuous issuance of SCHEDULE_CHANGE messages. Just to be clear, this is NOT caused when a program is marked as deleted, only after the video file is actually deleted. One additional nuance is that while the load increases significantly (until being "fixed" by restarting the backend) with a single recording being deleted, it usually takes two deletions (on my system) before the load goes high enough to be user-visible and cause problems (video skips, delayed responsiveness, etc). Three or more deletions usually makes the system practically unresponsive, so the load generating effect seems to be cumulative.

I can't tell for sure from trac or the changelog, but has this fix made it into the 0.21-fixes branch? If so, I will upgrade my system to a later version that includes the fix to see if it resolves the issue I am seeing.

Note: See TracTickets for help on using tickets.