Modify

Ticket #5268 (closed defect: fixed)

Opened 4 years ago

Last modified 3 years ago

mythtv-backend seems to go into endless loop, pegs cpu

Reported by: eharris@… Owned by: awithers
Priority: major Milestone: unknown
Component: mythtv Version: 0.21-fixes
Severity: medium Keywords: backend reschedule loop SCHEDULE_CHANGE
Cc: Ticket locked: no

Description

For some reason I haven't been able to narrow down yet, mythtv-backend sometimes seems to go into a fairly hard loop and peg the cpu until it is restarted.

The backend still seems to be servicing other requests, as frontends using it seem to be able to operate, but are slower apparently due to the load on the backend.

As an additional piece of info, this backend currently has no encoder cards in it, it is merely a storage backend for players, so I'm not sure why it appears to be continually calling for a reschedule.

Here's a small snippet of output of "strace -s 64 -t" of the backend process in this state:

14:38:19 write(14, "45      BACKEND_MESSAGE[]:[]SCHEDULE_CHANGE[]:[]empty", 53) = 53
14:38:19 select(15, [14], NULL, NULL, {0, 0}) = 0 (Timeout)
14:38:19 ioctl(14, FIONREAD, [0])       = 0
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(19, "45      BACKEND_MESSAGE[]:[]SCHEDULE_CHANGE[]:[]empty", 53) = 53
14:38:19 select(20, [19], NULL, NULL, {0, 0}) = 0 (Timeout)
14:38:19 ioctl(19, FIONREAD, [0])       = 0
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(14, "45      BACKEND_MESSAGE[]:[]SCHEDULE_CHANGE[]:[]empty", 53) = 53
14:38:19 select(15, [14], NULL, NULL, {0, 0}) = 0 (Timeout)
14:38:19 ioctl(14, FIONREAD, [0])       = 0
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(19, "45      BACKEND_MESSAGE[]:[]SCHEDULE_CHANGE[]:[]empty", 53) = 53
14:38:19 select(20, [19], NULL, NULL, {0, 0}) = 0 (Timeout)
14:38:19 ioctl(19, FIONREAD, [0])       = 0
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(14, "45      BACKEND_MESSAGE[]:[]SCHEDULE_CHANGE[]:[]empty", 53) = 53
14:38:19 select(15, [14], NULL, NULL, {0, 0}) = 0 (Timeout)
14:38:19 ioctl(14, FIONREAD, [0])       = 0
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(19, "45      BACKEND_MESSAGE[]:[]SCHEDULE_CHANGE[]:[]empty", 53) = 53
14:38:19 select(20, [19], NULL, NULL, {0, 0}) = 0 (Timeout)
14:38:19 ioctl(19, FIONREAD, [0])       = 0
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(14, "45      BACKEND_MESSAGE[]:[]SCHEDULE_CHANGE[]:[]empty", 53) = 53
14:38:19 select(15, [14], NULL, NULL, {0, 0}) = 0 (Timeout)
14:38:19 ioctl(14, FIONREAD, [0])       = 0
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(19, "45      BACKEND_MESSAGE[]:[]SCHEDULE_CHANGE[]:[]empty", 53) = 53
14:38:19 select(20, [19], NULL, NULL, {0, 0}) = 0 (Timeout)
14:38:19 ioctl(19, FIONREAD, [0])       = 0
14:38:19 write(13, "0", 1)              = 1
14:38:19 write(13, "0", 1)              = 1

Here's version info:

MythTV Version   : 16473M
MythTV Branch    : branches/release-0-21-fixes
Library API      : 0.21.20080304-1
Network Protocol : 40
Options compiled in:
 linux release using_oss using_alsa using_arts using_jack using_backend using_dbox2 using_directfb using_dvb using_firewire using_frontend using_hdhomerun using_iptv using_ivtv using_joystick_menu using_lirc using_opengl_vsync using_v4l using_x11 using_xrandr using_xv using_xvmc using_xvmcw using_xvmc_vld using_bindings_perl using_bindings_python using_opengl using_ffmpeg_threads using_libavc_5_3 using_live

Attachments

mythcontext-endless-loop.diff (1.1 KB) - added by Shane Shrybman <gnome42@…> 4 years ago.
mythcontext endless loop debug
5757-ick.patch (727 bytes) - added by awithers 4 years ago.

Change History

comment:1 follow-up: ↓ 2 Changed 4 years ago by Joe Ripley <vitaminjoe@…>

Is this ticket worth investigating? The reporter mentions that the backend doesn't have any tuner cards installed or configured. According to historical discussion (http://www.gossamer-threads.com/lists/mythtv/users/310919#310919) this is an unsupported feature (i.e. tunerless backend).

-- Joe Ripley vitaminjoe@…

comment:2 in reply to: ↑ 1 Changed 4 years ago by anonymous

Replying to Joe Ripley <vitaminjoe@gmail.com>:

Is this ticket worth investigating? The reporter mentions that the backend doesn't have any tuner cards installed or configured. According to historical discussion

How do you know the problem has anything at all to do with the lack of tuners before investigating it? Now if it was investigated and determined that that is the sole cause of the problem, then assigning the bug a "wontfix" status may be understandable/justified. But it needs to be investigated before that can be determined.

comment:3 Changed 4 years ago by paulh

  • Status changed from new to infoneeded_new

If you could get a backtrace from mythbackend when it gets stuck that would help to find the problem.

http://www.mythtv.org/docs/mythtv-HOWTO-22.html#ss22.2

Press CTRL-C to capture the backtrace when the BE appears to get stuck.

comment:4 Changed 4 years ago by Shane Shrybman <gnome42@…>

I saw a problem with similar symptoms, hard loop and repeating messages going around. Here's a small debug patch that should show if it is a similar problem of the backend sending messages to itself.

Changed 4 years ago by Shane Shrybman <gnome42@…>

mythcontext endless loop debug

comment:5 Changed 4 years ago by stuartm

  • Status changed from infoneeded_new to closed
  • Resolution set to invalid

Requested information not provided.

comment:6 Changed 4 years ago by awithers

  • Owner changed from ijr to awithers

comment:7 Changed 4 years ago by awithers

  • Status changed from closed to new
  • Resolution invalid deleted

comment:8 Changed 4 years ago by awithers

  • Status changed from new to assigned

Changed 4 years ago by awithers

comment:9 Changed 4 years ago by awithers

I’ve attached a patch which should work around one possible route to this. It is currently specific to a slave backend being the cause and will not be committed until I find a better fix.

What can happen:

MBE: scheduler posts a MythEvent?("SCHEDULE_CHANGE")

Loop: MBE: MainServer::customEvent() dispatches MythEvent?("SCHEDULE_CHANGE") as BACKEND_MESSAGE to anyone listening. SBE: Receives BACKEND_MESSAGE and resurrects MythEvent?("SCHEDULE_CHANGE") SBE: MainServer::customEvent() dispatches MythEvent?("SCHEDULE_CHANGE") as BACKEND_MESSAGE to anyone listening (including the connection to the master backend) see Loop

comment:10 Changed 4 years ago by awithers

  • Status changed from assigned to closed
  • Resolution set to fixed

(In [18792]) Closes #5268, #5319

  • Allows for checkTimeZone() to not establish a connection to the MBE.
  • Changes the way checkTimeZone() works for SBEs
    • Create a dummy Monitor connection to the MBE which doesn't accept events and is closed after checkTimeZone() is done.
    • Only perform checkTimeZone() if the MBE is running, previously it enforced a startup order we've never done before.
  • Some minor cleanup I've had sitting around.

What I didn't do, make the tz checking happen on each SBE reconnect, doing so would now be easy but seems a little overboard.

Note: This doesn't fix the issue originally reported in #5268. I haven't found the other cause but it is likely the same issue (some function in MythContext being called that establishes an event connection to the MBE).

comment:11 Changed 3 years ago by eharris@…

In case it is helpful, I've narrowed down what seems to be the cause of the problem. Whenever the backend actually deletes a recording due to the backend getting below the minimum free space limit, shortly thereafter the system goes into the highly loaded continuous issuance of SCHEDULE_CHANGE messages. Just to be clear, this is NOT caused when a program is marked as deleted, only after the video file is actually deleted. One additional nuance is that while the load increases significantly (until being "fixed" by restarting the backend) with a single recording being deleted, it usually takes two deletions (on my system) before the load goes high enough to be user-visible and cause problems (video skips, delayed responsiveness, etc). Three or more deletions usually makes the system practically unresponsive, so the load generating effect seems to be cumulative.

I can't tell for sure from trac or the changelog, but has this fix made it into the 0.21-fixes branch? If so, I will upgrade my system to a later version that includes the fix to see if it resolves the issue I am seeing.

View

Add a comment

Modify Ticket

Action
as closed
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.