Opened 10 years ago

Closed 10 years ago

Last modified 10 years ago

#6969 closed defect (fixed)

mythbackend hangs after recording delete from mythweb

Reported by: David Asher <asherml@…> Owned by: danielk
Priority: minor Milestone: 0.22
Component: MythTV - General Version: head
Severity: medium Keywords:
Cc: Ticket locked: no

Description (last modified by danielk)

I upgraded to trunk 21644 and now when I delete a recording from mythweb's recorded programs page, the ajax interface says "1 request pending" for a long time (approx 1 minute or so) before finally refreshing. From this point on the backend is non-responsive. I have a backtrace of the backend and the logfile (all, nodatabase, noupnp).

Attachments (7)

gdb.txt (32.1 KB) - added by David Asher <asherml@…> 10 years ago.
backtrace for non-responsive mythbackend
mythbackend.log.gz (54.5 KB) - added by David Asher <asherml@…> 10 years ago.
log from non-responsive backend
gdb.2.txt (32.2 KB) - added by David Asher <asherml@…> 10 years ago.
second backtrace
mythbackend.log.hang2 (142.8 KB) - added by David Asher <asherml@…> 10 years ago.
second logfile
mythfrontend.log.lost (1.8 KB) - added by David Asher <asherml@…> 10 years ago.
frontend log from attempting to reconnect (not real interesting, standard verbosity)
6969-dbg-v1.patch (2.2 KB) - added by danielk 10 years ago.
Debugging patch
mythbackend.log.worked.gz (62.6 KB) - added by David Asher <asherml@…> 10 years ago.
log from 2 successful deletes with v1 debug patch


Change History (17)

Changed 10 years ago by David Asher <asherml@…>

Attachment: gdb.txt added

backtrace for non-responsive mythbackend

Changed 10 years ago by David Asher <asherml@…>

Attachment: mythbackend.log.gz added

log from non-responsive backend

comment:1 Changed 10 years ago by David Asher <asherml@…>

Oh, forgot version information:

$ /usr/bin/mythbackend --version
Please include all output in bug reports.
MythTV Version : exported
MythTV Branch : trunk
Network Protocol : 48
Library API : 0.22.20090829-1
QT Version : 4.5.0
Options compiled in:

linux profile using_oss using_alsa using_pulse using_jack using_backend using_directfb using_dvb using_firewire using_frontend using_glx_proc_addr_arb using_hdhomerun using_hdpvr using_iptv using_ivtv using_joystick_menu using_libfftw3 using_lirc using_mheg using_opengl_video using_opengl_vsync using_qtwebkit using_v4l using_x11 using_xrandr using_xv using_xvmc using_xvmc_vld using_xvmcw using_bindings_perl using_bindings_python using_opengl using_vdpau using_ffmpeg_threads using_libavc_5_3 using_live using_mheg

comment:2 Changed 10 years ago by danielk

Milestone: changed from unknown to 0.22
Owner: changed from Isaac Richards to danielk
Status: changed from new to assigned
Version: changed from unknown to head

comment:3 Changed 10 years ago by danielk

Description: modified (diff)

For some reason we are transmitting a RECORDING_LIST_CHANGE DELETE event to a mythweb socket that has already been disconnected. We do check for disconnected sockets in the transmit code, but that isn't working. The easy workaround is to just add "if (sock->socket() < 0) continue;" on line 917 of mainserver.cpp before the upref, but honestly I think the PlaybackSock should be handled by HandleDone(), because it appears this is a slow memory leak of PlaybackSock classes which will grow with each PlaybackSock connection...
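The workaround mentioned above can be sketched roughly as follows. This is only an illustration of the guard, not the code in mainserver.cpp: `FakeSocket` and `broadcast` are hypothetical stand-ins, and the only detail taken from the comment is the convention that `socket()` returns a negative value once the peer is gone.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a client socket; NOT the real MythSocket class.
struct FakeSocket {
    int fd;                         // negative once the peer has disconnected
    std::vector<std::string> sent;  // messages "written" to this socket
    int socket() const { return fd; }
    void write(const std::string &msg) { sent.push_back(msg); }
};

// Send msg to every socket that is still connected.
// Returns the number of sockets the message was written to.
int broadcast(std::vector<FakeSocket> &socks, const std::string &msg) {
    int delivered = 0;
    for (FakeSocket &sock : socks) {
        if (sock.socket() < 0)
            continue;               // the guard suggested in the comment
        sock.write(msg);
        ++delivered;
    }
    return delivered;
}
```

With three sockets of which one has a negative descriptor, `broadcast` would write to two and leave the disconnected one untouched.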

However, I'm not sure that this socket issue would have caused the backend to become unresponsive. It appears the backend is running after this and nothing is preventing it from accepting new connections. This may have something to do with the mythweb side of things, with which I'm not familiar. I'll fix the backend side of the issue, and if the problem persists I'll pass this ticket on to xris or kormoc.

This may also be related to #6955.

comment:4 Changed 10 years ago by David Asher <asherml@…>

You got me thinking that I never tried accessing the backend from a frontend after it became non-responsive to mythweb. So I just deleted another recording from mythweb. Everything behaved exactly as before, non-responsive to mythweb. In addition, the running frontend lost its connection to the backend and cannot reconnect to it. The backend is definitely off in the weeds. I do note, however, that the jobqueue thread is still merrily looking for jobs to run. The backend logs, however, don't log anything about the new attempts to connect.

I'll attach the log (from the point where I initiated the delete from mythweb) and gdb.txt from this attempt in case it illuminates anything. The backtrace is from AFTER I attempted to reconnect from the frontend.

Changed 10 years ago by David Asher <asherml@…>

Attachment: gdb.2.txt added

second backtrace

Changed 10 years ago by David Asher <asherml@…>

Attachment: mythbackend.log.hang2 added

second logfile

Changed 10 years ago by David Asher <asherml@…>

Attachment: mythfrontend.log.lost added

frontend log from attempting to reconnect (not real interesting, standard verbosity)

Changed 10 years ago by danielk

Attachment: 6969-dbg-v1.patch added

Debugging patch

comment:5 Changed 10 years ago by danielk

Does the attached patch make the deadlock go away?

It is not correct code and will cause some other oddities, but I just want to verify if I'm looking at the right lock for the deadlock bug.

comment:6 Changed 10 years ago by David Asher <asherml@…>

Yep. No deadlock on 2 deletes.

I'll attach the mythbackend.log for the 2 deletes in case it's interesting.

You said it might cause some oddities, should I leave the patch in or revert it for general use?

Changed 10 years ago by David Asher <asherml@…>

Attachment: mythbackend.log.worked.gz added

log from 2 successful deletes with v1 debug patch

comment:7 Changed 10 years ago by danielk

(In [21721]) Refs #6969. Applies the dbg patch from a couple days ago.

This gets rid of the sock->UpRef()/DownRef() on broadcast; this is already done in PlaybackSock, and since we UpRef it a few lines earlier, the MythSocket cannot be deleted while we're here. It can get disconnected, so we add checks to see if it is still connected before doing a writeStringList.

svn commit -m Refs
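The idea in this commit message (a held reference keeps the object alive, but it can still disconnect, so check before writing) might look something like the sketch below. `RefSocket` and `sendIfConnected` are hypothetical names made up for illustration; only `UpRef`, `DownRef` and `writeStringList` echo names from the ticket, and their signatures here are assumptions rather than the real MythSocket API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical reference-counted socket; signatures are assumptions.
class RefSocket {
  public:
    void UpRef() { ++refs_; }
    // Drops one reference; returns true when the last one is gone.
    bool DownRef() {
        assert(refs_ > 0);
        return --refs_ == 0;
    }
    bool isConnected() const { return connected_; }
    void disconnect() { connected_ = false; }
    // Writes only while connected; reports whether the write happened.
    bool writeStringList(const std::vector<std::string> &list) {
        if (!connected_)
            return false;
        written_ += list.size();
        return true;
    }
    int refs() const { return refs_; }
    std::size_t written() const { return written_; }

  private:
    int refs_ = 1;
    bool connected_ = true;
    std::size_t written_ = 0;
};

// The caller already holds a reference, so no extra UpRef/DownRef pair is
// taken here. Holding a reference does not keep the peer connected, so the
// connection state is checked right before writing.
bool sendIfConnected(RefSocket &sock, const std::vector<std::string> &msg) {
    if (!sock.isConnected())
        return false;
    return sock.writeStringList(msg);
}
```

The reference count and the connection state are tracked separately on purpose: the first guards the object's lifetime, the second guards the write.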

comment:8 Changed 10 years ago by danielk

Some of my commit message got cut off due to a keyboard fumble. [21721] also gets rid of the readReadyLock. This is why I marked the patch as a debug patch earlier and not as a solution. However, looking back through the svn history of this file, I believe readReadyLock provided the same protection that other locks now provide in a more fine-grained manner. It predates the locking of reference counts and other locks we now employ.

[21721] does not address either of the resource leaks.

comment:9 Changed 10 years ago by danielk

Resolution: fixed
Status: changed from assigned to closed

(In [21725]) Fixes #6969. Refs #6955. Fixes a file descriptor + memory leak.

This may be a fix for #6955. If not, please provide -v socket,network,extra logs for the backend for a long enough time to catch this leak (15 mins?)

comment:10 Changed 10 years ago by cpinkham

(In [21771]) Fix a lot of "Unknown socket closing MythSocket" messages in the backend after [21725]. Since MainServer::connectionClosed() is now called by MythSocket::close(), we don't need to call MainServer::connectionClosed() directly from MythSocketThread::ReadyToBeRead() anymore after we call sock->close().

Refs #6969.
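A rough sketch of why the extra direct call became redundant, under the assumption that the server simply forgets a socket on the first notification and treats any later one as unknown. `Server`, `Sock` and the `unknownCloses` counter are hypothetical; the real MainServer and MythSocket classes are not reproduced here.

```cpp
#include <set>

// Hypothetical server that tracks known socket ids.
struct Server {
    std::set<int> known;
    int unknownCloses = 0;  // how often a close arrived for an untracked socket

    void connectionClosed(int id) {
        // erase() returns 0 when the id was already removed (or never known).
        if (known.erase(id) == 0)
            ++unknownCloses;
    }
};

// Hypothetical socket whose close() notifies the server itself.
struct Sock {
    int id;
    Server *server;
    void close() { server->connectionClosed(id); }
};
```

Calling `close()` alone notifies the server once. Calling `close()` and then `connectionClosed()` directly produces a second notification for a socket the server has already dropped, which is the pattern the change above removes.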
