Opened 10 years ago

Closed 10 years ago

Last modified 7 years ago

#6516 closed defect (fixed)

Mythfilldatabase times out when rescheduling

Reported by: Joel <mythtv@…> Owned by: danielk
Priority: blocker Milestone: 0.22
Component: MythTV - General Version: head
Severity: medium Keywords:
Cc: Ticket locked: no

Description

The visible problem is that my recording schedules aren't getting updated when mythfilldatabase runs. This is because the socket on which mythfilldatabase requests the reschedule is timing out waiting for mythbackend to respond. I believe the reason for this is that mythbackend is blocking on various responses to mysql queries it has made, and mysql is running slowly, unable to respond within the socket timeout of 7 seconds, as a result of the load put on it by the recently finished mythfilldatabase run.

I have tried to make the database faster by increasing it's limits in my.cnf, but it still seems that fastest it can complete the requests is around 15 seconds. I don't know why the database is slow, but mythtv should be able to work around this. I've temporarily increased the quick timeout to 30 seconds (matching the long timeout) by editing mythsocket.cpp, and that works for me, but there's probably a better solution. I propose one or both of the following:

  1. Increase the timout or retries that mythfilldatbase uses. It could be told to simply not use the quick timeout, and that would work for me. It retries once currently, but a second and third would also give it enough time to let the database catch up with itself.
  1. Figure out why mythtv is blocking on mysql for this. I don't see any reason that mythbackend should be capable of accepting and queuing a reschedule request even if mysql is being slow to respond. Maybe there is a good reason I'm unaware of though.

I realize it would be easy to say it's the databases fault, rather than mythtv's, but I think there are some simple ways to work around this, which would improve the user experience for those with slow databases. This seems to be a recurring problem for many users.

Attachments (3)

mythfilldatabase.log (19.1 KB) - added by mythtv@… 10 years ago.
mythfilldatabase -v network
6516-fix-windows-compilation-after-r21591.patch (699 bytes) - added by Jonathan Martens <jonathan@…> 10 years ago.
Fix compilation on windows after [21591]
6516-fix-windows-compilation-after-r21591.2.patch (1.1 KB) - added by Jonathan Martens <jonathan@…> 10 years ago.
Updated version of previous patch that also reorders variable initialisation to prevent two compiler warnings about unused variables as seen in comment 10.

Download all attachments as: .zip

Change History (19)

comment:1 Changed 10 years ago by cpinkham

Status: newinfoneeded_new

If this is reproducable, can you run mythfilldatabase with the "-v network" command line option so we can see where in the process it is hanging?

Changed 10 years ago by mythtv@…

Attachment: mythfilldatabase.log added

mythfilldatabase -v network

comment:2 Changed 10 years ago by mythtv@…

I unpatched mythsocket.cpp and reproduced this. I've attached the log of -v network

I failed to mention in the bug, that this actually results in mythfilldatabase segfaulting. It's only at the end, so it's already completed all the important database updates. It just fails to tell mythbackend to reschedule.

For the discussion thread on this bug at mythtv-users, read here: http://www.gossamer-threads.com/lists/mythtv/users/380043

There's a full mythfilldatabase log here: http://ebel.mybox.org/mythfilldatabase2.log And a tcpdump (loadable in wireshark) of the mythtv conversation showing the delayed response from the backend here: http://ebel.mybox.org/mythfilldatabase.dump

comment:3 Changed 10 years ago by danielk

Status: infoneeded_newnew

comment:4 Changed 10 years ago by danielk

Milestone: unknown0.22
Owner: changed from Isaac Richards to danielk
Priority: minorblocker
Status: newassigned
Version: 0.21-fixeshead

I've been looking at the socket code recently and already have a patch that retries more effectively... This can also cause recorders to go offline until mythbackend is restarted and cause slave backends to disconnect permanently.

I already have a patch that I've been testing here, but it still needs some more work.

comment:5 Changed 10 years ago by mythtv@…

Actually, here's a better matched log and dump: http://ebel.mybox.org/mythfilldatabase.log.1 http://ebel.mybox.org/mythfilldatabase.dump.1

The dump I provided before didn't seem to match the behavior I was reproducing now.

comment:6 Changed 10 years ago by danielk

(In [21445]) Refs #6516. One of several patches for the myth socket timeout problem.

This one is the least important, it just handles ANN error conditions and some timeouts better. However, it is the only one of the patches that changes the protocol slightly, so I'm applying this first.

The protocol change is simply that ANN handling returns "ERROR" followed by the reason for the error instead of nothing for certain error conditions.

Because of the protocol change all backends and frontends must be recompiled.

comment:7 Changed 10 years ago by danielk

(In [21505]) Refs #6516. This consolidates command socket reconnection code and fixes a couple popups which didn't work correctly after the mythui port of the menu screens.

There is also a hidden option to increase reconnect tries, but I don't know what an optimal value is for this yet so I've left it at the old default for now.

The consolidation allows us to handle an unresponive server consistently. The popup for a version mismatch had not been ported to MythUI yet, and the popup for a general unsuccessful connect didn't pop up when called because we called it before the event thread had started.

comment:8 Changed 10 years ago by danielk

(In [21568]) Refs #6516. fixes segfault fix introduced in [21505]. This simple fix has been in my tree a few days, I just forgot about it until I was looking at something else in this file.

comment:9 Changed 10 years ago by danielk

(In [21589]) Refs #6516. Reworks the locking in mainserver to use R/W locks and to stagger some of the locking. This improves the measured worst case socket response by a number of seconds.

comment:10 Changed 10 years ago by danielk

(In [21591]) Refs #6516. This simply moves MythSocketThread? into it's own set of files and reorders a couple variables in MythSocket?.

This commit is simply to keep formatting changes seperate from the functional changes in a later changeset which speeds up some myth socket processing. This does change the plugin ABI so the plugins need recompiling.

comment:11 in reply to:  10 Changed 10 years ago by Jonathan Martens <jonathan@…>

Replying to danielk:

(In [21591]) Refs #6516. This simply moves MythSocketThread? into it's own set of files and reorders a couple variables in MythSocket?.

This commit is simply to keep formatting changes seperate from the functional changes in a later changeset which speeds up some myth socket processing. This does change the plugin ABI so the plugins need recompiling.

This commit seems to make compilation on windows fail:

bash.exe"-3.1$ cd libs/libmythdb/
bash.exe"-3.1$ make
makefile:185: warning: overriding commands for target `version.cpp'
makefile:182: warning: ignoring old commands for target `version.cpp'
u:/qt/4.5.2/bin/qmake.exe -win32 -o Makefile libmythdb.pro
makefile:185: warning: overriding commands for target `version.cpp'
makefile:182: warning: ignoring old commands for target `version.cpp'
g++ -c -march=pentium4 -fomit-frame-pointer -O3 -DNDEBUG -g -Wall -Wno-switch -W
pointer-arith -Wredundant-decls -Wno-non-virtual-dtor -funit-at-a-time -D__STDC_
CONSTANT_MACROS -frtti -fexceptions -mthreads -DQT_LARGEFILE_SUPPORT -DWIN32 -DU
SING_MINGW -DUSING_WINAUDIO -DUSING_D3D -DMMX -Di386 -DUSING_DIRECTX -D_GNU_SOUR
CE -D_FILE_OFFSET_BITS=64 -DRUNPREFIX=\"..\" -DLIBDIRNAME=\"lib\" -DQT_THREAD_SU
PPORT -DQT_DLL -DQT_SQL_LIB -DQT_GUI_LIB -DQT_NETWORK_LIB -DQT_CORE_LIB -I'../..
/../../Qt/4.5.2/include/QtCore' -I'../../../../Qt/4.5.2/include/QtNetwork' -I'..
/../../../Qt/4.5.2/include/QtGui' -I'../../../../Qt/4.5.2/include/QtSql' -I'../.
./../../Qt/4.5.2/include' -I'/usr/include' -I'/u/mythtv/build' -I'../../../../Qt
/4.5.2/include/ActiveQt' -I'.' -I'../../../../qt/4.5.2/mkspecs/win32-g++' -o myt
hsocket.o mythsocket.cpp
mythsocket.cpp: In member function 'bool MythSocket::readStringList(QStringList&
, uint)':
mythsocket.cpp:497: warning: comparison between signed and unsigned integer expr
essions
g++ -c -march=pentium4 -fomit-frame-pointer -O3 -DNDEBUG -g -Wall -Wno-switch -W
pointer-arith -Wredundant-decls -Wno-non-virtual-dtor -funit-at-a-time -D__STDC_
CONSTANT_MACROS -frtti -fexceptions -mthreads -DQT_LARGEFILE_SUPPORT -DWIN32 -DU
SING_MINGW -DUSING_WINAUDIO -DUSING_D3D -DMMX -Di386 -DUSING_DIRECTX -D_GNU_SOUR
CE -D_FILE_OFFSET_BITS=64 -DRUNPREFIX=\"..\" -DLIBDIRNAME=\"lib\" -DQT_THREAD_SU
PPORT -DQT_DLL -DQT_SQL_LIB -DQT_GUI_LIB -DQT_NETWORK_LIB -DQT_CORE_LIB -I'../..
/../../Qt/4.5.2/include/QtCore' -I'../../../../Qt/4.5.2/include/QtNetwork' -I'..
/../../../Qt/4.5.2/include/QtGui' -I'../../../../Qt/4.5.2/include/QtSql' -I'../.
./../../Qt/4.5.2/include' -I'/usr/include' -I'/u/mythtv/build' -I'../../../../Qt
/4.5.2/include/ActiveQt' -I'.' -I'../../../../qt/4.5.2/mkspecs/win32-g++' -o myt
hsocketthread.o mythsocketthread.cpp
mythsocketthread.cpp:6:38: error: sys/select.h: No such file or directory
mythsocketthread.cpp: In member function 'virtual void MythSocketThread::run()':
mythsocketthread.cpp:237: warning: comparison between signed and unsigned intege
r expressions
mythsocketthread.cpp:160: warning: unused variable 'rfds'
mythsocketthread.cpp:162: warning: unused variable 'maxfd'
make: *** [mythsocketthread.o] Error 1
bash.exe"-3.1$

According to the changeset this has been fixed in the past like this as sys/select.h is not in MINGW's includes:

#ifdef USING_MINGW #include <winsock2.h> #else #include <sys/select.h> #endif

I will attach a patch that fixes compilation on Windows again.

Changed 10 years ago by Jonathan Martens <jonathan@…>

Fix compilation on windows after [21591]

Changed 10 years ago by Jonathan Martens <jonathan@…>

Updated version of previous patch that also reorders variable initialisation to prevent two compiler warnings about unused variables as seen in comment 10.

comment:12 Changed 10 years ago by Jonathan Martens <jonathan@…>

It seems more is broken as even after the patch I still fail to build [21591] where I am able to build [21590] on Windows, with the last patch applied build fails in libmythtv with the following error:

g++ -c -march=pentium4 -fomit-frame-pointer -O3 -DNDEBUG -g -Wall -Wno-switch -W
pointer-arith -Wredundant-decls -Wno-non-virtual-dtor -funit-at-a-time -D__STDC_
CONSTANT_MACROS -I/mingw/include/freetype2 -I/mingw/include -frtti -fexceptions
-mthreads -DQT_LARGEFILE_SUPPORT -DWIN32 -DUSING_MINGW -DUSING_WINAUDIO -DUSING_
D3D -DMMX -Di386 -DUSING_DIRECTX -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFI
LE_SOURCE -DUSING_DIRECTX -DUSING_OPENGL -DUSING_OPENGL_VIDEO -DUSING_MHEG -DUSI
NG_FRONTEND -DUSING_FFMPEG_THREADS -DUSING_HDHOMERUN -DUSING_BACKEND -DUSING_D3D
 -DQT_THREAD_SUPPORT -DQT_DLL -DQT_SQL_LIB -DQT_XML_LIB -DQT_GUI_LIB -DQT_NETWOR
K_LIB -DQT_CORE_LIB -I'../../../../Qt/4.5.2/include/QtCore' -I'../../../../Qt/4.
5.2/include/QtNetwork' -I'../../../../Qt/4.5.2/include/QtGui' -I'../../../../Qt/
4.5.2/include/QtXml' -I'../../../../Qt/4.5.2/include/QtSql' -I'../../../../Qt/4.
5.2/include' -I'../../../build' -I'..' -I'../..' -I'.' -I'../libmyth' -I'../liba
vcodec' -I'../libavformat' -I'../libavutil' -I'../libswscale' -I'../libmythmpeg2
' -I'../libmythdb' -I'../libmythhdhomerun' -I'../libmythdvdnav' -I'dvbdev' -I'mp
eg' -I'iptv' -I'channelscan' -I'../libmythlivemedia/BasicUsageEnvironment/includ
e' -I'../libmythlivemedia/BasicUsageEnvironment' -I'../libmythlivemedia/groupsoc
k/include' -I'../libmythlivemedia/groupsock' -I'../libmythlivemedia/liveMedia/in
clude' -I'../libmythlivemedia/liveMedia' -I'../libmythlivemedia/UsageEnvironment
/include' -I'../libmythlivemedia/UsageEnvironment' -I'../libmythdb' -I'../libmyt
hui' -I'/usr/include' -I'../../../../Qt/4.5.2/include/ActiveQt' -I'.' -I'../../.
./../qt/4.5.2/mkspecs/win32-g++' -o util-opengl.o util-opengl.cpp
util-opengl.cpp: In function 'void* get_gl_proc_address(const QString&)':
util-opengl.cpp:334: error: 'wglGetProcAddress' was not declared in this scope
util-opengl.cpp:337: error: 'GetModuleHandle' was not declared in this scope
util-opengl.cpp:338: error: 'GetProcAddress' was not declared in this scope
make[2]: *** [util-opengl.o] Error 1
make[2]: Leaving directory `/u/mythtv/mythtv/libs/libmythtv'
make[1]: *** [sub-libmythtv-make_default] Error 2
make[1]: Leaving directory `/u/mythtv/mythtv/libs'
make: *** [sub-libs-make_default-ordered] Error 2

comment:13 Changed 10 years ago by danielk

Resolution: fixed
Status: assignedclosed

(In [21608]) Fixes #6516. This is the last of the mythsocket fixes sitting in my tree...

This speeds up worst case mythsocket handling. The worst case speedup is entirely do to with the locking*, but we also use QWaitConditions for some special cases, such as when starting up the socket handling loop and when all sockets are closed, or when anonymous file descriptor pipes are not available. This also adds some timing debug code at the -v socket,extra level, but I left out most of the ones I used when optimising this code, the remaning ones should be sufficient to locate any significant regressions in the future.

This also gets rid of the special case stuff for MS Windows for which there may need to be some additional header changes, but the code itself does work when you inlcude compat.h. But compat.h shouldn't be included in any plugin exported header files so I've left the header problem for someone with an MS Windows system to work out (aside from fixing the select header problem previously reported).

FYI [21589] provides a larger worst-case speedup, but in my testing both changesets are needed to completely avoid timeouts.

  • The main locking problem was just when we deleted sockets, this was treated as a rare occurance and we dropped and relocked the m_readyread_lock. But this would sometimes cause large enough delays to cause timeouts, which lead to more deleted sockets, causing more timeouts, etc. We now put these in a list and handle them later on when we've dropped the m_readyread_lock anyway for ready read data handling.

A secondary locking problem had to do with our isLocked() handling,

this was really a Qt4 port problem, the code was rewritten to work with Qt4 which dropped QMutex::locked(), but the new code was not very efficient.

comment:14 Changed 10 years ago by danielk

(In [21614]) Refs #6516. Header fixes for MINGW.

comment:15 Changed 10 years ago by danielk

(In [21726]) Refs #6516. Fixes the other socket leak, see also #6969.

comment:16 Changed 7 years ago by Github

Refs #6516. Fix a regression caused by [21445] aka [2fdeb3fc].

We were unlocking a mutex after it was deleted. The lock is for the writeStringList()/readStringList() and should have been released before the DownRef?().

Branch: master Changeset: fb1c7607f4aa2b5343aad98a043d94db78bf6d15

Note: See TracTickets for help on using tickets.