Opened 7 years ago

Closed 8 weeks ago

#11870 closed Bug Report - Hang/Deadlock (Fixed)

Channel scanning deadlock with DVB

Reported by: Rune Petersen <rune@…>
Owned by: Klaas de Waal
Priority: minor
Milestone: 32.0
Component: MythTV - Channel Scanner
Version: 0.27-fixes
Severity: medium
Keywords:
Cc: Stuart Auchterlonie
Ticket locked: no

Description

I get a 100% reproducible deadlock when scanning for DVB channels.

It always happens on transponder 1070, which may mean this issue is hard to trigger.

Anyway this is clearly a deadlock:

DVBStreamHandler::RunTS() holds MPEGStreamData::_encryption_lock and then tries to take ChannelScanSM::lock.

ChannelScanSM::run() holds ChannelScanSM::lock and then tries to take MPEGStreamData::_encryption_lock.
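The inversion described above can be reproduced in isolation. The following is a minimal sketch, not MythTV code: the two std::mutex members are hypothetical stand-ins for MPEGStreamData::_encryption_lock and ChannelScanSM::lock, and try_lock() is used instead of lock() so that the demonstration reports the conflict rather than hanging.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical stand-ins for the two mutexes named in the report.
struct Locks {
    std::mutex encryption_lock;  // stands in for MPEGStreamData::_encryption_lock
    std::mutex scan_lock;        // stands in for ChannelScanSM::lock
};

// Spin until `count` threads have arrived at this point.
static void arrive_and_wait(std::atomic<int> &counter, int count)
{
    counter.fetch_add(1);
    while (counter.load() < count)
        std::this_thread::yield();
}

// Returns the number of threads that could not obtain their second lock.
int count_blocked_threads()
{
    Locks locks;
    std::atomic<int> holding_first{0};
    std::atomic<int> tried_second{0};
    std::atomic<int> blocked{0};

    auto worker = [&](std::mutex &first, std::mutex &second) {
        std::lock_guard<std::mutex> hold(first);
        arrive_and_wait(holding_first, 2);  // both threads now hold their first lock
        if (second.try_lock())
            second.unlock();
        else
            blocked.fetch_add(1);           // a real lock() would wait forever here
        arrive_and_wait(tried_second, 2);   // keep the first lock until both have tried
    };

    // "Stream handler" order: encryption lock first, then scan lock.
    std::thread stream(worker, std::ref(locks.encryption_lock),
                       std::ref(locks.scan_lock));
    // "Scanner" order: scan lock first, then encryption lock.
    std::thread scanner(worker, std::ref(locks.scan_lock),
                        std::ref(locks.encryption_lock));
    stream.join();
    scanner.join();
    return blocked.load();
}
```

Calling count_blocked_threads() from a main() returns 2: each thread fails to get the lock the other one holds, which is the state both threads in the backtraces are stuck in.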

callstack:
#0 syscall () at ../ports/sysdeps/unix/sysv/linux/arm/syscall.S:37
#1 0x75a19624 in ?? () from /usr/lib/arm-linux-gnueabihf/libQtCore.so.4
#2 0x75a16cfe in QMutex::lockInternal() () from /usr/lib/arm-linux-gnueabihf/libQtCore.so.4
#3 0x769e24e0 in QMutex::lockInline (this=0xe236a8) at /usr/include/qt4/QtCore/qmutex.h:198
#4 0x769e2438 in QMutexLocker::QMutexLocker (this=0x642b993c, m=0xe236a8) at /usr/include/qt4/QtCore/qmutex.h:109
#5 0x76cf9f7c in ChannelScanSM::HandleEncryptionStatus (this=0xe23660, pnum=2740, encrypted=true) at channelscan/channelscan_sm.cpp:551
#6 0x76af79be in MPEGStreamData::ProcessEncryptedPacket (this=0xe40890, tspacket=...) at mpeg/mpegstreamdata.cpp:2016
#7 0x76af47a2 in MPEGStreamData::ProcessTSPacket (this=0xe40890, tspacket=...) at mpeg/mpegstreamdata.cpp:1046
#8 0x76af4694 in MPEGStreamData::ProcessData (this=0xe40890, buffer=0x6380a008 "G\037\220\220", <incomplete sequence \336>, len=102648) at mpeg/mpegstreamdata.cpp:1022
#9 0x76dcbb4e in DVBStreamHandler::RunTS (this=0xe3d828) at recorders/dvbstreamhandler.cpp:266
#10 0x76dcad84 in DVBStreamHandler::run (this=0xe3d828) at recorders/dvbstreamhandler.cpp:118
#11 0x767ebcde in ?? () from /usr/lib/libmythbase-0.27.so.0
#12 0x75a1a49a in ?? () from /usr/lib/arm-linux-gnueabihf/libQtCore.so.4
#13 0x75a1a49a in ?? () from /usr/lib/arm-linux-gnueabihf/libQtCore.so.4
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 21 (Thread 0x652ba410 (LWP 27669)):
#0 syscall () at ../ports/sysdeps/unix/sysv/linux/arm/syscall.S:37
#1 0x75a19624 in ?? () from /usr/lib/arm-linux-gnueabihf/libQtCore.so.4
#2 0x75a16dcc in QMutex::lock() () from /usr/lib/arm-linux-gnueabihf/libQtCore.so.4
#3 0x769e24e0 in QMutex::lockInline (this=0xe4094c) at /usr/include/qt4/QtCore/qmutex.h:198
#4 0x769e2438 in QMutexLocker::QMutexLocker (this=0x652b9834, m=0xe4094c) at /usr/include/qt4/QtCore/qmutex.h:109
#5 0x76af71d2 in MPEGStreamData::TestDecryption (this=0xe40890, pmt=0x6dca7170) at mpeg/mpegstreamdata.cpp:1885
#6 0x76cfa450 in ChannelScanSM::TestNextProgramEncryption (this=0xe23660) at channelscan/channelscan_sm.cpp:627
#7 0x76cfb69e in ChannelScanSM::UpdateChannelInfo (this=0xe23660, wait_until_complete=false) at channelscan/channelscan_sm.cpp:877
#8 0x76cfede6 in ChannelScanSM::HandleActiveScan (this=0xe23660) at channelscan/channelscan_sm.cpp:1584
#9 0x76cfe8d8 in ChannelScanSM::run (this=0xe23660) at channelscan/channelscan_sm.cpp:1482
#10 0x767ebcba in MThread::run() () from /usr/lib/libmythbase-0.27.so.0
#11 0x767ebcde in ?? () from /usr/lib/libmythbase-0.27.so.0
#12 0x75a1a49a in ?? () from /usr/lib/arm-linux-gnueabihf/libQtCore.so.4

Attachments (1)

mythtv_ChannelScanSM_simplify_locking_poc.patch (24.4 KB) - added by Rune Petersen <rune@…> 7 years ago.


Change History (7)

comment:1 Changed 7 years ago by Stuart Auchterlonie

Cc: Stuart Auchterlonie added

Can you attach the patch you posted to the mailing list?

Changed 7 years ago by Rune Petersen <rune@…>

comment:2 Changed 7 years ago by Rune Petersen <rune@…>

When debugging this I was reminded that ChannelScanSM uses the "big lock"/"lock everything" pattern which I'm not a big fan of.

I have an idea for fixing it: move all the work that modifies the state of ChannelScanSM to the scanner thread, so that we don't need to hold the big lock when evaluating or modifying the state.

Is this something you would find interesting, or are there any better ideas?

I have attached a PoC patch showing this.
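For readers unfamiliar with the idea, here is a minimal sketch of what "moving the work to the scanner thread" can look like. It is not taken from the attached patch: ScannerSketch, EncryptionEvent, PostEncryptionStatus and DrainEvents are hypothetical names, and std::mutex is used in place of QMutex to keep the example self-contained. Other threads only append to a small queue; the scanner thread alone reads and modifies the scan state, so no big lock is held while the state is touched.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical event: "program pnum was seen as encrypted or not encrypted".
struct EncryptionEvent {
    unsigned pnum;
    bool encrypted;
};

class ScannerSketch {
  public:
    // Called from the stream handler thread: only the small queue lock is taken.
    void PostEncryptionStatus(unsigned pnum, bool encrypted)
    {
        std::lock_guard<std::mutex> guard(m_queueLock);
        m_pending.push(EncryptionEvent{pnum, encrypted});
    }

    // Called from the scanner thread only: the queue is swapped out under the
    // lock, then the state is updated without holding any lock.
    void DrainEvents()
    {
        std::queue<EncryptionEvent> batch;
        {
            std::lock_guard<std::mutex> guard(m_queueLock);
            std::swap(batch, m_pending);
        }
        while (!batch.empty()) {
            m_encrypted[batch.front().pnum] = batch.front().encrypted;
            batch.pop();
        }
    }

    // Scanner thread only, so no lock is needed.
    bool IsEncrypted(unsigned pnum) const
    {
        auto it = m_encrypted.find(pnum);
        return it != m_encrypted.end() && it->second;
    }

  private:
    std::mutex m_queueLock;                 // protects m_pending only
    std::queue<EncryptionEvent> m_pending;  // written by other threads
    std::map<unsigned, bool> m_encrypted;   // owned by the scanner thread
};

// One "stream" thread posts an event; the calling thread acts as the scanner.
bool demo_event_queue()
{
    ScannerSketch scanner;
    std::thread stream([&] { scanner.PostEncryptionStatus(2740, true); });
    stream.join();
    scanner.DrainEvents();
    return scanner.IsEncrypted(2740);
}
```

Because the poster never takes a lock that the scanner holds while calling back into the stream data, the two lock orders from the description can no longer cross.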

comment:3 Changed 11 months ago by Klaas de Waal

Owner: changed from danielk to Klaas de Waal
Status: changed from new to assigned

Deadlock situation is still present in today's master, although I have not yet reproduced the problem.

comment:4 Changed 8 months ago by Klaas de Waal

Milestone: changed from unknown to 32.0

comment:5 Changed 8 weeks ago by Klaas de Waal

The deadlock has been reproduced. This can only happen when the "Test Decryptability" option is selected when doing a channel scan. This option is by default off as it greatly increases the channel scanning time and the results are not very reliable anyway.

The problem is not really in the "big lock" m_lock but more in the locking order itself. This can be fixed in MPEGStreamData::ProcessEncryptedPacket by releasing m_encryptionLock before calling HandleEncryptionStatus on all listeners. In the case of channel scanning the listener is ChannelScanSM::HandleEncryptionStatus. This fix is being tested and the deadlock has not yet appeared again.
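The shape of such a fix can be sketched as follows. This is an illustration under stated assumptions, not the committed change: StreamDataSketch, Listener, AddListener and IsEncrypted are hypothetical names, std::mutex replaces QMutex, and the listener list is assumed to be fully set up before packets are processed. The point is that the state update happens inside a narrow scope and the listeners are called only after that scope has ended.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <mutex>
#include <utility>
#include <vector>

class StreamDataSketch {
  public:
    using Listener = std::function<void(unsigned pnum, bool encrypted)>;

    // Assumption: listeners are registered before any packet is processed.
    void AddListener(Listener l) { m_listeners.push_back(std::move(l)); }

    void ProcessEncryptedPacket(unsigned pnum, bool encrypted)
    {
        bool changed = false;
        {
            std::lock_guard<std::mutex> guard(m_encryptionLock);
            auto it = m_status.find(pnum);
            changed = (it == m_status.end()) || (it->second != encrypted);
            m_status[pnum] = encrypted;
        }  // m_encryptionLock is released here, before any listener runs

        if (changed)
            for (auto &listener : m_listeners)
                listener(pnum, encrypted);
    }

    bool IsEncrypted(unsigned pnum)
    {
        std::lock_guard<std::mutex> guard(m_encryptionLock);
        auto it = m_status.find(pnum);
        return it != m_status.end() && it->second;
    }

  private:
    std::mutex m_encryptionLock;       // protects m_status only
    std::map<unsigned, bool> m_status;
    std::vector<Listener> m_listeners;
};

// The listener takes its own lock and then calls back into the stream data,
// which is the scanner-side lock order from the description.
bool demo_fixed_order()
{
    StreamDataSketch stream;
    std::mutex scanLock;  // stands in for ChannelScanSM's m_lock
    bool seen = false;
    stream.AddListener([&](unsigned pnum, bool) {
        std::lock_guard<std::mutex> guard(scanLock);
        seen = stream.IsEncrypted(pnum);  // takes m_encryptionLock again
    });
    stream.ProcessEncryptedPacket(2740, true);
    return seen;
}
```

With the callback made outside the locked scope, the stream handler thread never holds m_encryptionLock while waiting for the scanner's lock, so only one lock order remains.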

comment:6 Changed 8 weeks ago by Klaas de Waal

Resolution: Fixed
Status: changed from assigned to closed

The deadlock has not been seen again since the fix was applied, so the problem is considered solved and the ticket is now closed. If the problem appears again, please create a new ticket.
