Opened 5 years ago

Closed 4 years ago

Last modified 4 years ago

#13479 closed Bug Report - General (fixed)

mythbackend deadlock when editing schedule metadata on frontend

Reported by: Peter Bennett Owned by: Peter Bennett
Priority: minor Milestone: 31.0
Component: MythTV - General Version: v30-fixes
Severity: medium Keywords:
Cc: Ticket locked: no

Description

This has happened a few times, but only since V30.

After performing a search for artwork in the schedule editor, the backend sometimes gets into a state where the frontends lose the connection and no frontend can get connected again.

Attachments (5)

gdb.core.1567104223.200.1464.mythbackend_v30.0-66-g83e27017203 (121.5 KB) - added by Peter Bennett 5 years ago.
Full backtrace
ver.gdb.core.1567104223.200.1464.mythbackend_v30.0-66-g83e27017203 (1.1 KB) - added by Peter Bennett 5 years ago.
Version
thread_analysis.txt (5.2 KB) - added by Peter Bennett 4 years ago.
Thread Analysis
20200204_1528_deadlock_v30.patch (1.5 KB) - added by Peter Bennett 4 years ago.
Updated V30 patch
20200204_1521_deadlock_v31.patch (1.5 KB) - added by Peter Bennett 4 years ago.
Updated V31 (master) patch

Download all attachments as: .zip

Change History (15)

Changed 5 years ago by Peter Bennett

Full backtrace

Changed 5 years ago by Peter Bennett

Version

comment:1 Changed 5 years ago by Peter Bennett

Version: Unspecifiedv30-fixes

comment:2 Changed 5 years ago by Peter Bennett

This happened again, this time with no schedule or unusual actions on the frontend, just connecting the frontend and navigating to the recordings list.

comment:3 Changed 4 years ago by Peter Bennett

This may be related to the report in the forum at https://forum.mythtv.org/viewtopic.php?f=36&t=3545

Last edited 4 years ago by Peter Bennett (previous) (diff)

comment:4 Changed 4 years ago by Peter Bennett

Owner: set to Peter Bennett
Status: newassigned

See the attached file thread_analysis.txt

Thread 34 and Thread 6 are both doing this

MainServer::connectionClosed calls sockListLock.lockForWrite
MainServer::connectionClosed calls MainServer::UpdateSystemdStatus
MainServer::UpdateSystemdStatus calls m_sched->GetAllPending(recordings)
Scheduler::GetAllPending calls QMutexLocker lockit(&schedLock)

This causes

  1. lock on sockListLock
  2. lock on schedLock

Thread 22 is doing this

Scheduler::run calls QMutexLocker lockit(&schedLock); (l2070)
Scheduler::run calls HandleIdleShutdown; (l2234)
Scheduler::HandleIdleShutdown calls MainServer::isClientConnected :8358
MainServer::isClientConnected calls sockListLock.lockForRead

This causes

  1. lock on schedLock
  2. lock on sockListLock

* DEADLOCK *

If you are setting two or more locks, you need to always set them in the same order in every thread, otherwise deadlocks can happen.

There are many places where mainserver calls scheduler and fewer than 10 places where scheduler calls mainserver. All are potential deadlock scenarios. Perhaps whenever scheduler calls mainserver it should first unlock its mutex and lock it again upon return.

ANY COMMENTS FROM DEVELOPERS?

Changed 4 years ago by Peter Bennett

Attachment: thread_analysis.txt added

Thread Analysis

comment:5 Changed 4 years ago by Peter Bennett

scheduler.cpp already has 9 places where calls to other routines are surrounded by schedLock.unlock() and schedLock.lock(). It looks like I only need to add those to the two places in scheduler.cpp where m_mainServer->isClientConnected() is called.

comment:6 Changed 4 years ago by Peter Bennett

Regarding m_recListChanged - this can be used to check if the list changed. If the list changes during the call to m_mainServer->isClientConnected, that will not be a problem. In HandleIdleShutdown it does look through the list, and the list cannot change while it is looking through the list since it is locked at that time. I prefer to make as few changes as possible to reduce risk, so I plan to only unlock and relock at the two places m_mainServer->isClientConnected is called inside HandleIdleShutdown.

comment:7 in reply to:  6 Changed 4 years ago by gigem

Replying to Peter Bennett:

Regarding m_recListChanged - this can be used to check if the list changed. If the list changes during the call to m_mainServer->isClientConnected, that will not be a problem. In HandleIdleShutdown it does look through the list, and the list cannot change while it is looking through the list since it is locked at that time. I prefer to make as few changes as possible to reduce risk, so I plan to only unlock and relock at the two places m_mainServer->isClientConnected is called inside HandleIdleShutdown.

Please do not do this. When m_recListChanged is true, stop and return. Always. It's not just that iterators might have been invalidated, but other assumptions like list ordering might have been invalidated too. You might be able to convince yourself that in this case at this time, it won't hurt anything. That might not always be the case, though, as an unrelated, future change might invalidate it. To avoid that and keep things simple, just stop and return.

Changed 4 years ago by Peter Bennett

Updated V30 patch

Changed 4 years ago by Peter Bennett

Updated V31 (master) patch

comment:8 Changed 4 years ago by Peter Bennett <pbennett@…>

Resolution: fixed
Status: assignedclosed

In 3e9e8353a/mythtv:

Scheduler: Fix deadlock in mythbackend

Fixes deadlock caused by conflicting Mutex locks in HandleIdleShutdown?

Fixes #13479

comment:9 Changed 4 years ago by Peter Bennett <pbennett@…>

In 47814b6d6/mythtv:

Scheduler: Fix deadlock in mythbackend

Fixes deadlock caused by conflicting Mutex locks in HandleIdleShutdown?

Fixes #13479

comment:10 Changed 4 years ago by Stuart Auchterlonie

Milestone: needs_triage31.0
Note: See TracTickets for help on using tickets.