Opened 13 years ago

Closed 13 years ago

#2011 closed defect (fixed)

Backend deadlock, multiple delete threads waiting on mutex.

Reported by: Stuart Auchterlonie
Owned by: danielk
Priority: minor
Milestone: 0.20
Component: mythtv
Version: head
Severity: medium
Keywords:
Cc:
Ticket locked: no

Description (last modified by danielk)

The backend deadlocks with multiple (33) delete threads waiting on a mutex.

The full backtrace is attached. The file starts with a short backtrace, followed by a thread summary; the full backtrace is at the end.

Attachments (2)

backend-stuck.log.bz2 (11.6 KB) - added by Stuart Auchterlonie 13 years ago.
backtrace of stuck backend.
2011.patch (2.5 KB) - added by danielk 13 years ago.
possible fix

Change History (6)

Changed 13 years ago by Stuart Auchterlonie

Attachment: backend-stuck.log.bz2 added

backtrace of stuck backend.

comment:1 Changed 13 years ago by danielk

Description: modified (diff)
Resolution: wontfix
Status: new → closed

Not a bug.

These threads are waiting for their turn. Only one delete proceeds at a time so that we can rate-limit deletes. If a large file delete happens due to an auto-expire or a user request, and you then do a lot of channel flipping, many small deletes will bunch up waiting for the delete lock. Within a few minutes they will all get their turn and execute quickly.

comment:2 Changed 13 years ago by danielk

Keywords: deadlock removed
Resolution: wontfix
Status: closed → reopened

While there isn't a deadlock, there is resource exhaustion, as noticed by Jim Westfall. Each delete keeps a SQL connection open while waiting for its turn to proceed, and eventually we run out of connections. When this happens MythTV is 'frozen' until one of the deletes completes.
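The exhaustion can be reproduced in miniature. The following is only a sketch in Python, not MythTV code: the pool size, the `simulate_exhaustion` name and the semaphore standing in for the SQL connection pool are all assumptions made for illustration. Each simulated delete takes a connection first and only then waits for the delete lock, so while one long delete holds the lock, the waiters pin every connection.

```python
import threading

POOL_SIZE = 3  # hypothetical connection limit, chosen only for this sketch


def simulate_exhaustion(pool_size=POOL_SIZE):
    """Return (starved, recovered) for the connection-before-lock ordering."""
    pool = threading.BoundedSemaphore(pool_size)  # stand-in for the DB connection pool
    delete_lock = threading.Lock()                # only one delete runs at a time
    holding = threading.Semaphore(0)              # signals "this waiter has a connection"

    def delete_recording():
        pool.acquire()        # take a connection first (the problematic order)
        holding.release()
        with delete_lock:     # wait for our turn while still holding the connection
            pass              # the actual file delete would happen here
        pool.release()

    delete_lock.acquire()     # simulate a large delete already in progress
    workers = [threading.Thread(target=delete_recording) for _ in range(pool_size)]
    for w in workers:
        w.start()
    for _ in workers:
        holding.acquire()     # wait until every waiter holds a connection

    # Any other part of the program now fails to get a connection.
    got = pool.acquire(blocking=False)
    if got:
        pool.release()
    starved = not got

    delete_lock.release()     # the large delete finishes
    for w in workers:
        w.join()

    # Once the queue drains, connections are available again.
    recovered = pool.acquire(blocking=False)
    if recovered:
        pool.release()
    return starved, recovered
```

Calling `simulate_exhaustion()` reports that the pool is starved while the lock is held and usable again once the waiters have finished, which matches the 'frozen until one of the deletes completes' behaviour described above.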

Changed 13 years ago by danielk

Attachment: 2011.patch added

possible fix

comment:3 Changed 13 years ago by danielk

(In [10370]) Refs #2011. Fixes "deadlock" when many deletes are pending.

The delete method was grabbing a DB connection before waiting for its turn to actually delete the file, and releasing it only after the delete completed. If a large file delete is scheduled, followed by many deletes of files of any size, such as after channel flipping in LiveTV, then the number of open connections grows quickly. This can make MythTV unresponsive until the big delete completes, because most MythTV actions require a DB connection and none (or too few) are available. With this change the delete function lets the DB connection go before it blocks on the delete mutex.
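The effect of the reordering can be sketched as follows. This is an illustrative Python model, not the code from [10370]: the function name, the `release_before_wait` flag and the counter standing in for open DB connections are all hypothetical. It counts how many connections are held at the moment every pending delete is blocked on the delete mutex.

```python
import threading


def connections_held_while_waiting(release_before_wait, n_deletes=5):
    """Count connections still open while all deletes wait on the mutex."""
    delete_lock = threading.Lock()    # serializes the actual file deletes
    counter_lock = threading.Lock()   # protects the 'held' counter
    ready = threading.Semaphore(0)    # signals "about to block on delete_lock"
    held = 0                          # stand-in for open DB connections

    def open_conn():
        nonlocal held
        with counter_lock:
            held += 1

    def close_conn():
        nonlocal held
        with counter_lock:
            held -= 1

    def delete_recording():
        open_conn()                   # DB work needed before the delete
        if release_before_wait:
            close_conn()              # new order: let the connection go first
        ready.release()
        with delete_lock:             # wait for our turn to delete
            pass                      # the actual file delete would happen here
        if not release_before_wait:
            close_conn()              # old order: release only after the delete

    delete_lock.acquire()             # simulate a large delete in progress
    workers = [threading.Thread(target=delete_recording) for _ in range(n_deletes)]
    for w in workers:
        w.start()
    for _ in workers:
        ready.acquire()               # every worker has reached the mutex
    with counter_lock:
        snapshot = held
    delete_lock.release()
    for w in workers:
        w.join()
    return snapshot
```

With `release_before_wait=False` every pending delete still holds its connection, so the count equals the number of queued deletes. With `release_before_wait=True` the count is zero, however many deletes are queued behind the large one.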

comment:4 Changed 13 years ago by danielk

Resolution: fixed
Status: reopened → closed

Fixed by [10370].
