Ticket #9885 (closed Bug Report - General: Fixed)

Opened 2 years ago

Last modified 22 months ago

Deadlock on slave backend disconnect

Reported by: Ian Dall <ian@…>
Owned by: danielk
Priority: major
Milestone: 0.25
Component: MythTV - General
Version: Master Head
Severity: medium
Keywords:
Cc:
Ticket locked: no

Description

I have a setup with a master BE, 2 slave BEs and 1 - 3 frontends. I am running code compiled from git (v0.25pre-2145-gf199a84-dirty).

The behaviour is that nothing works: one slave is dead and the master backend is deadlocked. FEs don't work, and accessing the status port with a browser times out.

The slave death is accompanied by kernel syslog messages like:

kernel: [371375.689820] mythbackend: page allocation failure. order:0, mode:0x20

Maybe the slave death is due to a kernel bug, BUT the master should not deadlock!

The attached backtrace shows that master Thread 16 is trying, in SlaveDisconnected, to acquire schedLock when schedLock is already held by Scheduler::run further up the same stack.
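For readers unfamiliar with this kind of self-deadlock, here is a minimal sketch of the pattern and one common way to avoid it. It is not MythTV code and not necessarily how the problem was eventually fixed. MiniScheduler, RunOnce, SlaveConnected, IsConnected, SlaveDisconnectedLocked and m_schedLock are hypothetical names, and std::mutex stands in for whatever lock type the backend really uses. The idea is to split the disconnect handler into a public entry point that takes the scheduler lock and a private variant that assumes the lock is already held, so the scheduler loop never tries to lock a mutex it already owns.

```cpp
#include <cassert>
#include <mutex>
#include <set>

// Hypothetical, simplified stand-in for a scheduler; not MythTV code.
class MiniScheduler
{
  public:
    void SlaveConnected(int slave_id)
    {
        std::lock_guard<std::mutex> guard(m_schedLock);
        m_connected.insert(slave_id);
    }

    // Entry point for other threads: take the lock, then do the work.
    void SlaveDisconnected(int slave_id)
    {
        std::lock_guard<std::mutex> guard(m_schedLock);
        SlaveDisconnectedLocked(slave_id);
    }

    // One pass of the scheduler loop. The lock is held for the whole pass,
    // so a disconnect noticed here must use the *Locked variant. Calling
    // SlaveDisconnected() instead would re-lock a non-recursive mutex that
    // this thread already owns, which is the self-deadlock pattern.
    void RunOnce(int unreachable_slave_id)
    {
        std::lock_guard<std::mutex> guard(m_schedLock);
        SlaveDisconnectedLocked(unreachable_slave_id);
    }

    bool IsConnected(int slave_id)
    {
        std::lock_guard<std::mutex> guard(m_schedLock);
        return m_connected.count(slave_id) != 0;
    }

  private:
    // Precondition: the caller already holds m_schedLock.
    void SlaveDisconnectedLocked(int slave_id)
    {
        assert(slave_id >= 0);
        m_connected.erase(slave_id);
    }

    std::mutex    m_schedLock;
    std::set<int> m_connected;
};
```

A recursive mutex would also let the nested call through, but the locked/unlocked split makes it explicit which code paths run with the lock held.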

I saw exactly the same problem with an older version: 0.24-7.fc14 (464fa28373) but went to git head in the hope that this had been fixed :-(

I notice other deadlock tickets (esp #9745), but none seem to have quite the same description, and the version I am running has the #9745 fix included.

Attachments

hex.backtrace (20.3 KB) - added by Ian Dall <ian@…> 2 years ago.
Backtrace all threads
hex.log.gz (25.1 KB) - added by Ian Dall <ian@…> 2 years ago.
Master backend log.
slavedisconnect.patch (3.7 KB) - added by Jonatan <mythtv@…> 2 years ago.
hex-a41e965.backtrace (27.6 KB) - added by Ian Dall <ian@…> 23 months ago.
Backtrace all threads as of commit a41e965
hex-a41e965.log.gz (201.9 KB) - added by Ian Dall <ian@…> 23 months ago.
Master backend log as of commit a41e965

Change History

Changed 2 years ago by Ian Dall <ian@…>

Backtrace all threads

Changed 2 years ago by Ian Dall <ian@…>

Master backend log.

Changed 2 years ago by Jonatan <mythtv@…>

comment:1 Changed 2 years ago by Jonatan <mythtv@…>

I have also seen a similar deadlock a few times on 0.24. I have been running the backend with the attached patch for a while now without any problems.

comment:2 Changed 23 months ago by Ian Dall <ian@…>

I haven't tried Jonatan's patch yet but will soon (I missed it until now).

Without the patch, the issue is still there as of version: v0.25pre-2563-ga41e965

See thread 21 in the attached backtrace.

Changed 23 months ago by Ian Dall <ian@…>

Backtrace all threads as of commit a41e965

Changed 23 months ago by Ian Dall <ian@…>

Master backend log as of commit a41e965

comment:3 Changed 23 months ago by Ian Dall <ian@…>

I have been running with Jonatan's patch for a week now and it seems to fix the problem. I deliberately tried to provoke it by killing and restarting the backend many times and never saw the deadlock.

Can this patch be applied?

comment:4 Changed 23 months ago by danielk

Ian, Jonatan's patch is more of a debugging patch: it just disables a bit of code and can't be applied as is. But if it fixed the problem for you, it does show that you are both experiencing the same deadlock, and the same fix will help both of you.

comment:5 Changed 23 months ago by Github

Refs #9885. Fixes deadlock when a slave backend disconnect is first seen from within the Scheduler thread. Patch by Ian Dall.

Keeping ticket open since this should be backported to 0.24-fixes.

Branch: master Changeset: 1fae22a8bc56a5474375332f7799b0ee91bb6244

comment:6 Changed 23 months ago by danielk

  • Owner set to danielk
  • Status changed from new to assigned
  • Milestone changed from unknown to 0.25

comment:7 Changed 22 months ago by danielk

  • Status changed from assigned to closed
  • Resolution set to Fixed

Fixed in [3a5f78862a5be39e02ee549551d59e9c40fa575d]

Fixes #9885. Fixes deadlock when a slave backend disconnect is first seen from within the Scheduler thread. Patch by Ian Dall.

This has been running in master for 4 weeks without reports of regression.
