Opened 6 years ago
Closed 6 years ago
Last modified 6 years ago
#13150 closed Bug Report - Crash (fixed)
Backend spawns threads until it crashes when user jobs active
Reported by: | Owned by: | Jonatan Lindblad | |
---|---|---|---|
Priority: | minor | Milestone: | 30.0 |
Component: | MythTV - General | Version: | 0.28.1 |
Severity: | medium | Keywords: | |
Cc: | Ticket locked: | no |
Description
I'm using LinHES, and the past two upgrades have had mythbackends that crash. There seems to be a system that spawns a new thread roughly every two seconds. Screen output when running in gdb is:
[New Thread 0x7ffd6bd37700 (LWP 30169)]
2017-10-12 11:54:23.158482 I MainServer: MainServer::ANN Monitor
2017-10-12 11:54:23.158490 I MainServer: adding: mythtv.i.cneufeld.ca(ca1920) as a client (events: 0)
2017-10-12 11:54:23.728318 I Monitor sock(ca1920) 'mythtv.i.cneufeld.ca' disconnected
[New Thread 0x7ffd6b536700 (LWP 30225)]
2017-10-12 11:54:24.941168 I MainServer: MainServer::ANN Monitor
2017-10-12 11:54:24.941177 I MainServer: adding: mythtv.i.cneufeld.ca(ca19f0) as a client (events: 0)
2017-10-12 11:54:25.517690 I Monitor sock(ca19f0) 'mythtv.i.cneufeld.ca' disconnected
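For anyone wanting to check the spawn rate on their own system, here is a minimal sketch that measures the time between successive `MainServer::ANN Monitor` lines in captured backend output. The function name `ann_intervals` is hypothetical, and the timestamp layout is assumed to match the excerpt above.

```python
import re
from datetime import datetime

# Matches the timestamp on each "MainServer::ANN Monitor" line.
# The log layout is assumed to match the gdb excerpt in this ticket.
ANN_LINE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6}) I MainServer: MainServer::ANN Monitor"
)


def ann_intervals(log_text):
    """Return the seconds elapsed between consecutive ANN Monitor lines."""
    stamps = [
        datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
        for stamp in ANN_LINE.findall(log_text)
    ]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


sample = (
    "[New Thread 0x7ffd6bd37700 (LWP 30169)]\n"
    "2017-10-12 11:54:23.158482 I MainServer: MainServer::ANN Monitor\n"
    "[New Thread 0x7ffd6b536700 (LWP 30225)]\n"
    "2017-10-12 11:54:24.941168 I MainServer: MainServer::ANN Monitor\n"
)
print(ann_intervals(sample))
```

On the two connections shown in the excerpt, the gap is just under two seconds, which matches the rate described.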
Eventually the system runs out of file descriptors for pipes and the backend crashes. In normal operation, there seems to be some housekeeping routine that reaps the threads. However, when there is a long-running user job (mine can run for hours), the threads pile up and the backend crashes. The system then restarts the backend, which resubmits the long job with the same jobqueue ID.
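To watch the pile-up as it happens, a minimal sketch along these lines could sample the thread and file descriptor counts of the backend process. It assumes a Linux `/proc` filesystem; the helper names (`count_entries`, `thread_and_fd_counts`, `watch`) are hypothetical and are not part of MythTV.

```python
import time
from pathlib import Path
from typing import Tuple


def count_entries(directory: Path) -> int:
    """Return the number of entries in a directory, or -1 if it cannot be read."""
    try:
        return sum(1 for _ in directory.iterdir())
    except OSError:
        return -1


def thread_and_fd_counts(pid: int, proc_root: str = "/proc") -> Tuple[int, int]:
    """Return (thread count, open fd count) for a process.

    Assumes the Linux /proc layout: one entry per thread under task/
    and one entry per open descriptor under fd/.
    """
    base = Path(proc_root) / str(pid)
    return count_entries(base / "task"), count_entries(base / "fd")


def watch(pid: int, interval: float = 2.0, samples: int = 5,
          proc_root: str = "/proc") -> None:
    """Print a fixed number of samples, one every `interval` seconds."""
    for i in range(samples):
        threads, fds = thread_and_fd_counts(pid, proc_root)
        print(f"{time.strftime('%H:%M:%S')} threads={threads} fds={fds}")
        if i + 1 < samples:
            time.sleep(interval)
```

Calling `watch(pid)` with the PID of the running mythbackend prints one sample every two seconds. A count of -1 means the directory could not be read, for example because the process belongs to another user or has exited.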
My current version is v0.28.1-44-gebd69ec. There is a backtrace with some details at https://pastebin.com/mpN9eNcA
I've built a backend with debugging symbols, and can test patches.
Change History (8)
comment:2 Changed 6 years ago by
Replying to natanojl:
Can you try reverting 019044db324fc630581def44e0d02a570dd184ed and see if that helps?
Reverting that change prevents the runaway thread spawning. Threads are consistently reaped, and the thread count stays around 30-40. I'll be applying this revert to the binary running on my machine.
Thank you.
comment:3 Changed 6 years ago by
To follow up: I've been running in production for four days. I've gone from about 30 crashes a day in regular use (not running my scripts) to none.
This bug never damaged any recordings; I suspect that the reaping routine runs regularly when a recording is going on.
Thank you again.
comment:4 Changed 6 years ago by
Owner: | set to Jonatan Lindblad |
---|---|
Status: | new → assigned |
comment:5 Changed 6 years ago by
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
comment:6 Changed 6 years ago by
Milestone: | needs_triage → 30.0 |
---|---|
Version: | Unspecified → 0.28.1 |