Opened 8 years ago

Closed 6 years ago

#10032 closed Bug Report - Hang/Deadlock (Fixed)

mythwelcome getting unstable

Reported by: amlopezalonso@… Owned by: beirdo
Priority: minor Milestone: 0.25
Component: MythTV - Mythwelcome & Mythshutdown Version: Master Head
Severity: medium Keywords: mythwelcome, hang
Cc: Ticket locked: no

Description

I'm experiencing some quite unstable behavior in mythwelcome. After booting my MythTV box, now and then, mythwelcome tries to start but then hangs leaving just a blank theme background, thus having to kill it to start FE manually.

If ever been succesful in starting mythwelcome, it crashes almost every time I quit FE.

I haven't found any pattern (yet) for succesful start/stops.

Attaching log of a frozen mythwelcome instance.

Regards, Antonio

Attachments (9)

mythwelcome.log (239.8 KB) - added by amlopezalonso@… 8 years ago.
mythwelcome log
mythshutdown.20111008234333.7168.log (19.7 KB) - added by Raymond Wagner 8 years ago.
mythshutdown.20111008234334.7185.log (3.4 KB) - added by Raymond Wagner 8 years ago.
mythwelcome.20111008234332.7142.log (17.5 KB) - added by Raymond Wagner 8 years ago.
gdb.txt (57.8 KB) - added by amlopezalonso@… 8 years ago.
mytharchivehelper_crash.txt (17.9 KB) - added by paulh 8 years ago.
mytharchivehelper crash
gdb2.txt (57.8 KB) - added by amlopezalonso@… 8 years ago.
gdb2_true.txt (21.3 KB) - added by amlopezalonso@… 8 years ago.
gn.bt.all.oct11.txt (3.4 KB) - added by George Nassas <gnassas@…> 8 years ago.

Download all attachments as: .zip

Change History (41)

Changed 8 years ago by amlopezalonso@…

Attachment: mythwelcome.log added

mythwelcome log

comment:1 Changed 8 years ago by j2@…

I would guess that a "killall mythshutdown" will unfreeze the problem and resume normal operation?

comment:2 Changed 8 years ago by j2@…

When launching mythwelcome I see

mythtv 3860 0.0 0.0 1832 512 pts/4 S+ 20:29 0:00 sh -c /usr/local/bin/mythshutdown --status 0 --verbose general --loglevel debug mythtv 3861 14.0 2.1 166520 44088 pts/4 Sl+ 20:29 0:00 /usr/local/bin/mythshutdown --status 0 --verbose general --loglevel debug

which then changes to

mythtv 3880 0.0 0.0 1832 512 pts/4 S+ 20:29 0:00 sh -c /usr/local/bin/mythshutdown --startup --verbose general --loglevel debug mythtv 3881 6.5 2.1 141848 43788 pts/4 Sl+ 20:29 0:00 /usr/local/bin/mythshutdown --startup --verbose general --loglevel debug

And hangs at this point.

comment:3 in reply to:  1 Changed 8 years ago by amlopezalonso@…

Replying to j2@…:

I would guess that a "killall mythshutdown" will unfreeze the problem and resume normal operation?

Yes, confirming.

comment:4 Changed 8 years ago by j2@…

I am not sure if "quick workarounds" are welcome in a ticket, but I will risk it.

Ive simply modified my startup to

/usr/local/bin/mythwelcome --loglevel debug --logfile /var/log/mythtv/mythwelcome.log &

sleep 4s ; killall mythshutdown

Everything still seem to work (ACPI shutdown/wakeup etc)..

comment:5 Changed 8 years ago by amlopezalonso@…

More info:

Killing mythshutdown effectively allows mythwelcome to start, but then trying to lock/unlock mythshutdown from mythwelcome makes the latter to crash, issuing:

mythwelcome[1953]: segfault at 40 ip 00007f29d67150a0 sp 00007ffff85f07c8 error 4 in i965_dri.so[7f29d66e8000+ac000]

(the locking/unlocking takes place though).

IMHO it seems related to the intel xorg driver (I own a GMA-X4500 graphics chip).

When not killing mythshutdown I'm getting:

[drm:drm_mode_getfb] *ERROR* invalid framebuffer id

Any thoughts?

comment:6 Changed 8 years ago by amlopezalonso@…

Sorry, in the first paragraph of my previous comment I meant "to freeze" instead of "to crash".

comment:7 Changed 8 years ago by beirdo

Owner: Anduin Withers deleted
Status: newassigned

comment:8 Changed 8 years ago by beirdo

See also #10062

comment:9 Changed 8 years ago by paulh

Milestone: unknown0.25
Owner: set to Beirdo

I believe this is a problem with the new logging stuff and short lived command line tools like MythShutdown/MythArchiveHelper?. I think the problem is caused when the command does its stuff and exits before the logging threads are properly up and running but that's just a guess. The only way I could get some of the mytharchivehelper commands to works was to either disable logging altogether or add a sleep to slow the command down. This is based on what I was seeing about 4 weeks ago don't know if there have been any changes recently to fix this problem.

I have to question why mythshutdown for example now requires the 6 or more threads that seem to be started before it does anything useful. It all just seems a bit over the top for what should be a simple command line tool.

comment:10 Changed 8 years ago by beirdo

Owner: changed from Beirdo to beirdo

I will require backtraces from gdb to look into this very far. If the crash is in the video driver, it is almost certainly not related to logging.

As for short-lived processes, we will need to look into this, but if they are going to be logging, they will be using the same logging code as the rest of the application. With some of the changes that Daniel's made to the logging since I've last looked at it, I'm not sure where this stands, but I'll take a look.

comment:11 Changed 8 years ago by beirdo

In your initial startup script, please change any --logfile parameter (and/or redirection to a file) to use --logpath pointing at an empty log directory. That way, you can zip up all of the related logfiles, and I can tell which process is logging what. The log you have attached here has all of the subprograms in it (it seems to be a redirection of the console). I will still need a backtrace on the hung process.

comment:12 Changed 8 years ago by amlopezalonso@…

OK attaching logfiles as requested.

Changed 8 years ago by Raymond Wagner

Changed 8 years ago by Raymond Wagner

Changed 8 years ago by Raymond Wagner

comment:13 Changed 8 years ago by Raymond Wagner

No sense making things hard on us when the files are small enough to fit.

comment:14 Changed 8 years ago by beirdo

To be clear, did this hang? It does seem to be maybe related to the database logging and not starting up fast enough. I'll have to take a good look at that section of the code. Not sure why it would hang though.

If one of those mythshutdowns hangs and doesn't return, please grab a backtrace of it so we can look into why it's getting hung up.

Changed 8 years ago by amlopezalonso@…

Attachment: gdb.txt added

comment:15 Changed 8 years ago by amlopezalonso@…

Yes it did.

Attaching backtrace. Hope it helps.

comment:16 Changed 8 years ago by beirdo

So, it looks like mythwelcome crashes and mythshutdown is hanging?

I'll go take a look. Thanks.

comment:17 Changed 8 years ago by Github

Fix dblogging deadlock during shutdown

Refs #10032.

If the program starts and exits while we are waiting for 100ms for the db to be ready, we got into a deadlock due to a logging message being expedited while the db queue is locked. Message expediting only happens during shutdown. The way around this is to briefly unlock the queue while checking the database to remove the possibility of deadlock.

An alternative fix would be to fix the delays to no longer use the conditional waits, but that goes against the redesign goals that Daniel had earlier.

Branch: master Changeset: aab13b3936b05188bdab02f9b7df029c79c1ff97

comment:18 Changed 8 years ago by beirdo

Status: assignedinfoneeded

Please try this and report back.

comment:19 Changed 8 years ago by paulh

I'm still seeing the segfaults when testing mytharchivehelper. They are completely random with maybe 1 out 6 runs segfaulting when run normally. I can never get it to crash when run under gdb though. I'll attach a bt I managed to get from a core dump.

Changed 8 years ago by paulh

Attachment: mytharchivehelper_crash.txt added

mytharchivehelper crash

comment:20 Changed 8 years ago by beirdo

Please don't mix several issues into one ticket. This ticket is about the deadlock in mythwelcome/mythshutdown. If you could put the mytharchivehelper crash into another ticket, I'm more than happy to look at it there.

Changed 8 years ago by amlopezalonso@…

Attachment: gdb2.txt added

comment:21 Changed 8 years ago by amlopezalonso@…

Unfortunately it didn't work.

Attaching new backtrace.

comment:22 Changed 8 years ago by beirdo

If in the future you could post separate backtrace files for mythwelcome and mythshutdown, it would make it easier to figure out.

The change I made should have fixed the deadlock in the second half of the backtrace (not sure which program that is now), but will not in any way change the crash which is apparently an issue with your video card drivers, and I won't even try to fix here.

Also, you're sure you are running the latest master with the patch in it?

comment:23 in reply to:  22 Changed 8 years ago by amlopezalonso@…

Replying to beirdo:

If in the future you could post separate backtrace files for mythwelcome and mythshutdown, it would make it easier to figure out.

I just attached gdb to the hung mythshutdown process, so I guess the backtrace should be mythshutdown exclusively?

The change I made should have fixed the deadlock in the second half of the backtrace (not sure which program that is now),

I'm still having to kill mythshutdown to get either desktop screen (thus mythwelcome crashed) or mythwelcome screen (not crashing then), so I guess the issue is not fixed yet.

but will not in any way change the crash which is apparently an issue with your video >card drivers, and I won't even try to fix here.

Do you suggest opening a new ticket regarding this issue? Or should I refer to the xorg-intel guys?

Also, you're sure you are running the latest master with the patch in it?

Yes as per:

http://www.ubuntuupdates.org/packages/show/315186

the patch is included in the latest mythtv Mythbuntu PPA (UbuntuUpdates?) which I have installed.

Regards, Antonio

comment:24 Changed 8 years ago by beirdo

Actually, that "new" backtrace is the same file as the previous one. I think you uploaded the wrong backtrace.

Changed 8 years ago by amlopezalonso@…

Attachment: gdb2_true.txt added

comment:25 Changed 8 years ago by amlopezalonso@…

Sorry. Attaching the right file.

comment:26 Changed 8 years ago by George Nassas <gnassas@…>

Also, you're sure you are running the latest master with the patch in it?

I'm running master from Tuesday morning. The changes have made a big difference but there are still occasional hangs. I'll attach the output from a "thread apply all bt" on the most recent hang.

Changed 8 years ago by George Nassas <gnassas@…>

Attachment: gn.bt.all.oct11.txt added

comment:27 Changed 8 years ago by Github

Unlock the db queue mutex during InitCon? in the startup

If the program is exiting quickly (before the database is ready), the db queue is still locked, and then the messages bypass the queue. When it goes to initialize the database connection, the code in mythdbcon logs a message, which then tries to acquire the lock again, causing a deadlock.

To get around that, unlock the mutex around the call to InitCon?(). I think it would be better to re-re-think the waiting code so it's not using that mutex so the mutex can get locked only after the connection is ready, removing this and the previous unlock/relock.

Refs #10032. Hopefully this will fix the deadlocks.

Branch: master Changeset: a266129767a624dc6e15014ee5f7996d5ff33c7e

comment:28 Changed 8 years ago by Github

Simplify the locking in the DB logging startup

This looks nicer, and, as far as I can tell, gives the exact protection we need.

Refs #10032

Branch: master Changeset: d9ec12fec36568a777b120eb63b395dd6c3e6ce5

comment:29 Changed 8 years ago by amlopezalonso@…

Seems fixed... Good job!

Thanks a lot!

comment:30 Changed 8 years ago by Raymond Wagner

Resolution: Fixed
Status: infoneededclosed

Reported as fixed.

comment:31 Changed 6 years ago by jtl_@…

Resolution: Fixed
Status: closednew

This bug is still alive and well in .25 running over Ubuntu 12.04.2 desktop. Have enabled repositories for .25 and updated, but problem persists.

I did discover if you Ctrl+Alt+Right Arrow then Ctrl+Alt+Left Arrow the Mythwelcome screen becomes visible.

comment:32 Changed 6 years ago by Raymond Wagner

Resolution: Fixed
Status: newclosed

I'm closing this again, as 0.25 is no longer supported. Reopen if this still occurs in 0.26.

Note: See TracTickets for help on using tickets.