Opened 6 years ago

Closed 5 years ago

#10867 closed Bug Report - Crash (Fixed)

Segv from mythshutdown in MSqlDatabase

Reported by: keemllib@…
Owned by:
Priority: blocker
Milestone: 0.27
Component: MythTV - Mythwelcome & Mythshutdown
Version: Master Head
Severity: medium
Keywords:
Cc:
Ticket locked: no

Description

Caused when running:

mythshutdown --quiet --nodblog --unlock

This does not happen every time the above is run. When trying the above with --logpath /tmp --loglevel debug, no log file is produced. mythlogserver is running.

Perhaps related, but without dropping core, running the above may print:

Error: Not all threads were shut down properly: 
Thread SignalingTimer is still running

or

Real-time signal 0

This system is running: Ubuntu 12.04 LTS with a 3.2.0 kernel.

Attachments (4)

gdb6804.out (19.0 KB) - added by keemllib@… 6 years ago.
version.txt (808 bytes) - added by keemllib@… 6 years ago.
gdb-mythshutdown-11542.txt (19.7 KB) - added by Bill Meek <keemllib@…> 6 years ago.
mythshutdown --lock running on v0.26-pre-842-gfb9b725
gdb-mythshutdown-5417.txt (28.1 KB) - added by Bill Meek <keemllib@…> 5 years ago.


Change History (30)

Changed 6 years ago by keemllib@…

Attachment: gdb6804.out added

Changed 6 years ago by keemllib@…

Attachment: version.txt added

comment:1 Changed 6 years ago by beirdo

Component: MythTV - General → MythTV - Mythwelcome & Mythshutdown

comment:2 Changed 6 years ago by beirdo

Type: Bug Report - General → Bug Report - Crash

comment:3 Changed 6 years ago by Bill Meek <keemllib@…>

Update:

Turns out that mythbuntu's apport was stomping on my:

kernel.core_pattern = /tmp/core.%t.%u.%p.%e

With that restored, I'm now seeing the same core dumps on mythavtest, mythfilldatabase, mythbackend --setloglevel and mythcommflag.

These SEGVs can occur on the first run of a program, or it can take as many as 40 attempts to get a core file.

I'm now running on v0.26-pre-836-g42b2b24.

Please note that the "Error: Not all threads were shut down properly:" symptom above matches #10898

comment:4 Changed 6 years ago by danielk

Milestone: unknown → 0.26
Owner: set to beirdo
Status: new → assigned

It looks like the MythSignalingTimer for logging isn't being shut down in, or before, the MythContext destructor.

comment:5 Changed 6 years ago by beirdo

This is due to the following code in MythSignalingTimer::start():

        while (dorun && !running)
            usleep(10 * 1000);

That is where it is in the backtrace. If we finish and shut down quickly enough, then even though we stop the timer correctly, it never actually stops: dorun is mutex-protected, start() is still holding that mutex inside the loop above, and stop() is blocked waiting for it. I'm going to add some lock frobbage in here.

comment:6 Changed 6 years ago by Gavin Hurlbut <ghurlbut@…>

Resolution: fixed
Status: assigned → closed

In 1dc048f1b71c92e4bb81a9e2ae6766cbd2df1c16/mythtv:

Unlock the startstop mutex while waiting for timer startup

If we start and stop the timer too quickly, the lock can be still held by
the start() function, and the stop() sits deadlocked waiting for it. As this
delay is primarily to wait for the thread to come up before returning from
start(), it should be safe to release the mutex while waiting to allow the
stop() to take effect.

If this doesn't fix the issue, please reopen the ticket

Fixes #10867
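
The idea in this commit message can be illustrated with a minimal sketch. This is not the MythTV code: SketchTimer, its members, and the use of std::thread and std::mutex are assumptions made purely for illustration. It only shows the pattern of releasing the start/stop mutex while start() waits for the worker thread to come up, so that a concurrent stop() can take the lock.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a signaling timer; NOT MythTV's MythSignalingTimer.
class SketchTimer
{
  public:
    ~SketchTimer() { stop(); }

    void start()
    {
        std::unique_lock<std::mutex> lock(m_startStopLock);
        if (m_thread.joinable())
            return; // already started

        m_dorun = true;
        m_thread = std::thread([this] { run(); });

        // Wait for the worker to come up, but drop the lock around each
        // sleep so that a concurrent stop() is not blocked behind us.
        while (m_dorun && !m_running)
        {
            lock.unlock();
            std::this_thread::sleep_for(std::chrono::milliseconds(10));
            lock.lock();
        }
    }

    void stop()
    {
        std::lock_guard<std::mutex> lock(m_startStopLock);
        m_dorun = false; // ends run() and the wait loop in start()
        if (m_thread.joinable())
            m_thread.join();
        assert(!m_thread.joinable());
    }

    bool isRunning() const { return m_running; }

  private:
    void run()
    {
        m_running = true;
        while (m_dorun)
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        m_running = false;
    }

    std::mutex        m_startStopLock;
    std::atomic<bool> m_dorun{false};
    std::atomic<bool> m_running{false};
    std::thread       m_thread;
};
```

If the lock were held across the whole wait loop instead, a stop() issued from another thread during startup would sit behind start() for as long as the wait lasted, which is the situation the commit message describes.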

comment:7 Changed 6 years ago by Bill Meek <keemllib@…>

No joy. I ran 50 mythshutdown --logpath /tmp --loglevel debug --lock tests and got the following:

10 core files
 8 Real-time signal 0s
23 Error: Not all threads were shut down properly:

I only looked at the backtrace on the last core file and it appears the same as before v0.26-pre-840-g1dc048f. As before, no mythshutdown...log files were created.

comment:8 Changed 6 years ago by Bill Meek <keemllib@…>

Resolution: fixed
Status: closed → new

comment:9 Changed 6 years ago by beirdo

I can not recreate the problem here at all. As for no logging, it could just be that mythshutdown --lock doesn't log anything. I don't use mythshutdown at all, so I've never really looked.

I'd suggest a make uninstall, delete any remaining installed files, make distclean, make, make install. It could be you have a mix of old and new code somehow.

comment:10 Changed 6 years ago by beirdo

Hmm, for logging, I think I see the issue. However, I never see a crash.

comment:11 Changed 6 years ago by beirdo

Yes, add -v general to your command line if you want any logging. mythshutdown has a default mask of -v none. And with that, I see the "threads not shut down properly" nonsense. Ahh, it's because there is no CleanupGuard in mythshutdown. I'll try to get you a fix momentarily.
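
For readers unfamiliar with the pattern, a cleanup guard is an object whose destructor runs a cleanup function on every exit path from a scope. The sketch below is a generic illustration only: SketchCleanupGuard and runTool are hypothetical names, and this is not MythTV's CleanupGuard class or its interface.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical scope guard; NOT MythTV's CleanupGuard.
class SketchCleanupGuard
{
  public:
    explicit SketchCleanupGuard(std::function<void()> cleanup)
        : m_cleanup(std::move(cleanup))
    {
        assert(m_cleanup); // a guard without a cleanup function is a bug
    }

    // Runs on every exit from the enclosing scope, including early returns.
    ~SketchCleanupGuard() { m_cleanup(); }

    SketchCleanupGuard(const SketchCleanupGuard &) = delete;
    SketchCleanupGuard &operator=(const SketchCleanupGuard &) = delete;

  private:
    std::function<void()> m_cleanup;
};

// Hypothetical tool entry point: the cleanup runs whether or not we bail early.
int runTool(bool failEarly, bool &cleanedUp)
{
    SketchCleanupGuard guard([&cleanedUp] { cleanedUp = true; });

    if (failEarly)
        return 1; // guard still fires here

    return 0;     // and here
}
```

Without such a guard, a short-lived program that returns quickly can skip its shutdown steps and leave helper threads running at exit.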

comment:12 Changed 6 years ago by Gavin Hurlbut <ghurlbut@…>

In fb9b72518b2b2b3912f3ee633f02c185fd93865d/mythtv:

Delete logThread in logStop

Seems I never actually deleted the logThread when stopping logging. This was
causing issues as then the timers don't get stopped (as that's in the dtor)
on quick runs, causing nasty messages, and maybe even crashes.

Refs #10867
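
A minimal sketch of the ownership problem this commit message describes, under stated assumptions: SketchLogThread, sketchLogStart, sketchLogStop and the globals are hypothetical names, and a plain bool stands in for the timers. It only shows that when the timers are stopped in a destructor, the stop function has to actually destroy the object.

```cpp
#include <cassert>
#include <memory>

// Hypothetical flag standing in for "the logging timers are active".
static bool g_timerActive = false;

// Hypothetical stand-in for the logging thread object; NOT MythTV's class.
class SketchLogThread
{
  public:
    SketchLogThread() { g_timerActive = true; }

    // The timers are only stopped here, in the destructor.
    ~SketchLogThread() { g_timerActive = false; }
};

static std::unique_ptr<SketchLogThread> g_logThread;

void sketchLogStart()
{
    if (!g_logThread)
        g_logThread.reset(new SketchLogThread());
    assert(g_timerActive);
}

void sketchLogStop()
{
    // Destroying the object is what runs the destructor and stops the
    // timers; merely signalling the thread to finish would leave them active.
    g_logThread.reset();
}
```

Holding the object in a std::unique_ptr is one way to make the deletion hard to forget; the commit itself is only described as adding the missing delete.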

comment:13 Changed 6 years ago by beirdo

Please let me know if that cleans it up for you. So surprised I never noticed this before.

comment:14 Changed 6 years ago by Bill Meek <keemllib@…>

Sorry for the delay, ISP went belly up.

I believe the "Error: Not all threads were shut down properly:" is fixed.

100 Tests on v0.26-pre-842-gfb9b725 adding -v general --quiet --nodblog

 1 core file
19 Real-time signal 0s
 0 Error: Not all threads were shut down properly:

100 with the original test:

 1 core file
16 Real-time signal 0s
 0 Error: Not all threads were shut down properly:

Perhaps specifying --logAnything without -v isn't a valid test, so the next 100 were just mythshutdown --lock.

15 core files
10 Real-time signal 0s
 0 Error: Not all threads were shut down properly:

All builds were done with distclean and removal of everything MythTV under /usr/local. I even did a git clean -xfd.

comment:15 Changed 6 years ago by beirdo

Ok. Could you post an updated coredump for me so I can make sure I'm chasing the right things? Thanks.

Changed 6 years ago by Bill Meek <keemllib@…>

Attachment: gdb-mythshutdown-11542.txt added

mythshutdown --lock running on v0.26-pre-842-gfb9b725

comment:16 Changed 6 years ago by Bill Meek <keemllib@…>

Fresh backtrace attached. With my kernel.core_pattern fixed, I see that when my backend runs: mythshutdown -c --quiet --syslog none, it too drops core at times.

comment:17 Changed 6 years ago by beirdo

Milestone: 0.26 → unknown
Owner: beirdo deleted
Status: new → assigned

The source of this coredump seems to be something different now. As I can not reproduce this at all, and I can't see it being related to the code I was tweaking anymore, I'm punting this back for others to look at.

comment:18 Changed 6 years ago by stuartm

Milestone: unknown → 0.27
Priority: minor → major
Status: assigned → new
Summary: Segv from mythshutdown → Segv from mythshutdown in MSqlDatabase

comment:19 Changed 5 years ago by paulh

Bill, can you still reproduce this? A lot has changed since you opened the ticket.

comment:20 in reply to:  19 Changed 5 years ago by Bill Meek <keemllib@…>

Replying to paulh:

I can still reproduce it: 3 core files out of 200 tests. Attaching a fresh backtrace, with --version at the beginning of the file.

There were 3 major events prior to this starting: upgrading from Ubuntu 10.04 to 12.04, the UTC conversion, and 0MQ/mythlogserver. The only one I can easily "revert" is the last one. So back on 7/7/2013 I switched to the devel/logging branch and ran 1000 tests. FWIW, there were no failures.

Changed 5 years ago by Bill Meek <keemllib@…>

Attachment: gdb-mythshutdown-5417.txt added

comment:21 Changed 5 years ago by stuartm

Priority: major → blocker

comment:22 Changed 5 years ago by JYA

Status: new → infoneeded_new

Bill, could you retry compiling myth with configure --disable-mythlogserver and report back? Thanks.

comment:23 Changed 5 years ago by Bill Meek <keemllib@…>

Already did on 8/27/13. Over 1000 tests done by script and no failures. In normal operation, I'd see 0-3 cores per day and now see none.

I should also note that the other high runner core is from mythpreviewgen and I'm seeing none from it with the logserver out of the picture.

I just wanted to get a few days of normal use (as opposed to banging mythshutdown from a script every 1 second) before declaring success.

I've also seen no problems with logging. My normal logging is done with --logpath, -q and (previously) --nodblog.

Sorry not to have seen more folks on the lists/channels with the problem. I'm debating about recommending this for the release notes. I'd think *buntu (perhaps Debian) users would want it.

Thanks.

comment:24 Changed 5 years ago by JYA

You mean you had no failures when compiled with --disable-mythlogserver? Or is it that any myth from 27/8 works?

comment:25 Changed 5 years ago by keemllib@…

Sorry, no failures after compiling with --disable-mythlogserver.

On my other host (both on 0.27-beta-89 at the time) I only ran 100 tests and had 3 cores. It was compiled without --disable-mythlogserver.

To answer your question on the developer's channel: --lock just sets the shutdown lock in the database (MythShutdownLock). I use it when NOT running a frontend and accessing the host via ssh, thus preventing the backend's Idle Shutdown Timer from shutting down my system.

I chose the --lock option for testing and just stuck with it so that all the tests would be the same.

comment:26 Changed 5 years ago by JYA

Resolution: Fixed
Status: infoneeded_new → closed

Marking as fixed; will make --disable-mythlogserver the default.
