Opened 3 years ago

Closed 22 months ago

#10867 closed Bug Report - Crash (Fixed)

Segv from mythshutdown in MSqlDatabase

Reported by: keemllib@…
Owned by:
Priority: blocker
Milestone: 0.27
Component: MythTV - Mythwelcome & Mythshutdown
Version: Master Head
Severity: medium
Keywords:
Cc:
Ticket locked: no

Description

Caused when running:

mythshutdown --quiet --nodblog --unlock

This does not happen every time the above is run. When trying the above with --logpath /tmp --loglevel debug, no log file is produced. mythlogserver is running.

Perhaps related, but without dropping core, running the above may print:

Error: Not all threads were shut down properly: 
Thread SignalingTimer is still running

or

Real-time signal 0

This system is running: Ubuntu 12.04 LTS with a 3.2.0 kernel.

Attachments (4)

gdb6804.out (19.0 KB) - added by keemllib@… 3 years ago.
version.txt (808 bytes) - added by keemllib@… 3 years ago.
gdb-mythshutdown-11542.txt (19.7 KB) - added by Bill Meek <keemllib@…> 3 years ago.
mythshutdown --lock running on v0.26-pre-842-gfb9b725
gdb-mythshutdown-5417.txt (28.1 KB) - added by Bill Meek <keemllib@…> 2 years ago.

Change History (30)

Changed 3 years ago by keemllib@…

Changed 3 years ago by keemllib@…

comment:1 Changed 3 years ago by beirdo

  • Component changed from MythTV - General to MythTV - Mythwelcome & Mythshutdown

comment:2 Changed 3 years ago by beirdo

  • Type changed from Bug Report - General to Bug Report - Crash

comment:3 Changed 3 years ago by Bill Meek <keemllib@…>

Update:

Turns out that mythbuntu's apport was stomping on my:

kernel.core_pattern = /tmp/core.%t.%u.%p.%e

With that restored, I'm now seeing the same core dumps on mythavtest, mythfilldatabase, mythbackend --setloglevel and mythcommflag.

These SEGVs can occur on the 1st run of a program, or it can take as many as 40 attempts to get a core file.

I'm now running on v0.26-pre-836-g42b2b24.

Please note that the "Error: Not all threads were shut down properly:" symptom above matches #10898.

comment:4 Changed 3 years ago by danielk

  • Milestone changed from unknown to 0.26
  • Owner set to beirdo
  • Status changed from new to assigned

Looks like the MythSignalingTimer for logging isn't being shut down in or before the MythContext destructor.

comment:5 Changed 3 years ago by beirdo

This is due to the following code in MythSignalingTimer::start():

        while (dorun && !running)
            usleep(10 * 1000);

That is where it is in the backtrace. If we finish and shut down quickly enough, stop() never actually stops the timer: dorun is mutex protected, start() still holds that mutex while it waits in this loop, and stop() is blocked waiting for it. I'm going to add some lock frobbage in here.

comment:6 Changed 3 years ago by Gavin Hurlbut <ghurlbut@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In 1dc048f1b71c92e4bb81a9e2ae6766cbd2df1c16/mythtv:

Unlock the startstop mutex while waiting for timer startup

If we start and stop the timer too quickly, the lock can be still held by
the start() function, and the stop() sits deadlocked waiting for it. As this
delay is primarily to wait for the thread to come up before returning from
start(), it should be safe to release the mutex while waiting to allow the
stop() to take effect.

If this doesn't fix the issue, please reopen the ticket

Fixes #10867
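
The idea in the commit message above can be sketched in standard C++. This is only an illustration under stated assumptions, not the MythTV code: SketchTimer, m_startStopLock, m_dorun and m_running are hypothetical names, and std::thread stands in for the real timer thread. The point it shows is that start() drops the start/stop mutex on each pass of its wait loop, so a concurrent stop() can take the lock instead of blocking behind it.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a signaling timer; NOT the MythTV class.
class SketchTimer
{
  public:
    ~SketchTimer() { stop(); }

    void start()
    {
        std::unique_lock<std::mutex> lock(m_startStopLock);
        if (m_worker.joinable())
            return;  // already started

        m_dorun = true;
        m_worker = std::thread([this] { run(); });

        // Wait for the worker to come up, but release the mutex while
        // sleeping so that stop() is never stuck behind this loop.
        while (m_dorun && !m_running)
        {
            lock.unlock();
            std::this_thread::sleep_for(std::chrono::milliseconds(10));
            lock.lock();
        }
    }

    void stop()
    {
        // The worker never takes this mutex, so joining under it is safe.
        std::lock_guard<std::mutex> lock(m_startStopLock);
        m_dorun = false;
        if (m_worker.joinable())
            m_worker.join();
    }

    bool isRunning() const { return m_running; }

  private:
    void run()
    {
        m_running = true;
        while (m_dorun)
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        m_running = false;
    }

    std::mutex        m_startStopLock;
    std::atomic<bool> m_dorun{false};
    std::atomic<bool> m_running{false};
    std::thread       m_worker;
};
```

Calling start() and then stop() in quick succession on a SketchTimer should return promptly in this sketch, because the only code that sleeps while waiting for the worker does so with the mutex released.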

comment:7 Changed 3 years ago by Bill Meek <keemllib@…>

No joy. I ran 50 mythshutdown --logpath /tmp --loglevel debug --lock tests and got the following:

10 core files
 8 Real-time signal 0s
23 Error: Not all threads were shut down properly:

I only looked at the backtrace on the last core file and it appears the same as before v0.26-pre-840-g1dc048f. As before, no mythshutdown...log files were created.

comment:8 Changed 3 years ago by Bill Meek <keemllib@…>

  • Resolution fixed deleted
  • Status changed from closed to new

comment:9 Changed 3 years ago by beirdo

I can not recreate the problem here at all. As for no logging, it could just be that mythshutdown --lock doesn't log anything. I don't use mythshutdown at all, so I've never really looked.

I'd suggest a make uninstall, delete any remaining installed files, make distclean, make, make install. It could be you have a mix of old and new code somehow.

comment:10 Changed 3 years ago by beirdo

Hmm, for logging, I think I see the issue. However, I never see a crash.

comment:11 Changed 3 years ago by beirdo

Yes, add -v general to your command line if you want any logging. mythshutdown has a default mask of -v none. And with that, I see the threads not shut down properly nonsense. Ahh, it's because there is no CleanupGuard in mythshutdown. I'll try to get you a fix momentarily.
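
For readers unfamiliar with the pattern, a cleanup guard is an object whose destructor runs a cleanup routine when the enclosing scope exits. The following is a generic sketch of that idea, not MythTV's CleanupGuard: ScopeCleanup and runTool are hypothetical names invented for illustration. It shows that the cleanup runs on every return path, including early exits.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical scope guard; an illustration, NOT MythTV's CleanupGuard.
class ScopeCleanup
{
  public:
    explicit ScopeCleanup(std::function<void()> cleanup)
        : m_cleanup(std::move(cleanup)) {}

    // The destructor is where the cleanup actually happens.
    ~ScopeCleanup() { if (m_cleanup) m_cleanup(); }

    ScopeCleanup(const ScopeCleanup &) = delete;
    ScopeCleanup &operator=(const ScopeCleanup &) = delete;

  private:
    std::function<void()> m_cleanup;
};

// Hypothetical tool body: the guard fires on every return path.
inline int runTool(bool failEarly, int &cleanupCalls)
{
    ScopeCleanup guard([&cleanupCalls] { ++cleanupCalls; });
    if (failEarly)
        return 1;  // early exit still triggers the cleanup
    return 0;
}
```

Without such a guard, a short-lived program has to remember to call its shutdown routine before every return, which is easy to miss in a tool that exits quickly.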

comment:12 Changed 3 years ago by Gavin Hurlbut <ghurlbut@…>

In fb9b72518b2b2b3912f3ee633f02c185fd93865d/mythtv:

Delete logThread in logStop

Seems I never actually deleted the logThread when stopping logging. This was
causing issues as then the timers don't get stopped (as that's in the dtor)
on quick runs, causing nasty messages, and maybe even crashes.

Refs #10867
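
The failure mode described in this commit message can be illustrated with a small sketch. It assumes, as the message says, that the timers are only stopped in the destructor; SketchLogThread, g_logThread, sketchLogStart and sketchLogStop are hypothetical names, and a bool flag stands in for the real timer. If the stop routine never destroys the object, the destructor never runs and the timer stays active.

```cpp
#include <cassert>
#include <memory>

// Hypothetical logger thread object: its destructor is the only place
// where the (simulated) timer is stopped. NOT the MythTV class.
struct SketchLogThread
{
    explicit SketchLogThread(bool &timerActive) : m_timerActive(timerActive)
    {
        m_timerActive = true;   // timer starts with the object
    }
    ~SketchLogThread() { m_timerActive = false; }  // timer stops here only

    bool &m_timerActive;
};

// Hypothetical global owner of the logger thread.
static std::unique_ptr<SketchLogThread> g_logThread;

inline void sketchLogStart(bool &timerActive)
{
    g_logThread = std::make_unique<SketchLogThread>(timerActive);
}

inline void sketchLogStop()
{
    // Destroying the object is what runs the destructor and stops the
    // timer; leaving this out would leave the timer running at exit.
    g_logThread.reset();
}
```

Using an owning smart pointer here is a design choice of the sketch: reset() is safe to call more than once, and the object is also destroyed at program exit if the stop routine is never reached.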

comment:13 Changed 3 years ago by beirdo

Please let me know if that cleans it up for you. So surprised I never noticed this before.

comment:14 Changed 3 years ago by Bill Meek <keemllib@…>

Sorry for the delay, ISP went belly up.

I believe the "Error: Not all threads were shut down properly:" is fixed.

100 Tests on v0.26-pre-842-gfb9b725 adding -v general --quiet --nodblog

 1 core file
19 Real-time signal 0s
 0 Error: Not all threads were shut down properly:

100 with the original test:

 1 core file
16 Real-time signal 0s
 0 Error: Not all threads were shut down properly:

Perhaps specifying --logAnything without -v isn't a valid test, so the next 100 were just mythshutdown --lock.

15 core files
10 Real-time signal 0s
 0 Error: Not all threads were shut down properly:

All builds were done after a distclean and removal of everything MythTV under /usr/local. I even did a git clean -xfd.

comment:15 Changed 3 years ago by beirdo

Ok. Could you post an updated coredump for me so I can make sure I'm chasing the right things? Thanks.

Changed 3 years ago by Bill Meek <keemllib@…>

mythshutdown --lock running on v0.26-pre-842-gfb9b725

comment:16 Changed 3 years ago by Bill Meek <keemllib@…>

Fresh backtrace attached. With my kernel.core_pattern fixed, I see that when my backend runs: mythshutdown -c --quiet --syslog none, it too drops core at times.

comment:17 Changed 3 years ago by beirdo

  • Milestone changed from 0.26 to unknown
  • Owner beirdo deleted
  • Status changed from new to assigned

The source of this coredump seems to be something different now. As I can not reproduce this at all, and I can't see it being related to the code I was tweaking anymore, I'm punting this back for others to look at.

comment:18 Changed 3 years ago by stuartm

  • Milestone changed from unknown to 0.27
  • Priority changed from minor to major
  • Status changed from assigned to new
  • Summary changed from Segv from mythshutdown to Segv from mythshutdown in MSqlDatabase

comment:19 follow-up: Changed 2 years ago by paulh

Bill, can you still reproduce this? A lot has changed since you opened the ticket.

comment:20 in reply to: ↑ 19 Changed 2 years ago by Bill Meek <keemllib@…>

Replying to paulh:

I can still reproduce it, 3 core files out of 200 tests. Attaching a fresh backtrace, with --version at the beginning of the file.

There were 3 major events prior to this starting: upgrading from Ubuntu 10.04 to 12.04, the UTC conversion, and 0MQ/mythlogserver. The only one I can easily "revert" is the last one. So back on 7/7/2013 I switched to the devel/logging branch and ran 1000 tests. FWIW, there were no failures.

Changed 2 years ago by Bill Meek <keemllib@…>

comment:21 Changed 23 months ago by stuartm

  • Priority changed from major to blocker

comment:22 Changed 22 months ago by jyavenard

  • Status changed from new to infoneeded_new

Bill, could you retry compiling myth with configure --disable-mythlogserver and report back? Thanks.

comment:23 Changed 22 months ago by Bill Meek <keemllib@…>

Already did on 8/27/13. Over 1000 tests done by script and no failures. In normal operation, I'd see 0-3 cores per day and now see none.

I should also note that the other high runner core is from mythpreviewgen and I'm seeing none from it with the logserver out of the picture.

I just wanted to get a few days of normal use (as opposed to banging mythshutdown from a script every 1 second) before declaring success.

I've also seen no problems with logging. My normal logging is done with --logpath, -q and (previously) --nodblog.

Sorry not to have seen more folks on the lists/channels with the problem. I'm debating about recommending this for the release notes. I'd think *buntu (perhaps Debian) users would want it.

Thanks.

comment:24 Changed 22 months ago by jyavenard

Do you mean you had no failures when compiled with --disable-mythlogserver? Or is it that any myth from 27/8 works?

comment:25 Changed 22 months ago by keemllib@…

Sorry, no failures after compiling with --disable-mythlogserver.

On my other host (both on 0.27-beta-89 at the time) I only ran 100 tests and had 3 cores. It was compiled without --disable-mythlogserver.

To answer your question on the developers' channel: --lock just sets the shutdown lock in the database (MythShutdownLock). I use it when NOT running a frontend and accessing the host via ssh, thus preventing the backend's Idle Shutdown Timer from shutting down my system.

I chose the --lock option for testing and just stuck with it so that all the tests would be the same.

comment:26 Changed 22 months ago by jyavenard

  • Resolution set to Fixed
  • Status changed from infoneeded_new to closed

Marking as fixed; will make --disable-mythlogserver the default.
