Opened 14 years ago

Closed 14 years ago

Last modified 14 years ago

#9168 closed defect (Fixed)

use of toUtf8() in myth_system_fork can *still* wedge the child: multiple mythfrontend.real processes; no sound

Reported by: Kevin Buhr <buhr@…> Owned by: beirdo
Priority: major Milestone: 0.24
Component: MythTV - General Version: Master Head
Severity: medium Keywords:
Cc: Ticket locked: no

Description

Short story: the patch in bug #9016 isn't a complete fix for locking issues in myth_system_fork(). toUtf8() can cause a deadlock, and operator+(QString,...) probably remains an issue, too.

After an upgrade to Mythbuntu's trunk autobuild (0.24.0~trunk26882-0ubuntu0~mythbuntu2), I noticed that sometimes after switching programs or stopping and restarting Live TV, sound would stop working. I tracked it down to lingering extra "mythfrontend.real" processes that were launched to poke the screen savers but never got there.

In the front-end logs, I'd see:

2010-10-31 21:38:49.049 ScreenSaverX11Private: Calling xscreensaver-command -deactivate
2010-10-31 21:38:49.049 Launching: xscreensaver-command -deactivate >&- 2>&- &
2010-10-31 21:38:49.056 PID 12775: launched in the background, not waiting
2010-10-31 21:38:49.056 ScreenSaverX11Private: Calling gnome-screensaver-command --poke
2010-10-31 21:38:49.056 Launching: gnome-screensaver-command --poke >&- 2>&- &
2010-10-31 21:38:49.064 PID 12776: launched in the background, not waiting
2010-10-31 21:38:49.271 PID 12775: exited: status=0, result=0

with PID 12776 forked but not execed and never exiting, and sound would fail with:

2010-10-31 21:57:29.125 Opening ALSA audio device 'iec958:CARD=Intel,DEV=0'.
2010-10-31 21:57:29.125 ALSA, Error: snd_pcm_open(iec958:CARD=Intel,DEV=0): Device or resource busy
2010-10-31 21:57:29.125 AudioOutput Error: Aborting reconfigure

If the extra process was killed, sound would start working again.

A backtrace of one of the processes showed it getting stuck in the toUtf8() call after the fork and just before the execl:

buhr@medi:~$ sudo gdb /usr/bin/mythfrontend.real 12776
[ . . . ]
__lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
136     ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.
        in ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S
(gdb) info stack
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007f1102647864 in _L_lock_1024 () from /lib/libpthread.so.0
#2  0x00007f11026476c7 in __pthread_mutex_lock (mutex=0x7f11023b7120) at pthread_mutex_lock.c:82
#3  0x00007f110215e7a9 in ?? () from /usr/lib/nvidia-current/libGL.so.1
#4  0x00007f1102161709 in ?? () from /usr/lib/nvidia-current/libGL.so.1
#5  0x00007f1102161b3a in ?? () from /usr/lib/nvidia-current/libGL.so.1
#6  0x00007f10fceba98d in ?? () from /usr/lib/nvidia-current/tls/libnvidia-tls.so.260.19.06
#7  0x00007f10feefe11d in QByteArray::resize(int) () from /usr/lib/libQtCore.so.4
#8  0x00007f10ff025fee in ?? () from /usr/lib/libQtCore.so.4
#9  0x00007f10fef4017e in QString::toUtf8() const () from /usr/lib/libQtCore.so.4
#10 0x00007f1106d4518a in myth_system_fork (command=..., result=@0x7fff64f6ed0c) at mythsystem.cpp:411
#11 0x00007f1106d44360 in myth_system (command=..., flags=23, timeout=0) at mythsystem.cpp:256
#12 0x00007f11069e6032 in ScreenSaverX11Private::ResetScreenSaver (this=0x7f10f002ffa0) at screensaver-x11.cpp:218
#13 0x00007f11069e34fe in ScreenSaverX11::resetSlot (this=0x7f10f0033aa0) at screensaver-x11.cpp:346
#14 0x00007f1106a17554 in ScreenSaverX11::qt_metacall (this=0x7f10f0033aa0, _c=QMetaObject::InvokeMetaMethod, _id=0, 
    _a=0x7fff64f6ef50) at moc_screensaver-x11.cpp:74
[ . . . ]

Note that there are probably additional potential deadlocks with operator+() on QStrings, since I assume that function can call QByteArray::resize() too.

Change History (7)

comment:1 Changed 14 years ago by beirdo

Owner: set to beirdo
Status: newaccepted

comment:2 Changed 14 years ago by beirdo

Resolution: Fixed
Status: acceptedclosed

(In [27064]) Remove the use of QString and especially toUtf8() in myth_system_fork in the child process as there are remaining occasional lock issues.

Fixes #9168

comment:3 Changed 14 years ago by beirdo

(In [27065]) Merging [27064] from trunk.

Remove the use of QString and especially toUtf8() in myth_system_fork in the child process as there are remaining occasional lock issues.

Fixes #9168

comment:4 Changed 14 years ago by beirdo

Milestone: unknown0.24
Priority: minormajor

comment:5 Changed 14 years ago by Bill Meek <keemllib@…>

It appears that 'cmdargs' (or its contents) is getting stepped on.

After applying [27064], commands from preSDWUCheckCommand and MiscStatusScript? etc. don't always run. For example:

010-11-02 08:15:52.791 Launching: /usr/local/sbin/LocalPreShutdownTests.sh
2010-11-02 08:15:52.825 PID 1848: launched
sh: uL: not found
2010-11-02 08:15:52.892 PID 1848: exited: status=32512, result=127
2010-11-02 08:15:53.899 Launching: q2 (sometimes=^A2 or -2 or just empty)
2010-11-02 08:15:53.916 PID 1849: launched
sh: Àq2: not found

Reverting to [27063] allows the commands to run.

Not a patch, but this (very ugly) change allows commands to run with [27064]:

--- libs/libmythdb/mythsystem.cpp	(revision 27064)
+++ libs/libmythdb/mythsystem.cpp	(working copy)
@@ -361,7 +361,8 @@
 #ifndef USING_MINGW
 pid_t myth_system_fork(const QString &command, uint &result)
 {
-    const char *cmdargs = command.toUtf8().constData();
+    char cmdargs[256];
+    strcpy(cmdargs, command.toUtf8().constData());

comment:6 Changed 14 years ago by beirdo

(In [27078]) Change cmdargs and locerr to be static buffers of 1024 size, and copy the strings into them. It seems that between when I tested it, and now (a day later), it started acting stupid. This fixes it.

Fixes #9168 (again)

comment:7 Changed 14 years ago by beirdo

(In [27079]) Merged [27078] from trunk

Change cmdargs and locerr to be static buffers of 1024 size, and copy the strings into them. It seems that between when I tested it, and now (a day later), it started acting stupid. This fixes it.

Fixes #9168 (again)

Note: See TracTickets for help on using tickets.