Ticket #7135 (closed defect: invalid)
Opened 2 years ago
Last modified 2 years ago
multiple [mythfrontend] <defunct>
| Reported by: | simons.philippe@… | Owned by: | ijr |
|---|---|---|---|
| Priority: | trivial | Milestone: | unknown |
| Component: | MythTV - General | Version: | head |
| Severity: | low | Keywords: | mythwelcome |
| Cc: | Ticket locked: | no |
Description
My box starts mythwelcome at boot (through autologin of the mythtv user and a .xinitrc). I have 3 instances of mythfrontend, and 2 of them are <defunct>.
This could be because mythwelcome tries to launch mythfrontend before mythbackend is completely up and running.
Attachments
Change History
comment:1 Changed 2 years ago by cpinkham
- Status changed from new to infoneeded_new
comment:2 Changed 2 years ago by simons.philippe@…
Here is the mythwelcome log, but there is not much interesting in it...
comment:3 Changed 2 years ago by danielk
- Milestone changed from 0.22 to unknown
Nothing notable in the log. We're probably just not waiting on the child process some place in mythwelcome. As a rule this doesn't consume any resources aside from the entry in the process table, so not a big deal to fix before 0.22.
Simon, are you seeing this with trunk, not 0.21-fixes? In trunk, we use myth_system(), which should be waiting on the mythfrontend pid to exit.
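For readers unfamiliar with how a <defunct> entry arises, here is a minimal Linux-only sketch of the mechanism described in this comment: a child that has exited but has not been waited on stays in the process table in state Z until the parent reaps it. The function names (process_state, spawn_unreaped_child, reap) are hypothetical and are not taken from MythTV or myth_system(); the sketch assumes a mounted /proc.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <unistd.h>
#include <cassert>
#include <fstream>
#include <string>

// Return the one-letter state from /proc/<pid>/stat (Linux only),
// or '?' if the process table entry no longer exists.
char process_state(pid_t pid)
{
    assert(pid > 0);
    std::ifstream stat("/proc/" + std::to_string(pid) + "/stat");
    std::string line;
    if (!stat || !std::getline(stat, line))
        return '?';
    // Format is "pid (comm) state ..."; comm may contain ')',
    // so search for the closing parenthesis from the end.
    std::string::size_type close = line.rfind(')');
    if (close == std::string::npos || close + 2 >= line.size())
        return '?';
    return line[close + 2];
}

// Fork a child that exits at once, then block until it has exited
// WITHOUT reaping it (WNOWAIT), so it is left behind as a zombie.
pid_t spawn_unreaped_child()
{
    pid_t pid = fork();
    if (pid == 0)
        _exit(0);
    if (pid < 0)
        return -1;
    siginfo_t info;
    waitid(P_PID, pid, &info, WEXITED | WNOWAIT);
    return pid;
}

// Reap the child; this is the step that removes the <defunct> entry.
bool reap(pid_t pid)
{
    int status = 0;
    return waitpid(pid, &status, 0) == pid;
}
```

Between spawn_unreaped_child() and reap(), ps would list the child as <defunct>; it holds only its process table slot, which matches the "not a big deal" assessment above.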
comment:4 Changed 2 years ago by simons.philippe@…
Yup, only with trunk; I didn't see this with 0.21-fixes.
comment:5 Changed 2 years ago by paulh
- Status changed from infoneeded_new to new
Are you sure this is a problem with MythWelcome?? I'm seeing a defunct mythfrontend process from just starting it and letting it sit on the menu.
comment:6 Changed 2 years ago by simons.philippe@…
honestly, no, it was an assumption (seems i was wrong)
comment:7 Changed 2 years ago by Josh Winters <fuxxociety@…>
At the request of sphery from #mythtv-users:
I'm seeing the mythfrontend <defunct> processes and am NOT using mythwelcome.
My system is a mythbuntu-based machine, without VDPAU, operating as a remote frontend.
I removed all traces of mythtv that I could find before installing mythtv 0.22-fixes.
This includes removing the old autostart entry from "Startup Programs" and replacing it with my own.
Attaching a snippet of 'ps -efw' for reference.
comment:8 Changed 2 years ago by mdean
- Component changed from MythTV - Mythwelcome & Mythshutdown to MythTV - General
- Summary changed from mythwelcome creating mythfrontend <defunct> to multiple [mythfrontend] <defunct>
Seems unrelated to mythwelcome.
comment:9 Changed 2 years ago by Dibblah
- Status changed from new to infoneeded_new
Can you provide compressed logs with -v all, please - and a matching ps -efw for when you see the issue.
comment:10 Changed 2 years ago by Josh Winters <fuxxociety@…>
This may or may not be important, but I notice that once I get the <defunct> processes, they are reaped by the kernel when mythfrontend is killed (as they should be).
However, when I restart mythfrontend, the defunct processes come back with the new mythfrontend instance. This behavior is occurring as soon as mythfrontend is started, no sort of interaction with mythfrontend has been done otherwise.
comment:11 Changed 2 years ago by derliebegott@…
I am also doing the same:
- autologin with mingetty on tty7
- start mythwelcome from .xinitrc
I always see two defunct mythfrontend processes but there are no visible problems. Logs are absolutely ok.
root@mythbox:/tmp# ps -efw | grep mythfront
mythtv 4004 3944 0 08:43 tty7 00:00:04 /usr/bin/mythfrontend -d -v general
mythtv 4023 4004 0 08:43 tty7 00:00:00 [mythfrontend] <defunct>
mythtv 4026 4004 0 08:43 tty7 00:00:00 [mythfrontend] <defunct>
comment:12 Changed 2 years ago by Bill Meek <llibkeem@…>
On my combined frontend/backend running mythbuntu 9.04, when mythfrontend is started, I get:
PID PPID USER STAT COMMAND
13399 1 bill Rl mythfrontend --verbose all --logfile /var/log/mythtv/0.22-fe.log
13420 13399 bill Z \_ [mythfrontend] <defunct>
13423 13399 bill Z \_ [mythfrontend] <defunct>
13425 13399 bill Z \_ [mythfrontend] <defunct>
13428 13399 bill Z \_ [mythfrontend] <defunct>
13431 13399 bill Z \_ [mythfrontend] <defunct>
13434 13399 bill Z \_ [mythfrontend] <defunct>
13437 13399 bill Z \_ [mythfrontend] <defunct>
13440 13399 bill Z \_ [mythfrontend] <defunct>
13443 13399 bill Z \_ [mythfrontend] <defunct>
MythTV Version : 22679M
MythTV Branch : trunk
Network Protocol : 50
Library API : 0.22.20091022-1
QT Version : 4.5.0
My frontend is not started automatically.
This was for a 4 minute session. Started, waited for 'quiet' log and exited.
Logfile attached (I think.)
Bill
comment:13 Changed 2 years ago by Bill Meek <llibkeem@…>
Started wondering why I have 13 defuncts and the report before mine and the original had only 2. So, I plugged in an SD card into my card reader, restarted the frontend and my defunct count dropped from 13 to 12.
Most of the time, there are no cards plugged into the card reader.
On a roll here, I shutdown and disconnected the USB plug for the card reader (which has 5 slots CF/SD/uSD...). Restarting the frontend again, I got 2 defuncts, (for /dev/sd0?) which I'm guessing match log entries:
MMUnix::AddDevice() Error: failed to stat /dev/bdi,
MMUnix::AddDevice() Error: failed to stat /dev/power,
When the card reader is plugged in, there are 12 Error entries. 2 each for /dev/sd[defgh] and /dev/sr0.
bill@rc1:~/Download$ zcat mlog.gz|cut -c25- |grep /dev/
MMUnix::AddDevice() Error: failed to stat /dev/bdi,
MMUnix::AddDevice() Error: failed to stat /dev/power,
MMUnix::AddDevice() - Added /dev/sdd
MMUnix::AddDevice() Error: failed to stat /dev/bdi,
MMUnix::AddDevice() Error: failed to stat /dev/power,
MMUnix::AddDevice() - Added /dev/sde
MMUnix::AddDevice() Error: failed to stat /dev/bdi,
MMUnix::AddDevice() Error: failed to stat /dev/power,
MMUnix::AddDevice() - Added /dev/sdf
MMUnix::AddDevice() Error: failed to stat /dev/bdi,
MMUnix::AddDevice() Error: failed to stat /dev/power,
MMUnix::AddDevice() - Added /dev/sdg
MMUnix::AddDevice() Error: failed to stat /dev/bdi,
MMUnix::AddDevice() Error: failed to stat /dev/power,
MMUnix::AddDevice() - Added /dev/sdh
MMUnix::AddDevice() Error: failed to stat /dev/bdi,
MMUnix::AddDevice() Error: failed to stat /dev/power,
MMUnix::AddDevice() - Added /dev/sr0
There are truly no /dev/bdi or /dev/power files, however,
/sys/devices/pci0000:00/0000:00:14.1/host6/target6:0:0/6:0:0:0/block/sr0/bdi
exists.
My frontend logs go back as far as 2009-06-16, which is when I started running the trunk. The 1st entry with this type of error started on 2009-07-21 and I was at 20844. I update my box about every 100 commits. The card reader was purchased on 2008-11-12 and most likely installed the same day, although I won't swear to that.
Hope this helps.
Bill
Changed 2 years ago by Bill Meek <llibkeem@…>
- attachment mediamonitor.diff added
hard codes udevadm rather than udevinfo (deprecated?)
comment:14 Changed 2 years ago by Bill Meek <llibkeem@…>
The attached changes work on a 9.04 mythbuntu distribution. If there are still distributions without udevadm, this 'fix' will give them the same problem we're seeing in this ticket.
Point me in the right direction and give me a shove and I'd be happy to make a real fix.
Details:
trunk/mythtv/libs/libmyth/mediamonitor-unix.cpp executes udevinfo, which doesn't exist in mythbuntu 9.04 and ubuntu 9.10 (the two distributions I have).
% type udevinfo
-bash: type: udevinfo: not found
% type udevadm
udevadm is /sbin/udevadm
If the device is valid, both return the full path, as in:
udevinfo -q name -rp /sys/block/sdd (existing code)
udevadm info -q name -rp /sys/block/sdd (proposed)
/dev/sdd
In the error case, (... -q name -rp /sys/block/sdfoo) the existing 'udevinfo' code checks for a response of:
device not found in database
but udevadm returns:
device path not found
Also, if udevinfo is used but linked to udevadm, the following will appear in mythfrontend.log:
MMUnix::GetDeviceFile(/sys/block/sdd) - udevinfo error... the program '/usr/local/bin/mythfrontend' called 'udevinfo', it should use 'udevadm info <options>', this will stop working in a future release
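The two error strings quoted above differ, so any check that only looks for the udevinfo wording would miss a udevadm failure. A minimal sketch of a check that accepts either wording follows. The function name udev_lookup_failed is hypothetical and is not from mediamonitor-unix.cpp; the sketch assumes the tool's output has already been captured into a string.

```cpp
#include <string>

// Return true if the captured tool output contains either of the two
// "not found" messages quoted in this comment.
bool udev_lookup_failed(const std::string &output)
{
    static const char *const kErrors[] = {
        "device not found in database", // wording reported for udevinfo
        "device path not found",        // wording reported for udevadm info
    };
    for (const char *error : kErrors)
    {
        if (output.find(error) != std::string::npos)
            return true;
    }
    return false;
}
```

A successful lookup such as "/dev/sdd" matches neither string, so it is treated as a valid device path.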
comment:15 Changed 2 years ago by mdean
Refs #6137
comment:16 Changed 2 years ago by stuartm
- Status changed from infoneeded_new to new
We can't switch to udevadm, it's root-only on some distributions.
comment:17 Changed 2 years ago by mdean
- Status changed from new to infoneeded_new
Attached patch, mythtv-7135-defunct_processes.patch , might work to prevent the zombie processes. I can't reproduce the issue, so I'm posting the patch for others to test.
The only way I could get defunct processes with my contrived test application was to delete the QProcess before the process exited. It's possible that this can happen when deleteLater() is called right after kill(), so the patch just puts another waitForFinished() call after the kill() to allow the process to die before deleteLater() gets called (it will wait much less than 2s, but will give up after 2s if the kill() fails). This extra waitForFinished() should probably be done regardless of whether it solves the issue or not.
However, in theory, if the specified application doesn't exist, waitForStarted() should return false, meaning that in the described cases, we should be returning from the waitForStarted() block above. (If you enable -v important,general,media on mythfrontend, you should see "Error - udevinfo failed to start!" and/or "Error - udevinfo failed to end! Terminating" which will indicate whether the problem is in the waitForStarted() or waitForFinished() block.) If the problem is in the waitForStarted() block, we may need to do the same kill() and waitForFinished() in it before deleteLater().
If someone can test and adjust the patch as necessary, please report back whether it works (and upload any modified version of the patch). #6137 (udevadm vs udevinfo) will be handled separately.
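The ordering this comment argues for (kill, then wait for the process to die, and only then drop the handle) can be illustrated outside Qt with plain POSIX calls. This is not the attached patch and does not use QProcess; kill_and_reap and spawn_sleeping_child are hypothetical names, and the 2000 ms budget used in the usage note simply mirrors the 2s figure mentioned above.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <unistd.h>
#include <cerrno>

// Start a child that would otherwise linger for a minute.
pid_t spawn_sleeping_child()
{
    pid_t pid = fork();
    if (pid == 0)
    {
        sleep(60);
        _exit(0);
    }
    return pid;
}

// Send SIGKILL, then poll waitpid() for up to timeout_ms so the child is
// reaped before the caller forgets about it. Returns true once reaped.
bool kill_and_reap(pid_t pid, int timeout_ms)
{
    if (kill(pid, SIGKILL) != 0 && errno != ESRCH)
        return false;

    const int step_ms = 10;
    for (int waited = 0; waited <= timeout_ms; waited += step_ms)
    {
        int status = 0;
        pid_t result = waitpid(pid, &status, WNOHANG);
        if (result == pid)
            return true;  // child has been reaped, no zombie left behind
        if (result < 0)
            return false; // not our child, or already reaped elsewhere
        usleep(step_ms * 1000);
    }
    return false;         // gave up; the child may still become a zombie
}
```

Calling kill_and_reap(pid, 2000) normally returns well before the timeout, because a SIGKILLed child exits almost immediately.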
Changed 2 years ago by Bill Meek <llibkeem@…>
- attachment mediamonitor-unix.cpp-svn-diff added
Patch as modified per mdean's request.
comment:18 Changed 2 years ago by Bill Meek <llibkeem@…>
Thanks for the quick response. Unfortunately, the kill()/waitFor...() patch didn't eliminate the defunct processes.
Your theory is spot on, the "Error - udevinfo failed to start!" leg is the one taken.
I've attached your patch, to be sure I modified it correctly. Also, attached is a stand alone file with libudev tests that would solve??? #6137 and eliminate the need for these changes.
The program will take /sys/block/sda and find /dev/sda like udevinfo/adm do.
Full disclosure, I don't know spit about udev, but would be willing to try modifying the program if it looks reasonable. It did require getting libudev-dev, which I'm guessing is a drawback.
Changed 2 years ago by Bill Meek <llibkeem@…>
- attachment mdean.patch.mediamonitor-unix.cpp added
comment:19 Changed 2 years ago by Bill Meek <llibkeem@…>
The fix from the mailing list eliminates the defunct processes.
Nice work!
To be clear, the udev.c attachment I added wasn't intended as a fix or workaround, but a test of the udev library which could replace the existing calls to udevinfo.
What I don't know is how to find out which distributions have the library and/or if they require root access to run them.
comment:20 Changed 2 years ago by Bill Meek <llibkeem@…>
Bad test.
I had a udevinfo script that called udevadm in place when I tested.
Removed it and the defuncts are still happening.
Sorry.
comment:21 Changed 2 years ago by JohnnyJboss <johnnyjboss@…>
I have also tested this fix and it does not work.
1648 ? Rsl 0:11 /usr/local/bin/mythfrontend
1746 ? Z 0:00 \_ [mythfrontend] <defunct>
1748 ? Z 0:00 \_ [mythfrontend] <defunct>
comment:22 Changed 2 years ago by stuartm
- Status changed from infoneeded_new to new
- Severity changed from medium to low
comment:23 Changed 2 years ago by Bill Meek <llibkeem@…>
Update: haven't given up, but the number of defuncts changes. I can't figure out why.
I usually had about 15 defunct processes. The next patch dropped that to a solid 5. Both "bdi" and "power" appear on my machine, "trace" was added because I saw a comment on gossamer-threads by jarpublic. This seems like a keeper.
Index: mediamonitor-unix.cpp
===================================================================
--- mediamonitor-unix.cpp (revision 22872)
+++ mediamonitor-unix.cpp (working copy)
@@ -553,7 +553,9 @@
// skip some sysfs dirs that are _not_ sub-partitions
if (*pit == "device" || *pit == "holders" || *pit == "queue"
- || *pit == "slaves" || *pit == "subsystem")
+ || *pit == "slaves" || *pit == "subsystem"
+ || *pit == "bdi" || *pit == "power"
+ || *pit == "trace")
continue;
found_partitions |= FindPartitions(
I suspect that mediamonitor-unix.cpp isn't and wasn't causing the actual defuncts to occur. When I was at 5, there were a matching 5 failures to mount. That makes sense, because there are no memory cards plugged in any of the 5 slots. In these cases, myth_system() is called to do the mounts and I tried the below. But the number of defuncts now floats between 1 and 6. Not recommending this as a fix, but the change does make a difference.
Index: mythmedia.cpp
===================================================================
--- mythmedia.cpp (revision 22872)
+++ mythmedia.cpp (working copy)
@@ -121,7 +121,7 @@
.arg(m_DevicePath);
VERBOSE(VB_MEDIA, QString("Executing '%1'").arg(MountCommand));
- if (0 == myth_system(MountCommand))
+ if (0 == myth_system(MountCommand, MYTH_SYSTEM_DONT_BLOCK_PARENT))
{
if (DoMount)
{
I keep tripping over why the udevinfo success case works and the error case doesn't. udevinfo returns the same string, although the default has no trailing newline. I tried appending a newline to the default case (ret.append(QChar('\n'))), to no avail.
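The skip list in the mediamonitor-unix.cpp patch in this comment grows by one "||" clause per directory name. As an illustration only, the same test can be written as a set lookup, which keeps the names in one place. The function name is_non_partition_sysfs_dir is hypothetical; the eight names are exactly the ones in the patched condition.

```cpp
#include <set>
#include <string>

// Return true for sysfs entries under a block device that are known
// not to be sub-partitions (names taken from the patch in this comment).
bool is_non_partition_sysfs_dir(const std::string &name)
{
    static const std::set<std::string> kSkip = {
        "device", "holders", "queue", "slaves", "subsystem", // existing
        "bdi", "power", "trace",                             // added
    };
    return kSkip.count(name) != 0;
}
```

A partition directory such as "sdd1" is not in the set, so it would still be passed on to the partition scan.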
comment:24 Changed 2 years ago by stuartm
- Priority changed from minor to trivial
There is a known bug with mythsystem in 0.22/trunk: multiple concurrent processes started with mythsystem share the same pid file, meaning they aren't cleaned up properly when complete. This is probably related.
comment:25 Changed 2 years ago by Bill Meek <llibkeem@…>
I added VERBOSE lines to print out the child PIDs in myth_system and found that they didn't match those of the defunct processes. So I did the same [udevinfo->pid()] in mediamonitor-unix.cpp right after udevinfo->start and got an exact match! The usleep() in the following has eliminated the defuncts for me. The additional tests below that cut down the number of ?unrequired? attempts.
Index: mediamonitor-unix.cpp
===================================================================
--- mediamonitor-unix.cpp (revision 22889)
+++ mediamonitor-unix.cpp (working copy)
@@ -229,6 +229,8 @@
args << sysfs;
udevinfo->start("udevinfo", args);
+ usleep(100000);
+
if (!udevinfo->waitForStarted(2000 /*ms*/))
{
VERBOSE(VB_MEDIA, msg + ", Error - udevinfo failed to start!");
@@ -553,7 +555,10 @@
// skip some sysfs dirs that are _not_ sub-partitions
if (*pit == "device" || *pit == "holders" || *pit == "queue"
- || *pit == "slaves" || *pit == "subsystem")
+ || *pit == "slaves" || *pit == "subsystem"
+ || *pit == "bdi" || *pit == "power"
+ || *pit == "trace")
+
continue;
found_partitions |= FindPartitions(
Of course those affected could just add a shell script something like this as /sbin/udevinfo:
#!/bin/sh
# If you already have udevinfo, you don't want this script!!!
UDEVADM=/sbin/udevadm
if [ ! -e "$UDEVADM" ]; then
    echo "Strange, you don't have $UDEVADM either, bye!"
    exit 1
fi
# Forward the first four arguments to udevadm info and capture all output.
RESULT=`"$UDEVADM" info $1 $2 $3 $4 2>&1`
RETURN_CODE=$?
if [ "$RETURN_CODE" = 0 ]; then
    echo "$RESULT"
else
    echo "device not found in database"
fi
exit $RETURN_CODE
comment:26 Changed 2 years ago by Bill Meek <llibkeem@…>
Update:
This link http://bugreports.qt.nokia.com/browse/QTBUG-5990 seems to address the defunct/zombie problem.
Also, http://qt.nokia.com/doc/4.5/qprocess.html#starts speaks to starting a process that is still running. The log shows all my udevinfo->start()s (up to 16) were done in a 122msec window. I saw a log from another user with 5 removable devices do it in 112msec.
Also, the echo "device not found in database" above should have: 1>&2 appended in order to emulate udevinfo.
comment:27 Changed 2 years ago by mdean
- Status changed from new to closed
- Resolution set to invalid
The problem is a bug in Qt ( http://bugreports.qt.nokia.com/browse/QTBUG-5990 ). When #6137 is fixed, it will prevent our seeing the symptoms of this Qt bug, even on broken Qt versions. Until #6137 is fixed, users may use workarounds mentioned above or keep an eye on QTBUG-5990.
Thanks to Bill Meek for tracking down the Qt bug and to Bill Meek and Josh Winters and all the others for all the debugging help.

Please provide log files from mythwelcome and mythfrontend, otherwise it is impossible for us to diagnose.