Opened 7 years ago
Closed 7 years ago
Last modified 7 years ago
#13207 closed Bug Report - General (fixed)
Failed recordings due to Active EIT Scanning on satellite adapters using DiSEqC
Reported by: | Owned by: | gigem | |
---|---|---|---|
Priority: | minor | Milestone: | 29.2 |
Component: | MythTV - General | Version: | v29.0 |
Severity: | medium | Keywords: | EIT, DiSEqC, recording |
Cc: | Ticket locked: | no |
Description
This is a bug I have been facing since the transition from cardinput to capturecard in v0.28. I have first downgraded to v0.27 to get around it, but when I finally upgraded to v29 I have started investigating the problem.
As per release note instruction of v0.28 “Users of LNB/DiSEqC have created multiple DVB cards for the same adapter/frontend to keep LNB/DiSEqC working...” I now have a set-up as follows (only relevant adapter shown):
/dev/dvb/adapter0/frontend0 in capturecard:
cardid, parentid, dvb_diseqc_type, diseqcid, sourceid, displayname 1, 0, NULL, 1, 7, Astra 19.E 5, 0, NULL, 1, 3, Astra 28.2E 6, 0, NULL, 1, 4, Hotbird 13.0E 7, 1, 0, 1, 7, Astra 19.E 10, 5, 0, 1, 3, Astra 28.2E 11, 6, 0, 1, 4, Hotbird 13.0E
For all entries, Signal timeout is 7000, Tuning timeout is 10000, Wait for SEQ start header is active, Open DVB card on demand is active.
diseqc_config and diseqc_tree are correctly configured accordingly.
What I observe is that, when “Use DVB card for active EIT scan” is activated on /dev/dvb/adapter0/frontend0 sources, most recordings on /dev/dvb/adapter0/frontend0 fail. The log of a failed recording is attached.
When disabling Active EIT Scan on /dev/dvb/adapter0/frontend0 sources, recordings are consistently successful.
Possibly, this bug only appears in conjunction with slow database access. Since the upgrade from v0.27->v29, I noticed in mythweb that database use by mythtv is very slow. I upgraded from mysql 5.7 to mariadb 10.2, which was an improvement, but it’s still slow. A list with database table sizes is attached for information (I got many adapters and many channels).
I’m ready to support further analysis of this bug by running test cases.
Attachments (28)
Change History (92)
Changed 7 years ago by
Attachment: | mythconverg.pdf added |
---|
comment:1 Changed 7 years ago by
Owner: | set to gigem |
---|---|
Status: | new → accepted |
Thank you for the report. This has been a suspected issue for a while. Unfortunately, I am unable to test it and diseqc users able to help debug it have been hard to find. Are you willing and able to test patches? I suspect there is an inccorect conflict detection between diseqc cards somewhere or there is another issue with the diseqc control. I will look for the former. In the meantime, can you leave recording and EIT scanning enabled on one of the diseqc options and disable everything else on the other diseqc options. Even better would be to temporarily delete the other diseqc cards. When that is done, do recordings still fail? That might help to isolate the problem.
comment:2 Changed 7 years ago by
I have today set up on another computer a test configuration with mythtv installed from source, and using one adapter with a diseqc switch for four satellite positions. It does show the same behaviour, so I believe I can support debugging through applying patches.
So that now it's confirmed that my test set-up shows the same behaviour, I'll delete all adapters, and just create one with diseqc position 1. I'll report on this tomorrow.
comment:3 Changed 7 years ago by
Ran a test all day with only 1 source connected, to the first instance of /dev/dvb/adapter0 / diseqc port 1, with EIT running on it. This resulted in consistently successful recordings. I guess that with this there's no use to do the earlier proposed test with the other adapter instances deleted instead of not connected to a source?
Waiting for further instructions or patches! :-)
comment:4 Changed 7 years ago by
Dang. That means I probably can't simulate the problem with clear QAM on my DVB card. Can you please generate a new mythbackend log showing the problem with logging set to -v general,record,eit,channel?
Changed 7 years ago by
Attachment: | mythbackend.2.log added |
---|
Attachment to comment 5, with 2 failed and 2 successful recordings
comment:5 Changed 7 years ago by
Ok, done. Reconnected the four sources, with EIT activated on all of them. The attached log shows failed recordings for the programmes at 19:55 and 20:15, and successful recordings at 19:30 and 20:30. It is complete from initialisation of mythbackend to its closure.
comment:7 Changed 7 years ago by
Do you have your DVB cards set to "Open DVB card on demand"? If not, please turn it on and re-test. If it is already on, please let me know so I can prepare a patch with more debug logging.
Changed 7 years ago by
Attachment: | 13207-debug1.patch added |
---|
comment:10 Changed 7 years ago by
Please run again with the new, 13207-debug1.patch and be sure to use -v general,record,eit,channel. Note that this won't fix anything. It will hoepfully shed a little more light on what is happening.
comment:11 Changed 7 years ago by
Thanks a lot! I applied the patch and ran as requested. Attached log shows successful recording at 19:30, and failed recording at 20:15.
Changed 7 years ago by
Attachment: | mythbackend.log.gz added |
---|
With patch 13207 applied, successful 19:30, failed 20:15
Changed 7 years ago by
Attachment: | 13207-debug2.patch added |
---|
comment:12 Changed 7 years ago by
I have another patch with a little more debugging. Again, this won't fix anything.
Please also run the following query. I'd like to see your inputgroup table.
MariaDB [mythconverg]> select * from inputgroup order by inputgroupname, cardinputid, inputgroupid ;
Finally, please run the following query too. This might actually fix things, but I'm not sure.
MariaDB [mythconverg]> update capturecard set dvb_eitscan = 0 where parentid != 0 ;
comment:13 Changed 7 years ago by
Mythbackend log with patch 2 is attached, with failed recording at 22:20 and successful record at 23:30.
Here's the queries:
MariaDB [mythconverg]> select * from inputgroup order by inputgroupname, cardinputid, inputgroupid ; +-------------+--------------+-------------------------------------+ | cardinputid | inputgroupid | inputgroupname | +-------------+--------------+-------------------------------------+ | 5 | 1 | agrajag|/dev/dvb/adapter0/frontend0 | | 6 | 1 | agrajag|/dev/dvb/adapter0/frontend0 | | 7 | 1 | agrajag|/dev/dvb/adapter0/frontend0 | | 8 | 1 | agrajag|/dev/dvb/adapter0/frontend0 | | 9 | 1 | agrajag|/dev/dvb/adapter0/frontend0 | +-------------+--------------+-------------------------------------+ 5 rows in set (0.00 sec)
Before applying your second query, I checked the content of capturecard:
MariaDB [mythconverg]> SELECT cardid, parentid, videodevice, diseqcid, dvb_eitscan, sourceid, displayname FROM `capturecard` ; +--------+----------+-----------------------------+----------+-------------+----------+-------------+ | cardid | parentid | videodevice | diseqcid | dvb_eitscan | sourceid | displayname | +--------+----------+-----------------------------+----------+-------------+----------+-------------+ | 9 | 0 | /dev/dvb/adapter0/frontend0 | 5 | 1 | 4 | H13.0E | | 8 | 0 | /dev/dvb/adapter0/frontend0 | 5 | 1 | 3 | A28.2E | | 7 | 0 | /dev/dvb/adapter0/frontend0 | 5 | 1 | 0 | | | 6 | 0 | /dev/dvb/adapter0/frontend0 | 5 | 1 | 2 | A23.5E | | 5 | 0 | /dev/dvb/adapter0/frontend0 | 5 | 1 | 1 | A19.2E | +--------+----------+-----------------------------+----------+-------------+----------+-------------+ 5 rows in set (0.00 sec)
All parentids are zero, so the query would not do anything.
The only "anomaly" is cardid 7, I created one instance too much on the test machine. As this is not the same as on my production machine, that also shows the same behaviour, I presume this is not an element in the bug.
Changed 7 years ago by
Attachment: | mythbackend.3.log added |
---|
Log with patch 2, failed at 22:20, successful at 23:30
comment:15 Changed 7 years ago by
I believe I know what's going on at a high level. I don't yet know how it was handled previously nor how to handle it now.
comment:16 Changed 7 years ago by
I checked with a former developer who is more familiar than I am with this part of the code. Unfortunately, the way this issue was handled previously can no longer be done. I'm going to have to add some new code to handle it. I'll try to get that done in the next week.
comment:17 Changed 7 years ago by
Any progress on this topic, or anything I can do to provide additional test support?
comment:18 Changed 7 years ago by
I think I know what to change and where to change it. I just don't know exactly how yet due to complicated locking. I am waiting on another developer more familiar with that part of the code to help me with it. It hopefully won't be much longer before I can give you something to test.
Changed 7 years ago by
Attachment: | StopEIT.patch added |
---|
If a recording is pending, stop the EIT scanner. (possible fix?)
comment:19 Changed 7 years ago by
Can you try the StopEIT.patch? If I understand the situation, then this might fix the issue.
comment:20 Changed 7 years ago by
Thanks, but unfortunately I get a compilation error (on latest fixes/29):
tv_rec.cpp: In member function ‘void TVRec::HandlePendingRecordings()’: tv_rec.cpp:1584:72: error: too few arguments to function ‘int eit_start_rand(uint, int)’ eitCrawlIdleStart + 30 + eit_start_rand(eitTransportTimeout)); ^ tv_rec.cpp:1277:12: note: declared here static int eit_start_rand(uint inputid, int eitTransportTimeout) ^ Makefile:13714: recipe for target 'tv_rec.o' failed make[2]: *** [tv_rec.o] Error 1 make[2]: *** Waiting for unfinished jobs.... In file included from ../../external/FFmpeg/libavcodec/mpegvideo.h:42:0, from recorders/dtvrecorder.cpp:33: ../../external/FFmpeg/libavcodec/internal.h: In function ‘float ff_exp2fi(int)’: ../../external/FFmpeg/libavcodec/internal.h:273:30: warning: suggest parentheses around ‘+’ inside ‘<<’ [-Wparentheses] return av_int2float(x+127 << 23); ^ make[2]: Leaving directory '/home/jmw/mythtv-testing/build/mythtv/mythtv/libs/libmythtv' Makefile:263: recipe for target 'sub-libmythtv-make_first' failed make[1]: *** [sub-libmythtv-make_first] Error 2 make[1]: Leaving directory '/home/jmw/mythtv-testing/build/mythtv/mythtv/libs' Makefile:67: recipe for target 'libs' failed make: *** [libs] Error 2
comment:21 Changed 7 years ago by
Adding "inputid" as a first argument to eit_start_rand in tv_rec.cpp on line 1584 results in successful compilation. I'll test that version overnight.
Changed 7 years ago by
Attachment: | mythbackend.4.log added |
---|
mythbackend.log (StopEIT.patch applied to MASTER): successful @16:05, failed @ 16:30/17:00
comment:23 Changed 7 years ago by
I applied the patch against master and it applied smoothly.
No joy however for the bug fixing. Attached mythbackend.log (-v general,record,eit,channel) shows a successful recording at 16:05, and failed recordings at 16:30 and 17:00.
comment:24 Changed 7 years ago by
Can you please run with both StopEIT.patch and 13207-debug2.patch? The latter helps me track the state of the file descriptors.
Changed 7 years ago by
Attachment: | mythbackend.log.zip added |
---|
mythbackend.log (StopEIT.patch and 13207-debug2.patch applied to Master) failed recording at 18:30, successful at 18:29/19:00
comment:25 Changed 7 years ago by
The just uploaded log file does not seem to be a classical recording<>EIT conflict, but a recording<>recording conflict. This is something I have not yet seen or been able to reproduce, also not on my production system. I'll try to catch a classical case, unless you tell me this one is equally interesting.
comment:26 Changed 7 years ago by
Here's a proper one, with failed recording at 20:00 and successful at 19:00/19:30/20:30:20:45.
Changed 7 years ago by
Attachment: | 13207-debug3.patch added |
---|
comment:28 Changed 7 years ago by
Thanks for your patience with this issue. It's proving to be a very difficult one to figure out. Please try the new debug3 patch. It's a complete patch, so you should apply it on top of a clean version 29. It won't fix anything, but will hopefully help us identify the next layer of the problem.
comment:29 Changed 7 years ago by
Thanks for your continued investigations! ;-) Here it is. Failed at 21:00 and 21:30, successful at 22:00.
Changed 7 years ago by
Attachment: | mythbackend.5.log added |
---|
with patch 3 applied, failed 21:00/21:30, successful 22:00
Changed 7 years ago by
Attachment: | 13207-debug4.patch added |
---|
comment:30 Changed 7 years ago by
Please try the debug4 patch. This one might actually improve things. It's by no means finished, though. There are still race condition and locking issues to resolve.
comment:31 Changed 7 years ago by
Ok, will do. Shall I keep patch 3, or install from a clean baseline?
comment:32 Changed 7 years ago by
No, install this over a clean baseline. Also, I've convinced myself that one part of the new patch isn't needed. If you have the chance to test the debug4 patch tonight, the extra stuff won't hurt. Otherwise, I'll prepare a new patch in a little while for you to use tomorrow.
Changed 7 years ago by
Attachment: | 13207-debug5.patch added |
---|
comment:33 Changed 7 years ago by
The debug5 patch is ready. Use it if you haven't already tried debug4.
comment:34 Changed 7 years ago by
I ran debug4 overnight. 6 successful recordings, and one failed. But the failed one does not show double as it normally does. Log attached. Failed recording tracing starts at 00:09:30.
Changed 7 years ago by
Attachment: | mythbackend2.log added |
---|
With patch 4 applied, failed recording at 00:09:30.
comment:35 Changed 7 years ago by
What was your recording schedule supposed to be and what was the resulting status for the recording at 00:09:30? Also, what is your recording pre-roll set to? There was a record pending annoucnement for input 8 at 00:09:30 and then one for input 5 at 00:09:34. That looks very suspicious.
comment:36 Changed 7 years ago by
Unfortunately I don't know what the recording schedule was: I use a series of frequent international keywords like "sport", "club" or "docu" to generate frequent random recordings from random sources. I recognize this may have strange effects in case of temporary encryption, no transmission, or channel reconfigurations. So this may have been such a case, especially since it tried to record "currently no program"?
There is no file for the recording at 00:09:30. The recording is set to start on time (all default values).
I found there was a more classic failure at 4:43, see attached log. Looks like overall there's more successful recordings though (about 10 for 1-2 failures), so there may indeed be progress. :-)
If you want I can record with manually set recordings, but then I'd need you to define the cases which I should test.
Changed 7 years ago by
Attachment: | mythbackend.log.2.gz added |
---|
comment:37 Changed 7 years ago by
A dump of the oldrecorded table or screen shot of the Previously Recorded screen after the fact would probably suffice. As for improvement, yes, I saw some cases that previously would have resulted in failed recordings.
Changed 7 years ago by
Attachment: | oldrecorded.csv added |
---|
comment:38 Changed 7 years ago by
A full night of successful random recordings (11). :-) Log available on demand. I need to interrupt the testing for 2 days now, sorry!
comment:39 Changed 7 years ago by
That's good to hear. Yes, please post the latest log so I can make sure it's working as intended.
Changed 7 years ago by
Attachment: | mythbackend.log.3.gz added |
---|
Changed 7 years ago by
Attachment: | mythbackend.log.4.gz added |
---|
comment:40 Changed 7 years ago by
I confirm better statistics. 24 hours of random recordings. 20 recordings, of which only 1 failed (CBBC at 21:30).
comment:41 Changed 7 years ago by
Unfortunately, that failure is still related to the problem at hand. I know what's happening, though. I just need to decide on the best fix.
comment:42 Changed 7 years ago by
Cool. Then I guess I wait for your next patch before I start testing again.
Changed 7 years ago by
Attachment: | 13207-debug6.patch added |
---|
comment:43 Changed 7 years ago by
Please try the new, debug6 patch. Note this might actually be a regression from debug5. I'm trying to fix the issue in the proper way. Also, note a different, recent change is needed, so please make sure you are using the very latest fixes/29 version.
comment:44 Changed 7 years ago by
Running since yesterday evening. 18 successful recordings so far. I'll send the log as soon as (or if?) there's a failed recording.
comment:45 Changed 7 years ago by
That's very encouraging. If you can, please go ahead and upload the logs even without a failure. I can check to see if the cases I know about are being handled as intended.
Changed 7 years ago by
Attachment: | mythbackend.log.5.gz added |
---|
comment:46 Changed 7 years ago by
34 successful recordings in a row, and counting! :-) Is there any corner case or scenario I should try to trigger?
comment:47 Changed 7 years ago by
No, just let it run for now. I've only had a brief chance to look at your latest log and I didn't see what I was hoping to see, but I could have easily missed it. I still need to look at it more closely.
Changed 7 years ago by
Attachment: | mythbackend.log.6.gz added |
---|
Log of the next 24h, maybe you'll find what you're looking for in here. Meanwhile, I continue recording.
Changed 7 years ago by
Attachment: | 13207-debug7.patch added |
---|
comment:48 Changed 7 years ago by
I had a chance to review the last log. It appears to be working as intended except for one little anomaly. I need to check with another developer about that, but I don't think it's anything serious. Is all of your guide data getting updated as expected? All in all, this is looking good. I just put up a new, debug7 patch. It's a slight refinement of the debug6 patch. Please apply it in place of debug6 when it's convenient. Let's then let it run for a few days.
comment:49 Changed 7 years ago by
Patch 7 is running. Guide data completeness is hard to assess because it's just a huge raw channel list I'm using on my test system, so it's logical that not all channels have data and that not all channels are visited regularly. I see that there have been updates on all satellites, I see there's been recent updates (now/next inputs), but I also see there's a few channels where I would expect data but do not see any for now.
Changed 7 years ago by
Attachment: | 13207-debug8.patch added |
---|
comment:51 Changed 7 years ago by
I know what's happening, but this fix is potentially trickier. I've asked one of hte other developers for his recommendation, but haven't heard back yet. In the meantime, the debug8 patch has a heavy-handed attempt.
Changed 7 years ago by
Attachment: | 13207-debug9.patch added |
---|
comment:53 Changed 7 years ago by
I tried a more direct path to killing the EIT scan in debug8. It didn't work. debug9 reverts that part and also uses the less direct path to try to fix the failure that occurred with debug6/7.
comment:54 Changed 7 years ago by
Debug 9 has been running for 72 hours, with 60 successful recordings, no failed ones! Looks to me like you managed to address the issue! Or do you still have cases you know are not covered? I'll attach the log for you to check the proper functioning of your patch, if you want to.
comment:56 Changed 7 years ago by
Status: | accepted → assigned |
---|
I got the log file. Thanks. I'll review it later. No, I don't currently know of any other cases that need attention. I'd like John Poet to review the changes one last time before committing them, but his availabiltiy is limited for the next couple of weeks. If you're okay with it, just keep using debug9 until John gets back or you notice a new problem.
comment:57 Changed 7 years ago by
200 successful random recordings in a row, no single failure. I think we can conclude that the patch works! Congratulations and thanks for your perseverance! Don't hesitate to ask me if you need testing for another bug, it'll be my pleasure.
comment:59 Changed 7 years ago by
Yes, soon, quite possibly today. One of the other deveopers was supposed to review the last patch this weekend, but didn't, so I'm going to commit it as without the debug log messages.
comment:60 Changed 7 years ago by
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
comment:62 Changed 7 years ago by
Milestone: | needs_triage → 29.2 |
---|
Overview of DB size.