Opened 7 years ago
Closed 7 years ago
Last modified 7 years ago
#13181 closed Bug Report - General (Fixed)
ExternalRecorder fails with "resource temporarily unavailable"
Reported by: | Owned by: | jpoet | |
---|---|---|---|
Priority: | minor | Milestone: | 29.1 |
Component: | MythTV - General | Version: | Master Head |
Severity: | medium | Keywords: | |
Cc: | Ticket locked: | no |
Description
I've been working on an external recorder for a while. It works most of the time, but recently i've been getting new errors.
First, the sequence of calls for a successful play look like this: StartStreaming? StopStreaming? StartStreaming?
There's only a few milliseconds between the StopStreaming? and the 2nd StartStreaming?, so my code allows for it and that seems to work just fine... The StopStreaming? in the logs below is probably a red herring.
My problem below is that I am getting a "Resource temporarily unavailable (11)" when mythtv is trying to read from the stdin (the stdout of my program with the video stream)
This seems to happen from this call: https://github.com/MythTV/mythtv/blob/master/mythtv/libs/libmythtv/recorders/ExternalStreamHandler.cpp#L125
The STDIN stream is configured to be non blocking, so the errno 11 isn't unexpected: https://github.com/MythTV/mythtv/blob/master/mythtv/libs/libmythtv/recorders/ExternalStreamHandler.cpp#L343
I am not sure what to change to keep this from happening.... it fails about 50% of the time, which is annoying.
I am looking for advice on a fix, or how to troubleshoot. I am running 0.29 on Ubuntu 16.04.
Thanks! Marc
Trimmed logs from mythbackend:
Nov 21 21:23:30 mythbackend: mythbackend[1286]: I TVRecEvent recorders/ExternalStreamHandler.cpp:1077 (ProcessCommand) ExternalRec(/home/recorder1/record.sh): ProcessCommand('Version?') = 'OK:1.0' Nov 21 21:23:30 mythbackend: mythbackend[1286]: W TVRecEvent recorders/ExternalStreamHandler.cpp:1066 (ProcessCommand) ExternalRec(/home/recorder1/record.sh): External Recorder did not respond to 'StopStreaming' Nov 21 21:23:30 mythbackend: mythbackend[1286]: I TVRecEvent recorders/ExternalStreamHandler.cpp:1077 (ProcessCommand) ExternalRec(/home/recorder1/record.sh): ProcessCommand('StopStreaming') = 'OK:Stopped' Nov 21 21:23:30 mythbackend: mythbackend[1286]: E TVRecEvent recorders/ExternalStreamHandler.cpp:130 (Read) Failed to read from External Recorder: #012#011#011#011eno: Resource temporarily unavailable (11) Nov 21 21:23:30 everest mythbackend: mythbackend[1286]: E TVRecEvent recorders/ExternalStreamHandler.cpp:151 (GetStatus) ExternIO::GetStatus: already in error state: 'Failed to read from External Recorder: #012#011#011#011eno: Resource temporarily unavailable (11)' Nov 21 21:23:30 mythbackend: mythbackend[1286]: I TVRecEvent recorders/ExternalStreamHandler.cpp:1014 (StopStreaming) ExternalRec(/home/recorder1/record.sh): Streaming stopped Nov 21 21:23:30 mythbackend: mythbackend[1286]: E ExternSH recorders/ExternalStreamHandler.cpp:625 (run) ExternalRec(/home/recorder1/record.sh): Failed to read from External Recorder: Failed to read from External Recorder: #012#011#011#011eno: Resource temporarily unavailable (11)
I grabbed the latest V30 (master) and hacked up ExternalStreamHandler?.cpp with extra log messages. I replaced the Ready function to use select instead of poll and still had the same behavior. I then commented out this line:
- m_error = "Failed to read from External Recorder: " + ENO;
+ m_error = "MR: Failed to read from External Recorder: " + ENO; preventing the hard failure, and got a functioning video stream.
In the snippet below, FD 62 is the "stdout" with the transport stream. The "Checking for data" line comes from the Ready function: 1 indicates there's data to read. The empty line in the log is the from the commented out m_error... It's when we get the unexpected "erno 11".
We can see there's data before and after the error... I am not sure it's a problem with the Ready function, but with the 'read'
2017-11-23 15:07:35.517510 I ExternalRec(/home/mrawji/recorder/record.sh): ProcessCommand('StopStreaming') = 'OK:Stopped' 2017-11-23 15:07:35.517516 E ExternIO::Read '256' bytes, buffer size 312 2017-11-23 15:07:35.517690 E MR: Checking for data, return value was: '1', for FD '62' 2017-11-23 15:07:35.517693 E MR: Checking for data, return value was: '1', for FD '62' 2017-11-23 15:07:35.517700 E 2017-11-23 15:07:35.517709 E ExternIO::Read '256' bytes, buffer size 380 2017-11-23 15:07:35.517822 E MR: Checking for data, return value was: '1', for FD '62' 2017-11-23 15:07:35.517842 E ExternIO::Read '256' bytes, buffer size 260 2017-11-23 15:07:35.517879 E MR: Checking for data, return value was: '1', for FD '62' 2017-11-23 15:07:35.517901 E ExternIO::Read '256' bytes, buffer size 328
I am thinking that I could possibly add a counter on the failed reads and if we hit 2 (or n) in row, then fail... It also looks like it'll read "0 bytes" a few times despite the Ready function returning True. In the trace above, I see one 0 byte read, followed by the "erno 11". Then, everything keeps working again (9 microseconds after the error).
I am writing 256 bytes at a time from my python app... not sure if that means anything.
Attachments (1)
Change History (14)
comment:1 Changed 7 years ago by
comment:2 Changed 7 years ago by
Owner: | set to jpoet |
---|---|
Status: | new → accepted |
comment:3 Changed 7 years ago by
Hi Marc,
This ticket prompted me to realize I had some uncommitted code for the External Recorder code. I just committed it, so if you could give it a try I would appreciate it.
If these changes don't fix your problem, then I will work with you to get this figured out.
John
comment:4 Changed 7 years ago by
Looking at your diff... I have a patch I am testing right now. I'll let you know where i end up (preliminary checks looked good). Assuming my testing works tonight, i'll back out my change, put in your change and see if the behavior is better. Thankfully, it's pretty easy for me to reproduce
comment:5 Changed 7 years ago by
Hey John... Your patch looks helpful, but with another bug I hit... every now and again i'd get an error saying my script hadn't responded to a command, but i saw a response the logs.
I just rebuilt the recent HEAD with your recent commits and I still have the same failure I did before. I am testing a patch (2 active recordings going on now) that gracefully handles the EAGAIN error when trying to read from the stream. Looks like one of my 2 recordings would have failed, without my patch, so I am encouraged.
Marc
comment:6 Changed 7 years ago by
Potential workaround here. I've successfully tested with 2 recordings tonight (on the fixes/29 branch), but ported to master here...
Pull Request: https://github.com/MythTV/mythtv/pull/161
comment:7 Changed 7 years ago by
Forgot to mention. I am super open to feedback on the pull request / code etc... this the first time i've contributed to a project like this. Thanks! Marc
comment:8 Changed 7 years ago by
Reporter: | changed from mrawji@… to Marc Rawji <mrawji@…> |
---|
comment:9 Changed 7 years ago by
Hi Marc,
You patch looks pretty good. The only feedback I would give, is that pre-increment is more efficient than post-increment (e.g. ++i is better than i++).
I did change the messages and "tighten" it up a bit. I am attaching a slightly modified version of your patch. Please verify that it works for you and that I have not broken it somehow.
Thanks,
John
Changed 7 years ago by
Attachment: | ExternalRecorder-EAGAIN.patch added |
---|
comment:10 Changed 7 years ago by
I just tried out your patch, and seems to work just fine. Thank for cleaning it up. Much more succinct than my original one. I am happy to merge it if you are. Can we backport this fixes/29 as well?
One more request, if I can.
Occasionally, my external recorder needs more than 2.5 seconds to start (I don't know why). Can you increase the timeout for the "Version?" command to let my app more time to start?
- return ProcessCommand("Version?", 2500, result); + return ProcessCommand("Version?", 5000, result);
If you want to track this in another ticket, let me know and I can do that.
Thanks John! Marc
comment:11 Changed 7 years ago by
Resolution: | → Fixed |
---|---|
Status: | accepted → closed |
comment:13 Changed 7 years ago by
Milestone: | needs_triage → 29.1 |
---|---|
Version: | Unspecified → Master Head |
I was chatting with a colleague at work and his take, was "yeah, it happens. the stream will report it's ready via poll when it isn't quite"..
I'll try coding up a non-fatal error check and see if that improves it.