Opened 13 years ago

Closed 13 years ago

#9511 closed Bug Report (Fixed)

[patch] avformat decoder buffer underrun (DVB livetv)

Reported by: Jiri Fojtasek <jiri.fojtasek@…> Owned by: Janne Grunau
Priority: critical Milestone: unknown
Component: MythTV - Video Playback Version: Master Head
Severity: high Keywords:
Cc: Ticket locked: no

Description

Since i updated to 0.24 i experienced lots of "Video buffer failed many times" during livetv playback. Then i updated to trunk to and the same issues happening to me. This problem happened for me with 90% chance when livetv switching to new program. I using an old computer whish is combined FE/BE.

Long story short:

  • New program coming in to place and switchtoprogram is called
  • Switch to program took long time and after resuming playback ring buffer is drained
  • ringbuffer:read() return negative value because no new data is available
  • This tell avfdecoder than EOF happening and the decoder is closed. Then jumptoprogram is issued. this fail too because EOF is unrecoverable state to the avformat decoder without reopening played file or other hacks.
  • Then poof and playback is over.

I sending two patches:

  • Firts patch add "-EAGAIN" handling support in to mpegts decoder (it is supported in ffmpeg high level API), ringbuffer:read() during livetv playback return -EAGAIN when the buffer is drained (this force ffmpeg and then avformatdecoder deal with this issue itself), return -EIO when no livetv media is played.
  • Second patch optimizing decoder state handling during switch to program and fixing "Waited 100ms for decoder to pause" error messages. Simply, when we pause audio, video processing thread first, decoder will stop sooner because is not blocked by those threads.

To apply this new functionality for other playback contexts is required add timeout handling when av_read_frame return -EAGAIN (in nuppel the same in to required places) and return -EAGAIN from ringbuffer:read() when underrun happen. For livetv it is not required timeout handling because its endless stream.

I have few gigabytes of logs (-v all) and spent few weeks with debuging this issue. If you like i can grep and post parts of those logs here, but the changes are self explaining and simplest as posible.

Jiri

Attachments (6)

103-livetv_buffer_underrun.patch (1.8 KB) - added by Jiri Fojtasek <jiri.fojtasek@…> 13 years ago.
first patch
104-optimize_decoder_state_change.patch (845 bytes) - added by Jiri Fojtasek <jiri.fojtasek@…> 13 years ago.
second patch
decoder-lockup-livetv.diff (661 bytes) - added by markk 13 years ago.
103-livetv_buffer_underrun_v2.patch (4.5 KB) - added by Jiri Fojtasek <jiri.fojtasek@…> 13 years ago.
patch v2
103-livetv_buffer_underrun_v3.patch (1.7 KB) - added by Jiri Fojtasek <jiri.fojtasek@…> 13 years ago.
patch v3
103-livetv_buffer_underrun_v3a.patch (1.7 KB) - added by Jiri Fojtasek <jiri.fojtasek@…> 13 years ago.
v3a, fixed missing brackets, old file i picked from wrong location :oops:

Download all attachments as: .zip

Change History (21)

Changed 13 years ago by Jiri Fojtasek <jiri.fojtasek@…>

first patch

Changed 13 years ago by Jiri Fojtasek <jiri.fojtasek@…>

second patch

comment:1 Changed 13 years ago by markk

Pause/unpause change applied in b4f10394e5d668266810 - thanks for the patch.

comment:2 Changed 13 years ago by Jiri Fojtasek <jiri.fojtasek@…>

aditional info:

Forgot to mention, you can reproduce rungbuffer underrun by adding sleep(X); before Pause(); in switchtoprogram() function. X set to 2 or more seconds. This exactly show what may happen in a slow computer (in my case Sempron 2800+ 1.6ghz).

Jiri

comment:3 Changed 13 years ago by andrew@…

This is excellent, its fixed the Live TV bail I had every time a program crosses a boundary in the EPG. Been waiting for this fix since .22 or there abouts. I hope patch 1 gets applied also.

comment:4 Changed 13 years ago by Jiri Fojtasek <jiri.fojtasek@…>

Hold on, i working at version 2 of the underrun patch, it will have better inegration, with proper logging and protection for posible lockup in the decoder thread.

Jiri

Changed 13 years ago by markk

Attachment: decoder-lockup-livetv.diff added

comment:5 Changed 13 years ago by markk

Jiri - I've attached a simple (but not complete fix) for the decoder lockup. I've tested this with about 8 hours of livetv here today and it has worked very well (i.e. no program transition failures). I did not need the eagain patch - so I'm curious whether that is still needed or whether it is addressing a different issue.

comment:6 in reply to:  5 Changed 13 years ago by Jiri Fojtasek <jiri.fojtasek@…>

Replying to markk:

Jiri - I've attached a simple (but not complete fix) for the decoder lockup. I've tested this with about 8 hours of livetv here today and it has worked very well (i.e. no program transition failures). I did not need the eagain patch - so I'm curious whether that is still needed or whether it is addressing a different issue.

Mark,

Your change is related to one i have tried before with no luck to this problem.

When the avformatdecoder set eof its too late to anything. Any more calls to av_read_frame always return -5 (EIO) and the ffmpeg decoder will never retry read from the ringbuffer because his internal eof flag is set to 1. Its game over until playback crash with "buffers failed". The key log message always preceding switchtoprogram and associated decoder failure is:

2011-01-22 11:20:03.246 RingBuf(/var/spool/mythtv/livetv/9008_20110122111500.mpg): ReadPriv(..6436, normal): ReadBufAvail() == 0

Have you tried this pause HACK i posted to reproduce this bug ?

We can made a hack to set ffmpeg's eof flag back to 0 but i do not like hacks, and hacking this have beside issues than decoder returning incomplette frame (this still doing with v1 patch i posted, in v2 ill post soon its fixed) My change to mpegts is clean because eagain is supported in ffmpeg high level api.

Jiri

comment:7 Changed 13 years ago by Jiri Fojtasek <jiri.fojtasek@…>

The propposed solution in first patch i posted is only hack to hide decoder failure. Ignore this patch complettly. I have found properly hidden race condition between decoder, player, and ringbuffer threads, and have a solution for it. Ill post the patch and all details soon as posible after all test will pass.

Jiri

comment:8 Changed 13 years ago by Jiri Fojtasek <jiri.fojtasek@…>

The problem with decoder eof is cause by race condition between ringbuffer,decoder,and player thread. Here is part of log showing this race with my comments:

# Ringbuffer thread detecting end of file and invoking switchtoprogram
2011-01-20 23:50:01.484 RingBuf(/var/spool/mythtv/livetv/9008_20110120234837.mpg): safe_read(...@2549768, 1048576) -> 0
2011-01-20 23:50:01.485 RingBuf(/var/spool/mythtv/livetv/9008_20110120234837.mpg): @ end of read ahead loop
2011-01-20 23:50:01.485 RingBuf(/var/spool/mythtv/livetv/9008_20110120234837.mpg): ReadPriv(..32768, normal) -- copying data
2011-01-20 23:50:01.485 RingBuf(/var/spool/mythtv/livetv/9008_20110120234837.mpg): ReadPriv(..32768, normal) -- checksum 38247

# Decoder thread reading remained portion of data in ringbuffer and waiting in RingBuffer::WaitForAvail(). The race condition begin ...
2011-01-20 23:50:01.491 RingBuf(/var/spool/mythtv/livetv/9008_20110120234837.mpg): ReadPriv(..6136, normal) @2549768 -- begin

# Player thread jumping in to switch to program (sz=0KB so buffer is drained)
2011-01-20 23:50:01.509 RingBuf(/var/spool/mythtv/livetv/9008_20110120234837.mpg): IsReallyNearEnd() br(468KB) sz(0KB) vfl(28) frh(0) ne:1
2011-01-20 23:50:01.509 Player(0): SwitchToProgram - start

# Player thread stopping ringbuffer reading, and decoder thread still waiting in RingBuffer::WaitForAvail()
2011-01-20 23:50:01.618 Player(0), Warning: Waited 100ms for decoder to pause
2011-01-20 23:50:01.718 Player(0), Warning: Waited 100ms for decoder to pause
2011-01-20 23:50:01.818 Player(0), Warning: Waited 100ms for decoder to pause
2011-01-20 23:50:01.918 Player(0), Warning: Waited 100ms for decoder to pause
2011-01-20 23:50:02.022 Player(0), Warning: Waited 100ms for decoder to pause
2011-01-20 23:50:02.122 Player(0), Warning: Waited 100ms for decoder to pause
2011-01-20 23:50:02.222 Player(0), Warning: Waited 100ms for decoder to pause
2011-01-20 23:50:02.322 Player(0), Warning: Waited 100ms for decoder to pause
2011-01-20 23:50:02.423 Player(0), Warning: Waited 100ms for decoder to pause
2011-01-20 23:50:02.490 RingBuf(/var/spool/mythtv/livetv/9008_20110120234837.mpg): @ end of read ahead loop
2011-01-20 23:50:02.523 Player(0), Warning: Waited 100ms for decoder to pause
2011-01-20 23:50:02.623 Player(0), Warning: Waited 100ms for decoder to pause
2011-01-20 23:50:02.723 Player(0), Warning: Waited 100ms for decoder to pause

# Decoder thread waiting in RingBuffer::WaitForAvail() timeouted and entered eof state, decoder thread was finally stopped
# After this the decoder will not decode anything anymore ...
2011-01-20 23:50:02.740 RingBuf(/var/spool/mythtv/livetv/9008_20110120234837.mpg): ReadPriv(..6136, normal): ReadBufAvail() == 0
2011-01-20 23:50:02.748 RingBuf(/var/spool/mythtv/livetv/9008_20110120234837.mpg): @ end of read ahead loop

#Player thread opening next file, but decoder is dead
2011-01-20 23:50:02.778 FileRingBuf(/var/spool/mythtv/livetv/9008_20110120234837.mpg): OpenFile(/var/spool/mythtv/livetv/9008_20110120235000.mpg, 2000 ms)
2011-01-20 23:50:02.801 FileRingBuf(/var/spool/mythtv/livetv/9008_20110120235000.mpg): OpenFile() made 1 attempts in 23 ms
2011-01-20 23:50:02.801 Player(0): SwitchToProgram(void) discont: 0 newtype: 0 newid: 2 eof: 0
2011-01-20 23:50:02.801 RingBuf(/var/spool/mythtv/livetv/9008_20110120235000.mpg): safe_read(...@2549768, 1048576) -- begin
2011-01-20 23:50:02.923 RingBuf(/var/spool/mythtv/livetv/9008_20110120235000.mpg): safe_read(...@2549768, 1048576) -> 549336
2011-01-20 23:50:02.924 Player(0): SwitchToProgram - end
2011-01-20 23:50:02.928 RingBuf(/var/spool/mythtv/livetv/9008_20110120235000.mpg): @ end of read ahead loop
2011-01-20 23:50:02.934 RingBuf(/var/spool/mythtv/livetv/9008_20110120235000.mpg): safe_read(...@3099104, 1048576) -- begin
2011-01-20 23:50:02.971 Player(0): LiveTV forcing JumpTo 1

#Decoder thread has been set eof, so jumtoprogram trying switch file, but it will not work because decoder is dead
2011-01-20 23:50:02.982 Player(0): JumpToProgram - start
2011-01-20 23:50:03.044 Waiting on select..
2011-01-20 23:50:03.047 Waiting on select..
2011-01-20 23:50:03.054 RingBuf(/var/spool/mythtv/livetv/9008_20110120235000.mpg): safe_read(...@3099104, 1048576) -> 72944
2011-01-20 23:50:03.054 RingBuf(/var/spool/mythtv/livetv/9008_20110120235000.mpg): @ end of read ahead loop
2011-01-20 23:50:03.081 Waiting on select..
2011-01-20 23:50:03.083 FileRingBuf(/var/spool/mythtv/livetv/9008_20110120235000.mpg): OpenFile(/var/spool/mythtv/livetv/9008_20110120235000.mpg, 2000 ms)
2011-01-20 23:50:03.086 FileRingBuf(/var/spool/mythtv/livetv/9008_20110120235000.mpg): OpenFile() made 1 attempts in 3 ms
2011-01-20 23:50:03.086 RingBuf(/var/spool/mythtv/livetv/9008_20110120235000.mpg): safe_read(...@0, 1048576) -- begin
2011-01-20 23:50:03.127 Waiting on select..
2011-01-20 23:50:03.148 RingBuf(/var/spool/mythtv/livetv/9008_20110120235000.mpg): safe_read(...@0, 1048576) -> 622280
2011-01-20 23:50:03.148 RingBuf(/var/spool/mythtv/livetv/9008_20110120235000.mpg): @ end of read ahead loop
2011-01-20 23:50:03.153 RingBuf(/var/spool/mythtv/livetv/9008_20110120235000.mpg): safe_read(...@622280, 1048576) -- begin
2011-01-20 23:50:03.166 Player(0): JumpToProgram - end
2011-01-20 23:50:03.166 Player(0): Waiting for video buffers...

# Player start spaming that waiting for buffers but will not got em. This will do until playback die and 
# error message "Buffers failed ..." show
2011-01-20 23:50:03.269 Player(0): Waited 100ms for video buffers AAAAAAAAAAAAAAAAAAAAAAAAAAAaAaL

This is the most worse scenario, but sometimes decoder set eof before player stop decoder to drain ringbuffer. Then stopping decoder timeout messages will not popup, but the rest is the same.

The attached v2 patch changes:

  • Open file is called before player thread try stop decoder thread to ensure decoder will finish reading from ringbuffer and/or new playback file during trying to pause.
  • Ringbuffer logic has been adjusted to deal with this change, some dead code that trying to hide (not solve) this problem has been removed. Most notable change is than decoder will wait in RingBuffer::WaitForAvail?() until ringbuffer thread fill requested data.
  • This patch not changing other pause/unpause logic whish can be removed in future.

After this change remaining portion of the dead code can be removed, but its not mandatory in current state. After this issue will be fixed in trunk ill post batch of fixes targeting other major playback issues i have found during this investigation.

Also i will be pleasured if Mark Kendal will include in his commits references to ticket from where he picked code changes.

Jiri

Changed 13 years ago by Jiri Fojtasek <jiri.fojtasek@…>

patch v2

comment:9 Changed 13 years ago by Jiri Fojtasek <jiri.fojtasek@…>

forgot to mention,

I created this shell script to fill program data with selected interval to speed up debuging:

#!/bin/sh

STEP=2                  #Duration of program
CHANID="'9008'"         #Channel ID

echo "DELETE FROM program WHERE chanid=$CHANID" \
| mysql --host=localhost --user=mythtv --password=mythtv mythconverg

DATE=$(date +"%Y-%m-%d %H:")
COUNTER=0
while [  $COUNTER -lt $((60 - $STEP)) ]; do
    COUNTER=$(($COUNTER + $STEP))
    DATE1="'${DATE}${COUNTER}'"
    DATE2="'${DATE}$(($COUNTER + $STEP))'"
    NAME="'Test $COUNTER'"
    echo "INSERT INTO program (chanid, starttime, endtime, title, subtitle, description, category, category_type, airdate, stars, previouslyshown, title_pronounce, stereo, subtitled, hdtv, closecaptioned, partnumber, parttotal, seriesid, originalairdate, showtype, colorcode, syndicatedepisodenumber, programid, manualid, generic, listingsource, first, last, audioprop, subtitletypes, videoprop) VALUES \
    ($CHANID, $DATE1, $DATE2, $NAME, '', '...', '', 'tvshow', 0000, 0, 0, '', 0, 0, 0, 0, 0, 0, '', NULL, '', '', '', '', 0, 0, 1, 0, 1, '', '', '')" \
    | mysql --host=localhost --user=mythtv --password=mythtv mythconverg
done

comment:10 Changed 13 years ago by Jiri Fojtasek <jiri.fojtasek@…>

I have found that in my case ringbuffer is drained fater than filled. It causing also ocasional stuttering. This may happen only if playback is faster than recording. Maybe i have faulty hardware. Will check it later.

But if this hit other peoples this problem may be logged. I created v3 patch than log this problem when this issue happen, remove unneded waiting in RingBuffer::WaitForAvail?(). Its simplest solution to recover from switchto program crash ...

Jiri

Changed 13 years ago by Jiri Fojtasek <jiri.fojtasek@…>

patch v3

comment:11 Changed 13 years ago by markk

Jiri - Slightly confused. Are you using all 3 'versions' of the buffer underrun patch? or should we now be ignoring the first 2

comment:12 Changed 13 years ago by andrew@…

@Jiri, I have the same issue also with the stuttering, I put it down to playing 50hz video on a 60hz display, if I seek back a few seconds it goes away.

Your v2 patch worked well, will give v3 a go now.

@markk Im going to say ignore the first 2.

comment:13 Changed 13 years ago by Jiri Fojtasek <jiri.fojtasek@…>

Andrew, You got it, i playing 25hz(50hz with doublerate interlacer) at 60hz TV too. So this may be timing issue i mentioned. In 0.24 and up there was lots of changes to video timing, maybe for this i never got this problem before.

Mark, V1 trying to hide issues vith playback buffer underrun. V2 show than there is a race condition but trying to hide this issue too. V1 and V2 are too complicated and in fact it not loging this issue.

V3 is dirty hack to ffmpeg decoder but its properly logged/noted and also removing unneded timeout during switchto next or eof condition. I voting for V3 approach because is simplest and the issue is exactly logged.

This cause is closed for me. If buffer underrun happen, we are at safe side and playback will not crash. I going to check why the video going forward with 50/60hz setup. When i found and/or fix something valuable regarding this problem ill open new ticket.

Jiri

Changed 13 years ago by Jiri Fojtasek <jiri.fojtasek@…>

v3a, fixed missing brackets, old file i picked from wrong location :oops:

comment:14 Changed 13 years ago by Jiri Fojtasek <jiri.fojtasek@…>

I found a way to force my Philips lcd tv work at 50HZ (Fedora 14, Radeon X1300, DVI to HDMI reduction):

# output from command: cvt 1360 768 50
# 1360x768 49.89 Hz (CVT) hsync: 39.56 kHz; pclk: 69.00 MHz
# Modeline "1360x768_50.00"   69.00  1360 1416 1552 1744  768 771 781 793 -hsync +vsync
# Create new mode:
xrandr --newmode "1360x768_50.00"   69.00  1360 1416 1552 1744  768 771 781 793 -hsync +vsync
xrandr --addmode DVI-0 1360x768_50.00
xrandr -r 50
# output from glxgears:
# Running synchronized to the vertical refresh.  The framerate should be
# approximately the same as the monitor refresh rate.
# 250 frames in 5.0 seconds = 49.864 FPS

This created new mode, officially unsupported by this TV but it working so i do not care. The problem with underrun still happen so its definitely not 50/60hz playback issue. The problem with stuttering caused by 50hz/60hz shift is gone and playback is smooth. Its most notable in videos with moving text.

So the error message in patch should be changed to something else like this:

"buffer underrun detected, trying to recover ..."

Jiri

comment:15 Changed 13 years ago by markk

Resolution: Fixed
Status: newclosed

Jiri - I'm closing this ticket as fixed, though recognise that there are probably outstanding issues.

The RingBuffer? wait condition should have been resolved in c3e1879e0c284b3c2267 and I took a slightly different approach to the decoder eof problem in 679b668e3669a38bd08e

If you are still seeing issues with livetv hitting the end of the file during normal playback, please open another ticket as that is a separate issue that needs a proper root cause analysis.

Thanks for the time and effort taken in debugging the issue and producing the patches.

Mark

Note: See TracTickets for help on using tickets.