Opened 9 years ago

Closed 9 years ago

#9458 closed Patch - Bug Fix (Fixed)

[PATCH] posix_fadvice usage

Reported by: jiri.fojtasek@… Owned by: danielk
Priority: minor Milestone: 0.25
Component: MythTV - General Version: Master Head
Severity: medium Keywords:
Cc: Ticket locked: no

Description

Goal of this patch is fix issues introduced by latest changes to page cache management and to have page cache under our control much as posible.

threadedfilewriter fixes:

  • We need toss only writen portion of the data from the pagecache, since the same file may be used by ringbuffer (live tv play). This bug marked any posix_fadvice optimisation changes to the ringuffer useless because it doing only unnecessarily disk io (see ringbuffer fixes).
  • Changed order of syncer and writer thread termination in order to correctly toss remaining portion of unused page cache.

ringbuffer fixes:

  • POSIX_FADV_WILLNEED called after a seek with 1MB size is useless since read() may be called immediately, so this may cause only unnecessarily disk io. Linux OS have already its own readahead algorithm so lets tweak it by blockdev --setra
  • Marking whole portion of the file in pagecache as unused is useless since more frontends may read the file at same time and this will negate OS readahead effeciency (another unnecessarily disk io)
  • In main readahead loop mark as unused only portion of the pagecache we used in last read (more frontends may read the same file).
  • When the file is closed mark whole pagecache unused, since we do not have explicit control to what is filled by OS readahead. This may flush pagecace for other readers but this not happen often so this payoff is acceptable.

Jiri

Attachments (2)

101-posix_fadvise_fixes.patch (5.6 KB) - added by jiri.fojtasek@… 9 years ago.
the patch
100-posix_fadvise_fixes_v2.patch (6.4 KB) - added by Jiri Fojtasek <jiri.fojtasek@…> 9 years ago.
v2 patch

Download all attachments as: .zip

Change History (7)

Changed 9 years ago by jiri.fojtasek@…

the patch

comment:1 Changed 9 years ago by danielk

Owner: set to danielk
Status: newassigned

Jiri, I noticed that one of the fixes was signed by a "--setra". Is there a thread somewhere with longer explanations of each of the changes? The main thing I see here is that these advise calls appear to be interpreted globally rather than per file descriptor. i.e. if process A is reading file X on fd 2 at location 1024 and process B is writing to X fd 3 then when B issues a POSIX_FADV_WONTNEED for the data before location 2048, process A now needs to read from the disk rather than reading from the cache. This would be caused by the OS is not being smart enough to realize that even if process B doesn't plan to read the file soon, process A is reading the file with the SEQUENTIAL attribute and will soon read bytes 1025 through 2048.

comment:2 in reply to:  1 Changed 9 years ago by Jiri Fojtasek <jiri.fojtasek@…>

Replying to danielk:

Jiri, I noticed that one of the fixes was signed by a "--setra". Is there a thread somewhere with longer explanations of each of the changes? The main thing I see here is that these advise calls appear to be interpreted globally rather than per file descriptor. i.e. if process A is reading file X on fd 2 at location 1024 and process B is writing to X fd 3 then when B issues a POSIX_FADV_WONTNEED for the data before location 2048, process A now needs to read from the disk rather than reading from the cache. This would be caused by the OS is not being smart enough to realize that even if process B doesn't plan to read the file soon, process A is reading the file with the SEQUENTIAL attribute and will soon read bytes 1025 through 2048.


Page cache (or just file cache) are shared per physical file not per file descriptor so its global. Opening a file than have portion in the cache and calling DONTNEED will clear it from the cache.

If you writing to a file and continuously flushung content from the cache by calling DONTNEED (both parameters zero will flush whole cache for given file) and another reader will always read the content from the disc instead of cache read ahead algorithm will be negated because read ahead should happen imeediately after read call and continuously are flushed by writer. For this reason is smart to flush only content we really do not need for both reader and writer process. About this is this patch.

I started play with this before to fix problem than mythtv continuously filling file cache with cached streams and pushing out other more importand cached data and in even worse scenario causing program space pages swaped out with default vm.vfs_cache_pressure settings. Then i found than you played with this in the trunk and and only finely tuned the needs :)

I using mythtv at relative slow computer and starting any application after a hour of watching mythv was really slow ...

blockdev --setra is standart linux command with parameter to set number of sectors will be readahead from file by OS when its accessed (hdparm can do the same thing).

Before i started play with mythv code i played a lot with programs from this page: http://insights.oetiker.ch/linux/fadvise/ There are two programs. One is to flush a file from the cache and second show pages in cache used by the file. I used it also to test if my changes are corect at live tv stream and it show me for example need to swap order of killing threads in threadedfilewriter.

Jiri

comment:3 Changed 9 years ago by Jiri Fojtasek <jiri.fojtasek@…>

This is v2 of the patch. It fixing issues from previous version:

  • Internalreadpos in ringbuffer not always follow real position in the file, so it was replaced by seek() to always get correct position.

Improvements:

  • Leave portion of the file in pagecache because livetv playback may need it very soon. This will reduce disk io during livetv playback and decrease time required to open next file when player switching/jumping to next program. The size of may_need has been choosed as 5 seconds of 35mbit playback.

Jiri

Changed 9 years ago by Jiri Fojtasek <jiri.fojtasek@…>

v2 patch

comment:4 Changed 9 years ago by beirdo

Summary: [patch] posix_fadvice usage[PATCH] posix_fadvice usage

comment:5 Changed 9 years ago by danielk

Resolution: Fixed
Status: assignedclosed

Most of the issues here were dealt with by [e4c5909]. The rest would seem to be a bad idea for the same reason previous fadvise patches were a bad idea. First, linux interprets the DONTNEED calls globally, so if one application says it won't need the data and another says it will, the data can get thrown out of cache repeatedly. The other problem is that WILLNEED doesn't tell the OS to start reading the data in the background so a future read will execute more quickly, it actually blocks while reading in the entire amount of data that we're telling it we will need in the future. All this means the only really useful thing we can do with fadvise is to issue the SEQUENTIAL one, and there are a couple cases where it makes sense to do a small WILLNEED just to avoid starting playback and then pausing (it's better to just take a few milliseconds longer start delivering frames than to deliver a few and then pause for a few milliseconds.)

Note: See TracTickets for help on using tickets.