Opened 6 years ago
Closed 6 years ago
Last modified 6 years ago
#13315 closed Patch - Bug Fix (fixed)
Use fully-decoded URL paths when translating to local paths
Reported by: | ijc | Owned by: | Peter Bennett |
---|---|---|---|
Priority: | minor | Milestone: | 30.0 |
Component: | MythTV - General | Version: | Master Head |
Severity: | medium | Keywords: | |
Cc: | Ticket locked: | no |
Description
Please see https://github.com/MythTV/mythtv/pull/167. Not quite sure about the component, is this services API backend? I left as general. I set the version/milestone to master & 30, but it would be great to get this back ported to fixes/29 too please.
Quoting the PR description:
Previously it was not possible to access files with certain characters in them remotely (such as music tracks with quotes in the title) because the characters are URL %-escaped, e.g.:
ProcessRequest storagegroup.cpp:608 (FindFile) SG(Music): FindFile: Searching for '/Enslaved - Isa/01-Intro: %22Green Reflection%22.flac' ProcessRequest storagegroup.cpp:639 (FindFileDir) SG(Music): FindFileDir: Checking '/storage/music' for '/storage/music//Enslaved - Isa/01-Intro: %22Green Reflection%22.flac' ProcessRequest storagegroup.cpp:639 (FindFileDir) SG(Default): FindFileDir: Checking '/var/lib/mythtv/recordings' for '/var/lib/mythtv/recordings//Enslaved - Isa/01-Intro: %22Green Reflection%22.flac' ProcessRequest storagegroup.cpp:639 (FindFileDir) SG(): FindFileDir: Checking '/var/lib/mythtv/.mythtv/Banners' for '/var/lib/mythtv/.mythtv/Banners//Enslaved - Isa/01-Intro: %22Green Reflection%22.flac' ProcessRequest storagegroup.cpp:639 (FindFileDir) SG(): FindFileDir: Checking '/var/lib/mythtv/.mythtv/Coverart' for '/var/lib/mythtv/.mythtv/Coverart//Enslaved - Isa/01-Intro: %22Green Reflection%22.flac' ProcessRequest storagegroup.cpp:639 (FindFileDir) SG(): FindFileDir: Checking '/var/backups/mythtv' for '/var/backups/mythtv//Enslaved - Isa/01-Intro: %22Green Reflection%22.flac' ProcessRequest storagegroup.cpp:639 (FindFileDir) SG(): FindFileDir: Checking '/var/lib/mythtv/recordings' for '/var/lib/mythtv/recordings//Enslaved - Isa/01-Intro: %22Green Reflection%22.flac' ProcessRequest storagegroup.cpp:639 (FindFileDir) SG(): FindFileDir: Checking '/var/lib/mythtv/.mythtv/Fanart' for '/var/lib/mythtv/.mythtv/Fanart//Enslaved - Isa/01-Intro: %22Gree Reflection%22.flac' ProcessRequest storagegroup.cpp:639 (FindFileDir) SG(): FindFileDir: Checking '/storage/music' for '/storage/music//Enslaved - Isa/01-Intro: %22Green Reflection%22.flac' ProcessRequest storagegroup.cpp:639 (FindFileDir) SG(): FindFileDir: Checking '/var/lib/mythtv/.mythtv/MusicArt' for '/var/lib/mythtv/.mythtv/MusicArt//Enslaved - Isa/01-Intro: %22Green Reflection%22.flac' ProcessRequest storagegroup.cpp:639 (FindFileDir) SG(): FindFileDir: Checking '/storage/media' for '/storage/media//Enslaved - Isa/01-Intro: %22Green Reflection%22.flac' ProcessRequest storagegroup.cpp:639 (FindFileDir) SG(): FindFileDir: Checking '/var/lib/mythtv/.mythtv/Screenshots' for '/var/lib/mythtv/.mythtv/Screenshots//Enslaved - Isa/01-Intro: %22Green Reflection%22.flac' ProcessRequest storagegroup.cpp:639 (FindFileDir) SG(): FindFileDir: Checking '/var/lib/mythtv/.mythtv/Trailers' for '/var/lib/mythtv/.mythtv/Trailers//Enslaved - Isa/01-Intro: %22Green Reflection%22.flac' ProcessRequest storagegroup.cpp:639 (FindFileDir) SG(): FindFileDir: Checking '/var/lib/mythvideo' for '/var/lib/mythvideo//Enslaved - Isa/01-Intro: %22Green Reflection%22.flac' ProcessRequest storagegroup.cpp:622 (FindFile) SG(Music): FindFile: Unable to find '/Enslaved - Isa/01-Intro: %22Green Reflection%22.flac'! ProcessRequest mainserver.cpp:7968 (LocalFilePath) MainServer: ERROR: LocalFilePath unable to find local path for '/Enslaved - Isa/01-Intro: %22Green Reflection%22.flac'.
This is because the actual file on disk is not %-encoded, the actual filename
is 01-Intro: "Green Reflection".flac
. If the filename had been %-encoded,
then the resulting %'s would, I beleive, have themselves been encoded as %25 by
this point.
To fix this switch from url.toString()
to url.path(QUrl::FullyDecoded)
.
Note that:
url.toString()
does not acceptQUrl::FullyDecoded
. That produces the error message:QUrl: QUrl::FullyDecoded is not permitted when reconstructing the full URL
- at this point in the code the URL only contains a path element and a possible
fragment (handled separately in the code). The is no
scheme://
or hostname etc). I think the only reason it is aQUrl
at this point and not aQString
is the fragment handling. - a path (not a full URL) is what
StorageGroup::FindFile
and friends expect, they takeQString
notQUrl
.
Also add QUrl::FullyDecoded
to some existing uses of url.path()
in these
code paths.
There seems to be two very similar copies of this code (libmythprotoserver and in mythbackend/mainserver directly) so I have adjusted both. It's not entirely clear to me where they are each used from, but I have tested (on a backport to fixes/29):
- recording playback (both preexisting recordings and live TV), the filenames here are autogenerated and are not expected to contain any quotable characters AFAIK.
- mythvideo (filenames with and without quote characters).
- mythmusic (filenames with and without quote characters).
- old gallery plugin (only filenames without quotes).
Signed-off-by: Ian Campbell <ijc@…>
Attachments (2)
Change History (20)
comment:1 Changed 6 years ago by
comment:2 Changed 6 years ago by
I posted on the PR in September but forgot to repeat here:
This is working for me, but I'm now also playing with a path which ditches QUrl altogether and just uses a QString, we aren't treating these things as URLs in any meaningful way here and just end up undoing all the things which QUrl does for us.
I've been running with this for a few days now and everything seems ok AFAICT. I've pushed it here now.
It's now been a few months not days and all remains fine AFAICT.
comment:3 Changed 6 years ago by
Owner: | set to Peter Bennett |
---|---|
Status: | new → assigned |
comment:4 Changed 6 years ago by
I went to squash and realised that I had only applied the change to remove QUrl in favour of QString from one of the two copies of this code. I will update ASAP with a version that does both halves.
comment:5 Changed 6 years ago by
I've attached an updated (and squashed) patch which handles both sets of code. I also updated the PR with the same thing.
comment:6 Changed 6 years ago by
There is a problem here. The patch fixes some things and breaks others.
I tested the following video file name :
ticker_1080i%25.mkv
This plays correctly on a local frontend as well as a remote frontend. However with your patch the playback fails in both cases with the backend reporting it cannot find ticker_1080i%.mkv
Downloaded files frequently have url encoded file names and people may have such files. These would fail with the new patch.
The patch does fix the case of failing to play a file with a quote (") in the name.
comment:7 Changed 6 years ago by
Hi Peter, I somehow missed the notification for your comment but I was coming here to report the same thing (or at least something similar) which is that handling of UTF-8 seems to be a bit off now too.
From what you are saying it seems like there is still some URL decoding going on somewhere i.e. a request for the filename ticker_1080i%25.mkv
seems to have been decoded into ticker_1080i%.mkv
somewhere along the line which is unexpected I think.
comment:8 Changed 6 years ago by
So I have added a file with a %25
in it so I have:
$ ls /var/lib/mythvideo/temp/ticker_1080i%25.mkv /var/lib/mythvideo/temp/ticker_1080i%25.mkv
Which is in the database as:
MariaDB [mythconverg]> select title,subtitle,filename from videometadata where filename like "%ticker%"; +-----------------+----------+--------------------------+ | title | subtitle | filename | +-----------------+----------+--------------------------+ | ticker 1080i%25 | | temp/ticker_1080i%25.mkv | +-----------------+----------+--------------------------+ 1 row in set (0.01 sec)
When playing (without my patches) I see:
mythfrontend[9987]: I CoreContext remotefile.cpp:518 (Exists) RemoteFile::Exists(): looking for remote file: myth://Videos@iranon/temp/ticker_1080i%25.mkv mythfrontend[9987]: I MythSocketThread(-1) mythsocket.cpp:758 (WriteStringListReal) MythSocket(2891300:36): write -> 36 55 QUERY_FILE_EXISTS[]:[]temp/ticker_1080i%.mkv[]:[]Videos mythfrontend[9987]: I MythSocketThread(-1) mythsocket.cpp:758 (WriteStringListReal) MythSocket(5222e40:76): write -> 76 191 ANN FileTransfer iranon 0 1 2000[]:[]/temp/ticker_1080i%.mkv[]:[]...
and
mythbackend[9874]: I MythSocketThread(96) mythsocket.cpp:959 (ReadStringListReal) MythSocket(2aced50:96): read <- 96 55 QUERY_FILE_EXISTS[]:[]temp/ticker_1080i%.mkv[]:[]Videos mythbackend[9874]: E ProcessRequest storagegroup.cpp:622 (FindFile) SG(Videos): FindFile: Unable to find 'temp/ticker_1080i%.mkv'! mythbackend[9874]: I MythSocketThread(182) mythsocket.cpp:959 (ReadStringListReal) MythSocket(2ad0fb0:182): read <- 182 191 ANN FileTransfer iranon 0 1 2000[]:[]/temp/ticker_1080i%.mkv[]:[... mythbackend[9874]: I ProcessRequest mainserver.cpp:7994 (LocalFilePath) MainServer: LocalFilePath(/temp/ticker_1080i%25.mkv 'ticker_1080i%.mkv'), found file through exhaustive search at '/var/lib/mythvideo//temp/ticker_1080i%25.mkv'
So, I think it looks like something somewhere (on the client side, between the log for RemoteFile::Exists()
and the one for WriteStringListReal
) is decoding the URL escaping in the name and then relying on the code I removed to reencode it so it can find the file, which is rather backwards (if anything it should be further encoding it to ticker_1080i%2525.mkv
so it can be decoded to ticker_1080i%25.mkv
).
I think the code which messes with the encoding is spread through libs/libmythbase/remotefile.{cpp,h}
.
I think the choices are to try fix the use of Qurl
in remotefile.cpp
(which must be there in this case because it actually is a URL, so we can't just excise it completely) or to roll back to my first attempt here which added fixes to the use of QUrl in mainserver.cpp
etc (e.g. ensuring the fragment and query were put back, etc).
I prefer the first choice (although it might take a bit longer to get right). I'm going to have a play right now.
comment:9 Changed 6 years ago by
OK, so I'm now fairly convinced that the issue is that myth is treating myth://Videos@iranon/temp/ticker_1080i%25.mkv
as a URL referring to the file temp/ticker_1080i%25.mkv
.
The correct URL to refer to that file would be myth://Videos@iranon/temp/ticker_1080i%2525.mkv
(i.e. the literal % has been encoded).
I think we've got a situation where two wrongs are making a (partially) right in that the incorrect URL can sometimes be accepted because the backend is reencoding the path.
I'm debating what to do here -- finding all the places which call into RemoteFile::*
and ensuring that they properly URL encode things correctly seems like a pretty large (if not impossible) task, but would be the correct thing to do I think.
Declaring that RemoteFile
URLs are not exactly the same as regular ones, wrt %encoding and the handling of query&fragment parts might be simpler but feels very ugly :-/
comment:10 Changed 6 years ago by
I agree that there shouldn't be two different flavors of url. Too confusing to both users and developers.
comment:11 Changed 6 years ago by
Most (all?) of the RemoteFile
methods seem to take some variant on QString url
and then reasonably promptly do a QUrl qurl(url)
to turn it into a structured URL.
I think it would make sense to push the use of QUrl
into the method arguments, this will then propagate up the call chain to whichever places are currently failing to properly encode which can then be fixed. It also seems like the right thing to do -- if these things are going to be treated as URLs we should probably use the right type for them.
One wrinkle is that some (all?) of these methods also handle an argument which is a local path and not a URL at all. I think that's OK, all it means is these become QUrl
's with a scheme of file
and this all comes out in the wash.
I'll give this a go, I expect it'll be a big patch and take a little while to do...
comment:12 Changed 6 years ago by
I'll give this a go, I expect it'll be a big patch and take a little while to do...
and indeed it is a pretty big patch (and I didn't even finish)...
However, in the process of looking at it I realised that we can get a large improvement in correctness by just having GenMythURL
use QUrl
internally but still return a QString
(i.e. qurl.toString()
) this ensures that everything in the returned stringified url is correctly escaped such that converting back to a QUrl
later on (e.g. in remotefile.cpp
) does the right thing.
The patch for this looks quite tractable and I'm testing it right now. I'm hopeful it might even be acceptable for the 30 release.
Once GenMythURL
uses QUrl
internally we can start to bubble that up the call chains (perhaps using MDEPRECATED
as needed to break off even smaller chunks at a time) and eventually end up using QUrl
everywhere (my initial mistake was to start from the remotefile.cpp
functions and trying to tease it out from the other end -- which was backwards and left me with a big pile of yarn...)
Changed 6 years ago by
Attachment: | 0001-Drop-internal-use-of-QUrl-in-FileTransfer.patch added |
---|
[PATCH 1/2] Drop internal use of QUrl in FileTransfer?
Changed 6 years ago by
Attachment: | 0002-Use-QUrl-internally-in-MythCoreContext-GenMythURL.patch added |
---|
[PATCH 2/2] Use QUrl internally in MythCoreContext::GenMythURL
comment:13 Changed 6 years ago by
I've refreshed the original patch and added a second (cumulative) patch. This seems to be work for me in all the test cases I tried before, plus Peter's case with the % encoded file name.
I'm still looking into more widespread internal use of QUrl, but I think this pair could go in now.
comment:14 Changed 6 years ago by
I tested files with quote (") and with %25
- Unpatched backend and frontend - file with quote fails on remote frontend
- Unpatched backend, patched frontend - file with quote fails on remote frontend
- Patched backend, unpatched remote frontend - file with %25 fails on remote frontend
- Patched backend and frontend - all works
This seems fine to me. Are you ready for me to commit it? There was some discussion on irc about adding some tests.
comment:15 Changed 6 years ago by
I think the testing discussion on IRC is a longer term thing (and will be quite involved).
IMHO this can go in whenever you are ready.
comment:16 Changed 6 years ago by
Resolution: | → Fixed |
---|---|
Status: | assigned → closed |
commit b18a1ef110500d1634891b4d3534682db61a2348 (HEAD -> master, origin/master)
Author: Ian Campbell <ijc@hellion.org.uk> Date: Sun Dec 9 18:34:40 2018 -0500 Fix handling of special characters in remote files Change the usage of QUrl so that file names are correctly escaped when accessing files on a remote backend, for example videos with a quote as part of the name. Fixes #13315 Signed-off-by: Peter Bennett <pbennett@mythtv.org>
New comment on PR:
Things which looked like fragments (bits after #) and queries (bits after ?) were being dropped, even though they were actually part of the filename, not the URL. In a few (but not all) places the fragment was being put back on, but the query never was. Handle this across the board, also be sure to decode any %escapes in those too.
This is working for me, but I'm now also playing with a path which ditches QUrl altogether and just uses a QString, we aren't treating these things as URLs in any meaningful way here and just end up undoing all the things which QUrl does for us.