Opened 9 years ago
Closed 5 years ago
#12298 closed Bug Report - General (fixed)
Sorting improperly assuming an Article
Reported by: | Owned by: | David Hampton | |
---|---|---|---|
Priority: | minor | Milestone: | 30.0 |
Component: | MythTV - General | Version: | 0.27-fixes |
Severity: | medium | Keywords: | |
Cc: | Ticket locked: | no |
Description
A new show started this season called "A to Z" (http://thetvdb.com/?tab=series&id=281588&lid=7)
In the watch recordings screen, it assumes "A" is an article, and not part of the actual title, and therefore this show is sorted as "to Z, A" - which can bury it in a large library of recordings.
Change History (8)
comment:2 Changed 6 years ago by
Owner: | set to David Hampton |
---|---|
Status: | new → assigned |
comment:3 Changed 6 years ago by
Owner: | changed from David Hampton to David Hampton |
---|
comment:4 Changed 5 years ago by
Milestone: | unknown → 30.0 |
---|
comment:5 Changed 5 years ago by
Replying to Gary Buhrmaster <gary.buhrmaster@…>:
Trying to add semantic analysis to the parser is going to be, um, interesting, to get right.
And internationalization is even more interesting (I have no idea if all languages delete the equivalent of "A", "An", and "The" in their sorting order). And then there are the special cases of titles like "Das Boot", which I am going to guess even in German should still include the "Das" ("The" in my very stale German), and "A Few Good Men" is typically under "A". I am going to guess that one will need to have some ability to override any default deletion of the "A", "An", and "The" (a new field which, if NULL, uses the default rules, and otherwise uses what the user has specified?).
comment:6 Changed 5 years ago by
musicbrainz has an entirely separate "sortname" field for this purpose/problem. Perhaps the answer is to have such a field, set it from API-provided data if an API supplies it, and otherwise set it using some heuristic, allowing it to be overridden?
I had a quick look, but it doesn't look like any of IMDB, TVDB, or my xmtlv-sdjson feed has anything like that field, which is a shame. Though with an overrideable heuristic and the possibility that APIs might one day provide this, perhaps the extra field still makes some sense.
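To make the "separate sort field with a heuristic fallback" idea concrete, here is a minimal Python sketch. It is not MythTV code: the `Recording` class, the `sort_title_override` field, and both function names are hypothetical, and the hard-coded English prefixes stand in for whatever default rule would actually be used.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Recording:
    title: str
    # Hypothetical field: a sort name supplied by the user or by a metadata
    # API. None plays the role of NULL and means "use the default heuristic".
    sort_title_override: Optional[str] = None


def default_sort_title(title: str) -> str:
    """Assumed default rule: drop a leading English article."""
    for prefix in ("A ", "An ", "The "):
        if title.startswith(prefix):
            return title[len(prefix):]
    return title


def effective_sort_title(rec: Recording) -> str:
    """Prefer the explicit override; otherwise fall back to the heuristic."""
    if rec.sort_title_override is not None:
        return rec.sort_title_override
    return default_sort_title(rec.title)
```

With this shape, `Recording("A to Z", sort_title_override="A to Z")` keeps its leading "A", while `Recording("The Wire")` falls back to the heuristic and sorts under "Wire".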
comment:7 Changed 5 years ago by
The changes that I'm planning to commit take the list of prefixes to delete from the existing language translation of the string '(A |An |The )' that is used in the 'Watch Recordings' section of MythTV.
My changes will extend the prefix manipulation into the rest of the user interface. If the translation of the prefix strings is empty then no changes are made. If the translation is not empty, there is a user settable flag indicating whether or not the user would like the prefixes deleted. If prefixes are to be deleted, there is a user settable list of strings to be exempt from prefix deletion. I think that covers all the cases.
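The decision flow described above can be sketched roughly as follows. This is a Python illustration only, not the code being committed: the function name, parameter names, and the "title starts with an exempt string" matching rule are all assumptions made for the example.

```python
import re
from typing import Iterable


def make_sort_title(
    title: str,
    prefix_pattern: str = "(A |An |The )",  # translated prefix string; may be empty
    strip_prefixes: bool = True,            # assumed user-settable flag
    exemptions: Iterable[str] = (),         # assumed user-settable exemption list
) -> str:
    """Return the string used for sorting; the displayed title is unchanged."""
    # An empty translation, or the user flag being off, means no changes.
    if not prefix_pattern or not strip_prefixes:
        return title
    # Assumption: a title is exempt if it starts with any exempt string.
    lowered = title.lower()
    if any(lowered.startswith(e.lower()) for e in exemptions):
        return title
    # re.match anchors at the start, so only a leading prefix is removed.
    match = re.match(prefix_pattern, title, flags=re.IGNORECASE)
    return title[match.end():] if match else title


titles = ["The Wire", "A to Z", "Das Boot"]
ordered = sorted(titles, key=lambda t: make_sort_title(t, exemptions=["A to Z"]))
```

Here `ordered` places "A to Z" under "A" because of the exemption, while "The Wire" is still sorted under "W".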
At one point I was storing the sortable versions of the strings in the database, per a suggestion from dekarl, but I backed out that code. Upgrading the database to populate the new sortable string columns takes quite a bit of time, which is ok, but then any change to the sort preferences would require another delay while repopulating the database with new sort strings. That delay would not be part of an upgrade and would probably not be welcomed by the user. Individual translations take on the order of 1 usec on my development system, and 11 usec on a rPi, which seems fast enough for now.
I would like to see better musicbrainz support at some point in the future, including parsing the sort tags. That would definitely require storing the sort strings in the database, but would also introduce the need to track which strings came from an external source like MB and which came from heuristics.
Typically "A", "An", and "The" should not be used for sorting (and I certainly would not want to see them used for sorting for most titles). Trying to add semantic analysis to the parser is going to be, um, interesting, to get right.