Opened 2 years ago

Closed 2 years ago

Last modified 21 months ago

#13159 closed Patch - Bug Fix (fixed)

Automatic lookup of metadata should prefer exact title match over partial title matches

Reported by: Steve Erlenborn <simon.sinister@…> Owned by: Peter Bennett
Priority: minor Milestone: 30.0
Component: MythTV - Mythmetadatalookup Version: Master Head
Severity: medium Keywords: metadata
Cc: piotr.oniszczuk@… Ticket locked: no

Description

The automatic lookup of metadata only works well when the title of the recording matches a single movie or TV series. When there are multiple matches, the current behavior is to give up. When there is an exact match on one of the titles, it should take precedence over partial matches. When there are multiple exact matches on title, then one of those should be selected.

This patch provides preference to the most recently released, exact title match.

Attachments (6)

mythtv-13159-PreferExactTitleMatchesForMetadata.patch (6.2 KB) - added by simon.sinister@… 2 years ago.
This patch provides better selection of metadata
PreferExactTitleMatchesForMetadata.examples (4.4 KB) - added by simon.sinister@… 2 years ago.
Examples of patch behavior from mythmetadatalookup.log
mythtv-13159-PreferExactTitleMatchesForMetadata2.patch (11.9 KB) - added by Steve Erlenborn 2 years ago.
Updated patch to handle some tricky lookup scenarios
mythtv-13159-PreferExactTitleMatchesForMetadata3.patch (12.1 KB) - added by Steve Erlenborn <simon.sinister@…> 2 years ago.
Eliminate Segmentation Fault when removing last item without artwork
mythtv-13159-PreferExactTitleMatchesForMetadata4.patch (15.3 KB) - added by Steve Erlenborn <simon.sinister@…> 2 years ago.
Updated patch to handle "Molly & Wors" and "Go West" scenarios
mythtv-13159-PreferExactTitleMatchesForMetadata5.patch (17.4 KB) - added by Steve Erlenborn <simon.sinister@…> 2 years ago.
Updating Popularity references in videoLookupInfo.h


Change History (13)

Changed 2 years ago by simon.sinister@…

This patch provides better selection of metadata

Changed 2 years ago by simon.sinister@…

Examples of patch behavior from mythmetadatalookup.log

comment:1 Changed 2 years ago by Peter Bennett

Component: MythTV - General → MythTV - Mythmetadatalookup
Owner: set to Peter Bennett
Status: new → assigned

See also #12277

comment:2 Changed 2 years ago by Stuart Auchterlonie

Milestone: needs_triage → 30.0

comment:3 Changed 2 years ago by Peter Bennett

Reporter: changed from simon.sinister@… to Steve Erlenborn <simon.sinister@…>

Changed 2 years ago by Steve Erlenborn

Updated patch to handle some tricky lookup scenarios

Changed 2 years ago by Steve Erlenborn <simon.sinister@…>

Eliminate Segmentation Fault when removing last item without artwork

comment:4 Changed 2 years ago by Peter Bennett

Cc: piotr.oniszczuk@… added

I found two problems.

Problem 1

I have a problem with the artwork check. You are rejecting any entry found with no artwork. I don't know how many shows exist without artwork, but even if there is no artwork it should be able to find the season and episode.

Here is a test I ran. I updated a random show as follows (using "Edit Recording Metadata"):

  • Title "Molly & Wors"
  • Subtitle "Episode 40"
  • Inetref: ttvdb.py_162941
  • Set season and episode to 1

If you go to thetvdb.com you can find that this is Season 2 Episode 14. Request the metadata job.

  • Without your patch: it is updated to show Season 2 Episode 14.
  • With your patch: it is not updated. The log message is "Removing 'Molly & Wors' entry with no artwork", and the season and episode still show as 1.

Problem 2

With the above test, if I do not put in the inetref, then both with and without the patch the inetref is filled in as tmdb3.py_242263, which is "Molly & Wors The Movie", and I get artwork for the movie. This seems strange: there is an exact match for "Molly & Wors" in the TV database, so why is it using the movie database?

comment:5 Changed 2 years ago by Steve Erlenborn <simon.sinister@…>

Thanks for the example case. I've added it to my list of example cases which need to be handled.

Tomorrowland [A George Clooney movie]

  • A partial match on TV show "Miles From Tomorrowland", with artwork
  • One exact match on movie, with artwork <----- This is the one we want

How to Marry a Millionaire [A Marilyn Monroe movie]

  • An exact match on TV, with no artwork. The release date is later than the movie.
  • An exact match on Movie, with beautiful artwork. <----- This is the one we want

Valor [A 2017 television series]

  • One TV show with an exact title match <----- This is the one we want
  • Several movies with partial title matches

Supergirl [A recent television series]

  • One TV series with an exact match. <----- This is the one we want
  • Two movies with exact title matches.

Go West [A Marx Brothers movie]

  • One partial match on TV show, with no artwork
  • Four movies match exactly. The Marx Brothers movie is the most popular. <----- This is the one we want

Molly & Wors

  • One TV show matches exactly, but it has no artwork. <----- This is the one we want
  • One movie partially matches, and has artwork

I've updated my patch to remove the blocks of code I previously added, which removed list elements with no artwork. Based on your feedback, that design was flawed. I've come up with an alternative solution, by adding a new method called findExactMatchCount(). This routine can be invoked to count exact matches, or exact matches with artwork.
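A minimal sketch of what a routine like findExactMatchCount() might look like. The Candidate struct, the parameter names, and the use of plain string equality for "exact match" are assumptions made for illustration; the actual patch works on MythTV's own lookup types and may compare titles differently.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one lookup result; not MythTV's real class.
struct Candidate {
    std::string title;
    bool hasArtwork;
};

// Count the candidates whose title equals `title` exactly.
// When `withArtworkOnly` is true, candidates lacking artwork are skipped.
int findExactMatchCount(const std::vector<Candidate>& list,
                        const std::string& title,
                        bool withArtworkOnly)
{
    int count = 0;
    for (const Candidate& c : list) {
        if (c.title != title)
            continue;               // partial or unrelated title
        if (withArtworkOnly && !c.hasArtwork)
            continue;               // exact title, but no artwork
        ++count;
    }
    return count;
}
```

With a list holding one artwork-less "Molly & Wors" entry, the sketch returns 1 when artwork is not required and 0 when it is.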

When a recording is expected to be a television series, we retrieve the count of exact matches in the television database with artwork. If that number is zero, then we add movies to the list of possible solutions. When a recording is expected to be a movie, we retrieve the count of exact matches in the movie database with artwork. If that number is zero, then we add television shows to the list of possible solutions.
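The fallback described above can be sketched roughly as follows. This is an illustration only: Candidate, gatherCandidates(), and hasExactMatchWithArtwork() are hypothetical names, and both result lists are passed in ready-made, whereas the real code would only run the second database search when it is needed.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for one lookup result; not MythTV's real class.
struct Candidate {
    std::string title;
    bool isTv;
    bool hasArtwork;
};

// True if at least one candidate has exactly this title and has artwork.
static bool hasExactMatchWithArtwork(const std::vector<Candidate>& list,
                                     const std::string& title)
{
    return std::any_of(list.begin(), list.end(),
                       [&](const Candidate& c) {
                           return c.title == title && c.hasArtwork;
                       });
}

// Start with results from the expected database. Only when it has no
// exact match with artwork are results from the other database appended.
std::vector<Candidate> gatherCandidates(bool expectTv,
                                        const std::vector<Candidate>& tvResults,
                                        const std::vector<Candidate>& movieResults,
                                        const std::string& title)
{
    const std::vector<Candidate>& primary   = expectTv ? tvResults : movieResults;
    const std::vector<Candidate>& secondary = expectTv ? movieResults : tvResults;

    std::vector<Candidate> out = primary;
    if (!hasExactMatchWithArtwork(primary, title))
        out.insert(out.end(), secondary.begin(), secondary.end());
    return out;
}
```

In this sketch a recording expected to be television with an exact, artwork-bearing TV match never sees the movie entries, which is the behavior wanted for "Supergirl". An artwork-less exact TV match, as in "How to Marry a Millionaire", pulls the movie entries in.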

The findBestMatch() method then looks for exact matches with artwork. If there are exact matches, but none have artwork, it will pick a solution without artwork. [This handles your "Molly & Wors" scenario].

I tried a solution where both television and movie databases were always searched, but this caused problems for my "Supergirl" scenario: the movie "Supergirl" was released later than the start of the television series. So it works better to first try finding an exact match, with artwork, in the television database. In the "Supergirl" case, an exact match with artwork is found, so there's no need to add in the entries from the movie database.

For the "How to Marry a Millionaire" scenario, a television search yields one exact match, but it has no artwork. The call to findExactMatchCount(*,*,true) returns 0 since there are no matches with artwork. So we do an additional search in the movie database, which finds the best solution.

The "Go West" scenario was difficult. Originally, when there were multiple exact title matches, I'd pick the most recently released one. This works, but it did not retrieve the artwork for the movie I recorded. It picked artwork for a more recent movie using the same name. To "fix" this, I've modified the algorithm in findBestMatch() to pick the most popular exact title match. This works pretty well for movies. I found that the movie database tends to include a popularity value for most movies. Unfortunately, the television database usually doesn't include a popularity value. I had to implement a multi-pronged solution. If there are popularity values, it will pick the most popular movie. If it's selecting from television entries, they will all have popularity of 0.0, so it will then fall back to picking the one with the most recent release date.
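One way to read the selection order described above is: exact title only, artwork preferred, then highest popularity, then most recent release. The sketch below follows that reading. The Candidate struct, its fields (including the integer releaseYear and the id label), and the strict lexicographic ordering are assumptions for illustration, not the patch's actual code.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one lookup result; not MythTV's real class.
struct Candidate {
    std::string title;
    bool hasArtwork;
    float popularity;    // 0.0 when the source supplies no value
    int releaseYear;     // simplified stand-in for a release date
    std::string id;      // illustrative label only
};

// Return the preferred exact-title match, or nullptr if there is none.
const Candidate* findBestMatch(const std::vector<Candidate>& list,
                               const std::string& title)
{
    const Candidate* best = nullptr;
    for (const Candidate& c : list) {
        if (c.title != title)
            continue;                          // ignore partial matches
        if (best == nullptr) {
            best = &c;
            continue;
        }
        if (c.hasArtwork != best->hasArtwork) {
            if (c.hasArtwork)
                best = &c;                     // artwork beats no artwork
            continue;
        }
        if (c.popularity != best->popularity) {
            if (c.popularity > best->popularity)
                best = &c;                     // more popular wins
            continue;
        }
        if (c.releaseYear > best->releaseYear)
            best = &c;                         // tie: most recent wins
    }
    return best;
}
```

When every candidate reports a popularity of 0.0, as the television entries usually do, the popularity comparison never decides anything and the release-date rule takes over.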

The GetPopularity() service wasn't really used before this. I found that it had been improperly implemented to return an unsigned int value. This didn't function very well since the database entries for movies have floating point values. To work around this, I've updated m_popularity to be of type float, and have modified GetPopularity() to return a float.
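To illustrate why an unsigned int accessor is a poor fit for fractional popularity values, here is a small sketch. Both structs are hypothetical and exist only for comparison. The ticket does not show how the value was actually lost in MythTV, so the truncating cast below is an assumed mechanism.

```cpp
// Hypothetical "before" shape: the value is stored and returned as an
// unsigned int, so the fractional part is discarded.
struct PopularityAsUInt {
    unsigned int m_popularity;
    explicit PopularityAsUInt(float v)
        : m_popularity(static_cast<unsigned int>(v)) {}
    unsigned int GetPopularity() const { return m_popularity; }
};

// Hypothetical "after" shape: the value is stored and returned as a float.
struct PopularityAsFloat {
    float m_popularity;
    explicit PopularityAsFloat(float v) : m_popularity(v) {}
    float GetPopularity() const { return m_popularity; }
};

// True when the first entry ranks strictly above the second.
template <typename T>
bool ranksAbove(const T& a, const T& b)
{
    return a.GetPopularity() > b.GetPopularity();
}
```

With the unsigned version, two entries with popularities of 0.75 and 0.25 both collapse to 0 and cannot be ranked. With the float version, the first ranks above the second.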

With these changes, all of my example cases are working the way I wanted. "Molly & Wors" now picks the television series, even though it has no artwork.

Please try mythtv-13159-PreferExactTitleMatchesForMetadata4.patch and let me know if you find any other scenarios which are not handled in a satisfactory manner.

Changed 2 years ago by Steve Erlenborn <simon.sinister@…>

Updated patch to handle "Molly & Wors" and "Go West" scenarios

Changed 2 years ago by Steve Erlenborn <simon.sinister@…>

Updating Popularity references in videoLookupInfo.h

comment:6 Changed 2 years ago by Steve Erlenborn <simon.sinister@…>

Resolution: fixed
Status: assigned → closed

In e81c7fd117b148a3bc1f572fd5ea74aabb2162ed/mythtv:

Metadata lookup: Prefer exact title match

Where there are multiple title matches, prefer exact title
and if there are multiple exact then prefer most recent.

Fixes #13159

Signed-off-by: Peter Bennett <pbennett@…>

comment:7 Changed 21 months ago by Peter Bennett

Owner: changed from Peter Bennett to Peter Bennett