Opened 5 years ago
Closed 5 years ago
Last modified 5 years ago
#13607 closed Bug Report - General (fixed)
Program description not being extracted where title expands across additional dvb field
Reported by: | bib1963 | Owned by: | Klaas de Waal |
---|---|---|---|
Priority: | minor | Milestone: | 32.0 |
Component: | MythTV - General | Version: | Master Head |
Severity: | medium | Keywords: | |
Cc: | | Ticket locked: | no |
Description
A program, "The League of Extraordinary Gentlemen", is being transmitted here in the UK.
The program description is not being extracted.
From the db:
MariaDB [mythtvdb]> select starttime,title,subtitle,description from program where title like "%gentlemen%" limit 1;
+---------------------+---------------------------------------+----------+-------------+
| starttime           | title                                 | subtitle | description |
+---------------------+---------------------------------------+----------+-------------+
| 2020-04-12 17:50:00 | The League of Extraordinary Gentlemen |          |             |
+---------------------+---------------------------------------+----------+-------------+
1 row in set (0.05 sec)
And from dvbsnoop, the relevant extract:
DVB-DescriptorTag: 77 (0x4d) [= short_event_descriptor]
descriptor_length: 230 (0xe6)
ISO639_2_language_code: eng
event_name_length: 30 (0x1e)
event_name: "The League of Extraordinary..." -- Charset: Latin alphabet
text_length: 195 (0xc3)
text_char: "...Gentlemen: (2003) Fantasy with Sean Connery. In an alternative Victorian age, Allan Quatermain, Dorian Gray, Captain Nemo, Mina Harker and the Invisible Man stop a world war. Violence. [AD,S]" -- Charset: Latin alphabet
I assume it breaks when it hits that colon at the end of "Gentlemen".
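To make the reported layout concrete, here is a minimal sketch in Python (not MythTV's C++ fix-up code, and not the fix that was later committed). It assumes that an event name ending in "..." and a text starting with "..." should be joined, and that the title then runs up to the first colon followed by a space. The function name `join_split_title` and that colon rule are hypothetical, chosen only for illustration.

```python
from typing import Tuple

ELLIPSIS = "..."


def join_split_title(event_name: str, text: str) -> Tuple[str, str]:
    """Return (title, description) for one short_event_descriptor.

    Hypothetical helper: only handles the case where the title
    continues from event_name into text via "..." markers.
    """
    if not (event_name.endswith(ELLIPSIS) and text.startswith(ELLIPSIS)):
        # Title does not continue into the text field; leave both as they are.
        return event_name, text

    # Drop the continuation markers and join the two halves with one space.
    full = event_name[: -len(ELLIPSIS)].rstrip() + " " + text[len(ELLIPSIS):].lstrip()

    # Assumption: the first ": " separates the title from the description.
    title, sep, description = full.partition(": ")
    if not sep:
        # No delimiter found, so do not guess where the title ends.
        return event_name, text
    return title, description


title, description = join_split_title(
    "The League of Extraordinary...",
    "...Gentlemen: (2003) Fantasy with Sean Connery. Violence. [AD,S]",
)
print(title)
print(description)
```

With the descriptor above this yields the full title and a description that still begins with the year in parentheses. A first-colon rule like this would not suit titles that themselves contain a colon, so it is only a starting point.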
Change History (10)
comment:1 Changed 5 years ago by
comment:2 Changed 5 years ago by
That particular extraction was on DVB-T2, but I am sure I have also seen it on satellite.
comment:3 Changed 5 years ago by
Here are some more which seem to be missing descriptions:
2020-04-20 13:30:00 | Beyond Stardom
2020-04-20 20:00:00 | Harbour Lights
2020-04-21 00:10:00 | House
2020-04-22 18:30:00 | Lawmen of the Old West
2020-04-18 09:45:00 | Tad the Lost Explorer and the Secret of King Midas
2020-04-20 19:30:00 | Tales of the Unexpected
2020-04-18 18:50:00 | The League of Extraordinary Gentlemen
2020-04-18 20:00:00 | World Without End
I'm not sure all of them are caused by data going across multiple fields. "House" is very short, and its entries appear to be corrupted or not plain ASCII, yet it's the same entries each time. Here are the dvbsnoop details:
DVB-DescriptorTag: 77 (0x4d) [= short_event_descriptor]
descriptor_length: 93 (0x5d)
ISO639_2_language_code: eng
event_name_length: 5 (0x05)
event_name: "House" -- Charset: Latin alphabet
text_length: 83 (0x53)
text_char: "..215264.363l.313j_]351.M263376342ޛ222235336.333).8251246277251v202214303314327.347307341/363p327z.@210Ip.[.330E272351352246355356242e276<270256C.273`.3323s342.M257@" -- Charset: reserved
comment:4 Changed 5 years ago by
That looks suspiciously like it has been encoded the way some of the Freesat data is.
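One way to check that suspicion on raw descriptor bytes is sketched below in Python. It relies on two assumptions that the dump above does not confirm: that a first text byte of 0x1F signals a private encoding selected by the following encoding_type_id byte (as in the DVB SI text-coding annex), and that Freesat's compressed strings use the ids 0x01 and 0x02. The function name is hypothetical and the sample bytes are made up, not taken from the "House" event.

```python
def looks_freesat_compressed(raw: bytes) -> bool:
    """Heuristic check on the raw text bytes of a short_event_descriptor.

    Assumption: 0x1F marks an encoding_type_id-based string, and ids
    0x01/0x02 are the ones used for Freesat's compressed text.
    """
    return len(raw) >= 2 and raw[0] == 0x1F and raw[1] in (0x01, 0x02)


# Made-up sample payloads for illustration only.
print(looks_freesat_compressed(bytes([0x1F, 0x02, 0x8D, 0xB4])))
print(looks_freesat_compressed(b"House"))
```

If the check is true, the text needs decompressing before any title or description parsing can be applied to it.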
comment:5 Changed 5 years ago by
Owner: | set to Klaas de Waal |
---|---|
Status: | new → assigned |
comment:6 Changed 5 years ago by
Status: | assigned → infoneeded |
---|
The issue with "The League of Extraordinary Gentlemen" has been reproduced on channel 300, Film4, on Astra-2 at 28.2E. A fix for this issue has been applied in master in commit c1fb397f7f6ad25845f6fe7cde0cead07e11c932.
Please give feedback on this, in particular on whether it fixes more than just the "League" issue and whether it causes unwanted effects, i.e. regressions, on other programs.
comment:8 Changed 5 years ago by
With additional debug code running for 24 hours while receiving EIT from Astra-2, there were four occasions, involving two different programs, where the description would have been discarded because there was a year in the concatenated string, as shown here:
2020-04-21 02:01:59.505022 I KdW UK EIT fixup fix #13607 position1 m_ukYear 108
strFull 'Hollywood's Brightest Bombshell: The Hedy Lamarr Story. Documentary about Hollywood wild-child Hedy Lamarr. [2017]'
kdwfix t,s,d 'Hollywood's Brightest Bombshell' '' 'The Hedy Lamarr Story. Documentary about Hollywood wild-child Hedy Lamarr. [2017]'
no_fix t,s,d 'Hollywood's Brightest Bombshell' '' ''
--
2020-04-21 02:13:04.503598 I KdW UK EIT fixup fix #13607 position1 m_ukYear 50
strFull 'Teenage Mutant Ninja Turtles: Out of the Shadows: (2016) Part-animated superhero adventure. The quartet of crime-fighting friends try to stop their enemy Shredder from helping the alien Krang from conquering Earth.'
kdwfix t,s,d 'Teenage Mutant Ninja Turtles: Out of the Shadows' '' '(2016) Part-animated superhero adventure. The quartet of crime-fighting friends try to stop their enemy Shredder from helping the alien Krang from conquering Earth.'
no_fix t,s,d 'Teenage Mutant Ninja Turtles: Out of the Shadows' '' ''
The "no_fix" string is the title, subtitle and description as produced by the original code, which discards the description because there is a year in the concatenated string.
The "kdwfix" string is the title, subtitle, description with the fix applied. Note that the year in the description is removed by later processing so you do not see that in the guide.
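The later year removal mentioned above could look roughly like the following Python sketch. It is not MythTV's implementation and does not reproduce the pattern behind `m_ukYear`; the regular expression and the function name `extract_year` are assumptions, chosen only to cover the two forms seen in the log, "(2016)" at the start and "[2017]" at the end of the description.

```python
import re
from typing import Optional, Tuple

# Assumed pattern: a four-digit year starting 19 or 20, wrapped in
# round or square brackets, plus any whitespace that follows it.
YEAR = re.compile(r"[\[(]((?:19|20)\d{2})[\])]\s*")


def extract_year(description: str) -> Tuple[Optional[int], str]:
    """Return (year, description without the first bracketed year)."""
    match = YEAR.search(description)
    if match is None:
        return None, description
    # Cut the bracketed year out and tidy the surrounding whitespace.
    cleaned = (description[: match.start()] + description[match.end():]).strip()
    return int(match.group(1)), cleaned


print(extract_year("(2016) Part-animated superhero adventure."))
print(extract_year("Documentary about Hollywood wild-child Hedy Lamarr. [2017]"))
```

The point of the sketch is that the year is taken out of the description and kept separately, rather than causing the whole description to be dropped.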
comment:9 Changed 5 years ago by
Resolution: | → fixed |
---|---|
Status: | infoneeded → closed |
comment:10 Changed 5 years ago by
Milestone: | needs_triage → 32.0 |
---|---|
Version: | Unspecified → Master Head |
Which channel is it? Is this on DVB-T/T2 (Freeview) or on Astra 28.2E satellite (Freesat)? If it is on Freesat I might be able to reproduce this.