Opened 6 months ago

Last modified 4 months ago

#13299 new Patch - Bug Fix

Python Bindings fail to calculate date-time object

Reported by: rcrdnalor Owned by: Raymond Wagner
Priority: minor Milestone: needs_triage
Component: Bindings - Python Version: Unspecified
Severity: medium Keywords:
Cc: Ticket locked: no

Description

Somewhere after Ubuntu 14.04 the content of the time-zoneinfo files in /usr/share/zoneinfo/* changed by introducing a reference to the Local Mean Time (LMT) from Year, Month, Date (1,1,1) to the first introduction of time zones: This is a very huge value, even the "zdump" utility for the zoneinfo files cannot cope with this. This change causes the Python bindings to fail when calculating time transitions (File: dt.py, class: posixtzinfo, method: _process). Unfortunately, this exception is silently covered by datetime.duck() at line 437 (https://github.com/MythTV/mythtv/blob/fixes/29/mythtv/bindings/python/MythTV/utility/dt.py#L434):

        try:
            # existing built-in datetime
            return cls.fromDatetime(t)
        except: pass

If you change it to raise an exception, one will see the backtrace:

Traceback (most recent call last):
  File "./timetest_local.py", line 4, in <module>
    from MythTV import datetime
  File "/home/local-user/MythTV_v_29_fixes/test/MythTV/__init__.py", line 41, in <module>
    from .dataheap import *
  File "/home/local-user/MythTV_v_29_fixes/test/MythTV/dataheap.py", line 665, in <module>
    class Job( DBDataWrite, JOBTYPE, JOBCMD, JOBFLAG, JOBSTATUS ):
  File "/home/local-user/MythTV_v_29_fixes/test/MythTV/dataheap.py", line 671, in Job
    _defaults = {'id':None,     'inserttime':datetime.now(),
  File "/home/local-user/MythTV_v_29_fixes/test/MythTV/utility/dt.py", line 303, in now
    tz = cls.localTZ()
  File "/home/local-user/MythTV_v_29_fixes/test/MythTV/utility/dt.py", line 273, in localTZ
    cls._localtz = posixtzinfo()
  File "/home/local-user/MythTV_v_29_fixes/test/MythTV/utility/singleton.py", line 49, in __call__
    inst = type.__call__(cls, *args, **kwargs)
  File "/home/local-user/MythTV_v_29_fixes/test/MythTV/utility/dt.py", line 219, in __init__
    self._process(fd, version)
  File "/home/local-user/MythTV_v_29_fixes/test/MythTV/utility/dt.py", line 163, in _process
    tt = time.gmtime(t)
ValueError: (75, 'Value too large for defined data type')

Please see attachment for the setup of timezones of my PC.

This failure causes a wrong time calculation of the MythTV Python Bindings, i.e.: the calculated time-zone is off by an hour:

$ timedatectl
                      Local time: So 2018-07-01 10:59:06 CEST
                  Universal time: So 2018-07-01 08:59:06 UTC
                        RTC time: So 2018-07-01 08:59:06
                       Time zone: Europe/Vienna (CEST, +0200)
       System clock synchronized: yes
systemd-timesyncd.service active: yes
                 RTC in local TZ: no

$ python2
>>> from MythTV import datetime
>>> import time
>>> tnow = time.time()
>>> datetime.fromtimestamp(tnow).strftime("%Y%m%d%H%M%S %z")
'20180701100955 +0100'

The attached patch adds robustness to the method "posixtzinfo._process()" by simply eliminating values that are out of range for the python data type. The resulting "transitions" object ot the class "posixtzinfo" is identical to those one created on Ubuntu 14.04 with mythtv_0.27_fixes (without the patch).

With this patch, I get the correct result when asking mythtv-python for the current time:

$ python2
>>> from MythTV import datetime
>>> import time
>>> tnow = time.time()
>>> datetime.fromtimestamp(tnow).strftime("%Y%m%d%H%M%S %z")
'20180701110458 +0200'

Please note: I need this method to create 'xmltv' files for mythtv. The attached patch is against fixes/29, but compatible to master as well. Once accepted, plase backport to fixes/29.

Attachments (4)

0001-Python-Bindings-Fix-parsing-of-zoneinfo-files.patch (2.2 KB) - added by rcrdnalor 6 months ago.
Patch for fixing date-time calculation
python_timezone_woes.txt (1.3 KB) - added by rcrdnalor 6 months ago.
Time Zone Settings on my PC
0001-Python-Bindings-Fix-parsing-of-zoneinfo-files_V2.patch (2.2 KB) - added by rcrdnalor 5 months ago.
0001-Python-Bindings-Fix-parsing-of-zoneinfo-files_V2.patch
0001-python-Handle-massively-negative-initial-zoneinfo-en.patch (4.1 KB) - added by ijc 5 months ago.

Download all attachments as: .zip

Change History (17)

Changed 6 months ago by rcrdnalor

Patch for fixing date-time calculation

Changed 6 months ago by rcrdnalor

Attachment: python_timezone_woes.txt added

Time Zone Settings on my PC

comment:1 Changed 6 months ago by rcrdnalor

Forgot to mention version information:

$ mythpython --version
MythTV Python Bindings -- Local Install
  local versions
    bindings version:         29.0.-1.0
    ttvdb version:            2.0-dev
    tmdb version:             v0.7.0
  external versions
    lxml version:             4.2.1
    MySQLdb version:          1.3.10.final.0
  protocol versions
    backend:                  91
    schema:                   1348
    music schema:             1024
    netvision schema:         1007

$ mythbackend --version
Please attach all output as a file in bug reports.
MythTV Version : v29.1-27-g3237c42
MythTV Branch : fixes/29
Network Protocol : 91
Library API : 29.20180316-1
QT Version : 5.9.5

comment:2 Changed 5 months ago by ijc

I'm not seeing the offset you are, but then again I'm a) on Debian (testing, with 2018c tzinfo package) and in British timezone, so maybe there is not LMT in my zoneinfo or maybe it doesn't differ enough to be noticeable (perhaps due to my proximity to UTC).

AIUI you need those "dummy" GMT entries since the data is in mutliple consecutive arrays and you need to maintain synchronisation, hence inserting them and removing at the end. That seems subtle enough to me to be worth a code comment explaining it. But perhaps just using None would be a better sentinel for a faulty tt than using the GMT list and a separate "these are not good" array, then you would check for None in the subsequent loops and skip? Or even just use transitions[i] = ... instead of transitions.append(...) and use the None entries in the array as the sentinel.

Finally, catching the ValueError? seems like a roundabout way of detecting these entries, in some sense it is treating the symptom, not the cause -- is it not possible to distinguish these LMT entries from desired ones directly? For example are all entries which are too large for time.gmtime() LMT entries and never any possibility of being something else?

Are the LMT entries the "NULL" ones shown here:

$ zdump -v /etc/localtime  | grep -E NULL\|2018
/etc/localtime  -9223372036854775808 = NULL
/etc/localtime  -9223372036854689408 = NULL
/etc/localtime  Sun Mar 25 00:59:59 2018 UT = Sun Mar 25 00:59:59 2018 GMT isdst=0 gmtoff=0
/etc/localtime  Sun Mar 25 01:00:00 2018 UT = Sun Mar 25 02:00:00 2018 BST isdst=1 gmtoff=3600
/etc/localtime  Sun Oct 28 00:59:59 2018 UT = Sun Oct 28 01:59:59 2018 BST isdst=1 gmtoff=3600
/etc/localtime  Sun Oct 28 01:00:00 2018 UT = Sun Oct 28 01:00:00 2018 GMT isdst=0 gmtoff=0
/etc/localtime  9223372036854689407 = NULL
/etc/localtime  9223372036854775807 = NULL

??? In which case can either the NULL or the enormous magnitude not be used to detect them more directly ? The NULL seems like it might be an especially interesting indicator of "something unusual"?

I wasn't actually able to find any docs on the format of /etc/localtime, nor of the LMT entries though so maybe it isn't possible to distinguish explicitly, but it would be better to try IMHO.

comment:3 Changed 5 months ago by rcrdnalor

@ijc: Thank you again for reviewing my proposed patches. I see in your post that you are a lucky one not having time-zone troubles ;-)

Following your suggestions, I provide an update of the patch: Comments have been added and the None type is used as sentinel.

Interesting observation: The transition that gets deleted is the first one:

Transition(time=-576460752303423488, utc=None, local=None, offset=-75, abbrev='LMT', isdst=0)

The time=-576460752303423488 is F800000000000000 in hex notation or 2**59, aka called Big_Bang time in zic, the zone information compiler.

See https://www.daemon-systems.org/man/zic.8.html and http://tz.iana.narkive.com/rd7eYDmc/tz-tzfiles-contain-unix-epoch-for-the-first-transition-time

The change was introduced late 2013 and 2014 and revised 26 days ago on https://github.com/eggert/tz/. This might explain, why I do not see this bug on ubuntu 14.04.

Info about 'Remove Big bang hack': https://github.com/eggert/tz/commit/83c119f4d5d48ba37d73e42b0f12da3ae06b6c3f http://tz.iana.narkive.com/5DTeOBe7/tz-proposed-remove-big-bang-hack

FWIW: I found these -9223372036854775808 = NULL entries from zdump in my old system (14.04) as well. It seems not to be related to the first transition mentioned above.

Looking forward to future changes in the zoneinfo files, the attached conservative patch seems to be appropriate for now and follows the 'KISS' principle.

Changed 5 months ago by rcrdnalor

0001-Python-Bindings-Fix-parsing-of-zoneinfo-files_V2.patch

comment:4 Changed 5 months ago by rcrdnalor

I'd like to attract more attention to this issue.

This bug causes every meaningful call to the Recorded class to fail (using chanid and starttime), but with this patch installed, I can use all these fancy methods the datetime.duck() method was written for.

With this patch, I get the same behavior as with my ubuntu 14.04 / mythtv fixes/0.27 installation, for example, for a recording '2102_20180606202400.ts' with valid database entry:

$ python2
Python 2.7.15rc1 (default, Apr 15 2018, 21:51:34) 
>>> from MythTV import MythError, Recorded, MythDB, datetime
>>> db = MythDB()
>>> 
>>> # recording in database : 2102_20180606202400.ts
>>> # local time : CEST = UTC+02:00
>>> chanid = 2102
>>> utc_mythtime   = 20180606202400
>>> local_mythtime = 20180606222400
>>> local_time_str = '2018-06-06 22:24:00'
>>>
>>> Recorded((chanid,local_mythtime), db =db)
b'<Recorded 'WORLDjournal','2018-06-06 22:24:00+02:00' at 0x7fc746755ab8>'
>>>
>>> Recorded((chanid,local_time_str), db =db)
b'<Recorded 'WORDjournal','2018-06-06 22:24:00+02:00' at 0x7fc744d25990>'

Without this patch, every of the above mentioned calls fail with:

....
    raise MythError('DBData() could not read from database')
MythTV.exceptions.MythError: DBData() could not read from database

I have to use some - non existing - intermediate time values to get the Recorded class loaded: Applying an offset of 1 hour (for whatever reason):

>>> Recorded((chanid,(local_mythtime-10000)), db =db)
b'<Recorded 'WORLDjournal','2018-06-06 21:24:00+01:00' at 0x7fc341e97ab8>'
>>> Recorded((chanid,'2018-06-06 21:24:00'), db =db)
b'<Recorded 'WORLDjournal','2018-06-06 21:24:00+01:00' at 0x7fc340452ab8>'

This makes absolutely no sense.

Please note: MythTV decided to keep the 'chanid, starttime' tuple to initialize the Recorded class of the Python Bindings. See commit ce1935b0ebc0918391f05893be82cd275cbd3969 and #12260.

The property Job(jobid, db=db).starttime used when calling a python script triggered by a user job is not affected by this bug.

Interesting observation: I have seen python scripts available in the wiki or in the forum implementing some - more or less useful - workarounds to circumvent this problem, like requiring date/time values and an offset as input on the command line for the starttime. But I found no message that complains about the broken datetime implementation in dt.py

And, please, do not add an "But it works for me as it is." comment if you are living in UTC (GMT, BST) Zone ;-)

Instead, try my patch and report the result.

comment:5 Changed 5 months ago by ijc

And, please, do not add an "But it works for me as it is." comment if you are living in UTC (GMT, BST) Zone ;-)

I'm in GMT/BST and I realised that I _can_ reproduce, just by running TZ=Europe/Vienna python (or TZ=America/Los_Angeles or TZ=... ./some-script.py etc).

I've had a play with various approaches and it seems sufficient, and safest, to only ignore initial entries which are out of range. I will attach my attempt at doing this.

comment:6 Changed 5 months ago by rcrdnalor

I am fine with the patch "0001-python-Handle-massively-negative-initial-zoneinfo-en.patch". I tested successfully all proposed patches with the timezone-info for ubuntu (trusty, bionic) and debian (jessie, stretch, sid) in respect to the time zomes 'Europe/Vienna?' and 'Antartica/Macquarie?'. The latter one is known to be problematic in the mailing lists of 'iana.org/time-zones'.

I simply missed the fact that all entries in the zone-info files needs to be in ascending order respective to the time.

Could you please update your patch to PEP8 style in the hope that no one of the devs may reject them (I mean limit to 79 chars per line) ?

comment:7 Changed 5 months ago by ijc

That file already makes pycodestyle pretty unhappy, but I've fixed the patch so pycodestyle --diff no longer complains on it.

Changed 5 months ago by ijc

comment:8 Changed 5 months ago by ijc

I've stuck my patch in a PR on GH at https://github.com/MythTV/mythtv/pull/166

comment:9 Changed 5 months ago by Bill Meek

FYI:

I've got commit access and have been using the latest version of the patch A-OK in two local programs (that had very ugly hacks I made.)

I think we would handle the change by putting special instructions in the Release Notes for v30. Not that everyone reads them, but at least there's a docmented warning of the change (for others that put local workarounds in place or use programs that have a timezone offset switch, as mentioned earlier.)

What I don't have/know how to do: testing any other impact to users or the bindings.

comment:10 Changed 5 months ago by ijc

Adding to the release notes sounds wise.

OOI was the nature of your local hacks such that this change would actually break them or just make them redundant? (I'm assuming they are somewhat representative of other workarounds people have made).

What I don't have/know how to do: testing any other impact to users or the bindings.

I guess it should be sufficient to manually audit these looking for workarounds:

$ git grep -l -E "from MythTV import|import MythTV"
mythplugins/mytharchive/mythburn/scripts/mythburn.py
mythtv/bindings/python/MythTV/wikiscripts/wikiscripts.py
mythtv/bindings/python/scripts/mythpython
mythtv/contrib/imports/mirobridge/mirobridge.py
mythtv/contrib/imports/mirobridge/mirobridge/metadata.py
mythtv/programs/scripts/hardwareprofile/config.py
mythtv/programs/scripts/hardwareprofile/distros/mythtv_data/data_mythtv.py
mythtv/programs/scripts/internetcontent/nv_python_libs/bbciplayer/bbciplayer_api.py
mythtv/programs/scripts/internetcontent/nv_python_libs/bliptv/bliptv_api.py
mythtv/programs/scripts/internetcontent/nv_python_libs/common/common_api.py
mythtv/programs/scripts/internetcontent/nv_python_libs/dailymotion/dailymotion_api.py
mythtv/programs/scripts/internetcontent/nv_python_libs/mnvsearch/mnvsearch_api.py
mythtv/programs/scripts/internetcontent/nv_python_libs/vimeo/vimeo_api.py
mythtv/programs/scripts/internetcontent/nv_python_libs/youtube/youtube_api.py
mythtv/programs/scripts/metadata/Movie/tmdb3.py
mythtv/programs/scripts/metadata/Television/ttvdb.py

I suspect most of them don't use the TZ objects at all. Perhaps a more accurate shortlist would be:

$ git grep -l datetime $(git grep -l -E "from MythTV import|import MythTV")
mythplugins/mytharchive/mythburn/scripts/mythburn.py
mythtv/contrib/imports/mirobridge/mirobridge.py
mythtv/programs/scripts/hardwareprofile/distros/mythtv_data/data_mythtv.py
mythtv/programs/scripts/internetcontent/nv_python_libs/bbciplayer/bbciplayer_api.py
mythtv/programs/scripts/internetcontent/nv_python_libs/common/common_api.py
mythtv/programs/scripts/internetcontent/nv_python_libs/mnvsearch/mnvsearch_api.py
mythtv/programs/scripts/internetcontent/nv_python_libs/vimeo/vimeo_api.py

I'm about to head out now but I shall try and go through the longer list soon.

comment:11 Changed 5 months ago by ijc

I've been through them and:

  • mythplugins/mytharchive/mythburn/scripts/mythburn.py uses datetime.duck(starttime).utcisoformat()
  • mythtv/contrib/imports/mirobridge/mirobridge.py uses datetime.now() & datetime.now().strftime(...)
  • mythtv/programs/scripts/hardwareprofile/config.py uses MythTV.utility.datetime(2001,1,1,0,0) & self.td_to_secs(MythTV.utility.datetime.now() - oldest.starttime)

None of these have anything which looks like a workaround which would need removing (I suspect they are all just subtly broken right now).

None of the following seem to use MythTV.utility.datetime in any way (they use other MythTV.* things and some of them use the stdlib datetime):

  • mythtv/bindings/python/MythTV/wikiscripts/wikiscripts.py
  • mythtv/bindings/python/scripts/mythpython
  • mythtv/contrib/imports/mirobridge/mirobridge/metadata.py
  • mythtv/programs/scripts/hardwareprofile/distros/mythtv_data/data_mythtv.py
  • mythtv/programs/scripts/internetcontent/nv_python_libs/bbciplayer/bbciplayer_api.py
  • mythtv/programs/scripts/internetcontent/nv_python_libs/bliptv/bliptv_api.py
  • mythtv/programs/scripts/internetcontent/nv_python_libs/common/common_api.py
  • mythtv/programs/scripts/internetcontent/nv_python_libs/dailymotion/dailymotion_api.py
  • mythtv/programs/scripts/internetcontent/nv_python_libs/mnvsearch/mnvsearch_api.py
  • mythtv/programs/scripts/internetcontent/nv_python_libs/vimeo/vimeo_api.py
  • mythtv/programs/scripts/internetcontent/nv_python_libs/youtube/youtube_api.py
  • mythtv/programs/scripts/metadata/Movie/tmdb3.py
  • mythtv/programs/scripts/metadata/Television/ttvdb.py

So AFAICT WRT in-tree Python code things are ok to proceed.

Last edited 5 months ago by ijc (previous) (diff)

comment:12 Changed 4 months ago by ijc

What is the process for release notes? Are they just accumulated in Release Notes - 30 until the release? I assume it would be inappropriate to add an entry for this issue while the fix remains unmerged so I'll propose some wording here instead.

I propose the following for either Special Notices & Instructions or a (currently non-existent) "Python" or perhaps "Language Bindings" section:

The MythTV.utility.posixtzinfo class in the Python language bindings, used internally by MythTV.utility.datetime now correctly handles tzinfo databases with an extremely large initial offset or LMT (Local Mean Time) entries, meaning it no longer produces incorrect times for timezones away from UTC/GMT. Local scripts using either of these classes should be examined and any workarounds for this issue removed.

comment:13 Changed 4 months ago by ijc

Bill Meek, what do you think of this now with the comments above?

Note: See TracTickets for help on using tickets.