Opened 9 years ago

Closed 8 years ago

Last modified 8 years ago

#12455 closed Patch - Bug Fix (fixed)

tmdb3.py crashes with status code 1

Reported by: spmorton@…
Owned by: Karl Egly
Priority: minor
Milestone: 0.27.6
Component: MythTV - Mythmetadatalookup
Version: Unspecified
Severity: medium
Keywords:
Cc:
Ticket locked: no

Description

tmdb3.py crashes when a search returns a large number of results. Running the script directly from the command line with tmdb3.py --debug -l en -a US -M 'alien' fails with KeyError: 25. Line 167 of the script is intended to limit the number of results, but that number is greater than the limit imposed by the TMDB API, which per the docs is 40. When the API cuts off the data stream, the script crashes. In testing, a limit of 15 seems to be acceptable to TMDB. That is considerably less than 40, but this is a shared-key service.

    Traceback (most recent call last):
      File "/usr/share/mythtv/metadata/Movie/tmdb3.py", line 300, in <module>
        main()
      File "/usr/share/mythtv/metadata/Movie/tmdb3.py", line 291, in main
        buildList(args[0], opts)
      File "/usr/share/mythtv/metadata/Movie/tmdb3.py", line 140, in buildList
        if getattr(res, j):
      File "/usr/lib/python2.7/site-packages/MythTV/tmdb3/util.py", line 153, in __get__
        self.poller.__get__(inst, owner)()
      File "/usr/lib/python2.7/site-packages/MythTV/tmdb3/util.py", line 70, in __call__
        if not self.apply(req.readJSON(), False):
      File "/usr/lib/python2.7/site-packages/MythTV/tmdb3/cache.py", line 118, in __call__
        data = self.func(*args, **kwargs)
      File "/usr/lib/python2.7/site-packages/MythTV/tmdb3/request.py", line 128, in readJSON
        handle_status(data, url)
      File "/usr/lib/python2.7/site-packages/MythTV/tmdb3/request.py", line 163, in handle_status
        status = status_handlers[data.get('status_code', 1)]
    KeyError: 25
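For illustration, the last frame of the traceback reduces to a plain dictionary lookup with a key the table does not contain. The sketch below is not the real code: the one-entry status_handlers table and the lookup_fails helper are hypothetical stand-ins (the real table in request.py is not reproduced in this ticket), and it is written for Python 3, whereas the script above ran under Python 2.7.

```python
# Hypothetical, much-reduced stand-in for the status_handlers table in
# request.py; the real table's entries are not shown in this ticket.
status_handlers = {
    1: None,  # success: nothing to raise
}


def handle_status(data):
    """Look up the handler for the status code in a parsed response."""
    # A status code that is missing from the table raises KeyError here,
    # which is the failure mode shown in the traceback.
    status = status_handlers[data.get('status_code', 1)]
    if status is not None:
        raise status


def lookup_fails(data):
    """Return the missing key if the lookup raises KeyError, else None."""
    try:
        handle_status(data)
    except KeyError as exc:
        return exc.args[0]
    return None
```

Calling lookup_fails({'status_code': 25}) reports 25 as the missing key, while a response without a status_code falls back to the default of 1 and passes.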

Attachments (2)

tmdb3_query_limit.diff (2.8 KB) - added by mharbudd@… 8 years ago.
Diff file for my changes
tmdb3_query_limit.2.diff (2.8 KB) - added by mharbudd@… 8 years ago.
Added an incrementer, lol


Change History (13)

comment:1 Changed 9 years ago by bhboyle@…

I have the same problem. It does not happen when it searches by name instead of number.

    /usr/local/share/mythtv/metadata/Movie/tmdb3.py -l en -a CA -D 135397
    Traceback (most recent call last):
      File "/usr/local/share/mythtv/metadata/Movie/tmdb3.py", line 278, in <module>
        main()
      File "/usr/local/share/mythtv/metadata/Movie/tmdb3.py", line 272, in main
        buildSingle(args[0], opts)
      File "/usr/local/share/mythtv/metadata/Movie/tmdb3.py", line 91, in buildSingle
        if crew.profile: d['thumb'] = crew.profile.geturl()
      File "/usr/local/lib/python2.7/dist-packages/MythTV/tmdb3/util.py", line 143, in __get__
        self.poller.__get__(inst, owner)()
      File "/usr/local/lib/python2.7/dist-packages/MythTV/tmdb3/util.py", line 77, in __call__
        self.apply(req.readJSON())
      File "/usr/local/lib/python2.7/dist-packages/MythTV/tmdb3/cache.py", line 111, in __call__
        data = self.func(*args, **kwargs)
      File "/usr/local/lib/python2.7/dist-packages/MythTV/tmdb3/request.py", line 119, in readJSON
        handle_status(data, url)
      File "/usr/local/lib/python2.7/dist-packages/MythTV/tmdb3/request.py", line 154, in handle_status
        status = status_handlers[data.get('status_code', 1)]
    KeyError: 25

comment:2 Changed 9 years ago by spmorton@…

A possible solution is to get your own API key from TMDB and adjust the query limit down (I did, and it seems to work very well). An enhancement request might be along the lines of adding a place in the backend configuration to enter your own TMDB key, giving the user the full benefit of 40 queries per minute, and adjusting the Python code to use a variable for the key and an adjustable query limit as well. I would provide some code, but unfortunately I am time-constrained with my other research.
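The "adjustable query limit" idea could look roughly like the client-side throttle below. This is a sketch only: the QueryThrottle class, its parameters, and the injectable clock and sleep functions are hypothetical and not part of tmdb3, and the limit and window values are left to the caller. It targets Python 3 (time.monotonic is not available in Python 2.7).

```python
import time
from collections import deque


class QueryThrottle(object):
    """Allow at most `limit` calls per `window` seconds (hypothetical helper)."""

    def __init__(self, limit, window, clock=time.monotonic, sleep=time.sleep):
        self.limit = limit
        self.window = window
        self.clock = clock      # injectable so the class can be tested
        self.sleep = sleep
        self.stamps = deque()   # start times of recent calls, oldest first

    def wait(self):
        """Block until another call is allowed, then record it."""
        while True:
            now = self.clock()
            # Forget calls that have left the sliding window.
            while self.stamps and now - self.stamps[0] >= self.window:
                self.stamps.popleft()
            if len(self.stamps) < self.limit:
                break
            # Sleep until the oldest recorded call leaves the window.
            self.sleep(self.window - (now - self.stamps[0]))
        self.stamps.append(now)
```

A caller would create one throttle, for example QueryThrottle(limit, window) with values read from configuration, and call wait() before each request. Throttling on the client side only reduces how often the server rejects requests; with a shared key it cannot rule rejections out.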

comment:3 Changed 9 years ago by sargenthp@…

This is what I ended up tweaking: /usr/lib/python2.7/dist-packages/MythTV/tmdb3/request.py

Just under the other import lines at the top of the file I added:

import time

Then I replaced this function...

    def readJSON(self):
        """Parse result from specified URL as JSON data."""
        url = self.get_full_url()
        try:
            # catch HTTP error from open()
            data = json.load(self.open())
        except TMDBHTTPError, e:
            try:
                # try to load whatever was returned
                data = json.loads(e.response)
            except:
                # cannot parse json, just raise existing error
                raise e
            else:
                # response parsed, try to raise error from TMDB
                handle_status(data, url)
            # no error from TMDB, just raise existing error
            raise e
        handle_status(data, url)
        if DEBUG:
            import pprint
            pprint.PrettyPrinter().pprint(data)
        return data

With...

    def readJSON(self):
        """Parse result from specified URL as JSON data."""
        url = self.get_full_url()
        while True:
            try:
                # catch HTTP error from open()
                data = json.load(self.open())
                break
            except TMDBHTTPError, e:
                try:
                    # try to load whatever was returned
                    data = json.loads(e.response)
                except:
                    # cannot parse json, just raise existing error
                    raise e
                else:
                    # Check for error code of 25 which means we are doing more than 40 requests per minute.
                    if data.get('status_code', 1) == 25:
                        # Sleep and retry query.
                        time.sleep(10)
                        continue
                    else:
                        # response parsed, try to raise error from TMDB
                        handle_status(data, url)
                # no error from TMDB, just raise existing error
                raise e
        handle_status(data, url)
        if DEBUG:
            import pprint
            pprint.PrettyPrinter().pprint(data)
        return data

comment:4 Changed 8 years ago by Karl Egly

Component: MythTV - General → MythTV - Mythmetadatalookup
Status: new → infoneeded_new
Type: Bug Report - General → Patch - Bug Fix

Thank you for the patch. Can you add support for the "Retry-After" header? Maybe wait 10 seconds or the "Retry-After" value, whichever is higher? A hard limit on the retries, instead of an endless loop, would also be good (even if it's 100).

https://www.themoviedb.org/talk/5317af69c3a3685c4a0003b1
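The two requests above (wait for the larger of 10 seconds and "Retry-After", and cap the number of retries) can be sketched independently of tmdb3 as follows. Everything named here is hypothetical: RateLimited stands in for whatever error carries the header value, and call_with_retry is an illustrative helper, not the code that was eventually committed. The sketch is written for Python 3.

```python
import time


class RateLimited(Exception):
    """Hypothetical error carrying the server's Retry-After value."""

    def __init__(self, retry_after):
        super(RateLimited, self).__init__("rate limited")
        self.retry_after = retry_after


def call_with_retry(func, max_tries=100, min_wait=10.0, sleep=time.sleep):
    """Call func(), retrying on RateLimited, at most max_tries calls in total."""
    if max_tries < 1:
        raise ValueError("max_tries must be at least 1")
    for attempt in range(max_tries):
        try:
            return func()
        except RateLimited as exc:
            if attempt == max_tries - 1:
                # Out of attempts: give up instead of looping forever.
                raise
            # Wait for the server's hint or the floor, whichever is longer.
            sleep(max(float(exc.retry_after), min_wait))
```

Using a for loop over range(max_tries) makes the bound part of the loop header, so there is no separate counter that could be left un-incremented.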

comment:5 Changed 8 years ago by mharbudd@…

I took what sargenthp did and made the improvements suggested by dekarl.

Requires changes to two files.

First in /usr/lib/python2.7/dist-packages/MythTV/tmdb3/tmdb_exceptions.py

I added one line to the TMDBHTTPError class so that the headers are no longer discarded on an error, changing it to:

class TMDBHTTPError(TMDBError):
    def __init__(self, err):
        self.httperrno = err.code
        self.response = err.fp.read()
        self.headers = err.headers
        super(TMDBHTTPError, self).__init__(str(err))

Then the readJSON function in /usr/lib/python2.7/dist-packages/MythTV/tmdb3/request.py is altered to be:

    def readJSON(self):
        """Parse result from specified URL as JSON data."""
        url = self.get_full_url()
        tries = 0
        while tries < 100:
            try:
                # catch HTTP error from open()
                data = json.load(self.open())
                break
            except TMDBHTTPError, e:
                try:
                    # try to load whatever was returned
                    data = json.loads(e.response)
                except:
                    # cannot parse json, just raise existing error
                    raise e
                else:
                    # Check for error code of 25 which means we are doing more than 40 requests per 10 seconds
                    if data.get('status_code', 1) == 25:
                        # Sleep and retry query.
                        if DEBUG:
                            print 'Retry after {0} seconds'.format(max(float(e.headers['retry-after']),10))
                        time.sleep(max(float(e.headers['retry-after']),10))
                        continue
                    else:
                        # response parsed, try to raise error from TMDB
                        handle_status(data, url)
                # no error from TMDB, just raise existing error
                raise e
        handle_status(data, url)
        if DEBUG:
            import pprint
            pprint.PrettyPrinter().pprint(data)
        return data
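One fragile spot in a lookup such as e.headers['retry-after'] is that it assumes the header is always present and always a plain number. The HTTP specification also allows Retry-After to be a date, and a response may omit the header entirely. A defensive variant might look like the sketch below. The retry_delay helper is hypothetical, and it assumes headers is any mapping with a .get() method that answers to the key 'retry-after'.

```python
def retry_delay(headers, minimum=10.0):
    """Return seconds to wait: Retry-After if usable, never below minimum."""
    value = headers.get('retry-after')
    try:
        seconds = float(value)
    except (TypeError, ValueError):
        # Header missing (None) or not a plain number (e.g. an HTTP-date):
        # fall back to the fixed minimum wait.
        return minimum
    return max(seconds, minimum)
```

With this helper, the sleep in the loop above could be written as time.sleep(retry_delay(e.headers)), and the debug message could print the same value.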

Changed 8 years ago by mharbudd@…

Attachment: tmdb3_query_limit.diff added

Diff file for my changes

comment:6 Changed 8 years ago by mharbudd@…

Whoops, forgot to increment my loop counter. Add the following before the continue:

                        tries += 1

Uploading a new diff as well. Sorry about that.

Changed 8 years ago by mharbudd@…

Attachment: tmdb3_query_limit.2.diff added

Added an incrementer, lol

comment:7 Changed 8 years ago by Karl Egly

Owner: set to Karl Egly
Status: infoneeded_new → assigned

comment:8 Changed 8 years ago by sargenthp@…

Thanks guys! Been too busy to get back to this. Just glad my tweak helped others, and you helped improve it! :)

comment:9 Changed 8 years ago by Karl Dietz <dekarl@…>

Resolution: fixed
Status: assigned → closed

In d11f1953532854396c3da536082c2afaeb188237/mythtv:

handle API request limiting responses from themoviedb

Tested with a long list:
tmdb3.py -M The --debug

Patch by Casey Barrett
Fixes #12455

comment:10 Changed 8 years ago by Karl Dietz <dekarl@…>

In fcfa4f52e8f16dcb9b0f98e34b4ad2d967c82833/mythtv:

handle API request limiting responses from themoviedb

Tested with a long list:
tmdb3.py -M The --debug

Patch by Casey Barrett
Fixes #12455

(cherry picked from commit d11f1953532854396c3da536082c2afaeb188237)

comment:11 Changed 8 years ago by Karl Egly

Milestone: unknown → 0.27.6