Fetching data from TMDB API

After two failed attempts with wrapper libraries, I am just using Python's requests to fetch movie data from the TMDB API (themoviedb.org).

Goal: fetch my private lists of movies on TMDB for Project: Top 1000 movies.

The Movie Database (TMDB) is a popular, user-editable database for movies and TV shows.

787,956 Movies on TMDB as of 25 Jul 2022

See unsuccessful attempts:
Nic Note: Library: tmdbv3api
Nic Note: Library: tmdbapis

API resources
Documentation https://developers.themoviedb.org/3/getting-started/introduction
Support forum https://www.themoviedb.org/talk/category/5047958519c29526b50017d6
Wrappers & libraries https://www.themoviedb.org/documentation/api/wrappers-libraries
Service status https://status.themoviedb.org

Here's an example API request:
https://api.themoviedb.org/3/movie/550?api_key=xxxxxxxxxxxxxxxxxxx
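
A minimal sketch of the same request made from Python, assuming the requests library is installed. The names `build_movie_url` and `fetch_movie` are hypothetical helpers, not part of any TMDB library; they only assemble and call the URL shown above, with the key URL-encoded rather than pasted in by hand.

```python
from urllib.parse import urlencode

API_BASE = "https://api.themoviedb.org/3"  # v3 base URL, as in the example request


def build_movie_url(movie_id, api_key):
    """Return the v3 movie-details URL for movie_id (hypothetical helper)."""
    query = urlencode({"api_key": api_key})
    return f"{API_BASE}/movie/{movie_id}?{query}"


def fetch_movie(movie_id, api_key, timeout=10):
    """Fetch one movie's details and return the decoded JSON as a dict."""
    import requests  # third-party; imported here so the URL helper works without it

    response = requests.get(build_movie_url(movie_id, api_key), timeout=timeout)
    response.raise_for_status()  # raise on HTTP errors such as a rejected key or unknown ID
    return response.json()
```

Calling `fetch_movie(550, tmdb_api_key)` should return the same JSON as opening the example URL in a browser.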

Attribution required:
https://www.themoviedb.org/about/logos-attribution

Fetching data from my private list(s) on TMDB

Successful attempt, in 5 minutes, to fetch one movie's data with:

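import os
import pprint

import requests

pp = pprint.PrettyPrinter(indent=4)

tmdb_api_key = os.getenv("TMDB_API_KEY") # TMDB API v3 key
print(f"{tmdb_api_key=}") # print to check that the proper key is passed in the request later

my_top_1000_list_id = int(my_list_id) # my_list_id (defined elsewhere) is the ID found in the URL of my list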

# Fetching movie data for movie ID 100
movie_url = f"https://api.themoviedb.org/3/movie/100?api_key={tmdb_api_key}"
res_movie = requests.get(movie_url).json()
pp.pprint(res_movie)
print()
title = res_movie["title"]
original_language = res_movie["original_language"]
original_title = res_movie["original_title"]
tagline = res_movie["tagline"]
budget = res_movie["budget"]
tmdb_id = res_movie["id"]
imdb_id = res_movie["imdb_id"]
overview = res_movie["overview"]
release_date = res_movie["release_date"]
revenue = res_movie["revenue"]
runtime = res_movie["runtime"]
# poster_path already starts with "/", so the slash before the placeholder yields "//" in the URL
poster_path = f"https://image.tmdb.org/t/p/original/{res_movie['poster_path']}"
print(title)
print(f"{original_language=}")
print(f"{original_title=}")
print(tagline)
print(f"{budget=}")
print(tmdb_id)
print(imdb_id)
print(overview)
print(release_date)
print(f"{revenue=}")
print(f"{runtime=}")
print(poster_path)

For storing the API key, see Nic Note: How to save confidential data in environment variables with dotenv.
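
Since the script prints the key to check it, here is a small sketch of a safer check: fail early when the variable is missing and print only a masked version. The names `get_tmdb_api_key` and `mask` are hypothetical. It assumes the environment is already populated; if python-dotenv is used as in the linked note, `load_dotenv()` would need to run first.

```python
import os


def get_tmdb_api_key(env=os.environ):
    """Return the TMDB v3 key from the environment, or raise if it is missing."""
    key = env.get("TMDB_API_KEY", "").strip()
    if not key:
        raise RuntimeError("TMDB_API_KEY is not set; define it in the environment or a .env file")
    return key


def mask(secret, visible=4):
    """Hide all but the last few characters, for safer debug printing."""
    return "*" * max(len(secret) - visible, 0) + secret[-visible:]
```

`print(mask(get_tmdb_api_key()))` then confirms that a key was loaded without writing the whole key to the terminal or a log.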

I can then get the data for each movie in my list with:

tmdb_api_key = os.getenv("TMDB_API_KEY")
my_top_1000_list_id = int(my_list_id)

count = 0

my_1000_url = f"https://api.themoviedb.org/4/list/{my_top_1000_list_id}?page=1&api_key={tmdb_api_key}"
res_list = requests.get(my_1000_url).json() 
# returns dict with 19 fields:
    # average_rating
    # backdrop_path
    # comments
    # created_by
    # description
    # id
    # iso_3166_1
    # iso_639_1
    # name
    # object_ids
    # page
    # poster_path
    # public
    # results
    # revenue
    # runtime
    # sort_by
    # total_pages
    # total_results
    # pp.pprint(m)

# object_ids holds the IDs of all movies in my list, so that is what the loop below iterates over.

for m in res_list['object_ids']:
    count += 1 # to validate number of movies processed
    print(f"{m=}") # m is a string like 'movie:100' (print res_list['object_ids'] to check)
    movie_id = m[6:] # drop the first 6 characters ('movie:') to keep only the movie ID
    print(f"{movie_id=}")

    movie_url = f"https://api.themoviedb.org/3/movie/{movie_id}?api_key={tmdb_api_key}"
    res_movie = requests.get(movie_url).json()

    title = res_movie["title"]
    original_language = res_movie["original_language"]
    original_title = res_movie["original_title"]
    tagline = res_movie["tagline"]
    budget = res_movie["budget"]
    tmdb_id = res_movie["id"]
    imdb_id = res_movie["imdb_id"]
    original_language = res_movie["original_language"]
    overview = res_movie["overview"]
    release_date = res_movie["release_date"]
    revenue = res_movie["revenue"]
    runtime = res_movie["runtime"]
    poster_path = f"https://image.tmdb.org/t/p/original/{res_movie['poster_path']}"
    print(f"{title=}")
    print(f"{original_language=}")
    print(f"{original_title=}")
    print(f"{tagline=}")
    print(f"{budget=}")
    print(f"{tmdb_id=}")
    print(f"{imdb_id=}")
    print(f"{original_language=}")
    print(f"{overview=}")
    print(f"{release_date=}")
    print(f"{revenue=}")
    print(f"{runtime=}")
    print(f"{poster_path=}")
    print()

print(f"{count=}") # to validate number of movies processed

returns (edited for length):

m='movie:100'
movie_id='100'
title='Lock, Stock and Two Smoking Barrels'
original_language='en'
original_title='Lock, Stock and Two Smoking Barrels'
tagline='A Disgrace to Criminals Everywhere.'
budget=1350000
tmdb_id=100
imdb_id='tt0120735'
original_language='en'
overview='A card shark and his unwillingly-enlisted friends need to make a lot of cash quick after losing a sketchy poker match. To do this they decide to pull a heist on a small-time gang who happen to be operating out of the flat next door.'
release_date='1998-08-28'
revenue=28356188
runtime=105
poster_path='https://image.tmdb.org/t/p/original//8kSerJrhrJWKLk1LViesGcnrUPE.jpg'

m='movie:101'
movie_id='101'
title='Léon: The Professional'
original_language='en'
original_title='Léon: The Professional'
tagline='If you want a job done well, hire a professional.'
budget=16000000
tmdb_id=101
imdb_id='tt0110413'
original_language='en'
overview='Léon, the top hit man in New York, has earned a rep as an effective "cleaner". But when his next-door neighbors are wiped out by a loose-cannon DEA agent, he becomes the unwilling custodian of 12-year-old Mathilda. Before long, Mathilda\'s thoughts turn to revenge, and she considers following in Léon\'s footsteps.'
release_date='1994-09-14'
revenue=45284974
runtime=111
poster_path='https://image.tmdb.org/t/p/original//yI6X2cCM5YPJtxMhUd3dPGqDAhw.jpg'

m='movie:103'
movie_id='103'
title='Taxi Driver'
original_language='en'
original_title='Taxi Driver'
tagline="On every street in every city, there's a nobody who dreams of being a somebody."
budget=1300000
tmdb_id=103
imdb_id='tt0075314'
original_language='en'
overview='A mentally unstable Vietnam War veteran works as a night-time taxi driver in New York City where the perceived decadence and sleaze feed his urge for violent action.'
release_date='1976-02-09'
revenue=28570902
runtime=114
poster_path='https://image.tmdb.org/t/p/original//ekstpH614fwDX8DUln1a2Opz0N8.jpg'
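
The per-movie extraction above repeats the same dozen assignments in each script, so it could be wrapped in one helper. This is only a sketch: `extract_movie_fields`, `FIELDS` and `IMAGE_BASE` are hypothetical names. It uses `.get()` so a missing field becomes `None` instead of raising `KeyError`, and it appends `poster_path` without an extra slash (the value already starts with "/"), which avoids the double slash visible in the output above.

```python
IMAGE_BASE = "https://image.tmdb.org/t/p/original"

# Fields read from the /movie/{id} response in these notes
FIELDS = ("title", "original_title", "original_language", "tagline", "budget",
          "id", "imdb_id", "overview", "release_date", "revenue", "runtime")


def extract_movie_fields(res_movie):
    """Pick the fields used in these notes out of a movie-details response dict."""
    movie = {name: res_movie.get(name) for name in FIELDS}  # None when a field is absent
    poster = res_movie.get("poster_path")  # already starts with "/" when present
    movie["poster_url"] = f"{IMAGE_BASE}{poster}" if poster else None
    return movie
```

Inside the loop, `movie = extract_movie_fields(res_movie)` followed by `pp.pprint(movie)` would replace the block of assignments and prints.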

Fetching Top Rated movies on TMDB

top_rated_movies_url = f"https://api.themoviedb.org/3/movie/top_rated?page=1&api_key={tmdb_api_key}"
res_list = requests.get(top_rated_movies_url).json() 
pp.pprint(res_list)

returns:

<class 'dict'>
Keys:
- page
- results
- total_pages
- total_results

output (edited for length):

{   'page': 1,
    'results': [   {   'adult': False,
                       'backdrop_path': '/kXfqcdQKsToO0OUXHcrrNCHDBzO.jpg',
                       'genre_ids': [18, 80],
                       'id': 278,
                       'original_language': 'en',
                       'original_title': 'The Shawshank Redemption',
                       'overview': 'Framed in the 1940s for the double murder '
                                   'of his wife and her lover, upstanding '
                                   'banker Andy Dufresne begins a new life at '
                                   'the Shawshank prison, where he puts his '
                                   'accounting skills to work for an amoral '
                                   'warden. During his long stretch in prison, '
                                   'Dufresne comes to be admired by the other '
                                   'inmates -- including an older prisoner '
                                   'named Red -- for his integrity and '
                                   'unquenchable sense of hope.',
                       'popularity': 82.945,
                       'poster_path': '/q6y0Go1tsGEsmtFryDOJo3dEmqu.jpg',
                       'release_date': '1994-09-23',
                       'title': 'The Shawshank Redemption',
                       'video': False,
                       'vote_average': 8.7,
                       'vote_count': 21823},
                   {   'adult': False,
                       'backdrop_path': '/90ez6ArvpO8bvpyIngBuwXOqJm5.jpg',
                       'genre_ids': [35, 18, 10749],
                       'id': 19404,
                       'original_language': 'hi',
                       'original_title': 'दिलवाले दुल्हनिया ले जायेंगे',
                       'overview': 'Raj is a rich, carefree, happy-go-lucky '
                                   'second generation NRI. Simran is the '
                                   'daughter of Chaudhary Baldev Singh, who in '
                                   'spite of being an NRI is very strict about '
                                   'adherence to Indian values. Simran has '
                                   'left for India to be married to her '
                                   'childhood fiancé. Raj leaves for India '
                                   'with a mission at his hands, to claim his '
                                   'lady love under the noses of her whole '
                                   'family. Thus begins a saga.',
                       'popularity': 23.792,
                       'poster_path': '/2CAL2433ZeIihfX1Hb2139CX0pW.jpg',
                       'release_date': '1995-10-19',
                       'title': 'Dilwale Dulhania Le Jayenge',
                       'video': False,
                       'vote_average': 8.7,
                       'vote_count': 3722},
                   {   'adult': False,
                       'backdrop_path': '/rSPw7tgCH9c6NqICZef4kZjFOQ5.jpg',
                       'genre_ids': [18, 80],
                       'id': 238,
                       'original_language': 'en',
                       'original_title': 'The Godfather',
                       'overview': 'Spanning the years 1945 to 1955, a '
                                   'chronicle of the fictional '
                                   'Italian-American Corleone crime family. '
                                   'When organized crime family patriarch, '
                                   'Vito Corleone barely survives an attempt '
                                   'on his life, his youngest son, Michael '
                                   'steps in to take care of the would-be '
                                   'killers, launching a campaign of bloody '
                                   'revenge.',
                       'popularity': 89.756,
                       'poster_path': '/3bhkrj58Vtu7enYsRolD1fZdja1.jpg',
                       'release_date': '1972-03-14',
                       'title': 'The Godfather',
                       'video': False,
                       'vote_average': 8.7,
                       'vote_count': 16247},

full script to fetch data per page:

tmdb_api_key = os.getenv("TMDB_API_KEY")
# print(f"{tmdb_api_key=}")
tmdb_v4_token = os.getenv("TMDB_V4_TOKEN")
# print(f"{tmdb_v4_token=}")

count = 0

# Per page request

top_rated_movies_url = f"https://api.themoviedb.org/3/movie/top_rated?page=1&api_key={tmdb_api_key}"

res_list = requests.get(top_rated_movies_url).json() 
pp.pprint(res_list)

print()
# print the field names (dict keys) of the first movie in the results
for x in res_list['results'][0]:
    print(x)

# returns dict with 4 fields:
    # page: int
    # results: list of dicts with 14 fields per movie, 20 movies per page: 
        # adult
        # backdrop_path
        # genre_ids
        # id
        # original_language
        # original_title
        # overview
        # popularity
        # poster_path
        # release_date
        # title
        # video
        # vote_average
        # vote_count
    # total_pages: int (507 when I ran this)
    # total_results: int (10127 when I ran this)

# the results field holds the list of movies on the requested page. Will be using that.

for m in res_list['results']:
    count += 1
    title = m["title"]
    original_language = m["original_language"]
    original_title = m["original_title"]
    # tagline = m["tagline"] # NO tagline returned here
    # budget = m["budget"] # NO budget returned here
    tmdb_id = m["id"]
    # imdb_id = m["imdb_id"] # NO imdb_id returned here
    overview = m["overview"]
    release_date = m["release_date"]
    # revenue = m["revenue"] # NO revenue returned here
    # runtime = m["runtime"] # NO runtime returned here
    poster_path = f"https://image.tmdb.org/t/p/original/{m['poster_path']}"
    print(f"{title=}")
    print(f"{original_language=}")
    print(f"{original_title=}")
    # print(f"{tagline=}")
    # print(f"{budget=}")
    print(f"{tmdb_id=}")
    # print(f"{imdb_id=}")
    print(f"{overview=}")
    print(f"{release_date=}")
    # print(f"{revenue=}")
    # print(f"{runtime=}")
    print(f"{poster_path=}")
    print()

    # => need to call the https://api.themoviedb.org/3/movie/{movie_id}?api_key={tmdb_api_key} endpoint for each movie to get the missing fields

TODO: add logic to loop over all pages and use the /movie endpoint to get the missing fields.
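
One way the TODO could be approached, as a sketch rather than tested production code: walk the top_rated pages until total_pages is reached, and call the /movie/{id} endpoint for each result to get the missing fields. The names `get_json` and `iter_top_rated` and the `fetch` and `max_pages` parameters are hypothetical. The fetcher is passed in as an argument so the paging logic can be tried offline with a fake.

```python
API_BASE = "https://api.themoviedb.org/3"


def get_json(url, params):
    """Default fetcher: GET the URL with requests and return the decoded JSON."""
    import requests  # third-party, imported lazily so the paging logic can be tested offline

    response = requests.get(url, params=params, timeout=10)
    response.raise_for_status()
    return response.json()


def iter_top_rated(api_key, max_pages=None, fetch=get_json):
    """Yield full movie details for top-rated movies, page by page."""
    page = 1
    while True:
        listing = fetch(f"{API_BASE}/movie/top_rated", {"api_key": api_key, "page": page})
        for summary in listing["results"]:
            # The listing lacks tagline, budget, imdb_id, revenue and runtime,
            # so each movie is fetched again from the details endpoint.
            yield fetch(f"{API_BASE}/movie/{summary['id']}", {"api_key": api_key})
        total = listing["total_pages"]
        last_page = total if max_pages is None else min(max_pages, total)
        if page >= last_page:
            break
        page += 1
```

With the totals noted above, a full run means roughly 10,000 detail requests, so starting with something like `iter_top_rated(tmdb_api_key, max_pages=2)` and adding a pause between calls is probably wise.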

Resources

How to fetch different sizes of movie posters

Put the size in the image base URL, before the poster path: http://image.tmdb.org/t/p/wXXX/, e.g. http://image.tmdb.org/t/p/w500/

Available sizes, for reference:

"backdrop_sizes": [
  "w300",
  "w780",
  "w1280",
  "original"
],
"logo_sizes": [
  "w45",
  "w92",
  "w154",
  "w185",
  "w300",
  "w500",
  "original"
],
"poster_sizes": [
  "w92",
  "w154",
  "w185",
  "w342",
  "w500",
  "w780",
  "original"
],
"profile_sizes": [
  "w45",
  "w185",
  "h632",
  "original"
],
"still_sizes": [
  "w92",
  "w185",
  "w300",
  "original"
]
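
A small sketch of building a poster URL for a given size, using the poster_sizes listed above. The helper name `poster_url` and the default of w500 are assumptions; it strips the leading slash from poster_path so the result has a single slash.

```python
# Copied from the "poster_sizes" list above
POSTER_SIZES = ("w92", "w154", "w185", "w342", "w500", "w780", "original")


def poster_url(poster_path, size="w500"):
    """Build a full poster URL from a poster_path such as '/abc.jpg'."""
    if size not in POSTER_SIZES:
        raise ValueError(f"unknown poster size {size!r}; expected one of {POSTER_SIZES}")
    return f"https://image.tmdb.org/t/p/{size}/{poster_path.lstrip('/')}"
```

For example, `poster_url(res_movie["poster_path"], "w185")` would give a thumbnail-sized image instead of the original file.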
