Goal: fetch my private lists of movies on TMDB for Project: Top 1000 movies.
The Movie Database (TMDB) is a popular, user-editable database for movies and TV shows.
787,956 movies on TMDB as of 25 Jul 2022.
See unsuccessful attempts:
- Nic Note: Library: tmdbv3api
- Nic Note: Library: tmdbapis
API resources | |
---|---|
Documentation | https://developers.themoviedb.org/3/getting-started/introduction |
Support forum | https://www.themoviedb.org/talk/category/5047958519c29526b50017d6 |
Wrappers & libraries | https://www.themoviedb.org/documentation/api/wrappers-libraries |
Service status | https://status.themoviedb.org |
Here's an example API request:
https://api.themoviedb.org/3/movie/550?api_key=xxxxxxxxxxxxxxxxxxx
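The example request above is made of a base URL, a path and an api_key query parameter. A minimal sketch of assembling such URLs with the standard library is shown here; `API_BASE` and `build_url` are hypothetical names introduced for illustration, not part of any TMDB library.

```python
from urllib.parse import urlencode

API_BASE = "https://api.themoviedb.org/3"  # v3 base URL, as used in this note


def build_url(path: str, **params) -> str:
    """Join the base URL, a path such as '/movie/550', and URL-encoded query parameters."""
    query = urlencode(params)
    return f"{API_BASE}{path}?{query}" if query else f"{API_BASE}{path}"


if __name__ == "__main__":
    # Placeholder key, same as in the example request above
    print(build_url("/movie/550", api_key="xxxxxxxxxxxxxxxxxxx"))
```

Encoding the parameters with `urlencode` avoids broken URLs if a value ever contains characters such as `&` or spaces.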
Attribution required:
https://www.themoviedb.org/about/logos-attribution
Fetching data from my private list(s) on TMDB
Successful attempt, in 5 minutes, to fetch one movie's data with:
import os
import pprint

import requests

pp = pprint.PrettyPrinter(indent=4)

tmdb_api_key = os.getenv("TMDB_API_KEY")  # TMDB API v3 key
print(f"{tmdb_api_key=}")  # print to check that the proper key is passed in the payload later
my_top_1000_list_id = int(my_list_id)  # my_list_id must be set beforehand: the ID found in the URL of my list

# Fetching movie data for movie ID 100
movie_url = f"https://api.themoviedb.org/3/movie/100?api_key={tmdb_api_key}"
res_movie = requests.get(movie_url).json()
pp.pprint(res_movie)
print()

# Pick the fields of interest from the response
title = res_movie["title"]
original_language = res_movie["original_language"]
original_title = res_movie["original_title"]
tagline = res_movie["tagline"]
budget = res_movie["budget"]
tmdb_id = res_movie["id"]
imdb_id = res_movie["imdb_id"]
overview = res_movie["overview"]
release_date = res_movie["release_date"]
revenue = res_movie["revenue"]
runtime = res_movie["runtime"]
# poster_path already starts with "/", so the resulting URL contains "//" after "original"
poster_path = f"https://image.tmdb.org/t/p/original/{res_movie['poster_path']}"

print(title)
print(f"{original_language=}")
print(f"{original_title=}")
print(tagline)
print(f"{budget=}")
print(tmdb_id)
print(imdb_id)
print(overview)
print(release_date)
print(f"{revenue=}")
print(f"{runtime=}")
print(poster_path)
For storing the API key, see Nic Note: How to save confidential data in environment variables with dotenv.
I can then get the data for each movie in my list with:
tmdb_api_key = os.getenv("TMDB_API_KEY")
my_top_1000_list_id = int(my_list_id)
count = 0
my_1000_url = f"https://api.themoviedb.org/4/list/{my_top_1000_list_id}?page=1&api_key={tmdb_api_key}"
res_list = requests.get(my_1000_url).json()
# returns dict with 19 fields:
# average_rating
# backdrop_path
# comments
# created_by
# description
# id
# iso_3166_1
# iso_639_1
# name
# object_ids
# page
# poster_path
# public
# results
# revenue
# runtime
# sort_by
# total_pages
# total_results
# pp.pprint(m)
# the object_ids object includes the list of all movie ids in my list. Will be using that.
for m in res_list['object_ids']:
    count += 1  # to validate number of movies processed
    # m is a string like 'movie:100' (print res_list['object_ids'] to check),
    # so remove the first 6 characters ('movie:') to grab the movie ID.
    print(f"{m=}")
    movie_id = m[6:]
    print(f"{movie_id=}")
    movie_url = f"https://api.themoviedb.org/3/movie/{movie_id}?api_key={tmdb_api_key}"
    res_movie = requests.get(movie_url).json()
    title = res_movie["title"]
    original_language = res_movie["original_language"]
    original_title = res_movie["original_title"]
    tagline = res_movie["tagline"]
    budget = res_movie["budget"]
    tmdb_id = res_movie["id"]
    imdb_id = res_movie["imdb_id"]
    overview = res_movie["overview"]
    release_date = res_movie["release_date"]
    revenue = res_movie["revenue"]
    runtime = res_movie["runtime"]
    # poster_path already starts with "/", hence the "//" in the URLs printed below
    poster_path = f"https://image.tmdb.org/t/p/original/{res_movie['poster_path']}"
    print(f"{title=}")
    print(f"{original_language=}")
    print(f"{original_title=}")
    print(f"{tagline=}")
    print(f"{budget=}")
    print(f"{tmdb_id=}")
    print(f"{imdb_id=}")
    print(f"{original_language=}")
    print(f"{overview=}")
    print(f"{release_date=}")
    print(f"{revenue=}")
    print(f"{runtime=}")
    print(f"{poster_path=}")
    print()
print(f"{count=}")  # to validate number of movies processed
returns (edited for length):
m='movie:100'
movie_id='100'
title='Lock, Stock and Two Smoking Barrels'
original_language='en'
original_title='Lock, Stock and Two Smoking Barrels'
tagline='A Disgrace to Criminals Everywhere.'
budget=1350000
tmdb_id=100
imdb_id='tt0120735'
original_language='en'
overview='A card shark and his unwillingly-enlisted friends need to make a lot of cash quick after losing a sketchy poker match. To do this they decide to pull a heist on a small-time gang who happen to be operating out of the flat next door.'
release_date='1998-08-28'
revenue=28356188
runtime=105
poster_path='https://image.tmdb.org/t/p/original//8kSerJrhrJWKLk1LViesGcnrUPE.jpg'
m='movie:101'
movie_id='101'
title='Léon: The Professional'
original_language='en'
original_title='Léon: The Professional'
tagline='If you want a job done well, hire a professional.'
budget=16000000
tmdb_id=101
imdb_id='tt0110413'
original_language='en'
overview='Léon, the top hit man in New York, has earned a rep as an effective "cleaner". But when his next-door neighbors are wiped out by a loose-cannon DEA agent, he becomes the unwilling custodian of 12-year-old Mathilda. Before long, Mathilda\'s thoughts turn to revenge, and she considers following in Léon\'s footsteps.'
release_date='1994-09-14'
revenue=45284974
runtime=111
poster_path='https://image.tmdb.org/t/p/original//yI6X2cCM5YPJtxMhUd3dPGqDAhw.jpg'
m='movie:103'
movie_id='103'
title='Taxi Driver'
original_language='en'
original_title='Taxi Driver'
tagline="On every street in every city, there's a nobody who dreams of being a somebody."
budget=1300000
tmdb_id=103
imdb_id='tt0075314'
original_language='en'
overview='A mentally unstable Vietnam War veteran works as a night-time taxi driver in New York City where the perceived decadence and sleaze feed his urge for violent action.'
release_date='1976-02-09'
revenue=28570902
runtime=114
poster_path='https://image.tmdb.org/t/p/original//ekstpH614fwDX8DUln1a2Opz0N8.jpg'
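The loop above slices off the first 6 characters of each key to get the movie ID. A slightly more defensive sketch splits on the colon instead and keeps only movie entries. It assumes every key looks like 'type:id', as in the output above; `parse_object_id` and `movie_ids` are hypothetical helper names, not part of any TMDB library.

```python
def parse_object_id(object_id: str) -> tuple[str, int]:
    """Split a list entry such as 'movie:100' into ('movie', 100)."""
    media_type, _, raw_id = object_id.partition(":")
    if not media_type or not raw_id.isdigit():
        raise ValueError(f"unexpected object id: {object_id!r}")
    return media_type, int(raw_id)


def movie_ids(object_ids) -> list[int]:
    """Return the numeric IDs of the 'movie:' entries, skipping any other prefix."""
    ids = []
    for object_id in object_ids:
        media_type, tmdb_id = parse_object_id(object_id)
        if media_type == "movie":
            ids.append(tmdb_id)
    return ids


if __name__ == "__main__":
    sample = ["movie:100", "movie:101", "movie:103"]  # keys seen in the output above
    print(movie_ids(sample))  # prints [100, 101, 103]
```

Because iterating over a dict yields its keys, `movie_ids(res_list['object_ids'])` should work whether `object_ids` is a list or a dict keyed by these strings.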
Fetching Top Rated movies on TMDB
top_rated_movies_url = f"https://api.themoviedb.org/3/movie/top_rated?page=1&api_key={tmdb_api_key}"
res_list = requests.get(top_rated_movies_url).json()
pp.pprint(res_list)
returns:
<class 'dict'>
Keys:
- page
- results
- total_pages
- total_results
output (edited for length):
{ 'page': 1,
'results': [ { 'adult': False,
'backdrop_path': '/kXfqcdQKsToO0OUXHcrrNCHDBzO.jpg',
'genre_ids': [18, 80],
'id': 278,
'original_language': 'en',
'original_title': 'The Shawshank Redemption',
'overview': 'Framed in the 1940s for the double murder '
'of his wife and her lover, upstanding '
'banker Andy Dufresne begins a new life at '
'the Shawshank prison, where he puts his '
'accounting skills to work for an amoral '
'warden. During his long stretch in prison, '
'Dufresne comes to be admired by the other '
'inmates -- including an older prisoner '
'named Red -- for his integrity and '
'unquenchable sense of hope.',
'popularity': 82.945,
'poster_path': '/q6y0Go1tsGEsmtFryDOJo3dEmqu.jpg',
'release_date': '1994-09-23',
'title': 'The Shawshank Redemption',
'video': False,
'vote_average': 8.7,
'vote_count': 21823},
{ 'adult': False,
'backdrop_path': '/90ez6ArvpO8bvpyIngBuwXOqJm5.jpg',
'genre_ids': [35, 18, 10749],
'id': 19404,
'original_language': 'hi',
'original_title': 'दिलवाले दुल्हनिया ले जायेंगे',
'overview': 'Raj is a rich, carefree, happy-go-lucky '
'second generation NRI. Simran is the '
'daughter of Chaudhary Baldev Singh, who in '
'spite of being an NRI is very strict about '
'adherence to Indian values. Simran has '
'left for India to be married to her '
'childhood fiancé. Raj leaves for India '
'with a mission at his hands, to claim his '
'lady love under the noses of her whole '
'family. Thus begins a saga.',
'popularity': 23.792,
'poster_path': '/2CAL2433ZeIihfX1Hb2139CX0pW.jpg',
'release_date': '1995-10-19',
'title': 'Dilwale Dulhania Le Jayenge',
'video': False,
'vote_average': 8.7,
'vote_count': 3722},
{ 'adult': False,
'backdrop_path': '/rSPw7tgCH9c6NqICZef4kZjFOQ5.jpg',
'genre_ids': [18, 80],
'id': 238,
'original_language': 'en',
'original_title': 'The Godfather',
'overview': 'Spanning the years 1945 to 1955, a '
'chronicle of the fictional '
'Italian-American Corleone crime family. '
'When organized crime family patriarch, '
'Vito Corleone barely survives an attempt '
'on his life, his youngest son, Michael '
'steps in to take care of the would-be '
'killers, launching a campaign of bloody '
'revenge.',
'popularity': 89.756,
'poster_path': '/3bhkrj58Vtu7enYsRolD1fZdja1.jpg',
'release_date': '1972-03-14',
'title': 'The Godfather',
'video': False,
'vote_average': 8.7,
'vote_count': 16247},
full script to fetch data per page:
tmdb_api_key = os.getenv("TMDB_API_KEY")
# print(f"{tmdb_api_key=}")
tmdb_v4_token = os.getenv("TMDB_V4_TOKEN")
# print(f"{tmdb_v4_token=}")
count = 0
# Per page request
top_rated_movies_url = f"https://api.themoviedb.org/3/movie/top_rated?page=1&api_key={tmdb_api_key}"
res_list = requests.get(top_rated_movies_url).json()
pp.pprint(res_list)
print()
for x in res_list['results'][0]:
    print(x)
# returns dict with 4 fields:
#   page: int
#   results: list of dicts with 14 fields per movie, 20 movies per page:
#       adult
#       backdrop_path
#       genre_ids
#       id
#       original_language
#       original_title
#       overview
#       popularity
#       poster_path
#       release_date
#       title
#       video
#       vote_average
#       vote_count
#   total_pages: int / 507
#   total_results: int / 10127
# the results object includes the list of all movies. Will be using that.
for m in res_list['results']:
    count += 1
    title = m["title"]
    original_language = m["original_language"]
    original_title = m["original_title"]
    # tagline = m["tagline"]  # NO tagline returned here
    # budget = m["budget"]  # NO budget returned here
    tmdb_id = m["id"]
    # imdb_id = m["imdb_id"]  # NO imdb_id returned here
    overview = m["overview"]
    release_date = m["release_date"]
    # revenue = m["revenue"]  # NO revenue returned here
    # runtime = m["runtime"]  # NO runtime returned here
    poster_path = f"https://image.tmdb.org/t/p/original/{m['poster_path']}"
    print(f"{title=}")
    print(f"{original_language=}")
    print(f"{original_title=}")
    # print(f"{tagline=}")
    # print(f"{budget=}")
    print(f"{tmdb_id=}")
    # print(f"{imdb_id=}")
    print(f"{overview=}")
    print(f"{release_date=}")
    # print(f"{revenue=}")
    # print(f"{runtime=}")
    print(f"{poster_path=}")
    print()
# => need to use the https://api.themoviedb.org/3/movie/{movie_id}?api_key={tmdb_api_key} endpoint for each movie to get the missing fields
TODO: add logic to loop over the pages and use the /movie endpoint to get the missing fields.
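One possible shape for that TODO, as a rough sketch rather than a finished script: walk the top_rated pages, then call the /movie/{movie_id} endpoint for each result to pick up tagline, budget, imdb_id, revenue and runtime. The names `get_json` and `top_rated_with_details`, the `max_pages` cap and the 10-second timeout are assumptions introduced here.

```python
API_BASE = "https://api.themoviedb.org/3"


def get_json(path, api_key, **params):
    """GET a v3 endpoint and return the decoded JSON, raising on HTTP errors."""
    import requests  # imported here so the paging logic can be exercised without it

    response = requests.get(
        f"{API_BASE}{path}",
        params={"api_key": api_key, **params},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()


def top_rated_with_details(api_key, max_pages=1, fetch=get_json):
    """Yield the full /movie/{id} record for each top-rated movie, page by page."""
    page = 1
    while page <= max_pages:
        listing = fetch("/movie/top_rated", api_key, page=page)
        for summary in listing["results"]:
            # The listing lacks tagline, budget, imdb_id, revenue and runtime,
            # so fetch the detailed record for each movie.
            yield fetch(f"/movie/{summary['id']}", api_key)
        if page >= listing["total_pages"]:
            break  # no more pages to request
        page += 1
```

Usage would look like `for movie in top_rated_with_details(tmdb_api_key, max_pages=3): print(movie["title"], movie["runtime"])`. This makes one extra request per movie, so `max_pages` is worth keeping small while testing.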
Resources
How to fetch different sizes of movie posters
http://image.tmdb.org/t/p/wXXX/, e.g. http://image.tmdb.org/t/p/w500/
Available sizes, for reference:
"backdrop_sizes": ["w300", "w780", "w1280", "original"],
"logo_sizes": ["w45", "w92", "w154", "w185", "w300", "w500", "original"],
"poster_sizes": ["w92", "w154", "w185", "w342", "w500", "w780", "original"],
"profile_sizes": ["w45", "w185", "h632", "original"],
"still_sizes": ["w92", "w185", "w300", "original"]
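A small sketch of building a poster URL for a given size, checked against the poster_sizes listed above. `poster_url` is a hypothetical helper name. It strips the leading slash that the API includes in poster_path, which avoids a double slash in the result, and returns None when a movie has no poster.

```python
IMAGE_BASE = "https://image.tmdb.org/t/p"
# Copied from the "poster_sizes" entry above
POSTER_SIZES = ["w92", "w154", "w185", "w342", "w500", "w780", "original"]


def poster_url(poster_path, size="w500"):
    """Return the full image URL for a poster_path value, or None if there is no poster."""
    if size not in POSTER_SIZES:
        raise ValueError(f"unknown poster size: {size!r}")
    if not poster_path:
        return None
    # poster_path comes back as '/<file>.jpg', so drop the leading slash before joining
    return f"{IMAGE_BASE}/{size}/{poster_path.lstrip('/')}"


if __name__ == "__main__":
    print(poster_url("/8kSerJrhrJWKLk1LViesGcnrUPE.jpg", "w342"))
```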