Fetch Notion database data with Python

Get data from Notion with Python

07 Jul 2022

Needed a way to fetch data from Notion to generate Markdown files dynamically for this static site.

Reference

Source database:

notion-clipper.jpg

Generate access

Secret / token

  • Generate secret/token at https://www.notion.so/my-integrations
  • Ensure all access is granted, Read content, Update content and Insert content. Initial tests with just giving Read rights did not work for some reason.
  • Set as Internal integration
  • Internal Integration Token will be like secret_XXXXXXXXXXXXXXXX: use as token below

Database ID

This can be found in the URL of the database page to fetch:

https://www.notion.so/myworkspacename/THIS_IS_THE_DATABASE_ID?v=XXXXXXXXXXXXXXXXXXXXXXXXXXXX

28 Sep 2022

The database needs to be shared with the integration: https://developers.notion/docs/getting-started#step-2-share-a-database-with-your-integration

python-notion/notion-integration-share.jpg
here Clipper Import is the name of the integration I setup for this.

Note: if when selecting Add Connections your newly created integration does not show up:

  • try from the web interface rather than the desktop app
  • refresh your browser
  • if still nothing, add a dummy record or do anything for the document to show Edited just now in the top right corner. Your integration should appear then.

Script

import json
import requests
import pprint
pp = pprint.PrettyPrinter(indent=4)

token = 'secret_XXXXXXXXXXXXXXXX'
database_id = 'XXXXXXXXXXXXXXXXX' # first variable in database URL, ie before ?

headers = {
    "Authorization": "Bearer " + token,
    "Content-Type": "application/json",
    "Notion-Version": "2021-05-13"
}

url = f'https://api.notion/v1/databases/{database_id}/query'

r = requests.post(url, headers={
    "Authorization": f"Bearer {token}",
    "Notion-Version": "2021-08-16"
    })

result_dict = r.json()
results = result_dict['results']

# for analysis of results structure
pp.pprint(results) 
print()

for record in results:
    created_time = record['created_time'][:10] # only 10 first chars for YYYY-MM-DD format
    # below: 2nd field in 'properties' field will depend on column headers in database
    url = record['properties']['URL']['url']
    name = record['properties']['\ufeffName']['title'][0]['plain_text'] # some need deep navigation / nested fields
    if record['properties']['Category']['select'] != None:
        category = record['properties']['Category']['select']['name'] # simple select = single dict
    tags_list = record['properties']['Tags']['multi_select'] # multi-select fields is list of dicts
    tags = ''
    for tag in tags_list:
        if tags == '':
            tags = tag['name']
        else:
            tags = f"{tags}, {tag['name']}"
    notes_list = record['properties']['Notes']['rich_text'] # Text field is a list of dict
    if len(notes_list) > 0:
        notes = notes_list[0]['text']['content']
    else:
        notes = ''

    print(f"created_time: {created_time}")
    print(f"URL: {url}") 
    print(f"Name: {name}") 
    print(f"Category: {category}")
    print(f"Tags: {tags}")
    print(f"Notes: {notes}")
    print()

Returns a list of dict with each dict as:

    {   'archived': False,
        'cover': None,
        'created_by': {   'id': 'xxxxxxxx-0855-xxxx-xxxx-xxxxxxxx',
                          'object': 'user'},
        'created_time': '2022-07-06T21:20:00.000Z',
        'icon': None,
        'id': '1cd954ea-xxxx-xxxx-xxxx-xxxxxxxx',
        'last_edited_by': {   'id': 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx',
                              'object': 'user'},
        'last_edited_time': '2022-07-07T07:36:00.000Z',
        'object': 'page',
        'parent': {   'database_id': 'xxxxxxxx-f088-4d47-bca5-xxxxxxxx',
                      'type': 'database_id'},
        'properties': {   'Category': {   'id': '%60%7DkC',
                                          'select': {   'color': 'pink',
                                                        'id': 'xxxxxxxx-bd0a-4cf0-ab54-xxxxxxxx',
                                                        'name': 'macOS'},
                                          'type': 'select'},
                          'Notes': {   'id': 'teVv',
                                       'rich_text': ['Terminal alternative to test'],
                                       'type': 'rich_text'},
                          'Tags': {   'id': '%5DzQp',
                                      'multi_select': [   {   'color': 'gray',
                                                              'id': 'xxxxxxxx-f4c5-4236-870d-xxxxxxxx',
                                                              'name': 'terminal'}],
                                      'type': 'multi_select'},
                          'URL': {   'id': 'xxxxxx',
                                     'type': 'url',
                                     'url': 'https://fig.io/'},
                          '\ufeffName': {   'id': 'title',
                                            'title': [   {   'annotations': {   'bold': False,
                                                                                'code': False,
                                                                                'color': 'default',
                                                                                'italic': False,
                                                                                'strikethrough': False,
                                                                                'underline': False},
                                                             'href': None,
                                                             'plain_text': 'Fig '
                                                                           '- '
                                                                           'Your '
                                                                           'terminal, '
                                                                           'reimagined',
                                                             'text': {   'content': 'Fig '
                                                                                    '- '
                                                                                    'Your '
                                                                                    'terminal, '
                                                                                    'reimagined',
                                                                         'link': None},
                                                             'type': 'text'}],
                                            'type': 'title'}},
        'url': 'https://www.notion.so/Fig-Your-terminal-reimagined-1cd9xxxxxxx431199ae46a78af43ebc'},

Deleting records (blocks)

# Delete from Notion
record_id = record['id']
print(f"\nDELETING {record_id} from Notion...")

url_delete = f'https://api.notion/v1/blocks/{record_id}'
request_delete = requests.delete(url_delete, headers={
    "Authorization": f"Bearer {token}",
    "Notion-Version": "2021-08-16"
    })

What tripped me up

  • tried first the notion package on PPI (pip3 install notion) and another approach using the token to be found in the browser's cookie (ie without integration/token) - both to no avail.
  • unable to make script work when only Read content only activated in integration on Notion.
  • field Name is prefixed with \ufeff for some reason. 30 Sep 2022 seems to have been fixed?

links

social