Branched out from !helpers/scraping-apis

14 Apr 2023


Starting to test on free plan.


Basic Use

import requests

url = ''
params = {
    'url': '',
    'x-api-key': '<YOUR-API-KEY>'

response = requests.get(url, params=params)


working function

def get_soup(url):

    request_url = ''
    params = {
        'url': url,
        'x-api-key': SCRAPINGANT_API_KEY,
        'browser': True, # default: True / Set to True for JS rendering (10 credits), False otherwise (1 credit)
        'proxy_country': 'GB', # default: world
        'return_page_source': False, # default: False

    response = requests.get(request_url, params=params)

    html_text = response.text

    soup = BeautifulSoup(html_text, 'html.parser')

    return soup

using class selector

When website uses JS to load content, you can use the class selector option to ensure data is only returned once that CSS class is availabe.

CSS class (as .hero-unit) needs to be URL encoded.

    request_url = ''

    params = {
        'url': url,
        'x-api-key': SCRAPINGANT_API_KEY,
        'browser': True,  # Set to True for JS rendering (10 credits), False otherwise (1 credit)
        'proxy_country': 'GB',
        'wait_for_selector': urllib.parse.quote('.hero-unit'),

    response = requests.get(request_url, params=params)


API client

Library resources
PyPI ---

Getting Started

pip install scrapingant-client


from scrapingant_client import ScrapingAntClient

client = ScrapingAntClient(token='<YOUR-SCRAPINGANT-API-TOKEN>')
# Scrape the site.
result = client.general_request('')

Error codes

HTTP status code Reason
400 Wrong request format. Make sure that you're using a proper JSON input.
403 The API token is wrong or you have exceeded the API credits limit.
404 The requested URL is not reachable. Please, check it in your browser or try again.
405 The API endpoint can only be accessed using the following HTTP methods: GET, POST, PUT, DELETE.
409 Concurrent requests limit exceeded. Please, try again or upgrade to the paid plan
422 Invalid value provided. Please, look into detail for more info.
423 The anti-bot detection system has detected the request. Please, retry or change the request settings.
500 Something went wrong with the server side code. Rare case. We're recommending to contact us in this case.

Response example

apparent_encoding: utf-8
close: <bound method Response.close of <Response [404]>>
connection: <requests.adapters.HTTPAdapter object at 0x1069c6860>
content: b'{"detail":"This site can\xe2\x80\x99t be reached"}'
cookies: <RequestsCookieJar[]>
elapsed: 0:01:00.988698
encoding: utf-8
headers: {'Content-Length': '41', 'Content-Type': 'application/json', 'Date': 'Sat, 15 Apr 2023 11:11:07 GMT', 'Server': 'uvicorn', 'Vary': 'Accept-Encoding'}
history: []
is_permanent_redirect: False
is_redirect: False
iter_content: <bound method Response.iter_content of <Response [404]>>
iter_lines: <bound method Response.iter_lines of <Response [404]>>
json: <bound method Response.json of <Response [404]>>
links: {}
next: None
ok: False
raise_for_status: <bound method Response.raise_for_status of <Response [404]>>
raw: <urllib3.response.HTTPResponse object at 0x106c63040>
reason: Not Found
request: <PreparedRequest [GET]>
status_code: 404
text: {"detail":"This site can’t be reached"}

AI extraction

27 Sep 2023

Launched AI data extraction.

Can handle requests like extracting product title, price(number), full description, reviews(list: review title, review content).


Price seems prohibitive though:

each 30 characters of input and output text cost 1 API credit

So a webpage with 30k characters would cost 1k credits!