Getting Data Was Never Easier

Forget about proxies, servers, and IP addresses. Just get the data you need.

Google and eCommerce HTML Scraper

Send a request with up to 1,000 URLs and receive the raw, unblocked HTML files.

Quick Start

Create a new account at: https://app.scrapezone.com
Copy your scrape username and password.
Start getting the data you need.

Sending a request

Each request carries a batch of 1 to 1,000 URLs.
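A URL list longer than the limit has to be split across several requests. The following is a minimal Python sketch of that split; the helper name `batch_urls` and the `example.com` URLs are illustrative and not part of the API.

```python
from typing import Iterator, List

MAX_URLS_PER_REQUEST = 1000  # per-request limit stated in this section


def batch_urls(urls: List[str], size: int = MAX_URLS_PER_REQUEST) -> Iterator[List[str]]:
    """Yield consecutive slices of `urls`, each holding at most `size` items."""
    if not 1 <= size <= MAX_URLS_PER_REQUEST:
        raise ValueError("size must be between 1 and 1,000")
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


# Hypothetical input: 2,500 placeholder URLs.
urls = [f"https://example.com/item/{i}" for i in range(2500)]
print([len(batch) for batch in batch_urls(urls)])  # prints [1000, 1000, 500]
```

Each yielded batch can then be sent as the `query` list of its own request.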

Endpoint: POST https://api.scrapezone.com/scrape

Parameters:

query: a list of URLs to scrape.

callback_url: the URL to send the response to once the scrape is done (Optional).

country: the country from which the request should originate. Supported countries:

'us', 'fr', 'it', 'de', 'uk'

Request Example:

curl --user user:pass \
--header "Content-Type: application/json" \
--request POST \
--data '{"query":["https://www.amazon.com/Best-Sellers-Electronics/zgbs/electronics"]}' \
https://api.scrapezone.com/scrape
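For readers calling the API from code, here is a rough Python equivalent of the curl command, using only the standard library. It assumes that `--user` maps to HTTP Basic authentication (which is what curl does) and that `country` and `callback_url` travel in the same JSON body as `query`; the function name `build_scrape_request` is hypothetical.

```python
import base64
import json
import urllib.request
from typing import List, Optional

API_URL = "https://api.scrapezone.com/scrape"
SUPPORTED_COUNTRIES = {"us", "fr", "it", "de", "uk"}


def build_scrape_request(
    username: str,
    password: str,
    urls: List[str],
    country: Optional[str] = None,
    callback_url: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for a batch of URLs."""
    if not 1 <= len(urls) <= 1000:
        raise ValueError("a request must contain between 1 and 1,000 URLs")
    if country is not None and country not in SUPPORTED_COUNTRIES:
        raise ValueError(f"unsupported country: {country!r}")

    payload = {"query": list(urls)}
    # Assumption: optional parameters are sent as top-level JSON fields.
    if country is not None:
        payload["country"] = country
    if callback_url is not None:
        payload["callback_url"] = callback_url

    # HTTP Basic auth, the same scheme curl uses for --user user:pass.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


request = build_scrape_request(
    "user",
    "pass",
    ["https://www.amazon.com/Best-Sellers-Electronics/zgbs/electronics"],
)
# Sending is left commented out so the sketch runs without network access:
# with urllib.request.urlopen(request) as response:
#     job = json.load(response)
```

Building the request separately from sending it keeps the validation easy to check before any credits are spent.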

Response

The response is formatted as follows:

job_id: the ID of the scrape job, used to retrieve the results.

callback_url: the URL to send the response to once the scrape is done.

parser_name: the name of the parser to use on the results. For more information, see Parsed Results.

Response Example:

{
  "job_id": "12345678987654321",
  "callback_url": "YOUR_CALLBACK_URL",
  "parser_name": "Requested parser name"
}

Getting the results:

There are two methods of getting the response:

Using continuous polling (GET /scrape/job_id)
Using a callback URL

GET /scrape/job_id

An endpoint to check the scrape status and download the results once the scrape is done.

Status: one of scraping, done, or faulted.
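Continuous polling can be sketched in Python as a loop that stops once the job leaves the scraping state. This is only an illustration under several assumptions: that GET /scrape/job_id uses the same Basic authentication as the POST endpoint, that it returns JSON with a `status` field, and that a 5-second interval is acceptable. The names `fetch_job` and `wait_for_job` are hypothetical.

```python
import base64
import json
import time
import urllib.request
from typing import Any, Callable, Dict

API_BASE = "https://api.scrapezone.com"


def fetch_job(username: str, password: str, job_id: str) -> Dict[str, Any]:
    """GET /scrape/<job_id> and decode the JSON body (assumed response format)."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        f"{API_BASE}/scrape/{job_id}",
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_job(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_seconds: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call `fetch_status` until the job is done or faulted, then return it."""
    for attempt in range(max_attempts):
        job = fetch_status()
        if job.get("status") in ("done", "faulted"):
            return job
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError("job did not finish within the polling budget")


# Offline demonstration with canned responses instead of real HTTP calls.
canned = iter([{"status": "scraping"}, {"status": "done", "html_files": []}])
result = wait_for_job(lambda: next(canned), sleep=lambda seconds: None)
print(result["status"])  # prints done

# Against the real API this might look like:
# job = wait_for_job(lambda: fetch_job("user", "pass", "12345678987654321"))
```

Passing the fetch function in as an argument keeps the loop testable and lets the caller decide how requests are made.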

Callback URL

If a callback URL was given in the request, once the scrape is done we will send a POST request to that URL, containing the response object.

The response object will be in the following format:

{
  "job_id": "12345678987654321",
  "callback_url": "THE_CALLBACK_URL",
  "status": <scraping/done/faulted>,
  "html_files": [
    {
      "url": <given_url_1>,
      "output": <URL of the downloadable html file>
    },
    ...
    {
      "url": <given_url_n>,
      "output": <URL of the downloadable html file>
    }
  ]
}

“html_files” will be sent only for scrapes with status “done”; otherwise “html_files” will be null.
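A callback receiver mostly needs to read the response object and act on the file URLs. The sketch below shows one way to do that in Python with the standard library. It assumes the callback body is JSON in the format shown above; `extract_outputs`, `CallbackHandler`, and the example URLs are hypothetical names chosen for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler
from typing import Any, Dict


def extract_outputs(payload: Dict[str, Any]) -> Dict[str, str]:
    """Map each scraped URL to its downloadable HTML file URL.

    Returns an empty mapping unless the job status is "done".
    """
    if payload.get("status") != "done":
        return {}
    files = payload.get("html_files") or []  # tolerate null
    return {entry["url"]: entry["output"] for entry in files}


class CallbackHandler(BaseHTTPRequestHandler):
    """Minimal handler for the POST request sent to the callback URL."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        outputs = extract_outputs(payload)
        print(f"job {payload.get('job_id')}: {len(outputs)} file(s) ready")
        self.send_response(204)  # acknowledge without a body
        self.end_headers()


# Offline demonstration with a hand-written payload.
example = {
    "job_id": "12345678987654321",
    "callback_url": "THE_CALLBACK_URL",
    "status": "done",
    "html_files": [
        {"url": "https://example.com/a", "output": "https://files.example.com/a.html"},
    ],
}
print(extract_outputs(example))

# To serve it locally (blocks until interrupted):
# from http.server import HTTPServer
# HTTPServer(("", 8000), CallbackHandler).serve_forever()
```

Keeping the payload logic in a plain function means the same code can process a polled response, assuming both share the format.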

Parsed Results

Parsed results let you download a JSON or CSV file containing the parsed data.

Available parsers:

Scraper Name	Description	Results File Structure
amazon_product_display	Amazon Product Display Page	Documentation
amazon_search	Amazon search or category page	Documentation
bestbuy_product_display	BestBuy Product Display Page	Documentation
ebay_product_display	eBay Product Display Page	Documentation
etsy_product_display	Etsy Product Display Page	Documentation
flipkart_product_display	Flipkart Product Display Page	Documentation
google_news	Google News Results Page	Documentation
google_search	Google Search Results Page	Documentation
homedepot_product_display	The Home Depot Product Display Page	Documentation
lowes_product_display	Lowes Product Display Page	Documentation
target_product_display	Target Product Display Page	Documentation
walmart_product_display	Walmart Product Display Page	Documentation
wayfair_product_display	Wayfair Product Display Page	Documentation

Example Request

This request returns the parsed product details of two Amazon products.

curl --user user:pass \
--header "Content-Type: application/json" \
--request POST \
--data '{"query":["https://www.amazon.com/dp/B08J65DST5", "https://www.amazon.com/dp/B07FZ8S74R"],
"parser_name": "amazon_product_display"}' \
https://api.scrapezone.com/scrape
