The project had two stages:
- Go through the categories on each retailer’s website, discover all product pages, and store their URLs in a database.
- Visit each product page and extract the relevant data points (the discovery stage is sketched below).
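Here is a minimal sketch of the first stage. The retailer URL and all CSS selectors are placeholders; in practice, each site needed its own:

```python
import scrapy


class DiscoverySpider(scrapy.Spider):
    """Stage 1: walk category pages and collect product page URLs."""
    name = "discovery"
    start_urls = ["https://example-retailer.com/categories"]  # placeholder

    def parse(self, response):
        # Follow links into individual category listings (hypothetical selector).
        for href in response.css("a.category::attr(href)").getall():
            yield response.follow(href, callback=self.parse_category)

    def parse_category(self, response):
        # Yield every product page URL; a pipeline persists these to the database.
        for href in response.css("a.product::attr(href)").getall():
            yield {"product_url": response.urljoin(href)}
        # Keep paginating through the category.
        next_page = response.css("a.next::attr(href)").get()
        if next_page:
            yield response.follow(next_page, callback=self.parse_category)
```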
We wanted to collect the following fields for each product (a sketch of the item schema follows the list):
- Product name
- Image URLs
- Dimensions
- Description
- Breadcrumbs
- SKU(s)
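The schema maps naturally onto a Scrapy item. The field names below are our illustration, not the project’s actual code:

```python
import scrapy


class ProductItem(scrapy.Item):
    # One record per product; variants get their own records (see below).
    name = scrapy.Field()
    image_urls = scrapy.Field()
    dimensions = scrapy.Field()
    description = scrapy.Field()
    breadcrumbs = scrapy.Field()
    skus = scrapy.Field()
```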
Decisions we made:
Anti-bot measures: some websites had anti-bot protection in place. Given the one-time nature of the extraction, we decided that Zyte’s Smart Proxy Manager was the cheapest solution to that problem.
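Wiring Smart Proxy Manager into Scrapy is mostly configuration; a sketch using the scrapy-zyte-smartproxy plugin (the API key is a placeholder):

```python
# settings.py (sketch)
DOWNLOADER_MIDDLEWARES = {
    "scrapy_zyte_smartproxy.ZyteSmartProxyMiddleware": 610,
}
ZYTE_SMARTPROXY_ENABLED = True
ZYTE_SMARTPROXY_APIKEY = "<your-api-key>"  # placeholder
```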
JavaScript rendering: we settled on Splash because the client was already using it.
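With the scrapy-splash plugin, rendering a page means issuing a SplashRequest instead of a plain Request. A self-contained sketch of the extraction stage (the Splash URL, product URL, wait time, and selector are all illustrative):

```python
import scrapy
from scrapy_splash import SplashRequest


class ProductSpider(scrapy.Spider):
    """Stage 2 sketch: render each stored product URL through Splash."""
    name = "products"
    custom_settings = {
        "SPLASH_URL": "http://localhost:8050",  # wherever Splash runs
        "DOWNLOADER_MIDDLEWARES": {
            "scrapy_splash.SplashCookiesMiddleware": 723,
            "scrapy_splash.SplashMiddleware": 725,
            "scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware": 810,
        },
        "SPIDER_MIDDLEWARES": {"scrapy_splash.SplashDeduplicateArgsMiddleware": 100},
        "DUPEFILTER_CLASS": "scrapy_splash.SplashAwareDupeFilter",
    }

    def start_requests(self):
        # In the real project these URLs come from the stage-1 database.
        for url in ["https://example-retailer.com/product/1"]:  # placeholder
            yield SplashRequest(url, self.parse_product, args={"wait": 2})

    def parse_product(self, response):
        yield {"name": response.css("h1::text").get()}
```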
Dealing with multiple variants: some products had multiple options (different colors, sizes, shapes) that had to be treated as separate products. Handling those required a custom script for each site (one possible shape is sketched below).
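A sketch of what such a per-site script tends to look like; all selectors and attribute names here are hypothetical:

```python
def parse_product(self, response):
    # Fields shared by all variants, extracted once per page.
    base = {
        "name": response.css("h1::text").get(),
        "breadcrumbs": response.css(".breadcrumb a::text").getall(),
    }
    options = response.css("select.variant option")
    if not options:
        # Simple product: one item, one SKU.
        yield {**base, "skus": [response.css("[itemprop=sku]::text").get()]}
        return
    # Variant product: emit one item per color/size/shape option.
    for option in options:
        yield {
            **base,
            "name": f"{base['name']} ({option.css('::text').get()})",
            "skus": [option.attrib.get("data-sku")],
        }
```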
Image extraction: extracting the highest-quality images was non-trivial. The client asked us to make sure we captured the highest available resolution, which took some effort (see the sketch below).
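One approach that can help here (not necessarily what worked on every site): when a site serves responsive images, the srcset attribute often lists several resolutions, and you can keep the widest candidate. The img.product selector is hypothetical:

```python
def best_image_urls(response):
    """Return the widest srcset candidate for each product image."""
    urls = []
    for srcset in response.css("img.product::attr(srcset)").getall():
        candidates = []
        for entry in srcset.split(","):
            url, _, descriptor = entry.strip().partition(" ")
            # Width descriptors look like "1200w"; treat anything else as width 0.
            width = int(descriptor[:-1]) if descriptor.endswith("w") else 0
            candidates.append((width, url))
        urls.append(max(candidates)[1])  # keep the highest-resolution URL
    return urls
```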
QA instead of code reviews: we calculated that it would be cheaper to do more manual QA than to conduct code reviews.
Backend:
To minimize costs, we ran the project on Scrapy Cloud.
Storing the output: we used several storage solutions at different stages of the project, starting with simple Google Sheets and ending with a PostgreSQL integration (a pipeline sketch follows).
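A sketch of what the PostgreSQL integration can look like as a Scrapy item pipeline; the table name, columns, and connection string are placeholders:

```python
import psycopg2


class PostgresPipeline:
    """Persist scraped items into a `products` table (assumed to exist)."""

    def open_spider(self, spider):
        self.conn = psycopg2.connect("dbname=products user=scraper")  # placeholder DSN
        self.cur = self.conn.cursor()

    def close_spider(self, spider):
        self.conn.commit()
        self.cur.close()
        self.conn.close()

    def process_item(self, item, spider):
        # psycopg2 adapts Python lists (e.g. the SKUs) to PostgreSQL arrays.
        self.cur.execute(
            "INSERT INTO products (name, description, skus) VALUES (%s, %s, %s)",
            (item.get("name"), item.get("description"), item.get("skus")),
        )
        return item
```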