Web Scraping 50 eCommerce Domains: Product Info Collection

Aggregator of Furniture Stores
Example of the product page
The project had two stages:
  1. Go through the categories on retailers’ websites, discover all product pages, and store their URLs in a database.
  2. Visit each product page and extract the relevant data points.
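The two stages can be decoupled through a small URL queue. The following is a minimal sketch of that idea, not the code used in the project: the SQLite database, the product_urls table, and the status column are all assumptions made for illustration.

```python
import sqlite3


def init_db(conn):
    # The URL is the primary key, so stage 1 can be re-run without duplicates.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS product_urls (
               url TEXT PRIMARY KEY,
               retailer TEXT NOT NULL,
               status TEXT NOT NULL DEFAULT 'pending'
           )"""
    )


def save_discovered(conn, retailer, urls):
    """Stage 1: record product URLs found while crawling category pages."""
    conn.executemany(
        "INSERT OR IGNORE INTO product_urls (url, retailer) VALUES (?, ?)",
        [(url, retailer) for url in urls],
    )
    conn.commit()


def pending_urls(conn):
    """Stage 2: list the URLs that still need extraction."""
    rows = conn.execute(
        "SELECT url FROM product_urls WHERE status = 'pending' ORDER BY url"
    )
    return [row[0] for row in rows]


def mark_done(conn, url):
    """Stage 2: flag a product page as extracted."""
    conn.execute("UPDATE product_urls SET status = 'done' WHERE url = ?", (url,))
    conn.commit()
```

Keeping discovery and extraction separate in this way means a failed extraction run can be resumed from the pending rows without crawling the categories again.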
We wanted to collect the following fields for each product:
  1. Product name
  2. Image URLs
  3. Dimensions
  4. Description 
  5. Breadcrumbs
  6. SKU(s)
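One way to represent these fields is a small record type. This is a sketch under assumptions: the class name, the field names, and the choice to keep dimensions as free text are hypothetical, since the actual output schema is not described here.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class ProductRecord:
    name: str
    image_urls: List[str] = field(default_factory=list)
    dimensions: str = ""  # assumed to be free text, e.g. "200 x 90 x 75 cm"
    description: str = ""
    breadcrumbs: List[str] = field(default_factory=list)
    skus: List[str] = field(default_factory=list)

    def missing_fields(self):
        """Return the names of fields that came back empty."""
        return [key for key, value in asdict(self).items() if not value]
```

A helper such as missing_fields() gives a quick per-product list of the data points that still need attention.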
Decisions we made:

Anti-bot measures: some websites had anti-bot protection in place. Given the one-time nature of the extraction, we decided that Zyte’s Smart Proxy Manager was the cheapest solution to that problem.
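For a Scrapy project, Smart Proxy Manager is typically enabled through settings. The fragment below is a sketch that assumes the scrapy-zyte-smartproxy plugin is installed and that the API key is supplied through an environment variable. The variable name is an assumption, and this is not the project's actual configuration.

```python
# settings.py (fragment); assumes the scrapy-zyte-smartproxy package is installed
import os

DOWNLOADER_MIDDLEWARES = {
    "scrapy_zyte_smartproxy.ZyteSmartProxyMiddleware": 610,
}

ZYTE_SMARTPROXY_ENABLED = True
# Hypothetical environment variable; keeps the key out of version control.
ZYTE_SMARTPROXY_APIKEY = os.environ.get("ZYTE_SMARTPROXY_APIKEY", "")
```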

JavaScript rendering: we settled on Splash because the client was already using it.
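Splash exposes an HTTP API whose render.html endpoint returns a page's HTML after JavaScript has run. The helper below only builds the request URL. It is a sketch: the function name, the local Splash address, and the default wait time are assumptions.

```python
from urllib.parse import urlencode


def splash_render_url(page_url, splash_base="http://localhost:8050", wait=1.0):
    """Build a URL for Splash's render.html endpoint.

    `wait` is the number of seconds Splash waits after the page loads,
    which gives scripts time to populate the DOM.
    """
    query = urlencode({"url": page_url, "wait": wait})
    return f"{splash_base}/render.html?{query}"
```

Fetching the returned URL with any HTTP client yields the rendered HTML, which can then be parsed like a static page.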

Dealing with multiple variants: some products had multiple options (different colors, sizes, shapes), and each combination had to be treated as a separate product. Handling these required custom scripts in each case.
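Once a site-specific script has collected the option lists, expanding them into separate products is mechanical. The sketch below shows only that last step. The function name and the dictionary layout are assumptions, and on real sites each variant often carries its own SKU, price, and images that have to be fetched separately.

```python
from itertools import product


def expand_variants(base, options):
    """Turn one product with option lists into one record per combination."""
    names = sorted(options)  # fixed order keeps the output deterministic
    records = []
    for combo in product(*(options[name] for name in names)):
        record = dict(base)  # copy the shared fields
        record.update(zip(names, combo))
        records.append(record)
    return records
```

For example, a sofa with two colors and two sizes expands into four records, each holding the shared fields plus one color and one size.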

Image extraction: extracting the highest-quality images was non-trivial. The client asked us to ensure we extracted the highest available resolution, which took some effort.
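Where a site lists several resolutions in an img element's srcset attribute, the largest one can be picked by its width descriptor. This is a sketch of one such case only, not the method used in the project. The function name is hypothetical, and many sites need other approaches, such as rewriting size parameters in the image URL.

```python
def largest_from_srcset(srcset):
    """Return the URL with the largest width descriptor in a srcset value.

    Simplification: candidates are split on commas, so URLs that themselves
    contain commas are not handled. Density descriptors such as "2x" are
    treated as width 0.
    """
    best_url, best_width = None, -1
    for candidate in srcset.split(","):
        parts = candidate.split()
        if not parts:
            continue
        url = parts[0]
        width = 0
        if len(parts) > 1 and parts[1].endswith("w"):
            try:
                width = int(parts[1][:-1])
            except ValueError:
                width = 0
        if width > best_width:
            best_url, best_width = url, width
    return best_url
```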

QA instead of code reviews: we calculated that it would be cheaper to do more manual QA than to conduct code reviews.

Backend:

Running the spiders: to minimize costs, we ran the project on Scrapy Cloud.

Storing the output: we used several storage solutions at different stages of the project, ranging from simple Google Sheets tables to a PostgreSQL integration.


How will customers benefit from our services?

eCommerce companies:

Stay ahead of your competitors through intelligent price setting. Access to additional information can boost your sales by up to 85%.

Wholesalers and manufacturers:

Know the stocks of your customers in near real-time. This information will help you predict the demand better and reduce your stockpile.

Data science teams:

Web scraping can be tedious. We will take care of that headache for you.
