We scraped products from the Amazon marketplace for one of the sellers on the platform. The client gave us several high-level categories they were interested in and asked us to parse product information from amazon.com. The data we provided was used to decide on the next niche the client wanted to develop.
We scanned all the categories and subcategories and extracted all the product pages; here is a visualisation:
Following that, our scrapers browsed the product pages extracted in the previous stage. Our software downloaded and parsed nearly 100,000 pages before storing the data in a spreadsheet.
We have extracted eight data fields:
- Product name
- Price
- Product rating
- Shipping/delivery costs
- Availability
- Description
- Product images
- URL
The final output looked like this:
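To give a flavour of the parsing stage, here is a minimal sketch of extracting a few of the fields above and serialising them into a spreadsheet-style row. The HTML snippet and the regex selectors are simplified placeholders, not Amazon's real (and frequently changing) markup; a production scraper would use a proper HTML parser and maintained selectors.

```python
import csv
import io
import re

# Hypothetical, simplified product-page snippet. Real Amazon markup
# differs and changes often, so selectors need ongoing maintenance.
SAMPLE_HTML = """
<span id="productTitle">Example Wireless Mouse</span>
<span class="a-price"><span class="a-offscreen">$19.99</span></span>
<span id="acrPopover" title="4.5 out of 5 stars"></span>
<div id="availability"><span>In Stock</span></div>
"""

# One pattern per field we want in the output row.
FIELD_PATTERNS = {
    "name": r'id="productTitle">([^<]+)<',
    "price": r'class="a-offscreen">([^<]+)<',
    "rating": r'title="([\d.]+) out of 5 stars"',
    "availability": r'id="availability"><span>([^<]+)<',
}

def parse_product(html: str) -> dict:
    """Extract whichever fields are present; missing ones stay empty."""
    row = {}
    for field, pattern in FIELD_PATTERNS.items():
        m = re.search(pattern, html)
        row[field] = m.group(1).strip() if m else ""
    return row

def rows_to_csv(rows: list) -> str:
    """Serialise parsed rows into the spreadsheet-style output."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(FIELD_PATTERNS))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

row = parse_product(SAMPLE_HTML)
print(row["price"])  # $19.99
```

The same loop, run over ~100,000 downloaded pages, yields the spreadsheet shown in the final output.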
The next step in the project was to keep track of the product prices and availability for a list of selected products, but that is a topic for another case study.
However, no task is without complications. To avoid IP bans we routed requests through our large proxy pool, and we solved the captchas that did appear with our computer-vision AI model.
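The proxy-rotation idea can be sketched as follows. This is an illustrative example, not our production code: the pool addresses are placeholders, and `fetch(url, proxy)` stands in for the real HTTP call (where a captcha page or an HTTP 503 would be treated as a block).

```python
import itertools

# Placeholder pool; real deployments use a much larger set of proxies.
PROXY_POOL = [
    "http://proxy1.example:8080",
    "http://proxy2.example:8080",
    "http://proxy3.example:8080",
]

class ProxyRotator:
    """Cycle through a proxy pool, skipping proxies marked as banned."""

    def __init__(self, pool):
        self._cycle = itertools.cycle(pool)
        self._banned = set()
        self._size = len(pool)

    def next_proxy(self):
        for _ in range(self._size):
            proxy = next(self._cycle)
            if proxy not in self._banned:
                return proxy
        raise RuntimeError("all proxies banned")

    def mark_banned(self, proxy):
        self._banned.add(proxy)

def fetch_with_retry(url, rotator, fetch, max_attempts=3):
    """Retry through fresh proxies when a request looks blocked.

    `fetch(url, proxy)` is a stand-in for the real transport layer.
    """
    for _ in range(max_attempts):
        proxy = rotator.next_proxy()
        status, body = fetch(url, proxy)
        if status == 200:
            return body
        rotator.mark_banned(proxy)  # rotate away from blocked IPs
    raise RuntimeError("exhausted retries for " + url)

rotator = ProxyRotator(PROXY_POOL)

def fake_fetch(url, proxy):
    # Simulated transport: the first proxy is "blocked", others succeed.
    return (503, "") if proxy == PROXY_POOL[0] else (200, "page body")

print(fetch_with_retry("https://amazon.com/dp/EXAMPLE", rotator, fake_fetch))
```

Banning a proxy on failure and retrying through the next one is what keeps a long crawl running even as individual IPs get blocked.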
The final output was a database of valuable market information that helped our client identify a specialised market and successfully incorporate it into their Amazon business.
Web scraping and parsing are important, but they are not the only ways to find the best deals and market niches. With that in mind, we have compiled the requirements that, in our experience, matter most for web scraping software: it should produce accurate data while remaining affordable. The main factors in making a real impact on sales are:
- Scraping output
The stats speak for themselves: our specialised scraping bot not only outperformed the typical throughput of comparable tools but was also 30% cheaper than similarly popular solutions on the market. It was significantly faster (150 requests per minute) than Chrome plugins (35–40 requests per minute) and manual scraping (3–4 requests per minute).
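A sustained rate of 150 requests per minute works out to one request every 0.4 seconds. A client-side throttle that enforces such a target rate can be sketched like this (an illustrative example, not our bot's actual implementation):

```python
import time

class RateLimiter:
    """Space out requests to hit a target per-minute rate."""

    def __init__(self, per_minute):
        self.interval = 60.0 / per_minute  # seconds between requests
        self._last = 0.0

    def wait(self):
        """Block until at least `interval` has passed since the last call."""
        now = time.monotonic()
        sleep_for = self._last + self.interval - now
        if sleep_for > 0:
            time.sleep(sleep_for)
        self._last = time.monotonic()

limiter = RateLimiter(per_minute=150)  # ~0.4 s between requests
```

Calling `limiter.wait()` before each request keeps the crawl at the target rate without bursts that would draw anti-bot attention.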
- Reputation
Our work and trustworthiness are easy to verify on our Upwork profile and our social media. This project, like a great number of others, received a perfect score from the client. We maintain both a good rating and a trustworthy reputation in the community.
- Finding a profitable market niche
One of the primary purposes of scraping publicly available data is to ensure that each new campaign starts on the right track. In this Amazon scraping project, the data we collected helped our client gain the upper hand in a highly competitive niche.
- Timeframe
We finished the entire project within a week.
If you wish to learn more about web scraping, please check out our blog, which contains a variety of articles covering the many markets where scraping can be beneficial.
If you want to create a similar project, contact us.