Octoparse is a web scraping tool that lets you scrape most websites with little or no coding experience. The workflow is simple: enter the target website’s URL, select the data you want to extract, and run the task. Octoparse has a free plan, and paid plans add more features.
When you scrape a large amount of data, your scraper sends a lot of requests to the website’s server. Many websites detect this kind of automated traffic and respond by banning the account or blacklisting its IP address. To reduce the risk of a ban, pair Octoparse with a proxy.
A proxy acts as a gateway between your device and the internet, so any website you visit sees only the IP address of the proxy server. When Octoparse rotates through a list of proxies, it switches to a different IP address at a set time interval. To the website, the requests then appear to come from several different users rather than from a single computer.
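To make the idea of time-based rotation concrete, here is a minimal Python sketch. It is not how Octoparse is implemented internally; the class name RotatingProxyPool, its parameters, and the sample addresses (taken from the reserved documentation ranges) are all assumptions made for illustration.

```python
import itertools
import time


class RotatingProxyPool:
    """Hypothetical helper: cycle through proxies on a fixed time interval."""

    def __init__(self, proxies, interval_seconds, clock=time.monotonic):
        if not proxies:
            raise ValueError("proxies must not be empty")
        self._cycle = itertools.cycle(proxies)  # loops back to the start forever
        self._interval = interval_seconds
        self._clock = clock                     # injectable so it can be tested
        self._current = next(self._cycle)
        self._switched_at = clock()

    def current(self):
        """Return the proxy to use now, switching if the interval has elapsed."""
        now = self._clock()
        if now - self._switched_at >= self._interval:
            self._current = next(self._cycle)
            self._switched_at = now
        return self._current


if __name__ == "__main__":
    # Placeholder addresses, not real proxies.
    pool = RotatingProxyPool(["203.0.113.10:8080", "203.0.113.11:3128"], 30)
    print(pool.current())
```

Each call to current() returns the same proxy until the interval has passed, then moves on to the next one in the list, which is roughly what the Time interval setting described below controls.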
Let’s set up proxies in Octoparse.
How To Use Proxies With Octoparse
Step 1. Open Octoparse and click on the New icon.
Step 2. Click on Advanced Mode.
Step 3. On the new task tab, enter the URL(s) of the website(s) you want to scrape and then click Save.
Step 4. After the website loads, click on the Task Settings button.
Step 5. On the Settings menu, turn on Use IP proxies.
Step 6. Then, click on Settings.
Step 7. In the pop-up window, set the Time interval for switching proxies.
Step 8. Enter your proxy list in IP:Port format.
Step 9. Click on Confirm.
Step 10. Lastly, click on Save.
Congratulations! You have finished configuring proxies in Octoparse, and it will now route this task’s requests through your proxy list.
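Before pasting a list into the proxy settings in Step 8, it can help to check that every entry really is in IP:Port format. The Python sketch below is one way to do that, assuming IPv4 addresses and one entry per line; the function name parse_proxy_list and the sample addresses are hypothetical, and this is a standalone pre-check rather than part of Octoparse.

```python
import ipaddress


def parse_proxy_list(text):
    """Split text into valid 'IP:Port' entries and rejected lines."""
    valid, rejected = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        host, sep, port = line.rpartition(":")
        try:
            ipaddress.IPv4Address(host)  # raises ValueError if not a valid IPv4
            ok = sep == ":" and port.isdigit() and 1 <= int(port) <= 65535
        except ValueError:
            ok = False
        (valid if ok else rejected).append(line)
    return valid, rejected


if __name__ == "__main__":
    # Placeholder addresses from documentation ranges, not real proxies.
    sample = "203.0.113.10:8080\n198.51.100.7:3128\nnot-a-proxy\n"
    good, bad = parse_proxy_list(sample)
    print("valid:", good)
    print("rejected:", bad)
```

Only the entries in the valid list should go into the proxy settings; anything in the rejected list is worth fixing first, since a malformed entry may fail when the task runs.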