Python Web Scraping (Second Edition)

Background research

Before diving into crawling a website, we should develop an understanding of the scale and structure of our target website. The website itself can help us via its robots.txt and Sitemap files, and external tools such as Google Search and WHOIS can provide further details.
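
As a minimal sketch of what a robots.txt file can tell us, the following uses the standard library's urllib.robotparser module (site_maps() requires Python 3.8 or later). The robots.txt content, the example.com URLs, the user agent names, and the summarize_robots helper are all hypothetical, chosen for illustration only; the file is inlined so the example runs without network access.

```python
from urllib import robotparser

# Hypothetical robots.txt content, inlined so the sketch needs no network access.
SAMPLE_ROBOTS_TXT = """\
User-agent: BadCrawler
Disallow: /

User-agent: *
Crawl-delay: 5
Disallow: /trap

Sitemap: http://example.com/sitemap.xml
"""


def summarize_robots(robots_txt, user_agent, url):
    """Report what a robots.txt file says about one user agent and one URL.

    summarize_robots is an illustrative helper, not part of any library.
    """
    parser = robotparser.RobotFileParser()
    # parse() accepts the file's lines, so no HTTP request is made here.
    parser.parse(robots_txt.splitlines())
    return {
        "allowed": parser.can_fetch(user_agent, url),
        "crawl_delay": parser.crawl_delay(user_agent),
        "sitemaps": parser.site_maps(),
    }


if __name__ == "__main__":
    print(summarize_robots(SAMPLE_ROBOTS_TXT, "GoodCrawler",
                           "http://example.com/page"))
    print(summarize_robots(SAMPLE_ROBOTS_TXT, "BadCrawler",
                           "http://example.com/page"))
```

Against a live site, we would likely call set_url() with the address of the site's robots.txt and then read() instead of parse(); the crawl delay and Sitemap entries then give a first hint about how fast we may crawl and where to find the site's list of pages.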