Scraper

Web scraping history

Updated: Feb 16, 2020

The history of web scraping is much longer than it seems. It goes back almost to the birth of the World Wide Web itself.



The beginning


The first web robot, the World Wide Web Wanderer, was created in June 1993 by Matthew Gray at MIT. The robot was intended only to measure the size of the World Wide Web. The index it later generated was called Wandex.


Interesting fact

Wandex resurfaced in 2011. Owners of many sites recorded visits from a search bot under that name, which sparked discussion. At the moment, its main page shows a search bar.




The first WWW search engine, JumpStation, appeared in December 1993. It was built around a web robot and could already crawl pages, index them, and search the index by keyword, although it indexed only page titles and headings. The first full-text search engine, WebCrawler, was launched in 1994.
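The crawl, index, and search-by-word cycle described above can be illustrated with a tiny inverted index. This is only a minimal sketch, not how any historical engine was implemented: the `PAGES` dictionary, the example URLs, and the `build_index` and `search` helpers are all hypothetical names invented for this illustration.

```python
import re
from collections import defaultdict

# Hypothetical pages standing in for documents fetched by a crawler.
PAGES = {
    "http://example.com/a": "Web robots measure the web",
    "http://example.com/b": "Search engines index the web",
    "http://example.com/c": "Robots and search engines",
}


def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Return the sorted URLs that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    # Intersect the URL sets of all query words; unknown words match nothing.
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


if __name__ == "__main__":
    index = build_index(PAGES)
    print(search(index, "search engines"))
```

Building the index once and intersecting the per-word URL sets at query time is what makes keyword search fast: the pages themselves do not have to be rescanned for each query.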


The development


The first web APIs appeared in 2000. Salesforce and eBay launched their own APIs, through which programmers gained access to some public data. Since then, many websites have offered a web API, which greatly simplifies web scraping.


The breakthrough


The Beautiful Soup library for Python was launched in 2004. Since not all websites offer APIs, programmers needed another way to scrape sites that lack one. Beautiful Soup became that solution: it parses a page's HTML into a tree that reflects the structure of the page, which makes it easier to analyze and retrieve the contents of HTML pages. It remains one of the most widely used Python libraries for web scraping.
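To show what parsing a page and retrieving its contents looks like in practice, here is a minimal sketch using Beautiful Soup (the `bs4` package, installed separately) with Python's built-in `html.parser`. The HTML snippet, its class names, and the `extract_products` helper are hypothetical and made up for this example. A real page would first have to be downloaded and would have its own structure.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page.
HTML = """
<html>
  <head><title>Example shop</title></head>
  <body>
    <h1>Products</h1>
    <ul id="products">
      <li class="product"><a href="/p/1">Kettle</a> <span class="price">19.99</span></li>
      <li class="product"><a href="/p/2">Toaster</a> <span class="price">24.50</span></li>
    </ul>
  </body>
</html>
"""


def extract_products(html):
    """Return a list of (name, link, price) tuples found in the page."""
    # Parse the markup into a navigable tree with the standard-library parser.
    soup = BeautifulSoup(html, "html.parser")
    products = []
    for item in soup.find_all("li", class_="product"):
        link = item.find("a")
        price = item.find("span", class_="price")
        products.append(
            (
                link.get_text(strip=True),
                link["href"],
                float(price.get_text(strip=True)),
            )
        )
    return products


if __name__ == "__main__":
    for name, href, price in extract_products(HTML):
        print(name, href, price)
```

The code never searches the raw text for strings. It asks the parsed tree for elements by tag and class, which is why this approach keeps working when whitespace or attribute order in the page changes.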


By 2018, there were more than 1.8 billion websites on the Internet. Web scraping first made the newly emerged World Wide Web searchable, and later made the fast-growing Internet more convenient and accessible.

