From Search Engine Watch…
A new survey has attempted to measure how much information exists outside the search engines’ reach. The company behind the survey is also offering a solution for those who want to tap into this “hidden” material.
The study, conducted by search company BrightPlanet, estimates that the inaccessible part of the web is about 500 times larger than the part search engines already provide access to. To put that in perspective, Google currently claims to have indexed or know about 1 billion web pages, making it the largest crawler-based search engine by reported numbers. Using Google as a benchmark, BrightPlanet’s estimate works out to about 500 billion pages of information on the web, of which only 1/500 can be reached via traditional search engines.
Hmmm… That was written in 2000. By the time it stopped bragging about the number of pages it had indexed, Google was claiming over 8 billion. Let me see, that’s 8 billion times 500, er, 4,000 billion (4 trillion) pages. Pardon me while I go and lie down in a darkened room. I wonder if Tim Berners-Lee realised what kind of monster he was unleashing when he dreamed up the Web.