ManticMoo.COM All Articles Jeff's Articles
Jeffrey P. Bigham

A random web page

Generating a random web page is full of subtle gotchas that make it surprisingly difficult. First of all, what do you mean by random? Some web pages are more popular than others, so should a page's likelihood of showing up depend on its popularity? Even if we want every web page to have an equal chance of showing up, the bottom line is that we can't do it: no index will ever know about every single page on the web. Even Google acknowledges that it indexes only about 10% of the total web. The rest (the so-called deep web) is made up of isolated, semi-private sites that often require some sort of input, either a login and password or just a query, before they generate any content.
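
To make the two notions of "random" concrete, here is a minimal sketch in Python. It assumes a tiny, made-up index that maps URLs to a popularity score; the URLs, the scores, and the function names are all hypothetical, and a real index would of course be incomplete in the way described above.

```python
import random

# Hypothetical toy index: URL -> popularity score (say, an inbound link count).
# These URLs and scores are made up for illustration.
TOY_INDEX = {
    "http://example.com/popular": 900,
    "http://example.com/average": 90,
    "http://example.com/obscure": 10,
}


def uniform_page(index, rng=random):
    """Pick a page so that every indexed page is equally likely."""
    return rng.choice(list(index))


def popularity_weighted_page(index, rng=random):
    """Pick a page with probability proportional to its popularity score."""
    urls = list(index)
    weights = [index[url] for url in urls]
    return rng.choices(urls, weights=weights, k=1)[0]


if __name__ == "__main__":
    print("uniform: ", uniform_page(TOY_INDEX))
    print("weighted:", popularity_weighted_page(TOY_INDEX))
```

Either way, the draw can only ever return a page that is already in the index, which is exactly the limitation discussed above.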

So, is it hopeless? Well, no, you'll just have to relax the requirements of randomness. Yahoo, for instance, has a neat service that will take you to a random web page that happens to be listed in its index. Another service takes you to a "random" Google web page by creating a random query from a set of words and then taking you to a random result. Because Google ranks results using PageRank, which estimates a page's importance from the links pointing to it rather than from its content, the "random" page you get is likely to be biased toward popular sites.
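
The random-query approach can be sketched roughly as follows. This is an illustration under stated assumptions, not how any particular service actually works: the word list, the function names, and the `search` callable are all hypothetical, and `fake_search` is a stand-in that fabricates results so the example runs without a real search engine.

```python
import random

# Hypothetical word list; a real service would use a much larger dictionary.
WORDS = ["apple", "bridge", "cloud", "engine", "forest", "guitar", "harbor"]


def random_query(words, n_words=2, rng=random):
    """Build a query from n_words distinct words chosen at random."""
    return " ".join(rng.sample(words, n_words))


def random_page(search, words=WORDS, n_words=2, max_tries=10, rng=random):
    """Return a random result URL for a random query, or None if all fail.

    `search` is a caller-supplied function (an assumption of this sketch)
    that takes a query string and returns a list of result URLs.
    """
    for _ in range(max_tries):
        results = search(random_query(words, n_words, rng))
        if results:
            return rng.choice(results)
    return None  # every query came back empty


def fake_search(query):
    """Stand-in search engine: fabricates three result URLs per query word."""
    return [
        f"http://example.com/{word}/{i}"
        for word in query.split()
        for i in range(3)
    ]


if __name__ == "__main__":
    print(random_page(fake_search))
```

With a real search engine behind `search`, the list of results would already be filtered and ordered by the engine's ranking, so even a uniform pick from that list inherits the engine's bias toward popular pages.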
