Groups know better than Google

Written on May 24, 2008 in Personal.
Adam Fortuna

Hey hey! I'm a developer who lives in Orlando, FL. Right now I work at Code School, listen to a lot of audiobooks, set way too many goals, write at minafi.com and tweet often.

How do you filter for what’s important to you? For knowledge that flows towards you, you have a lot of control over this. Through our feed readers we have precise control over what we follow, and through email we can filter out spam and label all mail automatically. When we go out searching for something, though, things aren’t quite as controllable. An earlier StackOverflow podcast talked briefly about this problem and their human-driven solution to it. The problem is simple: there is a lot of information out there, so how do you cut through the stuff you don’t want so that you only see what’s worth seeing?

Slashdot has been doing exactly this with their comments for as long as I can remember. Each comment on a Slashdot article can be rated (1-5) and have a reason attached to it (funny, informative, insightful, etc). This lets the average user glance over the comments and see those that stand out the most. Digg has a similar system based on up/down votes for comments, which seems to be the route StackOverflow is headed as well (which means they probably won’t have 50 options for displaying comments, but I can live with that). I’m a fan of this kind of system. It’s the same thing Digg or Tallyhoh use for submitted stories, just on a micro scale: that of user exchanges. But it doesn’t solve the bigger problem: everything not submitted. Everything Google finds, for example, it finds on its own by using spiders to explore every available link. The only data available about a given page is on the page itself, on the pages that link to it and possibly in the DNS record for the domain. That’s not much to go on when you’re trying to evaluate the importance of one site over another.
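The comment rating idea above is easy to picture in code. Here is a rough sketch in Python of filtering by score, not how Slashdot or Digg actually implement it. The `Comment` class, the `visible_comments` function, the threshold of 4 and the sample comments are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    text: str
    score: int   # moderation score on the 1-5 scale described above
    reason: str  # e.g. "funny", "informative", "insightful"


def visible_comments(comments, threshold=4):
    """Return comments at or above the threshold, highest-rated first."""
    kept = [c for c in comments if c.score >= threshold]
    return sorted(kept, key=lambda c: c.score, reverse=True)


# Hypothetical sample data, not real comments.
comments = [
    Comment("First post!", 1, "funny"),
    Comment("Here is how the spider finds pages...", 5, "informative"),
    Comment("Nice pun.", 4, "funny"),
]

for c in visible_comments(comments):
    print(c.score, c.reason, c.text)
```

The reader only ever sees what the group already decided was worth seeing, which is the whole point. It also only works for things people have actually rated.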

Eric Sink mentioned this in a very good article titled Baptists and Boundaries. Apparently he out-Eric’d a number of other much more recognizable Erics throughout history (there's lots more to that article by the way, and it's worth a read). The same problem exists in the Adam-verse as well: First Page: Adam (bible), Adam Curry (geek), Adam Lasnik (geek), Adam Kinney (geek), Adam Trachtenberg (geek). Other Notables: #15 Adam Savage (Mythbusters), #21 Adam Baldwin (Firefly, The man they call Jayne), #23 Adam Sandler (Actor, Comedian), #39 Adam West (Batman), #100+ Creation of Adam (Michelangelo). It makes me wonder how many people searching for “Adam” would actually be looking for geeks. Of the results on Google’s first page, 4 are geeks, 4 are technical topics and the other two are Wikipedia articles for the biblical Adam.

To be fair, the Wikipedia Adam is much more relevant to what people would actually want to find, and is grouped much more logically. The difference between the two is that the Google index is created automatically, while on Wikipedia the entire set of results was created manually by someone to resolve such ambiguous results. Google’s solution is to use search engine optimization practices to determine who’s on top, coupled with PageRank and incoming links, but that’s obviously flawed. They also order based on some personal preferences (if you’re logged in), which might help focus results to your interests. The blogs at the top might have semantic HTML, proper title tags and a decent amount of incoming links from similar geeky sites, but I wouldn’t say they’re more important than the Sistine Chapel. Microformats might help to give context to some of the results, but they don’t do anything for order. I wonder if it’s time for Google to start leveraging group think for order. Having a Google system that can be that easily gamed is a recipe for disaster, but I wonder if there’s a way the two could be merged to fix problems like this. Maybe Google really thinks geeks are that important. After all, Penn said it best:

Google doesn’t seem to fuck up much. - Penn
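For the curious, here is one way the "merge the two" idea from above might look. This is purely a thought experiment in Python, not anything Google does: the function names, the 0-to-1 link score, the `prior` damping value, the 30% weight and every number in the sample data are assumptions I picked for illustration.

```python
def vote_signal(up, down, prior=10):
    """Net approval between -1 and 1, pulled toward 0 when votes are few.

    The prior acts like a pile of phantom neutral votes, so a handful of
    ballots cannot swing the result much (a small defence against gaming).
    """
    return (up - down) / (up + down + prior)


def blended_score(link_score, up, down, weight=0.3):
    """Mix a link-based score in [0, 1] with the crowd's vote signal.

    weight is how much say the crowd gets; the rest stays with links.
    """
    return (1 - weight) * link_score + weight * vote_signal(up, down)


# Hypothetical numbers: a well-linked geek blog that searchers vote down,
# and a less-linked page that searchers overwhelmingly vote up.
results = {
    "geek blog": blended_score(0.8, up=5, down=40),
    "Creation of Adam": blended_score(0.6, up=900, down=30),
}

ranking = sorted(results, key=results.get, reverse=True)
print(ranking)
```

With these made-up inputs the painting moves ahead of the blog even though it starts with the lower link score. Whether a scheme like this could survive people actively trying to game it is another question entirely.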
