It’s amazing how far the Web has come, and how quickly. The original Web was based on the idea of interlinking informational resources, the hypertext nature of things. The original Backrub (AKA Google) research project was based on this notion, with PageRank using a unique combination of academic citation and democracy to evaluate websites. Unfortunately, the NoFollow attribute was a terrible idea, and its use on Twitter and other social network sites has resulted in a horrible perversion of the concept of the interconnected Web.
If you aren’t a professional SEO or linear algebra geek, here is the summary: take the entire web as individual pages and give each page a single vote (in the linear algebra, each page gets 1/N of a vote, where N is the number of pages, so we’re dealing with fractional probabilities). Then take each page’s vote, divide it evenly among the pages it links to, and assign those shares to the linked pages for the next round. So if 20 pages each link to me and 4 other sites, I get 20 votes of 1/5 each, or 4 in the next round. In the round after that, my score of 4 is divided among my outgoing links and assigned to them; rinse, lather, and repeat. Over the iterations, this ranking, PageRank, moves towards the sites with the most incoming links.
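For the code-minded, here is a rough Python sketch of one round of that vote-splitting, using the 20-pages-linking-to-me example from above. The function name `next_round` and the page names (`me`, `fan0`, `other0`, and so on) are made up for illustration. This is only the simplified version described in this post: the published PageRank algorithm also includes a damping factor, which is left out here, and how pages with no outgoing links are treated below is my own simplification.

```python
def next_round(votes, links):
    """Run one round: each page splits its current vote evenly
    among the pages it links to."""
    nxt = {page: 0.0 for page in votes}
    for page, vote in votes.items():
        targets = links.get(page, [])
        if not targets:
            # Simplification (an assumption of this sketch): a page with
            # no outgoing links passes nothing on.
            continue
        share = vote / len(targets)
        for target in targets:
            nxt[target] = nxt.get(target, 0.0) + share
    return nxt


# The example from the text: 20 pages each link to me and 4 other sites.
fans = [f"fan{i}" for i in range(20)]
others = [f"other{i}" for i in range(4)]
links = {fan: ["me"] + others for fan in fans}

# Every page starts with a single vote.
votes = {page: 1.0 for page in fans + others + ["me"]}

after_one = next_round(votes, links)
print(round(after_one["me"], 6))  # 4.0, i.e. 20 votes of 1/5 each
```

Calling `next_round` again on its own output is the "rinse, lather, and repeat" part.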
This is academic in the sense that a paper cited more often is probably better; likewise, an Internet resource linked to more often is probably better. It’s democratic in that each page starts with a single vote, whether put out by a powerful corporation or someone with a website dedicated to their dog.
Now there are plenty of mathematical models out there that can help you understand how PageRank works and how much it matters, but essentially, it let the rankings on the Web be determined by the publishers of content. Yes, you can game PageRank, and it used to really matter a lot, but it was also nice to think of a world where all publishers were considered.
Now, 12 years ago (1997), having a website meant that you were a corporation that could pay someone to build one, a university student who played with a text editor, or a computer geek interested in running a website. Animated GIFs were all the rage. A few years later, when the Backrub project at Stanford was a PhD project, Stanford’s academic use of hypertext made this kind of ranking a perfect way to identify useful content. As it moved to the general Internet (or critical subsets), it did a good job of identifying the most relevant link.
If you remember the pre-Google search engines, simply turning up the New York Yankees home page for the phrase New York Yankees was quite an accomplishment. Many major corporations didn’t have websites, or if they did, the site hung off a local ISP’s domain name, not their own. So in that regard, the links worked wonderfully. Given a sizable number of pages in the index (Google used to brag about index size), adding one or more pages wouldn’t give you enough votes, but spamming by the millions would. Google wrote more and more sophisticated methods for detecting spammers, and as its share of searches increased, it was targeted more and more by spammers.
The worst link spamming was the harassment of community forums: posting garbage links in forums in search of PageRank, link popularity, and traffic. Many semi-abandoned blogs and forums, or the older posts and threads on them, were filled with spam. Our SEO-friendly and well-ranked TV show site, The OC Files, was basically ruined by spammers. So Google created an attribute to add to links that said “don’t follow this link, I don’t know if it’s good or not.”
Why would anyone create a link without knowing its value? The webmaster wouldn’t, but if the site has a section for user-created content, the users might post links there just to collect the votes. By adding NoFollow, you strip the link of its ranking value to the spammer, helping Google and hopefully reducing the spam on your own site.
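In practice, NoFollow is just `rel="nofollow"` on the anchor tag. As a minimal sketch of what blog or forum software might do with user-submitted links, something like the following would work. The function name `render_user_link` and the `trusted` flag are hypothetical, invented for this example; they are not taken from any particular package.

```python
from html import escape


def render_user_link(url, text, trusted=False):
    """Build an anchor tag for a link submitted by a user.

    Untrusted links get rel="nofollow", which tells search engines
    not to count the link as a vote for the target page.
    """
    rel = "" if trusted else ' rel="nofollow"'
    # Escape both values so user input cannot break out of the tag.
    return f'<a href="{escape(url, quote=True)}"{rel}>{escape(text)}</a>'


print(render_user_link("http://example.com/", "my site"))
# <a href="http://example.com/" rel="nofollow">my site</a>
```

Links the webmaster writes personally would go out with `trusted=True` and keep their full vote.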
Now let’s look at Twitter and the other Social Media sites. The links on Twitter and Facebook are some of the MOST legitimate “votes” on the modern web. If I read an article of interest, I no longer put it up on my website, I share it on Facebook and maybe Tweet about it, sharing this information with my friends and followers. That’s a huge vote.
If you treated all Twitter users the way PageRank treats pages, and allocated “votes” based on how many people each user was Following, you could probably identify the most valuable Twitter users and weight their links accordingly. A bit of Twitter’s structure hurts that, as internally the links go to those Following you and those you are Following, but regardless, most people now share content they like not with a blog, but with this micro-blogging power.
By NoFollowing these links, apparently at the request of Google, Twitter is discounting the value of this “voting” and reserving the voting for those still maintaining a normal web presence. As more and more of our online social lives are on these sites, and less and less on blogs and personal home pages, the individual “votes” are getting thrown away.
Perhaps Google shouldn’t think of themselves as a Web Democracy, but more of a Web Republic, where only the “white male property owners of the web” (the people running web sites, i.e. the pros) get to vote, and the “rabble” of common users on the social web get no say. When Twitter was a haven of coastal users getting spammed out to game Google, the change made sense. With its growing importance as a finger on the pulse of the Internet, it’s terrible to discount it.
If that isn’t “Doing Evil,” then Google doesn’t know what that is. Regarding NoFollow, I don’t use it on my sites, don’t use it when doing site management for clients, and generally avoid it except in the case of comment sections and other spam-infested areas… and even then it’s only when I’m using pre-canned software that adds it. Using it to manage “Google Juice,” as recommended by SEOs and to some extent Google, means adding non-standard HTML markup solely for the purpose of affecting search rankings, and that’s SEO spam, pure and simple.