What do search engines like so much about Wikipedia?

By Warren Cowan | 01 Jul 2007

If you've done a search in just about any vertical and are the type to take slightly more notice of who's where on the page, you'd have been hard pressed not to have repeatedly come across Wikipedia ranking just about everywhere. Initially confined to academic and purely reference searches, the strength of Wikipedia has seen it popping up for increasingly generic and more commercially oriented phrases, to the point where it seems that almost every keyword's search engine results page (SERP) has seen some sort of wiki creep.


And I know all of you have been picking up on this, because I've been to so many pitches and client meetings of late, where "what is it that Wikipedia is doing and why does Google like them so much?" is popping up as a pretty regular question.

So seeing as I'd repeated myself so much, I thought I'd share the light we've shed on it with everyone, here in our newsletter.

The first thing I have to say is actually something of a letdown, in that I don't have any startling revelations to reveal. There are no secrets to spill and there's no dirt to dig up. There is no overlooked, straightforward tweak or master stroke that you can crank up your web editors and knock out easily, to make Google love you and put you on the next rocket ship to page 1.

That said, Wikipedia is doing something right, partly because it's got some damn fine architectural structuring to it, and by that I mean in terms of hierarchies and interlinking between pages. But what it really boils down to is that it's just a very well written, high quality interest magnet. Consequently, Wikipedia attracts links from the community it serves and that, in SEO terms, is the less replicable part. But let's break it down into pieces:

Great content
Ok yes, 'content is king' etc. etc. yada yada. Wikipedia has it and lots of it. But it's far more than the presence of text that's doing the job here. The content is interesting, but more on that in a bit.

A well structured site through powerful interlinking
If you've browsed any page on Wikipedia, you'll have noticed that the page was littered with text anchored links, pointing to other pages on Wikipedia on related and similar topics. So if you're reading the Heathrow airport page, there'll be links to the Wikipedia BAA page, Gatwick Airport page, Aberdeen Airport page and so on.

The effect that this incestuous interlinking has is threefold.

- Firstly... The text in those outbound links (even internal ones) is used by Google to contextualise the page at the other end of that link. So if the link text says 'Gatwick Airport', search engines develop a pre-emptive assumption that the page at the other end is about Gatwick Airport. Of course they follow it, and hey presto, sure enough it is. Brownie points all round.

This keyword relevant interlinking within the site means that related pages on Wikipedia add relevancy to each other. Given that every Wikipedia page does this an awful lot, it means every page on Wikipedia is very heavily referenced by its peer pages, in the context of the keyword it should want to rank for.

- Secondly... On top of this, the act of linking to related pages through links that use similar keywords to the page that's doing the linking, partially reinforces the context of the page itself.

So the Heathrow Airport page links to the Gatwick Airport page through a text anchored link, which contextualises the Gatwick Airport page, helping it rank for Gatwick Airport. The fact that the outbound link on the Heathrow page contains the word 'airport', means that the Heathrow page is partially reinforced for the term 'airport', which of course will help in its ranking for 'Heathrow airport'.

- Thirdly… It also means that somewhere on the site, in the same vein, the Heathrow page has also benefited (like the Gatwick page) by being linked to from somewhere else on the site via a link that says 'Heathrow airport'. This helps to contextualise it for search engines for that keyword.
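To make the three effects a little more concrete, here is a toy sketch in Python. It is only an illustration of the idea, not how Google or any search engine actually works: the link list, the page names and the `anchor_context` function are all made up for the example. It simply counts the words in the link text pointing at each page (effects one and three) and the words in the link text each page sends out (effect two).

```python
from collections import Counter, defaultdict

# Hypothetical internal links: (source page, anchor text, target page).
# These are illustrative only, not real Wikipedia link data.
LINKS = [
    ("Heathrow Airport", "Gatwick Airport", "Gatwick Airport"),
    ("Heathrow Airport", "BAA", "BAA"),
    ("Gatwick Airport", "Heathrow Airport", "Heathrow Airport"),
    ("BAA", "Heathrow Airport", "Heathrow Airport"),
    ("BAA", "Gatwick Airport", "Gatwick Airport"),
]


def anchor_context(links):
    """Count anchor-text words per page, inbound and outbound.

    inbound[page]  -> words other pages use when linking TO the page
    outbound[page] -> words the page itself uses in its links OUT
    """
    inbound = defaultdict(Counter)
    outbound = defaultdict(Counter)
    for source, anchor, target in links:
        words = anchor.lower().split()
        inbound[target].update(words)   # contextualises the target page
        outbound[source].update(words)  # partially reinforces the source page
    return inbound, outbound


inbound, outbound = anchor_context(LINKS)

# Words that peer pages use to describe the Heathrow page.
print(dict(inbound["Heathrow Airport"]))
# Words the Heathrow page uses in its own outbound links.
print(dict(outbound["Heathrow Airport"]))
```

In this made-up data the Heathrow page is described as 'heathrow airport' by two peers, and its own outbound links contain the word 'airport' once, which is the circular reinforcement described above.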


Hopefully you're still with me, but I'll apologise now as this doesn't get any easier. If you are with me and thinking this goes round in circles, you'd be exactly right. This is something you can do and maybe already have done, to some extent, on your website. If you do, it will help. But it's not the panacea!

The Wikipedia secret
The real Wikipedia advantage (and here's the kicker) is that when all these pages link, they pass two things: Context and Authority. Context determines what you are relevant for and is passed through the content or text of the link, while Authority determines how well you should rank for that context and is passed by the physical link itself. So you need both to truly rank effectively.

A link from a Wikipedia page to another page, like we've described above, is like a raw lightning bolt in terms of Context and Authority and is the real reason why this well tuned structure works so well.

So let's look at why that is!

The pure link strength of the Wikipedia domain
The first thing we should understand is that each of those pages of Wikipedia is on a very powerful and link rich domain, Wikipedia.org itself. This is a pagerank 8 domain, with something like 4m links pointing at it (according to Yahoo!). In other words, 4 million other pages on the web rated Wikipedia by linking to it. Even if you only know the bare minimum about link popularity, it's not hard to see that this makes Wikipedia an 800lb gorilla in link popularity terms, and so very much a trusted authority site.

So any child page on the site immediately benefits from having link rich, powerful parents. This parent domain of course donates some of its link popularity or pagerank (for want of a better word) to all its child pages, via some navigational link somewhere on the site. You could call this a parental recommendation - think of it as nepotism for the web. This gives those pages a powerful ability to rank.
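If you want to see the nepotism in numbers, the sketch below runs a simplified, textbook-style PageRank calculation over a tiny made-up link graph. Treat it as an assumption-laden illustration: the page names, the graph, the damping factor of 0.85 and the `pagerank` function are all hypothetical, and real search engines use far more signals than this. The point is only that a child page linked from a well linked parent ends up with more popularity than one the parent never links to.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank.

    graph maps each page to the list of pages it links to.
    Every link target must also appear as a key in graph.
    """
    pages = list(graph)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in graph.items():
            if outlinks:
                # A page splits its donated rank evenly across its links.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph: three outside sites link to the parent ("home").
# The parent links to one child but not to the other.
GRAPH = {
    "external_a": ["home"],
    "external_b": ["home"],
    "external_c": ["home"],
    "home": ["child_linked"],
    "child_linked": ["home"],
    "child_orphan": ["home"],
}

ranks = pagerank(GRAPH)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:13s} {score:.3f}")
```

In this toy graph the child the parent links to inherits most of the parent's popularity, while the child nobody links to is left with only the baseline share.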

But it doesn't stop there. There are 3rd party recommendations too
On top of this powerful and generous nepotism, each Wikipedia page also typically attracts lots of links from the outside web, from people who have an interest in the topic it discusses.

For example, a group of people in the scientific community might all link to the Wikipedia page on some scientific discipline, because they feel it provides good reference material for their site's readers, or because they have quoted it in some way. This link magnetism happens right across the site for just about every page on Wikipedia. Yes, even the Heathrow Airport page, which has accrued some 1,700 links. This gives each page an even greater ability to rank and it's not so easy to replicate for your own site, despite you now knowing it to be the case.

This is because the only reason these people linked to Wikipedia is that the page (like the vast majority of pages on Wikipedia) was purposefully written by experts (pseudo and official) to be accurate, factual, comprehensive and useful reference material on the topic.

Wikipedia is so useful and info rich that it is bordering on Earth's time capsule. If aliens stumbled upon this rock in 2 eons' time, to find it long devoid of life, they could boot up the internet (I'm sure they'd know how, it'd be like cave paintings to them) and be able to find accurate and detailed information on just about everything.

So bottom line, each page of Wikipedia serves as a focal point for members of the relevant community and those seeking info on a topic; and hence it naturally attracts links from the community it serves, who cite it in their link lists, blogs, tags, posts etc. So Wikipedia is bloated with 3rd party recommendation like no one else on the web, out of nothing more than the fact that people find it eminently useful.

It's the best example there is of good content being king. And as I alluded to above, it's much more than the presence of text. I actually dislike the term 'content'. It's such an overused word in this industry, used to describe everything and, at the same time, to mean nothing. In my eyes, content is not text, images or function. It's a stand-in for proposition. The Wikipedia proposition is one of the strongest out there. So if you want links... stop gunning for content. Develop a useful proposition, as this will be by far the best driver of 3rd party links and your best chance at replicating the Wikipedia advantage.

Now let's combine those things
So basically, your average topical Wikipedia page has got:

1. Incredibly powerful parents which donate their massive link popularity to it.

2. They also donate it via a keyword anchored text link to contextualise that donation of popularity and help the page's ranking.

3. Then each page is packed full of tons of rich relevant content.

4. Then each page gets heaps of direct recommendations from other relevant topic sites, getting further donation of popularity from outside the site, also via keyword anchored text links.

5. Then each page links out to lots of other Wikipedia and non Wikipedia pages, via keyword anchored text links further reinforcing its own relevance.

6. And then, if that wasn't enough, because each Wikipedia page is linked to by many other Wikipedia pages, the popularity they received from other websites is donated across to each page internally through all the links.

Using the nepotism example again, if rich parents set up their kids to succeed, and they do really well on their own, build their own reputation and then move in the same circle and get references from others who have done the same, why wouldn't you expect the kids to succeed? This is the case with Wikipedia pages.

Hopefully I haven't lost you throughout all that. But the bottom line is that Wikipedia is a collection of interest magnets within a very powerful family, all cross fertilising each other through a very well tuned architecture.

At the end of the day though, one thing you need to be aware of is that whilst the structure may be there, if there wasn't great, compelling content in that structure, no one would have linked to any part of Wikipedia in the first place. Consequently this finely tuned architecture wouldn't be funnelling anywhere near the kind of link popularity around itself that it is, and wouldn't rank anywhere near as well as it does.

So copying the structure is not a guarantee of success and won't cause Google to like you. Structure merely helps classification and what Google really wants is popular content. Something with a visibly popular and well liked proposition.

In fact, I guess the real lesson here is that it's great content and proposition that create massive interest and popularity, which, when channelled into a well engineered structure, can breed great success. But no links means no rank, and no links comes from no interest in your proposition.

So really…Google doesn't like Wikipedia at all!

You like Wikipedia...and Google likes what you like.

