Q&A: How Does Google Treat Duplicate Content?

Reteep Asked:

“How much of a problem is the duplicate content stuff for the bottom layer [of] autogenerated sites? Does it matter?”

Duplicate content is one of the great bogeymen of SEO. So many people are scared of it; so many people are worried that if they have duplicate content, their site will face a ranking penalty. Is it true? Does Google penalize you for having duplicate content? If so, how much? How can duplicate content hurt you? How can it help you? Well, sit back, because although I’m sure it’s been answered before, I’m gonna give you the straight dope on duplicate content and Google.

First off: what do we mean by “duplicate content”? Duplicate content means that the text of one web page matches another page’s. The match does not need to be 100% to count as duplicate content. It can be less than 50% and still be considered duplicate, especially if various chunks of the content can be found on other pages.
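To make the idea of a partial match concrete, here is a minimal sketch of one common textbook way to measure text overlap: break each page into overlapping word “shingles” and compare the sets. This is purely illustrative and is not a claim about how Google actually detects duplicates; the function names and the shingle size of 3 words are assumptions made for the example.

```python
def shingles(text, k=3):
    """Return the set of overlapping k-word chunks in text (k=3 is an arbitrary choice)."""
    words = text.lower().split()
    if len(words) < k:
        # Text shorter than one shingle: treat the whole text as a single chunk.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b, k=3):
    """Shared shingles divided by all distinct shingles across both texts."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)


def containment(a, b, k=3):
    """Fraction of a's shingles that also appear in b (chunks of a found on b)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa:
        return 0.0
    return len(sa & sb) / len(sa)
```

Containment is the more useful number for the “chunks found on other pages” case: a short page copied wholesale into a long one scores low on Jaccard similarity but high on containment.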

For example, any site that runs AP stories will have heaps of duplicate content. Google doesn’t penalize the news site for running the stories, but unlike last year, all (or almost all) of the AP stories are now hosted on Google as well. Examples:

http://www.google.com/hostednews/ap/article/ALeqM5g8-DEMtAE9q4i4ySQ0eV_qZefmRQD99D0RV80

http://www.google.com/hostednews/ap/article/ALeqM5iwJhPuY4ndVAdfJgwbiS3uh7uIGgD99CJOBO0

Interestingly, when I google “AP Interview: Hayden denies Congress not informed”, the top 10 results include Yahoo at number 1 and Google’s own story elsewhere in the top 10. Long term, expect Google to put itself at number 1 for all these types of queries.

[Image: SERPs for AP Stories]

The benefit of running an AP story is that if you can rank for the query (like Yahoo did), you can get search traffic. Plus, there’s a chance that when those stories run on XYZ newspaper site (or your site), the story will get picked up by Slashdot, Digg, Fark or whatever and receive a few hundred links. The only downside for the newspaper sites is that linking internally to these stories dilutes their internal link juice.

Clearly, Google does not penalize trusted sites for having duplicate content. In almost all cases, your penalty in Google is NOT because you have duplicate content. 99.9% of the time, the problem is somewhere else.

The way duplicate content can hurt you is if you have multiple copies of the same story on your own website. You will not get a “penalty” from Google, but you will dilute your internal link juice and could split any natural inbound links (and therefore ranking power) among all the copies of the page. This type of dilution could take an item that would have ranked and banish it to obscurity.
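The dilution argument is simple arithmetic, and a toy sketch shows the scale of it. The even split and the numbers used here are assumptions for illustration only; real links will not divide evenly, and link counts are only a rough stand-in for ranking power. The function name `split_links` is hypothetical.

```python
def split_links(total_links, copies):
    """Divide inbound links as evenly as possible among copies of a page.

    Toy model: assumes every linker picks a copy with equal likelihood.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    base, extra = divmod(total_links, copies)
    # The first `extra` copies get one additional link each.
    return [base + 1 if i < extra else base for i in range(copies)]


# One canonical page keeps everything; three copies each get a third.
single = split_links(300, 1)
tripled = split_links(300, 3)
```

Under these assumptions, a story that would have earned 300 links on a single URL ends up with about 100 links on each of three URLs, and each copy competes with a fraction of the strength the single page would have had.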

The same applies to spam sites. Unless your site is screaming “I AM A SPAM SITE”, the duplicate content penalty is not gonna hit. And since you would have gotten hit anyway by the human reviewer who just marked your site (and your network, if you weren’t careful) as spam, we can safely say that there is no official duplicate content penalty at Google.

Fire away with more questions; don’t worry, I intend to tackle some of the other ones that were asked this weekend later this week.

5 Responses to “Q&A: How Does Google Treat Duplicate Content?”

  1. elmarijanili says:

    Ok….. so I get it…to REALLY excel at the SEO thing and make some serious fucking wonga..moola…cash you have to build this HUGE …sorry SERIOUSLY FUCKING HUGE (300,000 sites at the bottom??!!!!) network of sites…all trying to get about 40 of them up the ranking to make serious money…not get identified as SPAM by Google or any other world dominating Search Engine…and try not to create too many footprints…all whilst tracking ALL these sites etc……..Yeah really a noob question I suppose but where would you start….today….if you did not have your experience and knowledge…but had read you posts..and the seriously fucking funny fight with Eli in the comments- if that really was him…..so yeah not forgetting your post about ‘Do it Fucking now’ …… and I am NOT looking for hand holding but more to pique my curiosity…. where WOULD you start Quadz?? Where would YOU start?? oh yeah and just again to keep my interest going……..how long could it take you to get some money through….again if you HAD to start again!?? Rock ON!!

  2. Brainerd Minnesota says:

    You would start at the beginning. I would advise you to join 1 or 2 good SEO forums and stick to them. Set your reader to stream all the SEO sites you can subscribe to. You don’t need to have a gigantic site to see good results. Unless you are talking about AdSense, I wouldn’t suggest going into it if you are looking for a quick buck. It is a lot of hard work that NEEDS to be done on a daily basis. I’m talking 8-10 hour days at least, especially if you’re doing this by yourself.

  3. Brainerd Minnesota says:

    This is what I always say on forums (like a broken record):

    There is no way Google will punish for duplicate content. If you think about it, there are legitimate reasons why people need to publish an article in more than one spot. I’ve always just assumed the worst they will do for duplicate content is basically treat it as a big ol’ nofollow link, that is, take away all the juice it may carry. I’ve had good results from spreading an article around. I do paraphrase it, though, just so it looks better in the search results. Great article; out of 65 new SEO articles in my reader, this is the one I clicked on. Nice work 🙂

  4. jeff01050 says:

    QuadsZilla,

    Can you share an explanation of link juice and how to use it effectively? I understand the term, but I need a map of how to direct it, or a game plan for making effective use of “link juice”.

    Thanks.

  5. Muskie says:

    What about scraped content? Sometimes they just scrape the first X characters/words, but someone stole a whole article I wrote for an online magazine the other day. He was too dumb to remove my name from the footer at the bottom, so it came up when I googled myself.

    http://bluegreennews.com/technology/business-tech/107-going-for-the-green-targeting-socially-conscious-consumers.html