Every Search Engine Spammer Needs to Know…

Be careful with your keyword research, because Google may have poisoned your keyword list.

Last Thursday, the boys at the ‘plex announced that they would be releasing 10 gazillion keywords for statistical analysis and other research. That perked my ears up right away. We love large data sets because they are the cornerstone of building massive spam sites, er, targeted niche aggregators.

The fine print is that you have to jump through some hoops to get the data - details are to be released, but you will likely have to be a member of the L.B.C.


“So tell me wuts up wit dis LBC thang?”

Wait . . . make that the LDC, the Linguistic Data Consortium. Their annual membership is $20k and they sometimes make you pay more for certain data sets.

The almost invisible print is pointed out by graywolf and confirmed by Matt Cutts in this Threadwatch discussion.

When people sell a mailing list it’s extremely common for sellers to seed the list with some names that only exist for the purpose of catching people who are misusing it. I would have to assume the boys and girls at the plex would do the same. - graywolf

 

graywolf, you have a devious, devious mind. How many other people would consider seeding the terms with some nonsense phrases? I ask you–how many other people would come up with an idea like that?

Well, I guess I can think of a couple people.. - Matt Cutts

 

graywolf, yes you should take it as a compliment. Not to worry, I’m familiar with the practice. My favorite is Lye Close, the fake street in London: http://wiki.openstreetmap.org/index.php/Copyright_Easter_Eggs

billhartzer, sshhh. I was just watching boogybonbon find out about “google monitor query or googletestad” today. Don’t ruin the fun. - Matt Cutts

 

(Matt is referring to boogybonbon’s post on keyword research.)

[Image: Admiral Ackbar from Star Wars]

That’s right, it’s a trap.

We know about poisoning, er, seasoning keyword lists - in fact sometimes we’ll do it ourselves. However, this exchange confirms what a few of us have been thinking all along - that the search engines are on to this tactic and use it as well.

Are you using wordcatcher, Overture, the Google keyword suggestor or any data taken directly from the search engines? If so, there’s a good chance that it could be a trap. If you’re using poisoned data, that could certainly explain why your sites are only lasting 6-9 weeks in the SERPs.

Understanding this kinda puts a damper on the 400+ meg file (update: mirror with data) that contains all the AOL searches of 500k users for the last 3 months.

“Jacta alea est!” - Julius Caesar

It’s a war. Develop your own supply lines so you don’t have to get food from the enemy.
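
One way to picture "your own supply lines" is to check a third-party keyword list against the searches that actually brought visitors to your own sites. The Python below is only a rough sketch under stated assumptions: the function names are made up for illustration, and the referrer parameters (`q`, `p`, `query`) are a guess at how search engines pass the phrase along. A keyword that never shows up in your own logs is not proven to be a seed; it is simply unconfirmed.

```python
from urllib.parse import urlparse, parse_qs

# Assumption: these query-string parameters carry the search phrase
# ("q" for Google-style referrers, "p" for Yahoo-style ones).
QUERY_PARAMS = ("q", "p", "query")


def query_from_referrer(referrer):
    """Return the lowercased search phrase found in a referrer URL, or None."""
    params = parse_qs(urlparse(referrer).query)
    for name in QUERY_PARAMS:
        if name in params:
            # Collapse runs of whitespace so "Blue  Widgets" matches "blue widgets".
            phrase = " ".join(params[name][0].lower().split())
            if phrase:
                return phrase
    return None


def split_keywords(third_party_keywords, referrers):
    """Split a third-party list into (seen_in_own_logs, unconfirmed)."""
    own = {q for q in map(query_from_referrer, referrers) if q}
    seen, unconfirmed = [], []
    for keyword in third_party_keywords:
        normalized = " ".join(keyword.lower().split())
        (seen if normalized in own else unconfirmed).append(keyword)
    return seen, unconfirmed
```

Feed it the referrer field from your own access logs and hold the unconfirmed terms back until a second source vouches for them.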

15 Responses to “Every Search Engine Spammer Needs to Know…”

 

woah! simple, but effective

“So tell me wuts up wit dis LBC thang?” :)

A friend posted the other day Matt’s comment about Boogybonbon.com. But what does it have to do with all this?

Wordtracker is a good supply line. Does anybody disagree?

DevilKing,

Google spit out “google monitor query or googletestad” as one of the top searches. But guess what? No one is searching for that. So when you build a massive site and page 34,532 is optimized for that, it gets a red flag for human review. When they see your site is computer generated, they boot your ass.

Get it?

hmm.. But that’s the thing that is interesting: I’m not scraping Google, and people are searching for google_monitor_query… much to think about!

Thanks!

[…] After my post about my keyword research system, seoblackhat posted about another thread on another site talking about the group LDC selling a big old keyword list and provided a link to this site. […]

update on that AOL keyword list:

http://battellemedia.com/archives/002792.php

if anyone got it before they took it down. PLEASE email me.

hmm - yes, it looks like that is an actual query - but probably one by Google - but how could it possibly outrank food or GOOGLE for that matter? That’s impossible.

What Matt is saying is that Google can send high volumes of queries to make it appear as though people are searching for any term they like.

Imagine if they weren’t the “Do no Evil” crowd. How easy would it be for them to manipulate mid-cap stocks? Or world politics, for that matter.

I agree 100%, and I’m rather pissed that it was such a low punch by Google, and that I really did not give it much thought at the time as I was so excited about the system itself working really well.

Not only does it show that Google is willing to stoop low to kick others in the balls, they are also willing to send mass robots to other search engines to try and seed their search results. Very dirty in my book!

What are some good sources of natural keyword data then? Your own log files are one place to look, but it’s good to have a head start when you’re starting to build for new niches.

ok, i give up. How The F*** do you open a 40mb txt file without crashing ur pc???
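
For a text file too big for an editor, one option is to not open it at all and stream it line by line from a script instead. A minimal Python sketch, assuming one query per line (the real file's layout may differ) and using a made-up function name:

```python
from collections import Counter


def count_queries(path, encoding="utf-8"):
    """Stream a text file line by line and count each distinct non-empty line."""
    counts = Counter()
    # Iterating over the file handle reads one line at a time, so memory use
    # depends on the number of distinct queries, not on the file size.
    with open(path, encoding=encoding, errors="replace") as handle:
        for line in handle:
            query = line.strip().lower()
            if query:
                counts[query] += 1
    return counts
```

Calling `count_queries("keywords.txt").most_common(20)` would then list the twenty most frequent lines, where `keywords.txt` is a placeholder file name.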

[…] Interesting take on keyword list freebies . Are you using wordcatcher, overture, the google keyword suggestor or any data directly from the search engines? It seems there’s a good chance that it could be a trap. If you’re using poisoned data, that could certainly explain why your sites are only lasting 6-9 weeks in the SERPs. […]

[…] But it came as a complete surprise to me to read the claim that Google, and possibly other search engines, engage in so-called keyword seeding, that is, they pollute real search query statistics with junk queries, regardless of who is collecting those statistics. As Seoblackhat puts it, don’t be surprised that your sites only survive in the search results for a couple of weeks if you are using poisoned data. […]

[…] 1. Popular sites with nice user interfaces that are basically glorified scraper sites 2. The New York Times cloaking; in fact, of the top 500 alexa sites, less than 10% show the same content all over the world 3. Major sites creating computer generated content 4. Google Poisoning Keyword lists (by referral spamming) 5. Massive site Networks purchased and built primarily to link farm […]
