Scrapers
Scrapers, or content scrapers, are computer programs that go out onto the web and borrow content from other sites for you. They scrape what is on other sites so you have content for your own sites.
You know that the best SEO Black Hats are doing something more than scraping, using a site generator, comment spamming, and pinging to be raking in more than $100k per month.
But what is it?
Right now, there is way too much good stuff that I simply can’t publish on the SEO Black Hat blog. If I posted these tactics and exploits they would immediately get all the wrong kind of attention. The detailed conversations about how exactly to abuse search engine algorithms, generate massive traffic, and what other Black Hats are doing must remain underground to retain their effectiveness.
But what if I told you that you could discuss these exploits with me without paying my $500 an hour consulting fee? What if I told you there was a way to join in on the private, cutting edge discussions with some of the best Black Hats and web entrepreneurs in the world?
Would you be interested?
Because now you can . . .
Today is the official launch of the resource you’ve looked everywhere for but never found:
The Private SEO Black Hat Forum
Normally what you get on forums are people who don’t know anything talking with people who don’t want to say anything. You can occasionally find amazing tips on some forums: but you have to dig through 400 crappy posts just to find one post that is useful. That becomes a huge time sink.
How are the SEO Black Hat forums different?
Quality: We’re not going to have any contests to see who can make the most posts. That just creates tons of crap that no one wants to read. Our focus is on quality over quantity. Our primary concern is with succinctly answering one question: “What works?”
Sophisticated: Many of the topics we discuss are very advanced and require a high level of technical or business acumen to appreciate.
Expert Discussions: The SEO Black Hat forums are not for everyone and they may not be right for you. If you are relatively new to SEO or building websites, then do not join the SEO Black Hat Forums: you will be in way over your head. There are plenty of newbie forums out there for you – this is not one of them. Our forums are for successful web entrepreneurs to develop strategies that drive more traffic and generate more revenues.
Forum Membership Benefits
Access to Expert Advice and Discussions
We have both White Hat and Black Hat Experts that are already benefiting from new tool development, techniques, scripts and the sharing of ideas.
Some members you may already be familiar with include:
* CountZero from blackhat-seo.com (Black Hat)
* RSnake from ha.ckers.org (Web Security Expert)
* Dan Kramer from Kloakit (Cloaking Expert)
* Jaimie Sirovich from seoegghead.com (Token White Hat / SEO Geek)
There are several other members that you are certainly familiar with who are using handles for anonymity. We have others who are more focused on security, vulnerabilities, and coding. There are still more that you are likely unfamiliar with but are nevertheless web millionaires.
Databases – Large Datasets
If you want your sites to have massive amounts of unique content you need large data sets. The trading, discussion and posting of large data sets is going on right now on our forums.
Expired / Deleted Domain Tools
Want to use the same domain tool that I used to get a Page Rank 6 site in the Gambling Space for just $8? This domain tool is available for members to use for free.
50% off on Kloakit – The Professional Cloaking Software
Scripts – Several useful scripts have already been posted – interesting things you may not have thought of before are being discussed and developed.
Exploits and Case Studies: The really good stuff I can’t talk about on the SEO Blackhat Blog is being discussed on the SEO Black Hat Forums. Right now, some of the conversations include beating captchas, domain kiting, data mining, hoax marketing, XSS vulnerabilities as they relate to SEO, and much more.
Pricing: $100 per month.
The price will soon be rising significantly as more databases, hosted tools, scripts and exploits are added. However, once you lock in a membership rate it will never go up and you will continue to have access to everything.
So, if you think you’re ready for the most intense Black Hat SEO discussions anywhere, then here’s what you need to do:
1. Register at the SEO Black Hat Forums.
2. Go to the User CP and select Paid Subscription.
I’ll see you on the inside!
Last Thursday, the boys at the ‘plex announced that they would be releasing 10 gazillion keywords for statistical analysis and other research. That perked my ears up right away. We love large data sets because they are the cornerstone of building massive spam sites, er, targeted niche aggregators.
The fine print is that you have to jump through some hoops to get the data - details are to be released, but you will likely have to be a member of the L.B.C.

“So tell me wuts up wit dis LBC thang?”
Wait . . . make that the LDC, the Linguistic Data Consortium. Their annual membership is $20k and they sometimes make you pay more for certain data sets.
The almost invisible print is pointed out by greywolf and confirmed by Matt Cutts in this threadwatch discussion.
When people sell a mailing list it’s extremely common for sellers to seed the list with some names that only exist for the purpose of catching people who are misusing it. I would have to assume the boys and girls at the plex would do the same. - Greywolf
graywolf, you have a devious, devious mind. How many other people would consider seeding the terms with some nonsense phrases? I ask you–how many other people would come up with an idea like that?
Well, I guess I can think of a couple people.. - Matt Cutts
graywolf, yes you should take it as a compliment. Not to worry, I’m familiar with the practice. My favorite is Lye Close, the fake street in London: http://wiki.openstreetmap.org/index.php/Copyright_Easter_Eggs
billhartzer, sshhh. I was just watching boogybonbon find out about “google monitor query or googletestad” today. Don’t ruin the fun. - Matt Cutts
referring to boogybonbon’s post on keyword research.

That’s right, it’s a trap.
We know about poisoning, er, seasoning keyword lists - in fact sometimes we’ll do it ourselves. However, this exchange confirms what a few of us have been thinking all along - that the search engines are on to this tactic and use it as well.
Are you using wordcatcher, overture, the google keyword suggestor or any data directly from the search engines? It seems there’s a good chance that it could be a trap. If you’re using poisoned data, that could certainly explain why your sites are only lasting 6-9 weeks in the SERPs.
Understanding this kinda puts a damper on the 400+ meg file (update: mirror with data) that contains all the AOL searches of 500k users for the last 3 months.
“Jacta alea est!” - Julius Caesar
It’s a war. Develop your own supply lines so you don’t have to get food from the enemy.
1. William George of pugetsystems.com writes an article about Dual Processor vs Dual Core. It makes the front page of digg 111 days ago.
2. Frameworkx.com reposts the entire article and gets it to the front page of digg today.
Here’s the kicker: if you check out the Frameworkx article, they are even hotlinking the images from the original article. LOL. Now that’s ballsy.
From the Digg Comments:
Dude, all they did was just repost *our* article! http://www.pugetsystems.com/articles.php?id=23
They did attribute you in the bottom of the article - the poster just didn’t see that or chose to use the ‘Frameworkx’ repost for whatever insidious reason he might have.
They are stealing bandwidth for those images as well. They didn’t even have the courtesy to ask; they were too lazy to try and make it look like they wrote the article.
Well, then the solution is to go in and do a little mod_rewrite magic in your apache config. for these guys.
Amen, Seumas. 111 days ago! http://digg.com/hardware/Dual_Processor_vs_Dual_Core
Indeed, why rack your brain trying to come up with something that can make the front page of digg? Just take anything that is 4+ months old, post it to a legit looking site, digg it about 60 times and you should be in business!
Although, I might avoid hotlinking any of the original images.
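For what it's worth, the "mod_rewrite magic" suggested in the Digg comments could look something like the sketch below. It is a minimal example and not what anyone in this story actually deployed: example.com is a placeholder for the site that owns the images, and the list of file extensions is an assumption.

```apache
# Hypothetical hotlink guard; example.com stands in for the image owner's domain
RewriteEngine on
# Let through requests that send no Referer at all (direct visits, some proxies)
RewriteCond %{HTTP_REFERER} !^$
# Let through requests referred by the site's own pages
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
# Any other request for an image file is answered with 403 Forbidden
RewriteRule \.(gif|jpe?g|png)$ - [F,NC]
```

This would go in the .htaccess file of the site serving the images. Pages on other sites that embed those images would then get a 403 instead of the picture.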
Tired of getting your content stolen from your RSS Feed and reposted on splogs? Here is a simple solution for you. I was inspired by RSnake to add some code to my .htaccess file to stop some of the people from scraping my feeds and will show you how to do the same.
Basically, all you need is the IP address of whoever is stealing your feed and you can deliver whatever content you want to them. One way you can get it is to “ping” the site - go to a DOS prompt and type “ping spammersite.com”. It’ll spit out the IP for you. Traceroute (tracert) will also work.
In my case, I just redirected any request from their IP address back to their own feed. I’m not sure yet, but this may cause a loop in their server that posts things over and over again.
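If you want to deliver any kind of custom content to a specific IP address, you just need to add these 3 directives to your .htaccess file (the lines starting with # are comments).
# Enable the rewrite engine
RewriteEngine on
# Match the scraper's exact IP address (dots escaped, pattern anchored at both ends)
RewriteCond %{REMOTE_ADDR} ^69\.16\.226\.12$
# Redirect every request from that address to the replacement feed
RewriteRule ^(.*)$ http://newfeedurl.com/feed [R=302,L]
Where 69.16.226.12 is the IP address you want to target and http://newfeedurl.com/feed is the custom content you want to send them.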
You can always test what content will be delivered by changing the IP address to that of the machine you are working on. You can check your IP Address here.
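If you have more than one scraper to deal with, or you only want to touch feed requests, a variation along these lines may be a better fit. Treat it as a sketch under assumptions: it assumes your feed lives at /feed, and the 192.0.2. prefix is a documentation-only placeholder standing in for a hypothetical second scraper.

```apache
RewriteEngine on
# Exact match on the first scraper's address (dots escaped, both ends anchored)
RewriteCond %{REMOTE_ADDR} ^69\.16\.226\.12$ [OR]
# Hypothetical second scraper: any address starting with the placeholder prefix 192.0.2.
RewriteCond %{REMOTE_ADDR} ^192\.0\.2\.
# Only requests for the feed are redirected; the rest of the site is left alone
RewriteRule ^feed/?$ http://newfeedurl.com/feed [R=302,L]
```

Escaping the dots matters because an unescaped dot matches any character, and leaving off the trailing $ would also catch addresses such as 69.16.226.120.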
You can be as creative as you wish with what you feed them. You can even use them to blog and ping for you if you like. The possibilities are endless.
So why is a site about Search Engine Spamming teaching people how to stop their content from being splogged? Because I am a dirty link whore and this is the kind of thing that people like to link to.
It might even be the type of story that people like to Digg.
Hilarious. Some dumbshit thought it would be a good idea to scrape content from RSnake at ha.ckers.org.
I’ve got to think this is just some sort of dumb joke, but that would be way too smart. No, this is just stupidity. So anyway, it was fairly trivial to figure out who was ripping my RSS feed. So it took me a few seconds to modify my document management system to do some IP delivery to the moron, and a few seconds of searching on the web for some nice prescription drug spam and poof! His site now looks like a bad spam doorway page and will continue to do so even more so with every post he indexes.
Not to mention he is registered with Godaddy. I won’t even start with the trouble you can get into when spamming from a Godaddy registered domain.
Nice work, RSnake.
Hilarious. Here is someone who owns a freaking basketball team calling someone else greedy!
All I know is that when the black hat hackers see easy money, they take it. I also know that they are greedy and a jealous bunch. The more they see the more they take . . .
Pot meet kettle. Kettle this is pot.
He then goes on in classic Cuban style to blur the lines between PPC Arbitrage, Splogging and Click Fraud.
Sorry Mark, but capitalizing on discrepancies in bid/ask markets is not fraud. Creating a “newsmaster site” or “Niche Aggregator” (he calls them splogs) is not fraud.
And who the hell is he to complain about splogs and scraper sites anyway? He owns Icerocket, one of the largest scraper sites on the Net!
Hey Mark, what percentage of the 36,200 icerocket pages indexed in Google are original content?
Well, 31,900 contain the word “tag” in the URL. 99% of those are pages like:
blogs.icerocket.com/tag/gifts
blogs.icerocket.com/tag/pictures
blogs.icerocket.com/tag/english
That is TEXTBOOK scraper. There is no original content and there are sponsor links that give nothing to the original content producers. And when I say nothing, I mean that even the citation links are nofollow. There is NO difference between that and what he calls splogging.
From Wikipedia:
The term hypocrisy is also commonly used in a way which should be more specifically termed a double standard, bias, or inconsistency. An example would be when one honestly believes that one group of individuals should be held to a different set of morals than another group.
Hypocrisy also refers to the act of criticizing others for behavior which one engages in as well, or in other words, not practicing what you preach.
All that’s missing from that Wikipedia entry is this picture:

From Searchview’s 5 Questions with David Zito of Yahoo Publisher Network we see Q&A #3 relating to search engine spam and scraper sites:
3. Spam has been a big problem with your competitors - scraper sites and such. What is your strategy for preventing its negative effects?
Click-through protection has been a top priority for Yahoo! since we pioneered the space in 1998. We’ve continually refined our systems and have built up numerous layers of defense against unwanted clicks.
The beta participants we are initially inviting have been pre-screened for content and site quality. In addition, we have built proprietary quality systems and processes to monitor offensive and inappropriate sites. We will continue to monitor and enhance these systems and processes during the beta period to develop a full set of quality criteria to be used when we open the program to all interested publishers.
Notice David avoided saying anything about spam or scraper sites? He only mentions “numerous layers of defense against unwanted clicks” (click fraud), and “offensive and inappropriate sites” (hate speech and poorly targeted sites).
Don’t get me wrong, I do believe that their respective Search Groups want the best possible, non spammy results to populate their SERPs.
However, neither YPN nor Adsense really want to stop their ads from appearing on spammy sites. They both know it’s a major money maker. As long as you stay within the TOS, you should be fine.