Archive for the ‘Scrapers’ Category

How to Build Link Pyramids

Yesterday, we discussed two ways you can use bursts of spammy links to help you make money. But just because you have the ability to spam thousands of links a day to a single page doesn’t mean it’s the most effective use of the tool.

What if I told you there is a way to launder those spammy links: a way to sift out the negative Google Karma and leave you with the pure link juice that Google uses to rank? Would you be interested?

If so, then you’re gonna want to know about:

Link Pyramids

Among the Great Wonders of the Web are Link Pyramids. These majestic towers of ranking excellence are built on the backs of spammy links just like the Ancient Pyramids were built by slaves. Today we’re going to define what Link Pyramids are, why they work so well and what they look like. Later this month we’ll get into the nitty gritty on building them.

The idea behind the Link Pyramid is that while link juice can pass from one site to the next, ranking penalties generally stop after one hop. It works on the premise that the lowest quality sites link upwards to the next higher quality sites in your network. Sites can link laterally and upwards but not downwards.

So here’s what the pyramid looks like . . .

[I started making a graphic of a Pyramid, but honestly, I just can’t be bothered. You know what a fucking pyramid graphic looks like – right? Great! Moving on . . .]

The Bottom

At the bottom of the Pyramid sits a network of sites you created on free blog hosts, Squidoo, or anywhere else you can get web space for free. These sites should not have ads on them. These sites will link to random other quality sites on the web and to your 2nd level sites. You will experiment with how few links you can spam to these sites to get them to acquire link juice vs. how many you can spam to them before they get banned. If you are not selling anything on these sites there is more leeway. The only type of links you are sending to these sites are those free spammy links from link software.

2nd Level of Pyramid

Then we have your 2nd level of the Link Pyramid. This is where you put your domain portfolio to work. This network of sites is distributed across cheap shared hosting accounts. The more shared hosting accounts you have for this purpose, the smaller your footprint will be. Hosting is really cheap these days, and you’ll probably be spending more each year on domain renewals than you will be on shared hosting.

This 2nd level of sites will get links from the bottom level of sites, but never link back down to them. These sites can still receive spammy links but tread lighter: you don’t want your account getting banned with the shared host. Buying cheap links to these sites also helps the network grow. Directory submission and press release type links are good to go at this level, as is moderate monetization. The primary purpose of these sites is to build a link farm to link to the Golden Crest of your Link Pyramid.

The Golden Crest

Here we have the sites that are ready for prime time. They have fantastic designs and flow. They are your niche authority sites. Their links come from Level 2 of the pyramid, link buys, link bait, and manual, targeted link spamming. The Golden Crest can make money, but the real purpose of these sites is to link to the Top of your pyramid.

The Top

At the Top of the Pyramid is the Target site: that’s the pristine, white-hat-looking site that you want to present to your customers and to rank in the search engines. This site has the bulk of your editorial content: your link bait. This site is an e-commerce site that is designed to make money. This is the site you’re paying bloggers and reviewers to link to. You’re only purchasing the highest quality links to this site and your link Ninjas are securing only the best quality links. This is the site that is optimized for the keywords you know make money. This is your money site.

Make sure to have every level of your site link to sites outside of your network and for fuck’s sake:

DO NOT INTERLINK YOUR ENTIRE NETWORK!!!

If you’re gonna do that, you might as well fill out a spam report on yourself with a list of all your sites and submit it to Google. Along similar lines, don’t use any Google products for these sites (like Analytics, AdSense, or AdWords . . . or even surfing to them with the Google Toolbar installed or in Google Chrome), with the possible exception of the site at the top of your Pyramid.

I’m sure some of you have some questions. Fire away if you do: this way I have more shit to blog about.

Adsense Scraper Cartoon

Google KidSense

The Most Cutting Edge SEO Exploits No One is Publishing

You know that the best SEO Black Hats are doing something more than scraping, using a site generator, comment spamming, and pinging to be raking in more than $100k per month.

But what is it?

Right now, there is way too much good stuff that I simply can’t publish on the SEO Black Hat blog. If I posted these tactics and exploits they would immediately get all the wrong kind of attention. The detailed conversations about how exactly to abuse search engine algorithms, generate massive traffic, and what other Black Hats are doing must remain underground to retain their effectiveness.

But what if I told you that you could discuss these exploits with me without paying my $500 an hour consulting fee? What if I told you there was a way to join in on the private, cutting edge discussions with some of the best Black Hats and web entrepreneurs in the world?

Would you be interested?

Because now you can . . .

Today is the official launch of the resource you’ve looked everywhere for but never found:

The Private SEO Black Hat Forum

Normally what you get on forums are people who don’t know anything talking with people who don’t want to say anything. You can occasionally find amazing tips on some forums: but you have to dig through 400 crappy posts just to find one post that is useful. That becomes a huge time sink.

How are the SEO Black Hat forums different?

Quality: We’re not going to have any contests to see who can make the most posts. That just creates tons of crap that no one wants to read. Our focus is on quality over quantity. Our primary concern is with succinctly answering one question: “What works?”

Sophisticated: Many of the topics we discuss are very advanced and require a high level of technical or business acumen to appreciate.

Expert Discussions: The SEO Black Hat forums are not for everyone and they may not be right for you. If you are relatively new to SEO or building websites, then do not join the SEO Black Hat Forums: you will be in way over your head. There are plenty of newbie forums out there for you – this is not one of them. Our forums are for successful web entrepreneurs to develop strategies that drive more traffic and generate more revenues.

Forum Membership Benefits

Access to Expert Advice and Discussions
We have both White Hat and Black Hat Experts that are already benefiting from new tool development, techniques, scripts and the sharing of ideas.
Some members you may already be familiar with include:

* CountZero from blackhat-seo.com (Black Hat)

* RSnake from ha.ckers.org (Web Security Expert)

* Dan Kramer from Kloakit (Cloaking Expert)

* Jaimie Sirovich from seoegghead.com (Token White Hat / SEO Geek)

There are several other members that you are certainly familiar with who are using handles for anonymity. We have others who are more focused on security, vulnerabilities, and coding. There are still more that you are likely unfamiliar with but are nevertheless web millionaires.

Databases – Large Datasets
If you want your sites to have massive amounts of unique content you need large data sets. The trading, discussion and posting of large data sets is going on right now on our forums.

Expired / Deleted Domain Tools
Want to use the same domain tool that I used to get a PageRank 6 site in the gambling space for just $8? This domain tool is available for members to use for free.

50% off on Kloakit – The Professional Cloaking Software

Scripts – Several useful scripts have already been posted, and interesting things you may not have thought of before are being discussed and developed.

Exploits and Case Studies: The really good stuff I can’t talk about on the SEO Blackhat Blog is being discussed on the SEO Black Hat Forums. Right now, some of the conversations include beating captchas, domain kiting, data mining, hoax marketing, XSS vulnerabilities as they relate to SEO, and much more.

Pricing: $100 per month.

The price will soon be rising significantly as more databases, hosted tools, scripts and exploits are added. However, once you lock in a membership rate it will never go up and you will continue to have access to everything.

So, if you think you’re ready for the most intense Black Hat SEO discussions anywhere, then here’s what you need to do:

1. Register at the SEO Black Hat Forums.

2. Go to the User CP and select Paid Subscription.

I’ll see you on the inside!

Every Search Engine Spammer Needs to Know…

Last Thursday, the boys at the ‘plex announced that they would be releasing 10 gazillion keywords for statistical analysis and other research. That perked my ears up right away. We love large data sets because they are the cornerstone of building massive spam sites, er, targeted niche aggregators.

The fine print is that you have to jump through some hoops to get the data – details are to be released, but you will likely have to be a member of the L.B.C.


“So tell me wuts up wit dis LBC thang?”

Wait . . . make that the LDC, the Linguistic Data Consortium. Their annual membership is $20k and they sometimes make you pay more for certain data sets.

The almost invisible print is pointed out by graywolf and confirmed by Matt Cutts in this threadwatch discussion.

When people sell a mailing list it’s extremely common for sellers to seed the list with some names that only exist for the purpose of catching people who are misusing it. I would have to assume the boys and girls at the plex would do the same. – graywolf

graywolf, you have a devious, devious mind. How many other people would consider seeding the terms with some nonsense phrases? I ask you–how many other people would come up with an idea like that?

Well, I guess I can think of a couple people.. – Matt Cutts

graywolf, yes you should take it as a compliment. Not to worry, I’m familiar with the practice. My favorite is Lye Close, the fake street in London: http://wiki.openstreetmap.org/index.php/Copyright_Easter_Eggs

billhartzer, sshhh. I was just watching boogybonbon find out about “google monitor query or googletestad” today. Don’t ruin the fun. – Matt Cutts

referring to boogybonbon’s post on keyword research.

[Image: Admiral Ackbar from Star Wars]

That’s right, it’s a trap.

We know about poisoning, er, seasoning keyword lists – in fact sometimes we’ll do it ourselves. However, this exchange confirms what a few of us have been thinking all along – that the search engines are on to this tactic and use it as well.

Are you using wordcatcher, overture, the google keyword suggestor or any data directly from the search engines? It seems there’s a good chance that it could be a trap. If you’re using poisoned data, that could certainly explain why your sites are only lasting 6-9 weeks in the SERPs.

Understanding this kinda puts a damper on the 400+ meg file (update: mirror with data) that contains all the AOL searches of 500k users for the last 3 months.

“Jacta alea est!” – Julius Caesar

It’s a war. Develop your own supply lines so you don’t have to get food from the enemy.

Recycling Front Page Digg Stories

1. William George of pugetsystems.com writes an article about Dual Processor vs Dual Core. It makes the front page of digg 111 days ago.

2. Frameworkx.com reposts the entire article and gets it to the front page of digg today.

Here’s the kicker: if you check out the Frameworkx article, they are even hotlinking the images from the original article. LOL. Now that’s ballsy.

From the Digg Comments:

Dude, all they did was just repost *our* article! http://www.pugetsystems.com/articles.php?id=23

They did attribute you in the bottom of the article – the poster just didn’t see that or chose to use the ‘Frameworkx’ repost for whatever insidious reason he might have.

They are stealing bandwidth for those images as well. They didn’t even have the courtesy to ask they were too lazy to try and make it look like they wrote the article.

Well, then the solution is to go in and do a little mod_rewrite magic in your apache config. for these guys. 🙂

Amen, Seumas. 111 days ago! http://digg.com/hardware/Dual_Processor_vs_Dual_Core

Indeed, why rack your brain trying to come up with something that can make the front page of digg? Just take anything that is 4+ months old, post it to a legit looking site, digg it about 60 times and you should be in business!

Although, I might avoid hotlinking any of the original images 😉

IP Delivery to Stop RSS “Content Thieves”

Tired of getting your content stolen from your RSS Feed and reposted on splogs? Here is a simple solution for you. I was inspired by RSnake to add some code to my .htaccess file to stop some of the people from scraping my feeds and will show you how to do the same.

Basically, all you need is the IP address of whoever is stealing your feed and you can deliver whatever content you want to them. One way you can get it is to “ping” the site – go to a DOS prompt and type “ping spammersite.com”. It’ll spit out the IP for you. Traceroute (tracert) will also work.
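If you would rather script the lookup than open a prompt, here is a minimal sketch that does the same name-to-address resolution as ping. The function name resolve_ipv4 is made up for this example. One caveat: the address a site is hosted on is not always the address its scraper fetches from, so it is worth confirming the result against your own access logs before acting on it.

```python
import socket
import sys


def resolve_ipv4(hostname: str) -> list[str]:
    """Return the sorted, de-duplicated IPv4 addresses a hostname resolves to."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr is (ip, port).
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Pass the scraper's hostname on the command line; defaults to localhost
    # so the script runs without touching the network.
    host = sys.argv[1] if len(sys.argv) > 1 else "localhost"
    try:
        print(host, "->", resolve_ipv4(host))
    except socket.gaierror as err:
        print(f"lookup failed for {host}: {err}")
```

Run it as "python resolve.py spammersite.com" (the file name is arbitrary) and compare the output with the IPs hitting your feed URL.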

In my case, I just redirected any request from their IP address back to their own feed. I’m not sure yet, but this may cause a loop on their server that makes it post things over and over again.

If you want to deliver any kind of custom content to a specific IP address, you just need to add these 3 lines of code to your .htaccess file.

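RewriteEngine on
# Match exactly one address: dots escaped, pattern anchored at both ends
RewriteCond %{REMOTE_ADDR} ^69\.16\.226\.12$
# Redirect every request from that address to the substitute feed
RewriteRule ^(.*)$ http://newfeedurl.com/feed [R=302,L]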

Here 69.16.226.12 is the IP address you want to target, and http://newfeedurl.com/feed is the custom content you want to send them.
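One detail worth getting right: the RewriteCond pattern is a regular expression, so an unescaped dot matches any character, and a pattern with no closing anchor such as ^69.16.226.12 will also catch 69.16.226.120 through 69.16.226.129. Below is a rough sketch of a helper that writes the three directives with the dots escaped and both ends anchored. The function name build_ip_delivery_rules and the explicit [R=302,L] flags are my own choices for illustration, not something the original snippet requires.

```python
import ipaddress
import re


def build_ip_delivery_rules(ip: str, target_url: str) -> str:
    """Return .htaccess lines that redirect one IPv4 address to target_url."""
    # Raises ValueError if ip is not a valid IPv4 address.
    address = ipaddress.IPv4Address(ip)
    # Escape the dots and anchor both ends so only this exact address matches.
    pattern = "^" + re.escape(str(address)) + "$"
    return "\n".join([
        "RewriteEngine on",
        f"RewriteCond %{{REMOTE_ADDR}} {pattern}",
        f"RewriteRule ^(.*)$ {target_url} [R=302,L]",
    ])


if __name__ == "__main__":
    print(build_ip_delivery_rules("69.16.226.12", "http://newfeedurl.com/feed"))
    # Show the over-matching problem with the loose pattern:
    loose = re.compile(r"^69.16.226.12")
    strict = re.compile(r"^69\.16\.226\.12$")
    for candidate in ("69.16.226.12", "69.16.226.120"):
        print(candidate, bool(loose.match(candidate)), bool(strict.match(candidate)))
```

Paste the printed lines into .htaccess as-is. Python's regex engine is close enough to the one mod_rewrite uses for this simple pattern that the comparison at the bottom is a fair preview of what Apache will match.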

You can always test what content will be delivered by changing the IP address to that of the machine you are working on. You can check your IP Address here.

You can be as creative as you wish with what you feed them. You can even use them to blog and ping for you if you like. The possibilities are endless.

So why is a site about Search Engine Spamming teaching people how to stop their content from being splogged? Because I am a dirty link whore and this is the kind of thing that people like to link to.

It might even be the type of story that people like to Digg.

Here’s What Happens When You Scrape a Hacker Site

Hilarious. Some dumbshit thought it would be a good idea to scrape content from RSnake at ha.ckers.org.

He thought wrong.

I’ve got to think this is just some sort of dumb joke, but that would be way too smart. No, this is just stupidity. So anyway, it was fairly trivial to figure out who was ripping my RSS feed. So it took me a few seconds to modify my document management system to do some IP delivery to the moron, and a few seconds of searching on the web for some nice prescription drug spam and poof! His site now looks like a bad spam doorway page and will continue to do so even more so with every post he indexes.

Not to mention he is registered with Godaddy. I won’t even start with the trouble you can get into when spamming from a Godaddy registered domain.

Nice work, RSnake.

Mark Cuban is a Hypocrite

Hilarious. Here is someone who owns a freaking basketball team calling someone else greedy!

All i know is that when the black hat hackers see easy money, they take it. I also know that they are greedy and a jealous bunch. The more they see the more they take . . .

Pot meet kettle. Kettle this is pot.

He then goes on in classic Cuban style to blur the lines between PPC Arbitrage, Splogging and Click Fraud.

Sorry Mark, but capitalizing on discrepancies in bid/ask markets is not fraud. Creating a “newsmaster site” or “Niche Aggregator” (he calls them splogs) is not fraud.

And who the hell is he to complain about splogs and scraper sites anyway? He owns Icerocket, one of the largest scraper sites on the Net!

Hey Mark, what percentage of the 36,200 icerocket pages indexed in Google are original content?

Well, 31,900 contain the word “tag” in the URL. 99% of those are pages like:

blogs.icerocket.com/tag/gifts
blogs.icerocket.com/tag/pictures
blogs.icerocket.com/tag/english

That is TEXTBOOK scraper. There is no original content and there are sponsor links that give nothing to the original content producers. And when I say nothing, I mean that even the citation links are nofollow. There is NO difference between that and what he calls splogging.
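For anyone who wants the proportions spelled out, here is the quick arithmetic on those counts. It takes the numbers above at face value, including my 99% figure, which is an eyeball estimate and not a measured count. The helper name share is just for this example.

```python
def share(part: float, whole: float) -> float:
    """Return part as a fraction of whole."""
    return part / whole


TOTAL_INDEXED = 36_200      # icerocket pages indexed in Google (from the post)
TAG_PAGES = 31_900          # pages with "tag" in the URL (from the post)
TAG_SCRAPER_RATE = 0.99     # author's estimate of tag pages that are pure scrape

tag_share = share(TAG_PAGES, TOTAL_INDEXED)
scraper_pages = TAG_PAGES * TAG_SCRAPER_RATE
scraper_share = share(scraper_pages, TOTAL_INDEXED)

if __name__ == "__main__":
    print(f"tag pages as share of index:      {tag_share:.1%}")
    print(f"estimated scraper pages:          {scraper_pages:,.0f}")
    print(f"scraper pages as share of index:  {scraper_share:.1%}")
```

On those inputs, roughly 88% of the indexed pages are tag pages, and about 87% of the whole index is scraped tag listings.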

From the wikipedia:

The term hypocrisy is also commonly used in a way which should be more specifically termed a double standard, bias, or inconsistency. An example would be when one honestly believes that one group of individuals should be held to a different set of morals than another group.

Hypocrisy also refers to the act of criticizing others for behavior which one engages in as well, or in other words, not practicing what you preach.

All that’s missing from that wikipedia entry is this picture:

Mark Cuban is a Hypocrite

Why YPN! Sidestepped Spam Site Question

From Searchview’s 5 Questions with David Zito of Yahoo Publisher Network we see Q&A #3 relating to search engine spam and scraper sites:

3. Spam has been a big problem with your competitors – scraper sites and such. What is your strategy for preventing its negative effects?

Click-through protection has been a top priority for Yahoo! since we pioneered the space in 1998. We’ve continually refined our systems and have built up numerous layers of defense against unwanted clicks.

The beta participants we are initially inviting have been pre-screened for content and site quality. In addition, we have built proprietary quality systems and processes to monitor offensive and inappropriate sites. We will continue to monitor and enhance these systems and processes during the beta period to develop a full set of quality criteria to be used when we open the program to all interested publishers.

Notice David avoided saying anything about spam or scraper sites? He only mentions “numerous layers of defense against unwanted clicks” (click fraud), and “offensive and inappropriate sites” (hate speech and poorly targeted sites).

Don’t get me wrong, I do believe that their respective Search Groups want the best possible, non spammy results to populate their SERPs.

However, neither YPN nor Adsense really want to stop their ads from appearing on spammy sites. They both know it’s a major money maker. As long as you stay within the TOS, you should be fine.