Wired Splogs

Some of the factors for spotting splogs, according to Wired:

“If we see 10,000 pings within 60 seconds, and all the blogs point to the same Web site, it’s really easy to recognize that as a link farm,” Sifry says.

1. Ping frequency: neither too often nor too regularly.
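
To make the ping-burst idea concrete, here is a minimal sketch of the kind of check Sifry describes. Everything in it is an assumption for illustration: the function name find_ping_bursts, the (timestamp, target site) input format, and the default limits (taken from his "10,000 pings within 60 seconds" remark). It is not Technorati's actual code.

```python
from collections import defaultdict, deque


def find_ping_bursts(pings, window_seconds=60, threshold=10_000):
    """Return the set of link targets that received at least `threshold`
    pings inside any window shorter than `window_seconds`.

    `pings` is an iterable of (timestamp_in_seconds, target_site) tuples.
    This input format is hypothetical, chosen only for the sketch.
    """
    recent = defaultdict(deque)  # target -> timestamps still inside the window
    flagged = set()
    for ts, target in sorted(pings):
        window = recent[target]
        window.append(ts)
        # Drop timestamps that have fallen out of the sliding window.
        while ts - window[0] >= window_seconds:
            window.popleft()
        if len(window) >= threshold:
            flagged.add(target)
    return flagged
```

A real detector would presumably also look at how regular the intervals between pings are, which this sketch ignores.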

Like most blogs, Some Title consists of a number of 50- to 100-word posts (incoherent ones, in this case), all with hyperlinks to other Web sites. In real blogs, the hyperlinks’ anchor text – the word or phrase users click on – is generally something innocuous like “previous post” or “interesting discussion.”

2. Sure, real blogs don’t usually link to “buy viagra online” . . .

The links in ordinary blogs usually take users to well-known sites like Flickr and YouTube or prominent blogs like Talking Points Memo and Boing Boing. By contrast, each link in Some Title takes the user to a spam Web page or another splog.

3. Link to authority sites.

These sites, moreover, often have odd-looking, superlong URLs that are packed with keywords, because search engines tend to award high ranks to Web sites with keywords in their title, and sploggers are constantly looking for ways to increase their visibility in search engines. One LiveJournal splog that mentioned me, for example, was called New-york-agency-direct-mail-insurance-marketing. The grave-robbing Web site had the absurd address www.1michaelgraves7.info/conducting-from-the-grave/grave-robbing-in-ventura-california-1985.html. “If it’s a Blogspot blog with more than two dashes, it’s spam,” Mullenweg says. Simply checking for dashes and search terms in links, in other words, will eliminate many splogs.

4. Too many keywords in URL

Another giveaway: Both Some Title and the grave-robbing page it links to had Web addresses in the .info domain. Spammers flock to .info, which was created as an alternative to the crowded .com, because its domain names are cheaper – registrars often let people use them gratis for the first year – which is helpful for those, like sploggers, who buy Internet addresses in bulk. Splogs so commonly have .info addresses that many experts simply assume all blogs from that domain are fake.

5. Is anyone still buying dot info domains? If I were a search engine I would simply not index any content from that TLD. Problem solved.

By looking for multiple dashes, .info domains, and other trip wires, says Technorati software architect Ian Kallen, his company can deconstruct the links and content in every new blog post, as well as all the other elements of the page. In essence, he says, “you’re going after the money – what they have to do to get money. And you can use this to spot the abusers.”
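
The URL trip wires mentioned above (multiple dashes, .info domains) are easy to sketch in code. The function below is a rough illustration only: the name url_red_flags, the flag labels, and the choice to count dashes in both the host name and the path are my assumptions, not anything Technorati or Mullenweg has published.

```python
from urllib.parse import urlparse


def url_red_flags(url, max_dashes=2):
    """Return a list of heuristic spam flags for `url`.

    The flag names and thresholds are hypothetical; `max_dashes=2` follows
    the "more than two dashes" rule of thumb quoted in the article.
    """
    # urlparse only fills in the host when a scheme is present.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = (parsed.hostname or "").lower()
    flags = []
    if host.endswith(".info"):
        flags.append("info-tld")
    if host.endswith(".blogspot.com") and host.count("-") > max_dashes:
        flags.append("dashed-blogspot-host")
    if parsed.path.count("-") > max_dashes:
        flags.append("dashed-path")
    return flags


# The grave-robbing address from the article trips two of the wires.
print(url_red_flags(
    "www.1michaelgraves7.info/conducting-from-the-grave/"
    "grave-robbing-in-ventura-california-1985.html"
))
```

Note that the dashed-path check will also fire on plenty of legitimate posts, since many blog platforms build post URLs from the title. In practice a signal like this would more plausibly feed a score than act as a verdict on its own.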

Ask yourself, “If I wanted to eliminate spam from my search engine, what would I look for?” Then make sure you’re not setting off any of those red flags.


2 Responses to “Wired Splogs”

  1. Pancho says:

    Good find Quadszilla.

    A great list of things not to do.

  2. ppzhao says:

    “If it’s a Blogspot blog with more than two dashes, it’s spam”? Doesn’t Blogspot (Google’s Blogger) automatically create URLs from the blog title, putting hyphens between the words (up to 6 or 8 or something like that)? So does that mean that if the title of the blog has more than 3 words, it is perceived as spam?