Wikimedia Dump Service

The following dumps are available:

special | wikibooks | wikinews | wikipedia | wikiquote | wiktionary | images

At the root of each Wikimedia project dump, you will find a listing of languages and the MD5 checksums for all files available in the tree. Each language directory contains these XML dumps (there is a quick streaming example after the list):

pages_current.xml.gz – Link to the latest current-pages dump
all_titles_in_ns0.gz – All titles in the main namespace
pages_full.xml.gz – Link to the latest full-pages dump
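
If you want to do anything with these files programmatically, a streaming parser is the way to go, since the bigger dumps do not fit comfortably in memory. What follows is a minimal Python sketch, assuming a locally downloaded pages_current.xml.gz; the filename and the export namespace version are assumptions, so check the xmlns attribute on the <mediawiki> root element of your copy and adjust.

# Minimal sketch: stream page titles and text sizes out of a current-pages dump.
# DUMP and NS are assumptions; the namespace version varies between dump releases.
import gzip
import xml.etree.ElementTree as ET

DUMP = "pages_current.xml.gz"                      # assumed local filename
NS = "{http://www.mediawiki.org/xml/export-0.3/}"  # check your dump's xmlns

with gzip.open(DUMP, "rb") as f:
    for event, elem in ET.iterparse(f, events=("end",)):
        if elem.tag == NS + "page":
            title = elem.findtext(NS + "title")
            text = elem.findtext(NS + "revision/" + NS + "text") or ""
            print(title, len(text.split()))
            elem.clear()                           # keep memory use flat while streaming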

Now what could a search engine spammer possibly do with these dumps and a Markov chain?

Well, with the language feeds and Markov chains, you can now spam the search engines in tongues you don’t even speak. With Markov, keep in mind that you need roughly a 300%+ input-to-output ratio, i.e. at least three times as much source text as you plan to generate, for optimal articles.
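
To make the Markov part concrete, here is a minimal word-level Markov chain generator in Python; it is a sketch, not the exact tool the post has in mind, and the corpus filename, chain order, and output length are all placeholder assumptions. Because it only counts which words follow which, the same code works on text pulled from any language dump.

# Minimal word-level Markov chain sketch. "articles.txt" is an assumed file of
# plain text extracted from a dump; order and length are arbitrary choices.
import random
from collections import defaultdict

def build_chain(text, order=2):
    # Map each run of `order` words to the list of words seen right after it.
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        chain[tuple(words[i:i + order])].append(words[i + order])
    return chain

def generate(chain, length=150):
    # Random-walk the chain to emit roughly `length` words.
    key = random.choice(list(chain))
    out = list(key)
    while len(out) < length:
        followers = chain.get(key)
        if not followers:                 # dead end: restart from a random state
            key = random.choice(list(chain))
            followers = chain[key]
        out.append(random.choice(followers))
        key = tuple(out[-len(key):])
    return " ".join(out)

corpus = open("articles.txt", encoding="utf-8").read()
print(generate(build_chain(corpus)))

The 300%+ rule of thumb above amounts to feeding the chain at least three times as many words as you intend to emit; presumably with too little input the random walk keeps hitting states with a single follower and ends up replaying the source almost verbatim.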


4 Responses to “Wikimedia Dump Service”

  1. ech0 says:

    holy shit, that’s a lot of data.

  2. Yurij says:

    Great news!

  3. bock says:

    I’m about to cry… This will be EXCELLENT with my 5,563,359 row database of world cities and towns in both English and native character sets, UTF, latitude, longitude, etc. Thank you Wikipedia and SEO BlackHat, THANK YOU!!! Terabyte site hosting anyone???

  4. professional web design says:

    this is so great!!