For the Graffiti Network Project we needed a list of open MediaWiki sites that would allow us to store arbitrary data. Using the Yahoo! Web Search API and a Python script for Google search results, we found more than 22,000 publicly available MediaWiki installations. The key was to search for phrases that are specific to a newly installed site, such as "Configuration settings list" and "MediaWiki has been successfully installed", in combination with a random word from a dictionary file (usually installed on your system as /usr/share/dict/words).
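To make the query trick concrete, here is a minimal sketch of how such search strings could be generated. It is not our original script: the function names `load_words` and `build_queries` are made up for illustration, and the step of sending each query to a search API is left out entirely. Pairing the phrase with a random word presumably helps because each query returns a different slice of matching sites.

```python
import random

# Phrases quoted above that appear on a freshly installed MediaWiki site.
INSTALL_PHRASES = [
    "Configuration settings list",
    "MediaWiki has been successfully installed",
]


def load_words(path="/usr/share/dict/words"):
    """Read one word per line from a dictionary file, skipping blank lines."""
    with open(path, encoding="utf-8", errors="ignore") as handle:
        return [line.strip() for line in handle if line.strip()]


def build_queries(words, count, rng=None):
    """Return `count` queries, each a quoted install phrase plus a random word.

    `rng` is an optional random.Random instance (hypothetical parameter,
    added here so the output can be made reproducible).
    """
    rng = rng or random.Random()
    queries = []
    for _ in range(count):
        phrase = rng.choice(INSTALL_PHRASES)
        word = rng.choice(words)
        # Quote the phrase so the search engine treats it as an exact match.
        queries.append('"{}" {}'.format(phrase, word))
    return queries
```

Calling `build_queries(load_words(), 10)` would give ten search strings, each of which can then be handed to whatever search client you use.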
Once we found a site, our crawler inspected it by probing certain URLs to determine whether it allowed anonymous edits, or whether it was protected by CAPTCHAs or the lame puzzle authentication plugin.
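The kind of check described above can be sketched roughly as follows. This is an illustration under assumptions, not the crawler we actually ran: the marker strings, the `index.php` path, and the names `edit_url` and `classify_edit_page` are all hypothetical, and real installations and extensions vary widely. The sketch only classifies HTML that has already been fetched; it makes no network requests.

```python
from urllib.parse import urlencode

# Assumed markers; check them against the sites you actually inspect.
EDIT_FORM_MARKER = 'id="wpTextbox1"'  # assumed id of the edit textarea
CAPTCHA_MARKER = "wpCaptchaWord"      # assumed name of a CAPTCHA answer field


def edit_url(base_url, title="Main_Page"):
    """Build the edit-page URL for a title, assuming index.php sits at base_url."""
    query = urlencode({"title": title, "action": "edit"})
    return "{}/index.php?{}".format(base_url.rstrip("/"), query)


def classify_edit_page(html):
    """Guess the edit policy from the HTML of an edit page.

    Returns "captcha", "open", or "closed". This is a heuristic only:
    a site may show no CAPTCHA here and still demand one on save.
    """
    if CAPTCHA_MARKER in html:
        return "captcha"
    if EDIT_FORM_MARKER in html and "readonly" not in html:
        return "open"
    return "closed"
```

A page-level heuristic like this cannot see protections that only trigger when an edit is submitted, so its "open" verdicts should be treated as candidates rather than confirmed results.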
I am providing our entire list of sites collected from December 2008:
The format of the file is as follows. Some fields are blank because we were unable to find the proper information. In some cases we also found that our crawler incorrectly determined that a site was open to anonymous edits, when the site actually presented a CAPTCHA after an edit was made. We also found that some sites used a modified version of MediaWiki that had been integrated into another CMS, such as phpBB, and thus these sites did not use the default MediaWiki registration form.