Search needs your help!
To change search at a quicker pace, the Search team needs to be able to test changes before making them available on-wiki. Discernatron is a tool that lets participants judge the relevance of search results. When evaluating potential changes to Wikimedia search, the team will use these judgements to rate each change by how much better it is at putting the most relevant articles at the top of the search results page.
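As a rough illustration of how relevance judgements can be turned into a score for a ranking, the sketch below computes normalized discounted cumulative gain (nDCG), a common ranking metric. Whether the team uses exactly this metric is an assumption here, and the grade scale (0 to 3) and the two example rankings are hypothetical.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: grades lower in the ranking count for less."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances):
    """DCG divided by the DCG of the best possible ordering of the same grades."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Hypothetical judged grades for one query, listed in the order each
# search configuration returned the results (0 = irrelevant, 3 = highly relevant).
current_ranking = [1, 3, 0, 2]
proposed_ranking = [3, 2, 1, 0]

print("current:", round(ndcg(current_ranking), 3))
print("proposed:", round(ndcg(proposed_ranking), 3))
```

A ranking that places the most relevant articles first scores closer to 1, so comparing scores across many judged queries gives one way to tell whether a proposed change is an improvement.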
Get Started » (log in with your Wikimedia account)
What queries am I rating?
Every month, the Discovery department loads approximately 500 randomly selected search queries from the English Wikipedia into Discernatron for grading. These queries represent around 0.0001% of all full-text searches on English Wikipedia. This sample is very small, but it still covers a wide range of the types of queries received. Before the queries are released to Discernatron, WMF engineers review the sampled set and remove anything that could be considered personally identifiable information (PII). Initially, only queries for English Wikipedia are being used, but Discernatron will expand to other languages, such as French, Spanish, and Russian, over time.
So someone is looking at all my searches?
No. The queries under review carry no additional metadata, such as user name, location, or IP address. Additionally, because the sample is so small, it is very unlikely to contain more than one query from any single user. See Discovery's Data access guidelines for more information on how the department manages user data.
What kinds of queries are removed?
Anything potentially personally identifiable. This means any kind of phone number, serial number, or non-notable address. We also remove searches for specific URLs and for the names of non-notable companies and non-notable people (those that don't have wiki articles and aren't mentioned prominently in any other article). For the benefit of participants, most non-English searches are removed as well, since it would be hard for them to judge the quality of those results. Finally, "junk" queries, such as "Ikohoyugc", are removed. (These junk queries make up one to two percent of total query volume.)
What license is the collected data released under?
All data collected by Discernatron is released under CC-Zero.