Re: How do I know if I've won?
by ekdysiast, 09/23/2009, 12:19 AM
jweber: In the case of Microsoft's search engine, it would be up to Microsoft
to install the participants' algorithms on their servers, and see how
they perform.
Not really. Though Farhad suggested such a method, there are plenty of other ways to evaluate how well a search engine does. Information retrieval is one of the more mature fields in artificial intelligence, so it has fairly well-established standards of evaluation. Considering how big a company MS is, it probably has tons of data on search histories, the corresponding click-throughs, and whatever else you might need to automatically judge all entrants fairly and (to a certain extent) reliably.
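To make the click-through idea concrete, here is a minimal sketch of one standard IR metric, mean reciprocal rank (MRR), computed offline from logged queries. The log format (a ranked list of document ids plus the set of ids the user clicked) and the function name are my own assumptions for illustration, not anything MS is known to use. Treating a click as "relevant" is also a simplification, since users click top results more often regardless of quality.

```python
def mean_reciprocal_rank(sessions):
    """Score an entrant's rankings against logged clicks.

    sessions: list of (ranked_doc_ids, clicked_doc_ids) pairs, where
    ranked_doc_ids is the entrant's ranking for one logged query and
    clicked_doc_ids is the set of documents users clicked for that query.
    (Hypothetical log format, assumed for this sketch.)
    """
    if not sessions:
        return 0.0
    total = 0.0
    for ranked, clicked in sessions:
        # Reciprocal rank of the first clicked document; 0 if none appears.
        for rank, doc in enumerate(ranked, start=1):
            if doc in clicked:
                total += 1.0 / rank
                break
    return total / len(sessions)


if __name__ == "__main__":
    logged = [
        (["a", "b", "c"], {"a"}),   # clicked doc ranked first
        (["a", "b", "c"], {"c"}),   # clicked doc ranked third
        (["a", "b", "c"], set()),   # no clicked doc in the ranking
    ]
    print(mean_reciprocal_rank(logged))
```

Every entrant can be scored against the same logs without ever being deployed on a live server, which is the point: higher MRR means the documents people actually clicked sit closer to the top.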
jweber: In the case of Google's translation engine, there's no measurably "best" way to translate a work of literature
Actually, there is a measurable way. It's called the BLEU score. As you can see in the Google Scholar link, it's been cited more than 1500 times in the statistical machine translation (SMT) literature. It is THE de facto standard for evaluating how well an SMT model performs. It's not perfect. Everyone I know who does SMT hates it. But they abide by it because it's the best option in a field of shitty options. Franz Och, the head of SMT at Google, uses it, so I guess that says something.
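For anyone wondering what BLEU actually measures, here is a rough sentence-level sketch: clipped n-gram precision for n = 1 to 4, combined by geometric mean, times a brevity penalty for translations that are too short. This is a simplified illustration under my own assumptions (whitespace tokenization, no smoothing, scoring one sentence instead of a whole corpus), not the scoring script anyone uses in practice.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n=4):
    """Simplified sentence-level BLEU with uniform n-gram weights."""
    cand = candidate.split()
    refs = [r.split() for r in references]
    if not cand or not refs:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        # For each n-gram, the most times it occurs in any single reference.
        max_ref = Counter()
        for ref in refs:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        # Clip candidate counts so repeating a word cannot inflate precision.
        clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        if clipped == 0:
            return 0.0  # no smoothing: one empty precision zeroes the score
        log_precisions.append(math.log(clipped / sum(cand_counts.values())))
    # Brevity penalty against the reference closest in length to the candidate.
    ref_len = min((len(r) for r in refs), key=lambda rl: (abs(rl - len(cand)), rl))
    bp = 1.0 if len(cand) > ref_len else math.exp(1.0 - ref_len / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    reference = ["the cat sat on the mat"]
    print(bleu("the cat sat on the mat", reference))
    print(bleu("the cat sat on a mat", reference))
    print(bleu("the the the the", reference))
```

The score only rewards word-sequence overlap with the reference translations, so a perfectly good translation that happens to use different wording scores badly. That is a big part of why people hate it.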