enter the fray: our reader discussion forum
How do I know if I've won?
by jweber
I'm sure this model would work for some companies, but I'm not so sure about the examples provided. The Netflix prize (and other mathematical prizes) works because any participant can measure for themselves whether they've won. In the case of Microsoft's search engine, it would be up to Microsoft to install the participants' algorithms on their servers, and see how they perform. In the case of Google's translation engine, there's no measurably "best" way to translate a work of literature.
Re: How do I know if I've won?
by ekdysiast

jweber:
In the case of Microsoft's search engine, it would be up to Microsoft to install the participants' algorithms on their servers, and see how they perform

Not really. Though Farhad suggested such a method, there are plenty of other ways to evaluate how well a search engine does. Information retrieval is one of the more mature fields in artificial intelligence. As such, it also has fairly well-established standards of evaluation. Considering how big a company MS is, it probably has tons of data on search histories and corresponding click-throughs and whatever else you might need to automatically judge all entrants fairly and (to a certain extent) reliably.
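To make that concrete, here's a rough sketch of one standard IR metric, mean reciprocal rank (MRR), scored against a click log. Everything in it is made up for illustration: the query IDs, the document IDs, the two entrants, and the assumption that a click means the result was relevant. It's not how MS or anyone else actually runs an evaluation, just the general shape of one.

```python
def reciprocal_rank(ranked_results, clicked):
    """Return 1/rank of the first clicked result, or 0.0 if none was clicked."""
    for rank, doc in enumerate(ranked_results, start=1):
        if doc in clicked:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(rankings, click_log):
    """Average the reciprocal rank over every query in the click log."""
    scores = [
        reciprocal_rank(rankings.get(query, []), clicks)
        for query, clicks in click_log.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical click log: query -> set of results users clicked on.
click_log = {"q1": {"d2"}, "q2": {"d5"}}

# Hypothetical entrants: query -> ranked list of results each one returned.
entrant_a = {"q1": ["d2", "d1", "d3"], "q2": ["d4", "d5", "d6"]}
entrant_b = {"q1": ["d1", "d3", "d2"], "q2": ["d6", "d4", "d5"]}

score_a = mean_reciprocal_rank(entrant_a, click_log)
score_b = mean_reciprocal_rank(entrant_b, click_log)
```

Entrant A puts the clicked results at ranks 1 and 2, entrant B at rank 3 both times, so A scores higher. The point is that the judging runs off logged data, with no need to deploy each entry on live servers.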

jweber:
In the case of Google's translation engine, there's no measurably "best" way to translate a work of literature

Actually, there is a measurable way. It's called the BLEU score. As you can see in the Google Scholar link, it's been cited more than 1500 times in the statistical machine translation (SMT) literature. It is THE de facto standard for evaluating how well an SMT model performs. It's not perfect. Everyone I know who does SMT hates it. But they abide by it because it's the best option in a field of shitty options. Franz Och, the head of SMT at Google, uses it so I guess that says something.
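For anyone curious what the number actually measures, here's a stripped-down sketch of the idea: clipped n-gram precision against a reference translation, combined as a geometric mean, times a penalty for translations that are too short. It assumes a single sentence, a single reference, whitespace tokenization, and no smoothing. Real BLEU is computed over a whole corpus, usually with several references, so treat this as an illustration and not as the scoring code anyone uses in practice.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU against one reference (no smoothing)."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each n-gram's count at the number of times the reference has it,
        # so repeating a correct word over and over earns no extra credit.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0
        log_precisions.append(math.log(matched / total))
    # Brevity penalty: candidates no longer than the reference are scaled down.
    if len(cand) > len(ref):
        penalty = 1.0
    else:
        penalty = math.exp(1 - len(ref) / len(cand))
    return penalty * math.exp(sum(log_precisions) / max_n)


perfect = bleu("the cat sat on the mat", "the cat sat on the mat")
repeated = bleu("the the the", "the cat", max_n=1)
unrelated = bleu("a b", "c d")
```

A translation identical to the reference scores 1, one with no words in common scores 0, and the "the the the" case shows the clipping at work: only one of the three words counts. It also shows why people complain about the metric, since it rewards surface overlap with the reference and knows nothing about meaning.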
