THE DISSERTATION DEFENSE FOR THE DEGREE OF Ph.D. IN COMPUTER SCIENCE FOR
Chang Hu
An enormous potential exists for solving certain classes of computational problems through rich collaboration between humans and computers. Solutions to these problems used to involve human professionals, who are expensive to hire or difficult to find. Despite significant advances, fully automatic systems still have much room for improvement. Recent research has involved recruiting large crowds of skilled humans ("crowdsourcing"), but crowdsourcing solutions are still restricted by the availability of those skilled participants. In translation, for example, professional translators are costly and not always available. Machine translation systems have improved greatly in recent years, but they still provide only passable translations, and only for a limited set of language pairs. Crowdsourced translation, in turn, is limited by the availability of bilingual humans.
This dissertation describes crowdsourced monolingual translation, a new solution to the translation problem that combines the best of automated computational approaches and unskilled humans (i.e., those who speak only one language). Crowdsourced monolingual translation uses computational approaches to support collaboration between machine translation and two groups of monolingual people, each group speaking only one of the two languages involved. By combining the three parties, crowdsourced monolingual translation performs better than any one of the three alone.
A general protocol for crowdsourced monolingual translation is introduced, along with three systems that implement it. The MonoTrans system initially established the feasibility of the protocol. MonoTrans2, a second implementation, then enabled lab experiments and was also applied to an emergency-response scenario in a developing country (Haiti). The MonoTrans Widgets system, a third implementation, was deployed to a large crowd of casual web users. These systems were studied in various settings and were found to improve translation quality over both machine translation and monolingual post-editing.
Examining Committee:
Committee Chair: Dr. Ben Bederson
Co-Chair: Dr. Philip Resnik
Dean's Representative: Dr. Ping Wang
Committee Members: Dr. Bonnie Dorr
Dr. Chris Callison-Burch