UGC Science
User Generated Content (UGC) is evolving. In 2006, Time Magazine featured “You” as the person of the year, recognizing the people who create the information age by contributing to websites like YouTube, Facebook, and Wikipedia. Today we’re seeing a trend where these small nuggets of human intelligence and creativity are gathered and processed by machines to learn about our world. We can leverage what humans do best in transient interactions. Along the way to another task or as idle play, we can provide effective pattern recognition that is quite easy for the human brain and so hard for a computer. This UGC seeds machine learning systems with the masses of data needed to train a computer to recognize patterns. I noticed this trend last year and our Director of UX, Loredana Crisan, named it UGC Science.
UGC Science is at the intersection of cloud storage and distributed computing, where the majority of the world population has a mobile computing device that allows for the collection and communication of data. Seventy percent of the world’s population has a mobile phone, and in the US a majority, even among teens, own smartphones.
UGC Science does not include recommendation engines that seek merely to understand people’s preferences. We looked specifically for machine learning applications where small bits of human intelligence were collected en masse and contribute to an understanding of the real world. I brainstormed examples of this trend with my colleague, Judy Tuan, and I’ve highlighted a few of these below.
Optical Character Recognition
A CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a type of challenge-response test used in computing as an attempt to ensure that the response is generated by a person. reCAPTCHA, initially developed at Carnegie Mellon University by Luis von Ahn, uses images of words that can’t be reliably recognized during automated digitization of books and other printed material. Tests have shown that textual images are deciphered and transcribed with 99.1% accuracy, a rate comparable to the best human professional transcription services.
reCAPTCHA uses human computing power to improve Optical Character Recognition so that computers can turn printed books and magazines into text that can be searched or read aloud.
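To make the idea concrete, here is a minimal Python sketch of how several people's answers for the same word image might be combined into one transcription. This is not the actual reCAPTCHA algorithm; the function name `aggregate_transcriptions` and the `min_votes` and `min_agreement` thresholds are assumptions chosen only for illustration.

```python
from collections import Counter


def aggregate_transcriptions(answers, min_votes=3, min_agreement=0.6):
    """Return the consensus transcription, or None if there is no clear winner.

    The thresholds are hypothetical, not taken from any real system.
    """
    # Normalize so trivial differences in case or spacing do not split the vote.
    normalized = [a.strip().lower() for a in answers if a.strip()]
    if len(normalized) < min_votes:
        return None  # not enough human answers yet
    word, count = Counter(normalized).most_common(1)[0]
    if count / len(normalized) >= min_agreement:
        return word
    return None  # people disagree too much; keep asking


# Four people typed what they saw in the same scanned word image.
print(aggregate_transcriptions(["morning", "Morning", "moming", "morning "]))
```

The point of the agreement threshold is that no single answer is trusted; a word is only accepted once enough independent people have typed the same thing.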
Protein Folding and RNA
Humans can visually recognize answers to protein-folding problems that computers aren’t as good at spotting. Dr. Rhiju Das at Stanford created the game, fold.it. This game leverages our collective intelligence, honed by machine selection, to help scientists learn about protein folding. The New York Times reports that some of these same researchers went on to create a new game with UGC Science.
“If you give a gamer a chance, they can design RNA molecules far better than any computer,” says Fast Company, describing EteRNA, a newer effort in which gamers solve puzzles.
Stanford School of Medicine dives into more detail: “EteRNA’s unique kicker: Das’s laboratory actually synthesizes the ‘winning’ RNA sequences on a weekly basis, and figures out if they fold up as designed. The lab then feeds the experimental findings back to the players. ‘This way, there’s a chance that thousands of non-expert enthusiasts will be able to collectively solve biochemical challenges that experts can’t,’ said Das. ‘If the molecule folds as the players think, they win — and so do we.’”
Voice Transcription
Google has been quietly applying UGC Science for years. There’s a particularly nice application in their voice transcription system where they have done a great job of addressing privacy concerns by making it clear that your personal communications may be “donated” to improve future transcription. The following images illustrate a sequence of interactions with Google Voice. The incorrect attempt to transcribe a voice mail may be corrected by the recipient. The new text can train Google’s voice recognition algorithm.
Incorrect Initial Transcription
User edits message with correct transcription.
Google Voice prompts user to “donate” their voice mail.
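The correct-then-donate flow above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Google Voice is implemented; the `Voicemail` class, its fields, and the `training_pairs` function are all hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Voicemail:
    """Hypothetical record of one voice mail and its transcriptions."""
    audio_id: str
    machine_text: str                      # the initial automatic transcription
    corrected_text: Optional[str] = None   # what the recipient typed, if anything
    donated: bool = False                  # True only if the user agreed to donate


def training_pairs(messages: List[Voicemail]) -> List[Tuple[str, str]]:
    """Keep only donated messages whose human correction differs from the machine text."""
    return [
        (m.audio_id, m.corrected_text)
        for m in messages
        if m.donated and m.corrected_text and m.corrected_text != m.machine_text
    ]


inbox = [
    Voicemail("vm1", "call me a bat the meeting", "call me about the meeting", donated=True),
    Voicemail("vm2", "see you at noon", "see you at noon", donated=True),
    Voicemail("vm3", "happy bird day", "happy birthday", donated=False),
]
print(training_pairs(inbox))
```

Only the first message becomes training data here: the second needed no correction, and the third was corrected but never donated, so it stays private.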
Future Applications
This application of machine learning may be susceptible to the AI Effect, where as soon as something becomes a solved problem it is no longer considered Artificial Intelligence. I hope that by identifying this trend we can start to see design patterns, both from a human experience perspective and with shared technical approaches. There is an urgent need to turn data into knowledge.