Automatic Lip Reading Software
Posted By admin On 08/04/18. Researchers from Google's AI division DeepMind and the University of Oxford have used artificial intelligence to create the most accurate lip-reading software ever. Using thousands of hours of TV footage from the BBC, the scientists trained a neural network to annotate video footage with 46.8 per cent word accuracy.
Watch my lips (Image: Burton Pritzker/Getty)

By Hal Hodson

Artificial intelligence is getting its teeth into lip reading. A project by Google's DeepMind and the University of Oxford applied deep learning to a huge data set of BBC programmes to create a lip-reading system that leaves professionals in the dust.
The AI system was trained using some 5000 hours of footage from six different TV programmes, including Newsnight, BBC Breakfast and Question Time. In total, the videos contained 118,000 sentences.

AI shows the way

The AI vastly outperformed a professional lip reader who attempted to decipher 200 randomly selected clips from the data set. The professional annotated just 12.4 per cent of words without any error, whereas the AI annotated 46.8 per cent of all words in the March to September data set without any error. And many of its mistakes were small slips, like missing an 's' at the end of a word. With these results, the system also outperforms all other automatic lip-reading systems.
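The scoring here is word-level: a transcript gets credit only for words reproduced without any error, so even a missing trailing 's' counts as a miss. A minimal sketch of such a metric, using a simplified position-wise comparison (the study's exact scoring protocol is not described here, so treat this as illustrative only):

```python
def word_accuracy(predicted, reference):
    """Fraction of reference words transcribed without any error.

    Simplified position-wise comparison between two transcripts;
    a real evaluation would align the sequences first (e.g. by
    edit distance) to handle inserted or dropped words.
    """
    ref = reference.split()
    pred = predicted.split()
    correct = sum(p == r for p, r in zip(pred, ref))
    return correct / len(ref)

# A small slip like a missing trailing 's' counts as a wrong word:
print(word_accuracy("the minister say yes", "the ministers say yes"))  # 0.75
```

By this measure, the professional's 12.4 per cent and the AI's 46.8 per cent both mean the fraction of words reproduced exactly, which is why near-misses still count against the score.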
“It’s a big step for developing fully automatic lip-reading systems,” says Zhou at the University of Oulu in Finland. “Without that huge data set, it’s very difficult for us to verify new technologies like deep learning.” Two weeks ago, a similar deep-learning system called LipNet – also developed at the University of Oxford – outperformed humans on a lip-reading data set known as GRID.
But where GRID contains a vocabulary of only 51 unique words, the BBC data set contains nearly 17,500 unique words, making it a much bigger challenge. In addition, the grammar in the BBC data set comes from a wide diversity of real human speech, whereas the grammar in GRID's 33,000 sentences follows the same pattern and so is far easier to predict. The DeepMind and Oxford group says it will release its BBC data set as a training resource. One of the researchers working on LipNet says he is looking forward to using it.

Lining up the lips

To make the BBC data set suitable for automatic lip reading in the study, video clips had to be prepared using machine learning.
The problem was that the audio and video streams were sometimes out of sync by almost a second, which would have made it impossible for the AI to learn associations between the words said and the way the speaker moved their lips. But by assuming that most of the video was correctly synced to its audio, a computer system was taught the correct links between sounds and mouth shapes. Using this information, the system figured out how much the feeds were out of sync when they didn’t match up, and realigned them. It then automatically processed all 5000 hours of the video and audio ready for the lip-reading challenge – a task that would have been onerous by hand. The question now is how to use AI’s new lip-reading capabilities.
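The article doesn't spell out how the offset was measured, but a classic way to recover a constant misalignment between two streams is to slide one against the other and keep the shift that maximises their correlation. A hedged sketch of that idea, using hypothetical 1-D per-frame features (say, audio energy versus mouth openness) rather than the system's actual learned representations:

```python
def best_offset(audio_feat, video_feat, max_shift):
    """Return the shift (in frames) of video relative to audio that
    maximises their dot-product correlation.

    A positive result means the video stream lags the audio.
    Toy stand-in for the learned sound/mouth-shape matching the
    researchers used; real features would come from a model.
    """
    def corr(shift):
        # Sum of products over the frames where the two
        # sequences overlap at this shift.
        return sum(
            a * video_feat[i + shift]
            for i, a in enumerate(audio_feat)
            if 0 <= i + shift < len(video_feat)
        )

    return max(range(-max_shift, max_shift + 1), key=corr)

# A toy "event" delayed by 3 frames in the video is recovered:
audio = [0, 0, 1, 4, 1, 0, 0, 0, 0, 0]
video = [0, 0, 0, 0, 0, 1, 4, 1, 0, 0]
print(best_offset(audio, video, max_shift=5))  # 3
```

Once the offset is known, realigning is just a matter of trimming or delaying one stream by that many frames, which is what made batch-processing all 5000 hours feasible.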
We probably don’t need to fear computer systems eavesdropping on our conversations by reading our lips because long-range microphones are better for spying in most situations. Instead, Zhou thinks lip-reading AIs are most likely to be used in consumer devices to help them figure out what we are trying to say.