Bing Speech Recognition Gets Faster, More Accurate
Microsoft's Bing team claims that voice search and voice-to-text - two Bing-powered phone features - are now up to twice as fast and 15 percent more accurate, a feat accomplished by exploiting some recent biology-inspired artificial intelligence breakthroughs by Microsoft Research scientists.
The updates are available to Windows Phone owners in the US.
To achieve the speed and accuracy improvements, Microsoft focused on an advanced approach called Deep Neural Networks (DNNs), a technology inspired by the functioning of neurons in the brain. DNNs detect patterns in a way akin to how biological systems recognize them.
By coupling MSR's research breakthroughs in the use of DNNs with the large datasets provided by Bing's massive index, Microsoft enabled the DNNs to learn more quickly and brought Bing's voice capabilities noticeably closer to the way humans recognize speech. Microsoft also made a few under-the-hood improvements that let Bing identify speech patterns more easily and cut through ambient and background noise, halving response time and reducing the word error rate by 15 percent, even in noisy situations.
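For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions and insertions needed to turn the recognizer's output into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that standard definition only. It is not Microsoft's evaluation code, the function names and sample phrases are made up for illustration, and the article does not say whether the 15 percent figure is a relative or an absolute reduction.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein dynamic program).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which an error rate dropped, relative to its old value."""
    return (before - after) / before


# Hypothetical example: one wrong word out of four.
print(word_error_rate("find coffee near me", "find copy near me"))
```

Under the relative reading, a hypothetical drop in word error rate from 0.20 to 0.17 would count as a 15 percent improvement.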