Microsoft Researchers Reach Human Parity in Conversational Speech Recognition


Enterprise & IT | Oct 28, 2016

Microsoft researchers have set a world record for speech recognition, using a technology the company announced this week, together with GPU-accelerated deep learning, to recognize words in a conversation as well as a person does. The team described how it achieved a word error rate of 5.9 percent, the lowest ever for machine transcription and about as accurate as people who transcribed the same conversation. It’s also a 6 percent improvement over a record Microsoft set only a month ago.
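For readers unfamiliar with the metric, an error rate of this kind is conventionally a word error rate (WER): the number of word substitutions, deletions and insertions needed to turn the machine transcript into the reference transcript, divided by the number of words in the reference. The sketch below is illustrative only and is not the scoring tool the researchers used; the function names `word_error_rate` and `relative_improvement` are made up for this example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(old_rate: float, new_rate: float) -> float:
    """Fractional reduction in error rate going from old_rate to new_rate."""
    return (old_rate - new_rate) / old_rate


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
    # ASSUMPTION: 6.3 is used here as the earlier record; that figure is
    # not stated in this article.
    print(relative_improvement(6.3, 5.9))
```

Under the assumption that the earlier record was 6.3 percent, the relative-improvement calculation lands close to the 6 percent improvement quoted above.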

"We’ve reached human parity," said Xuedong Huang, the company’s chief speech scientist and co-author of a paper published this week. "This is an historic achievement."

Conversational speech poses some of the biggest challenges to speech recognition, said Geoffrey Zweig, who manages the Speech & Dialog research group at Microsoft.

"Speech recognition gets hard when people are talking informally, when they get excited, when they make mistakes and correct themselves, when they change topics. All of these are characteristics of conversational speech," he said.

The researchers credit their breakthrough in conversational speech recognition to deep learning, in particular, the systematic use of convolutional and recurrent neural networks. In their latest work, the team applied a type of recurrent neural network called Long Short-Term Memory (LSTM) to the language model.

LSTM networks have the advantage of being able to "remember" information for a longer period of time, so they are sensitive to more words than most neural network language models are.
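The "remembering" comes from a cell state that is carried from one word to the next and is only changed through learned gates. The toy below is a minimal sketch of a single LSTM unit with scalar input and state, not Microsoft's language model; the weights in `PARAMS` are hand-picked, hypothetical values chosen so that the forget gate stays almost fully open and the input gate almost fully closed.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM cell (scalar input and state)."""

    def pre_activation(gate):
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre_activation("forget"))       # how much old memory to keep
    i = sigmoid(pre_activation("input"))        # how much new content to write
    o = sigmoid(pre_activation("output"))       # how much memory to expose
    g = math.tanh(pre_activation("candidate"))  # proposed new content

    c = f * c_prev + i * g   # updated cell state (the long-term memory)
    h = o * math.tanh(c)     # updated hidden state (the visible output)
    return h, c


# Hypothetical weights as (w_x, w_h, bias) per gate, not learned values.
PARAMS = {
    "forget": (0.0, 0.0, 10.0),    # gate close to 1: keep the memory
    "input": (0.0, 0.0, -10.0),    # gate close to 0: write almost nothing
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 0.0, 0.0),
}


def run(inputs, c0=0.0, h0=0.0, params=PARAMS):
    """Feed a sequence through the cell and return the final (h, c)."""
    h, c = h0, c0
    for x in inputs:
        h, c = lstm_step(x, h, c, params)
    return h, c


if __name__ == "__main__":
    # Start with a stored value of 1.0 and process 50 further inputs.
    h, c = run([0.0] * 50, c0=1.0)
    print(h, c)
```

With these settings the stored value is still almost intact after 50 steps, which is the behavior the paragraph above describes. In a trained network the gate weights are learned, so the model itself decides what to keep and for how long.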

Microsoft’s Cognitive Toolkit (previously known as CNTK), an open source deep learning framework, played a key role in reaching human parity for conversational speech recognition. The Cognitive Toolkit, which Microsoft announced this week, is a deep learning system that runs on GPUs and is used to speed advances in areas such as speech recognition, image recognition and search relevance.

Zweig said that by using Nvidia's Tesla M40 GPUs, researchers reduced the training time for some language models from months to weeks. "That makes all the difference because the rate of progress we can make is linked to the number of experiments we can run," he said.

More work needs to be done to improve speech recognition in real-life settings like parties or city streets, where there may be music, traffic, people talking and other types of background noise. Researchers are also improving conversational speech recognition for meetings, where there are often multiple speakers seated at different distances from a microphone.

Zweig said the research milestone means the company has the right tools to quickly deploy a new generation of improved speech recognition in its Cortana personal digital assistant, Xbox gaming console and other products.

Their long-term goal is to move from speech recognition to understanding, he said. This would make it possible for devices to answer questions or take actions based on what they’re told.

Tags: Microsoft