Researchers Advance Image Recognition Technology

Enterprise & IT, Nov 18, 2014

Google Research scientists have created artificial intelligence software capable of recognizing and describing the content of photographs and videos with greater accuracy than ever before. Google's machine-learning system can automatically produce captions that accurately describe images the first time it sees them. This kind of system could eventually help visually impaired people understand pictures, provide alternate text for images in parts of the world where mobile connections are slow, and make it easier for everyone to search on Google for images.

The idea comes from recent advances in machine translation between languages, where a Recurrent Neural Network (RNN) transforms, say, a French sentence into a vector representation, and a second RNN uses that vector representation to generate a target sentence in German.
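
The encoder-decoder idea described above can be illustrated with a toy example. This is only a minimal sketch, not Google's system: the hidden size, vocabulary sizes, token ids, and helper names (`encode`, `decode`, `rnn_step`) are all hypothetical, and the weights are random and untrained, so the decoded ids carry no meaning. It only shows the data flow: one RNN folds the source sentence into a single vector, and a second RNN unrolls from that vector to emit target tokens.

```python
import math
import random

random.seed(0)
HIDDEN = 4  # hypothetical hidden size for this toy example

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]

def rnn_step(w_in, w_rec, x, h):
    """One vanilla RNN step: h' = tanh(W_in x + W_rec h)."""
    return [math.tanh(a + b) for a, b in zip(matvec(w_in, x), matvec(w_rec, h))]

def encode(token_ids, vocab_size, w_in, w_rec):
    """Fold a whole source sentence into one fixed-size vector."""
    h = [0.0] * HIDDEN
    for t in token_ids:
        h = rnn_step(w_in, w_rec, one_hot(t, vocab_size), h)
    return h

def decode(h, vocab_size, w_in, w_rec, w_out, end_id, max_len=10):
    """Greedy decoding from the encoder's vector until the end token or max_len."""
    out, prev = [], end_id  # assumption: the end token doubles as the start token
    for _ in range(max_len):
        h = rnn_step(w_in, w_rec, one_hot(prev, vocab_size), h)
        scores = matvec(w_out, h)
        prev = max(range(vocab_size), key=scores.__getitem__)
        if prev == end_id:
            break
        out.append(prev)
    return out

SRC_VOCAB, TGT_VOCAB, END = 6, 5, 0  # hypothetical vocabulary sizes
enc_in, enc_rec = rand_matrix(HIDDEN, SRC_VOCAB), rand_matrix(HIDDEN, HIDDEN)
dec_in, dec_rec = rand_matrix(HIDDEN, TGT_VOCAB), rand_matrix(HIDDEN, HIDDEN)
dec_out = rand_matrix(TGT_VOCAB, HIDDEN)

sentence_vector = encode([3, 1, 4, 2], SRC_VOCAB, enc_in, enc_rec)
target_ids = decode(sentence_vector, TGT_VOCAB, dec_in, dec_rec, dec_out, END)
```

In a trained translation system the two sets of weights would be learned jointly from sentence pairs; here they are random, so only the shapes and the flow of the vector from encoder to decoder are meaningful.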

The researchers replaced that first RNN and its input words with a deep Convolutional Neural Network (CNN) trained to classify objects in images. Normally, the CNN’s last layer feeds a final Softmax over the known classes of objects, assigning a probability that each object might be in the image. By removing that final layer, the researchers instead fed the CNN’s rich encoding of the image into an RNN designed to produce phrases. The whole system was then trained directly on images and their captions, so that it maximizes the likelihood that the descriptions it produces match the training descriptions for each image. In short, the model combines a vision CNN with a language-generating RNN so it can take in an image and generate a fitting natural-language caption.
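
Two quantities from this paragraph can be made concrete: the Softmax that a classifier applies to its final scores, and the caption likelihood that training tries to maximize. The sketch below is illustrative only. The scores, the four-word vocabulary, and the function name `caption_log_likelihood` are assumptions, and the per-step scores are hard-coded rather than produced by a real CNN and RNN.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def caption_log_likelihood(step_logits, caption_ids):
    """Sum of log p(word_t) over a reference caption, one score vector per step.

    In a real model each score vector would depend on the image encoding and
    on the words generated so far; here the scores are given directly.
    """
    return sum(math.log(softmax(logits)[word])
               for logits, word in zip(step_logits, caption_ids))

# Classifier view: hypothetical final-layer scores for 3 object classes.
class_probs = softmax([2.0, 0.5, -1.0])

# Captioning view: hypothetical per-step scores over a 4-word vocabulary.
step_logits = [[3.0, 0.1, 0.1, 0.1],
               [0.1, 2.5, 0.1, 0.1]]
matching = caption_log_likelihood(step_logits, [0, 1])     # words the model favors
mismatched = caption_log_likelihood(step_logits, [2, 3])   # words it does not
```

Training on image and caption pairs adjusts the weights so that the log-likelihood of each reference caption rises, which is why `matching` scores higher than `mismatched` in this example.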

Google says that its experiments with this system on several openly published datasets, including Flickr8k, Flickr30k and SBU, showed robust qualitative results, with generated captions that read as reasonable descriptions of the images. It also performed well in quantitative evaluations with the Bilingual Evaluation Understudy (BLEU), a metric used in machine translation to evaluate the quality of generated sentences.
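
For readers unfamiliar with BLEU, the sketch below shows the core of the metric: clipped n-gram precision combined with a brevity penalty. It is a simplified, sentence-level version with a single reference, n-grams up to length 2, and no smoothing, so it is not the exact configuration used in the evaluation described above. The example sentences are made up for illustration.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def modified_precision(candidate, reference, n):
    """Share of candidate n-grams found in the reference, with clipped counts."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total

def bleu(candidate, reference, max_n=2):
    """Geometric mean of n-gram precisions times a brevity penalty."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # no smoothing: any zero precision zeroes the score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)

reference = "a dog runs on the grass".split()
exact = bleu(reference, reference)
partial = bleu("a dog runs on grass".split(), reference)
unrelated = bleu("two cats sleep indoors".split(), reference)
```

An exact match scores 1.0, a caption that shares most words and word pairs scores somewhere between 0 and 1, and a caption with no overlapping words scores 0.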

To get more details about the framework used to generate descriptions from images, as well as the model evaluation, read the full paper here.

Tags: Google