NEC & Samsung face recognition technology selected for MPEG-7 standard by MPEG Committee
NEC Corporation and Samsung Advanced Institute of Technology (SAIT) today announced that the MPEG (Moving Picture Experts Group) Committee has decided to adopt a new face recognition technology, jointly proposed by NEC and SAIT, for the upcoming MPEG-7 standard.
The MPEG-7 standard provides a set of standardized tools for describing the content that matters in multimedia retrieval. Until now, there has been a need for a standardized face description that represents facial features as a tool for identifying people. The NEC/SAIT technology was chosen because it delivered the best retrieval accuracy, speed, and data size among the proposals in the MPEG-7 benchmark tests.
Referred to as MPEG-7 AFR (Advanced Face Recognition Descriptor), the technology is a description method that represents the facial features found in still or moving pictures for multimedia retrieval. It offers an extremely small data size as well as fast and accurate retrieval. Facial features can be described as metadata, enabling a variety of applications, such as instantaneous retrieval of a scene from a large video archive system, or of a particular actor's appearances, using the human face as the query. Through adoption as an international standard, it will enable the establishment of large archive systems that can search for and retrieve scenes using the face as the query, and it is expected to give rise to new services in markets driven by the spread of digital broadcasting and the internet, such as video archives, home videos and surveillance systems.
This has been achieved through the following:
(1) NEC developed "Cascaded Linear Discriminant Analysis", which selects features of human faces in order of performance within a cascading architecture and achieves an accurate description of each face image in a minimum data size of 253 bits/face.
(2) SAIT developed the "Face Component Based Face Feature Representation Method", which extracts facial features from each face component, such as the eyes and mouth, and which, when applied to (1), improves the accuracy of the technology.
In comparison to the previous standard, this technology achieves a reduction in the rate of retrieval error by one eighth (1/8) on average. In addition, it achieves a matching speed of one million comparisons per second on a conventional PC, making it possible to retrieve scenes featuring a specific person from a 24-hour video in approximately one second.
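As a rough illustration of how these figures fit together, the following Python sketch estimates search time and descriptor storage for a 24-hour video. Only the 253 bits/face and one million matches per second come from the announcement. The number of face descriptors stored per second of video, and the function name `search_estimate`, are assumptions made for this example.

```python
# Back-of-the-envelope estimate, not a description of the actual system.
BITS_PER_FACE = 253              # descriptor size stated in the announcement
MATCHES_PER_SECOND = 1_000_000   # matching speed stated in the announcement


def search_estimate(faces_per_video_second, hours=24):
    """Return (descriptor count, search time in s, storage in bytes).

    faces_per_video_second is a hypothetical indexing rate: how many
    face descriptors are stored for each second of video.
    """
    video_seconds = hours * 3600
    descriptors = video_seconds * faces_per_video_second
    search_seconds = descriptors / MATCHES_PER_SECOND
    storage_bytes = descriptors * BITS_PER_FACE / 8
    return descriptors, search_seconds, storage_bytes


if __name__ == "__main__":
    for rate in (1, 5, 10):
        count, seconds, size = search_estimate(rate)
        print(f"{rate:>2} faces/s: {count:>7} descriptors, "
              f"{seconds:.3f} s to scan, {size / 1e6:.1f} MB")
```

Under the assumed rate of ten descriptors per second of video, a 24-hour recording yields 864,000 descriptors, which can be scanned in about 0.86 seconds and occupy roughly 27 MB, in line with the "approximately one second" figure above.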
With the rapid spread of IT network technologies in recent years, multimedia retrieval technologies have become increasingly important for giving users access to the content they need from large video and audio databases. Each company will continue to develop multimedia retrieval technologies through further integration of video and audio recognition, and will strive to develop a product based on this technology at the earliest opportunity.