Demos don't hide rough road faced by MPEG-7 spec
A handful of Japanese companies have demonstrated their own versions of MPEG-7 applications software, but industry watchers caution that much work remains to be done to jumpstart the market for MPEG-7, which provides a standard method of describing digital multimedia content.
At a recent symposium here sponsored by the Information Processing Society, Ricoh Co. Ltd. demonstrated tools that automatically generate MPEG-7 tags for digital movies; Pioneer Corp. showed MPEG-7 encoders that allow consumers to edit video content; Toshiba Corp. and Hitachi Ltd. each demoed video retrieval technologies; and NTT Docomo showed an MPEG-7-based movie distribution service.
The MPEG-7 standard allows digital content to be described at various levels. High-level descriptions are based on such items as the content's title or creator, or may say when or where it was created. At a deeper level, MPEG-7 can describe structural features of audiovisual content such as its color, texture, tone or tempo. While other MPEG standards deal with the compression of audiovisual material, MPEG-7 defines a method of describing multimedia content. If the content is visual data, it is compressed by using an existing MPEG technology — MPEG-1, MPEG-2 or MPEG-4 — and is then described by MPEG-7.
MPEG-7 is intended to provide an efficient method for retrieving and filtering information stored in vast audiovisual databases. "There is no alternative at present that can replace MPEG-7. It will be a very important, key technology in the broadband era," said Osamu Hori, senior research scientist at Toshiba's Multimedia Laboratory. MPEG-7 describes content using the extensible markup language (XML), and provides a compression scheme named BiM that turns the description into a compact binary. Higher-level data descriptions are in text and are usually written manually. Deeper-level data is usually quantized by a computer. The manual step presents a problem.
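To make the two levels of description concrete, here is a minimal sketch in Python. It is an illustration only: the element names (Description, HighLevel, LowLevel, ColorHistogram) and the four-bin histogram are assumptions made for this example, not the element names or descriptors defined by the MPEG-7 schema, and the BiM binary step is left out.

```python
import xml.etree.ElementTree as ET


def quantize_histogram(pixels, bins=4):
    """Count 8-bit intensity values into `bins` equal-width buckets."""
    counts = [0] * bins
    width = 256 // bins
    for value in pixels:
        counts[min(value // width, bins - 1)] += 1
    return counts


def describe(title, creator, pixels):
    """Build a toy description; all element names here are hypothetical."""
    root = ET.Element("Description")

    # High-level part: text that a person would normally type in.
    high = ET.SubElement(root, "HighLevel")
    ET.SubElement(high, "Title").text = title
    ET.SubElement(high, "Creator").text = creator

    # Deeper-level part: a feature the computer quantizes on its own.
    low = ET.SubElement(root, "LowLevel")
    hist = ET.SubElement(low, "ColorHistogram", bins="4")
    hist.text = " ".join(str(c) for c in quantize_histogram(pixels))

    return ET.tostring(root, encoding="unicode")


xml_text = describe("Evening News", "Example Studio", [0, 10, 70, 130, 200, 255])
print(xml_text)
```

The split mirrors the cost problem: the histogram comes for free from the pixels, while the title and creator have to come from somewhere else.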
"To promote MPEG-7, it is necessary to lower the cost of content description," said Takayuki Kunieda, R&D engineer at Ricoh's imaging system business group, and a member of the MPEG-7 domestic committee. A detailed high-level description must be done by an individual, and is therefore costly at present. "Such a high cost would hinder MPEG-7 penetration," Kunieda said.
"To generate this data [without human intervention], a powerful tool will be essential," Kunieda said. Ricoh and Pioneer each demonstrated encoding tools for such high-level data generation.
Ricoh's MovieTool generates MPEG-7 tags depending on the structure of video content. An operator must then type text next to those tags. "If you want to make rich content, it is necessary to describe [it] in detail manually," a Ricoh researcher said. The time required to describe video footage will increase with the description's detail. Ricoh is distributing a beta version of the MovieTool on its Web site.
Pioneer's MPEG-7 encoder was developed as a tool for consumers to edit video content stored on a home server or disk recorder. Its software, written in C++, sports a graphical user interface and runs on a PC.
Toshiba's tool, designed to aid data retrieval, describes content in terms of time and space. In a demonstration, Toshiba engineers used the locator to describe objects in video data and quickly referenced information related to each object.
The retrieval system could be used in a TV-shopping application, for example, said a Toshiba engineer. If a consumer wants to purchase a clock that appears in a TV drama, one click on the clock would take the viewer to a virtual clock shop, he said.
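The clock example can be sketched as a lookup over objects described in time and space. The sketch below is an assumption-laden illustration, not Toshiba's locator: each object is reduced to a fixed bounding box that is valid for a time interval, and the ObjectRegion fields and the link strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ObjectRegion:
    label: str
    link: str      # hypothetical target, e.g. a shop page
    start: float   # seconds from the start of the video
    end: float
    box: tuple     # (x0, y0, x1, y1) in pixels


def find_object(regions, t, x, y):
    """Return the first region covering point (x, y) at time t, or None."""
    for region in regions:
        x0, y0, x1, y1 = region.box
        in_time = region.start <= t <= region.end
        in_space = x0 <= x <= x1 and y0 <= y <= y1
        if in_time and in_space:
            return region
    return None


regions = [
    ObjectRegion("clock", "shop/clock", 12.0, 18.5, (100, 40, 160, 100)),
    ObjectRegion("lamp", "shop/lamp", 0.0, 30.0, (300, 200, 340, 280)),
]

# A viewer clicks at pixel (120, 60), 15 seconds into the drama.
hit = find_object(regions, t=15.0, x=120, y=60)
print(hit.link if hit else "no object")
```

A real description would have to follow an object as it moves across the frame; the fixed box is only the simplest case.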
NTT Docomo demonstrated a system that would distribute video over mobile phones. In this application, MPEG-7 is used to summarize long video footage, and the distribution system sends MPEG-7 metadata and a Java agent to mobile phones so that a user can select preferred video footage and play it immediately.
Hitachi Ltd. has developed a video retrieval technology based on a description scheme that groups video frames with similar characteristics and describes those grouped frames. The description requires only about 20 percent of what's required for a frame-to-frame description, Hitachi said.
An algorithm that compares features of grouped frames rather than each frame enables searches in about half the average time needed by conventional methods, a Hitachi spokesman said. The retrieval software can be implemented on an embedded CPU installed in a TV or video terminal. When run with a 450-MHz processor, the software can peruse 24 hours of TV programming and retrieve an image within a second, according to Hitachi.
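The idea of describing groups of similar frames rather than every frame can be sketched as follows. This is not Hitachi's algorithm, whose details the company did not give; it assumes each frame has already been reduced to a small feature tuple, groups consecutive frames that stay within a threshold of the group's first frame, and searches by comparing a query against one representative per group. The function names, the distance measure and the threshold are all illustrative choices.

```python
def distance(a, b):
    """Sum of absolute differences between two feature tuples."""
    return sum(abs(x - y) for x, y in zip(a, b))


def group_frames(features, threshold):
    """Group consecutive frames that stay close to the group's first frame."""
    groups = []
    for index, feat in enumerate(features):
        if groups and distance(groups[-1]["rep"], feat) <= threshold:
            groups[-1]["end"] = index  # extend the current group
        else:
            groups.append({"rep": feat, "start": index, "end": index})
    return groups


def search(groups, query):
    """Compare the query with one representative per group, not every frame."""
    return min(groups, key=lambda g: distance(g["rep"], query))


frames = [(10, 10), (11, 10), (10, 12), (50, 52), (51, 50), (90, 5)]
groups = group_frames(frames, threshold=5)
best = search(groups, (49, 50))

print(len(frames), "frames described as", len(groups), "groups")
print("best match covers frames", best["start"], "to", best["end"])
```

Both the description and the search shrink with the number of groups, which is the effect behind the savings Hitachi cites; the actual ratio depends on the footage and on the features used.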
Copyright protection is a critical issue for multimedia content, and MPEG-7 provides rules for describing copyrights. However, "copy protection itself is not included [in the spec]," said Kunieda of Ricoh. "Higher-level applications will define how to protect content," he said. Tools able to handle copyright protection, fee charging and distribution of content to various terminals are under discussion at a separate standards group called MPEG-21.
"MPEG-7 only defines how to describe the contents," said Hori of Toshiba. "How to generate and how to use MPEG-7 data are left open. Therefore, standardization of MPEG-7 does not mean the end, but it should be the starting point from which various usage and applications will be developed."
Once MPEG-7 is standardized, proponents plan to establish the MPEG-7 Industrial Focus Group at a meeting to be held Oct. 27 in Washington, called the third MPEG-7 Awareness Event. The group's goal is to promote MPEG-7, provide tools and coordinate on patent issues. It will also work to verify the interoperability of MPEG-7 by building a test bed system.