Fujitsu Achieves High Recognition Rate for Handwritten Chinese Characters
Fujitsu says it has developed a handwriting recognition technology, built on AI modeled on human brain processes, that achieves a recognition rate of 96.7% for handwritten Chinese characters, surpassing the human-equivalent recognition rate of 96.1%.
The company had previously achieved top-level accuracy in this field, as demonstrated by taking first place, with a recognition rate of 94.8%, at a handwritten Chinese character recognition contest held at the International Conference on Document Analysis and Recognition (ICDAR), a top-level conference in the document image processing field. However, in order to further increase recognition accuracy, a new mechanism for studying the diversity of character deformations was required.
Fujitsu has now turned to a hierarchical model with an expanded number of connections between neurons, a model based on how the human brain grasps the features of characters. To "train" this hierarchical neural model, the company developed a technology that automatically creates numerous deformed variants of each character from its base pattern. Using this method, Fujitsu has achieved an accuracy rate of 96.7%, surpassing the human-equivalent recognition rate of 96.1% for handwritten Chinese characters.
Fujitsu is aiming for the practical application of this technology in fiscal 2015, while also further increasing the accuracy of character recognition technology and expanding its use to the recognition of media other than written characters, such as pictures and voice.
Fujitsu is also studying how to apply this character recognition technology to other writing systems, such as Japanese, alphabet-based languages, and numerals.
How it works
With character recognition technology, the goal is to learn and store the features of the many character patterns thought to be used by humans when recognizing characters, using a model of connected hierarchies based on human neurons. When a character image is input, the first layer of the model perceives the simple features of the character, and then the next layer perceives the complex features of the character. In this way, the features effective for differentiating characters are extracted in an automatic and hierarchical fashion, and then the results of the learning process, including which features (neurons) the model reacted to, are accumulated. When attempting to recognize a character, the features of the input character are extracted in the same way as in the learning process, and the character is identified and recognition results output on the basis of which features (neurons) reacted as determined by the learning process.
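The layered extraction described above can be illustrated with a toy example. This is a minimal sketch and not Fujitsu's model: the two hand-written 3x3 kernels, the "crossing" neuron in the second layer, and the names `features` and `classify` are all assumptions made for illustration. In a real network the kernels would be learned from data rather than fixed by hand.

```python
# Toy two-layer feature hierarchy (illustrative only, not Fujitsu's model).

def correlate2d(image, kernel):
    """Slide the kernel over the image (valid positions only) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def relu(fmap):
    """Keep positive responses only: a neuron either reacts or stays silent."""
    return [[max(0, v) for v in row] for row in fmap]

# Layer 1: simple features (hand-picked stroke detectors, an assumption).
HORIZONTAL = [[-1, -1, -1], [2, 2, 2], [-1, -1, -1]]
VERTICAL = [[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]]

def first_layer(image):
    return [relu(correlate2d(image, k)) for k in (HORIZONTAL, VERTICAL)]

def second_layer(maps):
    """Layer 2: a more complex feature that reacts only where both strokes meet."""
    h_map, v_map = maps
    return [[min(h, v) for h, v in zip(h_row, v_row)]
            for h_row, v_row in zip(h_map, v_map)]

def features(image):
    """Strongest reaction of each neuron type, regardless of position."""
    maps = first_layer(image)
    crossing = second_layer(maps)
    return [max(max(row) for row in m) for m in maps + [crossing]]

def classify(image, prototypes):
    """Pick the stored character whose feature reactions are closest."""
    f = features(image)
    return min(prototypes,
               key=lambda label: sum((a - b) ** 2
                                     for a, b in zip(f, prototypes[label])))

BAR = [[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]]

CROSS = [[0, 0, 1, 0, 0],
         [0, 0, 1, 0, 0],
         [1, 1, 1, 1, 1],
         [0, 0, 1, 0, 0],
         [0, 0, 1, 0, 0]]

# "Learning" here is just storing which features each character triggered.
PROTOTYPES = {"一": features(BAR), "十": features(CROSS)}
```

Because each feature keeps only its strongest reaction anywhere in the image, a horizontal bar drawn one row higher still produces the same feature values and is still classified as "一".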
In order to further increase the accuracy of recognition, there was a need for a new effort to study the diversity of character deformations.
To study deformations in finer detail, Fujitsu expanded the scale of the connections between neurons in the hierarchical model, from the 2.8 million used in the previous technology (recognition rate 94.8%) to 150 million, a roughly 54-fold increase. A model of this size needs a large and varied set of training samples, but with about 3,800 Chinese characters to be recognized, collecting real-world deformation patterns for each character is extremely difficult. Fujitsu therefore developed a technology that randomly deforms existing character samples to automatically create a wide variety of samples for learning. This made it possible to have the hierarchical model study a multitude of different deformed character patterns.
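The idea of generating extra training samples from an existing one can be sketched as follows. This is a minimal illustration under stated assumptions, not Fujitsu's generator: it uses the simplest possible deformation, a random shift in X and Y, and the names `shift`, `generate_samples`, and `max_shift` are hypothetical.

```python
import random

def shift(image, dx, dy, fill=0):
    """Return a copy of the image moved dx pixels right and dy pixels down."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx  # source pixel that lands on (x, y)
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = image[sy][sx]
    return out

def generate_samples(base, count, max_shift=1, seed=None):
    """Create `count` randomly shifted variants of one base character image."""
    rng = random.Random(seed)
    return [shift(base,
                  rng.randint(-max_shift, max_shift),
                  rng.randint(-max_shift, max_shift))
            for _ in range(count)]

# A small character image with a one-pixel margin, so shifts of +/-1 lose no ink.
BASE = [[0, 0, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 0, 0]]

samples = generate_samples(BASE, count=10, seed=0)
```

Each base image yields as many variants as requested, which is what makes this kind of generation attractive when thousands of character classes each need many examples.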
Previous methods randomized only the character's position in two dimensions, so they could not reproduce differences in brightness between parts of the background or parts of the character (strokes), or other localized variations. To address this, Fujitsu devised a character sample generation technology based on random deformations in three dimensions. By adding the grey value of each image element as a Z-axis parameter to the existing X and Y axes of the character pattern image, Fujitsu was able to generate a wider variety of deformed patterns.
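One way to picture a deformation in three dimensions is to treat every pixel as a point (x, y, z), with z its grey value, and perturb all three coordinates. The sketch below is an assumption-laden illustration, not Fujitsu's algorithm: it combines a random X/Y shift with a random gain and a linear brightness ramp on Z, and the names `deform_3d`, `max_gain`, and `max_tilt` are hypothetical.

```python
import random

def deform_3d(image, rng, max_shift=1, max_gain=0.2, max_tilt=0.1):
    """Randomly deform a greyscale image (values in [0, 1]) in X, Y and Z.

    X/Y: the whole character is shifted by a random offset.
    Z:   grey values are scaled and a brightness ramp is added across the
         image, so background and strokes vary in brightness by region.
    """
    h, w = len(image), len(image[0])
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    gain = 1.0 + rng.uniform(-max_gain, max_gain)
    tilt_x = rng.uniform(-max_tilt, max_tilt)
    tilt_y = rng.uniform(-max_tilt, max_tilt)

    out = []
    for y in range(h):
        row = []
        for x in range(w):
            sx, sy = x - dx, y - dy  # source pixel after the X/Y shift
            z = image[sy][sx] if 0 <= sx < w and 0 <= sy < h else 0.0
            # Z-axis deformation: gain plus a ramp centred on the image middle.
            z = (gain * z
                 + tilt_x * (x / max(1, w - 1) - 0.5)
                 + tilt_y * (y / max(1, h - 1) - 0.5))
            row.append(min(1.0, max(0.0, z)))  # keep grey values in range
        out.append(row)
    return out

BASE = [[0.0, 0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0, 0.0],
        [0.0, 1.0, 1.0, 1.0, 0.0],
        [0.0, 0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0, 0.0]]

rng = random.Random(0)
variants = [deform_3d(BASE, rng) for _ in range(10)]
```

Setting `max_gain` and `max_tilt` to zero reduces this to a purely two-dimensional shift, which corresponds to the limitation of the earlier methods described above.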