The CANG JIE SYSTEM and the COMPUTER
The Cang Jie system (CJS) is like an iceberg in which the Cang Jie Input Method is
but the top visible part.
When Mr. Chu designed CJS, he wanted all the functions and benefits that a
computer offers to alphabetical languages to be fully available to the Chinese
language as well, and more, if possible.
CJS is characterized by three design principles:
COMPLETENESS: because the system's purpose is the preservation of
Chinese culture, of which Chinese characters are the primary vessels, the
system must be able to process all past, present, and future
Chinese characters.
EFFICIENCY: all aspects of the system must use the minimum
memory and time possible within the framework of the platform on which it is
employed, namely the computer.
ERGONOMICS: all aspects of the system that have an interface
with humans must accord with the principles of human behavior and habits.
The design incorporates the six major functions that a language can use in
a computer, and is geared towards the more complex level: "to provide an
[Information Media] for [Machine Understanding]", so that computers can
communicate with humans. In this system, the computer will ultimately have
to understand Chinese.
1. First function: Input System => The Cang Jie Method
Principles of design and features to incorporate:
--One code must represent only one distinct character, and vice versa.
--The method must fit the constraints of the most common input hardware of the time:
the ASCII keyboard.
--Only keys that represent letters should be used for Codes, and each should
require a single keystroke (i.e., without holding down the
"Shift" key simultaneously).
--The fewest Signs possible to represent all existing and future
Chinese characters.
--The fewest Rules possible for the selection of Codes.
--The fewest Codes possible to input one character, for speed and ease of use.
Designing a method for an existing subject (Chinese
characters), to be used on an existing platform (the computer) through an
existing interface (the ASCII keyboard), means that for certain characters
some Design Principles can only be satisfied at the expense of others.
Some compromises in the design of the Method are therefore unavoidable.
However, the continued use of the Cang Jie Input Method over the last
twenty years by individuals who need to input large amounts of Chinese
characters at optimum speeds demonstrates Cang Jie's standing as the
premier input method for Chinese characters.
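The design principles listed for the input method can be illustrated with a
small lookup table. The Python sketch below is not the actual Cang Jie
implementation: the table CODE_TABLE and the helpers validate_table and lookup
are hypothetical names, and the handful of codes shown is an illustrative
sample only. It checks that every code uses only unshifted letter keys and
that the mapping between codes and characters is one-to-one.

```python
import string

# Illustrative sample only: a few characters with their commonly cited
# Cang Jie codes. A real table would cover far more characters.
CODE_TABLE = {
    "a": "日",
    "b": "月",
    "ab": "明",
    "d": "木",
    "dd": "林",
    "ddd": "森",
}

# Keys that can be typed without holding down Shift.
LETTER_KEYS = set(string.ascii_lowercase)


def validate_table(table):
    """Check two of the input-method principles on a code table.

    1. Every code uses only unshifted letter keys.
    2. No two codes map to the same character (the dictionary itself
       already guarantees that one code maps to one character).
    """
    for code in table:
        if not code or not set(code) <= LETTER_KEYS:
            raise ValueError(f"code {code!r} uses a non-letter key")
    characters = list(table.values())
    if len(set(characters)) != len(characters):
        raise ValueError("two codes map to the same character")
    return True


def lookup(code, table=CODE_TABLE):
    """Return the character for a typed code, or None if it is unknown."""
    return table.get(code.lower())
```

With this sample table, lookup("ab") returns the character composed of the
"a" and "b" signs, and a code that is not in the table returns None.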
2. Second function: Character Sorting
One of the most powerful computer applications of the Latin alphabet and its
various derivatives is easy, unambiguous sorting. Large
amounts of information can be sorted, searched, and classified at great speed.
At first glance, the graphic nature of Chinese characters would seem to rule
out a systematic sorting method. However,
the Cang Jie signs correspond to 25 keys of the English alphabet and, together
with the rules that determine the selection of signs, give each character its
own code. Characters can therefore be sorted just as well as words in
alphabetical languages.
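As a minimal sketch of this idea, the Python example below sorts a few
characters by the alphabetical order of their codes. The table CODES and the
function sort_by_code are hypothetical names introduced for illustration, and
the six entries are a small sample, not the system's actual data.

```python
# Illustrative sample only: character -> Cang Jie code.
CODES = {
    "森": "ddd",
    "明": "ab",
    "木": "d",
    "日": "a",
    "林": "dd",
    "月": "b",
}


def sort_by_code(characters, codes=CODES):
    """Sort characters by the alphabetical order of their codes."""
    return sorted(characters, key=lambda ch: codes[ch])


ordered = sort_by_code(CODES)
print("".join(ordered))  # prints 日明月木林森
```

Because the codes are ordinary letter strings, any tool that already sorts or
searches alphabetical text can be reused unchanged.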
3. Third function: Character Morphology
Originally computers had only a "Text Mode", in which characters
displayed on the screen could only be "8 x 8", and this was a major
limitation for Chinese characters. However, the development of the
"Graphic Mode" and later the full "Graphic Environment" created an ideal
situation in which to fully exploit the strengths of the graphic nature of
the Chinese language.
However,
the shameless reliance on font libraries totally negates the power of Chinese
characters. Over the last 15 years, Mr. Chu and his team have developed more
than ten different Character Generators, written in tight, efficient assembly
language with streamlined data structures. These generators use the Basic
Morphological Elements of Chinese characters, which were derived from the
research conducted when developing the Cang Jie Input Method. When driven by
the CJ Input Method, these Character Generators can produce characters faster
and in a greater variety of sizes and shapes than any present system. The
latest generation, the Complete Chinese Character Generator, uses only 160 KB
of memory and can generate 1,000 200 x 200 characters per second on a
330 MHz computer.
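To put these figures in perspective, the short Python calculation below works
out what they imply. It rests on assumptions that the text does not state:
that "200*200" means pixels, that 1 KB is 1024 bytes, and that a stored
bitmap font would use 1 bit per pixel. All variable names are illustrative.

```python
CLOCK_HZ = 330_000_000        # 330 MHz, from the text
CHARS_PER_SECOND = 1_000      # from the text
SIDE = 200                    # 200 x 200, assumed to be pixels
GENERATOR_BYTES = 160 * 1024  # 160 KB, assuming 1 KB = 1024 bytes

pixels_per_char = SIDE * SIDE
cycles_per_char = CLOCK_HZ // CHARS_PER_SECOND
cycles_per_pixel = CLOCK_HZ / (CHARS_PER_SECOND * pixels_per_char)

# Assumption: a stored bitmap uses 1 bit per pixel.
bitmap_bytes = pixels_per_char // 8
bitmaps_in_same_memory = GENERATOR_BYTES // bitmap_bytes

print(pixels_per_char)         # 40000
print(cycles_per_char)         # 330000
print(cycles_per_pixel)        # 8.25
print(bitmap_bytes)            # 5000
print(bitmaps_in_same_memory)  # 32
```

Under these assumptions, the memory used by the whole generator would hold
only about 32 stored bitmaps of that size, which illustrates why composing
characters from shared elements is so much more compact than storing one
bitmap per character.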
4. Fourth function: Written Character Recognition
As any person knows, it is much easier to read any language than to
write it. Otherwise, why would elementary schools engage in Spelling Bees? However,
with the computer the situation is reversed: recognition has
become a more difficult task than generation.
Mr. Chu based his research
largely on the question "How do humans do it?" He
realized that humans need only recognize some prominent "features" of whatever
object they are considering. In the case of Chinese characters, what is
recognized is the structures that make up the character. This is the same
process already exploited in the CJ Input Method, where the codification of
characters is based on their shapes and structure.
The CJ Recognition Program uses many
of the same Rules for Selection of Signs (shapes) to recognize characters and
to obtain the CJ Code. The "Chinese
Character Recognition System" developed by Mr. Chu's colleague, Michele
Shen, can recognize Chinese characters at a rate of 5,000 characters/second and
takes up only 60 KB of memory. The 114 Signs ("5th Generation")
are the Basic Elements for character recognition.
5. Fifth function: Character Sound Recognition
While Chinese characters are graphic symbols, over 90 per cent of them contain
phonetic indicators that give the reader a hint about their pronunciation.
Mr. Chu has identified over 100 of these phonetic indicators and has
classified them as the Basic Elements of Phonetics for Chinese character
sound recognition.
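The idea of a phonetic indicator can be sketched in a few lines of Python.
The two indicators and eight characters below are well-known textbook
examples with Mandarin pinyin readings; they are not taken from Mr. Chu's
list, and the names INDICATOR_READING, INDICATOR_OF, strip_tone, and
sound_hint are hypothetical. The hint is only approximate: the characters
share the indicator's syllable but often differ in tone.

```python
import unicodedata

# Illustrative examples only: reading of each phonetic indicator.
INDICATOR_READING = {"青": "qīng", "馬": "mǎ"}

# Illustrative examples only: character -> phonetic indicator it contains.
INDICATOR_OF = {
    "清": "青", "情": "青", "請": "青", "晴": "青",
    "媽": "馬", "嗎": "馬", "罵": "馬", "碼": "馬",
}


def strip_tone(pinyin):
    """Remove tone marks so that only the bare syllable remains."""
    decomposed = unicodedata.normalize("NFD", pinyin)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def sound_hint(character):
    """Return (indicator, toneless syllable) for a character, or None."""
    indicator = INDICATOR_OF.get(character)
    if indicator is None:
        return None
    return indicator, strip_tone(INDICATOR_READING[indicator])
```

A character without a listed indicator simply yields None, so a recognition
program built this way would need another route for those characters.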
6. Sixth function: Character Meaning: Computer Natural Language
The Chinese language is the only one on Earth that uses ideographic
characters. These characters have been edited and classified for thousands of
years by an untold number of scholars. The original characters were pictographs
corresponding to objects and concepts, and they have evolved to
accommodate increasing needs for information and communication.
Chinese character
"Radicals" represent a system of classification by meaning, and the
225 radicals are the result of scholars sorting Chinese characters into
groups, each represented by a Radical.
Mr. Chu has built a new classification system based on 512 Basic Meaning
Elements (256 Elements of
Common Sense and 256 Basic Elements of Knowledge), each formed by either
a Sign or a Unit.
On the basis of the meaning of these Basic Elements, all Chinese characters
have been classified by Mr. Chu's team and placed into a Natural Language
Program that allows the computer to understand what its human
counterpart has conveyed. This part of the Cang Jie System is the real
summit and achievement of the many years of research by Mr. Chu and his team; it
is called the Understanding System.
More information on the Understanding System is available on the next page.
(contribution presented by Walter Van Patten, long time student of Master Chu)