The CJ System


 



The CANG JIE SYSTEM and the COMPUTER

 

The Cang Jie System (CJS) is like an iceberg, of which the Cang Jie Input Method is only the visible tip.

When Mr. Chu designed CJS, he wanted every function and benefit that a computer offers to alphabetical languages to be fully available to the Chinese language as well, and more, if possible.

The CJS is characterized by three design principles:

COMPLETENESS: because the system's purpose is the preservation of Chinese culture, of which Chinese characters are the primary vessels, the system must be able to process all past, present, and future Chinese characters.

EFFICIENCY: all aspects of the system must use the minimum memory and time possible within the framework of the platform on which it is employed, namely the computer.

ERGONOMICS: all aspects of the system that interface with humans must accord with the principles of human behavior and habits.

 

The design incorporates the six major functions a language can use in a computer, and is geared towards the most complex level: "to provide an [Information Media] for [Machine Understanding]" so that computers can communicate with humans. In this system, the computer will ultimately have to understand Chinese.

 

  1. First function: Input System => The Cang Jie Method

    Principles of design and features to incorporate:

    --One code must represent only one distinct character, and vice versa.

    --It was designed within the constraints of the most common input hardware of the time: the ASCII keyboard.

    --Only letter keys should be used for Codes, and each Code should require a single keystroke (i.e., the user should never have to hold down the "Shift" key at the same time).

    --The fewest possible Signs to represent all existing and future Chinese characters.

    --The fewest possible Rules for the selection of Codes.

    --The fewest possible Codes to input one character, for speed and ease of use.
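
The first and third principles above can be checked mechanically against any code table. The following is a minimal sketch, not part of the Cang Jie software: the table is a tiny hypothetical sample, the function names are invented for illustration, and "no Shift key" is modeled simply as "lowercase ASCII letters only".

```python
import string
from collections import defaultdict

# Hypothetical sample table (character -> code), for illustration only.
# A real table would cover the full character set.
SAMPLE_CODES = {
    "日": "a",
    "月": "b",
    "明": "ab",
    "木": "d",
    "林": "dd",
}

def uses_only_unshifted_letters(code):
    """True if every keystroke is a plain letter key (no Shift, digit, or punctuation)."""
    return len(code) > 0 and all(ch in string.ascii_lowercase for ch in code)

def find_collisions(table):
    """Return {code: [characters]} for every code shared by more than one character."""
    by_code = defaultdict(list)
    for char, code in table.items():
        by_code[code].append(char)
    return {code: chars for code, chars in by_code.items() if len(chars) > 1}

def check_table(table):
    """Return (codes that break the letters-only rule, codes that break one-to-one)."""
    bad_codes = sorted(c for c in table.values() if not uses_only_unshifted_letters(c))
    return bad_codes, find_collisions(table)

bad_codes, collisions = check_table(SAMPLE_CODES)
print("invalid codes:", bad_codes)
print("shared codes:", collisions)
```

A table that satisfies both principles yields two empty results; any entry reported under "shared codes" marks a place where the one-code-one-character principle was not met.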

    Designing a method for an existing writing system (Chinese characters), on an existing platform (the computer), with an existing interface (the ASCII keyboard) means that, for certain characters, some Design Principles have to be traded off against others. The Method therefore contains some compromises.

    However, the continued use of the Cang Jie Input Method over the last twenty years by people who need to input large amounts of Chinese text at optimum speed demonstrates Cang Jie's integrity as the premier input method for Chinese characters.

     

  2. Second function: Character Sorting

    One of the most powerful applications of the Latin alphabet and its various derivatives on the computer is easy, unambiguous sorting. Large amounts of information can be sorted, searched, and classified at great speed.

    At first glance, the graphic nature of Chinese characters would seem to rule out a systematic sorting method. However, the Cang Jie signs correspond to 25 keys of the English alphabet, and the rules for selecting signs give each character its own code. Characters can therefore be sorted just as words are in alphabetical languages.
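
Because each code is an ordinary string of letters, standard string ordering and binary search apply directly. The following is a minimal sketch under assumptions: the six-entry table is a small illustrative sample, and `sort_by_code` and `lookup` are hypothetical helper names, not part of any Cang Jie software.

```python
import bisect

# Small illustrative sample (character -> code), deliberately listed out of order.
SAMPLE_CODES = {
    "森": "ddd",
    "月": "b",
    "明": "ab",
    "林": "dd",
    "日": "a",
    "木": "d",
}

def sort_by_code(chars, codes):
    """Order characters alphabetically by their letter codes."""
    return sorted(chars, key=lambda ch: codes[ch])

def lookup(sorted_chars, codes, code):
    """Binary-search a code-sorted list; return the character or None."""
    keys = [codes[ch] for ch in sorted_chars]
    i = bisect.bisect_left(keys, code)
    if i < len(keys) and keys[i] == code:
        return sorted_chars[i]
    return None

ordered = sort_by_code(SAMPLE_CODES, SAMPLE_CODES)
print(ordered)
print(lookup(ordered, SAMPLE_CODES, "dd"))
```

The sorted list behaves like a dictionary ordered by spelling: characters whose codes share a prefix end up next to each other, and a lookup takes a logarithmic number of comparisons.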

     

  3. Third function: Character Morphology

Originally computers only had a "Text Mode", and characters displayed on the screen could only be "8 x 8", which was a major limitation for Chinese characters. However, the development of the "Graphic Mode" and later the full "Graphic Environment" created an ideal situation in which to fully exploit the strengths of the graphic nature of the Chinese language.

However, the shameless reliance on font libraries totally negates the power of Chinese characters. Over the last 15 years, Mr. Chu and his team have developed, using tight, efficient assembly-language code and streamlined data structures, over ten different Character Generators that use the Basic Morphological Elements of Chinese characters derived from the research conducted while developing the Cang Jie Input Method. When used with the CJ Input Method, these Character Generators can produce characters faster and in a greater variety of sizes and shapes than any present system. The latest generation, the Complete Chinese Character Generator, uses only 160KB of memory and can generate 1,000 200 x 200 characters per second on a 330 MHz computer.
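
To put the quoted figures in perspective, here is a back-of-the-envelope calculation. It is only a sketch under stated assumptions: "200 x 200" is taken to mean pixels, a stored bitmap is assumed to use 1 bit per pixel, and 1KB is taken as 1024 bytes. None of this describes how the generator works internally.

```python
# Figures quoted in the text.
CHARS_PER_SECOND = 1000
GLYPH_SIDE = 200                  # assumed to be pixels per side
CLOCK_HZ = 330_000_000            # 330 MHz
GENERATOR_BYTES = 160 * 1024      # 160KB, assuming 1KB = 1024 bytes

# Output rate implied by the quoted speed.
pixels_per_glyph = GLYPH_SIDE * GLYPH_SIDE
pixels_per_second = pixels_per_glyph * CHARS_PER_SECOND
cycles_per_pixel = CLOCK_HZ / pixels_per_second

# Comparison with storing ready-made bitmaps (assumed 1 bit per pixel).
bitmap_bytes_per_glyph = pixels_per_glyph // 8
stored_glyphs_in_same_memory = GENERATOR_BYTES // bitmap_bytes_per_glyph

print(pixels_per_second)             # 40000000
print(cycles_per_pixel)              # 8.25
print(bitmap_bytes_per_glyph)        # 5000
print(stored_glyphs_in_same_memory)  # 32
```

Under these assumptions the generator has a budget of roughly eight clock cycles per pixel, and the memory it occupies would hold only 32 uncompressed bitmaps at that size.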

 

4. Fourth function: Written Character Recognition

As any person knows, it is much easier to read a language than to write it. Otherwise, why would elementary schools engage in Spelling Bees? With the computer, however, the situation is reversed: recognition is a more difficult task than generation.

Mr. Chu based his research largely on the question "How do humans do it?" He realized that humans need only recognize a few prominent "features" of whatever object they are considering. For Chinese characters, the features to recognize are the structures that make up the character. This is the same process already exploited in the CJ Input Method, where the codification of characters is based on their shapes and structure.

The CJ Recognition Program uses many of the same Rules for Selection of Signs (shapes) to recognize characters and obtain the CJ Code. The "Chinese Character Recognition System" developed by Mr. Chu's colleague, Michele Shen, can recognize Chinese characters at a rate of 5,000 characters/second and takes up only 60KB of memory. The 114 Signs (5th Generation) are the Basic Elements for character recognition.

 

5. Fifth function: Character Sound Recognition

While Chinese characters are graphic symbols, over 90 percent of them contain phonetic indicators that give the reader a hint about their pronunciation. Mr. Chu has identified over 100 of these phonetic indicators and has classified them as the Basic Elements of Phonetics for Chinese character sound recognition.

 

6. Sixth function: Character Meaning: Computer Natural Language

The Chinese language is the only one on Earth that uses ideographic characters. It has been edited and classified for thousands of years by an untold number of scholars. The original characters were pictographs corresponding to objects and concepts, and they have evolved to accommodate growing needs for information and communication.

Chinese character "Radicals" represent a system of classification by meaning, and the 225 radicals are the result of scholars editing Chinese characters into groups represented by the Radicals.

Mr. Chu has built a new classification system based on 512 Basic Meaning Elements (256 Elements of Common Sense and 256 Basic Elements of Knowledge), each of which is formed by either a Sign or a Unit.

On the basis of the meaning of these Basic Elements, all Chinese characters have been classified by Mr. Chu's team and placed into a Natural Language Program that allows the computer to understand what its human counterpart has conveyed. This part of the Cang Jie System is the real summit and achievement of the many years of research by Mr. Chu and his team; it is called the Understanding System.

More information on the Understanding System is on the next page.

 

(contribution presented by Walter Van Patten, long time student of Master Chu)

 

 
