The Chinese Character Wiki is a free and open source dictionary of Chinese characters, including stroke orders, pronunciations, definitions, examples, origins, and component breakdowns.
The dictionary currently contains manually verified information about:
- The 1000 most common characters in movie subtitles
- The 1000 most common characters in books
- Characters from HSK 1-4
- Characters from the first 1000 Dong Chinese difficulty levels (balanced list based on graded readers, frequency, and standardized tests)
- All the components of the characters in the above lists
Character components
Most Chinese characters are built from a combination of components. This wiki sorts components into the following eight categories:
- Meaning component
- Sound component
- Iconic component
- Remnant component
- Simplified component
- Deleted component
- Distinguishing component
- Unknown component
These categories are different from the traditional 六书 (six categories) system. Many characters do not fit neatly into the traditional six categories.
Note that components are different from radicals. Radicals are traditionally used for organizing Chinese dictionaries, but are not always useful for understanding how characters are actually built.
Meaning component
A meaning or semantic component hints at the meaning of the character.
For example:
Character | Meaning component |
---|---|
妈 mother mā | 女 woman nǚ |
问 question wèn | 口 mouth kǒu |
想 think; desire xiǎng | 心 heart xīn |
Meaning components are color-coded as red.
Historical shifts in meaning
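Sometimes a meaning component seems unrelated to what the character means today, because the character's meaning has shifted over time.
For example, the character 错 originally meant to decorate something by inlaying it with gold or silver, which is why it contains the 金 (metal) component. Later this character expanded to include other meanings: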
- interlocking pattern
- stagger / crossing
- complex / chaotic
- incorrect / mistake
- bad / wrong
Character | Meaning component |
---|---|
错 (orig.) inlay with gold cuò | 金 metal jīn |
Sound component
A sound or phonetic component hints at how the character is pronounced.
For example:
Character | Sound component |
---|---|
妈 mā mother | 马 mǎ horse |
问 wèn question | 门 mén door |
想 xiǎng think; desire | 相 xiàng appearance |
Sound components are color-coded as blue.
Historical sound changes
Sometimes a character does not sound similar to its sound component. Most Chinese characters were invented thousands of years ago. Since then, there have been many changes to the way people speak. For that reason, the sound components of some characters are leftovers from old Chinese pronunciation, and do not reflect modern pronunciation.
For example, in old Chinese, 他 was pronounced /*l̥ʰaːl/ and 也 was pronounced /*laːlʔ/, so 也 was used as a sound component in 他. These two characters no longer sound similar.
Character | Sound component |
---|---|
他 tā he | 也 yě also |
Audio courtesy of the AllSet Learning Chinese Pronunciation Wiki, used with permission.
Iconic component
An iconic or form component is a direct visual representation of an object or idea (also known as a pictograph or ideograph).
For example:
Character | Iconic components | |
---|---|---|
林 forest lín | 木 wood; tree mù | 木 wood; tree mù |
旦 dawn; morning dàn | 日 sun rì | horizon [no pronunciation] |
有 to have yǒu | 又 hand yòu | 肉 meat ròu |
Iconic components are color-coded as green.
Remnant component
A remnant component is a component that is derived from a part of another character.
For example, the character 孝 (filial piety) is derived from a remnant of 老 (old), and 子 (child).
Character | Remnant | Taken from |
---|---|---|
孝 xiào filial piety | 耂 [no pronunciation] | 老 lǎo old |
Remnant components are color-coded as chartreuse.
Simplified component
A simplified component is a component that was changed during character simplification to reduce the number of strokes.
For example:
Traditional | Simplified |
---|---|
難 nán difficult | 难 nán difficult |
點 diǎn dot; point | 点 diǎn dot; point |
還 hái; huán still; return | 还 hái; huán still; return |
Simplified components are color-coded as teal.
Deleted component
A deleted component is a component that was removed during character simplification to reduce the number of strokes.
For example:
Traditional | Deleted component | Simplified |
---|---|---|
開 open kāi | 門 door mén | 开 open kāi |
Distinguishing component
A distinguishing component is a component that was added to distinguish one character from another character.
For example, the characters 王 (king) and 玉 (jade) were written similarly in seal script, so a dot was added to distinguish them.
王 wáng king | 玉 yù jade |
Distinguishing components are color-coded as purple.
Unknown component
An unknown component is a component whose purpose is unclear. Unfortunately, not all Chinese characters have a clear explanation.
For example, nobody really knows for certain what the top component of 是 was originally supposed to represent.
Character | Components | |
---|---|---|
是 shì to be | [unknown meaning] | 止 foot; stop zhǐ |
Unknown components are color-coded as gray.
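As a rough illustration of the classification above, the eight categories and their display colors could be modeled as a small lookup table. This is only a sketch: the names `ComponentType`, `COMPONENT_COLORS`, and `annotate` are hypothetical and do not describe how the wiki actually stores its data. No color is stated for deleted components, so that entry is left empty.

```python
from enum import Enum
from typing import Optional


class ComponentType(Enum):
    """The eight component categories used on this wiki."""
    MEANING = "meaning"
    SOUND = "sound"
    ICONIC = "iconic"
    REMNANT = "remnant"
    SIMPLIFIED = "simplified"
    DELETED = "deleted"
    DISTINGUISHING = "distinguishing"
    UNKNOWN = "unknown"


# Colors as listed on this page. Deleted components have no stated color,
# so None is used rather than guessing one.
COMPONENT_COLORS: dict[ComponentType, Optional[str]] = {
    ComponentType.MEANING: "red",
    ComponentType.SOUND: "blue",
    ComponentType.ICONIC: "green",
    ComponentType.REMNANT: "chartreuse",
    ComponentType.SIMPLIFIED: "teal",
    ComponentType.DELETED: None,
    ComponentType.DISTINGUISHING: "purple",
    ComponentType.UNKNOWN: "gray",
}


def annotate(components: list[tuple[str, ComponentType]]) -> list[str]:
    """Describe each component of a character with its category and color."""
    lines = []
    for glyph, kind in components:
        color = COMPONENT_COLORS[kind] or "no color listed"
        lines.append(f"{glyph}: {kind.value} component ({color})")
    return lines


# 妈 (mother) = 女 (woman, meaning) + 马 (horse, sound)
MA = [("女", ComponentType.MEANING), ("马", ComponentType.SOUND)]
print("\n".join(annotate(MA)))
```

A single character can mix categories freely, which is why the breakdown is a list of (component, category) pairs rather than one label per character.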
Sources of information
It is difficult to find reliable information about the origins of Chinese characters. Misinformation about Chinese characters is unfortunately very common, even from Chinese teachers, and it can be frustrating to wade through all of the conflicting information out there.
Top-notch sources
- 季旭昇《說文新證》
This is an update to the traditional Shuowen dictionary, with insights from modern analysis of recently discovered oracle bone fragments that were unknown to ancient lexicographers.
- Outlier Dictionary of Chinese Characters
The authors of this dictionary are academic experts in Chinese paleography and have in-depth knowledge about the history of Chinese characters.
Usually pretty good
- 漢語多功能字庫 (Multi-function Chinese Character Database)
Free online dictionary provided by the Chinese University of Hong Kong, with explanations of character origins.
- 李学勤《字源》
Dictionary of character origins from mainland China scholarship.
Useful for specific purposes
- Chinese Text Project
Free online database of ancient Chinese texts, useful for finding out how characters have been used historically, and finding references to more obscure characters.
- 小學堂 - Academia Sinica
Free online database of historical character forms.
Unreliable but occasionally useful
- 說文解字
This is the traditional character dictionary that scholars have relied on for nearly two thousand years. The information is often inaccurate, but it does provide valuable insight into how characters were written and understood at that point in history.
- Wiktionary
Wiktionary usually works decently for looking up the meaning of characters or historical/dialectal pronunciations, but is not always useful for finding out character origins.
Character builder
The character builder is a tool for generating stroke data for obscure characters by combining strokes from other characters.
For example, if the database didn't already have stroke data for 犸, you could generate it from the first three strokes of 狼 and the last three strokes of 妈. If it doesn't line up quite right, you can move and stretch the components.
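The 犸 example can be sketched in a few lines. This is a minimal illustration, not the tool's real implementation: the `Stroke` type, the helper names, and the coordinates are all hypothetical placeholders, since the page does not describe the actual stroke data format. It only shows the idea of slicing strokes from two source characters and then moving and stretching one part.

```python
from dataclasses import dataclass

Point = tuple[float, float]


@dataclass(frozen=True)
class Stroke:
    """One stroke as a sequence of (x, y) points (assumed representation)."""
    points: tuple[Point, ...]


def take_first(strokes: list[Stroke], n: int) -> list[Stroke]:
    """Return the first n strokes of a character."""
    return list(strokes[:n])


def take_last(strokes: list[Stroke], n: int) -> list[Stroke]:
    """Return the last n strokes of a character."""
    return list(strokes[len(strokes) - n:])


def move_and_stretch(strokes: list[Stroke], dx: float = 0.0, dy: float = 0.0,
                     sx: float = 1.0, sy: float = 1.0) -> list[Stroke]:
    """Scale every point by (sx, sy), then shift it by (dx, dy)."""
    return [
        Stroke(tuple((x * sx + dx, y * sy + dy) for x, y in s.points))
        for s in strokes
    ]


# Placeholder data only: 狼 has 10 strokes and 妈 has 6, but these
# coordinates are dummies, not real stroke shapes.
lang = [Stroke(((float(i), 0.0), (float(i), 1.0))) for i in range(10)]
ma = [Stroke(((float(i), 2.0), (float(i), 3.0))) for i in range(6)]

# 犸 = first three strokes of 狼 (犭) + last three strokes of 妈 (马),
# with the borrowed 马 part narrowed and shifted so that it lines up.
built = take_first(lang, 3) + move_and_stretch(take_last(ma, 3), dx=-1.0, sx=0.5)
print(len(built))
```

Keeping the transform separate from the slicing mirrors the description above: the strokes are picked first, and adjusted only if they do not line up.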
Verified characters
Characters in the list with a green checkmark are verified, which means a person has manually checked the information for accuracy.
Pages for characters that have not been manually verified yet will show a warning message at the top to indicate that the information may not be reliable.
Contributing
If you see a mistake in the dictionary or want to help add more data, feel free to suggest edits or post on the talk page for a character.
Your edits must be approved before they show up. If you have a track record of positive contributions, you will gain permission to edit without approval and to approve/reject edits from other people.
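The approval flow can be pictured roughly as follows. This is a hypothetical sketch: the page does not say how a "track record" is measured, so the `TRUSTED_AFTER` threshold and all class and function names here are assumptions made purely for illustration.

```python
from dataclasses import dataclass

# Assumed threshold; the wiki does not state how trust is actually earned.
TRUSTED_AFTER = 10


@dataclass
class Contributor:
    name: str
    approved_edits: int = 0

    @property
    def trusted(self) -> bool:
        """Trusted contributors may edit without approval and review others."""
        return self.approved_edits >= TRUSTED_AFTER


@dataclass
class Edit:
    author: Contributor
    status: str = "pending"


def submit(author: Contributor) -> Edit:
    """Create an edit; it goes live immediately only for trusted authors."""
    edit = Edit(author)
    if author.trusted:
        edit.status = "approved"
    return edit


def review(edit: Edit, reviewer: Contributor, approve: bool) -> None:
    """Approve or reject a pending edit; only trusted contributors may review."""
    if not reviewer.trusted:
        raise PermissionError(f"{reviewer.name} cannot review edits yet")
    if edit.status != "pending":
        raise ValueError("only pending edits can be reviewed")
    edit.status = "approved" if approve else "rejected"
    if approve:
        edit.author.approved_edits += 1


newcomer = Contributor("newcomer")
veteran = Contributor("veteran", approved_edits=TRUSTED_AFTER)

suggestion = submit(newcomer)        # stays pending until someone reviews it
review(suggestion, veteran, approve=True)
print(suggestion.status, newcomer.approved_edits)
```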
Data downloads
Dumps of the dictionary data are generated every month and are free to download.