Encoding orders of Brahmic scripts
Norbert Lindenberg
November 2, 2023
This article documents the encoding orders that the OpenType Universal Shaping Engine assumes for the Brahmic scripts it supports. Understanding encoding orders is necessary when rendering or otherwise interpreting text in these scripts, as well as when entering text using input methods or otherwise generating text.
Contents
- Introduction
- Scripts covered in this document
- Derivation of the tables
- Caveats
- How to report issues
- Acknowledgments
- References
- 𑽇𑽎𑽇𑽇𑽎𑽇𑽇𑽎𑽇𑽇𑽎𑽇𑽇𑽎𑽇
- Ahom
- Balinese
- Batak
- Bhaiksuki
- Brahmi
- Buginese (Lontara’)
- Buhid
- Chakma
- (Eastern) Cham
- Dives Akuru
- Dogra
- Grantha
- Gunjala Gondi
- Hanunoo
- Javanese
- Kaithi
- Kawi
- Kayah Li
- Kharoshthi
- Khojki
- Khudawadi
- Lepcha
- Limbu
- Mahajani
- Makasar
- Marchen
- Masaram Gondi
- Meetei Mayek
- Modi
- Multani
- Nag Mundari
- Nandinagari
- Newa
- Phags-pa
- Rejang
- Saurashtra
- Sharada
- Siddham
- Sinhala
- Soyombo
- Sundanese
- Syloti Nagri
- Tagalog
- Tagbanwa
- Tai Le
- Tai Tham
- Tai Viet
- Takri
- Tibetan
- Tirhuta
- Zanabazar Square
Introduction
Text written in a Brahmic script consists of orthographic syllables, two-dimensional visual arrangements of glyphs that form a unit. When encoding orthographic syllables, the Unicode characters corresponding to the glyphs should be arranged in a well-defined order, so that text can be rendered correctly, and compared, searched, or otherwise processed without ambiguities or missed matches. The Unicode Standard does not define this order for most Brahmic scripts, and does so incompletely or ambiguously for others. The shaping engines of OpenType font rendering systems have therefore become the de facto definitions of the encoding orders of orthographic syllables.
The main tables in this document show the encoding orders for Brahmic scripts as defined by the OpenType Universal Shaping Engine (USE). This information can be used in several ways:
- Implementers of smart keyboards can use it to reorder characters input by the user into the correct order.
- Implementers of spelling checkers or input prediction engines can use it to ensure that their dictionaries and thus corrected text use the correct order.
- Font engineers can see in which order characters will be presented to fonts, and use it when implementing shaping logic.
- Engineers of fonts using technologies other than OpenType, such as Apple Advanced Typography, can use it to implement validation in their fonts.
- Implementers of other text-generating systems, such as speech input systems or chatbots, can use it to ensure they generate text that will render and be interpreted correctly.
The main tables show the classes into which the USE classifies the characters of each script, the order in which their characters can occur within a cluster (the USE’s approximation of an orthographic syllable), and how often they can occur. They may also show that some sections, such as consonant conjunct forms and their associated modifiers, can be repeated, and that after the initial consonant cluster a syllable may continue with either a final virama (for which the USE uses the Hindi term “halant”) or a more substantial sequence of medial consonants, vowels, vowel modifiers, and final consonants.
Supplementary tables show characters that are used in a script but can not be part of clusters and, for some scripts, the decompositions that the USE applies and the character sequences that it does not allow because equivalent individual characters exist.
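As a rough illustration of how a main table can be read, the following sketch turns the three rows of the Buhid table into a regular expression and uses it to test whether a string is a single cluster. The names `BUHID_ROWS`, `QUANTIFIERS`, `rows_to_regex`, and `is_cluster` are made up for this example, and the mapping from the Count column to regular expression quantifiers is an assumption. Repeating groups and alternatives, which Buhid does not need, are not handled.

```python
import re

# Rows of the Buhid table: (class, characters as regex character-class content, count).
BUHID_ROWS = [
    ("BASE", "\u1740-\u1751\u25CC", "1"),
    ("VOWEL_ABOVE", "\u1752", "0 or more"),
    ("VOWEL_BELOW", "\u1753", "0 or more"),
]

# Assumed mapping from the Count column to regex quantifiers.
QUANTIFIERS = {"1": "", "0 or 1": "?", "0 or more": "*"}


def rows_to_regex(rows):
    """Concatenate the rows in table order into one compiled pattern."""
    parts = [f"[{chars}]{QUANTIFIERS[count]}" for _, chars, count in rows]
    return re.compile("".join(parts))


BUHID_CLUSTER = rows_to_regex(BUHID_ROWS)


def is_cluster(text):
    """Return True if the whole string is exactly one cluster."""
    return BUHID_CLUSTER.fullmatch(text) is not None
```

With this sketch, `is_cluster("\u1743\u1752\u1753")` accepts a base followed by an above-base and a below-base vowel, while the same vowels in the opposite order are rejected.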
Scripts covered in this document
This document covers scripts that are encoded in Unicode 15.1, have the Unicode property Indic_Syllabic_Category defined for at least some of their characters, and are included in the list of scripts supported in the Universal Shaping Engine.
The Unicode property Indic_Syllabic_Category is generally defined for those characters in Brahmic scripts that participate in forming orthographic syllables. The Kharoshthi script, which was a contemporary rather than a descendant of Brahmi, is included because it forms orthographic syllables similar to Brahmic scripts. Some scripts that are descendants of Brahmi may not be included because their structure has been simplified to a simple linear sequence of glyphs.
Only scripts supported by the USE are included because it is the only OpenType shaping engine that provides a well-defined cluster structure for Brahmic scripts that enables largely interoperable implementations. The documentation for shaping engines supporting other Brahmic scripts (Bengali (Bangla), Devanagari, Gujarati, Gurmukhi, Kannada, Khmer, Lao, Malayalam, Myanmar, Odia (Oriya), Tamil, Thai, Telugu) is incomplete and has been abandoned by its owner, and implementations have diverged substantially, as documented for Devanagari and Khmer. New Tai Lue is missing because OpenType documentation does not assign it to any shaping engine.
Derivation of the tables
The first step in creating these tables was to determine which characters are used with each script. The Unicode Standard includes two properties, Script and Script_Extensions, that provide much of this information. However, the use of characters in the shared scripts Common and Inherited for specific scripts is generally not documented there, and some other information provided in script proposals may have been lost as well. Long-time Unicode contributor Roozbeh Pournader has therefore started a project to collect exemplar characters for each script, and that information was included here. In addition, U+25CC DOTTED CIRCLE was included for all scripts that use combining marks (except variation selectors) or characters with USE class REPHA.
In the classification of characters, some assumptions had to be made to work around USE bug 475, which causes ambiguous classifications for several characters. For characters where the USE overrides the Indic syllabic category, categorization was based on the override. U+00A0 NO-BREAK SPACE was assigned to the BASE_OTHER class to enable its use with combining marks, as recommended in the Unicode Standard. U+200D ZERO WIDTH JOINER was assigned to the ZWJ class. The remaining characters were assigned to the BASE_IND class to avoid creating expectations that may not be met. Another assumption had to be made to work around USE bug 928 by assigning U+1171E AHOM CONSONANT SIGN MEDIAL RA to subclass CONS_MED_PRE.
USE subclasses that the USE merges into one “sigla” are merged into one representative subclass. For example, VOWEL_PRE_ABOVE, VOWEL_PRE_ABOVE_POST, VOWEL_PRE_POST are merged into VOWEL_PRE.
The seven regular expressions defining cluster structures in the USE documentation were reduced to three. The USE’s independent cluster is not relevant because no characters of the scripts covered here that fall into the classes starting such clusters allow variation selectors. Instead, a list of characters that can not be part of clusters and are therefore treated as standalone is provided. The standard cluster and virama-terminated cluster were merged into one regular cluster, as they share a long common start, followed by a long or short tail. These tails show up as alternatives in the tables provided. The number-joiner terminated cluster and the numeral cluster were merged into one special cluster, as the only difference is an optional number-joiner. The symbol cluster remains as is, but is only shown for Balinese, the only script where the cluster can be longer than one character. For all other scripts, SYM characters are lumped into the list of characters that can not be part of clusters. The hieroglyph cluster is not relevant for Brahmic scripts.
For each script, all characters used are sorted into their USE classes and subclasses, and inserted into the three regular expressions. Empty classes and subclasses are removed, empty subexpressions are removed, and empty regular expressions are removed. The remainder is formatted into tables whose format is derived from the one used to describe encoding orders in the Unicode Standard, extended to preserve the structural information of the underlying regular expressions.
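The pruning step can be sketched as follows. This is a simplified illustration, not the code used to generate the tables: `TEMPLATE` is a hypothetical, heavily shortened stand-in for one of the regular expressions, `build_pattern` is an invented helper, and the Hanunoo data is copied from the Hanunoo table in this document.

```python
import re

# Hypothetical, shortened template: (class name, quantifier) in encoding order.
TEMPLATE = [
    ("BASE", ""),
    ("CONS_MOD_ABOVE", "*"),
    ("VOWEL_PRE", "*"),
    ("VOWEL_ABOVE", "*"),
    ("VOWEL_BELOW", "*"),
    ("VOWEL_POST", "*"),
]

# Code points per class, taken from the Hanunoo table.
HANUNOO = {
    "BASE": list(range(0x1720, 0x1732)) + [0x25CC],
    "VOWEL_ABOVE": [0x1732],
    "VOWEL_BELOW": [0x1733],
    "VOWEL_POST": [0x1734],
}


def build_pattern(template, classes):
    """Insert a script's characters into the template, dropping empty classes."""
    parts = []
    for name, quantifier in template:
        code_points = classes.get(name, [])
        if not code_points:
            continue  # a class without characters for this script is removed
        chars = "".join(re.escape(chr(cp)) for cp in sorted(code_points))
        parts.append(f"[{chars}]{quantifier}")
    return "".join(parts)


HANUNOO_PATTERN = build_pattern(TEMPLATE, HANUNOO)
```

For Hanunoo, the CONS_MOD_ABOVE and VOWEL_PRE entries of the template have no characters and disappear, leaving the four rows shown in the Hanunoo table.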
Caveats
The precise cluster structures used by implementations of the USE can differ somewhat. While the USE’s generic Brahmic cluster model works reasonably well for most of the scripts discussed here, some scripts require adjustments, which may have been made differently (or not at all) in different implementations.
As the USE serves many different scripts, the encoding order for any particular script will allow many character sequences that don’t make sense for that particular script. That’s OK – an encoding order is no substitute for a spelling checker.
The USE occasionally places characters into classes where they don’t belong linguistically or graphically. One example is visible viramas, which it classifies as dependent vowels because they occur in place of such vowels in the encoding order. Other cases are often motivated by the need to enable a character sequence that occurs in real life but wouldn’t be allowed by a strict interpretation of the USE cluster model and the underlying Unicode character data. One such example occurs in Chakma, where above-base and below-base vowels were swapped in order to match the canonical decomposition of two vowels.
Some of the tables in this document include characters that are used in the particular script, but belong to a different one, and classify them as the USE would do. However, before text reaches a shaping engine, OpenType systems will break it into script runs. How they do that is one of the undocumented mysteries of OpenType, but the result may be that such adopted characters can't form clusters together with characters from the script in whose table they appear.
The characters U+200D ZERO WIDTH JOINER and U+200C ZERO WIDTH NON-JOINER are used in numerous scripts, sometimes for specific purposes described in the Unicode Standard, sometimes for their generic purposes of requesting or breaking up ligatures (including conjunct forms). The USE allows them anywhere in a cluster, so they’re not documented in the cluster structure. See the USE sections Zero-width joiner and Zero-width non-joiner for details.
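Because the joiners are left out of the tables, code that checks text against a table-derived pattern has to account for them separately. One simple approach, sketched below under the assumption stated above that the USE accepts them anywhere in a cluster, is to remove them before matching. The helper names and the shortened Balinese pattern (base letters joined by U+1B44 only) are illustrative, not part of the USE.

```python
import re

# U+200C ZERO WIDTH NON-JOINER and U+200D ZERO WIDTH JOINER.
JOINERS = "\u200C\u200D"


def strip_joiners(text: str) -> str:
    """Remove ZWNJ and ZWJ so that table-derived patterns can be applied."""
    return text.translate({ord(ch): None for ch in JOINERS})


def matches_ignoring_joiners(pattern: re.Pattern, text: str) -> bool:
    """Match the whole text against the pattern after removing joiners."""
    return pattern.fullmatch(strip_joiners(text)) is not None


# Illustrative fragment of the Balinese table: BASE (HALANT BASE)*.
BALINESE_CONJUNCT = re.compile("[\u1B05-\u1B33](?:\u1B44[\u1B05-\u1B33])*")
```

Stripping is only suitable for validation. The joiners still have to be kept in the stored text, since they influence rendering.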
How to report issues
You might find some problem in the encoding orders documented in this article. As it combines information from multiple sources, issues should be reported where they can be fixed:
- If your script uses a character that is not encoded in Unicode yet, you can propose it for encoding.
- If your script uses a character that is encoded in Unicode, but not shown in the encoding order table for the script, you can propose to the UTC that it be added through the Script_Extensions property. A good example of such a proposal is the one for adding the Ardhavisarga character to the Oriya script (L2/18-330).
- If your script uses a character sequence which isn’t allowed by the encoding order, scroll to the bottom of the USE documentation to submit feedback.
- If you find issues in the text of this article, or can’t determine which of the three cases above applies, contact the author through the About page.
Acknowledgments
I’d like to thank:
- Andrew Glass for specifying the Universal Shaping Engine, without which this article would not have been possible.
- Roozbeh Pournader for collecting the script exemplars data, and for quickly addressing the issues I found.
- Many contributors for writing script proposals, and the few who turn these proposals into the Unicode Standard, including the character properties underlying the USE.
- Many type designers and font engineers for creating, and Google for sponsoring, the Noto font family, which makes it possible to show actual characters in this article.
- Marc Durdin, Muthu Nedumaran, and Simon Cozens for providing feedback on this article.
References
Norbert Lindenberg: Issues in Khmer syllable validation. Lindenberg Software LLC, 2019.
Norbert Lindenberg: Issues in Devanagari cluster validation. Lindenberg Software LLC, 2020.
Norbert Lindenberg: Implementing Javanese. The Unicode Consortium, 2022.
Microsoft Corporation: Creating and supporting OpenType fonts for the Universal Shaping Engine. Microsoft Corporation, dated 2022-10-01.
Microsoft Corporation: Microsoft Font-tools. GitHub, as of 2022-09-21. Includes data tables for the Universal Shaping Engine.
Roozbeh Pournader: unicode-data. GitHub, as of 2023-10-24. Includes script exemplar data.
The Unicode Consortium: The Unicode Standard, Version 15.1. The Unicode Consortium, 2023.
𑽇𑽎𑽇𑽇𑽎𑽇𑽇𑽎𑽇𑽇𑽎𑽇𑽇𑽎𑽇
Ahom
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE, BASE_OTHER | ◌ 𑜀 𑜁 𑜂 𑜃 𑜄 𑜅 𑜆 𑜇 𑜈 𑜉 𑜊 𑜋 𑜌 𑜍 𑜎 𑜏 𑜐 𑜑 𑜒 𑜓 𑜔 𑜕 𑜖 𑜗 𑜘 𑜙 𑜚 𑜰 𑜱 𑜲 𑜳 𑜴 𑜵 𑜶 𑜷 𑜸 𑜹 𑜺 𑜻 𑝀 𑝁 𑝂 𑝃 𑝄 𑝅 𑝆 | [U+25CC, U+11700..U+1171A, U+11730..U+1173B, U+11740..U+11746, U+00A0] | 1 | |||
CONS_MED_PRE | ◌𑜞 | U+1171E | 0 or 1 | |||
CONS_MED_ABOVE | ◌𑜟 | U+1171F | 0 or 1 | |||
CONS_MED_BELOW | ◌𑜝 | U+1171D | 0 or 1 | |||
VOWEL_PRE | ◌𑜦 | U+11726 | 0 or more | |||
VOWEL_ABOVE | ◌𑜢 ◌𑜣 ◌𑜧 ◌𑜩 ◌𑜪 ◌𑜫 | [U+11722..U+11723, U+11727, U+11729..U+1172B] | 0 or more | |||
VOWEL_BELOW | ◌𑜤 ◌𑜥 ◌𑜨 | [U+11724..U+11725, U+11728] | 0 or more | |||
VOWEL_POST | ◌𑜠 ◌𑜡 | [U+11720..U+11721] | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
𑜼 𑜽 𑜾 𑜿 | [U+0020, U+1173C..U+1173F] |
Known bugs:
- Missing CONS_MED subclasses for some Indic_Positional_Category values (affects U+1171E)
Balinese
For the Balinese script, the USE supports two cluster structures: a regular one for the normal orthographic syllables used for writing normal languages, and a special one for the combinations of musical symbols and combining marks used for writing musical scores. Several notations using these symbols are described in Unicode Technical Note 51 Musical symbols and Sasak characters in the Balinese script.
The USE decomposes the following multi-part vowels, which therefore don’t show up in the encoding order:
Composed Character | Composed Encoding | Decomposed Characters | Decomposed Encoding |
---|---|---|---|
◌ᬻ | U+1B3B | ◌ᬺ ◌ᬵ | <U+1B3A, U+1B35> |
◌ᬽ | U+1B3D | ◌ᬼ ◌ᬵ | <U+1B3C, U+1B35> |
◌ᭀ | U+1B40 | ◌ᬾ ◌ᬵ | <U+1B3E, U+1B35> |
◌ᭁ | U+1B41 | ◌ᬿ ◌ᬵ | <U+1B3F, U+1B35> |
◌ᭃ | U+1B43 | ◌ᭂ ◌ᬵ | <U+1B42, U+1B35> |
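A text generator that wants its output to match the encoding order directly could apply the same decompositions before validating. The sketch below copies the five rows of the table above into a lookup table. The names `BALINESE_DECOMPOSITIONS` and `decompose_balinese` are invented for this example, and whether to store decomposed text at all is a decision left to the implementer.

```python
# Multi-part vowels and their two-part equivalents, copied from the table above.
BALINESE_DECOMPOSITIONS = {
    "\u1B3B": "\u1B3A\u1B35",
    "\u1B3D": "\u1B3C\u1B35",
    "\u1B40": "\u1B3E\u1B35",
    "\u1B41": "\u1B3F\u1B35",
    "\u1B43": "\u1B42\u1B35",
}

# str.maketrans accepts single-character keys mapped to replacement strings.
_TRANSLATION = str.maketrans(BALINESE_DECOMPOSITIONS)


def decompose_balinese(text: str) -> str:
    """Replace each composed multi-part vowel with its decomposed sequence."""
    return text.translate(_TRANSLATION)
```

For example, `decompose_balinese("\u1B13\u1B40")` returns the base letter followed by U+1B3E and U+1B35, which fits the VOWEL_PRE and VOWEL_POST rows of the encoding order.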
Encoding order:
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ᬅ ᬆ ᬇ ᬈ ᬉ ᬊ ᬋ ᬌ ᬍ ᬎ ᬏ ᬐ ᬑ ᬒ ᬓ ᬔ ᬕ ᬖ ᬗ ᬘ ᬙ ᬚ ᬛ ᬜ ᬝ ᬞ ᬟ ᬠ ᬡ ᬢ ᬣ ᬤ ᬥ ᬦ ᬧ ᬨ ᬩ ᬪ ᬫ ᬬ ᬭ ᬮ ᬯ ᬰ ᬱ ᬲ ᬳ ᭅ ᭆ ᭇ ᭈ ᭉ ᭊ ᭋ ᭌ ᭐ ᭑ ᭒ ᭓ ᭔ ᭕ ᭖ ᭗ ᭘ ᭙ ◌ | [U+1B05..U+1B33, U+1B45..U+1B4C, U+1B50..U+1B59, U+25CC] | 1 | |||
CONS_MOD_ABOVE | ◌᬴ | U+1B34 | 0 or more | |||
Repeating group | 0 or more | |||||
HALANT | ◌᭄ | U+1B44 | 1 | |||
BASE | ᬅ ᬆ ᬇ ᬈ ᬉ ᬊ ᬋ ᬌ ᬍ ᬎ ᬏ ᬐ ᬑ ᬒ ᬓ ᬔ ᬕ ᬖ ᬗ ᬘ ᬙ ᬚ ᬛ ᬜ ᬝ ᬞ ᬟ ᬠ ᬡ ᬢ ᬣ ᬤ ᬥ ᬦ ᬧ ᬨ ᬩ ᬪ ᬫ ᬬ ᬭ ᬮ ᬯ ᬰ ᬱ ᬲ ᬳ ᭅ ᭆ ᭇ ᭈ ᭉ ᭊ ᭋ ᭌ ᭐ ᭑ ᭒ ᭓ ᭔ ᭕ ᭖ ᭗ ᭘ ᭙ ◌ | [U+1B05..U+1B33, U+1B45..U+1B4C, U+1B50..U+1B59, U+25CC] | 1 | |||
CONS_MOD_ABOVE | ◌᬴ | U+1B34 | 0 or more | |||
Alternative 1 | ||||||
HALANT | ◌᭄ | U+1B44 | 1 | |||
Alternative 2 | ||||||
VOWEL_PRE | ◌ᬾ ◌ᬿ | [U+1B3E..U+1B3F] | 0 or more | |||
VOWEL_ABOVE | ◌ᬶ ◌ᬷ ◌ᬼ ◌ᭂ | [U+1B36..U+1B37, U+1B3C, U+1B42] | 0 or more | |||
VOWEL_BELOW | ◌ᬸ ◌ᬹ ◌ᬺ | [U+1B38..U+1B3A] | 0 or more | |||
VOWEL_POST | ◌ᬵ | U+1B35 | 0 or more | |||
VOWEL_MOD_ABOVE | ◌ᬀ ◌ᬁ ◌ᬂ | [U+1B00..U+1B02] | 0 or more | |||
VOWEL_MOD_POST | ◌ᬄ | U+1B04 | 0 or more | |||
CONS_FINAL_ABOVE | ◌ᬃ | U+1B03 | 0 or more |
Special cluster structure for symbols:
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
SYM | ᭡ ᭢ ᭣ ᭤ ᭥ ᭦ ᭧ ᭨ ᭩ ᭪ ᭴ ᭵ ᭶ ᭷ ᭸ ᭹ ᭺ ᭻ ᭼ | [U+1B61..U+1B6A, U+1B74..U+1B7C] | 1 | |||
SYM_MOD_ABOVE | ◌᭫ ◌᭭ ◌᭮ ◌᭯ ◌᭰ ◌᭱ ◌᭲ ◌᭳ | [U+1B6B, U+1B6D..U+1B73] | 0 or more | |||
SYM_MOD_BELOW | ◌᭬ | U+1B6C | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
᭚ ᭛ ᭜ ᭝ ᭞ ᭟ ᭠ ᭽ ᭾ | [U+1B5A..U+1B60, U+1B7D..U+1B7E] |
Known bugs:
- Fully decomposed split vowels may occupy more than one position (affects U+1B3C)
- Problems in the “Split vowel handling” section (affects U+1B3C)
Batak
The Batak script has a feature that the USE does not support well: In a character sequence consonant-vowel-consonant-virama, the vowel and second consonant are displayed in reverse order. For example, the syllable stored as ᯖ ta, ◌ᯪ i, ᯇ pa, ◌᯲ virama has to be displayed as ᯖᯪᯇ᯲ tip. For the USE, however, this sequence consists of two clusters. For a way to implement this reordering in fonts, see Constructing fonts for the Batak script. Implementers of keyboards and other text producers should ensure that a Batak cluster in storage never contains both a vowel and a virama. If a user types both after the same consonant, the vowel has to be reordered before the consonant in storage.
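A keyboard or other text producer might implement this rule roughly as in the following sketch. It assumes the character classes of the Batak table in this section (U+1BF2 and U+1BF3 as viramas, U+1BE7..U+1BEF as vowels, U+1BE6 as a consonant modifier). The function name `reorder_batak` is hypothetical, and the sketch does not check whether a preceding consonant exists for the vowel to attach to.

```python
import re

BASE = "[\u1BC0-\u1BE5\u25CC]\u1BE6*"   # base letter plus optional CONS_MOD_ABOVE
VOWELS = "[\u1BE7-\u1BEF]+"             # VOWEL_ABOVE and VOWEL_POST signs
VIRAMA = "[\u1BF2\u1BF3]"               # the two viramas (CONS_MOD_BELOW in the USE)

_VOWEL_THEN_VIRAMA = re.compile(f"({BASE})({VOWELS})({VIRAMA})")
_VIRAMA_THEN_VOWEL = re.compile(f"({BASE})({VIRAMA})({VOWELS})")


def reorder_batak(text: str) -> str:
    """Move vowels typed after a virama-bearing consonant in front of it."""
    # consonant, vowel, virama  ->  vowel, consonant, virama
    text = _VOWEL_THEN_VIRAMA.sub(r"\2\1\3", text)
    # consonant, virama, vowel  ->  vowel, consonant, virama
    text = _VIRAMA_THEN_VOWEL.sub(r"\3\1\2", text)
    return text
```

With the example above, a user typing ta, pa, i, virama in visual order produces `"\u1BD6\u1BC7\u1BEA\u1BF2"`, which the function rewrites to the storage order `"\u1BD6\u1BEA\u1BC7\u1BF2"`. Text that is already in storage order is left unchanged.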
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ᯀ ᯁ ᯂ ᯃ ᯄ ᯅ ᯆ ᯇ ᯈ ᯉ ᯊ ᯋ ᯌ ᯍ ᯎ ᯏ ᯐ ᯑ ᯒ ᯓ ᯔ ᯕ ᯖ ᯗ ᯘ ᯙ ᯚ ᯛ ᯜ ᯝ ᯞ ᯟ ᯠ ᯡ ᯢ ᯣ ᯤ ᯥ ◌ | [U+1BC0..U+1BE5, U+25CC] | 1 | |||
CONS_MOD_ABOVE | ◌᯦ | U+1BE6 | 0 or more | |||
CONS_MOD_BELOW | ◌᯲ ◌᯳ | [U+1BF2..U+1BF3] | 0 or more | |||
VOWEL_ABOVE | ◌ᯨ ◌ᯩ ◌ᯭ ◌ᯯ | [U+1BE8..U+1BE9, U+1BED, U+1BEF] | 0 or more | |||
VOWEL_POST | ◌ᯧ ◌ᯪ ◌ᯫ ◌ᯬ ◌ᯮ | [U+1BE7, U+1BEA..U+1BEC, U+1BEE] | 0 or more | |||
CONS_FINAL_ABOVE | ◌ᯰ ◌ᯱ | [U+1BF0..U+1BF1] | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
᯼ ᯽ ᯾ ᯿ | [U+1BFC..U+1BFF] |
Bhaiksuki
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ◌ 𑰀 𑰁 𑰂 𑰃 𑰄 𑰅 𑰆 𑰇 𑰈 𑰊 𑰋 𑰌 𑰍 𑰎 𑰏 𑰐 𑰑 𑰒 𑰓 𑰔 𑰕 𑰖 𑰗 𑰘 𑰙 𑰚 𑰛 𑰜 𑰝 𑰞 𑰟 𑰠 𑰡 𑰢 𑰣 𑰤 𑰥 𑰦 𑰧 𑰨 𑰩 𑰪 𑰫 𑰬 𑰭 𑰮 𑱀 𑱐 𑱑 𑱒 𑱓 𑱔 𑱕 𑱖 𑱗 𑱘 𑱙 𑱚 𑱛 𑱜 𑱝 𑱞 𑱟 𑱠 𑱡 𑱢 𑱣 𑱤 𑱥 𑱦 𑱧 𑱨 𑱩 𑱪 𑱫 𑱬 | [U+25CC, U+11C00..U+11C08, U+11C0A..U+11C2E, U+11C40, U+11C50..U+11C6C] | 1 | |||
Repeating group | 0 or more | |||||
HALANT | ◌𑰿 | U+11C3F | 1 | |||
BASE | ◌ 𑰀 𑰁 𑰂 𑰃 𑰄 𑰅 𑰆 𑰇 𑰈 𑰊 𑰋 𑰌 𑰍 𑰎 𑰏 𑰐 𑰑 𑰒 𑰓 𑰔 𑰕 𑰖 𑰗 𑰘 𑰙 𑰚 𑰛 𑰜 𑰝 𑰞 𑰟 𑰠 𑰡 𑰢 𑰣 𑰤 𑰥 𑰦 𑰧 𑰨 𑰩 𑰪 𑰫 𑰬 𑰭 𑰮 𑱀 𑱐 𑱑 𑱒 𑱓 𑱔 𑱕 𑱖 𑱗 𑱘 𑱙 𑱚 𑱛 𑱜 𑱝 𑱞 𑱟 𑱠 𑱡 𑱢 𑱣 𑱤 𑱥 𑱦 𑱧 𑱨 𑱩 𑱪 𑱫 𑱬 | [U+25CC, U+11C00..U+11C08, U+11C0A..U+11C2E, U+11C40, U+11C50..U+11C6C] | 1 | |||
Alternative 1 | ||||||
HALANT | ◌𑰿 | U+11C3F | 1 | |||
Alternative 2 | ||||||
VOWEL_ABOVE | ◌𑰰 ◌𑰱 ◌𑰸 ◌𑰹 ◌𑰺 ◌𑰻 | [U+11C30..U+11C31, U+11C38..U+11C3B] | 0 or more | |||
VOWEL_BELOW | ◌𑰲 ◌𑰳 ◌𑰴 ◌𑰵 ◌𑰶 | [U+11C32..U+11C36] | 0 or more | |||
VOWEL_POST | ◌𑰯 | U+11C2F | 0 or more | |||
VOWEL_MOD_ABOVE | ◌𑰼 ◌𑰽 | [U+11C3C..U+11C3D] | 0 or more | |||
VOWEL_MOD_POST | ◌𑰾 | U+11C3E | 0 or more |
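The "Repeating group" and "Alternative" rows of this table can be read as the regular expression sketched below. It is a hand translation of the Bhaiksuki table for illustration, not the expression used by any USE implementation, and it does not account for joiners. The variable names are invented.

```python
import re

BASE = ("[\u25CC\U00011C00-\U00011C08\U00011C0A-\U00011C2E"
        "\U00011C40\U00011C50-\U00011C6C]")
HALANT = "\U00011C3F"

# Alternative 2: each class may occur zero or more times, in this order.
VOWEL_TAIL = (
    "[\U00011C30\U00011C31\U00011C38-\U00011C3B]*"  # VOWEL_ABOVE
    "[\U00011C32-\U00011C36]*"                      # VOWEL_BELOW
    "\U00011C2F*"                                   # VOWEL_POST
    "[\U00011C3C\U00011C3D]*"                       # VOWEL_MOD_ABOVE
    "\U00011C3E*"                                   # VOWEL_MOD_POST
)

# BASE, then zero or more (HALANT BASE) groups, then one of the alternatives:
# a final HALANT, or the (possibly empty) vowel tail.
BHAIKSUKI_CLUSTER = re.compile(
    f"{BASE}(?:{HALANT}{BASE})*(?:{HALANT}|{VOWEL_TAIL})"
)


def is_bhaiksuki_cluster(text: str) -> bool:
    """Return True if the whole string is exactly one regular cluster."""
    return BHAIKSUKI_CLUSTER.fullmatch(text) is not None
```

Because every class in Alternative 2 is optional, a bare base letter is also a complete cluster. A vowel followed by a halant is not, since the two alternatives exclude each other.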
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
𑱁 𑱂 𑱃 𑱄 𑱅 | [U+11C41..U+11C45] |
Brahmi
For the Brahmi script, the USE supports two cluster structures: a regular one for the normal orthographic syllables used for writing normal languages, and a special one for an old additive-multiplicative notation for numbers. This notation is described in section 14.1 “Brahmi” of The Unicode Standard.
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
CONS_WITH_STACKER | 𑀃 𑀄 | [U+11003..U+11004] | 0 or 1 | |||
BASE | ◌ 𑀅 𑀆 𑀇 𑀈 𑀉 𑀊 𑀋 𑀌 𑀍 𑀎 𑀏 𑀐 𑀑 𑀒 𑀓 𑀔 𑀕 𑀖 𑀗 𑀘 𑀙 𑀚 𑀛 𑀜 𑀝 𑀞 𑀟 𑀠 𑀡 𑀢 𑀣 𑀤 𑀥 𑀦 𑀧 𑀨 𑀩 𑀪 𑀫 𑀬 𑀭 𑀮 𑀯 𑀰 𑀱 𑀲 𑀳 𑀴 𑀵 𑀶 𑀷 𑁦 𑁧 𑁨 𑁩 𑁪 𑁫 𑁬 𑁭 𑁮 𑁯 𑁱 𑁲 𑁵 | [U+25CC, U+11005..U+11037, U+11066..U+1106F, U+11071..U+11072, U+11075] | 1 | |||
Repeating group | 0 or more | |||||
HALANT | ◌𑁆 | U+11046 | 1 | |||
BASE | ◌ 𑀅 𑀆 𑀇 𑀈 𑀉 𑀊 𑀋 𑀌 𑀍 𑀎 𑀏 𑀐 𑀑 𑀒 𑀓 𑀔 𑀕 𑀖 𑀗 𑀘 𑀙 𑀚 𑀛 𑀜 𑀝 𑀞 𑀟 𑀠 𑀡 𑀢 𑀣 𑀤 𑀥 𑀦 𑀧 𑀨 𑀩 𑀪 𑀫 𑀬 𑀭 𑀮 𑀯 𑀰 𑀱 𑀲 𑀳 𑀴 𑀵 𑀶 𑀷 𑁦 𑁧 𑁨 𑁩 𑁪 𑁫 𑁬 𑁭 𑁮 𑁯 𑁱 𑁲 𑁵 | [U+25CC, U+11005..U+11037, U+11066..U+1106F, U+11071..U+11072, U+11075] | 1 | |||
Alternative 1 | ||||||
HALANT | ◌𑁆 | U+11046 | 1 | |||
Alternative 2 | ||||||
VOWEL_ABOVE | ◌𑀸 ◌𑀹 ◌𑀺 ◌𑀻 ◌𑁂 ◌𑁃 ◌𑁄 ◌𑁅 ◌𑁰 ◌𑁳 ◌𑁴 | [U+11038..U+1103B, U+11042..U+11045, U+11070, U+11073..U+11074] | 0 or more | |||
VOWEL_BELOW | ◌𑀼 ◌𑀽 ◌𑀾 ◌𑀿 ◌𑁀 ◌𑁁 | [U+1103C..U+11041] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌𑀁 | U+11001 | 0 or more | |||
VOWEL_MOD_POST | ◌𑀀 ◌𑀂 | [U+11000, U+11002] | 0 or more |
The USE does not allow the following character sequences because there are equivalent individual characters:
Character sequence | Encoding |
---|---|
𑀅 ◌𑀸 | <U+11005, U+11038> |
𑀋 ◌𑀾 | <U+1100B, U+1103E> |
𑀏 ◌𑁂 | <U+1100F, U+11042> |
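A spelling checker or input method could flag these sequences before they reach a shaping engine. The sketch below only detects them. It deliberately does not replace them, because the table above lists the disallowed sequences but not their equivalents. The names `BRAHMI_DISALLOWED` and `find_disallowed` are hypothetical.

```python
# The three sequences from the table above.
BRAHMI_DISALLOWED = (
    "\U00011005\U00011038",
    "\U0001100B\U0001103E",
    "\U0001100F\U00011042",
)


def find_disallowed(text: str):
    """Return sorted (index, sequence) pairs for each disallowed sequence found.

    Indexes count code points, as Python strings do.
    """
    hits = []
    for sequence in BRAHMI_DISALLOWED:
        start = text.find(sequence)
        while start != -1:
            hits.append((start, sequence))
            start = text.find(sequence, start + 1)
    return sorted(hits)
```

An empty result means that none of the three sequences occurs in the text. It says nothing about whether the text is otherwise well-formed.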
Special cluster structure for numerals:
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE_NUM | 𑁒 𑁓 𑁔 𑁕 𑁖 𑁗 𑁘 𑁙 𑁚 𑁛 𑁜 𑁝 𑁞 𑁟 𑁠 𑁡 𑁢 𑁣 𑁤 𑁥 | [U+11052..U+11065] | 1 | |||
Repeating group | 0 or more | |||||
HALANT_NUM | ◌𑁿 | U+1107F | 1 | |||
BASE_NUM | 𑁒 𑁓 𑁔 𑁕 𑁖 𑁗 𑁘 𑁙 𑁚 𑁛 𑁜 𑁝 𑁞 𑁟 𑁠 𑁡 𑁢 𑁣 𑁤 𑁥 | [U+11052..U+11065] | 1 | |||
HALANT_NUM | ◌𑁿 | U+1107F | 0 or 1 |
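Read as a regular expression, the numeral table above describes one BASE_NUM, any number of (HALANT_NUM, BASE_NUM) pairs, and an optional trailing HALANT_NUM. The following sketch is a hand translation for illustration. The names are invented, and it makes no attempt to check that the numbers combined are arithmetically meaningful.

```python
import re

BASE_NUM = "[\U00011052-\U00011065]"   # Brahmi number characters from the table
HALANT_NUM = "\U0001107F"              # U+1107F, the number joiner

BRAHMI_NUMERAL_CLUSTER = re.compile(
    f"{BASE_NUM}(?:{HALANT_NUM}{BASE_NUM})*{HALANT_NUM}?"
)


def is_numeral_cluster(text: str) -> bool:
    """Return True if the whole string is exactly one numeral cluster."""
    return BRAHMI_NUMERAL_CLUSTER.fullmatch(text) is not None
```

Two number characters without a joiner between them form two separate clusters, so `is_numeral_cluster` rejects them as a single cluster.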
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
𑁇 𑁈 𑁉 𑁊 𑁋 𑁌 𑁍 | [U+11047..U+1104D] |
Buginese (Lontara’)
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE, BASE_OTHER | ᨀ ᨁ ᨂ ᨃ ᨄ ᨅ ᨆ ᨇ ᨈ ᨉ ᨊ ᨋ ᨌ ᨍ ᨎ ᨏ ᨐ ᨑ ᨒ ᨓ ᨔ ᨕ ᨖ ◌ | [U+1A00..U+1A16, U+25CC, U+00A0] | 1 | |||
VOWEL_PRE | ◌ᨙ | U+1A19 | 0 or more | |||
VOWEL_ABOVE | ◌ᨗ ◌ᨘ ◌ᨛ | [U+1A17..U+1A18, U+1A1B] | 0 or more | |||
VOWEL_POST | ◌ᨚ | U+1A1A | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
᨞ ᨟ ꧏ | [U+0020, U+1A1E..U+1A1F, U+A9CF] |
Buhid
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ᝀ ᝁ ᝂ ᝃ ᝄ ᝅ ᝆ ᝇ ᝈ ᝉ ᝊ ᝋ ᝌ ᝍ ᝎ ᝏ ᝐ ᝑ ◌ | [U+1740..U+1751, U+25CC] | 1 | |||
VOWEL_ABOVE | ◌ᝒ | U+1752 | 0 or more | |||
VOWEL_BELOW | ◌ᝓ | U+1753 | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
᜵ ᜶ | [U+1735..U+1736] |
Chakma
The USE decomposes the following multi-part vowels, which therefore don’t show up in the encoding order:
Composed Character | Composed Encoding | Decomposed Characters | Decomposed Encoding |
---|---|---|---|
◌𑄮 | U+1112E | ◌𑄱 ◌𑄧 | <U+11131, U+11127> |
◌𑄯 | U+1112F | ◌𑄲 ◌𑄧 | <U+11132, U+11127> |
Encoding order:
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ০ ১ ২ ৩ ৪ ৫ ৬ ৭ ৮ ৯ ၀ ၁ ၂ ၃ ၄ ၅ ၆ ၇ ၈ ၉ ◌ 𑄃 𑄄 𑄅 𑄆 𑄇 𑄈 𑄉 𑄊 𑄋 𑄌 𑄍 𑄎 𑄏 𑄐 𑄑 𑄒 𑄓 𑄔 𑄕 𑄖 𑄗 𑄘 𑄙 𑄚 𑄛 𑄜 𑄝 𑄞 𑄟 𑄠 𑄡 𑄢 𑄣 𑄤 𑄥 𑄦 𑄶 𑄷 𑄸 𑄹 𑄺 𑄻 𑄼 𑄽 𑄾 𑄿 𑅄 𑅇 | [U+09E6..U+09EF, U+1040..U+1049, U+25CC, U+11103..U+11126, U+11136..U+1113F, U+11144, U+11147] | 1 | |||
CONS_MOD_ABOVE | ◌𑄴 | U+11134 | 0 or more | |||
Repeating group | 0 or more | |||||
HALANT | ◌𑄳 | U+11133 | 1 | |||
BASE | ০ ১ ২ ৩ ৪ ৫ ৬ ৭ ৮ ৯ ၀ ၁ ၂ ၃ ၄ ၅ ၆ ၇ ၈ ၉ ◌ 𑄃 𑄄 𑄅 𑄆 𑄇 𑄈 𑄉 𑄊 𑄋 𑄌 𑄍 𑄎 𑄏 𑄐 𑄑 𑄒 𑄓 𑄔 𑄕 𑄖 𑄗 𑄘 𑄙 𑄚 𑄛 𑄜 𑄝 𑄞 𑄟 𑄠 𑄡 𑄢 𑄣 𑄤 𑄥 𑄦 𑄶 𑄷 𑄸 𑄹 𑄺 𑄻 𑄼 𑄽 𑄾 𑄿 𑅄 𑅇 | [U+09E6..U+09EF, U+1040..U+1049, U+25CC, U+11103..U+11126, U+11136..U+1113F, U+11144, U+11147] | 1 | |||
CONS_MOD_ABOVE | ◌𑄴 | U+11134 | 0 or more | |||
Alternative 1 | ||||||
HALANT | ◌𑄳 | U+11133 | 1 | |||
Alternative 2 | ||||||
VOWEL_PRE | ◌𑄬 | U+1112C | 0 or more | |||
VOWEL_ABOVE | ◌𑄪 ◌𑄫 ◌𑄱 ◌𑄲 | [U+1112A..U+1112B, U+11131..U+11132] | 0 or more | |||
VOWEL_BELOW | ◌𑄧 ◌𑄨 ◌𑄩 ◌𑄭 ◌𑄰 | [U+11127..U+11129, U+1112D, U+11130] | 0 or more | |||
VOWEL_POST | ◌𑅅 ◌𑅆 | [U+11145..U+11146] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌𑄀 ◌𑄁 ◌𑄂 | [U+11100..U+11102] | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
𑅀 𑅁 𑅂 𑅃 | [U+11140..U+11143] |
(Eastern) Cham
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE, BASE_OTHER | 0 1 2 3 4 5 6 7 8 9 ◌ ꨀ ꨁ ꨂ ꨃ ꨄ ꨅ ꨆ ꨇ ꨈ ꨉ ꨊ ꨋ ꨌ ꨍ ꨎ ꨏ ꨐ ꨑ ꨒ ꨓ ꨔ ꨕ ꨖ ꨗ ꨘ ꨙ ꨚ ꨛ ꨜ ꨝ ꨞ ꨟ ꨠ ꨡ ꨢ ꨣ ꨤ ꨥ ꨦ ꨧ ꨨ ꩀ ꩁ ꩂ ꩄ ꩅ ꩆ ꩇ ꩈ ꩉ ꩊ ꩋ ꩐ ꩑ ꩒ ꩓ ꩔ ꩕ ꩖ ꩗ ꩘ ꩙ - ‐ ‑ | [U+0030..U+0039, U+25CC, U+AA00..U+AA28, U+AA40..U+AA42, U+AA44..U+AA4B, U+AA50..U+AA59, U+002D, U+00A0, U+2010..U+2011] | 1 | |||
CONS_MED_PRE | ◌ꨴ | U+AA34 | 0 or 1 | |||
CONS_MED_ABOVE | ◌ꨵ | U+AA35 | 0 or 1 | |||
CONS_MED_BELOW | ◌ꨶ | U+AA36 | 0 or 1 | |||
CONS_MED_POST | ◌ꨳ | U+AA33 | 0 or 1 | |||
VOWEL_PRE | ◌ꨯ ◌ꨰ | [U+AA2F..U+AA30] | 0 or more | |||
VOWEL_ABOVE | ◌ꨪ ◌ꨫ ◌ꨬ ◌ꨮ ◌ꨱ | [U+AA2A..U+AA2C, U+AA2E, U+AA31] | 0 or more | |||
VOWEL_BELOW | ◌ꨭ ◌ꨲ | [U+AA2D, U+AA32] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌ꨩ | U+AA29 | 0 or more | |||
CONS_FINAL_ABOVE | ◌ꩃ ◌ꩌ | [U+AA43, U+AA4C] | 0 or more | |||
CONS_FINAL_POST | ◌ꩍ | U+AA4D | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
: ? ꩜ ꩝ ꩞ ꩟ | [U+0020, U+003A, U+003F, U+AA5C..U+AA5F] |
Dives Akuru
The USE decomposes the following multi-part vowel, which therefore doesn’t show up in the encoding order:
Composed Character | Composed Encoding | Decomposed Characters | Decomposed Encoding |
---|---|---|---|
◌𑤸 | U+11938 | ◌𑤵 ◌𑤰 | <U+11935, U+11930> |
Encoding order:
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
REPHA | 𑤿◌ 𑥁◌ | [U+1193F, U+11941] | 0 or 1 | |||
BASE | ◌ 𑤀 𑤁 𑤂 𑤃 𑤄 𑤅 𑤆 𑤉 𑤌 𑤍 𑤎 𑤏 𑤐 𑤑 𑤒 𑤓 𑤕 𑤖 𑤘 𑤙 𑤚 𑤛 𑤜 𑤝 𑤞 𑤟 𑤠 𑤡 𑤢 𑤣 𑤤 𑤥 𑤦 𑤧 𑤨 𑤩 𑤪 𑤫 𑤬 𑤭 𑤮 𑤯 𑥐 𑥑 𑥒 𑥓 𑥔 𑥕 𑥖 𑥗 𑥘 𑥙 | [U+25CC, U+11900..U+11906, U+11909, U+1190C..U+11913, U+11915..U+11916, U+11918..U+1192F, U+11950..U+11959] | 1 | |||
CONS_MOD_BELOW | ◌𑥃 | U+11943 | 0 or more | |||
Repeating group | 0 or more | |||||
HALANT | ◌𑤾 | U+1193E | 1 | |||
BASE | ◌ 𑤀 𑤁 𑤂 𑤃 𑤄 𑤅 𑤆 𑤉 𑤌 𑤍 𑤎 𑤏 𑤐 𑤑 𑤒 𑤓 𑤕 𑤖 𑤘 𑤙 𑤚 𑤛 𑤜 𑤝 𑤞 𑤟 𑤠 𑤡 𑤢 𑤣 𑤤 𑤥 𑤦 𑤧 𑤨 𑤩 𑤪 𑤫 𑤬 𑤭 𑤮 𑤯 𑥐 𑥑 𑥒 𑥓 𑥔 𑥕 𑥖 𑥗 𑥘 𑥙 | [U+25CC, U+11900..U+11906, U+11909, U+1190C..U+11913, U+11915..U+11916, U+11918..U+1192F, U+11950..U+11959] | 1 | |||
CONS_MOD_BELOW | ◌𑥃 | U+11943 | 0 or more | |||
Alternative 1 | ||||||
HALANT | ◌𑤾 | U+1193E | 1 | |||
Alternative 2 | ||||||
CONS_MED_POST | ◌𑥀 ◌𑥂 | [U+11940, U+11942] | 0 or 1 | |||
VOWEL_PRE | ◌𑤵 ◌𑤷 | [U+11935, U+11937] | 0 or more | |||
VOWEL_POST | ◌𑤰 ◌𑤱 ◌𑤲 ◌𑤳 ◌𑤴 ◌𑤽 | [U+11930..U+11934, U+1193D] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌𑤻 ◌𑤼 | [U+1193B..U+1193C] | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
𑥄 𑥅 𑥆 | [U+11944..U+11946] |
Known bugs:
- Double REPHA in Dives Akuru (affects U+1193F, U+11941)
Dogra
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ० १ २ ३ ४ ५ ६ ७ ८ ९ ◌ 𑠀 𑠁 𑠂 𑠃 𑠄 𑠅 𑠆 𑠇 𑠈 𑠉 𑠊 𑠋 𑠌 𑠍 𑠎 𑠏 𑠐 𑠑 𑠒 𑠓 𑠔 𑠕 𑠖 𑠗 𑠘 𑠙 𑠚 𑠛 𑠜 𑠝 𑠞 𑠟 𑠠 𑠡 𑠢 𑠣 𑠤 𑠥 𑠦 𑠧 𑠨 𑠩 𑠪 𑠫 | [U+0966..U+096F, U+25CC, U+11800..U+1182B] | 1 | |||
CONS_MOD_BELOW | ◌𑠺 | U+1183A | 0 or more | |||
Repeating group | 0 or more | |||||
HALANT | ◌𑠹 | U+11839 | 1 | |||
BASE | ० १ २ ३ ४ ५ ६ ७ ८ ९ ◌ 𑠀 𑠁 𑠂 𑠃 𑠄 𑠅 𑠆 𑠇 𑠈 𑠉 𑠊 𑠋 𑠌 𑠍 𑠎 𑠏 𑠐 𑠑 𑠒 𑠓 𑠔 𑠕 𑠖 𑠗 𑠘 𑠙 𑠚 𑠛 𑠜 𑠝 𑠞 𑠟 𑠠 𑠡 𑠢 𑠣 𑠤 𑠥 𑠦 𑠧 𑠨 𑠩 𑠪 𑠫 | [U+0966..U+096F, U+25CC, U+11800..U+1182B] | 1 | |||
CONS_MOD_BELOW | ◌𑠺 | U+1183A | 0 or more | |||
Alternative 1 | ||||||
HALANT | ◌𑠹 | U+11839 | 1 | |||
Alternative 2 | ||||||
VOWEL_PRE | ◌𑠭 | U+1182D | 0 or more | |||
VOWEL_ABOVE | ◌𑠳 ◌𑠴 ◌𑠵 ◌𑠶 | [U+11833..U+11836] | 0 or more | |||
VOWEL_BELOW | ◌𑠯 ◌𑠰 ◌𑠱 ◌𑠲 | [U+1182F..U+11832] | 0 or more | |||
VOWEL_POST | ◌𑠬 ◌𑠮 | [U+1182C, U+1182E] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌𑠷 | U+11837 | 0 or more | |||
VOWEL_MOD_POST | ◌𑠸 | U+11838 | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
। ॥ ꠰ ꠱ ꠲ ꠳ ꠴ ꠵ ꠶ ꠷ ꠸ ꠹ 𑠻 | [U+0964..U+0965, U+A830..U+A839, U+1183B] |
Grantha
The USE decomposes the following multi-part vowels, which therefore don’t show up in the encoding order:
Composed Character | Composed Encoding | Decomposed Characters | Decomposed Encoding |
---|---|---|---|
◌𑍋 | U+1134B | ◌𑍇 ◌𑌾 | <U+11347, U+1133E> |
◌𑍌 | U+1134C | ◌𑍇 ◌𑍗 | <U+11347, U+11357> |
Encoding order:
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ௦ ௧ ௨ ௩ ௪ ௫ ௬ ௭ ௮ ௯ ◌ 𑌅 𑌆 𑌇 𑌈 𑌉 𑌊 𑌋 𑌌 𑌏 𑌐 𑌓 𑌔 𑌕 𑌖 𑌗 𑌘 𑌙 𑌚 𑌛 𑌜 𑌝 𑌞 𑌟 𑌠 𑌡 𑌢 𑌣 𑌤 𑌥 𑌦 𑌧 𑌨 𑌪 𑌫 𑌬 𑌭 𑌮 𑌯 𑌰 𑌲 𑌳 𑌵 𑌶 𑌷 𑌸 𑌹 𑌽 𑍞 𑍟 𑍠 𑍡 | [U+0BE6..U+0BEF, U+25CC, U+11305..U+1130C, U+1130F..U+11310, U+11313..U+11328, U+1132A..U+11330, U+11332..U+11333, U+11335..U+11339, U+1133D, U+1135E..U+11361] | 1 | |||
CONS_MOD_BELOW | ◌𑌻 ◌𑌼 | [U+1133B..U+1133C] | 0 or more | |||
Repeating group | 0 or more | |||||
HALANT | ◌𑍍 | U+1134D | 1 | |||
BASE | ௦ ௧ ௨ ௩ ௪ ௫ ௬ ௭ ௮ ௯ ◌ 𑌅 𑌆 𑌇 𑌈 𑌉 𑌊 𑌋 𑌌 𑌏 𑌐 𑌓 𑌔 𑌕 𑌖 𑌗 𑌘 𑌙 𑌚 𑌛 𑌜 𑌝 𑌞 𑌟 𑌠 𑌡 𑌢 𑌣 𑌤 𑌥 𑌦 𑌧 𑌨 𑌪 𑌫 𑌬 𑌭 𑌮 𑌯 𑌰 𑌲 𑌳 𑌵 𑌶 𑌷 𑌸 𑌹 𑌽 𑍞 𑍟 𑍠 𑍡 | [U+0BE6..U+0BEF, U+25CC, U+11305..U+1130C, U+1130F..U+11310, U+11313..U+11328, U+1132A..U+11330, U+11332..U+11333, U+11335..U+11339, U+1133D, U+1135E..U+11361] | 1 | |||
CONS_MOD_BELOW | ◌𑌻 ◌𑌼 | [U+1133B..U+1133C] | 0 or more | |||
Alternative 1 | ||||||
HALANT | ◌𑍍 | U+1134D | 1 | |||
Alternative 2 | ||||||
VOWEL_PRE | ◌𑍇 ◌𑍈 | [U+11347..U+11348] | 0 or more | |||
VOWEL_ABOVE | ◌𑍀 | U+11340 | 0 or more | |||
VOWEL_POST | ◌𑌾 ◌𑌿 ◌𑍁 ◌𑍂 ◌𑍃 ◌𑍄 ◌𑍗 ◌𑍢 ◌𑍣 | [U+1133E..U+1133F, U+11341..U+11344, U+11357, U+11362..U+11363] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌॑ ◌᳐ ◌᳒ ◌᳴ ◌᳸ ◌᳹ ◌⃰ ◌𑌀 ◌𑌁 ◌𑍦 ◌𑍧 ◌𑍨 ◌𑍩 ◌𑍪 ◌𑍫 ◌𑍬 ◌𑍰 ◌𑍱 ◌𑍲 ◌𑍳 ◌𑍴 | [U+0951, U+1CD0, U+1CD2, U+1CF4, U+1CF8..U+1CF9, U+20F0, U+11300..U+11301, U+11366..U+1136C, U+11370..U+11374] | 0 or more | |||
VOWEL_MOD_BELOW | ◌॒ | U+0952 | 0 or more | |||
VOWEL_MOD_POST | ◌𑌂 ◌𑌃 | [U+11302..U+11303] | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
। ॥ ௰ ௱ ௲ ௳ ᳓ ᳲ ᳳ 𑍐 𑍝 𑿐 𑿑 𑿓 | [U+0964..U+0965, U+0BF0..U+0BF3, U+1CD3, U+1CF2..U+1CF3, U+11350, U+1135D, U+11FD0..U+11FD1, U+11FD3] |
Gunjala Gondi
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ◌ 𑵠 𑵡 𑵢 𑵣 𑵤 𑵥 𑵧 𑵨 𑵪 𑵫 𑵬 𑵭 𑵮 𑵯 𑵰 𑵱 𑵲 𑵳 𑵴 𑵵 𑵶 𑵷 𑵸 𑵹 𑵺 𑵻 𑵼 𑵽 𑵾 𑵿 𑶀 𑶁 𑶂 𑶃 𑶄 𑶅 𑶆 𑶇 𑶈 𑶉 𑶠 𑶡 𑶢 𑶣 𑶤 𑶥 𑶦 𑶧 𑶨 𑶩 | [U+25CC, U+11D60..U+11D65, U+11D67..U+11D68, U+11D6A..U+11D89, U+11DA0..U+11DA9] | 1 | |||
Repeating group | 0 or more | |||||
HALANT | ◌𑶗 | U+11D97 | 1 | |||
BASE | ◌ 𑵠 𑵡 𑵢 𑵣 𑵤 𑵥 𑵧 𑵨 𑵪 𑵫 𑵬 𑵭 𑵮 𑵯 𑵰 𑵱 𑵲 𑵳 𑵴 𑵵 𑵶 𑵷 𑵸 𑵹 𑵺 𑵻 𑵼 𑵽 𑵾 𑵿 𑶀 𑶁 𑶂 𑶃 𑶄 𑶅 𑶆 𑶇 𑶈 𑶉 𑶠 𑶡 𑶢 𑶣 𑶤 𑶥 𑶦 𑶧 𑶨 𑶩 | [U+25CC, U+11D60..U+11D65, U+11D67..U+11D68, U+11D6A..U+11D89, U+11DA0..U+11DA9] | 1 | |||
Alternative 1 | ||||||
HALANT | ◌𑶗 | U+11D97 | 1 | |||
Alternative 2 | ||||||
VOWEL_ABOVE | ◌𑶐 ◌𑶑 | [U+11D90..U+11D91] | 0 or more | |||
VOWEL_POST | ◌𑶊 ◌𑶋 ◌𑶌 ◌𑶍 ◌𑶎 ◌𑶓 ◌𑶔 | [U+11D8A..U+11D8E, U+11D93..U+11D94] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌𑶕 | U+11D95 | 0 or more | |||
VOWEL_MOD_POST | ◌𑶖 | U+11D96 | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
. : · । ॥ 𑶘 | [U+002E, U+003A, U+00B7, U+0964..U+0965, U+11D98] |
Hanunoo
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ᜠ ᜡ ᜢ ᜣ ᜤ ᜥ ᜦ ᜧ ᜨ ᜩ ᜪ ᜫ ᜬ ᜭ ᜮ ᜯ ᜰ ᜱ ◌ | [U+1720..U+1731, U+25CC] | 1 | |||
VOWEL_ABOVE | ◌ᜲ | U+1732 | 0 or more | |||
VOWEL_BELOW | ◌ᜳ | U+1733 | 0 or more | |||
VOWEL_POST | ◌᜴ | U+1734 | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
᜵ ᜶ | [U+1735..U+1736] |
Javanese
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE, BASE_OTHER | ◌ ꦄ ꦅ ꦆ ꦇ ꦈ ꦉ ꦊ ꦋ ꦌ ꦍ ꦎ ꦏ ꦐ ꦑ ꦒ ꦓ ꦔ ꦕ ꦖ ꦗ ꦘ ꦙ ꦚ ꦛ ꦜ ꦝ ꦞ ꦟ ꦠ ꦡ ꦢ ꦣ ꦤ ꦥ ꦦ ꦧ ꦨ ꦩ ꦪ ꦫ ꦬ ꦭ ꦮ ꦯ ꦰ ꦱ ꦲ ꧐ ꧑ ꧒ ꧓ ꧔ ꧕ ꧖ ꧗ ꧘ ꧙ | [U+25CC, U+A984..U+A9B2, U+A9D0..U+A9D9, U+00A0] | 1 | |||
CONS_MOD_ABOVE | ◌꦳ | U+A9B3 | 0 or more | |||
Repeating group | 0 or more | |||||
HALANT | ◌꧀ | U+A9C0 | 1 | |||
BASE | ◌ ꦄ ꦅ ꦆ ꦇ ꦈ ꦉ ꦊ ꦋ ꦌ ꦍ ꦎ ꦏ ꦐ ꦑ ꦒ ꦓ ꦔ ꦕ ꦖ ꦗ ꦘ ꦙ ꦚ ꦛ ꦜ ꦝ ꦞ ꦟ ꦠ ꦡ ꦢ ꦣ ꦤ ꦥ ꦦ ꦧ ꦨ ꦩ ꦪ ꦫ ꦬ ꦭ ꦮ ꦯ ꦰ ꦱ ꦲ ꧐ ꧑ ꧒ ꧓ ꧔ ꧕ ꧖ ꧗ ꧘ ꧙ | [U+25CC, U+A984..U+A9B2, U+A9D0..U+A9D9] | 1 | |||
CONS_MOD_ABOVE | ◌꦳ | U+A9B3 | 0 or more | |||
Alternative 1 | ||||||
HALANT | ◌꧀ | U+A9C0 | 1 | |||
Alternative 2 | ||||||
CONS_MED_BELOW | ◌ꦽ ◌ꦿ | [U+A9BD, U+A9BF] | 0 or 1 | |||
CONS_MED_POST | ◌ꦾ | U+A9BE | 0 or 1 | |||
VOWEL_PRE | ◌ꦺ ◌ꦻ | [U+A9BA..U+A9BB] | 0 or more | |||
VOWEL_ABOVE | ◌ꦶ ◌ꦷ ◌ꦼ | [U+A9B6..U+A9B7, U+A9BC] | 0 or more | |||
VOWEL_BELOW | ◌ꦸ ◌ꦹ | [U+A9B8..U+A9B9] | 0 or more | |||
VOWEL_POST | ◌ꦴ ◌ꦵ | [U+A9B4..U+A9B5] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌ꦀ ◌ꦁ | [U+A980..U+A981] | 0 or more | |||
VOWEL_MOD_POST | ◌ꦃ | U+A983 | 0 or more | |||
CONS_FINAL_ABOVE | ◌ꦂ | U+A982 | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
꧁ ꧂ ꧃ ꧄ ꧅ ꧆ ꧇ ꧈ ꧉ ꧊ ꧋ ꧌ ꧍ ꧏ ꧞ ꧟ | [U+A9C1..U+A9CD, U+A9CF, U+A9DE..U+A9DF] |
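Tables with a repeating group and alternatives, such as the Javanese one, can also be expressed as a regular expression. The Python sketch below rests on one reading of the table, which is an assumption: the HALANT, BASE, and CONS_MOD_ABOVE rows after "Repeating group" form the repeated unit, Alternative 1 is a single HALANT, and Alternative 2 covers all remaining rows. The names `JAVANESE_CLUSTER` and `is_javanese_cluster` are hypothetical.

```python
import re

FIRST_BASE = "[\u25CC\uA984-\uA9B2\uA9D0-\uA9D9\u00A0]"  # BASE, BASE_OTHER
BASE = "[\u25CC\uA984-\uA9B2\uA9D0-\uA9D9]"              # BASE inside the group
CONS_MOD_ABOVE = "\uA9B3"
HALANT = "\uA9C0"

# Rows assumed to belong to Alternative 2, in table order with their counts.
TAIL = (
    "[\uA9BD\uA9BF]?"         # CONS_MED_BELOW, 0 or 1
    "\uA9BE?"                 # CONS_MED_POST, 0 or 1
    "[\uA9BA\uA9BB]*"         # VOWEL_PRE, 0 or more
    "[\uA9B6\uA9B7\uA9BC]*"   # VOWEL_ABOVE, 0 or more
    "[\uA9B8\uA9B9]*"         # VOWEL_BELOW, 0 or more
    "[\uA9B4\uA9B5]*"         # VOWEL_POST, 0 or more
    "[\uA980\uA981]*"         # VOWEL_MOD_ABOVE, 0 or more
    "\uA983*"                 # VOWEL_MOD_POST, 0 or more
    "\uA982*"                 # CONS_FINAL_ABOVE, 0 or more
)

# Hypothetical name for the assembled pattern.
JAVANESE_CLUSTER = re.compile(
    f"{FIRST_BASE}{CONS_MOD_ABOVE}*"
    f"(?:{HALANT}{BASE}{CONS_MOD_ABOVE}*)*"   # repeating group, 0 or more
    f"(?:{HALANT}|{TAIL})"                    # Alternative 1 or Alternative 2
)


def is_javanese_cluster(text: str) -> bool:
    """Return True if text is exactly one cluster under the reading above."""
    return JAVANESE_CLUSTER.fullmatch(text) is not None
```

Under this reading, a conjunct such as <U+A98F, U+A9C0, U+A9A0> and a cluster ending in a visible HALANT such as <U+A98F, U+A9C0> both match, while a vowel modifier placed before the vowel does not.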
Kaithi
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE, BASE_OTHER | ० १ २ ३ ४ ५ ६ ७ ८ ९ ◌ 𑂃 𑂄 𑂅 𑂆 𑂇 𑂈 𑂉 𑂊 𑂋 𑂌 𑂍 𑂎 𑂏 𑂐 𑂑 𑂒 𑂓 𑂔 𑂕 𑂖 𑂗 𑂘 𑂙 𑂚 𑂛 𑂜 𑂝 𑂞 𑂟 𑂠 𑂡 𑂢 𑂣 𑂤 𑂥 𑂦 𑂧 𑂨 𑂩 𑂪 𑂫 𑂬 𑂭 𑂮 𑂯 - ‐ ‑ | [U+0966..U+096F, U+25CC, U+11083..U+110AF, U+002D, U+2010..U+2011] | 1 | |||
CONS_MOD_BELOW | ◌𑂺 | U+110BA | 0 or more | |||
Repeating group | 0 or more | |||||
HALANT | ◌𑂹 | U+110B9 | 1 | |||
BASE | ० १ २ ३ ४ ५ ६ ७ ८ ९ ◌ 𑂃 𑂄 𑂅 𑂆 𑂇 𑂈 𑂉 𑂊 𑂋 𑂌 𑂍 𑂎 𑂏 𑂐 𑂑 𑂒 𑂓 𑂔 𑂕 𑂖 𑂗 𑂘 𑂙 𑂚 𑂛 𑂜 𑂝 𑂞 𑂟 𑂠 𑂡 𑂢 𑂣 𑂤 𑂥 𑂦 𑂧 𑂨 𑂩 𑂪 𑂫 𑂬 𑂭 𑂮 𑂯 | [U+0966..U+096F, U+25CC, U+11083..U+110AF] | 1 | |||
CONS_MOD_BELOW | ◌𑂺 | U+110BA | 0 or more | |||
Alternative 1 | ||||||
HALANT | ◌𑂹 | U+110B9 | 1 | |||
Alternative 2 | ||||||
VOWEL_PRE | ◌𑂱 | U+110B1 | 0 or more | |||
VOWEL_ABOVE | ◌𑂵 ◌𑂶 | [U+110B5..U+110B6] | 0 or more | |||
VOWEL_BELOW | ◌𑂳 ◌𑂴 ◌𑃂 | [U+110B3..U+110B4, U+110C2] | 0 or more | |||
VOWEL_POST | ◌𑂰 ◌𑂲 ◌𑂷 ◌𑂸 | [U+110B0, U+110B2, U+110B7..U+110B8] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌𑂀 ◌𑂁 | [U+11080..U+11081] | 0 or more | |||
VOWEL_MOD_POST | ◌𑂂 | U+11082 | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
+ ⸱ ꠰ ꠱ ꠲ ꠳ ꠴ ꠵ ꠶ ꠷ ꠸ ꠹ 𑂻 𑂼 𑂾 𑂿 𑃀 𑃁 | [U+002B, U+2E31, U+A830..U+A839, U+110BB..U+110C1, U+110CD] |
Kawi
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
REPHA | 𑼂◌ | U+11F02 | 0 or 1 | |||
BASE | ◌ 𑼄 𑼅 𑼆 𑼇 𑼈 𑼉 𑼊 𑼋 𑼌 𑼍 𑼎 𑼏 𑼐 𑼒 𑼓 𑼔 𑼕 𑼖 𑼗 𑼘 𑼙 𑼚 𑼛 𑼜 𑼝 𑼞 𑼟 𑼠 𑼡 𑼢 𑼣 𑼤 𑼥 𑼦 𑼧 𑼨 𑼩 𑼪 𑼫 𑼬 𑼭 𑼮 𑼯 𑼰 𑼱 𑼲 𑼳 𑽐 𑽑 𑽒 𑽓 𑽔 𑽕 𑽖 𑽗 𑽘 𑽙 | [U+25CC, U+11F04..U+11F10, U+11F12..U+11F33, U+11F50..U+11F59] | 1 | |||
Repeating group | 0 or more | |||||
HALANT | ◌𑽂 | U+11F42 | 1 | |||
BASE | ◌ 𑼄 𑼅 𑼆 𑼇 𑼈 𑼉 𑼊 𑼋 𑼌 𑼍 𑼎 𑼏 𑼐 𑼒 𑼓 𑼔 𑼕 𑼖 𑼗 𑼘 𑼙 𑼚 𑼛 𑼜 𑼝 𑼞 𑼟 𑼠 𑼡 𑼢 𑼣 𑼤 𑼥 𑼦 𑼧 𑼨 𑼩 𑼪 𑼫 𑼬 𑼭 𑼮 𑼯 𑼰 𑼱 𑼲 𑼳 𑽐 𑽑 𑽒 𑽓 𑽔 𑽕 𑽖 𑽗 𑽘 𑽙 | [U+25CC, U+11F04..U+11F10, U+11F12..U+11F33, U+11F50..U+11F59] | 1 | |||
Alternative 1 | ||||||
HALANT | ◌𑽂 | U+11F42 | 1 | |||
Alternative 2 | ||||||
VOWEL_PRE | ◌𑼾 ◌𑼿 | [U+11F3E..U+11F3F] | 0 or more | |||
VOWEL_ABOVE | ◌𑼶 ◌𑼷 ◌𑽀 | [U+11F36..U+11F37, U+11F40] | 0 or more | |||
VOWEL_BELOW | ◌𑼸 ◌𑼹 ◌𑼺 | [U+11F38..U+11F3A] | 0 or more | |||
VOWEL_POST | ◌𑼴 ◌𑼵 ◌𑽁 | [U+11F34..U+11F35, U+11F41] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌𑼀 ◌𑼁 | [U+11F00..U+11F01] | 0 or more | |||
VOWEL_MOD_POST | ◌𑼃 | U+11F03 | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
𑽃 𑽄 𑽅 𑽆 𑽇 𑽈 𑽉 𑽊 𑽋 𑽌 𑽍 𑽎 𑽏 | [U+11F43..U+11F4F] |
Kayah Li
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ◌ ꤀ ꤁ ꤂ ꤃ ꤄ ꤅ ꤆ ꤇ ꤈ ꤉ ꤊ ꤋ ꤌ ꤍ ꤎ ꤏ ꤐ ꤑ ꤒ ꤓ ꤔ ꤕ ꤖ ꤗ ꤘ ꤙ ꤚ ꤛ ꤜ ꤝ ꤞ ꤟ ꤠ ꤡ ꤢ ꤣ ꤤ ꤥ | [U+25CC, U+A900..U+A925] | 1 | |||
VOWEL_ABOVE | ◌ꤦ ◌ꤧ ◌ꤨ ◌ꤩ ◌ꤪ | [U+A926..U+A92A] | 0 or more | |||
VOWEL_MOD_BELOW | ◌꤫ ◌꤬ ◌꤭ | [U+A92B..U+A92D] | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
꤮ ꤯ | [U+A92E..U+A92F] |
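An input method can use a table like the Kayah Li one to put marks into encoding order regardless of the order in which the user types them. The Python sketch below is a minimal illustration of that idea, not a description of any existing keyboard: it stably sorts each run of marks that follows a base by the row position of the mark's class. The names `mark_rank` and `reorder_marks` are hypothetical.

```python
from typing import Optional

# BASE row of the Kayah Li table: U+25CC and U+A900..U+A925.
BASES = set(range(0xA900, 0xA926)) | {0x25CC}


def mark_rank(ch: str) -> Optional[int]:
    """Return the table position of a Kayah Li mark, or None for other characters."""
    cp = ord(ch)
    if 0xA926 <= cp <= 0xA92A:   # VOWEL_ABOVE
        return 1
    if 0xA92B <= cp <= 0xA92D:   # VOWEL_MOD_BELOW
        return 2
    return None


def reorder_marks(text: str) -> str:
    """Sort each run of marks that directly follows a base into table order."""
    out = []
    run = []   # marks collected since the last base

    def flush():
        # sorted() is stable, so marks of the same class keep their typed order.
        out.extend(sorted(run, key=mark_rank))
        run.clear()

    for ch in text:
        follows_base = bool(run) or (bool(out) and ord(out[-1]) in BASES)
        if mark_rank(ch) is not None and follows_base:
            run.append(ch)
        else:
            flush()
            out.append(ch)
    flush()
    return "".join(out)
```

Marks that do not follow a base are left where they are, since the table gives no position for them.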
Kharoshthi
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE, BASE_OTHER | ◌ 𐨀 𐨐 𐨑 𐨒 𐨓 𐨕 𐨖 𐨗 𐨙 𐨚 𐨛 𐨜 𐨝 𐨞 𐨟 𐨠 𐨡 𐨢 𐨣 𐨤 𐨥 𐨦 𐨧 𐨨 𐨩 𐨪 𐨫 𐨬 𐨭 𐨮 𐨯 𐨰 𐨱 𐨲 𐨳 𐨴 𐨵 𐩀 𐩁 𐩂 𐩃 𐩄 𐩅 𐩆 𐩇 𐩈 - ‐ | [U+25CC, U+10A00, U+10A10..U+10A13, U+10A15..U+10A17, U+10A19..U+10A35, U+10A40..U+10A48, U+002D, U+00A0, U+2010] | 1 | |||
CONS_MOD_BELOW | ◌𐨸 ◌𐨹 ◌𐨺 | [U+10A38..U+10A3A] | 0 or more | |||
Repeating group | 0 or more | |||||
HALANT | ◌𐨿 | U+10A3F | 1 | |||
BASE | ◌ 𐨀 𐨐 𐨑 𐨒 𐨓 𐨕 𐨖 𐨗 𐨙 𐨚 𐨛 𐨜 𐨝 𐨞 𐨟 𐨠 𐨡 𐨢 𐨣 𐨤 𐨥 𐨦 𐨧 𐨨 𐨩 𐨪 𐨫 𐨬 𐨭 𐨮 𐨯 𐨰 𐨱 𐨲 𐨳 𐨴 𐨵 𐩀 𐩁 𐩂 𐩃 𐩄 𐩅 𐩆 𐩇 𐩈 | [U+25CC, U+10A00, U+10A10..U+10A13, U+10A15..U+10A17, U+10A19..U+10A35, U+10A40..U+10A48] | 1 | |||
CONS_MOD_BELOW | ◌𐨸 ◌𐨹 ◌𐨺 | [U+10A38..U+10A3A] | 0 or more | |||
Alternative 1 | ||||||
HALANT | ◌𐨿 | U+10A3F | 1 | |||
Alternative 2 | ||||||
VOWEL_ABOVE | ◌𐨅 | U+10A05 | 0 or more | |||
VOWEL_BELOW | ◌𐨁 ◌𐨂 ◌𐨃 ◌𐨆 | [U+10A01..U+10A03, U+10A06] | 0 or more | |||
VOWEL_POST | ◌𐨌 | U+10A0C | 0 or more | |||
VOWEL_MOD_ABOVE | ◌𐨏 | U+10A0F | 0 or more | |||
VOWEL_MOD_BELOW | ◌𐨍 ◌𐨎 | [U+10A0D..U+10A0E] | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
𐩐 𐩑 𐩒 𐩓 𐩔 𐩕 𐩖 𐩗 𐩘 | [U+0020, U+10A50..U+10A58] |
Khojki
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ૦ ૧ ૨ ૩ ૪ ૫ ૬ ૭ ૮ ૯ ◌ 𑈀 𑈁 𑈂 𑈃 𑈄 𑈅 𑈆 𑈇 𑈈 𑈉 𑈊 𑈋 𑈌 𑈍 𑈎 𑈏 𑈐 𑈑 𑈓 𑈔 𑈕 𑈖 𑈗 𑈘 𑈙 𑈚 𑈛 𑈜 𑈝 𑈞 𑈟 𑈠 𑈡 𑈢 𑈣 𑈤 𑈥 𑈦 𑈧 𑈨 𑈩 𑈪 𑈫 𑈿 𑉀 | [U+0AE6..U+0AEF, U+25CC, U+11200..U+11211, U+11213..U+1122B, U+1123F..U+11240] | 1 | |||
CONS_MOD_ABOVE | ◌𑈶 ◌𑈷 | [U+11236..U+11237] | 0 or more | |||
Repeating group | 0 or more | |||||
HALANT | ◌𑈵 | U+11235 | 1 | |||
BASE | ૦ ૧ ૨ ૩ ૪ ૫ ૬ ૭ ૮ ૯ ◌ 𑈀 𑈁 𑈂 𑈃 𑈄 𑈅 𑈆 𑈇 𑈈 𑈉 𑈊 𑈋 𑈌 𑈍 𑈎 𑈏 𑈐 𑈑 𑈓 𑈔 𑈕 𑈖 𑈗 𑈘 𑈙 𑈚 𑈛 𑈜 𑈝 𑈞 𑈟 𑈠 𑈡 𑈢 𑈣 𑈤 𑈥 𑈦 𑈧 𑈨 𑈩 𑈪 𑈫 𑈿 𑉀 | [U+0AE6..U+0AEF, U+25CC, U+11200..U+11211, U+11213..U+1122B, U+1123F..U+11240] | 1 | |||
CONS_MOD_ABOVE | ◌𑈶 ◌𑈷 | [U+11236..U+11237] | 0 or more | |||
Alternative 1 | ||||||
HALANT | ◌𑈵 | U+11235 | 1 | |||
Alternative 2 | ||||||
VOWEL_ABOVE | ◌𑈰 ◌𑈱 ◌𑈲 ◌𑈳 | [U+11230..U+11233] | 0 or more | |||
VOWEL_BELOW | ◌𑈯 ◌𑉁 | [U+1122F, U+11241] | 0 or more | |||
VOWEL_POST | ◌𑈬 ◌𑈭 ◌𑈮 | [U+1122C..U+1122E] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌𑈴 ◌𑈾 | [U+11234, U+1123E] | 0 or more |
The USE does not allow the following character sequences because there are equivalent individual characters:
Character sequence | Encoding |
---|---|
𑈀 ◌𑈬 | <U+11200, U+1122C> |
𑈀 ◌𑈱 | <U+11200, U+11231> |
𑈀 ◌𑈳 | <U+11200, U+11233> |
𑈀 ◌𑈬 ◌𑈱 | <U+11200, U+1122C, U+11231> |
𑈆 ◌𑈬 | <U+11206, U+1122C> |
◌𑈬 ◌𑈰 | <U+1122C, U+11230> |
◌𑈬 ◌𑈱 | <U+1122C, U+11231> |
𑉀 ◌𑈮 | <U+11240, U+1122E> |
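Text can be checked for these sequences with a plain substring search before it is passed to a shaping engine. The Python sketch below copies the Khojki sequences from the table above. The function name `find_disallowed` and the shape of its return value are choices made for this illustration.

```python
# Disallowed Khojki sequences, copied from the table above as code points.
KHOJKI_DISALLOWED = [
    (0x11200, 0x1122C),
    (0x11200, 0x11231),
    (0x11200, 0x11233),
    (0x11200, 0x1122C, 0x11231),
    (0x11206, 0x1122C),
    (0x1122C, 0x11230),
    (0x1122C, 0x11231),
    (0x11240, 0x1122E),
]


def find_disallowed(text, sequences=KHOJKI_DISALLOWED):
    """Return sorted (index, code points) pairs for every disallowed sequence found."""
    hits = []
    for code_points in sequences:
        needle = "".join(map(chr, code_points))
        start = text.find(needle)
        while start != -1:
            hits.append((start, code_points))
            start = text.find(needle, start + 1)
    return sorted(hits)
```

Because some sequences overlap, a single stretch of text can be reported more than once. For example, <U+11200, U+1122C, U+11231> is reported as the three-character sequence and as both of the two-character sequences it contains.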
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
꠰ ꠱ ꠲ ꠳ ꠴ ꠵ ꠶ ꠷ ꠸ ꠹ 𑈸 𑈹 𑈺 𑈻 𑈼 𑈽 | [U+A830..U+A839, U+11238..U+1123D] |
Khudawadi
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ◌ 𑊰 𑊱 𑊲 𑊳 𑊴 𑊵 𑊶 𑊷 𑊸 𑊹 𑊺 𑊻 𑊼 𑊽 𑊾 𑊿 𑋀 𑋁 𑋂 𑋃 𑋄 𑋅 𑋆 𑋇 𑋈 𑋉 𑋊 𑋋 𑋌 𑋍 𑋎 𑋏 𑋐 𑋑 𑋒 𑋓 𑋔 𑋕 𑋖 𑋗 𑋘 𑋙 𑋚 𑋛 𑋜 𑋝 𑋞 𑋰 𑋱 𑋲 𑋳 𑋴 𑋵 𑋶 𑋷 𑋸 𑋹 | [U+25CC, U+112B0..U+112DE, U+112F0..U+112F9] | 1 | |||
CONS_MOD_BELOW | ◌𑋩 | U+112E9 | 0 or more | |||
VOWEL_PRE | ◌𑋡 | U+112E1 | 0 or more | |||
VOWEL_ABOVE | ◌𑋥 ◌𑋦 ◌𑋧 ◌𑋨 | [U+112E5..U+112E8] | 0 or more | |||
VOWEL_BELOW | ◌𑋣 ◌𑋤 ◌𑋪 | [U+112E3..U+112E4, U+112EA] | 0 or more | |||
VOWEL_POST | ◌𑋠 ◌𑋢 | [U+112E0, U+112E2] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌𑋟 | U+112DF | 0 or more |
The USE does not allow the following character sequences because there are equivalent individual characters:
Character sequence | Encoding |
---|---|
𑊰 ◌𑋠 | <U+112B0, U+112E0> |
𑊰 ◌𑋥 | <U+112B0, U+112E5> |
𑊰 ◌𑋦 | <U+112B0, U+112E6> |
𑊰 ◌𑋧 | <U+112B0, U+112E7> |
𑊰 ◌𑋨 | <U+112B0, U+112E8> |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
. : ; । ॥ ꠰ ꠱ ꠲ ꠳ ꠴ ꠵ ꠶ ꠷ ꠸ ꠹ | [U+002E, U+003A..U+003B, U+0964..U+0965, U+A830..U+A839] |
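The Encoding columns in these tables use a compact notation: single code points such as `U+002E` and inclusive ranges such as `U+003A..U+003B`, separated by commas. The Python sketch below parses that notation into a set of code points so that the tables can be used as data. The function name `parse_encoding` is hypothetical, and the sketch assumes the notation is used exactly as it appears in this document.

```python
import re

# One item is "U+XXXX" optionally followed by "..U+YYYY" for an inclusive range.
_ITEM = re.compile(r"U\+([0-9A-F]{4,6})(?:\.\.U\+([0-9A-F]{4,6}))?")


def parse_encoding(cell):
    """Return the set of code points named in an Encoding cell."""
    code_points = set()
    for first, last in _ITEM.findall(cell):
        start = int(first, 16)
        end = int(last, 16) if last else start
        code_points.update(range(start, end + 1))
    return code_points


# The Khudawadi characters that cannot be part of clusters, from the table above.
KHUDAWADI_NON_CLUSTER = parse_encoding(
    "[U+002E, U+003A..U+003B, U+0964..U+0965, U+A830..U+A839]"
)
```

With the set in hand, a test such as `ord(ch) in KHUDAWADI_NON_CLUSTER` tells whether a character has to be handled outside of clusters.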
Lepcha
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ᰀ ᰁ ᰂ ᰃ ᰄ ᰅ ᰆ ᰇ ᰈ ᰉ ᰊ ᰋ ᰌ ᰍ ᰎ ᰏ ᰐ ᰑ ᰒ ᰓ ᰔ ᰕ ᰖ ᰗ ᰘ ᰙ ᰚ ᰛ ᰜ ᰝ ᰞ ᰟ ᰠ ᰡ ᰢ ᰣ ᱀ ᱁ ᱂ ᱃ ᱄ ᱅ ᱆ ᱇ ᱈ ᱉ ᱍ ᱎ ᱏ ◌ | [U+1C00..U+1C23, U+1C40..U+1C49, U+1C4D..U+1C4F, U+25CC] | 1 | |||
CONS_MOD_BELOW | ◌᰷ | U+1C37 | 0 or more | |||
Repeating group | 0 or more | |||||
CONS_SUB | ◌ᰤ ◌ᰥ | [U+1C24..U+1C25] | 1 | |||
CONS_MOD_BELOW | ◌᰷ | U+1C37 | 0 or more | |||
VOWEL_PRE | ◌ᰧ ◌ᰨ ◌ᰩ | [U+1C27..U+1C29] | 0 or more | |||
VOWEL_BELOW | ◌ᰬ | U+1C2C | 0 or more | |||
VOWEL_POST | ◌ᰦ ◌ᰪ ◌ᰫ | [U+1C26, U+1C2A..U+1C2B] | 0 or more | |||
VOWEL_MOD_PRE | ◌ᰴ ◌ᰵ | [U+1C34..U+1C35] | 0 or more | |||
CONS_FINAL_ABOVE | ◌ᰭ ◌ᰮ ◌ᰯ ◌ᰰ ◌ᰱ ◌ᰲ ◌ᰳ | [U+1C2D..U+1C33] | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
, . ? ◌ᰶ ᰻ ᰼ ᰽ ᰾ ᰿ | [U+002C, U+002E, U+003F, U+1C36, U+1C3B..U+1C3F] |
Known bugs:
- Unused new Indic_Syllabic_Category values (affects U+1C36)
Limbu
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE, BASE_OTHER | ० १ २ ३ ४ ५ ६ ७ ८ ९ ᤁ ᤂ ᤃ ᤄ ᤅ ᤆ ᤇ ᤈ ᤉ ᤊ ᤋ ᤌ ᤍ ᤎ ᤏ ᤐ ᤑ ᤒ ᤓ ᤔ ᤕ ᤖ ᤗ ᤘ ᤙ ᤚ ᤛ ᤜ ᤝ ᤞ ᥆ ᥇ ᥈ ᥉ ᥊ ᥋ ᥌ ᥍ ᥎ ᥏ ◌ ᤀ | [U+0966..U+096F, U+1901..U+191E, U+1946..U+194F, U+25CC, U+1900] | 1 | |||
Repeating group | 0 or more | |||||
CONS_SUB | ◌ᤩ ◌ᤪ ◌ᤫ | [U+1929..U+192B] | 1 | |||
VOWEL_ABOVE | ◌ᤠ ◌ᤡ ◌ᤥ ◌ᤦ ◌ᤧ ◌ᤨ | [U+1920..U+1921, U+1925..U+1928] | 0 or more | |||
VOWEL_BELOW | ◌ᤢ | U+1922 | 0 or more | |||
VOWEL_POST | ◌ᤣ ◌ᤤ | [U+1923..U+1924] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌᤺ | U+193A | 0 or more | |||
VOWEL_MOD_BELOW | ◌ᤲ | U+1932 | 0 or more | |||
CONS_FINAL_BELOW | ◌᤹ | U+1939 | 0 or more | |||
CONS_FINAL_POST | ◌ᤰ ◌ᤱ ◌ᤳ ◌ᤴ ◌ᤵ ◌ᤶ ◌ᤷ ◌ᤸ | [U+1930..U+1931, U+1933..U+1938] | 0 or more | |||
CONS_FINAL_MOD | ◌᤻ | U+193B | 0 or 1 |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
٠ ١ ٢ ٣ ٤ ٥ ٦ ٧ ٨ ٩ ॥ ᥀ ᥄ ᥅ | [U+0660..U+0669, U+0965, U+1940, U+1944..U+1945] |
Mahajani
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ० १ २ ३ ४ ५ ६ ७ ८ ९ ◌ 𑅐 𑅑 𑅒 𑅓 𑅔 𑅕 𑅖 𑅗 𑅘 𑅙 𑅚 𑅛 𑅜 𑅝 𑅞 𑅟 𑅠 𑅡 𑅢 𑅣 𑅤 𑅥 𑅦 𑅧 𑅨 𑅩 𑅪 𑅫 𑅬 𑅭 𑅮 𑅯 𑅰 𑅱 𑅲 | [U+0966..U+096F, U+25CC, U+11150..U+11172] | 1 | |||
CONS_MOD_BELOW | ◌𑅳 | U+11173 | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
: · । ॥ ꠰ ꠱ ꠲ ꠳ ꠴ ꠵ ꠶ ꠷ ꠸ ꠹ 𑅴 𑅵 𑅶 | [U+003A, U+00B7, U+0964..U+0965, U+A830..U+A839, U+11174..U+11176] |
Makasar
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE, BASE_OTHER | 0 1 2 3 4 5 6 7 8 9 ◌ 𑻠 𑻡 𑻢 𑻣 𑻤 𑻥 𑻦 𑻧 𑻨 𑻩 𑻪 𑻫 𑻬 𑻭 𑻮 𑻯 𑻰 𑻱 𑻲 | [U+0030..U+0039, U+25CC, U+11EE0..U+11EF1, U+00A0, U+11EF2] | 1 | |||
VOWEL_PRE | ◌𑻵 | U+11EF5 | 0 or more | |||
VOWEL_ABOVE | ◌𑻳 | U+11EF3 | 0 or more | |||
VOWEL_BELOW | ◌𑻴 | U+11EF4 | 0 or more | |||
VOWEL_POST | ◌𑻶 | U+11EF6 | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
٠ ١ ٢ ٣ ٤ ٥ ٦ ٧ ٨ ٩ 𑻷 𑻸 | [U+0020, U+0660..U+0669, U+11EF7..U+11EF8] |
Marchen
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ◌ 𑱲 𑱳 𑱴 𑱵 𑱶 𑱷 𑱸 𑱹 𑱺 𑱻 𑱼 𑱽 𑱾 𑱿 𑲀 𑲁 𑲂 𑲃 𑲄 𑲅 𑲆 𑲇 𑲈 𑲉 𑲊 𑲋 𑲌 𑲍 𑲎 𑲏 | [U+25CC, U+11C72..U+11C8F] | 1 | |||
Repeating group | 0 or more | |||||
CONS_SUB | ◌𑲒 ◌𑲓 ◌𑲔 ◌𑲕 ◌𑲖 ◌𑲗 ◌𑲘 ◌𑲙 ◌𑲚 ◌𑲛 ◌𑲜 ◌𑲝 ◌𑲞 ◌𑲟 ◌𑲠 ◌𑲡 ◌𑲢 ◌𑲣 ◌𑲤 ◌𑲥 ◌𑲦 ◌𑲧 ◌𑲩 ◌𑲪 ◌𑲫 ◌𑲬 ◌𑲭 ◌𑲮 ◌𑲯 | [U+11C92..U+11CA7, U+11CA9..U+11CAF] | 1 | |||
VOWEL_PRE | ◌𑲱 | U+11CB1 | 0 or more | |||
VOWEL_ABOVE | ◌𑲳 | U+11CB3 | 0 or more | |||
VOWEL_BELOW | ◌𑲰 ◌𑲲 | [U+11CB0, U+11CB2] | 0 or more | |||
VOWEL_POST | ◌𑲴 | U+11CB4 | 0 or more | |||
VOWEL_MOD_ABOVE | ◌𑲵 ◌𑲶 | [U+11CB5..U+11CB6] | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
𑱰 𑱱 | [U+11C70..U+11C71] |
Masaram Gondi
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
REPHA | 𑵆◌ | U+11D46 | 0 or 1 | |||
BASE | ◌ 𑴀 𑴁 𑴂 𑴃 𑴄 𑴅 𑴆 𑴈 𑴉 𑴋 𑴌 𑴍 𑴎 𑴏 𑴐 𑴑 𑴒 𑴓 𑴔 𑴕 𑴖 𑴗 𑴘 𑴙 𑴚 𑴛 𑴜 𑴝 𑴞 𑴟 𑴠 𑴡 𑴢 𑴣 𑴤 𑴥 𑴦 𑴧 𑴨 𑴩 𑴪 𑴫 𑴬 𑴭 𑴮 𑴯 𑴰 𑵐 𑵑 𑵒 𑵓 𑵔 𑵕 𑵖 𑵗 𑵘 𑵙 | [U+25CC, U+11D00..U+11D06, U+11D08..U+11D09, U+11D0B..U+11D30, U+11D50..U+11D59] | 1 | |||
CONS_MOD_BELOW | ◌𑵂 | U+11D42 | 0 or more | |||
Repeating group | 0 or more | |||||
HALANT | ◌𑵅 | U+11D45 | 1 | |||
BASE | ◌ 𑴀 𑴁 𑴂 𑴃 𑴄 𑴅 𑴆 𑴈 𑴉 𑴋 𑴌 𑴍 𑴎 𑴏 𑴐 𑴑 𑴒 𑴓 𑴔 𑴕 𑴖 𑴗 𑴘 𑴙 𑴚 𑴛 𑴜 𑴝 𑴞 𑴟 𑴠 𑴡 𑴢 𑴣 𑴤 𑴥 𑴦 𑴧 𑴨 𑴩 𑴪 𑴫 𑴬 𑴭 𑴮 𑴯 𑴰 𑵐 𑵑 𑵒 𑵓 𑵔 𑵕 𑵖 𑵗 𑵘 𑵙 | [U+25CC, U+11D00..U+11D06, U+11D08..U+11D09, U+11D0B..U+11D30, U+11D50..U+11D59] | 1 | |||
CONS_MOD_BELOW | ◌𑵂 | U+11D42 | 0 or more | |||
Alternative 1 | ||||||
HALANT | ◌𑵅 | U+11D45 | 1 | |||
Alternative 2 | ||||||
CONS_MED_BELOW | ◌𑵇 | U+11D47 | 0 or 1 | |||
VOWEL_ABOVE | ◌𑴱 ◌𑴲 ◌𑴳 ◌𑴴 ◌𑴵 ◌𑴺 ◌𑴼 ◌𑴽 ◌𑴿 ◌𑵃 | [U+11D31..U+11D35, U+11D3A, U+11D3C..U+11D3D, U+11D3F, U+11D43] | 0 or more | |||
VOWEL_BELOW | ◌𑴶 ◌𑵄 | [U+11D36, U+11D44] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌𑵀 ◌𑵁 | [U+11D40..U+11D41] | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
। ॥ | [U+0964..U+0965] |
Meetei Mayek
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ◌ ꫠ ꫡ ꫢ ꫣ ꫤ ꫥ ꫦ ꫧ ꫨ ꫩ ꫪ ꯀ ꯁ ꯂ ꯃ ꯄ ꯅ ꯆ ꯇ ꯈ ꯉ ꯊ ꯋ ꯌ ꯍ ꯎ ꯏ ꯐ ꯑ ꯒ ꯓ ꯔ ꯕ ꯖ ꯗ ꯘ ꯙ ꯚ ꯛ ꯜ ꯝ ꯞ ꯟ ꯠ ꯡ ꯢ ꯰ ꯱ ꯲ ꯳ ꯴ ꯵ ꯶ ꯷ ꯸ ꯹ | [U+25CC, U+AAE0..U+AAEA, U+ABC0..U+ABE2, U+ABF0..U+ABF9] | 1 | |||
Repeating group | 0 or more | |||||
HALANT | ◌꫶ | U+AAF6 | 1 | |||
BASE | ◌ ꫠ ꫡ ꫢ ꫣ ꫤ ꫥ ꫦ ꫧ ꫨ ꫩ ꫪ ꯀ ꯁ ꯂ ꯃ ꯄ ꯅ ꯆ ꯇ ꯈ ꯉ ꯊ ꯋ ꯌ ꯍ ꯎ ꯏ ꯐ ꯑ ꯒ ꯓ ꯔ ꯕ ꯖ ꯗ ꯘ ꯙ ꯚ ꯛ ꯜ ꯝ ꯞ ꯟ ꯠ ꯡ ꯢ ꯰ ꯱ ꯲ ꯳ ꯴ ꯵ ꯶ ꯷ ꯸ ꯹ | [U+25CC, U+AAE0..U+AAEA, U+ABC0..U+ABE2, U+ABF0..U+ABF9] | 1 | |||
Alternative 1 | ||||||
HALANT | ◌꫶ | U+AAF6 | 1 | |||
Alternative 2 | ||||||
VOWEL_PRE | ◌ꫫ ◌ꫮ | [U+AAEB, U+AAEE] | 0 or more | |||
VOWEL_ABOVE | ◌ꫭ ◌ꯥ | [U+AAED, U+ABE5] | 0 or more | |||
VOWEL_BELOW | ◌ꫬ ◌ꯨ ◌꯭ | [U+AAEC, U+ABE8, U+ABED] | 0 or more | |||
VOWEL_POST | ◌ꫯ ◌ꯣ ◌ꯤ ◌ꯦ ◌ꯧ ◌ꯩ ◌ꯪ | [U+AAEF, U+ABE3..U+ABE4, U+ABE6..U+ABE7, U+ABE9..U+ABEA] | 0 or more | |||
VOWEL_MOD_POST | ◌ꫵ ◌꯬ | [U+AAF5, U+ABEC] | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
꫰ ꫱ ꫲ ꫳ ꫴ ꯫ | [U+AAF0..U+AAF4, U+ABEB] |
Modi
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE, BASE_OTHER | ◌ 𑘀 𑘁 𑘂 𑘃 𑘄 𑘅 𑘆 𑘇 𑘈 𑘉 𑘊 𑘋 𑘌 𑘍 𑘎 𑘏 𑘐 𑘑 𑘒 𑘓 𑘔 𑘕 𑘖 𑘗 𑘘 𑘙 𑘚 𑘛 𑘜 𑘝 𑘞 𑘟 𑘠 𑘡 𑘢 𑘣 𑘤 𑘥 𑘦 𑘧 𑘨 𑘩 𑘪 𑘫 𑘬 𑘭 𑘮 𑘯 𑙐 𑙑 𑙒 𑙓 𑙔 𑙕 𑙖 𑙗 𑙘 𑙙 | [U+25CC, U+11600..U+1162F, U+11650..U+11659, U+00A0] | 1 | |||
Repeating group | 0 or more | |||||
HALANT | ◌𑘿 | U+1163F | 1 | |||
BASE | ◌ 𑘀 𑘁 𑘂 𑘃 𑘄 𑘅 𑘆 𑘇 𑘈 𑘉 𑘊 𑘋 𑘌 𑘍 𑘎 𑘏 𑘐 𑘑 𑘒 𑘓 𑘔 𑘕 𑘖 𑘗 𑘘 𑘙 𑘚 𑘛 𑘜 𑘝 𑘞 𑘟 𑘠 𑘡 𑘢 𑘣 𑘤 𑘥 𑘦 𑘧 𑘨 𑘩 𑘪 𑘫 𑘬 𑘭 𑘮 𑘯 𑙐 𑙑 𑙒 𑙓 𑙔 𑙕 𑙖 𑙗 𑙘 𑙙 | [U+25CC, U+11600..U+1162F, U+11650..U+11659] | 1 | |||
Alternative 1 | ||||||
HALANT | ◌𑘿 | U+1163F | 1 | |||
Alternative 2 | ||||||
VOWEL_ABOVE | ◌𑘹 ◌𑘺 ◌𑙀 | [U+11639..U+1163A, U+11640] | 0 or more | |||
VOWEL_BELOW | ◌𑘳 ◌𑘴 ◌𑘵 ◌𑘶 ◌𑘷 ◌𑘸 | [U+11633..U+11638] | 0 or more | |||
VOWEL_POST | ◌𑘰 ◌𑘱 ◌𑘲 ◌𑘻 ◌𑘼 | [U+11630..U+11632, U+1163B..U+1163C] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌𑘽 | U+1163D | 0 or more | |||
VOWEL_MOD_POST | ◌𑘾 | U+1163E | 0 or more |
The USE does not allow the following character sequences because there are equivalent individual characters:
Character sequence | Encoding |
---|---|
𑘀 ◌𑘹 | <U+11600, U+11639> |
𑘀 ◌𑘺 | <U+11600, U+1163A> |
𑘁 ◌𑘹 | <U+11601, U+11639> |
𑘁 ◌𑘺 | <U+11601, U+1163A> |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
. ꠰ ꠱ ꠲ ꠳ ꠴ ꠵ ꠶ ꠷ ꠸ ꠹ 𑙁 𑙂 𑙃 𑙄 | [U+0020, U+002E, U+A830..U+A839, U+11641..U+11644] |
Multani
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ੦ ੧ ੨ ੩ ੪ ੫ ੬ ੭ ੮ ੯ 𑊀 𑊁 𑊂 𑊃 𑊄 𑊅 𑊆 𑊈 𑊊 𑊋 𑊌 𑊍 𑊏 𑊐 𑊑 𑊒 𑊓 𑊔 𑊕 𑊖 𑊗 𑊘 𑊙 𑊚 𑊛 𑊜 𑊝 𑊟 𑊠 𑊡 𑊢 𑊣 𑊤 𑊥 𑊦 𑊧 𑊨 | [U+0A66..U+0A6F, U+11280..U+11286, U+11288, U+1128A..U+1128D, U+1128F..U+1129D, U+1129F..U+112A8] | 1 |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
𑊩 | U+112A9 |
Nag Mundari
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE, BASE_OTHER | ◌ 𞓐 𞓑 𞓒 𞓓 𞓔 𞓕 𞓖 𞓗 𞓘 𞓙 𞓚 𞓛 𞓜 𞓝 𞓞 𞓟 𞓠 𞓡 𞓢 𞓣 𞓤 𞓥 𞓦 𞓧 𞓨 𞓩 𞓪 𞓫 𞓰 𞓱 𞓲 𞓳 𞓴 𞓵 𞓶 𞓷 𞓸 𞓹 - ‐ | [U+25CC, U+1E4D0..U+1E4EB, U+1E4F0..U+1E4F9, U+002D, U+00A0, U+2010] | 1 | |||
VOWEL_ABOVE | ◌𞓬 ◌𞓭 ◌𞓮 ◌𞓯 | [U+1E4EC..U+1E4EF] | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
! " ' , . ? ‘ ’ “ ” | [U+0020..U+0022, U+0027, U+002C, U+002E, U+003F, U+2018..U+2019, U+201C..U+201D] |
Nandinagari
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE, BASE_OTHER | ೦ ೧ ೨ ೩ ೪ ೫ ೬ ೭ ೮ ೯ ◌ 𑦠 𑦡 𑦢 𑦣 𑦤 𑦥 𑦦 𑦧 𑦪 𑦫 𑦬 𑦭 𑦮 𑦯 𑦰 𑦱 𑦲 𑦳 𑦴 𑦵 𑦶 𑦷 𑦸 𑦹 𑦺 𑦻 𑦼 𑦽 𑦾 𑦿 𑧀 𑧁 𑧂 𑧃 𑧄 𑧅 𑧆 𑧇 𑧈 𑧉 𑧊 𑧋 𑧌 𑧍 𑧎 𑧏 𑧐 𑧡 ᳺ | [U+0CE6..U+0CEF, U+25CC, U+119A0..U+119A7, U+119AA..U+119D0, U+119E1, U+1CFA] | 1 | |||
Repeating group | 0 or more | |||||
HALANT | ◌𑧠 | U+119E0 | 1 | |||
BASE | ೦ ೧ ೨ ೩ ೪ ೫ ೬ ೭ ೮ ೯ ◌ 𑦠 𑦡 𑦢 𑦣 𑦤 𑦥 𑦦 𑦧 𑦪 𑦫 𑦬 𑦭 𑦮 𑦯 𑦰 𑦱 𑦲 𑦳 𑦴 𑦵 𑦶 𑦷 𑦸 𑦹 𑦺 𑦻 𑦼 𑦽 𑦾 𑦿 𑧀 𑧁 𑧂 𑧃 𑧄 𑧅 𑧆 𑧇 𑧈 𑧉 𑧊 𑧋 𑧌 𑧍 𑧎 𑧏 𑧐 𑧡 | [U+0CE6..U+0CEF, U+25CC, U+119A0..U+119A7, U+119AA..U+119D0, U+119E1] | 1 | |||
Alternative 1 | ||||||
HALANT | ◌𑧠 | U+119E0 | 1 | |||
Alternative 2 | ||||||
VOWEL_PRE | ◌𑧒 ◌𑧤 | [U+119D2, U+119E4] | 0 or more | |||
VOWEL_ABOVE | ◌𑧚 ◌𑧛 | [U+119DA..U+119DB] | 0 or more | |||
VOWEL_BELOW | ◌𑧔 ◌𑧕 ◌𑧖 ◌𑧗 | [U+119D4..U+119D7] | 0 or more | |||
VOWEL_POST | ◌𑧑 ◌𑧓 ◌𑧜 ◌𑧝 | [U+119D1, U+119D3, U+119DC..U+119DD] | 0 or more | |||
VOWEL_MOD_POST | ◌𑧞 ◌𑧟 | [U+119DE..U+119DF] | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
। ॥ ᳩ ᳲ ꠰ ꠱ ꠲ ꠳ ꠴ ꠵ 𑧢 𑧣 | [U+0964..U+0965, U+1CE9, U+1CF2, U+A830..U+A835, U+119E2..U+119E3] |
Newa
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
CONS_WITH_STACKER | 𑑠 𑑡 | [U+11460..U+11461] | 0 or 1 | |||
BASE | ◌ 𑐀 𑐁 𑐂 𑐃 𑐄 𑐅 𑐆 𑐇 𑐈 𑐉 𑐊 𑐋 𑐌 𑐍 𑐎 𑐏 𑐐 𑐑 𑐒 𑐓 𑐔 𑐕 𑐖 𑐗 𑐘 𑐙 𑐚 𑐛 𑐜 𑐝 𑐞 𑐟 𑐠 𑐡 𑐢 𑐣 𑐤 𑐥 𑐦 𑐧 𑐨 𑐩 𑐪 𑐫 𑐬 𑐭 𑐮 𑐯 𑐰 𑐱 𑐲 𑐳 𑐴 𑑇 𑑐 𑑑 𑑒 𑑓 𑑔 𑑕 𑑖 𑑗 𑑘 𑑙 𑑟 | [U+25CC, U+11400..U+11434, U+11447, U+11450..U+11459, U+1145F] | 1 | |||
CONS_MOD_BELOW | ◌𑑆 | U+11446 | 0 or more | |||
Repeating group | 0 or more | |||||
HALANT | ◌𑑂 | U+11442 | 1 | |||
BASE | ◌ 𑐀 𑐁 𑐂 𑐃 𑐄 𑐅 𑐆 𑐇 𑐈 𑐉 𑐊 𑐋 𑐌 𑐍 𑐎 𑐏 𑐐 𑐑 𑐒 𑐓 𑐔 𑐕 𑐖 𑐗 𑐘 𑐙 𑐚 𑐛 𑐜 𑐝 𑐞 𑐟 𑐠 𑐡 𑐢 𑐣 𑐤 𑐥 𑐦 𑐧 𑐨 𑐩 𑐪 𑐫 𑐬 𑐭 𑐮 𑐯 𑐰 𑐱 𑐲 𑐳 𑐴 𑑇 𑑐 𑑑 𑑒 𑑓 𑑔 𑑕 𑑖 𑑗 𑑘 𑑙 𑑟 | [U+25CC, U+11400..U+11434, U+11447, U+11450..U+11459, U+1145F] | 1 | |||
CONS_MOD_BELOW | ◌𑑆 | U+11446 | 0 or more | |||
Alternative 1 | ||||||
HALANT | ◌𑑂 | U+11442 | 1 | |||
Alternative 2 | ||||||
VOWEL_PRE | ◌𑐶 | U+11436 | 0 or more | |||
VOWEL_ABOVE | ◌𑐾 ◌𑐿 | [U+1143E..U+1143F] | 0 or more | |||
VOWEL_BELOW | ◌𑐸 ◌𑐹 ◌𑐺 ◌𑐻 ◌𑐼 ◌𑐽 | [U+11438..U+1143D] | 0 or more | |||
VOWEL_POST | ◌𑐵 ◌𑐷 ◌𑑀 ◌𑑁 | [U+11435, U+11437, U+11440..U+11441] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌𑑃 ◌𑑄 | [U+11443..U+11444] | 0 or more | |||
VOWEL_MOD_POST | ◌𑑅 | U+11445 | 0 or more | |||
CONS_FINAL_MOD | ◌᷻ ◌𑑞 | [U+1DFB, U+1145E] | 0 or 1 |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
⁕ 𑑈 𑑉 𑑊 𑑋 𑑌 𑑍 𑑎 𑑏 𑑚 𑑛 𑑝 | [U+2055, U+11448..U+1144F, U+1145A..U+1145B, U+1145D] |
Phags-pa
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE, BASE_OTHER | ꡀ ꡁ ꡂ ꡃ ꡄ ꡅ ꡆ ꡇ ꡈ ꡉ ꡊ ꡋ ꡌ ꡍ ꡎ ꡏ ꡐ ꡑ ꡒ ꡓ ꡔ ꡕ ꡖ ꡗ ꡘ ꡙ ꡚ ꡛ ꡜ ꡝ ꡞ ꡟ ꡠ ꡡ ꡢ ꡣ ꡤ ꡥ ꡦ ꡧ ꡨ ꡩ ꡪ ꡫ ꡬ ꡭ ꡮ ꡯ ꡰ ꡱ ꡲ ꡳ | [U+A840..U+A873, U+00A0] | 1 | |||
VARIATION_SELECTOR | ︀ | U+FE00 | 0 or 1 |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
᠂ ᠃ ᠅ 。 ꡴ ꡵ ꡶ ꡷ | [U+0020, U+1802..U+1803, U+1805, U+202F, U+3002, U+A874..U+A877] |
Rejang
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE, BASE_OTHER | ◌ ꤰ ꤱ ꤲ ꤳ ꤴ ꤵ ꤶ ꤷ ꤸ ꤹ ꤺ ꤻ ꤼ ꤽ ꤾ ꤿ ꥀ ꥁ ꥂ ꥃ ꥄ ꥅ ꥆ | [U+25CC, U+A930..U+A946, U+00A0] | 1 | |||
VOWEL_ABOVE | ◌ꥊ | U+A94A | 0 or more | |||
VOWEL_BELOW | ◌ꥇ ◌ꥈ ◌ꥉ ◌ꥋ ◌ꥌ ◌ꥍ ◌ꥎ | [U+A947..U+A949, U+A94B..U+A94E] | 0 or more | |||
VOWEL_POST | ◌꥓ | U+A953 | 0 or more | |||
CONS_FINAL_ABOVE | ◌ꥏ ◌ꥐ ◌ꥑ | [U+A94F..U+A951] | 0 or more | |||
CONS_FINAL_POST | ◌ꥒ | U+A952 | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
, . : ꥟ | [U+0020, U+002C, U+002E, U+003A, U+A95F] |
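For debugging it can help to see which table row each character of a cluster falls into. The Python sketch below walks a string through the rows of the Rejang table and labels every character. It relies on two properties of this particular table: the rows are flat (no repeating groups or alternatives) and their character sets do not overlap, so a greedy left-to-right pass is sufficient. The names `REJANG_ROWS` and `classify` are hypothetical.

```python
# (class name, code points, minimum count, maximum count or None for "or more")
REJANG_ROWS = [
    ("BASE, BASE_OTHER", {0x25CC, 0x00A0, *range(0xA930, 0xA947)}, 1, 1),
    ("VOWEL_ABOVE", {0xA94A}, 0, None),
    ("VOWEL_BELOW", {*range(0xA947, 0xA94A), *range(0xA94B, 0xA94F)}, 0, None),
    ("VOWEL_POST", {0xA953}, 0, None),
    ("CONS_FINAL_ABOVE", {*range(0xA94F, 0xA952)}, 0, None),
    ("CONS_FINAL_POST", {0xA952}, 0, None),
]


def classify(text, rows=REJANG_ROWS):
    """Label each character with its table row, or return None if text is not one cluster."""
    labels = []
    pos = 0
    for name, code_points, minimum, maximum in rows:
        count = 0
        while (
            pos < len(text)
            and ord(text[pos]) in code_points
            and (maximum is None or count < maximum)
        ):
            labels.append((text[pos], name))
            pos += 1
            count += 1
        if count < minimum:
            return None
    # Leftover characters mean the text is out of order or contains extra material.
    return labels if pos == len(text) else None
```

A result of `None` means that the characters are not in the order the table requires, or that the text contains something other than a single cluster.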
Saurashtra
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ◌ ꢂ ꢃ ꢄ ꢅ ꢆ ꢇ ꢈ ꢉ ꢊ ꢋ ꢌ ꢍ ꢎ ꢏ ꢐ ꢑ ꢒ ꢓ ꢔ ꢕ ꢖ ꢗ ꢘ ꢙ ꢚ ꢛ ꢜ ꢝ ꢞ ꢟ ꢠ ꢡ ꢢ ꢣ ꢤ ꢥ ꢦ ꢧ ꢨ ꢩ ꢪ ꢫ ꢬ ꢭ ꢮ ꢯ ꢰ ꢱ ꢲ ꢳ ꣐ ꣑ ꣒ ꣓ ꣔ ꣕ ꣖ ꣗ ꣘ ꣙ | [U+25CC, U+A882..U+A8B3, U+A8D0..U+A8D9] | 1 | |||
Repeating group | 0 or more | |||||
HALANT | ◌꣄ | U+A8C4 | 1 | |||
BASE | ◌ ꢂ ꢃ ꢄ ꢅ ꢆ ꢇ ꢈ ꢉ ꢊ ꢋ ꢌ ꢍ ꢎ ꢏ ꢐ ꢑ ꢒ ꢓ ꢔ ꢕ ꢖ ꢗ ꢘ ꢙ ꢚ ꢛ ꢜ ꢝ ꢞ ꢟ ꢠ ꢡ ꢢ ꢣ ꢤ ꢥ ꢦ ꢧ ꢨ ꢩ ꢪ ꢫ ꢬ ꢭ ꢮ ꢯ ꢰ ꢱ ꢲ ꢳ ꣐ ꣑ ꣒ ꣓ ꣔ ꣕ ꣖ ꣗ ꣘ ꣙ | [U+25CC, U+A882..U+A8B3, U+A8D0..U+A8D9] | 1 | |||
Alternative 1 | ||||||
HALANT | ◌꣄ | U+A8C4 | 1 | |||
Alternative 2 | ||||||
CONS_MED_POST | ◌ꢴ | U+A8B4 | 0 or 1 | |||
VOWEL_POST | ◌ꢵ ◌ꢶ ◌ꢷ ◌ꢸ ◌ꢹ ◌ꢺ ◌ꢻ ◌ꢼ ◌ꢽ ◌ꢾ ◌ꢿ ◌ꣀ ◌ꣁ ◌ꣂ ◌ꣃ | [U+A8B5..U+A8C3] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌ꣅ | U+A8C5 | 0 or more | |||
VOWEL_MOD_POST | ◌ꢀ ◌ꢁ | [U+A880..U+A881] | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
, . ? ꣎ ꣏ | [U+002C, U+002E, U+003F, U+A8CE..U+A8CF] |
Sharada
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
REPHA | 𑇂◌ 𑇃◌ | [U+111C2..U+111C3] | 0 or 1 | |||
BASE | ◌ 𑆃 𑆄 𑆅 𑆆 𑆇 𑆈 𑆉 𑆊 𑆋 𑆌 𑆍 𑆎 𑆏 𑆐 𑆑 𑆒 𑆓 𑆔 𑆕 𑆖 𑆗 𑆘 𑆙 𑆚 𑆛 𑆜 𑆝 𑆞 𑆟 𑆠 𑆡 𑆢 𑆣 𑆤 𑆥 𑆦 𑆧 𑆨 𑆩 𑆪 𑆫 𑆬 𑆭 𑆮 𑆯 𑆰 𑆱 𑆲 𑇁 𑇐 𑇑 𑇒 𑇓 𑇔 𑇕 𑇖 𑇗 𑇘 𑇙 𑇚 | [U+25CC, U+11183..U+111B2, U+111C1, U+111D0..U+111DA] | 1 | |||
CONS_MOD_BELOW | ◌𑇊 | U+111CA | 0 or more | |||
Repeating group | 0 or more | |||||
HALANT | ◌𑇀 | U+111C0 | 1 | |||
BASE | ◌ 𑆃 𑆄 𑆅 𑆆 𑆇 𑆈 𑆉 𑆊 𑆋 𑆌 𑆍 𑆎 𑆏 𑆐 𑆑 𑆒 𑆓 𑆔 𑆕 𑆖 𑆗 𑆘 𑆙 𑆚 𑆛 𑆜 𑆝 𑆞 𑆟 𑆠 𑆡 𑆢 𑆣 𑆤 𑆥 𑆦 𑆧 𑆨 𑆩 𑆪 𑆫 𑆬 𑆭 𑆮 𑆯 𑆰 𑆱 𑆲 𑇁 𑇐 𑇑 𑇒 𑇓 𑇔 𑇕 𑇖 𑇗 𑇘 𑇙 𑇚 | [U+25CC, U+11183..U+111B2, U+111C1, U+111D0..U+111DA] | 1 | |||
CONS_MOD_BELOW | ◌𑇊 | U+111CA | 0 or more | |||
Alternative 1 | ||||||
HALANT | ◌𑇀 | U+111C0 | 1 | |||
Alternative 2 | ||||||
VOWEL_PRE | ◌𑆴 ◌𑇎 | [U+111B4, U+111CE] | 0 or more | |||
VOWEL_ABOVE | ◌𑆼 ◌𑆽 ◌𑆾 ◌𑆿 ◌𑇋 | [U+111BC..U+111BF, U+111CB] | 0 or more | |||
VOWEL_BELOW | ◌𑆶 ◌𑆷 ◌𑆸 ◌𑆹 ◌𑆺 ◌𑆻 ◌𑇌 | [U+111B6..U+111BB, U+111CC] | 0 or more | |||
VOWEL_POST | ◌𑆳 ◌𑆵 | [U+111B3, U+111B5] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌॑ ◌᳠ ◌𑆀 ◌𑆁 ◌𑇏 | [U+0951, U+1CE0, U+11180..U+11181, U+111CF] | 0 or more | |||
VOWEL_MOD_BELOW | ◌᳗ ◌᳙ ◌᳜ ◌᳝ | [U+1CD7, U+1CD9, U+1CDC..U+1CDD] | 0 or more | |||
VOWEL_MOD_POST | ◌𑆂 | U+11182 | 0 or more | |||
CONS_FINAL_MOD | ◌𑇉 | U+111C9 | 0 or 1 |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
꠰ ꠱ ꠲ ꠳ ꠴ ꠵ ꠸ 𑇄 𑇅 𑇆 𑇇 𑇈 𑇍 𑇛 𑇜 𑇝 𑇞 𑇟 | [U+A830..U+A835, U+A838, U+111C4..U+111C8, U+111CD, U+111DB..U+111DF] |
Siddham
The USE decomposes the following multi-part vowels, which therefore don’t show up in the encoding order:
Composed Character | Composed Encoding | Decomposed Characters | Decomposed Encoding |
---|---|---|---|
◌𑖺 | U+115BA | ◌𑖸 ◌𑖯 | <U+115B8, U+115AF> |
◌𑖻 | U+115BB | ◌𑖹 ◌𑖯 | <U+115B9, U+115AF> |
Encoding order:
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ◌ 𑖀 𑖁 𑖂 𑖃 𑖄 𑖅 𑖆 𑖇 𑖈 𑖉 𑖊 𑖋 𑖌 𑖍 𑖎 𑖏 𑖐 𑖑 𑖒 𑖓 𑖔 𑖕 𑖖 𑖗 𑖘 𑖙 𑖚 𑖛 𑖜 𑖝 𑖞 𑖟 𑖠 𑖡 𑖢 𑖣 𑖤 𑖥 𑖦 𑖧 𑖨 𑖩 𑖪 𑖫 𑖬 𑖭 𑖮 𑗘 𑗙 𑗚 𑗛 | [U+25CC, U+11580..U+115AE, U+115D8..U+115DB] | 1 | |||
CONS_MOD_BELOW | ◌𑗀 | U+115C0 | 0 or more | |||
Repeating group | 0 or more | |||||
HALANT | ◌𑖿 | U+115BF | 1 | |||
BASE | ◌ 𑖀 𑖁 𑖂 𑖃 𑖄 𑖅 𑖆 𑖇 𑖈 𑖉 𑖊 𑖋 𑖌 𑖍 𑖎 𑖏 𑖐 𑖑 𑖒 𑖓 𑖔 𑖕 𑖖 𑖗 𑖘 𑖙 𑖚 𑖛 𑖜 𑖝 𑖞 𑖟 𑖠 𑖡 𑖢 𑖣 𑖤 𑖥 𑖦 𑖧 𑖨 𑖩 𑖪 𑖫 𑖬 𑖭 𑖮 𑗘 𑗙 𑗚 𑗛 | [U+25CC, U+11580..U+115AE, U+115D8..U+115DB] | 1 | |||
CONS_MOD_BELOW | ◌𑗀 | U+115C0 | 0 or more | |||
Alternative 1 | ||||||
HALANT | ◌𑖿 | U+115BF | 1 | |||
Alternative 2 | ||||||
VOWEL_PRE | ◌𑖰 ◌𑖸 ◌𑖹 | [U+115B0, U+115B8..U+115B9] | 0 or more | |||
VOWEL_BELOW | ◌𑖲 ◌𑖳 ◌𑖴 ◌𑖵 ◌𑗜 ◌𑗝 | [U+115B2..U+115B5, U+115DC..U+115DD] | 0 or more | |||
VOWEL_POST | ◌𑖯 ◌𑖱 | [U+115AF, U+115B1] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌𑖼 ◌𑖽 | [U+115BC..U+115BD] | 0 or more | |||
VOWEL_MOD_POST | ◌𑖾 | U+115BE | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
𑗁 𑗂 𑗃 𑗄 𑗅 𑗆 𑗇 𑗈 𑗉 𑗊 𑗋 𑗌 𑗍 𑗎 𑗏 𑗐 𑗑 𑗒 𑗓 𑗔 𑗕 𑗖 𑗗 | [U+115C1..U+115D7] |
Known bugs:
- Fully decomposed split vowels may occupy more than one position (affects U+115B9)
- Problems in the “Split vowel handling” section (affects U+115B9)
Sinhala
The USE decomposes the following multi-part vowels, which therefore don’t show up in the encoding order:
Composed Character | Composed Encoding | Decomposed Characters | Decomposed Encoding |
---|---|---|---|
◌ේ | U+0DDA | ◌ෙ ◌් | <U+0DD9, U+0DCA> |
◌ො | U+0DDC | ◌ෙ ◌ා | <U+0DD9, U+0DCF> |
◌ෝ | U+0DDD | ◌ෙ ◌ා ◌් | <U+0DD9, U+0DCF, U+0DCA> |
◌ෞ | U+0DDE | ◌ෙ ◌ෟ | <U+0DD9, U+0DDF> |
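Software that compares text against the encoding order below can apply the same decompositions first. The Python sketch below does this with `str.translate`, using only the four mappings from the table above. The names `SINHALA_DECOMPOSITIONS` and `decompose` are invented for this illustration, and the sketch does not claim to reproduce how the USE performs the decomposition internally.

```python
# Multi-part vowels and their decompositions, copied from the table above.
SINHALA_DECOMPOSITIONS = {
    0x0DDA: "\u0DD9\u0DCA",
    0x0DDC: "\u0DD9\u0DCF",
    0x0DDD: "\u0DD9\u0DCF\u0DCA",
    0x0DDE: "\u0DD9\u0DDF",
}


def decompose(text, table=SINHALA_DECOMPOSITIONS):
    """Replace each composed multi-part vowel with its decomposed sequence."""
    # str.translate accepts a dict that maps code points to replacement strings.
    return text.translate(table)
```

For example, `decompose("\u0D9A\u0DDD")` yields the four-character sequence <U+0D9A, U+0DD9, U+0DCF, U+0DCA>.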
Encoding order:
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | 0 1 2 3 4 5 6 7 8 9 අ ආ ඇ ඈ ඉ ඊ උ ඌ ඍ ඎ ඏ ඐ එ ඒ ඓ ඔ ඕ ඖ ක ඛ ග ඝ ඞ ඟ ච ඡ ජ ඣ ඤ ඥ ඦ ට ඨ ඩ ඪ ණ ඬ ත ථ ද ධ න ඳ ප ඵ බ භ ම ඹ ය ර ල ව ශ ෂ ස හ ළ ෆ ෦ ෧ ෨ ෩ ෪ ෫ ෬ ෭ ෮ ෯ ◌ 𑇡 𑇢 𑇣 𑇤 𑇥 𑇦 𑇧 𑇨 𑇩 𑇪 𑇫 𑇬 𑇭 𑇮 𑇯 𑇰 𑇱 𑇲 𑇳 𑇴 | [U+0030..U+0039, U+0D85..U+0D96, U+0D9A..U+0DB1, U+0DB3..U+0DBB, U+0DBD, U+0DC0..U+0DC6, U+0DE6..U+0DEF, U+25CC, U+111E1..U+111F4] | 1 | |||
Repeating group | 0 or more | |||||
HALANT | ◌් | U+0DCA | 1 | |||
BASE | 0 1 2 3 4 5 6 7 8 9 අ ආ ඇ ඈ ඉ ඊ උ ඌ ඍ ඎ ඏ ඐ එ ඒ ඓ ඔ ඕ ඖ ක ඛ ග ඝ ඞ ඟ ච ඡ ජ ඣ ඤ ඥ ඦ ට ඨ ඩ ඪ ණ ඬ ත ථ ද ධ න ඳ ප ඵ බ භ ම ඹ ය ර ල ව ශ ෂ ස හ ළ ෆ ෦ ෧ ෨ ෩ ෪ ෫ ෬ ෭ ෮ ෯ ◌ 𑇡 𑇢 𑇣 𑇤 𑇥 𑇦 𑇧 𑇨 𑇩 𑇪 𑇫 𑇬 𑇭 𑇮 𑇯 𑇰 𑇱 𑇲 𑇳 𑇴 | [U+0030..U+0039, U+0D85..U+0D96, U+0D9A..U+0DB1, U+0DB3..U+0DBB, U+0DBD, U+0DC0..U+0DC6, U+0DE6..U+0DEF, U+25CC, U+111E1..U+111F4] | 1 | |||
Alternative 1 | ||||||
HALANT | ◌් | U+0DCA | 1 | |||
Alternative 2 | ||||||
VOWEL_PRE | ◌ෙ ◌ෛ | [U+0DD9, U+0DDB] | 0 or more | |||
VOWEL_ABOVE | ◌ි ◌ී | [U+0DD2..U+0DD3] | 0 or more | |||
VOWEL_BELOW | ◌ු ◌ූ | [U+0DD4, U+0DD6] | 0 or more | |||
VOWEL_POST | ◌ා ◌ැ ◌ෑ ◌ෘ ◌ෟ ◌ෲ ◌ෳ | [U+0DCF..U+0DD1, U+0DD8, U+0DDF, U+0DF2..U+0DF3] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌ඁ | U+0D81 | 0 or more | |||
VOWEL_MOD_POST | ◌ං ◌ඃ | [U+0D82..U+0D83] | 0 or more |
The USE does not allow the following character sequences because there are equivalent individual characters:
Character sequence | Encoding |
---|---|
අ ◌ා | <U+0D85, U+0DCF> |
අ ◌ැ | <U+0D85, U+0DD0> |
අ ◌ෑ | <U+0D85, U+0DD1> |
උ ◌ෟ | <U+0D8B, U+0DDF> |
ඍ ◌ෘ | <U+0D8D, U+0DD8> |
ඏ ◌ෟ | <U+0D8F, U+0DDF> |
එ ◌් | <U+0D91, U+0DCA> |
එ ◌ෙ | <U+0D91, U+0DD9> |
එ ◌ේ | <U+0D91, U+0DDA> |
එ ◌ො | <U+0D91, U+0DDC> |
එ ◌ෝ | <U+0D91, U+0DDD> |
එ ◌ෞ | <U+0D91, U+0DDE> |
ඔ ◌ෟ | <U+0D94, U+0DDF> |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
। ॥ ෴ ᳲ | [U+0964..U+0965, U+0DF4, U+1CF2] |
Known bugs:
- Decomposed Sinhala Vowels (affects U+0DDA, U+0DDD, U+0DCA)
Soyombo
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
REPHA | 𑪄◌ 𑪅◌ 𑪆◌ 𑪇◌ 𑪈◌ 𑪉◌ | [U+11A84..U+11A89] | 0 or 1 | |||
BASE | ◌ 𑩐 𑩜 𑩝 𑩞 𑩟 𑩠 𑩡 𑩢 𑩣 𑩤 𑩥 𑩦 𑩧 𑩨 𑩩 𑩪 𑩫 𑩬 𑩭 𑩮 𑩯 𑩰 𑩱 𑩲 𑩳 𑩴 𑩵 𑩶 𑩷 𑩸 𑩹 𑩺 𑩻 𑩼 𑩽 𑩾 𑩿 𑪀 𑪁 𑪂 𑪃 𑪝 | [U+25CC, U+11A50, U+11A5C..U+11A83, U+11A9D] | 1 | |||
CONS_MOD_ABOVE | ◌𑪘 | U+11A98 | 0 or more | |||
Repeating group | 0 or more | |||||
HALANT | ◌𑪙 | U+11A99 | 1 | |||
BASE | ◌ 𑩐 𑩜 𑩝 𑩞 𑩟 𑩠 𑩡 𑩢 𑩣 𑩤 𑩥 𑩦 𑩧 𑩨 𑩩 𑩪 𑩫 𑩬 𑩭 𑩮 𑩯 𑩰 𑩱 𑩲 𑩳 𑩴 𑩵 𑩶 𑩷 𑩸 𑩹 𑩺 𑩻 𑩼 𑩽 𑩾 𑩿 𑪀 𑪁 𑪂 𑪃 𑪝 | [U+25CC, U+11A50, U+11A5C..U+11A83, U+11A9D] | 1 | |||
CONS_MOD_ABOVE | ◌𑪘 | U+11A98 | 0 or more | |||
Alternative 1 | ||||||
HALANT | ◌𑪙 | U+11A99 | 1 | |||
Alternative 2 | ||||||
VOWEL_ABOVE | ◌𑩑 ◌𑩔 ◌𑩕 ◌𑩖 | [U+11A51, U+11A54..U+11A56] | 0 or more | |||
VOWEL_BELOW | ◌𑩒 ◌𑩓 ◌𑩙 ◌𑩚 ◌𑩛 | [U+11A52..U+11A53, U+11A59..U+11A5B] | 0 or more | |||
VOWEL_POST | ◌𑩗 ◌𑩘 | [U+11A57..U+11A58] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌𑪖 | U+11A96 | 0 or more | |||
VOWEL_MOD_POST | ◌𑪗 | U+11A97 | 0 or more | |||
CONS_FINAL_BELOW | ◌𑪊 ◌𑪋 ◌𑪌 ◌𑪍 ◌𑪎 ◌𑪏 ◌𑪐 ◌𑪑 ◌𑪒 ◌𑪓 ◌𑪔 ◌𑪕 | [U+11A8A..U+11A95] | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
𑪚 𑪛 𑪜 𑪞 𑪟 𑪠 𑪡 𑪢 | [U+11A9A..U+11A9C, U+11A9E..U+11AA2] |
Sundanese
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE, BASE_OTHER | ᮃ ᮄ ᮅ ᮆ ᮇ ᮈ ᮉ ᮊ ᮋ ᮌ ᮍ ᮎ ᮏ ᮐ ᮑ ᮒ ᮓ ᮔ ᮕ ᮖ ᮗ ᮘ ᮙ ᮚ ᮛ ᮜ ᮝ ᮞ ᮟ ᮠ ᮮ ᮯ ᮰ ᮱ ᮲ ᮳ ᮴ ᮵ ᮶ ᮷ ᮸ ᮹ ᮺ ᮻ ᮼ ᮽ ᮾ ᮿ ◌ | [U+1B83..U+1BA0, U+1BAE..U+1BBF, U+25CC, U+00A0] | 1 | |||
Repeating group | 0 or more | |||||
Alternative 1 | ||||||
HALANT | ◌᮫ | U+1BAB | 1 | |||
BASE | ᮃ ᮄ ᮅ ᮆ ᮇ ᮈ ᮉ ᮊ ᮋ ᮌ ᮍ ᮎ ᮏ ᮐ ᮑ ᮒ ᮓ ᮔ ᮕ ᮖ ᮗ ᮘ ᮙ ᮚ ᮛ ᮜ ᮝ ᮞ ᮟ ᮠ ᮮ ᮯ ᮰ ᮱ ᮲ ᮳ ᮴ ᮵ ᮶ ᮷ ᮸ ᮹ ᮺ ᮻ ᮼ ᮽ ᮾ ᮿ ◌ | [U+1B83..U+1BA0, U+1BAE..U+1BBF, U+25CC] | 1 | |||
Alternative 2 | ||||||
CONS_SUB | ◌ᮡ ◌ᮢ ◌ᮣ ◌ᮬ ◌ᮭ | [U+1BA1..U+1BA3, U+1BAC..U+1BAD] | 1 | |||
Alternative 1 | ||||||
HALANT | ◌᮫ | U+1BAB | 1 | |||
Alternative 2 | ||||||
VOWEL_PRE | ◌ᮦ | U+1BA6 | 0 or more | |||
VOWEL_ABOVE | ◌ᮤ ◌ᮨ ◌ᮩ | [U+1BA4, U+1BA8..U+1BA9] | 0 or more | |||
VOWEL_BELOW | ◌ᮥ | U+1BA5 | 0 or more | |||
VOWEL_POST | ◌ᮧ ◌᮪ | [U+1BA7, U+1BAA] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌ᮀ | U+1B80 | 0 or more | |||
VOWEL_MOD_POST | ◌ᮂ | U+1B82 | 0 or more | |||
CONS_FINAL_ABOVE | ◌ᮁ | U+1B81 | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
" , . ? ᳀ ᳁ ᳂ ᳃ ᳄ ᳅ ᳆ ᳇ “ ” | [U+0020, U+0022, U+002C, U+002E, U+003F, U+1CC0..U+1CC7, U+201C..U+201D] |
Syloti Nagri
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ০ ১ ২ ৩ ৪ ৫ ৬ ৭ ৮ ৯ ◌ ꠀ ꠁ ꠃ ꠄ ꠅ ꠇ ꠈ ꠉ ꠊ ꠌ ꠍ ꠎ ꠏ ꠐ ꠑ ꠒ ꠓ ꠔ ꠕ ꠖ ꠗ ꠘ ꠙ ꠚ ꠛ ꠜ ꠝ ꠞ ꠟ ꠠ ꠡ ꠢ | [U+09E6..U+09EF, U+25CC, U+A800..U+A801, U+A803..U+A805, U+A807..U+A80A, U+A80C..U+A822] | 1 | |||
Repeating group | 0 or more | |||||
HALANT | ◌꠆ | U+A806 | 1 | |||
BASE | ০ ১ ২ ৩ ৪ ৫ ৬ ৭ ৮ ৯ ◌ ꠀ ꠁ ꠃ ꠄ ꠅ ꠇ ꠈ ꠉ ꠊ ꠌ ꠍ ꠎ ꠏ ꠐ ꠑ ꠒ ꠓ ꠔ ꠕ ꠖ ꠗ ꠘ ꠙ ꠚ ꠛ ꠜ ꠝ ꠞ ꠟ ꠠ ꠡ ꠢ | [U+09E6..U+09EF, U+25CC, U+A800..U+A801, U+A803..U+A805, U+A807..U+A80A, U+A80C..U+A822] | 1 | |||
Alternative 1 | ||||||
HALANT | ◌꠆ | U+A806 | 1 | |||
Alternative 2 | ||||||
VOWEL_ABOVE | ◌ꠂ ◌ꠦ | [U+A802, U+A826] | 0 or more | |||
VOWEL_BELOW | ◌ꠥ ◌꠬ | [U+A825, U+A82C] | 0 or more | |||
VOWEL_POST | ◌ꠣ ◌ꠤ ◌ꠧ | [U+A823..U+A824, U+A827] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌ꠋ | U+A80B | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
, . : ; ? । ॥ ⁕ ꠨ ꠩ ꠪ ꠫ | [U+002C, U+002E, U+003A..U+003B, U+003F, U+0964..U+0965, U+2055, U+A828..U+A82B] |
Tagalog
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ᜀ ᜁ ᜂ ᜃ ᜄ ᜅ ᜆ ᜇ ᜈ ᜉ ᜊ ᜋ ᜌ ᜍ ᜎ ᜏ ᜐ ᜑ ᜟ ◌ | [U+1700..U+1711, U+171F, U+25CC] | 1 | |||
VOWEL_ABOVE | ◌ᜒ | U+1712 | 0 or more | |||
VOWEL_BELOW | ◌ᜓ ◌᜔ | [U+1713..U+1714] | 0 or more | |||
VOWEL_POST | ◌᜕ | U+1715 | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
᜵ ᜶ | [U+1735..U+1736] |
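The same kind of pattern can split running text into clusters, for example to move a cursor or delete text one cluster at a time. The Python sketch below builds a pattern from the Tagalog table and treats every character that does not start a cluster, including the punctuation listed above, as a segment of its own. The names `TAGALOG_CLUSTER` and `segment` are hypothetical.

```python
import re

# BASE, then VOWEL_ABOVE, VOWEL_BELOW, VOWEL_POST, each 0 or more, in table order.
TAGALOG_CLUSTER = "[\u1700-\u1711\u171F\u25CC]\u1712*[\u1713\u1714]*\u1715*"

# One token is either a whole cluster or any single other character.
_TOKEN = re.compile(f"{TAGALOG_CLUSTER}|.", re.DOTALL)


def segment(text):
    """Split text into clusters and stand-alone characters, preserving order."""
    return _TOKEN.findall(text)
```

Joining the returned segments always reproduces the input, because every character is consumed either as part of a cluster or as a single-character segment.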
Tagbanwa
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ᝠ ᝡ ᝢ ᝣ ᝤ ᝥ ᝦ ᝧ ᝨ ᝩ ᝪ ᝫ ᝬ ᝮ ᝯ ᝰ ◌ | [U+1760..U+176C, U+176E..U+1770, U+25CC] | 1 | |||
VOWEL_ABOVE | ◌ᝲ | U+1772 | 0 or more | |||
VOWEL_BELOW | ◌ᝳ | U+1773 | 0 or more |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
᜵ ᜶ | [U+1735..U+1736] |
Tai Le
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | 0 1 2 3 4 5 6 7 8 9 ၀ ၁ ၂ ၃ ၄ ၅ ၆ ၇ ၈ ၉ ᥐ ᥑ ᥒ ᥓ ᥔ ᥕ ᥖ ᥗ ᥘ ᥙ ᥚ ᥛ ᥜ ᥝ ᥞ ᥟ ᥠ ᥡ ᥢ ᥣ ᥤ ᥥ ᥦ ᥧ ᥨ ᥩ ᥪ ᥫ ᥬ ᥭ ᥰ ᥱ ᥲ ᥳ ᥴ ◌ | [U+0030..U+0039, U+1040..U+1049, U+1950..U+196D, U+1970..U+1974, U+25CC] | 1 |
The following characters are used with the script but can not be part of clusters:
Characters | Encoding |
---|---|
◌̀ ◌́ ◌̇ ◌̈ ◌̌ | [U+0300..U+0301, U+0307..U+0308, U+030C] |
Known bugs:
- USE cluster definition incompatible with non-Brahmic combining marks (affects U+0300..U+0301, U+0307..U+0308, U+030C)
Tai Tham
Tai Tham is not fully supported by the USE. Orthographic syllables in Tai Tham can be more complicated than those of most other scripts, and there’s no agreement yet on how to encode them in Unicode. See the Topical Document List: Tai Tham.
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | 0 1 2 3 4 5 6 7 8 9 ᨠ ᨡ ᨢ ᨣ ᨤ ᨥ ᨦ ᨧ ᨨ ᨩ ᨪ ᨫ ᨬ ᨭ ᨮ ᨯ ᨰ ᨱ ᨲ ᨳ ᨴ ᨵ ᨶ ᨷ ᨸ ᨹ ᨺ ᨻ ᨼ ᨽ ᨾ ᨿ ᩀ ᩁ ᩂ ᩃ ᩄ ᩅ ᩆ ᩇ ᩈ ᩉ ᩊ ᩋ ᩌ ᩍ ᩎ ᩏ ᩐ ᩑ ᩒ ᩓ ᩔ ᪀ ᪁ ᪂ ᪃ ᪄ ᪅ ᪆ ᪇ ᪈ ᪉ ᪐ ᪑ ᪒ ᪓ ᪔ ᪕ ᪖ ᪗ ᪘ ᪙ ◌ | [U+0030..U+0039, U+1A20..U+1A54, U+1A80..U+1A89, U+1A90..U+1A99, U+25CC] | 1 | |||
Repeating group | 0 or more | |||||
Alternative 1 | ||||||
HALANT | ◌᩠ | U+1A60 | 1 | |||
BASE | 0 1 2 3 4 5 6 7 8 9 ᨠ ᨡ ᨢ ᨣ ᨤ ᨥ ᨦ ᨧ ᨨ ᨩ ᨪ ᨫ ᨬ ᨭ ᨮ ᨯ ᨰ ᨱ ᨲ ᨳ ᨴ ᨵ ᨶ ᨷ ᨸ ᨹ ᨺ ᨻ ᨼ ᨽ ᨾ ᨿ ᩀ ᩁ ᩂ ᩃ ᩄ ᩅ ᩆ ᩇ ᩈ ᩉ ᩊ ᩋ ᩌ ᩍ ᩎ ᩏ ᩐ ᩑ ᩒ ᩓ ᩔ ᪀ ᪁ ᪂ ᪃ ᪄ ᪅ ᪆ ᪇ ᪈ ᪉ ᪐ ᪑ ᪒ ᪓ ᪔ ᪕ ᪖ ᪗ ᪘ ᪙ ◌ | [U+0030..U+0039, U+1A20..U+1A54, U+1A80..U+1A89, U+1A90..U+1A99, U+25CC] | 1 | |||
Alternative 2 | ||||||
CONS_SUB | ◌ᩗ ◌ᩛ ◌ᩜ ◌ᩝ ◌ᩞ | [U+1A57, U+1A5B..U+1A5E] | 1 | |||
Alternative 1 | ||||||
HALANT | ◌᩠ | U+1A60 | 1 | |||
Alternative 2 | ||||||
CONS_MED_PRE | ◌ᩕ | U+1A55 | 0 or 1 | |||
CONS_MED_BELOW | ◌ᩖ | U+1A56 | 0 or 1 | |||
VOWEL_PRE | ◌ᩮ ◌ᩯ ◌ᩰ ◌ᩱ ◌ᩲ | [U+1A6E..U+1A72] | 0 or more | |||
VOWEL_ABOVE | ◌ᩢ ◌ᩥ ◌ᩦ ◌ᩧ ◌ᩨ ◌ᩫ ◌ᩳ ◌᩺ | [U+1A62, U+1A65..U+1A68, U+1A6B, U+1A73, U+1A7A] | 0 or more | |||
VOWEL_BELOW | ◌ᩩ ◌ᩪ ◌ᩬ | [U+1A69..U+1A6A, U+1A6C] | 0 or more | |||
VOWEL_POST | ◌ᩡ ◌ᩣ ◌ᩤ ◌ᩭ | [U+1A61, U+1A63..U+1A64, U+1A6D] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌ᩴ ◌᩵ ◌᩶ ◌᩷ ◌᩸ ◌᩹ ◌᩻ ◌᩼ | [U+1A74..U+1A79, U+1A7B..U+1A7C] | 0 or more | |||
VOWEL_MOD_BELOW | ◌᩿ | U+1A7F | 0 or more | |||
CONS_FINAL_ABOVE | ◌ᩘ ◌ᩙ | [U+1A58..U+1A59] | 0 or more |
The following characters are used with the script but cannot be part of clusters:
Characters | Encoding |
---|---|
! " ( ) ? ◌ᩚ ᪠ ᪡ ᪢ ᪣ ᪤ ᪥ ᪦ ᪧ ᪨ ᪩ ᪪ ᪫ ᪬ ᪭ “ ” | [U+0021..U+0022, U+0028..U+0029, U+003F, U+1A5A, U+1AA0..U+1AAD, U+201C..U+201D] |
Known bugs:
- Reordering of multiple pre-base vowels not specified (affects U+1A6E..U+1A72)
- Consonant_Initial_Postfixed should be mapped to CONS_MED (affects U+1A5A)
Tai Viet
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE, BASE_OTHER | ◌ ꪀ ꪁ ꪂ ꪃ ꪄ ꪅ ꪆ ꪇ ꪈ ꪉ ꪊ ꪋ ꪌ ꪍ ꪎ ꪏ ꪐ ꪑ ꪒ ꪓ ꪔ ꪕ ꪖ ꪗ ꪘ ꪙ ꪚ ꪛ ꪜ ꪝ ꪞ ꪟ ꪠ ꪡ ꪢ ꪣ ꪤ ꪥ ꪦ ꪧ ꪨ ꪩ ꪪ ꪫ ꪬ ꪭ ꪮ ꪯ ꪱ ꪵ ꪶ ꪹ ꪺ ꪻ ꪼ ꪽ ꫀ ꫂ | [U+25CC, U+AA80..U+AAAF, U+AAB1, U+AAB5..U+AAB6, U+AAB9..U+AABD, U+AAC0, U+AAC2, U+00A0] | 1 | |||
VOWEL_ABOVE | ◌ꪰ ◌ꪲ ◌ꪳ ◌ꪷ ◌ꪸ ◌ꪾ | [U+AAB0, U+AAB2..U+AAB3, U+AAB7..U+AAB8, U+AABE] | 0 or more | |||
VOWEL_BELOW | ◌ꪴ | U+AAB4 | 0 or more | |||
VOWEL_MOD_ABOVE | ◌꪿ ◌꫁ | [U+AABF, U+AAC1] | 0 or more |
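As an illustration, the Tai Viet encoding order in the table above can be read as a regular expression: one base, followed by each optional class in table order. The following is a minimal sketch, not part of the USE; the names `TAI_VIET_CLUSTER` and `is_tai_viet_cluster` are hypothetical, and the character ranges are copied from the Encoding column.

```python
import re

# Character classes copied from the Encoding column of the Tai Viet table.
BASE = "\u25CC\uAA80-\uAAAF\uAAB1\uAAB5\uAAB6\uAAB9-\uAABD\uAAC0\uAAC2\u00A0"
VOWEL_ABOVE = "\uAAB0\uAAB2\uAAB3\uAAB7\uAAB8\uAABE"
VOWEL_BELOW = "\uAAB4"
VOWEL_MOD_ABOVE = "\uAABF\uAAC1"

# Exactly one base, then zero or more of each class, in table order.
TAI_VIET_CLUSTER = re.compile(
    f"[{BASE}][{VOWEL_ABOVE}]*[{VOWEL_BELOW}]*[{VOWEL_MOD_ABOVE}]*"
)


def is_tai_viet_cluster(text: str) -> bool:
    """Return True if text is exactly one cluster in the tabulated order."""
    return TAI_VIET_CLUSTER.fullmatch(text) is not None
```

With this sketch, a base followed by U+AAB2 and then U+AAC1 is accepted, while the same marks in the opposite order are rejected because VOWEL_MOD_ABOVE comes after VOWEL_ABOVE in the table.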
The following characters are used with the script but cannot be part of clusters:
Characters | Encoding |
---|---|
ꫛ ꫜ ꫝ ꫞ ꫟ | [U+0020, U+AADB..U+AADF] |
Takri
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ◌ 𑚀 𑚁 𑚂 𑚃 𑚄 𑚅 𑚆 𑚇 𑚈 𑚉 𑚊 𑚋 𑚌 𑚍 𑚎 𑚏 𑚐 𑚑 𑚒 𑚓 𑚔 𑚕 𑚖 𑚗 𑚘 𑚙 𑚚 𑚛 𑚜 𑚝 𑚞 𑚟 𑚠 𑚡 𑚢 𑚣 𑚤 𑚥 𑚦 𑚧 𑚨 𑚩 𑚪 𑚸 𑛀 𑛁 𑛂 𑛃 𑛄 𑛅 𑛆 𑛇 𑛈 𑛉 | [U+25CC, U+11680..U+116AA, U+116B8, U+116C0..U+116C9] | 1 | |||
CONS_MOD_BELOW | ◌𑚷 | U+116B7 | 0 or more | |||
Repeating group | 0 or more | |||||
HALANT | ◌𑚶 | U+116B6 | 1 | |||
BASE | ◌ 𑚀 𑚁 𑚂 𑚃 𑚄 𑚅 𑚆 𑚇 𑚈 𑚉 𑚊 𑚋 𑚌 𑚍 𑚎 𑚏 𑚐 𑚑 𑚒 𑚓 𑚔 𑚕 𑚖 𑚗 𑚘 𑚙 𑚚 𑚛 𑚜 𑚝 𑚞 𑚟 𑚠 𑚡 𑚢 𑚣 𑚤 𑚥 𑚦 𑚧 𑚨 𑚩 𑚪 𑚸 𑛀 𑛁 𑛂 𑛃 𑛄 𑛅 𑛆 𑛇 𑛈 𑛉 | [U+25CC, U+11680..U+116AA, U+116B8, U+116C0..U+116C9] | 1 | |||
CONS_MOD_BELOW | ◌𑚷 | U+116B7 | 0 or more | |||
Alternative 1 | ||||||
HALANT | ◌𑚶 | U+116B6 | 1 | |||
Alternative 2 | ||||||
VOWEL_PRE | ◌𑚮 | U+116AE | 0 or more | |||
VOWEL_ABOVE | ◌𑚭 ◌𑚲 ◌𑚳 ◌𑚴 ◌𑚵 | [U+116AD, U+116B2..U+116B5] | 0 or more | |||
VOWEL_BELOW | ◌𑚰 ◌𑚱 | [U+116B0..U+116B1] | 0 or more | |||
VOWEL_POST | ◌𑚯 | U+116AF | 0 or more | |||
VOWEL_MOD_ABOVE | ◌𑚫 | U+116AB | 0 or more | |||
VOWEL_MOD_POST | ◌𑚬 | U+116AC | 0 or more |
The USE does not allow the following character sequences because there are equivalent individual characters:
Character sequence | Encoding |
---|---|
𑚀 ◌𑚭 | <U+11680, U+116AD> |
𑚀 ◌𑚴 | <U+11680, U+116B4> |
𑚀 ◌𑚵 | <U+11680, U+116B5> |
𑚆 ◌𑚲 | <U+11686, U+116B2> |
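A text-processing tool could flag these sequences before handing text to a shaping engine. The following is a minimal sketch under that assumption; the names `DISALLOWED_TAKRI` and `find_disallowed_takri` are hypothetical, and the sequences are copied from the table above.

```python
# Disallowed Takri sequences, copied from the Encoding column above.
DISALLOWED_TAKRI = (
    "\U00011680\U000116AD",
    "\U00011680\U000116B4",
    "\U00011680\U000116B5",
    "\U00011686\U000116B2",
)


def find_disallowed_takri(text: str) -> list[tuple[int, str]]:
    """Return (index, sequence) for every disallowed pair found in text.

    Python strings are indexed by code point, so each Takri character
    counts as one element even though it lies outside the BMP.
    """
    hits = []
    for i in range(len(text) - 1):
        pair = text[i:i + 2]
        if pair in DISALLOWED_TAKRI:
            hits.append((i, pair))
    return hits
```

The function only reports positions; it deliberately does not substitute a replacement character, since the table above does not list the equivalents.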
The following characters are used with the script but cannot be part of clusters:
Characters | Encoding |
---|---|
। ॥ ꠰ ꠱ ꠲ ꠳ ꠴ ꠵ ꠶ ꠷ ꠸ ꠹ 𑚹 | [U+0964..U+0965, U+A830..U+A839, U+116B9] |
Tibetan
The USE decomposes the following multi-part vowels, which therefore don’t show up in the encoding order:
Composed Character | Composed Encoding | Decomposed Characters | Decomposed Encoding |
---|---|---|---|
◌ཱི | U+0F73 | ◌ཱ ◌ི | <U+0F71, U+0F72> |
◌ཱུ | U+0F75 | ◌ཱ ◌ུ | <U+0F71, U+0F74> |
◌ྲྀ | U+0F76 | ◌ྲ ◌ྀ | <U+0FB2, U+0F80> |
◌ླྀ | U+0F78 | ◌ླ ◌ྀ | <U+0FB3, U+0F80> |
◌ཱྀ | U+0F81 | ◌ཱ ◌ྀ | <U+0F71, U+0F80> |
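Code that checks text against the encoding order may want to apply the same decomposition first. The following is a minimal sketch of such a preprocessing step; the names `TIBETAN_DECOMPOSITIONS` and `decompose_tibetan_vowels` are hypothetical, and the mapping is copied from the table above.

```python
# Composed vowel -> decomposed sequence, copied from the table above.
TIBETAN_DECOMPOSITIONS = {
    "\u0F73": "\u0F71\u0F72",
    "\u0F75": "\u0F71\u0F74",
    "\u0F76": "\u0FB2\u0F80",
    "\u0F78": "\u0FB3\u0F80",
    "\u0F81": "\u0F71\u0F80",
}

# str.maketrans accepts single-character keys mapped to replacement strings.
_TRANSLATION = str.maketrans(TIBETAN_DECOMPOSITIONS)


def decompose_tibetan_vowels(text: str) -> str:
    """Replace each composed multi-part vowel with its decomposed sequence."""
    return text.translate(_TRANSLATION)
```

For example, `decompose_tibetan_vowels("\u0F40\u0F73")` yields U+0F40 followed by U+0F71 and U+0F72, while text without composed vowels is returned unchanged.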
Encoding order:
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE, BASE_OTHER | ༀ ༁ ༄ ༅ ༆ ༠ ༡ ༢ ༣ ༤ ༥ ༦ ༧ ༨ ༩ ༪ ༫ ༬ ༭ ༮ ༯ ༰ ༱ ༲ ༳ ཀ ཁ ག གྷ ང ཅ ཆ ཇ ཉ ཊ ཋ ཌ ཌྷ ཎ ཏ ཐ ད དྷ ན པ ཕ བ བྷ མ ཙ ཚ ཛ ཛྷ ཝ ཞ ཟ འ ཡ ར ལ ཤ ཥ ས ཧ ཨ ཀྵ ཪ ཫ ཬ ྈ ྉ ྊ ྋ ྌ ◌ | [U+0F00..U+0F01, U+0F04..U+0F06, U+0F20..U+0F33, U+0F40..U+0F47, U+0F49..U+0F6C, U+0F88..U+0F8C, U+25CC, U+00A0] | 1 | |||
CONS_MOD_ABOVE | ◌༹ | U+0F39 | 0 or more | |||
CONS_MOD_BELOW | ◌ཱ | U+0F71 | 0 or more | |||
Repeating group | 0 or more | |||||
CONS_SUB | ◌ྍ ◌ྎ ◌ྏ ◌ྐ ◌ྑ ◌ྒ ◌ྒྷ ◌ྔ ◌ྕ ◌ྖ ◌ྗ ◌ྙ ◌ྚ ◌ྛ ◌ྜ ◌ྜྷ ◌ྞ ◌ྟ ◌ྠ ◌ྡ ◌ྡྷ ◌ྣ ◌ྤ ◌ྥ ◌ྦ ◌ྦྷ ◌ྨ ◌ྩ ◌ྪ ◌ྫ ◌ྫྷ ◌ྭ ◌ྮ ◌ྯ ◌ྰ ◌ྱ ◌ྲ ◌ླ ◌ྴ ◌ྵ ◌ྶ ◌ྷ ◌ྸ ◌ྐྵ ◌ྺ ◌ྻ ◌ྼ | [U+0F8D..U+0F97, U+0F99..U+0FBC] | 1 | |||
CONS_MOD_ABOVE | ◌༹ | U+0F39 | 0 or more | |||
CONS_MOD_BELOW | ◌ཱ | U+0F71 | 0 or more | |||
VOWEL_ABOVE | ◌ུ ◌ཷ ◌ཹ | [U+0F74, U+0F77, U+0F79] | 0 or more | |||
VOWEL_BELOW | ◌ི ◌ེ ◌ཻ ◌ོ ◌ཽ ◌ྀ ◌྄ | [U+0F72, U+0F7A..U+0F7D, U+0F80, U+0F84] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌ཾ ◌ྂ ◌ྃ ◌྆ ◌྇ | [U+0F7E, U+0F82..U+0F83, U+0F86..U+0F87] | 0 or more | |||
CONS_FINAL_MOD | ◌༵ ◌༷ ◌࿆ | [U+0F35, U+0F37, U+0FC6] | 0 or 1 |
The following characters are used with the script but cannot be part of clusters:
Characters | Encoding |
---|---|
༂ ༃ ༇ ༈ ༉ ༊ ་ ༌ ། ༎ ༏ ༐ ༑ ༒ ༓ ༔ ༕ ༖ ༗ ◌༘ ◌༙ ༚ ༛ ༜ ༝ ༞ ༟ ༴ ༶ ༸ ༺ ༻ ༼ ༽ ◌༾ ◌༿ ◌ཿ ྅ ྾ ྿ ࿀ ࿁ ࿂ ࿃ ࿄ ࿅ ࿇ ࿈ ࿉ ࿊ ࿋ ࿌ ࿎ ࿏ ࿐ ࿑ ࿒ ࿓ ࿔ ࿕ ࿖ ࿗ ࿘ ࿙ ࿚ ☸ | [U+0F02..U+0F03, U+0F07..U+0F1F, U+0F34, U+0F36, U+0F38, U+0F3A..U+0F3F, U+0F7F, U+0F85, U+0FBE..U+0FC5, U+0FC7..U+0FCC, U+0FCE..U+0FDA, U+2638] |
Known bugs:
- Several characters are in two USE classes each (affects U+0F01, U+0F04..U+0F06)
Tirhuta
The USE decomposes the following multi-part vowels, which therefore don’t show up in the encoding order:
Composed Character | Composed Encoding | Decomposed Characters | Decomposed Encoding |
---|---|---|---|
◌𑒻 | U+114BB | ◌𑒹 ◌𑒺 | <U+114B9, U+114BA> |
◌𑒼 | U+114BC | ◌𑒹 ◌𑒰 | <U+114B9, U+114B0> |
◌𑒾 | U+114BE | ◌𑒹 ◌𑒽 | <U+114B9, U+114BD> |
Encoding order:
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
BASE | ◌ 𑒁 𑒂 𑒃 𑒄 𑒅 𑒆 𑒇 𑒈 𑒉 𑒊 𑒋 𑒌 𑒍 𑒎 𑒏 𑒐 𑒑 𑒒 𑒓 𑒔 𑒕 𑒖 𑒗 𑒘 𑒙 𑒚 𑒛 𑒜 𑒝 𑒞 𑒟 𑒠 𑒡 𑒢 𑒣 𑒤 𑒥 𑒦 𑒧 𑒨 𑒩 𑒪 𑒫 𑒬 𑒭 𑒮 𑒯 𑓄 𑓐 𑓑 𑓒 𑓓 𑓔 𑓕 𑓖 𑓗 𑓘 𑓙 | [U+25CC, U+11481..U+114AF, U+114C4, U+114D0..U+114D9] | 1 | |||
CONS_MOD_BELOW | ◌𑓃 | U+114C3 | 0 or more | |||
Repeating group | 0 or more | |||||
HALANT | ◌𑓂 | U+114C2 | 1 | |||
BASE | ◌ 𑒁 𑒂 𑒃 𑒄 𑒅 𑒆 𑒇 𑒈 𑒉 𑒊 𑒋 𑒌 𑒍 𑒎 𑒏 𑒐 𑒑 𑒒 𑒓 𑒔 𑒕 𑒖 𑒗 𑒘 𑒙 𑒚 𑒛 𑒜 𑒝 𑒞 𑒟 𑒠 𑒡 𑒢 𑒣 𑒤 𑒥 𑒦 𑒧 𑒨 𑒩 𑒪 𑒫 𑒬 𑒭 𑒮 𑒯 𑓄 𑓐 𑓑 𑓒 𑓓 𑓔 𑓕 𑓖 𑓗 𑓘 𑓙 | [U+25CC, U+11481..U+114AF, U+114C4, U+114D0..U+114D9] | 1 | |||
CONS_MOD_BELOW | ◌𑓃 | U+114C3 | 0 or more | |||
Alternative 1 | ||||||
HALANT | ◌𑓂 | U+114C2 | 1 | |||
Alternative 2 | ||||||
VOWEL_PRE | ◌𑒱 ◌𑒹 | [U+114B1, U+114B9] | 0 or more | |||
VOWEL_ABOVE | ◌𑒺 | U+114BA | 0 or more | |||
VOWEL_BELOW | ◌𑒳 ◌𑒴 ◌𑒵 ◌𑒶 ◌𑒷 ◌𑒸 | [U+114B3..U+114B8] | 0 or more | |||
VOWEL_POST | ◌𑒰 ◌𑒲 ◌𑒽 | [U+114B0, U+114B2, U+114BD] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌॑ ◌𑒿 ◌𑓀 | [U+0951, U+114BF..U+114C0] | 0 or more | |||
VOWEL_MOD_BELOW | ◌॒ | U+0952 | 0 or more | |||
VOWEL_MOD_POST | ◌𑓁 | U+114C1 | 0 or more |
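The Tirhuta table above combines a repeating group with two alternatives, which may be easier to follow as a regular expression. The following is a minimal sketch based on one reading of the table: it assumes that the HALANT, BASE, and CONS_MOD_BELOW rows after "Repeating group" form the group, and that Alternative 2 covers all vowel and vowel modifier rows. The names are hypothetical, and the sketch does not apply the decomposition or the disallowed sequences listed for this script.

```python
import re

# Character classes copied from the Encoding column of the Tirhuta table.
BASE = "[\u25CC\U00011481-\U000114AF\U000114C4\U000114D0-\U000114D9]"
CONS_MOD_BELOW = "\U000114C3"
HALANT = "\U000114C2"
VOWEL_PRE = "[\U000114B1\U000114B9]"
VOWEL_ABOVE = "\U000114BA"
VOWEL_BELOW = "[\U000114B3-\U000114B8]"
VOWEL_POST = "[\U000114B0\U000114B2\U000114BD]"
VOWEL_MOD_ABOVE = "[\u0951\U000114BF\U000114C0]"
VOWEL_MOD_BELOW = "\u0952"
VOWEL_MOD_POST = "\U000114C1"

# Alternative 2: every class is "0 or more", in table order.
VOWELS_AND_MODIFIERS = (
    f"{VOWEL_PRE}*{VOWEL_ABOVE}*{VOWEL_BELOW}*{VOWEL_POST}*"
    f"{VOWEL_MOD_ABOVE}*{VOWEL_MOD_BELOW}*{VOWEL_MOD_POST}*"
)

TIRHUTA_CLUSTER = re.compile(
    f"{BASE}{CONS_MOD_BELOW}*"                # initial base and its modifiers
    f"(?:{HALANT}{BASE}{CONS_MOD_BELOW}*)*"   # repeating group: conjuncts
    f"(?:{HALANT}|{VOWELS_AND_MODIFIERS})"    # Alternative 1 or Alternative 2
)


def is_tirhuta_cluster(text: str) -> bool:
    """Return True if text is exactly one cluster under the reading above."""
    return TIRHUTA_CLUSTER.fullmatch(text) is not None
```

Under this reading, a base followed by a halant and a second base is one cluster, as is a base ending in a halant, whereas a VOWEL_ABOVE mark placed before a VOWEL_PRE mark is rejected.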
The USE does not allow the following character sequences because there are equivalent individual characters:
Character sequence | Encoding |
---|---|
𑒁 ◌𑒰 | <U+11481, U+114B0> |
𑒋 ◌𑒺 | <U+1148B, U+114BA> |
𑒍 ◌𑒺 | <U+1148D, U+114BA> |
𑒪 ◌𑒵 | <U+114AA, U+114B5> |
𑒪 ◌𑒶 | <U+114AA, U+114B6> |
The following characters are used with the script but cannot be part of clusters:
Characters | Encoding |
---|---|
। ॥ ৴ ᳲ ꠰ ꠱ ꠲ ꠳ ꠴ ꠵ ꠶ ꠷ ꠸ ꠹ 𑒀 𑓅 𑓆 𑓇 | [U+0964..U+0965, U+09F4, U+1CF2, U+A830..U+A839, U+11480, U+114C5..U+114C7] |
Zanabazar Square
Classes | Characters | Encoding | Count | |||
---|---|---|---|---|---|---|
REPHA | 𑨺◌ | U+11A3A | 0 or 1 | |||
BASE | ◌ 𑨀 𑨋 𑨌 𑨍 𑨎 𑨏 𑨐 𑨑 𑨒 𑨓 𑨔 𑨕 𑨖 𑨗 𑨘 𑨙 𑨚 𑨛 𑨜 𑨝 𑨞 𑨟 𑨠 𑨡 𑨢 𑨣 𑨤 𑨥 𑨦 𑨧 𑨨 𑨩 𑨪 𑨫 𑨬 𑨭 𑨮 𑨯 𑨰 𑨱 𑨲 | [U+25CC, U+11A00, U+11A0B..U+11A32] | 1 | |||
Repeating group | 0 or more | |||||
HALANT | ◌𑩇 | U+11A47 | 1 | |||
BASE | ◌ 𑨀 𑨋 𑨌 𑨍 𑨎 𑨏 𑨐 𑨑 𑨒 𑨓 𑨔 𑨕 𑨖 𑨗 𑨘 𑨙 𑨚 𑨛 𑨜 𑨝 𑨞 𑨟 𑨠 𑨡 𑨢 𑨣 𑨤 𑨥 𑨦 𑨧 𑨨 𑨩 𑨪 𑨫 𑨬 𑨭 𑨮 𑨯 𑨰 𑨱 𑨲 | [U+25CC, U+11A00, U+11A0B..U+11A32] | 1 | |||
Alternative 1 | ||||||
HALANT | ◌𑩇 | U+11A47 | 1 | |||
Alternative 2 | ||||||
CONS_MED_BELOW | ◌𑨻 ◌𑨼 ◌𑨽 ◌𑨾 | [U+11A3B..U+11A3E] | 0 or 1 | |||
VOWEL_ABOVE | ◌𑨁 ◌𑨄 ◌𑨅 ◌𑨆 ◌𑨇 ◌𑨈 ◌𑨉 | [U+11A01, U+11A04..U+11A09] | 0 or more | |||
VOWEL_BELOW | ◌𑨂 ◌𑨃 ◌𑨊 ◌𑨴 | [U+11A02..U+11A03, U+11A0A, U+11A34] | 0 or more | |||
VOWEL_MOD_ABOVE | ◌𑨵 ◌𑨶 ◌𑨷 ◌𑨸 | [U+11A35..U+11A38] | 0 or more | |||
VOWEL_MOD_POST | ◌𑨹 | U+11A39 | 0 or more | |||
CONS_FINAL_MOD | ◌𑨳 | U+11A33 | 0 or 1 |
The following characters are used with the script but cannot be part of clusters:
Characters | Encoding |
---|---|
𑨿 𑩀 𑩁 𑩂 𑩃 𑩄 𑩅 𑩆 | [U+11A3F..U+11A46] |
Known bugs:
- Double MBlw in Zanabazar Square (affects U+11A3B..U+11A3E)
- Several characters are in two USE classes each (affects U+11A3F, U+11A45)