Lontar

Encoding orders of Brahmic scripts

Norbert Lindenberg
November 2, 2023

This article documents the encoding orders that the OpenType Universal Shaping Engine assumes for the Brahmic scripts it supports. Understanding encoding orders is necessary when rendering or otherwise interpreting text in these scripts, as well as when entering text using input methods or otherwise generating text.

Introduction

Text written in a Brahmic script consists of orthographic syllables, two-dimensional visual arrangements of glyphs that form a unit. When encoding orthographic syllables, the Unicode characters corresponding to the glyphs should be arranged in a well-defined order, so that text can be rendered correctly, and compared, searched, or otherwise processed without ambiguities or missed matches. The Unicode Standard does not define this order for most Brahmic scripts, and does so incompletely or ambiguously for others. The shaping engines of OpenType font rendering systems have therefore become the de facto definitions of the encoding orders of orthographic syllables.

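The main tables in this document show the encoding orders for Brahmic scripts as defined by the OpenType Universal Shaping Engine (USE). This information can be used in several ways, for example when rendering or otherwise interpreting text in these scripts, and when entering text using input methods or otherwise generating it.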

The main tables show the classes into which the USE classifies the characters of each script, the order in which their characters can occur within a cluster (the USE’s approximation of an orthographic syllable), and how often they can occur. They may also show that some sections, such as consonant conjunct forms and their associated modifiers, can be repeated, and that after the initial consonant cluster a syllable may continue with either a final virama (for which the USE uses the Hindi term “halant”) or a more substantial sequence of medial consonants, vowels, vowel modifiers, and final consonants.
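As a minimal sketch of how such a table can be read as a pattern, the following Python snippet turns the Buhid table from this article (BASE exactly once, then VOWEL_ABOVE and VOWEL_BELOW, each zero or more times) into a regular expression. The names BUHID_CLUSTER and is_single_cluster are made up for this illustration; this is not how any shaping engine is implemented, and it ignores the joiner characters discussed under Caveats.

```python
import re

# Buhid encoding order as listed in this article's Buhid table:
#   BASE        [U+1740..U+1751, U+25CC]  exactly 1
#   VOWEL_ABOVE U+1752                    0 or more
#   VOWEL_BELOW U+1753                    0 or more
# BUHID_CLUSTER is a hypothetical name used only in this sketch.
BUHID_CLUSTER = re.compile("[\u1740-\u1751\u25CC]\u1752*\u1753*")


def is_single_cluster(text: str) -> bool:
    """Return True if text is exactly one cluster in the tabulated order."""
    return BUHID_CLUSTER.fullmatch(text) is not None
```

A sequence with the below-base vowel before the above-base vowel is rejected as a single cluster, because the table lists VOWEL_ABOVE before VOWEL_BELOW.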

Supplementary tables show characters that are used in a script but can not be part of clusters, and for some scripts decompositions that the USE applies and character sequences that it does not allow because there are equivalent individual characters.

Scripts covered in this document

This document covers scripts that are encoded in Unicode 15.1, have the Unicode property Indic_Syllabic_Category defined for at least some of their characters, and are included in the list of scripts supported in the Universal Shaping Engine.

The Unicode property Indic_Syllabic_Category is generally defined for those characters in Brahmic scripts that participate in forming orthographic syllables. The Kharoshthi script, which was a contemporary rather than a descendant of Brahmi, is included because it forms orthographic syllables similar to Brahmic scripts. Some scripts that are descendants of Brahmi may not be included because their structure has been simplified to a simple linear sequence of glyphs.

Only scripts supported by the USE are included because it is the only OpenType shaping engine that provides a well-defined cluster structure for Brahmic scripts that enables largely interoperable implementations. The documentation for shaping engines supporting other Brahmic scripts (Bengali (Bangla), Devanagari, Gujarati, Gurmukhi, Kannada, Khmer, Lao, Malayalam, Myanmar, Odia (Oriya), Tamil, Thai, Telugu) is incomplete and has been abandoned by its owner, and implementations have diverged substantially, as documented for Devanagari and Khmer. New Tai Lue is missing because OpenType documentation does not assign it to any shaping engine.

Derivation of the tables

The first step in creating these tables was to determine which characters are used with each script. The Unicode Standard includes two properties, Script and Script_Extensions, that provide much of this information. However, the use of characters in the shared scripts Common and Inherited for specific scripts is generally not documented there, and some other information provided in script proposals may have been lost as well. Long-time Unicode contributor Roozbeh Pournader has therefore started a project to collect exemplar characters for each script, and that information was included here. In addition, U+25CC DOTTED CIRCLE was included for all scripts that use combining marks (except variation selectors) or characters with USE class REPHA.

In the classification of characters, some assumptions had to be made to work around USE bug 475, which causes ambiguous classifications for several characters. For characters where the USE overrides the Indic syllabic category, categorization was based on the override. U+00A0 NO-BREAK SPACE was assigned to the BASE_OTHER class to enable its use with combining marks, as recommended in the Unicode Standard. U+200D ZERO WIDTH JOINER was assigned to the ZWJ class. The remaining characters were assigned to the BASE_IND class to avoid creating expectations that may not be met. Another assumption had to be made to work around USE bug 928 by assigning U+1171E AHOM CONSONANT SIGN MEDIAL RA to subclass CONS_MED_PRE.

USE subclasses that the USE merges into one “sigla” are merged into one representative subclass. For example, VOWEL_PRE_ABOVE, VOWEL_PRE_ABOVE_POST, and VOWEL_PRE_POST are merged into VOWEL_PRE.
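The merge can be pictured as a simple lookup. The sketch below contains only the three subclasses named in the example above; the dictionary and function names are hypothetical, and the full set of merged subclasses is not reproduced here.

```python
# Only the VOWEL_PRE entries are taken from the article; a complete
# mapping would need the USE documentation. Names are illustrative.
MERGED_SUBCLASS = {
    "VOWEL_PRE_ABOVE": "VOWEL_PRE",
    "VOWEL_PRE_ABOVE_POST": "VOWEL_PRE",
    "VOWEL_PRE_POST": "VOWEL_PRE",
}


def representative(subclass: str) -> str:
    """Map a USE subclass to its representative; others map to themselves."""
    return MERGED_SUBCLASS.get(subclass, subclass)
```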

The seven regular expressions defining cluster structures in the USE documentation were reduced to three. The USE’s independent cluster is not relevant because no characters of the scripts covered here that fall into the classes starting such clusters allow variation selectors. Instead, a list of characters that can not be part of clusters and are therefore treated as standalone is provided. The standard cluster and virama-terminated cluster were merged into one regular cluster, as they share a long common start, followed by a long or short tail. These tails show up as alternatives in the tables provided. The number-joiner terminated cluster and the numeral cluster were merged into one special cluster, as the only difference is an optional number-joiner. The symbol cluster remains as is, but is only shown for Balinese, the only script where the cluster can be longer than one character. For all other scripts, SYM characters are lumped into the list of characters that can not be part of clusters. The hieroglyph cluster is not relevant for Brahmic scripts.

For each script, all characters used are sorted into their USE classes and subclasses, and inserted into the three regular expressions. Empty classes and subclasses are removed, empty subexpressions are removed, and empty regular expressions are removed. The remainder is formatted into tables whose format is derived from that used to describe encoding orders in the Unicode Standard, but which are extended to preserve the structural information of the underlying regular expressions.
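The pruning step can be sketched as follows, under strong simplifications: the template covers only a base followed by four vowel classes, not the full USE cluster expressions, and all names (TEMPLATE, build_pattern, HANUNOO) are assumptions made for this illustration. The Hanunoo character data is taken from the Hanunoo table in this article.

```python
import re

# Simplified slice of a cluster template: (class name, regex quantifier).
# The real USE expressions contain many more classes and nested groups.
TEMPLATE = [
    ("BASE", ""),
    ("VOWEL_PRE", "*"),
    ("VOWEL_ABOVE", "*"),
    ("VOWEL_BELOW", "*"),
    ("VOWEL_POST", "*"),
]


def build_pattern(classes: dict[str, str]) -> re.Pattern[str]:
    """Insert a script's characters into the template, dropping empty classes."""
    parts = []
    for name, quantifier in TEMPLATE:
        chars = classes.get(name, "")
        if not chars:
            continue  # an empty class contributes nothing and is removed
        char_class = "[" + "".join(re.escape(c) for c in chars) + "]"
        parts.append(char_class + quantifier)
    return re.compile("".join(parts))


# Hanunoo has no VOWEL_PRE characters, so that class disappears.
HANUNOO = {
    "BASE": "".join(chr(cp) for cp in range(0x1720, 0x1732)) + "\u25CC",
    "VOWEL_ABOVE": "\u1732",
    "VOWEL_BELOW": "\u1733",
    "VOWEL_POST": "\u1734",
}
HANUNOO_CLUSTER = build_pattern(HANUNOO)
```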

Caveats

The precise cluster structures used by implementations of the USE can differ somewhat. While the USE’s generic Brahmic cluster model works reasonably well for most of the scripts discussed here, some scripts require adjustments, which may have been made differently (or not at all) in different implementations.

As the USE serves many different scripts, the encoding order for any particular script will allow many character sequences that don’t make sense for that particular script. That’s OK – an encoding order is no substitute for a spelling checker.

The USE occasionally places characters into classes where they linguistically or graphically don’t belong. Visible viramas are one example: the USE classifies them as dependent vowels because they occur instead of such vowels in the encoding order. Other cases are often motivated by the need to enable a character sequence that occurs in real life but wouldn’t be allowed by a strict interpretation of the USE cluster model and the underlying Unicode character data. One such example occurs in Chakma, where above-base and below-base vowels were swapped in order to match the canonical decomposition of two vowels.

Some of the tables in this document include characters that are used in the particular script, but belong to a different one, and classify them as the USE would do. However, before text reaches a shaping engine, OpenType systems will break it into script runs. How they do that is one of the undocumented mysteries of OpenType, but the result may be that such adopted characters can't form clusters together with characters from the script in whose table they appear.

The characters U+200D ZERO WIDTH JOINER and U+200C ZERO WIDTH NON-JOINER are used in numerous scripts, sometimes for specific purposes described in the Unicode Standard, sometimes for their generic purposes of requesting or breaking up ligatures (including conjunct forms). The USE allows them anywhere in a cluster, so they’re not documented in the cluster structure. See the USE sections Zero-width joiner and Zero-width non-joiner for details.
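Because the tables omit the two joiners, a checker built directly from the tables would reject text that contains them. One possible workaround, shown here as a sketch with a hypothetical helper name, is to remove them before matching. This discards whatever effect the joiners have and does not model the detailed rules in the USE documentation.

```python
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"   # ZERO WIDTH JOINER


def strip_joiners(text: str) -> str:
    """Remove ZWJ and ZWNJ so the rest can be matched against a table."""
    return text.replace(ZWNJ, "").replace(ZWJ, "")
```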

How to report issues

You might find some problem in the encoding orders documented in this article. As it combines information from multiple sources, issues should be reported to the source where they can be fixed.

Acknowledgments

I’d like to thank:

Andrew Glass for specifying the Universal Shaping Engine, without which this article would not have been possible.

Roozbeh Pournader for collecting the script exemplars data, and for quickly addressing the issues I found.

Many contributors for writing script proposals, and the few who turn these proposals into the Unicode Standard, including the character properties underlying the USE.

Many type designers and font engineers for creating, and Google for sponsoring, the Noto font family, which makes it possible to show actual characters in this article.

Marc Durdin, Muthu Nedumaran, and Simon Cozens for providing feedback on this article.

References

Norbert Lindenberg: Issues in Khmer syllable validation. Lindenberg Software LLC, 2019.

Norbert Lindenberg: Issues in Devanagari cluster validation. Lindenberg Software LLC, 2020.

Norbert Lindenberg: Implementing Javanese. The Unicode Consortium, 2022.

Microsoft Corporation: Creating and supporting OpenType fonts for the Universal Shaping Engine. Microsoft Corporation, dated 2022-10-01.

Microsoft Corporation: Microsoft Font-tools. GitHub, as of 2022-09-21. Includes data tables for the Universal Shaping Engine.

Roozbeh Pournader: unicode-data. GitHub, as of 2023-10-24. Includes script exemplar data.

The Unicode Consortium: The Unicode Standard, Version 15.1. The Unicode Consortium, 2023.

𑽇𑽎𑽇𑽇𑽎𑽇𑽇𑽎𑽇𑽇𑽎𑽇𑽇𑽎𑽇

Ahom

Classes | Characters | Encoding | Count
BASE, BASE_OTHER | ◌ 𑜀 𑜁 𑜂 𑜃 𑜄 𑜅 𑜆 𑜇 𑜈 𑜉 𑜊 𑜋 𑜌 𑜍 𑜎 𑜏 𑜐 𑜑 𑜒 𑜓 𑜔 𑜕 𑜖 𑜗 𑜘 𑜙 𑜚 𑜰 𑜱 𑜲 𑜳 𑜴 𑜵 𑜶 𑜷 𑜸 𑜹 𑜺 𑜻 𑝀 𑝁 𑝂 𑝃 𑝄 𑝅 𑝆   | [U+25CC, U+11700..U+1171A, U+11730..U+1173B, U+11740..U+11746, U+00A0] | 1
CONS_MED_PRE | ◌𑜞 | U+1171E | 0 or 1
CONS_MED_ABOVE | ◌𑜟 | U+1171F | 0 or 1
CONS_MED_BELOW | ◌𑜝 | U+1171D | 0 or 1
VOWEL_PRE | ◌𑜦 | U+11726 | 0 or more
VOWEL_ABOVE | ◌𑜢 ◌𑜣 ◌𑜧 ◌𑜩 ◌𑜪 ◌𑜫 | [U+11722..U+11723, U+11727, U+11729..U+1172B] | 0 or more
VOWEL_BELOW | ◌𑜤 ◌𑜥 ◌𑜨 | [U+11724..U+11725, U+11728] | 0 or more
VOWEL_POST | ◌𑜠 ◌𑜡 | [U+11720..U+11721] | 0 or more

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
𑜼 𑜽 𑜾 𑜿 | [U+0020, U+1173C..U+1173F]

Known bugs:

Balinese

For the Balinese script, the USE supports two cluster structures: a regular one for the normal orthographic syllables used for writing normal languages, and a special one for the combinations of musical symbols and combining marks used for writing musical scores. Several notations using these symbols are described in Unicode Technical Note 51, “Musical symbols and Sasak characters in the Balinese script”.

The USE decomposes the following multi-part vowels, which therefore don’t show up in the encoding order:

Composed character | Composed encoding | Decomposed characters | Decomposed encoding
◌ᬻ | U+1B3B | ◌ᬺ ◌ᬵ | <U+1B3A, U+1B35>
◌ᬽ | U+1B3D | ◌ᬼ ◌ᬵ | <U+1B3C, U+1B35>
◌ᭀ | U+1B40 | ◌ᬾ ◌ᬵ | <U+1B3E, U+1B35>
◌ᭁ | U+1B41 | ◌ᬿ ◌ᬵ | <U+1B3F, U+1B35>
◌ᭃ | U+1B43 | ◌ᭂ ◌ᬵ | <U+1B42, U+1B35>
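Software that validates Balinese text against the encoding order below could apply the same decompositions first. The following sketch hard-codes the five table entries; the names BALINESE_DECOMPOSITIONS and decompose are illustrative and not taken from the USE.

```python
# The five decompositions listed in the Balinese table of this article.
BALINESE_DECOMPOSITIONS = {
    "\u1B3B": "\u1B3A\u1B35",
    "\u1B3D": "\u1B3C\u1B35",
    "\u1B40": "\u1B3E\u1B35",
    "\u1B41": "\u1B3F\u1B35",
    "\u1B43": "\u1B42\u1B35",
}
_TRANSLATION = str.maketrans(BALINESE_DECOMPOSITIONS)


def decompose(text: str) -> str:
    """Replace each multi-part vowel with its two-character decomposition."""
    return text.translate(_TRANSLATION)
```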

Encoding order:

Classes | Characters | Encoding | Count
BASE | ᬅ ᬆ ᬇ ᬈ ᬉ ᬊ ᬋ ᬌ ᬍ ᬎ ᬏ ᬐ ᬑ ᬒ ᬓ ᬔ ᬕ ᬖ ᬗ ᬘ ᬙ ᬚ ᬛ ᬜ ᬝ ᬞ ᬟ ᬠ ᬡ ᬢ ᬣ ᬤ ᬥ ᬦ ᬧ ᬨ ᬩ ᬪ ᬫ ᬬ ᬭ ᬮ ᬯ ᬰ ᬱ ᬲ ᬳ ᭅ ᭆ ᭇ ᭈ ᭉ ᭊ ᭋ ᭌ ᭐ ᭑ ᭒ ᭓ ᭔ ᭕ ᭖ ᭗ ᭘ ᭙ ◌ | [U+1B05..U+1B33, U+1B45..U+1B4C, U+1B50..U+1B59, U+25CC] | 1
CONS_MOD_ABOVE | ◌᬴ | U+1B34 | 0 or more
Repeating group | | | 0 or more
  HALANT | ◌᭄ | U+1B44 | 1
  BASE | ᬅ ᬆ ᬇ ᬈ ᬉ ᬊ ᬋ ᬌ ᬍ ᬎ ᬏ ᬐ ᬑ ᬒ ᬓ ᬔ ᬕ ᬖ ᬗ ᬘ ᬙ ᬚ ᬛ ᬜ ᬝ ᬞ ᬟ ᬠ ᬡ ᬢ ᬣ ᬤ ᬥ ᬦ ᬧ ᬨ ᬩ ᬪ ᬫ ᬬ ᬭ ᬮ ᬯ ᬰ ᬱ ᬲ ᬳ ᭅ ᭆ ᭇ ᭈ ᭉ ᭊ ᭋ ᭌ ᭐ ᭑ ᭒ ᭓ ᭔ ᭕ ᭖ ᭗ ᭘ ᭙ ◌ | [U+1B05..U+1B33, U+1B45..U+1B4C, U+1B50..U+1B59, U+25CC] | 1
  CONS_MOD_ABOVE | ◌᬴ | U+1B34 | 0 or more
Alternative 1 | | |
  HALANT | ◌᭄ | U+1B44 | 1
Alternative 2 | | |
  VOWEL_PRE | ◌ᬾ ◌ᬿ | [U+1B3E..U+1B3F] | 0 or more
  VOWEL_ABOVE | ◌ᬶ ◌ᬷ ◌ᬼ ◌ᭂ | [U+1B36..U+1B37, U+1B3C, U+1B42] | 0 or more
  VOWEL_BELOW | ◌ᬸ ◌ᬹ ◌ᬺ | [U+1B38..U+1B3A] | 0 or more
  VOWEL_POST | ◌ᬵ | U+1B35 | 0 or more
  VOWEL_MOD_ABOVE | ◌ᬀ ◌ᬁ ◌ᬂ | [U+1B00..U+1B02] | 0 or more
  VOWEL_MOD_POST | ◌ᬄ | U+1B04 | 0 or more
  CONS_FINAL_ABOVE | ◌ᬃ | U+1B03 | 0 or more

Special cluster structure for symbols:

Classes | Characters | Encoding | Count
SYM | ᭡ ᭢ ᭣ ᭤ ᭥ ᭦ ᭧ ᭨ ᭩ ᭪ ᭴ ᭵ ᭶ ᭷ ᭸ ᭹ ᭺ ᭻ ᭼ | [U+1B61..U+1B6A, U+1B74..U+1B7C] | 1
SYM_MOD_ABOVE | ◌᭫ ◌᭭ ◌᭮ ◌᭯ ◌᭰ ◌᭱ ◌᭲ ◌᭳ | [U+1B6B, U+1B6D..U+1B73] | 0 or more
SYM_MOD_BELOW | ◌᭬ | U+1B6C | 0 or more

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
᭚ ᭛ ᭜ ᭝ ᭞ ᭟ ᭠ ᭽ ᭾ | [U+1B5A..U+1B60, U+1B7D..U+1B7E]

Known bugs:

Batak

The Batak script has a feature that the USE does not support well: In a character sequence consonant-vowel-consonant-virama, the vowel and second consonant are displayed in reverse order. For example, the syllable stored as ᯖ ta, ◌ᯪ i, ᯇ pa, ◌᯲ virama has to be displayed as ᯖᯪᯇ᯲ tip. For the USE, however, this sequence consists of two clusters. For a way to implement this reordering in fonts, see Constructing fonts for the Batak script. Implementers of keyboards and other text producers should ensure that a Batak cluster in storage never contains both a vowel and a virama. If a user types both after the same consonant, the vowel has to be reordered before the consonant in storage.
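A keyboard could implement the storage rule roughly as in the sketch below. It assumes that consonants are U+1BC0..U+1BE5, vowels are U+1BE7..U+1BEF, and viramas are U+1BF2..U+1BF3, as in the table that follows, and that at most one vowel is involved. The function name is hypothetical, and cases such as an intervening U+1BE6 or a missing preceding consonant are not handled.

```python
import re

CONSONANT = "[\u1BC0-\u1BE5]"
VOWEL = "[\u1BE7-\u1BEF]"
VIRAMA = "[\u1BF2\u1BF3]"

# Typed order consonant-vowel-virama or consonant-virama-vowel:
# both put a vowel and a virama on the same consonant.
_TYPED_CVH = re.compile(f"({CONSONANT})({VOWEL})({VIRAMA})")
_TYPED_CHV = re.compile(f"({CONSONANT})({VIRAMA})({VOWEL})")


def to_storage_order(typed: str) -> str:
    """Move the vowel before the consonant that also carries a virama."""
    text = _TYPED_CVH.sub(r"\2\1\3", typed)
    return _TYPED_CHV.sub(r"\3\1\2", text)
```

With this sketch, typing ta, pa, i, virama yields the stored sequence ta, i, pa, virama from the example above.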

Classes | Characters | Encoding | Count
BASE | ᯀ ᯁ ᯂ ᯃ ᯄ ᯅ ᯆ ᯇ ᯈ ᯉ ᯊ ᯋ ᯌ ᯍ ᯎ ᯏ ᯐ ᯑ ᯒ ᯓ ᯔ ᯕ ᯖ ᯗ ᯘ ᯙ ᯚ ᯛ ᯜ ᯝ ᯞ ᯟ ᯠ ᯡ ᯢ ᯣ ᯤ ᯥ ◌ | [U+1BC0..U+1BE5, U+25CC] | 1
CONS_MOD_ABOVE | ◌᯦ | U+1BE6 | 0 or more
CONS_MOD_BELOW | ◌᯲ ◌᯳ | [U+1BF2..U+1BF3] | 0 or more
VOWEL_ABOVE | ◌ᯨ ◌ᯩ ◌ᯭ ◌ᯯ | [U+1BE8..U+1BE9, U+1BED, U+1BEF] | 0 or more
VOWEL_POST | ◌ᯧ ◌ᯪ ◌ᯫ ◌ᯬ ◌ᯮ | [U+1BE7, U+1BEA..U+1BEC, U+1BEE] | 0 or more
CONS_FINAL_ABOVE | ◌ᯰ ◌ᯱ | [U+1BF0..U+1BF1] | 0 or more

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
᯼ ᯽ ᯾ ᯿ | [U+1BFC..U+1BFF]

Bhaiksuki

Classes | Characters | Encoding | Count
BASE | ◌ 𑰀 𑰁 𑰂 𑰃 𑰄 𑰅 𑰆 𑰇 𑰈 𑰊 𑰋 𑰌 𑰍 𑰎 𑰏 𑰐 𑰑 𑰒 𑰓 𑰔 𑰕 𑰖 𑰗 𑰘 𑰙 𑰚 𑰛 𑰜 𑰝 𑰞 𑰟 𑰠 𑰡 𑰢 𑰣 𑰤 𑰥 𑰦 𑰧 𑰨 𑰩 𑰪 𑰫 𑰬 𑰭 𑰮 𑱀 𑱐 𑱑 𑱒 𑱓 𑱔 𑱕 𑱖 𑱗 𑱘 𑱙 𑱚 𑱛 𑱜 𑱝 𑱞 𑱟 𑱠 𑱡 𑱢 𑱣 𑱤 𑱥 𑱦 𑱧 𑱨 𑱩 𑱪 𑱫 𑱬 | [U+25CC, U+11C00..U+11C08, U+11C0A..U+11C2E, U+11C40, U+11C50..U+11C6C] | 1
Repeating group | | | 0 or more
  HALANT | ◌𑰿 | U+11C3F | 1
  BASE | ◌ 𑰀 𑰁 𑰂 𑰃 𑰄 𑰅 𑰆 𑰇 𑰈 𑰊 𑰋 𑰌 𑰍 𑰎 𑰏 𑰐 𑰑 𑰒 𑰓 𑰔 𑰕 𑰖 𑰗 𑰘 𑰙 𑰚 𑰛 𑰜 𑰝 𑰞 𑰟 𑰠 𑰡 𑰢 𑰣 𑰤 𑰥 𑰦 𑰧 𑰨 𑰩 𑰪 𑰫 𑰬 𑰭 𑰮 𑱀 𑱐 𑱑 𑱒 𑱓 𑱔 𑱕 𑱖 𑱗 𑱘 𑱙 𑱚 𑱛 𑱜 𑱝 𑱞 𑱟 𑱠 𑱡 𑱢 𑱣 𑱤 𑱥 𑱦 𑱧 𑱨 𑱩 𑱪 𑱫 𑱬 | [U+25CC, U+11C00..U+11C08, U+11C0A..U+11C2E, U+11C40, U+11C50..U+11C6C] | 1
Alternative 1 | | |
  HALANT | ◌𑰿 | U+11C3F | 1
Alternative 2 | | |
  VOWEL_ABOVE | ◌𑰰 ◌𑰱 ◌𑰸 ◌𑰹 ◌𑰺 ◌𑰻 | [U+11C30..U+11C31, U+11C38..U+11C3B] | 0 or more
  VOWEL_BELOW | ◌𑰲 ◌𑰳 ◌𑰴 ◌𑰵 ◌𑰶 | [U+11C32..U+11C36] | 0 or more
  VOWEL_POST | ◌𑰯 | U+11C2F | 0 or more
  VOWEL_MOD_ABOVE | ◌𑰼 ◌𑰽 | [U+11C3C..U+11C3D] | 0 or more
  VOWEL_MOD_POST | ◌𑰾 | U+11C3E | 0 or more

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
𑱁 𑱂 𑱃 𑱄 𑱅 | [U+11C41..U+11C45]

Brahmi

For the Brahmi script, the USE supports two cluster structures: a regular one for the normal orthographic syllables used for writing normal languages, and a special one for an old additive-multiplicative notation for numbers. This notation is described in section 14.1 “Brahmi” of The Unicode Standard.

Classes | Characters | Encoding | Count
CONS_WITH_STACKER | 𑀃 𑀄 | [U+11003..U+11004] | 0 or 1
BASE | ◌ 𑀅 𑀆 𑀇 𑀈 𑀉 𑀊 𑀋 𑀌 𑀍 𑀎 𑀏 𑀐 𑀑 𑀒 𑀓 𑀔 𑀕 𑀖 𑀗 𑀘 𑀙 𑀚 𑀛 𑀜 𑀝 𑀞 𑀟 𑀠 𑀡 𑀢 𑀣 𑀤 𑀥 𑀦 𑀧 𑀨 𑀩 𑀪 𑀫 𑀬 𑀭 𑀮 𑀯 𑀰 𑀱 𑀲 𑀳 𑀴 𑀵 𑀶 𑀷 𑁦 𑁧 𑁨 𑁩 𑁪 𑁫 𑁬 𑁭 𑁮 𑁯 𑁱 𑁲 𑁵 | [U+25CC, U+11005..U+11037, U+11066..U+1106F, U+11071..U+11072, U+11075] | 1
Repeating group | | | 0 or more
  HALANT | ◌𑁆 | U+11046 | 1
  BASE | ◌ 𑀅 𑀆 𑀇 𑀈 𑀉 𑀊 𑀋 𑀌 𑀍 𑀎 𑀏 𑀐 𑀑 𑀒 𑀓 𑀔 𑀕 𑀖 𑀗 𑀘 𑀙 𑀚 𑀛 𑀜 𑀝 𑀞 𑀟 𑀠 𑀡 𑀢 𑀣 𑀤 𑀥 𑀦 𑀧 𑀨 𑀩 𑀪 𑀫 𑀬 𑀭 𑀮 𑀯 𑀰 𑀱 𑀲 𑀳 𑀴 𑀵 𑀶 𑀷 𑁦 𑁧 𑁨 𑁩 𑁪 𑁫 𑁬 𑁭 𑁮 𑁯 𑁱 𑁲 𑁵 | [U+25CC, U+11005..U+11037, U+11066..U+1106F, U+11071..U+11072, U+11075] | 1
Alternative 1 | | |
  HALANT | ◌𑁆 | U+11046 | 1
Alternative 2 | | |
  VOWEL_ABOVE | ◌𑀸 ◌𑀹 ◌𑀺 ◌𑀻 ◌𑁂 ◌𑁃 ◌𑁄 ◌𑁅 ◌𑁰 ◌𑁳 ◌𑁴 | [U+11038..U+1103B, U+11042..U+11045, U+11070, U+11073..U+11074] | 0 or more
  VOWEL_BELOW | ◌𑀼 ◌𑀽 ◌𑀾 ◌𑀿 ◌𑁀 ◌𑁁 | [U+1103C..U+11041] | 0 or more
  VOWEL_MOD_ABOVE | ◌𑀁 | U+11001 | 0 or more
  VOWEL_MOD_POST | ◌𑀀 ◌𑀂 | [U+11000, U+11002] | 0 or more

The USE does not allow the following character sequences because there are equivalent individual characters:

Character sequence | Encoding
𑀅 ◌𑀸 | <U+11005, U+11038>
𑀋 ◌𑀾 | <U+1100B, U+1103E>
𑀏 ◌𑁂 | <U+1100F, U+11042>

Special cluster structure for numerals:

Classes | Characters | Encoding | Count
BASE_NUM | 𑁒 𑁓 𑁔 𑁕 𑁖 𑁗 𑁘 𑁙 𑁚 𑁛 𑁜 𑁝 𑁞 𑁟 𑁠 𑁡 𑁢 𑁣 𑁤 𑁥 | [U+11052..U+11065] | 1
Repeating group | | | 0 or more
  HALANT_NUM | ◌𑁿 | U+1107F | 1
  BASE_NUM | 𑁒 𑁓 𑁔 𑁕 𑁖 𑁗 𑁘 𑁙 𑁚 𑁛 𑁜 𑁝 𑁞 𑁟 𑁠 𑁡 𑁢 𑁣 𑁤 𑁥 | [U+11052..U+11065] | 1
HALANT_NUM | ◌𑁿 | U+1107F | 0 or 1
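Read as a regular expression, the numeral table above says: one BASE_NUM, then any number of HALANT_NUM plus BASE_NUM pairs, then an optional trailing HALANT_NUM. The following sketch spells this out; the pattern name is an assumption for this illustration, and the pattern reflects only the table, not any particular implementation.

```python
import re

BASE_NUM = "[\U00011052-\U00011065]"
HALANT_NUM = "\U0001107F"  # BRAHMI NUMBER JOINER

# BASE_NUM (HALANT_NUM BASE_NUM)* HALANT_NUM?
BRAHMI_NUMERAL_CLUSTER = re.compile(
    f"{BASE_NUM}(?:{HALANT_NUM}{BASE_NUM})*{HALANT_NUM}?"
)
```

Two numerals without a joiner between them do not match as one cluster; they form two separate clusters.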

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
𑁇 𑁈 𑁉 𑁊 𑁋 𑁌 𑁍 | [U+11047..U+1104D]

Buginese (Lontara’)

Classes | Characters | Encoding | Count
BASE, BASE_OTHER | ᨀ ᨁ ᨂ ᨃ ᨄ ᨅ ᨆ ᨇ ᨈ ᨉ ᨊ ᨋ ᨌ ᨍ ᨎ ᨏ ᨐ ᨑ ᨒ ᨓ ᨔ ᨕ ᨖ ◌   | [U+1A00..U+1A16, U+25CC, U+00A0] | 1
VOWEL_PRE | ◌ᨙ | U+1A19 | 0 or more
VOWEL_ABOVE | ◌ᨗ ◌ᨘ ◌ᨛ | [U+1A17..U+1A18, U+1A1B] | 0 or more
VOWEL_POST | ◌ᨚ | U+1A1A | 0 or more

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
᨞ ᨟ ꧏ | [U+0020, U+1A1E..U+1A1F, U+A9CF]

Buhid

Classes | Characters | Encoding | Count
BASE | ᝀ ᝁ ᝂ ᝃ ᝄ ᝅ ᝆ ᝇ ᝈ ᝉ ᝊ ᝋ ᝌ ᝍ ᝎ ᝏ ᝐ ᝑ ◌ | [U+1740..U+1751, U+25CC] | 1
VOWEL_ABOVE | ◌ᝒ | U+1752 | 0 or more
VOWEL_BELOW | ◌ᝓ | U+1753 | 0 or more

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
᜵ ᜶ | [U+1735..U+1736]

Chakma

The USE decomposes the following multi-part vowels, which therefore don’t show up in the encoding order:

Composed character | Composed encoding | Decomposed characters | Decomposed encoding
◌𑄮 | U+1112E | ◌𑄱 ◌𑄧 | <U+11131, U+11127>
◌𑄯 | U+1112F | ◌𑄲 ◌𑄧 | <U+11132, U+11127>

Encoding order:

Classes | Characters | Encoding | Count
BASE | ০ ১ ২ ৩ ৪ ৫ ৬ ৭ ৮ ৯ ၀ ၁ ၂ ၃ ၄ ၅ ၆ ၇ ၈ ၉ ◌ 𑄃 𑄄 𑄅 𑄆 𑄇 𑄈 𑄉 𑄊 𑄋 𑄌 𑄍 𑄎 𑄏 𑄐 𑄑 𑄒 𑄓 𑄔 𑄕 𑄖 𑄗 𑄘 𑄙 𑄚 𑄛 𑄜 𑄝 𑄞 𑄟 𑄠 𑄡 𑄢 𑄣 𑄤 𑄥 𑄦 𑄶 𑄷 𑄸 𑄹 𑄺 𑄻 𑄼 𑄽 𑄾 𑄿 𑅄 𑅇 | [U+09E6..U+09EF, U+1040..U+1049, U+25CC, U+11103..U+11126, U+11136..U+1113F, U+11144, U+11147] | 1
CONS_MOD_ABOVE | ◌𑄴 | U+11134 | 0 or more
Repeating group | | | 0 or more
  HALANT | ◌𑄳 | U+11133 | 1
  BASE | ০ ১ ২ ৩ ৪ ৫ ৬ ৭ ৮ ৯ ၀ ၁ ၂ ၃ ၄ ၅ ၆ ၇ ၈ ၉ ◌ 𑄃 𑄄 𑄅 𑄆 𑄇 𑄈 𑄉 𑄊 𑄋 𑄌 𑄍 𑄎 𑄏 𑄐 𑄑 𑄒 𑄓 𑄔 𑄕 𑄖 𑄗 𑄘 𑄙 𑄚 𑄛 𑄜 𑄝 𑄞 𑄟 𑄠 𑄡 𑄢 𑄣 𑄤 𑄥 𑄦 𑄶 𑄷 𑄸 𑄹 𑄺 𑄻 𑄼 𑄽 𑄾 𑄿 𑅄 𑅇 | [U+09E6..U+09EF, U+1040..U+1049, U+25CC, U+11103..U+11126, U+11136..U+1113F, U+11144, U+11147] | 1
  CONS_MOD_ABOVE | ◌𑄴 | U+11134 | 0 or more
Alternative 1 | | |
  HALANT | ◌𑄳 | U+11133 | 1
Alternative 2 | | |
  VOWEL_PRE | ◌𑄬 | U+1112C | 0 or more
  VOWEL_ABOVE | ◌𑄪 ◌𑄫 ◌𑄱 ◌𑄲 | [U+1112A..U+1112B, U+11131..U+11132] | 0 or more
  VOWEL_BELOW | ◌𑄧 ◌𑄨 ◌𑄩 ◌𑄭 ◌𑄰 | [U+11127..U+11129, U+1112D, U+11130] | 0 or more
  VOWEL_POST | ◌𑅅 ◌𑅆 | [U+11145..U+11146] | 0 or more
  VOWEL_MOD_ABOVE | ◌𑄀 ◌𑄁 ◌𑄂 | [U+11100..U+11102] | 0 or more

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
𑅀 𑅁 𑅂 𑅃 | [U+11140..U+11143]

(Eastern) Cham

Classes | Characters | Encoding | Count
BASE, BASE_OTHER | 0 1 2 3 4 5 6 7 8 9 ◌ ꨀ ꨁ ꨂ ꨃ ꨄ ꨅ ꨆ ꨇ ꨈ ꨉ ꨊ ꨋ ꨌ ꨍ ꨎ ꨏ ꨐ ꨑ ꨒ ꨓ ꨔ ꨕ ꨖ ꨗ ꨘ ꨙ ꨚ ꨛ ꨜ ꨝ ꨞ ꨟ ꨠ ꨡ ꨢ ꨣ ꨤ ꨥ ꨦ ꨧ ꨨ ꩀ ꩁ ꩂ ꩄ ꩅ ꩆ ꩇ ꩈ ꩉ ꩊ ꩋ ꩐ ꩑ ꩒ ꩓ ꩔ ꩕ ꩖ ꩗ ꩘ ꩙ -   ‐ ‑ | [U+0030..U+0039, U+25CC, U+AA00..U+AA28, U+AA40..U+AA42, U+AA44..U+AA4B, U+AA50..U+AA59, U+002D, U+00A0, U+2010..U+2011] | 1
CONS_MED_PRE | ◌ꨴ | U+AA34 | 0 or 1
CONS_MED_ABOVE | ◌ꨵ | U+AA35 | 0 or 1
CONS_MED_BELOW | ◌ꨶ | U+AA36 | 0 or 1
CONS_MED_POST | ◌ꨳ | U+AA33 | 0 or 1
VOWEL_PRE | ◌ꨯ ◌ꨰ | [U+AA2F..U+AA30] | 0 or more
VOWEL_ABOVE | ◌ꨪ ◌ꨫ ◌ꨬ ◌ꨮ ◌ꨱ | [U+AA2A..U+AA2C, U+AA2E, U+AA31] | 0 or more
VOWEL_BELOW | ◌ꨭ ◌ꨲ | [U+AA2D, U+AA32] | 0 or more
VOWEL_MOD_ABOVE | ◌ꨩ | U+AA29 | 0 or more
CONS_FINAL_ABOVE | ◌ꩃ ◌ꩌ | [U+AA43, U+AA4C] | 0 or more
CONS_FINAL_POST | ◌ꩍ | U+AA4D | 0 or more

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
: ? ꩜ ꩝ ꩞ ꩟ | [U+0020, U+003A, U+003F, U+AA5C..U+AA5F]

Dives Akuru

The USE decomposes the following multi-part vowel, which therefore doesn’t show up in the encoding order:

Composed character | Composed encoding | Decomposed characters | Decomposed encoding
◌𑤸 | U+11938 | ◌𑤵 ◌𑤰 | <U+11935, U+11930>

Encoding order:

Classes | Characters | Encoding | Count
REPHA | 𑤿◌ 𑥁◌ | [U+1193F, U+11941] | 0 or 1
BASE | ◌ 𑤀 𑤁 𑤂 𑤃 𑤄 𑤅 𑤆 𑤉 𑤌 𑤍 𑤎 𑤏 𑤐 𑤑 𑤒 𑤓 𑤕 𑤖 𑤘 𑤙 𑤚 𑤛 𑤜 𑤝 𑤞 𑤟 𑤠 𑤡 𑤢 𑤣 𑤤 𑤥 𑤦 𑤧 𑤨 𑤩 𑤪 𑤫 𑤬 𑤭 𑤮 𑤯 𑥐 𑥑 𑥒 𑥓 𑥔 𑥕 𑥖 𑥗 𑥘 𑥙 | [U+25CC, U+11900..U+11906, U+11909, U+1190C..U+11913, U+11915..U+11916, U+11918..U+1192F, U+11950..U+11959] | 1
CONS_MOD_BELOW | ◌𑥃 | U+11943 | 0 or more
Repeating group | | | 0 or more
  HALANT | ◌𑤾 | U+1193E | 1
  BASE | ◌ 𑤀 𑤁 𑤂 𑤃 𑤄 𑤅 𑤆 𑤉 𑤌 𑤍 𑤎 𑤏 𑤐 𑤑 𑤒 𑤓 𑤕 𑤖 𑤘 𑤙 𑤚 𑤛 𑤜 𑤝 𑤞 𑤟 𑤠 𑤡 𑤢 𑤣 𑤤 𑤥 𑤦 𑤧 𑤨 𑤩 𑤪 𑤫 𑤬 𑤭 𑤮 𑤯 𑥐 𑥑 𑥒 𑥓 𑥔 𑥕 𑥖 𑥗 𑥘 𑥙 | [U+25CC, U+11900..U+11906, U+11909, U+1190C..U+11913, U+11915..U+11916, U+11918..U+1192F, U+11950..U+11959] | 1
  CONS_MOD_BELOW | ◌𑥃 | U+11943 | 0 or more
Alternative 1 | | |
  HALANT | ◌𑤾 | U+1193E | 1
Alternative 2 | | |
  CONS_MED_POST | ◌𑥀 ◌𑥂 | [U+11940, U+11942] | 0 or 1
  VOWEL_PRE | ◌𑤵 ◌𑤷 | [U+11935, U+11937] | 0 or more
  VOWEL_POST | ◌𑤰 ◌𑤱 ◌𑤲 ◌𑤳 ◌𑤴 ◌𑤽 | [U+11930..U+11934, U+1193D] | 0 or more
  VOWEL_MOD_ABOVE | ◌𑤻 ◌𑤼 | [U+1193B..U+1193C] | 0 or more

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
𑥄 𑥅 𑥆 | [U+11944..U+11946]

Known bugs:

Dogra

Classes | Characters | Encoding | Count
BASE | ० १ २ ३ ४ ५ ६ ७ ८ ९ ◌ 𑠀 𑠁 𑠂 𑠃 𑠄 𑠅 𑠆 𑠇 𑠈 𑠉 𑠊 𑠋 𑠌 𑠍 𑠎 𑠏 𑠐 𑠑 𑠒 𑠓 𑠔 𑠕 𑠖 𑠗 𑠘 𑠙 𑠚 𑠛 𑠜 𑠝 𑠞 𑠟 𑠠 𑠡 𑠢 𑠣 𑠤 𑠥 𑠦 𑠧 𑠨 𑠩 𑠪 𑠫 | [U+0966..U+096F, U+25CC, U+11800..U+1182B] | 1
CONS_MOD_BELOW | ◌𑠺 | U+1183A | 0 or more
Repeating group | | | 0 or more
  HALANT | ◌𑠹 | U+11839 | 1
  BASE | ० १ २ ३ ४ ५ ६ ७ ८ ९ ◌ 𑠀 𑠁 𑠂 𑠃 𑠄 𑠅 𑠆 𑠇 𑠈 𑠉 𑠊 𑠋 𑠌 𑠍 𑠎 𑠏 𑠐 𑠑 𑠒 𑠓 𑠔 𑠕 𑠖 𑠗 𑠘 𑠙 𑠚 𑠛 𑠜 𑠝 𑠞 𑠟 𑠠 𑠡 𑠢 𑠣 𑠤 𑠥 𑠦 𑠧 𑠨 𑠩 𑠪 𑠫 | [U+0966..U+096F, U+25CC, U+11800..U+1182B] | 1
  CONS_MOD_BELOW | ◌𑠺 | U+1183A | 0 or more
Alternative 1 | | |
  HALANT | ◌𑠹 | U+11839 | 1
Alternative 2 | | |
  VOWEL_PRE | ◌𑠭 | U+1182D | 0 or more
  VOWEL_ABOVE | ◌𑠳 ◌𑠴 ◌𑠵 ◌𑠶 | [U+11833..U+11836] | 0 or more
  VOWEL_BELOW | ◌𑠯 ◌𑠰 ◌𑠱 ◌𑠲 | [U+1182F..U+11832] | 0 or more
  VOWEL_POST | ◌𑠬 ◌𑠮 | [U+1182C, U+1182E] | 0 or more
  VOWEL_MOD_ABOVE | ◌𑠷 | U+11837 | 0 or more
  VOWEL_MOD_POST | ◌𑠸 | U+11838 | 0 or more

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
। ॥ ꠰ ꠱ ꠲ ꠳ ꠴ ꠵ ꠶ ꠷ ꠸ ꠹ 𑠻 | [U+0964..U+0965, U+A830..U+A839, U+1183B]

Grantha

The USE decomposes the following multi-part vowels, which therefore don’t show up in the encoding order:

Composed character | Composed encoding | Decomposed characters | Decomposed encoding
◌𑍋 | U+1134B | ◌𑍇 ◌𑌾 | <U+11347, U+1133E>
◌𑍌 | U+1134C | ◌𑍇 ◌𑍗 | <U+11347, U+11357>

Encoding order:

Classes | Characters | Encoding | Count
BASE | ௦ ௧ ௨ ௩ ௪ ௫ ௬ ௭ ௮ ௯ ◌ 𑌅 𑌆 𑌇 𑌈 𑌉 𑌊 𑌋 𑌌 𑌏 𑌐 𑌓 𑌔 𑌕 𑌖 𑌗 𑌘 𑌙 𑌚 𑌛 𑌜 𑌝 𑌞 𑌟 𑌠 𑌡 𑌢 𑌣 𑌤 𑌥 𑌦 𑌧 𑌨 𑌪 𑌫 𑌬 𑌭 𑌮 𑌯 𑌰 𑌲 𑌳 𑌵 𑌶 𑌷 𑌸 𑌹 𑌽 𑍞 𑍟 𑍠 𑍡 | [U+0BE6..U+0BEF, U+25CC, U+11305..U+1130C, U+1130F..U+11310, U+11313..U+11328, U+1132A..U+11330, U+11332..U+11333, U+11335..U+11339, U+1133D, U+1135E..U+11361] | 1
CONS_MOD_BELOW | ◌𑌻 ◌𑌼 | [U+1133B..U+1133C] | 0 or more
Repeating group | | | 0 or more
  HALANT | ◌𑍍 | U+1134D | 1
  BASE | ௦ ௧ ௨ ௩ ௪ ௫ ௬ ௭ ௮ ௯ ◌ 𑌅 𑌆 𑌇 𑌈 𑌉 𑌊 𑌋 𑌌 𑌏 𑌐 𑌓 𑌔 𑌕 𑌖 𑌗 𑌘 𑌙 𑌚 𑌛 𑌜 𑌝 𑌞 𑌟 𑌠 𑌡 𑌢 𑌣 𑌤 𑌥 𑌦 𑌧 𑌨 𑌪 𑌫 𑌬 𑌭 𑌮 𑌯 𑌰 𑌲 𑌳 𑌵 𑌶 𑌷 𑌸 𑌹 𑌽 𑍞 𑍟 𑍠 𑍡 | [U+0BE6..U+0BEF, U+25CC, U+11305..U+1130C, U+1130F..U+11310, U+11313..U+11328, U+1132A..U+11330, U+11332..U+11333, U+11335..U+11339, U+1133D, U+1135E..U+11361] | 1
  CONS_MOD_BELOW | ◌𑌻 ◌𑌼 | [U+1133B..U+1133C] | 0 or more
Alternative 1 | | |
  HALANT | ◌𑍍 | U+1134D | 1
Alternative 2 | | |
  VOWEL_PRE | ◌𑍇 ◌𑍈 | [U+11347..U+11348] | 0 or more
  VOWEL_ABOVE | ◌𑍀 | U+11340 | 0 or more
  VOWEL_POST | ◌𑌾 ◌𑌿 ◌𑍁 ◌𑍂 ◌𑍃 ◌𑍄 ◌𑍗 ◌𑍢 ◌𑍣 | [U+1133E..U+1133F, U+11341..U+11344, U+11357, U+11362..U+11363] | 0 or more
  VOWEL_MOD_ABOVE | ◌॑ ◌᳐ ◌᳒ ◌᳴ ◌᳸ ◌᳹ ◌⃰ ◌𑌀 ◌𑌁 ◌𑍦 ◌𑍧 ◌𑍨 ◌𑍩 ◌𑍪 ◌𑍫 ◌𑍬 ◌𑍰 ◌𑍱 ◌𑍲 ◌𑍳 ◌𑍴 | [U+0951, U+1CD0, U+1CD2, U+1CF4, U+1CF8..U+1CF9, U+20F0, U+11300..U+11301, U+11366..U+1136C, U+11370..U+11374] | 0 or more
  VOWEL_MOD_BELOW | ◌॒ | U+0952 | 0 or more
  VOWEL_MOD_POST | ◌𑌂 ◌𑌃 | [U+11302..U+11303] | 0 or more

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
। ॥ ௰ ௱ ௲ ௳ ᳓ ᳲ ᳳ 𑍐 𑍝 𑿐 𑿑 𑿓 | [U+0964..U+0965, U+0BF0..U+0BF3, U+1CD3, U+1CF2..U+1CF3, U+11350, U+1135D, U+11FD0..U+11FD1, U+11FD3]

Gunjala Gondi

Classes | Characters | Encoding | Count
BASE | ◌ 𑵠 𑵡 𑵢 𑵣 𑵤 𑵥 𑵧 𑵨 𑵪 𑵫 𑵬 𑵭 𑵮 𑵯 𑵰 𑵱 𑵲 𑵳 𑵴 𑵵 𑵶 𑵷 𑵸 𑵹 𑵺 𑵻 𑵼 𑵽 𑵾 𑵿 𑶀 𑶁 𑶂 𑶃 𑶄 𑶅 𑶆 𑶇 𑶈 𑶉 𑶠 𑶡 𑶢 𑶣 𑶤 𑶥 𑶦 𑶧 𑶨 𑶩 | [U+25CC, U+11D60..U+11D65, U+11D67..U+11D68, U+11D6A..U+11D89, U+11DA0..U+11DA9] | 1
Repeating group | | | 0 or more
  HALANT | ◌𑶗 | U+11D97 | 1
  BASE | ◌ 𑵠 𑵡 𑵢 𑵣 𑵤 𑵥 𑵧 𑵨 𑵪 𑵫 𑵬 𑵭 𑵮 𑵯 𑵰 𑵱 𑵲 𑵳 𑵴 𑵵 𑵶 𑵷 𑵸 𑵹 𑵺 𑵻 𑵼 𑵽 𑵾 𑵿 𑶀 𑶁 𑶂 𑶃 𑶄 𑶅 𑶆 𑶇 𑶈 𑶉 𑶠 𑶡 𑶢 𑶣 𑶤 𑶥 𑶦 𑶧 𑶨 𑶩 | [U+25CC, U+11D60..U+11D65, U+11D67..U+11D68, U+11D6A..U+11D89, U+11DA0..U+11DA9] | 1
Alternative 1 | | |
  HALANT | ◌𑶗 | U+11D97 | 1
Alternative 2 | | |
  VOWEL_ABOVE | ◌𑶐 ◌𑶑 | [U+11D90..U+11D91] | 0 or more
  VOWEL_POST | ◌𑶊 ◌𑶋 ◌𑶌 ◌𑶍 ◌𑶎 ◌𑶓 ◌𑶔 | [U+11D8A..U+11D8E, U+11D93..U+11D94] | 0 or more
  VOWEL_MOD_ABOVE | ◌𑶕 | U+11D95 | 0 or more
  VOWEL_MOD_POST | ◌𑶖 | U+11D96 | 0 or more

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
. : · । ॥ 𑶘 | [U+002E, U+003A, U+00B7, U+0964..U+0965, U+11D98]

Hanunoo

Classes | Characters | Encoding | Count
BASE | ᜠ ᜡ ᜢ ᜣ ᜤ ᜥ ᜦ ᜧ ᜨ ᜩ ᜪ ᜫ ᜬ ᜭ ᜮ ᜯ ᜰ ᜱ ◌ | [U+1720..U+1731, U+25CC] | 1
VOWEL_ABOVE | ◌ᜲ | U+1732 | 0 or more
VOWEL_BELOW | ◌ᜳ | U+1733 | 0 or more
VOWEL_POST | ◌᜴ | U+1734 | 0 or more

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
᜵ ᜶ | [U+1735..U+1736]

Javanese

Classes | Characters | Encoding | Count
BASE, BASE_OTHER | ◌ ꦄ ꦅ ꦆ ꦇ ꦈ ꦉ ꦊ ꦋ ꦌ ꦍ ꦎ ꦏ ꦐ ꦑ ꦒ ꦓ ꦔ ꦕ ꦖ ꦗ ꦘ ꦙ ꦚ ꦛ ꦜ ꦝ ꦞ ꦟ ꦠ ꦡ ꦢ ꦣ ꦤ ꦥ ꦦ ꦧ ꦨ ꦩ ꦪ ꦫ ꦬ ꦭ ꦮ ꦯ ꦰ ꦱ ꦲ ꧐ ꧑ ꧒ ꧓ ꧔ ꧕ ꧖ ꧗ ꧘ ꧙   | [U+25CC, U+A984..U+A9B2, U+A9D0..U+A9D9, U+00A0] | 1
CONS_MOD_ABOVE | ◌꦳ | U+A9B3 | 0 or more
Repeating group | | | 0 or more
  HALANT | ◌꧀ | U+A9C0 | 1
  BASE | ◌ ꦄ ꦅ ꦆ ꦇ ꦈ ꦉ ꦊ ꦋ ꦌ ꦍ ꦎ ꦏ ꦐ ꦑ ꦒ ꦓ ꦔ ꦕ ꦖ ꦗ ꦘ ꦙ ꦚ ꦛ ꦜ ꦝ ꦞ ꦟ ꦠ ꦡ ꦢ ꦣ ꦤ ꦥ ꦦ ꦧ ꦨ ꦩ ꦪ ꦫ ꦬ ꦭ ꦮ ꦯ ꦰ ꦱ ꦲ ꧐ ꧑ ꧒ ꧓ ꧔ ꧕ ꧖ ꧗ ꧘ ꧙ | [U+25CC, U+A984..U+A9B2, U+A9D0..U+A9D9] | 1
  CONS_MOD_ABOVE | ◌꦳ | U+A9B3 | 0 or more
Alternative 1 | | |
  HALANT | ◌꧀ | U+A9C0 | 1
Alternative 2 | | |
  CONS_MED_BELOW | ◌ꦽ ◌ꦿ | [U+A9BD, U+A9BF] | 0 or 1
  CONS_MED_POST | ◌ꦾ | U+A9BE | 0 or 1
  VOWEL_PRE | ◌ꦺ ◌ꦻ | [U+A9BA..U+A9BB] | 0 or more
  VOWEL_ABOVE | ◌ꦶ ◌ꦷ ◌ꦼ | [U+A9B6..U+A9B7, U+A9BC] | 0 or more
  VOWEL_BELOW | ◌ꦸ ◌ꦹ | [U+A9B8..U+A9B9] | 0 or more
  VOWEL_POST | ◌ꦴ ◌ꦵ | [U+A9B4..U+A9B5] | 0 or more
  VOWEL_MOD_ABOVE | ◌ꦀ ◌ꦁ | [U+A980..U+A981] | 0 or more
  VOWEL_MOD_POST | ◌ꦃ | U+A983 | 0 or more
  CONS_FINAL_ABOVE | ◌ꦂ | U+A982 | 0 or more

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
꧁ ꧂ ꧃ ꧄ ꧅ ꧆ ꧇ ꧈ ꧉ ꧊ ꧋ ꧌ ꧍ ꧏ ꧞ ꧟ | [U+A9C1..U+A9CD, U+A9CF, U+A9DE..U+A9DF]

Kaithi

Classes | Characters | Encoding | Count
BASE, BASE_OTHER | ० १ २ ३ ४ ५ ६ ७ ८ ९ ◌ 𑂃 𑂄 𑂅 𑂆 𑂇 𑂈 𑂉 𑂊 𑂋 𑂌 𑂍 𑂎 𑂏 𑂐 𑂑 𑂒 𑂓 𑂔 𑂕 𑂖 𑂗 𑂘 𑂙 𑂚 𑂛 𑂜 𑂝 𑂞 𑂟 𑂠 𑂡 𑂢 𑂣 𑂤 𑂥 𑂦 𑂧 𑂨 𑂩 𑂪 𑂫 𑂬 𑂭 𑂮 𑂯 - ‐ ‑ | [U+0966..U+096F, U+25CC, U+11083..U+110AF, U+002D, U+2010..U+2011] | 1
CONS_MOD_BELOW | ◌𑂺 | U+110BA | 0 or more
Repeating group | | | 0 or more
  HALANT | ◌𑂹 | U+110B9 | 1
  BASE | ० १ २ ३ ४ ५ ६ ७ ८ ९ ◌ 𑂃 𑂄 𑂅 𑂆 𑂇 𑂈 𑂉 𑂊 𑂋 𑂌 𑂍 𑂎 𑂏 𑂐 𑂑 𑂒 𑂓 𑂔 𑂕 𑂖 𑂗 𑂘 𑂙 𑂚 𑂛 𑂜 𑂝 𑂞 𑂟 𑂠 𑂡 𑂢 𑂣 𑂤 𑂥 𑂦 𑂧 𑂨 𑂩 𑂪 𑂫 𑂬 𑂭 𑂮 𑂯 | [U+0966..U+096F, U+25CC, U+11083..U+110AF] | 1
  CONS_MOD_BELOW | ◌𑂺 | U+110BA | 0 or more
Alternative 1 | | |
  HALANT | ◌𑂹 | U+110B9 | 1
Alternative 2 | | |
  VOWEL_PRE | ◌𑂱 | U+110B1 | 0 or more
  VOWEL_ABOVE | ◌𑂵 ◌𑂶 | [U+110B5..U+110B6] | 0 or more
  VOWEL_BELOW | ◌𑂳 ◌𑂴 ◌𑃂 | [U+110B3..U+110B4, U+110C2] | 0 or more
  VOWEL_POST | ◌𑂰 ◌𑂲 ◌𑂷 ◌𑂸 | [U+110B0, U+110B2, U+110B7..U+110B8] | 0 or more
  VOWEL_MOD_ABOVE | ◌𑂀 ◌𑂁 | [U+11080..U+11081] | 0 or more
  VOWEL_MOD_POST | ◌𑂂 | U+11082 | 0 or more

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
+ ⸱ ꠰ ꠱ ꠲ ꠳ ꠴ ꠵ ꠶ ꠷ ꠸ ꠹ 𑂻 𑂼 𑂽 𑂾 𑂿 𑃀 𑃁 𑃍 | [U+002B, U+2E31, U+A830..U+A839, U+110BB..U+110C1, U+110CD]

Kawi

Classes | Characters | Encoding | Count
REPHA | 𑼂◌ | U+11F02 | 0 or 1
BASE | ◌ 𑼄 𑼅 𑼆 𑼇 𑼈 𑼉 𑼊 𑼋 𑼌 𑼍 𑼎 𑼏 𑼐 𑼒 𑼓 𑼔 𑼕 𑼖 𑼗 𑼘 𑼙 𑼚 𑼛 𑼜 𑼝 𑼞 𑼟 𑼠 𑼡 𑼢 𑼣 𑼤 𑼥 𑼦 𑼧 𑼨 𑼩 𑼪 𑼫 𑼬 𑼭 𑼮 𑼯 𑼰 𑼱 𑼲 𑼳 𑽐 𑽑 𑽒 𑽓 𑽔 𑽕 𑽖 𑽗 𑽘 𑽙 | [U+25CC, U+11F04..U+11F10, U+11F12..U+11F33, U+11F50..U+11F59] | 1
Repeating group | | | 0 or more
  HALANT | ◌𑽂 | U+11F42 | 1
  BASE | ◌ 𑼄 𑼅 𑼆 𑼇 𑼈 𑼉 𑼊 𑼋 𑼌 𑼍 𑼎 𑼏 𑼐 𑼒 𑼓 𑼔 𑼕 𑼖 𑼗 𑼘 𑼙 𑼚 𑼛 𑼜 𑼝 𑼞 𑼟 𑼠 𑼡 𑼢 𑼣 𑼤 𑼥 𑼦 𑼧 𑼨 𑼩 𑼪 𑼫 𑼬 𑼭 𑼮 𑼯 𑼰 𑼱 𑼲 𑼳 𑽐 𑽑 𑽒 𑽓 𑽔 𑽕 𑽖 𑽗 𑽘 𑽙 | [U+25CC, U+11F04..U+11F10, U+11F12..U+11F33, U+11F50..U+11F59] | 1
Alternative 1 | | |
  HALANT | ◌𑽂 | U+11F42 | 1
Alternative 2 | | |
  VOWEL_PRE | ◌𑼾 ◌𑼿 | [U+11F3E..U+11F3F] | 0 or more
  VOWEL_ABOVE | ◌𑼶 ◌𑼷 ◌𑽀 | [U+11F36..U+11F37, U+11F40] | 0 or more
  VOWEL_BELOW | ◌𑼸 ◌𑼹 ◌𑼺 | [U+11F38..U+11F3A] | 0 or more
  VOWEL_POST | ◌𑼴 ◌𑼵 ◌𑽁 | [U+11F34..U+11F35, U+11F41] | 0 or more
  VOWEL_MOD_ABOVE | ◌𑼀 ◌𑼁 | [U+11F00..U+11F01] | 0 or more
  VOWEL_MOD_POST | ◌𑼃 | U+11F03 | 0 or more

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
𑽃 𑽄 𑽅 𑽆 𑽇 𑽈 𑽉 𑽊 𑽋 𑽌 𑽍 𑽎 𑽏 | [U+11F43..U+11F4F]

Kayah Li

Classes | Characters | Encoding | Count
BASE | ◌ ꤀ ꤁ ꤂ ꤃ ꤄ ꤅ ꤆ ꤇ ꤈ ꤉ ꤊ ꤋ ꤌ ꤍ ꤎ ꤏ ꤐ ꤑ ꤒ ꤓ ꤔ ꤕ ꤖ ꤗ ꤘ ꤙ ꤚ ꤛ ꤜ ꤝ ꤞ ꤟ ꤠ ꤡ ꤢ ꤣ ꤤ ꤥ | [U+25CC, U+A900..U+A925] | 1
VOWEL_ABOVE | ◌ꤦ ◌ꤧ ◌ꤨ ◌ꤩ ◌ꤪ | [U+A926..U+A92A] | 0 or more
VOWEL_MOD_BELOW | ◌꤫ ◌꤬ ◌꤭ | [U+A92B..U+A92D] | 0 or more

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
꤮ ꤯ | [U+A92E..U+A92F]

Kharoshthi

Classes | Characters | Encoding | Count
BASE, BASE_OTHER | ◌ 𐨀 𐨐 𐨑 𐨒 𐨓 𐨕 𐨖 𐨗 𐨙 𐨚 𐨛 𐨜 𐨝 𐨞 𐨟 𐨠 𐨡 𐨢 𐨣 𐨤 𐨥 𐨦 𐨧 𐨨 𐨩 𐨪 𐨫 𐨬 𐨭 𐨮 𐨯 𐨰 𐨱 𐨲 𐨳 𐨴 𐨵 𐩀 𐩁 𐩂 𐩃 𐩄 𐩅 𐩆 𐩇 𐩈 -   ‐ | [U+25CC, U+10A00, U+10A10..U+10A13, U+10A15..U+10A17, U+10A19..U+10A35, U+10A40..U+10A48, U+002D, U+00A0, U+2010] | 1
CONS_MOD_BELOW | ◌𐨸 ◌𐨹 ◌𐨺 | [U+10A38..U+10A3A] | 0 or more
Repeating group | | | 0 or more
  HALANT | ◌𐨿 | U+10A3F | 1
  BASE | ◌ 𐨀 𐨐 𐨑 𐨒 𐨓 𐨕 𐨖 𐨗 𐨙 𐨚 𐨛 𐨜 𐨝 𐨞 𐨟 𐨠 𐨡 𐨢 𐨣 𐨤 𐨥 𐨦 𐨧 𐨨 𐨩 𐨪 𐨫 𐨬 𐨭 𐨮 𐨯 𐨰 𐨱 𐨲 𐨳 𐨴 𐨵 𐩀 𐩁 𐩂 𐩃 𐩄 𐩅 𐩆 𐩇 𐩈 | [U+25CC, U+10A00, U+10A10..U+10A13, U+10A15..U+10A17, U+10A19..U+10A35, U+10A40..U+10A48] | 1
  CONS_MOD_BELOW | ◌𐨸 ◌𐨹 ◌𐨺 | [U+10A38..U+10A3A] | 0 or more
Alternative 1 | | |
  HALANT | ◌𐨿 | U+10A3F | 1
Alternative 2 | | |
  VOWEL_ABOVE | ◌𐨅 | U+10A05 | 0 or more
  VOWEL_BELOW | ◌𐨁 ◌𐨂 ◌𐨃 ◌𐨆 | [U+10A01..U+10A03, U+10A06] | 0 or more
  VOWEL_POST | ◌𐨌 | U+10A0C | 0 or more
  VOWEL_MOD_ABOVE | ◌𐨏 | U+10A0F | 0 or more
  VOWEL_MOD_BELOW | ◌𐨍 ◌𐨎 | [U+10A0D..U+10A0E] | 0 or more

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
𐩐 𐩑 𐩒 𐩓 𐩔 𐩕 𐩖 𐩗 𐩘 | [U+0020, U+10A50..U+10A58]

Khojki

Classes | Characters | Encoding | Count
BASE | ૦ ૧ ૨ ૩ ૪ ૫ ૬ ૭ ૮ ૯ ◌ 𑈀 𑈁 𑈂 𑈃 𑈄 𑈅 𑈆 𑈇 𑈈 𑈉 𑈊 𑈋 𑈌 𑈍 𑈎 𑈏 𑈐 𑈑 𑈓 𑈔 𑈕 𑈖 𑈗 𑈘 𑈙 𑈚 𑈛 𑈜 𑈝 𑈞 𑈟 𑈠 𑈡 𑈢 𑈣 𑈤 𑈥 𑈦 𑈧 𑈨 𑈩 𑈪 𑈫 𑈿 𑉀 | [U+0AE6..U+0AEF, U+25CC, U+11200..U+11211, U+11213..U+1122B, U+1123F..U+11240] | 1
CONS_MOD_ABOVE | ◌𑈶 ◌𑈷 | [U+11236..U+11237] | 0 or more
Repeating group | | | 0 or more
  HALANT | ◌𑈵 | U+11235 | 1
  BASE | ૦ ૧ ૨ ૩ ૪ ૫ ૬ ૭ ૮ ૯ ◌ 𑈀 𑈁 𑈂 𑈃 𑈄 𑈅 𑈆 𑈇 𑈈 𑈉 𑈊 𑈋 𑈌 𑈍 𑈎 𑈏 𑈐 𑈑 𑈓 𑈔 𑈕 𑈖 𑈗 𑈘 𑈙 𑈚 𑈛 𑈜 𑈝 𑈞 𑈟 𑈠 𑈡 𑈢 𑈣 𑈤 𑈥 𑈦 𑈧 𑈨 𑈩 𑈪 𑈫 𑈿 𑉀 | [U+0AE6..U+0AEF, U+25CC, U+11200..U+11211, U+11213..U+1122B, U+1123F..U+11240] | 1
  CONS_MOD_ABOVE | ◌𑈶 ◌𑈷 | [U+11236..U+11237] | 0 or more
Alternative 1 | | |
  HALANT | ◌𑈵 | U+11235 | 1
Alternative 2 | | |
  VOWEL_ABOVE | ◌𑈰 ◌𑈱 ◌𑈲 ◌𑈳 | [U+11230..U+11233] | 0 or more
  VOWEL_BELOW | ◌𑈯 ◌𑉁 | [U+1122F, U+11241] | 0 or more
  VOWEL_POST | ◌𑈬 ◌𑈭 ◌𑈮 | [U+1122C..U+1122E] | 0 or more
  VOWEL_MOD_ABOVE | ◌𑈴 ◌𑈾 | [U+11234, U+1123E] | 0 or more

The USE does not allow the following character sequences because there are equivalent individual characters:

Character sequence | Encoding
𑈀 ◌𑈬 | <U+11200, U+1122C>
𑈀 ◌𑈱 | <U+11200, U+11231>
𑈀 ◌𑈳 | <U+11200, U+11233>
𑈀 ◌𑈬 ◌𑈱 | <U+11200, U+1122C, U+11231>
𑈆 ◌𑈬 | <U+11206, U+1122C>
◌𑈬 ◌𑈰 | <U+1122C, U+11230>
◌𑈬 ◌𑈱 | <U+1122C, U+11231>
𑉀 ◌𑈮 | <U+11240, U+1122E>

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
꠰ ꠱ ꠲ ꠳ ꠴ ꠵ ꠶ ꠷ ꠸ ꠹ 𑈸 𑈹 𑈺 𑈻 𑈼 𑈽 | [U+A830..U+A839, U+11238..U+1123D]

Khudawadi

Classes | Characters | Encoding | Count
BASE | ◌ 𑊰 𑊱 𑊲 𑊳 𑊴 𑊵 𑊶 𑊷 𑊸 𑊹 𑊺 𑊻 𑊼 𑊽 𑊾 𑊿 𑋀 𑋁 𑋂 𑋃 𑋄 𑋅 𑋆 𑋇 𑋈 𑋉 𑋊 𑋋 𑋌 𑋍 𑋎 𑋏 𑋐 𑋑 𑋒 𑋓 𑋔 𑋕 𑋖 𑋗 𑋘 𑋙 𑋚 𑋛 𑋜 𑋝 𑋞 𑋰 𑋱 𑋲 𑋳 𑋴 𑋵 𑋶 𑋷 𑋸 𑋹 | [U+25CC, U+112B0..U+112DE, U+112F0..U+112F9] | 1
CONS_MOD_BELOW | ◌𑋩 | U+112E9 | 0 or more
VOWEL_PRE | ◌𑋡 | U+112E1 | 0 or more
VOWEL_ABOVE | ◌𑋥 ◌𑋦 ◌𑋧 ◌𑋨 | [U+112E5..U+112E8] | 0 or more
VOWEL_BELOW | ◌𑋣 ◌𑋤 ◌𑋪 | [U+112E3..U+112E4, U+112EA] | 0 or more
VOWEL_POST | ◌𑋠 ◌𑋢 | [U+112E0, U+112E2] | 0 or more
VOWEL_MOD_ABOVE | ◌𑋟 | U+112DF | 0 or more

The USE does not allow the following character sequences because there are equivalent individual characters:

Character sequence | Encoding
𑊰 ◌𑋠 | <U+112B0, U+112E0>
𑊰 ◌𑋥 | <U+112B0, U+112E5>
𑊰 ◌𑋦 | <U+112B0, U+112E6>
𑊰 ◌𑋧 | <U+112B0, U+112E7>
𑊰 ◌𑋨 | <U+112B0, U+112E8>
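An input method or validator could flag these sequences before they reach storage. The sketch below only detects the five sequences listed above; the names are hypothetical, and it does not say which individual character should be used instead, since that mapping is not given in the table.

```python
# The five disallowed Khudawadi sequences from the table above.
KHUDAWADI_DISALLOWED = (
    "\U000112B0\U000112E0",
    "\U000112B0\U000112E5",
    "\U000112B0\U000112E6",
    "\U000112B0\U000112E7",
    "\U000112B0\U000112E8",
)


def find_disallowed(text: str) -> list[str]:
    """Return the disallowed sequences that occur anywhere in text."""
    return [seq for seq in KHUDAWADI_DISALLOWED if seq in text]
```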

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
. : ; । ॥ ꠰ ꠱ ꠲ ꠳ ꠴ ꠵ ꠶ ꠷ ꠸ ꠹ | [U+002E, U+003A..U+003B, U+0964..U+0965, U+A830..U+A839]

Lepcha

ClassesCharactersEncodingCount
BASEᰀ ᰁ ᰂ ᰃ ᰄ ᰅ ᰆ ᰇ ᰈ ᰉ ᰊ ᰋ ᰌ ᰍ ᰎ ᰏ ᰐ ᰑ ᰒ ᰓ ᰔ ᰕ ᰖ ᰗ ᰘ ᰙ ᰚ ᰛ ᰜ ᰝ ᰞ ᰟ ᰠ ᰡ ᰢ ᰣ ᱀ ᱁ ᱂ ᱃ ᱄ ᱅ ᱆ ᱇ ᱈ ᱉ ᱍ ᱎ ᱏ ◌ [U+1C00..U+1C23, U+1C40..U+1C49, U+1C4D..U+1C4F, U+25CC]1
CONS_MOD_BELOW◌᰷ U+1C370 or more
Repeating group0 or more
  CONS_SUB◌ᰤ ◌ᰥ [U+1C24..U+1C25]1
CONS_MOD_BELOW◌᰷ U+1C370 or more
VOWEL_PRE◌ᰧ ◌ᰨ ◌ᰩ [U+1C27..U+1C29]0 or more
VOWEL_BELOW◌ᰬ U+1C2C0 or more
VOWEL_POST◌ᰦ ◌ᰪ ◌ᰫ [U+1C26, U+1C2A..U+1C2B]0 or more
VOWEL_MOD_PRE◌ᰴ ◌ᰵ [U+1C34..U+1C35]0 or more
CONS_FINAL_ABOVE◌ᰭ ◌ᰮ ◌ᰯ ◌ᰰ ◌ᰱ ◌ᰲ ◌ᰳ [U+1C2D..U+1C33]0 or more

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
, . ? ◌ᰶ ᰻ ᰼ ᰽ ᰾ ᰿ [U+002C, U+002E, U+003F, U+1C36, U+1C3B..U+1C3F]

Known bugs:

Limbu

ClassesCharactersEncodingCount
BASE, BASE_OTHER० १ २ ३ ४ ५ ६ ७ ८ ९ ᤁ ᤂ ᤃ ᤄ ᤅ ᤆ ᤇ ᤈ ᤉ ᤊ ᤋ ᤌ ᤍ ᤎ ᤏ ᤐ ᤑ ᤒ ᤓ ᤔ ᤕ ᤖ ᤗ ᤘ ᤙ ᤚ ᤛ ᤜ ᤝ ᤞ ᥆ ᥇ ᥈ ᥉ ᥊ ᥋ ᥌ ᥍ ᥎ ᥏ ◌ ᤀ [U+0966..U+096F, U+1901..U+191E, U+1946..U+194F, U+25CC, U+1900]1
Repeating group0 or more
  CONS_SUB◌ᤩ ◌ᤪ ◌ᤫ [U+1929..U+192B]1
VOWEL_ABOVE◌ᤠ ◌ᤡ ◌ᤥ ◌ᤦ ◌ᤧ ◌ᤨ [U+1920..U+1921, U+1925..U+1928]0 or more
VOWEL_BELOW◌ᤢ U+19220 or more
VOWEL_POST◌ᤣ ◌ᤤ [U+1923..U+1924]0 or more
VOWEL_MOD_ABOVE◌᤺ U+193A0 or more
VOWEL_MOD_BELOW◌ᤲ U+19320 or more
CONS_FINAL_BELOW◌᤹ U+19390 or more
CONS_FINAL_POST◌ᤰ ◌ᤱ ◌ᤳ ◌ᤴ ◌ᤵ ◌ᤶ ◌ᤷ ◌ᤸ [U+1930..U+1931, U+1933..U+1938]0 or more
CONS_FINAL_MOD◌᤻ U+193B0 or 1

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
٠ ١ ٢ ٣ ٤ ٥ ٦ ٧ ٨ ٩ ॥ ᥀ ᥄ ᥅ [U+0660..U+0669, U+0965, U+1940, U+1944..U+1945]

Mahajani

Classes | Characters | Encoding | Count
BASE | ० १ २ ३ ४ ५ ६ ७ ८ ९ ◌ 𑅐 𑅑 𑅒 𑅓 𑅔 𑅕 𑅖 𑅗 𑅘 𑅙 𑅚 𑅛 𑅜 𑅝 𑅞 𑅟 𑅠 𑅡 𑅢 𑅣 𑅤 𑅥 𑅦 𑅧 𑅨 𑅩 𑅪 𑅫 𑅬 𑅭 𑅮 𑅯 𑅰 𑅱 𑅲 | [U+0966..U+096F, U+25CC, U+11150..U+11172] | 1
CONS_MOD_BELOW | ◌𑅳 | U+11173 | 0 or more

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
: · । ॥ ꠰ ꠱ ꠲ ꠳ ꠴ ꠵ ꠶ ꠷ ꠸ ꠹ 𑅴 𑅵 𑅶 | [U+003A, U+00B7, U+0964..U+0965, U+A830..U+A839, U+11174..U+11176]

Makasar

Classes | Characters | Encoding | Count
BASE, BASE_OTHER | 0 1 2 3 4 5 6 7 8 9 ◌ 𑻠 𑻡 𑻢 𑻣 𑻤 𑻥 𑻦 𑻧 𑻨 𑻩 𑻪 𑻫 𑻬 𑻭 𑻮 𑻯 𑻰 𑻱   𑻲 | [U+0030..U+0039, U+25CC, U+11EE0..U+11EF1, U+00A0, U+11EF2] | 1
VOWEL_PRE | ◌𑻵 | U+11EF5 | 0 or more
VOWEL_ABOVE | ◌𑻳 | U+11EF3 | 0 or more
VOWEL_BELOW | ◌𑻴 | U+11EF4 | 0 or more
VOWEL_POST | ◌𑻶 | U+11EF6 | 0 or more
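
A table without repeating groups or alternatives, such as this one, translates directly into a regular expression: one base character followed by zero or more characters of each class, in table order. The Python sketch below illustrates this for Makasar. The names `CLUSTER` and `is_cluster` are hypothetical, and the pattern only mirrors the table; it is not the shaping engine's actual implementation.

```python
import re

# BASE and BASE_OTHER: ASCII digits, dotted circle, no-break space,
# and U+11EE0..U+11EF2, as listed in the table above.
BASE = "[0-9\u25CC\u00A0\U00011EE0-\U00011EF2]"

CLUSTER = re.compile(
    BASE
    + "\U00011EF5*"  # VOWEL_PRE, 0 or more
    + "\U00011EF3*"  # VOWEL_ABOVE, 0 or more
    + "\U00011EF4*"  # VOWEL_BELOW, 0 or more
    + "\U00011EF6*"  # VOWEL_POST, 0 or more
)


def is_cluster(text):
    """Return True if text is exactly one cluster in the table's encoding order."""
    return CLUSTER.fullmatch(text) is not None
```

Under this reading, a base followed by U+11EF5 and then U+11EF3 is accepted, while the same two vowel signs in the opposite order are rejected.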

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
٠ ١ ٢ ٣ ٤ ٥ ٦ ٧ ٨ ٩ 𑻷 𑻸 [U+0020, U+0660..U+0669, U+11EF7..U+11EF8]

Marchen

ClassesCharactersEncodingCount
BASE◌ 𑱲 𑱳 𑱴 𑱵 𑱶 𑱷 𑱸 𑱹 𑱺 𑱻 𑱼 𑱽 𑱾 𑱿 𑲀 𑲁 𑲂 𑲃 𑲄 𑲅 𑲆 𑲇 𑲈 𑲉 𑲊 𑲋 𑲌 𑲍 𑲎 𑲏 [U+25CC, U+11C72..U+11C8F]1
Repeating group0 or more
  CONS_SUB◌𑲒 ◌𑲓 ◌𑲔 ◌𑲕 ◌𑲖 ◌𑲗 ◌𑲘 ◌𑲙 ◌𑲚 ◌𑲛 ◌𑲜 ◌𑲝 ◌𑲞 ◌𑲟 ◌𑲠 ◌𑲡 ◌𑲢 ◌𑲣 ◌𑲤 ◌𑲥 ◌𑲦 ◌𑲧 ◌𑲩 ◌𑲪 ◌𑲫 ◌𑲬 ◌𑲭 ◌𑲮 ◌𑲯 [U+11C92..U+11CA7, U+11CA9..U+11CAF]1
VOWEL_PRE◌𑲱 U+11CB10 or more
VOWEL_ABOVE◌𑲳 U+11CB30 or more
VOWEL_BELOW◌𑲰 ◌𑲲 [U+11CB0, U+11CB2]0 or more
VOWEL_POST◌𑲴 U+11CB40 or more
VOWEL_MOD_ABOVE◌𑲵 ◌𑲶 [U+11CB5..U+11CB6]0 or more

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
𑱰 𑱱 [U+11C70..U+11C71]

Masaram Gondi

ClassesCharactersEncodingCount
REPHA𑵆◌ U+11D460 or 1
BASE◌ 𑴀 𑴁 𑴂 𑴃 𑴄 𑴅 𑴆 𑴈 𑴉 𑴋 𑴌 𑴍 𑴎 𑴏 𑴐 𑴑 𑴒 𑴓 𑴔 𑴕 𑴖 𑴗 𑴘 𑴙 𑴚 𑴛 𑴜 𑴝 𑴞 𑴟 𑴠 𑴡 𑴢 𑴣 𑴤 𑴥 𑴦 𑴧 𑴨 𑴩 𑴪 𑴫 𑴬 𑴭 𑴮 𑴯 𑴰 𑵐 𑵑 𑵒 𑵓 𑵔 𑵕 𑵖 𑵗 𑵘 𑵙 [U+25CC, U+11D00..U+11D06, U+11D08..U+11D09, U+11D0B..U+11D30, U+11D50..U+11D59]1
CONS_MOD_BELOW◌𑵂 U+11D420 or more
Repeating group0 or more
  HALANT◌𑵅 U+11D451
BASE◌ 𑴀 𑴁 𑴂 𑴃 𑴄 𑴅 𑴆 𑴈 𑴉 𑴋 𑴌 𑴍 𑴎 𑴏 𑴐 𑴑 𑴒 𑴓 𑴔 𑴕 𑴖 𑴗 𑴘 𑴙 𑴚 𑴛 𑴜 𑴝 𑴞 𑴟 𑴠 𑴡 𑴢 𑴣 𑴤 𑴥 𑴦 𑴧 𑴨 𑴩 𑴪 𑴫 𑴬 𑴭 𑴮 𑴯 𑴰 𑵐 𑵑 𑵒 𑵓 𑵔 𑵕 𑵖 𑵗 𑵘 𑵙 [U+25CC, U+11D00..U+11D06, U+11D08..U+11D09, U+11D0B..U+11D30, U+11D50..U+11D59]1
CONS_MOD_BELOW◌𑵂 U+11D420 or more
Alternative 1
  HALANT◌𑵅 U+11D451
Alternative 2
  CONS_MED_BELOW◌𑵇 U+11D470 or 1
VOWEL_ABOVE◌𑴱 ◌𑴲 ◌𑴳 ◌𑴴 ◌𑴵 ◌𑴺 ◌𑴼 ◌𑴽 ◌𑴿 ◌𑵃 [U+11D31..U+11D35, U+11D3A, U+11D3C..U+11D3D, U+11D3F, U+11D43]0 or more
VOWEL_BELOW◌𑴶 ◌𑵄 [U+11D36, U+11D44]0 or more
VOWEL_MOD_ABOVE◌𑵀 ◌𑵁 [U+11D40..U+11D41]0 or more

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
। ॥ [U+0964..U+0965]

Meetei Mayek

ClassesCharactersEncodingCount
BASE◌ ꫠ ꫡ ꫢ ꫣ ꫤ ꫥ ꫦ ꫧ ꫨ ꫩ ꫪ ꯀ ꯁ ꯂ ꯃ ꯄ ꯅ ꯆ ꯇ ꯈ ꯉ ꯊ ꯋ ꯌ ꯍ ꯎ ꯏ ꯐ ꯑ ꯒ ꯓ ꯔ ꯕ ꯖ ꯗ ꯘ ꯙ ꯚ ꯛ ꯜ ꯝ ꯞ ꯟ ꯠ ꯡ ꯢ ꯰ ꯱ ꯲ ꯳ ꯴ ꯵ ꯶ ꯷ ꯸ ꯹ [U+25CC, U+AAE0..U+AAEA, U+ABC0..U+ABE2, U+ABF0..U+ABF9]1
Repeating group0 or more
  HALANT◌꫶ U+AAF61
BASE◌ ꫠ ꫡ ꫢ ꫣ ꫤ ꫥ ꫦ ꫧ ꫨ ꫩ ꫪ ꯀ ꯁ ꯂ ꯃ ꯄ ꯅ ꯆ ꯇ ꯈ ꯉ ꯊ ꯋ ꯌ ꯍ ꯎ ꯏ ꯐ ꯑ ꯒ ꯓ ꯔ ꯕ ꯖ ꯗ ꯘ ꯙ ꯚ ꯛ ꯜ ꯝ ꯞ ꯟ ꯠ ꯡ ꯢ ꯰ ꯱ ꯲ ꯳ ꯴ ꯵ ꯶ ꯷ ꯸ ꯹ [U+25CC, U+AAE0..U+AAEA, U+ABC0..U+ABE2, U+ABF0..U+ABF9]1
Alternative 1
  HALANT◌꫶ U+AAF61
Alternative 2
  VOWEL_PRE◌ꫫ ◌ꫮ [U+AAEB, U+AAEE]0 or more
VOWEL_ABOVE◌ꫭ ◌ꯥ [U+AAED, U+ABE5]0 or more
VOWEL_BELOW◌ꫬ ◌ꯨ ◌꯭ [U+AAEC, U+ABE8, U+ABED]0 or more
VOWEL_POST◌ꫯ ◌ꯣ ◌ꯤ ◌ꯦ ◌ꯧ ◌ꯩ ◌ꯪ [U+AAEF, U+ABE3..U+ABE4, U+ABE6..U+ABE7, U+ABE9..U+ABEA]0 or more
VOWEL_MOD_POST◌ꫵ ◌꯬ [U+AAF5, U+ABEC]0 or more

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
꫰ ꫱ ꫲ ꫳ ꫴ ꯫ [U+AAF0..U+AAF4, U+ABEB]

Modi

ClassesCharactersEncodingCount
BASE, BASE_OTHER◌ 𑘀 𑘁 𑘂 𑘃 𑘄 𑘅 𑘆 𑘇 𑘈 𑘉 𑘊 𑘋 𑘌 𑘍 𑘎 𑘏 𑘐 𑘑 𑘒 𑘓 𑘔 𑘕 𑘖 𑘗 𑘘 𑘙 𑘚 𑘛 𑘜 𑘝 𑘞 𑘟 𑘠 𑘡 𑘢 𑘣 𑘤 𑘥 𑘦 𑘧 𑘨 𑘩 𑘪 𑘫 𑘬 𑘭 𑘮 𑘯 𑙐 𑙑 𑙒 𑙓 𑙔 𑙕 𑙖 𑙗 𑙘 𑙙   [U+25CC, U+11600..U+1162F, U+11650..U+11659, U+00A0]1
Repeating group0 or more
  HALANT◌𑘿 U+1163F1
BASE◌ 𑘀 𑘁 𑘂 𑘃 𑘄 𑘅 𑘆 𑘇 𑘈 𑘉 𑘊 𑘋 𑘌 𑘍 𑘎 𑘏 𑘐 𑘑 𑘒 𑘓 𑘔 𑘕 𑘖 𑘗 𑘘 𑘙 𑘚 𑘛 𑘜 𑘝 𑘞 𑘟 𑘠 𑘡 𑘢 𑘣 𑘤 𑘥 𑘦 𑘧 𑘨 𑘩 𑘪 𑘫 𑘬 𑘭 𑘮 𑘯 𑙐 𑙑 𑙒 𑙓 𑙔 𑙕 𑙖 𑙗 𑙘 𑙙 [U+25CC, U+11600..U+1162F, U+11650..U+11659]1
Alternative 1
  HALANT◌𑘿 U+1163F1
Alternative 2
  VOWEL_ABOVE◌𑘹 ◌𑘺 ◌𑙀 [U+11639..U+1163A, U+11640]0 or more
VOWEL_BELOW◌𑘳 ◌𑘴 ◌𑘵 ◌𑘶 ◌𑘷 ◌𑘸 [U+11633..U+11638]0 or more
VOWEL_POST◌𑘰 ◌𑘱 ◌𑘲 ◌𑘻 ◌𑘼 [U+11630..U+11632, U+1163B..U+1163C]0 or more
VOWEL_MOD_ABOVE◌𑘽 U+1163D0 or more
VOWEL_MOD_POST◌𑘾 U+1163E0 or more

The USE does not allow the following character sequences because there are equivalent individual characters:

Character sequence | Encoding
𑘀 ◌𑘹 | <U+11600, U+11639>
𑘀 ◌𑘺 | <U+11600, U+1163A>
𑘁 ◌𑘹 | <U+11601, U+11639>
𑘁 ◌𑘺 | <U+11601, U+1163A>

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
. ꠰ ꠱ ꠲ ꠳ ꠴ ꠵ ꠶ ꠷ ꠸ ꠹ 𑙁 𑙂 𑙃 𑙄 [U+0020, U+002E, U+A830..U+A839, U+11641..U+11644]

Multani

Classes | Characters | Encoding | Count
BASE | ੦ ੧ ੨ ੩ ੪ ੫ ੬ ੭ ੮ ੯ 𑊀 𑊁 𑊂 𑊃 𑊄 𑊅 𑊆 𑊈 𑊊 𑊋 𑊌 𑊍 𑊏 𑊐 𑊑 𑊒 𑊓 𑊔 𑊕 𑊖 𑊗 𑊘 𑊙 𑊚 𑊛 𑊜 𑊝 𑊟 𑊠 𑊡 𑊢 𑊣 𑊤 𑊥 𑊦 𑊧 𑊨 | [U+0A66..U+0A6F, U+11280..U+11286, U+11288, U+1128A..U+1128D, U+1128F..U+1129D, U+1129F..U+112A8] | 1

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
𑊩 | U+112A9

Nag Mundari

ClassesCharactersEncodingCount
BASE, BASE_OTHER◌ 𞓐 𞓑 𞓒 𞓓 𞓔 𞓕 𞓖 𞓗 𞓘 𞓙 𞓚 𞓛 𞓜 𞓝 𞓞 𞓟 𞓠 𞓡 𞓢 𞓣 𞓤 𞓥 𞓦 𞓧 𞓨 𞓩 𞓪 𞓫 𞓰 𞓱 𞓲 𞓳 𞓴 𞓵 𞓶 𞓷 𞓸 𞓹 -   ‐ [U+25CC, U+1E4D0..U+1E4EB, U+1E4F0..U+1E4F9, U+002D, U+00A0, U+2010]1
VOWEL_ABOVE◌𞓬 ◌𞓭 ◌𞓮 ◌𞓯 [U+1E4EC..U+1E4EF]0 or more

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
! " ' , . ? ‘ ’ “ ” [U+0020..U+0022, U+0027, U+002C, U+002E, U+003F, U+2018..U+2019, U+201C..U+201D]

Nandinagari

ClassesCharactersEncodingCount
BASE, BASE_OTHER೦ ೧ ೨ ೩ ೪ ೫ ೬ ೭ ೮ ೯ ◌ 𑦠 𑦡 𑦢 𑦣 𑦤 𑦥 𑦦 𑦧 𑦪 𑦫 𑦬 𑦭 𑦮 𑦯 𑦰 𑦱 𑦲 𑦳 𑦴 𑦵 𑦶 𑦷 𑦸 𑦹 𑦺 𑦻 𑦼 𑦽 𑦾 𑦿 𑧀 𑧁 𑧂 𑧃 𑧄 𑧅 𑧆 𑧇 𑧈 𑧉 𑧊 𑧋 𑧌 𑧍 𑧎 𑧏 𑧐 𑧡 ᳺ [U+0CE6..U+0CEF, U+25CC, U+119A0..U+119A7, U+119AA..U+119D0, U+119E1, U+1CFA]1
Repeating group0 or more
  HALANT◌𑧠 U+119E01
BASE೦ ೧ ೨ ೩ ೪ ೫ ೬ ೭ ೮ ೯ ◌ 𑦠 𑦡 𑦢 𑦣 𑦤 𑦥 𑦦 𑦧 𑦪 𑦫 𑦬 𑦭 𑦮 𑦯 𑦰 𑦱 𑦲 𑦳 𑦴 𑦵 𑦶 𑦷 𑦸 𑦹 𑦺 𑦻 𑦼 𑦽 𑦾 𑦿 𑧀 𑧁 𑧂 𑧃 𑧄 𑧅 𑧆 𑧇 𑧈 𑧉 𑧊 𑧋 𑧌 𑧍 𑧎 𑧏 𑧐 𑧡 [U+0CE6..U+0CEF, U+25CC, U+119A0..U+119A7, U+119AA..U+119D0, U+119E1]1
Alternative 1
  HALANT◌𑧠 U+119E01
Alternative 2
  VOWEL_PRE◌𑧒 ◌𑧤 [U+119D2, U+119E4]0 or more
VOWEL_ABOVE◌𑧚 ◌𑧛 [U+119DA..U+119DB]0 or more
VOWEL_BELOW◌𑧔 ◌𑧕 ◌𑧖 ◌𑧗 [U+119D4..U+119D7]0 or more
VOWEL_POST◌𑧑 ◌𑧓 ◌𑧜 ◌𑧝 [U+119D1, U+119D3, U+119DC..U+119DD]0 or more
VOWEL_MOD_POST◌𑧞 ◌𑧟 [U+119DE..U+119DF]0 or more

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
। ॥ ᳩ ᳲ ꠰ ꠱ ꠲ ꠳ ꠴ ꠵ 𑧢 𑧣 [U+0964..U+0965, U+1CE9, U+1CF2, U+A830..U+A835, U+119E2..U+119E3]

Newa

ClassesCharactersEncodingCount
CONS_WITH_STACKER𑑠 𑑡 [U+11460..U+11461]0 or 1
BASE◌ 𑐀 𑐁 𑐂 𑐃 𑐄 𑐅 𑐆 𑐇 𑐈 𑐉 𑐊 𑐋 𑐌 𑐍 𑐎 𑐏 𑐐 𑐑 𑐒 𑐓 𑐔 𑐕 𑐖 𑐗 𑐘 𑐙 𑐚 𑐛 𑐜 𑐝 𑐞 𑐟 𑐠 𑐡 𑐢 𑐣 𑐤 𑐥 𑐦 𑐧 𑐨 𑐩 𑐪 𑐫 𑐬 𑐭 𑐮 𑐯 𑐰 𑐱 𑐲 𑐳 𑐴 𑑇 𑑐 𑑑 𑑒 𑑓 𑑔 𑑕 𑑖 𑑗 𑑘 𑑙 𑑟 [U+25CC, U+11400..U+11434, U+11447, U+11450..U+11459, U+1145F]1
CONS_MOD_BELOW◌𑑆 U+114460 or more
Repeating group0 or more
  HALANT◌𑑂 U+114421
BASE◌ 𑐀 𑐁 𑐂 𑐃 𑐄 𑐅 𑐆 𑐇 𑐈 𑐉 𑐊 𑐋 𑐌 𑐍 𑐎 𑐏 𑐐 𑐑 𑐒 𑐓 𑐔 𑐕 𑐖 𑐗 𑐘 𑐙 𑐚 𑐛 𑐜 𑐝 𑐞 𑐟 𑐠 𑐡 𑐢 𑐣 𑐤 𑐥 𑐦 𑐧 𑐨 𑐩 𑐪 𑐫 𑐬 𑐭 𑐮 𑐯 𑐰 𑐱 𑐲 𑐳 𑐴 𑑇 𑑐 𑑑 𑑒 𑑓 𑑔 𑑕 𑑖 𑑗 𑑘 𑑙 𑑟 [U+25CC, U+11400..U+11434, U+11447, U+11450..U+11459, U+1145F]1
CONS_MOD_BELOW◌𑑆 U+114460 or more
Alternative 1
  HALANT◌𑑂 U+114421
Alternative 2
  VOWEL_PRE◌𑐶 U+114360 or more
VOWEL_ABOVE◌𑐾 ◌𑐿 [U+1143E..U+1143F]0 or more
VOWEL_BELOW◌𑐸 ◌𑐹 ◌𑐺 ◌𑐻 ◌𑐼 ◌𑐽 [U+11438..U+1143D]0 or more
VOWEL_POST◌𑐵 ◌𑐷 ◌𑑀 ◌𑑁 [U+11435, U+11437, U+11440..U+11441]0 or more
VOWEL_MOD_ABOVE◌𑑃 ◌𑑄 [U+11443..U+11444]0 or more
VOWEL_MOD_POST◌𑑅 U+114450 or more
CONS_FINAL_MOD◌᷻ ◌𑑞 [U+1DFB, U+1145E]0 or 1

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
⁕ 𑑈 𑑉 𑑊 𑑋 𑑌 𑑍 𑑎 𑑏 𑑚 𑑛 𑑝 [U+2055, U+11448..U+1144F, U+1145A..U+1145B, U+1145D]

Phags-pa

Classes | Characters | Encoding | Count
BASE, BASE_OTHER | ꡀ ꡁ ꡂ ꡃ ꡄ ꡅ ꡆ ꡇ ꡈ ꡉ ꡊ ꡋ ꡌ ꡍ ꡎ ꡏ ꡐ ꡑ ꡒ ꡓ ꡔ ꡕ ꡖ ꡗ ꡘ ꡙ ꡚ ꡛ ꡜ ꡝ ꡞ ꡟ ꡠ ꡡ ꡢ ꡣ ꡤ ꡥ ꡦ ꡧ ꡨ ꡩ ꡪ ꡫ ꡬ ꡭ ꡮ ꡯ ꡰ ꡱ ꡲ ꡳ   | [U+A840..U+A873, U+00A0] | 1
VARIATION_SELECTOR | | U+FE00 | 0 or 1

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
᠂ ᠃ ᠅   。 ꡴ ꡵ ꡶ ꡷ [U+0020, U+1802..U+1803, U+1805, U+202F, U+3002, U+A874..U+A877]

Rejang

ClassesCharactersEncodingCount
BASE, BASE_OTHER◌ ꤰ ꤱ ꤲ ꤳ ꤴ ꤵ ꤶ ꤷ ꤸ ꤹ ꤺ ꤻ ꤼ ꤽ ꤾ ꤿ ꥀ ꥁ ꥂ ꥃ ꥄ ꥅ ꥆ   [U+25CC, U+A930..U+A946, U+00A0]1
VOWEL_ABOVE◌ꥊ U+A94A0 or more
VOWEL_BELOW◌ꥇ ◌ꥈ ◌ꥉ ◌ꥋ ◌ꥌ ◌ꥍ ◌ꥎ [U+A947..U+A949, U+A94B..U+A94E]0 or more
VOWEL_POST◌꥓ U+A9530 or more
CONS_FINAL_ABOVE◌ꥏ ◌ꥐ ◌ꥑ [U+A94F..U+A951]0 or more
CONS_FINAL_POST◌ꥒ U+A9520 or more

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
, . : ꥟ [U+0020, U+002C, U+002E, U+003A, U+A95F]

Saurashtra

ClassesCharactersEncodingCount
BASE◌ ꢂ ꢃ ꢄ ꢅ ꢆ ꢇ ꢈ ꢉ ꢊ ꢋ ꢌ ꢍ ꢎ ꢏ ꢐ ꢑ ꢒ ꢓ ꢔ ꢕ ꢖ ꢗ ꢘ ꢙ ꢚ ꢛ ꢜ ꢝ ꢞ ꢟ ꢠ ꢡ ꢢ ꢣ ꢤ ꢥ ꢦ ꢧ ꢨ ꢩ ꢪ ꢫ ꢬ ꢭ ꢮ ꢯ ꢰ ꢱ ꢲ ꢳ ꣐ ꣑ ꣒ ꣓ ꣔ ꣕ ꣖ ꣗ ꣘ ꣙ [U+25CC, U+A882..U+A8B3, U+A8D0..U+A8D9]1
Repeating group0 or more
  HALANT◌꣄ U+A8C41
BASE◌ ꢂ ꢃ ꢄ ꢅ ꢆ ꢇ ꢈ ꢉ ꢊ ꢋ ꢌ ꢍ ꢎ ꢏ ꢐ ꢑ ꢒ ꢓ ꢔ ꢕ ꢖ ꢗ ꢘ ꢙ ꢚ ꢛ ꢜ ꢝ ꢞ ꢟ ꢠ ꢡ ꢢ ꢣ ꢤ ꢥ ꢦ ꢧ ꢨ ꢩ ꢪ ꢫ ꢬ ꢭ ꢮ ꢯ ꢰ ꢱ ꢲ ꢳ ꣐ ꣑ ꣒ ꣓ ꣔ ꣕ ꣖ ꣗ ꣘ ꣙ [U+25CC, U+A882..U+A8B3, U+A8D0..U+A8D9]1
Alternative 1
  HALANT◌꣄ U+A8C41
Alternative 2
  CONS_MED_POST◌ꢴ U+A8B40 or 1
VOWEL_POST◌ꢵ ◌ꢶ ◌ꢷ ◌ꢸ ◌ꢹ ◌ꢺ ◌ꢻ ◌ꢼ ◌ꢽ ◌ꢾ ◌ꢿ ◌ꣀ ◌ꣁ ◌ꣂ ◌ꣃ [U+A8B5..U+A8C3]0 or more
VOWEL_MOD_ABOVE◌ꣅ U+A8C50 or more
VOWEL_MOD_POST◌ꢀ ◌ꢁ [U+A880..U+A881]0 or more

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
, . ? ꣎ ꣏ [U+002C, U+002E, U+003F, U+A8CE..U+A8CF]

Sharada

ClassesCharactersEncodingCount
REPHA𑇂◌ 𑇃◌ [U+111C2..U+111C3]0 or 1
BASE◌ 𑆃 𑆄 𑆅 𑆆 𑆇 𑆈 𑆉 𑆊 𑆋 𑆌 𑆍 𑆎 𑆏 𑆐 𑆑 𑆒 𑆓 𑆔 𑆕 𑆖 𑆗 𑆘 𑆙 𑆚 𑆛 𑆜 𑆝 𑆞 𑆟 𑆠 𑆡 𑆢 𑆣 𑆤 𑆥 𑆦 𑆧 𑆨 𑆩 𑆪 𑆫 𑆬 𑆭 𑆮 𑆯 𑆰 𑆱 𑆲 𑇁 𑇐 𑇑 𑇒 𑇓 𑇔 𑇕 𑇖 𑇗 𑇘 𑇙 𑇚 [U+25CC, U+11183..U+111B2, U+111C1, U+111D0..U+111DA]1
CONS_MOD_BELOW◌𑇊 U+111CA0 or more
Repeating group0 or more
  HALANT◌𑇀 U+111C01
BASE◌ 𑆃 𑆄 𑆅 𑆆 𑆇 𑆈 𑆉 𑆊 𑆋 𑆌 𑆍 𑆎 𑆏 𑆐 𑆑 𑆒 𑆓 𑆔 𑆕 𑆖 𑆗 𑆘 𑆙 𑆚 𑆛 𑆜 𑆝 𑆞 𑆟 𑆠 𑆡 𑆢 𑆣 𑆤 𑆥 𑆦 𑆧 𑆨 𑆩 𑆪 𑆫 𑆬 𑆭 𑆮 𑆯 𑆰 𑆱 𑆲 𑇁 𑇐 𑇑 𑇒 𑇓 𑇔 𑇕 𑇖 𑇗 𑇘 𑇙 𑇚 [U+25CC, U+11183..U+111B2, U+111C1, U+111D0..U+111DA]1
CONS_MOD_BELOW◌𑇊 U+111CA0 or more
Alternative 1
  HALANT◌𑇀 U+111C01
Alternative 2
  VOWEL_PRE◌𑆴 ◌𑇎 [U+111B4, U+111CE]0 or more
VOWEL_ABOVE◌𑆼 ◌𑆽 ◌𑆾 ◌𑆿 ◌𑇋 [U+111BC..U+111BF, U+111CB]0 or more
VOWEL_BELOW◌𑆶 ◌𑆷 ◌𑆸 ◌𑆹 ◌𑆺 ◌𑆻 ◌𑇌 [U+111B6..U+111BB, U+111CC]0 or more
VOWEL_POST◌𑆳 ◌𑆵 [U+111B3, U+111B5]0 or more
VOWEL_MOD_ABOVE◌॑ ◌᳠ ◌𑆀 ◌𑆁 ◌𑇏 [U+0951, U+1CE0, U+11180..U+11181, U+111CF]0 or more
VOWEL_MOD_BELOW◌᳗ ◌᳙ ◌᳜ ◌᳝ [U+1CD7, U+1CD9, U+1CDC..U+1CDD]0 or more
VOWEL_MOD_POST◌𑆂 U+111820 or more
CONS_FINAL_MOD◌𑇉 U+111C90 or 1

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
꠰ ꠱ ꠲ ꠳ ꠴ ꠵ ꠸ 𑇄 𑇅 𑇆 𑇇 𑇈 𑇍 𑇛 𑇜 𑇝 𑇞 𑇟 [U+A830..U+A835, U+A838, U+111C4..U+111C8, U+111CD, U+111DB..U+111DF]

Siddham

The USE decomposes the following multi-part vowels, which therefore don’t show up in the encoding order:

Composed Character | Composed Encoding | Decomposed Characters | Decomposed Encoding
◌𑖺 | U+115BA | ◌𑖸 ◌𑖯 | <U+115B8, U+115AF>
◌𑖻 | U+115BB | ◌𑖹 ◌𑖯 | <U+115B9, U+115AF>

Encoding order:

ClassesCharactersEncodingCount
BASE◌ 𑖀 𑖁 𑖂 𑖃 𑖄 𑖅 𑖆 𑖇 𑖈 𑖉 𑖊 𑖋 𑖌 𑖍 𑖎 𑖏 𑖐 𑖑 𑖒 𑖓 𑖔 𑖕 𑖖 𑖗 𑖘 𑖙 𑖚 𑖛 𑖜 𑖝 𑖞 𑖟 𑖠 𑖡 𑖢 𑖣 𑖤 𑖥 𑖦 𑖧 𑖨 𑖩 𑖪 𑖫 𑖬 𑖭 𑖮 𑗘 𑗙 𑗚 𑗛 [U+25CC, U+11580..U+115AE, U+115D8..U+115DB]1
CONS_MOD_BELOW◌𑗀 U+115C00 or more
Repeating group0 or more
  HALANT◌𑖿 U+115BF1
BASE◌ 𑖀 𑖁 𑖂 𑖃 𑖄 𑖅 𑖆 𑖇 𑖈 𑖉 𑖊 𑖋 𑖌 𑖍 𑖎 𑖏 𑖐 𑖑 𑖒 𑖓 𑖔 𑖕 𑖖 𑖗 𑖘 𑖙 𑖚 𑖛 𑖜 𑖝 𑖞 𑖟 𑖠 𑖡 𑖢 𑖣 𑖤 𑖥 𑖦 𑖧 𑖨 𑖩 𑖪 𑖫 𑖬 𑖭 𑖮 𑗘 𑗙 𑗚 𑗛 [U+25CC, U+11580..U+115AE, U+115D8..U+115DB]1
CONS_MOD_BELOW◌𑗀 U+115C00 or more
Alternative 1
  HALANT◌𑖿 U+115BF1
Alternative 2
  VOWEL_PRE◌𑖰 ◌𑖸 ◌𑖹 [U+115B0, U+115B8..U+115B9]0 or more
VOWEL_BELOW◌𑖲 ◌𑖳 ◌𑖴 ◌𑖵 ◌𑗜 ◌𑗝 [U+115B2..U+115B5, U+115DC..U+115DD]0 or more
VOWEL_POST◌𑖯 ◌𑖱 [U+115AF, U+115B1]0 or more
VOWEL_MOD_ABOVE◌𑖼 ◌𑖽 [U+115BC..U+115BD]0 or more
VOWEL_MOD_POST◌𑖾 U+115BE0 or more

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
𑗁 𑗂 𑗃 𑗄 𑗅 𑗆 𑗇 𑗈 𑗉 𑗊 𑗋 𑗌 𑗍 𑗎 𑗏 𑗐 𑗑 𑗒 𑗓 𑗔 𑗕 𑗖 𑗗 [U+115C1..U+115D7]

Known bugs:

Sinhala

The USE decomposes the following multi-part vowels, which therefore don’t show up in the encoding order:

Composed Character | Composed Encoding | Decomposed Characters | Decomposed Encoding
◌ේ | U+0DDA | ◌ෙ ◌් | <U+0DD9, U+0DCA>
◌ො | U+0DDC | ◌ෙ ◌ා | <U+0DD9, U+0DCF>
◌ෝ | U+0DDD | ◌ෙ ◌ා ◌් | <U+0DD9, U+0DCF, U+0DCA>
◌ෞ | U+0DDE | ◌ෙ ◌ෟ | <U+0DD9, U+0DDF>
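
Code that checks text against the encoding order below needs to apply the same decompositions first, since the composed vowels do not appear in the table. A minimal Python sketch, with the mapping copied from the table above and a hypothetical helper name `decompose`:

```python
# Multi-part Sinhala vowels and their decompositions, from the table above.
DECOMPOSITIONS = {
    "\u0DDA": "\u0DD9\u0DCA",
    "\u0DDC": "\u0DD9\u0DCF",
    "\u0DDD": "\u0DD9\u0DCF\u0DCA",
    "\u0DDE": "\u0DD9\u0DDF",
}


def decompose(text):
    """Replace each composed multi-part vowel with its decomposed sequence."""
    return "".join(DECOMPOSITIONS.get(ch, ch) for ch in text)
```

For example, `decompose("\u0D9A\u0DDD")` yields the base U+0D9A followed by U+0DD9, U+0DCF, and U+0DCA; characters without an entry in the mapping pass through unchanged.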

Encoding order:

ClassesCharactersEncodingCount
BASE0 1 2 3 4 5 6 7 8 9 අ ආ ඇ ඈ ඉ ඊ උ ඌ ඍ ඎ ඏ ඐ එ ඒ ඓ ඔ ඕ ඖ ක ඛ ග ඝ ඞ ඟ ච ඡ ජ ඣ ඤ ඥ ඦ ට ඨ ඩ ඪ ණ ඬ ත ථ ද ධ න ඳ ප ඵ බ භ ම ඹ ය ර ල ව ශ ෂ ස හ ළ ෆ ෦ ෧ ෨ ෩ ෪ ෫ ෬ ෭ ෮ ෯ ◌ 𑇡 𑇢 𑇣 𑇤 𑇥 𑇦 𑇧 𑇨 𑇩 𑇪 𑇫 𑇬 𑇭 𑇮 𑇯 𑇰 𑇱 𑇲 𑇳 𑇴 [U+0030..U+0039, U+0D85..U+0D96, U+0D9A..U+0DB1, U+0DB3..U+0DBB, U+0DBD, U+0DC0..U+0DC6, U+0DE6..U+0DEF, U+25CC, U+111E1..U+111F4]1
Repeating group0 or more
  HALANT◌් U+0DCA1
BASE0 1 2 3 4 5 6 7 8 9 අ ආ ඇ ඈ ඉ ඊ උ ඌ ඍ ඎ ඏ ඐ එ ඒ ඓ ඔ ඕ ඖ ක ඛ ග ඝ ඞ ඟ ච ඡ ජ ඣ ඤ ඥ ඦ ට ඨ ඩ ඪ ණ ඬ ත ථ ද ධ න ඳ ප ඵ බ භ ම ඹ ය ර ල ව ශ ෂ ස හ ළ ෆ ෦ ෧ ෨ ෩ ෪ ෫ ෬ ෭ ෮ ෯ ◌ 𑇡 𑇢 𑇣 𑇤 𑇥 𑇦 𑇧 𑇨 𑇩 𑇪 𑇫 𑇬 𑇭 𑇮 𑇯 𑇰 𑇱 𑇲 𑇳 𑇴 [U+0030..U+0039, U+0D85..U+0D96, U+0D9A..U+0DB1, U+0DB3..U+0DBB, U+0DBD, U+0DC0..U+0DC6, U+0DE6..U+0DEF, U+25CC, U+111E1..U+111F4]1
Alternative 1
  HALANT◌් U+0DCA1
Alternative 2
  VOWEL_PRE◌ෙ ◌ෛ [U+0DD9, U+0DDB]0 or more
VOWEL_ABOVE◌ි ◌ී [U+0DD2..U+0DD3]0 or more
VOWEL_BELOW◌ු ◌ූ [U+0DD4, U+0DD6]0 or more
VOWEL_POST◌ා ◌ැ ◌ෑ ◌ෘ ◌ෟ ◌ෲ ◌ෳ [U+0DCF..U+0DD1, U+0DD8, U+0DDF, U+0DF2..U+0DF3]0 or more
VOWEL_MOD_ABOVE◌ඁ U+0D810 or more
VOWEL_MOD_POST◌ං ◌ඃ [U+0D82..U+0D83]0 or more

The USE does not allow the following character sequences because there are equivalent individual characters:

Character sequence | Encoding
අ ◌ා | <U+0D85, U+0DCF>
අ ◌ැ | <U+0D85, U+0DD0>
අ ◌ෑ | <U+0D85, U+0DD1>
උ ◌ෟ | <U+0D8B, U+0DDF>
ඍ ◌ෘ | <U+0D8D, U+0DD8>
ඏ ◌ෟ | <U+0D8F, U+0DDF>
එ ◌් | <U+0D91, U+0DCA>
එ ◌ෙ | <U+0D91, U+0DD9>
එ ◌ේ | <U+0D91, U+0DDA>
එ ◌ො | <U+0D91, U+0DDC>
එ ◌ෝ | <U+0D91, U+0DDD>
එ ◌ෞ | <U+0D91, U+0DDE>
ඔ ◌ෟ | <U+0D94, U+0DDF>

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
। ॥ ෴ ᳲ [U+0964..U+0965, U+0DF4, U+1CF2]

Known bugs:

Soyombo

ClassesCharactersEncodingCount
REPHA𑪄◌ 𑪅◌ 𑪆◌ 𑪇◌ 𑪈◌ 𑪉◌ [U+11A84..U+11A89]0 or 1
BASE◌ 𑩐 𑩜 𑩝 𑩞 𑩟 𑩠 𑩡 𑩢 𑩣 𑩤 𑩥 𑩦 𑩧 𑩨 𑩩 𑩪 𑩫 𑩬 𑩭 𑩮 𑩯 𑩰 𑩱 𑩲 𑩳 𑩴 𑩵 𑩶 𑩷 𑩸 𑩹 𑩺 𑩻 𑩼 𑩽 𑩾 𑩿 𑪀 𑪁 𑪂 𑪃 𑪝 [U+25CC, U+11A50, U+11A5C..U+11A83, U+11A9D]1
CONS_MOD_ABOVE◌𑪘 U+11A980 or more
Repeating group0 or more
  HALANT◌𑪙 U+11A991
BASE◌ 𑩐 𑩜 𑩝 𑩞 𑩟 𑩠 𑩡 𑩢 𑩣 𑩤 𑩥 𑩦 𑩧 𑩨 𑩩 𑩪 𑩫 𑩬 𑩭 𑩮 𑩯 𑩰 𑩱 𑩲 𑩳 𑩴 𑩵 𑩶 𑩷 𑩸 𑩹 𑩺 𑩻 𑩼 𑩽 𑩾 𑩿 𑪀 𑪁 𑪂 𑪃 𑪝 [U+25CC, U+11A50, U+11A5C..U+11A83, U+11A9D]1
CONS_MOD_ABOVE◌𑪘 U+11A980 or more
Alternative 1
  HALANT◌𑪙 U+11A991
Alternative 2
  VOWEL_ABOVE◌𑩑 ◌𑩔 ◌𑩕 ◌𑩖 [U+11A51, U+11A54..U+11A56]0 or more
VOWEL_BELOW◌𑩒 ◌𑩓 ◌𑩙 ◌𑩚 ◌𑩛 [U+11A52..U+11A53, U+11A59..U+11A5B]0 or more
VOWEL_POST◌𑩗 ◌𑩘 [U+11A57..U+11A58]0 or more
VOWEL_MOD_ABOVE◌𑪖 U+11A960 or more
VOWEL_MOD_POST◌𑪗 U+11A970 or more
CONS_FINAL_BELOW◌𑪊 ◌𑪋 ◌𑪌 ◌𑪍 ◌𑪎 ◌𑪏 ◌𑪐 ◌𑪑 ◌𑪒 ◌𑪓 ◌𑪔 ◌𑪕 [U+11A8A..U+11A95]0 or more

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
𑪚 𑪛 𑪜 𑪞 𑪟 𑪠 𑪡 𑪢 [U+11A9A..U+11A9C, U+11A9E..U+11AA2]

Sundanese

ClassesCharactersEncodingCount
BASE, BASE_OTHERᮃ ᮄ ᮅ ᮆ ᮇ ᮈ ᮉ ᮊ ᮋ ᮌ ᮍ ᮎ ᮏ ᮐ ᮑ ᮒ ᮓ ᮔ ᮕ ᮖ ᮗ ᮘ ᮙ ᮚ ᮛ ᮜ ᮝ ᮞ ᮟ ᮠ ᮮ ᮯ ᮰ ᮱ ᮲ ᮳ ᮴ ᮵ ᮶ ᮷ ᮸ ᮹ ᮺ ᮻ ᮼ ᮽ ᮾ ᮿ ◌   [U+1B83..U+1BA0, U+1BAE..U+1BBF, U+25CC, U+00A0]1
Repeating group0 or more
  Alternative 1
  HALANT◌᮫ U+1BAB1
BASEᮃ ᮄ ᮅ ᮆ ᮇ ᮈ ᮉ ᮊ ᮋ ᮌ ᮍ ᮎ ᮏ ᮐ ᮑ ᮒ ᮓ ᮔ ᮕ ᮖ ᮗ ᮘ ᮙ ᮚ ᮛ ᮜ ᮝ ᮞ ᮟ ᮠ ᮮ ᮯ ᮰ ᮱ ᮲ ᮳ ᮴ ᮵ ᮶ ᮷ ᮸ ᮹ ᮺ ᮻ ᮼ ᮽ ᮾ ᮿ ◌ [U+1B83..U+1BA0, U+1BAE..U+1BBF, U+25CC]1
Alternative 2
  CONS_SUB◌ᮡ ◌ᮢ ◌ᮣ ◌ᮬ ◌ᮭ [U+1BA1..U+1BA3, U+1BAC..U+1BAD]1
Alternative 1
  HALANT◌᮫ U+1BAB1
Alternative 2
  VOWEL_PRE◌ᮦ U+1BA60 or more
VOWEL_ABOVE◌ᮤ ◌ᮨ ◌ᮩ [U+1BA4, U+1BA8..U+1BA9]0 or more
VOWEL_BELOW◌ᮥ U+1BA50 or more
VOWEL_POST◌ᮧ ◌᮪ [U+1BA7, U+1BAA]0 or more
VOWEL_MOD_ABOVE◌ᮀ U+1B800 or more
VOWEL_MOD_POST◌ᮂ U+1B820 or more
CONS_FINAL_ABOVE◌ᮁ U+1B810 or more

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
" , . ? ᳀ ᳁ ᳂ ᳃ ᳄ ᳅ ᳆ ᳇ “ ” [U+0020, U+0022, U+002C, U+002E, U+003F, U+1CC0..U+1CC7, U+201C..U+201D]

Syloti Nagri

Classes | Characters | Encoding | Count
BASE | ০ ১ ২ ৩ ৪ ৫ ৬ ৭ ৮ ৯ ◌ ꠀ ꠁ ꠃ ꠄ ꠅ ꠇ ꠈ ꠉ ꠊ ꠌ ꠍ ꠎ ꠏ ꠐ ꠑ ꠒ ꠓ ꠔ ꠕ ꠖ ꠗ ꠘ ꠙ ꠚ ꠛ ꠜ ꠝ ꠞ ꠟ ꠠ ꠡ ꠢ | [U+09E6..U+09EF, U+25CC, U+A800..U+A801, U+A803..U+A805, U+A807..U+A80A, U+A80C..U+A822] | 1
Repeating group | | | 0 or more
  HALANT | ◌꠆ | U+A806 | 1
  BASE | ০ ১ ২ ৩ ৪ ৫ ৬ ৭ ৮ ৯ ◌ ꠀ ꠁ ꠃ ꠄ ꠅ ꠇ ꠈ ꠉ ꠊ ꠌ ꠍ ꠎ ꠏ ꠐ ꠑ ꠒ ꠓ ꠔ ꠕ ꠖ ꠗ ꠘ ꠙ ꠚ ꠛ ꠜ ꠝ ꠞ ꠟ ꠠ ꠡ ꠢ | [U+09E6..U+09EF, U+25CC, U+A800..U+A801, U+A803..U+A805, U+A807..U+A80A, U+A80C..U+A822] | 1
Alternative 1
  HALANT | ◌꠆ | U+A806 | 1
Alternative 2
  VOWEL_ABOVE | ◌ꠂ ◌ꠦ | [U+A802, U+A826] | 0 or more
  VOWEL_BELOW | ◌ꠥ ◌꠬ | [U+A825, U+A82C] | 0 or more
  VOWEL_POST | ◌ꠣ ◌ꠤ ◌ꠧ | [U+A823..U+A824, U+A827] | 0 or more
  VOWEL_MOD_ABOVE | ◌ꠋ | U+A80B | 0 or more
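
Tables with a repeating group and alternatives can be expressed as a regular expression as well. The Python sketch below does this for Syloti Nagri. It assumes the reading given in the introduction: the repeating group consists of a halant followed by a base, and the cluster then ends with either a final halant (Alternative 1) or the vowel and vowel modifier classes in table order (Alternative 2). The names `CLUSTER` and `is_cluster` are hypothetical, and the pattern is an illustration of the table, not the shaping engine's implementation.

```python
import re

# BASE: Bengali digits, dotted circle, and the Syloti Nagri letters in the table.
BASE = "[\u09E6-\u09EF\u25CC\uA800\uA801\uA803-\uA805\uA807-\uA80A\uA80C-\uA822]"
HALANT = "\uA806"

# Alternative 2: each class 0 or more times, in table order.
TAIL = (
    "[\uA802\uA826]*"        # VOWEL_ABOVE
    "[\uA825\uA82C]*"        # VOWEL_BELOW
    "[\uA823\uA824\uA827]*"  # VOWEL_POST
    "\uA80B*"                # VOWEL_MOD_ABOVE
)

CLUSTER = re.compile(f"{BASE}(?:{HALANT}{BASE})*(?:{HALANT}|{TAIL})")


def is_cluster(text):
    """Return True if text is exactly one cluster under the assumed grouping."""
    return CLUSTER.fullmatch(text) is not None
```

Because every class in Alternative 2 may occur zero times, a bare base character is also a valid cluster under this pattern.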

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
, . : ; ? । ॥ ⁕ ꠨ ꠩ ꠪ ꠫ [U+002C, U+002E, U+003A..U+003B, U+003F, U+0964..U+0965, U+2055, U+A828..U+A82B]

Tagalog

Classes | Characters | Encoding | Count
BASE | ᜀ ᜁ ᜂ ᜃ ᜄ ᜅ ᜆ ᜇ ᜈ ᜉ ᜊ ᜋ ᜌ ᜍ ᜎ ᜏ ᜐ ᜑ ᜟ ◌ | [U+1700..U+1711, U+171F, U+25CC] | 1
VOWEL_ABOVE | ◌ᜒ | U+1712 | 0 or more
VOWEL_BELOW | ◌ᜓ ◌᜔ | [U+1713..U+1714] | 0 or more
VOWEL_POST | ◌᜕ | U+1715 | 0 or more

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
᜵ ᜶ | [U+1735..U+1736]

Tagbanwa

Classes | Characters | Encoding | Count
BASE | ᝠ ᝡ ᝢ ᝣ ᝤ ᝥ ᝦ ᝧ ᝨ ᝩ ᝪ ᝫ ᝬ ᝮ ᝯ ᝰ ◌ | [U+1760..U+176C, U+176E..U+1770, U+25CC] | 1
VOWEL_ABOVE | ◌ᝲ | U+1772 | 0 or more
VOWEL_BELOW | ◌ᝳ | U+1773 | 0 or more

The following characters are used with the script but can not be part of clusters:

Characters | Encoding
᜵ ᜶ | [U+1735..U+1736]

Tai Le

ClassesCharactersEncodingCount
BASE0 1 2 3 4 5 6 7 8 9 ၀ ၁ ၂ ၃ ၄ ၅ ၆ ၇ ၈ ၉ ᥐ ᥑ ᥒ ᥓ ᥔ ᥕ ᥖ ᥗ ᥘ ᥙ ᥚ ᥛ ᥜ ᥝ ᥞ ᥟ ᥠ ᥡ ᥢ ᥣ ᥤ ᥥ ᥦ ᥧ ᥨ ᥩ ᥪ ᥫ ᥬ ᥭ ᥰ ᥱ ᥲ ᥳ ᥴ ◌ [U+0030..U+0039, U+1040..U+1049, U+1950..U+196D, U+1970..U+1974, U+25CC]1

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
◌̀ ◌́ ◌̇ ◌̈ ◌̌ [U+0300..U+0301, U+0307..U+0308, U+030C]

Known bugs:

Tai Tham

Tai Tham is not fully supported by the USE. Orthographic syllables in Tai Tham can be more complicated than those of most other scripts, and there’s no agreement yet on how to encode them in Unicode. See the Topical Document List: Tai Tham.

ClassesCharactersEncodingCount
BASE0 1 2 3 4 5 6 7 8 9 ᨠ ᨡ ᨢ ᨣ ᨤ ᨥ ᨦ ᨧ ᨨ ᨩ ᨪ ᨫ ᨬ ᨭ ᨮ ᨯ ᨰ ᨱ ᨲ ᨳ ᨴ ᨵ ᨶ ᨷ ᨸ ᨹ ᨺ ᨻ ᨼ ᨽ ᨾ ᨿ ᩀ ᩁ ᩂ ᩃ ᩄ ᩅ ᩆ ᩇ ᩈ ᩉ ᩊ ᩋ ᩌ ᩍ ᩎ ᩏ ᩐ ᩑ ᩒ ᩓ ᩔ ᪀ ᪁ ᪂ ᪃ ᪄ ᪅ ᪆ ᪇ ᪈ ᪉ ᪐ ᪑ ᪒ ᪓ ᪔ ᪕ ᪖ ᪗ ᪘ ᪙ ◌ [U+0030..U+0039, U+1A20..U+1A54, U+1A80..U+1A89, U+1A90..U+1A99, U+25CC]1
Repeating group0 or more
  Alternative 1
  HALANT◌᩠ U+1A601
BASE0 1 2 3 4 5 6 7 8 9 ᨠ ᨡ ᨢ ᨣ ᨤ ᨥ ᨦ ᨧ ᨨ ᨩ ᨪ ᨫ ᨬ ᨭ ᨮ ᨯ ᨰ ᨱ ᨲ ᨳ ᨴ ᨵ ᨶ ᨷ ᨸ ᨹ ᨺ ᨻ ᨼ ᨽ ᨾ ᨿ ᩀ ᩁ ᩂ ᩃ ᩄ ᩅ ᩆ ᩇ ᩈ ᩉ ᩊ ᩋ ᩌ ᩍ ᩎ ᩏ ᩐ ᩑ ᩒ ᩓ ᩔ ᪀ ᪁ ᪂ ᪃ ᪄ ᪅ ᪆ ᪇ ᪈ ᪉ ᪐ ᪑ ᪒ ᪓ ᪔ ᪕ ᪖ ᪗ ᪘ ᪙ ◌ [U+0030..U+0039, U+1A20..U+1A54, U+1A80..U+1A89, U+1A90..U+1A99, U+25CC]1
Alternative 2
  CONS_SUB◌ᩗ ◌ᩛ ◌ᩜ ◌ᩝ ◌ᩞ [U+1A57, U+1A5B..U+1A5E]1
Alternative 1
  HALANT◌᩠ U+1A601
Alternative 2
  CONS_MED_PRE◌ᩕ U+1A550 or 1
CONS_MED_BELOW◌ᩖ U+1A560 or 1
VOWEL_PRE◌ᩮ ◌ᩯ ◌ᩰ ◌ᩱ ◌ᩲ [U+1A6E..U+1A72]0 or more
VOWEL_ABOVE◌ᩢ ◌ᩥ ◌ᩦ ◌ᩧ ◌ᩨ ◌ᩫ ◌ᩳ ◌᩺ [U+1A62, U+1A65..U+1A68, U+1A6B, U+1A73, U+1A7A]0 or more
VOWEL_BELOW◌ᩩ ◌ᩪ ◌ᩬ [U+1A69..U+1A6A, U+1A6C]0 or more
VOWEL_POST◌ᩡ ◌ᩣ ◌ᩤ ◌ᩭ [U+1A61, U+1A63..U+1A64, U+1A6D]0 or more
VOWEL_MOD_ABOVE◌ᩴ ◌᩵ ◌᩶ ◌᩷ ◌᩸ ◌᩹ ◌᩻ ◌᩼ [U+1A74..U+1A79, U+1A7B..U+1A7C]0 or more
VOWEL_MOD_BELOW◌᩿ U+1A7F0 or more
CONS_FINAL_ABOVE◌ᩘ ◌ᩙ [U+1A58..U+1A59]0 or more

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
! " ( ) ? ◌ᩚ ᪠ ᪡ ᪢ ᪣ ᪤ ᪥ ᪦ ᪧ ᪨ ᪩ ᪪ ᪫ ᪬ ᪭ “ ” [U+0021..U+0022, U+0028..U+0029, U+003F, U+1A5A, U+1AA0..U+1AAD, U+201C..U+201D]

Known bugs:

Tai Viet

ClassesCharactersEncodingCount
BASE, BASE_OTHER◌ ꪀ ꪁ ꪂ ꪃ ꪄ ꪅ ꪆ ꪇ ꪈ ꪉ ꪊ ꪋ ꪌ ꪍ ꪎ ꪏ ꪐ ꪑ ꪒ ꪓ ꪔ ꪕ ꪖ ꪗ ꪘ ꪙ ꪚ ꪛ ꪜ ꪝ ꪞ ꪟ ꪠ ꪡ ꪢ ꪣ ꪤ ꪥ ꪦ ꪧ ꪨ ꪩ ꪪ ꪫ ꪬ ꪭ ꪮ ꪯ ꪱ ꪵ ꪶ ꪹ ꪺ ꪻ ꪼ ꪽ ꫀ ꫂ   [U+25CC, U+AA80..U+AAAF, U+AAB1, U+AAB5..U+AAB6, U+AAB9..U+AABD, U+AAC0, U+AAC2, U+00A0]1
VOWEL_ABOVE◌ꪰ ◌ꪲ ◌ꪳ ◌ꪷ ◌ꪸ ◌ꪾ [U+AAB0, U+AAB2..U+AAB3, U+AAB7..U+AAB8, U+AABE]0 or more
VOWEL_BELOW◌ꪴ U+AAB40 or more
VOWEL_MOD_ABOVE◌꪿ ◌꫁ [U+AABF, U+AAC1]0 or more

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
ꫛ ꫜ ꫝ ꫞ ꫟ [U+0020, U+AADB..U+AADF]

Takri

ClassesCharactersEncodingCount
BASE◌ 𑚀 𑚁 𑚂 𑚃 𑚄 𑚅 𑚆 𑚇 𑚈 𑚉 𑚊 𑚋 𑚌 𑚍 𑚎 𑚏 𑚐 𑚑 𑚒 𑚓 𑚔 𑚕 𑚖 𑚗 𑚘 𑚙 𑚚 𑚛 𑚜 𑚝 𑚞 𑚟 𑚠 𑚡 𑚢 𑚣 𑚤 𑚥 𑚦 𑚧 𑚨 𑚩 𑚪 𑚸 𑛀 𑛁 𑛂 𑛃 𑛄 𑛅 𑛆 𑛇 𑛈 𑛉 [U+25CC, U+11680..U+116AA, U+116B8, U+116C0..U+116C9]1
CONS_MOD_BELOW◌𑚷 U+116B70 or more
Repeating group0 or more
  HALANT◌𑚶 U+116B61
BASE◌ 𑚀 𑚁 𑚂 𑚃 𑚄 𑚅 𑚆 𑚇 𑚈 𑚉 𑚊 𑚋 𑚌 𑚍 𑚎 𑚏 𑚐 𑚑 𑚒 𑚓 𑚔 𑚕 𑚖 𑚗 𑚘 𑚙 𑚚 𑚛 𑚜 𑚝 𑚞 𑚟 𑚠 𑚡 𑚢 𑚣 𑚤 𑚥 𑚦 𑚧 𑚨 𑚩 𑚪 𑚸 𑛀 𑛁 𑛂 𑛃 𑛄 𑛅 𑛆 𑛇 𑛈 𑛉 [U+25CC, U+11680..U+116AA, U+116B8, U+116C0..U+116C9]1
CONS_MOD_BELOW◌𑚷 U+116B70 or more
Alternative 1
  HALANT◌𑚶 U+116B61
Alternative 2
  VOWEL_PRE◌𑚮 U+116AE0 or more
VOWEL_ABOVE◌𑚭 ◌𑚲 ◌𑚳 ◌𑚴 ◌𑚵 [U+116AD, U+116B2..U+116B5]0 or more
VOWEL_BELOW◌𑚰 ◌𑚱 [U+116B0..U+116B1]0 or more
VOWEL_POST◌𑚯 U+116AF0 or more
VOWEL_MOD_ABOVE◌𑚫 U+116AB0 or more
VOWEL_MOD_POST◌𑚬 U+116AC0 or more

The USE does not allow the following character sequences because there are equivalent individual characters:

Character sequence | Encoding
𑚀 ◌𑚭 | <U+11680, U+116AD>
𑚀 ◌𑚴 | <U+11680, U+116B4>
𑚀 ◌𑚵 | <U+11680, U+116B5>
𑚆 ◌𑚲 | <U+11686, U+116B2>

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
। ॥ ꠰ ꠱ ꠲ ꠳ ꠴ ꠵ ꠶ ꠷ ꠸ ꠹ 𑚹 [U+0964..U+0965, U+A830..U+A839, U+116B9]
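
The encoding columns in these tables use a compact notation: single code points such as U+116B9, and ranges such as U+A830..U+A839, optionally wrapped in brackets. To use the tables programmatically, the notation can be expanded into explicit code points. The Python sketch below shows one way to do so. The function name `parse_encoding` is hypothetical, and the sketch assumes that the encoding cell has been isolated from the neighboring count column.

```python
import re

# One code point, optionally followed by "..U+XXXX" to form an inclusive range.
ITEM = re.compile(r"U\+([0-9A-F]{4,6})(?:\.\.U\+([0-9A-F]{4,6}))?")


def parse_encoding(spec):
    """Expand e.g. '[U+0964..U+0965, U+116B9]' into a sorted list of code points."""
    points = set()
    for first, last in ITEM.findall(spec):
        lo = int(first, 16)
        hi = int(last, 16) if last else lo
        points.update(range(lo, hi + 1))
    return sorted(points)
```

Applied to the Takri entry above, `parse_encoding("[U+0964..U+0965, U+A830..U+A839, U+116B9]")` expands to 13 code points: two from the first range, ten from the second, and the single U+116B9.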

Tibetan

The USE decomposes the following multi-part vowels, which therefore don’t show up in the encoding order:

Composed Character | Composed Encoding | Decomposed Characters | Decomposed Encoding
◌ཱི | U+0F73 | ◌ཱ ◌ི | <U+0F71, U+0F72>
◌ཱུ | U+0F75 | ◌ཱ ◌ུ | <U+0F71, U+0F74>
◌ྲྀ | U+0F76 | ◌ྲ ◌ྀ | <U+0FB2, U+0F80>
◌ླྀ | U+0F78 | ◌ླ ◌ྀ | <U+0FB3, U+0F80>
◌ཱྀ | U+0F81 | ◌ཱ ◌ྀ | <U+0F71, U+0F80>

Encoding order:

ClassesCharactersEncodingCount
BASE, BASE_OTHERༀ ༁ ༄ ༅ ༆ ༠ ༡ ༢ ༣ ༤ ༥ ༦ ༧ ༨ ༩ ༪ ༫ ༬ ༭ ༮ ༯ ༰ ༱ ༲ ༳ ཀ ཁ ག གྷ ང ཅ ཆ ཇ ཉ ཊ ཋ ཌ ཌྷ ཎ ཏ ཐ ད དྷ ན པ ཕ བ བྷ མ ཙ ཚ ཛ ཛྷ ཝ ཞ ཟ འ ཡ ར ལ ཤ ཥ ས ཧ ཨ ཀྵ ཪ ཫ ཬ ྈ ྉ ྊ ྋ ྌ ◌   [U+0F00..U+0F01, U+0F04..U+0F06, U+0F20..U+0F33, U+0F40..U+0F47, U+0F49..U+0F6C, U+0F88..U+0F8C, U+25CC, U+00A0]1
CONS_MOD_ABOVE◌༹ U+0F390 or more
CONS_MOD_BELOW◌ཱ U+0F710 or more
Repeating group0 or more
  CONS_SUB◌ྍ ◌ྎ ◌ྏ ◌ྐ ◌ྑ ◌ྒ ◌ྒྷ ◌ྔ ◌ྕ ◌ྖ ◌ྗ ◌ྙ ◌ྚ ◌ྛ ◌ྜ ◌ྜྷ ◌ྞ ◌ྟ ◌ྠ ◌ྡ ◌ྡྷ ◌ྣ ◌ྤ ◌ྥ ◌ྦ ◌ྦྷ ◌ྨ ◌ྩ ◌ྪ ◌ྫ ◌ྫྷ ◌ྭ ◌ྮ ◌ྯ ◌ྰ ◌ྱ ◌ྲ ◌ླ ◌ྴ ◌ྵ ◌ྶ ◌ྷ ◌ྸ ◌ྐྵ ◌ྺ ◌ྻ ◌ྼ [U+0F8D..U+0F97, U+0F99..U+0FBC]1
CONS_MOD_ABOVE◌༹ U+0F390 or more
CONS_MOD_BELOW◌ཱ U+0F710 or more
VOWEL_ABOVE◌ུ ◌ཷ ◌ཹ [U+0F74, U+0F77, U+0F79]0 or more
VOWEL_BELOW◌ི ◌ེ ◌ཻ ◌ོ ◌ཽ ◌ྀ ◌྄ [U+0F72, U+0F7A..U+0F7D, U+0F80, U+0F84]0 or more
VOWEL_MOD_ABOVE◌ཾ ◌ྂ ◌ྃ ◌྆ ◌྇ [U+0F7E, U+0F82..U+0F83, U+0F86..U+0F87]0 or more
CONS_FINAL_MOD◌༵ ◌༷ ◌࿆ [U+0F35, U+0F37, U+0FC6]0 or 1

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
༂ ༃ ༇ ༈ ༉ ༊ ་ ༌ ། ༎ ༏ ༐ ༑ ༒ ༓ ༔ ༕ ༖ ༗ ◌༘ ◌༙ ༚ ༛ ༜ ༝ ༞ ༟ ༴ ༶ ༸ ༺ ༻ ༼ ༽ ◌༾ ◌༿ ◌ཿ ྅ ྾ ྿ ࿀ ࿁ ࿂ ࿃ ࿄ ࿅ ࿇ ࿈ ࿉ ࿊ ࿋ ࿌ ࿎ ࿏ ࿐ ࿑ ࿒ ࿓ ࿔ ࿕ ࿖ ࿗ ࿘ ࿙ ࿚ ☸ [U+0F02..U+0F03, U+0F07..U+0F1F, U+0F34, U+0F36, U+0F38, U+0F3A..U+0F3F, U+0F7F, U+0F85, U+0FBE..U+0FC5, U+0FC7..U+0FCC, U+0FCE..U+0FDA, U+2638]

Known bugs:

Tirhuta

The USE decomposes the following multi-part vowels, which therefore don’t show up in the encoding order:

Composed Character | Composed Encoding | Decomposed Characters | Decomposed Encoding
◌𑒻 | U+114BB | ◌𑒹 ◌𑒺 | <U+114B9, U+114BA>
◌𑒼 | U+114BC | ◌𑒹 ◌𑒰 | <U+114B9, U+114B0>
◌𑒾 | U+114BE | ◌𑒹 ◌𑒽 | <U+114B9, U+114BD>

Encoding order:

ClassesCharactersEncodingCount
BASE◌ 𑒁 𑒂 𑒃 𑒄 𑒅 𑒆 𑒇 𑒈 𑒉 𑒊 𑒋 𑒌 𑒍 𑒎 𑒏 𑒐 𑒑 𑒒 𑒓 𑒔 𑒕 𑒖 𑒗 𑒘 𑒙 𑒚 𑒛 𑒜 𑒝 𑒞 𑒟 𑒠 𑒡 𑒢 𑒣 𑒤 𑒥 𑒦 𑒧 𑒨 𑒩 𑒪 𑒫 𑒬 𑒭 𑒮 𑒯 𑓄 𑓐 𑓑 𑓒 𑓓 𑓔 𑓕 𑓖 𑓗 𑓘 𑓙 [U+25CC, U+11481..U+114AF, U+114C4, U+114D0..U+114D9]1
CONS_MOD_BELOW◌𑓃 U+114C30 or more
Repeating group0 or more
  HALANT◌𑓂 U+114C21
BASE◌ 𑒁 𑒂 𑒃 𑒄 𑒅 𑒆 𑒇 𑒈 𑒉 𑒊 𑒋 𑒌 𑒍 𑒎 𑒏 𑒐 𑒑 𑒒 𑒓 𑒔 𑒕 𑒖 𑒗 𑒘 𑒙 𑒚 𑒛 𑒜 𑒝 𑒞 𑒟 𑒠 𑒡 𑒢 𑒣 𑒤 𑒥 𑒦 𑒧 𑒨 𑒩 𑒪 𑒫 𑒬 𑒭 𑒮 𑒯 𑓄 𑓐 𑓑 𑓒 𑓓 𑓔 𑓕 𑓖 𑓗 𑓘 𑓙 [U+25CC, U+11481..U+114AF, U+114C4, U+114D0..U+114D9]1
CONS_MOD_BELOW◌𑓃 U+114C30 or more
Alternative 1
  HALANT◌𑓂 U+114C21
Alternative 2
  VOWEL_PRE◌𑒱 ◌𑒹 [U+114B1, U+114B9]0 or more
VOWEL_ABOVE◌𑒺 U+114BA0 or more
VOWEL_BELOW◌𑒳 ◌𑒴 ◌𑒵 ◌𑒶 ◌𑒷 ◌𑒸 [U+114B3..U+114B8]0 or more
VOWEL_POST◌𑒰 ◌𑒲 ◌𑒽 [U+114B0, U+114B2, U+114BD]0 or more
VOWEL_MOD_ABOVE◌॑ ◌𑒿 ◌𑓀 [U+0951, U+114BF..U+114C0]0 or more
VOWEL_MOD_BELOW◌॒ U+09520 or more
VOWEL_MOD_POST◌𑓁 U+114C10 or more

The USE does not allow the following character sequences because there are equivalent individual characters:

Character sequence | Encoding
𑒁 ◌𑒰 | <U+11481, U+114B0>
𑒋 ◌𑒺 | <U+1148B, U+114BA>
𑒍 ◌𑒺 | <U+1148D, U+114BA>
𑒪 ◌𑒵 | <U+114AA, U+114B5>
𑒪 ◌𑒶 | <U+114AA, U+114B6>

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
। ॥ ৴ ᳲ ꠰ ꠱ ꠲ ꠳ ꠴ ꠵ ꠶ ꠷ ꠸ ꠹ 𑒀 𑓅 𑓆 𑓇 [U+0964..U+0965, U+09F4, U+1CF2, U+A830..U+A839, U+11480, U+114C5..U+114C7]

Zanabazar Square

ClassesCharactersEncodingCount
REPHA𑨺◌ U+11A3A0 or 1
BASE◌ 𑨀 𑨋 𑨌 𑨍 𑨎 𑨏 𑨐 𑨑 𑨒 𑨓 𑨔 𑨕 𑨖 𑨗 𑨘 𑨙 𑨚 𑨛 𑨜 𑨝 𑨞 𑨟 𑨠 𑨡 𑨢 𑨣 𑨤 𑨥 𑨦 𑨧 𑨨 𑨩 𑨪 𑨫 𑨬 𑨭 𑨮 𑨯 𑨰 𑨱 𑨲 [U+25CC, U+11A00, U+11A0B..U+11A32]1
Repeating group0 or more
  HALANT◌𑩇 U+11A471
BASE◌ 𑨀 𑨋 𑨌 𑨍 𑨎 𑨏 𑨐 𑨑 𑨒 𑨓 𑨔 𑨕 𑨖 𑨗 𑨘 𑨙 𑨚 𑨛 𑨜 𑨝 𑨞 𑨟 𑨠 𑨡 𑨢 𑨣 𑨤 𑨥 𑨦 𑨧 𑨨 𑨩 𑨪 𑨫 𑨬 𑨭 𑨮 𑨯 𑨰 𑨱 𑨲 [U+25CC, U+11A00, U+11A0B..U+11A32]1
Alternative 1
  HALANT◌𑩇 U+11A471
Alternative 2
  CONS_MED_BELOW◌𑨻 ◌𑨼 ◌𑨽 ◌𑨾 [U+11A3B..U+11A3E]0 or 1
VOWEL_ABOVE◌𑨁 ◌𑨄 ◌𑨅 ◌𑨆 ◌𑨇 ◌𑨈 ◌𑨉 [U+11A01, U+11A04..U+11A09]0 or more
VOWEL_BELOW◌𑨂 ◌𑨃 ◌𑨊 ◌𑨴 [U+11A02..U+11A03, U+11A0A, U+11A34]0 or more
VOWEL_MOD_ABOVE◌𑨵 ◌𑨶 ◌𑨷 ◌𑨸 [U+11A35..U+11A38]0 or more
VOWEL_MOD_POST◌𑨹 U+11A390 or more
CONS_FINAL_MOD◌𑨳 U+11A330 or 1

The following characters are used with the script but can not be part of clusters:

CharactersEncoding
𑨿 𑩀 𑩁 𑩂 𑩃 𑩄 𑩅 𑩆 [U+11A3F..U+11A46]

Known bugs:
