REFERENCE · WRITING SYSTEMS

Scripts in Unicode

Unicode 16.0 names 168 scripts, partitioning every codepoint into a writing-system identity.

A script in Unicode is a writing system — Latin, Cyrillic, Devanagari, Han — and it is a property of the codepoint itself, not of the page or paragraph it appears in. The Script property is defined by UAX #24 and uses the four-letter ISO 15924 codes (Latn, Cyrl, Deva, Hani) that you also see in BCP 47 language tags like zh-Hant or sr-Cyrl.

Scripts are easy to confuse with blocks, but they are different things. A block is a contiguous range of codepoints; a script is a logical identity. The Latin script alone is scattered across more than a dozen blocks — Basic Latin, Latin-1 Supplement, Latin Extended-A, Latin Extended-B, Latin Extended Additional, IPA Extensions, Phonetic Extensions, Latin Extended-C, D, E, F, G — because each round of additions for new languages, scholars, or specialists got a fresh range. There is also no requirement that a block belong to exactly one script: the General Punctuation block (U+2000–U+206F) is shared across every script, with the Common script value Zyyy.

Beyond the named scripts, three reserved values handle special cases: Common (shared punctuation, symbols, digits), Inherited (combining marks that inherit the script of the base they attach to), and Unknown for unassigned and private-use codepoints. These three account for about a quarter of all assigned codepoints. Together with the 168 named scripts, every codepoint has exactly one script value.

Modern scripts — in regular contemporary use
Around fifty writing systems serve living languages used in print, signage and digital text today.
ISO 15924NameStatusNotes
LatnLatinModernThe most widespread writing system, used by ~70% of all written languages. Spans Basic Latin through Latin Extended-G.
CyrlCyrillicModernUsed for Russian, Ukrainian, Bulgarian, Serbian, Mongolian and others. Lives in U+0400–04FF and Cyrillic Supplement.
GrekGreekModernContinuous use since the 8th century BCE. Greek and Coptic block plus Greek Extended.
HebrHebrewModernRight-to-left abjad used for Hebrew, Yiddish and Judeo-Arabic. U+0590–05FF.
ArabArabicModernSecond-most widespread script. Right-to-left, complex contextual shaping. Hosts Arabic, Persian, Urdu, Pashto.
DevaDevanagariModernUsed for Hindi, Marathi, Nepali, Sanskrit. The most-used Brahmic script.
BengBengaliModernUsed for Bengali and Assamese in South Asia.
GuruGurmukhiModernThe standard script for Punjabi.
GujrGujaratiModernUsed for Gujarati and Kachchi.
OryaOriya (Odia)ModernUsed for Odia in eastern India.
TamlTamilModernUsed in Tamil Nadu, Sri Lanka and the Tamil diaspora.
TeluTeluguModernUsed for Telugu, primarily in Andhra Pradesh and Telangana.
KndaKannadaModernUsed for Kannada in Karnataka.
MlymMalayalamModernUsed for Malayalam in Kerala.
SinhSinhalaModernUsed for Sinhala in Sri Lanka.
ThaiThaiModernUsed for Thai. Notably no inter-word spacing.
LaooLaoModernUsed for Lao. Closely related to Thai script.
MymrMyanmar (Burmese)ModernUsed for Burmese, Shan, Mon and several minority languages.
KhmrKhmerModernThe script of the Khmer language.
TibtTibetanModernUsed for Tibetan, Dzongkha and Ladakhi.
GeorGeorgianModernThe Georgian Mkhedruli script. Asomtavruli and Nuskhuri are encoded as case variants.
ArmnArmenianModernUsed for Armenian since the early 5th century.
HangHangulModernThe Korean script. 11,172 precomposed syllable blocks at U+AC00–D7A3.
HiraHiraganaModernOne of the two Japanese kana. Hiragana block U+3040–U+309F.
KanaKatakanaModernThe angular Japanese kana, used for loanwords and emphasis.
HaniHan (CJK Unified)ModernThe single largest script in Unicode — 97,000+ ideographs across the CJK blocks and extensions.
BopoBopomofoModernUsed to teach Mandarin pronunciation in Taiwan.
YiiiYiModernThe standardised Yi syllabary used in Sichuan, China.
MongMongolianModernTraditional Mongolian script, written vertically.
TfngTifinaghModernThe script of the Berber/Amazigh languages of North Africa.
EthiEthiopic (Geʽez)ModernUsed for Amharic, Tigrinya, Geʽez and other Ethio-Semitic languages.
CherCherokeeModernSequoyah's 1821 syllabary. Now has both cased forms.
CansCanadian Aboriginal SyllabicsModernA unified encoding of Cree, Inuktitut, Ojibwe and related syllabaries.
AdlmAdlamModernA 1989 script for the Fula language of West Africa. Added in Unicode 9.0.
OlckOl ChikiModernThe script for the Santali language of South Asia.
VaiiVaiModernA syllabary for the Vai language of Liberia.
NkooN'KoModernA right-to-left script for Manding languages, designed in 1949.
ThaaThaanaModernThe right-to-left script for Dhivehi in the Maldives.
JavaJavaneseModernUsed for Javanese; in cultural and educational use.
BaliBalineseModernUsed for Balinese in religious and cultural contexts.
SundSundaneseModernUsed for Sundanese in West Java.
BatkBatakModernUsed in cultural revival for the Batak languages of Sumatra.
BugiBugineseModernThe Lontara script of Sulawesi.
TagbTagbanwaModernUsed by the Tagbanwa people of Palawan, Philippines.
TglgTagalog (Baybayin)ModernIndigenous Philippine script, now in revival.
HanoHanunooModernUsed by the Hanunoo of Mindoro, Philippines.
BuhdBuhidModernUsed by the Buhid people of Mindoro.
LisuLisu (Fraser)ModernThe Fraser alphabet for the Lisu language.
MlymSaurashtraModernUsed for Saurashtra in southern India.
KhqaKhoja, Khojki and othersModernSeveral regional South Asian scripts continue in religious or community use.
Historic scripts — preserved for scholarship
Encoded so that epigraphers, historians and digital archives can quote inscriptions in plain text.
ISO 15924NameStatusNotes
EgypEgyptian HieroglyphsHistoric1,071 signs at U+13000–U+1342F, plus the Format Controls block for cartouche shaping.
CprtCypriot SyllabaryHistoricUsed for Arcadocypriot Greek and Eteocypriot, c. 1500–300 BCE.
LinbLinear BHistoricMycenaean Greek syllabary, deciphered 1952. U+10000–U+1007F.
LinaLinear AHistoricThe Minoan script of Crete. Still largely undeciphered.
PhnxPhoenicianHistoricThe 22-letter ancestor of every Mediterranean alphabet, c. 1050 BCE.
LyciLycianHistoricUsed in southwestern Anatolia, c. 6th–4th centuries BCE.
LydiLydianHistoricUsed in western Anatolia, c. 7th–3rd centuries BCE.
CariCarianHistoricUsed in southwestern Anatolia, related to the Greek alphabet.
GothGothicHistoricWulfila's 4th-century alphabet for the Gothic Bible translation.
RunrRunicHistoricThe Elder, Younger and Anglo-Saxon Futhark, plus medieval extensions.
OgamOghamHistoricThe notched Irish script of stone inscriptions, c. 4th–10th centuries.
XsuxCuneiformHistoric1,236 signs for Sumerian and Akkadian, plus separate blocks for numbers and Early Dynastic.
UgarUgariticHistoricLate Bronze Age cuneiform alphabet, c. 14th century BCE.
AvstAvestanHistoricUsed for the Avesta, the scripture of Zoroastrianism.
PhliInscriptional PahlaviHistoricThe monumental Middle Persian script.
PhlpPsalter PahlaviHistoricA Middle Persian book hand attested in a single Psalter manuscript.
PhlvBook PahlaviHistoricThe cursive Middle Persian script used for Zoroastrian scripture.
PrtiInscriptional ParthianHistoricThe script of the Parthian Empire, attested in royal inscriptions.
BrahBrahmiHistoricThe ancestor of every Brahmic script in South and Southeast Asia.
KharKharoshthiHistoricUsed in Gandhara and Central Asia, c. 3rd century BCE – 3rd century CE.
Liturgical & scholarly
Scripts still used in religious, ceremonial or specialist contexts.
ISO 15924NameStatusNotes
CoptCopticLiturgicalUsed by the Coptic Orthodox Church. Disunified from Greek in Unicode 4.1.
SyrcSyriacLiturgicalThe script of Eastern Christianity — Estrangela, Serto and Madnhaya styles all encoded.
MandMandaicLiturgicalThe script of the Mandaean religion.
SamrSamaritanLiturgicalUsed by the Samaritan community for the Samaritan Pentateuch.
LatgLatin — Gaelic styleStylisticNot a separate Unicode script; recorded as an ISO 15924 alias for Irish typographic tradition.
LatfLatin — Fraktur styleStylisticLikewise not separately encoded; Fraktur glyphs are font-level, not codepoint-level.
HluwAnatolian HieroglyphsHistoric / scholarlyThe Luwian script of the Hittite empire.
TnsaTangsaModern / scholarlyAdded in Unicode 14.0 for the Tangsa community of Northeast India.
Constructed scripts
Invented writing systems that meet Unicode's encoding criteria; others live in the Private Use Areas.
ISO 15924NameStatusNotes
ShawShavianConstructedGeorge Bernard Shaw's 1958 phonetic alphabet for English.
DsrtDeseretConstructedA 19th-century LDS-Church phonetic alphabet for English.
MeroMeroitic HieroglyphsHistoric / constructedPlus a separate Merc code for Meroitic Cursive.
DuplDuployan shorthandConstructedThe Duployan stenographic system, including its Chinook adaptation.
Tengwar (Tolkien)PUA onlyNot officially encoded. Allocated by the ConScript Unicode Registry at U+E000+.
Klingon (pIqaD)PUA onlyRejected for formal encoding in 2001; ConScript registers it at U+F8D0–U+F8FF.
Aurebesh, Hylian, etc.PUA onlyFictional scripts from games and films, where assigned at all, live in Private Use ranges.

Related