
Scripts in Unicode

Unicode 16.0 names 168 scripts and gives every codepoint exactly one writing-system identity.

A script in Unicode is a writing system — Latin, Cyrillic, Devanagari, Han — and it is a property of the codepoint itself, not of the page or paragraph it appears in. The Script property is defined by UAX #24 and uses the four-letter ISO 15924 codes (Latn, Cyrl, Deva, Hani) that you also see in BCP 47 language tags like zh-Hant or sr-Cyrl.
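The script subtag in a language tag can be picked out mechanically. The following is a minimal sketch, assuming a well-formed tag that starts with an ordinary language subtag (private-use and grandfathered tags are not handled); the function name `script_subtag` is hypothetical and not part of any library.

```python
from typing import Optional


def script_subtag(tag: str) -> Optional[str]:
    """Return the script subtag of a BCP 47 tag (title-cased), or None."""
    subtags = tag.split("-")
    for sub in subtags[1:]:          # subtags[0] is the primary language
        if not (sub.isascii() and sub.isalpha()):
            return None              # digits (e.g. a numeric region) end the search
        if len(sub) == 4:
            return sub.title()       # four letters in this position: the script
        if len(sub) != 3:
            return None              # region, variant or singleton: no script present
        # a three-letter subtag here is an extended language; keep looking
    return None


for tag in ("zh-Hant", "sr-Cyrl", "zh-cmn-Hans-CN", "en-US"):
    print(tag, "->", script_subtag(tag))
```

A tag such as en-US carries no script subtag at all, so the sketch returns None rather than guessing Latn.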

Scripts are easy to confuse with blocks, but they are different things. A block is a contiguous range of codepoints; a script is a logical identity. The Latin script alone is scattered across more than a dozen blocks (Basic Latin, Latin-1 Supplement, Latin Extended-A, Latin Extended-B, Latin Extended Additional, IPA Extensions, Phonetic Extensions, and Latin Extended-C through Extended-G, among others) because each round of additions for new languages, scholars, or specialists got a fresh range. Nor must a block belong to exactly one script: the General Punctuation block (U+2000–U+206F) is shared across every script, and almost all of its characters carry the Common script value, Zyyy.
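The difference shows up as soon as both properties are looked up for the same character. Python's standard library does not expose either property, so the sketch below uses two small hand-copied range tables as stand-ins; real code would load the complete Blocks.txt and Scripts.txt data files from the Unicode Character Database. The names `BLOCKS`, `SCRIPTS`, `lookup` and `describe` are illustrative only.

```python
from bisect import bisect_right

# Illustrative excerpts only, sorted by range start. Not the full data.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0100, 0x017F, "Latin Extended-A"),
    (0x0370, 0x03FF, "Greek and Coptic"),
    (0x1E00, 0x1EFF, "Latin Extended Additional"),
    (0x2000, 0x206F, "General Punctuation"),
]
SCRIPTS = [
    (0x0041, 0x005A, "Latn"),
    (0x0061, 0x007A, "Latn"),
    (0x00F8, 0x02B8, "Latn"),
    (0x03A3, 0x03E1, "Grek"),
    (0x03E2, 0x03EF, "Copt"),
    (0x1E00, 0x1EFF, "Latn"),
    (0x2010, 0x2027, "Zyyy"),
]


def lookup(table, cp):
    """Return the label of the range containing cp, or None if no range does."""
    starts = [start for start, _end, _label in table]
    i = bisect_right(starts, cp) - 1
    if i >= 0 and cp <= table[i][1]:
        return table[i][2]
    return None


def describe(ch):
    """Return (block, script) for one character, using the excerpt tables."""
    cp = ord(ch)
    return lookup(BLOCKS, cp), lookup(SCRIPTS, cp)


for ch in ("A", "\u0113", "\u1ea1", "\u03a9", "\u03e2", "\u2026"):
    block, script = describe(ch)
    print(f"U+{ord(ch):04X}  {block:<26} {script}")
```

The first three samples sit in three different blocks yet share the Latn script, while U+03A9 and U+03E2 share the Greek and Coptic block but belong to different scripts.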

Beyond the named scripts, three reserved values handle special cases: Common (Zyyy) for shared punctuation, symbols and digits; Inherited (Zinh) for combining marks that take on the script of the base they attach to; and Unknown (Zzzz) for unassigned and private-use codepoints. These three account for about a quarter of all assigned codepoints. Between them and the 168 named scripts, every codepoint has exactly one script value.
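Inheritance can be resolved in a single left-to-right pass. The sketch below is a simplified illustration: `toy_script` is a hypothetical stand-in that knows only a few ranges (the Combining Diacritical Marks block as Inherited, ASCII letters as Latin, the basic Russian letters as Cyrillic) and returns None for everything else. A real implementation would consult the full Scripts.txt data.

```python
def toy_script(ch):
    """Script code for a handful of ranges; None means 'not in this toy table'."""
    cp = ord(ch)
    if 0x0300 <= cp <= 0x036F:                            # Combining Diacritical Marks
        return "Zinh"
    if 0x0041 <= cp <= 0x005A or 0x0061 <= cp <= 0x007A:  # A-Z, a-z
        return "Latn"
    if 0x0410 <= cp <= 0x044F:                            # basic Russian letters
        return "Cyrl"
    return None


def resolve_inherited(text):
    """Replace each Inherited value with the script of the preceding character."""
    resolved = []
    for ch in text:
        script = toy_script(ch)
        if script == "Zinh" and resolved:
            script = resolved[-1]      # take the script of whatever came before
        resolved.append(script)
    return resolved


print(resolve_inherited("e\u0301"))       # Latin e + combining acute
print(resolve_inherited("\u0438\u0306"))  # Cyrillic i + combining breve
```

The same combining mark therefore resolves to Latn after a Latin base and to Cyrl after a Cyrillic one. A mark with nothing before it stays Zinh in this sketch.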

Modern scripts — in regular contemporary use

Around fifty writing systems serve living languages used in print, signage and digital text today.

ISO 15924	Name	Status	Notes
Latn	Latin	Modern	The most widespread writing system, used by ~70% of all written languages. Spans Basic Latin through Latin Extended-G.
Cyrl	Cyrillic	Modern	Used for Russian, Ukrainian, Bulgarian, Serbian, Mongolian and others. Lives in U+0400–04FF and Cyrillic Supplement.
Grek	Greek	Modern	Continuous use since the 8th century BCE. Greek and Coptic block plus Greek Extended.
Hebr	Hebrew	Modern	Right-to-left abjad used for Hebrew, Yiddish and Judeo-Arabic. U+0590–05FF.
Arab	Arabic	Modern	Second-most widespread script. Right-to-left, complex contextual shaping. Hosts Arabic, Persian, Urdu, Pashto.
Deva	Devanagari	Modern	Used for Hindi, Marathi, Nepali, Sanskrit. The most-used Brahmic script.
Beng	Bengali	Modern	Used for Bengali and Assamese in South Asia.
Guru	Gurmukhi	Modern	The standard script for Punjabi.
Gujr	Gujarati	Modern	Used for Gujarati and Kachchi.
Orya	Oriya (Odia)	Modern	Used for Odia in eastern India.
Taml	Tamil	Modern	Used in Tamil Nadu, Sri Lanka and the Tamil diaspora.
Telu	Telugu	Modern	Used for Telugu, primarily in Andhra Pradesh and Telangana.
Knda	Kannada	Modern	Used for Kannada in Karnataka.
Mlym	Malayalam	Modern	Used for Malayalam in Kerala.
Sinh	Sinhala	Modern	Used for Sinhala in Sri Lanka.
Thai	Thai	Modern	Used for Thai. Notably no inter-word spacing.
Laoo	Lao	Modern	Used for Lao. Closely related to Thai script.
Mymr	Myanmar (Burmese)	Modern	Used for Burmese, Shan, Mon and several minority languages.
Khmr	Khmer	Modern	The script of the Khmer language.
Tibt	Tibetan	Modern	Used for Tibetan, Dzongkha and Ladakhi.
Geor	Georgian	Modern	The Georgian Mkhedruli script. Asomtavruli and Nuskhuri are encoded as case variants.
Armn	Armenian	Modern	Used for Armenian since the early 5th century.
Hang	Hangul	Modern	The Korean script. 11,172 precomposed syllable blocks at U+AC00–D7A3.
Hira	Hiragana	Modern	One of the two Japanese kana. Hiragana block U+3040–U+309F.
Kana	Katakana	Modern	The angular Japanese kana, used for loanwords and emphasis.
Hani	Han (CJK Unified)	Modern	The single largest script in Unicode — 97,000+ ideographs across the CJK blocks and extensions.
Bopo	Bopomofo	Modern	Used to teach Mandarin pronunciation in Taiwan.
Yiii	Yi	Modern	The standardised Yi syllabary used in Sichuan, China.
Mong	Mongolian	Modern	Traditional Mongolian script, written vertically.
Tfng	Tifinagh	Modern	The script of the Berber/Amazigh languages of North Africa.
Ethi	Ethiopic (Geʽez)	Modern	Used for Amharic, Tigrinya, Geʽez and other Ethio-Semitic languages.
Cher	Cherokee	Modern	Sequoyah's 1821 syllabary. Now encoded with both uppercase and lowercase forms.
Cans	Canadian Aboriginal Syllabics	Modern	A unified encoding of Cree, Inuktitut, Ojibwe and related syllabaries.
Adlm	Adlam	Modern	A 1989 script for the Fula language of West Africa. Added in Unicode 9.0.
Olck	Ol Chiki	Modern	The script for the Santali language of South Asia.
Vaii	Vai	Modern	A syllabary for the Vai language of Liberia.
Nkoo	N'Ko	Modern	A right-to-left script for Manding languages, designed in 1949.
Thaa	Thaana	Modern	The right-to-left script for Dhivehi in the Maldives.
Java	Javanese	Modern	Used for Javanese; in cultural and educational use.
Bali	Balinese	Modern	Used for Balinese in religious and cultural contexts.
Sund	Sundanese	Modern	Used for Sundanese in West Java.
Batk	Batak	Modern	Used in cultural revival for the Batak languages of Sumatra.
Bugi	Buginese	Modern	The Lontara script of Sulawesi.
Tagb	Tagbanwa	Modern	Used by the Tagbanwa people of Palawan, Philippines.
Tglg	Tagalog (Baybayin)	Modern	Indigenous Philippine script, now in revival.
Hano	Hanunoo	Modern	Used by the Hanunoo of Mindoro, Philippines.
Buhd	Buhid	Modern	Used by the Buhid people of Mindoro.
Lisu	Lisu (Fraser)	Modern	The Fraser alphabet for the Lisu language.
Saur	Saurashtra	Modern	Used for Saurashtra in southern India.
Khoj	Khojki and others	Modern	Several regional South Asian scripts continue in religious or community use.
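The figure of 11,172 in the Hangul row follows from simple arithmetic: 19 leading consonants times 21 vowels times 28 trailing options (27 consonants plus "none"). The sketch below reproduces the count and the standard index formula for a precomposed syllable. The constant and function names are illustrative, and the NFC comparison is included only as a cross-check against Python's own normalizer.

```python
import unicodedata

S_BASE = 0xAC00                          # first precomposed syllable
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28   # leading, vowel, trailing (incl. none)


def compose(l_index, v_index, t_index=0):
    """Return the precomposed syllable for zero-based jamo indices."""
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)


total = L_COUNT * V_COUNT * T_COUNT
print(total)                      # 11172
print(hex(S_BASE + total - 1))    # 0xd7a3, the last syllable in the block

# Cross-check: leading U+1112 (index 18), vowel U+1161 (index 0),
# trailing U+11AB (index 4) compose to the same syllable under NFC.
jamo = "\u1112\u1161\u11ab"
print(compose(18, 0, 4) == unicodedata.normalize("NFC", jamo))  # True
```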

Historic scripts — preserved for scholarship

Encoded so that epigraphers, historians and digital archives can quote inscriptions in plain text.

ISO 15924	Name	Status	Notes
Egyp	Egyptian Hieroglyphs	Historic	1,071 signs at U+13000–U+1342F, plus the Format Controls block for cartouche shaping.
Cprt	Cypriot Syllabary	Historic	Used for Arcadocypriot Greek and Eteocypriot, c. 1500–300 BCE.
Linb	Linear B	Historic	Mycenaean Greek syllabary, deciphered 1952. U+10000–U+1007F.
Lina	Linear A	Historic	The Minoan script of Crete. Still largely undeciphered.
Phnx	Phoenician	Historic	The 22-letter ancestor of every Mediterranean alphabet, c. 1050 BCE.
Lyci	Lycian	Historic	Used in southwestern Anatolia, c. 6th–4th centuries BCE.
Lydi	Lydian	Historic	Used in western Anatolia, c. 7th–3rd centuries BCE.
Cari	Carian	Historic	Used in southwestern Anatolia, related to the Greek alphabet.
Goth	Gothic	Historic	Wulfila's 4th-century alphabet for the Gothic Bible translation.
Runr	Runic	Historic	The Elder, Younger and Anglo-Saxon Futhark, plus medieval extensions.
Ogam	Ogham	Historic	The notched Irish script of stone inscriptions, c. 4th–10th centuries.
Xsux	Cuneiform	Historic	1,236 signs for Sumerian and Akkadian, plus separate blocks for numbers and Early Dynastic.
Ugar	Ugaritic	Historic	Late Bronze Age cuneiform alphabet, c. 14th century BCE.
Avst	Avestan	Historic	Used for the Avesta, the scripture of Zoroastrianism.
Phli	Inscriptional Pahlavi	Historic	The monumental Middle Persian script.
Phlp	Psalter Pahlavi	Historic	A Middle Persian book hand attested in a single Psalter manuscript.
Phlv	Book Pahlavi	Historic	The cursive Middle Persian script used for Zoroastrian scripture. Has an ISO 15924 code but is not yet encoded in Unicode.
Prti	Inscriptional Parthian	Historic	The script of the Parthian Empire, attested in royal inscriptions.
Brah	Brahmi	Historic	The ancestor of every Brahmic script in South and Southeast Asia.
Khar	Kharoshthi	Historic	Used in Gandhara and Central Asia, c. 3rd century BCE – 3rd century CE.

Liturgical & scholarly

Scripts still used in religious, ceremonial or specialist contexts.

ISO 15924	Name	Status	Notes
Copt	Coptic	Liturgical	Used by the Coptic Orthodox Church. Disunified from Greek in Unicode 4.1.
Syrc	Syriac	Liturgical	The script of Eastern Christianity — Estrangela, Serto and Madnhaya styles all encoded.
Mand	Mandaic	Liturgical	The script of the Mandaean religion.
Samr	Samaritan	Liturgical	Used by the Samaritan community for the Samaritan Pentateuch.
Latg	Latin — Gaelic style	Stylistic	Not a separate Unicode script; recorded as an ISO 15924 alias for Irish typographic tradition.
Latf	Latin — Fraktur style	Stylistic	Likewise not separately encoded; Fraktur glyphs are font-level, not codepoint-level.
Hluw	Anatolian Hieroglyphs	Historic / scholarly	The Luwian script of the Hittite empire.
Tnsa	Tangsa	Modern / scholarly	Added in Unicode 14.0 for the Tangsa community of Northeast India.

Constructed scripts

Invented writing systems that meet Unicode's encoding criteria; others live in the Private Use Areas.

ISO 15924	Name	Status	Notes
Shaw	Shavian	Constructed	George Bernard Shaw's 1958 phonetic alphabet for English.
Dsrt	Deseret	Constructed	A 19th-century LDS-Church phonetic alphabet for English.
Mero	Meroitic Hieroglyphs	Historic / constructed	Plus a separate Merc code for Meroitic Cursive.
Dupl	Duployan shorthand	Constructed	The Duployan stenographic system, including its Chinook adaptation.
—	Tengwar (Tolkien)	PUA only	Not officially encoded. Allocated by the ConScript Unicode Registry at U+E000+.
—	Klingon (pIqaD)	PUA only	Rejected for formal encoding in 2001; ConScript registers it at U+F8D0–U+F8FF.
—	Aurebesh, Hylian, etc.	PUA only	Fictional scripts from games and films, where assigned at all, live in Private Use ranges.
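Text that uses one of these PUA-only scripts can at least be detected, even though Unicode assigns the codepoints no meaning. The following is a minimal sketch that tests a character against the three Private Use ranges and cross-checks the result with the general category "Co" reported by Python's `unicodedata` module. The helper name `is_private_use` is hypothetical.

```python
import unicodedata

PUA_RANGES = (
    (0xE000, 0xF8FF),      # Private Use Area in the Basic Multilingual Plane
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (plane 16)
)


def is_private_use(ch):
    """True if the character lies in one of the three Private Use ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


for ch in ("A", "\ue000", "\uf8d0"):
    print(f"U+{ord(ch):04X}", is_private_use(ch), unicodedata.category(ch))
```

Which script a private-use codepoint represents depends entirely on an outside agreement such as the ConScript registry and on the font in use, so a check like this can only flag the text, not identify it.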
