The Latin-1 Supplement block carries the second hundred-and-twenty-eight codepoints of Unicode, U+0080 through U+00FF. Like Basic Latin before it, every codepoint in this block is inherited directly from an earlier standard: ISO 8859-1, commonly called "Latin-1", which extended seven-bit ASCII to eight bits by adding the accented letters needed for Western European languages (French, Spanish, Portuguese, Italian, German, Dutch, Icelandic, and the other Nordic languages) plus a small set of symbols and punctuation marks that had been missing from the original ASCII set.
About this block
ISO 8859-1 was published in 1987 by the International Organization for Standardization, with the printable characters running from byte 0xA0 through 0xFF and the range 0x80–0x9F reserved for so-called C1 control codes. Unicode preserved this layout exactly when it adopted Latin-1, which is why the C1 controls, U+0080 through U+009F, still occupy the first quarter of this block (32 of its 128 codepoints). They were defined by ISO 6429 / ECMA-48 for terminal control alongside the C0 codes in Basic Latin, but unlike LF and ESC, almost none of them ever saw real use. Most Unicode-aware software treats them as opaque control characters (general category Cc) and renders nothing.
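The byte-for-codepoint correspondence described above can be checked with a short sketch. It assumes Python 3 and its standard "latin-1" codec and unicodedata module; the variable names are illustrative only.

```python
import unicodedata

# Decode every possible byte value with the Latin-1 codec.
all_bytes = bytes(range(256))
decoded = all_bytes.decode("latin-1")

# Each byte value should map to the codepoint with the same number.
identity = all(ord(ch) == b for ch, b in zip(decoded, all_bytes))

# The C1 controls are U+0080 through U+009F.
c1 = [chr(cp) for cp in range(0x80, 0xA0)]
c1_categories = {unicodedata.category(ch) for ch in c1}

print(identity)        # whether the mapping is one-to-one by value
print(len(c1))         # how many C1 codepoints there are
print(c1_categories)   # the general categories they fall into
```

If the layout is as described, the identity check holds for all 256 bytes and the 32 C1 codepoints all report the control category Cc.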
Printable Latin-1 begins at U+00A0 NO-BREAK SPACE, the non-breaking space familiar to anyone who has written in HTML. Past that comes the punctuation row (¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ ® ¯, with the normally invisible U+00AD SOFT HYPHEN sitting between ¬ and ®), the math and superscript row (° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿), and then the accented-letter rows: uppercase from U+00C0 (À) through U+00DE (Þ), with the multiplication sign U+00D7 as an unfortunate stowaway in the middle, then lowercase from U+00DF (ß) through U+00FF (ÿ), with the division sign U+00F7 sitting in the parallel position. This grid layout was deliberate: each uppercase letter sits exactly 32 codepoints before its lowercase counterpart, mirroring ASCII. The two lowercase letters left without a partner are ß (U+00DF) and ÿ (U+00FF); the uppercase Ÿ lives outside the block, at U+0178.
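The 32-codepoint offset between the uppercase and lowercase rows can be verified with a minimal sketch. It assumes Python 3 and relies on the built-in str.lower and str.upper, which follow the Unicode case mappings; the helper name latin1_case_pairs is hypothetical.

```python
def latin1_case_pairs():
    """Pair each uppercase letter in U+00C0..U+00DE with the codepoint 32 later."""
    pairs = []
    for cp in range(0xC0, 0xDF):       # U+00C0 through U+00DE inclusive
        if cp == 0xD7:                 # skip the multiplication sign
            continue
        pairs.append((chr(cp), chr(cp + 32)))
    return pairs

pairs = latin1_case_pairs()

# The offset rule holds if case conversion agrees in both directions.
offset_holds = all(u.lower() == l and l.upper() == u for u, l in pairs)

# The two symbols in the letter rows have no case at all.
symbols_unchanged = "\u00D7".lower() == "\u00D7" and "\u00F7".upper() == "\u00F7"

# The uppercase of U+00FF lies outside the block.
y_diaeresis_upper = "\u00FF".upper()

print(len(pairs), offset_holds, symbols_unchanged, f"U+{ord(y_diaeresis_upper):04X}")
```

The same arithmetic works for ASCII, where A (0x41) and a (0x61) are also 32 apart, which is why a single bit flip toggled case in both ranges on older systems.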
The most consequential symbols here are the typographic and legal marks used everywhere on Western web pages. © (U+00A9), ® (U+00AE), ° (U+00B0), ± (U+00B1), £ (U+00A3), and the vulgar fractions ¼ ½ ¾ all live in this block. The accented and modified letters, such as é, è, ñ, ü, å, and ø, appear throughout Western European text. The German ß "sharp s" sits at U+00DF; its modern uppercase counterpart ẞ lives much later, at U+1E9E.
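As a quick way to confirm the codepoints quoted above, the following sketch prints each symbol with its number and official character name. It assumes Python 3 and the standard unicodedata module; the variable names are illustrative.

```python
import unicodedata

symbols = "©®°±£¼½¾ß"

# Build (codepoint label, character, Unicode name) rows for each symbol.
table = [(f"U+{ord(ch):04X}", ch, unicodedata.name(ch)) for ch in symbols]
for row in table:
    print(*row)

# Sharp s has no single-character uppercase in the default mapping:
# str.upper applies the full case mapping, which expands it to two letters.
sharp_s_upper = "ß".upper()

# The capital sharp s at U+1E9E lowercases back into this block.
capital_sharp_s_lower = "\u1E9E".lower()
print(sharp_s_upper, capital_sharp_s_lower)
```

Note that uppercasing ß yields "SS" rather than ẞ, so a round trip through upper and lower does not restore the original string.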
One persistent source of bugs: Windows-1252 is not Latin-1. Microsoft's "ANSI" code page matches Latin-1 in the range 0xA0–0xFF but fills most of the C1 area (0x80–0x9F) with printable characters: the Euro sign €, smart quotes ‘ ’ “ ”, the em dash —, the bullet •, the ellipsis …, and the trademark ™. When a Windows-1252 byte stream is decoded as Latin-1 (or, worse, mis-tagged in HTTP headers), those characters become control codes, vanish, or render as mojibake. Modern code should declare UTF-8 and stop interpreting bytes as Latin-1 entirely, but the legacy is still very much alive in old databases, mail archives, and CSV exports.
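The mismatch is easy to reproduce. The sketch below assumes Python 3 and its standard "cp1252" and "latin-1" codecs; the sample bytes and the helper name repair_latin1_misdecode are hypothetical, chosen only to illustrate the failure and one possible repair.

```python
import unicodedata

# Five Windows-1252 bytes: Euro sign, left double quote, em dash,
# ellipsis, trademark sign.
raw = bytes([0x80, 0x93, 0x97, 0x85, 0x99])

as_cp1252 = raw.decode("cp1252")    # the intended text
as_latin1 = raw.decode("latin-1")   # the wrong decoding: C1 control codes

# Every character of the wrong decoding is an invisible control character.
all_controls = all(unicodedata.category(ch) == "Cc" for ch in as_latin1)

def repair_latin1_misdecode(text: str) -> str:
    """Undo a Latin-1 decode of Windows-1252 bytes.

    Works because Latin-1 maps every byte to the codepoint of the same
    value, so re-encoding recovers the original bytes. Raises
    UnicodeDecodeError for the few bytes Python's cp1252 codec leaves
    undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D), and UnicodeEncodeError
    if the text contains characters above U+00FF.
    """
    return text.encode("latin-1").decode("cp1252")

repaired = repair_latin1_misdecode(as_latin1)
print(all_controls, repaired == as_cp1252)
```

A repair like this is only safe when the data is known to have been mis-decoded; applied to text that really contains C1 controls, it would silently change the content.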