From 431838747b63d3a7a0811325c0ac15d001bf0d63 Mon Sep 17 00:00:00 2001
From: Simon Tatham
Date: Fri, 24 May 2024 22:25:56 +0100
Subject: [PATCH] Stop ignoring the Unicode tag character range.

These were deliberately thrown away in our UTF-8 decoder, with a
comment apparently introduced by RDB in the 2001 big Unicode patch.
The purpose of this character range has changed completely since
then, and now they act as modifier characters on top of U+1F3F4 to
construct a space of flags (the standard examples being those of
England, Scotland and Wales). We were failing to display those flags,
and even pasting out of the terminal didn't give back the right
Unicode.
---
 terminal/terminal.c | 4 ----
 test/utf8.txt       | 2 +-
 2 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/terminal/terminal.c b/terminal/terminal.c
index 4c453809..90150044 100644
--- a/terminal/terminal.c
+++ b/terminal/terminal.c
@@ -3637,10 +3637,6 @@ unsigned long term_translate(
         if (t > 0x10FFFF)
             return UCSINVALID;
 
-        /* This is currently a TagPhobic application.. */
-        if (t >= 0xE0000 && t <= 0xE007F)
-            return UCSINCOMPLETE;
-
         /* U+FEFF is best seen as a null. */
         if (t == 0xFEFF)
             return UCSINCOMPLETE;
diff --git a/test/utf8.txt b/test/utf8.txt
index 3b45f9eb..c76fc678 100644
--- a/test/utf8.txt
+++ b/test/utf8.txt
@@ -27,4 +27,4 @@ Dedicated emoji: 💜 🙂 🙁 (wide and should look correct)
 Combined via ZWJ: 👩‍💻 (PuTTY doesn't understand ZWJ)
 Skin tone mod: 👩🏻 👩🏿 (wcwidth doesn't know those are modifiers)
 Flags: 🇬🇧 🇺🇦 🇪🇺 (should work in GTK 2 or better)
-
+Flags using tags: 🏴󠁧󠁢󠁥󠁮󠁧󠁿 🏴󠁧󠁢󠁳󠁣󠁴󠁿 🏴󠁧󠁢󠁷󠁬󠁳󠁿 (the tags are treated as combining marks)
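
Illustration only, not part of the patch: the commit message describes the tag characters as modifiers on top of U+1F3F4. The sketch below shows one way such a sequence could be decoded, assuming the usual emoji tag sequence layout (the base U+1F3F4, then one or more tag characters in U+E0020..U+E007E, then U+E007F CANCEL TAG). The function name `decode_flag_tag_sequence` and the `TAG_*` macros are hypothetical and do not exist in PuTTY; the patch itself merely stops discarding these code points.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; none of these are PuTTY identifiers. */
#define TAG_BASE_FLAG 0x1F3F4UL /* U+1F3F4 WAVING BLACK FLAG */
#define TAG_OFFSET    0xE0000UL /* tag character = TAG_OFFSET + ASCII value */
#define TAG_FIRST     0xE0020UL /* TAG SPACE */
#define TAG_LAST      0xE007EUL /* TAG TILDE */
#define TAG_CANCEL    0xE007FUL /* CANCEL TAG, terminates the sequence */

/*
 * Decode a flag tag sequence held in cps[0..n-1] into the ASCII string
 * its tag characters spell out (e.g. "gbeng" for England).
 *
 * Returns the number of tag characters written to 'out' (excluding the
 * terminating NUL), or -1 if the input is not of the form
 * base, tag+, cancel, or if 'out' is too small.
 */
int decode_flag_tag_sequence(const unsigned long *cps, size_t n,
                             char *out, size_t outsize)
{
    size_t i, ntags;

    assert(cps != NULL && out != NULL);

    /* Need at least: base, one tag character, CANCEL TAG. */
    if (n < 3 || cps[0] != TAG_BASE_FLAG || cps[n - 1] != TAG_CANCEL)
        return -1;

    ntags = n - 2;
    if (outsize < ntags + 1)
        return -1;

    for (i = 0; i < ntags; i++) {
        unsigned long c = cps[i + 1];
        if (c < TAG_FIRST || c > TAG_LAST)
            return -1;
        /* Each tag character mirrors an ASCII character. */
        out[i] = (char)(c - TAG_OFFSET);
    }
    out[ntags] = '\0';
    return (int)ntags;
}
```

Under these assumptions, passing the seven code points of the England flag from test/utf8.txt (U+1F3F4, U+E0067, U+E0062, U+E0065, U+E006E, U+E0067, U+E007F) would yield the string "gbeng". With the old range check in place, term_translate returned UCSINCOMPLETE for every code point after the base, so only the bare black flag survived.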