From: "nobu (Nobuyoshi Nakada)"
Date: 2022-08-24T01:13:55+00:00
Subject: [ruby-core:109655] [Ruby master Bug#18973] Kernel#sprintf: %c allows codepoints above 127 for 7-bits ASCII encoding

Issue #18973 has been updated by nobu (Nobuyoshi Nakada).

Sorry, this.

```diff
diff --git a/enc/us_ascii.c b/enc/us_ascii.c
index 08f9072c435..9d854b12245 100644
--- a/enc/us_ascii.c
+++ b/enc/us_ascii.c
@@ -7,6 +7,12 @@
 # define ENCINDEX_US_ASCII 0
 #endif
 
+static int
+us_ascii_code_to_mbclen(OnigCodePoint code ARG_UNUSED, OnigEncoding enc ARG_UNUSED)
+{
+    return !(code & 0x80);
+}
+
 static int
 us_ascii_mbc_enc_len(const UChar* p, const UChar* e, OnigEncoding enc)
 {
@@ -22,7 +28,7 @@ OnigEncodingDefine(us_ascii, US_ASCII) = {
   1, /* min byte length */
   onigenc_is_mbc_newline_0x0a,
   onigenc_single_byte_mbc_to_code,
-  onigenc_single_byte_code_to_mbclen,
+  us_ascii_code_to_mbclen,
   onigenc_single_byte_code_to_mbc,
   onigenc_ascii_mbc_case_fold,
   onigenc_ascii_apply_all_case_fold,
```

----------------------------------------
Bug #18973: Kernel#sprintf: %c allows codepoints above 127 for 7-bits ASCII encoding
https://bugs.ruby-lang.org/issues/18973#change-98878

* Author: andrykonchin (Andrew Konchin)
* Status: Open
* Priority: Normal
* ruby -v: 3.0.3
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
I've noticed the following behavior:

```ruby
sprintf("%c".encode("US-ASCII"), 128)
# => "\x80"
sprintf("%c".encode("US-ASCII"), 128).valid_encoding?
# => false
```

Specifying a codepoint in the range 128-255 with a US-ASCII encoded format string produces a broken string.

```ruby
sprintf("%c".encode("US-ASCII"), 255)
# => "\xFF"
sprintf("%c".encode("US-ASCII"), 256)
# (irb):17:in `sprintf': 256 out of char range (RangeError)
```

Specifying a codepoint greater than 255 raises the expected exception `out of char range`. I suppose this exception should be raised for codepoints 128-255 as well (for the US-ASCII encoding).
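
For callers who need the stricter behavior regardless of Ruby version, the range check can be done before calling `sprintf`. This is only a sketch: `ascii_chr` is a hypothetical helper name, not part of Ruby or of the patch, and it simply mirrors the 7-bit test (`!(code & 0x80)`) on the Ruby side.

```ruby
# Hypothetical helper (not part of Ruby): format a codepoint as one
# US-ASCII character, rejecting anything outside the 7-bit range 0..127.
def ascii_chr(codepoint)
  unless codepoint.is_a?(Integer) && codepoint.between?(0, 127)
    raise RangeError, "#{codepoint} out of char range"
  end

  sprintf("%c".encode("US-ASCII"), codepoint)
end

ascii_chr(65).valid_encoding?   # valid, since 65 fits in 7 bits

begin
  ascii_chr(128)                # rejected before sprintf is reached
rescue RangeError => e
  warn e.message
end
```

Because the guard runs first, the result no longer depends on how a given Ruby release treats codepoints 128-255 in `%c`.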