From: "nobu (Nobuyoshi Nakada) via ruby-core"
Date: 2023-09-06T16:16:21+00:00
Subject: [ruby-core:114665] [Ruby master Bug#19867] Unicode line and paragraph separator are not stripped

Issue #19867 has been updated by nobu (Nobuyoshi Nakada).

As for the implementation, changing ctype.h is not desirable. There is already an `rb_enc_isspace` function for this purpose.

----------------------------------------
Bug #19867: Unicode line and paragraph separator are not stripped
https://bugs.ruby-lang.org/issues/19867#change-104493

* Author: iainbeeston (Iain Beeston)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
Unicode newline and paragraph separators are not removed by any of the strip methods:

`"\u2028\u2029\u0000\t\n\v\f\r ".strip # => "\u2028\u2029"`

I would have expected `strip` (and `lstrip`, `rstrip`) to remove Unicode whitespace as well. It looks like #7154 reported something similar, but for regular expressions and way back in Ruby 1.9.

I think that fixing this should be simple (just checking for `\x2028` and `\x2029` in ctype.h), but I'm not sure if it's supposed to behave this way or if changing it could introduce unexpected consequences.

--
https://bugs.ruby-lang.org/
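
For anyone who needs this behavior today, here is a minimal user-level sketch. It is not the fix discussed in this thread, and `unicode_strip` and `UNICODE_STRIP_EDGES` are hypothetical names, not part of Ruby's String API. It relies on the POSIX bracket `[[:space:]]` being Unicode-aware in Ruby regular expressions, and adds NUL to the character class on the assumption that the caller wants to mirror what `strip` already removes.

```ruby
# Hypothetical helper, not part of Ruby's String API.
# [[:space:]] is Unicode-aware in Ruby regexps, so it covers U+2028
# (line separator) and U+2029 (paragraph separator). \x00 is included
# because String#strip also treats NUL as strippable.
UNICODE_STRIP_EDGES = /\A[[:space:]\x00]+|[[:space:]\x00]+\z/

# Returns a copy of str with Unicode whitespace and NUL removed from
# both ends; characters in the middle of the string are left alone.
def unicode_strip(str)
  str.gsub(UNICODE_STRIP_EDGES, "")
end

sample = "\u2028\u2029 hello\t\n\u2028"
p unicode_strip(sample) # => "hello"
```

This only changes behavior for callers that opt in, so it sidesteps the compatibility question raised in the report, at the cost of a regexp pass instead of the built-in method.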