From: naruse@...
Date: 2016-07-19T08:31:51+00:00
Subject: [ruby-core:76432] [Ruby trunk Bug#12577][Rejected] Is '$' punctuation or not? Inconsistency between us-ascii and UTF-8

Issue #12577 has been updated by Yui NARUSE.

Status changed from Open to Rejected

It's because of their specs as follows:

POSIX
> punct
> Define characters to be classified as punctuation characters.
> In the POSIX locale, neither the <space> nor any characters in classes alpha, digit, or cntrl shall be included.
>
> In a locale definition file, no character specified for the keywords upper, lower, alpha, digit, cntrl, xdigit, or as the <space> shall be specified.

http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html#tag_07

Unicode
> \p{gc=Punctuation} \p{gc=Symbol} -- \p{alpha}

http://unicode.org/reports/tr18/#punct

----------------------------------------
Bug #12577: Is '$' punctuation or not? Inconsistency between us-ascii and UTF-8
https://bugs.ruby-lang.org/issues/12577#change-59676

* Author: Martin Dürst
* Status: Rejected
* Priority: Normal
* Assignee:
* ruby -v: ruby 2.4.0dev (2016-07-09 trunk 55618) [x86_64-cygwin]
* Backport: 2.1: UNKNOWN, 2.2: UNKNOWN, 2.3: UNKNOWN
----------------------------------------
US-ASCII thinks '$' is punctuation. UTF-8 thinks it's not. This means that the following two scripts:

```
# encoding: us-ascii
puts '$' =~ /\p{Punct}/ ? 'match' : 'no match'
```

and

```
# encoding: utf-8
puts '$' =~ /\p{Punct}/ ? 'match' : 'no match'
```

produce different results. It also means that the output from the single line script

```
puts '$' =~ /\p{Punct}/ ? 'match' : 'no match'
```

changed when we changed the default script encoding from US-ASCII to UTF-8.

This may be okay as it is, but I'm reporting it here to check what others think.

--
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>
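The two scripts in the report can be compared within a single file, without switching the magic comment. The following is a minimal sketch, not code from the thread: `punct_by_encoding` is a hypothetical helper, and it assumes that a pattern containing `\p{...}` is compiled for the encoding of the string passed to `Regexp.new`. No expected output is shown because the result for '$' may differ between Ruby versions.

```ruby
# Hypothetical helper (not from the thread): test one character against
# \p{Punct} compiled under US-ASCII and under UTF-8.
def punct_by_encoding(char)
  [Encoding::US_ASCII, Encoding::UTF_8].map do |enc|
    # Assumption: a pattern with \p{...} takes the encoding of its source string.
    pattern = Regexp.new('\p{Punct}'.encode(enc))
    [enc.name, !(pattern =~ char).nil?]
  end.to_h
end

p punct_by_encoding('$')  # the character in question
p punct_by_encoding('!')  # punctuation under both definitions
```

Comparing '$' with '!' separates the disputed symbol from a character that both the POSIX and Unicode definitions quoted above classify as punctuation.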