From: naruse@...
Date: 2016-07-19T08:31:51+00:00
Subject: [ruby-core:76432] [Ruby trunk Bug#12577][Rejected] Is '$' punctuation or not? Inconsistency between us-ascii and UTF-8

Issue #12577 has been updated by Yui NARUSE.

Status changed from Open to Rejected

This follows from the respective specs, quoted below:

POSIX
> punct
> Define characters to be classified as punctuation characters.
> In the POSIX locale, neither the <space> nor any characters in classes alpha, digit, or cntrl shall be included.
>
> In a locale definition file, no character specified for the keywords upper, lower, alpha, digit, cntrl, xdigit, or as the <space> shall be specified.
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html#tag_07

Unicode
> 	\p{gc=Punctuation} \p{gc=Symbol} -- \p{alpha}
http://unicode.org/reports/tr18/#punct



----------------------------------------
Bug #12577: Is '$' punctuation or not? Inconsistency between us-ascii and UTF-8
https://bugs.ruby-lang.org/issues/12577#change-59676

* Author: Martin Dürst
* Status: Rejected
* Priority: Normal
* Assignee: 
* ruby -v: ruby 2.4.0dev (2016-07-09 trunk 55618) [x86_64-cygwin]
* Backport: 2.1: UNKNOWN, 2.2: UNKNOWN, 2.3: UNKNOWN
----------------------------------------
With US-ASCII as the script encoding, '$' is treated as punctuation. With UTF-8, it is not.

This means that the following two scripts:

```
# encoding: us-ascii
puts '$' =~ /\p{Punct}/ ? 'match' : 'no match'
```

and

```
# encoding: utf-8
puts '$' =~ /\p{Punct}/ ? 'match' : 'no match'
```

produce different results. It also means that the output from the single-line script

```
puts '$' =~ /\p{Punct}/ ? 'match' : 'no match'
```

changed when we changed the default script encoding from US-ASCII to UTF-8.

This may be okay as it is, but I'm reporting it here to check what others think.
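
To compare the two encodings from a single script, one option is to build the pattern from a string whose encoding is set explicitly, since a pattern containing `\p{...}` takes its encoding from its source. The following is only a sketch: the helper names `punct_match?` and `category_match?` are made up for illustration, and the UTF-8 result for `\p{Punct}` is printed rather than assumed. The second helper probes explicit Unicode general categories, where '$' is a currency symbol (Sc) rather than punctuation (P).

```ruby
# encoding: utf-8

# Hypothetical helper: compile \p{Punct} under the given encoding and
# report whether it matches the character.
def punct_match?(char, encoding_name)
  source = '\p{Punct}'.dup.force_encoding(encoding_name)
  pattern = Regexp.new(source)
  !!(char =~ pattern)
end

# Hypothetical helper: test a character against an explicit Unicode
# general category such as 'P', 'S' or 'Sc' (pattern compiled as UTF-8).
def category_match?(char, category)
  !!(char =~ Regexp.new("\\p{#{category}}"))
end

%w[US-ASCII UTF-8].each do |name|
  puts "#{name}: \\p{Punct} #{punct_match?('$', name) ? 'match' : 'no match'}"
end

%w[P S Sc].each do |category|
  puts "\\p{#{category}}: #{category_match?('$', category) ? 'match' : 'no match'}"
end
```

Writing `\p{P}` or `\p{S}` directly avoids depending on how each encoding defines `Punct`.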



-- 
https://bugs.ruby-lang.org/
