[ruby-core:78543] [Ruby trunk Feature#13016] String#gsub(hash)
From: duerst@...
Date: 2016-12-08 09:22:03 UTC
List: ruby-core #78543
Issue #13016 has been updated by Martin Dürst.
Shyouhei Urabe wrote:
> I noticed that I can't purge `NKF.nkf '-Z4'`. It can neither be rewritten using String#tr, String#encode, nor String#unicode_normalize.
Can you give (a pointer to) a detailed description of what NKF, and in particular NKF.nkf -Z4, does exactly? For example, I can't find -Z4 described at http://blog.layer8.sh/ja/2012/03/31/nkf_command_option/. The following may be related (translated from the Japanese): "-Z: converts alphanumerics and some symbols in X0208 to ASCII. -Z1 converts the X0208 space to an ASCII space. -Z2 converts the X0208 space to two ASCII spaces. Use whichever suits your taste." (Does "X0208 space" here mean the full-width space?)
> It is doable using String#gsub theoretically, but that requires a hand-crafted nontrivial regular expression that exactly matches what Z4 expects to convert. This is almost impossible to do, and is definitely not something debuggable.
Please note that String#unicode_normalize, as currently implemented, also uses some huge regular expressions (though they are program-generated). It has also (hopefully) been debugged successfully, albeit with the help of test data from Unicode.
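
To illustrate the point about program-generated expressions, here is a minimal sketch of building a pattern from a mapping table and checking it against that same table. It is not how String#unicode_normalize is implemented. The names `TABLE`, `PATTERN`, and `unmatched_keys` are hypothetical, and the table holds only the two pairs quoted in the issue, not the full Z4 mapping.

```ruby
# Hypothetical two-entry table; the real NKF -Z4 mapping is larger
# and is not reproduced here.
TABLE = {
  "\u3002" => "\uFF61", # IDEOGRAPHIC FULL STOP -> HALFWIDTH IDEOGRAPHIC FULL STOP
  "\u300C" => "\uFF62", # LEFT CORNER BRACKET   -> HALFWIDTH LEFT CORNER BRACKET
}.freeze

# Generate the pattern from the table instead of writing it by hand.
# Regexp.union escapes each string and joins the alternatives with "|".
PATTERN = Regexp.union(TABLE.keys)

# Data-driven check: return every key that the generated pattern
# fails to match in full. An empty result means the pattern covers the table.
def unmatched_keys(table, pattern)
  table.keys.reject { |key| key[pattern] == key }
end
```

Because the pattern is derived from the data, debugging reduces to checking the table itself, for example by asserting that `unmatched_keys(TABLE, PATTERN)` is empty.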
----------------------------------------
Feature #13016: String#gsub(hash)
https://bugs.ruby-lang.org/issues/13016#change-61928
* Author: Shyouhei Urabe
* Status: Open
* Priority: Normal
* Assignee:
----------------------------------------
Background: I wanted to drop NKF dependency of my script. By doing so I noticed that I can't purge `NKF.nkf '-Z4'`. It can neither be rewritten using String#tr, String#encode, nor String#unicode_normalize. It is doable using String#gsub theoretically, but that requires a hand-crafted nontrivial regular expression that exactly matches what Z4 expects to convert. This is almost impossible to do, and is definitely not something debuggable.
Proposal: extend String#gsub so that it also accepts a hash as its only argument, specifying the input-to-output mapping.
```ruby
# now
def convert str
  require 'nkf'
  NKF.nkf '-Z4xm0', str
end

# proposed
def convert str
  map = { "\u3002" => "\uFF61", "\u300C" => "\uFF62", ... }
  str.gsub map
end
```
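
For comparison, the existing two-argument form `String#gsub(pattern, hash)` can approximate the proposed behaviour today if the pattern is generated from the hash keys. The following is only a sketch under that assumption: `gsub_with_map` is a hypothetical helper name, and `MAP` contains just the two pairs shown above, not the full Z4 table.

```ruby
# Only the two pairs shown in the proposal; the elided entries are not filled in.
MAP = { "\u3002" => "\uFF61", "\u300C" => "\uFF62" }.freeze

# Hypothetical helper emulating the proposed str.gsub(map).
def gsub_with_map(str, map)
  # Try longer keys first, so a long key wins over a shorter key
  # that is a prefix of it.
  keys = map.keys.sort_by { |key| -key.length }
  # gsub(pattern, hash) replaces each match with the hash value
  # stored under the matched text.
  str.gsub(Regexp.union(keys), map)
end

gsub_with_map("\u300Cabc\u3002", MAP) # => "\uFF62abc\uFF61"
```

The sort matters only when one key is a prefix of another, since regexp alternation takes the first alternative that matches rather than the longest.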
--
https://bugs.ruby-lang.org/