From: duerst@...
Date: 2016-12-08T09:22:03+00:00
Subject: [ruby-core:78543] [Ruby trunk Feature#13016] String#gsub(hash)

Issue #13016 has been updated by Martin Dürst.

Shyouhei Urabe wrote:

> I noticed that I can't purge `NKF.nkf '-Z4'`. It can neither be rewritten using String#tr, String#encode, nor String#unicode_normalize.

Can you give (a pointer to) a detailed description of what NKF, and in particular `NKF.nkf '-Z4'`, does exactly? For example, I can't find it at http://blog.layer8.sh/ja/2012/03/31/nkf_command_option/. The following may be related:

[Japanese excerpt on the -Z, -Z1, and -Z2 options, mentioning X0208, ASCII, and ASCII space; the rest of the text was lost to encoding damage]

> It is doable using String#gsub theoretically, but that requires a hand-crafted nontrivial regular expression that exactly matches what Z4 expects to convert. This is almost impossible to do, and is definitely not something debuggable.

Please note that String#unicode_normalize, as currently implemented, also uses some huge regular expressions (though program-generated), and these have (hopefully) been debugged successfully, although with the help of test data from Unicode.

----------------------------------------
Feature #13016: String#gsub(hash)
https://bugs.ruby-lang.org/issues/13016#change-61928

* Author: Shyouhei Urabe
* Status: Open
* Priority: Normal
* Assignee:
----------------------------------------
Background: I wanted to drop the NKF dependency of my script. By doing so I noticed that I can't purge `NKF.nkf '-Z4'`. It can neither be rewritten using String#tr, String#encode, nor String#unicode_normalize. It is doable using String#gsub theoretically, but that requires a hand-crafted nontrivial regular expression that exactly matches what Z4 expects to convert.
This is almost impossible to do, and is definitely not something debuggable.

Proposal: extend String#gsub so that it also accepts a hash as its only argument, specifying the input-output mapping.

```ruby
# now
def convert str
  require 'nkf'
  NKF.nkf '-Z4xm0', str
end

# proposed
def convert str
  map = {
    "\u3002" => "\uFF61",
    "\u300C" => "\uFF62",
    # ...
  }
  str.gsub map
end
```
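
For comparison, here is a minimal sketch of how a mapping like the proposed one can be applied with the existing API. String#gsub already accepts a hash as its second argument and looks up each matched substring in it, so the pattern can be generated from the hash keys with Regexp.union instead of being written by hand. The names `HALFWIDTH_MAP`, `HALFWIDTH_PATTERN`, and `convert_with_union` are hypothetical, and the table holds only the two pairs shown in the proposal, not the full set that `-Z4` converts.

```ruby
# Hypothetical partial table: only the two pairs given in the proposal.
HALFWIDTH_MAP = {
  "\u3002" => "\uFF61", # IDEOGRAPHIC FULL STOP -> HALFWIDTH IDEOGRAPHIC FULL STOP
  "\u300C" => "\uFF62", # LEFT CORNER BRACKET   -> HALFWIDTH LEFT CORNER BRACKET
}.freeze

# Build a single alternation from the keys; Regexp.union escapes each string.
HALFWIDTH_PATTERN = Regexp.union(HALFWIDTH_MAP.keys)

def convert_with_union(str)
  # gsub(pattern, hash) replaces every match with hash[matched_text].
  str.gsub(HALFWIDTH_PATTERN, HALFWIDTH_MAP)
end

convert_with_union("\u300Cabc\u3002") # => "\uFF62abc\uFF61"
```

If some keys are prefixes of others, the alternation tries them in the order given, so sorting the keys longest-first before calling Regexp.union is probably the safer choice.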