From: sam.saffron@...
Date: 2016-04-27T06:57:54+00:00
Subject: [ruby-core:75216] [Ruby trunk Feature#12306] Implement String #blank? #present? and improve #strip and family to handle unicode

Issue #12306 has been updated by Sam Saffron.

Shyouhei Urabe wrote:
> You failed to see the problem because you could not imagine a file path containing U+3000. That is not very rare in cultures with ideographics. No, I'm not against categorizing such path being insane. But they ARE there.

I can see that ... but I cannot see someone trying to build tk and expecting that including `　` (U+3000) would work.

Nobu,

Yes, the micro-optimisations in blank only yield a 10-30% improvement; see the benches in https://github.com/SamSaffron/fast_blank. The native is_space loop is often 10x faster. Feel free to rerun the benches against the latest code.

----------------------------------------
Feature #12306: Implement String #blank? #present? and improve #strip and family to handle unicode
https://bugs.ruby-lang.org/issues/12306#change-58349

* Author: Sam Saffron
* Status: Open
* Priority: Normal
* Assignee: Yukihiro Matsumoto
----------------------------------------
Time and again there have been rejected feature requests to Ruby core to implement `blank` and `present` protocols across all objects as ActiveSupport does. I am fine with this call and think it is fair.

However, for the narrow case of String, having `#blank?` and `#present?` makes sense.

- Provides a natural extension over `#strip`, `#lstrip` and `#rstrip`. `(" ".strip.length == 0) == " ".blank?`
- Plays nicely with ActiveSupport, providing an efficient implementation in Ruby core. See https://github.com/SamSaffron/fast_blank: implementing blank efficiently requires a C extension.

However, if this work is to be done, `#strip` and family should probably start dealing with Unicode blanks, e.g.:

```
irb(main):008:0> [0x3000].pack("U")
=> "　"
irb(main):009:0> [0x3000].pack("U").strip.length
=> 1
```

So there are 2 questions / feature requests here:

1. Can we add blank? and present? to String?
2. Can we amend strip and family to account for unicode per: https://github.com/SamSaffron/fast_blank/blob/master/ext/fast_blank/fast_blank.c#L43-L74
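
For anyone who wants to try the two requests in plain Ruby, here is a rough sketch. It is not the C code from fast_blank and not a proposed core implementation. The `UnicodeBlank` module and its method names are made up for illustration. It relies on the POSIX bracket `[[:space:]]`, which Ruby's regexp engine treats as Unicode-aware on UTF-8 strings, and it assumes Ruby 2.4 or later for `Regexp#match?`.

```ruby
# Illustrative only: UnicodeBlank and its methods are hypothetical names,
# not part of Ruby core, ActiveSupport, or fast_blank.
module UnicodeBlank
  # [[:space:]] matches Unicode whitespace (including U+3000) on UTF-8
  # strings, whereas \s only covers ASCII whitespace.
  BLANK_RE    = /\A[[:space:]]*\z/
  LEADING_RE  = /\A[[:space:]]+/
  TRAILING_RE = /[[:space:]]+\z/

  module_function

  # True when the string is empty or consists only of whitespace.
  def blank?(str)
    str.empty? || BLANK_RE.match?(str)
  end

  # Simple negation of blank?.
  def present?(str)
    !blank?(str)
  end

  # Returns a copy with leading and trailing Unicode whitespace removed.
  def strip(str)
    str.sub(LEADING_RE, "").sub(TRAILING_RE, "")
  end
end

ideographic_space = [0x3000].pack("U")

# Core String#strip, for comparison with the irb session in the report.
puts ideographic_space.strip.length

puts UnicodeBlank.blank?(ideographic_space)
puts UnicodeBlank.present?(" a ")
puts UnicodeBlank.strip("#{ideographic_space} abc\t#{ideographic_space}")
```

A regexp version like this is the easy but slow path; the benches in the fast_blank repository are the place to look for how much a native loop saves.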