[ruby-core:54012] [ruby-trunk - Feature #8206] Should Ruby core implement String#blank?

From: "sam.saffron (Sam Saffron)" <sam.saffron@...>
Date: 2013-04-05 03:19:55 UTC
List: ruby-core #54012
Issue #8206 has been updated by sam.saffron (Sam Saffron).


This is a MASSIVE improvement:

#!/usr/bin/env ruby
$: << File.dirname(__FILE__)+'/lib'
require 'benchmark'
require 'fast_blank'

class String
  # active support implementation
  def slow_blank?
    self !~ /[^[:space:]]/
  end
end


n = 1000000


strings = [
  "",
  "\r\n\r\n  ",
  "this is a test",
  "   this is a longer test",
  "   this is a longer test
      this is a longer test
      this is a longer test
      this is a longer test
      this is a longer test"
]

strings.each do |s|
  raise "failed on #{s.inspect}" if s.blank? != s.slow_blank?
end

Benchmark.bmbm  do |x|
  strings.each do |s|
    x.report("Fast Blank #{s.length}    :") do  n.times { s.blank? }  end
    x.report("Fast Blank (Active Support)  #{s.length}    :") do  n.times { s.blank_as? }  end
    x.report("Slow Blank #{s.length}    :") do  n.times { s.slow_blank? }  end
    x.report("include? #{s.length}    :") do  n.times { !s.include?(/[^[:space:]]/) }  end
  end
end


                                            user     system      total        real
Fast Blank 0    :                       0.080000   0.000000   0.080000 (  0.077008)
Fast Blank (Active Support)  0    :     0.080000   0.000000   0.080000 (  0.076362)
Slow Blank 0    :                       0.380000   0.000000   0.380000 (  0.378698)
include? 0    :                         0.180000   0.000000   0.180000 (  0.184465)
Fast Blank 6    :                       0.180000   0.000000   0.180000 (  0.180450)
Fast Blank (Active Support)  6    :     0.210000   0.000000   0.210000 (  0.207886)
Slow Blank 6    :                       0.590000   0.000000   0.590000 (  0.588945)
include? 6    :                         0.190000   0.000000   0.190000 (  0.190898)
Fast Blank 14    :                      0.090000   0.000000   0.090000 (  0.088225)
Fast Blank (Active Support)  14    :    0.130000   0.000000   0.130000 (  0.131408)
Slow Blank 14    :                      0.670000   0.000000   0.670000 (  0.674838)
include? 14    :                        0.190000   0.000000   0.190000 (  0.191627)
Fast Blank 24    :                      0.190000   0.000000   0.190000 (  0.186498)
Fast Blank (Active Support)  24    :    0.140000   0.010000   0.150000 (  0.147858)
Slow Blank 24    :                      0.770000   0.000000   0.770000 (  0.767816)
include? 24    :                        0.220000   0.000000   0.220000 (  0.220636)
Fast Blank 136    :                     0.150000   0.000000   0.150000 (  0.150967)
Fast Blank (Active Support)  136    :   0.150000   0.000000   0.150000 (  0.147665)
Slow Blank 136    :                     0.770000   0.000000   0.770000 (  0.779459)
include? 136    :                       0.200000   0.000000   0.200000 (  0.189744)


Some notes:

1. I am noticing that the regex engine in Ruby head is 20% or so faster than in 2.0 for these tests.
2. The include? method is only 30% or so slower than hand coding, though empty strings need special casing. Essentially, include? should short-circuit and return false when the string length is zero.
3. I love this improvement to include? and totally support it accepting regexes, though I very much worry about consistency here.

My suggestion would be: 

1. Amend include? to accept a regex 
2. Keep in line with the changes in https://bugs.ruby-lang.org/issues/8110 ... so for it to skip globals you MUST pass in /regx/S (a regex that skips setting globals) 

I very much worry about having a mishmash in the language where some methods avoid setting globals and others do not. The cleanest way of introducing this change is simply to allow for the new regex modifier and keep all places that accept regexes in MRI consistent.
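
To illustrate what "setting globals" refers to here, the following is a minimal sketch rather than anything proposed in this thread. It compares the =~ operator, which populates the last-match global $~, with String#match?, which exists only in later Ruby versions (2.4 and up) and leaves $~ alone. The two helper method names are hypothetical.

```ruby
# Hypothetical helpers, for illustration only.
# Each one clears $~, runs a check, and reports whether $~ was populated.

def sets_last_match_with_operator?(str)
  $~ = nil
  str =~ /[^[:space:]]/   # =~ stores a MatchData in $~ on success
  !$~.nil?
end

def sets_last_match_with_match_p?(str)
  $~ = nil
  str.match?(/[^[:space:]]/)   # match? (Ruby 2.4+) does not touch $~
  !$~.nil?
end

sets_last_match_with_operator?("abc")  # => true
sets_last_match_with_match_p?("abc")   # => false
```

The consistency worry above is exactly this split: two ways of asking the same question, only one of which has side effects on the match globals.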

----------------------------------------
Feature #8206: Should Ruby core implement String#blank? 
https://bugs.ruby-lang.org/issues/8206#change-38249

Author: sam.saffron (Sam Saffron)
Status: Open
Priority: Normal
Assignee: 
Category: core
Target version: 


There has been some discussion in the past about porting the #blank? protocol over to Ruby, which was rejected by Matz.

This proposal is only about String however. 

At the moment, to figure out whether you have a blank string you would write:

"  ".strip.length == 0

The disadvantage is that this forces unneeded allocations and does too much work.

An optimal implementation would be:

static VALUE
rb_str_blank(VALUE str)
{
  rb_encoding *enc;
  char *s, *e;

  enc = STR_ENC_GET(str);
  s = RSTRING_PTR(str);
  if (!s || RSTRING_LEN(str) == 0) return Qtrue;

  e = RSTRING_END(str);
  while (s < e) {
    int n;
    unsigned int cc = rb_enc_codepoint_len(s, e, &n, enc);

    if (!rb_isspace(cc) && cc != 0) return Qfalse;
    s += n;
  }
  return Qtrue;
}

This in turn is about 5-8x faster than the regex solution to the problem, and way faster than allocating one massive string with strip when the string is long.
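
For readers who do not want to read the C, the semantics of rb_str_blank can be sketched in pure Ruby. This is only a reference for behavior and will be far slower than the C version. The names blank_reference? and ASCII_SPACE_CODEPOINTS are made up for this illustration, and the whitespace list assumes rb_isspace accepts the same characters as C's isspace in the "C" locale.

```ruby
# Hypothetical pure-Ruby reference for the C sketch above; not part of Ruby core.
# Assumption: rb_isspace accepts space, \t, \n, \v, \f and \r only.
ASCII_SPACE_CODEPOINTS = [0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x20].freeze

def blank_reference?(str)
  # Mirrors the C loop: every code point must be ASCII whitespace or NUL.
  # An empty string has no code points, so all? returns true.
  str.each_codepoint.all? do |cc|
    cc == 0 || ASCII_SPACE_CODEPOINTS.include?(cc)
  end
end

blank_reference?("")                # => true
blank_reference?("\r\n\r\n  ")      # => true
blank_reference?("this is a test")  # => false
```

Unlike "  ".strip.length == 0, this walks the string without building a stripped copy, which is the point of the C implementation.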

Should Ruby take on this method, as a companion to #strip that follows the same practice?

--- 

A slight caveat, though, is that Active Support has a somewhat different definition of blank?:

const unsigned int as_blank[26] = {9, 0xa, 0xb, 0xc, 0xd,
  0x20, 0x85, 0xa0, 0x1680, 0x180e, 0x2000, 0x2001,
  0x2002, 0x2003, 0x2004, 0x2005, 0x2006, 0x2007, 0x2008,
  0x2009, 0x200a, 0x2028, 0x2029, 0x202f, 0x205f, 0x3000
};

static VALUE
rb_str_blank_as(VALUE str)
{
  rb_encoding *enc;
  char *s, *e;
  int i;
  int found;

  enc = STR_ENC_GET(str);
  s = RSTRING_PTR(str);
  if (!s || RSTRING_LEN(str) == 0) return Qtrue;

  e = RSTRING_END(str);
  while (s < e) {
    int n;
    unsigned int cc = rb_enc_codepoint_len(s, e, &n, enc);

    found = 0;
    for (i = 0; i < 26; i++) {
      unsigned int current = as_blank[i];
      if (current == cc) {
        found = 1;
        break;
      }
      if (cc < current) {
        break;
      }
    }

    if (!found) return Qfalse;
    s += n;
  }
  return Qtrue;
}

Clearly it makes no sense to have such a method. 

If Ruby took over implementing String#blank? it would clash with Active Support, but IMHO it would enforce better API consistency.
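
To make the clash concrete, here is a small Ruby sketch of the two definitions side by side. The code point table is copied from as_blank above; the method names are hypothetical, and the ASCII variant again assumes rb_isspace covers only space, \t, \n, \v, \f and \r.

```ruby
# Code points copied from the as_blank table in this message.
AS_BLANK_CODEPOINTS = [
  0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x20, 0x85, 0xa0, 0x1680, 0x180e,
  0x2000, 0x2001, 0x2002, 0x2003, 0x2004, 0x2005, 0x2006, 0x2007, 0x2008,
  0x2009, 0x200a, 0x2028, 0x2029, 0x202f, 0x205f, 0x3000
].freeze

# Hypothetical names, for illustration only.
def blank_as_reference?(str)
  # Active Support style: every code point must appear in the table.
  str.each_codepoint.all? { |cc| AS_BLANK_CODEPOINTS.include?(cc) }
end

def blank_ascii_reference?(str)
  # rb_str_blank style: ASCII whitespace or NUL only (assumed rb_isspace set).
  str.each_codepoint.all? do |cc|
    cc == 0 || [0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x20].include?(cc)
  end
end

nbsp = "\u00a0"               # NO-BREAK SPACE
blank_as_reference?(nbsp)     # => true
blank_ascii_reference?(nbsp)  # => false
```

The two also disagree in the other direction: a string containing only NUL is blank under the ASCII variant but not under the table-based one, since 0 is not in the table.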

Thoughts?


 


-- 
http://bugs.ruby-lang.org/
