From: "sam.saffron (Sam Saffron)"
Date: 2013-04-03T13:55:13+09:00
Subject: [ruby-core:53939] [ruby-trunk - Feature #8206] Should Ruby core implement String#blank?

Issue #8206 has been updated by sam.saffron (Sam Saffron).

@marcandre I tried pretty much every combination possible. Interestingly, depending on the string, /\A[[:space:]]*\z/ can be slower than the original regex. Also, afaik it's not identical because it misses some cases.

----------------------------------------
Feature #8206: Should Ruby core implement String#blank?
https://bugs.ruby-lang.org/issues/8206#change-38150

Author: sam.saffron (Sam Saffron)
Status: Open
Priority: Normal
Assignee:
Category: core
Target version:

There has been some discussion about porting the #blank? protocol over to Ruby in the past that has been rejected by Matz. This proposal is only about String, however.

At the moment, to figure out if you have a blank string you would write:

    " ".strip.length == 0

The disadvantage is that this forces unneeded allocations and does too much work.

An optimal implementation would be:

    static VALUE
    rb_str_blank(VALUE str)
    {
      rb_encoding *enc;
      char *s, *e;

      enc = STR_ENC_GET(str);
      s = RSTRING_PTR(str);
      /* An empty string is blank. */
      if (!s || RSTRING_LEN(str) == 0) return Qtrue;

      e = RSTRING_END(str);
      while (s < e) {
        int n;
        unsigned int cc = rb_enc_codepoint_len(s, e, &n, enc);

        /* Anything other than whitespace or NUL makes the string non-blank. */
        if (!rb_isspace(cc) && cc != 0) return Qfalse;
        s += n;
      }
      return Qtrue;
    }

This in turn is about 5-8x faster than the regex solution to the problem, and way faster than allocating one massive string with strip when the length is large.

Should Ruby take on this method, to accompany #strip, following its practice?

---

A slight caveat though is that Active Support has a somewhat different definition of blank?:

    /* Sorted ascending; the early exit in the lookup loop relies on this. */
    const unsigned int as_blank[26] = {
      9, 0xa, 0xb, 0xc, 0xd, 0x20, 0x85, 0xa0, 0x1680, 0x180e,
      0x2000, 0x2001, 0x2002, 0x2003, 0x2004, 0x2005, 0x2006, 0x2007,
      0x2008, 0x2009, 0x200a, 0x2028, 0x2029, 0x202f, 0x205f, 0x3000
    };

    static VALUE
    rb_str_blank_as(VALUE str)
    {
      rb_encoding *enc;
      char *s, *e;
      int i;
      int found;

      enc = STR_ENC_GET(str);
      s = RSTRING_PTR(str);
      /* An empty string is blank. */
      if (!s || RSTRING_LEN(str) == 0) return Qtrue;

      e = RSTRING_END(str);
      while (s < e) {
        int n;
        unsigned int cc = rb_enc_codepoint_len(s, e, &n, enc);

        found = 0;
        for (i = 0; i < 26; i++) {
          unsigned int current = as_blank[i];
          if (current == cc) {
            found = 1;
            break;
          }
          /* Table is sorted, so cc cannot appear further on. */
          if (cc < current) {
            break;
          }
        }

        if (!found) return Qfalse;
        s += n;
      }
      return Qtrue;
    }

Clearly it makes no sense to have such a method. If Ruby took over implementing String#blank? it would clash with Active Support, but imho it would enforce better API consistency.

Thoughts?

-- 
http://bugs.ruby-lang.org/
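
For reference, here is a rough pure-Ruby sketch of the three definitions discussed in this message, which may help when comparing their behaviour on edge cases. The module and method names (BlankSketch, blank_via_strip?, blank_ascii?, blank_as?) are hypothetical and not part of Ruby core or Active Support. ASCII_SPACE assumes rb_isspace accepts tab through carriage return plus space. This only illustrates the semantics and makes no attempt to match the speed of the C versions.

```ruby
# Hypothetical names throughout; a sketch of the semantics only.
module BlankSketch
  # Assumption: rb_isspace is true for "\t".."\r" (9..13) and " " (32).
  ASCII_SPACE = [0x9, 0xa, 0xb, 0xc, 0xd, 0x20].freeze

  # Same code points as the as_blank table in this message.
  AS_BLANK = [
    0x9, 0xa, 0xb, 0xc, 0xd, 0x20, 0x85, 0xa0, 0x1680, 0x180e,
    0x2000, 0x2001, 0x2002, 0x2003, 0x2004, 0x2005, 0x2006, 0x2007,
    0x2008, 0x2009, 0x200a, 0x2028, 0x2029, 0x202f, 0x205f, 0x3000
  ].freeze

  module_function

  # Current idiom: allocates a stripped copy before checking the length.
  def blank_via_strip?(str)
    str.strip.length == 0
  end

  # Mirrors rb_str_blank: every code point is ASCII whitespace or NUL.
  def blank_ascii?(str)
    str.each_codepoint.all? { |cc| cc == 0 || ASCII_SPACE.include?(cc) }
  end

  # Mirrors rb_str_blank_as: every code point is in the as_blank table.
  def blank_as?(str)
    str.each_codepoint.all? { |cc| AS_BLANK.include?(cc) }
  end
end
```

The two definitions disagree on inputs such as "\u3000" (ideographic space), which only blank_as? accepts, and "\0", which only blank_ascii? accepts. That is the clash mentioned above.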