[ruby-core:56183] [ruby-trunk - Feature #8678] Allow invalid string to work with regexp
From: "matz (Yukihiro Matsumoto)" <matz@...>
Date: 2013-07-25 23:42:27 UTC
List: ruby-core #56183
Issue #8678 has been updated by matz (Yukihiro Matsumoto).
I am positive. I'd rather want to make this default (if possible).
Matz.
----------------------------------------
Feature #8678: Allow invalid string to work with regexp
https://bugs.ruby-lang.org/issues/8678#change-40673
Author: naruse (Yui NARUSE)
Status: Assigned
Priority: Normal
Assignee: matz (Yukihiro Matsumoto)
Category: M17N
Target version: current: 2.1.0
Legacy Ruby 1.8 could run a regexp match against broken strings (strings containing invalid byte sequences).
At that time, people could search for characters in binary data.
Since Ruby 1.9, Ruby raises an ArgumentError when a regexp is matched against a broken string.
This makes character-wise regexp matching on binary data hard.
The following patch allows it through the constant Regexp::LOOSEENCODING.
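For reference, here is a minimal sketch of the behaviour described above and of two workarounds that do not need the patch. It assumes a Ruby that has String#scrub (the method listed in the NEWS hunk below) and a UTF-8 source file; the variable names are only illustrative, and Regexp::LOOSEENCODING is deliberately not used because it exists only in this proposal.

```ruby
# A UTF-8 string with one stray byte on each side of a valid character
# (U+3042 HIRAGANA LETTER A is encoded as E3 81 82).
broken = "\x80\xE3\x81\x82\x81".dup.force_encoding(Encoding::UTF_8)

# Matching a regexp against the broken string raises ArgumentError.
error = begin
  broken =~ /\p{Hiragana}/
  nil
rescue ArgumentError => e
  e
end

# Workaround 1: replace the invalid bytes first, then match character-wise.
scrubbed = broken.scrub("?")
position = scrubbed =~ /\p{Hiragana}/

# Workaround 2: treat the data as raw bytes and search for the byte sequence.
bytes = broken.b
needle = "\u3042".b
byte_offset = bytes.index(needle)
```

Both workarounds lose something: scrubbing changes the data, and the binary search gives up character-wise matching, which is the gap this proposal is aimed at.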
commit eb0111ff7ae3f563ce201c4a5f724f121336d42d
Author: NARUSE, Yui <naruse@ruby-lang.org>
Date: Mon Jul 22 05:37:44 2013 +0900
    * Regexp
      * New constant:
        * Regexp::LOOSEENCODING: allow matching even if the target string
          contains an invalid byte sequence. [experimental]
diff --git a/NEWS b/NEWS
index f5fe388..ade0b03 100644
--- a/NEWS
+++ b/NEWS
@@ -35,6 +35,11 @@ with all sufficient information, see the ChangeLog file.
 * misc
   * Mutex#owned? is no longer experimental.
+* Regexp
+  * New constant:
+    * Regexp::LOOSEENCODING: allow matching even if the target string
+      contains an invalid byte sequence. [experimental]
+
 * String
   * New methods:
     * String#scrub and String#scrub! verify and fix invalid byte sequence.
diff --git a/re.c b/re.c
index e5cc79d..230a2e0 100644
--- a/re.c
+++ b/re.c
@@ -256,6 +256,7 @@ rb_memsearch(const void *x0, long m, const void *y0, long n, rb_encoding *enc)
 #define REG_LITERAL FL_USER5
 #define REG_ENCODING_NONE FL_USER6
+#define REG_ENCODING_LOOSE FL_USER7
 #define KCODE_FIXED FL_USER4
@@ -263,6 +264,7 @@ rb_memsearch(const void *x0, long m, const void *y0, long n, rb_encoding *enc)
     (ONIG_OPTION_IGNORECASE|ONIG_OPTION_MULTILINE|ONIG_OPTION_EXTEND)
 #define ARG_ENCODING_FIXED 16
 #define ARG_ENCODING_NONE 32
+#define ARG_ENCODING_LOOSE 64
 static int
 char_to_option(int c)
@@ -1251,7 +1253,8 @@ rb_reg_prepare_enc(VALUE re, VALUE str, int warn)
 {
     rb_encoding *enc = 0;
-    if (rb_enc_str_coderange(str) == ENC_CODERANGE_BROKEN) {
+    if (!(RBASIC(re)->flags & REG_ENCODING_LOOSE) &&
+        rb_enc_str_coderange(str) == ENC_CODERANGE_BROKEN) {
         rb_raise(rb_eArgError,
                  "invalid byte sequence in %s",
                  rb_enc_name(rb_enc_get(str)));
@@ -2433,6 +2436,9 @@ rb_reg_initialize(VALUE obj, const char *s, long len, rb_encoding *enc,
     if (options & ARG_ENCODING_NONE) {
         re->basic.flags |= REG_ENCODING_NONE;
     }
+    if (options & ARG_ENCODING_LOOSE) {
+        re->basic.flags |= REG_ENCODING_LOOSE;
+    }
     re->ptr = make_regexp(RSTRING_PTR(unescaped), RSTRING_LEN(unescaped), enc,
                           options & ARG_REG_OPTION_MASK, err,
@@ -3091,6 +3097,7 @@ rb_reg_options(VALUE re)
     options = RREGEXP(re)->ptr->options & ARG_REG_OPTION_MASK;
     if (RBASIC(re)->flags & KCODE_FIXED) options |= ARG_ENCODING_FIXED;
     if (RBASIC(re)->flags & REG_ENCODING_NONE) options |= ARG_ENCODING_NONE;
+    if (RBASIC(re)->flags & REG_ENCODING_LOOSE) options |= ARG_ENCODING_LOOSE;
     return options;
 }
@@ -3579,6 +3586,8 @@ Init_Regexp(void)
     rb_define_const(rb_cRegexp, "FIXEDENCODING", INT2FIX(ARG_ENCODING_FIXED));
     /* see Regexp.options and Regexp.new */
     rb_define_const(rb_cRegexp, "NOENCODING", INT2FIX(ARG_ENCODING_NONE));
+    /* see Regexp.options and Regexp.new */
+    rb_define_const(rb_cRegexp, "LOOSEENCODING", INT2FIX(ARG_ENCODING_LOOSE));
     rb_global_variable(&reg_cache);
diff --git a/string.c b/string.c
index 1d784e3..caf0baf 100644
--- a/string.c
+++ b/string.c
@@ -3970,7 +3970,7 @@ str_gsub(int argc, VALUE *argv, VALUE str, int bang)
     cp = sp;
     str_enc = STR_ENC_GET(str);
     rb_enc_associate(dest, str_enc);
-    ENC_CODERANGE_SET(dest, rb_enc_asciicompat(str_enc) ? ENC_CODERANGE_7BIT : ENC_CODERANGE_VALID);
+    /*ENC_CODERANGE_SET(dest, rb_enc_asciicompat(str_enc) ? ENC_CODERANGE_7BIT : ENC_CODERANGE_VALID);*/
     do {
         n++;
diff --git a/test/ruby/test_regexp.rb b/test/ruby/test_regexp.rb
index 11e86ec..b8f6897 100644
--- a/test/ruby/test_regexp.rb
+++ b/test/ruby/test_regexp.rb
@@ -8,6 +8,10 @@ class TestRegexp < Test::Unit::TestCase
     $VERBOSE = nil
   end
+  def u(str)
+    str.dup.force_encoding(Encoding::UTF_8)
+  end
+
   def teardown
     $VERBOSE = @verbose
   end
@@ -958,6 +962,17 @@ class TestRegexp < Test::Unit::TestCase
     }
   end
+  def test_encoding_loose
+    str = u("\x80\xE3\x81\x82\x81")
+    assert_equal(0, Regexp.new(".", Regexp::LOOSEENCODING) =~ str)
+    assert_equal(1, Regexp.new(u('\p{Any}'), Regexp::LOOSEENCODING) =~ str)
+    assert_equal(1, Regexp.new("\u3042", Regexp::LOOSEENCODING) =~ str)
+    assert_equal(1, Regexp.new(u('\p{Hiragana}'), Regexp::LOOSEENCODING) =~ str)
+    assert_equal(0, Regexp.new(u('\A.\p{Hiragana}.\z'), Regexp::LOOSEENCODING) =~ str)
+    str = u("\xf1\x80\xE3\x81\x82\x81")
+    assert_equal(0, Regexp.new(u('\A..\p{Hiragana}.\z'), Regexp::LOOSEENCODING) =~ str)
+  end
+
   # This assertion is for porting x2() tests in testpy.py of Onigmo.
   def assert_match_at(re, str, positions, msg = nil)
     re = Regexp.new(re) unless re.is_a?(Regexp)
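
The patch gives the new option the bit value 64, next to the existing 16 (FIXEDENCODING) and 32 (NOENCODING). The sketch below only illustrates how such option bits combine and are read back through Regexp#options. The value 64 is copied from the patch and held in a local variable named proposed_loose, which is a made-up name, because the constant is not defined in any released Ruby.

```ruby
# Existing encoding-related option bits exposed by Regexp.
fixed = Regexp::FIXEDENCODING   # ARG_ENCODING_FIXED in re.c
none  = Regexp::NOENCODING      # ARG_ENCODING_NONE in re.c

# Assumption: value taken from ARG_ENCODING_LOOSE in the patch above.
# It is modelled as a plain local because the constant does not exist.
proposed_loose = 64

# The encoding bits must not collide with each other or with the /i, /x, /m bits.
syntax_bits = Regexp::IGNORECASE | Regexp::EXTENDED | Regexp::MULTILINE
all_bits = [fixed, none, proposed_loose, syntax_bits]
disjoint = all_bits.combination(2).all? { |a, b| (a & b).zero? }

# Regexp#options reports encoding bits together with the syntax bits.
re = /a/in
has_none  = (re.options & Regexp::NOENCODING) != 0
has_icase = (re.options & Regexp::IGNORECASE) != 0
```

Since the flags are plain integers OR-ed together, a caller under this patch would combine the new option with the others in the same way, for example Regexp::IGNORECASE | Regexp::LOOSEENCODING.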
--
http://bugs.ruby-lang.org/