From: "kosaki (Motohiro KOSAKI)"
Date: 2013-01-03T02:07:29+09:00
Subject: [ruby-dev:46828] [ruby-trunk - Bug #7646][Assigned] invalid byte sequence in String#each_line

Issue #7646 has been updated by kosaki (Motohiro KOSAKI).

Category set to core
Status changed from Open to Assigned
Assignee set to nobu (Nobuyoshi Nakada)
Priority changed from Normal to High
Target version set to 2.0.0

This looks like a regression however you look at it.
I am tagging it for 2.0.0.

----------------------------------------
Bug #7646: invalid byte sequence in String#each_line
https://bugs.ruby-lang.org/issues/7646#change-35181

Author: yoshidam (Yoshida Masato)
Status: Assigned
Priority: High
Assignee: nobu (Nobuyoshi Nakada)
Category: core
Target version: 2.0.0
ruby -v: ruby 2.0.0dev (2013-01-02 trunk 38676) [i686-linux]

=begin
When a separator is passed to String#each_line, an "invalid byte sequence" error is raised if the string contains non-ASCII characters.

  $ ruby -ve '"\n\u0100".each_line("\n") {|l| p l }'
  ruby 2.0.0dev (2013-01-02 trunk 38676) [i686-linux]
  "\n"
  -e:1:in `each_line': invalid byte sequence in UTF-8 (ArgumentError)
          from -e:1:in `<main>'

The bug appears to have been introduced by a change around r38616. After a separator match, p is advanced past the separator and then advanced again by n at the end of the loop, so the next decode can start in the middle of a multibyte character.

  --- string.c.org	2012-12-27 21:57:07.000000000 +0900
  +++ string.c	2013-01-02 23:36:47.000000000 +0900
  @@ -6199,14 +6199,14 @@
           if (c == newline &&
               (rslen <= 1 ||
                (pend - p >= rslen && memcmp(RSTRING_PTR(rs), p, rslen) == 0))) {
  -            p += (rslen ? rslen : n);
  -            line = rb_str_subseq(str, s - ptr, p - s);
  +            const char *pp = p + (rslen ? rslen : n);
  +            line = rb_str_subseq(str, s - ptr, pp - s);
               if (wantarray)
                   rb_ary_push(ary, line);
               else
                   rb_yield(line);
               str_mod_check(str, ptr, len);
  -            s = p;
  +            s = pp;
           }
           p += n;
       }
=end

-- 
http://bugs.ruby-lang.org/
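
The double advance of the scan pointer that the patch removes can be modeled in plain Ruby. This is only a rough sketch, not the code in string.c: each_line_model, its buggy: flag, and the byte-offset variables start and pos (standing in for the C pointers s and p) are hypothetical names introduced here, and the model assumes a one-byte separator such as "\n".

```ruby
# Illustrative model only; byte offsets stand in for the C pointers in string.c.
# Assumption: the separator is a single byte, e.g. "\n".
def each_line_model(str, sep, buggy:)
  lines = []
  start = 0 # plays the role of `s` (start of the current line)
  pos = 0   # plays the role of `p` (scan position)
  while pos < str.bytesize
    # Decode one character beginning at byte offset pos (UTF-8 needs at most 4 bytes).
    ch = str.byteslice(pos, 4)[0]
    raise ArgumentError, "invalid byte sequence at byte #{pos}" unless ch.valid_encoding?
    n = ch.bytesize
    if ch == sep
      if buggy
        # Pre-patch behavior: the scan position itself is moved past the separator.
        pos += sep.bytesize
        lines << str.byteslice(start, pos - start)
        start = pos
      else
        # Patched behavior: a separate end-of-line offset (`pp` in the diff).
        line_end = pos + sep.bytesize
        lines << str.byteslice(start, line_end - start)
        start = line_end
      end
    end
    pos += n # runs in both variants; in the buggy one this is a second advance
  end
  # Emit the trailing text that has no terminating separator.
  lines << str.byteslice(start, str.bytesize - start) if start < str.bytesize
  lines
end
```

For the string from the report, "\n\u0100", the buggy: true variant yields "\n" and then tries to decode at the continuation byte 0x80, so it raises ArgumentError in the same place the report does. The buggy: false variant returns the two expected lines, "\n" and "\u0100".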