[#32498] Re: [ruby-cvs:21399] Ruby:r14162 (trunk): * parse.y (expr): redefinable not (!) operator. — SASADA Koichi <ko1@...>

 ささだです.

9 messages 2007/12/09

[#32512] Re: [ruby-cvs:21409] Ruby:r14172 (trunk): * transcode.c: new file to provide encoding conversion features. — Nobuyoshi Nakada <nobu@...>

なかだです。

33 messages 2007/12/10
[#32520] Re: [ruby-cvs:21409] Ruby:r14172 (trunk): * transcode.c: new file to provide encoding conversion features. — Martin Duerst <duerst@...> 2007/12/10

中田さん、こんにちは。

[#32527] Re: [ruby-cvs:21409] Ruby:r14172 (trunk): * transcode.c: new file to provide encoding conversion features. — Nobuyoshi Nakada <nobu@...> 2007/12/10

なかだです。

[#32535] Re: [ruby-cvs:21409] Ruby:r14172 (trunk): * transcode.c: new file to provide encoding conversion features. — Yukihiro Matsumoto <matz@...> 2007/12/11

まつもと ゆきひろです

[#32537] Re: [ruby-cvs:21409] Ruby:r14172 (trunk): * transcode.c: new file to provide encoding conversion features. — Martin Duerst <duerst@...> 2007/12/11

At 15:33 07/12/11, Yukihiro Matsumoto wrote:

[#32538] Re: [ruby-cvs:21409] Ruby:r14172 (trunk): * transcode.c: new file to provide encoding conversion features. — Yukihiro Matsumoto <matz@...> 2007/12/11

まつもと ゆきひろです

[#32539] Re: [ruby-cvs:21409] Ruby:r14172 (trunk): * transcode.c: new file to provide encoding conversion features. — Nobuyoshi Nakada <nobu@...> 2007/12/11

なかだです。

[#32550] Binary String — Hidetoshi NAGAI <nagai@...>

永井@知能.九工大です.

204 messages 2007/12/12
[#32551] Re: Binary String — Yukihiro Matsumoto <matz@...> 2007/12/12

まつもと ゆきひろです

[#32552] Re: Binary String — Hidetoshi NAGAI <nagai@...> 2007/12/12

永井@知能.九工大です.

[#32553] Re: Binary String — Yukihiro Matsumoto <matz@...> 2007/12/12

まつもと ゆきひろです

[#32560] Re: Binary String — Hidetoshi NAGAI <nagai@...> 2007/12/12

永井@知能.九工大です.

[#32561] Re: Binary String — Nobuyoshi Nakada <nobu@...> 2007/12/12

なかだです。

[#33018] Re: Binary String — Hidetoshi NAGAI <nagai@...> 2008/01/11

永井@知能.九工大です.

[#33019] Re: Binary String — Tanaka Akira <akr@...> 2008/01/11

In article <20080111.171950.78716471.nagai@ai.kyutech.ac.jp>,

[#33024] Re: Binary String — Hidetoshi NAGAI <nagai@...> 2008/01/11

永井@知能.九工大です.

[#33027] Re: Binary String — Tanaka Akira <akr@...> 2008/01/11

In article <20080111.184442.74744388.nagai@ai.kyutech.ac.jp>,

[#33041] Re: Binary String — Hidetoshi NAGAI <nagai@...> 2008/01/11

永井@知能.九工大です.

[#33047] Re: Binary String — Tanaka Akira <akr@...> 2008/01/11

In article <20080112.004750.74741782.nagai@ai.kyutech.ac.jp>,

[#33055] Re: Binary String — Hidetoshi NAGAI <nagai@...> 2008/01/12

永井@知能.九工大です.

[#33080] Re: Binary String — Tanaka Akira <akr@...> 2008/01/13

In article <20080112.100830.112615025.nagai@ai.kyutech.ac.jp>,

[#33104] Re: Binary String — Hidetoshi NAGAI <nagai@...> 2008/01/14

永井@知能.九工大です.

[#33108] Re: Binary String — "NARUSE, Yui" <naruse@...> 2008/01/15

成瀬です。

[#33121] Re: Binary String — Hidetoshi NAGAI <nagai@...> 2008/01/15

永井@知能.九工大です.

[#33123] Re: Binary String — "NARUSE, Yui" <naruse@...> 2008/01/15

成瀬です。

[#33127] Re: Binary String — Hidetoshi NAGAI <nagai@...> 2008/01/16

永井@知能.九工大です.

[#33138] Re: Binary String — "NARUSE, Yui" <naruse@...> 2008/01/16

成瀬です。

[#33147] Re: Binary String — Hidetoshi NAGAI <nagai@...> 2008/01/17

永井@知能.九工大です.

[#33152] Re: Binary String — "NARUSE, Yui" <naruse@...> 2008/01/17

成瀬です。

[#33153] Re: Binary String — 遊楽庵 <yu_raku_an@...> 2008/01/17

遊楽庵です。

[#33154] Re: Binary String — "NARUSE, Yui" <naruse@...> 2008/01/17

成瀬です。

[#33157] Re: Binary String — Yukihiro Matsumoto <matz@...> 2008/01/17

まつもと ゆきひろです

[#33330] Re: Binary String — "NARUSE, Yui" <naruse@...> 2008/01/23

成瀬です。

[#33336] Re: Binary String — Tanaka Akira <akr@...> 2008/01/23

In article <47975933.8010907@airemix.com>,

[#33337] Re: Binary String — Yukihiro Matsumoto <matz@...> 2008/01/23

まつもと ゆきひろです

[#33346] Re: Binary String — "U.Nakamura" <usa@...> 2008/01/24

こんにちは、なかむら(う)です。

[#33348] Re: Binary String — Yukihiro Matsumoto <matz@...> 2008/01/24

まつもと ゆきひろです

[#33352] Re: Binary String — "U.Nakamura" <usa@...> 2008/01/24

こんにちは、なかむら(う)です。

[#33353] Re: Binary String — Yukihiro Matsumoto <matz@...> 2008/01/24

まつもと ゆきひろです

[#33122] Re: Binary String — Tanaka Akira <akr@...> 2008/01/15

In article <20080115.024201.41653719.nagai@ai.kyutech.ac.jp>,

[#33126] Re: Binary String — Hidetoshi NAGAI <nagai@...> 2008/01/16

永井@知能.九工大です.

[#33151] Re: Binary String — Tanaka Akira <akr@...> 2008/01/17

In article <20080116.102057.41656941.nagai@ai.kyutech.ac.jp>,

[#33160] Re: Binary String — Hidetoshi NAGAI <nagai@...> 2008/01/17

永井@知能.九工大です.

[#33165] Re: Binary String — Tanaka Akira <akr@...> 2008/01/18

In article <20080117.233832.74721189.nagai@ai.kyutech.ac.jp>,

[#33188] Re: Binary String — Hidetoshi NAGAI <nagai@...> 2008/01/19

永井@知能.九工大です.

[#33193] Re: Binary String — Yukihiro Matsumoto <matz@...> 2008/01/19

まつもと ゆきひろです

[#33202] Re: Binary String — Hidetoshi NAGAI <nagai@...> 2008/01/19

永井@知能.九工大です.

[#33230] Re: Binary String — Yukihiro Matsumoto <matz@...> 2008/01/20

まつもと ゆきひろです

[#33236] Re: Binary String — Hidetoshi NAGAI <nagai@...> 2008/01/21

永井@知能.九工大です.

[#33238] Re: Binary String — SASADA Koichi <ko1@...> 2008/01/21

 m17n には近づかないようにしているささだです。

[#33241] Re: Binary String — "NARUSE, Yui" <naruse@...> 2008/01/21

成瀬です。

[#33248] Re: Binary String — Yukihiro Matsumoto <matz@...> 2008/01/21

まつもと ゆきひろです

[#33281] Re: Binary String — Hidetoshi NAGAI <nagai@...> 2008/01/22

永井@知能.九工大です.

[#33285] Re: Binary String — "NARUSE, Yui" <naruse@...> 2008/01/22

成瀬です。

[#33318] Re: Binary String — Hidetoshi NAGAI <nagai@...> 2008/01/23

永井@知能.九工大です.

[#33186] Ruby1.9 String バイト列へのインデックス アクセス — "Hisanori Kiryu" <hkiryu@...> 2008/01/19

長文失礼します。

[#33192] Re: Ruby1.9 String バイト列へのインデックス アクセス — Yukihiro Matsumoto <matz@...> 2008/01/19

まつもと ゆきひろです

[#33195] Re: Ruby1.9 String バイト列へのインデックス アクセス — rubikitch@... 2008/01/19

From: Yukihiro Matsumoto <matz@ruby-lang.org>

[#33199] Re: Ruby1.9 String バイト列へのインデックス アクセス — "NARUSE, Yui" <naruse@...> 2008/01/19

成瀬です。

[#33020] Re: Binary String — "NARUSE, Yui" <naruse@...> 2008/01/11

成瀬です。

[#32610] 1.9.1 issues left (as of 12/15) — Yukihiro Matsumoto <matz@...>

まつもと ゆきひろです

14 messages 2007/12/15

[#32715] issues left as of 12/25 2:00am JST — Yukihiro Matsumoto <matz@...>

まつもと ゆきひろです

41 messages 2007/12/24
[#32738] issues left as of 12/25 noon JST — Yukihiro Matsumoto <matz@...> 2007/12/25

まつもと ゆきひろです

[#32739] Re: issues left as of 12/25 noon JST — Yukihiro Matsumoto <matz@...> 2007/12/25

まつもと ゆきひろです

[#32791] Re: [ruby-list:44387] [ANN] Ruby 1.9.0 is released — SASADA Koichi <ko1@...>

 ささだです。

21 messages 2007/12/25

[#32823] class TimeSpan — "NARUSE, Yui" <naruse@...>

成瀬です。

18 messages 2007/12/27

[#32843] Windowでのデフォルトエンコーディング — KIMURA Koichi <kimura.koichi@...>

木村です。

30 messages 2007/12/28
[#32845] Re: Windowでのデフォルトエンコーディング — "U.Nakamura" <usa@...> 2007/12/28

こんにちは、なかむら(う)です。

[#32851] Re: Window でのデフォルトエンコーディング — Martin Duerst <duerst@...> 2007/12/28

At 13:55 07/12/28, U.Nakamura wrote:

[#32853] Re: Windowでのデフォルトエンコーディング — "NARUSE, Yui" <naruse@...> 2007/12/28

U.Nakamura wrote:

[#32857] Re: Windowでのデフォルトエンコーディング — "U.Nakamura" <usa@...> 2007/12/28

こんにちは、なかむら(う)です。

[#32852] Resolv::DNS#getaddresses doesn't return IPv6 address — "NARUSE, Yui" <naruse@...>

成瀬です。

17 messages 2007/12/28
[#32923] Re: Resolv::DNS#getaddresses doesn't return IPv6 address — Takahiro Kambe <taca@...> 2008/01/05

こんにちは。

[#32924] Re: Resolv::DNS#getaddresses doesn't return IPv6 address — "NARUSE, Yui" <naruse@...> 2008/01/05

成瀬です。

[#32925] Re: Resolv::DNS#getaddresses doesn't return IPv6 address — Takahiro Kambe <taca@...> 2008/01/05

In message <477EF0C9.4060103@airemix.com>

[#32929] Re: Resolv::DNS#getaddresses doesn't return IPv6 address — "NARUSE, Yui" <naruse@...> 2008/01/05

成瀬です

[ruby-dev:32438] Re: String#inspect

From: Tanaka Akira <akr@...>
Date: 2007-12-02 22:51:43 UTC
List: ruby-dev #32438
In article <87abotqmhh.fsf_-_@fsij.org>,
  Tanaka Akira <akr@fsij.org> writes:

> In article <87wsrz7wpg.fsf@fsij.org>,
>   Tanaka Akira <akr@fsij.org> writes:
>
>> 文字列がそのエンコーディングとして正しいかどうかを確認する機
>> 能とか。
>
> とうのを実装するには Oniguruma レベルの mbclen あたりで間違っ
> たエンコーディングをまともに扱って、その情報を提供させないと
> いけないわけですが、やってみるとこんな感じですかね。

String#valid_encoding? で確認するだけのをつけて、UTF-8 を厳
密化するとこんな感じでしょうか。
(string.c と utf8.c 以外は変わってないので省略)

Index: string.c
===================================================================
--- string.c	(revision 14088)
+++ string.c	(working copy)
@@ -2919,10 +2919,19 @@ rb_str_inspect(VALUE str)
     str_cat_char(result, '"', enc);
     p = RSTRING_PTR(str); pend = RSTRING_END(str);
     while (p < pend) {
-	int c = rb_enc_codepoint(p, pend, enc);
-	int n = rb_enc_codelen(c, enc);
+	int c;
+	int n;
 	int cc;
 
+        n = rb_enc_precise_mbclen(p, pend, enc);
+        if (!MBCLEN_CHARFOUND(n)) {
+            c = (unsigned char)*p++;
+            goto escape_byte;
+        }
+
+	c = rb_enc_codepoint(p, pend, enc);
+	n = rb_enc_codelen(c, enc);
+
 	p += n;
 	if (c == '"'|| c == '\\' ||
 	    (c == '#' && (cc = rb_enc_codepoint(p,pend,enc),
@@ -2961,8 +2970,10 @@ rb_str_inspect(VALUE str)
 	}
 	else {
 	    char buf[5];
-	    char *s = buf;
+	    char *s;
 
+escape_byte:
+	    s = buf;
 	    sprintf(buf, "\\%03o", c & 0377);
 	    while (*s) {
 		str_cat_char(result, *s++, enc);
@@ -5232,6 +5243,25 @@ rb_str_force_encoding(VALUE str, VALUE e
     return str;
 }
 
+static VALUE
+rb_str_valid_encoding_p(VALUE str)
+{
+    char *p = RSTRING_PTR(str);
+    char *pend = RSTRING_END(str);
+    rb_encoding *enc = rb_enc_get(str);
+
+    while (p < pend) {
+	int n;
+
+        n = rb_enc_precise_mbclen(p, pend, enc);
+        if (!MBCLEN_CHARFOUND(n)) {
+            return Qfalse;
+        }
+        p += n;
+    }
+    return Qtrue;
+}
+
 /**********************************************************************
  * Document-class: Symbol
  *
@@ -5644,6 +5674,7 @@ Init_String(void)
 
     rb_define_method(rb_cString, "encoding", rb_obj_encoding, 0); /* in encoding.c */
     rb_define_method(rb_cString, "force_encoding", rb_str_force_encoding, 1);
+    rb_define_method(rb_cString, "valid_encoding?", rb_str_valid_encoding_p, 0);
 
     id_to_s = rb_intern("to_s");
 
Index: enc/utf8.c
===================================================================
--- enc/utf8.c	(revision 14088)
+++ enc/utf8.c	(working copy)
@@ -56,13 +56,189 @@ static const int EncLen_UTF8[] = {
   2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
   2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
   3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
-  4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 1, 1
+  4, 4, 4, 4, 4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
 };
 
+typedef enum {
+  FAILURE = -2,
+  ACCEPT,
+  S0, S1, S2, S3,
+  S4, S5, S6, S7
+} state_t;
+#define A ACCEPT
+#define F FAILURE
+static const signed char trans[][0x100] = {
+  { /* S0   0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f */
+    /* 0 */ A, A, A, A, A, A, A, A, A, A, A, A, A, A, A, A,
+    /* 1 */ A, A, A, A, A, A, A, A, A, A, A, A, A, A, A, A,
+    /* 2 */ A, A, A, A, A, A, A, A, A, A, A, A, A, A, A, A,
+    /* 3 */ A, A, A, A, A, A, A, A, A, A, A, A, A, A, A, A,
+    /* 4 */ A, A, A, A, A, A, A, A, A, A, A, A, A, A, A, A,
+    /* 5 */ A, A, A, A, A, A, A, A, A, A, A, A, A, A, A, A,
+    /* 6 */ A, A, A, A, A, A, A, A, A, A, A, A, A, A, A, A,
+    /* 7 */ A, A, A, A, A, A, A, A, A, A, A, A, A, A, A, A,
+    /* 8 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 9 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* a */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* b */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* c */ F, F, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
+    /* d */ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
+    /* e */ 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 3, 3,
+    /* f */ 5, 6, 6, 6, 7, F, F, F, F, F, F, F, F, F, F, F 
+  },
+  { /* S1   0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f */
+    /* 0 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 1 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 2 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 3 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 4 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 5 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 6 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 7 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 8 */ A, A, A, A, A, A, A, A, A, A, A, A, A, A, A, A,
+    /* 9 */ A, A, A, A, A, A, A, A, A, A, A, A, A, A, A, A,
+    /* a */ A, A, A, A, A, A, A, A, A, A, A, A, A, A, A, A,
+    /* b */ A, A, A, A, A, A, A, A, A, A, A, A, A, A, A, A,
+    /* c */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* d */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* e */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* f */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F 
+  },
+  { /* S2   0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f */
+    /* 0 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 1 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 2 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 3 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 4 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 5 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 6 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 7 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 8 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 9 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* a */ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
+    /* b */ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
+    /* c */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* d */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* e */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* f */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F 
+  },
+  { /* S3   0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f */
+    /* 0 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 1 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 2 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 3 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 4 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 5 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 6 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 7 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 8 */ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
+    /* 9 */ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
+    /* a */ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
+    /* b */ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
+    /* c */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* d */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* e */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* f */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F 
+  },
+  { /* S4   0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f */
+    /* 0 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 1 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 2 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 3 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 4 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 5 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 6 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 7 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 8 */ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
+    /* 9 */ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
+    /* a */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* b */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* c */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* d */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* e */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* f */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F 
+  },
+  { /* S5   0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f */
+    /* 0 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 1 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 2 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 3 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 4 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 5 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 6 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 7 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 8 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 9 */ 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
+    /* a */ 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
+    /* b */ 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
+    /* c */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* d */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* e */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* f */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F 
+  },
+  { /* S6   0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f */
+    /* 0 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 1 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 2 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 3 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 4 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 5 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 6 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 7 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 8 */ 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
+    /* 9 */ 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
+    /* a */ 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
+    /* b */ 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
+    /* c */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* d */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* e */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* f */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F 
+  },
+  { /* S7   0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f */
+    /* 0 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 1 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 2 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 3 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 4 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 5 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 6 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 7 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* 8 */ 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
+    /* 9 */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* a */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* b */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* c */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* d */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* e */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F,
+    /* f */ F, F, F, F, F, F, F, F, F, F, F, F, F, F, F, F 
+  },
+};
+#undef A
+#undef F
+
 static int
 utf8_mbc_enc_len(const UChar* p, const UChar* e, OnigEncoding enc)
 {
-  return EncLen_UTF8[*p];
+  int firstbyte = *p++;
+  state_t s;
+  s = trans[0][firstbyte];
+  if (s < 0) return s == ACCEPT ? ONIGENC_CONSTRUCT_MBCLEN_CHARFOUND(1) :
+                                  ONIGENC_CONSTRUCT_MBCLEN_INVALID();
+
+  if (p == e) return ONIGENC_CONSTRUCT_MBCLEN_NEEDMORE(EncLen_UTF8[firstbyte]-1);
+  s = trans[s][*p++];
+  if (s < 0) return s == ACCEPT ? ONIGENC_CONSTRUCT_MBCLEN_CHARFOUND(2) :
+                                  ONIGENC_CONSTRUCT_MBCLEN_INVALID();
+
+  if (p == e) return ONIGENC_CONSTRUCT_MBCLEN_NEEDMORE(EncLen_UTF8[firstbyte]-2);
+  s = trans[s][*p++];
+  if (s < 0) return s == ACCEPT ? ONIGENC_CONSTRUCT_MBCLEN_CHARFOUND(3) :
+                                  ONIGENC_CONSTRUCT_MBCLEN_INVALID();
+
+  if (p == e) return ONIGENC_CONSTRUCT_MBCLEN_NEEDMORE(EncLen_UTF8[firstbyte]-3);
+  s = trans[s][*p++];
+  return s == ACCEPT ? ONIGENC_CONSTRUCT_MBCLEN_CHARFOUND(4) :
+                       ONIGENC_CONSTRUCT_MBCLEN_INVALID();
 }
 
 static int
-- 
[田中 哲][たなか あきら][Tanaka Akira]

In This Thread

Prev Next