[ruby-talk:01616] Re: Thanks and more regex q's

From: gotoken@... (GOTO Kentaro)
Date: 2000-02-28 03:35:42 UTC
List: ruby-talk #1616
Hi, 

In message "[ruby-talk:01598] Thanks and more regex q's"
    on 00/02/27, Wes Nakamura <wknaka@pobox.com> writes:
>I see that there are classes for Japanese string conversion and
>detection, and there's jcode.rb, but is there a class or module that has
>the concept of each EUC/SJIS "character" being a discrete unit instead
>of two bytes?  Maybe a string-like class with the underlying data being
>an array of integers. 

jcode.rb makes String a `Japanese character string' rather than a `byte
string'.  Let's assume XX is a Japanese character: "XX"[0] == "XX" if 
jcode.rb is loaded. 
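
For readers on Ruby 1.9 or later, where jcode.rb no longer ships and every String carries its own encoding, the same byte-versus-character distinction can be sketched roughly as below. This only illustrates the idea under that assumption; it is not how jcode.rb itself works, and the helper name char_and_byte_view is made up for the example.

```ruby
# Sketch for Ruby 1.9+ (an assumption; jcode.rb is not used here).
# char_and_byte_view is a hypothetical helper name.
def char_and_byte_view(raw_bytes)
  # Tag a copy of the bytes as EUC-JP so String operates per character.
  euc = raw_bytes.dup.force_encoding(Encoding::EUC_JP)
  {
    bytesize:   euc.bytesize,     # number of bytes
    length:     euc.length,       # number of characters
    first_char: euc[0].bytes,     # bytes that make up the first character
    first_byte: euc.getbyte(0)    # the first byte as an Integer
  }
end

# "\xa4\xa2\xa4\xa4" is hiragana A followed by hiragana I in EUC-JP.
p char_and_byte_view("\xa4\xa2\xa4\xa4")
```

Indexing with [0] yields a whole two-byte character once the string is tagged as EUC-JP, while getbyte still gives access to the individual bytes.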

>Is the Japanese-sensitive regex's behavior documented anywhere (I didn't
>see anything for the "n" option either)?  e.g. is there a way to use
>regexes where /./ would match 2 bytes, since . could match a single
>multibyte character? 

Well, ..., oh, this feature is not documented in the English version of 
the reference manual :-<

 * String, Regexp and program parsing are Japanese character code sensitive. 
 * $KCODE is used to control the character code: "e" for EUC-Japan, 
   "s" for Shift-JIS, "n" for none (i.e. non-J-sensitive). 
 * The $KCODE value can be set with the -K command line option: -Ke for
   EUC-Japan, etc.
 * The default $KCODE value can be specified at the configuration stage:
   "./configure --with-default-kcode=none".  See "./configure --help". 
 * "./configure --with-default-kcode=none" will be the default in the next
   release of Ruby. 
 * The Regexp options e, s and n control how matching is done, whatever
   $KCODE is set to.
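
As a rough illustration of the last point, here is a small sketch of how the e and n options change what "." matches. It assumes Ruby 1.9 or later, where $KCODE no longer has any effect but the per-regexp options are still accepted on regexp literals; the helper name matched_bytes is made up for the example.

```ruby
# Sketch for Ruby 1.9+ (an assumption): $KCODE is not consulted there,
# so only the per-regexp options are shown.
# matched_bytes is a hypothetical helper name.
def matched_bytes(pattern, str)
  m = pattern.match(str)
  m ? m[0].bytes : nil
end

euc_a    = "\xa4\xa2".dup.force_encoding(Encoding::EUC_JP)  # one EUC-JP character
binary_a = euc_a.b                                          # same bytes, treated as binary

p matched_bytes(/./e, euc_a)     # /e: "." consumes one whole EUC-JP character
p matched_bytes(/./n, binary_a)  # /n: "." consumes a single byte
```

The n pattern is matched against a binary copy of the string on purpose, so that both the pattern and the subject agree that no multibyte encoding is in play.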

>Is it possible to set an option like "n" when creating a regex when
>using Regexp.new() (since I was creating the regex on the fly using
>strings)?  The regex options become an attribute of the regex
>itself, right?

Yes. 
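
A minimal sketch of building such a regexp from a string, assuming Ruby 1.9 or later, where the "n" option is also available as the constant Regexp::NOENCODING. The helper name build_binary_regexp is hypothetical.

```ruby
# Sketch for Ruby 1.9+ (an assumption). build_binary_regexp is a
# hypothetical helper name.
def build_binary_regexp(source)
  # Regexp::NOENCODING is the constant behind the "n" option.
  Regexp.new(source, Regexp::NOENCODING)
end

# Single-quoted, so the regexp engine (not the string literal) sees \xa4.
re = build_binary_regexp('\xa4([\xa1-\xf3])')

p re.options & Regexp::NOENCODING   # non-zero: the option is stored on the regexp
p re.match("\xa4\xa2".b)[1].bytes   # bytes captured by the group
```

The options are kept on the Regexp object itself, so a regexp created on the fly can be passed around and used later without repeating them.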

>This also didn't work:
>
># change hiragana to katakana...
>"\xa4\xa2".sub(/\xa4([\xa1-\xf3])/n, "\xa5\\1")

Hmmm, I don't know why that didn't work :-<  The following works:

  "\xa4\xa2".sub(/\xa4([\xa1-\xf3])/n){"\xa5#{$1}")

By the way, Ruby/KAKASI can be used to convert
{kanji,hiragana,katakana} -> {hiragana,katakana,ascii(romaji)}

For example, 

  require "kakasi"
  include Kakasi
  p Nakamura = "\xc3\xe6\xc2\xbc"           #=> (Nakamura in kanji)
  p kakasi("-ieuc -oeuc -Ja", Nakamura)     #=> "nakamura"
  p a = kakasi("-ieuc -oeuc -JH", Nakamura) #=> (Nakamura in hiragana)
  p kakasi("-ieuc -oeuc -JK", a)            #=> (Nakamura in katakana)

Check out http://www.ruby-lang.org/en/raa.html. 

-- gotoken
