[ruby-talk:7436] Unicode Issues (was: "A Java Developer's Wish List for Ruby")

From: "Richard A.Schulman" <RichardASchulman@...>
Date: 2000-12-16 15:27:45 UTC
List: ruby-talk #7436

Matz:
>|>I'm not going to choose UCS-2.  UCS-2 is obsolete.

Schulman:
>|Do you mean that it has been superseded by UTF-16? Or what?

Matz:
>That's what I mean.

Good. Both UCS-2 and UTF-16 have the same 16-bit encoding
for the 49,194 presently defined characters used in most of
the languages of the world. UTF-16 is a superset of UCS-2,
adding in the possibility of surrogates. Just out of
curiosity, though, how important is the surrogate extension
to users in Japan?
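
The relationship between UCS-2 and UTF-16 described above can be made concrete with a minimal sketch. It assumes a Ruby that provides String#encode (1.9 or later, which postdates this message); the helper name utf16_units is hypothetical and chosen only for illustration. A character in the 16-bit range encodes to a single code unit, identical to its UCS-2 value, while a code point beyond that range needs a surrogate pair.

```ruby
# Hypothetical helper: return the 16-bit code units of a string in UTF-16.
# Assumes String#encode (Ruby 1.9+), which did not exist when this was written.
def utf16_units(str)
  # Encode big-endian, then read the bytes back as unsigned 16-bit integers.
  str.encode("UTF-16BE").unpack("n*")
end

bmp  = "\u65E5"      # U+65E5, a CJK character within the 16-bit range
supp = "\u{20000}"   # U+20000, a code point beyond the 16-bit range

bmp_units  = utf16_units(bmp)   # => [0x65E5], one unit, same value as in UCS-2
supp_units = utf16_units(supp)  # => [0xD840, 0xDC00], a high/low surrogate pair

puts bmp_units.map  { |u| format("%04X", u) }.join(" ")
puts supp_units.map { |u| format("%04X", u) }.join(" ")
```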

Matz:
>|>But I'm going to add an M17N feature to the next version of Ruby.
>|>The future Ruby should handle Unicode as well as other encodings.

What exactly is the "M17N feature" that you plan to add? 

Matz:
>Unicode 3.0 is really an improvement.  Most Japanese can accept it
>except for time and space efficiency.
>...
>    By using UTF-8, most Japanese characters take 3 bytes each.  That
>    would be 1.5 times bigger than at present.  Imagine all of your text
>    data growing 50% bigger.

I agree. I'm not partial to UTF-8 either. In my earlier
post, I recommended UCS-2, which is a two-byte encoding for
both the Western languages and the CJK languages. As far as
DBCS Japanese goes, UCS-2 introduces no changes in storage
or processing requirements. The same is true for the
superset UTF-16, assuming surrogates are not required.

In converting to UTF-16, it's the Western languages that
would suffer a "hit" in terms of storage and processing
time. UTF-8, accordingly, will probably remain common in
Western end-user shops for some time to come but not, I
hope, as the internal encoding of system software.
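
The storage trade-off discussed above can be checked with a short sketch. It assumes a Ruby with String#encode (1.9 or later) and uses EUC-JP as a stand-in for the "current" double-byte Japanese encoding, which is an assumption; the helper name byte_sizes and the sample strings are hypothetical.

```ruby
# Encodings to compare. EUC-JP stands in for a legacy double-byte
# Japanese encoding (an assumption made for illustration).
ENCODINGS = ["UTF-8", "UTF-16BE", "EUC-JP"]

# Hypothetical helper: byte length of a string in each encoding.
def byte_sizes(str)
  ENCODINGS.map { |name| [name, str.encode(name).bytesize] }.to_h
end

japanese = "\u65E5\u672C\u8A9E"  # three kanji ("Japanese language")
western  = "Ruby"                # four ASCII letters

japanese_sizes = byte_sizes(japanese)  # UTF-8: 9, UTF-16BE: 6, EUC-JP: 6
western_sizes  = byte_sizes(western)   # UTF-8: 4, UTF-16BE: 8, EUC-JP: 4

ENCODINGS.each do |name|
  puts format("%-9s japanese=%d western=%d",
              name, japanese_sizes[name], western_sizes[name])
end
```

For the Japanese sample, UTF-8 is 1.5 times the size of either 16-bit form; for the ASCII sample, UTF-16 is twice the size of UTF-8.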

My own experience in developing international software is
that it is MUCH easier to work in an environment in which
UCS-2 or UTF-16 is the internal storage norm rather than
UTF-8. Accordingly, I seek out operating systems, databases,
and language providers that standardize on either of these
as their normative, internal coding.

It is necessary, of course, to provide secondary
transformation routines into the two other Unicode
transformation formats (UTF-8 and UTF-32), as well as
various legacy encodings. 
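
As a rough sketch of what such transformation routines look like from the user's side, the following converts one string into UTF-16, UTF-32, and a legacy encoding and back again. It assumes a Ruby with String#encode (1.9 or later); Shift_JIS is chosen here only as an example of a legacy encoding, and the sample text is hypothetical.

```ruby
# Sample text mixing ASCII and kanji; the literal is UTF-8.
text = "Ruby \u65E5\u672C\u8A9E"

# Convert to the other Unicode transformation formats and one legacy encoding.
utf16 = text.encode("UTF-16BE")
utf32 = text.encode("UTF-32BE")
sjis  = text.encode("Shift_JIS")  # example legacy encoding (an assumption)

# Converting each form back to UTF-8 should reproduce the original text.
round_trips = [utf16, utf32, sjis].all? { |s| s.encode("UTF-8") == text }

puts "UTF-16: #{utf16.bytesize} bytes"
puts "UTF-32: #{utf32.bytesize} bytes"
puts "Shift_JIS: #{sjis.bytesize} bytes"
puts "round trip ok: #{round_trips}"
```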

>Using Unicode as an internal universal character
>set covers 98% of M17N, but I want to cover ALL of the cases, and
>from my personal experience (Ruby Japanization), I think it's
>efficiently possible.

What is the 2% that isn't covered by Unicode's UTF-16
encoding (which provides for about 1 million code points, if
one includes the surrogate facility)?
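
The "about 1 million" figure follows from the size of the two surrogate ranges. A minimal sketch of the arithmetic, with constant names that are hypothetical and chosen only for illustration:

```ruby
# Surrogate ranges defined by UTF-16.
HIGH_SURROGATES = (0xD800..0xDBFF)  # 1,024 values
LOW_SURROGATES  = (0xDC00..0xDFFF)  # 1,024 values

# Each high/low pair addresses one code point beyond the 16-bit range.
supplementary = HIGH_SURROGATES.size * LOW_SURROGATES.size

# The 16-bit range itself, minus the values reserved as surrogates.
sixteen_bit = 0x10000 - HIGH_SURROGATES.size - LOW_SURROGATES.size

total = supplementary + sixteen_bit

puts "via surrogates: #{supplementary}"  # 1048576
puts "16-bit range:   #{sixteen_bit}"    # 63488
puts "total:          #{total}"          # 1112064
```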

Richard Schulman
