Re: Unicode in Ruby now?

From: Curt Sampson <cjs@...>
Date: 2002-08-02 09:24:19 UTC
List: ruby-talk #46044
On Fri, 2 Aug 2002, MikkelFJ wrote:

> > Yeah. This is getting into complex nightmare city. That's why I'd prefer
> > to have the basic system just work completely in Unicode. One could have
> > a separate character system (character and string classes, byte-stream
> > to char converters, etc.) to work with this tagged format if one wished.
>
> But isn't this what matz suggest?
> Each stream is tagged, that is the same as having different types. It's
> basically just  a different way to store the type while having a lot of
> common string operations.

No, because then you have to deal with conversions. Most popular
character sets are convertible to Unicode and back without loss. That is
not true of any arbitrary pair of character sets, though, even if you go
through Unicode.

The reason for this is as follows. Say character set Foo has split
a unified Han character, "a", and also has "A". When converting to
Unicode, that "A" will be preserved because it's assigned a code point
in a compatibility area, and when you convert back from Unicode, that
"A" will be translated to "A" in Foo. However, if character set Bar
does not have "A", just "a", the "A" will be converted to "a". When you
go from Bar back to Unicode, you end up with "a" again because there's
no way to tell that it was originally "A" when you converted out.
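
The loss described above can be sketched with toy mapping tables. Every
name here is hypothetical: Foo and Bar are the made-up character sets
from the paragraph, and the symbols stand in for real code points. This
only illustrates the shape of the problem; it is not a real converter.

```ruby
# Toy model of the Foo -> Unicode -> Bar -> Unicode round trip.
# Every table and name here is hypothetical.

# Foo distinguishes the unified character from its split-off variant.
FOO_TO_UNICODE = { "foo:a" => :unified_a, "foo:A" => :compat_A }.freeze
UNICODE_TO_FOO = FOO_TO_UNICODE.invert.freeze

# Bar only has the unified character, so both code points collapse.
UNICODE_TO_BAR = { unified_a: "bar:a", compat_A: "bar:a" }.freeze
BAR_TO_UNICODE = { "bar:a" => :unified_a }.freeze

def convert(chars, table)
  chars.map { |ch| table.fetch(ch) }
end

def round_trip_via_foo(unicode)
  convert(convert(unicode, UNICODE_TO_FOO), FOO_TO_UNICODE)
end

def round_trip_via_bar(unicode)
  convert(convert(unicode, UNICODE_TO_BAR), BAR_TO_UNICODE)
end

text = [:unified_a, :compat_A]
p round_trip_via_foo(text)  # => [:unified_a, :compat_A]
p round_trip_via_bar(text)  # => [:unified_a, :unified_a]
```

Unicode to Foo and back is lossless, but once the text has passed
through Bar the variant is gone for good.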

But there's an even better reason than this for converting to
Unicode on input, rather than doing internal tagging. If you don't
have conversion tables for a particular character encoding, it's
much better to find out at the time you try to get the information
in to the system than at some arbitrary later point when you try
to do a conversion. That way you know where the problem information
is coming from.
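
As a rough sketch of "convert on input", assuming a present-day Ruby
whose String#encode accepts a destination and a source encoding: the
helper name read_as_unicode and its source_name argument are made up
for illustration. The point is only that a missing converter shows up
at the boundary, together with where the data came from.

```ruby
# Convert incoming bytes to Unicode at the system boundary, so that a
# missing or failing conversion is reported along with its source.
# The helper and its parameters are hypothetical.
def read_as_unicode(bytes, declared_encoding, source_name)
  bytes.encode("UTF-8", declared_encoding)
rescue EncodingError, ArgumentError => e
  raise ArgumentError,
        "cannot import #{source_name} (#{declared_encoding}): #{e.message}"
end

# A known encoding converts cleanly.
puts read_as_unicode("caf\xE9".b, "ISO-8859-1", "menu.txt")

# An encoding with no conversion table fails right here, at input time.
begin
  read_as_unicode("\xE9".b, "X-NO-SUCH-CHARSET", "legacy.dat")
rescue ArgumentError => e
  puts e.message
end
```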

In terms of interface, I would say:

    1. Continue to use String as it is for "binary" data. This is
    efficient, if you don't need to do much processing.

    2. Add a UString or similar for dealing with UTF-16 data. There's
    no need for surrogate support in this, for reasons I will get into
    below, so this is straight fixed width. Reasonably efficient (almost
    maximally efficient for those of us using Asian languages :-)) and
    very easy to use.

    3. Add other, specialized classes when you need to do special
    purpose things. No need for this in the standard distribution.
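
To make item 2 a little more concrete, here is a minimal sketch of what
such a class might look like. Only the name UString comes from the
proposal; the methods, the internal array of 16-bit code values, and the
use of String#encode and unpack are my own assumptions, not an agreed
design.

```ruby
# Hypothetical UString: text held as fixed-width 16-bit code values,
# with no interpretation of surrogates at all.
class UString
  attr_reader :units

  def initialize(units)
    @units = units
  end

  # Decode external bytes into 16-bit code values on input.
  def self.from_bytes(bytes, encoding)
    new(bytes.encode("UTF-16BE", encoding).unpack("n*"))
  end

  def length
    @units.length
  end

  # Plain array indexing: every element is exactly one code value.
  def [](index, count = 1)
    UString.new(@units[index, count] || [])
  end

  def to_utf8
    @units.pack("n*").force_encoding("UTF-16BE").encode("UTF-8")
  end
end

s = UString.from_bytes("\u65E5\u672C\u8A9E", "UTF-8")  # three kanji
puts s.length        # 3
puts s[1, 2].to_utf8 # the last two characters
```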

> BTW: Unicode is not a fixed width format.

In terms of code values, it is fixed width. However, some characters are
represented by pairs of code values.

> ...but there are escape codes...

No, there are no escape codes. The high and low code values for
surrogate characters have their own special areas, and so are easily
identifiable.
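
For reference, the high surrogates occupy 0xD800..0xDBFF and the low
surrogates 0xDC00..0xDFFF, so a range check on a single code value is
all it takes. A small sketch (the helper names are mine):

```ruby
HIGH_SURROGATES = (0xD800..0xDBFF).freeze
LOW_SURROGATES  = (0xDC00..0xDFFF).freeze

def high_surrogate?(unit)
  HIGH_SURROGATES.cover?(unit)
end

def low_surrogate?(unit)
  LOW_SURROGATES.cover?(unit)
end

# U+1D11E (musical G clef) lies outside the 16-bit range, so UTF-16
# stores it as one high and one low surrogate.
units = "A\u{1D11E}".encode("UTF-16BE").unpack("n*")
units.each do |u|
  kind = if high_surrogate?(u) then "high surrogate"
         elsif low_surrogate?(u) then "low surrogate"
         else "ordinary code value"
         end
  puts format("%04X  %s", u, kind)
end
```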

> and options for future extensions.

Not that I know of. Can you explain what these are?

> Hence UCS-4 is a strategy with limited timespan.

Not at all, unless they decide to change Unicode to the point where it
no longer uses 16-bit code values, or add escape codes, or something
like that. That would be backward-incompatible, severely complicate
processing, and generally cause all hell to break loose. So I'd rate
this as "Not Likely."

Here are a few points to keep in mind about Unicode processing:

    1. The surrogate pairs are almost never used. Two years ago
    there weren't even any characters assigned to those code points.

    2. There are many situations where, even if surrogate pairs
    are present, you don't know or care, and need do nothing to
    correctly deal with them.

    3. Broken surrogate pairs are not a problem; the standard says you
    must be able to ignore broken pairs, if you interpret surrogate
    pairs at all.

    4. The surrogate pairs are extremely easy to distinguish, even
    if you don't interpret them.

    5. The code for dealing with surrogate pairs well (basically,
    not breaking them) is very simple.

The implication of point 1 is that one should not spend a lot of effort
dealing with surrogate pairs, as very few users will ever use them. Very
few Asian users will ever use them in their lifetimes, in fact.

The implication of points 2 and 3 is that not everything that deals
with Unicode has to deal with, or even know about, surrogate pairs. If
you are writing a web application, for example, you typically just take
fields as a whole from the web browser or database, and give them as a
whole to the web browser or database. Thus only the web browser really
has any need at all to deal with surrogate pairs.

If you take a substring of a string and in the process end up with
a surrogate pair half on either end, that's no problem. It just
gets ignored by whatever facilities deal with surrogate pairs, or
treated as an unknown character by those that don't (rather than
two unknown characters for an unsplit surrogate pair).

The only time you really run into a problem is if you insert
something into a string; there's a chance you might split the
surrogate pair, and lose the character. This is pretty uncommon
except in interactive input situations, though, where you know how
to handle surrogate pairs and can avoid doing this, or where you
don't know and the user can't see the characters anyway.

Well, another area you can run into problems with is line wrapping, but
there's no single algorithm for that anyway, and plenty of algorithms
break on languages for which they were not designed. So there you should
add some very simple code that avoids splitting surrogate pairs. (This
code is much simpler than the line wrapping code anyway, so it's hardly
a burden.) That shows the advantages of points 4 and 5 (essentially the
same point).
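
The "very simple code" for not splitting a pair might look something
like this. It is only a sketch over an array of 16-bit code values; the
function name safe_break is made up, and a real line wrapper or insert
routine would call something like it before cutting.

```ruby
# Given an array of 16-bit code values and a proposed cut position,
# return a position that does not fall between a high surrogate and
# the low surrogate that follows it. (Hypothetical helper.)
def safe_break(units, index)
  splits_pair = index > 0 && index < units.length &&
                (0xD800..0xDBFF).cover?(units[index - 1]) &&
                (0xDC00..0xDFFF).cover?(units[index])
  splits_pair ? index - 1 : index
end

# "A", U+1D11E as a surrogate pair, "B"
units = [0x0041, 0xD834, 0xDD1E, 0x0042]
puts safe_break(units, 2)  # 1: backs up so the pair stays together
puts safe_break(units, 3)  # 3: already a safe place to cut
```

Backing up rather than moving forward is an arbitrary choice here; for
insertion either side of the pair works equally well.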

So I propose just what the Unicode standard itself proposes in
section 5.4: UString (or whatever we call it) should have the
Surrogate Support Level "none"; i.e., it completely ignores the
existence of surrogate pairs. Things that use UString that have
the potential to encounter surrogate pair problems or wish to
interpret them can add simple or complex code, as they need, to
deal with the problem at hand. (Many users of UString will need to
do nothing.)

Note that there's a big difference between this and your UTF-8
proposal: ignoring multibyte stuff in UTF-8 is going to cause much,
much more lossage because there's a much, much bigger chance of
breaking things when using Asian languages. With UTF-16, you probably
won't even encounter surrogates, whereas with Japanese in UTF-8,
pretty much every character is multibyte.
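
A quick way to see the difference, assuming a Ruby with String#encode
(the helper names are mine): count how many characters need more than
one byte in UTF-8, and how many 16-bit code values are surrogates in
UTF-16.

```ruby
# Characters that take more than one byte in UTF-8.
def utf8_multibyte_chars(text)
  text.encode("UTF-8").each_char.count { |ch| ch.bytesize > 1 }
end

# 16-bit code values that are surrogates in UTF-16.
def utf16_surrogate_units(text)
  text.encode("UTF-16BE").unpack("n*").count { |u| (0xD800..0xDFFF).cover?(u) }
end

japanese = "\u65E5\u672C\u8A9E"  # three ordinary kanji
puts utf8_multibyte_chars(japanese)   # 3: every character is multibyte
puts utf16_surrogate_units(japanese)  # 0: no surrogates at all
```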

cjs
-- 
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC
