Re: The face of Unicode support in the future

From: Austin Ziegler <halostatue@...>
Date: 2005-01-13 12:41:17 UTC
List: ruby-core #4216
Note: this message contains UTF-8 characters.

On Thu, 13 Jan 2005 07:56:22 +0900, Yukihiro Matsumoto
<matz@ruby-lang.org> wrote:
> In message "Re: The face of Unicode support in the future" on Thu,
> 13 Jan 2005 01:35:36 +0900, Christian Neukirchen
> <chneukirchen@gmail.com> writes:
>| This sounds likely to result in duplicated efforts... Do it
>| pragmatically; I don't think it should be very hard to provide a
>| default Character class that people can "customize" by
>| subclassing or method redefinition.
> A character might be represented by either:

> * code point
> * sequence of code points
> * or even set of attributes, without any code point
> * or something totally different

> But never mind. I'm no expert. I just don't want to repeat the
> argument again in English.

I'm not going to claim to be an expert, but in my work I have had to
look this over extensively -- mostly from the perspective of needing
to support Unicode, but also from the perspective of dealing with
unknown code pages. I'm not really understanding the last two points
in your list (a set of attributes, without any code point; something
totally different).

In Unicode encodings, at least, a code point can be multiple bytes
(or multiple words in UTF-16 with surrogates) -- and a character can
be multiple code points when combining characters are involved (e.g.,
é can be represented as e + combining-', but there are explicit rules
for the order in which combining characters can be specified).
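To make the three layers concrete, here is a minimal sketch that counts bytes, UTF-16 words, and code points for a few strings. It assumes a Ruby with encoding-aware Strings (1.9 or later, which did not exist when this was written); the method name unit_counts and the constant names are made up for illustration.

```ruby
# Count the units at each layer for one string.
# Assumption: Ruby 1.9+ where String carries an encoding.
def unit_counts(str)
  utf8 = str.encode("UTF-8")
  {
    bytes:       utf8.bytesize,                          # UTF-8 bytes
    utf16_words: utf8.encode("UTF-16BE").bytesize / 2,   # 16-bit code units
    code_points: utf8.codepoints.length
  }
end

E_ACUTE_COMPOSED   = "\u00E9"     # single code point U+00E9
E_ACUTE_DECOMPOSED = "e\u0301"    # e followed by U+0301 COMBINING ACUTE ACCENT
G_CLEF             = "\u{1D11E}"  # outside the BMP, needs a surrogate pair in UTF-16

[E_ACUTE_COMPOSED, E_ACUTE_DECOMPOSED, G_CLEF].each do |s|
  p unit_counts(s)
end
```

The composed and decomposed forms of the same character differ at every layer, and the G clef is one code point but two UTF-16 words.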

It's important to note that in filenames, at least, Windows will
represent most characters as composed (e.g., é) and the Mac will
represent most characters as decomposed (e.g., e + combining-').
These are, however, the same Character. So when working with Unicode
strings, if I have
  a = "résumé"
then when I do:
  a[1]
I expect to get a Character of "é". With this Character, I should be
able to extract the composed codepoint(s) as well as the decomposed
codepoint(s) -- there are regular transformations available in
Unicode for this matter.
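As a rough sketch of that behaviour, the following assumes a much later Ruby (2.5 or newer), where String#unicode_normalize and String#grapheme_clusters exist. It is not a proposal for the API discussed here; the helper names character_at, same_character? and code_point_labels are hypothetical.

```ruby
# Assumption: Ruby 2.5+ (String#unicode_normalize, String#grapheme_clusters).

# Return the index-th user-perceived character, however many
# code points it takes to spell.
def character_at(str, index)
  str.grapheme_clusters[index]
end

# Two strings are the same Character(s) if they agree after
# canonical composition (NFC).
def same_character?(a, b)
  a.unicode_normalize(:nfc) == b.unicode_normalize(:nfc)
end

# Format code points as U+XXXX labels for display.
def code_point_labels(str)
  str.codepoints.map { |cp| format("U+%04X", cp) }
end

composed   = "r\u00E9sum\u00E9"                  # as Windows would store it
decomposed = composed.unicode_normalize(:nfd)    # as the Mac would store it

ch = character_at(decomposed, 1)
p code_point_labels(ch.unicode_normalize(:nfd))  # decomposed code points
p code_point_labels(ch.unicode_normalize(:nfc))  # composed code point
p same_character?(character_at(composed, 1), ch)
```

Plain indexing with decomposed[1] would return only the base letter "e", which is why the sketch goes through grapheme clusters rather than code point positions.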

I do not necessarily expect to be able to do these transformations
on normal Strings -- but I do expect to be able to do these
transformations on I18N/M17N Ruby with multiple String encoding
support. Whether it's as a Character class or something that I apply
to a String, I don't particularly care.
  
I fully expect that you'll have a sensible API for this when you get
it added, but I would like such standard handling -- and a standard
way of adding new handling -- in Ruby 2.0.

Have you looked at ICU4C (IBM's International Components for Unicode
for C)? It uses a UTF-16 encoding internally, but it already supports
quite a bit of what I'm talking about.

-austin
-- 
Austin Ziegler * halostatue@gmail.com
               * Alternate: austin@halostatue.ca

