Re: The face of Unicode support in the future

From: Wes Nakamura <wknaka@...>
Date: 2005-01-20 01:28:14 UTC
List: ruby-core #4286
On Thu, 20 Jan 2005, Yukihiro Matsumoto wrote:

| |Also, if s.explode.length == s.length, same character in different encodings:
| |
| |1. "\x{30b9}".explode (encoding = utf16) => [ 0x30b9 ]?
| |2. "\x{b930}".explode (encoding = utf16le) => [ 0x30b9 ]?
| |3. "\x{e382b9}".explode (encoding = utf8) => [ 0xe3, 0x82, 0xb9 ] or [ 0xe382b9 ]?
| 
| I'm afraid you have misunderstood.  The number within the braces
| should be a code point, not a sequence of bytes.
| 
| 							matz.

Yes, I'm getting quite confused! :)

I'm looking at:
http://www.unicode.org/versions/Unicode4.0.0/ch03.pdf
which is not an especially easy read.

    Unicode codespace: 0-10ffff (I suppose you could call this the unicode
        character "set")

    Codepoint: a value in the unicode codespace

    Encoded character: an association or mapping between an abstract
        character and a codepoint

    Code unit: bit combination that can represent a unit of encoded text
        utf-8 has 8-bit code units, and utf-16 has 16-bit code units

The unicode codepoint and utf-16-encoded values are the same, I believe,
for 0-ffff (leaving aside d800-dfff, which utf-16 reserves for surrogates).
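
To make the codepoint / code unit distinction concrete, here is a small sketch. It assumes a current Ruby (1.9 or later), whose String#encode did not exist when this thread was written; the helper names utf8_units and utf16_units are made up for illustration. It prints the code units for one codepoint inside 0-ffff and one above it.

```ruby
# Sketch only: relies on String#encode from Ruby 1.9+, not the 2005-era API.
su   = "\u30b9"      # katakana "su", codepoint 30b9 (inside 0-ffff)
clef = "\u{1d11e}"   # musical G clef, codepoint 1d11e (above ffff)

# Hypothetical helper: utf-8 code units are 8 bits each.
def utf8_units(str)
  str.encode("UTF-8").bytes
end

# Hypothetical helper: utf-16 code units are 16 bits each ("n" = big-endian 16-bit).
def utf16_units(str)
  str.encode("UTF-16BE").unpack("n*")
end

[su, clef].each do |ch|
  u8  = utf8_units(ch).map  { |u| format("%02x", u) }.join(" ")
  u16 = utf16_units(ch).map { |u| format("%04x", u) }.join(" ")
  puts format("codepoint %x: utf-8 units [%s], utf-16 units [%s]", ch.ord, u8, u16)
end
```

For 30b9 the single utf-16 code unit equals the codepoint; for 1d11e utf-16 needs two code units (a surrogate pair), so the two values no longer coincide.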

So (these are all the same character - a katakana "su"):

"\x30\xb9"     - unicode codespace, utf-16 encoding, codepoint 30b9
"\xe3\x82\xb9" - unicode codespace, utf-8 encoding, codepoint 30b9

"\x25\x39"     - "JIS" encoding, codepoint 2539 (in JIS X0208 codespace)
"\xa5\xb9"     - euc-jp encoding, codepoint 2539 (in JIS X0208 codespace)
"\x83\x58"     - shift-jis encoding, codepoint 2539 (in JIS X0208 codespace)

Should explode give the codepoint values listed above?  The bottom three
examples are complicated by shift-jis and euc-jp using multiple
character sets (codespaces?), not just JIS X0208.
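
For comparison, the byte sequences above can be reproduced with a short sketch. It assumes a current Ruby (1.9 or later) and its String#encode and String#ord, neither of which is the explode method being discussed here; explode remains hypothetical. The variable names are mine.

```ruby
# Sketch only: modern Ruby (1.9+) API, used to check the byte values by hand.
su = "\u30b9"   # katakana "su", written as a Unicode codepoint

names   = %w[UTF-16BE UTF-16LE UTF-8 ISO-2022-JP EUC-JP Shift_JIS]
encoded = names.each_with_object({}) { |name, h| h[name] = su.encode(name) }

# Print the raw bytes of the same character in each encoding.
# ISO-2022-JP ("JIS") wraps the two JIS X0208 bytes in escape sequences.
encoded.each do |name, str|
  hex = str.bytes.map { |b| format("%02x", b) }.join(" ")
  puts format("%-12s %s", name, hex)
end

# What today's Ruby reports as the character's integer value (String#ord):
puts format("%x", encoded["UTF-8"].ord)      # 30b9, the Unicode codepoint
puts format("%x", encoded["UTF-16LE"].ord)   # 30b9, byte order does not matter
puts format("%x", encoded["EUC-JP"].ord)     # a5b9, the encoded bytes as one integer
puts format("%x", encoded["Shift_JIS"].ord)  # 8358, likewise
```

Note that for the Unicode encodings ord gives the codepoint regardless of the bytes, while for euc-jp and shift-jis it gives the encoded byte value rather than the JIS X0208 value 2539. That is one possible answer to the question above, not necessarily the one explode should adopt.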

Wes