Re: The face of Unicode support in the future

From: Yukihiro Matsumoto <matz@...>
Date: 2005-01-19 15:24:20 UTC
List: ruby-core #4278
Hi,

In message "Re: The face of Unicode support in the future"
    on Wed, 19 Jan 2005 19:34:41 +0900, Wes Nakamura <wknaka@pobox.com> writes:

|Will this be efficient enough?  When using a non-fixed-width encoding,
|String#[] won't run in constant time.

Right.  You will have to trust me that it is efficient enough for
most cases.  If you really care about efficiency, you can choose a
fixed-width encoding, which won't be slow even under M17N Ruby.
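
A minimal sketch of that trade-off, written against the String API that shipped in Ruby 1.9 and later rather than the M17N prototype discussed here. The helper name width_profile is hypothetical. In a variable-width encoding such as UTF-8, the byte offset of character i depends on the characters before it; in a fixed-width encoding such as UTF-32LE it is simply 4 * i.

```ruby
# Hypothetical helper: reports how many bytes each character occupies
# once the string is transcoded to the given encoding.
def width_profile(str, encoding)
  str.encode(encoding).each_char.map(&:bytesize)
end

sample = "a\u00E9\u3042"  # "a", e-acute, hiragana "a"

p width_profile(sample, "UTF-8")     # widths differ per character
p width_profile(sample, "UTF-32LE")  # every character has the same width

# With a fixed width, character i starts at byte 4 * i, so it can be
# sliced out by offset without scanning the preceding characters.
utf32 = sample.encode("UTF-32LE")
third = utf32.byteslice(4 * 2, 4)
p third == utf32[2]
```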

|1. This method is mentioned:
|
|     String#encoding, returns a string specifying the encoding
|   
|   But I haven't seen this, is there also:
|
|     String#encoding=
|   
|   I assume that setting the encoding would do nothing to the internal
|   representation of the string (based on char *), it would just affect
|   how methods that work on strings deal with characters, etc.


|2. What is the default encoding for strings?   What encoding would
|   String.new("") have #encoding set to?

The encoding of the script file.  See [ruby-core:04192].

|3. Are literal strings assumed to be a certain encoding, (encoding of
|   the script?) or can you specify an encoding at the time of creation?

The encoding of the script.
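
As a rough check of this rule in Ruby 1.9 and later (an assumption, since it postdates this message), a plain literal can be compared with `__ENCODING__`, which reports the encoding of the enclosing source file. The helper name below is hypothetical.

```ruby
# Hypothetical helper: returns the encoding of a fresh literal next to
# the encoding Ruby reports for the enclosing source file.
def literal_and_script_encoding
  ["".encoding, __ENCODING__]
end

literal, script = literal_and_script_encoding
puts "literal: #{literal}, script: #{script}"
```

In those later versions the script encoding is chosen with a magic comment such as `# coding: euc-jp` on the first line of the file.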

|3a. If there is a way of creating literal strings in other encodings,
|   is there also a way of creating literal regex's in other encodings?

If the encoding of the script is ascii (or binary, which is an alias
for ascii), you can do it by writing the string with octal (or
hexadecimal) escapes and specifying the encoding explicitly, e.g.

  # my family name in Japanese in euc-jp encoding
  "\244\336\244\304\244\342\244\310".encoding="euc-jp"
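
The `encoding=` setter above is the spelling proposed in this thread. For comparison, here is a sketch of the same idea using `String#force_encoding`, the method that Ruby 1.9 and later provide for retagging bytes without changing them. The helper name is hypothetical.

```ruby
# Hypothetical helper: tags the raw bytes as EUC-JP. The bytes are not
# modified; only the encoding attached to the string changes.
def euc_jp_name
  "\244\336\244\304\244\342\244\310".dup.force_encoding("EUC-JP")
end

name = euc_jp_name
p name.encoding         # the tag that was just applied
p name.valid_encoding?  # whether the bytes are well-formed EUC-JP
p name.length           # length in characters
p name.bytesize         # length in bytes
puts name.encode("UTF-8")
```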

|3b. In \x{xxxx}, does the number have to be a 4-digit (hex) number?
|   How would you specify a utf-8 character, which can be more than 2 bytes?
|   Is the \x{} syntax basically \x{byte byte byte..}?

No, that is the very reason for braces around digits.  You can put
an arbitrary hexadecimal number in the braces.
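
A sketch of the brace idea using `\u{...}`, which is the brace escape that string literals accept in Ruby 1.9 and later. Treating it as the counterpart of the `\x{...}` form discussed here is an assumption. The helper name is hypothetical.

```ruby
# Hypothetical helper: three one-character strings whose codepoints are
# written with two, four and five hex digits inside the braces.
def brace_escapes
  ["\u{41}", "\u{3042}", "\u{1F600}"]
end

brace_escapes.each do |s|
  printf("U+%04X  %d byte(s) in UTF-8\n", s.ord, s.bytesize)
end
```

The number in the braces names a codepoint, not a byte sequence, so the same escape covers characters of any encoded width.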

|4. Will String#explode return an array of Fixnums, basically a byte array,
|   of the raw char * values?
|   
|   This would mean that s.explode.size is not necessarily == s.size

String#explode (the name might change) returns an array of Fixnums,
which means s.explode.length == s.length.  (String#size currently
returns the byte length of the string under the M17N prototype, but I
consider that a wrong decision, and it will be fixed in 1.9.)
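
A sketch of that invariant in Ruby 1.9 and later, where the nearest equivalent is `String#codepoints`. The `explode` method defined here is only a hypothetical stand-in for the name used in this thread.

```ruby
# Hypothetical stand-in for the String#explode discussed above,
# built on String#codepoints.
def explode(str)
  str.codepoints
end

s = "a\u00E9\u3042"  # three characters, six bytes in UTF-8

p explode(s)
p explode(s).length == s.length  # one integer per character
p s.bytes.length == s.bytesize   # the byte view is a separate method
```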

|5. When using String#[idx]= to set a single character, it must take as
|   an argument a string which has a size of 1 (i.e.  one codepoint) but
|   internally (i.e. #explode) doesn't necessarily have a size of 1?

It doesn't have to have a size of 1 anyway.  See [ruby-core:04276].
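
A small illustration using the indexed assignment that current Ruby provides. The helper name is hypothetical; it copies the string so the caller's value is left alone.

```ruby
# Hypothetical helper: replaces the character at `index` with
# `replacement`, which may hold any number of characters.
def replace_char(str, index, replacement)
  copy = str.dup
  copy[index] = replacement
  copy
end

p replace_char("abc", 1, "XYZ")  # the string grows
p replace_char("abc", 1, "")     # the string shrinks
```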

|6. Right now there is Fixnum#chr. Will there be Array#chr(encoding) or
|   something similar?  So you could do something like:
|
|     [ 0x30, 0xb9 ].chr("utf-16")

I think it will be

  Integer#chr(encoding=script's_default)

to get a string corresponding to a codepoint.  This is a place where
I haven't made a design decision yet, but you will have something
like this.
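
For reference, a sketch of how this looks in Ruby 1.9 and later, where `Integer#chr` accepts an encoding argument. Both helper names are hypothetical, and building a string from raw bytes with `pack` and `force_encoding` is just one way to approximate the Array form asked about above.

```ruby
# Hypothetical helper: one-character string for a codepoint.
def char_from_codepoint(codepoint, encoding = __ENCODING__)
  codepoint.chr(encoding)
end

# Hypothetical helper: treats an array of byte values as text in the
# given encoding, without transcoding.
def string_from_bytes(bytes, encoding)
  bytes.pack("C*").force_encoding(encoding)
end

katakana_su = char_from_codepoint(0x30B9, "UTF-8")
from_bytes  = string_from_bytes([0x30, 0xB9], "UTF-16BE")

p katakana_su == from_bytes.encode("UTF-8")
```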

|7. Will strings that, when converted to the same encoding, are identical,
|   give different results for #intern when left in different encodings?
|
|   What happens to an interned string with a binary encoding?  Is it interned
|   based on the internal bytes of the string rather than the characters?

This is also a place where I haven't made a design decision.
Possible options are:

  * restrict symbols to 7bit ascii
  * embed encoding info in Symbols
  * symbols just use byte sequence
  * something else I haven't thought of yet.
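
A sketch for probing how a given Ruby answers this question. It assumes Ruby 1.9 or later, where `Symbol#encoding` exists; the helper name is hypothetical.

```ruby
# Hypothetical helper: true when both strings intern to the same Symbol.
def same_symbol?(a, b)
  a.to_sym.equal?(b.to_sym)
end

utf8 = "\u3042"               # hiragana "a" in UTF-8
euc  = utf8.encode("EUC-JP")  # same character, different bytes

p same_symbol?("abc".encode("UTF-8"), "abc".encode("EUC-JP"))
p same_symbol?(utf8, euc)
p utf8.to_sym.encoding
p euc.to_sym.encoding
```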

							matz.
