[#4065] Surprise in Time#sec — Steven Jenkins <steven.jenkins@...>
This bit me:
[#4067] Segfault in Thread#initialize / caller — Florian Groß <florgro@...>
Moin!
[#4076] Ruby/DL — Jamis Buck <jamis_buck@...>
I recently used Ruby/DL to create bindings to the SQLite3 embedded
On Tue, Jan 04, 2005 at 02:53:49AM +0900, Jamis Buck wrote:
>>>>> "P" == Paul Brannan <pbrannan@atdesk.com> writes:
On Wed, Jan 05, 2005 at 03:05:48AM +0900, ts wrote:
>>>>> "P" == Paul Brannan <pbrannan@atdesk.com> writes:
On Thu, Jan 06, 2005 at 01:10:34AM +0900, ts wrote:
>>>>> "P" == Paul Brannan <pbrannan@atdesk.com> writes:
On Thu, Jan 06, 2005 at 06:57:57PM +0900, ts wrote:
>>>>> "P" == Paul Brannan <pbrannan@atdesk.com> writes:
On Fri, Jan 07, 2005 at 12:06:16AM +0900, ts wrote:
>>>>> "P" == Paul Brannan <pbrannan@atdesk.com> writes:
ts wrote:
[#4116] Test::Unit::Collector::Dir won't work with code that modifies $LOAD_PATH — Eric Hodel <drbrain@...7.net>
Any test code that depends upon modifications of $: fails when used
Hi,
On 11 Jan 2005, at 04:14, nobu.nokada@softhome.net wrote:
On 11 Jan 2005, at 09:39, Eric Hodel wrote:
On Sat, 15 Jan 2005 04:06:10 +0900, Eric Hodel <drbrain@segment7.net> wrote:
On Fri, 14 Jan 2005 23:48:58 -0500, Nathaniel Talbott
On Thu, 27 Jan 2005 17:17:14 -0500, Nathaniel Talbott
[#4146] The face of Unicode support in the future — Charles O Nutter <headius@...>
Hello Rubyists!
Hi,
Yukihiro Matsumoto <matz@ruby-lang.org> writes:
Paul Brannan <pbrannan@atdesk.com> writes:
Hi,
On Mon, Jan 10, 2005 at 11:53:48PM +0900, Yukihiro Matsumoto wrote:
Hi,
Yukihiro Matsumoto wrote:
Hi,
On Wed, Jan 12, 2005 at 02:13:35PM +0900, Yukihiro Matsumoto wrote:
Hi,
[#4189] Authenticated proxy support for open-uri — Neil Kohl <nakohl@...>
Hello!
[#4232] Carriage return on shebang — Florian Groß <florgro@...>
Moin.
[#4242] tracer.rb: Do not list pseudo source lines of binary extensions — Florian Groß <florgro@...>
Moin.
[#4243] Patch that enables https in open-uri.rb — Michael Neumann <mneumann@...>
Hi,
In article <41E93F42.9090705@ntecs.de>,
Tanaka Akira wrote:
[#4269] Re: The face of Unicode support in the future — Wes Nakamura <wknaka@...>
Hi,
Hi,
Yukihiro Matsumoto wrote:
Hi,
[#4296] parse_c.rb: allow whitespace after function names — Tilman Sauerbeck <tilman@...>
Hi,
Hi,
Yukihiro Matsumoto <matz@ruby-lang.org> [2005-01-21 17:43]:
[#4311] RFE: Enumerable#group_by, Array#^ — Florian Groß <florgro@...>
Moin.
[#4323] test/unit doesn't rescue a Exception — Tanaka Akira <akr@...17n.org>
test/unit doesn't rescue a Exception in a test method, as follows.
In article <87is5jb46q.fsf@serein.a02.aist.go.jp>,
On 9/1/06, Tanaka Akira <akr@fsij.org> wrote:
On Sep 2, 2006, at 6:34 PM, Nathaniel Talbott wrote:
In article <A604C0B3-95ED-4B9B-866C-79A2C7D5E3C4@segment7.net>,
On Sep 2, 2006, at 9:39 PM, Tanaka Akira wrote:
In article <622DAC7E-55DB-4854-B82B-A037CE9C75EF@segment7.net>,
In article <87ac5hv4bo.fsf@fsij.org>,
On Sep 3, 2006, at 8:21 AM, Tanaka Akira wrote:
[#4332] IO#clearerr missing in action — Eric Hodel <drbrain@...7.net>
I wanted to implement tail(1) in ruby cleanly, but found the best I
[#4335] When will Object#type disappear? — "David A. Black" <dblack@...>
Hi --
Re: The face of Unicode support in the future
Hi,
In message "Re: The face of Unicode support in the future"
on Wed, 19 Jan 2005 19:34:41 +0900, Wes Nakamura <wknaka@pobox.com> writes:
|Will this be efficient enough? When using a non-fixed-width encoding,
|String#[] won't run in constant time.
Right. And you have to trust me, it's efficient enough for most
cases. If you really care about efficiency, you can choose a
fixed-width encoding, which won't be slow even under M17N Ruby.
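To illustrate the point about non-constant-time indexing, here is a minimal sketch in present-day Ruby (not the 2005 M17N prototype). The helper names utf8_char_at and utf32_char_at are hypothetical, and the first one assumes valid UTF-8 input. It shows why finding the n-th character in a variable-width encoding means scanning from the start, while a fixed-width encoding allows a direct offset calculation.

```ruby
# Hypothetical helper: return the idx-th character of a UTF-8 string by
# walking lead bytes from the start. Cost grows with idx. Assumes valid UTF-8.
def utf8_char_at(str, idx)
  bytes = str.b                    # binary copy, so positions are byte offsets
  pos = 0
  count = 0
  while pos < bytes.bytesize
    lead = bytes.getbyte(pos)
    width = if lead < 0x80 then 1      # ASCII
            elsif lead >= 0xF0 then 4  # 4-byte sequence
            elsif lead >= 0xE0 then 3  # 3-byte sequence
            else 2                     # 2-byte sequence
            end
    return bytes.byteslice(pos, width).force_encoding("UTF-8") if count == idx
    pos += width
    count += 1
  end
  nil                              # idx is past the last character
end

# Hypothetical helper: with a fixed-width encoding (UTF-32 here, 4 bytes per
# character) the byte offset is simply idx * 4, with no scanning.
def utf32_char_at(str32, idx)
  str32.byteslice(idx * 4, 4)
end

text = "a\u3042c"
puts utf8_char_at(text, 1)
puts utf32_char_at(text.encode("UTF-32LE"), 1).encode("UTF-8")
```

Real implementations can cache positions or special-case ASCII-only strings, so the scan is a worst case rather than the typical cost.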
|1. This method is mentioned:
|
| String#encoding, returns a string specifying the encoding
|
| But I haven't seen this, is there also:
|
| String#encoding=
|
| I assume that setting the encoding would do nothing to the internal
| representation of the string (based on char *), it would just affect
| how methods that work on strings deal with characters, etc.
|2. What is the default encoding for strings? What encoding would
| String.new("") have #encoding set to?
The encoding of the script file. See [ruby-core:04192].
|3. Are literal strings assumed to be a certain encoding, (encoding of
| the script?) or can you specify an encoding at the time of creation?
The encoding of the script.
|3a. If there is a way of creating literal strings in other encodings,
| is there also a way of creating literal regex's in other encodings?
If the encoding of the script is ascii (or binary, which is an alias
to ascii), you can do it by using octal (or hexadecimal) string
representation + specifying encoding explicitly, e.g.
# my family name in Japanese in euc-jp encoding
"\244\336\244\304\244\342\244\310".encoding="euc-jp"
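The `encoding=` form above was a proposal at the time of this message. As a point of comparison only, Ruby 1.9 and later express the same idea with String#force_encoding, which relabels the bytes without changing them. The method name family_name_euc_jp below is made up for this sketch; the bytes are the ones from the message.

```ruby
# Hypothetical wrapper around the byte string quoted in the message.
# force_encoding only changes the encoding label; the 8 bytes stay as they are.
def family_name_euc_jp
  "\244\336\244\304\244\342\244\310".dup.force_encoding("EUC-JP")
end

name = family_name_euc_jp
puts name.encoding          # EUC-JP
puts name.valid_encoding?   # true
puts name.bytesize          # 8
puts name.length            # 4
puts name.encode("UTF-8")   # transcodes the bytes for display on a UTF-8 terminal
```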
|3b. In \x{xxxx}, does the number have to be a 4-digit (hex) number?
| How would you specify a utf-8 character, which can be more than 2 bytes?
| Is the \x{} syntax basically \x{byte byte byte..}?
No, that is the very reason for braces around digits. You can put
an arbitrary hexadecimal number in the braces.
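For comparison, released Ruby (1.9 and later) ended up spelling the braced escape as `\u{...}` rather than the `\x{...}` discussed here. A small sketch, assuming a UTF-8 script; the method name braced_escapes is hypothetical:

```ruby
# The braces let the hexadecimal code point have a variable number of digits.
def braced_escapes
  ["\u{41}", "\u{3042}", "\u{1F600}"]
end

braced_escapes.each do |s|
  # One character each, but a different number of bytes in UTF-8.
  puts "U+#{s.ord.to_s(16).upcase}: #{s.length} char, #{s.bytesize} byte(s)"
end
```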
|4. Will String#explode return an array of Fixnums, basically a byte array,
| of the raw char * values?
|
| This would mean that s.explode.size is not necessarily == s.size
String#explode (the name might change) returns an array of Fixnums,
which means s.explode.length == s.length. (String#size currently
returns the byte length of the string under the M17N prototype, but I
consider that a wrong decision, and it will be fixed in 1.9.)
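String#explode never shipped under that name. As a rough illustration, String#codepoints in Ruby 1.9 and later behaves the way this answer describes, and String#bytes gives the raw byte view the question asked about. The wrapper name explode below is kept only to mirror the message.

```ruby
# Hypothetical stand-in for the proposed String#explode, built on
# String#codepoints from released Ruby.
def explode(str)
  str.codepoints
end

s = "a\u3042"                          # one ASCII letter plus one hiragana
p explode(s)                           # [97, 12354]
p explode(s).length == s.length        # true: one integer per character
p s.bytes.length == s.bytesize         # true: the byte view is separate
p s.length                             # 2
p s.bytesize                           # 4
```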
|5. When using String#[idx]= to set a single character, it must take as
| an argument a string which has a size of 1 (i.e. one codepoint) but
| internally (i.e. #explode) doesn't necessarily have a size of 1?
It doesn't have to be a size of 1 anyway. See [ruby-core:04276].
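A short sketch of that behaviour as it works in released Ruby: the replacement passed to String#[]= can be shorter or longer than the character it replaces. The helper name replace_char is hypothetical and works on a copy so the argument is left untouched.

```ruby
# Hypothetical helper: replace the character at idx with a string of any length.
def replace_char(str, idx, replacement)
  copy = str.dup
  copy[idx] = replacement
  copy
end

puts replace_char("a\u3042c", 1, "xyz")   # axyzc
puts replace_char("abc", 1, "")           # ac
```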
|6. Right now there is Fixnum#chr. Will there be Array#chr(encoding) or
| something similar? So you could do something like:
|
| [ 0x30, 0xb9 ].chr("utf-16")
I think it will be
Integer#chr(encoding=script's_default)
to get a string corresponding to a codepoint. This is a place where I
haven't made a design decision. But you will have something like this.
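For reference, released Ruby did gain Integer#chr with an encoding argument. The sketch below shows it next to one way of handling the byte-array case from the question; both helper names are hypothetical, and treating `[0x30, 0xb9]` as UTF-16BE bytes is an assumption about what the question intended.

```ruby
# Build a string from code points, one Integer#chr(encoding) call per character.
def string_from_codepoints(codepoints, encoding = "UTF-8")
  codepoints.map { |cp| cp.chr(encoding) }.join
end

# Build a string from raw bytes, then label them with an encoding.
def string_from_bytes(bytes, encoding)
  bytes.pack("C*").force_encoding(encoding)
end

puts string_from_codepoints([0x3042, 0x3044])
puts string_from_bytes([0x30, 0xb9], "UTF-16BE").encode("UTF-8")
```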
|7. Will strings that, when converted to the same encoding, are identical,
| give different results for #intern when left in different encodings?
|
| What happens to an interned string with a binary encoding? Is it interned
| based on the internal bytes of the string rather than the characters?
This is also a place where I haven't made a design decision. Possible
options are:
* restrict symbols to 7bit ascii
* embed encoding info in Symbols
* symbols just use byte sequence
* something else I can't think of now.
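The message leaves this undecided. For what it is worth, released Ruby (1.9 and later) behaves like the "embed encoding info" option: ASCII-only symbols are normalized to US-ASCII, while non-ASCII symbols keep their encoding and differ across encodings. A small sketch, with the hypothetical helper same_symbol?:

```ruby
# Hypothetical helper: do two strings intern to the very same Symbol object?
def same_symbol?(a, b)
  a.to_sym.equal?(b.to_sym)
end

ascii_utf8 = "abc"
ascii_euc  = "abc".encode("EUC-JP")
kana_utf8  = "\u3042"
kana_euc   = kana_utf8.encode("EUC-JP")

p same_symbol?(ascii_utf8, ascii_euc)   # ASCII-only text: encoding is ignored
p same_symbol?(kana_utf8, kana_euc)     # same character, different bytes and encoding
p kana_utf8.to_sym.encoding
p kana_euc.to_sym.encoding
```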
matz.