[ruby-core:23155] ENC_CODERANGE_7BIT is a bad optimisation

From: Brian Candler <B.Candler@...>
Date: 2009-04-08 07:56:40 UTC
List: ruby-core #23155

In my opinion, ENC_CODERANGE_7BIT is a bad optimisation. Unless it is
recalculated correctly every time a string is built or modified, it leads to
bizarre behaviour. A number of ad-hoc fixes have been made to date, and yet
there are still core String methods which have the problem (see one example
below - I have not bothered to test them exhaustively, because I don't think
that is the right solution).

The biggest problem I see is: if Ruby's own implementation can't get it
right, what chance is there for extension libraries to get it right too?

I can suggest a number of ways it could be made better.

1. Don't store this flag. Calculate this property of the string only when it
is needed.

2. If you want to cache this property, then have three states: unknown,
7BIT, 8BIT. Any action which modifies the string then just needs to set the
state to unknown, which is cheaper than recalculating the state. (However,
there is still the danger that an external library will forget to clear the
flag.)

3. Don't include this state when calculating the hash value of a string.
However, if the flag is wrong it will then show itself in even more
insidious ways, such as two strings being treated as having compatible
encodings when they don't, or vice versa.
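
To make suggestions 2 and 3 concrete, here is a minimal sketch in plain Ruby.
It is only a model: the class CachedRangeString and its methods are
hypothetical names invented for illustration, and this is not how the
interpreter actually stores the flag. Mutations merely reset the cached state
to unknown, the scan happens lazily on demand, and the hash ignores the cached
state entirely.

```ruby
# Hypothetical model of a three-state cache (unknown / 7BIT / 8BIT).
# Not MRI's implementation; all names here are assumptions.
class CachedRangeString
  def initialize(str)
    @str = str.dup
    @range = :unknown
  end

  # A mutation only invalidates the cache; it never rescans.
  def chop!
    @str.chop!
    @range = :unknown
    self
  end

  def append!(other)
    @str << other
    @range = :unknown
    self
  end

  # Lazily compute the property the first time it is needed.
  def range
    @range = scan if @range == :unknown
    @range
  end

  # Expose the raw cached state without triggering a scan.
  def cached_range
    @range
  end

  # Suggestion 3: hashing and equality depend on content only,
  # never on the cached state.
  def hash
    @str.hash
  end

  def eql?(other)
    other.is_a?(CachedRangeString) && @str == other.str
  end
  alias == eql?

  protected

  attr_reader :str

  private

  def scan
    @str.each_byte.all? { |b| b < 0x80 } ? :seven_bit : :eight_bit
  end
end

a = CachedRangeString.new("a")
b = CachedRangeString.new("a\u00DF")   # "a" plus one non-ASCII character
b.range                                # forces a scan
b.chop!                                # cache drops back to :unknown
p({ a => 1 }.key?(b))
```

With this design a forgotten rescan cannot produce a wrong answer; only a
forgotten reset can, which is the remaining danger noted in suggestion 2.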

Regards,

Brian.


$ ruby19 -v
ruby 1.9.2dev (2009-04-07 trunk 23150) [i686-linux]

$ cat chop.rb
#encoding: utf-8
a = "a"
b = "aß"        # "a" plus one non-ASCII character, removed by chop! below
b.chop!
h = {a => 1}
p h.key?(a)     #true
p h.key?(b)     #false !!

p a             #"a"
p b             #"a"
p a.encoding
p b.encoding

p a == b        #true
p a.hash
p b.hash

$ ruby19 chop.rb
true
false
"a"
"a"
#<Encoding:UTF-8>
#<Encoding:UTF-8>
true
1071603097
1071603096
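
The same check can be extended to other destructive String methods. The
following is a rough sketch, not an exhaustive test: the helper name
hash_mismatches and the list of methods are my own choices for illustration.
Each entry starts from "a" plus one non-ASCII character, removes that
character, and reports the method if the result no longer behaves like a
plain "a" as a hash key.

```ruby
# Hypothetical helper; the method list is illustrative, not exhaustive.
MUTATORS = {
  "chop!"   => ->(s) { s.chop! },
  "chomp!"  => ->(s) { s.chomp!("\u00DF") },
  "slice!"  => ->(s) { s.slice!(-1) },
  "sub!"    => ->(s) { s.sub!("\u00DF", "") },
  "delete!" => ->(s) { s.delete!("\u00DF") },
}.freeze

# Returns the names of the mutators after which the string is no longer
# interchangeable with "a" (equality, hash value, or Hash lookup differs).
def hash_mismatches(mutators = MUTATORS)
  reference = "a"
  mutators.each_with_object([]) do |(name, mutate), bad|
    s = +"a\u00DF"            # fresh, unfrozen copy for each mutator
    mutate.call(s)
    consistent = s == reference &&
                 s.hash == reference.hash &&
                 { reference => 1 }.key?(s)
    bad << name unless consistent
  end
end

p hash_mismatches
```

Any name printed by this script marks a method that leaves the string in an
inconsistent state on the interpreter being tested.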
