[#19075] Request For Removal: No Operator Concatenation — James Gray <james@...>

I'm disappointed that Ruby still supports this goofy syntax:

30 messages 2008/10/01
[#19076] Re: Request For Removal: No Operator Concatenation — "Gregory Brown" <gregory.t.brown@...> 2008/10/01

On Wed, Oct 1, 2008 at 1:58 PM, James Gray <james@grayproductions.net> wrote:

[#19078] Re: Request For Removal: No Operator Concatenation — "Jim Freeze" <jimfreeze@...> 2008/10/01

On Wed, Oct 1, 2008 at 1:08 PM, Gregory Brown <gregory.t.brown@gmail.com> wrote:

[#19080] Re: Request For Removal: No Operator Concatenation — James Gray <james@...> 2008/10/01

On Oct 1, 2008, at 1:15 PM, Jim Freeze wrote:

[#19081] Re: Request For Removal: No Operator Concatenation — "Jim Freeze" <jimfreeze@...> 2008/10/01

On Wed, Oct 1, 2008 at 1:29 PM, James Gray <james@grayproductions.net> wrote:

[#19082] Re: Request For Removal: No Operator Concatenation — James Gray <james@...> 2008/10/01

On Oct 1, 2008, at 1:37 PM, Jim Freeze wrote:

[#19083] Re: Request For Removal: No Operator Concatenation — Eric Hodel <drbrain@...7.net> 2008/10/01

On Oct 1, 2008, at 11:42 AM, James Gray wrote:

[#19084] Re: Request For Removal: No Operator Concatenation — "Gregory Brown" <gregory.t.brown@...> 2008/10/01

On Wed, Oct 1, 2008 at 2:45 PM, Eric Hodel <drbrain@segment7.net> wrote:

[#19087] Re: Request For Removal: No Operator Concatenation — "Jim Freeze" <jimfreeze@...> 2008/10/01

On Wed, Oct 1, 2008 at 2:10 PM, Gregory Brown <gregory.t.brown@gmail.com> wrote:

[#19132] [Feature #615] "with" operator — Lavir the Whiolet <redmine@...>

Feature #615: "with" operator

33 messages 2008/10/05
[#19137] Re: [Feature #615] "with" operator — Nobuyoshi Nakada <nobu@...> 2008/10/06

Hi,

[#19138] Re: [Feature #615] "with" operator — Paul Brannan <pbrannan@...> 2008/10/06

On Mon, Oct 06, 2008 at 10:46:49AM +0900, Nobuyoshi Nakada wrote:

[#19141] Re: [Feature #615] "with" operator — _why <why@...> 2008/10/06

On Mon, Oct 06, 2008 at 10:56:23PM +0900, Paul Brannan wrote:

[#19148] Re: [Feature #615] "with" operator — Trans <transfire@...> 2008/10/06

[#19149] Re: [Feature #615] "with" operator — "Austin Ziegler" <halostatue@...> 2008/10/06

On Mon, Oct 6, 2008 at 3:34 PM, Trans <transfire@gmail.com> wrote:

[#19150] Re: [Feature #615] "with" operator — "David A. Black" <dblack@...> 2008/10/06

Hi --

[#19154] Re: [Feature #615] "with" operator — _why <why@...> 2008/10/07

On Tue, Oct 07, 2008 at 05:47:23AM +0900, David A. Black wrote:

[#19250] default_internal encoding — Dave Thomas <dave@...>

I'm documenting default_internal for the PickAxe, and have a couple of

26 messages 2008/10/09
[#19254] Re: default_internal encoding — "Michael Selig" <michael.selig@...> 2008/10/09

Hi,

[#19255] Re: performance of C function calls in 1.8 vs 1.9 — "Michael Selig" <michael.selig@...> 2008/10/10

On Wed, Oct 8, 2008 at 3:52 PM, Paul Brannan <pbrannan / atdesk.com> wrote:

[#19289] [Bug #633] dl segfaults on x86_64-linux systems — Benjamin Floering <redmine@...>

Bug #633: dl segfaults on x86_64-linux systems

19 messages 2008/10/10

[#19315] [Feature #643] __DIR__ — Thomas Sawyer <redmine@...>

Feature #643: __DIR__

14 messages 2008/10/13

[#19342] [Bug #649] Memory leak in a array assignment? — Henri Suur-Inkeroinen <redmine@...>

Bug #649: Memory leak in a array assignment?

14 messages 2008/10/15

[#19350] Net::HTTP.post_form bug : can't post form to correct uri which contains QueryString(QueryString part are lost) and revise — Klesh <kleshwong@...>

Hi,

10 messages 2008/10/16
[#19352] Re: Net::HTTP.post_form bug : can't post form to correct uri which contains QueryString(QueryString part are lost) and revise — "Matt Todd" <chiology@...> 2008/10/16

You are trying to use GET-style query params instead of POSTing the

[#19378] Constant names in 1.9 — Dave Thomas <dave@...>

When Ruby makes the tIDENTIFIER/tCONSTANT test, it looks to see if the =20=

13 messages 2008/10/18

[#19397] [Feature #666] Enumerable::to_hash — Marc-Andre Lafortune <redmine@...>

Feature #666: Enumerable::to_hash

14 messages 2008/10/20
[#23249] [Feature #666](Rejected) Enumerable::to_hash — Yukihiro Matsumoto <redmine@...> 2009/04/18

Issue #666 has been updated by Yukihiro Matsumoto.

[#19422] Now that lambda has more powerful arguments... — Dave Thomas <dave@...>

is there anything that

24 messages 2008/10/21
[#19423] Re: Now that lambda has more powerful arguments... — Wolfgang N疆asi-Donner <ed.odanow@...> 2008/10/21

Dave Thomas schrieb:

[#19424] Re: Now that lambda has more powerful arguments... — Dave Thomas <dave@...> 2008/10/21

[#19427] Re: Now that lambda has more powerful arguments... — Paul Brannan <pbrannan@...> 2008/10/21

On Wed, Oct 22, 2008 at 04:01:45AM +0900, Dave Thomas wrote:

[#19429] Re: Now that lambda has more powerful arguments... — "David A. Black" <dblack@...> 2008/10/21

Hi --

[#19430] Re: Now that lambda has more powerful arguments... — Paul Brannan <pbrannan@...> 2008/10/21

On Wed, Oct 22, 2008 at 04:38:19AM +0900, David A. Black wrote:

[#19431] Re: Now that lambda has more powerful arguments... — "David A. Black" <dblack@...> 2008/10/21

Hi --

[#19432] Re: Now that lambda has more powerful arguments... — Jim Weirich <jim.weirich@...> 2008/10/21

On Oct 21, 2008, at 4:24 PM, David A. Black wrote:

[#19465] [Bug #680] csv.rb: CSV.parse is too late when encoding is mismatch — Takeyuki Fujioka <redmine@...>

Bug #680: csv.rb: CSV.parse is too late when encoding is mismatch

41 messages 2008/10/24
[#19466] Default source encoding (Was: [Bug #680] csv.rb: CSV.parse is too late when encoding is mismatch) — "Michael Selig" <michael.selig@...> 2008/10/24

Hi,

[#19471] Re: Default source encoding (Was: [Bug #680] csv.rb: CSV.parse is toolate when encoding is mismatch) — Martin Duerst <duerst@...> 2008/10/24

A default for the source encoding has been discussed quite a long

[#19473] Re: Default source encoding (Was: [Bug #680] csv.rb: CSV.parse is toolate when encoding is mismatch) — "Michael Selig" <michael.selig@...> 2008/10/24

Hi,

[#19474] Re: Default source encoding (Was: [Bug #680] csv.rb: CSV.parse is toolate when encoding is mismatch) — Yukihiro Matsumoto <matz@...> 2008/10/24

Hi,

[#19515] String literal encoding (Was: Default source encoding (Was: [Bug #680] csv.rb: CSV.parse is toolate when encoding is mismatch)) — "Michael Selig" <michael.selig@...> 2008/10/26

Hi,

[#19517] Re: String literal encoding (Was: Default source encoding (Was: [Bug #680] csv.rb: CSV.parse is toolate when encoding is mismatch)) — Nobuyoshi Nakada <nobu@...> 2008/10/26

Hi,

[#19518] Re: String literal encoding (Was: Default source encoding (Was: [Bug #680] csv.rb: CSV.parse is toolate when encoding is mismatch)) — "Michael Selig" <michael.selig@...> 2008/10/26

On Sun, 26 Oct 2008 17:26:32 +1100, Nobuyoshi Nakada <nobu@ruby-lang.org>

[#19522] Re: String literal encoding (Was: Default source encoding (Was: [Bug #680] csv.rb: CSV.parse is toolate when encoding is mismatch)) — Nobuyoshi Nakada <nobu@...> 2008/10/26

Hi,

[#19525] Re: String literal encoding (Was: Default source encoding (Was: [Bug #680] csv.rb: CSV.parse is toolate when encoding is mismatch)) — "Michael Selig" <michael.selig@...> 2008/10/26

On Sun, 26 Oct 2008 23:34:26 +1100, Nobuyoshi Nakada <nobu@ruby-lang.org>

[#19531] Re: String literal encoding (Was: Default source encoding (Was: [Bug #680] csv.rb: CSV.parse is toolate when encoding is mismatch)) — Nobuyoshi Nakada <nobu@...> 2008/10/27

Hi,

[#19532] Re: String literal encoding (Was: Default source encoding (Was: [Bug #680] csv.rb: CSV.parse is toolate when encoding is mismatch)) — "Michael Selig" <michael.selig@...> 2008/10/27

On Mon, 27 Oct 2008 16:07:54 +1100, Nobuyoshi Nakada <nobu@ruby-lang.org>

[#19533] Re: String literal encoding (Was: Default source encoding (Was: [Bug #680] csv.rb: CSV.parse is toolate when encoding is mismatch)) — Nobuyoshi Nakada <nobu@...> 2008/10/27

Hi,

[#19535] Re: String literal encoding (Was: Default source encoding (Was: [Bug #680] csv.rb: CSV.parse is toolate when encoding is mismatch)) — "Michael Selig" <michael.selig@...> 2008/10/27

On Mon, 27 Oct 2008 17:27:57 +1100, Nobuyoshi Nakada <nobu@ruby-lang.org>

[#19538] Re: String literal encoding (Was: Default source encoding (Was: [Bug #680] csv.rb: CSV.parse is toolate when encoding is mismatch)) — Nobuyoshi Nakada <nobu@...> 2008/10/27

Hi,

[#19540] Re: String literal encoding (Was: Default source encoding (Was: [Bug #680] csv.rb: CSV.parse is toolate when encoding is mismatch)) — "Michael Selig" <michael.selig@...> 2008/10/27

On Mon, 27 Oct 2008 20:55:32 +1100, Nobuyoshi Nakada <nobu@ruby-lang.org>

[#19546] Re: String literal encoding (Was: Default source encoding (Was: [Bug #680] csv.rb: CSV.parse is toolate when encoding is mismatch)) — Nobuyoshi Nakada <nobu@...> 2008/10/27

Hi,

[#19480] Re: Default source encoding (Was: [Bug #680] csv.rb: CSV.parse is toolate when encoding is mismatch) — James Gray <james@...> 2008/10/24

On Oct 24, 2008, at 1:52 AM, Martin Duerst wrote:

[#19566] GC thought — "Roger Pack" <roger.pack@...>

Here is a recent patch I've been experimenting with--for any advice. [1]

26 messages 2008/10/28
[#19569] Re: GC thought — Ken Bloom <kbloom@...> 2008/10/28

On Tue, 28 Oct 2008 17:02:17 +0900, Roger Pack wrote:

[#19575] Re: GC thought — "Roger Pack" <roger.pack@...> 2008/10/28

> Letting the program continue execution during the mark phase could cause

[#19577] Re: GC thought — Paul Brannan <pbrannan@...> 2008/10/28

On Wed, Oct 29, 2008 at 01:04:52AM +0900, Roger Pack wrote:

[#19596] Re: GC thought — "Robert Klemme" <shortcutter@...> 2008/10/29

2008/10/28 Paul Brannan <pbrannan@atdesk.com>:

[#19590] [Feature #695] More flexibility when combining ASCII-8BIT strings with other encodings — Michael Selig <redmine@...>

Feature #695: More flexibility when combining ASCII-8BIT strings with other encodings

13 messages 2008/10/29
[#19646] Re: [Feature #695] More flexibility when combining ASCII-8BIT strings with other encodings — "Michael Selig" <michael.selig@...> 2008/10/30

Hi,

[ruby-core:19650] Re: [Feature #695] More flexibility when combiningASCII-8BIT strings with other encodings

From: "Michael Selig" <michael.selig@...>
Date: 2008-10-31 04:57:20 UTC
List: ruby-core #19650
Hi

On Fri, 31 Oct 2008 13:51:53 +1100, Martin Duerst <duerst@it.aoyama.ac.jp>  
wrote:

>> Feature #695 was closed & marked done, but unfortunately it does not  
>> seem to have been implemented :-(
>
> I think it should have been marked part done, part rejected,
> I guess.

Some sort of explanation would also have been nice.
But at least we are now discussing it - I was expecting this to happen  
before implementation :-)

> I don't think it is by chance that most programming languages I
> know, even if they have a somewhat different internationalization
> model, more focused on Unicode than Ruby, make a clear distinction
> between characters and bytes. It also isn't by chance that one
> of the first things people have to learn when they learn about
> internationalization is "bytes are not characters".

Yes, I agree with you, and I have raised this "ambiguity" before - in Ruby  
ASCII-8BIT can either be a byte string or a character string of uncertain  
encoding.
The problem I am trying to address here is for simple scripts which don't  
care about internationalisation.

>> My feature request would mean that "pack" and "\x" string literals could
>> be left as ASCII-8BIT, and be "forced" to another encoding transparently
>> depending on how the programmer uses it.
>
> I think this is totally the wrong way. The problems are with
> pack and \x in string literals, and it would be a bad idea to
> try and solve them by introducing a general "bytes become characters"
> feature.

"default_internal" has gone a long way to help solve M17N issues, but  
there still remains "encoding compatibility" issues even in simple, single  
encoding scripts, ie: between the locale's encoding and ASCII-8BIT. The  
motivation behind this feature request was to address this latter point.

I agree with you that there is a problem with "\x" in string literals.

However I am not sure I agree that the problem is in pack. The root of the  
problem is this ambiguity with ASCII-8BIT between bytes and characters -  
the way I think it should work is really like a "wild card" encoding.
Pack is one simple example of a bunch of methods that return strings, but  
cannot easily determine what encoding to return them in.
Other examples are decryption and uncompression methods where often the  
original encoding is not known. In many cases there is no alternative  
other than to return them as ASCII-8BIT and let the application worry  
about interpreting the contents.

This is *forcing* the programmer to use "force_encoding()" where in 1.8 it  
was not necessary, and in 1.9 it can seem rather annoying.
There is even a weird exception to this - if the ASCII-8BIT string happens  
to be all 7-bit chars, then it CAN be combined with other ASCII-compatible  
encodings.
This probably allows some 1.8 legacy scripts to work, but only ones  
working in ASCII.
I do not think this sort of thing - one that works in some cases, but not  
in others - is desirable at all.

So in fact Ruby already has what you describe as a "bytes become  
characters feature", but it only works in certain circumstances!


>> You can liken this feature to the transparent conversion of an integer  
>> to
>> a float when doing arithmetic.
>
> Well, it's not very similar. The conversion of an interger to a float
> is very predictable, but the 'conversion' of ASCII-8BIT to some
> real encoding is just a wild guess.

A "wild guess" is overstating it. If a program attempts to combine an  
ASCII-8BIT string with another encoded string, AND it happens to be a  
valid encoding, I think that the chances are very high that the program is  
expecting the byte string to be in the other encoding. I think that a  
heuristic like this is reasonable as it keeps the language backward  
compatible & neat.

Furthermore as I said, this conversion already happens with ASCII-8BIT  
character strings consisting only of 7 bit chars, so extending it to all  
encodings seems an obvious thing to do. Look at:

a) 7-bit char strings work, irrespective of encoding:
ruby -e 'p ("abc".force_encoding("ASCII-8BIT") +  
"abc".force_encoding("UTF-8")).encoding'
=> #<Encoding:UTF-8>

but:
b) Legal 8-bit encoding string:
ruby -e 'p ("ab\xE0".force_encoding("ASCII-8BIT") +  
"ab\xE0".force_encoding("ISO-8859-8")).encoding'
=> -e:1:in `<main>': incompatible character encodings: ASCII-8BIT and  
ISO-8859-8 (Encoding::CompatibilityError)

c) Legal multibyte encoding string:
ruby -ve 'p ("ab\u0635".force_encoding("ASCII-8BIT") +  
"ab\u0635".force_encoding("UTF-8")).encoding'
=> -e:1:in `<main>': incompatible character encodings: ASCII-8BIT and  
UTF-8 (Encoding::CompatibilityError)

Certainly I don't see the downside in the conversion to a single-byte  
encoding (eg: example (b)) above. Even if it converted when it shouldn't  
have, the indexing and "codepoint values" are the same as if the result  
were ASCII-8BIT.

One other idea: maybe we should distinguish between 2 encodings "BINARY"  
and "ASCII-8BIT", which are currently aliases. Essentially they are the  
same, but "BINARY" would mean "bytestring" and will generate an error if  
you try to combine it with any other encoding, while "ASCII-8BIT" would  
mean "unknown encoding", which can be combined transparently with other  
encodings.
Maybe there is a better solution - any ideas?

Cheers
Mike

In This Thread