[ruby-core:66765] [ruby-trunk - Bug #10584] String.valid_encoding?, String.ascii_only? fails to account for BOM.
From:
duerst@...
Date:
2014-12-10 11:21:12 UTC
List:
ruby-core #66765
Issue #10584 has been updated by Martin Dürst.
This isn't as simple as you describe it. With respect to BOMs, there is a clear distinction between external data and internal data. A BOM is often very helpful in external data (e.g. a file). On the other hand, it's not only useless, but actually highly counterproductive for internal data (just think about concatenation).
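To make the concatenation point concrete, here is a small sketch. The two strings are made-up stand-ins for text read verbatim from two BOM-prefixed UTF-8 files; they are not taken from the report.

```ruby
# Hypothetical data: each chunk still carries its leading U+FEFF.
bom    = "\uFEFF"
part_a = bom + "foo"
part_b = bom + "bar"

joined = part_a + part_b

# The second U+FEFF now sits in the middle of the string, where it no
# longer marks byte order and is just an invisible extra character.
p joined.length           # 8, not 6
p joined.index(bom, 1)    # 4
p joined == "foobar"      # false
```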
The problem currently is that Ruby doesn't absorb that difference; it leaves it to the programmer. The reason for this is that it's difficult to define a clear external/internal boundary (the file example is the easy one). Also, some cases require a BOM (e.g. UTF-16 in XML) whereas others forbid it and others allow it and so on. It might be possible to deal with some of this as options on methods reading from files, but that would require careful analysis.
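For the easy file case, a programmer can already opt in to BOM handling at the boundary: the encoding part of a file mode string accepts a `BOM|` prefix. The sketch below uses a temporary file with made-up contents and only covers UTF-8; it does not address the harder cases mentioned above.

```ruby
require "tempfile"

# Hypothetical sample file: a UTF-8 BOM (EF BB BF) followed by a shebang line.
file = Tempfile.new("bom-demo")
file.binmode
file.write("\xEF\xBB\xBF#!/bin/sh\n".b)
file.close

# Plain UTF-8 read: the BOM stays in the string as U+FEFF.
raw = File.open(file.path, "r:UTF-8") { |f| f.read }

# "BOM|UTF-8": a leading BOM is consumed when the file is opened.
stripped = File.open(file.path, "r:BOM|UTF-8") { |f| f.read }

p raw[0] == "\uFEFF"   # true
p stripped[0]          # "#"

file.unlink
```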
Because U+FFFE isn't a valid codepoint in Unicode, your first two examples could be made true, and might indeed catch some errors. For your third example, a string with a BOM is definitely not ASCII, so ascii_only? should definitely return false. This is not only the definition of ASCII, but also tightly linked to Ruby's internals (including optimizations).
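The behaviour discussed here can be reproduced without the attached files. This is a sketch on made-up sample bytes ("hi" in UTF-16LE with a BOM); the explicit U+FFFE check at the end is only one possible workaround, not something Ruby does for you.

```ruby
# Made-up sample: BOM + "hi" in UTF-16LE, i.e. FF FE 68 00 69 00.
le_bytes = "\xFF\xFEh\x00i\x00".b

as_le = le_bytes.dup.force_encoding("UTF-16LE")
as_be = le_bytes.dup.force_encoding("UTF-16BE")

p as_le.valid_encoding?   # true
p as_be.valid_encoding?   # true: read big-endian, the BOM bytes become U+FFFE

# Possible workaround: a leading U+FFFE suggests the byte order is wrong.
p as_be.codepoints.first == 0xFFFE   # true
p as_le.codepoints.first == 0xFEFF   # true

# A UTF-8 string that starts with a BOM contains a non-ASCII character.
p "\uFEFF#!/bin/sh".ascii_only?      # false
```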
For your fourth example, once internal, it's unclear whether the BOM is actually a BOM or a zero-width non-breaking space. The latter can appear at the start of a piece of text easily. Although explicitly deprecated, it's still effective; I just used it recently in a Web page.
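A sketch of why this is ambiguous once the text is internal. `strip_leading_bom` is a hypothetical helper, not part of Ruby, and both sample strings are invented for illustration.

```ruby
# Hypothetical helper: drop one leading U+FEFF, on the assumption that the
# caller knows the text came from a BOM-prefixed external source.
def strip_leading_bom(text)
  text.sub(/\A\uFEFF/, "")
end

from_file = "\uFEFF#!/bin/sh"   # here U+FEFF was a BOM
typed     = "\uFEFFword"        # here U+FEFF was meant as a zero-width no-break space

# Both start with the same code point; nothing in the string tells them apart.
p from_file[0].ord == typed[0].ord   # true

p strip_leading_bom(from_file)   # "#!/bin/sh"
p strip_leading_bom(typed)       # "word" (the intentional character is lost too)
```

Whether stripping is right therefore depends on where the string came from, which is why the decision is easier to make at the point where external data is read.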
----------------------------------------
Bug #10584: String.valid_encoding?, String.ascii_only? fails to account for BOM.
https://bugs.ruby-lang.org/issues/10584#change-50349
* Author: Geoff Nixon
* Status: Open
* Priority: Normal
* Assignee:
* Category: core
* Target version: current: 2.2.0
* ruby -v: ruby 2.2.0preview2 (2014-11-28 trunk 48628) [x86_64-darwin14]
* Backport: 2.0.0: UNKNOWN, 2.1: UNKNOWN
----------------------------------------
IMO:
- A Unicode (UTF-16, UTF-32) string with a valid BOM should not be considered a valid encoding if endianness is changed.
- A UTF-8 string with BOM should not consider the BOM as a codepoint.
~~~sh
> file utf-16be-file
utf-16be-file: POSIX shell script, Big-endian UTF-16 Unicode text executable
> file utf-16le-file
utf-16le-file: POSIX shell script, Little-endian UTF-16 Unicode text executable
> file utf-8-with-bom-file
utf-8-with-bom-file: POSIX shell script, UTF-8 Unicode (with BOM) text executable
~~~
~~~sh
> ruby -e "p File.binread('utf-16le-file').force_encoding('UTF-16BE').valid_encoding?"
true # false
> ruby -e "p File.binread('utf-16be-file').force_encoding('UTF-16LE').valid_encoding?"
true # false
> ruby -e "p File.read('utf-8-with-bom-file').ascii_only?"
false # true
> ruby -e "p File.read('utf-8-with-bom-file')[0]"
"" # '#'
~~~
No?
---Files--------------------------------
utf-8-with-bom-file (14 Bytes)
utf-16le-file (2.46 KB)
utf-16be-file (2.45 KB)
--
https://bugs.ruby-lang.org/