[ruby-core:48962] [ruby-trunk - Bug #7282][Assigned] Invalid UTF-8 from emoji allowed through silently

From: "usa (Usaku NAKAMURA)" <usa@...>
Date: 2012-11-06 05:58:09 UTC
List: ruby-core #48962
Issue #7282 has been updated by usa (Usaku NAKAMURA).

Category set to M17N
Status changed from Open to Assigned
Assignee set to naruse (Yui NARUSE)
Target version set to 2.0.0


----------------------------------------
Bug #7282: Invalid UTF-8 from emoji allowed through silently
https://bugs.ruby-lang.org/issues/7282#change-32469

Author: headius (Charles Nutter)
Status: Assigned
Priority: Normal
Assignee: naruse (Yui NARUSE)
Category: M17N
Target version: 2.0.0
ruby -v: 2.0.0


On my system, where the default encoding is UTF-8, the following should not parse:

ruby-2.0.0 -e 'p "Hello, \x96 world!"'

But it does. And it is apparently marked as "ok" as far as code range goes, because encoding to UTF-8 does not catch the problem:

system ~/projects/jruby $ ruby-1.9.3 -e 'p "{\"sample\": \"Hello, \x96 world!\"}".encode("UTF-8")'
"{\"sample\": \"Hello, \x96 world!\"}"

system ~/projects/jruby $ ruby-2.0.0 -e 'p "{\"sample\": \"Hello, \x96 world!\"}".encode("UTF-8")'
"{\"sample\": \"Hello, \x96 world!\"}"

Nor does character-walking:

system ~/projects/jruby $ ruby-1.9.3 -e '"Hello, \x96 world!".each_char {|x| print x}'
Hello, ? world!
system ~/projects/jruby $ ruby-2.0.0 -e '"Hello, \x96 world!".each_char {|x| print x}'
Hello, ? world!

Nor does []:

system ~/projects/jruby $ ruby-1.9.3 -e 'p "Hello, \x96 world!"[7]'
"\x96"

system ~/projects/jruby $ ruby-1.9.3 -e 'p "Hello, \x96 world!"[8]'
" "

system ~/projects/jruby $ ruby-2.0.0 -e 'p "Hello, \x96 world!"[7]'
"\x96"

system ~/projects/jruby $ ruby-2.0.0 -e 'p "Hello, \x96 world!"[8]'
" "

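For comparison, Ruby can be asked directly whether the bytes are well-formed. The report does not use `String#valid_encoding?`; the following is only a minimal sketch, assuming a UTF-8 string, and the variable names are illustrative:

```ruby
# Build the same string as in the report. force_encoding makes the UTF-8
# tag explicit, so the result does not depend on the source encoding.
sample = "Hello, \x96 world!".dup.force_encoding(Encoding::UTF_8)

# 0x96 is a continuation byte with no lead byte, so the check reports
# that the string is not valid UTF-8.
well_formed = sample.valid_encoding?
puts "valid UTF-8? #{well_formed}"

# Transcoding UTF-8 to UTF-8 leaves the bytes untouched, which is why
# encode("UTF-8") above did not catch anything.
same_bytes = sample.encode("UTF-8").bytes == sample.bytes
puts "bytes unchanged by encode(\"UTF-8\")? #{same_bytes}"
```
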
But the malformed String does get caught by transcoding to UTF-16:

system ~/projects/jruby $ ruby-1.9.3 -e 'p "{\"sample\": \"Hello, \x96 world!\"}".encode("UTF-16")'
-e:1:in `encode': "\x96" on UTF-8 (Encoding::InvalidByteSequenceError)
	from -e:1:in `<main>'

system ~/projects/jruby $ ruby-2.0.0 -e 'p "{\"sample\": \"Hello, \x96 world!\"}".encode("UTF-16")'
-e:1:in `encode': "\x96" on UTF-8 (Encoding::InvalidByteSequenceError)
	from -e:1:in `<main>'

Or by doing a simple regexp match:

system ~/projects/jruby $ ruby-1.9.3 -e '"Hello, \x96 world!".match /.+/'
-e:1:in `match': invalid byte sequence in UTF-8 (ArgumentError)
	from -e:1:in `match'
	from -e:1:in `<main>'

system ~/projects/jruby $ ruby-2.0.0 -e '"Hello, \x96 world!".match /.+/'
-e:1:in `match': invalid byte sequence in UTF-8 (ArgumentError)
	from -e:1:in `match'
	from -e:1:in `<main>'
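
The checks above can be gathered into one script, so the inconsistency is visible in a single run. This is a sketch only: `error_class` is a hypothetical helper written for this illustration, and the results depend on the Ruby version it runs on.

```ruby
sample = "Hello, \x96 world!".dup.force_encoding(Encoding::UTF_8)

# Hypothetical helper: run the block and return the class of the
# exception it raises, or nil when it completes normally.
def error_class
  yield
  nil
rescue StandardError => e
  e.class
end

results = {
  encode_utf8:  error_class { sample.encode("UTF-8") },
  each_char:    error_class { sample.each_char { |c| c } },
  index:        error_class { sample[7] },
  encode_utf16: error_class { sample.encode("UTF-16") },
  regexp_match: error_class { sample.match(/.+/) },
}

# Print one line per operation, naming the exception if there was one.
results.each do |name, err|
  puts "#{name}: #{err || 'no error'}"
end
```
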

And of course I am ignoring the fact that it should never have parsed to begin with.

This kind of inconsistency in rejecting malformed UTF-8 does not inspire a lot of confidence.

JRuby allows it through the parser (this is a bug) but does fail in other places because the string is malformed.


-- 
http://bugs.ruby-lang.org/