[ruby-core:70381] [Ruby trunk - Bug #10705] JSON::ParserError#message is wrong encoding (ASCII-8BIT)

From: nagachika00@...
Date: 2015-08-14 06:21:30 UTC
List: ruby-core #70381
Issue #10705 has been updated by Tomoyuki Chikanaga.

Backport changed from 2.0.0: WONTFIX, 2.1: WONTFIX, 2.2: REQUIRED to 2.0.0: WONTFIX, 2.1: WONTFIX, 2.2: DONE

r50339, r50340, r50342 and r50343 were backported into `ruby_2_2` branch at r51571.

----------------------------------------
Bug #10705: JSON::ParserError#message is wrong encoding (ASCII-8BIT)
https://bugs.ruby-lang.org/issues/10705#change-53784

* Author: Josh Cheek
* Status: Closed
* Priority: Normal
* Assignee:
* ruby -v: ruby 2.3.0dev (2015-01-06 trunk 49159) [x86_64-darwin13]
* Backport: 2.0.0: WONTFIX, 2.1: WONTFIX, 2.2: DONE
----------------------------------------
JSON::ParserError#message is wrong encoding (ASCII-8BIT). I would expect the error to be whatever the internal encoding is (in my case, utf8), perhaps inspecting the string in the error message such that all characters would be valid in that encoding.

Here is an example of where it becomes an issue:

~~~ruby
# encoding: utf-8
require 'json'  # => true

json = JSON.dump("√")                                          # => "\"√\""
begin
  result = JSON.parse(json)
  puts "PARSED: #{result.inspect}"
rescue JSON::ParserError => e
  `ruby -v`                                                    # => "ruby 2.3.0dev (2015-01-06 trunk 49159) [x86_64-darwin13]\n"
  json.encoding                                                # => #<Encoding:UTF-8>
  e.message.encoding                                           # => #<Encoding:ASCII-8BIT>
  e.message                                                    # => "757: unexpected token at '\"\xE2\x88\x9A\"'"
  puts "Could not parse #{json.inspect} because #{e.message}"  # ~> Encoding::CompatibilityError: incompatible character encodings: UTF-8 and ASCII-8BIT
end

# ~> Encoding::CompatibilityError
# ~> incompatible character encodings: UTF-8 and ASCII-8BIT
# ~>
# ~> f9.rb:13:in `rescue in <main>'
# ~> f9.rb:5:in `<main>'
~~~

If the parsed string doesn't have a multibyte unicode character, it still happens, but fixes itself when it comes in contact with another string, since all its bytes are within the ASCII range.
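
To illustrate the difference, here is a minimal sketch that does not involve JSON at all. The variable names are made up for the example, and the binary strings are built by hand with `String#b` to stand in for the error message; it only assumes the usual rule that a string containing nothing but 7-bit bytes is compatible with any ASCII-compatible encoding.

~~~ruby
# encoding: utf-8

# A UTF-8 string containing a multibyte character, like the interpolated
# json.inspect in the report above.
utf8 = "Could not parse \"√\" because "

# Hypothetical stand-ins for e.message: same bytes, but tagged ASCII-8BIT.
ascii_binary = "unexpected token".b   # only 7-bit bytes
multi_binary = "\xE2\x88\x9A".b       # the three bytes of "√"

# 7-bit-only binary string: concatenation succeeds and the result is UTF-8.
ascii_result = (utf8 + ascii_binary).encoding

# Binary string with bytes above 0x7F: concatenation raises.
raised = false
begin
  utf8 + multi_binary
rescue Encoding::CompatibilityError
  raised = true
end

# Possible workaround on the caller's side, assuming the bytes really are
# UTF-8: relabel a copy of the message before interpolating it.
repaired = utf8 + multi_binary.dup.force_encoding(Encoding::UTF_8)
~~~

The `force_encoding` call only relabels the bytes, so it is safe here only because the message bytes came from a UTF-8 source string in the first place.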

Documented the actual use case and debugging [here](https://github.com/JoshCheek/seeing_is_believing/issues/46#issuecomment-69007428).

(side thought: should I open another bug since it generates invalid JSON?)



-- 
https://bugs.ruby-lang.org/
