From: "nobu (Nobuyoshi Nakada)"
Date: 2012-06-10T07:21:58+09:00
Subject: [ruby-core:45539] [ruby-trunk - Bug #6566] JSON.dump can generate invalid UTF-8 sequence

Issue #6566 has been updated by nobu (Nobuyoshi Nakada).

File bug-6566.diff added

To put it a bit more simply: it seems wrong that JSON.generate(["\xea"]).valid_encoding? returns false.
I think this is a bug in the JSON generator, but what should happen in this case?
It seems convert_UTF8_to_JSON_ASCII() wants to reject invalid sequences.

----------------------------------------
Bug #6566: JSON.dump can generate invalid UTF-8 sequence
https://bugs.ruby-lang.org/issues/6566#change-27134

Author: shyouhei (Shyouhei Urabe)
Status: Assigned
Priority: Normal
Assignee: naruse (Yui NARUSE)
Category: M17N
Target version: 2.0.0
ruby -v: ruby 2.0.0dev (2012-06-09) [x86_64-linux]

=begin
Look: in the following code, JSON.dump outputs a byte sequence that is invalid as UTF-8.

  # -*- encoding: utf-8 -*-
  require 'json'
  IO.popen('hexdump -C', 'w') do |fp|
    JSON.dump(["\xea"], fp)
  end

RFC 4627 says that JSON text "SHALL" be encoded in Unicode. So this is an RFC violation.
=end

-- 
http://bugs.ruby-lang.org/
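
For readers who want to guard against this on the caller's side, the following is a minimal sketch of one possible workaround. It is not the fix proposed in bug-6566.diff. The helper name utf8_clean is hypothetical and not part of the json library, and String#scrub is assumed to be available (Ruby 2.1 or later, which is newer than the 2.0.0dev build in the report). The sketch replaces invalid bytes with U+FFFD before the string reaches the generator, so the result does not depend on how a given json version treats invalid input.

```ruby
require 'json'

# Hypothetical helper (not part of the json library): return a copy of +str+
# tagged as UTF-8 in which every byte that is invalid in UTF-8 has been
# replaced by U+FFFD REPLACEMENT CHARACTER.
def utf8_clean(str)
  s = str.dup.force_encoding(Encoding::UTF_8)
  s.valid_encoding? ? s : s.scrub("\uFFFD")
end

# The lone 0xEA byte from the report: a lead byte with no continuation bytes.
raw = "\xEA".dup.force_encoding(Encoding::UTF_8)
puts raw.valid_encoding?    # false: incomplete UTF-8 sequence

clean = utf8_clean(raw)
json  = JSON.generate([clean])
puts json.valid_encoding?
```

Whether silently substituting U+FFFD is acceptable depends on the application; raising an error on invalid input, as the comment about convert_UTF8_to_JSON_ASCII() suggests, is the stricter alternative.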