[ruby-core:70734] [Ruby trunk - Bug #11522] URI::decode returns incorrectly encoding strings

From: nobu@...
Date: 2015-09-12 14:43:36 UTC
List: ruby-core #70734
Issue #11522 has been updated by Nobuyoshi Nakada.


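The percent-encoded string carries no hint about its original character encoding, so URI::decode has nothing to tell it that the result should be tagged as UTF-8.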
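
To illustrate the point, here is a minimal sketch (not taken from the report) showing that the same escape sequence is valid or invalid depending on which encoding the caller assumes. It uses `URI.decode_www_form_component`, which accepts the target encoding as a second argument; the `%E9` sample input is an assumption chosen for the example.

```ruby
require "uri"

# "%E9" is the single byte 0xE9. Nothing in the escaped text says
# which character set that byte belongs to.
escaped = "%E9"

# Read as ISO-8859-1, the byte is a complete character (e with acute).
latin1 = URI.decode_www_form_component(escaped, Encoding::ISO_8859_1)

# Read as UTF-8, the same byte is an incomplete sequence.
utf8 = URI.decode_www_form_component(escaped, Encoding::UTF_8)

puts latin1.valid_encoding?  # valid under ISO-8859-1
puts utf8.valid_encoding?    # not valid under UTF-8
```

The decoder cannot choose between these readings by itself; the caller has to supply the encoding.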

----------------------------------------
Bug #11522: URI::decode returns incorrectly encoding strings
https://bugs.ruby-lang.org/issues/11522#change-54113

* Author: Charlie Anderson
* Status: Open
* Priority: Normal
* Assignee: akira yamada
* ruby -v: ruby 2.2.2p95 (2015-04-13 revision 50295) [x86_64-linux]
* Backport: 2.0.0: UNKNOWN, 2.1: UNKNOWN, 2.2: UNKNOWN
----------------------------------------
When given Unicode characters to encode and then decode, the URI module returns a string whose bytes are not valid in the encoding it is tagged with.

~~~
irb(main):026:0* unicode = 'œ´å∑®´ß∂†≈©ƒç˙©√∆˙∫˚∆~¬'
=> "œ´å∑®´ß∂†≈©ƒç˙©√∆˙∫˚∆~¬"
irb(main):027:0> unicode.encoding
=> #<Encoding:UTF-8>
irb(main):028:0> unicode.valid_encoding?
=> true
irb(main):029:0> encoded = URI::encode(unicode)
=> "%C5%93%C2%B4%C3%A5%E2%88%91%C2%AE%C2%B4%C3%9F%E2%88%82%E2%80%A0%E2%89%88%C2%A9%C6%92%C3%A7%CB%99%C2%A9%E2%88%9A%E2%88%86%CB%99%E2%88%AB%CB%9A%E2%88%86~%C2%AC"
irb(main):030:0> encoded.encoding
=> #<Encoding:US-ASCII>
irb(main):031:0> encoded.valid_encoding?
=> true
irb(main):032:0> decoded = URI::decode(encoded)
=> "\xC5\x93\xC2\xB4\xC3\xA5\xE2\x88\x91\xC2\xAE\xC2\xB4\xC3\x9F\xE2\x88\x82\xE2\x80\xA0\xE2\x89\x88\xC2\xA9\xC6\x92\xC3\xA7\xCB\x99\xC2\xA9\xE2\x88\x9A\xE2\x88\x86\xCB\x99\xE2\x88\xAB\xCB\x9A\xE2\x88\x86~\xC2\xAC"
irb(main):033:0> decoded.encoding
=> #<Encoding:US-ASCII>
irb(main):034:0> decoded.valid_encoding?
=> false
~~~

I would expect `decoded` to have a valid encoding, probably UTF-8.
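
Until that is settled, a possible workaround is sketched below, assuming the caller already knows the original text was UTF-8. It shows two options: decoding with `URI.decode_www_form_component`, whose second argument defaults to UTF-8, or relabeling the decoded bytes with `String#force_encoding`. The input is just the first three escapes from the session above; note that `decode_www_form_component` also turns `+` into a space, which `URI::decode` does not.

```ruby
require "uri"

# First three escapes from the report; the full string behaves the same way.
encoded = "%C5%93%C2%B4%C3%A5"

# Option 1: a decoder that tags the result with the encoding we pass in.
decoded = URI.decode_www_form_component(encoded, Encoding::UTF_8)
puts decoded.encoding
puts decoded.valid_encoding?

# Option 2: relabel bytes that were decoded with a US-ASCII tag,
# as in the transcript. force_encoding changes only the tag, not the bytes.
raw = "\xC5\x93\xC2\xB4\xC3\xA5".dup.force_encoding(Encoding::US_ASCII)
fixed = raw.dup.force_encoding(Encoding::UTF_8)
puts raw.valid_encoding?
puts fixed.valid_encoding?
```

Either option is only safe when the source encoding is known in advance; relabeling bytes from some other character set as UTF-8 would again produce an invalid string.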



-- 
https://bugs.ruby-lang.org/
