[ruby-core:71009] Re: [Ruby trunk - Bug #11506] [Assigned] Changed behavior of URI.unescape between 2.1.5 and 2.2.3

From: Matthew Kerwin <matthew@...>
Date: 2015-10-07 08:29:42 UTC
List: ruby-core #71009

On 07/10/2015 5:46 PM, <nagachika00@gmail.com> wrote:
>
> It is related with r46491 for [Feature #2542] (support RFC 3986)?
>
> I think it is intended change.
> naruse san, how do you think?
>

I think if you follow RFC 3986 to the letter, the unescaped string should
still have percent-encoding in it, because the "US-ASCII" encoding that the
RFC refers to is 7-bit (and so can't contain multibyte UTF-8-encoded
characters). It is not the same as Ruby's "US-ASCII", which is 8-bit.

If you want to use RFC 3987 to convert the percent-encoded URI to a
UTF-8-encoded IRI, then the resulting string can contain Japanese
characters, but should have the encoding "UTF-8".
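
To illustrate the difference, here is a rough sketch. The percent_decode
helper is hypothetical (it is not URI.unescape or any other method from
Ruby's URI library); it only turns %XX escapes into raw bytes and then tags
the result with the encoding you pass in. The sample input is assumed to be
the percent-encoded UTF-8 bytes of U+3042 (hiragana "a").

```ruby
# Hypothetical helper for illustration only, not part of Ruby's URI library.
# Decodes %XX escapes into raw bytes, then labels them with `encoding`.
def percent_decode(str, encoding)
  # Work on a binary copy so the high bytes can be inserted safely.
  bytes = str.b.gsub(/%(\h\h)/) { [$1].pack('H2') }
  bytes.force_encoding(encoding)
end

escaped = "%E3%81%82" # percent-encoded UTF-8 bytes of U+3042

as_ascii = percent_decode(escaped, Encoding::US_ASCII)
as_utf8  = percent_decode(escaped, Encoding::UTF_8)

# Ruby lets a US-ASCII-tagged string hold bytes above 0x7F,
# but reports that the string is not valid in that encoding.
puts as_ascii.valid_encoding? # false

# Tagged as UTF-8, the same three bytes form one valid character.
puts as_utf8.valid_encoding?  # true
puts as_utf8 == "\u3042"      # true
```

The bytes are identical in both cases; only the encoding label differs,
which is why the result of the IRI conversion should be tagged "UTF-8".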
