[#108771] [Ruby master Bug#18816] Ractor segfaulting MacOS 12.4 (aarch64 / M1 processor) — "brodock (Gabriel Mazetto)" <noreply@...>

Issue #18816 has been reported by brodock (Gabriel Mazetto).

8 messages 2022/06/05

[#108802] [Ruby master Feature#18821] Expose Pattern Matching interfaces in core classes — "baweaver (Brandon Weaver)" <noreply@...>

Issue #18821 has been reported by baweaver (Brandon Weaver).

9 messages 2022/06/08

[#108822] [Ruby master Feature#18822] Ruby lack a proper method to percent-encode strings for URIs (RFC 3986) — "byroot (Jean Boussier)" <noreply@...>

Issue #18822 has been reported by byroot (Jean Boussier).

18 messages 2022/06/09

[#108937] [Ruby master Bug#18832] Suspicious superclass mismatch — "fxn (Xavier Noria)" <noreply@...>

Issue #18832 has been reported by fxn (Xavier Noria).

16 messages 2022/06/15

[#108976] [Ruby master Misc#18836] DevMeeting-2022-07-21 — "mame (Yusuke Endoh)" <noreply@...>

Issue #18836 has been reported by mame (Yusuke Endoh).

12 messages 2022/06/17

[#109043] [Ruby master Bug#18876] OpenSSL is not available with `--with-openssl-dir` — "Gloomy_meng (Gloomy Meng)" <noreply@...>

Issue #18876 has been reported by Gloomy_meng (Gloomy Meng).

18 messages 2022/06/23

[#109052] [Ruby master Bug#18878] parse.y: Foo::Bar {} is inconsistently rejected — "qnighy (Masaki Hara)" <noreply@...>

Issue #18878 has been reported by qnighy (Masaki Hara).

9 messages 2022/06/26

[#109055] [Ruby master Bug#18881] IO#read_nonblock raises IOError when called following buffered character IO — "javanthropus (Jeremy Bopp)" <noreply@...>

Issue #18881 has been reported by javanthropus (Jeremy Bopp).

9 messages 2022/06/26

[#109063] [Ruby master Bug#18882] File.read cuts off a text file with special characters when reading it on MS Windows — magynhard <noreply@...>

Issue #18882 has been reported by magynhard (Matth辰us Johannes Beyrle).

15 messages 2022/06/27

[#109081] [Ruby master Feature#18885] Long lived fork advisory API (potential Copy on Write optimizations) — "byroot (Jean Boussier)" <noreply@...>

Issue #18885 has been reported by byroot (Jean Boussier).

23 messages 2022/06/28

[#109083] [Ruby master Bug#18886] Struct aref and aset don't trigger any tracepoints. — "ioquatix (Samuel Williams)" <noreply@...>

Issue #18886 has been reported by ioquatix (Samuel Williams).

8 messages 2022/06/29

[#109095] [Ruby master Misc#18888] Migrate ruby-lang.org mail services to Google Domains and Google Workspace — "shugo (Shugo Maeda)" <noreply@...>

Issue #18888 has been reported by shugo (Shugo Maeda).

16 messages 2022/06/30

[ruby-core:108831] [Ruby master Feature#18822] Ruby lack a proper method to percent-encode strings for URIs (RFC 3986)

From: "byroot (Jean Boussier)" <noreply@...>
Date: 2022-06-09 10:10:06 UTC
List: ruby-core #108831
Issue #18822 has been updated by byroot (Jean Boussier).


Proposed implementation: https://github.com/ruby/cgi/pull/26

----------------------------------------
Feature #18822: Ruby lack a proper method to percent-encode strings for URIs (RFC 3986)
https://bugs.ruby-lang.org/issues/18822#change-97907

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------
### Context

There are two fairly similar encoding methods that are often confused. 

`application/x-www-form-urlencoded` which is how form data is encoded, and "percent-encoding" as defined by [RFC 3986](https://www.rfc-editor.org/rfc/rfc3986).

AFAIK, the only way they differ is that "form encoding" escape space characters as `+`, and RFC 3986 escape them as `%20`. Most of the time it doesn't matter, but sometimes it does.

### Ruby form and URL escape methods

  - `URI.escape(" ") # => "%20"` but it was deprecated and removed (in 3.0 ?).
  - `ERB::Util.url_encode(" ") # => "%20"` but it's implemented with a `gsub` and isn't very performant. It's also awkward to have to reach for `ERB`
  - `CGI.escape(" ") # => "+"`
  - `URI.encode_www_form_component(" ") # => "+"`

### Unescape methods

For unescaping, it's even more of a clear cut since `URI.unescape` was removed. So there's no available method that won't treat an unescaped `+` as simply `+`.

e.g. in Javascript: `decodeURIComponent("foo+bar") #=> "foo+bar"`.

If one were to use `CGI.unescape`, the string might be improperly decoded: `GI.unescape("foo+bar") #=> "foo bar"`. 

### Other languages

  - Javascript `encodeURI` and `encodeURIComponent` use `%20`.
  - PHP has `urlencode` using `+` and `rawurlencode` using `%20`.
  - Python has `urllib.parse.quote` using `%20` and `urllib.parse.quote_plus` using `+`.

### Proposal

Since `CGI` already have a very performant encoder for `application/x-www-form-urlencoded`, I think it would make sense that it would provide another method for RFC3986.

I propose:

   - `CGI.url_encode(" ") # => "%20"`
   - Or `CGI.encode_url`.
   - Alias `CGI.escape` as `GCI.encode_www_form_component`
   - Clarify the documentation of `CGI.escape`.





-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>

In This Thread