From: "chucke (Tiago Cardoso) via ruby-core"
Date: 2025-09-25T13:44:58+00:00
Subject: [ruby-core:123330] [Ruby Feature#21617] Add Internationalized Domain Name (IDN) support to URI

Issue #21617 has been updated by chucke (Tiago Cardoso).

Just adding my original public API suggestions, for visibility and further discussion by the core team.

I propose that `URI::Generic` supports punycode decoding out of the box by relying on the current behaviour of `URI::Generic#hostname`, which already applies transformations to the stored `host` when necessary, as in the case of IPv6 addresses below:

    # this example is inspired by how uri already handles IPv6 addresses
    uri = URI("https://[::1]")
    uri.host #=> "[::1]", cannot be used in Socket.new(host, port)
    uri.hostname #=> "::1", can be used in Socket.new(host, port)

Punycode translation would therefore happen transparently for IDNs when calling `hostname`:

    uri = "https://l♥️h.ws"
    uri = URI(uri)
    uri.host #=> "l♥️h.ws", cannot be used in Socket.new(host, port)
    uri.hostname #=> "xn--lh-t0xz926h.ws", can be used in Socket.new(host, port), which will perform DNS via getaddrinfo

This would require very little change in the `resolv` library before it issues the DNS query. The same would apply for most use cases, I believe.

The required punycode decoding logic could be implemented in a separate `URI::Punycode` module. This module could be exposed publicly, with a single public method, `decode(uri)`, which would return the punycode URI of a given IDN.

This API could be extended to support more advanced use cases beyond the main common use case (which `URI::Generic#hostname` should address), like [the ones documented here](https://github.com/skryukov/uri-idna?tab=readme-ov-file#options).
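To make the `hostname` idea a bit more concrete, here is a minimal sketch of the Punycode step from RFC 3492 in plain Ruby. The module name `PunycodeSketch` and its methods `encode_label` and `to_ascii` are hypothetical and are not an existing or proposed `uri` API. RFC 3492 calls the Unicode-to-ASCII direction "encoding", so the sketch uses that name. It deliberately skips everything else IDNA requires (character mapping, normalization, validity and length checks), so treat it as an illustration only.

```ruby
# Hypothetical sketch, not part of the uri gem: RFC 3492 Punycode encoding only.
module PunycodeSketch
  BASE = 36
  TMIN = 1
  TMAX = 26
  SKEW = 38
  DAMP = 700
  INITIAL_BIAS = 72
  INITIAL_N = 128
  DIGITS = "abcdefghijklmnopqrstuvwxyz0123456789"

  module_function

  # Bias adaptation, RFC 3492 section 6.1.
  def adapt(delta, num_points, first_time)
    delta = first_time ? delta / DAMP : delta / 2
    delta += delta / num_points
    k = 0
    while delta > ((BASE - TMIN) * TMAX) / 2
      delta /= (BASE - TMIN)
      k += BASE
    end
    k + ((BASE - TMIN + 1) * delta) / (delta + SKEW)
  end

  # Encode a single label (no dots), without the "xn--" prefix.
  def encode_label(label)
    cps = label.codepoints
    basic = cps.select { |c| c < INITIAL_N }
    out = basic.pack("U*")            # ASCII code points are copied first
    out << "-" unless basic.empty?    # delimiter between basic and encoded part
    n = INITIAL_N
    delta = 0
    bias = INITIAL_BIAS
    handled = basic.size
    while handled < cps.size
      m = cps.select { |c| c >= n }.min   # next code point to insert
      delta += (m - n) * (handled + 1)
      n = m
      cps.each do |c|
        delta += 1 if c < n
        next unless c == n
        # Emit delta as a variable-length integer.
        q = delta
        k = BASE
        loop do
          t = if k <= bias then TMIN
              elsif k >= bias + TMAX then TMAX
              else k - bias
              end
          break if q < t
          out << DIGITS[t + (q - t) % (BASE - t)]
          q = (q - t) / (BASE - t)
          k += BASE
        end
        out << DIGITS[q]
        bias = adapt(delta, handled + 1, handled == basic.size)
        delta = 0
        handled += 1
      end
      delta += 1
      n += 1
    end
    out
  end

  # Convert each non-ASCII label of a host name to its "xn--" form.
  # Assumption: the host is already lower-cased and otherwise valid.
  def to_ascii(host)
    host.split(".").map { |label|
      label.ascii_only? ? label : "xn--" + encode_label(label)
    }.join(".")
  end
end
```

With this sketch, `PunycodeSketch.to_ascii("l\u2665\uFE0Fh.ws")` yields the `xn--lh-t0xz926h.ws` value used in the `hostname` example, and ASCII-only hosts pass through unchanged.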
----------------------------------------
Feature #21617: Add Internationalized Domain Name (IDN) support to URI
https://bugs.ruby-lang.org/issues/21617#change-114698

* Author: byroot (Jean Boussier)
* Status: Open
----------------------------------------
Originally proposed by @chucke at https://github.com/ruby/uri/issues/76, trying to formalize it here.

### Context

[Internationalized Domain Names](https://en.wikipedia.org/wiki/Internationalized_domain_name) are getting more common, yet Ruby's `uri` default gem has no support for them:

```ruby
>> URI("https://日本語.jp/")
URI must be ascii only "https://\u65E5\u672C\u8A9E.jp/" (URI::InvalidURIError)
```

So any program that wishes to handle arbitrary valid URIs provided by users can't use the `uri` gem, and instead has to depend on third-party gems like [`addressable`](https://rubygems.org/gems/addressable):

```ruby
>> Addressable::URI.parse("https://日本語.jp/")
=> #
```

But even then, it won't seamlessly work with other libraries such as `net-http`:

```ruby
>> Net::HTTP.get(Addressable::URI.parse("https://日本語.jp/")).bytesize
OpenSSL::SSL::SSLSocket#connect_nonblock': SSL_connect returned=1 errno=0 peeraddr=[2001:218:3001:7::110]:443 state=error: ssl/tls alert handshake failure (SSL alert number 40) (OpenSSL::SSL::SSLError)
```

You have to explicitly normalize the URL:

```ruby
>> Addressable::URI.parse("https://日本語.jp/").normalize
=> #
>> Net::HTTP.get(Addressable::URI.parse("https://日本語.jp/").normalize).bytesize
=> 8703
```

### Feature Request

I believe it would be very useful if the default `uri` gem were capable of:

- Parsing IDN domain names.
- Converting URLs between their Unicode and ASCII forms.

The `URI::Generic` class already has a `#normalize` method to ensure the host and scheme parts are all lower case; it could be extended to encode IDN hosts into their ASCII equivalent.
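For reference, here is a small sketch of what the `uri` gem does today, as far as I can tell: `#normalize` lower-cases the scheme and host, a host already in its ASCII (`xn--`) form parses fine, and the Unicode form is rejected at parse time. The variable names are only illustrative.

```ruby
require "uri"

# Current behaviour: normalize lower-cases the scheme and the host.
normalized = URI("HTTPS://EXAMPLE.com/Path").normalize.to_s

# A host that is already in its ASCII (xn--) form parses without error.
ascii_host = URI("https://xn--wgv71a119e.jp/").host

# The Unicode form of the same host is rejected when parsing.
unicode_error =
  begin
    URI("https://\u65E5\u672C\u8A9E.jp/")
    nil
  rescue URI::InvalidURIError => e
    e.class
  end
```

Extending `#normalize` would mean the third case parses and then normalizes to the same URL as the second one.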
It would also be useful if the opposite operation were supported for display purposes. I'm not sure what name such a method could have, perhaps `canonicalize`?

### Implementation

In https://github.com/ruby/uri/issues/76 @skryukov pointed to his pure Ruby implementation of IDNA 2008 (https://github.com/skryukov/uri-idna). I believe it would be good to upstream parts of it into the `uri` gem to implement these features.

--
https://bugs.ruby-lang.org/