From: "jrochkind (jonathan rochkind)"
Date: 2012-05-17T08:41:44+09:00
Subject: [ruby-core:45100] [ruby-trunk - Feature #2567] Net::HTTP does not handle encoding correctly

Issue #2567 has been updated by jrochkind (jonathan rochkind).

It actually occurs to me that I misread the passage quoted by naruse. That passage is _discouraging_ heuristic guessing of the charset, despite the fact that the content-type is often wrong. The guessing is what's being discouraged, and it is the guessing that the passage says there should be an opt-out for.

----------------------------------------
Feature #2567: Net::HTTP does not handle encoding correctly
https://bugs.ruby-lang.org/issues/2567#change-26672

Author: slide_rule (Ryan Sims)
Status: Assigned
Priority: Low
Assignee: naruse (Yui NARUSE)
Category: lib
Target version: 2.0.0

=begin
 A string returned by an HTTP get does not have its encoding set appropriately with the charset field, nor does the content_type report the charset. Example code demonstrating incorrect behavior is below.

 #!/usr/bin/ruby -w
 # encoding: UTF-8

 require 'net/http'

 uri = URI.parse('http://www.hearya.com/feed/')
 result = Net::HTTP.start(uri.host, uri.port) {|http|
   http.get(uri.request_uri)
 }
 p result['content-type']   # "text/xml; charset=UTF-8" <- correct
 p result.content_type      # "text/xml" <- incorrect; truncates the charset field
 puts result.body.encoding  # ASCII-8BIT <- incorrect encoding, should be UTF-8
=end

--
http://bugs.ruby-lang.org/
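
For illustration only, here is a minimal sketch of a caller-side workaround for the behavior reported above. It is not the fix proposed in the issue and not part of Net::HTTP. The helper name `body_with_charset` is hypothetical. It assumes the caller passes the raw header value (as returned by `result['content-type']`) and the body string. It only relabels the bytes with the declared charset; it does not transcode and does no heuristic guessing.

```ruby
# Hypothetical helper (an assumption for illustration, not a Net::HTTP API):
# tag a response body with the charset declared in the Content-Type value.
def body_with_charset(content_type, body)
  # Pull the charset parameter out of e.g. "text/xml; charset=UTF-8".
  charset = content_type.to_s[/charset\s*=\s*"?([^\s;"]+)/i, 1]
  return body unless charset # no declared charset: leave the body untouched

  begin
    # force_encoding relabels the bytes; it does not convert them.
    body.dup.force_encoding(Encoding.find(charset))
  rescue ArgumentError
    body # charset name unknown to Ruby: keep the original encoding tag
  end
end

# Stand-in for result.body, which arrives tagged as ASCII-8BIT.
raw = "caf\xC3\xA9".b
text = body_with_charset('text/xml; charset=UTF-8', raw)
puts text.encoding
puts text.valid_encoding?
```

With a real response this would be called as `body_with_charset(result['content-type'], result.body)`. Because the declared charset can itself be wrong, checking `valid_encoding?` afterwards is a cheap sanity test before using the string as text.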