[ruby-core:45099] [ruby-trunk - Feature #2567] Net::HTTP does not handle encoding correctly

From: "jrochkind (jonathan rochkind)" <jonathan@...>
Date: 2012-05-16 23:38:25 UTC
List: ruby-core #45099
Issue #2567 has been updated by jrochkind (jonathan rochkind).


It seems like encoding on _headers_ is a different question than encoding on bodies. 

Perhaps the encoding on _headers_ should be left as ASCII-8BIT -- I'm not sure the spec even says the charset in the Content-Type header is supposed to apply to other headers.

But it is clear to me that the encoding on the body should be set from the headers, when possible.

Most languages are not as explicit about encoding as Ruby 1.9. In most languages you can get away with ignoring encoding, and at worst get garbled text -- in Ruby 1.9 you'll get exceptions raised.

'Implementers are encouraged to provide a means of disabling such "content sniffing"   when it is used.'

Fortunately, there is a clear way to do that -- we're not talking about Net::HTTP doing any transcoding, only about it setting the encoding value. You want to 'disable' that? Just call

    response.body.force_encoding("ASCII-8BIT") 

to throw out whatever encoding it determined from the headers. 
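
To illustrate the difference between setting the encoding value and transcoding, here is a minimal sketch. It uses a literal string as a stand-in for response.body (an assumption, so that no network access is needed); only standard String methods are involved.

```ruby
# Stand-in for a response body that was labeled UTF-8 from the headers.
body = "caf\xC3\xA9".dup.force_encoding("UTF-8")

# Opting out: force_encoding only changes the label, never the bytes.
opted_out = body.dup.force_encoding("ASCII-8BIT")
same_bytes = (body.bytes == opted_out.bytes)

# Transcoding, by contrast, rewrites the bytes for the target encoding.
transcoded = body.encode("ISO-8859-1")

puts same_bytes
puts body.bytesize
puts transcoded.bytesize
```

Because the bytes are untouched, a client that distrusts the header loses nothing by relabeling the body afterwards.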

Whatever problems would be caused by HTTP servers sending a bad Content-Type header and Net::HTTP believing it -- wouldn't those same problems also be caused by leaving the encoding as ASCII-8BIT? If an individual client wants to use heuristics to guess the encoding, there's nothing stopping them -- just force_encoding("ASCII-8BIT"), then use whatever heuristics you like and force_encoding to the result at the end.

But by default, Net::HTTP should assume that the spec is being followed and the Content-Type header is correct.
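
As a rough sketch of what that default could look like from the caller's side: label_body below is a hypothetical helper, not part of Net::HTTP, and the regexp for pulling out the charset is a simplification of real Content-Type parsing. It only sets the encoding value and falls back to ASCII-8BIT when the header names no charset or one Ruby doesn't know.

```ruby
# Hypothetical helper (not a Net::HTTP API): label a body with the charset
# named in a Content-Type header value, without transcoding any bytes.
def label_body(body, content_type)
  # Simplified extraction of the charset parameter; real headers may need
  # more careful parsing.
  charset = content_type.to_s[/charset\s*=\s*"?([^\s;"]+)/i, 1]
  encoding = begin
    charset ? Encoding.find(charset) : Encoding::ASCII_8BIT
  rescue ArgumentError
    Encoding::ASCII_8BIT # unknown charset name: keep the raw-bytes label
  end
  body.dup.force_encoding(encoding)
end

raw = "caf\xC3\xA9".dup.force_encoding("ASCII-8BIT")
labeled = label_body(raw, "text/xml; charset=UTF-8")
unlabeled = label_body(raw, "text/xml")

puts labeled.encoding
puts labeled.valid_encoding?
puts unlabeled.encoding
```

A client that prefers its own heuristics can ignore the result and start again from the ASCII-8BIT bytes.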
----------------------------------------
Feature #2567: Net::HTTP does not handle encoding correctly
https://bugs.ruby-lang.org/issues/2567#change-26671

Author: slide_rule (Ryan Sims)
Status: Assigned
Priority: Low
Assignee: naruse (Yui NARUSE)
Category: lib
Target version: 2.0.0


=begin
 A string returned by an HTTP get does not have its encoding set appropriately with the charset field, nor does the content_type report the charset. Example code demonstrating incorrect behavior is below.
 
 #!/usr/bin/ruby -w
 # encoding: UTF-8
 
 require 'net/http'
 
 uri = URI.parse('http://www.hearya.com/feed/')
 result = Net::HTTP.start(uri.host, uri.port) {|http|
     http.get(uri.request_uri)
 }
 
 p result['content-type']     # "text/xml; charset=UTF-8" <- correct
 p result.content_type        # "text/xml" <- incorrect; truncates the charset field
 puts result.body.encoding    # ASCII-8BIT <- incorrect encoding, should be UTF-8
=end



-- 
http://bugs.ruby-lang.org/
