From: "drbrain (Eric Hodel)"
Date: 2012-06-08T09:48:20+09:00
Subject: [ruby-core:45498] [ruby-trunk - Feature #6492] Inflate all HTTP Content-Encoding: deflate, gzip, x-gzip responses by default

Issue #6492 has been updated by drbrain (Eric Hodel).

naruse (Yui NARUSE) wrote:
> drbrain (Eric Hodel) wrote:
> > Due to read_chunked, and persistent connections I don't see how to make this work.
>
> Yeah, I thought adding an another layer, transport encoding decoder, but it is just an idea and I don't suggest it.

I had this idea too, but it would be a larger change. I hope we can create something simpler, but also usable.

> > > This read method return a string whose length is not clen, this is wrong.
> > > Other IO-like object for example Zlib::GzipReader returns a string whose length is clen.
> > > So Inflater should have a internal buffer and return the string whose length is just clen.
> >
> > Upon review, I think this is OK.
> >
> > RFC 2616 specifies that Content-Length and Content-Range (which are used for clen) refer to the transferred bytes and are used to read the correct amount of data from the response to maintain the persistent connection. Net::HTTPResponse#read_body doesn't allow the user to specify the amount of bytes they wish to read, so returning more data to the user is OK.
>
> Your patch hides content-encoding layer.
> Content-Length and Content-Range are the member of the layer.
>
> Net:HTTPRequest#read is on the layer.

I'm confused. There is Net::BufferedIO#read, but no Net::HTTPResponse#read.

There is Net::HTTPResponse#read_body which lets you read the entire body or chunks of unknown size.
I don't see a way for the user to say "read 10 bytes of the response body" without manually buffering:

  require 'net/http'

  req = Net::HTTP::Get.new '/'

  body_part = Net::HTTP.start 'localhost' do |http|
    buffer = ''
    target_size = 10

    http.request req do |res|
      res.read_body do |chunk|
        break if buffer.length == target_size

        buffer << chunk[0, target_size - buffer.length]
      end
    end

    buffer
  end

  p body_part

Since Net::HTTPResponse is not usable as an IO, I don't think IO-like behavior should apply to Net::HTTPResponse::Inflater.

> A user of net/http can't know whether a request used content-encoding or not.

I am unsure what you mean by "can't". Do you mean "a user of net/http must be able to tell content-encoding was present"?

When this patch is combined with #6494 they will not be able to know whether a request used content-encoding or not. I think this is good: the user should not have to worry about how the bytes were transported. (This behavior matches Net::HTTP#get.)

If we want the user to be able to handle content-encoding themselves, I think adding a Net::HTTP#compression = false (which will disable both #6492 and #6494) would be best. We can also add an indicator on Net::HTTPResponse that decompression was performed.

For Content-Length with Content-Encoding, the Content-Length will be invalid. I think this is OK because RFC 2616 doesn't contain an indicator of the decoded length and the user is most likely interested in the decoded body.

Content-Range with Content-Encoding requires special handling. The compressed stream may start anywhere in the underlying block. (For a deflate-based stream the user would need to manually reconstruct the complete response in order to inflate it.) I think such users should disable compression.

> On such situation, it can't be a reason why hidden Content-Encoding layer effects the behavior of read method.
I agree that in RFC 2616 Content-Encoding, Content-Length and Content-Range are all on the same layer, but without an IO-like interface for the Net::HTTPResponse body I don't think a restriction on the behavior of the read method should apply.

Since this API is entirely internal, I think it is OK if a future addition to the API needs to add buffering to be IO-like.

----------------------------------------
Feature #6492: Inflate all HTTP Content-Encoding: deflate, gzip, x-gzip responses by default
https://bugs.ruby-lang.org/issues/6492#change-27086

Author: drbrain (Eric Hodel)
Status: Assigned
Priority: Normal
Assignee: naruse (Yui NARUSE)
Category: lib
Target version: 2.0.0


=begin
This patch moves the compression-handling code from Net::HTTP#get to Net::HTTPResponse to allow decompression to occur by default on any response body. (A future patch will set the Accept-Encoding on all requests that allow response bodies by default.)

Instead of having separate decompression code for deflate- and gzip-encoded responses, (({Zlib::Inflate.new(32 + Zlib::MAX_WBITS)})) is used, which automatically detects and inflates gzip-wrapped streams. This allows for simpler processing of gzip bodies (no need to create a StringIO).
=end


--
http://bugs.ruby-lang.org/
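
The two points discussed above (header auto-detection with 32 + Zlib::MAX_WBITS, and inflated output not matching the number of bytes fed in) can be sketched with Zlib alone. This is only an illustration, not the code from the patch: the helper names inflate_auto and inflate_in_pieces, the sample body, and the piece size of 10 are all made up for the example.

```ruby
require 'zlib'

# Hypothetical helper (not part of net/http): inflate +data+ whether it is
# gzip-wrapped or zlib-wrapped. Adding 32 to the window bits asks zlib to
# detect the header type automatically.
def inflate_auto(data)
  inflater = Zlib::Inflate.new(32 + Zlib::MAX_WBITS)
  begin
    inflater.inflate(data)
  ensure
    inflater.close
  end
end

# Hypothetical helper: feed the compressed bytes in fixed-size pieces, roughly
# the way a body arrives from the socket, and collect what each piece
# inflates to.
def inflate_in_pieces(data, piece_size)
  inflater = Zlib::Inflate.new(32 + Zlib::MAX_WBITS)
  pieces = []
  offset = 0
  while offset < data.bytesize
    pieces << inflater.inflate(data.byteslice(offset, piece_size))
    offset += piece_size
  end
  pieces << inflater.finish # flush anything still buffered
  inflater.close
  pieces
end

body = 'hello world ' * 100

gzip_body = Zlib.gzip(body)             # gzip wrapper
zlib_body = Zlib::Deflate.deflate(body) # zlib wrapper

# The same inflater settings handle both wrappers.
puts inflate_auto(gzip_body) == body
puts inflate_auto(zlib_body) == body

# Each 10-byte compressed piece inflates to some other number of bytes, so
# the sizes handed to the caller do not match the sizes read from the wire.
pieces = inflate_in_pieces(gzip_body, 10)
puts pieces.join == body
p pieces.map(&:bytesize)
```

The per-piece sizes printed at the end depend on the compressed data, which is why a caller reading decoded chunks cannot rely on them matching the transferred byte counts.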