From: "drbrain (Eric Hodel)"
Date: 2012-05-30T08:45:39+09:00
Subject: [ruby-core:45311] [ruby-trunk - Feature #6492] Inflate all HTTP Content-Encoding: deflate, gzip, x-gzip responses by default

Issue #6492 has been updated by drbrain (Eric Hodel).

File net.http.inflate_by_default.2.patch added

I've updated this patch. After working with the code again and looking at RFC 2616, I have made the following changes:

> naruse (Yui NARUSE) wrote:
> > If Inflater's @socket.read returns nil or a string shorter than clen, it means the input is finished and @inflate can finish.
> > So at that time, you can call @inflate.finish.
>
> I hadn't thought of that, I will implement it.

Because of read_chunked and persistent connections, I don't see how to make this work.

When reading a body sized by Content-Length or Content-Range this strategy would work, but read_chunked reads multiple chunks of the compressed body, and the end of the input to inflate is signaled by a terminating "0\r\n\r\n" on the raw socket. Adding this communication between the raw socket and Inflater seems worse.

When the connection is persistent, #read should only return nil when the connection was abnormally terminated, in which case we will throw away the body.

For #read_all, this would work.

Due to all the special cases, I changed Net::HTTPResponse#inflater to yield the Inflater and automatically clean it up. This keeps the special information about cleanup out of #read_body_0.

> this variable inflater is confusing with the inflater method.

In Net::HTTPResponse#read_chunked, the confusing "inflater" variable has been replaced with "chunk_data_io", a name which comes from RFC 2616 section 3.6.1.

> This read method return a string whose length is not clen, this is wrong.
> Other IO-like object for example Zlib::GzipReader returns a string whose length is clen.
> So Inflater should have a internal buffer and return the string whose length is just clen.

Upon review, I think this is OK.
RFC 2616 specifies that Content-Length and Content-Range (which are used for clen) refer to the transferred bytes and are used to read the correct amount of data from the response to maintain the persistent connection. Net::HTTPResponse#read_body doesn't allow the user to specify the number of bytes they wish to read, so returning more data to the user is OK.

I have made an additional change beyond your review: I've added a Net::ReadAdapter to the Inflater to stream the encoded response body through inflate without buffering it all. This will reduce memory consumption for large responses.

----------------------------------------
Feature #6492: Inflate all HTTP Content-Encoding: deflate, gzip, x-gzip responses by default
https://bugs.ruby-lang.org/issues/6492#change-26895

Author: drbrain (Eric Hodel)
Status: Open
Priority: Normal
Assignee:
Category: lib
Target version: 2.0.0

=begin
This patch moves the compression-handling code from Net::HTTP#get to Net::HTTPResponse to allow decompression to occur by default on any response body. (A future patch will set the Accept-Encoding on all requests that allow response bodies by default.)

Instead of having separate decompression code for deflate and gzip-encoded responses, (({Zlib::Inflate.new(32 + Zlib::MAX_WBITS)})) is used, which automatically detects and inflates gzip-wrapped streams. This allows for simpler processing of gzip bodies (no need to create a StringIO).
=end

--
http://bugs.ruby-lang.org/
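
For readers who want to see the header auto-detection mentioned in the issue description, here is a minimal sketch. It is not taken from the patch: the helper name inflate_chunks and the 7-byte chunk size are made up for illustration, and the chunks only stand in for pieces of a body arriving from a socket. It assumes nothing beyond Ruby's bundled zlib and stringio libraries.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper (not from the patch): inflate a body that arrives in
# pieces. Adding 32 to the window bits asks zlib to detect a zlib or gzip
# header on its own, so one inflater serves both kinds of stream.
def inflate_chunks(chunks)
  inflater = Zlib::Inflate.new(32 + Zlib::MAX_WBITS)
  out = String.new
  chunks.each { |chunk| out << inflater.inflate(chunk) }
  out << inflater.finish
  out
ensure
  inflater.close if inflater
end

# Split a string into small pieces to imitate incremental reads.
def in_pieces(data, size = 7)
  data.bytes.each_slice(size).map { |bytes| bytes.pack('C*') }
end

body = 'hello world ' * 20

# A zlib-wrapped stream.
zlib_stream = Zlib::Deflate.deflate(body)

# A gzip-wrapped stream, built here only to have test input.
io = StringIO.new
gz = Zlib::GzipWriter.new(io)
gz.write(body)
gz.finish
gzip_stream = io.string

from_zlib = inflate_chunks(in_pieces(zlib_stream))
from_gzip = inflate_chunks(in_pieces(gzip_stream))
```

Both calls use the same inflater setup. Feeding the data piece by piece keeps only the current chunk and the output in memory, which is the same idea as streaming the encoded body through inflate without buffering all of it.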