From: cardoso_tiago@...
Date: 2019-01-24T12:21:45+00:00
Subject: [ruby-core:91249] [Ruby trunk Feature#15549] Enumerable#to_reader (or anything enumerable, Enumerator, lazy enums, enum_for results)

Issue #15549 has been updated by chucke (Tiago Cardoso).

I'm open to other names (#to_readable_stream perhaps?). What matters is to acknowledge the validity of this use case, as there are some constraints depending on where it is used. For instance, responding to #read is a sufficient requirement for writing to disk. For writing to an S3 bucket, the SDK requires that the object also respond to #bytesize. That's beyond the scope of this change, though.

----------------------------------------
Feature #15549: Enumerable#to_reader (or anything enumerable, Enumerator, lazy enums, enum_for results)
https://bugs.ruby-lang.org/issues/15549#change-76495

* Author: chucke (Tiago Cardoso)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
----------------------------------------
This is a feature proposal for something I've had to implement multiple times before.

For a lot of IO-related APIs, there is an unspoken (because ruby doesn't have official interfaces) notion of a reader/writer protocol: you pass arguments to certain functions, and those arguments must implement either "#read(nsize, buffer)" or "#write(data)". An example would be "IO.copy_stream".

It has happened to me multiple times that I started implementing some data generator using "#each" in a specific format (CSV data, JSON...) to be lazy and memory-conservative, but ended up rewriting it because I can't read from an enumerable into a socket/file handle directly.

Lately I've been adopting the pattern of "injecting" a "#read" method into these objects, so that I can use these APIs to my benefit. Sadly, I have to reimplement this in every project. This is the gist:

https://gist.github.com/HoneyryderChuck/625c7b873a00a18d12b1a08695551510

I think such an API would be very beneficial to the common user.
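To make the "#read" injection pattern concrete, here is a minimal sketch of such an adapter. It is not the gist's code and not the proposed API: the class name "EnumReader" and its internals are assumptions made up for illustration. It buffers elements from the enumerable until it has enough bytes, and fills the buffer argument it is given, since that is what "IO.copy_stream" writes out when the source is not a real IO.

```ruby
require "stringio"

# Hypothetical adapter (illustration only): exposes #read(length, outbuf)
# on top of anything that responds to #each and yields to_s-able items.
class EnumReader
  def initialize(enum)
    @enum = enum.to_enum      # external iterator, pulled with #next
    @pending = String.new     # binary buffer of bytes not yet returned
  end

  # Returns up to +length+ bytes (everything if +length+ is nil),
  # or nil once the source is exhausted and a positive length was asked for.
  def read(length = nil, outbuf = nil)
    fill(length)
    chunk = @pending.slice!(0, length || @pending.bytesize)
    outbuf&.replace(chunk)    # the caller may only look at its own buffer
    return nil if chunk.empty? && length && length > 0
    outbuf || chunk
  end

  private

  # Pulls items until +length+ bytes are buffered or the source ends.
  def fill(length)
    while length.nil? || @pending.bytesize < length
      @pending << @enum.next.to_s.b
    end
  rescue StopIteration
    # Source exhausted; whatever is buffered is all that is left.
  end
end

# Usage: stream lazily generated CSV rows into a sink without
# building the whole payload first.
rows = Enumerator.new do |y|
  y << "id,name\n"
  y << "1,alice\n"
  y << "2,bob\n"
end

sink = StringIO.new
IO.copy_stream(EnumReader.new(rows), sink)
puts sink.string
```

Only one chunk is held in memory at a time, which is the point of the proposal. A real implementation would also have to decide how to handle encodings and partial reads; this sketch treats everything as binary.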
In most projects I've worked on, writing data to a tempfile, an S3 bucket or an FTP server is very common, and I've lost count of the implementations which build the whole payload in memory and only **then** write it to the handle, which obviously gives the impression that ruby consumes a lot of memory.

Now, I also understand that this is only beneficial for a particular kind of enum (those which yield strings/"to_s"-ables). But there's a precedent in "#sum", so maybe I can make a case.

This is an example that works if you load the gist code:

```ruby
# Requires the gist code above, which provides #to_reader.
enum = %w(a) * 65536
puts "size: #{enum.size}"
reader = enum.to_reader
IO.copy_stream(reader, $stderr)
```

-- 
https://bugs.ruby-lang.org/