From: "ioquatix (Samuel Williams)" Date: 2021-12-19T20:08:25+00:00 Subject: [ruby-core:106736] [Ruby master Bug#18417] IO::Buffer problems Issue #18417 has been updated by ioquatix (Samuel Williams). Thanks so much for this awesome list of things we should try to address. https://github.com/ruby/ruby/pull/5303 We are considering how `to_str` should work. Implicit conversion is useful but it could also introduce issues. > Another concern about #to_str naming is it is one of the main parts of the buffer interface (alongside #copy), but doesn't look this way, and looks like some utility method instead. Maybe copy/to_str pair should be symmetrical, like write/read, or something like that? We need to be a little bit careful because string operations are not the primary point, and we have to also consider that such names clash with IO-like behaviour, e.g. we might use `buffer.write(io, ...)` or `buffer.write_to(io, ...)`... Let me first try to address all the bugs, then with the remaining time let's work on features/changes. This interface is marked experimental for a reason and I expect there will be some big changes between 3.1. release and 3.2 release in terms of the user facing interface so while I'd prefer we get this all sorted, I'm not too concerned if we don't have all the answers before the release of 3.1. ---------------------------------------- Bug #18417: IO::Buffer problems https://bugs.ruby-lang.org/issues/18417#change-95429 * Author: zverok (Victor Shepelev) * Status: Open * Priority: Normal * Assignee: ioquatix (Samuel Williams) * Backport: 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN ---------------------------------------- Hello. While working for documentation for `IO::Buffer` (still WIP), I uncovered these problems. I believe at least some of them should be fixed before the release, though I understand not much time left. The list is not final, probably (still on it, but wanted to raise awareness ASAP). **These should be probably fixed/discussed before the release** 1\. I believe `to_str` is an unfortunate name for "read string from buffer, maybe entire buffer. It is Ruby's convention that `to_str` is an implicit String-conversion, and a) I am not sure the Buffer is "string-like" enough to be implicitly converted, and b) it is really unexpected that `#to_s` and `#to_str` do different things: ```ruby buf = IO::Buffer.for('test') "buf: #{buf}" # => "buf: #" "buf: " + buf # => "buf: test" puts buf # # puts '' + buf # test ``` Another concern about `#to_str` naming is it is one of the main parts of the buffer interface (alongside `#copy`), but doesn't look this way, and looks like some utility method instead. Maybe copy/to_str pair should be symmetrical, like `write`/`read`, or something like that? 2\. I am not sure that type names (`:U8`, `:S8`, `:U16`, `:u16` and so on) introduced "from scratch" (?) is a good thing. From one point, they are kinda nice, and consistent. From another, we already have abbreviated names for the types in [Array#pack](https://docs.ruby-lang.org/en/master/Array.html#method-i-pack), and having two completely unrelated sets of designation for the same thing in the **core of the language** seems weird to me. I understand that not all `#pack`s designations can be represented by Symbol, and their consistency is questionable (it is `S` for 16-bit unsigned LE, `s` for 16-bit signed LE, and `S>`/`s>` for the same in BE), but the discrepancy is questionable too. 3\. About `flags` as a creation parameter: * `::new`: it seems that the only acceptable value is `MAPPED`; of others, `INTERNAL` is the default, `EXTERNAL`, `IMMUTABLE` and `LOCKED` raise on creation; * `::for`: no flags accepted (though it seems that a method to make such buffer immutable might be of some utility?) * `::map`: `INTERNAL`, `LOCKED`, `EXTERNAL` are ignored, `MAPPED` is the default; seems like the only usage of the parameter is _not_ to pass `IMMUTABLE` which is the default: and for that, you'll need to also pass `size` and `offset`. It seems, that **we shouldn't probably expose "flags" concept to the user** at all, and instead make APIs like `::new(size, mapped: false)` and `::map(io, ..., immutable: true)`? **Probably bugs** * It seems that `#clear` behaves weirdly when `offset` is provided but `length` is not: ```ruby buf = IO::Buffer.new(4) buf.clear(1, 2) # it seems obvious that I want to clear from byte 2 to the end of the buffer # in `clear': Offset + length out of bounds! (RuntimeError) ``` * `IO::Buffer.map('README.md')` (just a string, I experimented "whether it is smart enough") core dumps. * I suspect we might want to set buffer to `IMMUTABLE` if it is created from the frozen string. Currently, this works: ```ruby str = 'test'.freeze buf = IO::Buffer.for(str) buf.set(:U8, 1, 111) str #=> "tost" ``` * It seems `to_str` always returns a string in `US-ASCII` encoding, I am not sure it is the right behavior. For a low-level buffer, I'd rather expect `ASCII-8BIT` (what file read in binary mode produces in Ruby). * (Not sure if a bug or just not very useful behavior?) `#resize` applied to a buffer mapped from file seems to "break" the link between the buffer and file; no further `copy` or `set` would be reflected on file. * ~~`buf = IO::Buffer.new(4); buf.free; buf.inspect` gives a core dump~~ * ~~`buf = IO::Buffer.for("test"); buf.free` ��� after that, `buf.get(:U8, 0)` gives "Buffer has been invalidated! (RuntimeError)", but `buf.to_str` works and produces `"\x00\x00\x00\x00"`; `buf.to_str(2)` does a core dump~~ **Inconsistencies and other problems** * The buffer raises `RuntimeError` as the only type of the error. I believe that in some places, it should be other standard errors (for example, "Offset + length out of bounds" could be at least `ArumentError`?..), and in other occurrences, special error classes: for example, "Buffer is locked" would probably be helpful to have as some `IO::Buffer::LockedError` * ~~Current `#inspect` is kinda nice and useful, but it also might be really annoying when used by Ruby itself. Try `IO::Buffer.map(File.open("big file")).unexisting_method`, and you'll get REALLY long error message.~~ * ~~Can we maybe have for `#resize` a signature `resize(new_size, preserve = self.size)`? It seems a good default to allow just `resize(123)`, with preserving everything by default?~~ -- https://bugs.ruby-lang.org/ Unsubscribe: