From: "YO4 (Yoshinao Muramatsu) via ruby-core"
Date: 2025-11-13T16:58:06+00:00
Subject: [ruby-core:123791] [Ruby Bug#20889] IO#ungetc and IO#ungetbyte should not cause IO#pos to report an inaccurate position

Issue #20889 has been updated by YO4 (Yoshinao Muramatsu).

I checked #20869 and [DevMeeting-2024-11-07](https://github.com/ruby/dev-meeting-log/blob/master/2024/DevMeeting-2024-11-07.md). On that basis, I will now state my current opinion.

IO#ungetc
* As stated in #21682, the character buffer is considered isolated from the underlying stream, and IO#ungetc does not change the file position.

IO#getbyte
1. To support reading and then reverting, it is preferable for the file position to change. This approach works even on devices where seeking is not possible.
1. To allow reverting with the same number of ungetbyte/getbyte operations, negative file positions should be permitted. Prohibiting a move to a negative file position would be less convenient.
1. Clearing the buffer while at a negative file position leaves the file position and subsequent reads undetermined, and the state cannot be restored using IO#getbyte. Therefore, IO#pos should not clear the buffer.

IO#pos
* IO#pos is an operation that does not affect the target. It relates to seek in the same way that peek relates to read.

IO#seek
* IO#seek changes the file position and clears the buffer.
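
The semantics described above can be sketched in plain Ruby. The `PushbackReader` class below is hypothetical and is not how `IO` is implemented; it only models the proposed rules for ungetbyte, getbyte, pos, and seek on top of a `StringIO`, assuming pushed-back bytes are kept apart from the underlying stream.

```ruby
require 'stringio'

# Hypothetical wrapper, not part of Ruby: models the semantics proposed
# in this comment for ungetbyte, getbyte, pos, and seek.
class PushbackReader
  def initialize(io)
    @io = io       # underlying seekable stream
    @pushed = []   # bytes pushed back, most recent last
  end

  # Pushing a byte back moves the reported position back by one,
  # even below zero.
  def ungetbyte(byte)
    @pushed.push(byte)
    nil
  end

  # Pushed-back bytes are returned first, most recent first.
  def getbyte
    @pushed.empty? ? @io.getbyte : @pushed.pop
  end

  # Reports the position without touching the pushback buffer.
  def pos
    @io.pos - @pushed.size
  end

  # Seeking discards pushed-back bytes and repositions the stream.
  def seek(offset)
    @pushed.clear
    @io.seek(offset)
  end
end

reader = PushbackReader.new(StringIO.new('0123456789'))
reader.ungetbyte(93)
reader.pos      # => -1
reader.pos      # => -1; asking for the position does not clear the buffer
reader.getbyte  # => 93
reader.pos      # => 0
reader.getbyte  # => 48
```

In this model, the same number of getbyte calls undoes a run of ungetbyte calls, and only seek discards pushed-back bytes.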
----------------------------------------
Bug #20889: IO#ungetc and IO#ungetbyte should not cause IO#pos to report an inaccurate position
https://bugs.ruby-lang.org/issues/20889#change-115191

* Author: javanthropus (Jeremy Bopp)
* Status: Open
* ruby -v: ruby 3.3.6 (2024-11-05 revision 75015d4c1f) [x86_64-linux]
* Backport: 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
```ruby
require 'tempfile'

Tempfile.open(encoding: 'utf-8') do |f|
  f.write('0123456789')
  f.rewind
  f.ungetbyte(93)
  f.pos  # => -1; negative value is surprising!
end

Tempfile.open(encoding: 'utf-8') do |f|
  f.write('0123456789')
  f.rewind
  f.ungetc('a'.encode('utf-8'))
  f.pos  # => -1; similar to the ungetbyte case
end

Tempfile.open(encoding: 'utf-8:utf-16le') do |f|
  f.write('0123456789')
  f.rewind
  f.ungetc('a'.encode('utf-16le'))
  f.pos  # => 0; maybe should be -2 to match the previous ungetc case?
end
```

It doesn't seem logical that `IO#pos` should ever be affected by `IO#ungetc` or `IO#ungetbyte`. The pushed characters or bytes aren't really in the stream source. The value of `IO#pos` implies that jumping directly to that position via `IO#seek` and reading from there would return the same character or byte that was pushed, but the pushed characters or bytes are lost when the operation to seek in the stream is performed. In the case where `IO#pos` is a negative value, attempting to seek to that position actually raises an exception.

In the `IO#ungetc` with character conversion case above, it seems unreasonable to make `IO#pos` report an even less correct position.
In that case, the position would need to be adjusted backward by 2 bytes due to the internal encoding of the stream, but that is completely inconsistent with the behavior of `IO#pos` when reading from the stream normally, where it reports the underlying stream's byte position and not the number of transcoded bytes that have been read:

```ruby
require 'tempfile'

Tempfile.open(encoding: 'utf-8:utf-16le') do |f|
  f.write('0123456789')
  f.rewind
  f.getc.bytesize  # => 2; due to the internal encoding of the stream
  f.pos            # => 1; reports actual bytes read from the stream, not transcoded bytes
end
```

Attempting to use `IO#pos` when there are characters or bytes pushed into the read buffer by way of `IO#ungetc` or `IO#ungetbyte` should result in one of the following behaviors:

1. Raise an exception
2. Return the stream's position, clearing the read buffer entirely
3. Return the stream's position, ignoring the pushed characters or bytes, and produce a warning
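
The three options for `IO#pos` can be compared with a small model. The `PolicyReader` class and its policy symbols (`:raise`, `:clear`, `:warn`) are hypothetical names chosen for this sketch, and the class wraps a `StringIO` rather than changing `IO` itself, so it shows only the intended observable behavior of each option.

```ruby
require 'stringio'

# Hypothetical wrapper used only to compare the three options; the class
# name and the policy symbols are assumptions, not Ruby API.
class PolicyReader
  def initialize(io, policy)
    @io = io
    @policy = policy   # :raise, :clear, or :warn
    @pushed = []       # bytes pushed back, most recent last
  end

  def ungetbyte(byte)
    @pushed.push(byte)
    nil
  end

  # Pushed-back bytes are returned before bytes from the stream.
  def getbyte
    @pushed.empty? ? @io.getbyte : @pushed.pop
  end

  def pos
    return @io.pos if @pushed.empty?

    case @policy
    when :raise
      # Option 1: refuse to report a position.
      raise IOError, 'position is undefined while bytes are pushed back'
    when :clear
      # Option 2: drop the pushed-back bytes.
      @pushed.clear
    when :warn
      # Option 3: keep the pushed-back bytes but ignore them.
      warn 'pos ignores pushed-back bytes'
    end
    @io.pos
  end
end

%i[clear warn].each do |policy|
  reader = PolicyReader.new(StringIO.new('0123456789'), policy)
  reader.ungetbyte(93)
  reader.pos      # => 0 under both policies
  reader.getbyte  # => 48 under :clear, 93 under :warn
end
```

Under all three policies the reported value is never negative, and it always matches a position that can be passed to seek on the underlying stream.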