From: "javanthropus (Jeremy Bopp) via ruby-core"
Date: 2024-11-08T13:16:43+00:00
Subject: [ruby-core:119843] [Ruby master Bug#20869] IO buffer handling is inconsistent when seeking

Issue #20869 has been updated by javanthropus (Jeremy Bopp).

> Since io.pos (not assignment) looks mere attribute, differentiated from seek.

If not for the fact that `IO#seek` always returns 0 regardless of its arguments (something I've never understood), `IO#pos` could be implemented as `IO#seek(0, :CUR)`. Why not avoid busting the buffer in that case? On the other hand, why not simplify the implementation and bust the buffer in all cases? Maybe I'm too hung up on implementation and am unable to see `IO#pos` as merely an attribute.

I also just installed the latest master branch build to check the changes, and there are still a couple of issues:

1. It's still possible for `IO#pos` to return negative values:
```ruby
require 'tempfile'

Tempfile.open do |f|
  f.write('0123456789')
  f.rewind

  f.ungetbyte(97)
  f.pos               # => -1
end
```
2. The character buffer isn't cleared when transcoding and seeking without first calling `IO#getc`:
```ruby
require 'tempfile'

Tempfile.open(encoding: 'utf-8:utf-16le') do |f|
  f.write('0123456789')
  f.rewind

  f.ungetc('a'.encode('utf-16le'))

  # Character buffer will not be cleared
  f.seek(2, :SET)

  f.getc              # => 'a'.encode('utf-16le'); should be '2'.encode('utf-16le')
end
```

----------------------------------------
Bug #20869: IO buffer handling is inconsistent when seeking
https://bugs.ruby-lang.org/issues/20869#change-110533

* Author: javanthropus (Jeremy Bopp)
* Status: Closed
* ruby -v: ruby 3.3.4 (2024-07-09 revision be1089c8ec) [x86_64-linux]
* Backport: 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
When performing any of the seek based operations on IO (IO#seek, IO#pos=, or IO#rewind), the read buffer is inconsistently cleared:
```ruby
require 'tempfile'

Tempfile.open do |f|
  f.write('0123456789')
  f.rewind

  # Calling #ungetbyte as the first read buffer
  # operation uses a buffer that is preserved during
  # seek operations
  f.ungetbyte(97)

  # Byte buffer will not be cleared
  f.seek(2, :SET)

  f.getbyte           # => 97
end

Tempfile.open do |f|
  f.write('0123456789')
  f.rewind

  # Calling #getbyte before #ungetbyte uses a
  # buffer that is not preserved when seeking
  f.getbyte
  f.ungetbyte(97)

  # Byte buffer will be cleared
  f.seek(2, :SET)

  f.getbyte           # => 50
end
```

Similar behavior happens when reading characters:
```ruby
require 'tempfile'

Tempfile.open do |f|
  f.write('0123456789')
  f.rewind

  # Calling #ungetc as the first read buffer
  # operation uses a buffer that is preserved during
  # seek operations
  f.ungetc('a')

  # Character buffer will not be cleared
  f.seek(2, :SET)

  f.getc              # => 'a'
end

Tempfile.open do |f|
  f.write('0123456789')
  f.rewind

  # Calling #getc before #ungetc uses a
  # buffer that is not preserved when seeking
  f.getc
  f.ungetc('a')

  # Character buffer will be cleared
  f.seek(2, :SET)

  f.getc              # => '2'
end
```

When transcoding, however, the character buffer is never cleared when seeking:
```ruby
require 'tempfile'

Tempfile.open(encoding: 'utf-8:utf-16le') do |f|
  f.write('0123456789')
  f.rewind

  f.ungetc('a'.encode('utf-16le'))

  # Character buffer will not be cleared
  f.seek(2, :SET)

  f.getc              # => 'a'.encode('utf-16le')
end

Tempfile.open(encoding: 'utf-8:utf-16le') do |f|
  f.write('0123456789')
  f.rewind

  f.getc
  f.ungetc('a'.encode('utf-16le'))

  # Character buffer will not be cleared
  f.seek(2, :SET)

  f.getc              # => 'a'.encode('utf-16le')
end
```

I would expect the buffers to be cleared in all cases except possibly when the seek operation doesn't actually move the file pointer such as when calling IO#pos or IO#seek(0, :CUR). The inconsistent behavior demonstrated here is a problem regardless though.

--
https://bugs.ruby-lang.org/
______________________________________________
ruby-core mailing list -- ruby-core@ml.ruby-lang.org
To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/