From: mjrzasa via ruby-core
Date: 2025-02-05T22:09:30+00:00
Subject: [ruby-core:120887] [Ruby master Bug#20919] IO#seek and IO#pos= do not clear the character buffer in some cases while transcoding

Issue #20919 has been updated by mjrzasa (Maciek Rząsa).

I reran the tests on 3.5.0, and the problem is indeed related to transcoding:

```
puts "Hello dev-ruby! #{RUBY_VERSION}"
require 'tempfile'

# 1. No encoding on open, plain character pushed back
Tempfile.open() do |f|
  f.write('0123456789')
  f.rewind
  f.ungetc('a')
  # Character buffer is cleared here
  f.seek(2, :SET)
  puts f.getc # prints 2 (as expected)
end

# 2. Transcoding on open, UTF-16LE character pushed back
Tempfile.open(encoding: 'utf-8:utf-16le') do |f|
  f.write('0123456789')
  f.rewind
  f.ungetc('a'.encode('utf-16le'))
  # Character buffer WILL NOT be cleared
  f.seek(2, :SET)
  puts f.getc # => 'a'.encode('utf-16le'); should be '2'.encode('utf-16le')
end

# 3. No encoding on open, UTF-16LE character pushed back
Tempfile.open() do |f|
  f.write('0123456789')
  f.rewind
  f.ungetc('a'.encode('utf-16le'))
  # Character buffer is cleared here
  f.seek(2, :SET)
  puts f.getc # prints 2 (as expected)
end

# 4. Transcoding on open, plain character pushed back
Tempfile.open(encoding: 'utf-8:utf-16le') do |f|
  f.write('0123456789')
  f.rewind
  f.ungetc('a')
  # Character buffer WILL NOT be cleared
  f.seek(2, :SET)
  puts f.getc # prints a2; should be '2'.encode('utf-16le')
end
```

```
Hello dev-ruby! 3.5.0
2
a
2
a2
```

So the issue occurs only when an encoding is set on `.open`. Also, in that case, when a non-encoded char was `ungetc`-ed, `getc` returned two characters (`a2`).
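
For anyone who wants to rerun this on other Ruby versions, the four cases above can be condensed into one helper. This is only a sketch: `char_after_seek` and the case labels are hypothetical names, not part of the report. It prints the raw bytes of whatever `getc` returns and asserts nothing about the result, since that depends on the Ruby version.

```ruby
require 'tempfile'

# Hypothetical helper, not part of the report: write ten ASCII digits,
# push one character back with ungetc, seek to byte offset 2, and return
# whatever getc yields next.
def char_after_seek(pushed, **open_opts)
  result = nil
  Tempfile.open('seek-probe', **open_opts) do |f|
    f.write('0123456789')
    f.rewind
    f.ungetc(pushed)
    f.seek(2, :SET)
    result = f.getc
  end
  result
end

transcoding = { encoding: 'utf-8:utf-16le' }

# The same four combinations as in the script above: with and without
# transcoding, with and without a pre-encoded pushed-back character.
cases = {
  'no encoding, plain "a"'    => ['a', {}],
  'transcoding, UTF-16LE "a"' => ['a'.encode('utf-16le'), transcoding],
  'no encoding, UTF-16LE "a"' => ['a'.encode('utf-16le'), {}],
  'transcoding, plain "a"'    => ['a', transcoding]
}

cases.each do |label, (pushed, opts)|
  char = char_after_seek(pushed, **opts)
  # Print raw bytes so the result is unambiguous regardless of encoding.
  puts format('%-28s %p (%s)', label, char.bytes, char.encoding)
end
```

If the character buffer is cleared by the seek, the bytes shown should be those of `'2'` in the relevant encoding; any `97` (the byte for `a`) in the output means the pushed-back character survived the seek.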
----------------------------------------
Bug #20919: IO#seek and IO#pos= do not clear the character buffer in some cases while transcoding
https://bugs.ruby-lang.org/issues/20919#change-111761

* Author: javanthropus (Jeremy Bopp)
* Status: Open
* ruby -v: ruby 3.4.0dev (2024-11-28T12:38:16Z master 3af1a04741) +PRISM [x86_64-linux]
* Backport: 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
When transcoding characters, `IO#seek` and `IO#pos=` only clear the internal character buffer if `IO#getc` is called first:

```ruby
require 'tempfile'

Tempfile.open(encoding: 'utf-8:utf-16le') do |f|
  f.write('0123456789')
  f.rewind
  f.ungetc('a'.encode('utf-16le'))
  # Character buffer WILL NOT be cleared
  f.seek(2, :SET)
  f.getc # => 'a'.encode('utf-16le'); should be '2'.encode('utf-16le')
end

Tempfile.open(encoding: 'utf-8:utf-16le') do |f|
  f.write('0123456789')
  f.rewind
  f.ungetc('a'.encode('utf-16le'))
  # Character buffer WILL NOT be cleared
  f.pos = 2
  f.getc # => 'a'.encode('utf-16le'); should be '2'.encode('utf-16le')
end

Tempfile.open(encoding: 'utf-8:utf-16le') do |f|
  f.write('0123456789')
  f.rewind
  # Added a call to #getc here
  f.getc
  f.ungetc('a'.encode('utf-16le'))
  # Character buffer WILL be cleared now
  f.seek(2, :SET)
  # Same behavior for #pos=
  #f.pos = 2
  f.getc # => '2'.encode('utf-16le')
end
```

--
https://bugs.ruby-lang.org/
______________________________________________
ruby-core mailing list -- ruby-core@ml.ruby-lang.org
To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/