From: duerst@...
Date: 2017-09-25T08:27:56+00:00
Subject: [ruby-core:82977] [Ruby trunk Feature#13626] Add String#byteslice!

Issue #13626 has been updated by duerst (Martin Dürst).

normalperson (Eric Wong) wrote:

> Fwiw, I'm also not convinced String#<< behavior about changing
> write_buffer to Encoding::UTF-8 in your above example is good
> behavior on Ruby's part... But I don't know much about human
> language encodings, I am just a *nix plumber where a byte is a
> byte.

This behavior may not be the best for this specific case, but in general, if one string is US-ASCII and the other is UTF-8, then, since UTF-8 is a superset of US-ASCII, concatenating the two will produce a string in UTF-8. Dropping the encoding would lose important information.

Please also note that you are actually on dangerous ground here. The above only works because the string doesn't contain any non-ASCII (high bit set) bytes. As soon as there is such a byte, there will be an error.

````
s = "abcde".b
s.encoding     # => #<Encoding:ASCII-8BIT>
s << "αβγδε"   # => "abcdeαβγδε"
s.encoding     # => #<Encoding:UTF-8>
````

but:

````
t = "αβγδε".b  # => "\xCE\xB1\xCE\xB2\xCE\xB3\xCE\xB4\xCE\xB5"
t.encoding     # => #<Encoding:ASCII-8BIT>
t << "αβγδ"    # => Encoding::CompatibilityError: incompatible character encodings: ASCII-8BIT and UTF-8
````

So if you have an ASCII-8BIT buffer and want to append something, always make sure that what you append is also ASCII-8BIT.

----------------------------------------
Feature #13626: Add String#byteslice!
https://bugs.ruby-lang.org/issues/13626#change-66884

* Author: ioquatix (Samuel Williams)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
----------------------------------------
It's a common pattern in IO buffering to read a part of a string while leaving the remainder.

~~~
# Consume only part of the read buffer:
result = @read_buffer.byteslice(0, size)
@read_buffer = @read_buffer.byteslice(size, @read_buffer.bytesize)
~~~

It would be nice if this code could be simplified to:

~~~
result = @read_buffer.byteslice!(size)
~~~

Additionally, this allows a significantly improved implementation by the interpreter.

-- 
https://bugs.ruby-lang.org/
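
For illustration only, here is a minimal sketch that combines the two points above: keeping a buffer in ASCII-8BIT by converting everything appended with `String#b`, and consuming a prefix with the existing `String#byteslice`. The `ReadBuffer` class and its `consume!` method are hypothetical names made up for this sketch. `String#byteslice!` is the method being proposed, so the sketch emulates it with two `byteslice` calls and may not match whatever semantics are eventually agreed on.

~~~ruby
# Hypothetical helper class, not part of Ruby or of the proposal.
class ReadBuffer
  def initialize
    @buffer = "".b # empty ASCII-8BIT (binary) string
  end

  # Append data as raw bytes. String#b returns a binary copy, so
  # appending never raises Encoding::CompatibilityError and never
  # switches the buffer to another encoding.
  def <<(data)
    @buffer << data.b
    self
  end

  # Emulates the proposed byteslice!(size): returns the first `size`
  # bytes and keeps only the remainder in the buffer.
  def consume!(size)
    result = @buffer.byteslice(0, size)
    # byteslice returns nil when the offset is past the end of the string.
    @buffer = @buffer.byteslice(size, @buffer.bytesize) || "".b
    result
  end

  def bytesize
    @buffer.bytesize
  end

  def encoding
    @buffer.encoding
  end
end

buf = ReadBuffer.new
buf << "abcde" << "\u03B1\u03B2" # ASCII bytes, then two Greek letters (4 bytes)
head = buf.consume!(5)           # the five ASCII bytes
rest = buf.consume!(100)         # asking for more than is left returns the remainder
~~~

Converting at the point of appending keeps the encoding decision in one place, at the cost of an extra copy for each append.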