From: "mame (Yusuke Endoh) via ruby-core"
Date: 2023-08-25T06:44:54+00:00
Subject: [ruby-core:114527] [Ruby master Bug#19784] String#delete_prefix! problem

Issue #19784 has been updated by mame (Yusuke Endoh).

> Do you mean that if the argument or if the receiver String is not `String#valid_encoding?`, then we compare byte-by-byte, and otherwise we compare character-by-character ?

No.

> I think we should not consider whether a given substring is valid

I think it's this way. In the case of `"\xFF\xC3\x84"` (= `"\xFFÄ"`), the byte sequence from byteoffset 0 is invalid, so we take out one byte `"\xFF"`, and the next byte sequence from byteoffset 1 is valid, so we take out two bytes (one character) `"\xC3\x84"`, and so on.

I think @akr or @naruse can explain the rationale.

----------------------------------------
Bug #19784: String#delete_prefix! problem
https://bugs.ruby-lang.org/issues/19784#change-104325

* Author: inversion (Yura Babak)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
Here is the snippet, and the question is in the comments:

``` ruby
fp = 'with_BOM_16.txt'
body = File.read(fp).force_encoding('UTF-8')
p body # "\xFF\xFE1\u00001\u0000"
p body.start_with?("\xFF\xFE") # true
body.delete_prefix!("\xFF\xFE") # !!! why doesn't this work?
p body # "\xFF\xFE1\u00001\u0000"
p body.start_with?("\xFF\xFE") # true
body[0, 2] = ''
p body # "1\u00001\u0000"
p body.start_with?("\xFF\xFE") # false
```

It behaves the same on Linux (ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]) and Windows (ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x64-mingw-ucrt]).

--
https://bugs.ruby-lang.org/
______________________________________________
ruby-core mailing list -- ruby-core@ml.ruby-lang.org
To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/
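
To make the character-splitting rule from the reply concrete, here is a minimal sketch. It shows how `String#chars` groups the bytes of `"\xFF\xC3\x84"` and adds a byte-wise prefix removal that does not depend on how `delete_prefix!` treats broken strings. The names `SAMPLE`, `BODY`, `char_byte_groups` and `delete_byte_prefix` are hypothetical and are not part of Ruby or of the issue. The bytes in `BODY` are copied from the report instead of being read from `with_BOM_16.txt`.

```ruby
# The example from the reply: one invalid byte followed by one valid
# two-byte UTF-8 character (U+00C4).
SAMPLE = [0xFF, 0xC3, 0x84].pack("C*").force_encoding("UTF-8").freeze

# The bytes from the report: a UTF-16LE BOM and "1\0001\000", tagged as UTF-8.
BODY = [0xFF, 0xFE, 0x31, 0x00, 0x31, 0x00].pack("C*").force_encoding("UTF-8").freeze

# Hypothetical helper: the bytes of each "character" as String#chars sees them.
def char_byte_groups(str)
  str.chars.map(&:bytes)
end

# Hypothetical helper: remove +prefix+ by comparing raw bytes only.
# Both strings are copied as binary (ASCII-8BIT), so UTF-8 validity and
# character boundaries play no role; the result gets the receiver's encoding back.
def delete_byte_prefix(str, prefix)
  str.b.delete_prefix(prefix.b).force_encoding(str.encoding)
end

p SAMPLE.valid_encoding?                 # false
p char_byte_groups(SAMPLE)               # [[255], [195, 132]]
p delete_byte_prefix(BODY, "\xFF\xFE")   # "1\u00001\u0000"
```

The helper only sidesteps the question discussed in this thread; it makes no claim about what `delete_prefix!` itself should do for a broken prefix.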