From: nobu@...
Date: 2017-02-16T08:03:48+00:00
Subject: [ruby-core:79551] [Ruby trunk Bug#13216] Possible unexpected behaviour reading string starting with a byte order mark

Issue #13216 has been updated by Nobuyoshi Nakada.

Shyouhei Urabe wrote:
> > $ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.bytes.pack("U")'
> > ï
>
> This IS weird. Smells like a bug to me.

Not a bug.
`pack("U")` packs just one codepoint, and U+00EF is LATIN SMALL LETTER I WITH DIAERESIS, which is printed exactly.

```
$ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.bytes.pack("U*")'
ï»¿id
```

----------------------------------------
Bug #13216: Possible unexpected behaviour reading string starting with a byte order mark
https://bugs.ruby-lang.org/issues/13216#change-62991

* Author: Gabriel Giordano
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: ruby 2.4.0p0 (2016-12-24 revision 57164) [x86_64-linux]
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
Maybe the comparison between symbols has an unexpected behaviour.

Tested with ruby 2.4.0

```
$ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.bytes'
239
187
191
105
100

$ echo -n -e 'id' | ruby -e 'puts STDIN.read.bytes'
105
100

$ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.to_sym'
id

$ echo -n -e 'id' | ruby -e 'puts STDIN.read.to_sym'
id

$ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.to_sym == :id'
false

$ echo -n -e 'id' | ruby -e 'puts STDIN.read.to_sym == :id'
true

$ echo -n -e '\xEF\xBB\xBFid' | ruby -e 'puts STDIN.read.bytes.pack("U")'
ï
```

--
https://bugs.ruby-lang.org/
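
For reference, the behaviour discussed in this thread can be reproduced without the shell pipeline. The following is only a minimal sketch, assuming the input is exactly the five bytes shown in the report; the variable names are illustrative and do not come from the issue. It contrasts `pack("U")`, `pack("U*")` and `pack("C*")`, and shows that the leading U+FEFF is what makes the symbol differ from `:id`.

```ruby
# Bytes of the piped input from the report: a UTF-8 BOM followed by "id".
bytes = [0xEF, 0xBB, 0xBF, 0x69, 0x64]

# "U" encodes a single integer as one codepoint, here U+00EF.
one = [bytes.first].pack("U")

# "U*" treats every integer as its own codepoint, so the three BOM bytes
# become U+00EF, U+00BB and U+00BF instead of a single U+FEFF.
each_as_codepoint = bytes.pack("U*")

# "C*" writes the integers back as raw bytes. Relabelled as UTF-8, this is
# the original input string, BOM included.
original = bytes.pack("C*").force_encoding(Encoding::UTF_8)

# The BOM is an ordinary character as far as String#to_sym is concerned.
with_bom    = original.to_sym == :id
without_bom = original.sub(/\A\uFEFF/, "").to_sym == :id

p one, each_as_codepoint.bytesize, original.bytesize, with_bom, without_bom
```

Stripping the BOM with `sub` is just one possible way to get a symbol equal to `:id`; whether the reading side should do that is a choice for the caller, not something this thread prescribes.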