From: Heesob Park <redmine@...>
Date: 2010-12-10T13:59:44+09:00
Subject: [ruby-core:33661] [Ruby 1.9-Feature#4145][Open] The result of UTF-16 encoded string concatenation

Feature #4145: The result of UTF-16 encoded string concatenation
http://redmine.ruby-lang.org/issues/show/4145

Author: Heesob Park
Status: Open, Priority: Normal
Category: core, Target version: 1.9.x

C:\work>irb
irb(main):001:0> a = 'abc'.encode('UTF-16')
=> "\uFEFFabc"
irb(main):002:0> b = a + a
=> "\uFEFFabc\uFEFFabc"
irb(main):003:0> c = b.encode('UTF-8')
=> "abc\uFEFFabc"
irb(main):004:0> d = b.encode('US-ASCII')
Encoding::UndefinedConversionError: U+FEFF to US-ASCII in conversion from UTF-16 to UTF-8 to US-ASCII
        from (irb):4:in `encode'
        from (irb):4
        from c:/usr/bin/irb.bat:19:in `<main>'
irb(main):005:0> b << b
=> "\uFEFFabc\uFEFFabc\uFEFFabc\uFEFFabc"
irb(main):006:0> b * 3
=> "\uFEFFabc\uFEFFabc\uFEFFabc\uFEFFabc\uFEFFabc\uFEFFabc\uFEFFabc\uFEFFabc\uFEFFabc\uFEFFabc\uFEFFabc\uFEFFabc"
irb(main):007:0>

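I understand why this happens: each UTF-16 string carries its own BOM (\uFEFF), and concatenation keeps every one of them. Is there any possibility of generating only one \uFEFF, at the start of the result?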
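
For illustration, here is a minimal sketch of one possible workaround, not a proposal for how the core should behave. It relies on the behaviour shown in the session above: converting from UTF-16 to UTF-8 consumes the leading BOM, and converting to UTF-16 writes one. The helper name concat_utf16 is hypothetical and not part of Ruby.

```ruby
# Hypothetical helper (not part of Ruby): joins UTF-16 strings so that the
# result carries a single leading BOM instead of one per operand.
def concat_utf16(*strings)
  # Converting each operand from UTF-16 to UTF-8 consumes its leading BOM.
  utf8 = strings.map { |s| s.encode('UTF-8') }.join
  # Converting back to UTF-16 once writes exactly one BOM at the start.
  utf8.encode('UTF-16')
end

a = 'abc'.encode('UTF-16')
b = concat_utf16(a, a)

p b.encode('UTF-8')     # "abcabc"
p b.encode('US-ASCII')  # "abcabc", no UndefinedConversionError
```

The round trip through UTF-8 costs two extra conversions per call. Building the text in UTF-8 (or UTF-16BE/UTF-16LE, which carry no BOM) and encoding to UTF-16 only once, at output time, avoids the problem in the first place.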


----------------------------------------
http://redmine.ruby-lang.org