From: "phasis68 (Heesob Park)"
Date: 2013-07-16T11:27:54+09:00
Subject: [ruby-core:56036] [ruby-trunk - Bug #8642][Open] Unexpected behavior of String#split with UTF-32 encoded string.

Issue #8642 has been reported by phasis68 (Heesob Park).

----------------------------------------
Bug #8642: Unexpected behavior of String#split with UTF-32 encoded string.
https://bugs.ruby-lang.org/issues/8642

Author: phasis68 (Heesob Park)
Status: Open
Priority: Normal
Assignee:
Category:
Target version:
ruby -v: ruby 2.1.0dev (2013-07-16 trunk 41990) [i386-mingw32]
Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN

The recent encoding-related revisions (r41964, r41965, r41968, r41969, r41970, r41973) caused many test failures and errors.

http://ci.rubyinstaller.org/job/ruby-trunk-x86-test-all/1796/console

Here is a simple test case. After String#split, the result contains garbage and the receiver itself is corrupted.

C:\work>irb
irb(main):001:0> a = 'test'.encode('UTF-32BE')
=> "test"
irb(main):002:0> a.split
=> ["\u{F8493B6D}\u{12000000}\u{3000000}\u{E06DA102}"]
irb(main):003:0> a
=> "\u{5F203D20}\u{4952422E}\u{43757272}\u{656E7443}"

C:\work>irb
irb(main):001:0> a = 'abc,def'.encode('UTF-32LE')
=> "abc,def"
irb(main):002:0> sep = ','.encode('UTF-32LE')
=> ","
irb(main):003:0> a.split(sep)
=> ["abc", "\u{65746E6F}\u{6C2E7478}\u{5F747361}"]
irb(main):004:0> a
=> "\u{203D205F}\u{2E425249}\u{72727543}\u{43746E65}\u{65746E6F}\u{6C2E7478}\u{5F747361}"
irb(main):005:0>

-- 
http://bugs.ruby-lang.org/
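
The two irb sessions above can be condensed into a standalone script. This is only a sketch of the expected behavior, not code from the Ruby test suite: `check_split` is a hypothetical helper introduced here for illustration, and it assumes the source file is UTF-8 (the default since Ruby 2.0).

```ruby
# Hypothetical helper (not part of Ruby or its test suite): encodes `text`,
# splits it, and reports the pieces in UTF-8 together with a flag telling
# whether the receiver's bytes and encoding survived the call.
def check_split(text, encoding, separator = nil)
  str = text.encode(encoding)
  bytes_before = str.bytes
  encoding_before = str.encoding

  parts = separator ? str.split(separator.encode(encoding)) : str.split

  {
    parts: parts.map { |part| part.encode('UTF-8') },
    receiver_intact: str.bytes == bytes_before && str.encoding == encoding_before
  }
end

# Case 1 from the report: no separator, UTF-32BE.
p check_split('test', 'UTF-32BE')

# Case 2 from the report: explicit separator, UTF-32LE.
p check_split('abc,def', 'UTF-32LE', ',')
```

On a build without the regression, the first call should report the single part "test" and the second the parts "abc" and "def", with the receiver intact in both cases. On the affected trunk build, the transcoding of the garbage parts back to UTF-8 may itself raise, which is also a sign of the bug.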