From: Vincent Isambart
Date: 2010-01-24T12:10:46+09:00
Subject: [ruby-core:27748] [Bug #2636] Incorrect UTF-16 string length

Bug #2636: Incorrect UTF-16 string length
http://redmine.ruby-lang.org/issues/show/2636

Author: Vincent Isambart
Status: Open, Priority: Normal
Category: M17N, Target version: 1.9.2
ruby -v: ruby 1.9.2dev (2010-01-22 trunk 26370) [x86_64-darwin10.2.0]

str = "\xDC\x0B\xD8\x40".force_encoding(Encoding::UTF_16BE)
str.length #=> 3

This string was made by swapping the two 16-bit code units of a UTF-16 character outside the BMP. Its length should be 2, because it consists of two (unpaired) surrogates, not 3 characters.

The strangest part is that the length agrees with how the string is displayed by #inspect ("\xDC\u0BD8\x40"), but not with what #[] does: if the length is 3, why does str[2] return nil?

----------------------------------------
http://redmine.ruby-lang.org
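
For illustration only (a minimal sketch, not part of the original report): decoding the same four bytes as big-endian 16-bit code units shows a low surrogate followed by a high surrogate, i.e. two unpaired surrogates, which is why a length of 2 is expected.

bytes = "\xDC\x0B\xD8\x40"       # same bytes as in the report
units = bytes.unpack("n*")       # big-endian 16-bit code units => [0xDC0B, 0xD840]

units.each do |u|
  kind = case u
         when 0xD800..0xDBFF then "high surrogate"
         when 0xDC00..0xDFFF then "low surrogate"
         else "other code unit"
         end
  printf("U+%04X  %s\n", u, kind)
end
# Prints:
#   U+DC0B  low surrogate
#   U+D840  high surrogate
#
# A low surrogate followed by a high surrogate can never form a pair,
# so each code unit stands alone and the character count should be 2.

The count of 3 that Ruby 1.9.2dev reports apparently comes from resynchronizing in the middle of a code unit: the broken byte \xDC, the valid character U+0BD8 (bytes 0B D8), then the broken byte \x40, which matches the #inspect output quoted above.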