From: Tomas Matousek
Date: 2009-03-10T02:59:56+09:00
Subject: [ruby-core:22784] Unicode sensitive operations

What's the status of Unicode/culture-sensitive operations in Ruby 1.9.1?

I tried the following:

    # encoding: UTF-8

    str = "combining mark: a\u{30a}"
    p str.index("\u{e5}")   # => nil
    # The result should be 16.

    ["a", "b", "c", "d", "e", "\u{e9}", "\u{e1}"].sort.each { |x| print x.dump, " " }
    # => "a" "b" "c" "d" "e" "\u{e1}" "\u{e9}"
    # The correct result is:
    # "a" "\u{e1}" "b" "c" "d" "e" "\u{e9}"

So it seems that string operations such as sort and index work on the binary representation of the strings (please correct me if I am wrong), without taking the Unicode properties of the characters into consideration.

Also, does (or will) Ruby have a notion of culture (used for collation, etc.)?

Tomas