From: thomas@...
Date: 2014-04-07T19:40:16+00:00
Subject: [ruby-core:61898] [ruby-trunk - Bug #9712] Dir.entries replace Unicode character with questionmarks

Issue #9712 has been updated by Thomas Thomassen.

Usaku NAKAMURA wrote:
> check Dir.entries('Foo', encoding: 'utf-8')

Ah, well that worked. I'd been referring to the Ruby 2.0.0 docs, where this argument is missing:
http://www.ruby-doc.org/core-2.0/Dir.html#method-c-entries

But why is this needed? On my machine it returns the strings in Windows-1252 by default, which is the same as Encoding.find('filesystem'). I guess the default is based on that?

But for Windows this is really awkward. Windows-1252 is the compatibility codepage, but the file system itself is perfectly capable of handling Unicode characters. I see Ruby explicitly calls the A versions of the Windows file functions instead of declaring the UNICODE flag - this makes all system calls treat Ruby with compatibility handling.

The Windows file system isn't actually Windows-1252 encoded, or encoded in any other encoding Ruby currently reports. It's all Unicode - I can use any character I like, so why isn't Ruby just returning the results from the file functions as Unicode?

----------------------------------------
Bug #9712: Dir.entries replace Unicode character with questionmarks
https://bugs.ruby-lang.org/issues/9712#change-46106

* Author: Thomas Thomassen
* Status: Rejected
* Priority: High
* Assignee: cruby-windows
* Category: platform/windows
* Target version: current: 2.2.0
* ruby -v: ruby 2.2.0dev (2014-04-07 trunk 45528) [i386-mswin32_100]
* Backport: 2.0.0: UNKNOWN, 2.1: UNKNOWN
----------------------------------------
My basis when testing this is that I have a computer with an English OS, codepage Windows-1252. The tests might yield different results if the Windows codepage is different, so please pay attention to that if you are unable to reproduce.

Given a folder named "Foo" which contains a sub-folder "てすと" ("\u3066\u3059\u3068"), Dir.entries("Foo") will return:

[".", "..", "???"]

The characters that don't fit my filesystem codepage are translated into question marks. I would have expected the returned strings to be in some Unicode format.

-- 
https://bugs.ruby-lang.org/
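
For anyone trying to reproduce this, here is a minimal sketch of the workaround suggested in the thread: passing encoding: 'utf-8' to Dir.entries. The helper names utf8_entries and demo_listing are hypothetical and exist only for this illustration. The folder is created in a temporary directory rather than a real "Foo", and what you get without the encoding option will depend on your own codepage or locale.

```ruby
require 'tmpdir'

# Hypothetical helper (not part of Ruby): list a directory and request the
# names as UTF-8 rather than in the default filesystem encoding.
def utf8_entries(path)
  Dir.entries(path, encoding: 'utf-8')
end

# Hypothetical helper: build a throwaway folder containing the sub-folder
# from the report, and return its sorted listing. The temporary directory
# is removed again when the block finishes.
def demo_listing
  Dir.mktmpdir('Foo') do |foo|
    Dir.mkdir(File.join(foo, "\u3066\u3059\u3068"))
    utf8_entries(foo).sort
  end
end

if __FILE__ == $PROGRAM_NAME
  # Shows which encoding Ruby would use when no encoding: option is given.
  puts "filesystem encoding: #{Encoding.find('filesystem')}"
  demo_listing.each { |name| puts "#{name.encoding}: #{name.inspect}" }
end
```

Comparing the output against a plain Dir.entries(path) call on the same folder should show whether the default filesystem encoding is what turns the name into question marks on a given machine.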