From: "stringsn88keys (Thomas Powell) via ruby-core"
Date: 2023-01-26T23:59:21+00:00
Subject: [ruby-core:112064] [Ruby master Bug#19383] Time.now.zone encoding for German display language in Windows is incorrect

Issue #19383 has been updated by stringsn88keys (Thomas Powell).

By "console", do you mean irb, or are you referring to PowerShell or cmd.exe/Command Prompt? Windows Terminal produces the same results as well. Also, the string in this case is passed from one process to another, with no user interaction involved.

Comparing Code Page 437 with Windows-1252: 0xE4 is Σ in Code Page 437 and ä in Windows-1252.

The byte sequence of "Mitteleuropäische Zeit" as returned by "Time.now.zone" (which reports its encoding as "IBM437") is (hex values):

=> ["4d", "69", "74", "74", "65", "6c", "65", "75", "72", "6f", "70", "e4", "69", "73", "63", "68", "65", "20", "5a", "65", "69", "74"]

70 e4 69 would be "päi" in Windows-1252, but is "pΣi" in IBM437, the encoding that is reported.

If UTF-8 is assumed, then e4 is the lead byte of a three-byte sequence (the range that includes CJK characters), but the byte that follows it (69) is not a continuation byte, so packing the bytes does not associate e4 with the following byte. This is confirmed by occasional "invalid byte sequence" errors, depending on how the string is picked up.

----------------------------------------
Bug #19383: Time.now.zone encoding for German display language in Windows is incorrect
https://bugs.ruby-lang.org/issues/19383#change-101498

* Author: stringsn88keys (Thomas Powell)
* Status: Open
* Priority: Normal
* ruby -v: 3.1.3
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
OS: Verified on Windows 10 and Windows Server 2022, with Ruby 2.7.7 through 3.1.3
Display language: Verified on German, but may impact other languages in which Time.now.zone returns characters that aren't [A-Za-z].
Time zone: CET (UTC +01:00) Amsterdam, Berlin, ...

Time.now.zone # => "Mitteleurop\xE4ische Zeit"
Time.now.zone.encoding # => #<Encoding:IBM437>
puts Time.now.zone # => "MitteleuropΣische Zeit" (should be "Mitteleuropäische Zeit")
Time.now.zone.encode(Encoding::UTF_8) # => "MitteleuropΣische Zeit"

Trying force_encoding with every encoding in Encoding.list reveals that ISO-8859-(1..16) and Windows-125(0,2,4,7) work to recover the ä in the time zone string:

Time.now.zone.force_encoding(Encoding::WINDOWS_1252) # => "Mitteleurop\xE4ische Zeit"

... but ...

Time.now.zone.force_encoding(Encoding::WINDOWS_1252).encode(Encoding::UTF_8) #=> "Mitteleuropäische Zeit"

Related issue: This improper encoding/rendering caused Ohai's JSON output to be unparseable. The workaround was to force the encoding to Windows-1252.
https://github.com/chef/ohai/pull/1781
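
For reference, the byte-level comparison described in this thread can be reproduced on any platform, without a German Windows install. The sketch below is only an illustration: the byte list is copied from the report above, and `decode_as` is a hypothetical helper written for this example, not part of Ruby or Ohai. It labels the same raw bytes first as IBM437 and then as Windows-1252, transcodes each to UTF-8, and checks whether the bytes are valid when taken as UTF-8.

```ruby
# Bytes reported in this thread for Time.now.zone (hex values).
hex_bytes = %w[4d 69 74 74 65 6c 65 75 72 6f 70 e4 69 73 63 68 65 20 5a 65 69 74]

# Rebuild the raw byte string; pack("C*") yields a binary (ASCII-8BIT) string.
raw_zone = hex_bytes.map { |h| h.to_i(16) }.pack("C*")

# Hypothetical helper (an assumption for this sketch): label a copy of the
# bytes with the given encoding, then transcode the result to UTF-8.
def decode_as(bytes, encoding)
  bytes.dup.force_encoding(encoding).encode(Encoding::UTF_8)
end

as_ibm437 = decode_as(raw_zone, Encoding::IBM437)        # 0xE4 becomes U+03A3 (Σ)
as_cp1252 = decode_as(raw_zone, Encoding::WINDOWS_1252)  # 0xE4 becomes U+00E4 (ä)

# Taken as UTF-8, 0xE4 must be followed by two continuation bytes
# (0x80..0xBF); here it is followed by 0x69, so the string is invalid.
utf8_valid = raw_zone.dup.force_encoding(Encoding::UTF_8).valid_encoding?

puts as_ibm437                      # MitteleuropΣische Zeit
puts as_cp1252                      # Mitteleuropäische Zeit
puts "valid as UTF-8? #{utf8_valid}" # valid as UTF-8? false
```

The second result corresponds to the Windows-1252 workaround mentioned above: the bytes themselves are unchanged, only the encoding label applied before transcoding differs.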