From: "RRRoy BBBean" <rrroybbbean@...>
Date: 2017-11-03T16:25:14-05:00
Subject: [ruby-core:83662] Re: [Ruby trunk Feature#14077] Add Encoding::FILESYSTEM and Encoding::LOCALE constants

On Fri, 2017-11-03 at 18:34 +0000, shevegen@gmail.com wrote:
> Issue #14077 has been updated by shevegen (Robert A. Heiler).
> I am in agreement with the feature-suggestion. Not sure whether
> it should be a constant or a method or both but I agree that it
> may be useful to have direct support for this in ruby.
...
> Matz said several times that one (core?) part of ruby's philosophy
> is the "human aspect" aka how something is used with ruby. I think
> that this is also a reason why the ruby core team often likes
> to see "real world use cases" to determine how/if something is
> used.

Many of my cheesy Ruby scripts manipulate directory hierarchies on both
Windows and Linux, often to fix problems that occur when you share an
NTFS-formatted external disk drive between systems.

This is one of the most frequent things that I have to do, since many
of my files (and some directories) use Korean UTF-8 characters:

Dir.entries('/.../mydir/', :encoding=>'UTF-8').each do |base|

I know that I must specify the encoding on Windows 7, or else it
assumes Windows-1252 and messes up multi-byte characters. This code
also works fine on Ubuntu 14, Fedora 24 and Debian 9, although I don't
even know what the default or filesystem encoding is on Linux systems.

FYI, on Windows 7, I work exclusively with NTFS and FAT32. On Linux, I
routinely work with EXT4, NTFS and FAT32. Are you aware that the NTFS
driver for Linux allows you to create filesystem objects with names
that are unworkable under Windows (names with embedded colons, for
instance)?
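
To illustrate the kind of name I mean, here is a rough sketch of a check
for characters that Windows rejects in a file name. The constant and
method names are made up for this example, and the check is deliberately
incomplete: it ignores reserved device names such as CON and names that
end in a dot or a space.

```ruby
# Characters Windows does not allow in a single file name component:
# < > : " / \ | ? * and the ASCII control characters.
WINDOWS_RESERVED_CHARS = /[<>:"\/\\|?*\x00-\x1F]/

# Hypothetical helper: true if the base name contains a reserved character.
def windows_unsafe?(basename)
  basename.match?(WINDOWS_RESERVED_CHARS)
end

# Sample names only; in a real script these would come from Dir.entries.
['notes.txt', 'notes: draft.txt', 'what?.txt'].each do |base|
  puts "#{base.inspect} unsafe on Windows: #{windows_unsafe?(base)}"
end
```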

I had to go to the Internet to figure out that I needed to use
:encoding=>'UTF-8' to properly handle multi-byte characters on Windows
7. It would have been nice to have Ruby tell me what the default
encodings were. That's a lame reason for inclusion of this proposed
feature, but it's all I have at the moment.
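
For what it's worth, Ruby can already report these values, just not
through the proposed constants. A minimal sketch, assuming the existing
Encoding.default_external, Encoding.default_internal and the special
names 'locale' and 'filesystem' accepted by Encoding.find; the method
name encoding_report is my own invention:

```ruby
# Collect the encodings Ruby has picked up from the environment.
def encoding_report
  {
    default_external: Encoding.default_external,
    default_internal: Encoding.default_internal, # nil unless explicitly set
    locale: Encoding.find('locale'),
    filesystem: Encoding.find('filesystem')
  }
end

encoding_report.each do |name, enc|
  puts format('%-17s %s', name, enc.inspect)
end
```

The output depends on the platform and locale settings, so I would not
rely on any particular value.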

In the past, I ran into another problem, where I found embedded text of
a character type different than the enclosing text. I still find that
today, in filenames and text that mix English, Japanese and Korean
into a single string or file. I blame word-processors for this
mess. I used to jump through hoops to handle the problem, then I got
smart and just forced the encoding to UTF-8, replacing bad characters
with ''. In this situation, I don't see how knowing the filesystem or
default encodings would help, since the person who created the
Frankenstein-text didn't realize what they were doing.
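
Roughly, the "force it to UTF-8" approach looks like the sketch below.
It assumes String#scrub (Ruby 2.1 or later); the helper name force_utf8
is hypothetical, and this is not necessarily the exact code in my
scripts.

```ruby
# Hypothetical helper: treat the bytes as UTF-8 and drop any byte
# sequence that is not valid UTF-8 (replace it with '').
def force_utf8(text)
  text.dup.force_encoding(Encoding::UTF_8).scrub('')
end

mixed = "abc\xFFdef".b    # binary string; \xFF is never valid in UTF-8
clean = force_utf8(mixed)
puts clean                # the invalid byte is gone
puts clean.valid_encoding?
```

Valid multi-byte characters survive untouched; only the bytes that
cannot be decoded as UTF-8 are removed.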

