From: Eric Wong
Date: 2016-02-07T22:36:25+00:00
Subject: [ruby-core:73737] Re: [Ruby trunk Feature#12034] RegExp does not respect file encoding directive

nobu@ruby-lang.org wrote:
> Eric Wong wrote:
> > How about fall back to ASCII-8BIT if we detect broken code range?
>
> It may be desirable or undesirable, as it can cause unexpected failure later.

Current behavior causes failures now.

> > ```diff
> > + link = "\xde\xad\xbe\xef".b
> > + File.symlink(link, 'foo')
> > + str = File.readlink('foo')
> > + assert_predicate str, :valid_encoding?, bug12034
> > + assert_equal link, str, bug12034
> > ```
>
> Anyway, "\xde\xad\xbe\xef" is a valid string in some encodings, e.g., EUC-JP, ISO-8859-1, and so on.
> Especially in ISO-8859 encodings, any bytes are valid.

I think that is fine as long as the strings are valid. Returning invalid strings is the main problem, I think; and we should stop doing that. Dir.entries and similar methods have the same problem.
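
For illustration only, here is a minimal sketch of the fallback idea discussed above. The helper name `binary_fallback` is hypothetical and not part of Ruby or of the proposed patch; it only shows how the same four bytes can be invalid under one encoding tag, valid under others, and always valid once retagged as ASCII-8BIT.

```ruby
# Hypothetical helper (an assumption, not an existing Ruby API):
# retag a string as ASCII-8BIT when its bytes are invalid in its
# current encoding; otherwise return it unchanged.
def binary_fallback(str)
  return str if str.valid_encoding?
  str.dup.force_encoding(Encoding::ASCII_8BIT)
end

# Same bytes as in the quoted test.
bytes = "\xde\xad\xbe\xef".b

# The bytes never change; only the encoding tag does.
utf8   = bytes.dup.force_encoding(Encoding::UTF_8)
latin1 = bytes.dup.force_encoding(Encoding::ISO_8859_1)
eucjp  = bytes.dup.force_encoding(Encoding::EUC_JP)

puts utf8.valid_encoding?    # false: 0xBE and 0xEF do not form valid UTF-8 here
puts latin1.valid_encoding?  # true: every byte is valid in ISO-8859-1
puts eucjp.valid_encoding?   # true: two well-formed double-byte characters

# Applying the fallback to the invalid UTF-8 string yields a valid
# binary string with identical bytes.
fixed = binary_fallback(utf8)
puts fixed.encoding == Encoding::ASCII_8BIT
puts fixed.valid_encoding?
puts fixed.bytes == bytes.bytes
```

A string that is already valid (such as `latin1` above) passes through untouched, so under this sketch only strings that would otherwise be returned broken get retagged.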