[ruby-core:48755] [ruby-trunk - Bug #7267] Dir.glob on Mac OS X returns unexpected string encodings for unicode file names

From: "kennygrant (Kenny Grant)" <kennygrant@...>
Date: 2012-11-02 11:32:36 UTC
List: ruby-core #48755
Issue #7267 has been updated by kennygrant (Kenny Grant).

File results.txt added

Output of the test.rb script:

Tested on Ruby 2.0.0-preview and 1.9.3 on Mac OS X
1.9.3x86_64-darwin11.4.0
Inline string works as expected
Source encoding: UTF-8
External encoding: UTF-8
Name: "./Testé.txt"
Encoding: UTF-8
Chars: [".", "/", "T", "e", "s", "t", "é", ".", "t", "x", "t"]
Codepoints: [46, 47, 84, 101, 115, 116, 233, 46, 116, 120, 116]
Bytes: [46, 47, 84, 101, 115, 116, 195, 169, 46, 116, 120, 116]
Testing string ./Testé.txt
./TestTEST.txt

File name from Dir.glob does not
Source encoding: UTF-8
External encoding: UTF-8
Name: "./Testé.txt"
Encoding: UTF-8
Chars: [".", "/", "T", "e", "s", "t", "e", "́", ".", "t", "x", "t"]
Codepoints: [46, 47, 84, 101, 115, 116, 101, 769, 46, 116, 120, 116]
Bytes: [46, 47, 84, 101, 115, 116, 101, 204, 129, 46, 116, 120, 116]
Testing string ./Testé.txt
./Testé.txt

Source encoding: UTF-8
External encoding: UTF-8
Name: "./Testé.txt"
Encoding: UTF-8
Chars: [".", "/", "T", "e", "s", "t", "é", ".", "t", "x", "t"]
Codepoints: [46, 47, 84, 101, 115, 116, 233, 46, 116, 120, 116]
Bytes: [46, 47, 84, 101, 115, 116, 195, 169, 46, 116, 120, 116]
Testing string ./Testé.txt
./TestTEST.txt
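
The test.rb script that produced this output is not included in the message. As a rough sketch only, a helper that prints the same kind of diagnostics could look like the following; the method name `describe_string` is hypothetical and is not taken from the attachment.

```ruby
# encoding: utf-8

# Hypothetical helper (not the author's test.rb): returns one diagnostic
# line per view of the string, in the same shape as the output above.
def describe_string(s)
  [
    "Name: #{s.inspect}",
    "Encoding: #{s.encoding}",
    "Chars: #{s.chars.to_a.inspect}",
    "Codepoints: #{s.codepoints.to_a.inspect}",
    "Bytes: #{s.bytes.to_a.inspect}"
  ]
end

# "\u00E9" is the precomposed form of the accented character.
puts describe_string("./Test\u00E9.txt")
```

Comparing the Codepoints and Bytes lines for an inline literal and for a name returned by Dir.glob is what exposes the difference, since the Name and Encoding lines look identical.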
----------------------------------------
Bug #7267: Dir.glob on Mac OS X returns unexpected string encodings for unicode file names
https://bugs.ruby-lang.org/issues/7267#change-32234

Author: kennygrant (Kenny Grant)
Status: Open
Priority: Normal
Assignee: 
Category: 
Target version: 2.0.0
ruby -v: ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-darwin11.4.0]


Tested on Ruby 1.9.3-p194 and ruby-2.0.0-preview1 on Mac OS X 10.7.5

When calling file system methods with Ruby on Mac OS X, it is not possible to manipulate the resulting file name as a normal UTF-8 string, even though it reports its encoding as UTF-8. It seems to be a UTF-8-MAC string, even when the default encoding is set to UTF-8. This is confusing because the string can be manipulated normally except for non-ASCII characters, which seem to be decomposed. So a regexp containing such a character in its precomposed form won't match the string unless the string is first converted from UTF-8-MAC. I'd expect the string to be normal UTF-8, or at least to report that it is not a normal UTF-8 string if it has to be UTF-8-MAC for some reason.
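
The mismatch can be shown without touching the file system. This is a minimal sketch, assuming the two forms seen in the output above (codepoint 233 versus 101 followed by 769); the variable names are illustrative, and `String#unicode_normalize` is an assumption about newer Rubies (2.2 and later), not something available in the versions tested here.

```ruby
# encoding: utf-8

composed   = "./Test\u00E9.txt"   # accented character as one codepoint (U+00E9)
decomposed = "./Teste\u0301.txt"  # "e" followed by a combining acute (U+0301)

p composed.bytes.to_a[6, 2]    # bytes of the precomposed character
p decomposed.bytes.to_a[6, 3]  # bytes of "e" plus the combining accent

# Both strings report UTF-8, yet they do not compare equal.
puts composed.encoding == decomposed.encoding
puts composed == decomposed

# A pattern written with the precomposed character misses the decomposed name.
pattern = /\u00E9/
puts decomposed.gsub(pattern, 'TEST')

# Treating the bytes as UTF-8-MAC and transcoding to UTF-8 composes them.
recomposed = decomposed.encode('UTF-8', 'UTF-8-MAC')
puts recomposed.gsub(pattern, 'TEST')

# Assumption: only on Ruby 2.2+, where String#unicode_normalize exists.
if decomposed.respond_to?(:unicode_normalize)
  puts decomposed.unicode_normalize(:nfc) == composed
end
```

Ruby's regexp engine compares codepoints and does not apply Unicode canonical equivalence, which is why the decomposed name has to be composed before the pattern can match.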

Example, run with a file called Testé.txt in the same folder:

# encoding: utf-8

# Print the string, then the result of replacing a precomposed "é" with TEST.
def transform_string(s)
  puts "Testing string #{s}"
  puts s.gsub(/é/, 'TEST')
end

Dir.glob("./*.txt").each do |f|
  puts "Inline string works as expected"
  s = "./Testé.txt"
  transform_string(s)

  puts "File name from Dir.glob does not"
  transform_string(f)

  puts "Encoded file name works as expected, though it is reported as UTF-8, not UTF-8-MAC"
  f.encode!('UTF-8', 'UTF-8-MAC')
  transform_string(f)
end


-- 
http://bugs.ruby-lang.org/
