[#53072] [ruby-trunk - Feature #7994][Open] Make iterators pass an implicit named parameter `iteration` to the executed block — "alexeymuranov (Alexey Muranov)" <redmine@...>

10 messages 2013/03/01

[#53097] [ruby-trunk - Bug #8000][Open] "require 'tk'" segfaults on 64-bit linux with Tk 8.6 — "edmccard (Ed McCardell)" <edmccard@...>

25 messages 2013/03/02

[#53137] [ruby-trunk - Bug #8017][Open] Got segmentation fault on attempt to install ruby 2.0.0-p0 on Mac 10.6.8 via RVM — "adantel (Alex Filatau)" <filatau@...>

9 messages 2013/03/05

[#53168] [ruby-trunk - Bug #8034][Open] File.expand_path('something', '~') do not include home path — "rap-kasta (Pavel Manylov)" <rapkasta@...>

12 messages 2013/03/06

[#53199] [ruby-trunk - Bug #8040][Open] Unexpect behavior when using keyword arguments — "pabloh (Pablo Herrero)" <pablodherrero@...>

11 messages 2013/03/07

[#53203] [ruby-trunk - Feature #8042][Open] Add Addrinfo#socket to create a socket that is not connected or bound — "drbrain (Eric Hodel)" <drbrain@...7.net>

12 messages 2013/03/07

[#53248] Github commit log should not be used as references on redmine — Marc-Andre Lafortune <ruby-core-mailing-list@...>

Github commit log should not be used as references on redmine. E.g:

10 messages 2013/03/09

[#53386] [CommonRuby - Feature #8088][Open] Method#parameters (and friends) should provide useful information about core methods — "headius (Charles Nutter)" <headius@...>

14 messages 2013/03/13

[#53412] [CommonRuby - Feature #8096][Open] introduce Time.current_timestamp — "vipulnsward (Vipul Amler)" <vipulnsward@...>

34 messages 2013/03/14

[#53439] [ruby-trunk - Bug #8100][Open] Segfault in ruby-2.0.0p0 — "judofyr (Magnus Holm)" <judofyr@...>

22 messages 2013/03/15

[#53478] [ruby-trunk - Feature #8107][Open] [patch] runtime flag to track object allocation metadata — "tmm1 (Aman Gupta)" <ruby@...1.net>

20 messages 2013/03/16

[#53498] [ruby-trunk - Feature #8110][Open] Regex methods not changing global variables — "prijutme4ty (Ilya Vorontsov)" <prijutme4ty@...>

21 messages 2013/03/18

[#53502] [ruby-trunk - Bug #8115][Open] make install DESTDIR=/my/install/path fails — "vo.x (Vit Ondruch)" <v.ondruch@...>

11 messages 2013/03/18

[#53688] [ruby-trunk - Feature #8158][Open] lightweight structure for loaded features index — "funny_falcon (Yura Sokolov)" <funny.falcon@...>

27 messages 2013/03/24

[#53692] [ruby-trunk - Bug #8159][Open] Build failure introduced by Rinda changes — "luislavena (Luis Lavena)" <luislavena@...>

22 messages 2013/03/24

[#53733] [ruby-trunk - Bug #8165][Open] Problems with require — "Krugloff (Alexandr Kruglov)" <mr.krugloff@...>

12 messages 2013/03/26

[#53742] [ruby-trunk - Bug #8168][Open] Feature request: support for (single) statement lambda syntax/definition — "garysweaver (Gary Weaver)" <garysweaver@...>

9 messages 2013/03/26

[#53765] [ruby-trunk - Bug #8174][Open] AIX header file conflict with rb_hook_list_struct — "edelsohn (David Edelsohn)" <dje.gcc@...>

11 messages 2013/03/27

[#53808] [ruby-trunk - Feature #8181][Open] New flag for strftime that supports adding ordinal suffixes to numbers — "tkellen (Tyler Kellen)" <tyler@...>

10 messages 2013/03/28

[#53811] [ruby-trunk - Bug #8182][Open] XMLRPC request fails with "Wrong size. Was 31564, should be 1501" — "tsagadar (Marcel Mueller)" <marcel.mueller@...>

28 messages 2013/03/28

[#53849] [ruby-trunk - Feature #8191][Open] Short-hand syntax for duck-typing — "wardrop (Tom Wardrop)" <tom@...>

48 messages 2013/03/31

[#53850] An evaluation of 2.0.0 release — Yusuke Endoh <mame@...>

Let's look back at 2.0.0 release so that we can do better next time.

12 messages 2013/03/31

[ruby-core:53493] [ruby-trunk - Bug #7267] Dir.glob on Mac OS X returns unexpected string encodings for unicode file names

From: "naruse (Yui NARUSE)" <naruse@...>
Date: 2013-03-18 06:22:37 UTC
List: ruby-core #53493
Issue #7267 has been updated by naruse (Yui NARUSE).


Attaching my proof of concept code.

diff --git a/dir.c b/dir.c
index d4b3dd3..126c27e 100644
--- a/dir.c
+++ b/dir.c
@@ -81,6 +81,84 @@ char *strchr(char*,char);
 
 #define rb_sys_fail_path(path) rb_sys_fail_str(path)
 
+#if defined(__APPLE__)
+
+#include <sys/param.h>
+#include <sys/mount.h>
+
+static int
+is_hfs(const char *path, size_t len)
+{
+    struct statfs buf;
+    char *p = ALLOCA_N(char, len+1);
+    memcpy(p, path, len);
+    p[len] = 0;
+    if (statfs(p, &buf) == 0) {
+	return buf.f_type == 17; /* HFS on darwin */
+    }
+    return FALSE;
+}
+
+/*
+ * vpath is UTF8-MAC string
+ */
+VALUE
+compose_utf8_mac(VALUE vpath)
+{
+    const char *path0 = RSTRING_PTR(vpath);
+    const char *p = path0;
+    const char *subpath = p;
+    const char *pend = RSTRING_END(vpath);
+    int hfs_p;
+    VALUE result;
+    rb_encoding *utf8, *utf8mac;
+    static VALUE utf8enc;
+
+    if (*p++ != '/')
+	return vpath;
+
+    if (!utf8enc) {
+	utf8mac = rb_enc_find("UTF8-MAC");
+	utf8 = rb_utf8_encoding();
+	utf8enc = rb_enc_from_encoding(utf8);
+    }
+
+    result = rb_str_buf_new(RSTRING_LEN(vpath));
+    hfs_p = is_hfs("/", 1);
+
+    for (; p < pend; p++) {
+	if (*p != '/')
+	    continue;
+	if (hfs_p != is_hfs(path0, p-path0)) {
+	    if (hfs_p) {
+		VALUE str = rb_enc_str_new(subpath, p - subpath, utf8mac);
+		rb_str_buf_append(result, rb_str_encode(str, utf8enc, 0, Qnil));
+	    }
+	    else {
+                rb_str_buf_cat(result, subpath, p - subpath);
+	    }
+	    hfs_p = !hfs_p;
+	    subpath = p;
+	}
+    }
+    if (hfs_p) {
+	VALUE str = rb_enc_str_new(subpath, p - subpath, utf8mac);
+	rb_str_buf_append(result, rb_str_encode(str, utf8enc, 0, Qnil));
+    }
+    else {
+	rb_str_buf_cat(result, subpath, p - subpath);
+    }
+
+    rb_enc_associate(result, utf8);
+    return result;
+}
+static VALUE
+dir_s_compose_path(VALUE dir, VALUE path)
+{
+    return compose_utf8_mac(path);
+}
+#endif
+
 #define FNM_NOESCAPE	0x01
 #define FNM_PATHNAME	0x02
 #define FNM_DOTMATCH	0x04
@@ -2120,6 +2198,7 @@ Init_Dir(void)
     rb_define_singleton_method(rb_cDir,"delete", dir_s_rmdir, 1);
     rb_define_singleton_method(rb_cDir,"unlink", dir_s_rmdir, 1);
     rb_define_singleton_method(rb_cDir,"home", dir_s_home, -1);
+    rb_define_singleton_method(rb_cDir,"compose", dir_s_compose_path, 1);
 
     rb_define_singleton_method(rb_cDir,"glob", dir_s_glob, -1);
     rb_define_singleton_method(rb_cDir,"[]", dir_s_aref, -1);

----------------------------------------
Bug #7267: Dir.glob on Mac OS X returns unexpected string encodings for unicode file names
https://bugs.ruby-lang.org/issues/7267#change-37687

Author: kennygrant (Kenny Grant)
Status: Assigned
Priority: Normal
Assignee: duerst (Martin D端rst)
Category: 
Target version: next minor
ruby -v: ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-darwin11.4.0]


Tested on Ruby 1.9.3-p194 and ruby-2.0.0-preview1 on Mac OS X 10. 7.5

When calling file system methods with Ruby on Mac OS X, it is not possible to manipulate the resulting file name as a normal UTF-8 string, even though it reports the encoding as UTF-8. It seems to be a UTF-8-MAC string, even when the default encoding is set to UTF-8. This leads to confusion as the string can be manipulated normally except for any unicode characters, which seem to be decomposed. So a regexp using utf-8 characters won't work on the string, unless it is first converted from UTF-8-MAC. I'd expect the string encoding to be UTF-8, or at least to report that it is not a normal UTF-8 string if it has to be UTF-8-MAC for some reason. 

Example, run with a file called Test辿.txt in the same folder:

def transform_string s
   puts "Testing string #{s}"
   puts s.gsub(/辿/,'TEST')
end

Dir.glob("./*.txt").each do |f|  
  puts "Inline string works as expected" 
   s = "./Test辿.txt" 
   puts transform_string s

   puts "File name from Dir.glob does not" 
   puts transform_string f
   
   puts "Encoded file name works as expected, though it is reported as UTF-8, not UTF-8-MAC" 
   f.encode!('UTF-8','UTF-8-MAC')
   puts transform_string f
end


-- 
http://bugs.ruby-lang.org/

In This Thread