[ruby-core:30052] Re: [Bug #1685] Some windows unicode path issues remain

From: Bill Kelly <billk@...>
Date: 2010-05-06 10:39:27 UTC
List: ruby-core #30052

U.Nakamura wrote:
> 
> In message "[ruby-core:30012] Re: [Bug #1685] Some windows unicode path issues remain"
>     on May.05,2010 15:35:11, <billk@cts.com> wrote:
> | 
> | It seems rb_stat in file.c calls stat(), but stat does
> | not map to the unicode version.
> 
> Oops, thank you!

Thanks, the test gets much further now.

It now fails at the last line:

  Dir.chdir DNAME_CHINESE
  cwd = Dir.pwd
  ( cwd[(-DNAME_CHINESE.length)..-1] == DNAME_CHINESE ) or raise "cwd check fail"
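
For anyone who wants to try the same check outside the bootstraptest, here is a minimal self-contained sketch. The directory name is a hypothetical stand-in (the real DNAME_CHINESE value is not shown in this message), and the helper name cwd_round_trips? is made up for illustration. It compares raw bytes, so the result does not depend on which encoding Dir.pwd tags its return value with.

```ruby
require "tmpdir"

# Create a directory with a non-ASCII name inside a fresh temp dir,
# chdir into it, and check that Dir.pwd ends with the same bytes.
# String#b gives a binary copy, so the comparison ignores encoding tags.
def cwd_round_trips?(dname)
  Dir.mktmpdir do |tmp|
    target = File.join(tmp, dname)
    Dir.mkdir(target)
    # The block form of Dir.chdir restores the old cwd afterwards,
    # so mktmpdir can clean up the directory when its block ends.
    Dir.chdir(target) do
      Dir.pwd.b.end_with?(dname.b)
    end
  end
end

# Hypothetical stand-in for DNAME_CHINESE (two CJK characters).
dname = "\u4E2D\u6587"
puts cwd_round_trips?(dname)
```

On a platform where getcwd hands back the bytes it was given, this should print true; with the old rb_w32_getcwd on Windows the suffix check is what fails.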


Until now there was only rb_w32_getcwd.  I have added a Unicode
variant, rb_w32_ugetcwd:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Index: include/ruby/win32.h
===================================================================
--- include/ruby/win32.h	(revision 27644)
+++ include/ruby/win32.h	(working copy)
@@ -254,6 +254,7 @@
 extern struct servent  *WSAAPI rb_w32_getservbyport(int, const char *);
 extern int    rb_w32_socketpair(int, int, int, int *);
 extern char * rb_w32_getcwd(char *, int);
+extern char * rb_w32_ugetcwd(char *, int);
 extern char * rb_w32_getenv(const char *);
 extern int    rb_w32_rename(const char *, const char *);
 extern int    rb_w32_urename(const char *, const char *);
@@ -611,7 +612,7 @@
 #define get_osfhandle(h)	rb_w32_get_osfhandle(h)

 #undef getcwd
-#define getcwd(b, s)		rb_w32_getcwd(b, s)
+#define getcwd(b, s)		rb_w32_ugetcwd(b, s)

 #undef getenv
 #define getenv(n)		rb_w32_getenv(n)
Index: win32/win32.c
===================================================================
--- win32/win32.c	(revision 27644)
+++ win32/win32.c	(working copy)
@@ -3692,6 +3692,57 @@
     return p;
 }

+char *
+rb_w32_ugetcwd(char *buffer, int size)
+{
+    char *p;
+    WCHAR *wp;
+    long len, wlen;
+
+    wlen = GetCurrentDirectoryW(0, NULL);  // wlen includes null terminating character
+    if (!wlen) {
+	errno = map_errno(GetLastError());
+	return NULL;
+    }
+
+    wp = malloc(wlen * sizeof(WCHAR));
+    if (!wp) {
+	errno = ENOMEM;
+	return NULL;
+    }
+
+    if (!GetCurrentDirectoryW(wlen, wp)) {
+	errno = map_errno(GetLastError());
+	free(wp);
+        return NULL;
+    }
+
+    p = wstr_to_utf8(wp, &len);
+    free(wp);
+    len += 1;  // len now includes null terminating character
+
+    if (!p) {
+	errno = ENOMEM;
+	return NULL;
+    }
+
+    if (buffer) {
+	if (size < len) {
+	    free(p);
+	    errno = ERANGE;
+	    return NULL;
+	}
+
+	memcpy(buffer, p, len);
+	free(p);
+	p = buffer;
+    }
+
+    translate_char(p, '\\', '/');
+
+    return p;
+}
+
 int
 chown(const char *path, int owner, int group)
 {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


This works, in terms of returning a UTF-8 path string; however,
rb_dir_getwd calls rb_enc_associate(cwd, rb_filesystem_encoding())
on the result, associating the WINDOWS-1252 encoding instead of
UTF-8.
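
To show why the encoding tag matters even when the bytes are right, here is a small Ruby-level sketch. String#force_encoding is used as a rough stand-in for what rb_enc_associate does in C (retag the string without touching its bytes); the helper name retag_report and the sample string are assumptions for illustration only.

```ruby
# Retag a UTF-8 string with another encoding (bytes unchanged) and
# report how the retagged copy compares with the original.
def retag_report(name, wrong_encoding = "Windows-1252")
  mistagged = name.dup.force_encoding(wrong_encoding)
  {
    same_bytes: mistagged.bytes == name.bytes,
    equal: mistagged == name,
    chars_before: name.length,
    chars_after: mistagged.length,
    # Retagging a copy as UTF-8 again restores equality.
    equal_after_retag: mistagged.dup.force_encoding("UTF-8") == name
  }
end

report = retag_report("\u4E2D\u6587")
report.each { |key, value| puts "#{key}: #{value}" }
```

The bytes are identical, but the two strings do not compare equal and report different character counts, which is the kind of mismatch the cwd check above runs into.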

So, I would like to ask: is there a reason
enc_set_filesystem_encoding() should not return UTF-8 now for
Windows?

static int
enc_set_filesystem_encoding(void)
{
    int idx;
#if defined NO_LOCALE_CHARMAP
    idx = rb_enc_to_index(rb_default_external_encoding());
#elif defined _WIN32 || defined __CYGWIN__
    char cp[sizeof(int) * 8 / 3 + 4];
    snprintf(cp, sizeof cp, "CP%d", AreFileApisANSI() ? GetACP() : GetOEMCP());
    idx = rb_enc_find_index(cp);
    if (idx < 0) idx = rb_ascii8bit_encindex();
#else
    idx = rb_enc_to_index(rb_default_external_encoding());
#endif

    enc_alias_internal("filesystem", idx);
    return idx;
}

It seems strange that it still selects non-unicode encodings.
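
The result of this function can be inspected from Ruby through the "filesystem" alias that enc_alias_internal registers. The sketch below only reads values; the helper name filesystem_encoding_report is hypothetical, and what it prints depends entirely on the platform and locale it runs under.

```ruby
# Collect the encodings relevant to path handling on this system.
def filesystem_encoding_report
  {
    filesystem: Encoding.find("filesystem"),     # alias set at startup
    default_external: Encoding.default_external,
    pwd: Dir.pwd.encoding                        # tag on the cwd string
  }
end

filesystem_encoding_report.each { |key, value| puts "#{key}: #{value}" }
```

On the Windows build discussed here, I would expect the filesystem entry to show the ANSI code page (Windows-1252 on my machine) rather than UTF-8.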


  *  *  *


Also, my bootstraptest ran into one more problem.  The cleanup in
Dir.mktmpdir can't delete the Unicode directory entries created by my test:

P:/code/ruby-svn/trunk/lib/fileutils.rb:1307:in `unlink': Invalid argument - C:/temp/bootstraptest20100505-1016-1lvss6a.tmpwd/???? (Errno::EINVAL)
        from P:/code/ruby-svn/trunk/lib/fileutils.rb:1307:in `block in remove_file'
        from P:/code/ruby-svn/trunk/lib/fileutils.rb:1315:in `platform_support'
        from P:/code/ruby-svn/trunk/lib/fileutils.rb:1306:in `remove_file'
        from P:/code/ruby-svn/trunk/lib/fileutils.rb:1295:in `remove'
        from P:/code/ruby-svn/trunk/lib/fileutils.rb:761:in `block in remove_entry'
        from P:/code/ruby-svn/trunk/lib/fileutils.rb:1345:in `block (2 levels) in postorder_traverse'
        from P:/code/ruby-svn/trunk/lib/fileutils.rb:1349:in `postorder_traverse'
        from P:/code/ruby-svn/trunk/lib/fileutils.rb:1344:in `block in postorder_traverse'
        from P:/code/ruby-svn/trunk/lib/fileutils.rb:1343:in `each'
        from P:/code/ruby-svn/trunk/lib/fileutils.rb:1343:in `postorder_traverse'
        from P:/code/ruby-svn/trunk/lib/fileutils.rb:759:in `remove_entry'
        from P:/code/ruby-svn/trunk/lib/fileutils.rb:688:in `remove_entry_secure'
        from P:/code/ruby-svn/trunk/lib/tmpdir.rb:85:in `ensure in mktmpdir'
        from P:/code/ruby-svn/trunk/lib/tmpdir.rb:85:in `mktmpdir'
        from ./bootstraptest/runner.rb:375:in `in_temporary_working_directory'
        from ./bootstraptest/runner.rb:126:in `main'
        from ./bootstraptest/runner.rb:398:in `<main>'
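
The ???? in the path above is consistent with the name having been converted to a code page that cannot represent it, although I have not confirmed that this is what happens inside win32.c. As an analogy only, Ruby's own transcoder shows the same effect; the helper name lossy_name is hypothetical.

```ruby
# Convert a name to a legacy code page, replacing every character the
# target encoding cannot represent. For non-Unicode targets Ruby's
# default replacement character is "?".
def lossy_name(name, target = "Windows-1252")
  name.encode(target, invalid: :replace, undef: :replace)
end

puts lossy_name("\u4E2D\u6587")   # CJK characters have no Windows-1252 form
puts lossy_name("plain.txt")      # ASCII survives unchanged
```

Once the original characters are gone, the resulting name no longer identifies the entry on disk, and since "?" is not a legal character in a Windows file name, an error such as EINVAL from unlink would not be surprising.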

I don't have a patch for this yet.  However, it looks like
in win32.c, routines such as rb_w32_opendir and rb_w32_readdir_with_enc
are already using WCHAR internally!

For example:

DIR *
rb_w32_opendir(const char *filename)
{
    struct stati64 sbuf;
    WIN32_FIND_DATAW fd;
    HANDLE fh;
    WCHAR *wpath;

    if (!(wpath = filecp_to_wstr(filename, NULL)))
	return NULL;

... so it seems if filesystem encoding were considered UTF-8
instead of WINDOWS-1252, then opendir might just work.


The same seems roughly true of rb_w32_readdir_with_enc: at least,
it calls readdir_internal, which uses WCHAR.


So I *think* these are very close to working with UTF-8, but, again,
I don't understand why enc_set_filesystem_encoding() still selects
WINDOWS-1252.


Thanks,

Regards,

Bill

