Re: Win32 Non-ASCII Filename Access

From: Austin Ziegler <halostatue@...>
Date: 2005-03-09 18:48:15 UTC
List: ruby-core #4540
On Thu, 10 Mar 2005 02:25:45 +0900, Berger, Daniel
<Daniel.Berger@qwest.com> wrote:
>> From: Austin Ziegler [mailto:halostatue@gmail.com]
>>> IF the UNICODE macro is set. Basically, most Windows functions
>>> look like this:
>> I know.
>> 
>> UNFORTUNATELY, to get that to work, you also have to use TCHAR as
>> your character type. That is, instead of:
>>   char*       spec = "C:\\Foo\\Bar\\*.*";
>> you need:
>>   TCHAR* spec = _TEXT("C:\\Foo\\Bar\\*.*");
>> This may cause *other* problems with Ruby, since it seems to be
>> written around the assumption that a character is a single byte
>> wide.
> That isn't my understanding, though perhaps I'm not "getting it".
> From what I've read, TCHAR is 2 bytes wide if UNICODE is defined.

Okay -- let's try again. Ruby isn't written in Microsoft's dialect
of C++. It doesn't use TCHAR. It uses char. Saying that Ruby needs
to use TCHAR would be a Bad Thing.

I don't have the Ruby code in front of me, but a lot of things
probably wouldn't work quite the same if we used the UNICODE macro.
String#each_byte, anyone?

Ruby doesn't know anything about TCHAR, and if you try to fit a
TCHAR into char, you're asking for trouble.
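To illustrate the mismatch, here is a minimal sketch in portable C. The
helper names are hypothetical and are not taken from Ruby's source; the
sketch only assumes that wchar_t is wider than char, as it is when
TCHAR maps to wchar_t under UNICODE.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: the length a wide string appears to have when
   byte-oriented code treats its buffer as plain char. */
size_t length_seen_as_bytes(const wchar_t *wide)
{
    /* Reading the buffer through char* is legal C, but strlen stops at
       the first zero byte, which for ASCII text sits inside the very
       first wide character. */
    return strlen((const char *)wide);
}

/* Hypothetical helper: the number of bytes the characters really
   occupy, not counting the terminator. */
size_t real_size_in_bytes(const wchar_t *wide)
{
    return wcslen(wide) * sizeof(wchar_t);
}
```

For L"C:\\Foo", length_seen_as_bytes reports at most 1 (1 on a
little-endian machine, 0 on a big-endian one), while real_size_in_bytes
reports 6 * sizeof(wchar_t). Code that assumes one byte per character
would silently truncate such a string.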

>> Ultimately, the only acceptable way to do this is to NOT use
>> TCHAR, but to explicitly use the wide versions of functions and
>> do MultiByteToWideChar and WideCharToMultiByte calls as necessary.
>> The best choice for this will be, of course, UTF-8 (CP_UTF8), but
>> if we're not in UTF-8 mode, we can always use ANSI (CP_ACP) and
>> get the exact same behaviour. Better, we get to choose the mode
>> of behaviour at run-time.
> Ugh. I really hope this isn't necessary.

Why not? Properly abstracted, this isn't a big deal. I've just
converted my company's primary client application to supporting
Unicode -- while maintaining ASCII/ANSI support -- simply by doing
this.

I maintain that if we try to recompile Ruby to use TCHAR as wchar_t,
we're simply going to regret it. I also believe that using UNICODE
will cause a number of extensions to fail that haven't been written
with TCHAR in mind (Oracle comes to mind).

Calling the wide functions explicitly is probably the best way to
handle this issue, and it will be much simpler when Ruby gets
encoding-capable Strings, but we can already solve some of the issue
by using MultiByteToWideChar and WideCharToMultiByte (and they can
be done *safely*).
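
A rough sketch of what such a conversion could look like follows. The
function name bytes_to_wide is hypothetical and nothing here comes from
Ruby's source. On Windows it uses MultiByteToWideChar with a code page
chosen at run time (CP_UTF8 or CP_ACP). The non-Windows branch is
purely an assumption so that the sketch compiles elsewhere; it ignores
the code page and uses standard mbstowcs instead.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper, not part of Ruby: convert a NUL-terminated byte
   string to a newly allocated wide string. The caller frees the
   result. Returns NULL on failure. */
wchar_t *bytes_to_wide(unsigned int code_page, const char *src)
{
#ifdef _WIN32
    /* First call: ask how many wide characters are needed. Passing -1
       as the length means src is NUL-terminated, so the count
       includes the terminator. */
    int needed = MultiByteToWideChar(code_page, 0, src, -1, NULL, 0);
    if (needed <= 0)
        return NULL;

    wchar_t *dst = malloc((size_t)needed * sizeof(wchar_t));
    if (dst == NULL)
        return NULL;

    /* Second call: do the conversion into a buffer of known size. */
    if (MultiByteToWideChar(code_page, 0, src, -1, dst, needed) == 0) {
        free(dst);
        return NULL;
    }
    return dst;
#else
    /* Assumption for illustration only: no code pages off Windows, so
       fall back to the current C locale via mbstowcs. */
    (void)code_page;

    /* Each wide character consumes at least one byte of input, so
       strlen(src) + 1 wide characters is always enough room. */
    size_t capacity = strlen(src) + 1;
    wchar_t *dst = malloc(capacity * sizeof(wchar_t));
    if (dst == NULL)
        return NULL;

    if (mbstowcs(dst, src, capacity) == (size_t)-1) {
        free(dst);
        return NULL;
    }
    return dst;
#endif
}
```

On Windows the result would then be handed to an explicitly wide
function such as FindFirstFileW, with no dependence on the UNICODE
macro or on TCHAR. Sizing the buffer from the first call before
converting is what keeps the conversion safe.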

>> I do NOT recommend the use of TCHAR and _TEXT; they are
>> Microsoftisms, and they won't be compatible with standard Ruby, I
>> don't think.
> I definitely think we should test this. Any suggestions how?

I'm not sure. What I do know is that there are places in the code --
including in win32/win32.c and win32/dir.h -- that explicitly define
buffers as char. They're not TCHAR. TCHAR is the only safe way to
use the damnable UNICODE macro.

-austin
-- 
Austin Ziegler * halostatue@gmail.com
               * Alternate: austin@halostatue.ca
