[#48729] [ANN] ruby 2.0.0-preview1 released — Yusuke Endoh <mame@...>
Japanese later; 日本語はあとで
Hi,
Hello Vit,
2012/11/6 Yusuke Endoh <mame@tsg.ne.jp>
[#48745] [ruby-trunk - Bug #7267][Open] Dir.glob on Mac OS X returns unexpected string encodings for unicode file names — "kennygrant (Kenny Grant)" <kennygrant@...>
[#48773] [ruby-trunk - Bug #7269][Open] Refinement doesn't work if using locate after method — "ko1 (Koichi Sasada)" <redmine@...>
(2012/11/03 10:11), headius (Charles Nutter) wrote:
(2012/11/03 10:36), SASADA Koichi wrote:
[#48774] [ruby-trunk - Feature #4085] Refinements and nested methods — "shugo (Shugo Maeda)" <redmine@...>
[#48819] [ruby-trunk - Feature #4085] Refinements and nested methods — "headius (Charles Nutter)" <headius@...>
[#48820] [ruby-trunk - Bug #7271][Assigned] Refinement doesn't seem lexical — "ko1 (Koichi Sasada)" <redmine@...>
[#48847] [ruby-trunk - Bug #7274][Open] UnboundMethods should be bindable to any object that is_a?(owner of the UnboundMethod) — "rits (First Last)" <redmine@...>
[#48882] [ruby-trunk - Feature #4085] Refinements and nested methods — "headius (Charles Nutter)" <headius@...>
[#48964] [Backport93 - Backport #7285][Assigned] some failures on RubyInstaller CI — "usa (Usaku NAKAMURA)" <usa@...>
[#48988] [ruby-trunk - Feature #7292][Open] Enumerable#to_h — "marcandre (Marc-Andre Lafortune)" <ruby-core@...>
[#48997] [ruby-trunk - Feature #7297][Open] map_to alias for each_with_object — "nathan.f77 (Nathan Broadbent)" <nathan.f77@...>
[#49018] [ruby-trunk - Feature #7299][Open] Ruby should not completely ignore blocks. — "marcandre (Marc-Andre Lafortune)" <ruby-core@...>
[#49078] Re: [ruby-cvs:44714] marcandre:r37544 (ruby_1_9_3): merge revisions r33453, r37542: — "U.Nakamura" <usa@...>
Hello,
[#49119] ID_ALLOCATOR ? — Roger Pack <rogerdpack2@...>
Hello.
Can I see ruby-prof code?
On Fri, Nov 9, 2012 at 11:14 AM, SASADA Koichi <ko1@atdot.net> wrote:
[#49196] [ruby-trunk - Feature #7322][Open] Add a new operator name #>< for bit-wise "exclusive or" — "alexeymuranov (Alexey Muranov)" <redmine@...>
[#49211] [ruby-trunk - Feature #7328][Open] Move ** operator precedence under unary + and - — "boris_stitnicky (Boris Stitnicky)" <boris@...>
[#49256] [ruby-trunk - Feature #7336][Open] Flexiable OPerator Precedence — "trans (Thomas Sawyer)" <transfire@...>
[#49267] [ruby-trunk - Feature #7340][Open] 'each_with' or 'into' alias for 'each_with_object' — "nathan.f77 (Nathan Broadbent)" <nathan.f77@...>
[#49268] [ruby-trunk - Feature #7341][Open] Enumerable#associate — "nathan.f77 (Nathan Broadbent)" <nathan.f77@...>
[#49282] Re: [ruby-cvs:44801] tenderlove:r37631 (trunk): * probes.d: add DTrace probe declarations. — "U.Nakamura" <usa@...>
Hello,
Hello,
2012/11/13 U.Nakamura <usa@garbagecollect.jp>:
[#49298] [ruby-trunk - Feature #7346][Open] object(...) as syntax sugar for object.call(...) — "rosenfeld (Rodrigo Rosenfeld Rosas)" <rr.rosas@...>
[#49320] [ruby-trunk - Feature #4085] Refinements and nested methods — "headius (Charles Nutter)" <headius@...>
[#49328] [ruby-trunk - Bug #7349][Open] Struct#inspect needs more meaningful output — "postmodern (Hal Brodigan)" <postmodern.mod3@...>
[#49340] bugs.ruby-lang.org - 500 error — Luis Lavena <luislavena@...>
Hello,
I've been unable to access it since morning EET (about 6 hours now).
It's almost 3am in Japan now, don't forget.
On Wed, Nov 14, 2012 at 2:46 PM, Zachary Scott <zachary@zacharyscott.net> wrote:
[#49354] review open pull requests on github — Zachary Scott <zachary@...>
Could we get a review on any open pull requests on github before the
2012/11/15 Zachary Scott <zachary@zacharyscott.net>:
Ok, I was hoping one of the maintainers might want to.
I could add my eyes to monitor the github issues/pull requests, if only to
On Thu, Nov 15, 2012 at 2:11 PM, Marc-Andre Lafortune
On Thu, Nov 15, 2012 at 1:01 PM, Luis Lavena <luislavena@gmail.com> wrote:
On Thu, Nov 15, 2012 at 1:06 PM, Zachary Scott <zachary@zacharyscott.net>
[#49370] [ruby-trunk - Bug #7358][Open] Wrong fd redirection on fork — "felipec (Felipe Contreras)" <felipe.contreras@...>
[#49416] make check: missing psych — Ramkumar Ramachandra <artagnon@...>
Hi,
On Fri, Nov 16, 2012 at 9:58 AM, Ramkumar Ramachandra
Luis Lavena wrote:
[#49463] [ruby-trunk - Feature #7375][Open] embedding libyaml in psych for Ruby 2.0 — "tenderlovemaking (Aaron Patterson)" <aaron@...>
On Sun, Nov 18, 2012 at 03:05:50AM +0900, vo.x (Vit Ondruch) wrote:
Dne 17.11.2012 21:19, Aaron Patterson napsal(a):
On 17 November 2012 21:34, V咜 Ondruch <v.ondruch@gmail.com> wrote:
Hello,
[#49468] [ruby-trunk - Feature #7378][Open] Adding Pathname#write — "aef (Alexander E. Fischer)" <aef@...>
[#49479] [ruby-trunk - Bug #7379][Open] Unexpected result of Kernel#gets on Windows 8 — "phasis68 (Heesob Park)" <phasis@...>
[#49518] [ruby-trunk - Bug #7383][Open] Use stricter cache check in load.c — "funny_falcon (Yura Sokolov)" <funny.falcon@...>
[#49536] [ruby-trunk - Feature #7388][Open] Object#embed — "zzak (Zachary Scott)" <zachary@...>
[#49543] [ruby-trunk - Feature #7390][Open] Funny Falcon Threads — "zzak (Zachary Scott)" <zachary@...>
[#49558] [ruby-trunk - Bug #7395][Open] Negative numbers can't be primes by definition — "zzak (Zachary Scott)" <zachary@...>
[#49868] How to stop spam from ruby-core — Heesob Park <phasis@...>
Hi,
[#49949] [ruby-trunk - Feature #7426][Assigned] Update Rdoc — "mame (Yusuke Endoh)" <mame@...>
(2012/11/27 13:33), drbrain (Eric Hodel) wrote:
On Tue, Nov 27, 2012 at 12:57 AM, SASADA Koichi <ko1@atdot.net> wrote:
On Nov 26, 2012, at 10:09 PM, Luis Lavena <luislavena@gmail.com> wrote:
[#50092] [ruby-trunk - Feature #7434][Open] Allow caller_locations and backtrace_locations to receive negative params — "sam.saffron (Sam Saffron)" <sam.saffron@...>
[#50264] [ruby-trunk - Feature #7457][Open] GC.stat to return "allocated object count" and "freed object count" — "ko1 (Koichi Sasada)" <redmine@...>
[#50306] Towards a better process for changing Ruby — Magnus Holm <judofyr@...>
Hey folks,
What I'd like to see is primarily better communication and release
Hello Magnus,
Endoh-san,
[#50312] How to stop spam message from redmine.ruby-lang.org — Heesob Park <phasis@...>
HI,
Hi,
[#50372] [ruby-trunk - Bug #7476][Open] missing "IP_TRANSPARENT" constant for IP sockets. — "elico (Eliezer Croitoru)" <eliezer@...>
2013/2/24 ko1 (Koichi Sasada) <redmine@ruby-lang.org>:
[ruby-core:49157] [ruby-trunk - Bug #7267] Dir.glob on Mac OS X returns unexpected string encodings for unicode file names
Issue #7267 has been updated by kennygrant (Kenny Grant).
Thanks for the comments on this issue. I'm not clear on what the UTF8-MAC encoding represents, are there docs on this Ruby behaviour and the problems involved somewhere?
At present Dir.glob has inconsistent behaviour even working with the same file/filesystem:
It may return a filename marked UTF-8 which is NFD, or NFC, depending on the glob pattern you call it with (see writer.rb attachment to this issue). That's a small issue though and just indicates a wider complex problem.
> An issue is people may write decomposed filename. A imaginary use case is a program which make a filename from the name of a music output from iTunes. iTunes manages texts with UTF8-MAC. So the people will confuse.
OK, so in this case someone is unwittingly using a mix of UTF-8 NFC (any strings they create in ruby with legible accents) and UTF-8 NFD (any strings they get from itunes say) in their script, which could lead to issues even before writing file names. If they get NFD from itunes, then try to match on a track name with a regexp, it won't work unless they convert to NFC or explicitly create an NFD string will it? So even ignoring the file system there are issues here with labelling two different normalization forms UTF-8 as it leads to naive expectations of compatability which are false.
> More painful thing is normal UTF-8 is not NFCed UTF-8.There is both NFCed one and NFDed one.
Yes, it might be good to make this very clear in the Ruby docs, as so many people use UTF-8 for all strings now, probably mostly without understanding this issue (I'm aware not everyone in the Ruby community uses UTF-8 for other reasons). I certainly didn't know about it and it took quite a while to find any docs about it at all (most of which were for perl or other languages). Thanks for taking the time to explain it, and sorry if I missed some obvious notes in the docs for the ruby string class say.
One thing I don't understand though, is that you say there are both in normal use - in use of Ruby ignoring file systems, if you create a string or regexp, NFC is the default isn't it? So Ruby has chosen one default for UTF-8 strings created in Ruby (as it must), but has to interact with lots of systems which might or might not be using NFC. At present we seem to have a de-facto default normalization of NFC, but nothing is translated to it when it comes from the OS. That might be a a very hard problem, but in principle it would be nice to have one normalization blessed as the default so that all strings in a given encoding are comparable. The results of leaving them as they are supplied are really unexpected, and people using Ruby are not going to want to manually convert every string they touch from outside Ruby to NFC in case it was touched by HFS or created as NFD.
> First Ruby 1.9.0 set strings derived from filenames UTF8-MAC.
> But some reported that if filenames is UTF8-MAC, it is hard to compare
> with normal UTF-8 strings.
This is interesting as it's exactly the behaviour I expected (if it's not possible to cleanly translate to NFC) - if strings are coming through as UTF-8 NFD, I'd expect them to be marked as such somehow (for example by being marked as encoding UTF8-MAC) - is there any indication? Then at least it is clear that they are not comparable or compatible with the NFC ruby strings I get when creating a string s = "d辿tente". Removing the explicit encoding doesn't really solve the problem of strings being hard to compare, it just hides it until comparisons fail. By default, Ruby seems to use NFC UTF-8, which is what I'd expect, so I also expected to either receive strings which are comparable directly, or for those strings to be marked as some other encoding/normalization if the come from an HFS+ file system, so that I can deal with them even if Ruby doesn't. Is the normalization exposed somewhere on strings? The current situation was surprising and unexpected, though now that it has
been explained I see why it might occur and how hard it is to fix.
In the example code attached to this issue, I've used str.encode!("UTF-8", "UTF8-MAC") to translate file names/paths to NFC which works for my uses, but from naruse's comments, it seems that would fail in certain circumstances?
> If the translation from UTF8-MAC -> UTF-8 is entirely non-lossy and would do no harm to other UTF-8 strings
> Yes until all part of the converting string is truly UTF8-MAC.
I assumed from others' comments that UTF8-MAC was purely a sub-encoding used to indicate the use of decomposed strings, but would appreciate some more detail (if anyone has a link) on what exactly it involves, and if translation from UTF8-MAC to UTF8 can lose information that implies other differences. If the only difference is the decomposition (patterns which do not occur in NFC), I'd expect re-encoding to be idempotent and not affect NFC strings and thus harmless to apply to NFC strings or strings containing a mix. Re the file-system example, I had assumed that if you ask HFS to write to a file on a mounted file system HFS would normalize all names to NFD (as it does for any HFS files), but perhaps that is incorrect.
I suppose the above boils down to this question:
Is there a correct way to handle this situation, and never fail when comparing a default Ruby string (NFC) against a file from any file system which may be NFD?
----------------------------------------
Bug #7267: Dir.glob on Mac OS X returns unexpected string encodings for unicode file names
https://bugs.ruby-lang.org/issues/7267#change-32704
Author: kennygrant (Kenny Grant)
Status: Assigned
Priority: Normal
Assignee: duerst (Martin D端rst)
Category:
Target version: 2.0.0
ruby -v: ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-darwin11.4.0]
Tested on Ruby 1.9.3-p194 and ruby-2.0.0-preview1 on Mac OS X 10. 7.5
When calling file system methods with Ruby on Mac OS X, it is not possible to manipulate the resulting file name as a normal UTF-8 string, even though it reports the encoding as UTF-8. It seems to be a UTF-8-MAC string, even when the default encoding is set to UTF-8. This leads to confusion as the string can be manipulated normally except for any unicode characters, which seem to be decomposed. So a regexp using utf-8 characters won't work on the string, unless it is first converted from UTF-8-MAC. I'd expect the string encoding to be UTF-8, or at least to report that it is not a normal UTF-8 string if it has to be UTF-8-MAC for some reason.
Example, run with a file called Test辿.txt in the same folder:
def transform_string s
puts "Testing string #{s}"
puts s.gsub(/辿/,'TEST')
end
Dir.glob("./*.txt").each do |f|
puts "Inline string works as expected"
s = "./Test辿.txt"
puts transform_string s
puts "File name from Dir.glob does not"
puts transform_string f
puts "Encoded file name works as expected, though it is reported as UTF-8, not UTF-8-MAC"
f.encode!('UTF-8','UTF-8-MAC')
puts transform_string f
end
--
http://bugs.ruby-lang.org/