[ruby-core:30123] Suggestion regarding m17n
Hello fellow Rubyists and Ruby core developers!
First of all, a disclaimer: I am not an expert on m17n or Ruby's m17n
implementation. I might be making fundamental errors, so please do not
hesitate to point them out in a direct way.
This is probably a long read but I would really appreciate any time
spent on feedback or in general - helping others make the most of what
Ruby provides regarding multilingualization.
M17N FROM A SIMPLE DEVELOPER'S PERSPECTIVE
After many hours spent on learning about m17n in Ruby and encoding
issues, I have been banging my head against the wall, trying to figure
out how to help developers having the same issues while attempting to
write robust applications and frameworks.
This is difficult for a few reasons:
1. Correct understanding of encoding issues requires years of
experience and an incredible amount of knowledge.
2. Many developers do not experience such issues until their users
do.
3. Developers under pressure try to work around problems by encoding
everything to UTF-8 or ASCII-8BIT.
4. Tracking down root causes of encoding incompatibilities is
difficult.
5. Ruby 1.8 doesn't support encoding, which makes
backward-compatible workarounds look quite ugly.
6. Ruby allows joining ascii compatible strings as a special case
which is great for backward compatibility, but makes development and
debugging harder, especially since things "work most of the time".
7. There are great articles about m17n features and changes in Ruby
1.9, but most of them present only the available functionality
instead of describing how to fix problems, how to avoid them and
what to look out for.
Notable exceptions:
* obviously James Edward Gray's "Shades of Gray" blog
* "Ruby Best Practices" by Gregory Brown (covers James's regexp
idea used in CSV, and how to deal with encoding problems)
* Yehuda's recent, in-depth article on his blog:
"Ruby 1.9 Encodings: A Primer and the Solution for Rails"
* comments on Redmine, where core Ruby developers share their
knowledge and more importantly - do it in a very clear and
concise way.
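
Point 6 above (joining ASCII-compatible strings as a special case) can be
seen in a short sketch. The helper name `try_concat` is made up for
illustration, and `String#b` assumes Ruby 2.0 or later:

```ruby
# Hypothetical helper: returns the encoding of a + b, or :incompatible
# when Ruby refuses to join the two strings.
def try_concat(a, b)
  (a + b).encoding
rescue Encoding::CompatibilityError
  :incompatible
end

utf8       = "r\u00e9sum\u00e9"   # UTF-8 text with non-ASCII characters
ascii_only = "abc".b              # ASCII-8BIT, but only 7-bit bytes
high_bytes = "\xFF\xFE".b         # ASCII-8BIT with bytes above 0x7F

p try_concat(utf8, ascii_only)    # prints #<Encoding:UTF-8>
p try_concat(utf8, high_bytes)    # prints :incompatible
```

The first join succeeds only because the binary string happens to hold
7-bit data, which is why such code "works most of the time" and fails
once real input arrives.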
Encoding is really a very difficult concept to implement correctly and
Ruby does a great job providing a CSI approach, while minimizing the
drawbacks. Sometimes I think it is a pity that people are unaware how
much effort has been put into Ruby to achieve this.
MY QUESTIONS
1. Does it make sense to expect libraries and frameworks for Ruby to
work with an ASCII-incompatible internal encoding, e.g. "-E :UTF-16BE"?
Could we consider failing to do so a bug?
2. If so, would it be reasonable to expect the same of Ruby's standard
library?
3. If so, which Ruby version should people test against?
* 1.9.1-p376 (as presented on ruby-lang.org)
* 1.9.1-p378
* 1.9.1-head
* a specific 1.9.2 revision?
4. Would trying to add support to applications running in ASCII
incompatible environments be useful to developers in the long run
(taking into account future versions of Ruby) or just be an
unnecessary activity (e.g. because of UTF-8 which is ASCII
compatible)?
EXPLANATION
Now, before I am misunderstood (or probably even laughed at for
proposing something so unthinkable as globally setting an
ASCII-incompatible encoding and expecting it to be supported), I want
to make a few things clear:
- I am not trying to be an "encoding purist" by suggesting this idea
- I am not carelessly "adding more work" for other people by
"expecting UTF-16 to work" or "get fixed"
- I am not proposing a UCS model in place of the existing CSI one
- I am not suggesting people change their default encodings
- I am not suggesting changing anything in m17n handling in Ruby
I am evaluating if *testing* encoding support in applications with a
non-ASCII compatible default_internal makes sense.
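
A minimal sketch of what such a test could look like, without touching
the global default encodings. The helper `encoding_failures` and the two
lambdas are hypothetical names used only for illustration:

```ruby
# Encodings to exercise; the two UTF-16 variants are not ASCII-compatible.
TEST_ENCODINGS = %w[UTF-8 UTF-16BE UTF-16LE].freeze

# Hypothetical helper: runs the block once per encoding with `sample`
# transcoded to it, and returns { encoding name => error class } for
# every encoding where the block raised a compatibility error.
def encoding_failures(sample)
  TEST_ENCODINGS.each_with_object({}) do |name, failures|
    begin
      yield sample.encode(name)
    rescue Encoding::CompatibilityError => e
      failures[name] = e.class
    end
  end
end

# Mixes its argument with a literal in the source encoding.
naive    = ->(dir) { dir + "/lib" }
# Transcodes the literal to whatever encoding the caller supplied.
portable = ->(dir) { dir + "/lib".encode(dir.encoding) }

p encoding_failures("/usr/local") { |dir| naive.call(dir) }     # both UTF-16 variants fail
p encoding_failures("/usr/local") { |dir| portable.call(dir) }  # prints {}
```

The UTF-8 run passes in both cases, so only the ASCII-incompatible runs
reveal that the naive version depends on the literal's encoding.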
REASONS
This would help developers to:
- find encoding problems before they occur and are reported
- reproduce problems without creating specific test cases or just write
fewer encoding specific test cases
- discover the root causes much faster
- prevent a lot of encoding related regressions
- improve performance by making sure no encoding happens implicitly
- smooth out some possibly hidden and unreported encoding issues in Ruby
- possibly write more robust applications that might gracefully
handle changes in Ruby's m17n implementation
- respect the fact that not everyone can "just use UTF-8" and truly
globalize their applications with less effort and issues in the
long run
- in the worst case scenario where this is too much effort, either
specific components can be supported or incompatible encoding
support can be discarded on a case-by-case basis
The downside would be:
- many more problems will surface than would normally occur in
reality
- people interested only in ASCII compatible encodings would have to
do more work or give up on encoding support
- developers may have to become aware of regexp, 'filesystem',
'locale' and other encodings - even if they want things to "just
work"
- problems would have to be solved near the cause, which requires time
and knowledge to do properly - adding ".encoding" calls in random
places won't be enough
- handling all the issues in the most correct and
backward-compatible way would make the code uglier, especially
with regexps
- effort needed to persuade developers to pro-actively test their
software this way, instead of expecting test cases and patches.
CASE STUDY:
As an experiment, I tried to get irb and then tests to run without
warnings, starting with the following:
Rubygems:
% ruby -v -E :UTF-16BE -e 'puts "hello"'
ruby 1.9.2dev (2010-02-04 trunk 26559) [x86_64-linux]
Error loading gem paths on load path in gem_prelude
incompatible character encodings: UTF-16BE and ASCII-8BIT
<internal:gem_prelude>:76:in `split'
<internal:gem_prelude>:76:in `set_paths'
<internal:gem_prelude>:47:in `path'
<internal:gem_prelude>:228:in `push_all_highest_version_gems_on_load_path'
<internal:gem_prelude>:294:in `<compiled>'
hello
And ended up with:
http://github.com/e2/ruby/compare/trunk...utf16_fix
Although getting all the tests running will still require touching
quite a few files.
NOTE: I tried my best when choosing each workaround, but I believe
there are much cleaner solutions out there. I also admit that I gave
up on trying to get irb to actually do something other than just
start.
It turns out there are usually just a few tiny patches required in
every library, that a developer with commit access could do on his own
in a short while. Once this is done, the remaining issues can be
discovered while using the standard library. And it would be a great
example of how to correctly use the m17n functionality in Ruby.
The problems I encountered were usually related to:
- regexp handling
- ENV variables (reading and writing) in locale which were not
compatible with default_internal
- using concat operations like File.join, split, interpolation
- comparing strings with different encodings silently returns false
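
The last item is the easiest one to miss, because nothing is raised. A
small sketch follows; `same_text?` is a hypothetical helper, and UTF-8 as
the common encoding is an assumption made for the example:

```ruby
utf8   = "caf\u00e9"
latin1 = utf8.encode("ISO-8859-1")

p latin1 == utf8                   # prints false, no exception
p latin1.encode("UTF-8") == utf8   # prints true

# Hypothetical helper: compare text rather than bytes by transcoding both
# sides to a common encoding first. Strings that cannot be transcoded are
# treated as different.
def same_text?(a, b, via = Encoding::UTF_8)
  a.encode(via) == b.encode(via)
rescue EncodingError
  false
end

p same_text?(latin1, utf8)                     # prints true
p same_text?("abc".encode("UTF-16BE"), "abc")  # prints true
```

Even plain ASCII text compares as different once one side is UTF-16,
so a lookup such as a hash key or a `case` branch can fail quietly.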
Some other issues that may become more apparent as a result of trying
to get things to work:
- filesystem encoding / locale mismatches
- program argument encodings
- file system encoding differences for different mount points
- environment variables containing non-ascii or non-utf data
- lack of encoding info from stdlib
- stdin/stdout issues
- gracefully handling corrupt data
- combinations of the above and other
As a side note, I was wondering if regexp support couldn't be extended
to better support what was done in the CSV library - encoding the
regexp to match the default/input. Perhaps with a cleaner syntax or
making the existing notation more flexible, since regexp are used like
a Swiss Army Knife for many different things. Then again, I may be
missing something important.
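
A rough sketch of the idea, in the spirit of what the CSV library does
but not copied from it. The helper name `regexp_for` is hypothetical:

```ruby
# Hypothetical helper: build a regexp in the same encoding as `text`
# by transcoding the pattern source before compiling it.
def regexp_for(source, text)
  Regexp.new(source.encode(text.encoding))
end

line = "a,b,c".encode("UTF-16BE")

begin
  line =~ /,/   # the literal regexp is not in the string's encoding
rescue Encoding::CompatibilityError => e
  puts "literal regexp failed: #{e.message}"
end

comma = regexp_for(",", line)
p comma.encoding                                       # prints #<Encoding:UTF-16BE>
p line.split(comma).map { |f| f.encode("UTF-8") }      # prints ["a", "b", "c"]
```

The cost is that every regexp used on external data has to be built at
run time, which is the "uglier code" downside mentioned earlier.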
I understand what I propose may be crazy and require an insane amount
of work to support. However, I do *not* believe that the flexibility
Ruby provides and the support of vague cases (e.g. 7-bit ASCII safe
concat, default encodings) is meant to be treated as a standard for
new applications now and in the future.
Instead, IMHO this flexibility should be used appropriately: for easy
transitions, backward compatibility, performance and other instances
that are genuinely useful. In every other case, I believe putting
additional effort into avoiding transcoding where possible and
generally honoring the user's selection of encoding, whatever he or
she may choose, will provide more benefit for everyone in the long
run.
Please correct me if I am wrong.
Thank you for your kind interest and precious time,
Cezary Bagiński
--
Cezary Baginski