Re: multibyte strings & bucket-of-bytes efficiency under 1.9.0

From: Tanaka Akira <akr@...>
Date: 2008-01-01 07:00:36 UTC
List: ruby-core #14657
In article <6.0.0.20.2.20071231173234.0a28c170@localhost>,
  Martin Duerst <duerst@it.aoyama.ac.jp> writes:

> I think it's a bit overkill to claim that we better use
> ASCII-8BIT than BINARY if out of a file that can easily
> be 100KB or more, and where virtually all byte values
> that look like ASCII characters are not characters at all,
> just because of three bytes at the start of the file.

GIF is just an example.

Another example is Internet mail.

With MIME, a mail may contain two or more texts in
different encodings, so the mail as a whole is BINARY.  But
the header is basically ASCII, and line-oriented, ASCII-based
processing is required to decompose a multipart mail:
extracting the "Content-Type" field body and its "boundary"
parameter.  The extracted ASCII parameter is then used to
search the BINARY body.
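
A rough sketch of that decomposition in Ruby, to make the
mixed ASCII/bytes handling concrete.  The message, the
boundary "xyz", and the variable names are made up for
illustration, and String#b is assumed to be available (it
comes from Ruby releases later than 1.9.0).  This is not a
real MIME parser.

```ruby
# Hypothetical two-part message, held as raw bytes (ASCII-8BIT).
payload = [0xFF, 0xFE, 0x00, 0x41].pack("C*") # arbitrary non-text bytes
raw = ("Content-Type: multipart/mixed; boundary=\"xyz\"\r\n" \
       "\r\n" \
       "--xyz\r\n" \
       "Content-Type: text/plain; charset=US-ASCII\r\n" \
       "\r\n" \
       "hello\r\n" \
       "--xyz\r\n" \
       "Content-Type: application/octet-stream\r\n" \
       "\r\n").b + payload + "\r\n--xyz--\r\n".b

# Header processing is line-oriented and ASCII-based,
# even though the string as a whole is only bytes.
header, body = raw.split("\r\n\r\n".b, 2)
boundary = header[/boundary="?([^";\r\n]+)"?/i, 1]

# The extracted ASCII parameter is then used to search the byte body.
delimiter = "--#{boundary}"
parts = body.split(delimiter)[1..-1].reject { |part| part.start_with?("--") }
```

Here an ASCII-only regexp and an ASCII-only delimiter are
applied to a string that also carries arbitrary bytes, which
is the situation described above.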

How should a MIME library accept a mail?  As ASCII-8BIT or
as BINARY?  It should be BINARY in theory, but that is not
intuitive.

It seems Python-3000 faced this problem.
[Python-3000] Questions about email bytes/str (python 3000)
http://mail.python.org/pipermail/python-3000/2007-August/009503.html

Python has some experience in distinguishing bytes from
strings.  I think we should study what Python does in this
area.

Another example is RFC 1468.

RFC 1468 (ISO-2022-JP) describes escape sequences as
"ESC ( B", etc.  The authors don't distinguish between ASCII
characters and octets.

JIS X 0202 (the Japanese version of ISO 2022) distinguishes
them.  In the style of JIS X 0202, they should be written as
"ESC 2/8 4/2", etc.

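To show how the two notations relate, here is a small Ruby
sketch that turns the column/row form into octets.  The
helper name octet_from_position is hypothetical, and the
sketch assumes the usual reading of "c/r" as the octet
16*c + r.

```ruby
# Convert one token of ISO 2022 style notation to an octet value.
# "ESC" is the escape octet 0x1B; "c/r" means column c, row r.
def octet_from_position(token)
  return 0x1B if token == "ESC"
  column, row = token.split("/").map { |n| Integer(n, 10) }
  (column << 4) | row
end

bytes = "ESC 2/8 4/2".split.map { |token| octet_from_position(token) }
sequence = bytes.pack("C*") # the same octets that RFC 1468 writes as "ESC ( B"
```

The result is a byte string; reading 2/8 as "(" and 4/2 as
"B" is exactly the step that treats octets as ASCII.
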
There is a culture that doesn't distinguish ASCII from
BINARY, especially on Unix and the Internet.  It may be
easy to distinguish them in most cases, but sometimes it is
not simple.

> Count uses just a very tiny part of the Regexp syntax,
> so see below.

count, delete, and squeeze have no \xHH notation, so we
cannot specify a byte in ASCII notation.  This differs from
Regexp.

> We of course can't do this. But it's not necessary. There
> are, simply put, two strings participating in a regexp
> operation: The regular expression and the 'target string'.
> The regular expression needs a 'real' encoding, in most
> cases ASCII-8BIT will be sufficient. The target string
> can be just bytes. We just need a few conventions to
> do the right things, e.g. we can agree that '.' matches
> one byte (rather than one character). That's very easy
> to implement. We can also limit non-meta characters to
> e.g. just \xHH notation, to make clear that we are just
> matching byte values and not actual characters. But
> imprementing that is probably quite a bit of a hassle
> for little benefit.

Of course it is easy to implement if we consider that /A/
matches BINARY 0x41.  That is what Ruby 1.9 does now with
ASCII-8BIT.

But why is BINARY required if we need to consider LATIN
CAPITAL LETTER A equal to BINARY 0x41?

That seems to be just ASCII-8BIT.

Note that BINARY is an alias for ASCII-8BIT now.

% ./ruby -e 'p Encoding.find("BINARY")'
#<Encoding:ASCII-8BIT>

> Even currently, we can use an ASCII-8BIT regexp with
> many, many other encodings, so using it with BINARY
> isn't anything much new.

An ASCII-8BIT regexp is usable with other encodings if the
regexp contains only ASCII characters, or the target string
contains only ASCII characters.

If the regexp and the string both have non-ASCII characters,
a match is possible only if their encodings are the same.
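
The rule can be observed with a short Ruby sketch.  It uses a
UTF-8 regexp rather than an ASCII-8BIT one for the non-ASCII
case, it assumes String#b (from Ruby releases later than
1.9.0), and the behavior shown is that of released Ruby
versions, which may differ in detail from the 1.9.0 snapshot
discussed here.

```ruby
bytes    = "\xFFA".b              # byte string with a non-ASCII byte
ascii_re = /A/                    # regexp with ASCII characters only
utf8_re  = Regexp.new("\u00E9")   # regexp with a non-ASCII UTF-8 character

# ASCII-only regexp against a non-ASCII byte string: allowed.
ascii_hit = ascii_re =~ bytes

# Non-ASCII regexp against an ASCII-only byte string: allowed (no match).
ascii_only_target = utf8_re.match("abc".b)

# Both sides non-ASCII with different encodings: rejected.
mismatch = begin
  utf8_re.match(bytes)
  :no_error
rescue Encoding::CompatibilityError
  :incompatible
end
```

Only the last case fails, because neither side is ASCII-only
and the encodings differ.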

Since BINARY has no ASCII characters, any non-empty BINARY
string has a non-ASCII character.  So an ASCII-8BIT regexp
is not applicable in general.  This follows from the current
principle.

> Well, it's actually not so difficult, and it will be needed.
> We can't label a String as e.g. UTF-16, and claim that
> UTF-16 is some kind of ASCII-compatible encoding.

I'm not sure when matz will introduce UTF-16.  I hope it is
not just before a release.  At that time, dereferences of
char* should be examined.
-- 
Tanaka Akira
