[#12312] Need Japanese Help - VRuby & new One-Click Ruby Installer with patch 110 — "Curt Hibbs" <curt.hibbs@...>
I'm trying to build a new release of the One-Click Ruby Installer for
Hello,
Hello,
[#12328] Dir.chdir patch for MS Windows — "Berger, Daniel" <Daniel.Berger@...>
Hi,
[#12344] patch to implement Array.permutation — David Flanagan <david@...>
Hi,
[#12372] Release compatibility/train — Prashant Srinivasan <Prashant.Srinivasan@...>
Hello all,
Hi,
Yukihiro Matsumoto wrote:
Hi,
Yukihiro Matsumoto wrote:
Hi,
Yukihiro Matsumoto wrote:
Hi,
Hi --
On 10/3/07, David A. Black <dblack@rubypal.com> wrote:
Rick DeNatale wrote:
[#12383] Include Rake in Ruby 1.9 — "NAKAMURA, Hiroshi" <nakahiro@...>
-----BEGIN PGP SIGNED MESSAGE-----
On 10/3/07, NAKAMURA, Hiroshi <nakahiro@sarion.co.jp> wrote:
On Oct 3, 2007, at 08:59 , Jacob Fugal wrote:
-----BEGIN PGP SIGNED MESSAGE-----
On 10/15/07, NAKAMURA, Hiroshi <nakahiro@sarion.co.jp> wrote:
[#12539] Ordered Hashes in 1.9? — Michael Neumann <mneumann@...>
Hi all,
Hi,
Yukihiro Matsumoto wrote:
[#12568] $" and require — "Tim Morgan" <tmorgan99@...>
Hello!
[#12578] Possible memory leak in ruby-1.8.6-p110?? — "M. Edward (Ed) Borasky" <znmeb@...>
I haven't had a chance to narrow this down in enough detail yet, but
M. Edward (Ed) Borasky wrote:
On Thu, 11 Oct 2007, M. Edward (Ed) Borasky wrote:
[#12579] iconv enhancement in Ruby 1.9 — "Eugene Ossintsev" <eugoss@...>
Hi,
[#12587] Confusion about arities — Charles Oliver Nutter <charles.nutter@...>
It seems like a number of methods have unexpected arities. For example,
On Oct 10, 2007, at 22:44 , Charles Oliver Nutter wrote:
Eric Hodel wrote:
[#12588] MatchData#select rdoc and arity incorrect — Charles Oliver Nutter <charles.nutter@...>
Rdoc is here:
[#12617] Question about heap_slots in gc.c — Hongli Lai <h.lai@...>
I'm trying to modify the Ruby interpreter's garbage collector. At the
[#12618] StringIO is not IO? — Hongli Lai <h.lai@...>
According to irb,
[#12629] file encoding comments and a patch to parse.y — David Flanagan <david@...>
Matz, Nobu:
[#12632] Defining unicode methods — "Daniel Berger" <djberg96@...>
Hi all,
[#12670] Bug in Numeric#divmod — "Dirk Traulsen" <dirk.traulsen@...>
Hi all!
[#12681] Unicode: Progress? — murphy <murphy@...>
Hello!
murphy schrieb:
Hi,
Yukihiro Matsumoto wrote:
[#12693] retry: revised 1.9 http patch — Hugh Sasse <hgs@...>
I'm reposting this because I've had little response to this version
On Tue, Oct 16, 2007 at 01:32:42AM +0900, Hugh Sasse wrote:
Would this require that zlib be installed? I know that it's possible to
On Wed, 31 Oct 2007, Roger Pack wrote:
-----BEGIN PGP SIGNED MESSAGE-----
[#12697] Range.first is incompatible with Enumerable.first — David Flanagan <david@...>
The new Enumerable.first method is a generalization of Array.first to
Hi,
[#12703] Long encoding names with -K and bad error message — David Flanagan <david@...>
I noticed the following line in the change log:
Hi,
Nobuyoshi Nakada wrote:
Nobu,
At 16:04 07/10/17, David Flanagan wrote:
[#12706] Re: A couple of bugs? — "Gavin Kistner" <gavin.kistner@...>
From: John Lam (DLR) [mailto:jflam@microsoft.com]
On Wed, Oct 17, 2007 at 03:10:07AM +0900, Gavin Kistner wrote:
Well, that's interesting. Then this seems to be the only assignment that ha=
[#12710] enum.c patch: fixes Enumerable.cycle and rdoc bugs — David Flanagan <david@...>
The attached patch fixes:
Hi,
[#12714] Re: A couple of bugs? — "Gavin Kistner" <gavin.kistner@...>
> Well, that's interesting. Then this seems to be the only
[#12754] Improving 'syntax error, unexpected $end, expecting kEND'? — Hugh Sasse <hgs@...>
I've had a look at this, but can't see how to do it: When I get
On Fri, Oct 19, 2007 at 03:01:55AM +0900, Hugh Sasse wrote:
The patch below changes this message to:
At 04:15 07/10/24, David Flanagan wrote:
Thanks for filling these in Martin. I worry that this is such a simple
At 16:57 07/10/24, David Flanagan wrote:
Martin Duerst schrieb:
Hi,
[#12758] Encoding::primary_encoding — David Flanagan <david@...>
Hi,
Hi,
Nobuyoshi Nakada schrieb:
Hi,
Nobuyoshi Nakada schrieb:
Hi,
Nobuyoshi Nakada schrieb:
On 22/10/2007, Wolfgang Nádasi-Donner <ed.odanow@wonado.de> wrote:
Michal Suchanek schrieb:
Hi,
Nobuyoshi Nakada schrieb:
I made some tests with UTF-8, option "-Ku", option "-Ka" and both types of magic
[#12767] \u escapes in string literals: proof of concept implementation — David Flanagan <david@...>
Back at the end of August, Matz wrote (see
Hi,
Nobuyoshi Nakada wrote:
Hi,
Yukihiro Matsumoto wrote:
At 04:19 07/10/23, David Flanagan wrote:
Martin Duerst wrote:
Hi,
At 13:10 07/10/23, David Flanagan wrote:
Martin Duerst wrote:
Hi,
Yukihiro Matsumoto wrote:
Hi,
Nobuyoshi Nakada wrote:
Hi,
At 16:46 07/10/29, Nobuyoshi Nakada wrote:
Hi,
At 11:29 07/11/06, Nobuyoshi Nakada wrote:
Hi,
Yukihiro Matsumoto wrote:
[#12787] How to specify in Ruby 1.9 the expected file encoding — Wolfgang Nádasi-Donner <ed.odanow@...>
Dear Ruby developers!
Wolfgang Nádasi-Donner wrote:
Gonzalo Garramuño schrieb:
Hi,
Yukihiro Matsumoto schrieb:
I wouldn't want a program to write a BOM at the start of a file
[#12795] patch for String.concat — David Flanagan <david@...>
I don't think that String.<< currently handles appending codepoints
[#12825] clarification of ruby libraries installation paths? — Lucas Nussbaum <lucas@...>
Hi,
On Mon, Oct 22, 2007, Lucas Nussbaum wrote:
On 23/10/07 at 00:13 +0900, Ben Bleything wrote:
On 10/22/07, Lucas Nussbaum <lucas@lucas-nussbaum.net> wrote:
On 23/10/07 at 01:55 +0900, Austin Ziegler wrote:
Lucas Nussbaum wrote:
On 24/10/07 at 05:14 +0900, Gonzalo Garramuño wrote:
Lucas Nussbaum wrote:
On 30/10/07 at 07:28 +0900, Gonzalo Garramuño wrote:
On 10/29/07, Lucas Nussbaum <lucas@lucas-nussbaum.net> wrote:
Austin Ziegler wrote:
On 10/30/07, Mathieu Blondel <mblondel@rubyforge.org> wrote:
On Tue, Oct 23, 2007 at 01:55:29AM +0900, Austin Ziegler wrote:
On 10/22/07, Sam Roberts <sroberts@uniserve.com> wrote:
Austin Ziegler wrote:
On 10/28/07, Bob Proulx <bob@proulx.com> wrote:
Austin,
On 10/29/07, Lucas Nussbaum <lucas@lucas-nussbaum.net> wrote:
On 10/29/07, Luis Lavena <luislavena@gmail.com> wrote:
On 10/30/07, Austin Ziegler <halostatue@gmail.com> wrote:
Do we think that maybe, just maybe, things went off the rails when the
On 10/30/07, Rick Bradley <rick@rickbradley.com> wrote:
On Tue, 30 Oct 2007 22:52:29 +0900, "Luis Lavena" <luislavena@gmail.com> wrote:
[#12849] Problem reported in Rdoc (Ruby 1.9) Rdoc for Ruby 1.8 works — Wolfgang Nádasi-Donner <ed.odanow@...>
Hi!
[#12867] constant lookup rules in 1.9 — David Flanagan <david@...>
Hi,
[#12895] OSX patches — "Laurent Sansonetti" <laurent.sansonetti@...>
Hi ruby-core,
[#12900] Hopefully Complete List of Possible Encoding Specifications - Existing Ones — Wolfgang Nádasi-Donner <ed.odanow@...>
Dear Ruby 1.9 architects, developers, and testers!
Hi,
Yukihiro Matsumoto schrieb:
Hi,
Yukihiro Matsumoto schrieb:
I have a (hopefully) final question before testing all
Hi,
Wolfgang Nádasi-Donner wrote:
David Flanagan schrieb:
At 10:30 07/10/26, Nobuyoshi Nakada wrote:
Yukihiro Matsumoto wrote:
On 10/25/07, Yukihiro Matsumoto <matz@ruby-lang.org> wrote:
[#12951] Fluent programming in Ruby — David Flanagan <david@...>
From the ChangeLog:
At 14:01 07/10/26, David Flanagan wrote:
Martin Duerst schrieb:
[#12971] Re: Fluent programming in Ruby — Brent Roman <brent@...>
I suppose you could have irb require a terminating ';'
> -----Original Message-----
On 10/26/07, Berger, Daniel <Daniel.Berger@qwest.com> wrote:
[#12996] General hash keys for colon notation — murphy <murphy@...>
Dear language designer(s) and parser wizards,
On 10/28/07, murphy <murphy@rubychan.de> wrote:
On 10/28/07, Rick DeNatale <rick.denatale@gmail.com> wrote:
Rick DeNatale wrote:
[#13027] Implementation of "guessUTF" method - final questions — Wolfgang Nádasi-Donner <ed.odanow@...>
Dear Ruby designers, developers, and testers!
On 10/29/07, Wolfgang Nádasi-Donner <ed.odanow@wonado.de> wrote:
Nikolai Weibull schrieb:
On 10/29/07, Wolfgang Nádasi-Donner <ed.odanow@wonado.de> wrote:
Nikolai Weibull schrieb:
Hello Wolfgang,
At 17:50 07/10/29, Nikolai Weibull wrote:
On 10/29/07, Martin Duerst <duerst@it.aoyama.ac.jp> wrote:
[#13069] new Enumerable.butfirst method — David Flanagan <david@...>
Matz,
Hi,
Yukihiro Matsumoto wrote:
Hi,
[#13083] Didn't find String#subseq — Wolfgang Nádasi-Donner <ed.odanow@...>
Hi!
[#13096] 1.8.6 gc.c thoughts — "Roger Pack" <rogerpack2005@...>
After examining how the 1.8.6 gc works, I had a few thoughts:
[#13107] %s and utf8 ? — hadmut@... (Hadmut Danisch)
Hi,
[#13135] patch for lib/net/http.rb, self['User-Agent'] ||= 'Ruby' — Stephen Bannasch <stephen.bannasch@...>
I posted this patch before in the middle of another thread and didn't
Hi Stephen,
In article <9079DC13-476F-4C12-922E-E197BD5AAA5C@loveruby.net>,
[#13139] Required Space for Unicode Character Attribute Tables — Wolfgang Nádasi-Donner <ed.odanow@...>
Hi!
[#13143] Two Issues (open-uri's respond_to? and autoload's require) — Trans <transfire@...>
Hi--
-----BEGIN PGP SIGNED MESSAGE-----
Re: \u escapes in string literals: proof of concept implementation
This is the third version of my patch for \u escapes. It is stronger
than the first and cleaner than the second. I'm more confident about
this one. If \u escapes are still desired for 1.9 (and I hope they are)
I think this patch will be helpful. Someone with more experience with
parse.y needs to look it over carefully, but I don't think it is a
complete hack, either.
The patch includes two different sets of code for converting codepoints
to UTF-8. The shorter one relies on enc/utf-8.c. The longer one does
the conversion explicitly and is probably a little faster. But I've
commented it out in favor of not duplicating that conversion code.
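For readers who do not want to open the attachment, here is a rough Ruby rendering of the explicit conversion (the longer, commented-out variant). It is only a sketch: the method name utf8_bytes is made up for illustration and is not part of the patch, and it assumes the caller has already rejected anything above 0x10FFFF, as the patch does.

```ruby
# Hypothetical helper (not in the patch): encode one Unicode codepoint
# as an array of UTF-8 byte values, using the same shifts and masks as
# the explicit branch of the C code.
def utf8_bytes(codepoint)
  unless (0..0x10FFFF).cover?(codepoint)
    raise ArgumentError, "codepoint out of range"
  end

  if codepoint < 0x80            # 1 byte: plain ASCII
    [codepoint]
  elsif codepoint < 0x800        # 2 bytes: 110xxxxx 10xxxxxx
    [0xC0 | (codepoint >> 6),
     0x80 | (codepoint & 0x3F)]
  elsif codepoint < 0x10000      # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    [0xE0 | (codepoint >> 12),
     0x80 | ((codepoint >> 6) & 0x3F),
     0x80 | (codepoint & 0x3F)]
  else                           # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    [0xF0 | (codepoint >> 18),
     0x80 | ((codepoint >> 12) & 0x3F),
     0x80 | ((codepoint >> 6) & 0x3F),
     0x80 | (codepoint & 0x3F)]
  end
end

# Cross-check against Ruby's built-in UTF-8 packing.
same = utf8_bytes(0xBBBB) == [0xBBBB].pack("U").bytes
```

The cross-check at the end plays the role that enc/utf-8.c plays in the shorter variant: the built-in encoder is the reference, and the hand-written branches only have to agree with it.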
The patch is attached. This is some interesting test code that you can
run if you apply the patch:
# \u escapes work in these forms
puts "\ubbbb"
puts %Q{\ubbbb}
puts %W{\ubbbb}
puts <<EOS
\ubbbb
EOS
# \u escapes don't work in these forms
puts '\ubbbb'
puts %q{\ubbbb}
puts %w{\ubbbb}
puts <<'EOS'
\ubbbb
EOS
# \u escapes work in regexps
puts /\ubbbb/
# \u escapes in regexps are handled by the lexer, but
# all other regexp escapes are handled by regexp engine
# This leads to possibly confusing behavior, since a \u005c
# is converted by the lexer to \, and the regexp engine can
# then interpret it as an escape
puts /\u{5c}(/ # match a single open parenthesis
# Here is the other form of \u escape
puts "\u{41}" # Letter A
puts "\u{A0A}" # Gurmukhi letter UU
puts "\u{10FFFF}" # Largest Unicode codepoint
# Encoding stuff. Any \u escapes for codepoints >= 128
# always force utf-8 encoding.
puts "\u0079".encoding # ASCII, regardless of -K
puts "\u0080".encoding # UTF-8, regardless of -K option
puts "\x79".encoding # ASCII, regardless of -K
puts "\x80".encoding # encoding depends on -K
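The rules exercised above can also be modelled in plain Ruby, which may help when reviewing the patch without building it. This is a sketch under stated assumptions, not the lexer code: expand_u_escapes is a hypothetical name, the input is assumed to be plain ASCII text containing literal backslash-u sequences, and the patch's separate "Unterminated Unicode escape" error is folded into the generic invalid-escape error.

```ruby
# Hypothetical model of the lexer rule (not the patch's C code).
# Accepts \uXXXX (exactly four hex digits) and \u{X...} (one to six hex
# digits), rejects codepoints above 0x10FFFF, and tags the result as
# UTF-8 only if some escape produced a codepoint >= 0x80.
def expand_u_escapes(src)
  has_utf8 = false

  out = src.gsub(/\\u(?:\{(\h{1,6})\}|(\h{4}))?/) do
    digits = $1 || $2
    # A bare \u with no well-formed payload is an error.
    raise ArgumentError, "Invalid Unicode escape" if digits.nil?

    codepoint = digits.hex
    if codepoint > 0x10FFFF
      raise ArgumentError, "Illegal Unicode codepoint (too large)"
    end

    has_utf8 = true if codepoint >= 0x80
    [codepoint].pack("U")
  end

  # Assumption: src itself is ASCII, so US-ASCII is a valid tag when no
  # escape went past 0x7F.
  out.force_encoding(has_utf8 ? Encoding::UTF_8 : Encoding::US_ASCII)
end

ascii_result = expand_u_escapes('\u0079')   # single quotes keep the backslash
utf8_result  = expand_u_escapes('\u0080')
```

Here ascii_result.encoding is US-ASCII and utf8_result.encoding is UTF-8, which mirrors the has_utf8 flag that the patch threads through parser_tokadd_string into STR_NEW4.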
Attachments (1)
Index: parse.y
===================================================================
--- parse.y (revision 13739)
+++ parse.y (working copy)
@@ -237,6 +237,8 @@
int has_shebang;
int parser_ruby_sourceline; /* current line no. */
rb_encoding *enc;
+ rb_encoding *ascii;
+ rb_encoding *utf8;
#ifndef RIPPER
/* Ruby core only */
@@ -264,6 +266,7 @@
#define STR_NEW0() rb_enc_str_new(0,0,rb_enc_from_index(0))
#define STR_NEW2(p) rb_enc_str_new((p),strlen(p),parser->enc)
#define STR_NEW3(p,n,m) parser_str_new((p),(n),STR_ENC(!ENC_SINGLE(m)),(m))
+#define STR_NEW4(p,n,m,u) parser_str_new((p),(n),(u)?parser->utf8:(ENC_SINGLE(m)?parser->ascii:parser->enc), (m))
#define STR_ENC(m) ((m)?parser->enc:rb_enc_from_index(0))
#define ENC_SINGLE(cr) ((cr)==ENC_CODERANGE_SINGLE)
#define TOK_INTERN(mb) rb_intern3(tok(), toklen(), STR_ENC(mb))
@@ -4483,7 +4486,8 @@
# define yylval (*((YYSTYPE*)(parser->parser_yylval)))
static int parser_regx_options(struct parser_params*);
-static int parser_tokadd_string(struct parser_params*,int,int,int,long*,int*);
+static int parser_tokadd_string(struct parser_params*,int,int,int,long*,
+ int*, int*);
static int parser_parse_string(struct parser_params*,NODE*);
static int parser_here_document(struct parser_params*,NODE*);
@@ -4494,7 +4498,7 @@
# define read_escape(m) parser_read_escape(parser, m)
# define tokadd_escape(t,m) parser_tokadd_escape(parser, t, m)
# define regx_options() parser_regx_options(parser)
-# define tokadd_string(f,t,p,n,m) parser_tokadd_string(parser,f,t,p,n,m)
+# define tokadd_string(f,t,p,n,m,u) parser_tokadd_string(parser,f,t,p,n,m,u)
# define parse_string(n) parser_parse_string(parser,n)
# define here_document(n) parser_here_document(parser,n)
# define heredoc_identifier() parser_heredoc_identifier(parser)
@@ -4674,7 +4678,9 @@
}
}
- parser->enc = rb_enc_get(lex_input);
+ parser->enc = rb_enc_get(lex_input); /* encoding of source file */
+ parser->ascii = rb_enc_from_index(0); /* ASCII/binary */
+ parser->utf8 = rb_enc_find("utf-8"); /* UTF-8 */
ruby_sourcefile = rb_source_filename(f);
ruby_sourceline = line - 1;
parser_prepare(parser);
@@ -5110,6 +5116,75 @@
return 0;
}
+static void
+parser_tokadd_utf8(struct parser_params *parser, int *mb, int *has_utf8)
+{
+ int numlen, brace, codepoint;
+ brace = nextc();
+ if (brace == '{') { /* handle \u{...} form */
+ codepoint = scan_hex(lex_p, 6, &numlen);
+ if (numlen == 0) {
+ yyerror("Invalid Unicode escape");
+ return;
+ }
+ if (codepoint > 0x10ffff) {
+ yyerror("Illegal Unicode codepoint (too large)");
+ return;
+ }
+ lex_p += numlen;
+
+ if ((brace = nextc()) != '}') {
+ pushback(brace);
+ yyerror("Unterminated Unicode escape");
+ return;
+ }
+ }
+ else { /* handle \uxxxx form */
+ pushback(brace);
+ codepoint = scan_hex(lex_p, 4, &numlen);
+ if (numlen < 4) {
+ yyerror("Invalid Unicode escape");
+ return;
+ }
+ lex_p += 4;
+ }
+
+ if (codepoint < 0x80) { /* \u escape encoded ordinary ASCII char */
+ tokadd(codepoint);
+ }
+ else {
+ UChar buf[4];
+ int i, n;
+
+ /* Set flags so that the resulting string has correct encoding */
+ if (mb) *mb = ENC_CODERANGE_MULTI;
+ if (has_utf8) *has_utf8 = 1;
+
+ /* Convert codepoint to UTF-8 bytes */
+ n = rb_enc_mbcput(codepoint, buf, parser->utf8);
+ for(i=0; i < n; i++) tokadd(buf[i]);
+
+#if 0
+ if (codepoint < 0x800) { /* && codepoint >= 0x80 */
+ tokadd(((codepoint >> 6)&0x1f) | 0xC0);
+ tokadd((codepoint & 0x3F) | 0x80);
+ }
+ else if (codepoint < 0x10000) {
+ tokadd(((codepoint >> 12) & 0x0f) | 0xe0);
+ tokadd(((codepoint >> 6)&0x3f) | 0x80);
+ tokadd((codepoint & 0x3F) | 0x80);
+ }
+ else { /* codepoint < 0x110000 */
+ tokadd(((codepoint >> 18) & 0x07) | 0xf0);
+ tokadd(((codepoint >> 12) & 0x3f) | 0x80);
+ tokadd(((codepoint >> 6)&0x3f) | 0x80);
+ tokadd((codepoint & 0x3F) | 0x80);
+ }
+#endif
+ }
+}
+
+
static int
parser_regx_options(struct parser_params *parser)
{
@@ -5184,7 +5259,8 @@
static int
parser_tokadd_string(struct parser_params *parser,
- int func, int term, int paren, long *nest, int *mb)
+ int func, int term, int paren, long *nest,
+ int *mb, int *has_utf8)
{
int c;
@@ -5219,6 +5295,16 @@
if (func & STR_FUNC_ESCAPE) tokadd(c);
break;
+ case 'u':
+ if ((func & STR_FUNC_EXPAND) == 0) {
+ tokadd('\\');
+ break;
+ }
+ else {
+ parser_tokadd_utf8(parser, mb, has_utf8);
+ continue;
+ }
+
default:
if (func & STR_FUNC_REGEXP) {
pushback(c);
@@ -5267,7 +5353,7 @@
int func = quote->nd_func;
int term = nd_term(quote);
int paren = nd_paren(quote);
- int c, space = 0, mb = ENC_CODERANGE_SINGLE;
+ int c, space = 0, mb = ENC_CODERANGE_SINGLE, has_utf8 = 0;
if (func == -1) return tSTRING_END;
c = nextc();
@@ -5301,7 +5387,8 @@
tokadd('#');
}
pushback(c);
- if (tokadd_string(func, term, paren, &quote->nd_nest, &mb) == -1) {
+ if (tokadd_string(func, term, paren, &quote->nd_nest,
+ &mb, &has_utf8) == -1) {
if (func & STR_FUNC_REGEXP) {
ruby_sourceline = nd_line(quote);
compile_error(PARSER_ARG "unterminated regexp meets end of file");
@@ -5315,7 +5402,7 @@
}
tokfix();
- set_yylval_str(STR_NEW3(tok(), toklen(), mb));
+ set_yylval_str(STR_NEW4(tok(), toklen(), mb, has_utf8));
return tSTRING_CONTENT;
}
@@ -5479,6 +5566,7 @@
}
else {
int mb = ENC_CODERANGE_SINGLE, *mbp = &mb;
+ int has_utf8 = 0;
newtok();
if (c == '#') {
switch (c = nextc()) {
@@ -5493,16 +5581,18 @@
}
do {
pushback(c);
- if ((c = tokadd_string(func, '\n', 0, NULL, mbp)) == -1) goto error;
+ if ((c = tokadd_string(func, '\n', 0, NULL,
+ mbp, &has_utf8)) == -1)
+ goto error;
if (c != '\n') {
- set_yylval_str(STR_NEW3(tok(), toklen(), mb));
+ set_yylval_str(STR_NEW4(tok(), toklen(), mb, has_utf8));
return tSTRING_CONTENT;
}
tokadd(nextc());
if (mbp && mb == ENC_CODERANGE_UNKNOWN) mbp = 0;
if ((c = nextc()) == -1) goto error;
} while (!whole_match_p(eos, len, indent));
- str = STR_NEW3(tok(), toklen(), mb);
+ str = STR_NEW4(tok(), toklen(), mb, has_utf8);
}
heredoc_restore(lex_strterm);
lex_strterm = NEW_STRTERM(-1, 0, 0);