[ruby-talk:10395] Re: Structured text matching?

From: Ernest Ellingson <erne@...>
Date: 2001-02-06 01:58:53 UTC
List: ruby-talk #10395
But Dave, no one will say "You look Marvellous!"

At 09:05 2/6/2001 +0900, you wrote:
>schuerig@acm.org (Michael Schuerig) writes:
>
> > > Please note that the samples provided assume that the start and end tags
> > > appear in the same string (that is, on the same line in an HTML file).
> >
> > That's exactly the restriction I'd like to avoid...
> >
> > I haven't looked into it, but I'm sure it's possible to redefine the
> > input record separator, slurp a complete file into a string and match a
> > regex against that string.
>
>str = File.open("x.html") {|f| f.read}
>str =~ /.../m
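
To make the /m point concrete, here is a minimal sketch of matching a title that spans several lines once the text is in one string. The method name extract_title and the sample input are made up for illustration; in practice the string would come from the File.open call quoted above.

```ruby
# Hypothetical helper: pull the <title> text out of an already-slurped string.
def extract_title(html)
  # With the /m option "." also matches newlines, so the match may cross
  # line boundaries. The non-greedy ".*?" stops at the first </title>.
  title = html[%r{<title.*?>(.*?)</title>}m, 1]
  title && title.strip
end

# Made-up sample where the title element is spread over three lines.
sample = "<html>\n<head>\n<title>\n  Structured text matching\n</title>\n</head>\n</html>\n"
puts extract_title(sample)
```

Without the /m option the same pattern finds nothing in this sample, because "." stops at each newline.
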
>
> > This very much goes against my sense of aesthetics. There's no need
> > to read in the file beyond a successful match, and there's no need
> > to read further when an orphaned </title> or a </head> tag is
> > encountered.
>
>All true, but at the same time, if you can do it in two lines rather
>than writing a full parser, isn't there some compensating gain to be
>had?
>
>I've used a technique for a while now to convert structured files from
>one form to another.
>
>1. Slurp the whole file in
>2. Convert escaped characters into something distinct so they are no
>    longer involved in processing.
>3. Match delimiters (for example braces in LaTeX, and <>'s in
>    HTML). This is where you take account of strings, commands and the
>    like.
>4. Perform a series of substitutions which match the command pattern
>    and any arguments. The name of the command is then used either to
>    look up a hash, or as the name of a method to call. The results of
>    all this then get substituted back into the buffer.
>
>It sounds messy, but the reality is that it works, and is a whole lot
>simpler than doing the full parse (particularly for non-regular
>languages such as LaTeX).
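
A rough sketch of how the four steps above might look for a tiny LaTeX-like input. Everything here is assumed for illustration and is not Dave's actual code: the names convert and command_table, the two commands emph and bold, and the use of "\x01" and "\x02" as stand-ins for escaped braces (which assumes those bytes never occur in the input).

```ruby
# Hypothetical command table: command name => lambda producing the replacement.
def command_table
  {
    "emph" => lambda { |arg| "<em>#{arg}</em>" },
    "bold" => lambda { |arg| "<b>#{arg}</b>" }
  }
end

# Step 1 (slurping) is assumed to have happened already: text is the whole file.
def convert(text)
  esc_open  = "\x01"   # placeholder for an escaped "{"
  esc_close = "\x02"   # placeholder for an escaped "}"

  # Step 2: hide escaped braces so they play no part in delimiter matching.
  buf = text.gsub("\\{", esc_open).gsub("\\}", esc_close)

  # Steps 3 and 4: find an innermost \name{arg} (arg contains no braces),
  # look the name up in the table and substitute the result back into the
  # buffer. Repeating until nothing changes resolves nesting inside-out.
  table = command_table
  loop do
    changed = buf.gsub!(/\\(\w+)\{([^{}]*)\}/) do
      name, arg = $1, $2
      handler = table[name]
      handler ? handler.call(arg) : arg   # unknown command: keep only its argument
    end
    break unless changed
  end

  # Put the escaped characters back as literal braces.
  buf.gsub(esc_open, "{").gsub(esc_close, "}")
end

puts convert("a \\emph{b \\bold{c}} \\{d\\}")
```

Using the command name as a method name instead of a hash key would work the same way, with send in place of the table lookup.
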
>
>
>For your particular example, if I was worried about the potential size
>of reading in the whole file, I might just read in the first (say) 2k,
>and quickly check for </head>. If I didn't find it, I'd read another
>2k until I did.
>
>
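>    def findTitle(file)
>       str = ''
>       loop do
>         begin
>           # read the next 2k chunk; sysread raises EOFError at end of file
>           str << file.sysread(2048)
>           puts "next"   # debug trace, one line per chunk read
>         rescue EOFError
>           raise "</head> not found in file"
>         end
>         # stop once the whole head section is in the buffer; note that
>         # %r{...} is needed here, since %{...} is a plain string literal
>         break if str =~ %r{</head>}
>       end
>
>       return $1 if str =~ %r{<head.*?>.*?<title.*?>(.*?)</title>.*?</head>}m
>
>       raise "Couldn't find title in file"
>    end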
>
>    title = findTitle(File.open("test.html"))
>    puts title
>
>Can't say as I've tested this, but it _might_ work ;-)
>
>
>Dave
