Ruby + XML Proposal

From: Bryan Murphy <bryan@...>
Date: 2002-02-01 08:13:29 UTC
List: ruby-talk #32948
The following is a sample application that will be included with the next revision 
of the Ruby Publication Framework (which will hopefully be released sometime early 
next week):

  require 'rexml/document'
  require 'RPF/console'

  serializer = Framework::Serializer::REXMLSerializer.new()
  parser     = Framework::XML::FrameworkParser.new(serializer)

  parser.register(Framework::Transformer::XIncludeTransformer.new())

  parser.parse(File.new('default.xml'))

  doc = serializer.document

  # The rest of this is 100% REXML code
  print "doc is an instance of: #{doc.class}\n\n"
  doc.write $stdout
  print "\n"

This application loads an XML file (default.xml) from disk, and then builds a REXML 
DOM object from a SAX2-like stream of events.  In and of itself, this isn't a very 
interesting way to create a REXML document.  However, in the middle of the 
application is the following line:

  parser.register(Framework::Transformer::XIncludeTransformer.new())

This line registers a class that implements the W3C XInclude specification.  This 
class will receive ALL events that exist in the XInclude namespace, load any 
specified XML documents, and insert their content into the middle of the XML 
stream!

So, what I've effectively done is add XInclude support to REXML.  But the fun 
doesn't stop there!

Because of the way the framework is built, events are dispatched to components 
based on their namespaces (events that aren't in any particular namespace are sent 
directly to the Serializer, which sits at the end of the chain).  Because each 
event goes back to the same dispatcher, if XML document A includes XML document B, 
and XML document B includes XML document C, then XML document A picks up XML 
document C as part of XML document B without any extra work on your part (as long 
as they all agree on the same include standard and you register the transformer 
that implements it).
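
To make the dispatch idea concrete, here is a minimal sketch of routing events by 
namespace.  It is not the RPF code: Dispatcher, RecordingSerializer, 
ToyIncludeTransformer, namespace_uri and the 'urn:example:include' namespace are 
all made-up names for illustration, and the "documents" are just in-memory lists of 
events rather than files.

```ruby
# Hypothetical sketch of namespace-based dispatch (not the RPF implementation).
class Dispatcher
  def initialize(serializer)
    @serializer = serializer # end of the chain
    @handlers   = {}         # namespace URI => transformer
  end

  def register(transformer)
    @handlers[transformer.namespace_uri] = transformer
    transformer.dispatcher = self
  end

  # Element events go to the transformer registered for their namespace;
  # anything unclaimed falls through to the serializer.
  def startElement(uri, localName, qName, attrs)
    @handlers.fetch(uri, @serializer).startElement(uri, localName, qName, attrs)
  end

  def endElement(uri, localName, qName)
    @handlers.fetch(uri, @serializer).endElement(uri, localName, qName)
  end

  def characters(ch, start, length)
    @serializer.characters(ch, start, length)
  end
end

# Stand-in for a serializer: it only records what reaches the end of the chain.
class RecordingSerializer
  attr_reader :events

  def initialize
    @events = []
  end

  def startElement(uri, localName, qName, attrs)
    @events << [:start, qName]
  end

  def endElement(uri, localName, qName)
    @events << [:end, qName]
  end

  def characters(ch, start, length)
    @events << [:text, ch[start, length]]
  end
end

# Toy include: replays the events of the named document through the SAME
# dispatcher, so an include inside an included document is expanded too.
class ToyIncludeTransformer
  NAMESPACE = 'urn:example:include'
  attr_accessor :dispatcher

  def initialize(documents)
    @documents = documents # name => array of [event_name, *args]
  end

  def namespace_uri
    NAMESPACE
  end

  def startElement(uri, localName, qName, attrs)
    @documents.fetch(attrs['name']).each { |event| dispatcher.public_send(*event) }
  end

  def endElement(uri, localName, qName); end
end

ns = ToyIncludeTransformer::NAMESPACE
documents = {
  'b' => [[:startElement, ns, 'include', 'inc:include', { 'name' => 'c' }],
          [:endElement, ns, 'include', 'inc:include']],
  'c' => [[:characters, 'from C', 0, 6]]
}

serializer = RecordingSerializer.new
dispatcher = Dispatcher.new(serializer)
dispatcher.register(ToyIncludeTransformer.new(documents))

# Document A includes B, and B in turn includes C.
dispatcher.startElement(nil, 'a', 'a', {})
dispatcher.startElement(ns, 'include', 'inc:include', { 'name' => 'b' })
dispatcher.endElement(ns, 'include', 'inc:include')
dispatcher.endElement(nil, 'a', 'a')

p serializer.events # => [[:start, "a"], [:text, "from C"], [:end, "a"]]
```

The include elements never reach the serializer; only the content of C does, 
because the nested include went back through the same dispatcher.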

I can register as many of these transformers as I feel necessary.  For instance, 
what if I want to include a SQL statement in my XML document, execute the SQL 
statement, and transform its results into part of the XML document?  Just write a 
new transformer and register it.  What if I want a looping construct to repeat a 
part of the XML document a number of times?  Create another transformer.  Or how 
about an XSL transformer that transforms only a part of the XML document, creates 
an XInclude tag, and then the XInclude tag gets expanded out to contain the 
contents of the included document?  Very easy to implement!

What the framework essentially gives you is the ability to create smart tags for 
your XML documents that behave a lot like Java Tag Libraries, but are completely 
separate from web applications!!  But there's more:

The framework also has the ability to get the input stream of events from any 
number of sources.  Maybe you don't want to go through the trouble of creating a 
component that implements a custom set of tags, and want to generate the XML 
document a bit more programmatically.  With the framework, you can swap out the 
default generator, which reads XML documents from disk, for another one.  The first 
alternate generator already built into the framework uses ERuby to generate XML 
documents.  You could put all your logic into an ERuby XML document, yet still use 
the framework to implement XInclude support (or any other transformers which may be 
available):

  generator = Framework::Generator::ERubyGenerator.new()
  generator.source = "default.erb"
  parser.generator = generator

Or maybe you have a REXML document already loaded in memory and want to run that 
through the processor.  Easy enough, use the included REXMLGenerator which takes a 
REXML DOM document and converts it into a stream of events:
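
For readers wondering what "converts it into a stream of events" might look like, 
here is a rough sketch that walks a REXML tree and fires SAX2-like events at a 
handler.  ToyREXMLWalker and EventLog are hypothetical names, not the RPF's 
REXMLGenerator, and the sketch only covers elements, text, comments and processing 
instructions.

```ruby
require 'rexml/document'

# Hypothetical sketch: turn a REXML tree into a stream of events.
class ToyREXMLWalker
  def initialize(handler)
    @handler = handler
  end

  def walk(document)
    @handler.startDocument
    document.children.each { |node| emit(node) }
    @handler.endDocument
  end

  private

  def emit(node)
    case node
    when REXML::Element
      uri = node.namespace
      uri = nil if uri.nil? || uri.empty? # no namespace => nil
      attrs = {}
      # Namespace declarations stay in the attrs hash, as in the proposal.
      node.attributes.each_attribute { |a| attrs[a.expanded_name] = a.value }
      @handler.startElement(uri, node.name, node.expanded_name, attrs)
      node.children.each { |child| emit(child) }
      @handler.endElement(uri, node.name, node.expanded_name)
    when REXML::Text
      text = node.value
      @handler.characters(text, 0, text.length)
    when REXML::Comment
      text = node.string
      @handler.comment(text, 0, text.length)
    when REXML::Instruction
      @handler.processingInstruction(node.target, node.content)
    end
  end
end

# Stand-in handler that records every event it is sent.
class EventLog
  attr_reader :events

  def initialize
    @events = []
  end

  def method_missing(name, *args)
    @events << [name, *args]
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

doc = REXML::Document.new('<a xmlns:x="urn:x"><x:b id="1">hi</x:b></a>')
log = EventLog.new
ToyREXMLWalker.new(log).walk(doc)
log.events.each { |event| p event }
```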

  generator = Framework::Generator::REXMLGenerator.new()
  generator.source = my_rexml_obj
  parser.generator = generator

The RPF also includes a few other nifty abilities.  One of those is the ability to 
explicitly apply Transformers that capture *ALL* streamed events.  A stylesheet for 
the whole XML document is an example of when you'd need one.  You can have an XML 
document on disk, and then use the built-in SablotronTransformer to apply a 
stylesheet to the XML document before you load it into the REXML DOM:

  xsl = Framework::Transformer::SablotronTransformer.new()
  xsl.source = 'default.xsl'
  parser.addTransformer(xsl)

These transformers are applied in the order they are added (you can add as many as 
you want), and you can control whether the namespace-based transformers are 
dispatched in between them or not.
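
Why the order matters is easy to show with a toy chain.  This is only a sketch 
under my own assumptions: Pipeline, add_transformer, Stage, Upcase, Suffix and Sink 
are hypothetical names, and only the characters event is handled.

```ruby
# Hypothetical sketch: whole-stream transformers applied in the order added.
class Stage
  attr_accessor :downstream

  # Default behaviour: pass the event along unchanged.
  def characters(ch, start, length)
    downstream.characters(ch, start, length)
  end
end

class Upcase < Stage
  def characters(ch, start, length)
    text = ch[start, length].upcase
    downstream.characters(text, 0, text.length)
  end
end

class Suffix < Stage
  def initialize(suffix)
    @suffix = suffix
  end

  def characters(ch, start, length)
    text = ch[start, length] + @suffix
    downstream.characters(text, 0, text.length)
  end
end

# End of the chain: collects whatever text arrives.
class Sink
  attr_reader :text

  def initialize
    @text = String.new
  end

  def characters(ch, start, length)
    @text << ch[start, length]
  end
end

class Pipeline
  def initialize(sink)
    @sink   = sink
    @stages = []
  end

  # Each new stage is spliced in just before the sink.
  def add_transformer(stage)
    @stages.last.downstream = stage unless @stages.empty?
    stage.downstream = @sink
    @stages << stage
    self
  end

  def characters(ch, start, length)
    (@stages.first || @sink).characters(ch, start, length)
  end
end

first = Sink.new
Pipeline.new(first).add_transformer(Upcase.new).add_transformer(Suffix.new('-end'))
        .characters('abc', 0, 3)
puts first.text # => ABC-end

second = Sink.new
Pipeline.new(second).add_transformer(Suffix.new('-end')).add_transformer(Upcase.new)
        .characters('abc', 0, 3)
puts second.text # => ABC-END
```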

The best part about this is that since the dispatcher only dispatches the events 
that a transformer will recognize, this greatly simplifies transformer 
development!  In fact, the XInclude transformer (which admittedly isn't 100% 
compliant) is only about 20 lines of code right now, and most of that is for 
loading the XML document and creating a new XML parser to originate the new event 
stream.

Finally, the framework adds infrastructure for building dynamic Web applications 
with this stuff by changing the pipeline on the fly via an XML-based sitemap.  
That, of course, is a discussion all on its own.  ;)

Now, you ask, why am I posting this instead of just releasing the newest code?  
Well, I'm at a bit of a crossroads.  The REXML stuff is a good example.  The next 
version of the framework will have integrated support for cooperating with REXML, 
but I want to go a step further.  I want to cooperate not just with REXML, but 
with *ALL* other Ruby XML stuff out there (NQXML, XMLParser, XSLT4R, you name it).

There are two ways to accomplish this.  The quick, easy, short-term method: build 
the support into the framework.  The harder but more rewarding long-term method: 
build the support directly into the corresponding Ruby libraries.

An ideal example would be to abstract away the REXML(Generator|Serializer) and 
replace them with a Stream(Generator|Serializer).  The stream-based components 
would work explicitly with the SAX2-like stream of events.  The code that generates 
the stream from a REXML document would then be moved into REXML, and the code which 
generates a REXML document from a stream would be moved into REXML as well.  All 
other libraries could implement similar code, and then plug in directly with the 
Stream components in the framework.  Over time, the actual implementations 
themselves could become less hacky and use real stream-based implementations for 
additional speed benefits.  A side effect is that you would then be able to connect 
a REXML stream generator to an NQXML document creator and convert a REXML document 
into an NQXML document (and vice versa) as efficiently as possible and in a 
standardized way (or do the same with any other conformant library).
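
As a rough illustration of the "document from a stream" half, the sketch below 
builds a REXML document from the same kind of events.  ToyREXMLBuilder is a 
hypothetical name, not code from REXML or the RPF, and it ignores comments, 
processing instructions and namespace bookkeeping.

```ruby
require 'rexml/document'

# Hypothetical sketch: build a REXML document from SAX2-like events.
class ToyREXMLBuilder
  attr_reader :document

  def startDocument
    @document = REXML::Document.new
    @stack    = [@document] # the node currently being filled is on top
  end

  def startElement(uri, localName, qName, attrs)
    @stack.push(@stack.last.add_element(qName, attrs))
  end

  def characters(ch, start, length)
    @stack.last.add_text(ch[start, length])
  end

  def endElement(uri, localName, qName)
    @stack.pop
  end

  def endDocument; end
end

builder = ToyREXMLBuilder.new
builder.startDocument
builder.startElement(nil, 'a', 'a', {})
builder.startElement(nil, 'b', 'b', { 'id' => '1' })
builder.characters('hi', 0, 2)
builder.endElement(nil, 'b', 'b')
builder.endElement(nil, 'a', 'a')
builder.endDocument

builder.document.write($stdout)
print "\n"
```

Any library that can consume these events could sit where ToyREXMLBuilder sits, 
which is the point of agreeing on the event interface.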

I can provide the initial code for the various XML libraries (and will have the 
starting REXML code ready early next week).  But to really do this well, we all 
need to agree on a streaming model for Ruby.  Below is my proposal for what this 
model should be.  This is more or less how my framework is implemented at the 
moment.  Below it is some commentary on why certain things are as they are.  I'd 
like for us to come to an agreement (if at all possible) before I get any further 
into building the RPF framework, as the more code I write, the harder it will be to 
implement any major changes:

  module DocumentHandler

    # Contains unencoded character content (needs to be encoded when inserted into
    # the XML stream so it doesn't screw up the final XML document).  ch is a
    # string, and start and length are used so that ch can actually be a substring
    # of a larger string (avoids some unnecessary string copying).
    def characters(ch, start, length); end

    # Similar to characters, however contains the contents of a comment *excluding*
    # the <!-- begin and --> end tags.
    def comment(ch, start, length); end

    # Notifies the end of an XML stream.
    def endDocument(); end

    # Notifies the end of an XML element.  uri is the namespace that the element
    # exists in (nil if there is none).  localName is the name of the XML tag.
    # qName is the fully qualified name (i.e. if there was a namespace qualifier,
    # it would contain "qualifier:#{localName}", otherwise it would be just
    # "#{localName}").
    def endElement(uri, localName, qName); end

    # Passes ignorable whitespace the same way characters are passed.
    def ignorableWhitespace(ch, start, length); end

    # Passes a processing instruction (excluding the <? begin and ?> end tags).
    def processingInstruction(target, data); end

    # This one is open for debate; I never liked it anyway.  Passes an object that
    # provides a means to get information about the current position within the
    # current XML document.
    def setDocumentLocator(locator); end

    # Receives the name of an entity that the previous XML stream couldn't
    # figure out.  For instance, if the XML document contained an &nbsp;
    # entity reference, this would receive a string containing: 'nbsp'
    def skippedEntity(name); end

    # Notifies the beginning of an XML stream.
    def startDocument(); end

    # Notifies the start of an XML element.  This *INCLUDES* namespace entries.
    # All the params are the same as for endElement, and attrs is a Hash that
    # contains all the attributes (including any namespace entries) for this tag.
    def startElement(uri, localName, qName, attrs); end
  end
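
To show how a consumer might use this interface, here is a small sketch of a 
handler that overrides only the events it cares about.  OutlinePrinter is a 
hypothetical class, and the module is redefined with no-op defaults only so that 
the example runs on its own; note how ch[start, length] picks the text out of a 
larger buffer.

```ruby
# No-op defaults for every event in the proposed interface, so a handler only
# has to override what it needs (a stand-in for the module proposed above).
module DocumentHandler
  %i[characters comment endDocument endElement ignorableWhitespace
     processingInstruction setDocumentLocator skippedEntity
     startDocument startElement].each do |event|
    define_method(event) { |*_args| }
  end
end

# Hypothetical handler: records an indented outline of the document.
class OutlinePrinter
  include DocumentHandler
  attr_reader :lines

  def initialize
    @lines = []
    @depth = 0
  end

  def startElement(uri, localName, qName, attrs)
    @lines << "#{'  ' * @depth}#{qName}"
    @depth += 1
  end

  def endElement(uri, localName, qName)
    @depth -= 1
  end

  def characters(ch, start, length)
    text = ch[start, length].strip # only the slice belongs to this event
    @lines << "#{'  ' * @depth}#{text.inspect}" unless text.empty?
  end
end

buffer  = '<doc><title>Hello</title></doc>'
handler = OutlinePrinter.new
handler.startDocument
handler.startElement(nil, 'doc', 'doc', {})
handler.startElement(nil, 'title', 'title', {})
handler.characters(buffer, 12, 5) # "Hello" sits at offset 12 of the buffer
handler.endElement(nil, 'title', 'title')
handler.endElement(nil, 'doc', 'doc')
handler.endDocument

puts handler.lines
```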

Also, do we need to standardize on the ErrorHandler interface as well, or can we 
simply rely on exceptions?   (Personally I'd choose exceptions).

The major difference between this model and the XMLParser streaming model is that 
this model formalizes namespace support and greatly simplifies working with 
namespaces.

The differences between this model and the SAX2 model are:  

1. I've removed the startPrefixMapping and endPrefixMapping events, and moved 
namespace declarations back into the attributes where they are in the XML 
document.  For the life of me I can't figure out why SAX2 implemented them this 
way.  The RPF framework originally used them, but I found it much simpler to move 
them into the attrs hash.

2. Attributes are a hash, and not an object (though this is, like everything else, 
open for discussion).  I'm of the opinion that if you need to worry about attribute 
namespaces, you should check the attribute names explicitly using regexps.  There 
are definitely good reasons to use a programmatic interface, however.  Perhaps we 
can have the best of both worlds by using a hash-like object with extra 
functionality (Ruby seems to be good at giving us the best of both worlds).
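
One possible shape for such a hash-like object, purely as a discussion aid: Attrs 
and its three helper methods are hypothetical names, nothing that exists in the RPF 
or in any XML library.

```ruby
# Hypothetical sketch: a Hash subclass that still behaves like a plain hash of
# attribute name => value, but adds a few namespace-aware helpers.
class Attrs < Hash
  # prefix => URI for every namespace declaration carried in the attributes;
  # the default namespace (a bare "xmlns") is stored under the key nil.
  def namespace_declarations
    each_with_object({}) do |(name, value), decls|
      if name == 'xmlns'
        decls[nil] = value
      elsif name.start_with?('xmlns:')
        decls[name.sub('xmlns:', '')] = value
      end
    end
  end

  # The attributes without the namespace declarations.
  def plain
    reject { |name, _value| name == 'xmlns' || name.start_with?('xmlns:') }
  end

  # Attributes carrying the given prefix, keyed by their local names.
  def with_prefix(prefix)
    each_with_object({}) do |(name, value), found|
      pre, local = name.split(':', 2)
      found[local] = value if local && pre == prefix
    end
  end
end

attrs = Attrs['xmlns:xi' => 'http://www.w3.org/2001/XInclude',
              'xi:href'  => 'b.xml',
              'id'       => 'main']

p attrs['id']                  # ordinary hash access still works
p attrs.namespace_declarations
p attrs.plain
p attrs.with_prefix('xi')
```

Code that only wants a hash can keep treating attrs as one, while namespace-aware 
code gets helpers instead of hand-written regexps.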

Finally, the differences between this and other currently available streaming 
models:

1. Namespaces are supported.

2. We'll soon have a lot of infrastructure to coincide with this format (if you use 
my stuff anyway ;)

3. A standard format that crosses XML library boundaries will give us a lot of 
flexibility we just don't have right now.

That's my proposal.  I'd like to hear your comments on it.  I'll donate any code I 
create (and some of my time) that can be used to help other projects along.  If we 
can come up with a standard, I'll create a writeup for the standard and add it to 
the Ruby Garden Wiki.

Finally, if you think all of this is just a load of hot air, I want you to think 
about how you use IO streams.  What do you do if you need to get at the contents of 
a compressed file?  You do the following:

  contents = TarArchive.new(File.new('archive.tar.gz')).read()
  puts contents

This works because we all agree on what an IO stream is supposed to look like.  We 
can have this same kind of synergy when working with XML; we just need to come up 
with a standard and stick with it ;)
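
The TarArchive line above reads as an illustration rather than a real class.  The 
same layering can be tried out with the standard library's Zlib::GzipReader, which 
wraps any IO-like object; the sketch below uses an in-memory StringIO instead of a 
file, and gzip_roundtrip is a made-up helper name.

```ruby
require 'stringio'
require 'zlib'

# Compress text into an in-memory IO, then read it back through a reader
# stacked on top of that IO.
def gzip_roundtrip(text)
  compressed = StringIO.new
  writer = Zlib::GzipWriter.new(compressed)
  writer.write(text)
  writer.finish # flush the gzip trailer but leave the StringIO open

  compressed.rewind
  Zlib::GzipReader.new(compressed).read
end

puts gzip_roundtrip("hello, streams\n")
```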

Thanks for reading all this!
Bryan
