[#6828] bug in mailread.rb, and: proposal for Mail#to_s — Wybo Dekker <wybo@...>
Re: bug in mailread.rb, and: proposal for Mail#to_s
On Tue, 6 Dec 2005, Yukihiro Matsumoto wrote:
> In message "Re: bug in mailread.rb, and: proposal for Mail#to_s"
> on Sun, 4 Dec 2005 21:46:44 +0900, Wybo Dekker <wybo@servalys.nl> writes:
>
> |mailread separates mail messages looking for /^From /.
> |This is incorrect, because message bodies may contain lines beginning with
> |
> |From like this mail does. So the regexp should be, I think, something
> |like: /^From .*? \w{3} \w{3} [\d ]{2} \d\d:\d\d:\d\d \d{4}/
>
> Interesting. But I have seen wide range of variety of From line format.
> Does this good enough for all of them?
mbox(5) says: A postmark line consists of the four characters "From",
followed by a space character, followed by the message's envelope sender
address, followed by whitespace, and followed by a time stamp. The sender
address is expected to be an addrspec as defined in appendix D of RFC 822.
In the sources of pine (ftp://ftp.cac.washington.edu/pine/pine-4.64-1.src.rpm)
a FAQ (attached) addresses this problem, especially
the time stamp format, which should be ctime's format. The regexp above
matches that.
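
As a quick check of the claim above, here is a minimal sketch that runs the proposed regexp against two postmark lines and one body line that merely starts with "From ". The helper name postmark? and the sample lines are only illustrative; they are not part of mailread.rb.

```ruby
# Postmark pattern proposed in this thread (ctime-style time stamp).
POSTMARK = /^From .*? \w{3} \w{3} [\d ]{2} \d\d:\d\d:\d\d \d{4}/

# Hypothetical helper, not part of mailread.rb.
def postmark?(line)
  !!(POSTMARK =~ line)
end

samples = [
  "From wybo@servalys.nl Sun Sep 26 13:20:51 2004 +0200",  # two-digit day
  "From wybo@servalys.nl Tue Sep  7 13:20:51 2004",        # space-padded day
  "From like this mail does. So the regexp should be, I think, something",
]

samples.each { |line| puts "#{postmark?(line)}  #{line}" }
```

The first two lines should be reported as postmarks and the third as ordinary body text. Formats the pine FAQ calls "ancient" (optional seconds, timezone before the year) would not match this pattern.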
> |I also propose a to_s method, which converts the Mail object to a
> |(possibly edited) copy of the original mail message.
>
> I think to_s is not sufficient for string representation of whole mail
> body. It's just too long. I agreed to add a new method to do this
> work. Any name suggestion?
Mail#assemble ?
reassemble ?
rebuild ?
I have attached a new version which
- has assemble instead of to_s
- retains the original /^From / line instead of generating a new one
- uses line.chomp instead of line.chop
- and $/ instead of "\n"
- has more comments (mostly from the Pickaxe)
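
To illustrate the chomp/chop item in the list above, a small sketch (the sample strings are made up): chop always drops the last character, while chomp only drops a trailing record separator, so the last line of a file without a final newline survives intact.

```ruby
# String#chop always removes the last character;
# String#chomp only removes a trailing record separator ("\n" by default).
with_newline    = "Subject: test\n"
without_newline = "Subject: test"   # e.g. last line of a file lacking a final newline

p with_newline.chop       # "Subject: test"
p with_newline.chomp      # "Subject: test"
p without_newline.chop    # "Subject: tes"   (a real character is lost)
p without_newline.chomp   # "Subject: test"
```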
--
Wybo
Attachments (2)
From pine4.64/imap/docs/FAQ.html#6.12 in the linux sources of pine (ftp://ftp.cac.washington.edu/pine/pine-4.64-1.src.rpm):

6.12 Why are you so fussy about the date/time format in the internal "From " line in traditional UNIX mailbox files? My other mail program just considers every line that starts with "From " to be the start of the message.

You just answered your own question. If any line that starts with "From " is treated as the start of a message, then every message text line which starts with "From " has to be quoted (typically by prefixing a ">" character). People complain about this -- "why did a > get stuck in my message?"

So, good mail reading software only considers a line to be a "From " line if it follows the actual specification for a "From " line. This means, among other things, that the day of week is fixed-format: "May 14", but "May  7" (note the extra space) as opposed to "May 7". ctime() format for the date is the most common, although POSIX also allows a numeric timezone after the year. For compatibility with ancient software, the seconds are optional, the timezone may appear before the year, the old 3-letter timezones are also permitted, and "remote from xxx" may appear after the whole thing.

Unfortunately, some software written by novices use other formats. The most common error is to have a variable-width day of month, perhaps in the erroneous belief that RFC 2822 (or RFC 822) defines the format of the date/time in the "From " line (it doesn't; no RFC describes internal formats). I've seen a few other goofs, such as a single-digit second, but these are less common.

If you are writing your own software that writes mailbox files, and you really aren't all that savvy with all the ins and outs and ancient history, you should seriously consider using the c-client library (e.g. routine mail_append()) instead of doing the file writes yourself.

If you must do it yourself, use ctime(), as in:

    fprintf (mbx,"From %s@%h %s",user,host,ctime (time (0)));

rather than try to figure out a good format yourself. ctime() is the most traditional format and nobody will flame you for using it.
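
For comparison, the same advice can be sketched in Ruby, where Time#ctime produces the fixed-width ctime layout (without the trailing newline that C's ctime() appends). The helper name postmark_line and the address are hypothetical.

```ruby
# Hypothetical helper: build an mbox postmark line for a sender and a time.
# Time#ctime yields e.g. "Tue Sep  7 13:20:51 2004" (day of month space-padded).
def postmark_line(sender, time = Time.now)
  "From #{sender} #{time.ctime}"
end

line = postmark_line("user@example.com", Time.utc(2004, 9, 7, 13, 20, 51))
puts line   # From user@example.com Tue Sep  7 13:20:51 2004
```

Note the two spaces before the single-digit day, which is exactly the fixed-width detail the FAQ insists on.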
#
# mailread.rb - basic parsing for mbox e-mail message files
#
# Class +Mail+ provides basic parsing for mbox e-mail messages.
# It can read an individual message from a named file, or it can be
# called repeatedly to read messages from a stream on an opened mbox
# format file. Each +Mail+ object represents a single e-mail message,
# which is split into a header and a body. The body is an array of
# lines, and the header is a hash indexed by header field name. +Mail+
# correctly joins multiline headers.
class Mail
  @@from = ''

  # read a new mail message from an mbox mail file
  def initialize(mbox)
    unless defined? mbox.gets
      mbox = open(mbox, "r")
      opened = true
    end
    @header = {}
    @body = []
    @from = @@from   # From-line stored from previous Mail#new call
    begin
      while line = mbox.gets()
        line.chomp!
        if /^From /=~line   # save From-line
          @from = line
          next
        end
        break if /^$/=~line   # end of header
        if /^(\S+?):\s*(.*)/=~line
          (attr = $1).capitalize!
          @header[attr] = $2
        elsif attr
          # continuation line: append to the previous header field
          line.sub!(/^\s*/, '')
          @header[attr] += $/ + line
        end
      end
      return unless line
      while line = mbox.gets()
        # From wybo@servalys.nl Sun Sep 26 13:20:51 2004 +0200
        if /^From .*? \w{3} \w{3} [\d ]{2} \d\d:\d\d:\d\d \d{4}/=~line
          @@from = line.chomp   # save From-line for next Mail#new call
          break
        end
        @body.push(line)
      end
    ensure
      mbox.close if opened
    end
  end

  # return the header as a hash with header field names as keys and
  # header field contents as values. The values for fields with
  # continuation lines contain multiple lines.
  def header
    return @header
  end

  # return the body as an array of lines
  def body
    return @body
  end

  # return a header field
  def [](field)
    @header[field.capitalize]
  end

  # Return the message as a string, ready to print, including its "From " line.
  # Header fields will appear sorted, but +Date+, +From+, +Subject+ and +To+
  # go in front; note that of fields that occurred multiple times in the
  # original (like +Received+), only the last one will be reproduced here
  def assemble
    prior = %w{Date From Subject To}
    s = @from + $/
    self.header.keys.sort { |a,b|
      if prior.index(a)
        prior.index(b) ? a <=> b : -1
      elsif prior.index(b)
        1
      else
        a <=> b
      end
    }.each { |k|
      s << "#{k}: #{self.header[k].gsub(/#{$/}/,$/+"\t")}" + $/
    }
    s << $/
    s << self.body.join
  end
end
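
The header ordering used by assemble can be tried in isolation. The sketch below copies the comparison block into a standalone method; the name header_order and the sample field names are assumptions made for illustration only.

```ruby
PRIOR = %w{Date From Subject To}

# Hypothetical standalone version of the comparison used in Mail#assemble:
# fields listed in PRIOR come first (alphabetically among themselves),
# all remaining fields follow in alphabetical order.
def header_order(keys)
  keys.sort { |a, b|
    if PRIOR.index(a)
      PRIOR.index(b) ? a <=> b : -1
    elsif PRIOR.index(b)
      1
    else
      a <=> b
    end
  }
end

p header_order(%w{To Received Cc Subject Date Message-id From})
# ["Date", "From", "Subject", "To", "Cc", "Message-id", "Received"]
```

The priority fields end up in alphabetical order, not in the order they are listed in the array, because the block compares their names and not their positions.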