[ruby-core:82349] [Ruby trunk Bug#13806] StringIO encoding conversion
From: loic.nageleisen@...
Date: 2017-08-11 13:12:41 UTC
List: ruby-core #82349
Issue #13806 has been reported by lloeki (Loic Nageleisen).
----------------------------------------
Bug #13806: StringIO encoding conversion
https://bugs.ruby-lang.org/issues/13806
* Author: lloeki (Loic Nageleisen)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: ruby 2.4.1p111 (2017-03-22 revision 58053) [x86_64-darwin16]
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
StringIO's doc page says:
> Pseudo I/O on String object.
>
> Commonly used to simulate `$stdio` or `$stderr`
As it turns out, this is precisely my use case, as I was writing some tests that boiled down to something like this (highly simplified):
~~~ ruby
s = StringIO.new
stuff.new(s).do_something
assert_equal "foo", s.tap(&:rewind).read
~~~
The result of which was in my case:
~~~ diff
--- expected
+++ actual
@@ -1,2 +1,2 @@
-"foo"
+# encoding: ASCII-8BIT
+""
~~~
Indeed I had a bug so my test was supposed to fail ("foo" vs "") but what caught my eye was the encoding issue.
So I did some comparison tests, and behaviours differ significantly:
~~~ ruby
f = File.open("foo", File::CREAT | File::RDWR)
f.write("foo") # => 3
f.rewind # => 0
f.internal_encoding # => nil
f.external_encoding # => nil
f.read.encoding # reads "foo" # => #<Encoding:UTF-8>
f.read.encoding # reads "" at EOF # => #<Encoding:UTF-8>
s = StringIO.new("foo") # => #<StringIO:0x007f879e9e54d0>
s.internal_encoding # => nil
s.external_encoding # => #<Encoding:UTF-8>
s.read.encoding # reads "foo" # => #<Encoding:UTF-8>
s.read.encoding # reads "" at EOF # => #<Encoding:ASCII-8BIT>
~~~
There's that subtle little issue at EOF. So, what about "w+"?
~~~ ruby
f = File.open("foo", "w+") # => #<File:foo>
f.write("foo") # => 3
f.rewind # => 0
f.internal_encoding # => nil
f.external_encoding # => nil
f.read.encoding # reads "foo" # => #<Encoding:UTF-8>
f.read.encoding # reads "" at EOF # => #<Encoding:UTF-8>
s = StringIO.new("foo", "w+") # => #<StringIO:0x007f879e81f268>
s.internal_encoding # => nil
s.external_encoding # => #<Encoding:UTF-8>
s.read.encoding # reads "foo" # => #<Encoding:ASCII-8BIT>
s.read.encoding # reads "" at EOF # => #<Encoding:ASCII-8BIT>
~~~
Somehow it makes StringIO always behave as binary on #read. Hmmm.
Let's try binary. IO's doc says:
> "b" Binary file mode
> Suppresses EOL <-> CRLF conversion on Windows. And
> sets external encoding to ASCII-8BIT unless explicitly
> specified.
~~~ ruby
f = File.open("foo", "w+b") # => #<File:foo>
f.write("foo") # => 3
f.rewind # => 0
f.internal_encoding # => nil
f.external_encoding # => #<Encoding:ASCII-8BIT>
f.read.encoding # reads "foo" # => #<Encoding:ASCII-8BIT>
f.read.encoding # reads "" at EOF # => #<Encoding:ASCII-8BIT>
s = StringIO.new("foo", "w+b") # => #<StringIO:0x007f879f0bd460>
s.internal_encoding # => nil
s.external_encoding # => #<Encoding:UTF-8>
s.read.encoding # reads "foo" # => #<Encoding:ASCII-8BIT>
s.read.encoding # reads "" at EOF # => #<Encoding:ASCII-8BIT>
~~~
Close, but no cigar: external_encoding is still incorrect, and #read couldn't care less. Let's try making things explicit:
~~~ ruby
f = File.open("foo", "w+b:ASCII-8BIT:ASCII-8BIT")
f.write("foo") # => 3
f.rewind # => 0
f.internal_encoding # => nil
f.external_encoding # => #<Encoding:ASCII-8BIT>
f.read.encoding # reads "foo" # => #<Encoding:ASCII-8BIT>
f.read.encoding # reads "" at EOF # => #<Encoding:ASCII-8BIT>
s = StringIO.new("", "w+b:ASCII-8BIT:ASCII-8BIT")
s.internal_encoding # => nil
s.external_encoding # => #<Encoding:UTF-8>
s.read.encoding # reads "foo" # => #<Encoding:ASCII-8BIT>
s.read.encoding # reads "" at EOF # => #<Encoding:ASCII-8BIT>
~~~
Nope, external_encoding still wrong. Anyway, in my case I was looking for UTF-8, so what about that?
~~~ ruby
f = File.open("foo", "w+b:UTF-8:UTF-8") # => #<File:foo>
f.write("foo") # => 3
f.rewind # => 0
f.internal_encoding # => nil
f.external_encoding # => #<Encoding:UTF-8>
f.read.encoding # reads "foo" # => #<Encoding:UTF-8>
f.read.encoding # reads "" at EOF # => #<Encoding:UTF-8>
s = StringIO.new("", "w+b:UTF-8:UTF-8") # => #<StringIO:0x007fd531cb9248>
s.internal_encoding # => nil
s.external_encoding # => #<Encoding:UTF-8>
s.read.encoding # reads "foo" # => #<Encoding:ASCII-8BIT>
s.read.encoding # reads "" at EOF # => #<Encoding:ASCII-8BIT>
~~~
StringIO keeps insisting on binary output, ignoring the mode argument that the doc describes. Last resort, forcing text mode:
~~~ ruby
f = File.open("foo", "w+t:UTF-8:UTF-8") # => #<File:foo>
f.write("foo") # => 3
f.rewind # => 0
f.internal_encoding # => nil
f.external_encoding # => #<Encoding:UTF-8>
f.read.encoding # reads "foo" # => #<Encoding:UTF-8>
f.read.encoding # reads "" at EOF # => #<Encoding:UTF-8>
s = StringIO.new("", "w+t:UTF-8:UTF-8") # => #<StringIO:0x007f879f04fc08>
s.internal_encoding # => nil
s.external_encoding # => #<Encoding:UTF-8>
s.read.encoding # reads "foo" # => #<Encoding:ASCII-8BIT>
s.read.encoding # reads "" at EOF # => #<Encoding:ASCII-8BIT>
~~~
Same. Anyway, one last time, let's go nuts:
~~~ ruby
f = File.open("foo", "w+:UTF-16:UTF-32") # => #<File:foo>
f.write("foo") # => 3
f.rewind # => 0
f.internal_encoding # => #<Encoding:UTF-32 (dummy)>
f.external_encoding # => #<Encoding:UTF-16 (dummy)>
f.read.encoding # reads "foo" # => #<Encoding:UTF-32 (dummy)>
f.read.encoding # reads "" at EOF # => #<Encoding:UTF-32 (dummy)>
s = StringIO.new("", "w+:UTF-16:UTF-32") # => #<StringIO:0x007f879f04fc08>
s.internal_encoding # => nil
s.external_encoding # => #<Encoding:UTF-8>
s.read.encoding # reads "foo" # => #<Encoding:ASCII-8BIT>
s.read.encoding # reads "" at EOF # => #<Encoding:ASCII-8BIT>
~~~
I think the result speaks for itself.
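To re-check the comparisons above on another Ruby version, they can be condensed into a small script. This is only a sketch: `encoding_report`, `file_report` and `stringio_report` are hypothetical helper names, a temp file stands in for `"foo"`, and, unlike the snippets above, the StringIO side also writes `"foo"` before reading. No expected output is given, since it depends on the Ruby version and locale.

```ruby
require "stringio"
require "tempfile"

# Hypothetical helper: collect the encodings an IO-like object reports,
# for a first read and for a second read at EOF.
def encoding_report(io)
  {
    external: io.external_encoding,
    first_read: io.read.encoding,
    eof_read: io.read.encoding,
  }
end

# Write "foo" to a real file opened with the given mode, then report.
def file_report(mode)
  Tempfile.create("stringio-encoding") do |tmp|
    File.open(tmp.path, mode) do |f|
      f.write("foo")
      f.rewind
      encoding_report(f)
    end
  end
end

# Same steps against a StringIO opened with the same mode string.
def stringio_report(mode)
  s = StringIO.new("".dup, mode)
  s.write("foo")
  s.rewind
  encoding_report(s)
end

["w+", "w+b", "w+b:UTF-8:UTF-8"].each do |mode|
  puts "mode #{mode.inspect}"
  puts "  File:     #{file_report(mode)}"
  puts "  StringIO: #{stringio_report(mode)}"
end
```

Any line where the two reports differ is an instance of the mismatch described in this report.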
In my specific case I quickly found workarounds, but this makes for brittle code and tests. Sometimes this involves faking StringIO with an actual temp file, which is, let's say, subpar.
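For illustration, a workaround of this kind might look like the sketch below. It assumes that re-tagging the returned string with the stream's own `external_encoding` is acceptable for the caller. `read_with_external_encoding` is a hypothetical helper name, not something StringIO provides.

```ruby
require "stringio"

# Hypothetical helper (not part of StringIO): read the rest of the stream
# and tag the result with the encoding the stream itself reports.
def read_with_external_encoding(io)
  data = io.read
  enc = io.external_encoding
  # force_encoding only relabels the bytes; it does not transcode them.
  enc ? data.force_encoding(enc) : data
end

s = StringIO.new("foo")
first = read_with_external_encoding(s)   # content of the stream
at_eof = read_with_external_encoding(s)  # empty string at EOF

puts first.encoding
puts at_eof.encoding
```

This keeps the test code on StringIO, at the cost of every read having to go through the helper, which is exactly the kind of brittleness mentioned above.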
Tangentially related: StringIO is missing quite a few methods compared to IO. This sometimes forces code to be aware of the difference, which is IMHO not good (e.g. it breaks code coverage in tests), or requires monkeypatching StringIO, or making creative (ahem) use of temp files and thus hitting the filesystem.
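A rough way to see the gap is to subtract the two method lists. This is a sketch only: it compares method names, not behaviour, and the resulting list varies by Ruby version.

```ruby
require "stringio"

# Public instance methods that IO defines itself and that StringIO
# does not respond to.
missing = (IO.public_instance_methods(false) - StringIO.public_instance_methods).sort

puts "#{missing.size} IO methods without a StringIO counterpart:"
puts missing.join(", ")
```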
Seems related to this old-ish issue: https://bugs.ruby-lang.org/issues/7964
--
https://bugs.ruby-lang.org/
Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>