[#106939] [Ruby master Bug#18455] `IO#close` has poor performance and difficult to understand semantics. — "ioquatix (Samuel Williams)" <noreply@...>

Issue #18455 has been reported by ioquatix (Samuel Williams).

10 messages 2022/01/01

[#106977] [Ruby master Feature#18461] closures are capturing unused variables — "bughit (bug hit)" <noreply@...>

Issue #18461 has been reported by bughit (bug hit).

12 messages 2022/01/05

[#106994] [Ruby master Feature#18462] Proposal to merge WASI based WebAssembly support — "katei (Yuta Saito)" <noreply@...>

Issue #18462 has been reported by katei (Yuta Saito).

8 messages 2022/01/07

[#106996] [Ruby master Feature#18463] Random number generation with xoshiro — "bbrklm (Benson Muite)" <noreply@...>

Issue #18463 has been reported by bbrklm (Benson Muite).

8 messages 2022/01/07

[#107005] [Ruby master Bug#18464] RUBY_INTERNAL_EVENT_NEWOBJ tracepoint causes an interpreter crash when combined with Ractors — "kjtsanaktsidis (KJ Tsanaktsidis)" <noreply@...>

Issue #18464 has been reported by kjtsanaktsidis (KJ Tsanaktsidis).

14 messages 2022/01/08

[#107008] [Ruby master Bug#18465] Make `IO#write` atomic. — "ioquatix (Samuel Williams)" <noreply@...>

Issue #18465 has been reported by ioquatix (Samuel Williams).

16 messages 2022/01/09

[#107073] [Ruby master Feature#18481] Porting YJIT to Rust (request for feedback) — "maximecb (Maxime Chevalier-Boisvert)" <noreply@...>

Issue #18481 has been reported by maximecb (Maxime Chevalier-Boisvert).

26 messages 2022/01/12

[#107106] [Ruby master Bug#18487] Kernel#binding behaves differently depending on implementation language of items on the stack — "alanwu (Alan Wu)" <noreply@...>

Issue #18487 has been reported by alanwu (Alan Wu).

11 messages 2022/01/13

[#107190] [Ruby master Feature#18498] Introduce a public WeakKeysMap that compares by equality — "byroot (Jean Boussier)" <noreply@...>

Issue #18498 has been reported by byroot (Jean Boussier).

17 messages 2022/01/19

[#107203] [Ruby master Bug#18501] [BUG] try to mark T_NONE object in RubyVM::InstructionSequence. load_from_binary — "byroot (Jean Boussier)" <noreply@...>

Issue #18501 has been reported by byroot (Jean Boussier).

8 messages 2022/01/20

[#107204] [Ruby master Bug#18502] Make ruby-2.7.5 on Solaris 10 ld.so.1: gcc: fatal: libintl.so.8: open failed: No such file or directory — "dklein (Dmitri Klein)" <noreply@...>

Issue #18502 has been reported by dklein (Dmitri Klein).

8 messages 2022/01/20

[#107275] [Ruby master Bug#18512] MacOS 12.1 Monterey Bug — "oucl5976@... (Paul Liu)" <noreply@...>

Issue #18512 has been reported by oucl5976@gmail.com (Paul Liu).

9 messages 2022/01/25

[#107291] [Ruby master Bug#18518] NoMemoryError + [FATAL] failed to allocate memory for twice 1 << large — "Eregon (Benoit Daloze)" <noreply@...>

Issue #18518 has been reported by Eregon (Benoit Daloze).

12 messages 2022/01/26

[#107310] [Ruby master Bug#18555] Running "bundle exec middleman server" on M1 Mac gives [BUG] Bus Error at 0x0000000104b04000 — "anthonyaykut (Anthony Aykut)" <noreply@...>

Issue #18555 has been reported by anthonyaykut (Anthony Aykut).

13 messages 2022/01/28

[#107346] [Ruby master Misc#18557] DevMeeting-2022-02-17 — "mame (Yusuke Endoh)" <noreply@...>

Issue #18557 has been reported by mame (Yusuke Endoh).

18 messages 2022/01/29

[#107392] [Ruby master Bug#18560] "Compaction isn't available on this platform" error running PG test suite on ppc64le — "vo.x (Vit Ondruch)" <noreply@...>

Issue #18560 has been reported by vo.x (Vit Ondruch).

7 messages 2022/01/31

[ruby-core:106941] [Ruby master Bug#18455] `IO#close` has poor performance and difficult to understand semantics.

From: "Eregon (Benoit Daloze)" <noreply@...>
Date: 2022-01-01 11:10:38 UTC
List: ruby-core #106941
Issue #18455 has been updated by Eregon (Benoit Daloze).


Nice, I thought this logic was already per IO instance.

I don't see how it will change anything about `this leads to strange behaviour` though.
Which is fine because it seems unavoidable as long as `IO.for_fd(fd)` exists.

`IO.for_fd(fd)` does not `dup(2)` and so EBADF is expected if the original IO of fd is closed.
`IO.for_fd(fd)` is unsafe in general, and is only OK when: 1) autoclose = false is called on it; 2) the original IO is alive longer than all `IO.for_fd(fd)`
For Ractor purposes it's a lot safer to `Racter#send(io, move: true)` rather than passing the fd and `IO.for_fd(fd)`.


----------------------------------------
Bug #18455: `IO#close` has poor performance and difficult to understand semantics.
https://bugs.ruby-lang.org/issues/18455#change-95763

* Author: ioquatix (Samuel Williams)
* Status: Open
* Priority: Normal
* Backport: 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
`IO#close` should be responsible for closing the file descriptor referred to by the IO instance. When dealing with buffered IO, one can also expect this to flush the internal buffers if possible.

Currently, all blocking IO operations release the GVL and perform the blocking system call using `rb_thread_io_blocking_region`. The current implementation takes a file descriptor and adds an entry to the VM global `waiting_fds` list. When the operation is completed, the entry is removed from `waiting_fds`.

When calling `IO#close`, this list is traversed and any threads performing blocking operations with a matching file descriptor are interrupted. The performance of this is O(number of blocking IO operations) which in practice the performance of `IO#close` can take milliseconds with 10,000 threads performing blocking IO. This performance is unacceptable.

``` ruby
#!/usr/bin/env ruby

require 'benchmark'

class Reading
  def initialize
    @r, @w = IO.pipe

    @thread = Thread.new do
      @r.read
    rescue IOError
      # Ignore.
    end
  end

  attr :r
  attr :w

  attr :thread

  def join
    @thread.join
  end
end

def measure(count = 10)
  readings = count.times.map do
    Reading.new
  end

  sleep 10

  duration = Benchmark.measure do
    readings.each do |reading|
      reading.r.close
      reading.w.close
    end
  end

  average = (duration.total / count) * 1000.0
  pp count: count, average: sprintf("%0.2fms", average)

  readings.each(&:join)
end

measure(   10)
measure(  100)
measure( 1000)
measure(10000)
```

In addition, the semantics of this operation are confusing at best. While Ruby programs are dealing with IO instances, the VM is dealing with file descriptors, in effect performing some internal de-duplication of IO state. In practice, this leads to strange behaviour:

``` ruby
#!/usr/bin/env ruby

r, w = IO.pipe
r2 = IO.for_fd(r.to_i)
pp r: r, r2: r2

t = Thread.new do
  r2.read rescue nil
  r2.read # EBADF
end

sleep 0.5
r.close
t.join rescue nil

pp r: r, r2: r2
# r is closed, r2 is valid but will raise EBADF on any operation.
```

In addition, this confusing behaviour extends to Ractor and state is leaked between the two:

``` ruby
r, w = IO.pipe

ractor = Ractor.new(r.to_i) do |fd|
  r2 = IO.for_fd(fd)
  r2.read
  # r2.read # EBADF
end

sleep 0.5
r.close

pp take: ractor.take
```

I propose the following changes to simplify the semantics and improve performance:

- Move the semantics of `waiting_fds` from per-fd to per-IO. This means that `IO#close` only interrupts blocking operations performed on the same IO instance rather than ANY IO which refers to the same file descriptor. I think this behaviour is easier to understand and still protects against the vast majority of incorrect usage.
- Move the details of `struct rb_io_t` to `internal/io.h` so that the implementation details are not part of the public interface.

## Benchmarks

Before:

```
{:count=>10, :average=>"0.19ms"}
{:count=>100, :average=>"0.11ms"}
{:count=>1000, :average=>"0.18ms"}
{:count=>10000, :average=>"1.16ms"}
```

After:

```
{:count=>10, :average=>"0.20ms"}
{:count=>100, :average=>"0.11ms"}
{:count=>1000, :average=>"0.15ms"}
{:count=>10000, :average=>"0.68ms"}
```

After investigating this further I found that the `rb_thread_io_blocking_region` using `ubf_select` can be incredibly slow, proportional to the number of threads. I don't know whether it's advisable but:

``` c
        BLOCKING_REGION(blocking_node.thread, {
            val = func(data1);
            saved_errno = errno;
        }, NULL /* ubf_select */, blocking_node.thread, FALSE);
```

Disabling the UBF function and relying on `read(fd, ...)`/`write(fd, ...)` blocking operations to fail when `close(fd)` is invoked might be sufficient? This needs more investigation but after making this change, we have constant-time IO#close.

```
{:count=>10, :average=>"0.13ms"}
{:count=>100, :average=>"0.06ms"}
{:count=>1000, :average=>"0.04ms"}
{:count=>10000, :average=>"0.09ms"}
```

Which is ideally what we want.



-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>

In This Thread