From: shyouhei@...
Date: 2020-11-26T22:47:40+00:00
Subject: [ruby-core:101110] [Ruby master Feature#17307] A way to mark C extensions as thread-safe, Ractor-safe, or unsafe

Issue #17307 has been updated by shyouhei (Shyouhei Urabe).


Eregon (Benoit Daloze) wrote in #note-15:
> shyouhei (Shyouhei Urabe) wrote in #note-14:
> > Am I missing something?  This sounds eccentric to me.  Whether a C function is thread-safe is a fairly common concept.  A function described as "thread-safe unless a function it calls (perhaps recursively) breaks thread-safety" is NOT itself a thread-safe C function.  Given that the rb_* family of functions differs in thread-safety across implementations, it is nearly impossible for a thread-safe C function to use them, which effectively means that almost nothing an extension library does can be thread-safe.
> 
> Marking as thread-safe would mean nothing if `rb_*` functions are not thread-safe.
> So on CRuby it would do nothing.
> So the condition is "this C extension function is thread-safe, assuming `rb_*` functions are thread-safe".
> If `rb_*` functions are not thread-safe, then the marking has no effect/is not used by anything.
> @shyouhei Does that make sense?

This is not what I understand thread-safety to mean.  I understand what you need, but you should give the property a name other than thread-safe, for instance Truffle-safe.

----------------------------------------
Feature #17307: A way to mark C extensions as thread-safe, Ractor-safe, or unsafe
https://bugs.ruby-lang.org/issues/17307#change-88785

* Author: Eregon (Benoit Daloze)
* Status: Open
* Priority: Normal
----------------------------------------
I would like to design a way to mark C extensions as thread-safe, Ractor-safe, or unsafe (= needs process-global lock).
By default, if not marked, C extensions would be treated as unsafe for compatibility.

Specifically, TruffleRuby supports C extensions, but for scalability it is important to run at least some of them in parallel (e.g., HTTP parsing in Puma).
This was notably mentioned in my [RubyKaigi talk](https://speakerdeck.com/eregon/running-rack-and-rails-faster-with-truffleruby?slide=17).
TruffleRuby acquires a global lock by default when executing C extension code, for maximum compatibility (Ruby code OTOH can always run in parallel).
There is a command-line option to disable that lock, but it then disables the lock for all C extensions at once.
The important property for TruffleRuby is that the C extension does not need a global lock, i.e., that it synchronizes any mutable state in C that could be accessed by multiple threads, such as global C variables.
I believe many C extensions are already thread-safe, or can easily become thread-safe, because they do not rely on global state and do not share the RData objects between threads.
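
As a minimal sketch of what "synchronizes any mutable state in C" could look like, assume a hypothetical extension that keeps one global counter of parsed requests. The names `parsed_requests`, `record_parsed_request`, and `parsed_request_count` are invented for illustration, and C11 atomics are only one possible form of synchronization (a mutex around the state would be another).

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical global state of an extension: a count of parsed requests.
 * Declared atomic so that concurrent updates from several threads do not
 * race, even when no global lock is held around the extension code. */
static atomic_size_t parsed_requests;

/* Records one parsed request and returns the updated total. */
size_t record_parsed_request(void)
{
    /* atomic_fetch_add returns the value held before the addition. */
    return atomic_fetch_add(&parsed_requests, 1) + 1;
}

/* Returns the number of requests recorded so far. */
size_t parsed_request_count(void)
{
    return atomic_load(&parsed_requests);
}
```

With a plain `static size_t` and `parsed_requests++` instead, the increment would be a data race as soon as two threads entered the extension without the global lock.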

Ractor also needs a way to mark C extensions, to know if it's OK to use the C extension in multiple Ractors in parallel, and that the C extension will not leak non-shareable objects from one Ractor to another, which would lead to bugs & segfaults.
Otherwise, C extensions could only be used on the main/initial Ractor (or need to acquire a process-global lock whenever executing C extension code and ensure no non-shareable objects leak between Ractors), which would be a very big limitation (almost every non-trivial application depends on a C extension transitively).

In both cases, global state in the C extension needs synchronization.
In the thread-safe case, mutable state in C that could be accessed by multiple Ruby threads needs to be synchronized too (there might be no such state, e.g., if C extension objects are created per Thread).
In the Ractor case, the C extension must never pass an object from one Ractor to another, unless it is a shareable object.

What do you think would be a good way to "mark" C extensions?
Maybe by defining a symbol in the C extension, similar to the existing `Init_foo`, such as `foo_is_thread_safe`/`foo_is_ractor_safe`?
A symbol including the C extension name seems best, to avoid any possible confusion when looking it up.
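
To make the symbol idea more concrete, here is a rough sketch of how a Ruby implementation might derive the marker names and classify an extension, falling back to unsafe when no marker is found. Everything in it is an assumption for illustration: the `EXT_*` flags, `marker_symbol_name`, `ext_safety_flags`, and the resolver callback are hypothetical, and `table_resolver` only stands in for a real lookup (presumably something like `dlsym()` on the handle of the loaded extension).

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flags for the categories in the proposal; 0 means unsafe. */
enum { EXT_UNSAFE = 0, EXT_THREAD_SAFE = 1, EXT_RACTOR_SAFE = 2 };

/* Looks up a symbol by name; returns its address, or NULL if absent. */
typedef void *(*symbol_resolver)(void *handle, const char *name);

/* Writes "<ext>_is_<kind>_safe" into buf.
 * Returns 0 on success, -1 if the name does not fit. */
int marker_symbol_name(char *buf, size_t size, const char *ext, const char *kind)
{
    int n = snprintf(buf, size, "%s_is_%s_safe", ext, kind);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Returns the flags for an extension. An extension that defines neither
 * marker symbol is treated as unsafe, matching the proposed default. */
unsigned ext_safety_flags(void *handle, const char *ext, symbol_resolver resolve)
{
    char name[256];
    unsigned flags = EXT_UNSAFE;

    if (marker_symbol_name(name, sizeof name, ext, "thread") == 0 &&
        resolve(handle, name) != NULL)
        flags |= EXT_THREAD_SAFE;
    if (marker_symbol_name(name, sizeof name, ext, "ractor") == 0 &&
        resolve(handle, name) != NULL)
        flags |= EXT_RACTOR_SAFE;
    return flags;
}

/* Stand-in resolver for experimenting without loading a real library:
 * treats handle as a NULL-terminated array of exported symbol names. */
void *table_resolver(void *handle, const char *name)
{
    const char **names = handle;
    if (names == NULL)
        return NULL;
    for (size_t i = 0; names[i] != NULL; i++) {
        if (strcmp(names[i], name) == 0)
            return (void *)names[i];
    }
    return NULL;
}
```

On the extension side, the marker could be as small as a single exported definition next to `Init_foo`, for example `const int foo_is_thread_safe = 1;`. Whether the symbol should be a variable or a function is left open in this sketch.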

Maybe there are other ways to mark C extensions than defining symbols, that could still be read by the Ruby implementation reliably?

I used the term `C extensions` but of course it would apply to native extensions too (including C++/Rust/...).

cc @ko1


