From: "nobu (Nobuyoshi Nakada)" <noreply@...>
Date: 2022-10-11T01:19:09+00:00
Subject: [ruby-core:110251] [Ruby master Bug#19043] Segfault on macOS 11.7 while using StringScanner in multiple threads

Issue #19043 has been updated by nobu (Nobuyoshi Nakada).


This seems related to GC compaction, since the crash occurred in `revert_stack_objects`.
@tenderlovemaking, any thoughts?
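
One way to probe the compaction hypothesis locally might be to request compaction while several threads call `scan_until`. The following is only a rough sketch: the helper name `stress_scan_with_compaction` and its parameters are hypothetical, and it is not known to reproduce this crash.

```ruby
require "strscan"

# Hypothetical stress helper (not from the report): runs scan_until in
# several threads while the calling thread requests heap compaction.
# Returns the total number of "," matches found across all threads.
def stress_scan_with_compaction(text, threads: 8, rounds: 50)
  workers = Array.new(threads) do
    Thread.new do
      count = 0
      rounds.times do
        scanner = StringScanner.new(text.dup)
        # scan_until returns nil once no further match exists.
        count += 1 while scanner.scan_until(/,/)
      end
      count
    end
  end

  # GC.compact is not implemented on every platform, so guard the call.
  if GC.respond_to?(:compact)
    begin
      10.times { GC.compact }
    rescue NotImplementedError
      # Compaction is unavailable here; the threads still run without it.
    end
  end

  # Thread#value joins each worker and returns its block result.
  workers.sum(&:value)
end
```

Raising `threads` and `rounds`, or passing the contents of a large WKT file as `text`, may widen the window in which compaction overlaps with scanning.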

----------------------------------------
Bug #19043: Segfault on macOS 11.7 while using StringScanner in multiple threads
https://bugs.ruby-lang.org/issues/19043#change-99540

* Author: keithdoggett (Keith Doggett)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.2.0dev (2022-09-27T18:58:28Z master 5d4048e0bc) [x86_64-darwin19]
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
During testing on our CI, one of the runners failed due to a segfault that appears to have originated from the `StringScanner` class, specifically the `scan_until` method. The test ensures that we are able to properly parse strings in a multithreaded environment.

```ruby
  def test_multithreaded
    parser = RGeo::WKRep::WKTParser.new
    data = fixtures.join("isere.wkt").read
    Array.new(100) do
      Thread.fork do
        parser.parse(data)
      end
    end.map(&:join)
  end
```

Here's the `parse` method:

```ruby
      def parse(str)
        @mutex.synchronize do
          str = str.downcase
          @cur_factory = @exact_factory
          if @cur_factory
            @cur_factory_support_z = @cur_factory.property(:has_z_coordinate) ? true : false
            @cur_factory_support_m = @cur_factory.property(:has_m_coordinate) ? true : false
          end
          @cur_expect_z = nil
          @cur_expect_m = nil
          @cur_srid = @default_srid
          if @support_ewkt && str =~ /^srid=(\d+);/i
            str = $'
            @cur_srid = Regexp.last_match(1).to_i
          end
          begin
            start_scanner(str)
            obj = parse_type_tag
            if @cur_token && !@ignore_extra_tokens
              raise Error::ParseError, "Extra tokens beginning with #{@cur_token.inspect}."
            end
          ensure
            clean_scanner
          end
          obj
        end
      end
```

The `StringScanner` is created and assigned to `@scanner` in `start_scanner`, and `@scanner` is set to `nil` in `clean_scanner`. According to the control frame information in the log, the error occurs in the `scan_until` method, but it might be due to `gc_sweep` running at some point.
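
For readers without the RGeo source at hand, the scanner lifecycle described above can be sketched roughly as follows. This is a simplified stand-in, not the actual RGeo code: the class `ScannerLifecycleSketch` and the methods `first_group` and `scanner_active?` are hypothetical, and only the names `start_scanner`, `clean_scanner`, `@scanner`, and `@mutex` come from the report.

```ruby
require "strscan"

# Simplified, hypothetical stand-in for the parser's scanner lifecycle.
# The real RGeo helpers may do more than this.
class ScannerLifecycleSketch
  def initialize
    @mutex = Mutex.new
    @scanner = nil
  end

  # Returns the text up to and including the first ")" in str, or nil
  # when str contains no ")".
  def first_group(str)
    @mutex.synchronize do
      begin
        start_scanner(str)
        @scanner.scan_until(/\)/)
      ensure
        # Always drop the scanner, even if scanning raised.
        clean_scanner
      end
    end
  end

  # True only while a scan is in progress.
  def scanner_active?
    !@scanner.nil?
  end

  private

  def start_scanner(str)
    @scanner = StringScanner.new(str)
  end

  def clean_scanner
    @scanner = nil
  end
end
```

Because the mutex serializes callers, each `StringScanner` is only reachable from one thread at a time and becomes garbage as soon as `clean_scanner` runs.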

Unfortunately, since this happened on a CI system, I don't have access to the diagnostic file. We've tried to reproduce this locally without success. The closest we've come is a deadlock while joining the threads, and we cannot reproduce that reliably either. Here's a link to the CI run that hit the issue, in case it's helpful: https://github.com/rgeo/rgeo/actions/runs/3144578897/jobs/5110771257

If there are any tips on how to reproduce this, or anything you'd like me to try in order to gather more information, please let me know.

---Files--------------------------------
multithread_crash.log (75.3 KB)

