[#115212] [Ruby master Bug#19983] Nested * seems incorrect — "Eregon (Benoit Daloze) via ruby-core" <ruby-core@...>

Issue #19983 has been reported by Eregon (Benoit Daloze).

9 messages 2023/11/01

[#115226] [Ruby master Bug#19984] `make test-bundler-parallel` fails with ` --enable-shared` — "vo.x (Vit Ondruch) via ruby-core" <ruby-core@...>

Issue #19984 has been reported by vo.x (Vit Ondruch).

7 messages 2023/11/02

[#115227] [Ruby master Feature#19985] Support `Pathname` for `require` — "vo.x (Vit Ondruch) via ruby-core" <ruby-core@...>

Issue #19985 has been reported by vo.x (Vit Ondruch).

14 messages 2023/11/02

[#115259] [Ruby master Bug#19990] Could we reconsider the second argument to Kernel#load? — "fxn (Xavier Noria) via ruby-core" <ruby-core@...>

Issue #19990 has been reported by fxn (Xavier Noria).

9 messages 2023/11/06

[#115304] [Ruby master Feature#19993] Optionally Free all memory at exit — "HParker (Adam Hess) via ruby-core" <ruby-core@...>

Issue #19993 has been reported by HParker (Adam Hess).

8 messages 2023/11/08

[#115333] [Ruby master Misc#19997] DevMeeting-2023-11-30 — "mame (Yusuke Endoh) via ruby-core" <ruby-core@...>

Issue #19997 has been reported by mame (Yusuke Endoh).

15 messages 2023/11/10

[#115334] [Ruby master Feature#19998] Emit deprecation warnings when the old (non-Typed) Data_XXX API is used — "byroot (Jean Boussier) via ruby-core" <ruby-core@...>

Issue #19998 has been reported by byroot (Jean Boussier).

12 messages 2023/11/10

[#115388] [Ruby master Feature#20005] Add C API to return symbols of native extensions resolved from features — "tagomoris (Satoshi Tagomori) via ruby-core" <ruby-core@...>

Issue #20005 has been reported by tagomoris (Satoshi Tagomori).

14 messages 2023/11/14

[#115422] [Ruby master Bug#20009] Marshal.load raises exception when load dumped class include non-ASCII — "ippachi (Kazuya Hatanaka) via ruby-core" <ruby-core@...>

Issue #20009 has been reported by ippachi (Kazuya Hatanaka).

14 messages 2023/11/19

[#115428] [Ruby master Feature#20011] Reduce implicit array allocations on caller side of method calling — "jeremyevans0 (Jeremy Evans) via ruby-core" <ruby-core@...>

Issue #20011 has been reported by jeremyevans0 (Jeremy Evans).

8 messages 2023/11/20

[#115438] [Ruby master Misc#20013] Travis CI status — "jaruga (Jun Aruga) via ruby-core" <ruby-core@...>

Issue #20013 has been reported by jaruga (Jun Aruga).

51 messages 2023/11/21

[#115484] [Ruby master Bug#20022] GC.verify_compaction_references does not actually move all objects — "kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core" <ruby-core@...>

Issue #20022 has been reported by kjtsanaktsidis (KJ Tsanaktsidis).

7 messages 2023/11/27

[#115491] [Ruby master Feature#20024] SyntaxError subclasses — "kddnewton (Kevin Newton) via ruby-core" <ruby-core@...>

Issue #20024 has been reported by kddnewton (Kevin Newton).

17 messages 2023/11/27

[#115525] [Ruby master Feature#20027] Range Deconstruction — "stuyam (Stuart Yamartino) via ruby-core" <ruby-core@...>

Issue #20027 has been reported by stuyam (Stuart Yamartino).

8 messages 2023/11/28

[#115552] [Ruby master Misc#20032] Propose @kjtsanaktsidis as a committer — "jeremyevans0 (Jeremy Evans) via ruby-core" <ruby-core@...>

Issue #20032 has been reported by jeremyevans0 (Jeremy Evans).

15 messages 2023/11/30

[ruby-core:115496] [Ruby master Bug#20021] TestGCCompact#test_moving_hashes_down_size_pools is flaky

From: "mame (Yusuke Endoh) via ruby-core" <ruby-core@...>
Date: 2023-11-28 04:57:28 UTC
List: ruby-core #115496
Issue #20021 has been updated by mame (Yusuke Endoh).

Status changed from Open to Closed

Closed by commit:8427a8a655e2a04bfdc6a645ec967405d3617137

----------------------------------------
Bug #20021: TestGCCompact#test_moving_hashes_down_size_pools is flaky
https://bugs.ruby-lang.org/issues/20021#change-105425

* Author: kjtsanaktsidis (KJ Tsanaktsidis)
* Status: Closed
* Priority: Normal
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
The test `TestGCCompact#test_moving_hashes_down_size_pools` is, I believe, flaky, and sometimes fails with this message:

```
  1) Failure:
TestGCCompact#test_moving_hashes_down_size_pools [/home/chkbuild/chkbuild/tmp/build/20231112T183003Z/ruby/test/ruby/test_gc_compact.rb:442]:
Expected 499 to be >= 500.
```

I started looking at this because when https://github.com/ruby/ruby/pull/8858 was merged, rubyci began failing on Ubuntu 22.04 with this message (https://rubyci.s3.amazonaws.com/ubuntu2204/ruby-master/log/20231112T183003Z.fail.html.gz). I was able to reproduce this failure after _very_ carefully reproducing the rubyci testing environment (same EC2 instance type, and I actually had to install _exactly_ the same kernel version as well!).

After debugging the test, I discovered that the _last_ hash that was modified in the test is actually on the machine stack when `GC.verify_compaction_references` is called, which means it gets pinned and cannot be moved; thus, the test fails because only 499 out of 500 objects got moved! The hash's VALUE is "on the stack" in the sense that it's in a memory location below `%rsp`, but it is _NOT_ actually live in any C local variable. The stack address I found the VALUE in is part of the frame of `gc_start`, but `gc_start` obviously doesn't hold a reference to the hash from the test. Rather, it's left behind in uninitialized memory.
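To make the pinning mechanism described above concrete, here is a toy model in plain Ruby. It is only an illustration under stated assumptions: the integer "addresses", the `frame_words` array and the `movable_addresses` helper are all made up for this sketch, and the real conservative scan happens in C over raw machine-stack words.

```ruby
# Toy model only. Objects are represented by made-up integer "addresses";
# none of these names exist in CRuby.

# Returns the addresses a compactor would be allowed to move: every heap
# address that does not also appear as a word on the (simulated) stack.
def movable_addresses(heap, stack_words)
  pinned = heap.select { |addr| stack_words.include?(addr) }
  heap - pinned
end

# 500 hypothetical hash addresses.
heap = (1..500).map { |i| 0x1000 + i * 40 }

# A large, mostly unused frame. One slot still holds a stale address left
# over from an earlier call, although no local variable refers to it.
frame_words = [0, 0, heap.last, 0, 0]

puts movable_addresses(heap, frame_words).size  # 499
```

Because the scan cannot tell a stale word from a live reference, a single leftover value is enough to pin one object and leave the count one short of 500.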

I think this is made more likely because of how GCC wound up compiling the `gc_start` function. It seems to have allocated quite a lot of stack space - 440 bytes of it - and very little of it seems to be used for local variables. Thus, a lot of this memory just contains assorted tidbits from previous VM execution state.

This patch (https://github.com/ruby/ruby/pull/9040) works around the issue, at least on my testbench, by running the code which touches the test hashes in a separate fiber. Each fiber gets its own machine stack, so references to the to-be-moved hashes shouldn't leak into the main stack, and this fiber's stack should be out of scope when the compaction is run, so it should get freed.
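As a rough sketch of the fiber idea (not the code from the pull request), the hash-touching work can be wrapped in a short-lived `Fiber` so that it runs on its own machine stack. The helper name `in_separate_fiber`, the `KEPT_HASHES` constant and the hash contents are assumptions made for this example.

```ruby
# Hypothetical holder that keeps the hashes alive after the fiber finishes.
KEPT_HASHES = []

# Hypothetical helper: run the block on a fresh fiber (and therefore a fresh
# machine stack) and return the block's result.
def in_separate_fiber(&block)
  Fiber.new(&block).resume
end

count = in_separate_fiber do
  500.times { KEPT_HASHES << { a: 1, b: 2 } }
  KEPT_HASHES.each { |h| h[:c] = 3 }  # touch every hash inside the fiber
  KEPT_HASHES.size                    # return a plain Integer, not a hash
end

# Compaction runs on the main fiber, after the worker fiber has finished.
# GC.compact is not implemented on every platform, hence the guard.
GC.compact if GC.respond_to?(:compact)

puts count
```

Returning only a count from the block is a deliberate choice in this sketch: it avoids handing a reference to any individual hash back to the main fiber's stack.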

The other approach I considered is counting the number of pinned objects, and asserting that (moved + pinned) >= 500 instead of just moved >= 500. However, it's very likely that there will be a pinned hash (or string, etc., for the other, similar test cases), so this would make the test pass even when it should not, I think.
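The difference between the two assertions can be shown with a little arithmetic. The method names and the sample numbers below are hypothetical; they only illustrate why counting pinned objects weakens the check.

```ruby
EXPECTED = 500

# Existing style of check: every test hash must have moved.
def strict_pass?(moved)
  moved >= EXPECTED
end

# Rejected alternative: pinned objects count toward the total.
def relaxed_pass?(moved, pinned)
  moved + pinned >= EXPECTED
end

# Hypothetical run: 499 hashes moved, and one unrelated hash is pinned.
puts strict_pass?(499)      # false
puts relaxed_pass?(499, 1)  # true
```

If the one unmoved hash was left in place by a genuine compaction bug rather than by pinning, the relaxed check would still pass, which is the concern raised above.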

Also, since these tests were skipped on Solaris because of this bug, I have unskipped them (although it seems we no longer have Solaris on rubyci, so I guess it doesn't matter).



-- 
https://bugs.ruby-lang.org/
 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/
