From: mame@... Date: 2020-12-10T09:23:55+00:00 Subject: [ruby-core:101375] [Ruby master Feature#16375] Right size regular expression compile buffers for literal regexes and on Regexp#freeze Issue #16375 has been updated by mame (Yusuke Endoh). Status changed from Open to Closed Merged by nobu at 9a17437558e42aa1da372b515ba8bc18067d578c ---------------------------------------- Feature #16375: Right size regular expression compile buffers for literal regexes and on Regexp#freeze https://bugs.ruby-lang.org/issues/16375#change-89134 * Author: methodmissing (Lourens Naud�) * Status: Closed * Priority: Normal * Assignee: nobu (Nobuyoshi Nakada) * Target version: 3.0 ---------------------------------------- References PR https://github.com/ruby/ruby/pull/2696 As a continuation of [type specific resize on freeze implementations of String and Array](https://bugs.ruby-lang.org/issues/16291) and looking into the `Regexp` type I found these memory access patterns for regular expression literals: ``` ==22079== -------------------- 12 of 500 -------------------- ==22079== max-live: 1,946,560 in 4,345 blocks ==22079== tot-alloc: 1,946,560 in 4,345 blocks (avg size 448.00) ==22079== deaths: none (none of these blocks were freed) ==22079== acc-ratios: 1.36 rd, 0.98 wr (2,651,994 b-read, 1,908,158 b-written) ==22079== at 0x4C2DECF: malloc (in /usr/lib/valgrind/vgpreload_exp-dhat-amd64-linux.so) ==22079== by 0x24C496: onig_new_with_source (re.c:844) ==22079== by 0x24C496: make_regexp (re.c:874) ==22079== by 0x24C496: rb_reg_initialize (re.c:2858) ==22079== by 0x24C496: rb_reg_initialize_str (re.c:2892) ==22079== by 0x24C496: rb_reg_compile (re.c:2982) ==22079== by 0x12EB84: rb_parser_reg_compile (parse.y:12185) ==22079== by 0x12EB84: parser_reg_compile (parse.y:12179) ==22079== by 0x12EB84: reg_compile (parse.y:12195) ==22079== by 0x2147E3: new_regexp (parse.y:10101) ==22079== by 0x2147E3: ruby_yyparse (parse.y:4419) ==22079== by 0x2161F7: yycompile0 (parse.y:5942) ==22079== by 0x3241FF: rb_suppress_tracing (vm_trace.c:427) ==22079== by 0x1FDBF6: yycompile (parse.y:5991) ==22079== by 0x1FDBF6: rb_parser_compile_file_path (parse.y:6130) ==22079== by 0x27AC96: load_file_internal (ruby.c:2034) ==22079== by 0x137730: rb_ensure (eval.c:1129) ==22079== by 0x27CEEA: load_file (ruby.c:2153) ==22079== by 0x27CEEA: rb_parser_load_file (ruby.c:2175) ==22079== by 0x1954CE: load_iseq_eval (load.c:587) ==22079== by 0x1954CE: rb_load_internal (load.c:651) ==22079== by 0x1954CE: rb_f_load (load.c:709) ==22079== by 0x2FB957: vm_call_cfunc_with_frame (vm_insnhelper.c:2468) ==22079== by 0x2FB957: vm_call_cfunc (vm_insnhelper.c:2493) ``` Digging a little further and remembering some context of previous oniguruma memory investigation I remembered the pattern buffer struct has a compile buffer with a simple watermark for tracking used space. This changeset implements `reg_resize` (static as `ary_resize`) which attempts to right size the compile buffer if over allocated at the following sites: * After compiling a literal regular expression. * Implement an explicit type specific `rb_reg_freeze` and point `Regexp#compile` to it * I also follow the `chain` member which points to another `regex_t` on the struct if present, but have not been able to find references to it in the source tree other than for freeing a regex or inspecting it's memory footprint. I introduced 2 new debug counters, which yields the following results on booting Redmine on Rails 5: ``` [RUBY_DEBUG_COUNTER] obj_regexp_lit_extracapa 6319 [RUBY_DEBUG_COUNTER] obj_regexp_lit_extracapa_bytes 301685 ``` About 300kb reallocated across 6319 oversized instances. An example of `Regexp#freeze` ``` irb(main):007:0> r = Regexp.compile("(?!%\h\h|[!$-&(-;=?-_a-~]).") irb(main):008:0> ObjectSpace.memsize_of(r) => 588 irb(main):009:0> r.freeze => /(?!%hh|[!$-&(-;=?-_a-~])./ irb(main):010:0> ObjectSpace.memsize_of(r) => 543 ``` There is likely more layers that can be peeled back here, but keeping it simple and concise for review. I think it's possible to get towards a state where 5 to 10MB RSS can be shaved off a standard Rails process by: * Being careful with buffer defaults for objects that have auxiliary buffers * Identifying hooks where excess allocation can be reduced to current watermark if the object buffer does not need to grow anymore (literal object, frozen object etc.) * A GC hook on promotion to OLD object to trim excessive capacity, which I'd consider as a special kind of garbage, as outlined in https://bugs.ruby-lang.org/issues/15402 Would any further incremental work towards this be considered worthwhile from a ruby-core and community perspective? -- https://bugs.ruby-lang.org/ Unsubscribe: