From: ko1@... Date: 2017-11-14T08:40:02+00:00 Subject: [ruby-core:83758] [Ruby trunk Feature#14104] Remove `trace` instructions Issue #14104 has been reported by ko1 (Koichi Sasada). ---------------------------------------- Feature #14104: Remove `trace` instructions https://bugs.ruby-lang.org/issues/14104 * Author: ko1 (Koichi Sasada) * Status: Open * Priority: Normal * Assignee: ko1 (Koichi Sasada) * Target version: 2.5 ---------------------------------------- # Abstract Remove `trace` instructions from bytecode and it will improve performance, 10% faster. Instead of `trace` instruction, we add event information for each instruction. In other words, we unified `trace` instructions with the following instructions. # Backgroud `TracePoint` and old `set_trace_func` method need to know which point they need to kick hooks at. It is easy to introduce hook points into bytecode. However, `trace` VM instruction is inserted to represent such hook points and kicks hooks if needed. This technique introduces some overhead on "no-hooks" situations (and I believe 99% of workloads does not need hooks, especially on production enviornment). For example, the following code show ``` def foo x _y = x end ``` Disasm output is here: ``` == disasm: #>======================================= local table (size: 2, argc: 1 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1]) [ 2] x [ 1] _y 0000 trace 8 ( 1) 0002 trace 1 0004 getlocal x, 0 0007 dup 0008 setlocal _y, 0 0011 trace 16 ( 2) 0013 leave ( 1) ``` ``` # disasm code is: puts RubyVM::InstructionSequence.compile("def foo x _y = x end", nil, nil, 0, trace_instruction: true, operands_unification: false).disasm ``` You can see `trace` instructions for 0000, 0002 and 0011. They means hook points for :call, :line and :return events. We can remove `trace` instructions with the `trace_instruction: true` compile option, but we can't turn on `TracePoint` for such compiled ISeqs. # Proposal Remove `trace` instruction and introduce event flags as information for each instructions. With same code above, we can get the following disasm code. ``` == disasm: #>======================================= local table (size: 2, argc: 1 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1]) [ 2] x [ 1] _y 0000 getlocal x, 0 ( 1)[LiCa] 0003 dup 0004 setlocal _y, 0 0007 leave [Re] ``` 0000 shows `[LiCa]`, it means :line and :call events. 0007 shows `[Re]` which represents a :return event. It is too heavy if we check these event flags on each instruction dispatch. So we introduce new instructions `trace_` prefix instructions such as `trace_getlocal` and so on. All instructions have corresponding `trace_` prefix instructions (generated by the VM generator). If we need to enable `TracePoint` (or `set_trace_func`), then rewrite all of instruction sequences in the heap from normal instructions to trace prefix instructions dynamically. `trace_` prefix instructions checks flags for an executing instruction and kick hooks if needed. * Good: a. Lowever overhead for normal execution because we don't need to dispatch useless `trace` instructions. b. We don't have `trace` instructions, so that it is easy to turn on tail call optimization flags. c. We can enable/disable traces on small units. * Bad: a. Big overhead to turn on/off `TracePoint` because we need to modify all of existing ISeqs. b. Introducing a few incomatibilities for tracing. For (Bad-a) I believe nobody turn on and trun off tracepoints many times, so that turning on/off overhead is not so big impact. For (Good-c), I need to explain more. Current `TracePoint` enables all of hooks for all methods. With this technique, however, we can select enable/disable the hooks in more small units, like files, classes or methods. For example, we can enable the hooks only method named `foo`. Now we have no interface to specify it, but we can design later. # Evaluation ``` require 'benchmark' def foo n end N = 100_000_000 Benchmark.bm(10){|x| x.report('trace off'){ N.times{ foo(10) foo(10) foo(10) } } x.report('trace on'){ TracePoint.new{}.enable N.times{ foo(10) foo(10) foo(10) } } } ``` ``` Currnt trunk: user system total real trace off 9.090678 0.000000 9.090678 ( 9.083007) trace on 108.217320 0.005524 108.222844 (108.201941) modified: user system total real trace off 6.647247 0.000000 6.647247 ( 6.641530) trace on 93.405389 0.000000 93.405389 ( 93.404864) ``` # Compatibility issues With this change, we introduced a few incompatibilies: * Line numbers on :return/:b_return events show the last executed lines, instead of `end` lines (without `return` statement). * `begin` block doesn't show `:line` events. -- https://bugs.ruby-lang.org/ Unsubscribe: