From: jclark@...
Date: 2016-05-19T16:56:32+00:00
Subject: [ruby-core:75614] [Ruby trunk Feature#11098] Thread-level allocation counting

Issue #11098 has been updated by Jason Clark.

allocation_tracer is awesome for debugging, and I've happily used it a number of times. Thank you for building it Koichi!

While most people certainly wouldn't use this, I do have a case for it in production. Specifically, I work at New Relic, and I wanted this for the Ruby agent (newrelic_rpm) to read.

It would be a huge benefit to our users to pinpoint specific web requests that are allocation heavy. Production allocation often differs from other environments, so seeing what's actually happening on prod is a big benefit.

The current global counters are noisy in the presence of other threads, and since we can't reliably provide the information for a specific request, we don't say anything at all.

Working as a gem has the disadvantages that you list, which are real concerns for us. In our experience few users enable optional features, so we probably won't even build something for an optional approach to adding this in.

If you still feel the overhead outweighs the use case we can close this out. It would give instrumenters like myself awesome insight into one of the most common causes of Ruby app slowdown, but I understand the concerns.

----------------------------------------
Feature #11098: Thread-level allocation counting
https://bugs.ruby-lang.org/issues/11098#change-58747

* Author: Jason Clark
* Status: Feedback
* Priority: Normal
* Assignee:
----------------------------------------
This patch introduces a thread-local allocation count. Today you can get a global allocation count from `GC.stat`, but in multi-threaded contexts that can give a muddied picture of the allocation behavior of a particular piece of code.
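To illustrate the muddied picture, here is a minimal sketch (not part of the patch) that measures allocations with the existing global counter. It assumes Ruby 2.2 or later, where `GC.stat(:total_allocated_objects)` is available; the `global_allocations` helper and the background thread are hypothetical names made up for this example.

```ruby
# Hypothetical helper: returns how many objects were allocated
# process-wide while the block ran. It reads the global GC.stat
# counter, so allocations from every thread are included.
def global_allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# No other threads are running: the delta is close to the
# 1000 objects this block creates.
quiet = global_allocations do
  1000.times { Object.new }
end

# Same 1000 allocations, but a background thread (standing in for
# another request) allocates at the same time. Its objects are
# attributed to the measured block as well.
noisy = global_allocations do
  noise = Thread.new { 50_000.times { Object.new } }
  1000.times { Object.new }
  noise.join
end

puts "quiet: #{quiet}"
puts "noisy: #{noisy}"
```

The second number is dominated by the other thread's allocations even though the measured code is identical, which is why the global counter cannot be attributed to a single request.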
Usage looks like this:

```
[2] pry(main)> Thread.new do
[2] pry(main)*   1000.times do
[2] pry(main)*     Object.new
[2] pry(main)*   end
[2] pry(main)*   puts Thread.current.allocated_objects
[2] pry(main)* end
1000
```

This would be of great interest to folks profiling Ruby code in cases where we can't turn on more detailed object tracing tools. We currently use GC activity as a proxy for object allocations, but this would let us be way more precise.

Obviously performance is a big concern. Looking at GET_THREAD, this doesn't appear to have any clearly large overhead. To check this out, I ran the following benchmark:

```
require 'benchmark/ips'

Benchmark.ips do |benchmark|
  benchmark.report "Object.new" do
    Object.new
  end

  benchmark.report "Object.new" do
    Object.new
  end

  benchmark.report "Object.new" do
    Object.new
  end
end
```

Results from a few run-throughs locally:

Commit 9955bb0 on trunk:

```
Calculating -------------------------------------
          Object.new   105.244k i/100ms
          Object.new   105.814k i/100ms
          Object.new   106.579k i/100ms
-------------------------------------------------
          Object.new      4.886M (± 4.5%) i/s -     24.417M
          Object.new      4.900M (± 1.9%) i/s -     24.549M
          Object.new      4.835M (± 7.4%) i/s -     23.980M
```

With this patch:

```
Calculating -------------------------------------
          Object.new   114.248k i/100ms
          Object.new   114.508k i/100ms
          Object.new   114.472k i/100ms
-------------------------------------------------
          Object.new      4.776M (± 5.1%) i/s -     23.878M
          Object.new      4.767M (± 5.2%) i/s -     23.818M
          Object.new      4.818M (± 1.5%) i/s -     24.154M
```

I don't have a good sense of whether this is an acceptable level of change or not, but I figured without writing the code to test there was no way to know. What do you think?

---Files--------------------------------
thread-local.patch (2.04 KB)
thread-local-update.patch (2.05 KB)

--
https://bugs.ruby-lang.org/

Unsubscribe: