From: sam.saffron@...
Date: 2014-02-18T23:39:59+00:00
Subject: [ruby-core:60845] [ruby-trunk - Feature #9113] Ship Ruby for Linux with jemalloc out-of-the-box

Issue #9113 has been updated by Sam Saffron.

I can confirm 2 findings.

When heaps are small you barely notice a difference. When heaps grow and general memory fragmentation grows, jemalloc is far better.

I see a 6% reduction of RSS running discourse bench on 2.1.0:
https://github.com/discourse/discourse/blob/master/script/bench.rb

An artificial test is:

    @retained = []

    MAX_STRING_SIZE = 100

    # Interleaves short-lived and long-lived string allocations to
    # encourage heap fragmentation.
    def stress(allocate_count, retain_count, chunk_size)
      chunk = []
      while retain_count > 0 || allocate_count > 0
        if retain_count == 0 || (Random.rand < 0.5 && allocate_count > 0)
          # Short-lived allocation: released once the chunk is dropped.
          chunk << " " * (Random.rand * MAX_STRING_SIZE).to_i
          allocate_count -= 1
          if chunk.length > chunk_size
            chunk = []
          end
        else
          # Long-lived allocation: retained for the life of the process.
          @retained << " " * (Random.rand * MAX_STRING_SIZE).to_i
          retain_count -= 1
        end
      end
    end

    start = Time.now
    stress(1_000_000, 600_000, 200_000)

    puts "Duration: #{(Time.now - start).to_f}"

    # Print this process's ps line; RSS is the sixth column.
    puts `ps aux | grep #{Process.pid} | grep -v grep`

For glibc

    sam@ubuntu ~ % time ruby stress_mem.rb
    Duration: 0.705922489
    sam 17397 73.0 2.5 185888 156884 pts/10 Sl+ 10:37 0:00 ruby stress_mem.rb
    ruby stress_mem.rb 0.78s user 0.08s system 100% cpu 0.855 total

For jemalloc 3.5.0

    Duration: 0.676871705
    sam 17428 70.0 2.3 186248 144800 pts/10 Sl+ 10:37 0:00 ruby stress_mem.rb
    LD_PRELOAD=/home/sam/Source/jemalloc-3.5.0/lib/libjemalloc.so ruby 0.68s user 0.09s system 100% cpu 0.771 total

You can see that RSS is 8% or so lower with jemalloc (144800 KB vs. 156884 KB).

----------------------------------------
Feature #9113: Ship Ruby for Linux with jemalloc out-of-the-box
https://bugs.ruby-lang.org/issues/9113#change-45260

* Author: Sam Saffron
* Status: Feedback
* Priority: Normal
* Assignee:
* Category: build
* Target version:
----------------------------------------
libc's malloc is a problem: it fragments badly, meaning forks share less memory, and it is slow compared to tcmalloc or jemalloc.
Both jemalloc and tcmalloc are heavily battle tested and stable. Two years ago Redis picked up the jemalloc dependency, see:
http://oldblog.antirez.com/post/everything-about-redis-24.html

To quote antirez:

``
But an allocator is a serious thing. Since we introduced the specially encoded data types Redis started suffering from fragmentation. We tried different things to fix the problem, but basically the Linux default allocator in glibc sucks really, really hard.
``

---

I recently benched Discourse with tcmalloc, jemalloc and the default allocator and noticed 2 very important things:

* median request time is reduced by up to 10% (under both)
* PSS (proportional set size) is reduced by 10% under jemalloc and 8% under tcmalloc.

We can always use LD_PRELOAD to yank these in, but my concern is that standard distributions are using a far from optimal memory allocator. It would be awesome if the build, out-of-the-box, just checked if it was on Linux (eg: https://github.com/antirez/redis/blob/unstable/src/Makefile#L30-L34 ) and then used jemalloc instead.

--
http://bugs.ruby-lang.org/
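
For anyone wanting to reproduce the RSS comparison without grepping ps output, here is a rough sketch. It is not part of the original benchmark: the helper names rss_kb and percent_reduction are made up for illustration, and reading /proc/self/status assumes Linux. The two constants are the RSS values from the ps output quoted in this message.

```ruby
# Hypothetical helpers, not part of the original stress_mem.rb.

# Extract the resident set size in kB from the text of /proc/<pid>/status.
# Returns nil when the text contains no VmRSS line.
def rss_kb(status_text)
  match = status_text.match(/^VmRSS:\s+(\d+)\s+kB/)
  match && match[1].to_i
end

# Percentage by which `candidate` is smaller than `baseline`,
# rounded to one decimal place.
def percent_reduction(baseline, candidate)
  ((baseline - candidate) * 100.0 / baseline).round(1)
end

# RSS column from the two ps lines quoted in this message, in KB.
GLIBC_RSS_KB = 156_884
JEMALLOC_RSS_KB = 144_800

if __FILE__ == $PROGRAM_NAME
  reduction = percent_reduction(GLIBC_RSS_KB, JEMALLOC_RSS_KB)
  puts "RSS reduction with jemalloc: #{reduction}%"

  # Linux only: report the RSS of the current process.
  status_path = "/proc/self/status"
  if File.readable?(status_path)
    current = rss_kb(File.read(status_path))
    puts "Current RSS: #{current} kB"
  end
end
```

Calling rss_kb at the end of the stress script, once under glibc and once with LD_PRELOAD pointing at libjemalloc, should give the same two numbers that ps reports, and percent_reduction turns them into the "8% or so" figure.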