From: gustavo.pedrosa@...
Date: 2014-07-22T16:31:13+00:00
Subject: [ruby-core:63928] [ruby-trunk - Bug #10080] Functions marked as "static inline" are not inlined by gcc

Issue #10080 has been updated by Gustavo Frederico Temple Pedrosa.

Eric Wong wrote:
> Can you show a performance difference with always_inline on those
> functions?
>
> Inlining is not always faster, it bloats the code and eats cache.
> Perhaps GCC is avoiding that bloat.

Sorry, yes, you are right. There is a small performance penalty when every function marked as inline is, in fact, inlined.

There are other issues: many functions are not being inlined, either because GCC cannot inline them or because the compiler heuristics decide that inlining them would bloat the code. Do you think it would be worthwhile to remove the inline keyword in such cases?

----------------------------------------
Bug #10080: Functions marked as "static inline" are not inlined by gcc
https://bugs.ruby-lang.org/issues/10080#change-47957

* Author: Gustavo Frederico Temple Pedrosa
* Status: Open
* Priority: Normal
* Assignee: Nobuyoshi Nakada
* Category: build
* Target version:
* ruby -v: ruby 2.1.2p95 (2014-05-08 revision 45877)
* Backport: 2.0.0: UNKNOWN, 2.1: UNKNOWN
----------------------------------------
Since GCC 4.8.3, some static inline functions (such as rb_call0) are not inlined in the generated machine code.
This happens on both x86-64 and ppc64 platforms, and it can be verified with the following commands:

~~~
(in ppc64)
objdump -dS ruby | grep bl.*rb_call0
  155838: c1 f8 ff 4b bl 1550f8
  1558a4: 55 f8 ff 4b bl 1550f8
  1559e4: 15 f7 ff 4b bl 1550f8
  158b88: 71 c5 ff 4b bl 1550f8
  15bf40: b9 91 ff 4b bl 1550f8
  15c0f8: 01 90 ff 4b bl 1550f8
  15c490: 69 8c ff 4b bl 1550f8
  15c4cc: 2d 8c ff 4b bl 1550f8
  15c8bc: 3d 88 ff 4b bl 1550f8
  15cc94: 65 84 ff 4b bl 1550f8
  15d344: b5 7d ff 4b bl 1550f8
  15d3c4: 35 7d ff 4b bl 1550f8

(in x86-64)
objdump -dS ruby | grep call\.*rb_call0
  126280: e8 cb f9 ff ff callq 125c50
  126347: e8 04 f9 ff ff callq 125c50
  1264cc: e8 7f f7 ff ff callq 125c50
  128a9e: e8 ad d1 ff ff callq 125c50
  12ade9: e8 62 ae ff ff callq 125c50
  12af6a: e8 e1 ac ff ff callq 125c50
  12b2e6: e8 65 a9 ff ff callq 125c50
  12b5d7: e8 74 a6 ff ff callq 125c50
  12b899: e8 b2 a3 ff ff callq 125c50
  12bccd: e8 7e 9f ff ff callq 125c50
  12bdf0: e8 5b 9e ff ff callq 125c50
~~~

This behaviour could be fixed if every function declared inline were marked with the always_inline attribute when compiling with GCC.

--
https://bugs.ruby-lang.org/
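For reference, the always_inline approach suggested in the report might look roughly like the sketch below. It is only an illustration: the macro name FORCE_INLINE and the helper functions are hypothetical and are not taken from the Ruby sources. It assumes GCC (or a compiler that defines `__GNUC__` and accepts the same attribute) and falls back to a plain `static inline` hint elsewhere.

```c
/* Hypothetical macro, not from the Ruby tree: force inlining under GCC,
 * fall back to the ordinary inline hint on other compilers. */
#if defined(__GNUC__)
# define FORCE_INLINE static inline __attribute__((always_inline))
#else
# define FORCE_INLINE static inline
#endif

/* Ordinary hint: GCC's heuristics are free to emit a real call instead. */
static inline int add_hint(int a, int b)
{
    return a + b;
}

/* Forced: GCC inlines this at each direct call site, and diagnoses an
 * error if it is unable to do so. */
FORCE_INLINE int add_forced(int a, int b)
{
    return a + b;
}

/* Sums 1..n through the hinted helper. */
int sum_hint(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++)
        total = add_hint(total, i);
    return total;
}

/* Sums 1..n through the forced helper; same result, different inlining
 * guarantee. */
int sum_forced(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++)
        total = add_forced(total, i);
    return total;
}
```

Both callers compute the same value; the only difference is whether the compiler is allowed to decline the inlining. Disassembling the resulting object with objdump -d, as in the commands above, shows whether a call to the helper remains. As Eric Wong points out, forcing inlining everywhere can bloat the code, so a macro like this is probably best reserved for functions where a measured benefit exists.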