From: pdahorek@...
Date: 2020-01-10T05:09:51+00:00
Subject: [ruby-core:96753] [Ruby master Misc#16487] Potential for SIMD usage in ruby-core

Issue #16487 has been updated by ahorek (Pavel Rosický).

> Do you have any practical applications whose performance is significantly improved by the SIMD hacks? I'm unsure about coderange_scan, but it is difficult for me to imagine an application that String#strip is a bottleneck.

I agree, String#strip probably won't be a bottleneck; it was just easy to implement as an example.

There's a real use case for coderange_scan:
https://github.com/rubyjs/mini_racer/pull/128
https://github.com/rails/rails/blob/2ae9e5da734e85bc5afaa15089171f1e996bd306/activesupport/lib/active_support/core_ext/string/multibyte.rb#L48
https://github.com/rails/rails/blob/98a57aa5f610bc66af31af409c72173cdeeb3c9e/actionview/lib/action_view/template/handlers/erb.rb#L75
https://github.com/mikel/mail/blob/6bc16b4bce4fe280b19523c939b14a30e32a8ba4/lib/mail/fields/unstructured_field.rb#L28
etc.

The Steam hardware survey suggests that any reasonable x86 CPU supports at least SSE4.2.
https://store.steampowered.com/hwsurvey/

SSE2      100.00%
SSE3      100.00%
SSSE3      98.47%
SSE4.1     97.70%
SSE4.2     96.99%
AVX        92.79%
AVX2       74.63%
AVX512CD    0.16%

In fact, AVX was introduced in 2011, so this is a very low bar for portability. Some Linux distributions have already dropped support for old processors and use more aggressive flags by default.
https://clearlinux.org/news-blogs/smart-not-enough

Even PHP, which was mentioned earlier, has some functions optimized this way. Of course, it has to be carefully decided what's worth optimizing and what's not, but this is one of many opportunities to improve performance.

Here's also a very well written example:
https://dev.to/wunk/fast-array-reversal-with-simd-j3p

> it would need to be dynamic if we want most users to benefit from it.

SSE2 is a hard requirement for x86_64 CPUs. If you need a portable package, this is the baseline.

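To illustrate what relying on the SSE2 baseline could look like, here is a minimal sketch. The helper `all_ascii` is hypothetical and is not the actual `coderange_scan` from ruby-core; it only checks whether a buffer is 7-bit ASCII, 16 bytes at a time when the compiler targets SSE2, with a scalar loop for the tail and for other platforms.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* Hypothetical helper for illustration only (not ruby-core code):
 * returns 1 if every byte in s[0..len) is 7-bit ASCII, otherwise 0. */
static int
all_ascii(const unsigned char *s, size_t len)
{
    size_t i = 0;
    assert(s != NULL || len == 0);
#if defined(__SSE2__)
    /* Check 16 bytes per iteration. _mm_movemask_epi8 gathers the high
     * bit of each byte, so a nonzero mask means a non-ASCII byte. */
    for (; i + 16 <= len; i += 16) {
        __m128i chunk = _mm_loadu_si128((const __m128i *)(s + i));
        if (_mm_movemask_epi8(chunk) != 0) return 0;
    }
#endif
    /* Scalar tail; also the complete fallback when SSE2 is not targeted. */
    for (; i < len; i++) {
        if (s[i] & 0x80) return 0;
    }
    return 1;
}
```

Because the choice is made with `#if` at compile time, a binary built for x86_64 always takes the SSE2 path and needs no runtime check.
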
I don't think dynamic loading is a solution. You can't use, for example, AVX instructions that the compiler generates from regular C code, even if your processor supports them. You have to recompile the code for your platform; that's a pain with all C programs.

> Introducing SIMD will make maintenanceability worse.

That's definitely a true and valid concern. If there's a good library that makes things simpler (without sacrificing performance), that would be great.

----------------------------------------
Misc #16487: Potential for SIMD usage in ruby-core
https://bugs.ruby-lang.org/issues/16487#change-83744

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
* Assignee:
----------------------------------------
### Context

There are several Ruby core methods that could be optimized with SIMD instructions. I experimented a bit on `coderange_scan` (https://github.com/Shopify/ruby/pull/2), and Pavel Rosický experimented on `String#strip` (https://github.com/ruby/ruby/pull/2815).

### Problem

The downside of SIMD instructions is that they are not universally available. That means maintaining several versions of the same code and switching between them either statically or dynamically. And since most Ruby users use precompiled binaries from repositories and the like, the switching would need to be dynamic if we want most users to benefit from it.

So it's not exactly "free speed", as it means a more complex codebase.

### Question

So the question is whether ruby-core is open to patches using SIMD instructions, and if so, under which conditions.

cc @shyouhei

--
https://bugs.ruby-lang.org/
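
The dynamic switching described under "Problem" could look roughly like the sketch below. It assumes GCC or Clang on x86-64, where a single function can be compiled for AVX2 with the `target` attribute and selected at run time with `__builtin_cpu_supports`. All function names here are hypothetical and are not taken from either pull request.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__x86_64__) && defined(__GNUC__)
#include <immintrin.h>
#define HAVE_AVX2_VARIANT 1  /* assumption: GCC/Clang targeting x86-64 */
#endif

/* Portable baseline: returns 1 if s[0..len) is all 7-bit ASCII. */
static int
is_ascii_scalar(const unsigned char *s, size_t len)
{
    size_t i;
    assert(s != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (s[i] & 0x80) return 0;
    }
    return 1;
}

#ifdef HAVE_AVX2_VARIANT
/* Same contract, but only this function is compiled with AVX2 enabled,
 * so the rest of the binary stays runnable on older CPUs. */
__attribute__((target("avx2")))
static int
is_ascii_avx2(const unsigned char *s, size_t len)
{
    size_t i = 0;
    for (; i + 32 <= len; i += 32) {
        __m256i chunk = _mm256_loadu_si256((const __m256i *)(s + i));
        if (_mm256_movemask_epi8(chunk) != 0) return 0;
    }
    return is_ascii_scalar(s + i, len - i);  /* remaining tail bytes */
}
#endif

typedef int (*is_ascii_fn)(const unsigned char *, size_t);

/* Pick an implementation once, based on what the running CPU reports. */
static is_ascii_fn
select_is_ascii(void)
{
#ifdef HAVE_AVX2_VARIANT
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) return is_ascii_avx2;
#endif
    return is_ascii_scalar;
}
```

A caller would store the result of `select_is_ascii()` once and call through the pointer afterwards. This also shows the cost raised in the issue: every variant has to be written, tested, and kept in sync with the scalar one.
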