From: jzakiya@... Date: 2018-01-31T15:56:57+00:00 Subject: [ruby-core:85313] [Ruby trunk Feature#14383] Making prime_division in prime.rb Ruby 3 ready. Issue #14383 has been updated by jzakiya (Jabari Zakiya). This version is probably better, as it's a little faster, and takes a complete list of witnesses for a given number, which can be determined separately. ``` # Returns true if +self+ is a prime number, else returns false. def primemr? primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43] return primes.include? self if self <= primes.last return false unless primes.reduce(:*).gcd(self) == 1 wits = WITNESS_RANGES.find {|range, wits| range > self} # [range, [wit_prms]] or nil witnesses = wits && wits[1] || primes miller_rabin_test(witnesses) end private # Returns true if +self+ passes Miller-Rabin Test on witness +b+ def miller_rabin_test(witnesses) # use witness list to test with neg_one_mod = n = d = self - 1 d >>= 1 while d.even? witnesses.each do |b| s = d y = b.to_bn.mod_exp(d, self) # y = (b**d) mod self until s == n || y == 1 || y == neg_one_mod y = y.mod_exp(2, self) # y = (y**2) mod self s <<= 1 end return false unless y == neg_one_mod || s.odd? end true end ``` ---------------------------------------- Feature #14383: Making prime_division in prime.rb Ruby 3 ready. https://bugs.ruby-lang.org/issues/14383#change-70098 * Author: jzakiya (Jabari Zakiya) * Status: Open * Priority: Normal * Assignee: yugui (Yuki Sonoda) * Target version: Next Major ---------------------------------------- I have been running old code in Ruby 2.5.0 (released 2017.12.25) to check for speed and compatibility. I still see the codebase in `prime.rb` hardly has changed at all (except for replacing `Math.sqrt` with `Integer.sqrt`). To achieve the Ruby 3 goal to make it at least three times faster than Ruby 2 there are three general areas where Ruby improvements can occur. * increase the speed of its implementation at the machine level * rewrite its existing codebase in a more efficient|faster manner * use faster algorithms to implement routines and functions I want to suggest how to address the later two ways to improve performance of specifically the `prime_division` method in the `prime.rb` library. I've raised and made suggestions to some of these issues here [ruby-issues forum](https://bugs.ruby-lang.org/issues/12676) and now hope to invigorate additional discussion. Hopefully with the release of 2.5.0, and Ruby 3 conceptually closer to reality, more consideration will be given to coding and algorithmic improvements to increase its performance too. **Mathematical correctness** First I'd like to raise what I consider *math bugs* in `prime_division`, in how it handles `0` and `-1` inputs. ``` > -1.prime_division => [[-1,1]] > 0.prime_division Traceback (most recent call last): 4: from /home/jzakiya/.rvm/rubies/ruby-2.5.0/bin/irb:11:in `
' 3: from (irb):85 2: from /home/jzakiya/.rvm/rubies/ruby-2.5.0/lib/ruby/2.5.0/prime.rb:30:in `prime_division' 1: from /home/jzakiya/.rvm/rubies/ruby-2.5.0/lib/ruby/2.5.0/prime.rb:203:in `prime_division' ZeroDivisionError (ZeroDivisionError) ``` First, `0` is a perfectly respectable integer, and is non-prime, so its output should be `[]`, an empty array to denote it has no prime factors. The existing behavior is solely a matter of `prime_division`'s' implementation, and does not take this mathematical reality into account. The output for `-1` is also mathematically wrong because `1` is also non-prime (and correctly returns `[]`), well then mathematically so should `-1`. Thus, `prime_division` treats `-1` as a new prime number, and factorization, that has no mathematical basis. Thus, for mathematical correctness and consistency `-1` and `0` should both return `[]`, as none have prime factors. ``` > -1.prime_division => [] > 0.prime_division => [] > 1.prime_division => [] ``` There's a very simple one-line fix to `prime_division` to do this: ``` # prime.rb class Prime def prime_division(value, generator = Prime::Generator23.new) -- raise ZeroDivisionError if value == 0 ++ return [] if (value.abs | 1) == 1 ``` **Simple Code and Algorithmic Improvements** As stated above, besides the machine implementation improvements, the other areas of performance improvements will come from coding rewrites and better algorithms. Below is the coding of `prime_division`. This coding has existed at least since Ruby 2.0 (the farthest I've gone back). ``` # prime.rb class Integer # Returns the factorization of +self+. # # See Prime#prime_division for more details. def prime_division(generator = Prime::Generator23.new) Prime.prime_division(self, generator) end end class Prime def prime_division(value, generator = Prime::Generator23.new) raise ZeroDivisionError if value == 0 if value < 0 value = -value pv = [[-1, 1]] else pv = [] end generator.each do |prime| count = 0 while (value1, mod = value.divmod(prime) mod) == 0 value = value1 count += 1 end if count != 0 pv.push [prime, count] end break if value1 <= prime end if value > 1 pv.push [value, 1] end pv end end ``` This can be rewritten in more modern and idiomatic Ruby, to become much shorter and easier to understand. ``` require 'prime.rb' class Integer def prime_division1(generator = Prime::Generator23.new) Prime.prime_division1(self, generator) end end class Prime def prime_division1(value, generator = Prime::Generator23.new) # raise ZeroDivisionError if value == 0 return [] if (value.abs | 1) == 1 pv = value < 0 ? [[-1, 1]] : [] value = value.abs generator.each do |prime| count = 0 while (value1, mod = value.divmod(prime); mod) == 0 value = value1 count += 1 end pv.push [prime, count] unless count == 0 break if prime > value1 end pv.push [value, 1] if value > 1 pv end end ``` By merely rewriting it we get smaller|concise code, that's easier to understand, which is slightly faster. A *triple win!* Just paste the above code into a 2.5.0 terminal session, and run the benchmarks below. ``` def tm; s=Time.now; yield; Time.now-s end n = 500_000_000_000_000_000_008_244_213; tm{ pp n.prime_division } [[3623, 1], [61283, 1], [352117631, 1], [6395490847, 1]] => 27.02951016 n = 500_000_000_000_000_000_008_244_213; tm{ pp n.prime_division1 } [[3623, 1], [61283, 1], [352117631, 1], [6395490847, 1]] => 25.959149721 ``` Again, we get a *triple win* to this old codebase by merely rewriting it. It can be made 3x faster by leveraging the `prime?` method from the `OpenSSL` library to perform a more efficient|faster factoring algorithm, and implementation. ``` require 'prime.rb' require 'openssl' class Integer def prime_division2(generator = Prime::Generator23.new) return [] if (self.abs | 1) == 1 pv = self < 0 ? [-1] : [] value = self.abs prime = generator.next until value.to_bn.prime? or value == 1 while prime (pv << prime; value /= prime; break) if value % prime == 0 prime = generator.next end end pv << value if value > 1 pv.group_by {|prm| prm }.map{|prm, exp| [prm, exp.size] } end end ``` Here we're making much better use of Ruby idioms and libraries (`enumerable` and `openssl`), leading to a much greater performance increase. A bigger *triple win*. Pasting this code into a 2.5.0 terminal session gives the following results. ``` # Hardware: System76 laptop; I7 cpu @ 3.5GHz, 64-bit Linux def tm; s=Time.now; yield; Time.now-s end n = 500_000_000_000_000_000_008_244_213; tm{ pp n.prime_division } [[3623, 1], [61283, 1], [352117631, 1], [6395490847, 1]] => 27.02951016 n = 500_000_000_000_000_000_008_244_213; tm{ pp n.prime_division1 } [[3623, 1], [61283, 1], [352117631, 1], [6395490847, 1]] => 25.959149721 n = 500_000_000_000_000_000_008_244_213; tm{ pp n.prime_division2 } [[3623, 1], [61283, 1], [352117631, 1], [6395490847, 1]] => 9.39650374 ``` `prime_division2` is much more usable for significantly larger numbers and use cases than `prime_division`. I can even do multiple times better than this, if you review the above cited forum thread. My emphasis here is to show there are a lot of possible *low hanging fruit* performance gains ripe for the picking to achieve Ruby 3 performance goals, if we look (at minimum) for simpler|better code rewrites, and then algorithmic upgrades. So the question is, are the devs willing to upgrade the codebase to provide the demonstrated performance increases shown here for `prime_division`? ---Files-------------------------------- bm.rb (1.94 KB) -- https://bugs.ruby-lang.org/ Unsubscribe: