From: jzakiya@... Date: 2018-01-30T18:01:29+00:00 Subject: [ruby-core:85261] [Ruby trunk Feature#14383] Making prime_division in prime.rb Ruby 3 ready. Issue #14383 has been updated by jzakiya (Jabari Zakiya). FYI, I re-ran the examples above (and an additional one) using the `Miller-Rabin` implementation in my `primes-utls` 3.0 development branch. Not only is it deterministic up to about 25-digits, but it's way faster too. It can probably be made faster by using a `WITNESS_RANGES` list with fewer witnesses, and can now be easily extended as optimum witnesses are found for larger number ranges. I provide this to raise the issue that absolute determinism for a primality test function maybe shouldn't be an absolute criteria for selection. Of course, it is the ideal, but sometimes `striving for perfection can be the enemy of good enough`. Again, hybrid methods may be able to combine the best of all worlds. ``` > n= (10**100+267); tm{ p n.primesmr n+5000 } => 0.056422744 3.0 dev => 0.06310091 2.7 > n= (10**1000+267); tm{ p n.primesmr n+5000 } => 7.493292636 3.0 dev => 11.089284953 2.7 > n= (10**2000+267); tm{ p n.primesmr n+5000 } => 36.614877874 3.0 dev => 56.129938705 2.7 > n= (10**3000+267); tm{ p n.primesmr n+5000 } => 10000000000........1027 => 192.866720666 3.0 dev => 286.244139793 2.7 ``` ``` # Miller-Rabin version in Primes-Utils 2.7 def primemr?(k=20) # increase k for more reliability n = self.abs return true if [2,3].include? n return false unless [1,5].include?(n%6) and n > 1 d = n - 1 s = 0 (d >>= 1; s += 1) while d.even? k.times do a = 2 + rand(n-4) x = a.to_bn.mod_exp(d,n) # x = (a**d) mod n next if x == 1 or x == n-1 (s-1).times do x = x.mod_exp(2,n) # x = (x**2) mod n return false if x == 1 break if x == n-1 end return false if x != n-1 end true # n is prime (with high probability) end ``` ``` # Miller-Rabin version in Primes-Utils 3.0 dev # Returns true if +self+ is a prime number, else returns false. def primemr? primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43] return primes.include? self if self <= primes.last return false unless primes.reduce(:*).gcd(self) == 1 wits = WITNESS_RANGES.find {|range, wits| range > self} # [range, [wit_prms]] or nil witnesses = wits && wits[1] || primes witnesses.each {|p| return false unless miller_rabin_test(p) } true end private # Returns true if +self+ passes Miller-Rabin Test on witness +b+ def miller_rabin_test(b) # b is a witness to test with n = d = self - 1 d >>= 1 while d.even? y = b.to_bn.mod_exp(d, self) # x = (b**d) mod n until d == n || y == n || y == 1 y = y.mod_exp(2, self) # y = (y**2) mod self d <<= 1 end y == n || d.odd? end WITNESS_RANGES = { 2_047 => [2], 1_373_653 => [2, 3], 25_326_001 => [2, 3, 5], 3_215_031_751 => [2, 3, 5, 7], 2_152_302_898_747 => [2, 3, 5, 7, 11], 3_474_749_660_383 => [2, 3, 5, 7, 11, 13], 341_550_071_728_321 => [2, 3, 5, 7, 11, 13, 17], 3_825_123_056_546_413_051 => [2, 3, 5, 7, 11, 13, 17, 19, 23], 318_665_857_834_031_151_167_461 => [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37], 3_317_044_064_679_887_385_961_981 => [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41] } ``` ---------------------------------------- Feature #14383: Making prime_division in prime.rb Ruby 3 ready. https://bugs.ruby-lang.org/issues/14383#change-70022 * Author: jzakiya (Jabari Zakiya) * Status: Open * Priority: Normal * Assignee: yugui (Yuki Sonoda) * Target version: Next Major ---------------------------------------- I have been running old code in Ruby 2.5.0 (released 2017.12.25) to check for speed and compatibility. I still see the codebase in `prime.rb` hardly has changed at all (except for replacing `Math.sqrt` with `Integer.sqrt`). To achieve the Ruby 3 goal to make it at least three times faster than Ruby 2 there are three general areas where Ruby improvements can occur. * increase the speed of its implementation at the machine level * rewrite its existing codebase in a more efficient|faster manner * use faster algorithms to implement routines and functions I want to suggest how to address the later two ways to improve performance of specifically the `prime_division` method in the `prime.rb` library. I've raised and made suggestions to some of these issues here [ruby-issues forum](https://bugs.ruby-lang.org/issues/12676) and now hope to invigorate additional discussion. Hopefully with the release of 2.5.0, and Ruby 3 conceptually closer to reality, more consideration will be given to coding and algorithmic improvements to increase its performance too. **Mathematical correctness** First I'd like to raise what I consider *math bugs* in `prime_division`, in how it handles `0` and `-1` inputs. ``` > -1.prime_division => [[-1,1]] > 0.prime_division Traceback (most recent call last): 4: from /home/jzakiya/.rvm/rubies/ruby-2.5.0/bin/irb:11:in `<main>' 3: from (irb):85 2: from /home/jzakiya/.rvm/rubies/ruby-2.5.0/lib/ruby/2.5.0/prime.rb:30:in `prime_division' 1: from /home/jzakiya/.rvm/rubies/ruby-2.5.0/lib/ruby/2.5.0/prime.rb:203:in `prime_division' ZeroDivisionError (ZeroDivisionError) ``` First, `0` is a perfectly respectable integer, and is non-prime, so its output should be `[]`, an empty array to denote it has no prime factors. The existing behavior is solely a matter of `prime_division`'s' implementation, and does not take this mathematical reality into account. The output for `-1` is also mathematically wrong because `1` is also non-prime (and correctly returns `[]`), well then mathematically so should `-1`. Thus, `prime_division` treats `-1` as a new prime number, and factorization, that has no mathematical basis. Thus, for mathematical correctness and consistency `-1` and `0` should both return `[]`, as none have prime factors. ``` > -1.prime_division => [] > 0.prime_division => [] > 1.prime_division => [] ``` There's a very simple one-line fix to `prime_division` to do this: ``` # prime.rb class Prime def prime_division(value, generator = Prime::Generator23.new) -- raise ZeroDivisionError if value == 0 ++ return [] if (value.abs | 1) == 1 ``` **Simple Code and Algorithmic Improvements** As stated above, besides the machine implementation improvements, the other areas of performance improvements will come from coding rewrites and better algorithms. Below is the coding of `prime_division`. This coding has existed at least since Ruby 2.0 (the farthest I've gone back). ``` # prime.rb class Integer # Returns the factorization of +self+. # # See Prime#prime_division for more details. def prime_division(generator = Prime::Generator23.new) Prime.prime_division(self, generator) end end class Prime def prime_division(value, generator = Prime::Generator23.new) raise ZeroDivisionError if value == 0 if value < 0 value = -value pv = [[-1, 1]] else pv = [] end generator.each do |prime| count = 0 while (value1, mod = value.divmod(prime) mod) == 0 value = value1 count += 1 end if count != 0 pv.push [prime, count] end break if value1 <= prime end if value > 1 pv.push [value, 1] end pv end end ``` This can be rewritten in more modern and idiomatic Ruby, to become much shorter and easier to understand. ``` require 'prime.rb' class Integer def prime_division1(generator = Prime::Generator23.new) Prime.prime_division1(self, generator) end end class Prime def prime_division1(value, generator = Prime::Generator23.new) # raise ZeroDivisionError if value == 0 return [] if (value.abs | 1) == 1 pv = value < 0 ? [[-1, 1]] : [] value = value.abs generator.each do |prime| count = 0 while (value1, mod = value.divmod(prime); mod) == 0 value = value1 count += 1 end pv.push [prime, count] unless count == 0 break if prime > value1 end pv.push [value, 1] if value > 1 pv end end ``` By merely rewriting it we get smaller|concise code, that's easier to understand, which is slightly faster. A *triple win!* Just paste the above code into a 2.5.0 terminal session, and run the benchmarks below. ``` def tm; s=Time.now; yield; Time.now-s end n = 500_000_000_000_000_000_008_244_213; tm{ pp n.prime_division } [[3623, 1], [61283, 1], [352117631, 1], [6395490847, 1]] => 27.02951016 n = 500_000_000_000_000_000_008_244_213; tm{ pp n.prime_division1 } [[3623, 1], [61283, 1], [352117631, 1], [6395490847, 1]] => 25.959149721 ``` Again, we get a *triple win* to this old codebase by merely rewriting it. It can be made 3x faster by leveraging the `prime?` method from the `OpenSSL` library to perform a more efficient|faster factoring algorithm, and implementation. ``` require 'prime.rb' require 'openssl' class Integer def prime_division2(generator = Prime::Generator23.new) return [] if (self.abs | 1) == 1 pv = self < 0 ? [-1] : [] value = self.abs prime = generator.next until value.to_bn.prime? or value == 1 while prime (pv << prime; value /= prime; break) if value % prime == 0 prime = generator.next end end pv << value if value > 1 pv.group_by {|prm| prm }.map{|prm, exp| [prm, exp.size] } end end ``` Here we're making much better use of Ruby idioms and libraries (`enumerable` and `openssl`), leading to a much greater performance increase. A bigger *triple win*. Pasting this code into a 2.5.0 terminal session gives the following results. ``` # Hardware: System76 laptop; I7 cpu @ 3.5GHz, 64-bit Linux def tm; s=Time.now; yield; Time.now-s end n = 500_000_000_000_000_000_008_244_213; tm{ pp n.prime_division } [[3623, 1], [61283, 1], [352117631, 1], [6395490847, 1]] => 27.02951016 n = 500_000_000_000_000_000_008_244_213; tm{ pp n.prime_division1 } [[3623, 1], [61283, 1], [352117631, 1], [6395490847, 1]] => 25.959149721 n = 500_000_000_000_000_000_008_244_213; tm{ pp n.prime_division2 } [[3623, 1], [61283, 1], [352117631, 1], [6395490847, 1]] => 9.39650374 ``` `prime_division2` is much more usable for significantly larger numbers and use cases than `prime_division`. I can even do multiple times better than this, if you review the above cited forum thread. My emphasis here is to show there are a lot of possible *low hanging fruit* performance gains ripe for the picking to achieve Ruby 3 performance goals, if we look (at minimum) for simpler|better code rewrites, and then algorithmic upgrades. So the question is, are the devs willing to upgrade the codebase to provide the demonstrated performance increases shown here for `prime_division`? ---Files-------------------------------- bm.rb (1.94 KB) -- https://bugs.ruby-lang.org/ Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe> <http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>