From: jzakiya@... Date: 2016-10-10T17:15:24+00:00 Subject: [ruby-core:77539] [Ruby trunk Feature#12676] Significant performance increase, and code conciseness, for prime_division method in prime.rb Issue #12676 has been updated by Jabari Zakiya. Thanks for your explanation. I understood (after looking at the source code) what was being done, and how it was being done, but from a user perspective I was inquiring *why* it was being done that way. Also purely from a user perspective, the method name `prime_factors` is much clearer and intuitive than `prime_division`. It says what the method will produce and not what it's doing. ``` this: 47824821.prime_factors is intuitively clearer than this: 47824821.prime_division ``` I suggest at least aliasing this for >= Ruby 2.4 because Ruby 3 will have backward incompatibiities anyway, and since I would imagine `prime.rb` is not used by a whole lot of people, creating at least an alias puts the name on a conceptually better foundation of what user would expect. Looking in code and seeing the use of `n.prime_factors` is just so much clearer. Finally, if you don't want to directly `require openssl` you can still use the technique that uses a fast `prime?` method. You can paste the code below directly into an irb session (or load from a file). I created two versions of `prime_factors`, one using `prime?` from `openssl` and the other using the upgraded `prime?` method from the current trunk code for `prime.rb` (Thanks, again, for incorporating my suggestion for it into trunk.) https://github.com/ruby/ruby/blob/trunk/lib/prime.rb For the randomly choseen 27 digit integer shown `prime_division` takes twice as much time as `prime_factors` and `prime_factors1`, because half its time (21 sec) is used to perform trial division on the last|largest prime. Thus, using this technique enables the code to be greatly simplified (conceptually and in locs) while significantly increasing speed, in this case 2x faster. However, `prime_factors` is still 7x slower, and needs more total code, than `prime_division9` (from previous post) which is still the better method, especially to deal with large primes. I present this to note and demonstrate you can apply this technique within the existing codebase to make prime factorization faster, simpler to code, and easier to upgrade by using a faster generator or `prime?` method. (In my `primes-utils` gem I use a Miller-Rabin primality test method, but to uses the `openssl` method ` mod_exp(n,m)`. This just reinforces the fact that `openssl` math methods are very useful|fast for working with arbitrary sized numbers.) ``` require 'prime.rb' require 'openssl' class Integer def prime? return self >= 2 if self <= 3 return true if self == 5 return false unless 30.gcd(self) == 1 (7..Math.sqrt(self).to_i).step(30) do |p| return false self%(p) == 0 || self%(p+4) == 0 || self%(p+6) == 0 || self%(p+10) == 0 || self%(p+12) == 0 || self%(p+16) == 0 || self%(p+22) == 0 || self%(p+24) == 0 end true end def prime_factors(generator = Prime::Generator23.new) return [] if self | 1 == 1 pv = self < 0 ? [-1] : [] value = self.abs prime = generator.next until value.to_bn.prime? or value == 1 while prime (pv << prime; value /= prime; break) if value % prime == 0 prime = generator.next end end pv << value if value > 1 pv.group_by {|prm| prm }.map{|prm, exp| [prm, exp.size] } end def prime_factors1(generator = Prime::Generator23.new) return [] if self | 1 == 1 pv = self < 0 ? [-1] : [] value = self.abs prime = generator.next until value.prime? or value == 1 while prime (pv << prime; value /= prime; break) if value % prime == 0 prime = generator.next end end pv << value if value > 1 pv.group_by {|prm| prm }.map{|prm, exp| [prm, exp.size] } end end ``` ``` System: System76 laptop; I7 3.5GHz cpu > n = 500000000000000000008244213; n.to_s.size => 27 > n = 500000000000000000008244213; n.prime_factors => [[3623, 1], [61283, 1], [352117631, 1], [6395490847, 1]] n.prime_division 42.61 secs n.prime_factors1 23.43 secs n.prime_factors 21.05 secs n.prime_division9 3.35 secs ``` ---------------------------------------- Feature #12676: Significant performance increase, and code conciseness, for prime_division method in prime.rb https://bugs.ruby-lang.org/issues/12676#change-60807 * Author: Jabari Zakiya * Status: Assigned * Priority: Normal * Assignee: Marc-Andre Lafortune ---------------------------------------- I earlier posted code to simplify the prime_division method in prime.rb. This made the code much more concise and readable/understandable, while also providing a small speed increase. The code presented here for prime_division2 maintains the conciseness and readability, but uses a different/simpler algorithm to provide a significant speed increase over the current implementation of prime_division in prime.rb. Timings for selected large primes are provided, run on CRuby 2.3.1. System: System76 3.5GHz I7 cpu laptop, Linux 64-bit OS in Virtual Box. ``` n1 = 100_000_000_000_000_003 n2 = 200_000_000_000_000_003 n3 = 1_000_000_000_000_000_003 n1 n2 n3 prime_division 23.7 33.5 74.6 prime_division1 22.9 32.2 72.8 prime_division2 14.8 20.5 45.8 ``` ```ruby def tm; s = Time.now; yield; Time.now - s end irb(main):015:0> n = 100_000_000_000_000_003; tm{ n.prime_division } => 23.730239721 irb(main):016:0> n = 100_000_000_000_000_003; tm{ n.prime_division1 } => 22.877657172 irb(main):017:0> n = 100_000_000_000_000_003; tm{ n.prime_division2 } => 14.758561827 irb(main):018:0> n = 200_000_000_000_000_003; tm{ n.prime_division } => 33.502851342 irb(main):019:0> n = 200_000_000_000_000_003; tm{ n.prime_division1 } => 32.23911595 irb(main):020:0> n = 200_000_000_000_000_003; tm{ n.prime_division2 } => 20.476660683 irb(main):021:0> n = 1_000_000_000_000_000_003; tm{ n.prime_division } => 74.630244055 irb(main):022:0> n = 1_000_000_000_000_000_003; tm{ n.prime_division1 } => 72.778948947 irb(main):023:0> n = 1_000_000_000_000_000_003; tm{ n.prime_division2 } => 45.802756121 ``` 1. Current code for prime_division in prime.rb. ```ruby def prime_division(value, generator = Prime::Generator23.new) raise ZeroDivisionError if value == 0 if value < 0 value = -value pv = [[-1, 1]] else pv = [] end generator.each do |prime| count = 0 while (value1, mod = value.divmod(prime) mod) == 0 value = value1 count += 1 end if count != 0 pv.push [prime, count] end break if value1 <= prime end if value > 1 pv.push [value, 1] end pv end ``` 2. Code simplification for current algorithm, increases conciseness/readability, with slight speedup. ```ruby def prime_division1(value, generator = Prime::Generator23.new) raise ZeroDivisionError if value == 0 pv = value < 0 ? [[-1, 1]] : [] value = value.abs generator.each do |prime| count = 0 while (value1, mod = value.divmod(prime); mod) == 0 value = value1 count += 1 end pv.push [prime, count] unless count == 0 break if prime > value1 end pv.push [value, 1] if value > 1 pv end ``` 3. Change of algorithm, maintains conciseness/readability with significant speed increase. ```ruby def prime_division2(value, generator = Prime::Generator23.new) raise ZeroDivisionError if value == 0 pv = value < 0 ? [-1] : [] value = value.abs sqrt_value = Math.sqrt(value).to_i generator.each do |prime| break if prime > sqrt_value while value % prime == 0 pv << prime value /= prime sqrt_value = Math.sqrt(value).to_i end end pv << value if value > 1 pv.group_by {|prm| prm }.map{|prm, exp| [prm, exp.size] } end ``` -- https://bugs.ruby-lang.org/ Unsubscribe: