From: jzakiya@...
Date: 2016-08-29T18:53:39+00:00
Subject: [ruby-core:77097] [Ruby trunk Feature#12676] Significant performance increase, and code conciseness, for prime_division method in prime.rb

Issue #12676 has been updated by Jabari Zakiya.

Ah, but let's not forget the context that brings us here.

With prime_division6 we have blown past my intended purpose to make prime_division at least 3x faster, to meet the Ruby 3x3 goal. We are now orders of magnitude faster than the current implementation, which makes it possible to process previously unthinkably large numbers in reasonable times.

Actually, the code:

```
self | 1 == 1
```

is quite easy to do in hardware. It just sets the lsb of the number to '1', and if any other bits are '1' then it's 'false'. (Ruby parses it as `(self | 1) == 1`, because `|` binds tighter than `==`.) I don't understand the concern about using it for large numbers, because other calculations will use the initial large number as well. It certainly won't affect anything (technically).

I changed the functionality to this implementation because it requires less cpu work, is shorter than the original functionally equivalent snippet, and also looks better (to me).

The main thing for me is the recognition that '0' and '1' should produce the same answer of ``[]``. To support this, *nix systems come with the CLI command 'factor' (part of GNU coreutils on Linux). Doing the following gives the same (mathematically correct) answers.

```
jzakiya@jzakiya-VirtualBox ~ $ factor 0
0:
jzakiya@jzakiya-VirtualBox ~ $ factor 1
1:
jzakiya@jzakiya-VirtualBox ~ $ factor 2
2: 2
```

Thus, when you ask the question what are the factors of '0' or '1', the answer is 'none'.

In fact, if you want to make the prime_division function as fast as possible we can use the 'factor' command to implement it, as shown below (this is what my 'primes-utils' gem uses if it detects the OS is a *nix variant). This will give instantaneous results until you exceed the allowable number size for 'factor'.
```
class Integer
  def factors
    return [] if self | 1 == 1
    factors = self < 0 ? [-1] : []
    factors += `factor #{self.abs}`.split(' ')[1..-1].map(&:to_i)
    factors.group_by {|prm| prm }.map {|prm, exp| [prm, exp.size] }
  end
end
```

So instead of just leveraging the resources of the Ruby Standard Library we can also leverage the resources of the Unix standard environment.

Obviously, we can't make a factoring function that can work on every arbitrarily large number, on a real physical computer, within a realistic time frame for computation. But prime_division6|factors more than exceed the goals for Ruby 3x3, which for me, and what I use Ruby for, makes it much more attractive|usable to applied math and science domains, and engineering.

Lastly, (on this topic) I would suggest changing (or providing an alias for) the name 'prime_division'. What the method is functionally doing is providing a list of a number's prime factors, but the current name focuses on a process mechanism. I think it's clearer to use the following:

Instead of:

```
n.prime_division
```

rename it to:

```
n.prime_factors
```

or what I like the best (because it's shorter and more consistent with the Unix 'factor' command):

```
n.factors
```

Then you can rename:

```
Integer.from_prime_division
```

to a more descriptive

```
Integer.from_prime_factors
```

And lastly, lastly, on a related issue, I also suggest the following.

The technique in prime_division6 is dependent on using a fast 'prime?' method, which Ruby comes with in its OpenSSL standard library. I would suggest/urge using it in the prime.rb library as its 'prime?' method. I raised this issue of why create/use slow implementations for a primality tester when Ruby already has a fast one. See: https://bugs.ruby-lang.org/issues/12673

I would suggest doing this in prime.rb for 'prime?', which keeps it and 'prime_division(6)' in one nice neat place. Then, if you find/create a faster implementation of 'prime?'
only one method has to change without changing the semantics (or dependencies) on any other method that uses it.

```
require 'openssl'

class Integer

  def prime?
    self.to_bn.prime?
  end

  def prime_division7(pg_selector = 0)
    return [] if self | 1 == 1
    return [[self, 1]] if self.prime?
    base_primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
    pv = self < 0 ? [-1] : []
    value = self.abs
    base_primes.each {|prm| (pv << prm; value /= prm) while value % prm == 0 }
    sqrt_value = Math.sqrt(value).to_i
    num = self.abs == value ? value : sqrt_value
    residues, *, mod = init_generator1(num, pg_selector)
    rn = residues.size - 1         # last_residue_index
    modk = r = 0
    until value.prime? or value == 1
      while (prime = modk + residues[r]) <= sqrt_value
        if value % prime == 0
          pv << prime
          value /= prime
          sqrt_value = Math.sqrt(value).to_i
          break
        end
        r += 1; (r = 0; modk += mod) if r > rn
      end
    end
    pv << value if value > 1
    pv.group_by {|prm| prm }.map{|prm, exp| [prm, exp.size] }
  end

  private
  def init_generator1(num, pg_selector)
    base_primes = [2, 3, 5, 7, 11, 13, 17, 19, 23]
    pg_selector = select_pg(num.abs) unless base_primes.include? pg_selector
    # puts "Using P#{pg_selector}"
    base_primes.select! {|prm| prm <= pg_selector }
    mod = base_primes.reduce(:*)
    residues = []; 3.step(mod, 2) {|r| residues << r if mod.gcd(r) == 1 }
    [residues << mod + 1, base_primes, mod]
  end

  def select_pg(num)               # adaptively select fastest SP Prime Generator
    return 5  if num < 1 * 10**7  + 1000
    return 7  if num < 1 * 10**10 + 1000
    return 11 if num < 1 * 10**13 + 1000
    return 13 if num < 7 * 10**15 + 1000
    return 17 if num < 4 * 10**18 + 1000
    19
  end
end
```

----------------------------------------
Feature #12676: Significant performance increase, and code conciseness, for prime_division method in prime.rb
https://bugs.ruby-lang.org/issues/12676#change-60319

* Author: Jabari Zakiya
* Status: Assigned
* Priority: Normal
* Assignee: Yuki Sonoda
----------------------------------------
I earlier posted code to simplify the prime_division method in prime.rb. This made the code much more concise and readable/understandable, while also providing a small speed increase.

The code presented here for prime_division2 maintains the conciseness and readability, but uses a different/simpler algorithm to provide a significant speed increase over the current implementation of prime_division in prime.rb.

Timings for selected large primes are provided, run on CRuby 2.3.1.
System: System76 3.5GHz I7 cpu laptop, Linux 64-bit OS in Virtual Box.
```
n1 =   100_000_000_000_000_003
n2 =   200_000_000_000_000_003
n3 = 1_000_000_000_000_000_003

                   n1      n2      n3    (seconds)
prime_division     23.7    33.5    74.6
prime_division1    22.9    32.2    72.8
prime_division2    14.8    20.5    45.8
```

```ruby
def tm; s = Time.now; yield; Time.now - s end

irb(main):015:0> n = 100_000_000_000_000_003; tm{ n.prime_division }
=> 23.730239721
irb(main):016:0> n = 100_000_000_000_000_003; tm{ n.prime_division1 }
=> 22.877657172
irb(main):017:0> n = 100_000_000_000_000_003; tm{ n.prime_division2 }
=> 14.758561827

irb(main):018:0> n = 200_000_000_000_000_003; tm{ n.prime_division }
=> 33.502851342
irb(main):019:0> n = 200_000_000_000_000_003; tm{ n.prime_division1 }
=> 32.23911595
irb(main):020:0> n = 200_000_000_000_000_003; tm{ n.prime_division2 }
=> 20.476660683

irb(main):021:0> n = 1_000_000_000_000_000_003; tm{ n.prime_division }
=> 74.630244055
irb(main):022:0> n = 1_000_000_000_000_000_003; tm{ n.prime_division1 }
=> 72.778948947
irb(main):023:0> n = 1_000_000_000_000_000_003; tm{ n.prime_division2 }
=> 45.802756121
```

1. Current code for prime_division in prime.rb.

```ruby
def prime_division(value, generator = Prime::Generator23.new)
  raise ZeroDivisionError if value == 0
  if value < 0
    value = -value
    pv = [[-1, 1]]
  else
    pv = []
  end
  generator.each do |prime|
    count = 0
    while (value1, mod = value.divmod(prime)
           mod) == 0
      value = value1
      count += 1
    end
    if count != 0
      pv.push [prime, count]
    end
    break if value1 <= prime
  end
  if value > 1
    pv.push [value, 1]
  end
  pv
end
```

2. Code simplification for current algorithm, increases conciseness/readability, with slight speedup.

```ruby
def prime_division1(value, generator = Prime::Generator23.new)
  raise ZeroDivisionError if value == 0
  pv = value < 0 ? [[-1, 1]] : []
  value = value.abs
  generator.each do |prime|
    count = 0
    while (value1, mod = value.divmod(prime); mod) == 0
      value = value1
      count += 1
    end
    pv.push [prime, count] unless count == 0
    break if prime > value1
  end
  pv.push [value, 1] if value > 1
  pv
end
```

3. Change of algorithm, maintains conciseness/readability with significant speed increase.

```ruby
def prime_division2(value, generator = Prime::Generator23.new)
  raise ZeroDivisionError if value == 0
  pv = value < 0 ? [-1] : []
  value = value.abs
  sqrt_value = Math.sqrt(value).to_i
  generator.each do |prime|
    break if prime > sqrt_value
    while value % prime == 0
      pv << prime
      value /= prime
      sqrt_value = Math.sqrt(value).to_i
    end
  end
  pv << value if value > 1
  pv.group_by {|prm| prm }.map{|prm, exp| [prm, exp.size] }
end
```

-- 
https://bugs.ruby-lang.org/
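As a cross-check of the approach in item 3 above (trial division with a square-root bound that shrinks after every division), here is a minimal standalone sketch. It is not code from prime.rb or from this thread: the method name `trial_division_factors` and the inline candidate generator (2, 3, then numbers of the form 6k +/- 1) are assumptions made so the snippet runs without `Prime::Generator23`, and it uses `Integer.sqrt` (Ruby 2.5+) in place of `Math.sqrt(value).to_i` so the bound stays exact for integers beyond Float precision.

```ruby
# Hypothetical standalone helper (not part of prime.rb): trial division that
# recomputes the square-root bound each time a factor is divided out.
def trial_division_factors(value)
  raise ZeroDivisionError if value == 0

  flat = value < 0 ? [-1] : []    # flat factor list, e.g. [2, 2, 3]
  value = value.abs
  limit = Integer.sqrt(value)     # exact integer square root (Ruby 2.5+)

  # Assumed stand-in for Prime::Generator23: yields 2, 3, 5, 7, 11, 13, 17, ...
  # (every prime appears, plus some composites that never divide `value`).
  candidates = Enumerator.new do |y|
    y << 2
    y << 3
    k = 5
    loop do
      y << k
      y << k + 2
      k += 6
    end
  end

  candidates.each do |cand|
    break if cand > limit
    while (value % cand).zero?
      flat << cand
      value /= cand
      limit = Integer.sqrt(value) # smaller value, smaller search bound
    end
  end

  flat << value if value > 1      # whatever remains is a single prime
  flat.group_by { |f| f }.map { |f, group| [f, group.size] }
end

p trial_division_factors(360)     # => [[2, 3], [3, 2], [5, 1]]
p trial_division_factors(-12)     # => [[-1, 1], [2, 2], [3, 1]]
```

Collecting a flat list and grouping it at the end, as item 3 does, keeps the inner loop free of per-prime counters; the cost is one extra pass over a list that is at most about log2(n) entries long.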