From: merch-redmine@...
Date: 2019-05-30T15:17:59+00:00
Subject: [ruby-core:92888] [Ruby trunk Feature#15797] Use realpath(3) instead of custom realpath implementation if available

Issue #15797 has been updated by jeremyevans0 (Jeremy Evans).

I was able to run the benchmarks in a Linux virtual machine. This includes the benchmarks I added for `File.realdirpath` and for `File.realpath` where the paths do not exist. `File.realdirpath` always uses the emulated code on Linux, so its performance is basically unchanged. For `File.realpath`, this results in a 30%-146% performance improvement.

```
Warming up --------------------------------------
relative_nil                 68.629k i/s -  70.070k times in 1.020996s (14.57μs/i)
absolute_nil                 57.387k i/s -  60.762k times in 1.058804s (17.43μs/i)
relative_relative            55.483k i/s -  58.799k times in 1.059775s (18.02μs/i)
absolute_relative            57.913k i/s -  61.438k times in 1.060868s (17.27μs/i)
relative_absolute            45.609k i/s -  45.756k times in 1.003219s (21.93μs/i)
relative_nil_dir             29.184k i/s -  30.960k times in 1.060844s (34.26μs/i)
absolute_nil_dir             33.260k i/s -  35.388k times in 1.063982s (30.07μs/i)
relative_relative_dir        28.179k i/s -  29.892k times in 1.060801s (35.49μs/i)
absolute_relative_dir        32.118k i/s -  33.948k times in 1.056964s (31.13μs/i)
relative_absolute_dir        31.790k i/s -  34.008k times in 1.069771s (31.46μs/i)
relative_nil_notexist        37.898k i/s -  39.276k times in 1.036371s (26.39μs/i)
absolute_nil_notexist        34.093k i/s -  35.460k times in 1.040093s (29.33μs/i)
relative_relative_notexist   32.874k i/s -  34.464k times in 1.048382s (30.42μs/i)
absolute_relative_notexist   33.897k i/s -  35.736k times in 1.054244s (29.50μs/i)
relative_absolute_notexist   28.660k i/s -  30.480k times in 1.063515s (34.89μs/i)
Calculating -------------------------------------
                            new/ruby  old/ruby
relative_nil                 69.788k   28.397k i/s - 205.887k times in 2.950191s 7.250217s
absolute_nil                 58.143k   32.227k i/s - 172.162k times in 2.961032s 5.342120s
relative_relative            55.612k   27.348k i/s - 166.447k times in 2.992992s 6.086198s
absolute_relative            58.391k   30.164k i/s - 173.738k times in 2.975413s 5.759725s
relative_absolute            45.409k   30.837k i/s - 136.827k times in 3.013203s 4.437175s
relative_nil_dir             29.215k   29.319k i/s -  87.552k times in 2.996856s 2.986143s
absolute_nil_dir             32.813k   33.347k i/s -  99.779k times in 3.040857s 2.992129s
relative_relative_dir        28.121k   28.421k i/s -  84.536k times in 3.006131s 2.974449s
absolute_relative_dir        32.306k   32.606k i/s -  96.355k times in 2.982614s 2.955131s
relative_absolute_dir        31.942k   31.896k i/s -  95.369k times in 2.985704s 2.989975s
relative_nil_notexist        38.438k   20.326k i/s - 113.692k times in 2.957835s 5.593418s
absolute_nil_notexist        34.183k   22.248k i/s - 102.279k times in 2.992088s 4.597170s
relative_relative_notexist   32.986k   19.736k i/s -  98.620k times in 2.989738s 4.996882s
absolute_relative_notexist   33.944k   22.027k i/s - 101.691k times in 2.995879s 4.616749s
relative_absolute_notexist   28.781k   21.903k i/s -  85.979k times in 2.987334s 3.925385s
Comparison:
relative_nil
    new/ruby:   69787.7 i/s
    old/ruby:   28397.4 i/s - 2.46x slower
absolute_nil
    new/ruby:   58142.6 i/s
    old/ruby:   32227.3 i/s - 1.80x slower
relative_relative
    new/ruby:   55612.2 i/s
    old/ruby:   27348.3 i/s - 2.03x slower
absolute_relative
    new/ruby:   58391.2 i/s
    old/ruby:   30164.3 i/s - 1.94x slower
relative_absolute
    new/ruby:   45409.2 i/s
    old/ruby:   30836.5 i/s - 1.47x slower
relative_nil_dir
    old/ruby:   29319.4 i/s
    new/ruby:   29214.6 i/s - 1.00x slower
absolute_nil_dir
    old/ruby:   33347.2 i/s
    new/ruby:   32812.8 i/s - 1.02x slower
relative_relative_dir
    old/ruby:   28420.7 i/s
    new/ruby:   28121.2 i/s - 1.01x slower
absolute_relative_dir
    old/ruby:   32606.0 i/s
    new/ruby:   32305.6 i/s - 1.01x slower
relative_absolute_dir
    new/ruby:   31941.9 i/s
    old/ruby:   31896.2 i/s - 1.00x slower
relative_nil_notexist
    new/ruby:   38437.6 i/s
    old/ruby:   20326.0 i/s - 1.89x slower
absolute_nil_notexist
    new/ruby:   34183.2 i/s
    old/ruby:   22248.3 i/s - 1.54x slower
relative_relative_notexist
    new/ruby:   32986.2 i/s
    old/ruby:   19736.3 i/s - 1.67x slower
absolute_relative_notexist
    new/ruby:   33943.6 i/s
    old/ruby:   22026.5 i/s - 1.54x slower
relative_absolute_notexist
    new/ruby:   28781.2 i/s
    old/ruby:   21903.3 i/s - 1.31x slower
```

----------------------------------------
Feature #15797: Use realpath(3) instead of custom realpath implementation if available
https://bugs.ruby-lang.org/issues/15797#change-78268

* Author: jeremyevans0 (Jeremy Evans)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
----------------------------------------
One reason to do this is simplicity, as this approach is ~30 lines of code instead of ~200.

Performance-wise, this performs 25%-115% better, using the following benchmark on OpenBSD 6.5:

```ruby
require 'benchmark'

f = File
pwd = Dir.pwd
Dir.mkdir('b') unless f.directory?('b')
f.write('b/a', '') unless f.file?('b/a')

args = [
  ["b/a", nil],
  ["#{pwd}/b/a", nil],
  ['a', 'b'],
  ["#{pwd}/b/a", 'b'],
  ["b/a", pwd]
]

args.each do |path, base|
  print "File.realpath(#{path.inspect}, #{base.inspect}): ".ljust(50)
  puts Benchmark.measure{100000.times{f.realpath(path, base)}}
end
```

Before:

```
File.realpath("b/a", nil):                          4.330000   2.990000   7.320000 (  7.316244)
File.realpath("/home/testr/ruby/b/a", nil):         3.560000   2.680000   6.240000 (  6.240951)
File.realpath("a", "b"):                            4.370000   3.080000   7.450000 (  7.452511)
File.realpath("/home/testr/ruby/b/a", "b"):         3.730000   2.640000   6.370000 (  6.371979)
File.realpath("b/a", "/home/testr/ruby"):           3.590000   2.630000   6.220000 (  6.226824)
```

After:

```
File.realpath("b/a", nil):                          1.370000   2.030000   3.400000 (  3.400775)
File.realpath("/home/testr/ruby/b/a", nil):         1.260000   2.770000   4.030000 (  4.024957)
File.realpath("a", "b"):                            2.090000   1.990000   4.080000 (  4.080284)
File.realpath("/home/testr/ruby/b/a", "b"):         1.400000   2.620000   4.020000 (  4.015505)
File.realpath("b/a", "/home/testr/ruby"):           2.150000   2.760000   4.910000 (  4.910634)
```

If someone could benchmark before/after with this
patch on Linux and/or MacOS X, and post the results here, I would appreciate it.

My personal reason for wanting this is that the custom realpath implementation does not work with OpenBSD's unveil(2) system call, which limits access to the file system, allowing for security similar to chroot(2) without most of the downsides.

This change passes all tests except for one assertion related to taintedness. Previously, if either argument to `File.realpath` was an absolute path, the returned value was considered not tainted. However, I believe that behavior to be incorrect: if there is a symlink anywhere in the path, the returned value can contain a section that was taken from the file system (an unreliable source) and that was never marked as untainted. Example:

```ruby
Dir.mkdir('b') unless File.directory?('b')
File.write('b/a', '') unless File.file?('b/a')
File.symlink('b', 'c') unless File.symlink?('c')
path = File.realpath('c/a'.untaint, Dir.pwd.untaint)
path # "/home/testr/ruby/b/a"
path.tainted? # should be true, as 'b' comes from file system
```

I believe it is safer to always mark the output of realpath as tainted to prevent this issue, which is what this commit does.

---Files--------------------------------
use-native-realpath.patch (6.31 KB)
use-native-realpath-v2.patch (4.64 KB)
use-native-realpath-v3.patch (5.18 KB)
use-native-realpath-v4.patch (6.34 KB)

--
https://bugs.ruby-lang.org/