From: takashikkbn@...
Date: 2021-06-15T20:50:32+00:00
Subject: [ruby-core:104289] [Ruby master Bug#17992] Upstreaming the htmlentities gem into CGI#.(un)escape_html

Issue #17992 has been updated by k0kubun (Takashi Kokubun).

Status changed from Open to Feedback

Could you give a bit more context on why you'd like to escape characters that `CGI.escapeHTML` does not currently escape?

I believe `CGI.escapeHTML` has primarily been used to keep escaped content from breaking the DOM structure, with optimal performance. That behavior is very understandable to me, and I would rather not escape any other characters, for the best performance, as long as leaving them alone is not considered a security vulnerability.

```rb
require 'benchmark/ips'
require 'htmlentities'
require 'cgi/escape'

str = <<~HTML

Example Domain

This domain is established to be used for illustrative examples in documents. You may use this domain in examples without prior coordination or asking for permission.

More information...

HTML

coder = HTMLEntities.new
Benchmark.ips do |x|
  x.report("CGI.escapeHTML") { CGI.escapeHTML(str) }
  x.report("HTMLEntities #{HTMLEntities::VERSION::STRING}") { coder.encode(str) }
  x.compare!
end
```

```
ruby 3.0.0p0 (2020-12-25 revision 95aff21468) [x86_64-darwin19]
Warming up --------------------------------------
      CGI.escapeHTML   112.937k i/100ms
  HTMLEntities 4.3.4     1.029k i/100ms
Calculating -------------------------------------
      CGI.escapeHTML      1.131M (± 2.3%) i/s -      5.760M in   5.095252s
  HTMLEntities 4.3.4     10.281k (± 2.1%) i/s -     51.450k in   5.006333s

Comparison:
      CGI.escapeHTML:  1131036.5 i/s
  HTMLEntities 4.3.4:    10281.4 i/s - 110.01x  (± 0.00) slower
```

Note that `CGI.escapeHTML` is the default HTML escape method. You'll make every embedded Ruby expression 110x slower if you suddenly replace `CGI.escapeHTML` with that gem.

We may want to support escaping some other characters for some other usages, but for backward compatibility and for performance in existing call sites, the feature must be enabled by a new option or another method.

----------------------------------------
Bug #17992: Upstreaming the htmlentities gem into CGI#.(un)escape_html
https://bugs.ruby-lang.org/issues/17992#change-92506

* Author: AMomchilov (Alexander Momchilov)
* Status: Feedback
* Priority: Normal
* Backport: 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN
----------------------------------------
Hi there,

I was looking to unescape some HTML entities in a String, and I discovered that `CGI#.(un)escape_html` is **really** limited.

Many StackOverflow questions share a similar disappointment, and point users to the [htmlentities gem](https://github.com/threedaymonk/htmlentities):

1. https://stackoverflow.com/a/383561/3141234
2. https://stackoverflow.com/a/22926384/3141234

This solved my problem, but I feel like something this standard/universal should be built-in.

To that end, I'm interested in working on merging the htmlentities gem into CGI's repo. Would this be a welcome change?

* I've e-mailed the author (Paul Battley) privately, and got his blessing to do so.
* It's MIT licensed, so that should be OK.

-- 
https://bugs.ruby-lang.org/
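
For readers without the thread context, here is a minimal sketch of the limitation the issue describes. It assumes only the standard `cgi/escape` library; the sample strings and variable names are made up for illustration and do not come from the thread.

```ruby
require 'cgi/escape'

# CGI.escapeHTML rewrites only the five characters that can break markup:
# & < > " and '
escaped = CGI.escapeHTML(%q(<a href="x">Tom & 'Jerry'</a>))
puts escaped

# Non-ASCII text passes through untouched; no named entity is produced.
passthrough = CGI.escapeHTML("café ©")
puts passthrough

# CGI.unescapeHTML reverses those five, plus numeric character references...
decoded = CGI.unescapeHTML("&lt;b&gt; &amp; &#233;")
puts decoded

# ...but leaves other named entities such as &eacute; as they are.
untouched = CGI.unescapeHTML("&eacute; &copy;")
puts untouched
```

Decoding the last string is the kind of case where the issue author reached for the htmlentities gem, while the narrow character set in the first call is what keeps `CGI.escapeHTML` fast in the benchmark above.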