From: "marcandre (Marc-Andre Lafortune)" Date: 2013-02-06T06:05:24+09:00 Subject: [ruby-core:51886] [ruby-trunk - Bug #7780] Marshal & YAML should deserialize only basic types by default. Issue #7780 has been updated by marcandre (Marc-Andre Lafortune). Anonymous wrote: > On Mon, Feb 04, 2013 at 12:54:50PM +0900, marcandre (Marc-Andre Lafortune) wrote: > > > > Issue #7780 has been updated by marcandre (Marc-Andre Lafortune). > > > > > > mame (Yusuke Endoh) wrote: > > > I think that this issue is not a bug, but a new feature. > > > > I would rather not argue about this. > > > > > We should keep {YAML|Marshal}.load "as is" (i.e., dangerous), and that we will introduce {YAML|Marshal}.safe_load in the next minor. > > > > This is worth arguing over. > > > > What downside do you see to my proposition? > > What upsides do you see to yours? > > Do you believe that the typical use is to call `safe_load` or `unsafe_load`? > > Why should the shortest and default way not be the safe one? > > > > charliesome (Charlie Somerville) wrote: > > > However I think YAML.load should be safe, since most people using YAML only use it for primitive types and are not aware that it is able to deserialize into any class. > > > > I'm glad to have support on this. > > Another source that supports this point of view: http://nedbatchelder.com/blog/201302/war_is_peace.html > > It discusses PyYAML which decided to have `load` (unsafe) and `safe_load`. It doesn't come bundled with python but is still used; a google search will point to different pull requests for python libraries to use `safe_load` instead of `load`, e.g. https://bugs.launchpad.net/cloud-init/+bug/1015818 > > > > This could all be avoided with `load` being safe! > > > > I hope that Charliesome, myself and others can convince Matz / tenderlove that YAML.load should be safe by default. > > *Many* people use YAML load / dump for unsafe operations, e.g. storing > serialized objects in the database. I am very against changing this > behavior. > > making `load` "safe" by default would > break lots of Rails apps. I'll repeat that the proposal is for a migration path, giving Rails or any library using YAML at least a year of transition where they can call `load!` if YAML responds to it otherwise `load`. To have breakage in production, you would need: 1) A rails app that uses ActiveRecord's `serialize` with custom classes (how rare that is is up for debate) 2) That doesn't upgrade to Ruby 2.0.0 or decides to ignore the warning 3) That will upgrade directly to next minor (say in one year) but not upgrade to *any* new version of Rails in the meantime 4) And not run any tests when doing so, nor read the release notes. If they do, and decide to maintain the same version of Rails, a one-line monkey patch restores `YAML.load` as the dangerous version. I really don't see this as a problem. What I see as a problem is having sites compromised. My preference is clear between a rare site that breaks through convoluted upgrade process, or vulnerable sites. ---------------------------------------- Bug #7780: Marshal & YAML should deserialize only basic types by default. https://bugs.ruby-lang.org/issues/7780#change-35872 Author: marcandre (Marc-Andre Lafortune) Status: Assigned Priority: Normal Assignee: Category: Target version: next minor ruby -v: r39035 YAML is a wonderful, powerful and expressive format to serialize data in a human readable way. It can be used, for example, to read and write nice configuration files, to store strings, numbers, dates & times in a hash. YAML.load will, by default, instantiate any user class and set instance variables directly. On the other hand, this can make apparently innocent code lead to major vulnerabilities, as was clearly illustrated by different exploits recently. I feel YAML.load should, by default, be safe to use, for example by instantiating only known core classes. The same can be said for Marshal, even though it would more rarely be used as a public interface to exchange data. Maybe the following transition path could be taken: 1) Have {YAML|Marshal}.load issue a warning (once) that next minor will only deserialize basic types. 2) Create {YAML|Marshal}.unsafe_load, which does the same thing as current `load`, without a warning of course. As these changes are compatible and extremely minor, I would like them to be considered for Ruby 2.0.0. They also make for a "Secure by default" is not a new concept. Rails 3.0 has XSS protection by default, for example. The fact that one needs to do extra processing like calling `raw` when that security needs to be bypassed makes XSS attacks less likely. I believe the typical use of Yaml.load is to load basic types. We should expect users to use the easiest solution, so that should be the safe way. If a tool makes the safe way of doing things the default, and makes it easy to do more complex deserializing (e.g. whitelisting some user classes), this can only lead to less vulnerabilities. I hope nobody will take offence that I've tagged this issue as a "bug". The current behavior is as speced, but it can be argued that a design that is inherently insecure is a defect. -- http://bugs.ruby-lang.org/