From: duerst@...
Date: 2016-01-05T06:52:40+00:00
Subject: [ruby-core:72710] [Ruby trunk - Bug #11706] Clean up files etc/unicode/name2ctype.{h.blt, kwd, src}

Issue #11706 has been updated by Martin Dürst.

Chris Seaton wrote:
> I've been dealing with an issue related to this. When Ruby updated to MRI 7.0

Do you mean Unicode 7.0?

> the name2ctype.h was updated but not the name2ctype.src, so they're now inconsistent (look at CR_Blank for example).

What do you mean by "now"? What's your current revision/Ruby version? As for inconsistencies, I indeed mentioned that.

> I found this problem when I tried to update JCodings (part of JRuby)

Can you tell me where in the JRuby source tree these files are?

> which generated its tables from these files. It uses the name2ctype.src, so got the wrong values.
>
> I'll update JCodings to read from name2ctype.h instead.
>
> You've listed name2ctype.h as an intermediate that should be deleted. I'm not sure that's right - it's actually the original source now isn't it?

But I haven't listed it as an intermediary; I only listed name2ctype.h.blt, which isn't the same file.

> It's the only file in https://github.com/k-takata/Onigmo/tree/master/enc/unicode. I don't think that one can be deleted.

I didn't propose to delete it, but it could be deleted because it's an intermediate file in the sense that the original source of the data is the Unicode database itself.

> https://github.com/jruby/jcodings/issues/13

I'll add a pointer to here to that issue.
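
The kind of inconsistency mentioned above (CR_Blank differing between name2ctype.h and name2ctype.src) could be checked mechanically. The following is only a minimal Ruby sketch. It assumes that both files contain tables of the form `static const OnigCodePoint CR_Blank[] = { count, first, last, ... };`, which may not hold for every revision. The helper names `extract_ranges` and `table_diff` are hypothetical and not part of Ruby, Onigmo, or JCodings.

```ruby
# Sketch: compare one CR_* code range table between two generated files.
# Assumption: a table is a C array holding a range count followed by
# first/last code point pairs, e.g.
#   static const OnigCodePoint CR_Blank[] = { 2, 0x0009, 0x0009, 0x0020, 0x0020, };

# Returns an array of [first, last] pairs for the named table,
# or nil if the table is not present in the text.
def extract_ranges(text, table)
  m = text.match(/\b#{Regexp.escape(table)}\s*\[\]\s*=\s*\{(.*?)\}/m)
  return nil unless m

  body = m[1].gsub(%r{/\*.*?\*/}m, "")            # drop C comments inside the array
  numbers = body.scan(/0x\h+|\d+/).map { |n| Integer(n) }
  count = numbers.shift                            # leading element is the range count
  pairs = numbers.each_slice(2).to_a
  warn "#{table}: header says #{count} ranges, found #{pairs.size}" if count != pairs.size
  pairs
end

# Returns the ranges that appear in only one of the two texts.
def table_diff(text_a, text_b, table)
  a = extract_ranges(text_a, table) || []
  b = extract_ranges(text_b, table) || []
  { only_in_a: a - b, only_in_b: b - a }
end

# Hypothetical usage (paths are assumptions about a local checkout):
#   h   = File.read("enc/unicode/name2ctype.h")
#   src = File.read("enc/unicode/name2ctype.src")
#   p table_diff(h, src, "CR_Blank")
```

An empty result for both keys would indicate that the two files agree on that table; anything else points at the ranges that differ.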

----------------------------------------
Bug #11706: Clean up files etc/unicode/name2ctype.{h.blt,kwd,src}
https://bugs.ruby-lang.org/issues/11706#change-55960

* Author: Martin Dürst
* Status: Open
* Priority: Normal
* Assignee: Nobuyoshi Nakada
* ruby -v:
* Backport: 2.0.0: UNKNOWN, 2.1: UNKNOWN, 2.2: UNKNOWN
----------------------------------------
The files name2ctype.{h.blt,kwd,src} in etc/unicode are intermediate products that are not needed in the repository, and haven't been committed consistently. I propose to remove them.

[I'm not sure this is a bug or a feature, but it doesn't provide any new functionality, so feature doesn't seem right.]

[I've assigned this to Nobu for feedback; I can execute it once we agree on a way forward.]

On 2015/11/17 15:39, Nobuyoshi Nakada wrote:

> Please update name2ctype.{h.blt,kwd,src} files too.

Thanks for the reminder. I had a look at these files. Maybe before further commits, we can try to simplify things a bit, and/or to ignore irrelevant stuff. Sorry this message is long.

Looking at the three files you mentioned, I noticed the following:

enc/unicode/name2ctype.h.kwd was produced on the Onigmo side, when I worked on the update (see also https://github.com/k-takata/Onigmo/pull/58), too. However, it is not part of the Onigmo distribution. It was last committed by Yui Naruse at r36070, on 2012/06/14. This is way before the update to Unicode 7.0.0 with r46831.

On 2011/11/20, K. Takata introduced
https://github.com/k-takata/Onigmo/blob/master/tool/convert-name2ctype.sh,
which is used as:

    convert-name2ctype.sh name2ctype.kwd > name2ctype.h

to directly convert from name2ctype.kwd to name2ctype.h (although it produces a few numbered intermediary files which are removed in the last step).

enc/unicode/name2ctype.h.blt was last committed by yourself in r49292 on 2015/01/17. Your log message mentions r46831, but it is unclear why you updated .h.blt and not .kwd and .src.
The last commit before this was r36070, same as for name2ctype.h.kwd.

enc/unicode/name2ctype.src also was last committed in r36070.

Looking at Makefile.in, it contains instructions to create enc/unicode/name2ctype.h from enc/unicode/name2ctype.kwd at http://svn.ruby-lang.org/cgi-bin/viewvc.cgi/trunk/Makefile.in?view=markup#l340. There, .h.blt and .src are mentioned, but my knowledge of shell syntax isn't good enough to understand what's exactly supposed to go on.

My conclusions so far would be:

- name2ctype.{h.blt,kwd,src} are all intermediary files that are not actually used directly for building Ruby.

- In the last few years, these three files have been committed only rarely and accidentally, not in any visible sync with actual bug fixes or feature additions.

- Onigmo no longer uses name2ctype.h.blt and .src, and does not commit .kwd.

- The build process on the Onigmo side, although I did it manually, was well documented and painless; on the Ruby side, it may be possible to build enc/unicode/name2ctype.h (the file that's finally used for compilation), but I haven't found how to do so.

- For a process that needs to be done about once a year, this amount of manual work seems perfectly fine (at least for me, and I volunteer to do it again next year).

- Therefore, I suggest that we don't care about committing name2ctype.{h.blt,kwd,src}. If you want me to commit enc/unicode/name2ctype.h.kwd, I can do it (because I have the new version). Indeed, it might be better to remove these three files; they only make checkouts heavier.

- If we want to simplify the production process, my preference would be to update Makefile.in based on convert-name2ctype.sh, or to directly integrate convert-name2ctype.sh into tool/enc-unicode.rb (why would one want to use sed and friends if we already use ruby?)

--
https://bugs.ruby-lang.org/