Safely escaping and reading back a file path in Ruby

Posted by user336851 on Stack Overflow
Published on 2010-05-13T20:22:10Z Indexed on 2010/05/13 22:04 UTC

I need to save some information about a few files. Nothing too fancy, so I thought I would go with a simple text file, one line per item. Something like this:

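require 'digest/sha1'
require 'scanf' # String#scanf; a separate gem since Ruby 2.7

# write (mtime is stored as an integer timestamp)
io.print "%i %s %s\n" % [File.mtime(fname).to_i, fname, Digest::SHA1.file(fname).hexdigest]
# read
io.each do |line|
  mtime, name, hash = line.scanf "%i %s %s"
end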

Of course this doesn't work because a file name can contain spaces (breaking scanf) and line breaks (breaking IO#each).
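
To make the two failure modes concrete, here is a small sketch. It uses String#split as a stand-in for scanf (an assumption, chosen so the snippet runs without the scanf gem), and parse_naive is a made-up helper name, not part of the original code.

```ruby
require "stringio"

# Stand-in for line.scanf("%i %s %s"): split the line on whitespace.
def parse_naive(line)
  mtime, name, hash = line.split(" ")
  [mtime.to_i, name, hash]
end

plain  = parse_naive("1273782130 notes.txt da39a3ee\n")
spaced = parse_naive("1273782130 my notes.txt da39a3ee\n")
# plain  has name "notes.txt" and hash "da39a3ee"
# spaced has name "my" and hash "notes.txt": the fields have shifted

# A line break inside the name makes line-based reading see two lines.
io = StringIO.new("1273782130 odd\nname.txt da39a3ee\n")
line_count = io.each_line.count
# line_count is 2 for what was meant to be a single record
```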

The line break problem can be avoided by dropping each and using a series of gets calls with an explicit separator:

until io.eof?
  mtime = Time.at(io.gets(" ").to_i)
  name = io.gets(" ").chomp(" ") # gets keeps the separator, so strip it
  hash = io.gets("\n").chomp
end

Dealing with spaces in the names is another matter. Now we need to do some escaping.
Note: I like space as a field delimiter, but I'd have no issue swapping it for one that is easier to use. In the case of file names, though, the only one that could help is ASCII NUL ("\0"), and a NUL-delimited file isn't really a text file anymore...
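
For reference, the NUL-delimited variant mentioned in the note could look roughly like the sketch below. The helper names write_record and read_records are hypothetical, and the layout (every field, including the last, terminated by "\0") is an assumption made for the example.

```ruby
require "stringio"

# Hypothetical helper: every field, including the last one, ends with NUL.
def write_record(io, mtime, name, hash)
  io << [mtime.to_i, name, hash].join("\0") << "\0"
end

# Hypothetical helper: read NUL-terminated fields and group them by three.
def read_records(io)
  fields = io.each_line("\0").map { |field| field.chomp("\0") }
  fields.each_slice(3).map do |mtime, name, hash|
    [Time.at(mtime.to_i), name, hash]
  end
end

io = StringIO.new
write_record(io, Time.at(1273782130), "my odd\nname.txt", "da39a3ee")
io.rewind
records = read_records(io)
```

No escaping is needed because NUL cannot appear in a path, but as the note says, the result is awkward to read or edit in a text editor.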

I initially had a wall of text detailing the iterations of my struggle to write a correct escaping function and its inverse, but it was just boring and not really useful. I'll just give you the final result:

def write_name(io, val)
  io << val.gsub(/([\\ ])/, "\\\\\\1") # yes, that's 6 backslashes!
end

def read_name(io)
  name, continued = "", true
  while continued
    continued = false
    name += io.gets(' ').gsub(/\\(.)/) do |c|
      if c=="\\\\"
        "\\"
      elsif c=="\\ "
        continued=true
        " "
      else
        raise "unexpected backslash escape: %p (%s %i)" % [c, io.path, io.pos]
      end
    end
  end
  return name.chomp(' ')
end

I'm not happy at all with read_name. It's way too long and awkward, and I feel it shouldn't be that hard.
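
One possible shorter reader, offered only as a sketch: walk the stream one character at a time with getc instead of re-joining chunks from gets(' '). It assumes the same escaping as write_name above (a backslash before every backslash and space), which is repeated here so the snippet is self-contained.

```ruby
require "stringio"

# write_name as in the post: prefix every backslash and space with a backslash.
def write_name(io, val)
  io << val.gsub(/([\\ ])/, "\\\\\\1")
end

# Read characters until the first unescaped space, undoing the escapes.
def read_name(io)
  name = ""
  while (c = io.getc)
    case c
    when " "
      return name # an unescaped space ends the field
    when "\\"
      escaped = io.getc
      unless ["\\", " "].include?(escaped)
        raise "unexpected backslash escape: %p" % [escaped]
      end
      name << escaped
    else
      name << c
    end
  end
  raise EOFError, "name field not terminated by a space"
end

io = StringIO.new
original = "my odd\\ file\nname.txt"
write_name(io, original)
io << " " << "da39a3ee\n"
io.rewind
decoded = read_name(io)
```

Reading character by character is slower than gets on large files, so this trades speed for a reader that is easier to follow.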

While trying to make this work, I tried to come up with other ways:

  • The BitTorrent bencode / PHP serialize way: prefix the file name with its length, then just io.read(name_len.to_i). It works, but it's a real pain to edit the file by hand. At this point we're halfway to a binary format.

  • String#inspect: this one looks made expressly for that purpose! Except it seems the only way to get the value back is through eval, and I hate the idea of eval-ing a string I didn't generate from trusted data.
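
Both alternatives can be sketched briefly. All helper names below are hypothetical. The second part swaps String#inspect for String#dump, whose output can be read back with String#undump (available from Ruby 2.5) without eval. This is a different method from the one named above, and I have only considered ASCII names here, so behaviour with other bytes should be checked before relying on it.

```ruby
require "stringio"

# Alternative 1: length-prefixed name, written as "<bytesize>:<bytes>".
def write_name_prefixed(io, name)
  io << "#{name.bytesize}:#{name}"
end

def read_name_prefixed(io)
  len = Integer(io.gets(":").chomp(":"), 10)
  # read(len) returns a binary string; UTF-8 is assumed for the names here.
  io.read(len).force_encoding(Encoding::UTF_8)
end

# Alternative 2: String#dump on write, String#undump on read, no eval.
# dump escapes line breaks and quotes, so each record stays on one line.
def format_record(mtime, name, hash)
  "%d %s %s\n" % [mtime.to_i, name.dump, hash]
end

def parse_record(line)
  m = line.match(/\A(\d+) (".*") (\h+)\n?\z/)
  raise ArgumentError, "malformed record: %p" % [line] unless m
  [Time.at(m[1].to_i), m[2].undump, m[3]]
end

io = StringIO.new
write_name_prefixed(io, "my odd\nname.txt")
io.rewind
prefixed = read_name_prefixed(io)

tricky = "my \"odd\"\\ file\nname.txt"
line = format_record(Time.at(1273782130), tricky, "da39a3ee")
parsed = parse_record(line)
```

The dump form keeps the file editable by hand, since a name shows up as an ordinary double-quoted Ruby string literal.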

So. Opinions? Isn't there some lib which can do all this? Am I missing something obvious? How would you do that?
