Sharing large objects between Ruby processes without a performance hit

Posted by Gdeglin on Stack Overflow, 2010-05-26.

I have a Ruby hash that reaches approximately 10 megabytes if written to a file using Marshal.dump. After gzip compression it is approximately 500 kilobytes.

Iterating through and altering this hash is very fast in Ruby (fractions of a millisecond). Even copying it is extremely fast.

The problem is that I need to share the data in this hash between Ruby on Rails processes. To do this with the Rails cache (file_store or memcached) I have to Marshal.dump the hash first, but that incurs a 1000 millisecond delay when serializing it and a 400 millisecond delay when deserializing it.
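
For illustration, this is roughly where that cost shows up with the Rails cache; 'big_hash' is just a placeholder key, and the timings are the ones quoted above, not guarantees:

require 'benchmark'

# Both file_store and memcached marshal the value on write and
# unmarshal it on read, so the full cost is paid on every access.
puts Benchmark.realtime { Rails.cache.write('big_hash', @a) } # roughly the 1000 ms dump
puts Benchmark.realtime { Rails.cache.read('big_hash') }      # roughly the 400 ms load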

Ideally I would want to be able to save and load this hash from each process in under 100 milliseconds.

One idea is to spawn a new Ruby process to hold this hash that provides an API to the other processes to modify or process the data within it, but I want to avoid doing this unless I'm certain that there are no other ways to share this object quickly.
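
A minimal sketch of that idea, using DRb from the standard library (the URI, the wrapper class, and its accessor methods here are placeholders, not code I have):

require 'drb/drb'

# Process that owns the data and exposes it to the Rails processes.
class HashHolder
  def initialize(data)
    @data = data
  end

  def get(row, col)
    @data[row][col]
  end

  def set(row, col, value)
    @data[row][col] = value
  end
end

DRb.start_service('druby://localhost:8787', HashHolder.new(@a))
DRb.thread.join

# In each Rails process, only the method calls cross the process
# boundary, never the whole structure:
#   holder = DRbObject.new_with_uri('druby://localhost:8787')
#   holder.get(0, 0)

The catch is a round trip per access, so anything that touches many cells would have to run on the server side.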

Is there a way I can more directly share this hash between processes without needing to serialize or deserialize it?

Here is the code I'm using to generate a structure similar to the hash I'm working with:

@a = []
0.upto(500) do |r|
  @a[r] = []
  0.upto(10_000) do |c|
    if rand(10) == 0 
      @a[r][c] = 1 # 10% chance of being 1
    else
      @a[r][c] = 0
    end
  end
end

@c = Marshal.dump(@a) # 1000 milliseconds
Marshal.load(@c) # 400 milliseconds
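
For what it's worth, the ~10 megabyte / ~500 kilobyte figures above can be checked against this generated data with the standard library's Zlib (deflate rather than gzip, but close enough for a size check); the exact numbers will differ from my real hash:

require 'zlib'

puts "marshal: #{@c.bytesize / 1024} KB"
puts "compressed: #{Zlib::Deflate.deflate(@c).bytesize / 1024} KB"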
