Sharing large objects between Ruby processes without a performance hit
- by Gdeglin
I have a Ruby hash that reaches approximately 10 megabytes if written to a file using Marshal.dump. After gzip compression it is approximately 500 kilobytes.
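For context, those sizes can be checked roughly like this (a sketch only; @hash stands in for the real data, and Zlib::Deflate from the standard library approximates the gzip size):

require 'zlib'

dump = Marshal.dump(@hash) # @hash is a placeholder for the real data
puts "marshalled: #{dump.bytesize} bytes"                        # roughly 10 MB here
puts "compressed: #{Zlib::Deflate.deflate(dump).bytesize} bytes" # roughly 500 KB here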
Iterating through and altering this hash is very fast in Ruby (fractions of a millisecond). Even copying it is extremely fast.
The problem is that I need to share the data in this hash between Ruby on Rails processes. In order to do this using the Rails cache (file_store or memcached) I need to Marshal.dump the hash first, which incurs a 1000 millisecond delay when serializing it and a 400 millisecond delay when deserializing it.
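For reference, this is the cache path that incurs those costs (a minimal sketch, assuming Rails.cache is backed by file_store or memcached; :big_hash is just a placeholder key):

Rails.cache.write(:big_hash, @hash) # serializes the hash internally (~1000 ms here)
@hash = Rails.cache.read(:big_hash) # deserializes it again (~400 ms here)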
Ideally I would want to be able to save and load this hash from each process in under 100 milliseconds.
One idea is to spawn a separate Ruby process that holds this hash and exposes an API the other processes can use to read or modify the data within it, but I want to avoid doing this unless I'm certain that there are no other ways to share this object quickly.
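For completeness, here is a rough sketch of that separate-process idea using DRb from the standard library (the URI, class, and method names are placeholders of my own). Note that each remote call still marshals its arguments and return value, but only for the slice being accessed rather than the whole hash.

# Server process: owns the hash and exposes a small API over DRb.
require 'drb/drb'

class HashHolder
  def initialize
    @data = {}
  end

  def get(key)
    @data[key]
  end

  def set(key, value)
    @data[key] = value
  end
end

DRb.start_service('druby://localhost:8787', HashHolder.new)
DRb.thread.join

# Client process (e.g. a Rails worker):
require 'drb/drb'

holder = DRbObject.new_with_uri('druby://localhost:8787')
holder.set(:foo, 1)
holder.get(:foo) # => 1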
Is there a way I can more directly share this hash between processes without needing to serialize or deserialize it?
Here is the code I'm using to generate a structure similar to the hash I'm working with:
@a = []
0.upto(500) do |r|
  @a[r] = []
  0.upto(10_000) do |c|
    if rand(10) == 0
      @a[r][c] = 1 # 10% chance of being 1
    else
      @a[r][c] = 0
    end
  end
end

@c = Marshal.dump(@a) # 1000 milliseconds
Marshal.load(@c)      # 400 milliseconds
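(The timing comments above come from measurements along these lines; Benchmark is in the standard library, and your numbers will vary by machine.)

require 'benchmark'

puts Benchmark.realtime { @c = Marshal.dump(@a) } # ~1.0 s
puts Benchmark.realtime { Marshal.load(@c) }      # ~0.4 s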