File Storage for Web Applications: Filesystem vs DB vs NoSQL engines
- by El Yobo
I have a web application that stores a lot of user generated files. Currently these are all stored on the server filesystem, which has several downsides for me.
When we move "folders" (as defined by our application) we also have to move the files on disk (although this is more due to strange design decisions on the part of the original developers than a requirement of storing things on the filesystem).
It's hard to write tests for file system actions; I have a mock filesystem class that logs actions like move, delete etc, without performing them, which more or less does the job, but I don't have 100% confidence in the tests.
I will be adding some other jobs which need to access the files from other service to perform additional tasks (e.g. indexing in Solr, generating thumbnails, movie format conversion), so I need to get at the files remotely. Doing this over network shares seems dodgy...
Dealing with permissions on the filesystem as sometimes given us problems in the past, although now that we've moved to a pure Linux environment this should be less of an issue.
What are the downsides of storing files as BLOBs in MySQL? I guess that it would massively increase the database size and reduce the effectiveness of caches, but are there other problems?
Do the same problems exist with NoSQL systems like Cassandra?
Does anyone have any other suggestions that might be appropriate?