Keeping files or database records? Java and Python
Posted by danpalmer on Stack Overflow
Published on 2010-04-11T18:49:27Z
My website will use a neural network to make predictions based on user data. The user can select the data used to train the network and then use their trained network to make predictions.
I am using a Java framework to create, train and query the networks. The framework has persistence support for saving a network to an XML file.
What is the best way to store these files? I can see several potential ideas, but I need help on choosing which is best:
- Save each network to a separate XML file with a name that is stored in the database. Load this each time.
- Save all the networks to the same XML file with each network having a different name that is stored in the database.
- Somehow pass what would normally be written to an XML file to the Django site for writing to the database. This would need to be returned to the Java code when a prediction needs to be made.
I am able to do the first or second option, but I think their performance will be quite limited. I am on shared hosting at the moment, so I don't know how pleased the host would be with thousands of files. Also, after adding a few thousand records to one XML file, I noticed a massive performance hit when saving to it.
If I were able to implement the third option somehow, I think it would be best. There would be no issues with separate processes accessing the database, and I think performance would be better. Not to mention there would be no files lying around.
However, the part of the neural network framework I am using (Encog) that saves to a file needs access to a Java file object, not a string that could be saved to a database. Unless there is some Java magic I can do here (I know very little Java), the only way I can see of doing this is with a temporary file, but I don't know if this is the correct way to do it.
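To make the temporary-file idea concrete, here is a rough sketch of what I have in mind. It is only an illustration, not Encog's API: `TempFileBridge`, `FileAction`, `saveToString` and `loadFromString` are names I made up, and `FileAction` stands in for whatever framework call actually writes or reads the network file. It also assumes the XML is UTF-8 text.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class TempFileBridge {

    /** Hypothetical stand-in for the framework call that writes or reads a network file. */
    public interface FileAction {
        void run(File file) throws IOException;
    }

    /** Lets file-only save code write to a temp file, then returns the contents as a string. */
    public static String saveToString(FileAction saver) throws IOException {
        File tmp = File.createTempFile("network-", ".xml");
        try {
            saver.run(tmp);
            // Assumption: the persisted XML is UTF-8 encoded.
            return new String(Files.readAllBytes(tmp.toPath()), StandardCharsets.UTF_8);
        } finally {
            tmp.delete(); // never leave the temp file behind
        }
    }

    /** Writes a stored string back to a temp file so file-only load code can read it. */
    public static void loadFromString(String xml, FileAction loader) throws IOException {
        File tmp = File.createTempFile("network-", ".xml");
        try {
            Files.write(tmp.toPath(), xml.getBytes(StandardCharsets.UTF_8));
            loader.run(tmp);
        } finally {
            tmp.delete();
        }
    }
}
```

The string returned by `saveToString` is what would be handed to the Django site for storage in a text column, and `loadFromString` would be used when a prediction is requested. The temporary file only lives for the duration of one call, so no files accumulate on the host.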
I would appreciate any ideas on the best way to implement any of the above 3 ideas or any alternatives. Thanks!
© Stack Overflow or respective owner