What is the best way to build a database from a MS Word document?
- by Jayron Soares
Please advise me on how to approach this problem:
I have a sequential list of metadata in an MS Word document. The basic idea is to write a Python algorithm that iterates over this information and retrieves just the name of the PROCESS whenever a query is made against the database.
Example metadata:
Process: Process Walker (1965)
Exact reference: Walker Process Equipment, Inc. v. Food Machinery Corp.
Link: http://caselaw.lp.findlaw.com/scripts/getcase.pl?court=US&vol=382&invol=
Type of procedure: Certiorari to the United States Court of Appeals for the Seventh Circuit.
Parties: Walker Process Equipment, Inc.
Sector: Systems is ...
Start Date: Argued October 12-13, 1965
Summary: Food Machinery Company has initiated a process to stop or slow the entry of competitors through the use of a patent obtained by fraud. The case concerned a patent on "knee action swing diffusers" used in aeration equipment for sewage treatment systems, and the question was whether "the maintenance and enforcement of a patent obtained by fraud before the patent office" may be a basis for antitrust punishment.
Report of the evolution process: petitioner, in answer to respond...
Importance: a) First case which established an analysis for the diagnosis of dispute…
There are about 200 pages containing the information above.
My idea is to implement an algorithm in Python that splits this sequence of information into individual records and stores them in a web database (I am still looking for an open source application for this), so that anyone can consult them for free.
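A minimal sketch of what I have in mind, under several assumptions: the Word text has already been exported to plain text lines (for example with "Save As" or a library such as python-docx, not shown here), every record starts with a "Process:" line, and the labels are exactly those in the example above. The function names, the `cases` table, and the in-memory SQLite database are placeholders for illustration only, not the web database I am looking for:

```python
import sqlite3

# Labels copied from the example metadata. Assumption: a new record
# begins at every "Process:" line.
FIELDS = [
    "Process", "Exact reference", "Link", "Type of procedure", "Parties",
    "Sector", "Start Date", "Summary", "Report of the evolution process",
    "Importance",
]


def parse_records(lines):
    """Group 'Label: value' lines into one dict per record."""
    records, current, last = [], None, None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        # Split on the first colon only, so URLs in the value stay intact.
        label, sep, value = line.partition(":")
        label = label.strip()
        if sep and label in FIELDS:
            if label == "Process":
                current = {}
                records.append(current)
            if current is not None:
                current[label] = value.strip()
                last = label
        elif current is not None and last is not None:
            # Unlabelled line: treat it as a continuation of the previous
            # field (e.g. a Summary that spans several paragraphs).
            current[last] += " " + line
    return records


def store(records, db_path=":memory:"):
    """Write the records to a SQLite table (hypothetical name: cases)."""
    conn = sqlite3.connect(db_path)
    columns = ", ".join(f'"{name}" TEXT' for name in FIELDS)
    conn.execute(f"CREATE TABLE IF NOT EXISTS cases ({columns})")
    placeholders = ", ".join("?" for _ in FIELDS)
    conn.executemany(
        f"INSERT INTO cases VALUES ({placeholders})",
        [[record.get(name) for name in FIELDS] for record in records],
    )
    conn.commit()
    return conn


def process_names(conn):
    """Return only the Process field of every stored record."""
    rows = conn.execute('SELECT "Process" FROM cases ORDER BY rowid')
    return [row[0] for row in rows]


if __name__ == "__main__":
    sample = [
        "Process: Process Walker (1965)",
        "Parties: Walker Process Equipment, Inc.",
    ]
    print(process_names(store(parse_records(sample))))
```

Missing fields are stored as NULL, so records that lack, say, a Sector line would still load. If the real document uses different labels or separators, the `FIELDS` list and the record-start rule would need adjusting.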