I want to parse HTML, PDF, CSV, and plain-text files.
Which of these file types is the easiest and most efficient to parse?
For example, is it HTML, PDF, or CSV?
I am asking because I would like to parse PDF, HTML, CSV, and text files with one common piece of parsing code, if possible.
Suppose HTML turns out to be the easiest and most efficient to parse. Then:
I would write the parsing code for HTML only and convert PDF files to HTML (if that is possible), so that the HTML parsing code also works for PDF files.
In the same way, I would convert CSV and text files to HTML, so that a single HTML parser handles HTML, PDF, CSV, and text files.
Alternatively, suppose PDF is the easiest and most efficient to parse. Then:
I would convert HTML, CSV, and text files to PDF and write the parsing code for PDF only, so that it can also handle HTML, CSV, and text files.
So my questions are:
(1) Which file type is the easiest and most efficient to parse (PDF, CSV, HTML, or text)?
(2) Is it possible to convert these file types (PDF, text, HTML, CSV) to one another?
For example, if HTML is the easiest to parse, can I convert PDF to HTML, text to HTML, and CSV to HTML?
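
To make the "convert everything to HTML, then parse HTML" idea concrete, here is a rough sketch in Python using only the standard library. The function and class names (csv_to_html, text_to_html, TextCollector, parse_html) are my own placeholders, and the HTML layout (a table for CSV, one paragraph per line for text) is just an assumption. PDF is left out because the standard library has no PDF reader, so that step would need a third-party tool.

```python
import csv
import html
import io
from html.parser import HTMLParser


def csv_to_html(csv_text):
    """Render CSV text as a minimal HTML table (one <tr> per row)."""
    rows = csv.reader(io.StringIO(csv_text))
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(cell)}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table>{body}</table>"


def text_to_html(text):
    """Wrap each non-empty line of plain text in a <p> element."""
    return "".join(
        f"<p>{html.escape(line)}</p>"
        for line in text.splitlines()
        if line.strip()
    )


class TextCollector(HTMLParser):
    """The single 'common' parser: collects the text content it sees."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Skip whitespace-only runs between tags.
        if data.strip():
            self.chunks.append(data.strip())


def parse_html(html_text):
    """Parse HTML (original or converted) and return its text chunks."""
    parser = TextCollector()
    parser.feed(html_text)
    parser.close()
    return parser.chunks
```

With this, parse_html(csv_to_html(...)) and parse_html(text_to_html(...)) go through the same parser as a real HTML file would, which is the kind of shared code path I am asking about.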