Which file type is easiest and most efficient to parse? (HTML, PDF, CSV, text)
Posted by Harikrishna on Stack Overflow
Published on 2010-03-18T08:38:08Z
Indexed on 2010/03/18 8:41 UTC
I want to parse HTML, PDF, CSV and plain text files.
Which of these file types is the easiest and most efficient to parse: HTML, PDF or CSV?
I am asking because, if possible, I want to parse PDF, HTML, CSV and text files with one common piece of parsing code.
Suppose HTML turns out to be the easiest and most efficient format to parse. Then:
I will write the parsing code for HTML only and try to convert PDF files to HTML (if that is possible), so that the HTML parsing code also works for PDF files.
In the same way, I will convert CSV and text files to HTML, so that a single HTML parser handles HTML, PDF, CSV and text files.
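To make the "convert everything to HTML" idea concrete, here is a minimal sketch of the CSV-to-HTML step only, using the Python standard library. The function name `csv_to_html_table` is hypothetical, and the choice of one `<tr>` per CSV record and one `<td>` per field is an assumption about what the converted HTML should look like. Converting PDF to HTML is a much harder problem and is not covered here.

```python
import csv
import html
import io


def csv_to_html_table(csv_text):
    """Render CSV text as a minimal HTML table (hypothetical helper)."""
    lines = ["<table>"]
    # csv.reader handles quoted fields and embedded commas for us.
    for row in csv.reader(io.StringIO(csv_text)):
        # Escape each field so characters like < and & stay valid HTML.
        cells = "".join(f"<td>{html.escape(cell)}</td>" for cell in row)
        lines.append(f"  <tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)
```

Note that this route means every CSV file is parsed once to build the HTML and then parsed again by the HTML code, so it trades efficiency for having a single parser.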
If instead PDF turns out to be the easiest and most efficient format to parse, then:
I will convert HTML, CSV and text files to PDF and write the parsing code for PDF only, so that the PDF parsing code can also handle HTML, CSV and text files.
So my questions are:
(1) Which file type is the easiest and most efficient to parse (PDF, CSV, HTML or text)?
(2) Is it possible to convert these file types (PDF, text, HTML, CSV) into each other? For example, if HTML is the easiest to parse: PDF to HTML, text to HTML and CSV to HTML.
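For comparison, "common parsing code" could also mean a small parser per format that returns the same structure, rather than converting every file to one format. The sketch below assumes the data is tabular and that the shared structure is a list of rows of strings. The names `parse`, `PARSERS` and `_TableRows` are hypothetical. PDF is left out on purpose, because the Python standard library has no PDF reader and a third-party library would be needed.

```python
import csv
import io
from html.parser import HTMLParser


class _TableRows(HTMLParser):
    """Collect the text of <td>/<th> cells, grouped by <tr> (hypothetical)."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cell = None  # text pieces of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            if not self.rows:  # tolerate a cell that appears outside any <tr>
                self.rows.append([])
            self.rows[-1].append("".join(self._cell).strip())
            self._cell = None


def parse_html(text):
    parser = _TableRows()
    parser.feed(text)
    parser.close()
    return parser.rows


def parse_csv(text):
    return [row for row in csv.reader(io.StringIO(text))]


def parse_text(text):
    # Assumption: plain text holds whitespace-separated columns, one row per line.
    return [line.split() for line in text.splitlines() if line.strip()]


# One entry per supported format; every parser returns a list of rows.
PARSERS = {"html": parse_html, "csv": parse_csv, "txt": parse_text}


def parse(text, kind):
    """Dispatch to the parser registered for `kind` (e.g. "csv")."""
    return PARSERS[kind](text)
```

With this layout, the code that uses the rows is written once, and supporting another format means adding one function and one entry in `PARSERS`.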
© Stack Overflow or respective owner