Checking for valid document files
Posted by sweb on Super User
Published on 2012-06-25T12:17:45Z
Indexed on 2012/06/25 15:18 UTC
I need a simple way to check whether my files are valid documents (pdf, doc, docx, ppt, pptx, xls, xlsx, odt, ods, odp, etc.).
I can't use file because its magic detection does not work well at all. For example, this is the output I get for PDF files:
sweb@sweb-laptop:/media/files/ebooks/PDF and CHM$ file --mime *.pdf
PHP 5 for Dummies.pdf: application/pdf; charset=binary
PHP 6 and MySQL 5 for Dynamic Web Sites.pdf: application/octet-stream; charset=binary
PHP6 and MySQL Bible.pdf: application/pdf; charset=binary
PHP6.pdf: application/octet-stream; charset=binary
PHP and MySQL for Dummies SE.pdf: application/pdf; charset=binary
I also tried abiword, which is a good tool, but it converts any input regardless of format. It doesn't check whether the file is a valid document:
abiword --to=txt --to-name=output.txt audio.mp3
Is there any command available to check for valid documents then?
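
To make the kind of check being asked about concrete, here is a minimal Python sketch of signature sniffing. It is not an existing command, and it rests on several assumptions: the function name sniff_document and its return labels are made up for illustration; a PDF is accepted if "%PDF-" appears anywhere in the first 1024 bytes (more lenient than requiring it at offset 0); doc/xls/ppt are treated as OLE2 compound files; docx/xlsx/pptx and odt/ods/odp are treated as ZIP archives containing "[Content_Types].xml" or an OpenDocument "mimetype" entry. It only looks at container signatures, so it cannot prove that a file opens cleanly in an office application.

```python
import sys
import zipfile

# Signature of the OLE2 compound file container used by legacy doc/xls/ppt.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def sniff_document(path):
    """Return a coarse type label, or None if no known signature matches.

    Hypothetical helper for illustration; the labels are not standard names.
    """
    with open(path, "rb") as fh:
        head = fh.read(1024)

    # Assumption: tolerate leading junk before the PDF header.
    if b"%PDF-" in head:
        return "pdf"

    if head.startswith(OLE2_MAGIC):
        return "ole2"  # container only; doc, xls and ppt are not told apart

    if head.startswith(b"PK") and zipfile.is_zipfile(path):
        try:
            with zipfile.ZipFile(path) as zf:
                names = set(zf.namelist())
                if "[Content_Types].xml" in names:
                    return "ooxml"  # docx, xlsx, pptx
                if "mimetype" in names:
                    mime = zf.read("mimetype").decode("ascii", "replace").strip()
                    if mime.startswith("application/vnd.oasis.opendocument."):
                        return "odf"  # odt, ods, odp
        except zipfile.BadZipFile:
            return None

    return None


if __name__ == "__main__":
    for p in sys.argv[1:]:
        print(f"{p}: {sniff_document(p) or 'not a recognised document'}")
```

Saved under a name of your choosing and run with the files as arguments, this prints one line per file. A file such as audio.mp3 would be reported as not recognised, because it matches none of the signatures.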
© Super User or respective owner