Command line tool to query HTML elements (Linux)
- by ipsec
I am looking for a (Linux) command line tool that parses HTML files and extracts selected elements, ideally using an XPath-like syntax.
I have the following requirements:
- It must be able to parse arbitrary HTML files (which may contain errors) in a robust manner.
- It must be able to extract the text of elements and the values of attributes.
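To make these two requirements concrete, here is a rough sketch of the behaviour I am after, written in Python with only the standard library's `html.parser`. The names `LinkExtractor` and `extract_links` are made up for illustration, and the sketch only handles `<a>` elements rather than a general XPath-like query, so it is not the tool I am asking for, just an example of the expected result on broken input.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (href, text) pairs for every <a> element, tolerating broken markup.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.links = []      # finished (href, text) pairs
        self._in_a = False   # True while inside an <a> element
        self._href = None    # href attribute of the <a> currently open
        self._text = []      # text chunks seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # An <a> that was never closed ends when the next one starts.
            self._flush()
            self._in_a = True
            self._href = dict(attrs).get("href")

    def handle_data(self, data):
        if self._in_a:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a":
            self._flush()

    def close(self):
        super().close()
        # Emit a trailing <a> that was still open at end of input.
        self._flush()

    def _flush(self):
        if self._in_a:
            self.links.append((self._href, "".join(self._text).strip()))
        self._in_a = False
        self._href = None
        self._text = []


def extract_links(html):
    """Return a list of (href, text) tuples found in the given HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    # Deliberately broken input: unclosed <p>, unclosed <b>, unclosed last <a>.
    broken = (
        '<html><body><p>Unclosed paragraph '
        '<a href="/one">First</a>'
        '<a href="/two">Second <b>bold</a>'
        '<a href="/three">Third'
    )
    for href, text in extract_links(broken):
        print(href, text)
```

So for each matched element I would like to get both an attribute value (here `href`) and the element's text, even when the surrounding markup is not well-formed.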
What I have tried so far:
…