Killing HTML nodes from shell
Posted
by hendry
on Stack Overflow
Published on 2010-05-03T11:12:13Z
I need a way to remove nodes such as <footer>foobar</footer> and <div class="nav"></div> from a large number of HTML files.
I want to dump a site to disk without the menus, footers and similar page furniture. Ideally I would do this with basic Unix tools like sed. Since the pages are HTML rather than well-formed XML, I can't use xmlstarlet.
Could anyone please suggest recipes, so that I can ideally run a script such as kill-node.sh 'div class="toplinks"' *.html
to prune the bits I don't want? Thank you.
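
To make the goal concrete, here is a minimal sketch of the pruning step. It is not a sed recipe: it swaps in Python's standard-library html.parser, because nested tags (a <div> inside the <div> being removed) are hard to track with regular expressions. The names NodePruner and prune are hypothetical, and matching is assumed to be an exact comparison of the tag name and attribute values, mirroring a selector like 'div class="toplinks"'.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not open a skipped region.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class NodePruner(HTMLParser):
    """Re-emit HTML while dropping every element that matches tag + attrs."""

    def __init__(self, tag, attrs=None):
        # convert_charrefs=False keeps entities such as &amp; as written.
        super().__init__(convert_charrefs=False)
        self.tag = tag.lower()
        self.attrs = attrs or {}
        self.depth = 0   # > 0 while inside a matched element
        self.out = []

    def _matches(self, tag, attrs):
        found = dict(attrs)
        return tag == self.tag and all(
            found.get(k) == v for k, v in self.attrs.items())

    def _emit(self, text):
        if not self.depth:
            self.out.append(text)

    def handle_starttag(self, tag, attrs):
        if self.depth:
            # Count same-named tags so nested ones close correctly.
            if tag == self.tag:
                self.depth += 1
        elif self._matches(tag, attrs):
            if tag not in VOID_TAGS:
                self.depth = 1
        else:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing form, e.g. <br/>: drop it if it matches, else copy it.
        if not self.depth and not self._matches(tag, attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.depth:
            if tag == self.tag:
                self.depth -= 1
        else:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self._emit(data)

    def handle_entityref(self, name):
        self._emit(f"&{name};")

    def handle_charref(self, name):
        self._emit(f"&#{name};")

    def handle_comment(self, data):
        self._emit(f"<!--{data}-->")

    def handle_decl(self, decl):
        self._emit(f"<!{decl}>")

    def handle_pi(self, data):
        self._emit(f"<?{data}>")


def prune(html, tag, attrs=None):
    """Return html with all elements matching tag (and attrs) removed."""
    pruner = NodePruner(tag, attrs)
    pruner.feed(html)
    pruner.close()
    return "".join(pruner.out)
```

For example, prune(page, "div", {"class": "nav"}) followed by prune(result, "footer") would strip both kinds of node from the examples above. Applying it to many files would mean reading each file, calling prune, and writing the result back. Note that class="nav main" would not match {"class": "nav"} under the exact-match assumption, and end tags are re-emitted in lowercase.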