How to remove all text nodes and preserve only the structure of an HTML page with Nokogiri
- by user58948
I want to remove all text from an HTML page that I load with Nokogiri. For example, if a page has the following:
<body><script>var x = 10;</script><div>Hello</div><div><h1>Hi</h1></div></body>
I want to process it with Nokogiri and get back HTML like the following, with the text stripped:
<body><script>var x = 10;</script><div></div><div><h1></h1></div></body>
(That is, remove the actual h1 text, the text between div tags, the text in p elements, etc., but keep the tags. Also, don't remove the text inside script tags.)
How can I do that?