Extract a specific string from a curl'd result

Posted by allentown on Stack Overflow
Published on 2010-06-13T02:54:54Z

Given this curl command:

curl --user-agent "fogent" --silent -o page.html "http://www.google.com/search?q=insansiate"

* The misspelling is intentional. I want to grab Google's suggestion as my result.

I want to either grep the page.html file, perhaps with grep -oE, or pipe the output straight from curl and never store a file.

The result should be: 'instantiate'

I need only the word 'instantiate' (or the phrase); whatever Google is autocorrecting to is what I am after.

Here is the basic HTML that is returned:

<span class=spell style="color:#cc0000">Did you mean: </span><a href="/search?hl=en&amp;ie=UTF-8&amp;&amp;sa=X&amp;ei=VEMUTMDqGoOINraK3NwL&amp;ved=0CB0QBSgA&amp;q=instantiate&amp;spell=1"class=spell><b><i>instantiate</i></b></a>&nbsp;&nbsp;<span class=std>Top 2 results shown</span>

So perhaps I could match from the start to the end of the string below, which I hope is unique enough to cover all my bases.

class=spell><b><i>instantiate</i></b></a>&nbsp;&nbsp;
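One way the match on that string might look, as a minimal sketch rather than a definitive answer: grep -o with a negated character class ([^<]*) in place of .* cannot run past the next tag, so it stays non-greedy without needing Perl-style regexes. The helper name extract_suggestion is made up for illustration, and the sketch assumes the suggestion sits in a single <i>...</i> pair exactly as in the snippet above; the markup Google actually returns may differ.

```shell
#!/usr/bin/env bash
# Sample line copied from the HTML shown in the question.
sample='<span class=spell style="color:#cc0000">Did you mean: </span><a href="/search?hl=en&amp;ie=UTF-8&amp;&amp;sa=X&amp;ei=VEMUTMDqGoOINraK3NwL&amp;ved=0CB0QBSgA&amp;q=instantiate&amp;spell=1"class=spell><b><i>instantiate</i></b></a>&nbsp;&nbsp;<span class=std>Top 2 results shown</span>'

# extract_suggestion: hypothetical helper, reads HTML on stdin.
# [^<]* matches any run of characters that are not '<', so the match
# ends at the first closing </i> instead of the last one on the line.
extract_suggestion() {
  grep -o 'class=spell><b><i>[^<]*</i>' |
    sed -e 's/^class=spell><b><i>//' -e 's/<\/i>$//' |
    head -n 1
}

suggestion=$(printf '%s\n' "$sample" | extract_suggestion)
echo "$suggestion"

# To skip the file entirely, drop "-o page.html" so curl writes to stdout:
# curl --user-agent "fogent" --silent "http://www.google.com/search?q=insansiate" | extract_suggestion
```

The same function works on the saved file with extract_suggestion < page.html.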

I keep running into issues with greedy grep matches; perhaps I should run the page through an HTML prettify tool first to get a line break or 50 in there. I don't know of any simple way to do so in bash, which is what I would ideally like this to be in. I really don't want to deal with firing up Perl and making sure I have the correct module.
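For the line-break idea, a rough sketch that needs no prettify tool: tr can turn every '<' into a newline, so each tag starts its own line, and a short awk filter can then pick the text that follows the class=spell link. The function name extract_by_lines is hypothetical, and the sketch assumes the tag order in the snippet above (the link, then <b>, then <i>).

```shell
#!/usr/bin/env bash
# Sample line copied from the HTML shown in the question.
sample='<span class=spell style="color:#cc0000">Did you mean: </span><a href="/search?hl=en&amp;ie=UTF-8&amp;&amp;sa=X&amp;ei=VEMUTMDqGoOINraK3NwL&amp;ved=0CB0QBSgA&amp;q=instantiate&amp;spell=1"class=spell><b><i>instantiate</i></b></a>&nbsp;&nbsp;<span class=std>Top 2 results shown</span>'

# extract_by_lines: hypothetical helper, reads HTML on stdin.
# tr replaces each '<' with a newline, giving lines such as
#   a href="..."class=spell>
#   b>
#   i>instantiate
# awk waits for the line ending in "class=spell>", then prints the
# first later line starting with "i>", minus that two-character prefix.
extract_by_lines() {
  tr '<' '\n' |
    awk '/class=spell>$/ { found = 1; next }
         found && /^i>/  { print substr($0, 3); exit }'
}

suggestion=$(printf '%s\n' "$sample" | extract_by_lines)
echo "$suggestion"
```

Because every tag ends up on its own line, greedy matching stops being an issue: no single line contains more than one tag.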

Any suggestions? Thank you.

© Stack Overflow or respective owner
