I want to find the span tag between the LI tag and its attributes, but no luck.

Posted by Mahesh on Stack Overflow
Published on 2010-03-18T08:10:37Z Indexed on 2010/03/18 8:41 UTC

I want to find the span tag between the LI tag and its attributes. I am trying with BeautifulSoup but having no luck. The details of my code are below. Can anyone point me to the right methodology?

In this code, my getId function should return id = "0_False-2".

Does anyone know the right method?


from BeautifulSoup import BeautifulSoup as bs
import re

html = '<ul>\
<li class="line">&nbsp;</li>\
<li class="folder-open-last" id="0">\
<img style="float: left;" class="trigger" src="/media/images/spacer.gif" border="0">\
<span class="text" id="0_False">NOC</span><ul style="display: block;"><li class="line">&nbsp;</li><li class="doc" id="1"><span class="active text" id="0_False-1">PNQAIPMS1</span></li><li class="line">&nbsp;</li><li class="doc-last" id="2"><span class="text" id="0_False-2">PNQAIPMS2</span></li><li class="line-last"></li></ul></li><li class="line-last"></li>\
</ul>' 


def getId(html, txt):
    soup = bs(html)
    soup.findAll('ul', recursive=False)
    head = soup.contents[0]
    temp = head
    elements = {}
    while True:
        # If temp is None, no more HTML nodes are available
        if temp is None:
            break
        #print temp
        if re.search('li', str(temp)) is not None:
            # Strip list/tuple punctuation so the attrs read "name,value,name,value"
            attr = str(temp.attrs).encode('ascii', 'ignore')
            attr = attr.replace(' ', '')
            attr = attr.replace('[', '')
            attr = attr.replace(']', '')
            attr = attr.replace(')', '')
            attr = attr.replace('(', '')
            attr = attr.replace('u\'', '')
            attr = attr.replace('\'', '')
            attr = attr.split(',')
            span = str(temp.text)

            if span == txt:
                return attr[3]

            temp = temp.next
        else:
            temp = temp.next


id = getId(html, "PNQAIPMS2")
print "ID = " + id
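
One possible reason the loop above misses the span: the 'li' pattern does not appear in the markup of any of the span tags in this sample, so they seem never to be inspected, while the enclosing li (whose text is also "PNQAIPMS2") is. Below is a minimal sketch of one way to get the span's id directly. It is not the BeautifulSoup method; it swaps in the standard library's html.parser (Python 3) so that it runs without extra packages. The names SpanIdFinder, get_span_id and sample are made up for illustration, and nested span tags are assumed not to occur.

```python
from html.parser import HTMLParser


class SpanIdFinder(HTMLParser):
    """Records the id of the first <span> whose text equals `target`."""

    def __init__(self, target):
        HTMLParser.__init__(self)
        self.target = target
        self.in_span = False      # True while inside a <span> ... </span>
        self.current_id = None    # id attribute of the open <span>, if any
        self.found = None         # result: id of the matching <span>

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self.in_span = True
            self.current_id = dict(attrs).get("id")

    def handle_endtag(self, tag):
        if tag == "span":
            self.in_span = False
            self.current_id = None

    def handle_data(self, data):
        # Only text that sits directly inside a <span> is compared.
        if self.in_span and self.found is None and data.strip() == self.target:
            self.found = self.current_id


def get_span_id(html, txt):
    """Return the id of the <span> containing `txt`, or None if absent."""
    finder = SpanIdFinder(txt)
    finder.feed(html)
    finder.close()
    return finder.found


# Shortened copy of the nested list from the question.
sample = (
    '<ul><li class="doc" id="1">'
    '<span class="active text" id="0_False-1">PNQAIPMS1</span></li>'
    '<li class="doc-last" id="2">'
    '<span class="text" id="0_False-2">PNQAIPMS2</span></li></ul>'
)

print("ID = %s" % get_span_id(sample, "PNQAIPMS2"))  # prints "ID = 0_False-2"
```

With BeautifulSoup itself, the same idea would presumably be to search for the text node and read its parent tag, along the lines of soup.find(text=txt).parent['id']; that one-liner is untested against this exact input.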

© Stack Overflow or respective owner
