How to use Python and BeautifulSoup to print a timestamp and the last updated time (from the HTML) for each row?
Posted by cesalo on Stack Overflow
Published on 2014-08-18T15:22:05Z
Indexed on 2014/08/18 16:22 UTC
python
How can I use Python and BeautifulSoup to print a timestamp and the "Last Updated" time (from the HTML) for each row? Thanks a lot!
Can I print the following two values after each row?
a) Date/time: the time at which the Python code is executed
b) Last updated time: taken from the HTML
HTML structure:
one outer "td" containing
two tables;
each table has a few "tr" elements,
and each "tr" contains a few "td" cells
HTML:
<td>
<table width="100%" border="4" cellspacing="0" bordercolor="white" align="center">
<tbody>
<tr>
<td colspan="2" class="verd_black11">Last Updated: 18/08/2014 10:19</td>
</tr>
<tr>
<td colspan="3" class="verd_black11">All data delayed at least 15 minutes</td>
</tr>
</tbody>
</table>
<table width="100%" border="4" cellspacing="0" bordercolor="white" align="center">
<tbody id="tbody">
<tr id="tr0" class="tableHdrB1" align="center">
<td align="centre">C Aug-14 - 15000</td>
<td align="right"> - </td>
<td align="right">5</td>
<td align="right">9,904</td>
</tr>
</tbody>
</table>
</td>
Code:
import urllib2
from bs4 import BeautifulSoup
contenturl = "HTML:"
soup = BeautifulSoup(urllib2.urlopen(contenturl).read())
table = soup.find('tbody', attrs={'id': 'tbody'})
rows = table.findAll('tr')
for tr in rows:
    cols = tr.findAll('td')
    for td in cols:
        t = td.find(text=True)
        if t:
            text = t + ';'
            print text,
    print
Output from the above code:
C Aug-14 - 15000 ; - ; 5 ; 9,904
Expected output:
C Aug-14 - 15000 ; - ; 5 ; 9,904 ; 18/08/2014 ; 13:48:00 ; 18/08/2014 ; 10:19
(the first date/time pair is when the Python code was executed; the second pair is the last updated time from the HTML)
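One possible approach, offered as a minimal sketch rather than a definitive answer: read the "Last Updated" cell once from the first table, take the current time once, and append both to every row of the second table. The sketch assumes Python 3 and bs4 with the built-in "html.parser" (the question's code uses Python 2's urllib2). The function names last_updated and rows_with_timestamps, the HTML constant (the sample markup from the question, trimmed of layout attributes) and the optional now parameter are hypothetical names introduced for illustration. The date format "%d/%m/%Y %H:%M" is inferred from the single sample value and may not hold for every page.

```python
from datetime import datetime

from bs4 import BeautifulSoup

# Sample markup from the question; in practice this would be the fetched page.
HTML = """
<td>
<table>
<tbody>
<tr><td colspan="2" class="verd_black11">Last Updated: 18/08/2014 10:19</td></tr>
<tr><td colspan="3" class="verd_black11">All data delayed at least 15 minutes</td></tr>
</tbody>
</table>
<table>
<tbody id="tbody">
<tr id="tr0" class="tableHdrB1" align="center">
<td align="centre">C Aug-14 - 15000</td>
<td align="right"> - </td>
<td align="right">5</td>
<td align="right">9,904</td>
</tr>
</tbody>
</table>
</td>
"""


def last_updated(soup):
    """Return the 'Last Updated' value as a datetime, or None if it is absent."""
    prefix = "Last Updated:"
    for td in soup.find_all("td", class_="verd_black11"):
        text = td.get_text(strip=True)
        if text.startswith(prefix):
            # Assumed format, based on the sample value "18/08/2014 10:19".
            return datetime.strptime(text[len(prefix):].strip(), "%d/%m/%Y %H:%M")
    return None


def rows_with_timestamps(html, now=None):
    """Return one ' ; '-joined line per data row, with both timestamps appended."""
    soup = BeautifulSoup(html, "html.parser")
    now = now or datetime.now()  # a) time at which the code is executed
    updated = last_updated(soup)  # b) last updated time from the HTML

    stamps = [now.strftime("%d/%m/%Y"), now.strftime("%H:%M:%S")]
    if updated is not None:
        stamps += [updated.strftime("%d/%m/%Y"), updated.strftime("%H:%M")]

    lines = []
    for tr in soup.find("tbody", id="tbody").find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        lines.append(" ; ".join(cells + stamps))
    return lines


if __name__ == "__main__":
    for line in rows_with_timestamps(HTML):
        print(line)
```

Passing a fixed value for now, for example datetime(2014, 8, 18, 13, 48, 0), makes the output reproducible; without it, the current time at execution is used. Unlike the loop in the question, this sketch keeps empty cells rather than skipping them.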
© Stack Overflow or respective owner