Parsing a complicated HTML table with PHP
Posted by user2944979 on Stack Overflow
Published on 2013-11-01T13:52:12Z
I successfully parsed a dynamic table with the following PHP code:
$docH = new DOMDocument();
$docH->loadHTMLFile($url);
//get everything inside the body element:
$bodyH = $docH->getElementsByTagName('body')->item(0);
foreach ($bodyH->childNodes as $childNode) {
    echo $docH->saveHTML($childNode);
}
Parsed HTML Table:
<table>
<tr>
<td>5CG </td>
<td>aass </td>
<td>sxs </td>
<td>sx </td>
<td>EK </td>
<td> </td>
<td>72 </td>
</tr>
<tr>
<td> </td>
<td>samplxs </td>
<td>xs </td>
<td> </td>
<td>xss </td>
<td>fkxsx aus</td>
<td>s </td>
</tr>
<tr>
<td> </td>
<td>5AH. </td>
<td>ds </td>
<td>d </td>
<td>sdf </td>
<td>sdfsdf aus</td>
<td> </td>
</tr>
<tr>
<td>6CG </td>
<td>3. </td>
<td>sfd </td>
<td> </td>
<td>scs </td>
<td>das aus</td>
<td>a </td>
</tr>
<tr>
<td>7DG </td>
<td>6. </td>
<td>s </td>
<td>s </td>
<td>sD </td>
<td>sdsa.</td>
<td> </td>
</tr>
<tr>
<td> </td>
<td>samplxs </td>
<td>xs </td>
<td> </td>
<td>xss </td>
<td>fkxsx aus</td>
<td>s </td>
</tr>
<tr>
<td>7DG, 7CG, 7CR </td>
<td>6. </td>
<td>NsdR </td>
<td>s </td>
<td>SP </td>
<td>fasdlt aus</td>
<td>s </td>
</tr>
<tr>
<td> </td>
<td>samplxs </td>
<td>xs </td>
<td> </td>
<td>xss </td>
<td>fkxsx aus</td>
<td>s </td>
</tr>
<tr>
<td> 9BR </td>
<td>6. </td>
<td>FEI </td>
<td>sa </td>
<td>DE </td>
<td>fasdad aus</td>
<td> </td>
</tr>
<tr>
<td>9AR, 9BR, 9CR</td>
<td>62. </td>
<td>BEH </td>
<td> </td>
<td>sd </td>
<td>fasda aus</td>
<td> </td>
</tr>
<tr>
<td> </td>
<td>6. </td>
<td>MLR </td>
<td> </td>
<td>FdR </td>
<td>fsdfaus</td>
<td> </td>
</tr>
<tr>
<td>E10C </td>
<td>6. </td>
<td>sdf </td>
<td>d </td>
<td>d </td>
<td>fsdfs aus</td>
<td> </td>
</tr>
</table>
But my goal is to show only the part of the table the user asks for: every <tr> whose first <td> contains the given text, plus the rows after it, up to the next <tr> whose first <td> has different content.
For example, if the user types "9BR" into an input field, I want them to see only:
<tr>
<td> 9BR </td>
<td>6. </td>
<td>FEI </td>
<td>sa </td>
<td>DE </td>
<td>fasdad aus</td>
<td> </td>
</tr>
<tr>
<td>9AR, 9BR, 9CR</td>
<td>62. </td>
<td>BEH </td>
<td> </td>
<td>sd </td>
<td>fasda aus</td>
<td> </td>
</tr>
<tr>
<td> </td>
<td>6. </td>
<td>MLR </td>
<td> </td>
<td>FdR </td>
<td>fsdfaus</td>
<td> </td>
</tr>
If they type "5CG":
<tr>
<td>5CG </td>
<td>aass </td>
<td>sxs </td>
<td>sx </td>
<td>EK </td>
<td> </td>
<td>72 </td>
</tr>
<tr>
<td> </td>
<td>samplxs </td>
<td>xs </td>
<td> </td>
<td>xss </td>
<td>fkxsx aus</td>
<td>s </td>
</tr>
Or, for "6CG", just:
<tr>
<td>6CG </td>
<td>3. </td>
<td>sfd </td>
<td> </td>
<td>scs </td>
<td>das aus</td>
<td>a </td>
</tr>
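One possible way to get this behaviour is sketched below. It is not a definitive implementation and rests on assumptions read off the three examples: a row with an empty first <td> is treated as a continuation of the row above it, matching is a case-insensitive substring test on the first cell, and every row is wrapped in its own <tr>. The function name filterRowsByFirstCell and its parameters are hypothetical and do not appear in the original code.

```php
<?php
// Hypothetical helper (not from the original post): returns the HTML of every
// row whose first cell contains $needle, plus the continuation rows below it.
function filterRowsByFirstCell(string $html, string $needle): string
{
    if ($needle === '') {
        return ''; // nothing to search for
    }

    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them,
    // since real-world markup is often sloppy.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $out = '';
    $inGroup = false; // true while the current group of rows matches

    foreach ($doc->getElementsByTagName('tr') as $row) {
        $cells = $row->getElementsByTagName('td');
        if ($cells->length === 0) {
            continue; // skip rows without <td> cells
        }

        // Normalise the first cell: non-breaking spaces become spaces, then trim.
        $first = trim(str_replace("\u{00A0}", ' ', $cells->item(0)->textContent));

        if ($first !== '') {
            // A non-empty first cell starts a new group.
            $inGroup = stripos($first, $needle) !== false;
        }
        // Assumption: an empty first cell inherits the state of the row above.

        if ($inGroup) {
            $out .= $doc->saveHTML($row) . "\n";
        }
    }

    return $out;
}
```

With the document from the code at the top, a call could look like echo filterRowsByFirstCell($docH->saveHTML(), '9BR');. Because this is a substring test, "5CG" would also match a first cell such as "15CG"; if that matters, the cell could be split on commas and each part compared exactly.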
© Stack Overflow or respective owner