Detect if 2 HTML fragments have identical hierarchical structure

- by sergzach

An example of fragments that have an identical hierarchical structure:

    (1)
    <div>
      <span>It's a message</span>
    </div>

    (2)
    <div>
      <span class='bold'>This is a new text</span>
    </div>

An example of fragments that have different structures:

    (1)
    <div>
      <span><b>It's a message</b></span>
    </div>

    (2)
    <div>
      <span>This is a new text</span>
    </div>

So, fragments with the same structure correspond to one hierarchical tree (the same tag names, the same nesting). Attributes and text do not matter.

How can I detect, simply with lxml, whether 2 elements (HTML fragments) have the same structure?

I have a function, but it does not work properly for cases more complex than the examples above:

    def _is_equal( el1, el2 ):
        # input: 2 elements with possible equal structure and tag names
        # e.g. root = lxml.html.fromstring( buf )
        # el1 = root[ 0 ]
        # el2 = root[ 1 ]
        # move from top to bottom, compare elements
        result = False
        if el1.tag == el2.tag:
            # has no children
            if len( el1 ) == len( el2 ):
                if len( el1 ) == 0:
                    return True
                else:
                    # iterate one of them, for example el1
                    i = 0
                    for child1 in el1:
                        child2 = el2[ i ]
                        is_equal2 = _is_equal( child1, child2 )
                        if not is_equal2:
                            return False
                    return True
            else:
                return False
        else:
            return False

The code fails to detect that the 2 divs with class='tovar2' have an identical structure:

    <body>
      <div class="tovar2">
        <h2 class="new">
          <a href="http://modnyedeti-krsk.ru/magazin/product/333193003"> ?????? ?/? </a>
        </h2>
        <ul class="art">
          <li> ???????: <span>1759</span> </li>
        </ul>
        <div>
          <div class="wrap" style="width:180px;">
            <div class="new">
              <img src="shop_files/new-t.png" alt="">
            </div>
            <a class="highslide" href="http://modnyedeti-krsk.ru/d/459730/d/820.jpg" onclick="return hs.expand(this)">
              <img src="shop_files/fr_5.gif" style="background:url(/d/459730/d/548470803_5.jpg) 50% 50% no-repeat scroll;" alt="?????? ?/?" height="160" width="180">
            </a>
          </div>
        </div>
        <form action="" onsubmit="return addProductForm(17094601,333193003,3150.00,this,false);">
          <ul class="bott ">
            <li class="price">????:<br> <span> <b> 3 150 </b> ???. </span> </li>
            <li class="amount">???-??:<br><input class="number" onclick="this.select()" value="1" name="product_amount" type="text"> </li>
            <li class="buy"><input value="" type="submit"> </li>
          </ul>
        </form>
      </div>
      <div class="tovar2">
        <h2 class="new">
          <a href="http://modnyedeti-krsk.ru/magazin/product/333124803">?????? ?/?</a>
        </h2>
        <ul class="art">
          <li> ???????: <span>1759</span> </li>
        </ul>
        <div>
          <div class="wrap" style="width:180px;">
            <div class="new">
              <img src="shop_files/new-t.png" alt="">
            </div>
            <a class="highslide" href="http://modnyedeti-krsk.ru/d/459730/d/820.jpg" onclick="return hs.expand(this)">
              <img src="shop_files/fr_5.gif" style="background:url(/d/459730/d/548470803_5.jpg) 50% 50% no-repeat scroll;" alt="?????? ?/?" height="160" width="180">
            </a>
          </div>
        </div>
        <form action="" onsubmit="return addProductForm(17094601,333124803,3150.00,this,false);">
          <ul class="bott ">
            <li class="price">????:<br> <span> <b>3 150</b> ???. </span> </li>
            <li class="amount">???-??:<br><input class="number" onclick="this.select()" value="1" name="product_amount" type="text"> </li>
            <li class="buy"> <input value="" type="submit"> </li>
          </ul>
        </form>
      </div>
    </body>
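
One likely cause of the failure: in `_is_equal` the index `i` is set to 0 and never incremented, so every child of `el1` is compared with `el2[0]`. That goes unnoticed when each element has a single child, as in the short examples, but for the `tovar2` divs the second child (`ul`) ends up compared with the first (`h2`), and the function returns False. Below is a minimal sketch of a corrected comparison that pairs children with `zip` instead of a manual index. The name `same_structure` is hypothetical, and the demo parses the short well-formed examples with the standard library's `xml.etree.ElementTree` only so that it runs without extra packages. The assumption is that lxml elements can be passed in the same way, since they also expose `.tag`, `len()` and iteration over children.

```python
import xml.etree.ElementTree as ET


def same_structure(el1, el2):
    """Return True if both elements have the same tag and, recursively,
    the same sequence of child tags. Attributes and text are ignored."""
    if el1.tag != el2.tag or len(el1) != len(el2):
        return False
    # zip pairs the i-th child of el1 with the i-th child of el2,
    # so there is no index that has to be kept in step with the loop.
    return all(same_structure(c1, c2) for c1, c2 in zip(el1, el2))


# Stand-in parser (assumption): stdlib ElementTree instead of lxml.html.
# The fragments are the short examples from the question.
a = ET.fromstring("<div> <span>It's a message</span> </div>")
b = ET.fromstring("<div> <span class='bold'>This is a new text</span> </div>")
c = ET.fromstring("<div> <span><b>It's a message</b></span> </div>")

print(same_structure(a, b))  # True: same tags, same nesting
print(same_structure(c, b))  # False: the extra <b> changes the tree
```

With lxml the call would presumably look like `same_structure(root[0], root[1])` after `root = lxml.html.fromstring(buf)`, as in the comments of the original function. Note that lxml also keeps HTML comments as child nodes, so two fragments that differ only in comments would compare as different unless those nodes are filtered out first.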
