Search Results

Search found 74 results on 3 pages for 'lxml'.

Page 2/3 | < Previous Page | 1 2 3  | Next Page >

  • Best way to get back to using the power of lxml after having to use a regex to find something in an

    - by PyNEwbie
    I am trying to rip some text out of a large number of html documents (numbers in the hundreds of thousands). The documents are really forms but they are prepared by a very large group of different organizations so there is significant variation in how they create the document. For example, the documents are divided into chapters. I might want to extract the contents of Chapter 5 from every document so I can analyze the content of the chapter. Initially I thought this would be easy but it turns out that the authors might use a set of non-nested tables throughout the document to hold the content so that Chapter n could be displayed using td tags inside a table. Or they might use other elements such as p tags H tags, div tags or any other block level element. After trying repeatedly to use lxml to help me identify the beginning and end of each chapter I have determined that it is a lot cleaner to use a regular expression because in every case, no matter what the enclosing html element is the chapter label is always in the form of >Chapter # It is a little more complicated in that there might be some white space or non-breaking space represented in different ways (  or   or just spaces). Nonetheless it was trivial to write a regular expression to identify the beginning of each section. (The beginning of one section is the end of the previous section.) But now I want to use lxml to get the text out. My thought is that I have really no choice but to walk along my string to find the close tag for the element that encloses the text I am using to find the relevant section. That is here is one example where the element holding the Chapter name is a div <div style="DISPLAY: block; MARGIN-LEFT: 0pt; TEXT-INDENT: 0pt; MARGIN-RIGHT: 0pt" align="left"><font style="DISPLAY: inline; FONT-WEIGHT: bold; FONT-SIZE: 10pt; FONT-FAMILY: Times New Roman">Chapter 1.&#160;&#160;&#160;Our Beginnings.</font></div> So I am imagining that I would begin at the location where I found the match for chapter 1 and set up a regular expressions to find the next </div|</td|</p|</h1 . . . So at this point I have identified the type of element holding my chapter heading I can use the same logic to find all of the text that is within that element that is set up a regular expression to help me mark from >Chapter 1.&#160;&#160;&#160;Our Beginnings.< So I have identified where my Chapter 1 begins I can do the same for chapter 2 (which is where Chapter 1 ends) Now I am imagining that I am going to snip the document beginning at the opening of the element that I identified as the element the indicates where chapter 1 begins and ending just before the opening of the element that I identified as the element that indicates where Chapter 2 begins. The string that I have identified will then be fed to lxml to use its power to get the content. I am going to all of this trouble because I have read over and over - never use a regular expression to extract content from html documents and I have not hit on a way to be as accurate with lxml to identify the starting and ending locations for the text I want to extract. For example, I can never be certain that the subtitle of Chapter 1 is Our Beginnings it could be Our Red Canary. Let me say that I spent two solid days trying with lxml to be confident that I had the beginning and ending elements and I could only be accurate <60% of the time but a very short regular expression has given me better than 95% success. I have a tendency to make things more complicated than necessary so I am wondering if anyone has seen or solved a similar problems and if they had an approach (not the details mind you) that they would like to offer.

    Read the article

  • Should I strip the XML declaration from suds output before parsing with lxml?

    - by mikl
    I’m trying to implement a SOAP webservice in Python 2.6 using the suds library. That is working well, but I’ve run into a problem when trying to parse the output with lxml. Suds returns a suds.sax.text.Text object with the reply from the SOAP service. The suds.sax.text.Text class is a subclass of the Python built-in Unicode class. In essence, it would be comparable with this Python statement: u'<?xml version="1.0" encoding="utf-8" ?><root><lotsofelements \></root>' Which is incongrous, since if the XML declaration is correct, the contents are UTF-8 encoded, and thus not a Python Unicode object (because those are stored in some internal encoding like UCS4). lxml will refuse to parse this, as documented, since there is no clear answer to what encoding it should be interpreted as. As I see it, there are two ways out of this bind: Strip the <?xml> declaration, including the encoding. Convert the output from Suds into a bytestring, using the specified encoding. Currently, the data I’m receiving from the webservice is within the ASCII-range, so either way will work, but both feels very much like ugly hacks to me, and I’m not quite sure what would happen, if I start to receive data that would need a wider range of Unicode characters. Any good ideas? I can’t imagine I’m the first one in this position…

    Read the article

  • Python web scraping involving HTML tags with attributes

    - by rohanbk
    I'm trying to make a web scraper that will parse a web-page of publications and extract the authors. The skeletal structure of the web-page is the following: <html> <body> <div id="container"> <div id="contents"> <table> <tbody> <tr> <td class="author">####I want whatever is located here ###</td> </tr> </tbody> </table> </div> </div> </body> </html> I've been trying to use BeautifulSoup and lxml thus far to accomplish this task, but I'm not sure how to handle the two div tags and td tag because they have attributes. In addition to this, I'm not sure whether I should rely more on BeautifulSoup or lxml or a combination of both. What should I do? At the moment, my code looks like what is below: import re import urllib2,sys import lxml from lxml import etree from lxml.html.soupparser import fromstring from lxml.etree import tostring from lxml.cssselect import CSSSelector from BeautifulSoup import BeautifulSoup, NavigableString address='http://www.example.com/' html = urllib2.urlopen(address).read() soup = BeautifulSoup(html) html=soup.prettify() html=html.replace('&nbsp', '&#160') html=html.replace('&iacute','&#237') root=fromstring(html) I realize that a lot of the import statements may be redundant, but I just copied whatever I currently had in more source file. EDIT: I suppose that I didn't make this quite clear, but I have multiple tags in page that I want to scrape.

    Read the article

  • How do I require that an element has either one set of attributes or another in an XSD schema?

    - by Eli Courtwright
    I'm working with an XML document where a tag must either have one set of attributes or another. For example, it needs to either look like <tag foo="hello" bar="kitty" /> or <tag spam="goodbye" eggs="world" /> e.g. <root> <tag foo="hello" bar="kitty" /> <tag spam="goodbye" eggs="world" /> </root> So I have an XSD schema where I use the xs:choice element to choose between two different attribute groups: <xsi:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema" attributeFormDefault="unqualified" elementFormDefault="qualified"> <xs:element name="root"> <xs:complexType> <xs:sequence> <xs:element maxOccurs="unbounded" name="tag"> <xs:choice> <xs:complexType> <xs:attribute name="foo" type="xs:string" use="required" /> <xs:attribute name="bar" type="xs:string" use="required" /> </xs:complexType> <xs:complexType> <xs:attribute name="spam" type="xs:string" use="required" /> <xs:attribute name="eggs" type="xs:string" use="required" /> </xs:complexType> </xs:choice> </xs:element> </xs:sequence> </xs:complexType> </xs:element> </xsi:schema> However, when using lxml to attempt to load this schema, I get the following error: >>> from lxml import etree >>> etree.XMLSchema( etree.parse("schema_choice.xsd") ) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "xmlschema.pxi", line 85, in lxml.etree.XMLSchema.__init__ (src/lxml/lxml.etree.c:118685) lxml.etree.XMLSchemaParseError: Element '{http://www.w3.org/2001/XMLSchema}element': The content is not valid. Expected is (annotation?, ((simpleType | complexType)?, (unique | key | keyref)*))., line 7 Since the error is with the placement of my xs:choice element, I've tried putting it in different places, but no matter what I try, I can't seem to use it to define a tag to have either one set of attributes (foo and bar) or another (spam and eggs). Is this even possible? And if so, then what is the correct syntax?

    Read the article

  • Parse html and find data in the html

    - by Dan.StackOverflow
    Hi all. I am trying to use html5lib to parse an html page in to something I can query with xpath. html5lib has close to zero documentation and I've spent too much time trying to figure this problem out. Ultimate goal is to pull out the second row of a table: <html> <table> <tr><td>Header</td></tr> <tr><td>Want This</td></tr> </table> </html> so lets try it: >>> doc = html5lib.parse('<html><table><tr><td>Header</td></tr><tr><td>Want This</td> </tr></table></html>', treebuilder='lxml') >>> doc <lxml.etree._ElementTree object at 0x1a1c290> that looks good, lets see what else we have: >>> root = doc.getroot() >>> print(lxml.etree.tostring(root)) <html:html xmlns:html="http://www.w3.org/1999/xhtml"><html:head/><html:body><html:table><html:tbody><html:tr><html:td>Header</html:td></html:tr><html:tr><html:td>Want This</html:td></html:tr></html:tbody></html:table></html:body></html:html> LOL WUT? seriously. I was planning on using some xpath to get at the data I want, but that doesn't seem to work. So what can I do? I am willing to try different libraries and approaches.

    Read the article

  • <?xml version=“1.0” encoding=“UTF-8”?> not <?xml version='1.0' encoding='UTF-8'?>

    - by user2446702
    I am using lxml with tree.write(xmlFileOut, pretty_print = True, xml_declaration = True, encoding='UTF-8' to write out my opened and edited xml file, but I absolutely need to have the xml declaration as <?xml version=“1.0” encoding=“UTF-8”?> and NOT <?xml version='1.0' encoding='UTF-8'?> Now I know they are exactly the same when it comes to xml, but I am dealing with a very tricky customer who absolutely has to have " in the declaration and not '. I have searched everywhere but can't find the answer. Could I create it and add it in myself to the head of the xml somehow? Could I tell lxml that this is what I need as an xml declaration?

    Read the article

  • Which Python XML library should I use?

    - by PulpFiction
    Hello. I am going to handle XML files for a project. I had earlier decided to use lxml but after reading the requirements, I think ElemenTree would be better for my purpose. The XML files that have to be processed are: Small in size. Typically < 10 KB. No namespaces. Simple XML structure. Given the small XML size, memory is not an issue. My only concern is fast parsing. What should I go with? Mostly I have seen people recommend lxml, but given my parsing requirements, do I really stand to benefit from it or would ElementTree serve my purpose better?

    Read the article

  • Regular expression works normally, but fails when placed in an XML schema

    - by Eli Courtwright
    I have a simple doc.xml file which contains a single root element with a Timestamp attribute: <?xml version="1.0" encoding="utf-8"?> <root Timestamp="04-21-2010 16:00:19.000" /> I'd like to validate this document against a my simple schema.xsd to make sure that the Timestamp is in the correct format: <?xml version="1.0" encoding="utf-8"?> <xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="root"> <xs:complexType> <xs:attribute name="Timestamp" use="required" type="timeStampType"/> </xs:complexType> </xs:element> <xs:simpleType name="timeStampType"> <xs:restriction base="xs:string"> <xs:pattern value="(0[0-9]{1})|(1[0-2]{1})-(3[0-1]{1}|[0-2]{1}[0-9]{1})-[2-9]{1}[0-9]{3} ([0-1]{1}[0-9]{1}|2[0-3]{1}):[0-5]{1}[0-9]{1}:[0-5]{1}[0-9]{1}.[0-9]{3}" /> </xs:restriction> </xs:simpleType> </xs:schema> So I use the lxml Python module and try to perform a simple schema validation and report any errors: from lxml import etree schema = etree.XMLSchema( etree.parse("schema.xsd") ) doc = etree.parse("doc.xml") if not schema.validate(doc): for e in schema.error_log: print e.message My XML document fails validation with the following error messages: Element 'root', attribute 'Timestamp': [facet 'pattern'] The value '04-21-2010 16:00:19.000' is not accepted by the pattern '(0[0-9]{1})|(1[0-2]{1})-(3[0-1]{1}|[0-2]{1}[0-9]{1})-[2-9]{1}[0-9]{3} ([0-1]{1}[0-9]{1}|2[0-3]{1}):[0-5]{1}[0-9]{1}:[0-5]{1}[0-9]{1}.[0-9]{3}'. Element 'root', attribute 'Timestamp': '04-21-2010 16:00:19.000' is not a valid value of the atomic type 'timeStampType'. So it looks like my regular expression must be faulty. But when I try to validate the regular expression at the command line, it passes: >>> import re >>> pat = '(0[0-9]{1})|(1[0-2]{1})-(3[0-1]{1}|[0-2]{1}[0-9]{1})-[2-9]{1}[0-9]{3} ([0-1]{1}[0-9]{1}|2[0-3]{1}):[0-5]{1}[0-9]{1}:[0-5]{1}[0-9]{1}.[0-9]{3}' >>> assert re.match(pat, '04-21-2010 16:00:19.000') >>> I'm aware that XSD regular expressions don't have every feature, but the documentation I've found indicates that every feature that I'm using should work. So what am I mis-understanding, and why does my document fail?

    Read the article

  • HTML parser for GAE

    - by Richard
    Generally I use lxml for my HTML parsing needs, but that isn't available on Google App Engine. The obvious alternative is BeautifulSoup, but I find it chokes too easily on malformed HTML. Currently I am testing libxml2dom and have been getting better results. Which pure Python HTML parser have you found performs best? My priority is the ability to handle bad HTML over speed.

    Read the article

  • Should Python 2.6 on OS X deal with multiple easy-install.pth files in $PYTHONPATH?

    - by ahd
    I am running ipython from sage and also am using some packages that aren't in sage (lxml, argparse) which are installed in my home directory. I have therefore ended up with a $PYTHONPATH of $HOME/sage/local/lib/python:$HOME/lib/python Python is reading and processing the first easy-install.pth it finds ($HOME/sage/local/lib/python/site-packages/easy-install.pth) but not the second, so eggs installed in $HOME/lib/python aren't added to the path. On reading the off-the-shelf site.py, I cannot for the life of me see why it's doing this. Can someone enlighten me? Or advise how to nudge Python into reading both easy-install.pth files? Consolidating both into one .pth file is a viable workaround for now, so this question is mostly for curiosity value.

    Read the article

  • etree.findall: 'OR'-lookup?

    - by piquadrat
    I want to find all stylesheet definitions in a XHTML file with lxml.etree.findall. This could be as simple as elems = tree.findall('link[@rel="stylesheet"]') + tree.findall('style') But the problem with CSS style definitions is that the order matters, e.g. <link rel="stylesheet" type="text/css" href="/media/css/first.css" /> <style>body:{font-size: 10px;}</style> <link rel="stylesheet" type="text/css" href="/media/css/second.css" /> if the contents of the style tag is applied after the rules in the two link tags, the result may be completely different from the one where the rules are applied in order of definition. So, how would I do a lookup that inlcudes both link[@rel="stylesheet"] and style?

    Read the article

  • How to print an Objectified Element?

    - by BeeBand
    I have xml of the format: <channel> <games> <game slot='1'> <id>Bric A Bloc</id> <title-text>BricABloc Hoorah</title-text> <link>Fruit Splat</link> </game> </games> </channel> I've parsed this xml using lxml.objectify, via: tree = objectify.parse(file) There will potentially be a number of <game>s underneath <games>. I understand that I can generate a list of <game> objects via: [ tree.games[0].game[0:4] ] My question is, what class are those objects and is there a function to print any object of whatever class these objects belong to?

    Read the article

  • (Python) Extracting Text from Source Code?

    - by zhuyxn
    Currently have a large webpage whose source code is ~200,000 lines of almost all (if not all) HTML. More specifically, it is a webpage whose content is a few thousand blocks of paragraphs separated by line breaks (though a line break does not specifically mean there is a separation in content) My main objective is to extract text from the source code as if I were copying/pasting the webpage into a text editor. There is another parsing function I would like to use, which originally took in copied/pasted text rather than the source code. To do this, I'm currently using urllib2, and calling .get_text() in Beautiful Soup. The problem is, Beautiful Soup is leaving tremendous amounts of white space in my code, and it is difficult to pass the result into the second "text" parser. I have done quite a bit of research on parsing HTMLs, but I'm frankly not sure how to solve this problem easily. Furthermore, I'm a bit confused on how to use imports like lxml to extract text as if I were to simply copy and paste?

    Read the article

  • Detect if 2 HTML fragments have identical hierarchical structure

    - by sergzach
    An example of fragments that have identical hierarchical structure: (1) <div> <span>It's a message</span> </div> (2) <div> <span class='bold'>This is a new text</span> </div> An example of fragments that have different structure: (1) <div> <span><b>It's a message</b></span> </div> (2) <div> <span>This is a new text</span> </div> So, fragments with a similar structure correspond to one hierarchical tree (the same tag names, the same hierarchical structure). How can I detect if 2 elements (html fragments) have the same structure simply with lxml? I have a function that does not work properly for some more difficult case (than the example): def _is_equal( el1, el2 ): # input: 2 elements with possible equal structure and tag names # e.g. root = lxml.html.fromstring( buf ) # el1 = root[ 0 ] # el2 = root[ 1 ] # move from top to bottom, compare elements result = False if el1.tag == el2.tag: # has no children if len( el1 ) == len( el2 ): if len( el1 ) == 0: return True else: # iterate one of them, for example el1 i = 0 for child1 in el1: child2 = el2[ i ] is_equal2 = _is_equal( child1, child2 ) if not is_equal2: return False return True else: return False else: return False The code fails to detect that 2 divs with class='tovar2' have an identical structure: <body> <div class="tovar2"> <h2 class="new"> <a href="http://modnyedeti-krsk.ru/magazin/product/333193003"> ?????? ?/? </a> </h2> <ul class="art"> <li> ???????: <span>1759</span> </li> </ul> <div> <div class="wrap" style="width:180px;"> <div class="new"> <img src="shop_files/new-t.png" alt=""> </div> <a class="highslide" href="http://modnyedeti-krsk.ru/d/459730/d/820.jpg" onclick="return hs.expand(this)"> <img src="shop_files/fr_5.gif" style="background:url(/d/459730/d/548470803_5.jpg) 50% 50% no-repeat scroll;" alt="?????? ?/?" height="160" width="180"> </a> </div> </div> <form action="" onsubmit="return addProductForm(17094601,333193003,3150.00,this,false);"> <ul class="bott "> <li class="price">????:<br> <span> <b> 3 150 </b> ???. </span> </li> <li class="amount">???-??:<br><input class="number" onclick="this.select()" value="1" name="product_amount" type="text"> </li> <li class="buy"><input value="" type="submit"> </li> </ul> </form> </div> <div class="tovar2"> <h2 class="new"> <a href="http://modnyedeti-krsk.ru/magazin/product/333124803">?????? ?/?</a> </h2> <ul class="art"> <li> ???????: <span>1759</span> </li> </ul> <div> <div class="wrap" style="width:180px;"> <div class="new"> <img src="shop_files/new-t.png" alt=""> </div> <a class="highslide" href="http://modnyedeti-krsk.ru/d/459730/d/820.jpg" onclick="return hs.expand(this)"> <img src="shop_files/fr_5.gif" style="background:url(/d/459730/d/548470803_5.jpg) 50% 50% no-repeat scroll;" alt="?????? ?/?" height="160" width="180"> </a> </div> </div> <form action="" onsubmit="return addProductForm(17094601,333124803,3150.00,this,false);"> <ul class="bott "> <li class="price">????:<br> <span> <b>3 150</b> ???. </span> </li> <li class="amount">???-??:<br><input class="number" onclick="this.select()" value="1" name="product_amount" type="text"> </li> <li class="buy"> <input value="" type="submit"> </li> </ul> </form> </div> </body>

    Read the article

  • javascript-aware html parser for Python ~

    - by znetor
    <html> <head> <script type="text/javascript"> document.write('<a href="http://www.google.com">f*** js</a>'); document.write("f*** js!"); </script> </head> <body> <script type="text/javascript"> document.write('<a href="http://www.google.com">f*** js</a>'); document.write("f*** js!"); </script> <div><a href="http://www.google.com">f*** js</a></div> </body> </html> I want use xpath to catch all lable object in the html page above... In [1]: import lxml.html as H In [2]: f = open("test.html","r") In [3]: c = f.read() In [4]: doc = H.document_fromstring(c) In [5]: doc.xpath('//a') Out[5]: [<Element a at a01d17c>] In [6]: a = doc.xpath('//a')[0] In [7]: a.getparent() Out[7]: <Element div at a01d41c> I only get one don't generate by js~ but firefox xpath checker can find all lable!? http://i.imgur.com/0hSug.png how to do that??? thx~! <html> <head> </head> <body> <script language="javascript"> function over(){ a.innerHTML="mouse me" } function out(){ a.innerHTML="<a href='http://www.google.com'>google</a>" } </script> <body><li id="a"onmouseover="over()" onmouseout="out()">mouse me</li> </body> </html>

    Read the article

  • From escaped html -> to regular html? - Python

    - by RadiantHex
    Hi folks, I used BeautifulSoup to handle XML files that I have collected through a REST API. The responses contain HTML code, but BeautifulSoup escapes all the HTML tags so it can be displayed nicely. Unfortunately I need the HTML code. How would I go on about transforming the escaped HTML into proper markup? Help would be very much appreciated!

    Read the article

  • Passing around an ElementTree

    - by PulpFiction
    Hello. In my program, I need to make use of an ElementTree object in various functions in my program. More specifically, I am doing this: tree = etree.parse('somefile.xml') I am passing this tree around in my program. I was wondering whether this is a good approach, or can I do this: Create a global tree (I come from a C++ background and I know global is bad) Create the tree again wherever required. Or is my approach ok?

    Read the article

  • Should I use a class in this: Reading a XML file using lxml.

    - by PulpFiction
    Hi everyone. This question is in continuation to my previous question, in which I asked about passing around an ElementTree. I need to read the XML files only and to solve this, I decided to create a global ElementTree and then parse it wherever required. My question is: Is this an acceptable practice? I heard global variables are bad. If I don't make it global, I was suggested to make a class. But do I really need to create a class? What benefits would I have from that approach. Note that I would be handling only one ElementTree instance per run, the operations are read-only. If I don't use a class, how and where do I declare that ElementTree so that it available globally? (Note that I would be importing this module) Please answer this question in the respect that I am a beginner to development, and at this stage I can't figure out whether to use a class or just go with the functional style programming approach.

    Read the article

  • Almost every Inkscape extension yields an error in Mac OS X

    - by andyvn22
    I've run the latest few versions of Inkscape (currently landed on "0.47+devel"), and have been having trouble with the Extensions menu. So far, in every version of Inkscape I've tried, nearly every extension yields the following error: The fantastic lxml wrapper for libxml2 is required by inkex.py and therefore this extension. Please download and install the latest version from http://cheeseshop.python.org/pypi/lxml/, or install it through your package manager by a command like: sudo apt-get install python-lxml I've tried the instructions listed there, of course, with no effect. I've also found many references to this issue on fora, in bug trackers, etc., and as such also tried: sudo easy_install lxml cd /Applications/Inkscape.app/Contents/Resources/lib mv libxml2.2.dylib libxml2.2.dylib.old ln -s /usr/lib/libxml2.dylib and a few similar solutions. Nothing has produced any change in Inkscape's behavior. Does anyone know A) what's really going on here? Because from what I gather the error is not describing the actual problem. And of course B) a simple solution? I need those features! :)

    Read the article

  • Scrape zipcode table for different urls based on county

    - by Dr.Venkman
    I used lxml and ran into a wall as my new computer wont install lxml and the code doesnt work. I know this is simple - maybe some one can help with a beautiful soup script. this is my code: import codecs import lxml as lh from selenium import webdriver import time import re results = [] city = [ 'amador'] state = [ 'CA'] for state in states: for city in citys: browser = webdriver.Firefox() link2 = 'http://www.getzips.com/cgi-bin/ziplook.exe?What=3&County='+ city +'&State=' + state + '&Submit=Look+It+Up' browser.get(link2) bcontent = browser.page_source zipcode = bcontent[bcontent.find('<td width="15%"'):bcontent.find('<p>')+0] if len(zipcode) > 0: print zipcode else: print 'none' browser.quit() Thanks for the help

    Read the article

  • how to diagnosis and resolve: /usr/lib64/libz.so.1: no version information available

    - by matchew
    I had a hell of a time installing lxml for python2.7 on centOs5.6. For some background, python2.7 is an alternative installation of python on centOS5.6 which comes with python2.4 installed. it was bulit from source per its instrucitons ./configure make make altinstall However, after about 20 hours of trying I managed to find a workable solution and was able to install lxml. Until, I notice the following error at the top of the interpreter: python2.7: /usr/lib64/libz.so.1: no version information available (required by python2.7) Python 2.7.2 (default, Jun 30 2011, 18:55:26) [GCC 4.1.2 20080704 (Red Hat 4.1.2-50)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> print 'Sheeeeut!' this error is printed out everytime I run a script. For example: $ ./test.py /usr/local/bin/python2.7: /usr/lib64/libz.so.1: no version information available (required by /usr/local/bin/python2.7) the script runs flawlessly, but this error is bothersome. After some digging I have seem to believe I have a wrong version of libz installed, that it is either an older version or built for a different platform. I'm not quite sure how, I've only installed libz through yum, as far as I know. Although, I can't quite remember every little thing I tried in my twenty hours of trying. You may also be intereted in what my lib64 folder looks like, here is some information $ ls -ltrh libz* -rwxr-xr-x 1 root root 84K Jan 9 2007 libz.so.1.2.3 -rwxr-xr-x 1 root root 107K Jan 9 2007 libz.a -rwxr-xr-x 1 root root 154K Feb 22 23:30 libzdb.so.7.0.2 lrwxrwxrwx 1 root root 13 Apr 20 20:46 libz.so.1 -> libz.so.1.2.3 lrwxrwxrwx 1 root root 15 Jun 30 18:43 libzdb.so.7 -> libzdb.so.7.0.2 lrwxrwxrwx 1 root root 13 Jul 1 11:35 libz.so -> libz.so.1.2.3 lrwxrwxrwx 1 root root 15 Jul 1 11:35 libzdb.so -> libzdb.so.7.0.2 notice: the items that Say Jul 1st or Jun 30th are from me. I had initially moved these files into a backup folder as they seeemed to be 1. duplicates and 2. had a date after/during my problems I alluded to earlier that I had with lxml One inclination is to completely remove python2.7 and re-install. I think having it install to /usr/local/ was a poor default choice. However, without the make uninstall option being present it seems to be a time consuming task for a solution I am not quite sure would solve my problem.

    Read the article

< Previous Page | 1 2 3  | Next Page >