Search Results

Search found 117 results on 5 pages for 'no soup for you'.


  • Dealing with curly brace soup

    - by Cyborgx37
    I've programmed in both C# and VB.NET for years, but primarily in VB. I'm making a career shift toward C# and, overall, I like C# better. One issue I'm having, though, is curly brace soup. In VB, each structure keyword has a matching close keyword, for example:

        Namespace ...
            Class ...
                Function ...
                    For ...
                        Using ...
                            If ...
                                ...
                            End If
                            If ...
                                ...
                            End If
                        End Using
                    Next
                End Function
            End Class
        End Namespace

    The same code written in C# ends up very hard to read:

        namespace ... {
            class ... {
                function ... {
                    for ... {
                        using ... {
                            if ... {
                                ...
                            }
                            if ... {
                                ...
                            }
                        }
                    } // wait... what level is this?
                }
            }
        }

    Being so used to VB, I'm wondering if there's a technique employed by C-style programmers to improve readability and to ensure that your code ends up in the correct "block". The above example is relatively easy to read, but sometimes at the end of a piece of code I'll have 8 or more levels of curly braces, requiring me to scroll up several pages to figure out which brace ends the block I'm interested in.


  • Python beautiful soup arguments

    - by scott
    Hi, I have this code that fetches some text from a page using BeautifulSoup:

        soup = BeautifulSoup(html)
        body = soup.find('div', {'id':'body'})
        print body

    I would like to make this a reusable function that takes in some HTML text and the tags to match, like the following:

        def parse(html, atrs):
            soup = BeautifulSoup(html)
            body = soup.find(atrs)
            return body

    But if I make a call like this:

        parse(htmlpage, ('div', {'id':'body'}))

    or like:

        parse(htmlpage, ['div', {'id':'body'}])

    I get only the div element; the {'id':'body'} attribute filter seems to get ignored. Is there a way to fix this?
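
    A likely fix, sketched under the assumption that a BeautifulSoup 3 import matches the question's code: find() takes the tag name and the attribute dict as separate arguments, so bundling them into one tuple makes BeautifulSoup treat the whole tuple as the name. Unpacking variable arguments keeps the call site almost unchanged:

        from BeautifulSoup import BeautifulSoup

        def parse(html, *args):
            # *args collects ('div', {'id': 'body'}) and hands both values
            # to find() as separate arguments instead of a single tuple
            soup = BeautifulSoup(html)
            return soup.find(*args)

        body = parse(htmlpage, 'div', {'id': 'body'})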


  • Beautiful Soup Unicode encode error

    - by iamrohitbanga
    I am trying the following code with a particular HTML file:

        from BeautifulSoup import BeautifulSoup
        import re
        import codecs
        import sys

        f = open('test1.html')
        html = f.read()
        soup = BeautifulSoup(html)
        body = soup.body.contents
        para = soup.findAll('p')
        print str(para).encode('utf-8')

    I get the following error:

        UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 9: ordinal not in range(128)

    How do I debug this?
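
    The str() call is the likely culprit here: in Python 2, str() coerces the result list to ASCII before .encode('utf-8') ever runs, and u'\u2019' (a right single quotation mark) has no ASCII equivalent. A sketch that keeps everything unicode until one explicit encode at the end:

        # Convert each tag to unicode first, then encode explicitly
        for p in soup.findAll('p'):
            print unicode(p).encode('utf-8')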


  • Python Beautiful Soup .contents Property

    - by Robert Birch
    What does BeautifulSoup's .contents do? I am working through crummy.com's tutorial and I don't really understand what .contents does. I have looked at the forums and I have not seen any answers. Looking at the code below...

        from BeautifulSoup import BeautifulSoup
        import re

        doc = ['<html><head><title>Page title</title></head>',
               '<body><p id="firstpara" align="center">This is paragraph <b>one</b>.',
               '<p id="secondpara" align="blah">This is paragraph <b>two</b>.',
               '</html>']
        soup = BeautifulSoup(''.join(doc))
        print soup.contents[0].contents[0].contents[0].contents[0].name

    I would expect the last line of the code to print out 'body' instead of...

        File "pe_ratio.py", line 29, in <module>
            print soup.contents[0].contents[0].contents[0].contents[0].name
        File "C:\Python27\lib\BeautifulSoup.py", line 473, in __getattr__
            raise AttributeError, "'%s' object has no attribute '%s'" % (self.__class__.__name__, attr)
        AttributeError: 'NavigableString' object has no attribute 'name'

    Is .contents only concerned with html, head and title? If so, why is that? Thanks for the help in advance.
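
    For what it's worth, a quick way to see why the traceback mentions NavigableString is to print the name one level at a time (a sketch against the sample document above). .contents is simply the ordered list of a node's children, and text nodes count as children too:

        print soup.contents[0].name                          # 'html'
        print soup.contents[0].contents[0].name              # 'head'
        print soup.contents[0].contents[0].contents[0].name  # 'title'
        # The <title> tag's only child is the text "Page title", a
        # NavigableString, which has no .name, hence the AttributeError.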


  • Find unique vertices from a 'triangle-soup'

    - by sum1stolemyname
    I am building a CAD-file converter on top of two libraries (OpenCASCADE and DWF Toolkit). However, my question is platform agnostic: Given: I have generated a mesh as a list of triangular faces from a model constructed through my application. Each triangle is defined through three vertices, each of which consists of three floats (x, y & z coordinates). Since the triangles form a mesh, most of the vertices are shared by more than one triangle. Goal: I need to find the list of unique vertices, and to generate an array of faces consisting of tuples of three indices into this list. What I want to do is this:

        // step 1: build a list of unique vertices
        for each triangle
            for each vertex in triangle
                if not vertex in listOfVertices
                    Add vertex to listOfVertices

        // step 2: build a list of faces
        for each triangle
            for each vertex in triangle
                Get Vertex Index From listOfVertices
                AddToMap(vertex index, triangle)

    While I do have an implementation which does this, step 1 (the generation of the list of unique vertices) is really slow, on the order of O(n²), since each vertex is compared to all vertices already in the list. I thought "Hey, let's build a hashmap of my vertices' components using std::map, that ought to speed things up!", only to find that generating a unique key from three floating point values is not a trivial task. Here, the experts of stackoverflow come into play: I need some kind of hash function which works on 3 floats, or any other function generating a unique value from a 3D vertex position.
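
    A hash map keyed on the coordinate triple itself removes the linear scan and brings step 1 down to roughly O(n). Below is a minimal sketch of the idea in Python (it ports directly to std::unordered_map with a tuple hash); it assumes vertices that should be merged are bit-identical, so coordinates that differ only by rounding noise would need quantizing first, e.g. with round(x, 6):

        def index_mesh(triangles):
            # triangles: iterable of ((x, y, z), (x, y, z), (x, y, z)) float tuples
            vertex_index = {}   # (x, y, z) -> position in 'vertices'
            vertices = []       # unique vertices, in first-seen order
            faces = []          # one (i, j, k) index triple per triangle
            for tri in triangles:
                face = []
                for v in tri:
                    if v not in vertex_index:
                        vertex_index[v] = len(vertices)
                        vertices.append(v)
                    face.append(vertex_index[v])
                faces.append(tuple(face))
            return vertices, faces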


  • urllib2.Request() with data returns empty url

    - by Mr. Polywhirl
    My main concern is the function getUrlAndHtml(). If I manually build and append the query to the end of the URI, I can get the response URL, but if I pass a dictionary as the request data, the URL does not come back. Is there any way to guarantee the redirected URL? In my example below, if thisWorks = True I get back a URL, but the returned URL is the request URL as opposed to a redirect link. On a side note, the encoding .E2.80.93 does not translate to "-" for some reason?

        #!/usr/bin/python
        import pprint
        import urllib
        import urllib2
        from bs4 import BeautifulSoup
        from sys import argv

        URL = 'http://en.wikipedia.org/w/index.php?'

        def yesOrNo(boolVal):
            return 'yes' if boolVal else 'no'

        def getTitleFromRaw(page):
            return page.strip().replace(' ', '_')

        def getUrlAndHtml(title, printable=False):
            thisWorks = False
            if thisWorks:
                query = 'title={:s}&printable={:s}'.format(title, yesOrNo(printable))
                opener = urllib2.build_opener()
                opener.addheaders = [('User-agent', 'Mozilla/5.0')]
                response = opener.open(URL + query)
            else:
                params = {'title':title,'printable':yesOrNo(printable)}
                data = urllib.urlencode(params)
                headers = {'User-agent':'Mozilla/5.0'}
                request = urllib2.Request(URL, data, headers)
                response = urllib2.urlopen(request)
            return response.geturl(), response.read()

        def getSoup(html, name=None, attrs=None):
            soup = BeautifulSoup(html)
            if name is None:
                return None
            return soup.find(name, attrs)

        def setTitle(soup, newTitle):
            title = soup.find('div', {'id':'toctitle'})
            h2 = title.find('h2')
            h2.contents[0].replaceWith('{:s} for {:s}'.format(h2.getText(), newTitle))

        def updateLinks(soup, url):
            fragment = '#'
            for a in soup.findAll('a', href=True):
                a['href'] = a['href'].replace(fragment, url + fragment)

        def writeToFile(soup, filename='out.html', indentLevel=2):
            with open(filename, 'wt') as out:
                pp = pprint.PrettyPrinter(indent=indentLevel, stream=out)
                pp.pprint(soup)
            print('Wrote {:s} successfully.'.format(filename))

        if __name__ == '__main__':
            def exitPgrm():
                print('usage: {:s} "<PAGE>" <FILE>'.format(argv[0]))
                exit(0)

            if len(argv) == 2:
                help = argv[1]
                if help == '-h' or help == '--help':
                    exitPgrm()

            if False:'''
            if not len(argv) == 3:
                exitPgrm()
            '''

            page = 'Led Zeppelin'    # argv[1]
            filename = 'test.html'   # argv[2]

            title = getTitleFromRaw(page)
            url, html = getUrlAndHtml(title)
            soup = getSoup(html, 'div', {'id':'toc'})
            setTitle(soup, page)
            updateLinks(soup, url)
            writeToFile(soup, filename)
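
    One plausible explanation, offered as a guess rather than a certainty: giving urllib2.Request a data argument turns the request into a POST, and Wikipedia appears to answer the POST directly instead of redirecting, so geturl() just echoes the request URL. Building a GET from the same dictionary keeps the redirect. A sketch:

        params = {'title': title, 'printable': yesOrNo(printable)}
        query = urllib.urlencode(params)   # appended to the URL, no data argument => GET
        request = urllib2.Request(URL + query, None, {'User-agent': 'Mozilla/5.0'})
        response = urllib2.urlopen(request)
        print response.geturl()            # should now reflect any redirect

    As for the side note: .E2.80.93 is the dot-encoded form of the UTF-8 bytes for an en dash (U+2013), which is a different character from the plain hyphen, so it never translates to "-".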


  • How to convert Beautiful Soup Unicode into a decimal value?

    - by MikeTheCoder
    I'm trying to use Python's Beautiful Soup library to grab a bunch of divs from an HTML file, and from there get the string (a money value) that's inside each div, then remove the dollar sign and convert the rest to a decimal so that I can use greater-than and less-than conditional statements to compare values. I have googled the heck out of it and can't seem to come up with a way to convert this unicode string into a decimal value. I really could use some help here. How do I convert unicode into a decimal value? This was my last attempt:

        import unicodedata
        from bs4 import BeautifulSoup

        soup = BeautifulSoup(open("/Users/sm/Documents/python/htmldemo.html"))
        for tag in soup.findAll("div", attrs={"itemprop":"price"}):
            val = tag.string
            new_val = val[8:]
            workable = int(new_val)
            if workable > 250:
                print(type(workable))
            else:
                print(type(workable))

    Edit: I checked the type of new_val with print(type(new_val)).
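
    A sketch of one way to do the conversion, assuming the scraped text looks something like u'$1,234.56'; Decimal sidesteps binary float rounding when comparing money, and int() fails on anything containing a decimal point:

        from decimal import Decimal, InvalidOperation

        def to_decimal(text):
            # Strip whitespace, a leading dollar sign, and thousands separators
            cleaned = text.strip().lstrip(u'$').replace(u',', u'')
            try:
                return Decimal(cleaned)
            except InvalidOperation:
                return None   # the div held something other than a number

        price = to_decimal(tag.string)
        if price is not None and price > 250:
            print(price)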


  • 3 Scenarios for most relevant keywords in website. Which one is best?

    - by Sam
    A webpage about Tomato Soup has one of the three following filenames:

        Scenario 1: website.org/en/tomato-soup
        Scenario 2: website.org/en/tomato-soup-healthy-soups-recipes
        Scenario 3: website.org/en/tomato-why-sandra-is-so-wild-about-her-healthy-tomato-soup-recipes

    Q1. Which one of the above would you go for?
    Q2. Which one of these would be ranked as most relevant by Google?
    Q3. Would any of these be penalized for keyword stuffing?


  • Beautiful Soup: how to print a tag while iterating over it

    - by Bunny Rabbit
        <?xml version="1.0" encoding="UTF-8"?>
        <playlist version="1" xmlns="http://xspf.org/ns/0/">
            <trackList>
                <track>
                    <location>file:///home/ashu/Music/Collections/randomPicks/ipod%20on%20sep%2009/Coldplay-Sparks.mp3</location>
                    <title>Coldplay-Sparks</title>
                </track>
                <track>
                    <location>file:///home/ashu/Music/Collections/randomPicks/gud%201s/Coldplay%20Warning%20sign.mp3</location>
                    <title>Coldplay Warning sign</title>
                </track>
                ...

    My XML looks like this. I want to get the locations, so I am trying:

        from BeautifulSoup import BeautifulSoup as bs
        soup = bs(the_above_xml_text)
        for track in soup.tracklist:
            print track.location.string

    but that is not working, because I am getting:

        AttributeError: 'NavigableString' object has no attribute 'location'

    How can I achieve the result? Thanks in advance.
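
    A hedged guess at the cause: iterating directly over soup.tracklist visits every child node, including the whitespace text between the <track> tags, and those NavigableStrings have no .location attribute. Iterating over only the <track> tags sidesteps the problem. A sketch:

        # findAll returns just the <track> tags, skipping the text nodes
        for track in soup.findAll('track'):
            print track.location.string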


  • Parsing HTML with Python 2.7 - HTMLParser, SGMLParser, or Beautiful Soup?

    - by Eric Wilson
    I want to do some screen-scraping with Python 2.7, and I have no context for the differences among HTMLParser, SGMLParser, and Beautiful Soup. Are these all trying to solve the same problem, or do they exist for different reasons? Which is simplest, which is most robust, and which (if any) is the default choice? Also, please let me know if I have overlooked a significant option. Edit: I should mention that I'm not particularly experienced in HTML parsing, and I'm particularly interested in which will get me moving the quickest, with the goal of parsing HTML on one particular site.


  • Any Suggestions on How to Soup Up/ Mod a MacBook Pro 13"?

    - by 5arx
    So I've got a mid-2009 MacBook Pro 13". Integrated GPU, so not a games machine, but fast enough for doing .NET development in VMs. I love the little thing and wanted to give it a Christmas present, so I thought I'd mod it up a bit and give it a boost. I'm probably going to go for a 500GB Seagate Momentus XT hybrid drive rather than a full-on SSD (I need the 500GB of space), but I was wondering if there are any other mods/tweaks people could suggest? I saw something online about swapping the DVD drive for a second HDD and wondered if anyone had tried this, or similarly drastic mods, on the smallest of the MBPs. Cheers.


  • BeautifulSoup HTMLParseError. What's wrong with this?

    - by user1915496
    This is my code:

        from bs4 import BeautifulSoup as BS
        import urllib2

        url = "http://services.runescape.com/m=news/recruit-a-friend-for-free-membership-and-xp"
        res = urllib2.urlopen(url)
        soup = BS(res.read())
        other_content = soup.find_all('div', {'class':'Content'})[0]
        print other_content

    Yet an error comes up:

        /Library/Python/2.7/site-packages/bs4/builder/_htmlparser.py:149: RuntimeWarning:
        Python's built-in HTMLParser cannot parse the given document. This is not a bug in
        Beautiful Soup. The best solution is to install an external parser (lxml or
        html5lib), and use Beautiful Soup with that parser. See
        http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser for help.

        Traceback (most recent call last):
          File "web.py", line 5, in <module>
            soup = BS(res.read())
          File "/Library/Python/2.7/site-packages/bs4/__init__.py", line 172, in __init__
            self._feed()
          File "/Library/Python/2.7/site-packages/bs4/__init__.py", line 185, in _feed
            self.builder.feed(self.markup)
          File "/Library/Python/2.7/site-packages/bs4/builder/_htmlparser.py", line 150, in feed
            raise e

    I've let two other people use this code, and it works perfectly fine for them. Why is it not working for me? I have bs4 installed...
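
    The difference between machines is most likely the Python point release: some 2.7.x builds of the built-in HTMLParser choke on markup that other builds tolerate. The warning's own advice is the usual fix. A sketch, assuming lxml has been installed (pip install lxml):

        soup = BS(res.read(), 'lxml')   # 'html5lib' also works; both are far
                                        # more forgiving than the built-in parser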


  • In Python BeautifulSoup, how to move tags

    - by JJ
    I have a partially converted XML document in soup, coming from HTML. After some replacement and editing in the soup, the body is essentially:

        <Text...></Text>   # this replaces the <a href..> tags but automatically creates the </Text>
        <p class=norm ...</p>
        <p class=norm ...</p>
        <Text...></Text>
        <p class=norm ...</p>

    and so forth. I need to "move" the <p> tags to be children of <Text>, or to know how to suppress the </Text>. I want:

        <Text...>
            <p class=norm ...</p>
            <p class=norm ...</p>
        </Text>
        <Text...>
            <p class=norm ...</p>
        </Text>

    I've tried using item.insert and item.append, but I'm thinking there must be a more elegant solution.

        for item in soup.findAll(['p','span']):
            if item.name == 'span' and item.has_key('class') and item['class'] == 'section':
                xBCV = short_2_long(item._getAttrMap().get('value',''))
                if currentnode:
                    pass
                currentnode = Tag(soup, 'Text', attrs=[('TypeOf', 'Section'), ... ])
                item.replaceWith(currentnode)  # works but creates end tag
            elif item.name == 'p' and item.has_key('class') and item['class'] == 'norm':
                childcdatanode = None
                for ahref in item.findAll('a'):
                    if childcdatanode:
                        pass
                    newlink = filter_hrefs(str(ahref))
                    childcdatanode = Tag(soup, newlink)
                    ahref.replaceWith(childcdatanode)

    Thanks
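
    One possible approach, untested against the real document: once the <Text> nodes are in place, walk each one's following siblings and re-parent every consecutive <p class="norm"> with append(), which attaches a detached tag as the last child:

        for text_tag in soup.findAll('Text'):
            sibling = text_tag.nextSibling
            while sibling is not None and getattr(sibling, 'name', None) == 'p':
                following = sibling.nextSibling     # remember before detaching
                text_tag.append(sibling.extract())  # detach the <p>, re-attach as a child
                sibling = following

    Whitespace-only text nodes between the tags would stop the walk early; skipping NavigableStrings inside the loop handles that if it comes up.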


  • BeautifulSoup can't parse a webpage?

    - by JLTChiu
    I am using Beautiful Soup for parsing a webpage now. I've heard it's very famous and good, but it doesn't seem to work properly. Here's what I did:

        import urllib2
        from bs4 import BeautifulSoup

        page = urllib2.urlopen("http://www.cnn.com/2012/10/14/us/skydiver-record-attempt/index.html?hpt=hp_t1")
        soup = BeautifulSoup(page)
        print soup.prettify()

    I think this is kind of straightforward. I open the webpage and pass it to BeautifulSoup. But here's what I got:

        Warning (from warnings module):
          File "C:\Python27\lib\site-packages\bs4\builder\_htmlparser.py", line 149
            "Python's built-in HTMLParser cannot parse the given document. This is not a
            bug in Beautiful Soup. The best solution is to install an external parser
            (lxml or html5lib), and use Beautiful Soup with that parser. See
            http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser for help."
        ...
        HTMLParseError: bad end tag: u'</"+"script>', at line 634, column 94

    I thought the CNN website would be well designed, so I am not sure what's going on. Does anyone have an idea about this?
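
    The "bad end tag" is probably not sloppy HTML at all: pages frequently assemble <script> elements from concatenated strings (hence the literal '</"+"script>') so that browsers will not close the tag early, and Python's strict built-in parser trips over the trick. Following the warning's advice is the usual workaround; a sketch, assuming html5lib is installed (pip install html5lib):

        soup = BeautifulSoup(page, 'html5lib')   # lenient, browser-like parsing
        print soup.prettify()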


  • BeautifulSoup can't find an existing href in file

    - by young001
    I have an HTML file like the following:

        <form action="/2811457/follow?gsid=3_5bce9b871484d3af90c89f37" method="post">
        <div>
        <a href="/2811457/follow?page=2&amp;gsid=3_5bce9b871484d3af90c89f37">next_page</a>
        &nbsp;<input name="mp" type="hidden" value="3" />
        <input type="text" name="page" size="2" style='-wap-input-format: "*N"' />
        <input type="submit" value="jump" />&nbsp;1/3
        </div>
        </form>

    This is only part of the HTML, trimmed to make it clear. How do I extract the "1/3" from the file? I'm new to BeautifulSoup, and I have looked at the documentation, but I'm still confused. This:

        total_urls_num = soup.find(re.compile('.*/d\//d.*'))

    doesn't work. As JBernardo said, \d should be a number; when I change it to .*\d/\d.*, it doesn't work either. My code:

        from BeautifulSoup import BeautifulSoup
        import re

        with open("html.txt","r") as f:
            response = f.read()
        print response
        soup = BeautifulSoup(response)
        delete_urls = soup.findAll('a', href=re.compile('follow\?page'))  # works
        print delete_urls
        #total_urls_num = soup.find(re.compile('.*\d/\d.*'))
        total_urls_num = soup.find('input',style='submit')  # can't work
        print total_urls_num
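
    A hedged suggestion: soup.find(pattern) matches the pattern against tag names, but "1/3" is a text node, not a tag, so no tag-name pattern can ever reach it. Searching the document's text instead should work. A sketch:

        import re

        # Search the text nodes rather than the tag names
        total = soup.find(text=re.compile(r'\d+/\d+'))
        if total:
            print total.strip()   # the text node containing 1/3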


  • Python Continue Loop

    - by Rob B.
    I am using the following code from this tutorial (http://jeriwieringa.com/blog/2012/11/04/beautiful-soup-tutorial-part-1/):

        from bs4 import BeautifulSoup

        soup = BeautifulSoup(open("43rd-congress.html"))
        final_link = soup.p.a
        final_link.decompose()

        trs = soup.find_all('tr')
        for tr in trs:
            for link in tr.find_all('a'):
                fulllink = link.get('href')
                print fulllink  # print in terminal to verify results
            tds = tr.find_all("td")
            try:
                # "try" because the table is not well formatted; this allows the
                # program to continue after encountering an error. Each line isolates
                # an item by its column in the table and converts it into a string.
                names = str(tds[0].get_text())
                years = str(tds[1].get_text())
                positions = str(tds[2].get_text())
                parties = str(tds[3].get_text())
                states = str(tds[4].get_text())
                congress = tds[5].get_text()
            except:
                print "bad tr string"
                continue  # move on to the next item after encountering an error
            print names, years, positions, parties, states, congress

    However, I get an error saying that 'continue' is not properly in the loop on line 27. I am using Notepad++ and Windows PowerShell. How do I make this code work?
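
    A likely cause, offered as a guess: continue is only legal inside a loop body, and this particular SyntaxError usually means the try/except ended up outside the for loop, most often through mixed tabs and spaces (easy to do in Notepad++). A stripped-down skeleton of the intended nesting:

        for tr in trs:
            tds = tr.find_all("td")
            try:                  # one level inside the for body
                names = str(tds[0].get_text())
            except:
                print "bad tr string"
                continue          # legal here: still inside "for tr in trs"
            print names

    Converting the file's tabs to spaces, or running python -tt script.py (which makes Python 2 reject ambiguous indentation; script.py stands in for the real filename), should confirm the diagnosis.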


  • Python regex on list

    - by Peter Nielsen
    Hi there. I am trying to build a parser and save the results as an XML file, but I have problems. For instance, I get a TypeError: expected string or buffer when I try to run the code. Would you experts please have a look at my code?

        import urllib2, re
        from xml.dom.minidom import Document
        from BeautifulSoup import BeautifulSoup as bs

        osc = open('OSCTEST.html','r')
        oscread = osc.read()
        soup = bs(oscread)

        doc = Document()
        root = doc.createElement('root')
        doc.appendChild(root)
        countries = doc.createElement('countries')
        root.appendChild(countries)

        findtags1 = re.compile('<h1 class="title metadata_title content_perceived_text(.*?)</h1>',
                               re.DOTALL | re.IGNORECASE).findall(soup)
        findtags2 = re.compile('<span class="content_text">(.*?)</span>',
                               re.DOTALL | re.IGNORECASE).findall(soup)

        for header in findtags1:
            title_elem = doc.createElement('title')
            countries.appendChild(title_elem)
            header_elem = doc.createTextNode(header)
            title_elem.appendChild(header_elem)

        for item in findtags2:
            art_elem = doc.createElement('artikel')
            countries.appendChild(art_elem)
            s = item.replace('<P>','')
            t = s.replace('</P>','')
            text_elem = doc.createTextNode(t)
            art_elem.appendChild(text_elem)

        print doc.toprettyxml()
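
    A probable cause, hedged: findall() expects a string, but soup is a BeautifulSoup object, which is exactly what "expected string or buffer" is complaining about. The smallest change is to run the regexes over the raw HTML (or str(soup)) instead. A sketch:

        raw = str(soup)   # or simply use 'oscread' directly
        findtags1 = re.compile('<h1 class="title metadata_title content_perceived_text(.*?)</h1>',
                               re.DOTALL | re.IGNORECASE).findall(raw)
        findtags2 = re.compile('<span class="content_text">(.*?)</span>',
                               re.DOTALL | re.IGNORECASE).findall(raw)

    That said, since the document is already parsed, something like soup.findAll('span', {'class': 'content_text'}) would pull the same spans without the regex fragility.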


  • Python BeautifulSoup Print Info in CSV

    - by Codin
    I can print the information I am pulling from a site with no problem. But when I try to place the street names in one column and the zip codes into another column in a CSV file, that is when I run into problems. All I get in the CSV is the two column names, with everything else in its own column across the page. Here is my code. I am using Python 2.7.5 and Beautiful Soup 4.

        from bs4 import BeautifulSoup
        import csv
        import urllib2

        url = "http://www.conakat.com/states/ohio/cities/defiance/road_maps/"
        page = urllib2.urlopen(url)
        soup = BeautifulSoup(page.read())

        f = csv.writer(open("Defiance Steets1.csv", "w"))
        f.writerow(["Name", "ZipCodes"])  # write column headers as the first line

        links = soup.find_all(['i','a'])
        for link in links:
            names = link.contents[0]
            print unicode(names)
            f.writerow(names)
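
    A hedged diagnosis: csv.writer's writerow() expects a sequence of fields, so handing it a single string makes every character its own column. Writing each street name and zip code together as a two-element list should give the two intended columns. A sketch, assuming street names arrive in <a> tags and each matching zip code follows in an <i> tag (an assumption about this page's markup, not something verified):

        row = []
        for link in links:
            text = link.contents[0]
            if link.name == 'a':   # assumed: street name
                row = [text]
            elif row:              # assumed: zip code in the following <i>
                row.append(text)
                f.writerow(row)    # one ["Name", "ZipCode"] pair per line
                row = []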


  • "'" is displayed as u0027 in facebook app - how to fix that?

    - by Imageree
    I have a Facebook app that displays random quotes; it is written in PHP. One quote in the database looks like this:

        "There's only one rule in photography - never develop colour film in chicken noodle soup. - Freeman Patterson"

    When it is seen on Facebook it looks like this:

        "Thereu0027s only one rule in photography - never develop colour film in chicken noodle soup. - Freeman Patterson"

    How do I fix it?


  • Extracting value in Beautifulsoup

    - by Seth
    I have the following code:

        f = open(path, 'r')
        html = f.read()  # no parameters => reads to EOF and returns a string
        soup = BeautifulSoup(html)
        schoolname = soup.findAll(attrs={'id':'ctl00_ContentPlaceHolder1_SchoolProfileUserControl_SchoolHeaderLabel'})
        print schoolname

    which gives:

        [<span id="ctl00_ContentPlaceHolder1_SchoolProfileUserControl_SchoolHeaderLabel">A B Paterson College, Arundel, QLD</span>]

    When I try to access the value (i.e. 'A B Paterson College, Arundel, QLD') by using schoolname['value'], I get the following error:

        print schoolname['value']
        TypeError: list indices must be integers, not str

    What am I doing wrong to get that value?
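
    A hedged explanation: findAll always returns a list (hence the square brackets in the printout), so schoolname['value'] tries to index a list with a string. The school name is also not a 'value' attribute; it is the text between the tags. A sketch:

        # Index into the result list first, then take the tag's text
        print schoolname[0].string   # u'A B Paterson College, Arundel, QLD'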


  • Annoying Blank pop-up window go away

    - by No Soup for YOU
    Hi all. Sorry about the wording of my question title. I have a basic HTML anchor tag that, when clicked, is supposed to bring up a dialog box to download a file from a different website. I am using an attribute of target="_blank" so that when my hyperlink is clicked, I don't navigate away from my main window. This is all the easy part (if it were all easy, I wouldn't be here, though). When I do the above and click on the hyperlink, an annoying blank window pops up with my download dialog box behind it. How do I get rid of that annoying blank window and keep only my download dialog box on the screen? Below is the HTML I'm working with:

        <a href="http://www.fake-domain-name.com/downloads/setup.msi" target="_blank">
            <img src="images/download.png" alt="download file"/>
        </a>

