Search Results

Search found 3241 results on 130 pages for 'extract'.

Page 82 of 130

  • Error while trying to parse a website URL using Python. How do I debug it?

    - by mekasperasky
        #!/usr/bin/python
        import json
        import urllib
        from BeautifulSoup import BeautifulSoup
        from BeautifulSoup import BeautifulStoneSoup
        import BeautifulSoup

        def showsome(searchfor):
            query = urllib.urlencode({'q': searchfor})
            url = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&%s' % query
            search_response = urllib.urlopen(url)
            search_results = search_response.read()
            results = json.loads(search_results)
            data = results['responseData']
            print 'Total results: %s' % data['cursor']['estimatedResultCount']
            hits = data['results']
            print 'Top %d hits:' % len(hits)
            for h in hits:
                print ' ', h['url']
                resp = urllib.urlopen(h['url'])
                res = resp.read()
                soup = BeautifulSoup(res)
                print soup.prettify()
            print 'For more results, see %s' % data['cursor']['moreResultsUrl']

        showsome('sachin')

    What is wrong in this code? Note that I take all four links I get out of the search, feed each one back in to extract its contents, and then use BeautifulSoup to parse them. How should I go about it?
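
    A likely culprit, offered as a guess rather than a confirmed diagnosis: the final "import BeautifulSoup" rebinds the name BeautifulSoup from the class (imported two lines earlier) to the module itself, so the later call BeautifulSoup(res) tries to call a module and raises TypeError: 'module' object is not callable. A minimal sketch of the cleaned-up imports, with fetch_and_parse as a hypothetical helper name:

        # Keep only the class import; dropping the bare "import BeautifulSoup"
        # stops the module from shadowing the class of the same name.
        import json
        import urllib
        from BeautifulSoup import BeautifulSoup

        def fetch_and_parse(url):
            html = urllib.urlopen(url).read()
            return BeautifulSoup(html)  # the class again, not the module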

  • A specific string format with a number and character together representing a certain item

    - by sil3nt
    Hello there, I have a string that looks like this: "a 3e,6s,1d,3g,22r,7c 3g,5r,9c 19.3". How do I go through it and extract the integers and assign each one to its corresponding letter variable? (I have the integer variables d, r, e, g, s and c.) The first letter in the string represents a function; "3e,6s,1d,3g,22r,7c" and "3g,5r,9c" are two separate containers; and the last decimal value represents a number which needs to be broken down into those variable amounts. My problem is extracting the integers together with the letters after them and assigning them to their corresponding letter variables. Any number with a negative sign, or with a space between the number and the letter, is invalid. How on earth do I do this?
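
    A minimal sketch of one way to handle a single container, assuming Python is acceptable and that only the six letters d, r, e, g, s and c occur; the validity rule is enforced up front, so a stray minus sign or an embedded space rejects the whole container:

        import re

        PAIR = re.compile(r'(\d+)([dregsc])')
        # The entire container must be comma-separated digit+letter pairs;
        # "-3e" or "3 e" anywhere makes it invalid.
        VALID = re.compile(r'\d+[dregsc](,\d+[dregsc])*$')

        def parse_counts(container):
            if not VALID.match(container):
                raise ValueError('invalid container: %r' % container)
            counts = dict.fromkeys('dregsc', 0)
            for num, letter in PAIR.findall(container):
                counts[letter] += int(num)
            return counts

        print(parse_counts('3e,6s,1d,3g,22r,7c'))
        # {'d': 1, 'r': 22, 'e': 3, 'g': 3, 's': 6, 'c': 7}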

  • Keyword extraction in Python

    - by oliland
    I'm building a website in Django that needs to extract keywords from short (Twitter-like) messages. I've looked at packages like topia.termextract and NLTK, but both seem to be overkill for what I need to do. All I need to do is filter out words like "and", "or", "not" while keeping nouns and verbs that aren't conjunctions or other function words. Are there any "simpler" packages out there that can do this? EDIT: This needs to be done in near real time on a production website, so using a keyword-extraction service seems out of the question, based on their response times and request throttling.
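
    A minimal sketch of the "simpler" route, assuming a plain stopword list is enough; the set below is illustrative (NLTK's stopwords.words('english') is a ready-made superset you could load once at startup), and a set lookup per token is easily fast enough for near-real-time use:

        import re

        STOPWORDS = {'and', 'or', 'not', 'the', 'a', 'an', 'is', 'to', 'of', 'in'}

        def keywords(message):
            # Lowercase, split into word-ish tokens, drop the stopwords.
            words = re.findall(r"[a-z']+", message.lower())
            return [w for w in words if w not in STOPWORDS]

        print(keywords('Deploying Django and not looking back'))
        # ['deploying', 'django', 'looking', 'back']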

  • Select row data as ColumnName and Value

    - by Bobcat1506
    I have a history table and I need to select the values from this table in ColumnName/ColumnValue form. I am using SQL Server 2008 and I wasn't sure if I could use the PIVOT function to accomplish this. Below is a simplified example of what I need to accomplish.

    The table's schema is:

        CREATE TABLE TABLE1 (ID INT PRIMARY KEY, NAME VARCHAR(50))

    The "history" table's schema is:

        CREATE TABLE TABLE1_HISTORY (
            ID INT,
            NAME VARCHAR(50),
            TYPE VARCHAR(50),
            TRANSACTION_ID VARCHAR(50))

    Here is the data from TABLE1_HISTORY:

        ID  NAME  TYPE    TRANSACTION_ID
        1   Joe   INSERT  a
        1   Bill  UPDATE  b
        1   Bill  DELETE  c

    I need to extract the data from TABLE1_HISTORY into this format:

        TransactionId  Type    ColumnName  ColumnValue
        a              INSERT  ID          1
        a              INSERT  NAME        Joe
        b              UPDATE  ID          1
        b              UPDATE  NAME        Bill
        c              DELETE  ID          1
        c              DELETE  NAME        Bill

    Other than upgrading to Enterprise Edition and leveraging the built-in change tracking functionality, what is your suggestion for accomplishing this task?
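
    What's described is an unpivot (one output row per column) rather than a pivot. A hedged sketch of the query shape, run here against sqlite3 from Python so it's self-contained; the same SELECT ... UNION ALL works unchanged on SQL Server 2008, which also offers the UNPIVOT operator as an alternative:

        import sqlite3

        conn = sqlite3.connect(':memory:')
        conn.executescript("""
            CREATE TABLE TABLE1_HISTORY (
                ID INT, NAME VARCHAR(50), TYPE VARCHAR(50), TRANSACTION_ID VARCHAR(50));
            INSERT INTO TABLE1_HISTORY VALUES
                (1, 'Joe', 'INSERT', 'a'), (1, 'Bill', 'UPDATE', 'b'), (1, 'Bill', 'DELETE', 'c');
        """)

        # One SELECT per source column, glued with UNION ALL: each history row
        # becomes one output row per column, which is the requested layout.
        rows = conn.execute("""
            SELECT TRANSACTION_ID, TYPE, 'ID' AS ColumnName, CAST(ID AS TEXT) AS ColumnValue
              FROM TABLE1_HISTORY
            UNION ALL
            SELECT TRANSACTION_ID, TYPE, 'NAME', NAME FROM TABLE1_HISTORY
            ORDER BY TRANSACTION_ID, ColumnName
        """).fetchall()

        for row in rows:
            print(row)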

  • WPF legacy server call

    - by Shah Al
    Hi, we have a legacy application running Tomcat that publishes data in a simple HTML table. I have no control over the remote server publishing the data. I am looking to extract the data into a WPF desktop application and display it as a table. Is there any way a WPF application can make a URL call, get the result and parse the data? This would be similar to AJAX from JSP. Any thoughts/ideas? Please advise. Regards,

  • Preserve "long" spaces in PDFBox text extraction

    - by Thilo
    I am using PDFBox to extract text from a PDF. The PDF has a tabular structure, which is quite simple, and the columns are very widely spaced from each other. This works really well, except that all kinds of horizontal space get converted into a single space character, so that I cannot tell the columns apart anymore (space within words in a column looks just like space between columns). I appreciate that a general solution is very hard, but in this case the columns are really far apart, so having a simple differentiation between "long spaces" and "space between words" would be enough. Is there a way to tell PDFBox to turn horizontal whitespace of more than x inches into something other than a single space? A proportional approach (x inches become y spaces) would also work.

  • Custom attribute changes in .NET 4

    - by Sarah Vessels
    I recently upgraded a C# project from .NET 3.5 to .NET 4. I have a method that extracts all MSTest test methods from a given list of MethodBase instances. Its body looks like this:

        return null == methods || methods.Count() == 0
            ? null
            : from method in methods
              let testAttribute = Attribute.GetCustomAttribute(method, typeof(TestMethodAttribute))
              where null != testAttribute
              select method;

    This worked in .NET 3.5, but since upgrading my projects to .NET 4, this code always returns an empty list, even when given a list of methods containing a method that is marked with [TestMethod]. Did something change with custom attributes in .NET 4? Debugging, I found that calling GetCustomAttributesData() on the test method gives a list of two CustomAttributeData instances, which are described in Visual Studio 2010's 'Locals' window as:

        Microsoft.VisualStudio.TestTools.UnitTesting.DeploymentItemAttribute("myDLL.dll")
        Microsoft.VisualStudio.TestTools.UnitTesting.TestMethodAttribute() -- this is what I'm looking for

    When I call GetType() on that second CustomAttributeData instance, however, I get {Name = "CustomAttributeData" FullName = "System.Reflection.CustomAttributeData"} System.Type {System.RuntimeType}. How can I get TestMethodAttribute out of the CustomAttributeData, so that I can extract the test methods from a list of MethodBases?

  • Adjusting Timezone - Convert XML DateTime to SQL DateTime

    - by noob.spt
    We are using a TypedDataSet in our application. Data is passed to the procedure in the form of XML for insert/update. Now, after populating the DE with data, the datetime remains the same, though timezone information is added, as below:

        Date in DB:  2009-10-29 18:52:53.43
        Date in XML: 2009-10-29T18:52:53.43-05:00

    When I try to convert the XML value to a SQL DateTime, it adjusts by 5 hours and I get 2009-10-29 23:52:53.430 as the final output, which is wrong. I need a way to extract the datetime from the XML snippet below while ignoring the timezone. The XML is in the following format, with timezone offset -05:00:

        <Order>
            <EnteredDateTime>2009-10-29T18:52:53.43-05:00</EnteredDateTime>
        </Order>
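
    One hedged way to sidestep the adjustment, sketched in Python since the consuming stack isn't specified: treat the value as text and strip the offset before it ever reaches the DateTime conversion, so the local wall-clock time survives untouched. The helper name local_part is hypothetical:

        import re
        from datetime import datetime

        def local_part(xml_datetime):
            # Drop a trailing offset such as -05:00, +01:00 or Z,
            # keeping the wall-clock portion as-is.
            return re.sub(r'(Z|[+-]\d{2}:\d{2})$', '', xml_datetime)

        value = local_part('2009-10-29T18:52:53.43-05:00')  # '2009-10-29T18:52:53.43'
        parsed = datetime.strptime(value, '%Y-%m-%dT%H:%M:%S.%f')
        print(parsed)  # 2009-10-29 18:52:53.430000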

  • Caching the repository index in m2eclipse

    - by Titi Wangsa bin Damhore
    Every time I start with a fresh new workspace, m2eclipse downloads nexus-maven-repository-index.gz from the Maven central repository. This is good, but sometimes I just want to start a new workspace without waiting for the download. I tried copying the whole .metadata directory from an old workspace to the new one, but the list of Maven artifacts is still empty. Is there a way I can cache the index? Or at least download the file once, then copy/extract/repackage it so that m2eclipse thinks it has already downloaded it and lets me search for Maven artifacts? Or, the short version of the question: where, and in what format, is the "nexus-maven-repository-index.gz" file stored in the workspace?

  • How to break a series of git commits into patches for submission to another project

    - by krosenvold
    So I've been bashing away at my favorite open source project for quite some time, and it's time to submit issues with patches back. I have to regroup my commits more or less fully, and hopefully extract some pieces of code that can function as distinct patches, to avoid code-bombing. Currently I usually do something like this:

        1. rebase/squash everything to one commit, since the old ones often don't make sense as patches
        2. undo that commit
        3. start adding stuff that I think fits in one commit, using add/add -i
        4. commit
        5. stash the rest
        6. test that commit
        7. re-apply the stash and start from 3 until everything is accounted for

    It works, but is there a better way?

  • Linux - How do I get the block map of a given file and/or the free-space map of the partition?

    - by Inso Reiges
    Hello, I am on Linux and need to know either of two things:

    1) If I have a regular file on some filesystem on a partition under Linux, is there a way, from user space, to know the set of physical blocks that this file occupies on the drive? Or at least the set of the filesystem's clusters?

    2) Is there a way to get the same information about the whole free space of the given filesystem?

    In both cases I understand that if there is any way to extract this info, it will probably be totally unsafe and racy (anything could happen to that set of blocks between the time I see them and the time I act on them somehow). I also really don't want an implementation that has to know a lot about every filesystem.
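
    For question 1 there is a generic answer: the FIBMAP ioctl (and its newer cousin FIEMAP) asks the filesystem itself to map a logical block to a physical block, so no filesystem-specific knowledge is needed. A minimal sketch, assuming Linux and root privileges, with the ioctl numbers taken from linux/fs.h:

        import fcntl
        import os
        import struct

        FIBMAP = 1    # linux/fs.h: maps one logical block to a physical block
        FIGETBSZ = 2  # linux/fs.h: filesystem block size for this file

        def block_map(path):
            # FIBMAP needs CAP_SYS_RAWIO, i.e. typically root.
            fd = os.open(path, os.O_RDONLY)
            try:
                raw = fcntl.ioctl(fd, FIGETBSZ, struct.pack('i', 0))
                blocksize = struct.unpack('i', raw)[0]
                nblocks = (os.fstat(fd).st_size + blocksize - 1) // blocksize
                blocks = []
                for logical in range(nblocks):
                    res = fcntl.ioctl(fd, FIBMAP, struct.pack('i', logical))
                    blocks.append(struct.unpack('i', res)[0])  # 0 means a hole
                return blocks
            finally:
                os.close(fd)

        print(block_map('/etc/hostname'))

    For question 2 there is no equally generic hook; the free-space map usually means filesystem-specific tooling (dumpe2fs for ext*, for example), which runs into exactly the per-filesystem knowledge you want to avoid.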

  • jQuery: Extracting the hrefs from multiple links on a page

    - by Pete B
    Hi, I discovered that .attr() only applies to the first matched element on the page! So I've been trying to get the hrefs from all the matched elements on a page, but to no avail. Here's what I tentatively wrote:

        var thelinks = $("td a").each(function(){
            $(this).attr("href");
            document.write(thelinks);
        });

    I used document.write just to see what was going on, and I got a long list of "undefinedundefinedundefined". What I'm trying to do is extract the hrefs from each td a and then use AJAX to visit those pages and do other stuff. I can get it to work fine when it's dealing with just one link, but I can't figure out this multiple-elements thing. Any help rendered is appreciated; I'm a novice to the world of JavaScript and jQuery.

  • Parse and charset: why doesn't my script work?

    - by Rebol Tutorial
    I want to extract the attribute1 and attribute3 values only. I don't understand why charset doesn't seem to work in my case to "skip" any other attributes (attribute3 is not extracted as I would like):

        content: {<tag attribute1="valueattribute1" attribute2="valueattribute2" attribute3="valueattribute3"> </tag>
        <tag attribute2="valueattribute21" attribute1="valueattribute11" > </tag>
        }

        attribute1: [{attribute1="} copy valueattribute1 to {"} thru {"}]
        attribute3: [{attribute3="} copy valueattribute3 to {"} thru {"}]

        spacer: charset reduce [tab newline #" "]
        letter: complement spacer
        to-space: [some letter | end]

        attributes-rule: [
            (valueattribute1: none valueattribute3: none)
            [attribute1 | none] to-space [attribute3 | none]
            (print valueattribute1 print valueattribute3)
            |
            [attribute3 | none] to-space [attribute1 | none]
            (print valueattribute3 print valueattribute1 valueattribute1: none valueattribute3: none)
            |
            none
        ]

        rule: [any [to {<tag } thru {<tag } attributes-rule {>} to {</tag>} thru {</tag>}] to end]

        parse content rule

    The output is:

        >> parse content rule
        valueattribute1
        none
        == true

  • Generating an array of functions?

    - by Wordpressor
    I have a few PHP files filled with multiple functions. Let's call them functions1.php, functions2.php and functions3.php:

        function function_something( $atts ) {
            extract( something_atts( array(
                'foo' => 'bar',
                'bar' => 'foo',
            ), $atts ) );
            return 'something';
        }

    I'm loading these files within all_functions.php like this:

        require_once('functions1.php');
        require_once('functions2.php');
        require_once('functions3.php');

    I'm wondering if it's possible to create an array of all these functions and their attributes. I'm thinking about something like:

        function my_functions() {
            require_once('functions1.php');
            require_once('functions2.php');
            require_once('functions3.php');
        }

    ...and then some foreach loop, but I'm not sure what it should look like. I know this probably looks tricky, but I'm just curious whether I can list all my WordPress shortcodes without PHP's Reflection API :)

  • Download and write .tar.gz files without corruption.

    - by arbales
    I've tried numerous ways of downloading files, specifically .zip and .tar.gz, with Ruby and writing them to disk. I've found that the file appears to be the same as the reference (in size), but the archives refuse to extract. What I'm attempting now is:

        def download_request(url, filePath:path, progressIndicator:progressBar)
          file = File.open(path, "w+")
          begin
            Net::HTTP.get_response URI.parse(url) do |response|
              if response['Location'] != nil
                puts 'Direct to: ' + response['Location']
                return download_request(response['Location'], filePath:path, progressIndicator:progressBar)
              end
              # some stuff
              response.read_body do |segment|
                file.write(segment)
                # some progress stuff.
              end
            end
          ensure
            file.close
          end
        end

        download_request("http://github.com/jashkenas/coffee-script/tarball/master", filePath:"tarball.tar.gz", progressIndicator:nil)

    Thanks!
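
    A hedged guess at the culprit: File.open(path, "w+") opens the file in text mode, and on platforms that distinguish text from binary, newline translation silently corrupts archives even when the size looks right; opening with "wb" avoids this and is the safe declaration of intent everywhere. For comparison, a minimal binary-safe download sketch in Python:

        import shutil
        import urllib.request

        def download(url, path):
            # Stream the response straight into a file opened in binary
            # mode ('wb'); redirects, such as GitHub's tarball URL, are
            # followed automatically by urllib.
            with urllib.request.urlopen(url) as response, open(path, 'wb') as out:
                shutil.copyfileobj(response, out)

        download('https://github.com/jashkenas/coffee-script/tarball/master',
                 'tarball.tar.gz')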

  • How to store data without using a database, and how to retrieve it?

    - by Harikrishna
    I am parsing an HTML file to extract tabular information based on column names, and I want the user to supply the column names as input; the tabular information is then extracted according to those names. Where should I store the column names the user enters, and how do I retrieve them later? I don't want to use a database.
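
    A minimal sketch of the simplest alternative, a flat JSON file, shown in Python although the idea carries to any language; the file name column_names.json is a hypothetical choice:

        import json
        import os

        SETTINGS_FILE = 'column_names.json'

        def save_columns(columns):
            # Persist the user's column names between runs without a database.
            with open(SETTINGS_FILE, 'w') as f:
                json.dump(columns, f)

        def load_columns():
            if not os.path.exists(SETTINGS_FILE):
                return []
            with open(SETTINGS_FILE) as f:
                return json.load(f)

        save_columns(['Name', 'Price', 'Quantity'])
        print(load_columns())  # ['Name', 'Price', 'Quantity']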

  • Extracting Information from Images

    - by Khorkrak
    What are some fast and somewhat reliable ways to extract information about images? I've been tinkering with OpenCV, and so far it seems to be the best route, plus it has Python bindings. To be more specific, I'd like to determine what I can about what's in an image. The Haar face-detection and full-body-detection classifiers are great: now I can tell that there are most likely faces and/or people in the image, as well as roughly how many. Okay, what else? How about whether there are any buildings, and if so, what they seem to be: huts, office buildings, etc.? Is there sky visible, grass, trees, and so forth? From what I've read about training classifiers to detect objects, it seems like a rather laborious process: 10,000 or so wrong images and 5,000 or so correct samples to train a classifier. I'm hoping there are some decent ones around already, instead of my having to do this all myself for a bunch of different objects. Or is there some other way to go about this sort of thing?
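
    A minimal sketch of the pretrained-classifier route with OpenCV's Python bindings; the face cascade below ships with OpenCV itself, so no training is involved, and photo.jpg is a hypothetical input:

        import cv2

        # One of the Haar cascades bundled with OpenCV.
        cascade_path = cv2.data.haarcascades + 'haarcascade_frontalface_default.xml'
        face_cascade = cv2.CascadeClassifier(cascade_path)

        img = cv2.imread('photo.jpg')
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

        # Returns one (x, y, w, h) rectangle per detected face.
        faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        print('Probably %d face(s) in this image' % len(faces))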

  • Reassembling HTTP packets with Perl and parsing them

    - by johnny2
    I am using the Net::Pcap module to capture packets with this filter:

        dst $my_host and dst port 80

    Inside the Net::Pcap loop I use the callback function below:

        Net::Pcap::loop($pcap_t, -1, \&my_callback, '')

    where my_callback looks like this:

        sub my_callback {
            my ($user_data, $header, $packet) = @_;
            # Strip Ethernet, IP and TCP headers
            my $ether_data = NetPacket::Ethernet::strip($packet);
            my $ip  = NetPacket::IP->decode($ether_data);
            my $tcp = NetPacket::TCP->decode($ip->{'data'});
        }

    Could someone help me with how to reassemble the HTTP packets into one and extract the header?

  • URL open encoding

    - by badc0re
    I have the following code using urllib and BeautifulSoup:

        getSite = urllib.urlopen(pageName)           # open the current site
        getSitesoup = BeautifulSoup(getSite.read())  # read the site content
        print getSitesoup.originalEncoding
        for value in getSitesoup.find_all('link'):   # extract all <link> tags
            defLinks.append(value.get('href'))

    The result of it:

        /usr/lib/python2.6/site-packages/bs4/dammit.py:231: UnicodeWarning: Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.
          "Some characters could not be decoded, and were "

    And when I try to read the site I get:

        ?7?e????0*"I??G?H????F??????9-??????;??E?YÞBs????????????4i???)?????^W?????`w?Ke??%??*9?.'OQB???V??@?????]???(P??^??q?$?S5???tT*?Z
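
    A hedged sketch of one likely fix: binary-looking junk like the output above often means the body is still gzip-compressed, or is being decoded with the wrong charset. Decompress if the Content-Encoding header says gzip, and hand any declared charset to BeautifulSoup instead of letting it guess:

        import gzip
        import io
        import urllib

        from bs4 import BeautifulSoup

        response = urllib.urlopen(page_name)  # page_name as in the question
        raw = response.read()

        # Decompress if the server gzip-compressed the body.
        if response.info().get('Content-Encoding') == 'gzip':
            raw = gzip.GzipFile(fileobj=io.BytesIO(raw)).read()

        # Prefer the charset declared in the HTTP headers over guessing.
        charset = response.info().getparam('charset')
        soup = BeautifulSoup(raw, from_encoding=charset)
        print(soup.original_encoding)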

  • Search filenames in MySQL database table restricted by filetype?

    - by ju
    Hello, I have a MySQL database that I replicate from another server. The database contains a table with the columns ID, FileName and FileSize, and the table holds more than 4,000,000 records. I want to make searches on the FileName (varchar) column fast, and I found that I can use the Sphinx search engine for this. The problem is that I also want to restrict searches by filetype. Do I have to extract the file extensions for all rows, and if so, how (triggers?)? Maybe I have to create another table (because this one is replicated) and join them in a 1:1 relation? Can you give me some advice please :)
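
    A minimal sketch of the side-table idea, illustrated with sqlite3 from Python so it runs as-is; the same statements translate directly to MySQL, where the extension table can live outside the replicated one, joined 1:1 on ID (and kept current by a trigger or the replication consumer). Table and column names are hypothetical:

        import os
        import sqlite3

        conn = sqlite3.connect(':memory:')
        conn.executescript("""
            CREATE TABLE files (id INTEGER PRIMARY KEY, filename TEXT, filesize INTEGER);
            INSERT INTO files VALUES (1, 'report.pdf', 1024), (2, 'song.mp3', 2048);
            -- Side table, 1:1 with files, so the replicated table stays untouched.
            CREATE TABLE file_ext (id INTEGER PRIMARY KEY, ext TEXT);
        """)

        # Backfill once: compute each extension in application code.
        for file_id, filename in conn.execute('SELECT id, filename FROM files'):
            ext = os.path.splitext(filename)[1].lstrip('.').lower()
            conn.execute('INSERT INTO file_ext VALUES (?, ?)', (file_id, ext))

        # Restrict a filename search by filetype via the join.
        rows = conn.execute("""
            SELECT f.filename FROM files f
            JOIN file_ext e ON e.id = f.id
            WHERE e.ext = 'pdf'
        """).fetchall()
        print(rows)  # [('report.pdf',)]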

  • Retrieving datatypes from underlying database

    - by H4mm3rHead
    Hi, I'm making an application that displays information about an underlying database. The database can be anything, but is typically Oracle, MSSQL or MySQL. I am trying to extract the datatypes of the columns, but cannot seem to get this right. I have a DbConnection, because I don't know in advance whether I need an OleDbConnection or an OdbcConnection. On this connection I run a GetSchema("Columns", "mytablename") query and get the result back. It seems, though, that there are some inconsistencies in the datatypes, or rather that the query returns different datatypes for the different databases. For instance, for my MSSQL database the query returns an integer (which seems to be the OleDbType), which I map to a datatype. My varchars then come out as type char, with no length, and this confuses me a bit. I guess my main question is: is there any uniform way of extracting datatypes across providers that yields an "accurate" representation of each datatype?

  • ASP.NET MVC Session across subdomains

    - by nccsbim071
    Hi, in my website I have implemented custom session values: on log-on I set a session value to an object, and this object is used to extract user-specific data from the DB. Now the problem is: if a user logs in at test1.somesite.com, logs off, and then logs in at test2.somesite.com, that user still receives the data from the object specific to test1.somesite.com. The point is, whichever subdomain the user first logs in with, if he then logs in with another subdomain he always gets the data from the previous subdomain's login. On log-out from a specific domain I cleared all the sessions (tried everything): setting HttpContext.Session["UserDetail"] = null, calling HttpContext.Session.Abandon(), and also HttpContext.Session.Clear(), but nothing seems to work. Any help please?

  • DB2 SQL pattern matching

    - by Jitesh
    I have a table in DB2 which has the following fields:

        int xyz;
        string myId;
        string myName;

    Example data set:

        xyz | myid        | myname
        ----+-------------+-------
        1   | ABC.123.456 | ABC
        2   | PRQS.12.34  | PQRS
        3   | ZZZ.3.2.2   | blah

    I want to extract the rows where myName matches the characters up to the first "." in the myId field. So from the above 3 rows I want the first 2 rows, since myName is present in myId before the ".". How can I do this in the query? Can I do some kind of pattern matching inside the query?
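
    A hedged sketch of the query shape, demonstrated with sqlite3 from Python so it runs as-is; DB2 supports the same || concatenation and LIKE, so the WHERE clause carries over directly. (Note that row 2's sample data is normalized here to 'PQRS', since 'PRQS' versus 'PQRS' as written would not actually match.)

        import sqlite3

        conn = sqlite3.connect(':memory:')
        conn.executescript("""
            CREATE TABLE t (xyz INT, myid TEXT, myname TEXT);
            INSERT INTO t VALUES (1, 'ABC.123.456', 'ABC'),
                                 (2, 'PQRS.12.34', 'PQRS'),
                                 (3, 'ZZZ.3.2.2', 'blah');
        """)

        # Keep rows whose myid starts with myname followed by a dot.
        rows = conn.execute("SELECT * FROM t WHERE myid LIKE myname || '.%'").fetchall()
        print(rows)  # [(1, 'ABC.123.456', 'ABC'), (2, 'PQRS.12.34', 'PQRS')]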

  • Sqlite3 Database versus populating Arrays

    - by Kenoy
    Hi, I am working on a program that requires me to input values for 12 objects, each with 4 arrays, each with 100 values (4,800 values in total). The 4 arrays represent possible outcomes based on 2 boolean values, i.e. YY, YN, NN, NY, and the 100 values in each array are what I want to extract based on another input variable. I previously had all the possible outcomes in a CSV file, and imported them into sqlite, where I can query for the value using SQL. However, it has been suggested to me that an sqlite database is not the way to go, and that I should instead populate hard-coded arrays. Which would be better at run time and for memory management?

  • Best way to store data for a Greasemonkey-based crawler?

    - by Björn
    I want to crawl a site with Greasemonkey and wonder if there is a better way to temporarily store values than GM_setValue. What I want to do is crawl my contacts in a social network and extract the Twitter URLs from their profile pages. My current plan is to open each profile in its own tab, so that it looks more like a normal browsing person (i.e. CSS, scripts and images will be loaded by the browser), then store the Twitter URL with GM_setValue. Once all the profile pages have been crawled, create a page using the stored values. I am not so happy with the storage option, though. Maybe there is a better way? I have considered inserting the user profiles into the current page so that I could process them all with the same script instance, but I am not sure whether XMLHttpRequest looks indistinguishable from normal user-initiated requests.
