Search Results

Search found 27205 results on 1089 pages for 'python imaging library'.

Page 138/1089 | < Previous Page | 134 135 136 137 138 139 140 141 142 143 144 145  | Next Page >

  • Python minidom and UTF-8 encoded XML with hash references

    - by Jakob Simon-Gaarde
    Hi I am experiencing some difficulty in my home project where I need to parse a SOAP request. The SOAP is generated with gSOAP and involves string parameters with special characters like the danish letters "æøå". gSOAP builds SOAP requests with UTF-8 encoding by default, but instead of sending the special chatacters in raw format (ie. bytes C3A6 for the special character "æ") it sends what I think is called character hash references (ie. &#195;&#166;). I don't completely understand why gSOAP does it this way as I can see that it has marked the incomming payload as being UTF-8 encoded anyway (Content-Type: text/xml; charset=utf-8), but this is besides the question (I think). Anyway I guess gSOAP probably is obeying transport rules, or what? When I parse the request from gSOAP in python with xml.dom.minidom.parseString() I get element values as unicode objects which is fine, but the character hash references are not decoded as UTF-8 character codes. It unescapes the character hash references, but does not decode the string afterwards. In the end I have a unicode string object with UTF-8 encoding: So if the string "æble" is contained in the XML, it comes like this in the request: "&#195;&#166;ble" After parsing the XML the unicode string in the DOM Text Node's data member looks like this: u'\xc3\xa6ble' I would expect it to look like this: u'\xe6ble' What am I doing wrong? Should I unescape the SOAP XML before parsing it, or is it somewhere else I should be looking for the solution, maybe gSOAP? Thanks in advance. Best regards Jakob Simon-Gaarde

    Read the article

  • python calendar to calculate month backwards

    - by Suhail
    Hi, we are trying to create a calendar function in python. we have created a small content management system, the requirement is, there will be a drop down list on the top right hand corner of the website, which will give the options - Months - 1 month, 2 months, 3 months and so on..., if the user selects 8 months then it should show the postscount for the last 8 months. the issue is we tried to write a small code which would do the month calculations, but the bug is that it does not consider the months beyond the current year, it shows the postscount only for months of the current year. for example: if the user selects 3 months, it will show the count for the l 3 months i.e present month and the previous 2 months, but if the user selects option more than 4 months, it does not consider the months from previous year, it still shows the month of the present year only. I am pasting the code below:- def __getSpecifiedMailCount__(request, value): dbconnector= DBConnector() CdateList= "select cdate from mail_records" DateNow= datetime.datetime.today() DateNow= DateNow.strftime("%Y-%m") DateYear= datetime.datetime.today() DateYear= DateYear.strftime("%Y") DateMonth= datetime.datetime.today() DateMonth= DateMonth.strftime("%m") #print DateMonth def getMonth(value): valueDic= {"01": "Jan", "02": "Feb", "03": "Mar", "04": "Apr", "05": "May", "06": "Jun", "07": "Jul", "08": "Aug", "09": "Sep", "10": "Oct", "11": "Nov", "12": "Dec"} return valueDic[value] def getMonthYearandCount(yearmonth): MailCount= "select count(*) as mailcount from mail_records where cdate like '%s%s'" % (yearmonth, "%") MailCountResult= MailCount[0]['mailcount'] return MailCountResult MailCountList= [] MCOUNT= getMonthYearandCount(DateNow) MONTH= getMonth(DateMonth) MailCountDict= {} MailCountDict['monthyear']= MONTH + ' ' + DateYear MailCountDict['mailcount']= MCOUNT var_monthyear= MONTH + ' ' + DateYear var_mailcount= MCOUNT MailCountList.append(MailCountDict) i=1 k= int(value) hereMONTH= int(DateMonth) while (i < k): hereMONTH= int(hereMONTH) - 1 if (hereMONTH < 10): hereMONTH = '0' + str(hereMONTH) if (hereMONTH == '00') or (hereMONTH == '0-1'): break else: PMONTH= getMonth(hereMONTH) hereDateNow= DateYear + '-' + PMONTH hereDateNowNum= DateYear + '-' + hereMONTH PMCOUNT= getMonthYearandCount(hereDateNowNum) MailCountDict= {} MailCountDict['monthyear']= PMONTH + ' ' + DateYear MailCountDict['mailcount']= PMCOUNT var_monthyear= PMONTH + ' ' + DateYear var_mailcount= PMCOUNT MailCountList.append(MailCountDict) i = i + 1 #print MailCountList MailCountDict= {'monthmailcount': MailCountList} reportdata = MailCountDict['monthmailcount'] #print reportdata return render_to_response('test.html', locals())

    Read the article

  • Python and csv help

    - by user353064
    I'm trying to create this script that will check the computer host name then search a master list for the value to return a corresponding value in the csv file. Then open another file and do a find an replace. I know this should be easy but haven't done so much in python before. Here is what I have so far... masterlist.txt (tab delimited) Name UID Bob-Smith.local bobs Carmen-Jackson.local carmenj David-Kathman.local davidk Jenn-Roberts.local jennr Here is the script that I have created thus far #GET CLIENT HOST NAME import socket host = socket.gethostname() print host #IMPORT MASTER DATA import csv, sys filename = "masterlist.txt" reader = csv.reader(open(filename, "rU")) #PRINT MASTER DATA for row in reader: print row #SEARCH ON HOSTNAME AND RETURN UID #REPLACE VALUE IN FILE WITH UID #import fileinput #for line in fileinput.FileInput("filetoreplace",inplace=1): # line = line.replace("replacethistext","UID") # print line Right now, it's just set to print the master list. I'm not sure if the list needs to be parsed and placed into a dictionary or what. I really need to figure out how to search the first field for the hostname and then return the field in the second column. Thanks in advance for your help, Aaron

    Read the article

  • Unit testing and mocking email sender in Python with Google AppEngine

    - by CVertex
    I'm a newbie to python and the app engine. I have this code that sends an email based on request params after some auth logic. in my Unit tests (i'm using GAEUnit), how do I confirm an email with specific contents were sent? - i.e. how do I mock the emailer with a fake emailer to verify send was called? class EmailHandler(webapp.RequestHandler): def bad_input(self): self.response.set_status(400) self.response.headers['Content-Type'] = 'text/plain' self.response.out.write("<html><body>bad input </body></html>") def get(self): to_addr = self.request.get("to") subj = self.request.get("subject") msg = self.request.get("body") if not mail.is_email_valid(to_addr): # Return an error message... # self.bad_input() pass # authenticate here message = mail.EmailMessage() message.sender = "[email protected]" message.to = to_addr message.subject = subj message.body = msg message.send() self.response.headers['Content-Type'] = 'text/plain' self.response.out.write("<html><body>success!</body></html>") And the unit tests, import unittest from webtest import TestApp from google.appengine.ext import webapp from email import EmailHandler class SendingEmails(unittest.TestCase): def setUp(self): self.application = webapp.WSGIApplication([('/', EmailHandler)], debug=True) def test_success(self): app = TestApp(self.application) response = app.get('http://localhost:8080/[email protected]&body=blah_blah_blah&subject=mySubject') self.assertEqual('200 OK', response.status) self.assertTrue('success' in response) # somehow, assert email was sent

    Read the article

  • Copy an entity in Google App Engine datastore in Python without knowing property names at 'compile'

    - by Gordon Worley
    In a Python Google App Engine app I'm writing, I have an entity stored in the datastore that I need to retrieve, make an exact copy of it (with the exception of the key), and then put this entity back in. How should I do this? In particular, are there any caveats or tricks I need to be aware of when doing this so that I get a copy of the sort I expect and not something else. ETA: Well, I tried it out and I did run into problems. I would like to make my copy in such a way that I don't have to know the names of the properties when I write the code. My thinking was to do this: #theThing = a particular entity we pull from the datastore with model Thing copyThing = Thing(user = user) for thingProperty in theThing.properties(): copyThing.__setattr__(thingProperty[0], thingProperty[1]) This executes without any errors... until I try to pull copyThing from the datastore, at which point I discover that all of the properties are set to None (with the exception of the user and key, obviously). So clearly this code is doing something, since it's replacing the defaults with None (all of the properties have a default value set), but not at all what I want. Suggestions?

    Read the article

  • python eval weirdness

    - by amadain
    Hi Folks I have the following code in one of my classes along with checks when the code does not eval: filterParam="self.recipientMSISDN==tmpBPSS.split('_')[3].split('#')[0] and self.recipientIMSI==tmpBPSS.split('_')[3].split('#')[1]" if eval(filterParam): print "Evalled" else: print "Not Evalled\nfilterParam\n'%s'\ntmpBPSS\n'%s'\nself.recipientMSISDN\n'%s'\nself.recipientIMSI\n'%s'" % (filterParam, tmpBPSS, self.recipientMSISDN, self.recipientIMSI) I am not getting anything to 'eval'. Here are the results: Not Evalled filterParam 'self.recipientMSISDN==tmpBPSS.split('_')[3].split('#')[0] and self.recipientIMSI==tmpBPSS.split('_')[3].split('#')[1]' tmpBPSS 'bprm_DAILY_MO_919844000039#892000000' self.recipientMSISDN '919844000039' self.recipientIMSI '892000000' So I used the outputs from the above to check the code in a python shell and as you can see the code evalled correctly: >>> filterParam="recipientMSISDN==tmpBPSS.split('_')[3].split('#')[0] and recipientIMSI==tmpBPSS.split('_')[3].split('#')[1]" >>> tmpBPSS='bprm_DAILY_MO_919844000039#892000000' >>> recipientMSISDN='919844000039' >>> recipientIMSI='892000000' >>> if eval(filterParam): ... print "Evalled" ... else: ... print "Not Evalled" ... Evalled Am I off my rocker or what am I missing? A

    Read the article

  • Python performance improvement request for winkler

    - by Martlark
    I'm a python n00b and I'd like some suggestions on how to improve the algorithm to improve the performance of this method to compute the Jaro-Winkler distance of two names. def winklerCompareP(str1, str2): """Return approximate string comparator measure (between 0.0 and 1.0) USAGE: score = winkler(str1, str2) ARGUMENTS: str1 The first string str2 The second string DESCRIPTION: As described in 'An Application of the Fellegi-Sunter Model of Record Linkage to the 1990 U.S. Decennial Census' by William E. Winkler and Yves Thibaudeau. Based on the 'jaro' string comparator, but modifies it according to whether the first few characters are the same or not. """ # Quick check if the strings are the same - - - - - - - - - - - - - - - - - - # jaro_winkler_marker_char = chr(1) if (str1 == str2): return 1.0 len1 = len(str1) len2 = len(str2) halflen = max(len1,len2) / 2 - 1 ass1 = '' # Characters assigned in str1 ass2 = '' # Characters assigned in str2 #ass1 = '' #ass2 = '' workstr1 = str1 workstr2 = str2 common1 = 0 # Number of common characters common2 = 0 #print "'len1', str1[i], start, end, index, ass1, workstr2, common1" # Analyse the first string - - - - - - - - - - - - - - - - - - - - - - - - - # for i in range(len1): start = max(0,i-halflen) end = min(i+halflen+1,len2) index = workstr2.find(str1[i],start,end) #print 'len1', str1[i], start, end, index, ass1, workstr2, common1 if (index > -1): # Found common character common1 += 1 #ass1 += str1[i] ass1 = ass1 + str1[i] workstr2 = workstr2[:index]+jaro_winkler_marker_char+workstr2[index+1:] #print "str1 analyse result", ass1, common1 #print "str1 analyse result", ass1, common1 # Analyse the second string - - - - - - - - - - - - - - - - - - - - - - - - - # for i in range(len2): start = max(0,i-halflen) end = min(i+halflen+1,len1) index = workstr1.find(str2[i],start,end) #print 'len2', str2[i], start, end, index, ass1, workstr1, common2 if (index > -1): # Found common character common2 += 1 #ass2 += str2[i] ass2 = ass2 + str2[i] workstr1 = workstr1[:index]+jaro_winkler_marker_char+workstr1[index+1:] if (common1 != common2): print('Winkler: Wrong common values for strings "%s" and "%s"' % \ (str1, str2) + ', common1: %i, common2: %i' % (common1, common2) + \ ', common should be the same.') common1 = float(common1+common2) / 2.0 ##### This is just a fix ##### if (common1 == 0): return 0.0 # Compute number of transpositions - - - - - - - - - - - - - - - - - - - - - # transposition = 0 for i in range(len(ass1)): if (ass1[i] != ass2[i]): transposition += 1 transposition = transposition / 2.0 # Now compute how many characters are common at beginning - - - - - - - - - - # minlen = min(len1,len2) for same in range(minlen+1): if (str1[:same] != str2[:same]): break same -= 1 if (same > 4): same = 4 common1 = float(common1) w = 1./3.*(common1 / float(len1) + common1 / float(len2) + (common1-transposition) / common1) wn = w + same*0.1 * (1.0 - w) return wn

    Read the article

  • Python unicode problem

    - by Somebody still uses you MS-DOS
    I'm receiving some data from a ZODB (Zope Object Database). I receive a mybrains object. Then I do: o = mybrains.getObject() and I receive a "Person" object in my project. Then, I can do b = o.name and doing print b on my class I get: José Carlos and print b.name.__class__ <type 'unicode'> I have a lot of "Person" objects. They are added to a list. names = [o.nome, o1.nome, o2.nome] Then, I trying to create a text file with this data. delimiter = ';' all = delimiter.join(names) + '\n' No problem. Now, when I do a print all I have: José Carlos;Jonas;Natália Juan;John But when I try to create a file of it: f = open("/tmp/test.txt", "w") f.write(all) I get an error like this (the positions aren't exaclty the same, since I change the names) UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 84: ordinal not in range(128) If I can print already with the "correct" form to display it, why I can't write a file with it? Which encode/decode method should I use to write a file with this data? I'm using Python 2.4.5 (can't upgrade it)

    Read the article

  • HTTPSConnection module missing in Python 2.6 on CentOS 5.2

    - by d2kagw
    Hi guys, I'm playing around with a Python application on CentOS 5.2. It uses the Boto module to communicate with Amazon Web Services, which requires communication through a HTTPS connection. When I try running my application I get an error regarding HTTPSConnection being missing: "AttributeError: 'module' object has no attribute 'HTTPSConnection'" Google doesn't really return anything relevant, I've tried most of the solutions but none of them solve the problem. Has anyone come across anything like it? Here's the traceback: Traceback (most recent call last): File "./chatter.py", line 114, in <module> sys.exit(main()) File "./chatter.py", line 92, in main chatter.status( ) File "/mnt/application/chatter/__init__.py", line 161, in status cQueue.connect() File "/mnt/application/chatter/tools.py", line 42, in connect self.connection = SQSConnection(cConfig.get("AWS", "KeyId"), cConfig.get("AWS", "AccessKey")); File "/usr/local/lib/python2.6/site-packages/boto-1.7a-py2.6.egg/boto/sqs/connection.py", line 54, in __init__ self.region.endpoint, debug, https_connection_factory) File "/usr/local/lib/python2.6/site-packages/boto-1.7a-py2.6.egg/boto/connection.py", line 418, in __init__ debug, https_connection_factory) File "/usr/local/lib/python2.6/site-packages/boto-1.7a-py2.6.egg/boto/connection.py", line 189, in __init__ self.refresh_http_connection(self.server, self.is_secure) File "/usr/local/lib/python2.6/site-packages/boto-1.7a-py2.6.egg/boto/connection.py", line 247, in refresh_http_connection connection = httplib.HTTPSConnection(host) AttributeError: 'module' object has no attribute 'HTTPSConnection'

    Read the article

  • A better python property decorator

    - by leChuck
    I've inherited some python code that contains a rather cryptic decorator. This decorator sets properties in classes all over the project. The problem is that this I have traced my debugging problems to this decorator. Seems it "fubars" all debuggers I've tried and trying to speed up the code with psyco breaks everthing. (Seems psyco and this decorator dont play nice). I think it would be best to change it. def Property(function): """Allow readable properties""" keys = 'fget', 'fset', 'fdel' func_locals = {'doc':function.__doc__} def probeFunc(frame, event, arg): if event == 'return': locals = frame.f_locals func_locals.update(dict((k,locals.get(k)) for k in keys)) sys.settrace(None) return probeFunc sys.settrace(probeFunc) function() return property(**func_locals) Used like so: class A(object): @Property def prop(): def fget(self): return self.__prop def fset(self, value): self.__prop = value ... ect The errors I get say the problems are because of sys.settrace. (Perhaps this is abuse of settrace ?) My question: Is the same decorator achievable without sys.settrace. If not I'm in for some heavy rewrites.

    Read the article

  • Computing complex math equations in python

    - by dassouki
    Are there any libraries or techniques that simplify computing equations ? Take the following two examples: F = B * { [ a * b * sumOf (A / B ''' for all i ''' ) ] / [ sumOf(c * d * j) ] } where: F = cost from i to j B, a, b, c, d, j are all vectors in the format [ [zone_i, zone_j, cost_of_i_to_j], [..]] This should produce a vector F [ [1,2, F_1_2], ..., [i,j, F_i_j] ] T_ij = [ P_i * A_i * F_i_j] / [ SumOf [ Aj * F_i_j ] // j = 1 to j = n ] where: n is the number of zones T = vector [ [1, 2, A_1_2, P_1_2], ..., [i, j, A_i_j, P_i_j] ] F = vector [1, 2, F_1_2], ..., [i, j, F_i_j] so P_i would be the sum of all P_i_j for all j and Aj would be sum of all P_j for all i I'm not sure what I'm looking for, but perhaps a parser for these equations or methods to deal with multiple multiplications and products between vectors? To calculate some of the factors, for example A_j, this is what i use from collections import defaultdict A_j_dict = defaultdict(float) for A_item in TG: A_j_dict[A_item[1]] += A_item[3] Although this works fine, I really feel that it is a brute force / hacking method and unmaintainable in the case we want to add more variables or parameters. Are there any math equation parsers you'd recommend? Side Note: These equations are used to model travel. Currently I use excel to solve a lot of these equations; and I find that process to be daunting. I'd rather move to python where it pulls the data directly from our database (postgres) and outputs the results into the database. All that is figured out. I'm just struggling with evaluating the equations themselves. Thanks :)

    Read the article

  • Python print statement prints nothing with a carriage return

    - by Jonathan Sternberg
    I'm trying to write a simple tool that reads files from disc, does some image processing, and returns the result of the algorithm. Since the program can sometimes take awhile, I like to have a progress bar so I know where it is in the program. And since I don't like to clutter up my command line and I'm on a Unix platform, I wanted to use the '\r' character to print the progress bar on only one line. But when I have this code here, it prints nothing. # Files is a list with the filenames for i, f in enumerate(files): print '\r%d / %d' % (i, len(files)), # Code that takes a long time I have also tried: print '\r', i, '/', len(files), Now just to make sure this worked in python, I tried this: heartbeat = 1 while True: print '\rHello, world', heartbeat, heartbeat += 1 This code works perfectly. What's going on? My understanding of carriage returns on Linux was that it would just move the line feed character to the beginning and then I could overwrite old text that was written previously, as long as I don't print a newline anywhere. This doesn't seem to be happening though. Also, is there a better way to display a progress bar in a command line than what I'm current trying to do?

    Read the article

  • Problem with python urllib

    - by mudder
    I'm getting an error when ever I try to pull down a web page with urllib.urlopen. I've disabled windows firewall and my AV so its not that. I can access the pages in my browser. I even reinstalled python to rule out it being a broken urllib. Any help would be greatly appreciated. >>> import urllib >>> h = urllib.urlopen("http://www.google.com").read() Traceback (most recent call last): File "<pyshell#1>", line 1, in <module> h = urllib.urlopen("http://www.google.com").read() File "C:\Python26\lib\urllib.py", line 86, in urlopen return opener.open(url) File "C:\Python26\lib\urllib.py", line 205, in open return getattr(self, name)(url) File "C:\Python26\lib\urllib.py", line 344, in open_http h.endheaders() File "C:\Python26\lib\httplib.py", line 904, in endheaders self._send_output() File "C:\Python26\lib\httplib.py", line 776, in _send_output self.send(msg) File "C:\Python26\lib\httplib.py", line 735, in send self.connect() File "C:\Python26\lib\httplib.py", line 716, in connect self.timeout) File "C:\Python26\lib\socket.py", line 514, in create_connection raise error, msg IOError: [Errno socket error] [Errno 10061] No connection could be made because the target machine actively refused it >>>

    Read the article

  • Nicely representing a floating-point number in python

    - by dln385
    I want to represent a floating-point number as a string rounded to some number of significant digits, and never using the exponential format. Essentially, I want to display any floating-point number and make sure it “looks nice”. There are several parts to this problem: I need to be able to specify the number of significant digits. The number of significant digits needs to be variable, which can't be done with with the string formatting operator. I need it to be rounded the way a person would expect, not something like 1.999999999999 I've figured out one way of doing this, though it looks like a work-round and it's not quite perfect. (The maximum precision is 15 significant digits.) >>> def f(number, sigfig): return ("%.15f" % (round(number, int(-1 * floor(log10(number)) + (sigfig - 1))))).rstrip("0").rstrip(".") >>> print f(0.1, 1) 0.1 >>> print f(0.0000000000368568, 2) 0.000000000037 >>> print f(756867, 3) 757000 Is there a better way to do this? Why doesn't Python have a built-in function for this?

    Read the article

  • Parsing email with Python

    - by Manuel Ceron
    I'm writing a Python script to process emails returned from Procmail. As suggested in this question, I'm using the following Procmail config: :0: |$HOME/process_mail.py My process_mail.py script is receiving an email via stdin like this: From hostname Tue Jun 15 21:43:30 2010 Received: (qmail 8580 invoked from network); 15 Jun 2010 21:43:22 -0400 Received: from mail-fx0-f44.google.com (209.85.161.44) by ip-73-187-35-131.ip.secureserver.net with SMTP; 15 Jun 2010 21:43:22 -0400 Received: by fxm19 with SMTP id 19so170709fxm.3 for <[email protected]>; Tue, 15 Jun 2010 18:47:33 -0700 (PDT) MIME-Version: 1.0 Received: by 10.103.84.1 with SMTP id m1mr2774225mul.26.1276652853684; Tue, 15 Jun 2010 18:47:33 -0700 (PDT) Received: by 10.123.143.4 with HTTP; Tue, 15 Jun 2010 18:47:33 -0700 (PDT) Date: Tue, 15 Jun 2010 20:47:33 -0500 Message-ID: <[email protected]> Subject: TEST 12 From: Full Name <[email protected]> To: [email protected] Content-Type: text/plain; charset=ISO-8859-1 ONE TWO THREE I'm trying to parse the message in this way: >>> import email >>> msg = email.message_from_string(full_message) I want to get message fields like 'From', 'To' and 'Subject'. However, the message object does not contain any of these fields. What am I doing wrong?

    Read the article

  • Python: Networked IDLE?

    - by Rosarch
    Is there any existing web app that lets multiple users work with an interactive IDLE type session at once? Something like: IDLE 2.6.4 Morgan: >>> letters = list("abcdefg") Morgan: >>> # now, how would you iterate over letters? Jack: >>> for char in letters: print "char %s" % char char a char b char c char d char e char f char g Morgan: >>> # nice nice If not, I would like to create one. Is there some module I can use that simulates an interactive session? I'd want an interface like this: def class InteractiveSession(): ''' An interactive Python session ''' def putLine(line): ''' Evaluates line ''' pass def outputLines(): ''' A list of all lines that have been output by the session ''' pass def currentVars(): ''' A dictionary of currently defined variables and their values ''' pass (Although that last function would be more of an extra feature.) To formulate my problem another way: I'd like to create a new front end for IDLE. How can I do this?

    Read the article

  • Radix Sort in Python [on hold]

    - by Steven Ramsey
    I could use some help. How would you write a program in python that implements a radix sort? Here is some info: A radix sort for base 10 integers is a based on sorting punch cards, but it turns out the sort is very ecient. The sort utilizes a main bin and 10 digit bins. Each bin acts like a queue and maintains its values in the order they arrive. The algorithm begins by placing each number in the main bin. Then it considers the ones digit for each value. The rst value is removed and placed in the digit bin corresponding to the ones digit. For example, 534 is placed in digit bin 4 and 662 is placed in the digit bin 2. Once all the values in the main bin are placed in the corresponding digit bin for ones, the values are collected from bin 0 to bin 9 (in that order) and placed back in the main bin. The process continues with the tens digit, the hundreds, and so on. After the last digit is processed, the main bin contains the values in order. Use randint, found in random, to create random integers from 1 to 100000. Use a list comphrension to create a list of varying sizes (10, 100, 1000, 10000, etc.). To use indexing to access the digits rst convert the integer to a string. For this sort to work, all numbers must have the same number of digits. To zero pad integers with leading zeros, use the string method str.zfill(). Once main bin is sorted, convert the strings back to integers. I'm not sure how to start this, Any help is appreciated. Thank you.

    Read the article

  • reading csv files in scipy/numpy in Python

    - by user248237
    I am having trouble reading a csv file, delimited by tabs, in python. I use the following function: def csv2array(filename, skiprows=0, delimiter='\t', raw_header=False, missing=None, with_header=True): """ Parse a file name into an array. Return the array and additional header lines. By default, parse the header lines into dictionaries, assuming the parameters are numeric, using 'parse_header'. """ f = open(filename, 'r') skipped_rows = [] for n in range(skiprows): header_line = f.readline().strip() if raw_header: skipped_rows.append(header_line) else: skipped_rows.append(parse_header(header_line)) f.close() if missing: data = genfromtxt(filename, dtype=None, names=with_header, deletechars='', skiprows=skiprows, missing=missing) else: if delimiter != '\t': data = genfromtxt(filename, dtype=None, names=with_header, delimiter=delimiter, deletechars='', skiprows=skiprows) else: data = genfromtxt(filename, dtype=None, names=with_header, deletechars='', skiprows=skiprows) if data.ndim == 0: data = array([data.item()]) return (data, skipped_rows) the problem is that genfromtxt complains about my files, e.g. with the error: Line #27100 (got 12 columns instead of 16) I am not sure where these errors come from. Any ideas? Here's an example file that causes the problem: #Gene 120-1 120-3 120-4 30-1 30-3 30-4 C-1 C-2 C-5 genesymbol genedesc ENSMUSG00000000001 7.32 9.5 7.76 7.24 11.35 8.83 6.67 11.35 7.12 Gnai3 guanine nucleotide binding protein alpha ENSMUSG00000000003 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 Pbsn probasin Is there a better way to write a generic csv2array function? thanks.

    Read the article

  • Python many-to-one mapping (creating equivalence classes)

    - by Adam Matan
    Hi, I have a project of converting one database to another. One of the original database columns defines the row's category. This coulmn should be mapepd to a new category in the new databse. For example, let's assume the original categories are:parrot, spam, cheese_shop, Cleese, Gilliam, Palin Now that's a little verbose for me, And I want to have these rows categorized as sketch, actor - That is, define all the sketches and all the actors as two equivalence classes. >>> monty={'parrot':'sketch', 'spam':'sketch', 'cheese_shop':'sketch', 'Cleese':'actor', 'Gilliam':'actor', 'Palin':'actor'} >>> monty {'Gilliam': 'actor', 'Cleese': 'actor', 'parrot': 'sketch', 'spam': 'sketch', 'Palin': 'actor', 'cheese_shop': 'sketch'} That's quite awkward- I would prefer having something like: monty={ ('parrot','spam','cheese_shop'): 'sketch', ('Cleese', 'Gilliam', 'Palin') : 'actors'} But this, of course, sets the entire tuple as a key: >>> monty['parrot'] Traceback (most recent call last): File "<pyshell#29>", line 1, in <module> monty['parrot'] KeyError: 'parrot' Any ideas how to create an elegant many-to-one dictionary in Python? Thanks, Adam

    Read the article

  • Python access an object byref / Need tagging

    - by Aaron C. de Bruyn
    I need to suck data from stdin and create a object. The incoming data is between 5 and 10 lines long. Each line has a process number and either an IP address or a hash. For example: pid=123 ip=192.168.0.1 - some data pid=123 hash=ABCDEF0123 - more data hash=ABCDEF123 - More data ip=192.168.0.1 - even more data I need to put this data into a class like: class MyData(): pid = None hash = None ip = None lines = [] I need to be able to look up the object by IP, HASH, or PID. The tough part is that there are multiple streams of data intermixed coming from stdin. (There could be hundreds or thousands of processes writing data at the same time.) I have regular expressions pulling out the PID, IP, and HASH that I need, but how can I access the object by any of those values? My thought was to do something like this: myarray = {} for each line in sys.stdin.readlines(): if pid and ip: #If we can get a PID out of the line myarray[pid] = MyData().pid = pid #Create a new MyData object, assign the PID, and stick it in myarray accessible by PID. myarray[pid].ip = ip #Add the IP address to the new object myarray[pid].lines.append(data) #Append the data myarray[ip] = myarray[pid] #Take the object by PID and create a key from the IP. <snip>do something similar for pid and hash, hash and ip, etc...</snip> This gives my an array with two keys (a PID and an IP) and they both point to the same object. But on the next iteration of the loop, if I find (for example) an IP and HASH and do: myarray[hash] = myarray[ip] The following is False: myarray[hash] == myarray[ip] Hopefully that was clear. I hate to admit that waaay back in the VB days, I remember being able handle objects byref instead of byval. Is there something similar in Python? Or am I just approaching this wrong?

    Read the article

  • Figure out if element is present in multi-dimensional array in python

    - by Terje
    I am parsing a log containing nicknames and hostnames. I want to end up with an array that contains the hostname and the latest used nickname. I have the following code, which only creates a list over the hostnames: hostnames = [] # while(parsing): # nick = nick_on_current_line # host = host_on_current_line if host in hostnames: # Hostname is already present. pass else: # Hostname is not present hostnames.append(host) print hostnames # ['[email protected]', '[email protected]', '[email protected]'] I thought it would be nice to end up with something along the lines of the following: # [['[email protected]', 'John'], ['[email protected]', 'Mary'], ['[email protected]', 'Joe']] My problem is finding out if the hostname is present in such a list hostnames = [] # while(parsing): # nick = nick_on_current_line # host = host_on_current_line if host in hostnames[0]: # This doesn't work. # Hostname is already present. # Somehow check if the nick stored together # with the hostname is the latest one else: # Hostname is not present hostnames.append([host, nick]) Are there any easy fix to this, or should I try a different approach? I could always have an array with objects or structs (if there is such a thing in python), but I would prefer a solution to my array problem.

    Read the article

  • Python socket error on UDP data receive. (10054)

    - by Charles
    I currently have a problem using UDP and Python socket module. We have a server and clients. The problem occurs when we send data to a user. It's possible that user may have closed their connection to the server through a client crash, disconnect by ISP, or some other improper method. As such, it is possible to send data to a closed socket. Of course with UDP you can't tell if the data really reached or if it's closed, as it doesn't care (atleast, it doesn't bring up an exception). However, if you send data and it is closed off, you get data back somehow (???), which ends up giving you a socket error on sock.recvfrom. [Errno 10054] An existing connection was forcibly closed by the remote host. Almost seems like an automatic response from the connection. Although this is fine, and can be handled by a try: except: block (even if it lowers performance of the server a little bit). The problem is, I can't tell who this is coming from or what socket is closed. Is there anyway to find out 'who' (ip, socket #) sent this? It would be great as I could instantly just disconnect them and remove them from the data. Any suggestions? Thanks.

    Read the article

  • Quickly determine if a number is prime in Python for numbers < 1 billion

    - by Frór
    Hi, My current algorithm to check the primality of numbers in python is way to slow for numbers between 10 million and 1 billion. I want it to be improved knowing that I will never get numbers bigger than 1 billion. The context is that I can't get an implementation that is quick enough for solving problem 60 of project Euler: I'm getting the answer to the problem in 75 seconds where I need it in 60 seconds. http://projecteuler.net/index.php?section=problems&id=60 I have very few memory at my disposal so I can't store all the prime numbers below 1 billion. I'm currently using the standard trial division tuned with 6k±1. Is there anything better than this? Do I already need to get the Rabin-Miller method for numbers that are this large. primes_under_100 = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97] def isprime(n): if n <= 100: return n in primes_under_100 if n % 2 == 0 or n % 3 == 0: return False for f in range(5, int(n ** .5), 6): if n % f == 0 or n % (f + 2) == 0: return False return True How can I improve this algorithm?

    Read the article

  • Python multiprocessing doesn't play nicely with uuid.uuid4().

    - by yig
    I'm trying to generate a uuid for a filename, and I'm also using the multiprocessing module. Unpleasantly, all of my uuids end up exactly the same. Here is a small example: import multiprocessing import uuid def get_uuid( a ): ## Doesn't help to cycle through a bunch. #for i in xrange(10): uuid.uuid4() ## Doesn't help to reload the module. #reload( uuid ) ## Doesn't help to load it at the last minute. ## (I simultaneously comment out the module-level import). #import uuid ## uuid1() does work, but it differs only in the first 8 characters and includes identifying information about the computer. #return uuid.uuid1() return uuid.uuid4() def main(): pool = multiprocessing.Pool( 20 ) uuids = pool.map( get_uuid, range( 20 ) ) for id in uuids: print id if __name__ == '__main__': main() I peeked into uuid.py's code, and it seems to depending-on-the-platform use some OS-level routines for randomness, so I'm stumped as to a python-level solution (to do something like reload the uuid module or choose a new random seed). I could use uuid.uuid1(), but only 8 digits differ and I think there are derived exclusively from the time, which seems dangerous especially given that I'm multiprocessing (so the code could be executing at exactly the same time). Is there some Wisdom out there about this issue?

    Read the article

  • Reading numpy arrays outside of Python

    - by Abiel
    In a recent question I asked about the fastest way to convert a large numpy array to a delimited string. My reason for asking was because I wanted to take that plain text string and transmit it (over HTTP for instance) to clients written in other programming languages. A delimited string of numbers is obviously something that any client program can work with easily. However, it was suggested that because string conversion is slow, it would be faster on the Python side to do base64 encoding on the array and send it as binary. This is indeed faster. My question now is, (1) how can I make sure my encoded numpy array will travel well to clients on different operating systems and different hardware, and (2) how do I decode the binary data on the client side. For (1), my inclination is to do something like the following import numpy as np import base64 x = np.arange(100, dtype=np.float64) base64.b64encode(x.tostring()) Is there anything else I need to do? For (2), I would be happy to have an example in any programming language, where the goal is to take the numpy array of floats and turn them into a similar native data structure. Assume we have already done base64 decoding and have a byte array, and that we also know the numpy dtype, dimensions, and any other metadata which will be needed. Thanks.

    Read the article

< Previous Page | 134 135 136 137 138 139 140 141 142 143 144 145  | Next Page >