Search Results

Search found 13693 results on 548 pages for 'python metaprogramming'.

Page 75/548 | < Previous Page | 71 72 73 74 75 76 77 78 79 80 81 82  | Next Page >

  • Endless problems with a very simple python subprocess.Popen task

    - by Thomas
    I'd like python to send around a half-million integers in the range 0-255 each to an executable written in C++. This executable will then respond with a few thousand integers. Each on one line. This seems like it should be very simple to do with subprocess but i've had endless troubles. Right now im testing with code: // main() u32 num; std::cin >> num; u8* data = new u8[num]; for (u32 i = 0; i < num; ++i) std::cin >> data[i]; // test output / spit it back out for (u32 i = 0; i < num; ++i) std::cout << data[i] << std::endl; return 0; Building an array of strings ("data"), each like "255\n", in python and then using: output = proc.communicate("".join(data))[0] ...doesn't work (says stdin is closed, maybe too much at one time). Neither has using proc.stdin and proc.stdout worked. This should be so very simple, but I'm getting constant exceptions, and/or no output data returned to me. My Popen is currently: proc = Popen('aux/test_cpp_program', stdin=PIPE, stdout=PIPE, bufsize=1) Advise me before I pull my hair out. ;)

    Read the article

  • Help me find an appropriate ruby/python parser generator

    - by Geo
    The first parser generator I've worked with was Parse::RecDescent, and the guides/tutorials available for it were great, but the most useful feature it has was it's debugging tools, specifically the tracing capabilities ( activated by setting $RD_TRACE to 1 ). I am looking for a parser generator that can help you debug it's rules. The thing is, it has to be written in python or in ruby, and have a verbose mode/trace mode or very helpful debugging techniques. Does anyone know such a parser generator ? EDIT: when I said debugging, I wasn't referring to debugging python or ruby. I was referring to debugging the parser generator, see what it's doing at every step, see every char it's reading, rules it's trying to match. Hope you get the point. BOUNTY EDIT: to win the bounty, please show a parser generator framework, and illustrate some of it's debugging features. I repeat, I'm not interested in pdb, but in parser's debugging framework. Also, please don't mention treetop. I'm not interested in it.

    Read the article

  • Getting all new messages from a Maildir in python

    - by Jesper
    I have a mail dir: foo@foo:~/Maildir$ ls -l total 288 drwx------ 2 foo foo 155648 2010-04-19 15:19 cur -rw------- 1 foo foo 440 2010-03-20 08:50 dovecot.index.log -rw------- 1 foo foo 112 2010-03-20 08:49 dovecot-uidlist -rw------- 1 foo foo 8 2010-03-20 08:49 dovecot-uidvalidity -rw------- 1 foo foo 0 2010-03-20 08:49 dovecot-uidvalidity.4ba48c0e drwx------ 2 foo foo 114688 2010-04-19 16:07 new drwx------ 2 foo foo 4096 2010-04-19 16:07 tmp And in python I'm trying to get all new messages (Python 2.6.5rc2). First, getting "Maildir" works: >>> import mailbox >>> md = mailbox.Maildir('/home/foo/Maildir') >>> md.iterkeys().next() '1269924477.Vfc01I4249fM708004.foo' But how do I access "Maildir/new"? This does not work: >>> md = mailbox.Maildir('/home/foo/Maildir/new') >>> md.iterkeys().next() Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/usr/lib/python2.6/mailbox.py", line 346, in iterkeys self._refresh() File "/usr/lib/python2.6/mailbox.py", line 467, in _refresh for entry in os.listdir(subdir_path): OSError: [Errno 2] No such file or directory: '/home/foo/Maildir/new/new' >>> Any ideas?

    Read the article

  • Building a python module and linking it against a MacOSX framework

    - by madflo
    I'm trying to build a Python extension on MacOSX 10.6 and to link it against several frameworks (i386 only). I made a setup.py file, using distutils and the Extension object. I order to link against my frameworks, my LDFLAGS env var should look like : LDFLAGS = -lc -arch i386 -framework fwk1 -framework fwk2 As I did not find any 'framework' keyword in the Extension module documentation, I used the extra_link_args keyword instead. Extension('test', define_macros = [('MAJOR_VERSION', '1'), ,('MINOR_VERSION', '0')], include_dirs = ['/usr/local/include', 'include/', 'include/vitale'], extra_link_args = ['-arch i386', '-framework fwk1', '-framework fwk2'], sources = "testmodule.cpp", language = 'c++' ) Everything is compiling and linking fine. If I remove the -framework line from the extra_link_args, my linker fails, as expected. Here is the last two lines produced by a python setup.py build : /usr/bin/g++-4.2 -arch x86_64 -arch i386 -isysroot / -L/opt/local/lib -arch x86_64 -arch i386 -bundle -undefined dynamic_lookup build/temp.macosx-10.6-intel-2.6/testmodule.o -o build/lib.macosx-10.6-intel-2.6/test.so -arch i386 -framework sgdosx -framework srtosx -framework ssvosx -framework stsosx Unfortunately, the .so that I just produced is unable to find several symbols provided by this framework. I tried to check the linked framework with otool. None of them is appearing. $ otool -L test.so test.so: /usr/lib/libstdc++.6.dylib (compatibility version 7.0.0, current version 7.9.0) /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 125.0.1) There is the output of otool run on a test binary, made with g++ and ldd using the LDFLAGS described at the top of my post. On this example, the -framework did work. $ otool -L vitaosx vitaosx: /Library/Frameworks/sgdosx.framework/Versions/A/sgdosx (compatibility version 1.0.0, current version 1.0.0) /Library/Frameworks/ssvosx.framework/Versions/A/ssvosx (compatibility version 1.0.0, current version 1.0.0) /usr/lib/libstdc++.6.dylib (compatibility version 7.0.0, current version 7.9.0) /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 125.0.1) May this issue be linked to the "-undefined dynamic_lookup" flag on the linking step ? I'm a little bit confused by the few lines of documentation that I'm finding on Google. Cheers,

    Read the article

  • Converting python collaborative filtering code to use Map Reduce

    - by Neil Kodner
    Using Python, I'm computing cosine similarity across items. given event data that represents a purchase (user,item), I have a list of all items 'bought' by my users. Given this input data (user,item) X,1 X,2 Y,1 Y,2 Z,2 Z,3 I build a python dictionary {1: ['X','Y'], 2 : ['X','Y','Z'], 3 : ['Z']} From that dictionary, I generate a bought/not bought matrix, also another dictionary(bnb). {1 : [1,1,0], 2 : [1,1,1], 3 : [0,0,1]} From there, I'm computing similarity between (1,2) by calculating cosine between (1,1,0) and (1,1,1), yielding 0.816496 I'm doing this by: items=[1,2,3] for item in items: for sub in items: if sub >= item: #as to not calculate similarity on the inverse sim = coSim( bnb[item], bnb[sub] ) I think the brute force approach is killing me and it only runs slower as the data gets larger. Using my trusty laptop, this calculation runs for hours when dealing with 8500 users and 3500 items. I'm trying to compute similarity for all items in my dict and it's taking longer than I'd like it to. I think this is a good candidate for MapReduce but I'm having trouble 'thinking' in terms of key/value pairs. Alternatively, is the issue with my approach and not necessarily a candidate for Map Reduce?

    Read the article

  • Decoding tcp packets using python

    - by mikip
    Hello I am trying to decode data received over a tcp connection. The packets are small, no more than 100 bytes. However when there is a lot of them I receive some of the the packets joined together. Is there a way to prevent this. I am using python I have tried to separate the packets, my source is below. The packets start with STX byte and end with ETX bytes, the byte following the STX is the packet length, (packet lengths less than 5 are invalid) the checksum is the last bytes before the ETX def decode(data): while True: start = data.find(STX) if start == -1: #no stx in message pkt = '' data = '' break #stx found , next byte is the length pktlen = ord(data[1]) #check message ends in ETX (pktken -1) or checksum invalid if pktlen < 5 or data[pktlen-1] != ETX or checksum_valid(data[start:pktlen]) == False: print "Invalid Pkt" data = data[start+1:] continue else: pkt = data[start:pktlen] data = data[pktlen:] break return data , pkt I use it like this #process reports try: data = sock.recv(256) except: continue else: while data: data, pkt = decode(data) if pkt: process(pkt) Also if there are multiple packets in the data stream, is it best to return the packets as a collection of lists or just return the first packet I am not that familiar with python, only C, is this method OK. Any advice would be most appreciated. Thanks in advance Thanks

    Read the article

  • Replicating SQL's 'Join' in Python

    - by Daniel Mathews
    I'm in the process of trying to switch from R to Python (mainly issues around general flexibility). With Numpy, matplotlib and ipython, I've am able to cover all my use cases save for merging 'datasets'. I would like to simulate SQL's join by clause (inner, outer, full) purely in python. R handles this with the 'merge' function. I've tried the numpy.lib.recfunctions join_by, but it critical issues with duplicates along the 'key': join_by(key, r1, r2, jointype='inner', r1postfix='1', r2postfix='2', defaults=None, usemask=True, asrecarray=False) Join arrays r1 and r2 on key key. The key should be either a string or a sequence of string corresponding to the fields used to join the array. An exception is raised if the key field cannot be found in the two input arrays. Neither r1 nor r2 should have any duplicates along key: the presence of duplicates will make the output quite unreliable. Note that duplicates are not looked for by the algorithm. source: http://presbrey.mit.edu:1234/numpy.lib.recfunctions.html Any pointers or help will be most appreciated!

    Read the article

  • Reading UTF-8 XML and writing it to a file with Python

    - by Harri
    I'm trying to parse UTF-8 XML file and save some parts of it to another file. Problem is, that this is my first Python script ever and I'm totally confused about the character encoding problems I'm finding. My script fails immediately when it tries to write non-ascii character to a file, but it can print it to command prompt (at least in some level) Here's the XML (from the parts that matter at least, it's a *.resx file which contains UI strings) <?xml version="1.0" encoding="utf-8"?> <root> <resheader name="foo"> <value>bar</value> </resheader> <data name="lorem" xml:space="preserve"> <value>ipsum öä</value> </data> </root> And here's my python script from xml.dom.minidom import parse names = [] values = [] def getStrings(path): dom = parse(path) data = dom.getElementsByTagName("data") for i in range(len(data)): name = data[i].getAttribute("name") names.append(name) value = data[i].getElementsByTagName("value") values.append(value[0].firstChild.nodeValue.encode("utf-8")) def writeToFile(): with open("uiStrings-fi.py", "w") as f: for i in range(len(names)): line = names[i] + '="'+ values[i] + '"' #varName='varValue' f.write(line) f.write("\n") getStrings("ResourceFile.fi-FI.resx") writeToFile() And here's the traceback: Traceback (most recent call last): File "GenerateLanguageFiles.py", line 24, in writeToFile() File "GenerateLanguageFiles.py", line 19, in writeToFile line = names[i] + '="'+ values[i] + '"' #varName='varValue' UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2: ordinal not in ran ge(128) How should I fix my script so it would read and write UTF-8 characters properly? The files I'm trying to generate would be used in test automation with Robots Framework.

    Read the article

  • Python: puzzling behaviour inside httplib

    - by Anna
    I have added one line ( import pdb; pdb.set_trace() ) to httplib's HTTPConnection.putheader, so I can see what's going on inside. httplib.py, line 489: def putheader(self, header, value): """Send a request header line to the server. For example: h.putheader('Accept', 'text/html') """ import pdb; pdb.set_trace() if self.__state != _CS_REQ_STARTED: raise CannotSendHeader() str = '%s: %s' % (header, value) self._output(str) then ran this from the interpreter import urllib2 urllib2.urlopen('http://www.ioerror.us/ip/headers') ... and as expected PDB kicks in: > c:\python26\lib\httplib.py(858)putheader() -> if self.__state != _CS_REQ_STARTED: (Pdb) in PDB I have the luxury of evaluating expressions on the fly, so I have tried to enter self.__state: (Pdb) self.__state *** AttributeError: HTTPConnection instance has no attribute '__state' Alas, there is no __state of this instance. However when I enter step, the debugger gets past the if self.__state != _CS_REQ_STARTED: line without a problem. Why is this happening? If the self.__state doesn't exist python would have to raise an exception as it did when I entered the expression. Python version: 2.6.4 on win32

    Read the article

  • Base class deleted before subclass during python __del__ processing

    - by Oddthinking
    Context I am aware that if I ask a question about Python destructors, the standard argument will be to use contexts instead. Let me start by explaining why I am not doing that. I am writing a subclass to logging.Handler. When an instance is closed, it posts a sentinel value to a Queue.Queue. If it doesn't, a second thread will be left running forever, waiting for Queue.Queue.get() to complete. I am writing this with other developers in mind, so I don't want a failure to call close() on a handler object to cause the program to hang. Therefore, I am adding a check in __del__() to ensure the object was closed properly. I understand circular references may cause it to fail in some circumstances. There's not a lot I can do about that. Problem Here is some simple example code: explicit_delete = True class Base: def __del__(self): print "Base class cleaning up." class Sub(Base): def __del__(self): print "Sub-class cleaning up." Base.__del__(self) x = Sub() if explicit_delete: del x print "End of thread" When I run this I get, as expected: Sub-class cleaning up. Base class cleaning up. End of thread If I set explicit_delete to False in the first line, I get: End of thread Sub-class cleaning up. Exception AttributeError: "'NoneType' object has no attribute '__del__'" in <bound method Sub.__del__ of <__main__.Sub instance at 0x00F0B698>> ignored It seems the definition of Base is removed before the x._del_() is called. The Python Documentation on _del_() warns that the subclass needs to call the base-class to get a clean deletion, but here that appears to be impossible. Can you see where I made a bad step?

    Read the article

  • Python template engine

    - by jturo
    Hello there, Could it be possible if somebody could help me get started in writing a python template engine? I'm new to python and as I learn the language I've managed to write a little MVC framework running in its own light-weight-WSGI-like server. I've managed to write a script that finds and replaces keys for values: (Obviously this is not how my script is structured or implemented. this is just an example) from string import Template html = '<html>\n' html += ' <head>\n' html += ' <title>This is so Cool : In Controller HTML</title>\n' html += ' </head>\n' html += ' <body>\n' html += ' Home | <a href="/hi">Hi ${name}</a>\n' html += ' </body>\n' html += '<html>' Template(html).safe_substitute(dict(name = 'Arturo')) My next goal is to implement custom statements, modifiers, functions, etc (like 'for' loop) but i don't really know if i should use another module that i don't know about. I thought of regular expressions but i kind feel like that wouldn't be an efficient way of doing it Any help is appreciated, and i'm sure this will be of use to other people too. Thank you

    Read the article

  • Python byte per byte XOR decryption

    - by neurino
    I have an XOR encypted file by a VB.net program using this function to scramble: Public Class Crypter ... 'This Will convert String to bytes, then call the other function. Public Function Crypt(ByVal Data As String) As String Return Encoding.Default.GetString(Crypt(Encoding.Default.GetBytes(Data))) End Function 'This calls XorCrypt giving Key converted to bytes Public Function Crypt(ByVal Data() As Byte) As Byte() Return XorCrypt(Data, Encoding.Default.GetBytes(Me.Key)) End Function 'Xor Encryption. Private Function XorCrypt(ByVal Data() As Byte, ByVal Key() As Byte) As Byte() Dim i As Integer If Key.Length <> 0 Then For i = 0 To Data.Length - 1 Data(i) = Data(i) Xor Key(i Mod Key.Length) Next End If Return Data End Function End Class and saved this way: Dim Crypter As New Cryptic(Key) 'open destination file Dim objWriter As New StreamWriter(fileName) 'write crypted content objWriter.Write(Crypter.Crypt(data)) Now I have to reopen the file with Python but I have troubles getting single bytes, this is the XOR function in python: def crypto(self, data): 'crypto(self, data) -> str' return ''.join(chr((ord(x) ^ ord(y)) % 256) \ for (x, y) in izip(data.decode('utf-8'), cycle(self.key)) I had to add the % 256 since sometimes x is 256 i.e. not a single byte. This thing of two bytes being passed does not break the decryption because the key keeps "paired" with the following data. The problem is some decrypted character in the conversion is wrong. These chars are all accented letters like à, è, ì but just a few of the overall accented letters. The others are all correctly restored. I guess it could be due to the 256 mod but without it I of course get a chr exception... Thanks for your support

    Read the article

  • S3 Backup Memory Usage in Python

    - by danpalmer
    I currently use WebFaction for my hosting with the basic package that gives us 80MB of RAM. This is more than adequate for our needs at the moment, apart from our backups. We do our own backups to S3 once a day. The backup process is this: dump the database, tar.gz all the files into one backup named with the correct date of the backup, upload to S3 using the python library provided by Amazon. Unfortunately, it appears (although I don't know this for certain) that either my code for reading the file or the S3 code is loading the entire file in to memory. As the file is approximately 320MB (for today's backup) it is using about 320MB just for the backup. This causes WebFaction to quit all our processes meaning the backup doesn't happen and our site goes down. So this is the question: Is there any way to not load the whole file in to memory, or are there any other python S3 libraries that are much better with RAM usage. Ideally it needs to be about 60MB at the most! If this can't be done, how can I split the file and upload separate parts? Thanks for your help. This is the section of code (in my backup script) that caused the processes to be quit: filedata = open(filename, 'rb').read() content_type = mimetypes.guess_type(filename)[0] if not content_type: content_type = 'text/plain' print 'Uploading to S3...' response = connection.put(BUCKET_NAME, 'daily/%s' % filename, S3.S3Object(filedata), {'x-amz-acl': 'public-read', 'Content-Type': content_type})

    Read the article

  • deepcopy and python - tips to avoid using it?

    - by blackkettle
    Hi, I have a very simple python routine that involves cycling through a list of roughly 20,000 latitude,longitude coordinates and calculating the distance of each point to a reference point. def compute_nearest_points( lat, lon, nPoints=5 ): """Find the nearest N points, given the input coordinates.""" points = session.query(PointIndex).all() oldNearest = [] newNearest = [] for n in xrange(nPoints): oldNearest.append(PointDistance(None,None,None,99999.0,99999.0)) newNearest.append(obj2) #This is almost certainly an inappropriate use of deepcopy # but how SHOULD I be doing this?!?! for point in points: distance = compute_spherical_law_of_cosines( lat, lon, point.avg_lat, point.avg_lon ) k = 0 for p in oldNearest: if distance < p.distance: newNearest[k] = PointDistance( point.point, point.kana, point.english, point.avg_lat, point.avg_lon, distance=distance ) break else: newNearest[k] = deepcopy(oldNearest[k]) k += 1 for j in range(k,nPoints-1): newNearest[j+1] = deepcopy(oldNearest[j]) oldNearest = deepcopy(newNearest) #We're done, now print the result for point in oldNearest: print point.station, point.english, point.distance return I initially wrote this in C, using the exact same approach, and it works fine there, and is basically instantaneous for nPoints<=100. So I decided to port it to python because I wanted to use SqlAlchemy to do some other stuff. I first ported it without the deepcopy statements that now pepper the method, and this caused the results to be 'odd', or partially incorrect, because some of the points were just getting copied as references(I guess? I think?) -- but it was still pretty nearly as fast as the C version. Now with the deepcopy calls added, the routine does it's job correctly, but it has incurred an extreme performance penalty, and now takes several seconds to do the same job. This seems like a pretty common job, but I'm clearly not doing it the pythonic way. How should I be doing this so that I still get the correct results but don't have to include deepcopy everywhere?

    Read the article

  • python 'with' statement

    - by Stephen
    Hi, I'm using Python 2.5. I'm trying to use this 'with' statement. from __future__ import with_statement a = [] with open('exampletxt.txt','r') as f: while True: a.append(f.next().strip().split()) print a The contents of 'exampletxt.txt' are simple: a b In this case, I get the error: Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/tmp/python-7036sVf.py", line 5, in <module> a.append(f.next().strip().split()) StopIteration And if I replace f.next() with f.read(), it seems to be caught in an infinite loop. I wonder if I have to write a decorator class that accepts the iterator object as an argument, and define an __exit__ method for it? I know it's more pythonic to use a for-loop for iterators, but I wanted to implement a while loop within a generator that's called by a for-loop... something like def g(f): while True: x = f.next() if test(x): a = x elif test(x): b = f.next() yield [a,x,b] a = [] with open(filename) as f: for x in g(f): a.append(x)

    Read the article

  • How to write a custom solution using a python package, modules etc

    - by morpheous
    I am writing a packacge foobar which consists of the modules alice, bob, charles and david. From my understanding of Python packages and modules, this means I will create a folder foobar, with the following subdirectories and files (please correct if I am wrong) foobar/ __init__.py alice/alice.py bob/bob.py charles/charles.py david/david.py The package should be executable, so that in addition to making the modules alice, bob etc available as 'libraries', I should also be able to use foobar in a script like this: python foobar --args=someargs Question1: Can a package be made executable and used in a script like I described above? Question 2 The various modules will use code that I want to refactor into a common library. Does that mean creating a new sub directory 'foobar/common' and placing common.py in that folder? Question 3 How will the modules foo import the common module ? Is it 'from foobar import common' or can I not use this since these modules are part of the package? Question 4 I want to add logic for when the foobar package is being used in a script (assuming this can be done - I have only seen it done for modules) The code used is something like: if __name__ == "__main__": dosomething() where (in which file) would I put this logic ?

    Read the article

  • Speed vs security vs compatibility over methods to do string concatenation in Python

    - by Cawas
    Similar questions have been brought (good speed comparison there) on this same subject. Hopefully this question is different and updated to Python 2.6 and 3.0. So far I believe the faster and most compatible method (among different Python versions) is the plain simple + sign: text = "whatever" + " you " + SAY But I keep hearing and reading it's not secure and / or advisable. I'm not even sure how many methods are there to manipulate strings! I could count only about 4: There's interpolation and all its sub-options such as % and format and then there's the simple ones, join and +. Finally, the new approach to string formatting, which is with format, is certainly not good for backwards compatibility at same time making % not good for forward compatibility. But should it be used for every string manipulation, including every concatenation, whenever we restrict ourselves to 3.x only? Well, maybe this is more of a wiki than a question, but I do wish to have an answer on which is the proper usage of each string manipulation method. And which one could be generally used with each focus in mind (best all around for compatibility, for speed and for security). Thanks.

    Read the article

  • [Python] name 'OptionGroup' is not defined

    - by Cawas
    Ok, so I made this rookie mistake below, but in my defense I was led to it thanks to how the help about this subject is on python docs, which states how to use optparse. It is actually an error under the gigantic tutorial section. In contrast and to my offense, I may be one of the very few stupid people who can't read very well and pay close attention on what I do. But since this took me so long to discover, I wanted to "document" it here: Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29) >>> from optparse import OptionParser >>> outputGroup = OptionGroup(parser, 'Output handling') Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'OptionGroup' is not defined This is strictly done with examples found on the docs, and you can't find anything about it anywhere, be it that long long docs page, google or stackoverflow. Plus, reading optparse.py shows OptionGroup is there, so that adds to the confusion. I bet it will take less than 1 minute for someone to spot my error. For that I'll only add proper tags and / or modify the title later on. :)

    Read the article

  • Python OOP and lists

    - by Mikk
    Hi, I'm new to Python and it's OOP stuff and can't get it to work. Here's my code: class Tree: root = None; data = []; def __init__(self, equation): self.root = equation; def appendLeft(self, data): self.data.insert(0, data); def appendRight(self, data): self.data.append(data); def calculateLeft(self): result = []; for item in (self.getLeft()): if (type(item) == type(self)): data = item.calculateLeft(); else: data = item; result.append(item); return result; def getLeft(self): return self.data; def getRight(self): data = self.data; data.reverse(); return data; tree2 = Tree("*"); tree2.appendRight(44); tree2.appendLeft(20); tree = Tree("+"); tree.appendRight(4); tree.appendLeft(10); tree.appendLeft(tree2); print(tree.calculateLeft()); It looks like tree2 and tree are sharing list "data"? At the moment I'd like it to output something like [[20,44], 10, 4], but when I tree.appendLeft(tree2) I get RuntimeError: maximum recursion depth exceeded, and when i even won't appendLeft(tree2) it outputs [10, 20, 44, 4] (!!!). What am I missing here? I'm using Portable Python 3.0.1. Thank you

    Read the article

  • python histogram one-liner

    - by mykhal
    there are many ways, how to code histogram in Python. by histogram, i mean function, counting objects in an interable, resulting in the count table (i.e. dict). e.g.: >>> L = 'abracadabra' >>> histogram(L) {'a': 5, 'b': 2, 'c': 1, 'd': 1, 'r': 2} it can be written like this: def histogram(L): d = {} for x in L: if x in d: d[x] += 1 else: d[x] = 1 return d ..however, there are much less ways, how do this in a single expression. if we had "dict comprehensions" in python, we would write: >>> { x: L.count(x) for x in set(L) } but we don't have them, so we have to write: >>> dict([(x, L.count(x)) for x in set(L)]) however, this approach may yet be readable, but is not efficient - L is walked-through multiple times, so this won't work for single-life generators.. the function should iterate well also through gen(), where: def gen(): for x in L: yield x we can go with reduce (R.I.P.): >>> reduce(lambda d,x: dict(d, x=d.get(x,0)+1), L, {}) # wrong! oops, does not work, the key name is 'x', not x :( i ended with: >>> reduce(lambda d,x: dict(d.items() + [(x, d.get(x, 0)+1)]), L, {}) (in py3k, we would have to write list(d.items()) instead of d.items(), but it's hypothethical, since there is no reduce there) please beat me with a better one-liner, more readable! ;)

    Read the article

  • Python DictReader - Skipping rows with missing columns?

    - by victorhooi
    heya, I have a Excel .CSV file I'm attempting to read in with DictReader. All seems to be well, except it seems to omit rows, specifically those with missing columns. Our input looks like: mail,givenName,sn,lorem,ipsum,dolor,telephoneNumber [email protected],ian,bay,3424,8403,2535,+65(2)34523534545 [email protected],mike,gibson,3424,8403,2535,+65(2)34523534545 [email protected],ross,martin,,,,+65(2)34523534545 [email protected],david,connor,,,,+65(2)34523534545 [email protected],chris,call,3424,8403,2535,+65(2)34523534545 So some of the rows have missing lorem/ipsum/dolor columns, and it's just a string of commas for those. We're reading it in with: def read_gd_dump(input_file="blah 20100423.csv"): gd_extract = csv.DictReader(open('blah 20100423.csv'), restval='missing', dialect='excel') return dict([(row['something'], row) for row in gd_extract]) And I checked that "something" (the key for our dict) isn't one of the missing columns, I had originally suspected it might be that. It's one of the columns after that. However, DictReader seems to completely skip over the rows. I tried setting restval to something, didn't seem to make any difference. I can't seem to find anything in Python's CSV docs (http://docs.python.org/library/csv.html) that would explain this behaviour, but I may have misread something. Any ideas? Thanks, Victor

    Read the article

  • spam and dirty words comment post filtering in python (django)

    - by sintaloo
    Hi All, My basic question is how to filter spam and dirty words in a comment post system under python (django). I have a collection of phrases (approximately 3000 phrases) to be filtered. Question (1), are there any existing open source python (or django) package/module/plugin which can handle this job? I knew there was one called Akismet. But from what I understood, it will not solve my problem. Akismet is just a web service and filter the words dictionary defined by Akismet. But I have my own collection of words. Please correct me if I am wrong. Question (2), If there is no such open source package I can use, how to create my own one? The only thing I can think of it's to use regular expression and join all the word phrases with 'or' in a regular expression. but I have 3000 phrases, I think it won't work in term of performance and filter every comment post. any suggestions where should I start from? Thank you very much for your help and time.

    Read the article

  • use proxy in python to fetch a webpage

    - by carmao
    I am trying to write a function in Python to use a public anonymous proxy and fetch a webpage, but I got a rather strange error. The code (I have Python 2.4): import urllib2 def get_source_html_proxy(url, pip, timeout): # timeout in seconds (maximum number of seconds willing for the code to wait in # case there is a proxy that is not working, then it gives up) proxy_handler = urllib2.ProxyHandler({'http': pip}) opener = urllib2.build_opener(proxy_handler) opener.addheaders = [('User-agent', 'Mozilla/5.0')] urllib2.install_opener(opener) req=urllib2.Request(url) sock=urllib2.urlopen(req) timp=0 # a counter that is going to measure the time until the result (webpage) is # returned while 1: data = sock.read(1024) timp=timp+1 if len(data) < 1024: break timpLimita=50000000 * timeout if timp==timpLimita: # 5 millions is about 1 second break if timp==timpLimita: print IPul + ": Connection is working, but the webpage is fetched in more than 50 seconds. This proxy returns the following IP: " + str(data) return str(data) else: print "This proxy " + IPul + "= good proxy. " + "It returns the following IP: " + str(data) return str(data) # Now, I call the function to test it for one single proxy (IP:port) that does not support user and password (a public high anonymity proxy) #(I put a proxy that I know is working - slow, but is working) rez=get_source_html_proxy("http://www.whatismyip.com/automation/n09230945.asp", "93.84.221.248:3128", 50) print rez The error: Traceback (most recent call last): File "./public_html/cgi-bin/teste5.py", line 43, in ? rez=get_source_html_proxy("http://www.whatismyip.com/automation/n09230945.asp", "93.84.221.248:3128", 50) File "./public_html/cgi-bin/teste5.py", line 18, in get_source_html_proxy sock=urllib2.urlopen(req) File "/usr/lib64/python2.4/urllib2.py", line 130, in urlopen return _opener.open(url, data) File "/usr/lib64/python2.4/urllib2.py", line 358, in open response = self._open(req, data) File "/usr/lib64/python2.4/urllib2.py", line 376, in _open '_open', req) File "/usr/lib64/python2.4/urllib2.py", line 337, in _call_chain result = func(*args) File "/usr/lib64/python2.4/urllib2.py", line 573, in lambda r, proxy=url, type=type, meth=self.proxy_open: \ File "/usr/lib64/python2.4/urllib2.py", line 580, in proxy_open if '@' in host: TypeError: iterable argument required I do not know why the character "@" is an issue (I have no such in my code. Should I have?) Thanks in advance for your valuable help.

    Read the article

  • Ignore case in Python strings

    - by Paul Oyster
    What is the easiest way to compare strings in Python, ignoring case? Of course one can do (str1.lower() <= str2.lower()), etc., but this created two additional temporary strings (with the obvious alloc/g-c overheads). I guess I'm looking for an equivalent to C's stricmp(). [Some more context requested, so I'll demonstrate with a trivial example:] Suppose you want to sort a looong list of strings. You simply do theList.sort(). This is O(n * log(n)) string comparisons and no memory management (since all strings and list elements are some sort of smart pointers). You are happy. Now, you want to do the same, but ignore the case (let's simplify and say all strings are ascii, so locale issues can be ignored). You can do theList.sort(key=lambda s: s.lower()), but then you cause two new allocations per comparison, plus burden the garbage-collector with the duplicated (lowered) strings. Each such memory-management noise is orders-of-magnitude slower than simple string comparison. Now, with an in-place stricmp()-like function, you do: theList.sort(cmp=stricmp) and it is as fast and as memory-friendly as theList.sort(). You are happy again. The problem is any Python-based case-insensitive comparison involves implicit string duplications, so I was expecting to find a C-based comparisons (maybe in module string). Could not find anything like that, hence the question here. (Hope this clarifies the question).

    Read the article

  • Speedup writing C programs using a subset of the Python syntax

    - by psihodelia
    I am constantly trying to optimize my time. Writing a C code takes a lot of time and requires much more keyboard touches than say writing a Python program. However, in order to speed up the time required to create a C program, one can automatize many things. I'd like to write my programs using smth. like Python but with C semantics. It means, all keywords are C keywords, but syntax is optimized. For example, this C code: #include "dsplib.h" #include "coeffs.h" #define MODULENAME "dsplib" #define NUM_SAMPLES 320 typedef float t_Vec; typedef struct s_Inter { char *pc_Name; struct s_Inter *px_Next; }t_Inter; typedef struct s_DspLibControl { t_Vec f_Y; }t_DspLibControl; void v_DspLibName(void) { printf("Module: %s", MODULENAME); printf("\n"); } int v_DspLibInitInterControl(t_DspLibControl *px_Con) { int y; px_Con->f_Y = 0.0; for(int i=0;i<10;i++) { y += i * i; } return y; } in optimized pythonized version can look like: include dsplib, coeffs define MODULENAME="dsplib", NUM_SAMPLES=320 typedef float t_Vec typedef struct s_Inter: char *pc_Name struct s_Inter *px_Next t_Inter typedef struct s_DspLibControl: t_Vec f_Y t_DspLibControl v_DspLibName(): printf("Module: %s", MODULENAME); printf("\n") int v_DspLibInitInterControl(t_DspLibControl *px_Con): int y px_Con->f_Y = 0.0 for int i=0;i<10;i++: y += i * i return y My question is: Do you know any VIM script, which allows to translate an original pythonized C code into a standard C code? For example, one is writing a C code but uses pythonized syntax, once she decides to translate pythonized blocks into standard C, she selects such blocks and press some key. And she doesn't save such pythonized code of course, VIM translates it into standard C.

    Read the article

< Previous Page | 71 72 73 74 75 76 77 78 79 80 81 82  | Next Page >