Search Results

Search found 13534 results on 542 pages for 'python 2 x'.

Page 73/542 | < Previous Page | 69 70 71 72 73 74 75 76 77 78 79 80  | Next Page >

  • Producing an view of a text's revision history in Python

    - by hekevintran
    I have two versions of a piece of text and I want to produce an HTML view of its revision similar to what Google Docs or Stack Overflow displays. I need to do this in Python. I don't know what this technique is called but I assume that it has a name and hopefully there is a Python library that can do it. Version 1: William Henry "Bill" Gates III (born October 28, 1955)[2] is an American business magnate, philanthropist, and chairman[3] of Microsoft, the software company he founded with Paul Allen. Version 2: William Henry "Bill" Gates III (born October 28, 1955)[2] is a business magnate, philanthropist, and chairman[3] of Microsoft, the software company he founded with Paul Allen. He is American. The desired output: William Henry "Bill" Gates III (born October 28, 1955)[2] is an American business magnate, philanthropist, and chairman[3] of Microsoft, the software company he founded with Paul Allen. He is American. Using the diff command doesn't work because it tells me which lines are different but not which columns/words are different. $ echo 'William Henry "Bill" Gates III (born October 28, 1955)[2] is an American business magnate, philanthropist, and chairman[3] of Microsoft, the software company he founded with Paul Allen.' > oldfile $ echo 'William Henry "Bill" Gates III (born October 28, 1955)[2] is a business magnate, philanthropist, and chairman[3] of Microsoft, the software company he founded with Paul Allen. He is American.' > newfile $ diff -u oldfile newfile --- oldfile 2010-04-30 13:32:43.000000000 -0700 +++ newfile 2010-04-30 13:33:09.000000000 -0700 @@ -1 +1 @@ -William Henry "Bill" Gates III (born October 28, 1955)[2] is an American business magnate, philanthropist, and chairman[3] of Microsoft, the software company he founded with Paul Allen. +William Henry "Bill" Gates III (born October 28, 1955)[2] is a business magnate, philanthropist, and chairman[3] of Microsoft, the software company he founded with Paul Allen. He is American.' > oldfile

    Read the article

  • Python: fetching SVG file using urllib is returning binary when I need ASCII

    - by Drew Dara-Abrams
    I'm using urllib (in Python) to fetch an SVG file: import urllib urllib.urlopen('http://alpha.vectors.cloudmade.com/BC9A493B41014CAABB98F0471D759707/-122.2487,37.87588,-122.265823,37.868054?styleid=1&viewport=400x231').read() which produces output of the sort: xb6\xf6\x00\xb3\xfb2\xff\xda\xc5\xf2\xc2\x14\xef\xcd\x82\x0b\xdbU\xb0\x81\xcaF\xd8\x1a\xf6\xdf[i)\xba\xcf\x80\xab\xd6\x8c\xe3l_\xe7\n\xed2,\xbdm\xa0_|\xbb\x12\xff\xb6\xf8\xda\xd9\xc3\xd9\t\xde\x9a\xf8\xae\xb3T\xa3\r`\x8a\x08!T\xfb8\x92\x95\x0c\xdd\x8b!\x02P\xea@\x98\x1c^\xc7\xda\\\xec\xe3\xe1\xbe,0\xcd\xbeZ~\x92\xb3\xfa\xdd\xfcbyu\xb8\x83\xbb\xbdS\x0f\x82\x0b\xfe\xf5_\xdawn\xff\xef_\xff\xe5\xfa\x1f?\xbf\xffoZ\x0f\x8b\xbfV\xf4\x04\x00' when I was expecting more like this: <?xml version='1.0' encoding='UTF-8'?> <svg xmlns="http://www.w3.org/2000/svg" xmlns:cm="http://cloudmade.com/" width="400" height="231"> <rect width="100%" height="100%" fill="#eae8dd" opacity="1"/> <g transform="scale(0.209849975856)"> <g transform="translate(13610569, 4561906)" flood-opacity="0.1" flood-color="grey"> <path d="M -13610027.720000000670552 -4562403.660000000149012 I guess this is an issue of binary vs. ASCII. Can anyone help me (a Python newbie) with the appropriate conversion so that I can get on with parsing and manipulating the SVG code?

    Read the article

  • python - tkinter - update label from variable

    - by Tom
    I wrote a python script that does some stuff to generate and then keep changing some text stored as a string variable. This works, and I can print the string each time it gets changed. Problems have arisen while trying to display that output in a GUI (just as a basic label) using tkinter. I can get the label to display the string for the first time... but it never updates. This is really the first time I have tried to use tkinter, so it's likely I'm making a foolish error. What I've got looks logical to me, but I'm evidently going wrong somewhere! from tkinter import * outputText = 'Ready' counter = int(0) root = Tk() root.maxsize(400, 400) var = StringVar() l = Label(root, textvariable=var, anchor=NW, justify=LEFT, wraplength=398) l.pack() var.set(outputText) while True: counter = counter + 1 #do some stuff that generates string as variable 'result' outputText = result #do some more stuff that generates new string as variable 'result' outputText = result #do some more stuff that generates new string as variable 'result' outputText = result if counter == 5: break root.mainloop() I also tried: from tkinter import * outputText = 'Ready' counter = int(0) root = Tk() root.maxsize(400, 400) var = StringVar() l = Label(root, textvariable=var, anchor=NW, justify=LEFT, wraplength=398) l.pack() var.set(outputText) while True: counter = counter + 1 #do some stuff that generates string as variable 'result' outputText = result var.set(outputText) #do some more stuff that generates new string as variable 'result' outputText = result var.set(outputText) #do some more stuff that generates new string as variable 'result' outputText = result var.set(outputText) if counter == 5: break root.mainloop() In both cases, the label will show 'Ready' but won't update to change that to the strings as they're generated later. After a fair bit of googling and looking through answers on this site, I thought the solution might be to use update_idletasks - I tried putting that in after each time the variable was changed, but it didn't help. It also seems possible I am meant to be using trace and callback somehow to make all this work...but I can't get my head around how that works (I tried, but didn't manage to make anything that even looked like it would do something, let alone actually worked). I'm still very new to both python and especially tkinter, so, any help much appreciated but please spell it out for me if you can :)

    Read the article

  • Starting out NLP - Python + large data set

    - by pencilNero
    Hi, I've been wanting to learn python and do some NLP, so have finally gotten round to starting. Downloaded the english wikipedia mirror for a nice chunky dataset to start on, and have been playing around a bit, at this stage just getting some of it into a sqlite db (havent worked with dbs in the past unfort). But I'm guessing sqlite is not the way to go for a full blown nlp project(/experiment :) - what would be the sort of things I should look at ? HBase (.. and hadoop) seem interesting, i guess i could run then im java, prototype in python and maybe migrate the really slow bits to java... alternatively just run Mysql.. but the dataset is 12gb, i wonder if that will be a problem? Also looked at lucene, but not sure how (other than breaking the wiki articles into chunks) i'd get that to work.. What comes to mind for a really flexible NLP platform (i dont really know at this stage WHAT i want to do.. just want to learn large scale lang analysis tbh) ? Many thanks.

    Read the article

  • Create a VPN with Python

    - by user213060
    I want to make a device "tunnel box" that you plug an input ethernet line, and an output ethernet line, and all the traffic that goes through it gets modified in a special way. This is similar to how a firewall, IDS, VPN, or similar boxes are connected inline in a network. I think you can just assume that I am writing a custom VPN in Python for the purpose of this question: LAN computer <--\ LAN computer <---> [LAN switch] <--> ["tunnel box"] <--> [internet modem] <--> LAN computer <--/ My question is, what is a good way to program this "tunnel box" from python? My application needs to see TCP flows at the network layer, not as individual ethernet frames. Non-TCP/IP traffic such as ICPM and other types should just be passed through. Example Twisted-like Code for my "tunnel box" tunnel appliance: from my_code import special_data_conversion_function class StreamInterceptor(twisted.Protocol): def dataReceived(self,data): data=special_data_conversion_function(data) self.outbound_connection.send(data) My initial guesses: TUN/TAP with twisted.pair.tuntap.py - Problem: This seems to only work at the ethernet frame level, not like my example? Socks proxy - Problem: Not transparent as in my diagram. Programs have to be specifically setup for it. Thanks!

    Read the article

  • using python 'with' statement with iterators?

    - by Stephen
    Hi, I'm using Python 2.5. I'm trying to use this 'with' statement. from __future__ import with_statement a = [] with open('exampletxt.txt','r') as f: while True: a.append(f.next().strip().split()) print a The contents of 'exampletxt.txt' are simple: a b In this case, I get the error: Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/tmp/python-7036sVf.py", line 5, in <module> a.append(f.next().strip().split()) StopIteration And if I replace f.next() with f.read(), it seems to be caught in an infinite loop. I wonder if I have to write a decorator class that accepts the iterator object as an argument, and define an __exit__ method for it? I know it's more pythonic to use a for-loop for iterators, but I wanted to implement a while loop within a generator that's called by a for-loop... something like def g(f): while True: x = f.next() if test1(x): a = x elif test2(x): b = f.next() yield [a,x,b] a = [] with open(filename) as f: for x in g(f): a.append(x)

    Read the article

  • Python API for VirtualBox

    - by jessica
    I have made a command-line interface for virtualbox such that the virtualbox can be controlled from a remote machine. now I am trying to implement the commmand-line interface using python virtualbox api. For that I have downloaded the pyvb package (python api documentation shows functions that can be used for implementing this under pyvb package). but when I give pyvb.vb.VB.startVM(instance of VB class,pyvb.vm.vbVM) SERVER SIDE CODE IS from pyvb.constants import * from pyvb.vm import * from pyvb.vb import * import xpcom import pyvb import os import socket import threading class ClientThread ( threading.Thread ): # Override Thread's init method to accept the parameters needed: def init ( self, channel, details ): self.channel = channel self.details = details threading.Thread.__init__ ( self ) def run ( self ): print 'Received connection:', self.details [ 0 ] while 1: s= self.channel.recv ( 1024 ) if(s!='end'): if(s=='start'): v=VB() pyvb.vb.VB.startVM(v,pyvb.vm.vbVM) else: self.channel.close() break print 'Closed connection:', self.details [ 0 ] server = socket.socket ( socket.AF_INET, socket.SOCK_STREAM ) server.bind ( ( '127.0.0.1', 2897 ) ) server.listen ( 5 ) while True: channel, details = server.accept() ClientThread ( channel, details ).start() it shows an error Exception in thread Thread-1: Traceback (most recent call last): File "/usr/lib/python2.5/threading.py", line 486, in __bootstrap_inner self.run() File "news.py", line 27, in run pyvb.vb.VB.startVM(v,pyvb.vm.vbVM.getUUID(m)) File "/usr/lib/python2.5/site-packages/pyvb-0.0.2-py2.5.egg/pyvb/vb.py", line 65, in startVM cmd='%s %s'%(VB_COMMAND_STARTVM, vm.getUUID()) AttributeError: 'str' object has no attribute 'getUUID'

    Read the article

  • How do you invoke a python script inside a jar file using python ?

    - by Trevor
    I'm working on an application that intersperses a bunch of jython and java code. Due to the nature of the program (using wsadmin) we are really restricted to Python 2.1 We currently have a jar containing both java source and .py modules. The code is currently invoked using java, but I'd like to remove this in favor of migrating as much functionality as possible to jython. The problem I have is that I want to either import or execute python modules inside the existing jar file from a calling jython script. I've tried a couple of different ways without success. My directory structure looks like: application.jar |-- com |--example |-- action |-- MyAction.class |-- pre_myAction.py The 1st approach I tried was to do imports from the jar. I added the jar to my sys.path and tried to import the module using both import com.example.action.myAction and import myAction. No success however, even when I put init.py files into the directory at each level. The 2nd approach I tried was to load the resource using the java class. So I wrote the below code: import sys import os import com.example.action.MyAction as MyAction scriptName = str(MyAction.getResource('/com/example/action/myAction.py')) scriptStr = MyAction.getResourceAsStream('/com/example/action/myAction.py') try: print execfile(scriptStr) except: print "failed 1" try: print execfile(scriptName) except: print "failed 2" Both of these failed. I'm at a bit of a loss now as to how I should proceed. Any ideas ? cheers, Trevor

    Read the article

  • Python doctest error

    - by user74283
    Hi I recently started experimenting with python currently reading "Think like a computer scientist: Learning python v2nd edition" I have been having some trouble with doctest. I use a windows 7 machine and Eclipse IDE with pydev. My question is when i run the script below i get the error below. Said script is below the the error message Traceback (most recent call last): File "C:\Users\shaytac\PythonProjects\test.py", line 21, in doctest.testmod() File "C:\Python26\lib\doctest.py", line 1829, in testmod for test in finder.find(m, name, globs=globs, extraglobs=extraglobs): File "C:\Python26\lib\doctest.py", line 852, in find self._find(tests, obj, name, module, source_lines, globs, {}) File "C:\Python26\lib\doctest.py", line 906, in _find globs, seen) File "C:\Python26\lib\doctest.py", line 894, in _find test = self._get_test(obj, name, module, globs, source_lines) File "C:\Python26\lib\doctest.py", line 978, in _get_test filename, lineno) File "C:\Python26\lib\doctest.py", line 597, in get_doctest return DocTest(self.get_examples(string, name), globs, File "C:\Python26\lib\doctest.py", line 611, in get_examples return [x for x in self.parse(string, name) File "C:\Python26\lib\doctest.py", line 573, in parse self._parse_example(m, name, lineno) File "C:\Python26\lib\doctest.py", line 631, in _parse_example self._check_prompt_blank(source_lines, indent, name, lineno) File "C:\Python26\lib\doctest.py", line 718, in _check_prompt_blank line[indent:indent+3], line)) ValueError: line 2 of the docstring for main.compare lacks blank after : 'compare(5, 4) ' def compare(a, b): """ >>>compare(5, 4) 1 >>>compare(7, 7) 0 >>>compare(2, 3) -1 >>>compare(42, 1) 1 """ if a > b : return 1 if a == b : return 0 if a < b : return -1 if __name__ == '__main__': import doctest doctest.testmod()

    Read the article

  • Python Pre-testing for exceptions when coverage fails

    - by Tal Weiss
    I recently came across a simple but nasty bug. I had a list and I wanted to find the smallest member in it. I used Python's built-in min(). Everything worked great until in some strange scenario the list was empty (due to strange user input I could not have anticipated). My application crashed with a ValueError (BTW - not documented in the official docs). I have very extensive unit tests and I regularly check coverage to avoid surprises like this. I also use Pylint (everything is integrated in PyDev) and I never ignore warnings, yet I failed to catch this bug before my users did. Is there anything I can change in my methodology to avoid these kind of runtime errors? (which would have been caught at compile time in Java / C#?). I'm looking for something more than wrapping my code with a big try-except. What else can I do? How many other build in Python functions are hiding nasty surprises like this???

    Read the article

  • Python with Wiimote using pywiiuse module

    - by Anon
    After seeing the abilities and hackibility of wiimotes I really want to use it in my 'Intro to programming' final. Everyone must make a python program and present it to the class. I want to make a game with pygame incorporating a wiimote. I found pywiiuse which is a very basic wrapper for the wiiuse library using c types. I can NOT get anything beyond LEDs and vibrating to work. Buttons, IR, motion sensing, nothing. I've tried different versions of wiiuse, pywiiuse, even python. I can't even get the examples that came with it to run. Here's the code I made as a simple test. I copied some of the example C++ code. from pywiiuse import * from time import sleep #Init wiimotes = wiiuse_init() #Find and start the wiimote found = wiiuse_find(wiimotes,1,5) #Make the variable wiimote to the first wiimote init() found wiimote = wiimotes.contents #Set Leds wiiuse_set_leds(wiimote,WIIMOTE_LED_1) #Rumble for 1 second wiiuse_rumble(wiimote,1) sleep(1) wiiuse_rumble(wiimote,0) #Turn motion sensing on(supposedly) wiiuse_motion_sensing(wiimote,1) while 1: #Poll the wiimotes to get the status like pitch or roll if(wiiuse_poll(wiimote,1)): print 'EVENT' And here's the output when I run it. wiiuse version 0.9 wiiuse api version 8 [INFO] Found wiimote [assigned wiimote id 1]. EVENT EVENT Traceback (most recent call last): File "C:\Documents and Settings\Nick\Desktop\wiimotetext.py", line 26, in <mod ule> if(wiiuse_poll(wiimote,1)): WindowsError: exception: access violation reading 0x00000004 It seems each time I run it, it prints out EVENT 2-5 times until the trace back. I have no clue what to do at this point, I've been trying for the past two days to get it working. Thanks!

    Read the article

  • python: nonblocking subprocess, check stdout

    - by Will Cavanagh
    Ok so the problem I'm trying to solve is this: I need to run a program with some flags set, check on its progress and report back to a server. So I need my script to avoid blocking while the program executes, but I also need to be able to read the output. Unfortunately, I don't think any of the methods available from Popen will read the output without blocking. I tried the following, which is a bit hack-y (are we allowed to read and write to the same file from two different objects?) import time import subprocess from subprocess import * with open("stdout.txt", "wb") as outf: with open("stderr.txt", "wb") as errf: command = ['Path\\To\\Program.exe', 'para', 'met', 'ers'] p = subprocess.Popen(command, stdout=outf, stderr=errf) isdone = False while not isdone : with open("stdout.txt", "rb") as readoutf: #this feels wrong for line in readoutf: print(line) print("waiting...\\r\\n") if(p.poll() != None) : done = True time.sleep(1) output = p.communicate()[0] print(output) Unfortunately, Popen doesn't seem to write to my file until after the command terminates. Does anyone know of a way to do this? I'm not dedicated to using python, but I do need to send POST requests to a server in the same script, so python seemed like an easier choice than, say, shell scripting. Thanks! Will

    Read the article

  • Endless problems with a very simple python subprocess.Popen task

    - by Thomas
    I'd like python to send around a half-million integers in the range 0-255 each to an executable written in C++. This executable will then respond with a few thousand integers. Each on one line. This seems like it should be very simple to do with subprocess but i've had endless troubles. Right now im testing with code: // main() u32 num; std::cin >> num; u8* data = new u8[num]; for (u32 i = 0; i < num; ++i) std::cin >> data[i]; // test output / spit it back out for (u32 i = 0; i < num; ++i) std::cout << data[i] << std::endl; return 0; Building an array of strings ("data"), each like "255\n", in python and then using: output = proc.communicate("".join(data))[0] ...doesn't work (says stdin is closed, maybe too much at one time). Neither has using proc.stdin and proc.stdout worked. This should be so very simple, but I'm getting constant exceptions, and/or no output data returned to me. My Popen is currently: proc = Popen('aux/test_cpp_program', stdin=PIPE, stdout=PIPE, bufsize=1) Advise me before I pull my hair out. ;)

    Read the article

  • Getting all new messages from a Maildir in python

    - by Jesper
    I have a mail dir: foo@foo:~/Maildir$ ls -l total 288 drwx------ 2 foo foo 155648 2010-04-19 15:19 cur -rw------- 1 foo foo 440 2010-03-20 08:50 dovecot.index.log -rw------- 1 foo foo 112 2010-03-20 08:49 dovecot-uidlist -rw------- 1 foo foo 8 2010-03-20 08:49 dovecot-uidvalidity -rw------- 1 foo foo 0 2010-03-20 08:49 dovecot-uidvalidity.4ba48c0e drwx------ 2 foo foo 114688 2010-04-19 16:07 new drwx------ 2 foo foo 4096 2010-04-19 16:07 tmp And in python I'm trying to get all new messages (Python 2.6.5rc2). First, getting "Maildir" works: >>> import mailbox >>> md = mailbox.Maildir('/home/foo/Maildir') >>> md.iterkeys().next() '1269924477.Vfc01I4249fM708004.foo' But how do I access "Maildir/new"? This does not work: >>> md = mailbox.Maildir('/home/foo/Maildir/new') >>> md.iterkeys().next() Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/usr/lib/python2.6/mailbox.py", line 346, in iterkeys self._refresh() File "/usr/lib/python2.6/mailbox.py", line 467, in _refresh for entry in os.listdir(subdir_path): OSError: [Errno 2] No such file or directory: '/home/foo/Maildir/new/new' >>> Any ideas?

    Read the article

  • Help me find an appropriate ruby/python parser generator

    - by Geo
    The first parser generator I've worked with was Parse::RecDescent, and the guides/tutorials available for it were great, but the most useful feature it has was it's debugging tools, specifically the tracing capabilities ( activated by setting $RD_TRACE to 1 ). I am looking for a parser generator that can help you debug it's rules. The thing is, it has to be written in python or in ruby, and have a verbose mode/trace mode or very helpful debugging techniques. Does anyone know such a parser generator ? EDIT: when I said debugging, I wasn't referring to debugging python or ruby. I was referring to debugging the parser generator, see what it's doing at every step, see every char it's reading, rules it's trying to match. Hope you get the point. BOUNTY EDIT: to win the bounty, please show a parser generator framework, and illustrate some of it's debugging features. I repeat, I'm not interested in pdb, but in parser's debugging framework. Also, please don't mention treetop. I'm not interested in it.

    Read the article

  • Building a python module and linking it against a MacOSX framework

    - by madflo
    I'm trying to build a Python extension on MacOSX 10.6 and to link it against several frameworks (i386 only). I made a setup.py file, using distutils and the Extension object. I order to link against my frameworks, my LDFLAGS env var should look like : LDFLAGS = -lc -arch i386 -framework fwk1 -framework fwk2 As I did not find any 'framework' keyword in the Extension module documentation, I used the extra_link_args keyword instead. Extension('test', define_macros = [('MAJOR_VERSION', '1'), ,('MINOR_VERSION', '0')], include_dirs = ['/usr/local/include', 'include/', 'include/vitale'], extra_link_args = ['-arch i386', '-framework fwk1', '-framework fwk2'], sources = "testmodule.cpp", language = 'c++' ) Everything is compiling and linking fine. If I remove the -framework line from the extra_link_args, my linker fails, as expected. Here is the last two lines produced by a python setup.py build : /usr/bin/g++-4.2 -arch x86_64 -arch i386 -isysroot / -L/opt/local/lib -arch x86_64 -arch i386 -bundle -undefined dynamic_lookup build/temp.macosx-10.6-intel-2.6/testmodule.o -o build/lib.macosx-10.6-intel-2.6/test.so -arch i386 -framework sgdosx -framework srtosx -framework ssvosx -framework stsosx Unfortunately, the .so that I just produced is unable to find several symbols provided by this framework. I tried to check the linked framework with otool. None of them is appearing. $ otool -L test.so test.so: /usr/lib/libstdc++.6.dylib (compatibility version 7.0.0, current version 7.9.0) /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 125.0.1) There is the output of otool run on a test binary, made with g++ and ldd using the LDFLAGS described at the top of my post. On this example, the -framework did work. $ otool -L vitaosx vitaosx: /Library/Frameworks/sgdosx.framework/Versions/A/sgdosx (compatibility version 1.0.0, current version 1.0.0) /Library/Frameworks/ssvosx.framework/Versions/A/ssvosx (compatibility version 1.0.0, current version 1.0.0) /usr/lib/libstdc++.6.dylib (compatibility version 7.0.0, current version 7.9.0) /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 125.0.1) May this issue be linked to the "-undefined dynamic_lookup" flag on the linking step ? I'm a little bit confused by the few lines of documentation that I'm finding on Google. Cheers,

    Read the article

  • Replicating SQL's 'Join' in Python

    - by Daniel Mathews
    I'm in the process of trying to switch from R to Python (mainly issues around general flexibility). With Numpy, matplotlib and ipython, I've am able to cover all my use cases save for merging 'datasets'. I would like to simulate SQL's join by clause (inner, outer, full) purely in python. R handles this with the 'merge' function. I've tried the numpy.lib.recfunctions join_by, but it critical issues with duplicates along the 'key': join_by(key, r1, r2, jointype='inner', r1postfix='1', r2postfix='2', defaults=None, usemask=True, asrecarray=False) Join arrays r1 and r2 on key key. The key should be either a string or a sequence of string corresponding to the fields used to join the array. An exception is raised if the key field cannot be found in the two input arrays. Neither r1 nor r2 should have any duplicates along key: the presence of duplicates will make the output quite unreliable. Note that duplicates are not looked for by the algorithm. source: http://presbrey.mit.edu:1234/numpy.lib.recfunctions.html Any pointers or help will be most appreciated!

    Read the article

  • Converting python collaborative filtering code to use Map Reduce

    - by Neil Kodner
    Using Python, I'm computing cosine similarity across items. given event data that represents a purchase (user,item), I have a list of all items 'bought' by my users. Given this input data (user,item) X,1 X,2 Y,1 Y,2 Z,2 Z,3 I build a python dictionary {1: ['X','Y'], 2 : ['X','Y','Z'], 3 : ['Z']} From that dictionary, I generate a bought/not bought matrix, also another dictionary(bnb). {1 : [1,1,0], 2 : [1,1,1], 3 : [0,0,1]} From there, I'm computing similarity between (1,2) by calculating cosine between (1,1,0) and (1,1,1), yielding 0.816496 I'm doing this by: items=[1,2,3] for item in items: for sub in items: if sub >= item: #as to not calculate similarity on the inverse sim = coSim( bnb[item], bnb[sub] ) I think the brute force approach is killing me and it only runs slower as the data gets larger. Using my trusty laptop, this calculation runs for hours when dealing with 8500 users and 3500 items. I'm trying to compute similarity for all items in my dict and it's taking longer than I'd like it to. I think this is a good candidate for MapReduce but I'm having trouble 'thinking' in terms of key/value pairs. Alternatively, is the issue with my approach and not necessarily a candidate for Map Reduce?

    Read the article

  • Decoding tcp packets using python

    - by mikip
    Hello I am trying to decode data received over a tcp connection. The packets are small, no more than 100 bytes. However when there is a lot of them I receive some of the the packets joined together. Is there a way to prevent this. I am using python I have tried to separate the packets, my source is below. The packets start with STX byte and end with ETX bytes, the byte following the STX is the packet length, (packet lengths less than 5 are invalid) the checksum is the last bytes before the ETX def decode(data): while True: start = data.find(STX) if start == -1: #no stx in message pkt = '' data = '' break #stx found , next byte is the length pktlen = ord(data[1]) #check message ends in ETX (pktken -1) or checksum invalid if pktlen < 5 or data[pktlen-1] != ETX or checksum_valid(data[start:pktlen]) == False: print "Invalid Pkt" data = data[start+1:] continue else: pkt = data[start:pktlen] data = data[pktlen:] break return data , pkt I use it like this #process reports try: data = sock.recv(256) except: continue else: while data: data, pkt = decode(data) if pkt: process(pkt) Also if there are multiple packets in the data stream, is it best to return the packets as a collection of lists or just return the first packet I am not that familiar with python, only C, is this method OK. Any advice would be most appreciated. Thanks in advance Thanks

    Read the article

  • Reading UTF-8 XML and writing it to a file with Python

    - by Harri
    I'm trying to parse UTF-8 XML file and save some parts of it to another file. Problem is, that this is my first Python script ever and I'm totally confused about the character encoding problems I'm finding. My script fails immediately when it tries to write non-ascii character to a file, but it can print it to command prompt (at least in some level) Here's the XML (from the parts that matter at least, it's a *.resx file which contains UI strings) <?xml version="1.0" encoding="utf-8"?> <root> <resheader name="foo"> <value>bar</value> </resheader> <data name="lorem" xml:space="preserve"> <value>ipsum öä</value> </data> </root> And here's my python script from xml.dom.minidom import parse names = [] values = [] def getStrings(path): dom = parse(path) data = dom.getElementsByTagName("data") for i in range(len(data)): name = data[i].getAttribute("name") names.append(name) value = data[i].getElementsByTagName("value") values.append(value[0].firstChild.nodeValue.encode("utf-8")) def writeToFile(): with open("uiStrings-fi.py", "w") as f: for i in range(len(names)): line = names[i] + '="'+ values[i] + '"' #varName='varValue' f.write(line) f.write("\n") getStrings("ResourceFile.fi-FI.resx") writeToFile() And here's the traceback: Traceback (most recent call last): File "GenerateLanguageFiles.py", line 24, in writeToFile() File "GenerateLanguageFiles.py", line 19, in writeToFile line = names[i] + '="'+ values[i] + '"' #varName='varValue' UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2: ordinal not in ran ge(128) How should I fix my script so it would read and write UTF-8 characters properly? The files I'm trying to generate would be used in test automation with Robots Framework.

    Read the article

  • Python: puzzling behaviour inside httplib

    - by Anna
    I have added one line ( import pdb; pdb.set_trace() ) to httplib's HTTPConnection.putheader, so I can see what's going on inside. httplib.py, line 489: def putheader(self, header, value): """Send a request header line to the server. For example: h.putheader('Accept', 'text/html') """ import pdb; pdb.set_trace() if self.__state != _CS_REQ_STARTED: raise CannotSendHeader() str = '%s: %s' % (header, value) self._output(str) then ran this from the interpreter import urllib2 urllib2.urlopen('http://www.ioerror.us/ip/headers') ... and as expected PDB kicks in: > c:\python26\lib\httplib.py(858)putheader() -> if self.__state != _CS_REQ_STARTED: (Pdb) in PDB I have the luxury of evaluating expressions on the fly, so I have tried to enter self.__state: (Pdb) self.__state *** AttributeError: HTTPConnection instance has no attribute '__state' Alas, there is no __state of this instance. However when I enter step, the debugger gets past the if self.__state != _CS_REQ_STARTED: line without a problem. Why is this happening? If the self.__state doesn't exist python would have to raise an exception as it did when I entered the expression. Python version: 2.6.4 on win32

    Read the article

  • Base class deleted before subclass during python __del__ processing

    - by Oddthinking
    Context I am aware that if I ask a question about Python destructors, the standard argument will be to use contexts instead. Let me start by explaining why I am not doing that. I am writing a subclass to logging.Handler. When an instance is closed, it posts a sentinel value to a Queue.Queue. If it doesn't, a second thread will be left running forever, waiting for Queue.Queue.get() to complete. I am writing this with other developers in mind, so I don't want a failure to call close() on a handler object to cause the program to hang. Therefore, I am adding a check in __del__() to ensure the object was closed properly. I understand circular references may cause it to fail in some circumstances. There's not a lot I can do about that. Problem Here is some simple example code: explicit_delete = True class Base: def __del__(self): print "Base class cleaning up." class Sub(Base): def __del__(self): print "Sub-class cleaning up." Base.__del__(self) x = Sub() if explicit_delete: del x print "End of thread" When I run this I get, as expected: Sub-class cleaning up. Base class cleaning up. End of thread If I set explicit_delete to False in the first line, I get: End of thread Sub-class cleaning up. Exception AttributeError: "'NoneType' object has no attribute '__del__'" in <bound method Sub.__del__ of <__main__.Sub instance at 0x00F0B698>> ignored It seems the definition of Base is removed before the x._del_() is called. The Python Documentation on _del_() warns that the subclass needs to call the base-class to get a clean deletion, but here that appears to be impossible. Can you see where I made a bad step?

    Read the article

  • Python template engine

    - by jturo
    Hello there, Could it be possible if somebody could help me get started in writing a python template engine? I'm new to python and as I learn the language I've managed to write a little MVC framework running in its own light-weight-WSGI-like server. I've managed to write a script that finds and replaces keys for values: (Obviously this is not how my script is structured or implemented. this is just an example) from string import Template html = '<html>\n' html += ' <head>\n' html += ' <title>This is so Cool : In Controller HTML</title>\n' html += ' </head>\n' html += ' <body>\n' html += ' Home | <a href="/hi">Hi ${name}</a>\n' html += ' </body>\n' html += '<html>' Template(html).safe_substitute(dict(name = 'Arturo')) My next goal is to implement custom statements, modifiers, functions, etc (like 'for' loop) but i don't really know if i should use another module that i don't know about. I thought of regular expressions but i kind feel like that wouldn't be an efficient way of doing it Any help is appreciated, and i'm sure this will be of use to other people too. Thank you

    Read the article

  • Python byte per byte XOR decryption

    - by neurino
    I have an XOR encypted file by a VB.net program using this function to scramble: Public Class Crypter ... 'This Will convert String to bytes, then call the other function. Public Function Crypt(ByVal Data As String) As String Return Encoding.Default.GetString(Crypt(Encoding.Default.GetBytes(Data))) End Function 'This calls XorCrypt giving Key converted to bytes Public Function Crypt(ByVal Data() As Byte) As Byte() Return XorCrypt(Data, Encoding.Default.GetBytes(Me.Key)) End Function 'Xor Encryption. Private Function XorCrypt(ByVal Data() As Byte, ByVal Key() As Byte) As Byte() Dim i As Integer If Key.Length <> 0 Then For i = 0 To Data.Length - 1 Data(i) = Data(i) Xor Key(i Mod Key.Length) Next End If Return Data End Function End Class and saved this way: Dim Crypter As New Cryptic(Key) 'open destination file Dim objWriter As New StreamWriter(fileName) 'write crypted content objWriter.Write(Crypter.Crypt(data)) Now I have to reopen the file with Python but I have troubles getting single bytes, this is the XOR function in python: def crypto(self, data): 'crypto(self, data) -> str' return ''.join(chr((ord(x) ^ ord(y)) % 256) \ for (x, y) in izip(data.decode('utf-8'), cycle(self.key)) I had to add the % 256 since sometimes x is 256 i.e. not a single byte. This thing of two bytes being passed does not break the decryption because the key keeps "paired" with the following data. The problem is some decrypted character in the conversion is wrong. These chars are all accented letters like à, è, ì but just a few of the overall accented letters. The others are all correctly restored. I guess it could be due to the 256 mod but without it I of course get a chr exception... Thanks for your support

    Read the article

  • S3 Backup Memory Usage in Python

    - by danpalmer
    I currently use WebFaction for my hosting with the basic package that gives us 80MB of RAM. This is more than adequate for our needs at the moment, apart from our backups. We do our own backups to S3 once a day. The backup process is this: dump the database, tar.gz all the files into one backup named with the correct date of the backup, upload to S3 using the python library provided by Amazon. Unfortunately, it appears (although I don't know this for certain) that either my code for reading the file or the S3 code is loading the entire file in to memory. As the file is approximately 320MB (for today's backup) it is using about 320MB just for the backup. This causes WebFaction to quit all our processes meaning the backup doesn't happen and our site goes down. So this is the question: Is there any way to not load the whole file in to memory, or are there any other python S3 libraries that are much better with RAM usage. Ideally it needs to be about 60MB at the most! If this can't be done, how can I split the file and upload separate parts? Thanks for your help. This is the section of code (in my backup script) that caused the processes to be quit: filedata = open(filename, 'rb').read() content_type = mimetypes.guess_type(filename)[0] if not content_type: content_type = 'text/plain' print 'Uploading to S3...' response = connection.put(BUCKET_NAME, 'daily/%s' % filename, S3.S3Object(filedata), {'x-amz-acl': 'public-read', 'Content-Type': content_type})

    Read the article

< Previous Page | 69 70 71 72 73 74 75 76 77 78 79 80  | Next Page >