Search Results

Search found 14399 results on 576 pages for 'python noob'.

Page 312/576 | < Previous Page | 308 309 310 311 312 313 314 315 316 317 318 319  | Next Page >

  • Emulating a web browser

    - by Sean
    Hello, we are tasked with basically emulating a browser to fetch webpages, looking to automate tests on different web pages. This will be used for (ideally) console-ish applications that run in the background and generate reports. We tried going with .NET and the WatiN library, but it was built on a Marshalled IE, and so it lacked many features that we hacked in with calls to unmanaged native code, but at the end of the day IE is not thread safe nor process safe, and many of the needed features could only be implemented by changing registry values and it was just terribly unflexible. Proxy support JavaScript support- we have to be able to parse the actual DOM after any javascript has executed (and hopefully an event is raised to handle any ajax calls) Ability to save entire contents of page including images FROM THE loaded page's CACHE to a separate location ability to clear cookies/cache, get the cookies/cache, etc. Ability to set headers and alter post data for any browser call And for the love of drogs, an API that isn't completely cryptic Languages acceptable C++, C#, Python, anything that can be a simple little console application that doesn't have a retarded syntax like Ruby. From my own research, and believe me I am terrible at google searches, I have heard good things about WebKit... would the Qt module QtWebKit handle all these features?

    Read the article

  • Selenium RC: how to capture/handle error?

    - by KenBurnsFan1
    Hi, My test uses Selenium to loop through a CSV list of URLs via an HTTP proxy (working script below). As I watch the script run I can see about 10% of the calls produce "Proxy error: 502" ("Bad_Gateway"); however, the errors are not captured by my catch-all "except Exception" clause -- ie: instead of writing 'error' in the appropriate row of the "output.csv", they get passed to the else clause and produce a short piece of html that starts: "Proxy error: 502 Read from server failed: Unknown error." Also, if I collect all the URLs which returned 502s and re-run the script, they all pass, which leads me to believe that this is a sporadic network path issue. Question: Can the script be made to recognize the the 502 errors, sleep a minute, and then retry the URL instead of moving on to the next URL in the list? The only alternative that I can think of is to apply re.search("Proxy error: 502") after "get_html_source" as a way to catch the bad calls. Then, if the RE matches, put the script to sleep for a minute and then retry 'sel.open(row[0]' on the URL which produced the 502. Any advice would be much appreciated. Thanks! #python 2.6 from selenium import selenium import unittest, time, re, csv, logging class Untitled(unittest.TestCase): def setUp(self): self.verificationErrors = [] self.selenium = selenium("localhost", 4444, "*firefox", "http://baseDomain.com") self.selenium.start() self.selenium.set_timeout("60000") def test_untitled(self): sel = self.selenium spamReader = csv.reader(open('ListOfSubDomains.csv', 'rb')) for row in spamReader: try: sel.open(row[0]) except Exception: ofile = open('output.csv', 'ab') ofile.write("error" + '\n') ofile.close() else: time.sleep(5) html = sel.get_html_source() ofile = open('output.csv', 'ab') ofile.write(html.encode('utf-8') + '\n') ofile.close() def tearDown(self): self.selenium.stop() self.assertEqual([], self.verificationErrors) if __name__ == "__main__": unittest.main()

    Read the article

  • Set-Cookie error appearing in logs when deployed to google appengine

    - by Jesse
    I have been working towards converting one of our applications to be threadsafe. When testing on the local dev app server everything is working as expected. However, upon deployment of the application it seems that Cookies are not being written correctly? Within the logs there is an error with no stack trace: 2012-11-27 16:14:16.879 Set-Cookie: idd_SRP=Uyd7InRpbnlJZCI6ICJXNFdYQ1ZITSJ9JwpwMAou.Q6vNs9vGR-rmg0FkAa_P1PGBD94; expires=Wed, 28-Nov-2012 23:59:59 GMT; Path=/ Here is the block of code in question: # area of the code the emits the cookie cookie = Cookie.SimpleCookie() if not domain: domain = self.__domain self.__updateCookie(cookie, expires=expires, domain=domain) self.__updateSessionCookie(cookie, domain=domain) print cookie.output() Cookie helper methods: def __updateCookie(self, cookie, expires=None, domain=None): """ Takes a Cookie.SessionCookie instance an updates it with all of the private persistent cookie data, expiry and domain. @param cookie: a Cookie.SimpleCookie instance @param expires: a datetime.datetime instance to use for expiry @param domain: a string to use for the cookie domain """ cookieValue = AccountCookieManager.CookieHelper.toString(self.cookie) cookieName = str(AccountCookieManager.COOKIE_KEY % self.partner.pid) cookie[cookieName] = cookieValue cookie[cookieName]['path'] = '/' cookie[cookieName]['domain'] = domain if not expires: # set the expiry date to 1 day from now expires = datetime.date.today() + datetime.timedelta(days = 1) expiryDate = expires.strftime("%a, %d-%b-%Y 23:59:59 GMT") cookie[cookieName]['expires'] = expiryDate def __updateSessionCookie(self, cookie, domain=None): """ Takes a Cookie.SessionCookie instance an updates it with all of the private session cookie data and domain. @param cookie: a Cookie.SimpleCookie instance @param expires: a datetime.datetime instance to use for expiry @param domain: a string to use for the cookie domain """ cookieValue = AccountCookieManager.CookieHelper.toString(self.sessionCookie) cookieName = str(AccountCookieManager.SESSION_COOKIE_KEY % self.partner.pid) cookie[cookieName] = cookieValue cookie[cookieName]['path'] = '/' cookie[cookieName]['domain'] = domain Again, the libraries in use are: Python 2.7 Django 1.2 Any suggestion on what I can try?

    Read the article

  • Programmatically talking to a Serial Port in OS X or Linux

    - by deadprogrammer
    I have a Prolite LED sign that I like to set up to show scrolling search queries from a apache logs and other fun statistics. The problem is, my G5 does not have a serial port, so I have to use a usb to serial dongle. It shows up as /dev/cu.usbserial and /dev/tty.usbserial . When i do this everything seems to be hunky-dory: stty -f /dev/cu.usbserial speed 9600 baud; lflags: -icanon -isig -iexten -echo iflags: -icrnl -ixon -ixany -imaxbel -brkint oflags: -opost -onlcr -oxtabs cflags: cs8 -parenb Everything also works when I use the serial port tool to talk to it. If I run this piece of code while the above mentioned serial port tool, everthing also works. But as soon as I disconnect the tool the connection gets lost. #!/usr/bin/python import serial ser = serial.Serial('/dev/cu.usbserial', 9600, timeout=10) ser.write("<ID01><PA> \r\n") read_chars = ser.read(20) print read_chars ser.close() So the question is, what magicks do I need to perform to start talking to the serial port without the serial port tool? Is that a permissions problem? Also, what's the difference between /dev/cu.usbserial and /dev/tty.usbserial?

    Read the article

  • Download-from-PyPI-and-install script

    - by zubin71
    Hello, I have written a script which fetches a distribution, given the URL. After downloading the distribution, it compares the md5 hashes to verify that the file has been downloaded properly. This is how I do it. def download(package_name, url): import urllib2 downloader = urllib2.urlopen(url) package = downloader.read() package_file_path = os.path.join('/tmp', package_name) package_file = open(package_file_path, "w") package_file.write(package) package_file.close() I wonder if there is any better(more pythonic) way to do what I have done using the above code snippet. Also, once the package is downloaded this is what is done: def install_package(package_name): if package_name.endswith('.tar'): import tarfile tarfile.open('/tmp/' + package_name) tarfile.extract('/tmp') import shlex import subprocess installation_cmd = 'python %ssetup.py install' %('/tmp/'+package_name) subprocess.Popen(shlex.split(installation_cmd) As there are a number of imports for the install_package method, i wonder if there is a better way to do this. I`d love to have some constructive criticism and suggestions for improvement. Also, I have only implemented the install_package method for .tar files; would there be a better manner by which I could install .tar.gz and .zip files too without having to write seperate methods for each of these?

    Read the article

  • reactor not working when reactor.run is not called in the main thread and installSignalHandlers=Fals

    - by Kalmi
    I'm trying to answer the following question out of personal interest: What is the fastest way to send 100,000 HTTP requests in Python? And this is what I have came up so far, but I'm experiencing something very stange. When installSignalHandlers is True, it just hangs. I can see that the DelayedCall instances are in reactor._newTimedCalls, but processResponse never gets called. When installSignalHandlers is False, it throws an error and works. from twisted.internet import reactor from twisted.web.client import Agent from threading import Semaphore, Thread import time concurrent = 100 s = Semaphore(concurrent) reactor.suggestThreadPoolSize(concurrent) t=Thread( target=reactor.run, kwargs={'installSignalHandlers':True}) t.daemon=True t.start() agent = Agent(reactor) def processResponse(response,url): print response.code, url s.release() def processError(response,url): print "error", url s.release() def addTask(url): req = agent.request('HEAD', url) req.addCallback(processResponse, url) req.addErrback(processError, url) for url in open('urllist.txt'): addTask(url.strip()) s.acquire() while s._Semaphore__value!=concurrent: time.sleep(0.1) reactor.stop() And here is the error that it throws when installSignalHandlers is True: (Note: This is the expected behaviour! The question is why it doesn't work when installSignalHandlers is False.) Traceback (most recent call last): File "/usr/lib/python2.6/dist-packages/twisted/internet/base.py", line 396, in fireEvent DeferredList(beforeResults).addCallback(self._continueFiring) File "/usr/lib/python2.6/dist-packages/twisted/internet/defer.py", line 224, in addCallback callbackKeywords=kw) File "/usr/lib/python2.6/dist-packages/twisted/internet/defer.py", line 213, in addCallbacks self._runCallbacks() File "/usr/lib/python2.6/dist-packages/twisted/internet/defer.py", line 371, in _runCallbacks self.result = callback(self.result, *args, **kw) --- <exception caught here> --- File "/usr/lib/python2.6/dist-packages/twisted/internet/base.py", line 409, in _continueFiring callable(*args, **kwargs) File "/usr/lib/python2.6/dist-packages/twisted/internet/base.py", line 1165, in _reallyStartRunning self._handleSignals() File "/usr/lib/python2.6/dist-packages/twisted/internet/base.py", line 1105, in _handleSignals signal.signal(signal.SIGINT, self.sigInt) exceptions.ValueError: signal only works in main thread What am I doing wrong and what is the right way? I'm new to twisted.

    Read the article

  • How to determine subprocess.Popen() failed when shell=True

    - by Malcolm
    Windows version of Python 2.6.4: Is there any way to determine if subprocess.Popen() fails when using shell=True? Popen() successfully fails when shell=False >>> import subprocess >>> p = subprocess.Popen( 'Nonsense.application', shell=False ) Traceback (most recent call last): File ">>> pyshell#258", line 1, in <module> p = subprocess.Popen( 'Nonsense.application' ) File "C:\Python26\lib\subprocess.py", line 621, in __init__ errread, errwrite) File "C:\Python26\lib\subprocess.py", line 830, in _execute_child startupinfo) WindowsError: [Error 2] The system cannot find the file specified But when shell=True, there appears to be no way to determine if a Popen() call was successful or not. >>> p = subprocess.Popen( 'Nonsense.application', shell=True ) >>> p >>> subprocess.Popen object at 0x0275FF90&gt;&gt;&gt; >>> p.pid 6620 >>> p.returncode >>> Ideas appreciated. Regards, Malcolm

    Read the article

  • wxPython, Threads, and PostEvent between modules

    - by Sam Starling
    I'm relatively new to wxPython (but not Python itself), so forgive me if I've missed something here. I'm writing a GUI application, which at a very basic level consists of "Start" and "Stop" buttons that start and stop a thread. This thread is an infinite loop, which only ends when the thread is stopped. The loop generates messages, which at the moment are just output using print. The GUI class and the infinite loop (using threading.Thread as a subclass) are held in separate files. What is the best way to get the thread to push an update to something like a TextCtrl in the GUI? I've been playing around with PostEvent and Queue, but without much luck. Here's some bare bones code, with portions removed to keep it concise: main_frame.py import wx from loop import Loop class MainFrame(wx.Frame): def __init__(self, parent, title): # Initialise and show GUI # Add two buttons, btnStart and btnStop # Bind the two buttons to the following two methods self.threads = [] def onStart(self): x = Loop() x.start() self.threads.append(x) def onStop(self): for t in self.threads: t.stop() loop.py class Loop(threading.Thread): def __init__(self): self._stop = threading.Event() def run(self): while not self._stop.isSet(): print datetime.date.today() def stop(self): self._stop.set() I did, at one point, have it working by having the classes in the same file by using wx.lib.newevent.NewEvent() along these lines. If anyone could point me in the right direction, that'd be much appreciated.

    Read the article

  • Does anyone knows any Train-table-api service?

    - by DaNieL
    Hi guys, im wondering if there is any online service who provide API about the train timetables (arrivals, departure, etch..), at least for the european stations. I know www.bahn.de, who provide the accurated timetables for many european countries, but i didin't find any similar to a api service. My goal (well, just a future-project) is to develope an application like gmaps, but instead of giving the car-trip i would like to give the train-trip.. you know, the user set the departure day, time and station, then the first arrival, maybe then add another one, and so on. So, there is something like, that can be queryed by php/python/javascript? Edit: if can help, im wondering to build a service to plan an inter-rail trip! (and any help will be really appreciate) Im stuck. Google's public-service api rely on the information that every local company makes avaiable, much branches are missing (for example, is almost impossible to use it to build an itinerary that touches two or more regions of Europe. As i said before, the only service that i know doing it good is the bahn.de, they really have all the data.. but no api. I tryed to parse theyre result page (in order to use them as API), but seem like the markup is been build exactly to avoid that.. maybe they have business plans behind that or whatever, but i dont think they will ever release some API.. so, my project is going on without this function (p.s: my project is about non-profit cultural organizations, we wont make war to anyone ;P) @El Goorf: if you find a way, and consider the idea of sharing it, count on my hand if need help!

    Read the article

  • Unknown error when submit a REST request to Liferay json API

    - by r.rodriguez
    I'm writing an script in Python to automatically update the structures in my Liferay portal and I want to do it via the json REST API. I make a request to get an structure (method getStructure), and it worked. But when I try to do an structure update in the portal it shows me the following error: ValueError: Content-Length should be specified for iterable data of type class 'dict' {'serviceContext': "{'prueba'}", 'serviceClassName': 'com.liferay.portlet.journal.service.JournalStructureServiceUtil', 'name': 'FOO', 'xsd': '... THE XSD OBTAINED VIA JSON ...', 'serviceParameters': '[groupId,structureId,parentStructureId,name,description,xsd,serviceContext]', 'description': 'FOO Structure', 'serviceMethodName': 'updateStructure', 'groupId': '10133'} What I'm doing is the next: urllib.request.Request(url = URL, data = data_update, headers = headers) URL is http://localhost:8080/tunnel-web/secure/json The headers are configured with basic authentication (it works, it is tested with the getStructure method). Data is: data_update = { "serviceClassName" : "com.liferay.portlet.journal.service.JournalStructureServiceUtil", "serviceMethodName" : "updateStructure", "serviceParameters" : "[groupId,structureId,parentStructureId,name,description,xsd,serviceContext]", "groupId" : 10133, "name" : FOO, "description" : FOO Structure, "xsd" : ... THE XSD OBTAINED VIA JSON ..., "serviceContext" : "{}" } Does anybody know the solution? Have I to specify the length for the dictionary and how? Or this is a bug?

    Read the article

  • Full-text search on App Engine with Whoosh

    - by Martin
    I need to do full text searching with Google App Engine. I found the project Whoosh and it works really well, as long as I use the App Engine Development Environement... When I upload my application to App Engine, I am getting the following TraceBack. For my tests, I am using the example application provided in this project. Any idea of what I am doing wrong? <type 'exceptions.ImportError'>: cannot import name loads Traceback (most recent call last): File "/base/data/home/apps/myapp/1.334374478538362709/hello.py", line 6, in <module> from whoosh import store File "/base/data/home/apps/myapp/1.334374478538362709/whoosh/__init__.py", line 17, in <module> from whoosh.index import open_dir, create_in File "/base/data/home/apps/myapp/1.334374478538362709/whoosh/index.py", line 31, in <module> from whoosh import fields, store File "/base/data/home/apps/myapp/1.334374478538362709/whoosh/store.py", line 27, in <module> from whoosh import tables File "/base/data/home/apps/myapp/1.334374478538362709/whoosh/tables.py", line 43, in <module> from marshal import loads Here is the import I have in my Python file. # Whoosh ---------------------------------------------------------------------- sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '..', 'utils'))) from whoosh.fields import Schema, STORED, ID, KEYWORD, TEXT from whoosh.index import getdatastoreindex from whoosh.qparser import QueryParser, MultifieldParser Thank you in advance for your help!

    Read the article

  • Why is iPdb not displaying STOUT after my input?

    - by BryanWheelock
    I can't figure out why ipdb is not displaying stout properly. I'm trying to debug why a test is failing and so I attempt to use ipdb debugger. For some reason my Input seems to be accepted, but the STOUT is not displayed until I (c)ontinue. Is this something broken in ipdb? It makes it very difficult to debug a program. Below is an example ipdb session, notice how I attempt to display the values of the attributes with: user.is_authenticated() user_profile.reputation user.is_superuser The results are not displayed until 'begin captured stdout ' In [13]: !python manage.py test Creating test database... < SNIP remove loading tables nosetests ...E.. /Users/Bryan/work/APP/forum/auth.py(93)can_retag_questions() 92 import ipdb; ipdb.set_trace() ---> 93 return user.is_authenticated() and ( 94 RETAG_OTHER_QUESTIONS <= user_profile.reputation < EDIT_OTHER_POSTS or user.is_authenticated() user_profile.reputation user.is_superuser c F /Users/Bryan/work/APP/forum/auth.py(93)can_retag_questions() 92 import ipdb; ipdb.set_trace() ---> 93 return user.is_authenticated() and ( 94 RETAG_OTHER_QUESTIONS <= user_profile.reputation < EDIT_OTHER_POSTS or c .....EE...... FAIL: test_can_retag_questions (APP.forum.tests.test_views.AuthorizationFunctionsTestCase) Traceback (most recent call last): File "/Users/Bryan/work/APP/../APP/forum/tests/test_views.py", line 71, in test_can_retag_questions self.assertTrue(auth.can_retag_questions(user)) AssertionError: -------------------- begin captured stdout << --------------------- ipdb True ipdb 4001 ipdb False ipdb --------------------- end captured stdout << ---------------------- Ran 20 tests in 78.032s FAILED (errors=3, failures=1) Destroying test database... In [14]: Here is the actual test I'm trying to run: def can_retag_questions(user): """Determines if a User can retag Questions.""" user_profile = user.get_profile() import ipdb; ipdb.set_trace() return user.is_authenticated() and ( RETAG_OTHER_QUESTIONS <= user_profile.reputation < EDIT_OTHER_POSTS or user.is_superuser) I've also tried to use pdb, but that doesn't display anything. I see my test progress .... , and then nothing and not responsive to keyboard input. Is this a problem with readline?

    Read the article

  • How do I process a nested list?

    - by ddbeck
    Suppose I have a bulleted list like this: * list item 1 * list item 2 (a parent) ** list item 3 (a child of list item 2) ** list item 4 (a child of list item 2 as well) *** list item 5 (a child of list item 4 and a grand-child of list item 2) * list item 6 I'd like to parse that into a nested list or some other data structure which makes the parent-child relationship between elements explicit (rather than depending on their contents and relative position). For example, here's a list of tuples containing an item and a list of its children (and so forth): [('list item 1',), ('list item 2', [('list item 3',), [('list item 4', [('list item 5'),]] ('list item 6',)] I've attempted to do this with plain Python and some experimentation with Pyparsing, but I'm not making progress. I'm left with two major questions: What's the strategy I need to employ to make this work? I know recursion is part of the solution, but I'm having a hard time making the connection between this and, say, a Fibonacci sequence. I'm certain I'm not the first person to have done this, but I don't know the terminology of the problem to make fruitful searches for more information on this topic. What problems are related to this so that I can learn more about solving these kinds of problems in general?

    Read the article

  • Adding a font for use in ReportLab

    - by Jimmy McCarthy
    I'm trying to add a font to the python ReportLab so that I can use it for a function. The function is using canvas.Canvas to draw a bunch of text in a PDF, nothing complicated, but I need to add a fixed width font for layout issues. When I tried to register a font using what little info I could find, that seemed to work. But when I tried to call .addFont('fontname') from my Canvas object I keep getting "PDFDocument instance has no attribute 'addFont'" Is the function just not implemented? How do I get access to fonts other than the 10 or so default ones that are listed in .getAvailableFonts? Thanks. Some example code of what I'm trying to make happen: from reportlab.pdfgen import canvas c = canvas.Canvas('label.pdf') c.addFont('TestFont') #This throws the error listed above, regardless of what argument I use (whether it refers to a font or not). c.drawString(1,1,'test data here') c.showPage() c.save() To register the font, I tried from reportlab.lib.fonts import addMapping from reportlab.pdfbase import pdfmetrics pdfmetrics.registerFont(TTFont('TestFont', 'ghettomarquee.ttf')) addMapping('TestFont', 0, 0, 'TestFont') where 'ghettomarquee.ttf' was just a random font I had lying around.

    Read the article

  • Parsing/Tokenizing a String Containing a SQL Command

    - by Alan Storm
    Are there any open source libraries (any language, python/PHP preferred) that will tokenize/parse an ANSI SQL string into its various components? That is, if I had the following string SELECT a.foo, b.baz, a.bar FROM TABLE_A a LEFT JOIN TABLE_B b ON a.id = b.id WHERE baz = 'snafu'; I'd get back a data structure/object something like //fake PHPish $results['select-columns'] = Array[a.foo,b.baz,a.bar]; $results['tables'] = Array[TABLE_A,TABLE_B]; $results['table-aliases'] = Array[a=TABLE_A, b=TABLE_B]; //etc... Restated, I'm looking for the code in a database package that teases the SQL command apart so that the engine knows what to do with it. Searching the internet turns up a lot of results on how to parse a string WITH SQL. That's not what I want. I realize I could glop through an open source database's code to find what I want, but I was hoping for something a little more ready made, (although if you know where in the MySQL, PostgreSQL, SQLite source to look, feel free to pass it along) Thanks!

    Read the article

  • Sudden issues reading uncompressed video using opencv

    - by JohnSavage
    I have been using a particular pipeline to process video using opencv to encode uncompressed video (fourcc = 0), and opencv python bindings to then open and work on these files. This has been working fine for me on OpenCV 2.3.1a on Ubuntu 11.10 until just a few days ago. For some reason it currently is only allowing me to read the first frame of a given file the first time I open that file. Further frames are not read, and once I touch the file once with my program, it then cannot even read the first frame. More detail: I created the uncompressed video files as follows: out_video.open(out_vid_name, 0, // FOURCC = 0 means record raw fps, Size(640, 480)) Again, these videos worked fine for me until about a week ago. Now, when I try to open one of these I get the following message (from what I think is ffmpeg): Processing video.avi Using network protocols without global network initialization. Please use avformat_network_init(), this will become mandatory later. [avi @ 0x29251e0] parser not found for codec rawvideo, packets or times may be invalid. It reads and displays the first frame fine, but then fails to read the next frame. Then, when I try to run my code on the same video, the capture still opens with the same message as above. However, it cannot even read the very first frame. Here is the code to open the capture: self.capture = cv2.VideoCapture(filename) if not self.capture.isOpened() print "Error: could not open capture" sys.exit() Again, this part is passed without any issue, but then the break happens at: success, rgb = self.capture.read() if not success: print "error: could not read frame" return False This part breaks at the second frame on the first run of the video file, and then on the first frame on subsequent runs. I really don't know where to even begin debugging this. Please help!

    Read the article

  • Am I correctly extracting JPEG binary data from this mysqldump?

    - by Glenn
    I have a very old .sql backup of a vbulletin site that I ran around 8 years ago. I am trying to see the file attachments that are stored in the DB. The script below extracts them all and is verified to be JPEG by hex dumping and checking the SOI (start of image) and EOI (end of image) bytes (FFD8 and FFD9, respectively) according to the JPEG wiki page. But when I try to open them with evince, I get this message "Error interpreting JPEG image file (JPEG datastream contains no image)" What could be going on here? Some background info: sqldump is around 8 years old vbulletin 2.x was the software that stored the info most likely php 4 was used most likely mysql 4.0, possibly even 3.x the column datatype these attachments are stored in is mediumtext My Python 3.1 script: #!/usr/bin/env python3.1 import re trim_l = re.compile(b"""^INSERT INTO attachment VALUES\('\d+', '\d+', '\d+', '(.+)""") trim_r = re.compile(b"""(.+)', '\d+', '\d+'\);$""") extractor = re.compile(b"""^(.*(?:\.jpe?g|\.gif|\.bmp))', '(.+)$""") with open('attachments.sql', 'rb') as fh: for line in fh: data = trim_l.findall(line)[0] data = trim_r.findall(data)[0] data = extractor.findall(data) if data: name, data = data[0] try: filename = 'files/%s' % str(name, 'UTF-8') ah = open(filename, 'wb') ah.write(data) except UnicodeDecodeError: continue finally: ah.close() fh.close() update The JPEG wiki page says FF bytes are section markers, with the next byte indicating the section type. I see some that are not listed in the wiki page (specifically, I see a lot of 5C bytes, so FF5C). But the list is of "common markers" so I'm trying to find a more complete list. Any guidance here would also be appreciated.

    Read the article

  • Unittesting aspect-oriented features.

    - by Tomas Brambora
    Hi, I'd like to know what would you propose as the best way to unit test aspect-oriented application features (well, perhaps that's not the best name, but it's the best I was able to come up with :-) ) such as logging or security? These things are sort of omni-present in the application, so how to test them properly? E.g. say that I'm writing a Cherrypy web server in Python. I can use a decorator to check whether the logged-in user has the permission to access a given page. But then I'd need to write a test for every page to see whether it works oK (or more like to see that I had not forgotten to check security perms for that page). This could maybe (emphasis on maybe) be bearable if logging and/or security were implemented during the web server "normal business implementation". However, security and logging usually tend to be added to the app as an afterthough (or maybe that's just my experience, I'm usually given a server and then asked to implement security model :-) ). Any thoughts on this are very welcome. I have currently 'solved' this issue by, well - not testing this at all. Thanks.

    Read the article

  • How does one decrypt a PDF with an owner password, but no user password?

    - by Tony Meyer
    Although the PDF specification is available from Adobe, it's not exactly the simplest document to read through. PDF allows documents to be encrypted so that either a user password and/or an owner password is required to do various things with the document (display, print, etc). A common use is to lock a PDF so that end users can read it without entering any password, but a password is required to do anything else. I'm trying to parse PDFs that are locked in this way (to get the same privileges as you would get opening them in any reader). Using an empty string as the user password doesn't work, but it seems (section 3.5.2 of the spec) that there has to be a user password to create the hash for the admin password. What I would like is either an explanation of how to do this, or any code that I can read (ideally Python, C, or C++, but anything readable will do) that does this so that I can understand what I'm meant to be doing. Standalone code, rather than reading through (e.g.) the gsview source, would be best.

    Read the article

  • How can I do such a typical unittest?

    - by Malcom.Z
    This is a simple structure in my project: MyAPP--- note--- __init__.py views.py urls.py test.py models.py auth-- ... template--- auth--- login.html register.html note--- noteshow.html media--- css--- ... js--- ... settings.py urls.py __init__.py manage.py I want to make a unittest which can test the noteshow page working propeyly or not. The code: from django.test import TestCase class Note(TestCase): def test_noteshow(self): response = self.client.get('/note/') self.assertEqual(response.status_code, 200) self.assertTemplateUsed(response, '/note/noteshow.html') The problem is that my project include an auth mod, it will force the unlogin user redirecting into the login.html page when they visit the noteshow.html. So, when I run my unittest, in the bash it raise an failure that the response.status_code is always 302 instead of 200. All right though through this result I can check the auth mod is running well, it is not like what I want it to be. OK, the question is that how can I make another unittest to check my noteshow.template is used or not? Thanks for all. django version: 1.1.1 python version: 2.6.4 Use Eclipse for MAC OS

    Read the article

  • Standardizing a Release/Tools group on a specific language

    - by grahzny
    I'm part of a six-member build and release team for an embedded software company. We also support a lot of developer tools, such as Atlassian's Fisheye, Jira, etc., Perforce, Bugzilla, AnthillPro, and a couple of homebrew tools (like my Django release notes generator). Most of the time, our team just writes little plugins for larger apps (ex: customize workflows in Anthill), long-term utility scripts (package up a release for QA), or things like Perforce triggers (don't let people check into a specific branch unless their change description includes a bug number; authenticate against Active Directory instead of Perforce's internal passwords). That's about the scale of our problems, although we sometimes tackle something slightly more sizable. My boss, who is reasonably technical, has asked us to standardize on one or two languages so we can more easily substitute for each other. He's advocating bash scripts and Perl, due to their universality and simplicity. I can see his point--we mostly do "glue", so why not use "glue" languages rather than saddle ourselves with something designed for much larger projects? Since some of the tools we work with are Java-based, we do need to use something that speaks JVM sometimes. (The path of least resistance for these projects is BeanShell and Groovy.) I feel a tremendous itch toward language advocacy, but I'm trying to avoid saying "We should use Python 'cause I like it and Perl is gross." Instead, I'm trying to come up with a good approach to defining our problem set: what problems do we solve with scripts? Would we benefit from a library of common functions by our team, or are most of our projects more isolated? What is it reasonable to expect my co-workers to learn? What languages give us the most ease of development and ease of modification? Can you folks suggest some useful ways to approach this problem, both for my own thinking process and to help me facilitate some brainstorming among my coworkers?

    Read the article

  • PDF Form Field Manipulation

    - by 108039818756939362532
    I'm making a web interface to autofill pdf forms with user data from a database. The admin needs to be able to upload a pdf (right now targeted at IRS pdf forms) and then associate the fields in the pdf with data fields in the database. I need a way to help the admin associate the field names (stuff like "topmostSubform[0].Page2[0].p2-t66[0]") with the the data fields in the database. I'm looking for a way to modify the PDF programatically to in some way provide this information. Basically I'm open to suggestions on how I might make the field names appear in an obvious manner on a modified version of the original pdf. The closest I've gotten is being able to insert Tooltips into the fields in the pdf by just editting the raw pdf line by line. However when editting the pdf in this manner the field names are gibberish, and so I can't just use them. An optimal solution would be anything that could automatically parse a pdf and set each field's tooltip to be the fields name. Anything that can be run from the command line, or any python tool, or just a basic how to correctly parse a field's name from a raw pdf file would be amazing.

    Read the article

  • When to use "property" builtin: auxiliary functions and generators

    - by Seth Johnson
    I recently discovered Python's property built-in, which disguises class method getters and setters as a class's property. I'm now being tempted to use it in ways that I'm pretty sure are inappropriate. Using the property keyword is clearly the right thing to do if class A has a property _x whose allowable values you want to restrict; i.e., it would replace the getX() and setX() construction one might write in C++. But where else is it appropriate to make a function a property? For example, if you have class Vertex(object): def __init__(self): self.x = 0.0 self.y = 1.0 class Polygon(object): def __init__(self, list_of_vertices): self.vertices = list_of_vertices def get_vertex_positions(self): return zip( *( (v.x,v.y) for v in self.vertices ) ) is it appropriate to add vertex_positions = property( get_vertex_positions ) ? Is it ever ok to make a generator look like a property? Imagine if a change in our code meant that we no longer stored Polygon.vertices the same way. Would it then be ok to add this to Polygon? @property def vertices(self): for v in self._new_v_thing: yield v.calculate_equivalent_vertex()

    Read the article

  • Performing text processing on flatpage content to include handling of custom tag

    - by Dzejkob
    Hi. I'm using flatpages app in my project to manage some html content. That content will include images, so I've made a ContentImage model allowing user to upload images using admin panel. The user should then be able to include those images in content of the flatpages. He can of course do that by manually typing image url into <img> tag, but that's not what I'm looking for. To make including images more convenient, I'm thinking about something like this: User edits an additional, let's say pre_content field of CustomFlatPage model (I'm using custom flatpage model already) instead of defining <img> tags directly, he uses a custom tag, something like [img=...] where ... is name of the ContentImage instance now the hardest part: before CustomFlatPage is saved, pre_content field is checked for all [img=...] occurences and they are processed like this: ContentImage model is searched if there's image instance with given name and if so, [img=...] is replaced with proper <img> tag. flatpage actual content is filled with processed pre_content and then flatpage is saved (pre_content is leaved unchanged, as edited by user) The part that I can't cope with is text processing. Should I use regular expressions? Apparently they can be slow for large strings. And how to organize logic? I assume it's rather algorithmic question, but I'm not familliar with text processing in Python enough, to do it myself. Can somebody give me any clues?

    Read the article

  • Excel CSV into Nested Dictionary; List Comprehensions

    - by victorhooi
    heya, I have a Excel CSV files with employee records in them. Something like this: mail,first_name,surname,employee_id,manager_id,telephone_number [email protected],john,smith,503422,503423,+65(2)3423-2433 [email protected],george,brown,503097,503098,+65(2)3423-9782 .... I'm using DictReader to put this into a nested dictionary: import csv gd_extract = csv.DictReader(open('filename 20100331 original.csv'), dialect='excel') employees = dict([(row['employee_id'], row) for row in gp_extract]) Is the above the proper way to do it - it does work, but is it the Right Way? Something more efficient? Also, the funny thing is, in IDLE, if I try to print out "employees" at the shell, it seems to cause IDLE to crash (there's approximately 1051 rows). 2. Remove employee_id from inner dict The second issue issue, I'm putting it into a dictionary indexed by employee_id, with the value as a nested dictionary of all the values - however, employee_id is also a key:value inside the nested dictionary, which is a bit redundant? Is there any way to exclude it from the inner dictionary? 3. Manipulate data in comprehension Thirdly, we need do some manipulations to the imported data - for example, all the phone numbers are in the wrong format, so we need to do some regex there. Also, we need to convert manager_id to an actual manager's name, and their email address. Most managers are in the same file, while others are in an external_contractors CSV, which is similar but not quite the same format - I can import that to a separate dict though. Are these two items things that can be done within the single list comprehension, or should I use a for loop? Or does multiple comprehensions work? (sample code would be really awesome here). Or is there a smarter way in Python do it? Cheers, Victor

    Read the article

< Previous Page | 308 309 310 311 312 313 314 315 316 317 318 319  | Next Page >