I have a class which is being operated on by two functions. One function creates a list of widgets and writes it into the class:
def updateWidgets(self):
    widgets = self.generateWidgetList()
    self.widgets = widgets
The other function deals with the widgets in some way:
def workOnWidgets(self):
    for widget in self.widgets:
        self.workOnWidget(widget)
Each of these functions runs in its own thread. The question is: what happens if the updateWidgets() thread executes while the workOnWidgets() thread is running?
I am assuming that the iterator created as part of the for...in loop will keep a reference to the old self.widgets object, so I will finish iterating over the old list, but I'd love to know for sure.
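A minimal sketch of the situation, without real threads: the for loop looks up self.widgets once and then iterates over that list object, so rebinding the attribute in the middle of the loop should not affect the loop that is already running. The class name Holder and the integer "widgets" are placeholders invented for this illustration.

```python
class Holder:
    """Stand-in for the class in the question; widgets are plain ints here."""

    def __init__(self):
        self.widgets = [1, 2, 3]

    def work_on_widgets(self):
        seen = []
        for widget in self.widgets:      # the list object is looked up once, here
            seen.append(widget)
            # Simulate updateWidgets() running in the middle of the loop:
            # rebinding the attribute does not touch the list being iterated.
            self.widgets = [10, 20]
        return seen


holder = Holder()
result = holder.work_on_widgets()
print(result, holder.widgets)
```

This only covers rebinding the attribute to a new list. Mutating the same list in place (append, remove) while another thread iterates over it is a different matter.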
I need to open multiple files (2 input and 2 output files), do complex manipulations on the lines from input files and then append results at the end of 2 output files. I am currently using the following approach:
in_1 = open(input_1)
in_2 = open(input_2)
out_1 = open(output_1, "w")
out_2 = open(output_2, "w")
# Read one line from each 'in_' file
# Do many operations on the DNA sequences included in the input files
# Append one line to each 'out_' file
in_1.close()
in_2.close()
out_1.close()
out_2.close()
The files are huge (each potentially approaching 1 GB), which is why I am reading through the input files one line at a time. I am guessing that this is not a very Pythonic way to do things. :) Would the following form be good?
with open("file1") as f1:
    with open("file2") as f2: # etc.
If yes, could I do it while avoiding the highly indented code that would result? Thanks for the insights!
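One way to avoid the nesting, sketched under the assumption of Python 2.7 or 3.1 and later: a single with statement can take several context managers separated by commas. The function name copy_pairs, the temporary file names, and the upper-casing step are placeholders, not the real sequence processing.

```python
import os
import tempfile


def copy_pairs(input_1, input_2, output_1, output_2):
    """Read both inputs in lockstep and write one line to each output."""
    with open(input_1) as in_1, open(input_2) as in_2, \
            open(output_1, "w") as out_1, open(output_2, "w") as out_2:
        for line_1, line_2 in zip(in_1, in_2):
            # Placeholder for the real manipulation of the sequences.
            out_1.write(line_1.upper())
            out_2.write(line_2.upper())


# Tiny demonstration with throwaway files.
workdir = tempfile.mkdtemp()
paths = [os.path.join(workdir, name) for name in ("in1", "in2", "out1", "out2")]
for path, text in zip(paths[:2], ("acgt\nttga\n", "ggcc\naatt\n")):
    with open(path, "w") as handle:
        handle.write(text)

copy_pairs(*paths)
with open(paths[2]) as handle:
    first_output = handle.read()
```

All four files are closed when the block exits, even if an exception is raised part-way through. In Python 3, zip is lazy, so only one line per file is held in memory at a time.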
Is it possible to get the full path of the file on the user's computer being uploaded to my site?
Using os.path.abspath(fileitem.filename) simply gets me the address of where my script is executing from on my shared hosting server.
FYI: fileitem = form['file'] and form = cgi.FieldStorage()
Is there a convenient way to calculate percentiles for a sequence or single-dimensional numpy array?
I am looking for something similar to Excel's percentile function.
I looked in NumPy's statistics reference, and couldn't find this. All I could find is the median (50th percentile), but not something more specific.
I'm downloading a long list of my email subject lines, with the intent of finding email lists that I was a member of years ago and purging them from my Gmail account (which is getting pretty slow).
I'm specifically thinking of newsletters that often come from the same address, and repeat the product/service/group's name in the subject.
I'm aware that I could search/sort by the common occurrence of items from a particular email address (and I intend to), but I'd like to correlate that data with repeating subject lines....
Now, many subject lines would fail a string match, but
"Google Friends : Our latest news"
"Google Friends : What we're doing today"
are more similar to each other than a random subject line, as is:
"Virgin Airlines has a great sale today"
"Take a flight with Virgin Airlines"
So, how can I start to automagically extract trends/examples of strings that may be more similar to each other?
Approaches I've considered and discarded ('because there must be some better way'):
Extracting all the possible substrings and ordering them by how often they show up, and manually selecting relevant ones
Stripping off the first word or two and then counting the occurrences of each substring
Comparing Levenshtein distance between entries
Some sort of string similarity index ...
Most of these were rejected for massive inefficiency or the likelihood of a vast amount of manual intervention being required. I guess I need some sort of fuzzy string matching..?
In the end, I can think of kludgy ways of doing this, but I'm looking for something more generic so I've added to my set of tools rather than special casing for this data set.
After this, I'd be matching the occurrence of particular subject strings with 'From' addresses. I'm not sure if there's a good way of building a data structure that represents how likely two messages are to be part of the 'same email list', or of filtering all my email subjects/from addresses into pools of likely 'related' emails and not, but that's a problem to solve after this one.
Any guidance would be appreciated.
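As one possible starting point (a sketch, not a recommendation of the best method), word-set overlap is cheap and tolerates differing word order. The example below uses Jaccard similarity with a greedy grouping pass. The function names, the 0.2 threshold, and the choice to compare only against the first member of each group are assumptions made for illustration.

```python
import re


def tokens(subject):
    """Lower-cased set of words in a subject line (punctuation dropped)."""
    return set(re.findall(r"[a-z0-9']+", subject.lower()))


def jaccard(a, b):
    """Shared words divided by total distinct words, from 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def group_subjects(subjects, threshold=0.2):
    """Greedy grouping: join the first group whose first member is similar enough."""
    groups = []
    for subject in subjects:
        for group in groups:
            if jaccard(subject, group[0]) >= threshold:
                group.append(subject)
                break
        else:
            groups.append([subject])
    return groups


subjects = [
    "Google Friends : Our latest news",
    "Virgin Airlines has a great sale today",
    "Google Friends : What we're doing today",
    "Take a flight with Virgin Airlines",
]
groups = group_subjects(subjects)
for group in groups:
    print(group)
```

With the four subject lines from the question this yields two groups, one per sender. On a real mailbox the threshold would need tuning, and common words ("a", "the", "today") would probably have to be down-weighted or removed.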
I have two functions:
def f(a,b,c=g(b)):
    blabla
def g(n):
    blabla
c is an optional argument in function f. If the user does not specify its value, the program should compute g(b) and use that as the value of c. But the code fails: it says name 'b' is not defined. How do I fix that?
Someone suggested:
def g(b):
    blabla
def f(a,b,c=None):
    if c is None:
        c = g(b)
    blabla
But this doesn't work, because maybe the user intended c to be None and then c will have another value.
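If None has to remain a legitimate value for c, one common workaround is a private sentinel object as the default. A minimal sketch, assuming placeholder bodies for f and g (the name _MISSING is made up here):

```python
_MISSING = object()   # unique sentinel; a caller cannot pass it by accident


def g(n):
    return n * 2      # placeholder body


def f(a, b, c=_MISSING):
    if c is _MISSING:  # identity check: true only when c was not supplied
        c = g(b)
    return a, b, c


print(f(1, 2))          # c computed from b
print(f(1, 2, None))    # explicit None is preserved
```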
I have to create an "Expires" value 5 minutes in the future, but I have to supply it in UNIX Timestamp format. I have this so far, but it seems like a hack.
import datetime
def expires():
    '''return a UNIX style timestamp representing 5 minutes from now'''
    epoch = datetime.datetime(1970, 1, 1)
    seconds_in_a_day = 60 * 60 * 24
    five_minutes = datetime.timedelta(seconds=5*60)
    # utcnow() rather than now(): the epoch above is expressed in UTC
    five_minutes_from_now = datetime.datetime.utcnow() + five_minutes
    since_epoch = five_minutes_from_now - epoch
    return since_epoch.days * seconds_in_a_day + since_epoch.seconds
Is there a module or function that does the timestamp conversion for me?
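For comparison, a shorter sketch using the standard time module: time.time() already returns seconds since the UNIX epoch, so no datetime arithmetic is needed. The minutes parameter is an addition for illustration.

```python
import time


def expires(minutes=5):
    """Return a UNIX timestamp (whole seconds) `minutes` from now."""
    return int(time.time()) + minutes * 60


stamp = expires()
print(stamp)
```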
I had to do a heavy I/O-bound operation, i.e. parsing large files and converting them from one format to another. Initially I did it serially, i.e. parsing one file after another. Performance was very poor (it used to take 90+ seconds), so I decided to use threading to improve it. I created one thread for each file (4 threads):
ts = []
for file in file_list:
    # args must be a tuple, hence the trailing comma
    t = threading.Thread(target=self.convertfile, args=(file,))
    t.start()
    ts.append(t)
for t in ts:
    t.join()
To my astonishment, there is no performance improvement whatsoever. It still takes around 90+ seconds to complete the task. As this is an I/O-bound operation, I had expected the performance to improve. What am I doing wrong?
Say I've got a matrix that looks like:
[[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
How can I print it on separate lines:
[[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]]
and then remove the commas etc.:
0 0 0 0 0
And also make it blank instead of 0s, so that numbers can be put in later, so that in the end it will look like:
_ 1 2 _ 1 _ 1
(spaces not underscores)
Thanks
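A small sketch of the formatting described above: one line per row, cells separated by spaces, zeros replaced by a blank. The function name format_matrix and the blank parameter are made up for this example, and the underscore is used only so the gaps are visible.

```python
def format_matrix(matrix, blank=" "):
    """One text line per row; zeros are shown as `blank`, cells separated by spaces."""
    lines = []
    for row in matrix:
        cells = [blank if value == 0 else str(value) for value in row]
        lines.append(" ".join(cells))
    return "\n".join(lines)


matrix = [[0, 1, 2, 0, 1], [0, 0, 0, 0, 0]]
text = format_matrix(matrix, blank="_")
print(text)
```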
I have a request that maps to this class, ChatMsg.
It takes in 3 GET variables: username, roomname, and msg. But it fails on the last line shown here.
class ChatMsg(webapp.RequestHandler): # this is line 239
    def get(self):
        username = urllib.unquote(self.request.get('username'))
        roomname = urllib.unquote(self.request.get('roomname')) # this is line 242
When it tries to assign roomname, it tells me:
<type 'exceptions.NameError'>: name 'self' is not defined
Traceback (most recent call last):
  File "/base/data/home/apps/chatboxes/1.341998073649951735/chatroom.py", line 239, in <module>
    class ChatMsg(webapp.RequestHandler):
  File "/base/data/home/apps/chatboxes/1.341998073649951735/chatroom.py", line 242, in ChatMsg
    roomname = urllib.unquote(self.request.get('roomname'))
What the hell is going on to make self not defined?
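The traceback reports line 242 as running "in ChatMsg", i.e. directly in the class body rather than inside get(). One possible cause (an assumption, since the post does not show the raw whitespace) is mixed tabs and spaces making that line belong to the class instead of the method. The sketch below reproduces the same error with a hypothetical Demo class:

```python
def build_broken_class():
    """Define a class whose body (not a method) refers to self."""
    try:
        class Demo(object):
            def get(self):
                return "ok"
            # Dedented to class level: this runs while the class is being
            # defined, when no name `self` exists.
            roomname = self.request
    except NameError as error:
        return str(error)
    return None


message = build_broken_class()
print(message)
```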
Hi,
I'm trying to modify Guido's multimethod (dynamic dispatch code):
http://www.artima.com/weblogs/viewpost.jsp?thread=101605
to handle inheritance and possibly out of order arguments.
e.g. (inheritance problem)
class A(object):
    pass
class B(A):
    pass
@multimethod(A,A)
def foo(arg1,arg2):
    print 'works'
foo(A(),A()) #works
foo(A(),B()) #fails
Is there a better way than iteratively checking for the super() of each item until one is found?
e.g. (argument ordering problem)
I was thinking of this from a collision detection standpoint.
e.g.
foo(Car(),Truck()) and
foo(Truck(), Car())
should both trigger
foo(Car,Truck) # Note: @multimethod(Truck,Car) will throw an exception if @multimethod(Car,Truck) was registered first?
I'm looking specifically for an 'elegant' solution. I know that I could just brute force my way through all the possibilities, but I'm trying to avoid that. I just wanted to get some input/ideas before sitting down and pounding out a solution.
Thanks
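For the inheritance half of the question, one possible sketch is to walk the method resolution order (__mro__) of each argument's type and try every combination until a registered signature is found. The names MultiMethod and _registry are loosely modeled on the linked post but written here for illustration, in Python 3. The lookup order is simply the order itertools.product produces, which is not a true "most specific match" rule, and argument reordering is not handled.

```python
import itertools

_registry = {}   # function name -> MultiMethod (hypothetical module-level store)


class MultiMethod:
    def __init__(self, name):
        self.name = name
        self.typemap = {}

    def register(self, types, function):
        if types in self.typemap:
            raise TypeError("duplicate registration for %r" % (types,))
        self.typemap[types] = function

    def __call__(self, *args):
        # Each argument contributes its own class followed by its base classes.
        mros = [type(arg).__mro__ for arg in args]
        for candidate in itertools.product(*mros):
            function = self.typemap.get(candidate)
            if function is not None:
                return function(*args)
        raise TypeError("no match for %s" % self.name)


def multimethod(*types):
    def register(function):
        mm = _registry.setdefault(function.__name__, MultiMethod(function.__name__))
        mm.register(types, function)
        return mm
    return register


class A(object):
    pass


class B(A):
    pass


@multimethod(A, A)
def foo(arg1, arg2):
    return "works"


result = foo(A(), B())   # B falls back to its base class A
print(result)
```

Caching the resolved signature in typemap after the first successful lookup would avoid repeating the search on every call.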
When I try to write a field that includes whitespace in it, it gets split into multiple fields on the space. What's causing this? It's driving me insane. Thanks
import csv
data = open("file.csv", "wb")
w = csv.writer(data)
w.writerow(['word1', 'word2'])
w.writerow(['word 1', 'word2'])
data.close()
I get 2 fields (word1, word2) for the first row and 3 (word, 1, word2) for the second.
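A quick way to check whether the csv module itself is doing the splitting is to write to an in-memory buffer and read the text back. This sketch uses Python 3 and io.StringIO instead of a real file. If the round trip gives two fields per row, the split is probably happening in whatever program opens the file afterwards (for example, an import dialog that treats spaces as delimiters), which is an assumption about the setup.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["word1", "word2"])
writer.writerow(["word 1", "word2"])

raw_text = buffer.getvalue()                      # exactly what would land in the file
rows = list(csv.reader(io.StringIO(raw_text)))    # parse it back with the same module
print(repr(raw_text))
print(rows)
```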
Today I was bitten again by "Mutable default arguments" after many years. I usually don't use mutable default arguments unless needed but I think with time I forgot about that, and today in the application I added tocElements=[] in a pdf generation function's argument list and now 'Table of Content' gets longer and longer after each invocation of "generate pdf" :)
My question is: what other things should I add to my list of things I MUST avoid?
1 Mutable default arguments
2 Import modules the same way every time, e.g. 'from y import x' and 'import x' are totally different things; they are actually treated as different modules
see http://stackoverflow.com/questions/1459236/module-reimported-if-imported-from-different-path
3 Do not use range in place of lists, because range() will become an iterator anyway, so comparisons like the following will fail; wrap it in list()
myIndexList = [0,1,3]
isListSorted = myIndexList == range(3) # will fail in 3.0
isListSorted = myIndexList == list(range(3)) # will not
The same mistake can be made with xrange, e.g. myIndexList == xrange(3).
4 Catching multiple exceptions
try:
    raise KeyError("hmm bug")
except KeyError,TypeError:
    print TypeError
It prints "hmm bug", though it is not a bug. It looks like we are catching exceptions of both types KeyError and TypeError, but instead we are catching only KeyError and binding it to the name TypeError. Instead use:
try:
    raise KeyError("hmm bug")
except (KeyError,TypeError):
    print TypeError
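For item 1, a minimal sketch of the behaviour and the usual None-default workaround. The function names and the "table of contents" entries are invented to echo the tocElements story above.

```python
def add_entry_shared(entry, toc=[]):      # the default list is created once, at def time
    toc.append(entry)
    return toc


def add_entry_fresh(entry, toc=None):     # None signals "make a new list for this call"
    if toc is None:
        toc = []
    toc.append(entry)
    return toc


shared_first = list(add_entry_shared("intro"))    # copy, to record the state after call 1
shared_second = list(add_entry_shared("usage"))   # the same default list has grown
fresh_first = add_entry_fresh("intro")
fresh_second = add_entry_fresh("usage")
print(shared_first, shared_second, fresh_first, fresh_second)
```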
Hello, I have an HTML file. I have to replace every piece of text of the form [%anytext%]. As I understand it, parsing the HTML is very easy with BeautifulSoup. But what regular expression should I use, and how do I remove the text and write the data back?
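A minimal sketch of the regular-expression part, assuming the markers never nest and that every match should be swapped for the same replacement string. The names PLACEHOLDER and replace_placeholders and the sample HTML are made up for illustration.

```python
import re

# \[% and %\] match the literal delimiters; .*? is non-greedy so each
# marker is matched separately instead of everything between the first
# "[%" and the last "%]".
PLACEHOLDER = re.compile(r"\[%(.*?)%\]", re.DOTALL)


def replace_placeholders(html, replacement=""):
    return PLACEHOLDER.sub(replacement, html)


html = "<p>Hello [%name%], your order [%order id%] shipped.</p>"
cleaned = replace_placeholders(html, "X")
print(cleaned)
```

To write the result back, the cleaned string can be saved with an ordinary open(path, "w") and write().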
Having a class like this:
class Spam(object):
    def __init__(self, name=''):
        self.name = name
eggs = Spam('systempuntoout')
Using dis, is it possible to see how an instance of a class and the respective hex identity are created?
So let's say I have an incredibly nested iterable of lists/dictionaries. I would like to print them to a file as easily as possible. Why can't I just redirect print to a file?
val = print(arg)
gets a SyntaxError.
Is there a way to access stdinput?
And why does print take forever with massive strings? Bad programming on my side for outputting massive strings, but quick debugging--and isn't that leveraging the strength of an interactive prompt?
There's probably also an easier way than my gripe. Has the hive-mind an answer?
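One possible approach, sketched in Python 3: both pprint.pprint and the print() function accept a target stream, so nested data can go straight to a file. An in-memory io.StringIO stands in for the file here, and the nested value is a made-up example.

```python
import io
import pprint

nested = {"a": [1, 2, {"b": (3, 4)}], "c": {"d": []}}

# Any object with a write() method works as the stream,
# for example the result of open("dump.txt", "w").
stream = io.StringIO()
pprint.pprint(nested, stream=stream)   # pretty-printed, one structure per call
print("done", file=stream)             # print() takes a file argument as well

dumped = stream.getvalue()
```

In Python 2, print is a statement, which is why val = print(arg) is a SyntaxError there. from __future__ import print_function makes the function form available from 2.6 onward.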
from google.appengine.api import users
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
class MainPage(webapp.RequestHandler):
    def get(self):
        user = users.get_current_user()
        if user:
            self.response.headers['Content-Type'] = 'text/plain'
            self.response.out.write('Hello, ' + user.nickname())
        else:
            self.redirect(users.create_login_url(self.request.uri))
application = webapp.WSGIApplication(
    [('/', MainPage)],
    debug=True)
def main():
    run_wsgi_app(application)
if __name__ == "__main__":
    main()
I don't understand how these lines work:
if user:
    self.response.headers['Content-Type'] = 'text/plain'
    self.response.out.write('Hello, ' + user.nickname())
else:
    self.redirect(users.create_login_url(self.request.uri))
I'm guessing that users.get_current_user() returns a boolean? If that is the case, how can it have a .nickname() method?
Thanks for the guidance.
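The if statement does not need a boolean: Python tests the truthiness of any object, and None counts as false while an ordinary object counts as true. A minimal sketch with a hypothetical User class and a made-up get_current_user(logged_in) helper standing in for the App Engine call:

```python
class User:
    """Hypothetical stand-in for the object the users API hands back."""

    def __init__(self, nickname):
        self._nickname = nickname

    def nickname(self):
        return self._nickname


def get_current_user(logged_in):
    """Return a User when someone is logged in, otherwise None."""
    return User("alice") if logged_in else None


def greet(logged_in):
    user = get_current_user(logged_in)
    if user:                      # an ordinary object is truthy; None is falsy
        return "Hello, " + user.nickname()
    return "redirect to login"


print(greet(True))
print(greet(False))
```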
Hi, I am working with Scrapy and trying XML feeds for the first time. Below is my code:
class TestxmlItemSpider(XMLFeedSpider):
    name = "TestxmlItem"
    allowed_domains = {"http://www.nasinteractive.com"}
    start_urls = [
        "http://www.nasinteractive.com/jobexport/advance/hcantexasexport.xml"
    ]
    iterator = 'iternodes'
    itertag = 'job'
    def parse_node(self, response, node):
        title = node.select('title/text()').extract()
        job_code = node.select('job-code/text()').extract()
        detail_url = node.select('detail-url/text()').extract()
        category = node.select('job-category/text()').extract()
        print title,";;;;;;;;;;;;;;;;;;;;;"
        print job_code,";;;;;;;;;;;;;;;;;;;;;"
        item = TestxmlItem()
        item['title'] = node.select('title/text()').extract()
        .......
        return item
result:
  File "/usr/lib/python2.7/site-packages/Scrapy-0.14.3-py2.7.egg/scrapy/item.py", line 56, in __setitem__
    (self.__class__.__name__, key))
exceptions.KeyError: 'TestxmlItem does not support field: title'
In total there are 200+ items, so I need to loop over them and assign the node text to each item,
but here all the results are displayed at once when we print. How can we actually loop over the nodes when scraping XML files with XMLFeedSpider?
I have this xml model.
link text
So I have to add some nodes (see the commented text) to this file.
How can I do it?
I have written this partial code, but it doesn't work:
xmldoc=minidom.parse(directory)
child = xmldoc.createElement("map")
for node in xmldoc.getElementsByTagName("Environment"):
    node.appendChild(child)
Thanks in advance.
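A minimal sketch of appending a new element with minidom. The SOURCE string is a made-up stand-in for the real file, which is not shown in the post. Two things differ from the partial code above: a fresh element is created for each parent, and the result is serialized, since the original snippet never writes the modified document anywhere.

```python
from xml.dom import minidom

SOURCE = "<World><Environment/><Environment/></World>"   # made-up stand-in document

xmldoc = minidom.parseString(SOURCE)
for node in xmldoc.getElementsByTagName("Environment"):
    # Create a fresh element per parent: appending one element to several
    # parents just moves it, so it would end up only under the last one.
    child = xmldoc.createElement("map")
    node.appendChild(child)

result = xmldoc.documentElement.toxml()
print(result)
```

To save the change to disk, the document's toxml() output can be written to a file opened for writing.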
Hello everyone. My question is whether we can assign/bind some value to a certain item and hide that value (or do the same thing in another way).
Example: Let's say the columns on the ListCtrl are "Name" and "Description":
self.lc = wx.ListCtrl(self, -1, style=wx.LC_REPORT)
self.lc.InsertColumn(0, 'Name')
self.lc.InsertColumn(1, 'Description')
And when I add an item I want it to show the Name parameter and the description:
num_items = self.lc.GetItemCount()
self.lc.InsertStringItem(num_items, "Randomname")
self.lc.SetStringItem(num_items, 1, "Some description here")
Now what I want to do is basically assign something to that item that is not shown, so I can access it later in the app.
So I would like to add something that is not shown in the app but is stored with the item, like:
hiddendescription = "Somerandomthing"
Still not clear? Well, let's say I add a button that adds an item, along with some TextCtrls to set the parameters, and the TextCtrl parameters are:
"Name"
"Description"
"Hiddendescription"
So then the user fills these TextCtrls out and clicks the button to create the item, and I basically want to show only the Name and Description and hide the "HiddenDescription", but in a way that lets me use it later.
Sorry for explaining this more than once in this post, but I want to make sure you understand what I intend to do.
I'm not even sure what the right words are to search for. I want to display parts of the error object in an except block (similar to the err object in VBScript, which has Err.Number and Err.Description). For example, I want to show the values of my variables, then show the exact error. Clearly, I am causing a divided-by-zero error below, but how can I print that fact?
try:
    x = 0
    y = 1
    z = y / x
    z = z + 1
    print "z=%d" % (z)
except:
    print "Values at Exception: x=%d y=%d " % (x,y)
    print "The error was on line ..."
    print "The reason for the error was ..."
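A sketch of one way to get at those details, written in Python 3 syntax: bind the exception with "as", and pull the line number from the traceback via the standard sys and traceback modules. The function name describe_failure and the dictionary layout are made up for illustration.

```python
import sys
import traceback


def describe_failure(x, y):
    try:
        z = y / x
        return "z=%d" % z
    except Exception as error:
        _, _, tb = sys.exc_info()
        # The last traceback entry is where the exception was raised;
        # index 1 of that entry is the line number.
        line_number = traceback.extract_tb(tb)[-1][1]
        return {
            "values": "x=%d y=%d" % (x, y),
            "type": type(error).__name__,
            "reason": str(error),
            "line": line_number,
        }


report = describe_failure(0, 1)
print(report)
```

traceback.format_exc() returns the whole traceback as a string, if the full text is wanted instead of individual pieces.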
Is there a simple way to get the default thumbnail from a youtube entry object gdata.youtube.YouTubeVideoEntry?
I tried entry.media.thumbnail, but that gives me four thumbnail objects. Can I always trust that there are four? Can I know which is the default thumbnail that also appears on the YouTube search page? And how would I get that one? Or do I have to alter one of the other ones?
When I know the video_id I use:
http://i4.ytimg.com/vi/{{video_id}}/default.jpg
so, it would also be helpful to get the video_id.
Do I really have to parse one of the URLs to get at the video_id? It seems strange that they don't provide this information directly.
I know how to override an object's __getattr__() to handle calls to undefined object functions. However, I would like to achieve the same behavior for the builtin getattr() function. For instance, consider code like this:
call_some_undefined_function()
Normally, that simply produces an error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'call_some_undefined_function' is not defined
I want to override getattr() so that I can intercept the call to "call_some_undefined_function()" and figure out what to do.
Is this possible?
Thanks,
--Steve