cprofile - Developer IT

Using cProfile results with KCacheGrind

- by Adam Luchjenbroers

I'm using cProfile to profile my Python program. Based upon this talk I was under the impression that KCacheGrind could parse and display the output from cProfile. However, when I go to import the file, KCacheGrind just displays an 'Unknown File Format' error in the status bar and sits there displaying nothing. Is there something special I need to do before my profiling stats are compatible with KCacheGrind? ... if profile: import cProfile profileFileName = 'Profiles/pythonray_' + time.strftime('%Y%m%d_%H%M%S') + '.profile' profile = cProfile.Profile() profile.run('pilImage = camera.render(scene, samplePattern)') profile.dump_stats(profileFileName) profile.print_stats() else: pilImage = camera.render(scene, samplePattern) ... Package Versions KCacheGrind 4.3.1 Python 2.6.2

Read the article

Where is my python script spending time? Is there "missing time" in my cprofile / pstats trace?

- by fmark

I am attempting to profile a long running python script. The script does some spatial analysis on raster GIS data set using the gdal module. The script currently uses three files, the main script which loops over the raster pixels called find_pixel_pairs.py, a simple cache in lrucache.py and some misc classes in utils.py. I have profiled the code on a moderate sized dataset. pstats returns: p.sort_stats('cumulative').print_stats(20) Thu May 6 19:16:50 2010 phes.profile 355483738 function calls in 11644.421 CPU seconds Ordered by: cumulative time List reduced from 86 to 20 due to restriction <20> ncalls tottime percall cumtime percall filename:lineno(function) 1 0.008 0.008 11644.421 11644.421 <string>:1(<module>) 1 11064.926 11064.926 11644.413 11644.413 find_pixel_pairs.py:49(phes) 340135349 544.143 0.000 572.481 0.000 utils.py:173(extent_iterator) 8831020 18.492 0.000 18.492 0.000 {range} 231922 3.414 0.000 8.128 0.000 utils.py:152(get_block_in_bands) 142739 1.303 0.000 4.173 0.000 utils.py:97(search_extent_rect) 745181 1.936 0.000 2.500 0.000 find_pixel_pairs.py:40(is_no_data) 285478 1.801 0.000 2.271 0.000 utils.py:98(intify) 231922 1.198 0.000 2.013 0.000 utils.py:116(block_to_pixel_extent) 695766 1.990 0.000 1.990 0.000 lrucache.py:42(get) 1213166 1.265 0.000 1.265 0.000 {min} 1031737 1.034 0.000 1.034 0.000 {isinstance} 142740 0.563 0.000 0.909 0.000 utils.py:122(find_block_extent) 463844 0.611 0.000 0.611 0.000 utils.py:112(block_to_pixel_coord) 745274 0.565 0.000 0.565 0.000 {method 'append' of 'list' objects} 285478 0.346 0.000 0.346 0.000 {max} 285480 0.346 0.000 0.346 0.000 utils.py:109(pixel_coord_to_block_coord) 324 0.002 0.000 0.188 0.001 utils.py:27(__init__) 324 0.016 0.000 0.186 0.001 gdal.py:848(ReadAsArray) 1 0.000 0.000 0.160 0.160 utils.py:50(__init__) The top two calls contain the main loop - the entire analyis. The remaining calls sum to less than 625 of the 11644 seconds. Where are the remaining 11,000 seconds spent? Is it all within the main loop of find_pixel_pairs.py? If so, can I find out which lines of code are taking most of the time?

Read the article

Python Profiling In Windows, How do you ignore Builtin Functions

- by Tim McJilton

I have not been capable of finding this anywhere online. I was looking to find out using a profiler how to better optimize my code, and when sorting by which functions use up the most time cumulatively, things like str(), print, and other similar widely used functions eat up much of the profile. What is the best way to profile a python program to get the user-defined functions only to see what areas of their code they can optimize? I hope that makes sense, any light you can shed on this subject would be very appreciated.

Read the article

Optimizing code using PIL

- by freakazo

Firstly sorry for the long piece of code pasted below. This is my first time actually having to worry about performance of an application so I haven't really ever worried about performance. This piece of code pretty much searches for an image inside another image, it takes 30 seconds to run on my computer, converting the images to greyscale and other changes shaved of 15 seconds, I need another 15 shaved off. I did read a bunch of pages and looked at examples but I couldn't find the same problems in my code. So any help would be greatly appreciated. From the looks of it (cProfile) 25 seconds is spent within the Image module, and only 5 seconds in my code. from PIL import Image import os, ImageGrab, pdb, time, win32api, win32con import cProfile def GetImage(name): name = name + '.bmp' try: print(os.path.join(os.getcwd(),"Images",name)) image = Image.open(os.path.join(os.getcwd(),"Images",name)) except: print('error opening image;', name) return image def Find(name): image = GetImage(name) imagebbox = image.getbbox() screen = ImageGrab.grab() #screen = Image.open(os.path.join(os.getcwd(),"Images","Untitled.bmp")) YLimit = screen.getbbox()[3] - imagebbox[3] XLimit = screen.getbbox()[2] - imagebbox[2] image = image.convert("L") Screen = screen.convert("L") Screen.load() image.load() #print(XLimit, YLimit) Found = False image = image.getdata() for y in range(0,YLimit): for x in range(0,XLimit): BoxCoordinates = x, y, x+imagebbox[2], y+imagebbox[3] ScreenGrab = screen.crop(BoxCoordinates) ScreenGrab = ScreenGrab.getdata() if image == ScreenGrab: Found = True #print("woop") return x,y if Found == False: return "Not Found" cProfile.run('print(Find("Login"))')

Read the article

How to optimize my PageRank calculation?

- by asmaier

In the book Programming Collective Intelligence I found the following function to compute the PageRank: def calculatepagerank(self,iterations=20): # clear out the current PageRank tables self.con.execute("drop table if exists pagerank") self.con.execute("create table pagerank(urlid primary key,score)") self.con.execute("create index prankidx on pagerank(urlid)") # initialize every url with a PageRank of 1.0 self.con.execute("insert into pagerank select rowid,1.0 from urllist") self.dbcommit() for i in range(iterations): print "Iteration %d" % i for (urlid,) in self.con.execute("select rowid from urllist"): pr=0.15 # Loop through all the pages that link to this one for (linker,) in self.con.execute("select distinct fromid from link where toid=%d" % urlid): # Get the PageRank of the linker linkingpr=self.con.execute("select score from pagerank where urlid=%d" % linker).fetchone()[0] # Get the total number of links from the linker linkingcount=self.con.execute("select count(*) from link where fromid=%d" % linker).fetchone()[0] pr+=0.85*(linkingpr/linkingcount) self.con.execute("update pagerank set score=%f where urlid=%d" % (pr,urlid)) self.dbcommit() However, this function is very slow, because of all the SQL queries in every iteration >>> import cProfile >>> cProfile.run("crawler.calculatepagerank()") 2262510 function calls in 136.006 CPU seconds Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 0.000 0.000 136.006 136.006 <string>:1(<module>) 1 20.826 20.826 136.006 136.006 searchengine.py:179(calculatepagerank) 21 0.000 0.000 0.528 0.025 searchengine.py:27(dbcommit) 21 0.528 0.025 0.528 0.025 {method 'commit' of 'sqlite3.Connecti 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler 1339864 112.602 0.000 112.602 0.000 {method 'execute' of 'sqlite3.Connec 922600 2.050 0.000 2.050 0.000 {method 'fetchone' of 'sqlite3.Cursor' 1 0.000 0.000 0.000 0.000 {range} So I optimized the function and came up with this: def calculatepagerank2(self,iterations=20): # clear out the current PageRank tables self.con.execute("drop table if exists pagerank") self.con.execute("create table pagerank(urlid primary key,score)") self.con.execute("create index prankidx on pagerank(urlid)") # initialize every url with a PageRank of 1.0 self.con.execute("insert into pagerank select rowid,1.0 from urllist") self.dbcommit() inlinks={} numoutlinks={} pagerank={} for (urlid,) in self.con.execute("select rowid from urllist"): inlinks[urlid]=[] numoutlinks[urlid]=0 # Initialize pagerank vector with 1.0 pagerank[urlid]=1.0 # Loop through all the pages that link to this one for (inlink,) in self.con.execute("select distinct fromid from link where toid=%d" % urlid): inlinks[urlid].append(inlink) # get number of outgoing links from a page numoutlinks[urlid]=self.con.execute("select count(*) from link where fromid=%d" % urlid).fetchone()[0] for i in range(iterations): print "Iteration %d" % i for urlid in pagerank: pr=0.15 for link in inlinks[urlid]: linkpr=pagerank[link] linkcount=numoutlinks[link] pr+=0.85*(linkpr/linkcount) pagerank[urlid]=pr for urlid in pagerank: self.con.execute("update pagerank set score=%f where urlid=%d" % (pagerank[urlid],urlid)) self.dbcommit() This function is 20 times faster (but uses a lot more memory for all the temporary dictionaries) because it avoids the unnecessary SQL queries in every iteration: >>> cProfile.run("crawler.calculatepagerank2()") 64802 function calls in 6.950 CPU seconds Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 0.004 0.004 6.950 6.950 <string>:1(<module>) 1 1.004 1.004 6.946 6.946 searchengine.py:207(calculatepagerank2 2 0.000 0.000 0.104 0.052 searchengine.py:27(dbcommit) 23065 0.012 0.000 0.012 0.000 {meth 'append' of 'list' objects} 2 0.104 0.052 0.104 0.052 {meth 'commit' of 'sqlite3.Connection 1 0.000 0.000 0.000 0.000 {meth 'disable' of '_lsprof.Profiler' 31298 5.809 0.000 5.809 0.000 {meth 'execute' of 'sqlite3.Connectio 10431 0.018 0.000 0.018 0.000 {method 'fetchone' of 'sqlite3.Cursor' 1 0.000 0.000 0.000 0.000 {range} But is it possible to further reduce the number of SQL queries to speed up the function even more?

Read the article

How can you get the call tree with python profilers?

- by Oliver

I used to use a nice Apple profiler that is built into the System Monitor application. As long as your C++ code was compiled with debug information, you could sample your running application and it would print out an indented tree telling you what percent of the parent function's time was spent in this function (and the body vs. other function calls). For instance, if main called function_1 and function_2, function_2 calls function_3, and then main calls function_3: main (100%, 1% in function body): function_1 (9%, 9% in function body): function_2 (90%, 85% in function body): function_3 (100%, 100% in function body) function_3 (1%, 1% in function body) I would see this and think, "Something is taking a long time in the code in the body of function_2. If I want my program to be faster, that's where I should start." Does anyone know how I can most easily get this exact profiling output for a python program? I've seen people say to do this: import cProfile, pstats prof = cProfile.Profile() prof = prof.runctx("real_main(argv)", globals(), locals()) stats = pstats.Stats(prof) stats.sort_stats("time") # Or cumulative stats.print_stats(80) # 80 = how many to print but it's quite messy compared to that elegant call tree. Please let me know if you can easily do this, it would help quite a bit. Cheers!

Read the article

Very simple python functions takes spends long time in function and not subfunctions

- by John Salvatier

I have spent many hours trying to figure what is going on here. The function 'grad_logp' in the code below is called many times in my program, and cProfile and runsnakerun the visualize the results reveals that the function grad_logp spends about .00004s 'locally' every call not in any functions it calls and the function 'n' spends about .00006s locally every call. Together these two times make up about 30% of program time that I care about. It doesn't seem like this is function overhead as other python functions spend far less time 'locally' and merging 'grad_logp' and 'n' does not make my program faster, but the operations that these two functions do seem rather trivial. Does anyone have any suggestions on what might be happening? Have I done something obviously inefficient? Am I misunderstanding how cProfile works? def grad_logp(self, variable, calculation_set ): p = params(self.p,self.parents) return self.n(variable, self.p) def n (self, variable, p ): gradient = self.gg(variable, p) return np.reshape(gradient, np.shape(variable.value)) def gg(self, variable, p): if variable is self: gradient = self._grad_logps['x']( x = self.value, **p) else: gradient = __builtin__.sum([self._pgradient(variable, parameter, value, p) for parameter, value in self.parents.iteritems()]) return gradient

Read the article

Profiling short-lived Java applications

- by ejel

Is there any Java profiler that allows profiling short-lived applications? The profilers I found so far seem to work with applications that keep running until user termination. However, I want to profile applications that work like command-line utilities, it runs and exits immediately. Tools like visualvm or NetBeans Profiler do not even recognize that the application was ran. I am looking for something similar to Python's cProfile, in that the profiler result is returned when the application exits.

Read the article

does a switch idiom make sense in this case?

- by the ungoverned

I'm writing a parser/handler for a network protocol; the protocol is predefined and I am writing an adapter, in python. In the process of decoding the incoming messages, I've been considering using the idiom I've seen suggested elsewhere for "switch" in python: use a hash table whose keys are the field you want to match on (a string in this case) and whose values are callable expressions: self.switchTab = { 'N': self.handleN, 'M': self.handleM, ... } Where self.handleN, etc., are methods on the current class. The actual switch looks like this: self.switchTab[selector]() According to some profiling I've done with cProfile (and Python 2.5.2) this is actually a little bit faster than a chain of if..elif... statements. My question is, do folks think this is a reasonable choice? I can't imagine that re-framing this in terms of objects and polymorphism would be as fast, and I think the code looks reasonably clear to a reader.

Read the article

Unable to post via HTTP POST

- by jihbvsdfu

i am trying to post data via HTTP Post using name value key pair. But I am unable to post . The post url is http://mastercp.openweb.co.za/api/dbg_dump.asp .Should I include some header also while posting? Thanks public class MainActivity extends Activity { Button ok; @Override public void onCreate(Bundle savedInstanceState) { super.onCreate(savedInstanceState); setContentView(R.layout.profile); ok=(Button)findViewById(R.id.but_signup_login); ok.setOnClickListener(new OnClickListener() { public void onClick(View arg0) { System.out.println("Clicked"); DownloadWebPageTask task = new DownloadWebPageTask(); task.execute(new String[] { "http://mastercp.openweb.co.za/api/dbg_dump.asp" });}}); } public void postData() { // Create a new HttpClient and Post Header HttpClient httpclient = new DefaultHttpClient(); HttpPost httppost = new HttpPost("http://mastercp.openweb.co.za/api/dbg_dump.asp"); System.out.println("Clicked again"); try { // Add your data List<NameValuePair> nameValuePairs = new ArrayList<NameValuePair>(34); String amount ="Ashish"; nameValuePairs.add(new BasicNameValuePair("User_Type", amount)); nameValuePairs.add(new BasicNameValuePair("User_Email", "[email protected]")); nameValuePairs.add(new BasicNameValuePair("User_Email_In", amount)); nameValuePairs.add(new BasicNameValuePair("User_Pass", amount)); nameValuePairs.add(new BasicNameValuePair("User_Mobile", amount)); nameValuePairs.add(new BasicNameValuePair("User_Mobile_In", amount)); nameValuePairs.add(new BasicNameValuePair("User_ADSL", amount)); nameValuePairs.add(new BasicNameValuePair("User_Org", amount)); nameValuePairs.add(new BasicNameValuePair("User_VAT", amount)); nameValuePairs.add(new BasicNameValuePair("User_Name", amount)); nameValuePairs.add(new BasicNameValuePair("User_Surname", amount)); nameValuePairs.add(new BasicNameValuePair("User_RegNo", amount)); nameValuePairs.add(new BasicNameValuePair("User_Address", amount)); nameValuePairs.add(new BasicNameValuePair("User_Town", amount)); nameValuePairs.add(new BasicNameValuePair("User_Code", amount)); nameValuePairs.add(new BasicNameValuePair("User_State", amount)); nameValuePairs.add(new BasicNameValuePair("User_Country", amount)); nameValuePairs.add(new BasicNameValuePair("User_ADSL", amount)); nameValuePairs.add(new BasicNameValuePair("User_ADSL_Address", amount)); nameValuePairs.add(new BasicNameValuePair("Payment_CC_Alt", amount)); nameValuePairs.add(new BasicNameValuePair("Payment_Type", amount)); nameValuePairs.add(new BasicNameValuePair("CProfile", amount)); nameValuePairs.add(new BasicNameValuePair("COrder", amount)); nameValuePairs.add(new BasicNameValuePair("Debit_Name", amount)); nameValuePairs.add(new BasicNameValuePair("Debit_Bank", amount)); nameValuePairs.add(new BasicNameValuePair("Debit_Number", amount)); nameValuePairs.add(new BasicNameValuePair("Debit_Code", amount)); nameValuePairs.add(new BasicNameValuePair("Debit_Type", amount)); nameValuePairs.add(new BasicNameValuePair("TOS_Agree", amount)); nameValuePairs.add(new BasicNameValuePair("Code", amount)); nameValuePairs.add(new BasicNameValuePair("package_activation", amount)); nameValuePairs.add(new BasicNameValuePair("session", amount)); nameValuePairs.add(new BasicNameValuePair("OnceOff", amount)); nameValuePairs.add(new BasicNameValuePair("submit-button", amount)); try { httppost.setEntity(new UrlEncodedFormEntity(nameValuePairs)); } catch (UnsupportedEncodingException e) { System.out.println("Unsupported Exception "+e); e.printStackTrace(); } } catch (Exception e) { System.out.println(" Exception last"+e); // TODO Auto-generated catch block } } private class DownloadWebPageTask extends AsyncTask<String, Void, String> { @Override protected String doInBackground(String... urls) { String response = ""; for (String url : urls) { postData(); } return response; } @Override protected void onPostExecute(String result) {} } }

Read the article

Increasing speed of python code

- by Curious2learn

Hi, I have some python code that has many classes. I used cProfile to find that the total time to run the program is 68 seconds. I found that the following function in a class called Buyers takes about 60 seconds of those 68 seconds. I have to run the program about 100 times, so any increase in speed will help. Can you suggest ways to increase the speed by modifying the code? If you need more information that will help, please let me know. def qtyDemanded(self, timePd, priceVector): '''Returns quantity demanded in period timePd. In addition, also updates the list of customers and non-customers. Inputs: timePd and priceVector Output: count of people for whom priceVector[-1] < utility ''' ## Initialize count of customers to zero ## Set self.customers and self.nonCustomers to empty lists price = priceVector[-1] count = 0 self.customers = [] self.nonCustomers = [] for person in self.people: if person.utility >= price: person.customer = 1 self.customers.append(person) else: person.customer = 0 self.nonCustomers.append(person) return len(self.customers) self.people is a list of person objects. Each person has customer and utility as its attributes. EDIT - responsed added ------------------------------------- Thanks so much for the suggestions. Here is the response to some questions and suggestions people have kindly made. I have not tried them all, but will try others and write back later. (1) @amber - the function is accessed 80,000 times. (2) @gnibbler and others - self.people is a list of Person objects in memory. Not connected to a database. (3) @Hugh Bothwell cumtime taken by the original function - 60.8 s (accessed 80000 times) cumtime taken by the new function with local function aliases as suggested - 56.4 s (accessed 80000 times) (4) @rotoglup and @Martin Thomas I have not tried your solutions yet. I need to check the rest of the code to see the places where I use self.customers before I can make the change of not appending the customers to self.customers list. But I will try this and write back. (5) @TryPyPy - thanks for your kind offer to check the code. Let me first read a little on the suggestions you have made to see if those will be feasible to use. EDIT 2 Some suggested that since I am flagging the customers and noncustomers in the self.people, I should try without creating separate lists of self.customers and self.noncustomers using append. Instead, I should loop over the self.people to find the number of customers. I tried the following code and timed both functions below f_w_append and f_wo_append. I did find that the latter takes less time, but it is still 96% of the time taken by the former. That is, it is a very small increase in the speed. @TryPyPy - The following piece of code is complete enough to check the bottleneck function, in case your offer is still there to check it with other compilers. Thanks again to everyone who replied. import numpy class person(object): def __init__(self, util): self.utility = util self.customer = 0 class population(object): def __init__(self, numpeople): self.people = [] self.cus = [] self.noncus = [] numpy.random.seed(1) utils = numpy.random.uniform(0, 300, numpeople) for u in utils: per = person(u) self.people.append(per) popn = population(300) def f_w_append(): '''Function with append''' P = 75 cus = [] noncus = [] for per in popn.people: if per.utility >= P: per.customer = 1 cus.append(per) else: per.customer = 0 noncus.append(per) return len(cus) def f_wo_append(): '''Function without append''' P = 75 for per in popn.people: if per.utility >= P: per.customer = 1 else: per.customer = 0 numcustomers = 0 for per in popn.people: if per.customer == 1: numcustomers += 1 return numcustomers

Read the article

Optimizing Python code with many attribute and dictionary lookups

- by gotgenes

I have written a program in Python which spends a large amount of time looking up attributes of objects and values from dictionary keys. I would like to know if there's any way I can optimize these lookup times, potentially with a C extension, to reduce the time of execution, or if I need to simply re-implement the program in a compiled language. The program implements some algorithms using a graph. It runs prohibitively slowly on our data sets, so I profiled the code with cProfile using a reduced data set that could actually complete. The vast majority of the time is being burned in one function, and specifically in two statements, generator expressions, within the function: The generator expression at line 202 is neighbors_in_selected_nodes = (neighbor for neighbor in node_neighbors if neighbor in selected_nodes) and the generator expression at line 204 is neighbor_z_scores = (interaction_graph.node[neighbor]['weight'] for neighbor in neighbors_in_selected_nodes) The source code for this function of context provided below. selected_nodes is a set of nodes in the interaction_graph, which is a NetworkX Graph instance. node_neighbors is an iterator from Graph.neighbors_iter(). Graph itself uses dictionaries for storing nodes and edges. Its Graph.node attribute is a dictionary which stores nodes and their attributes (e.g., 'weight') in dictionaries belonging to each node. Each of these lookups should be amortized constant time (i.e., O(1)), however, I am still paying a large penalty for the lookups. Is there some way which I can speed up these lookups (e.g., by writing parts of this as a C extension), or do I need to move the program to a compiled language? Below is the full source code for the function that provides the context; the vast majority of execution time is spent within this function. def calculate_node_z_prime( node, interaction_graph, selected_nodes ): """Calculates a z'-score for a given node. The z'-score is based on the z-scores (weights) of the neighbors of the given node, and proportional to the z-score (weight) of the given node. Specifically, we find the maximum z-score of all neighbors of the given node that are also members of the given set of selected nodes, multiply this z-score by the z-score of the given node, and return this value as the z'-score for the given node. If the given node has no neighbors in the interaction graph, the z'-score is defined as zero. Returns the z'-score as zero or a positive floating point value. :Parameters: - `node`: the node for which to compute the z-prime score - `interaction_graph`: graph containing the gene-gene or gene product-gene product interactions - `selected_nodes`: a `set` of nodes fitting some criterion of interest (e.g., annotated with a term of interest) """ node_neighbors = interaction_graph.neighbors_iter(node) neighbors_in_selected_nodes = (neighbor for neighbor in node_neighbors if neighbor in selected_nodes) neighbor_z_scores = (interaction_graph.node[neighbor]['weight'] for neighbor in neighbors_in_selected_nodes) try: max_z_score = max(neighbor_z_scores) # max() throws a ValueError if its argument has no elements; in this # case, we need to set the max_z_score to zero except ValueError, e: # Check to make certain max() raised this error if 'max()' in e.args[0]: max_z_score = 0 else: raise e z_prime = interaction_graph.node[node]['weight'] * max_z_score return z_prime Here are the top couple of calls according to cProfiler, sorted by time. ncalls tottime percall cumtime percall filename:lineno(function) 156067701 352.313 0.000 642.072 0.000 bpln_contextual.py:204(<genexpr>) 156067701 289.759 0.000 289.759 0.000 bpln_contextual.py:202(<genexpr>) 13963893 174.047 0.000 816.119 0.000 {max} 13963885 69.804 0.000 936.754 0.000 bpln_contextual.py:171(calculate_node_z_prime) 7116883 61.982 0.000 61.982 0.000 {method 'update' of 'set' objects}

Search Results

Search found 12 results on 1 pages for 'cprofile'.

Page 1/1 | 1

- by Adam Luchjenbroers

- by fmark

- by Tim McJilton

- by freakazo

- by asmaier

- by Oliver

- by John Salvatier

- by ejel

- by the ungoverned

- by jihbvsdfu

- by Curious2learn

- by gotgenes