Search Results

Search found 15 results on 1 pages for 'fmark'.

Page 1/1 | 1 

  • What is the python "with" statement designed for?

    - by fmark
    I came across the Python with statement for the first time today. I've been using Python lightly for several months and didn't even of its existence! Given its somewhat obscure status, I thought it would be worth asking: What is the Python with statement designed to be used for? What do you use it for? Are their any gotchas I need to be aware of, or common anti-patterns associated with its use?

    Read the article

  • More pythonic way to iterate

    - by fmark
    I am using a module that is part of a commercial software API. The good news is there is a python module - the bad news is that its pretty unpythonic. To iterate over rows, the follwoing syntax is used: cursor = gp.getcursor(table) row = cursor.Next() while row: #do something with row row = cursor.next() What is the most pythonic way to deal with this situation? I have considered creating a first class function/generator and wrapping calls to a for loop in it: def cursor_iterator(cursor): row = cursor.Next() while row: yield row row = cursor.next() [...] cursor = gp.getcursor(table) for row in cursor_iterator(cursor): # do something with row This is an improvement, but feels a little clumsy. Is there a more pythonic approach? Should I create a wrapper class around the table type?

    Read the article

  • What is the difference between Multiple R-squared and Adjusted R-squared in a single-variate least s

    - by fmark
    Could someone explain to the statistically naive what the difference between Multiple R-squared and Adjusted R-squared is? I am doing a single-variate regression analysis as follows: v.lm <- lm(epm ~ n_days, data=v) print(summary(v.lm)) Results: Call: lm(formula = epm ~ n_days, data = v) Residuals: Min 1Q Median 3Q Max -693.59 -325.79 53.34 302.46 964.95 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 2550.39 92.15 27.677 <2e-16 *** n_days -13.12 5.39 -2.433 0.0216 * --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 410.1 on 28 degrees of freedom Multiple R-squared: 0.1746, Adjusted R-squared: 0.1451 F-statistic: 5.921 on 1 and 28 DF, p-value: 0.0216 Apologies for the newbiness of this question.

    Read the article

  • Best practice for installing python modules from an arbitrary VCS repository

    - by fmark
    I'm newish to the python ecosystem, and have a question about module editing. I use a bunch of third-party modules, distributed on PyPi. Coming from a C and Java background, I love the ease of easy_install <whatever>. This is a new, wonderful world, but the model breaks down when I want to edit the newly installed module for two reasons: The egg files may be stored in a folder or archive somewhere crazy on the file system. Using an egg seems to preclude using the version control system of the originating project, just as using a debian package precludes development from an originating VCS repository. What is the best practice for installing modules from an arbitrary VCS repository? I want to be able to continue to import foomodule in other scripts.

    Read the article

  • Where is my python script spending time? Is there "missing time" in my cprofile / pstats trace?

    - by fmark
    I am attempting to profile a long running python script. The script does some spatial analysis on raster GIS data set using the gdal module. The script currently uses three files, the main script which loops over the raster pixels called find_pixel_pairs.py, a simple cache in lrucache.py and some misc classes in utils.py. I have profiled the code on a moderate sized dataset. pstats returns: p.sort_stats('cumulative').print_stats(20) Thu May 6 19:16:50 2010 phes.profile 355483738 function calls in 11644.421 CPU seconds Ordered by: cumulative time List reduced from 86 to 20 due to restriction <20> ncalls tottime percall cumtime percall filename:lineno(function) 1 0.008 0.008 11644.421 11644.421 <string>:1(<module>) 1 11064.926 11064.926 11644.413 11644.413 find_pixel_pairs.py:49(phes) 340135349 544.143 0.000 572.481 0.000 utils.py:173(extent_iterator) 8831020 18.492 0.000 18.492 0.000 {range} 231922 3.414 0.000 8.128 0.000 utils.py:152(get_block_in_bands) 142739 1.303 0.000 4.173 0.000 utils.py:97(search_extent_rect) 745181 1.936 0.000 2.500 0.000 find_pixel_pairs.py:40(is_no_data) 285478 1.801 0.000 2.271 0.000 utils.py:98(intify) 231922 1.198 0.000 2.013 0.000 utils.py:116(block_to_pixel_extent) 695766 1.990 0.000 1.990 0.000 lrucache.py:42(get) 1213166 1.265 0.000 1.265 0.000 {min} 1031737 1.034 0.000 1.034 0.000 {isinstance} 142740 0.563 0.000 0.909 0.000 utils.py:122(find_block_extent) 463844 0.611 0.000 0.611 0.000 utils.py:112(block_to_pixel_coord) 745274 0.565 0.000 0.565 0.000 {method 'append' of 'list' objects} 285478 0.346 0.000 0.346 0.000 {max} 285480 0.346 0.000 0.346 0.000 utils.py:109(pixel_coord_to_block_coord) 324 0.002 0.000 0.188 0.001 utils.py:27(__init__) 324 0.016 0.000 0.186 0.001 gdal.py:848(ReadAsArray) 1 0.000 0.000 0.160 0.160 utils.py:50(__init__) The top two calls contain the main loop - the entire analyis. The remaining calls sum to less than 625 of the 11644 seconds. Where are the remaining 11,000 seconds spent? Is it all within the main loop of find_pixel_pairs.py? If so, can I find out which lines of code are taking most of the time?

    Read the article

  • Parallelism in Python

    - by fmark
    What are the options for achieving parallelism in Python? I want to perform a bunch of CPU bound calculations over some very large rasters, and would like to parallelise them. Coming from a C background, I am familiar with three approaches to parallelism: Message passing processes, possibly distributed across a cluster, e.g. MPI. Explicit shared memory parallelism, either using pthreads or fork(), pipe(), et. al Implicit shared memory parallelism, using OpenMP. Deciding on an approach to use is an exercise in trade-offs. In Python, what approaches are available and what are their characteristics? Is there a clusterable MPI clone? What are the preferred ways of achieving shared memory parallelism? I have heard reference to problems with the GIL, as well as references to tasklets. In short, what do I need to know about the different parallelization strategies in Python before choosing between them?

    Read the article

  • Can I ask Postgresql to ignore errors within a transaction

    - by fmark
    I use Postgresql with the PostGIS extensions for ad-hoc spatial analysis. I generally construct and issue SQL queries by hand from within psql. I always wrap an analysis session within a transaction, so if I issue a destructive query I can roll it back. However, when I issue a query that contains an error, it cancels the transaction. Any further queries elicit the following warning: ERROR: current transaction is aborted, commands ignored until end of transaction block Is there a way I can turn this behaviour off? It is tiresome to rollback the transaction and rerun previous queries every time I make a typo.

    Read the article

  • How can this verbose, unpythonic routine be improved?

    - by fmark
    Is there a more pythonic way of doing this? I am trying to find the eight neighbours of an integer coordinate lying within an extent. I am interested in reducing its verbosity without sacrificing execution speed. def fringe8((px, py), (x1, y1, x2, xy)): f = [(px - 1, py - 1), (px - 1, py), (px - 1, py + 1), (px, py - 1), (px, py + 1), (px + 1, py - 1), (px + 1, py), (px + 1, py + 1)] f_inrange = [] for fx, fy in f: if fx < x1: continue if fx >= x2: continue if fy < y1: continue if fy >= y2: continue f_inrange.append((fx, fy)) return f_inrange

    Read the article

  • Best data-structure to use for two ended sorted list

    - by fmark
    I need a collection data-structure that can do the following: Be sorted Allow me to quickly pop values off the front and back of the list Remain sorted after I insert a new value Allow a user-specified comparison function, as I will be storing tuples and want to sort on a particular value Thread-safety is not required Optionally allow efficient haskey() lookups (I'm happy to maintain a separate hash-table for this though) My thoughts at this stage are that I need a priority queue and a hash table, although I don't know if I can quickly pop values off both ends of a priority queue. I'm interested in performance for a moderate number of items (I would estimate less than 200,000). Another possibility is simply maintaining an OrderedDictionary and doing an insertion sort it every-time I add more data to it. Furthermore, are there any particular implementations in Python. I would really like to avoid writing this code myself.

    Read the article

  • What "exotic" language feature do you use every day?

    - by fmark
    For most programmers using procedural or object-oriented languages there is a language-feature lowest common denominator: variables, procedures, standard control structures, and classes. However, almost all languages add features on top of this. Recent C# versions have LINQ and delegates. C++ has template metaprogramming. Java has annotations. What features such as these do you use every day?

    Read the article

  • Program structure in long running data processing python script

    - by fmark
    For my current job I am writing some long-running (think hours to days) scripts that do CPU intensive data-processing. The program flow is very simple - it proceeds into the main loop, completes the main loop, saves output and terminates: The basic structure of my programs tends to be like so: <import statements> <constant declarations> <misc function declarations> def main(): for blah in blahs(): <lots of local variables> <lots of tightly coupled computation> for something in somethings(): <lots more local variables> <lots more computation> <etc., etc.> <save results> if __name__ == "__main__": main() This gets unmanageable quickly, so I want to refactor it into something more manageable. I want to make this more maintainable, without sacrificing execution speed. Each chuck of code relies on a large number of variables however, so refactoring parts of the computation out to functions would make parameters list grow out of hand very quickly. Should I put this sort of code into a python class, and change the local variables into class variables? It doesn't make a great deal of sense tp me conceptually to turn the program into a class, as the class would never be reused, and only one instance would ever be created per instance. What is the best practice structure for this kind of program? I am using python but the question is relatively language-agnostic, assuming a modern object-oriented language features.

    Read the article

1