attrgetter - Developer IT

lambda vs. operator.attrGetter('xxx') as sort key in Python

- by Paul McGuire

I am looking at some code that has a lot of sort calls using comparison functions, and it seems like it should be using key functions. If you were to change seq.sort(lambda x,y: cmp(x.xxx, y.xxx)), which is preferable: seq.sort(key=operator.attrgetter('xxx')) or: seq.sort(key=lambda a:a.xxx) I would also be interested in comments on the merits of making changes to existing code that works.

Read the article

Has Twisted changed its dependencies?

- by cdecker

Hi all, I'm currently working on a Python/Twisted project which is to be distributed and tested on Planetlab. For some reason my code was working on friday and now that I wanted to test a minor change it refuses to work at all: Traceback (most recent call last): File "acn_a4/src/node.py", line 6, in <module> from twisted.internet.protocol import DatagramProtocol File "/usr/lib/python2.5/site-packages/Twisted-10.0.0-py2.5-linux-i686.egg/twisted/__init__.py", line 18, in <module> from twisted.python import compat File "/usr/lib/python2.5/site-packages/Twisted-10.0.0-py2.5-linux-i686.egg/twisted/python/compat.py", line 146, in <module> import operator File "/home/cdecker/dev/acn/acn_a4/src/operator.py", line 7, in <module> File "/home/cdecker/acn_a4/src/node.py", line 6, in <module> from twisted.internet.protocol import DatagramProtocol File "/usr/lib/python2.5/site-packages/Twisted-10.0.0-py2.5-linux-i686.egg/twisted/internet/protocol.py", line 20, in <module> from twisted.python import log, failure, components File "/usr/lib/python2.5/site-packages/Twisted-10.0.0-py2.5-linux-i686.egg/twisted/python/log.py", line 19, in <module> from twisted.python import util, context, reflect File "/usr/lib/python2.5/site-packages/Twisted-10.0.0-py2.5-linux-i686.egg/twisted/python/util.py", line 5, in <module> import os, sys, hmac, errno, new, inspect, warnings File "/usr/lib/python2.5/inspect.py", line 32, in <module> from operator import attrgetter ImportError: cannot import name attrgetter And since I'm pretty new to python I have no idea what could have caused this problem. All suggestions are welcome :-)

Read the article

lambda vs. operator.attrgetter('xxx') as sort key function in Python

- by Paul McGuire

I am looking at some code that has a lot of sort calls using comparison functions, and it seems like it should be using key functions. If you were to change seq.sort(lambda x,y: cmp(x.xxx, y.xxx)), which is preferable: seq.sort(key=operator.attrgetter('xxx')) or: seq.sort(key=lambda a:a.xxx) I would also be interested in comments on the merits of making changes to existing code that works.

Read the article

Sorting objects in Python

- by Curious2learn

I want to sort objects using by one of their attributes. As of now, I am doing it in the following way USpeople.sort(key=lambda person: person.utility[chosenCar],reverse=True) This works fine, but I have read that using operator.attrgetter() might be a faster way to achieve this sort. First, is this correct? Assuming that it is correct, how do I use operator.attrgetter() to achieve this sort? I tried, keyFunc=operator.attrgetter('utility[chosenCar]') USpeople.sort(key=keyFunc,reverse=True) However, I get an error saying that there is no attribute 'utility[chosenCar]'. The problem is that the attribute by which I want to sort is in a dictionary. For example, the utility attribute is in the following form: utility={chosenCar:25000,anotherCar:24000,yetAnotherCar:24500} I want to sort by the utility of the chosenCar using operator.attrgetter(). How could I do this? Thanks in advance.

Read the article

Python: Improving long cumulative sum

- by Bo102010

I have a program that operates on a large set of experimental data. The data is stored as a list of objects that are instances of a class with the following attributes: time_point - the time of the sample cluster - the name of the cluster of nodes from which the sample was taken code - the name of the node from which the sample was taken qty1 = the value of the sample for the first quantity qty2 = the value of the sample for the second quantity I need to derive some values from the data set, grouped in three ways - once for the sample as a whole, once for each cluster of nodes, and once for each node. The values I need to derive depend on the (time sorted) cumulative sums of qty1 and qty2: the maximum value of the element-wise sum of the cumulative sums of qty1 and qty2, the time point at which that maximum value occurred, and the values of qty1 and qty2 at that time point. I came up with the following solution: dataset.sort(key=operator.attrgetter('time_point')) # For the whole set sys_qty1 = 0 sys_qty2 = 0 sys_combo = 0 sys_max = 0 # For the cluster grouping cluster_qty1 = defaultdict(int) cluster_qty2 = defaultdict(int) cluster_combo = defaultdict(int) cluster_max = defaultdict(int) cluster_peak = defaultdict(int) # For the node grouping node_qty1 = defaultdict(int) node_qty2 = defaultdict(int) node_combo = defaultdict(int) node_max = defaultdict(int) node_peak = defaultdict(int) for t in dataset: # For the whole system ###################################################### sys_qty1 += t.qty1 sys_qty2 += t.qty2 sys_combo = sys_qty1 + sys_qty2 if sys_combo > sys_max: sys_max = sys_combo # The Peak class is to record the time point and the cumulative quantities system_peak = Peak(time_point=t.time_point, qty1=sys_qty1, qty2=sys_qty2) # For the cluster grouping ################################################## cluster_qty1[t.cluster] += t.qty1 cluster_qty2[t.cluster] += t.qty2 cluster_combo[t.cluster] = cluster_qty1[t.cluster] + cluster_qty2[t.cluster] if cluster_combo[t.cluster] > cluster_max[t.cluster]: cluster_max[t.cluster] = cluster_combo[t.cluster] cluster_peak[t.cluster] = Peak(time_point=t.time_point, qty1=cluster_qty1[t.cluster], qty2=cluster_qty2[t.cluster]) # For the node grouping ##################################################### node_qty1[t.node] += t.qty1 node_qty2[t.node] += t.qty2 node_combo[t.node] = node_qty1[t.node] + node_qty2[t.node] if node_combo[t.node] > node_max[t.node]: node_max[t.node] = node_combo[t.node] node_peak[t.node] = Peak(time_point=t.time_point, qty1=node_qty1[t.node], qty2=node_qty2[t.node]) This produces the correct output, but I'm wondering if it can be made more readable/Pythonic, and/or faster/more scalable. The above is attractive in that it only loops through the (large) dataset once, but unattractive in that I've essentially copied/pasted three copies of the same algorithm. To avoid the copy/paste issues of the above, I tried this also: def find_peaks(level, dataset): def grouping(object, attr_name): if attr_name == 'system': return attr_name else: return object.__dict__[attrname] cuml_qty1 = defaultdict(int) cuml_qty2 = defaultdict(int) cuml_combo = defaultdict(int) level_max = defaultdict(int) level_peak = defaultdict(int) for t in dataset: cuml_qty1[grouping(t, level)] += t.qty1 cuml_qty2[grouping(t, level)] += t.qty2 cuml_combo[grouping(t, level)] = (cuml_qty1[grouping(t, level)] + cuml_qty2[grouping(t, level)]) if cuml_combo[grouping(t, level)] > level_max[grouping(t, level)]: level_max[grouping(t, level)] = cuml_combo[grouping(t, level)] level_peak[grouping(t, level)] = Peak(time_point=t.time_point, qty1=node_qty1[grouping(t, level)], qty2=node_qty2[grouping(t, level)]) return level_peak system_peak = find_peaks('system', dataset) cluster_peak = find_peaks('cluster', dataset) node_peak = find_peaks('node', dataset) For the (non-grouped) system-level calculations, I also came up with this, which is pretty: dataset.sort(key=operator.attrgetter('time_point')) def cuml_sum(seq): rseq = [] t = 0 for i in seq: t += i rseq.append(t) return rseq time_get = operator.attrgetter('time_point') q1_get = operator.attrgetter('qty1') q2_get = operator.attrgetter('qty2') timeline = [time_get(t) for t in dataset] cuml_qty1 = cuml_sum([q1_get(t) for t in dataset]) cuml_qty2 = cuml_sum([q2_get(t) for t in dataset]) cuml_combo = [q1 + q2 for q1, q2 in zip(cuml_qty1, cuml_qty2)] combo_max = max(cuml_combo) time_max = timeline.index(combo_max) q1_at_max = cuml_qty1.index(time_max) q2_at_max = cuml_qty2.index(time_max) However, despite this version's cool use of list comprehensions and zip(), it loops through the dataset three times just for the system-level calculations, and I can't think of a good way to do the cluster-level and node-level calaculations without doing something slow like: timeline = defaultdict(int) cuml_qty1 = defaultdict(int) #...etc. for c in cluster_list: timeline[c] = [time_get(t) for t in dataset if t.cluster == c] cuml_qty1[c] = [q1_get(t) for t in dataset if t.cluster == c] #...etc. Does anyone here at Stack Overflow have suggestions for improvements? The first snippet above runs well for my initial dataset (on the order of a million records), but later datasets will have more records and clusters/nodes, so scalability is a concern. This is my first non-trivial use of Python, and I want to make sure I'm taking proper advantage of the language (this is replacing a very convoluted set of SQL queries, and earlier versions of the Python version were essentially very ineffecient straight transalations of what that did). I don't normally do much programming, so I may be missing something elementary. Many thanks!

Read the article

Python Vector Class

- by sfjedi

I'm coming from a C# background where this stuff is super easy—trying to translate into Python for Maya. There's gotta' be a better way to do this. Basically, I'm looking to create a Vector class that will simply have x, y and z coordinates, but it would be ideal if this class returned a tuple with all 3 coordinates and if you could edit the values of this tuple through x, y and z properties, somehow. This is what I have so far, but there must be a better way to do this than using an exec statement, right? I hate using exec statements. class Vector(object): '''Creates a Maya vector/triple, having x, y and z coordinates as float values''' def __init__(self, x=0, y=0, z=0): self.x, self.y, self.z = x, y, z def attrsetter(attr): def set_float(self, value): setattr(self, attr, float(value)) return set_float for xyz in 'xyz': exec("%s = property(fget=attrgetter('_%s'), fset=attrsetter('_%s'))" % (xyz, xyz, xyz))

Developer IT