Search Results

Search found 13776 results on 552 pages for 'python appengine'.

Page 161/552 | < Previous Page | 157 158 159 160 161 162 163 164 165 166 167 168  | Next Page >

  • Efficient file buffering & scanning methods for large files in python

    - by eblume
    The description of the problem I am having is a bit complicated, and I will err on the side of providing more complete information. For the impatient, here is the briefest way I can summarize it: What is the fastest (least execution time) way to split a text file in to ALL (overlapping) substrings of size N (bound N, eg 36) while throwing out newline characters. I am writing a module which parses files in the FASTA ascii-based genome format. These files comprise what is known as the 'hg18' human reference genome, which you can download from the UCSC genome browser (go slugs!) if you like. As you will notice, the genome files are composed of chr[1..22].fa and chr[XY].fa, as well as a set of other small files which are not used in this module. Several modules already exist for parsing FASTA files, such as BioPython's SeqIO. (Sorry, I'd post a link, but I don't have the points to do so yet.) Unfortunately, every module I've been able to find doesn't do the specific operation I am trying to do. My module needs to split the genome data ('CAGTACGTCAGACTATACGGAGCTA' could be a line, for instance) in to every single overlapping N-length substring. Let me give an example using a very small file (the actual chromosome files are between 355 and 20 million characters long) and N=8 import cStringIO example_file = cStringIO.StringIO("""\ header CAGTcag TFgcACF """) for read in parse(example_file): ... print read ... CAGTCAGTF AGTCAGTFG GTCAGTFGC TCAGTFGCA CAGTFGCAC AGTFGCACF The function that I found had the absolute best performance from the methods I could think of is this: def parse(file): size = 8 # of course in my code this is a function argument file.readline() # skip past the header buffer = '' for line in file: buffer += line.rstrip().upper() while len(buffer) = size: yield buffer[:size] buffer = buffer[1:] This works, but unfortunately it still takes about 1.5 hours (see note below) to parse the human genome this way. Perhaps this is the very best I am going to see with this method (a complete code refactor might be in order, but I'd like to avoid it as this approach has some very specific advantages in other areas of the code), but I thought I would turn this over to the community. Thanks! Note, this time includes a lot of extra calculation, such as computing the opposing strand read and doing hashtable lookups on a hash of approximately 5G in size. Post-answer conclusion: It turns out that using fileobj.read() and then manipulating the resulting string (string.replace(), etc.) took relatively little time and memory compared to the remainder of the program, and so I used that approach. Thanks everyone!

    Read the article

  • varargs in lambda functions in Python

    - by brain_damage
    Is it possible a lambda function to have variable number of arguments? For example, I want to write a metaclass, which creates a method for every method of some other class and this newly created method returns the opposite value of the original method and has the same number of arguments. And I want to do this with lambda function. How to pass the arguments? Is it possible? class Negate(type): def __new__(mcs, name, bases, _dict): extended_dict = _dict.copy() for (k, v) in _dict.items(): if hasattr(v, '__call__'): extended_dict["not_" + k] = lambda s, *args, **kw: not v(s, *args, **kw) return type.__new__(mcs, name, bases, extended_dict) class P(metaclass=Negate): def __init__(self, a): self.a = a def yes(self): return True def maybe(self, you_can_chose): return you_can_chose But the result is totally wrong: >>>p = P(0) >>>p.yes() True >>>p.not_yes() # should be False Traceback (most recent call last): File "<pyshell#150>", line 1, in <module> p.not_yes() File "C:\Users\Nona\Desktop\p10.py", line 51, in <lambda> extended_dict["not_" + k] = lambda s, *args, **kw: not v(s, *args, **kw) TypeError: __init__() takes exactly 2 positional arguments (1 given) >>>p.maybe(True) True >>>p.not_maybe(True) #should be False True

    Read the article

  • Python recursion with list returns None

    - by newman
    def foo(a): a.append(1) if len(a) > 10: print a return a else: foo(a) Why this recursive function returns None (see transcript below)? I can't quite understand what I am doing wrong. In [263]: x = [] In [264]: y = foo(x) [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1] In [265]: print y None

    Read the article

  • Dynamic Operator Overloading on dict classes in Python

    - by Ishpeck
    I have a class that dynamically overloads basic arithmetic operators like so... import operator class IshyNum: def __init__(self, n): self.num=n self.buildArith() def arithmetic(self, other, o): return o(self.num, other) def buildArith(self): map(lambda o: setattr(self, "__%s__"%o,lambda f: self.arithmetic(f, getattr(operator, o))), ["add", "sub", "mul", "div"]) if __name__=="__main__": number=IshyNum(5) print number+5 print number/2 print number*3 print number-3 But if I change the class to inherit from the dictionary (class IshyNum(dict):) it doesn't work. I need to explicitly def __add__(self, other) or whatever in order for this to work. Why?

    Read the article

  • (Python) Converting a dictionary to a list?

    - by Daria Egelhoff
    So I have this dictionary: ScoreDict = {"Blue": {'R1': 89, 'R2': 80}, "Brown": {'R1': 61, 'R2': 77}, "Purple": {'R1': 60, 'R2': 98}, "Green": {'R1': 74, 'R2': 91}, "Red": {'R1': 87, 'Lon': 74}} Is there any way how I can convert this dictionary into a list like this: ScoreList = [['Blue', 89, 80], ['Brown', 61, 77], ['Purple', 60, 98], ['Green', 74, 91], ['Red', 87, 74]] I'm not too familiar with dictionaries, so I really need some help here. Thanks in advance!

    Read the article

  • python - from matrix to dictionary in single line

    - by Sanich
    matrix is a list of lists. I've to return a dictionary of the form {i:(l1[i],l2[i],...,lm[i])} Where the key i is matched with a tuple the i'th elements from each list. Say matrix=[[1,2,3,4],[9,8,7,6],[4,8,2,6]] so the line: >>> dict([(i,tuple(matrix[k][i] for k in xrange(len(matrix)))) for i in xrange(len(matrix[0]))]) does the job pretty well and outputs: {0: (1, 9, 4), 1: (2, 8, 8), 2: (3, 7, 2), 3: (4, 6, 6)} but fails if the matrix is empty: matrix=[]. The output should be: {} How can i deal with this?

    Read the article

  • Python - Compress Ascii String

    - by n0idea
    I'm looking for a way to compress an ascii-based string, any help? I need also need to decompress it. I tried zlib but with no help. What can I do to compress the string into lesser length? code: def compress(request): if request.POST: data = request.POST.get('input') if is_ascii(data): result = zlib.compress(data) return render_to_response('index.html', {'result': result, 'input':data}, context_instance = RequestContext(request)) else: result = "Error, the string is not ascii-based" return render_to_response('index.html', {'result':result}, context_instance = RequestContext(request)) else: return render_to_response('index.html', {}, context_instance = RequestContext(request))

    Read the article

  • Find last match with python regular expression

    - by SDD
    I wanto to match the last occurence of a simple pattern in a string, e.g. list = re.findall(r"\w+ AAAA \w+", "foo bar AAAA foo2 AAAA bar2) print "last match: ", list[len(list)-1] however, if the string is very long, a huge list of matches is generated. Is there a more direct way to match the second occurence of "AAAA" or should I use this workaround?

    Read the article

  • Custom keys for Google App Engine models (Python)

    - by Cameron
    First off, I'm relatively new to Google App Engine, so I'm probably doing something silly. Say I've got a model Foo: class Foo(db.Model): name = db.StringProperty() I want to use name as a unique key for every Foo object. How is this done? When I want to get a specific Foo object, I currently query the datastore for all Foo objects with the target unique name, but queries are slow (plus it's a pain to ensure that name is unique when each new Foo is created). There's got to be a better way to do this! Thanks.

    Read the article

  • Pass in a value into Python Class through command line

    - by chrissygormley
    Hello, I have got some code to pass in a variable into a script from the command line. The script is: import sys, os def function(var): print var class function_call(object): def __init__(self, sysArgs): try: self.function = None self.args = [] self.modulePath = sysArgs[0] self.moduleDir, tail = os.path.split(self.modulePath) self.moduleName, ext = os.path.splitext(tail) __import__(self.moduleName) self.module = sys.modules[self.moduleName] if len(sysArgs) > 1: self.functionName = sysArgs[1] self.function = self.module.__dict__[self.functionName] self.args = sysArgs[2:] except Exception, e: sys.stderr.write("%s %s\n" % ("PythonCall#__init__", e)) def execute(self): try: if self.function: self.function(*self.args) except Exception, e: sys.stderr.write("%s %s\n" % ("PythonCall#execute", e)) if __name__=="__main__": test = test() function_call(sys.argv).execute() This works by entering ./function <function> <arg1 arg2 ....>. The problem is that I want to to select the function I want that is in a class rather than just a function by itself. The code I have tried is the same except that function(var): is in a class. I was hoping for some ideas on how to modify my function_call class to accept this. Thanks for any help.

    Read the article

  • Using __str__ representation for printing objects in containers in Python

    - by BobDobbs
    I've noticed that when an instance with an overloaded str method is passed to the print() function as an argument, it prints as intended. However, when passing a container that contains one of those instances to print(), it uses the repr method instead. That is to say, print(x) displays the correct string representation of x, and print(x, y) works correctly, but print([x]) or print((x, y)) prints the repr representation instead. First off, why does this happen? Secondly, is there a way to correct that behavior of print() in this circumstance?

    Read the article

  • use/run python's 2to3 as or like a unittest

    - by Vincent
    I have used the 2to3 utility to convert code from the command line. What I would like to do is run it basically as a unittest. Even if it tests the file rather than parts(funtions, methods...) as would be normal for a unittest. It does not need to be a unittest and I don't what to automatically convert the files I just want to monitor the py3 compliance of files in a unittest like manor. I can't seem to find any documentation or examples for this. An example and/or documentation would be great. Thanks

    Read the article

  • Python and sqlite3 - importing and exporting databases

    - by JPC
    I'm trying to write a script to import a database file. I wrote the script to export the file like so: import sqlite3 con = sqlite3.connect('../sqlite.db') with open('../dump.sql', 'w') as f: for line in con.iterdump(): f.write('%s\n' % line) Now I want to be able to import that database. I tried: import sqlite3 con = sqlite3.connect('../sqlite.db') f = open('../dump.sql','r') str = f.read() con.execute(str) but I'm not allowed to execute more than one statement. Is there a way to get it to run a .sql script directly?

    Read the article

  • Optimizing BeautifulSoup (Python) code

    - by user283405
    I have code that uses the BeautifulSoup library for parsing, but it is very slow. The code is written in such a way that threads cannot be used. Can anyone help me with this? I am using BeautifulSoup for parsing and than save into a DB. If I comment out the save statement, it still takes a long time, so there is no problem with the database. def parse(self,text): soup = BeautifulSoup(text) arr = soup.findAll('tbody') for i in range(0,len(arr)-1): data=Data() soup2 = BeautifulSoup(str(arr[i])) arr2 = soup2.findAll('td') c=0 for j in arr2: if str(j).find("<a href=") > 0: data.sourceURL = self.getAttributeValue(str(j),'<a href="') else: if c == 2: data.Hits=j.renderContents() #and few others... c = c+1 data.save() Any suggestions? Note: I already ask this question here but that was closed due to incomplete information.

    Read the article

  • python: problem with dictionary get method default value

    - by goutham
    I'm having a new problem here .. CODE 1: try: urlParams += "%s=%s&"%(val['name'], data.get(val['name'], serverInfo_D.get(val['name']))) except KeyError: print "expected parameter not provided - "+val["name"]+" is missing" exit(0) CODE 2: try: urlParams += "%s=%s&"%(val['name'], data.get(val['name'], serverInfo_D[val['name']])) except KeyError: print "expected parameter not provided - "+val["name"]+" is missing" exit(0) see the diffrence in serverInfo_D[val['name']] & serverInfo_D.get(val['name']) code 2 fails but code 1 works the data serverInfo_D:{'user': 'usr', 'pass': 'pass'} data: {'par1': 9995, 'extraparam1': 22} val: {'par1','user','pass','extraparam1'} exception are raised for for data dict .. and all code in for loop which iterates over val

    Read the article

  • Proper structure for many test cases in Python with unittest

    - by mellort
    I am looking into the unittest package, and I'm not sure of the proper way to structure my test cases when writing a lot of them for the same method. Say I have a fact function which calculates the factorial of a number; would this testing file be OK? import unittest class functions_tester(unittest.TestCase): def test_fact_1(self): self.assertEqual(1, fact(1)) def test_fact_2(self): self.assertEqual(2, fact(2)) def test_fact_3(self): self.assertEqual(6, fact(3)) def test_fact_4(self): self.assertEqual(24, fact(4)) def test_fact_5(self): self.assertFalse(1==fact(5)) def test_fact_6(self): self.assertRaises(RuntimeError, fact, -1) #fact(-1) if __name__ == "__main__": unittest.main() It seems sloppy to have so many test methods for one method. I'd like to just have one testing method and put a ton of basic test cases (ie 4! ==24, 3!==6, 5!==120, and so on), but unittest doesn't let you do that. What is the best way to structure a testing file in this scenario? Thanks in advance for the help.

    Read the article

  • Get the last '/' or '\\' character in Python

    - by wowus
    If I have a string that looks like either ./A/B/c.d OR .\A\B\c.d How do I get just the "./A/B/" part? The direction of the slashes can be the same as they are passed. This problem kinda boils down to: How do I get the last of a specific character in a string? Basically, I want the path of a file without the file part of it.

    Read the article

  • Rectangle Rotation in Python/Pygame

    - by mramazingguy
    Hey I'm trying to rotate a rectangle around its center and when I try to rotate the rectangle, it moves up and to the left at the same time. Does anyone have any ideas on how to fix this? def rotatePoint(self, angle, point, origin): sinT = sin(radians(angle)) cosT = cos(radians(angle)) return (origin[0] + (cosT * (point[0] - origin[0]) - sinT * (point[1] - origin[1])), origin[1] + (sinT * (point[0] - origin[0]) + cosT * (point[1] - origin[1]))) def rotateRect(self, degrees): center = (self.collideRect.centerx, self.collideRect.centery) self.collideRect.topleft = self.rotatePoint(degrees, self.collideRect.topleft, center) self.collideRect.topright = self.rotatePoint(degrees, self.collideRect.topright, center) self.collideRect.bottomleft = self.rotatePoint(degrees, self.collideRect.bottomleft, center) self.collideRect.bottomright = self.rotatePoint(degrees, self.collideRect.bottomright, center)

    Read the article

  • Python If Statement Defaults to an elif

    - by Brad Carvalho
    Not sure why my code is defaulting to this elif. But it's never getting to the else statement. Even going as far as throwing index out of bound errors in the last elif. Please disregard my non use of regex. It wasn't allowed for this homework assignment. The problem is the last elif before the else statement. Cheers, Brad if item == '': print ("%s\n" % item).rstrip('\n') elif item.startswith('MOVE') and not item.startswith('MOVEI'): print 'Found MOVE' elif item.startswith('MOVEI'): print 'Found MOVEI' elif item.startswith('BGT'): print 'Found BGT' elif item.startswith('ADD'): print 'Found ADD' elif item.startswith('INC'): print 'Found INC' elif item.startswith('SUB'): print 'Found SUB' elif item.startswith('DEC'): print 'Found DEC' elif item.startswith('MUL'): print 'Found MUL' elif item.startswith('DIV'): print 'Found DIV' elif item.startswith('BEQ'): print 'Found BEQ' elif item.startswith('BLT'): print 'Found BLT' elif item.startswith('BR'): print 'Found BR' elif item.startswith('END'): print 'Found END' elif item.find(':') and item[(item.find(':') -1)].isalpha(): print 'Mya have found a label' else: print 'Not sure what I found'

    Read the article

  • os.walk in python not running with cmd line parameter passed as path

    - by kartiku
    Hello, I needed to find the number of files in a folder on the system. This is what i used: file_count = sum((len(f) for _, _, f in os.walk('path'))) This works fine when we specify the path as a string in quotes, but when I enter a variable name that holds the path, type(file_count) is a generator object, and hence cannot be used as an integer. How to solve this and why does this happen?

    Read the article

  • python multiprocessing member variable not set

    - by Jake
    In the following script, I get the "stop message received" output but the process never ends. Why is that? Is there another way to end a process besides terminate or os.kill that is along these lines? from multiprocessing import Process from time import sleep class Test(Process): def __init__(self): Process.__init__(self) self.stop = False def run(self): while self.stop == False: print "running" sleep(1.0) def end(self): print "stop message received" self.stop = True if __name__ == "__main__": test = Test() test.start() sleep(1.0) test.end() test.join()

    Read the article

< Previous Page | 157 158 159 160 161 162 163 164 165 166 167 168  | Next Page >