Search Results

Search found 9017 results on 361 pages for 'efficient storage'.


  • Most efficient way to write over file after reading

    - by Ryan McClure
    I'm reading in some data from a file, manipulating it, and then overwriting it to the same file. Until now, I've been doing it like so: open (my $inFile, $file) or die "Could not open $file: $!"; $retString .= join ('', <$inFile>); ... close ($inFile); open (my $outFile, '>', $file) or die "Could not open $file: $!"; print $outFile $retString; close ($outFile); However, I realized I can just use the truncate function and open the file for read/write: open (my $inFile, '+<', $file) or die "Could not open $file: $!"; $retString .= join ('', <$inFile>); ... truncate $inFile, 0; print $inFile $retString; close ($inFile); I don't see any examples of this anywhere. It seems to work well, but am I doing it correctly? Is there a better way to do this?
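Editor's sketch: the read, modify, and overwrite cycle described above can be done on a single read/write handle. The version below is a Python analogue, not the asker's Perl; `rewrite_in_place` and the demo file are hypothetical names. It rewinds before writing, because after reading to the end the file position is at the old end of file. The Perl `truncate` documentation notes that the position is left unchanged, so a `seek` is probably needed there too.

```python
import os
import tempfile


def rewrite_in_place(path, transform):
    """Read a text file, transform its contents, and write the result back."""
    with open(path, "r+", encoding="utf-8") as handle:
        original = handle.read()
        updated = transform(original)
        handle.seek(0)       # rewind: after read() the position is at end of file
        handle.write(updated)
        handle.truncate()    # drop leftover bytes if the new text is shorter
    return updated


# Demo on a throwaway file (hypothetical content).
fd, demo_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with open(demo_path, "w", encoding="utf-8") as f:
    f.write("hello world, hello file\n")

rewrite_in_place(demo_path, lambda text: text.replace("hello ", ""))

with open(demo_path, encoding="utf-8") as f:
    result = f.read()
os.remove(demo_path)
```

Writing first and truncating afterwards means the file is never empty on disk for longer than necessary, although a crash mid-write can still leave it half-updated.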

    Read the article

  • Most Efficient way to set Register to 1 or (-1)

    - by Bob
    I am taking an assembly course now, and the guy who checks our home assignments is a very pedantic old-school optimization freak. For example he deducts 10% if he sees: mov ax, 0 instead of: xor ax,ax even if it's only used once. I am not a complete beginner in assembly programming but I'm not an optimization expert, so I need your help with something (might be a very stupid question but I'll ask anyway): if I need to set a register value to 1 or (-1) is it better to use: mov ax, 1 or do something like: xor ax,ax inc ax I really need a good grade, so I'm trying to get it as optimized as possible. (I need to optimize both time and code size.)

    Read the article

  • Efficient file buffering & scanning methods for large files in python

    - by eblume
    The description of the problem I am having is a bit complicated, and I will err on the side of providing more complete information. For the impatient, here is the briefest way I can summarize it: What is the fastest (least execution time) way to split a text file into ALL (overlapping) substrings of size N (bound N, e.g. 36) while throwing out newline characters? I am writing a module which parses files in the FASTA ascii-based genome format. These files comprise what is known as the 'hg18' human reference genome, which you can download from the UCSC genome browser (go slugs!) if you like. As you will notice, the genome files are composed of chr[1..22].fa and chr[XY].fa, as well as a set of other small files which are not used in this module. Several modules already exist for parsing FASTA files, such as BioPython's SeqIO. (Sorry, I'd post a link, but I don't have the points to do so yet.) Unfortunately, every module I've been able to find doesn't do the specific operation I am trying to do. My module needs to split the genome data ('CAGTACGTCAGACTATACGGAGCTA' could be a line, for instance) into every single overlapping N-length substring. Let me give an example using a very small file (the actual chromosome files are between 355 and 20 million characters long) and N=8 import cStringIO example_file = cStringIO.StringIO("""\ header CAGTcag TFgcACF """) for read in parse(example_file): ... print read ... CAGTCAGTF AGTCAGTFG GTCAGTFGC TCAGTFGCA CAGTFGCAC AGTFGCACF The function that I found had the absolute best performance from the methods I could think of is this: def parse(file): size = 8 # of course in my code this is a function argument file.readline() # skip past the header buffer = '' for line in file: buffer += line.rstrip().upper() while len(buffer) >= size: yield buffer[:size] buffer = buffer[1:] This works, but unfortunately it still takes about 1.5 hours (see note below) to parse the human genome this way. 
Perhaps this is the very best I am going to see with this method (a complete code refactor might be in order, but I'd like to avoid it as this approach has some very specific advantages in other areas of the code), but I thought I would turn this over to the community. Thanks! Note, this time includes a lot of extra calculation, such as computing the opposing strand read and doing hashtable lookups on a hash of approximately 5G in size. Post-answer conclusion: It turns out that using fileobj.read() and then manipulating the resulting string (string.replace(), etc.) took relatively little time and memory compared to the remainder of the program, and so I used that approach. Thanks everyone!
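Editor's sketch: the post-answer conclusion (read the whole file, strip newlines, then slice) might look roughly like this in current Python. The name `kmers` and the assumption of a single record with one header line are illustrative, not taken from the question.

```python
import io


def kmers(handle, size):
    """Yield every overlapping window of `size` characters from one FASTA-like record."""
    handle.readline()  # skip the header line
    # Read the rest in one go and drop line breaks before slicing.
    sequence = handle.read().replace("\n", "").replace("\r", "").upper()
    for start in range(len(sequence) - size + 1):
        yield sequence[start:start + size]


example_file = io.StringIO("header\nCAGTcag\nTFgcACF\n")
reads = list(kmers(example_file, 8))
```

Slicing one prebuilt string avoids the repeated `buffer = buffer[1:]` copies of the original loop, at the cost of holding a whole chromosome in memory.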

    Read the article

  • Have parameters in Dao methods to get entities the most efficient way for read-only access

    - by Blankman
    A lot of my use of hibernate, at least for the data that is presented on many parts of the web application, is for read-only purposes. I want to add some parameters to my Dao methods so I can modify the way hibernate pulls the data and how it handles transactions etc. Example usage: Data on the front page of my website is displayed to the users, it is read-only, so I want to avoid any session/entity tracking that hibernate usually does. This is data that is read-only, will not be changed in this transaction, etc. What would be the most performant way to pull the data? (the code below is c#/nhibernate, I'm implementing this in java as I learn it) public IList<Article> GetArticles() { return Session.CreateCriteria(typeof(Article)) // some where clause }

    Read the article

  • Most efficient way of checking for a return from a function call in Perl

    - by Gaurav Dadhania
    I want to add the return value from the function call to an array iff something is returned (not by default, i.e. if I have a return statement in the subroutine.) so I'm using unshift @{$errors}, "HashValidator::$vfunction($hashref)"; but this actually adds the string of the function call to the array. I also tried unshift @{$errors}, $temp if defined my $temp = "HashValidator::$vfunction($hashref)"; with the same result. What would a perl one-liner look like that does this efficiently (I know I can do the ugly, multi-line check but I want to learn). Thanks,

    Read the article

  • Simple and efficient distribution of C++/Boost source code (amalgamation)

    - by Arrieta
    Hello: My job mostly consists of engineering analysis, but I find myself distributing code more and more frequently among my colleagues. A big pain is that not every user is proficient in the intricacies of compiling source code, and I cannot distribute executables. I've been working with C++ using Boost, and the problem is that I cannot request every sysadmin of every network to install the libraries. Instead, I want to distribute a single source file (or as few as possible) so that the user can g++ source.c -o program. So, the question is: can you pack the Boost libraries with your code, and end up with a single file? I am talking about the Boost libraries which are "headers only" or "templates only". As an inspiration, please look at the distribution of SQlite or the Lemon Parser Generator; the author amalgamates the stuff into a single source file which is trivial to compile. Thank you.

    Read the article

  • What is an efficient way to find a non-colliding rectangle nearest to a location

    - by hyn
    For a 2D game I am working on, I am using y axis sorting in a simple rectangle-based collision detection. This is working fine, and now I want to find the nearest empty rectangle at a given location with a given size, efficiently. How can I do this? Is there an algorithm? I could think of a simple brute force grid test (with each grid the size of the empty space we're looking for) but obviously this is slow and not even a complete test.

    Read the article

  • more efficient way to pickle a string

    - by gatoatigrado
    The pickle module seems to use string escape characters when pickling; this becomes inefficient e.g. on numpy arrays. Consider the following z = numpy.zeros(1000, numpy.uint8) len(z.dumps()) len(cPickle.dumps(z.dumps())) The lengths are 1133 characters and 4249 characters respectively. z.dumps() reveals something like "\x00\x00" (actual zeros in string), but pickle seems to be using the string's repr() function, yielding "'\x00\x00'" (zeros being ascii zeros). i.e. ("0" in z.dumps() == False) and ("0" in cPickle.dumps(z.dumps()) == True)
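Editor's sketch: the size difference described above is typical of the text-oriented pickle protocol 0, which escapes non-printable bytes. The snippet uses Python 3's `pickle` and a plain `bytes` payload standing in for the array dump, so the numbers will not match the question's; in Python 2 the analogous step would presumably be passing a binary protocol to `cPickle.dumps`.

```python
import pickle

payload = bytes(1000)  # 1000 zero bytes, standing in for the array's raw dump

# Protocol 0 is ASCII-oriented and escapes bytes; binary protocols store them raw.
text_pickle = pickle.dumps(payload, protocol=0)
binary_pickle = pickle.dumps(payload, protocol=pickle.HIGHEST_PROTOCOL)

sizes = (len(text_pickle), len(binary_pickle))
restored = pickle.loads(binary_pickle)
```

With a binary protocol the overhead on top of the raw payload should be a few dozen bytes at most.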

    Read the article

  • Efficient implementation of natural logarithm (ln) and exponentiation

    - by Donotalo
    Basically, I'm looking for an implementation of the log() and exp() functions provided in the C library <math.h>. I'm working with 8 bit microcontrollers (OKI 411 and 431). I need to calculate Mean Kinetic Temperature. The requirement is that we should be able to calculate MKT as fast as possible and with as little code memory as possible. The compiler comes with log() and exp() functions in <math.h>. But calling either function and linking with the library causes the code size to increase by 5 Kilobytes, which will not fit in one of the microcontrollers we work with (OKI 411), because our code already consumes ~12K of the available ~15K code memory. The implementation I'm looking for should not use any other C library functions (like pow(), sqrt() etc). This is because all library functions are packed in one library and even if one function is called, the linker will bring the whole 5K library into code memory.
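Editor's sketch: one library-free way to get ln and exp is range reduction plus a short series. The Python below only illustrates the arithmetic; it uses double-precision floats, the term counts are arbitrary, and nothing here is measured for code size or speed on an 8-bit target, where fixed-point or a table-based method may be a better fit.

```python
LN2 = 0.6931471805599453  # precomputed ln(2)


def ln(x):
    """Natural log: reduce x to [1, 2), then use ln(m) = 2*atanh((m-1)/(m+1))."""
    if x <= 0:
        raise ValueError("ln is only defined for positive x")
    exponent = 0
    while x >= 2.0:
        x *= 0.5
        exponent += 1
    while x < 1.0:
        x *= 2.0
        exponent -= 1
    z = (x - 1.0) / (x + 1.0)  # |z| <= 1/3, so the series converges quickly
    z2 = z * z
    term, total = z, 0.0
    for k in range(12):
        total += term / (2 * k + 1)
        term *= z2
    return exponent * LN2 + 2.0 * total


def exp(x):
    """Exponential: write x = n*ln2 + r, sum a Taylor series for e**r, scale by 2**n."""
    n = int(x / LN2 + (0.5 if x >= 0 else -0.5))  # nearest integer to x/ln2
    r = x - n * LN2
    term, total = 1.0, 1.0
    for k in range(1, 16):
        term *= r / k
        total += term
    factor = 2.0 if n >= 0 else 0.5
    for _ in range(abs(n)):
        total *= factor
    return total
```

Only addition, multiplication, division, and comparisons are used, so no other math-library routine gets pulled in.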

    Read the article

  • What is an efficient way to erase substrings?

    - by Legend
    I have a long string and a list of [end-index, string] pairs like the following: long_sentence = "This is a long long long long sentence" indices = [[6, "is"], [8, "is a"], [18, "long"], [23, "long"]] An element [6, "is"] indicates that 6 is the end index of the word "is" in the string. I want to get the following string in the end: >> print long_sentence This .... long ......... long sentence I tried an approach like this: temp = long_sentence for i in indices: temp = temp[:int(i[0]) - len(i[1])] + '.'*(len(i[1])+1) + temp[i[0]+1:] While this seems to be working, it is taking an exceptionally long time (more than 6 hours on 5000 strings inside a 300 MB file). Is there a way to speed this up?
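Editor's sketch: rebuilding the whole string for every index pair copies it each time. One alternative is to edit a list of characters in place and join once at the end. The function name `mask_spans` is hypothetical, and this version masks exactly the characters of each word, treating the end index as inclusive; whether the neighbouring spaces should also be masked is not clear from the question, so the output differs slightly from the one shown there.

```python
def mask_spans(text, spans):
    """Replace each (end_index, word) span with dots; end_index is inclusive."""
    chars = list(text)  # mutable copy, edited in place
    for end, word in spans:
        start = end - len(word) + 1
        chars[start:end + 1] = "." * len(word)  # same length, so indices stay valid
    return "".join(chars)  # build the final string once


long_sentence = "This is a long long long long sentence"
indices = [[6, "is"], [8, "is a"], [18, "long"], [23, "long"]]
masked = mask_spans(long_sentence, indices)
```

Because every replacement keeps the length unchanged, overlapping spans such as the first two are harmless.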

    Read the article

  • More efficient R / Sweave / TeXShop work-flow?

    - by user594795
    I've now got everything to work properly on my Mac OS X 10.6 machine so that I can create decent looking LaTeX documents with Sweave that include snippets of R code, output, and LaTeX formatting together. Unfortunately, I feel like my work-flow is a bit clunky and inefficient: Using TextWrangler, I write LaTeX code and R code (surrounded by <<>>= above and @ below the R code chunk) together in one .Rnw file. After saving changes, I call the .Rnw file from R using the Sweave command Sweave(file="/Users/mymachine/Documents/Assign4.Rnw", syntax="SweaveSyntaxNoweb") In response, R outputs the following message: You can now run LaTeX on 'Assign4.tex' So then I find the .tex file (Assign4.tex) in the R directory and copy it over to the folder in my documents ~/Documents/ where the .Rnw file is sitting (to keep everything in one place). Then I open the .tex file (e.g. Assign4.tex) in TeXShop and compile it there into pdf format. It is only at this point that I get to see any changes I have made to the document and see if it 'looks nice'. Is there a way that I can compile everything with one button click? Specifically it would be nice to either call Sweave / R directly from TextWrangler or TeXShop. I suspect it might be possible to code a script in Terminal to do it, but I have no experience with Terminal. Please let me know if there are any other things I can do to streamline or improve my work flow.

    Read the article

  • Efficient update of SQLite table with many records

    - by blackrim
    I am trying to use sqlite (sqlite3) for a project to store hundreds of thousands of records (would like sqlite so users of the program don't have to run a [my]sql server). I have to update hundreds of thousands of records sometimes to enter left right values (they are hierarchical), but have found the standard update table set left_value = 4, right_value = 5 where id = 12340; to be very slow. I have tried surrounding every thousand or so with begin; .... update... update table set left_value = 4, right_value = 5 where id = 12340; update... .... commit; but again, very slow. Odd, because when I populate it with a few hundred thousand (with inserts), it finishes in seconds. I am currently trying to test the speed in python (the slowness is at the command line and python) before I move it to the C++ implementation, but right now this is way too slow and I need to find a new solution unless I am doing something wrong. Thoughts? (would take open source alternative to SQLite that is portable as well)
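Editor's sketch: two things usually worth checking for slow bulk updates are that the `WHERE id = ?` lookup is indexed and that the whole batch runs inside one transaction. The excerpt does not say whether `id` is indexed, so that is an assumption here; the table name `nodes` and the generated values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# INTEGER PRIMARY KEY makes lookups by id a direct rowid lookup rather than a scan.
conn.execute(
    "CREATE TABLE nodes (id INTEGER PRIMARY KEY, left_value INTEGER, right_value INTEGER)"
)
conn.executemany("INSERT INTO nodes (id) VALUES (?)", ((i,) for i in range(1, 10001)))
conn.commit()

# (left_value, right_value, id) triples; made-up values for the demo.
updates = [(2 * i, 2 * i + 1, i) for i in range(1, 10001)]

with conn:  # one transaction for the whole batch; commits on success
    conn.executemany(
        "UPDATE nodes SET left_value = ?, right_value = ? WHERE id = ?", updates
    )

row = conn.execute(
    "SELECT left_value, right_value FROM nodes WHERE id = 1234"
).fetchone()
conn.close()
```

If inserts are fast but updates are slow, a full table scan per `UPDATE` is a plausible suspect, since inserts never need the lookup.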

    Read the article

  • What is the most efficient way to delete all selected items in a ListViewItem collection

    - by Andrew
    My user is able to select multiple items in a ListView collection that is configured to show details (that is, a list of rows). What I want to do is add a Delete button that will delete all of the selected items from the ListViewItem collection associated with the ListView. The collection of selected items is available in ListView.SelectedItems, but ListView.Items doesn't appear to have a single method that lets me delete the entire range. I have to iterate through the range and delete them one by one, which will potentially modify a collection I'm iterating over. Any hints? Edit: What I'm basically after is the opposite of AddRange().

    Read the article

  • .NET Efficient way to generate WORD Doc - Server Side

    - by alexbf
    Hello, .NET 4.0. I am looking for the easiest way to generate a Word document on our server. Limitations: server side; I don't want to install Word on the server; data source is XML. I tried to generate a DOCX with XSLT, which is fast and easy, but the only way I could find to validate the generated document is to open it with Word, and the only error I get when the document is not valid is "Error while opening document". Not very useful. Any ideas? Thanks, Alex

    Read the article

  • How efficient is a details table?

    - by Jeffrey Lott
    At my job, we have a pseudo-standard of creating one table to hold the "standard" information for an entity, and a second table, named like 'TableNameDetails', which holds optional data elements. On average, every row in the main table has about 8-10 detail rows. My question is: What kind of performance impact does this have compared with adding these details as additional nullable columns on the main table?

    Read the article

  • efficient thread-safe singleton in C++

    - by user168715
    The usual pattern for a singleton class is something like static Foo &getInst() { static Foo *inst = NULL; if(inst == NULL) inst = new Foo(...); return *inst; } However, it's my understanding that this solution is not thread-safe, since 1) Foo's constructor might be called more than once (which may or may not matter) and 2) inst may not be fully constructed before it is returned to a different thread. One solution is to wrap a mutex around the whole method, but then I'm paying for synchronization overhead long after I actually need it. An alternative is something like static Foo &getInst() { static Foo *inst = NULL; if(inst == NULL) { pthread_mutex_lock(&mutex); if(inst == NULL) inst = new Foo(...); pthread_mutex_unlock(&mutex); } return *inst; } Is this the right way to do it, or are there any pitfalls I should be aware of? For instance, are there any static initialization order problems that might occur, i.e. is inst always guaranteed to be NULL the first time getInst is called?
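Editor's sketch: the second variant in the question is the double-checked locking pattern. Its control flow is shown below in Python, where a `threading.Lock` plays the role of the pthread mutex. This only illustrates the structure; the C++-specific concerns raised in the question (construction visibility across threads, static initialization order) do not show up in a Python version and are not answered by it.

```python
import threading


class Foo:
    _instance = None
    _lock = threading.Lock()

    @classmethod
    def get_instance(cls):
        if cls._instance is None:          # first check, without the lock
            with cls._lock:
                if cls._instance is None:  # second check, under the lock
                    cls._instance = cls()
        return cls._instance


# Call the accessor from several threads and collect what each one saw.
instances = []


def worker():
    instances.append(Foo.get_instance())


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

unique_count = len({id(obj) for obj in instances})
```

After the first call, the lock is only touched if the instance is still missing, which is the saving the question is after.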

    Read the article

  • Improve a regex statement in order to be as efficient as it can be

    - by user551625
    I have a PHP program that, at some point, needs to analyze a large amount of HTML+javascript text to parse info. All I want to parse needs to be done in two parts: separate all "HTML groups" to parse, then parse each HTML group to get the needed information. In the 1st parse it needs to find: <div id="myHome" And start capturing after that tag. Then stop capturing before <span id="nReaders" And capture the number that comes after this tag and stop. In the 2nd parse use the capture nº 1 (0 has the whole thing and 2 has the number) from the parse made before and then find . I already have code to do that and it works. Is there a way to improve this, make it easier for the machine to parse? preg_match_all('%<div id="myHome"[^>]>(.*?)<span id="nReaders[^>]>([0-9]+)<"%msi', $data, $results, PREG_SET_ORDER); foreach($results AS $result){ preg_match_all('%<div class="myplacement".*?[.]php[?]((?:next|before))=([0-9]+).*?<tbody.*?<td[^>]>.*?[0-9]+"%msi', $result[1], $mydata, PREG_SET_ORDER); //takes care of the data and finish the program Note: I need this for a freeware program so it must be as general as possible and, if possible, not use php extensions ADD: I omitted some parts here because I didn't expect answers like those. There is also a need to parse text inside one of the tags that is in the document. It may be the 6th, 7th or 8th tag but I know it is after a certain tag. The parser I've checked (thanks profitphp) does work to find the script tag. What now? There is more than one tag with the same class. I want them all. But I want only those that also have one of a list of classes..... Where can I find instructions and demos and limitations of DOM parsers (like the one in http://simplehtmldom.sourceforge.net/)? I need something that will work on, at least, a large number of free servers.

    Read the article

  • Efficient (basic) regular expression implementation for streaming data

    - by Brendan Dolan-Gavitt
    I'm looking for an implementation of regular expression matching that operates on a stream of data -- i.e., it has an API that allows a user to pass in one character at a time and report when a match is found on the stream of characters seen so far. Only very basic (classic) regular expressions are needed, so a DFA/NFA based implementation seems like it would be well-suited to the problem. Based on the fact that it's possible to do regular expression matching using a DFA/NFA in a single linear sweep, it seems like a streaming implementation should be possible. Requirements: The library should not try to wait until the full string has been read before performing the match. The data I have really is streaming; there is no way to know how much data will arrive, and it's not possible to seek forward or backward. Implementing specific stream matching for a couple of special cases is not an option, as I don't know in advance what patterns a user might want to look for. For the curious, my use case is the following: I have a system which intercepts memory writes inside a full system emulator, and I would like to have a way to identify memory writes that match a regular expression (e.g., one could use this to find the point in the system where a URL is written to memory). I have found (links de-linkified because I don't have enough reputation): stackoverflow.com/questions/1962220/apply-a-regex-on-stream stackoverflow.com/questions/716927/applying-a-regular-expression-to-a-java-i-o-stream www.codeguru.com/csharp/csharp/cs_data/searching/article.php/c14689/Building-a-Regular-Expression-Stream-Search-with-the-NET-Framework.htm But all of these attempt to convert the stream to a string first and then use a stock regular expression library. Another thought I had was to modify the RE2 library, but according to the author it is architected around the assumption that the entire string is in memory at the same time. 
If nothing's available, then I can start down the unhappy path of reinventing this wheel to fit my own needs, but I'd really rather not if I can avoid it. Any help would be greatly appreciated!
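Editor's sketch: the API shape described above (feed one character, learn whether a match just ended) falls out naturally from a DFA, because the only state kept between characters is the current state number. The class name `StreamMatcher` and the transition table are hypothetical; the table is hand-built for the single pattern `ab+c` searched anywhere in the stream, and compiling arbitrary regular expressions into such a table is not shown.

```python
class StreamMatcher:
    """Feed one character at a time; feed() returns True when a match ends there."""

    def __init__(self, transitions, start, accepting, default):
        self.transitions = transitions  # {state: {char: next_state}}
        self.accepting = accepting      # set of accepting states
        self.default = default          # state for characters with no listed edge
        self.state = start

    def feed(self, char):
        self.state = self.transitions[self.state].get(char, self.default)
        return self.state in self.accepting


# Hand-built DFA for "ab+c" anywhere in the stream (unanchored search).
AB_PLUS_C = {
    0: {"a": 1},                      # nothing matched yet
    1: {"a": 1, "b": 2},              # seen "a"
    2: {"a": 1, "b": 2, "c": 3},      # seen "ab+"
    3: {"a": 1},                      # just matched; ready for the next attempt
}

matcher = StreamMatcher(AB_PLUS_C, start=0, accepting={3}, default=0)
stream = "xxabbbcyabc"
hits = [i for i, ch in enumerate(stream) if matcher.feed(ch)]
```

No input is buffered: memory use is constant regardless of how much data arrives.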

    Read the article

  • Perl, time efficient hash

    - by Mike
    Is it possible to use a Perl hash in a manner that has O(log(n)) lookup and insertion? By default, I assume the lookup is O(n) since it's represented by an unsorted list. I know I could create a data structure to satisfy this (i.e., a tree, etc.); however, it would be nicer if it were built in and could be used as a normal hash (i.e., with %).

    Read the article

  • Most efficient way to draw circles for polygon outlines

    - by user146780
    I'm using OpenGL and was told I should draw circles at each vertex of my outline to get smoothness. I tried this and it works great. The problem is speed. It crippled my application to draw a circle at each vertex. I'm not sure how else to fix the anomaly of my outlines other than circles, but using display lists and trying with vertex array both were brutally slow. Thanks

    Read the article

  • Datastructure choices for highspeed and memory efficient detection of duplicate of strings

    - by Jonathan Holland
    I have an interesting problem that could be solved in a number of ways: I have a function that takes in a string. If this function has never seen this string before, it needs to perform some processing. If the function has seen the string before, it needs to skip processing. After a specified amount of time, the function should accept duplicate strings. This function may be called thousands of times per second, and the string data may be very large. This is a highly abstracted explanation of the real application, just trying to get down to the core concept for the purpose of the question. The function will need to store state in order to detect duplicates. It also will need to store an associated timestamp in order to expire duplicates. It does NOT need to store the strings; a unique hash of the string would be fine, provided there are no false positives due to collisions (use a perfect hash?) and the hash function is performant enough. The naive implementation would be simply (in C#): Dictionary<String,DateTime> though in the interest of lowering memory footprint and potentially increasing performance I'm evaluating custom data structures to handle this instead of a basic hashtable. So, given these constraints, what would you use? EDIT, some additional information that might change proposed implementations: 99% of the strings will not be duplicates. Almost all of the duplicates will arrive back to back, or nearly sequentially. In the real world, the function will be called from multiple worker threads, so state management will need to be synchronized.
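Editor's sketch: the baseline described above (digest of the string mapped to an expiry time, guarded by a lock) can be written compactly. The class name `RecentlySeen`, the 128-bit BLAKE2b digest, and the injectable clock are illustrative choices. A digest is not the perfect hash the question asks about: collisions are extremely unlikely at this size but not impossible.

```python
import hashlib
import threading
import time


class RecentlySeen:
    """Remember string digests for `ttl_seconds`; report whether to process a string."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._expires = {}             # digest -> time after which it may repeat
        self._lock = threading.Lock()  # callers may be on several worker threads

    def should_process(self, text):
        digest = hashlib.blake2b(text.encode("utf-8"), digest_size=16).digest()
        now = self.clock()
        with self._lock:
            expiry = self._expires.get(digest)
            if expiry is not None and now < expiry:
                return False           # duplicate inside the window: skip
            self._expires[digest] = now + self.ttl
            return True

    def purge(self):
        """Drop expired entries so the table does not grow without bound."""
        now = self.clock()
        with self._lock:
            self._expires = {d: t for d, t in self._expires.items() if t > now}


# Demo with a fake clock so the expiry can be observed without sleeping.
fake_now = [0.0]
seen = RecentlySeen(ttl_seconds=10, clock=lambda: fake_now[0])
first = seen.should_process("payload")
second = seen.should_process("payload")
fake_now[0] = 11.0
third = seen.should_process("payload")
```

Storing a fixed 16-byte digest instead of the string keeps memory per entry constant even when the inputs are very large.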

    Read the article

  • Efficient way to build a MySQL update query in Python

    - by ensnare
    I have a class variable called attributes which lists the instance variables I want to update in a database: attributes = ['id', 'first_name', 'last_name', 'name', 'name_url', 'email', 'password', 'password_salt', 'picture_id'] Each of the class attributes is updated upon instantiation. I would like to loop through each of the attributes and build a MySQL update query in the form of: UPDATE members SET id = self._id, first_name = self._first_name ... Thanks.
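Editor's sketch: one way to build such a statement is to generate the column list from `attributes` and pass the values separately as parameters, using the `%s` placeholder style of MySQLdb-style drivers. The helper names and the `Member` class are hypothetical. Moving `id` into a `WHERE` clause instead of the `SET` list is an assumption about the intent, since the excerpt does not show a `WHERE` clause.

```python
ATTRIBUTES = ["id", "first_name", "last_name", "email"]  # shortened for the demo


def build_update(table, attributes, key="id"):
    """Return (sql, ordered_names) for a parameterized UPDATE keyed on `key`."""
    columns = [name for name in attributes if name != key]
    assignments = ", ".join(f"{name} = %s" for name in columns)
    # Table and column names are interpolated, so they must come from trusted code.
    sql = f"UPDATE {table} SET {assignments} WHERE {key} = %s"
    return sql, columns + [key]


def update_params(obj, ordered_names):
    """Collect values from the instance's underscore-prefixed attributes."""
    return tuple(getattr(obj, "_" + name) for name in ordered_names)


class Member:  # hypothetical stand-in for the asker's class
    def __init__(self):
        self._id = 7
        self._first_name = "Ada"
        self._last_name = "Lovelace"
        self._email = "ada@example.com"


sql, names = build_update("members", ATTRIBUTES)
params = update_params(Member(), names)
# With a DB-API cursor this would be run as: cursor.execute(sql, params)
```

Keeping values out of the SQL string lets the driver handle quoting and avoids injection through field contents.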

    Read the article

  • Editing a 1MB file continuously, what's more efficient?

    - by kmels
    I have to continuously edit a 1MB file, simulating a file system. I have to modify the directory of File Control Blocks, FAT, blocks, etc. My professor recommended overwriting the file every time an update is made. 1MB shouldn't take minutes to do that, but I don't like this way. Is a FileChannel the way to go here? Also, I understand that if I edit a MappedByteBuffer, the content of the mapped file region is also edited immediately? i.e. is the mapping reflected in the file? Thanks.
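Editor's sketch: the memory-mapped approach asked about above is shown here with Python's `mmap` module as an analogue of Java's `FileChannel.map`. The file layout, the offset 512, and the patched bytes are made up for the demo. Writes go to the mapped pages, and `flush()` asks for them to be written back to the file; in Java the corresponding call would presumably be `MappedByteBuffer.force()`.

```python
import mmap
import os
import tempfile

# Create a 1 MB "disk image" filled with zero bytes.
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "wb") as f:
    f.write(bytes(1024 * 1024))

# Map the file and patch four bytes in place, without rewriting the rest.
with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as disk:
        disk[512:516] = b"FAT!"  # slice assignment must keep the length unchanged
        disk.flush()             # push the modified pages back to the file

# Read the bytes back through ordinary file I/O.
with open(path, "rb") as f:
    f.seek(512)
    patched = f.read(4)
os.remove(path)
```

Only the touched region is written, which is the main attraction over rewriting the full 1 MB on every update.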

    Read the article
