Search Results

Search found 3758 results on 151 pages for 'efficient'.

Page 123/151 | < Previous Page | 119 120 121 122 123 124 125 126 127 128 129 130  | Next Page >

  • Alternative to Galileo GWS

    - by Anton Gogolev
    Please note that this Galileo is absolutely not related to Java. Galileo is basically a set of web services which can be used to book airline tickets. Originally, it was supposed to be used via Galileo Desktop, whereby operators would enter various commands to perform required operations. For example, SA*AZ610J20JULFCOJFK will "Display seat availability map for specified flight and class". Granted, humans can get used to it and be very efficient, but here comes a problem of integrating this with other systems. For that, folks at TravelPort basically slapped a SOAP interface to this system (which must have been written in COBOL or something), without even thinking about actually embracing XML. For example, it can contain <Ind1>N</Ind1> <Ind2>N</Ind2> <Ind3>N</Ind3> ... <Ind72>N</Ind72> <!-- Yes! 72! --> or, better yet <Text>P/RU/4xxx24528/RU/11MAY67/M/23DEC12/AxxxxxxV/MxxxM</Text> In light of this, my question is as follows: are there any sane airline tickets booking systems we can integrate with? Or are there companies which have products that can abstract away all this?

    Read the article

  • Simple description of worker and I/O threads in .NET

    - by Konstantin
    It's very hard to find detailed but simple description of worker and I/O threads in .NET What's clear to me regarding this topic (but may not be technically precise): Worker threads are threads that should employ CPU for their work; I/O threads (also called "completion port threads") should employ device drivers for their work and essentially "do nothing", only monitor the completion of non-CPU operations. What is not clear: Although method ThreadPool.GetAvailableThreads returns number of available threads of both types, it seems there is no public API to schedule work for I/O thread. You can only manually create worker thread in .NET? It seems that single I/O thread can monitor multiple I/O operations. Is it true? If so, why ThreadPool has so many available I/O threads by default? In some texts I read that callback, triggered after I/O operation completion is performed by I/O thread. Is it true? Isn’t this a job for worker thread, considering that this callback is CPU operation? To be more specific – do ASP.NET asynchronous pages user I/O threads? What exactly is performance benefit in switching I/O work to separate thread instead of increasing maximum number of worker threads? Is it because single I/O thread does monitor multiple operations? Or Windows does more efficient context switching when using I/O threads?

    Read the article

  • Generate a set of strings with maximum edit distance

    - by Kevin Jacobs
    Problem 1: I'd like to generate a set of n strings of fixed length m from alphabet s such that the minimum Levenshtein distance (edit distance) between any two strings is greater than some constant c. Obviously, I can use randomization methods (e.g., a genetic algorithm), but was hoping that this may be a well-studied problem in computer science or mathematics with some informative literature and an efficient algorithm or three. Problem 2: Same as above except that adjacent characters cannot repeat; the i'th character in each string may not be equal to the i+1'th character. E.g., 'CAT', 'AGA' and 'TAG' are allowed, 'GAA', 'AAT', and 'AAA' are not. Background: The basis for this problem is bioinformatic and involves designing unique DNA tags that can be attached to biologically derived DNA fragments and then sequenced using a fancy second generation sequencer. The goal is to be able to recognize each tag, allowing for random insertion, deletion, and substitution errors. The specific DNA sequencing technology has a relatively low error rate per base (~1%), but is less precise when a single base is repeated 2 or more times (motivating the additional constraints imposed in problem 2).

    Read the article

  • python histogram one-liner

    - by mykhal
    there are many ways, how to code histogram in Python. by histogram, i mean function, counting objects in an interable, resulting in the count table (i.e. dict). e.g.: >>> L = 'abracadabra' >>> histogram(L) {'a': 5, 'b': 2, 'c': 1, 'd': 1, 'r': 2} it can be written like this: def histogram(L): d = {} for x in L: if x in d: d[x] += 1 else: d[x] = 1 return d ..however, there are much less ways, how do this in a single expression. if we had "dict comprehensions" in python, we would write: >>> { x: L.count(x) for x in set(L) } but we don't have them, so we have to write: >>> dict([(x, L.count(x)) for x in set(L)]) however, this approach may yet be readable, but is not efficient - L is walked-through multiple times, so this won't work for single-life generators.. the function should iterate well also through gen(), where: def gen(): for x in L: yield x we can go with reduce (R.I.P.): >>> reduce(lambda d,x: dict(d, x=d.get(x,0)+1), L, {}) # wrong! oops, does not work, the key name is 'x', not x :( i ended with: >>> reduce(lambda d,x: dict(d.items() + [(x, d.get(x, 0)+1)]), L, {}) (in py3k, we would have to write list(d.items()) instead of d.items(), but it's hypothethical, since there is no reduce there) please beat me with a better one-liner, more readable! ;)

    Read the article

  • Exporting dataset to Excel file with multiple sheets in ASP.NET

    - by engg
    In C# ASP.NET 3.5 web application, I need to export multiple datatables (or a dataset) to an Excel 2007 file with multiple sheets, and then provide the user with 'Open/Save' dialog box, WITHOUT saving the Excel file on the web server. I have used Excel Interop before. I have been reading that it's not efficient and is not the best approach to achieve this and there are more ways to do it, 2 of them being: 1) Converting data in datatables to an XML string that Excel understands 2) Using OPEN XML SDK 2.0. It looks like OPEN XML SDK 2.0 is better, please let me know. Are there any other ways to do it? I don't want to use any third-party tools. If I use OPEN XML SDK, it creates an excel file, right? I don't want to save it on the (Windows 2003) server hard drive (I don't want to use Server.MapPath, these Excel files are dynamically created, and they are not required on the server, once client gets them). I directly want to prompt the user to open/save it. I know how to do it when the 'XML string' approach is used. Please help. Thank you.

    Read the article

  • Divide a path into N sections using Java or PostgreSQL/PostGIS

    - by Guido
    Imagine a GPS tracking system that is following the position of several objects. The points are stored in a database (PostgreSQL + PostGIS). Each path is composed by a different number of points. That is the reason why, in order to compare a pair of paths, I need to divide every path in a set of 100 points. Do you know any PostGIS function that already implement this algorithm? I've not been able to find it. If not, I'd like to solve it using Java. In this case I'd like to know an efficient and easy to implement algorithm to divide a path into N points. The most simple example could be to divide this path into three points: position 1 : x=1, y=2 position 2 : x=1, y=3 And the result should be: position 1 : x=1, y=2 (starting point) position 2 : x=5, y=2.5 position 3 : x=9, y=3 (end point) Edit: By 'compare a pair of paths' I mean to calculate the distance between two paths. I plan to divide each path in 100 points, and sum the euclidean distance between each one of these points as the distance between the two paths.

    Read the article

  • "Priming" a whole database in MSSQL for first-hit speed

    - by David Spillett
    For a particular apps I have a set of queries that I run each time the database has been restarted for any reason (server reboot usually). These "prime" SQL Server's page cache with the common core working set of the data so that the app is not unusually slow the first time a user logs in afterwards. One instance of the app is running on an over-specced arrangement where the SQL box has more RAM than the size of the database (4Gb in the machine, the DB is under 1.5Gb currently and unlikely to grow too much relative to that in the near future). Is there a neat/easy way of telling SQL Server to go away and load everything into RAM? It could be done the hard way by having a script scan sysobjects & sysindexes and running SELECT * FROM <table> WITH(INDEX(<index_name>)) ORDER BY <index_fields> for every key and index found, which should cause every used page to be read at least once and so be in RAM, but is there a cleaner or more efficient way? All planned instances where the database server is stopped are out-of-normal-working-hours (all the users are at most one timezone away and unlike me none of them work at silly hours) so such a process (until complete) slowing down users more than the working set not being primed at all would is not an issue.

    Read the article

  • C# String.Replace with a start/index (Added my (slow) implementation)

    - by Chris T
    I'd like an efficient method that would work something like this EDIT: Sorry I didn't put what I'd tried before. I updated the example now. // Method signature, Only replaces first instance or how many are specified in max public int MyReplace(ref string source,string org, string replace, int start, int max) { int ret = 0; int len = replace.Length; int olen = org.Length; for(int i = 0; i < max; i++) { // Find the next instance of the search string int x = source.IndexOf(org, ret + olen); if(x > ret) ret = x; else break; // Insert the replacement source = source.Insert(x, replace); // And remove the original source = source.Remove(x + len, olen); // removes original string } return ret; } string source = "The cat can fly but only if he is the cat in the hat"; int i = MyReplace(ref source,"cat", "giraffe", 8, 1); // Results in the string "The cat can fly but only if he is the giraffe in the hat" // i contains the index of the first letter of "giraffe" in the new string The only reason I'm asking is because my implementation I'd imagine getting slow with 1,000s of replaces.

    Read the article

  • Implementing Model-level caching

    - by Byron
    I was posting some comments in a related question about MVC caching and some questions about actual implementation came up. How does one implement a Model-level cache that works transparently without the developer needing to manually cache, yet still remains efficient? I would keep my caching responsibilities firmly within the model. It is none of the controller's or view's business where the model is getting data. All they care about is that when data is requested, data is provided - this is how the MVC paradigm is supposed to work. (Source: Post by Jarrod) The reason I am skeptical is because caching should usually not be done unless there is a real need, and shouldn't be done for things like search results. So somehow the Model itself has to know whether or not the SELECT statement being issued to it worthy of being cached. Wouldn't the Model have to be astronomically smart, and/or store statistics of what is being most often queried over a long period of time in order to accurately make a decision? And wouldn't the overhead of all this make the caching useless anyway? Also, how would you uniquely identify a query from another query (or more accurately, a resultset from another resultset)? What about if you're using prepared statements, with only the parameters changing according to user input? Another poster said this: I would suggest using the md5 hash of your query combined with a serialized version of your input arguments. This would require twice the number of serialization options. I was under the impression that serialization was quite expensive, and for large inputs this might be even worse than just re-querying. And is the minuscule chance of collision worth worrying about? Conceptually, caching in the Model seems like a good idea to me, but it seems in practicality the developer should have direct control over caching and write it into the controller. Thoughts/ideas? Edit: I'm using PHP and MySQL if that helps to narrow your focus.

    Read the article

  • Best way to randomly select rows *per* column in SQL Server

    - by LesterDove
    A search of SO yields many results describing how to select random rows of data from a database table. My requirement is a bit different, though, in that I'd like to select individual columns from across random rows in the most efficient/random/interesting way possible. To better illustrate: I have a large Customers table, and from that I'd like to generate a bunch of fictitious demo Customer records that aren't real people. I'm thinking of just querying randomly from the Customers table, and then randomly pairing FirstNames with LastNames, Address, City, State, etc. So if this is my real Customer data (simplified): FirstName LastName State ========================== Sally Simpson SD Will Warren WI Mike Malone MN Kelly Kline KS Then I'd generate several records that look like this: FirstName LastName State ========================== Sally Warren MN Kelly Malone SD Etc. My initial approach works, but it lacks the elegance that I'm hoping the final answer will provide. (I'm particularly unhappy with the repetitiveness of the subqueries, and the fact that this solution requires a known/fixed number of fields and therefore isn't reusable.) SELECT FirstName = (SELECT TOP 1 FirstName FROM Customer ORDER BY newid()), LastName= (SELECT TOP 1 LastNameFROM Customer ORDER BY newid()), State = (SELECT TOP 1 State FROM Customer ORDER BY newid()) Thanks!

    Read the article

  • Java: split a List into two sub-Lists?

    - by Chris Conway
    What's the simplest, most standard, and/or most efficient way to split a List into two sub-Lists in Java? It's OK to mutate the original List, so no copying should be necessary. The method signature could be /** Split a list into two sublists. The original list will be modified to * have size i and will contain exactly the same elements at indices 0 * through i-1 as it had originally; the returned list will have size * len-i (where len is the size of the original list before the call) * and will have the same elements at indices 0 through len-(i+1) as * the original list had at indices i through len-1. */ <T> List<T> split(List<T> list, int i); [EDIT] List.subList returns a view on the original list, which becomes invalid if the original is modified. So split can't use subList unless it also dispenses with the original reference (or, as in Marc Novakowski's answer, uses subList but immediately copies the result).

    Read the article

  • Finding Common Phrases in MS SQL TEXT Column

    - by regex
    Hello All, Short Desc: I'm curious to see if I can use SQL Analysis services or some other MS SQL service to mine some data for me that will show commonalities between SQL TEXT fields in a dataset. Long Desc I am looking at a subset of data that consists of about 10,000 rows of TEXT blobs which are used as a notes column in a issue tracking (ticketing) software. I would like to use something out of the box (without having to build something) that might be able to parse through all of the rows and find commonly used byte sequences in the "Notes" column. In other words, I want to find commonly used phrases (two to three word phrases, so 9 - 20 character sections of the TEXT blob). This will help me better determine if associate's notes contain similar phrases (troubleshooting techniques) that we could standardize in our troubleshooting process flow. Closing Note I'd really rather not build an application to do this as my method will probably not be the most efficient way to do it. Hopefully all this makes sense. Please let me know in the comments if anything needs clarification. Thanks in advance for your help.

    Read the article

  • C#/LINQ: How to define generically a keySelector for a templated class before calling OrderBy

    - by PierrOz
    Hi Folks, I have the following class defined in C# class myClass<T,U> { public T PropertyOne { get; set; } public U PropertyTwo { get; set; } } I need to write a function that reorder a list of myClass objects and takes two other parameters which define how I do this reorder: does my reordering depend on PropertyOne or PropertyTwo and is it ascending or descending. Let's say this two parameters are boolean. With my current knowledge in LINQ, I would write: public IList<myClass<T,U>> ReOrder(IList<myClass<T,U>> myList, bool usePropertyOne, bool ascending) { if (usePropertyOne) { if (ascending) { return myList.OrderBy(o => o.PropertyOne).ToList(); } else { return myList.OrderByDescending(o => o.PropertyOne).ToList(); } } else { if (ascending) { return myList.OrderBy(o => o.PropertyTwo).ToList(); } else { return myList.OrderByDescending(o => o.PropertyTwo).ToList(); } } } What could be a more efficient/elegant way to do that ? How can I declare the Func,TResult keySelector object to reuse when I call either OrderBy or OrderByDescending? I'm interesting in the answer since in my real life, I can have more than two properties.

    Read the article

  • What is a data structure for quickly finding non-empty intersections of a list of sets?

    - by Andrey Fedorov
    I have a set of N items, which are sets of integers, let's assume it's ordered and call it I[1..N]. Given a candidate set, I need to find the subset of I which have non-empty intersections with the candidate. So, for example, if: I = [{1,2}, {2,3}, {4,5}] I'm looking to define valid_items(items, candidate), such that: valid_items(I, {1}) == {1} valid_items(I, {2}) == {1, 2} valid_items(I, {3,4}) == {2, 3} I'm trying to optimize for one given set I and a variable candidate sets. Currently I am doing this by caching items_containing[n] = {the sets which contain n}. In the above example, that would be: items_containing = [{}, {1}, {1,2}, {2}, {3}, {3}] That is, 0 is contained in no items, 1 is contained in item 1, 2 is contained in itmes 1 and 2, 2 is contained in item 2, 3 is contained in item 2, and 4 and 5 are contained in item 3. That way, I can define valid_items(I, candidate) = union(items_containing[n] for n in candidate). Is there any more efficient data structure (of a reasonable size) for caching the result of this union? The obvious example of space 2^N is not acceptable, but N or N*log(N) would be.

    Read the article

  • How do I efficiently write a "toggle database value" function in AJAX?

    - by AmbroseChapel
    Say I have a website which shows the user ten images and asks them to categorise each image by clicking on buttons. A button for "funny", a button for "scary", a button for "pretty" and so on. These buttons aren't exclusive. A picture can be both funny and scary. The user clicks the "funny" button. An AJAX request is sent off to the database to mark that image as funny. The "funny" button lights up, by assigning a class in the DOM to mark it as "on". But the user made a mistake. They meant to hit the next button over. They should click "funny" again to turn it off, right? At this point I'm not sure whats the most efficient way to proceed. The database knows that the "funny" flag is set, but it's inefficient to query the database every time a button is clicked to say, is this flag set or not, then go on with a second database call to toggle it. Should I infer the state of the database flag from the DOM, i.e. if that button has the class "on" then the flag must be set, and branch at that point? Or would it be better to have a data structure in Javascript in the page which duplicates the state of each image in the database, so that every time I set the database flag to true, I also set the value in the Javascript data to true and so on?

    Read the article

  • Debugging unexpected error message - possible memory management problem?

    - by Ben Packard
    I am trying to debug an application that is throwing up strange (to my untutored eyed) errors. When I try to simply log the count of an array... NSLog(@"Array has %i items", [[self startingPlayers] count]); ...I sometimes get an error: -[NSCFString count]: unrecognized selector sent to instance 0x1002af600 or other times -[NSConcreteNotification count]: unrecognized selector sent to instance 0x1002af600 I am not sending 'count' to any NSString or NSNotification, and this line of code works fine normally. A Theory... Although the error varies, the crash happens at predictable times, immediately after I have run through some other code where I'm thinking I might have a memory management issue. Is it possible that the object reference is still pointing to something that is meant to be destroyed? Sorry if my terms are off, but perhaps it's expecting the array at the address it calls 'count' on, but finds another previous object that shouldn't still be there (eg an NSString)? Would this cause the problem? If so, what is the most efficient way to debug and find out what is that address? Most of my debugging up until now involves inserting NSLogs, so this would be a good opportunity to learn how to use the debugger.

    Read the article

  • Help making userscript work in chrome

    - by Vishal Shah
    I've written a userscript for Gmail Pimp.my.Gmail & i'd like it to be compatible with Google Chrome too. Now i have tried a couple of things, to the best of my Javascript knowledge (which is very weak) & have been successful up-to a certain extent, though im not sure if it's the right way. Here's what i tried, to make it work in Chrome: The very first thing i found is that contentWindow.document doesn't work in chrome, so i tried contentDocument, which works. BUT i noticed one thing, checking the console messages in Firefox and Chrome, i saw that the script gets executed multiple times in Firefox whereas in Chrome it just executes once! So i had to abandon the window.addEventListener('load', init, false); line and replace it with window.setTimeout(init, 5000); and i'm not sure if this is a good idea. The other thing i tried is keeping the window.addEventListener('load', init, false); line and using window.setTimeout(init, 1000); inside init() in case the canvasframe is not found. So please do lemme know what would be the best way to make this script cross-browser compatible. Oh and im all ears for making this script better/efficient code wise (which is sure there is)

    Read the article

  • SSIS - Bulk Update at Database Field Level

    - by Adam
    Hello, Here's our mission: Receive files from clients. Each file contains anywhere from 1 to 1,000,000 records. Records are loaded to a staging area and business-rule validation is applied. Valid records are then pumped into an OLTP database in a batch fashion, with the following rules: If record does not exist (we have a key, so this isn't an issue), create it. If record exists, optionally update each database field. The decision is made based on one of 3 factors...I don't believe it's important what those factors are. Our main problem is finding an efficient method of optionally updating the data at a field level. This is applicable across ~12 different database tables, with anywhere from 10 to 150 fields in each table (original DB design leaves much to be desired, but it is what it is). Our first attempt has been to introduce a table that mirrors the staging environment (1 field in staging for each system field) and contains a masking flag. The value of the masking flag represents the 3 factors. We've then put an UPDATE similar to... UPDATE OLTPTable1 SET Field1 = CASE WHEN Mask.Field1 = 0 THEN Staging.Field1 WHEN Mask.Field1 = 1 THEN COALESCE( Staging.Field1 , OLTPTable1.Field1 ) WHEN Mask.Field1 = 2 THEN COALESCE( OLTPTable1.Field1 , Staging.Field1 ) ... As you can imagine, the performance is rather horrendous. Has anyone tackled a similar requirement? We're a MS shop using a Windows Service to launch SSIS packages that handle the data processing. Unfortunately, we're pretty much novices at this stuff.

    Read the article

  • jdbc query - date ranges as parameters

    - by pstanton
    Hi all, I'd like to write a single JDBC statement that can handle the equivalent of any number of NOT BETWEEN date1 AND date2 where clauses. By single query, i mean that the same SQL string will be used to create the JDBC statements and then have different parameters supplied. This is so that the underlying frameworks can efficiently cache the query (i've been stung by that before). Essentially, I'd like to find a query that is the equivalent of SELECT * FROM table WHERE mydate NOT BETWEEN ? AND ? AND mydate NOT BETWEEN ? AND ? AND mydate NOT BETWEEN ? AND ? AND mydate NOT BETWEEN ? AND ? and at the same time could be used with fewer parameters: SELECT * FROM table WHERE mydate NOT BETWEEN ? AND ? or more parameters SELECT * FROM table WHERE mydate NOT BETWEEN ? AND ? AND mydate NOT BETWEEN ? AND ? AND mydate NOT BETWEEN ? AND ? AND mydate NOT BETWEEN ? AND ? AND mydate NOT BETWEEN ? AND ? AND mydate NOT BETWEEN ? AND ? AND mydate NOT BETWEEN ? AND ? AND mydate NOT BETWEEN ? AND ? AND mydate NOT BETWEEN ? AND ? AND mydate NOT BETWEEN ? AND ? AND mydate NOT BETWEEN ? AND ? I will consider using a temporary table if that will be simpler and more efficient. thanks for the help!

    Read the article

  • PHP templating challenge (optimizing front-end templates)

    - by Matt
    Hey all, I'm trying to do some templating optimizations and I'm wondering if it is possible to do something like this: function table_with_lowercase($data) { $out = '<table>'; for ($i=0; $i < 3; $i++) { $out .= '<tr><td>'; $out .= strtolower($data); $out .= '</td></tr>'; } $out .= "</table>"; return $out; } NOTE: You do not know what $data is when you run this function. Results in: <table> <tr><td><?php echo strtolower($data) ?></td></tr> <tr><td><?php echo strtolower($data) ?></td></tr> <tr><td><?php echo strtolower($data) ?></td></tr> </table> General Case: Anything that can be evaluated (compiled) will be. Any time there is an unknown variable, the variable and the functions enclosing it, will be output in a string format. Here's one more example: function capitalize($str) { return ucwords(strtolower($str)); } If $str is "HI ALL" then the output is: Hi All If $str is unknown then the output is: <?php echo ucwords(strtolower($str)); ?> In this case it would be easier to just call the function (ie. <?php echo capitalize($str) ?> ), but the example before would allow you to precompile your PHP to make it more efficient

    Read the article

  • PHP templating challenge (optimizing front-end templates)

    - by Matt
    Hey all, I'm trying to do some templating optimizations and I'm wondering if it is possible to do something like this: function table_with_lowercase($data) { $out = '<table>'; for ($i=0; $i < 3; $i++) { $out .= '<tr><td>'; $out .= strtolower($data); $out .= '</td></tr>'; } $out .= "</table>"; return $out; } NOTE: You do not know what $data is when you run this function. Results in: <table> <tr><td><?php echo strtolower($data) ?></td></tr> <tr><td><?php echo strtolower($data) ?></td></tr> <tr><td><?php echo strtolower($data) ?></td></tr> </table> General Case: Anything that can be evaluated (compiled) will be. Any time there is an unknown variable, the variable and the functions enclosing it, will be output in a string format. Here's one more example: function capitalize($str) { return ucwords(strtolower($str)); } If $str is "HI ALL" then the output is: Hi All If $str is unknown then the output is: <?php echo ucwords(strtolower($str)); ?> In this case it would be easier to just call the function (ie. <?php echo capitalize($str) ?> ), but the example before would allow you to precompile your PHP to make it more efficient

    Read the article

  • Passing data between ViewControllers versus doing local Fetch in each VC

    - by Tofrizer
    Hi All, I'm developing an iPhone app using Core Data and I'm looking for some general advice and recommendations on whether its acceptable to pass data between ViewControllers versus doing a local fetch in each ViewController as you navigate to it. Ordinarily I would say it all depends on various factors (e.g. performance etc) but the passing data approach is so prevalent in my app and I'm spooked by all the stories about Apple rejecting apps because of not conforming to their standard guidelines. So let me put another way -- is it non-standard to pass data between VC's? The reason I pass data so much is because each ViewController is just another view on to data present in my object model / graph. Once I have a handle on my first object in the first view controller (which I of course do have to fetch), I can use the existing object composition / relationships to drill down into the next level of detail into data and so I just pass these objects to the next VC. Separately, one possible downside with this passing-data-to-each-VC approach is I don't benefit from (what I perceive to be) the optimisation/benefits that NSFetchedResultsController provides in terms of efficient memory usage and section handling. My app is read-only but I do have one table with 5000 rows and I'm curious if I am missing out on NSFetchedResultsController benefits. Any thoughts on this as well? Can I somehow still benefit from NSFetchedResultsController goodness without having to do a full fetch (as I would have already passed in the data from my previous VC)? Thanks a lot.

    Read the article

  • Read from one large file and write to many (tens, hundreds, or thousands) files in Java?

    - by Rudiger
    I have a large-ish file (4-5 GB compressed) of small messages that I wish to parse into approximately 6,000 files by message type. Messages are small; anywhere from 5 to 50 bytes depending on the type. Each message starts with a fixed-size type field (a 6-byte key). If I read a message of type '000001', I want to write append its payload to 000001.dat, etc. The input file contains a mixture of messages; I want N homogeneous output files, where each output file contains only the messages of a given type. What's an efficient a fast way of writing these messages to so many individual files? I'd like to use as much memory and processing power to get it done as fast as possible. I can write compressed or uncompressed files to the disk. I'm thinking of using a hashmap with a message type key and an outputstream value, but I'm sure there's a better way to do it. Thanks!

    Read the article

  • tips for fixing bad coding/dev habits ?

    - by dfafa
    i want to become a better coder....so i have decided to sign up for computing science program...maybe a formal education can assist me. i started working on smaller projects to learn but currently i have really bad coding/dev habits which is hindering my productivity as the codebase increases.... i have highlighted them and perhaps someone could make suggestions (or redirect to resources) or a more efficient method. most stuff that i made in the past were web apps. i usually develop with putty + nano...i just love the minimalist feel i use winscp and develop directly on my private web server...too lazy to do it on localhost and upload it later. i dont use subversion control...which one do i need ? sometimes ctrl +z doesn't work well. when i run out of ideas for naming variable, i use swear words instead. i swear a lot when i get stuck....how to deal with anger issue ? my codes look ugly with comments everywhere. would rather use procedural coding finds "thinking" in OO difficult and time consuming i "write first think later". refactors code only if i am getting paid for it. dislikes configuring linux distro, Apache, MySQL, scaling, designing graphics and layouts. does not like writing tests likes working alone. does not like sharing codes. has an econ degree dislikes reading other people's code would rather write it on my own it seems my only true desire is to translate my ideas to a working prototype as fast as possible....it seems like i am very uninterested in the other details...could it be that i am not cut out to be a coder after all ? is going back to study comp sci a bad idea ?

    Read the article

  • Comparing all values within a List against each other

    - by Kave
    I am a bit stuck here and can't think further. public struct CandidateDetail { public int CellX { get; set; } public int CellY { get; set; } public int CellId { get; set; } } var dic = new Dictionary<int, List<CandidateDetail>>(); How can I compare each CandidateDetail item against other CandidateDetail items within the same dictionary in the most efficient way? Example: There are three keys for the dictionary: 5, 6 and 1. Therefore we have three entries. now each of these key entries would have a List associated with. In this case let say each of these three numbers has exactly two CandidateDetails items within the list associated to each key. This means in other words we have two 5, two 6 and two 1 in different or in the same cells. I would like to know: if[5].1stItem.CellId == [6].1stItem.CellId = we got a hit. That means we have a 5 and a 6 within the same Cell if[5].2ndItem.CellId == [6].2ndItem.CellId = perfect. We found out that the other 5 and 6 are together within a different cell. if[1].1stItem.CellId == ... Now I need to check the 1 also against the other 5 and 6 to see if the one exists within the previous same two cells or not. Could a Linq expression help perhaps? I am quite stuck here... I don't know...Maybe I am taking the wrong approach. I am trying to solve the "Hidden pair" of the game Sudoku. :) http://www.sudokusolver.eu/ExplainSolveMethodD.aspx Many Thanks, Kave

    Read the article

< Previous Page | 119 120 121 122 123 124 125 126 127 128 129 130  | Next Page >