Search Results

Search found 8219 results on 329 pages for 'less'.


  • Design Philosophy Question - When to create new functions

    - by Eclyps19
    This is a general design question, not related to any particular language. I'm a bit torn between going for minimum code or optimum organization. I'll use my current project as an example. I have a bunch of tabs on a form that perform different functions. Let's say Tab 1 reads in a file with a specific layout, tab 2 exports a file to a specific location, etc. The problem I'm running into now is that I need these tabs to do something slightly different based on the contents of a variable. If it contains a 1 I may need to use Layout A and perform some extra concatenation, if it contains a 2 I may need to use Layout B and do no concatenation but add two integer fields, etc. There could be 10+ codes that I will be looking at. Is it preferable to create an individual path for each code early on, or to attempt to create a single path that branches out only when absolutely required? Creating an individual path for each code would allow my code to be extremely easy to follow at a glance, which in turn will help me out later on down the road when debugging or making changes. The downside to this is that I will increase the amount of code written by calling some of the same functions in multiple places (for example, steps 3, 5, and 9 for every single code may be exactly the same). Creating a single path that would branch out only when required will be a bit messier and more difficult to follow at a glance, but I would create less code by placing conditionals only at steps that are unique. I realize that this may be a case-by-case decision, but in general, if you were handed a previously built program to work on, which would you prefer?
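
    One way to keep an individual, readable path per code without duplicating the shared steps is a dispatch table of step functions; a small sketch in Python, with the step names and code numbers invented purely for illustration:

      def read_layout_a(ctx): ...
      def read_layout_b(ctx): ...
      def concatenate_fields(ctx): ...
      def add_integer_fields(ctx): ...
      def validate(ctx): ...        # the shared steps (3, 5, 9, ...) are defined exactly once
      def export(ctx): ...

      PIPELINES = {
          1: [read_layout_a, concatenate_fields, validate, export],
          2: [read_layout_b, add_integer_fields, validate, export],
          # one readable line per code; adding a new code means adding a line, not a new branch
      }

      def run(code, ctx):
          for step in PIPELINES[code]:
              step(ctx)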

    Read the article

  • Commenting out portions of code in Scala

    - by akauppi
    I am looking for a C(++) #if 0 -like way of being able to comment out whole pieces of Scala source code, for keeping around experimental or expired code for a while. I tried out a couple of alternatives and would like to hear what you use, and if you have come up with something better? // Simply block-marking N lines by '//' is one way... // <tags> """ anything My editor makes this easy, but it's not really The Thing. It gets easily mixed with actual one-line comments. Then I figured there's native XML support, so: <!-- ... did not work --> Wrapping in XML works, unless you have <tags> within the block: class none { val a= <ignore> ... cannot have //<tags> <here> (not even in end-of-line comments!) </ignore> } The same for multi-line strings seems kind of best, but there's an awful lot of boilerplate (not fashionable in Scala) to please the compiler (less if you're doing this within a class or an object): object none { val ignore= """ This seems like ... <truly> <anything goes> but three "'s of course """ } The 'right' way to do this might be: /*** /* ... works but not properly syntax highlighted in SubEthaEdit (or StackOverflow) */ ***/ ...but that matches /* and */ only, not /*** to ***/ specifically. This means the comments within the block need to be balanced. And - the current Scala syntax highlighting mode for SubEthaEdit fails miserably on this. As a comparison, Lua has --[==[ matching ]==] and so forth. I think I'm spoilt? So - is there some useful trick I'm overlooking?

    Read the article

  • Sequential access to asynchronous sockets

    - by Lars A. Brekken
    I have a server that has several clients C1...Cn to each of which there is a TCP connection established. There are less than 10,000 clients. The message protocol is request/response based, where the server sends a request to a client and then the client sends a response. The server has several threads, T1...Tm, and each of these may send requests to any of the clients. I want to make sure that only one of these threads can send a request to a specific client at any one time, while the other threads wanting to send a request to the same client will have to wait. I do not want to block threads from sending requests to different clients at the same time. E.g. If T1 is sending a request to C3, another thread T2 should not be able to send anything to C3 until T1 has received its response. I was thinking of using a simple lock statement on the socket: lock (c3Socket) { // Send request to C3 // Get response from C3 } I am using asynchronous sockets, so I may have to use Monitor instead: Monitor.Enter(c3Socket); // Before calling .BeginReceive() And Monitor.Exit(c3Socket); // In .EndReceive I am worried about stuff going wrong and not letting go of the monitor and therefore blocking all access to a client. I'm thinking that my heartbeat thread could use Monitor.TryEnter() with a timeout and throw out sockets that it cannot get the monitor for. Would it make sense for me to make the Begin and End calls synchronous in order to be able to use the lock() statement? I know that I would be sacrificing concurrency for simplicity in this case, but it may be worth it. Am I overlooking anything here? Any input appreciated.
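
    The question is C#, but the "one in-flight request per client, many clients in parallel" idea can be sketched language-neutrally; here it is in Python, with one lock per client plus an acquire timeout as the escape hatch a heartbeat thread could use (the client ids and the send/receive callables are placeholders, not anything from the question):

      import threading

      client_locks = {}                     # client_id -> threading.Lock
      locks_guard = threading.Lock()        # protects the dict itself

      def lock_for(client_id):
          with locks_guard:
              return client_locks.setdefault(client_id, threading.Lock())

      def exchange(client_id, send, receive, timeout=30.0):
          lock = lock_for(client_id)
          if not lock.acquire(timeout=timeout):    # give up instead of blocking forever
              raise TimeoutError("conversation with %s appears stuck" % client_id)
          try:
              send()                               # request goes out to this client only
              return receive()                     # nobody else talks to it until this returns
          finally:
              lock.release()                       # released even if send/receive raises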

    Read the article

  • Exporting de-aggregated data

    - by Ben
    I'm currently working on a data export feature for a survey application. We are using SQL2k8. We store data in a normalized format: QuestionId, RespondentId, Answer. We have a couple of other tables that define what the question text is for the QuestionId and demographics for the RespondentId... Currently I'm using some dynamic SQL to generate a pivot that joins the question table to the answer table and creates an export, and it's working... The problem is that it seems slow and we don't have that much data (less than 50k respondents). Right now I'm thinking "why am I 'paying' to de-aggregate the data for each query? Why don't I cache that?" The data being exported is based on dynamic criteria. It could be "give me respondents that completed on x date (or range)" or "people that like blue", etc. Because of that, I think I have to cache at the respondent level, find out what respondents are being exported and then select their combined cached de-aggregated data. To me the quick and dirty fix is a totally flat table, RespondentId, Question1, Question2, etc. The problem is, we have multiple clients and that doesn't scale AND I don't want to have to maintain the flattened table as the survey changes. So I'm thinking about putting an XML column on the respondent table and caching the results of a SELECT * FROM Data WHERE RespondentId = x FOR XML AUTO. With that in place, I would then be able to get my export with filtering and XML calls into the XML column. What are you doing to export aggregated data in a flattened format (CSV, Excel, etc.)? Does this approach seem OK? I worry about the cost of XML functions on larger result sets (think SELECT RespondentId, XmlCol.value('//data/question_1', 'nvarchar(50)') AS [Why is there air?], XmlCol.RinseAndRepeat)... Is there a better technology/approach for this? Thanks!

    Read the article

  • Search engine recommendation for 100 sites of about 4000 pages

    - by fwkb
    I am looking for a search engine that can regularly (daily-ish) scan about 100 pages for changes and index an associated site if changes since the last scan are found. It should be able to handle about 100 sites, each averaging 4000 pages of about 5k average size, each on a different server (but only the one centralized search engine). Each of these sites will have a search form that gets submitted to this search engine. The results that are returned must be specific to the site that submitted them. I create the templates for the external sites, so I can give the search form a hidden field that specifies which site the form is submitted from. What would you recommend I look into? I would love to use a Python-based system for this, if feasible. I am currently using something called iSearch2. It doesn't seem very stable at this scale; the description of the product states it is not really intended for multiple sites; it is in PHP (which is less comfortable for me than Python); and it has a few other shortcomings for my specific situation.

    Read the article

  • Permuting a binary tree without the use of lists

    - by Banang
    I need to find an algorithm for generating every possible permutation of a binary tree, and need to do so without using lists (this is because the tree itself carries semantics and constraints that cannot be translated into lists). I've found an algorithm that works for trees with a height of three or less, but whenever I get to greater heights, I lose one set of possible permutations per added height. Each node carries information about its original state, so that one node can determine if all possible permutations have been tried for that node. Also, the node carries information on whether or not it has been 'swapped', i.e. if it has seen all possible permutations of its subtree. The tree is left-centered, meaning that the right node should always (except in some cases that I don't need to cover for this algorithm) be a leaf node, while the left node is always either a leaf or a branch. The algorithm I'm using at the moment can be described sort of like this: if the left child node has been swapped: swap my right node with the left child node's right node; set the left child node as 'unswapped'; if the current node is back to its original state: swap my right node with the lowest left node's right node; swap the lowest left node's two child nodes; set my left node as 'unswapped'; set my left child node to use this as its original state; set this node as swapped; return null; return this; else if the left child has not been swapped: if the result of trying to permute the left child is null, return the permutation of this node, else return the permutation of the left child node; if this node has a left node and a right node that are both leaves, swap them and set this node to be 'swapped'. The desired behaviour of the algorithm, writing each left-leaning tree as nested (left, right) pairs, would be something like this: (((0, 1), 2), 3), then (((1, 0), 2), 3) after the first swap, (((2, 0), 1), 3) after the second swap, (((0, 2), 1), 3) after the third swap, (((1, 2), 0), 3) after the fourth swap, and so on... Sorry for the ridiculously long and waffly explanation; I would really, really appreciate any sort of help you guys could offer me. Thanks a bunch!

    Read the article

  • Incorrect sizing of a JPanel in a JScrollPane In Java 1.5

    - by Coder
    Hi, I am making an image loading component which consists of a JPanel containing a JScrollPane, which in turn contains another JPanel. What this component does is allow images to be dropped on top of it, after which point the image is loaded and the innermost JPanel is set to the size of the image dropped. This in turn causes the scroll bars to show up and the user can scroll the image. This all works fine. The problem comes in when I try to auto-shrink the image to the maximum visible area in the outer JPanel. In this case I do a uniform scale of the image to be less than or equal to the width and height of the outer JPanel. What happens now is that both the horizontal and vertical scroll bars show up, indicating that the inner JPanel is bigger than the visible area (which should not be the case). I verified that the image is scaled to the proper dimensions (i.e. the maximum width and height are respected). I also verified that if I decrease the maximum height by 3 pixels, then no scroll bars appear. What I believe the problem is, is that panel.getWidth() and panel.getHeight() don't actually return the visible area (maximum area) that sub-components can take up. I.e. there is likely some more width and height taken up by the border around the JPanel or something like that. My question is, how do I get around this problem? Functionally all I want is to determine the maximum size a JPanel can be in a JScrollPane, then set the panel to that size and paint an image over top of it and be assured that the scroll bars of the scroll pane will not show up. Right now the scroll bars are set to AS_NEEDED. Thanks!

    Read the article

  • Agile methodologies. Are they a by-product of mind-control techniques such as NLP / Scientology?

    - by Bobb
    The more I read about contemporary methods combining scrum, TDD and XP, the more I feel like I have already seen these methods before. I am not arguing that the agile approach is much more progressive than older rigid structures like waterfall; what I am saying is that it seems to me that agile methodologies are ideal to be used as a nest for a brainwashing business. I read a few articles which kept referring to authors whom I checked afterwards, and they call themselves coaches and trainers (a usual thing when NLP specialists are involved) with no apparent software development history. Also I met a guy who is a scrum facilitator (a term widely used in relation to Scientology) in a high-profile company. I talked to him for less than 5 minutes but I got the complete feeling that he is either on drugs or has been programmed by a powerful NLP specialist. The way he talked and his body movements suggested he is not an average normal person (in terms of normal distribution :))... Please don't get me wrong. I am not a fan of conspiracy theories. But I had an experience with a member of the Church of Scientology who tried to infiltrate a commercial firm and actually got halfway to the very top in just 3 weeks. I saw his work. For now my overall impression is that psycho-manipulators are invading the IT industry through the convenient door of agile techniques. Anyone have the same feeling/thoughts?

    Read the article

  • Creating a C++ client app for an abstract Windows server - how to manage TCP connection speed to the server?

    - by Kabumbus
    So we have a server with some address, port and IP. We are developing that server, so we can implement on it whatever we need to help. What are the standard/best practices for data transfer speed management between a C++ Windows client app and the server (C++)? My main point is how to find out how much data can be uploaded/downloaded from/to the client over his slow network to my relatively super-fast server. (I need it to set up the bit rate of his live-stream audio/video.) My try at explaining number 3: We do not care how fast our server is. It is always faster than needed. We care about the client trying to stream his media out to our server. He streams encoded (via ffmpeg) live video data to our server. But he has, say, ADSL with 500 kb/s of outgoing traffic. Also he uses some ICQ or whatever, so he has less than 500 kb/s available. And he wants to stream live video! So we need to set up our ffmpeg to encode video with respect to the bit rate the user can provide. We develop the server side and the client side. We need a way of finding out how much the user can upload per second at the moment (so the value can change dynamically over time).
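
    One common practical approach is to measure the throughput actually achieved on the already-established connection and set the encoder bitrate some margin below it, re-measuring periodically. A rough Python sketch of the idea (the probe size and margin are arbitrary illustration values, and sendall() returning only proves the data left the local buffers, so treat the number as an optimistic upper bound):

      import time

      def estimate_upload_bps(sock, probe_bytes=256 * 1024):
          # Time how long it takes to push a probe through the connected socket.
          payload = b"\x00" * probe_bytes
          start = time.monotonic()
          sock.sendall(payload)
          elapsed = max(time.monotonic() - start, 1e-6)
          return probe_bytes * 8 / elapsed            # bits per second

      def pick_video_bitrate(sock, margin=0.7):
          # Leave headroom for ICQ and friends sharing the uplink.
          return int(estimate_upload_bps(sock) * margin)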

    Read the article

  • What makes an effective UI for displaying versioning of structured hierarchical data

    - by Fadrian Sudaman
    Traditional version control systems display versioning information by grouping Projects-Folders-Files with a tree view on the left and a details view on the right; you then click on each item to look at the revision history for that configuration item. Assuming that I have all the historical versioning information available for a project from an object-oriented model perspective (e.g. classes - methods - parameters etc.), what do you think will be the most effective way to present such information in a UI so that you can easily navigate and access the snapshot view of the project and also the historical versioning information? Put yourself in the position that you are using a tool like this every day in your job, like you are currently using SVN, SS, Perforce or any VCS system: what will contribute to the usability, productivity and effectiveness of the tool? I personally find the classical way of displaying folders and files, like the above, very restrictive and less effective for displaying deep nested logical models. Assuming that this is a greenfield project and not restricted by a specific technology, how do you think I should best approach this? I am looking for ideas and input here to add value to my research project. Feel free to make any suggestions that you think are valuable. Thanks again to anyone who shares their thoughts.

    Read the article

  • Javascript to fire an event when a key is pressed on the Ajax Toolkit Combo box

    - by Paul Chapman
    I have the following drop down list which is using the Ajax Toolkit to provide a combo box <cc1:ComboBox ID="txtDrug" runat="server" style="font-size:8pt; width:267px;" Font-Size="8pt" DropDownStyle="DropDownList" AutoCompleteMode="SuggestAppend" AutoPostBack="True" ontextchanged="txtDrug_TextChanged" /> Now I need to load this up with approx 7,000 records, which takes considerable time and affects the response times when the page is posted back and forth. The code which loads these records is as follows: dtDrugs = wsHelper.spGetAllDrugs(); txtDrug.DataValueField = "pkDrugsID"; txtDrug.DataTextField = "drugName"; txtDrug.DataSource = dtDrugs; txtDrug.DataBind(); However, if I could get an event to fire when a letter is typed, then instead of having to load 7,000 records the list would be reduced to fewer than 50 in most instances. I think this can be done in Javascript. So the question is how can I get an event to fire such that when the form starts there is nothing in the drop down, but as soon as a key is pressed it searches for those records starting with that letter. The .Net side of things I'm sure about - it is the Javascript I'm not. Thanks in advance

    Read the article

  • What should I do to accommodate large-scale data storage and retrieval?

    - by kailashbuki
    There are two columns in the table inside the MySQL database. The first column contains the fingerprint while the second one contains the list of documents which have that fingerprint. It's much like an inverted index built by search engines. An instance of a record inside the table is shown below; 34 "doc1, doc2, doc45" The number of fingerprints is very large (it can range up to trillions). There are basically the following operations in the database: inserting/updating the record & retrieving the record according to the match in fingerprint. The table definition python snippet is: self.cursor.execute("CREATE TABLE IF NOT EXISTS `fingerprint` (fp BIGINT, documents TEXT)") And the snippet for the insert/update operation is: if self.cursor.execute("UPDATE `fingerprint` SET documents=CONCAT(documents,%s) WHERE fp=%s",(","+newDocId, thisFP))== 0L: self.cursor.execute("INSERT INTO `fingerprint` VALUES (%s, %s)", (thisFP,newDocId)) The only bottleneck I have observed so far is the query time in MySQL. My whole application is web based, so time is a critical factor. I have also thought of using Cassandra but have less knowledge of it. Please suggest a better way to tackle this problem.
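
    If a PRIMARY KEY (or UNIQUE index) is added on fp, the UPDATE-then-INSERT pair can be collapsed into one round trip per record using MySQL's INSERT ... ON DUPLICATE KEY UPDATE; the index itself is likely the bigger win, since without it each UPDATE's WHERE fp=%s is a full table scan. A sketch in the same DB-API cursor style as the snippets above:

      def upsert_fingerprint(cursor, fp, doc_id):
          # Requires a unique index on fp, e.g.:
          #   ALTER TABLE `fingerprint` ADD PRIMARY KEY (fp);
          cursor.execute(
              "INSERT INTO `fingerprint` (fp, documents) VALUES (%s, %s) "
              "ON DUPLICATE KEY UPDATE documents = CONCAT(documents, %s)",
              (fp, doc_id, "," + doc_id),
          )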

    Read the article

  • MySQL get point in time totals from related tables

    - by batfastad
    Hi everyone. We have an order book and invoicing system and I've been tasked with trying to output monthly rolling totals from these tables, but I don't really know where to start with this. I think there's some SQL syntax that I don't even know about yet. I'm familiar with INNER/LEFT JOINs and GROUP BY etc., but grouping by date is confusing since I don't know how to limit the data to only the current date that's being grouped on at that point. I think this will involve joining the tables to themselves or possibly a sub-select. I always thought it best to avoid sub-selects apart from when absolutely necessary. Basically the system has 3 tables: orders (order_id, currency, order_stamp), orders_lines (order_line_id, invoice_id, order_id, price), invoices (invoice_id, invoice_stamp). order_stamp and invoice_stamp are UTC unix timestamps stored as integers, not MySQL timestamps. I'm trying to get a listing by year/month showing the total of current unbilled orders (sum of price) at that point in time. Current orders are ones where order_stamp is less than or equal to 00:00 on the 1st of the month. Unbilled orders are ones where invoice_stamp is null or invoice_stamp is greater than 00:00 on the 1st of the month. At that point in time there may not be a related invoice yet and invoice_id might be null. Anyone got any suggestions on what I should join to what and what I need to group by? Cheers, B
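
    One straightforward (if not single-query) way to get the point-in-time figures is to compute the month-start timestamps in application code and run one aggregate per boundary. A sketch assuming a Python DB-API cursor; the table and column names come from the description above, while the per-month loop and the grouping by currency are illustrative choices:

      from datetime import datetime, timezone

      UNBILLED_SQL = """
          SELECT o.currency, SUM(ol.price)
          FROM orders o
          JOIN orders_lines ol ON ol.order_id = o.order_id
          LEFT JOIN invoices i ON i.invoice_id = ol.invoice_id
          WHERE o.order_stamp <= %s
            AND (i.invoice_id IS NULL OR i.invoice_stamp > %s)
          GROUP BY o.currency
      """

      def month_starts(year_from, year_to):
          # UTC unix timestamp for 00:00 on the 1st of each month in the range
          for year in range(year_from, year_to + 1):
              for month in range(1, 13):
                  yield int(datetime(year, month, 1, tzinfo=timezone.utc).timestamp())

      def unbilled_by_month(cursor, year_from, year_to):
          for stamp in month_starts(year_from, year_to):
              cursor.execute(UNBILLED_SQL, (stamp, stamp))
              yield stamp, cursor.fetchall()   # ordered but not yet invoiced as of that instant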

    Read the article

  • Significance in R

    - by Gemsie
    Ok, this is quite hard to explain, but I'm at a complete loss as to what to do. I'm a relative newcomer to R and although I can completely admire how powerful it is, I'm not too good at actually using it.... Basically, I have some very contrived data that I need to analyse (it wasn't me who chose this, I can assure you!). I have the right and left hand lengths of lots of people, as well as some numeric data that shows their sociability. Now I would like to know if people who have significantly different lengths of hand are more or less sociable than those who have the same (leading into the research that 'symmetrical' people are more sociable and intelligent, etc.). I have got as far as loading the data into R, then I have no idea where to go from there. How on Earth do I start to separate those who are close to symmetrical from those who aren't, so I can then start the analysis? Ok, using Sasha's great advice, I did the cor.test and got the following: Pearson's product-moment correlation data: measurements$l.hand - measurements$r.hand and measurements$sociable t = 0.2148, df = 150, p-value = 0.8302 alternative hypothesis: true correlation is not equal to 0 95 percent confidence interval: -0.1420623 0.1762437 sample estimates: cor 0.01753501 I have never used this test before, so am unsure how to interpret it... you wouldn't think I was on my fourth scientific degree, would you?! :(

    Read the article

  • Java Executor: Small tasks or big ones?

    - by Arash Shahkar
    Consider one big task which could be broken into hundreds of small, independently runnable tasks. To be more specific, each small task is to send a light network request and decide upon the answer received from the server. These small tasks are not expected to take longer than a second, and involve a few servers in total. I have in mind two approaches to implementing this using the Executor framework, and I want to know which one's better and why. 1) Create a few tasks, say 5 to 10, each doing a bunch of sends and receives. 2) Create a single task (Callable or Runnable) for each send & receive and schedule all of them (hundreds) to be run by the executor. I'm sorry if my question shows that I'm too lazy to test these and see for myself what's better (at least performance-wise). My question, while looking for an answer to this specific case, has a more general aspect. In situations like these, when you want to use an executor to do all the scheduling and other stuff, is it better to create lots of small tasks or to group them into a smaller number of bigger tasks?
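
    The same trade-off can be sketched with Python's concurrent.futures as a deliberate stand-in for java.util.concurrent, since the principle is identical: one task per request/response keeps the code simple, and the pool size, not the task count, is what bounds concurrency. The server names and the request function below are invented for the demo:

      from concurrent.futures import ThreadPoolExecutor, as_completed
      import random, time

      def do_request(server, payload):
          # Stand-in for one light network round trip (well under a second).
          time.sleep(random.uniform(0.05, 0.2))
          return server, payload, "ok"

      requests = [("server-%d" % (i % 5), i) for i in range(300)]     # hundreds of small tasks

      with ThreadPoolExecutor(max_workers=20) as pool:                # at most 20 in flight
          futures = [pool.submit(do_request, s, p) for s, p in requests]
          results = [f.result() for f in as_completed(futures)]

      print(len(results), "responses")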

    Read the article

  • Flex profiling - what is [enterFrameEvent] doing?

    - by Herms
    I've been tasked with finding (and potentially fixing) some serious performance problems with a Flex application that was delivered to us. The application will consistently take up 50 to 100% of the CPU at times when it is simply idling and shouldn't be doing anything. My first step was to run the profiler that comes with FlexBuilder. I expected to find some method that was taking up most of the time, showing me where the bottleneck was. However, I got something unexpected. The top 4 methods were: [enterFrameEvent] - 84% cumulative, 32% self time [reap] - 20% cumulative and self time [tincan] - 8% cumulative and self time global.isNaN - 4% cumulative and self time All other methods had less than 1% for both cumulative and self time. From what I've found online, the [bracketed methods] are what the profiler lists when it doesn't have an actual Flex method to show. I saw someone claim that [tincan] is the processing of RTMP requests, and I assume [reap] is the garbage collector. Does anyone know what [enterFrameEvent] is actually doing? I assume it's essentially the "main" function for the event loop, so the high cumulative time is expected. But why is the self time so high? What's actually going on? I didn't expect the player internals to be taking up so much time, especially since nothing is actually happening in the app (and there are no UI updates going on). Is there any good way to dig into what's happening? I know something is going on that shouldn't be (it looks like there must be some kind of busy wait or other runaway loop), but the profiler isn't giving me any results that I was expecting. My next step is going to be to start adding debug trace statements in various places to try and track down what's actually happening, but I feel like there has to be a better way.

    Read the article

  • What is meant by normalization in huge pointers?

    - by wrapperm
    Hi, I am quite confused about the difference between a "far" pointer and a "huge" pointer; I have searched all over Google for a solution and could not find one. Can anyone explain the difference between the two? Also, what is the exact normalization concept related to huge pointers? Please do not give me the following or any similar answers: "The only difference between a far pointer and a huge pointer is that a huge pointer is normalized by the compiler. A normalized pointer is one that has as much of the address as possible in the segment, meaning that the offset is never larger than 15. A huge pointer is normalized only when pointer arithmetic is performed on it. It is not normalized when an assignment is made. You can cause it to be normalized without changing the value by incrementing and then decrementing it. The offset must be less than 16 because the segment can represent any value greater than or equal to 16 (e.g. Absolute address 0x17 in a normalized form would be 0001:0001. While a far pointer could address the absolute address 0x17 with 0000:0017, this is not a valid huge (normalized) pointer because the offset is greater than 0000F.). Huge pointers can also be incremented and decremented using arithmetic operators, but since they are normalized they will not wrap like far pointers." Here the normalization concept is not very well explained, or maybe I'm unable to understand it very well. Can anyone try explaining this concept from a beginner's point of view? Thanks, Rahamath
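
    Since the confusion is mostly about what "normalized" means numerically, here is the segment:offset arithmetic written out as a small Python sketch (real-mode 16-byte paragraphs assumed; this is only the address math, not actual pointer code):

      def normalize(segment, offset):
          # Real-mode address arithmetic: linear = segment * 16 + offset.
          # Normalizing moves as much of the address as possible into the
          # segment, leaving an offset in the range 0..15.
          return segment + offset // 16, offset % 16

      def linear(segment, offset):
          return segment * 16 + offset

      seg, off = normalize(0x0000, 0x0017)
      print(hex(seg), hex(off))                          # 0x1 0x7 -> normalized form 0001:0007
      assert linear(0x0000, 0x0017) == linear(seg, off)  # same physical address either way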

    Read the article

  • Java number exceeds Long.MAX_VALUE - how to detect?

    - by jurchiks
    I'm having problems detecting if a sum/multiplication of two numbers exceeds the maximum value of a long integer. Example code: long a = 2 * Long.MAX_VALUE; System.out.println("long.max * smth > long.max... or is it? a=" + a); This gives me -2, while I would expect it to throw a NumberFormatException... Is there a simple way of making this work? Because I have some code that does multiplications in nested IF blocks or additions in a loop and I would hate to add more IFs to each IF or inside the loop. Edit: oh well, it seems that this answer from another question is the most appropriate for what I need: http://stackoverflow.com/a/9057367/540394 I don't want to do boxing/unboxing as it adds unnecessary overhead, and this way is very short, which is a huge plus to me. I'll just write two short functions to do these checks and return the min or max long. Edit2: here's the function for limiting a long to its min/max value according to the answer I linked to above: /** * @param a : one of the two numbers added/multiplied * @param b : the other of the two numbers * @param c : the result of the addition/multiplication * @return the minimum or maximum value of a long integer if addition/multiplication of a and b is less than Long.MIN_VALUE or more than Long.MAX_VALUE */ public static long limitLong(long a, long b, long c) { return (((a > 0) && (b > 0) && (c <= 0)) ? Long.MAX_VALUE : (((a < 0) && (b < 0) && (c >= 0)) ? Long.MIN_VALUE : c)); } Tell me if you think this is wrong.
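
    For comparison, the clamping logic is easier to see in a language whose integers don't wrap; a Python sketch of the same check, purely illustrative (in Java 8 and later, Math.addExact and Math.multiplyExact throw ArithmeticException on overflow, which covers the "make it throw" wish directly):

      INT64_MIN, INT64_MAX = -(1 << 63), (1 << 63) - 1

      def clamped_mul(a, b):
          # Python ints are arbitrary precision, so a * b is exact;
          # clamp to the signed 64-bit range, as limitLong() intends.
          c = a * b
          if c > INT64_MAX:
              return INT64_MAX
          if c < INT64_MIN:
              return INT64_MIN
          return c

      print(clamped_mul(2, INT64_MAX))    # prints INT64_MAX instead of the wrapped value -2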

    Read the article

  • How to maintain a persistent network connection between two applications over a network?

    - by John
    I was recently approached by my management with an interesting problem - where I am pretty sure I am telling my bosses the correct information, but I really want to make sure. I am being asked to develop some software that has this function: an application at one location is constantly processing real-time data every second and only generates data if the underlying data has changed in any way; in the event that the data has changed, it sends the results to another box over a network; and it maintains a persistent connection between both machines, alerting the remote box if for some reason the network connection goes down. From what I understand, I imagine that I need to do some reading on doing some sort of TCP/IP socket-level stuff. That way if the connection is dropped, the remote location will be aware that the data it has received may be stale. However, management seems to be very convinced that this can be accomplished using SOAP. I was under the impression that SOAP is more or less a way for a client to initiate a procedure from a server and get some results via the HTTP protocol. Am I wrong in assuming this? I haven't been able to find much information on how SOAP might be able to solve a problem like this. I feel like a lot of people around my office are using SOAP as a buzzword and that has generated a bit of confusion over what SOAP actually is - and is capable of. Any thoughts on how to accomplish this task would be appreciated!

    Read the article

  • How to get the OCI lib to work on a Red Hat machine with ROracle?

    - by Matt Bannert
    I need to get the OCI lib working on my RHEL 6.3 machine and I am experiencing some trouble with OCI header files that can't be found. I have installed (using yum install) oracle-instantclient11.2-basic-11.2.0.3.0-1.x86_64.rpm because, according to this official page, it's all I need to run OCI. To test the whole thing in general I've installed sqlplus64, which worked after I set export LD_LIBRARY_PATH=/usr/lib/oracle/11.2/client64/lib. Unfortunately the header files couldn't be found after setting LD_LIBRARY_PATH. Actually I am not surprised, because there is no include directory in any of these oracle paths. So the question is: Where do I get these missing header files from? Are they actually already there and I just can't find them? Btw: I am doing this whole exercise because I want to use ROracle on my RStudio server and this R package depends on the OCI library. Once I am back in R territory the road gets much less bumpy for me. EDIT: this documentation helped me a little further. However, I guess I found some header files now in: "/usr/include/oracle/11.2/client64". But which variable do I have to set to this location?

    Read the article

  • Apache console accesses network drives, service does not?

    - by danspants
    I have an apache 2.2 server running Django. We have a network drive T: which we need constant access to within our Django app. When running Apache as a service, we cannot access this drive, as far as any django code is concerned the drive does not exist. If I add... <Directory "t:/"> Options Indexes FollowSymLinks MultiViews AllowOverride None Order allow,deny allow from all </Directory> to the httpd.conf file the service no longer runs, but I can start apache as a console and it works fine, Django can find the network drive and all is well. Why is there a difference between the console and the service? Should there be a difference? I have the service using my own log on so in theory it should have the same access as I do. I'm keen to keep it running as a service as it's far less obtrusive when I'm working on the server (unless there's a way to hide the console?). Any help would be most appreciated.

    Read the article

  • Improve Efficiency in Array comparison in Ruby

    - by user2985025
    Hi, I am working on Ruby/Cucumber and have a requirement to develop a comparison module/program to compare two files. Below are the requirements: The project is a migration project. Data from one application is moved to another. We need to compare the data from the existing application against the new one. Solution: I have developed a comparison engine in Ruby for the above requirement. a) Get the data, de-duplicated and sorted, from both DBs b) Put the data in a text file with "||" as delimiter c) Use the key columns (number) that provide a unique record in the DB to compare the two files. For example, file1 has 1,2,3,4,5,6 and file2 has 1,2,3,4,5,7 and columns 1,2,3,4,5 are key columns. I use these key columns and compare 6 and 7, which results in a fail. Issue: The major issue we are facing here is that if the mismatches are more than 70% for 100,000 records or more, the comparison time is large. If the mismatches are less than 40% then the comparison time is OK. Diff and Diff-LCS will not work in this case because we need key columns to arrive at an accurate data comparison between the two applications. Is there any other method to efficiently reduce the time when the mismatches are more than 70% for 100,000 records or more? Thanks
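
    Whatever the language, the quadratic part usually disappears if one file is loaded into a hash keyed on the key columns, so each record in the other file is matched in constant time regardless of the mismatch rate. A sketch of the idea in Python (a Ruby version would have the same shape using a Hash; the "||" delimiter and five key columns are taken from the description above):

      def load(path, key_cols):
          records = {}
          with open(path) as f:
              for line in f:
                  fields = line.rstrip("\n").split("||")
                  key = tuple(fields[i] for i in key_cols)
                  records[key] = fields
          return records

      def compare(old_path, new_path, key_cols=(0, 1, 2, 3, 4)):
          old, new = load(old_path, key_cols), load(new_path, key_cols)
          mismatches = [(k, old[k], new[k]) for k in old.keys() & new.keys() if old[k] != new[k]]
          missing = list(old.keys() - new.keys())    # in the old system only
          added = list(new.keys() - old.keys())      # in the new system only
          return mismatches, missing, added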

    Read the article

  • Is it better to use a relational database or document-based database for an app like Wufoo?

    - by mboyle
    I'm working on an application that's similar to Wufoo in that it allows our users to create their own databases and collect/present records with auto generated forms and views. Since every user is creating a different schema (one user might have a database of their baseball card collection, another might have a database of their recipes) our current approach is using MySQL to create separate databases for every user with its own tables. So in other words, the databases our MySQL server contains look like: main-web-app-db (our web app containing tables for users account info, billing, etc) user_1_db (baseball_cards_table) user_2_db (recipes_table) .... And so on. If a user wants to set up a new database to keep track of their DVD collection, we'd do a "create database ..." with "create table ...". If they enter some data in and then decide they want to change a column we'd do an "alter table ....". Now, the further along I get with building this out the more it seems like MySQL is poorly suited to handling this. 1) My first concern is that switching databases every request, first to our main app's database for authentication etc, and then to the user's personal database, is going to be inefficient. 2) The second concern I have is that there's going to be a limit to the number of databases a single MySQL server can host. Pretending for a moment this application had 500,000 user databases, is MySQL designed to operate this way? What if it were a million, or more? 3) Lastly, is this method going to be a nightmare to support and scale? I've never heard of MySQL being used in this way so I do worry about how this affects things like replication and other methods of scaling. To me, it seems like MySQL wasn't built to be used in this way but what do I know. I've been looking at document-based databases like MongoDB, CouchDB, and Redis as alternatives because it seems like a schema-less approach to this particular problem makes a lot of sense. Can anyone offer some advice on this?
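
    A common alternative to one-database-per-user is a single shared schema in which the user's "database" is ordinary row data rather than DDL: one table describes each user-defined form, one describes its fields, and a tall narrow table holds the values. A minimal sketch of that layout (the table and column names are invented; SQLite is used here only so the snippet runs stand-alone, the real thing would be MySQL):

      import sqlite3   # stand-in for MySQL; only the layout matters here

      ddl = """
      CREATE TABLE forms        (form_id INTEGER PRIMARY KEY, owner_id INTEGER, name TEXT);
      CREATE TABLE form_fields  (field_id INTEGER PRIMARY KEY, form_id INTEGER, name TEXT, datatype TEXT);
      CREATE TABLE entries      (entry_id INTEGER PRIMARY KEY, form_id INTEGER, created_at TEXT);
      CREATE TABLE entry_values (entry_id INTEGER, field_id INTEGER, value TEXT,
                                 PRIMARY KEY (entry_id, field_id));
      """

      conn = sqlite3.connect(":memory:")
      conn.executescript(ddl)
      # Adding a "column" to a user's collection is now an INSERT into form_fields,
      # not an ALTER TABLE, and every tenant lives in the same four tables.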

    Read the article

  • How can I cluster short messages [Tweets] based on topic? [Topic-Based Clustering]

    - by Jagira
    Hello, I am planning an application which will make clusters of short messages/tweets based on topics. The number of topics will be limited, like Sports [NBA, NFL, Cricket, Soccer], Entertainment [movies, music] and so on... I can think of two approaches to this: 1) Ask users to tag messages, like Stack Overflow does for questions. Users can select tags from a predefined list of tags. Then on the server side I will cluster them based on the tags. Pros: Simple design. Less complexity in code. Cons: Choices for users will be restricted. Clusters will not be dynamic. If a new event occurs, the predefined tags will miss it. 2) Take the message, delete the stopwords [predefined in a dictionary] and apply some clustering algorithm to make a cluster and, depending on its popularity, display the cluster. The cluster will be maintained according to its sustained popularity. New messages will be skimmed and assigned to corresponding clusters. Pros: Dynamic clustering based on the popularity of the event/accident. Cons: Increased complexity. More server resources required. I would like to know whether there are any other approaches to this problem, or whether there are any ways of improving the above-mentioned methods. Also, please suggest some good clustering algorithms. I think the "K-Nearest Clustering" algorithm is apt for this situation.
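
    For the second approach, the "delete stopwords and cluster" pipeline maps almost directly onto TF-IDF vectors plus k-means; a toy sketch using scikit-learn, which is an assumption on my part rather than anything named in the question (the sample tweets and the choice of k are made up for the demo):

      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.cluster import KMeans

      tweets = [
          "NBA finals tonight, huge game",
          "New movie trailer just dropped",
          "Cricket world cup squad announced",
          "That soundtrack is amazing, great music",
      ]

      # Stop-word removal + TF-IDF weighting replaces the hand-made dictionary.
      vectorizer = TfidfVectorizer(stop_words="english")
      X = vectorizer.fit_transform(tweets)

      # k is the number of topic clusters you are willing to maintain at once.
      kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
      labels = kmeans.fit_predict(X)
      print(list(zip(labels, tweets)))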

    Read the article

  • How to speed up a Python nested loop?

    - by erich
    I'm performing a nested loop in python that is included below. This serves as a basic way of searching through existing financial time series and looking for periods in the time series that match certain characteristics. In this case there are two separate, equally sized, arrays representing the 'close' (i.e. the price of an asset) and the 'volume' (i.e. the amount of the asset that was exchanged over the period). For each period in time I would like to look forward at all future intervals with lengths between 1 and INTERVAL_LENGTH and see if any of those intervals have characteristics that match my search (in this case the ratio of the close values is greater than 1.0001 and less than 1.5 and the summed volume is greater than 100). My understanding is that one of the major reasons for the speedup when using NumPy is that the interpreter doesn't need to type-check the operands each time it evaluates something so long as you're operating on the array as a whole (e.g. numpy_array * 2), but obviously the code below is not taking advantage of that. Is there a way to replace the internal loop with some kind of window function which could result in a speedup, or any other way using numpy/scipy to speed this up substantially in native python? Alternatively, is there a better way to do this in general (e.g. will it be much faster to write this loop in C++ and use weave)? ARRAY_LENGTH = 500000 INTERVAL_LENGTH = 15 close = np.array( xrange(ARRAY_LENGTH) ) volume = np.array( xrange(ARRAY_LENGTH) ) close, volume = close.astype('float64'), volume.astype('float64') results = [] for i in xrange(len(close) - INTERVAL_LENGTH): for j in xrange(i+1, i+INTERVAL_LENGTH): ret = close[j] / close[i] vol = sum( volume[i+1:j+1] ) if ret > 1.0001 and ret < 1.5 and vol > 100: results.append( [i, j, ret, vol] ) print results
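
    The inner loop over j is the part NumPy can absorb: for a fixed i, all the ratios are one vectorized division and all the interval sums fall out of a single cumulative-sum array. A sketch of that rewrite (the test arrays are shifted by one so close[0] is not zero; only the inner loop changes, the outer loop stays in Python and the results list has the same layout as before):

      import numpy as np

      ARRAY_LENGTH = 500000
      INTERVAL_LENGTH = 15

      close = np.arange(1, ARRAY_LENGTH + 1, dtype='float64')   # start at 1 to avoid dividing by zero
      volume = np.arange(ARRAY_LENGTH, dtype='float64')
      cum_vol = np.concatenate(([0.0], np.cumsum(volume)))      # cum_vol[k] == volume[:k].sum()

      results = []
      for i in range(len(close) - INTERVAL_LENGTH):
          j = np.arange(i + 1, i + INTERVAL_LENGTH)              # all candidate end points at once
          ret = close[j] / close[i]
          vol = cum_vol[j + 1] - cum_vol[i + 1]                  # == volume[i+1:j+1].sum() for each j
          hits = np.nonzero((ret > 1.0001) & (ret < 1.5) & (vol > 100))[0]
          results.extend([i, int(j[h]), float(ret[h]), float(vol[h])] for h in hits)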

    Read the article
