Search Results

Search found 9017 results on 361 pages for 'efficient storage'.

Page 321/361 | < Previous Page | 317 318 319 320 321 322 323 324 325 326 327 328  | Next Page >

  • Read/Write/Find/Replace huge csv file

    - by notapipe
    I have a huge (4,5 GB) csv file.. I need to perform basic cut and paste, replace operations for some columns.. the data is pretty well organized.. the only problem is I cannot play with it with Excel because of the size (2000 rows, 550000 columns). here is some part of the data: ID,Affection,Sex,DRB1_1,DRB1_2,SENum,SEStatus,AntiCCP,RFUW,rs3094315,rs12562034,rs3934834,rs9442372,rs3737728 D0024949,0,F,0101,0401,SS,yes,?,?,A_A,A_A,G_G,G_G D0024302,0,F,0101,7,SN,yes,?,?,A_A,G_G,A_G,?_? D0023151,0,F,0101,11,SN,yes,?,?,A_A,G_G,G_G,G_G I need to remove 4th, 5th, 6th, 7th, 8th and 9th columns; I need to find every _ character from column 10 onwards and replace it with a space ( ) character; I need to replace every ? with zero (0); I need to replace every comma with a tab; I need to remove first row (that has column names; I need to replace every 0 with 1, every 1 with 2 and every ? with 0 in 2nd column; I need to replace F with 2, M with 1 and ? with 0 in 3rd column; so that in the resulting file the output reads: D0024949 1 2 A A A A G G G G D0024302 1 2 A A G G A G 0 0 D0023151 1 2 A A G G G G G G (both input and output should read one line per row, ne extra blank row) Is there a memory efficient way of doing that with java(and I need a code to do that) or a usable tool for playing with this large data so that I can easily apply Excel functionality..

    Read the article

  • design an extendable database model

    - by wishi_
    Hi! Currently I'm doing a project whose specifications are unclear - well who doesn't. I wonder what's the best development strategy to design a DB, that's going to be extended sooner or later with additional tables and relations. I want to include "changeability". My main concern is that I want to apply design patterns (it's a university project) and I want to separate the constant factors from those, that change by choosing appropriate design patterns - in my case MVC and a set of sub-patterns at model level. When it comes to the DB however, I may have to resdesign my model in my MVC approach, because my domain model at a later stage my require a different set of classes representing the DB tables. I use Hibernate as an abstraction layer between DB and application. Would you start with a very minimal DB, just a few tables and relations? And what if I want an efficient DB, too? I wonder what strategies are applied in the real world. Stakeholder analysis for example isn't a sufficient planing solution when it comes to changing requirements. I think - at a DB level - my design pattern ends. So there's breach whose impact I'd like to minimize with a smart strategy.

    Read the article

  • A good data model for finding a user's favorite stories

    - by wings
    Original Design Here's how I originally had my Models set up: class UserData(db.Model): user = db.UserProperty() favorites = db.ListProperty(db.Key) # list of story keys # ... class Story(db.Model): title = db.StringProperty() # ... On every page that displayed a story I would query UserData for the current user: user_data = UserData.all().filter('user =' users.get_current_user()).get() story_is_favorited = (story in user_data.favorites) New Design After watching this talk: Google I/O 2009 - Scalable, Complex Apps on App Engine, I wondered if I could set things up more efficiently. class FavoriteIndex(db.Model): favorited_by = db.StringListProperty() The Story Model is the same, but I got rid of the UserData Model. Each instance of the new FavoriteIndex Model has a Story instance as a parent. And each FavoriteIndex stores a list of user id's in it's favorited_by property. If I want to find all of the stories that have been favorited by a certain user: index_keys = FavoriteIndex.all(keys_only=True).filter('favorited_by =', users.get_current_user().user_id()) story_keys = [k.parent() for k in index_keys] stories = db.get(story_keys) This approach avoids the serialization/deserialization that's otherwise associated with the ListProperty. Efficiency vs Simplicity I'm not sure how efficient the new design is, especially after a user decides to favorite 300 stories, but here's why I like it: A favorited story is associated with a user, not with her user data On a page where I display a story, it's pretty easy to ask the story if it's been favorited (without calling up a separate entity filled with user data). fav_index = FavoriteIndex.all().ancestor(story).get() fav_of_current_user = users.get_current_user().user_id() in fav_index.favorited_by It's also easy to get a list of all the users who have favorited a story (using the method in #2) Is there an easier way? Please help. How is this kind of thing normally done?

    Read the article

  • Negative number representation across multiple architechture

    - by Donotalo
    I'm working with OKI 431 micro controller. It can communicate with PC with appropriate software installed. An EEPROM is connected in the I2C bus of the micro which works as permanent memory. The PC software can read from and write to this EEPROM. Consider two numbers, B and C, each is two byte integer. B is known to both the PC software and the micro and is a constant. C will be a number so close to B such that B-C will fit in a signed 8 bit integer. After some testing, appropriate value for C will be determined by PC and will be stored into the EEPROM of the micro for later use. Now the micro can store C in two ways: The micro can store whole two byte representing C The micro can store B-C as one byte signed integer, and can later derive C from B and B-C I think that two's complement representation of negative number is now universally accepted by hardware manufacturers. Still I personally don't like negative numbers to be stored in a storage medium which will be accessed by two different architectures because negative number can be represented in different ways. For you information, 431 also uses two's complement. Should I get rid of the headache that negative number can be represented in different ways and accept the one byte solution as my other team member suggested? Or should I stick to the decision of the two byte solution because I don't need to deal with negative numbers? Which one would you prefer and why?

    Read the article

  • MS SQL - Multi-Column substring matching

    - by hamlin11
    One of my clients is hooked on multi-column substring matching. I understand that Contains and FreeText search for words (and at least in the case of Contains, word prefixes). However, based upon my understanding of this MSDN book, neither of these nor their variants are capable of searching substrings. I have used LIKE rather extensively (Select * from A where A.B Like '%substr%') Sample table A: ID | Col1 | Col2 | Col3 | ------------------------------------- 1 | oklahoma | colorado | Utah | 2 | arkansas | colorado | oklahoma | 3 | florida | michigan | florida | ------------------------------------- The following code will give us row 1 and row 2: select * from A where Col1 like '%klah%' or Col2 like '%klah%' or Col3 like '%klah%' This is rather ugly, probably slow, and I just don't like it very much. Probably because the implementations that I'm dealing with have 10+ columns that need searched. The following may be a slight improvement as code readability goes, but as far as performance, we're still in the same ball park. select * from A where (Col1 + ' ' + Col2 + ' ' + Col3) like '%klah%' I have thought about simply adding insert, update, and delete triggers that simply add the concatenated version of the above columns into a separate table that shadows this table. Sample Shadow_Table: ID | searchtext | --------------------------------- 1 | oklahoma colorado Utah | 2 | arkansas colorado oklahoma | 3 | florida michigan florida | --------------------------------- This would allow us to perform the following query to search for '%klah%' select * from Shadow_Table where searchtext like '%klah%' I really don't like having to remember that this shadow table exists and that I'm supposed to use it when I am performing multi-column substring matching, but it probably yields pretty quick reads at the expense of write and storage space. My gut feeling tells me there there is an existing solution built into SQL Server 2008. However, I don't seem to be able to find anything other than research papers on the subject. Any help would be appreciated.

    Read the article

  • Titanium TableViewRow classname with custom rows

    - by pancake
    I would like to know in what way the 'className' property of a Ti.UI.TableViewRow helps when creating custom rows. For example, I populate a tableview with custom rows in the following way: function populateTableView(tableView, data) { var rows = []; var row; var title, image; var i; for (i = 0; i < data.length; i++) { title = Ti.UI.createLabel({ text : data[i].title, width : 100, height: 30, top: 5, left: 25 }); image = Ti.UI.createImage({ image : 'some_image.png', width: 30, height: 30, top: 5, left: 5 }); /* and, like, 5+ more views or whatever */ row = Ti.UI.createTableViewRow(); row.add(titleLabel); row.add(image); rows.push(row); } tableView.setData(rows); } Of course, this example of a "custom" row is easily created using the standard title and image properties of the TableViewRow, but that isn't the point. How is the allocation of new labels, image views and other child views of a table view prevented in favour of their reuse? I know in iOS this is achieved by using the method -[UITableView dequeueReusableCellWithIdentifier:] to fetch a row object from a 'reservoir' (so 'className' is 'identifier' here) that isn't currently being used for displaying data, but already has the needed child views laid out correctly in it, thus only requiring to update the data contained within (text, image data, etc). As this system is so unbelievably simple, I have a lot of trouble believing the method employed by the Titanium API does not support this. After reading through the API and searching the web, I do however suspect this is the case. The 'className' property is recommended as an easy way to make table views more efficient in Titanium, but its relation to custom table view rows is not explained in any way. If anyone could clarify this matter for me, I would be very grateful.

    Read the article

  • Splitting a set of object into several subsets of 'similar' objects

    - by doublep
    Suppose I have a set of objects, S. There is an algorithm f that, given a set S builds certain data structure D on it: f(S) = D. If S is large and/or contains vastly different objects, D becomes large, to the point of being unusable (i.e. not fitting in allotted memory). To overcome this, I split S into several non-intersecting subsets: S = S1 + S2 + ... + Sn and build Di for each subset. Using n structures is less efficient than using one, but at least this way I can fit into memory constraints. Since size of f(S) grows faster than S itself, combined size of Di is much less than size of D. However, it is still desirable to reduce n, i.e. the number of subsets; or reduce the combined size of Di. For this, I need to split S in such a way that each Si contains "similar" objects, because then f will produce a smaller output structure if input objects are "similar enough" to each other. The problems is that while "similarity" of objects in S and size of f(S) do correlate, there is no way to compute the latter other than just evaluating f(S), and f is not quite fast. Algorithm I have currently is to iteratively add each next object from S into one of Si, so that this results in the least possible (at this stage) increase in combined Di size: for x in S: i = such i that size(f(Si + {x})) - size(f(Si)) is min Si = Si + {x} This gives practically useful results, but certainly pretty far from optimum (i.e. the minimal possible combined size). Also, this is slow. To speed up somewhat, I compute size(f(Si + {x})) - size(f(Si)) only for those i where x is "similar enough" to objects already in Si. Is there any standard approach to such kinds of problems? I know of branch and bounds algorithm family, but it cannot be applied here because it would be prohibitively slow. My guess is that it is simply not possible to compute optimal distribution of S into Si in reasonable time. But is there some common iteratively improving algorithm?

    Read the article

  • Ember Data Sycn - LocalStorage+REST+RealTime+Online/Offline

    - by Miguel Madero
    We have a combination of requirements in terms o data access. Pre-load some reference data. We need reference data to survive browser restarts instead of just living in memory to avoid loading it all the time. I'm currently using the LocalStorageAdapter for that. Once we have it, we would like to sync changes (polling or using Socket.IO in the background and updating the LocalStorage could do the trick) There're other models that are more transactional, where we would need to directly go to the Server and get/save them. It would be nice to use something like the RESTAdapter for that. Lastly, there're some operations that should work off-line and changes should be synced later. To make it more concrete: * We pre-load vendor and "favorite products" into Local Storage. We work offline with those. * We need to sync server changes to vendor and product information. * If they search the full catalog, that requires them to be online. * When offline, we need to allow users to add something to their cart or even submit and order. We would like to queue this action and submit it when they have an Internet Connection. So a few questions are derived from this: * Is there a way to user RESTAdapter in combination with LocalStorage? * Is there some Socket.IO support? (Happy to do this part manually) * Is there Queueing support? Ideally at the Ember-Data level. I know we will have to do a lot of this manually and pull together the different lego pieces, but I wanted to ask for some perspective from experience Ember devs.

    Read the article

  • Advanced search engine or server for relational database [closed]

    - by Pawel
    In my current project we are storing big volume of data in relational database. One of the recent key requirements is to enrich application by adding some advanced search capabilities. In the Project, performance is one of the important factors due to very large tables (10+ milions of records) with parent-children relations (for example: multi-level parent-child relationship, where I am looking for all parents with specific children). The search engine should also be able to check these references for hits. I have found some potential engines on stack overflow, however it looks like that all of them are dedicated rather for text search than relational db and hosted on linux os: lucene Solr Sphinx As I understand some of them use documents as a source of searching, but is it possible or efficient to create programmaticaly documents based on my relational data? As I am not familiar with all of their features/capabilities can anyone please make some recommendations or propose some different solution? To summarize my requirements: framework/engine to search relational database including decendants. support for Microsoft SQL Server can be used in .NET applications preferably hosted on Windows systems Does any of mentioned above are able to solve my problem? do you know any better solution?

    Read the article

  • Is There a Better Way to Feed Different Parameters into Functions with If-Statements?

    - by FlowofSoul
    I've been teaching myself Python for a little while now, and I've never programmed before. I just wrote a basic backup program that writes out the progress of each individual file while it is copying. I wrote a function that determines buffer size so that smaller files are copied with a smaller buffer, and bigger files are copied with a bigger buffer. The way I have the code set up now doesn't seem very efficient, as there is an if loop that then leads to another if loops, creating four options, and they all just call the same function with different parameters. import os import sys def smartcopy(filestocopy, dest_path, show_progress = False): """Determines what buffer size to use with copy() Setting show_progress to True calls back display_progress()""" #filestocopy is a list of dictionaries for the files needed to be copied #dictionaries are used as the fullpath, st_mtime, and size are needed if len(filestocopy.keys()) == 0: return None #Determines average file size for which buffer to use average_size = 0 for key in filestocopy.keys(): average_size += int(filestocopy[key]['size']) average_size = average_size/len(filestocopy.keys()) #Smaller buffer for smaller files if average_size < 1024*10000: #Buffer sizes determined by informal tests on my laptop if show_progress: for key in filestocopy.keys(): #dest_path+key is the destination path, as the key is the relative path #and the dest_path is the top level folder copy(filestocopy[key]['fullpath'], dest_path+key, callback = lambda pos, total: display_progress(pos, total, key)) else: for key in filestocopy.keys(): copy(filestocopy[key]['fullpath'], dest_path+key, callback = None) #Bigger buffer for bigger files else: if show_progress: for key in filestocopy.keys(): copy(filestocopy[key]['fullpath'], dest_path+key, 1024*2600, callback = lambda pos, total: display_progress(pos, total, key)) else: for key in filestocopy.keys(): copy(filestocopy[key]['fullpath'], dest_path+key, 1024*2600) def display_progress(pos, total, filename): percent = round(float(pos)/float(total)*100,2) if percent <= 100: sys.stdout.write(filename + ' - ' + str(percent)+'% \r') else: percent = 100 sys.stdout.write(filename + ' - Completed \n') Is there a better way to accomplish what I'm doing? Sorry if the code is commented poorly or hard to follow. I didn't want to ask someone to read through all 120 lines of my poorly written code, so I just isolated the two functions. Thanks for any help.

    Read the article

  • Does .Net use Device Dependent or Device Independent Bitmaps?

    - by Brian
    When loading an image into memory, does .Net use DDB, DIB, or something else entirely? If possible, please cite your sources. I'm wondering because we currently have a classic ASP application that is using a 3rd party component to load images that is occasionally creating a “Not enough storage is available to process this command.” error. The error is very inconsistent but tends to happen on larger images (not always, but often). After resetting IIS, processing the same file again typically works just fine. After much research I have found that DDBs tend to have this problem when processing large images because they work out of video memory. Considering that we are running on a web server with an integrated video card and limited shared memory, this could certainly be our problem. We are in the early stages of converting our app to .Net and am wondering if using .Net for this might be a viable alternative to our current method which is why I am asking the question. Any advice is welcome :) but out of curiosity if nothing else, I am really hoping for an answer to the question; does .Net use DDB or DIB?

    Read the article

  • Deserialize xml which uses attribute name/value pairs

    - by Bodyloss
    My application receives a constant stream of xml files which are more or less a direct copy of the database record <record type="update"> <field name="id">987654321</field> <field name="user_id">4321</field> <field name="updated">2011-11-24 13:43:23</field> </record> And I need to deserialize this into a class which provides nullable property's for all columns class Record { public long? Id { get; set; } public long? UserId { get; set; } public DateTime? Updated { get; set; } } I just cant seem to work out a method of doing this without having to parse the xml file manually and switch on the field's name attribute to store the values. Is their a way this can be achieved quickly using an XmlSerializer? And if not is their a more efficient way of parsing it manually? Regards and thanks My main problem is that the attribute name needs to have its value set to a property name and its value as the contents of a <field>..</field> element

    Read the article

  • CUDA - multiple kernels to compute a single value

    - by Roger
    Hey, I'm trying to write a kernel to essentially do the following in C float sum = 0.0; for(int i = 0; i < N; i++){ sum += valueArray[i]*valueArray[i]; } sum += sum / N; At the moment I have this inside my kernel, but it is not giving correct values. int i0 = blockIdx.x * blockDim.x + threadIdx.x; for(int i=i0; i<N; i += blockDim.x*gridDim.x){ *d_sum += d_valueArray[i]*d_valueArray[i]; } *d_sum= __fdividef(*d_sum, N); The code used to call the kernel is kernelName<<<64,128>>>(N, d_valueArray, d_sum); cudaMemcpy(&sum, d_sum, sizeof(float) , cudaMemcpyDeviceToHost); I think that each kernel is calculating a partial sum, but the final divide statement is not taking into account the accumulated value from each of the threads. Every kernel is producing it's own final value for d_sum? Does anyone know how could I go about doing this in an efficient way? Maybe using shared memory between threads? I'm very new to GPU programming. Cheers

    Read the article

  • Any suggestions to improve my PDO connection class?

    - by Scarface
    Hey guys I am pretty new to pdo so I basically just put together a simple connection class using information out of the introductory book I was reading but is this connection efficient? If anyone has any informative suggestions, I would really appreciate it. class PDOConnectionFactory{ public $con = null; // swich database? public $dbType = "mysql"; // connection parameters public $host = "localhost"; public $user = "user"; public $senha = "password"; public $db = "database"; public $persistent = false; // new PDOConnectionFactory( true ) <--- persistent connection // new PDOConnectionFactory() <--- no persistent connection public function PDOConnectionFactory( $persistent=false ){ // it verifies the persistence of the connection if( $persistent != false){ $this->persistent = true; } } public function getConnection(){ try{ $this->con = new PDO($this->dbType.":host=".$this->host.";dbname=".$this->db, $this->user, $this->senha, array( PDO::ATTR_PERSISTENT => $this->persistent ) ); // carried through successfully, it returns connected return $this->con; // in case that an error occurs, it returns the error; }catch ( PDOException $ex ){ echo "We are currently experiencing technical difficulties. We have a bunch of monkies working really hard to fix the problem. Check back soon: ".$ex->getMessage(); } } // close connection public function Close(){ if( $this->con != null ) $this->con = null; } }

    Read the article

  • Need Help finding an appropriate task assignment algorithm for a college project involving coordinat

    - by Trif Mircea
    I am a long time lurker here and have found over time many answers regarding jquery and web development topics so I decided to ask a question of my own. This time I have to create a c++ project for college which should help manage the workflow of a company providing all kinds of services through in the field teams. The ideas I have so far are: client-server application; the server is a dispatcher where all the orders from clients get and the clients are mobile devices (PDAs) each team in the field having one a order from a client is a task. Each task is made up of a series of subtasks. You have a database with estimations on how long a task should take to complete you also know what tasks or subtasks each team on the field can perform based on what kind of specialists made up the team (not going to complicate the problem by adding needed materials, it is considered that if a member of a team can perform a subtask he has the stuff needed) Now knowing these factors, what would a good task assignment algorithm be? The criteria is: how many tasks can a team do, how many tasks they have in the queue, it could also be location, how far away are they from the place but I don't think I can implement that.. It needs to be efficient and also to adapt quickly is the human dispatcher manually assigns a task. Any help or leads would be really appreciated. Also I'm not 100% sure in the idea so if you have another way you would go about creating such an application please share, even if it just a quick outline. I have to write a theoretical part too so even if the ideas are far more complex that what i outlined that would be ok ; I'd write those and implement what I can. Edit: C++ is the only language I know unfortunately.

    Read the article

  • Why is BorderLayout calling setSize() and setBounds()?

    - by ags
    I'm trying to get my head around proper use of the different LayoutManagers to make my GUI design skills more efficient and effective. For me, that usually requires a detailed understanding of what is going on under the hood. I've found some good discussion of the interaction and consequences of a Container using BorderLayout containing a Container using FlowLayout. I understand it for the most part, but wanted to confirm my mental model and to do so I am looking at the code for BorderLayout. In the code snippet below taken from BorderLayout.layoutContainer(), note the calls to the child Component's setSize() method followed by setBounds(). Looking at the source for these methods of Component, setSize() actually calls setBounds() with the current values for Component.x and Component.y. Why is this done (and not entirely redudant?) Doesn't the setBounds() call completely overwrite the results of the setSize() call? if ((c=getChild(NORTH,ltr)) != null) { c.setSize(right - left, c.height); Dimension d = c.getPreferredSize(); c.setBounds(left, top, right - left, d.height); top += d.height + vgap; } I'm also tring to understand where/when the child Component's size is initially set (before the LayoutManager.layoutContainer() method is called). Finally, this post itself raises a "meta-question": in a situation like this, where the source is available elsewhere, is the accepteed protocol to include the entire method? Or some other way to make it easier for folks to participate in the thread? Thanks.

    Read the article

  • Finding the index of a list item in jQuery

    - by Tim Piele
    I have an unordered list of eight items. On page load the first five <li> have default thumbnail images in them and the 6th one has a 1px by 1px placeholder image with the ID of $('#last'). When a user inserts a new image it replaces the 'src' of $('#last') with their new image. It's not the most efficient way but it works. <ul> <li><img src="img1.png" /></li> <li><img src="img2.png" /></li> <li><img src="img3.png" /></li> <li><img src="img4.png" /></li> <li><img src="img5.png" /></li> <li><img src="1px.png" id="last"/></li> <li></li> <li></li> </ul> When the user adds a new image the ID of $('#last') is removed and I use each() to find the next empty <li> and insert the 1px by 1px image in it, with an ID of $('#last') so it is ready for the next image upload. At this point I need to get the index() of the <li> that now has the 1px by 1px image in it, whose ID is $('#last'), so that I can store the index in the session, so when a user comes back to the page the $('#last') ID is still set and ready to accept another image. How do I get the index of the <li> with that image in it, since it was set after page load? Is there a way to use delegate() or on() to get it? i.e. how do I get the index of an element that was set after page load?

    Read the article

  • Optimize master-detail insert statements

    - by Dave Jarvis
    Quest After a day of running (against nearly 1 GB of data), a set of statements are tumbling down to 40 inserts per second. I am looking to increase that by an order of magnitude or two. SQL Code The code to insert the information comes in two parts: a master record and detail records. The master record: INSERT INTO MONTH_REF (DISTRICT_ID, STATION_ID, CATEGORY_ID, YEAR, MONTH) VALUES ('101', '0066', '010', 1984, 07); The detail records: INSERT INTO DAILY (MONTH_REF_ID, AMOUNT, DAILY_FLAG_ID, DAY) VALUES ((SELECT ID FROM MONTH_REF M WHERE M.DISTRICT_ID = '101' AND M.STATION_ID = '0066' AND M.CAT EGORY_ID = '010' AND M.YEAR = 1984 AND M.MONTH = 07), 0, ' ', 1); INSERT INTO DAILY (MONTH_REF_ID, AMOUNT, DAILY_FLAG_ID, DAY) VALUES ((SELECT ID FROM MONTH_REF M WHERE M.DISTRICT_ID = '101' AND M.STATION_ID = '0066' AND M.CAT EGORY_ID = '010' AND M.YEAR = 1984 AND M.MONTH = 07), 0.5, ' ', 2); INSERT INTO DAILY (MONTH_REF_ID, AMOUNT, DAILY_FLAG_ID, DAY) VALUES ((SELECT ID FROM MONTH_REF M WHERE M.DISTRICT_ID = '101' AND M.STATION_ID = '0066' AND M.CAT EGORY_ID = '010' AND M.YEAR = 1984 AND M.MONTH = 07), 0, 'T', 3); Proposed Solution INSERT INTO MONTH_REF (DISTRICT_ID, STATION_ID, CATEGORY_ID, YEAR, MONTH) VALUES ('101', '0066', '010', 1984, 07); SET @month_ref_id := (SELECT LAST_INSERT_ID()); INSERT INTO DAILY (MONTH_REF_ID, AMOUNT, DAILY_FLAG_ID, DAY) VALUES (@month_ref_id, 0, ' ', 1); INSERT INTO DAILY (MONTH_REF_ID, AMOUNT, DAILY_FLAG_ID, DAY) VALUES (@month_ref_id, 0.5, ' ', 2); INSERT INTO DAILY (MONTH_REF_ID, AMOUNT, DAILY_FLAG_ID, DAY) VALUES (@month_ref_id, 0, 'T', 3); Constraints The MONTH_REF table has an AUTO_INCREMENT primary key and is indexed on it. The DAILY table has no index and no primary key. A primary key can be added to the DAILY table, if it would help. Question Is there a more efficient way to execute the (billion or so) insert statements than the proposed solution? Thank you!

    Read the article

  • C++ Matrix class hierachy

    - by bpw1621
    Should a matrix software library have a root class (e.g., MatrixBase) from which more specialized (or more constrained) matrix classes (e.g., SparseMatrix, UpperTriangluarMatrix, etc.) derive? If so, should the derived classes be derived publicly/protectively/privately? If not, should they be composed with a implementation class encapsulating common functionality and be otherwise unrelated? Something else? I was having a conversation about this with a software developer colleague (I am not per se) who mentioned that it is a common programming design mistake to derive a more restricted class from a more general one (e.g., he used the example of how it was not a good idea to derive a Circle class from an Ellipse class as similar to the matrix design issue) even when it is true that a SparseMatrix "IS A" MatrixBase. The interface presented by both the base and derived classes should be the same for basic operations; for specialized operations, a derived class would have additional functionality that might not be possible to implement for an arbitrary MatrixBase object. For example, we can compute the cholesky decomposition only for a PositiveDefiniteMatrix class object; however, multiplication by a scalar should work the same way for both the base and derived classes. Also, even if the underlying data storage implementation differs the operator()(int,int) should work as expected for any type of matrix class. I have started looking at a few open-source matrix libraries and it appears like this is kind of a mixed bag (or maybe I'm looking at a mixed bag of libraries). I am planning on helping out with a refactoring of a math library where this has been a point of contention and I'd like to have opinions (that is unless there really is an objective right answer to this question) as to what design philosophy would be best and what are the pros and cons to any reasonable approach.

    Read the article

  • How can I build a generic dataset-handling Perl library?

    - by Pep.
    Hello, I want to build a generic Perl module for handling and analysing biomedical character separated datasets and which can, most certain, be used on any kind of datasets that contain a mixture of categorical (A,B,C,..) and continuous (1.2,3,881..) and identifier (XXX1,XXX2...). The plan is to have people initialize the module and then use some arguments to point to the data file(s), the place were the analysis reports should be placed and the structure of the data. By structure of data I mean which variable is in which place and its name/type. And this is where I need some enlightenment. I am baffled how to do this in a clean way. Obviously, having people create a simple schema file, be it XML or some other format would be the cleanest but maybe not all people enjoy doing something like this. The solutions I can think of are: Create a configuration file in XML or similar and with a prespecified format. Pass the information during initialization of the module. Use the first row of the data as headers and try to guess types (ouch) Surely there must be a "canonical" way of doing this that is also usable and efficient. Thanks p.

    Read the article

  • gl_FragColor and glReadPixels

    - by chun0216
    I am still trying to read pixels from fragment shader and I have some questions. I know that gl_FragColor returns with vec4 meaning RGBA, 4 channels. After that, I am using glReadPixels to read FBO and write it in data GLubyte *pixels = new GLubyte[640*480*4]; glReadPixels(0, 0, 640,480, GL_RGBA, GL_UNSIGNED_BYTE, pixels); This works fine but it really has speed issue. Instead of this, I want to just read RGB so ignore alpha channels. I tried: GLubyte *pixels = new GLubyte[640*480*3]; glReadPixels(0, 0, 640,480, GL_RGB, GL_UNSIGNED_BYTE, pixels); instead and this didn't work though. I guess it's because gl_FragColor returns 4 channels and maybe I should do something before this? Actually, since my returned image (gl_FragColor) is grayscale, I did something like float gray = 0.5 //or some other values gl_FragColor = vec4(gray,gray,gray,1.0); So is there any efficient way to use glReadPixels instead of using the first 4 channels method? Any suggestion? By the way, this is on opengl es 2.0 code.

    Read the article

  • Distributed Cache with Serialized File as DataStore in Oracle Coherence

    - by user226295
    Weired but I am investigating the Oracle Coherence as a substitue for distribute cache. My primarr problem is that we dont have distribituted cache as such as of now in our app. Thats my major concern. And thats what I want to implement. So, lets say if I take up a machine and start a new (3rd) reading process, it will be able to connect to the cache and listen to the cache and will have a full set of cache triplicated (as of now its duplicated) Now thats waste from a common person stanpoint too. The size of the cache is 2 GB and without going distibuted its limiting us. Thats bring me to Coheremce. But now, we dont have database as persistent store too. we have the archival processes as our persistent store. (90 days worth of data) Ok now multiply that with soem where around 2 GB * 90 (thats the bare minimum we want to keep). Preliminary/Intermediate analysis of Coherence as a solution. And a (supposedly) brilliant thought crossed my mind. Why not have this as persistant storage with my distributed cache. Does Oracle Coherence support that. I will get rid of archiving infrastructure too (i hate daemon archiving processes). For some starnge reasons, I dont wanna go to the DB to replace those flat files. What say?, can Coherence be my savior? Any other stable alternate too. (Coherence is imposed on me by big guys, FYI)

    Read the article

  • JavaScript - Efficiently find all elements containing one of a large set of strings

    - by noah
    I have a set of strings and I need to find all all of the occurrences in an HTML document. Where the string occurs is important because I need to handle each case differently: String is all or part of an attribute. e.g., the string is foo: <input value="foo"> - Add class ATTR to the element. String is the full text of an element. e.g., <button>foo</button> - Add class TEXT to the element. String is inline in the text of an element. e.g., <p>I love foo</p> - Wrap the text in a span tag with class TEXT. Also, I need to match the longest string first. e.g., if I have foo and foobar, then <p>I love foobar</p> should become <p>I love <span class="TEXT">foobar</span></p>, not <p>I love <span class="TEXT">foo</span>bar</p>. The inline text is easy enough: Sort the strings descending by length and find and replace each in document.body.innerHTML with <span class="TEXT">$1</span>, although I'm not sure if that is the most efficient way to go. For the attributes, I can do something like this: sortedStrings.each(function(it) { document.body.innerHTML.replace(new RegExp('(\S+?)="[^"]*'+escapeRegExChars(it)+'[^"]*"','g'),function(s,attr) { $('[+attr+'*='+it+']').addClass('ATTR'); }); }); Again, that seems inefficient. Lastly, for the full text elements, a depth first search of the document that compares the innerHTML to each string will work, but for a large number of strings, it seems very inefficient. Any answer that offers performance improvements gets an upvote :)

    Read the article

  • figuring out which field to look for a value in with SQL and perl

    - by Micah
    I'm not too good with SQL and I know there's probably a much more efficient way to accomplish what I'm doing here, so any help would be much appreciated. Thanks in advance for your input! I'm writing a short program for the local school high school. At this school, juniors and seniors who have driver's licenses and cars can opt to drive to school rather than ride the bus. Each driver is assigned exactly one space, and their DLN is used as the primary key of the driver's table. Makes, models, and colors of cars are stored in a separate cars table, related to the drivers table by the License plate number field. My idea is to have a single search box on the main GUI of the program where the school secretary can type in who/what she's looking for and pull up a list of results. Thing is, she could be typing a license plate number, a car color, make, and model, someone driver's name, some student driver's DLN, or a space number. As the programmer, I don't know what exactly she's looking for, so a couple of options come to mind for me to build to be certain I check everywhere for a match: 1) preform a couple of SELECT * FROM [tablename] SQL statements, one per table and cram the results into arrays in my program, then search across the arrays one element at a time with regex, looking for a matched pattern similar to the search term, and if I find one, add the entire record that had a match in it to a results array to display on screen at the end of the search. 2) take whatever she's looking for into the program as a scaler and prepare multiple select statements around it, such as SELECT * FROM DRIVERS WHERE DLN = $Search_Variable SELECT * FROM DRIVERS WHERE First_Name = $Search_Variable SELECT * FROM CARS WHERE LICENSE = $Search_Variable and so on for each attribute of each table, sticking the results into a results array to show on screen when the search is done. Is there a cleaner way to go about this lookup without having to make her specify exactly what she's looking for? Possibly some kind of SQL statement I've never seen before?

    Read the article

  • Rand(); with exclusion to and already randomly generated number..?

    - by Stefan
    Hey, I have a function which calls a users associated users from a table. The function then uses the rand(); function to chose from the array 5 randomly selected userID's however!... In the case where a user doesnt have many associated users but above the min (if below the 5 it just returns the array as it is) then it gives bad results due to repeat rand numbers... How can overcome this or exclude a previously selected rand number from the next rand(); function call. Here is the section of code doing the work. Bare in mind this must be highly efficient as this script is used everywhere. $size = sizeof($users)-1; $nusers[0] = $users[rand(0,$size)]; $nusers[1] = $users[rand(0,$size)]; $nusers[2] = $users[rand(0,$size)]; $nusers[3] = $users[rand(0,$size)]; $nusers[4] = $users[rand(0,$size)]; return $nusers; Thanks in advance! Stefan

    Read the article

< Previous Page | 317 318 319 320 321 322 323 324 325 326 327 328  | Next Page >