Search Results

Search found 2750 results on 110 pages for 'recursive subquery factor'.

Page 94/110 | < Previous Page | 90 91 92 93 94 95 96 97 98 99 100 101  | Next Page >

  • Are certain open-source licenses more suitable than others for career growth?

    - by Francisco Garcia
    As a software engineer/programmer myself, I love the possibility to download the code and learn from it. However building software is what brings food to my table. I have doubts regarding the type of license I should use for my own personal projects or when picking up one project to learn from. There are already many questions about licenses on Stackoverflow, but I would like to make this one much more specific. If your main profession and way of living is building software: which type of license do you find more useful for you? And I mean, the license that can benefit you most as a professional because it gives you more freedom to reuse the experience you gain. GPL is a great license to build communities because it forces you to give back your work. However I like BSD licenses because of their extra freedom. I know that if the code I am exploring is BSD licensed, I might be able to expand not only my skills, but also my programmer toolbox. Whenever I am working for a company, I might recall that something similar was done in another project and I will be able to copy or imitate certain part of the code. I know that there are religious wars regarding GPL vs BSD and it is not my intention to start one. Probably many companies already take snipsets from GPL projects anyway. I just want to insist in the factor of professional enrichment. I do not intend to discriminate any license. I said I prefer BSD licenses but I also use Linux because the user base is bigger and also the market demand.

    Read the article

  • caching previous return values of procedures in scheme

    - by Brock
    In Chapter 16 of "The Seasoned Schemer", the authors define a recursive procedure "depth", which returns 'pizza nested in n lists, e.g (depth 3) is (((pizza))). They then improve it as "depthM", which caches its return values using set! in the lists Ns and Rs, which together form a lookup-table, so you don't have to recurse all the way down if you reach a return value you've seen before. E.g. If I've already computed (depthM 8), when I later compute (depthM 9), I just lookup the return value of (depthM 8) and cons it onto null, instead of recursing all the way down to (depthM 0). But then they move the Ns and Rs inside the procedure, and initialize them to null with "let". Why doesn't this completely defeat the point of caching the return values? From a bit of experimentation, it appears that the Ns and Rs are being reinitialized on every call to "depthM". Am I misunderstanding their point? I guess my question is really this: Is there a way in Scheme to have lexically-scoped variables preserve their values in between calls to a procedure, like you can do in Perl 5.10 with "state" variables?

    Read the article

  • Where should an application's default folder live?

    - by HotOil
    Hi: I'm creating a little app that configures a connected device and then saves the config information in a file. The filename cannot be chosen by the user, but its location can be chosen. Where is the best place for the app's default save-to folder? I have seen examples out there where it is the "MyDocuments" location (eg Visual Studio does this). I have seen a folder created right at the top of the C:\ drive. I find that to be a little obnoxious, personally. It could be in the Program Files[Manufacturer] or Program Files[Product Name], or wherever the app was installed. I have used this location in the past; I dislike it because Windows Explorer does not allow a user to browse to there very easily ('browsability'). Going with this last notion that 'browsability' is a factor, I suppose MyDocuments is the best choice. Is this the most common, most widely accepted practice? I think historically we have chosen the install folder because that co-locates the data with the device management utilities. But I would really like to get away from that. I don't want the user to have to go pawing through system files to find his/her data, esp if that person is not too Windows-savvy. Also, I am using the .NET WinForms FolderBrowserDialog, and the "Environment.SpecialFolders" enum isn't helpful in setting up the dialog to point into the Program Files folder. Thanks for your input! Suz.

    Read the article

  • Aggregate path counts using HierarchyID

    - by austincav
    Business problem - understand process fallout using analytics data. Here is what we have done so far: Build a dictionary table with every possible process step Find each process "start" Find the last step for each start Join dictionary table to last step to find path to final step In the final report output we end up with a list of paths for each start to each final step: User Fallout Step HierarchyID.ToString() A 1/1/1 B 1/1/1/1/1 C 1/1/1/1 D 1/1/1 E 1/1 What this means is that five users (A-E) started the process. Assume only User B finished, the other four did not. Since this is a simple example (without branching) we want the output to look as follows: Step Unique Users 1 5 2 5 3 4 4 2 5 1 The easiest solution I could think of is to take each hierarchyID.ToString(), parse that out into a set of subpaths, JOIN back to the dictionary table, and output using GROUP BY. Given the volume of data, I'd like to use the built-in HierarchyID functions, e.g. IsAncestorOf. Any ideas or thoughts how I could write this? Maybe a recursive CTE?

    Read the article

  • Does it ever make sense to make a fundamental (non-pointer) parameter const?

    - by Scott Smith
    I recently had an exchange with another C++ developer about the following use of const: void Foo(const int bar); He felt that using const in this way was good practice. I argued that it does nothing for the caller of the function (since a copy of the argument was going to be passed, there is no additional guarantee of safety with regard to overwrite). In addition, doing this prevents the implementer of Foo from modifying their private copy of the argument. So, it both mandates and advertises an implementation detail. Not the end of the world, but certainly not something to be recommended as good practice. I'm curious as to what others think on this issue. Edit: OK, I didn't realize that const-ness of the arguments didn't factor into the signature of the function. So, it is possible to mark the arguments as const in the implementation (.cpp), and not in the header (.h) - and the compiler is fine with that. That being the case, I guess the policy should be the same for making local variables const. One could make the argument that having different looking signatures in the header and source file would confuse others (as it would have confused me). While I try to follow the Principle of Least Astonishment with whatever I write, I guess it's reasonable to expect developers to recognize this as legal and useful.

    Read the article

  • I'm having an issue to use GLshort for representing Vertex, and Normal.

    - by Xylopia
    As my project gets close to optimization stage, I notice that reducing Vertex Metadata could vastly improve the performance of 3D rendering. Eventually, I've dearly searched around and have found following advices from stackoverflow. Using GL_SHORT instead of GL_FLOAT in an OpenGL ES vertex array How do you represent a normal or texture coordinate using GLshorts? Advice on speeding up OpenGL ES 1.1 on the iPhone Simple experiments show that switching from "FLOAT" to "SHORT" for vertex and normal isn't tough, but what troubles me is when you're to scale back verticies to their original size (with glScalef), normals are multiplied by the reciprocal of the scale. Natural remedy for this is to multiply the normals w/ scale before you submit to GPU. Then, my short normals almost become 0, because the scale factor is usually smaller than 0. Duh! How do you use "short" for both vertex and normal at the same time? I've been trying this and that for about a full day, but I could only go for "float vertex w/ byte normal" or "short vertex w/ float normal" so far. Your help would be truly appreciated.

    Read the article

  • What does this structure actually do?

    - by LGTrader
    I found this structure code in a Julia Set example from a book on CUDA. I'm a newbie C programmer and cannot get my head around what it's doing, nor have I found the right thing to read on the web to clear it up. Here's the structure: struct cuComplex { float r; float i; cuComplex( float a, float b ) : r(a), i(b) {} float magnitude2( void ) { return r * r + i * i; } cuComplex operator*(const cuComplex& a) { return cuComplex(r*a.r - i*a.i, i*a.r + r*a.i); } cuComplex operator+(const cuComplex& a) { return cuComplex(r+a.r, i+a.i); } }; and it's called very simply like this: cuComplex c(-0.8, 0.156); cuComplex a(jx, jy); int i = 0; for (i=0; i<200; i++) { a = a * a + c; if (a.magnitude2() > 1000) return 0; } return 1; So, the code did what? Defined something of structure type 'cuComplex' giving the real and imaginary parts of a number. (-0.8 & 0.156) What is getting returned? (Or placed in the structure?) How do I work through the logic of the operator stuff in the struct to understand what is actually calculated and held there? I think that it's probably doing recursive calls back into the stucture float magnitude2 (void) { return return r * r + i * i; } probably calls the '*' operator for r and again for i, and then the results of those two operations call the '+' operator? Is this correct and what gets returned at each step? Just plain confused. Thanks!

    Read the article

  • ImageViews sometimes not displaying in FrameLayout activity

    - by Ken
    The top level layout in my activity is a framelayout. I have completed, debugged and tested this app and it works exactly like it should in all respects on my g1 and on various emulators. But on 3.7-inch displays running 2.1+, some imageviews packed in a linearlayout are periodically not visible. I know that they are there because you can touch and drag them with effect in the app. So I assume somehow they have gotten under the SurfaceView that is the main component of the app. This is apparently so even though the SurfaceView is declared in the xml prior to the LinearLayout. However, the ImageViews IN the LinearLayout are added programmatically towards the end of onCreate(). Framelayout stacks everything that is added to it, one on top of the other--the only way you will see more than one child of a frame layout is if they are smaller than the screen and are placed apart from eachother. Oddly, sometimes the imageviews ARE visible--it is random. Anyway, I've been trying to combat this with framelayout.bringChildToFront(View v) on the linearlayout without success. I wonder if anyone has any insight into how the behavior could be random like that, and how I should code these imageviews to keep this from happening, and why the problem appears only to occur on 3.7 vs 3.2 inch screens (as it happens, the two 3.2-inch screens were both htc, so vendor might be factor too). [edit] Actually, I've determined that this is a 2.2 issue, not a screen size (or even vendor) issue. Can't ensure that ImageViews added to a framelayout with a SurfaceView in it will appear on top of the surfaceview. I ran some tests in the respective onDraw() methods and the imageviews are 'visible' (0), and nothing does anything to the alpha of the drawables, which are there as well at ondraw(). [/edit] Any insight would be welcomed. Ken T.

    Read the article

  • How to get the size of a binary tree ?

    - by Andrei Ciobanu
    I have a very simple binary tree structure, something like: struct nmbintree_s { unsigned int size; int (*cmp)(const void *e1, const void *e2); void (*destructor)(void *data); nmbintree_node *root; }; struct nmbintree_node_s { void *data; struct nmbintree_node_s *right; struct nmbintree_node_s *left; }; Sometimes i need to extract a 'tree' from another and i need to get the size to the 'extracted tree' in order to update the size of the initial 'tree' . I was thinking on two approaches: 1) Using a recursive function, something like: unsigned int nmbintree_size(struct nmbintree_node* node) { if (node==NULL) { return(0); } return( nmbintree_size(node->left) + nmbintree_size(node->right) + 1 ); } 2) A preorder / inorder / postorder traversal done in an iterative way (using stack / queue) + counting the nodes. What approach do you think is more 'memory failure proof' / performant ? Any other suggestions / tips ? NOTE: I am probably going to use this implementation in the future for small projects of mine. So I don't want to unexpectedly fail :).

    Read the article

  • Doctrine - get the offset of an object in a collection (implementing an infinite scroll)

    - by dan
    I am using Doctrine and trying to implement an infinite scroll on a collection of notes displayed on the user's browser. The application is very dynamic, therefore when the user submits a new note, the note is added to the top of the collection straightaway, besides being sent (and stored) to the server. Which is why I can't use a traditional pagination method, where you just send the page number to the server and the server will figure out the offset and the number of results from that. To give you an example of what I mean, imagine there are 20 notes displayed, then the user adds 2 more notes, therefore there are 22 notes displayed. If I simply requests "page 2", the first 2 items of that page will be the last two items of the page currently displayed to the user. Which is why I am after a more sophisticated method, which is the one I am about to explain. Please consider the following code, which is part of the server code serving an AJAX request for more notes: // $lastNoteDisplayedId is coming from the AJAX request $lastNoteDisplayed = $repository->findBy($lastNoteDisplayedId); $allNotes = $repository->findBy($filter, array('createdAt' => 'desc')); $offset = getLastNoteDisplayedOffset($allNotes, $lastNoteDisplayedId); // retrieve the page to send back so that it can be appended to the listing $notesPerPage = 30 $notes = $repository->findBy( array(), array('createdAt' => 'desc'), $notesPerPage, $offset ); $response = json_encode($notes); return $response; Basically I would need to write the method getLastNoteDisplayedOffset, that given the whole set of notes and one particoular note, it can give me its offset, so that I can use it for the pagination of the previous Doctrine statement. I know probably a possible implementation would be: getLastNoteDisplayedOffset($allNotes, $lastNoteDisplayedId) { $i = 0; foreach ($allNotes as $note) { if ($note->getId() === $lastNoteDisplayedId->getId()) { break; } $i++; } return $i; } I would prefer not to loop through all notes because performance is an important factor. I was wondering if Doctrine has got a method itself or if you can suggest a different approach.

    Read the article

  • Creating a three level ASP.NET menu with SiteMap, how do i do it?

    - by user270399
    I want to create a three level menu, I have got a recursive function today that works with three levels. But the thing is how do i output the third lever? Using two repeaters i have managed to get a hold of the first two levels through the ChildNodes property. But that only gives me the second level. What if a want the third level? Example code below. How do i get the third level? :) <asp:Repeater ID="FirstLevel" DataSourceID="SiteMapDataSource" runat="server" EnableViewState="false"> <ItemTemplate> <li class="top"> <a href='/About/<%#Eval("Title")%>.aspx' class="top_link"><span class="down"><%#Eval("Title")%></span><!--[if gte IE 7]><!--></a><!--<![endif]--> <asp:Repeater runat="server" ID="SecondLevel" DataSource='<%#((SiteMapNode)Container.DataItem).ChildNodes%>'> <HeaderTemplate><!--[if lte IE 6]><table><tr><td><![endif]--><ul class="sub"></HeaderTemplate> <ItemTemplate> <li> <a href='<%#((string)Eval("Url")).Replace("~", "")%>' style="text-align: left;"><%#Eval("Title")%></a> Third repeater here? </li> </ItemTemplate> <FooterTemplate></ul><!--[if lte IE 6]></td></tr></table></a><![endif]--></FooterTemplate> </asp:Repeater> </li> </ItemTemplate> </asp:Repeater>

    Read the article

  • How to accurately resize nested elements with ems and font-size percentage?

    - by moonDogDog
    I have a carousel with textboxes for each image, and my client (who knows nothing about HTML) edits the textboxes using a WYSIWYG text editor. The resulting output is akin to your worst nightmares; something like: <div class="carousel-text-container"> <span style="font-size:18pt;"> <span style="font-weight:bold;"> Lorem <span style="font-size:15pt:">Dolor</span> </span> Ipsum <span style="font-size:19pt;"><span>&nbsp;<span>Sit</span>Amet</span> </span> </div> This site has to be displayed at 3 different sizes to accomodate smaller monitors, so I have been using CSS media queries to resize the site. Now I am having trouble resizing the text inside the textbox correctly. I have tried using jQuery.css to get the font size of each element in px, and then convert it to em. Then, by setting a font-size:x% sort of declaration on .carousel-text-container, I hoped that that would resize everything properly. Unfortunately, there seems to be a recursive nature with how font-size is applied in ems. That is, .example is not resized properly in the following because its parent is also influencing it <span style="font-size:2em;"> Something <span class="example" style="font-size:1.5em;">Else</span> </span> How can I resize everything reliably and precisely such I can achieve a true percentage of my original font size, margin, padding, line-height, etc. for all the children of .carousel-text-container?

    Read the article

  • Add leading zeros to this javascript countdown script?

    - by eddjedi
    I am using the following countdown script which works great, but I can't figure out how to add leading zeros to the numbers (eg so it displays 09 instead of 9.) Can anybody help me out please? Here's the current script: function countDown(id, end, cur){ this.container = document.getElementById(id); this.endDate = new Date(end); this.curDate = new Date(cur); var context = this; var formatResults = function(day, hour, minute, second){ var displayString = [ '<div class="stat statBig">',day,'</div>', '<div class="stat statBig">',hour,'</div>', '<div class="stat statBig">',minute,'</div>', '<div class="stat statBig">',second,'</div>' ]; return displayString.join(""); } var update = function(){ context.curDate.setSeconds(context.curDate.getSeconds()+1); var timediff = (context.endDate-context.curDate)/1000; // Check if timer expired: if (timediff<0){ return context.container.innerHTML = formatResults(0,0,0,0); } var oneMinute=60; //minute unit in seconds var oneHour=60*60; //hour unit in seconds var oneDay=60*60*24; //day unit in seconds var dayfield=Math.floor(timediff/oneDay); var hourfield=Math.floor((timediff-dayfield*oneDay)/oneHour); var minutefield=Math.floor((timediff-dayfield*oneDay-hourfield*oneHour)/oneMinute); var secondfield=Math.floor((timediff-dayfield*oneDay-hourfield*oneHour-minutefield*oneMinute)); context.container.innerHTML = formatResults(dayfield, hourfield, minutefield, secondfield); // Call recursively setTimeout(update, 1000); }; // Call the recursive loop update(); }

    Read the article

  • How to transform a production to LL(1) grammar for a list separated by a semicolon?

    - by Subb
    Hi, I'm reading this introductory book on parsing (which is pretty good btw) and one of the exercice is to "build a parser for your favorite language." Since I don't want to die today, I thought I could do a parser for something relatively simple, ie a simplified CSS. Note: This book teach you how to right a LL(1) parser using the recursive-descent algorithm. So, as a sub-exercice, I am building the grammar from what I know of CSS. But I'm stuck on a production that I can't transform in LL(1) : //EBNF block = "{", declaration, {";", declaration}, [";"], "}" //BNF <block> =:: "{" <declaration> "}" <declaration> =:: <single-declaration> <opt-end> | <single-declaration> ";" <declaration> <opt-end> =:: "" | ";" This describe a CSS block. Valid block can have the form : { property : value } { property : value; } { property : value; property : value } { property : value; property : value; } ... The problem is with the optional ";" at the end, because it overlap with the starting character of {";", declaration}, so when my parser meet a semicolon in this context, it doesn't know what to do. The book talk about this problem, but in its example, the semicolon is obligatory, so the rule can be modified like this : block = "{", declaration, ";", {declaration, ";"}, "}" So, Is it possible to achieve what I'm trying to do using a LL(1) parser?

    Read the article

  • To OpenID or not to OpenID? Is it worth it?

    - by Eloff
    Does OpenID improve the user experience? Edit Not to detract from the other comments, but I got one really good reply below that outlined 3 advantages of OpenID in a rational bottom line kind of way. I've also heard some whisperings in other comments that you can get access to some details on the user through OpenID (name? email? what?) and that using that it might even be able to simplify the registration process by not needing to gather as much information. Things that definitely need to be gathered in a checkout process: Full name Email (I'm pretty sure I'll have to ask for these myself) Billing address Shipping address Credit card info There may be a few other things that are interesting from a marketing point of view, but I wouldn't ask the user to manually enter anything not absolutely required during the checkout process. So what's possible in this regard? /Edit (You may have noticed stackoverflow uses OpenID) It seems to me it is easier and faster for the user to simply enter a username and password in a signup form they have to go through anyway. I mean you don't avoid entering a username and password either with OpenID. But you avoid the confusion of choosing a OpenID provider, and the trip out to and back from and external site. With Microsoft making Live ID an OpenID provider (More Info), bringing on several hundred million additional accounts to those provided by Google, Yahoo, and others, this question is more important than ever. I have to require new customers to sign up during the checkout process, and it is absolutely critical that the experience be as easy and smooth as possible, every little bit harder it becomes translates into lost sales. No geek factor outweighs cold hard cash at the end of the day :) OpenID seems like a nice idea, but the implementation is of questionable value. What are the advantages of OpenID and is it really worth it in my scenario described above?

    Read the article

  • Using child visitor in C#

    - by Thomas Matthews
    I am setting up a testing component and trying to keep it generic. I want to use a generic Visitor class, but not sure about using descendant classes. Example: public interface Interface_Test_Case { void execute(); void accept(Interface_Test_Visitor v); } public interface Interface_Test_Visitor { void visit(Interface_Test_Case tc); } public interface Interface_Read_Test_Case : Interface_Test_Case { uint read_value(); } public class USB_Read_Test : Interface_Read_Test_Case { void execute() { Console.WriteLine("Executing USB Read Test Case."); } void accept(Interface_Test_Visitor v) { Console.WriteLine("Accepting visitor."); } uint read_value() { Console.WriteLine("Reading value from USB"); return 0; } } public class USB_Read_Visitor : Interface_Test_Visitor { void visit(Interface_Test_Case tc) { Console.WriteLine("Not supported Test Case."); } void visit(Interface_Read_Test_Case rtc) { Console.WriteLine("Not supported Read Test Case."); } void visit(USB_Read_Test urt) { Console.WriteLine("Yay, visiting USB Read Test case."); } } // Code fragment USB_Read_Test test_case; USB_Read_Visitor visitor; test_case.accept(visitor); What are the rules the C# compiler uses to determine which of the methods in USB_Read_Visitor will be executed by the code fragment? I'm trying to factor out dependencies of my testing component. Unfortunately, my current Visitor class contains visit methods for classes not related to the testing component. Am I trying to achieve the impossible?

    Read the article

  • What is a good approach to preloading data?

    - by Bob Horn
    Are there best practices out there for loading data into a database, to be used with a new installation of an application? For example, for application foo to run, it needs some basic data before it can even be started. I've used a couple options in the past: TSQL for every row that needs to be preloaded: IF NOT EXISTS (SELECT * FROM Master.Site WHERE Name = @SiteName) INSERT INTO [Master].[Site] ([EnterpriseID], [Name], [LastModifiedTime], [LastModifiedUser]) VALUES (@EnterpriseId, @SiteName, GETDATE(), @LastModifiedUser) Another option is a spreadsheet. Each tab represents a table, and data is entered into the spreadsheet as we realize we need it. Then, a program can read this spreadsheet and populate the DB. There are complicating factors, including the relationships between tables. So, it's not as simple as loading tables by themselves. For example, if we create Security.Member rows, then we want to add those members to Security.Role, we need a way of maintaining that relationship. Another factor is that not all databases will be missing this data. Some locations will already have most of the data, and others (that may be new locations around the world), will start from scratch. Any ideas are appreciated.

    Read the article

  • Running Different Modules on Tomcat/Server

    - by umesh awasthi
    Hi All, I have started working on my own project idea but on the starting i have strucked in the structure decision may be lack of knowledge is the promary factor. i am wondering how we can make 2 different modules to co-operate on the server.Here is is explanation of what i am trying to ask as per my design i need 2 modules for my application 1. Back end handling where i can do all content handling as well as other admin task a console which will be capable of handling everything from ceating importing contents to everything. 2 USer end which is only an interface to the end user to use the application. to visualize the things its kinda e-commerce application one is back office management and other is user end of web-shop. since these two module will be very much interlinked but i don't want them to mix up i want them to develop independently since admin is one which is core and it will also going to serve user interface. my question is how i can develop two module independently but on the other hand i want them to co-operate to accomplish the task as a whole. Really sorry in advance if i am not making any sense. Thanks in advance

    Read the article

  • Entity Framework DateTime update extremely slow

    - by Phyxion
    I have this situation currently with Entity Framework: using (TestEntities dataContext = DataContext) { UserSession session = dataContext.UserSessions.FirstOrDefault(userSession => userSession.Id == SessionId); if (session != null) { session.LastAvailableDate = DateTime.Now; dataContext.SaveChanges(); } } This is all working perfect, except for the fact that it is terribly slow compared to what I expect (14 calls per second, tested with 100 iterations). When I update this record manually through this command: dataContext.Database.ExecuteSqlCommand(String.Format("update UserSession set LastAvailableDate = '{0}' where Id = '{1}'", DateTime.Now.ToString("yyyy-MM-dd HH:mm:ss.fffffff"), SessionId)); I get 55 calls per second, which is more than fast enough. However, when I don't update the session.LastAvailableDate but I update an integer (e.g. session.UserId) or string with Entity Framework, I get 50 calls per second, which is also more than fast enough. Only the datetime field is terrible slow. The difference of a factor 4 is unacceptable and I was wondering how I can improve this as I don't prefer using direct SQL when I can also use the Entity Framework. I'm using Entity Framework 4.3.1 (also tried 4.1).

    Read the article

  • Partitioning data set in r based on multiple classes of observations

    - by Danny
    I'm trying to partition a data set that I have in R, 2/3 for training and 1/3 for testing. I have one classification variable, and seven numerical variables. Each observation is classified as either A, B, C, or D. For simplicity's sake, let's say that the classification variable, cl, is A for the first 100 observations, B for observations 101 to 200, C till 300, and D till 400. I'm trying to get a partition that has 2/3 of the observations for each of A, B, C, and D (as opposed to simply getting 2/3 of the observations for the entire data set since it will likely not have equal amounts of each classification). When I try to sample from a subset of the data, such as sample(subset(data, cl=='A')), the columns are reordered instead of the rows. To summarize, my goal is to have 67 random observations from each of A, B, C, and D as my training data, and store the remaining 33 observations for each of A, B, C, and D as testing data. I have found a very similar question to mine, but it did not factor in multiple variables. I feel silly asking this question because it seems so simple, but I'm stumped. Also, this is my first question on this site, so I apologize in advance for any faux pas on my part.

    Read the article

  • What is the fastest way to find duplicates in multiple BIG txt files?

    - by user2950750
    I am really in deep water here and I need a lifeline. I have 10 txt files. Each file has up to 100.000.000 lines of data. Each line is simply a number representing something else. Numbers go up to 9 digits. I need to (somehow) scan these 10 files and find the numbers that appear in all 10 files. And here comes the tricky part. I have to do it in less than 2 seconds. I am not a developer, so I need an explanation for dummies. I have done enough research to learn that hash tables and map reduce might be something that I can make use of. But can it really be used to make it this fast, or do I need more advanced solutions? I have also been thinking about cutting up the files into smaller files. To that 1 file with 100.000.000 lines is transformed into 100 files with 1.000.000 lines. But I do not know what is best: 10 files with 100 million lines or 1000 files with 1 million lines? When I try to open the 100 million line file, it takes forever. So I think, maybe, it is just too big to be used. But I don't know if you can write code that will scan it without opening. Speed is the most important factor in this, and I need to know if it can be done as fast as I need it, or if I have to store my data in another way, for example, in a database like mysql or something. Thank you in advance to anybody that can give some good feedback.

    Read the article

  • convert variable with mixed date formats to one format in r

    - by jalapic
    A sample of my dataframe: date 1 25 February 1987 2 20 August 1974 3 9 October 1984 4 18 August 1992 5 19 September 1995 6 16-Oct-63 7 30-Sep-65 8 22 Jan 2008 9 13-11-1961 10 18 August 1987 11 15-Sep-70 12 5 October 1994 13 5 December 1984 14 03/23/87 15 30 August 1988 16 26-10-1993 17 22 August 1989 18 13-Sep-97 I have a large dataframe with a date variable that has multiple formats for dates. Most of the formats in the variable are shown above- there are a couple of very rare others too. The reason why there are multiple formats is that the data were pulled together from various websites that each used different formats. I have tried using straightforward conversions e.g. strftime(mydf$date,"%d/%m/%Y") but these sorts of conversion will not work if there are multiple formats. I don't want to resort to multiple gsub type editing. I was wondering if I am missing a more simple solution? Code for example: structure(list(date = structure(c(12L, 8L, 18L, 6L, 7L, 4L, 14L, 10L, 1L, 5L, 3L, 17L, 16L, 11L, 15L, 13L, 9L, 2L), .Label = c("13-11-1961", "13-Sep-97", "15-Sep-70", "16-Oct-63", "18 August 1987", "18 August 1992", "19 September 1995", "20 August 1974", "22 August 1989", "22 Jan 2008", "03/23/87", "25 February 1987", "26-10-1993", "30-Sep-65", "30 August 1988", "5 December 1984", "5 October 1994", "9 October 1984"), class = "factor")), .Names = "date", row.names = c(NA, -18L), class = "data.frame")

    Read the article

  • C++. How to define template parameter of type T for class A when class T needs a type A template parameter?

    - by jaybny
    Executor class has template of type P and it takes a P object in constructor. Algo class has a template E and also has a static variable of type E. Processor class has template T and a collection of Ts. Question how can I define Executor< Processor<Algo> > and Algo<Executor> ? Is this possible? I see no way to defining this, its kind of an "infinite recursive template argument" See code. template <class T> class Processor { map<string,T> ts; void Process(string str, int i) { ts[str].Do(i); } } template <class P> class Executor { Proc &p; Executor(P &p) : Proc(p) {} void Foo(string str, int i) { p.Process(str,i); } Execute(string str) { } } template <class E> class Algo { static E e; void Do(int i) {} void Foo() { e.Execute("xxx"); } } main () { typedef Processor<Algo> PALGO; // invalid typedef Executor<PALGO> EPALGO; typedef Algo<EPALGO> AEPALGO; Executor<PALGO> executor(PALGO()); AEPALGO::E = executor; }

    Read the article

  • Is memory allocation in linux non-blocking?

    - by Mark
    I am curious to know if the allocating memory using a default new operator is a non-blocking operation. e.g. struct Node { int a,b; }; ... Node foo = new Node(); If multiple threads tried to create a new Node and if one of them was suspended by the OS in the middle of allocation, would it block other threads from making progress? The reason why I ask is because I had a concurrent data structure that created new nodes. I then modified the algorithm to recycle the nodes. The throughput performance of the two algorithms was virtually identical on a 24 core machine. However, I then created an interference program that ran on all the system cores in order to create as much OS pre-emption as possible. The throughput performance of the algorithm that created new nodes decreased by a factor of 5 relative the the algorithm that recycled nodes. I'm curious to know why this would occur. Thanks. *Edit : pointing me to the code for the c++ memory allocator for linux would be helpful as well. I tried looking before posting this question, but had trouble finding it.

    Read the article

  • Why is my quick sort so slow?

    - by user513075
    Hello, I am practicing writing sorting algorithms as part of some interview preparation, and I am wondering if anybody can help me spot why this quick sort is not very fast? It appears to have the correct runtime complexity, but it is slower than my merge sort by a constant factor of about 2. I would also appreciate any comments that would improve my code that don't necessarily answer the question. Thanks a lot for your help! Please don't hesitate to let me know if I have made any etiquette mistakes. This is my first question here. private class QuickSort implements Sort { @Override public int[] sortItems(int[] ts) { List<Integer> toSort = new ArrayList<Integer>(); for (int i : ts) { toSort.add(i); } toSort = partition(toSort); int[] ret = new int[ts.length]; for (int i = 0; i < toSort.size(); i++) { ret[i] = toSort.get(i); } return ret; } private List<Integer> partition(List<Integer> toSort) { if (toSort.size() <= 1) return toSort; int pivotIndex = myRandom.nextInt(toSort.size()); Integer pivot = toSort.get(pivotIndex); toSort.remove(pivotIndex); List<Integer> left = new ArrayList<Integer>(); List<Integer> right = new ArrayList<Integer>(); for (int i : toSort) { if (i > pivot) right.add(i); else left.add(i); } left = partition(left); right = partition(right); left.add(pivot); left.addAll(right); return left; } }

    Read the article

< Previous Page | 90 91 92 93 94 95 96 97 98 99 100 101  | Next Page >