number systems - Page 447

How to reference other documents in a CouchDB view (join-like functionality)

- by Surfrdan

We have a CouchDB representation of an XML database, which we use to power a JavaScript-based front end for manipulating the XML documents. The basic structure is a simple three-level hierarchy, i.e. A - B - C:

A: parent document (type A)
B: any number of child documents of parent type A
C: any number of child documents of parent type B

We represent these three document types in CouchDB with a 'type' attribute, e.g.:

    {
      "_id":"llgc-id:433",
      "_rev":"1-3760f3e01d7752a7508b047e0d094301",
      "type":"A",
      "label":"Top Level A document",
      "logicalMap":{
        "issues":{
          "1":{ "URL":"http://hdl.handle.net/10107/434-0", "FILE":"llgc-id:434" },
          "2":{ "URL":"http://hdl.handle.net/10107/467-0", "FILE":"llgc-id:467" etc... }
        }
      }
    }

    {
      "_id":"llgc-id:433",
      "_rev":"1-3760f3e01d7752a7508b047e0d094301",
      "type":"B",
      "label":"a B document"
    }

What I want to do is produce a view which returns documents just like the A type, but which includes the label attribute from the B document within the logicalMap list, e.g.:

    {
      "_id":"llgc-id:433",
      "_rev":"1-3760f3e01d7752a7508b047e0d094301",
      "type":"A",
      "label":"Top Level A document",
      "logicalMap":{
        "issues":{
          "1":{ "URL":"http://hdl.handle.net/10107/434-0", "FILE":"llgc-id:434", "LABEL":"a B document" },
          "2":{ "URL":"http://hdl.handle.net/10107/467-0", "FILE":"llgc-id:467", "LABEL":"another B document" etc... }
        }
      }
    }

I'm struggling to get my head around the best way to do this. It looks like it should be fairly simple, though!
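The desired output can be illustrated outside CouchDB with a small client-side merge. This is only a sketch of the target shape, not a CouchDB view: it assumes the A document and the B documents have already been fetched, and the helper name `attach_labels` and the `docs_by_id` lookup table are hypothetical.

```python
import copy


def attach_labels(a_doc, docs_by_id):
    """Return a copy of a type-A document with each issue's B label merged in.

    docs_by_id is a hypothetical dict mapping a document _id to the
    already-fetched document; it is not part of any CouchDB API.
    """
    merged = copy.deepcopy(a_doc)  # leave the caller's document untouched
    for issue in merged["logicalMap"]["issues"].values():
        b_doc = docs_by_id.get(issue["FILE"])
        if b_doc is not None:
            issue["LABEL"] = b_doc["label"]
    return merged


a_doc = {
    "_id": "llgc-id:433",
    "type": "A",
    "label": "Top Level A document",
    "logicalMap": {"issues": {
        "1": {"URL": "http://hdl.handle.net/10107/434-0", "FILE": "llgc-id:434"},
        "2": {"URL": "http://hdl.handle.net/10107/467-0", "FILE": "llgc-id:467"},
    }},
}
# Illustrative B documents keyed by the ids that the A document references.
b_docs = {
    "llgc-id:434": {"type": "B", "label": "a B document"},
    "llgc-id:467": {"type": "B", "label": "another B document"},
}
result = attach_labels(a_doc, b_docs)
```

Issues whose FILE id is missing from the lookup are left without a LABEL rather than raising an error, which may or may not be the behaviour wanted here.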

Sharing authentication between forum and main CMS in Rails

- by Newy

I have a Rails forum product that resides under the subdomains of my customers (e.g. http://forum.customer.com). Their main site has a CMS and an authentication system, and my forum product has a separate authentication system. Is there an elegant way to have "cross-signins" across these systems? I want someone already logged into the main CMS to transition into my product as seamlessly as possible.

jquery calculation sum two different type of item

- by st4n

I'm writing a script like the example shown in the demo, where all of the "Total" values (including the "Grand Total") are automatically calculated using the calc() method, at this link:

But I have some fields where I want to apply the equation qty * price, and others where I want to do other operations. Can you tell me how? Thank you very much.

I tried the following, but it is very clumsy code, and the grandTotal does not sum the two different kinds of field:

    function recalc() {
      $("[id^=total_item]").calc("qty * price",
        {
          qty: $("input[name^=qty_item_]"),
          price: $("[id^=price_item_]")
        },
        function (s) {
          // return the number as a dollar amount
          return "$" + s.toFixed(2);
        },
        function ($this) {
          // sum the total of the $("[id^=total_item]") selector
          var sum = $this.sum();
          $("#grandTotal").text(
            // round the results to 2 digits
            "$" + sum.toFixed(2)
          );
        }
      );
      $("[id^=total_otheritem]").calc("qty1 / price1",
        {
          qty1: $("input[name^=qty_other_]"),
          price1: $("[id^=price_other_]")
        },
        function (s) {
          // return the number as a dollar amount
          return "$" + s.toFixed(2);
        },
        function ($this) {
          var sum = $this.sum();
          $("#grandTotal").text(
            // round the results to 2 digits
            "$" + sum.toFixed(2)
          );
        }
      );
    }

Start app from within python

- by Aaron Hoffman

Hello, I'm trying to start an application using Python. I've seen that some people use startfile but I also read that it only works with Windows. I'm using Mac systems and hoping for it to work with them. Thanks, Aaron
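One possible approach on a Mac is to hand the application path to the system `open` command through `subprocess`. The sketch below assumes that approach; the helper names `launch_command` and `launch` are made up for illustration, and the TextEdit path is only an example target.

```python
import os
import subprocess
import sys


def launch_command(target, platform=sys.platform):
    """Build the argv list that asks the OS to open `target`.

    Hypothetical helper: macOS ships an `open` command and most Linux
    desktops provide `xdg-open`; other platforms are not handled here.
    """
    if platform == "darwin":
        return ["open", target]
    if platform.startswith("linux"):
        return ["xdg-open", target]
    raise ValueError("no launcher command known for platform %r" % platform)


def launch(target):
    """Start an application or open a document with its default handler."""
    if sys.platform.startswith("win"):
        os.startfile(target)  # only available on Windows
    else:
        subprocess.check_call(launch_command(target))


# Example (not executed here): launch("/Applications/TextEdit.app")
```

Keeping the command construction separate from the call that runs it makes the platform logic easy to check without actually starting anything.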

In Ruby, how does one get their IP octet without going through DNS?

- by user30997

I can, on some of my systems, get my IP address (192.68.m.n format) by doing this:

    addr = IPSocket::getaddress(Socket.gethostname())

...the trouble is that this only works if the name the local machine uses for itself is the name the DNS server associates with it. How *&#( hard can it be for Ruby to just return its primary interface's IP address? I have to do this in a platform-independent way, or I'd just call ifconfig or ipconfig and parse the output.
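A commonly used trick that avoids DNS is to "connect" a UDP socket to some routable address and then ask the socket which local address the OS picked. The sketch below shows the idea in Python rather than Ruby; the function name `primary_ip`, the probe address (a documentation-only address) and the loopback fallback are all assumptions made for illustration. Connecting a UDP socket sends no packets; it only makes the OS choose a route.

```python
import socket


def primary_ip(probe_host="192.0.2.1", probe_port=9):
    """Return the local IPv4 address the OS would use for outbound traffic.

    probe_host is an arbitrary routable address (an assumption); nothing
    is actually sent to it. Falls back to loopback if no route exists.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.connect((probe_host, probe_port))
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"
    finally:
        sock.close()


address = primary_ip()
```

On a machine with several interfaces this returns whichever one the routing table selects for the probe address, which is usually, but not necessarily, the "primary" one.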

How does DateTime.Now affect query plan caching in SQL Server?

- by Bill Paetzke

Question: Does passing DateTime.Now as a parameter to a proc prevent SQL Server from caching the query plan? If so, is the web app missing out on huge performance gains?

Possible Solution: I thought DateTime.Today.AddDays(1) would be a possible solution. It would pass the same end date to the SQL proc (per day), and the user would still get the latest data. Please speak to this as well.

Given Example: Let's say we have a stored procedure. It reports data back to a user on a webpage. The user can set a date range. If the user sets today's date as the "end date," which includes today's data, the web app passes DateTime.Now to the SQL proc. Let's say that one user runs a report (5/1/2010 to now) over and over, several times. On the webpage, the user sees 5/1/2010 to 5/4/2010. But the web app passes DateTime.Now to the SQL proc as the end date. So the end date in the proc will always be different, although the user is querying a similar date range. Assume the number of records in the table and the number of users are large, so any performance gains matter. Hence the importance of the question.

Example proc and execution (if that helps to understand):

    CREATE PROCEDURE GetFooData
        @StartDate datetime,
        @EndDate datetime
    AS
    SELECT * FROM Foo
    WHERE LogDate >= @StartDate AND LogDate < @EndDate

Here's a sample execution using DateTime.Now:

    EXEC GetFooData '2010-05-01', '2010-05-04 15:41:27' -- passed in DateTime.Now

Here's a sample execution using DateTime.Today.AddDays(1):

    EXEC GetFooData '2010-05-01', '2010-05-05' -- passed in DateTime.Today.AddDays(1)

The same data is returned by both executions, since the current time is 2010-05-04 15:41:27.
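The proposed rounding of the end date can be sketched as follows. This is a Python illustration of the DateTime.Today.AddDays(1) idea only, with a made-up function name; it says nothing about how SQL Server caches plans.

```python
from datetime import datetime, time, timedelta


def exclusive_end_of_day(now):
    """Round `now` up to the following midnight.

    Intended for use as the exclusive upper bound in a filter such as
    LogDate >= start AND LogDate < end.
    """
    return datetime.combine(now.date() + timedelta(days=1), time.min)


# Two requests made at different times on the same day...
first = exclusive_end_of_day(datetime(2010, 5, 4, 15, 41, 27))
second = exclusive_end_of_day(datetime(2010, 5, 4, 9, 0, 0))
# ...yield the same end-date parameter: midnight at the start of 2010-05-05.
```

Because the comparison in the proc is a strict "less than", midnight of the next day still covers every row logged today, which is why both sample executions return the same data.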

Load images into separate movie clips from a XML, Flash, Actionscript 3.0

- by James Dunay

I have an XML image bank, pretty standard, and I have a loader, along with movie clips that I want the images loaded into. The problem I am running into is that I want the images to load into separate movie clips, so I’m using a case statement to specify where they go. However, I can only get them to load into a single movie clip. I assume they are loading on top of each other, and I don’t know how to get them to separate out. I’ll post my code. It doesn’t make any sense to me, but if you have any suggestions that would be great. I can make separate loaders and then just do one image per loader, but that just doesn’t sound right to me.

    var counterNumber:Number = 0;

    function callThumbs():void{
        for (var i:Number = 0; i <3; i++){
            thumbLoaded();
            counterNumber++;
        }
    }

    function thumbLoaded(){
        var photoLoader = new Loader();
        switch (counterNumber){
            case 1:
                photoLoader.load(new URLRequest(MovieClip(this.parent).xml.photos.imageOne.image.@url[0]));
                whole.boxOne.pictureLoader.addChild(photoLoader);
                trace("1Done");
                break;
            case 2:
                photoLoader.load(new URLRequest(MovieClip(this.parent).xml.photos.imageTwo.image.@url[0]));
                whole.boxTwo.pictureLoader.addChild(photoLoader);
                trace("2Done");
                break;
        }
    }

Why is git better than Subversion?

- by Ben Mills

I've been using Subversion for a few years and after using SourceSafe, I just love Subversion. Combined with TortoiseSVN, I can't really imagine how it could be any better. Yet there's a growing number of developers claiming that Subversion has problems and that we should be moving to the new breed of distributed version control systems, such as Git. Can anyone explain how Git improves upon Subversion?

Dynamically generate client-side HTML form control using JavaScript and server-side Python code in Google App Engine

- by gisc

I have the following client-side front-end HTML using the Jinja2 template engine:

    {% for record in result %}
    <textarea name="remark">{{ record.remark }}</textarea>
    <input type="submit" name="approve" value="Approve" />
    {% endfor %}

Thus the HTML may show more than one set of textarea and submit button. The back-end Python code retrieves a variable number of records from a GQL query using the model, and passes this to the Jinja2 template in result. When a submit button is clicked, it triggers the post method to update the record:

    def post(self):
        if self.request.get('approve'):
            updated_remark = self.request.get('remark')
            record.remark = db.Text(updated_remark)
            record.put()

However, in some instances, the record updated is NOT the one that corresponds to the submit button clicked (e.g. if a user clicks on the record 1 submit button, the record 2 remark gets updated, but not record 1). I gather that this is due to the duplicate attribute name remark. I can possibly use JavaScript/jQuery to generate different attribute names. The question is, how do I code the back-end Python to get the (variable number of) names generated by the JavaScript? Thanks.

Jquery Sorting by Letter

- by Batfan

I am using jQuery to sort through a group of paragraph tags (kudos to Aaron Harun). It pulls the value "letter" (a letter) from the URL string and displays only paragraphs that start with that letter. It hides all others and also consolidates the list so that there are no duplicates showing. See the code:

    var letter = '<?php echo(strlen($_GET['letter']) == 1) ? $_GET['letter'] : ''; ?>'

    function finish(){
        var found_first = [];
        jQuery('p').each(function(){
            if(jQuery(this).text().substr(0,1).toUpperCase() == letter){
                if(found_first[jQuery(this).text()] != true){
                    jQuery(this).addClass('current-series');
                    found_first[jQuery(this).text()] = true;
                }else{
                    jQuery(this).hide();
                }
            }
            else{ jQuery(this).hide();}
        })
    }

I've been working with this all day and I have two questions:

1. Is there a way to get it to ignore the word 'The' if it comes first? For example, if a paragraph starts with 'The Amazing', I would like it to show up on the 'A' page, not the 'T' page as it currently does.
2. Is there a way to have a single page for all numbers? For example, the URL to the page would be something similar to domain.com/index.php?letter=0, and this would show only the paragraph tags that start with a number, any number. I can currently do this with single numbers, but I would like one page for all numbers.
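Both questions come down to computing a "bucket" key for each paragraph before comparing it with the requested letter. The sketch below shows that key function in Python rather than jQuery; the name `bucket`, the use of "0" as the shared key for all digits, and the sample titles are assumptions for illustration.

```python
import re

# A leading "The" followed by whitespace, in any letter case.
_LEADING_THE = re.compile(r"^the\s+", re.IGNORECASE)


def bucket(text):
    """Return the index key for a paragraph: 'A'-'Z', '0' for any digit."""
    stripped = _LEADING_THE.sub("", text.strip())
    if not stripped:
        return ""
    first = stripped[0]
    if first.isdigit():
        return "0"  # all numbers share one page
    return first.upper()


titles = ["The Amazing Race", "Theory of Everything", "24", "7th Heaven", "alias"]
keys = [bucket(t) for t in titles]
```

Requiring whitespace after "the" keeps words such as "Theory" on the 'T' page instead of stripping their first three letters.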

Why is calling close() after fopen() not closing?

- by Richard Morgan

I ran across the following code in one of our in-house DLLs and I am trying to understand the behavior it was showing:

    long GetFD(long* fd, const char* fileName, const char* mode)
    {
        string fileMode;
        if (strlen(mode) == 0 || tolower(mode[0]) == 'w' || tolower(mode[0]) == 'o')
            fileMode = string("w");
        else if (tolower(mode[0]) == 'a')
            fileMode = string("a");
        else if (tolower(mode[0]) == 'r')
            fileMode = string("r");
        else
            return -1;

        FILE* ofp;
        ofp = fopen(fileName, fileMode.c_str());
        if (! ofp)
            return -1;

        *fd = (long)_fileno(ofp);
        if (*fd < 0)
            return -1;
        return 0;
    }

    long CloseFD(long fd)
    {
        close((int)fd);
        return 0;
    }

After repeated calls to GetFD, each with the appropriate CloseFD, the whole DLL would no longer be able to do any file IO. I wrote a tester program and found that I could call GetFD 509 times, but the 510th call would fail. Using Process Explorer, I saw that the number of Handles did not increase. So it seems that the DLL is reaching the limit for the number of open files; setting _setmaxstdio(2048) does increase the number of times we can call GetFD. Obviously, the close() isn't working quite right. After a bit of searching, I replaced the fopen() call with:

    long GetFD(long* fd, const char* fileName, const char* mode)
    {
        *fd = (long)open(fileName, 2);
        if (*fd < 0)
            return -1;
        return 0;
    }

Now, repeatedly calling GetFD/CloseFD works. What is going on here?

What customer support alternatives like groovehq i can use for a site

- by rmarimon

I've been experimenting with GrooveHQ as a means to provide support to my clients. They have a very nice idea and have developed it beautifully. In the end, it is just a ticketing system with multiple channels for communicating with your clients. It is like the rt of our times. What I'm looking for is other providers of this kind of hosted multi-channel ticketing system. I'm not sure if this belongs on SO, but hey...

Why Does My Vector<PEVENTLOGRECORD> Mysteriously Get Cleared?

- by Eric

Hello everyone, I am making a program that reads and stores data from Windows EventLog files (.evt) in C++. I am using the calls OpenBackupEventLog(ServerName, FileName) and ReadEventLog(...). I am also using this: PEVENTLOGRECORD. Anyway, without supplying all of the code, here is the basic idea:

1. I get a handle to the .evt file using OpenBackupEventLog(), passing in a file name.
2. I then use ReadEventLog() to fill up a buffer with an unknown number of EventLog messages.
3. I traverse the buffer and add each message to a vector.
4. I keep filling up buffers (repeating steps 2 and 3) until I reach the end of the file.

Here is my code for filling the vector:

    vector<PEVENTLOGRECORD> allRecords;
    while(_status == ERROR_SUCCESS)
    {
        if(!ReadEventLog(...))
            CheckStatus();
        else
            FillVectorFromBuffer(allRecords);
    }

    // Function FillVectorFromBuffer
    void FillVectorFromBuffer(vector<PEVENTLOGRECORD> &allRecords)
    {
        int bytesExamined = 0;
        PBYTE pRecord = (PBYTE)_lpBuffer; // This is one of the params in ReadEventLog()
        while(bytesExamined < _pnBytesRead) // Another param from ReadEventLog
        {
            PEVENTLOGRECORD currentRecord = (PEVENTLOGRECORD)(pRecord);
            allRecords.push_back(currentRecord);
            pRecord += currentRecord->Length;
            bytesExamined += currentRecord->Length;
        }
    }

Anyway, whenever I run this, it gets all the EventLogs in the file, and the vector has everything I want it to. But as soon as the line if(!ReadEventLog()) gets called and evaluates to true (i.e. ReadEventLog() returns false), every field in my vector gets set to zero. The vector still contains the correct number of elements; it's just that all of the fields in the PEVENTLOGRECORD structs are now zero. Does anyone with better debugging experience have any ideas? Thanks.

trying to append a list, but something breaks

- by romunov

I'm trying to create an empty list which will have as many elements as there are num.of.walkers. I then try to append, to each created element, a new sub-list (the length of the new sub-list corresponds to a value in a). When I fiddle around in R, everything goes smoothly:

    list.of.dist[[1]] <- vector("list", a[1])
    list.of.dist[[2]] <- vector("list", a[2])
    list.of.dist[[3]] <- vector("list", a[3])
    list.of.dist[[4]] <- vector("list", a[4])

I then try to write a function. Here is my feeble attempt, which results in an error. Can someone chip in with what I am doing wrong?

    countNumberOfWalks <- function(walk.df) {
        list.of.walkers <- sort(unique(walk.df$label))
        num.of.walkers <- length(unique(walk.df$label))

        #Pre-allocate objects for further manipulation
        list.of.dist <- vector("list", num.of.walkers)
        a <- c()

        # Count the number of walks per walker.
        for (i in list.of.walkers) {
            a[i] <- nrow(walk.df[walk.df$label == i,])
        }
        a <- as.vector(a)

        # Add a sublist (length = number of walks) for each walker.
        for (i in i:num.of.walkers) {
            list.of.dist[[i]] <- vector("list", a[i])
        }
        return(list.of.dist)
    }

    > num.of.walks.per.walker <- countNumberOfWalks(walk.df)
    Error in vector("list", a[i]) : vector size cannot be NA

HTML5 or Flash?

- by lewiguez

I have to write a web application for a client soon. Looking at the specs, there is no reason why the project couldn't be an HTML5/CSS/Javascript project, but the client is arguing that it has to be Flash. The project has a number of dynamic elements and is web-based. It'll only be used in-house by a small number of people and all of those people use either Google Chrome or Safari 4. They are all pretty tech-savvy to boot. My question is this: what are some of the reasons (preferably technical since this is Stack Overflow) I can present to my client as to why HTML5 is better than Flash (that's assuming I'm right and it is in this case)? Is it OK to use HTML5 even though it's still a draft spec (I'm assuming it is after checking out all those Apple HTML5 demos a few days ago)? Also, would a hybrid approach be preferable for now? Something that uses Flash wherever the canvas object would've been used in the HTML5 approach and that conforms to a normal XHTML approach. Help!

Is there any cross-platform threading library in C or C++?

- by NumberFour

Hello, I'm looking for some easy to use cross-platform threading library written in C or C++. What's your opinion on boost::thread or Pthreads? Does Pthreads run only on POSIX compliant systems? What about the threading support in the Qt library? Thanks for any hints.

How to associate a file extension to a program without making it the default program

- by CharlesB

I'm deploying a small conversion tool on some systems, and want the users to be able to run it from the right click Open with menu. But I don't want to change the default program users have associated to this file type. It is easy to associate a file extension/type to a program, but how to do it without changing the default program?

How do I create rows with alternating colors for a UITableView on iPhone?

- by Mat

Hi all, I would like to alternate two row colors, e.g. the first row black, the second white, the third black, and so on. My approach is like the basic programming exercise of checking whether a number is odd or not:

    - (UITableViewCell *)tableView:(UITableView *)tableView cellForRowAtIndexPath:(NSIndexPath *)indexPath {
        static NSString *CellIdentifier = @"Cell";
        cell = ((MainCell*)[tableView dequeueReusableCellWithIdentifier:CellIdentifier]);
        if (cell==nil) {
            NSArray *topLevelObjects=[[NSBundle mainBundle] loadNibNamed:@"MainCell" owner:self options:nil];
            for (id currentObject in topLevelObjects){
                if ([currentObject isKindOfClass:[UITableViewCell class]]){
                    if ((indexPath.row % 2)==0) {
                        [cell.contentView setBackgroundColor:[UIColor purpleColor]];
                    }else{
                        [cell.contentView setBackgroundColor:[UIColor whiteColor]];
                    }
                    cell = (MainCell *) currentObject;
                    break;
                }
            }
        }else {
            AsyncImageView* oldImage = (AsyncImageView*) [cell.contentView viewWithTag:999];
            [oldImage removeFromSuperview];
        }
        return cell;
    }

The problem is that when I scroll rapidly, the backgrounds of the cells get out of order (for example, the last two cells black and the first two white, or something like that), but if I scroll slowly it works fine. I think the problem is the cache of reusable cells. Any ideas? TIA

one two-directed tcp socket OR two one-directed? (linux, high volume, low latency)

- by osgx

Hello,

I need to exchange a high volume of data periodically, with the lowest possible latency, between two machines. The network is rather fast (e.g. 1 Gbit or even 2G+). The OS is Linux. Would it be faster to use one TCP socket (for both send and recv) or two unidirectional TCP sockets?

The test for this task is very much like the NetPIPE network benchmark: measure latency and bandwidth for sizes from 2^1 up to 2^13 bytes, with each size sent and received at least 3 times (in the real task the number of sends is greater; both processes will be sending and receiving, maybe like ping-pong).

The benefit of two unidirectional connections comes from Linux: http://lxr.linux.no/linux+v2.6.18/net/ipv4/tcp_input.c#L3847

    /*
     * TCP receive function for the ESTABLISHED state.
     *
     * It is split into a fast path and a slow path. The fast path is
     * disabled when:
     ...
     * - Data is sent in both directions. Fast path only supports pure senders
     *   or pure receivers (this means either the sequence number or the ack
     *   value must stay constant)
     ...
     *
     * When these conditions are not satisfied it drops into a standard
     * receive procedure patterned after RFC793 to handle all cases.
     * The first three cases are guaranteed by proper pred_flags setting,
     * the rest is checked inline. Fast processing is turned on in
     * tcp_data_queue when everything is OK.

All the other conditions for disabling the fast path are false in my case, so only a socket that is not unidirectional stops the kernel from taking the fast path on receive.

How do I detect proximity of the mouse pointer to a line in Flex?

- by Hanno Fietz

I'm working on a charting UI in Flex. One of the features I want to implement is "snapping" of the mouse pointer to the data points in the diagram. I.e., if the user hovers the mouse pointer over a line diagram and gets close to a data point, I want the pointer to move to the exact coordinates and show a marker.

Currently, the lines are drawn on a Shape, using the Graphics API. The Shape is a child DisplayObject of a custom UIComponent subclass with the exact same dimensions. This means I already get mouseOver events on the parent of the diagram's canvas. Now I need a way to detect whether the pointer is close to one of the data points. I.e., upon each move of the mouse, I need an answer to the question "Which data points lie within a radius of x pixels from my current position, and which of them is closest?"

I can think of the following possibilities:

- Draw the lines not as simple lines in the Graphics API, but as more advanced objects that can have their own mouseOver events. However, I want the snapping to trigger before the mouse is actually over the line.
- Check the original data for possible candidates upon each mouse movement. Using binary search, I might be able to reduce the number of items I have to compare sufficiently.
- Prepare some kind of new data structure from the raw data that makes the above search more efficient. I don't know what that would look like.

I'm guessing this is a pretty standard problem for a number of applications, but the actual code is probably usually inside some framework. Is there anything I can read about this topic?
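The binary-search idea from the second possibility can be sketched as below. The example is written in Python rather than ActionScript, assumes the data points are already in screen coordinates and sorted by x, and uses a made-up function name `nearest_within`.

```python
from bisect import bisect_left, bisect_right
from math import hypot


def nearest_within(points, mouse_x, mouse_y, radius):
    """Return the point closest to the mouse within `radius`, or None.

    `points` must be a list of (x, y) tuples sorted by x.
    """
    # In real code the x-only list would be built once, not on every move.
    xs = [x for x, _ in points]
    # Binary search narrows the candidates to the vertical strip
    # [mouse_x - radius, mouse_x + radius].
    lo = bisect_left(xs, mouse_x - radius)
    hi = bisect_right(xs, mouse_x + radius)
    best, best_dist = None, radius
    for x, y in points[lo:hi]:
        dist = hypot(x - mouse_x, y - mouse_y)
        if dist <= best_dist:
            best, best_dist = (x, y), dist
    return best


points = [(0, 0), (10, 5), (20, 0), (30, 8)]
hit = nearest_within(points, 11, 6, 5)    # near (10, 5)
miss = nearest_within(points, 15, 50, 5)  # nothing within 5 pixels
```

Only the points inside the strip are measured exactly, so the per-move cost depends on how many points fall near the pointer rather than on the size of the whole data set.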

is there a better way to write this frankenstein LINQ query that searches for values in a child tabl

- by MRV

I have a table of Users and a one-to-many UserSkills table. I need to be able to search for users based on skills. This query takes a list of desired skills and searches for users who have those skills. I want to sort the users based on the number of desired skills they possess. So if a user has only 1 of 3 desired skills, he will be further down the list than a user who has 3 of 3 desired skills.

I start with my comma-separated list of skill IDs that are being searched for:

    List<short> searchedSkillsRaw = skills.Value.Split(',').Select(i => short.Parse(i)).ToList();

I then filter out only the types of users that are searchable:

    List<User> users = (from u in db.Users
                        where u.Verified == true && u.Level > 0 && u.Type == 1
                              && (u.UserDetail.City == city.SelectedValue || u.UserDetail.City == null)
                        select u).ToList();

and then comes the crazy part:

    var fUsers = from u in users
                 select new
                 {
                     u.Id,
                     u.FirstName,
                     u.LastName,
                     u.UserName,
                     UserPhone = u.UserDetail.Phone,
                     UserSkills = (from uskills in u.UserSkills
                                   join skillsJoin in configSkills on uskills.SkillId equals skillsJoin.ValueIdInt into tempSkills
                                   from skillsJoin in tempSkills.DefaultIfEmpty()
                                   where uskills.UserId == u.Id
                                   select new
                                   {
                                       SkillId = uskills.SkillId,
                                       SkillName = skillsJoin.Name,
                                       SkillNameFound = searchedSkillsRaw.Contains(uskills.SkillId)
                                   }),
                     UserSkillsFound = (from uskills in u.UserSkills
                                        where uskills.UserId == u.Id && searchedSkillsRaw.Contains(uskills.SkillId)
                                        select uskills.UserId).Count()
                 } into userResults
                 where userResults.UserSkillsFound > 0
                 orderby userResults.UserSkillsFound descending
                 select userResults;

and this works! But it seems super bloated and inefficient to me, especially the secondary part that counts the number of skills found. Thanks for any advice you can give. --r
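Stripped of the projection details, the ranking step is "count how many wanted skills each user has, drop users with none, sort by that count". A minimal Python sketch of just that step follows; the function name `rank_users` and the sample data are illustrative and not part of the original schema.

```python
def rank_users(user_skills, wanted):
    """Rank users by how many of the wanted skill ids they have.

    user_skills maps a user id to an iterable of skill ids.
    Returns (user_id, match_count) pairs, best match first; users
    with no matching skill are left out.
    """
    wanted = set(wanted)  # set membership tests instead of list scans
    scored = ((uid, len(wanted & set(skills))) for uid, skills in user_skills.items())
    matched = [(uid, count) for uid, count in scored if count > 0]
    return sorted(matched, key=lambda pair: pair[1], reverse=True)


ranking = rank_users(
    {"ann": [1, 2, 3], "bob": [2], "cy": [9], "dee": [3, 1, 7]},
    wanted=[1, 2, 3],
)
```

Holding the searched skills in a set keeps each membership test cheap, which is the same concern as repeatedly calling Contains on a list in the query above.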

Drawing and filling different polygons at the same time in MATLAB

- by Hossein

Hi, I have the code below. It loads a CSV file into memory. This file contains the coordinates of different polygons. Each row of the file has X,Y coordinates and a string which tells which polygon the data point belongs to. For example, a polygon named "Poly1" with 100 data points has 100 rows in this file, like:

    Poly1,X1,Y1
    Poly1,X2,Y2
    ...
    Poly1,X100,Y100
    Poly2,X1,Y1
    .....

The Index.csv file has the number of data points (number of rows) for each polygon in the file Polygons.csv. These details are not important. The thing is: I can successfully extract the data points for each polygon using the code below. However, when I plot, the lines of different polygons are connected to each other and the plot looks crappy. I need the polygons to be separated (although they are connected and overlap in some areas). I thought that by using "fill" I could actually see them better, but "fill" just fills every polygon that it can find, and that is not desirable. I only want to fill the inside of the polygons. Can someone help me? I can also send you my data points if necessary; they are less than 200 KB. Thanks.

    [coordinates,routeNames,polygonData] = xlsread('Polygons.csv');
    index = dlmread('Index.csv');
    firstPointer = 0
    lastPointer = index(1)
    for Counter=2:size(index)
        firstPointer = firstPointer + index(Counter) + 1
        hold on
        plot(coordinates(firstPointer:lastPointer,2),coordinates(firstPointer:lastPointer,1),'r-')
        lastPointer = lastPointer + index(Counter)
    end

javascript hide/show tabs using JQuery

- by JohnMerlino

Hey all, I have a quick question about how I can use jQuery tabs (you click on a link button to display/hide certain divs). The div id matches the href of the link.

HTML links:

    <table class='layout tabs'>
      <tr>
        <td><a href="#site">Site</a></td>
        <td><a href="#siteno">Number</a></td>
      </tr>
      <tr>
        <td><a href="#student">Student</a></td>
        <td><a href="#school">School</a></td>
      </tr>
    </table>
    </div>

div that needs to display/hide:

    <div id="site">
      <table class='explore'>
        <thead class='ui-widget-header'>
          <tr>
            <th class=' sortable'> Site </th>
            <th class=' sortable'> Number </th>
          </tr>
        </thead>
      </table>
    </div>

Thanks for any response.

Non standard interaction among two tables to avoid very large merge

- by riko

Suppose I have two tables A and B. Table A has a multi-level index (a, b) and one column (ts). b uniquely determines ts.

    A = pd.DataFrame(
        [('a', 'x', 4), ('a', 'y', 6), ('a', 'z', 5),
         ('b', 'x', 4), ('b', 'z', 5),
         ('c', 'y', 6)],
        columns=['a', 'b', 'ts']).set_index(['a', 'b'])
    AA = A.reset_index()

Table B is another one-column (ts) table with a non-unique index (a). The ts values are sorted "inside" each group, i.e., B.ix[x] is sorted for each x. Moreover, there is always a value in B.ix[x] that is greater than or equal to the values in A.

    B = pd.DataFrame(
        dict(a=list('aaaaabbcccccc'),
             ts=[1, 2, 4, 5, 7, 7, 8, 1, 2, 4, 5, 8, 9])).set_index('a')

The semantics of this is that B contains observations of occurrences of an event whose type is indicated by the index. I would like to find, from B, the timestamp of the first occurrence of each event type after the timestamp specified in A for each value of b. In other words, I would like to get a table with the same shape as A that, instead of ts, contains the "minimum value occurring after ts" as specified by table B. So, my goal would be:

    C:
    ('a', 'x') 4
    ('a', 'y') 7
    ('a', 'z') 5
    ('b', 'x') 7
    ('b', 'z') 7
    ('c', 'y') 8

I have some working code, but it is terribly slow.

    C = AA.apply(lambda row: (
        row[0], row[1],
        B.ix[row[0]].irow(np.searchsorted(B.ts[row[0]], row[2]))), axis=1).set_index(['a', 'b'])

Profiling shows the culprit is obviously B.ix[row[0]].irow(np.searchsorted(B.ts[row[0]], row[2]))). However, standard solutions using merge/join would take too much RAM in the long run. Consider that I now have 1000 a's, assume the average number of b's per a is constant (probably 100-200), and consider that the number of observations per a is probably on the order of 300. In production I will have 1000 times more a's. 1,000,000 x 200 x 300 = 60,000,000,000 rows may be a bit too much to keep in RAM, especially considering that the data I need is perfectly described by a C like the one discussed above. How would I improve the performance?
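The lookup itself can be illustrated without pandas: group B's timestamps by key once, then do one binary search per row of A. The following plain-Python sketch uses the standard `bisect` module on the sample data above; the function name `first_at_or_after` and the dict-based layout are assumptions, and it is meant to show the logic rather than to be a tuned pandas solution.

```python
from bisect import bisect_left


def first_at_or_after(a_rows, b_by_key):
    """For each (a, b) -> ts in a_rows, find the first B timestamp >= ts.

    b_by_key maps each a to its sorted list of timestamps. As in the
    question, a value >= ts is assumed to exist for every row.
    """
    result = {}
    for (a, b), ts in a_rows.items():
        observations = b_by_key[a]
        # Same semantics as np.searchsorted with its default side='left'.
        result[(a, b)] = observations[bisect_left(observations, ts)]
    return result


a_rows = {("a", "x"): 4, ("a", "y"): 6, ("a", "z"): 5,
          ("b", "x"): 4, ("b", "z"): 5, ("c", "y"): 6}
b_by_key = {"a": [1, 2, 4, 5, 7], "b": [7, 8], "c": [1, 2, 4, 5, 8, 9]}
c_table = first_at_or_after(a_rows, b_by_key)
```

No merged table is ever built, so memory use stays proportional to the sizes of A and B rather than to their product.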
