Search Results

Search found 3047 results on 122 pages for 'subset sum'.

Page 47/122 | < Previous Page | 43 44 45 46 47 48 49 50 51 52 53 54  | Next Page >

  • SQL SERVER – Use ROLL UP Clause instead of COMPUTE BY

    - by pinaldave
    Note: This upgrade was test performed on development server with using bits of SQL Server 2012 RC0 (which was available at in public) when this test was performed. However, SQL Server RTM (GA on April 1) is expected to behave similarly. I recently observed an upgrade from SQL Server 2005 to SQL Server 2012 with compatibility keeping at SQL Server 2012 (110). After upgrading the system and testing the various modules of the application, we quickly observed that few of the reports were not working. They were throwing error. When looked at carefully I noticed that it was using COMPUTE BY clause, which is deprecated in SQL Server 2012. COMPUTE BY clause is replaced by ROLL UP clause in SQL Server 2012. However there is no direct replacement of the code, user have to re-write quite a few things when using ROLL UP instead of COMPUTE BY. The primary reason is that how each of them returns results. In original code COMPUTE BY was resulting lots of result set but ROLL UP. Here is the example of the similar code of ROLL UP and COMPUTE BY. I personally find the ROLL UP much easier than COMPUTE BY as it returns all the results in single resultset unlike the other one. Here is the quick code which I wrote to demonstrate the said behavior. CREATE TABLE tblPopulation ( Country VARCHAR(100), [State] VARCHAR(100), City VARCHAR(100), [Population (in Millions)] INT ) GO INSERT INTO tblPopulation VALUES('India', 'Delhi','East Delhi',9 ) INSERT INTO tblPopulation VALUES('India', 'Delhi','South Delhi',8 ) INSERT INTO tblPopulation VALUES('India', 'Delhi','North Delhi',5.5) INSERT INTO tblPopulation VALUES('India', 'Delhi','West Delhi',7.5) INSERT INTO tblPopulation VALUES('India', 'Karnataka','Bangalore',9.5) INSERT INTO tblPopulation VALUES('India', 'Karnataka','Belur',2.5) INSERT INTO tblPopulation VALUES('India', 'Karnataka','Manipal',1.5) INSERT INTO tblPopulation VALUES('India', 'Maharastra','Mumbai',30) INSERT INTO tblPopulation VALUES('India', 'Maharastra','Pune',20) INSERT INTO tblPopulation VALUES('India', 'Maharastra','Nagpur',11 ) INSERT INTO tblPopulation VALUES('India', 'Maharastra','Nashik',6.5) GO SELECT Country,[State],City, SUM ([Population (in Millions)]) AS [Population (in Millions)] FROM tblPopulation GROUP BY Country,[State],City WITH ROLLUP GO SELECT Country,[State],City, [Population (in Millions)] FROM tblPopulation ORDER BY Country,[State],City COMPUTE SUM([Population (in Millions)]) BY Country,[State]--,City GO After writing this blog post I continuously feel that there should be some better way to do the same task. Is there any easier way to replace COMPUTE BY? Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • Your interesting code tricks/ conventions? [closed]

    - by Paul
    What interesting conventions, rules, tricks do you use in your code? Preferably some that are not so popular so that the rest of us would find them as novelties. :) Here's some of mine... Input and output parameters This applies to C++ and other languages that have both references and pointers. This is the convention: input parameters are always passed by value or const reference; output parameters are always passed by pointer. This way I'm able to see at a glance, directly from the function call, what parameters might get modified by the function: Inspiration: Old C code int a = 6, b = 7, sum = 0; calculateSum(a, b, &sum); Ordering of headers My typical source file begins like this (see code below). The reason I put the matching header first is because, in case that header is not self-sufficient (I forgot to include some necessary library, or forgot to forward declare some type or function), a compiler error will occur. // Matching header #include "example.h" // Standard libraries #include <string> ... Setter functions Sometimes I find that I need to set multiple properties of an object all at once (like when I just constructed it and I need to initialize it). To reduce the amount of typing and, in some cases, improve readability, I decided to make my setters chainable: Inspiration: Builder pattern class Employee { public: Employee& name(const std::string& name); Employee& salary(double salary); private: std::string name_; double salary_; }; Employee bob; bob.name("William Smith").salary(500.00); Maybe in this particular case it could have been just as well done in the constructor. But for Real WorldTM applications, classes would have lots more fields that should be set to appropriate values and it becomes unmaintainable to do it in the constructor. So what about you? What personal tips and tricks would you like to share?

    Read the article

  • Scheduling thread tiles with C++ AMP

    - by Daniel Moth
    This post assumes you are totally comfortable with, what some of us call, the simple model of C++ AMP, i.e. you could write your own matrix multiplication. We are now ready to explore the tiled model, which builds on top of the non-tiled one. Tiling the extent We know that when we pass a grid (which is just an extent under the covers) to the parallel_for_each call, it determines the number of threads to schedule and their index values (including dimensionality). For the single-, two-, and three- dimensional cases you can go a step further and subdivide the threads into what we call tiles of threads (others may call them thread groups). So here is a single-dimensional example: extent<1> e(20); // 20 units in a single dimension with indices from 0-19 grid<1> g(e);      // same as extent tiled_grid<4> tg = g.tile<4>(); …on the 3rd line we subdivided the single-dimensional space into 5 single-dimensional tiles each having 4 elements, and we captured that result in a concurrency::tiled_grid (a new class in amp.h). Let's move on swiftly to another example, in pictures, this time 2-dimensional: So we start on the left with a grid of a 2-dimensional extent which has 8*6=48 threads. We then have two different examples of tiling. In the first case, in the middle, we subdivide the 48 threads into tiles where each has 4*3=12 threads, hence we have 2*2=4 tiles. In the second example, on the right, we subdivide the original input into tiles where each has 2*2=4 threads, hence we have 4*3=12 tiles. Notice how you can play with the tile size and achieve different number of tiles. The numbers you pick must be such that the original total number of threads (in our example 48), remains the same, and every tile must have the same size. Of course, you still have no clue why you would do that, but stick with me. First, we should see how we can use this tiled_grid, since the parallel_for_each function that we know expects a grid. Tiled parallel_for_each and tiled_index It turns out that we have additional overloads of parallel_for_each that accept a tiled_grid instead of a grid. However, those overloads, also expect that the lambda you pass in accepts a concurrency::tiled_index (new in amp.h), not an index<N>. So how is a tiled_index different to an index? A tiled_index object, can have only 1 or 2 or 3 dimensions (matching exactly the tiled_grid), and consists of 4 index objects that are accessible via properties: global, local, tile_origin, and tile. The global index is the same as the index we know and love: the global thread ID. The local index is the local thread ID within the tile. The tile_origin index returns the global index of the thread that is at position 0,0 of this tile, and the tile index is the position of the tile in relation to the overall grid. Confused? Here is an example accompanied by a picture that hopefully clarifies things: array_view<int, 2> data(8, 6, p_my_data); parallel_for_each(data.grid.tile<2,2>(), [=] (tiled_index<2,2> t_idx) restrict(direct3d) { /* todo */ }); Given the code above and the picture on the right, what are the values of each of the 4 index objects that the t_idx variables exposes, when the lambda is executed by T (highlighted in the picture on the right)? If you can't work it out yourselves, the solution follows: t_idx.global       = index<2> (6,3) t_idx.local          = index<2> (0,1) t_idx.tile_origin = index<2> (6,2) t_idx.tile             = index<2> (3,1) Don't move on until you are comfortable with this… the picture really helps, so use it. Tiled Matrix Multiplication Example – part 1 Let's paste here the C++ AMP matrix multiplication example, bolding the lines we are going to change (can you guess what the changes will be?) 01: void MatrixMultiplyTiled_Part1(vector<float>& vC, const vector<float>& vA, const vector<float>& vB, int M, int N, int W) 02: { 03: 04: array_view<const float,2> a(M, W, vA); 05: array_view<const float,2> b(W, N, vB); 06: array_view<writeonly<float>,2> c(M, N, vC); 07: parallel_for_each(c.grid, 08: [=](index<2> idx) restrict(direct3d) { 09: 10: int row = idx[0]; int col = idx[1]; 11: float sum = 0.0f; 12: for(int i = 0; i < W; i++) 13: sum += a(row, i) * b(i, col); 14: c[idx] = sum; 15: }); 16: } To turn this into a tiled example, first we need to decide our tile size. Let's say we want each tile to be 16*16 (which assumes that we'll have at least 256 threads to process, and that c.grid.extent.size() is divisible by 256, and moreover that c.grid.extent[0] and c.grid.extent[1] are divisible by 16). So we insert at line 03 the tile size (which must be a compile time constant). 03: static const int TS = 16; ...then we need to tile the grid to have tiles where each one has 16*16 threads, so we change line 07 to be as follows 07: parallel_for_each(c.grid.tile<TS,TS>(), ...that means that our index now has to be a tiled_index with the same characteristics as the tiled_grid, so we change line 08 08: [=](tiled_index<TS, TS> t_idx) restrict(direct3d) { ...which means, without changing our core algorithm, we need to be using the global index that the tiled_index gives us access to, so we insert line 09 as follows 09: index<2> idx = t_idx.global; ...and now this code just works and it is tiled! Closing thoughts on part 1 The process we followed just shows the mechanical transformation that can take place from the simple model to the tiled model (think of this as step 1). In fact, when we wrote the matrix multiplication example originally, the compiler was doing this mechanical transformation under the covers for us (and it has additional smarts to deal with the cases where the total number of threads scheduled cannot be divisible by the tile size). The point is that the thread scheduling is always tiled, even when you use the non-tiled model. But with this mechanical transformation, we haven't gained anything… Hint: our goal with explicitly using the tiled model is to gain even more performance. In the next post, we'll evolve this further (beyond what the compiler can automatically do for us, in this first release), so you can see the full usage of the tiled model and its benefits… Comments about this post by Daniel Moth welcome at the original blog.

    Read the article

  • How to calculate the covariance in T-SQL

    - by Peter Larsson
    DECLARE @Sample TABLE         (             x INT NOT NULL,             y INT NOT NULL         ) INSERT  @Sample VALUES  (3, 9),         (2, 7),         (4, 12),         (5, 15),         (6, 17) ;WITH cteSource(x, xAvg, y, yAvg, n) AS (         SELECT  1E * x,                 AVG(1E * x) OVER (PARTITION BY (SELECT NULL)),                 1E * y,                 AVG(1E * y) OVER (PARTITION BY (SELECT NULL)),                 COUNT(*) OVER (PARTITION BY (SELECT NULL))         FROM    @Sample ) SELECT  SUM((x - xAvg) *(y - yAvg)) / MAX(n) AS [COVAR(x,y)] FROM    cteSource

    Read the article

  • How to discriminate from two nodes with identical frequencies in a Huffman's tree?

    - by Omega
    Still on my quest to compress/decompress files with a Java implementation of Huffman's coding (http://en.wikipedia.org/wiki/Huffman_coding) for a school assignment. From the Wikipedia page, I quote: Create a leaf node for each symbol and add it to the priority queue. While there is more than one node in the queue: Remove the two nodes of highest priority (lowest probability) from the queue Create a new internal node with these two nodes as children and with probability equal to the sum of the two nodes' probabilities. Add the new node to the queue. The remaining node is the root node and the tree is complete. Now, emphasis: Remove the two nodes of highest priority (lowest probability) from the queue Create a new internal node with these two nodes as children and with probability equal to the sum of the two nodes' probabilities. So I have to take two nodes with the lowest frequency. What if there are multiple nodes with the same low frequency? How do I discriminate which one to use? The reason I ask this is because Wikipedia has this image: And I wanted to see if my Huffman's tree was the same. I created a file with the following content: aaaaeeee nnttmmiihhssfffouxprl And this was the result: Doesn't look so bad. But there clearly are some differences when multiple nodes have the same frequency. My questions are the following: What is Wikipedia's image doing to discriminate the nodes with the same frequency? Is my tree wrong? (Is Wikipedia's image method the one and only answer?) I guess there is one specific and strict way to do this, because for our school assignment, files that have been compressed by my program should be able to be decompressed by other classmate's programs - so there must be a "standard" or "unique" way to do it. But I'm a bit lost with that. My code is rather straightforward. It literally just follows Wikipedia's listed steps. The way my code extracts the two nodes with the lowest frequency from the queue is to iterate all nodes and if the current node has a lower frequency than any of the two "smallest" known nodes so far, then it replaces the highest one. Just like that.

    Read the article

  • How to remove the boundary effects arising due to zero padding in scipy/numpy fft?

    - by Omkar
    I have made a python code to smoothen a given signal using the Weierstrass transform, which is basically the convolution of a normalised gaussian with a signal. The code is as follows: #Importing relevant libraries from __future__ import division from scipy.signal import fftconvolve import numpy as np def smooth_func(sig, x, t= 0.002): N = len(x) x1 = x[-1] x0 = x[0] # defining a new array y which is symmetric around zero, to make the gaussian symmetric. y = np.linspace(-(x1-x0)/2, (x1-x0)/2, N) #gaussian centered around zero. gaus = np.exp(-y**(2)/t) #using fftconvolve to speed up the convolution; gaus.sum() is the normalization constant. return fftconvolve(sig, gaus/gaus.sum(), mode='same') If I run this code for say a step function, it smoothens the corner, but at the boundary it interprets another corner and smoothens that too, as a result giving unnecessary behaviour at the boundary. I explain this with a figure shown in the link below. Boundary effects This problem does not arise if we directly integrate to find convolution. Hence the problem is not in Weierstrass transform, and hence the problem is in the fftconvolve function of scipy. To understand why this problem arises we first need to understand the working of fftconvolve in scipy. The fftconvolve function basically uses the convolution theorem to speed up the computation. In short it says: convolution(int1,int2)=ifft(fft(int1)*fft(int2)) If we directly apply this theorem we dont get the desired result. To get the desired result we need to take the fft on a array double the size of max(int1,int2). But this leads to the undesired boundary effects. This is because in the fft code, if size(int) is greater than the size(over which to take fft) it zero pads the input and then takes the fft. This zero padding is exactly what is responsible for the undesired boundary effects. Can you suggest a way to remove this boundary effects? I have tried to remove it by a simple trick. After smoothening the function I am compairing the value of the smoothened signal with the original signal near the boundaries and if they dont match I replace the value of the smoothened func with the input signal at that point. It is as follows: i = 0 eps=1e-3 while abs(smooth[i]-sig[i])> eps: #compairing the signals on the left boundary smooth[i] = sig[i] i = i + 1 j = -1 while abs(smooth[j]-sig[j])> eps: # compairing on the right boundary. smooth[j] = sig[j] j = j - 1 There is a problem with this method, because of using an epsilon there are small jumps in the smoothened function, as shown below: jumps in the smooth func Can there be any changes made in the above method to solve this boundary problem?

    Read the article

  • is there a formal algebra method to analyze programs?

    - by Gabriel
    Is there a formal/academic connection between an imperative program and algebra, and if so where would I learn about it? The example I'm thinking of is: if(C1) { A1(); A2(); } if(C2) { A1(); A2(); } Represented as a sum of terms: (C1)(A1) + (C1)(A2) + (C2)(A1) + (C2)(A2) = (C1+C2)(A1+A2) The idea being that manipulation could lead to programatic refactoring - "factoring" being the common concept in this example.

    Read the article

  • how to fix it =.= "Third party sources disabled"

    - by Long Pham
    When i tried to upgrade to 13.10, it was appearing on the screen while preparing to update "Third party sources disabled Some third party entries in your sources.list were disabled. You can re-enable them after the upgrade with the 'software-properties' tool or your package manager." After that it was on: W:Failed to fetch bzip2:/var/lib/apt/lists/partial/archive.ubuntu.com_ubuntu_dists_saucy_universe_source_Sources Hash Sum mismatch , E:Some index files failed to download. They have been ignored, or old ones used instead.

    Read the article

  • algorithm for project euler problem no 18

    - by Valentino Ru
    Problem number 18 from Project Euler's site is as follows: By starting at the top of the triangle below and moving to adjacent numbers on the row below, the maximum total from top to bottom is 23. 3 7 4 2 4 6 8 5 9 3 That is, 3 + 7 + 4 + 9 = 23. Find the maximum total from top to bottom of the triangle below: 75 95 64 17 47 82 18 35 87 10 20 04 82 47 65 19 01 23 75 03 34 88 02 77 73 07 63 67 99 65 04 28 06 16 70 92 41 41 26 56 83 40 80 70 33 41 48 72 33 47 32 37 16 94 29 53 71 44 65 25 43 91 52 97 51 14 70 11 33 28 77 73 17 78 39 68 17 57 91 71 52 38 17 14 91 43 58 50 27 29 48 63 66 04 68 89 53 67 30 73 16 69 87 40 31 04 62 98 27 23 09 70 98 73 93 38 53 60 04 23 NOTE: As there are only 16384 routes, it is possible to solve this problem by trying every route. However, Problem 67, is the same challenge with a triangle containing one-hundred rows; it cannot be solved by brute force, and requires a clever method! ;o) The formulation of this problems does not make clear if the "Traversor" is greedy, meaning that he always choosed the child with be higher value the maximum of every single walkthrough is asked The NOTE says, that it is possible to solve this problem by trying every route. This means to me, that is is also possible without! This leads to my actual question: Assumed that not the greedy one is the max, is there any algorithm that finds the max walkthrough value without trying every route and that doesn't act like the greedy algorithm? I implemented an algorithm in Java, putting the values first in a node structure, then applying the greedy algorithm. The result, however, is cosidered as wrong by Project Euler. sum = 0; void findWay(Node node){ sum += node.value; if(node.nodeLeft != null && node.nodeRight != null){ if(node.nodeLeft.value > node.nodeRight.value){ findWay(node.nodeLeft); }else{ findWay(node.nodeRight); } } }

    Read the article

  • Parent-child hierarchies and unary operators in PowerPivot

    - by Marco Russo (SQLBI)
    Alberto wrote an excellent post describing how to implement the Unary Operator feature (which is present in Analysis Services) in PowerPivot (there was a previous post about parent-child hierarchies, too). I have to say that the solution is not so easy to implement as in Analysis Services, but it just works and, from a practical point of view, it is not so difficult to implement if you understand how it works and accept its limitations (only sum and subtractions are supported). I think that many...(read more)

    Read the article

  • SAF Architecture Evaluation (Introduction)

    I saidin “what’s Software architecture” – architecture is both an early artifact and it also represents the significant decisions about the system – or to sum it up”Architecture is the decisions that youwishyou could get right early in a project.(Ralph Johnston*). That is exactly why I made evaluation one of the key steps in SAF. [...]...Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

    Read the article

  • Oracle Snapshot Not Working [closed]

    - by nayef harb
    i have created a snapshot that takes data from 2 tables and has a refresh rate of 1 day. The snapshot data is not refreshing it is still the same. is there something that i am missing ? Here is the code: CREATE SNAPSHOT test REFRESH COMPLETE START WITH SYSDATE NEXT sysdate + 1 AS select item_code,item_conc_code,tran_bran_code,sum(tran_qty) bal_qty from tranhist a, itemmast b where a.tran_item_code = b.item_code group by item_code,item_conc_code,tran_bran_code

    Read the article

  • Interesting fact #123423

    - by Tim Dexter
    Question from a customer on an internal mailing list this, succintly answered by RTF Template God, Hok-Min Q: Whats the upper limit for a sum calculation in terms of the largest number BIP can handle? A: Internally, XSL-T processor uses double precession.  Therefore the upper limit and precision will be same as double (IEEE 754 double-precision binary floating-point format, binary64). Approximately 16 significant decimal digits, max is 1.7976931348623157 x 10308 . So, now you know :)

    Read the article

  • Making the most of next weeks SharePoint 2010 developer training

    - by Eric Nelson
    [you can still register if you are free on the afternoons of 9th to 11th – UK time] We have 50+ registrations with more coming in – which is fantastic. Please read on to make the most of the training. Background We have structured the training to make sure that you can still learn lots during the three days even if you do not have SharePoint 2010 installed. Additionally the course is based around a subset of the channel 9 training to allow you to easily dig deeper or look again at specific areas. Which means if you have zero time between now and next Wednesday then you are still good to go. But if you can do some pre-work you will likely get even more out of the three days. Step 1: Check out the topics and resources available on-demand The course is based around a subset of the channel 9 training to allow you to easily dig deeper or look again at specific areas. Take a lap around the SharePoint 2010 Training Course on Channel 9 Download the SharePoint Developer Training Kit Step 2: Use a pre-configured Virtual Machine which you can download (best start today – it is large!) Consider using the VM we created If you don't have access to SharePoint 2010. You will need a 64bit host OS and bare minimum of 4GB of RAM. 8GB recommended. Virtual PC can not be used with this VM – Virtual PC only supports 32bit guests. The 2010-7a Information Worker VM gives you everything you need to develop for SharePoint 2010. Watch the Video on how to use this VM Download the VM Remember you only need to download the “parts” for the 2010-7a VM. There are 3 subtly different ways of using this VM: Easiest is to follow the advice of the video and get yourself a host OS of Windows Server 2008 R2 with Hyper-V and simply use the VM Alternatively you can take the VHD and create a “Boot to VHD” if you have Windows 7 Ultimate or Enterprise Edition. This works really well – especially if you are already familiar with “Boot to VHD” (This post I did will help you get started) Or you can take the VHD and use an alternative VM tool such as VirtualBox if you have a different host OS. NB: This tends to involve some work to get everything running fine. Check out parts 1 to 3 from Rolly and if you go with Virtual Box use an IDE controller not SATA. SATA will blue screen. Note in the screenshot below I also converted the vhd to a vmdk. I used the FREE Starwind Converter to do this whilst I was fighting blue screens – not sure its necessary as VirtualBox does now work with VHDs. or Step 3 – Install SharePoint 2010 on a 64bit Windows 7 or Vista Host I haven’t tried this but it is now supported. Check out MSDN. Final notes: I am in the process of securing a number of hosted VMs for ISVs directly managed by my team. Your Architect Evangelist will have details once I have them! Else we can sort out on the Wed. Regrettably I am unable to give folks 1:1 support on any issues around Boot to VHD, 3rd party VM products etc. Related Links: Check you are fully plugged into the work of my team – have you done these simple steps including joining our new LinkedIn group?

    Read the article

  • Efficiently separating Read/Compute/Write steps for concurrent processing of entities in Entity/Component systems

    - by TravisG
    Setup I have an entity-component architecture where Entities can have a set of attributes (which are pure data with no behavior) and there exist systems that run the entity logic which act on that data. Essentially, in somewhat pseudo-code: Entity { id; map<id_type, Attribute> attributes; } System { update(); vector<Entity> entities; } A system that just moves along all entities at a constant rate might be MovementSystem extends System { update() { for each entity in entities position = entity.attributes["position"]; position += vec3(1,1,1); } } Essentially, I'm trying to parallelise update() as efficiently as possible. This can be done by running entire systems in parallel, or by giving each update() of one system a couple of components so different threads can execute the update of the same system, but for a different subset of entities registered with that system. Problem In reality, these systems sometimes require that entities interact(/read/write data from/to) each other, sometimes within the same system (e.g. an AI system that reads state from other entities surrounding the current processed entity), but sometimes between different systems that depend on each other (i.e. a movement system that requires data from a system that processes user input). Now, when trying to parallelize the update phases of entity/component systems, the phases in which data (components/attributes) from Entities are read and used to compute something, and the phase where the modified data is written back to entities need to be separated in order to avoid data races. Otherwise the only way (not taking into account just "critical section"ing everything) to avoid them is to serialize parts of the update process that depend on other parts. This seems ugly. To me it would seem more elegant to be able to (ideally) have all processing running in parallel, where a system may read data from all entities as it wishes, but doesn't write modifications to that data back until some later point. The fact that this is even possible is based on the assumption that modification write-backs are usually very small in complexity, and don't require much performance, whereas computations are very expensive (relatively). So the overhead added by a delayed-write phase might be evened out by more efficient updating of entities (by having threads work more % of the time instead of waiting). A concrete example of this might be a system that updates physics. The system needs to both read and write a lot of data to and from entities. Optimally, there would be a system in place where all available threads update a subset of all entities registered with the physics system. In the case of the physics system this isn't trivially possible because of race conditions. So without a workaround, we would have to find other systems to run in parallel (which don't modify the same data as the physics system), other wise the remaining threads are waiting and wasting time. However, that has disadvantages Practically, the L3 cache is pretty much always better utilized when updating a large system with multiple threads, as opposed to multiple systems at once, which all act on different sets of data. Finding and assembling other systems to run in parallel can be extremely time consuming to design well enough to optimize performance. Sometimes, it might even not be possible at all because a system just depends on data that is touched by all other systems. Solution? In my thinking, a possible solution would be a system where reading/updating and writing of data is separated, so that in one expensive phase, systems only read data and compute what they need to compute, and then in a separate, performance-wise cheap, write phase, attributes of entities that needed to be modified are finally written back to the entities. The Question How might such a system be implemented to achieve optimal performance, as well as making programmer life easier? What are the implementation details of such a system and what might have to be changed in the existing EC-architecture to accommodate this solution?

    Read the article

  • Should I expose IObservable<T> on my interfaces?

    - by Alex
    My colleague and I have dispute. We are writing a .NET application that processes massive amounts of data. It receives data elements, groups subsets of them into blocks according to some criterion and processes those blocks. Let's say we have data items of type Foo arriving some source (from the network, for example) one by one. We wish to gather subsets of related objects of type Foo, construct an object of type Bar from each such subset and process objects of type Bar. One of us suggested the following design. Its main theme is exposing IObservable objects directly from the interfaces of our components. // ********* Interfaces ********** interface IFooSource { // this is the event-stream of objects of type Foo IObservable<Foo> FooArrivals { get; } } interface IBarSource { // this is the event-stream of objects of type Bar IObservable<Bar> BarArrivals { get; } } / ********* Implementations ********* class FooSource : IFooSource { // Here we put logic that receives Foo objects from the network and publishes them to the FooArrivals event stream. } class FooSubsetsToBarConverter : IBarSource { IFooSource fooSource; IObservable<Bar> BarArrivals { get { // Do some fancy Rx operators on fooSource.FooArrivals, like Buffer, Window, Join and others and return IObservable<Bar> } } } // this class will subscribe to the bar source and do processing class BarsProcessor { BarsProcessor(IBarSource barSource); void Subscribe(); } // ******************* Main ************************ class Program { public static void Main(string[] args) { var fooSource = FooSourceFactory.Create(); var barsProcessor = BarsProcessorFactory.Create(fooSource) // this will create FooSubsetToBarConverter and BarsProcessor barsProcessor.Subscribe(); fooSource.Run(); // this enters a loop of listening for Foo objects from the network and notifying about their arrival. } } The other suggested another design that its main theme is using our own publisher/subscriber interfaces and using Rx inside the implementations only when needed. //********** interfaces ********* interface IPublisher<T> { void Subscribe(ISubscriber<T> subscriber); } interface ISubscriber<T> { Action<T> Callback { get; } } //********** implementations ********* class FooSource : IPublisher<Foo> { public void Subscribe(ISubscriber<Foo> subscriber) { /* ... */ } // here we put logic that receives Foo objects from some source (the network?) publishes them to the registered subscribers } class FooSubsetsToBarConverter : ISubscriber<Foo>, IPublisher<Bar> { void Callback(Foo foo) { // here we put logic that aggregates Foo objects and publishes Bars when we have received a subset of Foos that match our criteria // maybe we use Rx here internally. } public void Subscribe(ISubscriber<Bar> subscriber) { /* ... */ } } class BarsProcessor : ISubscriber<Bar> { void Callback(Bar bar) { // here we put code that processes Bar objects } } //********** program ********* class Program { public static void Main(string[] args) { var fooSource = fooSourceFactory.Create(); var barsProcessor = barsProcessorFactory.Create(fooSource) // this will create BarsProcessor and perform all the necessary subscriptions fooSource.Run(); // this enters a loop of listening for Foo objects from the network and notifying about their arrival. } } Which one do you think is better? Exposing IObservable and making our components create new event streams from Rx operators, or defining our own publisher/subscriber interfaces and using Rx internally if needed? Here are some things to consider about the designs: In the first design the consumer of our interfaces has the whole power of Rx at his/her fingertips and can perform any Rx operators. One of us claims this is an advantage and the other claims that this is a drawback. The second design allows us to use any publisher/subscriber architecture under the hood. The first design ties us to Rx. If we wish to use the power of Rx, it requires more work in the second design because we need to translate the custom publisher/subscriber implementation to Rx and back. It requires writing glue code for every class that wishes to do some event processing.

    Read the article

  • Better way to design a database

    - by cMinor
    I have a conceptual problem and I would like to get your ideas on how I'll be able to do what I am aiming. My goal is to create a database with information of persons who work at a place depending on their profession and skills,and keep control of salary and projects (how much would cost summing all the hours of work) I have 3 categories which can have subcategories: Outsourcing Technician welder turner assistant Administrative supervisor manager So each person has its information and the projects they are working on, also one person may do several jobs... I was thinking about having 5 tables (EMPLOYEE, SKILLS, PROYECTS, SALARY, PROFESSION) but I guess there is a better way of doing this. create table Employee ( PRIMARY KEY [Person_ID] int(10), [Name] varchar(30), [sex] varchar(10), [address] varchar(10), [profession] varchar(10), [Skills_ID] int(10), [Proyect_ID] int(10), [Salary_ID] int(10), [Salary] float ) create table Skills ( PRIMARY KEY [Skills_ID] int(10), FOREIGN KEY [Skills_name] varchar(10) REFERENCES Employee(Person_ID), [Skills_pay] float(10), [Comments] varchar(50) ) create table Proyects ( PRIMARY KEY [Proyect_ID] int(10), FOREIGN KEY [Skills_name] varchar(10) REFERENCES Employee(Person_ID) [Proyect_name] varchar(10), [working_Hours] float(10), [Comments] varchar(50) ) create table Salary ( PRIMARY KEY [Salary_ID] int(10), FOREIGN KEY [Skills_name] varchar(10) REFERENCES Employee(Person_ID) [Proyect_name] varchar(10), [working_Hours] float(10), [Comments] varchar(50) ) So to get the total amount of the cost of a project I would just sum the working hours of each employee envolved and sum some extra costs in an aggregate query. Is there a way to do this in a more efficient way? What to add or delete of this small model? I guess I am missing something in the salary - maybe I need another table for that?

    Read the article

  • Excel not properly recalculating values

    - by gms8994
    I have an excel sheet with values in it (this sheet is generated by a custom perl script, but I don't think that's where the problem lies). In it, I have a formula: =sum(indirect(concatenate(address(6,column()),":",address(17,column())))) The purpose of this formula is to give me the SUM() of the cells in the current column, between rows 6 and 17. In Gnumeric Spreadsheet, as soon as I open the file, this works. But in Excel (both 2003 and 2007), opening the file gives #VALUE! errors in the fields with this formula, stating that the INDIRECT call with the values $B$6:$B$17 will result in an error. Here's the kink in the issue. If I edit the field (via F2), and make no changes, and hit enter, the values update. Also, it seems, if I save the file as .xlsx (Excel 2007 format), the values update upon opening. Unfortunately, I'm not sure that creating an xlsx is a possibility with the modules that I'm using, and many of our clients probably wouldn't be able to use it anyway. Any suggestions? Editing 200+ files every month for each client isn't going to be feasible, so if there's something I'm missing, please let me know.

    Read the article

  • MySQL select query result set changes based on column order

    - by user197191
    I have a drupal 7 site using the Views module to back-end site content search results. The same query with the same dataset returns different results from MySQL 5.5.28 to MySQL 5.6.14. The results from 5.5.28 are the correct, expected results. The results from 5.6.14 are not. If, however, I simply move a column in the select statement, the query returns the correct results. Here is the code-generated query in question (modified for readability). I apologize for the length; I couldn't find a way to reproduce it without the whole query: SELECT DISTINCT node_node_revision.nid AS node_node_revision_nid, node_revision.title AS node_revision_title, node_field_revision_field_position_institution_ref.nid AS node_field_revision_field_position_institution_ref_nid, node_revision.vid AS vid, node_revision.nid AS node_revision_nid, node_node_revision.title AS node_node_revision_title, SUM(search_index.score * search_total.count) AS score, 'node' AS field_data_field_system_inst_name_node_entity_type, 'node' AS field_revision_field_position_college_division_node_entity_t, 'node' AS field_revision_field_position_department_node_entity_type, 'node' AS field_revision_field_search_lvl_degree_lvls_node_entity_type, 'node' AS field_revision_field_position_app_deadline_node_entity_type, 'node' AS field_revision_field_position_start_date_node_entity_type, 'node' AS field_revision_body_node_entity_type FROM node_revision node_revision LEFT JOIN node node_node_revision ON node_revision.nid = node_node_revision.nid LEFT JOIN field_revision_field_position_institution_ref field_revision_field_position_institution_ref ON node_revision.vid = field_revision_field_position_institution_ref.revision_id AND (field_revision_field_position_institution_ref.entity_type = 'node' AND field_revision_field_position_institution_ref.deleted = '0') LEFT JOIN node node_field_revision_field_position_institution_ref ON field_revision_field_position_institution_ref.field_position_institution_ref_target_id = node_field_revision_field_position_institution_ref.nid LEFT JOIN field_revision_field_position_cip_code field_revision_field_position_cip_code ON node_revision.vid = field_revision_field_position_cip_code.revision_id AND (field_revision_field_position_cip_code.entity_type = 'node' AND field_revision_field_position_cip_code.deleted = '0') LEFT JOIN node node_field_revision_field_position_cip_code ON field_revision_field_position_cip_code.field_position_cip_code_target_id = node_field_revision_field_position_cip_code.nid LEFT JOIN node node_node_revision_1 ON node_revision.nid = node_node_revision_1.nid LEFT JOIN field_revision_field_position_vacancy_status field_revision_field_position_vacancy_status ON node_revision.vid = field_revision_field_position_vacancy_status.revision_id AND (field_revision_field_position_vacancy_status.entity_type = 'node' AND field_revision_field_position_vacancy_status.deleted = '0') LEFT JOIN search_index search_index ON node_revision.nid = search_index.sid LEFT JOIN search_total search_total ON search_index.word = search_total.word WHERE ( ( (node_node_revision.status = '1') AND (node_node_revision.type IN ('position')) AND (field_revision_field_position_vacancy_status.field_position_vacancy_status_target_id IN ('38')) AND( (search_index.type = 'node') AND( (search_index.word = 'accountant') ) ) AND ( (node_revision.vid=node_node_revision.vid AND node_node_revision.status=1) ) ) ) GROUP BY search_index.sid, vid, score, field_data_field_system_inst_name_node_entity_type, field_revision_field_position_college_division_node_entity_t, field_revision_field_position_department_node_entity_type, field_revision_field_search_lvl_degree_lvls_node_entity_type, field_revision_field_position_app_deadline_node_entity_type, field_revision_field_position_start_date_node_entity_type, field_revision_body_node_entity_type HAVING ( ( (COUNT(*) >= '1') ) ) ORDER BY node_node_revision_title ASC LIMIT 20 OFFSET 0; Again, this query returns different sets of results from MySQL 5.5.28 (correct) to 5.6.14 (incorrect). If I move the column named "score" (the SUM() column) to the end of the column list, the query returns the correct set of results in both versions of MySQL. My question is: Is this expected behavior (and why), or is this a bug? I'm on the verge of reverting my entire environment back to 5.5 because of this.

    Read the article

  • MySQL config for 2GB ram

    - by Tiffany Walker
    How is my config? Does it work well for 2GB? What would be an ideal config for a 2GB ram server? [mysqld] set-variable = max_connections=500 log-slow-queries safe-show-database local-infile=0 skip-networking symbolic-links=0 max_connections = 500 key_buffer = 256M myisam_sort_buffer_size = 64M join_buffer_size = 2M read_buffer_size = 2M sort_buffer_size = 2M read_rnd_buffer_size = 2M thread_concurrency = 16 table_cache = 1024 thread_cache_size = 50 wait_timeout = 7200 connect_timeout = 10 tmp_table_size = 32M max_allowed_packet = 160M max_connect_errors = 10 query_cache_limit = 1M query_cache_size = 32M query_cache_type = 1 [mysqld_safe] open_files_limit = 8192 [mysqldump] max_allowed_packet = 16M [myisamchk] key_buffer = 64M sort_buffer = 64M read_buffer = 16M write_buffer = 16M UPDATE 2012-03-28 12:58 EDT By RolandoMySQLDBA Please run these queries and paste them into your question: For MyISAM SELECT CONCAT(ROUND(KBS/POWER(1024, IF(PowerOf1024<0,0,IF(PowerOf1024>3,0,PowerOf1024)))+0.4999), SUBSTR(' KMG',IF(PowerOf1024<0,0, IF(PowerOf1024>3,0,PowerOf1024))+1,1)) recommended_key_buffer_size FROM (SELECT LEAST(POWER(2,32),KBS1) KBS FROM (SELECT SUM(index_length) KBS1 FROM information_schema.tables WHERE engine='MyISAM' AND table_schema NOT IN ('information_schema','mysql')) AA ) A, (SELECT 2 PowerOf1024) B; For InnoDB SELECT CONCAT(ROUND(KBS/POWER(1024, IF(PowerOf1024<0,0,IF(PowerOf1024>3,0,PowerOf1024)))+0.49999), SUBSTR(' KMG',IF(PowerOf1024<0,0, IF(PowerOf1024>3,0,PowerOf1024))+1,1)) recommended_innodb_buffer_pool_size FROM (SELECT SUM(data_length+index_length) KBS FROM information_schema.tables WHERE engine='InnoDB') A, (SELECT 2 PowerOf1024) B;

    Read the article

  • What I should know about memory management?

    - by bua
    first of all: I don't use stackadmin or similar so please don't vote for moving there, I'm reading man top and paper "what every programmer should know about memory ..." I need really simple explanation like for retard ;) Having following top dump: top - 11:21:19 up 37 days, 21:16, 4 users, load average: 0.41, 0.75, 1.09 Tasks: 313 total, 5 running, 308 sleeping, 0 stopped, 0 zombie Cpu(s): 0.4%us, 0.6%sy, 0.9%ni, 96.2%id, 0.1%wa, 0.0%hi, 1.9%si, 0.0%st Mem: 132103848k total, 131916948k used, 186900k free, 54000k buffers Swap: 73400944k total, 73070884k used, 330060k free, 13931192k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 3305 tudb 25 10 144m 52m 940 R 6.0 0.0 1306:09 app 3011 tudb 15 0 71528 19m 604 S 3.3 0.0 171:57.83 app 3373 tudb 25 10 209m 93m 940 S 3.0 0.1 1074:53 app 3338 tudb 25 10 144m 47m 940 R 2.7 0.0 780:48.48 app 4227 tudb 25 10 208m 99m 904 S 1.3 0.1 198:56.01 app 8506 tudb 25 10 80.7g 49g 932 S 2.0 39.6 458:31.22 app I'm wondering what is: RES (my expl. physical memory consumption ? see 49GB) VIRT (memory mapped disk to cache? see 80GB) SHR (shared pages?) Swap: (is this cached label - for memory mapped disk into swap cache?) Should sum of RES give MEM: X used? or maybe sum of VIRT?

    Read the article

  • Interpolation and Morphing of an image in labview and/or openCV

    - by Marc
    I am working on an image manipulation problem. I have an overhead projector that projects onto a screen, and I have a camera that takes pictures of that. I can establish a 1:1 correspondence between a subset of projector coordinates and a subset of camera pixels by projecting dots on the screen and finding the centers of mass of the resulting regions on the camera. I thus have a map proj_x, proj_y <-- cam_x, cam_y for scattered point pairs My original plan was to regularize this map using the Mathscript function griddata. This would work fine in MATLAB, as follows [pgridx, pgridy] = meshgrid(allprojxpts, allprojypts) fitcx = griddata (proj_x, proj_y, cam_x, pgridx, pgridy); fitcy = griddata (proj_x, proj_y, cam_y, pgridx, pgridy); and the reverse for the camera to projector mapping Unfortunately, this code causes Labview to run out of memory on the meshgrid step (the camera is 5 megapixels, which apparently is too much for labview to handle) I then started looking through openCV, and found the cvRemap function. Unfortunately, this function takes as its starting point a regularized pixel-pixel map like the one I was trying to generate above. However, it made me hope that functions for creating such a map might be available in openCV. I couldn't find it in the openCV 1.0 API (I am stuck with 1.0 for legacy reasons), but I was hoping it's there or that someone has an easy trick. So my question is one of the following 1) How can I interpolate from scattered points to a grid in openCV; (i.e., given z = f(x,y) for scattered values of x and y, how to fill an image with f(im_x, im_y) ? 2) How can I perform an image transform that maps image 1 to image 2, given that I know a scattered mapping of points in coordinate system 1 to coordinate system 2. This could be implemented either in Labview or OpenCV. Note: I am tagging this post delaunay, because that's one method of doing a scattered interpolation, but the better tag would be "scattered interpolation"

    Read the article

  • How to use JQUERY to filter table rows dynamically using multiple form inputs

    - by tresstylez
    I'm displaying a table with multiple rows and columns. I'm using a JQUERY plugin called uiTableFilter which uses a text field input and filters (shows/hides) the table rows based on the input you provide. All you do is specify a column you want to filter on, and it will display only rows that have the text field input in that column. Simple and works fine. I want to add a SECOND text input field that will help me narrow the results down even further. So, for instance if I had a PETS table and one column was petType and one was petColor -- I could type in CAT into the first text field, to show ALL cats, and then in the 2nd text field, I could type black, and the resulting table would display only rows where BLACK CATS were found. Basically, a subset. Here is the JQUERY I'm using: $("#typeFilter").live('keyup', function() { if ($(this).val().length > 2 || $(this).val().length == 0) { var newTable = $('#pets'); $.uiTableFilter( theTable, this.value, "petType" ); } }) // end typefilter $("#colorFilter").live('keyup', function() { if ($(this).val().length > 2 || $(this).val().length == 0) { var newTable = $('#pets'); $.uiTableFilter( newTable, this.value, "petColor" ); } }) // end colorfilter Problem is, I can use one filter, and it will display the correct subset of table rows, but when I provide input for the other filter, it doesn't seem to recognize the visible table rows that are remaining from the previous column, but instead it appears that it does an entirely new filtering of the original table. If 10 rows are returned after applying one filter, the 2nd filter should only apply to THOSE 10 rows. I've tried LIVE and BIND, but not working. Can anyone shed some light on where I'm going wrong? Thanks!

    Read the article

< Previous Page | 43 44 45 46 47 48 49 50 51 52 53 54  | Next Page >