Search Results

Search found 442 results on 18 pages for 'the naive'.

Page 17/18 | < Previous Page | 13 14 15 16 17 18  | Next Page >

  • How to compute a unicode string which bidirectional representation is specified?

    - by valdo
    Hello, fellows. I have a rather pervert question. Please forgive me :) There's an official algorithm that describes how bidirectional unicode text should be presented. http://www.unicode.org/reports/tr9/tr9-15.html I receive a string (from some 3rd-party source), which contains latin/hebrew characters, as well as digits, white-spaces, punctuation symbols and etc. The problem is that the string that I receive is already in the representation form. I.e. - the sequence of characters that I receive should just be presented from left to right. Now, my goal is to find the unicode string which representation is exactly the same. Means - I need to pass that string to another entity; it would then render this string according to the official algorithm, and the result should be the same. Assuming the following: The default text direction (of the rendering entity) is RTL. I don't want to inject "special unicode characters" that explicitly override the text direction (such as RLO, RLE, etc.) I suspect there may exist several solutions. If so - I'd like to preserve the RTL-looking of the string as much as possible. The string usually consists of hebrew words mostly. I'd like to preserve the correct order of those words, and characters inside those words. Whereas other character sequences may (and should) be transposed. One naive way to solve this is just to swap the whole string (this takes care of the hebrew words), and then swap inside it sequences of non-hebrew characters. This however doesn't always produce correct results, because actual rules of representation are rather complex. The only comprehensive algorithm that I see so far is brute-force check. The string can be divided into sequences of same-class characters. Those sequences may be joined in random order, plus any of them may be reversed. I can check all those combinations to obtain the correct result. Plus this technique may be optimized. For instance the order of hebrew words is known, so we only have to check different combinations of their "joining" sequences. Any better ideas? If you have an idea, not necessarily the whole solution - it's ok. I'll appreciate any idea. Thanks in advance.

    Read the article

  • Pure sine wave inverter

    - by Nick
    Not exactly programming (sorry) but I think it's pretty close and can be of interest to other programmers. I'm trying to setup a battery power station so that I can work from anywhere. I go surfing a lot and my idea is to be able to work from wherever I can park my car (given there's coverage). So, I'm getting a deep cycle battery, a 240V charger (I'm in Australia), and an inverter. At the back of my laptop it says 19V and 4.62A. From the people I've spoken to that means it consumes about 90W at most. So my inverter needs to be able to output about 100W. Most of them seem to be 200W and up so this shouldn't be a problem. I want to be able run my laptop for 10 hours (plus the 2 hours I get from the laptop battery) straight. According to the people I've spoken to and from what I gather online I need a battery that has the amp hours for my "amp draw". I have no idea how to calculate this but I've been guesstimating. Most deep cycle batteries seem to be classified using amp hours (Ah)... 35Ah, 50Ah, 75Ah, 100Ah, and so on. However the amp hours on those batteries is for a 240V and I seem to be using 19V. According to an expert I spoke to you'd need a 100Ah battery to power a 5A appliance at 240V for 10 hours (you only get about 50% useful power). That to me is 5A * 240V = 100Ah battery. So, naive as I might be I take 240V and divide that by my 19V and reach the conclusion that I can get away with a battery that's about 12 times smaller than that 100Ah. The expert told me I needed a 50Ah battery so that's probably what I'll be getting, but it would be interesting to know what I theoretically would need to power my laptop for 10 hours. As for charging the battery the expert I spoke to said I needed a 3-5A charger to be able to charge that 50Ah battery from flat to full in about 10 hours (I will just leave it plugged in over night). Now to my question. The expert said it's not a matter of "if" more like a guaranteed "when" my computer will stuff up if I don't use a "pure sine wave inverter". From what I gather the power that comes out of that battery is not as clean as the power we get in the socket at home. Apparently it's "square" and the one in the socket is nice and smooth. I've already got an inverter, but it's not "pure". Do I really need to buy the $200-300 pure sine wave inverter or can I get away with something else? Perhaps the laptop adapter that sits in the middle of my laptop power cable already fixes that wave to be nice and smooth? Thanks!

    Read the article

  • Can MySQL reasonably perform queries on billions of rows?

    - by haxney
    I am planning on storing scans from a mass spectrometer in a MySQL database and would like to know whether storing and analyzing this amount of data is remotely feasible. I know performance varies wildly depending on the environment, but I'm looking for the rough order of magnitude: will queries take 5 days or 5 milliseconds? Input format Each input file contains a single run of the spectrometer; each run is comprised of a set of scans, and each scan has an ordered array of datapoints. There is a bit of metadata, but the majority of the file is comprised of arrays 32- or 64-bit ints or floats. Host system |----------------+-------------------------------| | OS | Windows 2008 64-bit | | MySQL version | 5.5.24 (x86_64) | | CPU | 2x Xeon E5420 (8 cores total) | | RAM | 8GB | | SSD filesystem | 500 GiB | | HDD RAID | 12 TiB | |----------------+-------------------------------| There are some other services running on the server using negligible processor time. File statistics |------------------+--------------| | number of files | ~16,000 | | total size | 1.3 TiB | | min size | 0 bytes | | max size | 12 GiB | | mean | 800 MiB | | median | 500 MiB | | total datapoints | ~200 billion | |------------------+--------------| The total number of datapoints is a very rough estimate. Proposed schema I'm planning on doing things "right" (i.e. normalizing the data like crazy) and so would have a runs table, a spectra table with a foreign key to runs, and a datapoints table with a foreign key to spectra. The 200 Billion datapoint question I am going to be analyzing across multiple spectra and possibly even multiple runs, resulting in queries which could touch millions of rows. Assuming I index everything properly (which is a topic for another question) and am not trying to shuffle hundreds of MiB across the network, is it remotely plausible for MySQL to handle this? UPDATE: additional info The scan data will be coming from files in the XML-based mzML format. The meat of this format is in the <binaryDataArrayList> elements where the data is stored. Each scan produces = 2 <binaryDataArray> elements which, taken together, form a 2-dimensional (or more) array of the form [[123.456, 234.567, ...], ...]. These data are write-once, so update performance and transaction safety are not concerns. My naïve plan for a database schema is: runs table | column name | type | |-------------+-------------| | id | PRIMARY KEY | | start_time | TIMESTAMP | | name | VARCHAR | |-------------+-------------| spectra table | column name | type | |----------------+-------------| | id | PRIMARY KEY | | name | VARCHAR | | index | INT | | spectrum_type | INT | | representation | INT | | run_id | FOREIGN KEY | |----------------+-------------| datapoints table | column name | type | |-------------+-------------| | id | PRIMARY KEY | | spectrum_id | FOREIGN KEY | | mz | DOUBLE | | num_counts | DOUBLE | | index | INT | |-------------+-------------| Is this reasonable?

    Read the article

  • Problem with Precision floating point operation in C

    - by Microkernel
    Hi Guys, For one of my course project I started implementing "Naive Bayesian classifier" in C. My project is to implement a document classifier application (especially Spam) using huge training data. Now I have problem implementing the algorithm because of the limitations in the C's datatype. ( Algorithm I am using is given here, http://en.wikipedia.org/wiki/Bayesian_spam_filtering ) PROBLEM STATEMENT: The algorithm involves taking each word in a document and calculating probability of it being spam word. If p1, p2 p3 .... pn are probabilities of word-1, 2, 3 ... n. The probability of doc being spam or not is calculated using Here, probability value can be very easily around 0.01. So even if I use datatype "double" my calculation will go for a toss. To confirm this I wrote a sample code given below. #define PROBABILITY_OF_UNLIKELY_SPAM_WORD (0.01) #define PROBABILITY_OF_MOSTLY_SPAM_WORD (0.99) int main() { int index; long double numerator = 1.0; long double denom1 = 1.0, denom2 = 1.0; long double doc_spam_prob; /* Simulating FEW unlikely spam words */ for(index = 0; index < 162; index++) { numerator = numerator*(long double)PROBABILITY_OF_UNLIKELY_SPAM_WORD; denom2 = denom2*(long double)PROBABILITY_OF_UNLIKELY_SPAM_WORD; denom1 = denom1*(long double)(1 - PROBABILITY_OF_UNLIKELY_SPAM_WORD); } /* Simulating lot of mostly definite spam words */ for (index = 0; index < 1000; index++) { numerator = numerator*(long double)PROBABILITY_OF_MOSTLY_SPAM_WORD; denom2 = denom2*(long double)PROBABILITY_OF_MOSTLY_SPAM_WORD; denom1 = denom1*(long double)(1- PROBABILITY_OF_MOSTLY_SPAM_WORD); } doc_spam_prob= (numerator/(denom1+denom2)); return 0; } I tried Float, double and even long double datatypes but still same problem. Hence, say in a 100K words document I am analyzing, if just 162 words are having 1% spam probability and remaining 99838 are conspicuously spam words, then still my app will say it as Not Spam doc because of Precision error (as numerator easily goes to ZERO)!!!. This is the first time I am hitting such issue. So how exactly should this problem be tackled?

    Read the article

  • Preoblem with Precision floating point operation in C

    - by Microkernel
    Hi Guys, For one of my course project I started implementing "Naive Bayesian classifier" in C. My project is to implement a document classifier application (especially Spam) using huge training data. Now I have problem implementing the algorithm because of the limitations in the C's datatype. ( Algorithm I am using is given here, http://en.wikipedia.org/wiki/Bayesian_spam_filtering ) PROBLEM STATEMENT: The algorithm involves taking each word in a document and calculating probability of it being spam word. If p1, p2 p3 .... pn are probabilities of word-1, 2, 3 ... n. The probability of doc being spam or not is calculated using Here, probability value can be very easily around 0.01. So even if I use datatype "double" my calculation will go for a toss. To confirm this I wrote a sample code given below. #define PROBABILITY_OF_UNLIKELY_SPAM_WORD (0.01) #define PROBABILITY_OF_MOSTLY_SPAM_WORD (0.99) int main() { int index; long double numerator = 1.0; long double denom1 = 1.0, denom2 = 1.0; long double doc_spam_prob; /* Simulating FEW unlikely spam words */ for(index = 0; index < 162; index++) { numerator = numerator*(long double)PROBABILITY_OF_UNLIKELY_SPAM_WORD; denom2 = denom2*(long double)PROBABILITY_OF_UNLIKELY_SPAM_WORD; denom1 = denom1*(long double)(1 - PROBABILITY_OF_UNLIKELY_SPAM_WORD); } /* Simulating lot of mostly definite spam words */ for (index = 0; index < 1000; index++) { numerator = numerator*(long double)PROBABILITY_OF_MOSTLY_SPAM_WORD; denom2 = denom2*(long double)PROBABILITY_OF_MOSTLY_SPAM_WORD; denom1 = denom1*(long double)(1- PROBABILITY_OF_MOSTLY_SPAM_WORD); } doc_spam_prob= (numerator/(denom1+denom2)); return 0; } I tried Float, double and even long double datatypes but still same problem. Hence, say in a 100K words document I am analyzing, if just 162 words are having 1% spam probability and remaining 99838 are conspicuously spam words, then still my app will say it as Not Spam doc because of Precision error (as numerator easily goes to ZERO)!!!. This is the first time I am hitting such issue. So how exactly should this problem be tackled?

    Read the article

  • Calling compiled C from R with .C()

    - by Sarah
    I'm trying to call a program (function getNBDensities in the C executable measurementDensities_out) from R. The function is passed several arrays and the variable double runsum. Right now, the getNBDensities function basically does nothing: it prints to screen the values of passed parameters. My problem is the syntax of calling the function: array(.C("getNBDensities", hr = as.double(hosp.rate), # a vector (s x 1) sp = as.double(samplingProbabilities), # another vector (s x 1) odh = as.double(odh), # another vector (s x 1) simCases = as.integer(x[c("xC1","xC2","xC3")]), # another vector (s x 1) obsCases = as.integer(y[c("yC1","yC2","yC3")]), # another vector (s x 1) runsum = as.double(runsum), # double DUP = TRUE, NAOK = TRUE, PACKAGE = "measurementDensities_out")$f, dim = length(y[c("yC1","yC2","yC3")]), dimnames = c("yC1","yC2","yC3")) The error I get, after proper execution of the function (i.e., the right output is printed to screen), is Error in dim(data) <- dim : attempt to set an attribute on NULL I'm unclear what the dimensions are that I should be passing the function: should it be s x 5 + 1 (five vectors of length s and one double)? I've tried all sorts of combinations (including sx5+1) and have found only seemingly conflicting descriptions/examples online of what's supposed to happen here. For those who are interested, the C code is below: #include <R.h> #include <Rmath.h> #include <math.h> #include <Rdefines.h> #include <R_ext/PrtUtil.h> #define NUM_STRAINS 3 #define DEBUG void getNBDensities( double *hr, double *sp, double *odh, int *simCases, int *obsCases, double *runsum ); void getNBDensities( double *hr, double *sp, double *odh, int *simCases, int *obsCases, double *runsum ) { #ifdef DEBUG for ( int s = 0; s < NUM_STRAINS; s++ ) { Rprintf("\nFor strain %d",s); Rprintf("\n\tHospitalization rate = %lg", hr[s]); Rprintf("\n\tSimulation probability = %lg",sp[s]); Rprintf("\n\tSimulated cases = %d",simCases[s]); Rprintf("\n\tObserved cases = %d",obsCases[s]); Rprintf("\n\tOverdispersion parameter = %lg",odh[s]); } Rprintf("\nRunning sum = %lg",runsum[0]); #endif } naive solution While better (i.e., potentially faster or syntactically clearer) solutions may exist (see Dirk's answer below), the following simplification of the code works: out<-.C("getNBDensities", hr = as.double(hosp.rate), sp = as.double(samplingProbabilities), odh = as.double(odh), simCases = as.integer(x[c("xC1","xC2","xC3")]), obsCases = as.integer(y[c("yC1","yC2","yC3")]), runsum = as.double(runsum)) The variables can be accessed in >out.

    Read the article

  • Preserving Permalinks

    - by Daniel Moth
    One of the things that gets me on a rant is websites that break permalinks. If you have posted something somewhere and there is a public URL pointing to it, that URL should never ever return a 404. You are breaking all websites that ever linked to you and you are breaking all search engine links to your content (that others will try and follow). It is a pet peeve of mine. So when I had to move my blog, obviously I would preserve the root URL (www.danielmoth.com/Blog/), but I also wanted to preserve every URL my blog has generated over the years. To be clear, our focus here is on the URL formatting, not the content migration which I'll talk about in my next post. In this post, I'll describe my solution first and then what it solves. 1. The IIS7 Rewrite Module and web.config There are a few ways you can map an old URL to a new one (so when requests to the old URL come in, they get redirected to the new one). The new blog engine I use (dasBlog) has built-in functionality to do that (Scott refers to it here). Instead, the way I chose to address the issue was to use the IIS7 rewrite module. The IIS7 rewrite module allows redirecting URLs based on pattern matching, regular expressions and, of course, hardcoded full URLs for things that don't fall into any pattern. You can configure it visually from IIS Manager using a handy dialog that allows testing patterns against input URLs. Here is what mine looked like after configuring a few rules: To learn more about this technology check out this video, the reference page and this overview blog post; all 3 pages have a collection of related resources at the bottom worth checking out too. All the visual configuration ends up in a web.config file at the root folder of your website. If you are on a shared hosting service, probably the only way you can use the Rewrite Module is by directly editing the web.config file. Next, I'll describe the URLs I had to map and how that manifested itself in the web.config file. What I did was create the rules locally using the GUI, and then took the generated web.config file and uploaded it to my live site. You can view my web.config here. 2. Monthly Archives Observe the difference between the way the two blog engines generate this type of URL Blogger: /Blog/2004_07_01_mothblog_archive.html dasBlog: /Blog/default,month,2004-07.aspx In my web.config file, the rule that deals with this is the one named "monthlyarchive_redirect". 3. Categories Observe the difference between the way the two blog engines generate this type of URL Blogger: /Blog/labels/Personal.html dasBlog: /Blog/CategoryView,category,Personal.aspx In my web.config file the rule that deals with this is the one named "category_redirect". 4. Posts Observe the difference between the way the two blog engines generate this type of URL Blogger: /Blog/2004/07/hello-world.html dasBlog: /Blog/Hello-World.aspx In my web.config file the rule that deals with this is the one named "post_redirect". Note: The decision is taken to use dasBlog URLs that do not include the date info (see the description of my Appearance settings). If we included the date info then it would have to include the day part, which blogger did not generate. This makes it impossible to redirect correctly and to have a single permalink for blog posts moving forward. An implication of this decision, is that no two blog posts can have the same title. The tool I will describe in my next post (inelegantly) deals with duplicates, but not with triplicates or higher. 5. Unhandled by a generic rule Unfortunately, the two blog engines use different rules for generating URLs for blog posts. Most of the time the conversion is as simple as the example of the previous section where a post titled "Hello World" generates a URL with the words separated by a hyphen. Some times that is not the case, for example: /Blog/2006/05/medc-wrap-up.html /Blog/MEDC-Wrapup.aspx or /Blog/2005/01/best-of-moth-2004.html /Blog/Best-Of-The-Moth-2004.aspx or /Blog/2004/11/more-windows-mobile-2005-details.html /Blog/More-Windows-Mobile-2005-Details-Emerge.aspx In short, blogger does not add words to the title beyond ~39 characters, it drops some words from the title generation (e.g. a, an, on, the), and it preserve hyphens that appear in the title. For this reason, we need to detect these and explicitly list them for redirects (no regular expression can help here because the full set of rules is not listed anywhere). In my web.config file the rule that deals with this is the one named "Redirect rule1 for FullRedirects" combined with the rewriteMap named "StaticRedirects". Note: The tool I describe in my next post will detect all the URLs that need to be explicitly redirected and will list them in a file ready for you to copy them to your web.config rewriteMap. 6. C# code doing the same as the web.config I wrote some naive code that does the same thing as the web.config: given a string it will return a new string converted according to the 3 rules above. It does not take into account the 4th case where an explicit hard-coded conversion is needed (the tool I present in the next post does take that into account). static string REGEX_post_redirect = "[0-9]{4}/[0-9]{2}/([0-9a-z-]+).html"; static string REGEX_category_redirect = "labels/([_0-9a-z-% ]+).html"; static string REGEX_monthlyarchive_redirect = "([0-9]{4})_([0-9]{2})_[0-9]{2}_mothblog_archive.html"; static string Redirect(string oldUrl) { GroupCollection g; if (RunRegExOnIt(oldUrl, REGEX_post_redirect, 2, out g)) return string.Concat(g[1].Value, ".aspx"); if (RunRegExOnIt(oldUrl, REGEX_category_redirect, 2, out g)) return string.Concat("CategoryView,category,", g[1].Value, ".aspx"); if (RunRegExOnIt(oldUrl, REGEX_monthlyarchive_redirect, 3, out g)) return string.Concat("default,month,", g[1].Value, "-", g[2], ".aspx"); return string.Empty; } static bool RunRegExOnIt(string toRegEx, string pattern, int groupCount, out GroupCollection g) { if (pattern.Length == 0) { g = null; return false; } g = new Regex(pattern, RegexOptions.IgnoreCase | RegexOptions.Compiled).Match(toRegEx).Groups; return (g.Count == groupCount); } Comments about this post welcome at the original blog.

    Read the article

  • WWDC and Tech Ed: A Tale of Two DevCons

    - by andrewbrust
    Next week marks the first full week of June.  Summer will feel in full swing and it will be a pretty big season for technology.  In seeming acknowledgement of that very fact, both Apple and Microsoft will be holding large developers conferences starting Monday.  Apple will hold its annual Worldwide Developers Conference (WWDC) in lovely San Francisco and Microsoft will hold its Tech Ed conference in muggy, oil-laden yet soulful New Orleans.  A brief survey of each show reveals much about the differences in each company’s offerings, strategy, and approach to customers and partners. In the interest of full disclosure, I must explain that I will be speaking at Microsoft’s Tech Ed show, and have done so, on and off, since 2003.  I have never been to an Apple conference and, as readers of this blog may know, I acquired my first ever Apple product 2 months ago when I bought an iPad on the day of that product’s launch.  I think I have keen insights into Microsoft’s conference.  My ability to comment on Apple’s event ranges somewhere between backseat driver and naive observer.  Just so you know. Although both shows cater to their respective company’s developers, there are a number of differences in the events’ purposes and content approaches.  First off, let’s consider each show as a news and PR vehicle.  WWDC will feature Steve Jobs’ keynote address and most likely will be where Apple officially reveals details of its 4th-generation iPhone. Jobs will likely also provide deep background information on the corresponding iPhone OS release.  These presumed announcements will make the show a magnet for the tech press and tech blogger elite.  Apple’s customers will be interested too, especially since the iPhone OS release will likely be made available to owners of existing iPhone, iPod Touch and iPad devices. Tech Ed, on the other hand, may not be especially newsworthy at all.  The keynote address will be given by Bob Muglia, who is President of the company’s Server and Tools Division, and he’ll likely be reviewing things more than previewing them. That’s because the company has, in the last 6-8 months, already released new versions of a majority of its products, including Windows, Office, SharePoint, SQL Server, Exchange, its Azure cloud platform, its .NET software development layer, its Silverlight Rich Internet Application (RIA) technology and its Visual Studio developer suite.  Redmond’s product pipeline has functioned more like a firehose of late, and the company has a ton of work to do to get developers up to speed on everything that’s new. I know I keep saying “developers,” but in Tech Ed’s case, that’s not really accurate.  In North America, Tech Ed caters to both developers and IT pros (i.e. technologists who work with physical IT infrastructure, as well as security and administration of the server software that runs on it).  This pairing has, since its inception, struck some as anomalous and others, including many exhibitors, as very smart. Certainly, it means Tech Ed ends up being a confab for virtually all professionals in Microsoft’s ecosystem.  And this year, Microsoft’s Business Intelligence (BI) conference will be co-located with Tech Ed, further enhancing that fusion effect. Clearly then, Microsoft’s show will focus on education, as its name assures us.  Apple’s will serve as both a press event and an opportunity to get its own App Store developer channel synced up with its newest technology advances.  For example, we already know that iPhone OS 4.0 will provide for a limited multitasking capability; that will only work well if people know how to code to it in a capable way.  Apple also told us its iAd advertising platform will be part of the new OS, and Steve Jobs insists that’s to provide a revenue opportunity for developers.  This too, then, needs to be explicated and soaked up buy the faithful. A look at each show’s breakout session lineup provides some interesting takeaways.  WWDC will have very few Mac-specific sessions on offer, and virtually no sessions that at are IT- or “Enterprise-“ related.  It’s all about the phone, music players and tablets.  However, WWDC will have plenty of low-level, hardcore tech coverage of such things as Advanced Memory Analysis and Creating Secure Applications, as well as lots of rich media-related content like Core Animation and Game Design and Development.  Beyond Apple’s proprietary platform, WWDC will also feature an array of sessions on HTML 5 and other Web standards.  In all, WWDC offers over 100 technical sessions and hands-on labs. What about Tech Ed’s editorial content?  Like the target audience, it really runs the gamut.  The show has 21 tracks (versus WWDC’s 5) and more than 745 “learning opportunities” which include breakout sessions, demo stations, hands-on labs and BIrds of a Feather discussion sessions.  Topics range from Architecture talks like Patterns of Parallel Programming to cloud computing talks like Building High Capacity Compute Applications with Windows Azure to IT-focused topics like Virtualization of Microsoft SharePoint 2010 Farm Architecture.  I also count 19 sessions on Windows Phone 7.  Unfortunately, with regard to Web standards and HTML 5, only a few sessions are offered, all of them specific to Internet Explorer. All-in-all, Apple’s show looks more exciting and “sexier” than Tech Ed. Microsoft’s show seems a lot more enterprise-focused than WWDC. This is, of course, well in sync with each company’s approach and products.  Microsoft’s content is much wider ranging and bests WWDC in sheer volume of sessions and labs.  I suppose some might argue that less is more; others that Apple’s consumer-focused offerings simply don’t provide for the same depth of coverage to a business audience.  Microsoft has a serious focus on the cloud and  a paucity of coverage on client-side Web standards; Apple has virtually no cloud offering at all.  Again, this reflects each tech titan’s go-to-market strategy. My own take is that employees of each company should attend the other’s event.  The amount of mutual exclusivity in content may make sense in terms of corporate philosophy, but the reality is that each company could stand to diversify into the other’s territory, at least somewhat. My own talk at Tech Ed will focus on competitive analysis around Microsoft’s BI products.  Apple does not today figure into that analysis. Maybe one day it will.

    Read the article

  • Parallelism in .NET – Part 11, Divide and Conquer via Parallel.Invoke

    - by Reed
    Many algorithms are easily written to work via recursion.  For example, most data-oriented tasks where a tree of data must be processed are much more easily handled by starting at the root, and recursively “walking” the tree.  Some algorithms work this way on flat data structures, such as arrays, as well.  This is a form of divide and conquer: an algorithm design which is based around breaking up a set of work recursively, “dividing” the total work in each recursive step, and “conquering” the work when the remaining work is small enough to be solved easily. Recursive algorithms, especially ones based on a form of divide and conquer, are often a very good candidate for parallelization. This is apparent from a common sense standpoint.  Since we’re dividing up the total work in the algorithm, we have an obvious, built-in partitioning scheme.  Once partitioned, the data can be worked upon independently, so there is good, clean isolation of data. Implementing this type of algorithm is fairly simple.  The Parallel class in .NET 4 includes a method suited for this type of operation: Parallel.Invoke.  This method works by taking any number of delegates defined as an Action, and operating them all in parallel.  The method returns when every delegate has completed: Parallel.Invoke( () => { Console.WriteLine("Action 1 executing in thread {0}", Thread.CurrentThread.ManagedThreadId); }, () => { Console.WriteLine("Action 2 executing in thread {0}", Thread.CurrentThread.ManagedThreadId); }, () => { Console.WriteLine("Action 3 executing in thread {0}", Thread.CurrentThread.ManagedThreadId); } ); .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Running this simple example demonstrates the ease of using this method.  For example, on my system, I get three separate thread IDs when running the above code.  By allowing any number of delegates to be executed directly, concurrently, the Parallel.Invoke method provides us an easy way to parallelize any algorithm based on divide and conquer.  We can divide our work in each step, and execute each task in parallel, recursively. For example, suppose we wanted to implement our own quicksort routine.  The quicksort algorithm can be designed based on divide and conquer.  In each iteration, we pick a pivot point, and use that to partition the total array.  We swap the elements around the pivot, then recursively sort the lists on each side of the pivot.  For example, let’s look at this simple, sequential implementation of quicksort: public static void QuickSort<T>(T[] array) where T : IComparable<T> { QuickSortInternal(array, 0, array.Length - 1); } private static void QuickSortInternal<T>(T[] array, int left, int right) where T : IComparable<T> { if (left >= right) { return; } SwapElements(array, left, (left + right) / 2); int last = left; for (int current = left + 1; current <= right; ++current) { if (array[current].CompareTo(array[left]) < 0) { ++last; SwapElements(array, last, current); } } SwapElements(array, left, last); QuickSortInternal(array, left, last - 1); QuickSortInternal(array, last + 1, right); } static void SwapElements<T>(T[] array, int i, int j) { T temp = array[i]; array[i] = array[j]; array[j] = temp; } Here, we implement the quicksort algorithm in a very common, divide and conquer approach.  Running this against the built-in Array.Sort routine shows that we get the exact same answers (although the framework’s sort routine is slightly faster).  On my system, for example, I can use framework’s sort to sort ten million random doubles in about 7.3s, and this implementation takes about 9.3s on average. Looking at this routine, though, there is a clear opportunity to parallelize.  At the end of QuickSortInternal, we recursively call into QuickSortInternal with each partition of the array after the pivot is chosen.  This can be rewritten to use Parallel.Invoke by simply changing it to: // Code above is unchanged... SwapElements(array, left, last); Parallel.Invoke( () => QuickSortInternal(array, left, last - 1), () => QuickSortInternal(array, last + 1, right) ); } This routine will now run in parallel.  When executing, we now see the CPU usage across all cores spike while it executes.  However, there is a significant problem here – by parallelizing this routine, we took it from an execution time of 9.3s to an execution time of approximately 14 seconds!  We’re using more resources as seen in the CPU usage, but the overall result is a dramatic slowdown in overall processing time. This occurs because parallelization adds overhead.  Each time we split this array, we spawn two new tasks to parallelize this algorithm!  This is far, far too many tasks for our cores to operate upon at a single time.  In effect, we’re “over-parallelizing” this routine.  This is a common problem when working with divide and conquer algorithms, and leads to an important observation: When parallelizing a recursive routine, take special care not to add more tasks than necessary to fully utilize your system. This can be done with a few different approaches, in this case.  Typically, the way to handle this is to stop parallelizing the routine at a certain point, and revert back to the serial approach.  Since the first few recursions will all still be parallelized, our “deeper” recursive tasks will be running in parallel, and can take full advantage of the machine.  This also dramatically reduces the overhead added by parallelizing, since we’re only adding overhead for the first few recursive calls.  There are two basic approaches we can take here.  The first approach would be to look at the total work size, and if it’s smaller than a specific threshold, revert to our serial implementation.  In this case, we could just check right-left, and if it’s under a threshold, call the methods directly instead of using Parallel.Invoke. The second approach is to track how “deep” in the “tree” we are currently at, and if we are below some number of levels, stop parallelizing.  This approach is a more general-purpose approach, since it works on routines which parse trees as well as routines working off of a single array, but may not work as well if a poor partitioning strategy is chosen or the tree is not balanced evenly. This can be written very easily.  If we pass a maxDepth parameter into our internal routine, we can restrict the amount of times we parallelize by changing the recursive call to: // Code above is unchanged... SwapElements(array, left, last); if (maxDepth < 1) { QuickSortInternal(array, left, last - 1, maxDepth); QuickSortInternal(array, last + 1, right, maxDepth); } else { --maxDepth; Parallel.Invoke( () => QuickSortInternal(array, left, last - 1, maxDepth), () => QuickSortInternal(array, last + 1, right, maxDepth)); } We no longer allow this to parallelize indefinitely – only to a specific depth, at which time we revert to a serial implementation.  By starting the routine with a maxDepth equal to Environment.ProcessorCount, we can restrict the total amount of parallel operations significantly, but still provide adequate work for each processing core. With this final change, my timings are much better.  On average, I get the following timings: Framework via Array.Sort: 7.3 seconds Serial Quicksort Implementation: 9.3 seconds Naive Parallel Implementation: 14 seconds Parallel Implementation Restricting Depth: 4.7 seconds Finally, we are now faster than the framework’s Array.Sort implementation.

    Read the article

  • The Red Gate and .NET Reflector Debacle

    - by Rick Strahl
    About a month ago Red Gate – the company who owns the NET Reflector tool most .NET devs use at one point or another – decided to change their business model for Reflector and take the product from free to a fully paid for license model. As a bit of history: .NET Reflector was originally created by Lutz Roeder as a free community tool to inspect .NET assemblies. Using Reflector you can examine the types in an assembly, drill into type signatures and quickly disassemble code to see how a particular method works.  In case you’ve been living under a rock and you’ve never looked at Reflector, here’s what it looks like drilled into an assembly from disk with some disassembled source code showing: Note that you get tons of information about each element in the tree, and almost all related types and members are clickable both in the list and source view so it’s extremely easy to navigate and follow the code flow even in this static assembly only view. For many year’s Lutz kept the the tool up to date and added more features gradually improving an already amazing tool and making it better. Then about two and a half years ago Red Gate bought the tool from Lutz. A lot of ruckus and noise ensued in the community back then about what would happen with the tool and… for the most part very little did. Other than the incessant update notices with prominent Red Gate promo on them life with Reflector went on. The product didn’t die and and it didn’t go commercial or to a charge model. When .NET 4.0 came out it still continued to work mostly because the .NET feature set doesn’t drastically change how types behave.  Then a month back Red Gate started making noise about a new Version Version 7 which would be commercial. No more free version - and a shit storm broke out in the community. Now normally I’m not one to be critical of companies trying to make money from a product, much less for a product that’s as incredibly useful as Reflector. There isn’t day in .NET development that goes by for me where I don’t fire up Reflector. Whether it’s for examining the innards of the .NET Framework, checking out third party code, or verifying some of my own code and resources. Even more so recently I’ve been doing a lot of Interop work with a non-.NET application that needs to access .NET components and Reflector has been immensely valuable to me (and my clients) if figuring out exact type signatures required to calling .NET components in assemblies. In short Reflector is an invaluable tool to me. Ok, so what’s the problem? Why all the fuss? Certainly the $39 Red Gate is trying to charge isn’t going to kill any developer. If there’s any tool in .NET that’s worth $39 it’s Reflector, right? Right, but that’s not the problem here. The problem is how Red Gate went about moving the product to commercial which borders on the downright bizarre. It’s almost as if somebody in management wrote a slogan: “How can we piss off the .NET community in the most painful way we can?” And that it seems Red Gate has a utterly succeeded. People are rabid, and for once I think that this outrage isn’t exactly misplaced. Take a look at the message thread that Red Gate dedicated from a link off the download page. Not only is Version 7 going to be a paid commercial tool, but the older versions of Reflector won’t be available any longer. Not only that but older versions that are already in use also will continually try to update themselves to the new paid version – which when installed will then expire unless registered properly. There have also been reports of Version 6 installs shutting themselves down and failing to work if the update is refused (I haven’t seen that myself so not sure if that’s true). In other words Red Gate is trying to make damn sure they’re getting your money if you attempt to use Reflector. There’s a lot of temptation there. Think about the millions of .NET developers out there and all of them possibly upgrading – that’s a nice chunk of change that Red Gate’s sitting on. Even with all the community backlash these guys are probably making some bank right now just because people need to get life to move on. Red Gate also put up a Feedback link on the download page – which not surprisingly is chock full with hate mail condemning the move. Oddly there’s not a single response to any of those messages by the Red Gate folks except when it concerns license questions for the full version. It puzzles me what that link serves for other yet than another complete example of failure to understand how to handle customer relations. There’s no doubt that that all of this has caused some serious outrage in the community. The sad part though is that this could have been handled so much less arrogantly and without pissing off the entire community and causing so much ill-will. People are pissed off and I have no doubt that this negative publicity will show up in the sales numbers for their other products. I certainly hope so. Stupidity ought to be painful! Why do Companies do boneheaded stuff like this? Red Gate’s original decision to buy Reflector was hotly debated but at that the time most of what would happen was mostly speculation. But I thought it was a smart move for any company that is in need of spreading its marketing message and corporate image as a vendor in the .NET space. Where else do you get to flash your corporate logo to hordes of .NET developers on a regular basis?  Exploiting that marketing with some goodwill of providing a free tool breeds positive feedback that hopefully has a good effect on the company’s visibility and the products it sells. Instead Red Gate seems to have taken exactly the opposite tack of corporate bullying to try to make a quick buck – and in the process ruined any community goodwill that might have come from providing a service community for free while still getting valuable marketing. What’s so puzzling about this boneheaded escapade is that the company doesn’t need to resort to underhanded tactics like what they are trying with Reflector 7. The tools the company makes are very good. I personally use SQL Compare, Sql Data Compare and ANTS Profiler on a regular basis and all of these tools are essential in my toolbox. They certainly work much better than the tools that are in the box with Visual Studio. Chances are that if Reflector 7 added useful features I would have been more than happy to shell out my $39 to upgrade when the time is right. It’s Expensive to give away stuff for Free At the same time, this episode shows some of the big problems that come with ‘free’ tools. A lot of organizations are realizing that giving stuff away for free is actually quite expensive and the pay back is often very intangible if any at all. Those that rely on donations or other voluntary compensation find that they amount contributed is absolutely miniscule as to not matter at all. Yet at the same time I bet most of those clamouring the loudest on that Red Gate Reflector feedback page that Reflector won’t be free anymore probably have NEVER made a donation to any open source project or free tool ever. The expectation of Free these days is just too great – which is a shame I think. There’s a lot to be said for paid software and having somebody to hold to responsible to because you gave them some money. There’s an incentive –> payback –> responsibility model that seems to be missing from free software (not all of it, but a lot of it). While there certainly are plenty of bad apples in paid software as well, money tends to be a good motivator for people to continue working and improving products. Reasons for giving away stuff are many but often it’s a naïve desire to share things when things are simple. At first it might be no problem to volunteer time and effort but as products mature the fun goes out of it, and as the reality of product maintenance kicks in developers want to get something back for the time and effort they’re putting in doing non-glamorous work. It’s then when products die or languish and this is painful for all to watch. For Red Gate however, I think there was always a pretty good payback from the Reflector acquisition in terms of marketing: Visibility and possible positioning of their products although they seemed to have mostly ignored that option. On the other hand they started this off pretty badly even 2 and a half years back when they aquired Reflector from Lutz with the same arrogant attitude that is evident in the latest episode. You really gotta wonder what folks are thinking in management – the sad part is from advance emails that were circulating, they were fully aware of the shit storm they were inciting with this and I suspect they are banking on the sheer numbers of .NET developers to still make them a tidy chunk of change from upgrades… Alternatives are coming For me personally the single license isn’t a problem, but I actually have a tool that I sell (an interop Web Service proxy generation tool) to customers and one of the things I recommend to use with has been Reflector to view assembly information and to find which Interop classes to instantiate from the non-.NET environment. It’s been nice to use Reflector for this with its small footprint and zero-configuration installation. But now with V7 becoming a paid tool that option is not going to be available anymore. Luckily it looks like the .NET community is jumping to it and trying to fill the void. Amidst the Red Gate outrage a new library called ILSpy has sprung up and providing at least some of the core functionality of Reflector with an open source library. It looks promising going forward and I suspect there will be a lot more support and interest to support this project now that Reflector has gone over to the ‘dark side’…© Rick Strahl, West Wind Technologies, 2005-2011

    Read the article

  • Developing Schema Compare for Oracle (Part 2): Dependencies

    - by Simon Cooper
    In developing Schema Compare for Oracle, one of the issues we came across was the size of the databases. As detailed in my last blog post, we had to allow schema pre-filtering due to the number of objects in a standard Oracle database. Unfortunately, this leads to some quite tricky situations regarding object dependencies. This post explains how we deal with these dependencies. 1. Cross-schema dependencies Say, in the following database, you're populating SchemaA, and synchronizing SchemaA.Table1: SOURCE   TARGET CREATE TABLE SchemaA.Table1 ( Col1 NUMBER REFERENCES SchemaB.Table1(Col1));   CREATE TABLE SchemaA.Table1 ( Col1 VARCHAR2(100) REFERENCES SchemaB.Table1(Col1)); CREATE TABLE SchemaB.Table1 ( Col1 NUMBER PRIMARY KEY);   CREATE TABLE SchemaB.Table1 ( Col1 VARCHAR2(100) PRIMARY KEY); We need to do a rebuild of SchemaA.Table1 to change Col1 from a VARCHAR2(100) to a NUMBER. This consists of: Creating a table with the new schema Inserting data from the old table to the new table, with appropriate conversion functions (in this case, TO_NUMBER) Dropping the old table Rename new table to same name as old table Unfortunately, in this situation, the rebuild will fail at step 1, as we're trying to create a NUMBER column with a foreign key reference to a VARCHAR2(100) column. As we're only populating SchemaA, the naive implementation of the object population prefiltering (sticking a WHERE owner = 'SCHEMAA' on all the data dictionary queries) will generate an incorrect sync script. What we actually have to do is: Drop foreign key constraint on SchemaA.Table1 Rebuild SchemaB.Table1 Rebuild SchemaA.Table1, adding the foreign key constraint to the new table This means that in order to generate a correct synchronization script for SchemaA.Table1 we have to know what SchemaB.Table1 is, and that it also needs to be rebuilt to successfully rebuild SchemaA.Table1. SchemaB isn't the schema that the user wants to synchronize, but we still have to load the table and column information for SchemaB.Table1 the same way as any table in SchemaA. Fortunately, Oracle provides (mostly) complete dependency information in the dictionary views. Before we actually read the information on all the tables and columns in the database, we can get dependency information on all the objects that are either pointed at by objects in the schemas we’re populating, or point to objects in the schemas we’re populating (think about what would happen if SchemaB was being explicitly populated instead), with a suitable query on all_constraints (for foreign key relationships) and all_dependencies (for most other types of dependencies eg a function using another function). The extra objects found can then be included in the actual object population, and the sync wizard then has enough information to figure out the right thing to do when we get to actually synchronize the objects. Unfortunately, this isn’t enough. 2. Dependency chains The solution above will only get the immediate dependencies of objects in populated schemas. What if there’s a chain of dependencies? A.tbl1 -> B.tbl1 -> C.tbl1 -> D.tbl1 If we’re only populating SchemaA, the implementation above will only include B.tbl1 in the dependent objects list, whereas we might need to know about C.tbl1 and D.tbl1 as well, in order to ensure a modification on A.tbl1 can succeed. What we actually need is a graph traversal on the dependency graph that all_dependencies represents. Fortunately, we don’t have to read all the database dependency information from the server and run the graph traversal on the client computer, as Oracle provides a method of doing this in SQL – CONNECT BY. So, we can put all the dependencies we want to include together in big bag with UNION ALL, then run a SELECT ... CONNECT BY on it, starting with objects in the schema we’re populating. We should end up with all the objects that might be affected by modifications in the initial schema we’re populating. Good solution? Well, no. For one thing, it’s sloooooow. all_dependencies, on my test databases, has got over 110,000 rows in it, and the entire query, for which Oracle was creating a temporary table to hold the big bag of graph edges, was often taking upwards of two minutes. This is too long, and would only get worse for large databases. But it had some more fundamental problems than just performance. 3. Comparison dependencies Consider the following schema: SOURCE   TARGET CREATE TABLE SchemaA.Table1 ( Col1 NUMBER REFERENCES SchemaB.Table1(col1));   CREATE TABLE SchemaA.Table1 ( Col1 VARCHAR2(100)); CREATE TABLE SchemaB.Table1 ( Col1 NUMBER PRIMARY KEY);   CREATE TABLE SchemaB.Table1 ( Col1 VARCHAR2(100)); What will happen if we used the dependency algorithm above on the source & target database? Well, SchemaA.Table1 has a foreign key reference to SchemaB.Table1, so that will be included in the source database population. On the target, SchemaA.Table1 has no such reference. Therefore SchemaB.Table1 will not be included in the target database population. In the resulting comparison of the two objects models, what you will end up with is: SOURCE  TARGET SchemaA.Table1 -> SchemaA.Table1 SchemaB.Table1 -> (no object exists) When this comparison is synchronized, we will see that SchemaB.Table1 does not exist, so we will try the following sequence of actions: Create SchemaB.Table1 Rebuild SchemaA.Table1, with foreign key to SchemaB.Table1 Oops. Because the dependencies are only followed within a single database, we’ve tried to create an object that already exists. To fix this we can include any objects found as dependencies in the source or target databases in the object population of both databases. SchemaB.Table1 will then be included in the target database population, and we won’t try and create objects that already exist. All good? Well, consider the following schema (again, only explicitly populating SchemaA, and synchronizing SchemaA.Table1): SOURCE   TARGET CREATE TABLE SchemaA.Table1 ( Col1 NUMBER REFERENCES SchemaB.Table1(col1));   CREATE TABLE SchemaA.Table1 ( Col1 VARCHAR2(100)); CREATE TABLE SchemaB.Table1 ( Col1 NUMBER PRIMARY KEY);   CREATE TABLE SchemaB.Table1 ( Col1 VARCHAR2(100) PRIMARY KEY); CREATE TABLE SchemaC.Table1 ( Col1 NUMBER);   CREATE TABLE SchemaC.Table1 ( Col1 VARCHAR2(100) REFERENCES SchemaB.Table1); Although we’re now including SchemaB.Table1 on both sides of the comparison, there’s a third table (SchemaC.Table1) that we don’t know about that will cause the rebuild of SchemaB.Table1 to fail if we try and synchronize SchemaA.Table1. That’s because we’re only running the dependency query on the schemas we’re explicitly populating; to solve this issue, we would have to run the dependency query again, but this time starting the graph traversal from the objects found in the other database. Furthermore, this dependency chain could be arbitrarily extended.This leads us to the following algorithm for finding all the dependencies of a comparison: Find initial dependencies of schemas the user has selected to compare on the source and target Include these objects in both the source and target object populations Run the dependency query on the source, starting with the objects found as dependents on the target, and vice versa Repeat 2 & 3 until no more objects are found For the schema above, this will result in the following sequence of actions: Find initial dependenciesSchemaA.Table1 -> SchemaB.Table1 found on sourceNo objects found on target Include objects in both source and targetSchemaB.Table1 included in source and target Run dependency query, starting with found objectsNo objects to start with on sourceSchemaB.Table1 -> SchemaC.Table1 found on target Include objects in both source and targetSchemaC.Table1 included in source and target Run dependency query on found objectsNo objects found in sourceNo objects to start with in target Stop This will ensure that we include all the necessary objects to make any synchronization work. However, there is still the issue of query performance; the CONNECT BY on the entire database dependency graph is still too slow. After much sitting down and drawing complicated diagrams, we decided to move the graph traversal algorithm from the server onto the client (which turned out to run much faster on the client than on the server); and to ensure we don’t read the entire dependency graph onto the client we also pull the graph across in bits – we start off with dependency edges involving schemas selected for explicit population, and whenever the graph traversal comes across a dependency reference to a schema we don’t yet know about a thunk is hit that pulls in the dependency information for that schema from the database. We continue passing more dependent objects back and forth between the source and target until no more dependency references are found. This gives us the list of all the extra objects to populate in the source and target, and object population can then proceed. 4. Object blacklists and fast dependencies When we tested this solution, we were puzzled in that in some of our databases most of the system schemas (WMSYS, ORDSYS, EXFSYS, XDB, etc) were being pulled in, and this was increasing the database registration and comparison time quite significantly. After debugging, we discovered that the culprits were database tables that used one of the Oracle PL/SQL types (eg the SDO_GEOMETRY spatial type). These were creating a dependency chain from the database tables we were populating to the system schemas, and hence pulling in most of the system objects in that schema. To solve this we introduced blacklists of objects we wouldn’t follow any dependency chain through. As well as the Oracle-supplied PL/SQL types (MDSYS.SDO_GEOMETRY, ORDSYS.SI_COLOR, among others) we also decided to blacklist the entire PUBLIC and SYS schemas, as any references to those would likely lead to a blow up in the dependency graph that would massively increase the database registration time, and could result in the client running out of memory. Even with these improvements, each dependency query was taking upwards of a minute. We discovered from Oracle execution plans that there were some columns, with dependency information we required, that were querying system tables with no indexes on them! To cut a long story short, running the following query: SELECT * FROM all_tab_cols WHERE data_type_owner = ‘XDB’; results in a full table scan of the SYS.COL$ system table! This single clause was responsible for over half the execution time of the dependency query. Hence, the ‘Ignore slow dependencies’ option was born – not querying this and a couple of similar clauses to drastically speed up the dependency query execution time, at the expense of producing incorrect sync scripts in rare edge cases. Needless to say, along with the sync script action ordering, the dependency code in the database registration is one of the most complicated and most rewritten parts of the Schema Compare for Oracle engine. The beta of Schema Compare for Oracle is out now; if you find a bug in it, please do tell us so we can get it fixed!

    Read the article

  • Matrix Multiplication with C++ AMP

    - by Daniel Moth
    As part of our API tour of C++ AMP, we looked recently at parallel_for_each. I ended that post by saying we would revisit parallel_for_each after introducing array and array_view. Now is the time, so this is part 2 of parallel_for_each, and also a post that brings together everything we've seen until now. The code for serial and accelerated Consider a naïve (or brute force) serial implementation of matrix multiplication  0: void MatrixMultiplySerial(std::vector<float>& vC, const std::vector<float>& vA, const std::vector<float>& vB, int M, int N, int W) 1: { 2: for (int row = 0; row < M; row++) 3: { 4: for (int col = 0; col < N; col++) 5: { 6: float sum = 0.0f; 7: for(int i = 0; i < W; i++) 8: sum += vA[row * W + i] * vB[i * N + col]; 9: vC[row * N + col] = sum; 10: } 11: } 12: } We notice that each loop iteration is independent from each other and so can be parallelized. If in addition we have really large amounts of data, then this is a good candidate to offload to an accelerator. First, I'll just show you an example of what that code may look like with C++ AMP, and then we'll analyze it. It is assumed that you included at the top of your file #include <amp.h> 13: void MatrixMultiplySimple(std::vector<float>& vC, const std::vector<float>& vA, const std::vector<float>& vB, int M, int N, int W) 14: { 15: concurrency::array_view<const float,2> a(M, W, vA); 16: concurrency::array_view<const float,2> b(W, N, vB); 17: concurrency::array_view<concurrency::writeonly<float>,2> c(M, N, vC); 18: concurrency::parallel_for_each(c.grid, 19: [=](concurrency::index<2> idx) restrict(direct3d) { 20: int row = idx[0]; int col = idx[1]; 21: float sum = 0.0f; 22: for(int i = 0; i < W; i++) 23: sum += a(row, i) * b(i, col); 24: c[idx] = sum; 25: }); 26: } First a visual comparison, just for fun: The beginning and end is the same, i.e. lines 0,1,12 are identical to lines 13,14,26. The double nested loop (lines 2,3,4,5 and 10,11) has been transformed into a parallel_for_each call (18,19,20 and 25). The core algorithm (lines 6,7,8,9) is essentially the same (lines 21,22,23,24). We have extra lines in the C++ AMP version (15,16,17). Now let's dig in deeper. Using array_view and extent When we decided to convert this function to run on an accelerator, we knew we couldn't use the std::vector objects in the restrict(direct3d) function. So we had a choice of copying the data to the the concurrency::array<T,N> object, or wrapping the vector container (and hence its data) with a concurrency::array_view<T,N> object from amp.h – here we used the latter (lines 15,16,17). Now we can access the same data through the array_view objects (a and b) instead of the vector objects (vA and vB), and the added benefit is that we can capture the array_view objects in the lambda (lines 19-25) that we pass to the parallel_for_each call (line 18) and the data will get copied on demand for us to the accelerator. Note that line 15 (and ditto for 16 and 17) could have been written as two lines instead of one: extent<2> e(M, W); array_view<const float, 2> a(e, vA); In other words, we could have explicitly created the extent object instead of letting the array_view create it for us under the covers through the constructor overload we chose. The benefit of the extent object in this instance is that we can express that the data is indeed two dimensional, i.e a matrix. When we were using a vector object we could not do that, and instead we had to track via additional unrelated variables the dimensions of the matrix (i.e. with the integers M and W) – aren't you loving C++ AMP already? Note that the const before the float when creating a and b, will result in the underling data only being copied to the accelerator and not be copied back – a nice optimization. A similar thing is happening on line 17 when creating array_view c, where we have indicated that we do not need to copy the data to the accelerator, only copy it back. The kernel dispatch On line 18 we make the call to the C++ AMP entry point (parallel_for_each) to invoke our parallel loop or, as some may say, dispatch our kernel. The first argument we need to pass describes how many threads we want for this computation. For this algorithm we decided that we want exactly the same number of threads as the number of elements in the output matrix, i.e. in array_view c which will eventually update the vector vC. So each thread will compute exactly one result. Since the elements in c are organized in a 2-dimensional manner we can organize our threads in a two-dimensional manner too. We don't have to think too much about how to create the first argument (a grid) since the array_view object helpfully exposes that as a property. Note that instead of c.grid we could have written grid<2>(c.extent) or grid<2>(extent<2>(M, N)) – the result is the same in that we have specified M*N threads to execute our lambda. The second argument is a restrict(direct3d) lambda that accepts an index object. Since we elected to use a two-dimensional extent as the first argument of parallel_for_each, the index will also be two-dimensional and as covered in the previous posts it represents the thread ID, which in our case maps perfectly to the index of each element in the resulting array_view. The kernel itself The lambda body (lines 20-24), or as some may say, the kernel, is the code that will actually execute on the accelerator. It will be called by M*N threads and we can use those threads to index into the two input array_views (a,b) and write results into the output array_view ( c ). The four lines (21-24) are essentially identical to the four lines of the serial algorithm (6-9). The only difference is how we index into a,b,c versus how we index into vA,vB,vC. The code we wrote with C++ AMP is much nicer in its indexing, because the dimensionality is a first class concept, so you don't have to do funny arithmetic calculating the index of where the next row starts, which you have to do when working with vectors directly (since they store all the data in a flat manner). I skipped over describing line 20. Note that we didn't really need to read the two components of the index into temporary local variables. This mostly reflects my personal choice, in some algorithms to break down the index into local variables with names that make sense for the algorithm, i.e. in this case row and col. In other cases it may i,j,k or x,y,z, or M,N or whatever. Also note that we could have written line 24 as: c(idx[0], idx[1])=sum  or  c(row, col)=sum instead of the simpler c[idx]=sum Targeting a specific accelerator Imagine that we had more than one hardware accelerator on a system and we wanted to pick a specific one to execute this parallel loop on. So there would be some code like this anywhere before line 18: vector<accelerator> accs = MyFunctionThatChoosesSuitableAccelerators(); accelerator acc = accs[0]; …and then we would modify line 18 so we would be calling another overload of parallel_for_each that accepts an accelerator_view as the first argument, so it would become: concurrency::parallel_for_each(acc.default_view, c.grid, ...and the rest of your code remains the same… how simple is that? Comments about this post by Daniel Moth welcome at the original blog.

    Read the article

  • How can a single disk in a hardware SATA RAID-10 array bring the entire array to a screeching halt?

    - by Stu Thompson
    Prelude: I'm a code-monkey that's increasingly taken on SysAdmin duties for my small company. My code is our product, and increasingly we provide the same app as SaaS. About 18 months ago I moved our servers from a premium hosting centric vendor to a barebones rack pusher in a tier IV data center. (Literally across the street.) This ment doing much more ourselves--things like networking, storage and monitoring. As part the big move, to replace our leased direct attached storage from the hosting company, I built a 9TB two-node NAS based on SuperMicro chassises, 3ware RAID cards, Ubuntu 10.04, two dozen SATA disks, DRBD and . It's all lovingly documented in three blog posts: Building up & testing a new 9TB SATA RAID10 NFSv4 NAS: Part I, Part II and Part III. We also setup a Cacit monitoring system. Recently we've been adding more and more data points, like SMART values. I could not have done all this without the awesome boffins at ServerFault. It's been a fun and educational experience. My boss is happy (we saved bucket loads of $$$), our customers are happy (storage costs are down), I'm happy (fun, fun, fun). Until yesterday. Outage & Recovery: Some time after lunch we started getting reports of sluggish performance from our application, an on-demand streaming media CMS. About the same time our Cacti monitoring system sent a blizzard of emails. One of the more telling alerts was a graph of iostat await. Performance became so degraded that Pingdom began sending "server down" notifications. The overall load was moderate, there was not traffic spike. After logging onto the application servers, NFS clients of the NAS, I confirmed that just about everything was experiencing highly intermittent and insanely long IO wait times. And once I hopped onto the primary NAS node itself, the same delays were evident when trying to navigate the problem array's file system. Time to fail over, that went well. Within 20 minuts everything was confirmed to be back up and running perfectly. Post-Mortem: After any and all system failures I perform a post-mortem to determine the cause of the failure. First thing I did was ssh back into the box and start reviewing logs. It was offline, completely. Time for a trip to the data center. Hardware reset, backup an and running. In /var/syslog I found this scary looking entry: Nov 15 06:49:44 umbilo smartd[2827]: Device: /dev/twa0 [3ware_disk_00], 6 Currently unreadable (pending) sectors Nov 15 06:49:44 umbilo smartd[2827]: Device: /dev/twa0 [3ware_disk_07], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 171 to 170 Nov 15 06:49:45 umbilo smartd[2827]: Device: /dev/twa0 [3ware_disk_10], 16 Currently unreadable (pending) sectors Nov 15 06:49:45 umbilo smartd[2827]: Device: /dev/twa0 [3ware_disk_10], 4 Offline uncorrectable sectors Nov 15 06:49:45 umbilo smartd[2827]: Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error Nov 15 06:49:45 umbilo smartd[2827]: # 1 Short offline Completed: read failure 90% 6576 3421766910 Nov 15 06:49:45 umbilo smartd[2827]: # 2 Short offline Completed: read failure 90% 6087 3421766910 Nov 15 06:49:45 umbilo smartd[2827]: # 3 Short offline Completed: read failure 10% 5901 656821791 Nov 15 06:49:45 umbilo smartd[2827]: # 4 Short offline Completed: read failure 90% 5818 651637856 Nov 15 06:49:45 umbilo smartd[2827]: So I went to check the Cacti graphs for the disks in the array. Here we see that, yes, disk 7 is slipping away just like syslog says it is. But we also see that disk 8's SMART Read Erros are fluctuating. There are no messages about disk 8 in syslog. More interesting is that the fluctuating values for disk 8 directly correlate to the high IO wait times! My interpretation is that: Disk 8 is experiencing an odd hardware fault that results in intermittent long operation times. Somehow this fault condition on the disk is locking up the entire array Maybe there is a more accurate or correct description, but the net result has been that the one disk is impacting the performance of the whole array. The Question(s) How can a single disk in a hardware SATA RAID-10 array bring the entire array to a screeching halt? Am I being naïve to think that the RAID card should have dealt with this? How can I prevent a single misbehaving disk from impacting the entire array? Am I missing something?

    Read the article

  • NLP with greatly contrained input and abilities

    - by Mike F
    Hat in hand here. I'm a seasoned developer and I would be grateful for a bit of help. I don't have time to read or digest long intricate discussions on theoretical concepts around NLP (or go get my PHD). That said, I have read a few and it's a damn interesting field. The problem is I need real world solutions, for real world products, in real world time frames. The problem I'm having is right now I'm not sure what the right questions are to ask to get started implementing. I believe this is mostly related to vocabulary. I'll read somewhere, a blog post, a forum post, a whitepaper, and it says, I'm doing flooping with the blargy blarg method, and I go google flooping and blargy blarg, and I get references to more obscurity. It seemingly never ends. So, my question is multiphased. First, more generally, how do I become passingly educated on this quickly? Just in time educated. I only need to know what I need to know to take the next step. I've spent 20 years writing code. Explain quick. I'll get it. (I mean provide a reference to something that explains quickly of course). I'm happy to read the right book, but I don't want to read a book where I read the chapter introduction that explains what floopy floop is and then skip over the rest of the chapter with examples of floopy flooping (because now I get what it is). I also don't want to read a book that goes into too much detail with theoretical underpinnings or history. For example, the Jurafsky book seems like way more than I need: http://www.amazon.com/gp/product/0131873210. But I will read it if this is the right book to read. (It's also dang expensive!) I need the root node of the expedited learning tree here, if you will. Point me in the right direction and I'll be quite grateful. I'm expecting quite a lot of firehose drinking - I just need the right firehose. Second, what I need to do is take a single sentence, with a very reduced vocabulary, and get a grammar tree (sorry if this is the wrong terminology) that I can do something with. I know I could easily write this command line input style in c in a more conventional manner, but I need it to be way better than that. But I don't need a chatterbot either. What I'm doing needs to live in a constrained environment. I can't use Python (unfortunately). I can't ship with gigabytes of corpuses. I need any libraries I use to be in c/c++. If I have to write this myself, I will. Hopefully, it will be achievable considering the reduced problem set. Maybe, probably, that's just naive. If so, let me know. :-) Thanks in advance - Mike

    Read the article

  • Designing interfaces: predict methods needed, discipline yourself and deal with code that comes to m

    - by fireeyedboy
    Was: Design by contract: predict methods needed, discipline yourself and deal with code that comes to mind I like the idea of designing by contract a lot (at least, as far as I understand the principal). I believe it means you define intefaces first before you start implementing actual code, right? However, from my limited experience (3 OOP years now) I usually can't resist the urge to start coding pretty early, for several reasons: because my limited experience has shown me I am unable to predict what methods I will be needing in the interface, so I might as well start coding right away. or because I am simply too impatient to write out the whole interfaces first. or when I do try it, I still wind up implementing bits of code already, because I fear I might forget this or that imporant bit of code, that springs to mind when I am designing the interfaces. As you see, especially with the last two points, this leads to a very disorderly way of doing things. Tasks get mixed up. I should draw a clear line between designing interfaces and actual coding. If you, unlike me, are a good/disciplined planner, as intended above, how do you: ...know the majority of methods you will be needing up front so well? Especially if it's components that implement stuff you are not familiar with yet. ...resist the urge to start coding right away? ...deal with code that comes to mind when you are designing the interfaces? UPDATE: Thank you for the answers so far. Valuable insights! And... I stand corrected; it seems I misinterpreted the idea of Design By Contract. For clarity, what I actually meant was: "coming up with interface methods before implementing the actual components". An additional thing that came up in my mind is related to point 1): b) How do you know the majority of components you will be needing. How do you flesh out these things before you start actually coding? For arguments sake, let's say I'm a novice with the MVC pattern, and I wanted to implement such a component/architecture. A naive approach would be to think of: a front controller some abstract action controller some abstract view ... and be done with it, so to speak. But, being more familiar with the MVC pattern, I know now that it makes sense to also have: a request object a router a dispatcher a response object view helpers etc.. etc.. If you map this idea to some completely new component you want to develop, with which you have no experience yet; how do you come up with these sort of additional components without actually coding the thing, and stuble upon the ideas that way? How would you know up front how fine grained some components should be? Is this a matter of disciplining yourself to think it out thoroughly? Or is it a matter of being good at thinking in abstractions?

    Read the article

  • Is the Unix Philosophy still relevant in the Web 2.0 world?

    - by David Titarenco
    Introduction Hello, let me give you some background before I begin. I started programming when I was 5 or 6 on my dad's PSION II (some primitive BASIC-like language), then I learned more and more, eventually inching my way up to C, C++, Java, PHP, JS, etc. I think I'm a pretty decent coder. I think most people would agree. I'm not a complete social recluse, but I do stuff like write a virtual machine for fun. I've never taken a computer course in college because I've been in and out for the past couple of years and have only been taking core classes; never having been particularly amazing at school, perhaps I'm missing some basic tenet that most learn in CS101. I'm currently reading Coders at Work and this question is based on some ideas I read in there. A Brief (Fictionalized) Example So a certain sunny day I get an idea. I hire a designer and hammer away at some C/C++ code for a couple of months, soon thereafter releasing silvr.com, a website that transmutes lead into silver. Yep, I started my very own start-up and even gave it a clever web 2.0 name with a vowel missing. Mom and dad are proud. I come up with some numbers I should be seeing after 1, 2, 3, 6, 9, 12 months and set sail. Obviously, my transmuting server isn't perfect, sometimes it segfaults, sometimes it leaks memory. I fix it and keep truckin'. After all, gdb is my best friend. Eventually, I'm at a position where a very small community of people are happily transmuting lead into silver on a semi-regular basis, but they want to let their friends on MySpace know how many grams of lead they transmuted today. And they want to post images of their lead and silver nuggets on flickr. I'm losing out on potential traffic unless I let them log in with their Yahoo, Google, and Facebook accounts. They want webcam support and live cock fighting, merry-go-rounds and Jabberwockies. All these things seem necessary. The Aftermath Of course, I have to re-write the transmuting server! After all, I've been losing money all these months. I need OAuth libraries and OpenID libraries, JSON support, and the only stable Jabberwocky API is for Java. C++ isn't even an option anymore. I'm just one guy! The Java binary just grows and grows since I need some legacy Apache include for the JSON library, and some antiquated Sun dependency for OAuth support. Then I pick up a book like Coders at Work and read what people like jwz say about complexity... I think to myself.. Keep it simple, stupid. I like simple things. I've always loved the Unix Philosophy but even after trying to keep the new server source modular and sleek, I loathe having to write one more line of code. It feels that I'm just piling crap on top of other crap. Maybe I'm naive thinking every piece of software can be simple and clever. Maybe it's just a phase.. or is the Unix Philosophy basically dead when it comes to the current state of (web) development? I'm just kind of disheartened :(

    Read the article

  • Is it possible to implement bitwise operators using integer arithmetic?

    - by Statement
    Hello World! I am facing a rather peculiar problem. I am working on a compiler for an architecture that doesn't support bitwise operations. However, it handles signed 16 bit integer arithmetics and I was wondering if it would be possible to implement bitwise operations using only: Addition (c = a + b) Subtraction (c = a - b) Division (c = a / b) Multiplication (c = a * b) Modulus (c = a % b) Minimum (c = min(a, b)) Maximum (c = max(a, b)) Comparisons (c = (a < b), c = (a == b), c = (a <= b), et.c.) Jumps (goto, for, et.c.) The bitwise operations I want to be able to support are: Or (c = a | b) And (c = a & b) Xor (c = a ^ b) Left Shift (c = a << b) Right Shift (c = a b) (All integers are signed so this is a problem) Signed Shift (c = a b) One's Complement (a = ~b) (Already found a solution, see below) Normally the problem is the other way around; how to achieve arithmetic optimizations using bitwise hacks. However not in this case. Writable memory is very scarce on this architecture, hence the need for bitwise operations. The bitwise functions themselves should not use a lot of temporary variables. However, constant read-only data & instruction memory is abundant. A side note here also is that jumps and branches are not expensive and all data is readily cached. Jumps cost half the cycles as arithmetic (including load/store) instructions do. On other words, all of the above supported functions cost twice the cycles of a single jump. Some thoughts that might help: I figured out that you can do one's complement (negate bits) with the following code: // Bitwise one's complement b = ~a; // Arithmetic one's complement b = -1 - a; I also remember the old shift hack when dividing with a power of two so the bitwise shift can be expressed as: // Bitwise left shift b = a << 4; // Arithmetic left shift b = a * 16; // 2^4 = 16 // Signed right shift b = a >>> 4; // Arithmetic right shift b = a / 16; For the rest of the bitwise operations I am slightly clueless. I wish the architects of this architecture would have supplied bit-operations. I would also like to know if there is a fast/easy way of computing the power of two (for shift operations) without using a memory data table. A naive solution would be to jump into a field of multiplications: b = 1; switch (a) { case 15: b = b * 2; case 14: b = b * 2; // ... exploting fallthrough (instruction memory is magnitudes larger) case 2: b = b * 2; case 1: b = b * 2; } Or a Set & Jump approach: switch (a) { case 15: b = 32768; break; case 14: b = 16384; break; // ... exploiting the fact that a jump is faster than one additional mul // at the cost of doubling the instruction memory footprint. case 2: b = 4; break; case 1: b = 2; break; }

    Read the article

  • Rx IObservable buffering to smooth out bursts of events

    - by Dan
    I have an Observable sequence that produces events in rapid bursts (ie: five events one right after another, then a long delay, then another quick burst of events, etc.). I want to smooth out these bursts by inserting a short delay between events. Imagine the following diagram as an example: Raw: --oooo--------------ooooo-----oo----------------ooo| Buffered: --o--o--o--o--------o--o--o--o--o--o--o---------o--o--o| My current approach is to generate a metronome-like timer via Observable.Interval() that signals when it's ok to pull another event from the raw stream. The problem is that I can't figure out how to then combine that timer with my raw unbuffered observable sequence. IObservable.Zip() is close to doing what I want, but it only works so long as the raw stream is producing events faster than the timer. As soon as there is a significant lull in the raw stream, the timer builds up a series of unwanted events that then immediately pair up with the next burst of events from the raw stream. Ideally, I want an IObservable extension method with the following function signature that produces the bevaior I've outlined above. Now, come to my rescue StackOverflow :) public static IObservable<T> Buffered(this IObservable<T> src, TimeSpan minDelay) PS. I'm brand new to Rx, so my apologies if this is a trivially simple question... 1. Simple yet flawed approach Here's my initial naive and simplistic solution that has quite a few problems: public static IObservable<T> Buffered<T>(this IObservable<T> source, TimeSpan minDelay) { Queue<T> q = new Queue<T>(); source.Subscribe(x => q.Enqueue(x)); return Observable.Interval(minDelay).Where(_ => q.Count > 0).Select(_ => q.Dequeue()); } The first obvious problem with this is that the IDisposable returned by the inner subscription to the raw source is lost and therefore the subscription can't be terminated. Calling Dispose on the IDisposable returned by this method kills the timer, but not the underlying raw event feed that is now needlessly filling the queue with nobody left to pull events from the queue. The second problem is that there's no way for exceptions or end-of-stream notifications to be propogated through from the raw event stream to the buffered stream - they are simply ignored when subscribing to the raw source. And last but not least, now I've got code that wakes up periodically regardless of whether there is actually any work to do, which I'd prefer to avoid in this wonderful new reactive world. 2. Way overly complex appoach To solve the problems encountered in my initial simplistic approach, I wrote a much more complicated function that behaves much like IObservable.Delay() (I used .NET Reflector to read that code and used it as the basis of my function). Unfortunately, a lot of the boilerplate logic such as AnonymousObservable is not publicly accessible outside the system.reactive code, so I had to copy and paste a lot of code. This solution appears to work, but given its complexity, I'm less confident that its bug free. I just can't believe that there isn't a way to accomplish this using some combination of the standard Reactive extensions. I hate feeling like I'm needlessly reinventing the wheel, and the pattern I'm trying to build seems like a fairly standard one.

    Read the article

  • In XSLT is it possible to use the value of an xpath expression in a call to a template using an par

    - by Cell
    I am performing an xsl transform and in it I call a template with a param using the following code <xsl:call-template name="GenerateColumns"> <xsl:with-param name="curRow" select="$curRow"/> <xsl:with-param name="curCol" select="$curCol + 1"/> </xsl:call-template> This calls a template function which outputs part of a table element in HTML. The curRow and curCol are used to determine which row and column we are in the table. gbl_maxCols is set to the number of columns in an html table <xsl:template name="GenerateColumns"> <xsl:when test="$curCol &lt;= $gbl_maxCols"> <td> <xsl:attribute="colspan"> <xsl:value-of select="/page/column/@skipColumns"/> </xsl:attribute> </xsl:when> </xsl:template> The result of this function is a set of td elements, however some of these elements (those with a skipColumn attribute greater than 1 span more than 1 column, I need to skip this many columns with the next call to generateColumns. this works just like I would expect in the case where I simply increment the curCol param but I have a case where I need to use the value from the xml attribute skipColumns in the math to calculate the value for curCol. In the above case I iterate through all the columns and this works for the majority of my use cases. However in same cases I need to skip over some of the columns and need to pass in that value from the xml attribute to calculate how many columns I need to skip. My naive first attempt was something like this <xsl:call-template name="GenerateColumns"> <xsl:with-param name="curRow" select="$curRow"/> <xsl:with-param name="curCol" select="$curCol + /page/column/@skipColumns"/> </xsl:call-template> But unforutnately this does not seem to work. Is there any way to use an attribute from an xml page in the calculation for the value of a param in xsl. My xml page is something like this (edited heavily since the xml file is rather large) <page> <column name="blank" skipColumns="1"/> <column name="blank" skipColumns="1"/> <column name="test" skipColumns="3"/> <column name="blank" skipColumns="1"/> <column name="test2" skipColumns="6"/> </page> after all of this I would like to have a set of td elements like the following <td></td><td></td><td colSpan="3"></td><td></td><td colSpan="6"></td> if I just iterate through the columns I instead end up with something like this which gives me more td elements than I should have <td></td><td></td><td colSpan="3"></td><td></td><td colSpan="6"></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td> Edited to provide more information

    Read the article

  • Python: Memory usage and optimization when modifying lists

    - by xApple
    The problem My concern is the following: I am storing a relativity large dataset in a classical python list and in order to process the data I must iterate over the list several times, perform some operations on the elements, and often pop an item out of the list. It seems that deleting one item out of a Python list costs O(N) since Python has to copy all the items above the element at hand down one place. Furthermore, since the number of items to delete is approximately proportional to the number of elements in the list this results in an O(N^2) algorithm. I am hoping to find a solution that is cost effective (time and memory-wise). I have studied what I could find on the internet and have summarized my different options below. Which one is the best candidate ? Keeping a local index: while processingdata: index = 0 while index < len(somelist): item = somelist[index] dosomestuff(item) if somecondition(item): del somelist[index] else: index += 1 This is the original solution I came up with. Not only is this not very elegant, but I am hoping there is better way to do it that remains time and memory efficient. Walking the list backwards: while processingdata: for i in xrange(len(somelist) - 1, -1, -1): dosomestuff(item) if somecondition(somelist, i): somelist.pop(i) This avoids incrementing an index variable but ultimately has the same cost as the original version. It also breaks the logic of dosomestuff(item) that wishes to process them in the same order as they appear in the original list. Making a new list: while processingdata: for i, item in enumerate(somelist): dosomestuff(item) newlist = [] for item in somelist: if somecondition(item): newlist.append(item) somelist = newlist gc.collect() This is a very naive strategy for eliminating elements from a list and requires lots of memory since an almost full copy of the list must be made. Using list comprehensions: while processingdata: for i, item in enumerate(somelist): dosomestuff(item) somelist[:] = [x for x in somelist if somecondition(x)] This is very elegant but under-the-cover it walks the whole list one more time and must copy most of the elements in it. My intuition is that this operation probably costs more than the original del statement at least memory wise. Keep in mind that somelist can be huge and that any solution that will iterate through it only once per run will probably always win. Using the filter function: while processingdata: for i, item in enumerate(somelist): dosomestuff(item) somelist = filter(lambda x: not subtle_condition(x), somelist) This also creates a new list occupying lots of RAM. Using the itertools' filter function: from itertools import ifilterfalse while processingdata: for item in itertools.ifilterfalse(somecondtion, somelist): dosomestuff(item) This version of the filter call does not create a new list but will not call dosomestuff on every item breaking the logic of the algorithm. I am including this example only for the purpose of creating an exhaustive list. Moving items up the list while walking while processingdata: index = 0 for item in somelist: dosomestuff(item) if not somecondition(item): somelist[index] = item index += 1 del somelist[index:] This is a subtle method that seems cost effective. I think it will move each item (or the pointer to each item ?) exactly once resulting in an O(N) algorithm. Finally, I hope Python will be intelligent enough to resize the list at the end without allocating memory for a new copy of the list. Not sure though. Abandoning Python lists: class Doubly_Linked_List: def __init__(self): self.first = None self.last = None self.n = 0 def __len__(self): return self.n def __iter__(self): return DLLIter(self) def iterator(self): return self.__iter__() def append(self, x): x = DLLElement(x) x.next = None if self.last is None: x.prev = None self.last = x self.first = x self.n = 1 else: x.prev = self.last x.prev.next = x self.last = x self.n += 1 class DLLElement: def __init__(self, x): self.next = None self.data = x self.prev = None class DLLIter: etc... This type of object resembles a python list in a limited way. However, deletion of an element is guaranteed O(1). I would not like to go here since this would require massive amounts of code refactoring almost everywhere.

    Read the article

  • Faster method for Matrix vector product for large matrix in C or C++ for use in GMRES

    - by user35959
    I have a large, dense matrix A, and I aim to find the solution to the linear system Ax=b using an iterative method (in MATLAB was the plan using its built in GMRES). For more than 10,000 rows, this is too much for my computer to store in memory, but I know that the entries in A are constructed by two known vectors x and y of length N and the entries satisfy: A(i,j) = .5*(x[i]-x[j])^2+([y[i]-y[j])^2 * log(x[i]-x[j])^2+([y[i]-y[j]^2). MATLAB's GMRES command accepts as input a function call that can compute the matrix vector product A*x, which allows me to handle larger matrices than I can store in memory. To write the matrix-vecotr product function, I first tried this in matlab by going row by row and using some vectorization, but I avoid spawning the entire array A (since it would be too large). This was fairly slow unfortnately in my application for GMRES. My plan was to write a mex file for MATLAB to, which is in C, and ideally should be significantly faster than the matlab code. I'm rather new to C, so this went rather poorly and my naive attempt at writing the code in C was slower than my partially vectorized attempt in Matlab. #include <math.h> #include "mex.h" void Aproduct(double *x, double *ctrs_x, double *ctrs_y, double *b, mwSize n) { mwSize i; mwSize j; double val; for (i=0; i<n; i++) { for (j=0; j<i; j++) { val = pow(ctrs_x[i]-ctrs_x[j],2)+pow(ctrs_y[i]-ctrs_y[j],2); b[i] = b[i] + .5* val * log(val) * x[j]; } for (j=i+1; j<n; j++) { val = pow(ctrs_x[i]-ctrs_x[j],2)+pow(ctrs_y[i]-ctrs_y[j],2); b[i] = b[i] + .5* val * log(val) * x[j]; } } } The above is the computational portion of the code for the matlab mex file (which is slightly modified C, if I understand correctly). Please note that I skip the case i=j, since in that case the variable val will be a 0*log(0), which should be interpreted as 0 for me, so I just skip it. Is there a more efficient or faster way to write this? When I call this C function via the mex file in matlab, it is quite slow, slower even than the matlab method I used. This surprises me since I suspected that C code should be much faster than matlab. The alternative matlab method which is partially vectorized that I am comparing it with is function Ax = Aprod(x,ctrs) n = length(x); Ax = zeros(n,1); for j=1:(n-3) v = .5*((ctrs(j,1)-ctrs(:,1)).^2+(ctrs(j,2)-ctrs(:,2)).^2).*log((ctrs(j,1)-ctrs(:,1)).^2+(ctrs(j,2)-ctrs(:,2)).^2); v(j)=0; Ax(j) = dot(v,x(1:n-3); end (the n-3 is because there is actually 3 extra components, but they are dealt with separately,so I excluded that code). This is partly vectorized and only needs one for loop, so it makes some sense that it is faster. However, I was hoping I could go even faster with C+mex file. Any suggestions or help would be greatly appreciated! Thanks! EDIT: I should be more clear. I am open to any faster method that can help me use GMRES to invert this matrix that I am interested in, which requires a faster way of doing the matrix vector product without explicitly loading the array into memory. Thanks!

    Read the article

  • RSA encryption/ Decryption in a client server application

    - by user308806
    Hi guys, probably missing something very straight forward on this, but please forgive me, I'm very naive! Have a client server application where the client identifies its self with an RSA encrypted username & password. Unfortunately I'm getting a "bad padding exception: data must start with zero" when i try to decrypt with the public key on the client side. I'm fairly sure the key is correct as I have tested encrypting with public key then decrypting with private key on the client side with no problems at all. Just seems when I transfer it over the connection it messses it up somehow?! Using PrintWriter & BufferedReader on the sockets if thats of importance. EncodeBASE64 & DecodeBASE64 encode byte[] to 64base and vice versa respectively. Any ideas guys?? Client side: Socket connectionToServer = new Socket("127.0.0.1", 7050); InputStream in = connectionToServer.getInputStream(); DataInputStream dis = new DataInputStream(in); int length = dis.readInt(); byte[] data = new byte[length]; // dis.readFully(data); dis.read(data); System.out.println("The received Data*****************************************"); System.out.println("The length of bits "+ length); System.out.println(data); System.out.println("***********************************************************"); Decryption d = new Decryption(); byte [] ttt = d.decrypt(data); System.out.print(data); String ss = new String(ttt); System.out.println("***********************"); System.out.println(ss); System.out.println("************************"); Server Side: in = connectionFromClient.getInputStream(); OutputStream out = connectionFromClient.getOutputStream(); DataOutputStream dataOut = new DataOutputStream(out); LicenseList licenses = new LicenseList(); String ValidIDs = licenses.getAllIDs(); System.out.println(ValidIDs); Encryption enc = new Encryption(); byte[] encrypted = enc.encrypt(ValidIDs); byte[] dd = enc.encrypt(ValidIDs); String tobesent = new String(dd); //byte[] rsult = enc.decrypt(dd); //String tt = String(rsult); System.out.println("The sent data**********************************************"); System.out.println(dd); String temp = new String(dd); System.out.println(temp); System.out.println("*************************************************************"); //BufferedWriter bf = new BufferedWriter(OutputStreamWriter(out)); //dataOut.write(ValidIDs.getBytes().length); dataOut.writeInt(ValidIDs.getBytes().length); dataOut.flush(); dataOut.write(encrypted); dataOut.flush(); System.out.println("********Testing**************"); System.out.println("Here are the ids:::"); System.out.println(licenses.getAllIDs()); System.out.println("**********************"); //bw.write("it is working well\n");

    Read the article

  • How to develop an english .com domain value rating algorithm?

    - by Tom
    I've been thinking about an algorithm that should rougly be able to guess the value of an english .com domain in most cases. For this to work I want to perform tests that consider the strengths and weaknesses of an english .com domain. A simple point based system is what I had in mind, where each domain property can be given a certain weight to factor it's importance in. I had these properties in mind: domain character length Eg. initially 20 points are added. If the domain has 4 or less characters, no points are substracted. For each extra character, one or more points are substracted on an exponential basis (the more characters, the higher the penalty). domain characters Eg. initially 20 points are added. If the domain is only alphabetic, no points are substracted. For each non-alhabetic character, X points are substracted (exponential increase again). domain name words Scans through a big offline english database, including non-formal speech, eg. words like "tweet" should be recognized. Question 1 : where can I get a modern list of english words for use in such application? Are these lists available for free? Are there lists like these with non-formal words? The more words are found per character, the more points are added. So, a domain with a lot of characters will still not get a lot of points. words hype-level I believe this is a tricky one, but this should be the cause to differentiate perfect but boring domains from perfect and interesting domains. For example, the following domain is probably not that valueable: www.peanutgalaxy.com The algorithm should identify that peanuts and galaxies are not very popular topics on the web. This is just an example. On the other side, a domain like www.shopdeals.com should ring a bell to the hype test, as shops and deals are quite popular on the web. My initial thought would be to see how often these keywords are references to on the web, preferably with some database. Question 2: is this logic flawed, or does this hype level test have merit? Question 3: are such "hype databases" available? Or is there anything else that could work offline? The problem with eg. a query to google is that it requires a lot of requests due to the many domains to be tested. domain name spelling mistakes Domains like "freemoneyz.com" etc. are generally (notice I am making a lot of assumptions in this post but that's necessary I believe) not valueable due to the spelling mistakes. Question 4: are there any offline APIs available to check for spelling mistakes, preferably in javascript or some database that I can use interact with myself. Or should a word list help here as well? use of consonants, vowels etc. A domain that is easy to pronounce (eg. Google) is usually much more valueable than one that is not (eg. Gkyld). Question 5: how does one test for such pronuncability? Do you check for consonants, vowels, etc.? What does a valueable domain have? Has there been any work in this field, where should I look? That is what I came up with, which leads me to my final two questions. Question 6: can you think of any more english .com domain strengths or weaknesses? Which? How would you implement these? Question 7: do you believe this idea has any merit or all, or am I too naive? Anything I should know, read or hear about? Suggestions/comments? Thanks!

    Read the article

  • How can I clear content without getting the dreaded "stop running this script?" dialog?

    - by Cheeso
    I have a div, that holds a div. like this: <div id='reportHolder' class='column'> <div id='report'> </div> </div> Within the inner div, I add a bunch (7-12) of pairs of a and div elements, like this: <h4><a>Heading1</a></h4> <div> ...content here....</div> The total size of the content, is maybe 200k. Each div just contains a fragment of HTML. Within it, there are numerous <span> elements, containing other html elements, and they nest, to maybe 5-8 levels deep. Nothing really extraordinary, I don't think. After I add all the content, I then create an accordion. like this: $('#report').accordion({collapsible:true, active:false}); This all works fine. The problem is, when I try to clear or remove the report div, it takes a looooooong time, and I get 3 or 4 popups asking "Do you want to stop running this script?" I have tried several ways: option 1: $('#report').accordion('destroy'); $('#report').remove(); $("#reportHolder").html("<div id='report'> </div>"); option 2: $('#report').accordion('destroy'); $('#report').html(''); $("#reportHolder").html("<div id='report'> </div>"); option 3: $('#report').accordion('destroy'); $("#reportHolder").html("<div id='report'> </div>"); after getting a suggestion in the comment, I also tried: option 4: $('#report').accordion('destroy'); $('#report').empty(); $("#reportHolder").html("<div id='report'> </div>"); No matter what, it hangs for a long while. The call to accordion('destroy') seems to not be the source of the delay. It's the erasure of the html content within the report div. This is jQuery 1.3.2. EDIT - fixed code typo. ps: this happens on FF3.5 as well as IE8 . Questions: What is taking so long? How can I remove content more quickly? Addendum I broke into the debugger in FF, during "option 4", and the stacktrace I see is: data() trigger() triggerHandler() add() each() each() add() empty() each() each() (?)() // <<-- this is the call to empty() ResetUi() // <<-- my code onclick I don't understand why add() is in the stack. I am removing content, not adding it. I'm afraid that in the context of the remove (all), jQuery does something naive. Like it grabs the html content, does the text replace to remove one html element, then calls .add() to put back what remains. Is there a way to tell jQuery to NOT propagate events when removing HTML content from the dom?

    Read the article

  • What language/framework (technology) to use for website (flash games portal)

    - by cripox
    Hello, I know there are a lot of similar questions on the net, but because I am a newbie in web development I didn't find the solution for my specific problem. I am planing on creating a flash games portal from scratch. It is a big chance that there will be big traffic from the beginning (millions of pageviews). I want to reduce the server costs as much as possible but in the same time to not be tide to an expensive contract as there is a chance that the project will not be as successfully as I want and in that case the money would be very little. The question is : what technology to use? I don't know any web dev technology yet so it doesn't matter what I will learn. My web dev experience is a little php 8 years ago, and from then I programmed in C++ / Java- game and mobile development. I like Java and C syntax and language very much and I tend to dislike dynamic typing or non robust scripting (like php)- but I can get along if these are the best choices. The candidates are now: - Grails (my best for now) Ruby on Rails Cake PHP Other technologies (Google App Engine, Python/Django etc...) I was considering at first using pure C and compiling the web app in the server- just to squeeze more from the servers, but soon I understand that this is overkill. Next my eyes came on Ruby - as there is a lot of buzz for it's easiness of use. Next I discovered Grails and looked at Java because it is said that it is "faster". But I don't know what this "Faster" really means on my needs, so here comes the first question: 1) What will be my biggest consumption on the server, other than bandwidth, for a lot of flash content requests? Is it memory? I heard that Java needs a lot of memory, but is faster. Is it CPU? I am planning to take some daily VPS.NET nodes at first, to see if there is a demand, and if the "spike" is permanent to move to a dedicated server (serverloft.com has some good offers), else to remain with less nodes. I was also considering developing in Google App Engine- cheap or free hosting to use at first - so I can test my assumption- and also very easy to use (no need for sys administration) but the costs became high if used more ( 3 million games played / month .. x mb/ each). And the issue with Google is that it looks me in this technology. My other concern is scalability (not only for traffic/users, but as adding functionality) My plans are to release a functional site in just 4 weeks (just the basics frontend and some quick basic backend - so I can be able to modify some things and add games manually) - but then to raise it and add more things to it. I am planning to take a little different approach than other portals so I need to write it from scratch (a script will not do). 2) Will Grails take much more resources than RoR or Php server wise? I heard that making it on Java stack will be hardware expensive and is overkill if you don't make a bank application. My application will not be very complex (I hope and i will try to) but will have a lot of traffic. I also took in account using CDN for files, but the cheapest CDN found was 5c/GB (vps.net) and the cost per gb on serverloft (http://www.serverloft.com/dedizierte-server/server-details.php?products=4) is only 1.79 cents/GB and comes with the other resources either. I am new to this domain (web). I am learning the ropes and searching on the web for ~half of year but don't have any really practical experience, so I know that I must have some naive thinking and other issues that i don't know from now, so please give me any advice you want regarding anything, not just the specific questions asked. And thank you so much for such great community!

    Read the article

< Previous Page | 13 14 15 16 17 18  | Next Page >