Search Results

Search found 1325 results on 53 pages for 'factor'.

Page 53/53 | < Previous Page | 49 50 51 52 53 

  • Down Tools Week Cometh: Kissing Goodbye to CVs/Resumes and Cover Letters

    - by Bart Read
    I haven't blogged about what I'm doing in my (not so new) temporary role as Red Gate's technical recruiter, mostly because it's been routine, business as usual stuff, and because I've been trying to understand the role by doing it. I think now though the time has come to get a little more radical, so I'm going to tell you why I want to largely eliminate CVs/resumes and cover letters from the application process for some of our technical roles, and why I think that might be a good thing for candidates (and for us). I have a terrible confession to make, or at least it's a terrible confession for a recruiter: I don't really like CV sifting, or reading cover letters, and, unless I've misread the mood around here, neither does anybody else. It's dull, it's time-consuming, and it's somewhat soul destroying because, when all is said and done, you're being paid to be incredibly judgemental about people based on relatively little information. I feel like I've dirtied myself by saying that - I mean, after all, it's a core part of my job - but it sucks, it really does. (And, of course, the truth is I'm still a software engineer at heart, and I'm always looking for ways to do things better.) On the flip side, I've never met anyone who likes writing their CV. It takes hours and hours of faffing around and massaging it into shape, and the whole process is beset by a gnawing anxiety, frustration, and insecurity. All you really want is a chance to demonstrate your skills - not just talk about them - and how do you do that in a CV or cover letter? Often the best candidates will include samples of their work (a portfolio, screenshots, links to websites, product downloads, etc.), but sometimes this isn't possible, or may not be appropriate, or you just don't think you're allowed because of what your school/university careers service has told you (more commonly an issue with grads, obviously). And what are we actually trying to find out about people with all of this? I think the common criteria are actually pretty basic: Smart Gets things done (thanks for these two Joel) Not an a55hole* (sorry, have to get around Simple Talk's swear filter - and thanks to Professor Robert I. Sutton for this one) *Of course, everyone has off days, and I don't honestly think we're too worried about somebody being a bit grumpy every now and again. We can do a bit better than this in the context of the roles I'm talking about: we can be more specific about what "gets things done" means, at least in part. For software engineers and interns, the non-exhaustive meaning of "gets things done" is: Excellent coder For test engineers, the non-exhaustive meaning of "gets things done" is: Good at finding problems in software Competent coder Team player, etc., to me, are covered by "not an a55hole". I don't expect people to be the life and soul of the party, or a wild extrovert - that's not what team player means, and it's not what "not an a55hole" means. Some of our best technical staff are quiet, introverted types, but they're still pleasant to work with. My problem is that I don't think the initial sift really helps us find out whether people are smart and get things done with any great efficacy. It's better than nothing, for sure, but it's not as good as it could be. It's also contentious, and potentially unfair/inequitable - if you want to get an idea of what I mean by this, check out the background information section at the bottom. Before I go any further, let's look at the Red Gate recruitment process for technical staff* as it stands now: (LOTS of) People apply for jobs. All these applications go through a brutal process of manual sifting, which eliminates between 75 and 90% of them, depending upon the role, and the time of year**. Depending upon the role, those who pass the sift will be sent an assessment or telescreened. For the purposes of this blog post I'm only interested in those that are sent some sort of programming assessment, or bug hunt. This means software engineers, test engineers, and software interns, which are the roles for which I receive the most applications. The telescreen tends to be reserved for project or product managers. Those that pass the assessment are invited in for first interview. This interview is mostly about assessing their technical skills***, although we're obviously on the look out for cultural fit red flags as well. If the first interview goes well we'll invite candidates back for a second interview. This is where team/cultural fit is really scoped out. We also use this interview to dive more deeply into certain areas of their skillset, and explore any concerns that may have come out of the first interview (these obviously won't have been serious or obvious enough to cause a rejection at that point, but are things we do need to look into before we'd consider making an offer). We might subsequently invite them in for lunch before we make them an offer. This tends to happen when we're recruiting somebody for a specific team and we'd like them to meet all the people they'll be working with directly. It's not an interview per se, but can prove pivotal if they don't gel with the team. Anyone who's made it this far will receive an offer from us. *We have a slightly quirky definition of "technical staff" as it relates to the technical recruiter role here. It includes software engineers, test engineers, software interns, user experience specialists, technical authors, project managers, product managers, and development managers, but does not include product support or information systems roles. **For example, the quality of graduate applicants overall noticeably drops as the academic year wears on, which is not to say that by now there aren't still stars in there, just that they're fewer and further between. ***Some organisations prefer to assess for team fit first, but I think assessing technical skills is a more effective initial filter - if they're the nicest person in the world, but can't cut a line of code they're not going to work out. Now, as I suggested in the title, Red Gate's Down Tools Week is upon us once again - next week in fact - and I had proposed as a project that we refactor and automate the first stage of marking our programming assessments. Marking assessments, and in fact organising the marking of them, is a somewhat time-consuming process, and we receive many assessment solutions that just don't make the cut, for whatever reason. Whilst I don't think it's possible to fully automate marking, I do think it ought to be possible to run a suite of automated tests over each candidate's solution to see whether or not it behaves correctly and, if it does, move on to a manual stage where we examine the code for structure, decomposition, style, readability, maintainability, etc. Obviously it's possible to use tools to generate potentially helpful metrics for some of these indices as well. This would obviously reduce the marking workload, and would provide candidates with quicker feedback about whether they've been successful - though I do wonder if waiting a tactful interval before sending a (nicely written) rejection might be wise. I duly scrawled out a picture of my ideal process, which looked like this: The problem is, as soon as I'd roughed it out, I realised that fundamentally it wasn't an ideal process at all, which explained the gnawing feeling of cognitive dissonance I'd been wrestling with all week, whilst I'd been trying to find time to do this. Here's what I mean. Automated assessment marking, and the associated infrastructure around that, makes it much easier for us to deal with large numbers of assessments. This means we can be much more permissive about who we send assessments out to or, in other words, we can give more candidates the opportunity to really demonstrate their skills to us. And this leads to a question: why not give everyone the opportunity to demonstrate their skills, to show that they're smart and can get things done? (Two or three of us even discussed this in the down tools week hustings earlier this week.) And isn't this a lot simpler than the alternative we'd been considering? (FYI, this was automated CV/cover letter sifting by some form of textual analysis to ideally eliminate the worst 50% or so of applications based on an analysis of the 20,000 or so historical applications we've received since 2007 - definitely not the basic keyword analysis beloved of recruitment agencies, since this would eliminate hardly anyone who was awful, but definitely would eliminate stellar Oxbridge candidates - #fail - or some nightmarishly complex Google-like system where we profile all our currently employees, only to realise that we're never going to get representative results because we don't have a statistically significant sample size in any given role - also #fail.) No, I think the new way is better. We let people self-select. We make them the masters (or mistresses) of their own destiny. We give applicants the power - we put their fate in their hands - by giving them the chance to demonstrate their skills, which is what they really want anyway, instead of requiring that they spend hours and hours creating a CV and cover letter that I'm going to evaluate for suitability, and make a value judgement about, in approximately 1 minute (give or take). It doesn't matter what university you attended, it doesn't matter if you had a bad year when you took your A-levels - here's your chance to shine, so take it and run with it. (As a side benefit, we cut the number of applications we have to sift by something like two thirds.) WIN! OK, yeah, sounds good, but will it actually work? That's an excellent question. My gut feeling is yes, and I'll justify why below (and hopefully have gone some way towards doing that above as well), but what I'm proposing here is really that we run an experiment for a period of time - probably a couple of months or so - and measure the outcomes we see: How many people apply? (Wouldn't be surprised or alarmed to see this cut by a factor of ten.) How many of them submit a good assessment? (More/less than at present?) How much overhead is there for us in dealing with these assessments compared to now? What are the success and failure rates at each interview stage compared to now? How many people are we hiring at the end of it compared to now? I think it'll work because I hypothesize that, amongst other things: It self-selects for people who really want to work at Red Gate which, at the moment, is something I have to try and assess based on their CV and cover letter - but if you're not that bothered about working here, why would you complete the assessment? Candidates who would submit a shoddy application probably won't feel motivated to do the assessment. Candidates who would demonstrate good attention to detail in their CV/cover letter will demonstrate good attention to detail in the assessment. In general, only the better candidates will complete and submit the assessment. Marking assessments is much less work so we'll be able to deal with any increase that we see (hopefully we will see). There are obviously other questions as well: Is plagiarism going to be a problem? Is there any way we can detect/discourage potential plagiarism? How do we assess candidates' education and experience? What about their ability to communicate in writing? Do we still want them to submit a CV afterwards if they pass assessment? Do we want to offer them the opportunity to tell us a bit about why they'd like the job when they submit their assessment? How does this affect our relationship with recruitment agencies we might use to hire for these roles? So, what's the objective for next week's Down Tools Week? Pretty simple really - we want to implement this process for the Graduate Software Engineer and Software Engineer positions that you can find on our website. I will be joined by a crack team of our best developers (Kevin Boyle, and new Red-Gater, Sam Blackburn), and recruiting hostess with the mostest Laura McQuillen, and hopefully a couple of others as well - if I can successfully twist more arms before Monday.* Hopefully by next Friday our experiment will be up and running, and we may have changed the way Red Gate recruits software engineers for good! Stay tuned and we'll let you know how it goes! *I'm going to play dirty by offering them beer and chocolate during meetings. Some background information: how agonising over the initial CV/cover letter sift helped lead us to bin it off entirely The other day I was agonising about the new university/good degree grade versus poor A-level results issue, and decided to canvas for other opinions to see if there was something I could do that was fairer than my current approach, which is almost always to reject. This generated quite an involved discussion on our Yammer site: I'm sure you can glean a pretty good impression of my own educational prejudices from that discussion as well, although I'm very open to changing my opinion - hopefully you've already figured that out from reading the rest of this post. Hopefully you can also trace a logical path from agonising about sifting to, "Uh, hang on, why on earth are we doing this anyway?!?" Technorati Tags: recruitment,hr,developers,testers,red gate,cv,resume,cover letter,assessment,sea change

    Read the article

  • .NET Code Evolution

    - by Alois Kraus
    Originally posted on: http://geekswithblogs.net/akraus1/archive/2013/07/24/153504.aspxAt my day job I do look at a lot of code written by other people. Most of the code is quite good and some is even a masterpiece. And there is also code which makes you think WTF… oh it was written by me. Hm not so bad after all. There are many excuses reasons for bad code. Most often it is time pressure followed by not enough ambition (who cares) or insufficient training. Normally I do care about code quality quite a lot which makes me a (perceived) slow worker who does write many tests and refines the code quite a lot because of the design deficiencies. Most of the deficiencies I do find by putting my design under stress while checking for invariants. It does also help a lot to step into the code with a debugger (sometimes also Windbg). I do this much more often when my tests are red. That way I do get a much better understanding what my code really does and not what I think it should be doing. This time I do want to show you how code can evolve over the years with different .NET Framework versions. Once there was  time where .NET 1.1 was new and many C++ programmers did switch over to get rid of not initialized pointers and memory leaks. There were also nice new data structures available such as the Hashtable which is fast lookup table with O(1) time complexity. All was good and much code was written since then. At 2005 a new version of the .NET Framework did arrive which did bring many new things like generics and new data structures. The “old” fashioned way of Hashtable were coming to an end and everyone used the new Dictionary<xx,xx> type instead which was type safe and faster because the object to type conversion (aka boxing) was no longer necessary. I think 95% of all Hashtables and dictionaries use string as key. Often it is convenient to ignore casing to make it easy to look up values which the user did enter. An often followed route is to convert the string to upper case before putting it into the Hashtable. Hashtable Table = new Hashtable(); void Add(string key, string value) { Table.Add(key.ToUpper(), value); } This is valid and working code but it has problems. First we can pass to the Hashtable a custom IEqualityComparer to do the string matching case insensitive. Second we can switch over to the now also old Dictionary type to become a little faster and we can keep the the original keys (not upper cased) in the dictionary. Dictionary<string, string> DictTable = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase); void AddDict(string key, string value) { DictTable.Add(key, value); } Many people do not user the other ctors of Dictionary because they do shy away from the overhead of writing their own comparer. They do not know that .NET has for strings already predefined comparers at hand which you can directly use. Today in the many core area we do use threads all over the place. Sometimes things break in subtle ways but most of the time it is sufficient to place a lock around the offender. Threading has become so mainstream that it may sound weird that in the year 2000 some guy got a huge incentive for the idea to reduce the time to process calibration data from 12 hours to 6 hours by using two threads on a dual core machine. Threading does make it easy to become faster at the expense of correctness. Correct and scalable multithreading can be arbitrarily hard to achieve depending on the problem you are trying to solve. Lets suppose we want to process millions of items with two threads and count the processed items processed by all threads. A typical beginners code might look like this: int Counter; void IJustLearnedToUseThreads() { var t1 = new Thread(ThreadWorkMethod); t1.Start(); var t2 = new Thread(ThreadWorkMethod); t2.Start(); t1.Join(); t2.Join(); if (Counter != 2 * Increments) throw new Exception("Hmm " + Counter + " != " + 2 * Increments); } const int Increments = 10 * 1000 * 1000; void ThreadWorkMethod() { for (int i = 0; i < Increments; i++) { Counter++; } } It does throw an exception with the message e.g. “Hmm 10.222.287 != 20.000.000” and does never finish. The code does fail because the assumption that Counter++ is an atomic operation is wrong. The ++ operator is just a shortcut for Counter = Counter + 1 This does involve reading the counter from a memory location into the CPU, incrementing value on the CPU and writing the new value back to the memory location. When we do look at the generated assembly code we will see only inc dword ptr [ecx+10h] which is only one instruction. Yes it is one instruction but it is not atomic. All modern CPUs have several layers of caches (L1,L2,L3) which try to hide the fact how slow actual main memory accesses are. Since cache is just another word for redundant copy it can happen that one CPU does read a value from main memory into the cache, modifies it and write it back to the main memory. The problem is that at least the L1 cache is not shared between CPUs so it can happen that one CPU does make changes to values which did change in meantime in the main memory. From the exception you can see we did increment the value 20 million times but half of the changes were lost because we did overwrite the already changed value from the other thread. This is a very common case and people do learn to protect their  data with proper locking.   void Intermediate() { var time = Stopwatch.StartNew(); Action acc = ThreadWorkMethod_Intermediate; var ar1 = acc.BeginInvoke(null, null); var ar2 = acc.BeginInvoke(null, null); ar1.AsyncWaitHandle.WaitOne(); ar2.AsyncWaitHandle.WaitOne(); if (Counter != 2 * Increments) throw new Exception(String.Format("Hmm {0:N0} != {1:N0}", Counter, 2 * Increments)); Console.WriteLine("Intermediate did take: {0:F1}s", time.Elapsed.TotalSeconds); } void ThreadWorkMethod_Intermediate() { for (int i = 0; i < Increments; i++) { lock (this) { Counter++; } } } This is better and does use the .NET Threadpool to get rid of manual thread management. It does give the expected result but it can result in deadlocks because you do lock on this. This is in general a bad idea since it can lead to deadlocks when other threads use your class instance as lock object. It is therefore recommended to create a private object as lock object to ensure that nobody else can lock your lock object. When you read more about threading you will read about lock free algorithms. They are nice and can improve performance quite a lot but you need to pay close attention to the CLR memory model. It does make quite weak guarantees in general but it can still work because your CPU architecture does give you more invariants than the CLR memory model. For a simple counter there is an easy lock free alternative present with the Interlocked class in .NET. As a general rule you should not try to write lock free algos since most likely you will fail to get it right on all CPU architectures. void Experienced() { var time = Stopwatch.StartNew(); Task t1 = Task.Factory.StartNew(ThreadWorkMethod_Experienced); Task t2 = Task.Factory.StartNew(ThreadWorkMethod_Experienced); t1.Wait(); t2.Wait(); if (Counter != 2 * Increments) throw new Exception(String.Format("Hmm {0:N0} != {1:N0}", Counter, 2 * Increments)); Console.WriteLine("Experienced did take: {0:F1}s", time.Elapsed.TotalSeconds); } void ThreadWorkMethod_Experienced() { for (int i = 0; i < Increments; i++) { Interlocked.Increment(ref Counter); } } Since time does move forward we do not use threads explicitly anymore but the much nicer Task abstraction which was introduced with .NET 4 at 2010. It is educational to look at the generated assembly code. The Interlocked.Increment method must be called which does wondrous things right? Lets see: lock inc dword ptr [eax] The first thing to note that there is no method call at all. Why? Because the JIT compiler does know very well about CPU intrinsic functions. Atomic operations which do lock the memory bus to prevent other processors to read stale values are such things. Second: This is the same increment call prefixed with a lock instruction. The only reason for the existence of the Interlocked class is that the JIT compiler can compile it to the matching CPU intrinsic functions which can not only increment by one but can also do an add, exchange and a combined compare and exchange operation. But be warned that the correct usage of its methods can be tricky. If you try to be clever and look a the generated IL code and try to reason about its efficiency you will fail. Only the generated machine code counts. Is this the best code we can write? Perhaps. It is nice and clean. But can we make it any faster? Lets see how good we are doing currently. Level Time in s IJustLearnedToUseThreads Flawed Code Intermediate 1,5 (lock) Experienced 0,3 (Interlocked.Increment) Master 0,1 (1,0 for int[2]) That lock free thing is really a nice thing. But if you read more about CPU cache, cache coherency, false sharing you can do even better. int[] Counters = new int[12]; // Cache line size is 64 bytes on my machine with an 8 way associative cache try for yourself e.g. 64 on more modern CPUs void Master() { var time = Stopwatch.StartNew(); Task t1 = Task.Factory.StartNew(ThreadWorkMethod_Master, 0); Task t2 = Task.Factory.StartNew(ThreadWorkMethod_Master, Counters.Length - 1); t1.Wait(); t2.Wait(); Counter = Counters[0] + Counters[Counters.Length - 1]; if (Counter != 2 * Increments) throw new Exception(String.Format("Hmm {0:N0} != {1:N0}", Counter, 2 * Increments)); Console.WriteLine("Master did take: {0:F1}s", time.Elapsed.TotalSeconds); } void ThreadWorkMethod_Master(object number) { int index = (int) number; for (int i = 0; i < Increments; i++) { Counters[index]++; } } The key insight here is to use for each core its own value. But if you simply use simply an integer array of two items, one for each core and add the items at the end you will be much slower than the lock free version (factor 3). Each CPU core has its own cache line size which is something in the range of 16-256 bytes. When you do access a value from one location the CPU does not only fetch one value from main memory but a complete cache line (e.g. 16 bytes). This means that you do not pay for the next 15 bytes when you access them. This can lead to dramatic performance improvements and non obvious code which is faster although it does have many more memory reads than another algorithm. So what have we done here? We have started with correct code but it was lacking knowledge how to use the .NET Base Class Libraries optimally. Then we did try to get fancy and used threads for the first time and failed. Our next try was better but it still had non obvious issues (lock object exposed to the outside). Knowledge has increased further and we have found a lock free version of our counter which is a nice and clean way which is a perfectly valid solution. The last example is only here to show you how you can get most out of threading by paying close attention to your used data structures and CPU cache coherency. Although we are working in a virtual execution environment in a high level language with automatic memory management it does pay off to know the details down to the assembly level. Only if you continue to learn and to dig deeper you can come up with solutions no one else was even considering. I have studied particle physics which does help at the digging deeper part. Have you ever tried to solve Quantum Chromodynamics equations? Compared to that the rest must be easy ;-). Although I am no longer working in the Science field I take pride in discovering non obvious things. This can be a very hard to find bug or a new way to restructure data to make something 10 times faster. Now I need to get some sleep ….

    Read the article

  • Selling Federal Enterprise Architecture (EA)

    - by TedMcLaughlan
    Selling Federal Enterprise Architecture A taxonomy of subject areas, from which to develop a prioritized marketing and communications plan to evangelize EA activities within and among US Federal Government organizations and constituents. Any and all feedback is appreciated, particularly in developing and extending this discussion as a tool for use – more information and details are also available. "Selling" the discipline of Enterprise Architecture (EA) in the Federal Government (particularly in non-DoD agencies) is difficult, notwithstanding the general availability and use of the Federal Enterprise Architecture Framework (FEAF) for some time now, and the relatively mature use of the reference models in the OMB Capital Planning and Investment (CPIC) cycles. EA in the Federal Government also tends to be a very esoteric and hard to decipher conversation – early apologies to those who agree to continue reading this somewhat lengthy article. Alignment to the FEAF and OMB compliance mandates is long underway across the Federal Departments and Agencies (and visible via tools like PortfolioStat and ITDashboard.gov – but there is still a gap between the top-down compliance directives and enablement programs, and the bottom-up awareness and effective use of EA for either IT investment management or actual mission effectiveness. "EA isn't getting deep enough penetration into programs, components, sub-agencies, etc.", verified a panelist at the most recent EA Government Conference in DC. Newer guidance from OMB may be especially difficult to handle, where bottom-up input can't be accurately aligned, analyzed and reported via standardized EA discipline at the Agency level – for example in addressing the new (for FY13) Exhibit 53D "Agency IT Reductions and Reinvestments" and the information required for "Cloud Computing Alternatives Evaluation" (supporting the new Exhibit 53C, "Agency Cloud Computing Portfolio"). Therefore, EA must be "sold" directly to the communities that matter, from a coordinated, proactive messaging perspective that takes BOTH the Program-level value drivers AND the broader Agency mission and IT maturity context into consideration. Selling EA means persuading others to take additional time and possibly assign additional resources, for a mix of direct and indirect benefits – many of which aren't likely to be realized in the short-term. This means there's probably little current, allocated budget to work with; ergo the challenge of trying to sell an "unfunded mandate". Also, the concept of "Enterprise" in large Departments like Homeland Security tends to cross all kinds of organizational boundaries – as Richard Spires recently indicated by commenting that "...organizational boundaries still trump functional similarities. Most people understand what we're trying to do internally, and at a high level they get it. The problem, of course, is when you get down to them and their system and the fact that you're going to be touching them...there's always that fear factor," Spires said. It is quite clear to the Federal IT Investment community that for EA to meet its objective, understandable, relevant value must be measured and reported using a repeatable method – as described by GAO's recent report "Enterprise Architecture Value Needs To Be Measured and Reported". What's not clear is the method or guidance to sell this value. In fact, the current GAO "Framework for Assessing and Improving Enterprise Architecture Management (Version 2.0)", a.k.a. the "EAMMF", does not include words like "sell", "persuade", "market", etc., except in reference ("within Core Element 19: Organization business owner and CXO representatives are actively engaged in architecture development") to a brief section in the CIO Council's 2001 "Practical Guide to Federal Enterprise Architecture", entitled "3.3.1. Develop an EA Marketing Strategy and Communications Plan." Furthermore, Core Element 19 of the EAMMF is advised to be applied in "Stage 3: Developing Initial EA Versions". This kind of EA sales campaign truly should start much earlier in the maturity progress, i.e. in Stages 0 or 1. So, what are the understandable, relevant benefits (or value) to sell, that can find an agreeable, participatory audience, and can pave the way towards success of a longer-term, funded set of EA mechanisms that can be methodically measured and reported? Pragmatic benefits from a useful EA that can help overcome the fear of change? And how should they be sold? Following is a brief taxonomy (it's a taxonomy, to help organize SME support) of benefit-related subjects that might make the most sense, in creating the messages and organizing an initial "engagement plan" for evangelizing EA "from within". An EA "Sales Taxonomy" of sorts. We're not boiling the ocean here; the subjects that are included are ones that currently appear to be urgently relevant to the current Federal IT Investment landscape. Note that successful dialogue in these topics is directly usable as input or guidance for actually developing early-stage, "Fit-for-Purpose" (a DoDAF term) Enterprise Architecture artifacts, as prescribed by common methods found in most EA methodologies, including FEAF, TOGAF, DoDAF and our own Oracle Enterprise Architecture Framework (OEAF). The taxonomy below is organized by (1) Target Community, (2) Benefit or Value, and (3) EA Program Facet - as in: "Let's talk to (1: Community Member) about how and why (3: EA Facet) the EA program can help with (2: Benefit/Value)". Once the initial discussion targets and subjects are approved (that can be measured and reported), a "marketing and communications plan" can be created. A working example follows the Taxonomy. Enterprise Architecture Sales Taxonomy Draft, Summary Version 1. Community 1.1. Budgeted Programs or Portfolios Communities of Purpose (CoPR) 1.1.1. Program/System Owners (Senior Execs) Creating or Executing Acquisition Plans 1.1.2. Program/System Owners Facing Strategic Change 1.1.2.1. Mandated 1.1.2.2. Expected/Anticipated 1.1.3. Program Managers - Creating Employee Performance Plans 1.1.4. CO/COTRs – Creating Contractor Performance Plans, or evaluating Value Engineering Change Proposals (VECP) 1.2. Governance & Communications Communities of Practice (CoP) 1.2.1. Policy Owners 1.2.1.1. OCFO 1.2.1.1.1. Budget/Procurement Office 1.2.1.1.2. Strategic Planning 1.2.1.2. OCIO 1.2.1.2.1. IT Management 1.2.1.2.2. IT Operations 1.2.1.2.3. Information Assurance (Cyber Security) 1.2.1.2.4. IT Innovation 1.2.1.3. Information-Sharing/ Process Collaboration (i.e. policies and procedures regarding Partners, Agreements) 1.2.2. Governing IT Council/SME Peers (i.e. an "Architects Council") 1.2.2.1. Enterprise Architects (assumes others exist; also assumes EA participants aren't buried solely within the CIO shop) 1.2.2.2. Domain, Enclave, Segment Architects – i.e. the right affinity group for a "shared services" EA structure (per the EAMMF), which may be classified as Federated, Segmented, Service-Oriented, or Extended 1.2.2.3. External Oversight/Constraints 1.2.2.3.1. GAO/OIG & Legal 1.2.2.3.2. Industry Standards 1.2.2.3.3. Official public notification, response 1.2.3. Mission Constituents Participant & Analyst Community of Interest (CoI) 1.2.3.1. Mission Operators/Users 1.2.3.2. Public Constituents 1.2.3.3. Industry Advisory Groups, Stakeholders 1.2.3.4. Media 2. Benefit/Value (Note the actual benefits may not be discretely attributable to EA alone; EA is a very collaborative, cross-cutting discipline.) 2.1. Program Costs – EA enables sound decisions regarding... 2.1.1. Cost Avoidance – a TCO theme 2.1.2. Sequencing – alignment of capability delivery 2.1.3. Budget Instability – a Federal reality 2.2. Investment Capital – EA illuminates new investment resources via... 2.2.1. Value Engineering – contractor-driven cost savings on existing budgets, direct or collateral 2.2.2. Reuse – reuse of investments between programs can result in savings, chargeback models; avoiding duplication 2.2.3. License Refactoring – IT license & support models may not reflect actual or intended usage 2.3. Contextual Knowledge – EA enables informed decisions by revealing... 2.3.1. Common Operating Picture (COP) – i.e. cross-program impacts and synergy, relative to context 2.3.2. Expertise & Skill – who truly should be involved in architectural decisions, both business and IT 2.3.3. Influence – the impact of politics and relationships can be examined 2.3.4. Disruptive Technologies – new technologies may reduce costs or mitigate risk in unanticipated ways 2.3.5. What-If Scenarios – can become much more refined, current, verifiable; basis for Target Architectures 2.4. Mission Performance – EA enables beneficial decision results regarding... 2.4.1. IT Performance and Optimization – towards 100% effective, available resource utilization 2.4.2. IT Stability – towards 100%, real-time uptime 2.4.3. Agility – responding to rapid changes in mission 2.4.4. Outcomes –measures of mission success, KPIs – vs. only "Outputs" 2.4.5. Constraints – appropriate response to constraints 2.4.6. Personnel Performance – better line-of-sight through performance plans to mission outcome 2.5. Mission Risk Mitigation – EA mitigates decision risks in terms of... 2.5.1. Compliance – all the right boxes are checked 2.5.2. Dependencies –cross-agency, segment, government 2.5.3. Transparency – risks, impact and resource utilization are illuminated quickly, comprehensively 2.5.4. Threats and Vulnerabilities – current, realistic awareness and profiles 2.5.5. Consequences – realization of risk can be mapped as a series of consequences, from earlier decisions or new decisions required for current issues 2.5.5.1. Unanticipated – illuminating signals of future or non-symmetric risk; helping to "future-proof" 2.5.5.2. Anticipated – discovering the level of impact that matters 3. EA Program Facet (What parts of the EA can and should be communicated, using business or mission terms?) 3.1. Architecture Models – the visual tools to be created and used 3.1.1. Operating Architecture – the Business Operating Model/Architecture elements of the EA truly drive all other elements, plus expose communication channels 3.1.2. Use Of – how can the EA models be used, and how are they populated, from a reasonable, pragmatic yet compliant perspective? What are the core/minimal models required? What's the relationship of these models, with existing system models? 3.1.3. Scope – what level of granularity within the models, and what level of abstraction across the models, is likely to be most effective and useful? 3.2. Traceability – the maturity, status, completeness of the tools 3.2.1. Status – what in fact is the degree of maturity across the integrated EA model and other relevant governance models, and who may already be benefiting from it? 3.2.2. Visibility – how does the EA visibly and effectively prove IT investment performance goals are being reached, with positive mission outcome? 3.3. Governance – what's the interaction, participation method; how are the tools used? 3.3.1. Contributions – how is the EA program informed, accept submissions, collect data? Who are the experts? 3.3.2. Review – how is the EA validated, against what criteria?  Taxonomy Usage Example:   1. To speak with: a. ...a particular set of System Owners Facing Strategic Change, via mandate (like the "Cloud First" mandate); about... b. ...how the EA program's visible and easily accessible Infrastructure Reference Model (i.e. "IRM" or "TRM"), if updated more completely with current system data, can... c. ...help shed light on ways to mitigate risks and avoid future costs associated with NOT leveraging potentially-available shared services across the enterprise... 2. ....the following Marketing & Communications (Sales) Plan can be constructed: a. Create an easy-to-read "Consequence Model" that illustrates how adoption of a cloud capability (like elastic operational storage) can enable rapid and durable compliance with the mandate – using EA traceability. Traceability might be from the IRM to the ARM (that identifies reusable services invoking the elastic storage), and then to the PRM with performance measures (such as % utilization of purchased storage allocation) included in the OMB Exhibits; and b. Schedule a meeting with the Program Owners, timed during their Acquisition Strategy meetings in response to the mandate, to use the "Consequence Model" for advising them to organize a rapid and relevant RFI solicitation for this cloud capability (regarding alternatives for sourcing elastic operational storage); and c. Schedule a series of short "Discovery" meetings with the system architecture leads (as agreed by the Program Owners), to further populate/validate the "As-Is" models and frame the "To Be" models (via scenarios), to better inform the RFI, obtain the best feedback from the vendor community, and provide potential value for and avoid impact to all other programs and systems. --end example -- Note that communications with the intended audience should take a page out of the standard "Search Engine Optimization" (SEO) playbook, using keywords and phrases relating to "value" and "outcome" vs. "compliance" and "output". Searches in email boxes, internal and external search engines for phrases like "cost avoidance strategies", "mission performance metrics" and "innovation funding" should yield messages and content from the EA team. This targeted, informed, practical sales approach should result in additional buy-in and participation, additional EA information contribution and model validation, development of more SMEs and quick "proof points" (with real-life testing) to bolster the case for EA. The proof point here is a successful, timely procurement that satisfies not only the external mandate and external oversight review, but also meets internal EA compliance/conformance goals and therefore is more transparently useful across the community. In short, if sold effectively, the EA will perform and be recognized. EA won’t therefore be used only for compliance, but also (according to a validated, stated purpose) to directly influence decisions and outcomes. The opinions, views and analysis expressed in this document are those of the author and do not necessarily reflect the views of Oracle.

    Read the article

  • ActiveMQ - "Cannot send, channel has already failed" every 2 seconds?

    - by quanta
    ActiveMQ 5.7.0 In the activemq.log, I'm seeing this exception every 2 seconds: 2013-11-05 13:00:52,374 | DEBUG | Transport Connection to: tcp://127.0.0.1:37501 failed: org.apache.activemq.transport.InactivityIOException: Cannot send, channel has already failed: tcp://127.0.0.1:37501 | org.apache.activemq.broker.TransportConnection.Transport | Async Exception Handler org.apache.activemq.transport.InactivityIOException: Cannot send, channel has already failed: tcp://127.0.0.1:37501 at org.apache.activemq.transport.AbstractInactivityMonitor.doOnewaySend(AbstractInactivityMonitor.java:282) at org.apache.activemq.transport.AbstractInactivityMonitor.oneway(AbstractInactivityMonitor.java:271) at org.apache.activemq.transport.TransportFilter.oneway(TransportFilter.java:85) at org.apache.activemq.transport.WireFormatNegotiator.oneway(WireFormatNegotiator.java:104) at org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:68) at org.apache.activemq.broker.TransportConnection.dispatch(TransportConnection.java:1312) at org.apache.activemq.broker.TransportConnection.processDispatch(TransportConnection.java:838) at org.apache.activemq.broker.TransportConnection.iterate(TransportConnection.java:873) at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:129) at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:47) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Due to this keyword InactivityIOException, the first thing comes to my mind is InactivityMonitor, but the strange thing is MaxInactivityDuration=30000: 2013-11-05 13:11:02,672 | DEBUG | Sending: WireFormatInfo { version=9, properties={MaxFrameSize=9223372036854775807, CacheSize=1024, CacheEnabled=true, SizePrefixDisabled=false, MaxInactivityDurationInitalDelay=10000, TcpNoDelayEnabled=true, MaxInactivityDuration=30000, TightEncodingEnabled=true, StackTraceEnabled=true}, magic=[A,c,t,i,v,e,M,Q]} | org.apache.activemq.transport.WireFormatNegotiator | ActiveMQ BrokerService[localhost] Task-2 Moreover, I also didn't see something like this: No message received since last read check for ... or: Channel was inactive for too (30000) long Do a netstat, I see these connections in TIME_WAIT state: tcp 0 0 127.0.0.1:38545 127.0.0.1:61616 TIME_WAIT - tcp 0 0 127.0.0.1:38544 127.0.0.1:61616 TIME_WAIT - tcp 0 0 127.0.0.1:38522 127.0.0.1:61616 TIME_WAIT - Here're the output when running tcpdump: Internet Protocol Version 4, Src: 127.0.0.1 (127.0.0.1), Dst: 127.0.0.1 (127.0.0.1) Version: 4 Header length: 20 bytes Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00: Not-ECT (Not ECN-Capable Transport)) 0000 00.. = Differentiated Services Codepoint: Default (0x00) .... ..00 = Explicit Congestion Notification: Not-ECT (Not ECN-Capable Transport) (0x00) Total Length: 296 Identification: 0x7b6a (31594) Flags: 0x02 (Don't Fragment) 0... .... = Reserved bit: Not set .1.. .... = Don't fragment: Set ..0. .... = More fragments: Not set Fragment offset: 0 Time to live: 64 Protocol: TCP (6) Header checksum: 0xc063 [correct] [Good: True] [Bad: False] Source: 127.0.0.1 (127.0.0.1) Destination: 127.0.0.1 (127.0.0.1) Transmission Control Protocol, Src Port: 61616 (61616), Dst Port: 54669 (54669), Seq: 1, Ack: 2, Len: 244 Source port: 61616 (61616) Destination port: 54669 (54669) [Stream index: 11] Sequence number: 1 (relative sequence number) [Next sequence number: 245 (relative sequence number)] Acknowledgement number: 2 (relative ack number) Header length: 32 bytes Flags: 0x018 (PSH, ACK) 000. .... .... = Reserved: Not set ...0 .... .... = Nonce: Not set .... 0... .... = Congestion Window Reduced (CWR): Not set .... .0.. .... = ECN-Echo: Not set .... ..0. .... = Urgent: Not set .... ...1 .... = Acknowledgement: Set .... .... 1... = Push: Set .... .... .0.. = Reset: Not set .... .... ..0. = Syn: Not set .... .... ...0 = Fin: Not set Window size value: 256 [Calculated window size: 32768] [Window size scaling factor: 128] Checksum: 0xff1c [validation disabled] [Good Checksum: False] [Bad Checksum: False] Options: (12 bytes) No-Operation (NOP) No-Operation (NOP) Timestamps: TSval 2304161892, TSecr 2304161891 Kind: Timestamp (8) Length: 10 Timestamp value: 2304161892 Timestamp echo reply: 2304161891 [SEQ/ACK analysis] [Bytes in flight: 244] Constrained Application Protocol, TID: 240, Length: 244 00.. .... = Version: 0 ..00 .... = Type: Confirmable (0) .... 0000 = Option Count: 0 Code: Unknown (0) Transaction ID: 240 Payload Content-Type: text/plain (default), Length: 240, offset: 4 Line-based text data: text/plain [truncated] \001ActiveMQ\000\000\000\t\001\000\000\000<DE>\000\000\000\t\000\fMaxFrameSize\006\177<FF><FF><FF><FF> <FF><FF><FF>\000\tCacheSize\005\000\000\004\000\000\fCacheEnabled\001\001\000\022SizePrefixDisabled\001\000\000 MaxInactivityDurationInitalDelay\006\ It is very likely a tcp port check. This is what I see when trying telnet from another host: 2013-11-05 16:12:41,071 | DEBUG | Transport Connection to: tcp://10.8.20.9:46775 failed: java.io.EOFException | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: tcp:///10.8.20.9:46775@61616 java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:375) at org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:275) at org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:229) at org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:221) at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:204) at java.lang.Thread.run(Thread.java:662) 2013-11-05 16:12:41,071 | DEBUG | Transport Connection to: tcp://10.8.20.9:46775 failed: org.apache.activemq.transport.InactivityIOException: Cannot send, channel has already failed: tcp://10.8.20.9:46775 | org.apache.activemq.broker.TransportConnection.Transport | Async Exception Handler org.apache.activemq.transport.InactivityIOException: Cannot send, channel has already failed: tcp://10.8.20.9:46775 at org.apache.activemq.transport.AbstractInactivityMonitor.doOnewaySend(AbstractInactivityMonitor.java:282) at org.apache.activemq.transport.AbstractInactivityMonitor.oneway(AbstractInactivityMonitor.java:271) at org.apache.activemq.transport.TransportFilter.oneway(TransportFilter.java:85) at org.apache.activemq.transport.WireFormatNegotiator.oneway(WireFormatNegotiator.java:104) at org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:68) at org.apache.activemq.broker.TransportConnection.dispatch(TransportConnection.java:1312) at org.apache.activemq.broker.TransportConnection.processDispatch(TransportConnection.java:838) at org.apache.activemq.broker.TransportConnection.iterate(TransportConnection.java:873) at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:129) at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:47) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) 2013-11-05 16:12:41,071 | DEBUG | Unregistering MBean org.apache.activemq:BrokerName=localhost,Type=Connection,ConnectorName=ope nwire,ViewType=address,Name=tcp_//10.8.20.9_46775 | org.apache.activemq.broker.jmx.ManagementContext | ActiveMQ Transport: tcp:/ //10.8.20.9:46775@61616 2013-11-05 16:12:41,073 | DEBUG | Stopping connection: tcp://10.8.20.9:46775 | org.apache.activemq.broker.TransportConnection | ActiveMQ BrokerService[localhost] Task-5 2013-11-05 16:12:41,073 | DEBUG | Stopping transport tcp:///10.8.20.9:46775@61616 | org.apache.activemq.transport.tcp.TcpTranspo rt | ActiveMQ BrokerService[localhost] Task-5 2013-11-05 16:12:41,073 | DEBUG | Initialized TaskRunnerFactory[ActiveMQ Task] using ExecutorService: java.util.concurrent.Threa dPoolExecutor@23cc2a28 | org.apache.activemq.thread.TaskRunnerFactory | ActiveMQ BrokerService[localhost] Task-5 2013-11-05 16:12:41,074 | DEBUG | Closed socket Socket[addr=/10.8.20.9,port=46775,localport=61616] | org.apache.activemq.transpo rt.tcp.TcpTransport | ActiveMQ Task-1 2013-11-05 16:12:41,074 | DEBUG | Forcing shutdown of ExecutorService: java.util.concurrent.ThreadPoolExecutor@23cc2a28 | org.apache.activemq.util.ThreadPoolUtils | ActiveMQ BrokerService[localhost] Task-5 2013-11-05 16:12:41,074 | DEBUG | Stopped transport: tcp://10.8.20.9:46775 | org.apache.activemq.broker.TransportConnection | ActiveMQ BrokerService[localhost] Task-5 2013-11-05 16:12:41,074 | DEBUG | Connection Stopped: tcp://10.8.20.9:46775 | org.apache.activemq.broker.TransportConnection | ActiveMQ BrokerService[localhost] Task-5 2013-11-05 16:12:41,902 | DEBUG | Sending: WireFormatInfo { version=9, properties={MaxFrameSize=9223372036854775807, CacheSize=1024, CacheEnabled=true, SizePrefixDisabled=false, MaxInactivityDurationInitalDelay=10000, TcpNoDelayEnabled=true, MaxInactivityDuration=30000, TightEncodingEnabled=true, StackTraceEnabled=true}, magic=[A,c,t,i,v,e,M,Q]} | org.apache.activemq.transport.WireFormatNegotiator | ActiveMQ BrokerService[localhost] Task-5 So the question is: how can I find out the process that is trying to connect to my ActiveMQ (from localhost) every 2 seconds?

    Read the article

  • Runtime error of TASM language help!

    - by dominoos
    .model small .stack 400h .data message db "hello. ", 0ah, 0dh, "$" firstdigit db ? seconddigit db ? thirddigit db ? number dw ? newnumber db ? anumber dw 0d bnumber dw 0d Firstn db 0ah, 0dh, "Enter first 3 digit number: ","$" secondn db 0ah, 0dh, "Enter second 3 digit number: ","$" messageB db 0ah, 0dh, "HCF of two number is: ","$" linebreaker db 0ah, 0dh, ' ', 0ah, 0dh, '$' .code Start: mov ax, @data ; establish access to the data segment mov ds, ax ; mov number, 0d mov dx, offset message ; print the string "yob choi 0648293" mov ah, 9h int 21h num: mov dx, offset Firstn ; print the string "put 1st 3 digit" mov ah, 9h int 21h ;run JMP FirstFirst ; jump to FirstFirst FirstFirst: ;first digit mov ah, 1d ;bios code for read a keystroke int 21h ;call bios, it is understood that the ascii code will be returned in al mov firstdigit, al ;may as well save a copy sub al, 30h ;Convert code to an actual integer cbw ;CONVERT BYTE TO WORD. This takes whatever number is in al and ;extends it to ax, doubling its size from 8 bits to 16 bits ;The first digit now occupies all of ax as an integer mov cx, 100d ;This is so we can calculate 100*1st digit +10*2nd digit + 3rd digit mul cx ;start to accumulate the 3 digit number in the variable imul cx ;it is understood that the other operand is ax ; the result will use both dx::ax ;dx will contain only leading zeros add anumber, ax ;save ;Second Digit mov ah, 1d ;bios code for read a keystroke int 21h ;call bios, it is understood that the ascii code will be returned in al mov seconddigit, al ;may as well save a copy sub al, 30h ;Convert code to an actual integer cbw ;CONVERT BYTE TO WORD. This takes whatever number is in al and ;extends it to ax, boubling its size from 8 bits to 16 bits ;The first digit now occupies all of ax as an integer mov cx, 10d ;continue to accumulate the 3 digit number in the variable mul cx ;it is understood that the other operand is ax, containing first digit ;the result will use both dx::ax ;dx will contain only leading zeros. add anumber, ax ;save ;third Digit mov ah, 1d ;samething as above int 21h ; mov thirddigit, al ; sub al, 30h ; cbw ; add anumber, ax ; jmp num2 ;go to checks Num2: mov dx, offset secondn ; print the string "put 2nd 3 digits" mov ah, 9h int 21h ;run JMP SecondSecond SecondSecond: ;first digit mov ah, 1d ;bios code for read a keystroke int 21h ;call bios, it is understood that the ascii code will be returned in al mov firstdigit, al ;may as well save a copy sub al, 30h ;Convert code to an actual integer cbw ;CONVERT BYTE TO WORD. This takes whatever number is in al and ;extends it to ax, doubling its size from 8 bits to 16 bits ;The first digit now occupies all of ax as an integer mov cx, 100d ;This is so we can calculate 100*1st digit +10*2nd digit + 3rd digit mul cx ;start to accumulate the 3 digit number in the variable imul cx ;it is understood that the other operand is ax ; the result will use both dx::ax ;dx will contain only leading zeros add bnumber, ax ;save ;Second Digit mov ah, 1d ;bios code for read a keystroke int 21h ;call bios, it is understood that the ascii code will be returned in al mov seconddigit, al ;may as well save a copy sub al, 30h ;Convert code to an actual integer cbw ;CONVERT BYTE TO WORD. This takes whatever number is in al and ;extends it to ax, boubling its size from 8 bits to 16 bits ;The first digit now occupies all of ax as an integer mov cx, 10d ;continue to accumulate the 3 digit number in the variable mul cx ;it is understood that the other operand is ax, containing first digit ;the result will use both dx::ax ;dx will contain only leading zeros. add bnumber, ax ;save ;third Digit mov ah, 1d ;samething as above int 21h ; mov thirddigit, al ; sub al, 30h ; cbw ; add bnumber, ax ; jmp compare ;go to compare compare: CMP ax, anumber ;comparing numbB and Number JA comp1 ;go to comp1 if anumber is bigger CMP ax, anumber ; JB comp2 ;go to comp2 if anumber is smaller CMP ax, anumber ; JE equal ;go to equal if two numbers are the same JMP compare ;go to compare (avioding error) comp1: SUB ax, anumber; subtract smaller number from bigger number JMP compare ; comp2: SUB anumber, ax; subtract smaller number from bigger number JMP compare ; equal: mov ah, 9d ;make linkbreak after the 2nd 3 digit number mov dx, offset linebreaker int 21h mov ah, 9d ;print "HCF of two number is:" mov dx, offset messageB int 21h mov ax,anumber ;copying 2nd number into ax add al,30h ; converting to ascii mov newnumber,al ; copying from low part of register into newnumb mov ah, 2d ;bios code for print a character mov dl, newnumber ;we had saved the ascii code here int 21h ;call to bios JMP exit; exit: mov ah, 4ch int 21h ;exit the program End hi, this is a program that finds highest common factor of 2 different 3digit number. if i put 200, 235,312 (low numbers) it works fine. but if i put 500, 550, 654(bigger number) the program crashes after the 2nd 3digit number is entered. can you help me to find out what problem is?

    Read the article

  • Matlab Image watermarking question , using both SVD and DWT

    - by Georgek
    Hello all . here is a code that i got over the net ,and it is supposed to embed a watermark of size(50*20) called _copyright.bmp in the Code below . the size of the cover object is (512*512), it is called _lena_std_bw.bmp.What we did here is we did DWT2 2 times for the image , when we reached our second dwt2 cA2 size is 128*128. You should notice that the blocksize and it equals 4, it is used to determine the max msg size based on cA2 according to the following code:max_message=RcA2*CcA2/(blocksize^2). in our current case max_message would equal 128*128/(4^2)=1024. i want to embed a bigger watermark in the 2nd dwt2 and lets say the size of that watermark is 400*10(i can change the dimension using MS PAINT), what i have to do is change the size of the blocksize to 2. so max_message=4096.Matlab gives me 3 errors and they are : ??? Error using == plus Matrix dimensions must agree. Error in == idwt2 at 93 x = upsconv2(a,{Lo_R,Lo_R},sx,dwtEXTM,shift)+ ... % Approximation. Error in == two_dwt_svd_low_low at 88 CAA1 = idwt2(cA22,cH2,cV2,cD2,'haar',[RcA1,CcA1]); The origional Code is (the origional code where blocksize =4): %This algorithm makes DWT for the whole image and after that make DWT for %cH1 and make SVD for cH2 and embed the watermark in every level after SVD %(1) -------------- Embed Watermark ------------------------------------ %Add the watermar W to original image I and give the watermarked image in J %-------------------------------------------------------------------------- % set the gain factor for embeding and threshold for evaluation clc; clear all; close all; % save start time start_time=cputime; % set the value of threshold and alpha thresh=.5; alpha =0.01; % read in the cover object file_name='_lena_std_bw.bmp'; cover_object=double(imread(file_name)); % determine size of watermarked image Mc=size(cover_object,1); %Height Nc=size(cover_object,2); %Width % read in the message image and reshape it into a vector file_name='_copyright.bmp'; message=double(imread(file_name)); T=message; Mm=size(message,1); %Height Nm=size(message,2); %Width % perform 1-level DWT for the whole cover image [cA1,cH1,cV1,cD1] = dwt2(cover_object,'haar'); % determine the size of cA1 [RcA1 CcA1]=size(cA1) % perform 2-level DWT for cA1 [cA2,cH2,cV2,cD2] = dwt2(cA1,'haar'); % determine the size of cA2 [RcA2 CcA2]=size(cA2) % set the value of blocksize blocksize=4 % reshape the watermark to a vector message_vector=round(reshape(message,Mm*Nm,1)./256); W=message_vector; % determine maximum message size based on cA2, and blocksize max_message=RcA2*CcA2/(blocksize^2) % check that the message isn't too large for cover if (length(message) max_message) error('Message too large to fit in Cover Object') end %----------------------- process the image in blocks ---------------------- x=1; y=1; for (kk = 1:length(message_vector)) [cA2u cA2s cA2v]=svd(cA2(y:y+blocksize-1,x:x+blocksize-1)); % if message bit contains zero, modify S of the original image if (message_vector(kk) == 0) cA2s = cA2s*(1 + alpha); % otherwise mask is filled with zeros else cA2s=cA2s; end cA22(y:y+blocksize-1,x:x+blocksize-1)=cA2u*cA2s*cA2v; % move to next block of mask along x; If at end of row, move to next row if (x+blocksize) >= CcA2 x=1; y=y+blocksize; else x=x+blocksize; end end % perform IDWT CAA1 = idwt2(cA22,cH2,cV2,cD2,'haar',[RcA1,CcA1]); watermarked_image= idwt2(CAA1,cH1,cV1,cD1,'haar',[Mc,Nc]); % convert back to uint8 watermarked_image_uint8=uint8(watermarked_image); % write watermarked Image to file imwrite(watermarked_image_uint8,'dwt_watermarked.bmp','bmp'); % display watermarked image figure(1) imshow(watermarked_image_uint8,[]) title('Watermarked Image') %(2) ---------------------------------------------------------------------- %---------- Extract Watermark from attacked watermarked image ------------- %-------------------------------------------------------------------------- % read in the watermarked object file_name='dwt_watermarked.bmp'; watermarked_image=double(imread(file_name)); % determine size of watermarked image Mw=size(watermarked_image,1); %Height Nw=size(watermarked_image,2); %Width % perform 1-level DWT for the whole watermarked image [ca1,ch1,cv1,cd1] = dwt2(watermarked_image,'haar'); % determine the size of ca1 [Rca1 Cca1]=size(ca1); % perform 2-level DWT for ca1 [ca2,ch2,cv2,cd2] = dwt2(ca1,'haar'); % determine the size of ca2 [Rca2 Cca2]=size(ca2); % process the image in blocks % for each block get a bit for message x=1; y=1; for (kk = 1:length(message_vector)) % sets correlation to 1 when patterns are identical to avoid /0 errors % otherwise calcluate difference between the cover image and the % watermarked image [cA2u cA2s cA2v]=svd(cA2(y:y+blocksize-1,x:x+blocksize-1)); [ca2u1 ca2s1 ca2v1]=svd(ca2(y:y+blocksize-1,x:x+blocksize-1)); correlation(kk)=diag(ca2s1-cA2s)'*diag(ca2s1-cA2s)/(alpha*alpha)/(diag(cA2s)*diag(cA2s)); % move on to next block. At and of row move to next row if (x+blocksize) >= Cca2 x=1; y=y+blocksize; else x=x+blocksize; end end % if correlation exceeds average correlation correlation(kk)=correlation(kk)+mean(correlation(1:Mm*Nm)); for kk = 1:length(correlation) if (correlation(kk) > thresh*alpha);%thresh*mean(correlation(1:Mo*No))) message_vector(kk)=0; end end % reshape the message vector and display recovered watermark. figure(2) message=reshape(message_vector(1:Mm*Nm),Mm,Nm); imshow(message,[]) title('Recovered Watermark') % display processing time elapsed_time=cputime-start_time, please do help,its my graduation project and i have been trying this code for along time but failed miserable. Thanks in advance

    Read the article

  • Unaccounted for database size

    - by Nazadus
    I currently have a database that is 20GB in size. I've run a few scripts which show on each tables size (and other incredibly useful information such as index stuff) and the biggest table is 1.1 million records which takes up 150MB of data. We have less than 50 tables most of which take up less than 1MB of data. After looking at the size of each table I don't understand why the database shouldn't be 1GB in size after a shrink. The amount of available free space that SqlServer (2005) reports is 0%. The log mode is set to simple. At this point my main concern is I feel like I have 19GB of unaccounted for used space. Is there something else I should look at? Normally I wouldn't care and would make this a passive research project except this particular situation calls for us to do a backup and restore on a weekly basis to put a copy on a satellite (which has no internet, so it must be done manually). I'd much rather copy 1GB (or even if it were down to 5GB!) than 20GB of data each week. sp_spaceused reports the following: Navigator-Production 19184.56 MB 3.02 MB And the second part of it: 19640872 KB 19512112 KB 108184 KB 20576 KB while I've found a few other scripts (such as the one from two of the server database size questions here, they all report the same information either found above or below). The script I am using is from SqlTeam. Here is the header info: * BigTables.sql * Bill Graziano (SQLTeam.com) * graz@<email removed> * v1.11 The top few tables show this (table, rows, reserved space, data, index, unused, etc): Activity 1143639 131 MB 89 MB 41768 KB 1648 KB 46% 1% EventAttendance 883261 90 MB 58 MB 32264 KB 328 KB 54% 0% Person 113437 31 MB 15 MB 15752 KB 912 KB 103% 3% HouseholdMember 113443 12 MB 6 MB 5224 KB 432 KB 82% 4% PostalAddress 48870 8 MB 6 MB 2200 KB 280 KB 36% 3% The rest of the tables are either the same in size or smaller. No more than 50 tables. Update 1: - All tables use unique identifiers. Usually an int incremented by 1 per row. I've also re-indexed everything. I ran the dbcc shrink command as well as updating the usage before and after. And over and over. An interesting thing I found is that when I restarted the server and confirmed no one was using it (and no maintenance procs are running, this is a very new application -- under a week old) and when I went to run the shrink, every now and then it would say something about data changed. Googling yielded too few useful answers with the obvious not applying (it was 1am and I disconnected everyone, so it seems impossible that was really the case). The data was migrated via C# code which basically looked at another server and brought things over. The quantity of deletes, at this point in time, are probably under 50k in rows. Even if those rows were the biggest rows, that wouldn't be more than 100M I would imagine. When I go to shrink via the GUI it reports 0% available to shrink, indicating that I've already gotten it as small as it thinks it can go. Update 2: sp_spaceused 'Activity' yields this (which seems right on the money): Activity 1143639 134488 KB 91072 KB 41768 KB 1648 KB Fill factor was 90. All primary keys are ints. Here is the command I used to 'updateusage': DBCC UPDATEUSAGE(0); Update 3: Per Edosoft's request: Image 111975 2407773 19262184 It appears as though the image table believes it's the 19GB portion. I don't understand what this means though. Is it really 19GB or is it misrepresented? Update 4: Talking to a co-worker and I found out that it's because of the pages, as someone else here has also state the potential for that. The only index on the image table is a clustered PK. Is this something I can fix or do I just have to deal with it? The regular script shows the Image table to be 6MB in size. Update 5: I think I'm just going to have to deal with it after further research. The images have been resized to be roughly 2-5KB each and on a normal file system doesn't consume much space but on SqlServer it seems to consume considerably more. The real answer, in the long run, will likely be separating that table in to another partition or something similar.

    Read the article

  • Where to look for real url

    - by smallB
    I'm trying to write simple application for downloading videos from youtube. My code for getting file (http://www.youtube.com/watch?v=pViMzR_ylXg) looks like: bool FD_core::get_file() { QNetworkRequest request; request.setUrl(QUrl("http://www.youtube.com/watch?v=pViMzR_ylXg")); connect(network_access_manager_, SIGNAL(finished(QNetworkReply*)), this, SLOT(onRequestCompleted(QNetworkReply *))); network_access_manager_->get(request); return true; } void FD_core::onRequestCompleted(QNetworkReply * reply) { QByteArray data_ = reply->readAll(); cout << data_.constData(); qDebug() << "size: " << data_.size(); } In the above function data_.constData() produces lots of text, part (very small) of it: <!DOCTYPE html> <html lang="en" dir="ltr" > <head> <script> var yt = yt || {};yt.timing = yt.timing || {};yt.timing.tick = function(label, opt_time) {var timer = yt.timing['timer'] || {};if(opt_time) {timer[label] = opt_time;}else {timer[label] = new Date().getTime();}yt.timing['timer'] = timer;};yt.timing.info = function(label, value) {var info_args = yt.timing['info_args'] || {};info_args[label] = value;yt.timing['info_args'] = info_args;};yt.timing.info('e', "907050,906359,927900,919320,914021,916611,922401,920704,912806,927201,925706,928001,922403,913546,913556,920201,911116,901451");yt.timing.wff = true;yt.timing.info('pr', "1");yt.timing.info('an', "dclk,aftv,afv");if (document.webkitVisibilityState == 'prerender') {document.addEventListener('webkitvisibilitychange', function() {yt.timing.tick('start');}, false);}yt.timing.tick('start');yt.timing.info('li','0');try {yt.timing['srt'] = window.gtbExternal && window.gtbExternal.pageT() ||window.external && window.external.pageT;} catch(e) {}if (window.chrome && window.chrome.csi) {yt.timing['srt'] = Math.floor(window.chrome.csi().pageT);}if (window.msPerformance && window.msPerformance.timing) {yt.timing['srt'] = window.msPerformance.timing.responseStart - window.msPerformance.timing.navigationStart;} </script> <script>var yt = yt || {};yt.preload = {};yt.preload.counter_ = 0;yt.preload.start = function(src) {var img = new Image();var counter = ++yt.preload.counter_;yt.preload[counter] = img;img.onload = img.onerror = function () {delete yt.preload[counter];};img.src = src;img = null;};yt.preload.start("http:\/\/o-o---preferred---sn-xn5ucu-q0ce---v3---lscache7.c.youtube.com\/crossdomain.xml");yt.preload.start("http:\/\/o-o---preferred---sn-xn5ucu-q0ce---v3---lscache7.c.youtube.com\/generate_204?ip=95.83.224.63\u0026upn=A3aUhLYV55M\u0026sparams=algorithm%2Cburst%2Ccp%2Cfactor%2Cgcr%2Cid%2Cip%2Cipbits%2Citag%2Csource%2Cupn%2Cexpire\u0026fexp=907050%2C906359%2C927900%2C919320%2C914021%2C916611%2C922401%2C920704%2C912806%2C927201%2C925706%2C928001%2C922403%2C913546%2C913556%2C920201%2C911116%2C901451\u0026mt=1354207274\u0026key=yt1\u0026algorithm=throttle-factor\u0026burst=40\u0026ipbits=8\u0026itag=34\u0026sver=3\u0026signature=692E605215EB4D2CA407291CA26E14B844768A89.7A2930CE25FDDFC7C4FF5AA56DD02538B0020267\u0026mv=m\u0026source=youtube\u0026ms=au\u0026gcr=ie\u0026expire=1354228237\u0026factor=1.25\u0026cp=U0hUSVJNVl9IUUNONF9KR1pDOi0tSFhhRzVFRkd6\u0026id=a5588ccd1ff29578");</script><title>Die Antwoord - Fok Julle Naaiers (Mike Tyson&#39;s Words NOT DJ Hi-Teks) - YouTube</title><link rel="search" type="application/opensearchdescription+xml" href="http://www.youtube.com/opensearch?locale=en_US" title="YouTube Video Search"><link rel="icon" href="http://s.ytimg.com/yts/img/favicon-vfldLzJxy.ico" type="image/x-icon"><link rel="shortcut icon" href="http://s.ytimg.com/yts/img/favicon-vfldLzJxy.ico" type="image/x-icon"> <link rel="icon" href="//s.ytimg.com/yts/img/favicon_32-vflWoMFGx.png" sizes="32x32"><link rel="canonical" href="/watch?v=pViMzR_ylXg"><link rel="alternate" media="handheld" href="http://m.youtube.com/watch?v=pViMzR_ylXg"><link rel="alternate" media="only screen and (max-width: 640px)" href="http://m.youtube.com/watch?v=pViMzR_ylXg"><link rel="shortlink" href="http://youtu.be/pViMzR_ylXg"> <meta name="title" content="Die Antwoord - Fok Julle Naaiers (Mike Tyson&#39;s Words NOT DJ Hi-Teks)"> <meta name="description" content="Some of the lyrics of &quot;Die Antwoord&quot; new single &quot;Fok Julle Naaiers&quot; have caused such controversy that Die Antwoord have split with their record label Intersc..."> <meta name="keywords" content="Die Antwoord, Fok Julle Naaiers, Mike Tyson, DJ Hi-Tek, Faggot"> <link rel="alternate" type="application/json+oembed" href="http://www.youtube.com/oembed?url=http%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DpViMzR_ylXg&amp;format=json" title="Die Antwoord - Fok Julle Naaiers (Mike Tyson&#39;s Words NOT DJ Hi-Teks)"> <link rel="alternate" type="text/xml+oembed" href="http://www.youtube.com/oembed?url=http%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DpViMzR_ylXg&amp;format=xml" title="Die Antwoord - Fok Julle Naaiers (Mike Tyson&#39;s Words NOT DJ Hi-Teks)"> <meta property="og:url" content="http://www.youtube.com/watch?v=pViMzR_ylXg"> <meta property="og:title" content="Die Antwoord - Fok Julle Naaiers (Mike Tyson&#39;s Words NOT DJ Hi-Teks)"> <meta property="og:description" content="Some of the lyrics of &quot;Die Antwoord&quot; new single &quot;Fok Julle Naaiers&quot; have caused such controversy that Die Antwoord have split with their record label Intersc..."> <meta property="og:type" content="video"> <meta property="og:image" content="http://i1.ytimg.com/vi/pViMzR_ylXg/mqdefault.jpg"> <meta property="og:video" content="http://www.youtube.com/v/pViMzR_ylXg?version=3&amp;autohide=1"> <meta property="og:video:type" content="application/x-shockwave-flash"> <meta property="og:video:width" content="853"> <meta property="og:video:height" content="480"> <meta property="og:site_name" content="YouTube"> <meta property="fb:app_id" content="87741124305"> <meta name="twitter:card" value="player"> <meta name="twitter:site" value="@youtube"> <meta name="twitter:player" value="https://www.youtube.com/embed/pViMzR_ylXg"> <meta property="twitter:player:width" content="853"> <meta property="twitter:player:height" content="480"> So my question is, where in this file is the url hidden which will allow me to download the wanted file?

    Read the article

  • Problem implementing Blinn–Phong shading model

    - by Joe Hopfgartner
    I did this very simple, perfectly working, implementation of Phong Relflection Model (There is no ambience implemented yet, but that doesn't bother me for now). The functions should be self explaining. /** * Implements the classic Phong illumination Model using a reflected light * vector. */ public class PhongIllumination implements IlluminationModel { @RGBParam(r = 0, g = 0, b = 0) public Vec3 ambient; @RGBParam(r = 1, g = 1, b = 1) public Vec3 diffuse; @RGBParam(r = 1, g = 1, b = 1) public Vec3 specular; @FloatParam(value = 20, min = 1, max = 200.0f) public float shininess; /* * Calculate the intensity of light reflected to the viewer . * * @param P = The surface position expressed in world coordinates. * * @param V = Normalized viewing vector from surface to eye in world * coordinates. * * @param N = Normalized normal vector at surface point in world * coordinates. * * @param surfaceColor = surfaceColor Color of the surface at the current * position. * * @param lights = The active light sources in the scene. * * @return Reflected light intensity I. */ public Vec3 shade(Vec3 P, Vec3 V, Vec3 N, Vec3 surfaceColor, Light lights[]) { Vec3 surfaceColordiffused = Vec3.mul(surfaceColor, diffuse); Vec3 totalintensity = new Vec3(0, 0, 0); for (int i = 0; i < lights.length; i++) { Vec3 L = lights[i].calcDirection(P); N = N.normalize(); V = V.normalize(); Vec3 R = Vec3.reflect(L, N); // reflection vector float diffuseLight = Vec3.dot(N, L); float specularLight = Vec3.dot(V, R); if (diffuseLight > 0) { totalintensity = Vec3.add(Vec3.mul(Vec3.mul( surfaceColordiffused, lights[i].calcIntensity(P)), diffuseLight), totalintensity); if (specularLight > 0) { Vec3 Il = lights[i].calcIntensity(P); Vec3 Ilincident = Vec3.mul(Il, Math.max(0.0f, Vec3 .dot(N, L))); Vec3 intensity = Vec3.mul(Vec3.mul(specular, Ilincident), (float) Math.pow(specularLight, shininess)); totalintensity = Vec3.add(totalintensity, intensity); } } } return totalintensity; } } Now i need to adapt it to become a Blinn-Phong illumination model I used the formulas from hearn and baker, followed pseudocodes and tried to implement it multiple times according to wikipedia articles in several languages but it never worked. I just get no specular reflections or they are so weak and/or are at the wrong place and/or have the wrong color. From the numerous wrong implementations I post some little code that already seems to be wrong. So I calculate my Half Way vector and my new specular light like so: Vec3 H = Vec3.mul(Vec3.add(L.normalize(), V), Vec3.add(L.normalize(), V).length()); float specularLight = Vec3.dot(H, N); With theese little changes it should already work (maby not with correct intensity but basically it should be correct). But the result is wrong. Here are two images. Left how it should render correctly and right how it renders. If i lower the shininess factor you can see a little specular light at the top right: Altough I understand the concept of Phong illumination and also the simplified more performant adaptaion of blinn phong I am trying around for days and just cant get it to work. Any help is appriciated. Edit: I was made aware of an error by this answer, that i am mutiplying by |L+V| instead of dividing by it when calculating H. I changed to deviding doing so: Vec3 H = Vec3.mul(Vec3.add(L.normalize(), V), 1/Vec3.add(L.normalize(), V).length()); Unfortunately this doesnt change much. The results look like this: and if I rise the specular constant and lower the shininess You can see the effects more clearly in a smilar wrong way: However this division just the normalisation. I think I am missing one step. Because the formulas like this just dont make sense to me. If you look at this picture: http://en.wikipedia.org/wiki/File:Blinn-Phong_vectors.svg The projection of H to N is far less than V to R. And if you imagine changing the vector V in the picture the angle is the same when the viewing vector is "on the left side". and becomes more and more different when going to the right. I pesonally would multiply the whole projection by two to become something similiar (and the hole point is to avoid the calculation of R). Altough I didnt read anythinga bout that anywehre i am gonna try this out... Result: The intension of the specular light is far too much (white areas) and the position is still wrong. I think I am messing something else up because teh reflection are just at the wrong place. But what? Edit: Now I read on wikipedia in the notes that the angle of N/H is in fact approximalty half or V/R. To compensate that i should multiply my shineness exponent by 4 rather than my projection. If i do that I end up with this: Far to intense but still one thing. The projection is at the wrong place. Where could i mess up my vectors?

    Read the article

  • How should I change my Graph structure (very slow insertion)?

    - by Nazgulled
    Hi, This program I'm doing is about a social network, which means there are users and their profiles. The profiles structure is UserProfile. Now, there are various possible Graph implementations and I don't think I'm using the best one. I have a Graph structure and inside, there's a pointer to a linked list of type Vertex. Each Vertex element has a value, a pointer to the next Vertex and a pointer to a linked list of type Edge. Each Edge element has a value (so I can define weights and whatever it's needed), a pointer to the next Edge and a pointer to the Vertex owner. I have a 2 sample files with data to process (in CSV style) and insert into the Graph. The first one is the user data (one user per line); the second one is the user relations (for the graph). The first file is quickly inserted into the graph cause I always insert at the head and there's like ~18000 users. The second file takes ages but I still insert the edges at the head. The file has about ~520000 lines of user relations and takes between 13-15mins to insert into the Graph. I made a quick test and reading the data is pretty quickly, instantaneously really. The problem is in the insertion. This problem exists because I have a Graph implemented with linked lists for the vertices. Every time I need to insert a relation, I need to lookup for 2 vertices, so I can link them together. This is the problem... Doing this for ~520000 relations, takes a while. How should I solve this? Solution 1) Some people recommended me to implement the Graph (the vertices part) as an array instead of a linked list. This way I have direct access to every vertex and the insertion is probably going to drop considerably. But, I don't like the idea of allocating an array with [18000] elements. How practically is this? My sample data has ~18000, but what if I need much less or much more? The linked list approach has that flexibility, I can have whatever size I want as long as there's memory for it. But the array doesn't, how am I going to handle such situation? What are your suggestions? Using linked lists is good for space complexity but bad for time complexity. And using an array is good for time complexity but bad for space complexity. Any thoughts about this solution? Solution 2) This project also demands that I have some sort of data structures that allows quick lookup based on a name index and an ID index. For this I decided to use Hash Tables. My tables are implemented with separate chaining as collision resolution and when a load factor of 0.70 is reach, I normally recreate the table. I base the next table size on this http://planetmath.org/encyclopedia/GoodHashTablePrimes.html. Currently, both Hash Tables hold a pointer to the UserProfile instead of duplication the user profile itself. That would be stupid, changing data would require 3 changes and it's really dumb to do it that way. So I just save the pointer to the UserProfile. The same user profile pointer is also saved as value in each Graph Vertex. So, I have 3 data structures, one Graph and two Hash Tables and every single one of them point to the same exact UserProfile. The Graph structure will serve the purpose of finding the shortest path and stuff like that while the Hash Tables serve as quick index by name and ID. What I'm thinking to solve my Graph problem is to, instead of having the Hash Tables value point to the UserProfile, I point it to the corresponding Vertex. It's still a pointer, no more and no less space is used, I just change what I point to. Like this, I can easily and quickly lookup for each Vertex I need and link them together. This will insert the ~520000 relations pretty quickly. I thought of this solution because I already have the Hash Tables and I need to have them, then, why not take advantage of them for indexing the Graph vertices instead of the user profile? It's basically the same thing, I can still access the UserProfile pretty quickly, just go to the Vertex and then to the UserProfile. But, do you see any cons on this second solution against the first one? Or only pros that overpower the pros and cons on the first solution? Other Solution) If you have any other solution, I'm all ears. But please explain the pros and cons of that solution over the previous 2. I really don't have much time to be wasting with this right now, I need to move on with this project, so, if I'm doing to do such a change, I need to understand exactly what to change and if that's really the way to go. Hopefully no one fell asleep reading this and closed the browser, sorry for the big testament. But I really need to decide what to do about this and I really need to make a change. P.S: When answering my proposed solutions, please enumerate them as I did so I know exactly what are you talking about and don't confuse my self more than I already am.

    Read the article

  • Improving HTML scrapper efficiency with pcntl_fork()

    - by Michael Pasqualone
    With the help from two previous questions, I now have a working HTML scrapper that feeds product information into a database. What I am now trying to do is improve efficiently by wrapping my brain around with getting my scrapper working with pcntl_fork. If I split my php5-cli script into 10 separate chunks, I improve total runtime by a large factor so I know I am not i/o or cpu bound but just limited by the linear nature of my scraping functions. Using code I've cobbled together from multiple sources, I have this working test: <?php libxml_use_internal_errors(true); ini_set('max_execution_time', 0); ini_set('max_input_time', 0); set_time_limit(0); $hrefArray = array("http://slashdot.org", "http://slashdot.org", "http://slashdot.org", "http://slashdot.org"); function doDomStuff($singleHref,$childPid) { $html = new DOMDocument(); $html->loadHtmlFile($singleHref); $xPath = new DOMXPath($html); $domQuery = '//div[@id="slogan"]/h2'; $domReturn = $xPath->query($domQuery); foreach($domReturn as $return) { $slogan = $return->nodeValue; echo "Child PID #" . $childPid . " says: " . $slogan . "\n"; } } $pids = array(); foreach ($hrefArray as $singleHref) { $pid = pcntl_fork(); if ($pid == -1) { die("Couldn't fork, error!"); } elseif ($pid > 0) { // We are the parent $pids[] = $pid; } else { // We are the child $childPid = posix_getpid(); doDomStuff($singleHref,$childPid); exit(0); } } foreach ($pids as $pid) { pcntl_waitpid($pid, $status); } // Clear the libxml buffer so it doesn't fill up libxml_clear_errors(); Which raises the following questions: 1) Given my hrefArray contains 4 urls - if the array was to contain say 1,000 product urls this code would spawn 1,000 child processes? If so, what is the best way to limit the amount of processes to say 10, and again 1,000 urls as an example split the child work load to 100 products per child (10 x 100). 2) I've learn that pcntl_fork creates a copy of the process and all variables, classes, etc. What I would like to do is replace my hrefArray variable with a DOMDocument query that builds the list of products to scrape, and then feeds them off to child processes to do the processing - so spreading the load across 10 child workers. My brain is telling I need to do something like the following (obviously this doesn't work, so don't run it): <?php libxml_use_internal_errors(true); ini_set('max_execution_time', 0); ini_set('max_input_time', 0); set_time_limit(0); $maxChildWorkers = 10; $html = new DOMDocument(); $html->loadHtmlFile('http://xxxx'); $xPath = new DOMXPath($html); $domQuery = '//div[@id=productDetail]/a'; $domReturn = $xPath->query($domQuery); $hrefsArray[] = $domReturn->getAttribute('href'); function doDomStuff($singleHref) { // Do stuff here with each product } // To figure out: Split href array into $maxChilderWorks # of workArray1, workArray2 ... workArray10. $pids = array(); foreach ($workArray(1,2,3 ... 10) as $singleHref) { $pid = pcntl_fork(); if ($pid == -1) { die("Couldn't fork, error!"); } elseif ($pid > 0) { // We are the parent $pids[] = $pid; } else { // We are the child $childPid = posix_getpid(); doDomStuff($singleHref); exit(0); } } foreach ($pids as $pid) { pcntl_waitpid($pid, $status); } // Clear the libxml buffer so it doesn't fill up libxml_clear_errors(); But what I can't figure out is how to build my hrefsArray[] in the master/parent process only and feed it off to the child process. Currently everything I've tried causes loops in the child processes. I.e. my hrefsArray gets built in the master, and in each subsequent child process. I am sure I am going about this all totally wrong, so would greatly appreciate just general nudge in the right direction.

    Read the article

  • Qt C++ signals and slots did not fire

    - by Xegara
    I have programmed Qt a couple of times already and I really like the signals and slots feature. But now, I guess I'm having a problem when a signal is emitted from one thread, the corresponding slot from another thread is not fired. The connection was made in the main program. This is also my first time to use Qt for ROS which uses CMake. The signal fired by the QThread triggered their corresponding slots but the emitted signal of my class UserInput did not trigger the slot in tflistener where it supposed to. I have tried everything I can. Any help? The code is provided below. Main.cpp #include <QCoreApplication> #include <QThread> #include "userinput.h" #include "tfcompleter.h" int main(int argc, char** argv) { QCoreApplication app(argc, argv); QThread *thread1 = new QThread(); QThread *thread2 = new QThread(); UserInput *input1 = new UserInput(); TfCompleter *completer = new TfCompleter(); QObject::connect(input1, SIGNAL(togglePause2()), completer, SLOT(toggle())); QObject::connect(thread1, SIGNAL(started()), completer, SLOT(startCounting())); QObject::connect(thread2, SIGNAL(started()), input1, SLOT(start())); completer->moveToThread(thread1); input1->moveToThread(thread2); thread1->start(); thread2->start(); app.exec(); return 0; } What I want to do is.. There are two seperate threads. One thread is for the user input. When the user enters [space], the thread emits a signal to toggle the boolean member field of the other thread. The other thread 's task is to just continue its process if the user wants it to run, otherwise, the user does not want it to run. I wanted to grant the user to toggle the processing anytime that he wants, that's why I decided to bring them into seperate threads. The following codes are the tflistener and userinput. tfcompleter.h #ifndef TFCOMPLETER_H #define TFCOMPLETER_H #include <QObject> #include <QtCore> class TfCompleter : public QObject { Q_OBJECT private: bool isCount; public Q_SLOTS: void toggle(); void startCounting(); }; #endif tflistener.cpp #include "tfcompleter.h" #include <iostream> void TfCompleter::startCounting() { static uint i = 0; while(true) { if(isCount) std::cout << i++ << std::endl; } } void TfCompleter::toggle() { // isCount = ~isCount; std::cout << "isCount " << std::endl; } UserInput.h #ifndef USERINPUT_H #define USERINPUT_H #include <QObject> #include <QtCore> class UserInput : public QObject { Q_OBJECT public Q_SLOTS: void start(); // Waits for the keypress from the user and emits the corresponding signal. public: Q_SIGNALS: void togglePause2(); }; #endif UserInput.cpp #include "userinput.h" #include <iostream> #include <cstdio> // Implementation of getch #include <termios.h> #include <unistd.h> /* reads from keypress, doesn't echo */ int getch(void) { struct termios oldattr, newattr; int ch; tcgetattr( STDIN_FILENO, &oldattr ); newattr = oldattr; newattr.c_lflag &= ~( ICANON | ECHO ); tcsetattr( STDIN_FILENO, TCSANOW, &newattr ); ch = getchar(); tcsetattr( STDIN_FILENO, TCSANOW, &oldattr ); return ch; } void UserInput::start() { char c = 0; while (true) { c = getch(); if (c == ' ') { Q_EMIT togglePause2(); std::cout << "SPACE" << std::endl; } c = 0; } } Here is the CMakeLists.txt. I just placed it here also since I don't know maybe the CMake has also a factor here. CMakeLists.txt ############################################################################## # CMake ############################################################################## cmake_minimum_required(VERSION 2.4.6) ############################################################################## # Ros Initialisation ############################################################################## include($ENV{ROS_ROOT}/core/rosbuild/rosbuild.cmake) rosbuild_init() set(CMAKE_AUTOMOC ON) #set the default path for built executables to the "bin" directory set(EXECUTABLE_OUTPUT_PATH ${PROJECT_SOURCE_DIR}/bin) #set the default path for built libraries to the "lib" directory set(LIBRARY_OUTPUT_PATH ${PROJECT_SOURCE_DIR}/lib) # Set the build type. Options are: # Coverage : w/ debug symbols, w/o optimization, w/ code-coverage # Debug : w/ debug symbols, w/o optimization # Release : w/o debug symbols, w/ optimization # RelWithDebInfo : w/ debug symbols, w/ optimization # MinSizeRel : w/o debug symbols, w/ optimization, stripped binaries #set(ROS_BUILD_TYPE Debug) ############################################################################## # Qt Environment ############################################################################## # Could use this, but qt-ros would need an updated deb, instead we'll move to catkin # rosbuild_include(qt_build qt-ros) rosbuild_find_ros_package(qt_build) include(${qt_build_PACKAGE_PATH}/qt-ros.cmake) rosbuild_prepare_qt4(QtCore) # Add the appropriate components to the component list here ADD_DEFINITIONS(-DQT_NO_KEYWORDS) ############################################################################## # Sections ############################################################################## #file(GLOB QT_FORMS RELATIVE ${CMAKE_CURRENT_SOURCE_DIR} ui/*.ui) #file(GLOB QT_RESOURCES RELATIVE ${CMAKE_CURRENT_SOURCE_DIR} resources/*.qrc) file(GLOB_RECURSE QT_MOC RELATIVE ${CMAKE_CURRENT_SOURCE_DIR} FOLLOW_SYMLINKS include/rgbdslam_client/*.hpp) #QT4_ADD_RESOURCES(QT_RESOURCES_CPP ${QT_RESOURCES}) #QT4_WRAP_UI(QT_FORMS_HPP ${QT_FORMS}) QT4_WRAP_CPP(QT_MOC_HPP ${QT_MOC}) ############################################################################## # Sources ############################################################################## file(GLOB_RECURSE QT_SOURCES RELATIVE ${CMAKE_CURRENT_SOURCE_DIR} FOLLOW_SYMLINKS src/*.cpp) ############################################################################## # Binaries ############################################################################## rosbuild_add_executable(rgbdslam_client ${QT_SOURCES} ${QT_MOC_HPP}) #rosbuild_add_executable(rgbdslam_client ${QT_SOURCES} ${QT_RESOURCES_CPP} ${QT_FORMS_HPP} ${QT_MOC_HPP}) target_link_libraries(rgbdslam_client ${QT_LIBRARIES})

    Read the article

  • A Taxonomy of Numerical Methods v1

    - by JoshReuben
    Numerical Analysis – When, What, (but not how) Once you understand the Math & know C++, Numerical Methods are basically blocks of iterative & conditional math code. I found the real trick was seeing the forest for the trees – knowing which method to use for which situation. Its pretty easy to get lost in the details – so I’ve tried to organize these methods in a way that I can quickly look this up. I’ve included links to detailed explanations and to C++ code examples. I’ve tried to classify Numerical methods in the following broad categories: Solving Systems of Linear Equations Solving Non-Linear Equations Iteratively Interpolation Curve Fitting Optimization Numerical Differentiation & Integration Solving ODEs Boundary Problems Solving EigenValue problems Enjoy – I did ! Solving Systems of Linear Equations Overview Solve sets of algebraic equations with x unknowns The set is commonly in matrix form Gauss-Jordan Elimination http://en.wikipedia.org/wiki/Gauss%E2%80%93Jordan_elimination C++: http://www.codekeep.net/snippets/623f1923-e03c-4636-8c92-c9dc7aa0d3c0.aspx Produces solution of the equations & the coefficient matrix Efficient, stable 2 steps: · Forward Elimination – matrix decomposition: reduce set to triangular form (0s below the diagonal) or row echelon form. If degenerate, then there is no solution · Backward Elimination –write the original matrix as the product of ints inverse matrix & its reduced row-echelon matrix à reduce set to row canonical form & use back-substitution to find the solution to the set Elementary ops for matrix decomposition: · Row multiplication · Row switching · Add multiples of rows to other rows Use pivoting to ensure rows are ordered for achieving triangular form LU Decomposition http://en.wikipedia.org/wiki/LU_decomposition C++: http://ganeshtiwaridotcomdotnp.blogspot.co.il/2009/12/c-c-code-lu-decomposition-for-solving.html Represent the matrix as a product of lower & upper triangular matrices A modified version of GJ Elimination Advantage – can easily apply forward & backward elimination to solve triangular matrices Techniques: · Doolittle Method – sets the L matrix diagonal to unity · Crout Method - sets the U matrix diagonal to unity Note: both the L & U matrices share the same unity diagonal & can be stored compactly in the same matrix Gauss-Seidel Iteration http://en.wikipedia.org/wiki/Gauss%E2%80%93Seidel_method C++: http://www.nr.com/forum/showthread.php?t=722 Transform the linear set of equations into a single equation & then use numerical integration (as integration formulas have Sums, it is implemented iteratively). an optimization of Gauss-Jacobi: 1.5 times faster, requires 0.25 iterations to achieve the same tolerance Solving Non-Linear Equations Iteratively find roots of polynomials – there may be 0, 1 or n solutions for an n order polynomial use iterative techniques Iterative methods · used when there are no known analytical techniques · Requires set functions to be continuous & differentiable · Requires an initial seed value – choice is critical to convergence à conduct multiple runs with different starting points & then select best result · Systematic - iterate until diminishing returns, tolerance or max iteration conditions are met · bracketing techniques will always yield convergent solutions, non-bracketing methods may fail to converge Incremental method if a nonlinear function has opposite signs at 2 ends of a small interval x1 & x2, then there is likely to be a solution in their interval – solutions are detected by evaluating a function over interval steps, for a change in sign, adjusting the step size dynamically. Limitations – can miss closely spaced solutions in large intervals, cannot detect degenerate (coinciding) solutions, limited to functions that cross the x-axis, gives false positives for singularities Fixed point method http://en.wikipedia.org/wiki/Fixed-point_iteration C++: http://books.google.co.il/books?id=weYj75E_t6MC&pg=PA79&lpg=PA79&dq=fixed+point+method++c%2B%2B&source=bl&ots=LQ-5P_taoC&sig=lENUUIYBK53tZtTwNfHLy5PEWDk&hl=en&sa=X&ei=wezDUPW1J5DptQaMsIHQCw&redir_esc=y#v=onepage&q=fixed%20point%20method%20%20c%2B%2B&f=false Algebraically rearrange a solution to isolate a variable then apply incremental method Bisection method http://en.wikipedia.org/wiki/Bisection_method C++: http://numericalcomputing.wordpress.com/category/algorithms/ Bracketed - Select an initial interval, keep bisecting it ad midpoint into sub-intervals and then apply incremental method on smaller & smaller intervals – zoom in Adv: unaffected by function gradient à reliable Disadv: slow convergence False Position Method http://en.wikipedia.org/wiki/False_position_method C++: http://www.dreamincode.net/forums/topic/126100-bisection-and-false-position-methods/ Bracketed - Select an initial interval , & use the relative value of function at interval end points to select next sub-intervals (estimate how far between the end points the solution might be & subdivide based on this) Newton-Raphson method http://en.wikipedia.org/wiki/Newton's_method C++: http://www-users.cselabs.umn.edu/classes/Summer-2012/csci1113/index.php?page=./newt3 Also known as Newton's method Convenient, efficient Not bracketed – only a single initial guess is required to start iteration – requires an analytical expression for the first derivative of the function as input. Evaluates the function & its derivative at each step. Can be extended to the Newton MutiRoot method for solving multiple roots Can be easily applied to an of n-coupled set of non-linear equations – conduct a Taylor Series expansion of a function, dropping terms of order n, rewrite as a Jacobian matrix of PDs & convert to simultaneous linear equations !!! Secant Method http://en.wikipedia.org/wiki/Secant_method C++: http://forum.vcoderz.com/showthread.php?p=205230 Unlike N-R, can estimate first derivative from an initial interval (does not require root to be bracketed) instead of inputting it Since derivative is approximated, may converge slower. Is fast in practice as it does not have to evaluate the derivative at each step. Similar implementation to False Positive method Birge-Vieta Method http://mat.iitm.ac.in/home/sryedida/public_html/caimna/transcendental/polynomial%20methods/bv%20method.html C++: http://books.google.co.il/books?id=cL1boM2uyQwC&pg=SA3-PA51&lpg=SA3-PA51&dq=Birge-Vieta+Method+c%2B%2B&source=bl&ots=QZmnDTK3rC&sig=BPNcHHbpR_DKVoZXrLi4nVXD-gg&hl=en&sa=X&ei=R-_DUK2iNIjzsgbE5ID4Dg&redir_esc=y#v=onepage&q=Birge-Vieta%20Method%20c%2B%2B&f=false combines Horner's method of polynomial evaluation (transforming into lesser degree polynomials that are more computationally efficient to process) with Newton-Raphson to provide a computational speed-up Interpolation Overview Construct new data points for as close as possible fit within range of a discrete set of known points (that were obtained via sampling, experimentation) Use Taylor Series Expansion of a function f(x) around a specific value for x Linear Interpolation http://en.wikipedia.org/wiki/Linear_interpolation C++: http://www.hamaluik.com/?p=289 Straight line between 2 points à concatenate interpolants between each pair of data points Bilinear Interpolation http://en.wikipedia.org/wiki/Bilinear_interpolation C++: http://supercomputingblog.com/graphics/coding-bilinear-interpolation/2/ Extension of the linear function for interpolating functions of 2 variables – perform linear interpolation first in 1 direction, then in another. Used in image processing – e.g. texture mapping filter. Uses 4 vertices to interpolate a value within a unit cell. Lagrange Interpolation http://en.wikipedia.org/wiki/Lagrange_polynomial C++: http://www.codecogs.com/code/maths/approximation/interpolation/lagrange.php For polynomials Requires recomputation for all terms for each distinct x value – can only be applied for small number of nodes Numerically unstable Barycentric Interpolation http://epubs.siam.org/doi/pdf/10.1137/S0036144502417715 C++: http://www.gamedev.net/topic/621445-barycentric-coordinates-c-code-check/ Rearrange the terms in the equation of the Legrange interpolation by defining weight functions that are independent of the interpolated value of x Newton Divided Difference Interpolation http://en.wikipedia.org/wiki/Newton_polynomial C++: http://jee-appy.blogspot.co.il/2011/12/newton-divided-difference-interpolation.html Hermite Divided Differences: Interpolation polynomial approximation for a given set of data points in the NR form - divided differences are used to approximately calculate the various differences. For a given set of 3 data points , fit a quadratic interpolant through the data Bracketed functions allow Newton divided differences to be calculated recursively Difference table Cubic Spline Interpolation http://en.wikipedia.org/wiki/Spline_interpolation C++: https://www.marcusbannerman.co.uk/index.php/home/latestarticles/42-articles/96-cubic-spline-class.html Spline is a piecewise polynomial Provides smoothness – for interpolations with significantly varying data Use weighted coefficients to bend the function to be smooth & its 1st & 2nd derivatives are continuous through the edge points in the interval Curve Fitting A generalization of interpolating whereby given data points may contain noise à the curve does not necessarily pass through all the points Least Squares Fit http://en.wikipedia.org/wiki/Least_squares C++: http://www.ccas.ru/mmes/educat/lab04k/02/least-squares.c Residual – difference between observed value & expected value Model function is often chosen as a linear combination of the specified functions Determines: A) The model instance in which the sum of squared residuals has the least value B) param values for which model best fits data Straight Line Fit Linear correlation between independent variable and dependent variable Linear Regression http://en.wikipedia.org/wiki/Linear_regression C++: http://www.oocities.org/david_swaim/cpp/linregc.htm Special case of statistically exact extrapolation Leverage least squares Given a basis function, the sum of the residuals is determined and the corresponding gradient equation is expressed as a set of normal linear equations in matrix form that can be solved (e.g. using LU Decomposition) Can be weighted - Drop the assumption that all errors have the same significance –-> confidence of accuracy is different for each data point. Fit the function closer to points with higher weights Polynomial Fit - use a polynomial basis function Moving Average http://en.wikipedia.org/wiki/Moving_average C++: http://www.codeproject.com/Articles/17860/A-Simple-Moving-Average-Algorithm Used for smoothing (cancel fluctuations to highlight longer-term trends & cycles), time series data analysis, signal processing filters Replace each data point with average of neighbors. Can be simple (SMA), weighted (WMA), exponential (EMA). Lags behind latest data points – extra weight can be given to more recent data points. Weights can decrease arithmetically or exponentially according to distance from point. Parameters: smoothing factor, period, weight basis Optimization Overview Given function with multiple variables, find Min (or max by minimizing –f(x)) Iterative approach Efficient, but not necessarily reliable Conditions: noisy data, constraints, non-linear models Detection via sign of first derivative - Derivative of saddle points will be 0 Local minima Bisection method Similar method for finding a root for a non-linear equation Start with an interval that contains a minimum Golden Search method http://en.wikipedia.org/wiki/Golden_section_search C++: http://www.codecogs.com/code/maths/optimization/golden.php Bisect intervals according to golden ratio 0.618.. Achieves reduction by evaluating a single function instead of 2 Newton-Raphson Method Brent method http://en.wikipedia.org/wiki/Brent's_method C++: http://people.sc.fsu.edu/~jburkardt/cpp_src/brent/brent.cpp Based on quadratic or parabolic interpolation – if the function is smooth & parabolic near to the minimum, then a parabola fitted through any 3 points should approximate the minima – fails when the 3 points are collinear , in which case the denominator is 0 Simplex Method http://en.wikipedia.org/wiki/Simplex_algorithm C++: http://www.codeguru.com/cpp/article.php/c17505/Simplex-Optimization-Algorithm-and-Implemetation-in-C-Programming.htm Find the global minima of any multi-variable function Direct search – no derivatives required At each step it maintains a non-degenerative simplex – a convex hull of n+1 vertices. Obtains the minimum for a function with n variables by evaluating the function at n-1 points, iteratively replacing the point of worst result with the point of best result, shrinking the multidimensional simplex around the best point. Point replacement involves expanding & contracting the simplex near the worst value point to determine a better replacement point Oscillation can be avoided by choosing the 2nd worst result Restart if it gets stuck Parameters: contraction & expansion factors Simulated Annealing http://en.wikipedia.org/wiki/Simulated_annealing C++: http://code.google.com/p/cppsimulatedannealing/ Analogy to heating & cooling metal to strengthen its structure Stochastic method – apply random permutation search for global minima - Avoid entrapment in local minima via hill climbing Heating schedule - Annealing schedule params: temperature, iterations at each temp, temperature delta Cooling schedule – can be linear, step-wise or exponential Differential Evolution http://en.wikipedia.org/wiki/Differential_evolution C++: http://www.amichel.com/de/doc/html/ More advanced stochastic methods analogous to biological processes: Genetic algorithms, evolution strategies Parallel direct search method against multiple discrete or continuous variables Initial population of variable vectors chosen randomly – if weighted difference vector of 2 vectors yields a lower objective function value then it replaces the comparison vector Many params: #parents, #variables, step size, crossover constant etc Convergence is slow – many more function evaluations than simulated annealing Numerical Differentiation Overview 2 approaches to finite difference methods: · A) approximate function via polynomial interpolation then differentiate · B) Taylor series approximation – additionally provides error estimate Finite Difference methods http://en.wikipedia.org/wiki/Finite_difference_method C++: http://www.wpi.edu/Pubs/ETD/Available/etd-051807-164436/unrestricted/EAMPADU.pdf Find differences between high order derivative values - Approximate differential equations by finite differences at evenly spaced data points Based on forward & backward Taylor series expansion of f(x) about x plus or minus multiples of delta h. Forward / backward difference - the sums of the series contains even derivatives and the difference of the series contains odd derivatives – coupled equations that can be solved. Provide an approximation of the derivative within a O(h^2) accuracy There is also central difference & extended central difference which has a O(h^4) accuracy Richardson Extrapolation http://en.wikipedia.org/wiki/Richardson_extrapolation C++: http://mathscoding.blogspot.co.il/2012/02/introduction-richardson-extrapolation.html A sequence acceleration method applied to finite differences Fast convergence, high accuracy O(h^4) Derivatives via Interpolation Cannot apply Finite Difference method to discrete data points at uneven intervals – so need to approximate the derivative of f(x) using the derivative of the interpolant via 3 point Lagrange Interpolation Note: the higher the order of the derivative, the lower the approximation precision Numerical Integration Estimate finite & infinite integrals of functions More accurate procedure than numerical differentiation Use when it is not possible to obtain an integral of a function analytically or when the function is not given, only the data points are Newton Cotes Methods http://en.wikipedia.org/wiki/Newton%E2%80%93Cotes_formulas C++: http://www.siafoo.net/snippet/324 For equally spaced data points Computationally easy – based on local interpolation of n rectangular strip areas that is piecewise fitted to a polynomial to get the sum total area Evaluate the integrand at n+1 evenly spaced points – approximate definite integral by Sum Weights are derived from Lagrange Basis polynomials Leverage Trapezoidal Rule for default 2nd formulas, Simpson 1/3 Rule for substituting 3 point formulas, Simpson 3/8 Rule for 4 point formulas. For 4 point formulas use Bodes Rule. Higher orders obtain more accurate results Trapezoidal Rule uses simple area, Simpsons Rule replaces the integrand f(x) with a quadratic polynomial p(x) that uses the same values as f(x) for its end points, but adds a midpoint Romberg Integration http://en.wikipedia.org/wiki/Romberg's_method C++: http://code.google.com/p/romberg-integration/downloads/detail?name=romberg.cpp&can=2&q= Combines trapezoidal rule with Richardson Extrapolation Evaluates the integrand at equally spaced points The integrand must have continuous derivatives Each R(n,m) extrapolation uses a higher order integrand polynomial replacement rule (zeroth starts with trapezoidal) à a lower triangular matrix set of equation coefficients where the bottom right term has the most accurate approximation. The process continues until the difference between 2 successive diagonal terms becomes sufficiently small. Gaussian Quadrature http://en.wikipedia.org/wiki/Gaussian_quadrature C++: http://www.alglib.net/integration/gaussianquadratures.php Data points are chosen to yield best possible accuracy – requires fewer evaluations Ability to handle singularities, functions that are difficult to evaluate The integrand can include a weighting function determined by a set of orthogonal polynomials. Points & weights are selected so that the integrand yields the exact integral if f(x) is a polynomial of degree <= 2n+1 Techniques (basically different weighting functions): · Gauss-Legendre Integration w(x)=1 · Gauss-Laguerre Integration w(x)=e^-x · Gauss-Hermite Integration w(x)=e^-x^2 · Gauss-Chebyshev Integration w(x)= 1 / Sqrt(1-x^2) Solving ODEs Use when high order differential equations cannot be solved analytically Evaluated under boundary conditions RK for systems – a high order differential equation can always be transformed into a coupled first order system of equations Euler method http://en.wikipedia.org/wiki/Euler_method C++: http://rosettacode.org/wiki/Euler_method First order Runge–Kutta method. Simple recursive method – given an initial value, calculate derivative deltas. Unstable & not very accurate (O(h) error) – not used in practice A first-order method - the local error (truncation error per step) is proportional to the square of the step size, and the global error (error at a given time) is proportional to the step size In evolving solution between data points xn & xn+1, only evaluates derivatives at beginning of interval xn à asymmetric at boundaries Higher order Runge Kutta http://en.wikipedia.org/wiki/Runge%E2%80%93Kutta_methods C++: http://www.dreamincode.net/code/snippet1441.htm 2nd & 4th order RK - Introduces parameterized midpoints for more symmetric solutions à accuracy at higher computational cost Adaptive RK – RK-Fehlberg – estimate the truncation at each integration step & automatically adjust the step size to keep error within prescribed limits. At each step 2 approximations are compared – if in disagreement to a specific accuracy, the step size is reduced Boundary Value Problems Where solution of differential equations are located at 2 different values of the independent variable x à more difficult, because cannot just start at point of initial value – there may not be enough starting conditions available at the end points to produce a unique solution An n-order equation will require n boundary conditions – need to determine the missing n-1 conditions which cause the given conditions at the other boundary to be satisfied Shooting Method http://en.wikipedia.org/wiki/Shooting_method C++: http://ganeshtiwaridotcomdotnp.blogspot.co.il/2009/12/c-c-code-shooting-method-for-solving.html Iteratively guess the missing values for one end & integrate, then inspect the discrepancy with the boundary values of the other end to adjust the estimate Given the starting boundary values u1 & u2 which contain the root u, solve u given the false position method (solving the differential equation as an initial value problem via 4th order RK), then use u to solve the differential equations. Finite Difference Method For linear & non-linear systems Higher order derivatives require more computational steps – some combinations for boundary conditions may not work though Improve the accuracy by increasing the number of mesh points Solving EigenValue Problems An eigenvalue can substitute a matrix when doing matrix multiplication à convert matrix multiplication into a polynomial EigenValue For a given set of equations in matrix form, determine what are the solution eigenvalue & eigenvectors Similar Matrices - have same eigenvalues. Use orthogonal similarity transforms to reduce a matrix to diagonal form from which eigenvalue(s) & eigenvectors can be computed iteratively Jacobi method http://en.wikipedia.org/wiki/Jacobi_method C++: http://people.sc.fsu.edu/~jburkardt/classes/acs2_2008/openmp/jacobi/jacobi.html Robust but Computationally intense – use for small matrices < 10x10 Power Iteration http://en.wikipedia.org/wiki/Power_iteration For any given real symmetric matrix, generate the largest single eigenvalue & its eigenvectors Simplest method – does not compute matrix decomposition à suitable for large, sparse matrices Inverse Iteration Variation of power iteration method – generates the smallest eigenvalue from the inverse matrix Rayleigh Method http://en.wikipedia.org/wiki/Rayleigh's_method_of_dimensional_analysis Variation of power iteration method Rayleigh Quotient Method Variation of inverse iteration method Matrix Tri-diagonalization Method Use householder algorithm to reduce an NxN symmetric matrix to a tridiagonal real symmetric matrix vua N-2 orthogonal transforms     Whats Next Outside of Numerical Methods there are lots of different types of algorithms that I’ve learned over the decades: Data Mining – (I covered this briefly in a previous post: http://geekswithblogs.net/JoshReuben/archive/2007/12/31/ssas-dm-algorithms.aspx ) Search & Sort Routing Problem Solving Logical Theorem Proving Planning Probabilistic Reasoning Machine Learning Solvers (eg MIP) Bioinformatics (Sequence Alignment, Protein Folding) Quant Finance (I read Wilmott’s books – interesting) Sooner or later, I’ll cover the above topics as well.

    Read the article

  • Toorcon14

    - by danx
    Toorcon 2012 Information Security Conference San Diego, CA, http://www.toorcon.org/ Dan Anderson, October 2012 It's almost Halloween, and we all know what that means—yes, of course, it's time for another Toorcon Conference! Toorcon is an annual conference for people interested in computer security. This includes the whole range of hackers, computer hobbyists, professionals, security consultants, press, law enforcement, prosecutors, FBI, etc. We're at Toorcon 14—see earlier blogs for some of the previous Toorcon's I've attended (back to 2003). This year's "con" was held at the Westin on Broadway in downtown San Diego, California. The following are not necessarily my views—I'm just the messenger—although I could have misquoted or misparaphrased the speakers. Also, I only reviewed some of the talks, below, which I attended and interested me. MalAndroid—the Crux of Android Infections, Aditya K. Sood Programming Weird Machines with ELF Metadata, Rebecca "bx" Shapiro Privacy at the Handset: New FCC Rules?, Valkyrie Hacking Measured Boot and UEFI, Dan Griffin You Can't Buy Security: Building the Open Source InfoSec Program, Boris Sverdlik What Journalists Want: The Investigative Reporters' Perspective on Hacking, Dave Maas & Jason Leopold Accessibility and Security, Anna Shubina Stop Patching, for Stronger PCI Compliance, Adam Brand McAfee Secure & Trustmarks — a Hacker's Best Friend, Jay James & Shane MacDougall MalAndroid—the Crux of Android Infections Aditya K. Sood, IOActive, Michigan State PhD candidate Aditya talked about Android smartphone malware. There's a lot of old Android software out there—over 50% Gingerbread (2.3.x)—and most have unpatched vulnerabilities. Of 9 Android vulnerabilities, 8 have known exploits (such as the old Gingerbread Global Object Table exploit). Android protection includes sandboxing, security scanner, app permissions, and screened Android app market. The Android permission checker has fine-grain resource control, policy enforcement. Android static analysis also includes a static analysis app checker (bouncer), and a vulnerablity checker. What security problems does Android have? User-centric security, which depends on the user to grant permission and make smart decisions. But users don't care or think about malware (the're not aware, not paranoid). All they want is functionality, extensibility, mobility Android had no "proper" encryption before Android 3.0 No built-in protection against social engineering and web tricks Alternative Android app markets are unsafe. Simply visiting some markets can infect Android Aditya classified Android Malware types as: Type A—Apps. These interact with the Android app framework. For example, a fake Netflix app. Or Android Gold Dream (game), which uploads user files stealthy manner to a remote location. Type K—Kernel. Exploits underlying Linux libraries or kernel Type H—Hybrid. These use multiple layers (app framework, libraries, kernel). These are most commonly used by Android botnets, which are popular with Chinese botnet authors What are the threats from Android malware? These incude leak info (contacts), banking fraud, corporate network attacks, malware advertising, malware "Hackivism" (the promotion of social causes. For example, promiting specific leaders of the Tunisian or Iranian revolutions. Android malware is frequently "masquerated". That is, repackaged inside a legit app with malware. To avoid detection, the hidden malware is not unwrapped until runtime. The malware payload can be hidden in, for example, PNG files. Less common are Android bootkits—there's not many around. What they do is hijack the Android init framework—alteering system programs and daemons, then deletes itself. For example, the DKF Bootkit (China). Android App Problems: no code signing! all self-signed native code execution permission sandbox — all or none alternate market places no robust Android malware detection at network level delayed patch process Programming Weird Machines with ELF Metadata Rebecca "bx" Shapiro, Dartmouth College, NH https://github.com/bx/elf-bf-tools @bxsays on twitter Definitions. "ELF" is an executable file format used in linking and loading executables (on UNIX/Linux-class machines). "Weird machine" uses undocumented computation sources (I think of them as unintended virtual machines). Some examples of "weird machines" are those that: return to weird location, does SQL injection, corrupts the heap. Bx then talked about using ELF metadata as (an uintended) "weird machine". Some ELF background: A compiler takes source code and generates a ELF object file (hello.o). A static linker makes an ELF executable from the object file. A runtime linker and loader takes ELF executable and loads and relocates it in memory. The ELF file has symbols to relocate functions and variables. ELF has two relocation tables—one at link time and another one at loading time: .rela.dyn (link time) and .dynsym (dynamic table). GOT: Global Offset Table of addresses for dynamically-linked functions. PLT: Procedure Linkage Tables—works with GOT. The memory layout of a process (not the ELF file) is, in order: program (+ heap), dynamic libraries, libc, ld.so, stack (which includes the dynamic table loaded into memory) For ELF, the "weird machine" is found and exploited in the loader. ELF can be crafted for executing viruses, by tricking runtime into executing interpreted "code" in the ELF symbol table. One can inject parasitic "code" without modifying the actual ELF code portions. Think of the ELF symbol table as an "assembly language" interpreter. It has these elements: instructions: Add, move, jump if not 0 (jnz) Think of symbol table entries as "registers" symbol table value is "contents" immediate values are constants direct values are addresses (e.g., 0xdeadbeef) move instruction: is a relocation table entry add instruction: relocation table "addend" entry jnz instruction: takes multiple relocation table entries The ELF weird machine exploits the loader by relocating relocation table entries. The loader will go on forever until told to stop. It stores state on stack at "end" and uses IFUNC table entries (containing function pointer address). The ELF weird machine, called "Brainfu*k" (BF) has: 8 instructions: pointer inc, dec, inc indirect, dec indirect, jump forward, jump backward, print. Three registers - 3 registers Bx showed example BF source code that implemented a Turing machine printing "hello, world". More interesting was the next demo, where bx modified ping. Ping runs suid as root, but quickly drops privilege. BF modified the loader to disable the library function call dropping privilege, so it remained as root. Then BF modified the ping -t argument to execute the -t filename as root. It's best to show what this modified ping does with an example: $ whoami bx $ ping localhost -t backdoor.sh # executes backdoor $ whoami root $ The modified code increased from 285948 bytes to 290209 bytes. A BF tool compiles "executable" by modifying the symbol table in an existing ELF executable. The tool modifies .dynsym and .rela.dyn table, but not code or data. Privacy at the Handset: New FCC Rules? "Valkyrie" (Christie Dudley, Santa Clara Law JD candidate) Valkyrie talked about mobile handset privacy. Some background: Senator Franken (also a comedian) became alarmed about CarrierIQ, where the carriers track their customers. Franken asked the FCC to find out what obligations carriers think they have to protect privacy. The carriers' response was that they are doing just fine with self-regulation—no worries! Carriers need to collect data, such as missed calls, to maintain network quality. But carriers also sell data for marketing. Verizon sells customer data and enables this with a narrow privacy policy (only 1 month to opt out, with difficulties). The data sold is not individually identifiable and is aggregated. But Verizon recommends, as an aggregation workaround to "recollate" data to other databases to identify customers indirectly. The FCC has regulated telephone privacy since 1934 and mobile network privacy since 2007. Also, the carriers say mobile phone privacy is a FTC responsibility (not FCC). FTC is trying to improve mobile app privacy, but FTC has no authority over carrier / customer relationships. As a side note, Apple iPhones are unique as carriers have extra control over iPhones they don't have with other smartphones. As a result iPhones may be more regulated. Who are the consumer advocates? Everyone knows EFF, but EPIC (Electrnic Privacy Info Center), although more obsecure, is more relevant. What to do? Carriers must be accountable. Opt-in and opt-out at any time. Carriers need incentive to grant users control for those who want it, by holding them liable and responsible for breeches on their clock. Location information should be added current CPNI privacy protection, and require "Pen/trap" judicial order to obtain (and would still be a lower standard than 4th Amendment). Politics are on a pro-privacy swing now, with many senators and the Whitehouse. There will probably be new regulation soon, and enforcement will be a problem, but consumers will still have some benefit. Hacking Measured Boot and UEFI Dan Griffin, JWSecure, Inc., Seattle, @JWSdan Dan talked about hacking measured UEFI boot. First some terms: UEFI is a boot technology that is replacing BIOS (has whitelisting and blacklisting). UEFI protects devices against rootkits. TPM - hardware security device to store hashs and hardware-protected keys "secure boot" can control at firmware level what boot images can boot "measured boot" OS feature that tracks hashes (from BIOS, boot loader, krnel, early drivers). "remote attestation" allows remote validation and control based on policy on a remote attestation server. Microsoft pushing TPM (Windows 8 required), but Google is not. Intel TianoCore is the only open source for UEFI. Dan has Measured Boot Tool at http://mbt.codeplex.com/ with a demo where you can also view TPM data. TPM support already on enterprise-class machines. UEFI Weaknesses. UEFI toolkits are evolving rapidly, but UEFI has weaknesses: assume user is an ally trust TPM implicitly, and attached to computer hibernate file is unprotected (disk encryption protects against this) protection migrating from hardware to firmware delays in patching and whitelist updates will UEFI really be adopted by the mainstream (smartphone hardware support, bank support, apathetic consumer support) You Can't Buy Security: Building the Open Source InfoSec Program Boris Sverdlik, ISDPodcast.com co-host Boris talked about problems typical with current security audits. "IT Security" is an oxymoron—IT exists to enable buiness, uptime, utilization, reporting, but don't care about security—IT has conflict of interest. There's no Magic Bullet ("blinky box"), no one-size-fits-all solution (e.g., Intrusion Detection Systems (IDSs)). Regulations don't make you secure. The cloud is not secure (because of shared data and admin access). Defense and pen testing is not sexy. Auditors are not solution (security not a checklist)—what's needed is experience and adaptability—need soft skills. Step 1: First thing is to Google and learn the company end-to-end before you start. Get to know the management team (not IT team), meet as many people as you can. Don't use arbitrary values such as CISSP scores. Quantitive risk assessment is a myth (e.g. AV*EF-SLE). Learn different Business Units, legal/regulatory obligations, learn the business and where the money is made, verify company is protected from script kiddies (easy), learn sensitive information (IP, internal use only), and start with low-hanging fruit (customer service reps and social engineering). Step 2: Policies. Keep policies short and relevant. Generic SANS "security" boilerplate policies don't make sense and are not followed. Focus on acceptable use, data usage, communications, physical security. Step 3: Implementation: keep it simple stupid. Open source, although useful, is not free (implementation cost). Access controls with authentication & authorization for local and remote access. MS Windows has it, otherwise use OpenLDAP, OpenIAM, etc. Application security Everyone tries to reinvent the wheel—use existing static analysis tools. Review high-risk apps and major revisions. Don't run different risk level apps on same system. Assume host/client compromised and use app-level security control. Network security VLAN != segregated because there's too many workarounds. Use explicit firwall rules, active and passive network monitoring (snort is free), disallow end user access to production environment, have a proxy instead of direct Internet access. Also, SSL certificates are not good two-factor auth and SSL does not mean "safe." Operational Controls Have change, patch, asset, & vulnerability management (OSSI is free). For change management, always review code before pushing to production For logging, have centralized security logging for business-critical systems, separate security logging from administrative/IT logging, and lock down log (as it has everything). Monitor with OSSIM (open source). Use intrusion detection, but not just to fulfill a checkbox: build rules from a whitelist perspective (snort). OSSEC has 95% of what you need. Vulnerability management is a QA function when done right: OpenVas and Seccubus are free. Security awareness The reality is users will always click everything. Build real awareness, not compliance driven checkbox, and have it integrated into the culture. Pen test by crowd sourcing—test with logging COSSP http://www.cossp.org/ - Comprehensive Open Source Security Project What Journalists Want: The Investigative Reporters' Perspective on Hacking Dave Maas, San Diego CityBeat Jason Leopold, Truthout.org The difference between hackers and investigative journalists: For hackers, the motivation varies, but method is same, technological specialties. For investigative journalists, it's about one thing—The Story, and they need broad info-gathering skills. J-School in 60 Seconds: Generic formula: Person or issue of pubic interest, new info, or angle. Generic criteria: proximity, prominence, timeliness, human interest, oddity, or consequence. Media awareness of hackers and trends: journalists becoming extremely aware of hackers with congressional debates (privacy, data breaches), demand for data-mining Journalists, use of coding and web development for Journalists, and Journalists busted for hacking (Murdock). Info gathering by investigative journalists include Public records laws. Federal Freedom of Information Act (FOIA) is good, but slow. California Public Records Act is a lot stronger. FOIA takes forever because of foot-dragging—it helps to be specific. Often need to sue (especially FBI). CPRA is faster, and requests can be vague. Dumps and leaks (a la Wikileaks) Journalists want: leads, protecting ourselves, our sources, and adapting tools for news gathering (Google hacking). Anonomity is important to whistleblowers. They want no digital footprint left behind (e.g., email, web log). They don't trust encryption, want to feel safe and secure. Whistleblower laws are very weak—there's no upside for whistleblowers—they have to be very passionate to do it. Accessibility and Security or: How I Learned to Stop Worrying and Love the Halting Problem Anna Shubina, Dartmouth College Anna talked about how accessibility and security are related. Accessibility of digital content (not real world accessibility). mostly refers to blind users and screenreaders, for our purpose. Accessibility is about parsing documents, as are many security issues. "Rich" executable content causes accessibility to fail, and often causes security to fail. For example MS Word has executable format—it's not a document exchange format—more dangerous than PDF or HTML. Accessibility is often the first and maybe only sanity check with parsing. They have no choice because someone may want to read what you write. Google, for example, is very particular about web browser you use and are bad at supporting other browsers. Uses JavaScript instead of links, often requiring mouseover to display content. PDF is a security nightmare. Executible format, embedded flash, JavaScript, etc. 15 million lines of code. Google Chrome doesn't handle PDF correctly, causing several security bugs. PDF has an accessibility checker and PDF tagging, to help with accessibility. But no PDF checker checks for incorrect tags, untagged content, or validates lists or tables. None check executable content at all. The "Halting Problem" is: can one decide whether a program will ever stop? The answer, in general, is no (Rice's theorem). The same holds true for accessibility checkers. Language-theoretic Security says complicated data formats are hard to parse and cannot be solved due to the Halting Problem. W3C Web Accessibility Guidelines: "Perceivable, Operable, Understandable, Robust" Not much help though, except for "Robust", but here's some gems: * all information should be parsable (paraphrasing) * if not parsable, cannot be converted to alternate formats * maximize compatibility in new document formats Executible webpages are bad for security and accessibility. They say it's for a better web experience. But is it necessary to stuff web pages with JavaScript for a better experience? A good example is The Drudge Report—it has hand-written HTML with no JavaScript, yet drives a lot of web traffic due to good content. A bad example is Google News—hidden scrollbars, guessing user input. Solutions: Accessibility and security problems come from same source Expose "better user experience" myth Keep your corner of Internet parsable Remember "Halting Problem"—recognize false solutions (checking and verifying tools) Stop Patching, for Stronger PCI Compliance Adam Brand, protiviti @adamrbrand, http://www.picfun.com/ Adam talked about PCI compliance for retail sales. Take an example: for PCI compliance, 50% of Brian's time (a IT guy), 960 hours/year was spent patching POSs in 850 restaurants. Often applying some patches make no sense (like fixing a browser vulnerability on a server). "Scanner worship" is overuse of vulnerability scanners—it gives a warm and fuzzy and it's simple (red or green results—fix reds). Scanners give a false sense of security. In reality, breeches from missing patches are uncommon—more common problems are: default passwords, cleartext authentication, misconfiguration (firewall ports open). Patching Myths: Myth 1: install within 30 days of patch release (but PCI §6.1 allows a "risk-based approach" instead). Myth 2: vendor decides what's critical (also PCI §6.1). But §6.2 requires user ranking of vulnerabilities instead. Myth 3: scan and rescan until it passes. But PCI §11.2.1b says this applies only to high-risk vulnerabilities. Adam says good recommendations come from NIST 800-40. Instead use sane patching and focus on what's really important. From NIST 800-40: Proactive: Use a proactive vulnerability management process: use change control, configuration management, monitor file integrity. Monitor: start with NVD and other vulnerability alerts, not scanner results. Evaluate: public-facing system? workstation? internal server? (risk rank) Decide:on action and timeline Test: pre-test patches (stability, functionality, rollback) for change control Install: notify, change control, tickets McAfee Secure & Trustmarks — a Hacker's Best Friend Jay James, Shane MacDougall, Tactical Intelligence Inc., Canada "McAfee Secure Trustmark" is a website seal marketed by McAfee. A website gets this badge if they pass their remote scanning. The problem is a removal of trustmarks act as flags that you're vulnerable. Easy to view status change by viewing McAfee list on website or on Google. "Secure TrustGuard" is similar to McAfee. Jay and Shane wrote Perl scripts to gather sites from McAfee and search engines. If their certification image changes to a 1x1 pixel image, then they are longer certified. Their scripts take deltas of scans to see what changed daily. The bottom line is change in TrustGuard status is a flag for hackers to attack your site. Entire idea of seals is silly—you're raising a flag saying if you're vulnerable.

    Read the article

  • Custom rails route problem with 2.3.8 and Mongrel

    - by CHsurfer
    I have a controller called 'exposures' which I created automatically with the script/generate scaffold call. The scaffold pages work fine. I created a custom action called 'test' in the exposures controller. When I try to call the page (http://127.0.0.1:3000/exposures/test/1) I get a blank, white screen with no text at all in the source. I am using Rails 2.3.8 and mongrel in the development environment. There are no entries in development.log and the console that was used to open mongrel has the following error: You might have expected an instance of Array. The error occurred while evaluating nil.split D:/Rails/ruby/lib/ruby/gems/1.8/gems/actionpack-2.3.8/lib/action_controller/cgi_process.rb:52:in dispatch_cgi' D:/Rails/ruby/lib/ruby/gems/1.8/gems/actionpack-2.3.8/lib/action_controller/dispatcher.rb:101:in dispatch_cgi' D:/Rails/ruby/lib/ruby/gems/1.8/gems/actionpack-2.3.8/lib/action_controller/dispatcher.rb:27:in dispatch' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel/rails.rb:76:in process' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel/rails.rb:74:in synchronize' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel/rails.rb:74:in process' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel.rb:159:in process_client' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel.rb:158:in each' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel.rb:158:in process_client' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel.rb:285:in run' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel.rb:285:in initialize' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel.rb:285:in new' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel.rb:285:in run' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel.rb:268:in initialize' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel.rb:268:in new' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel.rb:268:in run' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel/configurator.rb:282:in run' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel/configurator.rb:281:in each' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel/configurator.rb:281:in run' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/mongrel_rails:128:in run' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel/command.rb:212:in run' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/mongrel_rails:281 D:/Rails/ruby/bin/mongrel_rails:19:in load' D:/Rails/ruby/bin/mongrel_rails:19 Here is the exposures_controller code: class ExposuresController < ApplicationController # GET /exposures # GET /exposures.xml def index @exposures = Exposure.all respond_to do |format| format.html # index.html.erb format.xml { render :xml => @exposures } end end #/exposure/graph/1 def graph @exposure = Exposure.find(params[:id]) project_name = @exposure.tender.project.name group_name = @exposure.tender.user.group.name tender_desc = @exposure.tender.description direction = "Cash Out" direction = "Cash In" if @exposure.supply currency_1_and_2 = "#{@exposure.currency_in} = #{@exposure.currency_out}" title = "#{project_name}:#{group_name}:#{tender_desc}/n" title += "#{direction}:#{currency_1_and_2}" factors = Array.new carrieds = Array.new days = Array.new @exposure.rates.each do |r| factors << r.factor carrieds << r.carried days << r.day.to_s end max = (factors+carrieds).max min = (factors+carrieds).min g = Graph.new g.title(title, '{font-size: 12px;}') g.set_data(factors) g.line_hollow(2, 4, '0x80a033', 'Bounces', 10) g.set_x_labels(days) g.set_x_label_style( 10, '#CC3399', 2 ); g.set_y_min(min*0.9) g.set_y_max(max*1.1) g.set_y_label_steps(5) render :text = g.render end def test render :text = "this works" end # GET /exposures/1 # GET /exposures/1.xml def show @exposure = Exposure.find(params[:id]) @graph = open_flash_chart_object(700,250, "/exposures/graph/#{@exposure.id}") #@graph = "/exposures/graph/#{@exposure.id}" respond_to do |format| format.html # show.html.erb format.xml { render :xml => @exposure } end end # GET /exposures/new # GET /exposures/new.xml def new @exposure = Exposure.new respond_to do |format| format.html # new.html.erb format.xml { render :xml => @exposure } end end # GET /exposures/1/edit def edit @exposure = Exposure.find(params[:id]) end # POST /exposures # POST /exposures.xml def create @exposure = Exposure.new(params[:exposure]) respond_to do |format| if @exposure.save flash[:notice] = 'Exposure was successfully created.' format.html { redirect_to(@exposure) } format.xml { render :xml => @exposure, :status => :created, :location => @exposure } else format.html { render :action => "new" } format.xml { render :xml => @exposure.errors, :status => :unprocessable_entity } end end end # PUT /exposures/1 # PUT /exposures/1.xml def update @exposure = Exposure.find(params[:id]) respond_to do |format| if @exposure.update_attributes(params[:exposure]) flash[:notice] = 'Exposure was successfully updated.' format.html { redirect_to(@exposure) } format.xml { head :ok } else format.html { render :action => "edit" } format.xml { render :xml => @exposure.errors, :status => :unprocessable_entity } end end end # DELETE /exposures/1 # DELETE /exposures/1.xml def destroy @exposure = Exposure.find(params[:id]) @exposure.destroy respond_to do |format| format.html { redirect_to(exposures_url) } format.xml { head :ok } end end end Clever readers will notice the 'graph' action. This is what I really want to work, but if I can't even get the test action working, then I'm sure I have no chance. Any ideas? I have restarted mongrel a few times with no change. Here is the output of Rake routes (but I don't believe this is the problem. The error would be in the form of and HTML error response). D:\Rails\rails_apps\fxrake routes (in D:/Rails/rails_apps/fx) DEPRECATION WARNING: Rake tasks in vendor/plugins/open_flash_chart/tasks are deprecated. Use lib/tasks instead. (called from D:/ by/gems/1.8/gems/rails-2.3.8/lib/tasks/rails.rb:10) rates GET /rates(.:format) {:controller="rates", :action="index"} POST /rates(.:format) {:controller="rates", :action="create"} new_rate GET /rates/new(.:format) {:controller="rates", :action="new"} edit_rate GET /rates/:id/edit(.:format) {:controller="rates", :action="edit"} rate GET /rates/:id(.:format) {:controller="rates", :action="show"} PUT /rates/:id(.:format) {:controller="rates", :action="update"} DELETE /rates/:id(.:format) {:controller="rates", :action="destroy"} tenders GET /tenders(.:format) {:controller="tenders", :action="index"} POST /tenders(.:format) {:controller="tenders", :action="create"} new_tender GET /tenders/new(.:format) {:controller="tenders", :action="new"} edit_tender GET /tenders/:id/edit(.:format) {:controller="tenders", :action="edit"} tender GET /tenders/:id(.:format) {:controller="tenders", :action="show"} PUT /tenders/:id(.:format) {:controller="tenders", :action="update"} DELETE /tenders/:id(.:format) {:controller="tenders", :action="destroy"} exposures GET /exposures(.:format) {:controller="exposures", :action="index"} POST /exposures(.:format) {:controller="exposures", :action="create"} new_exposure GET /exposures/new(.:format) {:controller="exposures", :action="new"} edit_exposure GET /exposures/:id/edit(.:format) {:controller="exposures", :action="edit"} exposure GET /exposures/:id(.:format) {:controller="exposures", :action="show"} PUT /exposures/:id(.:format) {:controller="exposures", :action="update"} DELETE /exposures/:id(.:format) {:controller="exposures", :action="destroy"} currencies GET /currencies(.:format) {:controller="currencies", :action="index"} POST /currencies(.:format) {:controller="currencies", :action="create"} new_currency GET /currencies/new(.:format) {:controller="currencies", :action="new"} edit_currency GET /currencies/:id/edit(.:format) {:controller="currencies", :action="edit"} currency GET /currencies/:id(.:format) {:controller="currencies", :action="show"} PUT /currencies/:id(.:format) {:controller="currencies", :action="update"} DELETE /currencies/:id(.:format) {:controller="currencies", :action="destroy"} projects GET /projects(.:format) {:controller="projects", :action="index"} POST /projects(.:format) {:controller="projects", :action="create"} new_project GET /projects/new(.:format) {:controller="projects", :action="new"} edit_project GET /projects/:id/edit(.:format) {:controller="projects", :action="edit"} project GET /projects/:id(.:format) {:controller="projects", :action="show"} PUT /projects/:id(.:format) {:controller="projects", :action="update"} DELETE /projects/:id(.:format) {:controller="projects", :action="destroy"} groups GET /groups(.:format) {:controller="groups", :action="index"} POST /groups(.:format) {:controller="groups", :action="create"} new_group GET /groups/new(.:format) {:controller="groups", :action="new"} edit_group GET /groups/:id/edit(.:format) {:controller="groups", :action="edit"} group GET /groups/:id(.:format) {:controller="groups", :action="show"} PUT /groups/:id(.:format) {:controller="groups", :action="update"} DELETE /groups/:id(.:format) {:controller="groups", :action="destroy"} users GET /users(.:format) {:controller="users", :action="index"} POST /users(.:format) {:controller="users", :action="create"} new_user GET /users/new(.:format) {:controller="users", :action="new"} edit_user GET /users/:id/edit(.:format) {:controller="users", :action="edit"} user GET /users/:id(.:format) {:controller="users", :action="show"} PUT /users/:id(.:format) {:controller="users", :action="update"} DELETE /users/:id(.:format) {:controller="users", :action="destroy"} /:controller/:action/:id /:controller/:action/:id(.:format) D:\Rails\rails_apps\fxrails -v Rails 2.3.8 Thanks in advance for the help -Jon

    Read the article

  • rotating bitmaps. In code.

    - by Marco van de Voort
    Is there a faster way to rotate a large bitmap by 90 or 270 degrees than simply doing a nested loop with inverted coordinates? The bitmaps are 8bpp and typically 2048*2400*8bpp Currently I do this by simply copying with argument inversion, roughly (pseudo code: for x = 0 to 2048-1 for y = 0 to 2048-1 dest[x][y]=src[y][x]; (In reality I do it with pointers, for a bit more speed, but that is roughly the same magnitude) GDI is quite slow with large images, and GPU load/store times for textures (GF7 cards) are in the same magnitude as the current CPU time. Any tips, pointers? An in-place algorithm would even be better, but speed is more important than being in-place. Target is Delphi, but it is more an algorithmic question. SSE(2) vectorization no problem, it is a big enough problem for me to code it in assembler Duplicates How do you rotate a two dimensional array?. Follow up to Nils' answer Image 2048x2700 - 2700x2048 Compiler Turbo Explorer 2006 with optimization on. Windows: Power scheme set to "Always on". (important!!!!) Machine: Core2 6600 (2.4 GHz) time with old routine: 32ms (step 1) time with stepsize 8 : 12ms time with stepsize 16 : 10ms time with stepsize 32+ : 9ms Meanwhile I also tested on a Athlon 64 X2 (5200+ iirc), and the speed up there was slightly more than a factor four (80 to 19 ms). The speed up is well worth it, thanks. Maybe that during the summer months I'll torture myself with a SSE(2) version. However I already thought about how to tackle that, and I think I'll run out of SSE2 registers for an straight implementation: for n:=0 to 7 do begin load r0, <source+n*rowsize> shift byte from r0 into r1 shift byte from r0 into r2 .. shift byte from r0 into r8 end; store r1, <target> store r2, <target+1*<rowsize> .. store r8, <target+7*<rowsize> So 8x8 needs 9 registers, but 32-bits SSE only has 8. Anyway that is something for the summer months :-) Note that the pointer thing is something that I do out of instinct, but it could be there is actually something to it, if your dimensions are not hardcoded, the compiler can't turn the mul into a shift. While muls an sich are cheap nowadays, they also generate more register pressure afaik. The code (validated by subtracting result from the "naieve" rotate1 implementation): const stepsize = 32; procedure rotatealign(Source: tbw8image; Target:tbw8image); var stepsx,stepsy,restx,resty : Integer; RowPitchSource, RowPitchTarget : Integer; pSource, pTarget,ps1,ps2 : pchar; x,y,i,j: integer; rpstep : integer; begin RowPitchSource := source.RowPitch; // bytes to jump to next line. Can be negative (includes alignment) RowPitchTarget := target.RowPitch; rpstep:=RowPitchTarget*stepsize; stepsx:=source.ImageWidth div stepsize; stepsy:=source.ImageHeight div stepsize; // check if mod 16=0 here for both dimensions, if so -> SSE2. for y := 0 to stepsy - 1 do begin psource:=source.GetImagePointer(0,y*stepsize); // gets pointer to pixel x,y ptarget:=Target.GetImagePointer(target.imagewidth-(y+1)*stepsize,0); for x := 0 to stepsx - 1 do begin for i := 0 to stepsize - 1 do begin ps1:=@psource[rowpitchsource*i]; // ( 0,i) ps2:=@ptarget[stepsize-1-i]; // (maxx-i,0); for j := 0 to stepsize - 1 do begin ps2[0]:=ps1[j]; inc(ps2,RowPitchTarget); end; end; inc(psource,stepsize); inc(ptarget,rpstep); end; end; // 3 more areas to do, with dimensions // - stepsy*stepsize * restx // right most column of restx width // - stepsx*stepsize * resty // bottom row with resty height // - restx*resty // bottom-right rectangle. restx:=source.ImageWidth mod stepsize; // typically zero because width is // typically 1024 or 2048 resty:=source.Imageheight mod stepsize; if restx>0 then begin // one loop less, since we know this fits in one line of "blocks" psource:=source.GetImagePointer(source.ImageWidth-restx,0); // gets pointer to pixel x,y ptarget:=Target.GetImagePointer(Target.imagewidth-stepsize,Target.imageheight-restx); for y := 0 to stepsy - 1 do begin for i := 0 to stepsize - 1 do begin ps1:=@psource[rowpitchsource*i]; // ( 0,i) ps2:=@ptarget[stepsize-1-i]; // (maxx-i,0); for j := 0 to restx - 1 do begin ps2[0]:=ps1[j]; inc(ps2,RowPitchTarget); end; end; inc(psource,stepsize*RowPitchSource); dec(ptarget,stepsize); end; end; if resty>0 then begin // one loop less, since we know this fits in one line of "blocks" psource:=source.GetImagePointer(0,source.ImageHeight-resty); // gets pointer to pixel x,y ptarget:=Target.GetImagePointer(0,0); for x := 0 to stepsx - 1 do begin for i := 0 to resty- 1 do begin ps1:=@psource[rowpitchsource*i]; // ( 0,i) ps2:=@ptarget[resty-1-i]; // (maxx-i,0); for j := 0 to stepsize - 1 do begin ps2[0]:=ps1[j]; inc(ps2,RowPitchTarget); end; end; inc(psource,stepsize); inc(ptarget,rpstep); end; end; if (resty>0) and (restx>0) then begin // another loop less, since only one block psource:=source.GetImagePointer(source.ImageWidth-restx,source.ImageHeight-resty); // gets pointer to pixel x,y ptarget:=Target.GetImagePointer(0,target.ImageHeight-restx); for i := 0 to resty- 1 do begin ps1:=@psource[rowpitchsource*i]; // ( 0,i) ps2:=@ptarget[resty-1-i]; // (maxx-i,0); for j := 0 to restx - 1 do begin ps2[0]:=ps1[j]; inc(ps2,RowPitchTarget); end; end; end; end;

    Read the article

  • questions regarding the use of A* with the 15-square puzzle

    - by Cheeso
    I'm trying to build an A* solver for a 15-square puzzle. The goal is to re-arrange the tiles so that they appear in their natural positions. You can only slide one tile at a time. Each possible state of the puzzle is a node in the search graph. For the h(x) function, I am using an aggregate sum, across all tiles, of the tile's dislocation from the goal state. In the above image, the 5 is at location 0,0, and it belongs at location 1,0, therefore it contributes 1 to the h(x) function. The next tile is the 11, located at 0,1, and belongs at 2,2, therefore it contributes 3 to h(x). And so on. EDIT: I now understand this is what they call "Manhattan distance", or "taxicab distance". I have been using a step count for g(x). In my implementation, for any node in the state graph, g is just +1 from the prior node's g. To find successive nodes, I just examine where I can possibly move the "hole" in the puzzle. There are 3 neighbors for the puzzle state (aka node) that is displayed: the hole can move north, west, or east. My A* search sometimes converges to a solution in 20s, sometimes 180s, and sometimes doesn't converge at all (waited 10 mins or more). I think h is reasonable. I'm wondering if I've modeled g properly. In other words, is it possible that my A* function is reaching a node in the graph via a path that is not the shortest path? Maybe have I not waited long enough? Maybe 10 minutes is not long enough? For a fully random arrangement, (assuming no parity problems), What is the average number of permutations an A* solution will examine? (please show the math) I'm going to look for logic errors in my code, but in the meantime, Any tips? (ps: it's done in Javascript). Also, no, this isn't CompSci homework. It's just a personal exploration thing. I'm just trying to learn Javascript. EDIT: I've found that the run-time is highly depend upon the heuristic. I saw the 10x factor applied to the heuristic from the article someone mentioned, and it made me wonder - why 10x? Why linear? Because this is done in javascript, I could modify the code to dynamically update an html table with the node currently being considered. This allowd me to peek at the algorithm as it was progressing. With a regular taxicab distance heuristic, I watched as it failed to converge. There were 5's and 12's in the top row, and they kept hanging around. I'd see 1,2,3,4 creep into the top row, but then they'd drop out, and other numbers would move up there. What I was hoping to see was 1,2,3,4 sort of creeping up to the top, and then staying there. I thought to myself - this is not the way I solve this personally. Doing this manually, I solve the top row, then the 2ne row, then the 3rd and 4th rows sort of concurrently. So I tweaked the h(x) function to more heavily weight the higher rows and the "lefter" columns. The result was that the A* converged much more quickly. It now runs in 3 minutes instead of "indefinitely". With the "peek" I talked about, I can see the smaller numbers creep up to the higher rows and stay there. Not only does this seem like the right thing, it runs much faster. I'm in the process of trying a bunch of variations. It seems pretty clear that A* runtime is very sensitive to the heuristic. Currently the best heuristic I've found uses the summation of dislocation * ((4-i) + (4-j)) where i and j are the row and column, and dislocation is the taxicab distance. One interesting part of the result I got: with a particular heuristic I find a path very quickly, but it is obviously not the shortest path. I think this is because I am weighting the heuristic. In one case I got a path of 178 steps in 10s. My own manual effort produce a solution in 87 moves. (much more than 10s). More investigation warranted. So the result is I am seeing it converge must faster, and the path is definitely not the shortest. I have to think about this more. Code: var stop = false; function Astar(start, goal, callback) { // start and goal are nodes in the graph, represented by // an array of 16 ints. The goal is: [1,2,3,...14,15,0] // Zero represents the hole. // callback is a method to call when finished. This runs a long time, // therefore we need to use setTimeout() to break it up, to avoid // the browser warning like "Stop running this script?" // g is the actual distance traveled from initial node to current node. // h is the heuristic estimate of distance from current to goal. stop = false; start.g = start.dontgo = 0; // calcHeuristic inserts an .h member into the array calcHeuristicDistance(start); // start the stack with one element var closed = []; // set of nodes already evaluated. var open = [ start ]; // set of nodes to evaluate (start with initial node) var iteration = function() { if (open.length==0) { // no more nodes. Fail. callback(null); return; } var current = open.shift(); // get highest priority node // update the browser with a table representation of the // node being evaluated $("#solution").html(stateToString(current)); // check solution returns true if current == goal if (checkSolution(current,goal)) { // reconstructPath just records the position of the hole // through each node var path= reconstructPath(start,current); callback(path); return; } closed.push(current); // get the set of neighbors. This is 3 or fewer nodes. // (nextStates is optimized to NOT turn directly back on itself) var neighbors = nextStates(current, goal); for (var i=0; i<neighbors.length; i++) { var n = neighbors[i]; // skip this one if we've already visited it if (closed.containsNode(n)) continue; // .g, .h, and .previous get assigned implicitly when // calculating neighbors. n.g is nothing more than // current.g+1 ; // add to the open list if (!open.containsNode(n)) { // slot into the list, in priority order (minimum f first) open.priorityPush(n); n.previous = current; } } if (stop) { callback(null); return; } setTimeout(iteration, 1); }; // kick off the first iteration iteration(); return null; }

    Read the article

  • A way of doing real-world test-driven development (and some thoughts about it)

    - by Thomas Weller
    Lately, I exchanged some arguments with Derick Bailey about some details of the red-green-refactor cycle of the Test-driven development process. In short, the issue revolved around the fact that it’s not enough to have a test red or green, but it’s also important to have it red or green for the right reasons. While for me, it’s sufficient to initially have a NotImplementedException in place, Derick argues that this is not totally correct (see these two posts: Red/Green/Refactor, For The Right Reasons and Red For The Right Reason: Fail By Assertion, Not By Anything Else). And he’s right. But on the other hand, I had no idea how his insights could have any practical consequence for my own individual interpretation of the red-green-refactor cycle (which is not really red-green-refactor, at least not in its pure sense, see the rest of this article). This made me think deeply for some days now. In the end I found out that the ‘right reason’ changes in my understanding depending on what development phase I’m in. To make this clear (at least I hope it becomes clear…) I started to describe my way of working in some detail, and then something strange happened: The scope of the article slightly shifted from focusing ‘only’ on the ‘right reason’ issue to something more general, which you might describe as something like  'Doing real-world TDD in .NET , with massive use of third-party add-ins’. This is because I feel that there is a more general statement about Test-driven development to make:  It’s high time to speak about the ‘How’ of TDD, not always only the ‘Why’. Much has been said about this, and me myself also contributed to that (see here: TDD is not about testing, it's about how we develop software). But always justifying what you do is very unsatisfying in the long run, it is inherently defensive, and it costs time and effort that could be used for better and more important things. And frankly: I’m somewhat sick and tired of repeating time and again that the test-driven way of software development is highly preferable for many reasons - I don’t want to spent my time exclusively on stating the obvious… So, again, let’s say it clearly: TDD is programming, and programming is TDD. Other ways of programming (code-first, sometimes called cowboy-coding) are exceptional and need justification. – I know that there are many people out there who will disagree with this radical statement, and I also know that it’s not a description of the real world but more of a mission statement or something. But nevertheless I’m absolutely sure that in some years this statement will be nothing but a platitude. Side note: Some parts of this post read as if I were paid by Jetbrains (the manufacturer of the ReSharper add-in – R#), but I swear I’m not. Rather I think that Visual Studio is just not production-complete without it, and I wouldn’t even consider to do professional work without having this add-in installed... The three parts of a software component Before I go into some details, I first should describe my understanding of what belongs to a software component (assembly, type, or method) during the production process (i.e. the coding phase). Roughly, I come up with the three parts shown below:   First, we need to have some initial sort of requirement. This can be a multi-page formal document, a vague idea in some programmer’s brain of what might be needed, or anything in between. In either way, there has to be some sort of requirement, be it explicit or not. – At the C# micro-level, the best way that I found to formulate that is to define interfaces for just about everything, even for internal classes, and to provide them with exhaustive xml comments. The next step then is to re-formulate these requirements in an executable form. This is specific to the respective programming language. - For C#/.NET, the Gallio framework (which includes MbUnit) in conjunction with the ReSharper add-in for Visual Studio is my toolset of choice. The third part then finally is the production code itself. It’s development is entirely driven by the requirements and their executable formulation. This is the delivery, the two other parts are ‘only’ there to make its production possible, to give it a decent quality and reliability, and to significantly reduce related costs down the maintenance timeline. So while the first two parts are not really relevant for the customer, they are very important for the developer. The customer (or in Scrum terms: the Product Owner) is not interested at all in how  the product is developed, he is only interested in the fact that it is developed as cost-effective as possible, and that it meets his functional and non-functional requirements. The rest is solely a matter of the developer’s craftsmanship, and this is what I want to talk about during the remainder of this article… An example To demonstrate my way of doing real-world TDD, I decided to show the development of a (very) simple Calculator component. The example is deliberately trivial and silly, as examples always are. I am totally aware of the fact that real life is never that simple, but I only want to show some development principles here… The requirement As already said above, I start with writing down some words on the initial requirement, and I normally use interfaces for that, even for internal classes - the typical question “intf or not” doesn’t even come to mind. I need them for my usual workflow and using them automatically produces high componentized and testable code anyway. To think about their usage in every single situation would slow down the production process unnecessarily. So this is what I begin with: namespace Calculator {     /// <summary>     /// Defines a very simple calculator component for demo purposes.     /// </summary>     public interface ICalculator     {         /// <summary>         /// Gets the result of the last successful operation.         /// </summary>         /// <value>The last result.</value>         /// <remarks>         /// Will be <see langword="null" /> before the first successful operation.         /// </remarks>         double? LastResult { get; }       } // interface ICalculator   } // namespace Calculator So, I’m not beginning with a test, but with a sort of code declaration - and still I insist on being 100% test-driven. There are three important things here: Starting this way gives me a method signature, which allows to use IntelliSense and AutoCompletion and thus eliminates the danger of typos - one of the most regular, annoying, time-consuming, and therefore expensive sources of error in the development process. In my understanding, the interface definition as a whole is more of a readable requirement document and technical documentation than anything else. So this is at least as much about documentation than about coding. The documentation must completely describe the behavior of the documented element. I normally use an IoC container or some sort of self-written provider-like model in my architecture. In either case, I need my components defined via service interfaces anyway. - I will use the LinFu IoC framework here, for no other reason as that is is very simple to use. The ‘Red’ (pt. 1)   First I create a folder for the project’s third-party libraries and put the LinFu.Core dll there. Then I set up a test project (via a Gallio project template), and add references to the Calculator project and the LinFu dll. Finally I’m ready to write the first test, which will look like the following: namespace Calculator.Test {     [TestFixture]     public class CalculatorTest     {         private readonly ServiceContainer container = new ServiceContainer();           [Test]         public void CalculatorLastResultIsInitiallyNull()         {             ICalculator calculator = container.GetService<ICalculator>();               Assert.IsNull(calculator.LastResult);         }       } // class CalculatorTest   } // namespace Calculator.Test       This is basically the executable formulation of what the interface definition states (part of). Side note: There’s one principle of TDD that is just plain wrong in my eyes: I’m talking about the Red is 'does not compile' thing. How could a compiler error ever be interpreted as a valid test outcome? I never understood that, it just makes no sense to me. (Or, in Derick’s terms: this reason is as wrong as a reason ever could be…) A compiler error tells me: Your code is incorrect, but nothing more.  Instead, the ‘Red’ part of the red-green-refactor cycle has a clearly defined meaning to me: It means that the test works as intended and fails only if its assumptions are not met for some reason. Back to our Calculator. When I execute the above test with R#, the Gallio plugin will give me this output: So this tells me that the test is red for the wrong reason: There’s no implementation that the IoC-container could load, of course. So let’s fix that. With R#, this is very easy: First, create an ICalculator - derived type:        Next, implement the interface members: And finally, move the new class to its own file: So far my ‘work’ was six mouse clicks long, the only thing that’s left to do manually here, is to add the Ioc-specific wiring-declaration and also to make the respective class non-public, which I regularly do to force my components to communicate exclusively via interfaces: This is what my Calculator class looks like as of now: using System; using LinFu.IoC.Configuration;   namespace Calculator {     [Implements(typeof(ICalculator))]     internal class Calculator : ICalculator     {         public double? LastResult         {             get             {                 throw new NotImplementedException();             }         }     } } Back to the test fixture, we have to put our IoC container to work: [TestFixture] public class CalculatorTest {     #region Fields       private readonly ServiceContainer container = new ServiceContainer();       #endregion // Fields       #region Setup/TearDown       [FixtureSetUp]     public void FixtureSetUp()     {        container.LoadFrom(AppDomain.CurrentDomain.BaseDirectory, "Calculator.dll");     }       ... Because I have a R# live template defined for the setup/teardown method skeleton as well, the only manual coding here again is the IoC-specific stuff: two lines, not more… The ‘Red’ (pt. 2) Now, the execution of the above test gives the following result: This time, the test outcome tells me that the method under test is called. And this is the point, where Derick and I seem to have somewhat different views on the subject: Of course, the test still is worthless regarding the red/green outcome (or: it’s still red for the wrong reasons, in that it gives a false negative). But as far as I am concerned, I’m not really interested in the test outcome at this point of the red-green-refactor cycle. Rather, I only want to assert that my test actually calls the right method. If that’s the case, I will happily go on to the ‘Green’ part… The ‘Green’ Making the test green is quite trivial. Just make LastResult an automatic property:     [Implements(typeof(ICalculator))]     internal class Calculator : ICalculator     {         public double? LastResult { get; private set; }     }         One more round… Now on to something slightly more demanding (cough…). Let’s state that our Calculator exposes an Add() method:         ...   /// <summary>         /// Adds the specified operands.         /// </summary>         /// <param name="operand1">The operand1.</param>         /// <param name="operand2">The operand2.</param>         /// <returns>The result of the additon.</returns>         /// <exception cref="ArgumentException">         /// Argument <paramref name="operand1"/> is &lt; 0.<br/>         /// -- or --<br/>         /// Argument <paramref name="operand2"/> is &lt; 0.         /// </exception>         double Add(double operand1, double operand2);       } // interface ICalculator A remark: I sometimes hear the complaint that xml comment stuff like the above is hard to read. That’s certainly true, but irrelevant to me, because I read xml code comments with the CR_Documentor tool window. And using that, it looks like this:   Apart from that, I’m heavily using xml code comments (see e.g. here for a detailed guide) because there is the possibility of automating help generation with nightly CI builds (using MS Sandcastle and the Sandcastle Help File Builder), and then publishing the results to some intranet location.  This way, a team always has first class, up-to-date technical documentation at hand about the current codebase. (And, also very important for speeding up things and avoiding typos: You have IntelliSense/AutoCompletion and R# support, and the comments are subject to compiler checking…).     Back to our Calculator again: Two more R# – clicks implement the Add() skeleton:         ...           public double Add(double operand1, double operand2)         {             throw new NotImplementedException();         }       } // class Calculator As we have stated in the interface definition (which actually serves as our requirement document!), the operands are not allowed to be negative. So let’s start implementing that. Here’s the test: [Test] [Row(-0.5, 2)] public void AddThrowsOnNegativeOperands(double operand1, double operand2) {     ICalculator calculator = container.GetService<ICalculator>();       Assert.Throws<ArgumentException>(() => calculator.Add(operand1, operand2)); } As you can see, I’m using a data-driven unit test method here, mainly for these two reasons: Because I know that I will have to do the same test for the second operand in a few seconds, I save myself from implementing another test method for this purpose. Rather, I only will have to add another Row attribute to the existing one. From the test report below, you can see that the argument values are explicitly printed out. This can be a valuable documentation feature even when everything is green: One can quickly review what values were tested exactly - the complete Gallio HTML-report (as it will be produced by the Continuous Integration runs) shows these values in a quite clear format (see below for an example). Back to our Calculator development again, this is what the test result tells us at the moment: So we’re red again, because there is not yet an implementation… Next we go on and implement the necessary parameter verification to become green again, and then we do the same thing for the second operand. To make a long story short, here’s the test and the method implementation at the end of the second cycle: // in CalculatorTest:   [Test] [Row(-0.5, 2)] [Row(295, -123)] public void AddThrowsOnNegativeOperands(double operand1, double operand2) {     ICalculator calculator = container.GetService<ICalculator>();       Assert.Throws<ArgumentException>(() => calculator.Add(operand1, operand2)); }   // in Calculator: public double Add(double operand1, double operand2) {     if (operand1 < 0.0)     {         throw new ArgumentException("Value must not be negative.", "operand1");     }     if (operand2 < 0.0)     {         throw new ArgumentException("Value must not be negative.", "operand2");     }     throw new NotImplementedException(); } So far, we have sheltered our method from unwanted input, and now we can safely operate on the parameters without further caring about their validity (this is my interpretation of the Fail Fast principle, which is regarded here in more detail). Now we can think about the method’s successful outcomes. First let’s write another test for that: [Test] [Row(1, 1, 2)] public void TestAdd(double operand1, double operand2, double expectedResult) {     ICalculator calculator = container.GetService<ICalculator>();       double result = calculator.Add(operand1, operand2);       Assert.AreEqual(expectedResult, result); } Again, I’m regularly using row based test methods for these kinds of unit tests. The above shown pattern proved to be extremely helpful for my development work, I call it the Defined-Input/Expected-Output test idiom: You define your input arguments together with the expected method result. There are two major benefits from that way of testing: In the course of refining a method, it’s very likely to come up with additional test cases. In our case, we might add tests for some edge cases like ‘one of the operands is zero’ or ‘the sum of the two operands causes an overflow’, or maybe there’s an external test protocol that has to be fulfilled (e.g. an ISO norm for medical software), and this results in the need of testing against additional values. In all these scenarios we only have to add another Row attribute to the test. Remember that the argument values are written to the test report, so as a side-effect this produces valuable documentation. (This can become especially important if the fulfillment of some sort of external requirements has to be proven). So your test method might look something like that in the end: [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 2)] [Row(0, 999999999, 999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, double.MaxValue)] [Row(4, double.MaxValue - 2.5, double.MaxValue)] public void TestAdd(double operand1, double operand2, double expectedResult) {     ICalculator calculator = container.GetService<ICalculator>();       double result = calculator.Add(operand1, operand2);       Assert.AreEqual(expectedResult, result); } And this will produce the following HTML report (with Gallio):   Not bad for the amount of work we invested in it, huh? - There might be scenarios where reports like that can be useful for demonstration purposes during a Scrum sprint review… The last requirement to fulfill is that the LastResult property is expected to store the result of the last operation. I don’t show this here, it’s trivial enough and brings nothing new… And finally: Refactor (for the right reasons) To demonstrate my way of going through the refactoring portion of the red-green-refactor cycle, I added another method to our Calculator component, namely Subtract(). Here’s the code (tests and production): // CalculatorTest.cs:   [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 0)] [Row(0, 999999999, -999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, -double.MaxValue)] [Row(4, double.MaxValue - 2.5, -double.MaxValue)] public void TestSubtract(double operand1, double operand2, double expectedResult) {     ICalculator calculator = container.GetService<ICalculator>();       double result = calculator.Subtract(operand1, operand2);       Assert.AreEqual(expectedResult, result); }   [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 0)] [Row(0, 999999999, -999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, -double.MaxValue)] [Row(4, double.MaxValue - 2.5, -double.MaxValue)] public void TestSubtractGivesExpectedLastResult(double operand1, double operand2, double expectedResult) {     ICalculator calculator = container.GetService<ICalculator>();       calculator.Subtract(operand1, operand2);       Assert.AreEqual(expectedResult, calculator.LastResult); }   ...   // ICalculator.cs: /// <summary> /// Subtracts the specified operands. /// </summary> /// <param name="operand1">The operand1.</param> /// <param name="operand2">The operand2.</param> /// <returns>The result of the subtraction.</returns> /// <exception cref="ArgumentException"> /// Argument <paramref name="operand1"/> is &lt; 0.<br/> /// -- or --<br/> /// Argument <paramref name="operand2"/> is &lt; 0. /// </exception> double Subtract(double operand1, double operand2);   ...   // Calculator.cs:   public double Subtract(double operand1, double operand2) {     if (operand1 < 0.0)     {         throw new ArgumentException("Value must not be negative.", "operand1");     }       if (operand2 < 0.0)     {         throw new ArgumentException("Value must not be negative.", "operand2");     }       return (this.LastResult = operand1 - operand2).Value; }   Obviously, the argument validation stuff that was produced during the red-green part of our cycle duplicates the code from the previous Add() method. So, to avoid code duplication and minimize the number of code lines of the production code, we do an Extract Method refactoring. One more time, this is only a matter of a few mouse clicks (and giving the new method a name) with R#: Having done that, our production code finally looks like that: using System; using LinFu.IoC.Configuration;   namespace Calculator {     [Implements(typeof(ICalculator))]     internal class Calculator : ICalculator     {         #region ICalculator           public double? LastResult { get; private set; }           public double Add(double operand1, double operand2)         {             ThrowIfOneOperandIsInvalid(operand1, operand2);               return (this.LastResult = operand1 + operand2).Value;         }           public double Subtract(double operand1, double operand2)         {             ThrowIfOneOperandIsInvalid(operand1, operand2);               return (this.LastResult = operand1 - operand2).Value;         }           #endregion // ICalculator           #region Implementation (Helper)           private static void ThrowIfOneOperandIsInvalid(double operand1, double operand2)         {             if (operand1 < 0.0)             {                 throw new ArgumentException("Value must not be negative.", "operand1");             }               if (operand2 < 0.0)             {                 throw new ArgumentException("Value must not be negative.", "operand2");             }         }           #endregion // Implementation (Helper)       } // class Calculator   } // namespace Calculator But is the above worth the effort at all? It’s obviously trivial and not very impressive. All our tests were green (for the right reasons), and refactoring the code did not change anything. It’s not immediately clear how this refactoring work adds value to the project. Derick puts it like this: STOP! Hold on a second… before you go any further and before you even think about refactoring what you just wrote to make your test pass, you need to understand something: if your done with your requirements after making the test green, you are not required to refactor the code. I know… I’m speaking heresy, here. Toss me to the wolves, I’ve gone over to the dark side! Seriously, though… if your test is passing for the right reasons, and you do not need to write any test or any more code for you class at this point, what value does refactoring add? Derick immediately answers his own question: So why should you follow the refactor portion of red/green/refactor? When you have added code that makes the system less readable, less understandable, less expressive of the domain or concern’s intentions, less architecturally sound, less DRY, etc, then you should refactor it. I couldn’t state it more precise. From my personal perspective, I’d add the following: You have to keep in mind that real-world software systems are usually quite large and there are dozens or even hundreds of occasions where micro-refactorings like the above can be applied. It’s the sum of them all that counts. And to have a good overall quality of the system (e.g. in terms of the Code Duplication Percentage metric) you have to be pedantic on the individual, seemingly trivial cases. My job regularly requires the reading and understanding of ‘foreign’ code. So code quality/readability really makes a HUGE difference for me – sometimes it can be even the difference between project success and failure… Conclusions The above described development process emerged over the years, and there were mainly two things that guided its evolution (you might call it eternal principles, personal beliefs, or anything in between): Test-driven development is the normal, natural way of writing software, code-first is exceptional. So ‘doing TDD or not’ is not a question. And good, stable code can only reliably be produced by doing TDD (yes, I know: many will strongly disagree here again, but I’ve never seen high-quality code – and high-quality code is code that stood the test of time and causes low maintenance costs – that was produced code-first…) It’s the production code that pays our bills in the end. (Though I have seen customers these days who demand an acceptance test battery as part of the final delivery. Things seem to go into the right direction…). The test code serves ‘only’ to make the production code work. But it’s the number of delivered features which solely counts at the end of the day - no matter how much test code you wrote or how good it is. With these two things in mind, I tried to optimize my coding process for coding speed – or, in business terms: productivity - without sacrificing the principles of TDD (more than I’d do either way…).  As a result, I consider a ratio of about 3-5/1 for test code vs. production code as normal and desirable. In other words: roughly 60-80% of my code is test code (This might sound heavy, but that is mainly due to the fact that software development standards only begin to evolve. The entire software development profession is very young, historically seen; only at the very beginning, and there are no viable standards yet. If you think about software development as a kind of casting process, where the test code is the mold and the resulting production code is the final product, then the above ratio sounds no longer extraordinary…) Although the above might look like very much unnecessary work at first sight, it’s not. With the aid of the mentioned add-ins, doing all the above is a matter of minutes, sometimes seconds (while writing this post took hours and days…). The most important thing is to have the right tools at hand. Slow developer machines or the lack of a tool or something like that - for ‘saving’ a few 100 bucks -  is just not acceptable and a very bad decision in business terms (though I quite some times have seen and heard that…). Production of high-quality products needs the usage of high-quality tools. This is a platitude that every craftsman knows… The here described round-trip will take me about five to ten minutes in my real-world development practice. I guess it’s about 30% more time compared to developing the ‘traditional’ (code-first) way. But the so manufactured ‘product’ is of much higher quality and massively reduces maintenance costs, which is by far the single biggest cost factor, as I showed in this previous post: It's the maintenance, stupid! (or: Something is rotten in developerland.). In the end, this is a highly cost-effective way of software development… But on the other hand, there clearly is a trade-off here: coding speed vs. code quality/later maintenance costs. The here described development method might be a perfect fit for the overwhelming majority of software projects, but there certainly are some scenarios where it’s not - e.g. if time-to-market is crucial for a software project. So this is a business decision in the end. It’s just that you have to know what you’re doing and what consequences this might have… Some last words First, I’d like to thank Derick Bailey again. His two aforementioned posts (which I strongly recommend for reading) inspired me to think deeply about my own personal way of doing TDD and to clarify my thoughts about it. I wouldn’t have done that without this inspiration. I really enjoy that kind of discussions… I agree with him in all respects. But I don’t know (yet?) how to bring his insights into the described production process without slowing things down. The above described method proved to be very “good enough” in my practical experience. But of course, I’m open to suggestions here… My rationale for now is: If the test is initially red during the red-green-refactor cycle, the ‘right reason’ is: it actually calls the right method, but this method is not yet operational. Later on, when the cycle is finished and the tests become part of the regular, automated Continuous Integration process, ‘red’ certainly must occur for the ‘right reason’: in this phase, ‘red’ MUST mean nothing but an unfulfilled assertion - Fail By Assertion, Not By Anything Else!

    Read the article

  • 256 Windows Azure Worker Roles, Windows Kinect and a 90's Text-Based Ray-Tracer

    - by Alan Smith
    For a couple of years I have been demoing a simple render farm hosted in Windows Azure using worker roles and the Azure Storage service. At the start of the presentation I deploy an Azure application that uses 16 worker roles to render a 1,500 frame 3D ray-traced animation. At the end of the presentation, when the animation was complete, I would play the animation delete the Azure deployment. The standing joke with the audience was that it was that it was a “$2 demo”, as the compute charges for running the 16 instances for an hour was $1.92, factor in the bandwidth charges and it’s a couple of dollars. The point of the demo is that it highlights one of the great benefits of cloud computing, you pay for what you use, and if you need massive compute power for a short period of time using Windows Azure can work out very cost effective. The “$2 demo” was great for presenting at user groups and conferences in that it could be deployed to Azure, used to render an animation, and then removed in a one hour session. I have always had the idea of doing something a bit more impressive with the demo, and scaling it from a “$2 demo” to a “$30 demo”. The challenge was to create a visually appealing animation in high definition format and keep the demo time down to one hour.  This article will take a run through how I achieved this. Ray Tracing Ray tracing, a technique for generating high quality photorealistic images, gained popularity in the 90’s with companies like Pixar creating feature length computer animations, and also the emergence of shareware text-based ray tracers that could run on a home PC. In order to render a ray traced image, the ray of light that would pass from the view point must be tracked until it intersects with an object. At the intersection, the color, reflectiveness, transparency, and refractive index of the object are used to calculate if the ray will be reflected or refracted. Each pixel may require thousands of calculations to determine what color it will be in the rendered image. Pin-Board Toys Having very little artistic talent and a basic understanding of maths I decided to focus on an animation that could be modeled fairly easily and would look visually impressive. I’ve always liked the pin-board desktop toys that become popular in the 80’s and when I was working as a 3D animator back in the 90’s I always had the idea of creating a 3D ray-traced animation of a pin-board, but never found the energy to do it. Even if I had a go at it, the render time to produce an animation that would look respectable on a 486 would have been measured in months. PolyRay Back in 1995 I landed my first real job, after spending three years being a beach-ski-climbing-paragliding-bum, and was employed to create 3D ray-traced animations for a CD-ROM that school kids would use to learn physics. I had got into the strange and wonderful world of text-based ray tracing, and was using a shareware ray-tracer called PolyRay. PolyRay takes a text file describing a scene as input and, after a few hours processing on a 486, produced a high quality ray-traced image. The following is an example of a basic PolyRay scene file. background Midnight_Blue   static define matte surface { ambient 0.1 diffuse 0.7 } define matte_white texture { matte { color white } } define matte_black texture { matte { color dark_slate_gray } } define position_cylindrical 3 define lookup_sawtooth 1 define light_wood <0.6, 0.24, 0.1> define median_wood <0.3, 0.12, 0.03> define dark_wood <0.05, 0.01, 0.005>     define wooden texture { noise surface { ambient 0.2  diffuse 0.7  specular white, 0.5 microfacet Reitz 10 position_fn position_cylindrical position_scale 1  lookup_fn lookup_sawtooth octaves 1 turbulence 1 color_map( [0.0, 0.2, light_wood, light_wood] [0.2, 0.3, light_wood, median_wood] [0.3, 0.4, median_wood, light_wood] [0.4, 0.7, light_wood, light_wood] [0.7, 0.8, light_wood, median_wood] [0.8, 0.9, median_wood, light_wood] [0.9, 1.0, light_wood, dark_wood]) } } define glass texture { surface { ambient 0 diffuse 0 specular 0.2 reflection white, 0.1 transmission white, 1, 1.5 }} define shiny surface { ambient 0.1 diffuse 0.6 specular white, 0.6 microfacet Phong 7  } define steely_blue texture { shiny { color black } } define chrome texture { surface { color white ambient 0.0 diffuse 0.2 specular 0.4 microfacet Phong 10 reflection 0.8 } }   viewpoint {     from <4.000, -1.000, 1.000> at <0.000, 0.000, 0.000> up <0, 1, 0> angle 60     resolution 640, 480 aspect 1.6 image_format 0 }       light <-10, 30, 20> light <-10, 30, -20>   object { disc <0, -2, 0>, <0, 1, 0>, 30 wooden }   object { sphere <0.000, 0.000, 0.000>, 1.00 chrome } object { cylinder <0.000, 0.000, 0.000>, <0.000, 0.000, -4.000>, 0.50 chrome }   After setting up the background and defining colors and textures, the viewpoint is specified. The “camera” is located at a point in 3D space, and it looks towards another point. The angle, image resolution, and aspect ratio are specified. Two lights are present in the image at defined coordinates. The three objects in the image are a wooden disc to represent a table top, and a sphere and cylinder that intersect to form a pin that will be used for the pin board toy in the final animation. When the image is rendered, the following image is produced. The pins are modeled with a chrome surface, so they reflect the environment around them. Note that the scale of the pin shaft is not correct, this will be fixed later. Modeling the Pin Board The frame of the pin-board is made up of three boxes, and six cylinders, the front box is modeled using a clear, slightly reflective solid, with the same refractive index of glass. The other shapes are modeled as metal. object { box <-5.5, -1.5, 1>, <5.5, 5.5, 1.2> glass } object { box <-5.5, -1.5, -0.04>, <5.5, 5.5, -0.09> steely_blue } object { box <-5.5, -1.5, -0.52>, <5.5, 5.5, -0.59> steely_blue } object { cylinder <-5.2, -1.2, 1.4>, <-5.2, -1.2, -0.74>, 0.2 steely_blue } object { cylinder <5.2, -1.2, 1.4>, <5.2, -1.2, -0.74>, 0.2 steely_blue } object { cylinder <-5.2, 5.2, 1.4>, <-5.2, 5.2, -0.74>, 0.2 steely_blue } object { cylinder <5.2, 5.2, 1.4>, <5.2, 5.2, -0.74>, 0.2 steely_blue } object { cylinder <0, -1.2, 1.4>, <0, -1.2, -0.74>, 0.2 steely_blue } object { cylinder <0, 5.2, 1.4>, <0, 5.2, -0.74>, 0.2 steely_blue }   In order to create the matrix of pins that make up the pin board I used a basic console application with a few nested loops to create two intersecting matrixes of pins, which models the layout used in the pin boards. The resulting image is shown below. The pin board contains 11,481 pins, with the scene file containing 23,709 lines of code. For the complete animation 2,000 scene files will be created, which is over 47 million lines of code. Each pin in the pin-board will slide out a specific distance when an object is pressed into the back of the board. This is easily modeled by setting the Z coordinate of the pin to a specific value. In order to set all of the pins in the pin-board to the correct position, a bitmap image can be used. The position of the pin can be set based on the color of the pixel at the appropriate position in the image. When the Windows Azure logo is used to set the Z coordinate of the pins, the following image is generated. The challenge now was to make a cool animation. The Azure Logo is fine, but it is static. Using a normal video to animate the pins would not work; the colors in the video would not be the same as the depth of the objects from the camera. In order to simulate the pin board accurately a series of frames from a depth camera could be used. Windows Kinect The Kenect controllers for the X-Box 360 and Windows feature a depth camera. The Kinect SDK for Windows provides a programming interface for Kenect, providing easy access for .NET developers to the Kinect sensors. The Kinect Explorer provided with the Kinect SDK is a great starting point for exploring Kinect from a developers perspective. Both the X-Box 360 Kinect and the Windows Kinect will work with the Kinect SDK, the Windows Kinect is required for commercial applications, but the X-Box Kinect can be used for hobby projects. The Windows Kinect has the advantage of providing a mode to allow depth capture with objects closer to the camera, which makes for a more accurate depth image for setting the pin positions. Creating a Depth Field Animation The depth field animation used to set the positions of the pin in the pin board was created using a modified version of the Kinect Explorer sample application. In order to simulate the pin board accurately, a small section of the depth range from the depth sensor will be used. Any part of the object in front of the depth range will result in a white pixel; anything behind the depth range will be black. Within the depth range the pixels in the image will be set to RGB values from 0,0,0 to 255,255,255. A screen shot of the modified Kinect Explorer application is shown below. The Kinect Explorer sample application was modified to include slider controls that are used to set the depth range that forms the image from the depth stream. This allows the fine tuning of the depth image that is required for simulating the position of the pins in the pin board. The Kinect Explorer was also modified to record a series of images from the depth camera and save them as a sequence JPEG files that will be used to animate the pins in the animation the Start and Stop buttons are used to start and stop the image recording. En example of one of the depth images is shown below. Once a series of 2,000 depth images has been captured, the task of creating the animation can begin. Rendering a Test Frame In order to test the creation of frames and get an approximation of the time required to render each frame a test frame was rendered on-premise using PolyRay. The output of the rendering process is shown below. The test frame contained 23,629 primitive shapes, most of which are the spheres and cylinders that are used for the 11,800 or so pins in the pin board. The 1280x720 image contains 921,600 pixels, but as anti-aliasing was used the number of rays that were calculated was 4,235,777, with 3,478,754,073 object boundaries checked. The test frame of the pin board with the depth field image applied is shown below. The tracing time for the test frame was 4 minutes 27 seconds, which means rendering the2,000 frames in the animation would take over 148 hours, or a little over 6 days. Although this is much faster that an old 486, waiting almost a week to see the results of an animation would make it challenging for animators to create, view, and refine their animations. It would be much better if the animation could be rendered in less than one hour. Windows Azure Worker Roles The cost of creating an on-premise render farm to render animations increases in proportion to the number of servers. The table below shows the cost of servers for creating a render farm, assuming a cost of $500 per server. Number of Servers Cost 1 $500 16 $8,000 256 $128,000   As well as the cost of the servers, there would be additional costs for networking, racks etc. Hosting an environment of 256 servers on-premise would require a server room with cooling, and some pretty hefty power cabling. The Windows Azure compute services provide worker roles, which are ideal for performing processor intensive compute tasks. With the scalability available in Windows Azure a job that takes 256 hours to complete could be perfumed using different numbers of worker roles. The time and cost of using 1, 16 or 256 worker roles is shown below. Number of Worker Roles Render Time Cost 1 256 hours $30.72 16 16 hours $30.72 256 1 hour $30.72   Using worker roles in Windows Azure provides the same cost for the 256 hour job, irrespective of the number of worker roles used. Provided the compute task can be broken down into many small units, and the worker role compute power can be used effectively, it makes sense to scale the application so that the task is completed quickly, making the results available in a timely fashion. The task of rendering 2,000 frames in an animation is one that can easily be broken down into 2,000 individual pieces, which can be performed by a number of worker roles. Creating a Render Farm in Windows Azure The architecture of the render farm is shown in the following diagram. The render farm is a hybrid application with the following components: ·         On-Premise o   Windows Kinect – Used combined with the Kinect Explorer to create a stream of depth images. o   Animation Creator – This application uses the depth images from the Kinect sensor to create scene description files for PolyRay. These files are then uploaded to the jobs blob container, and job messages added to the jobs queue. o   Process Monitor – This application queries the role instance lifecycle table and displays statistics about the render farm environment and render process. o   Image Downloader – This application polls the image queue and downloads the rendered animation files once they are complete. ·         Windows Azure o   Azure Storage – Queues and blobs are used for the scene description files and completed frames. A table is used to store the statistics about the rendering environment.   The architecture of each worker role is shown below.   The worker role is configured to use local storage, which provides file storage on the worker role instance that can be use by the applications to render the image and transform the format of the image. The service definition for the worker role with the local storage configuration highlighted is shown below. <?xml version="1.0" encoding="utf-8"?> <ServiceDefinition name="CloudRay" >   <WorkerRole name="CloudRayWorkerRole" vmsize="Small">     <Imports>     </Imports>     <ConfigurationSettings>       <Setting name="DataConnectionString" />     </ConfigurationSettings>     <LocalResources>       <LocalStorage name="RayFolder" cleanOnRoleRecycle="true" />     </LocalResources>   </WorkerRole> </ServiceDefinition>     The two executable programs, PolyRay.exe and DTA.exe are included in the Azure project, with Copy Always set as the property. PolyRay will take the scene description file and render it to a Truevision TGA file. As the TGA format has not seen much use since the mid 90’s it is converted to a JPG image using Dave's Targa Animator, another shareware application from the 90’s. Each worker roll will use the following process to render the animation frames. 1.       The worker process polls the job queue, if a job is available the scene description file is downloaded from blob storage to local storage. 2.       PolyRay.exe is started in a process with the appropriate command line arguments to render the image as a TGA file. 3.       DTA.exe is started in a process with the appropriate command line arguments convert the TGA file to a JPG file. 4.       The JPG file is uploaded from local storage to the images blob container. 5.       A message is placed on the images queue to indicate a new image is available for download. 6.       The job message is deleted from the job queue. 7.       The role instance lifecycle table is updated with statistics on the number of frames rendered by the worker role instance, and the CPU time used. The code for this is shown below. public override void Run() {     // Set environment variables     string polyRayPath = Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), PolyRayLocation);     string dtaPath = Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), DTALocation);       LocalResource rayStorage = RoleEnvironment.GetLocalResource("RayFolder");     string localStorageRootPath = rayStorage.RootPath;       JobQueue jobQueue = new JobQueue("renderjobs");     JobQueue downloadQueue = new JobQueue("renderimagedownloadjobs");     CloudRayBlob sceneBlob = new CloudRayBlob("scenes");     CloudRayBlob imageBlob = new CloudRayBlob("images");     RoleLifecycleDataSource roleLifecycleDataSource = new RoleLifecycleDataSource();       Frames = 0;       while (true)     {         // Get the render job from the queue         CloudQueueMessage jobMsg = jobQueue.Get();           if (jobMsg != null)         {             // Get the file details             string sceneFile = jobMsg.AsString;             string tgaFile = sceneFile.Replace(".pi", ".tga");             string jpgFile = sceneFile.Replace(".pi", ".jpg");               string sceneFilePath = Path.Combine(localStorageRootPath, sceneFile);             string tgaFilePath = Path.Combine(localStorageRootPath, tgaFile);             string jpgFilePath = Path.Combine(localStorageRootPath, jpgFile);               // Copy the scene file to local storage             sceneBlob.DownloadFile(sceneFilePath);               // Run the ray tracer.             string polyrayArguments =                 string.Format("\"{0}\" -o \"{1}\" -a 2", sceneFilePath, tgaFilePath);             Process polyRayProcess = new Process();             polyRayProcess.StartInfo.FileName =                 Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), polyRayPath);             polyRayProcess.StartInfo.Arguments = polyrayArguments;             polyRayProcess.Start();             polyRayProcess.WaitForExit();               // Convert the image             string dtaArguments =                 string.Format(" {0} /FJ /P{1}", tgaFilePath, Path.GetDirectoryName (jpgFilePath));             Process dtaProcess = new Process();             dtaProcess.StartInfo.FileName =                 Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), dtaPath);             dtaProcess.StartInfo.Arguments = dtaArguments;             dtaProcess.Start();             dtaProcess.WaitForExit();               // Upload the image to blob storage             imageBlob.UploadFile(jpgFilePath);               // Add a download job.             downloadQueue.Add(jpgFile);               // Delete the render job message             jobQueue.Delete(jobMsg);               Frames++;         }         else         {             Thread.Sleep(1000);         }           // Log the worker role activity.         roleLifecycleDataSource.Alive             ("CloudRayWorker", RoleLifecycleDataSource.RoleLifecycleId, Frames);     } }     Monitoring Worker Role Instance Lifecycle In order to get more accurate statistics about the lifecycle of the worker role instances used to render the animation data was tracked in an Azure storage table. The following class was used to track the worker role lifecycles in Azure storage.   public class RoleLifecycle : TableServiceEntity {     public string ServerName { get; set; }     public string Status { get; set; }     public DateTime StartTime { get; set; }     public DateTime EndTime { get; set; }     public long SecondsRunning { get; set; }     public DateTime LastActiveTime { get; set; }     public int Frames { get; set; }     public string Comment { get; set; }       public RoleLifecycle()     {     }       public RoleLifecycle(string roleName)     {         PartitionKey = roleName;         RowKey = Utils.GetAscendingRowKey();         Status = "Started";         StartTime = DateTime.UtcNow;         LastActiveTime = StartTime;         EndTime = StartTime;         SecondsRunning = 0;         Frames = 0;     } }     A new instance of this class is created and added to the storage table when the role starts. It is then updated each time the worker renders a frame to record the total number of frames rendered and the total processing time. These statistics are used be the monitoring application to determine the effectiveness of use of resources in the render farm. Rendering the Animation The Azure solution was deployed to Windows Azure with the service configuration set to 16 worker role instances. This allows for the application to be tested in the cloud environment, and the performance of the application determined. When I demo the application at conferences and user groups I often start with 16 instances, and then scale up the application to the full 256 instances. The configuration to run 16 instances is shown below. <?xml version="1.0" encoding="utf-8"?> <ServiceConfiguration serviceName="CloudRay" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration" osFamily="1" osVersion="*">   <Role name="CloudRayWorkerRole">     <Instances count="16" />     <ConfigurationSettings>       <Setting name="DataConnectionString"         value="DefaultEndpointsProtocol=https;AccountName=cloudraydata;AccountKey=..." />     </ConfigurationSettings>   </Role> </ServiceConfiguration>     About six minutes after deploying the application the first worker roles become active and start to render the first frames of the animation. The CloudRay Monitor application displays an icon for each worker role instance, with a number indicating the number of frames that the worker role has rendered. The statistics on the left show the number of active worker roles and statistics about the render process. The render time is the time since the first worker role became active; the CPU time is the total amount of processing time used by all worker role instances to render the frames.   Five minutes after the first worker role became active the last of the 16 worker roles activated. By this time the first seven worker roles had each rendered one frame of the animation.   With 16 worker roles u and running it can be seen that one hour and 45 minutes CPU time has been used to render 32 frames with a render time of just under 10 minutes.     At this rate it would take over 10 hours to render the 2,000 frames of the full animation. In order to complete the animation in under an hour more processing power will be required. Scaling the render farm from 16 instances to 256 instances is easy using the new management portal. The slider is set to 256 instances, and the configuration saved. We do not need to re-deploy the application, and the 16 instances that are up and running will not be affected. Alternatively, the configuration file for the Azure service could be modified to specify 256 instances.   <?xml version="1.0" encoding="utf-8"?> <ServiceConfiguration serviceName="CloudRay" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration" osFamily="1" osVersion="*">   <Role name="CloudRayWorkerRole">     <Instances count="256" />     <ConfigurationSettings>       <Setting name="DataConnectionString"         value="DefaultEndpointsProtocol=https;AccountName=cloudraydata;AccountKey=..." />     </ConfigurationSettings>   </Role> </ServiceConfiguration>     Six minutes after the new configuration has been applied 75 new worker roles have activated and are processing their first frames.   Five minutes later the full configuration of 256 worker roles is up and running. We can see that the average rate of frame rendering has increased from 3 to 12 frames per minute, and that over 17 hours of CPU time has been utilized in 23 minutes. In this test the time to provision 140 worker roles was about 11 minutes, which works out at about one every five seconds.   We are now half way through the rendering, with 1,000 frames complete. This has utilized just under three days of CPU time in a little over 35 minutes.   The animation is now complete, with 2,000 frames rendered in a little over 52 minutes. The CPU time used by the 256 worker roles is 6 days, 7 hours and 22 minutes with an average frame rate of 38 frames per minute. The rendering of the last 1,000 frames took 16 minutes 27 seconds, which works out at a rendering rate of 60 frames per minute. The frame counts in the server instances indicate that the use of a queue to distribute the workload has been very effective in distributing the load across the 256 worker role instances. The first 16 instances that were deployed first have rendered between 11 and 13 frames each, whilst the 240 instances that were added when the application was scaled have rendered between 6 and 9 frames each.   Completed Animation I’ve uploaded the completed animation to YouTube, a low resolution preview is shown below. Pin Board Animation Created using Windows Kinect and 256 Windows Azure Worker Roles   The animation can be viewed in 1280x720 resolution at the following link: http://www.youtube.com/watch?v=n5jy6bvSxWc Effective Use of Resources According to the CloudRay monitor statistics the animation took 6 days, 7 hours and 22 minutes CPU to render, this works out at 152 hours of compute time, rounded up to the nearest hour. As the usage for the worker role instances are billed for the full hour, it may have been possible to render the animation using fewer than 256 worker roles. When deciding the optimal usage of resources, the time required to provision and start the worker roles must also be considered. In the demo I started with 16 worker roles, and then scaled the application to 256 worker roles. It would have been more optimal to start the application with maybe 200 worker roles, and utilized the full hour that I was being billed for. This would, however, have prevented showing the ease of scalability of the application. The new management portal displays the CPU usage across the worker roles in the deployment. The average CPU usage across all instances is 93.27%, with over 99% used when all the instances are up and running. This shows that the worker role resources are being used very effectively. Grid Computing Scenarios Although I am using this scenario for a hobby project, there are many scenarios where a large amount of compute power is required for a short period of time. Windows Azure provides a great platform for developing these types of grid computing applications, and can work out very cost effective. ·         Windows Azure can provide massive compute power, on demand, in a matter of minutes. ·         The use of queues to manage the load balancing of jobs between role instances is a simple and effective solution. ·         Using a cloud-computing platform like Windows Azure allows proof-of-concept scenarios to be tested and evaluated on a very low budget. ·         No charges for inbound data transfer makes the uploading of large data sets to Windows Azure Storage services cost effective. (Transaction charges still apply.) Tips for using Windows Azure for Grid Computing Scenarios I found the implementation of a render farm using Windows Azure a fairly simple scenario to implement. I was impressed by ease of scalability that Azure provides, and by the short time that the application took to scale from 16 to 256 worker role instances. In this case it was around 13 minutes, in other tests it took between 10 and 20 minutes. The following tips may be useful when implementing a grid computing project in Windows Azure. ·         Using an Azure Storage queue to load-balance the units of work across multiple worker roles is simple and very effective. The design I have used in this scenario could easily scale to many thousands of worker role instances. ·         Windows Azure accounts are typically limited to 20 cores. If you need to use more than this, a call to support and a credit card check will be required. ·         Be aware of how the billing model works. You will be charged for worker role instances for the full clock our in which the instance is deployed. Schedule the workload to start just after the clock hour has started. ·         Monitor the utilization of the resources you are provisioning, ensure that you are not paying for worker roles that are idle. ·         If you are deploying third party applications to worker roles, you may well run into licensing issues. Purchasing software licenses on a per-processor basis when using hundreds of processors for a short time period would not be cost effective. ·         Third party software may also require installation onto the worker roles, which can be accomplished using start-up tasks. Bear in mind that adding a startup task and possible re-boot will add to the time required for the worker role instance to start and activate. An alternative may be to use a prepared VM and use VM roles. ·         Consider using the Windows Azure Autoscaling Application Block (WASABi) to autoscale the worker roles in your application. When using a large number of worker roles, the utilization must be carefully monitored, if the scaling algorithms are not optimal it could get very expensive!

    Read the article

  • Debian squeeze keyboard and touchpad not working / detected on laptop

    - by Esa
    They work before gdm3 starts. a connected mouse also stops working, but functions after removal and re-plug. no xorg.conf. log doesn't show any loading of drivers for kbd/touchpad [ 33.783] X.Org X Server 1.10.4 Release Date: 2011-08-19 [ 33.783] X Protocol Version 11, Revision 0 [ 33.783] Build Operating System: Linux 3.0.0-1-amd64 x86_64 Debian [ 33.783] Current Operating System: Linux sus 3.2.0-0.bpo.2-amd64 #1 SMP Sun Mar 25 10:33:35 UTC 2012 x86_64 [ 33.783] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.2.0-0.bpo.2-amd64 root=UUID=8686f840-d165-4d1e-b995-2ebbd94aa3d2 ro quiet [ 33.783] Build Date: 28 August 2011 09:39:43PM [ 33.783] xorg-server 2:1.10.4-1~bpo60+1 (Cyril Brulebois <[email protected]>) [ 33.783] Current version of pixman: 0.16.4 [ 33.783] Before reporting problems, check http://wiki.x.org to make sure that you have the latest version. [ 33.783] Markers: (--) probed, (**) from config file, (==) default setting, (++) from command line, (!!) notice, (II) informational, (WW) warning, (EE) error, (NI) not implemented, (??) unknown. [ 33.783] (==) Log file: "/var/log/Xorg.0.log", Time: Wed Mar 28 09:34:04 2012 [ 33.837] (==) Using system config directory "/usr/share/X11/xorg.conf.d" [ 33.936] (==) No Layout section. Using the first Screen section. [ 33.936] (==) No screen section available. Using defaults. [ 33.936] (**) |-->Screen "Default Screen Section" (0) [ 33.936] (**) | |-->Monitor "<default monitor>" [ 33.936] (==) No monitor specified for screen "Default Screen Section". Using a default monitor configuration. [ 33.936] (==) Automatically adding devices [ 33.936] (==) Automatically enabling devices [ 34.164] (WW) The directory "/usr/share/fonts/X11/cyrillic" does not exist. [ 34.164] Entry deleted from font path. [ 34.226] (==) FontPath set to: /usr/share/fonts/X11/misc, /usr/share/fonts/X11/100dpi/:unscaled, /usr/share/fonts/X11/75dpi/:unscaled, /usr/share/fonts/X11/Type1, /usr/share/fonts/X11/100dpi, /usr/share/fonts/X11/75dpi, /var/lib/defoma/x-ttcidfont-conf.d/dirs/TrueType, built-ins [ 34.226] (==) ModulePath set to "/usr/lib/xorg/modules" [ 34.226] (II) The server relies on udev to provide the list of input devices. If no devices become available, reconfigure udev or disable AutoAddDevices. [ 34.226] (II) Loader magic: 0x7d3ae0 [ 34.226] (II) Module ABI versions: [ 34.226] X.Org ANSI C Emulation: 0.4 [ 34.226] X.Org Video Driver: 10.0 [ 34.226] X.Org XInput driver : 12.2 [ 34.226] X.Org Server Extension : 5.0 [ 34.227] (--) PCI:*(0:1:5:0) 1002:9712:103c:1661 rev 0, Mem @ 0xd0000000/268435456, 0xf1400000/65536, 0xf1300000/1048576, I/O @ 0x00008000/256 [ 34.227] (--) PCI: (0:2:0:0) 1002:6760:103c:1661 rev 0, Mem @ 0xe0000000/268435456, 0xf0300000/131072, I/O @ 0x00004000/256, BIOS @ 0x????????/131072 [ 34.227] (II) Open ACPI successful (/var/run/acpid.socket) [ 34.227] (II) LoadModule: "extmod" [ 34.249] (II) Loading /usr/lib/xorg/modules/extensions/libextmod.so [ 34.277] (II) Module extmod: vendor="X.Org Foundation" [ 34.277] compiled for 1.10.4, module version = 1.0.0 [ 34.277] Module class: X.Org Server Extension [ 34.277] ABI class: X.Org Server Extension, version 5.0 [ 34.277] (II) Loading extension SELinux [ 34.277] (II) Loading extension MIT-SCREEN-SAVER [ 34.277] (II) Loading extension XFree86-VidModeExtension [ 34.277] (II) Loading extension XFree86-DGA [ 34.277] (II) Loading extension DPMS [ 34.277] (II) Loading extension XVideo [ 34.277] (II) Loading extension XVideo-MotionCompensation [ 34.277] (II) Loading extension X-Resource [ 34.277] (II) LoadModule: "dbe" [ 34.277] (II) Loading /usr/lib/xorg/modules/extensions/libdbe.so [ 34.299] (II) Module dbe: vendor="X.Org Foundation" [ 34.299] compiled for 1.10.4, module version = 1.0.0 [ 34.299] Module class: X.Org Server Extension [ 34.299] ABI class: X.Org Server Extension, version 5.0 [ 34.299] (II) Loading extension DOUBLE-BUFFER [ 34.299] (II) LoadModule: "glx" [ 34.299] (II) Loading /usr/lib/xorg/modules/extensions/libglx.so [ 34.477] (II) Module glx: vendor="X.Org Foundation" [ 34.477] compiled for 1.10.4, module version = 1.0.0 [ 34.477] ABI class: X.Org Server Extension, version 5.0 [ 34.477] (==) AIGLX enabled [ 34.477] (II) Loading extension GLX [ 34.477] (II) LoadModule: "record" [ 34.478] (II) Loading /usr/lib/xorg/modules/extensions/librecord.so [ 34.481] (II) Module record: vendor="X.Org Foundation" [ 34.481] compiled for 1.10.4, module version = 1.13.0 [ 34.481] Module class: X.Org Server Extension [ 34.481] ABI class: X.Org Server Extension, version 5.0 [ 34.481] (II) Loading extension RECORD [ 34.481] (II) LoadModule: "dri" [ 34.481] (II) Loading /usr/lib/xorg/modules/extensions/libdri.so [ 34.512] (II) Module dri: vendor="X.Org Foundation" [ 34.512] compiled for 1.10.4, module version = 1.0.0 [ 34.512] ABI class: X.Org Server Extension, version 5.0 [ 34.512] (II) Loading extension XFree86-DRI [ 34.512] (II) LoadModule: "dri2" [ 34.512] (II) Loading /usr/lib/xorg/modules/extensions/libdri2.so [ 34.515] (II) Module dri2: vendor="X.Org Foundation" [ 34.515] compiled for 1.10.4, module version = 1.2.0 [ 34.515] ABI class: X.Org Server Extension, version 5.0 [ 34.515] (II) Loading extension DRI2 [ 34.515] (==) Matched ati as autoconfigured driver 0 [ 34.515] (==) Matched vesa as autoconfigured driver 1 [ 34.515] (==) Matched fbdev as autoconfigured driver 2 [ 34.515] (==) Assigned the driver to the xf86ConfigLayout [ 34.515] (II) LoadModule: "ati" [ 34.706] (II) Loading /usr/lib/xorg/modules/drivers/ati_drv.so [ 34.724] (II) Module ati: vendor="X.Org Foundation" [ 34.724] compiled for 1.10.3, module version = 6.14.2 [ 34.724] Module class: X.Org Video Driver [ 34.724] ABI class: X.Org Video Driver, version 10.0 [ 34.724] (II) LoadModule: "radeon" [ 34.725] (II) Loading /usr/lib/xorg/modules/drivers/radeon_drv.so [ 34.923] (II) Module radeon: vendor="X.Org Foundation" [ 34.923] compiled for 1.10.3, module version = 6.14.2 [ 34.923] Module class: X.Org Video Driver [ 34.923] ABI class: X.Org Video Driver, version 10.0 [ 34.945] (II) LoadModule: "vesa" [ 34.945] (II) Loading /usr/lib/xorg/modules/drivers/vesa_drv.so [ 34.988] (II) Module vesa: vendor="X.Org Foundation" [ 34.988] compiled for 1.10.3, module version = 2.3.0 [ 34.988] Module class: X.Org Video Driver [ 34.988] ABI class: X.Org Video Driver, version 10.0 [ 34.988] (II) LoadModule: "fbdev" [ 34.988] (II) Loading /usr/lib/xorg/modules/drivers/fbdev_drv.so [ 35.020] (II) Module fbdev: vendor="X.Org Foundation" [ 35.020] compiled for 1.10.3, module version = 0.4.2 [ 35.020] ABI class: X.Org Video Driver, version 10.0 [ 35.020] (II) RADEON: Driver for ATI Radeon chipsets: <snip> [ 35.023] (II) VESA: driver for VESA chipsets: vesa [ 35.023] (II) FBDEV: driver for framebuffer: fbdev [ 35.023] (++) using VT number 7 [ 35.033] (II) Loading /usr/lib/xorg/modules/drivers/radeon_drv.so [ 35.033] (II) [KMS] Kernel modesetting enabled. [ 35.033] (WW) Falling back to old probe method for vesa [ 35.034] (WW) Falling back to old probe method for fbdev [ 35.034] (II) Loading sub module "fbdevhw" [ 35.034] (II) LoadModule: "fbdevhw" [ 35.034] (II) Loading /usr/lib/xorg/modules/libfbdevhw.so [ 35.185] (II) Module fbdevhw: vendor="X.Org Foundation" [ 35.185] compiled for 1.10.4, module version = 0.0.2 [ 35.185] ABI class: X.Org Video Driver, version 10.0 [ 35.288] (II) RADEON(0): Creating default Display subsection in Screen section "Default Screen Section" for depth/fbbpp 24/32 [ 35.288] (==) RADEON(0): Depth 24, (--) framebuffer bpp 32 [ 35.288] (II) RADEON(0): Pixel depth = 24 bits stored in 4 bytes (32 bpp pixmaps) [ 35.288] (==) RADEON(0): Default visual is TrueColor [ 35.288] (==) RADEON(0): RGB weight 888 [ 35.288] (II) RADEON(0): Using 8 bits per RGB (8 bit DAC) [ 35.288] (--) RADEON(0): Chipset: "ATI Mobility Radeon HD 4200" (ChipID = 0x9712) [ 35.288] (II) RADEON(0): PCI card detected [ 35.288] drmOpenDevice: node name is /dev/dri/card0 [ 35.288] drmOpenDevice: open result is 9, (OK) [ 35.288] drmOpenByBusid: Searching for BusID pci:0000:01:05.0 [ 35.288] drmOpenDevice: node name is /dev/dri/card0 [ 35.288] drmOpenDevice: open result is 9, (OK) [ 35.288] drmOpenByBusid: drmOpenMinor returns 9 [ 35.288] drmOpenByBusid: drmGetBusid reports pci:0000:01:05.0 [ 35.288] (II) Loading sub module "exa" [ 35.288] (II) LoadModule: "exa" [ 35.288] (II) Loading /usr/lib/xorg/modules/libexa.so [ 35.335] (II) Module exa: vendor="X.Org Foundation" [ 35.335] compiled for 1.10.4, module version = 2.5.0 [ 35.335] ABI class: X.Org Video Driver, version 10.0 [ 35.335] (II) RADEON(0): KMS Color Tiling: disabled [ 35.335] (II) RADEON(0): KMS Pageflipping: enabled [ 35.335] (II) RADEON(0): SwapBuffers wait for vsync: enabled [ 35.360] (II) RADEON(0): Output VGA-0 has no monitor section [ 35.360] (II) RADEON(0): Output LVDS has no monitor section [ 35.364] (II) RADEON(0): Output HDMI-0 has no monitor section [ 35.388] (II) RADEON(0): EDID for output VGA-0 [ 35.388] (II) RADEON(0): EDID for output LVDS [ 35.388] (II) RADEON(0): Manufacturer: LGD Model: 2ac Serial#: 0 [ 35.388] (II) RADEON(0): Year: 2010 Week: 0 [ 35.388] (II) RADEON(0): EDID Version: 1.3 [ 35.388] (II) RADEON(0): Digital Display Input [ 35.388] (II) RADEON(0): Max Image Size [cm]: horiz.: 34 vert.: 19 [ 35.388] (II) RADEON(0): Gamma: 2.20 [ 35.388] (II) RADEON(0): No DPMS capabilities specified [ 35.388] (II) RADEON(0): Supported color encodings: RGB 4:4:4 YCrCb 4:4:4 [ 35.388] (II) RADEON(0): First detailed timing is preferred mode [ 35.388] (II) RADEON(0): redX: 0.616 redY: 0.371 greenX: 0.355 greenY: 0.606 [ 35.388] (II) RADEON(0): blueX: 0.152 blueY: 0.100 whiteX: 0.313 whiteY: 0.329 [ 35.388] (II) RADEON(0): Manufacturer's mask: 0 [ 35.388] (II) RADEON(0): Supported detailed timing: [ 35.388] (II) RADEON(0): clock: 69.3 MHz Image Size: 344 x 194 mm [ 35.388] (II) RADEON(0): h_active: 1366 h_sync: 1398 h_sync_end 1430 h_blank_end 1486 h_border: 0 [ 35.388] (II) RADEON(0): v_active: 768 v_sync: 770 v_sync_end 774 v_blanking: 782 v_border: 0 [ 35.388] (II) RADEON(0): LG Display [ 35.388] (II) RADEON(0): LP156WH2-TLQB [ 35.388] (II) RADEON(0): EDID (in hex): [ 35.388] (II) RADEON(0): 00ffffffffffff0030e4ac0200000000 [ 35.388] (II) RADEON(0): 00140103802213780ac1259d5f5b9b27 [ 35.388] (II) RADEON(0): 19505400000001010101010101010101 [ 35.388] (II) RADEON(0): 010101010101121b567850000e302020 [ 35.388] (II) RADEON(0): 240058c2100000190000000000000000 [ 35.388] (II) RADEON(0): 00000000000000000000000000fe004c [ 35.388] (II) RADEON(0): 4720446973706c61790a2020000000fe [ 35.388] (II) RADEON(0): 004c503135365748322d544c514200c1 [ 35.388] (II) RADEON(0): Printing probed modes for output LVDS [ 35.388] (II) RADEON(0): Modeline "1366x768"x59.6 69.30 1366 1398 1430 1486 768 770 774 782 -hsync -vsync (46.6 kHz) [ 35.388] (II) RADEON(0): Modeline "1280x720"x59.9 74.50 1280 1344 1472 1664 720 723 728 748 -hsync +vsync (44.8 kHz) [ 35.388] (II) RADEON(0): Modeline "1152x768"x59.8 71.75 1152 1216 1328 1504 768 771 781 798 -hsync +vsync (47.7 kHz) [ 35.388] (II) RADEON(0): Modeline "1024x768"x59.9 63.50 1024 1072 1176 1328 768 771 775 798 -hsync +vsync (47.8 kHz) [ 35.388] (II) RADEON(0): Modeline "800x600"x59.9 38.25 800 832 912 1024 600 603 607 624 -hsync +vsync (37.4 kHz) [ 35.388] (II) RADEON(0): Modeline "848x480"x59.7 31.50 848 872 952 1056 480 483 493 500 -hsync +vsync (29.8 kHz) [ 35.388] (II) RADEON(0): Modeline "720x480"x59.7 26.75 720 744 808 896 480 483 493 500 -hsync +vsync (29.9 kHz) [ 35.388] (II) RADEON(0): Modeline "640x480"x59.4 23.75 640 664 720 800 480 483 487 500 -hsync +vsync (29.7 kHz) [ 35.392] (II) RADEON(0): EDID for output HDMI-0 [ 35.392] (II) RADEON(0): Output VGA-0 disconnected [ 35.392] (II) RADEON(0): Output LVDS connected [ 35.392] (II) RADEON(0): Output HDMI-0 disconnected [ 35.392] (II) RADEON(0): Using exact sizes for initial modes [ 35.392] (II) RADEON(0): Output LVDS using initial mode 1366x768 [ 35.392] (II) RADEON(0): Using default gamma of (1.0, 1.0, 1.0) unless otherwise stated. [ 35.392] (II) RADEON(0): mem size init: gart size :1fdff000 vram size: s:10000000 visible:fba0000 [ 35.392] (II) RADEON(0): EXA: Driver will allow EXA pixmaps in VRAM [ 35.392] (==) RADEON(0): DPI set to (96, 96) [ 35.392] (II) Loading sub module "fb" [ 35.392] (II) LoadModule: "fb" [ 35.392] (II) Loading /usr/lib/xorg/modules/libfb.so [ 35.492] (II) Module fb: vendor="X.Org Foundation" [ 35.492] compiled for 1.10.4, module version = 1.0.0 [ 35.492] ABI class: X.Org ANSI C Emulation, version 0.4 [ 35.492] (II) Loading sub module "ramdac" [ 35.492] (II) LoadModule: "ramdac" [ 35.492] (II) Module "ramdac" already built-in [ 35.492] (II) UnloadModule: "vesa" [ 35.492] (II) Unloading vesa [ 35.492] (II) UnloadModule: "fbdev" [ 35.492] (II) Unloading fbdev [ 35.492] (II) UnloadModule: "fbdevhw" [ 35.492] (II) Unloading fbdevhw [ 35.492] (--) Depth 24 pixmap format is 32 bpp [ 35.492] (II) RADEON(0): [DRI2] Setup complete [ 35.492] (II) RADEON(0): [DRI2] DRI driver: r600 [ 35.492] (II) RADEON(0): Front buffer size: 4224K [ 35.492] (II) RADEON(0): VRAM usage limit set to 228096K [ 35.615] (==) RADEON(0): Backing store disabled [ 35.615] (II) RADEON(0): Direct rendering enabled [ 35.658] (II) RADEON(0): Setting EXA maxPitchBytes [ 35.658] (II) EXA(0): Driver allocated offscreen pixmaps [ 35.658] (II) EXA(0): Driver registered support for the following operations: [ 35.658] (II) Solid [ 35.658] (II) Copy [ 35.658] (II) Composite (RENDER acceleration) [ 35.658] (II) UploadToScreen [ 35.658] (II) DownloadFromScreen [ 35.687] (II) RADEON(0): Acceleration enabled [ 35.687] (==) RADEON(0): DPMS enabled [ 35.687] (==) RADEON(0): Silken mouse enabled [ 35.721] (II) RADEON(0): Set up textured video [ 35.721] (II) RADEON(0): RandR 1.2 enabled, ignore the following RandR disabled message. [ 35.721] (--) RandR disabled [ 35.721] (II) Initializing built-in extension Generic Event Extension [ 35.721] (II) Initializing built-in extension SHAPE [ 35.721] (II) Initializing built-in extension MIT-SHM [ 35.721] (II) Initializing built-in extension XInputExtension [ 35.721] (II) Initializing built-in extension XTEST [ 35.721] (II) Initializing built-in extension BIG-REQUESTS [ 35.721] (II) Initializing built-in extension SYNC [ 35.721] (II) Initializing built-in extension XKEYBOARD [ 35.721] (II) Initializing built-in extension XC-MISC [ 35.721] (II) Initializing built-in extension SECURITY [ 35.721] (II) Initializing built-in extension XINERAMA [ 35.721] (II) Initializing built-in extension XFIXES [ 35.721] (II) Initializing built-in extension RENDER [ 35.721] (II) Initializing built-in extension RANDR [ 35.721] (II) Initializing built-in extension COMPOSITE [ 35.721] (II) Initializing built-in extension DAMAGE [ 35.721] (II) SELinux: Disabled on system [ 35.982] (II) AIGLX: enabled GLX_MESA_copy_sub_buffer [ 35.982] (II) AIGLX: enabled GLX_INTEL_swap_event [ 35.982] (II) AIGLX: enabled GLX_SGI_swap_control and GLX_MESA_swap_control [ 35.982] (II) AIGLX: enabled GLX_SGI_make_current_read [ 35.982] (II) AIGLX: GLX_EXT_texture_from_pixmap backed by buffer objects [ 35.982] (II) AIGLX: Loaded and initialized /usr/lib/dri/r600_dri.so [ 35.982] (II) GLX: Initialized DRI2 GL provider for screen 0 [ 35.999] (II) RADEON(0): Setting screen physical size to 361 x 203 [ 43.896] (II) RADEON(0): EDID vendor "LGD", prod id 684 [ 43.896] (II) RADEON(0): Printing DDC gathered Modelines: [ 43.896] (II) RADEON(0): Modeline "1366x768"x0.0 69.30 1366 1398 1430 1486 768 770 774 782 -hsync -vsync (46.6 kHz) [ 43.924] (II) RADEON(0): EDID vendor "LGD", prod id 684 [ 43.924] (II) RADEON(0): Printing DDC gathered Modelines: [ 43.924] (II) RADEON(0): Modeline "1366x768"x0.0 69.30 1366 1398 1430 1486 768 770 774 782 -hsync -vsync (46.6 kHz) [ 43.988] (II) RADEON(0): EDID vendor "LGD", prod id 684 [ 43.988] (II) RADEON(0): Printing DDC gathered Modelines: [ 43.988] (II) RADEON(0): Modeline "1366x768"x0.0 69.30 1366 1398 1430 1486 768 770 774 782 -hsync -vsync (46.6 kHz) [ 67.375] (II) config/udev: Adding input device Logitech USB Optical Mouse (/dev/input/event1) [ 67.376] (**) Logitech USB Optical Mouse: Applying InputClass "evdev pointer catchall" [ 67.376] (II) LoadModule: "evdev" [ 67.376] (II) Loading /usr/lib/xorg/modules/input/evdev_drv.so [ 67.392] (II) Module evdev: vendor="X.Org Foundation" [ 67.392] compiled for 1.10.3, module version = 2.6.0 [ 67.392] Module class: X.Org XInput Driver [ 67.392] ABI class: X.Org XInput driver, version 12.2 [ 67.392] (II) Using input driver 'evdev' for 'Logitech USB Optical Mouse' [ 67.392] (II) Loading /usr/lib/xorg/modules/input/evdev_drv.so [ 67.392] (**) Logitech USB Optical Mouse: always reports core events [ 67.392] (**) Logitech USB Optical Mouse: Device: "/dev/input/event1" [ 67.392] (--) Logitech USB Optical Mouse: Found 12 mouse buttons [ 67.392] (--) Logitech USB Optical Mouse: Found scroll wheel(s) [ 67.392] (--) Logitech USB Optical Mouse: Found relative axes [ 67.392] (--) Logitech USB Optical Mouse: Found x and y relative axes [ 67.392] (II) Logitech USB Optical Mouse: Configuring as mouse [ 67.392] (II) Logitech USB Optical Mouse: Adding scrollwheel support [ 67.392] (**) Logitech USB Optical Mouse: YAxisMapping: buttons 4 and 5 [ 67.392] (**) Logitech USB Optical Mouse: EmulateWheelButton: 4, EmulateWheelInertia: 10, EmulateWheelTimeout: 200 [ 67.392] (**) Option "config_info" "udev:/sys/devices/pci0000:00/0000:00:13.0/usb5/5-1/5-1:1.0/input/input14/event1" [ 67.392] (II) XINPUT: Adding extended input device "Logitech USB Optical Mouse" (type: MOUSE) [ 67.392] (II) Logitech USB Optical Mouse: initialized for relative axes. [ 67.392] (**) Logitech USB Optical Mouse: (accel) keeping acceleration scheme 1 [ 67.392] (**) Logitech USB Optical Mouse: (accel) acceleration profile 0 [ 67.392] (**) Logitech USB Optical Mouse: (accel) acceleration factor: 2.000 [ 67.392] (**) Logitech USB Optical Mouse: (accel) acceleration threshold: 4 [ 67.392] (II) config/udev: Adding input device Logitech USB Optical Mouse (/dev/input/mouse0) [ 67.392] (II) No input driver/identifier specified (ignoring) [ 78.692] (II) Logitech USB Optical Mouse: Close [ 78.692] (II) UnloadModule: "evdev" [ 78.692] (II) Unloading evdev

    Read the article

  • How John Got 15x Improvement Without Really Trying

    - by rchrd
    The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here.  How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

    Read the article

  • Updating the value of a math equation with YUI slider and simple radio buttons.

    - by dj lewis
    I have a form that is used to show a price for a product. I have a YUI slider setup that changes the price, and it works perfectly. Now I'm trying to add in radio buttons that also should update that same price value. The price displayed should take into account all 3 fields, and update dynamically as any are updated. This is the code I have, but I don't have any radio buttons for cpanelPrice yet as I'm still just trying to get the IPs to work. <script type="text/javascript"> (function() { var Event = YAHOO.util.Event, Dom = YAHOO.util.Dom, lang = YAHOO.lang, slider, bg="slider-bg", thumb="slider-thumb", orderlink="order-link", monthlyprice="monthly-price", dram="ram", stor="storage",dcpu="cpu",bandw="bandwidth",slid="sliderbg" // The slider can move 0 pixels up var topConstraint = 0; // The slider can move 200 pixels down var bottomConstraint = 585; // Custom scale factor for converting the pixel offset into a real value var scaleFactor = 1; // The amount the slider moves when the value is changed with the arrow // keys var keyIncrement = 65; var tickSize = 65; Event.onDOMReady(function() { slider = YAHOO.widget.Slider.getHorizSlider(bg, thumb, topConstraint, bottomConstraint, tickSize); slider.setValue(1, true); slider.animate = true; slider.getRealValue = function() { return Math.round(this.getValue() * scaleFactor); } slider.subscribe("change", function(offsetFromStart) { var ordnode = Dom.get(orderlink); var prinode = Dom.get(monthlyprice); var ramnode = Dom.get(dram); var stornode = Dom.get(stor); var cpunode = Dom.get(dcpu); var bwnode = Dom.get(bandw); var slidnode = Dom.get(slid); var actualValue = slider.getRealValue(); if (actualValue < 0) { var actualValue = 0; } if (actualValue > -1 && actualValue < 5) { basePrice = 15; var pid = "7"; var ram = "128 MB"; stornode.innerHTML = "5"; cpunode.innerHTML = ".5"; bwnode.innerHTML = "50"; slidnode.innerHTML = "<img src=\"/images/sliderbg1.png\" alt=\"\" />"; } else if (actualValue > 60 && actualValue < 70) { basePrice = 25; var pid = "8"; var ram = "256 MB"; stornode.innerHTML = "10"; cpunode.innerHTML = ".5"; bwnode.innerHTML = "100"; slidnode.innerHTML = "<img src=\"/images/sliderbg2.png\" alt=\"\" />"; } else if (actualValue > 125 && actualValue < 135) { basePrice = 40; var pid = "9"; var ram = "512 MB"; stornode.innerHTML = "20"; cpunode.innerHTML = "1"; bwnode.innerHTML = "200"; slidnode.innerHTML = "<img src=\"/images/sliderbg3.png\" alt=\"\" />"; } else if (actualValue > 190 && actualValue < 200) { basePrice = 60; var pid = "10"; var ram = "1 GB"; stornode.innerHTML = "40"; cpunode.innerHTML = "1"; bwnode.innerHTML = "400"; slidnode.innerHTML = "<img src=\"/images/sliderbg4.png\" alt=\"\" />"; } else if (actualValue> 255 && actualValue < 265) { basePrice = 80; var pid = "11"; var ram = "1.5 GB"; stornode.innerHTML = "60"; cpunode.innerHTML = "1"; bwnode.innerHTML = "600"; slidnode.innerHTML = "<img src=\"/images/sliderbg5.png\" alt=\"\" />"; } else if (actualValue > 320 && actualValue < 330) { basePrice = 110; var pid = "12"; var ram = "2 GB"; stornode.innerHTML = "80"; cpunode.innerHTML = "2"; bwnode.innerHTML = "800"; slidnode.innerHTML = "<img src=\"/images/sliderbg6.png\" alt=\"\" />"; } else if (actualValue > 385 && actualValue < 395) { basePrice = 140; var pid = "13"; var ram = "2.5 GB"; stornode.innerHTML = "100"; cpunode.innerHTML = "2"; bwnode.innerHTML = "1000"; slidnode.innerHTML = "<img src=\"/images/sliderbg7.png\" alt=\"\" />"; } else if (actualValue > 450 && actualValue < 460) { basePrice = 170; var pid = "14"; var ram = "3 GB"; stornode.innerHTML = "120"; cpunode.innerHTML = "3"; bwnode.innerHTML = "1200"; slidnode.innerHTML = "<img src=\"/images/sliderbg8.png\" alt=\"\" />"; } else if (actualValue > 515 && actualValue < 525) { basePrice = 200; var pid = "15"; var ram = "3.5 GB"; stornode.innerHTML = "140"; cpunode.innerHTML = "3"; bwnode.innerHTML = "1400"; slidnode.innerHTML = "<img src=\"/images/sliderbg9.png\" alt=\"\" />"; } else if (actualValue > 580 && actualValue < 590) { basePrice = 240; var pid = "16"; var ram = "4 GB"; stornode.innerHTML = "160"; cpunode.innerHTML = "4"; bwnode.innerHTML = "1600"; slidnode.innerHTML = "<img src=\"/images/sliderbg10.png\" alt=\"\" />"; } // Setup the order link ordnode.innerHTML = "<a href=\"https://account.hostingbeast.com/cart.php?a=add&pid=" + pid + "\"><img src=\"/images/blank.gif\" alt=\"Order VPS Hosting\" height=\"100\" width=\"100\" /></a>"; ramnode.innerHTML = ram; ipPrice = 0; function setIpPrice(ips) { ipPrice = ips.value; } cpanelPrice = 0; prinode.innerHTML = basePrice + ipPrice + cpanelPrice; }); // Use setValue to reset the value to white: Event.on("putval", "click", function(e) { slider.setValue(100, false); //false here means to animate if possible }); setTimeout(function () { slider.setValue(10); },0); }); })(); </script> <div style="width: 649px; margin:auto"> <span id="sliderbg"></span> <div class="yui-skin-sam"> <div id="slider-bg" class="yui-h-slider" tabindex="-1"> <div id="slider-thumb" class="yui-slider-thumb"><img src="/images/thumb-bar.png"></div> </div> </div> </div> <div class="vpsdetails"> <div id="vpsprod"><span id="cpu"></span></div> <div id="vpsram"><span id="ram"></span></div> <div id="vpsstor"><span id="storage"></span> GB</div> <div id="vpsbw"><span id="bandwidth"></span> GB</div> <div id="slideprice">$ <span id="monthly-price"></span></div> </div> <input type="radio" name="ips" value="2" onclick="setIpPrice(this.value - 2 * 2);" checked="checked" /> 2 <input type="radio" name="ips" value="4" onclick="setIpPrice(this.value - 2 * 2);" /> 4

    Read the article

  • Optimized OCR black/white pixel algorithm

    - by eagle
    I am writing a simple OCR solution for a finite set of characters. That is, I know the exact way all 26 letters in the alphabet will look like. I am using C# and am able to easily determine if a given pixel should be treated as black or white. I am generating a matrix of black/white pixels for every single character. So for example, the letter I (capital i), might look like the following: 01110 00100 00100 00100 01110 Note: all points, which I use later in this post, assume that the top left pixel is (0, 0), bottom right pixel is (4, 4). 1's represent black pixels, and 0's represent white pixels. I would create a corresponding matrix in C# like this: CreateLetter("I", new List<List<bool>>() { new List<bool>() { false, true, true, true, false }, new List<bool>() { false, false, true, false, false }, new List<bool>() { false, false, true, false, false }, new List<bool>() { false, false, true, false, false }, new List<bool>() { false, true, true, true, false } }); I know I could probably optimize this part by using a multi-dimensional array instead, but let's ignore that for now, this is for illustrative purposes. Every letter is exactly the same dimensions, 10px by 11px (10px by 11px is the actual dimensions of a character in my real program. I simplified this to 5px by 5px in this posting since it is much easier to "draw" the letters using 0's and 1's on a smaller image). Now when I give it a 10px by 11px part of an image to analyze with OCR, it would need to run on every single letter (26) on every single pixel (10 * 11 = 110) which would mean 2,860 (26 * 110) iterations (in the worst case) for every single character. I was thinking this could be optimized by defining the unique characteristics of every character. So, for example, let's assume that the set of characters only consists of 5 distinct letters: I, A, O, B, and L. These might look like the following: 01110 00100 00100 01100 01000 00100 01010 01010 01010 01000 00100 01110 01010 01100 01000 00100 01010 01010 01010 01000 01110 01010 00100 01100 01110 After analyzing the unique characteristics of every character, I can significantly reduce the number of tests that need to be performed to test for a character. For example, for the "I" character, I could define it's unique characteristics as having a black pixel in the coordinate (3, 0) since no other characters have that pixel as black. So instead of testing 110 pixels for a match on the "I" character, I reduced it to a 1 pixel test. This is what it might look like for all these characters: var LetterI = new OcrLetter() { Name = "I", BlackPixels = new List<Point>() { new Point (3, 0) } } var LetterA = new OcrLetter() { Name = "A", WhitePixels = new List<Point>() { new Point(2, 4) } } var LetterO = new OcrLetter() { Name = "O", BlackPixels = new List<Point>() { new Point(3, 2) }, WhitePixels = new List<Point>() { new Point(2, 2) } } var LetterB = new OcrLetter() { Name = "B", BlackPixels = new List<Point>() { new Point(3, 1) }, WhitePixels = new List<Point>() { new Point(3, 2) } } var LetterL = new OcrLetter() { Name = "L", BlackPixels = new List<Point>() { new Point(1, 1), new Point(3, 4) }, WhitePixels = new List<Point>() { new Point(2, 2) } } This is challenging to do manually for 5 characters and gets much harder the greater the amount of letters that are added. You also want to guarantee that you have the minimum set of unique characteristics of a letter since you want it to be optimized as much as possible. I want to create an algorithm that will identify the unique characteristics of all the letters and would generate similar code to that above. I would then use this optimized black/white matrix to identify characters. How do I take the 26 letters that have all their black/white pixels filled in (e.g. the CreateLetter code block) and convert them to an optimized set of unique characteristics that define a letter (e.g. the new OcrLetter() code block)? And how would I guarantee that it is the most efficient definition set of unique characteristics (e.g. instead of defining 6 points as the unique characteristics, there might be a way to do it with 1 or 2 points, as the letter "I" in my example was able to). An alternative solution I've come up with is using a hash table, which will reduce it from 2,860 iterations to 110 iterations, a 26 time reduction. This is how it might work: I would populate it with data similar to the following: Letters["01110 00100 00100 00100 01110"] = "I"; Letters["00100 01010 01110 01010 01010"] = "A"; Letters["00100 01010 01010 01010 00100"] = "O"; Letters["01100 01010 01100 01010 01100"] = "B"; Now when I reach a location in the image to process, I convert it to a string such as: "01110 00100 00100 00100 01110" and simply find it in the hash table. This solution seems very simple, however, this still requires 110 iterations to generate this string for each letter. In big O notation, the algorithm is the same since O(110N) = O(2860N) = O(N) for N letters to process on the page. However, it is still improved by a constant factor of 26, a significant improvement (e.g. instead of it taking 26 minutes, it would take 1 minute). Update: Most of the solutions provided so far have not addressed the issue of identifying the unique characteristics of a character and rather provide alternative solutions. I am still looking for this solution which, as far as I can tell, is the only way to achieve the fastest OCR processing. I just came up with a partial solution: For each pixel, in the grid, store the letters that have it as a black pixel. Using these letters: I A O B L 01110 00100 00100 01100 01000 00100 01010 01010 01010 01000 00100 01110 01010 01100 01000 00100 01010 01010 01010 01000 01110 01010 00100 01100 01110 You would have something like this: CreatePixel(new Point(0, 0), new List<Char>() { }); CreatePixel(new Point(1, 0), new List<Char>() { 'I', 'B', 'L' }); CreatePixel(new Point(2, 0), new List<Char>() { 'I', 'A', 'O', 'B' }); CreatePixel(new Point(3, 0), new List<Char>() { 'I' }); CreatePixel(new Point(4, 0), new List<Char>() { }); CreatePixel(new Point(0, 1), new List<Char>() { }); CreatePixel(new Point(1, 1), new List<Char>() { 'A', 'B', 'L' }); CreatePixel(new Point(2, 1), new List<Char>() { 'I' }); CreatePixel(new Point(3, 1), new List<Char>() { 'A', 'O', 'B' }); // ... CreatePixel(new Point(2, 2), new List<Char>() { 'I', 'A', 'B' }); CreatePixel(new Point(3, 2), new List<Char>() { 'A', 'O' }); // ... CreatePixel(new Point(2, 4), new List<Char>() { 'I', 'O', 'B', 'L' }); CreatePixel(new Point(3, 4), new List<Char>() { 'I', 'A', 'L' }); CreatePixel(new Point(4, 4), new List<Char>() { }); Now for every letter, in order to find the unique characteristics, you need to look at which buckets it belongs to, as well as the amount of other characters in the bucket. So let's take the example of "I". We go to all the buckets it belongs to (1,0; 2,0; 3,0; ...; 3,4) and see that the one with the least amount of other characters is (3,0). In fact, it only has 1 character, meaning it must be an "I" in this case, and we found our unique characteristic. You can also do the same for pixels that would be white. Notice that bucket (2,0) contains all the letters except for "L", this means that it could be used as a white pixel test. Similarly, (2,4) doesn't contain an 'A'. Buckets that either contain all the letters or none of the letters can be discarded immediately, since these pixels can't help define a unique characteristic (e.g. 1,1; 4,0; 0,1; 4,4). It gets trickier when you don't have a 1 pixel test for a letter, for example in the case of 'O' and 'B'. Let's walk through the test for 'O'... It's contained in the following buckets: // Bucket Count Letters // 2,0 4 I, A, O, B // 3,1 3 A, O, B // 3,2 2 A, O // 2,4 4 I, O, B, L Additionally, we also have a few white pixel tests that can help: (I only listed those that are missing at most 2). The Missing Count was calculated as (5 - Bucket.Count). // Bucket Missing Count Missing Letters // 1,0 2 A, O // 1,1 2 I, O // 2,2 2 O, L // 3,4 2 O, B So now we can take the shortest black pixel bucket (3,2) and see that when we test for (3,2) we know it is either an 'A' or an 'O'. So we need an easy way to tell the difference between an 'A' and an 'O'. We could either look for a black pixel bucket that contains 'O' but not 'A' (e.g. 2,4) or a white pixel bucket that contains an 'O' but not an 'A' (e.g. 1,1). Either of these could be used in combination with the (3,2) pixel to uniquely identify the letter 'O' with only 2 tests. This seems like a simple algorithm when there are 5 characters, but how would I do this when there are 26 letters and a lot more pixels overlapping? For example, let's say that after the (3,2) pixel test, it found 10 different characters that contain the pixel (and this was the least from all the buckets). Now I need to find differences from 9 other characters instead of only 1 other character. How would I achieve my goal of getting the least amount of checks as possible, and ensure that I am not running extraneous tests?

    Read the article

  • Referencing movie clips from within an actionscript class

    - by Ant
    Hi all, I have been given the task of adding a scoring system to various flash games. This simply involves taking input, adding functionality such as pausing and replaying and then outputting the score, time left etc. at the end. I've so far successfully edited two games. Both these games used the "actions" code on frames. The latest game I'm trying to do uses an actionscript class which makes it both easier and harder. I'm not very adept at flash at all, but I've worked it out so far. I've added various movie clips that are to be used for displaying the pause screen background, buttons for replaying etc. I've been showing and hiding these using: back._visible = true; //movie clip, instance of back (back.png) I doubt it's best practice, but it's quick and has been working. However, now with the change of coding style to classes, this doesn't seem to work. I kinda understand why, but I'm now unsure how to hide/show these elements. Any help would be greatly appreciated :) I've attached the modified AS. class RivalOrbs extends MovieClip { var infinite_levels, orbs_start, orbs_inc, orbs_per_level, show_timer, _parent, one_time_per_level, speed_start, speed_inc_percent, max_speed, percent_starting_on_wrong_side, colorize, colors, secs_per_level; function RivalOrbs() { super(); mc = this; this.init(); } // End of the function function get_num_orbs() { if (infinite_levels) { return (orbs_start + (level - 1) * orbs_inc); } else if (level > orbs_per_level.length) { return (0); } else { return (orbs_per_level[level - 1]); } // end else if } // End of the function function get_timer_str(secs) { var _loc2 = Math.floor(secs / 60); var _loc1 = secs % 60; return ((_loc2 > 0 ? (_loc2) : ("0")) + ":" + (_loc1 >= 10 ? (_loc1) : ("0" + _loc1))); } // End of the function function frame() { //PLACE PAUSE CODE HERE if (!Key.isDown(80) and !Key.isDown(Key.ESCAPE)) { _root.offKey = true; } else if (Key.isDown(80) or Key.isDown(Key.ESCAPE)) { if (_root.offKey and _root.game_mode == "play") { _root.game_mode = "pause"; /* back._visible = true; btn_resume._visible = true; btn_exit._visible = true; txt_pause._visible = true; */ } else if (_root.offKey and _root.game_mode == "pause") { _root.game_mode = "play"; } _root.offKey = false; } if (_root.game_mode == "pause" or paused) { return; } else { /* back._visible = false; btn_resume._visible = false; btn_exit._visible = false; txt_pause._visible = false; */ } if (show_timer && total_secs != -1 || show_timer && _parent.timesup) { _loc7 = total_secs - Math.ceil((getTimer() - timer) / 1000); var diff = oldSeconds - (_loc7 + additional); if (diff > 1) additional = additional + diff; _loc7 = _loc7 + additional; oldSeconds = _loc7; trace(oldSeconds); mc.timer_field.text = this.get_timer_str(Math.max(0, _loc7)); if (_loc7 <= -1 || _parent.timesup) { if (one_time_per_level) { _root.gotoAndPlay("Lose"); } else { this.show_dialog(false); return; } // end if } // end if } // end else if var _loc9 = _root._xmouse; var _loc8 = _root._ymouse; var _loc6 = {x: _loc9, y: _loc8}; mc.globalToLocal(_loc6); _loc6.y = Math.max(-mc.bg._height / 2 + gap / 2, _loc6.y); _loc6.y = Math.min(mc.bg._height / 2 - gap / 2, _loc6.y); mc.wall1._y = _loc6.y - gap / 2 - mc.wall1._height / 2; mc.wall2._y = _loc6.y + gap / 2 + mc.wall1._height / 2; var _loc5 = true; for (var _loc4 = 0; _loc4 < this.get_num_orbs(); ++_loc4) { var _loc3 = mc.stage["orb" + _loc4]; _loc3.x_last = _loc3._x; _loc3.y_last = _loc3._y; _loc3._x = _loc3._x + _loc3.x_speed; _loc3._y = _loc3._y + _loc3.y_speed; if (_loc3._x < l_thresh) { _loc3.x_speed = _loc3.x_speed * -1; _loc3._x = l_thresh + (l_thresh - _loc3._x); _loc3.gotoAndPlay("hit"); } // end if if (_loc3._x > r_thresh) { _loc3.x_speed = _loc3.x_speed * -1; _loc3._x = r_thresh - (_loc3._x - r_thresh); _loc3.gotoAndPlay("hit"); } // end if if (_loc3._y < t_thresh) { _loc3.y_speed = _loc3.y_speed * -1; _loc3._y = t_thresh + (t_thresh - _loc3._y); _loc3.gotoAndPlay("hit"); } // end if if (_loc3._y > b_thresh) { _loc3.y_speed = _loc3.y_speed * -1; _loc3._y = b_thresh - (_loc3._y - b_thresh); _loc3.gotoAndPlay("hit"); } // end if if (_loc3.x_speed > 0) { if (_loc3._x >= m1_thresh && _loc3.x_last < m1_thresh || _loc3._x >= m1_thresh && _loc3._x <= m2_thresh) { if (_loc3._y <= mc.wall1._y + mc.wall1._height / 2 || _loc3._y >= mc.wall2._y - mc.wall2._height / 2) { _loc3.x_speed = _loc3.x_speed * -1; _loc3._x = m1_thresh - (_loc3._x - m1_thresh); _loc3.gotoAndPlay("hit"); } // end if } // end if } else if (_loc3._x <= m2_thresh && _loc3.x_last > m2_thresh || _loc3._x >= m1_thresh && _loc3._x <= m2_thresh) { if (_loc3._y <= mc.wall1._y + mc.wall1._height / 2 || _loc3._y >= mc.wall2._y - mc.wall2._height / 2) { _loc3.x_speed = _loc3.x_speed * -1; _loc3._x = m2_thresh + (m2_thresh - _loc3._x); _loc3.gotoAndPlay("hit"); } // end if } // end else if if (_loc3.side == 1 && _loc3._x > 0) { _loc5 = false; } // end if if (_loc3.side == 2 && _loc3._x < 0) { _loc5 = false; } // end if } // end of for if (_loc5) { this.end_level(); } // end if } // End of the function function colorize_hex(mc, hex) { var _loc4 = hex >> 16; var _loc5 = (hex ^ hex >> 16 << 16) >> 8; var _loc3 = hex >> 8 << 8 ^ hex; var _loc2 = new flash.geom.ColorTransform(0, 0, 0, 1, _loc4, _loc5, _loc3, 0); mc.transform.colorTransform = _loc2; } // End of the function function tint_hex(mc, hex, amount) { var _loc4 = hex >> 16; var _loc5 = hex >> 8 & 255; var _loc3 = hex & 255; this.tint(mc, _loc4, _loc5, _loc3, amount); } // End of the function function tint(mc, r, g, b, amount) { var _loc4 = 100 - amount; var _loc1 = new Object(); _loc1.ra = _loc1.ga = _loc1.ba = _loc4; var _loc2 = amount / 100; _loc1.rb = r * _loc2; _loc1.gb = g * _loc2; _loc1.bb = b * _loc2; var _loc3 = new Color(mc); _loc3.setTransform(_loc1); } // End of the function function get_num_levels() { if (infinite_levels) { return (Number.MAX_VALUE); } else { return (orbs_per_level.length); } // end else if } // End of the function function end_level() { _global.inputTimeAvailable = _global.inputTimeAvailable - (60 - oldSeconds); ++level; _parent.levelOver = true; if (level <= this.get_num_levels()) { this.show_dialog(true); } else { _root.gotoAndPlay("Win"); } // end else if } // End of the function function get_speed() { var _loc3 = speed_start; for (var _loc2 = 0; _loc2 < level - 1; ++_loc2) { _loc3 = _loc3 + _loc3 * (speed_inc_percent / 100); } // end of for return (Math.min(_loc3, Math.max(max_speed, speed_start))); } // End of the function function init_orbs() { var _loc6 = this.get_speed(); var _loc7 = Math.max(1, Math.ceil(this.get_num_orbs() * (percent_starting_on_wrong_side / 100))); for (var _loc3 = 0; _loc3 < this.get_num_orbs(); ++_loc3) { var _loc2 = null; if (_loc3 % 2 == 0) { _loc2 = mc.stage.attachMovie("Orb1", "orb" + _loc3, _loc3); _loc2.side = 1; if (colorize && color1 != -1) { this.colorize_hex(_loc2.orb.bg, color1); } // end if _loc2._x = Math.random() * (mc.bg._width * 4.000000E-001) - mc.bg._width * 2.000000E-001 - mc.bg._width / 4; } else { _loc2 = mc.stage.attachMovie("Orb2", "orb" + _loc3, _loc3); _loc2.side = 2; if (colorize && color2 != -1) { this.colorize_hex(_loc2.orb.bg, color2); } // end if _loc2._x = Math.random() * (mc.bg._width * 4.000000E-001) - mc.bg._width * 2.000000E-001 + mc.bg._width / 4; } // end else if _loc2._width = _loc2._height = orb_w; _loc2._y = Math.random() * (mc.bg._height * 8.000000E-001) - mc.bg._height * 4.000000E-001; if (_loc3 < _loc7) { _loc2._x = _loc2._x * -1; } // end if var _loc5 = Math.random() * 60; var _loc4 = _loc5 / 180 * 3.141593E+000; _loc2.x_speed = Math.cos(_loc4) * _loc6; _loc2.y_speed = Math.sin(_loc4) * _loc6; if (Math.random() >= 5.000000E-001) { _loc2.x_speed = _loc2.x_speed * -1; } // end if if (Math.random() >= 5.000000E-001) { _loc2.y_speed = _loc2.y_speed * -1; } // end if } // end of for } // End of the function function init_colors() { if (colorize && colors.length >= 2) { color1 = colors[Math.floor(Math.random() * colors.length)]; for (color2 = colors[Math.floor(Math.random() * colors.length)]; color2 == color1; color2 = colors[Math.floor(Math.random() * colors.length)]) { } // end of for this.tint_hex(mc.side1, color1, 40); this.tint_hex(mc.side2, color2, 40); } else { color1 = -1; color2 = -1; } // end else if } // End of the function function get_total_secs() { if (show_timer) { if (secs_per_level.length > 0) { if (level > secs_per_level.length) { return (secs_per_level[secs_per_level.length - 1]); } else { return (secs_per_level[level - 1]); } // end if } // end if } // end else if return (-1); } // End of the function function start_level() { trace ("start_level"); _parent.timesup = false; _parent.levelOver = false; _parent.times_up_comp.start_timer(); this.init_orbs(); mc.level_field.text = "LEVEL " + level; total_secs = _global.inputTimeAvailable; if (total_secs > 60) total_secs = 60; timer = getTimer(); paused = false; mc.dialog.gotoAndPlay("off"); } // End of the function function clear_orbs() { for (var _loc2 = 0; mc.stage["orb" + _loc2]; ++_loc2) { mc.stage["orb" + _loc2].removeMovieClip(); } // end of for } // End of the function function show_dialog(new_level) { mc.back._visible = false; trace("yes"); paused = true; if (new_level) { this.init_colors(); } // end if this.clear_orbs(); mc.dialog.gotoAndPlay("level"); if (!new_level || _parent.timesup) { mc.dialog.level_top.text = "Time\'s Up!"; /* dyn_line1.text = "Goodbye " + _global.inputName + "!"; dyn_line2.text = "You scored " + score; //buttons if (_global.inputTimeAvailable > 60) btn_replay._visible = true; btn_resume._visible = false; btn_exit._visible = false; txt_pause._visible = false; sendInfo = new LoadVars(); sendLoader = new LoadVars(); sendInfo.game_name = 'rival_orbs'; sendInfo.timeavailable = _global.inputTimeAvailable; if (sendInfo.timeavailable < 0) sendInfo.timeavailable = 0; sendInfo.id = _global.inputId; sendInfo.score = level*_global.inputFactor; sendInfo.directive = 'record'; //sendInfo.sendAndLoad('ncc1701e.aspx', sendLoader, "GET"); sendInfo.sendAndLoad('http://keyload.co.uk/output.php', sendLoader, "POST"); */ } else if (level > 1) { mc.dialog.level_top.text = "Next Level:"; } else { mc.dialog.level_top.text = ""; } // end else if mc.dialog.level_num.text = "LEVEL " + level; mc.dialog.level_mid.text = "Number of Orbs: " + this.get_num_orbs(); _root.max_level = level; var _this = this; mc.dialog.btn.onRelease = function () { _this.start_level(); }; } // End of the function function init() { var getInfo = new LoadVars(); var getLoader = new LoadVars(); getInfo.directive = "read"; getInfo.sendAndLoad('http://keyload.co.uk/input.php', getLoader, "GET"); getLoader.onLoad = function (success) { if (success) { _global.inputId = this.id; _global.inputTimeAvailable = this.timeavailable; _global.inputFactor = this.factor; _global.inputName = this.name; } else { trace("Failed"); } } _root.game_mode = "play"; /* back._visible = false; btn_exit._visible = false; btn_replay._visible = false; btn_resume._visible = false; txt_pause._visible = false; */ l_thresh = -mc.bg._width / 2 + orb_w / 2; t_thresh = -mc.bg._height / 2 + orb_w / 2; r_thresh = mc.bg._width / 2 - orb_w / 2; b_thresh = mc.bg._height / 2 - orb_w / 2; m1_thresh = -wall_w / 2 - orb_w / 2; m2_thresh = wall_w / 2 + orb_w / 2; this.show_dialog(true); mc.onEnterFrame = frame; } // End of the function var mc = null; var orb_w = 15; var wall_w = 2; var l_thresh = 0; var r_thresh = 0; var t_thresh = 0; var b_thresh = 0; var m1_thresh = 0; var m2_thresh = 0; var color1 = -1; var color2 = -1; var level = 1; var total_secs = 30; var gap = 60; var timer = 0; var additional = 0; var oldSeconds = 0; var paused = true; var _loc7 = 0; } // End of Class

    Read the article

  • Is there a Telecommunications Reference Architecture?

    - by raul.goycoolea
    @font-face { font-family: "Arial"; }@font-face { font-family: "Courier New"; }@font-face { font-family: "Wingdings"; }@font-face { font-family: "Cambria"; }p.MsoNormal, li.MsoNormal, div.MsoNormal { margin: 0cm 0cm 0.0001pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpFirst, li.MsoListParagraphCxSpFirst, div.MsoListParagraphCxSpFirst { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpMiddle, li.MsoListParagraphCxSpMiddle, div.MsoListParagraphCxSpMiddle { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpLast, li.MsoListParagraphCxSpLast, div.MsoListParagraphCxSpLast { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }div.Section1 { page: Section1; }ol { margin-bottom: 0cm; }ul { margin-bottom: 0cm; } Abstract   Reference architecture provides needed architectural information that can be provided in advance to an enterprise to enable consistent architectural best practices. Enterprise Reference Architecture helps business owners to actualize their strategies, vision, objectives, and principles. It evaluates the IT systems, based on Reference Architecture goals, principles, and standards. It helps to reduce IT costs by increasing functionality, availability, scalability, etc. Telecom Reference Architecture provides customers with the flexibility to view bundled service bills online with the provision of multiple services. It provides real-time, flexible billing and charging systems, to handle complex promotions, discounts, and settlements with multiple parties. This paper attempts to describe the Reference Architecture for the Telecom Enterprises. It lays the foundation for a Telecom Reference Architecture by articulating the requirements, drivers, and pitfalls for telecom service providers. It describes generic reference architecture for telecom enterprises and moves on to explain how to achieve Enterprise Reference Architecture by using SOA.   Introduction   A Reference Architecture provides a methodology, set of practices, template, and standards based on a set of successful solutions implemented earlier. These solutions have been generalized and structured for the depiction of both a logical and a physical architecture, based on the harvesting of a set of patterns that describe observations in a number of successful implementations. It helps as a reference for the various architectures that an enterprise can implement to solve various problems. It can be used as the starting point or the point of comparisons for various departments/business entities of a company, or for the various companies for an enterprise. It provides multiple views for multiple stakeholders.   Major artifacts of the Enterprise Reference Architecture are methodologies, standards, metadata, documents, design patterns, etc.   Purpose of Reference Architecture   In most cases, architects spend a lot of time researching, investigating, defining, and re-arguing architectural decisions. It is like reinventing the wheel as their peers in other organizations or even the same organization have already spent a lot of time and effort defining their own architectural practices. This prevents an organization from learning from its own experiences and applying that knowledge for increased effectiveness.   Reference architecture provides missing architectural information that can be provided in advance to project team members to enable consistent architectural best practices.   Enterprise Reference Architecture helps an enterprise to achieve the following at the abstract level:   ·       Reference architecture is more of a communication channel to an enterprise ·       Helps the business owners to accommodate to their strategies, vision, objectives, and principles. ·       Evaluates the IT systems based on Reference Architecture Principles ·       Reduces IT spending through increasing functionality, availability, scalability, etc ·       A Real-time Integration Model helps to reduce the latency of the data updates Is used to define a single source of Information ·       Provides a clear view on how to manage information and security ·       Defines the policy around the data ownership, product boundaries, etc. ·       Helps with cost optimization across project and solution portfolios by eliminating unused or duplicate investments and assets ·       Has a shorter implementation time and cost   Once the reference architecture is in place, the set of architectural principles, standards, reference models, and best practices ensure that the aligned investments have the greatest possible likelihood of success in both the near term and the long term (TCO).     Common pitfalls for Telecom Service Providers   Telecom Reference Architecture serves as the first step towards maturity for a telecom service provider. During the course of our assignments/experiences with telecom players, we have come across the following observations – Some of these indicate a lack of maturity of the telecom service provider:   ·       In markets that are growing and not so mature, it has been observed that telcos have a significant amount of in-house or home-grown applications. In some of these markets, the growth has been so rapid that IT has been unable to cope with business demands. Telcos have shown a tendency to come up with workarounds in their IT applications so as to meet business needs. ·       Even for core functions like provisioning or mediation, some telcos have tried to manage with home-grown applications. ·       Most of the applications do not have the required scalability or maintainability to sustain growth in volumes or functionality. ·       Applications face interoperability issues with other applications in the operator's landscape. Integrating a new application or network element requires considerable effort on the part of the other applications. ·       Application boundaries are not clear, and functionality that is not in the initial scope of that application gets pushed onto it. This results in the development of the multiple, small applications without proper boundaries. ·       Usage of Legacy OSS/BSS systems, poor Integration across Multiple COTS Products and Internal Systems. Most of the Integrations are developed on ad-hoc basis and Point-to-Point Integration. ·       Redundancy of the business functions in different applications • Fragmented data across the different applications and no integrated view of the strategic data • Lot of performance Issues due to the usage of the complex integration across OSS and BSS systems   However, this is where the maturity of the telecom industry as a whole can be of help. The collaborative efforts of telcos to overcome some of these problems have resulted in bodies like the TM Forum. They have come up with frameworks for business processes, data, applications, and technology for telecom service providers. These could be a good starting point for telcos to clean up their enterprise landscape.   Industry Trends in Telecom Reference Architecture   Telecom reference architectures are evolving rapidly because telcos are facing business and IT challenges.   “The reality is that there probably is no killer application, no silver bullet that the telcos can latch onto to carry them into a 21st Century.... Instead, there are probably hundreds – perhaps thousands – of niche applications.... And the only way to find which of these works for you is to try out lots of them, ramp up the ones that work, and discontinue the ones that fail.” – Martin Creaner President & CTO TM Forum.   The following trends have been observed in telecom reference architecture:   ·       Transformation of business structures to align with customer requirements ·       Adoption of more Internet-like technical architectures. The Web 2.0 concept is increasingly being used. ·       Virtualization of the traditional operations support system (OSS) ·       Adoption of SOA to support development of IP-based services ·       Adoption of frameworks like Service Delivery Platforms (SDPs) and IP Multimedia Subsystem ·       (IMS) to enable seamless deployment of various services over fixed and mobile networks ·       Replacement of in-house, customized, and stove-piped OSS/BSS with standards-based COTS products ·       Compliance with industry standards and frameworks like eTOM, SID, and TAM to enable seamless integration with other standards-based products   Drivers of Reference Architecture   The drivers of the Reference Architecture are Reference Architecture Goals, Principles, and Enterprise Vision and Telecom Transformation. The details are depicted below diagram. @font-face { font-family: "Cambria"; }p.MsoNormal, li.MsoNormal, div.MsoNormal { margin: 0cm 0cm 0.0001pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoCaption, li.MsoCaption, div.MsoCaption { margin: 0cm 0cm 10pt; font-size: 9pt; font-family: "Times New Roman"; color: rgb(79, 129, 189); font-weight: bold; }div.Section1 { page: Section1; } Figure 1. Drivers for Reference Architecture @font-face { font-family: "Arial"; }@font-face { font-family: "Courier New"; }@font-face { font-family: "Wingdings"; }@font-face { font-family: "Cambria"; }p.MsoNormal, li.MsoNormal, div.MsoNormal { margin: 0cm 0cm 0.0001pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpFirst, li.MsoListParagraphCxSpFirst, div.MsoListParagraphCxSpFirst { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpMiddle, li.MsoListParagraphCxSpMiddle, div.MsoListParagraphCxSpMiddle { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpLast, li.MsoListParagraphCxSpLast, div.MsoListParagraphCxSpLast { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }div.Section1 { page: Section1; }ol { margin-bottom: 0cm; }ul { margin-bottom: 0cm; } Today’s telecom reference architectures should seamlessly integrate traditional legacy-based applications and transition to next-generation network technologies (e.g., IP multimedia subsystems). This has resulted in new requirements for flexible, real-time billing and OSS/BSS systems and implications on the service provider’s organizational requirements and structure.   Telecom reference architectures are today expected to:   ·       Integrate voice, messaging, email and other VAS over fixed and mobile networks, back end systems ·       Be able to provision multiple services and service bundles • Deliver converged voice, video and data services ·       Leverage the existing Network Infrastructure ·       Provide real-time, flexible billing and charging systems to handle complex promotions, discounts, and settlements with multiple parties. ·       Support charging of advanced data services such as VoIP, On-Demand, Services (e.g.  Video), IMS/SIP Services, Mobile Money, Content Services and IPTV. ·       Help in faster deployment of new services • Serve as an effective platform for collaboration between network IT and business organizations ·       Harness the potential of converging technology, networks, devices and content to develop multimedia services and solutions of ever-increasing sophistication on a single Internet Protocol (IP) ·       Ensure better service delivery and zero revenue leakage through real-time balance and credit management ·       Lower operating costs to drive profitability   Enterprise Reference Architecture   The Enterprise Reference Architecture (RA) fills the gap between the concepts and vocabulary defined by the reference model and the implementation. Reference architecture provides detailed architectural information in a common format such that solutions can be repeatedly designed and deployed in a consistent, high-quality, supportable fashion. This paper attempts to describe the Reference Architecture for the Telecom Application Usage and how to achieve the Enterprise Level Reference Architecture using SOA.   • Telecom Reference Architecture • Enterprise SOA based Reference Architecture   Telecom Reference Architecture   Tele Management Forum’s New Generation Operations Systems and Software (NGOSS) is an architectural framework for organizing, integrating, and implementing telecom systems. NGOSS is a component-based framework consisting of the following elements:   ·       The enhanced Telecom Operations Map (eTOM) is a business process framework. ·       The Shared Information Data (SID) model provides a comprehensive information framework that may be specialized for the needs of a particular organization. ·       The Telecom Application Map (TAM) is an application framework to depict the functional footprint of applications, relative to the horizontal processes within eTOM. ·       The Technology Neutral Architecture (TNA) is an integrated framework. TNA is an architecture that is sustainable through technology changes.   NGOSS Architecture Standards are:   ·       Centralized data ·       Loosely coupled distributed systems ·       Application components/re-use  ·       A technology-neutral system framework with technology specific implementations ·       Interoperability to service provider data/processes ·       Allows more re-use of business components across multiple business scenarios ·       Workflow automation   The traditional operator systems architecture consists of four layers,   ·       Business Support System (BSS) layer, with focus toward customers and business partners. Manages order, subscriber, pricing, rating, and billing information. ·       Operations Support System (OSS) layer, built around product, service, and resource inventories. ·       Networks layer – consists of Network elements and 3rd Party Systems. ·       Integration Layer – to maximize application communication and overall solution flexibility.   Reference architecture for telecom enterprises is depicted below. @font-face { font-family: "Arial"; }@font-face { font-family: "Courier New"; }@font-face { font-family: "Wingdings"; }@font-face { font-family: "Cambria"; }p.MsoNormal, li.MsoNormal, div.MsoNormal { margin: 0cm 0cm 0.0001pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoCaption, li.MsoCaption, div.MsoCaption { margin: 0cm 0cm 10pt; font-size: 9pt; font-family: "Times New Roman"; color: rgb(79, 129, 189); font-weight: bold; }p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpFirst, li.MsoListParagraphCxSpFirst, div.MsoListParagraphCxSpFirst { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpMiddle, li.MsoListParagraphCxSpMiddle, div.MsoListParagraphCxSpMiddle { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpLast, li.MsoListParagraphCxSpLast, div.MsoListParagraphCxSpLast { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }div.Section1 { page: Section1; }ol { margin-bottom: 0cm; }ul { margin-bottom: 0cm; } Figure 2. Telecom Reference Architecture   The major building blocks of any Telecom Service Provider architecture are as follows:   1. Customer Relationship Management   CRM encompasses the end-to-end lifecycle of the customer: customer initiation/acquisition, sales, ordering, and service activation, customer care and support, proactive campaigns, cross sell/up sell, and retention/loyalty.   CRM also includes the collection of customer information and its application to personalize, customize, and integrate delivery of service to a customer, as well as to identify opportunities for increasing the value of the customer to the enterprise.   The key functionalities related to Customer Relationship Management are   ·       Manage the end-to-end lifecycle of a customer request for products. ·       Create and manage customer profiles. ·       Manage all interactions with customers – inquiries, requests, and responses. ·       Provide updates to Billing and other south bound systems on customer/account related updates such as customer/ account creation, deletion, modification, request bills, final bill, duplicate bills, credit limits through Middleware. ·       Work with Order Management System, Product, and Service Management components within CRM. ·       Manage customer preferences – Involve all the touch points and channels to the customer, including contact center, retail stores, dealers, self service, and field service, as well as via any media (phone, face to face, web, mobile device, chat, email, SMS, mail, the customer's bill, etc.). ·       Support single interface for customer contact details, preferences, account details, offers, customer premise equipment, bill details, bill cycle details, and customer interactions.   CRM applications interact with customers through customer touch points like portals, point-of-sale terminals, interactive voice response systems, etc. The requests by customers are sent via fulfillment/provisioning to billing system for ordering processing.   2. Billing and Revenue Management   Billing and Revenue Management handles the collection of appropriate usage records and production of timely and accurate bills – for providing pre-bill usage information and billing to customers; for processing their payments; and for performing payment collections. In addition, it handles customer inquiries about bills, provides billing inquiry status, and is responsible for resolving billing problems to the customer's satisfaction in a timely manner. This process grouping also supports prepayment for services.   The key functionalities provided by these applications are   ·       To ensure that enterprise revenue is billed and invoices delivered appropriately to customers. ·       To manage customers’ billing accounts, process their payments, perform payment collections, and monitor the status of the account balance. ·       To ensure the timely and effective fulfillment of all customer bill inquiries and complaints. ·       Collect the usage records from mediation and ensure appropriate rating and discounting of all usage and pricing. ·       Support revenue sharing; split charging where usage is guided to an account different from the service consumer. ·       Support prepaid and post-paid rating. ·       Send notification on approach / exceeding the usage thresholds as enforced by the subscribed offer, and / or as setup by the customer. ·       Support prepaid, post paid, and hybrid (where some services are prepaid and the rest of the services post paid) customers and conversion from post paid to prepaid, and vice versa. ·       Support different billing function requirements like charge prorating, promotion, discount, adjustment, waiver, write-off, account receivable, GL Interface, late payment fee, credit control, dunning, account or service suspension, re-activation, expiry, termination, contract violation penalty, etc. ·       Initiate direct debit to collect payment against an invoice outstanding. ·       Send notification to Middleware on different events; for example, payment receipt, pre-suspension, threshold exceed, etc.   Billing systems typically get usage data from mediation systems for rating and billing. They get provisioning requests from order management systems and inquiries from CRM systems. Convergent and real-time billing systems can directly get usage details from network elements.   3. Mediation   Mediation systems transform/translate the Raw or Native Usage Data Records into a general format that is acceptable to billing for their rating purposes.   The following lists the high-level roles and responsibilities executed by the Mediation system in the end-to-end solution.   ·       Collect Usage Data Records from different data sources – like network elements, routers, servers – via different protocol and interfaces. ·       Process Usage Data Records – Mediation will process Usage Data Records as per the source format. ·       Validate Usage Data Records from each source. ·       Segregates Usage Data Records coming from each source to multiple, based on the segregation requirement of end Application. ·       Aggregates Usage Data Records based on the aggregation rule if any from different sources. ·       Consolidates multiple Usage Data Records from each source. ·       Delivers formatted Usage Data Records to different end application like Billing, Interconnect, Fraud Management, etc. ·       Generates audit trail for incoming Usage Data Records and keeps track of all the Usage Data Records at various stages of mediation process. ·       Checks duplicate Usage Data Records across files for a given time window.   4. Fulfillment   This area is responsible for providing customers with their requested products in a timely and correct manner. It translates the customer's business or personal need into a solution that can be delivered using the specific products in the enterprise's portfolio. This process informs the customers of the status of their purchase order, and ensures completion on time, as well as ensuring a delighted customer. These processes are responsible for accepting and issuing orders. They deal with pre-order feasibility determination, credit authorization, order issuance, order status and tracking, customer update on customer order activities, and customer notification on order completion. Order management and provisioning applications fall into this category.   The key functionalities provided by these applications are   ·       Issuing new customer orders, modifying open customer orders, or canceling open customer orders; ·       Verifying whether specific non-standard offerings sought by customers are feasible and supportable; ·       Checking the credit worthiness of customers as part of the customer order process; ·       Testing the completed offering to ensure it is working correctly; ·       Updating of the Customer Inventory Database to reflect that the specific product offering has been allocated, modified, or cancelled; ·       Assigning and tracking customer provisioning activities; ·       Managing customer provisioning jeopardy conditions; and ·       Reporting progress on customer orders and other processes to customer.   These applications typically get orders from CRM systems. They interact with network elements and billing systems for fulfillment of orders.   5. Enterprise Management   This process area includes those processes that manage enterprise-wide activities and needs, or have application within the enterprise as a whole. They encompass all business management processes that   ·       Are necessary to support the whole of the enterprise, including processes for financial management, legal management, regulatory management, process, cost, and quality management, etc.;   ·       Are responsible for setting corporate policies, strategies, and directions, and for providing guidelines and targets for the whole of the business, including strategy development and planning for areas, such as Enterprise Architecture, that are integral to the direction and development of the business;   ·       Occur throughout the enterprise, including processes for project management, performance assessments, cost assessments, etc.     (i) Enterprise Risk Management:   Enterprise Risk Management focuses on assuring that risks and threats to the enterprise value and/or reputation are identified, and appropriate controls are in place to minimize or eliminate the identified risks. The identified risks may be physical or logical/virtual. Successful risk management ensures that the enterprise can support its mission critical operations, processes, applications, and communications in the face of serious incidents such as security threats/violations and fraud attempts. Two key areas covered in Risk Management by telecom operators are:   ·       Revenue Assurance: Revenue assurance system will be responsible for identifying revenue loss scenarios across components/systems, and will help in rectifying the problems. The following lists the high-level roles and responsibilities executed by the Revenue Assurance system in the end-to-end solution. o   Identify all usage information dropped when networks are being upgraded. o   Interconnect bill verification. o   Identify where services are routinely provisioned but never billed. o   Identify poor sales policies that are intensifying collections problems. o   Find leakage where usage is sent to error bucket and never billed for. o   Find leakage where field service, CRM, and network build-out are not optimized.   ·       Fraud Management: Involves collecting data from different systems to identify abnormalities in traffic patterns, usage patterns, and subscription patterns to report suspicious activity that might suggest fraudulent usage of resources, resulting in revenue losses to the operator.   The key roles and responsibilities of the system component are as follows:   o   Fraud management system will capture and monitor high usage (over a certain threshold) in terms of duration, value, and number of calls for each subscriber. The threshold for each subscriber is decided by the system and fixed automatically. o   Fraud management will be able to detect the unauthorized access to services for certain subscribers. These subscribers may have been provided unauthorized services by employees. The component will raise the alert to the operator the very first time of such illegal calls or calls which are not billed. o   The solution will be to have an alarm management system that will deliver alarms to the operator/provider whenever it detects a fraud, thus minimizing fraud by catching it the first time it occurs. o   The Fraud Management system will be capable of interfacing with switches, mediation systems, and billing systems   (ii) Knowledge Management   This process focuses on knowledge management, technology research within the enterprise, and the evaluation of potential technology acquisitions.   Key responsibilities of knowledge base management are to   ·       Maintain knowledge base – Creation and updating of knowledge base on ongoing basis. ·       Search knowledge base – Search of knowledge base on keywords or category browse ·       Maintain metadata – Management of metadata on knowledge base to ensure effective management and search. ·       Run report generator. ·       Provide content – Add content to the knowledge base, e.g., user guides, operational manual, etc.   (iii) Document Management   It focuses on maintaining a repository of all electronic documents or images of paper documents relevant to the enterprise using a system.   (iv) Data Management   It manages data as a valuable resource for any enterprise. For telecom enterprises, the typical areas covered are Master Data Management, Data Warehousing, and Business Intelligence. It is also responsible for data governance, security, quality, and database management.   Key responsibilities of Data Management are   ·       Using ETL, extract the data from CRM, Billing, web content, ERP, campaign management, financial, network operations, asset management info, customer contact data, customer measures, benchmarks, process data, e.g., process inputs, outputs, and measures, into Enterprise Data Warehouse. ·       Management of data traceability with source, data related business rules/decisions, data quality, data cleansing data reconciliation, competitors data – storage for all the enterprise data (customer profiles, products, offers, revenues, etc.) ·       Get online update through night time replication or physical backup process at regular frequency. ·       Provide the data access to business intelligence and other systems for their analysis, report generation, and use.   (v) Business Intelligence   It uses the Enterprise Data to provide the various analysis and reports that contain prospects and analytics for customer retention, acquisition of new customers due to the offers, and SLAs. It will generate right and optimized plans – bolt-ons for the customers.   The following lists the high-level roles and responsibilities executed by the Business Intelligence system at the Enterprise Level:   ·       It will do Pattern analysis and reports problem. ·       It will do Data Analysis – Statistical analysis, data profiling, affinity analysis of data, customer segment wise usage patterns on offers, products, service and revenue generation against services and customer segments. ·       It will do Performance (business, system, and forecast) analysis, churn propensity, response time, and SLAs analysis. ·       It will support for online and offline analysis, and report drill down capability. ·       It will collect, store, and report various SLA data. ·       It will provide the necessary intelligence for marketing and working on campaigns, etc., with cost benefit analysis and predictions.   It will advise on customer promotions with additional services based on loyalty and credit history of customer   ·       It will Interface with Enterprise Data Management system for data to run reports and analysis tasks. It will interface with the campaign schedules, based on historical success evidence.   (vi) Stakeholder and External Relations Management   It manages the enterprise's relationship with stakeholders and outside entities. Stakeholders include shareholders, employee organizations, etc. Outside entities include regulators, local community, and unions. Some of the processes within this grouping are Shareholder Relations, External Affairs, Labor Relations, and Public Relations.   (vii) Enterprise Resource Planning   It is used to manage internal and external resources, including tangible assets, financial resources, materials, and human resources. Its purpose is to facilitate the flow of information between all business functions inside the boundaries of the enterprise and manage the connections to outside stakeholders. ERP systems consolidate all business operations into a uniform and enterprise wide system environment.   The key roles and responsibilities for Enterprise System are given below:   ·        It will handle responsibilities such as core accounting, financial, and management reporting. ·       It will interface with CRM for capturing customer account and details. ·       It will interface with billing to capture the billing revenue and other financial data. ·       It will be responsible for executing the dunning process. Billing will send the required feed to ERP for execution of dunning. ·       It will interface with the CRM and Billing through batch interfaces. Enterprise management systems are like horizontals in the enterprise and typically interact with all major telecom systems. E.g., an ERP system interacts with CRM, Fulfillment, and Billing systems for different kinds of data exchanges.   6. External Interfaces/Touch Points   The typical external parties are customers, suppliers/partners, employees, shareholders, and other stakeholders. External interactions from/to a Service Provider to other parties can be achieved by a variety of mechanisms, including:   ·       Exchange of emails or faxes ·       Call Centers ·       Web Portals ·       Business-to-Business (B2B) automated transactions   These applications provide an Internet technology driven interface to external parties to undertake a variety of business functions directly for themselves. These can provide fully or partially automated service to external parties through various touch points.   Typical characteristics of these touch points are   ·       Pre-integrated self-service system, including stand-alone web framework or integration front end with a portal engine ·       Self services layer exposing atomic web services/APIs for reuse by multiple systems across the architectural environment ·       Portlets driven connectivity exposing data and services interoperability through a portal engine or web application   These touch points mostly interact with the CRM systems for requests, inquiries, and responses.   7. Middleware   The component will be primarily responsible for integrating the different systems components under a common platform. It should provide a Standards-Based Platform for building Service Oriented Architecture and Composite Applications. The following lists the high-level roles and responsibilities executed by the Middleware component in the end-to-end solution.   ·       As an integration framework, covering to and fro interfaces ·       Provide a web service framework with service registry. ·       Support SOA framework with SOA service registry. ·       Each of the interfaces from / to Middleware to other components would handle data transformation, translation, and mapping of data points. ·       Receive data from the caller / activate and/or forward the data to the recipient system in XML format. ·       Use standard XML for data exchange. ·       Provide the response back to the service/call initiator. ·       Provide a tracking until the response completion. ·       Keep a store transitional data against each call/transaction. ·       Interface through Middleware to get any information that is possible and allowed from the existing systems to enterprise systems; e.g., customer profile and customer history, etc. ·       Provide the data in a common unified format to the SOA calls across systems, and follow the Enterprise Architecture directive. ·       Provide an audit trail for all transactions being handled by the component.   8. Network Elements   The term Network Element means a facility or equipment used in the provision of a telecommunications service. Such terms also includes features, functions, and capabilities that are provided by means of such facility or equipment, including subscriber numbers, databases, signaling systems, and information sufficient for billing and collection or used in the transmission, routing, or other provision of a telecommunications service.   Typical network elements in a GSM network are Home Location Register (HLR), Intelligent Network (IN), Mobile Switching Center (MSC), SMS Center (SMSC), and network elements for other value added services like Push-to-talk (PTT), Ring Back Tone (RBT), etc.   Network elements are invoked when subscribers use their telecom devices for any kind of usage. These elements generate usage data and pass it on to downstream systems like mediation and billing system for rating and billing. They also integrate with provisioning systems for order/service fulfillment.   9. 3rd Party Applications   3rd Party systems are applications like content providers, payment gateways, point of sale terminals, and databases/applications maintained by the Government.   Depending on applicability and the type of functionality provided by 3rd party applications, the integration with different telecom systems like CRM, provisioning, and billing will be done.   10. Service Delivery Platform   A service delivery platform (SDP) provides the architecture for the rapid deployment, provisioning, execution, management, and billing of value added telecom services. SDPs are based on the concept of SOA and layered architecture. They support the delivery of voice, data services, and content in network and device-independent fashion. They allow application developers to aggregate network capabilities, services, and sources of content. SDPs typically contain layers for web services exposure, service application development, and network abstraction.   SOA Reference Architecture   SOA concept is based on the principle of developing reusable business service and building applications by composing those services, instead of building monolithic applications in silos. It’s about bridging the gap between business and IT through a set of business-aligned IT services, using a set of design principles, patterns, and techniques.   In an SOA, resources are made available to participants in a value net, enterprise, line of business (typically spanning multiple applications within an enterprise or across multiple enterprises). It consists of a set of business-aligned IT services that collectively fulfill an organization’s business processes and goals. We can choreograph these services into composite applications and invoke them through standard protocols. SOA, apart from agility and reusability, enables:   ·       The business to specify processes as orchestrations of reusable services ·       Technology agnostic business design, with technology hidden behind service interface ·       A contractual-like interaction between business and IT, based on service SLAs ·       Accountability and governance, better aligned to business services ·       Applications interconnections untangling by allowing access only through service interfaces, reducing the daunting side effects of change ·       Reduced pressure to replace legacy and extended lifetime for legacy applications, through encapsulation in services   ·       A Cloud Computing paradigm, using web services technologies, that makes possible service outsourcing on an on-demand, utility-like, pay-per-usage basis   The following section represents the Reference Architecture of logical view for the Telecom Solution. The new custom built application needs to align with this logical architecture in the long run to achieve EA benefits.   Packaged implementation applications, such as ERP billing applications, need to expose their functions as service providers (as other applications consume) and interact with other applications as service consumers.   COT applications need to expose services through wrappers such as adapters to utilize existing resources and at the same time achieve Enterprise Architecture goal and objectives.   The following are the various layers for Enterprise level deployment of SOA. This diagram captures the abstract view of Enterprise SOA layers and important components of each layer. Layered architecture means decomposition of services such that most interactions occur between adjacent layers. However, there is no strict rule that top layers should not directly communicate with bottom layers.   The diagram below represents the important logical pieces that would result from overall SOA transformation. @font-face { font-family: "Arial"; }@font-face { font-family: "Courier New"; }@font-face { font-family: "Wingdings"; }@font-face { font-family: "Cambria"; }p.MsoNormal, li.MsoNormal, div.MsoNormal { margin: 0cm 0cm 0.0001pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoCaption, li.MsoCaption, div.MsoCaption { margin: 0cm 0cm 10pt; font-size: 9pt; font-family: "Times New Roman"; color: rgb(79, 129, 189); font-weight: bold; }p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpFirst, li.MsoListParagraphCxSpFirst, div.MsoListParagraphCxSpFirst { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpMiddle, li.MsoListParagraphCxSpMiddle, div.MsoListParagraphCxSpMiddle { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpLast, li.MsoListParagraphCxSpLast, div.MsoListParagraphCxSpLast { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }div.Section1 { page: Section1; }ol { margin-bottom: 0cm; }ul { margin-bottom: 0cm; } Figure 3. Enterprise SOA Reference Architecture 1.          Operational System Layer: This layer consists of all packaged applications like CRM, ERP, custom built applications, COTS based applications like Billing, Revenue Management, Fulfilment, and the Enterprise databases that are essential and contribute directly or indirectly to the Enterprise OSS/BSS Transformation.   ERP holds the data of Asset Lifecycle Management, Supply Chain, and Advanced Procurement and Human Capital Management, etc.   CRM holds the data related to Order, Sales, and Marketing, Customer Care, Partner Relationship Management, Loyalty, etc.   Content Management handles Enterprise Search and Query. Billing application consists of the following components:   ·       Collections Management, Customer Billing Management, Invoices, Real-Time Rating, Discounting, and Applying of Charges ·       Enterprise databases will hold both the application and service data, whether structured or unstructured.   MDM - Master data majorly consists of Customer, Order, Product, and Service Data.     2.          Enterprise Component Layer:   This layer consists of the Application Services and Common Services that are responsible for realizing the functionality and maintaining the QoS of the exposed services. This layer uses container-based technologies such as application servers to implement the components, workload management, high availability, and load balancing.   Application Services: This Service Layer enables application, technology, and database abstraction so that the complex accessing logic is hidden from the other service layers. This is a basic service layer, which exposes application functionalities and data as reusable services. The three types of the Application access services are:   ·       Application Access Service: This Service Layer exposes application level functionalities as a reusable service between BSS to BSS and BSS to OSS integration. This layer is enabled using disparate technology such as Web Service, Integration Servers, and Adaptors, etc.   ·       Data Access Service: This Service Layer exposes application data services as a reusable reference data service. This is done via direct interaction with application data. and provides the federated query.   ·       Network Access Service: This Service Layer exposes provisioning layer as a reusable service from OSS to OSS integration. This integration service emphasizes the need for high performance, stateless process flows, and distributed design.   Common Services encompasses management of structured, semi-structured, and unstructured data such as information services, portal services, interaction services, infrastructure services, and security services, etc.   3.          Integration Layer:   This consists of service infrastructure components like service bus, service gateway for partner integration, service registry, service repository, and BPEL processor. Service bus will carry the service invocation payloads/messages between consumers and providers. The other important functions expected from it are itinerary based routing, distributed caching of routing information, transformations, and all qualities of service for messaging-like reliability, scalability, and availability, etc. Service registry will hold all contracts (wsdl) of services, and it helps developers to locate or discover service during design time or runtime.   • BPEL processor would be useful in orchestrating the services to compose a complex business scenario or process. • Workflow and business rules management are also required to support manual triggering of certain activities within business process. based on the rules setup and also the state machine information. Application, data, and service mediation layer typically forms the overall composite application development framework or SOA Framework.   4.          Business Process Layer: These are typically the intermediate services layer and represent Shared Business Process Services. At Enterprise Level, these services are from Customer Management, Order Management, Billing, Finance, and Asset Management application domains.   5.          Access Layer: This layer consists of portals for Enterprise and provides a single view of Enterprise information management and dashboard services.   6.          Channel Layer: This consists of various devices; applications that form part of extended enterprise; browsers through which users access the applications.   7.          Client Layer: This designates the different types of users accessing the enterprise applications. The type of user typically would be an important factor in determining the level of access to applications.   8.          Vertical pieces like management, monitoring, security, and development cut across all horizontal layers Management and monitoring involves all aspects of SOA-like services, SLAs, and other QoS lifecycle processes for both applications and services surrounding SOA governance.     9.          EA Governance, Reference Architecture, Roadmap, Principles, and Best Practices:   EA Governance is important in terms of providing the overall direction to SOA implementation within the enterprise. This involves board-level involvement, in addition to business and IT executives. At a high level, this involves managing the SOA projects implementation, managing SOA infrastructure, and controlling the entire effort through all fine-tuned IT processes in accordance with COBIT (Control Objectives for Information Technology).   Devising tools and techniques to promote reuse culture, and the SOA way of doing things needs competency centers to be established in addition to training the workforce to take up new roles that are suited to SOA journey.   Conclusions   Reference Architectures can serve as the basis for disparate architecture efforts throughout the organization, even if they use different tools and technologies. Reference architectures provide best practices and approaches in the independent way a vendor deals with technology and standards. Reference Architectures model the abstract architectural elements for an enterprise independent of the technologies, protocols, and products that are used to implement an SOA. Telecom enterprises today are facing significant business and technology challenges due to growing competition, a multitude of services, and convergence. Adopting architectural best practices could go a long way in meeting these challenges. The use of SOA-based architecture for communication to each of the external systems like Billing, CRM, etc., in OSS/BSS system has made the architecture very loosely coupled, with greater flexibility. Any change in the external systems would be absorbed at the Integration Layer without affecting the rest of the ecosystem. The use of a Business Process Management (BPM) tool makes the management and maintenance of the business processes easy, with better performance in terms of lead time, quality, and cost. Since the Architecture is based on standards, it will lower the cost of deploying and managing OSS/BSS applications over their lifecycles.

    Read the article

< Previous Page | 49 50 51 52 53