Search Results

Search found 2896 results on 116 pages for 'comparison operators'.

Page 105/116 | < Previous Page | 101 102 103 104 105 106 107 108 109 110 111 112 | Next Page >

IBM "per core" comparisons for SPECjEnterprise2010

- by jhenning

I recently stumbled upon a blog entry from Roman Kharkovski (an IBM employee) comparing some SPECjEnterprise2010 results for IBM vs. Oracle. Mr. Kharkovski's blog claims that SPARC delivers half the transactions per core vs. POWER7. Prior to any argument, I should say that my predisposition is to like Mr. Kharkovski, because he says that his blog is intended to be factual; that the intent is to try to avoid marketing hype and FUD tactic; and mostly because he features a picture of himself wearing a bike helmet (me too). Therefore, in a spirit of technical argument, rather than FUD fight, there are a few areas in his comparison that should be discussed. Scaling is not free For any benchmark, if a small system scores 13k using quantity R1 of some resource, and a big system scores 57k using quantity R2 of that resource, then, sure, it's tempting to divide: is 13k/R1 > 57k/R2 ? It is tempting, but not necessarily educational. The problem is that scaling is not free. Building big systems is harder than building small systems. Scoring 13k/R1 on a little system provides no guarantee whatsoever that one can sustain that ratio when attempting to handle more than 4 times as many users. Choosing the denominator radically changes the picture When ratios are used, one can vastly manipulate appearances by the choice of denominator. In this case, lots of choices are available for the resource to be compared (R1 and R2 above). IBM chooses to put cores in the denominator. Mr. Kharkovski provides some reasons for that choice in his blog entry. And yet, it should be noted that the very concept of a core is: arbitrary: not necessarily comparable across vendors; fluid: modern chips shift chip resources in response to load; and invisible: unless you have a microscope, you can't see it. By contrast, one can actually see processor chips with the naked eye, and they are a bit easier to count. If we put chips in the denominator instead of cores, we get: 13161.07 EjOPS / 4 chips = 3290 EjOPS per chip for IBM vs 57422.17 EjOPS / 16 chips = 3588 EjOPS per chip for Oracle The choice of denominator makes all the difference in the appearance. Speaking for myself, dividing by chips just seems to make more sense, because: I can see chips and count them; and I can accurately compare the number of chips in my system to the count in some other vendor's system; and Tthe probability of being able to continue to accurately count them over the next 10 years of microprocessor development seems higher than the probability of being able to accurately and comparably count "cores". SPEC Fair use requirements Speaking as an individual, not speaking for SPEC and not speaking for my employer, I wonder whether Mr. Kharkovski's blog article, taken as a whole, meets the requirements of the SPEC Fair Use rule www.spec.org/fairuse.html section I.D.2. For example, Mr. Kharkovski's footnote (1) begins Results from http://www.spec.org as of 04/04/2013 Oracle SUN SPARC T5-8 449 EjOPS/core SPECjEnterprise2010 (Oracle's WLS best SPECjEnterprise2010 EjOPS/core result on SPARC). IBM Power730 823 EjOPS/core (World Record SPECjEnterprise2010 EJOPS/core result) The questionable tactic, from a Fair Use point of view, is that there is no such metric at the designated location. At www.spec.org, You can find the SPEC metric 57422.17 SPECjEnterprise2010 EjOPS for Oracle and You can also find the SPEC metric 13161.07 SPECjEnterprise2010 EjOPS for IBM. Despite the implication of the footnote, you will not find any mention of 449 nor anything that says 823. SPEC says that you can, under its fair use rule, derive your own values; but it emphasizes: "The context must not give the appearance that SPEC has created or endorsed the derived value." Substantiation and transparency Although SPEC disclaims responsibility for non-SPEC information (section I.E), it says that non-SPEC data and methods should be accurate, should be explained, should be substantiated. Unfortunately, it is difficult or impossible for the reader to independently verify the pricing: Were like units compared to like (e.g. list price to list price)? Were all components (hw, sw, support) included? Were all fees included? Note that when tpc.org shows IBM pricing, there are often items such as "PROCESSOR ACTIVATION" and "MEMORY ACTIVATION". Without the transparency of a detailed breakdown, the pricing claims are questionable. T5 claim for "Fastest Processor" Mr. Kharkovski several times questions Oracle's claim for fastest processor, writing You see, when you publish industry benchmarks, people may actually compare your results to other vendor's results. Well, as we performance people always say, "it depends". If you believe in performance-per-core as the primary way of looking at the world, then yes, the POWER7+ is impressive, spending its chip resources to support up to 32 threads (8 cores x 4 threads). Or, it just might be useful to consider performance-per-chip. Each SPARC T5 chip allows 128 hardware threads to be simultaneously executing (16 cores x 8 threads). The Industry Standard Benchmark that focuses specifically on processor chip performance is SPEC CPU2006. For this very well known and popular benchmark, SPARC T5: provides better performance than both POWER7 and POWER7+, for 1 chip vs. 1 chip, for 8 chip vs. 8 chip, for integer (SPECint_rate2006) and floating point (SPECfp_rate2006), for Peak tuning and for Base tuning. For example, at the 8-chip level, integer throughput (SPECint_rate2006) is: 3750 for SPARC 2170 for POWER7+. You can find the details at the March 2013 BestPerf CPU2006 page SPEC is a trademark of the Standard Performance Evaluation Corporation, www.spec.org. The two specific results quoted for SPECjEnterprise2010 are posted at the URLs linked from the discussion. Results for SPEC CPU2006 were verified at spec.org 1 July 2013, and can be rechecked here.

Read the article
T-SQL Tuesday #21 - Crap!

- by Most Valuable Yak (Rob Volk)

Adam Machanic's (blog | twitter) ever popular T-SQL Tuesday series is being held on Wednesday this time, and the topic is… SHIT CRAP. No, not fecal material. But crap code. Crap SQL. Crap ideas that you thought were good at the time, or were forced to do due (doo-doo?) to lack of time. The challenge for me is to look back on my SQL Server career and find something that WASN'T crap. Well, there's a lot that wasn't, but for some reason I don't remember those that well. So the additional challenge is to pick one particular turd that I really wish I hadn't squeezed out. Let's see if this outline fits the bill: An ETL process on text files; That had to interface between SQL Server and an AS/400 system; That didn't use SSIS (should have) or BizTalk (ummm, no) but command-line scripting, using Unix utilities(!) via: xp_cmdshell; That had to email reports and financial data, some of it sensitive Yep, the stench smell is coming back to me now, as if it was yesterday… As to why SSIS and BizTalk were not options, basically I didn't know either of them well enough to get the job done (and I still don't). I also had a strict deadline of 3 days, in addition to all the other responsibilities I had, so no time to learn them. And seeing how screwed up the rest of the process was: Payment files from multiple vendors in multiple formats; Sent via FTP, PGP encrypted email, or some other wizardry; Manually opened/downloaded and saved to a particular set of folders (couldn't change this); Once processed, had to be placed BACK in the same folders with the original archived; x2 divisions that had to run separately; Plus an additional vendor file in another format on a completely different schedule; So that they could be MANUALLY uploaded into the AS/400 system (couldn't change this either, even if it was technically possible) I didn't feel so bad about the solution I came up with, which was naturally: Copy the payment files to the local SQL Server drives, using xp_cmdshell Run batch files (via xp_cmdshell) to parse the different formats using sed, a Unix utility (this was before Powershell) Use other Unix utilities (join, split, grep, wc) to process parsed files and generate metadata (size, date, checksum, line count) Run sqlcmd to execute a stored procedure that passed the parsed file names so it would bulk load the data to do a comparison bcp the compared data out to ANOTHER text file so that I could grep that data out of the original file Run another stored procedure to import the matched data into SQL Server so it could process the payments, including file metadata Process payment batches and log which division and vendor they belong to Email the payment details to the finance group (since it was too hard for them to run a web report with the same data…which they ran anyway to compare the emailed file against…which always matched, surprisingly) Email another report showing unmatched payments so they could manually void them…about 3 months afterward All in "Excel" format, using xp_sendmail (SQL 2000 system) Copy the unmatched data back to the original folder locations, making sure to match the file format exactly (if you've ever worked with ACH files, you'll understand why this sucked) If you're one of the 10 people who have read my blog before, you know that I love the DOS "for" command. Like passionately. Like fairy-tale love. So my batch files were riddled with for loops, nested within other for loops, that called other batch files containing for loops. I think there was one section that had 4 or 5 nested for commands. It was wrong, disturbed, and completely un-maintainable by anyone, even myself. Months, even a year, after I left the company I got calls from someone who had to make a minor change to it, and they called me to talk them out of spraying the office with an AK-47 after looking at this code. (for you Star Trek TOS fans) The funniest part of this, well, one of the funniest, is that I made the deadline…sort of, I was only a day late…and the DAMN THING WORKED practically unchanged for 3 years. Most of the problems came from the manual parts of the overall process, like forgetting to decrypt the files, or missing/late files, or saved to the wrong folders. I'm definitely not trying to toot my own horn here, because this was truly one of the dumbest, crappiest solutions I ever came up with. Fortunately as far as I know it's no longer in use and someone has written a proper replacement. Today I would knuckle down and do it in SSIS or Powershell, even if it took me weeks to get it right. The real lesson from this crap code is to make things MAINTAINABLE and UNDERSTANDABLE. sed scripting regular expressions doesn't fit that criteria in any way. If you ever find yourself under pressure to do something fast at all costs, DON'T DO IT. Stop and consider long-term maintainability, not just for yourself but for others on your team. If you can't explain the basic approach in under 5 minutes, it ultimately won't succeed. And while you may love to leave all that crap behind, it may follow you anyway, and you'll step in it again. P.S. - if you're wondering about all the manual stuff that couldn't be changed, it was because the entire process had gone through Six Sigma, and was deemed the best possible way. Phew! Talk about stink!

Read the article
Windows Phone 7 Review – Part 1: LG Quantum

- by Nikita Polyakov

As many of my fellow geeks, I ran out and got a retail windows Phone 7 on the first day. Just had to have it :) I’ve had the developer prototypes in my hands for previous 3 months on and off, so I finally wanted to have one I call my own. I’ve rushed the Launch I’ve checked out both AT&T and T-Mobile offerings on day 1 and decided on a Samsung Focus. Great screen, super light and thin. If you don’t believe me that this phone can compete with the best of the non-Phone 7 offerings - get it in your hand to compare for yourself. I have to say that even though the on-screen keyboard on Windows Phone 7 is one of the best, the amount of text I write on my phone and my expectation of how long that takes for a short reply are very high. Also the phone being so slick and sexy did not feel solid or confident in my hand or pocket. As the dust settled Arrives the LG Quantum – now on AT&T and worldwide. First impression of the softer plastic, the back battery cover is solid metal - the entire phone feels solid and indestructible! Phone fits just right in my hand, it’s almost too good. It does not feel like it will crack in your jeans. I feel safe holding it and don’t feel like if I or someone were to bump into me walking it’d fly out of my hand. I’ve dropped and had thrown the Focus a few times on accident as it’s weight is negligible. I won’t even dream of lying the first day adjusting to a 3.5’ LCD screen from the Samsung’s blistering bright and poppy AMOLED 4’ was hard. But the colors and sharpness are still very good. I find it almost easier on the eyes actually for day to day use. I had a chance to lay the phone down in the line with the prototypes and final versions of other phones that had LCD screens – LG makes HTC looks like a budget LCD compared to a high end LCD in the home theatre department. I am consistently complemented by friends that have the HD7 or Surround on how much better my screen looks. The screen just looks like the most color correct phone out of the line up. Even next to Samsung it makes it look oversaturated, but can’t match the true blacks compensating with true white. Day to Day Usability What I also noticed that is a huge difference is how much I am not accidently hitting the soft keys at the bottom. I real pain on Focus since holding it in am average size hand already would accidently touch the controls at the bottom. QWERTY keyboard on this phone is great. It’s like the mission for LG is “make it solid!”. Keyboard has a very durable feel. LG’s has a secret wild card though is the DLNA support. If you seen an ad for it, you should. Imagine this – playing a song from your phone straight to your network connected A/V receiver. Done. Pictures to TV. Done. Video. Done. DLNA works with components that advertise to as well as Windows 7, XBOX 360 and other consoles. I will write an extensive review of that experience in near future. LG Exclusive apps – from panorama photo taker to voice to text translator and even look-n-type app that works like a backup inverse camera, there is quite a bit there that won’t be found on the other phones. I’ll review those in more detail in another segment. Conclusion So for a quick comparison: If you want a phone that is super thin, light and is core reference of a Windows Phone 7 – Samsung Focus it is. If you want a great phone with solid secure feel, real keyboard, media features - the hands down winner is LG Quantum. You can pick up the LG Quantum at AT&T in US and worldwide as LG Optimus 7Q. Final thought: I have not had SmartPhone that I felt was a reliable trusty primary communication device since Samsung BlackJack II, this time the LG got the crown. [ Disclosure: Phone was provided to me free of charge. That has been the case for all of my phones for years, nothing new - I get them all. ]

Read the article
Taking a Flying Leap

- by Lance Shaw

Yesterday, I went skydiving with three of my children. It was thrilling, scary, invigorating and exciting. While there is obvious risk involved, the reward and feeling of success was well worth it. You might already be wondering what skydiving would have to with WebCenter, so let me explain. Implementing a skydiving program and becoming an instructor does not happen overnight. It does not happen with the purchase of the needed technology. Not one of us would go out, buy a parachute, the harnesses, helmet and all the gear and be able to convince anyone that we are now ready to be a skydiving instructor. The fact is that obtaining the technology is merely a small piece of the overall process and so is the case with managing content in your company. You don't just buy the right software (Oracle WebCenter Content) and go to your boss and declare information management success. There is planning, research and effort that goes into deploying software of any kind and especially when it is as mission-critical to the success of your business as Enterprise Content Management. To become a certified skydiving instructor takes at least 3 years of commitment and often longer. In the United States, candidates must complete over 500 solo jumps of their own over a minimum of 36 months and then must complete additional rigorous training under observation. When you consider the amount of time and effort involved, it's not unlike getting a college degree and anyone that has trusted their lives to one of these instructors will no doubt appreciate their dedication to the curriculum. Implementing an ECM system won't take that long, but it certainly requires commitment, analysis and consideration. But guess what? Humans are involved and that means that mistakes can happen and that rules change. This struck me while reading an excellent post on darkreading.com by Glenn S. Phillips entitled "Mission Impossible: 4 Reasons Compliance is Impossible". His over-arching point was that with information management and security, environments change and people are involved meaning the work is never done. He stated that you can never claim your compliance efforts are complete because of the following reasons. People are involved. And lets face it, some are more trustworthy than others. Change is Constant. There is always some new technology coming along that is disruptive. Consumer grade cloud file sharing and sync tools come to mind here. Compliance is interpreted, not defined. Laws and the judges that read them are always on the move. Technology is a tool, not a complete solution. There is no magic pill. The skydiving analogy holds true here as well. Ultimately, a single person packs your parachute. For obvious reasons, you prefer that this person be trustworthy but there are no absolute guarantees of a 100% error-free scenario. Weather and wind conditions are never a constant and the best-laid plans for a great day of skydiving are easily disrupted by forces outside of your control. Rules and regulations vary by location and may be updated at any time and as I mentioned early on, even the best technology on its own will only get you started. The good news is that, like skydiving, with the right technology, the right planning, the right team and a proper understanding of the rules and regulations that govern your industry, your ECM deployment can be a great success. Failure to plan for any of the 4 factors that Glenn outlined in his article will certainly put your deployment and maybe even your company at risk, so consider them carefully. As a final aside, for those of you who consider skydiving an incredibly dangerous and risky pastime, consider this comparative statistic. In 2012, the U.S. Parachute Association recorded 19 fatal skydiving accidents in the U.S. out of roughly 3.1 million jumps. That’s 0.006 fatalities per 1,000 jumps. By comparison, the U.S. National Highway Traffic Safety Administration reports that there were 34,080 deaths due to car accidents in 2012. Based on the percentages, one could argue that it is safer to jump out of a plane than to drive to the airport where the skydiving will take place. While the way you manage, secure, classify, control, retain and dispose of company files may not carry as much risk as driving or skydiving, it certainly carries risk for the organization when not planned and deployed appropriately. Consider all the factors involved in your organization as you make your content management plans. For additional areas of consideration, be sure to download our free whitepaper on the topic entitled "The Top 10 Criteria for Choosing an ECM System" which is available for download here.

Read the article
Community Branching

- by Dane Morgridge

As some may have noticed, I have taken a liking to Ruby (and Rails in particular) quite a bit recently. This last weekend I spoke at the NYC Code Camp on a comparison of ASP.NET and Rails as well as an intro to Entity Framework talk. I am speaking at RubyNation in April and have submitted to other ruby conferences around the area and I am also doing a Rails and MongoDB talk at the Philly Code Camp in April. Before you start to think this is my "I'm leaving .NET post", which it isn't so I need to clarify. I am not, nor do I intend to any time in the near future plan on abandoning .NET. I am simply branching out into another community based on a development technology that I very much enjoy. If you look at my twitter bio, you will see that I am into Entity Framework, Ruby on Rails, C++ and ASP.NET MVC, and not necessarily in that order. I know you're probably thinking to your self that I am crazy, which is probably true on several levels (especially the C++ part). I was actually crazy enough at the NYC Code Camp to show up wearing a Linux t-shirt, presenting with my MacBook Pro on Entity Framework, ASP.NET MVC and Rails. (I did get pelted in the head with candy by Rachel Appel for it though) At all of the code camps I am submitting to this year, i will be submitting sessions on likely all four topics, and some sessions will be a combination of 2 or more. For example, my "ASP.NET MVC: A Gateway To Rails?" talk touches ASP.NET MVC, Entity Framework Code First and Rails. Simply put (and I talk about this in my MVC & Rails talk) is that learning and using Rails has made me a better ASP.NET MVC developer. Just one example of this is helper methods. When I started working with ASP.NET MVC, I didn't really want to use helpers and preferred to just use standard html tags, especially where links were concerned. It was just me being stubborn and not really seeing all of the benefit of the helpers. To my defense, coming from WebForms, I wanted to be as bare metal as possible and it seemed at first like a lot of the helpers were an unnecessary abstraction. I took my first look at Rails back in v1 and didn't spend very much time with it so I dismissed it and went on my merry ASP.NET WebForms way. Then I picked up ASP.NET MVC and grasped the MVC pattern itself much better. After this, I took another look at Rails and everything made sense. I decided then to learn Rails. (I think it is important for developers to learn new languages and platforms regularly so it was a natural progression for me) I wanted to learn it the right way, so when I dug into code, everyone used helpers everywhere for pretty much everything possible. I took some time to dig in and found out how helpful they were and subsequently realized how awesome they were in ASP.NET MVC also and started using them. In short, I love Rails (and Ruby in general). I also love ASP.NET MVC and Entity Framework and yes I still love C++. I have varying degrees of love for them individually at any given moment and it is likely to shift based on the current project I am working on. I know you're thinking it so before you ask the question. "Which do I use when?", I'm going to give the standard developer answer of: It depends. There are a lot of factors that I am not going to even go into that would go into a decision. The most basic question I would ask though is, does this project depend on .NET? If it does, then I'd say that ASP.NET MVC is probably going to be the more logical choice and I am going to leave it at that. I am working on projects right now in both technologies and I don't see that changing anytime soon (one project even uses both). With all that being said, you'll find me at code camps, conferences and user groups presenting on .NET, Ruby or both, writing about .NET and Ruby and I will likely be blogging on both in the future. I know of others that have successfully branched out to other communities and with any luck I'll be successful at it too. On a (sorta) side note, I read a post by Justin Etheredge the other day that pretty much sums up my feelings about Ruby as a language. I highly recommend checking it out: What Is So Great About Ruby?

Read the article
Polite busy-waiting with WRPAUSE on SPARC

- by Dave

Unbounded busy-waiting is an poor idea for user-space code, so we typically use spin-then-block strategies when, say, waiting for a lock to be released or some other event. If we're going to spin, even briefly, then we'd prefer to do so in a manner that minimizes performance degradation for other sibling logical processors ("strands") that share compute resources. We want to spin politely and refrain from impeding the progress and performance of other threads — ostensibly doing useful work and making progress — that run on the same core. On a SPARC T4, for instance, 8 strands will share a core, and that core has its own L1 cache and 2 pipelines. On x86 we have the PAUSE instruction, which, naively, can be thought of as a hardware "yield" operator which temporarily surrenders compute resources to threads on sibling strands. Of course this helps avoid intra-core performance interference. On the SPARC T2 our preferred busy-waiting idiom was "RD %CCR,%G0" which is a high-latency no-nop. The T4 provides a dedicated and extremely useful WRPAUSE instruction. The processor architecture manuals are the authoritative source, but briefly, WRPAUSE writes a cycle count into the the PAUSE register, which is ASR27. Barring interrupts, the processor then delays for the requested period. There's no need for the operating system to save the PAUSE register over context switches as it always resets to 0 on traps. Digressing briefly, if you use unbounded spinning then ultimately the kernel will preempt and deschedule your thread if there are other ready threads than are starving. But by using a spin-then-block strategy we can allow other ready threads to run without resorting to involuntary time-slicing, which operates on a long-ish time scale. Generally, that makes your application more responsive. In addition, by blocking voluntarily we give the operating system far more latitude regarding power management. Finally, I should note that while we have OS-level facilities like sched_yield() at our disposal, yielding almost never does what you'd want or naively expect. Returning to WRPAUSE, it's natural to ask how well it works. To help answer that question I wrote a very simple C/pthreads benchmark that launches 8 concurrent threads and binds those threads to processors 0..7. The processors are numbered geographically on the T4, so those threads will all be running on just one core. Unlike the SPARC T2, where logical CPUs 0,1,2 and 3 were assigned to the first pipeline, and CPUs 4,5,6 and 7 were assigned to the 2nd, there's no fixed mapping between CPUs and pipelines in the T4. And in some circumstances when the other 7 logical processors are idling quietly, it's possible for the remaining logical processor to leverage both pipelines. Some number T of the threads will iterate in a tight loop advancing a simple Marsaglia xor-shift pseudo-random number generator. T is a command-line argument. The main thread loops, reporting the aggregate number of PRNG steps performed collectively by those T threads in the last 10 second measurement interval. The other threads (there are 8-T of these) run in a loop busy-waiting concurrently with the T threads. We vary T between 1 and 8 threads, and report on various busy-waiting idioms. The values in the table are the aggregate number of PRNG steps completed by the set of T threads. The unit is millions of iterations per 10 seconds. For the "PRNG step" busy-waiting mode, the busy-waiting threads execute exactly the same code as the T worker threads. We can easily compute the average rate of progress for individual worker threads by dividing the aggregate score by the number of worker threads T. I should note that the PRNG steps are extremely cycle-heavy and access almost no memory, so arguably this microbenchmark is not as representative of "normal" code as it could be. And for the purposes of comparison I included a row in the table that reflects a waiting policy where the waiting threads call poll(NULL,0,1000) and block in the kernel. Obviously this isn't busy-waiting, but the data is interesting for reference. _table { border:2px black dotted; margin: auto; width: auto; } _tr { border: 2px red dashed; } _td { border: 1px green solid; } _table { border:2px black dotted; margin: auto; width: auto; } _tr { border: 2px red dashed; } td { background-color : #E0E0E0 ; text-align : right ; } th { text-align : left ; } td { background-color : #E0E0E0 ; text-align : right ; } th { text-align : left ; } Aggregate progress T = #worker threads Wait Mechanism for 8-T threadsT=1T=2T=3T=4T=5T=6T=7T=8 Park thread in poll() 32653347334833483348334833483348 no-op 415 831 124316482060249729303349 RD %ccr,%g0 "pause" 14262429269228623013316232553349 PRNG step 412 829 124616702092251029303348 WRPause(8000) 32443361333133483349334833483348 WRPause(4000) 32153308331533223347334833473348 WRPause(1000) 30853199322432513310334833483348 WRPause(500) 29173070315032223270330933483348 WRPause(250) 26942864294930773205338833483348 WRPause(100) 21552469262227902911321433303348

Read the article
Best Method For Evaluating Existing Software or New Software

How many of us have been faced with having to decide on an off-the-self or a custom built component, application, or solution to integrate in to an existing system or to be the core foundation of a new system? What is the best method for evaluating existing software or new software still in the design phase? One of the industry preferred methodologies to use is the Active Reviews for Intermediate Designs (ARID) evaluation process. ARID is a hybrid mixture of the Active Design Review (ADR) methodology and the Architectural Tradeoff Analysis Method (ATAM). So what is ARID? ARD’s main goal is to ensure quality, detailed designs in software. One way in which it does this is by empowering reviewers by assigning generic open ended survey questions. This approach attempts to remove the possibility for allowing the standard answers such as “Yes” or “No”. The ADR process ignores the “Yes”/”No” questions due to the fact that they can be leading based on how the question is asked. Additionally these questions tend to receive less thought in comparison to more open ended questions. Common Active Design Review Questions What possible exceptions can occur in this component, application, or solution? How should exceptions be handled in this component, application, or solution? Where should exceptions be handled in this component, application, or solution? How should the component, application, or solution flow based on the design? What is the maximum execution time for every component, application, or solution? What environments can this component, application, or solution? What data dependencies does this component, application, or solution have? What kind of data does this component, application, or solution require? Ok, now I know what ARID is, how can I apply? Let’s imagine that your organization is going to purchase an off-the-shelf (OTS) solution for its customer-relationship management software. What process would we use to ensure that the correct purchase is made? If we use ARID, then we will have a series of 9 steps broken up by 2 phases in order to ensure that the correct OTS solution is purchases. Phase 1 Identify the Reviewers Prepare the Design Briefing Prepare the Seed Scenarios Prepare the Materials When identifying reviewers for a design it is preferred that they be pulled from a candidate pool comprised of developers that are going to implement the design. The believe is that developers actually implementing the design will have more a vested interest in ensuring that the design is correct prior to the start of code. Design debriefing consist of a summary of the design, examples of the design solving real world examples put in to use and should be no longer than two hours typically. The primary goal of this briefing is to adequately summarize the design so that the review members could actually implement the design. In the example of purchasing an OTS product I would attempt to review my briefing prior to its distribution with the review facilitator to ensure that nothing was excluded that should have not been. This practice will also allow me to test the length of the briefing to ensure that can be delivered in an appropriate about of time. Seed Scenarios are designed to illustrate conceptualized scenarios when applied with a set of sample data. These scenarios can then be used by the reviewers in the actual evaluation of the software, All materials needed for the evaluation should be prepared ahead of time so that they can be reviewed prior to and during the meeting. Materials Included: Presentation Seed Scenarios Review Agenda Phase 2 Present ARID Present Design Brainstorm and prioritize scenarios Apply scenarios Summarize Prior to the start of any ARID review meeting the Facilitator should define the remaining steps of ARID so that all the participants know exactly what they are doing prior to the start of the review process. Once the ARID rules have been laid out, then the lead designer presents an overview of the design which typically takes about two hours. During this time no questions about the design or rational are allowed to be asked by the review panel as a standard, but they are written down for use latter in the process. After the presentation the list of compiled questions is then summarized and sent back to the lead designer as areas that need to be addressed further. In the example of purchasing an OTS product issues could arise regarding security, the implementation needed or even if this is this the correct product to solve the needed solution. After the Design presentation a brainstorming and prioritize scenarios process begins by reducing the seed scenarios down to just the highest priority scenarios. These will then be used to test the design for suitability. Once the selected scenarios have been defined the reviewers apply the examples provided in the presentation to the scenarios. The intended output of this process is to provide code or pseudo code that makes use of the examples provided while solving the selected seed scenarios. As a standard rule, the designers of the systems are not allowed to help the review board unless they all become stuck. When this occurs it is documented and along with the reason why the designer needed to help the review panel back on track. Once all of the scenarios have been completed the review facilitator reviews with the group issues that arise during the process. Then the reviewers will be polled as to efficacy of the review experience. References: Clements, Paul., Kazman, Rick., Klien, Mark. (2002). Evaluating Software Architectures: Methods and Case Studies Indianapolis, IN: Addison-Wesley

Read the article
Session Report - Modern Software Development Anti-Patterns

- by Janice J. Heiss

In this standing-room-only session, building upon his 2011 JavaOne Rock Star “Diabolical Developer” session, Martijn Verburg, this time along with Ben Evans, identified and explored common “anti-patterns” – ways of doing things that keep developers from doing their best work. They emphasized the importance of social interaction and team communication, along with identifying certain psychological pitfalls that lead developers astray. Their emphasis was less on technical coding errors and more how to function well and to keep one’s focus on what really matters. They are the authors of the highly regarded The Well-Grounded Java Developer and are both movers and shakers in the London JUG community and on the Java Community Process. The large room was packed as they gave a fast-moving, witty presentation with lots of laughs and personal anecdotes. Below are a few of the anti-patterns they discussed.Anti-Pattern One: Conference-Driven DeliveryThe theme here is the belief that “Real pros hack code and write their slides minutes before their talks.” Their response to this anti-pattern is an expression popular in the military – PPPPPP, which stands for, “Proper preparation prevents piss-poor performance.”“Communication is very important – probably more important than the code you write,” claimed Verburg. “The more you speak in front of large groups of people the easier it gets, but it’s always important to do dry runs, to present to smaller groups. And important to be members of user groups where you can give presentations. It’s a great place to practice speaking skills; to gain new skills; get new contacts, to network.”They encouraged attendees to record themselves and listen to themselves giving a presentation. They advised them to start with a spouse or friends if need be. Learning to communicate to a group, they argued, is essential to being a successful developer. The emphasis here is that software development is a team activity and good, clear, accessible communication is essential to the functioning of software teams. Anti-Pattern Two: Mortgage-Driven Development The main theme here was that, in a period of worldwide recession and economic stagnation, people are concerned about keeping their jobs. So there is a tendency for developers to treat knowledge as power and not share what they know about their systems with their colleagues, so when it comes time to fix a problem in production, they will be the only one who knows how to fix it – and will have made themselves an indispensable cog in a machine so you cannot be fired. So developers avoid documentation at all costs, or if documentation is required, put it on a USB chip and lock it in a lock box. As in the first anti-pattern, the idea here is that communicating well with your colleagues is essential and documentation is a key part of this. Social interactions are essential. Both Verburg and Evans insisted that increasingly, year by year, successful software development is more about communication than the technical aspects of the craft. Developers who understand this are the ones who will have the most success. Anti-Pattern Three: Distracted by Shiny – Always Use the Latest Technology to Stay AheadThe temptation here is to pick out some obscure framework, try a bit of Scala, HTML5, and Clojure, and always use the latest technology and upgrade to the latest point release of everything. Don’t worry if something works poorly because you are ahead of the curve. Verburg and Evans insisted that there need to be sound reasons for everything a developer does. Developers should not bring in something simply because for some reason they just feel like it or because it’s new. They recommended a site run by a developer named Matt Raible with excellent comparison spread sheets regarding Web frameworks and other apps. They praised it as a useful tool to help developers in their decision-making processes. They pointed out that good developers sometimes make bad choices out of boredom, to add shiny things to their CV, out of frustration with existing processes, or just from a lack of understanding. They pointed out that some code may stay in a business system for 15 or 20 years, but not all code is created equal and some may change after 3 or 6 months. Developers need to know where the code they are contributing fits in. What is its likely lifespan? Anti-Pattern Four: Design-Driven Design The anti-pattern: If you want to impress your colleagues and bosses, use design patents left, right, and center – MVC, Session Facades, SOA, etc. Or the UML modeling suite from IBM, back in the day… Generate super fast code. And the more jargon you can talk when in the vicinity of the manager the better.Verburg shared a true story about a time when he was interviewing a guy for a job and asked him what his previous work was. The interviewee said that he essentially took patterns and uses an approved book of Enterprise Architecture Patterns and applied them. Verburg was dumbstruck that someone could have a job in which they took patterns from a book and applied them. He pointed out that the idea that design is a separate activity is simply wrong. He repeated a saying that he uses, “You should pay your junior developers for the lines of code they write and the things they add; you should pay your senior developers for what they take away.”He explained that by encouraging people to take things away, the code base gets simpler and reflects the actual business use cases developers are trying to solve, as opposed to the framework that is being imposed. He told another true story about a project to decommission a very long system. 98% of the code was decommissioned and people got a nice bonus. But the 2% remained on the mainframe so the 98% reduction in code resulted in zero reduction in costs, because the entire mainframe was needed to run the 2% that was left. There is an incentive to get rid of source code and subsystems when they are no longer needed. The session continued with several more anti-patterns that were equally insightful.

Read the article
await, WhenAll, WaitAll, oh my!!

- by cibrax

If you are dealing with asynchronous work in .NET, you might know that the Task class has become the main driver for wrapping asynchronous calls. Although this class was officially introduced in .NET 4.0, the programming model for consuming tasks was much more simplified in C# 5.0 in .NET 4.5 with the addition of the new async/await keywords. In a nutshell, you can use these keywords to make asynchronous calls as if they were sequential, and avoiding in that way any fork or callback in the code. The compiler takes care of the rest. I was yesterday writing some code for making multiple asynchronous calls to backend services in parallel. The code looked as follow, var allResults = new List<Result>(); foreach(var provider in providers) { var results = await provider.GetResults(); allResults.AddRange(results); } return allResults; You see, I was using the await keyword to make multiple calls in parallel. Something I did not consider was the overhead this code implied after being compiled. I started an interesting discussion with some smart folks in twitter. One of them, Tugberk Ugurlu, had the brilliant idea of actually write some code to make a performance comparison with another approach using Task.WhenAll. There are two additional methods you can use to wait for the results of multiple calls in parallel, WhenAll and WaitAll. WhenAll creates a new task and waits for results in that new task, so it does not block the calling thread. WaitAll, on the other hand, blocks the calling thread. This is the code Tugberk initially wrote, and I modified afterwards to also show the results of WaitAll. class Program { private static Func<Stopwatch, Task>[] funcs = new Func<Stopwatch, Task>[] { async (watch) => { watch.Start(); await Task.Delay(1000); Console.WriteLine("1000 one has been completed."); }, async (watch) => { await Task.Delay(1500); Console.WriteLine("1500 one has been completed."); }, async (watch) => { await Task.Delay(2000); Console.WriteLine("2000 one has been completed."); watch.Stop(); Console.WriteLine(watch.ElapsedMilliseconds + "ms has been elapsed."); } }; static void Main(string[] args) { Console.WriteLine("Await in loop work starts..."); DoWorkAsync().ContinueWith(task => { Console.WriteLine("Parallel work starts..."); DoWorkInParallelAsync().ContinueWith(t => { Console.WriteLine("WaitAll work starts..."); WaitForAll(); }); }); Console.ReadLine(); } static async Task DoWorkAsync() { Stopwatch watch = new Stopwatch(); foreach (var func in funcs) { await func(watch); } } static async Task DoWorkInParallelAsync() { Stopwatch watch = new Stopwatch(); await Task.WhenAll(funcs[0](watch), funcs[1](watch), funcs[2](watch)); } static void WaitForAll() { Stopwatch watch = new Stopwatch(); Task.WaitAll(funcs[0](watch), funcs[1](watch), funcs[2](watch)); } } After running this code, the results were very concluding. Await in loop work starts... 1000 one has been completed. 1500 one has been completed. 2000 one has been completed. 4532ms has been elapsed. Parallel work starts... 1000 one has been completed. 1500 one has been completed. 2000 one has been completed. 2007ms has been elapsed. WaitAll work starts... 1000 one has been completed. 1500 one has been completed. 2000 one has been completed. 2009ms has been elapsed. The await keyword in a loop does not really make the calls in parallel.

Read the article
Subterranean IL: The ThreadLocal type

- by Simon Cooper

I came across ThreadLocal<T> while I was researching ConcurrentBag. To look at it, it doesn't really make much sense. What's all those extra Cn classes doing in there? Why is there a GenericHolder<T,U,V,W> class? What's going on? However, digging deeper, it's a rather ingenious solution to a tricky problem. Thread statics Declaring that a variable is thread static, that is, values assigned and read from the field is specific to the thread doing the reading, is quite easy in .NET: [ThreadStatic] private static string s_ThreadStaticField; ThreadStaticAttribute is not a pseudo-custom attribute; it is compiled as a normal attribute, but the CLR has in-built magic, activated by that attribute, to redirect accesses to the field based on the executing thread's identity. TheadStaticAttribute provides a simple solution when you want to use a single field as thread-static. What if you want to create an arbitary number of thread static variables at runtime? Thread-static fields can only be declared, and are fixed, at compile time. Prior to .NET 4, you only had one solution - thread local data slots. This is a lesser-known function of Thread that has existed since .NET 1.1: LocalDataStoreSlot threadSlot = Thread.AllocateNamedDataSlot("slot1"); string value = "foo"; Thread.SetData(threadSlot, value); string gettedValue = (string)Thread.GetData(threadSlot); Each instance of LocalStoreDataSlot mediates access to a single slot, and each slot acts like a separate thread-static field. As you can see, using thread data slots is quite cumbersome. You need to keep track of LocalDataStoreSlot objects, it's not obvious how instances of LocalDataStoreSlot correspond to individual thread-static variables, and it's not type safe. It's also relatively slow and complicated; the internal implementation consists of a whole series of classes hanging off a single thread-static field in Thread itself, using various arrays, lists, and locks for synchronization. ThreadLocal<T> is far simpler and easier to use. ThreadLocal ThreadLocal provides an abstraction around thread-static fields that allows it to be used just like any other class; it can be used as a replacement for a thread-static field, it can be used in a List<ThreadLocal<T>>, you can create as many as you need at runtime. So what does it do? It can't just have an instance-specific thread-static field, because thread-static fields have to be declared as static, and so shared between all instances of the declaring type. There's something else going on here. The values stored in instances of ThreadLocal<T> are stored in instantiations of the GenericHolder<T,U,V,W> class, which contains a single ThreadStatic field (s_value) to store the actual value. This class is then instantiated with various combinations of the Cn types for generic arguments. In .NET, each separate instantiation of a generic type has its own static state. For example, GenericHolder<int,C0,C1,C2> has a completely separate s_value field to GenericHolder<int,C1,C14,C1>. This feature is (ab)used by ThreadLocal to emulate instance thread-static fields. Every time an instance of ThreadLocal is constructed, it is assigned a unique number from the static s_currentTypeId field using Interlocked.Increment, in the FindNextTypeIndex method. The hexadecimal representation of that number then defines the specific Cn types that instantiates the GenericHolder class. That instantiation is therefore 'owned' by that instance of ThreadLocal. This gives each instance of ThreadLocal its own ThreadStatic field through a specific unique instantiation of the GenericHolder class. Although GenericHolder has four type variables, the first one is always instantiated to the type stored in the ThreadLocal<T>. This gives three free type variables, each of which can be instantiated to one of 16 types (C0 to C15). This puts an upper limit of 4096 (163) on the number of ThreadLocal<T> instances that can be created for each value of T. That is, there can be a maximum of 4096 instances of ThreadLocal<string>, and separately a maximum of 4096 instances of ThreadLocal<object>, etc. However, there is an upper limit of 16384 enforced on the total number of ThreadLocal instances in the AppDomain. This is to stop too much memory being used by thousands of instantiations of GenericHolder<T,U,V,W>, as once a type is loaded into an AppDomain it cannot be unloaded, and will continue to sit there taking up memory until the AppDomain is unloaded. The total number of ThreadLocal instances created is tracked by the ThreadLocalGlobalCounter class. So what happens when either limit is reached? Firstly, to try and stop this limit being reached, it recycles GenericHolder type indexes of ThreadLocal instances that get disposed using the s_availableIndices concurrent stack. This allows GenericHolder instantiations of disposed ThreadLocal instances to be re-used. But if there aren't any available instantiations, then ThreadLocal falls back on a standard thread local slot using TLSHolder. This makes it very important to dispose of your ThreadLocal instances if you'll be using lots of them, so the type instantiations can be recycled. The previous way of creating arbitary thread-static variables, thread data slots, was slow, clunky, and hard to use. In comparison, ThreadLocal can be used just like any other type, and each instance appears from the outside to be a non-static thread-static variable. It does this by using the CLR type system to assign each instance of ThreadLocal its own instantiated type containing a thread-static field, and so delegating a lot of the bookkeeping that thread data slots had to do to the CLR type system itself! That's a very clever use of the CLR type system.

Read the article
Null Values And The T-SQL IN Operator

- by Jesse

I came across some unexpected behavior while troubleshooting a failing test the other day that took me long enough to figure out that I thought it was worth sharing here. I finally traced the failing test back to a SELECT statement in a stored procedure that was using the IN t-sql operator to exclude a certain set of values. Here’s a very simple example table to illustrate the issue: Customers CustomerId INT, NOT NULL, Primary Key CustomerName nvarchar(100) NOT NULL SalesRegionId INT NULL The ‘SalesRegionId’ column contains a number representing the sales region that the customer belongs to. This column is nullable because new customers get created all the time but assigning them to sales regions is a process that is handled by a regional manager on a periodic basis. For the purposes of this example, the Customers table currently has the following rows: CustomerId CustomerName SalesRegionId 1 Customer A 1 2 Customer B NULL 3 Customer C 4 4 Customer D 2 5 Customer E 3 How could we write a query against this table for all customers that are NOT in sales regions 2 or 4? You might try something like this: 1: SELECT 2: CustomerId, 3: CustomerName, 4: SalesRegionId 5: FROM Customers 6: WHERE SalesRegionId NOT IN (2,4) Will this work? In short, no; at least not in the way that you might expect. Here’s what this query will return given the example data we’re working with: CustomerId CustomerName SalesRegionId 1 Customer A 1 5 Customer E 5 I was expecting that this query would also return ‘Customer B’, since that customer has a NULL SalesRegionId. In my mind, having a customer with no sales region should be included in a set of customers that are not in sales regions 2 or 4.When I first started troubleshooting my issue I made note of the fact that this query should probably be re-written without the NOT IN clause, but I didn’t suspect that the NOT IN clause was actually the source of the issue. This particular query was only one minor piece in a much larger process that was being exercised via an automated integration test and I simply made a poor assumption that the NOT IN would work the way that I thought it should. So why doesn’t this work the way that I thought it should? From the MSDN documentation on the t-sql IN operator: If the value of test_expression is equal to any value returned by subquery or is equal to any expression from the comma-separated list, the result value is TRUE; otherwise, the result value is FALSE. Using NOT IN negates the subquery value or expression. The key phrase out of that quote is, “… is equal to any expression from the comma-separated list…”. The NULL SalesRegionId isn’t included in the NOT IN because of how NULL values are handled in equality comparisons. From the MSDN documentation on ANSI_NULLS: The SQL-92 standard requires that an equals (=) or not equal to (<>) comparison against a null value evaluates to FALSE. When SET ANSI_NULLS is ON, a SELECT statement using WHERE column_name = NULL returns zero rows even if there are null values in column_name. A SELECT statement using WHERE column_name <> NULL returns zero rows even if there are nonnull values in column_name. In fact, the MSDN documentation on the IN operator includes the following blurb about using NULL values in IN sub-queries or expressions that are used with the IN operator: Any null values returned by subquery or expression that are compared to test_expression using IN or NOT IN return UNKNOWN. Using null values in together with IN or NOT IN can produce unexpected results. If I were to include a ‘SET ANSI_NULLS OFF’ command right above my SELECT statement I would get ‘Customer B’ returned in the results, but that’s definitely not the right way to deal with this. We could re-write the query to explicitly include the NULL value in the WHERE clause: 1: SELECT 2: CustomerId, 3: CustomerName, 4: SalesRegionId 5: FROM Customers 6: WHERE (SalesRegionId NOT IN (2,4) OR SalesRegionId IS NULL) This query works and properly includes ‘Customer B’ in the results, but I ultimately opted to re-write the query using a LEFT OUTER JOIN against a table variable containing all of the values that I wanted to exclude because, in my case, there could potentially be several hundred values to be excluded. If we were to apply the same refactoring to our simple sales region example we’d end up with: 1: DECLARE @regionsToIgnore TABLE (IgnoredRegionId INT) 2: INSERT @regionsToIgnore values (2),(4) 3: 4: SELECT 5: c.CustomerId, 6: c.CustomerName, 7: c.SalesRegionId 8: FROM Customers c 9: LEFT OUTER JOIN @regionsToIgnore r ON r.IgnoredRegionId = c.SalesRegionId 10: WHERE r.IgnoredRegionId IS NULL By performing a LEFT OUTER JOIN from Customers to the @regionsToIgnore table variable we can simply exclude any rows where the IgnoredRegionId is null, as those represent customers that DO NOT appear in the ignored regions list. This approach will likely perform better if the number of sales regions to ignore gets very large and it also will correctly include any customers that do not yet have a sales region.

Read the article
Developing Schema Compare for Oracle (Part 5): Query Snapshots

- by Simon Cooper

If you've emailed us about a bug you've encountered with the EAP or beta versions of Schema Compare for Oracle, we probably asked you to send us a query snapshot of your databases. Here, I explain what a query snapshot is, and how it helps us fix your bug. Problem 1: Debugging users' bug reports When we started the Schema Compare project, we knew we were going to get problems with users' databases - configurations we hadn't considered, features that weren't installed, unicode issues, wierd dependencies... With SQL Compare, users are generally happy to send us a database backup that we can restore using a single RESTORE DATABASE command on our test servers and immediately reproduce the problem. Oracle, on the other hand, would be a lot more tricky. As Oracle generally has a 1-to-1 mapping between instances and databases, any databases users sent would have to be restored to their own instance. Furthermore, the number of steps required to get a properly working database, and the size of most oracle databases, made it infeasible to ask every customer who came across a bug during our beta program to send us their databases. We also knew that there would be lots of issues with data security that would make it hard to get backups. So we needed an easier way to be able to debug customers issues and sort out what strange schema data Oracle was returning. Problem 2: Test execution time Another issue we knew we would have to solve was the execution time of the tests we would produce for the Schema Compare engine. Our initial prototype showed that querying the data dictionary for schema information was going to be slow (at least 15 seconds per database), and this is generally proportional to the size of the database. If you're running thousands of tests on the same databases, each one registering separate schemas, not only would the tests would take hours and hours to run, but the test servers would be hammered senseless. The solution To solve these, we needed to be able to populate the schema of a database without actually connecting to it. Well, the IDataReader interface is the primary way we read data from an Oracle server. The data dictionary queries we use return their data in terms of simple strings and numbers, which we then process and reconstruct into an object model, and the results of these queries are identical for identical schemas. So, we can record the raw results of the queries once, and then replay these results to construct the same object model as many times as required without needing to actually connect to the original database. This is what query snapshots do. They are binary files containing the raw unprocessed data we get back from the oracle server for all the queries we run on the data dictionary to get schema information. The core of the query snapshot generation takes the results of the IDataReader we get from running queries on Oracle, and passes the row data to a BinaryWriter that writes it straight to a file. The query snapshot can then be replayed to create the same object model; when the results of a specific query is needed by the population code, we can simply read the binary data stored in the file on disk and present it through an IDataReader wrapper. This is far faster than querying the server over the network, and allows us to run tests in a reasonable time. They also allow us to easily debug a customers problem; using a simple snapshot generation program, users can generate a query snapshot that could be sent along with a bug report that we can immediately replay on our machines to let us debug the issue, rather than having to obtain database backups and restore databases to test systems. There are also far fewer problems with data security; query snapshots only contain schema information, which is generally less sensitive than table data. Query snapshots implementation However, actually implementing such a feature did have a couple of 'gotchas' to it. My second blog post detailed the development of the dependencies algorithm we use to ensure we get all the dependencies in the database, and that algorithm uses data from both databases to find all the needed objects - what database you're comparing to affects what objects get populated from both databases. We get information on these additional objects using an appropriate WHERE clause on all the population queries. So, in order to accurately replay the results of querying the live database, the query snapshot needs to be a snapshot of a comparison of two databases, not just populating a single database. Furthermore, although the code population queries (eg querying all_tab_cols to get column information) can simply be passed straight from the IDataReader to the BinaryWriter, we need to hook into and run the live dependencies algorithm while we're creating the snapshot to ensure we get the same WHERE clauses, and the same query results, as if we were populating straight from a live system. We also need to store the results of the dependencies queries themselves, as the resulting dependency graph is stored within the OracleDatabase object that is produced, and is later used to help order actions in synchronization scripts. This is significantly helped by the dependencies algorithm being a deterministic algorithm - given the same input, it will always return the same output. Therefore, when we're replaying a query snapshot, and processing dependency information, we simply have to return the results of the queries in the order we got them from the live database, rather than trying to calculate the contents of all_dependencies on the fly. Query snapshots are a significant feature in Schema Compare that really helps us to debug problems with the tool, as well as making our testers happier. Although not really user-visible, they are very useful to the development team to help us fix bugs in the product much faster than we otherwise would be able to.

Read the article
Documentation Changes in Solaris 11.1

- by alanc

One of the first places you can see Solaris 11.1 changes are in the docs, which have now been posted in the Solaris 11.1 Library on docs.oracle.com. I spent a good deal of time reviewing documentation for this release, and thought some would be interesting to blog about, but didn't review all the changes (not by a long shot), and am not going to cover all the changes here, so there's plenty left for you to discover on your own. Just comparing the Solaris 11.1 Library list of docs against the Solaris 11 list will show a lot of reorganization and refactoring of the doc set, especially in the system administration guides. Hopefully the new break down will make it easier to get straight to the sections you need when a task is at hand. Packaging System Unfortunately, the excellent in-depth guide for how to build packages for the new Image Packaging System (IPS) in Solaris 11 wasn't done in time to make the initial Solaris 11 doc set. An interim version was published shortly after release, in PDF form on the OTN IPS page. For Solaris 11.1 it was included in the doc set, as Packaging and Delivering Software With the Image Packaging System in Oracle Solaris 11.1, so should be easier to find, and easier to share links to specific pages the HTML version. Beyond just how to build a package, it includes details on how Solaris is packaged, and how package updates work, which may be useful to all system administrators who deal with Solaris 11 upgrades & installations. The Adding and Updating Oracle Solaris 11.1 Software Packages was also extended, including new sections on Relaxing Version Constraints Specified by Incorporations and Locking Packages to a Specified Version that may be of interest to those who want to keep the Solaris 11 versions of certain packages when they upgrade, such as the couple of packages that had functionality removed by an (unusual for an update release) End of Feature process in the 11.1 release. Also added in this release is a document containing the lists of all the packages in each of the major package groups in Solaris 11.1 (solaris-desktop, solaris-large-server, and solaris-small-server). While you can simply get the contents of those groups from the package repository, either via the web interface or the pkg command line, the documentation puts them in handy tables for easier side-by-side comparison, or viewing the lists before you've installed the system to pick which one you want to initially install. X Window System We've not had good X11 coverage in the online Solaris docs in a while, mostly relying on the man pages, and upstream X.Org docs. In this release, we've integrated some X coverage into the Solaris 11.1 Desktop Adminstrator's Guide, including sections on installing fonts for fontconfig or legacy X11 clients, X server configuration, and setting up remote access via X11 or VNC. Of course we continue to work on improving the docs, including a lot of contributions to the upstream docs all OS'es share (more about that another time). Security One of the things Oracle likes to do for its products is to publish security guides for administrators & developers to know how to build systems that meet their security needs. For Solaris, we started this with Solaris 11, providing a guide for sysadmins to find where the security relevant configuration options were documented. The Solaris 11.1 Security Guidelines extend this to cover new security features, such as Address Space Layout Randomization (ASLR) and Read-Only Zones, as well as adding additional guidelines for existing features, such as how to limit the size of tmpfs filesystems, to avoid users driving the system into swap thrashing situations. For developers, the corresponding document is the Developer's Guide to Oracle Solaris 11 Security, which has been the source for years for documentation of security-relevant Solaris API's such as PAM, GSS-API, and the Solaris Cryptographic Framework. For Solaris 11.1, a new appendix was added to start providing Secure Coding Guidelines for Developers, leveraging the CERT Secure Coding Standards and OWASP guidelines to provide the base recommendations for common programming languages and their standard API's. Solaris specific secure programming guidance was added via links to other documentation in the product doc set. In parallel, we updated the Solaris C Libary Functions security considerations list with details of Solaris 11 enhancements such as FD_CLOEXEC flags, additional *at() functions, and new stdio functions such as asprintf() and getline(). A number of code examples throughout the Solaris 11.1 doc set were updated to follow these recommendations, changing unbounded strcpy() calls to strlcpy(), sprintf() to snprintf(), etc. so that developers following our examples start out with safer code. The Writing Device Drivers guide even had the appendix updated to list which of these utility functions, like snprintf() and strlcpy(), are now available via the Kernel DDI. Little Things Of course all the big new features got documented, and some major efforts were put into refactoring and renovation, but there were also a lot of smaller things that got fixed as well in the nearly a year between the Solaris 11 and 11.1 doc releases - again too many to list here, but a random sampling of the ones I know about & found interesting or useful: The Privileges section of the DTrace Guide now gives users a pointer to find out how to set up DTrace privileges for non-global zones and what limitations are in place there. A new section on Recommended iSCSI Configuration Practices was added to the iSCSI configuration section when it moved into the SAN Configuration and Multipathing administration guide. The Managing System Power Services section contains an expanded explanation of the various tunables for power management in Solaris 11.1. The sample dcmd sources in /usr/demo/mdb were updated to include ::help output, so that developers like myself who follow the examples don't forget to include it (until a helpful code reviewer pointed it out while reviewing the mdb module changes for Xorg 1.12). The README file in that directory was updated to show the correct paths for installing both kernel & userspace modules, including the 64-bit variants.

Read the article
CS, SE, HCI, Information Science, Please recommendation for further education of the former performing art manager seeking career in IT industries? [on hold]

- by Baek Seungjoo

IT specialists there J Thank you very much for your collective efforts here, and I got huge help reading your professional comments and advices on each questions I have searched so far! This time, I would like to ask for your practical advices or recommendation on what I am struggling on at this moment. I am currently seeking higher education for my career transition from performing art manager and director to “IT software and/or service development and management specialist”. However, as this field is quite new to me, and there are lots of different work positions, I have no idea which grad major I better pursue in order to get qualification. Of course I know this question could sounds wired as it is kind of personal choice. But my lack of understanding on how IT software companies work in general, your practical and experience-based advice will be great help to me, who spent more than two months of self-research on net. OK. Before my question, here is my plan and history, which are quite different from those currently in IT industry I think… 1) Target Firstly, get career transition into IT service or products companies and get experiences. Eventually, pursue IT entrepreneurship in combination with my arts and cultural production and business expertise. 2) Background Career: performing arts director and manager in theatre-based scale opera and musical Art education in youth BA in literature and Chinese studies (Art & Humanities) MA in Cultural & Creative Industries (Art & Humanities) – dissertation with focus on digital prosumption and the lived experience of the prosumer. (a qualitative research on the agents in the digital world) 2) Personally Huge interest in IT hardware and software, and their trend. Skills to build up, repair, tune PCs -of course this is no more than personal hobby, but shows my interests in this field. 4) Problem Encounter a question “So, what do you think you can contribute practically in this position”. This question turn me down everytime I go through job interviews, and I decided more education in the relevant area. Here are my questions. 1) In terms of work positions in IT software companies, I wonder if I can put the comparison of what “Artists” is to “Arts Manager or Director” is what “Developer” is to “Product Manager”. (Of course, this stereotypical division of Artist-Art Manager is out of sense because the domain overlaps to some extent, and is blurring at least in my field, and they are in different contexts, but just speaking easily.) Normally, artist comes with special arts educations, and they live in their own world of artistic inspiration and creation, and they feel alive in practice and on stages. Meanwhile, from the point of staging and managing productions, the role of art manager is critical as well. Our role cares how the production appeals to the audience in effective way, how to make profit and future sustainable management through that, how to set up future strategy in consideration of the external conditions such as political and social circumstances, audience trend and level, other production trends from on-going and historical perspectives, how and what the production make voice to the society from political, economic, humanitarian stances. So, we need keen eyes on economic, political, and societal environment, have to understand human-being and their desires, must know how to make presentation and attract investors, must have sense in managing and fighting over the limited financial resource, how to extend networking and so on. It is common that the two agents create productions in collaboration (normally not in that ideal way but in conflict and fight though J ). So, we need to know each other’s expertise to some extent, for better production. What are the work positions in IT software industries equivalent to the role of “art manager” in performing arts? From my view, considering developers come with special education in the world of computer science, software engineering, or others (self-education sometimes), and they express themselves with the arts of coding, computer languages on the black screen, and make sort of their artistic production online to the audience, I guess there might be someone who collaborate with developers in creating, managing, and launching IT services or products. 2) Which education among CS, SE, HCI, Information Science, is needed for those seeking such work position? Especially for person like me. (At this moment, Information Science has the highest possibility to get in, since I lack Calculus and Math in undergrad educaiton. But please let me know irrespective of this concern, I think there are ways to back it up if CS or SE education needed in my case) 3) Which field between Information Science and HCI can be more practical background regarding job hungting? And which of them have more demands in job market? AS I checked, HCI is more close to CS than IS in its focus of study area. Thank you very much for your patience reading such a long inquiry, and I appreciate to your efforts in advance. Have a nice day in this beautiful summer.

Read the article
k-d tree implementation [closed]

- by user466441

when i run my code and debugged,i got this error - this 0x00093584 {_Myproxy=0x00000000 _Mynextiter=0x00000000 } std::_Iterator_base12 * const - _Myproxy 0x00000000 {_Mycont=??? _Myfirstiter=??? } std::_Container_proxy * _Mycont CXX0017: Error: symbol "" not found _Myfirstiter CXX0030: Error: expression cannot be evaluated + _Mynextiter 0x00000000 {_Myproxy=??? _Mynextiter=??? } std::_Iterator_base12 * but i dont know what does it means,code is this #include<iostream> #include<vector> #include<algorithm> using namespace std; struct point { float x,y; }; vector<point>pointleft(4); vector<point>pointright(4); //we are going to implement two comparison function for x and y coordinates,we need it in calculation of median (we should sort vector //by x or y according to depth informaton,is depth even or odd. bool sortby_X(point &a,point &b) { return a.x<b.x; } bool sortby_Y(point &a,point &b) { return a.y<b.y; } //so i am going to implement to median finding algorithm,one for finding median by x and another find median by y point medianx(vector<point>points) { point temp; sort(points.begin(),points.end(),sortby_X); temp=points[(points.size()/2)]; return temp; } point mediany(vector<point>points) { point temp; sort(points.begin(),points.end(),sortby_Y); temp=points[(points.size()/2)]; return temp; } //now construct basic tree structure struct Tree { float x,y; Tree(point a) { x=a.x; y=a.y; } Tree *left; Tree *right; }; Tree * build_kd( Tree *root,vector<point>points,int depth) { point temp; if(points.size()==1)// that point is as a leaf { if(root==NULL) root=new Tree(points[0]); return root; } if(depth%2==0) { temp=medianx(points); root=new Tree(temp); for(int i=0;i<points.size();i++) { if (points[i].x<temp.x) pointleft[i]=points[i]; else pointright[i]=points[i]; } } else { temp=mediany(points); root=new Tree(temp); for(int i=0;i<points.size();i++) { if(points[i].y<temp.y) pointleft[i]=points[i]; else pointright[i]=points[i]; } } return build_kd(root->left,pointleft,depth+1); return build_kd(root->right,pointright,depth+1); } void print(Tree *root) { while(root!=NULL) { cout<<root->x<<" " <<root->y; print(root->left); print(root->right); } } int main() { int depth=0; Tree *root=NULL; vector<point>points(4); float x,y; int n=4; for(int i=0;i<n;i++) { cin>>x>>y; points[i].x=x; points[i].y=y; } root=build_kd(root,points,depth); print(root); return 0; } i am trying ti implement in c++ this pseudo code tuple function build_kd_tree(int depth, set points): if points contains only one point: return that point as a leaf. if depth is even: Calculate the median x-value. Create a set of points (pointsLeft) that have x-values less than the median. Create a set of points (pointsRight) that have x-values greater than or equal to the median. else: Calculate the median y-value. Create a set of points (pointsLeft) that have y-values less than the median. Create a set of points (pointsRight) that have y-values greater than or equal to the median. treeLeft = build_kd_tree(depth + 1, pointsLeft) treeRight = build_kd_tree(depth + 1, pointsRight) return(median, treeLeft, treeRight) please help me what this error means?

Read the article
Solaris 11.1 changes building of code past the point of __NORETURN

- by alanc

While Solaris 11.1 was under development, we started seeing some errors in the builds of the upstream X.Org git master sources, such as: "Display.c", line 65: Function has no return statement : x_io_error_handler "hostx.c", line 341: Function has no return statement : x_io_error_handler from functions that were defined to match a specific callback definition that declared them as returning an int if they did return, but these were calling exit() instead of returning so hadn't listed a return value. These had been generating warnings for years which we'd been ignoring, but X.Org has made enough progress in cleaning up code for compiler warnings and static analysis issues lately, that the community turned up the default error levels, including the gcc flag -Werror=return-type and the equivalent Solaris Studio cc flags -v -errwarn=E_FUNC_HAS_NO_RETURN_STMT, so now these became errors that stopped the build. Yet on Solaris, gcc built this code fine, while Studio errored out. Investigation showed this was due to the Solaris headers, which during Solaris 10 development added a number of annotations to the headers when gcc was being used for the amd64 kernel bringup before the Studio amd64 port was ready. Since Studio did not support the inline form of these annotations at the time, but instead used #pragma for them, the definitions were only present for gcc. To resolve this, I fixed both sides of the problem, so that it would work for building new X.Org sources on older Solaris releases or with older Studio compilers, as well as fixing the general problem before it broke more software building on Solaris. To the X.Org sources, I added the traditional Studio #pragma does_not_return to recognize that functions like exit() don't ever return, in patches such as this Xserver patch. Adding a dummy return statement was ruled out as that introduced unreachable code errors from compilers and analyzers that correctly realized you couldn't reach that code after a return statement. And on the Solaris 11.1 side, I updated the annotation definitions in <sys/ccompile.h> to enable for Studio 12.0 and later compilers the annotations already existing in a number of system headers for functions like exit() and abort(). If you look in that file you'll see the annotations we currently use, though the forms there haven't gone through review to become a Committed interface, so may change in the future. Actually getting this integrated into Solaris though took a bit more work than just editing one header file. Our ELF binary build comparison tool, wsdiff, actually showed a large number of differences in the resulting binaries due to the compiler using this information for branch prediction, code path analysis, and other possible optimizations, so after comparing enough of the disassembly output to be comfortable with the changes, we also made sure to get this in early enough in the release cycle so that it would get plenty of test exposure before the release. It also required updating quite a bit of code to avoid introducing new lint or compiler warnings or errors, and people building applications on top of Solaris 11.1 and later may need to make similar changes if they want to keep their build logs similarly clean. Previously, if you had a function that was declared with a non-void return type, lint and cc would warn if you didn't return a value, even if you called a function like exit() or panic() that ended execution. For instance: #include <stdlib.h> int callback(int status) { if (status == 0) return status; exit(status); } would previously require a never executed return 0; after the exit() to avoid lint warning "function falls off bottom without returning value". Now the compiler & lint will both issue "statement not reached" warnings for a return 0; after the final exit(), allowing (or in some cases, requiring) it to be removed. However, if there is no return statement anywhere in the function, lint will warn that you've declared a function returning a value that never does so, suggesting you can declare it as void. Unfortunately, if your function signature is required to match a certain form, such as in a callback, you not be able to do so, and will need to add a /* LINTED */ to the end of the function. If you need your code to build on both a newer and an older release, then you will either need to #ifdef these unreachable statements, or, to keep your sources common across releases, add to your sources the corresponding #pragma recognized by both current and older compiler versions, such as: #pragma does_not_return(exit) #pragma does_not_return(panic) Hopefully this little extra work is paid for by the compilers & code analyzers being able to better understand your code paths, giving you better optimizations and more accurate errors & warning messages.

Read the article
What are the industry metrics for average spend on dev hardware and software? [on hold]

- by RationalGeek

I'm trying to budget for my dev shop and compare our budget items to industry expectations. I'm hoping to find some information on what percentage of a dev's salary is generally spent on tooling, both hardware and software. Where can I find such information? If instead there is a source that looks at raw dollars that is useful, too. I can extrapolate what I need from that. NOTE: Your anecdotal evidence from your own job will not be very helpful. I'm looking for industry average statistics from a credible source. EDIT: I'm reluctant to even keep this question going based on the passionate negative responses of commenters, but I do think this is valuable information (assuming anyone will care to answer) so let me make one attempt to clarify why I'm looking for this information, and then leave it at that. I'm not sure why understanding and validating my motives is a necessary step to providing the information, but apparently that is the case, so I will do my best. Firstly, let me respond to the idea that us "management types" shouldn't use these types of metrics to evaluate budgets. I agree in part. Ideally, you should spend whatever is necessary on developers in order to keep them fully happy and productive. And this is true of all employees. However, companies operate in a world of limited resources, and every dollar spent in one area means a dollar not spent in another. So it is not enough to simply say "I need to spend $10,000 per developer next year" without having some way to justify that position. One way to help justify it is to compare yourself against the industry. If it is the case that on average a software shops spends 5% (making up that number) of their total development budget (salaries being the large portion of the other 95%, for arguments sake), and I'm only spending 3%, it helps in the justification process. So, it is not my intent to use this information to limit what I spend on developers, but rather to arm myself with the necessary justification to spend what I need to spend on developers to give them the best tools I can. I have been a developer for many years and I understand the need for proper tooling. Next, let's examine the idea that even considering the relationship between a spend on developer salaries and developer tooling is ludicrous and should be banned from budgetary thinking. As Jimmy Hoffa put it in their comment, it's like saying "I'm going to spend no more than 10% of median employee salary on light bulbs and coffee from now on.". Well, yes, it is like saying that, and from a budgeting perspective, this is a useful way to look at things. If you know that, on average, an employee consumes X dollars of coffee a year, then you can project a coffee budget based on that. And you can compare it to an industry metric to understand where you fall: do you spend more on coffee than other companies or less? Why might this be? If you are a coffee supply manager, that seems like a useful thought process. The same seems to hold true for developers. Now, on to the idea that I need to compare "apples to apples" and only look at other shops that are in the same place geographically, the same business, the same application architecture, and the same development frameworks. I guess if I could find such a statistic that said "a shop that is exactly identical to yours spends X on developer tooling" it would be wonderful. But there is plenty of value in an average statistic. Here's an analogy: let's say you are working on a household budget and need to decide how much to spend on groceries. Is it enough to know that the average consumer spends 15% on groceries and therefore decide that you will budget exactly 15%? No. You have to tweak your budget based on your individual needs and situation. But the generalized statistic does help in this evaluation. You can know if your budget is grossly off from what others are doing, and this can help you figure out why this is. So, I will concede the point that it would be better to find statistics that align to my shop, though I think any statistics I could find would be useful for what I'm doing. In that light, let's say that my shop is mostly focused on ASP.NET web applications. That doesn't map perfectly to reality because large enterprises have very heterogenous IT environments. But if I was going to pick one technology that is our focus that would be it. But, if you were to point me at some statistics that are related to a Linux shop doing embedded Java applications, I would still find it useful as a point of comparison. SUMMARY: Let me try to rephrase my question. I'm trying to find industry metrics on how much dev shops spend on developer tooling, both hardware and software. I don't so much care whether it is expressed as a percentage of total budget or as X dollars per dev or as Y percentage of salary. Any metric would be useful. If there are metrics that are specific to ASP.NET dev shops in the Northeast US, all the better, but I would be happy to find anything.

Read the article
SQL SERVER – 5 Tips for Improving Your Data with expressor Studio

- by pinaldave

It’s no secret that bad data leads to bad decisions and poor results. However, how do you prevent dirty data from taking up residency in your data store? Some might argue that it’s the responsibility of the person sending you the data. While that may be true, in practice that will rarely hold up. It doesn’t matter how many times you ask, you will get the data however they decide to provide it. So now you have bad data. What constitutes bad data? There are quite a few valid answers, for example: Invalid date values Inappropriate characters Wrong data Values that exceed a pre-set threshold While it is certainly possible to write your own scripts and custom SQL to identify and deal with these data anomalies, that effort often takes too long and becomes difficult to maintain. Instead, leveraging an ETL tool like expressor Studio makes the data cleansing process much easier and faster. Below are some tips for leveraging expressor to get your data into tip-top shape. Tip 1: Build reusable data objects with embedded cleansing rules One of the new features in expressor Studio 3.2 is the ability to define constraints at the metadata level. Using expressor’s concept of Semantic Types, you can define reusable data objects that have embedded logic such as constraints for dealing with dirty data. Once defined, they can be saved as a shared atomic type and then re-applied to other data attributes in other schemas. As you can see in the figure above, I’ve defined a constraint on zip code. I can then save the constraint rules I defined for zip code as a shared atomic type called zip_type for example. The next time I get a different data source with a schema that also contains a zip code field, I can simply apply the shared atomic type (shown below) and the previously defined constraints will be automatically applied. Tip 2: Unlock the power of regular expressions in Semantic Types Another powerful feature introduced in expressor Studio 3.2 is the option to use regular expressions as a constraint. A regular expression is used to identify patterns within data. The patterns could be something as simple as a date format or something much more complex such as a street address. For example, I could define that a valid IP address should be made up of 4 numbers, each 0 to 255, and separated by a period. So 192.168.23.123 might be a valid IP address whereas 888.777.0.123 would not be. How can I account for this using regular expressions? A very simple regular expression that would look for any 4 sets of 3 digits separated by a period would be: ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ Alternatively, the following would be the exact check for truly valid IP addresses as we had defined above: ^(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])$ . In expressor, we would enter this regular expression as a constraint like this: Here we select the corrective action to be ‘Escalate’, meaning that the expressor Dataflow operator will decide what to do. Some of the options include rejecting the offending record, skipping it, or aborting the dataflow. Tip 3: Email pattern expressions that might come in handy In the example schema that I am using, there’s a field for email. Email addresses are often entered incorrectly because people are trying to avoid spam. While there are a lot of different ways to define what constitutes a valid email address, a quick search online yields a couple of really useful regular expressions for validating email addresses: This one is short and sweet: \b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b (Source: http://www.regular-expressions.info/) This one is more specific about which characters are allowed: ^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$ (Source: http://regexlib.com/REDetails.aspx?regexp_id=26 ) Tip 4: Reject “dirty data” for analysis or further processing Yet another feature introduced in expressor Studio 3.2 is the ability to reject records based on constraint violations. To capture reject records on input, simply specify Reject Record in the Error Handling setting for the Read File operator. Then attach a Write File operator to the reject port of the Read File operator as such: Next, in the Write File operator, you can configure the expressor operator in a similar way to the Read File. The key difference would be that the schema needs to be derived from the upstream operator as shown below: Once configured, expressor will output rejected records to the file you specified. In addition to the rejected records, expressor also captures some diagnostic information that will be helpful towards identifying why the record was rejected. This makes diagnosing errors much easier! Tip 5: Use a Filter or Transform after the initial cleansing to finish the job Sometimes you may want to predicate the data cleansing on a more complex set of conditions. For example, I may only be interested in processing data containing males over the age of 25 in certain zip codes. Using an expressor Filter operator, you can define the conditional logic which isolates the records of importance away from the others. Alternatively, the expressor Transform operator can be used to alter the input value via a user defined algorithm or transformation. It also supports the use of conditional logic and data can be rejected based on constraint violations. However, the best tip I can leave you with is to not constrain your solution design approach – expressor operators can be combined in many different ways to achieve the desired results. For example, in the expressor Dataflow below, I can post-process the reject data from the Filter which did not meet my pre-defined criteria and, if successful, Funnel it back into the flow so that it gets written to the target table. I continue to be impressed that expressor offers all this functionality as part of their FREE expressor Studio desktop ETL tool, which you can download from here. Their Studio ETL tool is absolutely free and they are very open about saying that if you want to deploy their software on a dedicated Windows Server, you need to purchase their server software, whose pricing is posted on their website. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
Migrating SQL Server Databases – The DBA’s Checklist (Part 1)

- by Sadequl Hussain

It is a fact of life: SQL Server databases change homes. They move from one instance to another, from one version to the next, from old servers to new ones. They move around as an organisation’s data grows, applications are enhanced or new versions of the database software are released. If not anything else, servers become old and unreliable and databases eventually need to find a new home. Consider the following scenarios: 1. A new database application is rolled out in a production server from the development or test environment 2. A copy of the production database needs to be installed in a test server for troubleshooting purposes 3. A copy of the development database is regularly refreshed in a test server during the system development life cycle 4. A SQL Server is upgraded to a newer version. This can be an in-place upgrade or a side-by-side migration 5. One or more databases need to be moved between different instances as part of a consolidation strategy. The instances can be running the same or different version of SQL Server 6. A database has to be restored from a backup file provided by a third party application vendor 7. A backup of the database is restored in the same or different instance for disaster recovery 8. A database needs to be migrated within the same instance: a. Files are moved from direct attached storage to storage area network b. The same database is copied under a different name for another application Migrating SQL Server database applications is a complex topic in itself. There are a number of components that can be involved: jobs, DTS or SSIS packages, logins or linked servers are only few pieces of the puzzle. However, in this article we will focus only on the central part of migration: the installation of the database itself. Unless it is an in-place upgrade, typically the database is taken from a source server and installed in a destination instance. Most of the time, a full backup file is used for the rollout. The backup file is either provided to the DBA or the DBA takes the backup and restores it in the target server. Sometimes the database is detached from the source and the files are copied to and attached in the destination. Regardless of the method of copying, moving, refreshing, restoring or upgrading the physical database, there are a number of steps the DBA should follow before and after it has been installed in the destination. It is these post database installation steps we are going to discuss below. Some of these steps apply in almost every scenario described above while some will depend on the type of objects contained within the database. Also, the principles hold regardless of the number of databases involved. Step 1: Make a copy of data and log files when attaching and detaching When detaching and attaching databases, ensure you have made copies of the data and log files if the destination is running a newer version of SQL Server. This is because once attached to a newer version, the database cannot be detached and attached back to an older version. Trying to do so will give you a message like the following: Server: Msg 602, Level 21, State 50, Line 1 Could not find row in sysindexes for database ID 6, object ID 1, index ID 1. Run DBCC CHECKTABLE on sysindexes. Connection Broken If you try to backup the attached database and restore it in the source, it will still fail. Similarly, if you are restoring the database in a newer version, it cannot be backed up or detached and put back in an older version of SQL. Unlike detach and attach method though, you do not lose the backup file or the original database here. When detaching and attaching a database, it is important you keep all the log files available along with the data files. It is possible to attach a database without a log file and SQL Server can be instructed to create a new log file, however this does not work if the database was detached when the primary file group was read-only. You will need all the log files in such cases. Step 2: Change database compatibility level Once the database has been restored or attached to a newer version of SQL Server, change the database compatibility level to reflect the newer version unless there is a compelling reason not to do so. When attaching or restoring from a previous version of SQL, the database retains the older version’s compatibility level. The only time you would want to keep a database with an older compatibility level is when the code within your database is no longer supported by SQL Server. For example, outer joins with *= or the =* operators were still possible in SQL 2000 (with a warning message), but not in SQL 2005 anymore. If your stored procedures or triggers are using this form of join, you would want to keep the database with an older compatibility level. For a list of compatibility issues between older and newer versions of SQL Server databases, refer to the Books Online under the sp_dbcmptlevel topic. Application developers and architects can help you in deciding whether you should change the compatibility level or not. You can always change the compatibility mode from the newest to an older version if necessary. To change the compatibility level, you can either use the database’s property from the SQL Server Management Studio or use the sp_dbcmptlevel stored procedure. Bear in mind that you cannot run the built-in reports for databases from SQL Server Management Studio if you keep the database with an older compatibility level. The following figure shows the error message I received when trying to run the “Disk Usage by Top Tables” report against a database. This database was hosted in a SQL Server 2005 system and still had a compatibility mode 80 (SQL 2000). Continues…

Read the article
Finding a person in the forest

- by PointsToShare

© 2011 By: Dov Trietsch. All rights reserved finding a person in the forest or Limiting the AD result in SharePoint People Picker There are times when we need to limit the SharePoint audience of certain farms or servers or site collections to a particular audience. One of my experiences involved limiting access to US citizens, another to a particular location. Now, most of us – your humble servant included – are not Active Directory experts – but we must be able to handle the “audience restrictions” as required. So here is how it’s done in a nutshell. Important note. Not all could be done in PowerShell (at least not yet)! There are no Windows PowerShell commands to configure People Picker. The stsadm command is: stsadm -o setproperty -pn peoplepicker-searchadcustomquery -pv ADQuery –url http://somethingOrOther Note the long-hyphenated property name. Now to filling the ADQuery. LDAP Query in a nutshell Syntax LDAP is no older than SQL and an LDAP query is actually a query against the LDAP Database. LDAP attributes are the equivalent of Database columns, so why do we have to learn a new query language? Beats me! But we must, so here it is. The syntax of an LDAP query string is made of individual statements with relational operators including: = Equal <= Lower than or equal >= Greater than or equal… and memberOf – a group membership. ! Not * Wildcard Equal and memberOf are the most commonly used. Checking for absence uses the ! – not and the * - wildcard Example: (SN=Grant) All whose last name – SurName – is Grant Example: (!(SN=Grant)) All except Grant Example: (!(SN=*)) all where there is no SurName i.e SurName is absent (probably Rappers). Example: (CN=MyGroup) Common Name is MyGroup. Example: (GN=J*) all the Given Names that start with J (JJ, Jane, Jon, John, etc.) The cryptic SN, CN, GN, etc. are attributes and more about them later All the queries are enclosed in parentheses (Query). Complex queries are comprised of sets that are in AND or OR conditions. AND is denoted by the ampersand (&) and the OR is denoted by the vertical pipe (|). The general syntax is that of the Prefix polish notation where the operand precedes the variables. E.g +ab is the sum of a and b. In an LDAP query (&(A)(B)) will garner the objects for which both A and B are true. In an LDAP query (&(A)(B)(C)) will garner the objects for which A, B and C are true. There’s no limit to the number of conditions. In an LDAP query (|(A)(B)) will garner the objects for which either A or B are true. In an LDAP query (|(A)(B)(C)) will garner the objects for which at least one of A, B and C is true. There’s no limit to the number of conditions. More complex queries have both types of conditions and the parentheses determine the order of operations. Attributes Now let’s get into the SN, CN, GN, and other attributes of the query SN – is the SurName (last name) GN – is the Given Name (first name) CN – is the Common Name, usually GN followed by SN OU – is an Organization Unit such as division, department etc. DC – is a Domain Content in the AD forest l – lower case ‘L’ stands for location. Jerusalem anybody? Or Katmandu. UPN – User Principal Name, is usually the first part of an email address. By nature it is unique in the forest. Most systems set the UPN to be the first initial followed by the SN of the person involved. Some limit the total to 8 characters. If we have many ‘jsmith’ we have to somehow distinguish them from each other. DN – is the distinguished name – a name unique to AD forest in which it lives. Usually it’s a CN with some domain or group distinguishers. DN is important in conjunction with the memberOf relation. Groups have stricter requirement. Each group has to have a unique name - its CN and it has to be unique regardless of its place. See more below. All of the attributes are case insensitive. CN, cn, Cn, and cN are identical. objectCategory is an element that requires special consideration. AD contains many different object like computers, printers, and of course people and groups. In the queries below, we’re limiting our search to people (person). Putting it altogether Let’s get a list of all the Johns in the SPAdmin group of the Jerusalem that local domain. (&(objectCategory=person)(memberOf=cn=SPAdmin,ou=Jerusalem,dc=local)) The memberOf=cn=SPAdmin uses the cn (Common Name) of the SPAdmin group. This is how the memberOf relation is used. ‘SPAdmin’ is actually the DN of the group. Also the memberOf relation does not allow wild cards (*) in the group name. Also, you are limited to at most one ‘OU’ entry. Let’s add Marvin Minsky to the search above. |(&(objectCategory=person)(memberOf=cn=SPAdmin,ou=Jerusalem,dc=local))(CN=Marvin Minsky) Here I added the or pipeline at the beginning of the query and put the CN requirement for Minsky at the end. Note that if Marvin was already in the prior result, he’s not going to be listed twice. One last note: You may see a dryer but more complete list of attributes rules and examples in: http://www.tek-tips.com/faqs.cfm?fid=5667 And finally (thus negating the claim that my previous note was last), to the best of my knowledge there are 3 more ways to limit the audience. One is to use the peoplepicker-searchadcustomfilter property using the same ADQuery. This works only in SP1 and above. The second is to limit the search to users within this particular site collection – the property name is peoplepicker-onlysearchwithinsitecollection and the value is yes (-pv yes) And the third is –pn peoplepicker-serviceaccountdirectorypaths –pv “OU=ou1,DC=dc1…..” Again you are limited to at most one ‘OU’ phrase – no OU=ou1,OU=ou2… And now the real end. The main property discussed in this sprawling and seemingly endless monogram – peoplepicker-searchadcustomquery - is the most general way of getting the job done. Here are a few examples of command lines that worked and some that didn’t. Can you see why? C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\BIN>stsa dm -o setproperty -url http://somethingOrOther -pn peoplepicker-searchadcustomfi lter -pv (Title=David) Operation completed successfully. C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\BIN>stsa dm -o setproperty -url http://somethingOrOther -pn peoplepicker-searchadcustomfi lter -pv (!Title=David) Operation completed successfully. C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\BIN>stsa dm -o setproperty -url http://somethingOrOther -pn peoplepicker-searchadcustomfi lter -pv (OU=OURealName,OU=OUMid,OU=OUTop,DC=TopDC,DC=MidDC,DC=BottomDC) Command line error. Too many OUs C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\BIN>stsa dm -o setproperty -url http://somethingOrOther -pn peoplepicker-searchadcustomfi lter -pv (OU=OURealName) Operation completed successfully. C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\BIN>stsa dm -o setproperty -url http://somethingOrOther -pn peoplepicker-searchadcustomfi lter -pv (DC=TopDC,DC=MidDC,DC=BottomDC) Operation completed successfully. C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\BIN>stsa dm -o setproperty -url http://somethingOrOther -pn peoplepicker-searchadcustomfi lter -pv (OU=OURealName,DC=TopDC,DC=MidDC,DC=BottomDC) Operation completed successfully. That’s all folks!

Read the article
CodePlex Daily Summary for Monday, May 03, 2010

CodePlex Daily Summary for Monday, May 03, 2010New Projects.radiko: エアログラス採用のシンプルなradiko(http://radiko.jp/)クライアントです。タスクトレイのアイコンからラジオ局の切り替えができます。7Scale: EmptyB2C MVC Plattform: The B2C MVC Plattform aims to be pluggable site framework to help small busisness accomplish basic tasks between business and customers.ElValWeb: The goal of the project to create full featured implementation of ModelValidatorProvider for Enterprise Library Application Validation Block, wich ...esatis yazilimi: asp.net yazılımı ile satış magazasi websitesi kur.IEnumerable.It sample code: IEnumerable.It sample codejQuery MicroAjax for ASP.NET: MicroAjax is a set of jQuery plugins and .NET components designed to provide simple, powerful and efficient Ajax centric web application design pat...Karbon VOS: Karbon VOS is an advanced Virtual Operating System Template for Visual Basic Express. It's developed in Visual Basic. Karbon VOS hopes to one day b...LINQ Mapper: LINQ Mapper translates simple LINQ queries between different sources. It allows you to write queries against your domain model, but have them run ...Meccano Silverlight Framework: Meccano is a new generation of frameworks for creation of LOB Silverlight applications based on MEF, RX, WCF, ADO.NET Data Services etc. It is inte...Multiuse Model View (MMV) Library: This project is an open source library for the Multiuse Model View (MMV) pattern for building robust WPF and ASP.Net applications. Visit my blog ht...Process Affinity Control: Process Affinity Control allows to set the affinity masks of processes based on rules.SilverSpatial: This project helps bridge the gap between Silverlight and Geo-Spatial data type (such as SQL Spatial). It implements the Well-Known-Binary (WKB) fo...StageAssets: Application for storing data about "things" and people in theatre. For example equipment, actors and so on.Stratosphere: Mono compatible library with set of primitives to work with scalable table, queue and block containers with corresponding implementations for Amazo...TRX Web-Viewer: A simple web-based application to upload and view VSTS 2008 and VSTS 2010 test result files with some basic lookup and feature-wise management of r...WDT2: WDT 2 is the school project to begin learning .NET enviroment, The main focus is on learning the use of almost all the componenets.WPF Behavior Library: WPF Behavior Library is a set of additional actions for WPF that allow you to add extra behaviors to a control quickly and easily. Currently the on...YouTubeEmbeddedVideo WebControl for ASP.NET: A Control to embed YouTube videos in ASP.NET pages. Works in C# and VB.NETNew Releases.radiko: beta: 東京局のみ対応あとは手抜きActiveWorlds Managed .NET SDK: AwManaged Technology Preview - WIN32 (Alpha): This WIN32 release contains the Server Console Application. The Setup executable should be run as administrator on O.S. using UAC (Vista/Win7)AJAX Control Framework: v1.0.1.0: v1.0.1.0 - Contains a Bing Maps sample project, a number of bug fixes and a few performance improvements. - AJAX enable ANY custom control that der...App_Code (and Usercontrol) Editor (ACE): v1.0.0 alpha: The first alpha release of the AppCode Editor for Umbraco 4.0.3 is now available to download! Tested to work with usercontrols - pre-compilation wi...ElValWeb: ElValWeb 0.0.1.0: Version 0.0.1.0 contains client validation support forAndCompositeValidator ContainsCharactersValidator DomainValidator NotNullValidator Or...esatis yazilimi: magaza: magazanın yazılımları ve veri tabanının yazılımlarıGrunty OS: Grunty OS USB: Download Grunty OS for USBGrunty OS: Grunty OS.ISO: Grunty OS ISOKarbon VOS: Milestone 1 (Kaptua): Milestone 1...Live Meeting API Wrapper: LiveMeetingAPIWrapperV1.2: Added get meeting and update meeting.Multiuse Model View (MMV) Library: v0.3: first alpha release. Medium amount of functionality and some use cases tested.MVC Foolproof Validation: Beta 0.9.3774: Adds resource provided error messages, regular expression operators and a new RegularExpressionIf attribute.Process Affinity Control: Version 1.0.0: This is the first release. Planned features for the next release: No administrative privileges needed to run the manager Select the active scena...SharePoint 2010 Service Manager: SharePoint 2010 Service Manager 1.1: Added support to run under UAC with automatic security elevationSharePoint Event Handler Manager: Event Handler Manager 2.0: Please download the application here: http://www.ackermantech.com/registerevents.aspxSkyDrive Synchronizer: SkyDrive Sync Beta 0.1: Beta release includes: Upload and download Synchronize updated files Delete files on web/locally if not in source Split larger files into sma...Stratosphere: Stratosphere 1.0.0.0: Initial beta releaseSuggested Resources for .NET Developers: 0.8.0.0 VS2010 - focus on displaying content: This is the first release of Suggested Resources that can be downloaded from the internet. While there is still a lot of work to be done this rele...TRX Web-Viewer: TRX Web-Viewer V1.0: First working versionVCC: Latest build, v2.1.30502.0: Automatic drop of latest buildWatchersNET.TagCloud: WatchersNET.TagCloud 01.04.00: !Whats New New Tag Mode: Search Referrers (Shows Search Tags From Google, Ask, Bing, Yahoo and the Dnn Site Search) Taxonomy Tags now contains L...Web/Cloud Applications Development Framework | Visual WebGui: 6.4 Beta 2e: Fully featured beta version of Visual WebGui Web/Cloud Applicaiton Development FrameworkWPF Behavior Library: WPF Behavior Library 0.1 Release: First alpha release of the WPF Behavior Library. It should be stable but doesn't have all of the features it will have in the future and the API ma...xvanneste: Sharepoint Social Network Client: Client permettant d'avoir accés au social network de sharepoint a l'exterieur du navigateur.Most Popular ProjectsRawrWBFS ManagerAJAX Control Toolkitpatterns & practices – Enterprise LibraryMicrosoft SQL Server Product Samples: DatabaseSilverlight ToolkitWindows Presentation Foundation (WPF)iTuner - The iTunes CompanionASP.NETDotNetNuke® Community EditionMost Active ProjectsIonics Isapi Rewrite Filterpatterns & practices – Enterprise LibraryRawrHydroServer - CUAHSI Hydrologic Information System ServerAJAX Control Frameworkpatterns & practices: Azure Security GuidanceTinyProjectBlogEngine.NETNB_Store - Free DotNetNuke Ecommerce Catalog ModuleDambach Linear Algebra Framework

Read the article
When OneTug Just Isn’t Enough…

- by onefloridacoder

I stole that from the back of a T-shirt I saw at the Orlando Code Camp 2010. This was my first code camp and my first time volunteering for an event like this as well. It was an awesome day. I cannot begin to count the “aaahh”, “I did-not-know I could do that”, in the crowds and for myself. I think it was a great day of learning for everyone at all levels. All of the presenters were different and provided great insights into the topics they were presenting. Here’s a list of the ones that I attended. KodeFuGuru, “Pirates vs. Ninjas” He touched on many good topics to relax some of the ways we think when we are writing out code, and still looks good, readable, etc. As he pointed out in all of his examples, we might not always realize everything that’s going on under the covers. He exposed a bug in his own code, and verbalized the mental gymnastics he went through when he knew there was something wrong with one of his IEnumerable implementations. For me, it was great to hear that someone else labors over these gut reactions to code quickly snapped together, to the point that we rush to the refactor stage to fix what’s bothering us – and learn. He has some content on extension methods that was very interesting. My “that is so cool” moment was when he swapped out AddEntity method on an entity class and used a With extension method instead. Some of the LINQ scales fell off my eyes at that moment, and I realized my own code could be a lot more powerful (and readable) if incorporate a few of these examples at the appropriate times. And he cautioned as well… “don’t go crazy with this stuff”, there’s a place and time for everything. One of his examples demo’d toward the end of the talk is on his sight where he’s chaining methods together, cool stuff. Quotes I liked: “Extension Methods - Extension methods to put features back on the model type, without impacting the type.” “Favor Declarative Code” – Check out the ? and ?? operators if you’re not already using them. “Favor Fluent Code” “Avoid Pirate Ninja Zombies! If you see one run!” I’m definitely going to be looking at “Extract Projection” when I get into VS2010. BDD 101 – Sean Chambers http://github.com/schambers This guy had a whole host of gremlins against him, final score Sean 5, Gremlins 1. He ran the code samples from his github repo in the code github code viewer since the PC they school gave him to use didn’t have VS installed. He did a great job of converting the grammar between BDD and TDD, and how this style of development can be used in integration tests as well as the different types of gated builds on a CI box – he didn’t go into a discussion around CI, but we could infer that it could work. Like when we use WSSF, it does cause a class explosion to happen however the amount of code per class it limit to just covering the concern at hand – no more, no less. As in “When I as a <Role>, expect {something} to happen, because {}” This keeps us (the developer) from gold plating our solutions and creating less waste. He basically keeps the code that prove out the requirement to two lines of code. Nice. He uses SpecUnit to merge this grammar into his .NET projects and gave an overview on how this ties into writing his own BDD tests. Some folks were familiar with Given / When / Then as story acceptance criteria and here’s how he mapped it: “Given <Context> When <Something Happens> Then <I expect...>” There are a few base classes and overrides in the SpecUnit framework that help with setting up the context for each test which looked very handy. Successfully Running Your Own Coding Business The speaker ran through a list of items that sounded like common sense stuff LLC, banking, separating expenses, etc. Then moved into role playing with business owners and an ISV. That was pretty good stuff, it pays to be a good listener all of the time even if your client is sitting on the other side of the phone tearing you head off for you – but that’s all it is, and get used to it its par for the course. Oh, yeah always answer the phone was one simple thing that you can do to move your business forward. But like Cory Foy tweeted this week, “If you owe me a lot of money, don’t have a message that says your away for five weeks skiing in Colorado.” Lots of food for thought that’s on my list of “todo’s and to-don’ts”. Speaker Idol Next, I had the pleasure of helping Russ Fustino tape this part of Code Camp as my primary volunteer opportunity that day. You remember Russ, “know the code” from the awesome Russ’ Tool Shed series. He did a great job orchestrating and capturing the Speaker Idol finals. So I didn’t actually miss any sessions, but was able to see three back to back in one setting. The idol finalists gave a 10 minute talk and very deep subjects, but different styles of talks. No one walked away empty handed for jobs very well done. Russ has details on his site. The pictures and video captured is supposed to be published on Channel 9 at a later date. It was also a valuable experience to see what makes technical speakers effective in their talks. I picked up quite a few speaking tips from what I heard from the judges and contestants. Design For Developers – Diane Leeper If you are a great developer, you’re probably a lousy designer. Diane didn’t come to poke holes in what we think we can do with UI layout and design, but she provided some tools we can use to figure out metaphors for visualizing data. If you need help with that check out Silverlight Pivot – that’s what she was getting at. I was first introduced to her at one of John Papa’s talks last year at a Lakeland User Group meeting and she’s very passionate about design. She was able to discuss different elements of Pivot, while to a developer is just looked cool. I believe she was providing the deck from her talk to folks after her talk, so send her an email if you’re interested. She says she can talk about design for hours and hours – we all left that session believing her. Rinse and Repeat Orlando Code Camp 2010 was awesome, and would totally do it again. There were lots of folks from my shop there, and some that have left my shop to go elsewhere. So it was a reunion of sorts and a great celebration for the simple fact that its great to be a developer and there’s a community that supports and recognizes it as well. The sponsors were generous and the organizers were very tired, namely Esteban Garcia and Will Strohl who were responsible for making a lot of this magic happen. And if you don’t believe me, check out the chatter on Twitter.

Read the article
PostSharp, Obfuscation, and IL

- by Simon Cooper

Aspect-oriented programming (AOP) is a relatively new programming paradigm. Originating at Xerox PARC in 1994, the paradigm was first made available for general-purpose development as an extension to Java in 2001. From there, it has quickly been adapted for use in all the common languages used today. In the .NET world, one of the primary AOP toolkits is PostSharp. Attributes and AOP Normally, attributes in .NET are entirely a metadata construct. Apart from a few special attributes in the .NET framework, they have no effect whatsoever on how a class or method executes within the CLR. Only by using reflection at runtime can you access any attributes declared on a type or type member. PostSharp changes this. By declaring a custom attribute that derives from PostSharp.Aspects.Aspect, applying it to types and type members, and running the resulting assembly through the PostSharp postprocessor, you can essentially declare 'clever' attributes that change the behaviour of whatever the aspect has been applied to at runtime. A simple example of this is logging. By declaring a TraceAttribute that derives from OnMethodBoundaryAspect, you can automatically log when a method has been executed: public class TraceAttribute : PostSharp.Aspects.OnMethodBoundaryAspect { public override void OnEntry(MethodExecutionArgs args) { MethodBase method = args.Method; System.Diagnostics.Trace.WriteLine( String.Format( "Entering {0}.{1}.", method.DeclaringType.FullName, method.Name)); } public override void OnExit(MethodExecutionArgs args) { MethodBase method = args.Method; System.Diagnostics.Trace.WriteLine( String.Format( "Leaving {0}.{1}.", method.DeclaringType.FullName, method.Name)); } } [Trace] public void MethodToLog() { ... } Now, whenever MethodToLog is executed, the aspect will automatically log entry and exit, without having to add the logging code to MethodToLog itself. PostSharp Performance Now this does introduce a performance overhead - as you can see, the aspect allows access to the MethodBase of the method the aspect has been applied to. If you were limited to C#, you would be forced to retrieve each MethodBase instance using Type.GetMethod(), matching on the method name and signature. This is slow. Fortunately, PostSharp is not limited to C#. It can use any instruction available in IL. And in IL, you can do some very neat things. Ldtoken C# allows you to get the Type object corresponding to a specific type name using the typeof operator: Type t = typeof(Random); The C# compiler compiles this operator to the following IL: ldtoken [mscorlib]System.Random call class [mscorlib]System.Type [mscorlib]System.Type::GetTypeFromHandle( valuetype [mscorlib]System.RuntimeTypeHandle) The ldtoken instruction obtains a special handle to a type called a RuntimeTypeHandle, and from that, the Type object can be obtained using GetTypeFromHandle. These are both relatively fast operations - no string lookup is required, only direct assembly and CLR constructs are used. However, a little-known feature is that ldtoken is not just limited to types; it can also get information on methods and fields, encapsulated in a RuntimeMethodHandle or RuntimeFieldHandle: // get a MethodBase for String.EndsWith(string) ldtoken method instance bool [mscorlib]System.String::EndsWith(string) call class [mscorlib]System.Reflection.MethodBase [mscorlib]System.Reflection.MethodBase::GetMethodFromHandle( valuetype [mscorlib]System.RuntimeMethodHandle) // get a FieldInfo for the String.Empty field ldtoken field string [mscorlib]System.String::Empty call class [mscorlib]System.Reflection.FieldInfo [mscorlib]System.Reflection.FieldInfo::GetFieldFromHandle( valuetype [mscorlib]System.RuntimeFieldHandle) These usages of ldtoken aren't usable from C# or VB, and aren't likely to be added anytime soon (Eric Lippert's done a blog post on the possibility of adding infoof, methodof or fieldof operators to C#). However, PostSharp deals directly with IL, and so can use ldtoken to get MethodBase objects quickly and cheaply, without having to resort to string lookups. The kicker However, there are problems. Because ldtoken for methods or fields isn't accessible from C# or VB, it hasn't been as well-tested as ldtoken for types. This has resulted in various obscure bugs in most versions of the CLR when dealing with ldtoken and methods, and specifically, generic methods and methods of generic types. This means that PostSharp was behaving incorrectly, or just plain crashing, when aspects were applied to methods that were generic in some way. So, PostSharp has to work around this. Without using the metadata tokens directly, the only way to get the MethodBase of generic methods is to use reflection: Type.GetMethod(), passing in the method name as a string along with information on the signature. Now, this works fine. It's slower than using ldtoken directly, but it works, and this only has to be done for generic methods. Unfortunately, this poses problems when the assembly is obfuscated. PostSharp and Obfuscation When using ldtoken, obfuscators don't affect how PostSharp operates. Because the ldtoken instruction directly references the type, method or field within the assembly, it is unaffected if the name of the object is changed by an obfuscator. However, the indirect loading used for generic methods was breaking, because that uses the name of the method when the assembly is put through the PostSharp postprocessor to lookup the MethodBase at runtime. If the name then changes, PostSharp can't find it anymore, and the assembly breaks. So, PostSharp needs to know about any changes an obfuscator does to an assembly. The way PostSharp does this is by adding another layer of indirection. When PostSharp obfuscation support is enabled, it includes an extra 'name table' resource in the assembly, consisting of a series of method & type names. When PostSharp needs to lookup a method using reflection, instead of encoding the method name directly, it looks up the method name at a fixed offset inside that name table: MethodBase genericMethod = typeof(ContainingClass).GetMethod(GetNameAtIndex(22)); PostSharp.NameTable resource: ... 20: get_Prop1 21: set_Prop1 22: DoFoo 23: GetWibble When the assembly is later processed by an obfuscator, the obfuscator can replace all the method and type names within the name table with their new name. That way, the reflection lookups performed by PostSharp will now use the new names, and everything will work as expected: MethodBase genericMethod = typeof(#kGy).GetMethod(GetNameAtIndex(22)); PostSharp.NameTable resource: ... 20: #kkA 21: #zAb 22: #EF5a 23: #2tg As you can see, this requires direct support by an obfuscator in order to perform these rewrites. Dotfuscator supports it, and now, starting with SmartAssembly 6.6.4, SmartAssembly does too. So, a relatively simple solution to a tricky problem, with some CLR bugs thrown in for good measure. You don't see those every day!

Read the article
The SSIS tuning tip that everyone misses

- by Rob Farley

I know that everyone misses this, because I’m yet to find someone who doesn’t have a bit of an epiphany when I describe this. When tuning Data Flows in SQL Server Integration Services, people see the Data Flow as moving from the Source to the Destination, passing through a number of transformations. What people don’t consider is the Source, getting the data out of a database. Remember, the source of data for your Data Flow is not your Source Component. It’s wherever the data is, within your database, probably on a disk somewhere. You need to tune your query to optimise it for SSIS, and this is what most people fail to do. I’m not suggesting that people don’t tune their queries – there’s plenty of information out there about making sure that your queries run as fast as possible. But for SSIS, it’s not about how fast your query runs. Let me say that again, but in bolder text: The speed of an SSIS Source is not about how fast your query runs. If your query is used in a Source component for SSIS, the thing that matters is how fast it starts returning data. In particular, those first 10,000 rows to populate that first buffer, ready to pass down the rest of the transformations on its way to the Destination. Let’s look at a very simple query as an example, using the AdventureWorks database: We’re picking the different Weight values out of the Product table, and it’s doing this by scanning the table and doing a Sort. It’s a Distinct Sort, which means that the duplicates are discarded. It'll be no surprise to see that the data produced is sorted. Obvious, I know, but I'm making a comparison to what I'll do later. Before I explain the problem here, let me jump back into the SSIS world... If you’ve investigated how to tune an SSIS flow, then you’ll know that some SSIS Data Flow Transformations are known to be Blocking, some are Partially Blocking, and some are simply Row transformations. Take the SSIS Sort transformation, for example. I’m using a larger data set for this, because my small list of Weights won’t demonstrate it well enough. Seven buffers of data came out of the source, but none of them could be pushed past the Sort operator, just in case the last buffer contained the data that would be sorted into the first buffer. This is a blocking operation. Back in the land of T-SQL, we consider our Distinct Sort operator. It’s also blocking. It won’t let data through until it’s seen all of it. If you weren’t okay with blocking operations in SSIS, why would you be happy with them in an execution plan? The source of your data is not your OLE DB Source. Remember this. The source of your data is the NCIX/CIX/Heap from which it’s being pulled. Picture it like this... the data flowing from the Clustered Index, through the Distinct Sort operator, into the SELECT operator, where a series of SSIS Buffers are populated, flowing (as they get full) down through the SSIS transformations. Alright, I know that I’m taking some liberties here, because the two queries aren’t the same, but consider the visual. The data is flowing from your disk and through your execution plan before it reaches SSIS, so you could easily find that a blocking operation in your plan is just as painful as a blocking operation in your SSIS Data Flow. Luckily, T-SQL gives us a brilliant query hint to help avoid this. OPTION (FAST 10000) This hint means that it will choose a query which will optimise for the first 10,000 rows – the default SSIS buffer size. And the effect can be quite significant. First let’s consider a simple example, then we’ll look at a larger one. Consider our weights. We don’t have 10,000, so I’m going to use OPTION (FAST 1) instead. You’ll notice that the query is more expensive, using a Flow Distinct operator instead of the Distinct Sort. This operator is consuming 84% of the query, instead of the 59% we saw from the Distinct Sort. But the first row could be returned quicker – a Flow Distinct operator is non-blocking. The data here isn’t sorted, of course. It’s in the same order that it came out of the index, just with duplicates removed. As soon as a Flow Distinct sees a value that it hasn’t come across before, it pushes it out to the operator on its left. It still has to maintain the list of what it’s seen so far, but by handling it one row at a time, it can push rows through quicker. Overall, it’s a lot more work than the Distinct Sort, but if the priority is the first few rows, then perhaps that’s exactly what we want. The Query Optimizer seems to do this by optimising the query as if there were only one row coming through: This 1 row estimation is caused by the Query Optimizer imagining the SELECT operation saying “Give me one row” first, and this message being passed all the way along. The request might not make it all the way back to the source, but in my simple example, it does. I hope this simple example has helped you understand the significance of the blocking operator. Now I’m going to show you an example on a much larger data set. This data was fetching about 780,000 rows, and these are the Estimated Plans. The data needed to be Sorted, to support further SSIS operations that needed that. First, without the hint. ...and now with OPTION (FAST 10000): A very different plan, I’m sure you’ll agree. In case you’re curious, those arrows in the top one are 780,000 rows in size. In the second, they’re estimated to be 10,000, although the Actual figures end up being 780,000. The top one definitely runs faster. It finished several times faster than the second one. With the amount of data being considered, these numbers were in minutes. Look at the second one – it’s doing Nested Loops, across 780,000 rows! That’s not generally recommended at all. That’s “Go and make yourself a coffee” time. In this case, it was about six or seven minutes. The faster one finished in about a minute. But in SSIS-land, things are different. The particular data flow that was consuming this data was significant. It was being pumped into a Script Component to process each row based on previous rows, creating about a dozen different flows. The data flow would take roughly ten minutes to run – ten minutes from when the data first appeared. The query that completes faster – chosen by the Query Optimizer with no hints, based on accurate statistics (rather than pretending the numbers are smaller) – would take a minute to start getting the data into SSIS, at which point the ten-minute flow would start, taking eleven minutes to complete. The query that took longer – chosen by the Query Optimizer pretending it only wanted the first 10,000 rows – would take only ten seconds to fill the first buffer. Despite the fact that it might have taken the database another six or seven minutes to get the data out, SSIS didn’t care. Every time it wanted the next buffer of data, it was already available, and the whole process finished in about ten minutes and ten seconds. When debugging SSIS, you run the package, and sit there waiting to see the Debug information start appearing. You look for the numbers on the data flow, and seeing operators going Yellow and Green. Without the hint, I’d sit there for a minute. With the hint, just ten seconds. You can imagine which one I preferred. By adding this hint, it felt like a magic wand had been waved across the query, to make it run several times faster. It wasn’t the case at all – but it felt like it to SSIS.

Read the article
Refactoring Part 1 : Intuitive Investments

- by Wes McClure

Fear, it’s what turns maintaining applications into a nightmare. Technology moves on, teams move on, someone is left to operate the application, what was green is now perceived brown. Eventually the business will evolve and changes will need to be made. The approach to those changes often dictates the long term viability of the application. Fear of change, lack of passion and a lack of interest in understanding the domain often leads to a paranoia to do anything that doesn’t involve duct tape and bailing twine. Don’t get me wrong, those have a place in the short term viability of a project but they don’t have a place in the long term. Add to it “us versus them” in regards to the original team and those that maintain it, internal politics and other factors and you have a recipe for disaster. This results in code that quickly becomes unmanageable. Even the most clever of designs will eventually become sub optimal and debt will amount that exponentially makes changes difficult. This is where refactoring comes in, and it’s something I’m very passionate about. Refactoring is about improving the process whereby we make change, it’s an exponential investment in the process of change. Without it we will incur exponential complexity that halts productivity. Investments, especially in the long term, require intuition and reflection. How can we tackle new development effectively via evolving the original design and paying off debt that has been incurred? The longer we wait to ask and answer this question, the more it will cost us. Small requests don’t warrant big changes, but realizing when changes now will pay off in the long term, and especially in the short term, is valuable. I have done my fair share of maintaining applications and continuously refactoring as needed, but recently I’ve begun work on a project that hasn’t had much debt, if any, paid down in years. This is the first in a series of blog posts to try to capture the process which is largely driven by intuition of smaller refactorings from other projects. Signs that refactoring could help: Testability How can decreasing test time not pay dividends? One of the first things I found was that a very important piece often takes 30+ minutes to test. I can only imagine how much time this has cost historically, but more importantly the time it might cost in the coming weeks: I estimate at least 10-20 hours per person! This is simply unacceptable for almost any situation. As it turns out, about 6 hours of working with this part of the application and I was able to cut the time down to under 30 seconds! In less than the lost time of one week, I was able to fix the problem for all future weeks! If we can’t test fast then we can’t change fast, nor with confidence. Code is used by end users and it’s also used by developers, consider your own needs in terms of the code base. Adding logic to enable/disable features during testing can help decouple parts of an application and lead to massive improvements. What exactly is so wrong about test code in real code? Often, these become features for operators and sometimes end users. If you cannot run an integration test within a test runner in your IDE, it’s time to refactor. Readability Are variables named meaningfully via a ubiquitous language? Is the code segmented functionally or behaviorally so as to minimize the complexity of any one area? Are aspects properly segmented to avoid confusion (security, logging, transactions, translations, dependency management etc) Is the code declarative (what) or imperative (how)? What matters, not how. LINQ is a great abstraction of the what, not how, of collection manipulation. The Reactive framework is a great example of the what, not how, of managing streams of data. Are constants abstracted and named, or are they just inline? Do people constantly bitch about the code/design? If the code is hard to understand, it will be hard to change with confidence. It’s a large undertaking if the original designers didn’t pay much attention to readability and as such will never be done to “completion.” Make sure not to go over board, instead use this as you change an application, not in lieu of changes (like with testability). Complexity Simplicity will never be achieved, it’s highly subjective. That said, a lot of code can be significantly simplified, tidy it up as you go. Refactoring will often converge upon a simplification step after enough time, keep an eye out for this. Understandability In the process of changing code, one often gains a better understanding of it. Refactoring code is a good way to learn how it works. However, it’s usually best in combination with other reasons, in effect killing two birds with one stone. Often this is done when readability is poor, in which case understandability is usually poor as well. In the large undertaking we are making with this legacy application, we will be replacing it. Therefore, understanding all of its features is important and this refactoring technique will come in very handy. Unused code How can deleting things not help? This is a freebie in refactoring, it’s very easy to detect with modern tools, especially in statically typed languages. We have VCS for a reason, if in doubt, delete it out (ok that was cheesy)! If you don’t know where to start when refactoring, this is an excellent starting point! Duplication Do not pray and sacrifice to the anti-duplication gods, there are excellent examples where consolidated code is a horrible idea, usually with divergent domains. That said, mediocre developers live by copy/paste. Other times features converge and aren’t combined. Tools for finding similar code are great in the example of copy/paste problems. Knowledge of the domain helps identify convergent concepts that often lead to convergent solutions and will give intuition for where to look for conceptual repetition. 80/20 and the Boy Scouts It’s often said that 80% of the time 20% of the application is used most. These tend to be the parts that are changed. There are also parts of the code where 80% of the time is spent changing 20% (probably for all the refactoring smells above). I focus on these areas any time I make a change and follow the philosophy of the Boy Scout in cleaning up more than I messed up. If I spend 2 hours changing an application, in the 20%, I’ll always spend at least 15 minutes cleaning it or nearby areas. This gives a huge productivity edge on developers that don’t. Ironically after a short period of time the 20% shrinks enough that we don’t have to spend 80% of our time there and can move on to other areas. Refactoring is highly subjective, never attempt to refactor to completion! Learn to be comfortable with leaving one part of the application in a better state than others. It’s an evolution, not a revolution. These are some simple areas to look into when making changes and can help get one started in the process. I’ve often found that refactoring is a convergent process towards simplicity that sometimes spans a few hours but often can lead to massive simplifications over the timespan of weeks and months of regular development.

Read the article

Search Results

Search found 2896 results on 116 pages for 'comparison operators'.

Page 105/116 | < Previous Page | 101 102 103 104 105 106 107 108 109 110 111 112 | Next Page >

- by jhenning

- by Most Valuable Yak (Rob Volk)

- by Nikita Polyakov

- by Lance Shaw

- by Dane Morgridge

- by Dave

- by Janice J. Heiss

- by cibrax

- by Simon Cooper

- by Jesse

- by Simon Cooper

- by alanc

- by Baek Seungjoo

- by user466441

- by alanc

- by RationalGeek

- by pinaldave

- by Sadequl Hussain

- by PointsToShare

- by onefloridacoder

- by Simon Cooper

- by Rob Farley

- by Wes McClure

< Previous Page | 101 102 103 104 105 106 107 108 109 110 111 112 | Next Page >