Search Results

Search found 7805 results on 313 pages for 'high voltage'.

Page 60/313 | < Previous Page | 56 57 58 59 60 61 62 63 64 65 66 67 | Next Page >

What instance type should I choose?

- by cute

To run video chat service? High-Memory or High-CPU? And, guess Standard isn't suitable?

Read the article
A way of doing real-world test-driven development (and some thoughts about it)

- by Thomas Weller

Lately, I exchanged some arguments with Derick Bailey about some details of the red-green-refactor cycle of the Test-driven development process. In short, the issue revolved around the fact that it’s not enough to have a test red or green, but it’s also important to have it red or green for the right reasons. While for me, it’s sufficient to initially have a NotImplementedException in place, Derick argues that this is not totally correct (see these two posts: Red/Green/Refactor, For The Right Reasons and Red For The Right Reason: Fail By Assertion, Not By Anything Else). And he’s right. But on the other hand, I had no idea how his insights could have any practical consequence for my own individual interpretation of the red-green-refactor cycle (which is not really red-green-refactor, at least not in its pure sense, see the rest of this article). This made me think deeply for some days now. In the end I found out that the ‘right reason’ changes in my understanding depending on what development phase I’m in. To make this clear (at least I hope it becomes clear…) I started to describe my way of working in some detail, and then something strange happened: The scope of the article slightly shifted from focusing ‘only’ on the ‘right reason’ issue to something more general, which you might describe as something like 'Doing real-world TDD in .NET , with massive use of third-party add-ins’. This is because I feel that there is a more general statement about Test-driven development to make: It’s high time to speak about the ‘How’ of TDD, not always only the ‘Why’. Much has been said about this, and me myself also contributed to that (see here: TDD is not about testing, it's about how we develop software). But always justifying what you do is very unsatisfying in the long run, it is inherently defensive, and it costs time and effort that could be used for better and more important things. And frankly: I’m somewhat sick and tired of repeating time and again that the test-driven way of software development is highly preferable for many reasons - I don’t want to spent my time exclusively on stating the obvious… So, again, let’s say it clearly: TDD is programming, and programming is TDD. Other ways of programming (code-first, sometimes called cowboy-coding) are exceptional and need justification. – I know that there are many people out there who will disagree with this radical statement, and I also know that it’s not a description of the real world but more of a mission statement or something. But nevertheless I’m absolutely sure that in some years this statement will be nothing but a platitude. Side note: Some parts of this post read as if I were paid by Jetbrains (the manufacturer of the ReSharper add-in – R#), but I swear I’m not. Rather I think that Visual Studio is just not production-complete without it, and I wouldn’t even consider to do professional work without having this add-in installed... The three parts of a software component Before I go into some details, I first should describe my understanding of what belongs to a software component (assembly, type, or method) during the production process (i.e. the coding phase). Roughly, I come up with the three parts shown below: First, we need to have some initial sort of requirement. This can be a multi-page formal document, a vague idea in some programmer’s brain of what might be needed, or anything in between. In either way, there has to be some sort of requirement, be it explicit or not. – At the C# micro-level, the best way that I found to formulate that is to define interfaces for just about everything, even for internal classes, and to provide them with exhaustive xml comments. The next step then is to re-formulate these requirements in an executable form. This is specific to the respective programming language. - For C#/.NET, the Gallio framework (which includes MbUnit) in conjunction with the ReSharper add-in for Visual Studio is my toolset of choice. The third part then finally is the production code itself. It’s development is entirely driven by the requirements and their executable formulation. This is the delivery, the two other parts are ‘only’ there to make its production possible, to give it a decent quality and reliability, and to significantly reduce related costs down the maintenance timeline. So while the first two parts are not really relevant for the customer, they are very important for the developer. The customer (or in Scrum terms: the Product Owner) is not interested at all in how the product is developed, he is only interested in the fact that it is developed as cost-effective as possible, and that it meets his functional and non-functional requirements. The rest is solely a matter of the developer’s craftsmanship, and this is what I want to talk about during the remainder of this article… An example To demonstrate my way of doing real-world TDD, I decided to show the development of a (very) simple Calculator component. The example is deliberately trivial and silly, as examples always are. I am totally aware of the fact that real life is never that simple, but I only want to show some development principles here… The requirement As already said above, I start with writing down some words on the initial requirement, and I normally use interfaces for that, even for internal classes - the typical question “intf or not” doesn’t even come to mind. I need them for my usual workflow and using them automatically produces high componentized and testable code anyway. To think about their usage in every single situation would slow down the production process unnecessarily. So this is what I begin with: namespace Calculator { /// <summary> /// Defines a very simple calculator component for demo purposes. /// </summary> public interface ICalculator { /// <summary> /// Gets the result of the last successful operation. /// </summary> /// <value>The last result.</value> /// <remarks> /// Will be <see langword="null" /> before the first successful operation. /// </remarks> double? LastResult { get; } } // interface ICalculator } // namespace Calculator So, I’m not beginning with a test, but with a sort of code declaration - and still I insist on being 100% test-driven. There are three important things here: Starting this way gives me a method signature, which allows to use IntelliSense and AutoCompletion and thus eliminates the danger of typos - one of the most regular, annoying, time-consuming, and therefore expensive sources of error in the development process. In my understanding, the interface definition as a whole is more of a readable requirement document and technical documentation than anything else. So this is at least as much about documentation than about coding. The documentation must completely describe the behavior of the documented element. I normally use an IoC container or some sort of self-written provider-like model in my architecture. In either case, I need my components defined via service interfaces anyway. - I will use the LinFu IoC framework here, for no other reason as that is is very simple to use. The ‘Red’ (pt. 1) First I create a folder for the project’s third-party libraries and put the LinFu.Core dll there. Then I set up a test project (via a Gallio project template), and add references to the Calculator project and the LinFu dll. Finally I’m ready to write the first test, which will look like the following: namespace Calculator.Test { [TestFixture] public class CalculatorTest { private readonly ServiceContainer container = new ServiceContainer(); [Test] public void CalculatorLastResultIsInitiallyNull() { ICalculator calculator = container.GetService<ICalculator>(); Assert.IsNull(calculator.LastResult); } } // class CalculatorTest } // namespace Calculator.Test This is basically the executable formulation of what the interface definition states (part of). Side note: There’s one principle of TDD that is just plain wrong in my eyes: I’m talking about the Red is 'does not compile' thing. How could a compiler error ever be interpreted as a valid test outcome? I never understood that, it just makes no sense to me. (Or, in Derick’s terms: this reason is as wrong as a reason ever could be…) A compiler error tells me: Your code is incorrect, but nothing more. Instead, the ‘Red’ part of the red-green-refactor cycle has a clearly defined meaning to me: It means that the test works as intended and fails only if its assumptions are not met for some reason. Back to our Calculator. When I execute the above test with R#, the Gallio plugin will give me this output: So this tells me that the test is red for the wrong reason: There’s no implementation that the IoC-container could load, of course. So let’s fix that. With R#, this is very easy: First, create an ICalculator - derived type: Next, implement the interface members: And finally, move the new class to its own file: So far my ‘work’ was six mouse clicks long, the only thing that’s left to do manually here, is to add the Ioc-specific wiring-declaration and also to make the respective class non-public, which I regularly do to force my components to communicate exclusively via interfaces: This is what my Calculator class looks like as of now: using System; using LinFu.IoC.Configuration; namespace Calculator { [Implements(typeof(ICalculator))] internal class Calculator : ICalculator { public double? LastResult { get { throw new NotImplementedException(); } } } } Back to the test fixture, we have to put our IoC container to work: [TestFixture] public class CalculatorTest { #region Fields private readonly ServiceContainer container = new ServiceContainer(); #endregion // Fields #region Setup/TearDown [FixtureSetUp] public void FixtureSetUp() { container.LoadFrom(AppDomain.CurrentDomain.BaseDirectory, "Calculator.dll"); } ... Because I have a R# live template defined for the setup/teardown method skeleton as well, the only manual coding here again is the IoC-specific stuff: two lines, not more… The ‘Red’ (pt. 2) Now, the execution of the above test gives the following result: This time, the test outcome tells me that the method under test is called. And this is the point, where Derick and I seem to have somewhat different views on the subject: Of course, the test still is worthless regarding the red/green outcome (or: it’s still red for the wrong reasons, in that it gives a false negative). But as far as I am concerned, I’m not really interested in the test outcome at this point of the red-green-refactor cycle. Rather, I only want to assert that my test actually calls the right method. If that’s the case, I will happily go on to the ‘Green’ part… The ‘Green’ Making the test green is quite trivial. Just make LastResult an automatic property: [Implements(typeof(ICalculator))] internal class Calculator : ICalculator { public double? LastResult { get; private set; } } One more round… Now on to something slightly more demanding (cough…). Let’s state that our Calculator exposes an Add() method: ... /// <summary> /// Adds the specified operands. /// </summary> /// <param name="operand1">The operand1.</param> /// <param name="operand2">The operand2.</param> /// <returns>The result of the additon.</returns> /// <exception cref="ArgumentException"> /// Argument <paramref name="operand1"/> is < 0.<br/> /// -- or --<br/> /// Argument <paramref name="operand2"/> is < 0. /// </exception> double Add(double operand1, double operand2); } // interface ICalculator A remark: I sometimes hear the complaint that xml comment stuff like the above is hard to read. That’s certainly true, but irrelevant to me, because I read xml code comments with the CR_Documentor tool window. And using that, it looks like this: Apart from that, I’m heavily using xml code comments (see e.g. here for a detailed guide) because there is the possibility of automating help generation with nightly CI builds (using MS Sandcastle and the Sandcastle Help File Builder), and then publishing the results to some intranet location. This way, a team always has first class, up-to-date technical documentation at hand about the current codebase. (And, also very important for speeding up things and avoiding typos: You have IntelliSense/AutoCompletion and R# support, and the comments are subject to compiler checking…). Back to our Calculator again: Two more R# – clicks implement the Add() skeleton: ... public double Add(double operand1, double operand2) { throw new NotImplementedException(); } } // class Calculator As we have stated in the interface definition (which actually serves as our requirement document!), the operands are not allowed to be negative. So let’s start implementing that. Here’s the test: [Test] [Row(-0.5, 2)] public void AddThrowsOnNegativeOperands(double operand1, double operand2) { ICalculator calculator = container.GetService<ICalculator>(); Assert.Throws<ArgumentException>(() => calculator.Add(operand1, operand2)); } As you can see, I’m using a data-driven unit test method here, mainly for these two reasons: Because I know that I will have to do the same test for the second operand in a few seconds, I save myself from implementing another test method for this purpose. Rather, I only will have to add another Row attribute to the existing one. From the test report below, you can see that the argument values are explicitly printed out. This can be a valuable documentation feature even when everything is green: One can quickly review what values were tested exactly - the complete Gallio HTML-report (as it will be produced by the Continuous Integration runs) shows these values in a quite clear format (see below for an example). Back to our Calculator development again, this is what the test result tells us at the moment: So we’re red again, because there is not yet an implementation… Next we go on and implement the necessary parameter verification to become green again, and then we do the same thing for the second operand. To make a long story short, here’s the test and the method implementation at the end of the second cycle: // in CalculatorTest: [Test] [Row(-0.5, 2)] [Row(295, -123)] public void AddThrowsOnNegativeOperands(double operand1, double operand2) { ICalculator calculator = container.GetService<ICalculator>(); Assert.Throws<ArgumentException>(() => calculator.Add(operand1, operand2)); } // in Calculator: public double Add(double operand1, double operand2) { if (operand1 < 0.0) { throw new ArgumentException("Value must not be negative.", "operand1"); } if (operand2 < 0.0) { throw new ArgumentException("Value must not be negative.", "operand2"); } throw new NotImplementedException(); } So far, we have sheltered our method from unwanted input, and now we can safely operate on the parameters without further caring about their validity (this is my interpretation of the Fail Fast principle, which is regarded here in more detail). Now we can think about the method’s successful outcomes. First let’s write another test for that: [Test] [Row(1, 1, 2)] public void TestAdd(double operand1, double operand2, double expectedResult) { ICalculator calculator = container.GetService<ICalculator>(); double result = calculator.Add(operand1, operand2); Assert.AreEqual(expectedResult, result); } Again, I’m regularly using row based test methods for these kinds of unit tests. The above shown pattern proved to be extremely helpful for my development work, I call it the Defined-Input/Expected-Output test idiom: You define your input arguments together with the expected method result. There are two major benefits from that way of testing: In the course of refining a method, it’s very likely to come up with additional test cases. In our case, we might add tests for some edge cases like ‘one of the operands is zero’ or ‘the sum of the two operands causes an overflow’, or maybe there’s an external test protocol that has to be fulfilled (e.g. an ISO norm for medical software), and this results in the need of testing against additional values. In all these scenarios we only have to add another Row attribute to the test. Remember that the argument values are written to the test report, so as a side-effect this produces valuable documentation. (This can become especially important if the fulfillment of some sort of external requirements has to be proven). So your test method might look something like that in the end: [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 2)] [Row(0, 999999999, 999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, double.MaxValue)] [Row(4, double.MaxValue - 2.5, double.MaxValue)] public void TestAdd(double operand1, double operand2, double expectedResult) { ICalculator calculator = container.GetService<ICalculator>(); double result = calculator.Add(operand1, operand2); Assert.AreEqual(expectedResult, result); } And this will produce the following HTML report (with Gallio): Not bad for the amount of work we invested in it, huh? - There might be scenarios where reports like that can be useful for demonstration purposes during a Scrum sprint review… The last requirement to fulfill is that the LastResult property is expected to store the result of the last operation. I don’t show this here, it’s trivial enough and brings nothing new… And finally: Refactor (for the right reasons) To demonstrate my way of going through the refactoring portion of the red-green-refactor cycle, I added another method to our Calculator component, namely Subtract(). Here’s the code (tests and production): // CalculatorTest.cs: [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 0)] [Row(0, 999999999, -999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, -double.MaxValue)] [Row(4, double.MaxValue - 2.5, -double.MaxValue)] public void TestSubtract(double operand1, double operand2, double expectedResult) { ICalculator calculator = container.GetService<ICalculator>(); double result = calculator.Subtract(operand1, operand2); Assert.AreEqual(expectedResult, result); } [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 0)] [Row(0, 999999999, -999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, -double.MaxValue)] [Row(4, double.MaxValue - 2.5, -double.MaxValue)] public void TestSubtractGivesExpectedLastResult(double operand1, double operand2, double expectedResult) { ICalculator calculator = container.GetService<ICalculator>(); calculator.Subtract(operand1, operand2); Assert.AreEqual(expectedResult, calculator.LastResult); } ... // ICalculator.cs: /// <summary> /// Subtracts the specified operands. /// </summary> /// <param name="operand1">The operand1.</param> /// <param name="operand2">The operand2.</param> /// <returns>The result of the subtraction.</returns> /// <exception cref="ArgumentException"> /// Argument <paramref name="operand1"/> is < 0.<br/> /// -- or --<br/> /// Argument <paramref name="operand2"/> is < 0. /// </exception> double Subtract(double operand1, double operand2); ... // Calculator.cs: public double Subtract(double operand1, double operand2) { if (operand1 < 0.0) { throw new ArgumentException("Value must not be negative.", "operand1"); } if (operand2 < 0.0) { throw new ArgumentException("Value must not be negative.", "operand2"); } return (this.LastResult = operand1 - operand2).Value; } Obviously, the argument validation stuff that was produced during the red-green part of our cycle duplicates the code from the previous Add() method. So, to avoid code duplication and minimize the number of code lines of the production code, we do an Extract Method refactoring. One more time, this is only a matter of a few mouse clicks (and giving the new method a name) with R#: Having done that, our production code finally looks like that: using System; using LinFu.IoC.Configuration; namespace Calculator { [Implements(typeof(ICalculator))] internal class Calculator : ICalculator { #region ICalculator public double? LastResult { get; private set; } public double Add(double operand1, double operand2) { ThrowIfOneOperandIsInvalid(operand1, operand2); return (this.LastResult = operand1 + operand2).Value; } public double Subtract(double operand1, double operand2) { ThrowIfOneOperandIsInvalid(operand1, operand2); return (this.LastResult = operand1 - operand2).Value; } #endregion // ICalculator #region Implementation (Helper) private static void ThrowIfOneOperandIsInvalid(double operand1, double operand2) { if (operand1 < 0.0) { throw new ArgumentException("Value must not be negative.", "operand1"); } if (operand2 < 0.0) { throw new ArgumentException("Value must not be negative.", "operand2"); } } #endregion // Implementation (Helper) } // class Calculator } // namespace Calculator But is the above worth the effort at all? It’s obviously trivial and not very impressive. All our tests were green (for the right reasons), and refactoring the code did not change anything. It’s not immediately clear how this refactoring work adds value to the project. Derick puts it like this: STOP! Hold on a second… before you go any further and before you even think about refactoring what you just wrote to make your test pass, you need to understand something: if your done with your requirements after making the test green, you are not required to refactor the code. I know… I’m speaking heresy, here. Toss me to the wolves, I’ve gone over to the dark side! Seriously, though… if your test is passing for the right reasons, and you do not need to write any test or any more code for you class at this point, what value does refactoring add? Derick immediately answers his own question: So why should you follow the refactor portion of red/green/refactor? When you have added code that makes the system less readable, less understandable, less expressive of the domain or concern’s intentions, less architecturally sound, less DRY, etc, then you should refactor it. I couldn’t state it more precise. From my personal perspective, I’d add the following: You have to keep in mind that real-world software systems are usually quite large and there are dozens or even hundreds of occasions where micro-refactorings like the above can be applied. It’s the sum of them all that counts. And to have a good overall quality of the system (e.g. in terms of the Code Duplication Percentage metric) you have to be pedantic on the individual, seemingly trivial cases. My job regularly requires the reading and understanding of ‘foreign’ code. So code quality/readability really makes a HUGE difference for me – sometimes it can be even the difference between project success and failure… Conclusions The above described development process emerged over the years, and there were mainly two things that guided its evolution (you might call it eternal principles, personal beliefs, or anything in between): Test-driven development is the normal, natural way of writing software, code-first is exceptional. So ‘doing TDD or not’ is not a question. And good, stable code can only reliably be produced by doing TDD (yes, I know: many will strongly disagree here again, but I’ve never seen high-quality code – and high-quality code is code that stood the test of time and causes low maintenance costs – that was produced code-first…) It’s the production code that pays our bills in the end. (Though I have seen customers these days who demand an acceptance test battery as part of the final delivery. Things seem to go into the right direction…). The test code serves ‘only’ to make the production code work. But it’s the number of delivered features which solely counts at the end of the day - no matter how much test code you wrote or how good it is. With these two things in mind, I tried to optimize my coding process for coding speed – or, in business terms: productivity - without sacrificing the principles of TDD (more than I’d do either way…). As a result, I consider a ratio of about 3-5/1 for test code vs. production code as normal and desirable. In other words: roughly 60-80% of my code is test code (This might sound heavy, but that is mainly due to the fact that software development standards only begin to evolve. The entire software development profession is very young, historically seen; only at the very beginning, and there are no viable standards yet. If you think about software development as a kind of casting process, where the test code is the mold and the resulting production code is the final product, then the above ratio sounds no longer extraordinary…) Although the above might look like very much unnecessary work at first sight, it’s not. With the aid of the mentioned add-ins, doing all the above is a matter of minutes, sometimes seconds (while writing this post took hours and days…). The most important thing is to have the right tools at hand. Slow developer machines or the lack of a tool or something like that - for ‘saving’ a few 100 bucks - is just not acceptable and a very bad decision in business terms (though I quite some times have seen and heard that…). Production of high-quality products needs the usage of high-quality tools. This is a platitude that every craftsman knows… The here described round-trip will take me about five to ten minutes in my real-world development practice. I guess it’s about 30% more time compared to developing the ‘traditional’ (code-first) way. But the so manufactured ‘product’ is of much higher quality and massively reduces maintenance costs, which is by far the single biggest cost factor, as I showed in this previous post: It's the maintenance, stupid! (or: Something is rotten in developerland.). In the end, this is a highly cost-effective way of software development… But on the other hand, there clearly is a trade-off here: coding speed vs. code quality/later maintenance costs. The here described development method might be a perfect fit for the overwhelming majority of software projects, but there certainly are some scenarios where it’s not - e.g. if time-to-market is crucial for a software project. So this is a business decision in the end. It’s just that you have to know what you’re doing and what consequences this might have… Some last words First, I’d like to thank Derick Bailey again. His two aforementioned posts (which I strongly recommend for reading) inspired me to think deeply about my own personal way of doing TDD and to clarify my thoughts about it. I wouldn’t have done that without this inspiration. I really enjoy that kind of discussions… I agree with him in all respects. But I don’t know (yet?) how to bring his insights into the described production process without slowing things down. The above described method proved to be very “good enough” in my practical experience. But of course, I’m open to suggestions here… My rationale for now is: If the test is initially red during the red-green-refactor cycle, the ‘right reason’ is: it actually calls the right method, but this method is not yet operational. Later on, when the cycle is finished and the tests become part of the regular, automated Continuous Integration process, ‘red’ certainly must occur for the ‘right reason’: in this phase, ‘red’ MUST mean nothing but an unfulfilled assertion - Fail By Assertion, Not By Anything Else!

Read the article
SQL Server 2012 - AlwaysOn

- by Claus Jandausch

Ich war nicht nur irritiert, ich war sogar regelrecht schockiert - und für einen kurzen Moment sprachlos (was nur selten der Fall ist). Gerade eben hatte mich jemand gefragt "Wann Oracle denn etwas Vergleichbares wie AlwaysOn bieten würde - und ob überhaupt?" War ich hier im falschen Film gelandet? Ich konnte nicht anders, als meinen Unmut kundzutun und zu erklären, dass die Fragestellung normalerweise anders herum läuft. Zugegeben - es mag vielleicht strittige Punkte geben im Vergleich zwischen Oracle und SQL Server - bei denen nicht unbedingt immer Oracle die Nase vorn haben muss - aber das Thema Clustering für Hochverfügbarkeit (HA), Disaster Recovery (DR) und Skalierbarkeit gehört mit Sicherheit nicht dazu. Dieses Erlebnis hakte ich am Nachgang als Einzelfall ab, der so nie wieder vorkommen würde. Bis ich kurz darauf eines Besseren belehrt wurde und genau die selbe Frage erneut zu hören bekam. Diesmal sogar im Exadata-Umfeld und einem Oracle Stretch Cluster. Einmal ist keinmal, doch zweimal ist einmal zu viel... Getreu diesem alten Motto war mir klar, dass man das so nicht länger stehen lassen konnte. Ich habe keine Ahnung, wie die Microsoft Marketing Abteilung es geschafft hat, unter dem AlwaysOn Brading eine innovative Technologie vermuten zu lassen - aber sie hat ihren Job scheinbar gut gemacht. Doch abgesehen von einem guten Marketing, stellt sich natürlich die Frage, was wirklich dahinter steckt und wie sich das Ganze mit Oracle vergleichen lässt - und ob überhaupt? Damit wären wir wieder bei der ursprünglichen Frage angelangt. So viel zum Hintergrund dieses Blogbeitrags - von meiner Antwort handelt der restliche Blog. "Windows was the God ..." Um den wahren Unterschied zwischen Oracle und Microsoft verstehen zu können, muss man zunächst das bedeutendste Microsoft Dogma kennen. Es lässt sich schlicht und einfach auf den Punkt bringen: "Alles muss auf Windows basieren." Die Überschrift dieses Absatzes ist kein von mir erfundener Ausspruch, sondern ein Zitat. Konkret stammt es aus einem längeren Artikel von Kurt Eichenwald in der Vanity Fair aus dem August 2012. Er lautet Microsoft's Lost Decade und sei jedem ans Herz gelegt, der die "Microsoft-Maschinerie" unter Steve Ballmer und einige ihrer Kuriositäten besser verstehen möchte. "YOU TALKING TO ME?" Microsoft C.E.O. Steve Ballmer bei seiner Keynote auf der 2012 International Consumer Electronics Show in Las Vegas am 9. Januar Manche Dinge in diesem Artikel mögen überspitzt dargestellt erscheinen - sind sie aber nicht. Vieles davon kannte ich bereits aus eigener Erfahrung und kann es nur bestätigen. Anderes hat sich mir erst so richtig erschlossen. Insbesondere die folgenden Passagen führten zum Aha-Erlebnis: “Windows was the god—everything had to work with Windows,” said Stone... “Every little thing you want to write has to build off of Windows (or other existing roducts),” one software engineer said. “It can be very confusing, …” Ich habe immer schon darauf hingewiesen, dass in einem SQL Server Failover Cluster die Microsoft Datenbank eigentlich nichts Nenneswertes zum Geschehen beiträgt, sondern sich voll und ganz auf das Windows Betriebssystem verlässt. Deshalb muss man auch die Windows Server Enterprise Edition installieren, soll ein Failover Cluster für den SQL Server eingerichtet werden. Denn hier werden die Cluster Services geliefert - nicht mit dem SQL Server. Er ist nur lediglich ein weiteres Server Produkt, für das Windows in Ausfallszenarien genutzt werden kann - so wie Microsoft Exchange beispielsweise, oder Microsoft SharePoint, oder irgendein anderes Server Produkt das auf Windows gehostet wird. Auch Oracle kann damit genutzt werden. Das Stichwort lautet hier: Oracle Failsafe. Nur - warum sollte man das tun, wenn gleichzeitig eine überlegene Technologie wie die Oracle Real Application Clusters (RAC) zur Verfügung steht, die dann auch keine Windows Enterprise Edition voraussetzen, da Oracle die eigene Clusterware liefert. Welche darüber hinaus für kürzere Failover-Zeiten sorgt, da diese Cluster-Technologie Datenbank-integriert ist und sich nicht auf "Dritte" verlässt. Wenn man sich also schon keine technischen Vorteile mit einem SQL Server Failover Cluster erkauft, sondern zusätzlich noch versteckte Lizenzkosten durch die Lizenzierung der Windows Server Enterprise Edition einhandelt, warum hat Microsoft dann in den vergangenen Jahren seit SQL Server 2000 nicht ebenfalls an einer neuen und innovativen Lösung gearbeitet, die mit Oracle RAC mithalten kann? Entwickler hat Microsoft genügend? Am Geld kann es auch nicht liegen? Lesen Sie einfach noch einmal die beiden obenstehenden Zitate und sie werden den Grund verstehen. Anders lässt es sich ja auch gar nicht mehr erklären, dass AlwaysOn aus zwei unterschiedlichen Technologien besteht, die beide jedoch wiederum auf dem Windows Server Failover Clustering (WSFC) basieren. Denn daraus ergeben sich klare Nachteile - aber dazu später mehr. Um AlwaysOn zu verstehen, sollte man sich zunächst kurz in Erinnerung rufen, was Microsoft bisher an HA/DR (High Availability/Desaster Recovery) Lösungen für SQL Server zur Verfügung gestellt hat. Replikation Basiert auf logischer Replikation und Pubisher/Subscriber Architektur Transactional Replication Merge Replication Snapshot Replication Microsoft's Replikation ist vergleichbar mit Oracle GoldenGate. Oracle GoldenGate stellt jedoch die umfassendere Technologie dar und bietet High Performance. Log Shipping Microsoft's Log Shipping stellt eine einfache Technologie dar, die vergleichbar ist mit Oracle Managed Recovery in Oracle Version 7. Das Log Shipping besitzt folgende Merkmale: Transaction Log Backups werden von Primary nach Secondary/ies geschickt Einarbeitung (z.B. Restore) auf jedem Secondary individuell Optionale dritte Server Instanz (Monitor Server) für Überwachung und Alarm Log Restore Unterbrechung möglich für Read-Only Modus (Secondary) Keine Unterstützung von Automatic Failover Database Mirroring Microsoft's Database Mirroring wurde verfügbar mit SQL Server 2005, sah aus wie Oracle Data Guard in Oracle 9i, war funktional jedoch nicht so umfassend. Für ein HA/DR Paar besteht eine 1:1 Beziehung, um die produktive Datenbank (Principle DB) abzusichern. Auf der Standby Datenbank (Mirrored DB) werden alle Insert-, Update- und Delete-Operationen nachgezogen. Modi Synchron (High-Safety Modus) Asynchron (High-Performance Modus) Automatic Failover Unterstützt im High-Safety Modus (synchron) Witness Server vorausgesetzt Zur Frage der Kontinuität Es stellt sich die Frage, wie es um diesen Technologien nun im Zusammenhang mit SQL Server 2012 bestellt ist. Unter Fanfaren seinerzeit eingeführt, war Database Mirroring das erklärte Mittel der Wahl. Ich bin kein Produkt Manager bei Microsoft und kann hierzu nur meine Meinung äußern, aber zieht man den SQL AlwaysOn Team Blog heran, so sieht es nicht gut aus für das Database Mirroring - zumindest nicht langfristig. "Does AlwaysOn Availability Group replace Database Mirroring going forward?” “The short answer is we recommend that you migrate from the mirroring configuration or even mirroring and log shipping configuration to using Availability Group. Database Mirroring will still be available in the Denali release but will be phased out over subsequent releases. Log Shipping will continue to be available in future releases.” Damit wären wir endlich beim eigentlichen Thema angelangt. Was ist eine sogenannte Availability Group und was genau hat es mit der vielversprechend klingenden Bezeichnung AlwaysOn auf sich? SQL Server 2012 - AlwaysOn Zwei HA-Features verstekcne sich hinter dem “AlwaysOn”-Branding. Einmal das AlwaysOn Failover Clustering aka SQL Server Failover Cluster Instances (FCI) - zum Anderen die AlwaysOn Availability Groups. Failover Cluster Instances (FCI) Entspricht ungefähr dem Stretch Cluster Konzept von Oracle Setzt auf Windows Server Failover Clustering (WSFC) auf Bietet HA auf Instanz-Ebene AlwaysOn Availability Groups (Verfügbarkeitsgruppen) Ähnlich der Idee von Consistency Groups, wie in Storage-Level Replikations-Software von z.B. EMC SRDF Abhängigkeiten zu Windows Server Failover Clustering (WSFC) Bietet HA auf Datenbank-Ebene Hinweis: Verwechseln Sie nicht eine SQL Server Datenbank mit einer Oracle Datenbank. Und auch nicht eine Oracle Instanz mit einer SQL Server Instanz. Die gleichen Begriffe haben hier eine andere Bedeutung - nicht selten ein Grund, weshalb Oracle- und Microsoft DBAs schnell aneinander vorbei reden. Denken Sie bei einer SQL Server Datenbank eher an ein Oracle Schema, das kommt der Sache näher. So etwas wie die SQL Server Northwind Datenbank ist vergleichbar mit dem Oracle Scott Schema. Wenn Sie die genauen Unterschiede kennen möchten, finden Sie eine detaillierte Beschreibung in meinem Buch "Oracle10g Release 2 für Windows und .NET", erhältich bei Lehmanns, Amazon, etc. Windows Server Failover Clustering (WSFC) Wie man sieht, basieren beide AlwaysOn Technologien wiederum auf dem Windows Server Failover Clustering (WSFC), um einerseits Hochverfügbarkeit auf Ebene der Instanz zu gewährleisten und andererseits auf der Datenbank-Ebene. Deshalb nun eine kurze Beschreibung der WSFC. Die WSFC sind ein mit dem Windows Betriebssystem geliefertes Infrastruktur-Feature, um HA für Server Anwendungen, wie Microsoft Exchange, SharePoint, SQL Server, etc. zu bieten. So wie jeder andere Cluster, besteht ein WSFC Cluster aus einer Gruppe unabhängiger Server, die zusammenarbeiten, um die Verfügbarkeit einer Applikation oder eines Service zu erhöhen. Falls ein Cluster-Knoten oder -Service ausfällt, kann der auf diesem Knoten bisher gehostete Service automatisch oder manuell auf einen anderen im Cluster verfügbaren Knoten transferriert werden - was allgemein als Failover bekannt ist. Unter SQL Server 2012 verwenden sowohl die AlwaysOn Avalability Groups, als auch die AlwaysOn Failover Cluster Instances die WSFC als Plattformtechnologie, um Komponenten als WSFC Cluster-Ressourcen zu registrieren. Verwandte Ressourcen werden in eine Ressource Group zusammengefasst, die in Abhängigkeit zu anderen WSFC Cluster-Ressourcen gebracht werden kann. Der WSFC Cluster Service kann jetzt die Notwendigkeit zum Neustart der SQL Server Instanz erfassen oder einen automatischen Failover zu einem anderen Server-Knoten im WSFC Cluster auslösen. Failover Cluster Instances (FCI) Eine SQL Server Failover Cluster Instanz (FCI) ist eine einzelne SQL Server Instanz, die in einem Failover Cluster betrieben wird, der aus mehreren Windows Server Failover Clustering (WSFC) Knoten besteht und so HA (High Availability) auf Ebene der Instanz bietet. Unter Verwendung von Multi-Subnet FCI kann auch Remote DR (Disaster Recovery) unterstützt werden. Eine weitere Option für Remote DR besteht darin, eine unter FCI gehostete Datenbank in einer Availability Group zu betreiben. Hierzu später mehr. FCI und WSFC Basis FCI, das für lokale Hochverfügbarkeit der Instanzen genutzt wird, ähnelt der veralteten Architektur eines kalten Cluster (Aktiv-Passiv). Unter SQL Server 2008 wurde diese Technologie SQL Server 2008 Failover Clustering genannt. Sie nutzte den Windows Server Failover Cluster. In SQL Server 2012 hat Microsoft diese Basistechnologie unter der Bezeichnung AlwaysOn zusammengefasst. Es handelt sich aber nach wie vor um die klassische Aktiv-Passiv-Konfiguration. Der Ablauf im Failover-Fall ist wie folgt: Solange kein Hardware-oder System-Fehler auftritt, werden alle Dirty Pages im Buffer Cache auf Platte geschrieben Alle entsprechenden SQL Server Services (Dienste) in der Ressource Gruppe werden auf dem aktiven Knoten gestoppt Die Ownership der Ressource Gruppe wird auf einen anderen Knoten der FCI transferriert Der neue Owner (Besitzer) der Ressource Gruppe startet seine SQL Server Services (Dienste) Die Connection-Anforderungen einer Client-Applikation werden automatisch auf den neuen aktiven Knoten mit dem selben Virtuellen Network Namen (VNN) umgeleitet Abhängig vom Zeitpunkt des letzten Checkpoints, kann die Anzahl der Dirty Pages im Buffer Cache, die noch auf Platte geschrieben werden müssen, zu unvorhersehbar langen Failover-Zeiten führen. Um diese Anzahl zu drosseln, besitzt der SQL Server 2012 eine neue Fähigkeit, die Indirect Checkpoints genannt wird. Indirect Checkpoints ähnelt dem Fast-Start MTTR Target Feature der Oracle Datenbank, das bereits mit Oracle9i verfügbar war. SQL Server Multi-Subnet Clustering Ein SQL Server Multi-Subnet Failover Cluster entspricht vom Konzept her einem Oracle RAC Stretch Cluster. Doch dies ist nur auf den ersten Blick der Fall. Im Gegensatz zu RAC ist in einem lokalen SQL Server Failover Cluster jeweils nur ein Knoten aktiv für eine Datenbank. Für die Datenreplikation zwischen geografisch entfernten Sites verlässt sich Microsoft auf 3rd Party Lösungen für das Storage Mirroring. Die Verbesserung dieses Szenario mit einer SQL Server 2012 Implementierung besteht schlicht darin, dass eine VLAN-Konfiguration (Virtual Local Area Network) nun nicht mehr benötigt wird, so wie dies bisher der Fall war. Das folgende Diagramm stellt dar, wie der Ablauf mit SQL Server 2012 gehandhabt wird. In Site A und Site B wird HA jeweils durch einen lokalen Aktiv-Passiv-Cluster sichergestellt. Besondere Aufmerksamkeit muss hier der Konfiguration und dem Tuning geschenkt werden, da ansonsten völlig inakzeptable Failover-Zeiten resultieren. Dies liegt darin begründet, weil die Downtime auf Client-Seite nun nicht mehr nur von der reinen Failover-Zeit abhängt, sondern zusätzlich von der Dauer der DNS Replikation zwischen den DNS Servern. (Rufen Sie sich in Erinnerung, dass wir gerade von Multi-Subnet Clustering sprechen). Außerdem ist zu berücksichtigen, wie schnell die Clients die aktualisierten DNS Informationen abfragen. Spezielle Konfigurationen für Node Heartbeat, HostRecordTTL (Host Record Time-to-Live) und Intersite Replication Frequeny für Active Directory Sites und Services werden notwendig. Default TTL für Windows Server 2008 R2: 20 Minuten Empfohlene Einstellung: 1 Minute DNS Update Replication Frequency in Windows Umgebung: 180 Minuten Empfohlene Einstellung: 15 Minuten (minimaler Wert) Betrachtet man diese Werte, muss man feststellen, dass selbst eine optimale Konfiguration die rigiden SLAs (Service Level Agreements) heutiger geschäftskritischer Anwendungen für HA und DR nicht erfüllen kann. Denn dies impliziert eine auf der Client-Seite erlebte Failover-Zeit von insgesamt 16 Minuten. Hierzu ein Auszug aus der SQL Server 2012 Online Dokumentation: Cons: If a cross-subnet failover occurs, the client recovery time could be 15 minutes or longer, depending on your HostRecordTTL setting and the setting of your cross-site DNS/AD replication schedule. Wir sind hier an einem Punkt unserer Überlegungen angelangt, an dem sich erklärt, weshalb ich zuvor das "Windows was the God ..." Zitat verwendet habe. Die unbedingte Abhängigkeit zu Windows wird zunehmend zum Problem, da sie die Komplexität einer Microsoft-basierenden Lösung erhöht, anstelle sie zu reduzieren. Und Komplexität ist das Letzte, was sich CIOs heutzutage wünschen. Zur Ehrenrettung des SQL Server 2012 und AlwaysOn muss man sagen, dass derart lange Failover-Zeiten kein unbedingtes "Muss" darstellen, sondern ein "Kann". Doch auch ein "Kann" kann im unpassenden Moment unvorhersehbare und kostspielige Folgen haben. Die Unabsehbarkeit ist wiederum Ursache vieler an der Implementierung beteiligten Komponenten und deren Abhängigkeiten, wie beispielsweise drei Cluster-Lösungen (zwei von Microsoft, eine 3rd Party Lösung). Wie man die Sache auch dreht und wendet, kommt man an diesem Fakt also nicht vorbei - ganz unabhängig von der Dauer einer Downtime oder Failover-Zeiten. Im Gegensatz zu AlwaysOn und der hier vorgestellten Version eines Stretch-Clusters, vermeidet eine entsprechende Oracle Implementierung eine derartige Komplexität, hervorgerufen duch multiple Abhängigkeiten. Den Unterschied machen Datenbank-integrierte Mechanismen, wie Fast Application Notification (FAN) und Fast Connection Failover (FCF). Für Oracle MAA Konfigurationen (Maximum Availability Architecture) sind Inter-Site Failover-Zeiten im Bereich von Sekunden keine Seltenheit. Wenn Sie dem Link zur Oracle MAA folgen, finden Sie außerdem eine Reihe an Customer Case Studies. Auch dies ist ein wichtiges Unterscheidungsmerkmal zu AlwaysOn, denn die Oracle Technologie hat sich bereits zigfach in höchst kritischen Umgebungen bewährt. Availability Groups (Verfügbarkeitsgruppen) Die sogenannten Availability Groups (Verfügbarkeitsgruppen) sind - neben FCI - der weitere Baustein von AlwaysOn. Hinweis: Bevor wir uns näher damit beschäftigen, sollten Sie sich noch einmal ins Gedächtnis rufen, dass eine SQL Server Datenbank nicht die gleiche Bedeutung besitzt, wie eine Oracle Datenbank, sondern eher einem Oracle Schema entspricht. So etwas wie die SQL Server Northwind Datenbank ist vergleichbar mit dem Oracle Scott Schema. Eine Verfügbarkeitsgruppe setzt sich zusammen aus einem Set mehrerer Benutzer-Datenbanken, die im Falle eines Failover gemeinsam als Gruppe behandelt werden. Eine Verfügbarkeitsgruppe unterstützt ein Set an primären Datenbanken (primäres Replikat) und einem bis vier Sets von entsprechenden sekundären Datenbanken (sekundäre Replikate). Es können jedoch nicht alle SQL Server Datenbanken einer AlwaysOn Verfügbarkeitsgruppe zugeordnet werden. Der SQL Server Spezialist Michael Otey zählt in seinem SQL Server Pro Artikel folgende Anforderungen auf: Verfügbarkeitsgruppen müssen mit Benutzer-Datenbanken erstellt werden. System-Datenbanken können nicht verwendet werden Die Datenbanken müssen sich im Read-Write Modus befinden. Read-Only Datenbanken werden nicht unterstützt Die Datenbanken in einer Verfügbarkeitsgruppe müssen Multiuser Datenbanken sein Sie dürfen nicht das AUTO_CLOSE Feature verwenden Sie müssen das Full Recovery Modell nutzen und es muss ein vollständiges Backup vorhanden sein Eine gegebene Datenbank kann sich nur in einer einzigen Verfügbarkeitsgruppe befinden und diese Datenbank düerfen nicht für Database Mirroring konfiguriert sein Microsoft empfiehl außerdem, dass der Verzeichnispfad einer Datenbank auf dem primären und sekundären Server identisch sein sollte Wie man sieht, eignen sich Verfügbarkeitsgruppen nicht, um HA und DR vollständig abzubilden. Die Unterscheidung zwischen der Instanzen-Ebene (FCI) und Datenbank-Ebene (Availability Groups) ist von hoher Bedeutung. Vor kurzem wurde mir gesagt, dass man mit den Verfügbarkeitsgruppen auf Shared Storage verzichten könne und dadurch Kosten spart. So weit so gut ... Man kann natürlich eine Installation rein mit Verfügbarkeitsgruppen und ohne FCI durchführen - aber man sollte sich dann darüber bewusst sein, was man dadurch alles nicht abgesichert hat - und dies wiederum für Desaster Recovery (DR) und SLAs (Service Level Agreements) bedeutet. Kurzum, um die Kombination aus beiden AlwaysOn Produkten und der damit verbundene Komplexität kommt man wohl in der Praxis nicht herum. Availability Groups und WSFC AlwaysOn hängt von Windows Server Failover Clustering (WSFC) ab, um die aktuellen Rollen der Verfügbarkeitsreplikate einer Verfügbarkeitsgruppe zu überwachen und zu verwalten, und darüber zu entscheiden, wie ein Failover-Ereignis die Verfügbarkeitsreplikate betrifft. Das folgende Diagramm zeigt de Beziehung zwischen Verfügbarkeitsgruppen und WSFC: Der Verfügbarkeitsmodus ist eine Eigenschaft jedes Verfügbarkeitsreplikats. Synychron und Asynchron können also gemischt werden: Availability Modus (Verfügbarkeitsmodus) Asynchroner Commit-Modus Primäres replikat schließt Transaktionen ohne Warten auf Sekundäres Synchroner Commit-Modus Primäres Replikat wartet auf Commit von sekundärem Replikat Failover Typen Automatic Manual Forced (mit möglichem Datenverlust) Synchroner Commit-Modus Geplanter, manueller Failover ohne Datenverlust Automatischer Failover ohne Datenverlust Asynchroner Commit-Modus Nur Forced, manueller Failover mit möglichem Datenverlust Der SQL Server kennt keinen separaten Switchover Begriff wie in Oracle Data Guard. Für SQL Server werden alle Role Transitions als Failover bezeichnet. Tatsächlich unterstützt der SQL Server keinen Switchover für asynchrone Verbindungen. Es gibt nur die Form des Forced Failover mit möglichem Datenverlust. Eine ähnliche Fähigkeit wie der Switchover unter Oracle Data Guard ist so nicht gegeben. SQL Sever FCI mit Availability Groups (Verfügbarkeitsgruppen) Neben den Verfügbarkeitsgruppen kann eine zweite Failover-Ebene eingerichtet werden, indem SQL Server FCI (auf Shared Storage) mit WSFC implementiert wird. Ein Verfügbarkeitesreplikat kann dann auf einer Standalone Instanz gehostet werden, oder einer FCI Instanz. Zum Verständnis: Die Verfügbarkeitsgruppen selbst benötigen kein Shared Storage. Diese Kombination kann verwendet werden für lokale HA auf Ebene der Instanz und DR auf Datenbank-Ebene durch Verfügbarkeitsgruppen. Das folgende Diagramm zeigt dieses Szenario: Achtung! Hier handelt es sich nicht um ein Pendant zu Oracle RAC plus Data Guard, auch wenn das Bild diesen Eindruck vielleicht vermitteln mag - denn alle sekundären Knoten im FCI sind rein passiv. Es existiert außerdem eine weitere und ernsthafte Einschränkung: SQL Server Failover Cluster Instanzen (FCI) unterstützen nicht das automatische AlwaysOn Failover für Verfügbarkeitsgruppen. Jedes unter FCI gehostete Verfügbarkeitsreplikat kann nur für manuelles Failover konfiguriert werden. Lesbare Sekundäre Replikate Ein oder mehrere Verfügbarkeitsreplikate in einer Verfügbarkeitsgruppe können für den lesenden Zugriff konfiguriert werden, wenn sie als sekundäres Replikat laufen. Dies ähnelt Oracle Active Data Guard, jedoch gibt es Einschränkungen. Alle Abfragen gegen die sekundäre Datenbank werden automatisch auf das Snapshot Isolation Level abgebildet. Es handelt sich dabei um eine Versionierung der Rows. Microsoft versuchte hiermit die Oracle MVRC (Multi Version Read Consistency) nachzustellen. Tatsächlich muss man die SQL Server Snapshot Isolation eher mit Oracle Flashback vergleichen. Bei der Implementierung des Snapshot Isolation Levels handelt sich um ein nachträglich aufgesetztes Feature und nicht um einen inhärenten Teil des Datenbank-Kernels, wie im Falle Oracle. (Ich werde hierzu in Kürze einen weiteren Blogbeitrag verfassen, wenn ich mich mit der neuen SQL Server 2012 Core Lizenzierung beschäftige.) Für die Praxis entstehen aus der Abbildung auf das Snapshot Isolation Level ernsthafte Restriktionen, derer man sich für den Betrieb in der Praxis bereits vorab bewusst sein sollte: Sollte auf der primären Datenbank eine aktive Transaktion zu dem Zeitpunkt existieren, wenn ein lesbares sekundäres Replikat in die Verfügbarkeitsgruppe aufgenommen wird, werden die Row-Versionen auf der korrespondierenden sekundären Datenbank nicht sofort vollständig verfügbar sein. Eine aktive Transaktion auf dem primären Replikat muss zuerst abgeschlossen (Commit oder Rollback) und dieser Transaktions-Record auf dem sekundären Replikat verarbeitet werden. Bis dahin ist das Isolation Level Mapping auf der sekundären Datenbank unvollständig und Abfragen sind temporär geblockt. Microsoft sagt dazu: "This is needed to guarantee that row versions are available on the secondary replica before executing the query under snapshot isolation as all isolation levels are implicitly mapped to snapshot isolation." (SQL Storage Engine Blog: AlwaysOn: I just enabled Readable Secondary but my query is blocked?) Grundlegend bedeutet dies, dass ein aktives lesbares Replikat nicht in die Verfügbarkeitsgruppe aufgenommen werden kann, ohne das primäre Replikat vorübergehend stillzulegen. Da Leseoperationen auf das Snapshot Isolation Transaction Level abgebildet werden, kann die Bereinigung von Ghost Records auf dem primären Replikat durch Transaktionen auf einem oder mehreren sekundären Replikaten geblockt werden - z.B. durch eine lang laufende Abfrage auf dem sekundären Replikat. Diese Bereinigung wird auch blockiert, wenn die Verbindung zum sekundären Replikat abbricht oder der Datenaustausch unterbrochen wird. Auch die Log Truncation wird in diesem Zustant verhindert. Wenn dieser Zustand längere Zeit anhält, empfiehlt Microsoft das sekundäre Replikat aus der Verfügbarkeitsgruppe herauszunehmen - was ein ernsthaftes Downtime-Problem darstellt. Die Read-Only Workload auf den sekundären Replikaten kann eingehende DDL Änderungen blockieren. Obwohl die Leseoperationen aufgrund der Row-Versionierung keine Shared Locks halten, führen diese Operatioen zu Sch-S Locks (Schemastabilitätssperren). DDL-Änderungen durch Redo-Operationen können dadurch blockiert werden. Falls DDL aufgrund konkurrierender Lese-Workload blockiert wird und der Schwellenwert für 'Recovery Interval' (eine SQL Server Konfigurationsoption) überschritten wird, generiert der SQL Server das Ereignis sqlserver.lock_redo_blocked, welches Microsoft zum Kill der blockierenden Leser empfiehlt. Auf die Verfügbarkeit der Anwendung wird hierbei keinerlei Rücksicht genommen. Keine dieser Einschränkungen existiert mit Oracle Active Data Guard. Backups auf sekundären Replikaten Über die sekundären Replikate können Backups (BACKUP DATABASE via Transact-SQL) nur als copy-only Backups einer vollständigen Datenbank, Dateien und Dateigruppen erstellt werden. Das Erstellen inkrementeller Backups ist nicht unterstützt, was ein ernsthafter Rückstand ist gegenüber der Backup-Unterstützung physikalischer Standbys unter Oracle Data Guard. Hinweis: Ein möglicher Workaround via Snapshots, bleibt ein Workaround. Eine weitere Einschränkung dieses Features gegenüber Oracle Data Guard besteht darin, dass das Backup eines sekundären Replikats nicht ausgeführt werden kann, wenn es nicht mit dem primären Replikat kommunizieren kann. Darüber hinaus muss das sekundäre Replikat synchronisiert sein oder sich in der Synchronisation befinden, um das Beackup auf dem sekundären Replikat erstellen zu können. Vergleich von Microsoft AlwaysOn mit der Oracle MAA Ich komme wieder zurück auf die Eingangs erwähnte, mehrfach an mich gestellte Frage "Wann denn - und ob überhaupt - Oracle etwas Vergleichbares wie AlwaysOn bieten würde?" und meine damit verbundene (kurze) Irritation. Wenn Sie diesen Blogbeitrag bis hierher gelesen haben, dann kennen Sie jetzt meine darauf gegebene Antwort. Der eine oder andere Punkt traf dabei nicht immer auf Jeden zu, was auch nicht der tiefere Sinn und Zweck meiner Antwort war. Wenn beispielsweise kein Multi-Subnet mit im Spiel ist, sind alle diesbezüglichen Kritikpunkte zunächst obsolet. Was aber nicht bedeutet, dass sie nicht bereits morgen schon wieder zum Thema werden könnten (Sag niemals "Nie"). In manch anderes Fettnäpfchen tritt man wiederum nicht unbedingt in einer Testumgebung, sondern erst im laufenden Betrieb. Erst recht nicht dann, wenn man sich potenzieller Probleme nicht bewusst ist und keine dedizierten Tests startet. Und wer AlwaysOn erfolgreich positionieren möchte, wird auch gar kein Interesse daran haben, auf mögliche Schwachstellen und den besagten Teufel im Detail aufmerksam zu machen. Das ist keine Unterstellung - es ist nur menschlich. Außerdem ist es verständlich, dass man sich in erster Linie darauf konzentriert "was geht" und "was gut läuft", anstelle auf das "was zu Problemen führen kann" oder "nicht funktioniert". Wer will schon der Miesepeter sein? Für mich selbst gesprochen, kann ich nur sagen, dass ich lieber vorab von allen möglichen Einschränkungen wissen möchte, anstelle sie dann nach einer kurzen Zeit der heilen Welt schmerzhaft am eigenen Leib erfahren zu müssen. Ich bin davon überzeugt, dass es Ihnen nicht anders geht. Nachfolgend deshalb eine Zusammenfassung all jener Punkte, die ich im Vergleich zur Oracle MAA (Maximum Availability Architecture) als unbedingt Erwähnenswert betrachte, falls man eine Evaluierung von Microsoft AlwaysOn in Betracht zieht. 1. AlwaysOn ist eine komplexe Technologie Der SQL Server AlwaysOn Stack ist zusammengesetzt aus drei verschiedenen Technlogien: Windows Server Failover Clustering (WSFC) SQL Server Failover Cluster Instances (FCI) SQL Server Availability Groups (Verfügbarkeitsgruppen) Man kann eine derartige Lösung nicht als nahtlos bezeichnen, wofür auch die vielen von Microsoft dargestellten Einschränkungen sprechen. Während sich frühere SQL Server Versionen in Richtung eigener HA/DR Technologien entwickelten (wie Database Mirroring), empfiehlt Microsoft nun die Migration. Doch weshalb dieser Schwenk? Er führt nicht zu einem konsisten und robusten Angebot an HA/DR Technologie für geschäftskritische Umgebungen. Liegt die Antwort in meiner These begründet, nach der "Windows was the God ..." noch immer gilt und man die Nachteile der allzu engen Kopplung mit Windows nicht sehen möchte? Entscheiden Sie selbst ... 2. Failover Cluster Instanzen - Kein RAC-Pendant Die SQL Server und Windows Server Clustering Technologie basiert noch immer auf dem veralteten Aktiv-Passiv Modell und führt zu einer Verschwendung von Systemressourcen. In einer Betrachtung von lediglich zwei Knoten erschließt sich auf Anhieb noch nicht der volle Mehrwert eines Aktiv-Aktiv Clusters (wie den Real Application Clusters), wie er von Oracle bereits vor zehn Jahren entwickelt wurde. Doch kennt man die Vorzüge der Skalierbarkeit durch einfaches Hinzufügen weiterer Cluster-Knoten, die dann alle gemeinsam als ein einziges logisches System zusammenarbeiten, versteht man was hinter dem Motto "Pay-as-you-Grow" steckt. In einem Aktiv-Aktiv Cluster geht es zwar auch um Hochverfügbarkeit - und ein Failover erfolgt zudem schneller, als in einem Aktiv-Passiv Modell - aber es geht eben nicht nur darum. An dieser Stelle sei darauf hingewiesen, dass die Oracle 11g Standard Edition bereits die Nutzung von Oracle RAC bis zu vier Sockets kostenfrei beinhaltet. Möchten Sie dazu Windows nutzen, benötigen Sie keine Windows Server Enterprise Edition, da Oracle 11g die eigene Clusterware liefert. Sie kommen in den Genuss von Hochverfügbarkeit und Skalierbarkeit und können dazu die günstigere Windows Server Standard Edition nutzen. 3. SQL Server Multi-Subnet Clustering - Abhängigkeit zu 3rd Party Storage Mirroring Die SQL Server Multi-Subnet Clustering Architektur unterstützt den Aufbau eines Stretch Clusters, basiert dabei aber auf dem Aktiv-Passiv Modell. Das eigentlich Problematische ist jedoch, dass man sich zur Absicherung der Datenbank auf 3rd Party Storage Mirroring Technologie verlässt, ohne Integration zwischen dem Windows Server Failover Clustering (WSFC) und der darunterliegenden Mirroring Technologie. Wenn nun im Cluster ein Failover auf Instanzen-Ebene erfolgt, existiert keine Koordination mit einem möglichen Failover auf Ebene des Storage-Array. 4. Availability Groups (Verfügbarkeitsgruppen) - Vier, oder doch nur Zwei? Ein primäres Replikat erlaubt bis zu vier sekundäre Replikate innerhalb einer Verfügbarkeitsgruppe, jedoch nur zwei im Synchronen Commit Modus. Während dies zwar einen Vorteil gegenüber dem stringenten 1:1 Modell unter Database Mirroring darstellt, fällt der SQL Server 2012 damit immer noch weiter zurück hinter Oracle Data Guard mit bis zu 30 direkten Stanbdy Zielen - und vielen weiteren durch kaskadierende Ziele möglichen. Damit eignet sich Oracle Active Data Guard auch für die Bereitstellung einer Reader-Farm Skalierbarkeit für Internet-basierende Unternehmen. Mit AwaysOn Verfügbarkeitsgruppen ist dies nicht möglich. 5. Availability Groups (Verfügbarkeitsgruppen) - kein asynchrones Switchover Die Technologie der Verfügbarkeitsgruppen wird auch als geeignetes Mittel für administrative Aufgaben positioniert - wie Upgrades oder Wartungsarbeiten. Man muss sich jedoch einem gravierendem Defizit bewusst sein: Im asynchronen Verfügbarkeitsmodus besteht die einzige Möglichkeit für Role Transition im Forced Failover mit Datenverlust! Um den Verlust von Daten durch geplante Wartungsarbeiten zu vermeiden, muss man den synchronen Verfügbarkeitsmodus konfigurieren, was jedoch ernstzunehmende Auswirkungen auf WAN Deployments nach sich zieht. Spinnt man diesen Gedanken zu Ende, kommt man zu dem Schluss, dass die Technologie der Verfügbarkeitsgruppen für geplante Wartungsarbeiten in einem derartigen Umfeld nicht effektiv genutzt werden kann. 6. Automatisches Failover - Nicht immer möglich Sowohl die SQL Server FCI, als auch Verfügbarkeitsgruppen unterstützen automatisches Failover. Möchte man diese jedoch kombinieren, wird das Ergebnis kein automatisches Failover sein. Denn ihr Zusammentreffen im Failover-Fall führt zu Race Conditions (Wettlaufsituationen), weshalb diese Konfiguration nicht länger das automatische Failover zu einem Replikat in einer Verfügbarkeitsgruppe erlaubt. Auch hier bestätigt sich wieder die tiefere Problematik von AlwaysOn, mit einer Zusammensetzung aus unterschiedlichen Technologien und der Abhängigkeit zu Windows. 7. Problematische RTO (Recovery Time Objective) Microsoft postioniert die SQL Server Multi-Subnet Clustering Architektur als brauchbare HA/DR Architektur. Bedenkt man jedoch die Problematik im Zusammenhang mit DNS Replikation und den möglichen langen Wartezeiten auf Client-Seite von bis zu 16 Minuten, sind strenge RTO Anforderungen (Recovery Time Objectives) nicht erfüllbar. Im Gegensatz zu Oracle besitzt der SQL Server keine Datenbank-integrierten Technologien, wie Oracle Fast Application Notification (FAN) oder Oracle Fast Connection Failover (FCF). 8. Problematische RPO (Recovery Point Objective) SQL Server ermöglicht Forced Failover (erzwungenes Failover), bietet jedoch keine Möglichkeit zur automatischen Übertragung der letzten Datenbits von einem alten zu einem neuen primären Replikat, wenn der Verfügbarkeitsmodus asynchron war. Oracle Data Guard hingegen bietet diese Unterstützung durch das Flush Redo Feature. Dies sichert "Zero Data Loss" und beste RPO auch in erzwungenen Failover-Situationen. 9. Lesbare Sekundäre Replikate mit Einschränkungen Aufgrund des Snapshot Isolation Transaction Level für lesbare sekundäre Replikate, besitzen diese Einschränkungen mit Auswirkung auf die primäre Datenbank. Die Bereinigung von Ghost Records auf der primären Datenbank, wird beeinflusst von lang laufenden Abfragen auf der lesabaren sekundären Datenbank. Die lesbare sekundäre Datenbank kann nicht in die Verfügbarkeitsgruppe aufgenommen werden, wenn es aktive Transaktionen auf der primären Datenbank gibt. Zusätzlich können DLL Änderungen auf der primären Datenbank durch Abfragen auf der sekundären blockiert werden. Und imkrementelle Backups werden hier nicht unterstützt. Keine dieser Restriktionen existiert unter Oracle Data Guard.

Read the article
Weird Network Behavior of Home Router

- by Stilgar

First of all I would like to apologize because what you are going to read will be long and confusing but I am fighting this issue for 3 days now and am out of ideas. At home I have the following setup 50Mbps Internet connects into a home router A 2 desktop computers connect to router A via standard FTP LAN cables including one where the cable is ~20m long. a second router B connects to router A via standard FTP LAN cable X (~20m long). several devices connect to the wireless network of router B and there are a couple of desktop computers connected to it through FTP LAN cables. For some reason computers connected to router B when it is connected via cable X have very slow Internet connection. It is like 5 times slower than what is expected. This is the actual problem I am trying to solve. Interesting facts If a computer is connected to cable X directly instead of through router B the Internet speed is just fine (up to the 50Mbps I get from the ISP). Tested with two computers. I have tried replacing router B with another router C and the problem persists. If I connect router B via another cable to the same ports with the same settings everything seems to work fine and computers connected to router B have quite fast Internet I have tested mainly via Speedtest.net but I have also achieved similar speeds when downloading a file The upload speed is quite higher than the download speed in all cases. Note that my ISP usually has higher upload speed (unless it manages to hit the 50Mbps cap) It seems like the speed when connecting through router B with cable X is reduced 4-5 times no matter what the original speed is. For example via router B I get 10Mbps speed to local servers where I get 50Mbps when connected on router A. If I use a distant server where the ISP is only able to provide 25Mbps I get 4-5Mbps on router B. WiFi is slower than LAN on both routers (which is normal) but the reduced speed is reduced proportionally for WiFi. In addition the upload speed is normally higher from the ISP and it is also reduced proportionally. I have tried two different network configurations. One where I have NAT behind NAT where router B connects to router A via the WAN port and has its own DHCP. Second where router B connects to router A via standard LAN port and has DHCP disabled. In this configuration router B serves as a switch and the Network Gateway for computers connected to router B is the internal IP address of router A. Both configurations work just fine but both manifest the reduced speed issue. pings seem to work just fine As far as I can tell none of the cables is crossed The RJ45 setup for cable X orange orange-white brown brow-white blue blue-white green green-white This is a big problem for me since cable X passes through walls and floors and is very hard to replace. I also may have gotten some of the facts wrong because I am almost going crazy with this issue and testing includes going several floors up and down the staircase. One hypothesis I came up with is that the cable is defective in such a way that the voltage from the router affects its performance. When it is connected to a computer it performs just fine but the router has less power. Related hypothesis includes the cable being affected by electricity cables in the walls when the voltage is low. (I know nothing about electricity) So any ideas what to do, what to test or what the issue may be?

Read the article
Memcached Lagging

- by Brad Dwyer

Let me preface this by saying that this is a followup question to this topic. That was "solved" by switching from Solaris (SmartOS) to Ubuntu for the memcached server. Now we've multiplied load by about 5x and are running into problems again. We are running a site that is doing about 1000 requests/minute, each request hits Memcached with approximately 3 reads and 1 write. So load is approximately 65 requests per second. Total data in the cache is about 37M, and each key contains a very small amount of data (a JSON-encoded array of integers amounting to less than 1K). We have setup a benchmarking script on these pages and fed the data into StatsD for logging. The problem is that there are spikes where Memcached takes a very long time to respond. These do not appear to correlate with spikes in traffic. What could be causing these spikes? Why would memcached take over a second to reply? We just booted up a second server to put in the pool and it didn't make any noticeable difference in the frequency or severity of the spikes. This is the output of getStats() on the servers: Array ( [-----------] => Array ( [pid] => 1364 [uptime] => 3715684 [threads] => 4 [time] => 1336596719 [pointer_size] => 64 [rusage_user_seconds] => 7924 [rusage_user_microseconds] => 170000 [rusage_system_seconds] => 187214 [rusage_system_microseconds] => 190000 [curr_items] => 12578 [total_items] => 53516300 [limit_maxbytes] => 943718400 [curr_connections] => 14 [total_connections] => 72550117 [connection_structures] => 165 [bytes] => 2616068 [cmd_get] => 450388258 [cmd_set] => 53493365 [get_hits] => 450388258 [get_misses] => 2244297 [evictions] => 0 [bytes_read] => 2138744916 [bytes_written] => 745275216 [version] => 1.4.2 ) [-----------:11211] => Array ( [pid] => 8099 [uptime] => 4687 [threads] => 4 [time] => 1336596719 [pointer_size] => 64 [rusage_user_seconds] => 7 [rusage_user_microseconds] => 170000 [rusage_system_seconds] => 290 [rusage_system_microseconds] => 990000 [curr_items] => 2384 [total_items] => 225964 [limit_maxbytes] => 943718400 [curr_connections] => 7 [total_connections] => 588097 [connection_structures] => 91 [bytes] => 562641 [cmd_get] => 1012562 [cmd_set] => 225778 [get_hits] => 1012562 [get_misses] => 125161 [evictions] => 0 [bytes_read] => 91270698 [bytes_written] => 350071516 [version] => 1.4.2 ) ) Edit: Here is the result of a set and retrieve of 10,000 values. Normal: Stored 10000 values in 5.6118 seconds. Average: 0.0006 High: 0.1958 Low: 0.0003 Fetched 10000 values in 5.1215 seconds. Average: 0.0005 High: 0.0141 Low: 0.0003 When Spiking: Stored 10000 values in 16.5074 seconds. Average: 0.0017 High: 0.9288 Low: 0.0003 Fetched 10000 values in 19.8771 seconds. Average: 0.0020 High: 0.9478 Low: 0.0003

Read the article
CLSF & CLK 2013 Trip Report by Jeff Liu

- by jamesmorris

This is a contributed post from Jeff Liu, lead XFS developer for the Oracle mainline Linux kernel team. Recently, I attended both the China Linux Storage and Filesystem workshop (CLSF), and the China Linux Kernel conference (CLK), which were held in Shanghai. Here are the highlights for both events. CLSF - 17th October XFS update (led by Jeff Liu) XFS keeps rapid progress with a lot of changes, especially focused on the infrastructure/performance improvements as well as new feature development. This can be reflected with a sample statistics among XFS/Ext4+JBD2/Btrfs via: # git diff --stat --minimal -C -M v3.7..v3.12-rc4 -- fs/xfs|fs/ext4+fs/jbd2|fs/btrfs XFS: 141 files changed, 27598 insertions(+), 19113 deletions(-) Ext4+JBD2: 39 files changed, 10487 insertions(+), 5454 deletions(-) Btrfs: 70 files changed, 19875 insertions(+), 8130 deletions(-) What made up those changes in XFS? Self-describing metadata(CRC32c). This is a new feature and it contributed about 70% code changes, it can be enabled via `mkfs.xfs -m crc=1 /dev/xxx` for v5 superblock. Transaction log space reservation improvements. With this change, we can calculate the log space reservation at mount time rather than runtime to reduce the the CPU overhead. User namespace support. So both XFS and USERNS can be enabled on kernel configuration begin from Linux 3.10. Thanks Dwight Engen's efforts for this thing. Split project/group quota inodes. Originally, project quota can not be enabled with group quota at the same time because they were share the same quota file inode, now it works but only for v5 super block. i.e, CRC enabled. CONFIG_XFS_WARN, an new lightweight runtime debugger which can be deployed in production environment. Readahead log object recovery, this change can speed up the log replay progress significantly. Speculative preallocation inode tracking, clearing and throttling. The main purpose is to deal with inodes with post-EOF space due to speculative preallocation, support improved quota management to free up a significant amount of unwritten space when at or near EDQUOT. It support backgroup scanning which occurs on a longish interval(5 mins by default, tunable), and on-demand scanning/trimming via ioctl(2). Bitter arguments ensued from this session, especially for the comparison between Ext4 and Btrfs in different areas, I have to spent a whole morning of the 1st day answering those questions. We basically agreed on XFS is the best choice in Linux nowadays because: Stable, XFS has a good record in stability in the past 10 years. Fengguang Wu who lead the 0-day kernel test project also said that he has observed less error than other filesystems in the past 1+ years, I own it to the XFS upstream code reviewer, they always performing serious code review as well as testing. Good performance for large/small files, XFS does not works very well for small files has already been an old story for years. Best choice (maybe) for distributed PB filesystems. e.g, Ceph recommends delopy OSD daemon on XFS because Ext4 has limited xattr size. Best choice for large storage (>16TB). Ext4 does not support a single file more than around 15.95TB. Scalability, any objection to XFS is best in this point? :) XFS is better to deal with transaction concurrency than Ext4, why? The maximum size of the log in XFS is 2038MB compare to 128MB in Ext4. Misc. Ext4 is widely used and it has been proved fast/stable in various loads and scenarios, XFS just need more customers, and Btrfs is still on the road to be a manhood. Ceph Introduction (Led by Li Wang) This a hot topic. Li gave us a nice introduction about the design as well as their current works. Actually, Ceph client has been included in Linux kernel since 2.6.34 and supported by Openstack since Folsom but it seems that it has not yet been widely deployment in production environment. Their major work is focus on the inline data support to separate the metadata and data storage, reduce the file access time, i.e, a file access need communication twice, fetch the metadata from MDS and then get data from OSD, and also, the small file access is limited by the network latency. The solution is, for the small files they would like to store the data at metadata so that when accessing a small file, the metadata server can push both metadata and data to the client at the same time. In this way, they can reduce the overhead of calculating the data offset and save the communication to OSD. For this feature, they have only run some small scale testing but really saw noticeable improvements. Test environment: Intel 2 CPU 12 Core, 64GB RAM, Ubuntu 12.04, Ceph 0.56.6 with 200GB SATA disk, 15 OSD, 1 MDS, 1 MON. The sequence read performance for 1K size files improved about 50%. I have asked Li and Zheng Yan (the core developer of Ceph, who also worked on Btrfs) whether Ceph is really stable and can be deployed at production environment for large scale PB level storage, but they can not give a positive answer, looks Ceph even does not spread over Dreamhost (subject to confirmation). From Li, they only deployed Ceph for a small scale storage(32 nodes) although they'd like to try 6000 nodes in the future. Improve Linux swap for Flash storage (led by Shaohua Li) Because of high density, low power and low price, flash storage (SSD) is a good candidate to partially replace DRAM. A quick answer for this is using SSD as swap. But Linux swap is designed for slow hard disk storage, so there are a lot of challenges to efficiently use SSD for swap. SWAPOUT swap_map scan swap_map is the in-memory data structure to track swap disk usage, but it is a slow linear scan. It will become a bottleneck while finding many adjacent pages in the use of SSD. Shaohua Li have changed it to a cluster(128K) list, resulting in O(1) algorithm. However, this apporoach needs restrictive cluster alignment and only enabled for SSD. IO pattern In most cases, the swap io is in interleaved pattern because of mutiple reclaimers or a free cluster is shared by all reclaimers. Even though block layer can merge interleaved IO to some extent, but we cannot count on it completely. Hence the per-cpu cluster is added base on the previous change, it can help reclaimer do sequential IO and the block layer will be easier to merge IO. TLB flush: If we're reclaiming one active page, we should first move the page from active lru list to inactive lru list, and then reclaim the page from inactive lru to swap it out. During the process, we need to clear PTE twice: first is 'A'(ACCESS) bit, second is 'P'(PRESENT) bit. Processors need to send lots of ipi which make the TLB flush really expensive. Some works have been done to improve this, including rework smp_call_functiom_many() or remove the first TLB flush in x86, but there still have some arguments here and only parts of works have been pushed to mainline. SWAPIN: Page fault does iodepth=1 sync io, but it's a little waste if only issue a page size's IO. The obvious solution is doing swap readahead. But the current in-kernel swap readahead is arbitary(always 8 pages), and it always doesn't perform well for both random and sequential access workload. Shaohua introduced a new flag for madvise(MADV_WILLNEED) to do swap prefetch, so the changes happen in userspace API and leave the in-kernel readahead unchanged(but I think some improvement can also be done here). SWAP discard As we know, discard is important for SSD write throughout, but the current swap discard implementation is synchronous. He changed it to async discard which allow discard and write run in the same time. Meanwhile, the unit of discard is also optimized to cluster. Misc: lock contention For many concurrent swapout and swapin , the lock contention such as anon_vma or swap_lock is high, so he changed the swap_lock to a per-swap lock. But there still have some lock contention in very high speed SSD because of swapcache address_space lock. Zproject (led by Bob Liu) Bob gave us a very nice introduction about the current memory compression status. Now there are 3 projects(zswap/zram/zcache) which all aim at smooth swap IO storm and promote performance, but they all have their own pros and cons. ZSWAP It is implemented based on frontswap API and it uses a dynamic allocater named Zbud to allocate free pages. Zbud means pairs of zpages are "buddied" and it can only store at most two compressed pages in one page frame, so the max compress ratio is 50%. Each page frame is lru-linked and can do shink in memory pressure. If the compressed memory pool reach its limitation, shink or reclaim happens. It decompress the page frame into two new allocated pages and then write them to real swap device, but it can fail when allocating the two pages. ZRAM Acts as a compressed ramdisk and used as swap device, and it use zsmalloc as its allocator which has high density but may have fragmentation issues. Besides, page reclaim is hard since it will need more pages to uncompress and free just one page. ZRAM is preferred by embedded system which may not have any real swap device. Now both ZRAM and ZSWAP are in driver/staging tree, and in the mm community there are some disscussions of merging ZRAM into ZSWAP or viceversa, but no agreement yet. ZCACHE Handles file page compression but it is removed out of staging recently. From industry (led by Tang Jie, LSI) An LSI engineer introduced several new produces to us. The first is raid5/6 cards that it use full stripe writes to improve performance. The 2nd one he introduced is SandForce flash controller, who can understand data file types (data entropy) to reduce write amplification (WA) for nearly all writes. It's called DuraWrite and typical WA is 0.5. What's more, if enable its Dynamic Logical Capacity function module, the controller can do data compression which is transparent to upper layer. LSI testing shows that with this virtual capacity enables 1x TB drive can support up to 2x TB capacity, but the application must monitor free flash space to maintain optimal performance and to guard against free flash space exhaustion. He said the most useful application is for datebase. Another thing I think it's worth to mention is that a NV-DRAM memory in NMR/Raptor which is directly exposed to host system. Applications can directly access the NV-DRAM via a memory address - using standard system call mmap(). He said that it is very useful for database logging now. This kind of NVM produces are beginning to appear in recent years, and it is said that Samsung is building a research center in China for related produces. IMHO, NVM will bring an effect to current os layer especially on file system, e.g. its journaling may need to redesign to fully utilize these nonvolatile memory. OCFS2 (led by Canquan Shen) Without a doubt, HuaWei is the biggest contributor to OCFS2 in the past two years. They have posted 46 upstream patches and 39 patches have been merged. Their current project is based on 32/64 nodes cluster, but they also tried 128 nodes at the experimental stage. The major work they are working is to support ATS (atomic test and set), it can be works with DLM at the same time. Looks this idea is inspired by the vmware VMFS locking, i.e, http://blogs.vmware.com/vsphere/2012/05/vmfs-locking-uncovered.html CLK - 18th October 2013 Improving Linux Development with Better Tools (Andi Kleen) This talk focused on how to find/solve bugs along with the Linux complexity growing. Generally, we can do this with the following kind of tools: Static code checkers tools. e.g, sparse, smatch, coccinelle, clang checker, checkpatch, gcc -W/LTO, stanse. This can help check a lot of things, simple mistakes, complex problems, but the challenges are: some are very slow, false positives, may need a concentrated effort to get false positives down. Especially, no static checker I found can follow indirect calls (“OO in C”, common in kernel): struct foo_ops { int (*do_foo)(struct foo *obj); } foo->do_foo(foo); Dynamic runtime checkers, e.g, thread checkers, kmemcheck, lockdep. Ideally all kernel code would come with a test suite, then someone could run all the dynamic checkers. Fuzzers/test suites. e.g, Trinity is a great tool, it finds many bugs, but needs manual model for each syscall. Modern fuzzers around using automatic feedback, but notfor kernel yet: http://taviso.decsystem.org/making_software_dumber.pdf Debuggers/Tracers to understand code, e.g, ftrace, can dump on events/oops/custom triggers, but still too much overhead in many cases to run always during debug. Tools to read/understand source, e.g, grep/cscope work great for many cases, but do not understand indirect pointers (OO in C model used in kernel), give us all “do_foo” instances: struct foo_ops { int (*do_foo)(struct foo *obj); } = { .do_foo = my_foo }; foo>do_foo(foo); That would be great to have a cscope like tool that understands this based on types/initializers XFS: The High Performance Enterprise File System (Jeff Liu) [slides] I gave a talk for introducing the disk layout, unique features, as well as the recent changes. The slides include some charts to reflect the performances between XFS/Btrfs/Ext4 for small files. About a dozen users raised their hands when I asking who has experienced with XFS. I remembered that when I asked the same question in LinuxCon/Japan, only 3 people raised their hands, but they are Chris Mason, Ric Wheeler, and another attendee. The attendee questions were mainly focused on stability, and comparison with other file systems. Linux Containers (Feng Gao) The speaker introduced us that the purpose for those kind of namespaces, include mount/UTS/IPC/Network/Pid/User, as well as the system API/ABI. For the userspace tools, He mainly focus on the Libvirt LXC rather than us(LXC). Libvirt LXC is another userspace container management tool, implemented as one type of libvirt driver, it can manage containers, create namespace, create private filesystem layout for container, Create devices for container and setup resources controller via cgroup. In this talk, Feng also mentioned another two possible new namespaces in the future, the 1st is the audit, but not sure if it should be assigned to user namespace or not. Another is about syslog, but the question is do we really need it? In-memory Compression (Bob Liu) Same as CLSF, a nice introduction that I have already mentioned above. Misc There were some other talks related to ACPI based memory hotplug, smart wake-affinity in scheduler etc., but my head is not big enough to record all those things. -- Jeff Liu

Read the article
Metro: Understanding CSS Media Queries

- by Stephen.Walther

If you are building a Metro style application then your application needs to look great when used on a wide variety of devices. Your application needs to work on tiny little phones, slates, desktop monitors, and the super high resolution displays of the future. Your application also must support portable devices used with different orientations. If someone tilts their phone from portrait to landscape mode then your application must still be usable. Finally, your Metro style application must look great in different states. For example, your Metro application can be in a “snapped state” when it is shrunk so it can share screen real estate with another application. In this blog post, you learn how to use Cascading Style Sheet media queries to support different devices, different device orientations, and different application states. First, you are provided with an overview of the W3C Media Query recommendation and you learn how to detect standard media features. Next, you learn about the Microsoft extensions to media queries which are supported in Metro style applications. For example, you learn how to use the –ms-view-state feature to detect whether an application is in a “snapped state” or “fill state”. Finally, you learn how to programmatically detect the features of a device and the state of an application. You learn how to use the msMatchMedia() method to execute a media query with JavaScript. Using CSS Media Queries Media queries enable you to apply different styles depending on the features of a device. Media queries are not only supported by Metro style applications, most modern web browsers now support media queries including Google Chrome 4+, Mozilla Firefox 3.5+, Apple Safari 4+, and Microsoft Internet Explorer 9+. Loading Different Style Sheets with Media Queries Imagine, for example, that you want to display different content depending on the horizontal resolution of a device. In that case, you can load different style sheets optimized for different sized devices. Consider the following HTML page: <!DOCTYPE html> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>U.S. Robotics and Mechanical Men</title> <link href="main.css" rel="stylesheet" type="text/css" />  <link href="medium.css" rel="stylesheet" type="text/css" media="(max-width:1100px)" />  <link href="small.css" rel="stylesheet" type="text/css" media="(max-width:800px)" /> </head> <body> <div id="header"> <h1>U.S. Robotics and Mechanical Men</h1> </div>  <div id="leftColumn"> <img src="advertisement1.gif" alt="advertisement" /> <img src="advertisement2.jpg" alt="advertisement" /> </div>  <div id="mainContentColumn"> <label>Search Products</label> <input id="search" /><button>Search</button> </div>  <div id="rightColumn"> <h1>Deal of the Day!</h1> <p> Buy two cameras and get a third camera for free! Offer is good for today only. </p> </div> </body> </html> The HTML page above contains three columns: a leftColumn, mainContentColumn, and rightColumn. When the page is displayed on a low resolution device, such as a phone, only the mainContentColumn appears: When the page is displayed in a medium resolution device, such as a slate, both the leftColumn and the mainContentColumns are displayed: Finally, when the page is displayed in a high-resolution device, such as a computer monitor, all three columns are displayed: Different content is displayed with the help of media queries. The page above contains three style sheet links. Two of the style links include a media attribute: <link href="main.css" rel="stylesheet" type="text/css" />  <link href="medium.css" rel="stylesheet" type="text/css" media="(max-width:1100px)" />  <link href="small.css" rel="stylesheet" type="text/css" media="(max-width:800px)" /> The main.css style sheet contains default styles for the elements in the page. The medium.css style sheet is applied when the page width is less than 1100px. This style sheet hides the rightColumn and changes the page background color to lime: html { background-color: lime; } #rightColumn { display:none; } Finally, the small.css style sheet is loaded when the page width is less than 800px. This style sheet hides the leftColumn and changes the page background color to red: html { background-color: red; } #leftColumn { display:none; } The different style sheets are applied as you stretch and contract your browser window. You don’t need to refresh the page after changing the size of the page for a media query to be applied: Using the @media Rule You don’t need to divide your styles into separate files to take advantage of media queries. You can group styles by using the @media rule. For example, the following HTML page contains one set of styles which are applied when a device’s orientation is portrait and another set of styles when a device’s orientation is landscape: <!DOCTYPE html> <html> <head> <meta charset="utf-8" /> <title>Application1</title> <style type="text/css"> html { font-family:'Segoe UI Semilight'; font-size: xx-large; } @media screen and (orientation:landscape) { html { background-color: lime; } p.content { width: 50%; margin: auto; } } @media screen and (orientation:portrait) { html { background-color: red; } p.content { width: 90%; margin: auto; } } </style> </head> <body> <p class="content"> Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Maecenas porttitor congue massa. Fusce posuere, magna sed pulvinar ultricies, purus lectus malesuada libero, sit amet commodo magna eros quis urna. </p> </body> </html> When a device has a landscape orientation then the background color is set to the color lime and the text only takes up 50% of the available horizontal space: When the device has a portrait orientation then the background color is red and the text takes up 90% of the available horizontal space: Using Standard CSS Media Features The official list of standard media features is contained in the W3C CSS Media Query recommendation located here: http://www.w3.org/TR/css3-mediaqueries/ Here is the official list of the 13 media features described in the standard: · width – The current width of the viewport · height – The current height of the viewport · device-width – The width of the device · device-height – The height of the device · orientation – The value portrait or landscape · aspect-ratio – The ratio of width to height · device-aspect-ratio – The ratio of device width to device height · color – The number of bits per color supported by the device · color-index – The number of colors in the color lookup table of the device · monochrome – The number of bits in the monochrome frame buffer · resolution – The density of the pixels supported by the device · scan – The values progressive or interlace (used for TVs) · grid – The values 0 or 1 which indicate whether the device supports a grid or a bitmap Many of the media features in the list above support the min- and max- prefix. For example, you can test for the min-width using a query like this: (min-width:800px) You can use the logical and operator with media queries when you need to check whether a device supports more than one feature. For example, the following query returns true only when the width of the device is between 800 and 1,200 pixels: (min-width:800px) and (max-width:1200px) Finally, you can use the different media types – all, braille, embossed, handheld, print, projection, screen, speech, tty, tv — with a media query. For example, the following media query only applies to a page when a page is being printed in color: print and (color) If you don’t specify a media type then media type all is assumed. Using Metro Style Media Features Microsoft has extended the standard list of media features which you can include in a media query with two custom media features: · -ms-high-contrast – The values any, black-white, white-black · -ms-view-state – The values full-screen, fill, snapped, device-portrait You can take advantage of the –ms-high-contrast media feature to make your web application more accessible to individuals with disabilities. In high contrast mode, you should make your application easier to use for individuals with vision disabilities. The –ms-view-state media feature enables you to detect the state of an application. For example, when an application is snapped, the application only occupies part of the available screen real estate. The snapped application appears on the left or right side of the screen and the rest of the screen real estate is dominated by the fill application (Metro style applications can only be snapped on devices with a horizontal resolution of greater than 1,366 pixels). Here is a page which contains style rules for an application in both a snap and fill application state: <!DOCTYPE html> <html> <head> <meta charset="utf-8" /> <title>MyWinWebApp</title> <style type="text/css"> html { font-family:'Segoe UI Semilight'; font-size: xx-large; } @media screen and (-ms-view-state:snapped) { html { background-color: lime; } } @media screen and (-ms-view-state:fill) { html { background-color: red; } } </style> </head> <body> <p class="content"> Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Maecenas porttitor congue massa. Fusce posuere, magna sed pulvinar ultricies, purus lectus malesuada libero, sit amet commodo magna eros quis urna. </p> </body> </html> When the application is snapped, the application appears with a lime background color: When the application state is fill then the background color changes to red: When the application takes up the entire screen real estate – it is not in snapped or fill state – then no special style rules apply and the application appears with a white background color. Querying Media Features with JavaScript You can perform media queries using JavaScript by taking advantage of the window.msMatchMedia() method. This method returns a MSMediaQueryList which has a matches method that represents success or failure. For example, the following code checks whether the current device is in portrait mode: if (window.msMatchMedia("(orientation:portrait)").matches) { console.log("portrait"); } else { console.log("landscape"); } If the matches property returns true, then the device is in portrait mode and the message “portrait” is written to the Visual Studio JavaScript Console window. Otherwise, the message “landscape” is written to the JavaScript Console window. You can create an event listener which triggers code whenever the results of a media query changes. For example, the following code writes a message to the JavaScript Console whenever the current device is switched into or out of Portrait mode: window.msMatchMedia("(orientation:portrait)").addListener(function (mql) { if (mql.matches) { console.log("Switched to portrait"); } }); Be aware that the event listener is triggered whenever the result of the media query changes. So the event listener is triggered both when you switch from landscape to portrait and when you switch from portrait to landscape. For this reason, you need to verify that the matches property has the value true before writing the message. Summary The goal of this blog entry was to explain how CSS media queries work in the context of a Metro style application written with JavaScript. First, you were provided with an overview of the W3C CSS Media Query recommendation. You learned about the standard media features which you can query such as width and orientation. Next, we focused on the Microsoft extensions to media queries. You learned how to use –ms-view-state to detect whether a Metro style application is in “snapped” or “fill” state. You also learned how to use the msMatchMedia() method to perform a media query from JavaScript.

Read the article
problems mounting an external IDE drive via USB in ubuntu

- by Roy Rico

I am having a problem connecting a specific IDE drive to my linux box. It's an old drive which I just want to get about 3 GB of files off of. INFO I am trying to connect a 200GB IDE Maxtor Drive, internally and externally... externally: I am using an self powered USB IDE external drive enclosure which I have used to connect various drives, under ubuntu and windows, in the past. The other posts stated it coudl be a problem I think i may have formatted the /dev/sdc partition instead of /dev/sdc1 partition when i originally formatted the drive. internally: I only have one machine left that has an internal IDE interface, and it's got XP on it. I plugged this drive internally into this machine with windows XP and used the ext2/ext3 drivers to mount this drive, but some files have question marks (?) in the file names which is messing up my copy process in windows. I can't delete the files under windows. Ubuntu Linux will not install on my only remaining machine that has IDE controller. I have tried the suggestions in the questions below http://superuser.com/questions/88182/mount-an-external-drive-in-ubuntu http://superuser.com/questions/23210/ubuntu-fails-to-mount-usb-drive it looks like i can see the drive in /proc/partitions $ cat /proc/partitions major minor #blocks name 8 0 78125000 sda 8 1 74894998 sda1 8 2 1 sda2 8 5 3229033 sda5 8 16 199148544 sdb <-- could be my drive? but it's not listed under fdisk -l $ fdisk -l Disk /dev/sda: 80.0 GB, 80000000000 bytes 255 heads, 63 sectors/track, 9726 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Disk identifier: 0xd0f4738c Device Boot Start End Blocks Id System /dev/sda1 * 1 9324 74894998+ 83 Linux /dev/sda2 9325 9726 3229065 5 Extended /dev/sda5 9325 9726 3229033+ 82 Linux swap / Solaris and here is my log of /var/log/messages. with a bunch of weird output, can someone let me know what that weird output is? Mar 3 19:49:40 mala kernel: [687455.112029] usb 1-7: new high speed USB device using ehci_hcd and address 3 Mar 3 19:49:41 mala kernel: [687455.248576] usb 1-7: configuration #1 chosen from 1 choice Mar 3 19:49:41 mala kernel: [687455.267450] Initializing USB Mass Storage driver... Mar 3 19:49:41 mala kernel: [687455.269180] scsi4 : SCSI emulation for USB Mass Storage devices Mar 3 19:49:41 mala kernel: [687455.269410] usbcore: registered new interface driver usb-storage Mar 3 19:49:41 mala kernel: [687455.269416] USB Mass Storage support registered. Mar 3 19:49:46 mala kernel: [687460.270917] scsi 4:0:0:0: Direct-Access Maxtor 6 Y200P0 YAR4 PQ: 0 ANSI: 2 Mar 3 19:49:46 mala kernel: [687460.271485] sd 4:0:0:0: Attached scsi generic sg2 type 0 Mar 3 19:49:46 mala kernel: [687460.278858] sd 4:0:0:0: [sdb] 398297088 512-byte logical blocks: (203 GB/189 GiB) Mar 3 19:49:46 mala kernel: [687460.280866] sd 4:0:0:0: [sdb] Write Protect is off Mar 3 19:50:16 mala kernel: [687460.283784] sdb: Mar 3 19:50:16 mala kernel: [687491.112020] usb 1-7: reset high speed USB device using ehci_hcd and address 3 Mar 3 19:50:47 mala kernel: [687522.120030] usb 1-7: reset high speed USB device using ehci_hcd and address 3 Mar 3 19:51:18 mala kernel: [687553.112034] usb 1-7: reset high speed USB device using ehci_hcd and address 3 Mar 3 19:51:49 mala kernel: [687584.116025] usb 1-7: reset high speed USB device using ehci_hcd and address 3 Mar 3 19:52:02 mala kernel: [687596.170632] type=1505 audit(1267671122.035:31): operation="profile_replace" pid=8426 name=/usr/lib/cups/backend/cups-pdf Mar 3 19:52:02 mala kernel: [687596.171551] type=1505 audit(1267671122.035:32): operation="profile_replace" pid=8426 name=/usr/sbin/cupsd Mar 3 19:52:06 mala kernel: [687600.908056] async/0 D c08145c0 0 7655 2 0x00000000 Mar 3 19:52:06 mala kernel: [687600.908062] e5601d38 00000046 e5774000 c08145c0 e4c2a848 c08145c0 d203973a 0002713d Mar 3 19:52:06 mala kernel: [687600.908072] c08145c0 c08145c0 e4c2a848 c08145c0 00000000 0002713d c08145c0 f0a98c00 Mar 3 19:52:06 mala kernel: [687600.908079] e4c2a5b0 c20125c0 00000002 e5601d80 e5601d44 c056f3be e5601d78 e5601d4c Mar 3 19:52:06 mala kernel: [687600.908087] Call Trace: Mar 3 19:52:06 mala kernel: [687600.908099] [<c056f3be>] io_schedule+0x1e/0x30 Mar 3 19:52:06 mala kernel: [687600.908107] [<c01b2cf5>] sync_page+0x35/0x40 Mar 3 19:52:06 mala kernel: [687600.908111] [<c056f8f7>] __wait_on_bit_lock+0x47/0x90 Mar 3 19:52:06 mala kernel: [687600.908115] [<c01b2cc0>] ? sync_page+0x0/0x40 Mar 3 19:52:06 mala kernel: [687600.908121] [<c020f390>] ? blkdev_readpage+0x0/0x20 Mar 3 19:52:06 mala kernel: [687600.908125] [<c01b2ca9>] __lock_page+0x79/0x80 Mar 3 19:52:06 mala kernel: [687600.908130] [<c015c130>] ? wake_bit_function+0x0/0x50 Mar 3 19:52:06 mala kernel: [687600.908135] [<c01b459f>] read_cache_page_async+0xbf/0xd0 Mar 3 19:52:06 mala kernel: [687600.908139] [<c01b45c2>] read_cache_page+0x12/0x60 Mar 3 19:52:06 mala kernel: [687600.908144] [<c0232dca>] read_dev_sector+0x3a/0x80 Mar 3 19:52:06 mala kernel: [687600.908148] [<c0233d3e>] adfspart_check_ICS+0x1e/0x160 Mar 3 19:52:06 mala kernel: [687600.908152] [<c023339f>] ? disk_name+0xaf/0xc0 Mar 3 19:52:06 mala kernel: [687600.908157] [<c0233d20>] ? adfspart_check_ICS+0x0/0x160 Mar 3 19:52:06 mala kernel: [687600.908161] [<c02334de>] check_partition+0x10e/0x180 Mar 3 19:52:06 mala kernel: [687600.908165] [<c02335f6>] rescan_partitions+0xa6/0x330 Mar 3 19:52:06 mala kernel: [687600.908171] [<c0312472>] ? kobject_get+0x12/0x20 Mar 3 19:52:06 mala kernel: [687600.908175] [<c0312472>] ? kobject_get+0x12/0x20 Mar 3 19:52:06 mala kernel: [687600.908180] [<c039fc43>] ? get_device+0x13/0x20 Mar 3 19:52:06 mala kernel: [687600.908185] [<c03c263f>] ? sd_open+0x5f/0x1b0 Mar 3 19:52:06 mala kernel: [687600.908189] [<c020fda0>] __blkdev_get+0x140/0x310 Mar 3 19:52:06 mala kernel: [687600.908194] [<c020f0ac>] ? bdget+0xec/0x100 Mar 3 19:52:06 mala kernel: [687600.908198] [<c020ff7a>] blkdev_get+0xa/0x10 Mar 3 19:52:06 mala kernel: [687600.908202] [<c0232f30>] register_disk+0x120/0x140 Mar 3 19:52:06 mala kernel: [687600.908207] [<c0308b4d>] ? blk_register_region+0x2d/0x40 Mar 3 19:52:06 mala kernel: [687600.908211] [<c03084f0>] ? exact_match+0x0/0x10 Mar 3 19:52:06 mala kernel: [687600.908216] [<c0308cf0>] add_disk+0x80/0x140 Mar 3 19:52:06 mala kernel: [687600.908221] [<c03084f0>] ? exact_match+0x0/0x10 Mar 3 19:52:06 mala kernel: [687600.908225] [<c0308860>] ? exact_lock+0x0/0x20 Mar 3 19:52:06 mala kernel: [687600.908230] [<c03c53df>] sd_probe_async+0xff/0x1c0

Read the article
problems mounting an external IDE drive via USB in ubuntu

- by Roy Rico

I am having a problem connecting a specific IDE drive to my linux box. It's an old drive which I just want to get about 3 GB of files off of. INFO I am trying to connect a 200GB IDE Maxtor Drive, internally and externally... externally: I am using an self powered USB IDE external drive enclosure which I have used to connect various drives, under ubuntu and windows, in the past. The other posts stated it coudl be a problem I think i may have formatted the /dev/sdc partition instead of /dev/sdc1 partition when i originally formatted the drive. internally: I only have one machine left that has an internal IDE interface, and it's got XP on it. I plugged this drive internally into this machine with windows XP and used the ext2/ext3 drivers to mount this drive, but some files have question marks (?) in the file names which is messing up my copy process in windows. I can't delete the files under windows. Ubuntu Linux will not install on my only remaining machine that has IDE controller. I have tried the suggestions in the questions below http://superuser.com/questions/88182/mount-an-external-drive-in-ubuntu http://superuser.com/questions/23210/ubuntu-fails-to-mount-usb-drive it looks like i can see the drive in /proc/partitions $ cat /proc/partitions major minor #blocks name 8 0 78125000 sda 8 1 74894998 sda1 8 2 1 sda2 8 5 3229033 sda5 8 16 199148544 sdb <-- could be my drive? but it's not listed under fdisk -l $ fdisk -l Disk /dev/sda: 80.0 GB, 80000000000 bytes 255 heads, 63 sectors/track, 9726 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Disk identifier: 0xd0f4738c Device Boot Start End Blocks Id System /dev/sda1 * 1 9324 74894998+ 83 Linux /dev/sda2 9325 9726 3229065 5 Extended /dev/sda5 9325 9726 3229033+ 82 Linux swap / Solaris and here is my log of /var/log/messages. with a bunch of weird output, can someone let me know what that weird output is? Mar 3 19:49:40 mala kernel: [687455.112029] usb 1-7: new high speed USB device using ehci_hcd and address 3 Mar 3 19:49:41 mala kernel: [687455.248576] usb 1-7: configuration #1 chosen from 1 choice Mar 3 19:49:41 mala kernel: [687455.267450] Initializing USB Mass Storage driver... Mar 3 19:49:41 mala kernel: [687455.269180] scsi4 : SCSI emulation for USB Mass Storage devices Mar 3 19:49:41 mala kernel: [687455.269410] usbcore: registered new interface driver usb-storage Mar 3 19:49:41 mala kernel: [687455.269416] USB Mass Storage support registered. Mar 3 19:49:46 mala kernel: [687460.270917] scsi 4:0:0:0: Direct-Access Maxtor 6 Y200P0 YAR4 PQ: 0 ANSI: 2 Mar 3 19:49:46 mala kernel: [687460.271485] sd 4:0:0:0: Attached scsi generic sg2 type 0 Mar 3 19:49:46 mala kernel: [687460.278858] sd 4:0:0:0: [sdb] 398297088 512-byte logical blocks: (203 GB/189 GiB) Mar 3 19:49:46 mala kernel: [687460.280866] sd 4:0:0:0: [sdb] Write Protect is off Mar 3 19:50:16 mala kernel: [687460.283784] sdb: Mar 3 19:50:16 mala kernel: [687491.112020] usb 1-7: reset high speed USB device using ehci_hcd and address 3 Mar 3 19:50:47 mala kernel: [687522.120030] usb 1-7: reset high speed USB device using ehci_hcd and address 3 Mar 3 19:51:18 mala kernel: [687553.112034] usb 1-7: reset high speed USB device using ehci_hcd and address 3 Mar 3 19:51:49 mala kernel: [687584.116025] usb 1-7: reset high speed USB device using ehci_hcd and address 3 Mar 3 19:52:02 mala kernel: [687596.170632] type=1505 audit(1267671122.035:31): operation="profile_replace" pid=8426 name=/usr/lib/cups/backend/cups-pdf Mar 3 19:52:02 mala kernel: [687596.171551] type=1505 audit(1267671122.035:32): operation="profile_replace" pid=8426 name=/usr/sbin/cupsd Mar 3 19:52:06 mala kernel: [687600.908056] async/0 D c08145c0 0 7655 2 0x00000000 Mar 3 19:52:06 mala kernel: [687600.908062] e5601d38 00000046 e5774000 c08145c0 e4c2a848 c08145c0 d203973a 0002713d Mar 3 19:52:06 mala kernel: [687600.908072] c08145c0 c08145c0 e4c2a848 c08145c0 00000000 0002713d c08145c0 f0a98c00 Mar 3 19:52:06 mala kernel: [687600.908079] e4c2a5b0 c20125c0 00000002 e5601d80 e5601d44 c056f3be e5601d78 e5601d4c Mar 3 19:52:06 mala kernel: [687600.908087] Call Trace: Mar 3 19:52:06 mala kernel: [687600.908099] [<c056f3be>] io_schedule+0x1e/0x30 Mar 3 19:52:06 mala kernel: [687600.908107] [<c01b2cf5>] sync_page+0x35/0x40 Mar 3 19:52:06 mala kernel: [687600.908111] [<c056f8f7>] __wait_on_bit_lock+0x47/0x90 Mar 3 19:52:06 mala kernel: [687600.908115] [<c01b2cc0>] ? sync_page+0x0/0x40 Mar 3 19:52:06 mala kernel: [687600.908121] [<c020f390>] ? blkdev_readpage+0x0/0x20 Mar 3 19:52:06 mala kernel: [687600.908125] [<c01b2ca9>] __lock_page+0x79/0x80 Mar 3 19:52:06 mala kernel: [687600.908130] [<c015c130>] ? wake_bit_function+0x0/0x50 Mar 3 19:52:06 mala kernel: [687600.908135] [<c01b459f>] read_cache_page_async+0xbf/0xd0 Mar 3 19:52:06 mala kernel: [687600.908139] [<c01b45c2>] read_cache_page+0x12/0x60 Mar 3 19:52:06 mala kernel: [687600.908144] [<c0232dca>] read_dev_sector+0x3a/0x80 Mar 3 19:52:06 mala kernel: [687600.908148] [<c0233d3e>] adfspart_check_ICS+0x1e/0x160 Mar 3 19:52:06 mala kernel: [687600.908152] [<c023339f>] ? disk_name+0xaf/0xc0 Mar 3 19:52:06 mala kernel: [687600.908157] [<c0233d20>] ? adfspart_check_ICS+0x0/0x160 Mar 3 19:52:06 mala kernel: [687600.908161] [<c02334de>] check_partition+0x10e/0x180 Mar 3 19:52:06 mala kernel: [687600.908165] [<c02335f6>] rescan_partitions+0xa6/0x330 Mar 3 19:52:06 mala kernel: [687600.908171] [<c0312472>] ? kobject_get+0x12/0x20 Mar 3 19:52:06 mala kernel: [687600.908175] [<c0312472>] ? kobject_get+0x12/0x20 Mar 3 19:52:06 mala kernel: [687600.908180] [<c039fc43>] ? get_device+0x13/0x20 Mar 3 19:52:06 mala kernel: [687600.908185] [<c03c263f>] ? sd_open+0x5f/0x1b0 Mar 3 19:52:06 mala kernel: [687600.908189] [<c020fda0>] __blkdev_get+0x140/0x310 Mar 3 19:52:06 mala kernel: [687600.908194] [<c020f0ac>] ? bdget+0xec/0x100 Mar 3 19:52:06 mala kernel: [687600.908198] [<c020ff7a>] blkdev_get+0xa/0x10 Mar 3 19:52:06 mala kernel: [687600.908202] [<c0232f30>] register_disk+0x120/0x140 Mar 3 19:52:06 mala kernel: [687600.908207] [<c0308b4d>] ? blk_register_region+0x2d/0x40 Mar 3 19:52:06 mala kernel: [687600.908211] [<c03084f0>] ? exact_match+0x0/0x10 Mar 3 19:52:06 mala kernel: [687600.908216] [<c0308cf0>] add_disk+0x80/0x140 Mar 3 19:52:06 mala kernel: [687600.908221] [<c03084f0>] ? exact_match+0x0/0x10 Mar 3 19:52:06 mala kernel: [687600.908225] [<c0308860>] ? exact_lock+0x0/0x20 Mar 3 19:52:06 mala kernel: [687600.908230] [<c03c53df>] sd_probe_async+0xff/0x1c0

Read the article
SQL SERVER – Shrinking Database is Bad – Increases Fragmentation – Reduces Performance

- by pinaldave

Earlier, I had written two articles related to Shrinking Database. I wrote about why Shrinking Database is not good. SQL SERVER – SHRINKDATABASE For Every Database in the SQL Server SQL SERVER – What the Business Says Is Not What the Business Wants I received many comments on Why Database Shrinking is bad. Today we will go over a very interesting example that I have created for the same. Here are the quick steps of the example. Create a test database Create two tables and populate with data Check the size of both the tables Size of database is very low Check the Fragmentation of one table Fragmentation will be very low Truncate another table Check the size of the table Check the fragmentation of the one table Fragmentation will be very low SHRINK Database Check the size of the table Check the fragmentation of the one table Fragmentation will be very HIGH REBUILD index on one table Check the size of the table Size of database is very HIGH Check the fragmentation of the one table Fragmentation will be very low Here is the script for the same. USE MASTER GO CREATE DATABASE ShrinkIsBed GO USE ShrinkIsBed GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Create FirstTable CREATE TABLE FirstTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_FirstTable_ID] ON FirstTable ( [ID] ASC ) ON [PRIMARY] GO -- Create SecondTable CREATE TABLE SecondTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_SecondTable_ID] ON SecondTable ( [ID] ASC ) ON [PRIMARY] GO -- Insert One Hundred Thousand Records INSERT INTO FirstTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Insert One Hundred Thousand Records INSERT INTO SecondTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO Let us check the table size and fragmentation. Now let us TRUNCATE the table and check the size and Fragmentation. USE MASTER GO CREATE DATABASE ShrinkIsBed GO USE ShrinkIsBed GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Create FirstTable CREATE TABLE FirstTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_FirstTable_ID] ON FirstTable ( [ID] ASC ) ON [PRIMARY] GO -- Create SecondTable CREATE TABLE SecondTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_SecondTable_ID] ON SecondTable ( [ID] ASC ) ON [PRIMARY] GO -- Insert One Hundred Thousand Records INSERT INTO FirstTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Insert One Hundred Thousand Records INSERT INTO SecondTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO You can clearly see that after TRUNCATE, the size of the database is not reduced and it is still the same as before TRUNCATE operation. After the Shrinking database operation, we were able to reduce the size of the database. If you notice the fragmentation, it is considerably high. The major problem with the Shrink operation is that it increases fragmentation of the database to very high value. Higher fragmentation reduces the performance of the database as reading from that particular table becomes very expensive. One of the ways to reduce the fragmentation is to rebuild index on the database. Let us rebuild the index and observe fragmentation and database size. -- Rebuild Index on FirstTable ALTER INDEX IX_SecondTable_ID ON SecondTable REBUILD GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO You can notice that after rebuilding, Fragmentation reduces to a very low value (almost same to original value); however the database size increases way higher than the original. Before rebuilding, the size of the database was 5 MB, and after rebuilding, it is around 20 MB. Regular rebuilding the index is rebuild in the same user database where the index is placed. This usually increases the size of the database. Look at irony of the Shrinking database. One person shrinks the database to gain space (thinking it will help performance), which leads to increase in fragmentation (reducing performance). To reduce the fragmentation, one rebuilds index, which leads to size of the database to increase way more than the original size of the database (before shrinking). Well, by Shrinking, one did not gain what he was looking for usually. Rebuild indexing is not the best suggestion as that will create database grow again. I have always remembered the excellent post from Paul Randal regarding Shrinking the database is bad. I suggest every one to read that for accuracy and interesting conversation. Let us run following script where we Shrink the database and REORGANIZE. -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO -- Shrink the Database DBCC SHRINKDATABASE (ShrinkIsBed); GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO -- Rebuild Index on FirstTable ALTER INDEX IX_SecondTable_ID ON SecondTable REORGANIZE GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO You can see that REORGANIZE does not increase the size of the database or remove the fragmentation. Again, I no way suggest that REORGANIZE is the solution over here. This is purely observation using demo. Read the blog post of Paul Randal. Following script will clean up the database -- Clean up USE MASTER GO ALTER DATABASE ShrinkIsBed SET SINGLE_USER WITH ROLLBACK IMMEDIATE GO DROP DATABASE ShrinkIsBed GO There are few valid cases of the Shrinking database as well, but that is not covered in this blog post. We will cover that area some other time in future. Additionally, one can rebuild index in the tempdb as well, and we will also talk about the same in future. Brent has written a good summary blog post as well. Are you Shrinking your database? Well, when are you going to stop Shrinking it? Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Index, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQLServer, T SQL, Technology

Read the article
MySQL – Scalability on Amazon RDS: Scale out to multiple RDS instances

- by Pinal Dave

Today, I’d like to discuss getting better MySQL scalability on Amazon RDS. The question of the day: “What can you do when a MySQL database needs to scale write-intensive workloads beyond the capabilities of the largest available machine on Amazon RDS?” Let’s take a look. In a typical EC2/RDS set-up, users connect to app servers from their mobile devices and tablets, computers, browsers, etc. Then app servers connect to an RDS instance (web/cloud services) and in some cases they might leverage some read-only replicas. Figure 1. A typical RDS instance is a single-instance database, with read replicas. This is not very good at handling high write-based throughput. As your application becomes more popular you can expect an increasing number of users, more transactions, and more accumulated data. User interactions can become more challenging as the application adds more sophisticated capabilities. The result of all this positive activity: your MySQL database will inevitably begin to experience scalability pressures. What can you do? Broadly speaking, there are four options available to improve MySQL scalability on RDS. 1. Larger RDS Instances – If you’re not already using the maximum available RDS instance, you can always scale up – to larger hardware. Bigger CPUs, more compute power, more memory et cetera. But the largest available RDS instance is still limited. And they get expensive. “High-Memory Quadruple Extra Large DB Instance”: 68 GB of memory 26 ECUs (8 virtual cores with 3.25 ECUs each) 64-bit platform High I/O Capacity Provisioned IOPS Optimized: 1000Mbps 2. Provisioned IOPs – You can get provisioned IOPs and higher throughput on the I/O level. However, there is a hard limit with a maximum instance size and maximum number of provisioned IOPs you can buy from Amazon and you simply cannot scale beyond these hardware specifications. 3. Leverage Read Replicas – If your application permits, you can leverage read replicas to offload some reads from the master databases. But there are a limited number of replicas you can utilize and Amazon generally requires some modifications to your existing application. And read-replicas don’t help with write-intensive applications. 4. Multiple Database Instances – Amazon offers a fourth option: “You can implement partitioning,thereby spreading your data across multiple database Instances” (Link) However, Amazon does not offer any guidance or facilities to help you with this. “Multiple database instances” is not an RDS feature. And Amazon doesn’t explain how to implement this idea. In fact, when asked, this is the response on an Amazon forum: Q: Is there any documents that describe the partition DB across multiple RDS? I need to use DB with more 1TB but exist a limitation during the create process, but I read in the any FAQ that you need to partition database, but I don’t find any documents that describe it. A: “DB partitioning/sharding is not an official feature of Amazon RDS or MySQL, but a technique to scale out database by using multiple database instances. The appropriate way to split data depends on the characteristics of the application or data set. Therefore, there is no concrete and specific guidance.” So now what? The answer is to scale out with ScaleBase. Amazon RDS with ScaleBase: What you get – MySQL Scalability! ScaleBase is specifically designed to scale out a single MySQL RDS instance into multiple MySQL instances. Critically, this is accomplished with no changes to your application code. Your application continues to “see” one database. ScaleBase does all the work of managing and enforcing an optimized data distribution policy to create multiple MySQL instances. With ScaleBase, data distribution, transactions, concurrency control, and two-phase commit are all 100% transparent and 100% ACID-compliant, so applications, services and tooling continue to interact with your distributed RDS as if it were a single MySQL instance. The result: now you can cost-effectively leverage multiple MySQL RDS instance to scale out write-intensive workloads to an unlimited number of users, transactions, and data. Amazon RDS with ScaleBase: What you keep – Everything! And how does this change your Amazon environment? 1. Keep your application, unchanged – There is no change your application development life-cycle at all. You still use your existing development tools, frameworks and libraries. Application quality assurance and testing cycles stay the same. And, critically, you stay with an ACID-compliant MySQL environment. 2. Keep your RDS value-added services – The value-added services that you rely on are all still available. Amazon will continue to handle database maintenance and updates for you. You can still leverage High Availability via Multi A-Z. And, if it benefits youra application throughput, you can still use read replicas. 3. Keep your RDS administration – Finally the RDS monitoring and provisioning tools you rely on still work as they did before. With your one large MySQL instance, now split into multiple instances, you can actually use less expensive, smallersmaller available RDS hardware and continue to see better database performance. Conclusion Amazon RDS is a tremendous service, but it doesn’t offer solutions to scale beyond a single MySQL instance. Larger RDS instances get more expensive. And when you max-out on the available hardware, you’re stuck. Amazon recommends scaling out your single instance into multiple instances for transaction-intensive apps, but offers no services or guidance to help you. This is where ScaleBase comes in to save the day. It gives you a simple and effective way to create multiple MySQL RDS instances, while removing all the complexities typically caused by “DIY” sharding andwith no changes to your applications . With ScaleBase you continue to leverage the AWS/RDS ecosystem: commodity hardware and value added services like read replicas, multi A-Z, maintenance/updates and administration with monitoring tools and provisioning. SCALEBASE ON AMAZON If you’re curious to try ScaleBase on Amazon, it can be found here – Download NOW. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: MySQL, PostADay, SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
General monitoring for SQL Server Analysis Services using Performance Monitor

- by Testas

A recent customer engagement required a setup of a monitoring solution for SSAS, due to the time restrictions placed upon this, native Windows Performance Monitor (Perfmon) and SQL Server Profiler Monitoring Tools was used as using a third party tool would have meant the customer providing an additional monitoring server that was not available.I wanted to outline the performance monitoring counters that was used to monitor the system on which SSAS was running. Due to the slow query performance that was occurring during certain scenarios, perfmon was used to establish if any pressure was being placed on the Disk, CPU or Memory subsystem when concurrent connections access the same query, and Profiler to pinpoint how the query was being managed within SSAS, profiler I will leave for another blogThis guide is not designed to provide a definitive list of what should be used when monitoring SSAS, different situations may require the addition or removal of counters as presented by the situation. However I hope that it serves as a good basis for starting your monitoring of SSAS. I would also like to acknowledge Chris Webb’s awesome chapters from “Expert Cube Development” that also helped shape my monitoring strategy:http://cwebbbi.spaces.live.com/blog/cns!7B84B0F2C239489A!6657.entrySimulating ConnectionsTo simulate the additional connections to the SSAS server whilst monitoring, I used ascmd to simulate multiple connections to the typical and worse performing queries that were identified by the customer. A similar sript can be downloaded from codeplex at http://www.codeplex.com/SQLSrvAnalysisSrvcs. File name: ASCMD_StressTestingScripts.zip. Performance MonitorWithin performance monitor, a counter log was created that contained the list of counters below. The important point to note when running the counter log is that the RUN AS property within the counter log properties should be changed to an account that has rights to the SSAS instance when monitoring MSAS counters. Failure to do so means that the counter log runs under the system account, no errors or warning are given while running the counter log, and it is not until you need to view the MSAS counters that they will not be displayed if run under the default account that has no right to SSAS. If your connection simulation takes hours, this could prove quite frustrating if not done beforehand JThe counters used…… Object Counter Instance Justification System Processor Queue legnth N/A Indicates how many threads are waiting for execution against the processor. If this counter is consistently higher than around 5 when processor utilization approaches 100%, then this is a good indication that there is more work (active threads) available (ready for execution) than the machine's processors are able to handle. System Context Switches/sec N/A Measures how frequently the processor has to switch from user- to kernel-mode to handle a request from a thread running in user mode. The heavier the workload running on your machine, the higher this counter will generally be, but over long term the value of this counter should remain fairly constant. If this counter suddenly starts increasing however, it may be an indicating of a malfunctioning device, especially if the Processor\Interrupts/sec\(_Total) counter on your machine shows a similar unexplained increase Process % Processor Time sqlservr Definately should be used if Processor\% Processor Time\(_Total) is maxing at 100% to assess the effect of the SQL Server process on the processor Process % Processor Time msmdsrv Definately should be used if Processor\% Processor Time\(_Total) is maxing at 100% to assess the effect of the SQL Server process on the processor Process Working Set sqlservr If the Memory\Available bytes counter is decreaing this counter can be run to indicate if the process is consuming larger and larger amounts of RAM. Process(instance)\Working Set measures the size of the working set for each process, which indicates the number of allocated pages the process can address without generating a page fault. Process Working Set msmdsrv If the Memory\Available bytes counter is decreaing this counter can be run to indicate if the process is consuming larger and larger amounts of RAM. Process(instance)\Working Set measures the size of the working set for each process, which indicates the number of allocated pages the process can address without generating a page fault. Processor % Processor Time _Total and individual cores measures the total utilization of your processor by all running processes. If multi-proc then be mindful only an average is provided Processor % Privileged Time _Total To see how the OS is handling basic IO requests. If kernel mode utilization is high, your machine is likely underpowered as it's too busy handling basic OS housekeeping functions to be able to effectively run other applications. Processor % User Time _Total To see how the applications is interacting from a processor perspective, a high percentage utilisation determine that the server is dealing with too many apps and may require increasing thje hardware or scaling out Processor Interrupts/sec _Total The average rate, in incidents per second, at which the processor received and serviced hardware interrupts. Shoulr be consistant over time but a sudden unexplained increase could indicate a device malfunction which can be confirmed using the System\Context Switches/sec counter Memory Pages/sec N/A Indicates the rate at which pages are read from or written to disk to resolve hard page faults. This counter is a primary indicator of the kinds of faults that cause system-wide delays, this is the primary counter to watch for indication of possible insufficient RAM to meet your server's needs. A good idea here is to configure a perfmon alert that triggers when the number of pages per second exceeds 50 per paging disk on your system. May also want to see the configuration of the page file on the Server Memory Available Mbytes N/A is the amount of physical memory, in bytes, available to processes running on the computer. if this counter is greater than 10% of the actual RAM in your machine then you probably have more than enough RAM. monitor it regularly to see if any downward trend develops, and set an alert to trigger if it drops below 2% of the installed RAM. Physical Disk Disk Transfers/sec for each physical disk If it goes above 10 disk I/Os per second then you've got poor response time for your disk. Physical Disk Idle Time _total If Disk Transfers/sec is above 25 disk I/Os per second use this counter. which measures the percent time that your hard disk is idle during the measurement interval, and if you see this counter fall below 20% then you've likely got read/write requests queuing up for your disk which is unable to service these requests in a timely fashion. Physical Disk Disk queue legnth For the OLAP and SQL physical disk A value that is consistently less than 2 means that the disk system is handling the IO requests against the physical disk Network Interface Bytes Total/sec For the NIC Should be monitored over a period of time to see if there is anb increase/decrease in network utilisation Network Interface Current Bandwidth For the NIC is an estimate of the current bandwidth of the network interface in bits per second (BPS). MSAS 2005: Memory Memory Limit High KB N/A Shows (as a percentage) the high memory limit configured for SSAS in C:\Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Config\msmdsrv.ini MSAS 2005: Memory Memory Limit Low KB N/A Shows (as a percentage) the low memory limit configured for SSAS in C:\Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Config\msmdsrv.ini MSAS 2005: Memory Memory Usage KB N/A Displays the memory usage of the server process. MSAS 2005: Memory File Store KB N/A Displays the amount of memory that is reserved for the Cache. Note if total memory limit in the msmdsrv.ini is set to 0, no memory is reserved for the cache MSAS 2005: Storage Engine Query Queries from Cache Direct / sec N/A Displays the rate of queries answered from the cache directly MSAS 2005: Storage Engine Query Queries from Cache Filtered / Sec N/A Displays the Rate of queries answered by filtering existing cache entry. MSAS 2005: Storage Engine Query Queries from File / Sec N/A Displays the Rate of queries answered from files. MSAS 2005: Storage Engine Query Average time /query N/A Displays the average time of a query MSAS 2005: Connection Current connections N/A Displays the number of connections against the SSAS instance MSAS 2005: Connection Requests / sec N/A Displays the rate of query requests per second MSAS 2005: Locks Current Lock Waits N/A Displays thhe number of connections waiting on a lock MSAS 2005: Threads Query Pool job queue Length N/A The number of queries in the job queue MSAS 2005:Proc Aggregations Temp file bytes written/sec N/A Shows the number of bytes of data processed in a temporary file MSAS 2005:Proc Aggregations Temp file rows written/sec N/A Shows the number of bytes of data processed in a temporary file

Read the article
How to Achieve OC4J RMI Load Balancing

- by fip

This is an old, Oracle SOA and OC4J 10G topic. In fact this is not even a SOA topic per se. Questions of RMI load balancing arise when you developed custom web applications accessing human tasks running off a remote SOA 10G cluster. Having returned from a customer who faced challenges with OC4J RMI load balancing, I felt there is still some confusions in the field how OC4J RMI load balancing work. Hence I decide to dust off an old tech note that I wrote a few years back and share it with the general public. Here is the tech note: Overview A typical use case in Oracle SOA is that you are building web based, custom human tasks UI that will interact with the task services housed in a remote BPEL 10G cluster. Or, in a more generic way, you are just building a web based application in Java that needs to interact with the EJBs in a remote OC4J cluster. In either case, you are talking to an OC4J cluster as RMI client. Then immediately you must ask yourself the following questions: 1. How do I make sure that the web application, as an RMI client, even distribute its load against all the nodes in the remote OC4J cluster? 2. How do I make sure that the web application, as an RMI client, is resilient to the node failures in the remote OC4J cluster, so that in the unlikely case when one of the remote OC4J nodes fail, my web application will continue to function? That is the topic of how to achieve load balancing with OC4J RMI client. Solutions You need to configure and code RMI load balancing in two places: 1. Provider URL can be specified with a comma separated list of URLs, so that the initial lookup will land to one of the available URLs. 2. Choose a proper value for the oracle.j2ee.rmi.loadBalance property, which, along side with the PROVIDER_URL property, is one of the JNDI properties passed to the JNDI lookup.(http://docs.oracle.com/cd/B31017_01/web.1013/b28958/rmi.htm#BABDGFBI) More details below: About the PROVIDER_URL The JNDI property java.name.provider.url's job is, when the client looks up for a new context at the very first time in the client session, to provide a list of RMI context The value of the JNDI property java.name.provider.url goes by the format of a single URL, or a comma separate list of URLs. A single URL. For example: opmn:ormi://host1:6003:oc4j_instance1/appName1 A comma separated list of multiple URLs. For examples: opmn:ormi://host1:6003:oc4j_instanc1/appName, opmn:ormi://host2:6003:oc4j_instance1/appName, opmn:ormi://host3:6003:oc4j_instance1/appName When the client looks up for a new Context the very first time in the client session, it sends a query against the OPMN referenced by the provider URL. The OPMN host and port specifies the destination of such query, and the OC4J instance name and appName are actually the “where clause” of the query. When the PROVIDER URL reference a single OPMN server Let's consider the case when the provider url only reference a single OPMN server of the destination cluster. In this case, that single OPMN server receives the query and returns a list of the qualified Contexts from all OC4Js within the cluster, even though there is a single OPMN server in the provider URL. A context represent a particular starting point at a particular server for subsequent object lookup. For example, if the URL is opmn:ormi://host1:6003:oc4j_instance1/appName, then, OPMN will return the following contexts: appName on oc4j_instance1 on host1 appName on oc4j_instance1 on host2, appName on oc4j_instance1 on host3, (provided that host1, host2, host3 are all in the same cluster) Please note that One OPMN will be sufficient to find the list of all contexts from the entire cluster that satisfy the JNDI lookup query. You can do an experiment by shutting down appName on host1, and observe that OPMN on host1 will still be able to return you appname on host2 and appName on host3. When the PROVIDER URL reference a comma separated list of multiple OPMN servers When the JNDI propery java.naming.provider.url references a comma separated list of multiple URLs, the lookup will return the exact same things as with the single OPMN server: a list of qualified Contexts from the cluster. The purpose of having multiple OPMN servers is to provide high availability in the initial context creation, such that if OPMN at host1 is unavailable, client will try the lookup via OPMN on host2, and so on. After the initial lookup returns and cache a list of contexts, the JNDI URL(s) are no longer used in the same client session. That explains why removing the 3rd URL from the list of JNDI URLs will not stop the client from getting the EJB on the 3rd server. About the oracle.j2ee.rmi.loadBalance Property After the client acquires the list of contexts, it will cache it at the client side as “list of available RMI contexts”. This list includes all the servers in the destination cluster. This list will stay in the cache until the client session (JVM) ends. The RMI load balancing against the destination cluster is happening at the client side, as the client is switching between the members of the list. Whether and how often the client will fresh the Context from the list of Context is based on the value of the oracle.j2ee.rmi.loadBalance. The documentation at http://docs.oracle.com/cd/B31017_01/web.1013/b28958/rmi.htm#BABDGFBI list all the available values for the oracle.j2ee.rmi.loadBalance. Value Description client If specified, the client interacts with the OC4J process that was initially chosen at the first lookup for the entire conversation. context Used for a Web client (servlet or JSP) that will access EJBs in a clustered OC4J environment. If specified, a new Context object for a randomly-selected OC4J instance will be returned each time InitialContext() is invoked. lookup Used for a standalone client that will access EJBs in a clustered OC4J environment. If specified, a new Context object for a randomly-selected OC4J instance will be created each time the client calls Context.lookup(). Please note the regardless of the setting of oracle.j2ee.rmi.loadBalance property, the “refresh” only occurs at the client. The client can only choose from the "list of available context" that was returned and cached from the very first lookup. That is, the client will merely get a new Context object from the “list of available RMI contexts” from the cache at the client side. The client will NOT go to the OPMN server again to get the list. That also implies that if you are adding a node to the server cluster AFTER the client’s initial lookup, the client would not know it because neither the server nor the client will initiate a refresh of the “list of available servers” to reflect the new node. About High Availability (i.e. Resilience Against Node Failure of Remote OC4J Cluster) What we have discussed above is about load balancing. Let's also discuss high availability. This is how the High Availability works in RMI: when the client use the context but get an exception such as socket is closed, it knows that the server referenced by that Context is problematic and will try to get another unused Context from the “list of available contexts”. Again, this list is the list that was returned and cached at the very first lookup in the entire client session.

Read the article
People, Process & Engagement: WebCenter Partner Keste

- by Michael Snow

v\:* {behavior:url(#default#VML);} o\:* {behavior:url(#default#VML);} w\:* {behavior:url(#default#VML);} .shape {behavior:url(#default#VML);} Within the WebCenter group here at Oracle, discussions about people, process and engagement cross over many vertical industries and products. Amidst our growing partner ecosystem, the community provides us insight into great customer use cases every day. Such is the case with our partner, Keste, who provides us a guest post on our blog today with an overview of their innovative solution for a customer in the transportation industry. Keste is an Oracle software solutions and development company headquartered in Dallas, Texas. As a Platinum member of the Oracle® PartnerNetwork, Keste designs, develops and deploys custom solutions that automate complex business processes. Seamless Customer Self-Service Experience in the Trucking Industry with Oracle WebCenter Portal Keste, Oracle Platinum Partner Customer Overview Omnitracs, Inc., a Qualcomm company provides mobility solutions for trucking fleets to companies in the transportation industry. Omnitracs’ mobility services include basic communications such as text as well as advanced monitoring services such as GPS tracking, temperature tracking of perishable goods, load tracking and weighting distribution, and many others. Customer Business Needs Already the leading provider of mobility solutions for large trucking fleets, they chose to target smaller trucking fleets as new customers. However their existing high-touch customer support method would not be a cost effective or scalable method to manage and service these smaller customers. Omnitracs needed to provide several self-service features to make customer support more scalable while keeping customer satisfaction levels high and the costs manageable. The solution also had to be very intuitive and easy to use. The systems that Omnitracs sells to these trucking customers require professional installation and smaller customers need to track and schedule the installation. Information captured in Oracle eBusiness Suite needed to be readily available for new customers to track these purchases and delivery details. Omnitracs wanted a high impact User Interface to significantly improve customer experience with the ability to integrate with EBS, provisioning systems as well as CRM systems that were already implemented. Omnitracs also wanted to build an architecture platform that could potentially be extended to other Portals. Omnitracs’ stated goal was to deliver an “eBay-like” or “Amazon-like” experience for all of their customers so that they could reach a much broader market beyond their large company customer base. Solution Overview In order to manage the increased complexity, the growing support needs of global customers and improve overall product time-to-market in a cost-effective manner, IT began to deliver a self-service model. This self service model not only transformed numerous business processes but is also allowing the business to keep up with the growing demands of the (internal and external) customers. This solution was a customer service Portal that provided self service capabilities for large and small customers alike for Activation of mobility products, managing add-on applications for the devices (much like the Apple App Store), transferring services when trucks are sold to other companies as well as deactivation all without the involvement of a call service agent or sending multiple emails to different Omnitracs contacts. This is a conceptual view of the Customer Portal showing the details of the components that make up the solution. 12.00 The portal application for transactions was entirely built using ADF 11g R2. Omnitracs’ business had a pressing requirement to have a portal available 24/7 for its customers. Since there were interactions with EBS in the back-end, the downtimes on the EBS would negate this availability. Omnitracs devised a decoupling strategy at the database side for the EBS data. The decoupling of the database was done using Oracle Data Guard and completely insulated the solution from any eBusiness Suite down time. The customer has no knowledge whether eBS is running or not. Here are two sample screenshots of the portal application built in Oracle ADF. Customer Benefits The Customer Portal not only provided the scalability to grow the business but also provided the seamless integration with other disparate applications. Some of the key benefits are: Improved Customer Experience: With a modern look and feel and a Portal that has the aspects of an App Store, the customer experience was significantly improved. Page response times went from several seconds to sub-second for all of the pages. Enabled new product launches: After successfully dominating the large fleet market, Omnitracs now has a scalable solution to sell and manage smaller fleet customers giving them a huge advantage over their nearest competitors. Dozens of new customers have been acquired via this portal through an onboarding process that now takes minutes Seamless Integrations Improves Customer Support: ADF 11gR2 allowed Omnitracs to bring a diverse list of applications into one integrated solution. This provided a seamless experience for customers to route them from Marketing focused application to a customer-oriented portal. Internally, it also allowed Sales Representatives to have an integrated flow for taking a prospect through the various steps to onboard them as a customer. Key integrations included: Unity Core Salesforce.com Merchant e-Solution for credit card Custom Omnitracs Applications like CUPS and AUTO Security utilizing OID and OVD Back end integration with EBS (Data Guard) and iQ Database Business Impact Significant business impacts were realized through the launch of customer portal. It not only allows the business to push through in underserved segments, but also reduces the time it needs to spend on customer support—allowing the business to focus more on sales and identifying the market for new products. Some of the Immediate Benefits are The entire onboarding process is now completely automated and now completes in minutes. This represents an 85% productivity improvement over their previous processes. And it was 160 times faster! With the success of this self-service solution, the business is now targeting about 3X customer growth in the next five years. This represents a tripling of their overall customer base and significant downstream revenue for the ongoing services. 90%+ improvement of customer onboarding and management process by utilizing, single sign on integration using OID/OAM solution, performance improvements and new self-service functionality Unified login for all Customers, Partners and Internal Users enables login to a common portal and seamless access to all other integrated applications targeted at the respective audience Significantly improved customer experience with a better look and feel with a more user experience focused Portal screens. Helped sales of the new product by having an easy way of ordering and activating the product. Data Guard helped increase availability of the Portal to 99%+ and make it independent of EBS downtime. This gave customers the feel of high availability of the portal application. Some of the anticipated longer term Benefits are: Platform that can be leveraged to launch any new product introduction and enable all product teams to reach new customers and new markets Easy integration with content management to allow business owners more control of the product catalog Overall reduced TCO with standardization of the Oracle platform Managed IT support cost savings through optimization of technology skills needed to support and modify this solution ------------------------------------------------------------ 12.00 Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 -"/ /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-family:"Times New Roman","serif";}

Read the article
Best Practices - which domain types should be used to run applications

- by jsavit

This post is one of a series of "best practices" notes for Oracle VM Server for SPARC (formerly named Logical Domains) One question that frequently comes up is "which types of domain should I use to run applications?" There used to be a simple answer in most cases: "only run applications in guest domains", but enhancements to T-series servers, Oracle VM Server for SPARC and the advent of SPARC SuperCluster have made this question more interesting and worth qualifying differently. This article reviews the relevant concepts and provides suggestions on where to deploy applications in a logical domains environment. Review: division of labor and types of domain Oracle VM Server for SPARC offloads many functions from the hypervisor to domains (also called virtual machines). This is a modern alternative to using a "thick" hypervisor that provides all virtualization functions, as in traditional VM designs, This permits a simpler hypervisor design, which enhances reliability, and security. It also reduces single points of failure by assigning responsibilities to multiple system components, which further improves reliability and security. In this architecture, management and I/O functionality are provided within domains. Oracle VM Server for SPARC does this by defining the following types of domain, each with their own roles: Control domain - management control point for the server, used to configure domains and manage resources. It is the first domain to boot on a power-up, is an I/O domain, and is usually a service domain as well. I/O domain - has been assigned physical I/O devices: a PCIe root complex, a PCI device, or a SR-IOV (single-root I/O Virtualization) function. It has native performance and functionality for the devices it owns, unmediated by any virtualization layer. Service domain - provides virtual network and disk devices to guest domains. Guest domain - a domain whose devices are all virtual rather than physical: virtual network and disk devices provided by one or more service domains. In common practice, this is where applications are run. Typical deployment A service domain is generally also an I/O domain: otherwise it wouldn't have access to physical device "backends" to offer to its clients. Similarly, an I/O domain is also typically a service domain in order to leverage the available PCI busses. Control domains must be I/O domains, because they boot up first on the server and require physical I/O. It's typical for the control domain to also be a service domain too so it doesn't "waste" the I/O resources it uses. A simple configuration consists of a control domain, which is also the one I/O and service domain, and some number of guest domains using virtual I/O. In production, customers typically use multiple domains with I/O and service roles to eliminate single points of failure: guest domains have virtual disk and virtual devices provisioned from more than one service domain, so failure of a service domain or I/O path or device doesn't result in an application outage. This is also used for "rolling upgrades" in which service domains are upgraded one at a time while their guests continue to operate without disruption. (It should be noted that resiliency to I/O device failures can also be provided by the single control domain, using multi-path I/O) In this type of deployment, control, I/O, and service domains are used for virtualization infrastructure, while applications run in guest domains. Changing application deployment patterns The above model has been widely and successfully used, but more configuration options are available now. Servers got bigger than the original T2000 class machines with 2 I/O busses, so there is more I/O capacity that can be used for applications. Increased T-series server capacity made it attractive to run more vertical applications, such as databases, with higher resource requirements than the "light" applications originally seen. This made it attractive to run applications in I/O domains so they could get bare-metal native I/O performance. This is leveraged by the SPARC SuperCluster engineered system, announced a year ago at Oracle OpenWorld. In SPARC SuperCluster, I/O domains are used for high performance applications, with native I/O performance for disk and network and optimized access to the Infiniband fabric. Another technical enhancement is the introduction of Direct I/O (DIO) and Single Root I/O Virtualization (SR-IOV), which make it possible to give domains direct connections and native I/O performance for selected I/O devices. A domain with either a DIO or SR-IOV device is an I/O domain. In summary: not all I/O domains own PCI complexes, and there are increasingly more I/O domains that are not service domains. They use their I/O connectivity for performance for their own applications. However, there are some limitations and considerations: at this time, a domain using physical I/O cannot be live-migrated to another server. There is also a need to plan for security and introducing unneeded dependencies: if an I/O domain is also a service domain providing virtual I/O go guests, it has the ability to affect the correct operation of its client guest domains. This is even more relevant for the control domain. where the ldm has to be protected from unauthorized (or even mistaken) use that would affect other domains. As a general rule, running applications in the service domain or the control domain should be avoided. To recap: Guest domains with virtual I/O still provide the greatest operational flexibility, including features like live migration. I/O domains can be used for applications with high performance requirements. This is used to great effect in SPARC SuperCluster and in general T4 deployments. Direct I/O (DIO) and Single Root I/O Virtualization (SR-IOV) make this more attractive by giving direct I/O access to more domains. Service domains should in general not be used for applications, because compromised security in the domain, or an outage, can affect other domains that depend on it. This concern can be mitigated by providing guests' their virtual I/O from more than one service domain, so an interruption of service in the service domain does not cause an application outage. The control domain should in general not be used to run applications, for the same reason. SPARC SuperCluster use the control domain for applications, but it is an exception: it's not a general purpose environment; it's an engineered system with specifically configured applications and optimization for optimal performance. These are recommended "best practices" based on conversations with a number of Oracle architects. Keep in mind that "one size does not fit all", so you should evaluate these practices in the context of your own requirements. Summary Higher capacity T-series servers have made it more attractive to use them for applications with high resource requirements. New deployment models permit native I/O performance for demanding applications by running them in I/O domains with direct access to their devices. This is leveraged in SPARC SuperCluster, and can be leveraged in T-series servers to provision high-performance applications running in domains. Carefully planned, this can be used to provide higher performance for critical applications.

Read the article
How to read oom-killer syslog messages?

- by Grant

I have a Ubuntu 12.04 server which sometimes dies completely - no SSH, no ping, nothing until it is physically rebooted. After the reboot, I see in syslog that the oom-killer killed, well, pretty much everything. There's a lot of detailed memory usage information in them. How do I read these logs to see what caused the OOM issue? The server has far more memory than it needs, so it shouldn't be running out of memory. Oct 25 07:28:04 nldedip4k031 kernel: [87946.529511] oom_kill_process: 9 callbacks suppressed Oct 25 07:28:04 nldedip4k031 kernel: [87946.529514] irqbalance invoked oom-killer: gfp_mask=0x80d0, order=0, oom_adj=0, oom_score_adj=0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529516] irqbalance cpuset=/ mems_allowed=0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529518] Pid: 948, comm: irqbalance Not tainted 3.2.0-55-generic-pae #85-Ubuntu Oct 25 07:28:04 nldedip4k031 kernel: [87946.529519] Call Trace: Oct 25 07:28:04 nldedip4k031 kernel: [87946.529525] [] dump_header.isra.6+0x85/0xc0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529528] [] oom_kill_process+0x5c/0x80 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529530] [] out_of_memory+0xc5/0x1c0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529532] [] __alloc_pages_nodemask+0x72c/0x740 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529535] [] __get_free_pages+0x1c/0x30 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529537] [] get_zeroed_page+0x12/0x20 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529541] [] fill_read_buffer.isra.8+0xaa/0xd0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529543] [] sysfs_read_file+0x7d/0x90 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529546] [] vfs_read+0x8c/0x160 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529548] [] ? fill_read_buffer.isra.8+0xd0/0xd0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529550] [] sys_read+0x3d/0x70 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529554] [] sysenter_do_call+0x12/0x28 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529555] Mem-Info: Oct 25 07:28:04 nldedip4k031 kernel: [87946.529556] DMA per-cpu: Oct 25 07:28:04 nldedip4k031 kernel: [87946.529557] CPU 0: hi: 0, btch: 1 usd: 0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529558] CPU 1: hi: 0, btch: 1 usd: 0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529560] CPU 2: hi: 0, btch: 1 usd: 0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529561] CPU 3: hi: 0, btch: 1 usd: 0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529562] CPU 4: hi: 0, btch: 1 usd: 0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529563] CPU 5: hi: 0, btch: 1 usd: 0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529564] CPU 6: hi: 0, btch: 1 usd: 0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529565] CPU 7: hi: 0, btch: 1 usd: 0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529566] Normal per-cpu: Oct 25 07:28:04 nldedip4k031 kernel: [87946.529567] CPU 0: hi: 186, btch: 31 usd: 179 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529568] CPU 1: hi: 186, btch: 31 usd: 182 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529569] CPU 2: hi: 186, btch: 31 usd: 132 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529570] CPU 3: hi: 186, btch: 31 usd: 175 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529571] CPU 4: hi: 186, btch: 31 usd: 91 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529572] CPU 5: hi: 186, btch: 31 usd: 173 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529573] CPU 6: hi: 186, btch: 31 usd: 159 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529574] CPU 7: hi: 186, btch: 31 usd: 164 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529575] HighMem per-cpu: Oct 25 07:28:04 nldedip4k031 kernel: [87946.529576] CPU 0: hi: 186, btch: 31 usd: 165 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529577] CPU 1: hi: 186, btch: 31 usd: 183 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529578] CPU 2: hi: 186, btch: 31 usd: 185 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529579] CPU 3: hi: 186, btch: 31 usd: 138 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529580] CPU 4: hi: 186, btch: 31 usd: 155 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529581] CPU 5: hi: 186, btch: 31 usd: 104 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529582] CPU 6: hi: 186, btch: 31 usd: 133 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529583] CPU 7: hi: 186, btch: 31 usd: 170 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529586] active_anon:5523 inactive_anon:354 isolated_anon:0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529586] active_file:2815 inactive_file:6849119 isolated_file:0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529587] unevictable:0 dirty:449 writeback:10 unstable:0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529587] free:1304125 slab_reclaimable:104672 slab_unreclaimable:3419 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529588] mapped:2661 shmem:138 pagetables:313 bounce:0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529591] DMA free:4252kB min:780kB low:972kB high:1168kB active_anon:0kB inactive_anon:0kB active_file:4kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15756kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:11564kB slab_unreclaimable:4kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:1 all_unreclaimable? yes Oct 25 07:28:04 nldedip4k031 kernel: [87946.529594] lowmem_reserve[]: 0 869 32460 32460 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529599] Normal free:44052kB min:44216kB low:55268kB high:66324kB active_anon:0kB inactive_anon:0kB active_file:616kB inactive_file:568kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:890008kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:407124kB slab_unreclaimable:13672kB kernel_stack:992kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:2083 all_unreclaimable? yes Oct 25 07:28:04 nldedip4k031 kernel: [87946.529602] lowmem_reserve[]: 0 0 252733 252733 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529606] HighMem free:5168196kB min:512kB low:402312kB high:804112kB active_anon:22092kB inactive_anon:1416kB active_file:10640kB inactive_file:27395920kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:32349872kB mlocked:0kB dirty:1796kB writeback:40kB mapped:10640kB shmem:552kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:1252kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no Oct 25 07:28:04 nldedip4k031 kernel: [87946.529609] lowmem_reserve[]: 0 0 0 0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529611] DMA: 6*4kB 6*8kB 6*16kB 5*32kB 5*64kB 4*128kB 2*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 4232kB Oct 25 07:28:04 nldedip4k031 kernel: [87946.529616] Normal: 297*4kB 180*8kB 119*16kB 73*32kB 67*64kB 47*128kB 35*256kB 13*512kB 5*1024kB 1*2048kB 1*4096kB = 44052kB Oct 25 07:28:04 nldedip4k031 kernel: [87946.529622] HighMem: 1*4kB 6*8kB 27*16kB 11*32kB 2*64kB 1*128kB 0*256kB 0*512kB 4*1024kB 1*2048kB 1260*4096kB = 5168196kB Oct 25 07:28:04 nldedip4k031 kernel: [87946.529627] 6852076 total pagecache pages Oct 25 07:28:04 nldedip4k031 kernel: [87946.529628] 0 pages in swap cache Oct 25 07:28:04 nldedip4k031 kernel: [87946.529629] Swap cache stats: add 0, delete 0, find 0/0 Oct 25 07:28:04 nldedip4k031 kernel: [87946.529630] Free swap = 3998716kB Oct 25 07:28:04 nldedip4k031 kernel: [87946.529631] Total swap = 3998716kB Oct 25 07:28:04 nldedip4k031 kernel: [87946.571914] 8437743 pages RAM Oct 25 07:28:04 nldedip4k031 kernel: [87946.571916] 8209409 pages HighMem Oct 25 07:28:04 nldedip4k031 kernel: [87946.571917] 159556 pages reserved Oct 25 07:28:04 nldedip4k031 kernel: [87946.571917] 6862034 pages shared Oct 25 07:28:04 nldedip4k031 kernel: [87946.571918] 123540 pages non-shared Oct 25 07:28:04 nldedip4k031 kernel: [87946.571919] [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name Oct 25 07:28:04 nldedip4k031 kernel: [87946.571927] [ 421] 0 421 709 152 3 0 0 upstart-udev-br Oct 25 07:28:04 nldedip4k031 kernel: [87946.571929] [ 429] 0 429 773 326 5 -17 -1000 udevd Oct 25 07:28:04 nldedip4k031 kernel: [87946.571931] [ 567] 0 567 772 224 4 -17 -1000 udevd Oct 25 07:28:04 nldedip4k031 kernel: [87946.571932] [ 568] 0 568 772 231 7 -17 -1000 udevd Oct 25 07:28:04 nldedip4k031 kernel: [87946.571934] [ 764] 0 764 712 103 1 0 0 upstart-socket- Oct 25 07:28:04 nldedip4k031 kernel: [87946.571936] [ 772] 103 772 815 164 5 0 0 dbus-daemon Oct 25 07:28:04 nldedip4k031 kernel: [87946.571938] [ 785] 0 785 1671 600 1 -17 -1000 sshd Oct 25 07:28:04 nldedip4k031 kernel: [87946.571940] [ 809] 101 809 7766 380 1 0 0 rsyslogd Oct 25 07:28:04 nldedip4k031 kernel: [87946.571942] [ 869] 0 869 1158 213 3 0 0 getty Oct 25 07:28:04 nldedip4k031 kernel: [87946.571943] [ 873] 0 873 1158 214 6 0 0 getty Oct 25 07:28:04 nldedip4k031 kernel: [87946.571945] [ 911] 0 911 1158 215 3 0 0 getty Oct 25 07:28:04 nldedip4k031 kernel: [87946.571947] [ 912] 0 912 1158 214 2 0 0 getty Oct 25 07:28:04 nldedip4k031 kernel: [87946.571949] [ 914] 0 914 1158 213 1 0 0 getty Oct 25 07:28:04 nldedip4k031 kernel: [87946.571950] [ 916] 0 916 618 86 1 0 0 atd Oct 25 07:28:04 nldedip4k031 kernel: [87946.571952] [ 917] 0 917 655 226 3 0 0 cron Oct 25 07:28:04 nldedip4k031 kernel: [87946.571954] [ 948] 0 948 902 159 3 0 0 irqbalance Oct 25 07:28:04 nldedip4k031 kernel: [87946.571956] [ 993] 0 993 1145 363 3 0 0 master Oct 25 07:28:04 nldedip4k031 kernel: [87946.571957] [ 1002] 104 1002 1162 333 1 0 0 qmgr Oct 25 07:28:04 nldedip4k031 kernel: [87946.571959] [ 1016] 0 1016 730 149 2 0 0 mdadm Oct 25 07:28:04 nldedip4k031 kernel: [87946.571961] [ 1057] 0 1057 6066 2160 3 0 0 /usr/sbin/apach Oct 25 07:28:04 nldedip4k031 kernel: [87946.571963] [ 1086] 0 1086 1158 213 3 0 0 getty Oct 25 07:28:04 nldedip4k031 kernel: [87946.571965] [ 1088] 33 1088 6191 1517 0 0 0 /usr/sbin/apach Oct 25 07:28:04 nldedip4k031 kernel: [87946.571967] [ 1089] 33 1089 6191 1451 1 0 0 /usr/sbin/apach Oct 25 07:28:04 nldedip4k031 kernel: [87946.571969] [ 1090] 33 1090 6175 1451 3 0 0 /usr/sbin/apach Oct 25 07:28:04 nldedip4k031 kernel: [87946.571971] [ 1091] 33 1091 6191 1451 1 0 0 /usr/sbin/apach Oct 25 07:28:04 nldedip4k031 kernel: [87946.571972] [ 1092] 33 1092 6191 1451 0 0 0 /usr/sbin/apach Oct 25 07:28:04 nldedip4k031 kernel: [87946.571974] [ 1109] 33 1109 6191 1517 0 0 0 /usr/sbin/apach Oct 25 07:28:04 nldedip4k031 kernel: [87946.571976] [ 1151] 33 1151 6191 1451 1 0 0 /usr/sbin/apach Oct 25 07:28:04 nldedip4k031 kernel: [87946.571978] [ 1201] 104 1201 1803 652 1 0 0 tlsmgr Oct 25 07:28:04 nldedip4k031 kernel: [87946.571980] [ 2475] 0 2475 2435 812 0 0 0 sshd Oct 25 07:28:04 nldedip4k031 kernel: [87946.571982] [ 2494] 0 2494 1745 839 1 0 0 bash Oct 25 07:28:04 nldedip4k031 kernel: [87946.571984] [ 2573] 0 2573 3394 1689 0 0 0 sshd Oct 25 07:28:04 nldedip4k031 kernel: [87946.571986] [ 2589] 0 2589 5014 457 3 0 0 rsync Oct 25 07:28:04 nldedip4k031 kernel: [87946.571988] [ 2590] 0 2590 7970 522 1 0 0 rsync Oct 25 07:28:04 nldedip4k031 kernel: [87946.571990] [ 2652] 104 2652 1150 326 5 0 0 pickup Oct 25 07:28:04 nldedip4k031 kernel: [87946.571992] Out of memory: Kill process 421 (upstart-udev-br) score 1 or sacrifice child Oct 25 07:28:04 nldedip4k031 kernel: [87946.572407] Killed process 421 (upstart-udev-br) total-vm:2836kB, anon-rss:156kB, file-rss:452kB Oct 25 07:28:04 nldedip4k031 kernel: [87946.573107] init: upstart-udev-bridge main process (421) killed by KILL signal Oct 25 07:28:04 nldedip4k031 kernel: [87946.573126] init: upstart-udev-bridge main process ended, respawning Oct 25 07:28:34 nldedip4k031 kernel: [87976.461570] irqbalance invoked oom-killer: gfp_mask=0x80d0, order=0, oom_adj=0, oom_score_adj=0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461573] irqbalance cpuset=/ mems_allowed=0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461576] Pid: 948, comm: irqbalance Not tainted 3.2.0-55-generic-pae #85-Ubuntu Oct 25 07:28:34 nldedip4k031 kernel: [87976.461578] Call Trace: Oct 25 07:28:34 nldedip4k031 kernel: [87976.461585] [] dump_header.isra.6+0x85/0xc0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461588] [] oom_kill_process+0x5c/0x80 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461591] [] out_of_memory+0xc5/0x1c0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461595] [] __alloc_pages_nodemask+0x72c/0x740 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461599] [] __get_free_pages+0x1c/0x30 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461602] [] get_zeroed_page+0x12/0x20 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461606] [] fill_read_buffer.isra.8+0xaa/0xd0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461609] [] sysfs_read_file+0x7d/0x90 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461613] [] vfs_read+0x8c/0x160 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461616] [] ? fill_read_buffer.isra.8+0xd0/0xd0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461619] [] sys_read+0x3d/0x70 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461624] [] sysenter_do_call+0x12/0x28 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461626] Mem-Info: Oct 25 07:28:34 nldedip4k031 kernel: [87976.461628] DMA per-cpu: Oct 25 07:28:34 nldedip4k031 kernel: [87976.461629] CPU 0: hi: 0, btch: 1 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461631] CPU 1: hi: 0, btch: 1 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461633] CPU 2: hi: 0, btch: 1 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461634] CPU 3: hi: 0, btch: 1 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461636] CPU 4: hi: 0, btch: 1 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461638] CPU 5: hi: 0, btch: 1 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461639] CPU 6: hi: 0, btch: 1 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461641] CPU 7: hi: 0, btch: 1 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461642] Normal per-cpu: Oct 25 07:28:34 nldedip4k031 kernel: [87976.461644] CPU 0: hi: 186, btch: 31 usd: 61 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461646] CPU 1: hi: 186, btch: 31 usd: 49 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461647] CPU 2: hi: 186, btch: 31 usd: 8 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461649] CPU 3: hi: 186, btch: 31 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461651] CPU 4: hi: 186, btch: 31 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461652] CPU 5: hi: 186, btch: 31 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461654] CPU 6: hi: 186, btch: 31 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461656] CPU 7: hi: 186, btch: 31 usd: 30 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461657] HighMem per-cpu: Oct 25 07:28:34 nldedip4k031 kernel: [87976.461658] CPU 0: hi: 186, btch: 31 usd: 4 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461660] CPU 1: hi: 186, btch: 31 usd: 204 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461662] CPU 2: hi: 186, btch: 31 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461663] CPU 3: hi: 186, btch: 31 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461665] CPU 4: hi: 186, btch: 31 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461667] CPU 5: hi: 186, btch: 31 usd: 31 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461668] CPU 6: hi: 186, btch: 31 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461670] CPU 7: hi: 186, btch: 31 usd: 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461674] active_anon:5441 inactive_anon:412 isolated_anon:0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461674] active_file:2668 inactive_file:6922842 isolated_file:0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461675] unevictable:0 dirty:836 writeback:0 unstable:0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461676] free:1231664 slab_reclaimable:105781 slab_unreclaimable:3399 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461677] mapped:2649 shmem:138 pagetables:313 bounce:0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461682] DMA free:4248kB min:780kB low:972kB high:1168kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:4kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15756kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:11560kB slab_unreclaimable:4kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:5687 all_unreclaimable? yes Oct 25 07:28:34 nldedip4k031 kernel: [87976.461686] lowmem_reserve[]: 0 869 32460 32460 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461693] Normal free:44184kB min:44216kB low:55268kB high:66324kB active_anon:0kB inactive_anon:0kB active_file:20kB inactive_file:1096kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:890008kB mlocked:0kB dirty:4kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:411564kB slab_unreclaimable:13592kB kernel_stack:992kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:1816 all_unreclaimable? yes Oct 25 07:28:34 nldedip4k031 kernel: [87976.461697] lowmem_reserve[]: 0 0 252733 252733 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461703] HighMem free:4878224kB min:512kB low:402312kB high:804112kB active_anon:21764kB inactive_anon:1648kB active_file:10652kB inactive_file:27690268kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:32349872kB mlocked:0kB dirty:3340kB writeback:0kB mapped:10592kB shmem:552kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:1252kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no Oct 25 07:28:34 nldedip4k031 kernel: [87976.461708] lowmem_reserve[]: 0 0 0 0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461711] DMA: 8*4kB 7*8kB 6*16kB 5*32kB 5*64kB 4*128kB 2*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 4248kB Oct 25 07:28:34 nldedip4k031 kernel: [87976.461719] Normal: 272*4kB 178*8kB 76*16kB 52*32kB 42*64kB 36*128kB 23*256kB 20*512kB 7*1024kB 2*2048kB 1*4096kB = 44176kB Oct 25 07:28:34 nldedip4k031 kernel: [87976.461727] HighMem: 1*4kB 45*8kB 31*16kB 24*32kB 5*64kB 3*128kB 1*256kB 2*512kB 4*1024kB 2*2048kB 1188*4096kB = 4877852kB Oct 25 07:28:34 nldedip4k031 kernel: [87976.461736] 6925679 total pagecache pages Oct 25 07:28:34 nldedip4k031 kernel: [87976.461737] 0 pages in swap cache Oct 25 07:28:34 nldedip4k031 kernel: [87976.461739] Swap cache stats: add 0, delete 0, find 0/0 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461740] Free swap = 3998716kB Oct 25 07:28:34 nldedip4k031 kernel: [87976.461741] Total swap = 3998716kB Oct 25 07:28:34 nldedip4k031 kernel: [87976.524951] 8437743 pages RAM Oct 25 07:28:34 nldedip4k031 kernel: [87976.524953] 8209409 pages HighMem Oct 25 07:28:34 nldedip4k031 kernel: [87976.524954] 159556 pages reserved Oct 25 07:28:34 nldedip4k031 kernel: [87976.524955] 6936141 pages shared Oct 25 07:28:34 nldedip4k031 kernel: [87976.524956] 124602 pages non-shared Oct 25 07:28:34 nldedip4k031 kernel: [87976.524957] [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name Oct 25 07:28:34 nldedip4k031 kernel: [87976.524966] [ 429] 0 429 773 326 5 -17 -1000 udevd Oct 25 07:28:34 nldedip4k031 kernel: [87976.524968] [ 567] 0 567 772 224 4 -17 -1000 udevd Oct 25 07:28:34 nldedip4k031 kernel: [87976.524971] [ 568] 0 568 772 231 7 -17 -1000 udevd Oct 25 07:28:34 nldedip4k031 kernel: [87976.524973] [ 764] 0 764 712 103 3 0 0 upstart-socket- Oct 25 07:28:34 nldedip4k031 kernel: [87976.524976] [ 772] 103 772 815 164 2 0 0 dbus-daemon Oct 25 07:28:34 nldedip4k031 kernel: [87976.524979] [ 785] 0 785 1671 600 1 -17 -1000 sshd Oct 25 07:28:34 nldedip4k031 kernel: [87976.524981] [ 809] 101 809 7766 380 1 0 0 rsyslogd Oct 25 07:28:34 nldedip4k031 kernel: [87976.524983] [ 869] 0 869 1158 213 3 0 0 getty Oct 25 07:28:34 nldedip4k031 kernel: [87976.524986] [ 873] 0 873 1158 214 6 0 0 getty Oct 25 07:28:34 nldedip4k031 kernel: [87976.524988] [ 911] 0 911 1158 215 3 0 0 getty Oct 25 07:28:34 nldedip4k031 kernel: [87976.524990] [ 912] 0 912 1158 214 2 0 0 getty Oct 25 07:28:34 nldedip4k031 kernel: [87976.524992] [ 914] 0 914 1158 213 1 0 0 getty Oct 25 07:28:34 nldedip4k031 kernel: [87976.524995] [ 916] 0 916 618 86 1 0 0 atd Oct 25 07:28:34 nldedip4k031 kernel: [87976.524997] [ 917] 0 917 655 226 3 0 0 cron Oct 25 07:28:34 nldedip4k031 kernel: [87976.524999] [ 948] 0 948 902 159 5 0 0 irqbalance Oct 25 07:28:34 nldedip4k031 kernel: [87976.525002] [ 993] 0 993 1145 363 3 0 0 master Oct 25 07:28:34 nldedip4k031 kernel: [87976.525004] [ 1002] 104 1002 1162 333 1 0 0 qmgr Oct 25 07:28:34 nldedip4k031 kernel: [87976.525007] [ 1016] 0 1016 730 149 2 0 0 mdadm Oct 25 07:28:34 nldedip4k031 kernel: [87976.525009] [ 1057] 0 1057 6066 2160 3 0 0 /usr/sbin/apach Oct 25 07:28:34 nldedip4k031 kernel: [87976.525012] [ 1086] 0 1086 1158 213 3 0 0 getty Oct 25 07:28:34 nldedip4k031 kernel: [87976.525014] [ 1088] 33 1088 6191 1517 0 0 0 /usr/sbin/apach Oct 25 07:28:34 nldedip4k031 kernel: [87976.525017] [ 1089] 33 1089 6191 1451 1 0 0 /usr/sbin/apach Oct 25 07:28:34 nldedip4k031 kernel: [87976.525019] [ 1090] 33 1090 6175 1451 1 0 0 /usr/sbin/apach Oct 25 07:28:34 nldedip4k031 kernel: [87976.525021] [ 1091] 33 1091 6191 1451 1 0 0 /usr/sbin/apach Oct 25 07:28:34 nldedip4k031 kernel: [87976.525024] [ 1092] 33 1092 6191 1451 0 0 0 /usr/sbin/apach Oct 25 07:28:34 nldedip4k031 kernel: [87976.525026] [ 1109] 33 1109 6191 1517 0 0 0 /usr/sbin/apach Oct 25 07:28:34 nldedip4k031 kernel: [87976.525029] [ 1151] 33 1151 6191 1451 1 0 0 /usr/sbin/apach Oct 25 07:28:34 nldedip4k031 kernel: [87976.525031] [ 1201] 104 1201 1803 652 1 0 0 tlsmgr Oct 25 07:28:34 nldedip4k031 kernel: [87976.525033] [ 2475] 0 2475 2435 812 0 0 0 sshd Oct 25 07:28:34 nldedip4k031 kernel: [87976.525036] [ 2494] 0 2494 1745 839 1 0 0 bash Oct 25 07:28:34 nldedip4k031 kernel: [87976.525038] [ 2573] 0 2573 3394 1689 3 0 0 sshd Oct 25 07:28:34 nldedip4k031 kernel: [87976.525040] [ 2589] 0 2589 5014 457 3 0 0 rsync Oct 25 07:28:34 nldedip4k031 kernel: [87976.525043] [ 2590] 0 2590 7970 522 1 0 0 rsync Oct 25 07:28:34 nldedip4k031 kernel: [87976.525045] [ 2652] 104 2652 1150 326 5 0 0 pickup Oct 25 07:28:34 nldedip4k031 kernel: [87976.525048] [ 2847] 0 2847 709 89 0 0 0 upstart-udev-br Oct 25 07:28:34 nldedip4k031 kernel: [87976.525050] Out of memory: Kill process 764 (upstart-socket-) score 1 or sacrifice child Oct 25 07:28:34 nldedip4k031 kernel: [87976.525484] Killed process 764 (upstart-socket-) total-vm:2848kB, anon-rss:204kB, file-rss:208kB Oct 25 07:28:34 nldedip4k031 kernel: [87976.526161] init: upstart-socket-bridge main process (764) killed by KILL signal Oct 25 07:28:34 nldedip4k031 kernel: [87976.526180] init: upstart-socket-bridge main process ended, respawning Oct 25 07:28:44 nldedip4k031 kernel: [87986.439671] irqbalance invoked oom-killer: gfp_mask=0x80d0, order=0, oom_adj=0, oom_score_adj=0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439674] irqbalance cpuset=/ mems_allowed=0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439676] Pid: 948, comm: irqbalance Not tainted 3.2.0-55-generic-pae #85-Ubuntu Oct 25 07:28:44 nldedip4k031 kernel: [87986.439678] Call Trace: Oct 25 07:28:44 nldedip4k031 kernel: [87986.439684] [] dump_header.isra.6+0x85/0xc0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439686] [] oom_kill_process+0x5c/0x80 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439688] [] out_of_memory+0xc5/0x1c0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439691] [] __alloc_pages_nodemask+0x72c/0x740 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439694] [] __get_free_pages+0x1c/0x30 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439696] [] get_zeroed_page+0x12/0x20 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439699] [] fill_read_buffer.isra.8+0xaa/0xd0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439702] [] sysfs_read_file+0x7d/0x90 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439704] [] vfs_read+0x8c/0x160 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439707] [] ? fill_read_buffer.isra.8+0xd0/0xd0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439709] [] sys_read+0x3d/0x70 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439712] [] sysenter_do_call+0x12/0x28 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439714] Mem-Info: Oct 25 07:28:44 nldedip4k031 kernel: [87986.439714] DMA per-cpu: Oct 25 07:28:44 nldedip4k031 kernel: [87986.439716] CPU 0: hi: 0, btch: 1 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439717] CPU 1: hi: 0, btch: 1 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439718] CPU 2: hi: 0, btch: 1 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439719] CPU 3: hi: 0, btch: 1 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439720] CPU 4: hi: 0, btch: 1 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439721] CPU 5: hi: 0, btch: 1 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439722] CPU 6: hi: 0, btch: 1 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439723] CPU 7: hi: 0, btch: 1 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439724] Normal per-cpu: Oct 25 07:28:44 nldedip4k031 kernel: [87986.439725] CPU 0: hi: 186, btch: 31 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439726] CPU 1: hi: 186, btch: 31 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439727] CPU 2: hi: 186, btch: 31 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439728] CPU 3: hi: 186, btch: 31 usd: 0 Oct 25 07:28:44 nldedip4k031 kernel: [87986.439729] CPU 4: hi: 186, btch: 31 usd: 0 Oct 25 07:33:48 nldedip4k031 kernel: imklog 5.8.6, log source = /proc/kmsg started. Oct 25 07:33:48 nldedip4k031 rsyslogd: [origin software="rsyslogd" swVersion="5.8.6" x-pid="2880" x-info="http://www.rsyslog.com"] start Oct 25 07:33:48 nldedip4k031 rsyslogd: rsyslogd's groupid changed to 103 Oct 25 07:33:48 nldedip4k031 rsyslogd: rsyslogd's userid changed to 101 Oct 25 07:33:48 nldedip4k031 rsyslogd-2039: Could not open output pipe '/dev/xconsole' [try http://www.rsyslog.com/e/2039 ]

Read the article
GPGPU

WhatGPU obviously stands for Graphics Processing Unit (the silicon powering the display you are using to read this blog post). The extra GP in front of that stands for General Purpose computing.So, altogether GPGPU refers to computing we can perform on GPU for purposes beyond just drawing on the screen. In effect, we can use a GPGPU a bit like we already use a CPU: to perform some calculation (that doesn’t have to have any visual element to it). The attraction is that a GPGPU can be orders of magnitude faster than a CPU.WhyWhen I was at the SuperComputing conference in Portland last November, GPGPUs were all the rage. A quick online search reveals many articles introducing the GPGPU topic. I'll just share 3 here: pcper (ignoring all pages except the first, it is a good consumer perspective), gizmodo (nice take using mostly layman terms) and vizworld (answering the question on "what's the big deal").The GPGPU programming paradigm (from a high level) is simple: in your CPU program you define functions (aka kernels) that take some input, can perform the costly operation and return the output. The kernels are the things that execute on the GPGPU leveraging its power (and hence execute faster than what they could on the CPU) while the host CPU program waits for the results or asynchronously performs other tasks.However, GPGPUs have different characteristics to CPUs which means they are suitable only for certain classes of problem (i.e. data parallel algorithms) and not for others (e.g. algorithms with branching or recursion or other complex flow control). You also pay a high cost for transferring the input data from the CPU to the GPU (and vice versa the results back to the CPU), so the computation itself has to be long enough to justify the overhead transfer costs. If your problem space fits the criteria then you probably want to check out this technology.HowSo where can you get a graphics card to start playing with all this? At the time of writing, the two main vendors ATI (owned by AMD) and NVIDIA are the obvious players in this industry. You can read about GPGPU on this AMD page and also on this NVIDIA page. NVIDIA's website also has a free chapter on the topic from the "GPU Gems" book: A Toolkit for Computation on GPUs.If you followed the links above, then you've already come across some of the choices of programming models that are available today. Essentially, AMD is offering their ATI Stream technology accessible via a language they call Brook+; NVIDIA offers their CUDA platform which is accessible from CUDA C. Choosing either of those locks you into the GPU vendor and hence your code cannot run on systems with cards from the other vendor (e.g. imagine if your CPU code would run on Intel chips but not AMD chips). Having said that, both vendors plan to support a new emerging standard called OpenCL, which theoretically means your kernels can execute on any GPU that supports it. To learn more about all of these there is a website: gpgpu.org. The caveat about that site is that (currently) it completely ignores the Microsoft approach, which I touch on next.On Windows, there is already a cross-GPU-vendor way of programming GPUs and that is the DirectX API. Specifically, on Windows Vista and Windows 7, the DirectX 11 API offers a dedicated subset of the API for GPGPU programming: DirectCompute. You use this API on the CPU side, to set up and execute the kernels that run on the GPU. The kernels are written in a language called HLSL (High Level Shader Language). You can use DirectCompute with HLSL to write a "compute shader", which is the term DirectX uses for what I've been referring to in this post as a "kernel". For a comprehensive collection of links about this (including tutorials, videos and samples) please see my blog post: DirectCompute.Note that there are many efforts to build even higher level languages on top of DirectX that aim to expose GPGPU programming to a wider audience by making it as easy as today's mainstream programming models. I'll mention here just two of those efforts: Accelerator from MSR and Brahma by Ananth. Comments about this post welcome at the original blog.

Read the article
Using NServiceBus behind a custom web service

- by Michael Stephenson

In this post I'd like to talk about an architecture scenario we had recently and how we were able to utilise NServiceBus to help us address this problem. Scenario Cognos is a reporting system used by one of my clients. A while back we developed a web service façade to allow line of business applications to be able to access reports from Cognos to support their various functions. The service was intended to provide access to reports which were quick running reports or pre-generated reports which could be accessed real-time on demand. One of the key aims of the web service was to provide a simple generic interface to allow applications to get any report without needing to worry about the complex .net SDK for Cognos. The web service also supported multi-hop kerberos delegation so that report data could be accesses under the context of the end user. This service was working well for a period of time. The Problem The problem we encountered was that reports were now also required to be available to batch processes. The original design was optimised for low latency so users would enjoy a positive experience, however when the batch processes started to request 250+ concurrent reports over an extended period of time you can begin to imagine the sorts of problems that come into play. The key problems this new scenario caused are: Users may be affected and the latency of on demand reports was significantly slower The Cognos infrastructure was not scaled sufficiently to be able to cope with these long peaks of load From a cost perspective it just isn't feasible to scale the Cognos infrastructure to be able to handle the load when it is only for a couple of hour window each night. We really needed to introduce a second pattern for accessing this service which would support high through-put scenarios. We also had little control over the batch process in terms of being able to throttle its load. We could however make some changes to the way it accessed the reports. The Approach My idea was to introduce a throttling mechanism between the Web Service Façade and Cognos. This would allow the batch processes to push reports requests hard at the web service which we were confident the web service can handle. The web service would then queue these requests and process them behind the scenes and make a call back to the batch application to provide the report once it had been accessed. In terms of technology we had some limitations because we were not able to use WCF or IIS7 where the MSMQ-Activated WCF services could have helped, but we did have MSMQ as an option and I thought NServiceBus could do just the job to help us here. The flow of how this would work was as follows: The batch applications would send a request for a report to the web service The web service uses NServiceBus to send the message to a Queue The NServiceBus Generic Host is running as a windows service with a message handler which subscribes to these messages The message handler gets the message, accesses the report from Cognos The message handler calls back to the original batch application, this is decoupled because the calling application provides a call back url The report gets into the batch application and is processed as normal This approach looks something like the below diagram: The key points are an application wanting to take advantage of the batch driven reports needs to do the following: Implement our call back contract Make a call to the service providing a call back url Provide a correlation ID so it knows how to tie each response back to its request What does NServiceBus offer in this solution So this scenario is not the typical messaging service bus type of solution people implement with NServiceBus, but it did offer the following: Simplified interaction with MSMQ Offered the ability to configure the number of processes working through the queue so we could find a balance between load on Cognos versus the applications end to end processing time NServiceBus offers retries and a way to manage failed messages NServiceBus offers a high availability setup The simple thing is that NServiceBus gave us the platform to build the solution on. We just implemented a message handler which functionally processed a message and we could rely on NServiceBus to do all of the hard work around managing the queues and all of the lower level things that would have took ages to write to any kind of robust level. Conclusion With this approach we were able to deal with a fairly significant performance issue with out too much rework. Hopefully this write up gives people some insight into ideas on how to leverage the excellent NServiceBus framework to help solve integration and high through-put scenarios.

Read the article
Java Spotlight Episode 103: 2012 Duke Choice Award Winners

- by Roger Brinkley

Our annual interview with the 2012 Duke Choice Award Winners recorded live at the JavaOne 2012. Right-click or Control-click to download this MP3 file. You can also subscribe to the Java Spotlight Podcast Feed to get the latest podcast automatically. If you use iTunes you can open iTunes and subscribe with this link: Java Spotlight Podcast in iTunes. Show Notes Events Oct 13, Devoxx 4 Kids Nederlands Oct 15-17, JAX London Oct 20, Devoxx 4 Kids Français Oct 22-23, Freescale Technology Forum - Japan, Tokyo Oct 30-Nov 1, Arm TechCon, Santa Clara Oct 31, JFall, Netherlands Nov 2-3, JMagreb, Morocco Nov 13-17, Devoxx, Belgium Feature Interview Duke Choice Award Winners 2012 - Show Presentation London Java CommunityThe second user group receiving a Duke’s Choice Award this year, the London Java Community (LJC) and its users have been active in the OpenJDK, the Java Community Process (JCP) and other efforts within the global Java community. Student Nokia Developer GroupThis year’s student winner, Ram Kashyap, is the founder and president of the Nokia Student Network, and was profiled in the “The New Java Developers” feature in the March/April 2012 issue of Java Magazine. Since then, Ram has maintained a hectic pace, graduating from the People’s Education Society Institute of Technology in Bangalore, India, while working on a Java mobile startup and training students on Java ME. Jelastic, Inc.Moving existing Java applications to the cloud can be a daunting task, but startup Jelastic, Inc. offers the first all-Java platform-as-a-service (PaaS) that enables existing Java applications to be deployed in the cloud without code changes or lock-in. NATOThe first-ever Community Choice Award goes to the MASE Integrated Console Environment (MICE) in use at NATO. Built in Java on the NetBeans platform, MICE provides a high-performance visualization environment for conducting air defense and battle-space operations. DuchessRather than focus on a specific geographic area like most Java User Groups (JUGs), Duchess fosters the participation of women in the Java community worldwide. The group has more than 500 members in 60 countries, and provides a platform through which women can connect with each other and get involved in all aspects of the Java community. AgroSense ProjectImproving farming methods to feed a hungry world is the goal of AgroSense, an open source farm information management system built in Java and the NetBeans platform. AgroSense enables farmers, agribusinesses, suppliers and others to develop modular applications that will easily exchange information through a common underlying NetBeans framework. Apache Software Foundation Hadoop ProjectThe Apache Software Foundation’s Hadoop project, written in Java, provides a framework for distributed processing of big data sets across clusters of computers, ranging from a few servers to thousands of machines. This harnessing of large data pools allows organizations to better understand and improve their business. Parleys.comE-learning specialist Parleys.com, based in Brussels, Belgium, uses Java technologies to bring online classes and full IT conferences to desktops, laptops, tablets and mobile devices. Parleys.com has hosted more than 1,700 conferences—including Devoxx and JavaOne—for more than 800,000 unique visitors. Winners not presenting at JavaOne 2012 Duke Choice Awards BOF Liquid RoboticsRobotics – Liquid Robotics is an ocean data services provider whose Wave Glider technology collects information from the world’s oceans for application in government, science and commercial applications. The organization features the “father of Java” James Gosling as its chief software architect.United Nations High Commissioner for RefugeesThe United Nations High Commissioner for Refugees (UNHCR) is on the front lines of crises around the world, from civil wars to natural disasters. To help facilitate its mission of humanitarian relief, the UNHCR has developed a light-client Java application on the NetBeans platform. The Level One registration tool enables the UNHCR to collect information on the number of refugees and their water, food, housing, health, and other needs in the field, and combines that with geocoding information from various sources. This enables the UNHCR to deliver the appropriate kind and amount of assistance where it is needed.

Read the article
How John Got 15x Improvement Without Really Trying

- by rchrd

The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here. How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

Read the article
Developing an Implementation Plan with Iterations by Russ Pitts

- by user535886

Developing an Implementation Plan with Iterations by Russ Pitts Ok, so you have come to grips with understanding that applying the iterative concept, as defined by OUM is simply breaking up the project effort you have estimated for each phase into one or more six week calendar duration blocks of work. Idea being the business user(s) or key recipient(s) of work product(s) being developed never go longer than six weeks without having some sort of review or prototyping of the work results for an iteration…”think-a-little”, “do-a-little”, and “show-a-little” in a six week or less timeframe…ideally the business user(s) or key recipients(s) are involved throughout. You also understand the OUM concept that you only plan for that which you have knowledge of. The concept further defined, a project plan initially is developed at a high-level, and becomes more detailed as project knowledge grows. Agreeing to this concept means you also have to admit to the fallacy that one can plan with precision beyond six weeks into a project…Anything beyond six weeks is a best guess in most cases when dealing with software implementation projects. Project planning, as defined by OUM begins with the Implementation Plan view, which is a very high-level perspective of the effort estimated for each of the five OUM phases, as well as the number of iterations within each phase. You might wonder how can you predict the number of iterations for each phase at this early point in the project. Remember project planning is not an exact science, and initially is high-level and abstract in nature, and then becomes more detailed and precise as the project proceeds. So where do you start in defining iterations for each phase for a project? The following are three easy steps to initially define the number of iterations for each phase: Step 1 => Start with identifying the known factors… …Prior to starting a project you should know: · The agreed upon time-period for an iteration (e.g 6 weeks, or 4 weeks, or…) within a phase (recommend keeping iteration time-period consistent within a phase, if not for the entire project) · The number of resources available for the project · The number of total number of man-day (effort) you have estimated for each of the five OUM phases of the project · The number of work days for a week Step 2 => Calculate the man-days of effort required for an iteration within a phase… Lets assume for the sake of this example there are 10 project resources, and you have estimated 2,536 man-days of work effort which will need to occur for the elaboration phase of the project. Let’s also assume a week for this project is defined as 5 business days, and that each iteration in the elaboration phase will last a calendar duration of 6 weeks. A simple calculation is performed to calculate the daily burn rate for a single iteration, which produces a result of… ((Number of resources * days per week) * duration of iteration) = Number of days required per iteration ((10 resources * 5 days/week) * 6 weeks) = 300 man days of effort required per iteration Step 3 => Calculate the number of iterations that can occur within a phase Next calculate the number of iterations that can occur for the amount of man-days of effort estimated for the phase being considered… (number of man-days of effort estimated / number of man-days required per iteration) = # of iterations for phase (2,536 man-days of estimated effort for phase / 300 man days of effort required per iteration) = 8.45 iterations, which should be rounded to a whole number such as 9 iterations* *Note - It is important to note this is an approximate calculation, not an exact science. This particular example is a simple one, which assumes all resources are utilized throughout the phase, including tech resources, etc. (rounding down or up to a whole number based on project factor considerations). It is also best in many cases to round up to higher number, as this provides some calendar scheduling contingency.

Read the article
Microsoft hosting free Hyper-V training for VMware Pros

- by Ryan Roussel

Microsoft will be hosting free training for virtualization professionals focused on Hyper-V, System Center, and virtualization architecture. Details are below: Just one week after Microsoft Management Summit 2011 (MMS), Microsoft Learning will be hosting an exclusive three-day Jump Start class specially tailored for VMware and Microsoft virtualization technology pros. Registration for “Microsoft Virtualization for VMware Professionals” is open now and will be delivered as an online class on March 29-31, 2010 from 10:00am-4:00pm PDT. The course is COMPLETELY FREE and OPEN TO ANYONE! Please share with your customers, blog, Tweet, etc. – help us get the word out to strengthen support for Microsoft’s virtualization offerings. What’s the high-level overview? This cutting edge course will feature expert instruction and real-world demonstrations of Hyper-V and brand new releases from System Center Virtual Machine Manager 2012 Beta (many of which will be announced just one week earlier at MMS). Register Now! Day 1 will focus on “Platform” (Hyper-V, virtualization architecture, high availability & clustering) 10:00am – 10:30pm PDT: Virtualization 360 Overview 10:30am – 12:00pm: Microsoft Hyper-V Deployment Options & Architecture 1:00pm – 2:00pm: Differentiating Microsoft and VMware (terminology, etc.) 2:00pm – 4:00pm: High Availability & Clustering Day 2 will focus on “Management” (System Center Suite, SCVMM 2012 Beta, Opalis, Private Cloud solutions) 10:00am – 11:00pm PDT: System Center Suite Overview w/ focus on DPM 11:00am – 12:00pm: Virtual Machine Manager 2012 | Part 1 1:00pm – 1:30pm: Virtual Machine Manager 2012 | Part 2 1:30pm – 2:30pm: Automation with System Center Opalis & PowerShell 2:30pm – 4:00pm: Private Cloud Solutions, Architecture & VMM SSP 2.0 Day 3 will focus on “VDI” (VDI Infrastructure/architecture, v-Alliance, application delivery via VDI) 10:00am – 11:00pm PDT: Virtual Desktop Infrastructure (VDI) Architecture | Part 1 11:00am – 12:00pm: Virtual Desktop Infrastructure (VDI) Architecture | Part 2 1:00pm – 2:30pm: v-Alliance Solution Overview 2:30pm – 4:00pm: Application Delivery for VDI Every section will be team-taught by two of the most respected authorities on virtualization technologies: Microsoft Technical Evangelist Symon Perriman and leading Hyper-V, VMware, and XEN infrastructure consultant, Corey Hynes Who is the target audience for this training? Suggested prerequisite skills include real-world experience with Windows Server 2008 R2, virtualization and datacenter management. The course is tailored to these types of roles: · IT Professional · IT Decision Maker · Network Administrators & Architects · Storage/Infrastructure Administrators & Architects How do I to register and learn more about this great training opportunity? · Register: Visit the Registration Page and sign up for all three sessions · Blog: Learn more from the Microsoft Learning Blog · Twitter: Here are a few posts you can retweet: o Mar. 29-31 "Microsoft #Virtualization for VMware Pros" @SymonPerriman Corey Hynes http://bit.ly/JS-Hyper-V @MSLearning #Hyper-V o @SysCtrOpalis Mar. 29-31 "Microsoft #Virtualization for VMware Pros" @SymonPerriman Corey Hynes http://bit.ly/JS-Hyper-V #Hyper-V o Learn all the cool new features in Hyper-V & System Center 2012! SCVMM, Self-Service Portal 2.0, http://bit.ly/JS-Hyper-V #Hyper-V #Opalis What is a “Jump Start” course? A “Jump Start” course is “team-taught” by two expert instructors in an engaging radio talk show style format. The idea is to deliver readiness training on strategic and emerging technologies that drive awareness at scale before Microsoft Learning develops mainstream Microsoft Official Courses (MOC) that map to certifications. All sessions are professionally recorded and distributed through MS Showcase, Channel 9, Zune Marketplace and iTunes for broader reach.

Read the article
Big Data – Operational Databases Supporting Big Data – Key-Value Pair Databases and Document Databases – Day 13 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the Relational Database and NoSQL database in the Big Data Story. In this article we will understand the role of Key-Value Pair Databases and Document Databases Supporting Big Data Story. Now we will see a few of the examples of the operational databases. Relational Databases (Yesterday’s post) NoSQL Databases (Yesterday’s post) Key-Value Pair Databases (This post) Document Databases (This post) Columnar Databases (Tomorrow’s post) Graph Databases (Tomorrow’s post) Spatial Databases (Tomorrow’s post) Key Value Pair Databases Key Value Pair Databases are also known as KVP databases. A key is a field name and attribute, an identifier. The content of that field is its value, the data that is being identified and stored. They have a very simple implementation of NoSQL database concepts. They do not have schema hence they are very flexible as well as scalable. The disadvantages of Key Value Pair (KVP) database are that they do not follow ACID (Atomicity, Consistency, Isolation, Durability) properties. Additionally, it will require data architects to plan for data placement, replication as well as high availability. In KVP databases the data is stored as strings. Here is a simple example of how Key Value Database will look like: Key Value Name Pinal Dave Color Blue Twitter @pinaldave Name Nupur Dave Movie The Hero As the number of users grow in Key Value Pair databases it starts getting difficult to manage the entire database. As there is no specific schema or rules associated with the database, there are chances that database grows exponentially as well. It is very crucial to select the right Key Value Pair Database which offers an additional set of tools to manage the data and provides finer control over various business aspects of the same. Riak Rick is one of the most popular Key Value Database. It is known for its scalability and performance in high volume and velocity database. Additionally, it implements a mechanism for collection key and values which further helps to build manageable system. We will further discuss Riak in future blog posts. Key Value Databases are a good choice for social media, communities, caching layers for connecting other databases. In simpler words, whenever we required flexibility of the data storage keeping scalability in mind – KVP databases are good options to consider. Document Database There are two different kinds of document databases. 1) Full document Content (web pages, word docs etc) and 2) Storing Document Components for storage. The second types of the document database we are talking about over here. They use Javascript Object Notation (JSON) and Binary JSON for the structure of the documents. JSON is very easy to understand language and it is very easy to write for applications. There are two major structures of JSON used for Document Database – 1) Name Value Pairs and 2) Ordered List. MongoDB and CouchDB are two of the most popular Open Source NonRelational Document Database. MongoDB MongoDB databases are called collections. Each collection is build of documents and each document is composed of fields. MongoDB collections can be indexed for optimal performance. MongoDB ecosystem is highly available, supports query services as well as MapReduce. It is often used in high volume content management system. CouchDB CouchDB databases are composed of documents which consists fields and attachments (known as description). It supports ACID properties. The main attraction points of CouchDB are that it will continue to operate even though network connectivity is sketchy. Due to this nature CouchDB prefers local data storage. Document Database is a good choice of the database when users have to generate dynamic reports from elements which are changing very frequently. A good example of document usages is in real time analytics in social networking or content management system. Tomorrow In tomorrow’s blog post we will discuss about various other Operational Databases supporting Big Data. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
F1 Pit Pragmatics

- by mikef

"I hate computers. No, really, I hate them. I love the communications they facilitate, I love the conveniences they provide to my life. but I actually hate the computers themselves." - Scott Merrill, 'I hate computers: confessions of a Sysadmin' If Scott's goal was to polarize opinion and trigger raging arguments over the 'real reasons why computers suck', then he certainly succeeded. Impassioned vitriol sits side-by-side with rational debate. Yet Scott's fundamental point is absolutely on the money - Computers are a means to an end. The IT industry is finally starting to put weight behind the notion that good User Experience is an absolutely crucial goal, a cause championed by the likes of Microsoft's Bill Buxton, and which Apple's increasingly ubiquitous touch screen interface exemplifies. However, that doesn't change the fact that, occasionally, you just have to man up and deal with complex systems. In fact, sometimes you just need to sacrifice everything else in the name of performance. You'll find a perfect example of this Faustian bargain in Trevor Clarke's fascinating look into the (diabolical) IT infrastructure of modern F1 racing - high performance, high availability. high everything. To paraphrase, each car has up to 100 sensors, transmitting around 30Gb of data over the course of a race (70% in real-time). This data is then processed by no less than 3 servers (per car) so that the engineers in the pit have access to telemetry, strategy information, timing feeds, a connection back to the operations room in the team's home base - the list goes on. All of this while the servers are exposed "to carbon dust, oil, vibration, rain, heat, [and] variable power". Now, this is admittedly an extreme context where there's no real choice but to use complex systems where ease-of-use is, at best, a secondary concern. The flip-side is seen in small-scale personal computing such as that seen in Apple's iDevices, which are incredibly intuitive but limited in their scope. In terms of what kinds of systems they prefer to use, I suspect that most SysAdmins find themselves somewhere along this axis of Power vs. Usability, and which end of this axis you resonate with also hints at where you think the IT industry should focus its energy. Do you see yourself in the F1 pit, making split-second decisions, wrestling with information flows and reticent hardware to bend them to your will? If so, I imagine you feel that computers are subtle tools which need to be tuned and honed, using the advanced knowledge possessed only by responsible SysAdmins (If you have an iPhone, I suspect it's jail-broken). If the machines throw enigmatic errors, it's the price of flexibility and raw power. Alternatively, would you prefer to have your role more accessible, with users empowered by knowledge, spreading the load of managing IT environments? In that case, then you want hardware and software to have User Experience as their primary focus, and are of the "means to an end" school of thought (you're probably also fed up with users not listening to you when you try and help). At its heart, the dichotomy is between raw power (which might be difficult to use) and ease-of-use (which might have some limitations, but you can be up and running immediately). Of course, the ultimate goal is a fusion of flexibility, power and usability all in one system. It's achievable in specific software environments, and Red Gate considers it a target worth aiming for, but in other cases it's a goal right up there with cold fusion. I think it'll be a long time before we see it become ubiquitous. In the meantime, are you Power-Hungry or a Champion of Usability? Cheers, Michael Francis Simple Talk SysAdmin Editor

Read the article
ALC889 - GA-P55-UD4 -no sound - Ubuntu 11.10

- by george

I have computer with a Gigabyte P55A-UD4 motherboard. I have on-board audio - Realtek ALC889. I am using Ubuntu 11.10 and have no sound. please please heeeelp :). i have tryed to install high definition audio codecs from realtek but it doesn't work. in bios the azalia codec is turned on. ps : sorry for my english. 00:00.0 Host bridge: Intel Corporation Core Processor DRAM Controller (rev 12) 00:01.0 PCI bridge: Intel Corporation Core Processor PCI Express x16 Root Port (rev 12) 00:1a.0 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB Universal Host Controller (rev 06) 00:1a.1 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB Universal Host Controller (rev 06) 00:1a.2 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB Universal Host Controller (rev 06) 00:1a.7 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 06) 00:1b.0 Audio device: Intel Corporation 5 Series/3400 Series Chipset High Definition Audio (rev 06) 00:1c.0 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 1 (rev 06) 00:1c.4 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 5 (rev 06) 00:1c.5 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 6 (rev 06) 00:1c.6 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 7 (rev 06) 00:1d.0 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB Universal Host Controller (rev 06) 00:1d.1 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB Universal Host Controller (rev 06) 00:1d.2 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB Universal Host Controller (rev 06) 00:1d.3 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB Universal Host Controller (rev 06) 00:1d.7 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 06) 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a6) 00:1f.0 ISA bridge: Intel Corporation 5 Series Chipset LPC Interface Controller (rev 06) 00:1f.2 SATA controller: Intel Corporation 5 Series/3400 Series Chipset 6 port SATA AHCI Controller (rev 06) 00:1f.3 SMBus: Intel Corporation 5 Series/3400 Series Chipset SMBus Controller (rev 06) 01:00.0 VGA compatible controller: nVidia Corporation GT216 [GeForce GT 220] (rev a2) 01:00.1 Audio device: nVidia Corporation High Definition Audio Controller (rev a1) 03:00.0 SATA controller: JMicron Technology Corp. JMB362/JMB363 Serial ATA Controller (rev 03) 03:00.1 IDE interface: JMicron Technology Corp. JMB362/JMB363 Serial ATA Controller (rev 03) 04:00.0 IDE interface: Marvell Technology Group Ltd. Device 91a3 (rev 11) 05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 06) 06:03.0 IDE interface: Integrated Technology Express, Inc. IT8213 IDE Controller 06:07.0 FireWire (IEEE 1394): Texas Instruments TSB43AB23 IEEE-1394a-2000 Controller (PHY/Link) 3f:00.0 Host bridge: Intel Corporation Core Processor QuickPath Architecture Generic Non-core Registers (rev 02) 3f:00.1 Host bridge: Intel Corporation Core Processor QuickPath Architecture System Address Decoder (rev 02) 3f:02.0 Host bridge: Intel Corporation Core Processor QPI Link 0 (rev 02) 3f:02.1 Host bridge: Intel Corporation Core Processor QPI Physical 0 (rev 02) 3f:02.2 Host bridge: Intel Corporation Core Processor Reserved (rev 02) 3f:02.3 Host bridge: Intel Corporation Core Processor Reserved (rev 02) aplay -l karta 0: Intel [HDA Intel], urzadzenie 0: ALC889 Analog [ALC889 Analog] Urzadzenia podrzedne: 1/1 Urzadzenie podrzedne #0: subdevice #0 karta 0: Intel [HDA Intel], urzadzenie 1: ALC889 Digital [ALC889 Digital] Urzadzenia podrzedne: 1/1 Urzadzenie podrzedne #0: subdevice #0 karta 1: NVidia [HDA NVidia], urzadzenie 3: HDMI 0 [HDMI 0] Urzadzenia podrzedne: 1/1 Urzadzenie podrzedne #0: subdevice #0 karta 1: NVidia [HDA NVidia], urzadzenie 7: HDMI 0 [HDMI 0] Urzadzenia podrzedne: 1/1 Urzadzenie podrzedne #0: subdevice #0 karta 1: NVidia [HDA NVidia], urzadzenie 8: HDMI 0 [HDMI 0] Urzadzenia podrzedne: 1/1 Urzadzenie podrzedne #0: subdevice #0 karta 1: NVidia [HDA NVidia], urzadzenie 9: HDMI 0 [HDMI 0] Urzadzenia podrzedne: 1/1 Urzadzenie podrzedne #0: subdevice #0

Read the article

Search Results

Search found 7805 results on 313 pages for 'high voltage'.

Page 60/313 | < Previous Page | 56 57 58 59 60 61 62 63 64 65 66 67 | Next Page >

- by cute

- by Thomas Weller

- by Claus Jandausch

- by Stilgar

- by Brad Dwyer

- by jamesmorris

- by Stephen.Walther

- by Roy Rico

- by Roy Rico

- by pinaldave

- by Pinal Dave

- by Testas

- by fip

- by Michael Snow

- by jsavit

- by Grant

- by Michael Stephenson

- by Roger Brinkley

- by rchrd

- by user535886

- by Ryan Roussel

- by Pinal Dave

- by mikef

- by george

< Previous Page | 56 57 58 59 60 61 62 63 64 65 66 67 | Next Page >