Search Results

Search found 11396 results on 456 pages for 'simply denis'.

Page 426/456 | < Previous Page | 422 423 424 425 426 427 428 429 430 431 432 433  | Next Page >

  • Full-text Indexing Books Online

    - by Most Valuable Yak (Rob Volk)
    While preparing for a recent SQL Saturday presentation, I was struck by a crazy idea (shocking, I know): Could someone import the content of SQL Server Books Online into a database and apply full-text indexing to it?  The answer is yes, and it's really quite easy to do. The first step is finding the installed help files.  If you have SQL Server 2012, BOL is installed under the Microsoft Help Library.  You can find the install location by opening SQL Server Books Online and clicking the gear icon for the Help Library Manager.  When the new window pops up click the Settings link, you'll get the following: You'll see the path under Library Location. Once you navigate to that path you'll have to drill down a little further, to C:\ProgramData\Microsoft\HelpLibrary\content\Microsoft\store.  This is where the help file content is kept if you downloaded it for offline use. Depending on which products you've downloaded help for, you may see a few hundred files.  Fortunately they're named well and you can easily find the "SQL_Server_Denali_Books_Online_" files.  We are interested in the .MSHC files only, and can skip the Installation and Developer Reference files. Despite the .MHSC extension, these files are compressed with the standard Zip format, so your favorite archive utility (WinZip, 7Zip, WinRar, etc.) can open them.  When you do, you'll see a few thousand files in the archive.  We are only interested in the .htm files, but there's no harm in extracting all of them to a folder.  7zip provides a command-line utility and the following will extract to a D:\SQLHelp folder previously created: 7z e –oD:\SQLHelp "C:\ProgramData\Microsoft\HelpLibrary\content\Microsoft\store\SQL_Server_Denali_Books_Online_B780_SQL_110_en-us_1.2.mshc" *.htm Well that's great Rob, but how do I put all those files into a full-text index? I'll tell you in a second, but first we have to set up a few things on the database side.  I'll be using a database named Explore (you can certainly change that) and the following setup is a fragment of the script I used in my presentation: USE Explore; GO CREATE SCHEMA help AUTHORIZATION dbo; GO -- Create default fulltext catalog for later FT indexes CREATE FULLTEXT CATALOG FTC AS DEFAULT; GO CREATE TABLE help.files(file_id int not null IDENTITY(1,1) CONSTRAINT PK_help_files PRIMARY KEY, path varchar(256) not null CONSTRAINT UNQ_help_files_path UNIQUE, doc_type varchar(6) DEFAULT('.xml'), content varbinary(max) not null); CREATE FULLTEXT INDEX ON help.files(content TYPE COLUMN doc_type LANGUAGE 1033) KEY INDEX PK_help_files; This will give you a table, default full-text catalog, and full-text index on that table for the content you're going to insert.  I'll be using the command line again for this, it's the easiest method I know: for %a in (D:\SQLHelp\*.htm) do sqlcmd -S. -E -d Explore -Q"set nocount on;insert help.files(path,content) select '%a', cast(c as varbinary(max)) from openrowset(bulk '%a', SINGLE_CLOB) as c(c)" You'll need to copy and run that as one line in a command prompt.  I'll explain what this does while you run it and watch several thousand files get imported: The "for" command allows you to loop over a collection of items.  In this case we want all the .htm files in the D:\SQLHelp folder.  For each file it finds, it will assign the full path and file name to the %a variable.  In the "do" clause, we'll specify another command to be run for each iteration of the loop.  I make a call to "sqlcmd" in order to run a SQL statement.  I pass in the name of the server (-S.), where "." represents the local default instance. I specify -d Explore as the database, and -E for trusted connection.  I then use -Q to run a query that I enclose in double quotes. The query uses OPENROWSET(BULK…SINGLE_CLOB) to open the file as a data source, and to treat it as a single character large object.  In order for full-text indexing to work properly, I have to convert the text content to varbinary. I then INSERT these contents along with the full path of the file into the help.files table created earlier.  This process continues for each file in the folder, creating one new row in the table. And that's it! 5 SQL Statements and 2 command line statements to unzip and import SQL Server Books Online!  In case you're wondering why I didn't use FILESTREAM or FILETABLE, it's simply because I haven't learned them…yet. I may return to this blog after I figure that out and update it with the steps to do so.  I believe that will make it even easier. In the spirit of exploration, I'll leave you to work on some fulltext queries of this content.  I also recommend playing around with the sys.dm_fts_xxxx DMVs (I particularly like sys.dm_fts_index_keywords, it's pretty interesting).  There are additional example queries in the download material for my presentation linked above. Many thanks to Kevin Boles (t) for his advice on (re)checking the content of the help files.  Don't let that .htm extension fool you! The 2012 help files are actually XML, and you'd need to specify '.xml' in your document type column in order to extract the full-text keywords.  (You probably noticed this in the default definition for the doc_type column.)  You can query sys.fulltext_document_types to get a complete list of the types that can be full-text indexed. I also need to thank Hilary Cotter for giving me the original idea. I believe he used MSDN content in a full-text index for an article from waaaaaaaaaaay back, that I can't find now, and had forgotten about until just a few days ago.  He is also co-author of Pro Full-Text Search in SQL Server 2008, which I highly recommend.  He also has some FTS articles on Simple Talk: http://www.simple-talk.com/sql/learn-sql-server/sql-server-full-text-search-language-features/ http://www.simple-talk.com/sql/learn-sql-server/sql-server-full-text-search-language-features,-part-2/

    Read the article

  • Network communications mechanisms for SQL Server

    - by Akshay Deep Lamba
    Problem I am trying to understand how SQL Server communicates on the network, because I'm having to tell my networking team what ports to open up on the firewall for an edge web server to communicate back to the SQL Server on the inside. What do I need to know? Solution In order to understand what needs to be opened where, let's first talk briefly about the two main protocols that are in common use today: TCP - Transmission Control Protocol UDP - User Datagram Protocol Both are part of the TCP/IP suite of protocols. We'll start with TCP. TCP TCP is the main protocol by which clients communicate with SQL Server. Actually, it is more correct to say that clients and SQL Server use Tabular Data Stream (TDS), but TDS actually sits on top of TCP and when we're talking about Windows and firewalls and other networking devices, that's the protocol that rules and controls are built around. So we'll just speak in terms of TCP. TCP is a connection-oriented protocol. What that means is that the two systems negotiate the connection and both agree to it. Think of it like a phone call. While one person initiates the phone call, the other person has to agree to take it and both people can end the phone call at any time. TCP is the same way. Both systems have to agree to the communications, but either side can end it at any time. In addition, there is functionality built into TCP to ensure that all communications can be disassembled and reassembled as necessary so it can pass over various network devices and be put together again properly in the right order. It also has mechanisms to handle and retransmit lost communications. Because of this functionality, TCP is the protocol used by many different network applications. The way the applications all can share is through the use of ports. When a service, like SQL Server, comes up on a system, it must listen on a port. For a default SQL Server instance, the default port is 1433. Clients connect to the port via the TCP protocol, the connection is negotiated and agreed to, and then the two sides can transfer information as needed until either side decides to end the communication. In actuality, both sides will have a port to use for the communications, but since the client's port is typically determined semi-randomly, when we're talking about firewalls and the like, typically we're interested in the port the server or service is using. UDP UDP, unlike TCP, is not connection oriented. A "client" can send a UDP communications to anyone it wants. There's nothing in place to negotiate a communications connection, there's nothing in the protocol itself to coordinate order of communications or anything like that. If that's needed, it's got to be handled by the application or by a protocol built on top of UDP being used by the application. If you think of TCP as a phone call, think of UDP as a postcard. I can put a postcard in the mail to anyone I want, and so long as it is addressed properly and has a stamp on it, the postal service will pick it up. Now, what happens it afterwards is not guaranteed. There's no mechanism for retransmission of lost communications. It's great for short communications that doesn't necessarily need an acknowledgement. Because multiple network applications could be communicating via UDP, it uses ports, just like TCP. The SQL Browser or the SQL Server Listener Service uses UDP. Network Communications - Talking to SQL Server When an instance of SQL Server is set up, what TCP port it listens on depends. A default instance will be set up to listen on port 1433. A named instance will be set to a random port chosen during installation. In addition, a named instance will be configured to allow it to change that port dynamically. What this means is that when a named instance starts up, if it finds something already using the port it normally uses, it'll pick a new port. If you have a named instance, and you have connections coming across a firewall, you're going to want to use SQL Server Configuration Manager to set a static port. This will allow the networking and security folks to configure their devices for maximum protection. While you can change the network port for a default instance of SQL Server, most people don't. Network Communications - Finding a SQL Server When just the name is specified for a client to connect to SQL Server, for instance, MySQLServer, this is an attempt to connect to the default instance. In this case the client will automatically attempt to communicate to port 1433 on MySQLServer. If you've switched the port for the default instance, you'll need to tell the client the proper port, usually by specifying the following syntax in the connection string: <server>,<port>. For instance, if you moved SQL Server to listen on 14330, you'd use MySQLServer,14330 instead of just MySQLServer. However, because a named instance sets up its port dynamically by default, the client never knows at the outset what the port is it should talk to. That's what the SQL Browser or the SQL Server Listener Service (SQL Server 2000) is for. In this case, the client sends a communication via the UDP protocol to port 1434. It asks, "Where is the named instance?" So if I was running a named instance called SQL2008R2, it would be asking the SQL Browser, "Hey, how do I talk to MySQLServer\SQL2008R2?" The SQL Browser would then send back a communications from UDP port 1434 back to the client telling the client how to talk to the named instance. Of course, you can skip all of this of you set that named instance's port statically. Then you can use the <server>,<port> mechanism to connect and the client won't try to talk to the SQL Browser service. It'll simply try to make the connection. So, for instance, is the SQL2008R2 instance was listening on port 20080, specifying MySQLServer,20080 would attempt a connection to the named instance. Network Communications - Named Pipes Named pipes is an older network library communications mechanism and it's generally not used any longer. It shouldn't be used across a firewall. However, if for some reason you need to connect to SQL Server with it, this protocol also sits on top of TCP. Named Pipes is actually used by the operating system and it has its own mechanism within the protocol to determine where to route communications. As far as network communications is concerned, it listens on TCP port 445. This is true whether we're talking about a default or named instance of SQL Server. The Summary Table To put all this together, here is what you need to know: Type of Communication Protocol Used Default Port Finding a SQL Server or SQL Server Named Instance UDP 1434 Communicating with a default instance of SQL Server TCP 1433 Communicating with a named instance of SQL Server TCP * Determined dynamically at start up Communicating with SQL Server via Named Pipes TCP 445

    Read the article

  • Windows Phone Developer Spotlight: Nikolai Joukov

    - by Lori Lalonde
    Originally posted on: http://geekswithblogs.net/lorilalonde/archive/2014/06/04/windows-phone-developer-spotlight-nikolai-joukov.aspxAs part of an ongoing series, I plan to include a spotlight post on people within the community that are stars in their field and area of expertise. For my first spotlight post, I interviewed Nikolai Joukov, who is a regular attendee at my local area .NET User Group (CTTDNUG), and has also participated in many of the Mobile and Cloud workshops we have hosted over the past few years. Nikolai stood out immediately, because of his passion for developing mobile apps, his interest in continuous learning, and his drive to publish quality apps that people will find useful and entertaining. Background: Nikolai immigrated to Canada in 1995, and has been working in IT since 1997. He moved on to become an independent contractor in 2005, and has worked at various large scale organizations over the course of his career, including BMO, Enbridge, Economical Insurance, Equitable Life, Manulife and Sun Life. Nikolai is an accomplished Windows Phone and Windows Store publisher, with 11 published Windows Phone apps, and 8 published Windows Store apps. He has almost 6000 downloads and favourable reviews. Q & A with Nikolai How many years have you been developing Windows Phone apps? 2 years When did you develop your very first Windows Phone app, and what was it about? Actually, the very first app I wrote was for the Microsoft “Smart Phone” back in 2004. This phone was given to me by Microsoft during the Developers Days Conference in Toronto. It was some kind of experimental model named Smart Phone, but you had to use VB 3 to develop the applications. Needless to say, this was not very successful at that time. My app was a Stock Trades Calculator. Very primitive, but it was working for me. The phone was heavy and the battery barely lasted 4 hours. Microsoft stopped supporting it few months later and the phone stopped working shortly after, but I still have it as a souvenir. For Windows Phone, my first app was “Trip Packing Assistant”. This is a simple trip packing check list that allows you to list items by category, set required quantity of items, and mark off the item in the list when it is packed. I designed it for me and my wife Galina, since we love to travel and this program helps manage our list for us. How did you get started in Windows Phone development? I have to say thanks to our .NET User Group for introducing me to Windows Phone development. I was intrigued and decided to give it a try. In October 2012 during a 2 day training event that ObjectSharp hosted in London, I met Bruce Johnson. On his advice, I registered for Developer Movement, and it is was a good push to actually complete the apps that I started. You have a great series of travel guide apps both for Windows Phone and Windows Store. Tell us about how you came up with the idea to develop those apps and what process you went through to put it all together. Like I said earlier, my wife and I love to travel. Before I created Trip Packing Assistant, every time we were planning to travel somewhere new, Galina would spend 3-4 weeks doing research. She would create a Word document with all of the information. We didn’t want to have to carry our laptop with us all the time, so we printed out the Word document she created, and would take it with us. After we returned from the trip, we would bring back tons of pictures and materials. Then our friends started to ask us about our materials before they planned their trips to the same places we had visited. So I decided to give it a try and started making apps for Windows Phone and for Windows 8. I hope these applications will help people who are planning to travel. So, all of the pictures used in the travel apps you created were actually taken by you during these amazing trips? Yes Do you have another Windows Store/Windows Phone project in development right now? If so, can you give us a hint at what it will be about? I want to stay with travel apps for now. But this time I will try to write an app for us (Galina and I). Usually we go on the trip, then I write the apps after we have all this beautiful pictures in our hands. We are planning a trip to Rome. This app will not have the pictures, but I want to add a map with points of interest and all information that can be useful for us. Then we will go on our trip and test it on location. As well I am planning to work on my existing apps to make them better. What learning resources would you recommend for other developers that want to get started in Window Store and Windows Phone development? I would start with dev.windowsphone.com to get all tools and samples, also links to training materials. I like MVA (Microsoft Virtual Academy). Their videos are really useful and it is free. Pluralsight is good too but it is not free and I do not have a subscription anymore. Our .NET User Group meetings give good insights too. I went to all meetings and full day training events. When you start to develop your app, you need to do research for specific questions that arise during development. The Developer Portal and Nokia Developer are good resources too. Wrap Up Thanks Nikolai for participating in my first Spotlight blog post! Shown below is Nikolai’s publisher page in the Windows Phone Store and his publisher page in the Windows Store. Simply click on it to be taken to there to check out his portfolio of apps. Be sure to download his apps and try them out! They are all free! Nikolai’s Windows Phone apps   Nikolai’s Windows Store Apps

    Read the article

  • Evaluating Solutions to Manage Product Compliance? Don't Wait Much Longer

    - by Kerrie Foy
    Depending on severity, product compliance issues can cause all sorts of problems from run-away budgets to business closures. But effective policies and safeguards can create a strong foundation for innovation, productivity, market penetration and competitive advantage. If you’ve been putting off a systematic approach to product compliance, it is time to reconsider that decision, or indecision. Why now?  No matter what industry, companies face a litany of worldwide and regional regulations that require proof of product compliance and environmental friendliness for market access.  For example, Restriction of Hazardous Substances (RoHS) is a regulation that restricts the use of six dangerous materials used in the manufacture of electronic and electrical equipment.  ROHS was originally adopted by the European Union in 2003 for implementation in 2006, and it has evolved over time through various regional versions for North America, China, Japan, Korea, Norway and Turkey.  In addition, the RoHS directive allowed for material exemptions used in Medical Devices, but that exemption ends in 2014.   Additional regulations worth watching are the Battery Directive, Waste Electrical and Electronic Equipment (WEEE), and Registration, Evaluation, Authorization and Restriction of Chemicals (REACH) directives.  Additional evolving regulations are coming from governing bodies like the Food and Drug Administration (FDA) and the International Organization for Standardization (ISO). Corporate sustainability initiatives are also gaining urgency and influencing product design. In a survey of 405 corporations in the Global 500 by Carbon Disclosure Project, co-written by PwC (CDP Global 500 Climate Change Report 2012 entitled Business Resilience in an Uncertain, Resource-Constrained World), 48% of the respondents indicated they saw potential to create new products and business services as a response to climate change. Just 21% reported a dedicated budget for the research. However, the report goes on to explain that those few companies are winning over new customers and driving additional profits by exploiting their abilities to adapt to environmental needs. The article cites Dell as an example – Dell has invested in research to develop new products designed to reduce its customers’ emissions by more than 10 million metric tons of CO2e per year. This reduction in emissions should save Dell’s customers over $1billion per year as a result! Over time we expect to see many additional companies prove that eco-design provides marketplace benefits through differentiation and direct customer value. How do you meet compliance requirements and also successfully invest in eco-friendly designs? No doubt companies struggle to answer this question. After all, the journey to get there may involve transforming business models, go-to-market strategies, supply networks, quality assurance policies and compliance processes per the rapidly evolving global and regional directives. There may be limited executive focus on the initiative, inability to quantify noncompliance, or not enough resources to justify investment. To make things even more difficult to address, compliance responsibility can be a passionate topic within an organization, making the prospect of change on an enterprise scale problematic and time-consuming. Without a single source of truth for product data and without proper processes in place, ensuring product compliance burgeons into a crushing task that is cost-prohibitive and overwhelming to an organization. With all the overhead, certain markets or demographics become simply inaccessible. Therefore, the risk to consumer goodwill and satisfaction, revenue, business continuity, and market potential is too great not to solve the compliance challenge. Companies are beginning to adapt and even thrive in today’s highly regulated and transparent environment by implementing systematic approaches to product compliance that are more than functional bandages but revenue-generating engines. Consider partnering with Oracle to help you address your compliance needs. Many of the world’s most innovative leaders and pioneers are leveraging Oracle’s Agile Product Lifecycle Management (PLM) portfolio of enterprise applications to manage the product value chain, centralize product data, automate processes, and launch more eco-friendly products to market faster.   Particularly, the Agile Product Governance & Compliance (PG&C) solution provides out-of-the-box functionality to integrate actionable regulatory information into the enterprise product record from the ideation to the disposal/recycling phase. Agile PG&C makes it possible to efficiently manage compliance per corporate green initiatives as well as regional and global directives. Options are critical, but so is ease-of-use. Anyone who’s grappled with compliance policy knows legal interpretation plays a major role in determining how an organization responds to regulation. Agile PG&C gives you the freedom to configure product compliance per your needs, while maintaining rigorous control over the product record in an easy-to-use interface that facilitates adoption efforts. It allows you to assign regulations as specifications for a part or BOM roll-up. Each specification has a threshold value that alerts you to a non-compliance issue if the threshold value is exceeded. Set however many regulations as specifications you need to make sure a product can be sold in your target countries. Another option is to implement like one of our leading consumer electronics customers and define your own “catch-all” specification to ensure compliance in all markets. You can give your suppliers secure access to enter their component data or integrate a third party’s data. With Agile PG&C you are able to design compliance earlier into your products to reduce cost and improve quality downstream when stakes are higher. Agile PG&C is a comprehensive solution that makes product compliance more reliable and efficient. Throughout product lifecycles, use the solution to support full material disclosures, efficiently manage declarations with your suppliers, feed compliance data into a corrective action if a product must be changed, and swiftly satisfy audits by showing all due diligence tracked in one solution. Given the compounding regulation and consumer focus on urgent environmental issues, now is the time to act. Implementing an enterprise, systematic approach to product compliance is a competitive investment. From the start, Agile Product Governance & Compliance enables companies to confidently design for compliance and sustainability, reduce the cost of compliance, minimize the risk of business interruption, deliver responsible products, and inspire new innovation.  Don’t wait any longer! To find out more about Agile Product Governance & Compliance download the data sheet, contact your sales representative, or call Oracle at 1-800-633-0738. Many thanks to Shane Goodwin, Senior Manager, Oracle Agile PLM Product Management, for contributions to this article. 

    Read the article

  • Clarity is important, both in question and in answer.

    - by gerrylowry
    clarity is important ... i'm often reminded of the Clouseau movie in which Peter Sellers as Chief Inspector Clouseau asks a hotel clerk "Does your dog bite?" ... the clerk answers "no" ... after Clouseau has been bitten by the dog, he looks at the hotel clerk who says "That's not my dog".  Clarity is important, both in question and in answer. i've been a member of forums.asp.net since 2008 ... like many of my peers at forums.asp.net, i've answered my fair share of questions. FWIW, the purpose of this, my first web log post to http://weblogs.asp.net/gerrylowry is to help new members ask better questions and in turn get better answers. TIMTOWTDI  =.  there is more than one way to do it imho, the best way to ask a question in any forum, or even person to person, is to first formulate your question and then ask yourself to answer your own question. Things to consider when asking (the more complete your question, the more likely you'll get the answer you require): -- have you searched Google and/or your favourite search engine(s) before posting your question to forums.asp.net; examples: site:msdn.microsoft.com entity framework 5.0 c#http://lmgtfy.com/?q=site%3Amsdn.microsoft.com+entity+framework+5.0+c%23 site:forums.asp.net MVC tutorial c#http://lmgtfy.com/?q=site%3Aforums.asp.net+MVC+tutorial+c%23 -- are you asking your question in the correct forum?  look at the forums' descriptions at http://forums.asp.net/; examples: Getting Started If you have a general ASP.NET question on a topic that's not covered by one of the other more specific forums - ask it here. MVC Discussions regarding ASP.NET Model-View-Controller (MVC) C# Questions about using C# for ASP.NET development Note:  if your question pertains more to c# than to MVC, choosing the C# forum is likely to be more appropriate. -- is your post subject clear and concise, yet not too vague? compare these three subjects (all three had something to do with GridView):     (1)    please help     (2)    gridview      (3)    How to show newline in GridView  -- have you clearly explained your scenario? compare:  my leg hurts   with   when i walk too much, my right knee hurts in the knee joint  compare:  my code does not work    with    when i enter a date as 2012-11-8, i get a FormatException -- have you checked your spelling, your grammar, and your English? for better or worse, English is the language of forums.asp.net ... many of the currently 170000++ forums.asp.net are not native speakers of English; that's okay ... however, there are times when choosing the more appropriate words will likely get one a better answer; fortunately, there are web tools to help you formulate your question, for example, http://translate.google.com/.  -- have you provided relevant information about your environment? here are a few examples ... feel free to include other items to your question ... rule of thumb:  if you think a given detail is relevant, it likely is -- what technology are you using?    ASP.NET MVC 4, ASP.NET MVC 3, WebForms, ...  -- what version of Visual Studio are you using?  vs2012 (ultimate, professional, express), vs2010, vs2008 ... -- are you hosting your own website?  are you using a shared hosting service? -- are you experience difficulties in just one browser? more than one browser? -- what browser version(s) are you using?   ie8? ie9? ... -- what is your operating system?     win8, win7, vista, XP, server 2008 R2 ... -- what is your database?   SQL Server 2008 R2, ss2005, MySQL, Oracle, ... -- what is your web server?  iis 7.5, iis 6, .... -- have you provided enough information for someone to be able to answer your question? Here's an actual example from an O.P. that i hope is self-explanatory: I'm trying to make a simple calculator when i write the code in windows application it worked when i tried it in web application it doesn't work and there are no errors what should i do ??!! -- have you included unnecessary information? more than once, i've seen the O.P. (original post, original poster) include many extra lines of code that were not relevant to the actual question; the more unnecessary code that you include, the less likely your volunteer peers will be motivated to donate their time to help you. -- have you asked the question that you want answered? "Does this dog bite?" -- are your expectations reasonable? -- generally, persons who are going to answer your questions are your peers ... they are unpaid volunteers ... -- are you looking for help with your homework, work assignment, or hobby? or, are you expecting someone else to do your work for you?  -- do you expect a complete solution or are you simply looking for guidance and direction? -- you are likely to get more help by first making a reasonable effort to help yourself first Clarity is important, both in question and in answer. if you are answering someone else's question, please remember that clear answers are just as important as clear questions; would you understand your own answer? Things to consider when answering: -- have you tested your code example?  if you have, say so; if you've not tested your code example, also say so -- imho, it's okay to guess as long as you clearly state that you're guessing ... sometimes a wrong guess can still help the O.P. find her/his way to the right answer -- meanness does not contribute to being helpful; sometimes one may become frustrated with the O.P. and/or others participating in a thread, if that happens to you, be kind regardless; speaking from my own experience, at least once i've allowed myself to be frustrated into writing something inappropriate that i've regretted later ... being a meany does not feel good ... being kind and helpful feels fantastic! Tip:  before asking your question, read more than a few existing questions and answers to get a sense of how your peers ask and answer questions. Gerry P.S.:  try to avoid necroposting and piggy backing. necroposting is adding to an old post, especially one that was resolved months ago. piggy backing is adding your own question to someone else's thread.

    Read the article

  • MVC Portable Area Modules *Without* MasterPages

    - by Steve Michelotti
    Portable Areas from MvcContrib provide a great way to build modular and composite applications on top of MVC. In short, portable areas provide a way to distribute MVC binary components as simple .NET assemblies where the aspx/ascx files are actually compiled into the assembly as embedded resources. I’ve blogged about Portable Areas in the past including this post here which talks about embedding resources and you can read more of an intro to Portable Areas here. As great as Portable Areas are, the question that seems to come up the most is: what about MasterPages? MasterPages seems to be the one thing that doesn’t work elegantly with portable areas because you specify the MasterPage in the @Page directive and it won’t use the same mechanism of the view engine so you can’t just embed them as resources. This means that you end up referencing a MasterPage that exists in the host application but not in your portable area. If you name the ContentPlaceHolderId’s correctly, it will work – but it all seems a little fragile. Ultimately, what I want is to be able to build a portable area as a module which has no knowledge of the host application. I want to be able to invoke the module by a full route on the user’s browser and it gets invoked and “automatically appears” inside the application’s visual chrome just like a MasterPage. So how could we accomplish this with portable areas? With this question in mind, I looked around at what other people are doing to address similar problems. Specifically, I immediately looked at how the Orchard team is handling this and I found it very compelling. Basically Orchard has its own custom layout/theme framework (utilizing a custom view engine) that allows you to build your module without any regard to the host. You simply decorate your controller with the [Themed] attribute and it will render with the outer chrome around it: 1: [Themed] 2: public class HomeController : Controller Here is the slide from the Orchard talk at this year MIX conference which shows how it conceptually works:   It’s pretty cool stuff.  So I figure, it must not be too difficult to incorporate this into the portable areas view engine as an optional piece of functionality. In fact, I’ll even simplify it a little – rather than have 1) Document.aspx, 2) Layout.ascx, and 3) <view>.ascx (as shown in the picture above); I’ll just have the outer page be “Chrome.aspx” and then the specific view in question. The Chrome.aspx not only takes the place of the MasterPage, but now since we’re no longer constrained by the MasterPage infrastructure, we have the choice of the Chrome.aspx living in the host or inside the portable areas as another embedded resource! Disclaimer: credit where credit is due – much of the code from this post is me re-purposing the Orchard code to suit my needs. To avoid confusion with Orchard, I’m going to refer to my implementation (which will be based on theirs) as a Chrome rather than a Theme. The first step I’ll take is to create a ChromedAttribute which adds a flag to the current HttpContext to indicate that the controller designated Chromed like this: 1: [Chromed] 2: public class HomeController : Controller The attribute itself is an MVC ActionFilter attribute: 1: public class ChromedAttribute : ActionFilterAttribute 2: { 3: public override void OnActionExecuting(ActionExecutingContext filterContext) 4: { 5: var chromedAttribute = GetChromedAttribute(filterContext.ActionDescriptor); 6: if (chromedAttribute != null) 7: { 8: filterContext.HttpContext.Items[typeof(ChromedAttribute)] = null; 9: } 10: } 11:   12: public static bool IsApplied(RequestContext context) 13: { 14: return context.HttpContext.Items.Contains(typeof(ChromedAttribute)); 15: } 16:   17: private static ChromedAttribute GetChromedAttribute(ActionDescriptor descriptor) 18: { 19: return descriptor.GetCustomAttributes(typeof(ChromedAttribute), true) 20: .Concat(descriptor.ControllerDescriptor.GetCustomAttributes(typeof(ChromedAttribute), true)) 21: .OfType<ChromedAttribute>() 22: .FirstOrDefault(); 23: } 24: } With that in place, we only have to override the FindView() method of the custom view engine with these 6 lines of code: 1: public override ViewEngineResult FindView(ControllerContext controllerContext, string viewName, string masterName, bool useCache) 2: { 3: if (ChromedAttribute.IsApplied(controllerContext.RequestContext)) 4: { 5: var bodyView = ViewEngines.Engines.FindPartialView(controllerContext, viewName); 6: var documentView = ViewEngines.Engines.FindPartialView(controllerContext, "Chrome"); 7: var chromeView = new ChromeView(bodyView, documentView); 8: return new ViewEngineResult(chromeView, this); 9: } 10:   11: // Just execute normally without applying Chromed View Engine 12: return base.FindView(controllerContext, viewName, masterName, useCache); 13: } If the view engine finds the [Chromed] attribute, it will invoke it’s own process – otherwise, it’ll just defer to the normal web forms view engine (with masterpages). The ChromeView’s primary job is to independently set the BodyContent on the view context so that it can be rendered at the appropriate place: 1: public class ChromeView : IView 2: { 3: private ViewEngineResult bodyView; 4: private ViewEngineResult documentView; 5:   6: public ChromeView(ViewEngineResult bodyView, ViewEngineResult documentView) 7: { 8: this.bodyView = bodyView; 9: this.documentView = documentView; 10: } 11:   12: public void Render(ViewContext viewContext, System.IO.TextWriter writer) 13: { 14: ChromeViewContext chromeViewContext = ChromeViewContext.From(viewContext); 15:   16: // First render the Body view to the BodyContent 17: using (var bodyViewWriter = new StringWriter()) 18: { 19: var bodyViewContext = new ViewContext(viewContext, bodyView.View, viewContext.ViewData, viewContext.TempData, bodyViewWriter); 20: this.bodyView.View.Render(bodyViewContext, bodyViewWriter); 21: chromeViewContext.BodyContent = bodyViewWriter.ToString(); 22: } 23: // Now render the Document view 24: this.documentView.View.Render(viewContext, writer); 25: } 26: } The ChromeViewContext (code excluded here) mainly just has a string property for the “BodyContent” – but it also makes sure to put itself in the HttpContext so it’s available. Finally, we created a little extension method so the module’s view can be rendered in the appropriate place: 1: public static void RenderBody(this HtmlHelper htmlHelper) 2: { 3: ChromeViewContext chromeViewContext = ChromeViewContext.From(htmlHelper.ViewContext); 4: htmlHelper.ViewContext.Writer.Write(chromeViewContext.BodyContent); 5: } At this point, the other thing left is to decide how we want to implement the Chrome.aspx page. One approach is the copy/paste the HTML from the typical Site.Master and change the main content placeholder to use the HTML helper above – this way, there are no MasterPages anywhere. Alternatively, we could even have Chrome.aspx utilize the MasterPage if we wanted (e.g., in the case where some pages are Chromed and some pages want to use traditional MasterPage): 1: <%@ Page Title="" Language="C#" MasterPageFile="~/Views/Shared/Site.Master" Inherits="System.Web.Mvc.ViewPage" %> 2: <asp:Content ID="Content2" ContentPlaceHolderID="MainContent" runat="server"> 3: <% Html.RenderBody(); %> 4: </asp:Content> At this point, it’s all academic. I can create a controller like this: 1: [Chromed] 2: public class WidgetController : Controller 3: { 4: public ActionResult Index() 5: { 6: return View(); 7: } 8: } Then I’ll just create Index.ascx (a partial view) and put in the text “Inside my widget”. Now when I run the app, I can request the full route (notice the controller name of “widget” in the address bar below) and the HTML from my Index.ascx will just appear where it is supposed to.   This means no more warnings for missing MasterPages and no more need for your module to have knowledge of the host’s MasterPage placeholders. You have the option of using the Chrome.aspx in the host or providing your own while embedding it as an embedded resource itself. I’m curious to know what people think of this approach. The code above was done with my own local copy of MvcContrib so it’s not currently something you can download. At this point, these are just my initial thoughts – just incorporating some ideas for Orchard into non-Orchard apps to enable building modular/composite apps more easily. Additionally, on the flip side, I still believe that Portable Areas have potential as the module packaging story for Orchard itself.   What do you think?

    Read the article

  • Set Context User Principal for Customized Authentication in SignalR

    - by Shaun
    Originally posted on: http://geekswithblogs.net/shaunxu/archive/2014/05/27/set-context-user-principal-for-customized-authentication-in-signalr.aspxCurrently I'm working on a single page application project which is built on AngularJS and ASP.NET WebAPI. When I need to implement some features that needs real-time communication and push notifications from server side I decided to use SignalR. SignalR is a project currently developed by Microsoft to build web-based, read-time communication application. You can find it here. With a lot of introductions and guides it's not a difficult task to use SignalR with ASP.NET WebAPI and AngularJS. I followed this and this even though it's based on SignalR 1. But when I tried to implement the authentication for my SignalR I was struggled 2 days and finally I got a solution by myself. This might not be the best one but it actually solved all my problem.   In many articles it's said that you don't need to worry about the authentication of SignalR since it uses the web application authentication. For example if your web application utilizes form authentication, SignalR will use the user principal your web application authentication module resolved, check if the principal exist and authenticated. But in my solution my ASP.NET WebAPI, which is hosting SignalR as well, utilizes OAuth Bearer authentication. So when the SignalR connection was established the context user principal was empty. So I need to authentication and pass the principal by myself.   Firstly I need to create a class which delivered from "AuthorizeAttribute", that will takes the responsible for authenticate when SignalR connection established and any method was invoked. 1: public class QueryStringBearerAuthorizeAttribute : AuthorizeAttribute 2: { 3: public override bool AuthorizeHubConnection(HubDescriptor hubDescriptor, IRequest request) 4: { 5: } 6:  7: public override bool AuthorizeHubMethodInvocation(IHubIncomingInvokerContext hubIncomingInvokerContext, bool appliesToMethod) 8: { 9: } 10: } The method "AuthorizeHubConnection" will be invoked when any SignalR connection was established. And here I'm going to retrieve the Bearer token from query string, try to decrypt and recover the login user's claims. 1: public override bool AuthorizeHubConnection(HubDescriptor hubDescriptor, IRequest request) 2: { 3: var dataProtectionProvider = new DpapiDataProtectionProvider(); 4: var secureDataFormat = new TicketDataFormat(dataProtectionProvider.Create()); 5: // authenticate by using bearer token in query string 6: var token = request.QueryString.Get(WebApiConfig.AuthenticationType); 7: var ticket = secureDataFormat.Unprotect(token); 8: if (ticket != null && ticket.Identity != null && ticket.Identity.IsAuthenticated) 9: { 10: // set the authenticated user principal into environment so that it can be used in the future 11: request.Environment["server.User"] = new ClaimsPrincipal(ticket.Identity); 12: return true; 13: } 14: else 15: { 16: return false; 17: } 18: } In the code above I created "TicketDataFormat" instance, which must be same as the one I used to generate the Bearer token when user logged in. Then I retrieve the token from request query string and unprotect it. If I got a valid ticket with identity and it's authenticated this means it's a valid token. Then I pass the user principal into request's environment property which can be used in nearly future. Since my website was built in AngularJS so the SignalR client was in pure JavaScript, and it's not support to set customized HTTP headers in SignalR JavaScript client, I have to pass the Bearer token through request query string. This is not a restriction of SignalR, but a restriction of WebSocket. For security reason WebSocket doesn't allow client to set customized HTTP headers from browser. Next, I need to implement the authentication logic in method "AuthorizeHubMethodInvocation" which will be invoked when any SignalR method was invoked. 1: public override bool AuthorizeHubMethodInvocation(IHubIncomingInvokerContext hubIncomingInvokerContext, bool appliesToMethod) 2: { 3: var connectionId = hubIncomingInvokerContext.Hub.Context.ConnectionId; 4: // check the authenticated user principal from environment 5: var environment = hubIncomingInvokerContext.Hub.Context.Request.Environment; 6: var principal = environment["server.User"] as ClaimsPrincipal; 7: if (principal != null && principal.Identity != null && principal.Identity.IsAuthenticated) 8: { 9: // create a new HubCallerContext instance with the principal generated from token 10: // and replace the current context so that in hubs we can retrieve current user identity 11: hubIncomingInvokerContext.Hub.Context = new HubCallerContext(new ServerRequest(environment), connectionId); 12: return true; 13: } 14: else 15: { 16: return false; 17: } 18: } Since I had passed the user principal into request environment in previous method, I can simply check if it exists and valid. If so, what I need is to pass the principal into context so that SignalR hub can use. Since the "User" property is all read-only in "hubIncomingInvokerContext", I have to create a new "ServerRequest" instance with principal assigned, and set to "hubIncomingInvokerContext.Hub.Context". After that, we can retrieve the principal in my Hubs through "Context.User" as below. 1: public class DefaultHub : Hub 2: { 3: public object Initialize(string host, string service, JObject payload) 4: { 5: var connectionId = Context.ConnectionId; 6: ... ... 7: var domain = string.Empty; 8: var identity = Context.User.Identity as ClaimsIdentity; 9: if (identity != null) 10: { 11: var claim = identity.FindFirst("Domain"); 12: if (claim != null) 13: { 14: domain = claim.Value; 15: } 16: } 17: ... ... 18: } 19: } Finally I just need to add my "QueryStringBearerAuthorizeAttribute" into the SignalR pipeline. 1: app.Map("/signalr", map => 2: { 3: // Setup the CORS middleware to run before SignalR. 4: // By default this will allow all origins. You can 5: // configure the set of origins and/or http verbs by 6: // providing a cors options with a different policy. 7: map.UseCors(CorsOptions.AllowAll); 8: var hubConfiguration = new HubConfiguration 9: { 10: // You can enable JSONP by uncommenting line below. 11: // JSONP requests are insecure but some older browsers (and some 12: // versions of IE) require JSONP to work cross domain 13: // EnableJSONP = true 14: EnableJavaScriptProxies = false 15: }; 16: // Require authentication for all hubs 17: var authorizer = new QueryStringBearerAuthorizeAttribute(); 18: var module = new AuthorizeModule(authorizer, authorizer); 19: GlobalHost.HubPipeline.AddModule(module); 20: // Run the SignalR pipeline. We're not using MapSignalR 21: // since this branch already runs under the "/signalr" path. 22: map.RunSignalR(hubConfiguration); 23: }); On the client side should pass the Bearer token through query string before I started the connection as below. 1: self.connection = $.hubConnection(signalrEndpoint); 2: self.proxy = self.connection.createHubProxy(hubName); 3: self.proxy.on(notifyEventName, function (event, payload) { 4: options.handler(event, payload); 5: }); 6: // add the authentication token to query string 7: // we cannot use http headers since web socket protocol doesn't support 8: self.connection.qs = { Bearer: AuthService.getToken() }; 9: // connection to hub 10: self.connection.start(); Hope this helps, Shaun All documents and related graphics, codes are provided "AS IS" without warranty of any kind. Copyright © Shaun Ziyan Xu. This work is licensed under the Creative Commons License.

    Read the article

  • Agile Like Jazz

    - by Jeff Certain
    (I’ve been sitting on this for a week or so now, thinking that it needed to be tightened up a bit to make it less rambling. Since that’s clearly not going to happen, reader beware!) I had the privilege of spending around 90 minutes last night sitting and listening to Sonny Rollins play a concert at the Disney Center in LA. If you don’t know who Sonny Rollins is, I don’t know how to explain the experience; if you know who he is, I don’t need to. Suffice it to say that he has been recording professionally for over 50 years, and helped create an entire genre of music. A true master by any definition. One of the most intriguing aspects of a concert like this, however, is watching the master step aside and let the rest of the musicians play. Not just play their parts, but really play… letting them take over the spotlight, to strut their stuff, to soak up enthusiastic applause from the crowd. Maybe a lot of it has to do with the fact that Sonny Rollins has been doing this for more than a half-century. Maybe it has something to do with a kind of patience you learn when you’re on the far side of 80 – and the man can still blow a mean sax for 90 minutes without stopping! Maybe it has to do with the fact that he was out there for the love of the music and the love of the show, not because he had anything to prove to anyone and, I like to think, not for the money. Perhaps it had more to do with the fact that, when you’re at that level of mastery, the other musicians are going to be good. Really good. Whatever the reasons, there was a incredible freedom on that stage – the ability to improvise, for each musician to showcase their own specialization and skills, and them come back to the common theme, back to being on the same page, as it were. All this took place in the same venue that is home to the L.A. Phil. Somehow, I can’t ever see the same kind of free-wheeling improvisation happening in that context. And, since I’m a geek, I started thinking about agility. Rollins has put together a quintet that reflects his own particular style and past. No upright bass or piano for Rollins – drums, bongos, electric guitar and bass guitar along with his sax. It’s not about the mix of instruments. Other trios, quartets, and sextets use different mixes of instruments. New Orleans jazz tends towards trombones instead of sax; some prefer cornet or trumpet. But no matter what the choice of instruments, size matters. Team sizes are something I’ve been thinking about for a while. We’re on a quest to rethink how our teams are organized. They just feel too big, too unwieldy. In fact, they really don’t feel like teams at all. Most of the time, they feel more like collections or people who happen to report to the same manager. I attribute this to a couple factors. One is over-specialization; we have a tendency to have people work in silos. Although the teams are product-focused, within them our developers are both generalists and specialists. On the one hand, we expect them to be able to build an entire vertical slice of the application; on the other hand, each developer tends to be responsible for the vertical slice. As a result, developers often work on their own piece of the puzzle, in isolation. This sort of feels like working on a jigsaw in a group – each person taking a set of colors and piecing them together to reveal a portion of the overall picture. But what inevitably happens when you go to meld all those pieces together? Inevitably, you have some sections that are too big to move easily. These sections end up falling apart under their own weight as you try to move them. Not only that, but there are other challenges – figuring out where that section fits, and how to tie it into the rest of the puzzle. Often, this is when you find a few pieces need to be added – these pieces are “glue,” if you will. The other issue that arises is due to the overhead of maintaining communications in a team. My mother, who worked in IT for around 30 years, once told me that 20% per team member is a good rule of thumb for maintaining communication. While this is a rule of thumb, it seems to imply that any team over about 6 people is going to become less agile simple because of the communications burden. Teams of ten or twelve seem like they fall into the philharmonic organizational model. Complicated pieces of music requiring dozens of players to all be on the same page requires a much different model than the jazz quintet. There’s much less room for improvisation, originality or freedom. (There are probably orchestral musicians who will take exception to this characterization; I’m calling it like I see it from the cheap seats.) And, there’s one guy up front who is running the show, whose job is to keep all of those dozens of players on the same page, to facilitate communications. Somehow, the orchestral model doesn’t feel much like a self-organizing team, either. The first violin may be the best violinist in the orchestra, but they don’t get to perform free-wheeling solos. I’ve never heard of an orchestra getting together for a jam session. But I have heard of teams that organize their work based on the developers available, rather than organizing the developers based on the work required. I have heard of teams where desired functionality is deferred – or worse yet, schedules are missed – because one critical person doesn’t have any bandwidth available. I’ve heard of teams where people simply don’t have the big picture, because there is too much communication overhead for everyone to be aware of everything that is happening on a project. I once heard Paul Rayner say something to the effect of “you have a process that is perfectly designed to give you exactly the results you have.” Given a choice, I want a process that’s much more like jazz than orchestral music. I want a process that doesn’t burden me with lots of forms and checkboxes and stuff. Give me the simplest, most lightweight process that will work – and a smaller team of the best developers I can find. This seems like the kind of process that will get the kind of result I want to be part of.

    Read the article

  • What's new in Solaris 11.1?

    - by Karoly Vegh
    Solaris 11.1 is released. This is the first release update since Solaris 11 11/11, the versioning has been changed from MM/YY style to 11.1 highlighting that this is Solaris 11 Update 1.  Solaris 11 itself has been great. What's new in Solaris 11.1? Allow me to pick some new features from the What's New PDF that can be found in the official Oracle Solaris 11.1 Documentation. The updates are very numerous, I really can't include all.  I. New AI Automated Installer RBAC profiles have been introduced to enable delegation of installation tasks. II. The interactive installer now supports installing the OS to iSCSI targets. III. ASR (Auto Service Request) and OCM (Oracle Configuration Manager) have been enabled by default to proactively provide support information and create service requests to speed up support processes. This is optional and can be disabled but helps a lot in supportcases. For further information, see: http://oracle.com/goto/solarisautoreg IV. The new command svcbundle helps you to create SMF manifests without having to struggle with XML editing. (btw, do you know the interactive editprop subcommand in svccfg? The listprop/setprop subcommands are great for scripting and automating, but for an interactive property editing session try, for example, this: svccfg -s svc:/application/pkg/system-repository:default editprop )  V. pfedit: Ever wondered how to delegate editing permissions to certain files? It is well known "sudo /usr/bin/vi /etc/hosts" is not the right way, for sudo elevates the complete vi process to admin levels, and the user can "break" out of the session as root with simply starting a shell from that vi. Now, the new pfedit command provides a solution exactly to this challenge - an auditable, secure, per-user configurable editing possibility. See the pfedit man page for examples.   VI. rsyslog, the popular logging daemon (filters, SSL, formattable output, SQL collect...) has been included in Solaris 11.1 as an alternative to syslog.  VII: Zones: Solaris Zones - as a major Solaris differentiator - got lots of love in terms of new features: ZOSS - Zones on Shared Storage: Placing your zones to shared storage (FC, iSCSI) has never been this easy - via zonecfg.  parallell updates - with S11's bootenvironments updating zones was no problem and meant no downtime anyway, but still, now you can update them parallelly, a way faster update action if you are running a large number of zones. This is like parallell patching in Solaris 10, but with all the IPS/ZFS/S11 goodness.  per-zone fstype statistics: Running zones on a shared filesystems complicate the I/O debugging, since ZFS collects all the random writes and delivers them sequentially to boost performance. Now, over kstat you can find out which zone's I/O has an impact on the other ones, see the examples in the documentation: http://docs.oracle.com/cd/E26502_01/html/E29024/gmheh.html#scrolltoc Zones got RDSv3 protocol support for InfiniBand, and IPoIB support with Crossbow's anet (automatic vnic creation) feature.  NUMA I/O support for Zones: customers can now determine the NUMA I/O topology of the system from within zones.  VIII: Security got a lot of attention too:  Automated security/audit reporting, with builtin reporting templates e.g. for PCI (payment card industry) audits.  PAM is now configureable on a per-user basis instead of system wide, allowing different authentication requirements for different users  SSH in Solaris 11.1 now supports running in FIPS 140-2 mode, that is, in a U.S. government security accredited fashion.  SHA512/224 and SHA512/256 cryptographic hash functions are implemented in a FIPS-compliant way - and on a T4 implemented in silicon! That is, goverment-approved cryptography at HW-speed.  Generally, Solaris is currently under evaluation to be both FIPS and Common Criteria certified.  IX. Networking, as one of the core strengths of Solaris 11, has been extended with:  Data Center Bridging (DCB) - not only setups where network and storage share the same fabric (FCoE, anyone?) can have Quality-of-Service requirements. DCB enables peers to distinguish traffic based on priorities. Your NICs have to support DCB, see the documentation, and additional information on Wikipedia. DataLink MultiPathing, DLMP, enables link aggregation to span across multiple switches, even between those of different vendors. But there are essential differences to the good old bandwidth-aggregating LACP, see the documentation: http://docs.oracle.com/cd/E26502_01/html/E28993/gmdlu.html#scrolltoc VNIC live migration is now supported from one physical NIC to another on-the-fly  X. Data management:  FedFS, (Federated FileSystem) is new, it relies on Solaris 11's NFS referring mechanism to join separate shares of different NFS servers into a single filesystem namespace. The referring system has been there since S11 11/11, in Solaris 11.1 FedFS uses a LDAP - as the one global nameservice to bind them all.  The iSCSI initiator now uses the T4 CPU's HW-implemented CRC32 algorithm - thus improving iSCSI throughput while reducing CPU utilization on a T4 Storage locking improvements are now RAC aware, speeding up throughput with better locking-communication between nodes up to 20%!  XI: Kernel performance optimizations: The new Virtual Memory subsystem ("VM2") scales now to 100+ TB Memory ranges.  The memory predictor monitors large memory page usage, and adjust memory page sizes to applications' needs OSM, the Optimized Shared Memory allows Oracle DBs' SGA to be resized online XII: The Power Aware Dispatcher in now by default enabled, reducing power consumption of idle CPUs. Also, the LDoms' Power Management policies and the poweradm settings in Solaris 11 OS will cooperate. XIII: x86 boot: upgrade to the (Grand Unified Bootloader) GRUB2. Because grub2 differs in the configuration syntactically from grub1, one shall not edit the new grub configuration (grub.cfg) but use the new bootadm features to update it. GRUB2 adds UEFI support and also support for disks over 2TB. XIV: Improved viewing of per-CPU statistics of mpstat. This one might seem of less importance at first, but nowadays having better sorting/filtering possibilities on a periodically updated mpstat output of 256+ vCPUs can be a blessing. XV: Support for Solaris Cluster 4.1: The What's New document doesn't actually mention this one, since OSC 4.1 has not been released at the time 11.1 was. But since then it is available, and it requires Solaris 11.1. And it's only a "pkg update" away. ...aand I seriously need to stop here. There's a lot I missed, Edge Virtual Bridging, lofi tuning, ZFS sharing and crypto enhancements, USB3.0, pulseaudio, trusted extensions updates, etc - but if I mention all those then I effectively copy the What's New document. Which I recommend reading now anyway, it is a great extract of the 300+ new projects and RFE-followups in S11.1. And this blogpost is a summary of that extract.  For closing words, allow me to come back to Request For Enhancements, RFEs. Any customer can request features. Open up a Support Request, explain that this is an RFE, describe the feature you/your company desires to have in S11 implemented. The more SRs are collected for an RFE, the more chance it's got to get implemented. Feel free to provide feedback about the product, as well as about the Solaris 11.1 Documentation using the "Feedback" button there. Both the Solaris engineers and the documentation writers are eager to hear your input.Feel free to comment about this post too. Except that it's too long ;)  wbr,charlie

    Read the article

  • 6 Facts About GlassFish Announcement

    - by Bruno.Borges
    Since Oracle announced the end of commercial support for future Oracle GlassFish Server versions, the Java EE world has started wondering what will happen to GlassFish Server Open Source Edition. Unfortunately, there's a lot of misleading information going around. So let me clarify some things with facts, not FUD. Fact #1 - GlassFish Open Source Edition is not dead GlassFish Server Open Source Edition will remain the reference implementation of Java EE. The current trunk is where an implementation for Java EE 8 will flourish, and this will become the future GlassFish 5.0. Calling "GlassFish is dead" does no good to the Java EE ecosystem. The GlassFish Community will remain strong towards the future of Java EE. Without revenue-focused mind, this might actually help the GlassFish community to shape the next version, and set free from any ties with commercial decisions. Fact #2 - OGS support is not over As I said before, GlassFish Server Open Source Edition will continue. Main change is that there will be no more future commercial releases of Oracle GlassFish Server. New and existing OGS 2.1.x and 3.1.x commercial customers will continue to be supported according to the Oracle Lifetime Support Policy. In parallel, I believe there's no other company in the Java EE business that offers commercial support to more than one build of a Java EE application server. This new direction can actually help customers and partners, simplifying decision through commercial negotiations. Fact #3 - WebLogic is not always more expensive than OGS Oracle GlassFish Server ("OGS") is a build of GlassFish Server Open Source Edition bundled with a set of commercial features called GlassFish Server Control and license bundles such as Java SE Support. OGS has at the moment of this writing the pricelist of U$ 5,000 / processor. One information that some bloggers are mentioning is that WebLogic is more expensive than this. Fact 3.1: it is not necessarily the case. The initial edition of WebLogic is called "Standard Edition" and falls into a policy where some “Standard Edition” products are licensed on a per socket basis. As of current pricelist, US$ 10,000 / socket. If you do the math, you will realize that WebLogic SE can actually be significantly more cost effective than OGS, and a customer can save money if running on a CPU with 4 cores or more for example. Quote from the price list: “When licensing Oracle programs with Standard Edition One or Standard Edition in the product name (with the exception of Java SE Support, Java SE Advanced, and Java SE Suite), a processor is counted equivalent to an occupied socket; however, in the case of multi-chip modules, each chip in the multi-chip module is counted as one occupied socket.” For more details speak to your Oracle sales representative - this is clearly at list price and every customer typically has a relationship with Oracle (like they do with other vendors) and different contractual details may apply. And although OGS has always been production-ready for Java EE applications, it is no secret that WebLogic has always been more enterprise, mission critical application server than OGS since BEA. Different editions of WLS provide features and upgrade irons like the WebLogic Diagnostic Framework, Work Managers, Side by Side Deployment, ADF and TopLink bundled license, Web Tier (Oracle HTTP Server) bundled licensed, Fusion Middleware stack support, Oracle DB integration features, Oracle RAC features (such as GridLink), Coherence Management capabilities, Advanced HA (Whole Service Migration and Server Migration), Java Mission Control, Flight Recorder, Oracle JDK support, etc. Fact #4 - There’s no major vendor supporting community builds of Java EE app servers There are no major vendors providing support for community builds of any Open Source application server. For example, IBM used to provide community support for builds of Apache Geronimo, not anymore. Red Hat does not commercially support builds of WildFly and if I remember correctly, never supported community builds of former JBoss AS. Oracle has never commercially supported GlassFish Server Open Source Edition builds. Tomitribe appears to be the exception to the rule, offering commercial support for Apache TomEE. Fact #5 - WebLogic and GlassFish share several Java EE implementations It has been no secret that although GlassFish and WebLogic share some JSR implementations (as stated in the The Aquarium announcement: JPA, JSF, WebSockets, CDI, Bean Validation, JAX-WS, JAXB, and WS-AT) and WebLogic understands GlassFish deployment descriptors, they are not from the same codebase. Fact #6 - WebLogic is not for GlassFish what JBoss EAP is for WildFly WebLogic is closed-source offering. It is commercialized through a license-based plus support fee model. OGS although from an Open Source code, has had the same commercial model as WebLogic. Still, one cannot compare GlassFish/WebLogic to WildFly/JBoss EAP. It is simply not the same case, since Oracle has had two different products from different codebases. The comparison should be limited to GlassFish Open Source / Oracle GlassFish Server versus WildFly / JBoss EAP. But the message now is much clear: Oracle will commercially support only the proprietary product WebLogic, and invest on GlassFish Server Open Source Edition as the reference implementation for the Java EE platform and future Java EE 8, as a developer-friendly community distribution, and encourages community participation through Adopt a JSR and contributions to GlassFish. In comparison Oracle's decision has pretty much the same goal as to when IBM killed support for Websphere Community Edition; and to when Red Hat decided to change the name of JBoss Community Edition to WildFly, simplifying and clarifying marketing message and leaving the commercial field wide open to JBoss EAP only. Oracle can now, as any other vendor has already been doing, focus on only one commercial offer. Some users are saying they will now move to WildFly, but it is important to note that Red Hat does not offer commercial support for WildFly builds. Although the future JBoss EAP versions will come from the same codebase as WildFly, the builds will definitely not be the same, nor sharing 100% of their functionalities and bug fixes. This means there will be no company running a WildFly build in production with support from Red Hat. This discussion has also raised an important and interesting information: Oracle offers a free for developers OTN License for WebLogic. For other environments this is different, but please note this is the same policy Red Hat applies to JBoss EAP, as stated in their download page and terms. Oracle had the same policy for OGS. TL;DR; GlassFish Server Open Source Edition isn’t dead. Current and new OGS 2.x/3.x customers will continue to have support (respecting LSP). WebLogic is not necessarily more expensive than OGS. Oracle will focus on one commercially supported Java EE application server, like other vendors also limit themselves to support one build/product only. Community builds are hardly supported. Commercially supported builds of Open Source products are not exactly from the same codebase as community builds. What's next for GlassFish and the Java EE community? There are conversations in place to tackle some of the community desires, most of them stated by Markus Eisele in his blog post. We will keep you posted.

    Read the article

  • Inheritance Mapping Strategies with Entity Framework Code First CTP5: Part 2 – Table per Type (TPT)

    - by mortezam
    In the previous blog post you saw that there are three different approaches to representing an inheritance hierarchy and I explained Table per Hierarchy (TPH) as the default mapping strategy in EF Code First. We argued that the disadvantages of TPH may be too serious for our design since it results in denormalized schemas that can become a major burden in the long run. In today’s blog post we are going to learn about Table per Type (TPT) as another inheritance mapping strategy and we'll see that TPT doesn’t expose us to this problem. Table per Type (TPT)Table per Type is about representing inheritance relationships as relational foreign key associations. Every class/subclass that declares persistent properties—including abstract classes—has its own table. The table for subclasses contains columns only for each noninherited property (each property declared by the subclass itself) along with a primary key that is also a foreign key of the base class table. This approach is shown in the following figure: For example, if an instance of the CreditCard subclass is made persistent, the values of properties declared by the BillingDetail base class are persisted to a new row of the BillingDetails table. Only the values of properties declared by the subclass (i.e. CreditCard) are persisted to a new row of the CreditCards table. The two rows are linked together by their shared primary key value. Later, the subclass instance may be retrieved from the database by joining the subclass table with the base class table. TPT Advantages The primary advantage of this strategy is that the SQL schema is normalized. In addition, schema evolution is straightforward (modifying the base class or adding a new subclass is just a matter of modify/add one table). Integrity constraint definition are also straightforward (note how CardType in CreditCards table is now a non-nullable column). Another much more important advantage is the ability to handle polymorphic associations (a polymorphic association is an association to a base class, hence to all classes in the hierarchy with dynamic resolution of the concrete class at runtime). A polymorphic association to a particular subclass may be represented as a foreign key referencing the table of that particular subclass. Implement TPT in EF Code First We can create a TPT mapping simply by placing Table attribute on the subclasses to specify the mapped table name (Table attribute is a new data annotation and has been added to System.ComponentModel.DataAnnotations namespace in CTP5): public abstract class BillingDetail {     public int BillingDetailId { get; set; }     public string Owner { get; set; }     public string Number { get; set; } } [Table("BankAccounts")] public class BankAccount : BillingDetail {     public string BankName { get; set; }     public string Swift { get; set; } } [Table("CreditCards")] public class CreditCard : BillingDetail {     public int CardType { get; set; }     public string ExpiryMonth { get; set; }     public string ExpiryYear { get; set; } } public class InheritanceMappingContext : DbContext {     public DbSet<BillingDetail> BillingDetails { get; set; } } If you prefer fluent API, then you can create a TPT mapping by using ToTable() method: protected override void OnModelCreating(ModelBuilder modelBuilder) {     modelBuilder.Entity<BankAccount>().ToTable("BankAccounts");     modelBuilder.Entity<CreditCard>().ToTable("CreditCards"); } Generated SQL For QueriesLet’s take an example of a simple non-polymorphic query that returns a list of all the BankAccounts: var query = from b in context.BillingDetails.OfType<BankAccount>() select b; Executing this query (by invoking ToList() method) results in the following SQL statements being sent to the database (on the bottom, you can also see the result of executing the generated query in SQL Server Management Studio): Now, let’s take an example of a very simple polymorphic query that requests all the BillingDetails which includes both BankAccount and CreditCard types: projects some properties out of the base class BillingDetail, without querying for anything from any of the subclasses: var query = from b in context.BillingDetails             select new { b.BillingDetailId, b.Number, b.Owner }; -- var query = from b in context.BillingDetails select b; This LINQ query seems even more simple than the previous one but the resulting SQL query is not as simple as you might expect: -- As you can see, EF Code First relies on an INNER JOIN to detect the existence (or absence) of rows in the subclass tables CreditCards and BankAccounts so it can determine the concrete subclass for a particular row of the BillingDetails table. Also the SQL CASE statements that you see in the beginning of the query is just to ensure columns that are irrelevant for a particular row have NULL values in the returning flattened table. (e.g. BankName for a row that represents a CreditCard type) TPT ConsiderationsEven though this mapping strategy is deceptively simple, the experience shows that performance can be unacceptable for complex class hierarchies because queries always require a join across many tables. In addition, this mapping strategy is more difficult to implement by hand— even ad-hoc reporting is more complex. This is an important consideration if you plan to use handwritten SQL in your application (For ad hoc reporting, database views provide a way to offset the complexity of the TPT strategy. A view may be used to transform the table-per-type model into the much simpler table-per-hierarchy model.) SummaryIn this post we learned about Table per Type as the second inheritance mapping in our series. So far, the strategies we’ve discussed require extra consideration with regard to the SQL schema (e.g. in TPT, foreign keys are needed). This situation changes with the Table per Concrete Type (TPC) that we will discuss in the next post. References ADO.NET team blog Java Persistence with Hibernate book a { text-decoration: none; } a:visited { color: Blue; } .title { padding-bottom: 5px; font-family: Segoe UI; font-size: 11pt; font-weight: bold; padding-top: 15px; } .code, .typeName { font-family: consolas; } .typeName { color: #2b91af; } .padTop5 { padding-top: 5px; } .padTop10 { padding-top: 10px; } p.MsoNormal { margin-top: 0in; margin-right: 0in; margin-bottom: 10.0pt; margin-left: 0in; line-height: 115%; font-size: 11.0pt; font-family: "Calibri" , "sans-serif"; }

    Read the article

  • Thou shalt not put code on a piedestal - Code is a tool, no more, no less

    - by Ralf Westphal
    “Write great code and everything else becomes easier” is what Paul Pagel believes in. That´s his version of an adage by Brian Marick he cites: “treat code as an end, not just a means.” And he concludes: “My post-Agile world is software craftsmanship.” I wonder, if that´s really the way to go. Will “simply” writing great code lead the software industry into the light? He´s alluding to the philosopher Kant who proposed, a human beings should never be treated as a means, but always as an end. But should we transfer this ethical statement into the world of software? I doubt it.   Reason #1: Human beings are categorially different from code. They are autonomous entities who need to find a way of living happily together. To Kant it seemed this goal could only be reached if nobody (ab)used a human being for his/her purposes. Because using a human being, i.e. treating it as a means, would contradict the fundamental autonomy and freedom of human beings. People should hold up a symmetric view of their relationships: Since nobody wants to be (ab)used, nobody should (ab)use anybody else. If you want to be treated decently, with respect, in accordance with your own free will - which means as an end - then do the same to other people. Code is dead, it´s a product, it´s a tool for people to reach their goals. No company spends any money on code other than to save money or earn money in the long run. Code is not a puppy. Enterprises do not commission software development to just feel good in its company. Code is not a buddy. Code is a slave, if you will. A mechanical slave, a non-tangible robot. Code is a tool, is a tool. And if we start to treat it differently, if we elevate its status unduely… I guess that will contort our relationship in a contraproductive way. Please get me right: Just because something is “just a tool”, “just a product” does not mean we should not be careful while designing, building, using it. Right to the contrary. We should be very careful when writing code – but not for the code´s sake! We should be careful because we respect our customers who are fellow human beings who should be treated as an end. If we are careless, neglectful, ignorant when producing code on their behalf, then we´re using them. Being sloppy means you´re caring more for yourself that for your customer. You´re then treating the customer as a means to fulfill some of your own needs. That´s plain unethical behavior.   Reason #2: The focus should always be on your purpose, not on any tool. But if code is treated as an end, then the focus is on the code. That might sound right, because where else should be your focus as a software developer? But, well, I´d say, your focus should be on delivering value to your customer. Because in the end your customer does not care if you write a single line of code. She just wants her problem to be solved. Solving problems is the purpose of any contractor. Code must be treated just as a means, a tool we know how to handle very well. But if we´re really trying to be craftsmen then we should be conscious about exactly that and act ethically. That means we must never be so focused on our tool as to be unable to suggest better solutions to the problems of our customers than code.   I´m all with Paul when he urges us to “Write great code”. Sure, if you need to write code, then by all means do so. Write the best code you can think of – and then try to improve it. Paul has all the best intentions when he signs Brians “treat code as an end” - but as we all know: “The road to hell is paved with best intentions” ;-) Yes, I can imagine a “hell of code focus”. In fact, I don´t need to imagine it, I´m seeing it quite often. Because code hell is whereever two developers stand together and are so immersed in talking about all sorts of coding tricks, design patterns, code smells, technologies, platforms, tools that they lose sight of the big picture. Talking about TDD or SOLID or refactoring is a sign of consciousness – relative to the “cowboy coders” view of the world. But from yet another point of view TDD, SOLID, and refactoring are just cures for ailments within a system. And I fear, if “Writing great code” is the only focus or the main focus of software development, then we as an industry lose the ability to see that. Focus draws a line around something, it defines a horizon for perceptions and thinking. So if we focus on code our horizon ends where “the land of code” ends. I don´t think that should be our professional attitude.   So what about Software Craftsmanship as the next big thing after Agility? I think Software Craftsmanship has an important message for all software developers and beyond. But to make it the successor of the Agility movement seems to miss a point. Agility never claimed to solve all software development problems, I´d say. So to blame it for having missed out on certain aspects of it is wrong. If I had to summarize Agility in one word I´d say “Value”. Agility put value for the customer back in software development. Focus on delivering value early and often – that´s Agility´s mantra. All else follows from that. And I ask you: Is that obsolete? Is delivering value not hip anymore? No, sure not. That´s our very purpose as software developers. So how can Agility become obsolete and need to be replaced? We need to do away with this “either/or”-thinking. It´s either Agility or Lean or Software Craftsmanship or whatnot. Instead we should start integrating concepts and movements. Think “both/and”. Think Agility plus Software Craftsmanship plus Lean plus whatnot. We don´t neet to tear down anything from a piedestal and replace it with a new idol. Instead we should do away with piedestals and arrange whatever is helpful is a circle. Then we can turn to concepts, movements for whatever they are best. After 10 years of Agility we should be able to identify what it was good at – and keep that. Keep Agility around and add whatever Agility was lacking or never concerned with. Add whatever is at the core of Software Craftsmanship. Add whatever is at the core of Lean etc. But don´t call out the age of Post-Agility. Because it better never will end. Because once we start to lose Agility´s core we´re losing focus of the customer.

    Read the article

  • External File Upload Optimizations for Windows Azure

    - by rgillen
    [Cross posted from here: http://rob.gillenfamily.net/post/External-File-Upload-Optimizations-for-Windows-Azure.aspx] I’m wrapping up a bit of the work we’ve been doing on data movement optimizations for cloud computing and the latest set of data yielded some interesting points I thought I’d share. The work done here is not really rocket science but may, in some ways, be slightly counter-intuitive and therefore seemed worthy of posting. Summary: for those who don’t like to read detailed posts or don’t have time, the synopsis is that if you are uploading data to Azure, block your data (even down to 1MB) and upload in parallel. Set your block size based on your source file size, but if you must choose a fixed value, use 1MB. Following the above will result in significant performance gains… upwards of 10x-24x and a reduction in overall file transfer time of upwards of 90% (eg, uploading a 1GB file averaged 46.37 minutes prior to optimizations and averaged 1.86 minutes afterwards). Detail: For those of you who want more detail, or think that the claims at the end of the preceding paragraph are over-reaching, what follows is information and code supporting these claims. As the title would indicate, these tests were run from our research facility pointing to the Azure cloud (specifically US North Central as it is physically closest to us) and do not represent intra-cloud results… we have performed intra-cloud tests and the overall results are similar in notion but the data rates are significantly different as well as the tipping points for the various block sizes… this will be detailed separately). We started by building a very simple console application that would loop through a directory and upload each file to Azure storage. This application used the shipping storage client library from the 1.1 version of the azure tools. The only real variation from the client library is that we added code to collect and record the duration (in ms) and size (in bytes) for each file transferred. The code is available here. We then created a directory that had a collection of files for the following sizes: 2KB, 32KB, 64KB, 128KB, 512KB, 1MB, 5MB, 10MB, 25MB, 50MB, 100MB, 250MB, 500MB, 750MB, and 1GB (50 files for each size listed). These files contained randomly-generated binary data and do not benefit from compression (a separate discussion topic). Our file generation tool is available here. The baseline was established by running the application described above against the directory containing all of the data files. This application uploads the files in a random order so as to avoid transferring all of the files of a given size sequentially and thereby spreading the affects of periodic Internet delays across the collection of results.  We then ran some scripts to split the resulting data and generate some reports. The raw data collected for our non-optimized tests is available via the links in the Related Resources section at the bottom of this post. For each file size, we calculated the average upload time (and standard deviation) and the average transfer rate (and standard deviation). As you likely are aware, transferring data across the Internet is susceptible to many transient delays which can cause anomalies in the resulting data. It is for this reason that we randomized the order of source file processing as well as executed the tests 50x for each file size. We expect that these steps will yield a sufficiently balanced set of results. Once the baseline was collected and analyzed, we updated the test harness application with some methods to split the source file into user-defined block sizes and then to upload those blocks in parallel (using the PutBlock() method of Azure storage). The parallelization was handled by simply relying on the Parallel Extensions to .NET to provide a Parallel.For loop (see linked source for specific implementation details in Program.cs, line 173 and following… less than 100 lines total). Once all of the blocks were uploaded, we called PutBlockList() to assemble/commit the file in Azure storage. For each block transferred, the MD5 was calculated and sent ensuring that the bits that arrived matched was was intended. The timer for the blocked/parallelized transfer method wraps the entire process (source file splitting, block transfer, MD5 validation, file committal). A diagram of the process is as follows: We then tested the affects of blocking & parallelizing the transfers by running the updated application against the same source set and did a parameter sweep on the block size including 256KB, 512KB, 1MB, 2MB, and 4MB (our assumption was that anything lower than 256KB wasn’t worth the trouble and 4MB is the maximum size of a block supported by Azure). The raw data for the parallel tests is available via the links in the Related Resources section at the bottom of this post. This data was processed and then compared against the single-threaded / non-optimized transfer numbers and the results were encouraging. The Excel version of the results is available here. Two semi-obvious points need to be made prior to reviewing the data. The first is that if the block size is larger than the source file size you will end up with a “negative optimization” due to the overhead of attempting to block and parallelize. The second is that as the files get smaller, the clock-time cost of blocking and parallelizing (overhead) is more apparent and can tend towards negative optimizations. For this reason (and is supported in the raw data provided in the linked worksheet) the charts and dialog below ignore source file sizes less than 1MB. (click chart for full size image) The chart above illustrates some interesting points about the results: When the block size is smaller than the source file, performance increases but as the block size approaches and then passes the source file size, you see decreasing benefit to the point of negative gains (see the values for the 1MB file size) For some of the moderately-sized source files, small blocks (256KB) are best As the size of the source file gets larger (see values for 50MB and up), the smallest block size is not the most efficient (presumably due, at least in part, to the increased number of blocks, increased number of individual transfer requests, and reassembly/committal costs). Once you pass the 250MB source file size, the difference in rate for 1MB to 4MB blocks is more-or-less constant The 1MB block size gives the best average improvement (~16x) but the optimal approach would be to vary the block size based on the size of the source file.    (click chart for full size image) The above is another view of the same data as the prior chart just with the axis changed (x-axis represents file size and plotted data shows improvement by block size). It again highlights the fact that the 1MB block size is probably the best overall size but highlights the benefits of some of the other block sizes at different source file sizes. This last chart shows the change in total duration of the file uploads based on different block sizes for the source file sizes. Nothing really new here other than this view of the data highlights the negative affects of poorly choosing a block size for smaller files.   Summary What we have found so far is that blocking your file uploads and uploading them in parallel results in significant performance improvements. Further, utilizing extension methods and the Task Parallel Library (.NET 4.0) make short work of altering the shipping client library to provide this functionality while minimizing the amount of change to existing applications that might be using the client library for other interactions.   Related Resources Source code for upload test application Source code for random file generator ODatas feed of raw data from non-optimized transfer tests Experiment Metadata Experiment Datasets 2KB Uploads 32KB Uploads 64KB Uploads 128KB Uploads 256KB Uploads 512KB Uploads 1MB Uploads 5MB Uploads 10MB Uploads 25MB Uploads 50MB Uploads 100MB Uploads 250MB Uploads 500MB Uploads 750MB Uploads 1GB Uploads Raw Data OData feeds of raw data from blocked/parallelized transfer tests Experiment Metadata Experiment Datasets Raw Data 256KB Blocks 512KB Blocks 1MB Blocks 2MB Blocks 4MB Blocks Excel worksheet showing summarizations and comparisons

    Read the article

  • RIF PRD: Presentation syntax issues

    - by Charles Young
    Over Christmas I got to play a bit with the W3C RIF PRD and came across a few issues which I thought I would record for posterity. Specifically, I was working on a grammar for the presentation syntax using a GLR grammar parser tool (I was using the current CTP of ‘M’ (MGrammer) and Intellipad – I do so hope the MS guys don’t kill off M and Intellipad now they have dropped the other parts of SQL Server Modelling). I realise that the presentation syntax is non-normative and that any issues with it do not therefore compromise the standard. However, presentation syntax is useful in its own right, and it would be great to iron out any issues in a future revision of the standard. The main issues are actually not to do with the grammar at all, but rather with the ‘running example’ in the RIF PRD recommendation. I started with the code provided in Example 9.1. There are several discrepancies when compared with the EBNF rules documented in the standard. Broadly the problems can be categorised as follows: ·      Parenthesis mismatch – the wrong number of parentheses are used in various places. For example, in GoldRule, the RHS of the rule (the ‘Then’) is nested in the LHS (‘the If’). In NewCustomerAndWidgetRule, the RHS is orphaned from the LHS. Together with additional incorrect parenthesis, this leads to orphanage of UnknownStatusRule from the entire Document. ·      Invalid use of parenthesis in ‘Forall’ constructs. Parenthesis should not be used to enclose formulae. Removal of the invalid parenthesis gave me a feeling of inconsistency when comparing formulae in Forall to formulae in If. The use of parenthesis is not actually inconsistent in these two context, but in an If construct it ‘feels’ as if you are enclosing formulae in parenthesis in a LISP-like fashion. In reality, the parenthesis is simply being used to group subordinate syntax elements. The fact that an If construct can contain only a single formula as an immediate child adds to this feeling of inconsistency. ·      Invalid representation of compact URIs (CURIEs) in the context of Frame productions. In several places the URIs are not qualified with a namespace prefix (‘ex1:’). This conflicts with the definition of CURIEs in the RIF Datatypes and Built-Ins 1.0 document. Here are the productions: CURIE          ::= PNAME_LN                  | PNAME_NS PNAME_LN       ::= PNAME_NS PN_LOCAL PNAME_NS       ::= PN_PREFIX? ':' PN_LOCAL       ::= ( PN_CHARS_U | [0-9] ) ((PN_CHARS|'.')* PN_CHARS)? PN_CHARS       ::= PN_CHARS_U                  | '-' | [0-9] | #x00B7                  | [#x0300-#x036F] | [#x203F-#x2040] PN_CHARS_U     ::= PN_CHARS_BASE                  | '_' PN_CHARS_BASE ::= [A-Z] | [a-z] | [#x00C0-#x00D6] | [#x00D8-#x00F6]                  | [#x00F8-#x02FF] | [#x0370-#x037D] | [#x037F-#x1FFF]                  | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF]                  | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD]                  | [#x10000-#xEFFFF] PN_PREFIX      ::= PN_CHARS_BASE ((PN_CHARS|'.')* PN_CHARS)? The more I look at CURIEs, the more my head hurts! The RIF specification allows prefixes and colons without local names, which surprised me. However, the CURIE Syntax 1.0 working group note specifically states that this form is supported…and then promptly provides a syntactic definition that seems to preclude it! However, on (much) deeper inspection, it appears that ‘ex1:’ (for example) is allowed, but would really represent a ‘fragment’ of the ‘reference’, rather than a prefix! Ouch! This is so completely ambiguous that it surely calls into question the whole CURIE specification.   In any case, RIF does not allow local names without a prefix. ·      Missing ‘External’ specifiers for built-in functions and predicates.  The EBNF specification enforces this for terms within frames, but does not appear to enforce (what I believe is) the correct use of External on built-in predicates. In any case, the running example only specifies ‘External’ once on the predicate in UnknownStatusRule. External() is required in several other places. ·      The List used on the LHS of UnknownStatusRule is comma-delimited. This is not supported by the EBNF definition. Similarly, the argument list of pred:list-contains is illegally comma-delimited. ·      Unnecessary use of conjunction around a single formula in DiscountRule. This is strictly legal in the EBNF, but redundant.   All the above issues concern the presentation syntax used in the running example. There are a few minor issues with the grammar itself. Note that Michael Kiefer stated in his paper “Rule Interchange Format: The Framework” that: “The presentation syntax of RIF … is an abstract syntax and, as such, it omits certain details that might be important for unambiguous parsing.” ·      The grammar cannot differentiate unambiguously between strategies and priorities on groups. A processor is forced to resolve this by detecting the use of IRIs and integers. This could easily be fixed in the grammar.   ·      The grammar cannot unambiguously parse the ‘->’ operator in frames. Specifically, ‘-’ characters are allowed in PN_LOCAL names and hence a parser cannot determine if ‘status->’ is (‘status’ ‘->’) or (‘status-’ ‘>’).   One way to fix this is to amend the PN_LOCAL production as follows: PN_LOCAL ::= ( PN_CHARS_U | [0-9] ) ((PN_CHARS|'.')* ((PN_CHARS)-('-')))? However, unilaterally changing the definition of this production, which is defined in the SPARQL Query Language for RDF specification, makes me uncomfortable. ·      I assume that the presentation syntax is case-sensitive. I couldn’t find this stated anywhere in the documentation, but function/predicate names do appear to be documented as being case-sensitive. ·      The EBNF does not specify whitespace handling. A couple of productions (RULE and ACTION_BLOCK) are crafted to enforce the use of whitespace. This is not necessary. It seems inconsistent with the rest of the specification and can cause parsing issues. In addition, the Const production exhibits whitespaces issues. The intention may have been to disallow the use of whitespace around ‘^^’, but any direct implementation of the EBNF will probably allow whitespace between ‘^^’ and the SYMSPACE. Of course, I am being a little nit-picking about all this. On the whole, the EBNF translated very smoothly and directly to ‘M’ (MGrammar) and proved to be fairly complete. I have encountered far worse issues when translating other EBNF specifications into usable grammars.   I can’t imagine there would be any difficulty in implementing the same grammar in Antlr, COCO/R, gppg, XText, Bison, etc. A general observation, which repeats a point made above, is that the use of parenthesis in the presentation syntax can feel inconsistent and un-intuitive.   It isn’t actually inconsistent, but I think the presentation syntax could be improved by adopting braces, rather than parenthesis, to delimit subordinate syntax elements in a similar way to so many programming languages. The familiarity of braces would communicate the structure of the syntax more clearly to people like me.  If braces were adopted, parentheses could be retained around ‘var (frame | ‘new()’) constructs in action blocks. This use of parenthesis feels very LISP-like, and I think that this is my issue. It’s as if the presentation syntax represents the deformed love-child of LISP and C. In some places (specifically, action blocks), parenthesis is used in a LISP-like fashion. In other places it is used like braces in C. I find this quite confusing. Here is a corrected version of the running example (Example 9.1) in compliant presentation syntax: Document(    Prefix( ex1 <http://example.com/2009/prd2> )    (* ex1:CheckoutRuleset *)  Group rif:forwardChaining (     (* ex1:GoldRule *)    Group 10 (      Forall ?customer such that And(?customer # ex1:Customer                                     ?customer[ex1:status->"Silver"])        (Forall ?shoppingCart such that ?customer[ex1:shoppingCart->?shoppingCart]           (If Exists ?value (And(?shoppingCart[ex1:value->?value]                                  External(pred:numeric-greater-than-or-equal(?value 2000))))            Then Do(Modify(?customer[ex1:status->"Gold"])))))      (* ex1:DiscountRule *)    Group (      Forall ?customer such that ?customer # ex1:Customer        (If Or( ?customer[ex1:status->"Silver"]                ?customer[ex1:status->"Gold"])         Then Do ((?s ?customer[ex1:shoppingCart-> ?s])                  (?v ?s[ex1:value->?v])                  Modify(?s [ex1:value->External(func:numeric-multiply (?v 0.95))]))))      (* ex1:NewCustomerAndWidgetRule *)    Group (      Forall ?customer such that And(?customer # ex1:Customer                                     ?customer[ex1:status->"New"] )        (If Exists ?shoppingCart ?item                   (And(?customer[ex1:shoppingCart->?shoppingCart]                        ?shoppingCart[ex1:containsItem->?item]                        ?item # ex1:Widget ) )         Then Do( (?s ?customer[ex1:shoppingCart->?s])                  (?val ?s[ex1:value->?val])                  (?voucher ?customer[ex1:voucher->?voucher])                  Retract(?customer[ex1:voucher->?voucher])                  Retract(?voucher)                  Modify(?s[ex1:value->External(func:numeric-multiply(?val 0.90))]))))      (* ex1:UnknownStatusRule *)    Group (      Forall ?customer such that ?customer # ex1:Customer        (If Not(Exists ?status                       (And(?customer[ex1:status->?status]                            External(pred:list-contains(List("New" "Bronze" "Silver" "Gold") ?status)) )))         Then Do( Execute(act:print(External(func:concat("New customer: " ?customer))))                  Assert(?customer[ex1:status->"New"]))))  ) )   I hope that helps someone out there :-)

    Read the article

  • I see no LOBs!

    - by Paul White
    Is it possible to see LOB (large object) logical reads from STATISTICS IO output on a table with no LOB columns? I was asked this question today by someone who had spent a good fraction of their afternoon trying to work out why this was occurring – even going so far as to re-run DBCC CHECKDB to see if any corruption had taken place.  The table in question wasn’t particularly pretty – it had grown somewhat organically over time, with new columns being added every so often as the need arose.  Nevertheless, it remained a simple structure with no LOB columns – no TEXT or IMAGE, no XML, no MAX types – nothing aside from ordinary INT, MONEY, VARCHAR, and DATETIME types.  To add to the air of mystery, not every query that ran against the table would report LOB logical reads – just sometimes – but when it did, the query often took much longer to execute. Ok, enough of the pre-amble.  I can’t reproduce the exact structure here, but the following script creates a table that will serve to demonstrate the effect: IF OBJECT_ID(N'dbo.Test', N'U') IS NOT NULL DROP TABLE dbo.Test GO CREATE TABLE dbo.Test ( row_id NUMERIC IDENTITY NOT NULL,   col01 NVARCHAR(450) NOT NULL, col02 NVARCHAR(450) NOT NULL, col03 NVARCHAR(450) NOT NULL, col04 NVARCHAR(450) NOT NULL, col05 NVARCHAR(450) NOT NULL, col06 NVARCHAR(450) NOT NULL, col07 NVARCHAR(450) NOT NULL, col08 NVARCHAR(450) NOT NULL, col09 NVARCHAR(450) NOT NULL, col10 NVARCHAR(450) NOT NULL, CONSTRAINT [PK dbo.Test row_id] PRIMARY KEY CLUSTERED (row_id) ) ; The next script loads the ten variable-length character columns with one-character strings in the first row, two-character strings in the second row, and so on down to the 450th row: WITH Numbers AS ( -- Generates numbers 1 - 450 inclusive SELECT TOP (450) n = ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM master.sys.columns C1, master.sys.columns C2, master.sys.columns C3 ORDER BY n ASC ) INSERT dbo.Test WITH (TABLOCKX) SELECT REPLICATE(N'A', N.n), REPLICATE(N'B', N.n), REPLICATE(N'C', N.n), REPLICATE(N'D', N.n), REPLICATE(N'E', N.n), REPLICATE(N'F', N.n), REPLICATE(N'G', N.n), REPLICATE(N'H', N.n), REPLICATE(N'I', N.n), REPLICATE(N'J', N.n) FROM Numbers AS N ORDER BY N.n ASC ; Once those two scripts have run, the table contains 450 rows and 10 columns of data like this: Most of the time, when we query data from this table, we don’t see any LOB logical reads, for example: -- Find the maximum length of the data in -- column 5 for a range of rows SELECT result = MAX(DATALENGTH(T.col05)) FROM dbo.Test AS T WHERE row_id BETWEEN 50 AND 100 ; But with a different query… -- Read all the data in column 1 SELECT result = MAX(DATALENGTH(T.col01)) FROM dbo.Test AS T ; …suddenly we have 49 LOB logical reads, as well as the ‘normal’ logical reads we would expect. The Explanation If we had tried to create this table in SQL Server 2000, we would have received a warning message to say that future INSERT or UPDATE operations on the table might fail if the resulting row exceeded the in-row storage limit of 8060 bytes.  If we needed to store more data than would fit in an 8060 byte row (including internal overhead) we had to use a LOB column – TEXT, NTEXT, or IMAGE.  These special data types store the large data values in a separate structure, with just a small pointer left in the original row. Row Overflow SQL Server 2005 introduced a feature called row overflow, which allows one or more variable-length columns in a row to move to off-row storage if the data in a particular row would otherwise exceed 8060 bytes.  You no longer receive a warning when creating (or altering) a table that might need more than 8060 bytes of in-row storage; if SQL Server finds that it can no longer fit a variable-length column in a particular row, it will silently move one or more of these columns off the row into a separate allocation unit. Only variable-length columns can be moved in this way (for example the (N)VARCHAR, VARBINARY, and SQL_VARIANT types).  Fixed-length columns (like INTEGER and DATETIME for example) never move into ‘row overflow’ storage.  The decision to move a column off-row is done on a row-by-row basis – so data in a particular column might be stored in-row for some table records, and off-row for others. In general, if SQL Server finds that it needs to move a column into row-overflow storage, it moves the largest variable-length column record for that row.  Note that in the case of an UPDATE statement that results in the 8060 byte limit being exceeded, it might not be the column that grew that is moved! Sneaky LOBs Anyway, that’s all very interesting but I don’t want to get too carried away with the intricacies of row-overflow storage internals.  The point is that it is now possible to define a table with non-LOB columns that will silently exceed the old row-size limit and result in ordinary variable-length columns being moved to off-row storage.  Adding new columns to a table, expanding an existing column definition, or simply storing more data in a column than you used to – all these things can result in one or more variable-length columns being moved off the row. Note that row-overflow storage is logically quite different from old-style LOB and new-style MAX data type storage – individual variable-length columns are still limited to 8000 bytes each – you can just have more of them now.  Having said that, the physical mechanisms involved are very similar to full LOB storage – a column moved to row-overflow leaves a 24-byte pointer record in the row, and the ‘separate storage’ I have been talking about is structured very similarly to both old-style LOBs and new-style MAX types.  The disadvantages are also the same: when SQL Server needs a row-overflow column value it needs to follow the in-row pointer a navigate another chain of pages, just like retrieving a traditional LOB. And Finally… In the example script presented above, the rows with row_id values from 402 to 450 inclusive all exceed the total in-row storage limit of 8060 bytes.  A SELECT that references a column in one of those rows that has moved to off-row storage will incur one or more lob logical reads as the storage engine locates the data.  The results on your system might vary slightly depending on your settings, of course; but in my tests only column 1 in rows 402-450 moved off-row.  You might like to play around with the script – updating columns, changing data type lengths, and so on – to see the effect on lob logical reads and which columns get moved when.  You might even see row-overflow columns moving back in-row if they are updated to be smaller (hint: reduce the size of a column entry by at least 1000 bytes if you hope to see this). Be aware that SQL Server will not warn you when it moves ‘ordinary’ variable-length columns into overflow storage, and it can have dramatic effects on performance.  It makes more sense than ever to choose column data types sensibly.  If you make every column a VARCHAR(8000) or NVARCHAR(4000), and someone stores data that results in a row needing more than 8060 bytes, SQL Server might turn some of your column data into pseudo-LOBs – all without saying a word. Finally, some people make a distinction between ordinary LOBs (those that can hold up to 2GB of data) and the LOB-like structures created by row-overflow (where columns are still limited to 8000 bytes) by referring to row-overflow LOBs as SLOBs.  I find that quite appealing, but the ‘S’ stands for ‘small’, which makes expanding the whole acronym a little daft-sounding…small large objects anyone? © Paul White 2011 email: [email protected] twitter: @SQL_Kiwi

    Read the article

  • Spacewalk 2.0 provided to manage Oracle Linux systems

    - by wcoekaer
    Oracle Linux customers have a few options to manage and provision their servers. We provide a license to use Oracle Enterprise Manager's Linux OS management, monitoring and provisioning features without additional cost for every server that has an Oracle Linux support subscription. So there is no additional pack to license and no additional per server cost, it's all included in our Basic, Premier and Systems support subscriptions. The nice thing with Oracle Enterprise Manager is that you end up with a single management product that can manage all aspects of your software stack. You have complete insight into the applications running, you have roles and responsibilities, you have third party connectors for storage or other products and it makes it very easy and convenient to correlate data and events when something happens. If you use Oracle VM as well, you end up with a complete cloud portal with selfservice, chargeback, etc... Another, much simpler option, is just using yum. It is very easy to take a server and create directories and expose these through apache as repositories. You can have a simple yum config on each server pointing to a few specific repositories. It requires some manual effort in terms of creating directories, downloading packages and creating local repo files but it's easy to do and for many people a preferred solution. There are also a good number of customers that just connect their servers directly to ULN or to our free update server public-yum. Just to re-iterate, our public-yum servers have all the errata and updates available for free. Now we added another option. Many of our customers have switched from a competing Linux vendor and they had familiarity with their management tools. Switching to Oracle for support is very easy since we don't require changes to the installed servers but we also want to make sure there is a very easy and almost transparent switch for the management tools as well. While Oracle Enterprise Manager is our preferred way of managing systems, we now are offering Spacewalk 2.0 to our customers. The community project can be found here. We have made a few changes to ensure easy and complete support for Oracle Linux, tested it with public-yum, etc.. You can find the rpms in our public-yum repos at http://public-yum.oracle.com/repo/OracleLinux/OL6/. There are repositories for spacewalk server and then for each version (OL5,OL6) and architecture (x86 and x86-64) we have the client repositories as well. Spacewalk itself is only made available for OL6 x86-64. Documentation can be found here. I set it up myself and here are some quick steps on how you can get going in just a matter of minutes: Spacewalk Server Installation : 1) Installing an Oracle Database Use an existing Oracle Database or install a new Oracle Database (Standard or Enterprise Edition) [at this time use 11g, we will add support for 12c in the near future]. This database can be installed on the spacewalk server or on a separate remote server. While Oracle XE might work to create a small sample POC, we do not support the use of Oracle XE, spacewalk repositories can become large and create a significant database workload. Customers can use their existing database licenses, they can download the database with a trial licence from http://edelivery.oracle.com or Oracle Linux subscribers (customers) will be allowed to use the Oracle Database as a spacewalk repository as part of their Oracle Linux subscription at no additional cost. |NOTE : spacewalk requires the database to be configured with the UTF8 characterset. |Installation will fail if your database does not use UTF8. |To verify if your database is configured correctly, run the following command in sqlplus: | |select value from nls_database_parameters where parameter='NLS_CHARACTERSET'; |This should return 'AL32UTF8' 2) Configure the database schema for spacewalk Ideally, create a tablespace in the database to hold the spacewalk schema tables/data; create tablespace spacewalk datafile '/u01/app/oracle/oradata/orcl/spacewalk.dbf' size 10G autoextend on; Create the database user spacewalk (or use some other schema name) in sqlplus. example : create user spacewalk identified by spacewalk; grant connect, resource to spacewalk; grant create table, create trigger, create synonym, create view, alter session to spacewalk; grant unlimited tablespace to spacewalk; alter user spacewalk default tablespace spacewalk; 4) Spacewalk installation and configuration Spacewalk server requires an Oracle Linux 6 x86-64 system. Clients can be Oracle Linux 5 or 6, both 32- and 64bit. The server is only supported on OL6/64bit. The easiest way to get started is to do a 'Minimal' install of Oracle Linux on a server and configure the yum repository to include the spacewalk repo from public-yum. Once you have a system with a minimal install, modify your yum repo to include the spacewalk repo. Example : edit /etc/yum.repos.d/public-yum-ol.repo and add the following lines at the end of the file : [spacewalk] name=spacewalk baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/spacewalk20/server/$basearch/ gpgkey=http://public-yum.oracle.com/RPM-GPG-KEY-oracle-ol6 gpgcheck=1 enabled=1 Install the following pre-requisite packages on your spacewalk server : oracle-instantclient11.2-basic-11.2.0.3.0-1.x86_64 oracle-instantclient11.2-sqlplus-11.2.0.3.0-1.x86_64 rpm -ivh oracle-instantclient11.2-basic-11.2.0.3.0-1.x86_64 rpm -ivh oracle-instantclient11.2-sqlplus-11.2.0.3.0-1.x86_64 The above RPMs can be found on the Oracle Technology Network website : http://www.oracle.com/technetwork/topics/linuxx86-64soft-092277.html As the root user, configure the library path to include the Oracle Instant Client libraries : cd /etc/ld.so.conf.d echo /usr/lib/oracle/11.2/client64/lib oracle-instantclient11.2.conf ldconfig Install spacewalk : # yum install spacewalk-oracle The above yum command should download and install all required packages to run spacewalk on your local server. | NOTE : if you did a full, desktop or workstation installation, | you have to remove the JTA package | BEFORE installing spacewalk-oracle (rpm -e --nodeps jta) Once the installation completes, simply run the spacewalk configuration tool and you are all set. (make sure to run the command with the 2 arguments) spacewalk-setup --disconnected --external-db Answer the questions during the setup, ensure you provide the current database user (example : spacewalk) and password (example : spacewalk) and database server hostname (the standard hostname of the server on which you have deployed the Oracle database) At the end of the setup script, your spacewalk server should be fully configured and you can log into the web portal. Use your favorite browser to connect to the website : http://[spacewalkserverhostname] The very first action will be to create the main admin account.

    Read the article

  • Why you need to learn async in .NET

    - by PSteele
    I had an opportunity to teach a quick class yesterday about what’s new in .NET 4.0.  One of the topics was the TPL (Task Parallel Library) and how it can make async programming easier.  I also stressed that this is the direction Microsoft is going with for C# 5.0 and learning the TPL will greatly benefit their understanding of the new async stuff.  We had a little time left over and I was able to show some code that uses the Async CTP to accomplish some stuff, but it wasn’t a simple demo that you could jump in to and understand so I thought I’d thrown one together and put it in a blog post. The entire solution file with all of the sample projects is located here. A Simple Example Let’s start with a super-simple example (WindowsApplication01 in the solution). I’ve got a form that displays a label and a button.  When the user clicks the button, I want to start displaying the current time for 15 seconds and then stop. What I’d like to write is this: lblTime.ForeColor = Color.Red; for (var x = 0; x < 15; x++) { lblTime.Text = DateTime.Now.ToString("HH:mm:ss"); Thread.Sleep(1000); } lblTime.ForeColor = SystemColors.ControlText; (Note that I also changed the label’s color while counting – not quite an ILM-level effect, but it adds something to the demo!) As I’m sure most of my readers are aware, you can’t write WinForms code this way.  WinForms apps, by default, only have one thread running and it’s main job is to process messages from the windows message pump (for a more thorough explanation, see my Visual Studio Magazine article on multithreading in WinForms).  If you put a Thread.Sleep in the middle of that code, your UI will be locked up and unresponsive for those 15 seconds.  Not a good UX and something that needs to be fixed.  Sure, I could throw an “Application.DoEvents()” in there, but that’s hacky. The Windows Timer Then I think, “I can solve that.  I’ll use the Windows Timer to handle the timing in the background and simply notify me when the time has changed”.  Let’s see how I could accomplish this with a Windows timer (WindowsApplication02 in the solution): public partial class Form1 : Form { private readonly Timer clockTimer; private int counter;   public Form1() { InitializeComponent(); clockTimer = new Timer {Interval = 1000}; clockTimer.Tick += UpdateLabel; }   private void UpdateLabel(object sender, EventArgs e) { lblTime.Text = DateTime.Now.ToString("HH:mm:ss"); counter++; if (counter == 15) { clockTimer.Enabled = false; lblTime.ForeColor = SystemColors.ControlText; } }   private void cmdStart_Click(object sender, EventArgs e) { lblTime.ForeColor = Color.Red; counter = 0; clockTimer.Start(); } } Holy cow – things got pretty complicated here.  I use the timer to fire off a Tick event every second.  Inside there, I can update the label.  Granted, I can’t use a simple for/loop and have to maintain a global counter for the number of iterations.  And my “end” code (when the loop is finished) is now buried inside the bottom of the Tick event (inside an “if” statement).  I do, however, get a responsive application that doesn’t hang or stop repainting while the 15 seconds are ticking away. But doesn’t .NET have something that makes background processing easier? The BackgroundWorker Next I try .NET’s BackgroundWorker component – it’s specifically designed to do processing in a background thread (leaving the UI thread free to process the windows message pump) and allows updates to be performed on the main UI thread (WindowsApplication03 in the solution): public partial class Form1 : Form { private readonly BackgroundWorker worker;   public Form1() { InitializeComponent(); worker = new BackgroundWorker {WorkerReportsProgress = true}; worker.DoWork += StartUpdating; worker.ProgressChanged += UpdateLabel; worker.RunWorkerCompleted += ResetLabelColor; }   private void StartUpdating(object sender, DoWorkEventArgs e) { var workerObject = (BackgroundWorker) sender; for (int x = 0; x < 15; x++) { workerObject.ReportProgress(0); Thread.Sleep(1000); } }   private void UpdateLabel(object sender, ProgressChangedEventArgs e) { lblTime.Text = DateTime.Now.ToString("HH:mm:ss"); }   private void ResetLabelColor(object sender, RunWorkerCompletedEventArgs e) { lblTime.ForeColor = SystemColors.ControlText; }   private void cmdStart_Click(object sender, EventArgs e) { lblTime.ForeColor = Color.Red; worker.RunWorkerAsync(); } } Well, this got a little better (I think).  At least I now have my simple for/next loop back.  Unfortunately, I’m still dealing with event handlers spread throughout my code to co-ordinate all of this stuff in the right order. Time to look into the future. The async way Using the Async CTP, I can go back to much simpler code (WindowsApplication04 in the solution): private async void cmdStart_Click(object sender, EventArgs e) { lblTime.ForeColor = Color.Red; for (var x = 0; x < 15; x++) { lblTime.Text = DateTime.Now.ToString("HH:mm:ss"); await TaskEx.Delay(1000); } lblTime.ForeColor = SystemColors.ControlText; } This code will run just like the Timer or BackgroundWorker versions – fully responsive during the updates – yet is way easier to implement.  In fact, it’s almost a line-for-line copy of the original version of this code.  All of the async plumbing is handled by the compiler and the framework.  My code goes back to representing the “what” of what I want to do, not the “how”. I urge you to download the Async CTP.  All you need is .NET 4.0 and Visual Studio 2010 sp1 – no need to set up a virtual machine with the VS2011 beta (unless, of course, you want to dive right in to the C# 5.0 stuff!).  Starting playing around with this today and see how much easier it will be in the future to write async-enabled applications.

    Read the article

  • NUMA-aware placement of communication variables

    - by Dave
    For classic NUMA-aware programming I'm typically most concerned about simple cold, capacity and compulsory misses and whether we can satisfy the miss by locally connected memory or whether we have to pull the line from its home node over the coherent interconnect -- we'd like to minimize channel contention and conserve interconnect bandwidth. That is, for this style of programming we're quite aware of where memory is homed relative to the threads that will be accessing it. Ideally, a page is collocated on the node with the thread that's expected to most frequently access the page, as simple misses on the page can be satisfied without resorting to transferring the line over the interconnect. The default "first touch" NUMA page placement policy tends to work reasonable well in this regard. When a virtual page is first accessed, the operating system will attempt to provision and map that virtual page to a physical page allocated from the node where the accessing thread is running. It's worth noting that the node-level memory interleaving granularity is usually a multiple of the page size, so we can say that a given page P resides on some node N. That is, the memory underlying a page resides on just one node. But when thinking about accesses to heavily-written communication variables we normally consider what caches the lines underlying such variables might be resident in, and in what states. We want to minimize coherence misses and cache probe activity and interconnect traffic in general. I don't usually give much thought to the location of the home NUMA node underlying such highly shared variables. On a SPARC T5440, for instance, which consists of 4 T2+ processors connected by a central coherence hub, the home node and placement of heavily accessed communication variables has very little impact on performance. The variables are frequently accessed so likely in M-state in some cache, and the location of the home node is of little consequence because a requester can use cache-to-cache transfers to get the line. Or at least that's what I thought. Recently, though, I was exploring a simple shared memory point-to-point communication model where a client writes a request into a request mailbox and then busy-waits on a response variable. It's a simple example of delegation based on message passing. The server polls the request mailbox, and having fetched a new request value, performs some operation and then writes a reply value into the response variable. As noted above, on a T5440 performance is insensitive to the placement of the communication variables -- the request and response mailbox words. But on a Sun/Oracle X4800 I noticed that was not the case and that NUMA placement of the communication variables was actually quite important. For background an X4800 system consists of 8 Intel X7560 Xeons . Each package (socket) has 8 cores with 2 contexts per core, so the system is 8x8x2. Each package is also a NUMA node and has locally attached memory. Every package has 3 point-to-point QPI links for cache coherence, and the system is configured with a twisted ladder "mobius" topology. The cache coherence fabric is glueless -- there's not central arbiter or coherence hub. The maximum distance between any two nodes is just 2 hops over the QPI links. For any given node, 3 other nodes are 1 hop distant and the remaining 4 nodes are 2 hops distant. Using a single request (client) thread and a single response (server) thread, a benchmark harness explored all permutations of NUMA placement for the two threads and the two communication variables, measuring the average round-trip-time and throughput rate between the client and server. In this benchmark the server simply acts as a simple transponder, writing the request value plus 1 back into the reply field, so there's no particular computation phase and we're only measuring communication overheads. In addition to varying the placement of communication variables over pairs of nodes, we also explored variations where both variables were placed on one page (and thus on one node) -- either on the same cache line or different cache lines -- while varying the node where the variables reside along with the placement of the threads. The key observation was that if the client and server threads were on different nodes, then the best placement of variables was to have the request variable (written by the client and read by the server) reside on the same node as the client thread, and to place the response variable (written by the server and read by the client) on the same node as the server. That is, if you have a variable that's to be written by one thread and read by another, it should be homed with the writer thread. For our simple client-server model that means using split request and response communication variables with unidirectional message flow on a given page. This can yield up to twice the throughput of less favorable placement strategies. Our X4800 uses the QPI 1.0 protocol with source-based snooping. Briefly, when node A needs to probe a cache line it fires off snoop requests to all the nodes in the system. Those recipients then forward their response not to the original requester, but to the home node H of the cache line. H waits for and collects the responses, adjudicates and resolves conflicts and ensures memory-model ordering, and then sends a definitive reply back to the original requester A. If some node B needed to transfer the line to A, it will do so by cache-to-cache transfer and let H know about the disposition of the cache line. A needs to wait for the authoritative response from H. So if a thread on node A wants to write a value to be read by a thread on node B, the latency is dependent on the distances between A, B, and H. We observe the best performance when the written-to variable is co-homed with the writer A. That is, we want H and A to be the same node, as the writer doesn't need the home to respond over the QPI link, as the writer and the home reside on the very same node. With architecturally informed placement of communication variables we eliminate at least one QPI hop from the critical path. Newer Intel processors use the QPI 1.1 coherence protocol with home-based snooping. As noted above, under source-snooping a requester broadcasts snoop requests to all nodes. Those nodes send their response to the home node of the location, which provides memory ordering, reconciles conflicts, etc., and then posts a definitive reply to the requester. In home-based snooping the snoop probe goes directly to the home node and are not broadcast. The home node can consult snoop filters -- if present -- and send out requests to retrieve the line if necessary. The 3rd party owner of the line, if any, can respond either to the home or the original requester (or even to both) according to the protocol policies. There are myriad variations that have been implemented, and unfortunately vendor terminology doesn't always agree between vendors or with the academic taxonomy papers. The key is that home-snooping enables the use of a snoop filter to reduce interconnect traffic. And while home-snooping might have a longer critical path (latency) than source-based snooping, it also may require fewer messages and less overall bandwidth. It'll be interesting to reprise these experiments on a platform with home-based snooping. While collecting data I also noticed that there are placement concerns even in the seemingly trivial case when both threads and both variables reside on a single node. Internally, the cores on each X7560 package are connected by an internal ring. (Actually there are multiple contra-rotating rings). And the last-level on-chip cache (LLC) is partitioned in banks or slices, which with each slice being associated with a core on the ring topology. A hardware hash function associates each physical address with a specific home bank. Thus we face distance and topology concerns even for intra-package communications, although the latencies are not nearly the magnitude we see inter-package. I've not seen such communication distance artifacts on the T2+, where the cache banks are connected to the cores via a high-speed crossbar instead of a ring -- communication latencies seem more regular.

    Read the article

  • Refactoring Part 1 : Intuitive Investments

    - by Wes McClure
    Fear, it’s what turns maintaining applications into a nightmare.  Technology moves on, teams move on, someone is left to operate the application, what was green is now perceived brown.  Eventually the business will evolve and changes will need to be made.  The approach to those changes often dictates the long term viability of the application.  Fear of change, lack of passion and a lack of interest in understanding the domain often leads to a paranoia to do anything that doesn’t involve duct tape and bailing twine.  Don’t get me wrong, those have a place in the short term viability of a project but they don’t have a place in the long term.  Add to it “us versus them” in regards to the original team and those that maintain it, internal politics and other factors and you have a recipe for disaster.  This results in code that quickly becomes unmanageable.  Even the most clever of designs will eventually become sub optimal and debt will amount that exponentially makes changes difficult.  This is where refactoring comes in, and it’s something I’m very passionate about.  Refactoring is about improving the process whereby we make change, it’s an exponential investment in the process of change. Without it we will incur exponential complexity that halts productivity. Investments, especially in the long term, require intuition and reflection.  How can we tackle new development effectively via evolving the original design and paying off debt that has been incurred? The longer we wait to ask and answer this question, the more it will cost us.  Small requests don’t warrant big changes, but realizing when changes now will pay off in the long term, and especially in the short term, is valuable. I have done my fair share of maintaining applications and continuously refactoring as needed, but recently I’ve begun work on a project that hasn’t had much debt, if any, paid down in years.  This is the first in a series of blog posts to try to capture the process which is largely driven by intuition of smaller refactorings from other projects. Signs that refactoring could help: Testability How can decreasing test time not pay dividends? One of the first things I found was that a very important piece often takes 30+ minutes to test.  I can only imagine how much time this has cost historically, but more importantly the time it might cost in the coming weeks: I estimate at least 10-20 hours per person!  This is simply unacceptable for almost any situation.  As it turns out, about 6 hours of working with this part of the application and I was able to cut the time down to under 30 seconds!  In less than the lost time of one week, I was able to fix the problem for all future weeks! If we can’t test fast then we can’t change fast, nor with confidence. Code is used by end users and it’s also used by developers, consider your own needs in terms of the code base.  Adding logic to enable/disable features during testing can help decouple parts of an application and lead to massive improvements.  What exactly is so wrong about test code in real code?  Often, these become features for operators and sometimes end users.  If you cannot run an integration test within a test runner in your IDE, it’s time to refactor. Readability Are variables named meaningfully via a ubiquitous language? Is the code segmented functionally or behaviorally so as to minimize the complexity of any one area? Are aspects properly segmented to avoid confusion (security, logging, transactions, translations, dependency management etc) Is the code declarative (what) or imperative (how)?  What matters, not how.  LINQ is a great abstraction of the what, not how, of collection manipulation.  The Reactive framework is a great example of the what, not how, of managing streams of data. Are constants abstracted and named, or are they just inline? Do people constantly bitch about the code/design? If the code is hard to understand, it will be hard to change with confidence.  It’s a large undertaking if the original designers didn’t pay much attention to readability and as such will never be done to “completion.”  Make sure not to go over board, instead use this as you change an application, not in lieu of changes (like with testability). Complexity Simplicity will never be achieved, it’s highly subjective.  That said, a lot of code can be significantly simplified, tidy it up as you go.  Refactoring will often converge upon a simplification step after enough time, keep an eye out for this. Understandability In the process of changing code, one often gains a better understanding of it.  Refactoring code is a good way to learn how it works.  However, it’s usually best in combination with other reasons, in effect killing two birds with one stone.  Often this is done when readability is poor, in which case understandability is usually poor as well.  In the large undertaking we are making with this legacy application, we will be replacing it.  Therefore, understanding all of its features is important and this refactoring technique will come in very handy. Unused code How can deleting things not help? This is a freebie in refactoring, it’s very easy to detect with modern tools, especially in statically typed languages.  We have VCS for a reason, if in doubt, delete it out (ok that was cheesy)! If you don’t know where to start when refactoring, this is an excellent starting point! Duplication Do not pray and sacrifice to the anti-duplication gods, there are excellent examples where consolidated code is a horrible idea, usually with divergent domains.  That said, mediocre developers live by copy/paste.  Other times features converge and aren’t combined.  Tools for finding similar code are great in the example of copy/paste problems.  Knowledge of the domain helps identify convergent concepts that often lead to convergent solutions and will give intuition for where to look for conceptual repetition. 80/20 and the Boy Scouts It’s often said that 80% of the time 20% of the application is used most.  These tend to be the parts that are changed.  There are also parts of the code where 80% of the time is spent changing 20% (probably for all the refactoring smells above).  I focus on these areas any time I make a change and follow the philosophy of the Boy Scout in cleaning up more than I messed up.  If I spend 2 hours changing an application, in the 20%, I’ll always spend at least 15 minutes cleaning it or nearby areas. This gives a huge productivity edge on developers that don’t. Ironically after a short period of time the 20% shrinks enough that we don’t have to spend 80% of our time there and can move on to other areas.   Refactoring is highly subjective, never attempt to refactor to completion!  Learn to be comfortable with leaving one part of the application in a better state than others.  It’s an evolution, not a revolution.  These are some simple areas to look into when making changes and can help get one started in the process.  I’ve often found that refactoring is a convergent process towards simplicity that sometimes spans a few hours but often can lead to massive simplifications over the timespan of weeks and months of regular development.

    Read the article

  • Not attending the LUGM mini-meetup - 05. Oct 2013

    Not attending a meeting of the LUGM can be fun, too. It's getting a bit of a habit that Ish is organising small gatherings, aka mini-meetups, of the Linux User Group Mauritius/Meta (LUGM) almost every Saturday. There they mainly discuss and talk about various elements of using Linux as ones main operating systems and the possibilities you are going to have. On top of course, some tips & tricks about mastering the command line and initial steps in scripting or even writing HTML. In general, sounds like a good portion of fun and great spirit of community. Unfortunately, I'm usually quite busy with private and family matters during the weekend and so I already signalised that I wouldn't be around. Well, at least not physically... But this Saturday a couple of things worked out faster than expected and so I was hanging out on my machine. I made virtual contact with one of Pawan's messages over on Facebook... And somehow that kicked off some kind of an online game fun on basic configuration of Apache HTTPd 2.2.x, PHP 5.x and how to improve the overall performance of a newly installed blog based on WordPress. Default configuration files Nitin's website finally came alive and despite the dark theme and the hidden Apple 'fanboy' advertisement I was more interested in the technical situation. As with any new installation there is usually quite some adjustment to be done. And Nitin's page was no exception. Unfortunately, out of the box installations of Apache httpd and PHP are too verbose and expose too much information under the hood. You might think that this isn't really a problem at all, well, think about it again after completely reading this article. First, I checked the HTTP response headers - using either Chrome Developer Tools or Firefox Web Developer extension - of Nitin's page and based on that I advised him to lower the noise levels a little bit. It's not really necessary that detailed information about web server software and scripting language has to be published in every response made. Quite a number of script kiddies and exploits actually check for version specifics prior to an attack. So, removing at least version details hardens the system a little bit. In particular, I'm talking about these response values: Server X-Powered-By How to achieve that? By tweaking the configuration files... Namely, we are going to look into the following ones: apache2.conf httpd.conf .htaccess php.ini The above list contains some additional files, I'm talking about in the next paragraphs. Anyway, those are the ones involved. Tweaking Apache Open your favourite text editor and start to modify the apache2.conf. Eventually, you might like to have a quick peak at the file to see whether it is necessary to adjust it or not. Following is a handy combination of commands to get an overview of your active directives: # sudo grep -v '#' /etc/apache2/apache2.conf | grep -v '^$' | less There you keep an eye on those two Apache directives: ServerSignature Off ServerTokens Prod If that's not the case, change them as highlighted above. In order to activate your modifications you have to restart Apache httpd server. On Debian and Ubuntu you might use apache2ctl for that, on other distributions you might have to use service or run the init-scripts again: # sudo apache2ctl configtestSyntax OK# sudo apache2ctl restart Refresh your website and check the HTTP response header. Tweaking PHP5 (a little bit) Next, check your php.ini file with the following statement: # sudo grep -v ';' /etc/php5/apache2/php.ini | grep -v '^$' | less And check the value of expose_php = Off Again, if it's not as highlighted, change it... Some more Apache love Okay, back to Apache it might also be interesting to improve the situation about browser caching and removing more obsolete information. When you run your website against the usual performance checks like Google Page Speed and Yahoo YSlow you might see those check points with bad grades on a standard, default configuration. Well, this can be done easily. Configure entity tags (ETags) ETags are only interesting when you run your websites on a farm of multiple web servers. Removing this data for your static resources is very simple in Apache. As we are going to deal with the HTTP response header information you have to ensure that Apache is capable to manipulate them. First, check your enabled modules: # sudo ls -al /etc/apache2/mods-enabled/ | grep headers And in case that the 'headers' module is not listed, you have to enable it from the available ones: # sudo a2enmod headers Second, check your httpd.conf file (in case it exists): # sudo grep -v '#' /etc/apache2/httpd.conf | grep -v '^$' | less In newer (better said fresh) installations you might have to create a new configuration file below your conf.d folder with your favourite text editor like so: # sudo nano /etc/apache2/conf.d/headers.conf Then, in order to tweak your HTTP responses either check for those lines or add them: Header unset ETagFileETag None In case that your file doesn't exist or those lines are missing, feel free to create/add them. Afterwards, check your Apache configuration syntax and restart your running instances as already shown above: # sudo apache2ctl configtestSyntax OK# sudo apache2ctl restart Add Expires headers To improve the loading performance of your website, you should take some care into the proper configuration of how to leverage the browser's ability to cache certain resources and files. This is done by adding an Expires: value to the HTTP response header. Generally speaking it is advised that you specify a near-future, read: 1 week or a little bit more, for your static content like JavaScript files or Cascading Style Sheets. One solution to adjust this is to put some instructions into the .htaccess file in the root folder of your web site. Of course, this could also be placed into a more generic location of your Apache installation but honestly, I'd like to keep this at the web site level. Following some adjustments I'm currently using on this blog site: # Turn on Expires and set default to 0ExpiresActive OnExpiresDefault A0 # Set up caching on media files for 1 year (forever?)<FilesMatch "\.(flv|ico|pdf|avi|mov|ppt|doc|mp3|wmv|wav)$">ExpiresDefault A29030400Header append Cache-Control "public"</FilesMatch> # Set up caching on media files for 1 week<FilesMatch "\.(js|css)$">ExpiresDefault A604800Header append Cache-Control "public"</FilesMatch> # Set up caching on media files for 31 days<FilesMatch "\.(gif|jpg|jpeg|png|swf)$">ExpiresDefault A2678400Header append Cache-Control "public"</FilesMatch> As we are editing the .htaccess files, it is not necessary to restart Apache. In case that your web site doesn't load anymore or you're experiencing an error while trying to restart your httpd, check that the 'expires' module is actually an enabled module: # ls -al /etc/apache2/mods-enabled/ | grep expires# sudo a2enmod expires Of course, the instructions above a re not feature complete but I hope that they might provide a better default configuration for your LAMP stack. Resume of the day Within a couple of hours, and while being occupied with an eLearning course on SQL Server 2012, I had some good fun in helping and assisting other LUGM members while they were some kilometers away at Bagatelle. According to other blog articles it seems that Nitin had quite some moments of desperation. Just for the records: At no time it was my intention to either kick his butt or pull a leg on him. Simply, providing some input based on the lessons I've learned over the last couple of years configuring Apache HTTPd and PHP. Check out the other blogs, too: LUGM mini-meetup... Epic! Superb Saturday Linux Meetup And last but not least, the man himself: The end of a new beginning Cheers, and happy community'ing! Updates Due to our weekly Code & Coffee sessions in the MSCC community, I had a chance to talk to Nitin directly and he showed me the problems directly on his machine. This led to update this article hence the paragraphs on enabling the modules 'headers' and 'expires'.

    Read the article

  • How to create a virtual network with Azure Connect

    - by Herve Roggero
    If you are trying to establish a virtual network between machines located in disparate networks, you can either use VPN, Virtual Network or Azure Connect. If you want to establish a connection between machines located in Windows Azure, you should consider using the Virtual Network service. If you want to establish a connection between local machines and Virtual Machines in Windows Azure, you may be able to use your existing VPN device (assuming you have one), as long as the device is supported by Microsoft. If the VPN device you are using isn’t supported, or if you are trying to create a virtual network between machines from disparate networks (such as machines located in another cloud provider), you can use Azure Connect. This blog post explains how Azure Connect can help you create virtual networks between multiple servers in the cloud, various servers in different cloud environments, and on-premise. Note: Azure Connect is currently in Technical Preview. About Azure Connect Let’s do a quick review of Azure Connect. This technology implements an IPSec tunnel from machines to to a relay service located in the Microsoft cloud (Azure). So in essence, Azure Connect doesn’t provide a point-to-point connection between machines; the network communication is tunneled through the relay service. The relay service in turn offers a mechanism to enforce basic communication rules that you define through Groups. We will review this later. You could network two or more VMs in the Azure cloud (although you should consider using a Virtual Network if you go this route), or servers in the Azure cloud and other machines in the Amazon cloud for example, or even two or more on-premise servers located in different locations for which a direct network connection is not an option. You can place any number of machines in your topology. Azure Connect gives you great flexibility on how you want to build your virtual network across various environments. So Azure Connect makes sense when you want to: Connect machines located in different cloud providers Connect on-premise machines running in different locations Connect Azure VMs with on-premise (if you do not have a VPN device, or if your device is not supported) Connect Azure Roles (Worker Roles, Web Roles) with on-premise servers or in other cloud providers The diagram below shows you a high level network topology that involves machines in the Windows Azure cloud, other cloud providers and on-premise. You should note that the only required component in this diagram is the Relay itself. The other machines are optional (although your network is useful only if you have two or more machines involved). Relay agents are currently available in three geographic areas: US, Europe and Asia. You can change which region you want to use in the Windows Azure management portal. High Level Network Topology With Azure Connect Azure Connect Agent Azure Connect establishes a virtual network and creates virtual adapters on your machines; these virtual adapters communicate through the Relay using IPSec. This is achieved by installing an agent (the Azure Connect Agent) on all the machines you want in your network topology. However, you do not need to install the agent on Worker Roles and Web Roles; that’s because the agent is already installed for you. Any other machine, including Virtual Machines in Windows Azure, needs the agent installed.  To install the agent, simply go to your Windows Azure portal (http://windows.azure.com) and click on Networks on the bottom left panel. You will see a list of subscriptions under Connect. If you select a subscription, you will be able to click on the Install Local Endpoint icon on top. Clicking on this icon will begin the download and installation process for the agent. Activating Roles for Azure Connect As previously mentioned, you do not need to install the Azure Connect Agent on Worker Roles and Web Roles because it is already loaded. However, you do need to activate them if you want the roles to participate in your network topology. To do this, you will need to click on the Get Activation Token icon. The activation token must then be copied and placed in the configuration file of your roles. For more information on how to perform this step, visit MSDN at http://msdn.microsoft.com/en-us/library/windowsazure/gg432964.aspx. Firewall Rules Note that specific firewall rules must exist to allow the agent to communicate through the Relay. You will need to allow TCP 443 and ICMPv6. For additional information, please visit MSDN at http://msdn.microsoft.com/en-us/library/windowsazure/gg433061.aspx. CA Certificates You can optionally require agents to sign their activation request with the Relay using a trusted certificate issued by a Certificate Authority (CA). Click on Activation Options to learn more. Groups To create your network topology you must first create a group. A group represents a logical container of endpoints (or machines) that can communicate through the Relay. You can create multiple groups allowing you to manage network communication differently. For example you could create a DEVELOPMENT group and a PRODUCTION group. To add an endpoint you must first install an agent that will create a virtual adapter on the machine on which it is installed (as discussed in the previous section). Once you have created a group and installed the agents, the machines will appear in the Windows Azure management portal and you can start assigning machines to groups. The next figure shows you that I created a group called LocalGroup and assigned two machines (both on-premise) to that group. Groups and Computers in Azure Connect As I mentioned previously you can allow these machines to establish a network connection. To do this, you must enable the Interconnected option in the group. The following diagram shows you the definition of the group. In this topology I chose to include local machines only, but I could also add worker roles and web roles in the Azure Roles section (you must first activate your roles, as discussed previously). You could also add other Groups, allowing you to manage inter-group communication. Defining a Group in Azure Connect Testing the Connection Now that my agents have been installed on my two machines, the group defined and the Interconnected option checked, I can test the connection between my machines. The next screenshot shows you that I sent a PING request to DEVLAP02 from DEVDSK02. The PING request was successful. Note however that the time is in the hundreds of milliseconds on average. That is to be expected because the machines are connecting through the Relay located in the cloud. Going through the Relay introduces an extra hop in the communication chain, so if your systems rely on high performance, you may want to conduct some basic performance tests. Sending a PING Request Through The Relay Conclusion As you can see, creating a network topology between machines using the Azure Connect service is simple. It took me less than five minutes to create the above configuration, including the time it took to install the Azure Connect agents on the two machines. The flexibility of Azure Connect allows you to create a virtual network between disparate environments, as long as your operating systems are supported by the agent. For more information on Azure Connect, visit the MSDN website at http://msdn.microsoft.com/en-us/library/windowsazure/gg432997.aspx. About Herve Roggero Herve Roggero, Windows Azure MVP, is the founder of Blue Syntax Consulting, a company specialized in cloud computing products and services. Herve's experience includes software development, architecture, database administration and senior management with both global corporations and startup companies. Herve holds multiple certifications, including an MCDBA, MCSE, MCSD. He also holds a Master's degree in Business Administration from Indiana University. Herve is the co-author of "PRO SQL Azure" from Apress and runs the Azure Florida Association (on LinkedIn: http://www.linkedin.com/groups?gid=4177626). For more information on Blue Syntax Consulting, visit www.bluesyntax.net. Special Thanks I would like thank those that helped me figure out how Azure Connect works: Marcel Meijer - http://blogs.msmvps.com/marcelmeijer/ Michael Wood - Http://www.mvwood.com Glenn Block - http://www.codebetter.com/glennblock Yves Goeleven - http://cloudshaper.wordpress.com/ Sandrino Di Mattia - http://fabriccontroller.net/ Mike Martin - http://techmike2kx.wordpress.com

    Read the article

  • How to set Grub to automatically load Xen kernel

    - by Cerin
    How do you configure Grub to automatically use the Xen kernel under Ubuntu 11.10? No matter what I do, it loads the first menuentry. The only way I can get it to load Xen is to manually select the kernel, which I can't do if I have to reboot the server remotely, or there's a power failure and the machine automatically boots up when power's restored, etc. It's driving me nuts. In my /boot/grub/grub.cfg, the Xen kernel is at index 4 (i.e. it's the 5th menuentry). So I've tried: Setting GRUB_DEFAULT=4, and running sudo update-grub Setting GRUB_DEFAULT=saved and GRUB_SAVEDEFAULT=true, and running sudo update-grub Setting GRUB_DEFAULT="Ubuntu GNU/Linux, with Xen 4.1-amd64 and Linux 3.0.0-16-server", and running sudo update-grub None of these work. It continues to load the first menuentry, which is "Ubuntu, with Linux 3.0.0-16-server". Below is my current /boot/grub/grub.cfg. What am I doing wrong? # # DO NOT EDIT THIS FILE # # It is automatically generated by grub-mkconfig using templates # from /etc/grub.d and settings from /etc/default/grub # ### BEGIN /etc/grub.d/00_header ### if [ -s $prefix/grubenv ]; then set have_grubenv=true load_env fi set default="Ubuntu GNU/Linux, with Xen 4.1-amd64 and Linux 3.0.0-16-server" if [ "${prev_saved_entry}" ]; then set saved_entry="${prev_saved_entry}" save_env saved_entry set prev_saved_entry= save_env prev_saved_entry set boot_once=true fi function savedefault { if [ -z "${boot_once}" ]; then saved_entry="${chosen}" save_env saved_entry fi } function recordfail { set recordfail=1 if [ -n "${have_grubenv}" ]; then if [ -z "${boot_once}" ]; then save_env recordfail; fi; fi } function load_video { insmod vbe insmod vga insmod video_bochs insmod video_cirrus } insmod raid insmod mdraid1x insmod part_msdos insmod part_msdos insmod ext2 set root='(mduuid/be73165bc31d6f5cd00d05036c7b964f)' search --no-floppy --fs-uuid --set=root d72bad3f-9ed7-44b9-b3d1-d7af9f62a8ac if loadfont /usr/share/grub/unicode.pf2 ; then set gfxmode=auto load_video insmod gfxterm insmod raid insmod mdraid1x insmod part_msdos insmod part_msdos insmod ext2 set root='(mduuid/be73165bc31d6f5cd00d05036c7b964f)' search --no-floppy --fs-uuid --set=root d72bad3f-9ed7-44b9-b3d1-d7af9f62a8ac set locale_dir=($root)/boot/grub/locale set lang=en_US insmod gettext fi terminal_output gfxterm if [ "${recordfail}" = 1 ]; then set timeout=-1 else set timeout=2 fi ### END /etc/grub.d/00_header ### ### BEGIN /etc/grub.d/05_debian_theme ### set menu_color_normal=white/black set menu_color_highlight=black/light-gray if background_color 44,0,30; then clear fi ### END /etc/grub.d/05_debian_theme ### ### BEGIN /etc/grub.d/10_linux ### if [ ${recordfail} != 1 ]; then if [ -e ${prefix}/gfxblacklist.txt ]; then if hwmatch ${prefix}/gfxblacklist.txt 3; then if [ ${match} = 0 ]; then set linux_gfx_mode=keep else set linux_gfx_mode=text fi else set linux_gfx_mode=text fi else set linux_gfx_mode=keep fi else set linux_gfx_mode=text fi export linux_gfx_mode if [ "$linux_gfx_mode" != "text" ]; then load_video; fi menuentry 'Ubuntu, with Linux 3.0.0-16-server' --class ubuntu --class gnu-linux --class gnu --class os { recordfail set gfxpayload=$linux_gfx_mode insmod gzio insmod raid insmod mdraid1x insmod part_msdos insmod part_msdos insmod ext2 set root='(mduuid/be73165bc31d6f5cd00d05036c7b964f)' search --no-floppy --fs-uuid --set=root d72bad3f-9ed7-44b9-b3d1-d7af9f62a8ac linux /boot/vmlinuz-3.0.0-16-server root=UUID=d72bad3f-9ed7-44b9-b3d1-d7af9f62a8ac ro initrd /boot/initrd.img-3.0.0-16-server } menuentry 'Ubuntu, with Linux 3.0.0-16-server (recovery mode)' --class ubuntu --class gnu-linux --class gnu --class os { recordfail insmod gzio insmod raid insmod mdraid1x insmod part_msdos insmod part_msdos insmod ext2 set root='(mduuid/be73165bc31d6f5cd00d05036c7b964f)' search --no-floppy --fs-uuid --set=root d72bad3f-9ed7-44b9-b3d1-d7af9f62a8ac echo 'Loading Linux 3.0.0-16-server ...' linux /boot/vmlinuz-3.0.0-16-server root=UUID=d72bad3f-9ed7-44b9-b3d1-d7af9f62a8ac ro recovery nomodeset echo 'Loading initial ramdisk ...' initrd /boot/initrd.img-3.0.0-16-server } submenu "Previous Linux versions" { menuentry 'Ubuntu, with Linux 3.0.0-12-server' --class ubuntu --class gnu-linux --class gnu --class os { recordfail set gfxpayload=$linux_gfx_mode insmod gzio insmod raid insmod mdraid1x insmod part_msdos insmod part_msdos insmod ext2 set root='(mduuid/be73165bc31d6f5cd00d05036c7b964f)' search --no-floppy --fs-uuid --set=root d72bad3f-9ed7-44b9-b3d1-d7af9f62a8ac linux /boot/vmlinuz-3.0.0-12-server root=UUID=d72bad3f-9ed7-44b9-b3d1-d7af9f62a8ac ro initrd /boot/initrd.img-3.0.0-12-server } menuentry 'Ubuntu, with Linux 3.0.0-12-server (recovery mode)' --class ubuntu --class gnu-linux --class gnu --class os { recordfail insmod gzio insmod raid insmod mdraid1x insmod part_msdos insmod part_msdos insmod ext2 set root='(mduuid/be73165bc31d6f5cd00d05036c7b964f)' search --no-floppy --fs-uuid --set=root d72bad3f-9ed7-44b9-b3d1-d7af9f62a8ac echo 'Loading Linux 3.0.0-12-server ...' linux /boot/vmlinuz-3.0.0-12-server root=UUID=d72bad3f-9ed7-44b9-b3d1-d7af9f62a8ac ro recovery nomodeset echo 'Loading initial ramdisk ...' initrd /boot/initrd.img-3.0.0-12-server } } ### END /etc/grub.d/10_linux ### ### BEGIN /etc/grub.d/20_linux_xen ### submenu "Xen 4.1-amd64" { menuentry 'Ubuntu GNU/Linux, with Xen 4.1-amd64 and Linux 3.0.0-16-server' --class ubuntu --class gnu-linux --class gnu --class os --class xen { insmod raid insmod mdraid1x insmod part_msdos insmod part_msdos insmod ext2 set root='(mduuid/be73165bc31d6f5cd00d05036c7b964f)' search --no-floppy --fs-uuid --set=root d72bad3f-9ed7-44b9-b3d1-d7af9f62a8ac echo 'Loading Xen 4.1-amd64 ...' multiboot /boot/xen-4.1-amd64.gz placeholder echo 'Loading Linux 3.0.0-16-server ...' module /boot/vmlinuz-3.0.0-16-server placeholder root=UUID=d72bad3f-9ed7-44b9-b3d1-d7af9f62a8ac ro echo 'Loading initial ramdisk ...' module /boot/initrd.img-3.0.0-16-server } menuentry 'Ubuntu GNU/Linux, with Xen 4.1-amd64 and Linux 3.0.0-16-server (recovery mode)' --class ubuntu --class gnu-linux --class gnu --class os --class xen { insmod raid insmod mdraid1x insmod part_msdos insmod part_msdos insmod ext2 set root='(mduuid/be73165bc31d6f5cd00d05036c7b964f)' search --no-floppy --fs-uuid --set=root d72bad3f-9ed7-44b9-b3d1-d7af9f62a8ac echo 'Loading Xen 4.1-amd64 ...' multiboot /boot/xen-4.1-amd64.gz placeholder echo 'Loading Linux 3.0.0-16-server ...' module /boot/vmlinuz-3.0.0-16-server placeholder root=UUID=d72bad3f-9ed7-44b9-b3d1-d7af9f62a8ac ro single echo 'Loading initial ramdisk ...' module /boot/initrd.img-3.0.0-16-server } menuentry 'Ubuntu GNU/Linux, with Xen 4.1-amd64 and Linux 3.0.0-12-server' --class ubuntu --class gnu-linux --class gnu --class os --class xen { insmod raid insmod mdraid1x insmod part_msdos insmod part_msdos insmod ext2 set root='(mduuid/be73165bc31d6f5cd00d05036c7b964f)' search --no-floppy --fs-uuid --set=root d72bad3f-9ed7-44b9-b3d1-d7af9f62a8ac echo 'Loading Xen 4.1-amd64 ...' multiboot /boot/xen-4.1-amd64.gz placeholder echo 'Loading Linux 3.0.0-12-server ...' module /boot/vmlinuz-3.0.0-12-server placeholder root=UUID=d72bad3f-9ed7-44b9-b3d1-d7af9f62a8ac ro echo 'Loading initial ramdisk ...' module /boot/initrd.img-3.0.0-12-server } menuentry 'Ubuntu GNU/Linux, with Xen 4.1-amd64 and Linux 3.0.0-12-server (recovery mode)' --class ubuntu --class gnu-linux --class gnu --class os --class xen { insmod raid insmod mdraid1x insmod part_msdos insmod part_msdos insmod ext2 set root='(mduuid/be73165bc31d6f5cd00d05036c7b964f)' search --no-floppy --fs-uuid --set=root d72bad3f-9ed7-44b9-b3d1-d7af9f62a8ac echo 'Loading Xen 4.1-amd64 ...' multiboot /boot/xen-4.1-amd64.gz placeholder echo 'Loading Linux 3.0.0-12-server ...' module /boot/vmlinuz-3.0.0-12-server placeholder root=UUID=d72bad3f-9ed7-44b9-b3d1-d7af9f62a8ac ro single echo 'Loading initial ramdisk ...' module /boot/initrd.img-3.0.0-12-server } } ### END /etc/grub.d/20_linux_xen ### ### BEGIN /etc/grub.d/20_memtest86+ ### menuentry "Memory test (memtest86+)" { insmod raid insmod mdraid1x insmod part_msdos insmod part_msdos insmod ext2 set root='(mduuid/be73165bc31d6f5cd00d05036c7b964f)' search --no-floppy --fs-uuid --set=root d72bad3f-9ed7-44b9-b3d1-d7af9f62a8ac linux16 /boot/memtest86+.bin } menuentry "Memory test (memtest86+, serial console 115200)" { insmod raid insmod mdraid1x insmod part_msdos insmod part_msdos insmod ext2 set root='(mduuid/be73165bc31d6f5cd00d05036c7b964f)' search --no-floppy --fs-uuid --set=root d72bad3f-9ed7-44b9-b3d1-d7af9f62a8ac linux16 /boot/memtest86+.bin console=ttyS0,115200n8 } ### END /etc/grub.d/20_memtest86+ ### ### BEGIN /etc/grub.d/30_os-prober ### ### END /etc/grub.d/30_os-prober ### ### BEGIN /etc/grub.d/40_custom ### # This file provides an easy way to add custom menu entries. Simply type the # menu entries you want to add after this comment. Be careful not to change # the 'exec tail' line above. ### END /etc/grub.d/40_custom ### ### BEGIN /etc/grub.d/41_custom ### if [ -f $prefix/custom.cfg ]; then source $prefix/custom.cfg; fi ### END /etc/grub.d/41_custom ###

    Read the article

  • The Sensemaking Spectrum for Business Analytics: Translating from Data to Business Through Analysis

    - by Joe Lamantia
    One of the most compelling outcomes of our strategic research efforts over the past several years is a growing vocabulary that articulates our cumulative understanding of the deep structure of the domains of discovery and business analytics. Modes are one example of the deep structure we’ve found.  After looking at discovery activities across a very wide range of industries, question types, business needs, and problem solving approaches, we've identified distinct and recurring kinds of sensemaking activity, independent of context.  We label these activities Modes: Explore, compare, and comprehend are three of the nine recognizable modes.  Modes describe *how* people go about realizing insights.  (Read more about the programmatic research and formal academic grounding and discussion of the modes here: https://www.researchgate.net/publication/235971352_A_Taxonomy_of_Enterprise_Search_and_Discovery) By analogy to languages, modes are the 'verbs' of discovery activity.  When applied to the practical questions of product strategy and development, the modes of discovery allow one to identify what kinds of analytical activity a product, platform, or solution needs to support across a spread of usage scenarios, and then make concrete and well-informed decisions about every aspect of the solution, from high-level capabilities, to which specific types of information visualizations better enable these scenarios for the types of data users will analyze. The modes are a powerful generative tool for product making, but if you've spent time with young children, or had a really bad hangover (or both at the same time...), you understand the difficult of communicating using only verbs.  So I'm happy to share that we've found traction on another facet of the deep structure of discovery and business analytics.  Continuing the language analogy, we've identified some of the ‘nouns’ in the language of discovery: specifically, the consistently recurring aspects of a business that people are looking for insight into.  We call these discovery Subjects, since they identify *what* people focus on during discovery efforts, rather than *how* they go about discovery as with the Modes. Defining the collection of Subjects people repeatedly focus on allows us to understand and articulate sense making needs and activity in more specific, consistent, and complete fashion.  In combination with the Modes, we can use Subjects to concretely identify and define scenarios that describe people’s analytical needs and goals.  For example, a scenario such as ‘Explore [a Mode] the attrition rates [a Measure, one type of Subject] of our largest customers [Entities, another type of Subject] clearly captures the nature of the activity — exploration of trends vs. deep analysis of underlying factors — and the central focus — attrition rates for customers above a certain set of size criteria — from which follow many of the specifics needed to address this scenario in terms of data, analytical tools, and methods. We can also use Subjects to translate effectively between the different perspectives that shape discovery efforts, reducing ambiguity and increasing impact on both sides the perspective divide.  For example, from the language of business, which often motivates analytical work by asking questions in business terms, to the perspective of analysis.  The question posed to a Data Scientist or analyst may be something like “Why are sales of our new kinds of potato chips to our largest customers fluctuating unexpectedly this year?” or “Where can innovate, by expanding our product portfolio to meet unmet needs?”.  Analysts translate questions and beliefs like these into one or more empirical discovery efforts that more formally and granularly indicate the plan, methods, tools, and desired outcomes of analysis.  From the perspective of analysis this second question might become, “Which customer needs of type ‘A', identified and measured in terms of ‘B’, that are not directly or indirectly addressed by any of our current products, offer 'X' potential for ‘Y' positive return on the investment ‘Z' required to launch a new offering, in time frame ‘W’?  And how do these compare to each other?”.  Translation also happens from the perspective of analysis to the perspective of data; in terms of availability, quality, completeness, format, volume, etc. By implication, we are proposing that most working organizations — small and large, for profit and non-profit, domestic and international, and in the majority of industries — can be described for analytical purposes using this collection of Subjects.  This is a bold claim, but simplified articulation of complexity is one of the primary goals of sensemaking frameworks such as this one.  (And, yes, this is in fact a framework for making sense of sensemaking as a category of activity - but we’re not considering the recursive aspects of this exercise at the moment.) Compellingly, we can place the collection of subjects on a single continuum — we call it the Sensemaking Spectrum — that simply and coherently illustrates some of the most important relationships between the different types of Subjects, and also illuminates several of the fundamental dynamics shaping business analytics as a domain.  As a corollary, the Sensemaking Spectrum also suggests innovation opportunities for products and services related to business analytics. The first illustration below shows Subjects arrayed along the Sensemaking Spectrum; the second illustration presents examples of each kind of Subject.  Subjects appear in colors ranging from blue to reddish-orange, reflecting their place along the Spectrum, which indicates whether a Subject addresses more the viewpoint of systems and data (Data centric and blue), or people (User centric and orange).  This axis is shown explicitly above the Spectrum.  Annotations suggest how Subjects align with the three significant perspectives of Data, Analysis, and Business that shape business analytics activity.  This rendering makes explicit the translation and bridging function of Analysts as a role, and analysis as an activity. Subjects are best understood as fuzzy categories [http://georgelakoff.files.wordpress.com/2011/01/hedges-a-study-in-meaning-criteria-and-the-logic-of-fuzzy-concepts-journal-of-philosophical-logic-2-lakoff-19731.pdf], rather than tightly defined buckets.  For each Subject, we suggest some of the most common examples: Entities may be physical things such as named products, or locations (a building, or a city); they could be Concepts, such as satisfaction; or they could be Relationships between entities, such as the variety of possible connections that define linkage in social networks.  Likewise, Events may indicate a time and place in the dictionary sense; or they may be Transactions involving named entities; or take the form of Signals, such as ‘some Measure had some value at some time’ - what many enterprises understand as alerts.   The central story of the Spectrum is that though consumers of analytical insights (represented here by the Business perspective) need to work in terms of Subjects that are directly meaningful to their perspective — such as Themes, Plans, and Goals — the working realities of data (condition, structure, availability, completeness, cost) and the changing nature of most discovery efforts make direct engagement with source data in this fashion impossible.  Accordingly, business analytics as a domain is structured around the fundamental assumption that sense making depends on analytical transformation of data.  Analytical activity incrementally synthesizes more complex and larger scope Subjects from data in its starting condition, accumulating insight (and value) by moving through a progression of stages in which increasingly meaningful Subjects are iteratively synthesized from the data, and recombined with other Subjects.  The end goal of  ‘laddering’ successive transformations is to enable sense making from the business perspective, rather than the analytical perspective.Synthesis through laddering is typically accomplished by specialized Analysts using dedicated tools and methods. Beginning with some motivating question such as seeking opportunities to increase the efficiency (a Theme) of fulfillment processes to reach some level of profitability by the end of the year (Plan), Analysts will iteratively wrangle and transform source data Records, Values and Attributes into recognizable Entities, such as Products, that can be combined with Measures or other data into the Events (shipment of orders) that indicate the workings of the business.  More complex Subjects (to the right of the Spectrum) are composed of or make reference to less complex Subjects: a business Process such as Fulfillment will include Activities such as confirming, packing, and then shipping orders.  These Activities occur within or are conducted by organizational units such as teams of staff or partner firms (Networks), composed of Entities which are structured via Relationships, such as supplier and buyer.  The fulfillment process will involve other types of Entities, such as the products or services the business provides.  The success of the fulfillment process overall may be judged according to a sophisticated operating efficiency Model, which includes tiered Measures of business activity and health for the transactions and activities included.  All of this may be interpreted through an understanding of the operational domain of the businesses supply chain (a Domain).   We'll discuss the Spectrum in more depth in succeeding posts.

    Read the article

  • Combining Shared Secret and Username Token – Azure Service Bus

    - by Michael Stephenson
    As discussed in the introduction article this walkthrough will explain how you can implement WCF security with the Windows Azure Service Bus to ensure that you can protect your endpoint in the cloud with a shared secret but also flow through a username token so that in your listening WCF service you will be able to identify who sent the message. This could either be in the form of an application or a user depending on how you want to use your token. Prerequisites Before going into the walk through I want to explain a few assumptions about the scenario we are implementing but to keep the article shorter I am not going to walk through all of the steps in how to setup some of this. In the solution we have a simple console application which will represent the client application. There is also the services WCF application which contains the WCF service we will expose via the Windows Azure Service Bus. The WCF Service application in this example was hosted in IIS 7 on Windows 2008 R2 with AppFabric Server installed and configured to auto-start the WCF listening services. I am not going to go through significant detail around the IIS setup because it should not matter in relation to this article however if you want to understand more about how to configure WCF and IIS for such a scenario please refer to the following paper which goes into a lot of detail about how to configure this. The link is: http://tinyurl.com/8s5nwrz   The Service Component To begin with let's look at the service component and how it can be configured to listen to the service bus using a shared secret but to also accept a username token from the client. In the sample the service component is called Acme.Azure.ServiceBus.Poc.UN.Services. It has a single service which is the Visual Studio template for a WCF service when you add a new WCF Service Application so we have a service called Service1 with its Echo method. Nothing special so far!.... The next step is to look at the web.config file to see how we have configured the WCF service. In the services section of the WCF configuration you can see I have created my service and I have created a local endpoint which I simply used to do a little bit of diagnostics and to check it was working, but more importantly there is the Windows Azure endpoint which is using the ws2007HttpRelayBinding (note that this should also work just the same if your using netTcpRelayBinding). The key points to note on the above picture are the service behavior called MyServiceBehaviour and the service bus endpoints behavior called MyEndpointBehaviour. We will go into these in more detail later.   The Relay Binding The relay binding for the service has been configured to use the TransportWithMessageCredential security mode. This is the important bit where the transport security really relates to the interaction between the service and listening to the Azure Service Bus and the message credential is where we will use our username token like we have specified in the message/clientCrentialType attribute. Note also that we have left the relayClientAuthenticationType set to RelayAccessToken. This means that authentication will be made against ACS for accessing the service bus and messages will not be accepted from any sender who has not been authenticated by ACS.   The Endpoint Behaviour In the below picture you can see the endpoint behavior which is configured to use the shared secret client credential for accessing the service bus and also for diagnostic purposes I have included the service registry element. Hopefully if you are familiar with using Windows Azure Service Bus relay feature the above is very familiar to you and this is a very common setup for this section. There is nothing specific to the username token implementation here. The Service Behaviour Now we come to the bit with most of the username token bits in it. When you configure the service behavior I have included the serviceCredentials element and then setup to use userNameAuthentication and you can see that I have created my own custom username token validator.   This setup means that WCF will hand off to my class for validating the username token details. I have also added the serviceSecurityAudit element to give me a simple auditing of access capability. My UsernamePassword Validator The below picture shows you the details of the username password validator class I have implemented. WCF will hand off to this class when validating the token and give me a nice way to check the token credentials against an on-premise store. You have all of the validation features with a non-service bus WCF implementation available such as validating the username password against active directory or ASP.net membership features or as in my case above something much simpler.   The Client Now let's take a look at the client side of this solution and how we can configure the client to authenticate against ACS but also send a username token over to the service component so it can implement additional security checks on-premise. I have a console application and in the program class I want to use the proxy generated with Add Service Reference to send a message via the Azure Service Bus. You can see in my WCF client configuration below I have setup my details for the azure service bus url and am using the ws2007HttpRelayBinding. Next is my configuration for the relay binding. You can see below I have configured security to use TransportWithMessageCredential so we will flow the username token with the message and also the RelayAccessToken relayClientAuthenticationType which means the component will validate against ACS before being allowed to access the relay endpoint to send a message.     After the binding we need to configure the endpoint behavior like in the below picture. This is the normal configuration to use a shared secret for accessing a Service Bus endpoint.   Finally below we have the code of the client in the console application which will call the service bus. You can see that we have created our proxy and then made a normal call to a WCF service but this time we have also set the ClientCredentials to use the appropriate username and password which will be flown through the service bus and to our service which will validate them.     Conclusion As you can see from the above walkthrough it is not too difficult to configure a service to use both a shared secret and username token at the same time. This gives you the power and protection offered by the access control service in the cloud but also the ability to flow additional tokens to the on-premise component for additional security features to be implemented. Sample The sample used in this post is available at the following location: https://s3.amazonaws.com/CSCBlogSamples/Acme.Azure.ServiceBus.Poc.UN.zip

    Read the article

  • Fun with Aggregates

    - by Paul White
    There are interesting things to be learned from even the simplest queries.  For example, imagine you are given the task of writing a query to list AdventureWorks product names where the product has at least one entry in the transaction history table, but fewer than ten. One possible query to meet that specification is: SELECT p.Name FROM Production.Product AS p JOIN Production.TransactionHistory AS th ON p.ProductID = th.ProductID GROUP BY p.ProductID, p.Name HAVING COUNT_BIG(*) < 10; That query correctly returns 23 rows (execution plan and data sample shown below): The execution plan looks a bit different from the written form of the query: the base tables are accessed in reverse order, and the aggregation is performed before the join.  The general idea is to read all rows from the history table, compute the count of rows grouped by ProductID, merge join the results to the Product table on ProductID, and finally filter to only return rows where the count is less than ten. This ‘fully-optimized’ plan has an estimated cost of around 0.33 units.  The reason for the quote marks there is that this plan is not quite as optimal as it could be – surely it would make sense to push the Filter down past the join too?  To answer that, let’s look at some other ways to formulate this query.  This being SQL, there are any number of ways to write logically-equivalent query specifications, so we’ll just look at a couple of interesting ones.  The first query is an attempt to reverse-engineer T-SQL from the optimized query plan shown above.  It joins the result of pre-aggregating the history table to the Product table before filtering: SELECT p.Name FROM ( SELECT th.ProductID, cnt = COUNT_BIG(*) FROM Production.TransactionHistory AS th GROUP BY th.ProductID ) AS q1 JOIN Production.Product AS p ON p.ProductID = q1.ProductID WHERE q1.cnt < 10; Perhaps a little surprisingly, we get a slightly different execution plan: The results are the same (23 rows) but this time the Filter is pushed below the join!  The optimizer chooses nested loops for the join, because the cardinality estimate for rows passing the Filter is a bit low (estimate 1 versus 23 actual), though you can force a merge join with a hint and the Filter still appears below the join.  In yet another variation, the < 10 predicate can be ‘manually pushed’ by specifying it in a HAVING clause in the “q1” sub-query instead of in the WHERE clause as written above. The reason this predicate can be pushed past the join in this query form, but not in the original formulation is simply an optimizer limitation – it does make efforts (primarily during the simplification phase) to encourage logically-equivalent query specifications to produce the same execution plan, but the implementation is not completely comprehensive. Moving on to a second example, the following query specification results from phrasing the requirement as “list the products where there exists fewer than ten correlated rows in the history table”: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) < 10 ); Unfortunately, this query produces an incorrect result (86 rows): The problem is that it lists products with no history rows, though the reasons are interesting.  The COUNT_BIG(*) in the EXISTS clause is a scalar aggregate (meaning there is no GROUP BY clause) and scalar aggregates always produce a value, even when the input is an empty set.  In the case of the COUNT aggregate, the result of aggregating the empty set is zero (the other standard aggregates produce a NULL).  To make the point really clear, let’s look at product 709, which happens to be one for which no history rows exist: -- Scalar aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709;   -- Vector aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709 GROUP BY th.ProductID; The estimated execution plans for these two statements are almost identical: You might expect the Stream Aggregate to have a Group By for the second statement, but this is not the case.  The query includes an equality comparison to a constant value (709), so all qualified rows are guaranteed to have the same value for ProductID and the Group By is optimized away. In fact there are some minor differences between the two plans (the first is auto-parameterized and qualifies for trivial plan, whereas the second is not auto-parameterized and requires cost-based optimization), but there is nothing to indicate that one is a scalar aggregate and the other is a vector aggregate.  This is something I would like to see exposed in show plan so I suggested it on Connect.  Anyway, the results of running the two queries show the difference at runtime: The scalar aggregate (no GROUP BY) returns a result of zero, whereas the vector aggregate (with a GROUP BY clause) returns nothing at all.  Returning to our EXISTS query, we could ‘fix’ it by changing the HAVING clause to reject rows where the scalar aggregate returns zero: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) BETWEEN 1 AND 9 ); The query now returns the correct 23 rows: Unfortunately, the execution plan is less efficient now – it has an estimated cost of 0.78 compared to 0.33 for the earlier plans.  Let’s try adding a redundant GROUP BY instead of changing the HAVING clause: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY th.ProductID HAVING COUNT_BIG(*) < 10 ); Not only do we now get correct results (23 rows), this is the execution plan: I like to compare that plan to quantum physics: if you don’t find it shocking, you haven’t understood it properly :)  The simple addition of a redundant GROUP BY has resulted in the EXISTS form of the query being transformed into exactly the same optimal plan we found earlier.  What’s more, in SQL Server 2008 and later, we can replace the odd-looking GROUP BY with an explicit GROUP BY on the empty set: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ); I offer that as an alternative because some people find it more intuitive (and it perhaps has more geek value too).  Whichever way you prefer, it’s rather satisfying to note that the result of the sub-query does not exist for a particular correlated value where a vector aggregate is used (the scalar COUNT aggregate always returns a value, even if zero, so it always ‘EXISTS’ regardless which ProductID is logically being evaluated). The following query forms also produce the optimal plan and correct results, so long as a vector aggregate is used (you can probably find more equivalent query forms): WHERE Clause SELECT p.Name FROM Production.Product AS p WHERE ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) < 10; APPLY SELECT p.Name FROM Production.Product AS p CROSS APPLY ( SELECT NULL FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ) AS ca (dummy); FROM Clause SELECT q1.Name FROM ( SELECT p.Name, cnt = ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) FROM Production.Product AS p ) AS q1 WHERE q1.cnt < 10; This last example uses SUM(1) instead of COUNT and does not require a vector aggregate…you should be able to work out why :) SELECT q.Name FROM ( SELECT p.Name, cnt = ( SELECT SUM(1) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID ) FROM Production.Product AS p ) AS q WHERE q.cnt < 10; The semantics of SQL aggregates are rather odd in places.  It definitely pays to get to know the rules, and to be careful to check whether your queries are using scalar or vector aggregates.  As we have seen, query plans do not show in which ‘mode’ an aggregate is running and getting it wrong can cause poor performance, wrong results, or both. © 2012 Paul White Twitter: @SQL_Kiwi email: [email protected]

    Read the article

< Previous Page | 422 423 424 425 426 427 428 429 430 431 432 433  | Next Page >