Search Results

Search found 9983 results on 400 pages for 'fuzzy c means'.

Page 125/400 | < Previous Page | 121 122 123 124 125 126 127 128 129 130 131 132 | Next Page >

VS 2010 Debugger Improvements (BreakPoints, DataTips, Import/Export)

- by ScottGu

This is the twenty-first in a series of blog posts I’m doing on the VS 2010 and .NET 4 release. Today’s blog post covers a few of the nice usability improvements coming with the VS 2010 debugger. The VS 2010 debugger has a ton of great new capabilities. Features like Intellitrace (aka historical debugging), the new parallel/multithreaded debugging capabilities, and dump debuging support typically get a ton of (well deserved) buzz and attention when people talk about the debugging improvements with this release. I’ll be doing blog posts in the future that demonstrate how to take advantage of them as well. With today’s post, though, I thought I’d start off by covering a few small, but nice, debugger usability improvements that were also included with the VS 2010 release, and which I think you’ll find useful. Breakpoint Labels VS 2010 includes new support for better managing debugger breakpoints. One particularly useful feature is called “Breakpoint Labels” – it enables much better grouping and filtering of breakpoints within a project or across a solution. With previous releases of Visual Studio you had to manage each debugger breakpoint as a separate item. Managing each breakpoint separately can be a pain with large projects and for cases when you want to maintain “logical groups” of breakpoints that you turn on/off depending on what you are debugging. Using the new VS 2010 “breakpoint labeling” feature you can now name these “groups” of breakpoints and manage them as a unit. Grouping Multiple Breakpoints Together using a Label Below is a screen-shot of the breakpoints window within Visual Studio 2010. This lists all of the breakpoints defined within my solution (which in this case is the ASP.NET MVC 2 code base): The first and last breakpoint in the list above breaks into the debugger when a Controller instance is created or released by the ASP.NET MVC Framework. Using VS 2010, I can now select these two breakpoints, right-click, and then select the new “Edit labels…” menu command to give them a common label/name (making them easier to find and manage): Below is the dialog that appears when I select the “Edit labels” command. We can use it to create a new string label for our breakpoints or select an existing one we have already defined. In this case we’ll create a new label called “Lifetime Management” to describe what these two breakpoints cover: When we press the OK button our two selected breakpoints will be grouped under the newly created “Lifetime Management” label: Filtering/Sorting Breakpoints by Label We can use the “Search” combobox to quickly filter/sort breakpoints by label. Below we are only showing those breakpoints with the “Lifetime Management” label: Toggling Breakpoints On/Off by Label We can also toggle sets of breakpoints on/off by label group. We can simply filter by the label group, do a Ctrl-A to select all the breakpoints, and then enable/disable all of them with a single click: Importing/Exporting Breakpoints VS 2010 now supports importing/exporting breakpoints to XML files – which you can then pass off to another developer, attach to a bug report, or simply re-load later. To export only a subset of breakpoints, you can filter by a particular label and then click the “Export breakpoint” button in the Breakpoints window: Above I’ve filtered my breakpoint list to only export two particular breakpoints (specific to a bug that I’m chasing down). I can export these breakpoints to an XML file and then attach it to a bug report or email – which will enable another developer to easily setup the debugger in the correct state to investigate it on a separate machine. Pinned DataTips Visual Studio 2010 also includes some nice new “DataTip pinning” features that enable you to better see and track variable and expression values when in the debugger. Simply hover over a variable or expression within the debugger to expose its DataTip (which is a tooltip that displays its value) – and then click the new “pin” button on it to make the DataTip always visible: You can “pin” any number of DataTips you want onto the screen. In addition to pinning top-level variables, you can also drill into the sub-properties on variables and pin them as well. Below I’ve “pinned” three variables: “category”, “Request.RawUrl” and “Request.LogonUserIdentity.Name”. Note that these last two variable are sub-properties of the “Request” object. Associating Comments with Pinned DataTips Hovering over a pinned DataTip exposes some additional UI within the debugger: Clicking the comment button at the bottom of this UI expands the DataTip - and allows you to optionally add a comment with it: This makes it really easy to attach and track debugging notes: Pinned DataTips are usable across both Debug Sessions and Visual Studio Sessions Pinned DataTips can be used across multiple debugger sessions. This means that if you stop the debugger, make a code change, and then recompile and start a new debug session - any pinned DataTips will still be there, along with any comments you associate with them. Pinned DataTips can also be used across multiple Visual Studio sessions. This means that if you close your project, shutdown Visual Studio, and then later open the project up again – any pinned DataTips will still be there, along with any comments you associate with them. See the Value from Last Debug Session (Great Code Editor Feature) How many times have you ever stopped the debugger only to go back to your code and say: $#@! – what was the value of that variable again??? One of the nice things about pinned DataTips is that they keep track of their “last value from debug session” – and you can look these values up within the VB/C# code editor even when the debugger is no longer running. DataTips are by default hidden when you are in the code editor and the debugger isn’t running. On the left-hand margin of the code editor, though, you’ll find a push-pin for each pinned DataTip that you’ve previously setup: Hovering your mouse over a pinned DataTip will cause it to display on the screen. Below you can see what happens when I hover over the first pin in the editor - it displays our debug session’s last values for the “Request” object DataTip along with the comment we associated with them: This makes it much easier to keep track of state and conditions as you toggle between code editing mode and debugging mode on your projects. Importing/Exporting Pinned DataTips As I mentioned earlier in this post, pinned DataTips are by default saved across Visual Studio sessions (you don’t need to do anything to enable this). VS 2010 also now supports importing/exporting pinned DataTips to XML files – which you can then pass off to other developers, attach to a bug report, or simply re-load later. Combined with the new support for importing/exporting breakpoints, this makes it much easier for multiple developers to share debugger configurations and collaborate across debug sessions. Summary Visual Studio 2010 includes a bunch of great new debugger features – both big and small. Today’s post shared some of the nice debugger usability improvements. All of the features above are supported with the Visual Studio 2010 Professional edition (the Pinned DataTip features are also supported in the free Visual Studio 2010 Express Editions) I’ll be covering some of the “big big” new debugging features like Intellitrace, parallel/multithreaded debugging, and dump file analysis in future blog posts. Hope this helps, Scott P.S. In addition to blogging, I am also now using Twitter for quick updates and to share links. Follow me at: twitter.com/scottgu

Read the article
Replication Services as ETL extraction tool

- by jorg

In my last blog post I explained the principles of Replication Services and the possibilities it offers in a BI environment. One of the possibilities I described was the use of snapshot replication as an ETL extraction tool: “Snapshot Replication can also be useful in BI environments, if you don’t need a near real-time copy of the database, you can choose to use this form of replication. Next to an alternative for Transactional Replication it can be used to stage data so it can be transformed and moved into the data warehousing environment afterwards. In many solutions I have seen developers create multiple SSIS packages that simply copies data from one or more source systems to a staging database that figures as source for the ETL process. The creation of these packages takes a lot of (boring) time, while Replication Services can do the same in minutes. It is possible to filter out columns and/or records and it can even apply schema changes automatically so I think it offers enough features here. I don’t know how the performance will be and if it really works as good for this purpose as I expect, but I want to try this out soon!” Well I have tried it out and I must say it worked well. I was able to let replication services do work in a fraction of the time it would cost me to do the same in SSIS. What I did was the following: Configure snapshot replication for some Adventure Works tables, this was quite simple and straightforward. Create an SSIS package that executes the snapshot replication on demand and waits for its completion. This is something that you can’t do with out of the box functionality. While configuring the snapshot replication two SQL Agent Jobs are created, one for the creation of the snapshot and one for the distribution of the snapshot. Unfortunately these jobs are asynchronous which means that if you execute them they immediately report back if the job started successfully or not, they do not wait for completion and report its result afterwards. So I had to create an SSIS package that executes the jobs and waits for their completion before the rest of the ETL process continues. Fortunately I was able to create the SSIS package with the desired functionality. I have made a step-by-step guide that will help you configure the snapshot replication and I have uploaded the SSIS package you need to execute it. Configure snapshot replication The first step is to create a publication on the database you want to replicate. Connect to SQL Server Management Studio and right-click Replication, choose for New.. Publication… The New Publication Wizard appears, click Next Choose your “source” database and click Next Choose Snapshot publication and click Next You can now select tables and other objects that you want to publish Expand Tables and select the tables that are needed in your ETL process In the next screen you can add filters on the selected tables which can be very useful. Think about selecting only the last x days of data for example. Its possible to filter out rows and/or columns. In this example I did not apply any filters. Schedule the Snapshot Agent to run at a desired time, by doing this a SQL Agent Job is created which we need to execute from a SSIS package later on. Next you need to set the Security Settings for the Snapshot Agent. Click on the Security Settings button. In this example I ran the Agent under the SQL Server Agent service account. This is not recommended as a security best practice. Fortunately there is an excellent article on TechNet which tells you exactly how to set up the security for replication services. Read it here and make sure you follow the guidelines! On the next screen choose to create the publication at the end of the wizard Give the publication a name (SnapshotTest) and complete the wizard The publication is created and the articles (tables in this case) are added Now the publication is created successfully its time to create a new subscription for this publication. Expand the Replication folder in SSMS and right click Local Subscriptions, choose New Subscriptions The New Subscription Wizard appears Select the publisher on which you just created your publication and select the database and publication (SnapshotTest) You can now choose where the Distribution Agent should run. If it runs at the distributor (push subscriptions) it causes extra processing overhead. If you use a separate server for your ETL process and databases choose to run each agent at its subscriber (pull subscriptions) to reduce the processing overhead at the distributor. Of course we need a database for the subscription and fortunately the Wizard can create it for you. Choose for New database Give the database the desired name, set the desired options and click OK You can now add multiple SQL Server Subscribers which is not necessary in this case but can be very useful. You now need to set the security settings for the Distribution Agent. Click on the …. button Again, in this example I ran the Agent under the SQL Server Agent service account. Read the security best practices here Click Next Make sure you create a synchronization job schedule again. This job is also necessary in the SSIS package later on. Initialize the subscription at first synchronization Select the first box to create the subscription when finishing this wizard Complete the wizard by clicking Finish The subscription will be created In SSMS you see a new database is created, the subscriber. There are no tables or other objects in the database available yet because the replication jobs did not ran yet. Now expand the SQL Server Agent, go to Jobs and search for the job that creates the snapshot: Rename this job to “CreateSnapshot” Now search for the job that distributes the snapshot: Rename this job to “DistributeSnapshot” Create an SSIS package that executes the snapshot replication We now need an SSIS package that will take care of the execution of both jobs. The CreateSnapshot job needs to execute and finish before the DistributeSnapshot job runs. After the DistributeSnapshot job has started the package needs to wait until its finished before the package execution finishes. The Execute SQL Server Agent Job Task is designed to execute SQL Agent Jobs from SSIS. Unfortunately this SSIS task only executes the job and reports back if the job started succesfully or not, it does not report if the job actually completed with success or failure. This is because these jobs are asynchronous. The SSIS package I’ve created does the following: It runs the CreateSnapshot job It checks every 5 seconds if the job is completed with a for loop When the CreateSnapshot job is completed it starts the DistributeSnapshot job And again it waits until the snapshot is delivered before the package will finish successfully Quite simple and the package is ready to use as standalone extract mechanism. After executing the package the replicated tables are added to the subscriber database and are filled with data: Download the SSIS package here (SSIS 2008) Conclusion In this example I only replicated 5 tables, I could create a SSIS package that does the same in approximately the same amount of time. But if I replicated all the 70+ AdventureWorks tables I would save a lot of time and boring work! With replication services you also benefit from the feature that schema changes are applied automatically which means your entire extract phase wont break. Because a snapshot is created using the bcp utility (bulk copy) it’s also quite fast, so the performance will be quite good. Disadvantages of using snapshot replication as extraction tool is the limitation on source systems. You can only choose SQL Server or Oracle databases to act as a publisher. So if you plan to build an extract phase for your ETL process that will invoke a lot of tables think about replication services, it would save you a lot of time and thanks to the Extract SSIS package I’ve created you can perfectly fit it in your usual SSIS ETL process.

Read the article
On StringComparison Values

- by Jesse

When you use the .NET Framework’s String.Equals and String.Compare methods do you use an overloStringComparison enumeration value? If not, you should be because the value provided for that StringComparison argument can have a big impact on the results of your string comparison. The StringComparison enumeration defines values that fall into three different major categories: Culture-sensitive comparison using a specific culture, defaulted to the Thread.CurrentThread.CurrentCulture value (StringComparison.CurrentCulture and StringComparison.CurrentCutlureIgnoreCase) Invariant culture comparison (StringComparison.InvariantCulture and StringComparison.InvariantCultureIgnoreCase) Ordinal (byte-by-byte) comparison of (StringComparison.Ordinal and StringComparison.OrdinalIgnoreCase) There is a lot of great material available that detail the technical ins and outs of these different string comparison approaches. If you’re at all interested in the topic these two MSDN articles are worth a read: Best Practices For Using Strings in the .NET Framework: http://msdn.microsoft.com/en-us/library/dd465121.aspx How To Compare Strings: http://msdn.microsoft.com/en-us/library/cc165449.aspx Those articles cover the technical details of string comparison well enough that I’m not going to reiterate them here other than to say that the upshot is that you typically want to use the culture-sensitive comparison whenever you’re comparing strings that were entered by or will be displayed to users and the ordinal comparison in nearly all other cases. So where does that leave the invariant culture comparisons? The “Best Practices For Using Strings in the .NET Framework” article has the following to say: “On balance, the invariant culture has very few properties that make it useful for comparison. It does comparison in a linguistically relevant manner, which prevents it from guaranteeing full symbolic equivalence, but it is not the choice for display in any culture. One of the few reasons to use StringComparison.InvariantCulture for comparison is to persist ordered data for a cross-culturally identical display. For example, if a large data file that contains a list of sorted identifiers for display accompanies an application, adding to this list would require an insertion with invariant-style sorting.” I don’t know about you, but I feel like that paragraph is a bit lacking. Are there really any “real world” reasons to use the invariant culture comparison? I think the answer to this question is, “yes”, but in order to understand why we should first think about what the invariant culture comparison really does. The invariant culture comparison is really just a culture-sensitive comparison using a special invariant culture (Michael Kaplan has a great post on the history of the invariant culture on his blog: http://blogs.msdn.com/b/michkap/archive/2004/12/29/344136.aspx). This means that the invariant culture comparison will apply the linguistic customs defined by the invariant culture which are guaranteed not to differ between different machines or execution contexts. This sort of consistently does prove useful if you needed to maintain a list of strings that are sorted in a meaningful and consistent way regardless of the user viewing them or the machine on which they are being viewed. Example: Prototype Names Let’s say that you work for a large multi-national toy company with branch offices in 10 different countries. Each year the company would work on 15-25 new toy prototypes each of which is assigned a “code name” while it is under development. Coming up with fun new code names is a big part of the company culture that everyone really enjoys, so to be fair the CEO of the company spent a lot of time coming up with a prototype naming scheme that would be fun for everyone to participate in, fair to all of the different branch locations, and accessible to all members of the organization regardless of the country they were from and the language that they spoke. Each new prototype will get a code name that begins with a letter following the previously created name using the alphabetical order of the Latin/Roman alphabet. Each new year prototype names would start back at “A”. The country that leads the prototype development effort gets to choose the name in their native language. (An appropriate Romanization system will be used for countries where the primary language is not written in the Latin/Roman alphabet. For example, the Pinyin system could be used for Chinese). To avoid repeating names, a list of all current and past prototype names will be maintained on each branch location’s company intranet site. Assuming that maintaining a single pre-sorted list is not feasible among all of the highly distributed intranet implementations, what string comparison method would you use to sort each year’s list of prototype names so that the list is both meaningful and consistent regardless of the country within which the list is being viewed? Sorting the list with a culture-sensitive comparison using the default configured culture on each country’s intranet server the list would probably work most of the time, but subtle differences between cultures could mean that two different people would see a list that was sorted slightly differently. The CEO wants the prototype names to be a unifying aspect of company culture and is adamant that everyone see the the same list sorted in the same order and there’s no way to guarantee a consistent sort across different cultures using the culture-sensitive string comparison rules. The culture-sensitive sort would produce a meaningful list for the specific user viewing it, but it wouldn’t always be consistent between different users. Sorting with the ordinal comparison would certainly be consistent regardless of the user viewing it, but would it be meaningful? Let’s say that the current year’s prototype name list looks like this: Antílope (Spanish) Babouin (French) Cahoun (Czech) Diamond (English) Flosse (German) If you were to sort this list using ordinal rules you’d end up with: Antílope Babouin Diamond Flosse Cahoun This sort is no good because the entry for “C” appears the bottom of the list after “F”. This is because the Czech entry for the letter “C” makes use of a diacritic (accent mark). The ordinal string comparison does a byte-by-byte comparison of the code points that make up each character in the string and the code point for the “C” with the diacritic mark is higher than any letter without a diacritic mark, which pushes that entry to the bottom of the sorted list. The CEO wants each country to be able to create prototype names in their native language, which means we need to allow for names that might begin with letters that have diacritics, so ordinal sorting kills the meaningfulness of the list. As it turns out, this situation is actually well-suited for the invariant culture comparison. The invariant culture accounts for linguistically relevant factors like the use of diacritics but will provide a consistent sort across all machines that perform the sort. Now that we’ve walked through this example, the following line from the “Best Practices For Using Strings in the .NET Framework” makes a lot more sense: One of the few reasons to use StringComparison.InvariantCulture for comparison is to persist ordered data for a cross-culturally identical display That line describes the prototype name example perfectly: we need a way to persist ordered data for a cross-culturally identical display. While this example is 100% made-up, I think it illustrates that there are indeed real-world situations where the invariant culture comparison is useful.

Read the article
General monitoring for SQL Server Analysis Services using Performance Monitor

- by Testas

A recent customer engagement required a setup of a monitoring solution for SSAS, due to the time restrictions placed upon this, native Windows Performance Monitor (Perfmon) and SQL Server Profiler Monitoring Tools was used as using a third party tool would have meant the customer providing an additional monitoring server that was not available.I wanted to outline the performance monitoring counters that was used to monitor the system on which SSAS was running. Due to the slow query performance that was occurring during certain scenarios, perfmon was used to establish if any pressure was being placed on the Disk, CPU or Memory subsystem when concurrent connections access the same query, and Profiler to pinpoint how the query was being managed within SSAS, profiler I will leave for another blogThis guide is not designed to provide a definitive list of what should be used when monitoring SSAS, different situations may require the addition or removal of counters as presented by the situation. However I hope that it serves as a good basis for starting your monitoring of SSAS. I would also like to acknowledge Chris Webb’s awesome chapters from “Expert Cube Development” that also helped shape my monitoring strategy:http://cwebbbi.spaces.live.com/blog/cns!7B84B0F2C239489A!6657.entrySimulating ConnectionsTo simulate the additional connections to the SSAS server whilst monitoring, I used ascmd to simulate multiple connections to the typical and worse performing queries that were identified by the customer. A similar sript can be downloaded from codeplex at http://www.codeplex.com/SQLSrvAnalysisSrvcs. File name: ASCMD_StressTestingScripts.zip. Performance MonitorWithin performance monitor, a counter log was created that contained the list of counters below. The important point to note when running the counter log is that the RUN AS property within the counter log properties should be changed to an account that has rights to the SSAS instance when monitoring MSAS counters. Failure to do so means that the counter log runs under the system account, no errors or warning are given while running the counter log, and it is not until you need to view the MSAS counters that they will not be displayed if run under the default account that has no right to SSAS. If your connection simulation takes hours, this could prove quite frustrating if not done beforehand JThe counters used…… Object Counter Instance Justification System Processor Queue legnth N/A Indicates how many threads are waiting for execution against the processor. If this counter is consistently higher than around 5 when processor utilization approaches 100%, then this is a good indication that there is more work (active threads) available (ready for execution) than the machine's processors are able to handle. System Context Switches/sec N/A Measures how frequently the processor has to switch from user- to kernel-mode to handle a request from a thread running in user mode. The heavier the workload running on your machine, the higher this counter will generally be, but over long term the value of this counter should remain fairly constant. If this counter suddenly starts increasing however, it may be an indicating of a malfunctioning device, especially if the Processor\Interrupts/sec\(_Total) counter on your machine shows a similar unexplained increase Process % Processor Time sqlservr Definately should be used if Processor\% Processor Time\(_Total) is maxing at 100% to assess the effect of the SQL Server process on the processor Process % Processor Time msmdsrv Definately should be used if Processor\% Processor Time\(_Total) is maxing at 100% to assess the effect of the SQL Server process on the processor Process Working Set sqlservr If the Memory\Available bytes counter is decreaing this counter can be run to indicate if the process is consuming larger and larger amounts of RAM. Process(instance)\Working Set measures the size of the working set for each process, which indicates the number of allocated pages the process can address without generating a page fault. Process Working Set msmdsrv If the Memory\Available bytes counter is decreaing this counter can be run to indicate if the process is consuming larger and larger amounts of RAM. Process(instance)\Working Set measures the size of the working set for each process, which indicates the number of allocated pages the process can address without generating a page fault. Processor % Processor Time _Total and individual cores measures the total utilization of your processor by all running processes. If multi-proc then be mindful only an average is provided Processor % Privileged Time _Total To see how the OS is handling basic IO requests. If kernel mode utilization is high, your machine is likely underpowered as it's too busy handling basic OS housekeeping functions to be able to effectively run other applications. Processor % User Time _Total To see how the applications is interacting from a processor perspective, a high percentage utilisation determine that the server is dealing with too many apps and may require increasing thje hardware or scaling out Processor Interrupts/sec _Total The average rate, in incidents per second, at which the processor received and serviced hardware interrupts. Shoulr be consistant over time but a sudden unexplained increase could indicate a device malfunction which can be confirmed using the System\Context Switches/sec counter Memory Pages/sec N/A Indicates the rate at which pages are read from or written to disk to resolve hard page faults. This counter is a primary indicator of the kinds of faults that cause system-wide delays, this is the primary counter to watch for indication of possible insufficient RAM to meet your server's needs. A good idea here is to configure a perfmon alert that triggers when the number of pages per second exceeds 50 per paging disk on your system. May also want to see the configuration of the page file on the Server Memory Available Mbytes N/A is the amount of physical memory, in bytes, available to processes running on the computer. if this counter is greater than 10% of the actual RAM in your machine then you probably have more than enough RAM. monitor it regularly to see if any downward trend develops, and set an alert to trigger if it drops below 2% of the installed RAM. Physical Disk Disk Transfers/sec for each physical disk If it goes above 10 disk I/Os per second then you've got poor response time for your disk. Physical Disk Idle Time _total If Disk Transfers/sec is above 25 disk I/Os per second use this counter. which measures the percent time that your hard disk is idle during the measurement interval, and if you see this counter fall below 20% then you've likely got read/write requests queuing up for your disk which is unable to service these requests in a timely fashion. Physical Disk Disk queue legnth For the OLAP and SQL physical disk A value that is consistently less than 2 means that the disk system is handling the IO requests against the physical disk Network Interface Bytes Total/sec For the NIC Should be monitored over a period of time to see if there is anb increase/decrease in network utilisation Network Interface Current Bandwidth For the NIC is an estimate of the current bandwidth of the network interface in bits per second (BPS). MSAS 2005: Memory Memory Limit High KB N/A Shows (as a percentage) the high memory limit configured for SSAS in C:\Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Config\msmdsrv.ini MSAS 2005: Memory Memory Limit Low KB N/A Shows (as a percentage) the low memory limit configured for SSAS in C:\Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Config\msmdsrv.ini MSAS 2005: Memory Memory Usage KB N/A Displays the memory usage of the server process. MSAS 2005: Memory File Store KB N/A Displays the amount of memory that is reserved for the Cache. Note if total memory limit in the msmdsrv.ini is set to 0, no memory is reserved for the cache MSAS 2005: Storage Engine Query Queries from Cache Direct / sec N/A Displays the rate of queries answered from the cache directly MSAS 2005: Storage Engine Query Queries from Cache Filtered / Sec N/A Displays the Rate of queries answered by filtering existing cache entry. MSAS 2005: Storage Engine Query Queries from File / Sec N/A Displays the Rate of queries answered from files. MSAS 2005: Storage Engine Query Average time /query N/A Displays the average time of a query MSAS 2005: Connection Current connections N/A Displays the number of connections against the SSAS instance MSAS 2005: Connection Requests / sec N/A Displays the rate of query requests per second MSAS 2005: Locks Current Lock Waits N/A Displays thhe number of connections waiting on a lock MSAS 2005: Threads Query Pool job queue Length N/A The number of queries in the job queue MSAS 2005:Proc Aggregations Temp file bytes written/sec N/A Shows the number of bytes of data processed in a temporary file MSAS 2005:Proc Aggregations Temp file rows written/sec N/A Shows the number of bytes of data processed in a temporary file

Read the article
Handling HumanTask attachments in Oracle BPM 11g PS4FP+ (I)

- by ccasares

Adding attachments to a HumanTask is a feature that exists in Oracle HWF (Human Workflow) since 10g. However, in 11g there have been many improvements on this feature and this entry will try to summarize them. Oracle BPM 11g 11.1.1.5.1 (aka PS4 Feature Pack or PS4FP) introduced two great features: Ability to link attachments at a Task scope or at a Process scope: "Task" attachments are only visible within the scope (lifetime) of a task. This means that, initially, any member of the assignment pattern of the Human Task will be able to handle (add, review or remove) attachments. However, once the task is completed, subsequent human tasks will not have access to them. This does not mean those attachments got lost. Once the human task is completed, attachments can be retrieved in order to, i.e., check them in to a Content Server or to inject them to a new and different human task. Aside note: a "re-initiated" human task will inherit comments and attachments, along with history and -optionally- payload. See here for more info. "Process" attachments are visible within the scope of the process. This means that subsequent human tasks in the same process instance will have access to them. Ability to use Oracle WebCenter Content (previously known as "Oracle UCM") as the backend for the attachments instead of using HWF database backend. This feature adds all content server document lifecycle capabilities to HWF attachments (versioning, RBAC, metadata management, etc). As of today, only Oracle WCC is supported. However, Oracle BPM Suite does include a license of Oracle WCC for the solely usage of document management within BPM scope. Here are some code samples that leverage the above features. Retrieving uploaded attachments -Non UCM- Non UCM attachments (default ones or those that have existed from 10g, and are stored "as-is" in HWK database backend) can be retrieved after the completion of the Human Task. Firstly, we need to know whether any attachment has been effectively uploaded to the human task. There are two ways to find it out: Through an XPath function: Checking the execData/attachment[] structure. For example: Once we are sure one ore more attachments were uploaded to the Human Task, we want to get them. In this example, by "get" I mean to get the attachment name and the payload of the file. Aside note: Oracle HWF lets you to upload two kind of [non-UCM] attachments: a desktop document and a Web URL. This example focuses just on the desktop document one. In order to "retrieve" an uploaded Web URL, you can get it directly from the execData/attachment[] structure. Attachment content (payload) is retrieved through the getTaskAttachmentContents() XPath function: This example shows how to retrieve as many attachments as those had been uploaded to the Human Task and write them to the server using the File Adapter service. The sample process excerpt is as follows: A dummy UserTask using "HumanTask1" Human Task followed by a Embedded Subprocess that will retrieve the attachments (we're assuming at least one attachment is uploaded): and once retrieved, we will write each of them back to a file in the server using a File Adapter service: In detail: We've defined an XSD structure that will hold the attachments (both name and payload): Then, we can create a BusinessObject based on such element (attachmentCollection) and create a variable (named attachmentBPM) of such BusinessObject type. We will also need to keep a copy of the HumanTask output's execData structure. Therefore we need to create a variable of type TaskExecutionData... ...and copy the HumanTask output execData to it: Now we get into the embedded subprocess that will retrieve the attachments' payload. First, and using an XSLT transformation, we feed the attachmentBPM variable with the name of each attachment and setting an empty value to the payload: Please note that we're using the XSLT for-each node to create as many target structures as necessary. Also note that we're setting an Empty text to the payload variable. The reason for this is to make sure the <payload></payload> tag gets created. This is needed when we map the payload to the XML variable later. Aside note: We are assuming that we're retrieving non-UCM attachments. However in real life you might want to check the type of attachment you're handling. The execData/attachment[]/storageType contains the values "UCM" for UCM type attachments, "TASK" for non-UCM ones or "URL" for Web URL ones. Those values are part of the "Ext.Com.Oracle.Xmlns.Bpel.Workflow.Task.StorageTypeEnum" enumeration. Once we have fed the attachmentsBPM structure and so it now contains the name of each of the attachments, it is time to iterate through it and get the payload. Therefore we will use a new embedded subprocess of type MultiInstance, that will iterate over the attachmentsBPM/attachment[] element: In every iteration we will use a Script activity to map the corresponding payload element with the result of the XPath function getTaskAttachmentContents(). Please, note how the target array element is indexed with the loopCounter predefined variable, so that we make sure we're feeding the right element during the array iteration: The XPath function used looks as follows: hwf:getTaskAttachmentContents(bpmn:getDataObject('UserTask1LocalExecData')/ns1:systemAttributes/ns1:taskId, bpmn:getDataObject('attachmentsBPM')/ns:attachment[bpmn:getActivityInstanceAttribute('SUBPROCESS3067107484296', 'loopCounter')]/ns:fileName) where the input parameters are: taskId of the just completed Human Task attachment name we're retrieving the payload from array index (loopCounter predefined variable) Aside note: The reason whereby we're iterating the execData/attachment[] structure through embedded subprocess and not, i.e., using XSLT and for-each nodes, is mostly because the getTaskAttachmentContents() XPath function is currently not available in XSLT mappings. So all this example might be considered as a workaround until this gets fixed/enhanced in future releases. Once this embedded subprocess ends, we will have all attachments (name + payload) in the attachmentsBPM variable, which is the main goal of this sample. But in order to test everything runs fine, we finish the sample writing each attachment to a file. To that end we include a final embedded subprocess to concurrently iterate through each attachmentsBPM/attachment[] element: On each iteration we will use a Service activity that invokes a File Adapter write service. In here we have two important parameters to set. First, the payload itself. The file adapter awaits binary data in base64 format (string). We have to map it using XPath (Simple mapping doesn't recognize a String as a base64-binary valid target): Second, we must set the target filename using the Service Properties dialog box: Again, note how we're making use of the loopCounter index variable to get the right element within the embedded subprocess iteration. Handling UCM attachments will be part of a different and upcoming blog entry. Once I finish will all posts on this matter, I will upload the whole sample project to java.net.

Read the article
MySQL Cluster 7.2: Over 8x Higher Performance than Cluster 7.1

- by Mat Keep

0 0 1 893 5092 Homework 42 11 5974 14.0 Normal 0 false false false EN-US JA X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin:0cm; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:12.0pt; font-family:Cambria; mso-ascii-font-family:Cambria; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Cambria; mso-hansi-theme-font:minor-latin; mso-ansi-language:EN-US;} Summary The scalability enhancements delivered by extensions to multi-threaded data nodes enables MySQL Cluster 7.2 to deliver over 8x higher performance than the previous MySQL Cluster 7.1 release on a recent benchmark What’s New in MySQL Cluster 7.2 MySQL Cluster 7.2 was released as GA (Generally Available) in February 2012, delivering many enhancements to performance on complex queries, new NoSQL Key / Value API, cross-data center replication and ease-of-use. These enhancements are summarized in the Figure below, and detailed in the MySQL Cluster New Features whitepaper Figure 1: Next Generation Web Services, Cross Data Center Replication and Ease-of-Use Once of the key enhancements delivered in MySQL Cluster 7.2 is extensions made to the multi-threading processes of the data nodes. Multi-Threaded Data Node Extensions The MySQL Cluster 7.2 data node is now functionally divided into seven thread types: 1) Local Data Manager threads (ldm). Note – these are sometimes also called LQH threads. 2) Transaction Coordinator threads (tc) 3) Asynchronous Replication threads (rep) 4) Schema Management threads (main) 5) Network receiver threads (recv) 6) Network send threads (send) 7) IO threads Each of these thread types are discussed in more detail below. MySQL Cluster 7.2 increases the maximum number of LDM threads from 4 to 16. The LDM contains the actual data, which means that when using 16 threads the data is more heavily partitioned (this is automatic in MySQL Cluster). Each LDM thread maintains its own set of data partitions, index partitions and REDO log. The number of LDM partitions per data node is not dynamically configurable, but it is possible, however, to map more than one partition onto each LDM thread, providing flexibility in modifying the number of LDM threads. The TC domain stores the state of in-flight transactions. This means that every new transaction can easily be assigned to a new TC thread. Testing has shown that in most cases 1 TC thread per 2 LDM threads is sufficient, and in many cases even 1 TC thread per 4 LDM threads is also acceptable. Testing also demonstrated that in some instances where the workload needed to sustain very high update loads it is necessary to configure 3 to 4 TC threads per 4 LDM threads. In the previous MySQL Cluster 7.1 release, only one TC thread was available. This limit has been increased to 16 TC threads in MySQL Cluster 7.2. The TC domain also manages the Adaptive Query Localization functionality introduced in MySQL Cluster 7.2 that significantly enhanced complex query performance by pushing JOIN operations down to the data nodes. Asynchronous Replication was separated into its own thread with the release of MySQL Cluster 7.1, and has not been modified in the latest 7.2 release. To scale the number of TC threads, it was necessary to separate the Schema Management domain from the TC domain. The schema management thread has little load, so is implemented with a single thread. The Network receiver domain was bound to 1 thread in MySQL Cluster 7.1. With the increase of threads in MySQL Cluster 7.2 it is also necessary to increase the number of recv threads to 8. This enables each receive thread to service one or more sockets used to communicate with other nodes the Cluster. The Network send thread is a new thread type introduced in MySQL Cluster 7.2. Previously other threads handled the sending operations themselves, which can provide for lower latency. To achieve highest throughput however, it has been necessary to create dedicated send threads, of which 8 can be configured. It is still possible to configure MySQL Cluster 7.2 to a legacy mode that does not use any of the send threads – useful for those workloads that are most sensitive to latency. The IO Thread is the final thread type and there have been no changes to this domain in MySQL Cluster 7.2. Multiple IO threads were already available, which could be configured to either one thread per open file, or to a fixed number of IO threads that handle the IO traffic. Except when using compression on disk, the IO threads typically have a very light load. Benchmarking the Scalability Enhancements The scalability enhancements discussed above have made it possible to scale CPU usage of each data node to more than 5x of that possible in MySQL Cluster 7.1. In addition, a number of bottlenecks have been removed, making it possible to scale data node performance by even more than 5x. Figure 2: MySQL Cluster 7.2 Delivers 8.4x Higher Performance than 7.1 The flexAsynch benchmark was used to compare MySQL Cluster 7.2 performance to 7.1 across an 8-node Intel Xeon x5670-based cluster of dual socket commodity servers (6 cores each). As the results demonstrate, MySQL Cluster 7.2 delivers over 8x higher performance per data nodes than MySQL Cluster 7.1. More details of this and other benchmarks will be published in a new whitepaper – coming soon, so stay tuned! In a following blog post, I’ll provide recommendations on optimum thread configurations for different types of server processor. You can also learn more from the Best Practices Guide to Optimizing Performance of MySQL Cluster Conclusion MySQL Cluster has achieved a range of impressive benchmark results, and set in context with the previous 7.1 release, is able to deliver over 8x higher performance per node. As a result, the multi-threaded data node extensions not only serve to increase performance of MySQL Cluster, they also enable users to achieve significantly improved levels of utilization from current and future generations of massively multi-core, multi-thread processor designs.

Read the article
What's up with LDoms: Part 2 - Creating a first, simple guest

- by Stefan Hinker

Welcome back! In the first part, we discussed the basic concepts of LDoms and how to configure a simple control domain. We saw how resources were put aside for guest systems and what infrastructure we need for them. With that, we are now ready to create a first, very simple guest domain. In this first example, we'll keep things very simple. Later on, we'll have a detailed look at things like sizing, IO redundancy, other types of IO as well as security. For now,let's start with this very simple guest. It'll have one core's worth of CPU, one crypto unit, 8GB of RAM, a single boot disk and one network port. CPU and RAM are easy. The network port we'll create by attaching a virtual network port to the vswitch we created in the primary domain. This is very much like plugging a cable into a computer system on one end and a network switch on the other. For the boot disk, we'll need two things: A physical piece of storage to hold the data - this is called the backend device in LDoms speak. And then a mapping between that storage and the guest domain, giving it access to that virtual disk. For this example, we'll use a ZFS volume for the backend. We'll discuss what other options there are for this and how to chose the right one in a later article. Here we go: root@sun # ldm create mars root@sun # ldm set-vcpu 8 mars root@sun # ldm set-mau 1 mars root@sun # ldm set-memory 8g mars root@sun # zfs create rpool/guests root@sun # zfs create -V 32g rpool/guests/mars.bootdisk root@sun # ldm add-vdsdev /dev/zvol/dsk/rpool/guests/mars.bootdisk \ mars.root@primary-vds root@sun # ldm add-vdisk root mars.root@primary-vds mars root@sun # ldm add-vnet net0 switch-primary mars That's all, mars is now ready to power on. There are just three commands between us and the OK prompt of mars: We have to "bind" the domain, start it and connect to its console. Binding is the process where the hypervisor actually puts all the pieces that we've configured together. If we made a mistake, binding is where we'll be told (starting in version 2.1, a lot of sanity checking has been put into the config commands themselves, but binding will catch everything else). Once bound, we can start (and of course later stop) the domain, which will trigger the boot process of OBP. By default, the domain will then try to boot right away. If we don't want that, we can set "auto-boot?" to false. Finally, we'll use telnet to connect to the console of our newly created guest. The output of "ldm list" shows us what port has been assigned to mars. By default, the console service only listens on the loopback interface, so using telnet is not a large security concern here. root@sun # ldm set-variable auto-boot\?=false mars root@sun # ldm bind mars root@sun # ldm start mars root@sun # ldm list NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME primary active -n-cv- UART 8 7680M 0.5% 1d 4h 30m mars active -t---- 5000 8 8G 12% 1s root@sun # telnet localhost 5000 Trying 127.0.0.1... Connected to localhost. Escape character is '^]'. ~Connecting to console "mars" in group "mars" .... Press ~? for control options .. {0} ok banner SPARC T3-4, No Keyboard Copyright (c) 1998, 2011, Oracle and/or its affiliates. All rights reserved. OpenBoot 4.33.1, 8192 MB memory available, Serial # 87203131. Ethernet address 0:21:28:24:1b:50, Host ID: 85241b50. {0} ok We're done, mars is ready to install Solaris, preferably using AI, of course ;-) But before we do that, let's have a little look at the OBP environment to see how our virtual devices show up here: {0} ok printenv auto-boot? auto-boot? = false {0} ok printenv boot-device boot-device = disk net {0} ok devalias root /virtual-devices@100/channel-devices@200/disk@0 net0 /virtual-devices@100/channel-devices@200/network@0 net /virtual-devices@100/channel-devices@200/network@0 disk /virtual-devices@100/channel-devices@200/disk@0 virtual-console /virtual-devices/console@1 name aliases We can see that setting the OBP variable "auto-boot?" to false with the ldm command worked. Of course, we'd normally set this to "true" to allow Solaris to boot right away once the LDom guest is started. The setting for "boot-device" is the default "disk net", which means OBP would try to boot off the devices pointed to by the aliases "disk" and "net" in that order, which usually means "disk" once Solaris is installed on the disk image. The actual devices these aliases point to are shown with the command "devalias". Here, we have one line for both "disk" and "net". The device paths speak for themselves. Note that each of these devices has a second alias: "net0" for the network device and "root" for the disk device. These are the very same names we've given these devices in the control domain with the commands "ldm add-vnet" and "ldm add-vdisk". Remember this, as it is very useful once you have several dozen disk devices... To wrap this up, in this part we've created a simple guest domain, complete with CPU, memory, boot disk and network connectivity. This should be enough to get you going. I will cover all the more advanced features and a little more theoretical background in several follow-on articles. For some background reading, I'd recommend the following links: LDoms 2.2 Admin Guide: Setting up Guest Domains Virtual Console Server: vntsd manpage - This includes the control sequences and commands available to control the console session. OpenBoot 4.x command reference - All the things you can do at the ok prompt

Read the article
.NET to iOS: From WinForms to the iPad

- by RobertChipperfield

One of the great things about working at Red Gate is getting to play with new technology - and right now, that means mobile. A few weeks ago, we decided that a little research into the tablet computing arena was due, and purely from a numbers point of view, that suggested the iPad as a good target device. A quick trip to iPhoneDevCon in San Diego later, and Marine and I came back full of ideas, and with some concept of how iOS development was meant to work. Here's how we went from there to the release of Stacks & Heaps, our geeky take on the classic "Snakes & Ladders" game. Step 1: Buy a Mac I've played with many operating systems in my time: from the original BBC Model B, through DOS, Windows, Linux, and others, but I'd so far managed to avoid buying fruit-flavoured computer hardware! If you want to develop for the iPhone, iPad or iPod Touch, that's the first thing that needs to change. If you've not used OS X before, the first thing you'll realise is that everything is different! In the interests of avoiding a flame war in the comments section, I'll only go so far as to say that a lot of my Windows-flavoured muscle memory no longer worked. If you're in the UK, you'll also realise your keyboard is lacking a # key, and that " and @ are the other way around from normal. The wonderful Ukelele keyboard layout editor restores some sanity here, as long as you don't look at the keyboard when you're typing. I couldn't give up the PC entirely, but a handy application called Synergy comes to the rescue - it lets you share a single keyboard and mouse between multiple machines. There's a few limitations: Alt-Tab always seems to go to the Mac, and Windows 7's UAC dialogs require the local mouse for security reasons, but it gets you a long way at least. Step 2: Register as an Apple Developer You can register as an Apple Developer free of charge, and that lets you download XCode and the iOS SDK. You also get the iPhone / iPad emulator, which is handy, since you'll need to be a paid member before you can deploy your apps to a real device. You can either enroll as an individual, or as a company. They both cost the same ($99/year), but there's a few differences between them. If you register as a company, you can add multiple developers to your team (all for the same $99 - not $99 per developer), and you get to use your company name in the App Store. However, you'll need to send off significantly more documentation to Apple, and I suspect the process takes rather longer than for an individual, where they just need to verify some credit card details. Here's a tip: if you're registering as a company, do so as early as possible. The approval process can take a while to complete, so get the application in in plenty of time. Step 3: Learn to love the square brackets! Objective-C is the language of the iPad. C and C++ are also supported, and if you're doing some serious game development, you'll probably spend most of your time in C++ talking OpenGL, but for forms-based apps, you'll be interacting with a lot of the Objective-C SDK. Like shifting from Ctrl-C to Cmd-C, it feels a little odd at first, with the familiar string.format(.) turning into: NSString *myString = [NSString stringWithFormat:@"Hello world, it's %@", [NSDate date]]; Thankfully XCode's auto-complete is normally passable, if not up to Visual Studio's standards, which coupled with a huge amount of content on Stack Overflow means you'll soon get to grips with the API. You'll need to get used to some terminology changes, though; here's an incomplete approximation: Coming from a .NET background, there's some luxuries you no longer have developing Objective C in XCode: Generics! Remember back in .NET 1.1, when all collections were just objects? Yup, we're back there now. ReSharper. Or, more generally, very much refactoring support. The not-many-keystrokes to rename a class, its file, and al references to it in Visual Studio turns into a much more painful experience in XCode. Garbage collection. This is actually rather less of an issue than you might expect: if you follow the rules, the reference counting provided by Objective C gets you a long way without too much pain. Circular references are their usual problematic self, though. Decent exception handling. You do have exceptions, but they're nowhere near as widely used. Generally, if something goes wrong, you get nil (see translation table above) back. Which brings me on to. Calling a method on a nil object isn't a failure - it just returns nil itself! There's many arguments for and against this, but personally I fall into the "stuff should fail as quickly and explicitly as possible" camp. Less specifically, I found that there's more chance of code failing at runtime rather than getting caught at compile-time: using the @selector(.) syntax to pass a method signature isn't (can't be) checked at compile-time, so the first you know about a typo is a crash when you try and call it. The solution to this is of course lots of great testing, both automated and manual, but I still find comfort in provably correct type safety being enforced in addition to testing. Step 4: Submit to the App Store Assuming you want to distribute to more than a handful of devices, you're going to need to submit your app to the Apple App Store. There's a few gotchas in terms of getting builds signed with the right certificates, and you'll be bouncing around between XCode and iTunes Connect a fair bit, but eventually you get everything checked off the to-do list, and are ready to upload your first binary! With some amount of anticipation, I pressed the Upload button in XCode, ready to release our creation into the world, but was instead greeted by an error informing me my XML file was malformed. Uh. A little Googling later, and it turned out that a simple rename from "Stacks&Heaps.app" to "StacksAndHeaps.app" worked around an XML escaping bug, and we were good to go. The next step is to wait for approval (or otherwise). After a couple of weeks of intensive development, this part is agonising. Did we make it? The Apple jury is still out at the moment, but our fingers are firmly crossed! In the meantime, you can see some screenshots and leave us your email address if you'd like us to get in touch when it does go live at the MobileFoo website. Step 5: Profit! Actually, that wasn't the idea here: Stacks & Heaps is free; there's no adverts, and we're not going to sell all your data either. So why did we do it? We wanted to get an idea of what it's like to move from coding for a desktop environment, to something completely different. We don't know whether in a year's time, the iPad will still be the dominant force, or whether Android will have smoothed out some bugs, tweaked the performance, and polished the UI, but I think it's a fairly sure bet that the tablet form factor is here to stay. We want to meet people who are using it, start chatting to them, and find out about some of the pain they're feeling. What better way to do that than do it ourselves, and get to write a cool game in the process?

Read the article
SQL SERVER – Core Concepts – Elasticity, Scalability and ACID Properties – Exploring NuoDB an Elastically Scalable Database System

- by pinaldave

I have been recently exploring Elasticity and Scalability attributes of databases. You can see that in my earlier blog posts about NuoDB where I wanted to look at Elasticity and Scalability concepts. The concepts are very interesting, and intriguing as well. I have discussed these concepts with my friend Joyti M and together we have come up with this interesting read. The goal of this article is to answer following simple questions What is Elasticity? What is Scalability? How ACID properties vary from NOSQL Concepts? What are the prevailing problems in the current database system architectures? Why is NuoDB an innovative and welcome change in database paradigm? Elasticity This word’s original form is used in many different ways and honestly it does do a decent job in holding things together over the years as a person grows and contracts. Within the tech world, and specifically related to software systems (database, application servers), it has come to mean a few things - allow stretching of resources without reaching the breaking point (on demand). What are resources in this context? Resources are the usual suspects – RAM/CPU/IO/Bandwidth in the form of a container (a process or bunch of processes combined as modules). When it is about increasing resources the simplest idea which comes to mind is the addition of another container. Another container means adding a brand new physical node. When it is about adding a new node there are two questions which comes to mind. 1) Can we add another node to our software system? 2) If yes, does adding new node cause downtime for the system? Let us assume we have added new node, let us see what the new needs of the system are when a new node is added. Balancing incoming requests to multiple nodes Synchronization of a shared state across multiple nodes Identification of “downstate” and resolution action to bring it to “upstate” Well, adding a new node has its advantages as well. Here are few of the positive points Throughput can increase nearly horizontally across the node throughout the system Response times of application will increase as in-between layer interactions will be improved Now, Let us put the above concepts in the perspective of a Database. When we mention the term “running out of resources” or “application is bound to resources” the resources can be CPU, Memory or Bandwidth. The regular approach to “gain scalability” in the database is to look around for bottlenecks and increase the bottlenecked resource. When we have memory as a bottleneck we look at the data buffers, locks, query plans or indexes. After a point even this is not enough as there needs to be an efficient way of managing such large workload on a “single machine” across memory and CPU bound (right kind of scheduling) workload. We next move on to either read/write separation of the workload or functionality-based sharing so that we still have control of the individual. But this requires lots of planning and change in client systems in terms of knowing where to go/update/read and for reporting applications to “aggregate the data” in an intelligent way. What we ideally need is an intelligent layer which allows us to do these things without us getting into managing, monitoring and distributing the workload. Scalability In the context of database/applications, scalability means three main things Ability to handle normal loads without pressure E.g. X users at the Y utilization of resources (CPU, Memory, Bandwidth) on the Z kind of hardware (4 processor, 32 GB machine with 15000 RPM SATA drives and 1 GHz Network switch) with T throughput Ability to scale up to expected peak load which is greater than normal load with acceptable response times Ability to provide acceptable response times across the system E.g. Response time in S milliseconds (or agreed upon unit of measure) – 90% of the time The Issue – Need of Scale In normal cases one can plan for the load testing to test out normal, peak, and stress scenarios to ensure specific hardware meets the needs. With help from Hardware and Software partners and best practices, bottlenecks can be identified and requisite resources added to the system. Unfortunately this vertical scale is expensive and difficult to achieve and most of the operational people need the ability to scale horizontally. This helps in getting better throughput as there are physical limits in terms of adding resources (Memory, CPU, Bandwidth and Storage) indefinitely. Today we have different options to achieve scalability: Read & Write Separation The idea here is to do actual writes to one store and configure slaves receiving the latest data with acceptable delays. Slaves can be used for balancing out reads. We can also explore functional separation or sharing as well. We can separate data operations by a specific identifier (e.g. region, year, month) and consolidate it for reporting purposes. For functional separation the major disadvantage is when schema changes or workload pattern changes. As the requirement grows one still needs to deal with scale need in manual ways by providing an abstraction in the middle tier code. Using NOSQL solutions The idea is to flatten out the structures in general to keep all values which are retrieved together at the same store and provide flexible schema. The issue with the stores is that they are compromising on mostly consistency (no ACID guarantees) and one has to use NON-SQL dialect to work with the store. The other major issue is about education with NOSQL solutions. Would one really want to make these compromises on the ability to connect and retrieve in simple SQL manner and learn other skill sets? Or for that matter give up on ACID guarantee and start dealing with consistency issues? Hybrid Deployment – Mac, Linux, Cloud, and Windows One of the challenges today that we see across On-premise vs Cloud infrastructure is a difference in abilities. Take for example SQL Azure – it is wonderful in its concepts of throttling (as it is shared deployment) of resources and ability to scale using federation. However, the same abilities are not available on premise. This is not a mistake, mind you – but a compromise of the sweet spot of workloads, customer requirements and operational SLAs which can be supported by the team. In today’s world it is imperative that databases are available across operating systems – which are a commodity and used by developers of all hues. An Ideal Database Ability List A system which allows a linear scale of the system (increase in throughput with reasonable response time) with the addition of resources A system which does not compromise on the ACID guarantees and require developers to learn new paradigms A system which does not force fit a new way interacting with database by learning Non-SQL dialect A system which does not force fit its mechanisms for providing availability across its various modules. Well NuoDB is the first database which has all of the above abilities and much more. In future articles I will cover my hands-on experience with it. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: NuoDB

Read the article
Using Node.js as an accelerator for WCF REST services

- by Elton Stoneman

Node.js is a server-side JavaScript platform "for easily building fast, scalable network applications". It's built on Google's V8 JavaScript engine and uses an (almost) entirely async event-driven processing model, running in a single thread. If you're new to Node and your reaction is "why would I want to run JavaScript on the server side?", this is the headline answer: in 150 lines of JavaScript you can build a Node.js app which works as an accelerator for WCF REST services*. It can double your messages-per-second throughput, halve your CPU workload and use one-fifth of the memory footprint, compared to the WCF services direct. Well, it can if: 1) your WCF services are first-class HTTP citizens, honouring client cache ETag headers in request and response; 2) your services do a reasonable amount of work to build a response; 3) your data is read more often than it's written. In one of my projects I have a set of REST services in WCF which deal with data that only gets updated weekly, but which can be read hundreds of times an hour. The services issue ETags and will return a 304 if the client sends a request with the current ETag, which means in the most common scenario the client uses its local cached copy. But when the weekly update happens, then all the client caches are invalidated and they all need the same new data. Then the service will get hundreds of requests with old ETags, and they go through the full service stack to build the same response for each, taking up threads and processing time. Part of that processing means going off to a database on a separate cloud, which introduces more latency and downtime potential. We can use ASP.NET output caching with WCF to solve the repeated processing problem, but the server will still be thread-bound on incoming requests, and to get the current ETags reliably needs a database call per request. The accelerator solves that by running as a proxy - all client calls come into the proxy, and the proxy routes calls to the underlying REST service. We could use Node as a straight passthrough proxy and expect some benefit, as the server would be less thread-bound, but we would still have one WCF and one database call per proxy call. But add some smart caching logic to the proxy, and share ETags between Node and WCF (so the proxy doesn't even need to call the servcie to get the current ETag), and the underlying service will only be invoked when data has changed, and then only once - all subsequent client requests will be served from the proxy cache. I've built this as a sample up on GitHub: NodeWcfAccelerator on sixeyed.codegallery. Here's how the architecture looks: The code is very simple. The Node proxy runs on port 8010 and all client requests target the proxy. If the client request has an ETag header then the proxy looks up the ETag in the tag cache to see if it is current - the sample uses memcached to share ETags between .NET and Node. If the ETag from the client matches the current server tag, the proxy sends a 304 response with an empty body to the client, telling it to use its own cached version of the data. If the ETag from the client is stale, the proxy looks for a local cached version of the response, checking for a file named after the current ETag. If that file exists, its contents are returned to the client as the body in a 200 response, which includes the current ETag in the header. If the proxy does not have a local cached file for the service response, it calls the service, and writes the WCF response to the local cache file, and to the body of a 200 response for the client. So the WCF service is only troubled if both client and proxy have stale (or no) caches. The only (vaguely) clever bit in the sample is using the ETag cache, so the proxy can serve cached requests without any communication with the underlying service, which it does completely generically, so the proxy has no notion of what it is serving or what the services it proxies are doing. The relative path from the URL is used as the lookup key, so there's no shared key-generation logic between .NET and Node, and when WCF stores a tag it also stores the "read" URL against the ETag so it can be used for a reverse lookup, e.g: Key Value /WcfSampleService/PersonService.svc/rest/fetch/3 "28cd4796-76b8-451b-adfd-75cb50a50fa6" "28cd4796-76b8-451b-adfd-75cb50a50fa6" /WcfSampleService/PersonService.svc/rest/fetch/3 In Node we read the cache using the incoming URL path as the key and we know that "28cd4796-76b8-451b-adfd-75cb50a50fa6" is the current ETag; we look for a local cached response in /caches/28cd4796-76b8-451b-adfd-75cb50a50fa6.body (and the corresponding .header file which contains the original service response headers, so the proxy response is exactly the same as the underlying service). When the data is updated, we need to invalidate the ETag cache – which is why we need the reverse lookup in the cache. In the WCF update service, we don't need to know the URL of the related read service - we fetch the entity from the database, do a reverse lookup on the tag cache using the old ETag to get the read URL, update the new ETag against the URL, store the new reverse lookup and delete the old one. Running Apache Bench against the two endpoints gives the headline performance comparison. Making 1000 requests with concurrency of 100, and not sending any ETag headers in the requests, with the Node proxy I get 102 requests handled per second, average response time of 975 milliseconds with 90% of responses served within 850 milliseconds; going direct to WCF with the same parameters, I get 53 requests handled per second, mean response time of 1853 milliseconds, with 90% of response served within 3260 milliseconds. Informally monitoring server usage during the tests, Node maxed at 20% CPU and 20Mb memory; IIS maxed at 60% CPU and 100Mb memory. Note that the sample WCF service does a database read and sleeps for 250 milliseconds to simulate a moderate processing load, so this is *not* a baseline Node-vs-WCF comparison, but for similar scenarios where the service call is expensive but applicable to numerous clients for a long timespan, the performance boost from the accelerator is considerable. * - actually, the accelerator will work nicely for any HTTP request, where the URL (path + querystring) uniquely identifies a resource. In the sample, there is an assumption that the ETag is a GUID wrapped in double-quotes (e.g. "28cd4796-76b8-451b-adfd-75cb50a50fa6") – which is the default for WCF services. I use that assumption to name the cache files uniquely, but it is a trivial change to adapt to other ETag formats.

Read the article
A Good Developer is So Hard to Find

- by James Michael Hare

Let me start out by saying I want to damn the writers of the Toughest Developer Puzzle Ever – 2. It is eating every last shred of my free time! But as I've been churning through each puzzle and marvelling at the brain teasers and trivia within, I began to think about interviewing developers and why it seems to be so hard to find good ones. The problem is, it seems like no matter how hard we try to find the perfect way to separate the chaff from the wheat, inevitably someone will get hired who falls far short of expectations or someone will get passed over for missing a piece of trivia or a tricky brain teaser that could have been an excellent team member. In shops that are primarily software-producing businesses or other heavily IT-oriented businesses (Microsoft, Amazon, etc) there often exists a much tighter bond between HR and the hiring development staff because development is their life-blood. Unfortunately, many of us work in places where IT is viewed as a cost or just a means to an end. In these shops, too often, HR and development staff may work against each other due to differences in opinion as to what a good developer is or what one is worth. It seems that if you ask two different people what makes a good developer, often you will get three different opinions. With the exception of those shops that are purely development-centric (you guys have it much easier!), most other shops have management who have very little knowledge about the development process. Their view can often be that development is simply a skill that one learns and then once aquired, that developer can produce widgets as good as the next like workers on an assembly-line floor. On the other side, you have many developers that feel that software development is an art unto itself and that the ability to create the most pure design or know the most obscure of keywords or write the shortest-possible obfuscated piece of code is a good coder. So is it a skill? An Art? Or something entirely in between? Saying that software is merely a skill and one just needs to learn the syntax and tools would be akin to saying anyone who knows English and can use Word can write a 300 page book that is accurate, meaningful, and stays true to the point. This just isn't so. It takes more than mere skill to take words and form a sentence, join those sentences into paragraphs, and those paragraphs into a document. I've interviewed candidates who could answer obscure syntax and keyword questions and once they were hired could not code effectively at all. So development must be more than a skill. But on the other end, we have art. Is development an art? Is our end result to produce art? I can marvel at a piece of code -- see it as concise and beautiful -- and yet that code most perform some stated function with accuracy and efficiency and maintainability. None of these three things have anything to do with art, per se. Art is beauty for its own sake and is a wonderful thing. But if you apply that same though to development it just doesn't hold. I've had developers tell me that all that matters is the end result and how you code it is entirely part of the art and I couldn't disagree more. Yes, the end result, the accuracy, is the prime criteria to be met. But if code is not maintainable and efficient, it would be just as useless as a beautiful car that breaks down once a week or that gets 2 miles to the gallon. Yes, it may work in that it moves you from point A to point B and is pretty as hell, but if it can't be maintained or is not efficient, it's not a good solution. So development must be something less than art. In the end, I think I feel like development is a matter of craftsmanship. We use our tools and we use our skills and set about to construct something that satisfies a purpose and yet is also elegant and efficient. There is skill involved, and there is an art, but really it boils down to being able to craft code. Crafting code is far more than writing code. Anyone can write code if they know the syntax, but so few people can actually craft code that solves a purpose and craft it well. So this is what I want to find, I want to find code craftsman! But how? I used to ask coding-trivia questions a long time ago and many people still fall back on this. The thought is that if you ask the candidate some piece of coding trivia and they know the answer it must follow that they can craft good code. For example: What C++ keyword can be applied to a class/struct field to allow it to be changed even from a const-instance of that class/struct? (answer: mutable) So what do we prove if a candidate can answer this? Only that they know what mutable means. One would hope that this would infer that they'd know how to use it, and more importantly when and if it should ever be used! But it rarely does! The problem with triva questions is that you will either: Approve a really good developer who knows what some obscure keyword is (good) Reject a really good developer who never needed to use that keyword or is too inexperienced to know how to use it (bad) Approve a really bad developer who googled "C++ Interview Questions" and studied like hell but can't craft (very bad) Many HR departments love these kind of tests because they are short and easy to defend if a legal issue arrises on hiring decisions. After all it's easy to say a person wasn't hired because they scored 30 out of 100 on some trivia test. But unfortunately, you've eliminated a large part of your potential developer pool and possibly hired a few duds. There are times I've hired candidates who knew every trivia question I could throw out them and couldn't craft. And then there are times I've interviewed candidates who failed all my trivia but who I took a chance on who were my best finds ever. So if not trivia, then what? Brain teasers? The thought is, these type of questions measure the thinking power of a candidate. The problem is, once again, you will either: Approve a good candidate who has never heard the problem and can solve it (good) Reject a good candidate who just happens not to see the "catch" because they're nervous or it may be really obscure (bad) Approve a candidate who has studied enough interview brain teasers (once again, you can google em) to recognize the "catch" or knows the answer already (bad). Once again, you're eliminating good candidates and possibly accepting bad candidates. In these cases, I think testing someone with brain teasers only tests their ability to answer brain teasers, not the ability to craft code. So how do we measure someone's ability to craft code? Here's a novel idea: have them code! Give them a computer and a compiler, or a whiteboard and a pen, or paper and pencil and have them construct a piece of code. It just makes sense that if we're going to hire someone to code we should actually watch them code. When they're done, we can judge them on several criteria: Correctness - does the candidate's solution accurately solve the problem proposed? Accuracy - is the candidate's solution reasonably syntactically correct? Efficiency - did the candidate write or use the more efficient data structures or algorithms for the job? Maintainability - was the candidate's code free of obfuscation and clever tricks that diminish readability? Persona - are they eager and willing or aloof and egotistical? Will they work well within your team? It may sound simple, or it may sound crazy, but when I'm looking to hire a developer, I want to see them actually develop well-crafted code.

Read the article
Integration Patterns with Azure Service Bus Relay, Part 1: Exposing the on-premise service

- by Elton Stoneman

We're in the process of delivering an enabling project to expose on-premise WCF services securely to Internet consumers. The Azure Service Bus Relay is doing the clever stuff, we register our on-premise service with Azure, consumers call into our .servicebus.windows.net namespace, and their requests are relayed and serviced on-premise. In theory it's all wonderfully simple; by using the relay we get lots of protocol options, free HTTPS and load balancing, and by integrating to ACS we get plenty of security options. Part of our delivery is a suite of sample consumers for the service - .NET, jQuery, PHP - and this set of posts will cover setting up the service and the consumers. Part 1: Exposing the on-premise service In theory, this is ultra-straightforward. In practice, and on a dev laptop it is - but in a corporate network with firewalls and proxies, it isn't, so we'll walkthrough some of the pitfalls. Note that I'm using the "old" Azure portal which will soon be out of date, but the new shiny portal should have the same steps available and be easier to use. We start with a simple WCF service which takes a string as input, reverses the string and returns it. The Part 1 version of the code is on GitHub here: on GitHub here: IPASBR Part 1. Configuring Azure Service Bus Start by logging into the Azure portal and registering a Service Bus namespace which will be our endpoint in the cloud. Give it a globally unique name, set it up somewhere near you (if you’re in Europe, remember Europe (North) is Ireland, and Europe (West) is the Netherlands), and enable ACS integration by ticking "Access Control" as a service: Authenticating and authorizing to ACS When we try to register our on-premise service as a listener for the Service Bus endpoint, we need to supply credentials, which means only trusted service providers can act as listeners. We can use the default "owner" credentials, but that has admin permissions so a dedicated service account is better (Neil Mackenzie has a good post On Not Using owner with the Azure AppFabric Service Bus with lots of permission details). Click on "Access Control Service" for the namespace, navigate to Service Identities and add a new one. Give the new account a sensible name and description: Let ACS generate a symmetric key for you (this will be the shared secret we use in the on-premise service to authenticate as a listener), but be sure to set the expiration date to something usable. The portal defaults to expiring new identities after 1 year - but when your year is up *your identity will expire without warning* and everything will stop working. In production, you'll need governance to manage identity expiration and a process to make sure you renew identities and roll new keys regularly. The new service identity needs to be authorized to listen on the service bus endpoint. This is done through claim mapping in ACS - we'll set up a rule that says if the nameidentifier in the input claims has the value serviceProvider, in the output we'll have an action claim with the value Listen. In the ACS portal you'll see that there is already a Relying Party Application set up for ServiceBus, which has a Default rule group. Edit the rule group and click Add to add this new rule: The values to use are: Issuer: Access Control Service Input claim type: http://schemas.xmlsoap.org/ws/2005/05/identity/claims/nameidentifier Input claim value: serviceProvider Output claim type: net.windows.servicebus.action Output claim value: Listen When your service namespace and identity are set up, open the Part 1 solution and put your own namespace, service identity name and secret key into the file AzureConnectionDetails.xml in Solution Items, e.g: <azure namespace="sixeyed-ipasbr">  <service identityName="serviceProvider" symmetricKey="nuR2tHhlrTCqf4YwjT2RA2BZ/+xa23euaRJNLh1a/V4="/> </azure> Build the solution, and the T4 template will generate the Web.config for the service project with your Azure details in the transportClientEndpointBehavior: <behavior name="SharedSecret"> <transportClientEndpointBehavior credentialType="SharedSecret"> <clientCredentials> <sharedSecret issuerName="serviceProvider" issuerSecret="nuR2tHhlrTCqf4YwjT2RA2BZ/+xa23euaRJNLh1a/V4="/> </clientCredentials> </transportClientEndpointBehavior> </behavior> , and your service namespace in the Azure endpoint:  <endpoint address="sb://sixeyed-ipasbr.servicebus.windows.net/net" binding="netTcpRelayBinding" contract="Sixeyed.Ipasbr.Services.IFormatService" behaviorConfiguration="SharedSecret"> </endpoint> The sample project is hosted in IIS, but it won't register with Azure until the service is activated. Typically you'd install AppFabric 1.1 for Widnows Server and set the service to auto-start in IIS, but for dev just navigate to the local REST URL, which will activate the service and register it with Azure. Testing the service locally As well as an Azure endpoint, the service has a WebHttpBinding for local REST access:  <endpoint address="rest" binding="webHttpBinding" behaviorConfiguration="RESTBehavior" contract="Sixeyed.Ipasbr.Services.IFormatService" /> Build the service, then navigate to: http://localhost/Sixeyed.Ipasbr.Services/FormatService.svc/rest/reverse?string=abc123 - and you should see the reversed string response: If your network allows it, you'll get the expected response as before, but in the background your service will also be listening in the cloud. Good stuff! Who needs network security? Onto the next post for consuming the service with the netTcpRelayBinding. Setting up network access to Azure But, if you get an error, it's because your network is secured and it's doing something to stop the relay working. The Service Bus relay bindings try to use direct TCP connections to Azure, so if ports 9350-9354 are available *outbound*, then the relay will run through them. If not, the binding steps down to standard HTTP, and issues a CONNECT across port 443 or 80 to set up a tunnel for the relay. If your network security guys are doing their job, the first option will be blocked by the firewall, and the second option will be blocked by the proxy, so you'll get this error: System.ServiceModel.CommunicationException: Unable to reach sixeyed-ipasbr.servicebus.windows.net via TCP (9351, 9352) or HTTP (80, 443) - and that will probably be the start of lots of discussions. Network guys don't really like giving servers special permissions for the web proxy, and they really don't like opening ports, so they'll need to be convinced about this. The resolution in our case was to put up a dedicated box in a DMZ, tinker with the firewall and the proxy until we got a relay connection working, then run some traffic which the the network guys monitored to do a security assessment afterwards. Along the way we hit a few more issues, diagnosed mainly with Fiddler and Wireshark: System.Net.ProtocolViolationException: Chunked encoding upload is not supported on the HTTP/1.0 protocol - this means the TCP ports are not available, so Azure tries to relay messaging traffic across HTTP. The service can access the endpoint, but the proxy is downgrading traffic to HTTP 1.0, which does not support tunneling, so Azure can’t make its connection. We were using the Squid proxy, version 2.6. The Squid project is incrementally adding HTTP 1.1 support, but there's no definitive list of what's supported in what version (here are some hints). System.ServiceModel.Security.SecurityNegotiationException: The X.509 certificate CN=servicebus.windows.net chain building failed. The certificate that was used has a trust chain that cannot be verified. Replace the certificate or change the certificateValidationMode. The evocation function was unable to check revocation because the revocation server was offline. - by this point we'd given up on the HTTP proxy and opened the TCP ports. We got this error when the relay binding does it's authentication hop to ACS. The messaging traffic is TCP, but the control traffic still goes over HTTP, and as part of the ACS authentication the process checks with a revocation server to see if Microsoft’s ACS cert is still valid, so the proxy still needs some clearance. The service account (the IIS app pool identity) needs access to: www.public-trust.com mscrl.microsoft.com We still got this error periodically with different accounts running the app pool. We fixed that by ensuring the machine-wide proxy settings are set up, so every account uses the correct proxy: netsh winhttp set proxy proxy-server="http://proxy.x.y.z" - and you might need to run this to clear out your credential cache: certutil -urlcache * delete If your network guys end up grudgingly opening ports, they can restrict connections to the IP address range for your chosen Azure datacentre, which might make them happier - see Windows Azure Datacenter IP Ranges. After all that you've hopefully got an on-premise service listening in the cloud, which you can consume from pretty much any technology.

Read the article
Emit Knowledge - social network for knowledge sharing

- by hajan

Emit Knowledge, as the words refer - it's a social network for emitting / sharing knowledge from users by users. Those who can benefit the most out of this network is perhaps all of YOU who have something to share with others and contribute to the knowledge world. I've been closely communicating with the core team of this very, very interesting, brand new social network (with specific purpose!) about the concept, idea and the vision they have for their product and I can say with a lot of confidence that this network has real potential to become something from which we will all benefit. I won't speak much about that and would prefer to give you link and try it yourself - http://www.emitknowledge.com Mainly, through the past few months I've been testing this network and it is getting improved all the time. The user experience is great, you can easily find out what you need and it follows some known patterns that are common for all social networks. They have some real good ideas and plans that are already under development for the next updates of their product. You can do micro blogging or you can do regular normal blogging… it’s up to you, and the way it works, it is seamless. Here is a short Question and Answers (QA) interview I made with the lead of the team, Marijan Nikolovski: 1. Can you please explain us briefly, what is Emit Knowledge? Emit Knowledge is a brand new knowledge based social network, delivering quality content from users to users. We believe that people’s knowledge, experience and professional thoughts compose quality content, worth sharing among millions around the world. Therefore, we created the platform that matches people’s need to share and gain knowledge in the most suitable and comfortable way. Easy to work with, Emit Knowledge lets you to smoothly craft and emit knowledge around the globe. 2. How 'old' is Emit Knowledge? In hamster’s years we are almost five years old start-up :). Just kidding. We’ve released our public beta about three months ago. Our official release date is 27 of June 2012. 3. How did you come up with this idea? Everything started from a simple idea to solve a complex problem. We’ve seen that the social web has become polluted with data and is on the right track to lose its base principles – socialization and common cause. That was our start point. We’ve gathered the team, drew some sketches and started to mind map the idea. After several idea refactoring’s Emit Knowledge was born. 4. Is there any competition out there in the market? Currently we don't have any competitors that share the same cause. What makes our platform different is the ideology that our product promotes and the functionalities that our platform offers for easy socialization based on interests and knowledge sharing. 5. What are the main technologies used to build Emit Knowledge? Emit Knowledge was built on a heterogeneous pallet of technologies. Currently, we have four of separation: UI – Built on ASP.NET MVC3 and Knockout.js; Messaging infrastructure – Build on top of RabbitMQ; Background services – Our in-house solution for job distribution, orchestration and processing; Data storage – Build on top of MongoDB; What are the main reasons you've chosen ASP.NET MVC? Since all of our team members are .NET engineers, the decision was very natural. ASP.NET MVC is the only Microsoft web stack that sticks to the HTTP behavioral standards. It is easy to work with, have a tiny learning curve and everyone who is familiar with the HTTP will understand its architecture and convention without any difficulties. 6. What are the main reasons for choosing ASP.NET MVC? Since all of our team members are .NET engineers, the decision was very natural. ASP.NET MVC is the only Microsoft web stack that sticks to the HTTP behavioral standards. It is easy to work with, have a tiny learning curve and everyone who is familiar with the HTTP will understand its architecture and convention without any difficulties. 7. Did you use some of the latest Microsoft technologies? If yes, which ones? Yes, we like to rock the cutting edge tech house. Currently we are using Microsoft’s latest technologies like ASP.NET MVC, Web API (work in progress) and the best for the last; we are utilizing Windows Azure IaaS to the bone. 8. Can you please tell us shortly, what would be the benefit of regular bloggers in other blogging platforms to join Emit Knowledge? Well, unless you are some of the smoking ace gurus whose blogs are followed by a large number of users, our platform offers knowledge based segregated community equipped with tools that will enable both current and future users to expand their relations and to self-promote in the community based on their activity and knowledge sharing. 10. I see you are working very intensively and there is already integration with some third-party services to make the process of sharing and emitting knowledge easier, which services did you integrate until now and what do you plan do to next? We have “reemit” functionality for internal sharing and we also support external services like: Twitter; LinkedIn; Facebook; For the regular bloggers we have an extra cream, Windows Live Writer support for easy blog posts emitting. 11. What should we expect next? Currently, we are working on a new fancy community feature. This means that we are going to support user groups to be formed. So for all existing communities and user groups out there, wait us a little bit, we are coming for rescue :). One of the top next features they are developing is the Community Feature. It means, if you have your own User Group, Community Group or any other Group on which you and your users are mostly blogging or sharing (emitting) knowledge in various ways, Emit Knowledge as a platform will help you have everything you need to promote your group, make new followers and host all the necessary stuff that you have had need of. I would invite you to try the network and start sharing knowledge in a way that will help you gather new followers and spread your knowledge faster, easier and in a more efficient way! Let’s Emit Knowledge!

Read the article
SortedDictionary and SortedList

- by Simon Cooper

Apart from Dictionary<TKey, TValue>, there's two other dictionaries in the BCL - SortedDictionary<TKey, TValue> and SortedList<TKey, TValue>. On the face of it, these two classes do the same thing - provide an IDictionary<TKey, TValue> interface where the iterator returns the items sorted by the key. So what's the difference between them, and when should you use one rather than the other? (as in my previous post, I'll assume you have some basic algorithm & datastructure knowledge) SortedDictionary We'll first cover SortedDictionary. This is implemented as a special sort of binary tree called a red-black tree. Essentially, it's a binary tree that uses various constraints on how the nodes of the tree can be arranged to ensure the tree is always roughly balanced (for more gory algorithmical details, see the wikipedia link above). What I'm concerned about in this post is how the .NET SortedDictionary is actually implemented. In .NET 4, behind the scenes, the actual implementation of the tree is delegated to a SortedSet<KeyValuePair<TKey, TValue>>. One example tree might look like this: Each node in the above tree is stored as a separate SortedSet<T>.Node object (remember, in a SortedDictionary, T is instantiated to KeyValuePair<TKey, TValue>): class Node { public bool IsRed; public T Item; public SortedSet<T>.Node Left; public SortedSet<T>.Node Right; } The SortedSet only stores a reference to the root node; all the data in the tree is accessed by traversing the Left and Right node references until you reach the node you're looking for. Each individual node can be physically stored anywhere in memory; what's important is the relationship between the nodes. This is also why there is no constructor to SortedDictionary or SortedSet that takes an integer representing the capacity; there are no internal arrays that need to be created and resized. This may seen trivial, but it's an important distinction between SortedDictionary and SortedList that I'll cover later on. And that's pretty much it; it's a standard red-black tree. Plenty of webpages and datastructure books cover the algorithms behind the tree itself far better than I could. What's interesting is the comparions between SortedDictionary and SortedList, which I'll cover at the end. As a side point, SortedDictionary has existed in the BCL ever since .NET 2. That means that, all through .NET 2, 3, and 3.5, there has been a bona-fide sorted set class in the BCL (called TreeSet). However, it was internal, so it couldn't be used outside System.dll. Only in .NET 4 was this class exposed as SortedSet. SortedList Whereas SortedDictionary didn't use any backing arrays, SortedList does. It is implemented just as the name suggests; two arrays, one containing the keys, and one the values (I've just used random letters for the values): The items in the keys array are always guarenteed to be stored in sorted order, and the value corresponding to each key is stored in the same index as the key in the values array. In this example, the value for key item 5 is 'z', and for key item 8 is 'm'. Whenever an item is inserted or removed from the SortedList, a binary search is run on the keys array to find the correct index, then all the items in the arrays are shifted to accomodate the new or removed item. For example, if the key 3 was removed, a binary search would be run to find the array index the item was at, then everything above that index would be moved down by one: and then if the key/value pair {7, 'f'} was added, a binary search would be run on the keys to find the index to insert the new item, and everything above that index would be moved up to accomodate the new item: If another item was then added, both arrays would be resized (to a length of 10) before the new item was added to the arrays. As you can see, any insertions or removals in the middle of the list require a proportion of the array contents to be moved; an O(n) operation. However, if the insertion or removal is at the end of the array (ie the largest key), then it's only O(log n); the cost of the binary search to determine it does actually need to be added to the end (excluding the occasional O(n) cost of resizing the arrays to fit more items). As a side effect of using backing arrays, SortedList offers IList Keys and Values views that simply use the backing keys or values arrays, as well as various methods utilising the array index of stored items, which SortedDictionary does not (and cannot) offer. The Comparison So, when should you use one and not the other? Well, here's the important differences: Memory usage SortedDictionary and SortedList have got very different memory profiles. SortedDictionary... has a memory overhead of one object instance, a bool, and two references per item. On 64-bit systems, this adds up to ~40 bytes, not including the stored item and the reference to it from the Node object. stores the items in separate objects that can be spread all over the heap. This helps to keep memory fragmentation low, as the individual node objects can be allocated wherever there's a spare 60 bytes. In contrast, SortedList... has no additional overhead per item (only the reference to it in the array entries), however the backing arrays can be significantly larger than you need; every time the arrays are resized they double in size. That means that if you add 513 items to a SortedList, the backing arrays will each have a length of 1024. To conteract this, the TrimExcess method resizes the arrays back down to the actual size needed, or you can simply assign list.Capacity = list.Count. stores its items in a continuous block in memory. If the list stores thousands of items, this can cause significant problems with Large Object Heap memory fragmentation as the array resizes, which SortedDictionary doesn't have. Performance Operations on a SortedDictionary always have O(log n) performance, regardless of where in the collection you're adding or removing items. In contrast, SortedList has O(n) performance when you're altering the middle of the collection. If you're adding or removing from the end (ie the largest item), then performance is O(log n), same as SortedDictionary (in practice, it will likely be slightly faster, due to the array items all being in the same area in memory, also called locality of reference). So, when should you use one and not the other? As always with these sort of things, there are no hard-and-fast rules. But generally, if you: need to access items using their index within the collection are populating the dictionary all at once from sorted data aren't adding or removing keys once it's populated then use a SortedList. But if you: don't know how many items are going to be in the dictionary are populating the dictionary from random, unsorted data are adding & removing items randomly then use a SortedDictionary. The default (again, there's no definite rules on these sort of things!) should be to use SortedDictionary, unless there's a good reason to use SortedList, due to the bad performance of SortedList when altering the middle of the collection.

Read the article
Converting projects to use Automatic NuGet restore

- by terje

Originally posted on: http://geekswithblogs.net/terje/archive/2014/06/11/converting-projects-to-use-automatic-nuget-restore.aspxDownload tool In version 2.7 of NuGet automatic nuget restore was introduced, meaning you no longer need to distort your msbuild project files with nuget target information. Visual Studio and TFS 2013 build have this enabled by default. However, if your project was created before this was introduced, and/or if you have used the “Enable NuGet Package Restore” afterwards, you now have a series of unwanted things in your projects, and a series of project files that have been modified – and – you no longer neither want nor need this ! You might also get into some unwanted issues due to these modifications. This is a MSBuild modification that was needed only before NuGet 2.7 ! So: DON’T USE THIS FUNCTION !!! There is an issue https://nuget.codeplex.com/workitem/4019 on this on the NuGet project site to get this function removed, renamed or at least moved farther away from the top level (please help vote it up!). The response seems to be that it WILL BE removed, around version 3.0. This function does nothing you need after the introduction of NuGet 2.7. What is also unfortunate is the naming of it – it implies that it is needed, it is not, and what is worse, there is no corresponding function to remove what it does ! So to fix this use the tool named IFix, that will fix this issue for you - all free of course, and the code is open source. Also report issues there: https://github.com/OsirisTerje/IFix IFix information DOWNLOAD HERE This command line tool installs using an MSI, and add itself to the system path. If you work in a team, you will probably need to use the tool multiple times. Anyone in the team may at any time use the “Enable NuGet Package Restore” function and mess up your project again. The IFix program can be run either in a check modus, where it does not write anything back – it only checks if you have any issues, or in a Fix mode, where it will also perform the necessary fixes for you. The IFix program is used like this: IFix <command> [-c/--check] [-f/--fix] [-v/--verbose] The command in this case is “nugetrestore”. It will do a check from the location where it is being called, and run through all subfolders from that location. So “IFix nugetrestore --check” , will do the check , and “IFix nugetrestore --fix” will perform the changes, for all files and folders below the current working directory. (Note that --check can be replaced with only –c, and --fix with –f, and so on. ) BEWARE: When you run the fix option, all solutions to be affected must be closed in Visual Studio ! So, if you just want to DO it, then: IFix nugetrestore --check to see if you have issues then IFix nugetrestore --fix to fix them. How does it work IFix nugetrestore checks and optionally fixes four issues that the older enabling of nuget restore did. The issues are related to the MSBuild projess, and are: Deleting the nuget.targets file. Deleting the nuget.exe that is located under the .nuget folder Removing all references to nuget.targets in the solution file Removing all properties and target imports of nuget.targets inside the csproj files. IFix fixes these issues in the same sequence. The first step, removing the nuget.targets file is the most critical one, and all instances of the nuget.targets file within the scope of a solution has to be removed, and in addition it has to be done with the solution closed in Visual Studio. If Visual Studio finds a nuget.targets file, the csproj files will be automatically messed up again. This means the removal process above might need to be done multiple times, specially when you’re working with a team, and that solution context menu still has the “Enable NuGet Package Restore” function. Someone on the team might inadvertently do this at any time. It can be a good idea to add this check to a checkin policy – if you run TFS standard version control, but that will have no effect if you use TFS Git version control of course. So, better be prepared to run the IFix check from time to time. Or, even better, install IFix on your build servers, and add a call to IFix nugetrestore --check in the TFS Build script. How does it look As a first example I have run the IFix program from the top of a set of git repositories, so it spans multiple repositories with multiple solutions. The result from the check option is as follows: We see the four red lines, there is one for each of the four checks we talked about in the previous section. The fact that they are red, means we have that particular issue. The first section (above the first red text line) is the nuget targets section. Notice No.1, it says it has found no paths to copy. What IFix does here is to check if there are any defined paths to other nuget galleries. If there are, then those are copied over to the nuget.config file, where is where it should be in version 2.7 and above. No.2 says it has found the particular nuget.targets file, No.3 states it HAS found some other nuget galleries defines in the targets file, which then it would like to copy to the config.file. No.4 is the section for nuget.exe files, and list those it has found, and which it would like to delete. No 5 states it has found a reference to nuget.targets in the solution file. This reference comes from the fact that the .nuget folder is a solution folder, and the items within are described in the solution file. It then checks the csproj files, and as can be seen from the last red line, it ha found issues in 96 out of 198 csproj files. There are two possible issues in a csproj files. No.6 is the first one, and the most common and most important one, an “Import project” section. This is the section that calls the nuget.targets files. No.7 is another issue, which seems to sometimes be there, sometimes not, it is a RestorePackages property, which also should go away. Now, if we run the IFix nugetrestore –fix command, and then the check again after that, the result is: All green !

Read the article
SQL SERVER – Data Sources and Data Sets in Reporting Services SSRS

- by Pinal Dave

This example is from the Beginning SSRS by Kathi Kellenberger. Supporting files are available with a free download from the www.Joes2Pros.com web site. This example is from the Beginning SSRS. Supporting files are available with a free download from the www.Joes2Pros.com web site. Connecting to Your Data? When I was a child, the telephone book was an important part of my life. Maybe I was just a nerd, but I enjoyed getting a new book every year to page through to learn about the businesses in my small town or to discover where some of my school acquaintances lived. It was also the source of maps to my town’s neighborhoods and the towns that surrounded me. To make a phone call, I would need a telephone number. In order to find a telephone number, I had to know how to use the telephone book. That seems pretty simple, but it resembles connecting to any data. You have to know where the data is and how to interact with it. A data source is the connection information that the report uses to connect to the database. You have two choices when creating a data source, whether to embed it in the report or to make it a shared resource usable by many reports. Data Sources and Data Sets A few basic terms will make the upcoming choses make more sense. What database on what server do you want to connect to? It would be better to just ask… “what is your data source?” The connection you need to make to get your reports data is called a data source. If you connected to a data source (like the JProCo database) there may be hundreds of tables. You probably only want data from just a few tables. This means you want to write a specific query against this data source. A query on a data source to get just the records you need for an SSRS report is called a Data Set. Creating a local Data Source You can connect embed a connection from your report directly to your JProCo database which (let’s say) is installed on a server named Reno. If you move JProCo to a new server named Tampa then you need to update the Data Set. If you have 10 reports in one project that were all pointing to the JProCo database on the Reno server then they would all need to be updated at once. It’s possible to make a project level Data Source and have each report use that. This means one change can fix all 10 reports at once. This would be called a Shared Data Source. Creating a Shared Data Source The best advice I can give you is to create shared data sources. The reason I recommend this is that if a database moves to a new server you will have just one place in Report Manager to make the server name change. That one change will update the connection information in all the reports that use that data source. To get started, you will start with a fresh project. Go to Start > All Programs > SQL Server 2012 > Microsoft SQL Server Data Tools to launch SSDT. Once SSDT is running, click New Project to create a new project. Once the New Project dialog box appears, fill in the form, as shown in. Be sure to select Report Server Project this time – not the wizard. Click OK to dismiss the New Project dialog box. You should now have an empty project, as shown in the Solution Explorer. A report is meant to show you data. Where is the data? The first task is to create a Shared Data Source. Right-click on the Shared Data Sources folder and choose Add New Data Source. The Shared Data Source Properties dialog box will launch where you can fill in a name for the data source. By default, it is named DataSource1. The best practice is to give the data source a more meaningful name. It is possible that you will have projects with more than one data source and, by naming them, you can tell one from another. Type the name JProCo for the data source name and click the Edit button to configure the database connection properties. If you take a look at the types of data sources you can choose, you will see that SSRS works with many data platforms including Oracle, XML, and Teradata. Make sure SQL Server is selected before continuing. For this post, I am assuming that you are using a local SQL Server and that you can use your Windows account to log in to the SQL Server. If, for some reason you must use SQL Server Authentication, choose that option and fill in your SQL Server account credentials. Otherwise, just accept Windows Authentication. If your database server was installed locally and with the default instance, just type in Localhost for the Server name. Select the JProCo database from the database list. At this point, the connection properties should look like. If you have installed a named instance of SQL Server, you will have to specify the server name like this: Localhost\InstanceName, replacing the InstanceName with whatever your instance name is. If you are not sure about the named instance, launch the SQL Server Configuration Manager found at Start > All Programs > Microsoft SQL Server 2012 > Configuration Tools. If you have a named instance, the name will be shown in parentheses. A default instance of SQL Server will display MSSQLSERVER; a named instance will display the name chosen during installation. Once you get the connection properties filled in, click OK to dismiss the Connection Properties dialog box and OK again to dismiss the Shared Data Source properties. You now have a data source in the Solution Explorer. What’s next I really need to thank Kathi Kellenberger and Rick Morelan for sharing this material for this 5 day series of posts on SSRS. To get really comfortable with SSRS you will get to know the different SSDT windows, Build reports on your own (without the wizards), Add report headers and footers, Accept user input, create levels, charts, or even maps for visual appeal. You might be surprise to know a small 230 page book starts from the very beginning and covers the steps to do all these items. Beginning SSRS 2012 is a small easy to follow book so you can learn SSRS for less than $20. See Joes2Pros.com for more on this and other books. If you want to learn SSRS in easy to simple words – I strongly recommend you to get Beginning SSRS book from Joes 2 Pros. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: Reporting Services, SSRS

Read the article
Contracting as a Software Developer in the UK

- by Frez

Normal 0 false false false EN-GB X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin-top:0cm; mso-para-margin-right:0cm; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0cm; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi; mso-fareast-language:EN-US;} Having had some 15 years’ experience of working as a software contractor, I am often asked by developers who work as permanent employees (permies) about the pros and cons of working as a software consultant through my own limited company and whether the move would be a good one for them. Whilst it is possible to contract using other financial vehicles such as umbrella companies, this article will only consider limited companies as that is what I have experience of using. Contracting or consultancy requires a different mind-set from being a permanent member of staff, and not all developers are capable of this shift in attitude. Whilst you can look forward to an increase in the money you take home, there are real risks and expenses you would not normally be exposed to as a permie. So let us have a look at the pros and cons: Pros: More money There is no doubt that whilst you are working on contracts you will earn significantly more than you would as a permanent employee. Furthermore, working through a limited company is more tax efficient. Less politics You really have no need to involve yourself in office politics. When the end of the day comes you can go home and not think or worry about the power struggles within the company you are contracted to. Your career progression is not tied to the company. Expenses from gross income All your expenses of trading as a business will come out of your company’s gross income, i.e. before tax. This covers travelling expenses provided you have not been at the same client/location for more than two years, internet subscriptions, professional subscriptions, software, hardware, accountancy services and so on. Cons: Work is more transient Contracts typically range from a couple of weeks to a year, although will most likely start at 3 months. However, most contracts are extended either because the project you have been brought in to help with takes longer to deliver than expected, the client decides they can use you on other aspects of the project, or the client decides they would like to use you on other projects. The temporary nature of the work means that you will have down-time between contracts while you secure new opportunities during which time your company will have no income. You may need to attend several interviews before securing a new contract. Accountancy expenses Your company is a separate entity and there are accountancy requirements which, unless you like paperwork, means your company will need to appoint an accountant to prepare your company’s accounts. It may also be worth purchasing some accountancy software, so talk to your accountant about this as they may prefer you to use a particular software package so they can integrate it with their systems. VAT You will need to register your company for VAT. This is tax neutral for you as the VAT you charge your clients you will pass onto the government less any VAT you are reclaiming from expenses, but it is additional paperwork to undertake each quarter. It is worth checking out the Fixed Rate VAT Scheme that is available, particularly after the initial expenses of setting up your company are over. No training Clients take you on based on your skills, not to train you when they will lose that investment at the end of the contract, so understand that it is unlikely you will receive any training funded by a client. However, learning new skills during a contract is possible and you may choose to accept a contract on a lower rate if this is guaranteed as it will help secure future contracts. No financial extras You will have no free pension, life, accident, sickness or medical insurance unless you choose to purchase them yourself. A financial advisor can give you all the necessary advice in this area, and it is worth taking seriously. A year after I started as a consultant I contracted a serious illness, this kept me off work for over two months, my client was very understanding and it could have been much worse, so it is worth considering what your options might be in the case of illness, death and retirement. Agencies Whilst it is possible to work directly for end clients there are pros and cons of working through an agency. The main advantage is cash flow, you invoice the agency and they typically pay you within a week, whereas working directly for a client could have you waiting up to three months to be paid. The downside of working for agencies, especially in the current difficult times, is that they may go out of business and you then have difficulty getting the money you are owed. Tax investigation It is possible that the Inland Revenue may decide to investigate your company for compliance with tax law. Insurance is available to cover you for this. My personal recommendation would be to join the PCG as this insurance is included as a benefit of membership, Professional Indemnity Some agencies require that you are covered by professional indemnity insurance; this is a cost you would not incur as a permie. Travel Unless you live in an area that has an abundance of opportunities, such as central London, it is likely that you will be travelling further, longer and with more expense than if you were permanently employed at a local company. This not only affects you monetarily, but also your quality of life and the ability to keep fit and healthy. Obtaining finance If you want to secure a mortgage on a property it can be more difficult or expensive, especially if you do not have three years of audited accounts to show a mortgage lender. Caveat This post is my personal opinion and should not be used as a definitive guide or recommendation to contracting and whether it is suitable for you as an individual, i.e. I accept no responsibility if you decide to take up contracting based on this post and you fare badly for whatever reason.

Read the article
The SSIS tuning tip that everyone misses

- by Rob Farley

I know that everyone misses this, because I’m yet to find someone who doesn’t have a bit of an epiphany when I describe this. When tuning Data Flows in SQL Server Integration Services, people see the Data Flow as moving from the Source to the Destination, passing through a number of transformations. What people don’t consider is the Source, getting the data out of a database. Remember, the source of data for your Data Flow is not your Source Component. It’s wherever the data is, within your database, probably on a disk somewhere. You need to tune your query to optimise it for SSIS, and this is what most people fail to do. I’m not suggesting that people don’t tune their queries – there’s plenty of information out there about making sure that your queries run as fast as possible. But for SSIS, it’s not about how fast your query runs. Let me say that again, but in bolder text: The speed of an SSIS Source is not about how fast your query runs. If your query is used in a Source component for SSIS, the thing that matters is how fast it starts returning data. In particular, those first 10,000 rows to populate that first buffer, ready to pass down the rest of the transformations on its way to the Destination. Let’s look at a very simple query as an example, using the AdventureWorks database: We’re picking the different Weight values out of the Product table, and it’s doing this by scanning the table and doing a Sort. It’s a Distinct Sort, which means that the duplicates are discarded. It'll be no surprise to see that the data produced is sorted. Obvious, I know, but I'm making a comparison to what I'll do later. Before I explain the problem here, let me jump back into the SSIS world... If you’ve investigated how to tune an SSIS flow, then you’ll know that some SSIS Data Flow Transformations are known to be Blocking, some are Partially Blocking, and some are simply Row transformations. Take the SSIS Sort transformation, for example. I’m using a larger data set for this, because my small list of Weights won’t demonstrate it well enough. Seven buffers of data came out of the source, but none of them could be pushed past the Sort operator, just in case the last buffer contained the data that would be sorted into the first buffer. This is a blocking operation. Back in the land of T-SQL, we consider our Distinct Sort operator. It’s also blocking. It won’t let data through until it’s seen all of it. If you weren’t okay with blocking operations in SSIS, why would you be happy with them in an execution plan? The source of your data is not your OLE DB Source. Remember this. The source of your data is the NCIX/CIX/Heap from which it’s being pulled. Picture it like this... the data flowing from the Clustered Index, through the Distinct Sort operator, into the SELECT operator, where a series of SSIS Buffers are populated, flowing (as they get full) down through the SSIS transformations. Alright, I know that I’m taking some liberties here, because the two queries aren’t the same, but consider the visual. The data is flowing from your disk and through your execution plan before it reaches SSIS, so you could easily find that a blocking operation in your plan is just as painful as a blocking operation in your SSIS Data Flow. Luckily, T-SQL gives us a brilliant query hint to help avoid this. OPTION (FAST 10000) This hint means that it will choose a query which will optimise for the first 10,000 rows – the default SSIS buffer size. And the effect can be quite significant. First let’s consider a simple example, then we’ll look at a larger one. Consider our weights. We don’t have 10,000, so I’m going to use OPTION (FAST 1) instead. You’ll notice that the query is more expensive, using a Flow Distinct operator instead of the Distinct Sort. This operator is consuming 84% of the query, instead of the 59% we saw from the Distinct Sort. But the first row could be returned quicker – a Flow Distinct operator is non-blocking. The data here isn’t sorted, of course. It’s in the same order that it came out of the index, just with duplicates removed. As soon as a Flow Distinct sees a value that it hasn’t come across before, it pushes it out to the operator on its left. It still has to maintain the list of what it’s seen so far, but by handling it one row at a time, it can push rows through quicker. Overall, it’s a lot more work than the Distinct Sort, but if the priority is the first few rows, then perhaps that’s exactly what we want. The Query Optimizer seems to do this by optimising the query as if there were only one row coming through: This 1 row estimation is caused by the Query Optimizer imagining the SELECT operation saying “Give me one row” first, and this message being passed all the way along. The request might not make it all the way back to the source, but in my simple example, it does. I hope this simple example has helped you understand the significance of the blocking operator. Now I’m going to show you an example on a much larger data set. This data was fetching about 780,000 rows, and these are the Estimated Plans. The data needed to be Sorted, to support further SSIS operations that needed that. First, without the hint. ...and now with OPTION (FAST 10000): A very different plan, I’m sure you’ll agree. In case you’re curious, those arrows in the top one are 780,000 rows in size. In the second, they’re estimated to be 10,000, although the Actual figures end up being 780,000. The top one definitely runs faster. It finished several times faster than the second one. With the amount of data being considered, these numbers were in minutes. Look at the second one – it’s doing Nested Loops, across 780,000 rows! That’s not generally recommended at all. That’s “Go and make yourself a coffee” time. In this case, it was about six or seven minutes. The faster one finished in about a minute. But in SSIS-land, things are different. The particular data flow that was consuming this data was significant. It was being pumped into a Script Component to process each row based on previous rows, creating about a dozen different flows. The data flow would take roughly ten minutes to run – ten minutes from when the data first appeared. The query that completes faster – chosen by the Query Optimizer with no hints, based on accurate statistics (rather than pretending the numbers are smaller) – would take a minute to start getting the data into SSIS, at which point the ten-minute flow would start, taking eleven minutes to complete. The query that took longer – chosen by the Query Optimizer pretending it only wanted the first 10,000 rows – would take only ten seconds to fill the first buffer. Despite the fact that it might have taken the database another six or seven minutes to get the data out, SSIS didn’t care. Every time it wanted the next buffer of data, it was already available, and the whole process finished in about ten minutes and ten seconds. When debugging SSIS, you run the package, and sit there waiting to see the Debug information start appearing. You look for the numbers on the data flow, and seeing operators going Yellow and Green. Without the hint, I’d sit there for a minute. With the hint, just ten seconds. You can imagine which one I preferred. By adding this hint, it felt like a magic wand had been waved across the query, to make it run several times faster. It wasn’t the case at all – but it felt like it to SSIS.

Read the article
Routes on a sphere surface - Find geodesic?

- by CaNNaDaRk

I'm working with some friends on a browser based game where people can move on a 2D map. It's been almost 7 years and still people play this game so we are thinking of a way to give them something new. Since then the game map was a limited plane and people could move from (0, 0) to (MAX_X, MAX_Y) in quantized X and Y increments (just imagine it as a big chessboard). We believe it's time to give it another dimension so, just a couple of weeks ago, we began to wonder how the game could look with other mappings: Unlimited plane with continous movement: this could be a step forward but still i'm not convinced. Toroidal World (continous or quantized movement): sincerely I worked with torus before but this time I want something more... Spherical world with continous movement: this would be great! What we want Users browsers are given a list of coordinates like (latitude, longitude) for each object on the spherical surface map; browsers must then show this in user's screen rendering them inside a web element (canvas maybe? this is not a problem). When people click on the plane we convert the (mouseX, mouseY) to (lat, lng) and send it to the server which has to compute a route between current user's position to the clicked point. What we have We began writing a Java library with many useful maths to work with Rotation Matrices, Quaternions, Euler Angles, Translations, etc. We put it all together and created a program that generates sphere points, renders them and show them to the user inside a JPanel. We managed to catch clicks and translate them to spherical coords and to provide some other useful features like view rotation, scale, translation etc. What we have now is like a little (very little indeed) engine that simulates client and server interaction. Client side shows points on the screen and catches other interactions, server side renders the view and does other calculus like interpolating the route between current position and clicked point. Where is the problem? Obviously we want to have the shortest path to interpolate between the two route points. We use quaternions to interpolate between two points on the surface of the sphere and this seemed to work fine until i noticed that we weren't getting the shortest path on the sphere surface: We though the problem was that the route is calculated as the sum of two rotations about X and Y axis. So we changed the way we calculate the destination quaternion: We get the third angle (the first is latitude, the second is longitude, the third is the rotation about the vector which points toward our current position) which we called orientation. Now that we have the "orientation" angle we rotate Z axis and then use the result vector as the rotation axis for the destination quaternion (you can see the rotation axis in grey): What we got is the correct route (you can see it lays on a great circle), but we get to this ONLY if the starting route point is at latitude, longitude (0, 0) which means the starting vector is (sphereRadius, 0, 0). With the previous version (image 1) we don't get a good result even when startin point is 0, 0, so i think we're moving towards a solution, but the procedure we follow to get this route is a little "strange" maybe? In the following image you get a view of the problem we get when starting point is not (0, 0), as you can see starting point is not the (sphereRadius, 0, 0) vector, and as you can see the destination point (which is correctly drawn!) is not on the route. The magenta point (the one which lays on the route) is the route's ending point rotated about the center of the sphere of (-startLatitude, 0, -startLongitude). This means that if i calculate a rotation matrix and apply it to every point on the route maybe i'll get the real route, but I start to think that there's a better way to do this. Maybe I should try to get the plane through the center of the sphere and the route points, intersect it with the sphere and get the geodesic? But how? Sorry for being way too verbose and maybe for incorrect English but this thing is blowing my mind! EDIT: This code version is related to the first image: public void setRouteStart(double lat, double lng) { EulerAngles tmp = new EulerAngles ( Math.toRadians(lat), 0, -Math.toRadians(lng)); //set route start Quaternion qtStart.setInertialToObject(tmp); //do other stuff like drawing start point... } public void impostaDestinazione(double lat, double lng) { EulerAngles tmp = new AngoliEulero( Math.toRadians(lat), 0, -Math.toRadians(lng)); qtEnd.setInertialToObject(tmp); //do other stuff like drawing dest point... } public V3D interpolate(double totalTime, double t) { double _t = t/totalTime; Quaternion q = Quaternion.Slerp(qtStart, qtEnd, _t); RotationMatrix.inertialQuatToIObject(q); V3D p = matInt.inertialToObject(V3D.Xaxis.scale(sphereRadius)); //other stuff, like drawing point ... return p; } //mostly taken from a book! public static Quaternion Slerp(Quaternion q0, Quaternion q1, double t) { double cosO = q0.dot(q1); double q1w = q1.w; double q1x = q1.x; double q1y = q1.y; double q1z = q1.z; if (cosO < 0.0f) { q1w = -q1w; q1x = -q1x; q1y = -q1y; q1z = -q1z; cosO = -cosO; } double sinO = Math.sqrt(1.0f - cosO*cosO); double O = Math.atan2(sinO, cosO); double oneOverSinO = 1.0f / senoOmega; k0 = Math.sin((1.0f - t) * O) * oneOverSinO; k1 = Math.sin(t * O) * oneOverSinO; // Interpolate return new Quaternion( k0*q0.w + k1*q1w, k0*q0.x + k1*q1x, k0*q0.y + k1*q1y, k0*q0.z + k1*q1z ); } A little dump of what i get (again check image 1): Route info: Sphere radius and center: 200,000, (0.0, 0.0, 0.0) Route start: lat 0,000 °, lng 0,000 ° @v: (200,000, 0,000, 0,000), |v| = 200,000 Route end: lat 30,000 °, lng 30,000 ° @v: (150,000, 86,603, 100,000), |v| = 200,000 Qt dump: (w, x, y, z), rot. angle°, (x, y, z) rot. axis Qt start: (1,000, 0,000, -0,000, 0,000); 0,000 °; (1,000, 0,000, 0,000) Qt end: (0,933, 0,067, -0,250, 0,250); 42,181 °; (0,186, -0,695, 0,695) Route start: lat 30,000 °, lng 10,000 ° @v: (170,574, 30,077, 100,000), |v| = 200,000 Route end: lat 80,000 °, lng -50,000 ° @v: (22,324, -26,604, 196,962), |v| = 200,000 Qt dump: (w, x, y, z), rot. angle°, (x, y, z) rot. axis Qt start: (0,962, 0,023, -0,258, 0,084); 31,586 °; (0,083, -0,947, 0,309) Qt end: (0,694, -0,272, -0,583, -0,324); 92,062 °; (-0,377, -0,809, -0,450)

Read the article
Oracle Database 12 c New Partition Maintenance Features by Gwen Lazenby

- by hamsun

One of my favourite new features in Oracle Database 12c is the ability to perform partition maintenance operations on multiple partitions. This means we can now add, drop, truncate and merge multiple partitions in one operation, and can split a single partition into more than two partitions also in just one command. This would certainly have made my life slightly easier had it been available when I administered a data warehouse at Oracle 9i. To demonstrate this new functionality and syntax, I am going to create two tables, ORDERS and ORDERS_ITEMS which have a parent-child relationship. ORDERS is to be partitioned using range partitioning on the ORDER_DATE column, and ORDER_ITEMS is going to partitioned using reference partitioning and its foreign key relationship with the ORDERS table. This form of partitioning was a new feature in 11g and means that any partition maintenance operations performed on the ORDERS table will also take place on the ORDER_ITEMS table as well. First create the ORDERS table - SQL CREATE TABLE orders ( order_id NUMBER(12), order_date TIMESTAMP, order_mode VARCHAR2(8), customer_id NUMBER(6), order_status NUMBER(2), order_total NUMBER(8,2), sales_rep_id NUMBER(6), promotion_id NUMBER(6), CONSTRAINT orders_pk PRIMARY KEY(order_id) ) PARTITION BY RANGE(order_date) (PARTITION Q1_2007 VALUES LESS THAN (TO_DATE('01-APR-2007','DD-MON-YYYY')), PARTITION Q2_2007 VALUES LESS THAN (TO_DATE('01-JUL-2007','DD-MON-YYYY')), PARTITION Q3_2007 VALUES LESS THAN (TO_DATE('01-OCT-2007','DD-MON-YYYY')), PARTITION Q4_2007 VALUES LESS THAN (TO_DATE('01-JAN-2008','DD-MON-YYYY')) ); Table created. Now the ORDER_ITEMS table SQL CREATE TABLE order_items ( order_id NUMBER(12) NOT NULL, line_item_id NUMBER(3) NOT NULL, product_id NUMBER(6) NOT NULL, unit_price NUMBER(8,2), quantity NUMBER(8), CONSTRAINT order_items_fk FOREIGN KEY(order_id) REFERENCES orders(order_id) on delete cascade) PARTITION BY REFERENCE(order_items_fk) tablespace example; Table created. Now look at DBA_TAB_PARTITIONS to get details of what partitions we have in the two tables – SQL select table_name,partition_name, partition_position position, high_value from dba_tab_partitions where table_owner='SH' and table_name like 'ORDER_%' order by partition_position, table_name; TABLE_NAME PARTITION_NAME POSITION HIGH_VALUE -------------- --------------- -------- ------------------------- ORDERS Q1_2007 1 TIMESTAMP' 2007-04-01 00:00:00' ORDER_ITEMS Q1_2007 1 ORDERS Q2_2007 2 TIMESTAMP' 2007-07-01 00:00:00' ORDER_ITEMS Q2_2007 2 ORDERS Q3_2007 3 TIMESTAMP' 2007-10-01 00:00:00' ORDER_ITEMS Q3_2007 3 ORDERS Q4_2007 4 TIMESTAMP' 2008-01-01 00:00:00' ORDER_ITEMS Q4_2007 4 Just as an aside it is also now possible in 12c to use interval partitioning on reference partitioned tables. In 11g it was not possible to combine these two new partitioning features. For our first example of the new 12cfunctionality, let us add all the partitions necessary for 2008 to the tables using one command. Notice that the partition specification part of the add command is identical in format to the partition specification part of the create command as shown above - SQL alter table orders add PARTITION Q1_2008 VALUES LESS THAN (TO_DATE('01-APR-2008','DD-MON-YYYY')), PARTITION Q2_2008 VALUES LESS THAN (TO_DATE('01-JUL-2008','DD-MON-YYYY')), PARTITION Q3_2008 VALUES LESS THAN (TO_DATE('01-OCT-2008','DD-MON-YYYY')), PARTITION Q4_2008 VALUES LESS THAN (TO_DATE('01-JAN-2009','DD-MON-YYYY')); Table altered. Now look at DBA_TAB_PARTITIONS and we can see that the 4 new partitions have been added to both tables – SQL select table_name,partition_name, partition_position position, high_value from dba_tab_partitions where table_owner='SH' and table_name like 'ORDER_%' order by partition_position, table_name; TABLE_NAME PARTITION_NAME POSITION HIGH_VALUE -------------- --------------- -------- ------------------------- ORDERS Q1_2007 1 TIMESTAMP' 2007-04-01 00:00:00' ORDER_ITEMS Q1_2007 1 ORDERS Q2_2007 2 TIMESTAMP' 2007-07-01 00:00:00' ORDER_ITEMS Q2_2007 2 ORDERS Q3_2007 3 TIMESTAMP' 2007-10-01 00:00:00' ORDER_ITEMS Q3_2007 3 ORDERS Q4_2007 4 TIMESTAMP' 2008-01-01 00:00:00' ORDER_ITEMS Q4_2007 4 ORDERS Q1_2008 5 TIMESTAMP' 2008-04-01 00:00:00' ORDER_ITEMS Q1_2008 5 ORDERS Q2_2008 6 TIMESTAMP' 2008-07-01 00:00:00' ORDER_ITEM Q2_2008 6 ORDERS Q3_2008 7 TIMESTAMP' 2008-10-01 00:00:00' ORDER_ITEMS Q3_2008 7 ORDERS Q4_2008 8 TIMESTAMP' 2009-01-01 00:00:00' ORDER_ITEMS Q4_2008 8 Next, we can drop or truncate multiple partitions by giving a comma separated list in the alter table command. Note the use of the plural ‘partitions’ in the command as opposed to the singular ‘partition’ prior to 12c– SQL alter table orders drop partitions Q3_2008,Q2_2008,Q1_2008; Table altered. Now look at DBA_TAB_PARTITIONS and we can see that the 3 partitions have been dropped in both the two tables – TABLE_NAME PARTITION_NAME POSITION HIGH_VALUE -------------- --------------- -------- ------------------------- ORDERS Q1_2007 1 TIMESTAMP' 2007-04-01 00:00:00' ORDER_ITEMS Q1_2007 1 ORDERS Q2_2007 2 TIMESTAMP' 2007-07-01 00:00:00' ORDER_ITEMS Q2_2007 2 ORDERS Q3_2007 3 TIMESTAMP' 2007-10-01 00:00:00' ORDER_ITEMS Q3_2007 3 ORDERS Q4_2007 4 TIMESTAMP' 2008-01-01 00:00:00' ORDER_ITEMS Q4_2007 4 ORDERS Q4_2008 5 TIMESTAMP' 2009-01-01 00:00:00' ORDER_ITEMS Q4_2008 5 Now let us merge all the 2007 partitions together to form one single partition – SQL alter table orders merge partitions Q1_2005, Q2_2005, Q3_2005, Q4_2005 into partition Y_2007; Table altered. TABLE_NAME PARTITION_NAME POSITION HIGH_VALUE -------------- --------------- -------- ------------------------- ORDERS Y_2007 1 TIMESTAMP' 2008-01-01 00:00:00' ORDER_ITEMS Y_2007 1 ORDERS Q4_2008 2 TIMESTAMP' 2009-01-01 00:00:00' ORDER_ITEMS Q4_2008 2 Splitting partitions is a slightly more involved. In the case of range partitioning one of the new partitions must have no high value defined, and in list partitioning one of the new partitions must have no list of values defined. I call these partitions the ‘everything else’ partitions, and will contain any rows contained in the original partition that are not contained in the any of the other new partitions. For example, let us split the Y_2007 partition back into 4 quarterly partitions – SQL alter table orders split partition Y_2007 into (PARTITION Q1_2007 VALUES LESS THAN (TO_DATE('01-APR-2007','DD-MON-YYYY')), PARTITION Q2_2007 VALUES LESS THAN (TO_DATE('01-JUL-2007','DD-MON-YYYY')), PARTITION Q3_2007 VALUES LESS THAN (TO_DATE('01-OCT-2007','DD-MON-YYYY')), PARTITION Q4_2007); Now look at DBA_TAB_PARTITIONS to get details of the new partitions – TABLE_NAME PARTITION_NAME POSITION HIGH_VALUE -------------- --------------- -------- ------------------------- ORDERS Q1_2007 1 TIMESTAMP' 2007-04-01 00:00:00' ORDER_ITEMS Q1_2007 1 ORDERS Q2_2007 2 TIMESTAMP' 2007-07-01 00:00:00' ORDER_ITEMS Q2_2007 2 ORDERS Q3_2007 3 TIMESTAMP' 2007-10-01 00:00:00' ORDER_ITEMS Q3_2007 3 ORDERS Q4_2007 4 TIMESTAMP' 2008-01-01 00:00:00' ORDER_ITEMS Q4_2007 4 ORDERS Q4_2008 5 TIMESTAMP' 2009-01-01 00:00:00' ORDER_ITEMS Q4_2008 5 Partition Q4_2007 has a high value equal to the high value of the original Y_2007 partition, and so has inherited its upper boundary from the partition that was split. As for a list partitioning example let look at the following another table, SALES_PAR_LIST, which has 2 partitions, Americas and Europe and a partitioning key of country_name. SQL select table_name,partition_name, high_value from dba_tab_partitions where table_owner='SH' and table_name = 'SALES_PAR_LIST'; TABLE_NAME PARTITION_NAME HIGH_VALUE -------------- --------------- ----------------------------- SALES_PAR_LIST AMERICAS 'Argentina', 'Canada', 'Peru', 'USA', 'Honduras', 'Brazil', 'Nicaragua' SALES_PAR_LIST EUROPE 'France', 'Spain', 'Ireland', 'Germany', 'Belgium', 'Portugal', 'Denmark' Now split the Americas partition into 3 partitions – SQL alter table sales_par_list split partition americas into (partition south_america values ('Argentina','Peru','Brazil'), partition north_america values('Canada','USA'), partition central_america); Table altered. Note that no list of values was given for the ‘Central America’ partition. However it should have inherited any values in the original ‘Americas’ partition that were not assigned to either the ‘North America’ or ‘South America’ partitions. We can confirm this by looking at the DBA_TAB_PARTITIONS view. SQL select table_name,partition_name, high_value from dba_tab_partitions where table_owner='SH' and table_name = 'SALES_PAR_LIST'; TABLE_NAME PARTITION_NAME HIGH_VALUE --------------- --------------- -------------------------------- SALES_PAR_LIST SOUTH_AMERICA 'Argentina', 'Peru', 'Brazil' SALES_PAR_LIST NORTH_AMERICA 'Canada', 'USA' SALES_PAR_LIST CENTRAL_AMERICA 'Honduras', 'Nicaragua' SALES_PAR_LIST EUROPE 'France', 'Spain', 'Ireland', 'Germany', 'Belgium', 'Portugal', 'Denmark' In conclusion, I hope that DBA’s whose work involves maintaining partitions will find the operations a bit more straight forward to carry out once they have upgraded to Oracle Database 12c. Gwen Lazenby is a Principal Training Consultant at Oracle. She is part of Oracle University's Core Technology delivery team based in the UK, teaching Database Administration and Linux courses. Her specialist topics include using Oracle Partitioning and Parallelism in Data Warehouse environments, as well as Oracle Spatial and RMAN.

Read the article
Oracle Tutor: Top 10 to Implement Sustainable Policies and Procedures

- by emily.chorba(at)oracle.com

Overview Your organization (executives, managers, and employees) understands the value of having written business process documents (process maps, procedures, instructions, reference documents, and form abstracts). Policies and procedures should be documented because they help to reduce the range of individual decisions and encourage management by exception: the manager only needs to give special attention to unusual problems, not covered by a specific policy or procedure. As more and more procedures are written to cover recurring situations, managers will begin to make decisions which will be consistent from one functional area to the next.Companies should take a project management approach when implementing an environment for a sustainable documentation program and do the following:1. Identify an Executive Champion2. Put together a winning team3. Assign ownership4. Centralize publishing5. Establish the Document Maintenance Process Up Front6. Document critical activities only7. Document actual practice8. Minimize documentation9. Support continuous improvement10. Keep it simple 1. Identify an Executive ChampionAppoint a top down driver. Select one key individual to be a mentor for the procedure planning team. The individual should be a senior manager, such as your company president, CIO, CFO, the vice-president of quality, manufacturing, or engineering. Written policies and procedures can be important supportive aids when known to express the thinking for the chief executive officer and / or the president and to have his or her full support. 2. Put Together a Winning TeamChoose a strong Project Management Leader and staff the procedure planning team with management members from cross functional groups. Make sure team members have the responsibility - and the authority - to make things happen.The winning team should consist of the Documentation Project Manager, Document Owners (one for each functional area), a Document Controller, and Document Specialists (as needed). The Tutor Implementation Guide has complete job descriptions for these roles. 3. Assign Ownership It is virtually impossible to keep process documentation simple and meaningful if employees who are far removed from the activity itself create it. It is impossible to keep documentation up-to-date when responsibility for the document is not clearly understood.Key to the Tutor methodology, therefore, is the concept of ownership. Each document has a single owner, who is responsible for ensuring that the document is necessary and that it reflects actual practice. The owner must be a person who is knowledgeable about the activity and who has the authority to build consensus among the persons who participate in the activity as well as the authority to define or change the way an activity is performed. The owner must be an advocate of the performers and negotiate, not dictate practices.In the Tutor environment, a document's owner is the only person with the authority to approve an update to that document. 4. Centralize Publishing Although it is tempting (especially in a networked environment and with document management software solutions) to decentralize the control of all documents -- with each owner updating and distributing his own -- Tutor promotes centralized publishing by assigning the Document Administrator (gate keeper) to manage the updates and distribution of the procedures library. 5. Establish a Document Maintenance Process Up Front (and stick to it) Everyone in your organization should know they are invited to suggest changes to procedures and should understand exactly what steps to take to do so. Tutor provides a set of procedures to help your company set up a healthy document control system. There are many document management products available to automate some of the document change and maintenance steps. Depending on the size of your organization, a simple document management system can reduce the effort it takes to track and distribute document changes and updates. Whether your company decides to store the written policies and procedures on a file server or in a database, the essential tasks for maintaining documents are the same, though some tasks are automated. 6. Document Critical Activities Only The best way to keep your documentation simple is to reduce the number of process documents to a bare minimum and to include in those documents only as much detail as is absolutely necessary. The first step to reducing process documentation is to document only those activities that are deemed critical. Not all activities require documentation. In fact, some critical activities cannot and should not be standardized. Others may be sufficiently documented with an instruction or a checklist and may not require a procedure. A document should only be created when it enhances the performance of the employee performing the activity. If it does not help the employee, then there is no reason to maintain the document. Activities that represent little risk (such as project status), activities that cannot be defined in terms of specific tasks (such as product research), and activities that can be performed in a variety of ways (such as advertising) often do not require documentation. Sometimes, an activity will evolve to the point where documentation is necessary. For example, an activity performed by single employee may be straightforward and uncomplicated -- that is, until the activity is performed by multiple employees. Sometimes, it is the interaction between co-workers that necessitates documentation; sometimes, it is the complexity or the diversity of the activity.7. Document Actual Practices The only reason to maintain process documentation is to enhance the performance of the employee performing the activity. And documentation can only enhance performance if it reflects reality -- that is, current best practice. Documentation that reflects an unattainable ideal or outdated practices will end up on the shelf, unused and forgotten.Documenting actual practice means (1) auditing the activity to understand how the work is really performed, (2) identifying best practices with employees who are involved in the activity, (3) building consensus so that everyone agrees on a common method, and (4) recording that consensus.8. Minimize Documentation One way to keep it simple is to document at the highest level possible. That is, include in your documents only as much detail as is absolutely necessary.When writing a document, you should ask yourself, What is the purpose of this document? That is, what problem will it solve?By focusing on this question, you can target the critical information.• What questions are the end users likely to have?• What level of detail is required?• Is any of this information extraneous to the document's purpose? Short, concise documents are user friendly and they are easier to keep up to date. 9. Support Continuous Improvement Employees who perform an activity are often in the best position to identify improvements to the process. In other words, continuous improvement is a natural byproduct of the work itself -- but only if the improvements are communicated to all employees who are involved in the process, and only if there is consensus among those employees.Traditionally, process documentation has been used to dictate performance, to limit employees' actions. In the Tutor environment, process documents are used to communicate improvements identified by employees. How does this work? The Tutor methodology requires a process document to reflect actual practice, so the owner of a document must routinely audit its content -- does the document match what the employees are doing? If it doesn't, the owner has the responsibility to evaluate the process, to build consensus among the employees, to identify "best practices," and to communicate these improvements via a document update. Continuous improvement can also be an outgrowth of corrective action -- but only if the solutions to problems are communicated effectively. The goal should be to solve a problem once and only once, which means not only identifying the solution, but ensuring that the solution becomes part of the process. The Tutor system provides the method through which improvements and solutions are documented and communicated to all affected employees in a cost-effective, timely manner; it ensures that improvements are not lost or confined to a single employee. 10. Keep it Simple Process documents don't have to be complex and unfriendly. In fact, the simpler the format and organization, the more likely the documents will be used. And the simpler the method of maintenance, the more likely the documents will be kept up-to-date. Keep it simply by:• Minimizing skills and training required• Following the established Tutor document format and layout• Avoiding technology just for technology's sake No other rule has as major an impact on the success of your internal documentation as -- keep it simple. Learn More For more information about Tutor, visit Oracle.Com or the Tutor Blog. Post your questions at the Tutor Forum. Emily Chorba Principle Product Manager Oracle Tutor & BPM

Read the article
Setting useLegacyV2RuntimeActivationPolicy At Runtime

- by Reed

Version 4.0 of the .NET Framework included a new CLR which is almost entirely backwards compatible with the 2.0 version of the CLR. However, by default, mixed-mode assemblies targeting .NET 3.5sp1 and earlier will fail to load in a .NET 4 application. Fixing this requires setting useLegacyV2RuntimeActivationPolicy in your app.Config for the application. While there are many good reasons for this decision, there are times when this is extremely frustrating, especially when writing a library. As such, there are (rare) times when it would be beneficial to set this in code, at runtime, as well as verify that it’s running correctly prior to receiving a FileLoadException. Typically, loading a pre-.NET 4 mixed mode assembly is handled simply by changing your app.Config file, and including the relevant attribute in the startup element: <?xml version="1.0" encoding="utf-8" ?> <configuration> <startup useLegacyV2RuntimeActivationPolicy="true"> <supportedRuntime version="v4.0"/> </startup> </configuration> .csharpcode { background-color: #ffffff; font-family: consolas, "Courier New", courier, monospace; color: black; font-size: small } .csharpcode pre { background-color: #ffffff; font-family: consolas, "Courier New", courier, monospace; color: black; font-size: small } .csharpcode pre { margin: 0em } .csharpcode .rem { color: #008000 } .csharpcode .kwrd { color: #0000ff } .csharpcode .str { color: #006080 } .csharpcode .op { color: #0000c0 } .csharpcode .preproc { color: #cc6633 } .csharpcode .asp { background-color: #ffff00 } .csharpcode .html { color: #800000 } .csharpcode .attr { color: #ff0000 } .csharpcode .alt { background-color: #f4f4f4; margin: 0em; width: 100% } .csharpcode .lnum { color: #606060 } This causes your application to run correctly, and load the older, mixed-mode assembly without issues. For full details on what’s happening here and why, I recommend reading Mark Miller’s detailed explanation of this attribute and the reasoning behind it. Before I show any code, let me say: I strongly recommend using the official approach of using app.config to set this policy. That being said, there are (rare) times when, for one reason or another, changing the application configuration file is less than ideal. While this is the supported approach to handling this issue, the CLR Hosting API includes a means of setting this programmatically via the ICLRRuntimeInfo interface. Normally, this is used if you’re hosting the CLR in a native application in order to set this, at runtime, prior to loading the assemblies. However, the F# Samples include a nice trick showing how to load this API and bind this policy, at runtime. This was required in order to host the Managed DirectX API, which is built against an older version of the CLR. This is fairly easy to port to C#. Instead of a direct port, I also added a little addition – by trapping the COM exception received if unable to bind (which will occur if the 2.0 CLR is already bound), I also allow a runtime check of whether this property was setup properly: public static class RuntimePolicyHelper { public static bool LegacyV2RuntimeEnabledSuccessfully { get; private set; } static RuntimePolicyHelper() { ICLRRuntimeInfo clrRuntimeInfo = (ICLRRuntimeInfo)RuntimeEnvironment.GetRuntimeInterfaceAsObject( Guid.Empty, typeof(ICLRRuntimeInfo).GUID); try { clrRuntimeInfo.BindAsLegacyV2Runtime(); LegacyV2RuntimeEnabledSuccessfully = true; } catch (COMException) { // This occurs with an HRESULT meaning // "A different runtime was already bound to the legacy CLR version 2 activation policy." LegacyV2RuntimeEnabledSuccessfully = false; } } [ComImport] [InterfaceType(ComInterfaceType.InterfaceIsIUnknown)] [Guid("BD39D1D2-BA2F-486A-89B0-B4B0CB466891")] private interface ICLRRuntimeInfo { void xGetVersionString(); void xGetRuntimeDirectory(); void xIsLoaded(); void xIsLoadable(); void xLoadErrorString(); void xLoadLibrary(); void xGetProcAddress(); void xGetInterface(); void xSetDefaultStartupFlags(); void xGetDefaultStartupFlags(); [MethodImpl(MethodImplOptions.InternalCall, MethodCodeType = MethodCodeType.Runtime)] void BindAsLegacyV2Runtime(); } } Using this, it’s possible to not only set this at runtime, but also verify, prior to loading your mixed mode assembly, whether this will succeed. In my case, this was quite useful – I am working on a library purely for internal use which uses a numerical package that is supplied with both a completely managed as well as a native solver. The native solver uses a CLR 2 mixed-mode assembly, but is dramatically faster than the pure managed approach. By checking RuntimePolicyHelper.LegacyV2RuntimeEnabledSuccessfully at runtime, I can decide whether to enable the native solver, and only do so if I successfully bound this policy. There are some tricks required here – To enable this sort of fallback behavior, you must make these checks in a type that doesn’t cause the mixed mode assembly to be loaded. In my case, this forced me to encapsulate the library I was using entirely in a separate class, perform the check, then pass through the required calls to that class. Otherwise, the library will load before the hosting process gets enabled, which in turn will fail. This code will also, of course, try to enable the runtime policy before the first time you use this class – which typically means just before the first time you check the boolean value. As a result, checking this early on in the application is more likely to allow it to work. Finally, if you’re using a library, this has to be called prior to the 2.0 CLR loading. This will cause it to fail if you try to use it to enable this policy in a plugin for most third party applications that don’t have their app.config setup properly, as they will likely have already loaded the 2.0 runtime. As an example, take a simple audio player. The code below shows how this can be used to properly, at runtime, only use the “native” API if this will succeed, and fallback (or raise a nicer exception) if this will fail: public class AudioPlayer { private IAudioEngine audioEngine; public AudioPlayer() { if (RuntimePolicyHelper.LegacyV2RuntimeEnabledSuccessfully) { // This will load a CLR 2 mixed mode assembly this.audioEngine = new AudioEngineNative(); } else { this.audioEngine = new AudioEngineManaged(); } } public void Play(string filename) { this.audioEngine.Play(filename); } } Now – the warning: This approach works, but I would be very hesitant to use it in public facing production code, especially for anything other than initializing your own application. While this should work in a library, using it has a very nasty side effect: you change the runtime policy of the executing application in a way that is very hidden and non-obvious.

Read the article
Network communications mechanisms for SQL Server

- by Akshay Deep Lamba

Problem I am trying to understand how SQL Server communicates on the network, because I'm having to tell my networking team what ports to open up on the firewall for an edge web server to communicate back to the SQL Server on the inside. What do I need to know? Solution In order to understand what needs to be opened where, let's first talk briefly about the two main protocols that are in common use today: TCP - Transmission Control Protocol UDP - User Datagram Protocol Both are part of the TCP/IP suite of protocols. We'll start with TCP. TCP TCP is the main protocol by which clients communicate with SQL Server. Actually, it is more correct to say that clients and SQL Server use Tabular Data Stream (TDS), but TDS actually sits on top of TCP and when we're talking about Windows and firewalls and other networking devices, that's the protocol that rules and controls are built around. So we'll just speak in terms of TCP. TCP is a connection-oriented protocol. What that means is that the two systems negotiate the connection and both agree to it. Think of it like a phone call. While one person initiates the phone call, the other person has to agree to take it and both people can end the phone call at any time. TCP is the same way. Both systems have to agree to the communications, but either side can end it at any time. In addition, there is functionality built into TCP to ensure that all communications can be disassembled and reassembled as necessary so it can pass over various network devices and be put together again properly in the right order. It also has mechanisms to handle and retransmit lost communications. Because of this functionality, TCP is the protocol used by many different network applications. The way the applications all can share is through the use of ports. When a service, like SQL Server, comes up on a system, it must listen on a port. For a default SQL Server instance, the default port is 1433. Clients connect to the port via the TCP protocol, the connection is negotiated and agreed to, and then the two sides can transfer information as needed until either side decides to end the communication. In actuality, both sides will have a port to use for the communications, but since the client's port is typically determined semi-randomly, when we're talking about firewalls and the like, typically we're interested in the port the server or service is using. UDP UDP, unlike TCP, is not connection oriented. A "client" can send a UDP communications to anyone it wants. There's nothing in place to negotiate a communications connection, there's nothing in the protocol itself to coordinate order of communications or anything like that. If that's needed, it's got to be handled by the application or by a protocol built on top of UDP being used by the application. If you think of TCP as a phone call, think of UDP as a postcard. I can put a postcard in the mail to anyone I want, and so long as it is addressed properly and has a stamp on it, the postal service will pick it up. Now, what happens it afterwards is not guaranteed. There's no mechanism for retransmission of lost communications. It's great for short communications that doesn't necessarily need an acknowledgement. Because multiple network applications could be communicating via UDP, it uses ports, just like TCP. The SQL Browser or the SQL Server Listener Service uses UDP. Network Communications - Talking to SQL Server When an instance of SQL Server is set up, what TCP port it listens on depends. A default instance will be set up to listen on port 1433. A named instance will be set to a random port chosen during installation. In addition, a named instance will be configured to allow it to change that port dynamically. What this means is that when a named instance starts up, if it finds something already using the port it normally uses, it'll pick a new port. If you have a named instance, and you have connections coming across a firewall, you're going to want to use SQL Server Configuration Manager to set a static port. This will allow the networking and security folks to configure their devices for maximum protection. While you can change the network port for a default instance of SQL Server, most people don't. Network Communications - Finding a SQL Server When just the name is specified for a client to connect to SQL Server, for instance, MySQLServer, this is an attempt to connect to the default instance. In this case the client will automatically attempt to communicate to port 1433 on MySQLServer. If you've switched the port for the default instance, you'll need to tell the client the proper port, usually by specifying the following syntax in the connection string: <server>,<port>. For instance, if you moved SQL Server to listen on 14330, you'd use MySQLServer,14330 instead of just MySQLServer. However, because a named instance sets up its port dynamically by default, the client never knows at the outset what the port is it should talk to. That's what the SQL Browser or the SQL Server Listener Service (SQL Server 2000) is for. In this case, the client sends a communication via the UDP protocol to port 1434. It asks, "Where is the named instance?" So if I was running a named instance called SQL2008R2, it would be asking the SQL Browser, "Hey, how do I talk to MySQLServer\SQL2008R2?" The SQL Browser would then send back a communications from UDP port 1434 back to the client telling the client how to talk to the named instance. Of course, you can skip all of this of you set that named instance's port statically. Then you can use the <server>,<port> mechanism to connect and the client won't try to talk to the SQL Browser service. It'll simply try to make the connection. So, for instance, is the SQL2008R2 instance was listening on port 20080, specifying MySQLServer,20080 would attempt a connection to the named instance. Network Communications - Named Pipes Named pipes is an older network library communications mechanism and it's generally not used any longer. It shouldn't be used across a firewall. However, if for some reason you need to connect to SQL Server with it, this protocol also sits on top of TCP. Named Pipes is actually used by the operating system and it has its own mechanism within the protocol to determine where to route communications. As far as network communications is concerned, it listens on TCP port 445. This is true whether we're talking about a default or named instance of SQL Server. The Summary Table To put all this together, here is what you need to know: Type of Communication Protocol Used Default Port Finding a SQL Server or SQL Server Named Instance UDP 1434 Communicating with a default instance of SQL Server TCP 1433 Communicating with a named instance of SQL Server TCP * Determined dynamically at start up Communicating with SQL Server via Named Pipes TCP 445

Read the article
How-to delete a tree node using the context menu

- by frank.nimphius

Hierarchical trees in Oracle ADF make use of View Accessors, which means that only the top level node needs to be exposed as a View Object instance on the ADF Business Components Data Model. This also means that only the top level node has a representation in the PageDef file as a tree binding and iterator binding reference. Detail nodes are accessed through tree rule definitions that use the accessor mentioned above (or nested collections in the case of POJO or EJB business services). The tree component is configured for single node selection, which however can be declaratively changed for users to press the ctrl key and selecting multiple nodes. In the following, I explain how to create a context menu on the tree for users to delete the selected tree nodes. For this, the context menu item will access a managed bean, which then determines the selected node(s), the internal ADF node bindings and the rows they represent. As mentioned, the ADF Business Components Data Model only needs to expose the top level node data sources, which in this example is an instance of the Locations View Object. For the tree to work, you need to have associations defined between entities, which usually is done for you by Oracle JDeveloper if the database tables have foreign keys defined Note: As a general hint of best practices and to simplify your life: Make sure your database schema is well defined and designed before starting your development project. Don't treat the database as something organic that grows and changes with the requirements as you proceed in your project. Business service refactoring in response to database changes is possible, but should be treated as an exception, not the rule. Good database design is a necessity – even for application developers – and nothing evil. To create the tree component, expand the Data Controls panel and drag the View Object collection to the view. From the context menu, select the tree component entry and continue with defining the tree rules that make up the hierarchical structure. As you see, when pressing the green plus icon in the Edit Tree Binding dialog, the data structure, Locations - Departments – Employees in my sample, shows without you having created a View Object instance for each of the nodes in the ADF Business Components Data Model. After you configured the tree structure in the Edit Tree Binding dialog, you press OK and the tree is created. Select the tree in the page editor and open the Structure Window (ctrl+shift+S). In the Structure window, expand the tree node to access the conextMenu facet. Use the right mouse button to insert a Popup into the facet. Repeat the same steps to insert a Menu and a Menu Item into the Popup you created. The Menu item text should be changed to something meaningful like "Delete". Note that the custom menu item later is added to the context menu together with the default context menu options like expand and expand all. To define the action that is executed when the menu item is clicked on, you select the Action Listener property in the Property Inspector and click the arrow icon followed by the Edit menu option. Create or select a managed bean and define a method name for the action handler. Next, select the tree component and browse to its binding property in the Property Inspector. Again, use the arrow icon | Edit option to create a component binding in the same managed bean that has the action listener defined. The tree handle is used in the action listener code, which is shown below: public void onTreeNodeDelete(ActionEvent actionEvent) { //access the tree from the JSF component reference created //using the af:tree "binding" property. The "binding" property //creates a pair of set/get methods to access the RichTree instance RichTree tree = this.getTreeHandler(); //get the list of selected row keys RowKeySet rks = tree.getSelectedRowKeys(); //access the iterator to loop over selected nodes Iterator rksIterator = rks.iterator(); //The CollectionModel represents the tree model and is //accessed from the tree "value" property CollectionModel model = (CollectionModel) tree.getValue(); //The CollectionModel is a wrapper for the ADF tree binding //class, which is JUCtrlHierBinding JUCtrlHierBinding treeBinding = (JUCtrlHierBinding) model.getWrappedData(); //loop over the selected nodes and delete the rows they //represent while(rksIterator.hasNext()){ List nodeKey = (List) rksIterator.next(); //find the ADF node binding using the node key JUCtrlHierNodeBinding node = treeBinding.findNodeByKeyPath(nodeKey); //delete the row. Row rw = node.getRow(); rw.remove(); } //only refresh the tree if tree nodes have been selected if(rks.size() > 0){ AdfFacesContext adfFacesContext = AdfFacesContext.getCurrentInstance(); adfFacesContext.addPartialTarget(tree); } } Note: To enable multi node selection for a tree, select the tree and change the row selection setting from "single" to "multiple". Note: a fully pictured version of this post will become available at the end of the month in a PDF summary on ADF Code Corner : http://www.oracle.com/technetwork/developer-tools/adf/learnmore/index-101235.html

Read the article
MVC Portable Area Modules *Without* MasterPages

- by Steve Michelotti

Portable Areas from MvcContrib provide a great way to build modular and composite applications on top of MVC. In short, portable areas provide a way to distribute MVC binary components as simple .NET assemblies where the aspx/ascx files are actually compiled into the assembly as embedded resources. I’ve blogged about Portable Areas in the past including this post here which talks about embedding resources and you can read more of an intro to Portable Areas here. As great as Portable Areas are, the question that seems to come up the most is: what about MasterPages? MasterPages seems to be the one thing that doesn’t work elegantly with portable areas because you specify the MasterPage in the @Page directive and it won’t use the same mechanism of the view engine so you can’t just embed them as resources. This means that you end up referencing a MasterPage that exists in the host application but not in your portable area. If you name the ContentPlaceHolderId’s correctly, it will work – but it all seems a little fragile. Ultimately, what I want is to be able to build a portable area as a module which has no knowledge of the host application. I want to be able to invoke the module by a full route on the user’s browser and it gets invoked and “automatically appears” inside the application’s visual chrome just like a MasterPage. So how could we accomplish this with portable areas? With this question in mind, I looked around at what other people are doing to address similar problems. Specifically, I immediately looked at how the Orchard team is handling this and I found it very compelling. Basically Orchard has its own custom layout/theme framework (utilizing a custom view engine) that allows you to build your module without any regard to the host. You simply decorate your controller with the [Themed] attribute and it will render with the outer chrome around it: 1: [Themed] 2: public class HomeController : Controller Here is the slide from the Orchard talk at this year MIX conference which shows how it conceptually works: It’s pretty cool stuff. So I figure, it must not be too difficult to incorporate this into the portable areas view engine as an optional piece of functionality. In fact, I’ll even simplify it a little – rather than have 1) Document.aspx, 2) Layout.ascx, and 3) <view>.ascx (as shown in the picture above); I’ll just have the outer page be “Chrome.aspx” and then the specific view in question. The Chrome.aspx not only takes the place of the MasterPage, but now since we’re no longer constrained by the MasterPage infrastructure, we have the choice of the Chrome.aspx living in the host or inside the portable areas as another embedded resource! Disclaimer: credit where credit is due – much of the code from this post is me re-purposing the Orchard code to suit my needs. To avoid confusion with Orchard, I’m going to refer to my implementation (which will be based on theirs) as a Chrome rather than a Theme. The first step I’ll take is to create a ChromedAttribute which adds a flag to the current HttpContext to indicate that the controller designated Chromed like this: 1: [Chromed] 2: public class HomeController : Controller The attribute itself is an MVC ActionFilter attribute: 1: public class ChromedAttribute : ActionFilterAttribute 2: { 3: public override void OnActionExecuting(ActionExecutingContext filterContext) 4: { 5: var chromedAttribute = GetChromedAttribute(filterContext.ActionDescriptor); 6: if (chromedAttribute != null) 7: { 8: filterContext.HttpContext.Items[typeof(ChromedAttribute)] = null; 9: } 10: } 11: 12: public static bool IsApplied(RequestContext context) 13: { 14: return context.HttpContext.Items.Contains(typeof(ChromedAttribute)); 15: } 16: 17: private static ChromedAttribute GetChromedAttribute(ActionDescriptor descriptor) 18: { 19: return descriptor.GetCustomAttributes(typeof(ChromedAttribute), true) 20: .Concat(descriptor.ControllerDescriptor.GetCustomAttributes(typeof(ChromedAttribute), true)) 21: .OfType<ChromedAttribute>() 22: .FirstOrDefault(); 23: } 24: } With that in place, we only have to override the FindView() method of the custom view engine with these 6 lines of code: 1: public override ViewEngineResult FindView(ControllerContext controllerContext, string viewName, string masterName, bool useCache) 2: { 3: if (ChromedAttribute.IsApplied(controllerContext.RequestContext)) 4: { 5: var bodyView = ViewEngines.Engines.FindPartialView(controllerContext, viewName); 6: var documentView = ViewEngines.Engines.FindPartialView(controllerContext, "Chrome"); 7: var chromeView = new ChromeView(bodyView, documentView); 8: return new ViewEngineResult(chromeView, this); 9: } 10: 11: // Just execute normally without applying Chromed View Engine 12: return base.FindView(controllerContext, viewName, masterName, useCache); 13: } If the view engine finds the [Chromed] attribute, it will invoke it’s own process – otherwise, it’ll just defer to the normal web forms view engine (with masterpages). The ChromeView’s primary job is to independently set the BodyContent on the view context so that it can be rendered at the appropriate place: 1: public class ChromeView : IView 2: { 3: private ViewEngineResult bodyView; 4: private ViewEngineResult documentView; 5: 6: public ChromeView(ViewEngineResult bodyView, ViewEngineResult documentView) 7: { 8: this.bodyView = bodyView; 9: this.documentView = documentView; 10: } 11: 12: public void Render(ViewContext viewContext, System.IO.TextWriter writer) 13: { 14: ChromeViewContext chromeViewContext = ChromeViewContext.From(viewContext); 15: 16: // First render the Body view to the BodyContent 17: using (var bodyViewWriter = new StringWriter()) 18: { 19: var bodyViewContext = new ViewContext(viewContext, bodyView.View, viewContext.ViewData, viewContext.TempData, bodyViewWriter); 20: this.bodyView.View.Render(bodyViewContext, bodyViewWriter); 21: chromeViewContext.BodyContent = bodyViewWriter.ToString(); 22: } 23: // Now render the Document view 24: this.documentView.View.Render(viewContext, writer); 25: } 26: } The ChromeViewContext (code excluded here) mainly just has a string property for the “BodyContent” – but it also makes sure to put itself in the HttpContext so it’s available. Finally, we created a little extension method so the module’s view can be rendered in the appropriate place: 1: public static void RenderBody(this HtmlHelper htmlHelper) 2: { 3: ChromeViewContext chromeViewContext = ChromeViewContext.From(htmlHelper.ViewContext); 4: htmlHelper.ViewContext.Writer.Write(chromeViewContext.BodyContent); 5: } At this point, the other thing left is to decide how we want to implement the Chrome.aspx page. One approach is the copy/paste the HTML from the typical Site.Master and change the main content placeholder to use the HTML helper above – this way, there are no MasterPages anywhere. Alternatively, we could even have Chrome.aspx utilize the MasterPage if we wanted (e.g., in the case where some pages are Chromed and some pages want to use traditional MasterPage): 1: <%@ Page Title="" Language="C#" MasterPageFile="~/Views/Shared/Site.Master" Inherits="System.Web.Mvc.ViewPage" %> 2: <asp:Content ID="Content2" ContentPlaceHolderID="MainContent" runat="server"> 3: <% Html.RenderBody(); %> 4: </asp:Content> At this point, it’s all academic. I can create a controller like this: 1: [Chromed] 2: public class WidgetController : Controller 3: { 4: public ActionResult Index() 5: { 6: return View(); 7: } 8: } Then I’ll just create Index.ascx (a partial view) and put in the text “Inside my widget”. Now when I run the app, I can request the full route (notice the controller name of “widget” in the address bar below) and the HTML from my Index.ascx will just appear where it is supposed to. This means no more warnings for missing MasterPages and no more need for your module to have knowledge of the host’s MasterPage placeholders. You have the option of using the Chrome.aspx in the host or providing your own while embedding it as an embedded resource itself. I’m curious to know what people think of this approach. The code above was done with my own local copy of MvcContrib so it’s not currently something you can download. At this point, these are just my initial thoughts – just incorporating some ideas for Orchard into non-Orchard apps to enable building modular/composite apps more easily. Additionally, on the flip side, I still believe that Portable Areas have potential as the module packaging story for Orchard itself. What do you think?

Read the article

Search Results

Search found 9983 results on 400 pages for 'fuzzy c means'.

Page 125/400 | < Previous Page | 121 122 123 124 125 126 127 128 129 130 131 132 | Next Page >

- by ScottGu

- by jorg

- by Jesse

- by Testas

- by ccasares

- by Mat Keep

- by Stefan Hinker

- by RobertChipperfield

- by pinaldave

- by Elton Stoneman

- by James Michael Hare

- by Elton Stoneman

- by hajan

- by Simon Cooper

- by terje

- by Pinal Dave

- by Frez

- by Rob Farley

- by CaNNaDaRk

- by hamsun

- by emily.chorba(at)oracle.com

- by Reed

- by Akshay Deep Lamba

- by frank.nimphius

- by Steve Michelotti

< Previous Page | 121 122 123 124 125 126 127 128 129 130 131 132 | Next Page >