Search Results

Search found 2372 results on 95 pages for 'identify'.

Page 37/95 | < Previous Page | 33 34 35 36 37 38 39 40 41 42 43 44 | Next Page >

Recover Deleted Files on an NTFS Hard Drive from a Ubuntu Live CD

- by Trevor Bekolay

Accidentally deleting a file is a terrible feeling. Not being able to boot into Windows and undelete that file makes that even worse. Fortunately, you can recover deleted files on NTFS hard drives from an Ubuntu Live CD. To show this process, we created four files on the desktop of a Windows XP machine, and then deleted them. We then booted up the same machine with the bootable Ubuntu 9.10 USB Flash Drive that we created last week. Once Ubuntu 9.10 boots up, open a terminal by clicking Applications in the top left of the screen, and then selecting Accessories > Terminal. To undelete our files, we first need to identify the hard drive that we want to undelete from. In the terminal window, type in: sudo fdisk –l and press enter. What you’re looking for is a line that ends with HPSF/NTFS (under the heading System). In our case, the device is “/dev/sda1”. This may be slightly different for you, but it will still begin with /dev/. Note this device name. If you have more than one hard drive partition formatted as NTFS, then you may be able to identify the correct partition by the size. If you look at the second line of text in the screenshot above, it reads “Disk /dev/sda: 136.4 GB, …” This means that the hard drive that Ubuntu has named /dev/sda is 136.4 GB large. If your hard drives are of different size, then this information can help you track down the right device name to use. Alternatively, you can just try them all, though this can be time consuming for large hard drives. Now that you know the name Ubuntu has assigned to your hard drive, we’ll scan it to see what files we can uncover. In the terminal window, type: sudo ntfsundelete <HD name> and hit enter. In our case, the command is: sudo ntfsundelete /dev/sda1 The names of files that can recovered show up in the far right column. The percentage in the third column tells us how much of that file can be recovered. Three of the four files that we originally deleted are showing up in this list, even though we shut down the computer right after deleting the four files – so even in ideal cases, your files may not be recoverable. Nevertheless, we have three files that we can recover – two JPGs and an MPG. Note: ntfsundelete is immediately available in the Ubuntu 9.10 Live CD. If you are in a different version of Ubuntu, or for some other reason get an error when trying to use ntfsundelete, you can install it by entering “sudo apt-get install ntfsprogs” in a terminal window. To quickly recover the two JPGs, we will use the * wildcard to recover all of the files that end with .jpg. In the terminal window, enter sudo ntfsundelete <HD name> –u –m *.jpg which is, in our case, sudo ntfsundelete /dev/sda1 –u –m *.jpg The two files are recovered from the NTFS hard drive and saved in the current working directory of the terminal. By default, this is the home directory of the current user, though we are working in the Desktop folder. Note that the ntfsundelete program does not make any changes to the original NTFS hard drive. If you want to take those files and put them back in the NTFS hard drive, you will have to move them there after they are undeleted with ntfsundelete. Of course, you can also put them on your flash drive or open Firefox and email them to yourself – the sky’s the limit! We have one more file to undelete – our MPG. Note the first column on the far left. It contains a number, its Inode. Think of this as the file’s unique identifier. Note this number. To undelete a file by its Inode, enter the following in the terminal: sudo ntfsundelete <HD name> –u –i <Inode> In our case, this is: sudo ntfsundelete /dev/sda1 –u –i 14159 This recovers the file, along with an identifier that we don’t really care about. All three of our recoverable files are now recovered. However, Ubuntu lets us know visually that we can’t use these files yet. That’s because the ntfsundelete program saves the files as the “root” user, not the “ubuntu” user. We can verify this by typing the following in our terminal window: ls –l We want these three files to be owned by ubuntu, not root. To do this, enter the following in the terminal window: sudo chown ubuntu <Files> If the current folder has other files in it, you may not want to change their owner to ubuntu. However, in our case, we only have these three files in this folder, so we will use the * wildcard to change the owner of all three files. sudo chown ubuntu * The files now look normal, and we can do whatever we want with them. Hopefully you won’t need to use this tip, but if you do, ntfsundelete is a nice command-line utility. It doesn’t have a fancy GUI like many of the similar Windows programs, but it is a powerful tool that can recover your files quickly. See ntfsundelete’s manual page for more detailed usage information Similar Articles Productive Geek Tips Reset Your Ubuntu Password Easily from the Live CDUse Ubuntu Live CD to Backup Files from Your Dead Windows ComputerCreate a Bootable Ubuntu 9.10 USB Flash DriveCreate a Bootable Ubuntu USB Flash Drive the Easy WayGuide to Using Check Disk in Windows Vista TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips Revo Uninstaller Pro Registry Mechanic 9 for Windows PC Tools Internet Security Suite 2010 PCmover Professional Windows 7 Easter Theme YoWindoW, a real time weather screensaver Optimize your computer the Microsoft way Stormpulse provides slick, real time weather data Geek Parents – Did you try Parental Controls in Windows 7? Change DNS servers on the fly with DNS Jumper

Read the article
Automating Form Login

- by Greg_Gutkin

Introduction A common task in configuring a web application for proxying in Pagelet Producer is setting up form autologin. PP provides a wizard-like tool for detecting the login form fields, but this is usually only the first step in configuring this feature. If the generated configuration doesn't seem to work, some additional manual modifications will be needed to complete the setup. This article will try to guide you through this process while steering you away from common pitfalls. For the purposes of this article, let's assume the following characteristics about your environment: Web Application Base URL: http://host/app (configured as Resource Source URL in PP) Pagelet Producer Base URL: http://pp/pagelets Form Field Auto-Detection Form Autologin is configured in the PP Admin UI under resource_name/Autologin/Form Login. First, you'll enter the URL to the login form under "Login Form Identification". This will enable the admin wizard to connect to and display the login page. Caution: RedirectsMake sure the entered URL matches what you see in the browser's address bar, when the application login page is displayed. For example, even though you may be able to reach the login page by simply typing http://host/app, the URL you end up on may change to http://host/app/login via browser redirect(s).The second URL is the one you will want to use. Caution: External Login ServersThe login page may actually come from a different server than the application you are trying to proxy. For example, you may notice that the login page URL changes to http://hostB/appB. This is common when external SSO products are involved. There are two ways of dealing with this situation. One is to configure Pagelet Producer to participate in SSO. This approach is out of scope of this article and is discussed in a separate whitepaper (TODO add link). The second approach is to use the autologin feature to provide stored credentials to the SSO login form. Since the login form URL is not an extension of the application base URL (PP resource URL), you will need to add a new PP resource for the SSO server and configure the login form on that resource instead of the original application resource. One side benefit of this additional resource is that it can reused for other applications relying on the same SSO server for login. After entering the login page URL (make sure dropdown says "URL"), click "Automatically Detect Form Fields". This will bring up the web app's login page in a new browser window. Fill it out and submit it as you would normally. If everything goes right, Pagelet Producer will intercept the submitted values and fill out all the needed configuration data in the Admin UI. If the login form window doesn't close or configuration data doesn't get filled in, you may have not entered the login page URL correctly. Review the two cautionary notes above and make any necessary changes. If the form fields got filled automatically, it's time to save the configuration and test it out. If you can access a protected area of the backend application via a proxied PP URL without filling out its login form, then you are pretty much done with login form configuration. The only other step you will need to complete before declaring this aspect of configuration production ready is configuring form field source. You may skip to that section below. Manual Login Form Identification Let's take a closer look at Login Form Identification. This determines how Pagelet Producer recognizes login forms as such. URL The most efficient way of detecting login forms is by looking at the page URL. This method can only be used under the following conditions: Login page URL must be different from the post login application URLs. Login page URL must stay constant regardless of the path it takes to reach the page. For example, reaching the login page by going to the application base URL or to a specific protected URL must result in a redirect to the same login page URL (query string excluded). If only the query string parameters change, just leave out the query string from the configured login page URL. If either of these conditions is not fullfilled, you must switch to the RegEx approach below. RegEx If the login page URL is not uniform enough across all scenarios or is indistinguishable from other page locations, PP can be configured to recognize it by looking at the page markup itself. This is accomplished by changing the dropdown to "RegEx". If regular expressions scare you, take comfort from the fact that in most cases you won't need to enter any special regex characters. Let's look at an example: Say you have a login form that looks like <form id='loginForm' action='login?from=pageA' > <input id='user'> <input id='pass'> </form> Since this form has an id attribute, you can be reasonably sure that this login form can be uniquely identified across the web application by this snippet: "id='loginForm'". (Unless, of course your backend web application contains login forms to other apps). Since no wildcards are needed to find this snippet, you can just enter it as is into the RegEx field - no special regular expression characters needed! If the web developer who created the form wasn't kind enough to provide a unique id, you will need to look for other snippets of the page to uniquely identify it. It could be the action URL, an input field id, or some other markup fragment. You should abstain from using UI text as an identifier it may change in translated versions of the page and prevent the login page logic from working for international users. You may need to turn to regular expression wildcard syntax if no simple matches work. For more information on regular expression, refer to the Resources section. Form Submit Location Now we'll look at the form submit location. If the captured URL contains query string parameters that will likely change from one form submission to the next, you will need to change its type to RegEx. This type will tell Pagelet Producer to parse the login page for the action URL and submit to the value found. The regular expression needs to point at the actual action URL with its first grouping expression. Taking the example form definition above, the form submit location regex would be: action='(.*?)' The parentheses are used to identify the actual action URL, while the rest of the expression provides the context for finding it. Expression .*? is a so-called reluctant wildcard that matches any character excluding the single quote that follows. See Resources section below for further information on regular expressions. Manual Form Field Detection If the Admin UI form field detection wizard fails to populate login form configuration page, you will have to enter the fields by hand. Use a built-in browser developer tool or addon (e.g. Firebug) to inspect the form element and its children input elements. For each input element (including hidden elements), create an entry under Form Fields. Change its Source according to the next section. Form Field Source Change the source of any of the fields not exposed to the users of the login form (i.e. hidden fields) to "Generated". This means Pagelet Producer will just use the values returned by the web app rather than supplying values it stored. For fields that contain sensitive data or vary from user to user (e.g. username & password), change the source to User (Credential) Vault. Logging Support To help you troubleshoot you autologin configuration, PP provides some useful logging support. To turn on detailed logging for the autologin feature, navigate to Settings in Admin UI. Under Logging, change the log level for AutoLogin to Finest. Known Limitations Autologin feature may not work as expected if login form fields (not just the values, but the DOM elements themselves) are generated dynamically by client side JavaScript. Resources RegEx RegEx Reference from Java RegEx Test Tool

Read the article
SQL SERVER – 5 Tips for Improving Your Data with expressor Studio

- by pinaldave

It’s no secret that bad data leads to bad decisions and poor results. However, how do you prevent dirty data from taking up residency in your data store? Some might argue that it’s the responsibility of the person sending you the data. While that may be true, in practice that will rarely hold up. It doesn’t matter how many times you ask, you will get the data however they decide to provide it. So now you have bad data. What constitutes bad data? There are quite a few valid answers, for example: Invalid date values Inappropriate characters Wrong data Values that exceed a pre-set threshold While it is certainly possible to write your own scripts and custom SQL to identify and deal with these data anomalies, that effort often takes too long and becomes difficult to maintain. Instead, leveraging an ETL tool like expressor Studio makes the data cleansing process much easier and faster. Below are some tips for leveraging expressor to get your data into tip-top shape. Tip 1: Build reusable data objects with embedded cleansing rules One of the new features in expressor Studio 3.2 is the ability to define constraints at the metadata level. Using expressor’s concept of Semantic Types, you can define reusable data objects that have embedded logic such as constraints for dealing with dirty data. Once defined, they can be saved as a shared atomic type and then re-applied to other data attributes in other schemas. As you can see in the figure above, I’ve defined a constraint on zip code. I can then save the constraint rules I defined for zip code as a shared atomic type called zip_type for example. The next time I get a different data source with a schema that also contains a zip code field, I can simply apply the shared atomic type (shown below) and the previously defined constraints will be automatically applied. Tip 2: Unlock the power of regular expressions in Semantic Types Another powerful feature introduced in expressor Studio 3.2 is the option to use regular expressions as a constraint. A regular expression is used to identify patterns within data. The patterns could be something as simple as a date format or something much more complex such as a street address. For example, I could define that a valid IP address should be made up of 4 numbers, each 0 to 255, and separated by a period. So 192.168.23.123 might be a valid IP address whereas 888.777.0.123 would not be. How can I account for this using regular expressions? A very simple regular expression that would look for any 4 sets of 3 digits separated by a period would be: ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ Alternatively, the following would be the exact check for truly valid IP addresses as we had defined above: ^(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])$ . In expressor, we would enter this regular expression as a constraint like this: Here we select the corrective action to be ‘Escalate’, meaning that the expressor Dataflow operator will decide what to do. Some of the options include rejecting the offending record, skipping it, or aborting the dataflow. Tip 3: Email pattern expressions that might come in handy In the example schema that I am using, there’s a field for email. Email addresses are often entered incorrectly because people are trying to avoid spam. While there are a lot of different ways to define what constitutes a valid email address, a quick search online yields a couple of really useful regular expressions for validating email addresses: This one is short and sweet: \b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b (Source: http://www.regular-expressions.info/) This one is more specific about which characters are allowed: ^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$ (Source: http://regexlib.com/REDetails.aspx?regexp_id=26 ) Tip 4: Reject “dirty data” for analysis or further processing Yet another feature introduced in expressor Studio 3.2 is the ability to reject records based on constraint violations. To capture reject records on input, simply specify Reject Record in the Error Handling setting for the Read File operator. Then attach a Write File operator to the reject port of the Read File operator as such: Next, in the Write File operator, you can configure the expressor operator in a similar way to the Read File. The key difference would be that the schema needs to be derived from the upstream operator as shown below: Once configured, expressor will output rejected records to the file you specified. In addition to the rejected records, expressor also captures some diagnostic information that will be helpful towards identifying why the record was rejected. This makes diagnosing errors much easier! Tip 5: Use a Filter or Transform after the initial cleansing to finish the job Sometimes you may want to predicate the data cleansing on a more complex set of conditions. For example, I may only be interested in processing data containing males over the age of 25 in certain zip codes. Using an expressor Filter operator, you can define the conditional logic which isolates the records of importance away from the others. Alternatively, the expressor Transform operator can be used to alter the input value via a user defined algorithm or transformation. It also supports the use of conditional logic and data can be rejected based on constraint violations. However, the best tip I can leave you with is to not constrain your solution design approach – expressor operators can be combined in many different ways to achieve the desired results. For example, in the expressor Dataflow below, I can post-process the reject data from the Filter which did not meet my pre-defined criteria and, if successful, Funnel it back into the flow so that it gets written to the target table. I continue to be impressed that expressor offers all this functionality as part of their FREE expressor Studio desktop ETL tool, which you can download from here. Their Studio ETL tool is absolutely free and they are very open about saying that if you want to deploy their software on a dedicated Windows Server, you need to purchase their server software, whose pricing is posted on their website. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
Using Find/Replace with regular expressions inside a SSIS package

- by jamiet

Another one of those might-be-useful-again-one-day-so-I’ll-share-it-in-a-blog-post blog posts I am currently working on a SQL Server Integration Services (SSIS) 2012 implementation where each package contains a parameter called ETLIfcHist_ID: During normal execution this will get altered when the package is executed from the Execute Package Task however we want to make sure that at deployment-time they all have a default value of –1. Of course, they tend to get changed during development so I wanted a way of easily changing them all back to the default value. Opening up each package in turn and editing them was an option but given that we have over 40 packages and we might want to carry out this reset fairly frequently I needed a more automated method so I turned to Visual Studio’s Find/Replace… feature Of course, we don’t know what value will be in that parameter so I can’t simply search for a particular value; hence I opted to use a regular expression to identify the value to be change. In the rest of this blog post I’ll explain how to do that. For demonstration purposes I have taken the contents of a .dtsx file and stripped out everything except the element containing the parameters (<DTS:PackageParameters>), if you want to play along at home you can copy-paste the XML document below into a new XML file and open it up in Visual Studio: <?xml version="1.0"?> <DTS:Executable xmlns:DTS="www.microsoft.com/SqlServer/Dts"> <DTS:PackageParameters> <DTS:PackageParameter DTS:CreationName="" DTS:DataType="3" DTS:Description="InterfaceHistory_ID: used for Lineage" DTS:DTSID="{635616DB-EEEE-45C8-89AA-713E25846C7E}" DTS:ObjectName="ETLIfcHist_ID"> <DTS:Property DTS:DataType="3" DTS:Name="ParameterValue">VALUE_TO_BE_CHANGED</DTS:Property> </DTS:PackageParameter> <DTS:PackageParameter DTS:CreationName="" DTS:DataType="3" DTS:Description="Some other description" DTS:DTSID="{635616DB-EEEE-45C8-89AA-713E25845C7E}" DTS:ObjectName="SomeOtherObjectName"> <DTS:Property DTS:DataType="3" DTS:Name="ParameterValue">SomeOtherValue</DTS:Property> </DTS:PackageParameter> </DTS:PackageParameters> </DTS:Executable> We are trying to identify the value of the parameter whose name is ETLIfcHist_ID – notice that in the XML document above that value is VALUE_TO_BE_CHANGED. The following regular expression will find the appropriate portion of the XML document: {\<DTS\:PackageParameter[\n ]*DTS\:CreationName="[A-Za-z0-9\:_\{\}- ]*"[\n ]*DTS\:DataType="[A-Za-z0-9\:_\{\}- ]*"[\n ]*DTS\:Description="[A-Za-z0-9\:_\{\}- ]*"[\n ]*DTS\:DTSID="[A-Za-z0-9\:_\{\}- ]*"[\n ]*DTS\:ObjectName="ETLIfcHist_ID"\>[\n ]*\<DTS\:Property[\n ]*DTS\:DataType="[A-Za-z0-9\:_\{\}- ]*"[\n ]*DTS\:Name="ParameterValue"\>}[A-Za-z0-9\:_\{\}- ]*{\<\/DTS\:Property\>} I have highlighted the name of the parameter that we’re looking for. I have also highlighted two portions identified by pairs of curly braces “{…}”; these are important because they pick out the two portions either side of the value I want to replace, in other words the portions highlighted here: <DTS:PackageParameters> <DTS:PackageParameter DTS:CreationName="" DTS:DataType="3" DTS:Description="InterfaceHistory_ID: used for Lineage" DTS:DTSID="{635616DB-EEEE-45C8-89AA-713E25846C7E}" DTS:ObjectName="ETLIfcHist_ID"> <DTS:Property DTS:DataType="3" DTS:Name="ParameterValue">VALUE_TO_BE_CHANGED</DTS:Property> </DTS:PackageParameter> Those sections in the curly braces are termed tag expressions and can be identified in the replace expression using a backslash and a number identifying which tag expression you’re referring to according to its ordinal position. Hence, our replace expression is simply: \1-1\2 We’re saying the portion of our file identified by the regular expression should be replaced by the first curly brace section, then the literal –1, then the second curly brace section. Make sense? Give it a go yourself by plugging those two expressions into Visual Studio’s Find and Replace dialog. If you set it to look in “All Open Documents” then you can open up the code-behind of all your packages and change all of them at once. The Find and Replace dialog will look like this: That’s it! I realise that not everyone will be looking to change the value of a parameter but hopefully I have shown you a technique that you can modify to work for your own scenario. Given that this blog post is, y’know, on the web I have no doubt that someone is going to find a fault with my find regex expression and if that person is you….that’s OK. Let me know about it in the comments below and perhaps we can work together to come up with something better! Note that some parameters may have a different set of properties (for example some, but not all, of my parameters have a DTS:Required attribute) so your find regular expression may have to change accordingly. When researching this I found the following article to be invaluable: Visual Studio Find/Replace Regular Expression Usage @Jamiet

Read the article
SQL SERVER – 3 Online SQL Courses at Pluralsight and Free Learning Resources

- by pinaldave

Usain Bolt is an inspiration for all. He broke his own record multiple times because he wanted to do better! Read more about him on wikipedia. He is great and indeed fastest man on the planet. Usain Bolt – World’s Fastest Man “Can you teach me SQL Server Performance Tuning?” This is one of the most popular questions which I receive all the time. The answer is YES. I would love to do performance tuning training for anyone, anywhere. It is my favorite thing to do, and it is my favorite thing to train others in. If possible, I would love to do training 24 hours a day, 7 days a week, 365 days a year. To me, it doesn’t feel like a job. Of course, as much as I would love to do performance tuning 24/7/365, obviously I am just one human being and can only be in one place t one time. It is also very difficult to train more than one person at a time, and it is difficult to train two or more people at a time, especially when the two people are at different levels. I am also limited by geography. I live in India, and adjust to my own time zone. Trying to teach a live course from India to someone whose time zone is 12 or more hours off of mine is very difficult. If I am trying to teach at 2 am, I am sure I am not at my best! There was only one solution to scale – Online Trainings. I have built 3 different courses on SQL Server Performance Tuning with Pluralsight. Now I have no problem – I am 100% scalable and available 24/7 and 365. You can make me say the same things again and again till you find it right. I am in your mobile, PC as well as on XBOX. This is why I am such a big fan of online courses. I have recorded many performance tuning classes and you can easily access them online, at your own time. And don’t think that just because these aren’t live classes you won’t be able to get any feedback from me. I encourage all my viewers to go ahead and ask me questions by e-mail, Twitter, Facebook, or whatever way you can get a hold of me. Here are details of three of my courses with Pluralsight. I suggest you go over the description of the course. As an author of the course, I have few FREE codes for watching the free courses. Please leave a comment with your valid email address, I will send a few of them to random winners. SQL Server Performance: Introduction to Query Tuning SQL Server performance tuning is an art to master – for developers and DBAs alike. This course takes a systematic approach to planning, analyzing, debugging and troubleshooting common query-related performance problems. This includes an introduction to understanding execution plans inside SQL Server. In this almost four hour course we cover following important concepts. Introduction 10:22 Execution Plan Basics 45:59 Essential Indexing Techniques 20:19 Query Design for Performance 50:16 Performance Tuning Tools 01:15:14 Tips and Tricks 25:53 Checklist: Performance Tuning 07:13 The duration of each module is mentioned besides the name of the module. SQL Server Performance: Indexing Basics This course teaches you how to master the art of performance tuning SQL Server by better understanding indexes. In this almost two hour course we cover following important concepts. Introduction 02:03 Fundamentals of Indexing 22:21 Practical Indexing Implementation Techniques 37:25 Index Maintenance 16:33 Introduction to ColumnstoreIndex 08:06 Indexing Practical Performance Tips and Tricks 24:56 Checklist : Index and Performance 07:29 The duration of each module is mentioned besides the name of the module. SQL Server Questions and Answers This course is designed to help you better understand how to use SQL Server effectively. The course presents many of the common misconceptions about SQL Server, and then carefully debunks those misconceptions with clear explanations and short but compelling demos, showing you how SQL Server really works. In this almost 2 hours and 15 minutes course we cover following important concepts. Introduction 00:54 Retrieving IDENTITY value using @@IDENTITY 08:38 Concepts Related to Identity Values 04:15 Difference between WHERE and HAVING 05:52 Order in WHERE clause 07:29 Concepts Around Temporary Tables and Table Variables 09:03 Are stored procedures pre-compiled? 05:09 UNIQUE INDEX and NULLs problem 06:40 DELETE VS TRUNCATE 06:07 Locks and Duration of Transactions 15:11 Nested Transaction and Rollback 09:16 Understanding Date/Time Datatypes 07:40 Differences between VARCHAR and NVARCHAR datatypes 06:38 Precedence of DENY and GRANT security permissions 05:29 Identify Blocking Process 06:37 NULLS usage with Dynamic SQL 08:03 Appendix Tips and Tricks with Tools 20:44 The duration of each module is mentioned besides the name of the module. SQL in Sixty Seconds You will have to login and to get subscribed to the courses to view them. Here are my free video learning resources SQL in Sixty Seconds. These are 60 second video which I have built on various subjects related to SQL Server. Do let me know what you think about them? Here are three of my latest videos: Identify Most Resource Intensive Queries – SQL in Sixty Seconds #028 Copy Column Headers from Resultset – SQL in Sixty Seconds #027 Effect of Collation on Resultset – SQL in Sixty Seconds #026 You can watch and learn at your own pace. Then you can easily ask me any questions you have. E-mail is easiest, but for really tough questions I’m willing to talk on Skype, Gtalk, or even Facebook chat. Please do watch and then talk with me, I am always available on the internet! Here is the video of the world’s fastest man.Usain St. Leo Bolt inspires us that we all do better than best. We can go the next level of our own record. We all can improve if we have a will and dedication. Watch the video from 5:00 mark. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL in Sixty Seconds, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, SQL Training, SQLServer, T SQL, Technology, Video

Read the article
Thread placement policies on NUMA systems - update

- by Dave

In a prior blog entry I noted that Solaris used a "maximum dispersal" placement policy to assign nascent threads to their initial processors. The general idea is that threads should be placed as far away from each other as possible in the resource topology in order to reduce resource contention between concurrently running threads. This policy assumes that resource contention -- pipelines, memory channel contention, destructive interference in the shared caches, etc -- will likely outweigh (a) any potential communication benefits we might achieve by packing our threads more densely onto a subset of the NUMA nodes, and (b) benefits of NUMA affinity between memory allocated by one thread and accessed by other threads. We want our threads spread widely over the system and not packed together. Conceptually, when placing a new thread, the kernel picks the least loaded node NUMA node (the node with lowest aggregate load average), and then the least loaded core on that node, etc. Furthermore, the kernel places threads onto resources -- sockets, cores, pipelines, etc -- without regard to the thread's process membership. That is, initial placement is process-agnostic. Keep reading, though. This description is incorrect. On Solaris 10 on a SPARC T5440 with 4 x T2+ NUMA nodes, if the system is otherwise unloaded and we launch a process that creates 20 compute-bound concurrent threads, then typically we'll see a perfect balance with 5 threads on each node. We see similar behavior on an 8-node x86 x4800 system, where each node has 8 cores and each core is 2-way hyperthreaded. So far so good; this behavior seems in agreement with the policy I described in the 1st paragraph. I recently tried the same experiment on a 4-node T4-4 running Solaris 11. Both the T5440 and T4-4 are 4-node systems that expose 256 logical thread contexts. To my surprise, all 20 threads were placed onto just one NUMA node while the other 3 nodes remained completely idle. I checked the usual suspects such as processor sets inadvertently left around by colleagues, processors left offline, and power management policies, but the system was configured normally. I then launched multiple concurrent instances of the process, and, interestingly, all the threads from the 1st process landed on one node, all the threads from the 2nd process landed on another node, and so on. This happened even if I interleaved thread creating between the processes, so I was relatively sure the effect didn't related to thread creation time, but rather that placement was a function of process membership. I this point I consulted the Solaris sources and talked with folks in the Solaris group. The new Solaris 11 behavior is intentional. The kernel is no longer using a simple maximum dispersal policy, and thread placement is process membership-aware. Now, even if other nodes are completely unloaded, the kernel will still try to pack new threads onto the home lgroup (socket) of the primordial thread until the load average of that node reaches 50%, after which it will pick the next least loaded node as the process's new favorite node for placement. On the T4-4 we have 64 logical thread contexts (strands) per socket (lgroup), so if we launch 48 concurrent threads we will find 32 placed on one node and 16 on some other node. If we launch 64 threads we'll find 32 and 32. That means we can end up with our threads clustered on a small subset of the nodes in a way that's quite different that what we've seen on Solaris 10. So we have a policy that allows process-aware packing but reverts to spreading threads onto other nodes if a node becomes too saturated. It turns out this policy was enabled in Solaris 10, but certain bugs suppressed the mixed packing/spreading behavior. There are configuration variables in /etc/system that allow us to dial the affinity between nascent threads and their primordial thread up and down: see lgrp_expand_proc_thresh, specifically. In the OpenSolaris source code the key routine is mpo_update_tunables(). This method reads the /etc/system variables and sets up some global variables that will subsequently be used by the dispatcher, which calls lgrp_choose() in lgrp.c to place nascent threads. Lgrp_expand_proc_thresh controls how loaded an lgroup must be before we'll consider homing a process's threads to another lgroup. Tune this value lower to have it spread your process's threads out more. To recap, the 'new' policy is as follows. Threads from the same process are packed onto a subset of the strands of a socket (50% for T-series). Once that socket reaches the 50% threshold the kernel then picks another preferred socket for that process. Threads from unrelated processes are spread across sockets. More precisely, different processes may have different preferred sockets (lgroups). Beware that I've simplified and elided details for the purposes of explication. The truth is in the code. Remarks: It's worth noting that initial thread placement is just that. If there's a gross imbalance between the load on different nodes then the kernel will migrate threads to achieve a better and more even distribution over the set of available nodes. Once a thread runs and gains some affinity for a node, however, it becomes "stickier" under the assumption that the thread has residual cache residency on that node, and that memory allocated by that thread resides on that node given the default "first-touch" page-level NUMA allocation policy. Exactly how the various policies interact and which have precedence under what circumstances could the topic of a future blog entry. The scheduler is work-conserving. The x4800 mentioned above is an interesting system. Each of the 8 sockets houses an Intel 7500-series processor. Each processor has 3 coherent QPI links and the system is arranged as a glueless 8-socket twisted ladder "mobius" topology. Nodes are either 1 or 2 hops distant over the QPI links. As an aside the mapping of logical CPUIDs to physical resources is rather interesting on Solaris/x4800. On SPARC/Solaris the CPUID layout is strictly geographic, with the highest order bits identifying the socket, the next lower bits identifying the core within that socket, following by the pipeline (if present) and finally the logical thread context ("strand") on the core. But on Solaris on the x4800 the CPUID layout is as follows. [6:6] identifies the hyperthread on a core; bits [5:3] identify the socket, or package in Intel terminology; bits [2:0] identify the core within a socket. Such low-level details should be of interest only if you're binding threads -- a bad idea, the kernel typically handles placement best -- or if you're writing NUMA-aware code that's aware of the ambient placement and makes decisions accordingly. Solaris introduced the so-called critical-threads mechanism, which is expressed by putting a thread into the FX scheduling class at priority 60. The critical-threads mechanism applies to placement on cores, not on sockets, however. That is, it's an intra-socket policy, not an inter-socket policy. Solaris 11 introduces the Power Aware Dispatcher (PAD) which packs threads instead of spreading them out in an attempt to be able to keep sockets or cores at lower power levels. Maximum dispersal may be good for performance but is anathema to power management. PAD is off by default, but power management polices constitute yet another confounding factor with respect to scheduling and dispatching. If your threads communicate heavily -- one thread reads cache lines last written by some other thread -- then the new dense packing policy may improve performance by reducing traffic on the coherent interconnect. On the other hand if your threads in your process communicate rarely, then it's possible the new packing policy might result on contention on shared computing resources. Unfortunately there's no simple litmus test that says whether packing or spreading is optimal in a given situation. The answer varies by system load, application, number of threads, and platform hardware characteristics. Currently we don't have the necessary tools and sensoria to decide at runtime, so we're reduced to an empirical approach where we run trials and try to decide on a placement policy. The situation is quite frustrating. Relatedly, it's often hard to determine just the right level of concurrency to optimize throughput. (Understanding constructive vs destructive interference in the shared caches would be a good start. We could augment the lines with a small tag field indicating which strand last installed or accessed a line. Given that, we could augment the CPU with performance counters for misses where a thread evicts a line it installed vs misses where a thread displaces a line installed by some other thread.)

Read the article
Writing the tests for FluentPath

- by Bertrand Le Roy

Writing the tests for FluentPath is a challenge. The library is a wrapper around a legacy API (System.IO) that wasn’t designed to be easily testable. If it were more testable, the sensible testing methodology would be to tell System.IO to act against a mock file system, which would enable me to verify that my code is doing the expected file system operations without having to manipulate the actual, physical file system: what we are testing here is FluentPath, not System.IO. Unfortunately, that is not an option as nothing in System.IO enables us to plug a mock file system in. As a consequence, we are left with few options. A few people have suggested me to abstract my calls to System.IO away so that I could tell FluentPath – not System.IO – to use a mock instead of the real thing. That in turn is getting a little silly: FluentPath already is a thin abstraction around System.IO, so layering another abstraction between them would double the test surface while bringing little or no value. I would have to test that new abstraction layer, and that would bring us back to square one. Unless I’m missing something, the only option I have here is to bite the bullet and test against the real file system. Of course, the tests that do that can hardly be called unit tests. They are more integration tests as they don’t only test bits of my code. They really test the successful integration of my code with the underlying System.IO. In order to write such tests, the techniques of BDD work particularly well as they enable you to express scenarios in natural language, from which test code is generated. Integration tests are being better expressed as scenarios orchestrating a few basic behaviors, so this is a nice fit. The Orchard team has been successfully using SpecFlow for integration tests for a while and I thought it was pretty cool so that’s what I decided to use. Consider for example the following scenario: Scenario: Change extension Given a clean test directory When I change the extension of bar\notes.txt to foo Then bar\notes.txt should not exist And bar\notes.foo should exist This is human readable and tells you everything you need to know about what you’re testing, but it is also executable code. What happens when SpecFlow compiles this scenario is that it executes a bunch of regular expressions that identify the known Given (set-up phases), When (actions) and Then (result assertions) to identify the code to run, which is then translated into calls into the appropriate methods. Nothing magical. Here is the code generated by SpecFlow: [NUnit.Framework.TestAttribute()] [NUnit.Framework.DescriptionAttribute("Change extension")] public virtual void ChangeExtension() { TechTalk.SpecFlow.ScenarioInfo scenarioInfo = new TechTalk.SpecFlow.ScenarioInfo("Change extension", ((string[])(null))); #line 6 this.ScenarioSetup(scenarioInfo); #line 7 testRunner.Given("a clean test directory"); #line 8 testRunner.When("I change the extension of " + "bar\\notes.txt to foo"); #line 9 testRunner.Then("bar\\notes.txt should not exist"); #line 10 testRunner.And("bar\\notes.foo should exist"); #line hidden testRunner.CollectScenarioErrors();} The #line directives are there to give clues to the debugger, because yes, you can put breakpoints into a scenario: The way you usually write tests with SpecFlow is that you write the scenario first, let it fail, then write the translation of your Given, When and Then into code if they don’t already exist, which results in running but failing tests, and then you write the code to make your tests pass (you implement the scenario). In the case of FluentPath, I built a simple Given method that builds a simple file hierarchy in a temporary directory that all scenarios are going to work with: [Given("a clean test directory")] public void GivenACleanDirectory() { _path = new Path(SystemIO.Path.GetTempPath()) .CreateSubDirectory("FluentPathSpecs") .MakeCurrent(); _path.GetFileSystemEntries() .Delete(true); _path.CreateFile("foo.txt", "This is a text file named foo."); var bar = _path.CreateSubDirectory("bar"); bar.CreateFile("baz.txt", "bar baz") .SetLastWriteTime(DateTime.Now.AddSeconds(-2)); bar.CreateFile("notes.txt", "This is a text file containing notes."); var barbar = bar.CreateSubDirectory("bar"); barbar.CreateFile("deep.txt", "Deep thoughts"); var sub = _path.CreateSubDirectory("sub"); sub.CreateSubDirectory("subsub"); sub.CreateFile("baz.txt", "sub baz") .SetLastWriteTime(DateTime.Now); sub.CreateFile("binary.bin", new byte[] {0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0xFF}); } Then, to implement the scenario that you can read above, I had to write the following When: [When("I change the extension of (.*) to (.*)")] public void WhenIChangeTheExtension( string path, string newExtension) { var oldPath = Path.Current.Combine(path.Split('\\')); oldPath.Move(p => p.ChangeExtension(newExtension)); } As you can see, the When attribute is specifying the regular expression that will enable the SpecFlow engine to recognize what When method to call and also how to map its parameters. For our scenario, “bar\notes.txt” will get mapped to the path parameter, and “foo” to the newExtension parameter. And of course, the code that verifies the assumptions of the scenario: [Then("(.*) should exist")] public void ThenEntryShouldExist(string path) { Assert.IsTrue(_path.Combine(path.Split('\\')).Exists); } [Then("(.*) should not exist")] public void ThenEntryShouldNotExist(string path) { Assert.IsFalse(_path.Combine(path.Split('\\')).Exists); } These steps should be written with reusability in mind. They are building blocks for your scenarios, not implementation of a specific scenario. Think small and fine-grained. In the case of the above steps, I could reuse each of those steps in other scenarios. Those tests are easy to write and easier to read, which means that they also constitute a form of documentation. Oh, and SpecFlow is just one way to do this. Rob wrote a long time ago about this sort of thing (but using a different framework) and I highly recommend this post if I somehow managed to pique your interest: http://blog.wekeroad.com/blog/make-bdd-your-bff-2/ And this screencast (Rob always makes excellent screencasts): http://blog.wekeroad.com/mvc-storefront/kona-3/ (click the “Download it here” link)

Read the article
How you can extend Tasklists in Fusion Applications

- by Elie Wazen

In this post we describe the process of modifying and extending a Tasklist available in the Regional Area of a Fusion Applications UI Shell. This is particularly useful to Customers who would like to expose Setup Tasks (generally available in the Fusion Setup Manager application) in the various functional pillars workareas. Oracle Composer, the tool used to implement such extensions allows changes to be made at runtime. The example provided in this document is for an Oracle Fusion Financials page. Let us examine the case of a customer role who requires access to both, a workarea and its associated functional tasks, and to an FSM (setup) task. Both of these tasks represent ADF Taskflows but each is accessible from a different page. We will show how an FSM task is added to a Functional tasklist and made accessible to a user from within a single workarea, eliminating the need to navigate between the FSM application and the Functional workarea where transactions are conducted. In general, tasks in Fusion Applications are grouped in two ways: Setup tasks are grouped in tasklists available to implementers in the Functional Setup Manager (FSM). These Tasks are accessed by implementation users and in general do not represent daily operational tasks that fit into a functional business process and were consequently included in the FSM application. For these tasks, the primary organizing principle is precedence between tasks. If task "Manage Suppliers" has prerequisites, those tasks must precede it in a tasklist. Task Lists are organized to efficiently implement an offering. Tasks frequently performed as part of business process flows are made available as links in the tasklist of their corresponding menu workarea. The primary organizing principle in the menu and task pane entries is to group tasks that are generally accessed together. Customizing a tasklist thus becomes required for business scenarios where a task packaged under FSM as a setup task, is for a particular customer a regular maintenance task that is accessed for record updates or creation as part of normal operational activities and where the frequency of this access merits the inclusion of that task in the related operational tasklist A user with the role of maintaining Journals in General Ledger is also responsible for maintaining Chart of Accounts Mappings. In the Fusion Financials Product Family, Manage Journals is a task available from within the Journals Menu whereas Chart of Accounts Mapping is available via FSM under the Define Chart of Accounts tasklist Figure 1. The Manage Chart of Accounts Mapping Task in FSM Figure 2. The Manage Journals Task in the Task Pane of the Journals Workarea Our goal is to simplify cross task navigation and allow the user to access both tasks from a single tasklist on a single page without having to navigate to FSM for the Mapping task and to the Journals workarea for the Manage task. To accomplish that, we use Oracle Composer to customize the Journals tasklist by adding to it the Mapping task. Identify the Taskflow name and path of the FSM Task The first step in our process is to identify the underlying taskflow for the Manage Chart of Accounts Mappings task. We select to Setup and Maintenance from the Navigator to launch the FSM Application, and we query the task from Manage Tasklists and Tasks Figure 3. Task Details including Taskflow path The Manage Chart of Accounts Mapping Task Taskflow is: /WEB-INF/oracle/apps/financials/generalLedger/sharedSetup/coaMappings/ui/flow /CoaMappingsMainAreaFlow.xml#CoaMappingsMainAreaFlow We copy that value and use it later as a parameter to our new task in the customized Journals Tasklist. Customize the Journals Page A user with Administration privileges can start the run time customization directly from the Administration Menu of the Global Area. This customization is done at the Site level and once implemented becomes available to all users with access to the Journals Workarea. Figure 4. Customization Menu The Oracle Composer Window is displayed in the same browser and the Hierarchy of the page component is displayed and available for modification. Figure 5. Oracle Composer In the composer Window select the PanelFormLayout node and click on the Edit Button. Note that the selected component is simultaneously highlighted in the lower pane in the browser. In the Properties popup window, select the Tasks List and Task Properties Tab, where the user finds the hierarchy of the Tasklist and is able to Edit nodes or create new ones. src="https://blogs.oracle.com/FunctionalArchitecture/resource/TL5.jpg" Figure 6. The Tasklist in edit mode Add a Child Task to the Tasklist In the Edit Window the user will now create a child node at the desired level in the hierarchy by selecting the immediate parent node and clicking on the insert node button. This process requires four values to be set as described in Table 1 below. Parameter Value How to Determine the Value Focus View Id /JournalEntryPage This is the Focus View ID of the UI Shell where the Tasklist we want to customize is. A simple way to determine this value is to copy it from any of the Standard tasks on the Tasklist Label COA Mapping This is the Display name of the Task as it will appear in the Tasklist Task Type dynamicMain If the value is dynamicMain, the page contains a new link in the Regional Area. When you click the link, a new tab with the loaded task opens Taskflowid /WEB-INF/oracle/apps/financials/generalLedger/sharedSetup/ coaMappings/ui/flow/ CoaMappingsMainAreaFlow.xml#CoaMappingsMainAreaFlow This is the Taskflow path we retrieved from the Task Definition in FSM earlier in the process Table 1. Parameters and Values for the Task to be added to the customized Tasklist Figure 7. The parameters window of the newly added Task Access the FSM Task from the Journals Workarea Once the FSM task is added and its parameters defined, the user saves the record, closes the Composer making the new task immediately available to users with access to the Journals workarea (Refer to Figure 8 below). Figure 8. The COA Mapping Task is now visible and can be invoked from the Journals Workarea Additional Considerations If a Task Flow is part of a product that is deployed on the same app server as the Tasklist workarea then that task flow can be added to a customized tasklist in that workarea. Otherwise that task flow can be invoked from its parent product’s workarea tasklist by selecting that workarea from the Navigator menu. For Example The following Taskflows belong respectively to the Subledger Accounting, and to the General Ledger Products. /WEB-INF/oracle/apps/financials/subledgerAccounting/accountingMethodSetup/mappingSets/ui/flow/MappingSetFlow.xml#MappingSetFlow /WEB-INF/oracle/apps/financials/generalLedger/sharedSetup/coaMappings/ui/flow/CoaMappingsMainAreaFlow.xml#CoaMappingsMainAreaFlow Since both the Subledger Accounting and General Ledger products are part of the LedgerApp J2EE Applicaton and are both deployed on the General Ledger Cluster Server (Figure 8 below), the user can add both of the above taskflows to the tasklist in the /JournalEntryPage FocusVIewID Workarea. Note: both FSM Taskflows and Functional Taskflows can be added to the Tasklists as described in this document Figure 8. The Topology of the Fusion Financials Product Family. Note that SubLedger Accounting and General Ledger are both deployed on the Ledger App Conclusion In this document we have shown how an administrative user can edit the Tasklist in the Regional Area of a Fusion Apps page using Oracle Composer. This is useful for cases where tasks packaged in different workareas are frequently accessed by the same user. By making these tasks available from the same page, we minimize the number of steps in the navigation the user has to do to perform their transactions and queries in Fusion Apps. The example explained above showed that tasks classified as Setup tasks, meaning made accessible to implementation users from the FSM module can be added to the workarea of their respective Fusion application. This eliminates the need to navigate to FSM to access tasks that are both setup and regular maintenance tasks. References Oracle Fusion Applications Extensibility Guide 11g Release 1 (11.1.1.5) Part Number E16691-02 (Section 3.2) Oracle Fusion Applications Developer's Guide 11g Release 1 (11.1.4) Part Number E15524-05

Read the article
Fraud Detection with the SQL Server Suite Part 2

- by Dejan Sarka

This is the second part of the fraud detection whitepaper. You can find the first part in my previous blog post about this topic. My Approach to Data Mining Projects It is impossible to evaluate the time and money needed for a complete fraud detection infrastructure in advance. Personally, I do not know the customer’s data in advance. I don’t know whether there is already an existing infrastructure, like a data warehouse, in place, or whether we would need to build one from scratch. Therefore, I always suggest to start with a proof-of-concept (POC) project. A POC takes something between 5 and 10 working days, and involves personnel from the customer’s site – either employees or outsourced consultants. The team should include a subject matter expert (SME) and at least one information technology (IT) expert. The SME must be familiar with both the domain in question as well as the meaning of data at hand, while the IT expert should be familiar with the structure of data, how to access it, and have some programming (preferably Transact-SQL) knowledge. With more than one IT expert the most time consuming work, namely data preparation and overview, can be completed sooner. I assume that the relevant data is already extracted and available at the very beginning of the POC project. If a customer wants to have their people involved in the project directly and requests the transfer of knowledge, the project begins with training. I strongly advise this approach as it offers the establishment of a common background for all people involved, the understanding of how the algorithms work and the understanding of how the results should be interpreted, a way of becoming familiar with the SQL Server suite, and more. Once the data has been extracted, the customer’s SME (i.e. the analyst), and the IT expert assigned to the project will learn how to prepare the data in an efficient manner. Together with me, knowledge and expertise allow us to focus immediately on the most interesting attributes and identify any additional, calculated, ones soon after. By employing our programming knowledge, we can, for example, prepare tens of derived variables, detect outliers, identify the relationships between pairs of input variables, and more, in only two or three days, depending on the quantity and the quality of input data. I favor the customer’s decision of assigning additional personnel to the project. For example, I actually prefer to work with two teams simultaneously. I demonstrate and explain the subject matter by applying techniques directly on the data managed by each team, and then both teams continue to work on the data overview and data preparation under our supervision. I explain to the teams what kind of results we expect, the reasons why they are needed, and how to achieve them. Afterwards we review and explain the results, and continue with new instructions, until we resolve all known problems. Simultaneously with the data preparation the data overview is performed. The logic behind this task is the same – again I show to the teams involved the expected results, how to achieve them and what they mean. This is also done in multiple cycles as is the case with data preparation, because, quite frankly, both tasks are completely interleaved. A specific objective of the data overview is of principal importance – it is represented by a simple star schema and a simple OLAP cube that will first of all simplify data discovery and interpretation of the results, and will also prove useful in the following tasks. The presence of the customer’s SME is the key to resolving possible issues with the actual meaning of the data. We can always replace the IT part of the team with another database developer; however, we cannot conduct this kind of a project without the customer’s SME. After the data preparation and when the data overview is available, we begin the scientific part of the project. I assist the team in developing a variety of models, and in interpreting the results. The results are presented graphically, in an intuitive way. While it is possible to interpret the results on the fly, a much more appropriate alternative is possible if the initial training was also performed, because it allows the customer’s personnel to interpret the results by themselves, with only some guidance from me. The models are evaluated immediately by using several different techniques. One of the techniques includes evaluation over time, where we use an OLAP cube. After evaluating the models, we select the most appropriate model to be deployed for a production test; this allows the team to understand the deployment process. There are many possibilities of deploying data mining models into production; at the POC stage, we select the one that can be completed quickly. Typically, this means that we add the mining model as an additional dimension to an existing DW or OLAP cube, or to the OLAP cube developed during the data overview phase. Finally, we spend some time presenting the results of the POC project to the stakeholders and managers. Even from a POC, the customer will receive lots of benefits, all at the sole risk of spending money and time for a single 5 to 10 day project: The customer learns the basic patterns of frauds and fraud detection The customer learns how to do the entire cycle with their own people, only relying on me for the most complex problems The customer’s analysts learn how to perform much more in-depth analyses than they ever thought possible The customer’s IT experts learn how to perform data extraction and preparation much more efficiently than they did before All of the attendees of this training learn how to use their own creativity to implement further improvements of the process and procedures, even after the solution has been deployed to production The POC output for a smaller company or for a subsidiary of a larger company can actually be considered a finished, production-ready solution It is possible to utilize the results of the POC project at subsidiary level, as a finished POC project for the entire enterprise Typically, the project results in several important “side effects” Improved data quality Improved employee job satisfaction, as they are able to proactively contribute to the central knowledge about fraud patterns in the organization Because eventually more minds get to be involved in the enterprise, the company should expect more and better fraud detection patterns After the POC project is completed as described above, the actual project would not need months of engagement from my side. This is possible due to our preference to transfer the knowledge onto the customer’s employees: typically, the customer will use the results of the POC project for some time, and only engage me again to complete the project, or to ask for additional expertise if the complexity of the problem increases significantly. I usually expect to perform the following tasks: Establish the final infrastructure to measure the efficiency of the deployed models Deploy the models in additional scenarios Through reports By including Data Mining Extensions (DMX) queries in OLTP applications to support real-time early warnings Include data mining models as dimensions in OLAP cubes, if this was not done already during the POC project Create smart ETL applications that divert suspicious data for immediate or later inspection I would also offer to investigate how the outcome could be transferred automatically to the central system; for instance, if the POC project was performed in a subsidiary whereas a central system is available as well Of course, for the actual project, I would repeat the data and model preparation as needed It is virtually impossible to tell in advance how much time the deployment would take, before we decide together with customer what exactly the deployment process should cover. Without considering the deployment part, and with the POC project conducted as suggested above (including the transfer of knowledge), the actual project should still only take additional 5 to 10 days. The approximate timeline for the POC project is, as follows: 1-2 days of training 2-3 days for data preparation and data overview 2 days for creating and evaluating the models 1 day for initial preparation of the continuous learning infrastructure 1 day for presentation of the results and discussion of further actions Quite frequently I receive the following question: are we going to find the best possible model during the POC project, or during the actual project? My answer is always quite simple: I do not know. Maybe, if we would spend just one hour more for data preparation, or create just one more model, we could get better patterns and predictions. However, we simply must stop somewhere, and the best possible way to do this, according to my experience, is to restrict the time spent on the project in advance, after an agreement with the customer. You must also never forget that, because we build the complete learning infrastructure and transfer the knowledge, the customer will be capable of doing further investigations independently and improve the models and predictions over time without the need for a constant engagement with me.

Read the article
SQL Sentry First Impressions

- by AjarnMark

After struggling to defend my SQL Servers from a political attack recently, I realized that I needed better tools to back me up, and SQL Sentry is the leading candidate. A couple of weeks ago, seemingly from out of nowhere, complaints from the business users started coming in that one of the core internal applications was running dramatically slower than normal, and fingers were being pointed at the SQL Server. Unfortunately, we don’t have a production DBA whose entire job is to monitor and maintain our SQL Servers. The responsibility falls to me to do the best I can, investing only a small portion of my time, because there are so many other responsibilities to take care of, and our industry is still deep in recession. I inherited these SQL Servers and have made significant improvements in process and procedure, but I had not yet made the time to take real baseline measurements or keep a really close eye on the performance. Like many DBAs, I wrote several of my own tools and used the “built-in tools” like Profiler, PerfMon, and sp_who2 (did I mention most of our instances are SQL Server 2000?). These have all served me well for in-the-moment troubleshooting and maintenance, but they really fell down on the job when I was called upon to “prove” that SQL Server performance was acceptable and more importantly had not degraded recently (i.e. historical comparisons). I really didn’t have anything from a historical comparison perspective, but I was able to show that current performance was acceptable, and deflect attention back onto other components (which in fact turned out to be the real culprit). That experience dramatically illustrated the need for better monitoring tools. Coincidentally, I had been talking recently to my boss about the mini nightmare of monitoring several critical and interdependent overnight jobs that operate on separate instances of SQL Server. Among other tools, I had been using Idera’s SQL Job Manager which is a free tool and did a nice job of showing me job schedules and histories in a nice calendar view. This worked fairly well, and for the money (did I mention it was free?) it couldn’t be beat. But it is based on the stored job history in MSDB, and there were other performance problems that we ran into when we started changing the settings for how much job history to retain, in order to be able to look back a month or more in the calendar view. Another coincidence (if you believe in such things) was that when we had some of those performance challenges, I posted a couple of questions to the #sqlhelp hashtag on Twitter and Greg Gonzalez (@SQLSensei) suggested I check out SQL Sentry’s Event Manager. At the time, I just thought he worked there, but later found out that he founded the company. When I took a quick look at the features & benefits, the one that really jumped out at me is Chaining and Queueing which sounded like it would really help with our “interdependent jobs on different servers” issue. I know that is a lot of background story and coincidences, but hopefully you have stuck with me so far, and now we have arrived at the point where last week I downloaded and installed the 30-day trial of the SQL Sentry Power Suite, which is Event Manager plus Performance Advisor. And I must say that I really like what I see so far. Here are a few highlights: Great Support. I had two issues getting the trial setup and monitoring a handful of our servers. One of which was entirely my fault (missed a security setting in SQL 2008) and the other was mostly my fault (late change to some config settings that were apparently cached and did not get refreshed properly). In both cases, the support staff at SQL Sentry were very responsive and rather quickly figured out what the cause and fix was for each of them. This left me with a great impression of the company. Kudos to them! Chaining and Queueing. While I have not yet activated this feature, I am very excited about the possibilities. We have jobs on three different instances of SQL Server that have to be run in a certain order, and each has to finish before the next can successfully begin, and I believe this feature will ensure just that. It has been a real pain in the backside when one of those jobs runs just a little too long and does not finish before the job on another instance starts, thus triggering a chain reaction of either outright job failures, or worse, successful completion of completely invalid processing. Calendar View. I really, really like the Event Manager calendar view where I can see all jobs and events across all instances and identify potential resource contention as well as windows of opportunity for maintenance activity. Very well done, and based on Event Manager’s own database of accumulated historical information rather than querying the source instances every time. Performance Advisor Dashboard History View. This view let’s me quickly select a date and time range and it displays graphs of key SQL Server and Windows metrics. This is exactly the thing I needed to answer the “has performance changed recently” question at the beginning of this post. Reporting Services Subscription Jobs with Report Name. This was a big and VERY pleasant surprise. If you have ever looked at the list of SQL Server jobs that SQL Server Reporting Services creates when you make a Subscription, you will notice that they all have some sort of GUID as the name of the job. This is really ugly, and really annoying because when you are just looking at the SQL Agent and Job Activity Monitor, if you see that Job X failed, you really do not have any indication in the name or the properties of the Job itself, as to what Report that was for. But with SQL Sentry Event Manager you do. The Jobs list in the Navigator pane in SQL Sentry, amazingly, displays the name of the Report that the Subscription Job is for. And when you open it to see more details, it shows you the full Reporting Services path to that Report, so you can immediately track it down in the Report Manager in case you want to identify/notify the owner or edit the Subscription information. I did not expect this at all, but I sure do like it. HOORAY! That is just my first impressions from using the tools for a few days. And I haven’t even gotten into how it showed me where I was completely mistaken about one aspect of my SQL Server disk configurations. I’ll share that lesson in another blog entry. But I have to say it again, the combination of Event Manager and Performance Advisor working together have really made me a fan.

Read the article
SQL Server Table Polling by Multiple Subscribers

- by Daniel Hester

Background Designing Stored Procedures that are safe for multiple subscribers (to call simultaneously) can be challenging. For example let’s say that you want multiple worker processes to poll a shared work queue that’s encapsulated as a SQL Table. This is a common scenario and through experience you’ll find that you want to use Table Hints to prevent unwanted locking when performing simultaneous queries on the same table. There are three table hints to consider: NOLOCK, READPAST and UPDLOCK. Both NOLOCK and READPAST table hints allow you to SELECT from a table without placing a LOCK on that table. However, SELECTs with the READPAST hint will ignore any records that are locked due to being updated/inserted (or otherwise “dirty”), whereas a SELECT with NOLOCK ignores all locks including dirty reads. For the initial update of the flag (that marks the record as available for subscription) I don’t use the NOLOCK Table Hint because I want to be sensitive to the “active” records in the table and I want to exclude them. I use an Update Lock (UPDLOCK) in conjunction with a WHERE clause that uses a sub-select with a READPAST Table Hint in order to explicitly lock the records I’m updating (UPDLOCK) but not place a lock on the table when selecting the records that I’m going to update (READPAST). UPDATES should be allowed to lock the rows affected because we’re probably changing a flag on a record so that it is not included in a SELECT from another subscriber. On the UPDATE statement we should explicitly use the UPDLOCK to guard against lock escalation. A SELECT to check for the next record(s) to process can result in a shared read lock being held by more than one subscriber polling the shared work queue (SQL table). It is expected that more than one worker process (or server) might try to process the same new record(s) at the same time. When each process then tries to obtain the update lock, none of them can because another process has a shared read lock in place. Thus without the UPDLOCK hint the result would be a lock escalation deadlock; however with the UPDLOCK hint this condition is mitigated against. Note that using the READPAST table hint requires that you also set the ISOLATION LEVEL of the transaction to be READ COMMITTED (rather than the default of SERIALIZABLE). Guidance In the Stored Procedure that returns records to the multiple subscribers: Perform the UPDATE first. Change the flag that makes the record available to subscribers. Additionally, you may want to update a LastUpdated datetime field in order to be able to check for records that “got stuck” in an intermediate state or for other auditing purposes. In the UPDATE statement use the (UPDLOCK) Table Hint on the UPDATE statement to prevent lock escalation. In the UPDATE statement also use a WHERE Clause that uses a sub-select with a (READPAST) Table Hint to select the records that you’re going to update. In the UPDATE statement use the OUTPUT clause in conjunction with a Temporary Table to isolate the record(s) that you’ve just updated and intend to return to the subscriber. This is the fastest way to update the record(s) and to get the records’ identifiers within the same operation. Finally do a set-based SELECT on the main Table (using the Temporary Table to identify the records in the set) with either a READPAST or NOLOCK table hint. Use NOLOCK if there are other processes (besides the multiple subscribers) that might be changing the data that you want to return to the multiple subscribers; or use READPAST if you're sure there are no other processes (besides the multiple subscribers) that might be updating column data in the table for other purposes (e.g. changes to a person’s last name). NOLOCK is generally the better fit in this part of the scenario. See the following as an example: CREATE PROCEDURE [dbo].[usp_NewCustomersSelect] AS BEGIN -- OVERRIDE THE DEFAULT ISOLATION LEVEL SET TRANSACTION ISOLATION LEVEL READ COMMITTED -- SET NOCOUNT ON SET NOCOUNT ON -- DECLARE TEMP TABLE -- Note that this example uses CustomerId as an identifier; -- you could just use the Identity column Id if that’s all you need. DECLARE @CustomersTempTable TABLE ( CustomerId NVARCHAR(255) ) -- PERFORM UPDATE FIRST -- [Customers] is the name of the table -- [Id] is the Identity Column on the table -- [CustomerId] is the business document key used to identify the -- record globally, i.e. in other systems or across SQL tables -- [Status] is INT or BIT field (if the status is a binary state) -- [LastUpdated] is a datetime field used to record the time of the -- last update UPDATE [Customers] WITH (UPDLOCK) SET [Status] = 1, [LastUpdated] = GETDATE() OUTPUT [INSERTED].[CustomerId] INTO @CustomersTempTable WHERE ([Id] = (SELECT TOP 100 [Id] FROM [Customers] WITH (READPAST) WHERE ([Status] = 0) ORDER BY [Id] ASC)) -- PERFORM SELECT FROM ENTITY TABLE SELECT [C].[CustomerId], [C].[FirstName], [C].[LastName], [C].[Address1], [C].[Address2], [C].[City], [C].[State], [C].[Zip], [C].[ShippingMethod], [C].[Id] FROM [Customers] AS [C] WITH (NOLOCK), @CustomersTempTable AS [TEMP] WHERE ([C].[CustomerId] = [TEMP].[CustomerId]) END In a system that has been designed to have multiple status values for records that need to be processed in the Work Queue it is necessary to have a “Watch Dog” process by which “stale” records in intermediate states (such as “In Progress”) are detected, i.e. a [Status] of 0 = New or Unprocessed; a [Status] of 1 = In Progress; a [Status] of 2 = Processed; etc.. Thus, if you have a business rule that states that the application should only process new records if all of the old records have been processed successfully (or marked as an error), then it will be necessary to build a monitoring process to detect stalled or stale records in the Work Queue, hence the use of the LastUpdated column in the example above. The Status field along with the LastUpdated field can be used as the criteria to detect stalled / stale records. It is possible to put this watchdog logic into the stored procedure above, but I would recommend making it a separate monitoring function. In writing the stored procedure that checks for stale records I would recommend using the same kind of lock semantics as suggested above. The example below looks for records that have been in the “In Progress” state ([Status] = 1) for greater than 60 seconds: CREATE PROCEDURE [dbo].[usp_NewCustomersWatchDog] AS BEGIN -- TO OVERRIDE THE DEFAULT ISOLATION LEVEL SET TRANSACTION ISOLATION LEVEL READ COMMITTED -- SET NOCOUNT ON SET NOCOUNT ON DECLARE @MaxWait int; SET @MaxWait = 60 IF EXISTS (SELECT 1 FROM [dbo].[Customers] WITH (READPAST) WHERE ([Status] = 1) AND (DATEDIFF(s, [LastUpdated], GETDATE()) > @MaxWait)) BEGIN SELECT 1 AS [IsWatchDogError] END ELSE BEGIN SELECT 0 AS [IsWatchDogError] END END Downloads The zip file below contains two SQL scripts: one to create a sample database with the above stored procedures and one to populate the sample database with 10,000 sample records. I am very grateful to Red-Gate software for their excellent SQL Data Generator tool which enabled me to create these sample records in no time at all. References http://msdn.microsoft.com/en-us/library/ms187373.aspx http://www.techrepublic.com/article/using-nolock-and-readpast-table-hints-in-sql-server/6185492 http://geekswithblogs.net/gwiele/archive/2004/11/25/15974.aspx http://grounding.co.za/blogs/romiko/archive/2009/03/09/biztalk-sql-receive-location-deadlocks-dirty-reads-and-isolation-levels.aspx

Read the article
The True Cost of a Solution

- by D'Arcy Lussier

I had a Twitter chat recently with someone suggesting Oracle and SQL Server were losing out to OSS (Open Source Software) in the enterprise due to their issues with scaling or being too generic (one size fits all). I challenged that a bit, as my experience with enterprise sized clients has been different – adverse to OSS but receptive to an established vendor. The response I got was: Found it easier to influence change by showing how X can’t solve our problems or X is extremely costly to scale. Money talks. I think this is definitely the right approach for anyone pitching an alternate or alien technology as part of a solution: identify the issue, identify the solution, then present pros and cons including a cost/benefit analysis. What can happen though is we get tunnel vision and don’t present a full view of the costs associated with a solution. An “Acura”te Example (I’m so clever…) This is my dream vehicle, a Crystal Black Pearl coloured Acura MDX with the SH-AWD package! We’re a family of 4 (5 if my daughters ever get their wish of adding a dog), and I’ve always wanted a luxury type of vehicle, so this is a perfect replacement in a few years when our Rav 4 has hit the 8 – 10 year mark. MSRP – $62,890 But as we all know, that’s not *really* the cost of the vehicle. There’s taxes and fees added on, there’s the extended warranty if I choose to purchase it, there’s the finance rate that needs to be factored in… MSRP – $62,890 Taxes – $7,546 Warranty - $2,500 SubTotal – $72,936 Finance Charge – $ 1094.04 Grand Total – $74,030 Well! Glad we did that exercise – we discovered an extra $11k added on to the MSRP! Well now we have our true price…or do we? Lifetime of the Vehicle I’m expecting to have this vehicle for 7 – 10 years. While the hard cost of the vehicle is known and dealt with, the costs to run and maintain the vehicle are on top of this. I did some research, and here’s what I’ve found: Fuel and Mileage Gas prices are high as it is for regular fuel, but getting into an MDX will require that I *only* purchase premium fuel, which comes at a premium price. I need to expect my bill at the pump to be higher. Comparing the MDX to my 2007 Rav4 also shows I’ll be gassing up more often. The Rav4 has a city MPG of 21, while the MDX plummets to 16! The MDX does have a bigger fuel tank though, so all in all the number of times I hit the pumps might even out. Still, I estimate I’ll be spending approximately $8000 – $10000 more on gas over a 10 year period than my current Rav4. Service Options Limited Although I have options with my Toyota here in Winnipeg (we have 4 Toyota dealerships), I do go to my original dealer for any service work. Still, I like the fact that I have options. However, there’s only one Acura dealership in all of Winnipeg! So if, for whatever reason, I’m not satisfied with the level of service I’m stuck. Non Warranty Service Work Also let’s not forget that there’s a bulk of work required every year that is *not* covered under warranty – oil changes, tire rotations, brake pads, etc. I expect I’ll need to get new tires at the 5 years mark as well, which can easily be $1200 – $1500 (I just paid $1000 for new tires for the Rav4 and we’re at the 5 year mark). Now these aren’t going to be *new* costs that I’m not used to from our existing vehicles, but they should still be factored in. I’d budget $500/year, or $5000 over the 10 years I’ll own the vehicle. Final Assessment So let’s re-assess the true cost of my dream MDX: MSRP $62,890 Taxes $7,546 Warranty $2,500 Finance Charge $1094 Gas $10,000 Service Work $5000 Grand Total $89,030 So now I have a better idea of 10 year cost overall, and I’ve identified some concerns with local service availability. And there’s now much more to consider over the original $62,890 price tag. Tying This Back to Technology Solutions The process that we just went through is no different than what organizations do when considering implementing a new system, technology, or technology based solution, within their environments. It’s easy to tout the short term cost savings of particular product/platform/technology in a vacuum. But its when you consider the wider impact that the true cost comes into play. Let’s create a scenario: A company is not happy with its current data reporting suite. An employee suggests moving to an open source solution. The selling points are: - Because its open source its free - The organization would have access to the source code so they could alter it however they wished - It provided features not available with the current reporting suite At first this sounds great to the management and executive, but then they start asking some questions and uncover more information: - The OSS product is built on a technology not used anywhere within the organization - There are no vendors offering product support for the OSS product - The OSS product requires a specific server platform to operate on, one that’s not standard in the organization All of a sudden, the true cost of implementing this solution is starting to become clearer. The company might save money on licensing costs, but their training costs would increase significantly – developers would need to learn how to develop in the technology the OSS solution was built on, IT staff must learn how to set up and maintain a new server platform within their existing infrastructure, and if a problem was found there was no vendor to contact for support. The true cost of implementing a “free” OSS solution is actually spinning up a project to implement it within the organization – no small cost. And that’s just the short-term cost. Now the organization must ensure they maintain trained staff who can make changes to the OSS reporting solution and IT staff that will stay knowledgeable in the new server platform. If those skills are very niche, then higher labour costs could be incurred if those people are hard to find or if trained employees use that knowledge as leverage for higher pay. Maybe a vendor exists that will contract out support, but then there are those costs to consider as well. And let’s not forget end-user training – in our example, anyone that runs reports will need to be trained on how to use the new system. Here’s the Point We still tend to look at software in an “off the shelf” kind of way. It’s very easy to say “oh, this product is better than vendor x’s product – and its free because its OSS!” but the reality is that implementing any new technology within an organization has a cost regardless of the retail price of the product. Training, integration, support – these are real costs that impact an organization and span multiple departments. Whether you’re pitching an improved business process, a new system, or a new technology, you need to consider the bigger picture costs of implementation. What you define as success (in our example, having better reporting functionality) might not be what others define as success if implementing your solution causes them issues. A true enterprise solution needs to consider the entire enterprise.

Read the article
Let your Signature Experience drive IT-decision making

- by Tania Le Voi

Today’s CIO job description: ‘’Align IT infrastructure and solutions with business goals and objectives ; AND while doing so reduce costs; BUT ALSO, be innovative, ensure the architectures are adaptable and agile as we need to act today on the changes that we may request tomorrow.” Sound like an unachievable request? The fact is, reality dictates that CIO’s are put under this type of pressure to deliver more with less. In a past career phase I spent a few years as an IT Relationship Manager for a large Insurance company. This is a role that we see all too infrequently in many of our customers, and it’s a shame. The purpose of this role was to build a bridge, a relationship between IT and the business. Key to achieving that goal was to ensure the same language was being spoken and more importantly that objectives were commonly understood - hence service and projects were delivered to time, to budget and actually solved the business problems. In reality IT and the business are already married, but the relationship is most often defined as ‘supplier’ of IT rather than a ‘trusted partner’. To deliver business value they need to understand how to work together effectively to attain this next level of partnership. The Business cannot compete if they do not get a new product to market ahead of the competition, or for example act in a timely manner to address a new industry problem such as a legislative change. An even better example is when the Application or Service fails and the Business takes a hit by bad publicity, being trending topics on social media and losing direct revenue from online channels. For this reason alone Business and IT need the alignment of their priorities and deliverables now more than ever! Take a look at Forrester’s recent study that found ‘many IT respondents considering themselves to be trusted partners of the business but their efforts are impaired by the inadequacy of tools and organizations’. IT Meet the Business; Business Meet IT So what is going on? We talk about aligning the business with IT but the reality is it’s difficult to do. Like any relationship each side has different goals and needs and language can be a barrier; business vs. technology jargon! What if we could translate the needs of both sides into actionable information, backed by data both sides understand, presented in a meaningful way? Well now we can with the Business-Driven Application Management capabilities in Oracle Enterprise Manager 12cR2! Enterprise Manager’s Business-Driven Application Management capabilities provide the information that IT needs to understand the impact of its decisions on business criteria. No longer does IT need to be focused solely on speeds and feeds, performance and throughput – now IT can understand IT’s impact on business KPIs like inventory turns, order-to-cash cycle, pipeline-to-forecast, and similar. Similarly, now the line of business can understand which IT services are most critical for the KPIs they care about. There are a good deal of resources on Oracle Technology Network that describe the functionality of these products, so I won’t’ rehash them here. What I want to talk about is what you do with these products. What’s next after we meet? Where do you start? Step 1: Identify the Signature Experience. This is THE business process (or set of processes) that is core to the business, the one that drives the economic engine, the process that a customer recognises the company brand for, reputation, the customer experience, the process that a CEO would state as his number one priority. The crème de la crème of your business! Once you have nailed this it gets easy as Enterprise Manager 12c makes it easy. Step 2: Map the Signature Experience to underlying IT. Taking the signature experience, map out the touch points of the components that play a part in ensuring this business transaction is successful end to end, think of it like mapping out a critical path; the applications, middleware, databases and hardware. Use the wealth of Enterprise Manager features such as Systems, Services, Business Application Targets and Business Transaction Management (BTM) to assist you. Adding Real User Experience Insight (RUEI) into the mix will make the end to end customer satisfaction story transparent. Work with the business and define meaningful key performance indicators (KPI’s) and thresholds to enable you to report and action upon. Step 3: Observe the data over time. You now have meaningful insight into every step enabling your signature experience and you understand the implication of that experience on your underlying IT. Watch if for a few months, see what happens and reconvene with your business stakeholders and set clear and measurable targets which can re-define service levels. Step 4: Change the information about which you and the business communicate. It’s amazing what happens when you and the business speak the same language. You’ll be able to make more informed business and IT decisions. From here IT can identify where/how budget is spent whether on the level of support, performance, capacity, HA, DR, certification etc. IT SLA’s no longer need be focused on metrics such as %availability but structured around business process requirements. The power of this way of thinking doesn’t end here. IT staff get to see and understand how their own role contributes to the business making them accountable for the business service. Take a step further and appraise your staff on the business competencies that are linked to the service availability. For the business, the language barrier is removed by producing targeted reports on the signature experience core to the business and therefore key to the CEO. Chargeback or show back becomes easier to justify as the ‘cost of day per outage’ can be more easily calculated; the business will be able to translate the cost to the business to the cost/value of the underlying IT that supports it. Used this way, Oracle Enterprise Manager 12c is a key enabler to a harmonious relationship between the end customer the business and IT to deliver ultimate service and satisfaction. Just engage with the business upfront, make the signature experience visible and let Enterprise Manager 12c do the rest. In the next blog entry we will cover some of the Enterprise Manager features mentioned to enable you to implement this new way of working.

Read the article
Oracle B2B - Synchronous Request Reply

- by cdwright

Introduction So first off, let me say I didn't create this demo (although I did modify it some). I got it from a member of the B2B development technical staff. Since it came with only a simple readme file, I thought I would take some time and write a more detailed explanation about how it works. Beginning with Oracle SOA Suite PS5 (11.1.1.6), B2B supports synchronous request reply over http using the b2b/syncreceiver servlet. I’m attaching the demo to this blog which includes a SOA composite archive that needs to be deployed using JDeveloper, a B2B repository with two agreements that need to be deployed using the B2B console, and a test xml file that gets sent to the b2b/syncreceiver servlet using your favorite SOAP test tool (I'm using Firefox Poster here). You can download the zip file containing the demo here. The demo works by sending the sample xml request file (req.xml) to http://<b2bhost>:8001/b2b/syncreceiver using the SOAP test tool. The syncreceiver servlet keeps the socket connection open between itself and the test tool so that it can synchronously send the reply message back. When B2B receives the inbound request message, it is passed to the SOA composite through the default B2B Fabric binding. A simple reply is created in BPEL and returned to B2B which then sends the message back to the test tool using that same socket connection. I’ll show you the B2B configuration first, then we’ll look at the soa composite. Configuring B2B No additional configuration necessary in order to use the syncreceiver servlet. It is already running when you start SOA. After importing the GC_SyncReqRep.zip repository file into B2B, you’ll have the typical GlobalChips host trading partner and the Acme remote trading partner. Document Management The repository contains two very simple custom XML document definitions called Orders and OrdersResponse. In order to determine the trading partner agreement needed to process the inbound Orders document, you need to know two things about it; what is it and where it came from. So let’s look at how B2B identifies the appropriate document definition for the message. The XSD’s for these two document definitions themselves are not particularly interesting. Whenever you're dealing with custom XML documents, B2B identifies the appropriate document definition for each XML message using an XPath Identification Expression. The expression is entered for each of these document definitions under the document administration tab in the B2B console. The full XPATH expression for the Orders document is //*[local-name()='shiporder']/*[local-name()='shipto']/*[local-name()='name']/text(). You can see this path in the XSD diagram below and how it uniquely identifies this message. The OrdersReponse document is identified in the same way. The XPath expression for it is //*[local-name()='Response']/*[local-name()='Status']/text(). You can see how it’s path differs uniquely identifying the reply from the request. Trading Partner Profile The trading partner profiles are very simple too. For GlobalChips, a generic identifier is being used to identify the sender of the response document using the host trading partner name. For Acme, a generic identifier is also being used to identify the sender of the inbound request using the remote trading partner name. The document types are added for the remote trading partner as usual. So the remote trading partner Acme is the sender of the Orders document, and it is the receiver of the OrdersResponse document. For the remote trading partner only, there needs to be a dummy channel which gets used in the outbound response agreement. The channel is not actually used. It is just a necessary place holder that needs to be there when creating the agreement. Trading Partner Agreement The agreements are equally simple. There is no validation and translation is not an option for a custom XML document type. For the InboundAgreement (request) the document definition is set to OrdersDef. In the Agreement Parameters section the generic identifiers have been added for the host and remote trading partners. That’s all that is needed for the inbound transaction. For the OutboundAgreement (response), the document definition is set to OrdersResponseDef and the generic identifiers for the two trading partners are added. The remote trading partner dummy delivery channel is also added to the agreement. SOA Composite Import the SOA composite archive into JDeveloper as an EJB JAR file. Open the composite and you should have a project that looks like this. In the composite, open the b2bInboundSyncSvc exposed service and advance through the setup wizard. Select your Application Server Connection and advance to the Operations window. Notice here that the B2B binding is set to Receive. It is not set for Synchronous Request Reply. Continue advancing through the wizard as you normally would and select finish at the end. Now open BPELProcess1 in the composite. The BPEL process is set as a Synchronous Request Reply as you can see below. The while loop is there just to give the process something to do. The actual reply message is prepared in the assignResponseValues assignment followed by an Invoke of the B2B binding. Open the replyResponse Invoke and go to the properties tab. You’ll see that the fromTradingPartnerId, toTradingPartner, documentTypeName, and documentProtocolRevision properties have been set. Testing the Configuration To test the configuration, I used Firefox Poster. Enter the URL for the b2b/syncreceiver servlet and browse for the req.xml file that contains the test request message. In the Headers tab, add the property ‘from’ and give it the value ‘Acme’. This is how B2B will know where the message is coming from and it will use that information along with the document type name to find the right trading partner agreement. Now post the message. You should get back a response with a status of ‘200 OK’. That’s all there is to it.

Read the article
Optimized OCR black/white pixel algorithm

- by eagle

I am writing a simple OCR solution for a finite set of characters. That is, I know the exact way all 26 letters in the alphabet will look like. I am using C# and am able to easily determine if a given pixel should be treated as black or white. I am generating a matrix of black/white pixels for every single character. So for example, the letter I (capital i), might look like the following: 01110 00100 00100 00100 01110 Note: all points, which I use later in this post, assume that the top left pixel is (0, 0), bottom right pixel is (4, 4). 1's represent black pixels, and 0's represent white pixels. I would create a corresponding matrix in C# like this: CreateLetter("I", new List<List<bool>>() { new List<bool>() { false, true, true, true, false }, new List<bool>() { false, false, true, false, false }, new List<bool>() { false, false, true, false, false }, new List<bool>() { false, false, true, false, false }, new List<bool>() { false, true, true, true, false } }); I know I could probably optimize this part by using a multi-dimensional array instead, but let's ignore that for now, this is for illustrative purposes. Every letter is exactly the same dimensions, 10px by 11px (10px by 11px is the actual dimensions of a character in my real program. I simplified this to 5px by 5px in this posting since it is much easier to "draw" the letters using 0's and 1's on a smaller image). Now when I give it a 10px by 11px part of an image to analyze with OCR, it would need to run on every single letter (26) on every single pixel (10 * 11 = 110) which would mean 2,860 (26 * 110) iterations (in the worst case) for every single character. I was thinking this could be optimized by defining the unique characteristics of every character. So, for example, let's assume that the set of characters only consists of 5 distinct letters: I, A, O, B, and L. These might look like the following: 01110 00100 00100 01100 01000 00100 01010 01010 01010 01000 00100 01110 01010 01100 01000 00100 01010 01010 01010 01000 01110 01010 00100 01100 01110 After analyzing the unique characteristics of every character, I can significantly reduce the number of tests that need to be performed to test for a character. For example, for the "I" character, I could define it's unique characteristics as having a black pixel in the coordinate (3, 0) since no other characters have that pixel as black. So instead of testing 110 pixels for a match on the "I" character, I reduced it to a 1 pixel test. This is what it might look like for all these characters: var LetterI = new OcrLetter() { Name = "I", BlackPixels = new List<Point>() { new Point (3, 0) } } var LetterA = new OcrLetter() { Name = "A", WhitePixels = new List<Point>() { new Point(2, 4) } } var LetterO = new OcrLetter() { Name = "O", BlackPixels = new List<Point>() { new Point(3, 2) }, WhitePixels = new List<Point>() { new Point(2, 2) } } var LetterB = new OcrLetter() { Name = "B", BlackPixels = new List<Point>() { new Point(3, 1) }, WhitePixels = new List<Point>() { new Point(3, 2) } } var LetterL = new OcrLetter() { Name = "L", BlackPixels = new List<Point>() { new Point(1, 1), new Point(3, 4) }, WhitePixels = new List<Point>() { new Point(2, 2) } } This is challenging to do manually for 5 characters and gets much harder the greater the amount of letters that are added. You also want to guarantee that you have the minimum set of unique characteristics of a letter since you want it to be optimized as much as possible. I want to create an algorithm that will identify the unique characteristics of all the letters and would generate similar code to that above. I would then use this optimized black/white matrix to identify characters. How do I take the 26 letters that have all their black/white pixels filled in (e.g. the CreateLetter code block) and convert them to an optimized set of unique characteristics that define a letter (e.g. the new OcrLetter() code block)? And how would I guarantee that it is the most efficient definition set of unique characteristics (e.g. instead of defining 6 points as the unique characteristics, there might be a way to do it with 1 or 2 points, as the letter "I" in my example was able to). An alternative solution I've come up with is using a hash table, which will reduce it from 2,860 iterations to 110 iterations, a 26 time reduction. This is how it might work: I would populate it with data similar to the following: Letters["01110 00100 00100 00100 01110"] = "I"; Letters["00100 01010 01110 01010 01010"] = "A"; Letters["00100 01010 01010 01010 00100"] = "O"; Letters["01100 01010 01100 01010 01100"] = "B"; Now when I reach a location in the image to process, I convert it to a string such as: "01110 00100 00100 00100 01110" and simply find it in the hash table. This solution seems very simple, however, this still requires 110 iterations to generate this string for each letter. In big O notation, the algorithm is the same since O(110N) = O(2860N) = O(N) for N letters to process on the page. However, it is still improved by a constant factor of 26, a significant improvement (e.g. instead of it taking 26 minutes, it would take 1 minute). Update: Most of the solutions provided so far have not addressed the issue of identifying the unique characteristics of a character and rather provide alternative solutions. I am still looking for this solution which, as far as I can tell, is the only way to achieve the fastest OCR processing. I just came up with a partial solution: For each pixel, in the grid, store the letters that have it as a black pixel. Using these letters: I A O B L 01110 00100 00100 01100 01000 00100 01010 01010 01010 01000 00100 01110 01010 01100 01000 00100 01010 01010 01010 01000 01110 01010 00100 01100 01110 You would have something like this: CreatePixel(new Point(0, 0), new List<Char>() { }); CreatePixel(new Point(1, 0), new List<Char>() { 'I', 'B', 'L' }); CreatePixel(new Point(2, 0), new List<Char>() { 'I', 'A', 'O', 'B' }); CreatePixel(new Point(3, 0), new List<Char>() { 'I' }); CreatePixel(new Point(4, 0), new List<Char>() { }); CreatePixel(new Point(0, 1), new List<Char>() { }); CreatePixel(new Point(1, 1), new List<Char>() { 'A', 'B', 'L' }); CreatePixel(new Point(2, 1), new List<Char>() { 'I' }); CreatePixel(new Point(3, 1), new List<Char>() { 'A', 'O', 'B' }); // ... CreatePixel(new Point(2, 2), new List<Char>() { 'I', 'A', 'B' }); CreatePixel(new Point(3, 2), new List<Char>() { 'A', 'O' }); // ... CreatePixel(new Point(2, 4), new List<Char>() { 'I', 'O', 'B', 'L' }); CreatePixel(new Point(3, 4), new List<Char>() { 'I', 'A', 'L' }); CreatePixel(new Point(4, 4), new List<Char>() { }); Now for every letter, in order to find the unique characteristics, you need to look at which buckets it belongs to, as well as the amount of other characters in the bucket. So let's take the example of "I". We go to all the buckets it belongs to (1,0; 2,0; 3,0; ...; 3,4) and see that the one with the least amount of other characters is (3,0). In fact, it only has 1 character, meaning it must be an "I" in this case, and we found our unique characteristic. You can also do the same for pixels that would be white. Notice that bucket (2,0) contains all the letters except for "L", this means that it could be used as a white pixel test. Similarly, (2,4) doesn't contain an 'A'. Buckets that either contain all the letters or none of the letters can be discarded immediately, since these pixels can't help define a unique characteristic (e.g. 1,1; 4,0; 0,1; 4,4). It gets trickier when you don't have a 1 pixel test for a letter, for example in the case of 'O' and 'B'. Let's walk through the test for 'O'... It's contained in the following buckets: // Bucket Count Letters // 2,0 4 I, A, O, B // 3,1 3 A, O, B // 3,2 2 A, O // 2,4 4 I, O, B, L Additionally, we also have a few white pixel tests that can help: (I only listed those that are missing at most 2). The Missing Count was calculated as (5 - Bucket.Count). // Bucket Missing Count Missing Letters // 1,0 2 A, O // 1,1 2 I, O // 2,2 2 O, L // 3,4 2 O, B So now we can take the shortest black pixel bucket (3,2) and see that when we test for (3,2) we know it is either an 'A' or an 'O'. So we need an easy way to tell the difference between an 'A' and an 'O'. We could either look for a black pixel bucket that contains 'O' but not 'A' (e.g. 2,4) or a white pixel bucket that contains an 'O' but not an 'A' (e.g. 1,1). Either of these could be used in combination with the (3,2) pixel to uniquely identify the letter 'O' with only 2 tests. This seems like a simple algorithm when there are 5 characters, but how would I do this when there are 26 letters and a lot more pixels overlapping? For example, let's say that after the (3,2) pixel test, it found 10 different characters that contain the pixel (and this was the least from all the buckets). Now I need to find differences from 9 other characters instead of only 1 other character. How would I achieve my goal of getting the least amount of checks as possible, and ensure that I am not running extraneous tests?

Read the article
Chrome: Saved username/password filled in incognito mode

- by Wouter Coekaerts

If I open an incognito window in Google Chrome and go to a webpage where Chrome has a saved username and password from (for example the login form on http://gmail.com), I see that my username and password are automatically filled in. Does that mean that I am not really incognito? Can the website see my username even if I don't explicitly log in? Or is there some mechanism behind the scenes that prevents the webpage from grabbing auto-filled values unless I actually log in? Clarification: Stored usernames (and passwords) are a lot like cookies: your unique identifier linked to a certain site, stored locally in your browser, available to the site when you open it. When you go incognito you ask your browser not to identify you to the sites you visit. It does that by (among other things) not exposing its cookies. Exposing the stored username in this mode does not make sense to me (but maybe I'm missing something...).

Read the article
Using Ubuntu Karmic as an L2TP Client for VPN

- by James Lawrie

I'm trying to connect to a VPN service over L2TP using Karmic as a client and it's not working. The only details I have are the remote IP address, username & password, and a shared secret string; this is enough for Windows but doesn't appear to be enough for Ubuntu. I've tried using network-manager-vpnc and vpnc from the terminal to connect and I get "no supported authentication", and trying with OpenSwan it says "unable to identify either side of the connection". I'd really appreciate some help here if anyone else has implemented this successfully.

Read the article
What is causing a vm to exhibit packet loss?

- by d03boy

We have a pretty nice piece of hardware set up to run multiple virtual machines in vmware and one of the vm's is an instance of Windows Server 2003 running SQL Server 2005. For some reason we occasionally see 10-20 seconds of straight packet loss to this machine from remote machines (my workstation) as well as other vm's on the same physical hardware. I am using PingPlotter to keep a close eye on the packet loss. So far we've turned off flow control on the NIC but we are already running out of other things to try. What might be causing this and how can I identify the problem? Note: We also have another server with a very similar configuration with the same type of problem to a lesser extent (because its not used as heavily?)

Read the article
Virus that duplicates word documents as exe

- by Bob Rivers

Hi, We are facing a virus problem on our network, but I'm unable to identify it, so we can't properly deal with it. The symptoms are that the virus duplicates a word document (.doc) generating a new archive with the same name, but with an exe extension, and, after that, the virus hides the original file. So, when the user clicks over the file, it propagates itself. Symantec AV seems to be able to block it: every time that the virus tries to generate the exe, symantec blocks it, but at this point, the original file was already converted to hidden, so the user thinks that the file has been deleted. Symantec identifies it as a simple trojan horse. I already started a full scan, but it didn't found nothing. I'm trying to know the virus name in order to fight it. Does anyone has any kind of information? TIA, Bob

Read the article
Webmin/ SpamAssassin doesn't appear to be 'learning' from forwarded examples of spam

- by James

I have spamtrap@ and hamtrap@ addresses set up on my mail server and forward examples of spam to the spamtrap address. I was hoping that after a few examples, SpamAssassin would 'learn' to identify the particular characteristics of spammy mail with common attributes, but this doesn't appear to be the case - it still gets delivered as normal mail. For example, some emails from the same sender and/ or with the same subject line, despite being sent several times to spamtrap@, are just delivered normally. Does it sound like SpamAssassin isn't working or correctly configured, or have I misunderstood a fundamental aspect of how it works?

Read the article
Middle Mouse Button does not work in XFCE / Arch Linux

- by Alp

I have the XFCE desktop manager installed on my Arch Linux system. With E17 (Enlightenment desktop manager) i had no problems with my mouse: all buttons worked correctly out of the box. But in XFCE my middle mouse button does not fire an event at all (no output with xev). Evdev seems to identify my mouse correctly (Razer Deathadder) because it echoes its name in the xorg logs. I have no idea what could cause this and how to debug the problem. I start both e17 and xfce with startx. Here is my ~/.xinitrc: exec startxfce4 --with-ck-launch #exec enlightenment-start

Read the article
Seperator in dock in osx

- by sagar

I have placed too many icons in my dock. there is by default a separator in dock between applications & trash. I want to add more separators in my dock - for grouping purpose. say for example finder, preview, itunes, system pref.,activity monitor FOR Mac osx group Mozilla, safari - for Browsing group Odesk, skype, ipmessanger, adium, team viewer for communication Means, I just want to add separator to identify them very quickly. Is it possible ? if yes - how ? Thanks in advance for sharing your knowledge. sagar

Read the article
Fastest light-weight image viewer over forwarded x11 session (linux)

- by Matthew

I have a slow network connection over which I'm forwarding x11 over ssh. I want to view images on the remote host (Ubuntu) quickly and efficiently. I'm looking for an image viewer that will take into account the image viewer window's resolution and downsize the image before sending it over the network, instead of sending the full size image. The images I want to view will be around 5MB and I only need to be able to browse through tiny thumbnails of the images to identify the image I'm looking for. It is not necessary to be able to see more than one image at a time. Highest speed over slow network connection is the priority. Thanks! Matthew EDIT: It's possible that the way x11 forwarding works, only the image at the display resolution will be transferred anyway. If that's true, please confirm and the question still stands for which image viewer will be the fastest over a slow connection

Read the article
Wi-Fi performance in Windows 8 RP on a MacBook Air (mid 2011)

- by Steven Lu

I was able to install the Boot Camp Windows software using the executable that it provided, and there are no unrecognized or unknown devices in Device Manager. Wi-Fi works but it seems to be limited to an extremely slow 1.5Mbits. Network Center reports an 802.11n connection (at 65Mbps usually) but transfers never reach above about 200kB/s. Being limited to 1/20th of the connection speed of my internet service is quite frustrating. Does anybody experience the same issue? I have been trying to identify the Broadcom Wi-Fi chipset and a driver that I could try to upgrade to but I have made very little progress on Google on this front.

Read the article
High server load - [jbd2/md1-8] using 99.99% IO

- by Alex

I've been having spike in load over the last week. This usually occurs once or twice a day. I've managed to identify from iotop that [jbd2/md1-8] is using 99.99 % IO. During the high load times there is no high traffic to the server. Server specs are: AMD Opteron 8 core 16 GB RAM 2x2.000 GB 7.200 RPM HDD Software Raid 1 Cloudlinux + Cpanel Mysql is properly tuned Apart from the spikes, the load usually is around 0.80 at most. I've searched around but can't find what [jbd2/md1-8] does exactly. Has anyone had this problem or does anyone know a possible solution? Thank you. UPDATE: TIME TID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND 16:05:36 399 be/3 root 0.00 B/s 38.76 K/s 0.00 % 99.99 % [jbd2/md1-8]

Read the article

Search Results

Search found 2372 results on 95 pages for 'identify'.

Page 37/95 | < Previous Page | 33 34 35 36 37 38 39 40 41 42 43 44 | Next Page >

- by Trevor Bekolay

- by Greg_Gutkin

- by pinaldave

- by jamiet

- by pinaldave

- by Dave

- by Bertrand Le Roy

- by Elie Wazen

- by Dejan Sarka

- by AjarnMark

- by Daniel Hester

- by D'Arcy Lussier

- by Tania Le Voi

- by cdwright

- by eagle

- by Wouter Coekaerts

- by James Lawrie

- by d03boy

- by Bob Rivers

- by James

- by Alp

- by sagar

- by Matthew

- by Steven Lu

- by Alex

< Previous Page | 33 34 35 36 37 38 39 40 41 42 43 44 | Next Page >