Search Results

Search found 18272 results on 731 pages for 'ldap query'.

Page 93/731 | < Previous Page | 89 90 91 92 93 94 95 96 97 98 99 100  | Next Page >

  • Automounting AFP Share For OpenDirectory Users (Mac Lion Server)

    - by davzie
    We're moving all our Mac's from local logins to OpenDirectory logins in an effort to keep everything secure and to also solve issues we have with a Tiger server buggering permissions on an AFP shared drive. To cut it short, I want users to be able to login to their OpenDirectory account and I then want the AFP share point to mount using the credentials they just logged in with. Firstly, is this possible (I assumed it was) and secondly how might I go about doing it? Cheers, Dave

    Read the article

  • stdout, stderr, and what else? (going insane parsing slapadd output)

    - by user64204
    I am using slapadd to restore a backup. That backup contains 45k entries which takes a while to restore so I need to get some progress update from slapadd. Luckily for me there is the -v switch which gives an output similar to this one: added: "[email protected],ou=People,dc=example,dc=org" (00003d53) added: "[email protected],ou=People,dc=example,dc=org" (00003d54) added: "[email protected],ou=People,dc=example,dc=org" (00003d55) .######## 44.22% eta 05m05s elapsed 04m spd 29.2 k/s added: "[email protected],ou=People,dc=example,dc=org" (00003d56) added: "[email protected],ou=People,dc=example,dc=org" (00003d57) added: "[email protected],ou=People,dc=example,dc=org" (00003d58) added: "[email protected],ou=People,dc=example,dc=org" (00003d59) Every N entries added, slapadd writes a progress update output line (.######## 44.22% eta 05m05s elapsed ...) which I want to keep and an output line for every entry created which I want to hide because it exposes people's email address but still want to count them to know how many users were imported The way I thought about hiding emails and showing the progress update is this: $ slapadd -v ... 2>&1 | tee log.txt | grep '########' # => would give me real-time progress update $ grep "added" log.txt | wc -l # => once backup has been restored I would know how many users were added I tried different variations of the above, and whatever I try I can't grep the progress update output line. I traced slapadd as follows: sudo strace slapadd -v ... And here is what I get: write(2, "added: \"[email protected]"..., 78added: "[email protected],ou=People,dc=example,dc=org" (00000009) ) = 78 gettimeofday({1322645227, 253338}, NULL) = 0 _######## 44.22% eta 05m05s elapsed 04m spd 29.2 k/s ) = 80 write(2, "\n", 1 ) As you can see, the percentage line isn't sent to either stdout or stderr (FYI I have validated with known working and failing commands that 2 is stderr and 1 is stdout) Q1: Where is the progress update output line going? Q2: How can I grep on it while sending stderr to a file? Additional info: I'm running Openldap 2.4.21 on ubuntu server 10.04

    Read the article

  • OS X Server - setting up a share for network home directories + fast user switching

    - by sohocoke
    I'm nearly finished setting up my Open Directory master to allow users to be managed centrally and logged in on any of the client machines at home. I found discussion suggesting that using an AFP share for the 'Users' automount would result in network users (cf. users defined locally, or Portable Home Directory users) are unable to use fast user switching, as the first user logging in would trigger the automount and mount the 'Users' share with his permissions, preventing further users from using the mount. I've also found some suggestions to configure autofs such that the 'Users' share is mounted prior to any user logging in, but not great amounts of details along these lines. I'd greatly appreciate some instructions to set up autofs - ideally, on the OpenDirectory server rather than each client's etc/fstab or equivalent location, so that it isn't required per every client machine - to get fast user switching working with network users.

    Read the article

  • openldap make sure password does not contain username

    - by Ryan Horrisberger
    Is there a way using openldap to ensure that a user's password does not contain their name or their username? I know that you can use the ppolicy overlay pwdCheckModule by writing a C function to do password checking, but is this the best approach? It doesn't seem like many folks are doing password quality checking this way--the only example I've found is a github example which only does basic checking.

    Read the article

  • Ubuntu Server 10.10 vs. Fedora Server 14 for Mono.NET app hosting in VM

    - by Abbas
    Ubuntu Server 10.10 vs. Fedora Server 14 I want to create a web-server running Mono, MySQL 5.5 and OpenLDAP running as a VM (on VMWare Workstation). Searching “Ubuntu Server vs. Fedora Server” mostly yields flame wars and noise. There are a few good articles available but they are either out-of-date or don’t offer very convincing arguments. I know the answer is most likely to be “it depends” but I wanted to harness the collective wisdom on ServerFault and get opinions, experiences and factual information to the extent possible. My selection criteria would be (other than what is mentioned above): Ease of use Ease of development Reliability Security

    Read the article

  • How to migrate Fedora DS (389 DS) to a new machine?

    - by zengr
    Hello, I am trying to migrate a Fedora DS (1.2.2) to a new server (1.2.7.5). The process has been painful to say the least. The old server (1.2.2) was also an upgrade from an old fedora DS setup, so it does not contain migrate-ds-admin.pl. I found this question, but the URL does not open. I am aware that I need to use migrate-ds-admin.pl, but I am clueless. How do I use it? I assume this works like this: 1. Copy migrate-ds-admin.pl from server which has 1.2.7 to 1.2.2 2. Run migrate-ds-admin.pl to export the schema+ldif from 1.2.2 3. Import the schema+ldif to 1.2.7 using migrate-ds-admin.pl. If the above is true, then what parameters are need for export and import? Note: ./ldif2db -n NetscapeRoot -i /root/NetscapeRoot.ldif ./ldif2db -n userRoot -i /root/userRoot.ldif The above two commands work like a charm, but since the schema (custom schema) is not migrated, I see alot of errors during import.

    Read the article

  • What's the appropriate way to upgrade Apache in RHEL?

    - by jldugger
    The version of Apache shipped in RHEL 5.4 is very old. A feature I need only shipped recently. It seems Apache upstream only ships tarballs, and omits binary packages. Obviously I could build from source, but what's the canonical way to upgrade a single package like this? Is it common procedure to drop a newer tarball in the existing SPEC, or does someone already do all this with an eye towards RHEL?

    Read the article

  • OSX Server - How to set environment variable on network user login

    - by tmkly3
    I have a group of users on my server, "Developers", and I would like an environment variable to be set for them whenever they login. More specifically, when anyone in this group logs in, I would like the equivalent of: setenv ANDROID_SDK_HOME /Developers/Android/User to be set at login. I can do this with a login script if necessary, but what I'm asking is: is it possible to set this type of thing in Profile Manager, Workgroup Manager, Directory Utility, etc? Thanks - I've looked everywhere but can't find anything.

    Read the article

  • Active Directory: Determining DN or OU from log in credentials [closed]

    - by Christopher Broome
    I'm updating a PHP login process to leverage active directory on a Windows server. The logging in process seems pretty straight forward via a "ldap_bind", but I also want to pull some profile information from the AD server (first name, last name, etc...) which seems to require a robust distinguished name (DN). When on the windows server I can grab this via 'dsquery user' on the command prompt, but is there a way to get the same value from just the user's login credentials in PHP? I want to avoid getting a list of hundreds of DNs when on-boarding clients and associating each with one of our users, so any means to programmatically determine this would be preferential. Otherwise, I'll know the domain and host for the request so I can at least set the DC portions of the DN, but the organizational units (OU) seem to be pretty important for querying data. If I can find some of the root level OU values associated with the user I can do a ldap_search and crawl. I browsed through the existing questions and found some similar but nothing that really addressed this, so my apologies if the obvious answer is out there. Thanks for the help.

    Read the article

  • Start TLS and 389 Directory

    - by Kyle Flavin
    I'm trying to configure Start TLS on 389 Directory server, but I'm having all sorts of issues. I've been following this doc: https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Directory_Server/9.0/html/Administration_Guide/managing-certs.html which specifies that I should create a certificate for both the directory server and admin server. I've imported the CA cert on both servers. I've tried to use the same server certificate for both. It will not allow me to do so. However, the admin and directory servers reside on the same host. If I generate a new certificate it will need to use the same hostname. I'm not sure if that's valid... Has anyone out there set this up before? Any direction would be helpful. I have multmaster replication set up. From an external client, I'm attempting to do an ldapsearch -ZZ -x -h "myhost" -b "dc=example,dc=com" -D "cn=Directory Manager" -W "", and I'm getting a protocol error.

    Read the article

  • Sun Directory Server 5.2 performance

    - by tmow
    Hi all, I'm using logconv.pl (provided by Sun), to measure performance on my server. These two metrics results, are worrying me a bit: Binds: 192164 Unbinds: 111569 In fact the difference between the two it's quite big, how can I determine which are the unbound requests? As stated by Lodovic: Many applications just close the connections without sending an Unbind request. This simply can explain the difference. But the logconv.pl doesn't show details about the unbound requests, do you know any other tools or can you suggest some queries or whatever that can help me find out the root cause? Do you think anyway that the performances may improve fixing the issue?

    Read the article

  • SVN Active Directory authentication with ProxyPass redirect in the mix

    - by Jason B. Standing
    We have a BitNami SVN stack running on a Windows machine which holds our SVN repository. It's set up to authenticate against our AD server and uses authz to control rights. Everything works perfectly if Tortoise points at http://[machine name]/svn However - we need to be able to access it from http://[domain]/svn. The domain name points to a linux environment that we're decommissioning, but until we do, other systems on that box prevent us from just re-pointing the domain record. Currently, we've got a ProxyPass record on the linux machine to forward requests through to http://[machine name]/svn - it seems to work fine, and the endpoint machine asks for credentials, then authenticates: but when that happens, the access attempt is logged as coming from the linux box, rather than from the user who has authenticated. It's almost like some element of the credentials aren't being passed through to the endpoint machine. Has anyone done this before, or is there other info I can give to try to make sense of this problem, and figure out a way to solve it? Thankyou!

    Read the article

  • setting up a samba PDC -error with testparm

    - by Rungano
    Hi guys I have installed a samba PDC but when I test the samba configurations file I am getting errors like these, "Invalid combination of parameters for service homes. Map system can only work if create mask includes octal 010 (S_IXGRP)." My Configuration file is as follows [homes] comment = Home Directories path = /home_srv1/%u valid users = %S read only = No create mask = 0660 directory mask = 0770 browseable = No I tried to google but with no luck, Serverfault is always my best hope. Thanks for helping out.

    Read the article

  • What filesystem comes closest to matching NTFS for support of ACLs, and highly-granular permissioning?

    - by warren
    It seems that most other filesystems handle the basic *nix permissions (ugo±rwx), with maybe an addition here or there. Or can be "made" to handle ACLs through the use of other tools on top of the system. On the wikipedia pages about filesystems (http://en.wikipedia.org/wiki/List%5Fof%5Ffile%5Fsystems & http://en.wikipedia.org/wiki/Comparison%5Fof%5Ffile%5Fsystems), it appears that while some do support extended meta-data, none support natively the level of permissioning that NTFS does. Am I wrong in this understanding?

    Read the article

  • How to disable password change for openldap user?

    - by Keve
    Considering possible solutions for some improvements I run into this theoretical question and I couldn't find a satisfying answer. Some of you may have first-hand experience with this in practice, so here the question goes: How can I disable password changing for an OpenLDAP user? The account must stay enabled, allowed to log on to workstations and work as usual, but should not be able to change its own password. Can this be done? If so, how difficult is it to implement it? All suggestions are appreciated! For reference: Servers and workstations are to run a mixture of FreeBSD and OpenBSD. Accounts to get password disabled are student or generic workstation accounts. Environment is a school.

    Read the article

  • SQL SERVER – How to Recover SQL Database Data Deleted by Accident

    - by Pinal Dave
    In Repair a SQL Server database using a transaction log explorer, I showed how to use ApexSQL Log, a SQL Server transaction log viewer, to recover a SQL Server database after a disaster. In this blog, I’ll show you how to use another SQL Server disaster recovery tool from ApexSQL in a situation when data is accidentally deleted. You can download ApexSQL Recover here, install, and play along. With a good SQL Server disaster recovery strategy, data recovery is not a problem. You have a reliable full database backup with valid data, a full database backup and subsequent differential database backups, or a full database backup and a chain of transaction log backups. But not all situations are ideal. Here we’ll address some sub-optimal scenarios, where you can still successfully recover data. If you have only a full database backup This is the least optimal SQL Server disaster recovery strategy, as it doesn’t ensure minimal data loss. For example, data was deleted on Wednesday. Your last full database backup was created on Sunday, three days before the records were deleted. By using the full database backup created on Sunday, you will be able to recover SQL database records that existed in the table on Sunday. If there were any records inserted into the table on Monday or Tuesday, they will be lost forever. The same goes for records modified in this period. This method will not bring back modified records, only the old records that existed on Sunday. If you restore this full database backup, all your changes (intentional and accidental) will be lost and the database will be reverted to the state it had on Sunday. What you have to do is compare the records that were in the table on Sunday to the records on Wednesday, create a synchronization script, and execute it against the Wednesday database. If you have a full database backup followed by differential database backups Let’s say the situation is the same as in the example above, only you create a differential database backup every night. Use the full database backup created on Sunday, and the last differential database backup (created on Tuesday). In this scenario, you will lose only the data inserted and updated after the differential backup created on Tuesday. If you have a full database backup and a chain of transaction log backups This is the SQL Server disaster recovery strategy that provides minimal data loss. With a full chain of transaction logs, you can recover the SQL database to an exact point in time. To provide optimal results, you have to know exactly when the records were deleted, because restoring to a later point will not bring back the records. This method requires restoring the full database backup first. If you have any differential log backup created after the last full database backup, restore the most recent one. Then, restore transaction log backups, one by one, it the order they were created starting with the first created after the restored differential database backup. Now, the table will be in the state before the records were deleted. You have to identify the deleted records, script them and run the script against the original database. Although this method is reliable, it is time-consuming and requires a lot of space on disk. How to easily recover deleted records? The following solution enables you to recover SQL database records even if you have no full or differential database backups and no transaction log backups. To understand how ApexSQL Recover works, I’ll explain what happens when table data is deleted. Table data is stored in data pages. When you delete table records, they are not immediately deleted from the data pages, but marked to be overwritten by new records. Such records are not shown as existing anymore, but ApexSQL Recover can read them and create undo script for them. How long will deleted records stay in the MDF file? It depends on many factors, as time passes it’s less likely that the records will not be overwritten. The more transactions occur after the deletion, the more chances the records will be overwritten and permanently lost. Therefore, it’s recommended to create a copy of the database MDF and LDF files immediately (if you cannot take your database offline until the issue is solved) and run ApexSQL Recover on them. Note that a full database backup will not help here, as the records marked for overwriting are not included in the backup. First, I’ll delete some records from the Person.EmailAddress table in the AdventureWorks database.   I can delete these records in SQL Server Management Studio, or execute a script such as DELETE FROM Person.EmailAddress WHERE BusinessEntityID BETWEEN 70 AND 80 Then, I’ll start ApexSQL Recover and select From DELETE operation in the Recovery tab.   In the Select the database to recover step, first select the SQL Server instance. If it’s not shown in the drop-down list, click the Server icon right to the Server drop-down list and browse for the SQL Server instance, or type the instance name manually. Specify the authentication type and select the database in the Database drop-down list.   In the next step, you’re prompted to add additional data sources. As this can be a tricky step, especially for new users, ApexSQL Recover offers help via the Help me decide option.   The Help me decide option guides you through a series of questions about the database transaction log and advises what files to add. If you know that you have no transaction log backups or detached transaction logs, or the online transaction log file has been truncated after the data was deleted, select No additional transaction logs are available. If you know that you have transaction log backups that contain the delete transactions you want to recover, click Add transaction logs. The online transaction log is listed and selected automatically.   Click Add if to add transaction log backups. It would be best if you have a full transaction log chain, as explained above. The next step for this option is to specify the time range.   Selecting a small time range for the time of deletion will create the recovery script just for the accidentally deleted records. A wide time range might script the records deleted on purpose, and you don’t want that. If needed, you can check the script generated and manually remove such records. After that, for all data sources options, the next step is to select the tables. Be careful here, if you deleted some data from other tables on purpose, and don’t want to recover them, don’t select all tables, as ApexSQL Recover will create the INSERT script for them too.   The next step offers two options: to create a recovery script that will insert the deleted records back into the Person.EmailAddress table, or to create a new database, create the Person.EmailAddress table in it, and insert the deleted records. I’ll select the first one.   The recovery process is completed and 11 records are found and scripted, as expected.   To see the script, click View script. ApexSQL Recover has its own script editor, where you can review, modify, and execute the recovery script. The insert into statements look like: INSERT INTO Person.EmailAddress( BusinessEntityID, EmailAddressID, EmailAddress, rowguid, ModifiedDate) VALUES( 70, 70, N'[email protected]' COLLATE SQL_Latin1_General_CP1_CI_AS, 'd62c5b4e-c91f-403f-b630-7b7e0fda70ce', '20030109 00:00:00.000' ); To execute the script, click Execute in the menu.   If you want to check whether the records are really back, execute SELECT * FROM Person.EmailAddress WHERE BusinessEntityID BETWEEN 70 AND 80 As shown, ApexSQL Recover recovers SQL database data after accidental deletes even without the database backup that contains the deleted data and relevant transaction log backups. ApexSQL Recover reads the deleted data from the database data file, so this method can be used even for databases in the Simple recovery model. Besides recovering SQL database records from a DELETE statement, ApexSQL Recover can help when the records are lost due to a DROP TABLE, or TRUNCATE statement, as well as repair a corrupted MDF file that cannot be attached to as SQL Server instance. You can find more information about how to recover SQL database lost data and repair a SQL Server database on ApexSQL Solution center. There are solutions for various situations when data needs to be recovered. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Backup and Restore, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • SQLAuthority News – Pluralsight Course Review – Practices for Software Startups – Part 1 of 2

    - by pinaldave
    This is first part of the two part series of Practices for Software Startup Pluralsight Course. The course is written by Stephen Forte (Blog | Twitter). Stephen Forte is the Chief Strategy Officer of the venture backed company, Telerik, a leading vendor of developer and team productivity tools. Stephen is also a Certified Scrum Master, Certified Scrum Professional, PMP, and also speaks regularly at industry conferences around the world. He has written several books on application and database development.  Stephen is also a board member of the Scrum Alliance. Startups – Everybodies Dream Start-up companies are an important topic right now – everyone wants to start their own business.  It is also important to remember that all companies were a start up at one point – from your corner store to the giants like Microsoft and Apple.  Research proves that not every start-up succeeds, in fact, most will fail before their first year.  There are many reasons for this, and this could be due to the fact that there are many stages to a start-up company, and stumbling at any of these stages can lead to failure.  It is important to understand what makes a start-up company succeed at all its hurdles to become successful.  It is even important to define success.  For most start-ups this would mean becoming their own independently functioning company or to be bought out for a hefty profit by a larger company.  The idea of making a hefty profit by living your dream is extremely important, and you can even think of start-ups as the new craze.  That’s why studying them is so important – they are very popular, but things have changed a lot since their inception. Starting the Startups Beginning a start-up company used to be difficult, but now facilities and information is widely available, and it is much easier.  But that means it is much easier to fail, also.  Previously to start your own company, everything was planned and organized, resources were ensured and backed up before beginning; even the idea of starting your own business was a big thing.  Now anybody can do it, and the steps are simple and outlines everywhere – you can get online software and easily outsource , cloud source, or crowdsource a lot of your material.  But without the type of planning previously required, things can often go badly. New Products – New Ideas – New World There are so many fantastic new products, but they don’t reach success all the time.  I find start-up companies very interesting, and whenever I meet someone who is interested in the subject or already starting their own company, I always ask what they are doing, their plans, goals, market, etc.  I am sorry to say that in most cases, they cannot answer my questions.  It is true that many fantastic ideas fail because of bad decisions.  These bad decisions were not made intentionally, but people were simply unaware of what they should be doing.  This will always lead to failure.  But I am happy to say that all these issues can be gone because Pluralsight is now offering a course all about start-ups by Stephen Forte.  Stephen is a start up leader.  He has successfully started many companies and most are still going strong, or have gone on to even bigger and better things. Beginning Course on Startup I have always thought start-ups are a fascinating subject, and decided to take his course, but it is three hours long.  This would be hard to fit into my busy work day all at once, so I decided to do half of his course before my daughter wakes up, and the other half after she goes to sleep.  The course is divided into six modules, so this would be easy to do.  I began the first chapter early in the morning, at 5 am.  Stephen jumped right into the middle of the subject in the very first module – designing your business plan.  The first question you will have to answer to yourself, to others, and to investors is: What is your product and when will we be able to see it?  So a very important concept is a “minimal viable product.”  This means setting goals for yourself and your product.  We all have large dreams, but your minimal viable product doesn’t have to be your final vision at the very first.  For example: Apple is a giant company, but it is still evolving.  Steve Jobs didn’t envision the iPhone 6 at the very beginning.  He had to start at the first iPhone and do his market research, and the idea evolved into the technology you see now.  So for yourself, you should decide a beginning and stop point.  Do your market research.  Determine who you want to reach, what audience you want for your product.  You can have a great idea that simply will not work in the market, do need, bottlenecks, lack of resources, or competition.  There is a lot of research that needs to be done before you even write a business plan, and Stephen covers it in the very first chapter. The Team – Unique Key to Success After jumping right into the subject in the very first module, I wondered what Stephen could have in store for me for the rest of the course.  Chapter number two is building a team.  Having a team is important regardless of what your startup is.  You can be a true visionary with endless ideas and energy, but one person can still not do everything.  It is important to decide from the very beginning if you will have cofounders, team leaders, and how many employees you’ll need.  Even more important, you’ll need to decide what kind of team you want – what personalities, skills, and type of energy you want each of your employees to bring.  Do you want to have an A+ team with a B- idea, or do you have a B- idea that needs an A+ team to sell it?  Stephen asks all the hard questions!  I was especially impressed by his insight on developing.  You have to decide if you need developers, how many, and what their skills should be. I found this insight extremely useful for everyday usage, not just for start-up companies.  I would apply this kind of information in management at any position.  An amazing team will build an amazing product – and that doesn’t matter if you’re a start-up company or a small team working for a much larger business. Customer Development – The Ultimate Obective Chapter three was about customer development. According to Stephen, there are four different steps to develop a customer base.  The first question to ask yourself is if you are envisioning a large customer base buying a few products each, or a small, dedicated base that buys a lot of your product – quantity vs. Quality.  He also discusses how to earn, retain, and get more customers.  He also says that each customer should be placed in a different role – some will be like investors, who regularly spend with you and invest their money in your business.  It is then your job to take that investment and turn it into a better product in the future.  You need to deal with their money properly – think of it is as theirs as investors, not yours as profit.  At the end of this module I felt that only Stephen could provide this kind of insight, and then he listed all the resources he took his information from.  I have never seen a group of people so passionate about their customers. It was indeed a long day for me. In tomorrow’s part 2 we will discuss rest of the three module and also will see a quick video of the Practices for Software Startup Pluralsight Course. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Best Practices, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • SQLAuthority News – Memories at Anniversary of SQL Wait Stats Book

    - by pinaldave
    SQL Wait Stats About a year ago, I had very proud moment. I had published my second book SQL Server Wait Stats with me as a primary author. It has been a long journey since then. The book got great response and it was widely accepted in the community. It was first of its kind of book written specifically on Wait Stats and Performance. The book was based on my earlier month long series written on the same subject SQL Server Wait Stats. Today, on the anniversary of the book, lots of things come to my mind let me share a few here. Idea behind Blog Series A very common question I often receive is why I wrote a 30 day series on Wait Stats. There were two reasons for it. 1) I have been working with SQL Server for a long time and have troubleshoot more than hundreds of SQL Server which are related to performance tuning. It was a great experience and it taught me a lot of new things. I always documented my experience. After a while I found that I was able to completely rely on my own notes when I was troubleshooting any servers. It is right then I decided to document my experience for the community. 2) While working with wait stats there were a few things, which I thought I knew it well as they were working. However, there was always a fear in the back of mind that what happens if what I believed was incorrect and I was on the wrong path all the time. There was only one way to get it validated. Put it out in front community with my understanding and request further help to improve my understanding. It worked, it worked beautifully. I received plenty of conversations, emails and comments. I refined my content based on various conversations and make it more relevant and near accurate. I guess above two are the major reasons for beginning my journey on writing Wait Stats blog series. Idea behind Book After writing a blog series there was a good amount of request I keep on receiving that I should convert it to eBook or proper book as reading blog posts is great but it goes not give a comprehensive understanding of the subject. The very common feedback from users who were beginning the subject that they will prefer to read it in a structured method. After hearing the feedback for more than 4 months, I decided to write a book based on the blog posts. When I envisioned book, I wanted to make sure this book addresses the wait stats concepts from the fundamentals and fill the gaps of blogs I wrote earlier. Rick Morelan and Joes 2 Pros Team I must acknowledge my co-author Rick Morelan for his unconditional support in writing this book. I had already authored one book before I published this book. The experience to write the book was out of the world. Writing blog posts are much much easier than writing books. The efforts it takes to write a book is 100 times more even though the content is ready. I could have not done it myself if there was not tremendous support of my co-author and editor’s team. We spend days and days researching and discussing various concepts covered in the book. When we were in doubt we reached out to experts as well did a practical reproduction of the scenarios to validate the concepts and claims. After continuous 3 months of hard work we were able to get this book out in the community. September 1st – the lucky day Well, we had to select any day to publish the books. When book was completed in August last week we felt very glad. We all had worked hard and having a sample draft book in hand was feeling like having a newborn baby in our hand. Every time my books are published I feel the same joy which I had when my daughter was born. The feeling of holding a new book in hand is the (almost) same feeling as holding newborn baby. I am excited. For me September 1st has been the luckiest day in mind life. My daughter Shaivi was born on September 1st. Since then every September first has been excellent day and have taken me to the next step in life. I believe anything and everything I do on September 1st it is turning out to be successful and blessed. Rick and I had finished a book in the last week of August. We sent it to the publisher (printer) and asked him to take the book live as soon as possible. We did not decide on any date as we wanted the book to get out as fast as it can. Interesting enough, the publisher/printer selected September 1st for publishing the book. He published the book on 1st September and I knew it at the same time that this book will go next level. Book Model – The Most Beautiful Girl We were done with book. We had no budget left for marketing. Rick and I had a long conversation regarding how to spread the words for the book so it can reach to many people. While we were talking about marketing Rick come up with the idea that we should hire a most beautiful girl around who acknowledge our book and genuinely care for book. It was a difficult task and Rick asked me to find a more beautiful girl. I am a father and the most beautiful girl for me my daughter. This was not a difficult task for me. Rick had given me task to find the most beautiful girl and I just could not think of anyone else than my own daughter. I still do not know what Rick thought about this idea but I had already made up my mind. You can see the detailed blog post here. The Fun Experiments Book Signing Event We had lots of fun moments along this book. We have given away more books to people for free than we have sold them actually. We had done book signing events, contests, and just plain give away when we found people can be benefited from this book. There was never an intention to make money and get rich. We just wanted that more and more people know about this new concept and learn from it. Today when I look back to the earnings there is nothing much we have earned if you talk about dollars. However the best reward which we have received is the satisfaction and love of community. The amount of emails, conversations we have so far received for this book is over thousands. We had fun writing this book, it was indeed a very satisfying journey. I have earned lots of friends while learning and exploring. Availability The book is one year old but still very relevant when it is about performance tuning. It is available at various online book stores. If you have read the book, do let me know what you think of it. Amazon | Kindle | Flipkart | Indiaplaza Reference:  Pinal Dave (http://blog.SQLAuthority.com) Filed under: About Me, Joes 2 Pros, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQLAuthority, SQLAuthority Book Review, T SQL, Technology

    Read the article

  • SQLAuthority Book Review – DBA Survivor: Become a Rock Star DBA

    - by pinaldave
    DBA Survivor: Become a Rock Star DBA – Thomas LaRock Link to Amazon Link to Flipkart First of all, I thank all my readers when I wrote that I could not get this book in any local book stores, because they offered me to send a copy of this good book. A very special mention goes to Sripada and Jayesh for they gave so much effort in finding my home address and sending me the hard copy. Before, I did not have the copy of the book, but now I have two of it already! It surprises me how my readers were able to find my home address, which I have not publicly shared. Quick Review: This is indeed a one easy-to-read and fun book. We all work day and night with technology yet we should not forget to show our love and care for our family at home. For our souls that starve for peace and guidance, this one book is the “it” book for all the technology enthusiasts. Though this book was specifically written for DBAs, the reach is not limited to DBAs only because the lessons incorporated in it actually applies to all. This is one of the most motivating technical books I have read. Detailed Review: Let us go over a few questions first: Who wants to be as famous as rockstars in the field of Database Administration? How can one learn what it takes to become a top notch software developer? If you are a beginner in your field, how will you go to next level? Your boss may be very kind or like Dilbert’s Boss, what will you do? How do you keep growing when Eco-system around you does not support you? You are almost at top but there is someone else at the TOP, what do you do and how do you avoid office politics? As a database developer what should be your basic responsibility? and many more… I was able to completely read book in one sitting and I loved it. Before I continue with my opinion, I want to echo the opinion of Kevin Kline who has written the Forward of the book. He has truly suggested that “You hold in your hands a collection of insights and wisdom on the topic of database administration gained through many years of hard-won experience, long nights of study, and direct mentorship under some of the industry’s most talented database professionals and information technology (IT) experts.” Today, IT field is getting bigger and better, while talking about terabytes of the database becomes “more” normal every single day. The gods and demigods of database professionals are taking care of these large scale databases and are carefully maintaining them. In this world, there are only a few beginnings on the first step. There are many experts in different technology fields who are asked to address the issues with databases. There is YOU and ME, who is just new to this work. So we ask ourselves WHERE to begin and HOW to begin. We adore and follow the religion of our rockstars, but oftentimes we really have no idea about their background and their struggles. Every rockstar has his success story which needs to be digested before learning his tricks and tips. This book starts with the same note and teaches the two most important lessons for anybody who wants to be a DBA Rockstar –  to focus on their single goal of learning and to excel the technology. The story starts with three simple guidelines – Get Prepared, Get Trained, Get Certified. Once a person learns the skills, and then, it would be about time that he needs to enrich or to improve those skills you have learned. I am sure that the right opportunity will come finding themselves and they will not have to go run behind it. However, the real challenge for any person is the first day or first week. A new employee, no matter how much experienced he is, sometimes has no clue about what should one do at new job. Chapter 2 and chapter 3 precisely talk about what one should do as soon as the new job begins. It is also written with keeping the fact in focus that each job can be very much different but there are few infrastructure setups and programming concepts are the same. Learning basics of database was really interesting. I like to focus on the roots of any technology. It is important to understand the structure of the database before suggesting what indexes needs to be created, the same way this book covers the most essential knowledge one must learn by most database developers. I think the title of the fourth chapter is my favorite sentence in this book. I can see that I will be saying this again and again in the future – “A Development Server Is a Production Server to a Developer“. I have worked in the software industry for almost 8 years now and I have seen so many developers sitting on their chairs and waiting for instructions from their lead about how to improve the code or what to do the next. When I talk to them, I suggest that the experiment with their server and try various techniques. I think they all should understand that for them, a development server is their production server and needs to pay proper attention to the code from the beginning. There should be NO any inappropriate code from the beginning. One has to fully focus and give their best, if they are not sure they should ask but should do something and stay active. Chapter 5 and 6 talks about two essential skills for any developer and database administration – what are the ethics of developers when they are working with production server and how to support software which is running on the production server. I have met many people who know the theory by heart but when put in front of keyboard they do not know where to start. The first thing they do opening the browser and searching online, instead of opening SQL Server Management Studio. This can very well happen to anybody who is experienced as well. Chapter 5 and 6 addresses that situation as well includes the handy scripts which can solve almost all the basic trouble shooting issues. “Where’s the Buffet?” By far, this is the best chapter in this book. If you have ever met me, you would know that I love food. I think after reading this chapter, I felt Thomas has written this just keeping me in mind. I think there will be many other people who feel the same way, too. Even my wife who read this chapter thought this was specifically written for me. I will not talk any more about this chapter as this is one must read chapter. And of course this is about real ‘FOOD‘. I am an SQL Server Trainer and Consultant and I totally agree with the point made in the chapter 8 of this book. Yes, it says here that what is necessary to train employees and people. Millions of dollars worth the labor is continuously done in the world which has faults and incorrect. Once something goes wrong, very expensive consultant comes in and fixes the problem. This whole cycle which can be stopped and improved if proper training is done. There is plenty of free trainings available as well, if one cannot afford paid training. “Connect. Learn. Share” – I think this is a great summary and bird’s eye view of this book. Networking is the key. Everything which is discussed in this book can be taken to next level if one properly uses this tips and continuously grow with it. Connecting with others, helping learn each other and building the good knowledge sharing environment should be the goal of everyone. Before I end the review I want to share a real experience. I have personally met one DBA who has worked in a single department in a company for so long that when he was put in a different department in his company due to closing that department, he could not adjust and quit the job despite the same people and company around him. Adjusting in the new environment gets much tougher as one person gets more and more experienced. This book precisely addresses the same issue along with their solutions. I just cannot stop comparing the book with my personal journey. I found so many things which are coincidently in the book is written as how we developer and DBA think. I must express special thanks to Thomas for taking time in his personal life and write this book for us. This book is indeed a book for everybody who wants to grow healthy in the tough and competitive environment. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQLAuthority Book Review, SQLAuthority News, SQLServer, T SQL, Technology

    Read the article

  • SQLAuthority News – Pluralsight Course Review – Practices for Software Startups – Part 2 of 2

    - by pinaldave
    This is the second part of the two part series of Practices for Software Startup Pluralsight Course. Please read the first part of this series over here. The course is written by Stephen Forte (Blog | Twitter). Stephen Forte is the Chief Strategy Officer of the venture backed company, Telerik. Personal Learning Schedule After these three sessions it was 6:30 am and time to do my own blog.  But for the rest of the day, I kept thinking about the course, and wanted to go back and finish.  I was wishing that I had woken up at 3 am so I could finish all at one go.  All day long I was digesting what I had learned.  At 10 pm, after my daughter had gone to bed, I sighed on again.  I was not disappointed by the long wait.  As I mentioned before, Stephen has started four to six companies, and all of them are very successful today. Here is the video I promised yesterday – it discusses the importance of Right Sizing Your Startup. The Heartbeat of Startup – Technology Stephen has combined all technology knowledge into one 30 minute session.  He discussed  how to start your project, how to deal with opinions, and how to deal with multiple ideas – every start up has multiple directions it can go. He spent a lot of time emphasized deciding which direction to go and how to decide which will be the best for you.  He called it a continuous development cycle. One of the biggest hazards for a start-up company is one person deciding the direction the company will go, until down the road another team member announces that there is a glitch in their part of the work and that everyone will have to start over.  Even though a team of two or five people can move quickly, often the decision has gone too long and cannot be easily fixed.   Stephen used an example from his own life:  he was biased for one type of technology, and his teammate for another.  In the end they opted for his teammate’s  choice , and in the end it was a good decision, even though he was unfamiliar with that particular program.  He argues that technology should not be a barrier to progress, that you cannot rely on your experience only.  This really spoke to me because I am a big fan of SQL, but I know there is more out there, and I should be more open to it.  I give my thanks to Stephen, I learned something in this module besides startups. Money, Success and Epic Win! The longest, but most interesting, the module was funding your start-up.  You need to fund the start-up right at the very beginning, if not done right you will run into trouble.  The good news is that a few years ago start-ups required a lot more money – think millions of dollars – but now start-ups can get off the ground for thousands.  Stephen used an example of a company that years ago would have needed a million dollars, but today could be started for $600.  It is true that things have changed, but you still need money.  For $600 you can start small and add dynamically, as needed.  But the truth is that if you have $600, $6000, or $6 million, it will be spent.  Don’t think of it as trying to save money, think of it as investing in your future.   You will need money, and you will need to (quickly) decide what you do with the money: shares, stakeholders, investing in a team, hiring a CEO.  This is so important because once you have money and start the company, the company IS your money.  It is your biggest currency – having a percentage of ownership in the company.  Investors will want percentages as repayment for their investment, and they will want a say in the business as well.  You will have to decide how far you will dilute your shares, and how the company will be divided, if at all.  If you don’t plan in advance, you will find that after gaining three or four investors, suddenly you are the minority owner in your own dream.  You need to understand funding carefully.  This single module is worth all the money you would have spent on the whole course alone.  I encourage everyone to listen to this single module even if they don’t watch any of the others.     Press End to Start the Game – Exists! The final module is exit strategies.  You did all this work, dealt with all political and legal issues.  What are you going to get out of it? The answer is simple: money.  Maybe you want your company to be bought out, for you talent to bring you a profit.  You can sell the company to someone and still head it.  Many options are available.  You could sell and still work as an employee but no longer own the company.  There are many exit strategies.  This is where all your hard work comes into play.  It is important not to feel fooled at any step.  There are so many good ideas that end up in the garbage because of poor planning, so that if you find yourself successful, you don’t want to blow it at this step!  The exit is important.  I thought that this aspect of the course was completely unique, and I loved Stephen’s point of view.  I was lost deep in thought after this module ended.  I actually took two hours worth of notes on this section alone – and it was only a three hour course.  I am planning on attending this course one more time next week, just to catch up on all the small bits of wisdom I’m sure I missed. Thank you Stephen for bringing your real world experience with us!  I recommend that everyone attends this course, even if they don’t want to begin their own start-up company. It was indeed a long day for me. Do not forget to read part 1 of this story and attend course Practices for Software Startup Pluralsight Course. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Best Practices, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Introduction – Day 1 of 31

    - by pinaldave
    List of all the Interview Questions and Answers Series blogs Posts covering interview questions and answers always make for interesting reading.  Some people like the subject for their helpful hints and thought provoking subject, and others dislike these posts because they feel it is nothing more than cheating.  I’d like to discuss the pros and cons of a Question and Answer format here. Interview Questions and Answers are Helpful Just like blog posts, books, and articles, interview Question and Answer discussions are learning material.  The popular Dummy’s books or Idiots Guides are not only for “dummies,” but can help everyone relearn the fundamentals.  Question and Answer discussions can serve the same purpose.  You could call this SQL Server Fundamentals or SQL Server 101. I have administrated hundreds of interviews during my career and I have noticed that sometimes an interviewee with several years of experience lacks an understanding of the fundamentals.  These individuals have been in the industry for so long, usually working on a very specific project, that the ABCs of the business have slipped their mind. Or, when a college graduate is looking to get into the industry, he is not expected to have experience since he is just graduated. However, the new grad is expected to have an understanding of fundamentals and theory.  Sometimes after the stress of final exams and graduation, it can be difficult to remember the correct answers to interview questions, though. An interview Question and Answer discussion can be very helpful to both these individuals.  It is simply a way to go back over the building blocks of a topic.  Many times a simple review like this will help “jog” your memory, and all those previously-memorized facts will come flooding back to you.  It is not a way to re-learn a topic, but a way to remind yourself of what you already know. A Question and Answer discussion can also be a way to go over old topics in a more interesting manner.  Especially if you have been working in the industry, or taking lots of classes on the topic, everything you read can sound like a repeat of what you already know.  Going over a topic in a new format can make the material seem fresh and interesting.  And an interested mind will be more engaged and remember more in the end. Interview Questions and Answers are Harmful A common argument against a Question and Answer discussion is that it will give someone a “cheat sheet.” A new guy with relatively little experience can read the interview questions and answers, and then memorize them. When an interviewer asks him the same questions, he will repeat the answers and get the job. Honestly, is he good hire because he memorized the interview questions? Wouldn’t it be better for the interviewer to hire someone with actual experience?  The answer is not as easy as it seems – there are many different factors to be considered. If the interviewer is asking fundamentals-related questions only, he gets the answers he wants to hear, and then hires this first candidate – there is a good chance that he is hiring based on personality rather than experience.  If the interviewer is smart he will ask deeper questions, have more than one person on the interview team, and interview a variety of candidates.  If one interviewee happens to memorize some answers, it usually doesn’t mean he will automatically get the job at the expense of more qualified candidates. Another argument against interview Question and Answers is that it will give candidates a false sense of confidence, and that they will appear more qualified than they are. Well, if that is true, it will not last after the first interview when the candidate is asked difficult questions and he cannot find the answers in the list of interview Questions and Answers.  Besides, confidence is one of the best things to walk into an interview with! In today’s competitive job market, there are often hundreds of candidates applying for the same position.  With so many applicants to choose from, interviewers must make decisions about who to call back and who to hire based on their gut feeling.  One drawback to reading an interview Question and Answer article is that you might sound very boring in your interview – saying the same thing as every single candidate, and parroting answers that sound like someone else wrote them for you – because they did.  However, it is definitely better to go to an interview prepared, just make sure that you give a lot of thought to your answers to make them sound like your own voice.  Remember that you will be hired based on your skills as well as your personality, so don’t think that having all the right answers will make get you hired.  A good interviewee will be prepared, confident, and know how to stand out. My Opinion A list of interview Questions and Answers is really helpful as a refresher or for beginners. To really ace an interview, one needs to have real-world, hands-on experience with SQL Server as well. Interview questions just serve as a starter or easy read for experienced professionals. When I have to learn new technology, I often search online for interview questions and get an idea about the breadth and depth of the technology. Next Action I am going to write about interview Questions and Answers for next 30 days. I have previously written a series of interview questions and answers; now I have re-written them keeping the latest version of SQL Server and current industry progress in mind. If you have faced interesting interview questions or situations, please write to me and I will publish them as a guest post. If you want me to add few more details, leave a comment and I will make sure that I do my best to accommodate. Tomorrow we will start the interview Questions and Answers series, with a few interesting stories, best practices and guest posts. We will have a prize give-away and other awards when the series ends. List of all the Interview Questions and Answers Series blogs Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Interview Questions and Answers, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • Big Data – Is Big Data Relevant to me? – Big Data Questionnaires – Guest Post by Vinod Kumar

    - by Pinal Dave
    This guest post is by Vinod Kumar. Vinod Kumar has worked with SQL Server extensively since joining the industry over a decade ago. Working on various versions of SQL Server 7.0, Oracle 7.3 and other database technologies – he now works with the Microsoft Technology Center (MTC) as a Technology Architect. Let us read the blog post in Vinod’s own voice. I think the series from Pinal is a good one for anyone planning to start on Big Data journey from the basics. In my daily customer interactions this buzz of “Big Data” always comes up, I react generally saying – “Sir, do you really have a ‘Big Data’ problem or do you have a big Data problem?” Generally, there is a silence in the air when I ask this question. Data is everywhere in organizations – be it big data, small data, all data and for few it is bad data which is same as no data :). Wow, don’t discount me as someone who opposes “Big Data”, I am a big supporter as much as I am a critic of the abuse of this term by the people. In this post, I wanted to let my mind flow so that you can also think in the direction I want you to see these concepts. In any case, this is not an exhaustive dump of what is in my mind – but you will surely get the drift how I am going to question Big Data terms from customers!!! Is Big Data Relevant to me? Many of my customers talk to me like blank whiteboard with no idea – “why Big Data”. They want to jump into the bandwagon of technology and they want to decipher insights from their unexplored data a.k.a. unstructured data with structured data. So what are these industry scenario’s that come to mind? Here are some of them: Financials Fraud detection: Banks and Credit cards are monitoring your spending habits on real-time basis. Customer Segmentation: applies in every industry from Banking to Retail to Aviation to Utility and others where they deal with end customer who consume their products and services. Customer Sentiment Analysis: Responding to negative brand perception on social or amplify the positive perception. Sales and Marketing Campaign: Understand the impact and get closer to customer delight. Call Center Analysis: attempt to take unstructured voice recordings and analyze them for content and sentiment. Medical Reduce Re-admissions: How to build a proactive follow-up engagements with patients. Patient Monitoring: How to track Inpatient, Out-Patient, Emergency Visits, Intensive Care Units etc. Preventive Care: Disease identification and Risk stratification is a very crucial business function for medical. Claims fraud detection: There is no precise dollars that one can put here, but this is a big thing for the medical field. Retail Customer Sentiment Analysis, Customer Care Centers, Campaign Management. Supply Chain Analysis: Every sensors and RFID data can be tracked for warehouse space optimization. Location based marketing: Based on where a check-in happens retail stores can be optimize their marketing. Telecom Price optimization and Plans, Finding Customer churn, Customer loyalty programs Call Detail Record (CDR) Analysis, Network optimizations, User Location analysis Customer Behavior Analysis Insurance Fraud Detection & Analysis, Pricing based on customer Sentiment Analysis, Loyalty Management Agents Analysis, Customer Value Management This list can go on to other areas like Utility, Manufacturing, Travel, ITES etc. So as you can see, there are obviously interesting use cases for each of these industry verticals. These are just representative list. Where to start? A lot of times I try to quiz customers on a number of dimensions before starting a Big Data conversation. Are you getting the data you need the way you want it and in a timely manner? Can you get in and analyze the data you need? How quickly is IT to respond to your BI Requests? How easily can you get at the data that you need to run your business/department/project? How are you currently measuring your business? Can you get the data you need to react WITHIN THE QUARTER to impact behaviors to meet your numbers or is it always “rear-view mirror?” How are you measuring: The Brand Customer Sentiment Your Competition Your Pricing Your performance Supply Chain Efficiencies Predictive product / service positioning What are your key challenges of driving collaboration across your global business?  What the challenges in innovation? What challenges are you facing in getting more information out of your data? Note: Garbage-in is Garbage-out. Hold good for all reporting / analytics requirements Big Data POCs? A number of customers get into the realm of setting a small team to work on Big Data – well it is a great start from an understanding point of view, but I tend to ask a number of other questions to such customers. Some of these common questions are: To what degree is your advanced analytics (natural language processing, sentiment analysis, predictive analytics and classification) paired with your Big Data’s efforts? Do you have dedicated resources exploring the possibilities of advanced analytics in Big Data for your business line? Do you plan to employ machine learning technology while doing Advanced Analytics? How is Social Media being monitored in your organization? What is your ability to scale in terms of storage and processing power? Do you have a system in place to sort incoming data in near real time by potential value, data quality, and use frequency? Do you use event-driven architecture to manage incoming data? Do you have specialized data services that can accommodate different formats, security, and the management requirements of multiple data sources? Is your organization currently using or considering in-memory analytics? To what degree are you able to correlate data from your Big Data infrastructure with that from your enterprise data warehouse? Have you extended the role of Data Stewards to include ownership of big data components? Do you prioritize data quality based on the source system (that is Facebook/Twitter data has lower quality thresholds than radio frequency identification (RFID) for a tracking system)? Do your retention policies consider the different legal responsibilities for storing Big Data for a specific amount of time? Do Data Scientists work in close collaboration with Data Stewards to ensure data quality? How is access to attributes of Big Data being given out in the organization? Are roles related to Big Data (Advanced Analyst, Data Scientist) clearly defined? How involved is risk management in the Big Data governance process? Is there a set of documented policies regarding Big Data governance? Is there an enforcement mechanism or approach to ensure that policies are followed? Who is the key sponsor for your Big Data governance program? (The CIO is best) Do you have defined policies surrounding the use of social media data for potential employees and customers, as well as the use of customer Geo-location data? How accessible are complex analytic routines to your user base? What is the level of involvement with outside vendors and third parties in regard to the planning and execution of Big Data projects? What programming technologies are utilized by your data warehouse/BI staff when working with Big Data? These are some of the important questions I ask each customer who is actively evaluating Big Data trends for their organizations. These questions give you a sense of direction where to start, what to use, how to secure, how to analyze and more. Sign off Any Big data is analysis is incomplete without a compelling story. The best way to understand this is to watch Hans Rosling – Gapminder (2:17 to 6:06) videos about the third world myths. Don’t get overwhelmed with the Big Data buzz word, the destination to what your data speaks is important. In this blog post, we did not particularly look at any Big Data technologies. This is a set of questionnaire one needs to keep in mind as they embark their journey of Big Data. I did write some of the basics in my blog: Big Data – Big Hype yet Big Opportunity. Do let me know if these questions make sense?  Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • SQL SERVER – 5 Tips for Improving Your Data with expressor Studio

    - by pinaldave
    It’s no secret that bad data leads to bad decisions and poor results.  However, how do you prevent dirty data from taking up residency in your data store?  Some might argue that it’s the responsibility of the person sending you the data.  While that may be true, in practice that will rarely hold up.  It doesn’t matter how many times you ask, you will get the data however they decide to provide it. So now you have bad data.  What constitutes bad data?  There are quite a few valid answers, for example: Invalid date values Inappropriate characters Wrong data Values that exceed a pre-set threshold While it is certainly possible to write your own scripts and custom SQL to identify and deal with these data anomalies, that effort often takes too long and becomes difficult to maintain.  Instead, leveraging an ETL tool like expressor Studio makes the data cleansing process much easier and faster.  Below are some tips for leveraging expressor to get your data into tip-top shape. Tip 1:     Build reusable data objects with embedded cleansing rules One of the new features in expressor Studio 3.2 is the ability to define constraints at the metadata level.  Using expressor’s concept of Semantic Types, you can define reusable data objects that have embedded logic such as constraints for dealing with dirty data.  Once defined, they can be saved as a shared atomic type and then re-applied to other data attributes in other schemas. As you can see in the figure above, I’ve defined a constraint on zip code.  I can then save the constraint rules I defined for zip code as a shared atomic type called zip_type for example.   The next time I get a different data source with a schema that also contains a zip code field, I can simply apply the shared atomic type (shown below) and the previously defined constraints will be automatically applied. Tip 2:     Unlock the power of regular expressions in Semantic Types Another powerful feature introduced in expressor Studio 3.2 is the option to use regular expressions as a constraint.   A regular expression is used to identify patterns within data.   The patterns could be something as simple as a date format or something much more complex such as a street address.  For example, I could define that a valid IP address should be made up of 4 numbers, each 0 to 255, and separated by a period.  So 192.168.23.123 might be a valid IP address whereas 888.777.0.123 would not be.   How can I account for this using regular expressions? A very simple regular expression that would look for any 4 sets of 3 digits separated by a period would be:  ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ Alternatively, the following would be the exact check for truly valid IP addresses as we had defined above:  ^(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])$ .  In expressor, we would enter this regular expression as a constraint like this: Here we select the corrective action to be ‘Escalate’, meaning that the expressor Dataflow operator will decide what to do.  Some of the options include rejecting the offending record, skipping it, or aborting the dataflow. Tip 3:     Email pattern expressions that might come in handy In the example schema that I am using, there’s a field for email.  Email addresses are often entered incorrectly because people are trying to avoid spam.  While there are a lot of different ways to define what constitutes a valid email address, a quick search online yields a couple of really useful regular expressions for validating email addresses: This one is short and sweet:  \b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b (Source: http://www.regular-expressions.info/) This one is more specific about which characters are allowed:  ^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$ (Source: http://regexlib.com/REDetails.aspx?regexp_id=26 ) Tip 4:     Reject “dirty data” for analysis or further processing Yet another feature introduced in expressor Studio 3.2 is the ability to reject records based on constraint violations.  To capture reject records on input, simply specify Reject Record in the Error Handling setting for the Read File operator.  Then attach a Write File operator to the reject port of the Read File operator as such: Next, in the Write File operator, you can configure the expressor operator in a similar way to the Read File.  The key difference would be that the schema needs to be derived from the upstream operator as shown below: Once configured, expressor will output rejected records to the file you specified.  In addition to the rejected records, expressor also captures some diagnostic information that will be helpful towards identifying why the record was rejected.  This makes diagnosing errors much easier! Tip 5:    Use a Filter or Transform after the initial cleansing to finish the job Sometimes you may want to predicate the data cleansing on a more complex set of conditions.  For example, I may only be interested in processing data containing males over the age of 25 in certain zip codes.  Using an expressor Filter operator, you can define the conditional logic which isolates the records of importance away from the others. Alternatively, the expressor Transform operator can be used to alter the input value via a user defined algorithm or transformation.  It also supports the use of conditional logic and data can be rejected based on constraint violations. However, the best tip I can leave you with is to not constrain your solution design approach – expressor operators can be combined in many different ways to achieve the desired results.  For example, in the expressor Dataflow below, I can post-process the reject data from the Filter which did not meet my pre-defined criteria and, if successful, Funnel it back into the flow so that it gets written to the target table. I continue to be impressed that expressor offers all this functionality as part of their FREE expressor Studio desktop ETL tool, which you can download from here.  Their Studio ETL tool is absolutely free and they are very open about saying that if you want to deploy their software on a dedicated Windows Server, you need to purchase their server software, whose pricing is posted on their website. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • SQL SERVER – Shrinking Database is Bad – Increases Fragmentation – Reduces Performance

    - by pinaldave
    Earlier, I had written two articles related to Shrinking Database. I wrote about why Shrinking Database is not good. SQL SERVER – SHRINKDATABASE For Every Database in the SQL Server SQL SERVER – What the Business Says Is Not What the Business Wants I received many comments on Why Database Shrinking is bad. Today we will go over a very interesting example that I have created for the same. Here are the quick steps of the example. Create a test database Create two tables and populate with data Check the size of both the tables Size of database is very low Check the Fragmentation of one table Fragmentation will be very low Truncate another table Check the size of the table Check the fragmentation of the one table Fragmentation will be very low SHRINK Database Check the size of the table Check the fragmentation of the one table Fragmentation will be very HIGH REBUILD index on one table Check the size of the table Size of database is very HIGH Check the fragmentation of the one table Fragmentation will be very low Here is the script for the same. USE MASTER GO CREATE DATABASE ShrinkIsBed GO USE ShrinkIsBed GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Create FirstTable CREATE TABLE FirstTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_FirstTable_ID] ON FirstTable ( [ID] ASC ) ON [PRIMARY] GO -- Create SecondTable CREATE TABLE SecondTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_SecondTable_ID] ON SecondTable ( [ID] ASC ) ON [PRIMARY] GO -- Insert One Hundred Thousand Records INSERT INTO FirstTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Insert One Hundred Thousand Records INSERT INTO SecondTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO Let us check the table size and fragmentation. Now let us TRUNCATE the table and check the size and Fragmentation. USE MASTER GO CREATE DATABASE ShrinkIsBed GO USE ShrinkIsBed GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Create FirstTable CREATE TABLE FirstTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_FirstTable_ID] ON FirstTable ( [ID] ASC ) ON [PRIMARY] GO -- Create SecondTable CREATE TABLE SecondTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_SecondTable_ID] ON SecondTable ( [ID] ASC ) ON [PRIMARY] GO -- Insert One Hundred Thousand Records INSERT INTO FirstTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Insert One Hundred Thousand Records INSERT INTO SecondTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO You can clearly see that after TRUNCATE, the size of the database is not reduced and it is still the same as before TRUNCATE operation. After the Shrinking database operation, we were able to reduce the size of the database. If you notice the fragmentation, it is considerably high. The major problem with the Shrink operation is that it increases fragmentation of the database to very high value. Higher fragmentation reduces the performance of the database as reading from that particular table becomes very expensive. One of the ways to reduce the fragmentation is to rebuild index on the database. Let us rebuild the index and observe fragmentation and database size. -- Rebuild Index on FirstTable ALTER INDEX IX_SecondTable_ID ON SecondTable REBUILD GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO You can notice that after rebuilding, Fragmentation reduces to a very low value (almost same to original value); however the database size increases way higher than the original. Before rebuilding, the size of the database was 5 MB, and after rebuilding, it is around 20 MB. Regular rebuilding the index is rebuild in the same user database where the index is placed. This usually increases the size of the database. Look at irony of the Shrinking database. One person shrinks the database to gain space (thinking it will help performance), which leads to increase in fragmentation (reducing performance). To reduce the fragmentation, one rebuilds index, which leads to size of the database to increase way more than the original size of the database (before shrinking). Well, by Shrinking, one did not gain what he was looking for usually. Rebuild indexing is not the best suggestion as that will create database grow again. I have always remembered the excellent post from Paul Randal regarding Shrinking the database is bad. I suggest every one to read that for accuracy and interesting conversation. Let us run following script where we Shrink the database and REORGANIZE. -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO -- Shrink the Database DBCC SHRINKDATABASE (ShrinkIsBed); GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO -- Rebuild Index on FirstTable ALTER INDEX IX_SecondTable_ID ON SecondTable REORGANIZE GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO You can see that REORGANIZE does not increase the size of the database or remove the fragmentation. Again, I no way suggest that REORGANIZE is the solution over here. This is purely observation using demo. Read the blog post of Paul Randal. Following script will clean up the database -- Clean up USE MASTER GO ALTER DATABASE ShrinkIsBed SET SINGLE_USER WITH ROLLBACK IMMEDIATE GO DROP DATABASE ShrinkIsBed GO There are few valid cases of the Shrinking database as well, but that is not covered in this blog post. We will cover that area some other time in future. Additionally, one can rebuild index in the tempdb as well, and we will also talk about the same in future. Brent has written a good summary blog post as well. Are you Shrinking your database? Well, when are you going to stop Shrinking it? Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Index, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQLServer, T SQL, Technology

    Read the article

< Previous Page | 89 90 91 92 93 94 95 96 97 98 99 100  | Next Page >