Search Results

Search found 87891 results on 3516 pages for 'server migration'.

Page 617/3516 | < Previous Page | 613 614 615 616 617 618 619 620 621 622 623 624 | Next Page >

Big Data – Operational Databases Supporting Big Data – Key-Value Pair Databases and Document Databases – Day 13 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the Relational Database and NoSQL database in the Big Data Story. In this article we will understand the role of Key-Value Pair Databases and Document Databases Supporting Big Data Story. Now we will see a few of the examples of the operational databases. Relational Databases (Yesterday’s post) NoSQL Databases (Yesterday’s post) Key-Value Pair Databases (This post) Document Databases (This post) Columnar Databases (Tomorrow’s post) Graph Databases (Tomorrow’s post) Spatial Databases (Tomorrow’s post) Key Value Pair Databases Key Value Pair Databases are also known as KVP databases. A key is a field name and attribute, an identifier. The content of that field is its value, the data that is being identified and stored. They have a very simple implementation of NoSQL database concepts. They do not have schema hence they are very flexible as well as scalable. The disadvantages of Key Value Pair (KVP) database are that they do not follow ACID (Atomicity, Consistency, Isolation, Durability) properties. Additionally, it will require data architects to plan for data placement, replication as well as high availability. In KVP databases the data is stored as strings. Here is a simple example of how Key Value Database will look like: Key Value Name Pinal Dave Color Blue Twitter @pinaldave Name Nupur Dave Movie The Hero As the number of users grow in Key Value Pair databases it starts getting difficult to manage the entire database. As there is no specific schema or rules associated with the database, there are chances that database grows exponentially as well. It is very crucial to select the right Key Value Pair Database which offers an additional set of tools to manage the data and provides finer control over various business aspects of the same. Riak Rick is one of the most popular Key Value Database. It is known for its scalability and performance in high volume and velocity database. Additionally, it implements a mechanism for collection key and values which further helps to build manageable system. We will further discuss Riak in future blog posts. Key Value Databases are a good choice for social media, communities, caching layers for connecting other databases. In simpler words, whenever we required flexibility of the data storage keeping scalability in mind – KVP databases are good options to consider. Document Database There are two different kinds of document databases. 1) Full document Content (web pages, word docs etc) and 2) Storing Document Components for storage. The second types of the document database we are talking about over here. They use Javascript Object Notation (JSON) and Binary JSON for the structure of the documents. JSON is very easy to understand language and it is very easy to write for applications. There are two major structures of JSON used for Document Database – 1) Name Value Pairs and 2) Ordered List. MongoDB and CouchDB are two of the most popular Open Source NonRelational Document Database. MongoDB MongoDB databases are called collections. Each collection is build of documents and each document is composed of fields. MongoDB collections can be indexed for optimal performance. MongoDB ecosystem is highly available, supports query services as well as MapReduce. It is often used in high volume content management system. CouchDB CouchDB databases are composed of documents which consists fields and attachments (known as description). It supports ACID properties. The main attraction points of CouchDB are that it will continue to operate even though network connectivity is sketchy. Due to this nature CouchDB prefers local data storage. Document Database is a good choice of the database when users have to generate dynamic reports from elements which are changing very frequently. A good example of document usages is in real time analytics in social networking or content management system. Tomorrow In tomorrow’s blog post we will discuss about various other Operational Databases supporting Big Data. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
Big Data – Basics of Big Data Architecture – Day 4 of 21

- by Pinal Dave

In yesterday’s blog post we understood how Big Data evolution happened. Today we will understand basics of the Big Data Architecture. Big Data Cycle Just like every other database related applications, bit data project have its development cycle. Though three Vs (link) for sure plays an important role in deciding the architecture of the Big Data projects. Just like every other project Big Data project also goes to similar phases of the data capturing, transforming, integrating, analyzing and building actionable reporting on the top of the data. While the process looks almost same but due to the nature of the data the architecture is often totally different. Here are few of the question which everyone should ask before going ahead with Big Data architecture. Questions to Ask How big is your total database? What is your requirement of the reporting in terms of time – real time, semi real time or at frequent interval? How important is the data availability and what is the plan for disaster recovery? What are the plans for network and physical security of the data? What platform will be the driving force behind data and what are different service level agreements for the infrastructure? This are just basic questions but based on your application and business need you should come up with the custom list of the question to ask. As I mentioned earlier this question may look quite simple but the answer will not be simple. When we are talking about Big Data implementation there are many other important aspects which we have to consider when we decide to go for the architecture. Building Blocks of Big Data Architecture It is absolutely impossible to discuss and nail down the most optimal architecture for any Big Data Solution in a single blog post, however, we can discuss the basic building blocks of big data architecture. Here is the image which I have built to explain how the building blocks of the Big Data architecture works. Above image gives good overview of how in Big Data Architecture various components are associated with each other. In Big Data various different data sources are part of the architecture hence extract, transform and integration are one of the most essential layers of the architecture. Most of the data is stored in relational as well as non relational data marts and data warehousing solutions. As per the business need various data are processed as well converted to proper reports and visualizations for end users. Just like software the hardware is almost the most important part of the Big Data Architecture. In the big data architecture hardware infrastructure is extremely important and failure over instances as well as redundant physical infrastructure is usually implemented. NoSQL in Data Management NoSQL is a very famous buzz word and it really means Not Relational SQL or Not Only SQL. This is because in Big Data Architecture the data is in any format. It can be unstructured, relational or in any other format or from any other data source. To bring all the data together relational technology is not enough, hence new tools, architecture and other algorithms are invented which takes care of all the kind of data. This is collectively called NoSQL. Tomorrow Next four days we will answer the Buzz Words – Hadoop. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
Big Data – Basics of Big Data Analytics – Day 18 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the various components in Big Data Story. In this article we will understand what are the various analytics tasks we try to achieve with the Big Data and the list of the important tools in Big Data Story. When you have plenty of the data around you what is the first thing which comes to your mind? “What do all these data means?” Exactly – the same thought comes to my mind as well. I always wanted to know what all the data means and what meaningful information I can receive out of it. Most of the Big Data projects are built to retrieve various intelligence all this data contains within it. Let us take example of Facebook. When I look at my friends list of Facebook, I always want to ask many questions such as - On which date my maximum friends have a birthday? What is the most favorite film of my most of the friends so I can talk about it and engage them? What is the most liked placed to travel my friends? Which is the most disliked cousin for my friends in India and USA so when they travel, I do not take them there. There are many more questions I can think of. This illustrates that how important it is to have analysis of Big Data. Here are few of the kind of analysis listed which you can use with Big Data. Slicing and Dicing: This means breaking down your data into smaller set and understanding them one set at a time. This also helps to present various information in a variety of different user digestible ways. For example if you have data related to movies, you can use different slide and dice data in various formats like actors, movie length etc. Real Time Monitoring: This is very crucial in social media when there are any events happening and you wanted to measure the impact at the time when the event is happening. For example, if you are using twitter when there is a football match, you can watch what fans are talking about football match on twitter when the event is happening. Anomaly Predication and Modeling: If the business is running normal it is alright but if there are signs of trouble, everyone wants to know them early on the hand. Big Data analysis of various patterns can be very much helpful to predict future. Though it may not be always accurate but certain hints and signals can be very helpful. For example, lots of data can help conclude that if there is lots of rain it can increase the sell of umbrella. Text and Unstructured Data Analysis: unstructured data are now getting norm in the new world and they are a big part of the Big Data revolution. It is very important that we Extract, Transform and Load the unstructured data and make meaningful data out of it. For example, analysis of lots of images, one can predict that people like to use certain colors in certain months in their cloths. Big Data Analytics Solutions There are many different Big Data Analystics Solutions out in the market. It is impossible to list all of them so I will list a few of them over here. Tableau – This has to be one of the most popular visualization tools out in the big data market. SAS – A high performance analytics and infrastructure company IBM and Oracle – They have a range of tools for Big Data Analysis Tomorrow In tomorrow’s blog post we will discuss about very important components of the Big Data Ecosystem – Data Scientist. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
Trying to install Team viewer on Ubuntu 12.04

- by Teknikk

I recently got Ubuntu installed on my server, I wanted to install TeamViewer so i could easy manage the virtual machines, However, I get errors when installing it from App store?, And I also get errors, but more detailed on the terminal. Error output: tek@tek-G53SW:~/Download$ sudo dpkg -i ipts teamviewer_linux_x64.deb dpkg: error processing ipts (--install): cannot access archive: No such file or directory (Reading database ... 142115 files and directories currently installed.) Preparing to replace teamviewer7 7.0.9360 (using teamviewer_linux_x64.deb) ... Unpacking replacement teamviewer7 ... dpkg: dependency problems prevent configuration of teamviewer7: teamviewer7 depends on libc6-i386 (>= 2.7); however: Package libc6-i386 is not installed. teamviewer7 depends on lib32asound2; however: Package lib32asound2 is not installed. teamviewer7 depends on lib32z1; however: Package lib32z1 is not installed. teamviewer7 depends on ia32-libs; however: Package ia32-libs is not installed. dpkg: error processing teamviewer7 (--install): dependency problems - leaving unconfigured Errors were encountered while processing: ipts teamviewer7 I tried to install it manually, but with no luck, I heard some others has this problems. I am running Ubuntu 12.04 x64. Error @ sudo apt-get install libc6-i386 lib32asound2 lib32z1 ia32-libs : tek@tek-G53SW:~/Download$ sudo apt-get install libc6-i386 lib32asound2 lib32z1 ia32-libs Reading package lists... Done Building dependency tree Reading state information... Done You might want to run 'apt-get -f install' to correct these: The following packages have unmet dependencies: ia32-libs : Depends: ia32-libs-multiarch E: Unmet dependencies. Try 'apt-get -f install' with no packages (or specify a solution). tek@tek-G53SW:~/Download$ More errors tek@tek-G53SW:~/Download$ sudo apt-get -f install [sudo] password for tek: Reading package lists... Done Building dependency tree Reading state information... Done Correcting dependencies... Done The following packages will be REMOVED: teamviewer7 0 upgraded, 0 newly installed, 1 to remove and 0 not upgraded. 1 not fully installed or removed. After this operation, 81.9 MB disk space will be freed. Do you want to continue [Y/n]? y (Reading database ... 142441 files and directories currently installed.) Removing teamviewer7 ... tek@tek-G53SW:~/Download$ sudo apt-get install libc6-i386 lib32asound2 lib32z1 ia32-libs Reading package lists... Done Building dependency tree Reading state information... Done lib32z1 is already the newest version. libc6-i386 is already the newest version. lib32asound2 is already the newest version. Some packages could not be installed. This may mean that you have requested an impossible situation or if you are using the unstable distribution that some required packages have not yet been created or been moved out of Incoming. The following information may help to resolve the situation: The following packages have unmet dependencies: ia32-libs : Depends: ia32-libs-multiarch E: Unable to correct problems, you have held broken packages. tek@tek-G53SW:~/Download$ sudo apt-get install ia32-libs-multiarch Reading package lists... Done Building dependency tree Reading state information... Done Some packages could not be installed. This may mean that you have requested an impossible situation or if you are using the unstable distribution that some required packages have not yet been created or been moved out of Incoming. The following information may help to resolve the situation: The following packages have unmet dependencies: ia32-libs-multiarch:i386 : Depends: gstreamer0.10-plugins-good:i386 but it is not going to be installed Depends: gtk2-engines:i386 but it is not going to be installed Depends: gtk2-engines-murrine:i386 but it is not going to be installed Depends: gtk2-engines-pixbuf:i386 but it is not going to be installed Depends: gtk2-engines-oxygen:i386 but it is not going to be installed Depends: ibus-gtk:i386 but it is not going to be installed Depends: libcanberra-gtk-module:i386 but it is not going to be installed Depends: libcups2:i386 but it is not going to be installed Depends: libcupsimage2:i386 but it is not going to be installed Depends: libfontconfig1:i386 but it is not going to be installed Depends: libgail-common:i386 but it is not going to be installed Depends: libgphoto2-2:i386 but it is not going to be installed Depends: libgtk2.0-0:i386 but it is not going to be installed Depends: libnss3:i386 but it is not going to be installed Depends: libqt4-opengl:i386 but it is not going to be installed Depends: libqt4-qt3support:i386 but it is not going to be installed Depends: libqt4-scripttools:i386 but it is not going to be installed Depends: libqt4-svg:i386 but it is not going to be installed Depends: libqtgui4:i386 but it is not going to be installed Depends: libqtwebkit4:i386 but it is not going to be installed Depends: librsvg2-common:i386 but it is not going to be installed Depends: libsane:i386 but it is not going to be installed E: Unable to correct problems, you have held broken packages. tek@tek-G53SW:~/Download$

Read the article
SQL – Business Intelligence: Derive Data or Information?

- by Pinal Dave

We all know the value of information in our lives. Whether it’s a personal decision or a business initiated one, people need it. But the question is: who is to make the distinction between data and information? We all come across a whole lot of data daily, that may be significant or not. We filter what’s required and forget about the rest. Information is filtered and distilled data. Filtering and distillation can also alter its actual meaning and natural state. Therefore, in this blog we discover some ways to ensure that we’re using business intelligence derived from the right information for making critical management decisions. Four key questions managers must ask themselves before making a decision: 1. Am I working with data or information? 2. What is it’s context? 3. How recent is it? 4. How was it derived or what is the source? The first question is probably the most important. You must know what you’re dealing with here. If you see use of adjectives and conclusions drawn, it’s information. Not raw data. You very next concern must be whether this is guised to present a particular viewpoint or perspective. It makes a lot of difference if you take a decision based on someone’s propaganda to distort real facts. Therefore, the context and the intentions of the distillation process must be clear to you. The next consideration is whether data is recent enough to hold any value. Since it has a very short shelf life, you must ensure that its context and value is not lost out of time. The last and the most important consideration is how was it derived in the first place. The observer effect is what calls the shots here. The source can change the context to a great extent if the collection methodology and purpose is not clear. Gathering intelligence for decision making requires users to be keen observers and not take the information provided on its face value alone. These probing questions will allow you to make sure that you’re working with clean and accurate data devoid of any influence or manipulations. Only then can you be sure of deriving true business intelligence for your organization. BI technology is also a great way to ensure accuracy of reports. SQL BI Platform provides advanced tools and techniques for all your BI needs and concerns. Koenig Solutions offers this course along with a host of other Business Intelligence and IT courses on all latest technologies available in the market today. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
Can't get bonding and bridging to work for KVM

- by user9546

Hi everyone. I can't for the life of me get bonding and bridging to work for the KVM setup I'm building. I'm using a fresh install (not an upgrade) of Ubuntu Server 10.10. I have 4 NICs on the same subnet (two intended for each of my two VMs). I'm trying to achieve the setup that Uthark describes here. But following his guidelines didn't work for me. My eth0 and eth1 did not come up, and "brctl show" showed that br0 didn't have any interfaces (the bond). I assumed it didn't work because he's using 10.4, and this article says there's a recent change in bonding: [I can't post more than one hyperlink per post because I'm a newbie.] I had to use this article to get my interfaces to work at all on the same subnet, which is why I have the post-up lines on some of my interfaces: [I can't post more than one hyperlink per post because I'm a newbie.] I installed ifenslave and ethtool. I also created /etc/modprobe.d/aliases.conf with the following content: alias bond0 bonding options bonding mode=6 miimon=100 downdelay=200 updelay=200 And I included "bonding" in /etc/modules So, after several approaches, here is my latest interfaces file: auto lo iface lo inet loopback auto eth5 iface eth5 inet manual auto br5 iface br5 inet static post-up /sbin/ip rule add from [network].79 lookup 10 post-up /sbin/ip route add table 10 default via [network].1 src [network].79 dev br5 address [network].79 netmask 255.255.255.0 network [network].0 broadcast [network].255 gateway [network].1 bridge_ports eth5 bridge_stp off bridge_fd 0 bridge_maxwait 0 auto eth2 iface eth2 inet manual auto br2 iface br2 inet static post-up /sbin/ip rule add from [network].78 lookup 11 post-up /sbin/ip route add table 11 default via [network].1 src [network].78 dev br2 address [network].78 netmask 255.255.255.0 network [network].0 broadcast [network].255 gateway [network].1 bridge_ports eth2 bridge_stp off bridge_fd 0 bridge_maxwait 0 iface eth0 inet manual iface eth1 inet manual auto bond0 iface bond0 inet static bond_miimon 100 bond_mode balance-alb up /sbin/ifenslave bond0 eth0 eth1 down /sbin/ifenslave -d bond0 eth0 eth1 auto br0 iface br0 inet static address [network].60 netmask 255.255.255.0 network [network].0 broadcast [network].255 gateway [network].1 bridge_ports bond0 eth2, eth5, br2, and br5 all seem to be working fine. The only other thing I could find that looked suspicious is an error regarding bonding in /var/log/messages: kernel: [ 3.828684] bonding: Warning: either miimon or arp_interval and arp_ip_target module parameters must be specified, otherwise bonding will not detect link failures! see bonding.txt for details. even though there is a bond-miimon line in /etc/network/interfaces (if that's what they're talking about). Also, the bond seems to go in and out of promiscuous mode several times on boot: Jan 20 14:19:02 kvmhost kernel: [ 3.902378] device bond0 entered promiscuous mode Jan 20 14:19:02 kvmhost kernel: [ 3.902390] device bond0 left promiscuous mode Jan 20 14:19:02 kvmhost kernel: [ 3.902393] device bond0 entered promiscuous mode Jan 20 14:19:02 kvmhost kernel: [ 3.902397] device bond0 left promiscuous mode Jan 20 14:19:03 kvmhost kernel: [ 4.998990] device bond0 entered promiscuous mode Jan 20 14:19:03 kvmhost kernel: [ 4.999005] device bond0 left promiscuous mode Jan 20 14:19:03 kvmhost kernel: [ 4.999008] device bond0 entered promiscuous mode Jan 20 14:19:03 kvmhost kernel: [ 4.999012] device bond0 left promiscuous mode Any advice would be greatly appreciated. It seems that this must be possible, based on other posts, but I can't see what I'm doing wrong. Thanks.

Read the article
SQLAuthority News – Stay Connected and Social Media

- by pinaldave

I think I have finally gotten back my faith in social media. If you are following my blog I am sure you are aware of my views on social media – SQLAuthority News – Social Media Confusion – Twitter, FaceBook, LinkedIn and Me. I was not happy about how social media was evolving. Whenever I go to Twitter, LinkedIn or Facebook, I noticed the same updates everywhere. I just thought I was wasting my time doing the same thing everywhere. I strongly believe that there is no dictator on internet. Nobody has authority over others, everybody can express their ideas as long as it is not violating others privacy and it is not morally wrong. I have decided that instead of trying to improve the world, I should change myself and adjust my needs. Here are few things I have done to relieve my social media confusion. Twitter I un-followed people who were taking up my time with too many updates. I un-followed people who hardly updated at all. I did not follow anybody else’s list, as I have no control over who other people follow. I follow not only serious SQL people but some fun stuff as well. I removed all my friends who were on Facebook and repeating the same updates on Twitter. I engage with them on Facebook. I followed people who are very conversational on Twitter. I let anybody follow me. I update all my blog posts through at least five tweets online. I decided to re-tweet at least five of my favorite tweets of the day, this way I force myself to remain active in the community. Follow me on Twitter! LinkedIn I updated my career and professional info on LinkedIn. I keep my LinkedIn profile updated with my latest jobs and career news. I let anybody connect with me on LinkedIn. I specify my email address in my profile, keeping it easy for those who want to add me. I read all the profile related updates of my connections – it is very valuable to know who is where and what changes are happening. I do not add my personal tweets or comments in LinkedIn profile. I just keep it professional. Link with me at LinkedIn Facebook I use Facebook only for personal friends. I visit all of my friends at regular intervals and make sure that they are really my friends. I often remove my friends from my Twitter list who are sending duplicate updates. I upload my family photos as well as family updates on Facebook, making sure that only my approved friends are able to read my updates. I keep my Facebook very personal and I often chat with my friends on Facebook chat. I am no longer confused about social media and I think I am using it appropriately. As I said, one cannot decide for others how to use social media, you can only decide for yourself. I have finally found my peace with social media. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: About Me, Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
Big Data – How to become a Data Scientist and Learn Data Science? – Day 19 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the analytics in Big Data Story. In this article we will understand how to become a Data Scientist for Big Data Story. Data Scientist is a new buzz word, everyone seems to be wanting to become Data Scientist. Let us go over a few key topics related to Data Scientist in this blog post. First of all we will understand what is a Data Scientist. In the new world of Big Data, I see pretty much everyone wants to become Data Scientist and there are lots of people I have already met who claims that they are Data Scientist. When I ask what is their role, I have got a wide variety of answers. What is Data Scientist? Data scientists are the experts who understand various aspects of the business and know how to strategies data to achieve the business goals. They should have a solid foundation of various data algorithms, modeling and statistics methodology. What do Data Scientists do? Data scientists understand the data very well. They just go beyond the regular data algorithms and builds interesting trends from available data. They innovate and resurrect the entire new meaning from the existing data. They are artists in disguise of computer analyst. They look at the data traditionally as well as explore various new ways to look at the data. Data Scientists do not wait to build their solutions from existing data. They think creatively, they think before the data has entered into the system. Data Scientists are visionary experts who understands the business needs and plan ahead of the time, this tremendously help to build solutions at rapid speed. Besides being data expert, the major quality of Data Scientists is “curiosity”. They always wonder about what more they can get from their existing data and how to get maximum out of future incoming data. Data Scientists do wonders with the data, which goes beyond the job descriptions of Data Analysist or Business Analysist. Skills Required for Data Scientists Here are few of the skills a Data Scientist must have. Expert level skills with statistical tools like SAS, Excel, R etc. Understanding Mathematical Models Hands-on with Visualization Tools like Tableau, PowerPivots, D3. j’s etc. Analytical skills to understand business needs Communication skills On the technology front any Data Scientists should know underlying technologies like (Hadoop, Cloudera) as well as their entire ecosystem (programming language, analysis and visualization tools etc.) . Remember that for becoming a successful Data Scientist one require have par excellent skills, just having a degree in a relevant education field will not suffice. Final Note Data Scientists is indeed very exciting job profile. As per research there are not enough Data Scientists in the world to handle the current data explosion. In near future Data is going to expand exponentially, and the need of the Data Scientists will increase along with it. It is indeed the job one should focus if you like data and science of statistics. Courtesy: emc Tomorrow In tomorrow’s blog post we will discuss about various Big Data Learning resources. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
Big Data – Operational Databases Supporting Big Data – Columnar, Graph and Spatial Database – Day 14 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the Key-Value Pair Databases and Document Databases in the Big Data Story. In this article we will understand the role of Columnar, Graph and Spatial Database supporting Big Data Story. Now we will see a few of the examples of the operational databases. Relational Databases (The day before yesterday’s post) NoSQL Databases (The day before yesterday’s post) Key-Value Pair Databases (Yesterday’s post) Document Databases (Yesterday’s post) Columnar Databases (Tomorrow’s post) Graph Databases (Today’s post) Spatial Databases (Today’s post) Columnar Databases Relational Database is a row store database or a row oriented database. Columnar databases are column oriented or column store databases. As we discussed earlier in Big Data we have different kinds of data and we need to store different kinds of data in the database. When we have columnar database it is very easy to do so as we can just add a new column to the columnar database. HBase is one of the most popular columnar databases. It uses Hadoop file system and MapReduce for its core data storage. However, remember this is not a good solution for every application. This is particularly good for the database where there is high volume incremental data is gathered and processed. Graph Databases For a highly interconnected data it is suitable to use Graph Database. This database has node relationship structure. Nodes and relationships contain a Key Value Pair where data is stored. The major advantage of this database is that it supports faster navigation among various relationships. For example, Facebook uses a graph database to list and demonstrate various relationships between users. Neo4J is one of the most popular open source graph database. One of the major dis-advantage of the Graph Database is that it is not possible to self-reference (self joins in the RDBMS terms) and there might be real world scenarios where this might be required and graph database does not support it. Spatial Databases We all use Foursquare, Google+ as well Facebook Check-ins for location aware check-ins. All the location aware applications figure out the position of the phone with the help of Global Positioning System (GPS). Think about it, so many different users at different location in the world and checking-in all together. Additionally, the applications now feature reach and users are demanding more and more information from them, for example like movies, coffee shop or places see. They are all running with the help of Spatial Databases. Spatial data are standardize by the Open Geospatial Consortium known as OGC. Spatial data helps answering many interesting questions like “Distance between two locations, area of interesting places etc.” When we think of it, it is very clear that handing spatial data and returning meaningful result is one big task when there are millions of users moving dynamically from one place to another place & requesting various spatial information. PostGIS/OpenGIS suite is very popular spatial database. It runs as a layer implementation on the RDBMS PostgreSQL. This makes it totally unique as it offers best from both the worlds. Courtesy: mushroom network Tomorrow In tomorrow’s blog post we will discuss about very important components of the Big Data Ecosystem – Hive. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
Professional Development – Difference Between Bio, CV and Resume

- by Pinal Dave

Applying for work can be very stressful – you want to put your best foot forward, and it can be very hard to sell yourself to a potential employer while highlighting your best characteristics and answering questions. On top of that, some jobs require different application materials – a biography (or bio), a curriculum vitae (or CV), or a resume. These things seem so interchangeable, so what is the difference? Let’s start with the one most of us have heard of – the resume. A resume is a summary of your job and education history. If you have ever applied for a job, you will have used a resume. The ability to write a good resume that highlights your best characteristics and emphasizes your qualifications for a specific job is a skill that will take you a long way in the world. For such an essential skill, unfortunately it is one that many people struggle with. RESUME So let’s discuss what makes a great resume. First, make sure that your name and contact information are at the top, in large print (slightly larger font than the rest of the text, size 14 or 16 if the rest is size 12, for example). You need to make sure that if you catch the recruiter’s attention and they know how to get a hold of you. As for qualifications, be quick and to the point. Make your job title and the company the headline, and include your skills, accomplishments, and qualifications as bullet points. Use good action verbs, like “finished,” “arranged,” “solved,” and “completed.” Include hard numbers – don’t just say you “changed the filing system,” say that you “revolutionized the storage of over 250 files in less than five days.” Doesn’t that sentence sound much more powerful? Curriculum Vitae (CV) Now let’s talk about curriculum vitae, or “CVs”. A CV is more like an expanded resume. The same rules are still true: put your name front and center, keep your contact info up to date, and summarize your skills with bullet points. However, CVs are often required in more technical fields – like science, engineering, and computer science. This means that you need to really highlight your education and technical skills. Difference between Resume and CV Resumes are expected to be one or two pages long – CVs can be as many pages as necessary. If you are one of those people lucky enough to feel limited by the size constraint of resumes, a CV is for you! On a CV you can expand on your projects, highlight really exciting accomplishments, and include more educational experience – including GPA and test scores from the GRE or MCAT (as applicable). You can also include awards, associations, teaching and research experience, and certifications. A CV is a place to really expand on all your experience and how great you will be in this particular position. Biography (Bio) Chances are, you already know what a bio is, and you have even read a few of them. Think about the one or two paragraphs that every author includes in the back flap of a book. Think about the sentences under a blogger’s photo on every “About Me” page. That is a bio. It is a way to quickly highlight your life experiences. It is essentially the way you would introduce yourself at a party. Where a bio is required for a job, chances are they won’t want to know about where you were born and how many pets you have, though. This is a way to summarize your entire job history in quick-to-read format – and sometimes during a job hunt, being able to get to the point and grab the recruiter’s interest is the best way to get your foot in the door. Think of a bio as your entire resume put into words. Most bios have a standard format. In paragraph one, talk about your most recent position and accomplishments there, specifically how they relate to the job you are applying for. If you have teaching or research experience, training experience, certifications, or management experience, talk about them in paragraph two. Paragraph three and four are for highlighting publications, education, certifications, associations, etc. To wrap up your bio, provide your contact info and availability (dates and times). Where to use What? For most positions, you will know exactly what kind of application to use, because the job announcement will state what materials are needed – resume, CV, bio, cover letter, skill set, etc. If there is any confusion, choose whatever the industry standard is (CV for technical fields, resume for everything else) or choose which of your documents is the strongest. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: About Me, PostADay, Professional Development, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
How Visual Studio 2010 and Team Foundation Server enable Compliance

- by Martin Hinshelwood

One of the things that makes Team Foundation Server (TFS) the most powerful Application Lifecycle Management (ALM) platform is the traceability it provides to those that use it. This traceability is crucial to enable many companies to adhere to many of the Compliance regulations to which they are bound (e.g. CFR 21 Part 11 or Sarbanes–Oxley.) From something as simple as relating Tasks to Check-in’s or being able to see the top 10 files in your codebase that are causing the most Bugs, to identifying which Bugs and Requirements are in which Release. All that information is available and more in TFS. Although all of this tradability is available within TFS you do need to understand that it is not for free. Well… I say that, but if you are using TFS properly you will have this information with no additional work except for firing up the reporting. Using Visual Studio ALM and Team Foundation Server you can relate every line of code changes all the way up to requirements and back down through Test Cases to the Test Results. Figure: The only thing missing is Build In order to build the relationship model below we need to examine how each of the relationships get there. Each member of your team from programmer to tester and Business Analyst to Business have their roll to play to knit this together. Figure: The relationships required to make this work can get a little confusing If Build is added to this to relate Work Items to Builds and with knowledge of which builds are in which environments you can easily identify what is contained within a Release. Figure: How are things progressing Along with the ability to produce the progress and trend reports the tractability that is built into TFS can be used to fulfil most audit requirements out of the box, and augmented to fulfil the rest. In order to understand the relationships, lets look at each of the important Artifacts and how they are associated with each other… Requirements – The root of all knowledge Requirements are the thing that the business cares about delivering. These could be derived as User Stories or Business Requirements Documents (BRD’s) but they should be what the Business asks for. Requirements can be related to many of the Artifacts in TFS, so lets look at the model: Figure: If the centre of the world was a requirement We can track which releases Requirements were scheduled in, but this can change over time as more details come to light. Figure: Who edited the Requirement and when There is also the ability to query Work Items based on the History of changed that were made to it. This is particularly important with Requirements. It might not be enough to say what Requirements were completed in a given but also to know which Requirements were ever assigned to a particular release. Figure: Some magic required, but result still achieved As an augmentation to this it is also possible to run a query that shows results from the past, just as if we had a time machine. You can take any Query in the system and add a “Asof” clause at the end to query historical data in the operational store for TFS. select <fields> from WorkItems [where <condition>] [order by <fields>] [asof <date>] Figure: Work Item Query Language (WIQL) format In order to achieve this you do need to save the query as a *.wiql file to your local computer and edit it in notepad, but one imported into TFS you run it any time you want. Figure: Saving Queries locally can be useful All of these Audit features are available throughout the Work Item Tracking (WIT) system within TFS. Tasks – Where the real work gets done Tasks are the work horse of the development team, but they only as useful as Excel if you do not relate them properly to other Artifacts. Figure: The Task Work Item Type has its own relationships Requirements should be broken down into Tasks that the development team work from to build what is required by the business. This may be done by a small dedicated group or by everyone that will be working on the software team but however it happens all of the Tasks create should be a Child of a Requirement Work Item Type. Figure: Tasks are related to the Requirement Tasks should be used to track the day-to-day activities of the team working to complete the software and as such they should be kept simple and short lest developers think they are more trouble than they are worth. Figure: Task Work Item Type has a narrower purpose Although the Task Work Item Type describes the work that will be done the actual development work involves making changes to files that are under Source Control. These changes are bundled together in a single atomic unit called a Changeset which is committed to TFS in a single operation. During this operation developers can associate Work Item with the Changeset. Figure: Tasks are associated with Changesets Changesets – Who wrote this crap Changesets themselves are just an inventory of the changes that were made to a number of files to complete a Task. Figure: Changesets are linked by Tasks and Builds Figure: Changesets tell us what happened to the files in Version Control Although comments can be changed after the fact, the inventory and Work Item associations are permanent which allows us to Audit all the way down to the individual change level. Figure: On Check-in you can resolve a Task which automatically associates it Because of this we can view the history on any file within the system and see how many changes have been made and what Changesets they belong to. Figure: Changes are tracked at the File level What would be even more powerful would be if we could view these changes super imposed over the top of the lines of code. Some people call this a blame tool because it is commonly used to find out which of the developers introduced a bug, but it can also be used as another method of Auditing changes to the system. Figure: Annotate shows the lines the Annotate functionality allows us to visualise the relationship between the individual lines of code and the Changesets. In addition to this you can create a Label and apply it to a version of your version control. The problem with Label’s is that they can be changed after they have been created with no tractability. This makes them practically useless for any sort of compliance audit. So what do you use? Branches – And why we need them Branches are a really powerful tool for development and release management, but they are most important for audits. Figure: One way to Audit releases The R1.0 branch can be created from the Label that the Build creates on the R1 line when a Release build was created. It can be created as soon as the Build has been signed of for release. However it is still possible that someone changed the Label between this time and its creation. Another better method can be to explicitly link the Build output to the Build. Builds – Lets tie some more of this together Builds are the glue that helps us enable the next level of tractability by tying everything together. Figure: The dashed pieces are not out of the box but can be enabled When the Build is called and starts it looks at what it has been asked to build and determines what code it is going to get and build. Figure: The folder identifies what changes are included in the build The Build sets a Label on the Source with the same name as the Build, but the Build itself also includes the latest Changeset ID that it will be building. At the end of the Build the Build Agent identifies the new Changesets it is building by looking at the Check-ins that have occurred since the last Build. Figure: What changes have been made since the last successful Build It will then use that information to identify the Work Items that are associated with all of the Changesets Changesets are associated with Build and change the “Integrated In” field of those Work Items . Figure: Find all of the Work Items to associate with The “Integrated In” field of all of the Work Items identified by the Build Agent as being integrated into the completed Build are updated to reflect the Build number that successfully integrated that change. Figure: Now we know which Work Items were completed in a build Now that we can link a single line of code changed all the way back through the Task that initiated the action to the Requirement that started the whole thing and back down to the Build that contains the finished Requirement. But how do we know wither that Requirement has been fully tested or even meets the original Requirements? Test Cases – How we know we are done The only way we can know wither a Requirement has been completed to the required specification is to Test that Requirement. In TFS there is a Work Item type called a Test Case Test Cases enable two scenarios. The first scenario is the ability to track and validate Acceptance Criteria in the form of a Test Case. If you agree with the Business a set of goals that must be met for a Requirement to be accepted by them it makes it both difficult for them to reject a Requirement when it passes all of the tests, but also provides a level of tractability and validation for audit that a feature has been built and tested to order. Figure: You can have many Acceptance Criteria for a single Requirement It is crucial for this to work that someone from the Business has to sign-off on the Test Case moving from the “Design” to “Ready” states. The Second is the ability to associate an MS Test test with the Test Case thereby tracking the automated test. This is useful in the circumstance when you want to Track a test and the test results of a Unit Test designed to test the existence of and then re-existence of a a Bug. Figure: Associating a Test Case with an automated Test Although it is possible it may not make sense to track the execution of every Unit Test in your system, there are many Integration and Regression tests that may be automated that it would make sense to track in this way. Bug – Lets not have regressions In order to know wither a Bug in the application has been fixed and to make sure that it does not reoccur it needs to be tracked. Figure: Bugs are the centre of their own world If the fix to a Bug is big enough to require that it is broken down into Tasks then it is probably a Requirement. You can associate a check-in with a Bug and have it tracked against a Build. You would also have one or more Test Cases to prove the fix for the Bug. Figure: Bugs have many associations This allows you to track Bugs / Defects in your system effectively and report on them. Change Request – I am not a feature In the CMMI Process template Change Requests can also be easily tracked through the system. In some cases it can be very important to track Change Requests separately as an Auditor may want to know what was changed and who authorised it. Again and similar to Bugs, if the Change Request is big enough that it would require to be broken down into Tasks it is in reality a new feature and should be tracked as a Requirement. Figure: Make sure your Change Requests only Affect Requirements and not rewrite them Conclusion Visual Studio 2010 and Team Foundation Server together provide an exceptional Application Lifecycle Management platform that can help your team comply with even the harshest of Compliance requirements while still enabling them to be Agile. Most Audits are heavy on required documentation but most of that information is captured for you as long a you do it right. You don’t even need every team member to understand it all as each of the Artifacts are relevant to a different type of team member. Business Analysts manage Requirements and Change Requests Programmers manage Tasks and check-in against Change Requests and Bugs Testers manage Bugs and Test Cases Build Masters manage Builds Although there is some crossover there are still rolls or “hats” that are worn. Do you thing this is all achievable? Have I missed anything that you think should be there?

Read the article
SQL Table stored as a Heap - the dangers within

- by MikeD

Nearly all of the time I create a table, I include a primary key, and often that PK is implemented as a clustered index. Those two don't always have to go together, but in my world they almost always do. On a recent project, I was working on a data warehouse and a set of SSIS packages to import data from an OLTP database into my data warehouse. The data I was importing from the business database into the warehouse was mostly new rows, sometimes updates to existing rows, and sometimes deletes. I decided to use the MERGE statement to implement the insert, update or delete in the data warehouse, I found it quite performant to have a stored procedure that extracted all the new, updated, and deleted rows from the source database and dump it into a working table in my data warehouse, then run a stored proc in the warehouse that was the MERGE statement that took the rows from the working table and updated the real fact table. Use Warehouse CREATE TABLE Integration.MergePolicy (PolicyId int, PolicyTypeKey int, Premium money, Deductible money, EffectiveDate date, Operation varchar(5)) CREATE TABLE fact.Policy (PolicyKey int identity primary key, PolicyId int, PolicyTypeKey int, Premium money, Deductible money, EffectiveDate date) CREATE PROC Integration.MergePolicy as begin begin tran Merge fact.Policy as tgtUsing Integration.MergePolicy as SrcOn (tgt.PolicyId = Src.PolicyId) When not matched by Target then Insert (PolicyId, PolicyTypeKey, Premium, Deductible, EffectiveDate)values (src.PolicyId, src.PolicyTypeKey, src.Premium, src.Deductible, src.EffectiveDate) When matched and src.Operation = 'U' then Update set PolicyTypeKey = src.PolicyTypeKey,Premium = src.Premium,Deductible = src.Deductible,EffectiveDate = src.EffectiveDate When matched and src.Operation = 'D' then Delete ;delete from Integration.WorkPolicy commit end Notice that my worktable (Integration.MergePolicy) doesn't have any primary key or clustered index. I didn't think this would be a problem, since it was relatively small table and was empty after each time I ran the stored proc. For one of the work tables, during the initial loads of the warehouse, it was getting about 1.5 million rows inserted, processed, then deleted. Also, because of a bug in the extraction process, the same 1.5 million rows (plus a few hundred more each time) was getting inserted, processed, and deleted. This was being sone on a fairly hefty server that was otherwise unused, and no one was paying any attention to the time it was taking. This week I received a backup of this database and loaded it on my laptop to troubleshoot the problem, and of course it took a good ten minutes or more to run the process. However, what seemed strange to me was that after I fixed the problem and happened to run the merge sproc when the work table was completely empty, it still took almost ten minutes to complete. I immediately looked back at the MERGE statement to see if I had some sort of outer join that meant it would be scanning the target table (which had about 2 million rows in it), then turned on the execution plan output to see what was happening under the hood. Running the stored procedure again took a long time, and the plan output didn't show me much - 55% on the MERGE statement, and 45% on the DELETE statement, and table scans on the work table in both places. I was surprised at the relative cost of the DELETE statement, because there were really 0 rows to delete, but I was expecting to see the table scans. (I was beginning now to suspect that my problem was because the work table was being stored as a heap.) Then I turned on STATS_IO and ran the sproc again. The output was quite interesting.Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Policy'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'MergePolicy'. Scan count 1, logical reads 433276, physical reads 60, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. I've reproduced the above from memory, the details aren't exact, but the essential bit was the very high number of logical reads on the table stored as a heap. Even just doing a SELECT Count(*) from Integration.MergePolicy incurred that sort of output, even though the result was always 0. I suppose I should research more on the allocation and deallocation of pages to tables stored as a heap, but I haven't, and my original assumption that a table stored as a heap with no rows would only need to read one page to answer any query was definitely proven wrong. It's likely that some sort of physical defragmentation of the table may have cleaned that up, but it seemed that the easiest answer was to put a clustered index on the table. After doing so, the execution plan showed a cluster index scan, and the IO stats showed only a single page read. (I aborted my first attempt at adding a clustered index on the table because it was taking too long - instead I ran TRUNCATE TABLE Integration.MergePolicy first and added the clustered index, both of which took very little time). I suspect I may not have noticed this if I had used TRUNCATE TABLE Integration.MergePolicy instead of DELETE FROM Integration.MergePolicy, since I'm guessing that the truncate operation does some rather quick releasing of pages allocated to the heap table. In the future, I will likely be much more careful to have a clustered index on every table I use, even the working tables. Mike

Read the article
SQL Saturday 27 (Portland, Oregon)

- by BuckWoody

I’m sitting in the Seattle airport, waiting for my flight to Silicon Valley California for the SQL Server 2008 R2 Launch Event. By some quirk of nature, they are asking me to Emcee the event – but that’s another post entirely. I’m reflecting on the SQL Saturday 27 event that was just held in Portland, Oregon this last Saturday. These are not Microsoft-sponsored events – it’s truly the community at work. Think of a big user-group meeting – I mean REALLY big – held in a central location, like at a college (as ours was) or some larger, inexpensive venue like that. Everyone there is volunteering – it’s my own money and time to drive several hours to a hotel for the night, feed myself and present. It’s their own time and money for the folks that organize the event – unless a vendor or two steps in to help. It’s their own time and money for the attendees to drive a long way, spend the night and their Saturday to listen to the speakers. Why do all this? Because everybody benefits. Every speaker learns something new, meets new people, and reaches a new audience. Every volunteer does the same. And the attendees? Well, it’s pretty obvious what they get. A 7Am to 10PM extravaganza of knowledge from every corner of the product. In fact, this year the Portland group hooked up with the CodeCamp folks and held a combined event. We had over 850 people, and I had everyone from data professionals to developers in my sessions. So I’ll take this opportunity to do two things: to say “thank you” to all of the folks who attended, from those who spoke to those who worked and those who came to listen, and to challenge you to attend the next SQL Saturday anywhere near you. You can find the list here: http://www.sqlsaturday.com/. Don’t see anything in your area? Start one! The PASS folks have a package that will show you how. Sure, it’s a big job, but the key is to get as many people helping you as possible. Even if you have only a few dozen folks show up the first time, no worries. The first events I presented at had about 20 in the room. But not this week. See you at the Launch Event if you’re near the San Francisco area tomorrow, and see you at the Redmond SQL Saturday and TechEd if not. Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
Big Data – Interacting with Hadoop – What is Sqoop? – What is Zookeeper? – Day 17 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the Pig and Pig Latin in Big Data Story. In this article we will understand what is Sqoop and Zookeeper in Big Data Story. There are two most important components one should learn when learning about interacting with Hadoop – Sqoop and Zookper. What is Sqoop? Most of the business stores their data in RDBMS as well as other data warehouse solutions. They need a way to move data to the Hadoop system to do various processing and return it back to RDBMS from Hadoop system. The data movement can happen in real time or at various intervals in bulk. We need a tool which can help us move this data from SQL to Hadoop and from Hadoop to SQL. Sqoop (SQL to Hadoop) is such a tool which extract data from non-Hadoop data sources and transform them into the format which Hadoop can use it and later it loads them into HDFS. Essentially it is ETL tool where it Extracts, Transform and Load from SQL to Hadoop. The best part is that it also does extract data from Hadoop and loads them to Non-SQL (or RDBMS) data stores. Essentially, Sqoop is a command line tool which does SQL to Hadoop and Hadoop to SQL. It is a command line interpreter. It creates MapReduce job behinds the scene to import data from an external database to HDFS. It is very effective and easy to learn tool for nonprogrammers. What is Zookeeper? ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. In other words Zookeeper is a replicated synchronization service with eventual consistency. In simpler words – in Hadoop cluster there are many different nodes and one node is master. Let us assume that master node fails due to any reason. In this case, the role of the master node has to be transferred to a different node. The main role of the master node is managing the writers as that task requires persistence in order of writing. In this kind of scenario Zookeeper will assign new master node and make sure that Hadoop cluster performs without any glitch. Zookeeper is the Hadoop’s method of coordinating all the elements of these distributed systems. Here are few of the tasks which Zookeepr is responsible for. Zookeeper manages the entire workflow of starting and stopping various nodes in the Hadoop’s cluster. In Hadoop cluster when any processes need certain configuration to complete the task. Zookeeper makes sure that certain node gets necessary configuration consistently. In case of the master node fails, Zookeepr can assign new master node and make sure cluster works as expected. There many other tasks Zookeeper performance when it is about Hadoop cluster and communication. Basically without the help of Zookeeper it is not possible to design any new fault tolerant distributed application. Tomorrow In tomorrow’s blog post we will discuss about very important components of the Big Data Ecosystem – Big Data Analytics. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
SQLAuthority News – Advantages of Distance Learning

- by Pinal Dave

Distance education is extremely popular – almost overnight, it seems. Almost everyone has taken an online course, or knows someone who has, or is considering joining an online school. There are many advantages and disadvantages to attending an online school – but the same can be said of attending a physical school! Let’s take a look at the top reasons to use distance education. 1) Flexibility. Physical universities are usually willing to make some concessions to student – like night classes, study hours, and online networks. However, nothing is going to beat the flexibility of distance education. You can attend classes and take notes anytime, anywhere, wearing anything you’d like! 2) Affordability. We don’t need to get into hard numbers to understand how an expensive university can be. Students are taking on more and more debt just to get an education. Many of these fees pay for room, board, and facilities. Distance education cuts out all these costs, and makes attending school much more affordable for the average student. 3) Try before you buy. Did you know that the average college student changes his or her major 10 times before they graduate? You can imagine that this kind of indecision plays a huge part in WHEN you graduate – not being able to make up your mind can cost you big bucks if you have to stay in school for extra years! Distance education allows you to take different classes from a wide range of disciplines. Do you want to study forensic science or English literature? Now you don’t have to pay for classes you can’t afford just to find out. 4) Pace yourself. Some students struggle in a traditional classroom setting – classes can be taught too fast, too slow, or there are too many distractions. Distance education allows mature students to set the pace themselves. They can rewatch lectures they didn’t catch the first time, or go through classes quickly if they are already familiar with the material – cutting out the chance of burning out or getting bored. 5) Lifelong learning. Maybe you already have a degree, but would like to learn more about your field, or a related field, or maybe even about something completely unrelated – just because you are curious! Distance education allows you to learn whatever you want ,whenever you want (and yes, wearing anything you’d like!). 6) Attend whatever college you want. Because of the popularity of distance education, physical campuses are getting in on the game by offering online courses – often just uploaded versions of classes already taught at their campus. Ever wanted to attend Harvard, but knew you couldn’t get in? Take a class online! Of course, you probably should not attempt to lie and say you have a Harvard degree, but Ivy League colleges are prestigious because they are the best in their field – take advantage of the best by taking an online course! I am a big believer in continuing education, whether it is online courses, returning to school, or even take informal classes online. Distance education can be a great way to accomplish these goals and become a lifelong learner. My friends at provides training through virtual classrooms for students who want to avoid travelling. Distance learning course allows IT aspirants to connect with trainers using the internet. I encourage everyone to check it out! Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQL Training, T SQL, Technology

Read the article
Three Buckets of Knowledge

- by BuckWoody

As I learn more and more about SQL Server every day, I divide up my information into three “buckets”: Concepts In the first bucket are the general concepts about the topic. What is it? What does it do (or sometimes, what is is supposed to do?) How does one operation flow to another? For this information I use books, magazine articles and believe it or not – Wikipedia. I don’t always trust that last source, but I do use it to see how others lay out their thoughts around a concept. I really like graphical charts that show me the process flow if I can get it, and this is an ideal place for a good presentation. In fact, this may be the only real use for a presentation – I’ll explain what I mean in a moment. Reference The references for a topic include things like Transact-SQL (T-SQL) syntax, or the screen layout on a panel, things like that. Think Dictionary. The only reference I trust for this information is Books Online – presentations are fine, but we’re talking about a dictionary. Ever go to a movie that just reads through a dictionary? Me neither. But I have gone to presentations where people try to include tons of reference materials in their slides. Even if you give me the presentation material later, it’s not really a searchable, readable medium. How To A how-to for me is an example, or even better, a tutorial about an example. Whatever it is shows me a practical use for the concepts and of course involves the syntax. The important thing here is that you need to be able to separate out the example the person is showing you from the stuff you need to know. I can’t tell you how many times folks have told me, “well, sure, if yours is red then that works. But mine is blue.” And I have to explain, “then use “blue” for the search word here.” You get the idea. No one will do your work for you – the examples are meant as a teaching tool only. I accept that, learn what I can, and then run off to create my own thing. You might think a How To works well in a presentation, and it does, for the most part. For a complex example or tutorial, I still prefer the printed word (electronic if possible) so that I can go over the example multiple times, skip around and so on. The order here isn’t actually that important. Most of the time I start with a concept, look at an example, and then read the reference material. But sometimes I look up an example, read a little of concepts and then check the reference. The only primary thing I try to enforce is to read something from each of them. It’s dangerous to base your work on any single example, reference or concept. Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
Big Data – ClustrixDB – Extreme Scale SQL Database with Real-time Analytics, Releases Software Download – NewSQL

- by Pinal Dave

There are so many things to learn and there is so little time we all have. As we have little time we need to be selective to learn whatever we learn. I believe I know quite a lot of things in SQL but I still do not know what is around SQL. I have started to learn about NewSQL recently. If you wonder what is NewSQL I encourage all of you to read my blog post about NewSQL over here Big Data – Buzz Words: What is NewSQL – Day 10 of 21. NewSQL databases are quickly becoming popular – providing the scale of NoSQL with the SQL features and transactions. As a part of learning NewSQL database, I have recently started to learn about ClustrixDB. ClustrixDB has been the most mature NewSQL database used by some of the largest internet sites in the world for over 3 years, with extensive SQL support. In addition to scale, it provides fast real-time analytics by bringing massively parallel processing (MPP), available only in warehousing databases, to the transactional database. The reason I am more intrigued about learning ClustrixDB is their recent announcement on Oct 31. ClustrixDB was only available as an appliance, but now with their software release on Oct 31, everyone can use it. It is now available as forever free for up to 12 cores with community support, and there is a 45 day trial for unlimited cluster sizes. With the forever free world, I am indeed interested in ClustrixDB now. I know that few of the leading eCommerce sites in the world uses them for their transactional database. Here are few of the details I have quickly noted for ClustrixDB. ClustrixDB allows user to: Scale by simply adding nodes to the cluster with a single command Run billions of transactions a day Run fast real-time analytics Achieve high-availability with recovery from node failure Manages itself Easily migrate from MySQL as it is nearly plug-and-play compatible, use MySQL drivers, tools and replication. While I was going through the documentation I realized that ClustrixDB also has extensive support for SQL features including complex queries involving joins on a dozen or more tables, aggregates, sorts, sub-queries. It also supports stored procedures, triggers, foreign keys, partitioned and temporary tables, and fully online schema changes. It is indeed a very matured product and SQL solution. Indeed Clusterix sound very promising solution, I decided to dig a bit deeper to understand who are current customers of the Clustrix as they exist in the industry for quite a few years. Their client list is indeed very interesting and here is my quick research about them. Twoo.com – Europe’s largest social discovery (dating) site runs 4.4 Billion Transactions a day with table sizes over a Terabyte, on a 168 core cluster. EngageBDR – Top 3 in the online advertising category uses ClustrixDB to serve 6.9 billion ads a day through real-time bidding platform. Their reports went from 4 hours to 15 seconds. NoMoreRack – Top 2 fastest growing e-commerce company in US used ClustrixDB for high availability and fast growth through Amazon cloud. MakeMyTrip – India’s leading travel site runs on ClustrixDB with two clusters running as multi-master in Chennai and Bangalore. Many enterprises such as AOL, CSC, Rakuten, Symantec use ClustrixDB when their applications need scale. I must accept that I am impressed with the information I have learned so far and now is the time to do some hand’s on experience with their product. I want to learn this technology so in future when it is about NewSQL, I know what I am talking about. Read more why Clustrix explains why you ClustrixDB might be the right database for you. Download ClustrixDB with me today and install it on your machine so in future when we discuss the technical aspects of it, we all are on the same page. The software can be downloaded here. Reference : Pinal Dave (http://blog.SQLAuthority.com)Filed under: Big Data, MySQL, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: Clustrix

Read the article
PASS summit 2013. We do not remember days. We remember moments.

- by Maria Zakourdaev

"Business or pleasure?" barked the security officer in the Charlotte International Airport. "I’m not sure, sir," I whimpered, immediately losing all courage. "I'm here for the database technologies summit called PASS”. "Sounds boring. Definitely a business trip." Boring?! He couldn’t have been more wrong. If he only knew about the countless meetings throughout the year where I waved my hands at my great boss and explained again and again how fantastic this summit is and how much I learned last year. One by one, the drops of water began eating away at the stone. He finally approved of my trip just to stop me from torturing him. Time moves as slow as a turtle when you are waiting for something. Time runs as fast as a cheetah when you are there. PASS has come...and passed. It’s been an amazing week. Enormous sqlenergy has filled the city, filled the convention center and the surrounding pubs and restaurants. There were awesome speakers, great content, and the chance to meet most inspiring database professionals from all over the world. Some sessions were unforgettable. Imagine a fully packed room with more than 500 people in awed silence, catching each and every one of Paul Randall's words. His tremendous energy and deep knowledge were truly thrilling. No words can describe Rob Farley's unique presentation style, captivating and engaging the audience. When the precious session minutes were over, I could tell that the many random puzzle pieces of information that his listeners knew had been suddenly combined into a clear, cohesive picture. I was amazed as always by Paul White's great sense of humor and his phenomenal ability to explain complicated concepts in a simple way. The keynote by the brilliant Dr. DeWitt from Microsoft in front of the full summit audience of 5000 deeply listening people was genuinely breathtaking. The entire conference throughout offered excellent speakers who inspired me to absorb the knowledge and use it when I got home. To my great surprise, I found that there are other people in this world who like replication as much I do. During the Birds of a Feather Luncheon, SQL Server MVP Ted Krueger was writing a script for replicating the food to other tables. I learned many things at PASS, and not all of them were about SQL. After three summits, this time I finally got the knack of networking. I actually went up and spoke to people, and believe me, that was not easy for an introvert. But this is what the summit is all about. Sqlpeople. They are the ones who make it such an exciting experience. I will be looking forward to the next year. Till then I have my notes and new ideas. How long was the summit? Thousands of unforgettable moments.

Read the article
SQL Saturday 27 (Portland, Oregon)

- by BuckWoody

I’m sitting in the Seattle airport, waiting for my flight to Silicon Valley California for the SQL Server 2008 R2 Launch Event. By some quirk of nature, they are asking me to Emcee the event – but that’s another post entirely. I’m reflecting on the SQL Saturday 27 event that was just held in Portland, Oregon this last Saturday. These are not Microsoft-sponsored events – it’s truly the community at work. Think of a big user-group meeting – I mean REALLY big – held in a central location, like at a college (as ours was) or some larger, inexpensive venue like that. Everyone there is volunteering – it’s my own money and time to drive several hours to a hotel for the night, feed myself and present. It’s their own time and money for the folks that organize the event – unless a vendor or two steps in to help. It’s their own time and money for the attendees to drive a long way, spend the night and their Saturday to listen to the speakers. Why do all this? Because everybody benefits. Every speaker learns something new, meets new people, and reaches a new audience. Every volunteer does the same. And the attendees? Well, it’s pretty obvious what they get. A 7Am to 10PM extravaganza of knowledge from every corner of the product. In fact, this year the Portland group hooked up with the CodeCamp folks and held a combined event. We had over 850 people, and I had everyone from data professionals to developers in my sessions. So I’ll take this opportunity to do two things: to say “thank you” to all of the folks who attended, from those who spoke to those who worked and those who came to listen, and to challenge you to attend the next SQL Saturday anywhere near you. You can find the list here: http://www.sqlsaturday.com/. Don’t see anything in your area? Start one! The PASS folks have a package that will show you how. Sure, it’s a big job, but the key is to get as many people helping you as possible. Even if you have only a few dozen folks show up the first time, no worries. The first events I presented at had about 20 in the room. But not this week. See you at the Launch Event if you’re near the San Francisco area tomorrow, and see you at the Redmond SQL Saturday and TechEd if not. Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
SRMSVC event ID 8197

- by Godeke

I have a Windows 2003 R2 machine that is giving an Event ID 8197 about once an hour and ten minutes. The full error is attached below. The machine is primary used to host IIS webpages and SMTP. There is no known scheduled tasks on the machine. I have read a lot of Google Search and Microsoft docs, but none of the suggestions found there have any impact. What I am curious is if there is any way to convert the SRMVOLMC81 and SRMVOLMC57 into mount point data so I could at least know where the error is sourcing from (there are no related errors in the logs, just the 8197 every hour and ten). Event Type: Error Event Source: SRMSVC Event Category: None Event ID: 8197 Date: 2/7/2011 Time: 11:32:21 AM User: N/A Computer: SERVER001 Description: File Server Resource Manager Service error: Unexpected error. Error-specific details: Error: GetVolumeNameForVolumeMountPoint, 0x80070001, Incorrect function. For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp. Data: 0000: 53 52 4d 56 4f 4c 4d 43 SRMVOLMC 0008: 38 31 00 00 00 00 00 00 81...... 0010: 53 52 4d 56 4f 4c 4d 43 SRMVOLMC 0018: 35 37 00 00 00 00 00 00 57......

Read the article
iScsiPrt error event ID 5

- by AZee

Event Log: "Failed to setup initiator portal. Error status is given in the dump data." This is being recorded every 3/100's of a second. We are using MS iSCSI Initiator on Windows Server 2003, Dell 2970 w/4GB (PAE). I am sure that this was configured by Dell initially. I have no idea what changes or mods were made since the company installed this machine until now. (I'm a new User so the lovely and vibrant screen images had to be removed. They were quite pretty and I am sure you would have been very moved and appreciative of them.) It appears that everything is installed correctly and the 5TB bound volume is accessible but I have never worked with iScsi before so I plead total ignorance. In searching I have found this to be a fairly sparce and bland documented subject. I'd like two things... First, to get rid of the error msg being logged. MS says it can be ignored if everything is working but it chews up resources logging it and I don't feel comfortable about any errors on my servers. I want to correct whatever is causing this problem. Secondly, being totally green to this, I would like to confirm that the setup is optimized and we are taking advantage of all features available. Although there are 3 NIC's in this machine it appears that the initiator is only configured for the Broadcom BMC5708C NetXtreme II on our 10.90.1.#, the other 2 NICS are 1GB on the 192.168.0.#. Would additional targets improve performance? If someone who is experienced in configuring the Microsoft iScsi Initiator can help I would really appreciate it since, as I mentioned, everything I have come across has not been of any value at all. Thanks! ~AZ

Read the article
IIS6: Web Site presenting the wrong SSL certificate

- by pcampbell

Consider an IIS6 installation with multiple Web Sites. Each is intended to be a different subdomain with its own cert (not a wildcard cert). Each has their host-header specified properly. foo.example.com - port 443. Require SSL w/128 bit. Working properly! It presents its SSL cert properly to the browser. Configured for a specific IP address. bar.example.com - port 443. Require SSL w/128 bit. Configured for all unassigned addresses. When inspecting the IIS property page, it fully shows the cert for bar.example.com on the View Certificate button. This is a NEW web site that is having cert problems. It's presenting the cert for foo.example.com. Ouch! Question: can you have more than one subdomains both running on separate websites with SSL certs on the same port (443)? How would you configure 2 web sites on the same range of 'all unassigned' for the same port (443) ? Update: ignoring the cert error, when browsing to https://bar, the content served is from https://foo site. When NOT using SSL, browsing to http://bar serves the correct content from bar. Just one address is assigned to this DMZ server.

Read the article
Configure IIS7.5 to allow calls to asmx web services.

- by goodeye

Hi, I migrated a site from IIS6 to Windows Server 2008 R2 IIS7.5. It has an asmx web service, which is working fine locally, but returns this 500 error when called from another machine: Request format is unrecognized for URL unexpectedly ending in /myMethodName The solution in previous versions is to add this to the web.config for the protocols needed (typically omitting HttpGet for production): <system.web> <webServices> <protocols> <add name="HttpGet" /> <add name="HttpPost" /> <add name="HttpSoap" /> </protocols> </webServices> </system.web> This is posted everywhere, including http://stackoverflow.com/questions/657313/request-format-is-unrecognized-for-url-unexpectedly-ending-in For IIS7.5, this throws a configuration error; I understand this section doesn't belong, but tried it anyway. I also boiled down the asmx call to a simple hello world. I tested with POST also, just to eliminate any issues with GET. What is the equivalent for IIS7.5? - either web.config format or the UI button to push would be really helpful. Thanks, Bob

Read the article
The Message Queuing service failed to join the computer's domain 'DOMAIN' Error 0xc00e0025

- by SimonGoldstone

Struggling to get MSMQ installed in Domain Integration mode on Windows 2012 (Azure). So far, I've provisioned a brand new Windows Server 2012 (R2) machine on the Azure platform and installed the Active Directory role and promoted the machine to a domain controller. Once the AD was in place, I then added the MSMQ feature, along with the Directory Integration add on. However, it will not install in AD integration mode. It will only work in Workgroup mode. I can verify this by running the following powershell command: New-MSMQQueue -name Queue1 -queuetype Public When I run this command, I get the following error: New-MsmqQueue : A workgroup installation computer does not support the operation. The event viewer reveals to errors in the Application log: 1. The Message Queuing service failed to join the computer's domain 'DOMAIN'. Error 0xc00e0025: 2. Message Queuing was unable to create the msmq (MSMQ Configuration) object in Active Directory Domain Services. Error c00e0025h: I'm struggling here. Any advice?

Read the article
Event ID: 861 - The Windows Firewall has detected an application listening for incoming traffic

- by Chris Marisic

Firstly, my machines aren't compromised any person suggesting such will be DV'd. The security logs on some of my networks client machines (all Windows Xp Sp3) get filled with these useless error messages. Security Failure Audit Detailed Tracking Event ID: 861 User: NT AUTHORITY\NETWORK SERVICE The Windows Firewall has detected an application listening for incoming traffic. Name: - Path: C:\WINDOWS\system32\svchost.exe Process identifier: 976 User account: NETWORK SERVICE User domain: NT AUTHORITY Service: Yes RPC server: No IP version: IPv4 IP protocol: UDP Port number: 55035 Allowed: No User notified: No It's always on various random ports of UDP so setting up a port exception isn't really an option. It's always from svchost or lsass both of which are running services from DLLs. One of the most offending processes seems to the be DnsCache. I have in my global policy under AT < Network < Network Connection < Widnows Firewall < Domain Profile (I haven't changed any standard profile options do both need configured? To allow remote administration and desktop exceptions and have a custom program exception list that has %SystemRoot%\system32\svchost.exe:*:enabled:svchost (Windows won't allow you to add this exception on a local machine but it let me have it on here in the global policy it just doesn't seem to do anything) %SystemRoot%\system32\lsass.exe:*enabled:lsass (I think this one ended all of my LSASS messages) %SystemRoot%\system32\dnsrslvr.dll:*:enabled:dnscache (I tried adding the dll itself to the exception list, this didn't seem to do anything) Is there really any other options left other than disabling the Windows Firewall entirely, disabling auditing entirely or just changing the event viewer to just auto overwrite when needed? I'd much rather fix the problem and get rid of these entries ever being created instead of just trying to cover up the problem.

Read the article

Search Results

Search found 87891 results on 3516 pages for 'server migration'.

Page 617/3516 | < Previous Page | 613 614 615 616 617 618 619 620 621 622 623 624 | Next Page >

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Teknikk

- by Pinal Dave

- by user9546

- by pinaldave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Martin Hinshelwood

- by MikeD

- by BuckWoody

- by Pinal Dave

- by Pinal Dave

- by BuckWoody

- by Pinal Dave

- by Maria Zakourdaev

- by BuckWoody

- by Godeke

- by AZee

- by pcampbell

- by goodeye

- by SimonGoldstone

- by Chris Marisic

< Previous Page | 613 614 615 616 617 618 619 620 621 622 623 624 | Next Page >