Search Results

Search found 58569 results on 2343 pages for 'data guard'.

Page 7/2343 | < Previous Page | 3 4 5 6 7 8 9 10 11 12 13 14  | Next Page >

  • Temporary storage for keeping data between program iterations?

    - by mr.b
    I am working on an application that works like this: It fetches data from many sources, resulting in pool of about 500,000-1,500,000 records (depends on time/day) Data is parsed Part of data is processed in a way to compare it to pre-existing data (read from database), calculations are made, and stored in database. Resulting dataset that has to be stored in database is, however, much smaller in size (compared to original data set), and ranges from 5,000-50,000 records. This process almost always updates existing data, perhaps adds few more records. Then, data from step 2 should be kept somehow, somewhere, so that next time data is fetched, there is a data set which can be used to perform calculations, without touching pre-existing data in database. I should point out that this data can be lost, it's not irreplaceable (key information can be read from database if needed), but it would speed up the process next time. Application components can (and will be) run off different computers (in the same network), so storage has to be reachable from multiple hosts. I have considered using memcached, but I'm not quite sure should I do so, because one record is usually no smaller than 200 bytes, and if I have 1,500,000 records, I guess that it would amount to over 300 MB of memcached cache... But that doesn't seem scalable to me - what if data was 5x that amount? If it were to consume 1-2 GB of cache only to keep data in between iterations (which could easily happen)? So, the question is: which temporary storage mechanism would be most suitable for this kind of processing? I haven't considered using mysql temporary tables, as I'm not sure if they can persist between sessions, and be used by other hosts in network... Any other suggestion? Something I should consider?

    Read the article

  • Why are data structures so important in interviews?

    - by Vamsi Emani
    I am a newbie into the corporate world recently graduated in computers. I am a java/groovy developer. I am a quick learner and I can learn new frameworks, APIs or even programming languages within considerably short amount of time. Albeit that, I must confess that I was not so strong in data structures when I graduated out of college. Through out the campus placements during my graduation, I've witnessed that most of the biggie tech companies like Amazon, Microsoft etc focused mainly on data structures. It appears as if data structures is the only thing that they expect from a graduate. Adding to this, I see that there is this general perspective that a good programmer is necessarily a one with good knowledge about data structures. To be honest, I felt bad about that. I write good code. I follow standard design patterns of coding, I do use data structures but at the superficial level as in java exposed APIs like ArrayLists, LinkedLists etc. But the companies usually focused on the intricate aspects of Data Structures like pointer based memory manipulation and time complexities. Probably because of my java-ish background, Back then, I understood code efficiency and logic only when talked in terms of Object Oriented Programming like Objects, instances, etc but I never drilled down into the level of bits and bytes. I did not want people to look down upon me for this knowledge deficit of mine in Data Structures. So really why all this emphasis on Data Structures? Does, Not having knowledge in Data Structures really effect one's career in programming? Or is the knowledge in this subject really a sufficient basis to differentiate a good and a bad programmer?

    Read the article

  • Data structure for pattern matching.

    - by alvonellos
    Let's say you have an input file with many entries like these: date, ticker, open, high, low, close, <and some other values> And you want to execute a pattern matching routine on the entries(rows) in that file, using a candlestick pattern, for example. (See, Doji) And that pattern can appear on any uniform time interval (let t = 1s, 5s, 10s, 1d, 7d, 2w, 2y, and so on...). Say a pattern matching routine can take an arbitrary number of rows to perform an analysis and contain an arbitrary number of subpatterns. In other words, some patterns may require 4 entries to operate on. Say also that the routine (may) later have to find and classify extrema (local and global maxima and minima as well as inflection points) for the ticker over a closed interval, for example, you could say that a cubic function (x^3) has the extrema on the interval [-1, 1]. (See link) What would be the most natural choice in terms of a data structure? What about an interface that conforms a Ticker object containing one row of data to a collection of Ticker so that an arbitrary pattern can be applied to the data. What's the first thing that comes to mind? I chose a doubly-linked circular linked list that has the following methods: push_front() push_back() pop_front() pop_back() [] //overloaded, can be used with negative parameters But that data structure seems very clumsy, since so much pushing and popping is going on, I have to make a deep copy of the data structure before running an analysis on it. So, I don't know if I made my question very clear -- but the main points are: What kind of data structures should be considered when analyzing sequential data points to conform to a pattern that does NOT require random access? What kind of data structures should be considered when classifying extrema of a set of data points?

    Read the article

  • Five Key Strategies in Master Data Management

    - by david.butler(at)oracle.com
    Here is a very interesting Profit Magazine article on MDM: A recent customer survey reveals the deleterious effects of data fragmentation. by Trevor Naidoo, December 2010   Across industries and geographies, IT organizations have grown in complexity, whether due to mergers and acquisitions, or decentralized systems supporting functional or departmental requirements. With systems architected over time to support unique, one-off process needs, they are becoming costly to maintain, and the Internet has only further added to the complexity. Data fragmentation has become a key inhibitor in delivering flexible, user-friendly systems. The Oracle Insight team conducted a survey assessing customers' master data management (MDM) capabilities over the past two years to get a sense of where they are in terms of their capabilities. The responses, by 27 respondents from six different industries, reveal five key areas in which customers need to improve their data management in order to get better financial results. 1. Less than 15 percent of organizations surveyed understand the sources and quality of their master data, and have a roadmap to address missing data domains. Examples of the types of master data domains referred to are customer, supplier, product, financial and site. Many organizations have multiple sources of master data with varying degrees of data quality in each source -- customer data stored in the customer relationship management system is inconsistent with customer data stored in the order management system. Imagine not knowing how many places you stored your customer information, and whether a customer's address was the most up to date in each source. In fact, more than 55 percent of the respondents in the survey manage their data quality on an ad-hoc basis. It is important for organizations to document their inventory of data sources and then profile these data sources to ensure that there is a consistent definition of key data entities throughout the organization. Some questions to ask are: How do we define a customer? What is a product? How do we define a site? The goal is to strive for one common repository for master data that acts as a cross reference for all other sources and ensures consistent, high-quality master data throughout the organization. 2. Only 18 percent of respondents have an enterprise data management strategy to ensure that data is treated as an asset to the organization. Most respondents handle data at the department or functional level and do not have an enterprise view of their master data. The sales department may track all their interactions with customers as they move through the sales cycle, the service department is tracking their interactions with the same customers independently, and the finance department also has a different perspective on the same customer. The salesperson may not be aware that the customer she is trying to sell to is experiencing issues with existing products purchased, or that the customer is behind on previous invoices. The lack of a data strategy makes it difficult for business users to turn data into information via reports. Without the key building blocks in place, it is difficult to create key linkages between customer, product, site, supplier and financial data. These linkages make it possible to understand patterns. A well-defined data management strategy is aligned to the business strategy and helps create the governance needed to ensure that data stewardship is in place and data integrity is intact. 3. Almost 60 percent of respondents have no strategy to integrate data across operational applications. Many respondents have several disparate sources of data with no strategy to keep them in sync with each other. Even though there is no clear strategy to integrate the data (see #2 above), the data needs to be synced and cross-referenced to keep the business processes running. About 55 percent of respondents said they perform this integration on an ad hoc basis, and in many cases, it is done manually with the help of Microsoft Excel spreadsheets. For example, a salesperson needs a report on global sales for a specific product, but the product has different product numbers in different countries. Typically, an analyst will pull all the data into Excel, manually create a cross reference for that product, and then aggregate the sales. The exact same procedure has to be followed if the same report is needed the following month. A well-defined consolidation strategy will ensure that a central cross-reference is maintained with updates in any one application being propagated to all the other systems, so that data is synchronized and up to date. This can be done in real time or in batch mode using integration technology. 4. Approximately 50 percent of respondents spend manual efforts cleansing and normalizing data. Information stored in various systems usually follows different standards and formats, making it difficult to match the data. A customer's address can be stored in different ways using a variety of abbreviations -- for example, "av" or "ave" for avenue. Similarly, a product's attributes can be stored in a number of different ways; for example, a size attribute can be stored in inches and can also be entered as "'' ". These types of variations make it difficult to match up data from different sources. Today, most customers rely on manual, heroic efforts to match, cleanse, and de-duplicate data -- clearly not a scalable, sustainable model. To solve this challenge, organizations need the ability to standardize data for customers, products, sites, suppliers and financial accounts; however, less than 10 percent of respondents have technology in place to automatically resolve duplicates. It is no wonder, therefore, that we get communications about products we don't own, at addresses we don't reside, and using channels (like direct mail) we don't like. An all-too-common example of a potential challenge follows: Customers end up receiving duplicate communications, which not only impacts customer satisfaction, but also incurs additional mailing costs. Cleansing, normalizing, and standardizing data will help address most of these issues. 5. Only 10 percent of respondents have the ability to share data that was mastered in a master data hub. Close to 60 percent of respondents have efforts in place that profile, standardize and cleanse data manually, and the output of these efforts are stored in spreadsheets in various parts of the organization. This valuable information is not easily shared with the rest of the organization and, more importantly, this enriched information cannot be sent back to the source systems so that the data is fixed at the source. A key benefit of a master data management strategy is not only to clean the data, but to also share the data back to the source systems as well as other systems that need the information. Aside from the source systems, another key beneficiary of this data is the business intelligence system. Having clean master data as input to business intelligence systems provides more accurate and enhanced reporting.  Characteristics of Stellar MDM When deciding on the right master data management technology, organizations should look for solutions that have four main characteristics: enterprise-grade MDM performance complete technology that can be rapidly deployed and addresses multiple business issues end-to-end MDM process management with data quality monitoring and assurance pre-built MDM business relevant applications with data stores and workflows These master data management capabilities will aid in moving closer to a best-practice maturity level, delivering tremendous efficiencies and savings as well as revenue growth opportunities as a result of better understanding your customers.  Trevor Naidoo is a senior director in Industry Strategy and Insight at Oracle. 

    Read the article

  • Address Regulatory Mandates for Data Encryption Without Changing Your Applications

    - by Troy Kitch
    The Payment Card Industry Data Security Standard, US state-level data breach laws, and numerous data privacy regulations worldwide all call for data encryption to protect personally identifiable information (PII). However encrypting PII data in applications requires costly and complex application changes. Fortunately, since this data typically resides in the application database, using Oracle Advanced Security, PII can be encrypted transparently by the Oracle database without any application changes. In this ISACA webinar, learn how Oracle Advanced Security offers complete encryption for data at rest, in transit, and on backups, along with built-in key management to help organizations meet regulatory requirements and save money. You will also hear from TransUnion Interactive, the consumer subsidiary of TransUnion, a global leader in credit and information management, which maintains credit histories on an estimated 500 million consumers across the globe, about how they addressed PCI DSS encryption requirements using Oracle Database 11g with Oracle Advanced Security. Register to watch the webinar now.

    Read the article

  • Using Hadooop (HDInsight) with Microsoft - Two (OK, Three) Options

    - by BuckWoody
    Microsoft has many tools for “Big Data”. In fact, you need many tools – there’s no product called “Big Data Solution” in a shrink-wrapped box – if you find one, you probably shouldn’t buy it. It’s tempting to want a single tool that handles everything in a problem domain, but with large, complex data, that isn’t a reality. You’ll mix and match several systems, open and closed source, to solve a given problem. But there are tools that help with handling data at large, complex scales. Normally the best way to do this is to break up the data into parts, and then put the calculation engines for that chunk of data right on the node where the data is stored. These systems are in a family called “Distributed File and Compute”. Microsoft has a couple of these, including the High Performance Computing edition of Windows Server. Recently we partnered with Hortonworks to bring the Apache Foundation’s release of Hadoop to Windows. And as it turns out, there are actually two (technically three) ways you can use it. (There’s a more detailed set of information here: http://www.microsoft.com/sqlserver/en/us/solutions-technologies/business-intelligence/big-data.aspx, I’ll cover the options at a general level below)  First Option: Windows Azure HDInsight Service  Your first option is that you can simply log on to a Hadoop control node and begin to run Pig or Hive statements against data that you have stored in Windows Azure. There’s nothing to set up (although you can configure things where needed), and you can send the commands, get the output of the job(s), and stop using the service when you are done – and repeat the process later if you wish. (There are also connectors to run jobs from Microsoft Excel, but that’s another post)   This option is useful when you have a periodic burst of work for a Hadoop workload, or the data collection has been happening into Windows Azure storage anyway. That might be from a web application, the logs from a web application, telemetrics (remote sensor input), and other modes of constant collection.   You can read more about this option here:  http://blogs.msdn.com/b/windowsazure/archive/2012/10/24/getting-started-with-windows-azure-hdinsight-service.aspx Second Option: Microsoft HDInsight Server Your second option is to use the Hadoop Distribution for on-premises Windows called Microsoft HDInsight Server. You set up the Name Node(s), Job Tracker(s), and Data Node(s), among other components, and you have control over the entire ecostructure.   This option is useful if you want to  have complete control over the system, leave it running all the time, or you have a huge quantity of data that you have to bulk-load constantly – something that isn’t going to be practical with a network transfer or disk-mailing scheme. You can read more about this option here: http://www.microsoft.com/sqlserver/en/us/solutions-technologies/business-intelligence/big-data.aspx Third Option (unsupported): Installation on Windows Azure Virtual Machines  Although unsupported, you could simply use a Windows Azure Virtual Machine (we support both Windows and Linux servers) and install Hadoop yourself – it’s open-source, so there’s nothing preventing you from doing that.   Aside from being unsupported, there are other issues you’ll run into with this approach – primarily involving performance and the amount of configuration you’ll need to do to access the data nodes properly. But for a single-node installation (where all components run on one system) such as learning, demos, training and the like, this isn’t a bad option. Did I mention that’s unsupported? :) You can learn more about Windows Azure Virtual Machines here: http://www.windowsazure.com/en-us/home/scenarios/virtual-machines/ And more about Hadoop and the installation/configuration (on Linux) here: http://en.wikipedia.org/wiki/Apache_Hadoop And more about the HDInsight installation here: http://www.microsoft.com/web/gallery/install.aspx?appid=HDINSIGHT-PREVIEW Choosing the right option Since you have two or three routes you can go, the best thing to do is evaluate the need you have, and place the workload where it makes the most sense.  My suggestion is to install the HDInsight Server locally on a test system, and play around with it. Read up on the best ways to use Hadoop for a given workload, understand the parts, write a little Pig and Hive, and get your feet wet. Then sign up for a test account on HDInsight Service, and see how that leverages what you know. If you're a true tinkerer, go ahead and try the VM route as well. Oh - there’s another great reference on the Windows Azure HDInsight that just came out, here: http://blogs.msdn.com/b/brunoterkaly/archive/2012/11/16/hadoop-on-azure-introduction.aspx  

    Read the article

  • Oracle - A Leader in Gartner's MQ for Master Data Management for Customer Data

    - by Mala Narasimharajan
      The Gartner MQ report for Master Data Management of Customer Data Solutions is released and we're proud to say that Oracle is in the leaders' quadrant.  Here's a snippet from the report itself:  " “Oracle has a strong, though complex, portfolio of domain-specific MDM products that include prepackaged data models. Gartner estimates that Oracle now has over 1,500 licensed MDM customers, including 650 customers managing customer data. The MDM portfolio includes three products that address MDM of customer data solution needs: Oracle Fusion Customer Hub (FCH), Oracle CDH and Oracle Siebel UCM. These three MDM products are positioned for different segments of the market and Oracle is progressively moving all three products onto a common MDM technology platform..." (Gartner, Oct 18, 2012)  For more information on Oracle's solutions for customer data in Master Data Management, click here.  

    Read the article

  • Almost Realtime Data and Web application

    - by Chris G.
    I have a computer that is recording 100 different data points into an OPC server. I've written a simple OPC client that can read all of this data. I have a front-end website on a different network that I would like to consume this data. I could easily set the OPC client to send the data to a SQL server and the website could read from it, but that would be a lot of writes. If I wanted the data to be updated every 10 seconds I'd be writing to the database every 10 seconds. (I could probably just serialize the 100 points to get 1 write / 10 seconds but that would also limit my ability to search the data later). This solution wouldn't scale very well. If I had 100 of these computers the situation would quickly grow out of hand. Obviously I am well out of my league here and I have no experience with working with a large amount of data like this. What are my options and what should I research?

    Read the article

  • SQL SERVER – Integrate Your Data with Skyvia – Cloud ETL Solution

    - by Pinal Dave
    In our days data integration often becomes a key aspect of business success. For business analysts it’s very important to get integrated data from various sources, such as relational databases, cloud CRMs, etc. to make correct and successful decisions. There are various data integration solutions on market, and today I will tell about one of them – Skyvia. Skyvia is a cloud data integration service, which allows integrating data in cloud CRMs and different relational databases. It is a completely online solution and does not require anything except for a browser. Skyvia provides powerful etl tools for data import, export, replication, and synchronization for SQL Server and other databases and cloud CRMs. You can use Skyvia data import tools to load data from various sources to SQL Server (and SQL Azure). Skyvia supports such cloud CRMs as Salesforce and Microsoft Dynamics CRM and such databases as MySQL and PostgreSQL. You even can migrate data from SQL Server to SQL Server, or from SQL Server to other databases and cloud CRMs. Additionally Skyvia supports import of CSV files, either uploaded manually or stored on cloud file storage services, such as Dropbox, Box, Google Drive, or FTP servers. When data import is not enough, Skyvia offers bidirectional data synchronization. With this tool, you can synchronize SQL Server data with other databases and cloud CRMs. After performing the first synchronization, Skyvia tracks data changes in the synchronized data storages. In SQL Server databases (and other relational databases) it creates additional tracking tables and triggers. This allows synchronizing only the changed data. Skyvia also maps records by their primary key values to each other, so it does not require different sources to have the same primary key structure. It still can match the corresponding records without having to add any additional columns or changing data structure. The only requirement for synchronization is that primary keys must be autogenerated. With Skyvia it’s not necessary for data to have the same structure in integrated data storages. Skyvia supports powerful mapping mechanisms that allow synchronizing data with completely different structure. It provides support for complex mathematical and string expressions when mapping data, using lookups, etc. You may use data splitting – loading data from a single CSV file or source table to multiple related target tables. Or you may load data from several source CSV files or tables to several related target tables. In each case Skyvia preserves data relations. It builds corresponding relations between the target data automatically. When you often work with cloud CRM data, native CRM data reporting and analysis tools may be not enough for you. And there is a vast set of professional data analysis and reporting tools available for SQL Server. With Skyvia you can quickly copy your cloud CRM data to an SQL Server database and apply corresponding SQL Server tools to the data. In such case you can use Skyvia data replication tools. It allows you to quickly copy cloud CRM data to SQL Server or other databases without customizing any mapping. You need just to specify columns to copy data from. Target database tables will be created automatically. Skyvia offers powerful filtering settings to replicate only the records you need. Skyvia also provides capability to export data from SQL Server (including SQL Azure) and other databases and cloud CRMs to CSV files. These files can be either downloadable manually or loaded to cloud file storages or FTP server. You can use export, for example, to backup SQL Azure data to Dropbox. Any data integration operation can be scheduled for automatic execution. Thus, you can automate your SQL Azure data backup or data synchronization – just configure it once, then schedule it, and benefit from automatic data integration with Skyvia. Currently registration and using Skyvia is completely free, so you can try it yourself and find out whether its data migration and integration tools suits for you. Visit this link to register on Skyvia: https://app.skyvia.com/register Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: Cloud Computing

    Read the article

  • SQL SERVER – Why Do We Need Master Data Management – Importance and Significance of Master Data Management (MDM)

    - by pinaldave
    Let me paint a picture of everyday life for you.  Let’s say you and your wife both have address books for your groups of friends.  There is definitely overlap between them, so that you both have the addresses for your mutual friends, and there are addresses that only you know, and some only she knows.  They also might be organized differently.  You might list your friend under “J” for “Joe” or even under “W” for “Work,” while she might list him under “S” for “Joe Smith” or under your name because he is your friend.  If you happened to trade, neither of you would be able to find anything! This is where data management would be very important.  If you were to consolidate into one address book, you would have to set rules about how to organize the book, and both of you would have to follow them.  You would also make sure that poor Joe doesn’t get entered twice under “J” and under “S.” This might be a familiar situation to you, whether you are thinking about address books, record collections, books, or even shopping lists.  Wherever there is a lot of data to consolidate, you are going to run into problems unless everyone is following the same rules. I’m sure that my readers can figure out where I am going with this.  What is SQL Server but a computerized way to organize data?  And Microsoft is making it easier and easier to get all your “addresses” into one place.  In the  2008 version of SQL they introduced a new tool called Master Data Services (MDS) for Master Data Management, and they have improved it for the new 2012 version. MDM was hailed as a major improvement for business intelligence.  You might not think that an organizational system is terribly exciting, but think about the kind of “address books” a company might have.  Many companies have lots of important information, like addresses, credit card numbers, purchase history, and so much more.  To organize all this efficiently so that customers are well cared for and properly billed (only once, not never or multiple times!) is a major part of business intelligence. MDM comes into play because it will comb through these mountains of data and make sure that all the information is consistent, accurate, and all placed in one database so that employees don’t have to search high and low and waste their time. MDM also has operational MDM functions.  This is not a redundancy.  Operational MDM means that when one employee updates one bit of information in the database, for example – updating a new address for a customer, operational MDM ensures that this address is updated throughout the system so that all departments will have the correct information. Another cool thing about MDM is that it features Master Data Services Configuration Manager, which is exactly what it sounds like.  It has a built-in “helper” that lets you set up your database quickly, easily, and with the correct configurations.  While talking about cool features, I can’t skip over the add-in for Excel.  This allows you to link certain data to Excel files for easier sharing and uploading. In summary, I want to emphasize that the scariest part of the database is slowly disappearing.  Everyone knows that a database – one consolidated area for all your data – is a good idea, but the idea of setting one up is daunting.  But SQL Server is making data management easier and easier with features like Master Data Services (MDS). Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: Master Data Services, MDM

    Read the article

  • Data breakpoints to find points where data gets broken

    - by raccoon_tim
    When working with a large code base, finding reasons for bizarre bugs can often be like finding a needle in a hay stack. Finding out why an object gets corrupted without no apparent reason can be quite daunting, especially when it seems to happen randomly and totally out of context. Scenario Take the following scenario as an example. You have defined the a class that contains an array of characters that is 256 characters long. You now implement a method for filling this buffer with a string passed as an argument. At this point you mistakenly expect the buffer to be 256 characters long. At some point you notice that you require another character buffer and you add that after the previous one in the class definition. You now figure that you don’t need the 256 characters that the first member can hold and you shorten that to 128 to conserve space. At this point you should start thinking that you also have to modify the method defined above to safeguard against buffer overflow. It so happens, however, that in this not so perfect world this does not cross your mind. Buffer overflow is one of the most frequent sources for errors in a piece of software and often one of the most difficult ones to detect, especially when data is read from an outside source. Many mass copy functions provided by the C run-time provide versions that have boundary checking (defined with the _s suffix) but they can not guard against hard coded buffer lengths that at some point get changed. Finding the bug Getting back to the scenario, you’re now wondering why does the second string get modified with data that makes no sense at all. Luckily, Visual Studio provides you with a tool to help you with finding just these kinds of errors. It’s called data breakpoints. To add a data breakpoint, you first run your application in debug mode or attach to it in the usual way, and then go to Debug, select New Breakpoint and New Data Breakpoint. In the popup that opens, you can type in the memory address and the amount of bytes you wish to monitor. You can also use an expression here, but it’s often difficult to come up with an expression for data in an object allocated on the heap when not in the context of a certain stack frame. There are a couple of things to note about data breakpoints, however. First of all, Visual Studio supports a maximum of four data breakpoints at any given time. Another important thing to notice is that some C run-time functions modify memory in kernel space which does not trigger the data breakpoint. For instance, calling ReadFile on a buffer that is monitored by a data breakpoint will not trigger the breakpoint. The application will now break at the address you specified it to. Often you might immediately spot the issue but the very least this feature can do is point you in the right direction in search for the real reason why the memory gets inadvertently modified. Conclusions Data breakpoints are a great feature, especially when doing a lot of low level operations where multiple locations modify the same data. With the exception of some special cases, like kernel memory modification, you can use it whenever you need to check when memory at a certain location gets changed on purpose or inadvertently.

    Read the article

  • Metro: Declarative Data Binding

    - by Stephen.Walther
    The goal of this blog post is to describe how declarative data binding works in the WinJS library. In particular, you learn how to use both the data-win-bind and data-win-bindsource attributes. You also learn how to use calculated properties and converters to format the value of a property automatically when performing data binding. By taking advantage of WinJS data binding, you can use the Model-View-ViewModel (MVVM) pattern when building Metro style applications with JavaScript. By using the MVVM pattern, you can prevent your JavaScript code from spinning into chaos. The MVVM pattern provides you with a standard pattern for organizing your JavaScript code which results in a more maintainable application. Using Declarative Bindings You can use the data-win-bind attribute with any HTML element in a page. The data-win-bind attribute enables you to bind (associate) an attribute of an HTML element to the value of a property. Imagine, for example, that you want to create a product details page. You want to show a product object in a page. In that case, you can create the following HTML page to display the product details: <!DOCTYPE html> <html> <head> <meta charset="utf-8"> <title>Application1</title> <!-- WinJS references --> <link href="//Microsoft.WinJS.0.6/css/ui-dark.css" rel="stylesheet"> <script src="//Microsoft.WinJS.0.6/js/base.js"></script> <script src="//Microsoft.WinJS.0.6/js/ui.js"></script> <!-- Application1 references --> <link href="/css/default.css" rel="stylesheet"> <script src="/js/default.js"></script> </head> <body> <h1>Product Details</h1> <div class="field"> Product Name: <span data-win-bind="innerText:name"></span> </div> <div class="field"> Product Price: <span data-win-bind="innerText:price"></span> </div> <div class="field"> Product Picture: <br /> <img data-win-bind="src:photo;alt:name" /> </div> </body> </html> The HTML page above contains three data-win-bind attributes – one attribute for each product property displayed. You use the data-win-bind attribute to set properties of the HTML element associated with the data-win-attribute. The data-win-bind attribute takes a semicolon delimited list of element property names and data source property names: data-win-bind=”elementPropertyName:datasourcePropertyName; elementPropertyName:datasourcePropertyName;…” In the HTML page above, the first two data-win-bind attributes are used to set the values of the innerText property of the SPAN elements. The last data-win-bind attribute is used to set the values of the IMG element’s src and alt attributes. By the way, using data-win-bind attributes is perfectly valid HTML5. The HTML5 standard enables you to add custom attributes to an HTML document just as long as the custom attributes start with the prefix data-. So you can add custom attributes to an HTML5 document with names like data-stephen, data-funky, or data-rover-dog-is-hungry and your document will validate. The product object displayed in the page above with the data-win-bind attributes is created in the default.js file: (function () { "use strict"; var app = WinJS.Application; app.onactivated = function (eventObject) { if (eventObject.detail.kind === Windows.ApplicationModel.Activation.ActivationKind.launch) { var product = { name: "Tesla", price: 80000, photo: "/images/TeslaPhoto.png" }; WinJS.Binding.processAll(null, product); } }; app.start(); })(); In the code above, a product object is created with a name, price, and photo property. The WinJS.Binding.processAll() method is called to perform the actual binding (Don’t confuse WinJS.Binding.processAll() and WinJS.UI.processAll() – these are different methods). The first parameter passed to the processAll() method represents the root element for the binding. In other words, binding happens on this element and its child elements. If you provide the value null, then binding happens on the entire body of the document (document.body). The second parameter represents the data context. This is the object that has the properties which are displayed with the data-win-bind attributes. In the code above, the product object is passed as the data context parameter. Another word for data context is view model.  Creating Complex View Models In the previous section, we used the data-win-bind attribute to display the properties of a simple object: a single product. However, you can use binding with more complex view models including view models which represent multiple objects. For example, the view model in the following default.js file represents both a customer and a product object. Furthermore, the customer object has a nested address object: (function () { "use strict"; var app = WinJS.Application; app.onactivated = function (eventObject) { if (eventObject.detail.kind === Windows.ApplicationModel.Activation.ActivationKind.launch) { var viewModel = { customer: { firstName: "Fred", lastName: "Flintstone", address: { street: "1 Rocky Way", city: "Bedrock", country: "USA" } }, product: { name: "Bowling Ball", price: 34.55 } }; WinJS.Binding.processAll(null, viewModel); } }; app.start(); })(); The following page displays the customer (including the customer address) and the product. Notice that you can use dot notation to refer to child objects in a view model such as customer.address.street. <!DOCTYPE html> <html> <head> <meta charset="utf-8"> <title>Application1</title> <!-- WinJS references --> <link href="//Microsoft.WinJS.0.6/css/ui-dark.css" rel="stylesheet"> <script src="//Microsoft.WinJS.0.6/js/base.js"></script> <script src="//Microsoft.WinJS.0.6/js/ui.js"></script> <!-- Application1 references --> <link href="/css/default.css" rel="stylesheet"> <script src="/js/default.js"></script> </head> <body> <h1>Customer Details</h1> <div class="field"> First Name: <span data-win-bind="innerText:customer.firstName"></span> </div> <div class="field"> Last Name: <span data-win-bind="innerText:customer.lastName"></span> </div> <div class="field"> Address: <address> <span data-win-bind="innerText:customer.address.street"></span> <br /> <span data-win-bind="innerText:customer.address.city"></span> <br /> <span data-win-bind="innerText:customer.address.country"></span> </address> </div> <h1>Product</h1> <div class="field"> Name: <span data-win-bind="innerText:product.name"></span> </div> <div class="field"> Price: <span data-win-bind="innerText:product.price"></span> </div> </body> </html> A view model can be as complicated as you need and you can bind the view model to a view (an HTML document) by using declarative bindings. Creating Calculated Properties You might want to modify a property before displaying the property. For example, you might want to format the product price property before displaying the property. You don’t want to display the raw product price “80000”. Instead, you want to display the formatted price “$80,000”. You also might need to combine multiple properties. For example, you might need to display the customer full name by combining the values of the customer first and last name properties. In these situations, it is tempting to call a function when performing binding. For example, you could create a function named fullName() which concatenates the customer first and last name. Unfortunately, the WinJS library does not support the following syntax: <span data-win-bind=”innerText:fullName()”></span> Instead, in these situations, you should create a new property in your view model that has a getter. For example, the customer object in the following default.js file includes a property named fullName which combines the values of the firstName and lastName properties: (function () { "use strict"; var app = WinJS.Application; app.onactivated = function (eventObject) { if (eventObject.detail.kind === Windows.ApplicationModel.Activation.ActivationKind.launch) { var customer = { firstName: "Fred", lastName: "Flintstone", get fullName() { return this.firstName + " " + this.lastName; } }; WinJS.Binding.processAll(null, customer); } }; app.start(); })(); The customer object has a firstName, lastName, and fullName property. Notice that the fullName property is defined with a getter function. When you read the fullName property, the values of the firstName and lastName properties are concatenated and returned. The following HTML page displays the fullName property in an H1 element. You can use the fullName property in a data-win-bind attribute in exactly the same way as any other property. <!DOCTYPE html> <html> <head> <meta charset="utf-8"> <title>Application1</title> <!-- WinJS references --> <link href="//Microsoft.WinJS.0.6/css/ui-dark.css" rel="stylesheet"> <script src="//Microsoft.WinJS.0.6/js/base.js"></script> <script src="//Microsoft.WinJS.0.6/js/ui.js"></script> <!-- Application1 references --> <link href="/css/default.css" rel="stylesheet"> <script src="/js/default.js"></script> </head> <body> <h1 data-win-bind="innerText:fullName"></h1> <div class="field"> First Name: <span data-win-bind="innerText:firstName"></span> </div> <div class="field"> Last Name: <span data-win-bind="innerText:lastName"></span> </div> </body> </html> Creating a Converter In the previous section, you learned how to format the value of a property by creating a property with a getter. This approach makes sense when the formatting logic is specific to a particular view model. If, on the other hand, you need to perform the same type of formatting for multiple view models then it makes more sense to create a converter function. A converter function is a function which you can apply whenever you are using the data-win-bind attribute. Imagine, for example, that you want to create a general function for displaying dates. You always want to display dates using a short format such as 12/25/1988. The following JavaScript file – named converters.js – contains a shortDate() converter: (function (WinJS) { var shortDate = WinJS.Binding.converter(function (date) { return date.getMonth() + 1 + "/" + date.getDate() + "/" + date.getFullYear(); }); // Export shortDate WinJS.Namespace.define("MyApp.Converters", { shortDate: shortDate }); })(WinJS); The file above uses the Module Pattern, a pattern which is used through the WinJS library. To learn more about the Module Pattern, see my blog entry on namespaces and modules: http://stephenwalther.com/blog/archive/2012/02/22/windows-web-applications-namespaces-and-modules.aspx The file contains the definition for a converter function named shortDate(). This function converts a JavaScript date object into a short date string such as 12/1/1988. The converter function is created with the help of the WinJS.Binding.converter() method. This method takes a normal function and converts it into a converter function. Finally, the shortDate() converter is added to the MyApp.Converters namespace. You can call the shortDate() function by calling MyApp.Converters.shortDate(). The default.js file contains the customer object that we want to bind. Notice that the customer object has a firstName, lastName, and birthday property. We will use our new shortDate() converter when displaying the customer birthday property: (function () { "use strict"; var app = WinJS.Application; app.onactivated = function (eventObject) { if (eventObject.detail.kind === Windows.ApplicationModel.Activation.ActivationKind.launch) { var customer = { firstName: "Fred", lastName: "Flintstone", birthday: new Date("12/1/1988") }; WinJS.Binding.processAll(null, customer); } }; app.start(); })(); We actually use our shortDate converter in the HTML document. The following HTML document displays all of the customer properties: <!DOCTYPE html> <html> <head> <meta charset="utf-8"> <title>Application1</title> <!-- WinJS references --> <link href="//Microsoft.WinJS.0.6/css/ui-dark.css" rel="stylesheet"> <script src="//Microsoft.WinJS.0.6/js/base.js"></script> <script src="//Microsoft.WinJS.0.6/js/ui.js"></script> <!-- Application1 references --> <link href="/css/default.css" rel="stylesheet"> <script src="/js/default.js"></script> <script type="text/javascript" src="js/converters.js"></script> </head> <body> <h1>Customer Details</h1> <div class="field"> First Name: <span data-win-bind="innerText:firstName"></span> </div> <div class="field"> Last Name: <span data-win-bind="innerText:lastName"></span> </div> <div class="field"> Birthday: <span data-win-bind="innerText:birthday MyApp.Converters.shortDate"></span> </div> </body> </html> Notice the data-win-bind attribute used to display the birthday property. It looks like this: <span data-win-bind="innerText:birthday MyApp.Converters.shortDate"></span> The shortDate converter is applied to the birthday property when the birthday property is bound to the SPAN element’s innerText property. Using data-win-bindsource Normally, you pass the view model (the data context) which you want to use with the data-win-bind attributes in a page by passing the view model to the WinJS.Binding.processAll() method like this: WinJS.Binding.processAll(null, viewModel); As an alternative, you can specify the view model declaratively in your markup by using the data-win-datasource attribute. For example, the following default.js script exposes a view model with the fully-qualified name of MyWinWebApp.viewModel: (function () { "use strict"; var app = WinJS.Application; app.onactivated = function (eventObject) { if (eventObject.detail.kind === Windows.ApplicationModel.Activation.ActivationKind.launch) { // Create view model var viewModel = { customer: { firstName: "Fred", lastName: "Flintstone" }, product: { name: "Bowling Ball", price: 12.99 } }; // Export view model to be seen by universe WinJS.Namespace.define("MyWinWebApp", { viewModel: viewModel }); // Process data-win-bind attributes WinJS.Binding.processAll(); } }; app.start(); })(); In the code above, a view model which represents a customer and a product is exposed as MyWinWebApp.viewModel. The following HTML page illustrates how you can use the data-win-bindsource attribute to bind to this view model: <!DOCTYPE html> <html> <head> <meta charset="utf-8"> <title>Application1</title> <!-- WinJS references --> <link href="//Microsoft.WinJS.0.6/css/ui-dark.css" rel="stylesheet"> <script src="//Microsoft.WinJS.0.6/js/base.js"></script> <script src="//Microsoft.WinJS.0.6/js/ui.js"></script> <!-- Application1 references --> <link href="/css/default.css" rel="stylesheet"> <script src="/js/default.js"></script> </head> <body> <h1>Customer Details</h1> <div data-win-bindsource="MyWinWebApp.viewModel.customer"> <div class="field"> First Name: <span data-win-bind="innerText:firstName"></span> </div> <div class="field"> Last Name: <span data-win-bind="innerText:lastName"></span> </div> </div> <h1>Product</h1> <div data-win-bindsource="MyWinWebApp.viewModel.product"> <div class="field"> Name: <span data-win-bind="innerText:name"></span> </div> <div class="field"> Price: <span data-win-bind="innerText:price"></span> </div> </div> </body> </html> The data-win-bindsource attribute is used twice in the page above: it is used with the DIV element which contains the customer details and it is used with the DIV element which contains the product details. If an element has a data-win-bindsource attribute then all of the child elements of that element are affected. The data-win-bind attributes of all of the child elements are bound to the data source represented by the data-win-bindsource attribute. Summary The focus of this blog entry was data binding using the WinJS library. You learned how to use the data-win-bind attribute to bind the properties of an HTML element to a view model. We also discussed several advanced features of data binding. We examined how to create calculated properties by including a property with a getter in your view model. We also discussed how you can create a converter function to format the value of a view model property when binding the property. Finally, you learned how to use the data-win-bindsource attribute to specify a view model declaratively.

    Read the article

  • SQL Server Management Data Warehouse - quick tour on setting health monitoring policies

    - by ssqa.net
    Profiler, Perfmon, DMVs & scripts are legendary tools for a DBA to monitor the SQL arena. In line with these tools SQL Server 2008 throws a powerful stream with policy based management (PBM) framework & management data warehouse (MDW) methods, which is a relational database that contains the data that is collected from a server that is a data collection target. This data is used to generate the reports for the System Data collection sets, and can also be used to create custom reports. .....(read more)

    Read the article

  • Data Flow Diagrams - Difference between Lines and Arrows

    - by Howdy_McGee
    I'm currently working with Visio to create Data Flow Diagrams for a System Analysis and Design class but I'm unsure what the difference between ------ and ------> is. I can connect 2 shapes together with a line (process, entity, data store) but does the single line connecting the two mean data flow? Do I need to explicitly use the data flow arrow to show which way data is flowing? (There doesn't seem to be tags for this topic, maybe im in the wrong place?)

    Read the article

  • Big Data Videos

    - by Jean-Pierre Dijcks
    You can view them all on YouTube using the following links: Overview for the Boss: http://youtu.be/ikJyrmKdJWc Hadoop: http://youtu.be/acWtid-OOWM Acquiring Big Data: http://youtu.be/TfuhuA_uaho Organizing Big Data: http://youtu.be/IC6jVRO2Hq4 Analyzing Big Data: http://youtu.be/2yf_jrBhz5w These videos are a great place to start learning about big data, the value it can bring to your organisation and how Oracle can help you start working with big data today.

    Read the article

  • SQL Server and the XML Data Type : Data Manipulation

    The introduction of the xml data type, with its own set of methods for processing xml data, made it possible for SQL Server developers to create columns and variables of the type xml. Deanna Dicken examines the modify() method, which provides for data manipulation of the XML data stored in the xml data type via XML DML statements.

    Read the article

  • Behavior of <- NULL on lists versus data.frames for removing data

    - by Ananda Mahto
    Many R users eventually figure out lots of ways to remove elements from their data. One way is to use NULL, particularly when you want to do something like drop a column from a data.frame or drop an element from a list. Eventually, a user comes across a situation where they want to drop several columns from a data.frame at once, and they hit upon <- list(NULL) as the solution (since using <- NULL will result in an error). A data.frame is a special type of list, so it wouldn't be too tough to imagine that the approaches for removing items from a list should be the same as removing columns from a data.frame. However, they produce different results, as can be seen in the example below. ## Make some small data--two data.frames and two lists cars1 <- cars2 <- head(mtcars)[1:4] cars3 <- cars4 <- as.list(cars2) ## Demonstration that the `list(NULL)` approach works cars1[c("mpg", "cyl")] <- list(NULL) cars1 # disp hp # Mazda RX4 160 110 # Mazda RX4 Wag 160 110 # Datsun 710 108 93 # Hornet 4 Drive 258 110 # Hornet Sportabout 360 175 # Valiant 225 105 ## Demonstration that simply using `NULL` does not work cars2[c("mpg", "cyl")] <- NULL # Error in `[<-.data.frame`(`*tmp*`, c("mpg", "cyl"), value = NULL) : # replacement has 0 items, need 12 Switch to applying the same concept to a list, and compare the difference in behavior. ## Does not fully drop the items, but sets them to `NULL` cars3[c("mpg", "cyl")] <- list(NULL) # $mpg # NULL # # $cyl # NULL # # $disp # [1] 160 160 108 258 360 225 # # $hp # [1] 110 110 93 110 175 105 ## *Does* drop the `list` items while this would ## have produced an error with a `data.frame` cars4[c("mpg", "cyl")] <- NULL # $disp # [1] 160 160 108 258 360 225 # # $hp # [1] 110 110 93 110 175 105 The main questions I have are, if a data.frame is a list, why does it behave so differently in this scenario? Is there a foolproof way of knowing when an element will be dropped, when it will produce an error, and when it will simply be given a NULL value? Or do we depend on trial-and-error for this?

    Read the article

  • How does jQuery stores data with .data()?

    - by TK
    I am a little confused how jQuery stores data with .data() functions. Is this something called expando? Or is this using HTML5 Web Storage although I think this is very unlikely? The documentation says: The .data() method allows us to attach data of any type to DOM elements in a way that is safe from circular references and therefore from memory leaks. As I read about expando, it seems to have a rick of memory leak. Unfortunately my skills are not enough to read and understand jQuery code itself, but I want to know how jQuery stores such data by using data(). http://api.jquery.com/data/

    Read the article

  • ASP.Net Layered app - Share Entity Data Model amongst layers

    - by Chris Klepeis
    How can I share the auto-generated entity data model (generated object classes) amongst all layers of my C# web app whilst only granting query access in the data layer? This uses the typical 3 layer approach: data, business, presentation. My data layer returns an IEnumerable<T> to my business layer, but I cannot return type T to the presentation layer because I do not want the presentation layer to know of the existence of the data layer - which is where the entity framework auto-generated my classes. It was recommended to have a seperate layer with just the data model, but I'm unsure how to seperate the data model from the query functionality the entity framework provides.

    Read the article

  • How does jQuery store data with .data()?

    - by TK
    I am a little confused how jQuery stores data with .data() functions. Is this something called expando? Or is this using HTML5 Web Storage although I think this is very unlikely? The documentation says: The .data() method allows us to attach data of any type to DOM elements in a way that is safe from circular references and therefore from memory leaks. As I read about expando, it seems to have a rick of memory leak. Unfortunately my skills are not enough to read and understand jQuery code itself, but I want to know how jQuery stores such data by using data(). http://api.jquery.com/data/

    Read the article

  • Accessing and Updating Data in ASP.NET: Filtering Data Using a CheckBoxList

    Filtering Database Data with Parameters, an earlier installment in this article series, showed how to filter the data returned by ASP.NET's data source controls. In a nutshell, the data source controls can include parameterized queries whose parameter values are defined via parameter controls. For example, the SqlDataSource can include a parameterized SelectCommand, such as: SELECT * FROM Books WHERE Price > @Price. Here, @Price is a parameter; the value for a parameter can be defined declaratively using a parameter control. ASP.NET offers a variety of parameter controls, including ones that use hard-coded values, ones that retrieve values from the querystring, and ones that retrieve values from session, and others. Perhaps the most useful parameter control is the ControlParameter, which retrieves its value from a Web control on the page. Using the ControlParameter we can filter the data returned by the data source control based on the end user's input. While the ControlParameter works well with most types of Web controls, it does not work as expected with the CheckBoxList control. The ControlParameter is designed to retrieve a single property value from the specified Web control, but the CheckBoxList control does not have a property that returns all of the values of its selected items in a form that the CheckBoxList control can use. Moreover, if you are using the selected CheckBoxList items to query a database you'll quickly find that SQL does not offer out of the box functionality for filtering results based on a user-supplied list of filter criteria. The good news is that with a little bit of effort it is possible to filter data based on the end user's selections in a CheckBoxList control. This article starts with a look at how to get SQL to filter data based on a user-supplied, comma-delimited list of values. Next, it shows how to programmatically construct a comma-delimited list that represents the selected CheckBoxList values and pass that list into the SQL query. Finally, we'll explore creating a custom parameter control to handle this logic declaratively. Read on to learn more! Read More >

    Read the article

  • SQLAuthority News – Fast Track Data Warehouse 3.0 Reference Guide

    - by pinaldave
    http://msdn.microsoft.com/en-us/library/gg605238.aspx I am very excited that Fast Track Data Warehouse 3.0 reference guide has been announced. As a consultant I have always enjoyed working with Fast Track Data Warehouse project as it truly expresses the potential of the SQL Server Engine. Here is few details of the enhancement of the Fast Track Data Warehouse 3.0 reference architecture. The SQL Server Fast Track Data Warehouse initiative provides a basic methodology and concrete examples for the deployment of balanced hardware and database configuration for a data warehousing workload. Balance is measured across the key components of a SQL Server installation; storage, server, application settings, and configuration settings for each component are evaluated. Description Note FTDW 3.0 Architecture Basic component architecture for FT 3.0 based systems. New Memory Guidelines Minimum and maximum tested memory configurations by server socket count. Additional Startup Options Notes for T-834 and setting for Lock Pages in Memory. Storage Configuration RAID1+0 now standard (RAID1 was used in FT 2.0). Evaluating Fragmentation Query provided for evaluating logical fragmentation. Loading Data Additional options for CI table loads. MCR Additional detail and explanation of FTDW MCR Rating. Read white paper on fast track data warehousing. Reference: Pinal Dave (http://blog.SQLAuthority.com)   Filed under: Business Intelligence, Data Warehousing, PostADay, SQL, SQL Authority, SQL Documentation, SQL Download, SQL Query, SQL Server, SQL Tips and Tricks, SQL White Papers, SQLAuthority News, T SQL, Technology

    Read the article

  • Accessing and Updating Data in ASP.NET: Filtering Data Using a CheckBoxList

    Filtering Database Data with Parameters, an earlier installment in this article series, showed how to filter the data returned by ASP.NET's data source controls. In a nutshell, the data source controls can include parameterized queries whose parameter values are defined via parameter controls. For example, the SqlDataSource can include a parameterized SelectCommand, such as: SELECT * FROM Books WHERE Price > @Price. Here, @Price is a parameter; the value for a parameter can be defined declaratively using a parameter control. ASP.NET offers a variety of parameter controls, including ones that use hard-coded values, ones that retrieve values from the querystring, and ones that retrieve values from session, and others. Perhaps the most useful parameter control is the ControlParameter, which retrieves its value from a Web control on the page. Using the ControlParameter we can filter the data returned by the data source control based on the end user's input. While the ControlParameter works well with most types of Web controls, it does not work as expected with the CheckBoxList control. The ControlParameter is designed to retrieve a single property value from the specified Web control, but the CheckBoxList control does not have a property that returns all of the values of its selected items in a form that the CheckBoxList control can use. Moreover, if you are using the selected CheckBoxList items to query a database you'll quickly find that SQL does not offer out of the box functionality for filtering results based on a user-supplied list of filter criteria. The good news is that with a little bit of effort it is possible to filter data based on the end user's selections in a CheckBoxList control. This article starts with a look at how to get SQL to filter data based on a user-supplied, comma-delimited list of values. Next, it shows how to programmatically construct a comma-delimited list that represents the selected CheckBoxList values and pass that list into the SQL query. Finally, we'll explore creating a custom parameter control to handle this logic declaratively. Read on to learn more! Read More >

    Read the article

  • extjs data store load data on fly

    - by CKeven
    I'm trying to create a data store that will load the data schema and records on fly. Here is the current code i have and I'm not sure how to setup the array reader properly since i don't have the schema before query returns. ds = new Ext.data.Store({ url: 'http://10.10.97.83/cgi-bin/cgiip.exe/WService=wsdev/majax/jsbrdgx.p', baseParams: { cr: Ext.util.JSON.encode(omgtobxParms) }, reader: new Ext.data.ArrayReader({ //root:data.value.records }, col_names) }); {"name": "tmp_buy_book", "schema": [ { "name": "a", "type": "C"}, { "name": "b", "type": "C"} "records": [["1", ""], ["1",""]]}

    Read the article

  • Big Data – Buzz Words: What is MapReduce – Day 7 of 21

    - by Pinal Dave
    In yesterday’s blog post we learned what is Hadoop. In this article we will take a quick look at one of the four most important buzz words which goes around Big Data – MapReduce. What is MapReduce? MapReduce was designed by Google as a programming model for processing large data sets with a parallel, distributed algorithm on a cluster. Though, MapReduce was originally Google proprietary technology, it has been quite a generalized term in the recent time. MapReduce comprises a Map() and Reduce() procedures. Procedure Map() performance filtering and sorting operation on data where as procedure Reduce() performs a summary operation of the data. This model is based on modified concepts of the map and reduce functions commonly available in functional programing. The library where procedure Map() and Reduce() belongs is written in many different languages. The most popular free implementation of MapReduce is Apache Hadoop which we will explore tomorrow. Advantages of MapReduce Procedures The MapReduce Framework usually contains distributed servers and it runs various tasks in parallel to each other. There are various components which manages the communications between various nodes of the data and provides the high availability and fault tolerance. Programs written in MapReduce functional styles are automatically parallelized and executed on commodity machines. The MapReduce Framework takes care of the details of partitioning the data and executing the processes on distributed server on run time. During this process if there is any disaster the framework provides high availability and other available modes take care of the responsibility of the failed node. As you can clearly see more this entire MapReduce Frameworks provides much more than just Map() and Reduce() procedures; it provides scalability and fault tolerance as well. A typical implementation of the MapReduce Framework processes many petabytes of data and thousands of the processing machines. How do MapReduce Framework Works? A typical MapReduce Framework contains petabytes of the data and thousands of the nodes. Here is the basic explanation of the MapReduce Procedures which uses this massive commodity of the servers. Map() Procedure There is always a master node in this infrastructure which takes an input. Right after taking input master node divides it into smaller sub-inputs or sub-problems. These sub-problems are distributed to worker nodes. A worker node later processes them and does necessary analysis. Once the worker node completes the process with this sub-problem it returns it back to master node. Reduce() Procedure All the worker nodes return the answer to the sub-problem assigned to them to master node. The master node collects the answer and once again aggregate that in the form of the answer to the original big problem which was assigned master node. The MapReduce Framework does the above Map () and Reduce () procedure in the parallel and independent to each other. All the Map() procedures can run parallel to each other and once each worker node had completed their task they can send it back to master code to compile it with a single answer. This particular procedure can be very effective when it is implemented on a very large amount of data (Big Data). The MapReduce Framework has five different steps: Preparing Map() Input Executing User Provided Map() Code Shuffle Map Output to Reduce Processor Executing User Provided Reduce Code Producing the Final Output Here is the Dataflow of MapReduce Framework: Input Reader Map Function Partition Function Compare Function Reduce Function Output Writer In a future blog post of this 31 day series we will explore various components of MapReduce in Detail. MapReduce in a Single Statement MapReduce is equivalent to SELECT and GROUP BY of a relational database for a very large database. Tomorrow In tomorrow’s blog post we will discuss Buzz Word – HDFS. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

< Previous Page | 3 4 5 6 7 8 9 10 11 12 13 14  | Next Page >