Search Results

Search found 11338 results on 454 pages for 'big dave'.

Page 5/454 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • Principles of Big Data By Jules J Berman, O&rsquo;Reilly Media Book Review

    - by Compudicted
    Originally posted on: http://geekswithblogs.net/Compudicted/archive/2013/11/04/principles-of-big-data-by-jules-j-berman-orsquoreilly-media.aspx A fantastic book! Must be part, if not yet, of the fundamentals of the Big Data as a field of science. Highly recommend to those who are into the Big Data practice. Yet, I confess this book is one of my best reads this year and for a number of reasons: The book is full of wisdom, intimate insight, historical facts and real life examples to how Big Data projects get conceived, operate and sadly, yes, sometimes die. But not only that, the book is most importantly is filled with valuable advice, accurate and even overwhelming amount of reference (from the positive side), and the author does not event stop there: there are numerous technical excerpts, links and examples allowing to quickly accomplish many daunting tasks or make you aware of what one needs to perform as a data practitioner (excuse my use of the word practitioner, I just did not find a better substitute to it to trying to reference all who face Big Data). Be aware that Jules Berman’s background is in medicine, naturally, this book discusses this subject a lot as it is very dear to the author’s heart I believe, this does not make this book any less significant however, quite the opposite, I trust if there is an area in science or practice where the biggest benefits can be ripped from Big Data projects it is indeed the medical science, let’s make Cancer history! On a personal note, for me as a database, BI professional it has helped to understand better the motives behind Big Data initiatives, their underwater rivers and high altitude winds that divert or propel them forward. Additionally, I was impressed by the depth and number of mining algorithms covered in it. I must tell this made me very curious and tempting to find out more about these indispensable attributes of Big Data so sure I will be trying stretching my wallet to acquire several books that go more in depth on several most popular of them. My favorite parts of the book, well, all of them actually, but especially chapter 9: Analysis, it is just very close to my heart. But the real reason is it let me see what I do with data from a different angle. And then the next - “Special Considerations”, they are just two logical parts. The writing language is of this book is very acceptable for all levels, I had no technical problem reading it in ebook format on my 8” tablet or a large screen monitor. If I would be asked to say at least something negative I have to state I had a feeling initially that the book’s first part reads like an academic material relaxing the reader as the book progresses forward. I admit I am impressed with Jules’ abilities to use several programming languages and OSS tools, bravo! And I agree, it is not too, too hard to grasp at least the principals of a modern programming language, which seems becomes a defacto knowledge standard item for any modern human being. So grab a copy of this book, read it end to end and make yourself shielded from making mistakes at any stage of your Big Data initiative, by the way this book also helps build better future Big Data projects. Disclaimer: I received a free electronic copy of this book as part of the O'Reilly Blogger Program.

    Read the article

  • Oracle Big Data Software Downloads

    - by Mike.Hallett(at)Oracle-BI&EPM
    Companies have been making business decisions for decades based on transactional data stored in relational databases. Beyond that critical data, is a potential treasure trove of less structured data: weblogs, social media, email, sensors, and photographs that can be mined for useful information. Oracle offers a broad integrated portfolio of products to help you acquire and organize these diverse data sources and analyze them alongside your existing data to find new insights and capitalize on hidden relationships. Oracle Big Data Connectors Downloads here, includes: Oracle SQL Connector for Hadoop Distributed File System Release 2.1.0 Oracle Loader for Hadoop Release 2.1.0 Oracle Data Integrator Companion 11g Oracle R Connector for Hadoop v 2.1 Oracle Big Data Documentation The Oracle Big Data solution offers an integrated portfolio of products to help you organize and analyze your diverse data sources alongside your existing data to find new insights and capitalize on hidden relationships. Oracle Big Data, Release 2.2.0 - E41604_01 zip (27.4 MB) Integrated Software and Big Data Connectors User's Guide HTML PDF Oracle Data Integrator (ODI) Application Adapter for Hadoop Apache Hadoop is designed to handle and process data that is typically from data sources that are non-relational and data volumes that are beyond what is handled by relational databases. Typical processing in Hadoop includes data validation and transformations that are programmed as MapReduce jobs. Designing and implementing a MapReduce job usually requires expert programming knowledge. However, when you use Oracle Data Integrator with the Application Adapter for Hadoop, you do not need to write MapReduce jobs. Oracle Data Integrator uses Hive and the Hive Query Language (HiveQL), a SQL-like language for implementing MapReduce jobs. Employing familiar and easy-to-use tools and pre-configured knowledge modules (KMs), the application adapter provides the following capabilities: Loading data into Hadoop from the local file system and HDFS Performing validation and transformation of data within Hadoop Loading processed data from Hadoop to an Oracle database for further processing and generating reports Oracle Database Loader for Hadoop Oracle Loader for Hadoop is an efficient and high-performance loader for fast movement of data from a Hadoop cluster into a table in an Oracle database. It pre-partitions the data if necessary and transforms it into a database-ready format. Oracle Loader for Hadoop is a Java MapReduce application that balances the data across reducers to help maximize performance. Oracle R Connector for Hadoop Oracle R Connector for Hadoop is a collection of R packages that provide: Interfaces to work with Hive tables, the Apache Hadoop compute infrastructure, the local R environment, and Oracle database tables Predictive analytic techniques, written in R or Java as Hadoop MapReduce jobs, that can be applied to data in HDFS files You install and load this package as you would any other R package. Using simple R functions, you can perform tasks such as: Access and transform HDFS data using a Hive-enabled transparency layer Use the R language for writing mappers and reducers Copy data between R memory, the local file system, HDFS, Hive, and Oracle databases Schedule R programs to execute as Hadoop MapReduce jobs and return the results to any of those locations Oracle SQL Connector for Hadoop Distributed File System Using Oracle SQL Connector for HDFS, you can use an Oracle Database to access and analyze data residing in Hadoop in these formats: Data Pump files in HDFS Delimited text files in HDFS Hive tables For other file formats, such as JSON files, you can stage the input in Hive tables before using Oracle SQL Connector for HDFS. Oracle SQL Connector for HDFS uses external tables to provide Oracle Database with read access to Hive tables, and to delimited text files and Data Pump files in HDFS. Related Documentation Cloudera's Distribution Including Apache Hadoop Library HTML Oracle R Enterprise HTML Oracle NoSQL Database HTML Recent Blog Posts Big Data Appliance vs. DIY Price Comparison Big Data: Architecture Overview Big Data: Achieve the Impossible in Real-Time Big Data: Vertical Behavioral Analytics Big Data: In-Memory MapReduce Flume and Hive for Log Analytics Building Workflows in Oozie

    Read the article

  • SQL SERVER – Maximize Database Performance with DB Optimizer – SQL in Sixty Seconds #054

    - by Pinal Dave
    Performance tuning is an interesting concept and everybody evaluates it differently. Every developer and DBA have different opinion about how one can do performance tuning. I personally believe performance tuning is a three step process Understanding the Query Identifying the Bottleneck Implementing the Fix While, we are working with large database application and it suddenly starts to slow down. We are all under stress about how we can get back the database back to normal speed. Most of the time we do not have enough time to do deep analysis of what is going wrong as well what will fix the problem. Our primary goal at that time is to just fix the database problem as fast as we can. However, here is one very important thing which we need to keep in our mind is that when we do quick fix, it should not create any further issue with other parts of the system. When time is essence and we want to do deep analysis of our system to give us the best solution we often tend to make mistakes. Sometimes we make mistakes as we do not have proper time to analysis the entire system. Here is what I do when I face such a situation – I take the help of DB Optimizer. It is a fantastic tool and does superlative performance tuning of the system. Everytime when I talk about performance tuning tool, the initial reaction of the people is that they do not want to try this as they believe it requires lots of the learning of the tool before they use it. It is absolutely not true with the case of the DB optimizer. It is a very easy to use and self intuitive tool. Once can get going with the product, in no time. Here is a quick video I have build where I demonstrate how we can identify what index is missing for query and how we can quickly create the index. Entire three steps of the query tuning are completed in less than 60 seconds. If you are into performance tuning and query optimization you should download DB Optimizer and give it a go. Let us see the same concept in following SQL in Sixty Seconds Video: You can Download DB Optimizer and reproduce the same Sixty Seconds experience. Related Tips in SQL in Sixty Seconds: Performance Tuning – Part 1 of 2 – Getting Started and Configuration Performance Tuning – Part 2 of 2 – Analysis, Detection, Tuning and Optimizing What would you like to see in the next SQL in Sixty Seconds video? Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Database, Pinal Dave, PostADay, SQL, SQL Authority, SQL in Sixty Seconds, SQL Interview Questions and Answers, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology, Video Tagged: Identity

    Read the article

  • Master Data Management – A Foundation for Big Data Analysis

    - by Manouj Tahiliani
    While Master Data Management has crossed the proverbial chasm and is on its way to becoming mainstream, businesses are being hammered by a new megatrend called Big Data. Big Data is characterized by massive volumes, its high frequency, the variety of less structured data sources such as email, sensors, smart meters, social networks, and Weblogs, and the need to analyze vast amounts of data to determine value to improve upon management decisions. Businesses that have embraced MDM to get a single, enriched and unified view of Master data by resolving semantic discrepancies and augmenting the explicit master data information from within the enterprise with implicit data from outside the enterprise like social profiles will have a leg up in embracing Big Data solutions. This is especially true for large and medium-sized businesses in industries like Retail, Communications, Financial Services, etc that would find it very challenging to get comprehensive analytical coverage and derive long-term success without resolving the limitations of the heterogeneous topology that leads to disparate, fragmented and incomplete master data. For analytical success from Big Data or in other words ROI from Big Data Investments, businesses need to acquire, organize and analyze the deluge of data to make better decisions. There will need to be a coexistence of structured and unstructured data and to maintain a tight link between the two to extract maximum insights. MDM is the catalyst that helps maintain that tight linkage by providing an understanding about the identity, characteristics of Persons, Companies, Products, Suppliers, etc. associated with the Big Data and thereby help accelerate ROI. In my next post I will discuss about patterns for co-existing Big Data Solutions and MDM. Feel free to provide comments and thoughts on above as well as Integration or Architectural patterns.

    Read the article

  • Marshalling a big-endian byte collection into a struct in order to pull out values

    - by Pat
    There is an insightful question about reading a C/C++ data structure in C# from a byte array, but I cannot get the code to work for my collection of big-endian (network byte order) bytes. (EDIT: Note that my real struct has more than just one field.) Is there a way to marshal the bytes into a big-endian version of the structure and then pull out the values in the endianness of the framework (that of the host, which is usually little-endian)? This should summarize what I'm looking for (LE=LittleEndian, BE=BigEndian): void Main() { var leBytes = new byte[] {1, 0}; var beBytes = new byte[] {0, 1}; Foo fooLe = ByteArrayToStructure<Foo>(leBytes); Foo fooBe = ByteArrayToStructureBigEndian<Foo>(beBytes); Assert.AreEqual(fooLe, fooBe); } [StructLayout(LayoutKind.Explicit, Size=2)] public struct Foo { [FieldOffset(0)] public ushort firstUshort; } T ByteArrayToStructure<T>(byte[] bytes) where T: struct { GCHandle handle = GCHandle.Alloc(bytes, GCHandleType.Pinned); T stuff = (T)Marshal.PtrToStructure(handle.AddrOfPinnedObject(),typeof(T)); handle.Free(); return stuff; } T ByteArrayToStructureBigEndian<T>(byte[] bytes) where T: struct { ??? }

    Read the article

  • Convert a raw string to an array of big-endian words with Ruby

    - by Zag zag..
    Hello, I would like to convert a raw string to an array of big-endian words. As example, here is a JavaScript function that do it well (by Paul Johnston): /* * Convert a raw string to an array of big-endian words * Characters >255 have their high-byte silently ignored. */ function rstr2binb(input) { var output = Array(input.length >> 2); for(var i = 0; i < output.length; i++) output[i] = 0; for(var i = 0; i < input.length * 8; i += 8) output[i>>5] |= (input.charCodeAt(i / 8) & 0xFF) << (24 - i % 32); return output; } I believe the Ruby equivalent can be String#unpack(format). However, I don't know what should be the correct format parameter. Thank you for any help. Regards

    Read the article

  • Big-Oh running time of code in Java (are my answers accurate

    - by Terry Frederick
    the Method hasTwoTrueValues returns true if at least two values in an array of booleans are true. Provide the Big-Oh running time for all three implementations proposed. // Version 1 public boolean has TwoTrueValues( boolean [ ] arr ) { int count = 0; for( int i = 0; i < arr. length; i++ ) if( arr[ i ] ) count++; return count >= 2; } // Version 2 public boolean hasTwoTrueValues( boolean [ ] arr ) { for( int i = 0; i < arr.length; i++ ) for( int j = i + 1; j < arr.length; j++ ) if( arr[ i ] && arr[ j ] ) return true; } // Version 3 public boolean hasTwoTrueValues( boolean [ ] arr ) { for( int i = 0; i < arr.length; i++ if( arr[ i ] ) for( int j = i + 1; j < arr.length; j++ ) if( arr[ j ] ) return true; return false; } For Version 1 I say the running time is O(n) Version 2 I say O(n^2) Version 3 I say O(n^2) I am really new to this Big Oh Notation so if my answers are incorrect could you please explain and help.

    Read the article

  • Suggested Web Application Framework and Database for Enterprise, “Big-Data” App?

    - by willOEM
    I have a web application that I have been developing for a small group within my company over the past few years, using Pipeline Pilot (plus jQuery and Python scripting) for web development and back-end computation, and Oracle 10g for my RDBMS. Users upload experimental genomic data, which is parsed into a database, and made available for querying, transformation, and reporting. Experimental data sets are large and have many layers of metadata. A given experimental data record might have a foreign key relationship with a table that describes this data point's assay. Assays can cover multiple genes, which can have multiple transcript, which can have multiple mutations, which can affect multiple signaling pathways, etc. Users need to approach this data from any point in those layers in the metadata. Since all data sets for a given data type can run over a billion rows, this results in some large, dynamic queries that are hard to predict. New data sets are added on a weekly basis (~1GB per set). Experimental data is never updated, but the associated metadata can be updated weekly for a few records and yearly for most others. For every data set insert the system sees, there will be between 10 and 100 selects run against it and associated data. It is okay for updates and inserts to run slow, so long as queries run quick and are as up-to-date as possible. The application continues to grow in size and scope and is already starting to run slower than I like. I am worried that we have about outgrown Pipeline Pilot, and perhaps Oracle (as the sole database). Would a NoSQL database or an OLAP system be appropriate here? What web application frameworks work well with systems like this? I'd like the solution to be something scalable, portable and supportable X-years down the road. Here is the current state of the application: Web Server/Data Processing: Pipeline Pilot on Windows Server + IIS Database: Oracle 10g, ~1TB of data, ~180 tables with several billion-plus row tables Network Storage: Isilon, ~50TB of low-priority raw data

    Read the article

  • Big numbers in C

    - by teehoo
    I need help working with very big numbers. According to Windows calc, the exponent 174^55 = 1.6990597648061509725749329578093e+123. How would I store this using C (c99 standard). int main(){ long long int x = 174^55; //result is 153 printf("%lld\n", x); } For those curious, it is for a school project where we are implementing the RSA cryptographic algorithm, which deals with exponentiating large numbers with large powers for encryption/decryption.

    Read the article

  • Big O, how do you calculate/approximate it?

    - by Sven
    Most people with a degree in CS will certainly know what Big O stands for. It helps us to measure how (in)efficient an algorithm really is and if you know in what category the problem you are trying to solve lays in you can figure out if it is still possible to squeeze out that little extra performance.* But I'm curious, how do you calculate or approximate the complexity of your algorithms? *: but as they say, don't overdo it, premature optimization is the root of all evil, and optimization without a justified cause should deserve that name as well.

    Read the article

  • How big is too big (for NTFS)

    - by BCS
    I have a program and as it's done now, it has a data directory with something like 10-30K files in it and it's starting to cause problems. Should I expect that to cause problems and my only solution to tweak my file structure or does that indicate other problems?

    Read the article

  • Big O and Little o

    - by hyperdude
    If algorithm A has complexity O(n) and algorithm B has complexity o(n^2), what, if anything, can we say about the relationship between A and B? Note: the complexity of A is expressed using big-Oh, and the complexity of B is expressed using little-Oh.

    Read the article

  • BIG DATA eBook - Now Available

    - by Javier Puerta
    The Big Data interactive e-book “Meeting the Challenge of Big Data: Part One” has just been released. It’s your “one-stop shop” for info about Big Data and the Oracle offering around it.The new e-book (available on your computer or iPad) is packed with multi-media resources to educate Oracle staff, customers, prospects and partners on the value of Big Data. It features videos, tutorials, podcasts, reports, white papers, datasheets, blogs, web links, a 3-D demo, and more. Go and get it here!

    Read the article

  • Big Data Sessions at Openworld 2012

    - by Jean-Pierre Dijcks
    If you are coming to San Francisco, and you are interested in all the aspects to big data, this Focus On Big Data is a must have document.  Some (other) highlights: A performance demo of a full rack Big Data Appliance in the engineered systems showcase A set of handson labs on how to go from a NoSQL DB to an effective analytics play on big data Much, much more See you all in a few weeks in SF!

    Read the article

  • Big Oh Notation - formal definition.

    - by aloh
    I'm reading a textbook right now for my Java III class. We're reading about Big-Oh and I'm a little confused by its formal definition. Formal Definition: "A function f(n) is of order at most g(n) - that is, f(n) = O(g(n)) - if a positive real number c and positive integer N exist such that f(n) <= c g(n) for all n = N. That is, c g(n) is an upper bound on f(n) when n is sufficiently large." Ok, that makes sense. But hold on, keep reading...the book gave me this example: "In segment 9.14, we said that an algorithm that uses 5n + 3 operations is O(n). We now can show that 5n + 3 = O(n) by using the formal definition of Big Oh. When n = 3, 5n + 3 <= 5n + n = 6n. Thus, if we let f(n) = 5n + 3, g(n) = n, c = 6, N = 3, we have shown that f(n) <= 6 g(n) for n = 3, or 5n + 3 = O(n). That is, if an algorithm requires time directly proportional to 5n + 3, it is O(n)." Ok, this kind of makes sense to me. They're saying that if n = 3 or greater, 5n + 3 takes less time than if n was less than 3 - thus 5n + n = 6n - right? Makes sense, since if n was 2, 5n + 3 = 13 while 6n = 12 but when n is 3 or greater 5n + 3 will always be less than or equal to 6n. Here's where I get confused. They give me another example: Example 2: "Let's show that 4n^2 + 50n - 10 = O(n^2). It is easy to see that: 4n^2 + 50n - 10 <= 4n^2 + 50n for any n. Since 50n <= 50n^2 for n = 50, 4n^2 + 50n - 10 <= 4n^2 + 50n^2 = 54n^2 for n = 50. Thus, with c = 54 and N = 50, we have shown that 4n^2 + 50n - 10 = O(n^2)." This statement doesn't make sense: 50n <= 50n^2 for n = 50. Isn't any n going to make the 50n less than 50n^2? Not just greater than or equal to 50? Why did they even mention that 50n <= 50n^2? What does that have to do with the problem? Also, 4n^2 + 50n - 10 <= 4n^2 + 50n^2 = 54n^2 for n = 50 is going to be true no matter what n is. And how in the world does picking numbers show that f(n) = O(g(n))? Please help me understand! :(

    Read the article

  • Big O Complexity of a method

    - by timeNomad
    I have this method: public static int what(String str, char start, char end) { int count=0; for(int i=0;i<str.length(); i++) { if(str.charAt(i) == start) { for(int j=i+1;j<str.length(); j++) { if(str.charAt(j) == end) count++; } } } return count; } What I need to find is: 1) What is it doing? Answer: counting the total number of end occurrences after EACH (or is it? Not specified in the assignment, point 3 depends on this) start. 2) What is its complexity? Answer: the first loops iterates over the string completely, so it's at least O(n), the second loop executes only if start char is found and even then partially (index at which start was found + 1). Although, big O is all about worst case no? So in the worst case, start is the 1st char & the inner iteration iterates over the string n-1 times, the -1 is a constant so it's n. But, the inner loop won't be executed every outer iteration pass, statistically, but since big O is about worst case, is it correct to say the complexity of it is O(n^2)? Ignoring any constants and the fact that in 99.99% of times the inner loop won't execute every outer loop pass. 3) Rewrite it so that complexity is lower. What I'm not sure of is whether start occurs at most once or more, if once at most, then method can be rewritten using one loop (having a flag indicating whether start has been encountered and from there on incrementing count at each end occurrence), yielding a complexity of O(n). In case though, that start can appear multiple times, which most likely it is, because assignment is of a Java course and I don't think they would make such ambiguity. Solving, in this case, is not possible using one loop... WAIT! Yes it is..! Just have a variable, say, inc to be incremented each time start is encountered & used to increment count each time end is encountered after the 1st start was found: inc = 0, count = 0 if (current char == start) inc++ if (inc > 0 && current char == end) count += inc This would also yield a complexity of O(n)? Because there is only 1 loop. Yes I realize I wrote a lot hehe, but what I also realized is that I understand a lot better by forming my thoughts into words...

    Read the article

  • big O notation algorithm

    - by niggersak
    Use big-O notation to classify the traditional grade school algorithms for addition and multiplication. That is, if asked to add two numbers each having N digits, how many individual additions must be performed? If asked to multiply two N-digit numbers, how many individual multiplications are required? . Suppose f is a function that returns the result of reversing the string of symbols given as its input, and g is a function that returns the concatenation of the two strings given as its input. If x is the string hrwa, what is returned by g(f(x),x)? Explain your answer - don't just provide the result!

    Read the article

  • Tricky Big-O complexity

    - by timeNomad
    public void foo (int n, int m) { int i = m; while (i > 100) i = i/3; for (int k=i ; k>=0; k--) { for (int j=1; j<n; j*=2) System.out.print(k + "\t" + j); System.out.println(); } } I figured the complexity would be O(logn). That is as a product of the inner loop, the outer loop -- will never be executed more than 100 times, so it can be omitted. What I'm not sure about is the while clause, should it be incorporated into the Big-O complexity? For very large i values it could make an impact, or arithmetic operations, doesn't matter on what scale, count as basic operations and can be omitted?

    Read the article

  • Database indexes and their Big-O notation

    - by miket2e
    I'm trying to understand the performance of database indexes in terms of Big-O notation. Without knowing much about it, I would guess that: Querying on a primary key or unique index will give you a O(1) lookup time. Querying on a non-unique index will also give a O(1) time, albeit maybe the '1' is slower than for the unique index (?) Querying on a column without an index will give a O(N) lookup time (full table scan). Is this generally correct ? Will querying on a primary key ever give worse performance than O(1) ? My specific concern is for SQLite, but I'd be interested in knowing to what extent this varies between different databases too.

    Read the article

  • Can someone help with big O notation?

    - by Dann
    void printScientificNotation(double value, int powerOfTen) { if (value >= 1.0 && value < 10.0) { System.out.println(value + " x 10^" + powerOfTen); } else if (value < 1.0) { printScientificNotation(value * 10, powerOfTen - 1); } else // value >= 10.0 { printScientificNotation(value / 10, powerOfTen + 1); } } I understand how the method goes but I cannot figure out a way to represent the method. For example, if value was 0.00000009 or 9e-8, the method will call on printScientificNotation(value * 10, powerOfTen - 1); eight times and System.out.println(value + " x 10^" + powerOfTen); once. So the it is called recursively by the exponent for e. But how do I represent this by big O notation? Thanks!

    Read the article

  • Big Data for Retail

    - by David Dorf
    Right up there with mobile, social, and cloud is the term "big data," which seems to be popping up lots in the press these days.  Companies like Google, Yahoo, and Facebook have popularized a new class of data technologies meant to solve the problem of processing large amounts of data quickly.  I first mentioned this in a posting back in March 2009.  Put simply, big data implies datasets so large they can't normally be processed using a standard transactional database.  The term "noSQL" is often used in this context as well. Actually, using parallel processing within the Oracle database combined with Exadata can achieve impressive results.  Look for more from Oracle at OpenWorld as hinted by Jean-Pierre Dijcks. McKinsey recently released a report on big data in which retail was specifically mentioned as an industry that can benefit from the new technologies.  I won't rehash that report because my friend Rama already did such a good job in his posting, Impact of "Big Data" on Retail. The presentation below does a pretty good job of framing the problem, although it doesn't really get into the available technologies (e.g. Exadata, Hadoop, Cassandra, etc.) and isn't retail specific. Determine the Right Analytic Database: A Survey of New Data Technologies So when a retailer asks me about big data, here's what I say:  Big data refers to a set of technologies for processing large volumes of structured and unstructured data.  Imagine collecting everything uttered by your customers on Facebook and Twitter and combining it with all the data you can find about the products you sell (e.g. reviews, images, demonstration videos), including competitive data.  Assuming you could process all that data, you could then personalize offers to specific customers based on their tastes, ensure prices are competitive, and implement better local assortments.  It's really not that far off.

    Read the article

  • SQL – Migrate Database from SQL Server to NuoDB – A Quick Tutorial

    - by Pinal Dave
    Data is growing exponentially and every organization with growing data is thinking of next big innovation in the world of Big Data. Big data is a indeed a future for every organization at one point of the time. Just like every other next big thing, big data has its own challenges and issues. The biggest challenge associated with the big data is to find the ideal platform which supports the scalability and growth of the data. If you are a regular reader of this blog, you must be familiar with NuoDB. I have been working with NuoDB for a while and their recent release is the best thus far. NuoDB is an elastically scalable SQL database that can run on local host, datacenter and cloud-based resources. A key feature of the product is that it does not require sharding (read more here). Last week, I was able to install NuoDB in less than 90 seconds and have explored their Explorer and Admin sections. You can read about my experiences in these posts: SQL – Step by Step Guide to Download and Install NuoDB – Getting Started with NuoDB SQL – Quick Start with Admin Sections of NuoDB – Manage NuoDB Database SQL – Quick Start with Explorer Sections of NuoDB – Query NuoDB Database Many SQL Authority readers have been following me in my journey to evaluate NuoDB. One of the frequently asked questions I’ve received from you is if there is any way to migrate data from SQL Server to NuoDB. The fact is that there is indeed a way to do so and NuoDB provides a fantastic tool which can help users to do it. NuoDB Migrator is a command line utility that supports the migration of Microsoft SQL Server, MySQL, Oracle, and PostgreSQL schemas and data to NuoDB. The migration to NuoDB is a three-step process: NuoDB Migrator generates a schema for a target NuoDB database It loads data into the target NuoDB database It dumps data from the source database Let’s see how we can migrate our data from SQL Server to NuoDB using a simple three-step approach. But before we do that we will create a sample database in MSSQL and later we will migrate the same database to NuoDB: Setup Step 1: Build a sample data CREATE DATABASE [Test]; CREATE TABLE [Department]( [DepartmentID] [smallint] NOT NULL, [Name] VARCHAR(100) NOT NULL, [GroupName] VARCHAR(100) NOT NULL, [ModifiedDate] [datetime] NOT NULL, CONSTRAINT [PK_Department_DepartmentID] PRIMARY KEY CLUSTERED ( [DepartmentID] ASC ) ) ON [PRIMARY]; INSERT INTO Department SELECT * FROM AdventureWorks2012.HumanResources.Department; Note that I am using the SQL Server AdventureWorks database to build this sample table but you can build this sample table any way you prefer. Setup Step 2: Install Java 64 bit Before you can begin the migration process to NuoDB, make sure you have 64-bit Java installed on your computer. This is due to the fact that the NuoDB Migrator tool is built in Java. You can download 64-bit Java for Windows, Mac OSX, or Linux from the following link: http://java.com/en/download/manual.jsp. One more thing to remember is that you make sure that the path in your environment settings is set to your JAVA_HOME directory or else the tool will not work. Here is how you can do it: Go to My Computer >> Right Click >> Select Properties >> Click on Advanced System Settings >> Click on Environment Variables >> Click on New and enter the following values. Variable Name: JAVA_HOME Variable Value: C:\Program Files\Java\jre7 Make sure you enter your Java installation directory in the Variable Value field. Setup Step 3: Install JDBC driver for SQL Server. There are two JDBC drivers available for SQL Server.  Select the one you prefer to use by following one of the two links below: Microsoft JDBC Driver jTDS JDBC Driver In this example we will be using jTDS JDBC driver. Once you download the driver, move the driver to your NuoDB installation folder. In my case, I have moved the JAR file of the driver into the C:\Program Files\NuoDB\tools\migrator\jar folder as this is my NuoDB installation directory. Now we are all set to start the three-step migration process from SQL Server to NuoDB: Migration Step 1: NuoDB Schema Generation Here is the command I use to generate a schema of my SQL Server Database in NuoDB. First I go to the folder C:\Program Files\NuoDB\tools\migrator\bin and execute the nuodb-migrator.bat file. Note that my database name is ‘test’. Additionally my username and password is also ‘test’. You can see that my SQL Server database is running on my localhost on port 1433. Additionally, the schema of the table is ‘dbo’. nuodb-migrator schema –source.driver=net.sourceforge.jtds.jdbc.Driver –source.url=jdbc:jtds:sqlserver://localhost:1433/ –source.username=test –source.password=test –source.catalog=test –source.schema=dbo –output.path=/tmp/schema.sql The above script will generate a schema of all my SQL Server tables and will put it in the folder C:\tmp\schema.sql . You can open the schema.sql file and execute this file directly in your NuoDB instance. You can follow the link here to see how you can execute the SQL script in NuoDB. Please note that if you have not yet created the schema in the NuoDB database, you should create it before executing this step. Step 2: Generate the Dump File of the Data Once you have recreated your schema in NuoDB from SQL Server, the next step is very easy. Here we create a CSV format dump file, which will contain all the data from all the tables from the SQL Server database. The command to do so is very similar to the above command. Be aware that this step may take a bit of time based on your database size. nuodb-migrator dump –source.driver=net.sourceforge.jtds.jdbc.Driver –source.url=jdbc:jtds:sqlserver://localhost:1433/ –source.username=test –source.password=test –source.catalog=test –source.schema=dbo –output.type=csv –output.path=/tmp/dump.cat Once the above command is successfully executed you can find your CSV file in the C:\tmp\ folder. However, you do not have to do anything manually. The third and final step will take care of completing the migration process. Migration Step 3: Load the Data into NuoDB After building schema and taking a dump of the data, the very next step is essential and crucial. It will take the CSV file and load it into the NuoDB database. nuodb-migrator load –target.url=jdbc:com.nuodb://localhost:48004/mytest –target.schema=dbo –target.username=test –target.password=test –input.path=/tmp/dump.cat Please note that in the above script we are now targeting the NuoDB database, which we have already created with the name of “MyTest”. If the database does not exist, create it manually before executing the above script. I have kept the username and password as “test”, but please make sure that you create a more secure password for your database for security reasons. Voila!  You’re Done That’s it. You are done. It took 3 setup and 3 migration steps to migrate your SQL Server database to NuoDB.  You can now start exploring the database and build excellent, scale-out applications. In this blog post, I have done my best to come up with simple and easy process, which you can follow to migrate your app from SQL Server to NuoDB. Download NuoDB I strongly encourage you to download NuoDB and go through my 3-step migration tutorial from SQL Server to NuoDB. Additionally here are two very important blog post from NuoDB CTO Seth Proctor. He has written excellent blog posts on the concept of the Administrative Domains. NuoDB has this concept of an Administrative Domain, which is a collection of hosts that can run one or multiple databases.  Each database has its own TEs and SMs, but all are managed within the Admin Console for that particular domain. http://www.nuodb.com/techblog/2013/03/11/getting-started-provisioning-a-domain/ http://www.nuodb.com/techblog/2013/03/14/getting-started-running-a-database/ Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: NuoDB

    Read the article

  • SQL SERVER – List All the DMV and DMF on Server

    - by pinaldave
    “How many DMVs and DVFs are there in SQL Server 2008?” – this question was asked to me in one of the recent SQL Server Trainings. Answer is very simple: SELECT name, type, type_desc FROM sys.system_objects WHERE name LIKE 'dm_%' ORDER BY name Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: SQL DMV

    Read the article

  • SQL SERVER – Change Database Access to Single User Mode Using SSMS

    - by pinaldave
    I have previously written about how using T-SQL Script we can convert the database access to single user mode before backup. I was recently asked if the same can be done using SQL Server Management Studio. Yes! You can do it from database property (Write click on database and select database property) and follow image. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Query, SQL Server, SQL Server Management Studio, SQL Tips and Tricks, T SQL, Technology

    Read the article

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >