Search Results

Search found 9128 results on 366 pages for 'big theta'.

  • Programmatically finding the Landau notation (Big O or Theta notation) of an algorithm?

    - by Julien L
    I'm used to searching for the Landau (Big O, Theta...) notation of my algorithms by hand to make sure they are as optimized as they can be, but when the functions are getting really big and complex, it takes way too much time to do it by hand. It's also prone to human error. I spent some time on Codility (coding/algo exercises) and noticed they will give you the Landau notation for your submitted solution (both in time and memory usage). I was wondering how they do that... How would you do it? Is there another way besides lexical analysis or parsing of the code? PS: This question concerns mainly PHP and/or JavaScript, but I'm open to any language and theory.
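    One empirical angle – a sketch of a common measure-and-fit approach, not necessarily what Codility does – is to time the solution at doubling input sizes and inspect the ratio T(2n)/T(n): roughly 2x suggests O(n), roughly 4x suggests O(n^2), a bit over 2x suggests O(n log n). The `solve` stand-in, the seed and the size range below are illustrative assumptions:

    ```java
    // Rough empirical complexity probe: time a function at doubling input
    // sizes and print the growth ratio. Single-shot timings are noisy; real
    // measurements need JVM warmup and repetition, but the idea carries over.
    import java.util.Random;
    import java.util.function.Consumer;

    public class GrowthProbe {
      static long timeIt(Consumer<int[]> solve, int n) {
        int[] input = new Random(42).ints(n).toArray(); // fixed seed for repeatability
        long start = System.nanoTime();
        solve.accept(input);
        return System.nanoTime() - start;
      }

      public static void main(String[] args) {
        Consumer<int[]> solve = java.util.Arrays::sort; // stand-in for the algorithm under test
        long prev = 0;
        for (int n = 1 << 16; n <= 1 << 22; n <<= 1) {
          long t = timeIt(solve, n);
          if (prev > 0) {
            System.out.printf("n=%d  T(2n)/T(n)=%.2f%n", n, (double) t / prev);
          }
          prev = t;
        }
      }
    }
    ```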

  • Big Data Accelerator

    - by Jean-Pierre Dijcks
    For everyone who does not regularly listen to earnings calls, Oracle's Q4 call was interesting (as it usually is). One of the announcements in the call was the Big Data Accelerator from Oracle (Seeking Alpha link here; the quote below is slightly tweaked for correctness):  "The big data accelerator includes some of the standard open source software, HDFS, the file system and a number of other pieces, but also some Oracle components that we think can dramatically speed up the entire map-reduce process. And will be particularly attractive to Java programmers [...]. There are some interesting applications they do, ETL is one. Log processing is another. We're going to have a lot of those features, functions and pre-built applications in our big data accelerator."  Not much else we can say right now; more on this (and Big Data in general) at Openworld!

  • Big Data – Buzz Words: What is HDFS – Day 8 of 21

    - by Pinal Dave
    In yesterday’s blog post we learned what MapReduce is. In this article we will take a quick look at one of the four most important buzz words that go around Big Data – HDFS.

    What is HDFS? HDFS stands for Hadoop Distributed File System and it is the primary storage system used by Hadoop. It provides high-performance access to data across Hadoop clusters. It is usually deployed on low-cost commodity hardware, where server failures are very common, and for that reason HDFS is built for high fault tolerance. The data transfer rate between compute nodes in HDFS is very high, which leads to a reduced risk of failure. HDFS creates smaller pieces of the big data and distributes them across different nodes. It also copies each smaller piece multiple times onto different nodes. Hence, when any node holding the data crashes, the system can automatically use the data from a different node and continue the process. This is the key feature of the HDFS system.

    Architecture of HDFS: HDFS has a master/slave architecture. An HDFS cluster always consists of a single NameNode. This single NameNode is a master server; it manages the file system and regulates access to the various files. In addition to the NameNode there are multiple DataNodes – there is always one DataNode for each data server. In HDFS a big file is split into one or more blocks, and those blocks are stored in a set of DataNodes. The primary tasks of the NameNode are to open, close or rename files and directories and to regulate access to the file system, whereas the primary task of a DataNode is to read from and write to the file system. A DataNode is also responsible for the creation, deletion or replication of data, based on instructions from the NameNode. In reality, NameNode and DataNode are pieces of software, written in Java, designed to run on commodity machines.

    Visual Representation of HDFS Architecture: Let us understand how HDFS works with the help of the diagram. The Client App (HDFS client) connects to the NameNode as well as to DataNodes. Client App access to a DataNode is regulated by the NameNode, which permits the client to connect to a DataNode directly. A big data file is divided into multiple data blocks (let us assume those data chunks are A, B, C and D). The Client App then writes the data blocks directly to DataNodes. It does not have to write to every node; it just writes to any one node, and the NameNode decides on which other DataNodes the data will be replicated. In our example the Client App writes directly to DataNode 1 and DataNode 3, and the data chunks are automatically replicated to the other nodes. All the bookkeeping – which data block is placed on which DataNode – is written back to the NameNode. (A client-side code sketch follows at the end of this post.)

    High Availability During Disaster: Since multiple DataNodes hold the same data blocks, if any DataNode faces a disaster the entire process continues: another DataNode assumes the role of serving the specific data blocks that were on the failed node. This design provides very high tolerance to disaster and high availability. Notice that there is only a single NameNode in our architecture. If that node fails, our entire Hadoop application stops working, as it is the single node where all the metadata is stored. Because this node is so critical, it is usually replicated on another cluster as well as on another data rack. Though that replicated node is not operational in the architecture, it has all the necessary data to perform the task of the NameNode in case the NameNode fails. The entire Hadoop architecture is built to function smoothly even when there are node failures or hardware malfunctions. It is built on the simple premise that data is so big that it is impossible to come up with a single piece of hardware that can manage it properly. We need lots of commodity (cheap) hardware to manage our big data, and hardware failure is a fact of life with commodity servers. To reduce the impact of hardware failure, the Hadoop architecture is built to work around non-functioning hardware.

    Tomorrow: In tomorrow’s blog post we will discuss the importance of the relational database in Big Data. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL
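    To make the client flow described above concrete, here is a minimal sketch (not from the original post) of writing and reading an HDFS file through the Hadoop Java FileSystem API. The NameNode address is a hypothetical placeholder; block placement and replication happen transparently beneath these calls:

    ```java
    // Minimal HDFS client sketch: the client asks the NameNode for metadata,
    // then streams block data to/from DataNodes behind these two calls.
    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.nio.charset.StandardCharsets;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsHelloWorld {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020"); // hypothetical NameNode

        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/user/demo/hello.txt");

        // Write: blocks are replicated to multiple DataNodes automatically.
        try (FSDataOutputStream out = fs.create(file, true /* overwrite */)) {
          out.write("Hello, HDFS!".getBytes(StandardCharsets.UTF_8));
        }

        // Read: the NameNode supplies block locations; data comes from DataNodes.
        try (BufferedReader in = new BufferedReader(
            new InputStreamReader(fs.open(file), StandardCharsets.UTF_8))) {
          System.out.println(in.readLine());
        }
      }
    }
    ```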

  • Big Data Learning Resources

    - by Lara Rubbelke
    I have recently had several requests from people asking for resources to learn about Big Data and Hadoop. Below is a list of resources that I typically recommend. I'll update this list as I find more resources. Let's crowdsource this... Tell me your favorite resources and I'll get them on the list! Books and Whitepapers: Planning for Big Data (free e-book) – a great primer on the general Big Data space. This is always my recommendation for people who are new to Big Data and are trying to understand it...

  • E-Book on big data (featuring Analysts, Customers and more)

    - by Jean-Pierre Dijcks
    As we are gearing up for Openworld, here is a nice E-book on big data to start paging through. It contains Gartner's take on big data, customer and partner interviews and a lot more good info. Enjoy the read so you come prepared for Openworld!! Read the E-Book here. For those coming to Oracle Openworld (or the Americas Cup races around the same time), you can find big data sessions via this URL. Enjoy!!

  • Big Oh notation does not mention constant value

    - by user883561
    I am a programmer and have just started reading about algorithms. I am not completely convinced by the notations, namely Big Oh, Big Omega and Big Theta. The reason is that, by the definition of Big Oh, there should be a function g(x) such that it is always greater than or equal to f(x); that is, f(n) <= c.g(n) for all values of n >= n0. My doubt is: why don't we mention the constant value in the definition? For example, let's take the function 6n+4: we denote it as O(n), but the definition does not hold for every constant value. It holds, say, with c = 10 and n0 = 1, and for values of c closer to 6, the required n0 increases. So why do we not mention the constant value as part of the definition?
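    For concreteness, here is a worked check of the point in question (an added illustration, not part of the original post). The definition is existential – exhibiting one valid pair (c, n0) is enough:

    ```latex
    % Claim: f(n) = 6n + 4 is O(n).
    % Definition: f(n) \le c \cdot g(n) for all n \ge n_0, for SOME c > 0 and n_0.
    % Witness pair: c = 7, n_0 = 4:
    f(n) = 6n + 4 \le 6n + n = 7n \quad \text{for all } n \ge 4 .
    % Another valid witness: c = 10, n_0 = 1, since 6n + 4 \le 10n whenever n \ge 1.
    % For any c \le 6 no n_0 works, and that is fine: the definition never
    % promises that every constant succeeds, only that at least one does.
    ```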

  • Big data: An evening in the life of an actual buyer

    - by Jean-Pierre Dijcks
    Here I am, and this is an actual story of one of my evenings, trying to spend money with a company and ultimately failing. I just gave up and bought a service from another vendor, not the incumbent. Here is that story, and how I think big data could actually fix this (and potentially prevent some of it from happening). In the end this story should illustrate how big data can benefit me (get me what I want without causing grief) and the company I am trying to buy something from. Note: lots of details left out; I have no intention of being the annoyed blogger moaning about a specific company.

    What did I want to get? We watch TV, we have internet and we do have a land line. The land line is from a different vendor than the TV and the internet. I decided that this makes no sense and I was going to get a bundle (no need to infer who this is, I just picked the generic bundle word as this is what I want to get) of all three services, as this seems to save me money. I also want to not talk to people; I just want to click on a website when I feel like it and get it all sorted. I do think that is a reasonable expectation these days. I want to just do my shopping at 9.30pm while watching silly reruns on TV.

    Problem 1 – Bad links: So, I'm an existing customer of the company I want to buy my bundle from. I go to the website and click on offers. Turns out they are offers for new customers. After grumbling about how good those are, I click on offers for existing customers. Bummer, it goes to offers for new customers, so I click again on the link for offers for existing customers. No cigar... it just does not work. Big data solutions: 1) Do not show an existing customer the offers for new customers unless they are the same => this is only partially doable without a login, but if a customer logs in, the application should always know that this is an existing customer. In general, imagine I do this from my home, going through the internet service of this vendor to their domain... an instant filter should move me into the "existing customer route". 2) Flag dead or incorrect links => I clicked the link for "existing customer offers" at least 3 times in under 5 seconds... Identifying patterns like this is easy in Hadoop and can very quickly produce a list of potentially incorrect links (a sketch of this detection follows at the end of this post). No need for real-time fixing; just the fact that this link can be proactively fixed across my entire web domain is a good thing. Preventative maintenance!

    Problem 2 – Purchase cannot be completed: Apart from the fact that the browsing pattern to actually get to what I want is poorly designed, my purchase never gets past a specific point. In other words, I put something into my shopping cart, and when I want to move on, the application either crashes (sending me to an error page), hangs, or drops into something like chat. So I try again, and again, and again. I think I tried this entire path (while being logged in!!) at least 10 times over the course of 20 minutes. I also clicked on the feedback button and, frustrated as I was, tried to explain that this did not work... Big data solutions: 1) This web site does shopping cart analysis. I got an email the next day stating I have things in my shopping cart: just click here to complete my purchase. After the above experience, this just added insult to my pain... 2) What should have happened is a Hadoop job going over all logged-in customers on the buy flow. It should flag anyone who is trying repeatedly (multiple attempts from the same user to do the same thing), analyze the shopping cart and the clicks to identify what the customer wants, take in the feedback provided (note: always own your own website feedback, never just farm this out!!) and, within a short turnaround time (30 minutes to 2 hours or so), email me a link to complete my purchase. Not a link to my shopping cart 12 hours later, but a link to actually achieve what I wanted...

    Why should this company go through the big data effort? I do believe this is relatively easy to do using our Oracle Event Processing and Big Data Appliance solutions combined. It is almost so simple (to my mind) that it makes no sense that this is not in place. But now I am ranting... Why is this interesting? It is because of $$$$. I tried really hard: I did this all in the evening, and again in the morning before going to work. I kept on failing, but I really wanted this to work... An email that said "sorry, we noticed you tried to get a bundle (the log knows what I wanted and where I failed, so this is easy to generate); here is the link to click and complete your purchase, and here are 2 movies on us as an apology" would have kept me as a customer and earned the additional $$$$ per month for the next couple of years. It would also have led to upsell on my phone package, etc. Instead, I went to a completely different company and bought service from them. Lost money for company A, negative sentiment for company A, and me telling this story at the water cooler, influencing more people to think negatively about company A. All in all: a loss of easy money, and a ding in sentiment and image, where a relatively simple solution exists and can be put in place with the software I describe routinely in this blog... For those who are coming to Openworld and see value in solving the above, or are thinking about how to solve it, come visit us in Moscone North – Oracle Red Lounge or in the Engineered Systems Showcase.
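    As promised above, here is a hedged sketch of that "dead link" detection – flagging a user who hits the same URL three or more times within five seconds. The record shape, thresholds and names are illustrative; in production this would run as a Hadoop job over the web logs rather than in memory:

    ```java
    // Flag click sequences where the same user hits the same URL 3+ times
    // within a 5-second window. Assumes clicks arrive in timestamp order.
    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class DeadLinkDetector {
      record Click(String user, String url, long tsMillis) {}

      static void flagSuspectLinks(List<Click> clicks) {
        Map<String, Deque<Long>> windows = new HashMap<>();
        for (Click c : clicks) {
          String key = c.user() + "|" + c.url();
          Deque<Long> w = windows.computeIfAbsent(key, k -> new ArrayDeque<>());
          w.addLast(c.tsMillis());
          // Evict clicks older than 5 seconds from this user's window.
          while (c.tsMillis() - w.peekFirst() > 5_000) w.removeFirst();
          if (w.size() >= 3) {
            System.out.println("Possible dead link: " + c.url() + " (user " + c.user() + ")");
          }
        }
      }

      public static void main(String[] args) {
        flagSuspectLinks(List.of(
            new Click("jp", "/offers/existing", 0),
            new Click("jp", "/offers/existing", 1_200),
            new Click("jp", "/offers/existing", 2_500))); // third click trips the flag
      }
    }
    ```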

  • Big Data – Buzz Words: What is MapReduce – Day 7 of 21

    - by Pinal Dave
    In yesterday’s blog post we learned what Hadoop is. In this article we will take a quick look at one of the four most important buzz words that go around Big Data – MapReduce.

    What is MapReduce? MapReduce was designed by Google as a programming model for processing large data sets with a parallel, distributed algorithm on a cluster. Though MapReduce was originally proprietary Google technology, it has become quite a generic term in recent times. MapReduce comprises a Map() and a Reduce() procedure. The Map() procedure performs filtering and sorting operations on data, whereas the Reduce() procedure performs a summary operation on the data. This model is based on modified concepts of the map and reduce functions commonly available in functional programming. Libraries providing the Map() and Reduce() procedures have been written in many different languages. The most popular free implementation of MapReduce is Apache Hadoop, which we explored in yesterday’s post.

    Advantages of MapReduce Procedures: The MapReduce framework usually runs on distributed servers and runs its various tasks in parallel. Various components manage the communications between the data nodes and provide high availability and fault tolerance. Programs written in the MapReduce functional style are automatically parallelized and executed on commodity machines. The MapReduce framework takes care of the details of partitioning the data and executing the processes on distributed servers at run time. During this process, if there is any disaster, the framework provides high availability: the other available nodes take over the responsibility of the failed node. As you can clearly see, the MapReduce framework provides much more than just the Map() and Reduce() procedures; it provides scalability and fault tolerance as well. A typical implementation of the MapReduce framework processes many petabytes of data across thousands of processing machines.

    How Does the MapReduce Framework Work? A typical MapReduce deployment contains petabytes of data and thousands of nodes. Here is a basic explanation of the MapReduce procedures that use this massive pool of commodity servers (the canonical word-count example at the end of this post makes it concrete). Map() procedure: there is always a master node in this infrastructure which takes an input. Right after taking the input, the master node divides it into smaller sub-inputs, or sub-problems. These sub-problems are distributed to worker nodes. A worker node processes them and does the necessary analysis. Once a worker node completes its work on a sub-problem, it returns the result back to the master node. Reduce() procedure: all the worker nodes return the answers to the sub-problems assigned to them to the master node. The master node collects those answers and aggregates them into the answer to the original big problem it was assigned. The MapReduce framework runs the above Map() and Reduce() procedures in parallel and independently of each other. All the Map() procedures can run in parallel, and once each worker node has completed its task, it sends the result back to the master node, which compiles everything into a single answer. This particular procedure can be very effective when it is implemented on a very large amount of data (Big Data).

    The MapReduce framework has five different steps: preparing the Map() input; executing the user-provided Map() code; shuffling the Map output to the Reduce processors; executing the user-provided Reduce code; and producing the final output. Here is the dataflow of the MapReduce framework: Input Reader, Map Function, Partition Function, Compare Function, Reduce Function, Output Writer. In a future blog post of this 21-day series we will explore the various components of MapReduce in detail.

    MapReduce in a Single Statement: MapReduce is the equivalent of SELECT and GROUP BY in a relational database, for a very large database.

    Tomorrow: In tomorrow’s blog post we will discuss the Buzz Word – HDFS. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL
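    As promised above, the canonical word-count job makes the Map()/Reduce() split concrete. This is the standard illustrative example written against the org.apache.hadoop.mapreduce Java API – a sketch, not code from the original post:

    ```java
    // Word count with Hadoop MapReduce: Map() emits (word, 1) pairs;
    // Reduce() sums the counts after the framework shuffles by word.
    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

      public static class TokenizerMapper
          extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          // Filtering/sorting side: split each input line into words.
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, ONE); // emit (word, 1)
          }
        }
      }

      public static class IntSumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
          // Summary side: values are all counts for one word.
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();
          }
          context.write(key, new IntWritable(sum));
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class); // local pre-aggregation
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }
    ```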

  • Big Data – Operational Databases Supporting Big Data – Key-Value Pair Databases and Document Databases – Day 13 of 21

    - by Pinal Dave
    In yesterday’s blog post we learned the importance of the relational database and the NoSQL database in the Big Data story. In this article we will understand the role of key-value pair databases and document databases in supporting the Big Data story. We will now look at a few examples of operational databases: Relational Databases (yesterday’s post), NoSQL Databases (yesterday’s post), Key-Value Pair Databases (this post), Document Databases (this post), Columnar Databases (tomorrow’s post), Graph Databases (tomorrow’s post), Spatial Databases (tomorrow’s post).

    Key-Value Pair Databases: Key-value pair databases are also known as KVP databases. A key is a field name, an attribute, an identifier. The content of that field is its value – the data that is being identified and stored. KVP databases are a very simple implementation of NoSQL database concepts. They have no schema, hence they are very flexible as well as scalable. The disadvantages of KVP databases are that they do not follow the ACID (Atomicity, Consistency, Isolation, Durability) properties, and that they require data architects to plan for data placement, replication and high availability. In KVP databases the data is stored as strings. Here is a simple example of what a key-value database looks like (a toy code sketch of this model follows at the end of this post):

        Key      Value
        Name     Pinal Dave
        Color    Blue
        Twitter  @pinaldave
        Name     Nupur Dave
        Movie    The Hero

    As the number of users grows, a KVP database starts getting difficult to manage. As there is no specific schema or set of rules associated with the database, there is also a chance the database grows exponentially. It is very crucial to select the right KVP database – one which offers an additional set of tools to manage the data and provides finer control over various business aspects of it. Riak: Riak is one of the most popular key-value databases. It is known for its scalability and performance with high-volume, high-velocity data. Additionally, it implements a mechanism for collecting keys and values which further helps in building a manageable system. We will discuss Riak further in future blog posts. Key-value databases are a good choice for social media, communities, and caching layers for connecting to other databases. In simpler words: whenever we require flexible data storage with scalability in mind, KVP databases are a good option to consider.

    Document Databases: There are two different kinds of document databases: 1) stores of full document content (web pages, Word documents, etc.) and 2) stores of document components. It is the second kind of document database we are talking about here. These databases use JavaScript Object Notation (JSON) and Binary JSON (BSON) for the structure of their documents. JSON is a very easy to understand format and is very easy for applications to write. There are two major JSON structures used in document databases: 1) name-value pairs and 2) ordered lists. MongoDB and CouchDB are two of the most popular open-source non-relational document databases. MongoDB: MongoDB databases hold collections; each collection is built of documents, and each document is composed of fields. MongoDB collections can be indexed for optimal performance. The MongoDB ecosystem is highly available and supports query services as well as MapReduce. It is often used in high-volume content management systems. CouchDB: CouchDB databases are composed of documents which consist of fields and attachments. It supports ACID properties. The main attraction point of CouchDB is that it will continue to operate even when network connectivity is sketchy; due to this nature CouchDB favors local data storage. A document database is a good choice when users have to generate dynamic reports from elements which change very frequently. A good example of document database usage is real-time analytics in social networking or content management systems.

    Tomorrow: In tomorrow’s blog post we will discuss various other operational databases supporting Big Data. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL
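    As promised above, a toy sketch of the KVP model itself – schema-less string keys mapped to string values. Real KVP stores such as Riak layer partitioning, replication and consistency controls on top of essentially this contract; the class and sample values are illustrative:

    ```java
    // The essence of a key-value pair store: opaque string keys mapped to
    // opaque string values, with no schema enforcing what goes where.
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class ToyKvpStore {
      private final Map<String, String> data = new ConcurrentHashMap<>();

      public void put(String key, String value) { data.put(key, value); }
      public String get(String key) { return data.get(key); }
      public void delete(String key) { data.remove(key); }

      public static void main(String[] args) {
        ToyKvpStore store = new ToyKvpStore();
        store.put("Name", "Pinal Dave");    // sample rows from the table above
        store.put("Color", "Blue");
        store.put("Twitter", "@pinaldave");
        System.out.println(store.get("Twitter")); // prints @pinaldave
      }
    }
    ```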

  • Big Data – Buzz Words: What is NoSQL – Day 5 of 21

    - by Pinal Dave
    In yesterday’s blog post we explored the basic architecture of Big Data. In this article we will take a quick look at one of the four most important buzz words that go around Big Data – NoSQL.

    What is NoSQL? NoSQL stands for Not Relational SQL or Not Only SQL. Lots of people think that NoSQL means there is no SQL, which is not true – the terms sound the same but the meanings are totally different. NoSQL does use SQL, but it uses more than SQL to achieve its goal. As per Wikipedia’s NoSQL database definition: “A NoSQL database provides a mechanism for storage and retrieval of data that uses looser consistency models than traditional relational databases.”

    Why use NoSQL? A traditional relational database usually deals with predictable, structured data. But as the world has moved forward with unstructured data, we often see the limitations of the traditional relational database in dealing with it. For example, nowadays we have data in the form of SMS messages, wave files, photos and video. It is a bit difficult to manage such data using a traditional relational database. I often see people using a BLOB field to store it. A BLOB can store the data, but when we have to retrieve or process it, the same BLOB is extremely slow at handling unstructured data. A NoSQL database is the type of database that can handle the unstructured, unorganized and unpredictable data that our business needs. Along with support for unstructured data, the other advantages of a NoSQL database are high performance and high availability.

    Eventual Consistency: Note additionally that a NoSQL database may not provide 100% ACID (Atomicity, Consistency, Isolation, Durability) compliance. Though NoSQL databases may not support ACID, they provide eventual consistency: over a long period of time, all updates can be expected to propagate through the system eventually, and the data will become consistent.

    Taxonomy: Taxonomy is the practice and principles of classifying things or concepts. The NoSQL taxonomy supports column stores, document stores, key-value stores, and graph databases. We will discuss the taxonomy in detail in later blog posts. Here are a few examples of each NoSQL category. Column: HBase, Cassandra, Accumulo. Document: MongoDB, Couchbase, Raven. Key-value: Dynamo, Riak, Azure, Redis, Cache, GT.m. Graph: Neo4J, Allegro, Virtuoso, Bigdata. As of now there are over 150 NoSQL databases, and you can read everything about them at this single link.

    Tomorrow: In tomorrow’s blog post we will discuss the Buzz Word – Hadoop. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

  • Big Data – Buzz Words: What is Hadoop – Day 6 of 21

    - by Pinal Dave
    In yesterday’s blog post we learned what NoSQL is. In this article we will take a quick look at one of the four most important buzz words that go around Big Data – Hadoop.

    What is Hadoop? Apache Hadoop is an open-source, free, Java-based software framework that offers a powerful distributed platform to store and manage Big Data. It is licensed under the Apache v2 license. It runs applications on large clusters of commodity hardware and processes thousands of terabytes of data across thousands of nodes. Hadoop is inspired by Google’s MapReduce and Google File System (GFS) papers. A major advantage of the Hadoop framework is that it provides reliability and high availability.

    What are the core components of Hadoop? There are two major components of the Hadoop framework, and each of them performs one of its two important tasks. Hadoop MapReduce is the method used to split a larger data problem into smaller chunks and distribute them to many different commodity servers. Each server has its own set of resources and processes its chunk locally. Once a commodity server has processed its data, it sends the result back to the main server. This is, effectively, the process by which we handle large data efficiently (we will understand this in tomorrow’s blog post). Hadoop Distributed File System (HDFS) is a virtual file system. There is a big difference between HDFS and other file systems: when we move a file onto HDFS, it is automatically split into many small pieces. These small chunks of the file are replicated and stored on other servers (usually 3) for fault tolerance and high availability (we will understand this in the day after tomorrow’s blog post). Besides the above two core components, the Hadoop project also contains the following modules: Hadoop Common (common utilities for the other Hadoop modules) and Hadoop YARN (a framework for job scheduling and cluster resource management). There are a few other projects related to Hadoop as well (like Pig and Hive), which we will gradually explore in later blog posts.

    A Multi-node Hadoop Cluster Architecture: Now let us quickly look at the architecture of a multi-node Hadoop cluster. A small Hadoop cluster includes a single master node and multiple worker or slave nodes. As discussed earlier, the entire cluster contains two layers: one is the MapReduce layer and the other is the HDFS layer. Each of these layers has its own relevant components. The master node consists of a JobTracker, TaskTracker, NameNode and DataNode. A slave or worker node consists of a DataNode and a TaskTracker. It is also possible for a slave or worker node to be only a data node or only a compute node; as a matter of fact, that flexibility is a key feature of Hadoop. In this introductory blog post we will stop here with the description of the architecture. In a future blog post of this 21-day series we will explore the various components of the Hadoop architecture in detail.

    Why Use Hadoop? There are many advantages of using Hadoop. Let me quickly list them here: Robust and Scalable – we can add new nodes as needed, as well as modify them. Affordable and Cost Effective – we do not need any special hardware to run Hadoop; we can just use commodity servers. Adaptive and Flexible – Hadoop is built keeping in mind that it will handle structured and unstructured data. Highly Available and Fault Tolerant – when a node fails, the Hadoop framework automatically fails over to another node.

    Why is Hadoop named Hadoop? Hadoop was created in 2005 by Doug Cutting and Mike Cafarella. Cutting, who was working at Yahoo at the time, named Hadoop after his son’s toy elephant.

    Tomorrow: In tomorrow’s blog post we will discuss the Buzz Word – MapReduce. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

  • New Big Data Appliance Security Features

    - by mgubar
    The Oracle Big Data Appliance (BDA) is an engineered system for big data processing.  It greatly simplifies the deployment of an optimized Hadoop cluster – whether that cluster is used for batch or real-time processing.  The vast majority of BDA customers are integrating the appliance with their Oracle Databases and they have certain expectations – especially around security.  Oracle Database customers have benefited from a rich set of security features:  encryption, redaction, data masking, database firewall, label based access control – and much, much more.  They want similar capabilities with their Hadoop cluster.  Unfortunately, Hadoop wasn’t developed with security in mind.  By default, a Hadoop cluster is insecure – the antithesis of an Oracle Database.  Some critical security features have been implemented – but even those capabilities are arduous to set up and configure.  Oracle believes that a key element of an optimized appliance is that its data should be secure.  Therefore, by default the BDA delivers the “AAA of security”: authentication, authorization and auditing.

    Security Starts at Authentication: A successful security strategy is predicated on strong authentication – for both users and software services.  Consider the default configuration for a newly installed Oracle Database; it’s been a long time since you had a legitimate chance at accessing the database using the credentials “system/manager” or “scott/tiger”.  The default Oracle Database policy is to lock accounts, thereby restricting access; administrators must consciously grant access to users.

    Default Authentication in Hadoop: By default, a Hadoop cluster fails the authentication test. For example, it is easy for a malicious user to masquerade as any other user on the system.  Consider the following scenario that illustrates how a user can access any data on a Hadoop cluster by masquerading as a more privileged user.  In our scenario, the Hadoop cluster contains sensitive salary information in the file /user/hrdata/salaries.txt.  When logged in as the hr user, you can see the following files.  Notice, we’re using the Hadoop command line utilities for accessing the data:

    $ hadoop fs -ls /user/hrdata
    Found 1 items
    -rw-r--r--   1 oracle supergroup         70 2013-10-31 10:38 /user/hrdata/salaries.txt
    $ hadoop fs -cat /user/hrdata/salaries.txt
    Tom Brady,11000000
    Tom Hanks,5000000
    Bob Smith,250000
    Oprah,300000000

    User DrEvil has access to the cluster – and can see that there is an interesting folder called “hrdata”.

    $ hadoop fs -ls /user
    Found 1 items
    drwx------   - hr supergroup          0 2013-10-31 10:38 /user/hrdata

    However, DrEvil cannot view the contents of the folder due to lack of access privileges:

    $ hadoop fs -ls /user/hrdata
    ls: Permission denied: user=drevil, access=READ_EXECUTE, inode="/user/hrdata":oracle:supergroup:drwx------

    Accessing this data will not be a problem for DrEvil. He knows that the hr user owns the data by looking at the folder’s ACLs. To overcome this challenge, he will simply masquerade as the hr user. On his local machine, he adds the hr user, assigns that user a password, and then accesses the data on the Hadoop cluster:

    $ sudo useradd hr
    $ sudo passwd hr
    $ su hr
    $ hadoop fs -cat /user/hrdata/salaries.txt
    Tom Brady,11000000
    Tom Hanks,5000000
    Bob Smith,250000
    Oprah,300000000

    Hadoop has not authenticated the user; it trusts that the identity that has been presented is indeed the hr user. Therefore, sensitive data has been easily compromised.
    Clearly, the default security policy is inappropriate and dangerous to many organizations storing critical data in HDFS.

    Big Data Appliance Provides Secure Authentication: The BDA provides secure authentication to the Hadoop cluster by default – preventing the type of masquerading described above. It accomplishes this through Kerberos integration. (Figure 1: Kerberos Integration.) The Key Distribution Center (KDC) is a server that has two components: an authentication server and a ticket granting service. The authentication server validates the identity of the user and service. Once authenticated, a client must request a ticket from the ticket granting service – allowing it to access the BDA’s NameNode, JobTracker, etc. At installation, you simply point the BDA to an external KDC or automatically install a highly available KDC on the BDA itself. Kerberos will then provide strong authentication for not just the end user – but also for important Hadoop services running on the appliance. You can now guarantee that users are who they claim to be – and rogue services (like fake data nodes) are not added to the system. It is common for organizations to want to leverage existing LDAP servers for common user and group management. Kerberos integrates with LDAP servers – allowing the principals and encryption keys to be stored in the common repository. This simplifies the deployment and administration of the secure environment.

    Authorize Access to Sensitive Data: Kerberos-based authentication ensures secure access to the system and the establishment of a trusted identity – a prerequisite for any authorization scheme. Once this identity is established, you need to authorize access to the data. HDFS will authorize access to files using ACLs, with the authorization specification applied using classic Linux-style commands like chmod and chown (e.g. hadoop fs -chown oracle:oracle /user/hrdata changes the ownership of the /user/hrdata folder to oracle). Authorization is applied at the user or group level – utilizing group membership found in the Linux environment (i.e. /etc/group) or in the LDAP server. For SQL-based data stores – like Hive and Impala – finer-grained access control is required. Access to databases, tables, columns, etc. must be controlled. And you want to leverage roles to facilitate administration. Apache Sentry is a new project that delivers fine-grained access control; both Cloudera and Oracle are the project’s founding members. Sentry satisfies the following three authorization requirements: Secure Authorization – the ability to control access to data and/or privileges on data for authenticated users. Fine-Grained Authorization – the ability to give users access to a subset of the data (e.g. a column) in a database. Role-Based Authorization – the ability to create/apply template-based privileges based on functional roles. With Sentry, “all”, “select” or “insert” privileges are granted to an object. The descendants of that object automatically inherit that privilege. A collection of privileges across many objects may be aggregated into a role – and users/groups are then assigned that role. This leads to simplified administration of security across the system. (Figure 2: Object Hierarchy – granting a privilege on the database object will be inherited by its tables and views.) Sentry is currently used by both Hive and Impala – but it is a framework that other data sources can leverage when offering fine-grained authorization.
    For example, one can expect Sentry to deliver authorization capabilities to Cloudera Search in the near future.

    Audit Hadoop Cluster Activity: Auditing is a critical component of a secure system and is oftentimes required for SOX, PCI and other regulations. The BDA integrates with Oracle Audit Vault and Database Firewall – tracking different types of activity taking place on the cluster. (Figure 3: Monitored Hadoop services.) At the lowest level, every operation that accesses data in HDFS is captured. The HDFS audit log identifies the user who accessed the file, the time that file was accessed, the type of access (read, write, delete, list, etc.) and whether or not that file access was successful. The other auditing features include: MapReduce – correlates the MapReduce job that accessed the file; Oozie – describes who ran what as part of a workflow; Hive – captures changes made to the Hive metadata. The audit data is captured in the Audit Vault Server – which integrates audit activity from a variety of sources, adding databases (Oracle, DB2, SQL Server) and operating systems to activity from the BDA. (Figure 4: Consolidated audit data across the enterprise.) Once the data is in the Audit Vault server, you can leverage a rich set of prebuilt and custom reports to monitor all the activity in the enterprise. In addition, alerts may be defined to trigger on violations of audit policies.

    Conclusion: Security cannot be considered an afterthought in big data deployments. Across most organizations, Hadoop is managing sensitive data that must be protected; it is not simply crunching publicly available information used for search applications. The BDA provides a strong security foundation – ensuring users are only allowed to view authorized data and that data access is audited in a consolidated framework.

  • Big Data – Buzz Words: What is NewSQL – Day 10 of 21

    - by Pinal Dave
    In yesterday’s blog post we learned the importance of the relational database. In this article we will take a quick look at what NewSQL is.

    What is NewSQL? NewSQL stands for the new breed of scalable, high-performance SQL database vendors. The products sold by NewSQL vendors are horizontally scalable. NewSQL is not a kind of database; it is about vendors who support emerging data products with relational database properties (like ACID, transactions, etc.) along with high performance. Products from NewSQL vendors usually keep data in memory for speedy access and offer immediate scalability. The NewSQL term was coined by 451 Group analyst Matthew Aslett in a blog post. On the definition of NewSQL, Aslett writes: “‘NewSQL’ is our shorthand for the various new scalable/high performance SQL database vendors. We have previously referred to these products as ‘ScalableSQL’ to differentiate them from the incumbent relational database products. Since this implies horizontal scalability, which is not necessarily a feature of all the products, we adopted the term ‘NewSQL’ in the new report. And to clarify, like NoSQL, NewSQL is not to be taken too literally: the new thing about the NewSQL vendors is the vendor, not the SQL.” In other words, NewSQL incorporates the concepts and principles of Structured Query Language (SQL) and NoSQL languages. It combines the reliability of SQL with the speed and performance of NoSQL.

    Categories of NewSQL: There are three major categories of NewSQL. New Architectures – in this framework each node owns a subset of the data, and queries are split into smaller queries that are sent to the nodes to process the data, e.g. NuoDB, Clustrix, VoltDB. MySQL Engines – highly optimized storage engines that keep the MySQL interface, e.g. InnoDB, Akiban. Transparent Sharding – these systems automatically split a database across multiple nodes, e.g. ScaleArc.

    Summary: In simple words, NewSQL describes databases that follow relational database principles and provide scalability like NoSQL.

    Tomorrow: In tomorrow’s blog post we will discuss the role of cloud computing in Big Data. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

  • In Email, Image (img) Source (src) Tags are rewritten as relative links. How to fix?

    - by Noah Goodrich
    I'm working on sending out an HTML-based email, and every time it sends, the image src tags and some of the anchor href tags are modified to be relative URLs. Update 2: This is happening between when the body of the email is generated and sent, and when it arrives in my inbox. Update: I am using Postfix on a LAMPP server. In addition, I am using Zend_Mail to send the emails out. For example, I have a link: src="http://www.furnituretrainingcompany.com/images/email/highpoint_2009_04/header.jpg" And it gets rewritten as: src="../../../../images/email/highpoint_2009_04/header.jpg" What can cause this to occur and how is it corrected? Email headers:

    Return-Path: <[email protected]>
    X-Original-To: [email protected]
    Delivered-To: [email protected]
    Received: by mail.example.com (Postfix, from userid 0) id 6BF012252; Tue, 14 Apr 2009 12:15:20 -0600 (MDT)
    To: Gabriel <[email protected]>
    Subject: Free Map to Sales Success
    From: Somebody <[email protected]>
    Date: Tue, 14 Apr 2009 12:15:20 -0600
    Content-Type: text/html; charset="utf-8"
    Content-Transfer-Encoding: multipart/related
    Content-Disposition: inline
    Message-Id: <[email protected]>

    Original content to be sent out: <table align="center" border="0" cellpadding="0" cellspacing="0" width="600"> <tbody> <tr> <td valign="top"> <a href="http://www.furnituretrainingcompany.com"> <img moz-do-not-send="true" alt="The Furniture Training Company - Know More. Sell More." src="http://www.furnituretrainingcompany.com/images/email/highpoint_2009_04/header.jpg" border="0" height="123" width="600"> </a> </td> </tr> </tbody> </table> <table align="center" border="0" cellpadding="0" cellspacing="0" width="600"> <tbody> <tr> <td valign="top"><img alt="Visit us at High Point to receive your free training poster" src="http://www.furnituretrainingcompany.com/images/email/highpoint_2009_04/hero.jpg" moz-do-not-send="true" height="150" width="600"><br> </td> </tr> </tbody> </table> <table align="center" border="0" cellpadding="0" cellspacing="0" width="600"> <tbody> <tr> <td bgcolor="#ffffff" valign="top"><img alt="" src="http://www.furnituretrainingcompany.com/images/email/highpoint_2009_04/spacer_content_left.jpg" moz-do-not-send="true" height="30" width="30"><br> </td> <td bgcolor="#ffffff" valign="top"><font originaltag="yes" style="font-size: 9px; font-family: Verdana,Arial,Helvetica,sans-serif;" color="#000000" face="Verdana, Arial, Helvetica, sans-serif" size="1"><big><big><big><big><small><big><b>See you at Market</b></big><br> </small></big></big></big></big></font> <font originaltag="yes" style="font-size: 9px; font-family: Verdana,Arial,Helvetica,sans-serif;" color="#000000" face="Verdana, Arial, Helvetica, sans-serif" size="1"><big><big><big><big><small><br> </small></big></big></big></big></font><small><font face="Helvetica, Arial, sans-serif">Visit our space to get your free Map to Sales Success poster! This unique 24 X 36 color poster is your guide to developing high volume salespeople with larger tickets. Find us in the new NHFA Retailer Resource Center located in the Plaza. <br> <br> Don&#8217;t miss Mark Lacy&#8217;s entertaining seminar "Help Wanted! My Sales Associates Can&#8217;t Sell Water to a Thirsty Camel." He&#8217;ll reveal powerful secrets for turning sales associates into furniture experts that will sell. See him Saturday, April 25th at 11:30 AM in the seminar room of the new NHFA Retail Resource Center in the Plaza. 
<br> <br> Stop by our space to learn how our ingenious internet-delivered training courses are easy to use, guaranteed to work, and cheaper than the daily donuts. Over 95% report increased sales. <br> <br> Plan to see us at High Point. </font></small> <font originaltag="yes" style="font-size: 9px; font-family: Verdana,Arial,Helvetica,sans-serif;" color="#000000" face="Verdana, Arial, Helvetica, sans-serif" size="1"><big><big><big><big><small><small><br> <br> <br> <br> </small></small></big></big></big></big></font><small><font originaltag="yes" style="font-size: 9px; font-family: Verdana,Arial,Helvetica,sans-serif;" color="#000000" face="Verdana, Arial, Helvetica, sans-serif" size="1"><big><big><big><small> </small></big></big></big></font></small> <a href="http://www.furnituretrainingcompany.com/map"><img alt="Find out more" src="http://www.furnituretrainingcompany.com/images/email/highpoint_2009_04/image_content_left.jpg" moz-do-not-send="true" border="0" height="67" width="326"></a><br> <br> </td> <td bgcolor="#ffffff" valign="top"> <img alt="" src="http://www.furnituretrainingcompany.com/images/email/highpoint_2009_04/spacer_content_middle.jpg" moz-do-not-send="true" height="28" width="28"><br> </td> <td bgcolor="#ffffff" valign="top"><img alt="Roadmap to Sales Success poster" src="http://www.furnituretrainingcompany.com/images/email/highpoint_2009_04/image_content_right.jpg" moz-do-not-send="true" height="267" width="186"><br> <font face="Helvetica, Arial, sans-serif"><small><font originaltag="yes" style="font-size: 9px; font-family: Verdana,Arial,Helvetica,sans-serif;" color="#000000" size="1"><big><big><big><small><b>Road Map to Sales Success<br> </b><br> </small></big></big></big></font>This beautiful poster is yours free for simply stopping by and visiting with us at High Point. <span class="moz-txt-slash">Our space is located inside the </span>new NHFA Retailer Resource Center in the Plaza Suites, 222 South Main St, 1st Floor. We will be at market from Sat April 25th until Thur April 30th. 
</small></font><br> </td> <td bgcolor="#ffffff" valign="top"><img alt="" src="http://www.furnituretrainingcompany.com/images/email/highpoint_2009_04/spacer_content_right.jpg" moz-do-not-send="true" height="30" width="30"><br> <br> </td> </tr> </tbody> </table> <table align="center" border="0" cellpadding="0" cellspacing="0" width="600"> <tbody> <tr> <td bgcolor="#ffffff" valign="top"><img alt="" src="http://www.furnituretrainingcompany.com/images/email/highpoint_2009_04/disclaimer_divider.jpg" moz-do-not-send="true" height="25" width="600"><br> </td> </tr> </tbody> </table> <table align="center" border="0" cellpadding="0" cellspacing="0" width="600"> <tbody> <tr> <td bgcolor="#ffffff" valign="top"><img alt="" src="http://www.furnituretrainingcompany.com/images/email/highpoint_2009_04/spacer_disclaimer_left.jpg" moz-do-not-send="true"></td> <td bgcolor="#ffffff" valign="top"><img alt="" src="http://www.furnituretrainingcompany.com/images/email/highpoint_2009_04/spacer_disclaimer_middle.jpg" moz-do-not-send="true"><br> <font originaltag="yes" style="font-size: 9px; font-family: Verdana,Arial,Helvetica,sans-serif;" color="#666666" face="Verdana, Arial, Helvetica, sans-serif" size="1"><big><big><big><big><small><small><small>If you are not attending the High Point market in April but would still like to receive a free Road Map to Sales Success poster visit us on the web at <u><a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="http://www.furnituretrainingcompany.com">www.furnituretrainingcompany.com</a></u>, or to speak with a Furniture Training Company representative, call toll free (866) 755-5996. We do not offer free shipping outside of the U.S. and Canada. Retailers outside of the U.S. and Canada may call for more information. Limit one free Road Map to Sales Success per company. Other copies of the poster may be purchased on our web site.<br> <br> </small></small></small></big></big></big></big></font> <font color="#666666"><small><font originaltag="yes" style="font-size: 9px; font-family: Verdana,Arial,Helvetica,sans-serif;" face="Verdana, Arial, Helvetica, sans-serif" size="1"><big><big><big><small><small>We hope you found this message to be useful. However, if you'd rather not receive future emails of this sort from The Furniture Training Company, please <a moz-do-not-send="true" href="http://www.furnituretraining.com/contact">click here to unsubscribe</a>.<br> <br> </small></small></big></big></big></font></small><small><font originaltag="yes" style="font-size: 9px; font-family: Verdana,Arial,Helvetica,sans-serif;" face="Verdana, Arial, Helvetica, sans-serif" size="1"><big><big><big><small><small>&copy;Copyright 2009 The Furniture Training Company.<br> 1770 North Research Park Way, <br> North Logan, UT 84341. 
    <br> All Rights Reserved.</small></small></big></big></big></font></small></font><br> </td> <td bgcolor="#ffffff" valign="top"><img alt="" src="http://www.furnituretrainingcompany.com/images/email/highpoint_2009_04/spacer_disclaimer_right.jpg" moz-do-not-send="true"></td> </tr> </tbody> </table> <table align="center" border="0" cellpadding="0" cellspacing="0" width="600"> <tbody> <tr> <td bgcolor="#ffffff" valign="top"><img alt="" src="http://www.furnituretrainingcompany.com/images/email/highpoint_2009_04/footer.jpg" moz-do-not-send="true"> </td> </tr> </tbody> </table> <br> <br>

    Content that gets sent (quoted-printable encoded; condensed here because it is the same markup as above): note that at this point the image src attributes are still the absolute http://www.furnituretrainingcompany.com URLs. The only addition is an appended footer, which decodes to: <p><br /></p><br><hr><a href='http://localhost/ftc/app/unsubscribe.php?action=optOut&pid=6121&cid=19&[email protected]'>Click to Unsubscribe</a>

    Read the article

  • Big Data – ClustrixDB – Extreme Scale SQL Database with Real-time Analytics, Releases Software Download – NewSQL

    - by Pinal Dave
    There are so many things to learn and so little time in which to learn them. Since our time is limited, we need to be selective about what we learn. I believe I know quite a lot about SQL, but there is still much I do not know about the world around SQL. I have recently started to learn about NewSQL. If you wonder what NewSQL is, I encourage all of you to read my blog post Big Data – Buzz Words: What is NewSQL – Day 10 of 21. NewSQL databases are quickly becoming popular – providing the scale of NoSQL along with SQL features and transactions. As part of learning about NewSQL databases, I have recently started to learn about ClustrixDB. ClustrixDB is the most mature NewSQL database, used by some of the largest internet sites in the world for over 3 years, and it comes with extensive SQL support. In addition to scale, it provides fast real-time analytics by bringing massively parallel processing (MPP), previously available only in warehousing databases, to the transactional database. The reason I am even more intrigued about ClustrixDB is their recent announcement on Oct 31. ClustrixDB used to be available only as an appliance, but with the software release on Oct 31, everyone can use it. It is now available forever free for up to 12 cores with community support, and there is a 45-day trial for unlimited cluster sizes. With this forever-free offering, I am indeed interested in ClustrixDB now. I know that a few of the leading eCommerce sites in the world use it for their transactional database. Here are a few of the details I have quickly noted about ClustrixDB. ClustrixDB allows users to:
    - Scale by simply adding nodes to the cluster with a single command
    - Run billions of transactions a day
    - Run fast real-time analytics
    - Achieve high availability with recovery from node failure
    - Let the database manage itself
    - Migrate easily from MySQL, as ClustrixDB is nearly plug-and-play compatible and works with MySQL drivers, tools, and replication
    While going through the documentation I realized that ClustrixDB also has extensive support for SQL features, including complex queries involving joins on a dozen or more tables, aggregates, sorts, and sub-queries. It also supports stored procedures, triggers, foreign keys, partitioned and temporary tables, and fully online schema changes. It is indeed a very mature product and SQL solution. Since ClustrixDB sounds like a very promising solution, I decided to dig a bit deeper to understand who its current customers are, as the company has existed in the industry for quite a few years. Their client list is indeed very interesting, and here is my quick research about them:
    - Twoo.com – Europe's largest social discovery (dating) site runs 4.4 billion transactions a day, with table sizes over a terabyte, on a 168-core cluster.
    - EngageBDR – a top-3 player in the online advertising category uses ClustrixDB to serve 6.9 billion ads a day through a real-time bidding platform. Their reports went from 4 hours to 15 seconds.
    - NoMoreRack – among the two fastest-growing e-commerce companies in the US, it used ClustrixDB for high availability and fast growth through the Amazon cloud.
    - MakeMyTrip – India's leading travel site runs on ClustrixDB with two clusters running as multi-master in Chennai and Bangalore.
    Many enterprises such as AOL, CSC, Rakuten, and Symantec use ClustrixDB when their applications need scale. I must admit that I am impressed with the information I have learned so far, and now is the time to get some hands-on experience with the product. I want to learn this technology so that in the future, when the conversation is about NewSQL, I know what I am talking about.
    Read more about why Clustrix believes ClustrixDB might be the right database for you. Download ClustrixDB with me today and install it on your machine, so that when we discuss its technical aspects in the future, we are all on the same page. The software can be downloaded here. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Big Data, MySQL, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: Clustrix
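    To make the MySQL compatibility concrete, here is a minimal sketch of the kind of statement the post says ClustrixDB supports natively – a join combined with an aggregate and a sub-query – issued through a standard MySQL driver. The schema (customers, orders) and all column names are hypothetical examples of mine, not taken from the Clustrix documentation.

        -- Hedged sketch: hypothetical customers/orders schema.
        -- Find the ten highest-value customers since the first order
        -- of 2013, using a join, a sub-query and an aggregate.
        SELECT   c.customer_name,
                 SUM(o.order_total) AS lifetime_value
        FROM     customers c
        JOIN     orders o ON o.customer_id = c.customer_id
        WHERE    o.order_date >= (SELECT MIN(order_date)
                                  FROM   orders
                                  WHERE  YEAR(order_date) = 2013)
        GROUP BY c.customer_name
        ORDER BY lifetime_value DESC
        LIMIT    10;

    Because the wire protocol is MySQL-compatible, the same statement should run unchanged through existing MySQL drivers and tools, which is what makes the claimed plug-and-play migration plausible.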

    Read the article

  • Start your journey into Big Data with the Oracle Academy today!

    - by KLaker
     Big Data has the power to change the way we work, live, and think. The datafication of everything will create unprecedented demand for data scientists, software developers and engineers who can derive value from unstructured data to transform the world. The Oracle Academy Big Data Resource Guide is a collection of articles, videos, and other resources organized to help you gain a deeper understanding of the exciting field of Big Data. To start your journey, visit the Oracle Academy website here: https://academy.oracle.com/oa-web-big-data.html. This landing page will guide you through the whole area of big data using the following structure:
    - What is “Big Data”
    - Engineered Systems
    - Integration
    - Database and Data Analytics
    - Advanced Information
    - Supplemental Information
    This is a great resource packed with must-see videos and must-read whitepapers and blog posts by industry leaders. Enjoy! Technorati Tags: Big Data, Data Warehousing, Oracle, Training

    Read the article

  • Live from ODTUG - Big Data and SQL session #2

    - by Jean-Pierre Dijcks
    Sitting in Dominic Delmolino's session at ODTUG (KScope 12). If the session count at conferences is any indication, then we will see more and more people start to deploy MapReduce in the database. And yes, that would be with SQL and PL/SQL first and foremost. Both Dominic and our own Bryn Llewellyn are giving MapReduce-in-the-database presentations. Since I have seen both, I would advise people to first look through Dominic's session to get a good grasp on what mappers do and what reducers do, then dive into Bryn's for a bunch of PL/SQL examples. The thing I like about Dominic's is the last slide, which uses a recursive WITH statement to do this in plain SQL... Now I am hoping that next year we will see tool vendors show off how they work with Hadoop and MapReduce (at least talking about the concepts!!).
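    To make the analogy concrete, here is a minimal sketch of MapReduce semantics expressed in plain SQL, using a hypothetical raw_words(word) table of my own invention: the WITH clause plays the mapper, emitting a (key, value) pair per row, and the GROUP BY aggregate plays the reducer.

        -- Hedged sketch: raw_words(word) is a hypothetical input table.
        -- Map phase: emit a (key, 1) pair for every row.
        -- Reduce phase: sum the values per key, i.e. a word count.
        WITH mapped AS (
          SELECT LOWER(word) AS k, 1 AS v
          FROM   raw_words
        )
        SELECT   k, SUM(v) AS total
        FROM     mapped
        GROUP BY k
        ORDER BY total DESC;

    A recursive WITH, as in Dominic's closing slide, takes this further by letting SQL iterate the way a chain of MapReduce jobs would, but the map/reduce split above is the core of the idea.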

    Read the article

  • Why Oracle Data Integrator for Big Data?

    - by Mala Narasimharajan
    Big Data is everywhere these days - but what exactly is it? It's data that comes from a multitude of sources – not only structured data, but unstructured data as well. The sheer volume of data is mind-boggling – here are a few examples of big data: climate information collected from sensors, social media information, digital pictures, log files, online video files, medical records, and online transaction records. These are just a few examples of what constitutes big data. Embedded in big data is tremendous value, and being able to manipulate, load, transform and analyze big data is key to enhancing productivity and competitiveness. The value of big data lies in its propensity for greater in-depth analysis and data segmentation – in turn giving companies detailed information on product performance, customer preferences and inventory. Furthermore, by being able to store and create more data in digital form, “big data can unlock significant value by making information transparent and usable at much higher frequency.” (McKinsey Global Institute, May 2011) Oracle's flagship product for bulk data movement and transformation, Oracle Data Integrator, is a critical component of Oracle's Big Data strategy. ODI provides automation, bulk loading, and validation and transformation capabilities for Big Data while minimizing the complexities of using Hadoop. Specifically, the advantages of ODI in a Big Data scenario come from pre-built Knowledge Modules that drive processing in Hadoop. These modules leverage the graphical UI to load and unload data from Hadoop, perform data validations, and create mapping expressions for transformations. The Knowledge Modules provide a key jump-start and eliminate a significant amount of Hadoop development. Using Oracle Data Integrator together with Oracle Big Data Connectors, you can not only simplify the complexities of mapping, accessing, and loading big data (via NoSQL or HDFS), but also correlate it with your enterprise data – this correlation may require integrating across heterogeneous and standards-based environments, connecting to Oracle Exadata, or sourcing via a big data platform such as Oracle Big Data Appliance. To learn more about Oracle Data Integration and Big Data, download our resource kit to see the latest in whitepapers, webinars, downloads, and more… or go to our website at www.oracle.com/bigdata

    Read the article

  • What is the Big-O time complexity of this algorithm

    - by grebwerd
    I was wondering what the run time of this small program would be?

        #include <stdio.h>
        #include <stdlib.h>  /* for atoi */

        int main(int argc, char* argv[])
        {
            int i;
            int j;
            int inputSize;
            int sum = 0;

            if (argc == 1)
                inputSize = 16;
            else
                inputSize = atoi(argv[1]);  /* was argv[i]; i is uninitialized at this point */

            for (i = 1; i <= inputSize; i++) {
                for (j = i; j < inputSize; j *= 2) {
                    printf("The value of sum is %d\n", ++sum);
                }
            }
            return 0;
        }

    My guess is that the total work is $\sum_{i=1}^{n} \lfloor \log n - \log(n-i) \rfloor$, with each inner loop contributing the floor of $\log n - \log(n-i)$ iterations. Would the run time be $n \log n$?
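    For what it is worth, here is a hedged sketch of the analysis, writing $n$ for inputSize. The inner loop doubles $j$ starting from $i$, so for each $i$ it executes about $\lceil \log_2(n/i) \rceil$ times (note $\log n - \log i$, not $\log n - \log(n-i)$):

    $$T(n) \;=\; \sum_{i=1}^{n} \left\lceil \log_2 \frac{n}{i} \right\rceil \;\le\; n + \log_2 \frac{n^n}{n!}.$$

    By Stirling's approximation, $\log_2 n! = n \log_2 n - n \log_2 e + O(\log n)$, so

    $$T(n) \;\le\; n + n \log_2 n - \bigl(n \log_2 n - n \log_2 e\bigr) + O(\log n) \;=\; (1 + \log_2 e)\,n + O(\log n) \;=\; O(n),$$

    and the matching lower bound $T(n) \ge \sum_{i=1}^{n} \log_2(n/i) = n \log_2 e - O(\log n)$ gives $T(n) = \Theta(n)$. So, perhaps surprisingly, the printf executes only $\Theta(n)$ times and the program runs in $\Theta(n)$ overall, not $n \log n$: the inner loops are long only for small $i$, and their cost shrinks fast enough that the total stays linear.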

    Read the article

  • NRF Week - Disney Store Tour

    - by sarah.taylor(at)oracle.com
    Disney has created a real buzz at this year's NRF event. Yesterday morning we began the Oracle Retail Exchange program with a visit to the flagship Disney store in Times Square. Additionally, Oracle made a key announcement with Disney on Oracle Retail's Point of Sale implementation in 330 stores worldwide. Today, Disney's Steve Finney gave a super session on The Magic of Disney at the NRF Big Show. We also saw Disney making an exclusive news announcement about their plans for global store openings at the Oracle trade show stand - with a little help from Mickey and Minnie Mouse. Disney Stores have been entirely reinvented since the company took ownership in 2008, after previously franchising the retail arm of the business. They have subsequently been a strong Oracle partner, and technology has played a key role in their reimagination of the store environment. The new Imagination stores have 20% higher footfall, and margins are up 25%. The Disney brand is synonymous with magical and memorable experiences for children of all ages. The company is achieving a unique retail experience that delights children and shareholders alike! Technology is a key pillar in helping to deliver on both a strong operating model and a unique customer experience - the best thirty minutes in a child's day is their aim. Steve Finney this morning said their technology has to be as reliable as a theme park ride. Store experiences are much more enjoyable when there are short waiting times and children can interact with their favourite characters through magic mirrors, mobile point of sale, touch screens and custom animations that are digitally transmitted to stores globally. The Oracle Retail Point of Sale with iPad touch screens reduces checkout times, stores customer data, ensures that promotions are delivered accurately, and reduces losses. This means higher levels of guest conversion, plus increased availability and convenience for customers who want to check availability at other locations. Disney is a pioneer. At NRF's 100th show, we had the privilege of learning from a retailer using technology as a creative force to drive their business forward.

    Read the article

  • Willy Rotstein on Supply Chain Planning

    - by sarah.taylor(at)oracle.com
    Each time a merchandiser, buyer or planner in Retail makes a business decision around assortment, inventory, pricing and promotions, there is an opportunity to improve both Profitability and Customer Service. Improving decision making, however, has always been a tricky business for retailers. I have worked in this space for more than 15 years. I began my career as an academic at Imperial College London, and then broadened this interest by working with retailers, aiming to optimize their merchandising and supply chain decisions. Planning the business and optimizing profit is a complex process. The complexity arises from the variety of people involved, the large number of decisions to take across all business processes, and the uncertainty intrinsic to the retail environment, as well as the volume of data available for analysis. Things are not getting any easier, either. The advent of multi-channel, social media and mobile is taking these complexities to a new level and presenting additional opportunities for those willing to exploit them. I guess it is due to the complexities of the decision making process that, over the last couple of years working with Oracle Retail, I have witnessed a clear trend around the deployment of planning systems. Retailers are aiming to simplify their decision making processes. They want to use one joined-up planning platform across the business and enhance it with "actionable" data mining and optimization techniques. At Oracle Retail, we have a vibrant community of international retailers who regularly come together to discuss the big issues in retail planning. It is a combination of fashion, grocery and speciality retailers, all sharing their best practice vision for planning and optimizing merchandise decisions. As part of the Retail Exchange program, at the recent National Retail Federation event in New York, I jointly hosted a Planning dinner with Peter Fitzgerald from Google UK, Retail Division. Those retailers from our international planning community who were in New York for the annual NRF event were able to attend. The group comprised some of Europe's great international retail brands. All sectors were represented by organisations like Mango, LVMH, Ahold, Morrisons, Shop Direct and River Island. They confirmed the current importance of engaging with Planning and Optimization issues. We had a great debate about new retail initiatives. Peter highlighted how mobility is changing retail - in particular with the new "local availability search" initiative. We also had an exciting discussion around the opportunities to improve merchandising using the new data that is becoming available from search, social media and ecommerce sites. It will be our focus to continue to help retailers translate this data into better results while keeping their business operations simple. New developments in "actionable" analytics and computing capacity make this a very exciting area today. Watch this space for my contributions on these topics, which will be made available through this blog. Oracle Retail has a strong Planning community. If you are a category manager, a planner, a buyer, a merchandiser, a retail supplier or any retail executive with a keen interest in planning, then you would be very welcome to join Oracle Retail's Planning Community.
    As part of our community you will be able to join our in-person and virtual events, and download topical white papers and best practice information specifically tailored to your area of interest. If you would like to register your interest in joining our community of retailers discussing planning, then please contact me at [email protected]. Willy Rotstein, Oracle Retail

    Read the article

  • Why CFOs Should Care About Big Data

    - by jmorourke
    The topic of “big data” has clearly reached a tipping point in 2012. With plenty of coverage over the past few years in the IT press, we are now starting to see the topic of “big data” covered in the mainstream business press, including a cover story in the October 2012 issue of the Harvard Business Review. To help customers understand the challenges of managing “big data”, as well as the opportunities that can be created by leveraging it, Oracle has recently run and published the results of a customer survey, as well as white papers and articles on this topic. Most recently, we commissioned a white paper titled “Mastering Big Data: CFO Strategies to Transform Insight into Opportunity”. The premise here is that “big data” is not just a topic that CIOs should pay attention to, but one that CFOs should understand and take advantage of as well. Clearly, whoever masters the art and science of big data will be positioned for competitive advantage in their industries or markets. That’s why smart CFOs are taking control of big data and business analytics projects, not just to uncover new ways to drive growth in a slowing global economy, but also to be a catalyst for change in the enterprise. With an increasing number of CFOs now responsible for overseeing IT investments and providing strategic insight to the board, they will be increasingly called upon to take a leadership role in assessing the value of “big data” initiatives, building on their traditional skills in reporting and helping managers analyze data to support decision making. Here are links to the white paper referenced above, which is posted on the Oracle C-Central/CFO web site, as well as some other resources that can help CFOs master the topic of “big data”:
    - White Paper: “Mastering Big Data: CFO Strategies to Transform Insight into Opportunity”
    - CFO Market Watch article: “Does Big Data Affect the CFO?”
    - Oracle Survey Report: “From Overload to Impact – An Industry Scorecard on Big Data Industry Challenges”
    - Upcoming Big Data Webcast with Andrew McAfee
    Here’s a general link to Oracle C-Central/CFO in case you want to start there: www.oracle.com/c-central/cfo. Feel free to contact me if you have any questions or need additional information: [email protected]

    Read the article

  • Oracle Big Data Learning Library - Click on LEARN BY PRODUCT to Open Page

    - by chberger
    Oracle Big Data Learning Library... Learn about Oracle Big Data, Data Science, Learning Analytics, Oracle NoSQL Database, and more! Oracle Big Data Essentials – attend this Oracle University course! Using Oracle NoSQL Database – attend this Oracle University class! Oracle and Big Data on OTN – see the latest resources on OTN. The library is organized by product:
    - Oracle Big Data Appliance: Oracle Big Data and Data Science Basics; Meeting the Challenge of Big Data; Oracle Big Data Tutorial Video Series; Oracle MoviePlex – a Big Data End-to-End Series of Demonstrations; Oracle Big Data Overview; Oracle Big Data Essentials; Data Mining
    - Oracle NoSQL Database: Oracle NoSQL Database Tutorial Videos; Oracle NoSQL Database Tutorial Series; Oracle NoSQL Database Release 2 New Features; Using Oracle NoSQL Database
    - Exalytics: Enterprise Manager 12c R3: Manage Exalytics; Setting Up and Running Summary Advisor on an Exalytics Machine
    - Oracle R Enterprise: Oracle R Enterprise Tutorial Series
    - Oracle Big Data Connectors: Integrate All Your Data with Oracle Big Data Connectors; Using Oracle Direct Connector for HDFS to Read the Data from HDFS; Using Oracle R Connector for Hadoop to Analyze Data
    - Oracle Business Intelligence Enterprise Edition: Oracle BI 11g R1: Create Analyses and Dashboards (4-day class); Oracle BI Publisher 11g R1: Fundamentals (3-day class); Oracle BI 11g R1: Build Repositories (5-day class)

    Read the article

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >