Search Results

Search found 66013 results on 2641 pages for 'big data analytics'.

Page 13/2641 | < Previous Page | 9 10 11 12 13 14 15 16 17 18 19 20  | Next Page >

  • Setting up page goals in Analytics when using progressive enhancement to load content using jquery .load

    - by sam
    I'm using jQuery .load to load content from other pages into my homepage. So that Google can still see what's going on, I've made the <a> tags point to the pages but override them in JavaScript, so that instead of going to that page the click just loads the content from that page into the main page. Normally I would just make the page /contact.html a goal. Can I still get it to work as a goal if the content is being loaded in? Can I do something so that when the user clicks <a href="contact.html" id="load-contact">contact</a>, it logs the click on the <a> tag as a goal, rather than the actual page being visited?
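    One way to approach the question above, sketched under assumptions: with the classic ga.js queue, a "virtual pageview" can be recorded for the path that was loaded, so that a URL-based goal on /contact.html has something to match. The helper name `trackLoadedPage` and the `#content` container are hypothetical; `_gaq` and `_trackPageview` are the ga.js names quoted elsewhere on this page.

```javascript
// Sketch only: on a real page the ga.js snippet defines `_gaq`.
var _gaq = _gaq || [];

// Hypothetical helper: queue a virtual pageview for the loaded path.
function trackLoadedPage(path) {
  _gaq.push(['_trackPageview', path]);
  return path;
}

// In the browser this would sit in the click handler that performs the
// .load(), for example (assumed markup, not from the original post):
//   $('#load-contact').on('click', function (e) {
//     e.preventDefault();
//     $('#content').load('contact.html');
//     trackLoadedPage('/contact.html');
//   });
trackLoadedPage('/contact.html');
```

    The link keeps its real href, so crawlers and users without JavaScript still reach contact.html directly.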

    Read the article

  • Google Analytics Goal tracking external referrals

    - by user1561108
    I have goal tracking set up on my site for a target URL. It works insofar as it tracks all pageviews on my site leading up to the goal. But it doesn't appear to track the external referrer that the user initially came from, marking it as (entrance) and the step before that as (not set). Is it standard behaviour for goals not to record the external referrer, and how can I add referrer tracking to my goal?

    Read the article

  • Developing a php system that tracks other websites analytics

    - by CodeCrack
    I want to develop a PHP website feature where users sign up, get a JavaScript snippet that displays an image on their site, and lets me track the number of visitors, unique hits, clicks and average visitor duration on their page. Should that be done with open source analytics software such as http://piwik.org/, or is it pretty doable on your own? If I had to do it myself from scratch, I would use an image/pixel to track the visit, drop a cookie with the JavaScript snippet to track uniques, and track clicks based on an image click and redirect; I'm not sure about the bounce rate. Any thoughts or opinions are welcome.

    Read the article

  • Google analytics goal funnel visualization issues

    - by Lauren
    This is the goal funnel for checkout. Does anyone have any idea where the "/" is coming from? The cart page is at site: game on glove dot com (I don't want this stackoverflow page being indexed in google particularly well). Go to the site, click on the order button, make your selection, and click the button to enter the cart (it resolves to /Cart and /Shop-Cart). I believe I used the regular expression matching to match "cart". So why the "/" (I don't know what is causing the home page to reload when users are on the Cart page within a Colorbox lightbox where the only way back to home or "/" is to hit the exit button in the top right of the lightbox)? Here's my one guess for the former question but it doesn't seem likely: See the "check out with paypal" button? If you hovered over it, it does default to the home page which is what might be the "/"... but it really redirects the user to the paypal.com page so it shouldn't also load the home page.

    Read the article

  • Google Analytics showing more unique visitors than there are pages on an intranet site

    - by DDEX
    I take care of a company intranet and measure the traffic with GA. I am absolutely sure that there are no more than 5000 URLs in our company, and it is impossible to reach the intranet from outside the company network. Yet when I check the number of Unique Visitors (UV) for the last year, GA says there were 36,500 of them. How is that possible? I thought UV should measure each URL only once in the given time period. Could anybody explain how this actually works? Can it be that the tracking cookies expire after some time and are counted more than once?
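    A small sketch may help separate the two ideas in this question. Unique visitors are counted per visitor identifier (in classic GA, an ID kept in a browser cookie), not per URL, so the number of URLs on the site does not cap the figure. The hit records and the `visitorId` field below are made up for illustration.

```javascript
// Count distinct visitor IDs; the URL of each hit plays no part in the count.
function countUniqueVisitors(hits) {
  const ids = new Set(hits.map(function (hit) { return hit.visitorId; }));
  return ids.size;
}

// Hypothetical data: four hits, three distinct cookie IDs.
const sampleHits = [
  { visitorId: 'cookie-a', url: '/home' },
  { visitorId: 'cookie-a', url: '/news' },
  { visitorId: 'cookie-b', url: '/home' },
  // The same person after clearing cookies or switching browser
  // appears under a new ID and is counted again.
  { visitorId: 'cookie-c', url: '/home' }
];

const uniqueVisitors = countUniqueVisitors(sampleHits);
```

    Under this model, one employee using several browsers or machines, or clearing cookies, contributes more than one unique visitor over a year.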

    Read the article

  • Use Google Analytics to target different sections of a blog

    - by Emily Yao
    I have a blog that targets different regions. The Europe region blog has different sections in different local languages such as English, French and German. I wonder how to track and analyze the different sections. My initial thought is to search the domain URL, but I found it is not a good idea. For example, the URL for the Europe blog is like www.myblog.com/europe. If you click the French section, the URL is like www.myblog.com/europe/language/french. If you click an article in the French section, it is like www.myblog.com/article_name. Notice the article link is not www.myblog.com/language/french/article_name!

    Read the article

  • Text limit on analytics event code

    - by Theo G
    I am just about to add the event code to a button that downloads a PDF. Event code fields: _trackEvent(category, action, opt_label, opt_value, opt_noninteraction). Example of event code: onClick="_gaq.push(['_trackEvent', 'Videos', 'Play', 'Baby\'s First Birthday']);" I was wondering whether anyone knows if there is a text limit on the opt_value. Do you think the following would be too long: 'Elmhurst School says IPC has made all the difference'?
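    A note on the fields above, with a sketch: in ga.js the descriptive text belongs in opt_label (the third argument, as in the 'Baby\'s First Birthday' example), while opt_value is an integer. The helper `buildTrackEvent` and the 'Downloads' / 'PDF' names are hypothetical, and the sketch does not assert any particular length limit.

```javascript
// Build a _trackEvent command array for the ga.js queue.
// label: optional string (opt_label); value: optional integer (opt_value).
function buildTrackEvent(category, action, label, value) {
  const command = ['_trackEvent', category, action];
  if (label !== undefined) {
    command.push(String(label));
  }
  if (value !== undefined) {
    if (label === undefined) {
      throw new Error('a label is needed so the value lands in the fourth slot');
    }
    if (!Number.isInteger(value)) {
      throw new TypeError('opt_value must be an integer');
    }
    command.push(value);
  }
  return command;
}

// The long text goes in the label slot, not the value slot.
const pdfEvent = buildTrackEvent(
  'Downloads',
  'PDF',
  'Elmhurst School says IPC has made all the difference'
);
// On the page: _gaq.push(pdfEvent);
```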

    Read the article

  • Google analytics and 301 redirects

    - by Ilian Iliev
    We have a multi-language website, and the first page redirects to a language-specific page using a 301 redirect based on some logic. For example, http://mysite.com/ redirects to http://mysite.com/en/. The problem is that these redirects destroy the primary request, so we do not get correct results for traffic sources in GA. How do you handle this case? Is there something we can do? Any ideas will be appreciated.

    Read the article

  • Include multiple IP addresses in Google Analytics

    - by RubenGeert
    I sometimes access my own website from my home/work/girlfriend IP addresses. I'd like to create a filter that includes any of these and nothing else. I thought a custom include filter with a very basic regex should do the trick. The regex I use is 62\.58\.32\.193|77\.172\.143\.12$|213\.125\.166\.98 to include 62.58.32.193 and 77.172.143.12 and 213.125.166.98 and no other IP addresses. I obviously tested it before using it. However, pageviews seem to be stuck at zero even though I did generate internal traffic. Does anybody understand what I'm doing wrong?
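    The pattern quoted above is only partly anchored: the `$` applies to the middle alternative alone, and nothing anchors the start. A sketch of a fully anchored version, tested against sample addresses, is below. It tightens the match and does not by itself explain the zero pageviews; the helper names are made up for illustration.

```javascript
// The three addresses from the question.
const allowedIps = ['62.58.32.193', '77.172.143.12', '213.125.166.98'];

// Escape regex metacharacters so each dot matches a literal dot.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// ^( ... | ... | ... )$ anchors every alternative at both ends.
const anchoredPattern = '^(' + allowedIps.map(escapeRegex).join('|') + ')$';
const ipFilter = new RegExp(anchoredPattern);

// The pattern as posted, for comparison.
const originalFilter = /62\.58\.32\.193|77\.172\.143\.12$|213\.125\.166\.98/;

const checks = {
  anchoredMatchesHome: ipFilter.test('62.58.32.193'),          // true
  anchoredRejectsPrefix: !ipFilter.test('162.58.32.193'),      // true
  originalAcceptsPrefix: originalFilter.test('162.58.32.193')  // true: unanchored
};
```

    The string in `anchoredPattern` is what would be pasted into the filter field.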

    Read the article

  • Google Analytics content experiment vs. funnel visualization

    - by Full Decent
    I am running an experiment on one of the pages in my funnel (the "Choose shipping options" page). But the numbers on the different reports do not correspond. First, I am expecting the 70 entrances in the funnel to equal the 131 experiment visits. Also, I expect the 23 conversions in the funnel to match the 21 transactions below. But they do not. How should I read this information to make good decisions?

    Read the article

  • How to use Google Analytics as an affiliate to track sales data

    - by lalex
    As an affiliate, how can we get more information on sales? It looks like the goals feature in GA is for those who have control over the receipt page. But we are sending users away using an affiliate link. With event tracking, we've been able to count the clicks and see which links are being clicked the most, but not which ones actually convert. We want to find out the following on each sale: Did the converted user come from search or internal traffic? If it was search, which keyword brought the user to our site (and clicked away and converted)? Is it possible?

    Read the article

  • Google analytics campaign advice

    - by Drewsdesign
    I am buying traffic from a broker rather than a single source and sending it to various landing pages. I would like to know the best way to structure a campaign so I can find which referring site/URL is performing best (time on site, bounce rate, etc.). Should the utm_campaign be the 'brokername' and the utm_source be the 'landingpagename', or should it be the other way around? Also, what would be the best way to create a custom report showing all the referrers' metrics by landing page? Thanks guys, I really appreciate any help on this.
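    One possible mapping, offered as an assumption rather than the answer: GA already reports the landing page as its own dimension, so the utm parameters can be spent on where the visit came from, with utm_source as the referring site and utm_campaign as the broker. The function name, the 'paid' medium and the example hostnames are all hypothetical.

```javascript
// Append campaign parameters to a landing-page URL.
function tagLandingUrl(landingUrl, referringSite, brokerName) {
  const url = new URL(landingUrl);
  url.searchParams.set('utm_source', referringSite); // site that sent the visit
  url.searchParams.set('utm_medium', 'paid');        // assumed medium label
  url.searchParams.set('utm_campaign', brokerName);  // the traffic broker
  return url.toString();
}

const taggedUrl = tagLandingUrl(
  'http://example.com/landing-a',
  'site-one.example',
  'brokername'
);
```

    With this tagging, a report on source broken down by landing page would compare referring sites without the landing page name being repeated in a utm field.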

    Read the article

  • As good as no bounces are registered in Google Analytics [on hold]

    - by user29931
    For a client I am having an issue for which I can't find an explanation. For some reason, bounces are no longer measured (or almost none) in GA. In GA I can see that the issue started in January 2013. I have been looking at the code inside and out, but I can't find any reason why. On the production site there is a POST done on page load (it will be removed soon), so I thought that Google might see this as user interaction and hence never count a bounce, but on staging I removed this POST and in the GA account for staging still no bounces are registered. I have also checked whether the tracking code appears twice on the page, and this is not the case. I tried the GA debug plugin in Firefox and Chrome to see if that would teach me anything, but no luck... The site in question is www.kiala.be.

    Read the article

  • Google Analytics for subdomains

    - by Sebastian
    I have two WordPress multisites under one domain - city-x.domain.com and city-y.domain.com. domain.com is a landing page where you select your city, and a cookie will redirect the user to that city on subsequent visits. I'd like to be able to track the number of hits on all pages on domain.com, city-x.domain.com and city-y.domain.com separately and combined. How is this done? On a side note, I've heard that GA underestimates hits. As this is important for advertising purposes, is there a better free service?
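    For the combined view, a common ga.js approach is to use one account ID on all three hosts and set the cookie domain to the root domain with `_setDomainName`. The sketch below only queues those commands; the wrapper function is hypothetical, the account ID is the placeholder used elsewhere on this page, and splitting the data back out per hostname in the reports is assumed to be done separately.

```javascript
// Sketch only: on a real page the ga.js snippet defines `_gaq`.
var _gaq = _gaq || [];

// Queue the same tracking setup on domain.com and on each city subdomain.
function queueSharedDomainTracking(accountId, rootDomain) {
  _gaq.push(['_setAccount', accountId]);
  _gaq.push(['_setDomainName', rootDomain]); // share cookies across subdomains
  _gaq.push(['_trackPageview']);
  return _gaq;
}

queueSharedDomainTracking('UA-XXXXXXXX-X', 'domain.com');
```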

    Read the article

  • Architecture for database analytics

    - by David Cournapeau
    Hi, We have an architecture where we provide each customer with Business Intelligence-like services for their website (internet merchant). Now I need to analyze that data internally (for algorithmic improvement, performance tracking, etc.), and it is potentially quite heavy: we have up to millions of rows / customer / day, and I may want to know how many queries we had in the last month, compared week by week, etc. That is on the order of billions of entries, if not more. The way it is currently done is quite standard: daily scripts which scan the databases and generate big CSV files. I don't like this solution for several reasons: as is typical with those kinds of scripts, they fall into the write-once, never-touched-again category; tracking things in "real-time" is necessary (we have a separate toolset to query the last few hours ATM); and it is slow and non-"agile". Although I have some experience in dealing with huge datasets for scientific usage, I am a complete beginner as far as traditional RDBMSs go. It seems that using a column-oriented database for analytics could be a solution (the analytics don't need most of the data we have in the app database), but I would like to know what other options are available for this kind of issue.

    Read the article

  • Accessing SQL Data Services via ADO.NET Data Service Client Library

    - by Mehmet Aras
    Is this possible? Basically I would like to use SQL Data Services REST interface and let the ADO.NET Data Service Client library handle communication details and generate the entities that I can use. I looked at the samples in February release of Azure services kit but the samples in there are using HttpWebRequest and HttpWebResponse to consume SQL Data Services RESTfully. I was hoping to use ADO.NET Data Service Client library to abstract low-level details away.

    Read the article

  • Why does Google Analytics use two domains?

    - by AKeller
    I'm building a distributed widget that is comparable to Google Analytics. Users will add a <script> tag to their site that references my widget's JavaScript file. The Google Analytics tracking code looks like this: var _gaq = _gaq || []; _gaq.push(['_setAccount', 'UA-XXXXXXXX-X']); _gaq.push(['_trackPageview']); (function () { var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true; ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s); })(); Can anyone explain the reasoning behind separate HTTP and HTTPS hostnames? My instinct is to just secure the www address and then use the protocol-less syntax, like //www.google-analytics.com/ga.js. But I'm sure the Google Analytics architects put a lot of thought into this approach. I'd love to understand their logic before I follow/ignore their model.
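    To make the comparison concrete, here is a sketch of what the quoted snippet computes, pulled out into a pure function, next to the protocol-relative form the question proposes. It does not answer why Google chose two hostnames; the function name is hypothetical.

```javascript
// Mirrors the hostname selection in the quoted ga.js snippet:
// HTTPS pages load from ssl.google-analytics.com, others from www.
function gaScriptUrl(pageProtocol) {
  const prefix = pageProtocol === 'https:' ? 'https://ssl' : 'http://www';
  return prefix + '.google-analytics.com/ga.js';
}

// The alternative from the question: one hostname, scheme inherited
// from the embedding page.
const protocolRelativeUrl = '//www.google-analytics.com/ga.js';

const secureUrl = gaScriptUrl('https:');
const plainUrl = gaScriptUrl('http:');
```

    In the browser, `pageProtocol` would be `document.location.protocol`, as in the original snippet.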

    Read the article

  • Suggested Web Application Framework and Database for Enterprise, “Big-Data” App?

    - by willOEM
    I have a web application that I have been developing for a small group within my company over the past few years, using Pipeline Pilot (plus jQuery and Python scripting) for web development and back-end computation, and Oracle 10g for my RDBMS. Users upload experimental genomic data, which is parsed into a database, and made available for querying, transformation, and reporting. Experimental data sets are large and have many layers of metadata. A given experimental data record might have a foreign key relationship with a table that describes this data point's assay. Assays can cover multiple genes, which can have multiple transcript, which can have multiple mutations, which can affect multiple signaling pathways, etc. Users need to approach this data from any point in those layers in the metadata. Since all data sets for a given data type can run over a billion rows, this results in some large, dynamic queries that are hard to predict. New data sets are added on a weekly basis (~1GB per set). Experimental data is never updated, but the associated metadata can be updated weekly for a few records and yearly for most others. For every data set insert the system sees, there will be between 10 and 100 selects run against it and associated data. It is okay for updates and inserts to run slow, so long as queries run quick and are as up-to-date as possible. The application continues to grow in size and scope and is already starting to run slower than I like. I am worried that we have about outgrown Pipeline Pilot, and perhaps Oracle (as the sole database). Would a NoSQL database or an OLAP system be appropriate here? What web application frameworks work well with systems like this? I'd like the solution to be something scalable, portable and supportable X-years down the road. 
Here is the current state of the application: Web Server/Data Processing: Pipeline Pilot on Windows Server + IIS; Database: Oracle 10g, ~1TB of data, ~180 tables with several billion-plus row tables; Network Storage: Isilon, ~50TB of low-priority raw data.

    Read the article

  • Bridging Two Worlds: Big Data and Enterprise Data

    - by Dain C. Hansen
    The big data world is all the vogue in today’s IT conversations. It’s a world of volume, velocity, variety – tantalizing us with its untapped potential. It’s a world of transformational game-changing technologies that have already begun to alter the information management landscape. One of the reasons that big data is so compelling is that it’s a universal challenge that impacts every one of us. Whether it is healthcare, financial, manufacturing, government, retail - big data presents a pressing problem for many industries: how can so much information be processed so quickly to deliver the ‘bigger’ picture? With big data we’re tapping into new information that didn’t exist before: social data, weblogs, sensor data, complex content, and more. What also makes big data revolutionary is that it turns traditional information architecture on its head, putting into question commonly accepted notions of where and how data should be aggregated, processed, analyzed, and stored. This is where Hadoop and NoSQL come in – new technologies which solve new problems for managing unstructured data. 
And now for some worst practices that I'd recommend that you please not follow: Worst Practice Lesson 1: Throw away everything that you already know about data management, data integration tools, and start completely over. One shouldn’t forget what’s already running in today’s IT. Today’s Business Analytics, Data Warehouses, Business Applications (ERP, CRM, SCM, HCM), and even many social, mobile, cloud applications still rely almost exclusively on structured data – or what we’d like to call enterprise data. This dilemma is what today’s IT leaders are up against: what are the best ways to bridge enterprise data with big data? And what are the best strategies for dealing with the complexities of these two unique worlds? Worst Practice Lesson 2: Throw away all of your existing business applications … because they don’t run on big data yet. Bridging the two worlds of big data and enterprise data means considering solutions that are complete, based on emerging Hadoop technologies (as well as traditional), and are poised for success through integrated design tools and integrated platforms that connect to your existing business applications, as well as support real-time analytics. Leveraging these types of best practices translates to improved productivity, lowered TCO, IT optimization, and better business insights. Worst Practice Lesson 3: Separate out [and keep separate] your big data sandboxes from all the current enterprise IT systems. Don’t mix sand among playgrounds. We didn't tell you that you wouldn't get dirty doing this. Correlation between the two worlds is key. The real advantage to analyzing big data comes when you can correlate it with the existing data in your data warehouse or your current applications to make sense of the larger patterns. If you have not followed these worst practices 1-3 then you qualify for the first step of our journey: bridging the two worlds of enterprise data and big data. 
Over the next several weeks we’ll be discussing this topic along with several others around big data as it relates to data integration. We welcome you to join us in the conversation by following us on twitter on #BridgingBigData or download our latest white paper and resource kit: Big Data and Enterprise Data: Bridging Two Worlds.

    Read the article

  • SQL SERVER – Data Sources and Data Sets in Reporting Services SSRS

    - by Pinal Dave
    This example is from the Beginning SSRS by Kathi Kellenberger. Supporting files are available with a free download from the www.Joes2Pros.com web site. Connecting to Your Data? When I was a child, the telephone book was an important part of my life. Maybe I was just a nerd, but I enjoyed getting a new book every year to page through to learn about the businesses in my small town or to discover where some of my school acquaintances lived. It was also the source of maps to my town’s neighborhoods and the towns that surrounded me. To make a phone call, I would need a telephone number. In order to find a telephone number, I had to know how to use the telephone book. That seems pretty simple, but it resembles connecting to any data. You have to know where the data is and how to interact with it. A data source is the connection information that the report uses to connect to the database. You have two choices when creating a data source, whether to embed it in the report or to make it a shared resource usable by many reports. Data Sources and Data Sets A few basic terms will make the upcoming choices make more sense. What database on what server do you want to connect to? It would be better to just ask… “what is your data source?” The connection you need to make to get your report’s data is called a data source. If you connected to a data source (like the JProCo database) there may be hundreds of tables. You probably only want data from just a few tables. This means you want to write a specific query against this data source. A query on a data source to get just the records you need for an SSRS report is called a Data Set. Creating a local Data Source You can embed a connection from your report directly to your JProCo database which (let’s say) is installed on a server named Reno. 
If you move JProCo to a new server named Tampa then you need to update the Data Source. If you have 10 reports in one project that were all pointing to the JProCo database on the Reno server then they would all need to be updated at once. It’s possible to make a project level Data Source and have each report use that. This means one change can fix all 10 reports at once. This would be called a Shared Data Source. Creating a Shared Data Source The best advice I can give you is to create shared data sources. The reason I recommend this is that if a database moves to a new server you will have just one place in Report Manager to make the server name change. That one change will update the connection information in all the reports that use that data source. To get started, you will start with a fresh project. Go to Start > All Programs > SQL Server 2012 > Microsoft SQL Server Data Tools to launch SSDT. Once SSDT is running, click New Project to create a new project. Once the New Project dialog box appears, fill in the form, as shown in. Be sure to select Report Server Project this time – not the wizard. Click OK to dismiss the New Project dialog box. You should now have an empty project, as shown in the Solution Explorer. A report is meant to show you data. Where is the data? The first task is to create a Shared Data Source. Right-click on the Shared Data Sources folder and choose Add New Data Source. The Shared Data Source Properties dialog box will launch where you can fill in a name for the data source. By default, it is named DataSource1. The best practice is to give the data source a more meaningful name. It is possible that you will have projects with more than one data source and, by naming them, you can tell one from another. Type the name JProCo for the data source name and click the Edit button to configure the database connection properties. 
If you take a look at the types of data sources you can choose, you will see that SSRS works with many data platforms including Oracle, XML, and Teradata. Make sure SQL Server is selected before continuing. For this post, I am assuming that you are using a local SQL Server and that you can use your Windows account to log in to the SQL Server. If, for some reason you must use SQL Server Authentication, choose that option and fill in your SQL Server account credentials. Otherwise, just accept Windows Authentication. If your database server was installed locally and with the default instance, just type in Localhost for the Server name. Select the JProCo database from the database list. At this point, the connection properties should look like. If you have installed a named instance of SQL Server, you will have to specify the server name like this: Localhost\InstanceName, replacing the InstanceName with whatever your instance name is. If you are not sure about the named instance, launch the SQL Server Configuration Manager found at Start > All Programs > Microsoft SQL Server 2012 > Configuration Tools. If you have a named instance, the name will be shown in parentheses. A default instance of SQL Server will display MSSQLSERVER; a named instance will display the name chosen during installation. Once you get the connection properties filled in, click OK to dismiss the Connection Properties dialog box and OK again to dismiss the Shared Data Source properties. You now have a data source in the Solution Explorer. What’s next I really need to thank Kathi Kellenberger and Rick Morelan for sharing this material for this 5 day series of posts on SSRS. To get really comfortable with SSRS you will get to know the different SSDT windows, Build reports on your own (without the wizards),  Add report headers and footers, Accept user input,  create levels, charts, or even maps for visual appeal. 
You might be surprised to know that a small 230-page book starts from the very beginning and covers the steps to do all these items. Beginning SSRS 2012 is a small, easy-to-follow book, so you can learn SSRS for less than $20. See Joes2Pros.com for more on this and other books. If you want to learn SSRS in simple words, I strongly recommend you get the Beginning SSRS book from Joes 2 Pros. Reference: Pinal Dave (http://blog.sqlauthority.com)

    Read the article

  • Oracle Financial Management Analytics 11.1.2.2.300 is available

    - by THE
    (guest post by Greg) Oracle Financial Management Analytics 11.1.2.2.300 is now available for download from My Oracle Support as Patch 15921734 New Features in this release: Support for the new Oracle BI mobile HD iPad client. New Account Reconciliation Management and Financial Data Quality Management analytics Improved Hyperion Financial Management analytics and usability enhancements Enhanced Configuration Utility to support multiple products. For HFM, FCM or ARM, and FDM, we support both Oracle and Microsoft SQL Server database. Simplified Test to Production migration of OFMA. Web browsers support for Oracle Financial Management Analytics: Internet Explorer Version 9 - The Oracle Financial Management Analytics supports the Internet Explorer 9 Web browser (for both 32 and 64 bit). Firefox Version 6.x - The Oracle Financial Management Analytics supports the Firefox 6.x Web browser. Chrome Version 12.x - The Oracle Financial Management Analytics supports the Chrome 12.x Web browser. See OBIEE Certification Matrix 11.1.1.6:  http://www.oracle.com/technetwork/middleware/ias/downloads/fusion-certification-100350.html Oracle Financial Management Analytics Compatibility: The Oracle Financial Management Analytics supports the following product version: Oracle Hyperion Financial Data Quality Management Release 11.1.2.2.300 Oracle Financial Close Manager Release 11.1.2.2.300 Oracle Hyperion Financial Management Release 11.1.2.2.300  

    Read the article

  • Link google analytics (private account) with Adwords (client account)

    - by Jorre
    I have a Google Analytics account with all my (and my clients') websites linked in it. This is great for managing all analytics in one place. I'm now running a Google AdWords campaign for a client (under a different email address from my Google Analytics account) and I want to keep track of AdWords stats in Google Analytics. Is that even possible? Or do I have to create separate Google Analytics accounts for every client I'm running AdWords for?

    Read the article

  • How do I find information on who links to my sites?

    - by bobdobbs
    I'm trying to figure out if there's a free way to get information on backlinks to my site. I've had Webmaster Tools and Google Analytics set up for years, but I can't find data about site backlinks in either toolset. Webmaster Tools, under 'traffic'-'links to your site', gives me the same message for all of my sites: "No data available". I haven't been able to find anything in GA that gives any information on backlinks. I've heard of using "links:" as an operator in Google search, but for each of my sites this returns either zero or very few results, even in cases when I know I have many backlinks. Most of the links simply aren't shown. My thinking is that Google maintains a graph of who links to my site, so I figured that they might let me see it. But I can't figure out how. I've found this tool on a spammy website: http://www.backlinkwatch.com. It offers more data than Google on my backlinks, and offers more results in exchange for a paid subscription. The data it offers for free looks good, but the results are limited and the site has popups and obnoxious ads. So, in short: how do I get data on who links to me? Is there a free way?

    Read the article

  • Oracle Big Data Software Downloads

    - by Mike.Hallett(at)Oracle-BI&EPM
    Companies have been making business decisions for decades based on transactional data stored in relational databases. Beyond that critical data, is a potential treasure trove of less structured data: weblogs, social media, email, sensors, and photographs that can be mined for useful information. Oracle offers a broad integrated portfolio of products to help you acquire and organize these diverse data sources and analyze them alongside your existing data to find new insights and capitalize on hidden relationships. Oracle Big Data Connectors Downloads here, includes: Oracle SQL Connector for Hadoop Distributed File System Release 2.1.0 Oracle Loader for Hadoop Release 2.1.0 Oracle Data Integrator Companion 11g Oracle R Connector for Hadoop v 2.1 Oracle Big Data Documentation The Oracle Big Data solution offers an integrated portfolio of products to help you organize and analyze your diverse data sources alongside your existing data to find new insights and capitalize on hidden relationships. Oracle Big Data, Release 2.2.0 - E41604_01 zip (27.4 MB) Integrated Software and Big Data Connectors User's Guide HTML PDF Oracle Data Integrator (ODI) Application Adapter for Hadoop Apache Hadoop is designed to handle and process data that is typically from data sources that are non-relational and data volumes that are beyond what is handled by relational databases. Typical processing in Hadoop includes data validation and transformations that are programmed as MapReduce jobs. Designing and implementing a MapReduce job usually requires expert programming knowledge. However, when you use Oracle Data Integrator with the Application Adapter for Hadoop, you do not need to write MapReduce jobs. Oracle Data Integrator uses Hive and the Hive Query Language (HiveQL), a SQL-like language for implementing MapReduce jobs. 
Employing familiar and easy-to-use tools and pre-configured knowledge modules (KMs), the application adapter provides the following capabilities: Loading data into Hadoop from the local file system and HDFS Performing validation and transformation of data within Hadoop Loading processed data from Hadoop to an Oracle database for further processing and generating reports Oracle Database Loader for Hadoop Oracle Loader for Hadoop is an efficient and high-performance loader for fast movement of data from a Hadoop cluster into a table in an Oracle database. It pre-partitions the data if necessary and transforms it into a database-ready format. Oracle Loader for Hadoop is a Java MapReduce application that balances the data across reducers to help maximize performance. Oracle R Connector for Hadoop Oracle R Connector for Hadoop is a collection of R packages that provide: Interfaces to work with Hive tables, the Apache Hadoop compute infrastructure, the local R environment, and Oracle database tables Predictive analytic techniques, written in R or Java as Hadoop MapReduce jobs, that can be applied to data in HDFS files You install and load this package as you would any other R package. Using simple R functions, you can perform tasks such as: Access and transform HDFS data using a Hive-enabled transparency layer Use the R language for writing mappers and reducers Copy data between R memory, the local file system, HDFS, Hive, and Oracle databases Schedule R programs to execute as Hadoop MapReduce jobs and return the results to any of those locations Oracle SQL Connector for Hadoop Distributed File System Using Oracle SQL Connector for HDFS, you can use an Oracle Database to access and analyze data residing in Hadoop in these formats: Data Pump files in HDFS Delimited text files in HDFS Hive tables For other file formats, such as JSON files, you can stage the input in Hive tables before using Oracle SQL Connector for HDFS. 
Oracle SQL Connector for HDFS uses external tables to provide Oracle Database with read access to Hive tables, and to delimited text files and Data Pump files in HDFS. Related Documentation Cloudera's Distribution Including Apache Hadoop Library HTML Oracle R Enterprise HTML Oracle NoSQL Database HTML Recent Blog Posts Big Data Appliance vs. DIY Price Comparison Big Data: Architecture Overview Big Data: Achieve the Impossible in Real-Time Big Data: Vertical Behavioral Analytics Big Data: In-Memory MapReduce Flume and Hive for Log Analytics Building Workflows in Oozie

    Read the article

  • PostgreSQL to Data-Warehouse: Best approach for near-real-time ETL / extraction of data

    - by belvoir
    Background: I have a PostgreSQL (v8.3) database that is heavily optimized for OLTP. I need to extract data from it on a semi real-time basis and feed it into a data warehouse (someone is bound to ask what semi real-time means; the answer is as frequently as I reasonably can, but I will be pragmatic: as a benchmark, let's say we are hoping for every 15 min). How much data? At peak times we are talking approx. 80-100k rows per minute hitting the OLTP side; off-peak this drops significantly to 15-20k. The most frequently updated rows are ~64 bytes each, but there are various tables, so the data is quite diverse and can range up to 4000 bytes per row. The OLTP is active 24x5.5. Best solution? From what I can piece together, the most practical solution is as follows: (1) create a TRIGGER to write all DML activity to a rotating CSV log file; (2) perform whatever transformations are required; (3) use the native DW data pump tool to efficiently pump the transformed CSV into the DW. Why this approach? TRIGGERs allow selective tables to be targeted rather than being system-wide, the output is configurable (i.e. into a CSV), and they are relatively easy to write and deploy; SLONY uses a similar approach and its overhead is acceptable; CSV is easy and fast to transform; and it is easy to pump CSV into the DW. Alternatives considered: (1) Using native logging (http://www.postgresql.org/docs/8.3/static/runtime-config-logging.html). The problem with this is that it looked very verbose relative to what I needed and was a little trickier to parse and transform. However, it could be faster, as I presume there is less overhead compared to a TRIGGER. Certainly it would make the admin easier as it is system-wide, but again, I don't need some of the tables (some are used for persistent storage of JMS messages, which I do not want to log). (2) Querying the data directly via an ETL tool such as Talend and pumping it into the DW. The problem is that the OLTP schema would need to be tweaked to support this, and that has many negative side effects. (3) Using a tweaked/hacked SLONY. SLONY does a good job of logging and migrating changes to a slave, so the conceptual framework is there, but the proposed solution just seems easier and cleaner. (4) Using the WAL. Has anyone done this before? Want to share your thoughts?
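    The transformation step in the proposed pipeline can be sketched as follows, assuming the captured changes are available as plain records. This is an illustration in JavaScript, not the poster's trigger code; the field names (`op`, `tbl`, `payload`) and both function names are hypothetical.

```javascript
// Quote a single value for CSV: wrap in double quotes when it contains a
// comma, quote or line break, and double any embedded quotes.
function toCsvField(value) {
  if (value === null || value === undefined) {
    return '';
  }
  const text = String(value);
  if (/[",\r\n]/.test(text)) {
    return '"' + text.replace(/"/g, '""') + '"';
  }
  return text;
}

// Turn a batch of change records into CSV lines with a fixed column order.
function changesToCsv(changes, columns) {
  return changes
    .map(function (row) {
      return columns.map(function (name) { return toCsvField(row[name]); }).join(',');
    })
    .join('\n');
}

// Hypothetical batch of captured DML activity.
const csvBatch = changesToCsv(
  [
    { op: 'INSERT', tbl: 'orders', payload: 'a,b' },
    { op: 'UPDATE', tbl: 'orders', payload: null }
  ],
  ['op', 'tbl', 'payload']
);
```

    Keeping the column order explicit means the data warehouse loader sees a stable layout even if the record objects gain extra fields.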

    Read the article
