Search Results

Search found 2017 results on 81 pages for 'hadoop streaming'.

Page 43/81 | < Previous Page | 39 40 41 42 43 44 45 46 47 48 49 50  | Next Page >

  • What if any free video player for windows can I set a custom frame buffer for streaming over the wifi?

    - by user268883
    I currently use DAUM PLayer (Pot Player) but I have problems streaming video to my laptop unless it is plugged into the network. An easy fix (so I thought) would be to find a player were I could adjust the cache / buffer so say 5 mins+ is read to the local Hdrive and then played from there... I can not work out how to do this in any player I have VLC, MPCHC and DAUM installed. All I want to do is increase the file buffer so X amount of the networkfile is copied to the local drive and then played... so all stuttering is stopped? How do I do this?

    Read the article

  • How do you stop windows 7 from auto streaming mp3's online?

    - by angryuser
    Be it IE, Firefox or Chrome whenever I try to download a media file windows 7 starts streaming it in the browser instead of giving me options about what I want to do with the file i'm trying to download. I know the problem is with the OS and not the browser because I can download the file just fine off the website when I use Ubuntu. I get the feeling somewhere a setting is saying "open all mp3's in browser" but I dont know where to find or change it. Can anyone help? Edit: If I click on the FLAC version of the audio file, windows 7 automatically downloads it. If I click on the MP3 version, it automatically streams it to the browser.

    Read the article

  • Hey Guy , I want ot streaming video by using VideView class . Can anyone tell me what format is it s

    - by eddyxd
    Hi , I am the newbie of android, but i hava seen the tutorial and implement some simple applications. The question i met is that I am tring to stream some video from my server to android, but the android VideoView class just plays the audition sololy without "image"@@!~ Here is my setting and android code : 1. android core code: mVideoView01.setVideoURI(Uri.parse("rtsp://192.168.16.1:8080/test.sdp")); mVideoView01.start(); 2. my streaming server is VLC and the command is: vlc -vvv d:\nobody.mp4 --sout=#transcode{vcodec=h264,width=320,hegiht=240}:rtp{dst=192.168.16.1,port=4444,sdp=rtsp://192.168.16.1:8080/test.sdp} ps: My ip is got from DHCP but I have checked it really can be connected(Android could play audition after all) ps2: I haved trid to stream some video from "http://www.americafree.tv/" and the playing is good!!@@ So I guess that the problem maybe is caused by streaming Video format, but I have almost tried every figument option form VLC, and it still don't workQQ. So Have anyone done the same test as me can give me some advice?? Thanks a lot!!!!! by eddy

    Read the article

  • Audio queue start failed

    - by mobapps99
    Hi , i'm developing a project which has both audio streaming and playing audio from file. For audio streaming i'm using AudioStreamer and for playing from file i'm using avaudioplayer. Both streaming and playing works perfectly as long as the app is not interrupted by a phone call or sms. But when a call/sms comes after dismissing the call when i try to restart streaming i'm getting the error "Audio queue start failed" . This happens only when i have used avaudioplayer at least once and after that used streaming. When the avaudioplayer obeject is not created , in this scenario the there is no problem with resuming streaming after dismissing the call. My guess is that some thing is wrong with audioqueue. Help is very much appreciated.......

    Read the article

  • Know your Data Lineage

    - by Simon Elliston Ball
    An academic paper without the footnotes isn’t an academic paper. Journalists wouldn’t base a news article on facts that they can’t verify. So why would anyone publish reports without being able to say where the data has come from and be confident of its quality, in other words, without knowing its lineage. (sometimes referred to as ‘provenance’ or ‘pedigree’) The number and variety of data sources, both traditional and new, increases inexorably. Data comes clean or dirty, processed or raw, unimpeachable or entirely fabricated. On its journey to our report, from its source, the data can travel through a network of interconnected pipes, passing through numerous distinct systems, each managed by different people. At each point along the pipeline, it can be changed, filtered, aggregated and combined. When the data finally emerges, how can we be sure that it is right? How can we be certain that no part of the data collection was based on incorrect assumptions, that key data points haven’t been left out, or that the sources are good? Even when we’re using data science to give us an approximate or probable answer, we cannot have any confidence in the results without confidence in the data from which it came. You need to know what has been done to your data, where it came from, and who is responsible for each stage of the analysis. This information represents your data lineage; it is your stack-trace. If you’re an analyst, suspicious of a number, it tells you why the number is there and how it got there. If you’re a developer, working on a pipeline, it provides the context you need to track down the bug. If you’re a manager, or an auditor, it lets you know the right things are being done. Lineage tracking is part of good data governance. Most audit and lineage systems require you to buy into their whole structure. If you are using Hadoop for your data storage and processing, then tools like Falcon allow you to track lineage, as long as you are using Falcon to write and run the pipeline. It can mean learning a new way of running your jobs (or using some sort of proxy), and even a distinct way of writing your queries. Other Hadoop tools provide a lot of operational and audit information, spread throughout the many logs produced by Hive, Sqoop, MapReduce and all the various moving parts that make up the eco-system. To get a full picture of what’s going on in your Hadoop system you need to capture both Falcon lineage and the data-exhaust of other tools that Falcon can’t orchestrate. However, the problem is bigger even that that. Often, Hadoop is just one piece in a larger processing workflow. The next step of the challenge is how you bind together the lineage metadata describing what happened before and after Hadoop, where ‘after’ could be  a data analysis environment like R, an application, or even directly into an end-user tool such as Tableau or Excel. One possibility is to push as much as you can of your key analytics into Hadoop, but would you give up the power, and familiarity of your existing tools in return for a reliable way of tracking lineage? Lineage and auditing should work consistently, automatically and quietly, allowing users to access their data with any tool they require to use. The real solution, therefore, is to create a consistent method by which to bring lineage data from these data various disparate sources into the data analysis platform that you use, rather than being forced to use the tool that manages the pipeline for the lineage and a different tool for the data analysis. The key is to keep your logs, keep your audit data, from every source, bring them together and use the data analysis tools to trace the paths from raw data to the answer that data analysis provides.

    Read the article

  • How to Play PC Games on Your TV

    - by Chris Hoffman
    No need to wait for Valve’s Steam Machines — connect your Windows gaming PC to your TV and use powerful PC graphics in the living room today. It’s easy — you don’t need any unusual hardware or special software. This is ideal if you’re already a PC gamer who wants to play your games on a larger screen. It’s also convenient if you want to play multiplayer PC games with controllers in your living rom. HDMI Cables and Controllers You’ll need an HDMI cable to connect your PC to your television. This requires a TV with HDMI-in, a PC with HDMI-out, and an HDMI cable. Modern TVs and PCs have had HDMI built in for years, so you should already be good to go. If you don’t have a spare HDMI cable lying around, you may have to buy one or repurpose one of your existing HDMI cables. Just don’t buy the expensive HDMI cables — even a cheap HDMI cable will work just as well as a more expensive one. Plug one end of the HDMI cable into the HDMI-out port on your PC and one end into the HDMI-In port on your TV. Switch your TV’s input to the appropriate HDMI port and you’ll see your PC’s desktop appear on your TV.  Your TV becomes just another external monitor. If you have your TV and PC far away from each other in different rooms, this won’t work. If you have a reasonably powerful laptop, you can just plug that into your TV — or you can unplug your desktop PC and hook it up next to your TV. Now you’ll just need an input device. You probably don’t want to sit directly in front of your TV with a wired keyboard and mouse! A wireless keyboard and wireless mouse can be convenient and may be ideal for some games. However, you’ll probably want a game controller like console players use. Better yet, get multiple game controllers so you can play local-multiplayer PC games with other people. The Xbox 360 controller is the ideal controller for PC gaming. Windows supports these controllers natively, and many PC games are designed specifically for these controllers. Note that Xbox One controllers aren’t yet supported on Windows because Microsoft hasn’t released drivers for them. Yes, you could use a third-party controller or go through the process of pairing a PlayStation controller with your PC using unofficial tools, but it’s better to get an Xbox 360 controller. Just plug one or more Xbox controllers into your PC’s USB ports and they’ll work without any setup required. While many PC games to support controllers, bear in mind that some games require a keyboard and mouse. A TV-Optimized Interface Use Steam’s Big Picture interface to more easily browse and launch games. This interface was designed for using on a television with controllers and even has an integrated web browser you can use with your controller. It will be used on the Valve’s Steam Machine consoles as the default TV interface. You can use a mouse with it too, of course. There’s also nothing stopping you from just using your Windows desktop with a mouse and keyboard — aside from how inconvenient it will be. To launch Big Picture Mode, open Steam and click the Big Picture button at the top-right corner of your screen. You can also press the glowing Xbox logo button in the middle of an Xbox 360 Controller to launch the Big Picture interface if Steam is open. Another Option: In-Home Streaming If you want to leave your PC in one room of your home and play PC games on a TV in a different room, you can consider using local streaming to stream games over your home network from your gaming PC to your television. Bear in mind that the game won’t be as smooth and responsive as it would if you were sitting in front of your PC. You’ll also need a modern router with fast wireless network speeds to keep up with the game streaming. Steam’s built-in In-Home Streaming feature is now available to everyone. You could plug a laptop with less-powerful graphics hardware into your TV and use it to stream games from your powerful desktop gaming rig. You could also use an older desktop PC you have lying around. To stream a game, log into Steam on your gaming PC and log into Steam with the same account on another computer on your home network. You’ll be able to view the library of installed games on your other PC and start streaming them. NVIDIA also has their own GameStream solution that allows you to stream games from a PC with powerful NVIDIA graphics hardware. However, you’ll need an NVIDIA Shield handheld gaming console to do this. At the moment, NVIDIA’s game streaming solution can only stream to the NVIDIA Shield. However, the NVIDIA Shield device can be connected to your TV so you can play that streaming game on your TV. Valve’s Steam Machines are supposed to bring PC gaming to the living room and they’ll do it using HDMI cables, a custom Steam controller, the Big Picture interface, and in-home streaming for compatibility with Windows games. You can do all of this yourself today — you’ll just need an Xbox 360 controller instead of the not-yet-released Steam controller. Image Credit: Marco Arment on Flickr, William Hook on Flickr, Lewis Dowling on Flickr

    Read the article

  • Improving TCP performance over a gigabit network lots of connections and high traffic for storage and streaming services

    - by Linux Guy
    I have two servers, Both servers hardware Specification are Processor : Dual Processor RAM : over 128 G.B Hard disk : SSD Hard disk Outging Traffic bandwidth : 3 Gbps network cards speed : 10 Gbps Server A : for Encoding videos Server B : for storage videos andstream videos over web interface like youtube The inbound bandwidth between two servers is 10Gbps , the outbound bandwidth internet bandwidth is 500Mpbs Both servers using public ip addresses in public and private network Both servers transfer and connection on nginx port , and the server B used for streaming media , like youtube stream videos Both servers in same network , when i do ping from Server A to Server B i got high time latency above 1.0ms , the time range time=52.7 ms to time=215.7 ms - This is the output of iftop utility 353Mb 707Mb 1.04Gb 1.38Gb 1.73Gb mqqqqqqqqqqqqqqqqqqqqqqqqqqqvqqqqqqqqqqqqqqqqqqqqqqqqqqqvqqqqqqqqqqqqqqqqqqqqqqqqqqqvqqqqqqqqqqqqqqqqqqqqqqqqqqqvqqqqqqqqqqqqqqqqqqqqqqqqqqq server.example.com => ip.address 6.36Mb 4.31Mb 1.66Mb <= 158Kb 94.8Kb 35.1Kb server.example.com => ip.address 1.23Mb 4.28Mb 1.12Mb <= 17.1Kb 83.5Kb 21.9Kb server.example.com => ip.address 395Kb 3.89Mb 1.07Mb <= 6.09Kb 109Kb 28.6Kb server.example.com => ip.address 4.55Mb 3.83Mb 1.04Mb <= 55.6Kb 45.4Kb 13.0Kb server.example.com => ip.address 649Kb 3.38Mb 1.47Mb <= 9.00Kb 38.7Kb 16.7Kb server.example.com => ip.address 5.00Mb 3.32Mb 1.80Mb <= 65.7Kb 55.1Kb 29.4Kb server.example.com => ip.address 387Kb 3.13Mb 1.06Mb <= 18.4Kb 39.9Kb 15.0Kb server.example.com => ip.address 3.27Mb 3.11Mb 1.01Mb <= 81.2Kb 64.5Kb 20.9Kb server.example.com => ip.address 1.75Mb 3.08Mb 2.72Mb <= 16.6Kb 35.6Kb 32.5Kb server.example.com => ip.address 1.75Mb 2.90Mb 2.79Mb <= 22.4Kb 32.6Kb 35.6Kb server.example.com => ip.address 3.03Mb 2.78Mb 1.82Mb <= 26.6Kb 27.4Kb 20.2Kb server.example.com => ip.address 2.26Mb 2.66Mb 1.36Mb <= 51.7Kb 49.1Kb 24.4Kb server.example.com => ip.address 586Kb 2.50Mb 1.03Mb <= 4.17Kb 26.1Kb 10.7Kb server.example.com => ip.address 2.42Mb 2.49Mb 2.44Mb <= 31.6Kb 29.7Kb 29.9Kb server.example.com => ip.address 2.41Mb 2.46Mb 2.41Mb <= 26.4Kb 24.5Kb 23.8Kb server.example.com => ip.address 2.37Mb 2.39Mb 2.40Mb <= 28.9Kb 27.0Kb 28.5Kb server.example.com => ip.address 525Kb 2.20Mb 1.05Mb <= 7.03Kb 26.0Kb 12.8Kb qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq TX: cum: 102GB peak: 1.65Gb rates: 1.46Gb 1.44Gb 1.48Gb RX: 1.31GB 24.3Mb 19.5Mb 18.9Mb 20.0Mb TOTAL: 103GB 1.67Gb 1.48Gb 1.46Gb 1.50Gb I check the transfer speed using iperf utility From Server A to Server B # iperf -c 0.0.0.2 -p 8777 ------------------------------------------------------------ Client connecting to 0.0.0.2, TCP port 8777 TCP window size: 85.3 KByte (default) ------------------------------------------------------------ [ 3] local 0.0.0.1 port 38895 connected with 0.0.0.2 port 8777 [ ID] Interval Transfer Bandwidth [ 3] 0.0-10.8 sec 528 KBytes 399 Kbits/sec My Current Connections in Server B # netstat -an|grep ":8777"|awk '/tcp/ {print $6}'|sort -nr| uniq -c 2072 TIME_WAIT 28 SYN_RECV 1 LISTEN 189 LAST_ACK 139 FIN_WAIT2 373 FIN_WAIT1 3381 ESTABLISHED 34 CLOSING Server A Network Card Information Settings for eth0: Supported ports: [ TP ] Supported link modes: 100baseT/Full 1000baseT/Full 10000baseT/Full Supported pause frame use: No Supports auto-negotiation: Yes Advertised link modes: 10000baseT/Full Advertised pause frame use: No Advertised auto-negotiation: Yes Speed: 10000Mb/s Duplex: Full Port: Twisted Pair PHYAD: 0 Transceiver: external Auto-negotiation: on MDI-X: Unknown Supports Wake-on: d Wake-on: d Current message level: 0x00000007 (7) drv probe link Link detected: yes Server B Network Card Information Settings for eth2: Supported ports: [ FIBRE ] Supported link modes: 10000baseT/Full Supported pause frame use: No Supports auto-negotiation: No Advertised link modes: 10000baseT/Full Advertised pause frame use: No Advertised auto-negotiation: No Speed: 10000Mb/s Duplex: Full Port: Direct Attach Copper PHYAD: 0 Transceiver: external Auto-negotiation: off Supports Wake-on: d Wake-on: d Current message level: 0x00000007 (7) drv probe link Link detected: yes ifconfig server A eth0 Link encap:Ethernet HWaddr 00:25:90:ED:9E:AA inet addr:0.0.0.1 Bcast:0.0.0.255 Mask:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:1202795665 errors:0 dropped:64334 overruns:0 frame:0 TX packets:2313161968 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:893413096188 (832.0 GiB) TX bytes:3360949570454 (3.0 TiB) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:2207544 errors:0 dropped:0 overruns:0 frame:0 TX packets:2207544 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:247769175 (236.2 MiB) TX bytes:247769175 (236.2 MiB) ifconfig Server B eth2 Link encap:Ethernet HWaddr 00:25:90:82:C4:FE inet addr:0.0.0.2 Bcast:0.0.0.2 Mask:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:39973046980 errors:0 dropped:1828387600 overruns:0 frame:0 TX packets:69618752480 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:3013976063688 (2.7 TiB) TX bytes:102250230803933 (92.9 TiB) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:1049495 errors:0 dropped:0 overruns:0 frame:0 TX packets:1049495 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:129012422 (123.0 MiB) TX bytes:129012422 (123.0 MiB) Netstat -i on Server B # netstat -i Kernel Interface table Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg eth2 9000 0 42098629968 0 2131223717 0 73698797854 0 0 0 BMRU lo 65536 0 1077908 0 0 0 1077908 0 0 0 LRU I Turn up send/receive buffers on the network card to 2048 and problem still persist I increase the MTU for server A and problem still persist and i increase the MTU for server B for better connectivity and transfer speed but it couldn't transfer at all The problem is : as you can see from iperf utility, the transfer speed from server A to server B slow when i restart network service in server B the transfer in server A at full speed, after 2 minutes , it's getting slow How could i troubleshoot slow speed issue and fix it in server B ? Notice : if there any other commands i should execute in servers for more information, so it might help resolve the problem , let me know in comments

    Read the article

  • Talking JavaOne with Rock Star Raghavan Srinivas

    - by Janice J. Heiss
    Raghavan Srinivas, affectionately known as “Rags,” is a two-time JavaOne Rock Star (from 2005 and 2011) who, as a Developer Advocate at Couchbase, gets his hands dirty with emerging technology directions and trends. His general focus is on distributed systems, with a specialization in cloud computing. He worked on Hadoop and HBase during its early stages, has spoken at conferences world-wide on a variety of technical topics, conducted and organized Hands-on Labs and taught graduate classes.He has 20 years of hands-on software development and over 10 years of architecture and technology evangelism experience and has worked for Digital Equipment Corporation, Sun Microsystems, Intuit and Accenture. He has evangelized and influenced the architecture of numerous technologies including the early releases of JavaFX, Java, Java EE, Java and XML, Java ME, AJAX and Web 2.0, and Java Security.Rags will be giving these sessions at JavaOne 2012: CON3570 -- Autosharding Enterprise to Social Gaming Applications with NoSQL and Couchbase CON3257 -- Script Bowl 2012: The Battle of the JVM-Based Languages (with Guillaume Laforge, Aaron Bedra, Dick Wall, and Dr Nic Williams) Rags emphasized the importance of the Cloud: “The Cloud and the Big Data are popular technologies not merely because they are trendy, but, largely due to the fact that it's possible to do massive data mining and use that information for business advantage,” he explained. I asked him what we should know about Hadoop. “Hadoop,” he remarked, “is mainly about using commodity hardware and achieving unprecedented scalability. At the heart of all this is the Java Virtual Machine which is running on each of these nodes. The vision of taking the processing to where the data resides is made possible by Java and Hadoop.” And the most exciting thing happening in the world of Java today? “I read recently that Java projects on github.com are just off the charts when compared to other projects. It's exciting to realize the robust growth of Java and the degree of collaboration amongst Java programmers.” He encourages Java developers to take advantage of Java 7 for Mac OS X which is now available for download. At the same time, he also encourages us to read the caveats. Originally published on blogs.oracle.com/javaone.

    Read the article

  • Talking JavaOne with Rock Star Raghavan Srinivas

    - by Janice J. Heiss
    Raghavan Srinivas, affectionately known as “Rags,” is a two-time JavaOne Rock Star (from 2005 and 2011) who, as a Developer Advocate at Couchbase, gets his hands dirty with emerging technology directions and trends. His general focus is on distributed systems, with a specialization in cloud computing. He worked on Hadoop and HBase during its early stages, has spoken at conferences world-wide on a variety of technical topics, conducted and organized Hands-on Labs and taught graduate classes.He has 20 years of hands-on software development and over 10 years of architecture and technology evangelism experience and has worked for Digital Equipment Corporation, Sun Microsystems, Intuit and Accenture. He has evangelized and influenced the architecture of numerous technologies including the early releases of JavaFX, Java, Java EE, Java and XML, Java ME, AJAX and Web 2.0, and Java Security.Rags will be giving these sessions at JavaOne 2012: CON3570 -- Autosharding Enterprise to Social Gaming Applications with NoSQL and Couchbase CON3257 -- Script Bowl 2012: The Battle of the JVM-Based Languages (with Guillaume Laforge, Aaron Bedra, Dick Wall, and Dr Nic Williams) Rags emphasized the importance of the Cloud: “The Cloud and the Big Data are popular technologies not merely because they are trendy, but, largely due to the fact that it's possible to do massive data mining and use that information for business advantage,” he explained. I asked him what we should know about Hadoop. “Hadoop,” he remarked, “is mainly about using commodity hardware and achieving unprecedented scalability. At the heart of all this is the Java Virtual Machine which is running on each of these nodes. The vision of taking the processing to where the data resides is made possible by Java and Hadoop.” And the most exciting thing happening in the world of Java today? “I read recently that Java projects on github.com are just off the charts when compared to other projects. It's exciting to realize the robust growth of Java and the degree of collaboration amongst Java programmers.” He encourages Java developers to take advantage of Java 7 for Mac OS X which is now available for download. At the same time, he also encourages us to read the caveats.

    Read the article

  • Why Oracle Data Integrator for Big Data?

    - by Mala Narasimharajan
    Big Data is everywhere these days - but what exactly is it? It’s data that comes from a multitude of sources – not only structured data, but unstructured data as well.  The sheer volume of data is mindboggling – here are a few examples of big data: climate information collected from sensors, social media information, digital pictures, log files, online video files, medical records or online transaction records.  These are just a few examples of what constitutes big data.   Embedded in big data is tremendous value and being able to manipulate, load, transform and analyze big data is key to enhancing productivity and competitiveness.  The value of big data lies in its propensity for greater in-depth analysis and data segmentation -- in turn giving companies detailed information on product performance, customer preferences and inventory.  Furthermore, by being able to store and create more data in digital form, “big data can unlock significant value by making information transparent and usable at much higher frequency." (McKinsey Global Institute, May 2011) Oracle's flagship product for bulk data movement and transformation, Oracle Data Integrator, is a critical component of Oracle’s Big Data strategy. ODI provides automation, bulk loading, and validation and transformation capabilities for Big Data while minimizing the complexities of using Hadoop.  Specifically, the advantages of ODI in a Big Data scenario are due to pre-built Knowledge Modules that drive processing in Hadoop. This leverages the graphical UI to load and unload data from Hadoop, perform data validations and create mapping expressions for transformations.  The Knowledge Modules provide a key jump-start and eliminate a significant amount of Hadoop development.  Using Oracle Data Integrator together with Oracle Big Data Connectors, you can simplify the complexities of mapping, accessing, and loading big data (via NoSQL or HDFS) but also correlating your enterprise data – this correlation may require integrating across heterogeneous and standards-based environments, connecting to Oracle Exadata, or sourcing via a big data platform such as Oracle Big Data Appliance. To learn more about Oracle Data Integration and Big Data, download our resource kit to see the latest in whitepapers, webinars, downloads, and more… or go to our website on www.oracle.com/bigdata

    Read the article

  • Cascading S3 Sink Tap not being deleted with SinkMode.REPLACE

    - by Eric Charles
    We are running Cascading with a Sink Tap being configured to store in Amazon S3 and were facing some FileAlreadyExistsException (see [1]). This was only from time to time (1 time on around 100) and was not reproducable. Digging into the Cascading codem, we discovered the Hfs.deleteResource() is called (among others) by the BaseFlow.deleteSinksIfNotUpdate(). Btw, we were quite intrigued with the silent NPE (with comment "hack to get around npe thrown when fs reaches root directory"). From there, we extended the Hfs tap with our own Tap to add more action in the deleteResource() method (see [2]) with a retry mechanism calling directly the getFileSystem(conf).delete. The retry mechanism seemed to bring improvement, but we are still sometimes facing failures (see example in [3]): it sounds like HDFS returns isDeleted=true, but asking directly after if the folder exists, we receive exists=true, which should not happen. Logs also shows randomly isDeleted true or false when the flow succeeds, which sounds like the returned value is irrelevant or not to be trusted. Can anybody bring his own S3 experience with such a behavior: "folder should be deleted, but it is not"? We suspect a S3 issue, but could it also be in Cascading or HDFS? We run on Hadoop Cloudera-cdh3u5 and Cascading 2.0.1-wip-dev. [1] org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory s3n://... already exists at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:132) at com.twitter.elephantbird.mapred.output.DeprecatedOutputFormatWrapper.checkOutputSpecs(DeprecatedOutputFormatWrapper.java:75) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:923) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:882) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:882) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:856) at cascading.flow.hadoop.planner.HadoopFlowStepJob.internalNonBlockingStart(HadoopFlowStepJob.java:104) at cascading.flow.planner.FlowStepJob.blockOnJob(FlowStepJob.java:174) at cascading.flow.planner.FlowStepJob.start(FlowStepJob.java:137) at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:122) at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:42) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.j [2] @Override public boolean deleteResource(JobConf conf) throws IOException { LOGGER.info("Deleting resource {}", getIdentifier()); boolean isDeleted = super.deleteResource(conf); LOGGER.info("Hfs Sink Tap isDeleted is {} for {}", isDeleted, getIdentifier()); Path path = new Path(getIdentifier()); int retryCount = 0; int cumulativeSleepTime = 0; int sleepTime = 1000; while (getFileSystem(conf).exists(path)) { LOGGER .info( "Resource {} still exists, it should not... - I will continue to wait patiently...", getIdentifier()); try { LOGGER.info("Now I will sleep " + sleepTime / 1000 + " seconds while trying to delete {} - attempt: {}", getIdentifier(), retryCount + 1); Thread.sleep(sleepTime); cumulativeSleepTime += sleepTime; sleepTime *= 2; } catch (InterruptedException e) { e.printStackTrace(); LOGGER .error( "Interrupted while sleeping trying to delete {} with message {}...", getIdentifier(), e.getMessage()); throw new RuntimeException(e); } if (retryCount == 0) { getFileSystem(conf).delete(getPath(), true); } retryCount++; if (cumulativeSleepTime > MAXIMUM_TIME_TO_WAIT_TO_DELETE_MS) { break; } } if (getFileSystem(conf).exists(path)) { LOGGER .error( "We didn't succeed to delete the resource {}. Throwing now a runtime exception.", getIdentifier()); throw new RuntimeException( "Although we waited to delete the resource for " + getIdentifier() + ' ' + retryCount + " iterations, it still exists - This must be an issue in the underlying storage system."); } return isDeleted; } [3] INFO [pool-2-thread-15] (BaseFlow.java:1287) - [...] at least one sink is marked for delete INFO [pool-2-thread-15] (BaseFlow.java:1287) - [...] sink oldest modified date: Wed Dec 31 23:59:59 UTC 1969 INFO [pool-2-thread-15] (HiveSinkTap.java:148) - Now I will sleep 1 seconds while trying to delete s3n://... - attempt: 1 INFO [pool-2-thread-15] (HiveSinkTap.java:130) - Deleting resource s3n://... INFO [pool-2-thread-15] (HiveSinkTap.java:133) - Hfs Sink Tap isDeleted is true for s3n://... ERROR [pool-2-thread-15] (HiveSinkTap.java:175) - We didn't succeed to delete the resource s3n://... Throwing now a runtime exception. WARN [pool-2-thread-15] (Cascade.java:706) - [...] flow failed: ... java.lang.RuntimeException: Although we waited to delete the resource for s3n://... 0 iterations, it still exists - This must be an issue in the underlying storage system. at com.qubit.hive.tap.HiveSinkTap.deleteResource(HiveSinkTap.java:179) at com.qubit.hive.tap.HiveSinkTap.deleteResource(HiveSinkTap.java:40) at cascading.flow.BaseFlow.deleteSinksIfNotUpdate(BaseFlow.java:971) at cascading.flow.BaseFlow.prepare(BaseFlow.java:733) at cascading.cascade.Cascade$CascadeJob.call(Cascade.java:761) at cascading.cascade.Cascade$CascadeJob.call(Cascade.java:710) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619)

    Read the article

  • fast forward/streaming in html5 video? RTSP?

    - by karpodiem
    right now I've got a few .mp4's hosted on Amazon S3. I know that S3 has support for RTMP, which is useful for streaming Flash. I'd like to accomplish something similar with html5 video; my biggest issue is that I need the ability to seek (fast forward) to a particular part of the video. Right now when I query the video, it loads the entire video before playing, which is a waste of bandwidth/dealbreaker. In what manner could this be implemented? Is this even possible? Looks like RTSP would be a good bet, but I haven't found whether anyone has rolled this out successfully.

    Read the article

  • How to get netstream bytesLoaded and bytesTotal from streaming .mp4?

    - by Amy
    I have a flex 3 app that uses netstream and a video object to stream .mp4 movies. I want to use the bytesLoaded and bytesTotal properties of the netstream to display the buffering information. I would also like to get any information about the number of frames that are dropped if possible. When I've tested on .flv I'm able to get the information without a problem, but it doesn't seem to work on .mp4. Is it possible to get this information streaming .mp4? Is there some configuration that I'm missing to make things work the same for .mp4 as .flv? Thanks!

    Read the article

  • Vbscript / webcam. using flash API - Save streaming file as image

    - by remi
    Hi. Based on a php app we've tried to developp our own webapp in vbscript. All is working well. The camera starts, etc.. The problem we're left with is the following: the flash API is streaming the content of the video to a webpage, this page saves the photo as a jpg. The PHP code we're trying to emulate is as follow $str = file_get_contents("php://input"); file_put_contents("/tmp/upload.jpg", pack("H*", $str)); After intensive googling the best we can come up with is Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Dim sImagePath As String = Server.MapPath("/registration/") & "test.jpg" Dim data As Byte() = Request.BinaryRead(Request.TotalBytes) Dim Ret As Array Dim imgbmp As New System.Drawing.Bitmap("test.jpg") Dim ms As MemoryStream = New MemoryStream(data) imgbmp.Save(ms, System.Drawing.Imaging.ImageFormat.Jpeg) Ret = ms.ToArray() Dim fs As New FileStream(sImagePath, FileMode.Create, FileAccess.Write) fs.Write(Ret, 0, Ret.Length) fs.Flush() fs.Close() End Sub which is not working, does anybody have any suggestion?

    Read the article

  • How to send all previously generated data to new Flex client when using BlazeDS streaming?

    - by Sergey
    Hello. I have a Flex client and Java back-end connected with each other via BlazeDS. I use streaming for real-time data push from server to client. Everything works fine, but I want to add the following functionality: when the new client opens Flex application, he must receive all previously generated data instead of just subscribing to data updates. It seems like I have to save this data at the web server - that's pretty straightforward, but how can I push this data to new client when he connects? Should I use custom ServiceAdapter or there is a better way to achieve this purpose?

    Read the article

  • Python how to execute generate code ?

    - by Natim
    Hello guys I have this code, and I would like to use the app parameter to generate the code instead of duplicating it. if app == 'map': try: from modulo.map.views import map return map(request, *args, **kwargs) except ImportError: pass elif app == 'schedule': try: from modulo.schedule.views import schedule_day return schedule_day(request, *args, **kwargs) except ImportError: pass elif app == 'sponsors': try: from modulo.sponsors.views import sponsors return sponsors(request, *args, **kwargs) except ImportError: pass elif app == 'streaming': try: from modulo.streaming.views import streaming return streaming(request, *args, **kwargs) except ImportError: pass Do you have any idea ? Thanks

    Read the article

  • Python: how to execute generated code ?

    - by Natim
    Hello guys I have this code, and I would like to use the app parameter to generate the code instead of duplicating it. if app == 'map': try: from modulo.map.views import map return map(request, *args, **kwargs) except ImportError: pass elif app == 'schedule': try: from modulo.schedule.views import schedule_day return schedule_day(request, *args, **kwargs) except ImportError: pass elif app == 'sponsors': try: from modulo.sponsors.views import sponsors return sponsors(request, *args, **kwargs) except ImportError: pass elif app == 'streaming': try: from modulo.streaming.views import streaming return streaming(request, *args, **kwargs) except ImportError: pass Do you have any idea ? Thanks

    Read the article

  • Oracle R Enterprise Tutorial Series on Oracle Learning Library

    - by mhornick
    Oracle Server Technologies Curriculum has just released the Oracle R Enterprise Tutorial Series, which is publicly available on Oracle Learning Library (OLL). This 8 part interactive lecture series with review sessions covers Oracle R Enterprise 1.1 and an introduction to Oracle R Connector for Hadoop 1.1: Introducing Oracle R Enterprise Getting Started with ORE R Language Basics Producing Graphs in R The ORE Transparency Layer ORE Embedded R Scripts: R Interface ORE Embedded R Scripts: SQL Interface Using the Oracle R Connector for Hadoop We encourage you to download Oracle software for evaluation from the Oracle Technology Network. See these links for R-related software: Oracle R Distribution, Oracle R Enterprise, ROracle, Oracle R Connector for Hadoop.  As always, we welcome comments and questions on the Oracle R Forum.

    Read the article

  • Big Data – Buzz Words: What is HDFS – Day 8 of 21

    - by Pinal Dave
    In yesterday’s blog post we learned what is MapReduce. In this article we will take a quick look at one of the four most important buzz words which goes around Big Data – HDFS. What is HDFS ? HDFS stands for Hadoop Distributed File System and it is a primary storage system used by Hadoop. It provides high performance access to data across Hadoop clusters. It is usually deployed on low-cost commodity hardware. In commodity hardware deployment server failures are very common. Due to the same reason HDFS is built to have high fault tolerance. The data transfer rate between compute nodes in HDFS is very high, which leads to reduced risk of failure. HDFS creates smaller pieces of the big data and distributes it on different nodes. It also copies each smaller piece to multiple times on different nodes. Hence when any node with the data crashes the system is automatically able to use the data from a different node and continue the process. This is the key feature of the HDFS system. Architecture of HDFS The architecture of the HDFS is master/slave architecture. An HDFS cluster always consists of single NameNode. This single NameNode is a master server and it manages the file system as well regulates access to various files. In additional to NameNode there are multiple DataNodes. There is always one DataNode for each data server. In HDFS a big file is split into one or more blocks and those blocks are stored in a set of DataNodes. The primary task of the NameNode is to open, close or rename files and directory and regulate access to the file system, whereas the primary task of the DataNode is read and write to the file systems. DataNode is also responsible for the creation, deletion or replication of the data based on the instruction from NameNode. In reality, NameNode and DataNode are software designed to run on commodity machine build in Java language. Visual Representation of HDFS Architecture Let us understand how HDFS works with the help of the diagram. Client APP or HDFS Client connects to NameSpace as well as DataNode. Client App access to the DataNode is regulated by NameSpace Node. NameSpace Node allows Client App to connect to the DataNode based by allowing the connection to the DataNode directly. A big data file is divided into multiple data blocks (let us assume that those data chunks are A,B,C and D. Client App will later on write data blocks directly to the DataNode. Client App does not have to directly write to all the node. It just has to write to any one of the node and NameNode will decide on which other DataNode it will have to replicate the data. In our example Client App directly writes to DataNode 1 and detained 3. However, data chunks are automatically replicated to other nodes. All the information like in which DataNode which data block is placed is written back to NameNode. High Availability During Disaster Now as multiple DataNode have same data blocks in the case of any DataNode which faces the disaster, the entire process will continue as other DataNode will assume the role to serve the specific data block which was on the failed node. This system provides very high tolerance to disaster and provides high availability. If you notice there is only single NameNode in our architecture. If that node fails our entire Hadoop Application will stop performing as it is a single node where we store all the metadata. As this node is very critical, it is usually replicated on another clustered as well as on another data rack. Though, that replicated node is not operational in architecture, it has all the necessary data to perform the task of the NameNode in the case of the NameNode fails. The entire Hadoop architecture is built to function smoothly even there are node failures or hardware malfunction. It is built on the simple concept that data is so big it is impossible to have come up with a single piece of the hardware which can manage it properly. We need lots of commodity (cheap) hardware to manage our big data and hardware failure is part of the commodity servers. To reduce the impact of hardware failure Hadoop architecture is built to overcome the limitation of the non-functioning hardware. Tomorrow In tomorrow’s blog post we will discuss the importance of the relational database in Big Data. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • How to call Twiter's Streaming/Filter Feed with urllib2/httplib?

    - by Simon
    Update: I switched this back from answered as I tried the solution posed in cogent Nick's answer and switched to Google's urlfetch: logging.debug("starting urlfetch for http://%s%s" % (self.host, self.url)) result = urlfetch.fetch("http://%s%s" % (self.host, self.url), payload=self.body, method="POST", headers=self.headers, allow_truncated=True, deadline=5) logging.debug("finished urlfetch") but unfortunately finished urlfetch is never printed - I see the timeout happen in the logs (it returns 200 after 5 seconds), but execution doesn't seem tor return. Hi All- I'm attempting to play around with Twitter's Streaming (aka firehose) API with Google App Engine (I'm aware this probably isn't a great long term play as you can't keep the connection perpetually open with GAE), but so far I haven't had any luck getting my program to actually parse the results returned by Twitter. Some code: logging.debug("firing up urllib2") req = urllib2.Request(url="http://%s%s" % (self.host, self.url), data=self.body, headers=self.headers) logging.debug("called urlopen for %s %s, about to call urlopen" % (self.host, self.url)) fobj = urllib2.urlopen(req) logging.debug("called urlopen") When this executes, unfortunately, my debug output never shows the called urlopen line printed. I suspect what's happening is that Twitter keeps the connection open and urllib2 doesn't return because the server doesn't terminate the connection. Wireshark shows the request being sent properly and a response returned with results. I tried adding Connection: close to my request header, but that didn't yield a successful result. Any ideas on how to get this to work? thanks -Simon

    Read the article

  • Streaming binary data to WCF rest service gives Bad Request (400) when content length is greater than 64k

    - by Mikey Cee
    I have a WCF service that takes a stream: [ServiceContract] public class UploadService : BaseService { [OperationContract] [WebInvoke(BodyStyle=WebMessageBodyStyle.Bare, Method=WebRequestMethods.Http.Post)] public void Upload(Stream data) { // etc. } } This method is to allow my Silverlight application to upload large binary files, the easiest way being to craft the HTTP request by hand from the client. Here is the code in the Silverlight client that does this: const int contentLength = 64 * 1024; // 64 Kb var request = (HttpWebRequest)WebRequest.Create("http://localhost:8732/UploadService/"); request.AllowWriteStreamBuffering = false; request.Method = WebRequestMethods.Http.Post; request.ContentType = "application/octet-stream"; request.ContentLength = contentLength; using (var outputStream = request.GetRequestStream()) { outputStream.Write(new byte[contentLength], 0, contentLength); outputStream.Flush(); using (var response = request.GetResponse()); } Now, in the case above, where I am streaming 64 kB of data (or less), this works OK and if I set a breakpoint in my WCF method, and I can examine the stream and see 64 kB worth of zeros - yay! The problem arises if I send anything more than 64 kB of data, for instance by changing the first line of my client code to the following: const int contentLength = 64 * 1024 + 1; // 64 kB + 1 B This now throws an exception when I call request.GetResponse(): The remote server returned an error: (400) Bad Request. In my WCF configuration I have set maxReceivedMessageSize, maxBufferSize and maxBufferPoolSize to 2147483647, but to no avail. Here are the relevant sections from my service's app.config: <service name="UploadService"> <endpoint address="" binding="webHttpBinding" bindingName="StreamedRequestWebBinding" contract="UploadService" behaviorConfiguration="webBehavior"> <identity> <dns value="localhost" /> </identity> </endpoint> <host> <baseAddresses> <add baseAddress="http://localhost:8732/UploadService/" /> </baseAddresses> </host> </service> <bindings> <webHttpBinding> <binding name="StreamedRequestWebBinding" bypassProxyOnLocal="true" useDefaultWebProxy="false" hostNameComparisonMode="WeakWildcard" sendTimeout="00:05:00" openTimeout="00:05:00" receiveTimeout="00:05:00" maxReceivedMessageSize="2147483647" maxBufferSize="2147483647" maxBufferPoolSize="2147483647" transferMode="StreamedRequest"> <readerQuotas maxArrayLength="2147483647" maxStringContentLength="2147483647" /> </binding> </webHttpBinding> </bindings> <behaviors> <endpointBehaviors> <behavior name="webBehavior"> <webHttp /> </behavior> <endpointBehaviors> </behaviors> How do I make my service accept more than 64 kB of streamed post data?

    Read the article

  • Big Data – Data Mining with Hive – What is Hive? – What is HiveQL (HQL)? – Day 15 of 21

    - by Pinal Dave
    In yesterday’s blog post we learned the importance of the operational database in Big Data Story. In this article we will understand what is Hive and HQL in Big Data Story. Yahoo started working on PIG (we will understand that in the next blog post) for their application deployment on Hadoop. The goal of Yahoo to manage their unstructured data. Similarly Facebook started deploying their warehouse solutions on Hadoop which has resulted in HIVE. The reason for going with HIVE is because the traditional warehousing solutions are getting very expensive. What is HIVE? Hive is a datawarehouseing infrastructure for Hadoop. The primary responsibility is to provide data summarization, query and analysis. It  supports analysis of large datasets stored in Hadoop’s HDFS as well as on the Amazon S3 filesystem. The best part of HIVE is that it supports SQL-Like access to structured data which is known as HiveQL (or HQL) as well as big data analysis with the help of MapReduce. Hive is not built to get a quick response to queries but it it is built for data mining applications. Data mining applications can take from several minutes to several hours to analysis the data and HIVE is primarily used there. HIVE Organization The data are organized in three different formats in HIVE. Tables: They are very similar to RDBMS tables and contains rows and tables. Hive is just layered over the Hadoop File System (HDFS), hence tables are directly mapped to directories of the filesystems. It also supports tables stored in other native file systems. Partitions: Hive tables can have more than one partition. They are mapped to subdirectories and file systems as well. Buckets: In Hive data may be divided into buckets. Buckets are stored as files in partition in the underlying file system. Hive also has metastore which stores all the metadata. It is a relational database containing various information related to Hive Schema (column types, owners, key-value data, statistics etc.). We can use MySQL database over here. What is HiveSQL (HQL)? Hive query language provides the basic SQL like operations. Here are few of the tasks which HQL can do easily. Create and manage tables and partitions Support various Relational, Arithmetic and Logical Operators Evaluate functions Download the contents of a table to a local directory or result of queries to HDFS directory Here is the example of the HQL Query: SELECT upper(name), salesprice FROM sales; SELECT category, count(1) FROM products GROUP BY category; When you look at the above query, you can see they are very similar to SQL like queries. Tomorrow In tomorrow’s blog post we will discuss about very important components of the Big Data Ecosystem – Pig. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • How to write a Media Center plugin like the Netflix plugin? Source code/reference samples?

    - by Vin
    I am looking to write a Windows Media Center plugin just like the Netflix WMC plugin. Once logged in, I know the streaming urls that I need to hook in to. Any source code, reference samples would be great. Found one on codeplex for swedish TV channels, but right now it's not working for some reason... Previously asked the following question, with no answers, so updated with a question asked in a easy to relate fashion Host a streaming video in my client, from a streaming url that is behind a login session? I am building a Silverlight 4 desktop client to show streaming video from a site that is login based. So that website has a Silverlight player that does streaming video, the player is behind a login sesion, so just by getting the url from fiddler and trying to play it in my Silverlight 4 desktop client won't work. Actually after that, I want to build a Windows Media Center plugin to build a Netflix-like client, that allows login through WMC and then allows you to watch streaming video. Any pointers on how to go about doing any of this?

    Read the article

  • Big Data Matters with ODI12c

    - by Madhu Nair
    contributed by Mike Eisterer On October 17th, 2013, Oracle announced the release of Oracle Data Integrator 12c (ODI12c).  This release signifies improvements to Oracle’s Data Integration portfolio of solutions, particularly Big Data integration. Why Big Data = Big Business Organizations are gaining greater insights and actionability through increased storage, processing and analytical benefits offered by Big Data solutions.  New technologies and frameworks like HDFS, NoSQL, Hive and MapReduce support these benefits now. As further data is collected, analytical requirements increase and the complexity of managing transformations and aggregations of data compounds and organizations are in need for scalable Data Integration solutions. ODI12c provides enterprise solutions for the movement, translation and transformation of information and data heterogeneously and in Big Data Environments through: The ability for existing ODI and SQL developers to leverage new Big Data technologies. A metadata focused approach for cataloging, defining and reusing Big Data technologies, mappings and process executions. Integration between many heterogeneous environments and technologies such as HDFS and Hive. Generation of Hive Query Language. Working with Big Data using Knowledge Modules  ODI12c provides developers with the ability to define sources and targets and visually develop mappings to effect the movement and transformation of data.  As the mappings are created, ODI12c leverages a rich library of prebuilt integrations, known as Knowledge Modules (KMs).  These KMs are contextual to the technologies and platforms to be integrated.  Steps and actions needed to manage the data integration are pre-built and configured within the KMs.  The Oracle Data Integrator Application Adapter for Hadoop provides a series of KMs, specifically designed to integrate with Big Data Technologies.  The Big Data KMs include: Check Knowledge Module Reverse Engineer Knowledge Module Hive Transform Knowledge Module Hive Control Append Knowledge Module File to Hive (LOAD DATA) Knowledge Module File-Hive to Oracle (OLH-OSCH) Knowledge Module  Nothing to beat an Example: To demonstrate the use of the KMs which are part of the ODI Application Adapter for Hadoop, a mapping may be defined to move data between files and Hive targets.  The mapping is defined by dragging the source and target into the mapping, performing the attribute (column) mapping (see Figure 1) and then selecting the KM which will govern the process.  In this mapping example, movie data is being moved from an HDFS source into a Hive table.  Some of the attributes, such as “CUSTID to custid”, have been mapped over. Figure 1  Defining the Mapping Before the proper KM can be assigned to define the technology for the mapping, it needs to be added to the ODI project.  The Big Data KMs have been made available to the project through the KM import process.   Generally, this is done prior to defining the mapping. Figure 2  Importing the Big Data Knowledge Modules Following the import, the KMs are available in the Designer Navigator. v\:* {behavior:url(#default#VML);} o\:* {behavior:url(#default#VML);} w\:* {behavior:url(#default#VML);} .shape {behavior:url(#default#VML);} Normal 0 false false false EN-US ZH-TW X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} Figure 3  The Project View in Designer, Showing Installed IKMs Once the KM is imported, it may be assigned to the mapping target.  This is done by selecting the Physical View of the mapping and examining the Properties of the Target.  In this case MOVIAPP_LOG_STAGE is the target of our mapping. Figure 4  Physical View of the Mapping and Assigning the Big Data Knowledge Module to the Target Alternative KMs may have been selected as well, providing flexibility and abstracting the logical mapping from the physical implementation.  Our mapping may be applied to other technologies as well. The mapping is now complete and is ready to run.  We will see more in a future blog about running a mapping to load Hive. To complete the quick ODI for Big Data Overview, let us take a closer look at what the IKM File to Hive is doing for us.  ODI provides differentiated capabilities by defining the process and steps which normally would have to be manually developed, tested and implemented into the KM.  As shown in figure 5, the KM is preparing the Hive session, managing the Hive tables, performing the initial load from HDFS and then performing the insert into Hive.  HDFS and Hive options are selected graphically, as shown in the properties in Figure 4. Figure 5  Process and Steps Managed by the KM What’s Next Big Data being the shape shifting business challenge it is is fast evolving into the deciding factor between market leaders and others. Now that an introduction to ODI and Big Data has been provided, look for additional blogs coming soon using the Knowledge Modules which make up the Oracle Data Integrator Application Adapter for Hadoop: Importing Big Data Metadata into ODI, Testing Data Stores and Loading Hive Targets Generating Transformations using Hive Query language Loading Oracle from Hadoop Sources For more information now, please visit the Oracle Data Integrator Application Adapter for Hadoop web site, http://www.oracle.com/us/products/middleware/data-integration/hadoop/overview/index.html Do not forget to tune in to the ODI12c Executive Launch webcast on the 12th to hear more about ODI12c and GG12c. Normal 0 false false false EN-US ZH-TW X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";}

    Read the article

< Previous Page | 39 40 41 42 43 44 45 46 47 48 49 50  | Next Page >