Search Results

Search found 2088 results on 84 pages for 'jobs'.

Page 25/84 | < Previous Page | 21 22 23 24 25 26 27 28 29 30 31 32  | Next Page >

  • SQL Agent Job - to execute as queue

    - by BINEESHTHOMAS
    I have a job which is calling 10 other jobs using sp_start_job. The job is having 10 steps, each step calling each sub jobs, When i execute the main job, i can see it started with step 1 and in a few secods it shows 'finished successfully' But the jobs take long time time, and when i see the log mechanism i have put inside , it shows the all the 10 steps are running simultaniously at the back, till it finishes after few hours. My requirement is, it should finish step 1 first and then only step2 should start. aNY HELP PLS ?

    Read the article

  • How does 37signals job preview functionality work?

    - by slythic
    Hi all, I'm interested in getting a preview functionality working similar to how the 37signals job site does: http://jobs.37signals.com. Below are some screen shots of how it works. Step 1. Create your ad http://cl.ly/dfc4761b015c7f43c8ab (URL /jobs/new) Step 2. Preview your ad http://cl.ly/9c4b4041cfea83d8569e (URL /jobs/new/preview) Step 3. Publish your ad http://cl.ly/a58284d90fd380d2c26b (URL /listings/new/purchase?token=5198) So assuming you have Post model where Step 1 usually takes place in the new/create view/actions, how should one continue to Step 2 Preview and then after previewing, proceeding to the Step 3 publishing the post/ad? Do they actually save the ad/post in the database before continuing to Step 2 (Preview) but set a flag (like a boolean field called preview set to true)? It looks like they set a token paramater but I'm not sure what it's used for) I'm interested in this because it seems to go against the CRUD/REST and I thought it would be good to know how it worked. Thanks!

    Read the article

  • Filter a List via Another

    - by user1166905
    I have a requirement to filter a list of Clients based on if they haven't had any jobs booked in the last x months. In my code I have two lists, one is my Clients and the other is a filtered List of Jobs between today and x months ago and the idea is to filter Clients based on their id not appearing in the jobs list. I tried the following: filteredClients.Where(n => jobsToSearch.Count(j => j.Client == n.ClientID) == 0).ToList(); But I seem to get ALL clients regardless. I can easily do a foreach but this severly slows down the process. How can I filter the client list based on the job list effectively?

    Read the article

  • How to kill all asynchronous processes

    - by Arko
    Suppose we have a BASH script running some commands in the background. At some time we want to kill all of them, whether they have finished their job or not. Here's an example: function command_doing_nothing () { sleep 10 echo "I'm done" } for (( i = 0; i < 3; i++ )); do command_doing_nothing & done echo "Jobs:" jobs sleep 1 # Now we want to kill them How to kill those 3 jobs running in the background?

    Read the article

  • What is corresponding Cron expression to fire in every X seconds, where X > 60?

    - by giolekva
    I want my jobs to execute in every X seconds, there's one to one matching between job and X. Also during runtime there can be registered new jobs with their own intervals. I've tried to write cron expression for such scenarios, but in documentation there's written that value of seconds can't be more than 69. So cron expression like this: "0/63 * * * * ?" isn't valid. At first sight solution of that problem seemed to be expression like this: "0/3 0/1 * * * ?", but it means completely different thing: trigger job in every three second of every minute. Can you suggest what is the right solution (cron expression) for that? I know I could use just simple timers, but I've to use cron jobs using Quartz.

    Read the article

  • PowerShell Script to Deploy Multiple VM on Azure in Parallel #azure #powershell

    - by Marco Russo (SQLBI)
    This blog is usually dedicated to Business Intelligence and SQL Server, but I didn’t found easily on the web simple PowerShell scripts to help me deploying a number of virtual machines on Azure that I use for testing and development. Since I need to deploy, start, stop and remove many virtual machines created from a common image I created (you know, Tabular is not part of the standard images provided by Microsoft…), I wanted to minimize the time required to execute every operation from my Windows Azure PowerShell console (but I suggest you using Windows PowerShell ISE), so I also wanted to fire the commands as soon as possible in parallel, without losing the result in the console. In order to execute multiple commands in parallel, I used the Start-Job cmdlet, and using Get-Job and Receive-Job I wait for job completion and display the messages generated during background command execution. This technique allows me to reduce execution time when I have to deploy, start, stop or remove virtual machines. Please note that a few operations on Azure acquire an exclusive lock and cannot be really executed in parallel, but only one part of their execution time is subject to this lock. Thus, you obtain a better response time also in these scenarios (this is the case of the provisioning of a new VM). Finally, when you remove the VMs you still have the disk containing the virtual machine to remove. This cannot be done just after the VM removal, because you have to wait that the removal operation is completed on Azure. So I wrote a script that you have to run a few minutes after VMs removal and delete disks (and VHD) no longer related to a VM. I just check that the disk were associated to the original image name used to provision the VMs (so I don’t remove other disks deployed by other batches that I might want to preserve). These examples are specific for my scenario, if you need more complex configurations you have to change and adapt the code. But if your need is to create multiple instances of the same VM running in a workgroup, these scripts should be good enough. I prepared the following PowerShell scripts: ProvisionVMs: Provision many VMs in parallel starting from the same image. It creates one service for each VM. RemoveVMs: Remove all the VMs in parallel – it also remove the service created for the VM StartVMs: Starts all the VMs in parallel StopVMs: Stops all the VMs in parallel RemoveOrphanDisks: Remove all the disks no longer used by any VMs. Run this script a few minutes after RemoveVMs script. ProvisionVMs # Name of subscription $SubscriptionName = "Copy the SubscriptionName property you get from Get-AzureSubscription"   # Name of storage account (where VMs will be deployed) $StorageAccount = "Copy the Label property you get from Get-AzureStorageAccount"   function ProvisionVM( [string]$VmName ) {     Start-Job -ArgumentList $VmName {         param($VmName) $Location = "Copy the Location property you get from Get-AzureStorageAccount" $InstanceSize = "A5" # You can use any other instance, such as Large, A6, and so on $AdminUsername = "UserName" # Write the name of the administrator account in the new VM $Password = "Password"      # Write the password of the administrator account in the new VM $Image = "Copy the ImageName property you get from Get-AzureVMImage" # You can list your own images using the following command: # Get-AzureVMImage | Where-Object {$_.PublisherName -eq "User" }         New-AzureVMConfig -Name $VmName -ImageName $Image -InstanceSize $InstanceSize |             Add-AzureProvisioningConfig -Windows -Password $Password -AdminUsername $AdminUsername|             New-AzureVM -Location $Location -ServiceName "$VmName" -Verbose     } }   # Set the proper storage - you might remove this line if you have only one storage in the subscription Set-AzureSubscription -SubscriptionName $SubscriptionName -CurrentStorageAccount $StorageAccount   # Select the subscription - this line is fundamental if you have access to multiple subscription # You might remove this line if you have only one subscription Select-AzureSubscription -SubscriptionName $SubscriptionName   # Every line in the following list provisions one VM using the name specified in the argument # You can change the number of lines - use a unique name for every VM - don't reuse names # already used in other VMs already deployed ProvisionVM "test10" ProvisionVM "test11" ProvisionVM "test12" ProvisionVM "test13" ProvisionVM "test14" ProvisionVM "test15" ProvisionVM "test16" ProvisionVM "test17" ProvisionVM "test18" ProvisionVM "test19" ProvisionVM "test20"   # Wait for all to complete While (Get-Job -State "Running") {     Get-Job -State "Completed" | Receive-Job     Start-Sleep 1 }   # Display output from all jobs Get-Job | Receive-Job   # Cleanup of jobs Remove-Job *   # Displays batch completed echo "Provisioning VM Completed" RemoveVMs # Name of subscription $SubscriptionName = "Copy the SubscriptionName property you get from Get-AzureSubscription"   function RemoveVM( [string]$VmName ) {     Start-Job -ArgumentList $VmName {         param($VmName)         Remove-AzureService -ServiceName $VmName -Force -Verbose     } }   # Select the subscription - this line is fundamental if you have access to multiple subscription # You might remove this line if you have only one subscription Select-AzureSubscription -SubscriptionName $SubscriptionName   # Every line in the following list remove one VM using the name specified in the argument # You can change the number of lines - use a unique name for every VM - don't reuse names # already used in other VMs already deployed RemoveVM "test10" RemoveVM "test11" RemoveVM "test12" RemoveVM "test13" RemoveVM "test14" RemoveVM "test15" RemoveVM "test16" RemoveVM "test17" RemoveVM "test18" RemoveVM "test19" RemoveVM "test20"   # Wait for all to complete While (Get-Job -State "Running") {     Get-Job -State "Completed" | Receive-Job     Start-Sleep 1 }   # Display output from all jobs Get-Job | Receive-Job   # Cleanup Remove-Job *   # Displays batch completed echo "Remove VM Completed" StartVMs # Name of subscription $SubscriptionName = "Copy the SubscriptionName property you get from Get-AzureSubscription"   function StartVM( [string]$VmName ) {     Start-Job -ArgumentList $VmName {         param($VmName)         Start-AzureVM -Name $VmName -ServiceName $VmName -Verbose     } }   # Select the subscription - this line is fundamental if you have access to multiple subscription # You might remove this line if you have only one subscription Select-AzureSubscription -SubscriptionName $SubscriptionName   # Every line in the following list starts one VM using the name specified in the argument # You can change the number of lines - use a unique name for every VM - don't reuse names # already used in other VMs already deployed StartVM "test10" StartVM "test11" StartVM "test11" StartVM "test12" StartVM "test13" StartVM "test14" StartVM "test15" StartVM "test16" StartVM "test17" StartVM "test18" StartVM "test19" StartVM "test20"   # Wait for all to complete While (Get-Job -State "Running") {     Get-Job -State "Completed" | Receive-Job     Start-Sleep 1 }   # Display output from all jobs Get-Job | Receive-Job   # Cleanup Remove-Job *   # Displays batch completed echo "Start VM Completed"   StopVMs # Name of subscription $SubscriptionName = "Copy the SubscriptionName property you get from Get-AzureSubscription"   function StopVM( [string]$VmName ) {     Start-Job -ArgumentList $VmName {         param($VmName)         Stop-AzureVM -Name $VmName -ServiceName $VmName -Verbose -Force     } }   # Select the subscription - this line is fundamental if you have access to multiple subscription # You might remove this line if you have only one subscription Select-AzureSubscription -SubscriptionName $SubscriptionName   # Every line in the following list stops one VM using the name specified in the argument # You can change the number of lines - use a unique name for every VM - don't reuse names # already used in other VMs already deployed StopVM "test10" StopVM "test11" StopVM "test12" StopVM "test13" StopVM "test14" StopVM "test15" StopVM "test16" StopVM "test17" StopVM "test18" StopVM "test19" StopVM "test20"   # Wait for all to complete While (Get-Job -State "Running") {     Get-Job -State "Completed" | Receive-Job     Start-Sleep 1 }   # Display output from all jobs Get-Job | Receive-Job   # Cleanup Remove-Job *   # Displays batch completed echo "Stop VM Completed" RemoveOrphanDisks $Image = "Copy the ImageName property you get from Get-AzureVMImage" # You can list your own images using the following command: # Get-AzureVMImage | Where-Object {$_.PublisherName -eq "User" }   # Remove all orphan disks coming from the image specified in $ImageName Get-AzureDisk |     Where-Object {$_.attachedto -eq $null -and $_.SourceImageName -eq $ImageName} |     Remove-AzureDisk -DeleteVHD -Verbose  

    Read the article

  • Compiling and installing UFRII driver for Canon IR2520 on a headless Ubuntu 12.04 Server

    - by nixnotwin
    I want to setup a headless Ubuntu 12.04 machine as a print server. The printer is Canon IR2520 which needs UFRII driver. Printer is connected to the network via Ethernet. After searching a lot about weather printer can be directly accessed as a SMB share, I decided to make Ubuntu server as print server. The Windows clients send the print jobs to the server and the server will send those jobs via Ethernet to the printer. I followed this how-to for installing the driver. The driver compilation fails with the error that gtk 2.0 package is not available. I cannot have gtk on a headless server, it is very necessary that it should not have any graphical/desktop packages. What would be the solution for installing UFRII on Ubuntu 12.04 Server.

    Read the article

  • Do I lose anything by coding in c# and using free online vb.net code convertors?

    - by Gullu
    The company I work for uses vb.net since there are many programmers who moved up from vb6 to vb.net. Basically more vb.net resources in the company for support/maintenance vs c#. I am a c# coder and was wondering if I could just continue coding in c# and just use the many online free c# to vb.net code convertors. That way, I will be more productive and also be more marketable since there are more c# jobs compared to vb.net jobs. I have done vb6 many years ago and I am comfortable debugging vb.net code. It's just the primary coding language. I am more comfortable in c#. Will I lose anything if I use this approach. (code conversion). Based on what i read online the future of vb.net is really "Dim". Please advise. thank you

    Read the article

  • Le PDG de Netgear s'en prend à Apple et à « l'égo » de Steve Job et trouve que Windows Phone 7 est « Game Over »

    Le PDG de Netgear s'en prend à Apple et à « l'égo » de Steve Job Et trouve que Windows Phone 7 n'a aucune chance Apple, dont l'écosystème fermé suscite les critiques de cetains, s'est vu très vertement critiqué par Patrick Lo, le PDG de Netgear, qui s'en est également pris à la personnalité de Steve Jobs et à Microsoft. Interrogé par le Sidney Morning Herald, Lo a ainsi critiqué la décision de Steve Jobs dans l'affaire Flash - iOS « Quelle raison a-t-il de s'en prendre à Flash ? ». Un point de vue qui est partagé par d'autres. Mais Lo a sa propre explication : « Il n'y a aucune autre raison que son égo ». Lo trouve aussi critiquable la décision d'Apple de cent...

    Read the article

  • EM12c Release 4: Cloud Control to Major Tom...

    - by abulloch
    With the latest release of Enterprise Manager 12c, Release 4 (12.1.0.4) the EM development team has added new functionality to assist the EM Administrator to monitor the health of the EM infrastructure.   Taking feedback delivered from customers directly and through customer advisory boards some nice enhancements have been made to the “Manage Cloud Control” sections of the UI, commonly known in the EM community as “the MTM pages” (MTM stands for Monitor the Monitor).  This part of the EM Cloud Control UI is viewed by many as the mission control for EM Administrators. In this post we’ll highlight some of the new information that’s on display in these redesigned pages and explain how the information they present can help EM administrators identify potential bottlenecks or issues with the EM infrastructure. The first page we’ll take a look at is the newly designed Repository information page.  You can get to this from the main Setup menu, through Manage Cloud Control, then Repository.  Once this page loads you’ll see the new layout that includes 3 tabs containing more drill-down information. The Repository Tab The first tab, Repository, gives you a series of 6 panels or regions on screen that display key information that the EM Administrator needs to review from time to time to ensure that their infrastructure is in good health. Rather than go through every panel let’s call out a few and let you explore the others later yourself on your own EM site.  Firstly, we have the Repository Details panel. At a glance the EM Administrator can see the current version of the EM repository database and more critically, three important elements of information relating to availability and reliability :- Is the database in Archive Log mode ? Is the database using Flashback ? When was the last database backup taken ? In this test environment above the answers are not too worrying, however, Production environments should have at least Archivelog mode enabled, Flashback is a nice feature to enable prior to upgrades (for fast rollback) and all Production sites should have a backup.  In this case the backup information in the Control file indicates there’s been no recorded backups taken. The next region of interest to note on this page shows key information around the Repository configuration, specifically, the initialisation parameters (from the spfile). If you’re storing your EM Repository in a Cluster Database you can view the parameters on each individual instance using the Instance Name drop-down selector in the top right of the region. Additionally, you’ll note there is now a check performed on the active configuration to ensure that you’re using, at the very least, Oracle minimum recommended values.  Should the values in your EM Repository not meet these requirements it will be flagged in this table with a red X for non-compliance.  You can of-course change these values within EM by selecting the Database target and modifying the parameters in the spfile (and optionally, the run-time values if the parameter allows dynamic changes). The last region to call out on this page before moving on is the new look Repository Scheduler Job Status region. This region is an update of a similar region seen on previous releases of the MTM pages in Cloud Control but there’s some important new functionality that’s been added that customers have requested. First-up - Restarting Repository Jobs.  As you can see from the graphic, you can now optionally select a job (by selecting the row in the UI table element) and click on the Restart Job button to take care of any jobs which have stopped or stalled for any reason.  Previously this needed to be done at the command line using EMDIAG or through a PL/SQL package invocation.  You can now take care of this directly from within the UI. Next, you’ll see that a feature has been added to allow the EM administrator to customise the run-time for some of the background jobs that run in the Repository.  We heard from some customers that ensuring these jobs don’t clash with Production backups, etc is a key requirement.  This new functionality allows you to select the pencil icon to edit the schedule time for these more resource intensive background jobs and modify the schedule to avoid clashes like this. Moving onto the next tab, let’s select the Metrics tab. The Metrics Tab There’s some big changes here, this page contains new information regions that help the Administrator understand the direct impact the in-bound metric flows are having on the EM Repository.  Many customers have provided feedback that they are in the dark about the impact of adding new targets or large numbers of new hosts or new target types into EM and the impact this has on the Repository.  This page helps the EM Administrator get to grips with this.  Let’s take a quick look at two regions on this page. First-up there’s a bubble chart showing a comprehensive view of the top resource consumers of metric data, over the last 30 days, charted as the number of rows loaded against the number of collections for the metric.  The size of the bubble indicates a relative volume.  You can see from this example above that a quick glance shows that Host metrics are the largest inbound flow into the repository when measured by number of rows.  Closely following behind this though are a large number of collections for Oracle Weblogic Server and Application Deployment.  Taken together the Host Collections is around 0.7Mb of data.  The total information collection for Weblogic Server and Application Deployments is 0.38Mb and 0.37Mb respectively. If you want to get this information breakdown on the volume of data collected simply hover over the bubble in the chart and you’ll get a floating tooltip showing the information. Clicking on any bubble in the chart takes you one level deeper into a drill-down of the Metric collection. Doing this reveals the individual metric elements for these target types and again shows a representation of the relative cost - in terms of Number of Rows, Number of Collections and Storage cost of data for each Metric type. Looking at another panel on this page we can see a different view on this data. This view shows a view of the Top N metrics (the drop down allows you to select 10, 15 or 20) and sort them by volume of data.  In the case above we can see the largest metric collection (by volume) in this case (over the last 30 days) is the information about OS Registered Software on a Host target. Taken together, these two regions provide a powerful tool for the EM Administrator to understand the potential impact of any new targets that have been discovered and promoted into management by EM12c.  It’s a great tool for identifying the cause of a sudden increase in Repository storage consumption or Redo log and Archive log generation. Using the information on this page EM Administrators can take action to mitigate any load impact by deploying monitoring templates to the targets causing most load if appropriate.   The last tab we’ll look at on this page is the Schema tab. The Schema Tab Selecting this tab brings up a window onto the SYSMAN schema with a focus on Space usage in the EM Repository.  Understanding what tablespaces are growing, at what rate, is essential information for the EM Administrator to stay on top of managing space allocations for the EM Repository so that it works as efficiently as possible and performs well for the users.  Not least because ensuring storage is managed well ensures continued availability of EM for monitoring purposes. The first region to highlight here shows the trend of space usage for the tablespaces in the EM Repository over time.  You can see the upward trend here showing that storage in the EM Repository is being consumed on an upward trend over the last few days here. This is normal as this EM being used here is brand new with Agents being added daily to bring targets into monitoring.  If your Enterprise Manager configuration has reached a steady state over a period of time where the number of new inbound targets is relatively small, the metric collection settings are fairly uniform and standardised (using Templates and Template Collections) you’re likely to see a trend of space allocation that plateau’s. The table below the trend chart shows the Top 20 Tables/Indexes sorted descending by order of space consumed.  You can switch the trend view chart and corresponding detail table by choosing a different tablespace in the EM Repository using the drop-down picker on the top right of this region. The last region to highlight on this page is the region showing information about the Purge policies in effect in the EM Repository. This information is useful to illustrate to EM Administrators the default purge policies in effect for the different categories of information available in the EM Repository.  Of course, it’s also been a long requested feature to have the ability to modify these default retention periods.  You can also do this using this screen.  As there are interdependencies between some data elements you can’t modify retention policies on a feature by feature basis.  Instead, retention policies take categories of information and bundles them together in Groups.  Retention policies are modified at the Group Level.  Understanding the impact of this really deserves a blog post all on it’s own as modifying these can have a significant impact on both the EM Repository’s storage footprint and it’s performance.  For now, we’re just highlighting the features visibility on these new pages. As a user of EM12c we hope the new features you see here address some of the feedback that’s been given on these pages over the past few releases.  We’ll look out for any comments or feedback you have on these pages ! 

    Read the article

  • Oracle Releases New Mainframe Re-Hosting in Oracle Tuxedo 11g

    - by Jason Williamson
    I'm excited to say that we've released our next generation of Re-hosting in 11g. In fact I'm doing some hands-on labs now for our Systems Integrators in Italy in a couple of weeks and targeting Latin America next month. If you are an SI, or Rehosting firm and are looking to become an Oracle Partner or get a better understanding of Tuxedo and how to use the workbench for rehosting...drop me a line. Oracle Tuxedo Application Runtime for CICS and Batch 11g provides a CICS API emulation and Batch environment that exploits the full range of Oracle Tuxedo's capabilities. Re-hosted applications run in a multi-node, grid environment with centralized production control. Also, enterprise integration of CICS application services benefits from an open and SOA-enabled framework. Key features include: CICS Application Runtime: Can run IBM CICS applications unchanged in an application grid, which enables the distribution of large workloads across multiple processors and nodes. This simplifies CICS administration and can scale to over 100,000 users and over 50,000 transactions per second. 3270 Terminal Server: Protects business users from change through support for tn3270 terminal emulation. Distributed CICS Resource Management: Simplifies deployment and administration by allowing customers to run CICS regions in a distributed configuration. Batch Application Runtime: Provides robust IBM JES-like job management that enables local or remote job submissions. In addition, distributed batch initiators can enable parallelization of jobs and support fail-over, shortening the batch window and helping to meet stringent SLAs. Batch Execution Environment: Helps to run IBM batch unchanged and also supports JCL functionality and all common batch utilities. Oracle Tuxedo Application Rehosting Workbench 11g provides a set of automated migration tools integrated around a central repository. The tools provide high precision which results in very low error rates and the ability to handle large applications. This enables less expensive, low-risk migration projects. Key capabilities include: Workbench Repository and Cataloguer: Ensures integrity of the migrated application assets through full dependency checking. The Cataloguer generates and maintains all relevant meta-data on source and target components. File Migrator: Supports reliable migration of datasets and flat files to an ISAM or Oracle Database 11g. This is done through the automated migration utilities for data unloading, reloading and validation. It also generates logical access functions to shield developers from data repository changes. DB2 Migrator: Similarly, this tool automates the migration of DB2 schema and data to Oracle Database 11g. COBOL Migrator: Supports migration of IBM mainframe COBOL assets (OLTP and Batch) to open systems. Adapts programs for compiler dialects and data access variations. JCL Migrator: Supports migration of IBM JCL jobs to a Tuxedo ART environment, maintaining the flow and characteristics of batch jobs.

    Read the article

  • Building Simple Workflows in Oozie

    - by dan.mcclary
    Introduction More often than not, data doesn't come packaged exactly as we'd like it for analysis. Transformation, match-merge operations, and a host of data munging tasks are usually needed before we can extract insights from our Big Data sources. Few people find data munging exciting, but it has to be done. Once we've suffered that boredom, we should take steps to automate the process. We want codify our work into repeatable units and create workflows which we can leverage over and over again without having to write new code. In this article, we'll look at how to use Oozie to create a workflow for the parallel machine learning task I described on Cloudera's site. Hive Actions: Prepping for Pig In my parallel machine learning article, I use data from the National Climatic Data Center to build weather models on a state-by-state basis. NCDC makes the data freely available as gzipped files of day-over-day observations stretching from the 1930s to today. In reading that post, one might get the impression that the data came in a handy, ready-to-model files with convenient delimiters. The truth of it is that I need to perform some parsing and projection on the dataset before it can be modeled. If I get more observations, I'll want to retrain and test those models, which will require more parsing and projection. This is a good opportunity to start building up a workflow with Oozie. I store the data from the NCDC in HDFS and create an external Hive table partitioned by year. This gives me flexibility of Hive's query language when I want it, but let's me put the dataset in a directory of my choosing in case I want to treat the same data with Pig or MapReduce code. CREATE EXTERNAL TABLE IF NOT EXISTS historic_weather(column 1, column2) PARTITIONED BY (yr string) STORED AS ... LOCATION '/user/oracle/weather/historic'; As new weather data comes in from NCDC, I'll need to add partitions to my table. That's an action I should put in the workflow. Similarly, the weather data requires parsing in order to be useful as a set of columns. Because of their long history, the weather data is broken up into fields of specific byte lengths: x bytes for the station ID, y bytes for the dew point, and so on. The delimiting is consistent from year to year, so writing SerDe or a parser for transformation is simple. Once that's done, I want to select columns on which to train, classify certain features, and place the training data in an HDFS directory for my Pig script to access. ALTER TABLE historic_weather ADD IF NOT EXISTS PARTITION (yr='2010') LOCATION '/user/oracle/weather/historic/yr=2011'; INSERT OVERWRITE DIRECTORY '/user/oracle/weather/cleaned_history' SELECT w.stn, w.wban, w.weather_year, w.weather_month, w.weather_day, w.temp, w.dewp, w.weather FROM ( FROM historic_weather SELECT TRANSFORM(...) USING '/path/to/hive/filters/ncdc_parser.py' as stn, wban, weather_year, weather_month, weather_day, temp, dewp, weather ) w; Since I'm going to prepare training directories with at least the same frequency that I add partitions, I should also add that to my workflow. Oozie is going to invoke these Hive actions using what's somewhat obviously referred to as a Hive action. Hive actions amount to Oozie running a script file containing our query language statements, so we can place them in a file called weather_train.hql. Starting Our Workflow Oozie offers two types of jobs: workflows and coordinator jobs. Workflows are straightforward: they define a set of actions to perform as a sequence or directed acyclic graph. Coordinator jobs can take all the same actions of Workflow jobs, but they can be automatically started either periodically or when new data arrives in a specified location. To keep things simple we'll make a workflow job; coordinator jobs simply require another XML file for scheduling. The bare minimum for workflow XML defines a name, a starting point, and an end point: <workflow-app name="WeatherMan" xmlns="uri:oozie:workflow:0.1"> <start to="ParseNCDCData"/> <end name="end"/> </workflow-app> To this we need to add an action, and within that we'll specify the hive parameters Also, keep in mind that actions require <ok> and <error> tags to direct the next action on success or failure. <action name="ParseNCDCData"> <hive xmlns="uri:oozie:hive-action:0.2"> <job-tracker>localhost:8021</job-tracker> <name-node>localhost:8020</name-node> <configuration> <property> <name>oozie.hive.defaults</name> <value>/user/oracle/weather_ooze/hive-default.xml</value> </property> </configuration> <script>ncdc_parse.hql</script> </hive> <ok to="WeatherMan"/> <error to="end"/> </action> There are a couple of things to note here: I have to give the FQDN (or IP) and port of my JobTracker and NameNode. I have to include a hive-default.xml file. I have to include a script file. The hive-default.xml and script file must be stored in HDFS That last point is particularly important. Oozie doesn't make assumptions about where a given workflow is being run. You might submit workflows against different clusters, or have different hive-defaults.xml on different clusters (e.g. MySQL or Postgres-backed metastores). A quick way to ensure that all the assets end up in the right place in HDFS is just to make a working directory locally, build your workflow.xml in it, and copy the assets you'll need to it as you add actions to workflow.xml. At this point, our local directory should contain: workflow.xml hive-defaults.xml (make sure this file contains your metastore connection data) ncdc_parse.hql Adding Pig to the Ooze Adding our Pig script as an action is slightly simpler from an XML standpoint. All we do is add an action to workflow.xml as follows: <action name="WeatherMan"> <pig> <job-tracker>localhost:8021</job-tracker> <name-node>localhost:8020</name-node> <script>weather_train.pig</script> </pig> <ok to="end"/> <error to="end"/> </action> Once we've done this, we'll copy weather_train.pig to our working directory. However, there's a bit of a "gotcha" here. My pig script registers the Weka Jar and a chunk of jython. If those aren't also in HDFS, our action will fail from the outset -- but where do we put them? The Jython script goes into the working directory at the same level as the pig script, because pig attempts to load Jython files in the directory from which the script executes. However, that's not where our Weka jar goes. While Oozie doesn't assume much, it does make an assumption about the Pig classpath. Anything under working_directory/lib gets automatically added to the Pig classpath and no longer requires a REGISTER statement in the script. Anything that uses a REGISTER statement cannot be in the working_directory/lib directory. Instead, it needs to be in a different HDFS directory and attached to the pig action with an <archive> tag. Yes, that's as confusing as you think it is. You can get the exact rules for adding Jars to the distributed cache from Oozie's Pig Cookbook. Making the Workflow Work We've got a workflow defined and have collected all the components we'll need to run. But we can't run anything yet, because we still have to define some properties about the job and submit it to Oozie. We need to start with the job properties, as this is essentially the "request" we'll submit to the Oozie server. In the same working directory, we'll make a file called job.properties as follows: nameNode=hdfs://localhost:8020 jobTracker=localhost:8021 queueName=default weatherRoot=weather_ooze mapreduce.jobtracker.kerberos.principal=foo dfs.namenode.kerberos.principal=foo oozie.libpath=${nameNode}/user/oozie/share/lib oozie.wf.application.path=${nameNode}/user/${user.name}/${weatherRoot} outputDir=weather-ooze While some of the pieces of the properties file are familiar (e.g., JobTracker address), others take a bit of explaining. The first is weatherRoot: this is essentially an environment variable for the script (as are jobTracker and queueName). We're simply using them to simplify the directives for the Oozie job. The oozie.libpath pieces is extremely important. This is a directory in HDFS which holds Oozie's shared libraries: a collection of Jars necessary for invoking Hive, Pig, and other actions. It's a good idea to make sure this has been installed and copied up to HDFS. The last two lines are straightforward: run the application defined by workflow.xml at the application path listed and write the output to the output directory. We're finally ready to submit our job! After all that work we only need to do a few more things: Validate our workflow.xml Copy our working directory to HDFS Submit our job to the Oozie server Run our workflow Let's do them in order. First validate the workflow: oozie validate workflow.xml Next, copy the working directory up to HDFS: hadoop fs -put working_dir /user/oracle/working_dir Now we submit the job to the Oozie server. We need to ensure that we've got the correct URL for the Oozie server, and we need to specify our job.properties file as an argument. oozie job -oozie http://url.to.oozie.server:port_number/ -config /path/to/working_dir/job.properties -submit We've submitted the job, but we don't see any activity on the JobTracker? All I got was this funny bit of output: 14-20120525161321-oozie-oracle This is because submitting a job to Oozie creates an entry for the job and places it in PREP status. What we got back, in essence, is a ticket for our workflow to ride the Oozie train. We're responsible for redeeming our ticket and running the job. oozie -oozie http://url.to.oozie.server:port_number/ -start 14-20120525161321-oozie-oracle Of course, if we really want to run the job from the outset, we can change the "-submit" argument above to "-run." This will prep and run the workflow immediately. Takeaway So, there you have it: the somewhat laborious process of building an Oozie workflow. It's a bit tedious the first time out, but it does present a pair of real benefits to those of us who spend a great deal of time data munging. First, when new data arrives that requires the same processing, we already have the workflow defined and ready to run. Second, as we build up a set of useful action definitions over time, creating new workflows becomes quicker and quicker.

    Read the article

  • Steve Ballmer réfute la disparition des PC au profit des tablettes, quel sera l'avenir des appareils

    Mise à jour du 04.06.2010 par Katleen Steve Ballmer réfute la disparition des PCs au profit des tablettes en réponse à Steve Jobs, quel sera alors l'avenir des appareils numériques ? Suite au passage de Steve Jobs sur le grill des journalistes lors de la conférence D8, Steve Ballmer (CEO de Microsoft) s'est lui aussi assis dans la même chaise hier, où il a répondu aux questions de l'équipe de Wall Street Journal. «Quand nous étions une nation agraire, toutes les voitures étaient des camions. Mais à partir du moment où la population a commencé à migrer vers les villes, les gens ont commencé à utiliser des voitures. Je pense que les PC vont connaître le même sort que les camions. De moins en moins de ...

    Read the article

  • Does professionalism in job postings matter?

    - by Wings87
    I came across this job posting: http://www.justin.tv/jobs/jobs. The page contains some bad language ("No bulls--t"). I personally find this vaguely offensive, and it would certainly put me off applying to work there: It made me wonder what kind/personality of people work at this company. At least where I'm from (EU), the language would be considered bad form, and it would be seen as badly representing the company. Some job postings seem to go to the other extreme, filled with vacuous people skill descriptions and irrelevant details. It's tempting to dream up pictures of each company based on their job posting. So, does professionalism in job postings matter? Are you inclined to see through bad language, irrelevant corporate speak and so on, or do these affect interest in a job?

    Read the article

  • How could this diagram of the current most lucrative technologies be improved?

    - by Edward Tanguay
    I'm giving a talk in April 2011 on the "Developer English" and showing my non-developer audience, mostly English teachers, various diagrams to explain how developers see their industry etc. One of these diagrams is "Hot Technologies", basically, if you want to become a developer, what technologies should you learn to have the highest chance of (1) getting a job (2) making a good salary, and (3) work with the most exciting technology. This is a draft I made just to get some ideas out, basically C#, PHP, Java are where the bulk of the jobs are. Mobile development has a big future. JavaScript is becoming more and more important, and I want to list "minor technologies" such a Python, Ruby on Rails to the side, I assume e.g. that in general, there are a much smaller percentage of jobs in these technologies as in C#, PHP, Java. How could this diagram be improved?

    Read the article

  • How to start up rsyslog automatically?

    - by Enrique Videni
    Check current status of rsyslog $ chkconfig --list rsyslog rsyslog 0:off 1:off 2:off 3:off 4:off 5:off 6:off Start up rsyslog at some levels $ sudo chkconfig --level 35 rsyslog on It outputs these information: insserv: warning: script 'plymouth-stop' missing LSB tags and overrides insserv: Default-Start undefined, assuming empty start runlevel(s) for script `plymouth-stop' insserv: Default-Stop undefined, assuming empty stop runlevel(s) for script `plymouth-stop' The script you are attempting to invoke has been converted to an Upstart job, but lsb-header is not supported for Upstart jobs. insserv: warning: script 'failsafe-x' missing LSB tags and overrides insserv: Default-Start undefined, assuming empty start runlevel(s) for script `failsafe-x' insserv: Default-Stop undefined, assuming empty stop runlevel(s) for script `failsafe-x' The script you are attempting to invoke has been converted to an Upstart job, but lsb-header is not supported for Upstart jobs. insserv: warning: script 'udevtrigger' missing LSB tags and overrides insserv: Default-Start undefined, assuming empty start runlevel(s) for script `udevtrigger' Check current status of rsyslog again $ chkconfig --list rsyslog rsyslog 0:off 1:off 2:off 3:off 4:off 5:off 6:off I am a novice. Please show me how to use rsyslog from beginning.

    Read the article

  • Windows Azure Recipe: High Performance Computing

    - by Clint Edmonson
    One of the most attractive ways to use a cloud platform is for parallel processing. Commonly known as high-performance computing (HPC), this approach relies on executing code on many machines at the same time. On Windows Azure, this means running many role instances simultaneously, all working in parallel to solve some problem. Doing this requires some way to schedule applications, which means distributing their work across these instances. To allow this, Windows Azure provides the HPC Scheduler. This service can work with HPC applications built to use the industry-standard Message Passing Interface (MPI). Software that does finite element analysis, such as car crash simulations, is one example of this type of application, and there are many others. The HPC Scheduler can also be used with so-called embarrassingly parallel applications, such as Monte Carlo simulations. Whatever problem is addressed, the value this component provides is the same: It handles the complex problem of scheduling parallel computing work across many Windows Azure worker role instances. Drivers Elastic compute and storage resources Cost avoidance Solution Here’s a sketch of a solution using our Windows Azure HPC SDK: Ingredients Web Role – this hosts a HPC scheduler web portal to allow web based job submission and management. It also exposes an HTTP web service API to allow other tools (including Visual Studio) to post jobs as well. Worker Role – typically multiple worker roles are enlisted, including at least one head node that schedules jobs to be run among the remaining compute nodes. Database – stores state information about the job queue and resource configuration for the solution. Blobs, Tables, Queues, Caching (optional) – many parallel algorithms persist intermediate and/or permanent data as a result of their processing. These fast, highly reliable, parallelizable storage options are all available to all the jobs being processed. Training Here is a link to online Windows Azure training labs where you can learn more about the individual ingredients described above. (Note: The entire Windows Azure Training Kit can also be downloaded for offline use.) Windows Azure HPC Scheduler (3 labs)  The Windows Azure HPC Scheduler includes modules and features that enable you to launch and manage high-performance computing (HPC) applications and other parallel workloads within a Windows Azure service. The scheduler supports parallel computational tasks such as parametric sweeps, Message Passing Interface (MPI) processes, and service-oriented architecture (SOA) requests across your computing resources in Windows Azure. With the Windows Azure HPC Scheduler SDK, developers can create Windows Azure deployments that support scalable, compute-intensive, parallel applications. See my Windows Azure Resource Guide for more guidance on how to get started, including links web portals, training kits, samples, and blogs related to Windows Azure.

    Read the article

  • How to gain Professional Experience in Java/Java EE Development

    - by Deepak Chandrashekar
    I have been seeing opportunities go past me for just 1 reason: not having professional industry experience. I say to many employers that I'm capable of doing the job and show them the work I've done during the academics and also several personal projects which I took extra time and effort to teach myself the new industry standard technologies. But still, all they want is some 2-3 years experience in an industry. I'm a recent graduate with a Master's Degree in Computer science. I've been applying for quite a few jobs and most of these jobs require 2 years minimum experience. So, I thought somebody here might give me some realistic ideas about getting some experience which can be considered professional. Any kind of constructive comments are welcome.

    Read the article

  • MongoDB, 5 characters, and a free job board

    Today I came across shapado.com - a StackExchange-like open source system running on ruby and mongodb. It took a couple clicks and a few keystroke, and I had http://jobs.shapado.com/ setup and running for free. It was a quasi joke at first, but I figured it might be helpful to get this up and running. So, if you have any jobs to post, or would like to request work, please post away :) You can also set up your own, or download the source from the seemingly unreliable gitorious.    ...Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

    Read the article

  • unix career in programming

    - by mnunna
    I am currently working on a HP-UX platform and my role as a prod support team member involves mostly to write shell scripts. But I want to branch out into core systems programming in unix. A quick search on the internet threw no "unix systems programming jobs" in my area. I'm confused as what to do. I really would like to continue with unix as my core competency, but unix jobs are mostly of sys admin/ prod support type, of which I do not want a part of. Can anyone of you give me an informed advice on the career oppurtinities that await unix professionals?? Any advice would be appreciated.

    Read the article

  • How could a human factors degree help a computer scientist?

    - by Bob Dole
    I'm wrapping up a masters in CS and already have half the credit hours needed for a degree in Human Factors. I just recently discovered how useful understanding about cognition can help someone that creates user interfaces and am thirsty for more knowledge in the area. For me, it seems that having both a masters in Human Factors and CS would be very marketable but would there be jobs out there that would allow me to apply both? Meaning what I would really like to do is take the requirements for some application, apply different Human Factors theories( GOMS, CE+ ) to developing the interface, maybe do cognitive walk through with users to optimize the UI, then develop the application. Do jobs out there exist like this? The reason I ask, is because I'm wondering if most places just want you to be either a Human Factors Expert or a Developer but not both.

    Read the article

  • iAds : déjà 50 % du marché américain pour la régie publicitaire d'Apple avant même son lancement, pr

    Mise à jour du 08/06/10 iAds possèderait déjà 50 % du marché US des annonces mobiles Avant même son lancement, prévu pour le 1er juillet iAds, la nouvelle régie publicitaire d'Apple pour applications mobiles, sera officiellement lancée le 1er juillet prochain. Elle concernera les applications tournant sur les iPhones et iPod Touch qui embarqueront iOS 4 (ex-iPhone OS), le nouvel OS mobile d'Apple. Lors du WWDC, Steve Jobs a d'ores et déjà dévoilé plusieurs grands noms d'annonceurs impliqués dans le projet. Parmi eux, on compte bien sûr Disney (dont Jobs est un actionnaire influent) mais aussi des marques aussi différent...

    Read the article

  • Cloudera Hadoop Certification Value in IT Industry for freshers

    - by Saumitra
    I am a software developer with 8 months of experience in IT industry working on development of tools for BIG DATA analytics. I have learned Hadoop basics on my own and I am pretty comfortable with writing MapReduce Jobs, PIG, HIVE, Flume and other related projects. I am thinking of appearing for Cloudera Hadoop Certification. My question is whether it will benefit me in any way, considering that I am a fresher with not even 1 year of experience. Most of the jobs posting which I have seen related to Hadoop requires at least 3 years of experience. I currently work in India but I can relocate. Please help me in deciding whether I should invest my time in perfecting my Hadoop skills for certification?

    Read the article

  • What is "networking" for your career and how do you know if you have done it successfully?

    - by Jay Godse
    Many people suggest "networking" as a tool or technique to build your career, get better jobs, get promotions, et cetera. But what is "networking"? And more importantly, how do you know if you have "networked" or "built your network" "successfully"? (I quoted all the terms which I think may have subjective and widely varying definitions). Many folks think that networking is schmoozing at networking events. Others think it is adding "friends" to Facebook or LinkedIn. But how do measure the success of such networks or activities? But we all know people (perhaps ourselves) who have done those things and still have trouble getting jobs, promotions, and recognition.

    Read the article

< Previous Page | 21 22 23 24 25 26 27 28 29 30 31 32  | Next Page >