Search Results

Search found 487 results on 20 pages for 'etl instrumentation'.

Page 2/20 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

What is the best FREE solution to implement one ETL project in MySql

- by Pedro

Hi, What is the best FREE solution to implement one ETL project in MySql? Regards, Pedro

Read the article
Field specific errors for ETL

- by AaronLS

I am creating a ETL process in MS SQL Server and I would like to have errors specific to a particular column of a particular row. For example, the data is initially loaded from excel files into a table(we'll call the Initial table) where all columns are varchar(2000) and then I stage the data to another table(the DataTypedTable) that contains more specific data types (datetime,int, etc.) or more tightly constrained varchar lengths. I need to be able to create error messages for a specific field such as: "Jan. 13th" is not a valid date format for the submission date. Please use a format of MM/DD/YYYY These error messages would need to be stored in some way such that later in the process a automated process can create reports with the error messages such that each message references a specific row and field(someone will need to go back and correct the data in the source system and resubmit the excel file). So ideally it would be inserted into a Failures tables of some sort and contain the primary key of the failed row, the column name, and the error message. Question: So I am wondering if this can be accomplished with SSIS, or some open source tool like Talend, and if so, what would be your general approach? Or what hand coded approach you would take? Couple approaches I've thought of using SQL(up until no I have done ETL by hand in SQL procs, but I want to consider other approaches. Possible C# even.): Use a cursor to read through the Initial table, and for each row insert a blank record with only the primary key into the DataTyped table, then use a single update statement for each column, such that if that update fails I can insert a very specific error message specific to that column in the error messages table. Insert all the data as is into the DataTyped table, but have duplicate columns like SubmissionDate and SubmissionDateOld. After the initial insert the *Old columns have data, the rest are blank, and I have a single update for each column that sets the SubmissionDate based on the SubmissionDateOld. In addition to suggesting an approach, I'd like to know if you are using that approach or something similar already in the work you do.

Read the article
successive Joins in Rhino.ETL

- by jonathan

Hi I have 3 tables i want to join in 1 table. How can i do that with Rhino.ETL ? I know how to join 2 tables, but not 3... Thanks John

Read the article
a bit confuse, does SSAS stored ETL result?

- by Irman

a bit confuse round here, does SSAS stored ETL result? or is it SSAS connect directly to the production databases?

Read the article
android instrumentation testsuite

- by siri

Hi I have written two test cases in a package com.app.myapp.test When I try to run them both of them are not getting executed, only one test case gets executed and stops. I have written the following testsuite in the same package AllTests.java public class AllTests extends TestSuite { public static Test suite() { return new TestSuiteBuilder(AllTests.class).includePackages("./src/com.ni.mypaint.test","./src/com.ni.mpaint.test").build(); /* .includeAllPackagesUnderHere() .build();*/ } Is the code and location for this testsuite is correct?

Read the article
Replication Services as ETL extraction tool

- by jorg

In my last blog post I explained the principles of Replication Services and the possibilities it offers in a BI environment. One of the possibilities I described was the use of snapshot replication as an ETL extraction tool: “Snapshot Replication can also be useful in BI environments, if you don’t need a near real-time copy of the database, you can choose to use this form of replication. Next to an alternative for Transactional Replication it can be used to stage data so it can be transformed and moved into the data warehousing environment afterwards. In many solutions I have seen developers create multiple SSIS packages that simply copies data from one or more source systems to a staging database that figures as source for the ETL process. The creation of these packages takes a lot of (boring) time, while Replication Services can do the same in minutes. It is possible to filter out columns and/or records and it can even apply schema changes automatically so I think it offers enough features here. I don’t know how the performance will be and if it really works as good for this purpose as I expect, but I want to try this out soon!” Well I have tried it out and I must say it worked well. I was able to let replication services do work in a fraction of the time it would cost me to do the same in SSIS. What I did was the following: Configure snapshot replication for some Adventure Works tables, this was quite simple and straightforward. Create an SSIS package that executes the snapshot replication on demand and waits for its completion. This is something that you can’t do with out of the box functionality. While configuring the snapshot replication two SQL Agent Jobs are created, one for the creation of the snapshot and one for the distribution of the snapshot. Unfortunately these jobs are asynchronous which means that if you execute them they immediately report back if the job started successfully or not, they do not wait for completion and report its result afterwards. So I had to create an SSIS package that executes the jobs and waits for their completion before the rest of the ETL process continues. Fortunately I was able to create the SSIS package with the desired functionality. I have made a step-by-step guide that will help you configure the snapshot replication and I have uploaded the SSIS package you need to execute it. Configure snapshot replication The first step is to create a publication on the database you want to replicate. Connect to SQL Server Management Studio and right-click Replication, choose for New.. Publication… The New Publication Wizard appears, click Next Choose your “source” database and click Next Choose Snapshot publication and click Next You can now select tables and other objects that you want to publish Expand Tables and select the tables that are needed in your ETL process In the next screen you can add filters on the selected tables which can be very useful. Think about selecting only the last x days of data for example. Its possible to filter out rows and/or columns. In this example I did not apply any filters. Schedule the Snapshot Agent to run at a desired time, by doing this a SQL Agent Job is created which we need to execute from a SSIS package later on. Next you need to set the Security Settings for the Snapshot Agent. Click on the Security Settings button. In this example I ran the Agent under the SQL Server Agent service account. This is not recommended as a security best practice. Fortunately there is an excellent article on TechNet which tells you exactly how to set up the security for replication services. Read it here and make sure you follow the guidelines! On the next screen choose to create the publication at the end of the wizard Give the publication a name (SnapshotTest) and complete the wizard The publication is created and the articles (tables in this case) are added Now the publication is created successfully its time to create a new subscription for this publication. Expand the Replication folder in SSMS and right click Local Subscriptions, choose New Subscriptions The New Subscription Wizard appears Select the publisher on which you just created your publication and select the database and publication (SnapshotTest) You can now choose where the Distribution Agent should run. If it runs at the distributor (push subscriptions) it causes extra processing overhead. If you use a separate server for your ETL process and databases choose to run each agent at its subscriber (pull subscriptions) to reduce the processing overhead at the distributor. Of course we need a database for the subscription and fortunately the Wizard can create it for you. Choose for New database Give the database the desired name, set the desired options and click OK You can now add multiple SQL Server Subscribers which is not necessary in this case but can be very useful. You now need to set the security settings for the Distribution Agent. Click on the …. button Again, in this example I ran the Agent under the SQL Server Agent service account. Read the security best practices here Click Next Make sure you create a synchronization job schedule again. This job is also necessary in the SSIS package later on. Initialize the subscription at first synchronization Select the first box to create the subscription when finishing this wizard Complete the wizard by clicking Finish The subscription will be created In SSMS you see a new database is created, the subscriber. There are no tables or other objects in the database available yet because the replication jobs did not ran yet. Now expand the SQL Server Agent, go to Jobs and search for the job that creates the snapshot: Rename this job to “CreateSnapshot” Now search for the job that distributes the snapshot: Rename this job to “DistributeSnapshot” Create an SSIS package that executes the snapshot replication We now need an SSIS package that will take care of the execution of both jobs. The CreateSnapshot job needs to execute and finish before the DistributeSnapshot job runs. After the DistributeSnapshot job has started the package needs to wait until its finished before the package execution finishes. The Execute SQL Server Agent Job Task is designed to execute SQL Agent Jobs from SSIS. Unfortunately this SSIS task only executes the job and reports back if the job started succesfully or not, it does not report if the job actually completed with success or failure. This is because these jobs are asynchronous. The SSIS package I’ve created does the following: It runs the CreateSnapshot job It checks every 5 seconds if the job is completed with a for loop When the CreateSnapshot job is completed it starts the DistributeSnapshot job And again it waits until the snapshot is delivered before the package will finish successfully Quite simple and the package is ready to use as standalone extract mechanism. After executing the package the replicated tables are added to the subscriber database and are filled with data: Download the SSIS package here (SSIS 2008) Conclusion In this example I only replicated 5 tables, I could create a SSIS package that does the same in approximately the same amount of time. But if I replicated all the 70+ AdventureWorks tables I would save a lot of time and boring work! With replication services you also benefit from the feature that schema changes are applied automatically which means your entire extract phase wont break. Because a snapshot is created using the bcp utility (bulk copy) it’s also quite fast, so the performance will be quite good. Disadvantages of using snapshot replication as extraction tool is the limitation on source systems. You can only choose SQL Server or Oracle databases to act as a publisher. So if you plan to build an extract phase for your ETL process that will invoke a lot of tables think about replication services, it would save you a lot of time and thanks to the Extract SSIS package I’ve created you can perfectly fit it in your usual SSIS ETL process.

Read the article
ETL mechanisms for MySQL to SQL Server over WAN

- by Troy Hunt

I’m looking for some feedback on mechanisms to batch data from MySQL Community Server 5.1.32 with an external host down to an internal SQL Server 05 Enterprise machine over VPN. The external box accumulates data throughout business hours (about 100Mb per day), which then needs to be transferred internationally across a WAN connection to an internal corporate environment before some BI work is performed. This should just be change-sets making their way down each night. I’m interested in thoughts on the ETL mechanisms people have successfully used in similar scenarios before. SSIS seems like a potential candidate; can anyone comment on the suitability for this scenario? Alternatively, other thoughts on how to do this in a cost-conscious way would be most appreciated. Thanks!

Read the article
Adhoc Data processing / ETL

- by Dane

I've just started at a new company in outsourced communications (e.g. print and mail, email, fax). One of the requirements is to process clients data and get it ready for mailing. For recurring jobs, this is easy using an ETL tool linked in with some addressing software, but for adhoc stuff it's a bit overkill. I've used inhouse developed stuff before (clunky but usable), but I don't want to have to re-develop that here. Any recommendations? Some features : Basic DBMS functionality (preferably with a proper DBMS backend for SQL support) Field concatenation (e.g. combine Firstname + Surname) "Pushing columns" (e.g. with address fields 1 - 8, push them left so if one is blank, the next one gets pushed up) Australia post mail sorting and dpid allocation (or can link into external tools relatively easily)

Read the article
ETL , Esper or Drools?

- by geoaxis

Hello, The question environment relates to JavaEE, Spring I am developing a system which can start and stop arbitrary TCP (or other) listeners for incoming messages. There could be a need to authenticate these messages. These messages need to be parsed and stored in some other entities. These entities model which fields they store. So for example if I have property1 that can have two text fields FillLevel1 and FillLevel2, I could receive messages on TCP which have both fill levels specified in text as F1=100;F2=90 Later I could add another filed say FillLevel3 when I start receiving messages F1=xx;F2=xx;F3=xx. But this is a conscious decision on the part of system modeler. My question is what do you think is better to use for parsing and storing the message. ETL (using Pantaho, which is used in other system) where you store the raw message and use task executor to consume them one by one and store the transformed messages as per your rules. One could use Espr or Drools to do the same thing , storing rules and executing them with timer, but I am not sure how dynamic you could get with making rules (they have to be made by end user in a running system and preferably in most user friendly way, ie no scripts or code, only GUI) The end user should be capable of changing the parse rules. It is also possible that end user might want to change the archived data as well (for example in the above example if a new value of FillLevel is added, one would like to put a FillLevel=-99 in the previous values to make the data consistent). Please ask for explanations, I have the feeling that I need to revise this question a bit. Thanks

Read the article
PostgreSQL to Data-Warehouse: Best approach for near-real-time ETL / extraction of data

- by belvoir

Background: I have a PostgreSQL (v8.3) database that is heavily optimized for OLTP. I need to extract data from it on a semi real-time basis (some-one is bound to ask what semi real-time means and the answer is as frequently as I reasonably can but I will be pragmatic, as a benchmark lets say we are hoping for every 15min) and feed it into a data-warehouse. How much data? At peak times we are talking approx 80-100k rows per min hitting the OLTP side, off-peak this will drop significantly to 15-20k. The most frequently updated rows are ~64 bytes each but there are various tables etc so the data is quite diverse and can range up to 4000 bytes per row. The OLTP is active 24x5.5. Best Solution? From what I can piece together the most practical solution is as follows: Create a TRIGGER to write all DML activity to a rotating CSV log file Perform whatever transformations are required Use the native DW data pump tool to efficiently pump the transformed CSV into the DW Why this approach? TRIGGERS allow selective tables to be targeted rather than being system wide + output is configurable (i.e. into a CSV) and are relatively easy to write and deploy. SLONY uses similar approach and overhead is acceptable CSV easy and fast to transform Easy to pump CSV into the DW Alternatives considered .... Using native logging (http://www.postgresql.org/docs/8.3/static/runtime-config-logging.html). Problem with this is it looked very verbose relative to what I needed and was a little trickier to parse and transform. However it could be faster as I presume there is less overhead compared to a TRIGGER. Certainly it would make the admin easier as it is system wide but again, I don't need some of the tables (some are used for persistent storage of JMS messages which I do not want to log) Querying the data directly via an ETL tool such as Talend and pumping it into the DW ... problem is the OLTP schema would need tweaked to support this and that has many negative side-effects Using a tweaked/hacked SLONY - SLONY does a good job of logging and migrating changes to a slave so the conceptual framework is there but the proposed solution just seems easier and cleaner Using the WAL Has anyone done this before? Want to share your thoughts?

Read the article
ETL Operation - Return Primary Key

- by user302254

I am using Talend to populate a data warehouse. My job is writing customer data to a dimension table and transaction data to the fact table. The surrogate key (p_key) on the fact table is auto-incrementing. When I insert a new customer, I need my fact table to reflect the id of the related customer. As I mentioned my p_key is auto auto_incrementing so I can't just insert an arbitrary value for the p_key. Any thought on how I can insert a row into my dimension table and still retrieve the primary key to reference in my fact record? Thanks.

Read the article
What ETL tool do you use?

- by VanOrman

I've been trying to make a decision as to which package to implement. I've looked at Pentaho, Talend, SSIS, Informatica, etc... What do you use?

Read the article
SQL SERVER – SSIS Parameters in Parent-Child ETL Architectures – Notes from the Field #040

- by Pinal Dave

[Notes from Pinal]: SSIS is very well explored subject, however, there are so many interesting elements when we read, we learn something new. A similar concept has been Parent-Child ETL architecture’s relationship in SSIS. Linchpin People are database coaches and wellness experts for a data driven world. In this 40th episode of the Notes from the Fields series database expert Tim Mitchell (partner at Linchpin People) shares very interesting conversation related to how to understand SSIS Parameters in Parent-Child ETL Architectures. In this brief Notes from the Field post, I will review the use of SSIS parameters in parent-child ETL architectures. A very common design pattern used in SQL Server Integration Services is one I call the parent-child pattern. Simply put, this is a pattern in which packages are executed by other packages. An ETL infrastructure built using small, single-purpose packages is very often easier to develop, debug, and troubleshoot than large, monolithic packages. For a more in-depth look at parent-child architectures, check out my earlier blog post on this topic. When using the parent-child design pattern, you will frequently need to pass values from the calling (parent) package to the called (child) package. In older versions of SSIS, this process was possible but not necessarily simple. When using SSIS 2005 or 2008, or even when using SSIS 2012 or 2014 in package deployment mode, you would have to create package configurations to pass values from parent to child packages. Package configurations, while effective, were not the easiest tool to work with. Fortunately, starting with SSIS in SQL Server 2012, you can now use package parameters for this purpose. In the example I will use for this demonstration, I’ll create two packages: one intended for use as a child package, and the other configured to execute said child package. In the parent package I’m going to build a for each loop container in SSIS, and use package parameters to pass in a value – specifically, a ClientID – for each iteration of the loop. The child package will be executed from within the for each loop, and will create one output file for each client, with the source query and filename dependent on the ClientID received from the parent package. Configuring the Child and Parent Packages When you create a new package, you’ll see the Parameters tab at the package level. Clicking over to that tab allows you to add, edit, or delete package parameters. As shown above, the sample package has two parameters. Note that I’ve set the name, data type, and default value for each of these. Also note the column entitled Required: this allows me to specify whether the parameter value is optional (the default behavior) or required for package execution. In this example, I have one parameter that is required, and the other is not. Let’s shift over to the parent package briefly, and demonstrate how to supply values to these parameters in the child package. Using the execute package task, you can easily map variable values in the parent package to parameters in the child package. The execute package task in the parent package, shown above, has the variable vThisClient from the parent package mapped to the pClientID parameter shown earlier in the child package. Note that there is no value mapped to the child package parameter named pOutputFolder. Since this parameter has the Required property set to False, we don’t have to specify a value for it, which will cause that parameter to use the default value we supplied when designing the child pacakge. The last step in the parent package is to create the for each loop container I mentioned earlier, and place the execute package task inside it. I’m using an object variable to store the distinct client ID values, and I use that as the iterator for the loop (I describe how to do this more in depth here). For each iteration of the loop, a different client ID value will be passed into the child package parameter. The final step is to configure the child package to actually do something meaningful with the parameter values passed into it. In this case, I’ve modified the OleDB source query to use the pClientID value in the WHERE clause of the query to restrict results for each iteration to a single client’s data. Additionally, I’ll use both the pClientID and pOutputFolder parameters to dynamically build the output filename. As shown, the pClientID is used in the WHERE clause, so we only get the current client’s invoices for each iteration of the loop. For the flat file connection, I’m setting the Connection String property using an expression that engages both of the parameters for this package, as shown above. Parting Thoughts There are many uses for package parameters beyond a simple parent-child design pattern. For example, you can create standalone packages (those not intended to be used as a child package) and still use parameters. Parameter values may be supplied to a package directly at runtime by a SQL Server Agent job, through the command line (via dtexec.exe), or through T-SQL. Also, you can also have project parameters as well as package parameters. Project parameters work in much the same way as package parameters, but the parameters apply to all packages in a project, not just a single package. Conclusion Of the numerous advantages of using catalog deployment model in SSIS 2012 and beyond, package parameters are near the top of the list. Parameters allow you to easily share values from parent to child packages, enabling more dynamic behavior and better code encapsulation. If you want me to take a look at your server and its settings, or if your server is facing any issue we can Fix Your SQL Server. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: Notes from the Field, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
Heterogén adatelérés OWB-vel: ODI EE Enterprise ETL

- by Fekete Zoltán

Az elozo ketto blogbejegyzéshez kapcsolódva felmerül a kérdés: Hogyan lehet az Oracle Warehouse Builderrel heterogén adatforrásokat elérni? Ajánlott olvasmány: Oracle Warehouse Builder 11gR2: OWB ETL Using ODI Knowledge Modules Természetesen az OWB az Oracle Database Heterogeneous Services-zel ODBC-vel illetve Oracle Gateway-k alkalmazásával eddig is lehetett mindenféle ODBC kompatibilis továbbá mainframe-es adatbázisokat elérni. Oracle Database Gateways: MS SQL Server, Sybase, Teradata, Informix, ODBC, DRDA, APPC, WebSphere MQ, DB2, DB2/400. A megfelelo Application Adapters megvásárlásával lehet csatlakozni az OWB-vel például a következo forrásokhoz: SAP, Oracle E-Business Suite, Peoplesoft, Siebel, Oracle Customer Data Hub (CDH), Universal Customer Master (UCM), Product Information Management (PIM). Az OWB 11gR2-tol kezdve az OWB tudja használni az Oracle Data Integrator Knowledge moduljait a heterogén adatelérésre, ez JDBC-vel illetve más heterogén elérési módokkal. Ajánlott olvasmány: Oracle Warehouse Builder 11gR2: OWB ETL Using ODI Knowledge Modules Letöltés: Oracle Warehouse Builder. BTW az OWB Java-s kliens szoftver Linux-on és Windows-on is használható. A szerver oldal pedig természetesen az Oracle adatbázisban fut: Solaris, Linux, HP-UX, AIX, Windows operációs rendszereken.

Read the article
Do reactive extensions and ETL go together?

- by Aaron Anodide

I don't fully understand reactive extensions, but my inital reading caused me think about the ETL code I have. Right now its basically a workflow to to perform various operations in a certain sequence based on conditions it find as it progresses. I can also imagine an event driven way such that only a small amount of imperative logic causes a chain reaction to occur. Of course I don't need a new type of programming model to make an event driven collaboration like that. Just the same I am wondering if ETL is a good fit for potentially exploring Rx further. Is my connection in a valid direction even? If not, could you briefly correct the error in my logic?

Read the article
Extracting, Transforming, and Loading (ETL) Process

The process of Extracting, Transforming, and Loading data in to a data warehouse is called Extract Transform Load (ETL) process. This process can be used to obtain, analyze, and clean data from various data sources so that it can be stored in a uniform manner within a data warehouse. This data can then be used by various business intelligence processes to provide an organization with more of an in depth analysis of the current state of the company and where it is heading. A standard ETL process that might be used by a health care system may include importing all of their patients names, diagnoses and prescriptions in to a unified data warehouse so that trends can be spotted in regards to outbreaks like the flu and also predict potential illness that a patient might be affected by based on other patients with similar symptoms.

Read the article
Video Tutorials for Ab Initio ETL Data Warehousing Tool?

- by user25588

Where can I find video tutorials for "Ab Initio ETL Data Warehousing Tool"? I surfed in google for many days but i have no luck in finding the material.

Read the article
Service Broker, not ETL

- by jamiet

I have been very quiet on this blog of late and one reason for that is I have been very busy on a client project that I would like to talk about a little here. The client that I have been working for has a website that runs on a distributed architecture utilising a messaging infrastructure for communication between different endpoints. My brief was to build a system that could consume these messages and produce analytical information in near-real-time. More specifically I basically had to deliver a data warehouse however it was the real-time aspect of the project that really intrigued me. This real-time requirement meant that using an Extract transformation, Load (ETL) tool was out of the question and so I had no choice but to write T-SQL code (i.e. stored-procedures) to process the incoming messages and load the data into the data warehouse. This concerned me though – I had no way to control the rate at which data would arrive into the system yet we were going to have end-users querying the system at the same time that those messages were arriving; the potential for contention in such a scenario was pretty high and and was something I wanted to minimise as much as possible. Moreover I did not want the processing of data inside the data warehouse to have any impact on the customer-facing website. As you have probably guessed from the title of this blog post this is where Service Broker stepped in! For those that have not heard of it Service Broker is a queuing technology that has been built into SQL Server since SQL Server 2005. It provides a number of features however the one that was of interest to me was the fact that it facilitates asynchronous data processing which, in layman’s terms, means the ability to process some data without requiring the system that supplied the data having to wait for the response. That was a crucial feature because on this project the customer-facing website (in effect an OLTP system) would be calling one of our stored procedures with each message – we did not want to cause the OLTP system to wait on us every time we processed one of those messages. This asynchronous nature also helps to alleviate the contention problem because the asynchronous processing activity is handled just like any other task in the database engine and hence can wait on another task (such as an end-user query). Service Broker it was then! The stored procedure called by the OLTP system would simply put the message onto a queue and we would use a feature called activation to pick each message off the queue in turn and process it into the warehouse. At the time of writing the system is not yet up to full capacity but so far everything seems to be working OK (touch wood) and crucially our users are seeing data in near-real-time. By near-real-time I am talking about latencies of a few minutes at most and to someone like me who is used to building systems that have overnight latencies that is a huge step forward! So then, am I advocating that you all go out and dump your ETL tools? Of course not, no! What this project has taught me though is that in certain scenarios there may be better ways to implement a data warehouse system then the traditional “load data in overnight” approach that we are all used to. Moreover I have really enjoyed getting to grips with a new technology and even if you don’t want to use Service Broker you might want to consider asynchronous messaging architectures for your BI/data warehousing solutions in the future. This has been a very high level overview of my use of Service Broker and I have deliberately left out much of the minutiae of what has been a very challenging implementation. Nonetheless I hope I have caused you to reflect upon your own approaches to BI and question whether other approaches may be more tenable. All comments and questions gratefully received! Lastly, if you have never used Service Broker before and want to kick the tyres I have provided below a very simple “Service Broker Hello World” script that will create all of the objects required to facilitate Service Broker communications and then send the message “Hello World” from one place to anther! This doesn’t represent a “proper” implementation per se because it doesn’t close down down conversation objects (which you should always do in a real-world scenario) but its enough to demonstrate the capabilities! @Jamiet ----------------------------------------------------------------------------------------------- /*This is a basic Service Broker Hello World app. Have fun! -Jamie */ USE MASTER GO CREATE DATABASE SBTest GO --Turn Service Broker on! ALTER DATABASE SBTest SET ENABLE_BROKER GO USE SBTest GO -- 1) we need to create a message type. Note that our message type is -- very simple and allowed any type of content CREATE MESSAGE TYPE HelloMessage VALIDATION = NONE GO -- 2) Once the message type has been created, we need to create a contract -- that specifies who can send what types of messages CREATE CONTRACT HelloContract (HelloMessage SENT BY INITIATOR) GO --We can query the metadata of the objects we just created SELECT * FROM sys.service_message_types WHERE name = 'HelloMessage'; SELECT * FROM sys.service_contracts WHERE name = 'HelloContract'; SELECT * FROM sys.service_contract_message_usages WHERE service_contract_id IN (SELECT service_contract_id FROM sys.service_contracts WHERE name = 'HelloContract') AND message_type_id IN (SELECT message_type_id FROM sys.service_message_types WHERE name = 'HelloMessage'); -- 3) The communication is between two endpoints. Thus, we need two queues to -- hold messages CREATE QUEUE SenderQueue CREATE QUEUE ReceiverQueue GO --more querying metatda SELECT * FROM sys.service_queues WHERE name IN ('SenderQueue','ReceiverQueue'); --we can also select from the queues as if they were tables SELECT * FROM SenderQueue SELECT * FROM ReceiverQueue -- 4) Create the required services and bind them to be above created queues CREATE SERVICE Sender ON QUEUE SenderQueue CREATE SERVICE Receiver ON QUEUE ReceiverQueue (HelloContract) GO --more querying metadata SELECT * FROM sys.services WHERE name IN ('Receiver','Sender'); -- 5) At this point, we can begin the conversation between the two services by -- sending messages DECLARE @conversationHandle UNIQUEIDENTIFIER DECLARE @message NVARCHAR(100) BEGIN BEGIN TRANSACTION; BEGIN DIALOG @conversationHandle FROM SERVICE Sender TO SERVICE 'Receiver' ON CONTRACT HelloContract WITH ENCRYPTION=OFF -- Send a message on the conversation SET @message = N'Hello, World'; SEND ON CONVERSATION @conversationHandle MESSAGE TYPE HelloMessage (@message) COMMIT TRANSACTION END GO --check contents of queues SELECT * FROM SenderQueue SELECT * FROM ReceiverQueue GO -- Receive a message from the queue RECEIVE CONVERT(NVARCHAR(MAX), message_body) AS MESSAGE FROM ReceiverQueue GO --If no messages were received and/or you can't see anything on the queues you may wish to check the following for clues: SELECT * FROM sys.transmission_queue -- Cleanup DROP SERVICE Sender DROP SERVICE Receiver DROP QUEUE SenderQueue DROP QUEUE ReceiverQueue DROP CONTRACT HelloContract DROP MESSAGE TYPE HelloMessage GO USE MASTER GO DROP DATABASE SBTest GO

Read the article
SQL SERVER – Integrate Your Data with Skyvia – Cloud ETL Solution

- by Pinal Dave

In our days data integration often becomes a key aspect of business success. For business analysts it’s very important to get integrated data from various sources, such as relational databases, cloud CRMs, etc. to make correct and successful decisions. There are various data integration solutions on market, and today I will tell about one of them – Skyvia. Skyvia is a cloud data integration service, which allows integrating data in cloud CRMs and different relational databases. It is a completely online solution and does not require anything except for a browser. Skyvia provides powerful etl tools for data import, export, replication, and synchronization for SQL Server and other databases and cloud CRMs. You can use Skyvia data import tools to load data from various sources to SQL Server (and SQL Azure). Skyvia supports such cloud CRMs as Salesforce and Microsoft Dynamics CRM and such databases as MySQL and PostgreSQL. You even can migrate data from SQL Server to SQL Server, or from SQL Server to other databases and cloud CRMs. Additionally Skyvia supports import of CSV files, either uploaded manually or stored on cloud file storage services, such as Dropbox, Box, Google Drive, or FTP servers. When data import is not enough, Skyvia offers bidirectional data synchronization. With this tool, you can synchronize SQL Server data with other databases and cloud CRMs. After performing the first synchronization, Skyvia tracks data changes in the synchronized data storages. In SQL Server databases (and other relational databases) it creates additional tracking tables and triggers. This allows synchronizing only the changed data. Skyvia also maps records by their primary key values to each other, so it does not require different sources to have the same primary key structure. It still can match the corresponding records without having to add any additional columns or changing data structure. The only requirement for synchronization is that primary keys must be autogenerated. With Skyvia it’s not necessary for data to have the same structure in integrated data storages. Skyvia supports powerful mapping mechanisms that allow synchronizing data with completely different structure. It provides support for complex mathematical and string expressions when mapping data, using lookups, etc. You may use data splitting – loading data from a single CSV file or source table to multiple related target tables. Or you may load data from several source CSV files or tables to several related target tables. In each case Skyvia preserves data relations. It builds corresponding relations between the target data automatically. When you often work with cloud CRM data, native CRM data reporting and analysis tools may be not enough for you. And there is a vast set of professional data analysis and reporting tools available for SQL Server. With Skyvia you can quickly copy your cloud CRM data to an SQL Server database and apply corresponding SQL Server tools to the data. In such case you can use Skyvia data replication tools. It allows you to quickly copy cloud CRM data to SQL Server or other databases without customizing any mapping. You need just to specify columns to copy data from. Target database tables will be created automatically. Skyvia offers powerful filtering settings to replicate only the records you need. Skyvia also provides capability to export data from SQL Server (including SQL Azure) and other databases and cloud CRMs to CSV files. These files can be either downloadable manually or loaded to cloud file storages or FTP server. You can use export, for example, to backup SQL Azure data to Dropbox. Any data integration operation can be scheduled for automatic execution. Thus, you can automate your SQL Azure data backup or data synchronization – just configure it once, then schedule it, and benefit from automatic data integration with Skyvia. Currently registration and using Skyvia is completely free, so you can try it yourself and find out whether its data migration and integration tools suits for you. Visit this link to register on Skyvia: https://app.skyvia.com/register Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: Cloud Computing

Read the article
OBIA on Teradata - Part 2 Teradata DB Utilization for ETL

- by Mohan Ramanuja

Techniques to Monitor Queries and ETL Load CPU and Disk I/OSelect username, processor, sum(cputime), sum(diskio) from dbc.ampusage where processor ='1-0' order by 2,3 descgroup by 1,2;UserName Vproc Sum(CpuTime) Sum(DiskIO)AC00916 10 6.71 24975 List Hardware ErrorsThere is a possibility that the system might have adequate disk space but out of free cylinders. In order to monitor hardware errors, the following query was used:Select * from dbc.Software_Event_Log where Text like '%restart%' order by thedate, thetime;For active users, usage of CPU and analysis of bad CPU to I/O ratiosSelect * from DBC.AMPUSAGE where username='CRMSTGC_DEV_ID'; AND SUBSTR(ACCOUNTNAME,6,3)='006'; Usage By I/OSelect AccountName, UserName, sum(CpuTime), sum(DiskIO) from DBC.AMPUSAGE group by AccountName, UserName Order by Sum(DiskIO) desc; AccountName UserName Sum(CpuTime) Sum(DiskIO)$M1$10062209 AB89487 374628.612 7821847$M1$10062210 AB89487 186692.244 2799412$M1$10062213 COC_ETL_ID 119531.068 331100426$M1$10062200 AB63472 118973.316 109881984$M1$10062204 AB63472 110825.356 94666986$M1$10062201 AB63472 110797.976 75016994$M1$10062202 AC06936 100924.448 407839702$M1$10062204 AB67963 0 4$M1$10062207 AB91990 0 2$M1$10062208 AB63461 0 24$M1$10062211 AB84332 0 6$M1$10062214 AB65484 0 8$M1$10062205 AB77529 0 58$M1$10062210 AC04768 0 36$M1$10062206 AB54940 0 22 Usage By CPUSelect AccountName, UserName, sum(CpuTime), sum(DiskIO) from DBC.AMPUSAGE group by AccountName, UserName Order by Sum(CpuTime) desc;AccountName UserName Sum(CpuTime) Sum(DiskIO)$M1$10062209 AB89487 374628.612 7821847$M1$10062210 AB89487 186692.244 2799412$M1$10062213 COC_ETL_ID 119531.068 331100426$M1$10062200 AB63472 118973.316 109881984$M1$10062204 AB63472 110825.356 94666986$M1$10062201 AB63472 110797.976 75016994$M2$100622105813004760047LOAD T23_ETLPROC_ENT 0 6$M1$10062215 AA37720 0 180$M1$10062209 AB81670 0 6Select count(distinct vproc) from dbc.ampusage;432select * from dbc.dbcinfo;AccountName UserName CpuTime DiskIO CpuTimeNorm Vproc VprocType Model$M1$10062205 CRM_STGC_DEV_ID 0.32 1764 12.7423999023438 0 AMP 2580$M1$10062205 CRM_STGC_DEV_ID 0.28 1730 11.1495999145508 3 AMP 2580$M1$10062205 CRM_STGC_DEV_ID 0.304 1736 12.1052799072266 4 AMP 2580$M1$10062205 CRM_STGC_DEV_ID 0.248 1731 9.87535992431641 7 AMP 2580$M1$10062205 CRM_STGC_DEV_ID 0.332 1731 13.2202398986816 8 AMP 2580$M1$10062205 CRM_STGC_DEV_ID 0.284 1712 11.3088799133301 11 AMP 2580$M1$10062205 CRM_STGC_DEV_ID 0.24 1757 9.55679992675781 12 AMP 2580$M1$10062205 CRM_STGC_DEV_ID 0.292 1737 11.6274399108887 15 AMP 2580$M1$10062205 CRM_STGC_DEV_ID 0.268 1753 10.6717599182129 16 AMP 2580$M1$10062205 CRM_STGC_DEV_ID 0.276 1732 10.9903199157715 19 AMP 2580select * from dbc.dbcinfo;InfoKey InfoDataLANGUAGE SUPPORT MODE StandardRELEASE 12.00.03.03VERSION 12.00.03.01a

Read the article
Straight Java/Groovy versus ETL tool (Talend/etc) - what libraries would you use?

- by Alex R

Assume you have a small project which on the surface looks like a good match for an ETL tool like Talend. But assume further, that you have never used Talend and furthermore, you do not trust "visual programming" tools in general and would rather code everything the old fashioned way (text on a nice IDE!) with the help of an appropriate language & support libraries. What are some language patterns & support libraries that could help you stay away from the ETL tool temptation/trap?

Read the article
How to run Android instrumentation tests from the command line (in Kubuntu)?

- by KK

We are able to run instrumentation tests of Android from the command line on Windows by launching: adb shell am instrument -w <package.test>/android.test.InstrumentationTestRunner This gives us good results. Using the same architecture, we are unable to run the same in Kubuntu. We have the same setup in Kubuntu. Can someone let us know, if there are packages with same name.. Then what package will the adb shell point? How will the emulator connect with adb shell from cmd line? DO we need to do any changes to do so in Kubuntu ?

Read the article
Tool to automatically add tracing or instrumentation to an .NET app?

- by Tony_Henrich

I have a .NET app which runs once a day. Sometimes there's a logical error somewhere which occurs randomly. Is there a tool like Eqatec's Tracer which automatically injects instrumentation in the code and logs everything to a file? The issue with Eqatec's Tracer is that it opens a window on the machine and it doesn't seem to log. My issue rarely occurs and I need to keep all logs for a long time. I don't want to change the source code and add logging statements all over the place. I am aware of PostSharp & Logfaces.

Read the article
Kimball University: Three ETL Compromises to Avoid

Why neglecting slowly changing dimensions, failing to capture metadata and overlooking scope creep can be the undoing of a dimensional data warehousing initiative.

Read the article
How to instrument existing ASP.NET application?

- by jkohlhepp

We have several highly complex ASP.NET web applications that are used internally by hundreds of users. We are trying to figure out which areas of the applications to invest in to improve functionality, but we aren't sure which screens/features are more heavily used. So, ideally, I'd like to find a way to add a layer of instrumentation to the applications that gathers metrics on which buttons are being clicked, which text boxes are being used, etc. Are there any products / open source apps out there that will do this sort of instrumentation for ASP.NET? Obviously I could do it myself manually by going into the code and injecting logging statements everywhere but this would be a significant amount of work that will be hard to accomplish.

Read the article

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >