Search Results

Search found 22929 results on 918 pages for 'ssis script'.

  • The blocking nature of aggregates

    - by Rob Farley
    I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here.
    In particular, we looked at two operators that could be used to ensure that a query returns only distinct rows: Sort and Hash Match. The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a hashing function on each row as it comes in, and then looks to see if it’s created a hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow.
    But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators, coming in blocking and non-blocking varieties.
    The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge: I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end.
    By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand.
    Alternatively, if the pile of cards is not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example.
    You might remember that my earlier Hash Match operation – used for Flow Distinct – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a hash function to some data and seeing if the set of values has been seen before, but this time it needs more information than the mere existence of a new set of values: it needs to consider how many of them there are.
    A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straightforward stuff, and information that most people are fully aware of.
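The card-counting analogy can be sketched in a few lines of Python. This is only an illustration of the two strategies, not how SQL Server implements its operators; the function names `stream_aggregate` and `hash_aggregate` and the sample `hand` are made up for the example. The streaming version assumes its input is already sorted by suit and releases each count as soon as the suit changes; the hashing version accepts any order but cannot release anything until it has seen every card.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def stream_aggregate(cards: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Count cards per suit, assuming the input is already sorted by suit.

    Non-blocking: a suit's count is yielded as soon as the suit changes.
    """
    current, count = None, 0
    for suit in cards:
        if count and suit != current:
            yield current, count  # group finished, release it immediately
            count = 0
        current = suit
        count += 1
    if count:
        yield current, count      # the last group is released at end of input


def hash_aggregate(cards: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Count cards per suit for input in any order.

    Blocking: nothing is yielded until every card has been seen.
    """
    piles = Counter()
    for suit in cards:
        piles[suit] += 1          # drop the card on its little pile
    yield from piles.items()      # only now are the totals final


# Hypothetical sorted Bridge hand of 13 cards.
hand = ["Hearts"] * 4 + ["Spades"] * 3 + ["Clubs"] * 6
```

Calling `next(stream_aggregate(iter(hand)))` returns the Hearts count after reading only the first Spade, whereas `next(hash_aggregate(iter(hand)))` has to consume the whole hand first.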
    I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us an Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component.
    And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things.) A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it.
    If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case.
    As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, one that will have to work with whatever data is passed in.
    I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKey columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script Components are single-use, I was able to handle the data knowing everything about my data flow already.
    As per my previous post, there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.
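The "sometimes-blocking" idea described above (check whether the sort keys of the input match the grouping columns, then pick the streaming or the blocking code path) could look roughly like the following Python sketch. It is not SSIS code and does not use the SSIS API; `can_stream`, `count_rows`, and the dictionary-based rows are hypothetical names chosen for illustration, and the sketch assumes at least one grouping column and that the declared sort keys are truthful.

```python
from collections import Counter
from itertools import groupby
from operator import itemgetter
from typing import Dict, Iterable, Iterator, Sequence, Tuple

Row = Dict[str, object]


def can_stream(sort_keys: Sequence[str], group_cols: Sequence[str]) -> bool:
    """True when the leading sort keys are exactly the grouping columns.

    The order within that leading prefix does not matter: rows that share
    all grouping values are adjacent either way.
    """
    n = len(group_cols)
    return n > 0 and len(sort_keys) >= n and set(sort_keys[:n]) == set(group_cols)


def count_rows(
    rows: Iterable[Row], sort_keys: Sequence[str], group_cols: Sequence[str]
) -> Iterator[Tuple[object, int]]:
    """Count rows per group, streaming when the declared sort order allows it."""
    key = itemgetter(*group_cols)  # assumes at least one grouping column
    if can_stream(sort_keys, group_cols):
        # Non-blocking path: each group is released once its last row has passed.
        for value, group in groupby(rows, key=key):
            yield value, sum(1 for _ in group)
    else:
        # Blocking path: every row must be seen before any total is final.
        totals = Counter(key(row) for row in rows)
        yield from totals.items()
```

With the test from the article (a single sorted column, declared as the only sort key), `count_rows(rows, ["n"], ["n"])` would take the streaming path and emit `(1, 1)` after reading just two rows.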

    Read the article

  • Stdin to powershell script

    - by Stefan
    I have a service running that can invoke an external process to modify a text stream before it is returned to the service. The text stream is handed from the service to the external process on stdout and the modified result is read from the service on stdin. The external process (command) can in other words be used as a text "filter". I would like to use a PowerShell script to modify the text stream. I can successfully launch a script from the service on win 2008r2 using the command "powershell -executionpolicy bypass -noninteractive ./myscript.ps1". I can make the script return text to the service on stdout using the Write-Host cmdlet. My problem is that I can't find a way to read the text on stdin in the script. Read-Host doesn't seem to work as it requires an interactive shell. I would like to avoid writing the stdout from the service to a tmp file and reading that file in the script, as the service is multithreaded (it can launch more than one external command at a time) and tmp file management (locking, unique filenames etc) is not desired. Is this possible, or should I use for example Perl for this? PowerShell seems compelling as it is preinstalled on all my win 2008 machines.

    Read the article

  • Crontab no error but doesn't execute script

    - by crontabOnFreebsd
    I'm trying to execute a shell script from cron on FreeBSD. To test whether crontab is working at all, I wrote the line * * * * * echo "Hello" /home/myuser/logile and it works fine. But when I try to execute any script, it doesn't do anything and doesn't even report an error. (The script I tried to run contains just the same echo command.) Below is the output of crontab -l:
        SHELL=/bin/sh
        PATH=/etc:/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin
        HOME=/home/myuser
        MAILTO=myuser
        * * * * * /home/myuser/shellscript.sh /home/myuser/logfile
    Why is the script not getting executed, although crontab is obviously running? Permissions for all files are set to rwxr-xr-x.

    Read the article

  • Beanshell in Ant yielding, "Unable to create javax script engine for beanshell"

    - by John B.
    Greetings, I'm trying to put some Beanshell script in my Ant build.xml file. I've followed the Ant manual as well as I can but I keep getting "Unable to create javax script engine for beanshell" when I run Ant. Here is the test target I wrote mostly from examples in the Ant manual:
        <target name="test-target">
          <script language="beanshell" setbeans="true">
            <classpath>
              <fileset dir="c:\TEMP" includes="*.jar" />
            </classpath>
            System.out.println("Hello world");
          </script>
        </target>
    My beanshell "bsh-2.0b4.jar" file is on the script task's classpath the way the manual recommended. Hope I have the right file. I'm working in c:\TEMP right now. I've been googling and trying for a while now. Any ideas would be greatly appreciated. Thanks.

    Read the article

  • Batch file script to remove special characters from filenames (Windows)

    - by njreed.myopenid.com
    I have a large set of files, some of which contain special characters in the filename (e.g. ä, ö, %, and others). I'd like a script file to iterate over these files and rename them, removing the special characters. I don't really mind what it does, but it could replace them with underscores, for example: Störung%20.doc would be renamed to St_rung_20.doc
    In order of preference:
        1. A DOS batch file
        2. A Windows script file to run with cscript (vbs)
        3. A third party piece of software that can be run from the command-line (i.e. no user interaction required)
        4. Another language script file, for which I'd have to install an additional script engine
    Background: I'm trying to encrypt these files with GnuPG on Windows but it doesn't seem to handle special characters in filenames with the --encrypt-files option.

    Read the article

  • Perl Script in PHP

    - by Sev
    I have a perl script, which takes a query string parameter, connects to a database and displays data. I'd like to include that script in a PHP file like so: include('perlscript.pl?item=302'); Such that the perl script's response is displayed on the PHP/HTML page. How can I do this?

    Read the article

  • How do I automate navigation to a website that requires authentication?

    - by Wiz
    Here's what I'm trying to achieve. I would like to write a script that will navigate to a website that requires me to be authenticated as myself, say Facebook, Live Spaces, Twitter or any other, and then have that script search for certain information on one of the pages of the website. I've done something similar in the past with the Windows.Forms WebBrowser control, which is a full blown implementation of IE that can be controlled through code and will store whatever cookies you get once you're authenticated, but it was very unfriendly to modify and I was hoping to use a scripting language instead, maybe Powershell or something of that sort. Are there maybe some good tutorials about this out there on the web? Thanks!

    Read the article

  • Script tag in XHTML

    - by Ben
    Hi, This might sound like a reaaaally dumb question but... why do browsers have a fit with this syntax: <script type='text/javascript' src="/path/to/my.js" /> and want this instead <script type='text/javascript' src="/path/to/my.js"></script> Seems the first construct should be valid since there's no inner content to the tag.. ?

    Read the article

  • Drag and drop onto Python script in Windows Explorer

    - by grok
    I would like to drag and drop my data file onto a Python script and have it process the file and generate output. The Python script accepts the name of the data file as a command-line parameter, but Windows Explorer doesn't allow the script to be a drop target. Is there some kind of configuration that needs to be done somewhere for this to work?

    Read the article

  • PHP set timeout for script with system call, set_time_limit not working

    - by tehalive
    I have a command-line PHP script that runs a wget request using each member of an array with foreach. This wget request can sometimes take a long time so I want to be able to set a timeout for killing the script if it goes past 15 seconds for example. I have PHP safemode disabled and tried set_time_limit(15) early in the script, however it continues indefinitely.
    Update: Thanks to Dor for pointing out this is because set_time_limit() does not respect system() calls. So I was trying to find other ways to kill the script after 15 seconds of execution. However, I'm not sure if it's possible to check the time a script has been running while it's in the middle of a wget request at the same time (a do while loop did not work). Maybe fork a process with a timer and set it to kill the parent after a set amount of time? Thanks for any tips!
    Update: Below is my relevant code. $url is passed from the command-line and is an array of multiple URLs (sorry for not posting this initially):
        foreach( $url as $key => $value){
          $wget = "wget -r -H -nd -l 999 $value";
          system($wget);
        }

    Read the article

  • Script executes successfully in commandline but not as a cronjob

    - by JasonOng
    I've a bash script that runs a ruby script that fetches my twitter feeds.
        ## /home/username/twittercron
        #!/bin/bash
        cd /home/username/twitter
        ruby twitter.rb friends
    It runs successfully in command line.
        /home/username/twittercron
    But when I try to run it as a cronjob, it ran but wasn't able to fetch the feeds.
        ## crontab -e
        */15 * * * * * /home/username/twittercron
    The script has been chmod +x. Not sure why it's as such. Any ideas?

    Read the article

  • Apache mod php and script invocation

    - by Abhi
    Say I am running a PHP script, foo.php, inside Apache configured with mod_php. When I invoke the script from my browser (or by any other means), does Apache spawn off a new process in which the script gets executed? How does it work? Can someone please point me to a good article on this?

    Read the article

  • Temporary operation in a temporary directory in shell script

    - by jhs
    I need a fresh temporary directory to do some work in a shell script. When the work is done (or if I kill the job midway), I want the script to change back to the old working directory and wipe out the temporary one. In Ruby, it might look like this:
        require 'tmpdir'
        Dir.mktmpdir 'my_build' do |temp_dir|
          puts "Temporary workspace is #{temp_dir}"
          do_some_stuff(temp_dir)
        end
        puts "Temporary directory already deleted"
    What would be the best bang for the buck to do that in a Bash script? I want to trap

    Read the article

  • XAMPP Mercurial installation on Windows Apache --> HgWebDir.cgi Script Error

    - by Tim
    I try to publish multiple existing mercurial repository-locations through XAMPP Apache via the CGI Python script hgwebdir.cgi, as in this tutorial: http://mercurial.selenic.com/wiki/HgWebDirStepByStep
    I get the following error from the apache error logs when I try to access the repository path with a browser:
        Premature end of script headers: hgwebdir.cgi
        [Tue Apr 20 16:00:50 2010] [error] [client 91.67.44.216] Premature end of script headers: hgwebdir.cgi
        [Tue Apr 20 16:00:50 2010] [error] [client 91.67.44.216] File "C:/hostdir/xampp/cgi-bin/hg/hgwebdir.cgi", line 39\r
        [Tue Apr 20 16:00:50 2010] [error] [client 91.67.44.216] test = c:/hostdir/mercurial/test/\r
        [Tue Apr 20 16:00:50 2010] [error] [client 91.67.44.216] ^\r
        [Tue Apr 20 16:00:50 2010] [error] [client 91.67.44.216] SyntaxError: invalid syntax\r
    This is the path of the file where the script fails (and if I remove it, I get an empty HTML page shown with no visual elements in it):
        [paths]
        test = c:/hostdir/mercurial/test/
        /hg = c:/hostdir/mercurial/**
        / = c:/hostdir/mercurial/
    Does anybody have a clue for me?

    Read the article

  • How to run script from command-line

    - by Eric
    I want to write a script that counts the types of objects there are in the ZODB, when they were created, how many users have joined since a given point in time,etc. I am wondering how to accomplish this. So, was wondering if there is a way to pass a script to bin/instance to be executed. I've created this script with a py script but it takes a VERY long time to finish and this is why I would like to do this from the command-line... in the hopes of it running faster. Thanks ERic

    Read the article

  • shell script redirect output

    - by Andy
    I have a shell script that monitors a process so that it does not stay closed: if the process has exited, the script restarts it. When the system starts, crontab runs the script automatically. How can I get the output of the process that was started by the shell script?
        #!/bin/bash
        PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin:~/bin
        export PATH
        while :
        do
          if [ -z "$(ps -ef | grep -v grep | grep 225.0.6.4)" ]; then
            date +"%m-%d-%y %T" >> /home/andy/log/stream.log
            echo "225.0.6.4 - 103 not worked and restart process" >> /home/andy/log/stream.log
            echo "225.0.6.4 - 103 not worked and restart process"
            /usr/bin/tzap -a 1 -c /home/andy/channels.conf -o - -r -p "D" | /home/andy/ffmpeg -f mpegts -i pipe:0 -c:v libx264 -preset medium -crf 23 -bufsize 3000K -minrate 1200k -maxrate 1200k -pix_fmt yuv420p -g 50 -s 1024x768 -acodec libmp3lame -b:a 128k -ac 2 -ar 44100 -f mpegts udp://225.0.6.4:50000 &
          fi
          sleep 1
        done

    Read the article

  • Using rvm with a standalone ruby script

    - by John Yeates
    I have rvm installed on a Mac OS X 10.6 system with the system ruby and 1.9.1. I also have this basic ruby script: #!/usr/bin/ruby require 'curb-fu' I need the script to use the system ruby regardless of what rvm's using at any given time; I'm assuming that I've got that right, at least. I've switched to the system ruby (rvm use system) and then installed the gem (gem install curb-fu). If I run irb and type require 'curb-fu', it works. However, running that script with ./myscript.rb fails: /Users/me/bin/podcast_notify.rb:6:in `require': no such file to load -- curb-fu (LoadError) from /Users/me/bin/podcast_notify.rb:6 What's going wrong here? How do I install curb-fu so that it's always available to this script?

    Read the article

  • How to stop java application using a shell script

    - by Fernando Moyano
    I have a shell script, run under openSUSE Linux, that starts a Java application (packaged as a jar). The script is:
        #!/bin/sh
        #export JAVA_HOME=/usr/local/java
        #PATH=/usr/local/java/bin:${PATH}
        #---------------------------------#
        # dynamically build the classpath #
        #---------------------------------#
        THE_CLASSPATH=
        for i in `ls ./lib/*.jar`
        do
          THE_CLASSPATH=${THE_CLASSPATH}:${i}
        done
        #---------------------------#
        # run the application #
        #---------------------------#
        java -server -Xms512M -Xmx1G -cp ".:${THE_CLASSPATH}" com.package.MyApp > myApp.out 2>&0 &
    This script is working fine. Now, what I want is to write a script that kills this app gracefully, something that allows me to kill it with the -15 argument of the Linux kill command. The problem is that there will be many Java applications running on this server, so I need to kill this one specifically. Any help? Thanks in advance, Fernando

    Read the article
