Search Results

Search found 1698 results on 68 pages for 'loops'.

Page 64/68 | < Previous Page | 60 61 62 63 64 65 66 67 68  | Next Page >

  • Finding open contiguous blocks of time for every day of a month, fast

    - by Chris
    I am working on a booking availability system for a group of several venues, and am having a hard time generating the availability of time blocks for days in a given month. This is happening server-side in PHP, but the concept itself is language agnostic -- I could be doing this in JS or anything else. Given a venue_id, month, and year (6/2012 for example), I have a list of all events occurring in that range at that venue, represented as unix timestamps start and end. This data comes from the database. I need to establish what, if any, contiguous block of time of a minimum length (different per venue) exist on each day. For example, on 6/1 I have an event between 2:00pm and 7:00pm. The minimum time is 5 hours, so there's a block open there from 9am - 2pm and another between 7pm and 12pm. This would continue for the 2nd, 3rd, etc... every day of June. Some (most) of the days have nothing happening at all, some have 1 - 3 events. The solution I came up with works, but it also takes waaaay too long to generate the data. Basically, I loop every day of the month and create an array of timestamps for each 15 minutes of that day. Then, I loop the time spans of events from that day by 15 minutes, marking any "taken" timeslot as false. Remaining, I have an array that contains timestamp of free time vs. taken time: //one day's array after processing through loops (not real timestamps) array( 12345678=>12345678, // <--- avail 12345878=>12345878, 12346078=>12346078, 12346278=>false, // <--- not avail 12346478=>false, 12346678=>false, 12346878=>false, 12347078=>12347078, // <--- avail 12347278=>12347278 ) Now I would need to loop THIS array to find continuous time blocks, then check to see if they are long enough (each venue has a minimum), and if so then establish the descriptive text for their start and end (i.e. 9am - 2pm). WHEW! By the time all this looping is done, the user has grown bored and wandered off to Youtube to watch videos of puppies; it takes ages to so examine 30 or so days. Is there a faster way to solve this issue? To summarize the problem, given time ranges t1 and t2 on day d, how can I determine the remaining time left in d that is longer than the minimum time block m. This data is assembled on demand via AJAX as the user moves between calendar months. Results are cached per-page-load, so if the user goes to July a second time, the data that was generated the first time would be reused. Any other details that would help, let me know. Edit Per request, the database structure (or the part that is relevant here) *events* id (bigint) title (varchar) *event_times* id (bigint) event_id (bigint) venue_id (bigint) start (bigint) end (bigint) *venues* id (bigint) name (varchar) min_block (int) min_start (varchar) max_start (varchar)

    Read the article

  • Java array of arry [matrix] of an integer partition with fixed term

    - by user335209
    Hello, for my study purpose I need to build an array of array filled with the partitions of an integer with fixed term. That is given an integer, suppose 10 and given the fixed number of terms, suppose 5 I need to populate an array like this 10 0 0 0 0 9 0 0 0 1 8 0 0 0 2 7 0 0 0 3 ............ 9 0 0 1 0 8 0 0 1 1 ............. 7 0 1 1 0 6 0 1 1 1 ............ ........... 0 6 1 1 1 ............. 0 0 0 0 10 am pretty new to Java and am getting confused with all the for loops. Right now my code can do the partition of the integer but unfortunately it is not with fixed term public class Partition { private static int[] riga; private static void printPartition(int[] p, int n) { for (int i= 0; i < n; i++) System.out.print(p[i]+" "); System.out.println(); } private static void partition(int[] p, int n, int m, int i) { if (n == 0) printPartition(p, i); else for (int k= m; k > 0; k--) { p[i]= k; partition(p, n-k, n-k, i+1); } } public static void main(String[] args) { riga = new int[6]; for(int i = 0; i<riga.length; i++){ riga[i] = 0; } partition(riga, 6, 1, 0); } } the output I get it from is like this: 1 5 1 4 1 1 3 2 1 3 1 1 1 2 3 1 2 2 1 1 2 1 2 1 2 1 1 1 what i'm actually trying to understand how to proceed is to have it with a fixed terms which would be the columns of my array. So, am stuck with trying to get a way to make it less dynamic. Any help?

    Read the article

  • Big O Complexity of a method

    - by timeNomad
    I have this method: public static int what(String str, char start, char end) { int count=0; for(int i=0;i<str.length(); i++) { if(str.charAt(i) == start) { for(int j=i+1;j<str.length(); j++) { if(str.charAt(j) == end) count++; } } } return count; } What I need to find is: 1) What is it doing? Answer: counting the total number of end occurrences after EACH (or is it? Not specified in the assignment, point 3 depends on this) start. 2) What is its complexity? Answer: the first loops iterates over the string completely, so it's at least O(n), the second loop executes only if start char is found and even then partially (index at which start was found + 1). Although, big O is all about worst case no? So in the worst case, start is the 1st char & the inner iteration iterates over the string n-1 times, the -1 is a constant so it's n. But, the inner loop won't be executed every outer iteration pass, statistically, but since big O is about worst case, is it correct to say the complexity of it is O(n^2)? Ignoring any constants and the fact that in 99.99% of times the inner loop won't execute every outer loop pass. 3) Rewrite it so that complexity is lower. What I'm not sure of is whether start occurs at most once or more, if once at most, then method can be rewritten using one loop (having a flag indicating whether start has been encountered and from there on incrementing count at each end occurrence), yielding a complexity of O(n). In case though, that start can appear multiple times, which most likely it is, because assignment is of a Java course and I don't think they would make such ambiguity. Solving, in this case, is not possible using one loop... WAIT! Yes it is..! Just have a variable, say, inc to be incremented each time start is encountered & used to increment count each time end is encountered after the 1st start was found: inc = 0, count = 0 if (current char == start) inc++ if (inc > 0 && current char == end) count += inc This would also yield a complexity of O(n)? Because there is only 1 loop. Yes I realize I wrote a lot hehe, but what I also realized is that I understand a lot better by forming my thoughts into words...

    Read the article

  • Stop running this script, IE7 using PHP

    - by Jomel Dicen
    I incorporate javascript in my PHP program: Try to check my codes. It loops depend on the number of records in database. for instance: $counter = 0; foreach($row_value as $data): echo $this->javascript($counter, $data->exrate, $data->tab); endforeach; private function javascript($counter=NULL, $exrate=NULL, $tab=NULL){ $js = " <script type='text/javascript'> $(function () { var textBox0 = $('input:text[id$=quantity{$counter}]').keyup(foo); var textBox1 = $('input:text[id$=mc{$counter}]').keyup(foo); var textBox2 = $('input:text[id$=lc{$counter}]').keyup(foo); function foo() { var value0 = textBox0.val(); var value1 = textBox1.val(); var value2 = textBox2.val(); var sum = add(value1, value2) * (value0 * {$exrate}); $('input:text[id$=result{$counter}]').val(parseFloat(sum).toFixed(2)); // Compute Total Quantity var qtotal = 0; $('.quantity{$tab}').each(function() { qtotal += Number($(this).val()); }); $('#tquantity{$tab}').text(qtotal); // Compute MC UNIT var mctotal = 0; $('.mc{$tab}').each(function() { mctotal += Number($(this).val()); }); $('#tmc{$tab}').text(mctotal); // Compute LC UNIT var lctotal = 0; $('.lc{$tab}').each(function() { lctotal += Number($(this).val()); }); $('#tlc{$tab}').text(lctotal); // Compute Result var result = 0; $('.result{$tab}').each(function() { result += Number($(this).val()); }); $('#tresult{$tab}').text(result); } function add() { var sum = 0; for (var i = 0, j = arguments.length; i < j; i++) { if (IsNumeric(arguments[i])) { sum += parseFloat(arguments[i]); } } return sum; } function IsNumeric(input) { return (input - 0) == input && input.length > 0; } }); </script> "; return $js; } When I running this on IE this message is always annoying me " Stop running this script? A script on this page is causing your web browser to run slowly. If it continues to run, your computer might become unresponsive." but in firefox it's functioning well.

    Read the article

  • Thread class closing from other Class (Activity) with protected void onStop() Android

    - by user1761337
    I have a Problem with Closing the Thread. I will Closing the Thread with onStop,onPause and onDestroy. This is my Source in the Activity Class: @Override protected void onStop(){ super.onStop(); finish(); } @Override protected void onPause() { super.onPause(); finish(); } @Override public void onDestroy() { this.mWakeLock.release(); super.onDestroy(); } And the Thread Class: public class GameThread extends Thread { private SurfaceHolder mSurfaceHolder; private Handler mHandler; private Context mContext; private Paint mLinePaint; private Paint blackPaint; //for consistent rendering private long sleepTime; //amount of time to sleep for (in milliseconds) private long delay=1000/30; //state of game (Running or Paused). int state = 1; public final static int RUNNING = 1; public final static int PAUSED = 2; public final static int STOPED = 3; GameSurface gEngine; public GameThread(SurfaceHolder surfaceHolder, Context context, Handler handler,GameSurface gEngineS){ //data about the screen mSurfaceHolder = surfaceHolder; mHandler = handler; mContext = context; gEngine=gEngineS; } //This is the most important part of the code. It is invoked when the call to start() is //made from the SurfaceView class. It loops continuously until the game is finished or //the application is suspended. private long beforeTime; @Override public void run() { //UPDATE while (state==RUNNING) { Log.d("State","Thread is runnig"); //time before update beforeTime = System.nanoTime(); //This is where we update the game engine gEngine.Update(); //DRAW Canvas c = null; try { //lock canvas so nothing else can use it c = mSurfaceHolder.lockCanvas(null); synchronized (mSurfaceHolder) { //clear the screen with the black painter. //reset the canvas c.drawColor(Color.BLACK); //This is where we draw the game engine. gEngine.doDraw(c); } } finally { // do this in a finally so that if an exception is thrown // during the above, we don't leave the Surface in an // inconsistent state if (c != null) { mSurfaceHolder.unlockCanvasAndPost(c); } } this.sleepTime = delay-((System.nanoTime()-beforeTime)/1000000L); try { //actual sleep code if(sleepTime>0){ this.sleep(sleepTime); } } catch (InterruptedException ex) { Logger.getLogger(GameThread.class.getName()).log(Level.SEVERE, null, ex); } while (state==PAUSED){ Log.d("State","Thread is pausing"); try { this.sleep(1000); } catch (InterruptedException e) { // TODO Auto-generated catch block e.printStackTrace(); } } } }} How i can close the Thread from Activity Class??

    Read the article

  • Do While loop breaks after incorrect input?

    - by Daminkz
    I am trying to have a loop continue to prompt the user for an option. When I get a string of characters instead of an int, the program loops indefinitely. I have tried setting the variable result to NULL, clearing the input stream, and have enclosed in try{}catch blocks (not in this example). Can anyone explain to me why this is? #include <iostream> #include <vector> #include <string> using namespace std; int menu(string question, vector<string> options) { int result; cout << question << endl; for(int i = 0; i < options.size(); i++) { cout << '[' << i << ']' << options[i] << endl; } bool ans = false; do { cin >> result; cin.ignore(1000, 10); if (result < options.size() ) { ans = true; } else { cout << "You must enter a valid option." << endl; result = NULL; ans = false; } } while(!ans); return result; } int main() { string menuQuestion = "Welcome to my game. What would you like to do?"; vector<string> mainMenu; mainMenu.push_back("Play Game"); mainMenu.push_back("Load Game"); mainMenu.push_back("About"); mainMenu.push_back("Exit"); int result = menu(menuQuestion, mainMenu); cout << "You entered: " << result << endl; return 0; }

    Read the article

  • Javascript returns Nan in IE, FF ok

    - by user350184
    im very new to javascript, and writing this script to add up a shopping cart and print out subtotals and totals. it works in FF but not in IE. this function is called by onclick of one of three select options with a value of 0-25. it is in a js file called in the head. what it does is get the selected values as variables, parseint them, adds and multiplies, and changes the innerHTML of the table to reflect the subtotals, and total. FF does it great, but IE gives Nan. ive tried rewriting it a number of different ways, and many translations still work in FF but not IE8. ive made sure the variables and form id's arent repeated. function gen_invoice() { var scount = parseInt(document.shopcart.studentcount.value, 10); var ycount = parseInt(document.shopcart.youthcount.value, 10); var fcount = parseInt(document.shopcart.facultycount.value, 10); //html output source is 3 selects like this, with diff ids and names: //<select name="studentcount" id="studentcount"> //<option onclick="gen_invoice()" value="0">0 </option></select> var cardcost = parseInt(document.shopcart.cardprice.value, 10); //cardcost comes from hidden input value: //<input type="hidden" id="cardprice" name="cardprice" value="25"> var totalsum = scount + ycount + fcount; var grandtotal = totalsum * cardcost; document.getElementById('s_price').innerHTML = scount * cardcost; document.getElementById('y_price').innerHTML = ycount * cardcost; document.getElementById('f_price').innerHTML = fcount * cardcost; document.getElementById('grand').innerHTML = grandtotal; //.... } ...after this there are 3 long loops for writing out some other forms, but they dont work in IE either because they depend on the selected values to be an integer. this part happens first and returns Nan, so im sure the problem is here somwhere. I have literally hit my head on the table over this. You can imagine how frustrating it is to be able to write the entire rest of the site beautifully, but then fail at adding 3 numbers together. help please!

    Read the article

  • GDI+ Load a jpg and save as 24bit png problem

    - by wookey
    Problem Hello all! I have this code which takes my jpg image loops through altering pixels and finally saving it as a png type. The problem is that the resulting image has a bit depth of 32 bits. I need it to be 24 bit, can any one shiny some light on the correct method of setting it? Am I along the right tracks looking at setting the pixel format to PixelFormat24bppRGB? Code static inline void Brighten(Gdiplus::Bitmap* img) { int width = img->GetWidth()/8,height = img->GetHeight(), max = (width*height),r,g,b; Gdiplus::Color pixel; for(int a = 0,x = 0, y = -1; a < max; ++a) { x = a%width; if(x == 0) ++y; img->GetPixel(x,y,&pixel); r = pixel.GetR(); g = pixel.GetG(); b = pixel.GetB(); if (r > 245) r = 245; if (g > 245) g = 245; if (b > 245) b = 245; r = 10; g = 10; b = 10; pixel = Gdiplus::Color(r,g,b); img->SetPixel(x,y,pixel);; } } ULONG_PTR m_dwToken = 0; Gdiplus::GdiplusStartupInput input; Gdiplus::GdiplusStartupOutput output; Gdiplus::GdiplusStartup( &m_dwToken, &input, &output ); USES_CONVERSION_EX; Gdiplus::ImageCodecInfo* pEncoders = static_cast< Gdiplus::ImageCodecInfo* >( _ATL_SAFE_ALLOCA(1040, _ATL_SAFE_ALLOCA_DEF_THRESHOLD)); Gdiplus::DllExports::GdipGetImageEncoders(5, 1040, pEncoders ); CLSID clsidEncoder = pEncoders[4].Clsid; Gdiplus::Bitmap img1((CT2W)L"IMG_1.JPG"); Brighten(&img1); img1.Save((CT2W)L"IMG_1_R3.PNG",&clsidEncoder,NULL); Thanks in advance!

    Read the article

  • How to optimize foreach loop in PHP

    - by vanneto
    First off, I know the title is generic and not fitting. I just couldn't think of a title that could describe my problem. I have a table Recipients in MySQL structured like this: id | email | status 1 foo@bar S 2 bar@baz S 3 abc@def R 4 sta@cko B I need to convert the data into the following XML, depending on the status field. For example: <Recipients> <RecipientsSent> <!-- Have the 'S' status --> <recipient>foo@bar</recipient> <recipient>bar@baz</recipient> </RecipientsSent> <RecipientsRegexError> <recipient>abc@def</recipient> </RecipientsRegexError> <RecipientsBlocked> <recipient>sta@cko</recipient> </RecipientsBlocked> </Recipients> I have this PHP code to implement this ($recipients contains an associative array of the db table): <Recipients> <RecipientsSent> <?php foreach ($recipients as $recipient): if ($recipient['status'] == 'S'): echo "<recipient>" . $recipient['email'] . "</recipient>"; endif; endforeach; ?> </RecipientsSent> <RecipientsRegexError> <?php foreach ($recipients as $recipient): if ($recipient['status'] == 'R'): echo "<recipient>" . $recipient['email'] . "</recipient>"; endif; endforeach; ?> </RecipientsRegexError> <?php /** same loop for the B status */ ?> </Recipients> So, this means that if I have 1000 entries in the table and 4 different status' that can be checked, it means that there will be 4 loops, each one executing 1000 times. How can this be done in a more efficient manner? I thought about fetching four different sets from the database, meaning 4 different queries, would that be more efficient? I'm thinking it could be done with one loop but but I can't come up with a solution. Any way this could be done with only one loop?

    Read the article

  • Creating array from two arrays

    - by binoculars
    I'm having troubles trying to create a certain array. Basicly, I have an array like this: [0] => Array ( [id] => 12341241 [type] => "Blue" ) [1] => Array ( [id] => 52454235 [type] => "Blue" ) [2] => Array ( [id] => 848437437 [type] => "Blue" ) [3] => Array ( [id] => 387372723 [type] => "Blue" ) [4] => Array ( [id] => 73732623 [type] => "Blue" ) ... Next, I have an array like this: [0] => Array ( [id] => 34141 [type] => "Red" ) [1] => Array ( [id] => 253532 [type] => "Red" ) [2] => Array ( [id] => 94274 [type] => "Red" ) I want to construct an array, which is a combination of the two above, using this rule: after 3 Blues, there must be a Red: Blue1 Blue2 Blue3 Red1 Blue4 Blue5 Blue6 Red2 Blue7 Blue8 Blue9 Red3 Note that the their can be more Red's than Blue's, but also more Blue's than Red's. If the Red's run out, it should begin with the first one again. Example: let's say there are only two Red's: Blue1 Blue2 Blue3 Red1 Blue4 Blue5 Blue6 Red2 Blue7 Blue8 Blue9 Red1 ... ... If the Blue's run out, the Red's should append until they run out too. Example: let's say there are 5 Blue's, and 5 Red's: Blue1 Blue2 Blue3 Red1 Blue4 Blue5 Red2 Red3 Red4 Red5 Note: the arrays come from mysql-fetches. Maybe it's better to fetch them while building the new array? Anyway, the while-loops got to me, I can't figure it out... Any help is much appreciated!

    Read the article

  • jQuery image loop not displaying any images

    - by user1871097
    I'm trying to create a very basic image gallery in jQuery. The goal is to have 3 images fade in and out in a sequential order. So image 1 is displayed, fades to image 2 etc. then the whole thing loops again. My HTML code so far is as follows: <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <title>Slider</title> <style type="text/css"> .slider{ width: 2848px; height: 2136px; overflow: hidden; margin: 30px auto; } .slider img{ width:2848px; height:2136px; display:none; } </style> <script src="//ajax.googleapis.com/ajax/libs/jquery/1.8.3/jquery.min.js"></script> <script src="//ajax.googleapis.com/ajax/libs/jqueryui/1.9.2/jquery-ui.min.js"></script> <script src="Slider2.js"></script> </head> <body onload="Slider2"();> <div class="slider"> <img id="1" src="31.jpg" border="0" alt="city"/> <img id="2" src="2vrtigo2.jpg" border="0" alt="roof"/> <img id="3" src="3.jpg" border="0" alt="sea"/> </div> </body> And the jQuery code looks like this: function Slider2() { var total = $(".slider img").size(); for (i=1; i<=total; i+=1) { $(".slider #"+i).fadeIn(600); $(".slider #"+i).delay(2000).hide; }} A quick syntactical note, I've also tried using i++ in the last argument of the For Loop. The result of this code is a blank, white page. I know some of the HTML is being compiled because the enormous 2848x2136 div creates scroll bars on the browser. If anyone could help me out, that would be greatly appreciated. Obviously I'm relatively new to web programming and would love some insight into why this isn't working. Thanks!

    Read the article

  • Data extract from website URL

    - by user2522395
    From this below script I am able to extract all links of particular website, But i need to know how I can generate data from extracted links especially like eMail, Phone number if its there Please help how i will modify the existing script and get the result or if you have full sample script please provide me. Private Sub btnGo_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnGo.Click 'url must be in this format: http://www.example.com/ Dim aList As ArrayList = Spider("http://www.qatarliving.com", 1) For Each url As String In aList lstUrls.Items.Add(url) Next End Sub Private Function Spider(ByVal url As String, ByVal depth As Integer) As ArrayList 'aReturn is used to hold the list of urls Dim aReturn As New ArrayList 'aStart is used to hold the new urls to be checked Dim aStart As ArrayList = GrabUrls(url) 'temp array to hold data being passed to new arrays Dim aTemp As ArrayList 'aNew is used to hold new urls before being passed to aStart Dim aNew As New ArrayList 'add the first batch of urls aReturn.AddRange(aStart) 'if depth is 0 then only return 1 page If depth < 1 Then Return aReturn 'loops through the levels of urls For i = 1 To depth 'grabs the urls from each url in aStart For Each tUrl As String In aStart 'grabs the urls and returns non-duplicates aTemp = GrabUrls(tUrl, aReturn, aNew) 'add the urls to be check to aNew aNew.AddRange(aTemp) Next 'swap urls to aStart to be checked aStart = aNew 'add the urls to the main list aReturn.AddRange(aNew) 'clear the temp array aNew = New ArrayList Next Return aReturn End Function Private Overloads Function GrabUrls(ByVal url As String) As ArrayList 'will hold the urls to be returned Dim aReturn As New ArrayList Try 'regex string used: thanks google Dim strRegex As String = "<a.*?href=""(.*?)"".*?>(.*?)</a>" 'i used a webclient to get the source 'web requests might be faster Dim wc As New WebClient 'put the source into a string Dim strSource As String = wc.DownloadString(url) Dim HrefRegex As New Regex(strRegex, RegexOptions.IgnoreCase Or RegexOptions.Compiled) 'parse the urls from the source Dim HrefMatch As Match = HrefRegex.Match(strSource) 'used later to get the base domain without subdirectories or pages Dim BaseUrl As New Uri(url) 'while there are urls While HrefMatch.Success = True 'loop through the matches Dim sUrl As String = HrefMatch.Groups(1).Value 'if it's a page or sub directory with no base url (domain) If Not sUrl.Contains("http://") AndAlso Not sUrl.Contains("www") Then 'add the domain plus the page Dim tURi As New Uri(BaseUrl, sUrl) sUrl = tURi.ToString End If 'if it's not already in the list then add it If Not aReturn.Contains(sUrl) Then aReturn.Add(sUrl) 'go to the next url HrefMatch = HrefMatch.NextMatch End While Catch ex As Exception 'catch ex here. I left it blank while debugging End Try Return aReturn End Function Private Overloads Function GrabUrls(ByVal url As String, ByRef aReturn As ArrayList, ByRef aNew As ArrayList) As ArrayList 'overloads function to check duplicates in aNew and aReturn 'temp url arraylist Dim tUrls As ArrayList = GrabUrls(url) 'used to return the list Dim tReturn As New ArrayList 'check each item to see if it exists, so not to grab the urls again For Each item As String In tUrls If Not aReturn.Contains(item) AndAlso Not aNew.Contains(item) Then tReturn.Add(item) End If Next Return tReturn End Function

    Read the article

  • PHP array performance

    - by dfo
    Hi, this is my first question on Stackoverflow, please bear with me. I'm testing an algorithm for 2d bin packing and I've chosen PHP to mock it up as it's my bread-and-butter language nowadays. As you can see on http://themworks.com/pack_v0.2/oopack.php?ol=1 it works pretty well, but you need to wait around 10-20 seconds for 100 rectangles to pack. For some hard to handle sets it would hit the php's 30s runtime limit. I did some profiling and it shows that most of the time my script goes through different parts of a small 2d array with 0's and 1's in it. It either checks if certain cell equals to 0/1 or sets it to 0/1. It can do such operations million times and each times it takes few microseconds. I guess I could use an array of booleans in a statically typed language and things would be faster. Or even make an array of 1 bit values. I'm thinking of converting the whole thing to some compiled language. Is PHP just not good for it? If I do need to convert it to let's say C++, how good are the automatic converters? My script is just a lot of for loops with basic arrays and objects manipulations. Thank you! Edit. This function gets called more than any other. It reads few properties of a very simple object, and goes through a very small part of a smallish array to check if there's any element not equal to 0. function fits($bin, $file, $x, $y) { $flag = true; $xw = $x + $file->get_width();; $yh = $y + $file->get_height(); for ($i = $x; $i < $xw; $i++) { for ($j = $y; $j < $yh; $j++) { if ($bin[$i][$j] !== 0) { $flag = false; break; } } if (!$flag) break; } return $flag; }

    Read the article

  • A Guided Tour of Complexity

    - by JoshReuben
    I just re-read Complexity – A Guided Tour by Melanie Mitchell , protégé of Douglas Hofstadter ( author of “Gödel, Escher, Bach”) http://www.amazon.com/Complexity-Guided-Tour-Melanie-Mitchell/dp/0199798109/ref=sr_1_1?ie=UTF8&qid=1339744329&sr=8-1 here are some notes and links:   Evolved from Cybernetics, General Systems Theory, Synergetics some interesting transdisciplinary fields to investigate: Chaos Theory - http://en.wikipedia.org/wiki/Chaos_theory – small differences in initial conditions (such as those due to rounding errors in numerical computation) yield widely diverging outcomes for chaotic systems, rendering long-term prediction impossible. System Dynamics / Cybernetics - http://en.wikipedia.org/wiki/System_Dynamics – study of how feedback changes system behavior Network Theory - http://en.wikipedia.org/wiki/Network_theory – leverage Graph Theory to analyze symmetric  / asymmetric relations between discrete objects Algebraic Topology - http://en.wikipedia.org/wiki/Algebraic_topology – leverage abstract algebra to analyze topological spaces There are limits to deterministic systems & to computation. Chaos Theory definitely applies to training an ANN (artificial neural network) – different weights will emerge depending upon the random selection of the training set. In recursive Non-Linear systems http://en.wikipedia.org/wiki/Nonlinear_system – output is not directly inferable from input. E.g. a Logistic map: Xt+1 = R Xt(1-Xt) Different types of bifurcations, attractor states and oscillations may occur – e.g. a Lorenz Attractor http://en.wikipedia.org/wiki/Lorenz_system Feigenbaum Constants http://en.wikipedia.org/wiki/Feigenbaum_constants express ratios in a bifurcation diagram for a non-linear map – the convergent limit of R (the rate of period-doubling bifurcations) is 4.6692016 Maxwell’s Demon - http://en.wikipedia.org/wiki/Maxwell%27s_demon - the Second Law of Thermodynamics has only a statistical certainty – the universe (and thus information) tends towards entropy. While any computation can theoretically be done without expending energy, with finite memory, the act of erasing memory is permanent and increases entropy. Life & thought is a counter-example to the universe’s tendency towards entropy. Leo Szilard and later Claude Shannon came up with the Information Theory of Entropy - http://en.wikipedia.org/wiki/Entropy_(information_theory) whereby Shannon entropy quantifies the expected value of a message’s information in bits in order to determine channel capacity and leverage Coding Theory (compression analysis). Ludwig Boltzmann came up with Statistical Mechanics - http://en.wikipedia.org/wiki/Statistical_mechanics – whereby our Newtonian perception of continuous reality is a probabilistic and statistical aggregate of many discrete quantum microstates. This is relevant for Quantum Information Theory http://en.wikipedia.org/wiki/Quantum_information and the Physics of Information - http://en.wikipedia.org/wiki/Physical_information. Hilbert’s Problems http://en.wikipedia.org/wiki/Hilbert's_problems pondered whether mathematics is complete, consistent, and decidable (the Decision Problem – http://en.wikipedia.org/wiki/Entscheidungsproblem – is there always an algorithm that can determine whether a statement is true).  Godel’s Incompleteness Theorems http://en.wikipedia.org/wiki/G%C3%B6del's_incompleteness_theorems  proved that mathematics cannot be both complete and consistent (e.g. “This statement is not provable”). Turing through the use of Turing Machines (http://en.wikipedia.org/wiki/Turing_machine symbol processors that can prove mathematical statements) and Universal Turing Machines (http://en.wikipedia.org/wiki/Universal_Turing_machine Turing Machines that can emulate other any Turing Machine via accepting programs as well as data as input symbols) that computation is limited by demonstrating the Halting Problem http://en.wikipedia.org/wiki/Halting_problem (is is not possible to know when a program will complete – you cannot build an infinite loop detector). You may be used to thinking of 1 / 2 / 3 dimensional systems, but Fractal http://en.wikipedia.org/wiki/Fractal systems are defined by self-similarity & have non-integer Hausdorff Dimensions !!!  http://en.wikipedia.org/wiki/List_of_fractals_by_Hausdorff_dimension – the fractal dimension quantifies the number of copies of a self similar object at each level of detail – eg Koch Snowflake - http://en.wikipedia.org/wiki/Koch_snowflake Definitions of complexity: size, Shannon entropy, Algorithmic Information Content (http://en.wikipedia.org/wiki/Algorithmic_information_theory - size of shortest program that can generate a description of an object) Logical depth (amount of info processed), thermodynamic depth (resources required). Complexity is statistical and fractal. John Von Neumann’s other machine was the Self-Reproducing Automaton http://en.wikipedia.org/wiki/Self-replicating_machine  . Cellular Automata http://en.wikipedia.org/wiki/Cellular_automaton are alternative form of Universal Turing machine to traditional Von Neumann machines where grid cells are locally synchronized with their neighbors according to a rule. Conway’s Game of Life http://en.wikipedia.org/wiki/Conway's_Game_of_Life demonstrates various emergent constructs such as “Glider Guns” and “Spaceships”. Cellular Automatons are not practical because logical ops require a large number of cells – wasteful & inefficient. There are no compilers or general program languages available for Cellular Automatons (as far as I am aware). Random Boolean Networks http://en.wikipedia.org/wiki/Boolean_network are extensions of cellular automata where nodes are connected at random (not to spatial neighbors) and each node has its own rule –> they demonstrate the emergence of complex  & self organized behavior. Stephen Wolfram’s (creator of Mathematica, so give him the benefit of the doubt) New Kind of Science http://en.wikipedia.org/wiki/A_New_Kind_of_Science proposes the universe may be a discrete Finite State Automata http://en.wikipedia.org/wiki/Finite-state_machine whereby reality emerges from simple rules. I am 2/3 through this book. It is feasible that the universe is quantum discrete at the plank scale and that it computes itself – Digital Physics: http://en.wikipedia.org/wiki/Digital_physics – a simulated reality? Anyway, all behavior is supposedly derived from simple algorithmic rules & falls into 4 patterns: uniform , nested / cyclical, random (Rule 30 http://en.wikipedia.org/wiki/Rule_30) & mixed (Rule 110 - http://en.wikipedia.org/wiki/Rule_110 localized structures – it is this that is interesting). interaction between colliding propagating signal inputs is then information processing. Wolfram proposes the Principle of Computational Equivalence - http://mathworld.wolfram.com/PrincipleofComputationalEquivalence.html - all processes that are not obviously simple can be viewed as computations of equivalent sophistication. Meaning in information may emerge from analogy & conceptual slippages – see the CopyCat program: http://cognitrn.psych.indiana.edu/rgoldsto/courses/concepts/copycat.pdf Scale Free Networks http://en.wikipedia.org/wiki/Scale-free_network have a distribution governed by a Power Law (http://en.wikipedia.org/wiki/Power_law - much more common than Normal Distribution). They are characterized by hubs (resilience to random deletion of nodes), heterogeneity of degree values, self similarity, & small world structure. They grow via preferential attachment http://en.wikipedia.org/wiki/Preferential_attachment – tipping points triggered by positive feedback loops. 2 theories of cascading system failures in complex systems are Self-Organized Criticality http://en.wikipedia.org/wiki/Self-organized_criticality and Highly Optimized Tolerance http://en.wikipedia.org/wiki/Highly_optimized_tolerance. Computational Mechanics http://en.wikipedia.org/wiki/Computational_mechanics – use of computational methods to study phenomena governed by the principles of mechanics. This book is a great intuition pump, but does not cover the more mathematical subject of Computational Complexity Theory – http://en.wikipedia.org/wiki/Computational_complexity_theory I am currently reading this book on this subject: http://www.amazon.com/Computational-Complexity-Christos-H-Papadimitriou/dp/0201530821/ref=pd_sim_b_1   stay tuned for that review!

    Read the article

  • Listing common SQL Code Smells.

    - by Phil Factor
    Once you’ve done a number of SQL Code-reviews, you’ll know those signs in the code that all might not be well. These ’Code Smells’ are coding styles that don’t directly cause a bug, but are indicators that all is not well with the code. . Kent Beck and Massimo Arnoldi seem to have coined the phrase in the "OnceAndOnlyOnce" page of www.C2.com, where Kent also said that code "wants to be simple". Bad Smells in Code was an essay by Kent Beck and Martin Fowler, published as Chapter 3 of the book ‘Refactoring: Improving the Design of Existing Code’ (ISBN 978-0201485677) Although there are generic code-smells, SQL has its own particular coding habits that will alert the programmer to the need to re-factor what has been written. See Exploring Smelly Code   and Code Deodorants for Code Smells by Nick Harrison for a grounding in Code Smells in C# I’ve always been tempted by the idea of automating a preliminary code-review for SQL. It would be so useful to trawl through code and pick up the various problems, much like the classic ‘Lint’ did for C, and how the Code Metrics plug-in for .NET Reflector by Jonathan 'Peli' de Halleux is used for finding Code Smells in .NET code. The problem is that few of the standard procedural code smells are relevant to SQL, and we need an agreed list of code smells. Merrilll Aldrich made a grand start last year in his blog Top 10 T-SQL Code Smells.However, I'd like to make a start by discovering if there is a general opinion amongst Database developers what the most important SQL Smells are. One can be a bit defensive about code smells. I will cheerfully write very long stored procedures, even though they are frowned on. I’ll use dynamic SQL occasionally. You can only use them as an aid for your own judgment and it is fine to ‘sign them off’ as being appropriate in particular circumstances. Also, whole classes of ‘code smells’ may be irrelevant for a particular database. The use of proprietary SQL, for example, is only a ‘code smell’ if there is a chance that the database will have to be ported to another RDBMS. The use of dynamic SQL is a risk only with certain security models. As the saying goes,  a CodeSmell is a hint of possible bad practice to a pragmatist, but a sure sign of bad practice to a purist. Plamen Ratchev’s wonderful article Ten Common SQL Programming Mistakes lists some of these ‘code smells’ along with out-and-out mistakes, but there are more. The use of nested transactions, for example, isn’t entirely incorrect, even though the database engine ignores all but the outermost: but it does flag up the possibility that the programmer thinks that nested transactions are supported. If anything requires some sort of general agreement, the definition of code smells is one. I’m therefore going to make this Blog ‘dynamic, in that, if anyone twitters a suggestion with a #SQLCodeSmells tag (or sends me a twitter) I’ll update the list here. If you add a comment to the blog with a suggestion of what should be added or removed, I’ll do my best to oblige. In other words, I’ll try to keep this blog up to date. The name against each 'smell' is the name of the person who Twittered me, commented about or who has written about the 'smell'. it does not imply that they were the first ever to think of the smell! Use of deprecated syntax such as *= (Dave Howard) Denormalisation that requires the shredding of the contents of columns. (Merrill Aldrich) Contrived interfaces Use of deprecated datatypes such as TEXT/NTEXT (Dave Howard) Datatype mis-matches in predicates that rely on implicit conversion.(Plamen Ratchev) Using Correlated subqueries instead of a join   (Dave_Levy/ Plamen Ratchev) The use of Hints in queries, especially NOLOCK (Dave Howard /Mike Reigler) Few or No comments. Use of functions in a WHERE clause. (Anil Das) Overuse of scalar UDFs (Dave Howard, Plamen Ratchev) Excessive ‘overloading’ of routines. The use of Exec xp_cmdShell (Merrill Aldrich) Excessive use of brackets. (Dave Levy) Lack of the use of a semicolon to terminate statements Use of non-SARGable functions on indexed columns in predicates (Plamen Ratchev) Duplicated code, or strikingly similar code. Misuse of SELECT * (Plamen Ratchev) Overuse of Cursors (Everyone. Special mention to Dave Levy & Adrian Hills) Overuse of CLR routines when not necessary (Sam Stange) Same column name in different tables with different datatypes. (Ian Stirk) Use of ‘broken’ functions such as ‘ISNUMERIC’ without additional checks. Excessive use of the WHILE loop (Merrill Aldrich) INSERT ... EXEC (Merrill Aldrich) The use of stored procedures where a view is sufficient (Merrill Aldrich) Not using two-part object names (Merrill Aldrich) Using INSERT INTO without specifying the columns and their order (Merrill Aldrich) Full outer joins even when they are not needed. (Plamen Ratchev) Huge stored procedures (hundreds/thousands of lines). Stored procedures that can produce different columns, or order of columns in their results, depending on the inputs. Code that is never used. Complex and nested conditionals WHILE (not done) loops without an error exit. Variable name same as the Datatype Vague identifiers. Storing complex data  or list in a character map, bitmap or XML field User procedures with sp_ prefix (Aaron Bertrand)Views that reference views that reference views that reference views (Aaron Bertrand) Inappropriate use of sql_variant (Neil Hambly) Errors with identity scope using SCOPE_IDENTITY @@IDENTITY or IDENT_CURRENT (Neil Hambly, Aaron Bertrand) Schemas that involve multiple dated copies of the same table instead of partitions (Matt Whitfield-Atlantis UK) Scalar UDFs that do data lookups (poor man's join) (Matt Whitfield-Atlantis UK) Code that allows SQL Injection (Mladen Prajdic) Tables without clustered indexes (Matt Whitfield-Atlantis UK) Use of "SELECT DISTINCT" to mask a join problem (Nick Harrison) Multiple stored procedures with nearly identical implementation. (Nick Harrison) Excessive column aliasing may point to a problem or it could be a mapping implementation. (Nick Harrison) Joining "too many" tables in a query. (Nick Harrison) Stored procedure returning more than one record set. (Nick Harrison) A NOT LIKE condition (Nick Harrison) excessive "OR" conditions. (Nick Harrison) User procedures with sp_ prefix (Aaron Bertrand) Views that reference views that reference views that reference views (Aaron Bertrand) sp_OACreate or anything related to it (Bill Fellows) Prefixing names with tbl_, vw_, fn_, and usp_ ('tibbling') (Jeremiah Peschka) Aliases that go a,b,c,d,e... (Dave Levy/Diane McNurlan) Overweight Queries (e.g. 4 inner joins, 8 left joins, 4 derived tables, 10 subqueries, 8 clustered GUIDs, 2 UDFs, 6 case statements = 1 query) (Robert L Davis) Order by 3,2 (Dave Levy) MultiStatement Table functions which are then filtered 'Sel * from Udf() where Udf.Col = Something' (Dave Ballantyne) running a SQL 2008 system in SQL 2000 compatibility mode(John Stafford)

    Read the article

  • Customize Team Build 2010 – Part 13: Get control over the Build Output

    In the series the following parts have been published Part 1: Introduction Part 2: Add arguments and variables Part 3: Use more complex arguments Part 4: Create your own activity Part 5: Increase AssemblyVersion Part 6: Use custom type for an argument Part 7: How is the custom assembly found Part 8: Send information to the build log Part 9: Impersonate activities (run under other credentials) Part 10: Include Version Number in the Build Number Part 11: Speed up opening my build process template Part 12: How to debug my custom activities Part 13: Get control over the Build Output Part 14: Execute a PowerShell script Part 15: Fail a build based on the exit code of a console application In the part 8, I have explained how you can add informational messages, warnings or errors to the build output. If you want to integrate with other lines of text to the build output, you need to do more. This post will show you how you can add extra steps, additional information and hyperlinks to the build output. UPDATE 13-12-2010: Thanks to Jason Pricket, it is now also possible to not show every activity in the build log. This is really useful when you are doing for-loops in your template. To see how you can do that, check out Jason's blog: http://blogs.msdn.com/b/jpricket/archive/2010/12/09/tfs-2010-making-your-build-log-less-noisy.aspx Add an hyperlink to the end of the build output Lets start with a simple example of how you can adjust the build output. In this case we are going to add at the end of the build output an hyperlink where a user can click on to for example start the deployment to the test environment. In part 4 you can find information how you can create a custom activity To add information to the build output, you need the BuildDetail. This value is a variable in your xaml and is thus easily transferable to you custom activity. Besides the BuildDetail the user has also to specify the text and the url that has to be added to the end of the build output. The following code segment shows you how you can achieve this.     [BuildActivity(HostEnvironmentOption.All)]    public sealed class AddHyperlinkToBuildOutput : CodeActivity    {        [RequiredArgument]        public InArgument<IBuildDetail> BuildDetail { get; set; }         [RequiredArgument]        public InArgument<string> DisplayText { get; set; }         [RequiredArgument]        public InArgument<string> Url { get; set; }         protected override void Execute(CodeActivityContext context)        {            // Obtain the runtime value of the input arguments                        IBuildDetail buildDetail = context.GetValue(this.BuildDetail);            string displayText = context.GetValue(this.DisplayText);            string url = context.GetValue(this.Url);             // Add the hyperlink            buildDetail.Information.AddExternalLink(displayText, new Uri(url));            buildDetail.Information.Save();        }    } If you add this activity to somewhere in your build process template (within the scope Run on Agent), you will get the following build output Add an line of text to the build output The next challenge is to add this kind of output not only to the end of the build output but at the step that is currently executing. To be able to do this, you need the current node in the build output. The following code shows you how you can achieve this. First you need to get the current activity tracking, which you can get with the following line of code             IActivityTracking currentTracking = context.GetExtension<IBuildLoggingExtension>().GetActivityTracking(context); Then you can create a new node and set its type to Activity Tracking Node (so copy it from the current node) and do nice things with the node.             IBuildInformationNode childNode = currentTracking.Node.Children.CreateNode();            childNode.Type = currentTracking.Node.Type;            childNode.Fields.Add("DisplayText", "This text is displayed."); You can also add a build step to display progress             IBuildStep buildStep = childNode.Children.AddBuildStep("Custom Build Step", "This is my custom build step");            buildStep.FinishTime = DateTime.Now.AddSeconds(10);            buildStep.Status = BuildStepStatus.Succeeded; Or you can add an hyperlink to the node             childNode.Children.AddExternalLink("My link", new Uri(http://www.ewaldhofman.nl)); When you combine this together you get the following result in the build output   You can download the full solution at BuildProcess.zip. It will include the sources of every part and will continue to evolve.

    Read the article

  • Dynamically creating a Generic Type at Runtime

    - by Rick Strahl
    I learned something new today. Not uncommon, but it's a core .NET runtime feature I simply did not know although I know I've run into this issue a few times and worked around it in other ways. Today there was no working around it and a few folks on Twitter pointed me in the right direction. The question I ran into is: How do I create a type instance of a generic type when I have dynamically acquired the type at runtime? Yup it's not something that you do everyday, but when you're writing code that parses objects dynamically at runtime it comes up from time to time. In my case it's in the bowels of a custom JSON parser. After some thought triggered by a comment today I realized it would be fairly easy to implement two-way Dictionary parsing for most concrete dictionary types. I could use a custom Dictionary serialization format that serializes as an array of key/value objects. Basically I can use a custom type (that matches the JSON signature) to hold my parsed dictionary data and then add it to the actual dictionary when parsing is complete. Generic Types at Runtime One issue that came up in the process was how to figure out what type the Dictionary<K,V> generic parameters take. Reflection actually makes it fairly easy to figure out generic types at runtime with code like this: if (arrayType.GetInterface("IDictionary") != null) { if (arrayType.IsGenericType) { var keyType = arrayType.GetGenericArguments()[0]; var valueType = arrayType.GetGenericArguments()[1]; … } } The GetArrayType method gets passed a type instance that is the array or array-like object that is rendered in JSON as an array (which includes IList, IDictionary, IDataReader and a few others). In my case the type passed would be something like Dictionary<string, CustomerEntity>. So I know what the parent container class type is. Based on the the container type using it's then possible to use GetGenericTypeArguments() to retrieve all the generic types in sequential order of definition (ie. string, CustomerEntity). That's the easy part. Creating a Generic Type and Providing Generic Parameters at RunTime The next problem is how do I get a concrete type instance for the generic type? I know what the type name and I have a type instance is but it's generic, so how do I get a type reference to keyvaluepair<K,V> that is specific to the keyType and valueType above? Here are a couple of things that come to mind but that don't work (and yes I tried that unsuccessfully first): Type elementType = typeof(keyvalue<keyType, valueType>); Type elementType = typeof(keyvalue<typeof(keyType), typeof(valueType)>); The problem is that this explicit syntax expects a type literal not some dynamic runtime value, so both of the above won't even compile. I turns out the way to create a generic type at runtime is using a fancy bit of syntax that until today I was completely unaware of: Type elementType = typeof(keyvalue<,>).MakeGenericType(keyType, valueType); The key is the type(keyvalue<,>) bit which looks weird at best. It works however and produces a non-generic type reference. You can see the difference between the full generic type and the non-typed (?) generic type in the debugger: The nonGenericType doesn't show any type specialization, while the elementType type shows the string, CustomerEntity (truncated above) in the type name. Once the full type reference exists (elementType) it's then easy to create an instance. In my case the parser parses through the JSON and when it completes parsing the value/object it creates a new keyvalue<T,V> instance. Now that I know the element type that's pretty trivial with: // Objects start out null until we find the opening tag resultObject = Activator.CreateInstance(elementType); Here the result object is picked up by the JSON array parser which creates an instance of the child object (keyvalue<K,V>) and then parses and assigns values from the JSON document using the types  key/value property signature. Internally the parser then takes each individually parsed item and adds it to a list of  List<keyvalue<K,V>> items. Parsing through a Generic type when you only have Runtime Type Information When parsing of the JSON array is done, the List needs to be turned into a defacto Dictionary<K,V>. This should be easy since I know that I'm dealing with an IDictionary, and I know the generic types for the key and value. The problem is again though that this needs to happen at runtime which would mean using several Convert.ChangeType() calls in the code to dynamically cast at runtime. Yuk. In the end I decided the easier and probably only slightly slower way to do this is a to use the dynamic type to collect the items and assign them to avoid all the dynamic casting madness: else if (IsIDictionary) { IDictionary dict = Activator.CreateInstance(arrayType) as IDictionary; foreach (dynamic item in items) { dict.Add(item.key, item.value); } return dict; } This code creates an instance of the generic dictionary type first, then loops through all of my custom keyvalue<K,V> items and assigns them to the actual dictionary. By using Dynamic here I can side step all the explicit type conversions that would be required in the three highlighted areas (not to mention that this nested method doesn't have access to the dictionary item generic types here). Static <- -> Dynamic Dynamic casting in a static language like C# is a bitch to say the least. This is one of the few times when I've cursed static typing and the arcane syntax that's required to coax types into the right format. It works but it's pretty nasty code. If it weren't for dynamic that last bit of code would have been a pretty ugly as well with a bunch of Convert.ChangeType() calls to litter the code. Fortunately this type of type convulsion is rather rare and reserved for system level code. It's not every day that you create a string to object parser after all :-)© Rick Strahl, West Wind Technologies, 2005-2011Posted in .NET  CSharp   Tweet (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

    Read the article

  • CodePlex Daily Summary for Sunday, June 06, 2010

    CodePlex Daily Summary for Sunday, June 06, 2010New ProjectsActive Worlds Dot Net Wrapper (Based on AwSdk): Active Worlds Dot Net Wrapper (Based on AwSdk)Combina: Smart calculator for large combinatorial calculations.Concurrent Cache: ConcurrentCache is a smart output cache library extending OutputCacheProvider. It consists of in memory, cache files and compressed files modes and...Decay: Personal use. For learningFazTalk: FazTalk is a suite of tools and products that are designed to improve collaboration and workflow interactions. FazTalk takes an innovative approach...grouped: A peer to peer text editor, written in C# [update] I wrote this little thing a while back and even forgot about it, I stopped coding for more tha...HitchARide MVC 2 Sample: An MVC 2 sample written as part of the Microsoft 2010 London Web Camp based on the wireframes at http://schematics.earthware.co.uk/hitcharide. Not...Inspiration.Web: Description: A simple (but entertaining) ASP.NET MVC (C#) project to suggest random code names for projects. Intended audience: People who ne...NetFileBrowser - TinyMCE: tinyMCE file plugin with asp.netOil Slick Live Feeds: All live feeds from BP's Remotely Operated VehiclesParticle Lexer: Parser and Tokenizer libraryPdf Form Tool: Pdf Form Tool demonstrates how the iTextSharp library could be used to fill PDF forms. The input data is provided as a csv file. The application ...Planning Poker Windows Mobile 7: This project is a Planning Poker application for Windows Mobile 7 (and later?). RandomRat: RandomRat is a program for generating random sets that meet specific criteriaScience.NET: A scientific library written in managed code. It supports advanced mathematics (algebra system, sequences, statistics, combinatorics...), data stru...Spider Compiler: Spider Compiler parses the input of a spider programming source file and compiles it (with help of csc.exe; the C#-Compiler) to an exe-file. This p...Sununpro: sunun's project for study by team foundation server.TFS Buddy: An application that manipulates your I-Buddy whenever something happens in your Team Foundation ServerValveSoft: ValveSysWiiMote Physics: WiiMote Physics is an application that allows you to retrieve data from your WiiMote or Balance Board and display it in real-time. It has a number...WinGet: WinGet is a download manager for Windows. You can drag links onto the WinGet Widget and it will download a file on the selected folder. It is dev...XProject.NET: A project management and team collaboration platformNew Releases.NET DiscUtils: Version 0.9 Preview: This release is still under development. New features available in this release: Support for accessing short file names stored in WIM files Incr...Active Worlds Dot Net Wrapper (Based on AwSdk): Active World Dot Net Wrapper (0.0.1.85): Based on AwSdk 85AwSdk UnOfficial Wrapper Howto Use: C# using AwWrapper; VB.Net Import AwWrapperAjaxControlToolkit additional extenders: ZhecheAjaxControls for .NET3.5: Used AJAX Control Toolkit Release Notes - April 12th 2010 Release Version 40412. Fixed deadlock in long operation canceling Some other fixesAnyCAD: AnyCAD.v1.2.ENU.Install: http://www.anycad.net Parametric Modeling *3D: Sphere, Box, Cylinder, Cone •2D: Line, Rectangle, Arc, Arch, Circle, Spline, Polygon •Feature: Extr...Community Forums NNTP bridge: Community Forums NNTP Bridge V29: Release of the Community Forums NNTP Bridge to access the social and anwsers MS forums with a single, open source NNTP bridge. This release has ad...Concurrent Cache: 1.0: This is the first release for the ConcurrentCache library.Configuration Section Designer: 2.0.0: This is the first Beta Release for VS 2010 supportDoxygen Browser Addin for VS: Doxygen Browser Addin - v0.1.4 Beta: Support for Visual Studio 2010 improved the logging of errors (Event Logs) Fixed some issues/bugs Hot key for navigation "Control + F1, Contr...Folder Bookmarks: Folder Bookmarks 1.6.2: The latest version of Folder Bookmarks (1.6.2), with new UI changes. Once you have extracted the file, do not delete any files/folders. They are n...HERB.IQ: Beta 0.1 Source code release 5: Beta 0.1 Source code release 5Inspiration.Web: Initial release (deployment package): Initial release (deployment package)NetFileBrowser - TinyMCE: Demo Project: Demo ProjectNetFileBrowser - TinyMCE: NetFileBrowser: NetImageBrowserNLog - Advanced .NET Logging: Nightly Build 2010.06.05.001: Changes since the last build:2010-06-04 23:29:42 Jarek Kowalski Massive update to documentation generator. 2010-05-28 15:41:42 Jarek Kowalski upda...Oil Slick Live Feeds: Oil Slick Live Feeds 0.1: A the first release, with feeds from the MS Skandi, Boa Deep C, Enterprise and Q4000. They are live streams from the ROV's monitoring the damaged...Pcap.Net: Pcap.Net 0.7.0 (46671): Pcap.Net - June 2010 Release Pcap.Net is a .NET wrapper for WinPcap written in C++/CLI and C#. It Features almost all WinPcap features and includes...sqwarea: Sqwarea 0.0.289.0 (alpha): API supportTFS Buddy: TFS Buddy First release (Beta 1): This is the first release of the TFS Buddy.Visual Studio DSite: Looping Animation (Visual C++ 2008): A solider firing a bullet that loops and displays an explosion everytime it hits the edge of the form.WiiMote Physics: WiiMote Physics v4.0: v4.0.0.1 Recovered from existing compiled assembly after hard drive failure Now requires .NET 4.0 (it seems to make it run faster) Added new c...WinGet: Alpha 1: First Alpha of WinGet. It includes all the planned features but it contains many bugs. Packaged using 7-Zip and ClickOnce.Most Popular ProjectsWBFS ManagerRawrAJAX Control ToolkitMicrosoft SQL Server Product Samples: DatabaseSilverlight ToolkitWindows Presentation Foundation (WPF)PHPExcelpatterns & practices – Enterprise LibraryMicrosoft SQL Server Community & SamplesASP.NETMost Active ProjectsCommunity Forums NNTP bridgeRawrpatterns & practices – Enterprise LibraryGMap.NET - Great Maps for Windows Forms & PresentationN2 CMSIonics Isapi Rewrite FilterStyleCopsmark C# LibraryFarseer Physics Enginepatterns & practices: Composite WPF and Silverlight

    Read the article

  • Fun with Aggregates

    - by Paul White
    There are interesting things to be learned from even the simplest queries.  For example, imagine you are given the task of writing a query to list AdventureWorks product names where the product has at least one entry in the transaction history table, but fewer than ten. One possible query to meet that specification is: SELECT p.Name FROM Production.Product AS p JOIN Production.TransactionHistory AS th ON p.ProductID = th.ProductID GROUP BY p.ProductID, p.Name HAVING COUNT_BIG(*) < 10; That query correctly returns 23 rows (execution plan and data sample shown below): The execution plan looks a bit different from the written form of the query: the base tables are accessed in reverse order, and the aggregation is performed before the join.  The general idea is to read all rows from the history table, compute the count of rows grouped by ProductID, merge join the results to the Product table on ProductID, and finally filter to only return rows where the count is less than ten. This ‘fully-optimized’ plan has an estimated cost of around 0.33 units.  The reason for the quote marks there is that this plan is not quite as optimal as it could be – surely it would make sense to push the Filter down past the join too?  To answer that, let’s look at some other ways to formulate this query.  This being SQL, there are any number of ways to write logically-equivalent query specifications, so we’ll just look at a couple of interesting ones.  The first query is an attempt to reverse-engineer T-SQL from the optimized query plan shown above.  It joins the result of pre-aggregating the history table to the Product table before filtering: SELECT p.Name FROM ( SELECT th.ProductID, cnt = COUNT_BIG(*) FROM Production.TransactionHistory AS th GROUP BY th.ProductID ) AS q1 JOIN Production.Product AS p ON p.ProductID = q1.ProductID WHERE q1.cnt < 10; Perhaps a little surprisingly, we get a slightly different execution plan: The results are the same (23 rows) but this time the Filter is pushed below the join!  The optimizer chooses nested loops for the join, because the cardinality estimate for rows passing the Filter is a bit low (estimate 1 versus 23 actual), though you can force a merge join with a hint and the Filter still appears below the join.  In yet another variation, the < 10 predicate can be ‘manually pushed’ by specifying it in a HAVING clause in the “q1” sub-query instead of in the WHERE clause as written above. The reason this predicate can be pushed past the join in this query form, but not in the original formulation is simply an optimizer limitation – it does make efforts (primarily during the simplification phase) to encourage logically-equivalent query specifications to produce the same execution plan, but the implementation is not completely comprehensive. Moving on to a second example, the following query specification results from phrasing the requirement as “list the products where there exists fewer than ten correlated rows in the history table”: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) < 10 ); Unfortunately, this query produces an incorrect result (86 rows): The problem is that it lists products with no history rows, though the reasons are interesting.  The COUNT_BIG(*) in the EXISTS clause is a scalar aggregate (meaning there is no GROUP BY clause) and scalar aggregates always produce a value, even when the input is an empty set.  In the case of the COUNT aggregate, the result of aggregating the empty set is zero (the other standard aggregates produce a NULL).  To make the point really clear, let’s look at product 709, which happens to be one for which no history rows exist: -- Scalar aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709;   -- Vector aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709 GROUP BY th.ProductID; The estimated execution plans for these two statements are almost identical: You might expect the Stream Aggregate to have a Group By for the second statement, but this is not the case.  The query includes an equality comparison to a constant value (709), so all qualified rows are guaranteed to have the same value for ProductID and the Group By is optimized away. In fact there are some minor differences between the two plans (the first is auto-parameterized and qualifies for trivial plan, whereas the second is not auto-parameterized and requires cost-based optimization), but there is nothing to indicate that one is a scalar aggregate and the other is a vector aggregate.  This is something I would like to see exposed in show plan so I suggested it on Connect.  Anyway, the results of running the two queries show the difference at runtime: The scalar aggregate (no GROUP BY) returns a result of zero, whereas the vector aggregate (with a GROUP BY clause) returns nothing at all.  Returning to our EXISTS query, we could ‘fix’ it by changing the HAVING clause to reject rows where the scalar aggregate returns zero: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) BETWEEN 1 AND 9 ); The query now returns the correct 23 rows: Unfortunately, the execution plan is less efficient now – it has an estimated cost of 0.78 compared to 0.33 for the earlier plans.  Let’s try adding a redundant GROUP BY instead of changing the HAVING clause: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY th.ProductID HAVING COUNT_BIG(*) < 10 ); Not only do we now get correct results (23 rows), this is the execution plan: I like to compare that plan to quantum physics: if you don’t find it shocking, you haven’t understood it properly :)  The simple addition of a redundant GROUP BY has resulted in the EXISTS form of the query being transformed into exactly the same optimal plan we found earlier.  What’s more, in SQL Server 2008 and later, we can replace the odd-looking GROUP BY with an explicit GROUP BY on the empty set: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ); I offer that as an alternative because some people find it more intuitive (and it perhaps has more geek value too).  Whichever way you prefer, it’s rather satisfying to note that the result of the sub-query does not exist for a particular correlated value where a vector aggregate is used (the scalar COUNT aggregate always returns a value, even if zero, so it always ‘EXISTS’ regardless which ProductID is logically being evaluated). The following query forms also produce the optimal plan and correct results, so long as a vector aggregate is used (you can probably find more equivalent query forms): WHERE Clause SELECT p.Name FROM Production.Product AS p WHERE ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) < 10; APPLY SELECT p.Name FROM Production.Product AS p CROSS APPLY ( SELECT NULL FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ) AS ca (dummy); FROM Clause SELECT q1.Name FROM ( SELECT p.Name, cnt = ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) FROM Production.Product AS p ) AS q1 WHERE q1.cnt < 10; This last example uses SUM(1) instead of COUNT and does not require a vector aggregate…you should be able to work out why :) SELECT q.Name FROM ( SELECT p.Name, cnt = ( SELECT SUM(1) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID ) FROM Production.Product AS p ) AS q WHERE q.cnt < 10; The semantics of SQL aggregates are rather odd in places.  It definitely pays to get to know the rules, and to be careful to check whether your queries are using scalar or vector aggregates.  As we have seen, query plans do not show in which ‘mode’ an aggregate is running and getting it wrong can cause poor performance, wrong results, or both. © 2012 Paul White Twitter: @SQL_Kiwi email: [email protected]

    Read the article

  • C#: LINQ vs foreach - Round 1.

    - by James Michael Hare
    So I was reading Peter Kellner's blog entry on Resharper 5.0 and its LINQ refactoring and thought that was very cool.  But that raised a point I had always been curious about in my head -- which is a better choice: manual foreach loops or LINQ?    The answer is not really clear-cut.  There are two sides to any code cost arguments: performance and maintainability.  The first of these is obvious and quantifiable.  Given any two pieces of code that perform the same function, you can run them side-by-side and see which piece of code performs better.   Unfortunately, this is not always a good measure.  Well written assembly language outperforms well written C++ code, but you lose a lot in maintainability which creates a big techncial debt load that is hard to offset as the application ages.  In contrast, higher level constructs make the code more brief and easier to understand, hence reducing technical cost.   Now, obviously in this case we're not talking two separate languages, we're comparing doing something manually in the language versus using a higher-order set of IEnumerable extensions that are in the System.Linq library.   Well, before we discuss any further, let's look at some sample code and the numbers.  First, let's take a look at the for loop and the LINQ expression.  This is just a simple find comparison:       // find implemented via LINQ     public static bool FindViaLinq(IEnumerable<int> list, int target)     {         return list.Any(item => item == target);     }         // find implemented via standard iteration     public static bool FindViaIteration(IEnumerable<int> list, int target)     {         foreach (var i in list)         {             if (i == target)             {                 return true;             }         }           return false;     }   Okay, looking at this from a maintainability point of view, the Linq expression is definitely more concise (8 lines down to 1) and is very readable in intention.  You don't have to actually analyze the behavior of the loop to determine what it's doing.   So let's take a look at performance metrics from 100,000 iterations of these methods on a List<int> of varying sizes filled with random data.  For this test, we fill a target array with 100,000 random integers and then run the exact same pseudo-random targets through both searches.                       List<T> On 100,000 Iterations     Method      Size     Total (ms)  Per Iteration (ms)  % Slower     Any         10       26          0.00046             30.00%     Iteration   10       20          0.00023             -     Any         100      116         0.00201             18.37%     Iteration   100      98          0.00118             -     Any         1000     1058        0.01853             16.78%     Iteration   1000     906         0.01155             -     Any         10,000   10,383      0.18189             17.41%     Iteration   10,000   8843        0.11362             -     Any         100,000  104,004     1.8297              18.27%     Iteration   100,000  87,941      1.13163             -   The LINQ expression is running about 17% slower for average size collections and worse for smaller collections.  Presumably, this is due to the overhead of the state machine used to track the iterators for the yield returns in the LINQ expressions, which seems about right in a tight loop such as this.   So what about other LINQ expressions?  After all, Any() is one of the more trivial ones.  I decided to try the TakeWhile() algorithm using a Count() to get the position stopped like the sample Pete was using in his blog that Resharper refactored for him into LINQ:       // Linq form     public static int GetTargetPosition1(IEnumerable<int> list, int target)     {         return list.TakeWhile(item => item != target).Count();     }       // traditionally iterative form     public static int GetTargetPosition2(IEnumerable<int> list, int target)     {         int count = 0;           foreach (var i in list)         {             if(i == target)             {                 break;             }               ++count;         }           return count;     }   Once again, the LINQ expression is much shorter, easier to read, and should be easier to maintain over time, reducing the cost of technical debt.  So I ran these through the same test data:                       List<T> On 100,000 Iterations     Method      Size     Total (ms)  Per Iteration (ms)  % Slower     TakeWhile   10       41          0.00041             128%     Iteration   10       18          0.00018             -     TakeWhile   100      171         0.00171             88%     Iteration   100      91          0.00091             -     TakeWhile   1000     1604        0.01604             94%     Iteration   1000     825         0.00825             -     TakeWhile   10,000   15765       0.15765             92%     Iteration   10,000   8204        0.08204             -     TakeWhile   100,000  156950      1.5695              92%     Iteration   100,000  81635       0.81635             -     Wow!  I expected some overhead due to the state machines iterators produce, but 90% slower?  That seems a little heavy to me.  So then I thought, well, what if TakeWhile() is not the right tool for the job?  The problem is TakeWhile returns each item for processing using yield return, whereas our for-loop really doesn't care about the item beyond using it as a stop condition to evaluate. So what if that back and forth with the iterator state machine is the problem?  Well, we can quickly create an (albeit ugly) lambda that uses the Any() along with a count in a closure (if a LINQ guru knows a better way PLEASE let me know!), after all , this is more consistent with what we're trying to do, we're trying to find the first occurence of an item and halt once we find it, we just happen to be counting on the way.  This mostly matches Any().       // a new method that uses linq but evaluates the count in a closure.     public static int TakeWhileViaLinq2(IEnumerable<int> list, int target)     {         int count = 0;         list.Any(item =>             {                 if(item == target)                 {                     return true;                 }                   ++count;                 return false;             });         return count;     }     Now how does this one compare?                         List<T> On 100,000 Iterations     Method         Size     Total (ms)  Per Iteration (ms)  % Slower     TakeWhile      10       41          0.00041             128%     Any w/Closure  10       23          0.00023             28%     Iteration      10       18          0.00018             -     TakeWhile      100      171         0.00171             88%     Any w/Closure  100      116         0.00116             27%     Iteration      100      91          0.00091             -     TakeWhile      1000     1604        0.01604             94%     Any w/Closure  1000     1101        0.01101             33%     Iteration      1000     825         0.00825             -     TakeWhile      10,000   15765       0.15765             92%     Any w/Closure  10,000   10802       0.10802             32%     Iteration      10,000   8204        0.08204             -     TakeWhile      100,000  156950      1.5695              92%     Any w/Closure  100,000  108378      1.08378             33%     Iteration      100,000  81635       0.81635             -     Much better!  It seems that the overhead of TakeAny() returning each item and updating the state in the state machine is drastically reduced by using Any() since Any() iterates forward until it finds the value we're looking for -- for the task we're attempting to do.   So the lesson there is, make sure when you use a LINQ expression you're choosing the best expression for the job, because if you're doing more work than you really need, you'll have a slower algorithm.  But this is true of any choice of algorithm or collection in general.     Even with the Any() with the count in the closure it is still about 30% slower, but let's consider that angle carefully.  For a list of 100,000 items, it was the difference between 1.01 ms and 0.82 ms roughly in a List<T>.  That's really not that bad at all in the grand scheme of things.  Even running at 90% slower with TakeWhile(), for the vast majority of my projects, an extra millisecond to save potential errors in the long term and improve maintainability is a small price to pay.  And if your typical list is 1000 items or less we're talking only microseconds worth of difference.   It's like they say: 90% of your performance bottlenecks are in 2% of your code, so over-optimizing almost never pays off.  So personally, I'll take the LINQ expression wherever I can because they will be easier to read and maintain (thus reducing technical debt) and I can rely on Microsoft's development to have coded and unit tested those algorithm fully for me instead of relying on a developer to code the loop logic correctly.   If something's 90% slower, yes, it's worth keeping in mind, but it's really not until you start get magnitudes-of-order slower (10x, 100x, 1000x) that alarm bells should really go off.  And if I ever do need that last millisecond of performance?  Well then I'll optimize JUST THAT problem spot.  To me it's worth it for the readability, speed-to-market, and maintainability.

    Read the article

  • Java collision detection and player movement: tips

    - by Loris
    I have read a short guide for game develompent (java, without external libraries). I'm facing with collision detection and player (and bullets) movements. Now i put the code. Most of it is taken from the guide (should i link this guide?). I'm just trying to expand and complete it. This is the class that take care of updates movements and firing mechanism (and collision detection): public class ArenaController { private Arena arena; /** selected cell for movement */ private float targetX, targetY; /** true if droid is moving */ private boolean moving = false; /** true if droid is shooting to enemy */ private boolean shooting = false; private DroidController droidController; public ArenaController(Arena arena) { this.arena = arena; this.droidController = new DroidController(arena); } public void update(float delta) { Droid droid = arena.getDroid(); //droid movements if (moving) { droidController.moveDroid(delta, targetX, targetY); //check if arrived if (droid.getX() == targetX && droid.getY() == targetY) moving = false; } //firing mechanism if(shooting) { //stop shot if there aren't bullets if(arena.getBullets().isEmpty()) { shooting = false; } for(int i = 0; i < arena.getBullets().size(); i++) { //current bullet Bullet bullet = arena.getBullets().get(i); System.out.println(bullet.getBounds()); //angle calculation double angle = Math.atan2(bullet.getEnemyY() - bullet.getY(), bullet.getEnemyX() - bullet.getX()); //increments x and y bullet.setX((float) (bullet.getX() + (Math.cos(angle) * bullet.getSpeed() * delta))); bullet.setY((float) (bullet.getY() + (Math.sin(angle) * bullet.getSpeed() * delta))); //collision with obstacles for(int j = 0; j < arena.getObstacles().size(); j++) { Obstacle obs = arena.getObstacles().get(j); if(bullet.getBounds().intersects(obs.getBounds())) { System.out.println("Collision detect!"); arena.removeBullet(bullet); } } //collisions with enemies for(int j = 0; j < arena.getEnemies().size(); j++) { Enemy ene = arena.getEnemies().get(j); if(bullet.getBounds().intersects(ene.getBounds())) { System.out.println("Collision detect!"); arena.removeBullet(bullet); } } } } } public boolean onClick(int x, int y) { //click on empty cell if(arena.getGrid()[(int)(y / Arena.TILE)][(int)(x / Arena.TILE)] == null) { //coordinates targetX = x / Arena.TILE; targetY = y / Arena.TILE; //enables movement moving = true; return true; } //click on enemy: fire if(arena.getGrid()[(int)(y / Arena.TILE)][(int)(x / Arena.TILE)] instanceof Enemy) { //coordinates float enemyX = x / Arena.TILE; float enemyY = y / Arena.TILE; //new bullet Bullet bullet = new Bullet(); //start coordinates bullet.setX(arena.getDroid().getX()); bullet.setY(arena.getDroid().getY()); //end coordinates (enemie) bullet.setEnemyX(enemyX); bullet.setEnemyY(enemyY); //adds bullet to arena arena.addBullet(bullet); //enables shooting shooting = true; return true; } return false; } As you can see for collision detection i'm trying to use Rectangle object. Droid example: import java.awt.geom.Rectangle2D; public class Droid { private float x; private float y; private float speed = 20f; private float rotation = 0f; private float damage = 2f; public static final int DIAMETER = 32; private Rectangle2D rectangle; public Droid() { rectangle = new Rectangle2D.Float(x, y, DIAMETER, DIAMETER); } public float getX() { return x; } public void setX(float x) { this.x = x; //rectangle update rectangle.setRect(x, y, DIAMETER, DIAMETER); } public float getY() { return y; } public void setY(float y) { this.y = y; //rectangle update rectangle.setRect(x, y, DIAMETER, DIAMETER); } public float getSpeed() { return speed; } public void setSpeed(float speed) { this.speed = speed; } public float getRotation() { return rotation; } public void setRotation(float rotation) { this.rotation = rotation; } public float getDamage() { return damage; } public void setDamage(float damage) { this.damage = damage; } public Rectangle2D getRectangle() { return rectangle; } } For now, if i start the application and i try to shot to an enemy, is immediately detected a collision and the bullet is removed! Can you help me with this? If the bullet hit an enemy or an obstacle in his way, it must disappear. Ps: i know that the movements of the bullets should be managed in another class. This code is temporary. update I realized what happens, but not why. With those for loops (which checks collisions) the movements of the bullets are instantaneous instead of gradual. In addition to this, if i add the collision detection to the Droid, the method intersects returns true ALWAYS while the droid is moving! public void moveDroid(float delta, float x, float y) { Droid droid = arena.getDroid(); int bearing = 1; if (droid.getX() > x) { bearing = -1; } if (droid.getX() != x) { droid.setX(droid.getX() + bearing * droid.getSpeed() * delta); //obstacles collision detection for(Obstacle obs : arena.getObstacles()) { if(obs.getRectangle().intersects(droid.getRectangle())) { System.out.println("Collision detected"); //ALWAYS HERE } } //controlla se è arrivato if ((droid.getX() < x && bearing == -1) || (droid.getX() > x && bearing == 1)) droid.setX(x); } bearing = 1; if (droid.getY() > y) { bearing = -1; } if (droid.getY() != y) { droid.setY(droid.getY() + bearing * droid.getSpeed() * delta); if ((droid.getY() < y && bearing == -1) || (droid.getY() > y && bearing == 1)) droid.setY(y); } }

    Read the article

  • The SSIS tuning tip that everyone misses

    - by Rob Farley
    I know that everyone misses this, because I’m yet to find someone who doesn’t have a bit of an epiphany when I describe this. When tuning Data Flows in SQL Server Integration Services, people see the Data Flow as moving from the Source to the Destination, passing through a number of transformations. What people don’t consider is the Source, getting the data out of a database. Remember, the source of data for your Data Flow is not your Source Component. It’s wherever the data is, within your database, probably on a disk somewhere. You need to tune your query to optimise it for SSIS, and this is what most people fail to do. I’m not suggesting that people don’t tune their queries – there’s plenty of information out there about making sure that your queries run as fast as possible. But for SSIS, it’s not about how fast your query runs. Let me say that again, but in bolder text: The speed of an SSIS Source is not about how fast your query runs. If your query is used in a Source component for SSIS, the thing that matters is how fast it starts returning data. In particular, those first 10,000 rows to populate that first buffer, ready to pass down the rest of the transformations on its way to the Destination. Let’s look at a very simple query as an example, using the AdventureWorks database: We’re picking the different Weight values out of the Product table, and it’s doing this by scanning the table and doing a Sort. It’s a Distinct Sort, which means that the duplicates are discarded. It'll be no surprise to see that the data produced is sorted. Obvious, I know, but I'm making a comparison to what I'll do later. Before I explain the problem here, let me jump back into the SSIS world... If you’ve investigated how to tune an SSIS flow, then you’ll know that some SSIS Data Flow Transformations are known to be Blocking, some are Partially Blocking, and some are simply Row transformations. Take the SSIS Sort transformation, for example. I’m using a larger data set for this, because my small list of Weights won’t demonstrate it well enough. Seven buffers of data came out of the source, but none of them could be pushed past the Sort operator, just in case the last buffer contained the data that would be sorted into the first buffer. This is a blocking operation. Back in the land of T-SQL, we consider our Distinct Sort operator. It’s also blocking. It won’t let data through until it’s seen all of it. If you weren’t okay with blocking operations in SSIS, why would you be happy with them in an execution plan? The source of your data is not your OLE DB Source. Remember this. The source of your data is the NCIX/CIX/Heap from which it’s being pulled. Picture it like this... the data flowing from the Clustered Index, through the Distinct Sort operator, into the SELECT operator, where a series of SSIS Buffers are populated, flowing (as they get full) down through the SSIS transformations. Alright, I know that I’m taking some liberties here, because the two queries aren’t the same, but consider the visual. The data is flowing from your disk and through your execution plan before it reaches SSIS, so you could easily find that a blocking operation in your plan is just as painful as a blocking operation in your SSIS Data Flow. Luckily, T-SQL gives us a brilliant query hint to help avoid this. OPTION (FAST 10000) This hint means that it will choose a query which will optimise for the first 10,000 rows – the default SSIS buffer size. And the effect can be quite significant. First let’s consider a simple example, then we’ll look at a larger one. Consider our weights. We don’t have 10,000, so I’m going to use OPTION (FAST 1) instead. You’ll notice that the query is more expensive, using a Flow Distinct operator instead of the Distinct Sort. This operator is consuming 84% of the query, instead of the 59% we saw from the Distinct Sort. But the first row could be returned quicker – a Flow Distinct operator is non-blocking. The data here isn’t sorted, of course. It’s in the same order that it came out of the index, just with duplicates removed. As soon as a Flow Distinct sees a value that it hasn’t come across before, it pushes it out to the operator on its left. It still has to maintain the list of what it’s seen so far, but by handling it one row at a time, it can push rows through quicker. Overall, it’s a lot more work than the Distinct Sort, but if the priority is the first few rows, then perhaps that’s exactly what we want. The Query Optimizer seems to do this by optimising the query as if there were only one row coming through: This 1 row estimation is caused by the Query Optimizer imagining the SELECT operation saying “Give me one row” first, and this message being passed all the way along. The request might not make it all the way back to the source, but in my simple example, it does. I hope this simple example has helped you understand the significance of the blocking operator. Now I’m going to show you an example on a much larger data set. This data was fetching about 780,000 rows, and these are the Estimated Plans. The data needed to be Sorted, to support further SSIS operations that needed that. First, without the hint. ...and now with OPTION (FAST 10000): A very different plan, I’m sure you’ll agree. In case you’re curious, those arrows in the top one are 780,000 rows in size. In the second, they’re estimated to be 10,000, although the Actual figures end up being 780,000. The top one definitely runs faster. It finished several times faster than the second one. With the amount of data being considered, these numbers were in minutes. Look at the second one – it’s doing Nested Loops, across 780,000 rows! That’s not generally recommended at all. That’s “Go and make yourself a coffee” time. In this case, it was about six or seven minutes. The faster one finished in about a minute. But in SSIS-land, things are different. The particular data flow that was consuming this data was significant. It was being pumped into a Script Component to process each row based on previous rows, creating about a dozen different flows. The data flow would take roughly ten minutes to run – ten minutes from when the data first appeared. The query that completes faster – chosen by the Query Optimizer with no hints, based on accurate statistics (rather than pretending the numbers are smaller) – would take a minute to start getting the data into SSIS, at which point the ten-minute flow would start, taking eleven minutes to complete. The query that took longer – chosen by the Query Optimizer pretending it only wanted the first 10,000 rows – would take only ten seconds to fill the first buffer. Despite the fact that it might have taken the database another six or seven minutes to get the data out, SSIS didn’t care. Every time it wanted the next buffer of data, it was already available, and the whole process finished in about ten minutes and ten seconds. When debugging SSIS, you run the package, and sit there waiting to see the Debug information start appearing. You look for the numbers on the data flow, and seeing operators going Yellow and Green. Without the hint, I’d sit there for a minute. With the hint, just ten seconds. You can imagine which one I preferred. By adding this hint, it felt like a magic wand had been waved across the query, to make it run several times faster. It wasn’t the case at all – but it felt like it to SSIS.

    Read the article

  • Automating deployments with the SQL Compare command line

    - by Jonathan Hickford
    In my previous article, “Five Tips to Get Your Organisation Releasing Software Frequently” I looked at how teams can automate processes to speed up release frequency. In this post, I’m looking specifically at automating deployments using the SQL Compare command line. SQL Compare compares SQL Server schemas and deploys the differences. It works very effectively in scenarios where only one deployment target is required – source and target databases are specified, compared, and a change script is automatically generated and applied. But if multiple targets exist, and pressure to increase the frequency of releases builds, this solution quickly becomes unwieldy.   This is where SQL Compare’s command line comes into its own. I’ve put together a PowerShell script that loops through the Servers table and pulls out the server and database, these are then passed to sqlcompare.exe to be used as target parameters. In the example the source database is a scripts folder, a folder structure of scripted-out database objects used by both SQL Source Control and SQL Compare. The script can easily be adapted to use schema snapshots.     -- Create a DeploymentTargets database and a Servers table CREATE DATABASE DeploymentTargets GO USE DeploymentTargets GO CREATE TABLE [dbo].[Servers]( [id] [int] IDENTITY(1,1) NOT NULL, [serverName] [nvarchar](50) NULL, [environment] [nvarchar](50) NULL, [databaseName] [nvarchar](50) NULL, CONSTRAINT [PK_Servers] PRIMARY KEY CLUSTERED ([id] ASC) ) GO -- Now insert your target server and database details INSERT INTO dbo.Servers ( serverName , environment , databaseName) VALUES ( N'myserverinstance' , N'myenvironment1' , N'mydb1') INSERT INTO dbo.Servers ( serverName , environment , databaseName) VALUES ( N'myserverinstance' , N'myenvironment2' , N'mydb2') Here’s the PowerShell script you can adapt for yourself as well. # We're holding the server names and database names that we want to deploy to in a database table. # We need to connect to that server to read these details $serverName = "" $databaseName = "DeploymentTargets" $authentication = "Integrated Security=SSPI" #$authentication = "User Id=xxx;PWD=xxx" # If you are using database authentication instead of Windows authentication. # Path to the scripts folder we want to deploy to the databases $scriptsPath = "SimpleTalk" # Path to SQLCompare.exe $SQLComparePath = "C:\Program Files (x86)\Red Gate\SQL Compare 10\sqlcompare.exe" # Create SQL connection string, and connection $ServerConnectionString = "Data Source=$serverName;Initial Catalog=$databaseName;$authentication" $ServerConnection = new-object system.data.SqlClient.SqlConnection($ServerConnectionString); # Create a Dataset to hold the DataTable $dataSet = new-object "System.Data.DataSet" "ServerList" # Create a query $query = "SET NOCOUNT ON;" $query += "SELECT serverName, environment, databaseName " $query += "FROM dbo.Servers; " # Create a DataAdapter to populate the DataSet with the results $dataAdapter = new-object "System.Data.SqlClient.SqlDataAdapter" ($query, $ServerConnection) $dataAdapter.Fill($dataSet) | Out-Null # Close the connection $ServerConnection.Close() # Populate the DataTable $dataTable = new-object "System.Data.DataTable" "Servers" $dataTable = $dataSet.Tables[0] #For every row in the DataTable $dataTable | FOREACH-OBJECT { "Server Name: $($_.serverName)" "Database Name: $($_.databaseName)" "Environment: $($_.environment)" # Compare the scripts folder to the database and synchronize the database to match # NB. Have set SQL Compare to abort on medium level warnings. $arguments = @("/scripts1:$($scriptsPath)", "/server2:$($_.serverName)", "/database2:$($_.databaseName)", "/AbortOnWarnings:Medium") # + @("/sync" ) # Commented out the 'sync' parameter for safety, write-host $arguments & $SQLComparePath $arguments "Exit Code: $LASTEXITCODE" # Some interesting variations # Check that every database matches a folder. # For example this might be a pre-deployment step to validate everything is at the same baseline state. # Or a post deployment script to validate the deployment worked. # An exit code of 0 means the databases are identical. # # $arguments = @("/scripts1:$($scriptsPath)", "/server2:$($_.serverName)", "/database2:$($_.databaseName)", "/Assertidentical") # Generate a report of the difference between the folder and each database. Generate a SQL update script for each database. # For example use this after the above to generate upgrade scripts for each database # Examine the warnings and the HTML diff report to understand how the script will change objects # #$arguments = @("/scripts1:$($scriptsPath)", "/server2:$($_.serverName)", "/database2:$($_.databaseName)", "/ScriptFile:update_$($_.environment+"_"+$_.databaseName).sql", "/report:update_$($_.environment+"_"+$_.databaseName).html" , "/reportType:Interactive", "/showWarnings", "/include:Identical") } It’s worth noting that the above example generates the deployment scripts dynamically. This approach should be problem-free for the vast majority of changes, but it is still good practice to review and test a pre-generated deployment script prior to deployment. An alternative approach would be to pre-generate a single deployment script using SQL Compare, and run this en masse to multiple targets programmatically using sqlcmd, or using a tool like SQL Multi Script.  You can use the /ScriptFile, /report, and /showWarnings flags to generate change scripts, difference reports and any warnings.  See the commented out example in the PowerShell: #$arguments = @("/scripts1:$($scriptsPath)", "/server2:$($_.serverName)", "/database2:$($_.databaseName)", "/ScriptFile:update_$($_.environment+"_"+$_.databaseName).sql", "/report:update_$($_.environment+"_"+$_.databaseName).html" , "/reportType:Interactive", "/showWarnings", "/include:Identical") There is a drawback of running a pre-generated deployment script; it assumes that a given database target hasn’t drifted from its expected state. Often there are (rightly or wrongly) many individuals within an organization who have permissions to alter the production database, and changes can therefore be made outside of the prescribed development processes. The consequence is that at deployment time, the applied script has been validated against a target that no longer represents reality. The solution here would be to add a check for drift prior to running the deployment script. This is achieved by using sqlcompare.exe to compare the target against the expected schema snapshot using the /Assertidentical flag. Should this return any differences (sqlcompare.exe Exit Code 79), a drift report is outputted instead of executing the deployment script.  See the commented out example. # $arguments = @("/scripts1:$($scriptsPath)", "/server2:$($_.serverName)", "/database2:$($_.databaseName)", "/Assertidentical") Any checks and processes that should be undertaken prior to a manual deployment, should also be happen during an automated deployment. You might think about triggering backups prior to deployment – even better, automate the verification of the backup too.   You can use SQL Compare’s command line interface along with PowerShell to automate multiple actions and checks that you need in your deployment process. Automation is a practical solution where multiple targets and a higher release cadence come into play. As we know, with great power comes great responsibility – responsibility to ensure that the necessary checks are made so deployments remain trouble-free.  (The code sample supplied in this post automates the simple dynamic deployment case – if you are considering more advanced automation, e.g. the drift checks, script generation, deploying to large numbers of targets and backup/verification, please email me at [email protected] for further script samples or if you have further questions)

    Read the article

  • IMAPSync Migration to Exchange 2010 SP1: Exchange drops connections while checking for existence of folders

    - by Benjamin Priestman
    I'm migrating from ZImbra Collaboration Suite to Exchange 2010 SP1. I'm testing IMAPSync as a possible migration tool and have hit a problem with the IMAP server in Exchange 2010. For each account it migrates, IMAPSync loops through the list of folders in the source mailbox and tests for the existence of each one in the destination mailbox. It then goes on to create those folders that do not exist and copy over the messages. It's the intial testing for the existence of the folders that is giving me a problem. The response given by the Exchange server when the folder does not yet exist is given as an error: "R=""16 NO IMAPSyncTest/8 doesn't exist."" After ten of these errors have been issued in succession, the Exchange server appears to stop responding to the IMAP session. Enabling protocol logging for IMAP confirms that the 10th request for a non-existant folder is the last request to be logged on the server. IMAPSync carries on merrily without seeming to realise its connection has gone and thus fails to create any folders. I've logged this with the tool's creator. Does anyone have any idea why Exchange is stopping responding to the connections though? The behaviour looks rather like throttling, although the 'ten strikes and you're out' trigger does not seem to correspond to any of the triggers on the ThrottlingPolicies. Just to check, I've tried creating a new ThrottlingPolicy, turned everything that I think might be relevant up to 11 and applied it to the my test mailbox. Policy settings are listed below, along with IMAP settings. Everything else should be pretty much as default. Throttling Policy RunspaceId : afa3159c-32a6-4906-986f-8adfbe50868b IsDefault : False AnonymousMaxConcurrency : 1 AnonymousPercentTimeInAD : AnonymousPercentTimeInCAS : AnonymousPercentTimeInMailboxRPC : EASMaxConcurrency : 10 EASPercentTimeInAD : EASPercentTimeInCAS : EASPercentTimeInMailboxRPC : EASMaxDevices : 10 EASMaxDeviceDeletesPerMonth : EWSMaxConcurrency : 10 EWSPercentTimeInAD : 50 EWSPercentTimeInCAS : 90 EWSPercentTimeInMailboxRPC : 60 EWSMaxSubscriptions : 5000 EWSFastSearchTimeoutInSeconds : 60 EWSFindCountLimit : 1000 IMAPMaxConcurrency : 1000 IMAPPercentTimeInAD : 400 IMAPPercentTimeInCAS : 400 IMAPPercentTimeInMailboxRPC : 400 OWAMaxConcurrency : 5 OWAPercentTimeInAD : 30 OWAPercentTimeInCAS : 150 OWAPercentTimeInMailboxRPC : 150 POPMaxConcurrency : 20 POPPercentTimeInAD : POPPercentTimeInCAS : POPPercentTimeInMailboxRPC : PowerShellMaxConcurrency : 18 PowerShellMaxTenantConcurrency : PowerShellMaxCmdlets : PowerShellMaxCmdletsTimePeriod : ExchangeMaxCmdlets : PowerShellMaxCmdletQueueDepth : PowerShellMaxDestructiveCmdlets : PowerShellMaxDestructiveCmdletsTimePeriod : RCAMaxConcurrency : 1000 RCAPercentTimeInAD : 400 RCAPercentTimeInCAS : 400 RCAPercentTimeInMailboxRPC : 400 CPAMaxConcurrency : 20 CPAPercentTimeInCAS : 205 CPAPercentTimeInMailboxRPC : 200 MessageRateLimit : RecipientRateLimit : ForwardeeLimit : CPUStartPercent : 75 AdminDisplayName : ExchangeVersion : 0.10 (14.0.100.0) Name : TestMigrationThrottling DistinguishedName : CN=TestMigrationThrottling,CN=Global Settings,CN=Our Company,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=cimex,DC=com Identity : TestMigrationThrottling Guid : 240049b3-2023-4df1-8edc-fbfc1fc80b87 ObjectCategory : domain.com/Configuration/Schema/ms-Exch-Throttling-Policy ObjectClass : {top, msExchGenericPolicy, msExchThrottlingPolicy} WhenChanged : 21/04/2011 18:48:19 WhenCreated : 21/04/2011 18:07:20 WhenChangedUTC : 21/04/2011 17:48:19 WhenCreatedUTC : 21/04/2011 17:07:20 OrganizationId : OriginatingServer : a-domain-controller IsValid : True IMAPSettings RunspaceId : afa3159c-32a6-4906-986f-8adfbe50868b ProtocolName : IMAP4 Name : 1 MaxCommandSize : 10240 ShowHiddenFoldersEnabled : False UnencryptedOrTLSBindings : {192.168.x.x:143} SSLBindings : {192.168.x.x:993} InternalConnectionSettings : {mail.office.domain.com:143:TLS, mail.office.domain.com:993:SSL} ExternalConnectionSettings : {mail.office.domain.com:143:TLS, mail.office.domain.com:993:SSL} X509CertificateName : mail.domain.com Banner : The Microsoft Exchange IMAP4 service is ready. LoginType : SecureLogin AuthenticatedConnectionTimeout : 00:30:00 PreAuthenticatedConnectionTimeout : 00:01:00 MaxConnections : 2147483647 MaxConnectionFromSingleIP : 2147483647 MaxConnectionsPerUser : 16 MessageRetrievalMimeFormat : BestBodyFormat ProxyTargetPort : 143 CalendarItemRetrievalOption : iCalendar OwaServerUrl : EnableExactRFC822Size : False LiveIdBasicAuthReplacement : False SuppressReadReceipt : False ProtocolLogEnabled : True EnforceCertificateErrors : False LogFileLocation : C:\Program Files\Microsoft\Exchange Server\V14\Logging\Imap4 LogFileRollOverSettings : Daily LogPerFileSizeQuota : 0 B (0 bytes) ExtendedProtectionPolicy : None EnableGSSAPIAndNTLMAuth : True Server : CMX-OFFICE-EX01 AdminDisplayName : ExchangeVersion : 0.10 (14.0.100.0) DistinguishedName : CN=1,CN=IMAP4,CN=Protocols,CN=EXCHANGE01,CN=Servers,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=Our COmpany,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=domain,DC=com Identity : EXCHANGE01\1 Guid : 48f9dc37-74c2-4fb0-a042-641f863f45f2 ObjectCategory : domain.com/Configuration/Schema/ms-Exch-Protocol-Cfg-IMAP-Server ObjectClass : {top, protocolCfg, protocolCfgIMAP, protocolCfgIMAPServer} WhenChanged : 21/04/2011 17:03:39 WhenCreated : 15/04/2011 13:51:58 WhenChangedUTC : 21/04/2011 16:03:39 WhenCreatedUTC : 15/04/2011 12:51:58 OrganizationId : OriginatingServer : a-domain-server IsValid : True

    Read the article

  • Fun with Aggregates

    - by Paul White
    There are interesting things to be learned from even the simplest queries.  For example, imagine you are given the task of writing a query to list AdventureWorks product names where the product has at least one entry in the transaction history table, but fewer than ten. One possible query to meet that specification is: SELECT p.Name FROM Production.Product AS p JOIN Production.TransactionHistory AS th ON p.ProductID = th.ProductID GROUP BY p.ProductID, p.Name HAVING COUNT_BIG(*) < 10; That query correctly returns 23 rows (execution plan and data sample shown below): The execution plan looks a bit different from the written form of the query: the base tables are accessed in reverse order, and the aggregation is performed before the join.  The general idea is to read all rows from the history table, compute the count of rows grouped by ProductID, merge join the results to the Product table on ProductID, and finally filter to only return rows where the count is less than ten. This ‘fully-optimized’ plan has an estimated cost of around 0.33 units.  The reason for the quote marks there is that this plan is not quite as optimal as it could be – surely it would make sense to push the Filter down past the join too?  To answer that, let’s look at some other ways to formulate this query.  This being SQL, there are any number of ways to write logically-equivalent query specifications, so we’ll just look at a couple of interesting ones.  The first query is an attempt to reverse-engineer T-SQL from the optimized query plan shown above.  It joins the result of pre-aggregating the history table to the Product table before filtering: SELECT p.Name FROM ( SELECT th.ProductID, cnt = COUNT_BIG(*) FROM Production.TransactionHistory AS th GROUP BY th.ProductID ) AS q1 JOIN Production.Product AS p ON p.ProductID = q1.ProductID WHERE q1.cnt < 10; Perhaps a little surprisingly, we get a slightly different execution plan: The results are the same (23 rows) but this time the Filter is pushed below the join!  The optimizer chooses nested loops for the join, because the cardinality estimate for rows passing the Filter is a bit low (estimate 1 versus 23 actual), though you can force a merge join with a hint and the Filter still appears below the join.  In yet another variation, the < 10 predicate can be ‘manually pushed’ by specifying it in a HAVING clause in the “q1” sub-query instead of in the WHERE clause as written above. The reason this predicate can be pushed past the join in this query form, but not in the original formulation is simply an optimizer limitation – it does make efforts (primarily during the simplification phase) to encourage logically-equivalent query specifications to produce the same execution plan, but the implementation is not completely comprehensive. Moving on to a second example, the following query specification results from phrasing the requirement as “list the products where there exists fewer than ten correlated rows in the history table”: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) < 10 ); Unfortunately, this query produces an incorrect result (86 rows): The problem is that it lists products with no history rows, though the reasons are interesting.  The COUNT_BIG(*) in the EXISTS clause is a scalar aggregate (meaning there is no GROUP BY clause) and scalar aggregates always produce a value, even when the input is an empty set.  In the case of the COUNT aggregate, the result of aggregating the empty set is zero (the other standard aggregates produce a NULL).  To make the point really clear, let’s look at product 709, which happens to be one for which no history rows exist: -- Scalar aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709;   -- Vector aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709 GROUP BY th.ProductID; The estimated execution plans for these two statements are almost identical: You might expect the Stream Aggregate to have a Group By for the second statement, but this is not the case.  The query includes an equality comparison to a constant value (709), so all qualified rows are guaranteed to have the same value for ProductID and the Group By is optimized away. In fact there are some minor differences between the two plans (the first is auto-parameterized and qualifies for trivial plan, whereas the second is not auto-parameterized and requires cost-based optimization), but there is nothing to indicate that one is a scalar aggregate and the other is a vector aggregate.  This is something I would like to see exposed in show plan so I suggested it on Connect.  Anyway, the results of running the two queries show the difference at runtime: The scalar aggregate (no GROUP BY) returns a result of zero, whereas the vector aggregate (with a GROUP BY clause) returns nothing at all.  Returning to our EXISTS query, we could ‘fix’ it by changing the HAVING clause to reject rows where the scalar aggregate returns zero: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) BETWEEN 1 AND 9 ); The query now returns the correct 23 rows: Unfortunately, the execution plan is less efficient now – it has an estimated cost of 0.78 compared to 0.33 for the earlier plans.  Let’s try adding a redundant GROUP BY instead of changing the HAVING clause: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY th.ProductID HAVING COUNT_BIG(*) < 10 ); Not only do we now get correct results (23 rows), this is the execution plan: I like to compare that plan to quantum physics: if you don’t find it shocking, you haven’t understood it properly :)  The simple addition of a redundant GROUP BY has resulted in the EXISTS form of the query being transformed into exactly the same optimal plan we found earlier.  What’s more, in SQL Server 2008 and later, we can replace the odd-looking GROUP BY with an explicit GROUP BY on the empty set: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ); I offer that as an alternative because some people find it more intuitive (and it perhaps has more geek value too).  Whichever way you prefer, it’s rather satisfying to note that the result of the sub-query does not exist for a particular correlated value where a vector aggregate is used (the scalar COUNT aggregate always returns a value, even if zero, so it always ‘EXISTS’ regardless which ProductID is logically being evaluated). The following query forms also produce the optimal plan and correct results, so long as a vector aggregate is used (you can probably find more equivalent query forms): WHERE Clause SELECT p.Name FROM Production.Product AS p WHERE ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) < 10; APPLY SELECT p.Name FROM Production.Product AS p CROSS APPLY ( SELECT NULL FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ) AS ca (dummy); FROM Clause SELECT q1.Name FROM ( SELECT p.Name, cnt = ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) FROM Production.Product AS p ) AS q1 WHERE q1.cnt < 10; This last example uses SUM(1) instead of COUNT and does not require a vector aggregate…you should be able to work out why :) SELECT q.Name FROM ( SELECT p.Name, cnt = ( SELECT SUM(1) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID ) FROM Production.Product AS p ) AS q WHERE q.cnt < 10; The semantics of SQL aggregates are rather odd in places.  It definitely pays to get to know the rules, and to be careful to check whether your queries are using scalar or vector aggregates.  As we have seen, query plans do not show in which ‘mode’ an aggregate is running and getting it wrong can cause poor performance, wrong results, or both. © 2012 Paul White Twitter: @SQL_Kiwi email: [email protected]

    Read the article

< Previous Page | 60 61 62 63 64 65 66 67 68  | Next Page >