Search Results

Search found 58436 results on 2338 pages for 'data'.

Page 465/2338 | < Previous Page | 461 462 463 464 465 466 467 468 469 470 471 472  | Next Page >

  • How to know if a MySQL UPDATE query fails because information being supplied matches data already in

    - by Mike
    I have a query where the user types a block of text in a textarea field. They are able to save this information in a database. The problem is that if they have not changed the information or if the information matches the data already in the database, I recieve a '0' for affected rows. Usually I display an error that says the query failed when there are no affected rows. How would I 'know' that the 0 affected rows is because the data already exists, so that I can display a more specific error?

    Read the article

  • beforeSave() returned some error

    - by kwokwai
    Hi all, I got a simple input text field in a HTML form: <input type="text" name="data[User][pswd]" id="data[User][pswd]"> The scripts for the Controller's action that captured the data is as follows: function register(){ $temp = $this->data; if(strlen($temp['User']['pswd'])>6) { if ($this->User->save($this->data)) { $this->Session->setFlash('Data was Saved'); } } } // this script works And in the Model controller, I got these lines of codes: function beforeSave() { $raw = $this->data; if(strlen($raw['User']['pswd'])>6){ md5($raw['User']['pswd']); } return true; } // this script failed to work The data was stored into the Database successfully but it was not undergone any MD5 encryption. I think that there must be some errors in the Model's script because I saw some errors flashed after the data was saved, but the screen that showed the errors immediately refreshed in a second after the data was saved successfully and I couldn't see the detail of the errors that caused the problem. Could you help me out please?

    Read the article

  • How to restore data on Power Failure using C++ programming on windows.

    - by Tarun
    Hi, In my program I am writing a file of my programs states. I am writing the file many times to the file during the program run, because the program changes some variables that that i need to store very frequently. Now, if , for some reasons my power fails. Then most of the time I loose data in that file. Please, tell me any mechanism which can protect data even if the power fails. (I have written C++ program on windows). Thank you

    Read the article

  • Can SQLite file copied successfully on the data folder of an unrooted android device ?

    - by student
    I know that in order to access the data folder on the device, it needs to be rooted. However, if I just want to copy the database from my assets folder to the data folder on my device, will the copying process works on an unrooted phone? The following is my Database Helper class. From logcat, I can verify that the methods call to copyDataBase(), createDataBase() and openDataBase() are returned successfully. However, I got this error message android.database.sqlite.SQLiteException: no such table: TABLE_NAME: when my application is executing rawQuery. I'm suspecting the database file is not copied successfully (cannot be too sure as I do not have access to data folder), yet the method call to copyDatabase() are not throwing any exception. What could it be? Thanks. ps: My device is still unrooted, I hope it is not the main cause of the error. public DatabaseHelper(Context context) { super(context, DB_NAME, null, 1); this.myContext = context; } public void createDataBase() throws IOException{ boolean dbExist = checkDataBase(); String s = new Boolean(dbExist).toString(); Log.d("dbExist", s ); if(dbExist){ //do nothing - database already exist Log.d("createdatabase","DB exists so do nothing"); }else{ this.getReadableDatabase(); try { copyDataBase(); Log.d("copydatabase","Successful return frm method call!"); } catch (IOException e) { throw new Error("Error copying database"); } } } private boolean checkDataBase(){ File dbFile = new File(DB_PATH + DB_NAME); return dbFile.exists(); } private void copyDataBase() throws IOException{ //Open your local db as the input stream InputStream myInput = null; myInput = myContext.getAssets().open(DB_NAME); Log.d("copydatabase","InputStream successful!"); // Path to the just created empty db String outFileName = DB_PATH + DB_NAME; //Open the empty db as the output stream OutputStream myOutput = new FileOutputStream(outFileName); //transfer bytes from the inputfile to the outputfile byte[] buffer = new byte[1024]; int length; while ((length = myInput.read(buffer))>0){ myOutput.write(buffer, 0, length); } //Close the streams myOutput.flush(); myOutput.close(); myInput.close(); } public void openDataBase() throws SQLException{ //Open the database String myPath = DB_PATH + DB_NAME; myDataBase = SQLiteDatabase.openDatabase(myPath, null, SQLiteDatabase.OPEN_READONLY); } /* @Override public synchronized void close() { if(myDataBase != null) myDataBase.close(); super.close(); }*/ public void close() { // NOTE: openHelper must now be a member of CallDataHelper; // you currently have it as a local in your constructor if (myDataBase != null) { myDataBase.close(); } } @Override public void onCreate(SQLiteDatabase db) { } @Override public void onUpgrade(SQLiteDatabase db, int oldVersion, int newVersion) { } }

    Read the article

  • Getting an Ajax response from Zend Framework Controller

    - by JavaLava
    I'm doing an Ajax request on one of my views to a Controller but I am unable to send back a response to the Ajax method. In the snippet below, I am trying to send the word 'hellopanda' back but in the alert message, I'll get data as an object. View : $.ajax({ type: "POST", url: "localhost/some-activity", data: dataString, success: function(data) { alert( "Data is: " + data); //do something with data }, error: function(data){ alert( "Data is: " + data); //do something with data }, onComplete: function(){ } }); Controller: public function someActivityAction(){ //do stuff echo "hellopanda"; } I'm pretty sure the echo is the problem. Any insights on to how to do a proper response to the view would be greatly appreciated.

    Read the article

  • What type of data should I send to view?

    - by Vizualni
    Hello, this question I've been asking myself since the day I started programming in MVC way. Should I send to view arrays filled with data or should I send it as on objects I retrieved from database? My model returns me data as objects. What would be the best way to create such a thing? $new_data = $model->find_by_id(1); echo $new_data->name; $new_data->name= "whatever"; $new_data->save(); For example. view.php echo $object->name; or echo $array['name'] Language is php :).

    Read the article

  • Is it possible to use variables and data in in-template css properties?

    - by aruman89
    <td style="width:77px" id="<%=id%>"><div class="condbar"><div class="condprogress" style="width:Data.condition%; background:condColor"></div> <div class="condvalue"><%=condition%><span>%</span></div></div></td> I want to set the width of a div .roster_condprogress as a percentage according to data and change the color accordingly (using a variable for that). What is a correct way to write this? (if it exists)

    Read the article

  • is it right way( safe) to assign post data value directly by name attibute value to a variable in

    - by I Like PHP
    i m working in PHP since one year, but now a days i got this way to assign post data value directly using name attribute . i m really curious to know the documentation about it.please refere me link regarding this . i explain by example here is my form <form method="post" action=""> <input type="text" name="userName" id="userName"> <input type="submit" name="doit" value="submit"> </form> to get the post data i always use $somevar=mysql_real_escape_string($_POST['userName']); but now i see another way $somevar= "userName"; i just want to know that is it safe n easy way??

    Read the article

  • How to take data from textarea and decrypt using javascript?

    - by user1657555
    I need to take data from a textarea on a website and decrypt it using a simple algorithm. The data is in the form of numbers separated by a comma. It also needs to read a space as a space. It looks like 42,54,57, ,57,40,57,44. Heres what I have so far: var my_textarea = $('textarea[name = "words"]').first(); var my_value = $(my_textarea).val(); var my_array = my_value.split(","); for (i=0; i < my_array.length; i++) { var nv = my_array - 124; var acv = nv + 34; var my_result = String.fromCharCode(acv); } prompt("", my_result);

    Read the article

  • PHP-FPM High Memory Usage

    - by Ruel
    I have a wordpress blog, that uses WP-SuperCache, and normally I get 100 visitors per day. With nginx + php-fpm it's blazing fast, and I have no regrets. One thing i noticed, php-fpm takes a lot of memory: top - 09:20:43 up 5 days, 15:53, 1 user, load average: 0.00, 0.00, 0.00 Tasks: 26 total, 1 running, 25 sleeping, 0 stopped, 0 zombie Cpu(s): 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 1048576k total, 329956k used, 718620k free, 0k buffers Swap: 0k total, 0k used, 0k free, 0k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 10226 www-data 15 0 145m 52m 4584 S 0.0 5.1 0:07.55 php-fpm 10223 www-data 16 0 141m 48m 4692 S 0.0 4.8 0:08.70 php-fpm 20277 www-data 15 0 138m 46m 4368 S 0.0 4.5 0:07.55 php-fpm 20259 www-data 15 0 133m 41m 4600 S 0.0 4.0 0:06.68 php-fpm 12201 www-data 15 0 133m 41m 4632 S 0.0 4.0 0:08.31 php-fpm 11586 www-data 15 0 132m 40m 4292 S 0.0 3.9 0:03.27 php-fpm 29822 www-data 15 0 128m 36m 4356 S 0.0 3.6 0:05.26 php-fpm 28427 mysql 15 0 200m 7300 4764 S 0.0 0.7 0:47.89 mysqld 10202 root 18 0 98.3m 4320 1204 S 0.0 0.4 0:03.80 php-fpm 22524 root 18 0 86064 3396 2652 S 0.0 0.3 0:16.74 sshd 9882 www-data 18 0 42052 2572 804 S 0.0 0.2 0:27.52 nginx 9884 www-data 18 0 42052 2560 804 S 0.0 0.2 0:26.26 nginx 9881 www-data 18 0 42064 2524 804 S 0.0 0.2 0:29.24 nginx 9879 www-data 18 0 42032 2480 804 S 0.0 0.2 0:29.58 nginx 23771 root 15 0 12176 1820 1316 S 0.0 0.2 0:00.08 bash 28344 root 22 0 11932 1416 1184 S 0.0 0.1 0:00.00 mysqld_safe 18167 root 16 0 62628 1208 648 S 0.0 0.1 0:00.55 sshd 25941 root 15 0 12612 1192 928 R 0.0 0.1 0:02.21 top 11573 root 15 0 20876 1168 592 S 0.0 0.1 0:00.67 crond 9878 root 18 0 41000 1112 284 S 0.0 0.1 0:00.00 nginx 21736 root 23 0 21648 936 716 S 0.0 0.1 0:00.00 xinetd 11585 root 18 0 46748 816 428 S 0.0 0.1 0:00.00 saslauthd 14125 root 12 -4 12768 768 452 S 0.0 0.1 0:00.00 udevd 1 root 18 0 10352 728 616 S 0.0 0.1 0:17.93 init 24564 root 15 0 5912 680 544 S 0.0 0.1 0:01.90 syslogd 11618 root 18 0 46748 548 160 S 0.0 0.1 0:00.00 saslauthd Here's my php-fpm config: [global] pid = run/php-fpm.pid error_log = log/php-fpm.log log_level = notice [www] listen = 127.0.0.1:9000 user = www-data group = www-data pm = dynamic pm.max_children = 50 pm.start_servers = 3 pm.min_spare_servers = 3 pm.max_spare_servers = 10 pm.max_requests = 500 Sometimes it goes up to 400MB. And I'm planning to add a new website on my VPS. Is this normal?

    Read the article

  • Unable to use TweetDeck on Windows due to "Ooops, TweetDeck can't find your data" and "Sorry, Adobe

    - by Matt
    I'm running Adobe AIR 1.5.2 (latest) on Windows 7 (64-bit RTM) and downloaded TweetDeck 0.31.1 (latest). When I run TweetDeck I get the following errors: Ooops, TweetDeck can't find your data and Sorry, Adobe AIR has a problem running on this computer Other AIR applications install and run fine. I've uninstalled both TweetDeck and AIR and reinstalled. Following the uninstalls I've also removed all on-disk references to both TweetDeck and AIR, but no luck. UPDATE: Using Process Monitor I did a trace of Tweetdeck from the moment it launched until the first error occurred. I saw the following information in the output of the trace: 1 5:22:18.6522338 PM TweetDeck.exe 5580 CreateFile D:\ProgramData\Microsoft\Windows\Start Menu\Programs\rs\??\d:\Use\myusername\AppData\Roaming\Adobe\AIR\ELS\TweetDeckFast.F9107117265DB7542C1A806C8DB837742CE14C21.1\PrivateEncryptedDatak NAME INVALID Desired Access: Generic Write, Read Attributes, Disposition: OverwriteIf, Options: Synchronous IO Non-Alert, Non-Directory File, Attributes: N, ShareMode: Read, Write, AllocationSize: 0 In this trace output, Tweetdeck.exe is trying to create the file D:\ProgramData\Microsoft\Windows\Start Menu\Programs\rs\??\d:\Use\myusername\AppData\Roaming\Adobe\AIR\ELS\TweetDeckFast.F9107117265DB7542C1A806C8DB837742CE14C21.1\PrivateEncryptedDatak but the path specified is invalid. When looking at the path you can see that it is indeed an invalid path. First, there’s the “??” portion which doesn’t exist in the file system since the “?” is an invalid character in Windows/NTFS file systems. Additionally, looking at this path, it actually seems to be composed of two parts (is the "??" a delimiter?): Part 1: D:\ProgramData\Microsoft\Windows\Start Menu\Programs\rs\?? Part 2: d:\Use\myusername\AppData\Roaming\Adobe\AIR\ELS \TweetDeckFast.F9107117265DB7542C1A806C8DB837742CE14C21.1\PrivateEncryptedDatak (the problem here is that d:\Use... doesn’t even exist. What seems to be happening here is that Tweetdeck is looking for the user credentials (the “PrivateEncryptedDatak” file) but it’s looking in the wrong place, can’t find the file, and hence the error that Tweetdeck is giving (shown in the screenshot). I'm trying to determine how TweetDeck is getting this path. I searched the contents of all files on my hard disk hoping to find some TweetDeck or Adobe AIR configuration file containing this incorrect path, but I was unable to find anything. UPDATE: See Carl's comment regarding directory junctions and symbolic links under my accepted answer. This ended up being the problem. Edit by Gnoupi: People, the answer section is there to provide an actual ANSWER, not to say you have the same issue. It doesn't help anyone that you have the same problem. Eventually, if you think this is really worth mentioning, put it as a comment under the question. But simply, if what you want to add is not an answer to the question, then don't post it as an answer. This is not a forum, I recommend new users to read the FAQ: http://superuser.com/faq

    Read the article

  • How to remove RAID flag on unstriped drive without losing data?

    - by Alex Folland
    I have a Gigabyte Z68X-UD4-B3 motherboard. It advertises this new thing called "XHD", which is like RAID but makes a SSD and traditional-style drive work together to enable high speed with high capacity. I don't want to use this feature, and I already have Windows 7 64 installed without using this feature. When I first installed my 2 hard drives (1 SSD and 1 traditional-style drive) in my machine and booted it up for the first time, it ran a program from the mobo that asked me if I wanted to set up XHD. Thinking it would go to some config screen, I said yes. It immediately started doing something with my drives and finished. I considered that strange, but figured it wouldn't matter when I simply install Windows onto my SSD only. I now have my BIOS and Windows running in AHCI mode with no RAID arrays and separate drives. My SSD is one of those new Corsair Force GT drives which loses power every so often, causing Windows to BSOD. I've figured everything out about this problem, including installing the latest firmware from Corsair, and the only way to fix it at this point is by installing Intel Rapid Storage Technology to control AHCI instead of Windows, since the Windows AHCI driver disables the drive's power every once in a while and can't be configured not to do so. I've tried installing Intel Rapid Storage Technology. When I reboot my machine after doing so, it BSODs just after the Windows logo. I've figured out this is because my SSD and my traditional drive are flagged as RAID, as seen in the "Intel Matrix Storage Manager" program found by switching the BIOS hard drive handling to "RAID" mode. This is due to the XHD auto-config program I mentioned earlier. Normally, the BIOS is set to AHCI, and when the drives boot in AHCI mode, they work perfectly. So, I've concluded the data is stored in AHCI mode but the drives' flags are set to RAID. I've figured out that I can accomplish my objective by using the "Intel Matrix Storage Manager" program on the mobo (with "Reset disks to non-RAID"), but doing so would cause it to completely wipe the drives I select. I want to simply toggle these flags from RAID to AHCI so Intel Rapid Storage Technology doesn't fail and cause a BSOD upon booting, but without wiping the drives.

    Read the article

  • Can I save & store a user's submission in a way that proves that the data has not been altered, and that the timestamp is accurate?

    - by jt0dd
    There are many situations where the validity of the timestamp attached to a certain post (submission of information) might be invaluable for the post owner's legal usage. I'm not looking for a service to achieve this, as requested in this great question, but rather a method for the achievement of such a service. For the legal (in most any law system) authentication of text content and its submission time, the owner of the content would need to prove: that the timestamp itself has not been altered and was accurate to begin with. that the text content linked to the timestamp had not been altered I'd like to know how to achieve this via programming (not a language-specific solution, but rather the methodology behind the solution). Can a timestamp be validated to being accurate to the time that the content was really submitted? Can data be stored in a form that it can be read, but not written to, in a proven way? In other words, can I save & store a user's submission in a way that proves that the data has not been altered, and that the timestamp is accurate? I can't think of any programming method that would make this possible, but I am not the most experienced programmer out there. Based on MidnightLightning's answer to the question I cited, this sort of thing is being done. Clarification: I'm looking for a method (hashing, encryption, etc) that would allow an average guy like me to achieve the desired effect through programming. I'm interested in this subject for the purpose of Defensive Publication. I'd like to learn a method that allows an every-day programmer to pick up his computer, write a program, pass information through it, and say: I created this text at this moment in time, and I can prove it. This means the information should be protected from the programmer who writes the code as well. Perhaps a 3rd party API would be required. I'm ok with that.

    Read the article

  • C#, AES encryption check!

    - by Data-Base
    I have this code for AES encryption, can some one verify that this code is good and not wrong? it works fine, but I'm more concern about the implementation of the algorithm // Plaintext value to be encrypted. //Passphrase from which a pseudo-random password will be derived. //The derived password will be used to generate the encryption key. //Password can be any string. In this example we assume that this passphrase is an ASCII string. //Salt value used along with passphrase to generate password. //Salt can be any string. In this example we assume that salt is an ASCII string. //HashAlgorithm used to generate password. Allowed values are: "MD5" and "SHA1". //SHA1 hashes are a bit slower, but more secure than MD5 hashes. //PasswordIterations used to generate password. One or two iterations should be enough. //InitialVector (or IV). This value is required to encrypt the first block of plaintext data. //For RijndaelManaged class IV must be exactly 16 ASCII characters long. //KeySize. Allowed values are: 128, 192, and 256. //Longer keys are more secure than shorter keys. //Encrypted value formatted as a base64-encoded string. public static string Encrypt(string PlainText, string Password, string Salt, string HashAlgorithm, int PasswordIterations, string InitialVector, int KeySize) { byte[] InitialVectorBytes = Encoding.ASCII.GetBytes(InitialVector); byte[] SaltValueBytes = Encoding.ASCII.GetBytes(Salt); byte[] PlainTextBytes = Encoding.UTF8.GetBytes(PlainText); PasswordDeriveBytes DerivedPassword = new PasswordDeriveBytes(Password, SaltValueBytes, HashAlgorithm, PasswordIterations); byte[] KeyBytes = DerivedPassword.GetBytes(KeySize / 8); RijndaelManaged SymmetricKey = new RijndaelManaged(); SymmetricKey.Mode = CipherMode.CBC; ICryptoTransform Encryptor = SymmetricKey.CreateEncryptor(KeyBytes, InitialVectorBytes); MemoryStream MemStream = new MemoryStream(); CryptoStream CryptoStream = new CryptoStream(MemStream, Encryptor, CryptoStreamMode.Write); CryptoStream.Write(PlainTextBytes, 0, PlainTextBytes.Length); CryptoStream.FlushFinalBlock(); byte[] CipherTextBytes = MemStream.ToArray(); MemStream.Close(); CryptoStream.Close(); return Convert.ToBase64String(CipherTextBytes); } public static string Decrypt(string CipherText, string Password, string Salt, string HashAlgorithm, int PasswordIterations, string InitialVector, int KeySize) { byte[] InitialVectorBytes = Encoding.ASCII.GetBytes(InitialVector); byte[] SaltValueBytes = Encoding.ASCII.GetBytes(Salt); byte[] CipherTextBytes = Convert.FromBase64String(CipherText); PasswordDeriveBytes DerivedPassword = new PasswordDeriveBytes(Password, SaltValueBytes, HashAlgorithm, PasswordIterations); byte[] KeyBytes = DerivedPassword.GetBytes(KeySize / 8); RijndaelManaged SymmetricKey = new RijndaelManaged(); SymmetricKey.Mode = CipherMode.CBC; ICryptoTransform Decryptor = SymmetricKey.CreateDecryptor(KeyBytes, InitialVectorBytes); MemoryStream MemStream = new MemoryStream(CipherTextBytes); CryptoStream cryptoStream = new CryptoStream(MemStream, Decryptor, CryptoStreamMode.Read); byte[] PlainTextBytes = new byte[CipherTextBytes.Length]; int ByteCount = cryptoStream.Read(PlainTextBytes, 0, PlainTextBytes.Length); MemStream.Close(); cryptoStream.Close(); return Encoding.UTF8.GetString(PlainTextBytes, 0, ByteCount); } Thank you

    Read the article

  • SOA Suite 11g Native Format Builder Complex Format Example

    - by bob.webster
    This rather long posting details the steps required to process a grouping of fixed length records using Format Builder.   If it’s 10 pm and you’re feeling beat you might want to leave this until tomorrow.  But if it’s 10 pm and you need to get a Format Builder Complex template done, read on… The goal is to process individual orders from a file using the 11g File Adapter and Format Builder Sample Data =========== 001Square Widget            0245.98 102Triagular Widget         1120.00 403Circular Widget           0099.45 ORD8898302/01/2011 301Hexagon Widget         1150.98 ORD6735502/01/2011 The records are fixed length records representing a number of logical Order records. Each order record consists of a number of item records starting with a 3 digit number, followed by a single Summary Record which starts with the constant ORD. How can this file be processed so that the first poll returns the first order? 001Square Widget            0245.98 102Triagular Widget         1120.00 403Circular Widget           0099.45 ORD8898302/01/2011 And the second poll returns the second order? 301Hexagon Widget           1150.98 ORD6735502/01/2011 Note: if you need more than one order per poll, that’s also possible, see the “Multiple Messages” field in the “File Adapter Step 6 of 9” snapshot further down.   To follow along with this example you will need - Studio Edition Version 11.1.1.4.0    with the   - SOA Extension for JDeveloper 11.1.1.4.0 installed Both can be downloaded from here:  http://www.oracle.com/technetwork/middleware/soasuite/downloads/index.html You will not need a running WebLogic Server domain to complete the steps and Format Builder tests in this article.     Start with a SOA Composite containing a File Adapter The Format Builder is part of the File Adapter so start by creating a new SOA Project and Composite. Here is a quick summary for those not familiar with these steps - Start JDeveloper - From the Main Menu choose File->New - In the New Gallery window that opens Expand the “General” category and Select the Applications node.   Then choose SOA Application from the Items section on the right.  Finally press the OK button. - In Step 1 of the “Create SOA Application wizard” that appears enter an Application Name and an Directory of your     choice,   then press the Next button. - In Step 2 of the “Create SOA Application wizard”, press the Next button leaving all entries as defaulted. - In Step 3 of the “Create SOA Application wizard”, Enter a composite name of your choice and Press the Finish   Button These steps result in a new Application and SOA Project. The SOA Project contains a composite.xml file which is opened and shown below. For our example we have not defined a Mediator or a BPEL process to minimize the steps, but one or the other would eventually be needed to use the File Adapter we are about to create. Drag and drop the File Adapter icon from the Component Pallette onto either the LEFT side of the diagram under “Exposed Services” or the right side under “External References”.  (See the Green Circle in the image below).  Placing the adapter on the left side would indicate the file being processed is inbound to the composite, if the adapter is placed on the right side then the data is outbound to a file.     Note that the same Format Builder definition can be used in both directions.  For example we could use the format with a File Adapter on the left side of the composite to parse fixed data into XML, modify the data in our Composite or BPEL process and then use the same Format Builder definition with a File adapter on the right side of the composite to write the data back out in the same fixed data format When the File Adapter is dropped on the Composite the File Adapter Wizard Appears. Skip Past the first page, Step 1 of 9 by pressing the Next button. In Step 2 enter a service name of your choice as shown below, then press Next   When the Native Format Builder appears, skip the welcome page by pressing next. Also press the Next button to accept the settings on Step 3 of 9 On Step 4, select Read File and press the Next button as shown below.   On Step 5 enter a directory that will contain a file with the input data, then  Press the Next button as shown below. In step 6, enter *.txt or another file format to select input files from the input directory mentioned in step 5. ALSO check the “Files contain Multiple Messages” checkbox and set the “Publish Messages in Batches of” field to 1.  The value can be set higher to increase the number of logical order group records returned on each poll of the file adapter.  In other words, it determines the number of Orders that will be sent to each instance of a Mediator or Composite processing using the File Adapter.   Skip Step 7 by pressing the Next button In Step 8 press the Gear Icon on the right side to load the Native Format Builder.       Native Format Builder  appears Before diving into the format, here is an overview of the process. Approach - Bottom up Assuming an Order is a grouping of item records and a summary record…. - Define a separate  Complex Type for each Record Type found in the group.    (One for itemRecord and one for summaryRecord) - Define a Complex Type to contain the Group of Record types defined above   (LogicalOrderRecord) - Define a top level element to represent an order.  (order)   The order element will be of type LogicalOrderRecord   Defining the Format In Step 1 select   “Create new”  and  “Complex Type” and “Next”   In Step two browse to and select a file containing the test data shown at the start of this article. A link is provided at the end of this article to download a file containing the test data. Press the Next button     In Step 3 Complex types must be define for each type of input record. Select the Root-Element and Click on the Add Complex Type icon This creates a new empty complex type definition shown below. The fastest way to create the definition is to highlight the first line of the Sample File data and drag the line onto the  <new_complex_type> Format Builder introspects the data and provides a grid to define additional fields. Change the “Complex Type Name” to  “itemRecord” Then click on the ruler to indicate the position of fixed columns.  Drag the red triangle icons to the exact columns if necessary. Double click on an existing red triangle to remove an unwanted entry. In the case below fields are define in columns 0-3, 4-28, 29-eol When the field definitions are correct, press the “Generate Fields” button. Field entries named C1, C2 and C3 will be created as shown below. Click on the field names and rename them from C1->itemNum, C2->itemDesc and C3->itemCost  When all the fields are correctly defined press OK to save the complex type.        Next, the process is repeated to define a Complex Type for the SummaryRecord. Select the Root-Element in the schema tree and press the new complex type icon Then highlight and drag the Summary Record from the sample data onto the <new_complex_type>   Change the complex type name to “summaryRecord” Mark the fixed fields for Order Number and Order Date. Press the Generate Fields button and rename C1 and C2 to itemNum and orderDate respectively.   The last complex type to be defined is a type to hold the group of items and the summary record. Select the Root-Element in the schema tree and click the new complex type icon Select the “<new_complex_type>” entry and click the pencil icon   On the Complex Type Details page change the name and type of each input field. Change line 1 to be named item and set the Type  to “itemRecord” Change line 2 to be named summary and set the Type to “summaryRecord” We also need to indicate that itemRecords repeat in the input file. Click the pencil icon at the right side of the item line. On the Edit Details page change the “Max Occurs” entry from 1 to UNBOUNDED. We also need to indicate how to identify an itemRecord.  Since each item record has “.” in column 32 we can use this fact to differentiate an item record from a summary record. Change the “Look Ahead” field to value 32 and enter a period in the “Look For” field Press the OK button to save entry.     Finally, its time to create a top level element to represent an order. Select the “Root-Element” in the schema tree and press the New element icon Click on the <new_element> and press the pencil icon.   Set the Element Name to “order” and change the Data Type to “logicalOrderRecord” Press the OK button to save the element definition.   The final definition should match the screenshot below. Press the Next Button to view the definition source.     Press the Test Button to test the definition   Press the Green Triangle Icon to run the test.   And we are presented with an unwelcome error. The error states that the processor ran out of data while working through the definition. The processor was unable to differentiate between itemRecords and summaryRecords and therefore treated the entire file as a list of itemRecords.  At end of file, the “summary” portion of the logicalOrderRecord remained unprocessed but mandatory.   This root cause of this error is the loss of our “lookAhead” definition used to identify itemRecords. This appears to be a bug in the  Native Format Builder 11.1.1.4.0 Luckily, a simple workaround exists. Press the Cancel button and return to the “Step 4 of 4” Window. Manually add    nxsd:lookAhead="32" nxsd:lookFor="."   attributes after the maxOccurs attribute of the item element. as shown in the highlighted text below.   When the lookAhead and lookFor attributes have been added Press the Test button and on the Test page press the Green Triangle. The test is now successful, the first order in the file is returned by the File Adapter.     Below is a complete listing of the Result XML from the right column of the screen above   Try running it The downloaded input test file and completed schema file can be used for testing without following all the Native Format Builder steps in this example. Use the following link to download a file containing the sample data. Download Sample Input Data This is the best approach rather than cutting and pasting the input data at the top of the article.  Since the data is fixed length it’s very important to watch out for trailing spaces in the data and to ensure an eol character at the end of every line. The download file is correctly formatted. The final schema definition can be downloaded at the following link Download Completed Schema Definition   - Save the inputData.txt file to a known location like the xsd folder in your project. - Save the inputData_6.xsd file to the xsd folder in your project. - At step 1 in the Native Format Builder wizard  (as shown above) check the “Edit existing” radio button,    then browse and select the inputData_6.xsd file - At step 2 of the Format Builder configuration Wizard (as shown above) supply the path and filename for    the inputData.txt file. - You can then proceed to the test page and run a test. - Remember the wizard bug will drop the lookAhead and lookFor attributes,  you will need to manually add   nxsd:lookAhead="32" nxsd:lookFor="."    after the maxOccurs attribute of the item element in the   LogicalOrderRecord Complex Type.  (as shown above)   Good Luck with your Format Project

    Read the article

  • SQLAuthority News – Speaking Sessions at TechEd India – 3 Sessions – 1 Panel Discussion

    - by pinaldave
    Microsoft Tech-Ed India 2010 is considered as the major Technology event of the year for various IT professionals and developers. This event will feature a comprehensive forum in order   to learn, connect, explore, and evolve the current technologies we have today. I would recommend this event to you since here you will learn about today’s cutting-edge trends, thereby enhancing your work profile and getting ahead of the rest. But, the most important benefit of all might be the networking opportunity that that you can attain by attending the forum. You can build personal connections with various Microsoft experts and peers that will last even far beyond this event! It also feels good to let you know that I will be speaking at this year’s event! So, here are the sessions that await you in this mega-forum. Session 1: True Lies of SQL Server – SQL Myth Buster Date: April 12, 2010  Time: 11:15pm – 11:45pm In this 30-minute demo session, I am going to briefly demonstrate few SQL Server Myth and their resolution backing up with some demo. This demo session is a must-attend for all developers and administrators who would come to the event. This is going to be a very quick yet  fun session. Session 2: Master Data Services in Microsoft SQL Server 2008 R2 Date: April 12, 2010  Time: 2:30pm-3:30pm SQL Server Master Data Services will ship with SQL Server 2008 R2 and will improve Microsoft’s platform appeal. This session provides an in depth demonstration of MDS features and highlights important usage scenarios. Master Data Services enables consistent decision making by allowing you to create, manage and propagate changes from single master view of your business entities. Also with MDS – Master Data-hub which is the vital component helps ensure reporting consistency across systems and deliver faster more accurate results across the enterprise. We will talk about establishing the basis for a centralized approach to defining, deploying, and managing master data in the enterprise. Session 3: Developing with SQL Server Spatial and Deep Dive into Spatial Indexing Date: April 14, 2010 Time: 5:00pm-6:00pm Microsoft SQL Server 2008 delivers new spatial data types that enable you to consume, use, and extend location-based data through spatial-enabled applications. Attend this session to learn how to use spatial functionality in next version of SQL Server to build and optimize spatial queries. This session outlines the new geography data type to store geodetic spatial data and perform operations on it, use the new geometry data type to store planar spatial data and perform operations on it, take advantage of new spatial indexes for high performance queries, use the new spatial results tab to quickly and easily view spatial query results directly from within Management Studio, extend spatial data capabilities by building or integrating location-enabled applications through support for spatial standards and specifications and much more. Panel Discussion: Harness the power of Web – SEO and Technical Blogging Date: April 12, 2010 Time: 5:00pm-6:00pm Here you will learn lots of tricks and tips about SEO and Technical Blogging from various Industry Technical Blogging Experts. This event will surely be one of the most important Tech conventions of 2010. TechEd is going to be a very busy time for Tech developers and enthusiasts, since every evening there will be a fun session to attend. If you are interested in any of the above topics for every session, I suggest that you visit each of them as you will learn so many things about the topic to be discussed. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: MVP, Pinal Dave, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQLAuthority Author Visit, SQLAuthority News, T SQL, Technology Tagged: TechEd, TechEdIn

    Read the article

  • SQL SERVER – What is Spatial Database? – Developing with SQL Server Spatial and Deep Dive into Spati

    - by pinaldave
    What is Spatial Database? A spatial database is a database that is optimized to store and query data related to objects in space, including points, lines and polygons. While typical databases can understand various numeric and character types of data, additional functionality needs to be added for databases to process spatial data types. (Source: Wikipedia) Today I will be talking about the same subject at Microsoft TechEd India. If you want to learn about how to spatial aspect of data and how to integrate them with SQL Server this is the perfect session for you. Spatial is very special concept of SQL Server and I really like how it is implemented in SQL Server. In general Performance Tuning and Query Optimization is something I always have enjoyed in my professional life. Index are my best friends and many time, by implementing and many time by removing I have improved the performance of the system. In this session, I will be talking about Index along with Spatial Data. As Spatial Database is very interesting concept, I will cover super short but very interesting 10 quick slides about this subject. I will make sure in very first 20 mins, you will understand following topics Introduction to Spatial Database One line definition Understanding Spatial Indexing Index Internals Query/Performance Tuning Query Hinting/Cost Analysis Spatial Index Catalog Views Performance Troubleshooting Finding Optimal Index using Spatial Index SP Common Errors Index Maintenance This slides decks will be followed by around 30 mins demo which will have story of geometry, geography, index internals and performance tuning. If you are interested in learning how GIS works and how SQL Server out of the box supports this wonderful tools, you will really like how the story is told. I am sure all people who attend the event will know how the Bangalore is positioned on the map of India. I will take example of Bangalore and Hyderabad and demonstrate how index can improve the performance. Well there are lots of story to tell in the session, and I will be opening this session with the beautiful script of Botticelli’s Birth of Venus created by Michael J. Swart. I will also demonstrate few real life scenario where I will be talking about Spatial Database and its usage. Do not miss this session. At the end of session there will be book awarded to best participant. My session details: Session 3: Developing with SQL Server Spatial and Deep Dive into Spatial Indexing Date: April 14, 2010 Time: 5:00pm-6:00pm Microsoft SQL Server 2008 delivers new spatial data types that enable you to consume, use, and extend location-based data through spatial-enabled applications. Attend this session to learn how to use spatial functionality in next version of SQL Server to build and optimize spatial queries. This session outlines the new geography data type to store geodetic spatial data and perform operations on it, use the new geometry data type to store planar spatial data and perform operations on it, take advantage of new spatial indexes for high performance queries, use the new spatial results tab to quickly and easily view spatial query results directly from within Management Studio, extend spatial data capabilities by building or integrating location-enabled applications through support for spatial standards and specifications and much more. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Index, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, SQLAuthority Author Visit, T SQL, Technology Tagged: Spatial Database

    Read the article

  • Continuous Integration for SQL Server Part II – Integration Testing

    - by Ben Rees
    My previous post, on setting up Continuous Integration for SQL Server databases using GitHub, Bamboo and Red Gate’s tools, covered the first two parts of a simple Database Continuous Delivery process: Putting your database in to a source control system, and, Running a continuous integration process, each time changes are checked in. However there is, of course, a lot more to to Continuous Delivery than that. Specifically, in addition to the above: Putting some actual integration tests in to the CI process (otherwise, they don’t really do much, do they!?), Deploying the database changes with a managed, automated approach, Monitoring what you’ve just put live, to make sure you haven’t broken anything. This post will detail how to set up a very simple pipeline for implementing the first of these (continuous integration testing). NB: A lot of the setup in this post is built on top of the configuration from before, so it might be difficult to implement this post without running through part I first. There’ll then be a third post on automated database deployment followed by a final post dealing with the last item – monitoring changes on the live system. In the previous post, I used a mixture of Red Gate products and other 3rd party software – GitHub and Atlassian Bamboo specifically. This was partly because I believe most people work in an heterogeneous environment, using software from different vendors to suit their purposes and I wanted to show how this could work for this process. For example, you could easily substitute Atlassian’s BitBucket or Stash for GitHub, depending on your needs, or use an alternative CI server such as TeamCity, TFS or Jenkins. However, in this, post, I’ll be mostly using Red Gate products only (other than tSQLt). I would do this, firstly because I work for Red Gate. However, I also think that in the area of Database Delivery processes, nobody else has the offerings to implement this process fully – so I didn’t have any choice!   Background on Continuous Delivery For me, a great source of information on what makes a proper Continuous Delivery process is the Jez Humble and David Farley classic: Continuous Delivery – Reliable Software Releases through Build, Test, and Deployment Automation This book is not of course, primarily about databases, and the process I outline here and in the previous article is a gross simplification of what Jez and David describe (not least because it’s that much harder for databases!). However, a lot of the principles that they describe can be equally applied to database development and, I would argue, should be. As I say however, what I describe here is a very simple version of what would be required for a full production process. A couple of useful resources on handling some of these complexities can be found in the following two references: Refactoring Databases – Evolutionary Database Design, by Scott J Ambler and Pramod J. Sadalage Versioning Databases – Branching and Merging, by Scott Allen In particular, I don’t deal at all with the issues of multiple branches and merging of those branches, an issue made particularly acute by the use of GitHub. The other point worth making is that, in the words of Martin Fowler: Continuous Delivery is about keeping your application in a state where it is always able to deploy into production.   I.e. we are not talking about continuously delivery updates to the production database every time someone checks in an amendment to a stored procedure. That is possible (and what Martin calls Continuous Deployment). However, again, that’s more than I describe in this article. And I doubt I need to remind DBAs or Developers to Proceed with Caution!   Integration Testing Back to something practical. The next stage, building on our set up from the previous article, is to add in some integration tests to the process. As I say, the CI process, though interesting, isn’t enormously useful without some sort of test process running. For this we’ll use the tSQLt framework, an open source framework designed specifically for running SQL Server tests. tSQLt is part of Red Gate’s SQL Test found on http://www.red-gate.com/products/sql-development/sql-test/ or can be downloaded separately from www.tsqlt.org - though I’ll provide a step-by-step guide below for setting this up. Getting tSQLt set up via SQL Test Click on the link http://www.red-gate.com/products/sql-development/sql-test/ and click on the blue Download button to download the Red Gate SQL Test product, if not already installed. Follow the install process for SQL Test to install the SQL Server Management Studio (SSMS) plugin on to your machine, if not already installed. Open SSMS. You should now see SQL Test under the Tools menu:   Clicking this link will give you the basic SQL Test dialogue: As yet, though we’ve installed the SQL Test product we haven’t yet installed the tSQLt test framework on to any particular database. To do this, we need to add our RedGateApp database using this dialogue, by clicking on the + Add Database to SQL Test… link, selecting the RedGateApp database and clicking the Add Database link:   In the next screen, SQL Test describes what will be installed on the database for the tSQLt framework. Also in this dialogue, uncheck the “Add SQL Cop tests” option (shown below). SQL Cop is a great set of pre-defined tests that work within the tSQLt framework to check the general health of your SQL Server database. However, we won’t be using them in this particular simple example: Once you’ve clicked on the OK button, the changes described in the dialogue will be made to your database. Some of these are shown in the left-hand-side below: We’ve now installed the framework. However, we haven’t actually created any tests, so this will be the next step. But, before we proceed, we’ve made an update to our database so should, again check this in to source control, adding comments as required:   Also worth a quick check that your build still runs with the new additions!: (And a quick check of the RedGateAppCI database shows that the changes have been made).   Creating and Testing a Unit Test There are, of course, a lot of very interesting unit tests that you could and should set up for a database. The great thing about the tSQLt framework is that you can write these in SQL. The example I’m going to use here is pretty Mickey Mouse – our database table is going to include some email addresses as reference data and I want to check whether these are all in a correct email format. Nothing clever but it illustrates the process and hopefully shows the method by which more interesting tests could be set up. Adding Reference Data to our Database To start, I want to add some reference data to my database, and have this source controlled (as well as the schema). First of all I need to add some data in to my solitary table – this can be done a number of ways, but I’ll do this in SSMS for simplicity: I then add some reference data to my table: Currently this reference data just exists in the database. For proper integration testing, this needs to form part of the source-controlled version of the database – and so needs to be added to the Git repository. This can be done via SQL Source Control, though first a Primary Key needs to be added to the table. Right click the table, select Design, then right-click on the first “id” row. Then click on “Set Primary Key”: NB: once this change is made, click Save to save the change to the table. Then, to source control this reference data, right click on the table (dbo.Email) and selecting the following option:   In the next screen, link the data in the Email table, by selecting it from the list and clicking “save and close”: We should at this point re-commit the changes (both the addition of the Primary Key, and the data) to the Git repo. NB: From here on, I won’t show screenshots for the GitHub side of things – it’s the same each time: whenever a change is made in SQL Source Control and committed to your local folder, you then need to sync this in the GitHub Windows client (as this is where the build server, Bamboo is taking it from). An interesting point to note here, when these changes are committed in SQL Source Control (right-click database and select “Commit Changes to Source Control..”): The display gives a warning about possibly needing a migration script for the “Add Primary Key” step of the changes. This isn’t actually necessary in this case, but this mechanism would allow you to create override scripts to replace the default change scripts created by the SQL Compare engine (which runs underneath SQL Source Control). Ignoring this message (!), we add a comment and commit the changes to Git. I then sync these, run a build (or the build gets run automatically), and check that the data is being deployed over to the target RedGateAppCI database:   Creating and Running the Test As I mention, the test I’m going to use here is a very simple one - are the email addresses in my reference table valid? This isn’t of course, a full test of email validation (I expect the email addresses I’ve chosen here aren’t really the those of the Fab Four) – but just a very basic check of format used. I’ve taken the relevant SQL from this Stack Overflow article. In SSMS select “SQL Test” from the Tools menu, then click on + New Test: In the next screen, give your new test a name, and also enter a name in the Test Class box (test classes are schemas that help you keep things organised). Also check that the database in which the test is going to be created is correct – RedGateApp in this example: Click “Create Test”. After closing a couple of subsequent dialogues, you’ll see a dummy script for the test, that needs filling in:   We now need to define the SQL for our test. As mentioned before, tSQLt allows you to write your unit tests in T-SQL, and the code I’m going to use here is as below. This needs to be copied and pasted in to the query window, to replace the default given by tSQLt: –  Basic email check test ALTER PROCEDURE [MyChecks].[test Check Email Addresses] AS BEGIN SET NOCOUNT ON         Declare @Output VarChar(max)     Set @Output = ”       SELECT  @Output = @Output + Email +Char(13) + Char(10) FROM dbo.Email WHERE email NOT LIKE ‘%_@__%.__%’       If @Output > ”         Begin             Set @Output = Char(13) + Char(10)                           + @Output             EXEC tSQLt.Fail@Output         End   END;   Once this script is entered, hit execute to add the Stored Procedure to the database. Before committing the test to source control,  it’s worth just checking that it works! For a positive test, click on “SQL Test” from the Tools menu, then click Run Tests. You should see output like the following: - a green tick to indicate success! But of course, what we also need to do is test that this is actually doing something by showing a failed test. Edit one of the email addresses in your table to an incorrect format: Now, re-run the same SQL Test as before and you’ll see the following: Great – we now know that our test is really doing something! You’ll also see a useful error message at the bottom of SSMS: (leave the email address as invalid for now, for the next steps). The next stage is to check this new test in to source control again, by right-clicking on the database and checking in the changes with a commit message (and not forgetting to sync in the GitHub client):   Checking that the Tests are Running as Integration Tests After the changes above are made, and after a build has run on Bamboo (manual or automatic), looking at the Stored Procedures for the RedGateAppCI, the SPROC for the new test has been moved over to the database. However this is not exactly what we were after. We didn’t want to just copy objects from one database to another, but actually run the tests as part of the build/integration test process. I.e. we’re continuously checking any changes we make (in this case, to the reference data emails), to ensure we’re not breaking a test that we’ve set up. The behaviour we want to see is that, if we check in static data that is incorrect (as we did in step 9 above) and we have the tSQLt test set up, then our build in Bamboo should fail. However, re-running the build shows the following: - sadly, a successful build! To make sure the tSQLt tests are run as part of the integration test, we need to amend a switch in the Red Gate CI config file. First, navigate to file sqlCI.targets in your working folder: Edit this document, make the following change, save the document, then commit and sync this change in the GitHub client: <!-- tSQLt tests --> <!-- Optional --> <!-- To run tSQLt tests in source control for the database, enter true. --> <enableTsqlt>true</enableTsqlt> Now, if we re-run the build in Bamboo (NB: I’ve moved to a new server here, hence different address and build number): - superb, a broken build!! The error message isn’t great here, so to get more detailed info, click on the full build log link on this page (below the fold). The interesting part of the log shown is towards the bottom. Pulling out this part:   21-Jun-2013 11:35:19 Build FAILED. 21-Jun-2013 11:35:19 21-Jun-2013 11:35:19 "C:\Users\Administrator\bamboo-home\xml-data\build-dir\RGA-RGP-JOB1\sqlCI.proj" (default target) (1) -> 21-Jun-2013 11:35:19 (sqlCI target) -> 21-Jun-2013 11:35:19 EXEC : sqlCI error occurred: RedGate.Deploy.SqlServerDbPackage.Shared.Exceptions.InvalidSqlException: Test Case Summary: 1 test case(s) executed, 0 succeeded, 1 failed, 0 errored. [C:\Users\Administrator\bamboo-home\xml-data\build-dir\RGA-RGP-JOB1\sqlCI.proj] 21-Jun-2013 11:35:19 EXEC : sqlCI error occurred: [MyChecks].[test Check Email Addresses] failed: [C:\Users\Administrator\bamboo-home\xml-data\build-dir\RGA-RGP-JOB1\sqlCI.proj] 21-Jun-2013 11:35:19 EXEC : sqlCI error occurred: ringo.starr@beatles [C:\Users\Administrator\bamboo-home\xml-data\build-dir\RGA-RGP-JOB1\sqlCI.proj] 21-Jun-2013 11:35:19 EXEC : sqlCI error occurred: [C:\Users\Administrator\bamboo-home\xml-data\build-dir\RGA-RGP-JOB1\sqlCI.proj] 21-Jun-2013 11:35:19 EXEC : sqlCI error occurred: +----------------------+ [C:\Users\Administrator\bamboo-home\xml-data\build-dir\RGA-RGP-JOB1\sqlCI.proj] 21-Jun-2013 11:35:19 EXEC : sqlCI error occurred: |Test Execution Summary| [C:\Users\Administrator\bamboo-home\xml-data\build-dir\RGA-RGP-JOB1\sqlCI.proj]   As a final check, we should make sure that, if we now fix this error, the build succeeds. So in SSMS, I’m going to correct the invalid email address, then check this change in to SQL Source Control (with a comment), commit to GitHub, and re-run the build:   This should have fixed the build: It worked! Summary This has been a very quick run through the implementation of CI for databases, including tSQLt tests to test whether your database updates are working. The next post in this series will focus on automated deployment – we’ve tested our database changes, how can we now deploy these to target sites?  

    Read the article

  • Flow-Design Cheat Sheet &ndash; Part I, Notation

    - by Ralf Westphal
    You want to avoid the pitfalls of object oriented design? Then this is the right place to start. Use Flow-Oriented Analysis (FOA) and –Design (FOD or just FD for Flow-Design) to understand a problem domain and design a software solution. Flow-Orientation as described here is related to Flow-Based Programming, Event-Based Programming, Business Process Modelling, and even Event-Driven Architectures. But even though “thinking in flows” is not new, I found it helpful to deviate from those precursors for several reasons. Some aim at too big systems for the average programmer, some are concerned with only asynchronous processing, some are even not very much concerned with programming at all. What I was looking for was a design method to help in software projects of any size, be they large or tiny, involing synchronous or asynchronous processing, being local or distributed, running on the web or on the desktop or on a smartphone. That´s why I took ideas from all of the above sources and some additional and came up with Event-Based Components which later got repositioned and renamed to Flow-Design. In the meantime this has generated some discussion (in the German developer community) and several teams have started to work with Flow-Design. Also I´ve conducted quite some trainings using Flow-Orientation for design. The results are very promising. Developers find it much easier to design software using Flow-Orientation than OOAD-based object orientation. Since Flow-Orientation is moving fast and is not covered completely by a single source like a book, demand has increased for at least an overview of the current state of its notation. This page is trying to answer this demand by briefly introducing/describing every notational element as well as their translation into C# source code. Take this as a cheat sheet to put next to your whiteboard when designing software. However, please do not expect any explanation as to the reasons behind Flow-Design elements. Details on why Flow-Design at all and why in this specific way you´ll find in the literature covering the topic. Here´s a resource page on Flow-Design/Event-Based Components, if you´re able to read German. Notation Connected Functional Units The basic element of any FOD are functional units (FU): Think of FUs as some kind of software code block processing data. For the moment forget about classes, methods, “components”, assemblies or whatever. See a FU as an abstract piece of code. Software then consists of just collaborating FUs. I´m using circles/ellipses to draw FUs. But if you like, use rectangles. Whatever suites your whiteboard needs best.   The purpose of FUs is to process input and produce output. FUs are transformational. However, FUs are not called and do not call other FUs. There is no dependency between FUs. Data just flows into a FU (input) and out of it (output). From where and where to is of no concern to a FU.   This way FUs can be concatenated in arbitrary ways:   Each FU can accept input from many sources and produce output for many sinks:   Flows Connected FUs form a flow with a start and an end. Data is entering a flow at a source, and it´s leaving it through a sink. Think of sources and sinks as special FUs which conntect wires to the environment of a network of FUs.   Wiring Details Data is flowing into/out of FUs through wires. This is to allude to electrical engineering which since long has been working with composable parts. Wires are attached to FUs usings pins. They are the entry/exit points for the data flowing along the wires. Input-/output pins currently need not be drawn explicitly. This is to keep designing on a whiteboard simple and quick.   Data flowing is of some type, so wires have a type attached to them. And pins have names. If there is only one input pin and output pin on a FU, though, you don´t need to mention them. The default is Process for a single input pin, and Result for a single output pin. But you´re free to give even single pins different names.   There is a shortcut in use to address a certain pin on a destination FU:   The type of the wire is put in parantheses for two reasons. 1. This way a “no-type” wire can be easily denoted, 2. this is a natural way to describe tuples of data.   To describe how much data is flowing, a star can be put next to the wire type:   Nesting – Boards and Parts If more than 5 to 10 FUs need to be put in a flow a FD starts to become hard to understand. To keep diagrams clutter free they can be nested. You can turn any FU into a flow: This leads to Flow-Designs with different levels of abstraction. A in the above illustration is a high level functional unit, A.1 and A.2 are lower level functional units. One of the purposes of Flow-Design is to be able to describe systems on different levels of abstraction and thus make it easier to understand them. Humans use abstraction/decomposition to get a grip on complexity. Flow-Design strives to support this and make levels of abstraction first class citizens for programming. You can read the above illustration like this: Functional units A.1 and A.2 detail what A is supposed to do. The whole of A´s responsibility is decomposed into smaller responsibilities A.1 and A.2. FU A thus does not do anything itself anymore! All A is responsible for is actually accomplished by the collaboration between A.1 and A.2. Since A now is not doing anything anymore except containing A.1 and A.2 functional units are devided into two categories: boards and parts. Boards are just containing other functional units; their sole responsibility is to wire them up. A is a board. Boards thus depend on the functional units nested within them. This dependency is not of a functional nature, though. Boards are not dependent on services provided by nested functional units. They are just concerned with their interface to be able to plug them together. Parts are the workhorses of flows. They contain the real domain logic. They actually transform input into output. However, they do not depend on other functional units. Please note the usage of source and sink in boards. They correspond to input-pins and output-pins of the board.   Implicit Dependencies Nesting functional units leads to a dependency tree. Boards depend on nested functional units, they are the inner nodes of the tree. Parts are independent, they are the leafs: Even though dependencies are the bane of software development, Flow-Design does not usually draw these dependencies. They are implicitly created by visually nesting functional units. And they are harmless. Boards are so simple in their functionality, they are little affected by changes in functional units they are depending on. But functional units are implicitly dependent on more than nested functional units. They are also dependent on the data types of the wires attached to them: This is also natural and thus does not need to be made explicit. And it pertains mainly to parts being dependent. Since boards don´t do anything with regard to a problem domain, they don´t care much about data types. Their infrastructural purpose just needs types of input/output-pins to match.   Explicit Dependencies You could say, Flow-Orientation is about tackling complexity at its root cause: that´s dependencies. “Natural” dependencies are depicted naturally, i.e. implicitly. And whereever possible dependencies are not even created. Functional units don´t know their collaborators within a flow. This is core to Flow-Orientation. That makes for high composability of functional units. A part is as independent of other functional units as a motor is from the rest of the car. And a board is as dependend on nested functional units as a motor is on a spark plug or a crank shaft. With Flow-Design software development moves closer to how hardware is constructed. Implicit dependencies are not enough, though. Sometimes explicit dependencies make designs easier – as counterintuitive this might sound. So FD notation needs a ways to denote explicit dependencies: Data flows along wires. But data does not flow along dependency relations. Instead dependency relations represent service calls. Functional unit C is depending on/calling services on functional unit S. If you want to be more specific, name the services next to the dependency relation: Although you should try to stay clear of explicit dependencies, they are fundamentally ok. See them as a way to add another dimension to a flow. Usually the functionality of the independent FU (“Customer repository” above) is orthogonal to the domain of the flow it is referenced by. If you like emphasize this by using different shapes for dependent and independent FUs like above. Such dependencies can be used to link in resources like databases or shared in-memory state. FUs can not only produce output but also can have side effects. A common pattern for using such explizit dependencies is to hook a GUI into a flow as the source and/or the sink of data: Which can be shortened to: Treat FUs others depend on as boards (with a special non-FD API the dependent part is connected to), but do not embed them in a flow in the diagram they are depended upon.   Attributes of Functional Units Creation and usage of functional units can be modified with attributes. So far the following have shown to be helpful: Singleton: FUs are by default multitons. FUs in the same of different flows with the same name refer to the same functionality, but to different instances. Think of functional units as objects that get instanciated anew whereever they appear in a design. Sometimes though it´s helpful to reuse the same instance of a functional unit; this is always due to valuable state it holds. Signify this by annotating the FU with a “(S)”. Multiton: FUs on which others depend are singletons by default. This is, because they usually are introduced where shared state comes into play. If you want to change them to be a singletons mark them with a “(M)”. Configurable: Some parts need to be configured before the can do they work in a flow. Annotate them with a “(C)” to have them initialized before any data items to be processed by them arrive. Do not assume any order in which FUs are configured. How such configuration is happening is an implementation detail. Entry point: In each design there needs to be a single part where “it all starts”. That´s the entry point for all processing. It´s like Program.Main() in C# programs. Mark the entry point part with an “(E)”. Quite often this will be the GUI part. How the entry point is started is an implementation detail. Just consider it the first FU to start do its job.   Patterns / Standard Parts If more than a single wire is attached to an output-pin that´s called a split (or fork). The same data is flowing on all of the wires. Remember: Flow-Designs are synchronous by default. So a split does not mean data is processed in parallel afterwards. Processing still happens synchronously and thus one branch after another. Do not assume any specific order of the processing on the different branches after the split.   It is common to do a split and let only parts of the original data flow on through the branches. This effectively means a map is needed after a split. This map can be implicit or explicit.   Although FUs can have multiple input-pins it is preferrable in most cases to combine input data from different branches using an explicit join: The default output of a join is a tuple of its input values. The default behavior of a join is to output a value whenever a new input is received. However, to produce its first output a join needs an input for all its input-pins. Other join behaviors can be: reset all inputs after an output only produce output if data arrives on certain input-pins

    Read the article

  • How John Got 15x Improvement Without Really Trying

    - by rchrd
    The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here.  How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

    Read the article

  • Problem with AssetManager while loading a Model type

    - by user1204548
    Today I've tried the AssetManager for the first time with .g3db files and I'm having some problems. Exception in thread "LWJGL Application" com.badlogic.gdx.utils.GdxRuntimeException: com.badlogic.gdx.utils.GdxRuntimeException: Couldn't load dependencies of asset: data/data at com.badlogic.gdx.assets.AssetManager.handleTaskError(AssetManager.java:508) at com.badlogic.gdx.assets.AssetManager.update(AssetManager.java:342) at com.lostchg.martagdx3d.MartaGame.render(MartaGame.java:78) at com.badlogic.gdx.Game.render(Game.java:46) at com.badlogic.gdx.backends.lwjgl.LwjglApplication.mainLoop(LwjglApplication.java:207) at com.badlogic.gdx.backends.lwjgl.LwjglApplication$1.run(LwjglApplication.java:114) Caused by: com.badlogic.gdx.utils.GdxRuntimeException: Couldn't load dependencies of asset: data/data at com.badlogic.gdx.assets.AssetLoadingTask.handleAsyncLoader(AssetLoadingTask.java:119) at com.badlogic.gdx.assets.AssetLoadingTask.update(AssetLoadingTask.java:89) at com.badlogic.gdx.assets.AssetManager.updateTask(AssetManager.java:445) at com.badlogic.gdx.assets.AssetManager.update(AssetManager.java:340) ... 4 more Caused by: com.badlogic.gdx.utils.GdxRuntimeException: com.badlogic.gdx.utils.GdxRuntimeException: Couldn't load file: data/data at com.badlogic.gdx.utils.async.AsyncResult.get(AsyncResult.java:31) at com.badlogic.gdx.assets.AssetLoadingTask.handleAsyncLoader(AssetLoadingTask.java:117) ... 7 more Caused by: com.badlogic.gdx.utils.GdxRuntimeException: Couldn't load file: data/data at com.badlogic.gdx.graphics.Pixmap.<init>(Pixmap.java:140) at com.badlogic.gdx.assets.loaders.TextureLoader.loadAsync(TextureLoader.java:72) at com.badlogic.gdx.assets.loaders.TextureLoader.loadAsync(TextureLoader.java:41) at com.badlogic.gdx.assets.AssetLoadingTask.call(AssetLoadingTask.java:69) at com.badlogic.gdx.assets.AssetLoadingTask.call(AssetLoadingTask.java:34) at com.badlogic.gdx.utils.async.AsyncExecutor$2.call(AsyncExecutor.java:49) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Caused by: com.badlogic.gdx.utils.GdxRuntimeException: File not found: data\data (Internal) at com.badlogic.gdx.files.FileHandle.read(FileHandle.java:132) at com.badlogic.gdx.files.FileHandle.length(FileHandle.java:586) at com.badlogic.gdx.files.FileHandle.readBytes(FileHandle.java:220) at com.badlogic.gdx.graphics.Pixmap.<init>(Pixmap.java:137) ... 9 more Why it tries to load that unexisting file? It seems that the AssetManager manages to load my .g3db file at first, because earlier the java console threw some errors related to the textures associated to the 3D scene having to be a power of 2. Relevant code: public void show() { ... assets = new AssetManager(); assets.load("data/levelprueba2.g3db", Model.class); loading = true; ... } private void doneLoading() { Model model = assets.get("data/levelprueba2.g3db", Model.class); for (int i = 0; i < model.nodes.size; i++) { String id = model.nodes.get(i).id; ModelInstance instance = new ModelInstance(model, id); Node node = instance.getNode(id); instance.transform.set(node.globalTransform); node.translation.set(0,0,0); node.scale.set(1,1,1); node.rotation.idt(); instance.calculateTransforms(); instances.add(instance); } loading = false; } public void render(float delta) { super.render(delta); if (loading && assets.update()) doneLoading(); ... } The error points to the line with the assets.update() method. Please, help! Sorry for my bad English and my amateurish doubts.

    Read the article

  • Database Mirroring on SQL Server Express Edition

    - by Most Valuable Yak (Rob Volk)
    Like most SQL Server users I'm rather frustrated by Microsoft's insistence on making the really cool features only available in Enterprise Edition.  And it really doesn't help that they changed the licensing for SQL 2012 to be core-based, so now it's like 4 times as expensive!  It almost makes you want to go with Oracle.  That, and a desire to have Larry Ellison do things to your orifices. And since they've introduced Availability Groups, and marked database mirroring as deprecated, you'd think they'd make make mirroring available in all editions.  Alas…they don't…officially anyway.  Thanks to my constant poking around in places I'm not "supposed" to, I've discovered the low-level code that implements database mirroring, and found that it's available in all editions! It turns out that the query processor in all SQL Server editions prepends a simple check before every edition-specific DDL statement: IF CAST(SERVERPROPERTY('Edition') as nvarchar(max)) NOT LIKE '%e%e%e% Edition%' print 'Lame' else print 'Cool' If that statement returns true, it fails. (the print statements are just placeholders)  Go ahead and test it on Standard, Workgroup, and Express editions compared to an Enterprise or Developer edition instance (which support everything). Once again thanks to Argenis Fernandez (b | t) and his awesome sessions on using Sysinternals, I was able to watch the exact process SQL Server performs when setting up a mirror.  Surprisingly, it's not actually implemented in SQL Server!  Some of it is, but that's something of a smokescreen, the real meat of it is simple filesystem primitives. The NTFS filesystem supports links, both hard links and symbolic, so that you can create two entries for the same file in different directories and/or different names.  You can create them using the MKLINK command in a command prompt: mklink /D D:\SkyDrive\Data D:\Data mklink /D D:\SkyDrive\Log D:\Log This creates a symbolic link from my data and log folders to my Skydrive folder.  Any file saved in either location will instantly appear in the other.  And since my Skydrive will be automatically synchronized with the cloud, any changes I make will be copied instantly (depending on my internet bandwidth of course). So what does this have to do with database mirroring?  Well, it seems that the mirroring endpoint that you have to create between mirror and principal servers is really nothing more than a Skydrive link.  Although it doesn't actually use Skydrive, it performs the same function.  So in effect, the following statement: ALTER DATABASE Mir SET PARTNER='TCP://MyOtherServer.domain.com:5022' Is turned into: mklink /D "D:\Data" "\\MyOtherServer.domain.com\5022$" The 5022$ "port" is actually a hidden system directory on the principal and mirror servers. I haven't quite figured out how the log files are included in this, or why you have to SET PARTNER on both principal and mirror servers, except maybe that mklink has to do something special when linking across servers.  I couldn't get the above statement to work correctly, but found that doing mklink to a local Skydrive folder gave me similar functionality. To wrap this up, all you have to do is the following: Install Skydrive on both SQL Servers (principal and mirror) and set the local Skydrive folder (D:\SkyDrive in these examples) On the principal server, run mklink /D on the data and log folders to point to SkyDrive: mklink /D D:\SkyDrive\Data D:\Data On the mirror server, run the complementary linking: mklink /D D:\Data D:\SkyDrive\Data Create your database and make sure the files map to the principal data and log folders (D:\Data and D:\Log) Viola! Your databases are kept in sync on multiple servers! One wrinkle you will encounter is that the mirror server will show the data and log files, but you won't be able to attach them to the mirror SQL instance while they are attached to the principal. I think this is a bug in the Skydrive, but as it turns out that's fine: you can't access a mirror while it's hosted on the principal either.  So you don't quite get automatic failover, but you can attach the files to the mirror if the principal goes offline.  It's also not exactly synchronous, but it's better than nothing, and easier than either replication or log shipping with a lot less latency. I will end this with the obvious "not supported by Microsoft" and "Don't do this in production without an updated resume" spiel that you should by now assume with every one of my blog posts, especially considering the date.

    Read the article

  • Introducing MySQL for Excel

    - by Javier Treviño
    As part of the new product initiatives of the MySQL on Windows group we released a tool that makes the task of getting data in and out of a MySQL Database very friendly and intuitive, and we paired it with one of the preferred applications for data analysis and manipulation in Windows platforms, MS Excel. Welcome to MySQL for Excel, an add-in that is installed and accessed from within the MS Excel’s Data tab offering a wizard-like interface arranged in an elegant yet simple way to help users browse MySQL Schemas, Tables, Views and Procedures and perform data operations against them using MS Excel as the vehicle to drive the data in and out MySQL Databases. One of the coolest features we had in mind designing MySQL for Excel is simplicity. MS Excel is simple and easy to work with, thus liked by many Windows users because they don’t have to be software gurus to use it.  We applied the same principle by targeting MySQL for Excel to any kind of user, so if you are already familiarized with Excel’s interface you will find yourself working with MySQL data in no time. MySQL for Excel is shipped within the MySQL Installer as one of the tools in the suite; if prerequisites are already installed (.NET Framework 4.0, Visual Studio Tools for Office 4.0 and of course MS Office), installing the add-in involves a very few clicks and no further setup to use it. Being an Excel Add-In there is no executable file involved after the installation, running MS Excel and opening the add-in from its Data tab is all that is required. MySQL for Excel automatically integrates with MySQL Workbench (if installed) to share the same connections to MySQL Server installations, that way connections are defined just once in either product saving time.  Opening the Add-In brings the Welcome Panel at the right side of the Excel main window from which connections to MySQL Servers are shown grouped by Local VS Remote connections; then users can open any of those connections by double-clicking it and entering the password of the used account.  Additionally a user can create a connection by clicking on the New Connection action label or edit connections through MySQL Workbench (if installed) by clicking on the Manage Connections action label. Once a connection is opened, the Schema Selection panel is shown, at the top of it the selected connection (connection name, hostname/IP and username). Just below, a list of schemas is displayed where User Schemas are grouped first followed by System Schemas; users can double-click any selected schema to go to the next panel or select a schema and clicking the Next > button. Users can alternatively click on the < Back button to go back to the Welcome Panel to close the current connection and open a new one; also by clicking the Create New Schema action label they can create an empty new schema. Once a schema is opened the DB Object Selection panel is shown, this is actually the place where the fun stuff happens; from here users are able to perform actions against MySQL Tables, Views and Procedures. ">The actions available here are about importing data from a MySQL Table, View or Procedure to Excel, exporting Excel data to a new MySQL Table, appending Excel data to an existing MySQL Table or editing a MySQL Table’s data by using an Excel Worksheet as a user interface to update data in any row/column, insert new rows or delete existing rows in a very easy and friendly way. More blog posts will follow describing all of these actions, so stay tuned! Remember that your feedback is very important for us, so drop us a message: · MySQL on Windows (this) Blog - https://blogs.oracle.com/MySqlOnWindows/ · Forum - http://forums.mysql.com/list.php?172 · Facebook - http://www.facebook.com/mysql Cheers!

    Read the article

< Previous Page | 461 462 463 464 465 466 467 468 469 470 471 472  | Next Page >