Search Results

Search found 60391 results on 2416 pages for 'data generation'.

Page 500/2416 | < Previous Page | 496 497 498 499 500 501 502 503 504 505 506 507 | Next Page >

How to remove RAID flag on unstriped drive without losing data?

- by Alex Folland

I have a Gigabyte Z68X-UD4-B3 motherboard. It advertises this new thing called "XHD", which is like RAID but makes a SSD and traditional-style drive work together to enable high speed with high capacity. I don't want to use this feature, and I already have Windows 7 64 installed without using this feature. When I first installed my 2 hard drives (1 SSD and 1 traditional-style drive) in my machine and booted it up for the first time, it ran a program from the mobo that asked me if I wanted to set up XHD. Thinking it would go to some config screen, I said yes. It immediately started doing something with my drives and finished. I considered that strange, but figured it wouldn't matter when I simply install Windows onto my SSD only. I now have my BIOS and Windows running in AHCI mode with no RAID arrays and separate drives. My SSD is one of those new Corsair Force GT drives which loses power every so often, causing Windows to BSOD. I've figured everything out about this problem, including installing the latest firmware from Corsair, and the only way to fix it at this point is by installing Intel Rapid Storage Technology to control AHCI instead of Windows, since the Windows AHCI driver disables the drive's power every once in a while and can't be configured not to do so. I've tried installing Intel Rapid Storage Technology. When I reboot my machine after doing so, it BSODs just after the Windows logo. I've figured out this is because my SSD and my traditional drive are flagged as RAID, as seen in the "Intel Matrix Storage Manager" program found by switching the BIOS hard drive handling to "RAID" mode. This is due to the XHD auto-config program I mentioned earlier. Normally, the BIOS is set to AHCI, and when the drives boot in AHCI mode, they work perfectly. So, I've concluded the data is stored in AHCI mode but the drives' flags are set to RAID. I've figured out that I can accomplish my objective by using the "Intel Matrix Storage Manager" program on the mobo (with "Reset disks to non-RAID"), but doing so would cause it to completely wipe the drives I select. I want to simply toggle these flags from RAID to AHCI so Intel Rapid Storage Technology doesn't fail and cause a BSOD upon booting, but without wiping the drives.

Read the article
Can I save & store a user's submission in a way that proves that the data has not been altered, and that the timestamp is accurate?

- by jt0dd

There are many situations where the validity of the timestamp attached to a certain post (submission of information) might be invaluable for the post owner's legal usage. I'm not looking for a service to achieve this, as requested in this great question, but rather a method for the achievement of such a service. For the legal (in most any law system) authentication of text content and its submission time, the owner of the content would need to prove: that the timestamp itself has not been altered and was accurate to begin with. that the text content linked to the timestamp had not been altered I'd like to know how to achieve this via programming (not a language-specific solution, but rather the methodology behind the solution). Can a timestamp be validated to being accurate to the time that the content was really submitted? Can data be stored in a form that it can be read, but not written to, in a proven way? In other words, can I save & store a user's submission in a way that proves that the data has not been altered, and that the timestamp is accurate? I can't think of any programming method that would make this possible, but I am not the most experienced programmer out there. Based on MidnightLightning's answer to the question I cited, this sort of thing is being done. Clarification: I'm looking for a method (hashing, encryption, etc) that would allow an average guy like me to achieve the desired effect through programming. I'm interested in this subject for the purpose of Defensive Publication. I'd like to learn a method that allows an every-day programmer to pick up his computer, write a program, pass information through it, and say: I created this text at this moment in time, and I can prove it. This means the information should be protected from the programmer who writes the code as well. Perhaps a 3rd party API would be required. I'm ok with that.

Read the article
C#, AES encryption check!

- by Data-Base

I have this code for AES encryption, can some one verify that this code is good and not wrong? it works fine, but I'm more concern about the implementation of the algorithm // Plaintext value to be encrypted. //Passphrase from which a pseudo-random password will be derived. //The derived password will be used to generate the encryption key. //Password can be any string. In this example we assume that this passphrase is an ASCII string. //Salt value used along with passphrase to generate password. //Salt can be any string. In this example we assume that salt is an ASCII string. //HashAlgorithm used to generate password. Allowed values are: "MD5" and "SHA1". //SHA1 hashes are a bit slower, but more secure than MD5 hashes. //PasswordIterations used to generate password. One or two iterations should be enough. //InitialVector (or IV). This value is required to encrypt the first block of plaintext data. //For RijndaelManaged class IV must be exactly 16 ASCII characters long. //KeySize. Allowed values are: 128, 192, and 256. //Longer keys are more secure than shorter keys. //Encrypted value formatted as a base64-encoded string. public static string Encrypt(string PlainText, string Password, string Salt, string HashAlgorithm, int PasswordIterations, string InitialVector, int KeySize) { byte[] InitialVectorBytes = Encoding.ASCII.GetBytes(InitialVector); byte[] SaltValueBytes = Encoding.ASCII.GetBytes(Salt); byte[] PlainTextBytes = Encoding.UTF8.GetBytes(PlainText); PasswordDeriveBytes DerivedPassword = new PasswordDeriveBytes(Password, SaltValueBytes, HashAlgorithm, PasswordIterations); byte[] KeyBytes = DerivedPassword.GetBytes(KeySize / 8); RijndaelManaged SymmetricKey = new RijndaelManaged(); SymmetricKey.Mode = CipherMode.CBC; ICryptoTransform Encryptor = SymmetricKey.CreateEncryptor(KeyBytes, InitialVectorBytes); MemoryStream MemStream = new MemoryStream(); CryptoStream CryptoStream = new CryptoStream(MemStream, Encryptor, CryptoStreamMode.Write); CryptoStream.Write(PlainTextBytes, 0, PlainTextBytes.Length); CryptoStream.FlushFinalBlock(); byte[] CipherTextBytes = MemStream.ToArray(); MemStream.Close(); CryptoStream.Close(); return Convert.ToBase64String(CipherTextBytes); } public static string Decrypt(string CipherText, string Password, string Salt, string HashAlgorithm, int PasswordIterations, string InitialVector, int KeySize) { byte[] InitialVectorBytes = Encoding.ASCII.GetBytes(InitialVector); byte[] SaltValueBytes = Encoding.ASCII.GetBytes(Salt); byte[] CipherTextBytes = Convert.FromBase64String(CipherText); PasswordDeriveBytes DerivedPassword = new PasswordDeriveBytes(Password, SaltValueBytes, HashAlgorithm, PasswordIterations); byte[] KeyBytes = DerivedPassword.GetBytes(KeySize / 8); RijndaelManaged SymmetricKey = new RijndaelManaged(); SymmetricKey.Mode = CipherMode.CBC; ICryptoTransform Decryptor = SymmetricKey.CreateDecryptor(KeyBytes, InitialVectorBytes); MemoryStream MemStream = new MemoryStream(CipherTextBytes); CryptoStream cryptoStream = new CryptoStream(MemStream, Decryptor, CryptoStreamMode.Read); byte[] PlainTextBytes = new byte[CipherTextBytes.Length]; int ByteCount = cryptoStream.Read(PlainTextBytes, 0, PlainTextBytes.Length); MemStream.Close(); cryptoStream.Close(); return Encoding.UTF8.GetString(PlainTextBytes, 0, ByteCount); } Thank you

Read the article
SOA Suite 11g Native Format Builder Complex Format Example

- by bob.webster

This rather long posting details the steps required to process a grouping of fixed length records using Format Builder. If it’s 10 pm and you’re feeling beat you might want to leave this until tomorrow. But if it’s 10 pm and you need to get a Format Builder Complex template done, read on… The goal is to process individual orders from a file using the 11g File Adapter and Format Builder Sample Data =========== 001Square Widget 0245.98 102Triagular Widget 1120.00 403Circular Widget 0099.45 ORD8898302/01/2011 301Hexagon Widget 1150.98 ORD6735502/01/2011 The records are fixed length records representing a number of logical Order records. Each order record consists of a number of item records starting with a 3 digit number, followed by a single Summary Record which starts with the constant ORD. How can this file be processed so that the first poll returns the first order? 001Square Widget 0245.98 102Triagular Widget 1120.00 403Circular Widget 0099.45 ORD8898302/01/2011 And the second poll returns the second order? 301Hexagon Widget 1150.98 ORD6735502/01/2011 Note: if you need more than one order per poll, that’s also possible, see the “Multiple Messages” field in the “File Adapter Step 6 of 9” snapshot further down. To follow along with this example you will need - Studio Edition Version 11.1.1.4.0 with the - SOA Extension for JDeveloper 11.1.1.4.0 installed Both can be downloaded from here: http://www.oracle.com/technetwork/middleware/soasuite/downloads/index.html You will not need a running WebLogic Server domain to complete the steps and Format Builder tests in this article. Start with a SOA Composite containing a File Adapter The Format Builder is part of the File Adapter so start by creating a new SOA Project and Composite. Here is a quick summary for those not familiar with these steps - Start JDeveloper - From the Main Menu choose File->New - In the New Gallery window that opens Expand the “General” category and Select the Applications node. Then choose SOA Application from the Items section on the right. Finally press the OK button. - In Step 1 of the “Create SOA Application wizard” that appears enter an Application Name and an Directory of your choice, then press the Next button. - In Step 2 of the “Create SOA Application wizard”, press the Next button leaving all entries as defaulted. - In Step 3 of the “Create SOA Application wizard”, Enter a composite name of your choice and Press the Finish Button These steps result in a new Application and SOA Project. The SOA Project contains a composite.xml file which is opened and shown below. For our example we have not defined a Mediator or a BPEL process to minimize the steps, but one or the other would eventually be needed to use the File Adapter we are about to create. Drag and drop the File Adapter icon from the Component Pallette onto either the LEFT side of the diagram under “Exposed Services” or the right side under “External References”. (See the Green Circle in the image below). Placing the adapter on the left side would indicate the file being processed is inbound to the composite, if the adapter is placed on the right side then the data is outbound to a file. Note that the same Format Builder definition can be used in both directions. For example we could use the format with a File Adapter on the left side of the composite to parse fixed data into XML, modify the data in our Composite or BPEL process and then use the same Format Builder definition with a File adapter on the right side of the composite to write the data back out in the same fixed data format When the File Adapter is dropped on the Composite the File Adapter Wizard Appears. Skip Past the first page, Step 1 of 9 by pressing the Next button. In Step 2 enter a service name of your choice as shown below, then press Next When the Native Format Builder appears, skip the welcome page by pressing next. Also press the Next button to accept the settings on Step 3 of 9 On Step 4, select Read File and press the Next button as shown below. On Step 5 enter a directory that will contain a file with the input data, then Press the Next button as shown below. In step 6, enter *.txt or another file format to select input files from the input directory mentioned in step 5. ALSO check the “Files contain Multiple Messages” checkbox and set the “Publish Messages in Batches of” field to 1. The value can be set higher to increase the number of logical order group records returned on each poll of the file adapter. In other words, it determines the number of Orders that will be sent to each instance of a Mediator or Composite processing using the File Adapter. Skip Step 7 by pressing the Next button In Step 8 press the Gear Icon on the right side to load the Native Format Builder. Native Format Builder appears Before diving into the format, here is an overview of the process. Approach - Bottom up Assuming an Order is a grouping of item records and a summary record…. - Define a separate Complex Type for each Record Type found in the group. (One for itemRecord and one for summaryRecord) - Define a Complex Type to contain the Group of Record types defined above (LogicalOrderRecord) - Define a top level element to represent an order. (order) The order element will be of type LogicalOrderRecord Defining the Format In Step 1 select “Create new” and “Complex Type” and “Next” In Step two browse to and select a file containing the test data shown at the start of this article. A link is provided at the end of this article to download a file containing the test data. Press the Next button In Step 3 Complex types must be define for each type of input record. Select the Root-Element and Click on the Add Complex Type icon This creates a new empty complex type definition shown below. The fastest way to create the definition is to highlight the first line of the Sample File data and drag the line onto the <new_complex_type> Format Builder introspects the data and provides a grid to define additional fields. Change the “Complex Type Name” to “itemRecord” Then click on the ruler to indicate the position of fixed columns. Drag the red triangle icons to the exact columns if necessary. Double click on an existing red triangle to remove an unwanted entry. In the case below fields are define in columns 0-3, 4-28, 29-eol When the field definitions are correct, press the “Generate Fields” button. Field entries named C1, C2 and C3 will be created as shown below. Click on the field names and rename them from C1->itemNum, C2->itemDesc and C3->itemCost When all the fields are correctly defined press OK to save the complex type. Next, the process is repeated to define a Complex Type for the SummaryRecord. Select the Root-Element in the schema tree and press the new complex type icon Then highlight and drag the Summary Record from the sample data onto the <new_complex_type> Change the complex type name to “summaryRecord” Mark the fixed fields for Order Number and Order Date. Press the Generate Fields button and rename C1 and C2 to itemNum and orderDate respectively. The last complex type to be defined is a type to hold the group of items and the summary record. Select the Root-Element in the schema tree and click the new complex type icon Select the “<new_complex_type>” entry and click the pencil icon On the Complex Type Details page change the name and type of each input field. Change line 1 to be named item and set the Type to “itemRecord” Change line 2 to be named summary and set the Type to “summaryRecord” We also need to indicate that itemRecords repeat in the input file. Click the pencil icon at the right side of the item line. On the Edit Details page change the “Max Occurs” entry from 1 to UNBOUNDED. We also need to indicate how to identify an itemRecord. Since each item record has “.” in column 32 we can use this fact to differentiate an item record from a summary record. Change the “Look Ahead” field to value 32 and enter a period in the “Look For” field Press the OK button to save entry. Finally, its time to create a top level element to represent an order. Select the “Root-Element” in the schema tree and press the New element icon Click on the <new_element> and press the pencil icon. Set the Element Name to “order” and change the Data Type to “logicalOrderRecord” Press the OK button to save the element definition. The final definition should match the screenshot below. Press the Next Button to view the definition source. Press the Test Button to test the definition Press the Green Triangle Icon to run the test. And we are presented with an unwelcome error. The error states that the processor ran out of data while working through the definition. The processor was unable to differentiate between itemRecords and summaryRecords and therefore treated the entire file as a list of itemRecords. At end of file, the “summary” portion of the logicalOrderRecord remained unprocessed but mandatory. This root cause of this error is the loss of our “lookAhead” definition used to identify itemRecords. This appears to be a bug in the Native Format Builder 11.1.1.4.0 Luckily, a simple workaround exists. Press the Cancel button and return to the “Step 4 of 4” Window. Manually add nxsd:lookAhead="32" nxsd:lookFor="." attributes after the maxOccurs attribute of the item element. as shown in the highlighted text below. When the lookAhead and lookFor attributes have been added Press the Test button and on the Test page press the Green Triangle. The test is now successful, the first order in the file is returned by the File Adapter. Below is a complete listing of the Result XML from the right column of the screen above Try running it The downloaded input test file and completed schema file can be used for testing without following all the Native Format Builder steps in this example. Use the following link to download a file containing the sample data. Download Sample Input Data This is the best approach rather than cutting and pasting the input data at the top of the article. Since the data is fixed length it’s very important to watch out for trailing spaces in the data and to ensure an eol character at the end of every line. The download file is correctly formatted. The final schema definition can be downloaded at the following link Download Completed Schema Definition - Save the inputData.txt file to a known location like the xsd folder in your project. - Save the inputData_6.xsd file to the xsd folder in your project. - At step 1 in the Native Format Builder wizard (as shown above) check the “Edit existing” radio button, then browse and select the inputData_6.xsd file - At step 2 of the Format Builder configuration Wizard (as shown above) supply the path and filename for the inputData.txt file. - You can then proceed to the test page and run a test. - Remember the wizard bug will drop the lookAhead and lookFor attributes, you will need to manually add nxsd:lookAhead="32" nxsd:lookFor="." after the maxOccurs attribute of the item element in the LogicalOrderRecord Complex Type. (as shown above) Good Luck with your Format Project

Read the article
SQLAuthority News – Speaking Sessions at TechEd India – 3 Sessions – 1 Panel Discussion

- by pinaldave

Microsoft Tech-Ed India 2010 is considered as the major Technology event of the year for various IT professionals and developers. This event will feature a comprehensive forum in order to learn, connect, explore, and evolve the current technologies we have today. I would recommend this event to you since here you will learn about today’s cutting-edge trends, thereby enhancing your work profile and getting ahead of the rest. But, the most important benefit of all might be the networking opportunity that that you can attain by attending the forum. You can build personal connections with various Microsoft experts and peers that will last even far beyond this event! It also feels good to let you know that I will be speaking at this year’s event! So, here are the sessions that await you in this mega-forum. Session 1: True Lies of SQL Server – SQL Myth Buster Date: April 12, 2010 Time: 11:15pm – 11:45pm In this 30-minute demo session, I am going to briefly demonstrate few SQL Server Myth and their resolution backing up with some demo. This demo session is a must-attend for all developers and administrators who would come to the event. This is going to be a very quick yet fun session. Session 2: Master Data Services in Microsoft SQL Server 2008 R2 Date: April 12, 2010 Time: 2:30pm-3:30pm SQL Server Master Data Services will ship with SQL Server 2008 R2 and will improve Microsoft’s platform appeal. This session provides an in depth demonstration of MDS features and highlights important usage scenarios. Master Data Services enables consistent decision making by allowing you to create, manage and propagate changes from single master view of your business entities. Also with MDS – Master Data-hub which is the vital component helps ensure reporting consistency across systems and deliver faster more accurate results across the enterprise. We will talk about establishing the basis for a centralized approach to defining, deploying, and managing master data in the enterprise. Session 3: Developing with SQL Server Spatial and Deep Dive into Spatial Indexing Date: April 14, 2010 Time: 5:00pm-6:00pm Microsoft SQL Server 2008 delivers new spatial data types that enable you to consume, use, and extend location-based data through spatial-enabled applications. Attend this session to learn how to use spatial functionality in next version of SQL Server to build and optimize spatial queries. This session outlines the new geography data type to store geodetic spatial data and perform operations on it, use the new geometry data type to store planar spatial data and perform operations on it, take advantage of new spatial indexes for high performance queries, use the new spatial results tab to quickly and easily view spatial query results directly from within Management Studio, extend spatial data capabilities by building or integrating location-enabled applications through support for spatial standards and specifications and much more. Panel Discussion: Harness the power of Web – SEO and Technical Blogging Date: April 12, 2010 Time: 5:00pm-6:00pm Here you will learn lots of tricks and tips about SEO and Technical Blogging from various Industry Technical Blogging Experts. This event will surely be one of the most important Tech conventions of 2010. TechEd is going to be a very busy time for Tech developers and enthusiasts, since every evening there will be a fun session to attend. If you are interested in any of the above topics for every session, I suggest that you visit each of them as you will learn so many things about the topic to be discussed. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: MVP, Pinal Dave, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQLAuthority Author Visit, SQLAuthority News, T SQL, Technology Tagged: TechEd, TechEdIn

Read the article
SQL SERVER – What is Spatial Database? – Developing with SQL Server Spatial and Deep Dive into Spati

- by pinaldave

What is Spatial Database? A spatial database is a database that is optimized to store and query data related to objects in space, including points, lines and polygons. While typical databases can understand various numeric and character types of data, additional functionality needs to be added for databases to process spatial data types. (Source: Wikipedia) Today I will be talking about the same subject at Microsoft TechEd India. If you want to learn about how to spatial aspect of data and how to integrate them with SQL Server this is the perfect session for you. Spatial is very special concept of SQL Server and I really like how it is implemented in SQL Server. In general Performance Tuning and Query Optimization is something I always have enjoyed in my professional life. Index are my best friends and many time, by implementing and many time by removing I have improved the performance of the system. In this session, I will be talking about Index along with Spatial Data. As Spatial Database is very interesting concept, I will cover super short but very interesting 10 quick slides about this subject. I will make sure in very first 20 mins, you will understand following topics Introduction to Spatial Database One line definition Understanding Spatial Indexing Index Internals Query/Performance Tuning Query Hinting/Cost Analysis Spatial Index Catalog Views Performance Troubleshooting Finding Optimal Index using Spatial Index SP Common Errors Index Maintenance This slides decks will be followed by around 30 mins demo which will have story of geometry, geography, index internals and performance tuning. If you are interested in learning how GIS works and how SQL Server out of the box supports this wonderful tools, you will really like how the story is told. I am sure all people who attend the event will know how the Bangalore is positioned on the map of India. I will take example of Bangalore and Hyderabad and demonstrate how index can improve the performance. Well there are lots of story to tell in the session, and I will be opening this session with the beautiful script of Botticelli’s Birth of Venus created by Michael J. Swart. I will also demonstrate few real life scenario where I will be talking about Spatial Database and its usage. Do not miss this session. At the end of session there will be book awarded to best participant. My session details: Session 3: Developing with SQL Server Spatial and Deep Dive into Spatial Indexing Date: April 14, 2010 Time: 5:00pm-6:00pm Microsoft SQL Server 2008 delivers new spatial data types that enable you to consume, use, and extend location-based data through spatial-enabled applications. Attend this session to learn how to use spatial functionality in next version of SQL Server to build and optimize spatial queries. This session outlines the new geography data type to store geodetic spatial data and perform operations on it, use the new geometry data type to store planar spatial data and perform operations on it, take advantage of new spatial indexes for high performance queries, use the new spatial results tab to quickly and easily view spatial query results directly from within Management Studio, extend spatial data capabilities by building or integrating location-enabled applications through support for spatial standards and specifications and much more. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Index, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, SQLAuthority Author Visit, T SQL, Technology Tagged: Spatial Database

Read the article
Continuous Integration for SQL Server Part II – Integration Testing

- by Ben Rees

My previous post, on setting up Continuous Integration for SQL Server databases using GitHub, Bamboo and Red Gate’s tools, covered the first two parts of a simple Database Continuous Delivery process: Putting your database in to a source control system, and, Running a continuous integration process, each time changes are checked in. However there is, of course, a lot more to to Continuous Delivery than that. Specifically, in addition to the above: Putting some actual integration tests in to the CI process (otherwise, they don’t really do much, do they!?), Deploying the database changes with a managed, automated approach, Monitoring what you’ve just put live, to make sure you haven’t broken anything. This post will detail how to set up a very simple pipeline for implementing the first of these (continuous integration testing). NB: A lot of the setup in this post is built on top of the configuration from before, so it might be difficult to implement this post without running through part I first. There’ll then be a third post on automated database deployment followed by a final post dealing with the last item – monitoring changes on the live system. In the previous post, I used a mixture of Red Gate products and other 3rd party software – GitHub and Atlassian Bamboo specifically. This was partly because I believe most people work in an heterogeneous environment, using software from different vendors to suit their purposes and I wanted to show how this could work for this process. For example, you could easily substitute Atlassian’s BitBucket or Stash for GitHub, depending on your needs, or use an alternative CI server such as TeamCity, TFS or Jenkins. However, in this, post, I’ll be mostly using Red Gate products only (other than tSQLt). I would do this, firstly because I work for Red Gate. However, I also think that in the area of Database Delivery processes, nobody else has the offerings to implement this process fully – so I didn’t have any choice! Background on Continuous Delivery For me, a great source of information on what makes a proper Continuous Delivery process is the Jez Humble and David Farley classic: Continuous Delivery – Reliable Software Releases through Build, Test, and Deployment Automation This book is not of course, primarily about databases, and the process I outline here and in the previous article is a gross simplification of what Jez and David describe (not least because it’s that much harder for databases!). However, a lot of the principles that they describe can be equally applied to database development and, I would argue, should be. As I say however, what I describe here is a very simple version of what would be required for a full production process. A couple of useful resources on handling some of these complexities can be found in the following two references: Refactoring Databases – Evolutionary Database Design, by Scott J Ambler and Pramod J. Sadalage Versioning Databases – Branching and Merging, by Scott Allen In particular, I don’t deal at all with the issues of multiple branches and merging of those branches, an issue made particularly acute by the use of GitHub. The other point worth making is that, in the words of Martin Fowler: Continuous Delivery is about keeping your application in a state where it is always able to deploy into production. I.e. we are not talking about continuously delivery updates to the production database every time someone checks in an amendment to a stored procedure. That is possible (and what Martin calls Continuous Deployment). However, again, that’s more than I describe in this article. And I doubt I need to remind DBAs or Developers to Proceed with Caution! Integration Testing Back to something practical. The next stage, building on our set up from the previous article, is to add in some integration tests to the process. As I say, the CI process, though interesting, isn’t enormously useful without some sort of test process running. For this we’ll use the tSQLt framework, an open source framework designed specifically for running SQL Server tests. tSQLt is part of Red Gate’s SQL Test found on http://www.red-gate.com/products/sql-development/sql-test/ or can be downloaded separately from www.tsqlt.org - though I’ll provide a step-by-step guide below for setting this up. Getting tSQLt set up via SQL Test Click on the link http://www.red-gate.com/products/sql-development/sql-test/ and click on the blue Download button to download the Red Gate SQL Test product, if not already installed. Follow the install process for SQL Test to install the SQL Server Management Studio (SSMS) plugin on to your machine, if not already installed. Open SSMS. You should now see SQL Test under the Tools menu: Clicking this link will give you the basic SQL Test dialogue: As yet, though we’ve installed the SQL Test product we haven’t yet installed the tSQLt test framework on to any particular database. To do this, we need to add our RedGateApp database using this dialogue, by clicking on the + Add Database to SQL Test… link, selecting the RedGateApp database and clicking the Add Database link: In the next screen, SQL Test describes what will be installed on the database for the tSQLt framework. Also in this dialogue, uncheck the “Add SQL Cop tests” option (shown below). SQL Cop is a great set of pre-defined tests that work within the tSQLt framework to check the general health of your SQL Server database. However, we won’t be using them in this particular simple example: Once you’ve clicked on the OK button, the changes described in the dialogue will be made to your database. Some of these are shown in the left-hand-side below: We’ve now installed the framework. However, we haven’t actually created any tests, so this will be the next step. But, before we proceed, we’ve made an update to our database so should, again check this in to source control, adding comments as required: Also worth a quick check that your build still runs with the new additions!: (And a quick check of the RedGateAppCI database shows that the changes have been made). Creating and Testing a Unit Test There are, of course, a lot of very interesting unit tests that you could and should set up for a database. The great thing about the tSQLt framework is that you can write these in SQL. The example I’m going to use here is pretty Mickey Mouse – our database table is going to include some email addresses as reference data and I want to check whether these are all in a correct email format. Nothing clever but it illustrates the process and hopefully shows the method by which more interesting tests could be set up. Adding Reference Data to our Database To start, I want to add some reference data to my database, and have this source controlled (as well as the schema). First of all I need to add some data in to my solitary table – this can be done a number of ways, but I’ll do this in SSMS for simplicity: I then add some reference data to my table: Currently this reference data just exists in the database. For proper integration testing, this needs to form part of the source-controlled version of the database – and so needs to be added to the Git repository. This can be done via SQL Source Control, though first a Primary Key needs to be added to the table. Right click the table, select Design, then right-click on the first “id” row. Then click on “Set Primary Key”: NB: once this change is made, click Save to save the change to the table. Then, to source control this reference data, right click on the table (dbo.Email) and selecting the following option: In the next screen, link the data in the Email table, by selecting it from the list and clicking “save and close”: We should at this point re-commit the changes (both the addition of the Primary Key, and the data) to the Git repo. NB: From here on, I won’t show screenshots for the GitHub side of things – it’s the same each time: whenever a change is made in SQL Source Control and committed to your local folder, you then need to sync this in the GitHub Windows client (as this is where the build server, Bamboo is taking it from). An interesting point to note here, when these changes are committed in SQL Source Control (right-click database and select “Commit Changes to Source Control..”): The display gives a warning about possibly needing a migration script for the “Add Primary Key” step of the changes. This isn’t actually necessary in this case, but this mechanism would allow you to create override scripts to replace the default change scripts created by the SQL Compare engine (which runs underneath SQL Source Control). Ignoring this message (!), we add a comment and commit the changes to Git. I then sync these, run a build (or the build gets run automatically), and check that the data is being deployed over to the target RedGateAppCI database: Creating and Running the Test As I mention, the test I’m going to use here is a very simple one - are the email addresses in my reference table valid? This isn’t of course, a full test of email validation (I expect the email addresses I’ve chosen here aren’t really the those of the Fab Four) – but just a very basic check of format used. I’ve taken the relevant SQL from this Stack Overflow article. In SSMS select “SQL Test” from the Tools menu, then click on + New Test: In the next screen, give your new test a name, and also enter a name in the Test Class box (test classes are schemas that help you keep things organised). Also check that the database in which the test is going to be created is correct – RedGateApp in this example: Click “Create Test”. After closing a couple of subsequent dialogues, you’ll see a dummy script for the test, that needs filling in: We now need to define the SQL for our test. As mentioned before, tSQLt allows you to write your unit tests in T-SQL, and the code I’m going to use here is as below. This needs to be copied and pasted in to the query window, to replace the default given by tSQLt: – Basic email check test ALTER PROCEDURE [MyChecks].[test Check Email Addresses] AS BEGIN SET NOCOUNT ON Declare @Output VarChar(max) Set @Output = ” SELECT @Output = @Output + Email +Char(13) + Char(10) FROM dbo.Email WHERE email NOT LIKE ‘%_@__%.__%’ If @Output > ” Begin Set @Output = Char(13) + Char(10) + @Output EXEC tSQLt.Fail@Output End END; Once this script is entered, hit execute to add the Stored Procedure to the database. Before committing the test to source control, it’s worth just checking that it works! For a positive test, click on “SQL Test” from the Tools menu, then click Run Tests. You should see output like the following: - a green tick to indicate success! But of course, what we also need to do is test that this is actually doing something by showing a failed test. Edit one of the email addresses in your table to an incorrect format: Now, re-run the same SQL Test as before and you’ll see the following: Great – we now know that our test is really doing something! You’ll also see a useful error message at the bottom of SSMS: (leave the email address as invalid for now, for the next steps). The next stage is to check this new test in to source control again, by right-clicking on the database and checking in the changes with a commit message (and not forgetting to sync in the GitHub client): Checking that the Tests are Running as Integration Tests After the changes above are made, and after a build has run on Bamboo (manual or automatic), looking at the Stored Procedures for the RedGateAppCI, the SPROC for the new test has been moved over to the database. However this is not exactly what we were after. We didn’t want to just copy objects from one database to another, but actually run the tests as part of the build/integration test process. I.e. we’re continuously checking any changes we make (in this case, to the reference data emails), to ensure we’re not breaking a test that we’ve set up. The behaviour we want to see is that, if we check in static data that is incorrect (as we did in step 9 above) and we have the tSQLt test set up, then our build in Bamboo should fail. However, re-running the build shows the following: - sadly, a successful build! To make sure the tSQLt tests are run as part of the integration test, we need to amend a switch in the Red Gate CI config file. First, navigate to file sqlCI.targets in your working folder: Edit this document, make the following change, save the document, then commit and sync this change in the GitHub client:    <enableTsqlt>true</enableTsqlt> Now, if we re-run the build in Bamboo (NB: I’ve moved to a new server here, hence different address and build number): - superb, a broken build!! The error message isn’t great here, so to get more detailed info, click on the full build log link on this page (below the fold). The interesting part of the log shown is towards the bottom. Pulling out this part: 21-Jun-2013 11:35:19 Build FAILED. 21-Jun-2013 11:35:19 21-Jun-2013 11:35:19 "C:\Users\Administrator\bamboo-home\xml-data\build-dir\RGA-RGP-JOB1\sqlCI.proj" (default target) (1) -> 21-Jun-2013 11:35:19 (sqlCI target) -> 21-Jun-2013 11:35:19 EXEC : sqlCI error occurred: RedGate.Deploy.SqlServerDbPackage.Shared.Exceptions.InvalidSqlException: Test Case Summary: 1 test case(s) executed, 0 succeeded, 1 failed, 0 errored. [C:\Users\Administrator\bamboo-home\xml-data\build-dir\RGA-RGP-JOB1\sqlCI.proj] 21-Jun-2013 11:35:19 EXEC : sqlCI error occurred: [MyChecks].[test Check Email Addresses] failed: [C:\Users\Administrator\bamboo-home\xml-data\build-dir\RGA-RGP-JOB1\sqlCI.proj] 21-Jun-2013 11:35:19 EXEC : sqlCI error occurred: ringo.starr@beatles [C:\Users\Administrator\bamboo-home\xml-data\build-dir\RGA-RGP-JOB1\sqlCI.proj] 21-Jun-2013 11:35:19 EXEC : sqlCI error occurred: [C:\Users\Administrator\bamboo-home\xml-data\build-dir\RGA-RGP-JOB1\sqlCI.proj] 21-Jun-2013 11:35:19 EXEC : sqlCI error occurred: +----------------------+ [C:\Users\Administrator\bamboo-home\xml-data\build-dir\RGA-RGP-JOB1\sqlCI.proj] 21-Jun-2013 11:35:19 EXEC : sqlCI error occurred: |Test Execution Summary| [C:\Users\Administrator\bamboo-home\xml-data\build-dir\RGA-RGP-JOB1\sqlCI.proj] As a final check, we should make sure that, if we now fix this error, the build succeeds. So in SSMS, I’m going to correct the invalid email address, then check this change in to SQL Source Control (with a comment), commit to GitHub, and re-run the build: This should have fixed the build: It worked! Summary This has been a very quick run through the implementation of CI for databases, including tSQLt tests to test whether your database updates are working. The next post in this series will focus on automated deployment – we’ve tested our database changes, how can we now deploy these to target sites?

Read the article
Flow-Design Cheat Sheet – Part I, Notation

- by Ralf Westphal

You want to avoid the pitfalls of object oriented design? Then this is the right place to start. Use Flow-Oriented Analysis (FOA) and –Design (FOD or just FD for Flow-Design) to understand a problem domain and design a software solution. Flow-Orientation as described here is related to Flow-Based Programming, Event-Based Programming, Business Process Modelling, and even Event-Driven Architectures. But even though “thinking in flows” is not new, I found it helpful to deviate from those precursors for several reasons. Some aim at too big systems for the average programmer, some are concerned with only asynchronous processing, some are even not very much concerned with programming at all. What I was looking for was a design method to help in software projects of any size, be they large or tiny, involing synchronous or asynchronous processing, being local or distributed, running on the web or on the desktop or on a smartphone. That´s why I took ideas from all of the above sources and some additional and came up with Event-Based Components which later got repositioned and renamed to Flow-Design. In the meantime this has generated some discussion (in the German developer community) and several teams have started to work with Flow-Design. Also I´ve conducted quite some trainings using Flow-Orientation for design. The results are very promising. Developers find it much easier to design software using Flow-Orientation than OOAD-based object orientation. Since Flow-Orientation is moving fast and is not covered completely by a single source like a book, demand has increased for at least an overview of the current state of its notation. This page is trying to answer this demand by briefly introducing/describing every notational element as well as their translation into C# source code. Take this as a cheat sheet to put next to your whiteboard when designing software. However, please do not expect any explanation as to the reasons behind Flow-Design elements. Details on why Flow-Design at all and why in this specific way you´ll find in the literature covering the topic. Here´s a resource page on Flow-Design/Event-Based Components, if you´re able to read German. Notation Connected Functional Units The basic element of any FOD are functional units (FU): Think of FUs as some kind of software code block processing data. For the moment forget about classes, methods, “components”, assemblies or whatever. See a FU as an abstract piece of code. Software then consists of just collaborating FUs. I´m using circles/ellipses to draw FUs. But if you like, use rectangles. Whatever suites your whiteboard needs best. The purpose of FUs is to process input and produce output. FUs are transformational. However, FUs are not called and do not call other FUs. There is no dependency between FUs. Data just flows into a FU (input) and out of it (output). From where and where to is of no concern to a FU. This way FUs can be concatenated in arbitrary ways: Each FU can accept input from many sources and produce output for many sinks: Flows Connected FUs form a flow with a start and an end. Data is entering a flow at a source, and it´s leaving it through a sink. Think of sources and sinks as special FUs which conntect wires to the environment of a network of FUs. Wiring Details Data is flowing into/out of FUs through wires. This is to allude to electrical engineering which since long has been working with composable parts. Wires are attached to FUs usings pins. They are the entry/exit points for the data flowing along the wires. Input-/output pins currently need not be drawn explicitly. This is to keep designing on a whiteboard simple and quick. Data flowing is of some type, so wires have a type attached to them. And pins have names. If there is only one input pin and output pin on a FU, though, you don´t need to mention them. The default is Process for a single input pin, and Result for a single output pin. But you´re free to give even single pins different names. There is a shortcut in use to address a certain pin on a destination FU: The type of the wire is put in parantheses for two reasons. 1. This way a “no-type” wire can be easily denoted, 2. this is a natural way to describe tuples of data. To describe how much data is flowing, a star can be put next to the wire type: Nesting – Boards and Parts If more than 5 to 10 FUs need to be put in a flow a FD starts to become hard to understand. To keep diagrams clutter free they can be nested. You can turn any FU into a flow: This leads to Flow-Designs with different levels of abstraction. A in the above illustration is a high level functional unit, A.1 and A.2 are lower level functional units. One of the purposes of Flow-Design is to be able to describe systems on different levels of abstraction and thus make it easier to understand them. Humans use abstraction/decomposition to get a grip on complexity. Flow-Design strives to support this and make levels of abstraction first class citizens for programming. You can read the above illustration like this: Functional units A.1 and A.2 detail what A is supposed to do. The whole of A´s responsibility is decomposed into smaller responsibilities A.1 and A.2. FU A thus does not do anything itself anymore! All A is responsible for is actually accomplished by the collaboration between A.1 and A.2. Since A now is not doing anything anymore except containing A.1 and A.2 functional units are devided into two categories: boards and parts. Boards are just containing other functional units; their sole responsibility is to wire them up. A is a board. Boards thus depend on the functional units nested within them. This dependency is not of a functional nature, though. Boards are not dependent on services provided by nested functional units. They are just concerned with their interface to be able to plug them together. Parts are the workhorses of flows. They contain the real domain logic. They actually transform input into output. However, they do not depend on other functional units. Please note the usage of source and sink in boards. They correspond to input-pins and output-pins of the board. Implicit Dependencies Nesting functional units leads to a dependency tree. Boards depend on nested functional units, they are the inner nodes of the tree. Parts are independent, they are the leafs: Even though dependencies are the bane of software development, Flow-Design does not usually draw these dependencies. They are implicitly created by visually nesting functional units. And they are harmless. Boards are so simple in their functionality, they are little affected by changes in functional units they are depending on. But functional units are implicitly dependent on more than nested functional units. They are also dependent on the data types of the wires attached to them: This is also natural and thus does not need to be made explicit. And it pertains mainly to parts being dependent. Since boards don´t do anything with regard to a problem domain, they don´t care much about data types. Their infrastructural purpose just needs types of input/output-pins to match. Explicit Dependencies You could say, Flow-Orientation is about tackling complexity at its root cause: that´s dependencies. “Natural” dependencies are depicted naturally, i.e. implicitly. And whereever possible dependencies are not even created. Functional units don´t know their collaborators within a flow. This is core to Flow-Orientation. That makes for high composability of functional units. A part is as independent of other functional units as a motor is from the rest of the car. And a board is as dependend on nested functional units as a motor is on a spark plug or a crank shaft. With Flow-Design software development moves closer to how hardware is constructed. Implicit dependencies are not enough, though. Sometimes explicit dependencies make designs easier – as counterintuitive this might sound. So FD notation needs a ways to denote explicit dependencies: Data flows along wires. But data does not flow along dependency relations. Instead dependency relations represent service calls. Functional unit C is depending on/calling services on functional unit S. If you want to be more specific, name the services next to the dependency relation: Although you should try to stay clear of explicit dependencies, they are fundamentally ok. See them as a way to add another dimension to a flow. Usually the functionality of the independent FU (“Customer repository” above) is orthogonal to the domain of the flow it is referenced by. If you like emphasize this by using different shapes for dependent and independent FUs like above. Such dependencies can be used to link in resources like databases or shared in-memory state. FUs can not only produce output but also can have side effects. A common pattern for using such explizit dependencies is to hook a GUI into a flow as the source and/or the sink of data: Which can be shortened to: Treat FUs others depend on as boards (with a special non-FD API the dependent part is connected to), but do not embed them in a flow in the diagram they are depended upon. Attributes of Functional Units Creation and usage of functional units can be modified with attributes. So far the following have shown to be helpful: Singleton: FUs are by default multitons. FUs in the same of different flows with the same name refer to the same functionality, but to different instances. Think of functional units as objects that get instanciated anew whereever they appear in a design. Sometimes though it´s helpful to reuse the same instance of a functional unit; this is always due to valuable state it holds. Signify this by annotating the FU with a “(S)”. Multiton: FUs on which others depend are singletons by default. This is, because they usually are introduced where shared state comes into play. If you want to change them to be a singletons mark them with a “(M)”. Configurable: Some parts need to be configured before the can do they work in a flow. Annotate them with a “(C)” to have them initialized before any data items to be processed by them arrive. Do not assume any order in which FUs are configured. How such configuration is happening is an implementation detail. Entry point: In each design there needs to be a single part where “it all starts”. That´s the entry point for all processing. It´s like Program.Main() in C# programs. Mark the entry point part with an “(E)”. Quite often this will be the GUI part. How the entry point is started is an implementation detail. Just consider it the first FU to start do its job. Patterns / Standard Parts If more than a single wire is attached to an output-pin that´s called a split (or fork). The same data is flowing on all of the wires. Remember: Flow-Designs are synchronous by default. So a split does not mean data is processed in parallel afterwards. Processing still happens synchronously and thus one branch after another. Do not assume any specific order of the processing on the different branches after the split. It is common to do a split and let only parts of the original data flow on through the branches. This effectively means a map is needed after a split. This map can be implicit or explicit. Although FUs can have multiple input-pins it is preferrable in most cases to combine input data from different branches using an explicit join: The default output of a join is a tuple of its input values. The default behavior of a join is to output a value whenever a new input is received. However, to produce its first output a join needs an input for all its input-pins. Other join behaviors can be: reset all inputs after an output only produce output if data arrives on certain input-pins

Read the article
How John Got 15x Improvement Without Really Trying

- by rchrd

The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here. How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

Read the article
Problem with AssetManager while loading a Model type

- by user1204548

Today I've tried the AssetManager for the first time with .g3db files and I'm having some problems. Exception in thread "LWJGL Application" com.badlogic.gdx.utils.GdxRuntimeException: com.badlogic.gdx.utils.GdxRuntimeException: Couldn't load dependencies of asset: data/data at com.badlogic.gdx.assets.AssetManager.handleTaskError(AssetManager.java:508) at com.badlogic.gdx.assets.AssetManager.update(AssetManager.java:342) at com.lostchg.martagdx3d.MartaGame.render(MartaGame.java:78) at com.badlogic.gdx.Game.render(Game.java:46) at com.badlogic.gdx.backends.lwjgl.LwjglApplication.mainLoop(LwjglApplication.java:207) at com.badlogic.gdx.backends.lwjgl.LwjglApplication$1.run(LwjglApplication.java:114) Caused by: com.badlogic.gdx.utils.GdxRuntimeException: Couldn't load dependencies of asset: data/data at com.badlogic.gdx.assets.AssetLoadingTask.handleAsyncLoader(AssetLoadingTask.java:119) at com.badlogic.gdx.assets.AssetLoadingTask.update(AssetLoadingTask.java:89) at com.badlogic.gdx.assets.AssetManager.updateTask(AssetManager.java:445) at com.badlogic.gdx.assets.AssetManager.update(AssetManager.java:340) ... 4 more Caused by: com.badlogic.gdx.utils.GdxRuntimeException: com.badlogic.gdx.utils.GdxRuntimeException: Couldn't load file: data/data at com.badlogic.gdx.utils.async.AsyncResult.get(AsyncResult.java:31) at com.badlogic.gdx.assets.AssetLoadingTask.handleAsyncLoader(AssetLoadingTask.java:117) ... 7 more Caused by: com.badlogic.gdx.utils.GdxRuntimeException: Couldn't load file: data/data at com.badlogic.gdx.graphics.Pixmap.<init>(Pixmap.java:140) at com.badlogic.gdx.assets.loaders.TextureLoader.loadAsync(TextureLoader.java:72) at com.badlogic.gdx.assets.loaders.TextureLoader.loadAsync(TextureLoader.java:41) at com.badlogic.gdx.assets.AssetLoadingTask.call(AssetLoadingTask.java:69) at com.badlogic.gdx.assets.AssetLoadingTask.call(AssetLoadingTask.java:34) at com.badlogic.gdx.utils.async.AsyncExecutor$2.call(AsyncExecutor.java:49) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Caused by: com.badlogic.gdx.utils.GdxRuntimeException: File not found: data\data (Internal) at com.badlogic.gdx.files.FileHandle.read(FileHandle.java:132) at com.badlogic.gdx.files.FileHandle.length(FileHandle.java:586) at com.badlogic.gdx.files.FileHandle.readBytes(FileHandle.java:220) at com.badlogic.gdx.graphics.Pixmap.<init>(Pixmap.java:137) ... 9 more Why it tries to load that unexisting file? It seems that the AssetManager manages to load my .g3db file at first, because earlier the java console threw some errors related to the textures associated to the 3D scene having to be a power of 2. Relevant code: public void show() { ... assets = new AssetManager(); assets.load("data/levelprueba2.g3db", Model.class); loading = true; ... } private void doneLoading() { Model model = assets.get("data/levelprueba2.g3db", Model.class); for (int i = 0; i < model.nodes.size; i++) { String id = model.nodes.get(i).id; ModelInstance instance = new ModelInstance(model, id); Node node = instance.getNode(id); instance.transform.set(node.globalTransform); node.translation.set(0,0,0); node.scale.set(1,1,1); node.rotation.idt(); instance.calculateTransforms(); instances.add(instance); } loading = false; } public void render(float delta) { super.render(delta); if (loading && assets.update()) doneLoading(); ... } The error points to the line with the assets.update() method. Please, help! Sorry for my bad English and my amateurish doubts.

Read the article
Database Mirroring on SQL Server Express Edition

- by Most Valuable Yak (Rob Volk)

Like most SQL Server users I'm rather frustrated by Microsoft's insistence on making the really cool features only available in Enterprise Edition. And it really doesn't help that they changed the licensing for SQL 2012 to be core-based, so now it's like 4 times as expensive! It almost makes you want to go with Oracle. That, and a desire to have Larry Ellison do things to your orifices. And since they've introduced Availability Groups, and marked database mirroring as deprecated, you'd think they'd make make mirroring available in all editions. Alas…they don't…officially anyway. Thanks to my constant poking around in places I'm not "supposed" to, I've discovered the low-level code that implements database mirroring, and found that it's available in all editions! It turns out that the query processor in all SQL Server editions prepends a simple check before every edition-specific DDL statement: IF CAST(SERVERPROPERTY('Edition') as nvarchar(max)) NOT LIKE '%e%e%e% Edition%' print 'Lame' else print 'Cool' If that statement returns true, it fails. (the print statements are just placeholders) Go ahead and test it on Standard, Workgroup, and Express editions compared to an Enterprise or Developer edition instance (which support everything). Once again thanks to Argenis Fernandez (b | t) and his awesome sessions on using Sysinternals, I was able to watch the exact process SQL Server performs when setting up a mirror. Surprisingly, it's not actually implemented in SQL Server! Some of it is, but that's something of a smokescreen, the real meat of it is simple filesystem primitives. The NTFS filesystem supports links, both hard links and symbolic, so that you can create two entries for the same file in different directories and/or different names. You can create them using the MKLINK command in a command prompt: mklink /D D:\SkyDrive\Data D:\Data mklink /D D:\SkyDrive\Log D:\Log This creates a symbolic link from my data and log folders to my Skydrive folder. Any file saved in either location will instantly appear in the other. And since my Skydrive will be automatically synchronized with the cloud, any changes I make will be copied instantly (depending on my internet bandwidth of course). So what does this have to do with database mirroring? Well, it seems that the mirroring endpoint that you have to create between mirror and principal servers is really nothing more than a Skydrive link. Although it doesn't actually use Skydrive, it performs the same function. So in effect, the following statement: ALTER DATABASE Mir SET PARTNER='TCP://MyOtherServer.domain.com:5022' Is turned into: mklink /D "D:\Data" "\\MyOtherServer.domain.com\5022$" The 5022$ "port" is actually a hidden system directory on the principal and mirror servers. I haven't quite figured out how the log files are included in this, or why you have to SET PARTNER on both principal and mirror servers, except maybe that mklink has to do something special when linking across servers. I couldn't get the above statement to work correctly, but found that doing mklink to a local Skydrive folder gave me similar functionality. To wrap this up, all you have to do is the following: Install Skydrive on both SQL Servers (principal and mirror) and set the local Skydrive folder (D:\SkyDrive in these examples) On the principal server, run mklink /D on the data and log folders to point to SkyDrive: mklink /D D:\SkyDrive\Data D:\Data On the mirror server, run the complementary linking: mklink /D D:\Data D:\SkyDrive\Data Create your database and make sure the files map to the principal data and log folders (D:\Data and D:\Log) Viola! Your databases are kept in sync on multiple servers! One wrinkle you will encounter is that the mirror server will show the data and log files, but you won't be able to attach them to the mirror SQL instance while they are attached to the principal. I think this is a bug in the Skydrive, but as it turns out that's fine: you can't access a mirror while it's hosted on the principal either. So you don't quite get automatic failover, but you can attach the files to the mirror if the principal goes offline. It's also not exactly synchronous, but it's better than nothing, and easier than either replication or log shipping with a lot less latency. I will end this with the obvious "not supported by Microsoft" and "Don't do this in production without an updated resume" spiel that you should by now assume with every one of my blog posts, especially considering the date.

Read the article
Introducing MySQL for Excel

- by Javier Treviño

As part of the new product initiatives of the MySQL on Windows group we released a tool that makes the task of getting data in and out of a MySQL Database very friendly and intuitive, and we paired it with one of the preferred applications for data analysis and manipulation in Windows platforms, MS Excel. Welcome to MySQL for Excel, an add-in that is installed and accessed from within the MS Excel’s Data tab offering a wizard-like interface arranged in an elegant yet simple way to help users browse MySQL Schemas, Tables, Views and Procedures and perform data operations against them using MS Excel as the vehicle to drive the data in and out MySQL Databases. One of the coolest features we had in mind designing MySQL for Excel is simplicity. MS Excel is simple and easy to work with, thus liked by many Windows users because they don’t have to be software gurus to use it. We applied the same principle by targeting MySQL for Excel to any kind of user, so if you are already familiarized with Excel’s interface you will find yourself working with MySQL data in no time. MySQL for Excel is shipped within the MySQL Installer as one of the tools in the suite; if prerequisites are already installed (.NET Framework 4.0, Visual Studio Tools for Office 4.0 and of course MS Office), installing the add-in involves a very few clicks and no further setup to use it. Being an Excel Add-In there is no executable file involved after the installation, running MS Excel and opening the add-in from its Data tab is all that is required. MySQL for Excel automatically integrates with MySQL Workbench (if installed) to share the same connections to MySQL Server installations, that way connections are defined just once in either product saving time. Opening the Add-In brings the Welcome Panel at the right side of the Excel main window from which connections to MySQL Servers are shown grouped by Local VS Remote connections; then users can open any of those connections by double-clicking it and entering the password of the used account. Additionally a user can create a connection by clicking on the New Connection action label or edit connections through MySQL Workbench (if installed) by clicking on the Manage Connections action label. Once a connection is opened, the Schema Selection panel is shown, at the top of it the selected connection (connection name, hostname/IP and username). Just below, a list of schemas is displayed where User Schemas are grouped first followed by System Schemas; users can double-click any selected schema to go to the next panel or select a schema and clicking the Next > button. Users can alternatively click on the < Back button to go back to the Welcome Panel to close the current connection and open a new one; also by clicking the Create New Schema action label they can create an empty new schema. Once a schema is opened the DB Object Selection panel is shown, this is actually the place where the fun stuff happens; from here users are able to perform actions against MySQL Tables, Views and Procedures. ">The actions available here are about importing data from a MySQL Table, View or Procedure to Excel, exporting Excel data to a new MySQL Table, appending Excel data to an existing MySQL Table or editing a MySQL Table’s data by using an Excel Worksheet as a user interface to update data in any row/column, insert new rows or delete existing rows in a very easy and friendly way. More blog posts will follow describing all of these actions, so stay tuned! Remember that your feedback is very important for us, so drop us a message: · MySQL on Windows (this) Blog - https://blogs.oracle.com/MySqlOnWindows/ · Forum - http://forums.mysql.com/list.php?172 · Facebook - http://www.facebook.com/mysql Cheers!

Read the article
Analytics in an Omni-Channel World

- by David Dorf

Retail has been around ever since mankind started bartering. The earliest transactions were very specific to the individuals buying and selling, then someone had the bright idea to open a store. Those transactions were a little more generic, but the store owner still knew his customers and what they wanted. As the chains rolled out, customer intimacy was sacrificed for scale, and retailers began to rely on segments and clusters. But thanks to the widespread availability of data and the technology to convert said data into information, retailers are getting back to details. The retail industry is following a maturity model for analytics that is has progressed through five stages, each delivering more value than the previous. Store Analytics Brick-and-mortar retailers (and pure-play catalogers as well) that collect anonymous basket-level data are able to get some sense of demand to help with allocation decisions. Promotions and foot-traffic can be measured to understand marketing effectiveness and perhaps focus groups can help test ideas. But decisions are influenced by the majority, using faceless customer segments and aggregated industry data points. Loyalty programs help a little, but in many cases the cost outweighs the benefits. Web Analytics The Web made it much easier to collect data on specific, yet still anonymous consumers using cookies to track visits. Clickstreams and product searches are analyzed to understand the purchase journey, gauge demand, and better understand up-selling opportunities. Personalization begins to allow retailers target market consumers with recommendations. Cross-Channel Analytics This phase is a minor one, but where most retailers probably sit today. They are able to use information from one channel to bolster activities in another. However, there are technical challenges combining data silos so its not an easy task. But for those retailers that are able to perform analytics on both sources of data, the pay-off is pretty nice. Revenue per customer begins to go up as customers have a better brand experience. Mobile & Social Analytics Big data technologies are enabling a 360-degree view of the customer by incorporating psychographic data from social sites alongside traditional demographic data. Retailers can track individual preferences, opinions, hobbies, etc. in order to understand a consumer's motivations. Using mobile devices, consumers can interact with brands anywhere, anytime, accessing deep product information and reviews. Mobile, combined with a loyalty program, presents an opportunity to put shopping into geographic context, understanding paths to the store, patterns within the store, and be an always-on advertising conduit. Omni-Channel Analytics All this data along with the proper technology represents a new paradigm in which the clock is turned back and retail becomes very personal once again. Rich, individualized data better illuminates demand, allows for highly localized assortments, and helps tailor up-selling. Interactions with all channels help build an accurate profile of each consumer, and allows retailers to tailor the retail experience to meet the heightened expectations of today's sophisticated shopper. And of course this culminates in greater customer satisfaction and business profitability.

Read the article
Windows Azure Recipe: Consumer Portal

- by Clint Edmonson

Nearly every company on the internet has a web presence. Many are merely using theirs for informational purposes. More sophisticated portals allow customers to register their contact information and provide some level of interaction or customer support. But as our understanding of how consumers use the web increases, the more progressive companies are taking advantage of social web and rich media delivery to connect at a deeper level with the consumers of their goods and services. Drivers Cost reduction Scalability Global distribution Time to market Solution Here’s a sketch of how a Windows Azure Consumer Portal might be built out: Ingredients Web Role – this will host the core of the solution. Each web role is a virtual machine hosting an application written in ASP.NET (or optionally php, or node.js). The number of web roles can be scaled up or down as needed to handle peak and non-peak traffic loads. Database – every modern web application needs to store data. SQL Azure databases look and act exactly like their on-premise siblings but are fault tolerant and have data redundancy built in. Access Control (optional) – if identity needs to be tracked within the solution, the access control service combined with the Windows Identity Foundation framework provides out-of-the-box support for several social media platforms including Windows LiveID, Google, Yahoo!, Facebook. It also has a provider model to allow integration with other platforms as well. Caching (optional) – for sites with high traffic with lots of read-only data and lists, the distributed in-memory caching service can be used to cache and serve up static data at higher scale and speed than direct database requests. It can also be used to manage user session state. Blob Storage (optional) – for sites that serve up unstructured data such as documents, video, audio, device drivers, and more. The data is highly available and stored redundantly across data centers. Each entry in blob storage is provided with it’s own unique URL for direct access by the browser. Content Delivery Network (CDN) (optional) – for sites that service users around the globe, the CDN is an extension to blob storage that, when enabled, will automatically cache frequently accessed blobs and static site content at edge data centers around the world. The data can be delivered statically or streamed in the case of rich media content. Training Labs These links point to online Windows Azure training labs where you can learn more about the individual ingredients described above. (Note: The entire Windows Azure Training Kit can also be downloaded for offline use.) Windows Azure (16 labs) Windows Azure is an internet-scale cloud computing and services platform hosted in Microsoft data centers, which provides an operating system and a set of developer services which can be used individually or together. It gives developers the choice to build web applications; applications running on connected devices, PCs, or servers; or hybrid solutions offering the best of both worlds. New or enhanced applications can be built using existing skills with the Visual Studio development environment and the .NET Framework. With its standards-based and interoperable approach, the services platform supports multiple internet protocols, including HTTP, REST, SOAP, and plain XML SQL Azure (7 labs) Microsoft SQL Azure delivers on the Microsoft Data Platform vision of extending the SQL Server capabilities to the cloud as web-based services, enabling you to store structured, semi-structured, and unstructured data. Windows Azure Services (9 labs) As applications collaborate across organizational boundaries, ensuring secure transactions across disparate security domains is crucial but difficult to implement. Windows Azure Services provides hosted authentication and access control using powerful, secure, standards-based infrastructure. See my Windows Azure Resource Guide for more guidance on how to get started, including links web portals, training kits, samples, and blogs related to Windows Azure.

Read the article
How to remove synaptic without installing all the unwanted packages?

- by Jay

I am trying to uninstall synaptic. I prefer using apt-get and other command line tools to manage my packages. So I do not need synaptic and the software manager. I'm trying to remove both of them using apt-get. Its a new box. Recently installed Linux Mint mate 15. After installation, the only thing I did was, sudo apt-get update and sudo apt-get dist-upgrade After that, I did this command for removing synaptic, sudo apt-get remove --purge synaptic But this gives me a very weird output, Reading package lists... Done Building dependency tree Reading state information... Done The following packages were automatically installed and are no longer required: apturl-kde icoutils kate-data katepart kde-runtime kde-runtime-data kdelibs-bin kdelibs5-data kdelibs5-plugins kdesudo kdoctools kubuntu-debug-installer libattica0.4 libdlrestrictions1 libkactivities-bin libkactivities-models1 libkactivities6 libkatepartinterfaces4 libkcmutils4 libkde3support4 libkdeclarative5 libkdecore5 libkdesu5 libkdeui5 libkdewebkit5 libkdnssd4 libkemoticons4 libkfile4 libkhtml5 libkidletime4 libkio5 libkjsapi4 libkjsembed4 libkmediaplayer4 libknewstuff3-4 libknotifyconfig4 libkntlm4 libkparts4 libkpty4 libkrosscore4 libktexteditor4 libkxmlrpcclient4 libnepomuk4 libnepomukcore4abi1 libnepomukquery4a libnepomukutils4 libntrack-qt4-1 libntrack0 libphonon4 libplasma3 libpolkit-qt-1-1 libpoppler-qt4-4 libqapt2 libqapt2-runtime libqca2 libqt4-qt3support libsolid4 libsoprano4 libstreamanalyzer0 libstreams0 libthreadweaver4 libvirtodbc0 nepomuk-core nepomuk-core-data ntrack-module-libnl-0 odbcinst odbcinst1debian2 oxygen-icon-theme phonon phonon-backend-gstreamer plasma-scriptengine-javascript qapt-batch shared-desktop-ontologies soprano-daemon virtuoso-minimal virtuoso-opensource-6.1-bin virtuoso-opensource-6.1-common Use 'apt-get autoremove' to remove them. The following extra packages will be installed: apturl-kde icoutils kate-data katepart kde-runtime kde-runtime-data kdelibs-bin kdelibs5-data kdelibs5-plugins kdesudo kdoctools kubuntu-debug-installer libattica0.4 libdlrestrictions1 libkactivities-bin libkactivities-models1 libkactivities6 libkatepartinterfaces4 libkcmutils4 libkde3support4 libkdeclarative5 libkdecore5 libkdesu5 libkdeui5 libkdewebkit5 libkdnssd4 libkemoticons4 libkfile4 libkhtml5 libkidletime4 libkio5 libkjsapi4 libkjsembed4 libkmediaplayer4 libknewstuff3-4 libknotifyconfig4 libkntlm4 libkparts4 libkpty4 libkrosscore4 libktexteditor4 libkxmlrpcclient4 libnepomuk4 libnepomukcore4abi1 libnepomukquery4a libnepomukutils4 libntrack-qt4-1 libntrack0 libphonon4 libplasma3 libpolkit-qt-1-1 libpoppler-qt4-4 libqapt2 libqapt2-runtime libqca2 libqt4-qt3support libsolid4 libsoprano4 libstreamanalyzer0 libstreams0 libthreadweaver4 libvirtodbc0 libxml2-utils nepomuk-core nepomuk-core-data ntrack-module-libnl-0 odbcinst odbcinst1debian2 oxygen-icon-theme phonon phonon-backend-gstreamer plasma-scriptengine-javascript qapt-batch shared-desktop-ontologies soprano-daemon virtuoso-minimal virtuoso-opensource-6.1-bin virtuoso-opensource-6.1-common Suggested packages: libterm-readline-gnu-perl libterm-readline-perl-perl djvulibre-bin finger hspell libqca2-plugin-cyrus-sasl libqca2-plugin-gnupg libqca2-plugin-ossl phonon-backend-vlc phonon-backend-xine phonon-backend-mplayer The following packages will be REMOVED: aptoncd* apturl* mintupdate* mintwelcome* synaptic* The following NEW packages will be installed: apturl-kde icoutils kate-data katepart kde-runtime kde-runtime-data kdelibs-bin kdelibs5-data kdelibs5-plugins kdesudo kdoctools kubuntu-debug-installer libattica0.4 libdlrestrictions1 libkactivities-bin libkactivities-models1 libkactivities6 libkatepartinterfaces4 libkcmutils4 libkde3support4 libkdeclarative5 libkdecore5 libkdesu5 libkdeui5 libkdewebkit5 libkdnssd4 libkemoticons4 libkfile4 libkhtml5 libkidletime4 libkio5 libkjsapi4 libkjsembed4 libkmediaplayer4 libknewstuff3-4 libknotifyconfig4 libkntlm4 libkparts4 libkpty4 libkrosscore4 libktexteditor4 libkxmlrpcclient4 libnepomuk4 libnepomukcore4abi1 libnepomukquery4a libnepomukutils4 libntrack-qt4-1 libntrack0 libphonon4 libplasma3 libpolkit-qt-1-1 libpoppler-qt4-4 libqapt2 libqapt2-runtime libqca2 libqt4-qt3support libsolid4 libsoprano4 libstreamanalyzer0 libstreams0 libthreadweaver4 libvirtodbc0 libxml2-utils nepomuk-core nepomuk-core-data ntrack-module-libnl-0 odbcinst odbcinst1debian2 oxygen-icon-theme phonon phonon-backend-gstreamer plasma-scriptengine-javascript qapt-batch shared-desktop-ontologies soprano-daemon virtuoso-minimal virtuoso-opensource-6.1-bin virtuoso-opensource-6.1-common 0 upgraded, 78 newly installed, 5 to remove and 0 not upgraded. Need to get 60.9 MB of archives. After this operation, 146 MB of additional disk space will be used. Do you want to continue [Y/n]? n Abort. As you can see, apt-get is trying to install the same packages that it is asking me to autoremove. Could someone please tell me, how to uninstall synaptic properly? Or am I missing something? Just for the record, I also did, sudo apt-get autoremove --purge like it asked me to ... and this is what I got, Reading package lists... Done Building dependency tree Reading state information... Done 0 upgraded, 0 newly installed, 0 to remove and 6 not upgraded.

Read the article
Organization & Architecture UNISA Studies – Chap 6

- by MarkPearl

Learning Outcomes Discuss the physical characteristics of magnetic disks Describe how data is organized and accessed on a magnetic disk Discuss the parameters that play a role in the performance of magnetic disks Describe different optical memory devices Magnetic Disk The way data is stored on and retried from magnetic disks Data is recorded on and later retrieved form the disk via a conducting coil named the head (in many systems there are two heads) The writ mechanism exploits the fact that electricity flowing through a coil produces a magnetic field. Electric pulses are sent to the write head, and the resulting magnetic patterns are recorded on the surface below with different patterns for positive and negative currents The physical characteristics of a magnetic disk Summarize from book The factors that play a role in the performance of a disk Seek time – the time it takes to position the head at the track Rotational delay / latency – the time it takes for the beginning of the sector to reach the head Access time – the sum of the seek time and rotational delay Transfer time – the time it takes to transfer data RAID The rate of improvement in secondary storage performance has been considerably less than the rate for processors and main memory. Thus secondary storage has become a bit of a bottleneck. RAID works on the concept that if one disk can be pushed so far, additional gains in performance are to be had by using multiple parallel components. Points to note about RAID… RAID is a set of physical disk drives viewed by the operating system as a single logical drive Data is distributed across the physical drives of an array in a scheme known as striping Redundant disk capacity is used to store parity information, which guarantees data recoverability in case of a disk failure (not supported by RAID 0 or RAID 1) Interesting to note that the increase in the number of drives, increases the probability of failure. To compensate for this decreased reliability RAID makes use of stored parity information that enables the recovery of data lost due to a disk failure. The RAID scheme consists of 7 levels… Category Level Description Disks Required Data Availability Large I/O Data Transfer Capacity Small I/O Request Rate Striping 0 Non Redundant N Lower than single disk Very high Very high for both read and write Mirroring 1 Mirrored 2N Higher than RAID 2 – 5 but lower than RAID 6 Higher than single disk Up to twice that of a signle disk for read Parallel Access 2 Redundant via Hamming Code N + m Much higher than single disk Highest of all listed alternatives Approximately twice that of a single disk Parallel Access 3 Bit interleaved parity N + 1 Much higher than single disk Highest of all listed alternatives Approximately twice that of a single disk Independent Access 4 Block interleaved parity N + 1 Much higher than single disk Similar to RAID 0 for read, significantly lower than single disk for write Similar to RAID 0 for read, significantly lower than single disk for write Independent Access 5 Block interleaved parity N + 1 Much higher than single disk Similar to RAID 0 for read, lower than single disk for write Similar to RAID 0 for read, generally lower than single disk for write Independent Access 6 Block interleaved parity N + 2 Highest of all listed alternatives Similar to RAID 0 for read; lower than RAID 5 for write Similar to RAID 0 for read, significantly lower than RAID 5 for write Read page 215 – 221 for detailed explanation on RAID levels Optical Memory There are a variety of optical-disk systems available. Read through the table on page 222 – 223 Some of the devices include… CD CD-ROM CD-R CD-RW DVD DVD-R DVD-RW Blue-Ray DVD Magnetic Tape Most modern systems use serial recording – data is lade out as a sequence of bits along each track. The typical recording used in serial is referred to as serpentine recording. In this technique when data is being recorded, the first set of bits is recorded along the whole length of the tape. When the end of the tape is reached the heads are repostioned to record a new track, and the tape is again recorded on its whole length, this time in the opposite direction. That process continued back and forth until the tape is full. To increase speed, the read-write head is capable of reading and writing a number of adjacent tracks simultaneously. Data is still recorded serially along individual tracks, but blocks in sequence are stored on adjacent tracks as suggested. A tape drive is a sequential access device. Magnetic tape was the first kind of secondary memory. It is still widely used as the lowest-cost, slowest speed member of the memory hierarchy.

Read the article
Non-blocking I/O using Servlet 3.1: Scalable applications using Java EE 7 (TOTD #188)

- by arungupta

Servlet 3.0 allowed asynchronous request processing but only traditional I/O was permitted. This can restrict scalability of your applications. In a typical application, ServletInputStream is read in a while loop. public class TestServlet extends HttpServlet { protected void doGet(HttpServletRequest request, HttpServletResponse response) throws IOException, ServletException { ServletInputStream input = request.getInputStream(); byte[] b = new byte[1024]; int len = -1; while ((len = input.read(b)) != -1) { . . . } }} If the incoming data is blocking or streamed slower than the server can read then the server thread is waiting for that data. The same can happen if the data is written to ServletOutputStream. This is resolved in Servet 3.1 (JSR 340, to be released as part Java EE 7) by adding event listeners - ReadListener and WriteListener interfaces. These are then registered using ServletInputStream.setReadListener and ServletOutputStream.setWriteListener. The listeners have callback methods that are invoked when the content is available to be read or can be written without blocking. The updated doGet in our case will look like: AsyncContext context = request.startAsync();ServletInputStream input = request.getInputStream();input.setReadListener(new MyReadListener(input, context)); Invoking setXXXListener methods indicate that non-blocking I/O is used instead of the traditional I/O. At most one ReadListener can be registered on ServletIntputStream and similarly at most one WriteListener can be registered on ServletOutputStream. ServletInputStream.isReady and ServletInputStream.isFinished are new methods to check the status of non-blocking I/O read. ServletOutputStream.canWrite is a new method to check if data can be written without blocking. MyReadListener implementation looks like: @Overridepublic void onDataAvailable() { try { StringBuilder sb = new StringBuilder(); int len = -1; byte b[] = new byte[1024]; while (input.isReady() && (len = input.read(b)) != -1) { String data = new String(b, 0, len); System.out.println("--> " + data); } } catch (IOException ex) { Logger.getLogger(MyReadListener.class.getName()).log(Level.SEVERE, null, ex); }}@Overridepublic void onAllDataRead() { System.out.println("onAllDataRead"); context.complete();}@Overridepublic void onError(Throwable t) { t.printStackTrace(); context.complete();} This implementation has three callbacks: onDataAvailable callback method is called whenever data can be read without blocking onAllDataRead callback method is invoked data for the current request is completely read. onError callback is invoked if there is an error processing the request. Notice, context.complete() is called in onAllDataRead and onError to signal the completion of data read. For now, the first chunk of available data need to be read in the doGet or service method of the Servlet. Rest of the data can be read in a non-blocking way using ReadListener after that. This is going to get cleaned up where all data read can happen in ReadListener only. The sample explained above can be downloaded from here and works with GlassFish 4.0 build 64 and onwards. The slides and a complete re-run of What's new in Servlet 3.1: An Overview session at JavaOne is available here. Here are some more references for you: Java EE 7 Specification Status Servlet Specification Project JSR Expert Group Discussion Archive Servlet 3.1 Javadocs

Read the article
How can I best implement 'cache until further notice' with memcache in multiple tiers?

- by ajreal

the term "client" used here is not referring to client's browser, but client server Before cache workflow 1. client make a HTTP request --> 2. server process --> 3. store parsed results into memcache for next use (cache indefinitely) --> 4. return results to client --> 5. client get the result, store into client's local memcache with TTL After cache workflow 1. another client make a HTTP request --> 2. memcache found return memcache results to client --> 3. client get the result, store into client's local memcache with TTL TTL = time to live Is possible for me to know when the data was updated, and to expire relevant memcache(s) accordingly. However, the pitfalls on client site cache TTL Any data update before the TTL is not pick-up by client memcache. In reverse manner, where there is no update, client memcache still expire after the TTL First request (or concurrent requests) after cache TTL will get throttle as it need to repeat the "Before cache workflow" In the event where client require several HTTP requests on a single web page, it could be very bad in performance. Ideal solution should be client to cache indefinitely until further notice. Here are the three proposals about futher notice Proposal 1 : Make use on HTTP header (current implementation) 1. client sent HTTP request last modified time header 2. server check if last data modified time=last cache time return status 304 3. client based on header to decide further processing GOOD? ---- - save some parsing for client - lesser data transfer BAD? ---- - fire a HTTP request is still slow - server end still need to process lots of requests Proposal 2 : Consistently issue a HTTP request to check all data group last modified time 1. client fire a HTTP request 2. server to return last modified time for all data group 3. client compare local last cache time with the result 4. if data group last cache time < server last modified time then request again for that data group only GOOD? ---- - only fetch what is no up-to-date - less requests for server BAD? ---- - every web page require a HTTP request Proposal 3 : Tell client when new data is available (Push) 1. when server end notice there is a change on a data group 2. notify clients on the changes 3. help clients to fetch again data 4. then reset client local memcache after data is parsed GOOD? ---- - let the cache act/behave like a true cache BAD? ---- - encourage race condition My preference is on proposal 3, and something like Gearman could be ideal Where there is a change, Gearman server to sent the task to multiple clients (workers). Am I crazy? (I know my first question is a bit crazy)

Read the article
RSA decrypting data in C# (.NET 3.5) which was encrypted with openssl in php 5.3.2

- by panny

Maybe someone can clear me up. I have been surfing on this a while now. Step #1: Create a root certificate Key generation on unix 1) openssl req -x509 -nodes -days 3650 -newkey rsa:1024 -keyout privatekey.pem -out mycert.pem 2) openssl rsa -in privatekey.pem -pubout -out publickey.pem 3) openssl pkcs12 -export -out mycertprivatekey.pfx -in mycert.pem -inkey privatekey.pem -name "my certificate" Step #2: Does root certificate work on php: YES PHP side I used the publickey.pem to read it into php: $publicKey = "file://C:/publickey.pem"; $privateKey = "file://C:/privatekey.pem"; $plaintext = "123"; openssl_public_encrypt($plaintext, $encrypted, $publicKey); $transfer = base64_encode($encrypted); openssl_private_decrypt($encrypted, $decrypted, $privateKey); echo $decrypted; // "123" OR $server_public_key = openssl_pkey_get_public(file_get_contents("C:\publickey.pem")); // rsa encrypt openssl_public_encrypt("123", $encrypted, $server_public_key); and the privatekey.pem to check if it works: openssl_private_decrypt($encrypted, $decrypted, openssl_get_privatekey(file_get_contents("C:\privatekey.pem"))); echo $decrypted; // "123" Coming to the conclusion, that encryption/decryption works fine on the php side with these openssl root certificate files. Step #3: Does root certificate work on .NET: YES C# side In same manner I read the keys into a .net C# console program: X509Certificate2 myCert2 = new X509Certificate2(); RSACryptoServiceProvider rsa = new RSACryptoServiceProvider(); try { myCert2 = new X509Certificate2(@"C:\mycertprivatekey.pfx"); rsa = (RSACryptoServiceProvider)myCert2.PrivateKey; } catch (Exception e) { } byte[] test = {Convert.ToByte("123")}; string t = Convert.ToString(rsa.Decrypt(rsa.Encrypt(test, false), false)); Coming to the point, that encryption/decryption works fine on the c# side with these openssl root certificate files. Step #4: Enrypt in php and Decrypt in .NET: !!NO!! PHP side $onett = "123" .... openssl_public_encrypt($onett, $encrypted, $server_public_key); $onettbase64 = base64_encode($encrypted); copy - paste $onettbase64 ("LkU2GOCy4lqwY4vtPI1JcsxgDgS2t05E6kYghuXjrQe7hSsYXETGdlhzEBlp+qhxzTXV3pw+AS5bEg9CPxqHus8fXHOnXYqsd2HL20QSaz+FjZee6Kvva0cGhWkFdWL+ANDSOWRWo/OMhm7JVqU3P/44c3dLA1eu2UsoDI26OMw=") into c# program: C# side byte[] transfered_onettbase64 = Convert.FromBase64String("LkU2GOCy4lqwY4vtPI1JcsxgDgS2t05E6kYghuXjrQe7hSsYXETGdlhzEBlp+qhxzTXV3pw+AS5bEg9CPxqHus8fXHOnXYqsd2HL20QSaz+FjZee6Kvva0cGhWkFdWL+ANDSOWRWo/OMhm7JVqU3P/44c3dLA1eu2UsoDI26OMw="); string k = Convert.ToString(rsa.Decrypt(transfered_onettbase64, false)); // Bad Data exception == Exception while decrypting!!! Any ideas?

Read the article
Apache 2.4 Prefork vs. PHP-FPM Event shows sig decrease in requests per second

- by Mark

On my Apache 2.4.2 server with a standard mod_php Prefork setup these are my server-status results Current Time: Wednesday, 24-Oct-2012 19:36:24 CDT Restart Time: Wednesday, 24-Oct-2012 01:27:30 CDT Parent Server Config. Generation: 1 Parent Server MPM Generation: 0 Server uptime: 18 hours 8 minutes 54 seconds Total accesses: 14304233 - Total Traffic: 342.3 GB CPU Usage: u12584.6 s721.93 cu.66 cs3.43 - 20.4% CPU load 219 requests/sec - 5.4 MB/second - 25.1 kB/request 507 requests currently being processed, 355 idle workers ______KKKKR_K______W_KKC___CKK_K_K_W__CC_KKK_KK._K_K_KK._KKKK_K_ K_____KK_KKKK_K_KK__K___KK_K___K_____CKKK_WK_K_____KCKK__K___K_K K_CK_K_K_____K__KKKK_K__K___K_KK_K_K_KKKCK____________KK_CK__KKK __C_KKKKKKK___CK___C_KKK_K__C__K_CK____KKK__K__K__K_K__KK_CK_K__ _KKKKK_K_W__KK______K___K__W___C_K__K____KKKKKKKK.KKKKKKKCK_K___ _C_KK_K_WK__K_KK__K__RK_KK___K____K_KK_K_K___RKC_KKKK___KKKC_K_W _C_KK_KK__W____KC__KKK__KKK___K___KKK_KK_K_KKW__K_KR_KK_KK__KKK_ R__KKK__KKKKKK__K_KKKKK_K__K_K___KKW_________KK_K___KKK___KK.K_C KKKKKKW_____K__K_KKC_KCKK_K_KK_K__KK__K___K__KK_KK__________KK__ __K___KK_K__K_C_KK_K___KK__KK__K__KCK_K__KK_________K_K_KK__.K__ K_CKK.CCRW__KKKKKKKKKKKC__W____K___KWK_KK_KKC______.K_K_KK_KKKC_ __KKK_W_KCKKK_K_K____CCCK__KC_KKKK_K____K_CK_K____K__K____KKK_KK KK___K_K_K__KW__KCKKKK____WKWK__K_KKRKK__C_K_KK_KK_K__KKCC_K__C_ KK_K___K_KK______K_____CKK_K_______KK_CKCK__KKKKK____K__K..K____ __KKWK_KW__KKK__K_KKK___K_KK_KKK__KK___KK___KK_KK___KK____KKWKKC KK_KKKK_................................` When I switch to a PHP-FPM setup with the Event MPM with no other variables changes, my requests/sec plummet and overall apache response is garbage. Current Time: Wednesday, 24-Oct-2012 19:51:21 CDT Restart Time: Wednesday, 24-Oct-2012 19:48:03 CDT Parent Server Config. Generation: 1 Parent Server MPM Generation: 0 Server uptime: 3 minutes 18 seconds Total accesses: 18720 - Total Traffic: 307.1 MB CPU Usage: u16.57 s4.74 cu0 cs0 - 10.8% CPU load 94.5 requests/sec - 1.6 MB/second - 16.8 kB/request 15 requests currently being processed, 49 idle workers PID Connections Threads Async connections total accepting busy idle writing keep-alive closing 11701 114 no 10 22 0 66 38 11702 134 no 5 27 0 81 48 Sum 248 15 49 0 147 86 __R_R__W___RRW________RR__R___W_W_______W_____W_____________R_R_ Is there any obvious reason anyone could think of why this would be the case. I can provide any other additional stats or server setup info to help out. Ive tried tweaking everything up and down and nothing really helps get the PHP-FPM setup anywhere near a baseic prefork/mod-php setup. Thanks!

Read the article
Tomcat web application intermittent freeze

- by tinny

I have a Grails web application (just a standard war file) deployed on a Ubuntu 10.10 server running on tomcat 6. My database is postgresql. The problem is that every so often (once or twice a day after inactivity) when I try to log into this web application it just freezes. I can navigate to the login page but when I try and login (first time the DB is hit, might be a clue..?) the application just freezes indefinitely, no 500 response code... the browser just waits and waits. I followed the instructions detailed here because the problem described sounded the same as mine. My GC logging showed no long running GC, all sub sec. When the application freezes a jmap heap output is... using parallel threads in the new generation. using thread-local object allocation. Concurrent Mark-Sweep GC Heap Configuration: MinHeapFreeRatio = 40 MaxHeapFreeRatio = 70 MaxHeapSize = 536870912 (512.0MB) NewSize = 21757952 (20.75MB) MaxNewSize = 87228416 (83.1875MB) OldSize = 65404928 (62.375MB) NewRatio = 7 SurvivorRatio = 8 PermSize = 21757952 (20.75MB) MaxPermSize = 85983232 (82.0MB) Heap Usage: New Generation (Eden + 1 Survivor Space): capacity = 19595264 (18.6875MB) used = 11411976 (10.883308410644531MB) free = 8183288 (7.804191589355469MB) 58.23843965562291% used Eden Space: capacity = 17432576 (16.625MB) used = 9249296 (8.820816040039062MB) free = 8183280 (7.8041839599609375MB) 53.05754009046053% used From Space: capacity = 2162688 (2.0625MB) used = 2162680 (2.0624923706054688MB) free = 8 (7.62939453125E-6MB) 99.99963008996212% used To Space: capacity = 2162688 (2.0625MB) used = 0 (0.0MB) free = 2162688 (2.0625MB) 0.0% used concurrent mark-sweep generation: capacity = 101556224 (96.8515625MB) used = 83906080 (80.01907348632812MB) free = 17650144 (16.832489013671875MB) 82.62032270912317% used Perm Generation: capacity = 85983232 (82.0MB) used = 62866832 (59.95448303222656MB) free = 23116400 (22.045516967773438MB) 73.1152232100324% used Anyone know what "From Space:" is? Any ideas on further fault finding ideas? I dont have much experience with this type of fault finding.

Read the article
Clarification On Write-Caching Policy, Its Underlying Options And How It Applies To Hard Drives And Solid-State Drives

- by Boris_yo

In last week after doing more research on subject matter, I have been wondering about what I have been neglecting all those years to understand write-caching policy, always leaving it on default setting. Write-caching policy improves writing performance and consists of write-back caching and write-cache buffer flushing. This is how I understand all the above, but correct me if I erred somewhere: Write-through cache / Write-through caching itself is not a part of write caching policy per se and it's when data is written to both cache and storage device so if Windows will need that data later again, it is retrieved from cache and not from storage device which means only improved read performance as there is no need for waiting for storage device to read required data again. Since data is still written to storage device, write performance isn't improved and represents no risk of data loss or corruption in case of power failure or system crash while only data in cache gets lost. This option seems to be enabled by default and is recommended for removable devices with no need to use function of "Safely Remove Hardware" on user's part. Write-back caching is similar to above but without writing data to storage device, periodically releasing data from cache and writing to storage device when it is idle. In my opinion this option improves both read and write performance but represents risk if power failure or system crash occurs with the outcome of not only losing data eventually to be written to storage device, but causing file inconsistencies or corrupted file system. Write-back caching cannot be enabled together with write-through caching and it is not recommended to be enabled if no backup power supply is availabe. Write-cache buffer flushing I reckon is similar to write-back caching but enables immediate release and writing of data from cache to storage device right before power outage occurs but I don't know if it applies also to occasional system crash. This option seem to be complementary to write-back cache reducing or potentially eliminating risk of data loss or corruption of file system. I have questions about relevance of last 2 options to today's modern SSDs in order to get best performance and with less wear on SSDs: I know that traditional hard drives come with onboard cache (I wonder what type of cache that is), but do SSDs also come with cache? Assuming they do, is this cache faster than their NAND flash and system RAM and worth taking the risk of utilizing it by enabling write-back cache? I read somewhere that generally storage device's cache is faster than RAM, but I want to be sure. Additionally I read that write-caching should be enabled since current data that is to be written later to NAND flash is kept for a while in cache and provided there is data that gets modified a lot before finally being written, holding of this data and its periodic release reduces its write times to SSD thereby reducing its wearing. Now regarding to write-cache buffer flushing, I heard that SSD controllers are so fast by themselves that enabling this option is not required, because they manage flushing. However, once again, I don't know if SSDs have their own onboard cache and whether or not it is faster than their NAND flash and system RAM because if it is, keeping this option enabled would make sense. Recently I have posted question about issue with my Intel 330 SSD 120GB which was main reason to do deeper research having suspicion of write-caching policy being the culprit of SSD's freezing issue assuming data being released is what causes freezes. Currently I have write-cache enabled and write-cache buffer flushing disabled because I believe SSD controller's management of write-cache flushing and Windows write-cache buffer flushing are conflicting with each other: Since I want to troubleshoot in small steps to finally determine the source of issue, I have decided to start with write-caching policy and the move to drivers, switching to AHCI later on and finally disabling DIPM (device initiated power management) through registry modification thanks to @TomWijsman

Read the article
DSOFramer closing Excel doc in another window. If unsaved data in file, dsoframer fails to open with

- by Steve

I'm using Microsoft's DSOFramer control to allow me to embed an Excel file in my dialog so the user can choose his sheet, then select his range of cells; it's used with an import button on my dialog. The problem is that when I call the DSOFramer's OPEN function, if I have Excel open in another window, it closes the Excel document (but leaves Excel running). If the document it tries to close has unsaved data, I get a dialog boxclosing Excel doc in another window. If unsaved data in file, dsoframer fails to open with a messagebox: "Attempt to access invalid address". I built the source, and stepped through, and its making a call in its CDsoDocObject::CreateFromFile function, calling BindToObject on an object of class IMoniker. The HR is 0x8001010a "The message filter indicated that the application is busy". On that failure, it tries to InstantiateDocObjectServer by classid of CLSID Microsoft Excel Worksheet... this fails with an HRESULT of 0x80040154 "Class not registered". The InstantiateDocObjectServer just calls CoCreateInstance on the classid, first with CLSCTX_LOCAL_SERVER, then (if that fails) with CLSCTX_INPROC_SERVER. I know DSOFramer is a popular sample project for embedding Office apps in various dialogs and forms. I'm hoping someone else has had this problem and might have some insight on how I can solve this. I really don't want it to close any other open Excel documents, and I really don't want it to error-out if it can't close the document due to unsaved data. Update 1: I've tried changing the classid that's passed in to "Excel.Application" (I know that class will resolve), but that didn't work. In CDsoDocObject, it tries to open key "HKEY_CLASSES_ROOT\CLSID{00024500-0000-0000-C000-000000000046}\DocObject", but fails. I've visually confirmed that the key is not present in my registry; The key is present for the guid, but there's no DocObject subkey. It then produces an error message box: "The associated COM server does not support ActiveX document embedding". I get similar (different key, of course) results when I try to use the Excel.Workbook programid. Update 2: I tried starting a 2nd instance of Excel, hoping that my automation would bind to it (being the most recently invoked) instead of the problem Excel instance, but it didn't seem to do that. Results were the same. My problem seems to have boiled down to this: I'm calling the BindToObject on an object of class IMoniker, and receiving 0x8001010A (RPC_E_SERVERCALL_RETRYLATER) "The message filter indicated that the application is busy". I've tried playing with the flags passed to the BindToObject (via the SetBindOptions), but nothing seems to make any difference. Update 3: It first tries to bind using an IMoniker class. If that fails, it calls CoCreateInstance for the clsid as a "fallback" method. This may work for other MS Office objects, but when it's Excel, the class is for the Worksheet. I modified the sample to CoCreateInstance _Application, then got the workbooks, then called the Workbooks::Open for the target file, which returns a Worksheet object. I then returned that pointer and merged back with the original sample code path. All working now.

Read the article
Improving the performance of XSL

- by Rachel

In the below XSL for the variable "insert-data", I have an input param with the structure, <insert-data> <data compareIndex="4" nodeName="d1e1"> <a/> </data> <data compareIndex="5" nodeName="d1e1"> <b/> </data> <data compareIndex="7" nodeName="d1e2"> <a/> </data> <data compareIndex="9" nodeName="d1e2"> <b/> </data> </insert-data> where "nodeName" is the id of a node and "compareIndex" is the position of the text content relative to the node having id "$nodeName". I am using the below XSL to select all the text nodes(generate-id) that satisfy the above condition and construct a data xml. The below implementation works perfectly but the time taken for the execution is in min. Is there a better way of implementing or is there any in-efficient operation being used. From my observation the code where the preceding text length is calculated consumes the major time. Please share your thoughts to improve the performance of the XSL. I am using Java SAXON XSL transformer. <xsl:variable name="insert-data" as="element()*"> <xsl:for-each select="$insert-file/insert-data/data"> <xsl:sort select="xsd:integer(@index)"/> <xsl:variable name="compareIndex" select="xsd:integer(@compareIndex)" /> <xsl:variable name="nodeName" select="@nodeName" /> <xsl:variable name="nodeContent" as="node()"> <xsl:copy-of select="node()"/> </xsl:variable> <xsl:for-each select="$main-root/*//text()[ancestor::*[@id = $nodeName]]"> <xsl:variable name="preTextLength" as="xsd:integer" select="sum((preceding::text())[. ancestor::*[@id = $nodeName]]/string-length(.))" /> <xsl:variable name="currentTextLength" as="xsd:integer" select="string-length(.)" /> <xsl:variable name="sum" select="$preTextLength + $currentTextLength" as="xsd:integer"></xsl:variable> <xsl:variable name="split-index" select="$compareIndex - $preTextLength" as="xsd:integer"></xsl:variable> <xsl:if test="($sum ge $compareIndex) and ($compareIndex gt $preTextLength)"> <data split-index="{$split-index}" text-id="{generate-id(.)}"> <xsl:copy-of select="$nodeContent"/> </data> </xsl:if> </xsl:for-each> </xsl:for-each> </xsl:variable>

Read the article
How do I get a Java to call data from the Internet? Where to even start??

- by cdg

Hello oh great wizards of all things android. I really need your help. Mostly because my little brain just doesn't know were to start. I am trying to pull data from the internet to make a widget for the home screen. I have the layout built: <?xml version="1.0" encoding="utf-8"?> <RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android" android:id="@+id/Layout" android:layout_width="fill_parent" android:layout_height="fill_parent" android:background="@drawable/widget_bg_normal" android:clipChildren="false" > <TextView android:id="@+id/text_view" android:layout_width="100px" android:layout_height="wrap_content" android:paddingTop="18px" android:layout_centerHorizontal="true" android:textSize="8px" android:text="158x154 Image downloaded from the internet goes here. Needs to be updated every evening at midnight or unless the button below is pressed. Now if I could only figure out exactly how to do this, life would be good." /> <Button android:id="@+id/new_button" android:layout_width="fill_parent" android:layout_height="wrap_content" android:text="Get New" android:layout_below="@+id/scroll_image" android:layout_centerHorizontal="true" android:padding="0px" android:textSize="10px" android:height="8px" android:includeFontPadding="false" /> </RelativeLayout> Got the provider xml bulit: <?xml version="1.0" encoding="utf-8"?> <appwidget-provider xmlns:android="http://schemas.android.com/apk/res/android" android:minWidth="150dip" android:minHeight="150dip" android:updatePeriodMillis="10000" android:initialLayout="@layout/widget" /> The Manifest works great. <?xml version="1.0" encoding="utf-8"?> <manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.dge.myandroid" android:versionCode="1" android:versionName="1.0"> <application android:icon="@drawable/icon" android:label="@string/app_name"> <activity android:name=".myactivty" android:label="@string/app_name"> <intent-filter> <action android:name="android.intent.action.MAIN" /> <category android:name="android.intent.category.LAUNCHER" /> </intent-filter> </activity>  <receiver android:name=".mywidget" android:label="@string/app_name" > <intent-filter> <action android:name="android.appwidget.action.APPWIDGET_UPDATE" /> </intent-filter> <meta-data android:name="android.appwidget.widgetprovider" android:resource="@xml/widgetprovider" /> </receiver>  </application> <uses-permission android:name="android.permission.INTERNET" /> <uses-sdk android:minSdkVersion="7" /> </manifest> The data it is calling looks something like this when it is called. It basically goes to a website that uses php to random the image: <html><body bgcolor="#000000">center> <a href="http://www.website.com" target="_blank"> <img border="0" src="http://www.webiste.com//0.gif"></a> <img src="http://www.webiste.com" style="border:none;" /> </center></body></html> But here is were I am stuck. I just don't know where to start at all. The java is so far beyond my little head that I don't know what to do. package com.dge.myandroid; import android.appwidget.AppWidgetProvider; public class mywidget extends AppWidgetProvider { } The wiki example just confused me more. I just don't know where to begin. Please help.

Read the article

Search Results

Search found 60391 results on 2416 pages for 'data generation'.

Page 500/2416 | < Previous Page | 496 497 498 499 500 501 502 503 504 505 506 507 | Next Page >

- by Alex Folland

- by jt0dd

- by Data-Base

- by bob.webster

- by pinaldave

- by pinaldave

- by Ben Rees

- by Ralf Westphal

- by rchrd

- by user1204548

- by Most Valuable Yak (Rob Volk)

- by Javier Treviño

- by David Dorf

- by Clint Edmonson

- by Jay

- by MarkPearl

- by arungupta

- by ajreal

- by panny

- by Mark

- by tinny

- by Boris_yo

- by Steve

- by Rachel

- by cdg

< Previous Page | 496 497 498 499 500 501 502 503 504 505 506 507 | Next Page >