Search Results

Search found 10424 results on 417 pages for 'persisted column'.

Page 389/417 | < Previous Page | 385 386 387 388 389 390 391 392 393 394 395 396 | Next Page >

[EF + Oracle] Entities

- by JTorrecilla

Prologue Following with the Serie I started yesterday about Entity Framework with Oracle, Today I am going to start talking about Entities. What is an Entity? A Entity is an object of the EF model corresponding to a record in a DB table. For example, let’s see, in Image 1 we can see one Entity from our model, and in the second one we can see the mapping done with the DB. (Image 1) (Image 2) More in depth a Entity is a Class inherited from the abstract class “EntityObject”, contained by the “System.Data.Objects.DataClasses” namespace. At the same time, this class inherits from the following Class and interfaces: StructuralObject: It is an Abstract class that inherits from INotifyPropertyChanging and INotifyPropertyChanged interfaces, and it exposes the events that manage the Changes of the class, and the functions related to check the data types of the Properties from our Entity. IEntityWithKey: Interface which exposes the Key of the entity. IEntityWithChangeTracker: Interface which lets indicate the state of the entity (Detached, Modified, Added…) IEntityWithRelationships: Interface which indicates the relations about the entity. Which is the Content of a Entity? A Entity is composed by: Properties, Navigation Properties and Methods. What is a Property? A Entity Property is an object that represents a column from the mapped table from DB. It has a data type equivalent in .Net Framework to the DB Type. When we create the EF model, VS, internally, create the code for each Entity selected in the Tables step, such all methods that we will see in next steps. For each property, VS creates a structure similar to: · Private variable with the mapped Data type. · Function with a name like On{Property_Name}Changing({dataType} value): It manages the event which happens when we try to change the value. · Function with a name like On{Property_Name}Change: It manages the event raised when the property has changed successfully. · Property with Get and Set methods: The Set Method manages the private variable and do the following steps: Raise Changing event. Report the Entity is Changing. Set the prívate variable. For it, Use the SetValidValue function of the StructuralObject. There is a function for each datatype, and the functions takes 2 params: the value, and if the prop allow nulls. Invoke that the entity has been successfully changed. Invoke the Changed event of the Prop. ReportPropertyChanging and ReportPropertyChanged events, let, respectively, indicate that there is pending changes in the Entity, and the changes have success correctly. While the ReportPropertyChanged is raised, the Track State of the Entity will be changed. What is a Navigation Property? Navigation Properties are a kind of property of the type: EntityCollection<TEntity>, where TEntity is an Entity type from the model related with the current one, it is said, is a set of record from a related table in the DB. The EntityCollection class inherits from: · RelatedEnd: There is an abstract class that give the functions needed to obtein the related objects. · ICollection<TEntity> · IEnumerable<TEntity> · IEnumerable · IListSource For the previous interfaces, I wish recommend the following post from Jose Miguel Torres. Navigation properties allow us, to get and query easily objects related with the Entity. Methods? There is only one method in the Entity object. “Create{Entity}”, that allow us to create an object of the Entity by sending the parameters needed to create it. Finally After this chapter, we know what is an Entity, how is related to the DB and the relation to other Entities. In following chapters, we will se CRUD operations(Create, Read, Update, Delete).

Read the article
Is inconsistent formatting a sign of a sloppy programmer?

- by dreza

I understand that everyone has their own style of programming and that you should be able to read other people's styles and accept it for what it is. However, would one be considered a sloppy programmer if one's style of coding was inconsistent across whatever standard they were working against? Some example of inconsistencies might be: Sometimes naming private variables with _ and sometimes not Sometimes having varying indentations within code blocks Not aligning braces up i.e. same column if using start using new line style Spacing not always consistent around operators i.e. p=p+1, p+=1 vs other times p =p+1 or p = p + 1 etc Is this even something that as a programmer I should be concerned with addressing? Or is it such a minor nit picking thing that at the end of the day I should just not worry about it and worry about what the end user sees and whether the code works rather than how it looks while working? Is it sloppy programming or just over obsessive nit picking? EDIT: After some excellent comments I realized I may have left out some information in my question. This question came about after reviewing another colleagues code check-in and noticing some of these things and then realizing that I've seen these kind of in-consistencies in previous check-ins. It then got me thinking about my code and whether I do the same things and noticed that I typically don't etc I'm not suggesting his technique is either bad or good in this question or whether his way of doing things is right or wrong. EDIT: To answer some queries to some more good feed back. The specific instance this review occurred in was using Visual Studio 2010 and programming in c# so I don't think the editor would cause any issues. In fact it should only help I would hope. Sorry if I left that piece of info out and it effects any current answers. I was trying to be a bit more generic in understanding if this would be considered sloppy etc. And to add an even more specific example of a code piece I saw during reading of the check-in: foreach(var block in Blocks) { // .. some other code in here foreach(var movement in movements) { movement.Moved.Zero(); } // the un-formatted brace } Such a minor thing I know, but many small things add up(???), and I did have to double glance at the code at the time to see where everything lined up I guess. Please note this code was formatted appropriately before this check-in. EDIT: After reading some great answers and varying thoughts, the summary I've taken from this was. It's not necessarily a sign of a sloppy programmer however as programmers we have a duty (to ourselves and other programmers) to make the code as readable as possible to assist in further ongoing development. However it can hint at inadequacies which is something that is only possible to review on a case by case (person by person) basis. There are many reasons why this might occur. They should be taken in context and worked through with the person/people involved if reasonable. We have a duty to try and help all programmers become better programmers! In the good old days when development was done using good old notepad (or other simple text editing tool) this occurred much more frequently. However we have the assistance of modern IDE's now so although we shouldn't necessarily become OTT about this, it should still probably be addressed to some degree. We as programmers vary in our standards, styles and approaches to solutions. However it seems that in general we all take PRIDE in our work and as a trait it is something that can stand programmers apart. Making something to the best of our abilities both internal (code) and external (end user result) goes along way to giving us that big fat pat on the back that we may not go looking for but swells our heart with pride. And finally to quote CrazyEddie from his post below. Don't sweat the small stuff

Read the article
Using the Script Component as a Conditional Split

This is a quick walk through on how you can use the Script Component to perform Conditional Split like behaviour, splitting your data across multiple outputs. We will use C# code to decide what does flows to which output, rather than the expression syntax of the Conditional Split transformation. Start by setting up the source. For my example the source is a list of SQL objects from sys.objects, just a quick way to get some data: SELECT type, name FROM sys.objects type name S syssoftobjrefs F FK_Message_Page U Conference IT queue_messages_23007163 Shown above is a small sample of the data you could expect to see. Once you have setup your source, add the Script Component, selecting Transformation when prompted for the type, and connect it up to the source. Now open the component, but don’t dive into the script just yet. First we need to select some columns. Select the Input Columns page and then select the columns we want to uses as part of our filter logic. You don’t need to choose columns that you may want later, this is just the columns used in the script itself. Next we need to add our outputs. Select the Inputs and Outputs page.You get one by default, but we need to add some more, it wouldn’t be much of a split otherwise. For this example we’ll add just one more. Click the Add Output button, and you’ll see a new output is added. Now we need to set some properties, so make sure our new Output 1 is selected. In the properties grid change the SynchronousInputID property to be our input Input 0, and change the ExclusionGroup property to 1. Now select Ouput 0 and change the ExclusionGroup property to 2. This value itself isn’t important, provided each output has a different value other than zero. By setting this property on both outputs it allows us to split the data down one or the other, making each exclusive. If we left it to 0, that output would get all the rows. It can be a useful feature allowing you to copy selected rows to one output whilst retraining the full set of data in the other. Now we can go back to the Script page and start writing some code. For the example we will do a very simple test, if the value of the type column is U, for user table, then it goes down the first output, otherwise it ends up in the other. This mimics the exclusive behaviour of the conditional split transformation. public override void Input0_ProcessInputRow(Input0Buffer Row) { // Filter all user tables to the first output, // the remaining objects down the other if (Row.type.Trim() == "U") { Row.DirectRowToOutput0(); } else { Row.DirectRowToOutput1(); } } The code itself is very simple, a basic if clause that determines which of the DirectRowToOutput methods we call, there is one for each output. Of course you could write a lot more code to implement some very complex logic, but the final direction is still just a method call. If we now close the script component, we can hook up the outputs and test the package. Your numbers will vary depending on the sample database but as you can see we have clearly split out input data into two outputs. As a final tip, when adding the outputs I would normally rename them, changing the Name in the Properties grid. This means the generated methods follow the pattern as do the path label shown on the design surface, making everything that much easier to recognise.

Read the article
Standards Corner: Preventing Pervasive Monitoring

- by independentid

Phil Hunt is an active member of multiple industry standards groups and committees and has spearheaded discussions, creation and ratifications of industry standards including the Kantara Identity Governance Framework, among others. Being an active voice in the industry standards development world, we have invited him to share his discussions, thoughts, news & updates, and discuss use cases, implementation success stories (and even failures) around industry standards on this monthly column. Author: Phil Hunt On Wednesday night, I watched NBC’s interview of Edward Snowden. The past year has been tumultuous one in the IT security industry. There has been some amazing revelations about the activities of governments around the world; and, we have had several instances of major security bugs in key security libraries: Apple's ‘gotofail’ bug the OpenSSL Heartbleed bug, not to mention Java’s zero day bug, and others. Snowden’s information showed the IT industry has been underestimating the need for security, and highlighted a general trend of lax use of TLS and poorly implemented security on the Internet. This did not go unnoticed in the standards community and in particular the IETF. Last November, the IETF (Internet Engineering Task Force) met in Vancouver Canada, where the issue of “Internet Hardening” was discussed in a plenary session. Presentations were given by Bruce Schneier, Brian Carpenter, and Stephen Farrell describing the problem, the work done so far, and potential IETF activities to address the problem pervasive monitoring. At the end of the presentation, the IETF called for consensus on the issue. If you know engineers, you know that it takes a while for a large group to arrive at a consensus and this group numbered approximately 3000. When asked if the IETF should respond to pervasive surveillance attacks? There was an overwhelming response for ‘Yes'. When it came to 'No', the room echoed in silence. This was just the first of several consensus questions that were each overwhelmingly in favour of response. This is the equivalent of a unanimous opinion for the IETF. Since the meeting, the IETF has followed through with the recent publication of a new “best practices” document on Pervasive Monitoring (RFC 7258). This document is extremely sensitive in its approach and separates the politics of monitoring from the technical ones. Pervasive Monitoring (PM) is widespread (and often covert) surveillance through intrusive gathering of protocol artefacts, including application content, or protocol metadata such as headers. Active or passive wiretaps and traffic analysis, (e.g., correlation, timing or measuring packet sizes), or subverting the cryptographic keys used to secure protocols can also be used as part of pervasive monitoring. PM is distinguished by being indiscriminate and very large scale, rather than by introducing new types of technical compromise. The IETF community's technical assessment is that PM is an attack on the privacy of Internet users and organisations. The IETF community has expressed strong agreement that PM is an attack that needs to be mitigated where possible, via the design of protocols that make PM significantly more expensive or infeasible. Pervasive monitoring was discussed at the technical plenary of the November 2013 IETF meeting [IETF88Plenary] and then through extensive exchanges on IETF mailing lists. This document records the IETF community's consensus and establishes the technical nature of PM. The draft goes on to further qualify what it means by “attack”, clarifying that The term is used here to refer to behavior that subverts the intent of communicating parties without the agreement of those parties. An attack may change the content of the communication, record the content or external characteristics of the communication, or through correlation with other communication events, reveal information the parties did not intend to be revealed. It may also have other effects that similarly subvert the intent of a communicator. The past year has shown that Internet specification authors need to put more emphasis into information security and integrity. The year also showed that specifications are not good enough. The implementations of security and protocol specifications have to be of high quality and superior testing. I’m proud to say Oracle has been a strong proponent of this, having already established its own secure coding practices.

Read the article
SQL SERVER – Asynchronous Update and Timestamp – Check if Row Values are Changed Since Last Retrieve

- by pinaldave

Here is the question received just this morning. “Pinal, Our application is much different than other application you might have come across. In simple words, I would like to call it Asynchronous Updated Application. We need your quick opinion about one of the situation which we are facing. From business side: We have bidding system (similar to eBay but not exactly) and where multiple parties bid on one item, during the last few minutes of bidding many parties try to bid at the same time with the same price. When they hit submit, we would like to check if the original data which they retrieved is changed or not. If the original data which they have retrieved is the same, we will accept their new proposed price. If original data are changed, they will have to resubmit the data with new price. From technical side: We have a row which we retrieve in our application. Multiple users are retrieving the same row. Some of the users will update the value of the row and submit. However, only the very first user should be allowed to update the row and remaining all the users will have to re-fetch the row and updated it once again. We do not want to lock any record as that will create other problems. Do you have any solution for this kind of situation?” Fantastic Question. I believe there is good chance that we can use timestamp datatype in this kind of application. Before we continue let us see following simple example. USE tempdb GO CREATE TABLE SampleTable (ID INT, Col1 VARCHAR(100), TimeStampCol TIMESTAMP) GO INSERT INTO SampleTable (ID, Col1) VALUES (1, 'FirstVal') GO SELECT ID, Col1, TimeStampCol FROM SampleTable st GO UPDATE SampleTable SET Col1 = 'NextValue' GO SELECT ID, Col1, TimeStampCol FROM SampleTable st GO DROP TABLE SampleTable GO Now let us see the resultset. Here is the simple explanation of the scenario. We created a table with simple column with TIMESTAMP datatype. When we inserted a very first value the timestamp was generated. When we updated any value in that row, the timestamp was updated with the new value. Every single time when we update any value in the row, it will generate new timestamp value. Now let us apply this in an original question’s scenario. In that case multiple users are retrieving the same row. Everybody will have the same now same TimeStamp with them. Before any user update any value they should once again retrieve the timestamp from the table and compare with the timestamp they have with them. If both of the timestamp have the same value – the original row has not been updated and we can safely update the row with the new value. After initial update, now the row will contain a new timestamp. Any subsequent update to the same row should also go to the same process of checking the value of the timestamp they have in their memory. In this case, the timestamp from memory will be different from the timestamp in the row. This indicates that row in the table has changed and new updates should not be allowed. I believe timestamp can be very very useful in this kind of scenario. Is there any better alternative? Please leave a comment with the suggestion and I will post on the blog with due credit. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
65536% Autogrowth!

- by Tara Kizer

Twice a year, we move our production systems to our disaster recovery site. Last Saturday night was one of those days. There are about 50 SQL Server databases to be moved to the DR site, which is done via database mirroring. It takes only a few seconds to failover, but some databases have a bit more involved work such as setting up replication. Everything went relatively smooth, but we encountered a weird bug on our most mission critical system. After everything was successfully failed over to the DR site, it was noticed that mirroring was in a suspended state on one of the databases. We thought we had run into a SQL Server 2005 bug that we had been encountering and were working with Microsoft on a fix. Microsoft did fix it in both SQL Server 2005 service pack 3 cumulative update package 13 and service pack 4 cumulative update package 2, however SP3 CU13 and SP4 both recently failed on this system so we were not patched yet with the bug fix. As the suspended state was causing us issues with replication, we dropped mirroring. We then noticed we had 10MB of free disk space on the mount point where the principal’s data files are stored. I knew something went amiss as this system should have at least 150GB free on that mount point. I immediately checked the main database’s data file and was shocked to see an autgrowth size of 65536%. The data file autogrew right before mirroring went into the suspended state. 65536%! I didn’t have a lot of time to research if this autgrowth problem was a known SQL Server bug, so I deferred that research to today. A quick Google search yielded no results but emphasis on “quick”. I checked our performance system, which was recently restored with a copy of the affected production database, and found the autogrowth setting to be 512MB. So this autogrowth bug was encountered sometime in the last two weeks. On February 26th, we had attempted to install SQL 2005 SP4 on production, however it had failed (PSS case open with Microsoft). I suspected that the SP4 failure was somehow related to this autgrowth bug although that turned out not to be the case. I then tweeted (@TaraKizer) about this problem to see if the SQL Server community (#sqlhelp) had any insights. It seems several people have either heard of this bug or encountered it. Aaron Bertrand (blog|twitter) referred me to this Connect item. Our affected database originated on SQL Server 2000 and was upgraded to SQL Server 2005 in 2007. Back on SQL Server 2000, we were using the default file growth setting which was a percentage. Sometime after the 2005 upgrade is when we changed it to 512MB. Our situation seemed to fit the bug Aaron referred to me, so now the question was whether Microsoft had fixed it yet. I received a reply to my tweet from Amit Banerjee (twitter) that it had been fixed in SP3 CU1 (KB958004). My affected system is SP3 CU8, so I was initially confused why we had encountered the bug. Because I don’t read things fully, I had missed that there are additional steps you have to follow after applying the bug fix. Amit set me straight. Although you can read this information in the KB article, I will also copy it here in case you are as lazy as me and miss the most important section of it (although if you are as lazy as me, you won’t have read this far down my blog post): This hotfix will prevent only future occurrences of this problem. For example, if you restore a database from SQL Server 2000 to a SQL Server 2005 instance that contains this hotfix, this problem will not occur. However, if you already have a database that is affected by this problem, you must follow these steps to resolve this problem manually: Apply this hotfix. Set the file growth settings for the affected files to percentage settings, and then set the settings back to megabyte settings. Take the database offline, and then bring it back online. Verify that the values of the is_percent_growth column are correct in the sys.database_files system table and in the sys.master_files system table.

Read the article
What DX level does my graphics card support? Does it go to 11?

- by Daniel Moth

Recently I run into a situation that I have run into quite a few times. Someone encounters a machine and the question arises: "Is there a DirectX 11 card in this machine?". Typically the reason you are interested in that is because cards with DirectX 11 drivers fully support DirectCompute (and by extension C++ AMP) for GPGPU programming. The driver specifically is WDDM (1.1 on Windows 7 and Windows 8 introduces WDDM 1.2 with cool new capabilities). There are many ways for figuring out if you have a DirectX11 card, so here are the approaches that you can use, with a bonus right at the end of the post. Run DxDiag WindowsKey + R, type DxDiag and hit Enter. That is the DirectX diagnostic tool, which unfortunately, only tells you on the "System" tab what is the highest version of DirectX installed on your machine. So if it reports DirectX 11, that doesn't mean you have a DX11 driver! The "Display" tab has a promising "DDI version" label, but unfortunately that doesn't seem to be accurate on the machines I've tested it with (or I may be misinterpreting its use). Either way, this tool is not the one you want for this purpose, although it is good for telling you the WDDM version among other things. Use the Microsoft hardware page There is a Microsoft Windows 7 compatibility center, that lists all hardware (tip: use the advanced search) and you could try and locate your device there… good luck. Use Wikipedia or the hardware vendor's website Use the Wikipedia page for the vendor cards, for both nvidia and amd. Often this information will also be in the specifications for the cards on the IHV site, but is is nice that wikipedia has a single page per vendor that you can search etc. There is a column in the tables for API support where you can see the DirectX version. Check if it is one of these recommended DX11 cards You may not have a DirectX 11 card and are interested in purchasing one. While I am in no position to make recommendations, I will list here some cards from two big IHVs that we know are DirectX 11 capable. Some AMD (aka ATI) cards Low end, inexpensive DX11 hardware: Radeon 5450, 5550, 6450, 6570 Mid range (decent perf, single precision): Radeon 5750, 5770, 6770, 6790 High end (capable of double precision): Radeon 5850, 5870, 6950, 6970 Single precision APUs: AMD E-Series APUs AMD A-Series APUs Some NVIDIA cards Low end, inexpensive DX11 hardware: GeForce GT430, GT 440, GT520, GTS 450 Quadro 400, 600 Mid-range (decent perf, single precision): GeForce GTX 460, GTX 550 Ti, GTX 560, GTX 560 Ti Quadro 2000 High end (capable of double precision): GeForce GTX 480, GTX 570, GTX 580, GTX 590, GTX 595 Quadro 4000, 5000, 6000 Tesla C2050, C2070, C2075 Get the DirectX SDK and run DirectX Caps Viewer Download and install the June 2010 DirectX SDK. As part of that you now have the DirectX Capabilities Viewer utility (find it in your start menu by searching for "DirectX Caps Viewer", the filename is DXCapsViewer.exe). It will list all your devices (emulated, and real hardware ones) under the first node. Expand the hardware entries and then expand again the Direct3D 11 folder. If you see D3D_FEATURE_LEVEL_11_ under that, then your card supports feature level 11 which means it supports DirectCompute and C++ AMP. In the following screenshot of one of my old laptops, the card only goes to feature level 10. Run a utility from the web that just tells you! Of course, writing some C++ AMP code that enumerates accelerators and lists the ones that are capable is trivial. However that requires that you have redistributed the runtime, so a more broadly applicable approach is to use the DX APIs directly to enumerate the DX11 capable cards. That is exactly what the development lead for C++ AMP has done and he describes and shares that utility at this post. Comments about this post by Daniel Moth welcome at the original blog.

Read the article
SQL Azure: Notes on Building a Shard Technology

- by Herve Roggero

In Chapter 10 of the book on SQL Azure (http://www.apress.com/book/view/9781430229612) I am co-authoring, I am digging deeper in what it takes to write a Shard. It's actually a pretty cool exercise, and I wanted to share some thoughts on how I am designing the technology. A Shard is a technology that spreads the load of database requests over multiple databases, as transparently as possible. The type of shard I am building is called a Vertical Partition Shard (VPS). A VPS is a mechanism by which the data is stored in one or more databases behind the scenes, but your code has no idea at design time which data is in which database. It's like having a mini cloud for records instead of services. Imagine you have three SQL Azure databases that have the same schema (DB1, DB2 and DB3), you would like to issue a SELECT * FROM Users on all three databases, concatenate the results into a single resultset, and order by last name. Imagine you want to ensure your code doesn't need to change if you add a new database to the shard (DB4). Now imagine that you want to make sure all three databases are queried at the same time, in a multi-threaded manner so your code doesn't have to wait for three database calls sequentially. Then, imagine you would like to obtain a breadcrumb (in the form of a new, virtual column) that gives you a hint as to which database a record came from, so that you could update it if needed. Now imagine all that is done through the standard SqlClient library... and you have the Shard I am currently building. Here are some lessons learned and techniques I am using with this shard: Parellel Processing: Querying databases in parallel is not too hard using the Task Parallel Library; all you need is to lock your resources when needed Deleting/Updating Data: That's not too bad either as long as you have a breadcrumb. However it becomes more difficult if you need to update a single record and you don't know in which database it is. Inserting Data: I am using a round-robin approach in which each new insert request is directed to the next database in the shard. Not sure how to deal with Bulk Loads just yet... Shard Databases: I use a static collection of SqlConnection objects which needs to be loaded once; from there on all the Shard commands use this collection Extension Methods: In order to make it look like the Shard commands are part of the SqlClient class I use extension methods. For example I added ExecuteShardQuery and ExecuteShardNonQuery methods to SqlClient. Exceptions: Capturing exceptions in a multi-threaded code is interesting... but I kept it simple for now. I am using the ConcurrentQueue to store my exceptions. Database GUID: Every database in the shard is given a GUID, which is calculated based on the connection string's values. DataTable. The Shard methods return a DataTable object which can be bound to objects. I will be sharing the code soon as an open-source project in CodePlex. Please stay tuned on twitter to know when it will be available (@hroggero). Or check www.bluesyntax.net for updates on the shard. Thanks!

Read the article
How the number of indexes built on a table can impact performances?

- by Davide Mauri

We all know that putting too many indexes (I’m talking of non-clustered index only, of course) on table may produce performance problems due to the overhead that each index bring to all insert/update/delete operations on that table. But how much? I mean, we all agree – I think – that, generally speaking, having many indexes on a table is “bad”. But how bad it can be? How much the performance will degrade? And on a concurrent system how much this situation can also hurts SELECT performances? If SQL Server take more time to update a row on a table due to the amount of indexes it also has to update, this also means that locks will be held for more time, slowing down the perceived performance of all queries involved. I was quite curious to measure this, also because when teaching it’s by far more impressive and effective to show to attended a chart with the measured impact, so that they can really “feel” what it means! To do the tests, I’ve create a script that creates a table (that has a clustered index on the primary key which is an identity column) , loads 1000 rows into the table (inserting 1000 row using only one insert, instead of issuing 1000 insert of one row, in order to minimize the overhead needed to handle the transaction, that would have otherwise ), and measures the time taken to do it. The process is then repeated 16 times, each time adding a new index on the table, using columns from table in a round-robin fashion. Test are done against different row sizes, so that it’s possible to check if performance changes depending on row size. The result are interesting, although expected. This is the chart showing how much time it takes to insert 1000 on a table that has from 0 to 16 non-clustered indexes. Each test has been run 20 times in order to have an average value. The value has been cleaned from outliers value due to unpredictable performance fluctuations due to machine activity. The test shows that in a table with a row size of 80 bytes, 1000 rows can be inserted in 9,05 msec if no indexes are present on the table, and the value grows up to 88 (!!!) msec when you have 16 indexes on it This means a impact on performance of 975%. That’s *huge*! Now, what happens if we have a bigger row size? Say that we have a table with a row size of 1520 byte. Here’s the data, from 0 to 16 indexes on that table: In this case we need near 22 msec to insert 1000 in a table with no indexes, but we need more that 500msec if the table has 16 active indexes! Now we’re talking of a 2410% impact on performance! Now we can have a tangible idea of what’s the impact of having (too?) many indexes on a table and also how the size of a row also impact performances. That’s why the golden rule of OLTP databases “few indexes, but good” is so true! (And in fact last week I saw a database with tables with 1700bytes row size and 23 (!!!) indexes on them!) This also means that a too heavy denormalization is really not a good idea (we’re always talking about OLTP systems, keep it in mind), since the performance get worse with the increase of the row size. So, be careful out there, and keep in mind the “equilibrium” is the key world of a database professional: equilibrium between read and write performance, between normalization and denormalization, between to few and too may indexes. PS Tests are done on a VMWare Workstation 7 VM with 2 CPU and 4 GB of Memory. Host machine is a Dell Precsioni M6500 with i7 Extreme X920 Quad-Core HT 2.0Ghz and 16Gb of RAM. Database is stored on a SSD Intel X-25E Drive, Simple Recovery Model, running on SQL Server 2008 R2. If you also want to to tests on your own, you can download the test script here: Open TestIndexPerformance.sql

Read the article
What's wrong with this turn to face algorithm?

- by Chan

I implement a torpedo object that chases a rotating planet. Specifically, it will turn toward the planet each update. Initially my implement was: void move() { vector3<float> to_target = target - get_position(); to_target.normalize(); position += (to_target * speed); } which works perfectly for torpedo that is a solid sphere. Now my torpedo is actually a model, which has a forward vector, so using this method looks odd because it doesn't actually turn toward but jump toward. So I revised it a bit to get, double get_rotation_angle(vector3<float> u, vector3<float> v) const { u.normalize(); v.normalize(); double cosine_theta = u.dot(v); // domain of arccosine is [-1, 1] if (cosine_theta > 1) { cosine_theta = 1; } if (cosine_theta < -1) { cosine_theta = -1; } return math3d::to_degree(acos(cosine_theta)); } vector3<float> get_rotation_axis(vector3<float> u, vector3<float> v) const { u.normalize(); v.normalize(); // fix linear case if (u == v || u == -v) { v[0] += 0.05; v[1] += 0.0; v[2] += 0.05; v.normalize(); } vector3<float> axis = u.cross(v); return axis.normal(); } void turn_to_face() { vector3<float> to_target = (target - position); vector3<float> axis = get_rotation_axis(get_forward(), to_target); double angle = get_rotation_angle(get_forward(), to_target); double distance = math3d::distance(position, target); gl_matrix_mode(GL_MODELVIEW); gl_push_matrix(); { gl_load_identity(); gl_translate_f(position.get_x(), position.get_y(), position.get_z()); gl_rotate_f(angle, axis.get_x(), axis.get_y(), axis.get_z()); gl_get_float_v(GL_MODELVIEW_MATRIX, OM); } gl_pop_matrix(); move(); } void move() { vector3<float> to_target = target - get_position(); to_target.normalize(); position += (get_forward() * speed); } The logic is simple, I find the rotation axis by cross product, the angle to rotate by dot product, then turn toward the target position each update. Unfortunately, it looks extremely odds since the rotation happens too fast that it always turns back and forth. The forward vector for torpedo is from the ModelView matrix, the third column A: MODELVIEW MATRIX -------------------------------------------------- R U A T -------------------------------------------------- 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 -------------------------------------------------- Any suggestion or idea would be greatly appreciated.

Read the article
What DX level does my graphics card support? Does it go to 11?

- by Daniel Moth

Recently I run into a situation that I have run into quite a few times. Someone encounters a machine and the question arises: "Is there a DirectX 11 card in this machine?". Typically the reason you are interested in that is because cards with DirectX 11 drivers fully support DirectCompute (and by extension C++ AMP) for GPGPU programming. The driver specifically is WDDM (1.1 on Windows 7 and Windows 8 introduces WDDM 1.2 with cool new capabilities). There are many ways for figuring out if you have a DirectX11 card, so here are the approaches that you can use, with a bonus right at the end of the post. Run DxDiag WindowsKey + R, type DxDiag and hit Enter. That is the DirectX diagnostic tool, which unfortunately, only tells you on the "System" tab what is the highest version of DirectX installed on your machine. So if it reports DirectX 11, that doesn't mean you have a DX11 driver! The "Display" tab has a promising "DDI version" label, but unfortunately that doesn't seem to be accurate on the machines I've tested it with (or I may be misinterpreting its use). Either way, this tool is not the one you want for this purpose, although it is good for telling you the WDDM version among other things. Use the Microsoft hardware page There is a Microsoft Windows 7 compatibility center, that lists all hardware (tip: use the advanced search) and you could try and locate your device there… good luck. Use Wikipedia or the hardware vendor's website Use the Wikipedia page for the vendor cards, for both nvidia and amd. Often this information will also be in the specifications for the cards on the IHV site, but is is nice that wikipedia has a single page per vendor that you can search etc. There is a column in the tables for API support where you can see the DirectX version. Check if it is one of these recommended DX11 cards You may not have a DirectX 11 card and are interested in purchasing one. While I am in no position to make recommendations, I will list here some cards from two big IHVs that we know are DirectX 11 capable. Some AMD (aka ATI) cards Low end, inexpensive DX11 hardware: Radeon 5450, 5550, 6450, 6570 Mid range (decent perf, single precision): Radeon 5750, 5770, 6770, 6790 High end (capable of double precision): Radeon 5850, 5870, 6950, 6970 Single precision APUs: AMD E-Series APUs AMD A-Series APUs Some NVIDIA cards Low end, inexpensive DX11 hardware: GeForce GT430, GT 440, GT520, GTS 450 Quadro 400, 600 Mid-range (decent perf, single precision): GeForce GTX 460, GTX 550 Ti, GTX 560, GTX 560 Ti Quadro 2000 High end (capable of double precision): GeForce GTX 480, GTX 570, GTX 580, GTX 590, GTX 595 Quadro 4000, 5000, 6000 Tesla C2050, C2070, C2075 Get the DirectX SDK and run DirectX Caps Viewer Download and install the June 2010 DirectX SDK. As part of that you now have the DirectX Capabilities Viewer utility (find it in your start menu by searching for "DirectX Caps Viewer", the filename is DXCapsViewer.exe). It will list all your devices (emulated, and real hardware ones) under the first node. Expand the hardware entries and then expand again the Direct3D 11 folder. If you see D3D_FEATURE_LEVEL_11_ under that, then your card supports feature level 11 which means it supports DirectCompute and C++ AMP. In the following screenshot of one of my old laptops, the card only goes to feature level 10. Run a utility from the web that just tells you! Of course, writing some C++ AMP code that enumerates accelerators and lists the ones that are capable is trivial. However that requires that you have redistributed the runtime, so a more broadly applicable approach is to use the DX APIs directly to enumerate the DX11 capable cards. That is exactly what the development lead for C++ AMP has done and he describes and shares that utility at this post. Comments about this post by Daniel Moth welcome at the original blog.

Read the article
How views are changing in future versions of SQL

- by Rob Farley

April is here, and this weekend, SQL v11.0 (previous known as Denali, now known as SQL Server 2012) reaches general availability. And so I thought I’d share some news about what’s coming next. I didn’t hear this at the MVP Summit earlier this year (where there was lots of NDA information given, but I didn’t go), so I think I’m free to share it. I’ve written before about CTEs being query-scoped views. Well, the actual story goes a bit further, and will continue to develop in future versions. A CTE is a like a “temporary temporary view”, scoped to a single query. Due to globally-scoped temporary objects using a two-hashes naming style, and session-scoped (or ‘local’) temporary objects a one-hash naming style, this query-scoped temporary object uses a cunning zero-hash naming style. We see this implied in Books Online in the CREATE TABLE page, but as we know, temporary views are not yet supported in the SQL Server. However, in a breakaway from ANSI-SQL, Microsoft is moving towards consistency with their naming. We know that a CTE is a “common table expression” – this is proving to be a more strategic than you may have appreciated. Within the Microsoft product group, the term “Table Expression” is far more widely used than just CTEs. Anything that can be used in a FROM clause is referred to as a Table Expression, so long as it doesn’t actually store data (which would make it a Table, rather than a Table Expression). You can see this is not just restricted to the product group by doing an internet search for how the term is used without ‘common’. In the past, Books Online has referred to a view as a “virtual table” (but notice that there is no SQL 2012 version of this page). However, it was generally decided that “virtual table” was a poor name because it wasn’t completely accurate, and it’s typically accepted that virtualisation and SQL is frowned upon. That page I linked to says “or stored query”, which is slightly better, but when the SQL 2012 version of that page is actually published, the line will be changed to read: “A view is a stored table expression (STE)”. This change will be the first of many. During the SQL 2012 R2 release, the keyword VIEW will become deprecated (this will be SQL v11 SP1.5). Three versions later, in SQL 14.5, you will need to be in compatibility mode 140 to allow “CREATE VIEW” to work. Also consistent with Microsoft’s deprecation policy, the execution of any query that refers to an object created as a view (rather than the new “CREATE STE”), will cause a Deprecation Event to fire. This will all be in preparation for the introduction of Single-Column Table Expressions (to be introduced in SQL 17.3 SP6) which will finally shut up those people waiting for a decent implementation of Inline Scalar Functions. And of course, CTEs are “Common” because the Table Expression definition needs to be repeated over and over throughout a stored procedure. ...or so I think I heard at some point. Oh, and congratulations to all the new MVPs on this April 1st. @rob_farley

Read the article
Oracle GoldenGate Active-Active Part 1

- by Nick_W

My name is Nick Wagner, and I'm a recent addition to the Oracle Maximum Availability Architecture (MAA) product management team. I've spent the last 15+ years working on database replication products, and I've spent the last 10 years working on the Oracle GoldenGate product. So most of my posting will probably be focused on OGG. One question that comes up all the time is around active-active replication with Oracle GoldenGate. How do I know if my application is a good fit for active-active replication with GoldenGate? To answer that, it really comes down to how you plan on handling conflict resolution. I will delve into topology and deployment in a later blog, but here is a simple architecture: The two most common resolution routines are host based resolution and timestamp based resolution. Host based resolution is used less often, but works with the fewest application changes. Think of it like this: any transactions from SystemA always take precedence over any transactions from SystemB. If there is a conflict on SystemB, then the record from SystemA will overwrite it. If there is a conflict on SystemA, then it will be ignored. It is quite a bit less restrictive, and in most cases, as long as all the tables have primary keys, host based resolution will work just fine. Timestamp based resolution, on the other hand, is a little trickier. In this case, you can decide which record is overwritten based on timestamps. For example, does the older record get overwritten with the newer record? Or vice-versa? This method not only requires primary keys on every table, but it also requires every table to have a timestamp/date column that is updated each time a record is inserted or updated on the table. Most homegrown applications can always be customized to include these requirements, but it's a little more difficult with 3rd party applications, and might even be impossible for large ERP type applications. If your database has these features - whether it’s primary keys for host based resolution, or primary keys and timestamp columns for timestamp based resolution - then your application could be a great candidate for active-active replication. But table structure is not the only requirement. The other consideration applies when there is a conflict; i.e., do I need to perform any notification or track down the user that had their data overwritten? In most cases, I don't think it's necessary, but if it is required, OGG can always create an exceptions table that contains all of the overwritten transactions so that people can be notified. It's a bit of extra work to implement this type of option, but if the business requires it, then it can be done. Unless someone is constantly monitoring this exception table or has an automated process in dealing with exceptions, there will be a delay in getting a response back to the end user. Ideally, when setting up active-active resolution we can include some simple procedural steps or configuration options that can reduce, or in some cases eliminate the potential for conflicts. This makes the whole implementation that much easier and foolproof. And I'll cover these in my next blog.

Read the article
PowerShell: Read Excel to Create Inserts

- by BuckWoody

I’m writing a series of articles on how to migrate “departmental” data into SQL Server. I also hold workshops on the entire process – from discovering that the data exists to the modeling process and then how to design the Extract, Transform and Load (ETL) process. Finally I write about (and teach) a few methods on actually moving the data. One of those options is to use PowerShell. There are a lot of ways even with that choice, but the one I show is to read two columns from the spreadsheet and output statements that would insert the data using a stored procedure. Of course, you could re-write this as INSERT statements, out to a text file for bcp, or even use a database connection in the script to move the data directly from Excel into SQL Server. This snippet won’t run on your system, of course – it assumes a Microsoft Office Excel 2007 spreadsheet located at c:\temp called VendorList.xlsx. It looks for a tab in that spreadsheet called Vendors. The statement that does the writing just uses one column: Vendor Code. Here’s the breakdown of what I’m doing: In the first block, I connect to Microsoft Office Excel. That connection string is specific to Excel 2007, so if you need a different version you’ll need to look that up. In the second block I set up a selection from the entire spreadsheet based on that tab. Note that if you’re only after certain data you shouldn’t get the whole spreadsheet – that’s just good practice. In the next block I create the text I want, inserting the Vendor Code field as I go. Finally I close the connection. Enjoy! $ExcelConnection= New-Object -com "ADODB.Connection" $ExcelFile="c:\temp\VendorList.xlsx" $ExcelConnection.Open("Provider=Microsoft.ACE.OLEDB.12.0;` Data Source=$ExcelFile;Extended Properties=Excel 12.0;") $strQuery="Select * from [Vendors$]" $ExcelRecordSet=$ExcelConnection.Execute($strQuery) do { Write-Host "EXEC sp_InsertVendors '" $ExcelRecordSet.Fields.Item("Vendor Code").Value "'" $ExcelRecordSet.MoveNext()} Until ($ExcelRecordSet.EOF) $ExcelConnection.Close() Script Disclaimer, for people who need to be told this sort of thing: Never trust any script, including those that you find here, until you understand exactly what it does and how it will act on your systems. Always check the script on a test system or Virtual Machine, not a production system. All scripts on this site are performed by a professional stunt driver on a closed course. Your mileage may vary. Void where prohibited. Offer good for a limited time only. Keep out of reach of small children. Do not operate heavy machinery while using this script. If you experience blurry vision, indigestion or diarrhea during the operation of this script, see a physician immediately. Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
Where should you put constants and why?

- by Tim Meyer

In our mostly large applications, we usually have a only few locations for constants: One class for GUI and internal contstants (Tab Page titles, Group Box titles, calculation factors, enumerations) One class for database tables and columns (this part is generated code) plus readable names for them (manually assigned) One class for application messages (logging, message boxes etc) The constants are usually separated into different structs in those classes. In our C++ applications, the constants are only defined in the .h file and the values are assigned in the .cpp file. One of the advantages is that all strings etc are in one central place and everybody knows where to find them when something must be changed. This is especially something project managers seem to like as people come and go and this way everybody can change such trivial things without having to dig into the application's structure. Also, you can easily change the title of similar Group Boxes / Tab Pages etc at once. Another aspect is that you can just print that class and give it to a non-programmer who can check if the captions are intuitive, and if messages to the user are too detailed or too confusing etc. However, I see certain disadvantages: Every single class is tightly coupled to the constants classes Adding/Removing/Renaming/Moving a constant requires recompilation of at least 90% of the application (Note: Changing the value doesn't, at least for C++). In one of our C++ projects with 1500 classes, this means around 7 minutes of compilation time (using precompiled headers; without them it's around 50 minutes) plus around 10 minutes of linking against certain static libraries. Building a speed optimized release through the Visual Studio Compiler takes up to 3 hours. I don't know if the huge amount of class relations is the source but it might as well be. You get driven into temporarily hard-coding strings straight into code because you want to test something very quickly and don't want to wait 15 minutes just for that test (and probably every subsequent one). Everybody knows what happens to the "I will fix that later"-thoughts. Reusing a class in another project isn't always that easy (mainly due to other tight couplings, but the constants handling doesn't make it easier.) Where would you store constants like that? Also what arguments would you bring in order to convince your project manager that there are better concepts which also comply with the advantages listed above? Feel free to give a C++-specific or independent answer. PS: I know this question is kind of subjective but I honestly don't know of any better place than this site for this kind of question. Update on this project I have news on the compile time thing: Following Caleb's and gbjbaanb's posts, I split my constants file into several other files when I had time. I also eventually split my project into several libraries which was now possible much easier. Compiling this in release mode showed that the auto-generated file which contains the database definitions (table, column names and more - more than 8000 symbols) and builds up certain hashes caused the huge compile times in release mode. Deactivating MSVC's optimizer for the library which contains the DB constants now allowed us to reduce the total compile time of your Project (several applications) in release mode from up to 8 hours to less than one hour! We have yet to find out why MSVC has such a hard time optimizing these files, but for now this change relieves a lot of pressure as we no longer have to rely on nightly builds only. That fact - and other benefits, such as less tight coupling, better reuseability etc - also showed that spending time splitting up the "constants" wasn't such a bad idea after all ;-)

Read the article
Green (Screen) Computing

- by onefloridacoder

I recently was given an assignment to create a UX where a user could use the up and down arrow keys, as well as the tab and enter keys to move through a Silverlight datagrid that is going be used as part of a high throughput data entry UI. And to be honest, I’ve not trapped key codes since I coded JavaScript a few years ago. Although the frameworks I’m using made it easy, it wasn’t without some trial and error. The other thing that bothered me was that the customer tossed this into the use case as they were delivering the use case. Fine. I’ll take a whack at anything and beat up myself and beg (I’m not beyond begging for help) the community for help to get something done if I have to. It wasn’t as bad as I thought and I thought I would hopefully save someone a few keystrokes if you wanted to build a green screen for your customer. Here’s the ValueConverter to handle changing the strings to decimals and then back again. The value is a nullable valuetype so there are few extra steps to take. Usually the “ConvertBack()” method doesn’t get addressed but in this case we have two-way binding and the converter needs to ensure that if the user doesn’t enter a value it will remain null when the value is reapplied to the model object’s setter. 1: using System; 2: using System.Windows.Data; 3: using System.Globalization; 4: 5: public class NullableDecimalToStringConverter : IValueConverter 6: { 7: public object Convert(object value, Type targetType, object parameter, System.Globalization.CultureInfo culture) 8: { 9: if (!(((decimal?)value).HasValue)) 10: { 11: return (decimal?)null; 12: } 13: if (!(value is decimal)) 14: { 15: throw new ArgumentException("The value must be of type decimal"); 16: } 17: 18: NumberFormatInfo nfi = culture.NumberFormat; 19: nfi.NumberDecimalDigits = 4; 20: 21: return ((decimal)value).ToString("N", nfi); 22: } 23: 24: public object ConvertBack(object value, Type targetType, object parameter, System.Globalization.CultureInfo culture) 25: { 26: decimal nullableDecimal; 27: decimal.TryParse(value.ToString(), out nullableDecimal); 28: 29: return nullableDecimal == 0 ? null : nullableDecimal.ToString(); 30: } 31: } The ConvertBack() method uses TryParse to create a value from the incoming string so if the parse fails, we get a null value back, which is what we would expect. But while I was testing I realized that if the user types something like “2..4” instead of “2.4”, TryParse will fail and still return a null. The user is getting “puuu-lenty” of eye-candy to ensure they know how many values are affected in this particular view. Here’s the XAML code. This is the simple part, we just have a DataGrid with one column here that’s bound to the the appropriate ViewModel property with the Converter referenced as well. 1: <data:DataGridTextColumn 2: Header="On-Hand" 3: Binding="{Binding Quantity, 4: Mode=TwoWay, 5: Converter={StaticResource DecimalToStringConverter}}" 6: IsReadOnly="False" /> Nothing too magical here. Just some XAML to hook things up. Here’s the code behind that’s handling the DataGridKeyup event. These are wired to a local/private method but could be converted to something the ViewModel could use, but I just need to get this working for now. 1: // Wire up happens in the constructor 2: this.PicDataGrid.KeyUp += (s, e) => this.HandleKeyUp(e); 1: // DataGrid.BeginEdit fires when DataGrid.KeyUp fires. 2: private void HandleKeyUp(KeyEventArgs args) 3: { 4: if (args.Key == Key.Down || 5: args.Key == Key.Up || 6: args.Key == Key.Tab || 7: args.Key == Key.Enter ) 8: { 9: this.PicDataGrid.BeginEdit(); 10: } 11: } And that’s it. The ValueConverter was the biggest problem starting out because I was using an existing converter that didn’t take nullable value types into account. Once the converter was passing back the appropriate value (null, “#.####”) the grid cell(s) and the model objects started working as I needed them to. HTH.

Read the article
When row estimation goes wrong

- by Dave Ballantyne

Whilst working at a client site, I hit upon one of those issues that you are not sure if that this is something entirely new or a bug or a gap in your knowledge. The client had a large query that needed optimizing. The query itself looked pretty good, no udfs, UNION ALL were used rather than UNION, most of the predicates were sargable other than one or two minor ones. There were a few extra joins that could be eradicated and having fixed up the query I then started to dive into the plan. I could see all manor of spills in the hash joins and the sort operations, these are caused when SQL Server has not reserved enough memory and has to write to tempdb. A VERY expensive operation that is generally avoidable. These, however, are a symptom of a bad row estimation somewhere else, and when that bad estimation is combined with other estimation errors, chaos can ensue. Working my way back down the plan, I found the cause, and the more I thought about it the more i came convinced that the optimizer could be making a much more intelligent choice. First step is to reproduce and I was able to simplify the query down a single join between two tables, Product and ProductStatus, from a business point of view, quite fundamental, find the status of particular products to show if ‘active’ ,’inactive’ or whatever. The query itself couldn’t be any simpler The estimated plan looked like this: Ignore the “!” warning which is a missing index, but notice that Products has 27,984 rows and the join outputs 14,000. The actual plan shows how bad that estimation of 14,000 is : So every row in Products has a corresponding row in ProductStatus. This is unsurprising, in fact it is guaranteed, there is a trusted FK relationship between the two columns. There is no way that the actual output of the join can be different from the input. The optimizer is already partly aware of the foreign key meta data, and that can be seen in the simplifiction stage. If we drop the Description column from the query: the join to ProductStatus is optimized out. It serves no purpose to the query, there is no data required from the table and the optimizer knows that the FK will guarantee that a matching row will exist so it has been removed. Surely the same should be applied to the row estimations in the initial example, right ? If you think so, please upvote this connect item. So what are our options in fixing this error ? Simply changing the join to a left join will cause the optimizer to think that we could allow the rows not to exist. or a subselect would also work However, this is a client site, Im not able to change each and every query where this join takes place but there is a more global switch that will fix this error, TraceFlag 2301. This is described as, perhaps loosely, “Enable advanced decision support optimizations”. We can test this on the original query in isolation by using the “QueryTraceOn” option and lo and behold our estimated plan now has the ‘correct’ estimation. Many thanks goes to Paul White (b|t) for his help and keeping me sane through this

Read the article
Fetching Partition Information

- by Mike Femenella

For a recent SSIS package at work I needed to determine the distinct values in a partition, the number of rows in each partition and the file group name on which each partition resided in order to come up with a grouping mechanism. Of course sys.partitions comes to mind for some of that but there are a few other tables you need to link to in order to grab the information required. The table I’m working on contains 8.8 billion rows. Finding the distinct partition keys from this table was not a fast operation. My original solution was to create a temporary table, grab the distinct values for the partitioned column, then update via sys.partitions for the rows and the $partition function for the partitionid and finally look back to the sys.filegroups table for the filegroup names. It wasn’t pretty, it could take up to 15 minutes to return the results. The primary issue is pulling distinct values from the table. Queries for distinct against 8.8 billion rows don’t go quickly. A few beers into a conversation with a friend and we ended up talking about work which led to a conversation about the task described above. The solution was already built in SQL Server, just needed to pull it together. The first table I needed was sys.partition_range_values. This contains one row for each range boundary value for a partition function. In my case I have a partition function which uses dayid values. For example July 4th would be represented as an int, 20130704. This table lists out all of the dayid values which were defined in the function. This eliminated the need to query my source table for distinct dayid values, everything I needed was already built in here for me. The only caveat was that in my SSIS package I needed to create a bucket for any dayid values that were out of bounds for my function. For example if my function handled 20130501 through 20130704 and I had day values of 20130401 or 20130705 in my table, these would not be listed in sys.partition_range_values. I just created an “everything else” bucket in my ssis package just in case I had any dayid values unaccounted for. To get the number of rows for a partition is very easy. The sys.partitions table contains values for each partition. Easy enough to achieve by querying for the object_id and index value of 1 (the clustered index) The final piece of information was the filegroup name. There are 2 options available to get the filegroup name, sys.data_spaces or sys.filegroups. For my query I chose sys.filegroups but really it’s a matter of preference and data needs. In order to bridge between sys.partitions table and either sys.data_spaces or sys.filegroups you need to get the container_id. This can be done by joining sys.allocation_units.container_id to the sys.partitions.hobt_id. sys.allocation_units contains the field data_space_id which then lets you join in either sys.data_spaces or sys.file_groups. The end result is the query below, which typically executes for me in under 1 second. I’ve included the join to sys.filegroups and to sys.dataspaces, and I’ve just commented out the join sys.filegroups. As I mentioned above, this shaves a good 10-15 minutes off of my original ssis package and is a really easy tweak to get a boost in my ETL time. Enjoy.

Read the article
Accounts in Work Items after migration to TFS 2010 and to new domain

- by Clara Oscura

Lately I’ve been doing some tests on migrating our TFS 2008 installation to TFS 2010, coupled with a machine and domain change. One particular topic that was tricky is user accounts. We installed first a new machine with TFS 2010 and then migrated the projects in the old server. The work items were migrated with the projects. Great, but if I try to edit one of the old work items I cannot save it anymore because some fields contain old user names (ex. OLDDOMAIN\user) which are not known in the new domain (it should be NEWDOMAIN\user). The errors look like this: When I correct the ‘Assigned To’ field value, I get another error regarding another field: Before TFS 2010, we had TFSUsers power tool. It allow you to map an old user name to a new user name. This is not available anymore because WI fields with user accounts are now synchronized with AD display names changes (explained here). The correct way to go about this in TFS 2010 is to use TFSConfig Identities before adding the new domain accounts into the TFS groups (documented here). So, too late for us. I’ve found a (tedious) workaround to change those old account in work items in order to allow people to keep working with them. 1. Install TFS 2010 power tools 2. Export WIT from your project (VS | Tools | Process Editor | Work Item Types). Save the definition, for example: Original_MyProject_Task.xml 3. Copy the xml (NoReadOnly_MyProject_Task.xml) and edit it. From the field definition of ‘Activated By’, ‘Closed By’ and ‘Resolved By’, remove the following: <WHENNOTCHANGED field="System.State"> <READONLY /> </WHENNOTCHANGED> 4. Import WIT in VS. Choose the new file (NoReadOnly_MyProject_Task.xml) and import it in MyProject 5. Open all tasks in Excel (flat list). Display the following columns: Asssigned To Activated By Closed By Resolved By Change the user accounts to the new ones (I usually sort each column alphabetically to make it easier). 6. Publish. If you get a conflict on a field, tough luck. You will have to manually choose “Local version” for each work item. I told you it was a tedious process. 7. Import original WIT (Original_MyProject_Task.xml) in MyProject. We only changed the WI definition so that we could change some fields. The original definition should be put back. And what about these other fields? Created By Authorized As These fields are not editable by definition (VS | Tools | Process Editor | Work Item Fields Explorer), even if they are not marked as read-only in the WIT. You can leave the old values. It doesn’t seem to matter to TFS. The other four fields are editable by definition, so only the WIT readonly rule prevents us from changing them. Technorati Tags: TFS,Team Foundation Server 2010,Work Item,Domain change

Read the article
Going by the eBook

- by Tony Davis

The book and magazine publishing world is rapidly going digital, and the industry is faced with making drastic changes to their ways of doing business. The sudden take-up of digital readers by the book-buying public has surprised even the most technological-savvy of the industry. Printed books just aren't selling like they did. In contrast, eBooks are doing well. The ePub file format is the standard around which all publishers are converging. ePub is a standard for formatting book content, so that it can be reflowed for various devices, with their widely differing screen-sizes, and can be read offline. If you unzip an ePub file, you'll find familiar formats such as XML, XHTML and CSS. This is both a blessing and a curse. Whilst it is good to be able to use familiar technologies that have been developed to a level of considerable sophistication, it doesn't get us all the way to producing a viable publication. XHTML is a page-description language, not a book-description language, as we soon found out during our initial experiments, when trying to specify headers, footers, indexes and chaptering. As a result, it is difficult to predict how any particular eBook application will decide to render a book. There isn't even a consensus as to how the cover image is specified. All of this is awkward for the publisher. Each book must be created and revised in a form from which can be generated a whole range of 'printed media', from print books, to Mobi for kindles, ePub for most Tablets and SmartPhones, HTML for excerpted chapters on websites, and a plethora of other formats for other eBook readers, each with its own idiosyncrasies. In theory, if we can get our content into a clean, semantic XML form, such as DOCBOOKS, we can, from there, after every revision, perform a series of relatively simple XSLT transformations to output anything from a HTML article, to an ePub file for reading on an iPad, to an ICML file (an XML-based file format supported by the InDesign tool), ready for print publication. As always, however, the task looks bigger the closer you get to the detail. On the way to the utopian world of an XML-based book format that encompasses all the diverse requirements of the different publication media, ePub looks like a reasonable format to adopt. Its forthcoming support for HTML 5 and CSS 3, with ePub 3.0, means that features, such as widow-and-orphan controls, multi-column flow and multi-media graphics can be incorporated into eBooks. This starts to make it possible to build an "app-like" experience into the eBook and to free publishers to think of putting context before container; to think of what content is required, be it graphical, textual or audio, from the point of view of the user, rather than what's possible in a given, traditional book "Container". In the meantime, there is a gap between what publishers require and what current technology can provide and, of course building this app-like experience is far from plain sailing. Real portability between devices is still a big challenge, and achieving the sort of wizardry seen in the likes of Theodore Grey's "Elements" eBook will require some serious device-specific programming skills. Cheers, Tony.

Read the article
Advanced Record-Level Business Intelligence with Inner Queries

- by gt0084e1

While business intelligence is generally applied at an aggregate level to large data sets, it's often useful to provide a more streamlined insight into an individual records or to be able to sort and rank them. For instance, a salesperson looking at a specific customer could benefit from basic stats on that account. A marketer trying to define an ideal customer could pull the top entries and look for insights or patterns. Inner queries let you do sophisticated analysis without the overhead of traditional BI or OLAP technologies like Analysis Services. Example - Order History Constancy Let's assume that management has realized that the best thing for our business is to have customers ordering every month. We'll need to identify and rank customers based on how consistently they buy and when their last purchase was so sales & marketing can respond accordingly. Our current application may not be able to provide this and adding an OLAP server like SSAS may be overkill for our needs. Luckily, SQL Server provides the ability to do relatively sophisticated analytics via inner queries. Here's the kind of output we'd like to see. Creating the Queries Before you create a view, you need to create the SQL query that does the calculations. Here we are calculating the total number of orders as well as the number of months since the last order. These fields might be very useful to sort by but may not be available in the app. This approach provides a very streamlined and high performance method of delivering actionable information without radically changing the application. It's also works very well with self-service reporting tools like Izenda. SELECT CustomerID,CompanyName, ( SELECT COUNT(OrderID) FROM Orders WHERE Orders.CustomerID = Customers.CustomerID ) As Orders, DATEDIFF(mm, ( SELECT Max(OrderDate) FROM Orders WHERE Orders.CustomerID = Customers.CustomerID) ,getdate() ) AS MonthsSinceLastOrder FROM Customers Creating Views To turn this or any query into a view, just put CREATE VIEW AS before it. If you want to change it use the statement ALTER VIEW AS. Creating Computed Columns If you'd prefer not to create a view, inner queries can also be applied by using computed columns. Place you SQL in the (Formula) field of the Computed Column Specification or check out this article here. Advanced Scoring and Ranking One of the best uses for this approach is to score leads based on multiple fields. For instance, you may be in a business where customers that don't order every month require more persistent follow up. You could devise a simple formula that shows the continuity of an account. If they ordered every month since their first order, they would be at 100 indicating that they have been ordering 100% of the time. Here's the query that would calculate that. It uses a few SQL tricks to make this happen. We are extracting the count of unique months and then dividing by the months since initial order. This query will give you the following information which can be used to help sales and marketing now where to focus. You could sort by this percentage to know where to start calling or to find patterns describing your best customers. Number of orders First Order Date Last Order Date Percentage of months order was placed since last order. SELECT CustomerID, (SELECT COUNT(OrderID) FROM Orders WHERE Orders.CustomerID = Customers.CustomerID) As Orders, (SELECT Max(OrderDate) FROM Orders WHERE Orders.CustomerID = Customers.CustomerID) AS LastOrder, (SELECT Min(OrderDate) FROM Orders WHERE Orders.CustomerID = Customers.CustomerID) AS FirstOrder, DATEDIFF(mm,(SELECT Min(OrderDate) FROM Orders WHERE Orders.CustomerID = Customers.CustomerID),getdate()) AS MonthsSinceFirstOrder, 100*(SELECT COUNT(DISTINCT 100*DATEPART(yy,OrderDate) + DATEPART(mm,OrderDate)) FROM Orders WHERE Orders.CustomerID = Customers.CustomerID) / DATEDIFF(mm,(SELECT Min(OrderDate) FROM Orders WHERE Orders.CustomerID = Customers.CustomerID),getdate()) As OrderPercent FROM Customers

Read the article
Consumer Oriented Search In Oracle Endeca Information Discovery – Part 1

- by Bob Zurek

Information Discovery, a core capability of Oracle Endeca Information Discovery, enables business users to rapidly search, discover and navigate through a wide variety of big data including structured, unstructured and semi-structured data. One of the key capabilities, among many, that differentiate our solution from others in the Information Discovery market is our deep support for search across this growing amount of varied big data. Our method and approach is very different than classic simple keyword search that is found in may information discovery solutions. In this first part of a series on the topic of search, I will walk you through many of the key capabilities that go beyond the simple search box that you might experience in products where search was clearly an afterthought or attempt to catch up to our core capabilities in this area. Lets explore. The core data management solution of Oracle Endeca Information Discovery is the Endeca Server, a hybrid search-analytical database that his highly scalable and column-oriented in nature. We will talk in more technical detail about the capabilities of the Endeca Server in future blog posts as this post is intended to give you a feel for the deep search capabilities that are an integral part of the Endeca Server. The Endeca Server provides best-of-breed search features aw well as a new class of features that are the first to be designed around the requirement to bridge structured, semi-structured and unstructured big data. Some of the key features of search include type a heads, automatic alphanumeric spell corrections, positional search, Booleans, wildcarding, natural language, and category search and query classification dialogs. This is just a subset of the advanced search capabilities found in Oracle Endeca Information Discovery. Search is an important feature that makes it possible for business users to explore on the diverse data sets the Endeca Server can hold at any one time. The search capabilities in the Endeca server differ from other Information Discovery products with simple “search boxes” in the following ways: The Endeca Server Supports Exploratory Search. Enterprise data frequently requires the user to explore content through an ad hoc dialog, with guidance that helps them succeed. This has implications for how to design search features. Traditional search doesn’t assume a dialog, and so it uses relevance ranking to get its best guess to the top of the results list. It calculates many relevance factors for each query, like word frequency, distance, and meaning, and then reduces those many factors to a single score based on a proprietary “black box” formula. But how can a business users, searching, act on the information that the document is say only 38.1% relevant? In contrast, exploratory search gives users the opportunity to clarify what is relevant to them through refinements and summaries. This approach has received consumer endorsement through popular ecommerce sites where guided navigation across a broad range of products has helped consumers better discover choices that meet their, sometimes undetermined requirements. This same model exists in Oracle Endeca Information Discovery. In fact, the Endeca Server powers many of the most popular e-commerce sites in the world. The Endeca Server Supports Cascading Relevance. Traditional approaches of search reduce many relevance weights to a single score. This means that if a result with a good title match gets a similar score to one with an exact phrase match, they’ll appear next to each other in a list. But a user can’t deduce from their score why each got it’s ranking, even though that information could be valuable. Oracle Endeca Information Discovery takes a different approach. The Endeca Server stratifies results by a primary relevance strategy, and then breaks ties within a strata by ordering them with a secondary strategy, and so on. Application managers get the explicit means to compose these strategies based on their knowledge of their own domain. This approach gives both business users and managers a deterministic way to set and understand relevance. Now that you have an understanding of two of the core search capabilities in Oracle Endeca Information Discovery, our next blog post on this topic will discuss more advanced features including set search, second-order relevance as well as an understanding of faceted search mechanisms that include queries and filters.

Read the article
Editing files without race conditions?

- by user2569445

I have a CSV file that needs to be edited by multiple processes at the same time. My question is, how can I do this without introducing race conditions? It's easy to write to the end of the file without race conditions by open(2)ing it in "a" (O_APPEND) mode and simply write to it. Things get more difficult when removing lines from the file. The easiest solution is to read the file into memory, make changes to it, and overwrite it back to the file. If another process writes to it after it is in memory, however, that new data will be lost upon overwriting. To further complicate matters, my platform does not support POSIX record locks, checking for file existence is a race condition waiting to happen, rename(2) replaces the destination file if it exists instead of failing, and editing files in-place leaves empty bytes in it unless the remaining bytes are shifted towards the beginning of the file. My idea for removing a line is this (in pseudocode): filename = "/home/user/somefile"; file = open(filename, "r"); tmp = open(filename+".tmp", "ax") || die("could not create tmp file"); //"a" is O_APPEND, "x" is O_EXCL|O_CREAT while(write(tmp, read(file)); //copy the $file to $file+".new" close(file); //edit tmp file unlink(filename) || die("could not unlink file"); file = open(filename, "wx") || die("another process must have written to the file after we copied it."); //"w" is overwrite, "x" is force file creation while(write(file, read(tmp))); //copy ".tmp" back to the original file unlink(filename+".tmp") || die("could not unlink tmp file"); Or would I be better off with a simple lock file? Appender process: lock = open(filename+".lock", "wx") || die("could not lock file"); file = open(filename, "a"); write(file, "stuff"); close(file); close(lock); unlink(filename+".lock"); Editor process: lock = open(filename+".lock", "wx") || die("could not lock file"); file = open(filename, "rw"); while(contents += read(file)); //edit "contents" write(file, contents); close(file); close(lock); unlink(filename+".lock"); Both of these rely on an additional file that will be left over if a process terminates before unlinking it, causing other processes to refuse to write to the original file. In my opinion, these problems are brought on by the fact that the OS allows multiple writable file descriptors to be opened on the same file at the same time, instead of failing if a writable file descriptor is already open. It seems that O_CREAT|O_EXCL is the closest thing to a real solution for preventing filesystem race conditions, aside from POSIX record locks. Another possible solution is to separate the file into multiple files and directories, so that more granular control can be gained over components (lines, fields) of the file using O_CREAT|O_EXCL. For example, "file/$id/$field" would contain the value of column $field of the line $id. It wouldn't be a CSV file anymore, but it might just work. Yes, I know I should be using a database for this as databases are built to handle these types of problems, but the program is relatively simple and I was hoping to avoid the overhead. So, would any of these patterns work? Is there a better way? Any insight into these kinds of problems would be appreciated.

Read the article
Prevent Windows 7 User Accounts from accessing files in other User Accounts

- by Mantis

I'm trying to set up another User Account on my Windows 7 Professional laptop for use by another person. I do not want that person to have access to any of the files in my User Account on the same machine. This machine has a single hard disk formatted with NTFS. User accounts data is stored in the default location, C:\Users. I use the computer with a Standard Account (not an Administrator). Let's call my user account "User A." I have given the new user a Standard Account. Let's call the new user's account "User B." To be clear, I want User B to have the ability to log in to her account, to use the computer, but to be unable to access any of the files in the User A account on the same machine. Currently, User B cannot use Windows Explorer to navigate to the location C:\Users\User A. However, by simply using Windows Search, User B can easily find and open documents saved in C:\Users\User A\Documents. After opening a document, that document's full path appears in "Recent Places" in Windows Explorer, and the document appears as a file that can be opened using the "Recent" feature in Word 2010. This is not the desired behavior. User B should not have the ability to see any documents using Windows Search or anything else. I have attempted to set permissions using the following procedure. Using an Administrator account, navigate to C:\Users and right-click on the "User A" folder. Select "Properties." In the "User A Properties" window that appears, click the "Security" tab. Click the "Edit..." button to change permissions. IN the "Permissions for User B" window that appears, under "Group or User Names," select User B. Under "Permissions for User B", check the box under the "Deny" column for the "Full Control" row. Ensure that the "Deny" box is automatically checked for all the other rows, and then click "OK." The system should then begin working. The process could take several minutes. When I followed this procedure, I received several "Access Denied" errors, suggesting that the system was unable to set the permissions as I had directed. I think this might be one of the reasons why User B is still able to access files in User A's account folders. Is there any other way I could accomplish my goal here? Thank you.

Read the article
xterm not wrapping text properly

- by mulllhausen

I'm configuring both my gnome-terminal and xterm columns (i still haven't picked which of these I will be using) and I have a couple of issues I would like to fix: the typing area seems to be smaller (fewer columns) than the display area the typed text is not wrapping to the next line when it reaches the end - it just continues back around on the same line, overwriting the prompt (i have set a custom bash prompt with PS1 in case this is relevant) $ lsb_release -a No LSB modules are available. Distributor ID: Debian Description: Debian GNU/Linux 7.1 (wheezy) Release: 7.1 Codename: wheezy $ echo $TERM xterm $ stty -a [peter@pc ~] $ stty -a speed 38400 baud; rows 52; columns 126; line = 0; intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = M-^?; eol2 = M-^?; swtch = M-^?; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R; werase = ^W; lnext = ^V; flush = ^O; min = 1; time = 0; -parenb -parodd cs8 hupcl -cstopb cread -clocal -crtscts -ignbrk brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr icrnl ixon -ixoff -iuclc ixany imaxbel iutf8 opost -olcuc -ocrnl onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0 ff0 isig icanon iexten echo echoe echok -echonl -noflsh -xcase -tostop -echoprt echoctl echoke $[peter@mine ~] $ # the column width only goes up to here ------------------------------------------------> the results are identical in both the xterm and in gnome-terminal 3.4.1.1 and as you can see, the output of the stty -a command goes right up to the edge of the screen, while the typing does not go that far. I have found that I can get the desired result by setting the columns to a very large number, eg: $ stty cols 1800 this fixes both problems. Is it the right way to go about solving this problem? Will this "break" any of the output from programs? So far I have tried top and stty -a and these seem OK. more info as requested in the comments i found that if i cat some input into a file then the columns actually strech the full width of the terminal window: [peter@mine applications] $ cat > /tmp/asd aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaasssssssssssssssssssssssssssssssssssssssssssssssssssssssssssqqqqqqqqqqqqqqqq qqqq does this imply that it is actually bash that is restricting the number of columns and not the terminal? if so then how to alter the number of columns in bash?

Read the article

Search Results

Search found 10424 results on 417 pages for 'persisted column'.

Page 389/417 | < Previous Page | 385 386 387 388 389 390 391 392 393 394 395 396 | Next Page >

- by JTorrecilla

- by dreza

- by independentid

- by pinaldave

- by Tara Kizer

- by Daniel Moth

- by Herve Roggero

- by Davide Mauri

- by Chan

- by Daniel Moth

- by Rob Farley

- by Nick_W

- by BuckWoody

- by Tim Meyer

- by onefloridacoder

- by Dave Ballantyne

- by Mike Femenella

- by Clara Oscura

- by Tony Davis

- by gt0084e1

- by Bob Zurek

- by user2569445

- by Mantis

- by mulllhausen

< Previous Page | 385 386 387 388 389 390 391 392 393 394 395 396 | Next Page >