Search Results

Search found 600 results on 24 pages for 'splitting'.

Page 11/24 | < Previous Page | 7 8 9 10 11 12 13 14 15 16 17 18  | Next Page >

  • best articles about organizing code files in C

    - by kliketa
    Can you recommend me what should I read/learn in order to make a well organized code in C? One of the things I want to learn is the principles of splitting project in .h and .c files, what goes where and why, variable naming, when to use global variables ... I am interested in books and articles that deal with this specific problem.

    Read the article

  • Way around 1mb file size restriction?

    - by Sarevok
    My app needs to save files that will range from about 2-20mb. When I tried to do this I was getting an OutOfMemoryException. I did some reading and it's looking like Android has a file size limit of 1mb. Is this correct? If so, is there a way around this limitation, other than splitting up every file into 1mb chunks?

    Read the article

  • How to emulate PHPs preg_split in ruby to capture offsets and delimiters?

    - by dimus
    I wonder if there is a way to get offsets and delimiters while I am splitting a string in ruby analagous to PHP preg_split: preg_split("/( |&nbsp;|<|>|\t|\n|\r|;|\.)/i", $html_string, -1, PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_OFFSET_CAPTURE); I imagine I can achieve it by traversing string by characters or using something heavy as treetop, but I would like to use something more convenient.

    Read the article

  • What is the role of the Router Object in MVC based frameworks

    - by Saif Bechan
    In most MVC based framework I see a router object. If I look at it splits up the uri and decides what controller should be used, and which action should be fired. Even though this makes a lot of sense, I can not give this a place in the MVC patern. Is splitting up the uri not the job of the controller. And then the controller should just decide which class and function to run.

    Read the article

  • effective functonal sort

    - by sreservoir
    I'm programming a function for a ti-nspire, so I can't use the builtins from inside a function. what is the most generally efficient algorithm for sorting a list of numbers without modifying the list itself? (recursion and list-splitting are fair game, as is general use of math.)

    Read the article

  • Warning produced by f#: value has been copied to ensure the original is not mutated

    - by user1878761
    The first definition below produces the warning in the title when compiled with f# 3.0 and the warning level set to 5. The second definition compiles cleanly. I wondered if someone could please explain just what the compiler worries I might accidentally mutate, or how would splitting the expression with a let clause help avoid that. Many thanks. let ticks_with_warning () : int64 = System.DateTime.Now.Ticks let ticks_clean () : int64 = let t = System.DateTime.Now t.Ticks

    Read the article

  • Getting a count of users each day in Mondrian MDX

    - by user1874144
    I'm trying to write a query to give me the total number of users for each customer per day. Here is what I have so far, which for each customer/day combination is giving the total number of user dimension entries without splitting them up by customer/day. WITH MEMBER [Measures].[MyUserCount] AS COUNT(Descendants([User].CurrentMember, [User].[User Name]), INCLUDEEMPTY) SELECT NON EMPTY CrossJoin([Date].[Date].Members, [Customer].[Customer Name].Members) ON ROWS, {[Measures].[MyUserCount]} on COLUMNS FROM [Users]

    Read the article

  • Help me understand why page sizes are a power of 2?

    - by eric
    Answer I need help with is: Recall that paging is implemented by breaking up an address into a page and offset number. It is most efficient to break the address into X page bits and Y offset bits, rather than perform arithmetic on the address to calculate the page number and offset. Because each bit position represents a power of 2, splitting an address between bits results in a page size that is a power of 2. i don't quite understand this answer, can anyone give a simpler explanation?

    Read the article

  • Checking array of censored words against user submitted content

    - by steve-o
    Hello, I have set up an array of censored words and I want to check that a user submitted comment doesn't contain any of these words. What is the most efficient way of doing this? All I've come up with so far is splitting the string into an array of words and checking it against the array of censored words, but I've a feeling there's a neater way of doing this.

    Read the article

  • Select only half the records

    - by coffeeaddict
    I am trying to figure out how to select half the records where an ID is null. I want half because I am going to use that result set to update another ID field. Then I am going to update the rest with another value for that ID field. So essentially I want to update half the records someFieldID with one number and the rest with another number splitting the update basically between two values for someFieldID the field I want to update.

    Read the article

  • spliting the string

    - by prince23
    hi, i have an string like this string strdate =@"5/2/2006"; which is in a form of month/day/year i need to display it in a form of like this 02-05-2006 how do i format the data like this if the value is like this 12/28/2005 it should be displayed like this 28-12-2010 i know we should be splitting the data based on that we should do it. i am not getting the syntax how to do it . any help would be great

    Read the article

  • Entity Framework version 1- Brief Synopsis and Tips &ndash; Part 1

    - by Rohit Gupta
    To Do Eager loading use Projections (for e.g. from c in context.Contacts select c, c.Addresses)  or use Include Query Builder Methods (Include(“Addresses”)) If there is multi-level hierarchical Data then to eager load all the relationships use Include Query Builder methods like customers.Include("Order.OrderDetail") to include Order and OrderDetail collections or use customers.Include("Order.OrderDetail.Location") to include all Order, OrderDetail and location collections with a single include statement =========================================================================== If the query uses Joins then Include() Query Builder method will be ignored, use Nested Queries instead If the query does projections then Include() Query Builder method will be ignored Use Address.ContactReference.Load() OR Contact.Addresses.Load() if you need to Deferred Load Specific Entity – This will result in extra round trips to the database ObjectQuery<> cannot return anonymous types... it will return a ObjectQuery<DBDataRecord> Only Include method can be added to Linq Query Methods Any Linq Query method can be added to Query Builder methods. If you need to append a Query Builder Method (other than Include) after a LINQ method  then cast the IQueryable<Contact> to ObjectQuery<Contact> and then append the Query Builder method to it =========================================================================== Query Builder methods are Select, Where, Include Methods which use Entity SQL as parameters e.g. "it.StartDate, it.EndDate" When Query Builder methods do projection then they return ObjectQuery<DBDataRecord>, thus to iterate over this collection use contact.Item[“Name”].ToString() When Linq To Entities methods do projection, they return collection of anonymous types --- thus the collection is strongly typed and supports Intellisense EF Object Context can track changes only on Entities, not on Anonymous types. If you use a Defining Query for a EntitySet then the EntitySet becomes readonly since a Defining Query is the same as a View (which is treated as a ReadOnly by default). However if you want to use this EntitySet for insert/update/deletes then we need to map stored procs (as created in the DB) to the insert/update/delete functions of the Entity in the Designer You can use either Execute method or ToList() method to bind data to datasources/bindingsources If you use the Execute Method then remember that you can traverse through the ObjectResult<> collection (returned by Execute) only ONCE. In WPF use ObservableCollection to bind to data sources , for keeping track of changes and letting EF send updates to the DB automatically. Use Extension Methods to add logic to Entities. For e.g. create extension methods for the EntityObject class. Create a method in ObjectContext Partial class and pass the entity as a parameter, then call this method as desired from within each entity. ================================================================ DefiningQueries and Stored Procedures: For Custom Entities, one can use DefiningQuery or Stored Procedures. Thus the Custom Entity Collection will be populated using the DefiningQuery (of the EntitySet) or the Sproc. If you use Sproc to populate the EntityCollection then the query execution is immediate and this execution happens on the Server side and any filters applied will be applied in the Client App. If we use a DefiningQuery then these queries are composable, meaning the filters (if applied to the entityset) will all be sent together as a single query to the DB, returning only filtered results. If the sproc returns results that cannot be mapped to existing entity, then we first create the Entity/EntitySet in the CSDL using Designer, then create a dummy Entity/EntitySet using XML in the SSDL. When creating a EntitySet in the SSDL for this dummy entity, use a TSQL that does not return any results, but does return the relevant columns e.g. select ContactID, FirstName, LastName from dbo.Contact where 1=2 Also insure that the Entity created in the SSDL uses the SQL DataTypes and not .NET DataTypes. If you are unable to open the EDMX file in the designer then note the Errors ... they will give precise info on what is wrong. The Thrid option is to simply create a Native Query in the SSDL using <Function Name="PaymentsforContact" IsComposable="false">   <CommandText>SELECT ActivityId, Activity AS ActivityName, ImagePath, Category FROM dbo.Activities </CommandText></FuncTion> Then map this Function to a existing Entity. This is a quick way to get a custom Entity which is regular Entity with renamed columns or additional columns (which are computed columns). The disadvantage to using this is that It will return all the rows from the Defining query and any filter (if defined) will be applied only at the Client side (after getting all the rows from DB). If you you DefiningQuery instead then we can use that as a Composable Query. The Fourth option (for mapping a READ stored proc results to a non-existent Entity) is to create a View in the Database which returns all the fields that the sproc also returns, then update the Model so that the model contains this View as a Entity. Then map the Read Sproc to this View Entity. The other option would be to simply create the View and remove the sproc altogether. ================================================================ To Execute a SProc that does not return a entity, use a EntityCommand to execute that proc. You cannot call a sproc FunctionImport that does not return Entities From Code, the only way is to use SSDL function calls using EntityCommand.  This changes with EntityFramework Version 4 where you can return Scalar Types, Complex Types, Entities or NonQuery ================================================================ UDF when created as a Function in SSDL, we need to set the Name & IsComposable properties for the Function element. IsComposable is always false for Sprocs, for UDF's set this to true. You cannot call UDF "Function" from within code since you cannot import a UDF Function into the CSDL Model (with Version 1 of EF). only stored procedures can be imported and then mapped to a entity ================================================================ Entity Framework requires properties that are involved in association mappings to be mapped in all of the function mappings for the entity (Insert, Update and Delete). Because Payment has an association to Reservation... hence we need to pass both the paymentId and reservationId to the Delete sproc even though just the paymentId is the PK on the Payment Table. ================================================================ When mapping insert, update and delete procs to a Entity, insure that all the three or none are mapped. Further if you have a base class and derived class in the CSDL, then you must map (ins, upd, del) sprocs to all parent and child entities in the inheritance relationship. Note that this limitation that base and derived entity methods must all must be mapped does not apply when you are mapping Read Stored Procedures.... ================================================================ You can write stored procedures SQL directly into the SSDL by creating a Function element in the SSDL and then once created, you can map this Function to a CSDL Entity directly in the designer during Function Import ================================================================ You can do Entity Splitting such that One Entity maps to multiple tables in the DB. For e.g. the Customer Entity currently derives from Contact Entity...in addition it also references the ContactPersonalInfo Entity. One can copy all properties from the ContactPersonalInfo Entity into the Customer Entity and then Delete the CustomerPersonalInfo entity, finall one needs to map the copied properties to the ContactPersonalInfo Table in Table Mapping (by adding another table (ContactPersonalInfo) to the Table Mapping... this is called Entity Splitting. Thus now when you insert a Customer record, it will automatically create SQL to insert records into the Contact, Customers and ContactPersonalInfo tables even though you have a Single Entity called Customer in the CSDL =================================================================== There is Table by Type Inheritance where another EDM Entity can derive from another EDM entity and absorb the inherted entities properties, for example in the Break Away Geek Adventures EDM, the Customer entity derives (inherits) from the Contact Entity and absorbs all the properties of Contact entity. Thus when you create a Customer Entity in Code and then call context.SaveChanges the Object Context will first create the TSQL to insert into the Contact Table followed by a TSQL to insert into the Customer table =================================================================== Then there is the Table per Hierarchy Inheritance..... where different types are created based on a condition (similar applying a condition to filter a Entity to contain filtered records)... the diference being that the filter condition populates a new Entity Type (derived from the base Entity). In the BreakAway sample the example is Lodging Entity which is a Abstract Entity and Then Resort and NonResort Entities which derive from Lodging Entity and records are filtered based on the value of the Resort Boolean field =================================================================== Then there is Table per Concrete Type Hierarchy where we create a concrete Entity for each table in the database. In the BreakAway sample there is a entity for the Reservation table and another Entity for the OldReservation table even though both the table contain the same number of fields. The OldReservation Entity can then inherit from the Reservation Entity and configure the OldReservation Entity to remove all Scalar Properties from the Entity (since it inherits the properties from Reservation and filters based on ReservationDate field) =================================================================== Complex Types (Complex Properties) Entities in EF can also contain Complex Properties (in addition to Scalar Properties) and these Complex Properties reference a ComplexType (not a EntityType) DropdownList, ListBox, RadioButtonList, CheckboxList, Bulletedlist are examples of List server controls (not data bound controls) these controls cannot use Complex properties during databinding, they need Scalar Properties. So if a Entity contains Complex properties and you need to bind those to list server controls then use projections to return Scalar properties and bind them to the control (the disadvantage is that projected collections are not tracked by the Object Context and hence cannot persist changes to the projected collections bound to controls) ObjectDataSource and EntityDataSource do account for Complex properties and one can bind entities with Complex Properties to Data Source controls and they will be tracked for changes... with no additional plumbing needed to persist changes to these collections bound to controls So DataBound controls like GridView, FormView need to use EntityDataSource or ObjectDataSource as a datasource for entities that contain Complex properties so that changes to the datasource done using the GridView can be persisted to the DB (enabling the controls for updates)....if you cannot use the EntityDataSource you need to flatten the ComplexType Properties using projections With EF Version 4 ComplexTypes are supported by the Designer and can add/remove/compose Complex Types directly using the Designer =================================================================== Conditional Mapping ... is like Table per Hierarchy Inheritance where Entities inherit from a base class and then used conditions to populate the EntitySet (called conditional Mapping). Conditional Mapping has limitations since you can only use =, is null and IS NOT NULL Conditions to do conditional mapping. If you need more operators for filtering/mapping conditionally then use QueryView(or possibly Defining Query) to create a readonly entity. QueryView are readonly by default... the EntitySet created by the QueryView is enabled for change tracking by the ObjectContext, however the ObjectContext cannot create insert/update/delete TSQL statements for these Entities when SaveChanges is called since it is QueryView. One way to get around this limitation is to map stored procedures for the insert/update/delete operations in the Designer. =================================================================== Difference between QueryView and Defining Query : QueryView is defined in the (MSL) Mapping File/section of the EDM XML, whereas the DefiningQuery is defined in the store schema (SSDL). QueryView is written using Entity SQL and is this database agnostic and can be used against any database/Data Layer. DefiningQuery is written using Database Lanaguage i.e. TSQL or PSQL thus you have more control =================================================================== Performance: Lazy loading is deferred loading done automatically. lazy loading is supported with EF version4 and is on by default. If you need to turn it off then use context.ContextOptions.lazyLoadingEnabled = false To improve Performance consider PreCompiling the ObjectQuery using the CompiledQuery.Compile method

    Read the article

  • hosting database on separate server

    - by Amit Aggarwal
    Hello Experts, We have an enterprise web app to which our clients post/upload lots of documents [mainly images and pdf files] via web interface, iphone app etc etc. We are also using imagick to split pdf documents into images. Also, large number of mysql SELECT/UPDATE/DELETE queries happen all day. Currently, all of this is happening on same server and we are planning to split the process in 3 stages : 1) a server only for database 2) a server only for documents (document upload, splitting etc) 3) a server for the main php web app Is there any drawback with this kind of structure as compared to hosting everything on same server ? Please guide. Thanks, Amit

    Read the article

  • What Gets Measured Gets Managed

    - by steve.diamond
    OK, so if I were to claim credit for inventing that expression, I guess I could share the mantle with Al Gore, creator of the Internet. But here's the point: How many of us acquire CRM systems without specifically benchmarking several key performance indicators across sales, marketing and service BEFORE and AFTER deployment of said system? Yes, this may sound obvious and it might provoke the, "Well of course, Diamond!" response, but is YOUR company doing this? Can you define in quantitative terms the delta across multiple parameters? I just trolled the Web site of one of my favorite sales consultancy firms, The Alexander Group. Right on their home page is a brief appeal citing the importance of benchmarking. The corresponding landing page states, "The fact that hundreds of sales executives now track how their sales forces spend time means they attach great value to understanding how much time sellers actually devote to selling." The opportunity is to extend this conversation to benchmarking the success that companies derive from the investment they make in CRM systems, i.e., to the automation side of the equation. To a certain extent, the 'game' is analogous to achieving optimal physical fitness. One may never quite get there, but beyond the 95% threshold of "excellence," she/he may be entering the realm of splitting infinitives. But at the very start, and to quote verbiage from the aforementioned Alexander Group Web page, what gets measured gets managed. And getting to that 95% level along several key indicators would be a high quality problem indeed, don't you think? Yes, this could be a "That's so 90's" conversation, but is it really?

    Read the article

  • Move smaller hard drive to partition on a larger hard drive

    - by bluejeansummer
    My parents bought a new hard drive for a laptop that I've owned for several years. It's much larger than the current one, so I plan on splitting it up to dual boot it with Ubuntu. I have no problem with partitioning a drive (I always keep a LiveCD handy), but my question is this: how can I go about moving the existing partition to the new drive? This is a laptop, so I can't simply plug the new drive into another slot. Also, even if I manage to move it, will Windows still work on the new drive in a larger partition? I've had this laptop for quite a while, and I've lost the recovery discs that came with it a long time ago. I also have a lot of software without CDs to reinstall them with. This makes not reinstalling Windows a high priority. In case it helps, both drives use 2.5" PATA, and I have a 1 TB external drive available if it's needed.

    Read the article

  • SQL Server 2000 tables

    - by user40766
    We currently have an SQL Server 2000 database with one table containing data for multiple users. The data is keyed by memberid which is an integer field. The table has a clustered index on memberid. The table is now about 200 million rows. Indexing and maintenance are becoming issues. We are debating splitting the table into one table per user model. This would imply that we would end up with a very large number of tables potentially upto the 2,147,483,647, considering just positive values. My questions: Does anyone have any experience with a SQL Server (2000/2005) installation with millions of tables? What are the implications of this architecture with regards to maintenance and access using Query Analyzer, Enterprise Manager etc. What are the implications to having such a large number of indexes in a database instance. All comments are appreciated. Thanks

    Read the article

  • Perforce Restore From Multiple Checkpoint Files?

    - by AJ
    Hi all, I am working with a very large (~11GB) checkpoint file and trying to do a -jr (journal restore) operation. About half way through the file, I'm hitting an entry which causes an error to occur. I'm unable to come up with a conventional way to print, edit, and save changes to the offending line. So right now I'm splitting the checkpoint into files of 500k lines each...up to 47 files and counting. My question is, once I have these separate files: Can I run journal restore on each one separately to check for errors? Once fixed, is it necessary to merge them back together again to do my full journal restore? Any other ideas on how to tackle this problem would be appreciated. Thanks in advance, -aj

    Read the article

  • Watchguard firebox: public IP addresses behind firewall with as much usable IP addresses as possible

    - by martinezpt
    Our ISP assigned us 16 public IP addresses that we want to assign to hosts behind a Watchguard firebox x750e. The IP addresses are: x.x.x.176/28 of which x.x.x.177 is the gateway. The hosts will be running software that needs to be directly assigned the public IP address so 1:1 NAT is not an option. I found this document that gives examples on how to assign public IP addresses to hosts behind the firewall, using an optional interface: http://www.watchguard.com/help/configuration-examples/public_IP_behind_XTM_configuration_example_(en-US).pdf However, I can't implement scenario 1 as it won't allow me to use the same subnet on both interfaces. As for scenario 2, splitting the address range into 2 subnets will decrease the usable hosts on the optional interface to 5 (8 - network - broadcast - optional interface ip). I'm convinced that there must be a better way to address this problem and maximize the number of usable IP addresses but I'm not very familiar with this specific firewall. Are there any suggestions on how to keep the hosts behind the firewall with public IP addresses while maximizing the usable IP addresses? thanks

    Read the article

  • Investigation: Can different combinations of components effect Dataflow performance?

    - by jamiet
    Introduction The Dataflow task is one of the core components (if not the core component) of SQL Server Integration Services (SSIS) and often the most misunderstood. This is not surprising, its an incredibly complicated beast and we’re abstracted away from that complexity via some boxes that go yellow red or green and that have some lines drawn between them. Example dataflow In this blog post I intend to look under that facade and get into some of the nuts and bolts of the Dataflow Task by investigating how the decisions we make when building our packages can affect performance. I will do this by comparing the performance of three dataflows that all have the same input, all produce the same output, but which all operate slightly differently by way of having different transformation components. I also want to use this blog post to challenge a common held opinion that I see perpetuated over and over again on the SSIS forum. That is, that people assume adding components to a dataflow will be detrimental to overall performance. Its not surprising that people think this –it is intuitive to think that more components means more work- however this is not a view that I share. I have always been of the opinion that there are many factors affecting dataflow duration and the number of components is actually one of the less important ones; having said that I have never proven that assertion and that is one reason for this investigation. I have actually seen evidence that some people think dataflow duration is simply a function of number of rows and number of components. I’ll happily call that one out as a myth even without any investigation!  The Setup I have a 2GB datafile which is a list of 4731904 (~4.7million) customer records with various attributes against them and it contains 2 columns that I am going to use for categorisation: [YearlyIncome] [BirthDate] The data file is a SSIS raw format file which I chose to use because it is the quickest way of getting data into a dataflow and given that I am testing the transformations, not the source or destination adapters, I want to minimise external influences as much as possible. In the test I will split the customers according to month of birth (12 of those) and whether or not their yearly income is above or below 50000 (2 of those); in other words I will be splitting them into 24 discrete categories and in order to do it I shall be using different combinations of SSIS’ Conditional Split and Derived Column transformation components. The 24 datapaths that occur will each input to a rowcount component, again because this is the least resource intensive means of terminating a datapath. The test is being carried out on a Dell XPS Studio laptop with a quad core (8 logical Procs) Intel Core i7 at 1.73GHz and Samsung SSD hard drive. Its running SQL Server 2008 R2 on Windows 7. The Variables Here are the three combinations of components that I am going to test:     One Conditional Split - A single Conditional Split component CSPL Split by Month of Birth and income category that will use expressions on [YearlyIncome] & [BirthDate] to send each row to one of 24 outputs. This next screenshot displays the expression logic in use: Derived Column & Conditional Split - A Derived Column component DER Income Category that adds a new column [IncomeCategory] which will contain one of two possible text values {“LessThan50000”,”GreaterThan50000”} and uses [YearlyIncome] to determine which value each row should get. A Conditional Split component CSPL Split by Month of Birth and Income Category then uses that new column in conjunction with [BirthDate] to determine which of the same 24 outputs to send each row to. Put more simply, I am separating the Conditional Split of #1 into a Derived Column and a Conditional Split. The next screenshots display the expression logic in use: DER Income Category         CSPL Split by Month of Birth and Income Category       Three Conditional Splits - A Conditional Split component that produces two outputs based on [YearlyIncome], one for each Income Category. Each of those outputs will go to a further Conditional Split that splits the input into 12 outputs, one for each month of birth (identical logic in each). In this case then I am separating the single Conditional Split of #1 into three Conditional Split components. The next screenshots display the expression logic in use: CSPL Split by Income Category         CSPL Split by Month of Birth 1& 2       Each of these combinations will provide an input to one of the 24 rowcount components, just the same as before. For illustration here is a screenshot of the dataflow containing three Conditional Split components: As you can these dataflows have a fair bit of work to do and remember that they’re doing that work for 4.7million rows. I will execute each dataflow 10 times and use the average for comparison. I foresee three possible outcomes: The dataflow containing just one Conditional Split (i.e. #1) will be quicker There is no significant difference between any of them One of the two dataflows containing multiple transformation components will be quicker Regardless of which of those outcomes come to pass we will have learnt something and that makes this an interesting test to carry out. Note that I will be executing the dataflows using dtexec.exe rather than hitting F5 within BIDS. The Results and Analysis The table below shows all of the executions, 10 for each dataflow. It also shows the average for each along with a standard deviation. All durations are in seconds. I’m pasting a screenshot because I frankly can’t be bothered with the faffing about needed to make a presentable HTML table. It is plain to see from the average that the dataflow containing three conditional splits is significantly faster, the other two taking 43% and 52% longer respectively. This seems strange though, right? Why does the dataflow containing the most components outperform the other two by such a big margin? The answer is actually quite logical when you put some thought into it and I’ll explain that below. Before progressing, a side note. The standard deviation for the “Three Conditional Splits” dataflow is orders of magnitude smaller – indicating that performance for this dataflow can be predicted with much greater confidence too. The Explanation I refer you to the screenshot above that shows how CSPL Split by Month of Birth and salary category in the first dataflow is setup. Observe that there is a case for each combination of Month Of Date and Income Category – 24 in total. These expressions get evaluated in the order that they appear and hence if we assume that Month of Date and Income Category are uniformly distributed in the dataset we can deduce that the expected number of expression evaluations for each row is 12.5 i.e. 1 (the minimum) + 24 (the maximum) divided by 2 = 12.5. Now take a look at the screenshots for the second dataflow. We are doing one expression evaluation in DER Income Category and we have the same 24 cases in CSPL Split by Month of Birth and Income Category as we had before, only the expression differs slightly. In this case then we have 1 + 12.5 = 13.5 expected evaluations for each row – that would account for the slightly longer average execution time for this dataflow. Now onto the third dataflow, the quick one. CSPL Split by Income Category does a maximum of 2 expression evaluations thus the expected number of evaluations per row is 1.5. CSPL Split by Month of Birth 1 & CSPL Split by Month of Birth 2 both have less work to do than the previous Conditional Split components because they only have 12 cases to test for thus the expected number of expression evaluations is 6.5 There are two of them so total expected number of expression evaluations for this dataflow is 6.5 + 6.5 + 1.5 = 14.5. 14.5 is still more than 12.5 & 13.5 though so why is the third dataflow so much quicker? Simple, the conditional expressions in the first two dataflows have two boolean predicates to evaluate – one for Income Category and one for Month of Birth; the expressions in the Conditional Split in the third dataflow however only have one predicate thus they are doing a lot less work. To sum up, the difference in execution times can be attributed to the difference between: MONTH(BirthDate) == 1 && YearlyIncome <= 50000 and MONTH(BirthDate) == 1 In the first two dataflows YearlyIncome <= 50000 gets evaluated an average of 12.5 times for every row whereas in the third dataflow it is evaluated once and once only. Multiply those 11.5 extra operations by 4.7million rows and you get a significant amount of extra CPU cycles – that’s where our duration difference comes from. The Wrap-up The obvious point here is that adding new components to a dataflow isn’t necessarily going to make it go any slower, moreover you may be able to achieve significant improvements by splitting logic over multiple components rather than one. Performance tuning is all about reducing the amount of work that needs to be done and that doesn’t necessarily mean use less components, indeed sometimes you may be able to reduce workload in ways that aren’t immediately obvious as I think I have proven here. Of course there are many variables in play here and your mileage will most definitely vary. I encourage you to download the package and see if you get similar results – let me know in the comments. The package contains all three dataflows plus a fourth dataflow that will create the 2GB raw file for you (you will also need the [AdventureWorksDW2008] sample database from which to source the data); simply disable all dataflows except the one you want to test before executing the package and remember, execute using dtexec, not within BIDS. If you want to explore dataflow performance tuning in more detail then here are some links you might want to check out: Inequality joins, Asynchronous transformations and Lookups Destination Adapter Comparison Don’t turn the dataflow into a cursor SSIS Dataflow – Designing for performance (webinar) Any comments? Let me know! @Jamiet

    Read the article

< Previous Page | 7 8 9 10 11 12 13 14 15 16 17 18  | Next Page >