Search Results

Search found 1533 results on 62 pages for 'rdbms agnostic'.


  • How to handle editing a large file for a non-technical user

    - by Luke
    I have a client who is given a tab-delimited .txt file containing hundreds of thousands of rows. I have a user story as follows: as a user, I want to take the text file and add a new value at the end of each line containing the concatenated value of two of the columns. For example, if the file read text_one text_two, I need to output the following (preferably to a .txt file): text_one text_two text_onetext_two. My first approach was to ask the vendor supplying the file to do the concatenation before providing it (the easiest way to solve a problem is to eliminate it, right?), but they are very uncooperative and have point-blank refused. I've looked at building a simple JavaScript application that does this client side, so a non-technical user could select the file using a file selector. This approach has a few problems: (1) the file could be over a GB in size and so can't be loaded straight into memory; I've tried, and the browser crashes; (2) there is no means to write a file in JavaScript, so I'd need to output the content to the screen and have the user save it (somehow). I was thinking that if I could get around the file size limitations, I could just output the edited content to the page and have the user save the page as a .txt file. However, I think there is a better way than using JavaScript that will still accommodate the user's lack of technical know-how. Please consider this question to be stack agnostic, but bear in mind that a nice little shell script or Python script would be deemed unsuitable for a non-technical user unless there is a way of "packaging" it nicely. Updates: the file is too large to open in Excel. The process needs to be run weekly, but it doesn't require scheduling or automation... (yet).
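    The memory problem described above goes away if the file is processed one line at a time instead of being loaded whole. The following is a minimal sketch of that idea in Python, not a packaged solution: the function names, the default column positions (0 and 1) and the UTF-8 encoding are assumptions made for illustration, and wrapping it for a non-technical user (for example with a tool such as PyInstaller) would be a separate step.

```python
import io


def append_concatenated_column(src, dst, first_col=0, second_col=1):
    """Copy tab-delimited lines from src to dst, appending one extra field.

    The extra field is the text of first_col followed by the text of
    second_col. Only one line is held in memory at a time.
    """
    for line in src:
        row = line.rstrip("\r\n")
        fields = row.split("\t")
        # A column missing from a short row is treated as empty text.
        first = fields[first_col] if first_col < len(fields) else ""
        second = fields[second_col] if second_col < len(fields) else ""
        dst.write(row + "\t" + first + second + "\n")


def process_file(in_path, out_path, encoding="utf-8"):
    """Apply the transformation file to file (the encoding is an assumption)."""
    with open(in_path, encoding=encoding, newline="") as src, \
            open(out_path, "w", encoding=encoding, newline="") as dst:
        append_concatenated_column(src, dst)


# Small in-memory demonstration using the example from the question.
demo_in = io.StringIO("text_one\ttext_two\n")
demo_out = io.StringIO()
append_concatenated_column(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

    Because the loop only ever holds a single line, the size of the input file should not matter much; the output is written to a new file rather than edited in place.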

    Read the article

  • Questions about identifying the components in MVC

    - by luiscubal
    I'm currently developing a client-server application in node.js, Express, mustache and MySQL. However, I believe this question should be mostly language and framework agnostic. This is the first time I'm doing a real MVC application and I'm having trouble deciding exactly what each component means. (I've done web applications that could perhaps be called MVC before, but I wouldn't confidently refer to them as such.) I have a server.js that ties the whole application together. It initializes all other components (including the database connection, and what I think are the "models" and the "views"), receives HTTP requests and decides which "views" to use. Does this mean that my server.js file is the controller? Or am I mixing code that doesn't belong there? What components should I break the server.js file into? Some examples of code that's in the server.js file: var connection = mysql.createConnection({ host : 'localhost', user : 'root', password : 'sqlrevenge', database : 'blog' }); //... app.get("/login", function (req, res) { //Function handles a GET request for login forms if (process.env.NODE_ENV == 'DEVELOPMENT') { mu.clearCache(); } session.session_from_request(connection, req, function (err, session) { if (err) { console.log('index.js session error', err); session = null; } login_view.html(res, user_model, post_model, session, mu); //I named my view functions "html" for the case I might want to add other output types (such as a JSON API), or should I opt for completely separate views then? }); }); I have another file named session.js. It receives a cookies object and reads the stored data to decide whether it's a valid user session. It also includes a function named login that changes the value of cookies. First, I thought it would be part of the controller, since it kind of dealt with user input and supplied data to the models. 
Then, I thought that maybe it was a model since it dealt with the application data/database and the data it supplies is used by views. Now, I'm even wondering if it could be considered a View, since it outputs data (cookies are part of HTTP headers, which are output)

    Read the article

  • Best way to create a SPARQL endpoint for a RDBMS (MySQL database)

    - by Ankur
    I am doing (or rather, want to do) some experiments with Linked Open Datasets, particularly those put out by governments. I have an RDBMS (more specifically, MySQL). I designed it with semantic web ideas in mind, i.e. I have information stored as objects, predicates and classes which define objects. In turn, all objects are related to each other through statements of the form subject -- predicate -- object (where the subjects are from the objects table). I want to be able to query other RDF triple stores from my application and let other triple stores query my data. Is it possible to "set something up" so that this is possible? I have looked at Jena. Using Jena seems to mean I have to use it as a storage application rather than MySQL. The only problem with this is that I include a new concept called a category (which I don't think is part of the semantic web languages). I will use categories to help with displaying information (they don't have any other meaning), but using Jena seems to mean that I can't organise predicates under categories for more convenient viewing. I am using Java, so a Java API is preferred. It's also possible I misunderstood the purpose of Jena, and maybe it can be of use, but I am not sure how. I am sure four days from now this question will seem rather silly, but at the moment I am somewhat confused about how to proceed.

    Read the article

  • Symlink path can be followed manually, but `cd` returns Permission denied

    - by Ricket
    I am trying to access the directory /usr/software/test/agnostic. There are several symlinks involved in this path. As you can see by the below transcript, I am unable to cd directly to the path, but I can check each step of the way and cd to the symlinked directories until I reach the destination. Why is this? (and how do I fix it?) Ubuntu 12.10, bash > ls /usr/software/test/agnostic ls: cannot access /usr/software/test/agnostic: Permission denied > cd /usr/software/test > cd agnostic bash: cd: agnostic: Permission denied > pwd -P /x/eng/localtest/arch/x86_64-redhat-rhel5 > ls -al | grep agnostic lrwxrwxrwx 1 root root 15 Oct 23 2007 agnostic -> noarch/agnostic > ls -al | grep noarch ... lrwxrwxrwx 1 root root 23 Oct 23 2007 noarch -> /x/eng/localtest/noarch > cd noarch > cd agnostic bash: cd: agnostic: Permission denied > ls -al | grep agnostic lrwxrwxrwx 1 5808 dip 4 Oct 5 2010 agnostic -> main > cd main > ls (correct output of `ls`) > pwd /usr/software/test/noarch/main > pwd -P /x/eng/localtest/noarch/main

    Read the article

  • Coding Competition, language agnostic guidelines?

    - by Miau
    Hi there: I might be organizing a coding competition soon, and I was wondering if anyone has run one and what the guidelines and process were. I'd like to make the competition appealing to all devs, and I'm trying to come up with ideas as to how. The scenario is: there is an event running and we (the coding competition organizers) will have a room that we can use (either to code or for questions, etc.); however, ideally the task for the competition should be assigned and participants should be able to go and do other things if they are so inclined. What I wonder is what kind of challenges to give and, most importantly, what the criteria to "win" should be. Teaching and learning good coding standards takes a long time, and I'd like to think that if you've been coding for longer you'll do things right and quickly... but in a competition, you would be cutting corners. I would really appreciate your input on this.

    Read the article

  • What's a good scheme for multi-user database synchronization?

    - by Mason Wheeler
    I'm working on a system to allow multiple users to collaborate on an online project. Everything is fairly straightforward, except for keeping the users in sync. Each user has their own local copy of the project database, which allows them to make changes and test things out, and then send the updates to the central server. But this runs into the classic synchronization question: how do you keep two users from editing the same thing and stomping each other's work? I've got an idea that should work, but I wonder if there's a simpler way to do it. Here's the basic concept: All project data is stored in a relational database. Each row in the database has an owner. If the current user is not the owner, he can read but not write that row. (This is enforced client-side.) The user can send a request to the server to take ownership of a row, which will be granted if the server's copy says that the current owner is NULL, or to release ownership when they're done with it. It is not possible to release ownership without committing changes to the server. It is not possible to commit changes to the server without having first downloaded all outstanding changes to the server. When any changes are made to rows you own, a trigger marks that row as Dirty. When you commit changes, the database is scanned for all Dirty rows in all tables, and the data is serialized into an update file, which is posted to the server, and all rows are marked Clean. The server applies the updates on its end, and keeps the file around. When other users download changes, the server sends them the update files that they haven't already received. So, essentially this is a reinvention of version control on a relational database. (Sort of.) 
As long as taking ownership and applying updates to the server are guaranteed atomic changes, and the server verifies that some smart-aleck user didn't edit their local database so they could send an update for a row they don't have ownership of, it should be guaranteed to be correct, and with no need to worry about merges and merge conflicts. (I think.) Can anyone think of any problems with this scheme, or ways to do it better? (And no, "build [insert VCS here] into your project" is not what I'm looking for. I've thought of that already. VCSs work well with text, and not so well with other file formats, such as relational databases.)
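    The scheme above depends on taking ownership being an atomic step on the server. One common way to get that is a single conditional UPDATE whose affected-row count tells the caller whether the claim succeeded. The sketch below illustrates this with Python's built-in sqlite3 module; the table name project_rows, its columns and the function names are hypothetical and are not taken from the question.

```python
import sqlite3


def make_server_db():
    """Create an in-memory stand-in for the central server database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE project_rows ("
        "id INTEGER PRIMARY KEY, owner TEXT, payload TEXT)"
    )
    conn.execute(
        "INSERT INTO project_rows (id, owner, payload) VALUES (1, NULL, 'draft')"
    )
    conn.commit()
    return conn


def take_ownership(conn, row_id, user):
    """Claim a row only if nobody owns it; True means the claim succeeded."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE project_rows SET owner = ? WHERE id = ? AND owner IS NULL",
            (user, row_id),
        )
    # The WHERE clause makes check-and-set a single statement, so two
    # competing claims cannot both update the row.
    return cur.rowcount == 1


def release_ownership(conn, row_id, user):
    """Release a row, but only if the caller is its current owner."""
    with conn:
        cur = conn.execute(
            "UPDATE project_rows SET owner = NULL WHERE id = ? AND owner = ?",
            (row_id, user),
        )
    return cur.rowcount == 1


server = make_server_db()
print(take_ownership(server, 1, "alice"))
print(take_ownership(server, 1, "bob"))
```

    The same owner check in release_ownership is the kind of server-side verification the question mentions for users who edit their local database; the commit-before-release rule from the scheme is not modelled here.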

    Read the article

  • How to create a Clob in JPA in an implementation agnostic way

    - by kazanaki
    Hello I am using Ejb3 and JPA (based on Hibernate and Oracle 10g at the moment) I have an entity that contains a clob @Entity @Table(name = "My_TAB") public class ExampleEntity implements java.io.Serializable { private Clob someText; public void setSomeText(Clob someText) { this.someText= someText; } @Column(name = "COLUMN_NAME") public Clob getSomeText() { return this.someText; } Then I want to save an entity of this type. At the moment I am doing the following which works perfectly ExampleEntity exampleEntity = new ExampleEntity(); exampleEntity.setSomeText(Hibernate.createClob(aStringValue)); someOtherDao.save(exampleEntity); However this ties my code to Hibernate! I have specifically avoided so far Hibernate extensions and used only JPA annotations. The code works because indeed Hibernate is my current implementation. Is there some sort of JPA API that allows me to create a clob in a generic way? So if later I decide to switch to Toplink/EclipseLink or something else I won't have to change a thing?

    Read the article

  • Is writing eSQL database agnostic?

    - by Robert Koritnik
    Using EF, we can use LINQ to read data, which is rather simple (especially using fluent calls), but we have less control unless we write eSQL on our own. Is writing eSQL actually data store independent code? So if we decide to change data store, can the same statements still be used? Does writing eSQL strings in your code pose any serious security threats similar to writing TSQL statements in plain strings? That is why we moved to SPs. Could we still move eSQL scripts outside of code as well and use some other technique to make them a bit more secure?

    Read the article

  • Predicting advantages of database denormalization

    - by Janus Troelsen
    I was always taught to strive for the highest normal form of database normalization, and we were taught Bernstein's synthesis algorithm to achieve 3NF. This is all very well, and it feels nice to normalize your database, knowing that fields can be modified while retaining consistency. However, performance may suffer. That's why I am wondering whether there is any way to predict the speedup/slowdown when denormalizing. That way, you can build your list of FDs featuring 3NF and then denormalize as little as possible. I imagine that denormalizing too much would waste space and time, because e.g. giant blobs are duplicated, or it becomes harder to maintain consistency because you have to update multiple fields using a transaction. Summary: given a 3NF FD set and a set of queries, how do I predict the speedup/slowdown of denormalization? Links to papers appreciated too.

    Read the article

  • Avoiding Agnostic Jagged Array Flattening in Powershell

    - by matejhowell
    Hello, I'm running into an interesting problem in Powershell, and haven't been able to find a solution to it. When I google (and find things like this post), nothing quite as involved as what I'm trying to do comes up, so I thought I'd post the question here. The problem has to do with multidimensional arrays with an outer array length of one. It appears Powershell is very adamant about flattening arrays like @( @('A') ) becomes @( 'A' ). Here is the first snippet (prompt is , btw): > $a = @( @( 'Test' ) ) > $a.gettype().isarray True > $a[0].gettype().isarray False So, I'd like to have $a[0].gettype().isarray be true, so that I can index the value as $a[0][0] (the real world scenario is processing dynamic arrays inside of a loop, and I'd like to get the values as $a[$i][$j], but if the inner item is not recognized as an array but as a string (in my case), you start indexing into the characters of the string, as in $a[0][0] -eq 'T'). I have a couple of long code examples, so I have posted them at the end. And, for reference, this is on Windows 7 Ultimate with PSv2 and PSCX installed. Consider code example 1: I build a simple array manually using the += operator. Intermediate array $w is flattened, and consequently is not added to the final array correctly. I have found solutions online for similar problems, which basically involve putting a comma before the inner array to force the outer array to not flatten, which does work, but again, I'm looking for a solution that can build arrays inside a loop (a jagged array of arrays, processing a CSS file), so if I add the leading comma to the single element array (implemented as intermediate array $y), I'd like to do the same for other arrays (like $z), but that adversely affects how $z is added to the final array. Now consider code example 2: This is closer to the actual problem I am having. When a multidimensional array with one element is returned from a function, it is flattened. 
It is correct before it leaves the function. And again, these are examples, I'm really trying to process a file without having to know if the function is going to come back with @( @( 'color', 'black') ) or with @( @( 'color', 'black'), @( 'background-color', 'white') ) Has anybody encountered this, and has anybody resolved this? I know I can instantiate framework objects, and I'm assuming everything will be fine if I create an object[], or a list<, or something else similar, but I've been dealing with this for a little bit and something sure seems like there has to be a right way to do this (without having to instantiate true framework objects). Code Example 1 function Display($x, [int]$indent, [string]$title) { if($title -ne '') { write-host "$title`: " -foregroundcolor cyan -nonewline } if(!$x.GetType().IsArray) { write-host "'$x'" -foregroundcolor cyan } else { write-host '' $s = new-object string(' ', $indent) for($i = 0; $i -lt $x.length; $i++) { write-host "$s[$i]: " -nonewline -foregroundcolor cyan Display $x[$i] $($indent+1) } } if($title -ne '') { write-host '' } } ### Start Program $final = @( @( 'a', 'b' ), @('c')) Display $final 0 'Initial Value' ### How do we do this part ??? ########### ## $w = @( @('d', 'e') ) ## $x = @( @('f', 'g'), @('h') ) ## # But now $w is flat, $w.length = 2 ## ## ## # Even if we put a leading comma (,) ## # in front of the array, $y will work ## # but $w will not. This can be a ## # problem inside a loop where you don't ## # know the length of the array, and you ## # need to put a comma in front of ## # single- and multidimensional arrays. 
## $y = @( ,@('D', 'E') ) ## $z = @( ,@('F', 'G'), @('H') ) ## ## ## ########################################## $final += $w $final += $x $final += $y $final += $z Display $final 0 'Final Value' ### Desired final value: @( @('a', 'b'), @('c'), @('d', 'e'), @('f', 'g'), @('h'), @('D', 'E'), @('F', 'G'), @('H') ) ### As in the below: # # Initial Value: # [0]: # [0]: 'a' # [1]: 'b' # [1]: # [0]: 'c' # # Final Value: # [0]: # [0]: 'a' # [1]: 'b' # [1]: # [0]: 'c' # [2]: # [0]: 'd' # [1]: 'e' # [3]: # [0]: 'f' # [1]: 'g' # [4]: # [0]: 'h' # [5]: # [0]: 'D' # [1]: 'E' # [6]: # [0]: 'F' # [1]: 'G' # [7]: # [0]: 'H' Code Example 2 function Display($x, [int]$indent, [string]$title) { if($title -ne '') { write-host "$title`: " -foregroundcolor cyan -nonewline } if(!$x.GetType().IsArray) { write-host "'$x'" -foregroundcolor cyan } else { write-host '' $s = new-object string(' ', $indent) for($i = 0; $i -lt $x.length; $i++) { write-host "$s[$i]: " -nonewline -foregroundcolor cyan Display $x[$i] $($indent+1) } } if($title -ne '') { write-host '' } } function funA() { $ret = @() $temp = @(0) $temp[0] = @('p', 'q') $ret += $temp Display $ret 0 'Inside Function A' return $ret } function funB() { $ret = @( ,@('r', 's') ) Display $ret 0 'Inside Function B' return $ret } ### Start Program $z = funA Display $z 0 'Return from Function A' $z = funB Display $z 0 'Return from Function B' ### Desired final value: @( @('p', 'q') ) and same for r,s ### As in the below: # # Inside Function A: # [0]: # [0]: 'p' # [1]: 'q' # # Return from Function A: # [0]: # [0]: 'p' # [1]: 'q' Thanks, Matt

    Read the article

  • Setting database-agnostic default column timestamp using Hibernate

    - by unsquared
    I'm working on a Java project full of Hibernate (3.3.1) mapping files that have the following sort of declaration for most domain objects. <property name="dateCreated" generated="insert"> <column name="date_created" default="getdate()" /> </property> The problem here is that getdate() is an MSSQL-specific function, and when I'm using something like H2 to test subsections of the project, H2 screams that getdate() isn't a recognized function. Its own timestamping function is current_timestamp(). I'd like to be able to keep working with H2 for testing, and wanted to know whether there was a way of telling Hibernate "use this database's own mechanism for retrieving the current timestamp". With H2, I've come up with the following solution. CREATE ALIAS getdate AS $$ java.util.Date now() { return new java.util.Date(); } $$; CALL getdate(); It works, but is obviously H2-specific. I've tried extending H2Dialect and registering the function getdate(), but that doesn't seem to be invoked when Hibernate is creating tables. Is it possible to abstract the idea of a default timestamp away from the specific database engine?

    Read the article

  • Creating/Maintaining a large project-agnostic code library

    - by bufferz
    In order to reduce repetition and streamline testing/debugging, I'm trying to find the best way to develop a group of libraries that many projects can utilize. I'd like to keep individual executables relatively small, and have shared libraries for math, database, collections, graphics, etc. that were previously scattered among several projects and in many cases duplicated (bad!). This library is to be in an SVN repo and several programmers will be working on it. This library will be in constant development along with the executables that utilize it. For example, I want a code file in ProjectA to look something like the following: using MyCompany.Math.2D; //static 2D math methods using MyCompany.Math.3D; //static 3D math methods using MyCompany.Comms.SQL; //static methods for doing simple SQLDB I/O using MyCompany.Graphics.BitmapOperations; //static methods that play with bitmaps So in my ProjectA solution file in Visual Studio, in order to develop/debug the MyCompany library I have to add several projects (Math, Comms, Graphics). Things get pretty cluttered, and solution files get out of date quickly between programmer SVN commits. I'm just looking for a high-level approach to maintaining a large, shared code base in an SVN repository. I am fully willing to radically redesign my approach. I'm looking for that warm fuzzy feeling you get when your design approach is spot on and development is fluid and natural. Any ideas? Thanks!!

    Read the article

  • How to make HTML layout whitespace-agnostic?

    - by ssg
    If you have consecutive inline-blocks, whitespace becomes significant: it adds some space between the elements. What's the "correct" way of keeping whitespace from affecting HTML layout if you want those blocks to look stuck to each other? Example: <span>a</span> <span>b</span> This renders differently than: <span>a</span><span>b</span> because of the space in between. I want the whitespace effect to go away without compromising the HTML source code layout. I want my HTML templates to stay clean and well-indented. I think these options are ugly: 1) Tweaking text-indent, margin, padding etc. (because it would be dependent on font-size, default whitespace width etc.) 2) Putting everything on a single line, next to each other. 3) Zero font-size. That would require overriding font-size in blocks, which would otherwise be inherited. 4) Possible document-wide solutions. I want the solution to stay local to a certain block of HTML. Any ideas, or any obvious points which I'm missing?

    Read the article

  • Relationship between databases [closed]

    - by user1525474
    Hi, I am getting ready to create my first web application. I have some knowledge of databases, but I have never used databases with relationships created between tables, and I am also not sure how to access the data in a relationship. My experience is limited to basic CRUD applications and working on simple tables with no relationships, using PHP and MySQL. For example, I will be creating a login system, and for each user I would like to create a profile page that stores different data (name, address, profile image etc.). Some of the info will be the same in both tables, so there is no point in creating the same table twice. Can anyone point me to some tutorials so I can better understand the concept?

    Read the article

  • ???????????????

    - by Todd Bao
    ?????????,???????????????????,??????????,???????,??????,?????????????: SYS@fmw//Scripts> @showfkparent hr employees---------------|             ||DEPARTMENT_ID| +>-->HR.DEPARTMENTS.DEPARTMENT_ID|             ||JOB_ID       | +>-->HR.JOBS.JOB_ID|             ||MANAGER_ID   | +>-->HR.EMPLOYEES.EMPLOYEE_ID|             |--------------- SYS@fmw//Scripts> @showfkparent sh sales------------|          ||CHANNEL_ID| +>-->SH.CHANNELS.CHANNEL_ID|          ||CUST_ID   | +>-->SH.CUSTOMERS.CUST_ID|          ||PROD_ID   | +>-->SH.PRODUCTS.PROD_ID|          ||PROMO_ID  | +>-->SH.PROMOTIONS.PROMO_ID|          ||TIME_ID   | +>-->SH.TIMES.TIME_ID|          |------------ ????????? ??? 30-08-2012 set echo offset verify offset serveroutput ondefine table_owner='&1'define table_name='&2'declare        type info_typ is record (ct varchar2(30),cc varchar2(30),po varchar2(30),pt varchar2(30),pc varchar2(30));        type info_tab_typ is table of info_typ index by pls_integer;        info_tab info_tab_typ;        max_col_length number := 0;beginwith        cons_child as (select                        owner,constraint_name,table_name,                        r_owner,r_constraint_name                from dba_constraints                where                        constraint_type='R' and                        owner=upper('&table_owner') and                        table_name=upper('&table_name')),        cons_parent as (select owner,constraint_name,table_name                from dba_constraints                where                        (owner,constraint_name) in                        (select r_owner,r_constraint_name from cons_child))select        child.table_name child_table_name,        child.column_name child_column_name,        parent.owner parent_owner,        parent.table_name parent_table_name,        parent.column_name parent_column_name        bulk collect into info_tabfrom        cons_child cc,        cons_parent cp,        dba_cons_columns parent,        dba_cons_columns 
childwhere        cc.owner = child.owner and        cc.constraint_name = child.constraint_name and        cp.owner = parent.owner and        cp.constraint_name = parent.constraint_name and        cc.r_owner = cp.owner and        cc.r_constraint_name = cp.constraint_name and        parent.position = child.positionorder by 2;if (info_tab is not null and info_tab.count >0) then        for i in 1..info_tab.count loop                if length(info_tab(i).cc) > max_col_length then                        max_col_length := length(info_tab(i).cc);                end if;        end loop;        dbms_output.put_line(rpad('-',max_col_length+2,'-'));                dbms_output.put_line(' '||'|'||rpad(' ',max_col_length,' ')||'|');        for i in 1..info_tab.count loop                dbms_output.put('|'||rpad(info_tab(i).cc,max_col_length,' ')||'|');                dbms_output.put_line(' +>-->'||info_tab(i).po||'.'||info_tab(i).pt||'.'||info_tab(i).pc);                dbms_output.put_line('|'||rpad(' ',max_col_length,' ')||'|');        end loop;        dbms_output.put_line(rpad('-',max_col_length+2,'-'));else        dbms_output.put_line('### No foreign key defined on this table! ###');end if;end;/undefine table_ownerundefine table_nameset serveroutput off Todd

    Read the article

  • SQL??“ALL”,????

    - by Todd Bao
    ALL????????????,>ALL?>=ALL?<ALL?<=ALL?=ALL?!=ALL???????????????????WHERE????TRUE?????????“=ALL”?????:select employee_id,last_name,department_id  from hr.employees   where department_id =ALL (select department_id from hr.employees where salary > 16000);EMPLOYEE_ID + LAST_NAME         + DEPARTMENT_ID----------- + ------------------------- + -------------    100 + King            +         90    101 + Kochhar            +         90    102 + De Haan            +         903 rows selected.??????????SALARY??16000????????????????????(???90),???????????????????????,????????????department_id???,????????????,???16000???8000:select employee_id,last_name   from hr.employees   where department_id =ALL (select department_id from hr.employees where salary > 8000);no rows selected????,ALL???????????????????WHERE????TURE,??????,??????salary > 77777:select employee_id,last_name   from hr.employees   where department_id =ALL (select department_id from hr.employees where salary > 77777)/?????????????77777??,??????“where department_id =ALL (???)”??TRUE,?????????????:EMPLOYEE_ID + LAST_NAME----------- + -------------------------    198 + OConnell    199 + Grant    ...   .....    196 + Walsh    197 + Feeney107 rows selected.?????????????>ALL?<ALL?>=ALL?<=ALL?!=ALL??,??????????????,???????“?”??????????????????,?“?”?????????????????????????????:select * from hr.employees where department_id >=ALL (null)/Todd
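    Judging from the three queries and their row counts above, `= ALL (subquery)` holds for a row when its value equals every value the subquery returns, and it holds for every row when the subquery returns nothing at all. Python's built-in all() behaves the same way on an empty sequence, so it can serve as a rough analogy. The helper name equals_all and the sample department numbers are made up for illustration, and the sketch deliberately ignores SQL's NULL handling.

```python
def equals_all(value, subquery_values):
    """Rough analogue of `value = ALL (subquery)` for non-NULL values."""
    return all(value == v for v in subquery_values)


# The subquery returns a single distinct department: only that one matches.
print(equals_all(90, [90, 90, 90]))
print(equals_all(50, [90, 90, 90]))

# The subquery returns several different departments: no value can match.
print(equals_all(90, [90, 80, 100]))

# The subquery returns no rows: the condition holds for every value,
# which matches the case where all 107 rows are selected.
print(equals_all(50, []))
```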

    Read the article

  • How to find and fix performance problems in ORM powered applications

    - by FransBouma
    Once in a while we get requests about how to fix performance problems with our framework. As it comes down to following the same steps and looking into the same things every single time, I decided to write a blogpost about it instead, so more people can learn from this and solve performance problems in their O/R mapper powered applications. In some parts it's focused on LLBLGen Pro but it's also usable for other O/R mapping frameworks, as the vast majority of performance problems in O/R mapper powered applications are not specific for a certain O/R mapper framework. Too often, the developer looks at the wrong part of the application, trying to fix what isn't a problem in that part, and getting frustrated that 'things are so slow with <insert your favorite framework X here>'. I'm in the O/R mapper business for a long time now (almost 10 years, full time) and as it's a small world, we O/R mapper developers know almost all tricks to pull off by now: we all know what to do to make task ABC faster and what compromises (because there are almost always compromises) to deal with if we decide to make ABC faster that way. Some O/R mapper frameworks are faster in X, others in Y, but you can be sure the difference is mainly a result of a compromise some developers are willing to deal with and others aren't. That's why the O/R mapper frameworks on the market today are different in many ways, even though they all fetch and save entities from and to a database. I'm not suggesting there's no room for improvement in today's O/R mapper frameworks, there always is, but it's not a matter of 'the slowness of the application is caused by the O/R mapper' anymore. Perhaps query generation can be optimized a bit here, row materialization can be optimized a bit there, but it's mainly coming down to milliseconds. 
Still worth it if you're a framework developer, but it's not much compared to the time spend inside databases and in user code: if a complete fetch takes 40ms or 50ms (from call to entity object collection), it won't make a difference for your application as that 10ms difference won't be noticed. That's why it's very important to find the real locations of the problems so developers can fix them properly and don't get frustrated because their quest to get a fast, performing application failed. Performance tuning basics and rules Finding and fixing performance problems in any application is a strict procedure with four prescribed steps: isolate, analyze, interpret and fix, in that order. It's key that you don't skip a step nor make assumptions: these steps help you find the reason of a problem which seems to be there, and how to fix it or leave it as-is. Skipping a step, or when you assume things will be bad/slow without doing analysis will lead to the path of premature optimization and won't actually solve your problems, only create new ones. The most important rule of finding and fixing performance problems in software is that you have to understand what 'performance problem' actually means. Most developers will say "when a piece of software / code is slow, you have a performance problem". But is that actually the case? If I write a Linq query which will aggregate, group and sort 5 million rows from several tables to produce a resultset of 10 rows, it might take more than a couple of milliseconds before that resultset is ready to be consumed by other logic. If I solely look at the Linq query, the code consuming the resultset of the 10 rows and then look at the time it takes to complete the whole procedure, it will appear to me to be slow: all that time taken to produce and consume 10 rows? But if you look closer, if you analyze and interpret the situation, you'll see it does a tremendous amount of work, and in that light it might even be extremely fast. 
With every performance problem you encounter, always do realize that what you're trying to solve is perhaps not a technical problem at all, but a perception problem. The second most important rule you have to understand is based on the old saying "Penny wise, Pound Foolish": the part which takes e.g. 5% of the total time T for a given task isn't worth optimizing if you have another part which takes a much larger part of the total time T for that same given task. Optimizing parts which are relatively insignificant for the total time taken is not going to bring you better results overall, even if you totally optimize that part away. This is the core reason why analysis of the complete set of application parts which participate in a given task is key to being successful in solving performance problems: No analysis -> no problem -> no solution. One warning up front: hunting for performance will always include making compromises. Fast software can be made maintainable, but if you want to squeeze as much performance out of your software, you will inevitably be faced with the dilemma of compromising one or more from the group {readability, maintainability, features} for the extra performance you think you'll gain. It's then up to you to decide whether it's worth it. In almost all cases it's not. The reason for this is simple: the vast majority of performance problems can be solved by implementing the proper algorithms, the ones with proven Big O-characteristics so you know the performance you'll get plus you know the algorithm will work. The time taken by the algorithm implementing code is inevitable: you already implemented the best algorithm. You might find some optimizations on the technical level but in general these are minor. Let's look at the four steps to see how they guide us through the quest to find and fix performance problems. Isolate The first thing you need to do is to isolate the areas in your application which are assumed to be slow. 
For example, if your application is a web application and a given page is taking several seconds or even minutes to load, it's a good candidate to check out. It's important to start with the isolate step because it allows you to focus on a single code path per area, with a clear beginning and end, and ignore the rest. The rest of the steps are taken per identified problematic area. Keep in mind that isolation focuses on tasks in an application, not code snippets. A task is something that's started in your application by either another task, the user or another program, and has a beginning and an end. You can see a task as a piece of functionality offered by your application.

Analyze

Once you've determined the problem areas, you have to perform analysis on the code paths of each area, to see where the performance problems occur and which areas are not the problem. This is a multi-layered effort: an application which uses an O/R mapper typically consists of multiple parts: there's likely some kind of interface (web, webservice, windows etc.), a part which controls the interface and business logic, the O/R mapper part and the RDBMS, all connected with either a network or inter-process connections provided by the OS or other means. Each of these parts, including the connectivity plumbing, eats up a part of the total time it takes to complete a task, e.g. load a webpage with all orders of a given customer X. To understand which parts participate in the task / area we're investigating and how much they contribute to the total time taken to complete the task, analysis of each participating part is essential.

Start with the code you wrote which starts the task, analyze the code and track the path it follows through your application. Check what the code does along the way and verify whether it's correct. Analyze whether you have implemented the right algorithms in your code for this particular area.
Remember we're looking at one area at a time, which means we ignore all other code paths and follow just the code path of the current problematic area, from begin to end and back. Don't dig in and start optimizing at the code level just yet. We're just analyzing. If your analysis reveals big architectural stupidity, it's perhaps a good idea to rethink the architecture at this point. For the rest, we're analyzing, which means we collect data about what could be wrong, for each participating part of the complete application.

Reviewing the code you wrote is a good tool to get a deeper understanding of what is going on for a given task, but ultimately it lacks precision and an overview of what really happens: humans aren't good code interpreters, computers are. We therefore need tools to understand which parts contribute how much time to the total task, which other parts trigger them and, for example, how many times they are called. Two different kinds of tools are necessary: .NET profilers and O/R mapper / RDBMS profilers.

.NET profiling

.NET profilers (e.g. dotTrace by JetBrains or Ants by Red Gate software) show exactly which pieces of code are called, how many times they're called, and the time it took to run that piece of code, at the method level and sometimes even at the line level. The .NET profilers are essential tools for understanding whether the time taken to complete a given task / area in your application is consumed by .NET code, where exactly in your code, the path to that code and how many times that code was called by other code. They thus reveal where the hotspots are located: the areas where a solution can be found. Importantly, they also reveal which areas can be left alone: remember our penny wise, pound foolish saying: if a profiler reveals that a group of methods is fast, or doesn't contribute much to the total time taken for a given task, ignore them.
Even if the code in them is complex and looks like a candidate for optimization, you can work all day on it and it won't matter.

As we're focusing on a single area of the application, it's best to start profiling right before you actually activate the task/area. Most .NET profilers support this by starting the application without starting the profiling procedure just yet. You navigate to the particular part which is slow, start profiling in the profiler, perform the actions in your application which are considered slow, and afterwards you get a snapshot in the profiler. The snapshot contains the data collected by the profiler during the slow action, so most data is produced by code in the area to investigate. This is important, because it allows you to stay focused on a single area.

O/R mapper and RDBMS profiling

.NET profilers give you good insight into the .NET side of things, but not into the RDBMS side of the application. As this article is about O/R mapper powered applications, we're also looking at databases, and the software making it possible to consume the database in your application: the O/R mapper. To understand how much the O/R mapper and the database each contribute to the total time taken for task T, we need different tools. There are two kinds of tools focusing on O/R mapper and database performance profiling: O/R mapper profilers and RDBMS profilers. For O/R mapper profilers, you can look at LLBLGen Prof by Hibernating Rhinos or the Linq to Sql/LLBLGen Pro profiler by Huagati. Hibernating Rhinos also has profilers for other O/R mappers like NHibernate (NHProf) and Entity Framework (EFProf), which work the same as LLBLGen Prof. For RDBMS profilers, you have to check whether the RDBMS vendor has a profiler. For example, for SQL Server the profiler is shipped with SQL Server, and for Oracle it's built into the RDBMS; there are also 3rd party tools.
Which tool you're using isn't really important. What's important is that you get insight into which queries are executed during the task / area we're currently focused on and how long they took. Here, the O/R mapper profilers have an advantage, as they collect the time it took to execute the query from the application's perspective, so they also collect the time it took to transport data across the network. This is important because a query which returns a massive resultset, or a resultset with large blob/clob/ntext/image fields, takes more time to get transported across the network than a small resultset, and a database profiler doesn't take this into account most of the time.

Another tool to use in this case, which is more low level and not supported by all O/R mappers (though LLBLGen Pro and NHibernate do support it), is tracing: most O/R mappers offer some form of tracing or logging system which you can use to collect the SQL generated and executed, and often also other activity behind the scenes. While tracing can produce a tremendous amount of data in some cases, it also gives insight into what's going on.

Interpret

After we've completed the analysis step it's time to look at the data we've collected. We've done code reviews to see whether we've done anything stupid, which parts actually take part and whether the proper algorithms have been implemented. We've done .NET profiling to see which parts are choke points and how much time they contribute to the total time taken to complete the task we're investigating. We've performed O/R mapper profiling and RDBMS profiling to see which queries were executed during the task, how many queries were generated and executed and how long they took to complete, including network transportation. All this data reveals two things: which parts are big contributors to the total time taken and which parts are irrelevant. Both aspects are very important. The parts which are irrelevant (i.e. don't contribute significantly to the total time taken) can be ignored from now on; we won't look at them. The parts which contribute a lot to the total time taken are important to look at.

We now first have to look at the .NET profiler results, to see whether the time taken is consumed in our own code, in .NET framework code, in the O/R mapper itself or somewhere else. For example, if most of the time is consumed by DbCommand.ExecuteReader, the time it took to complete the task depends on the time it takes to fetch the data from the database. If just 1 query was executed, according to tracing or O/R mapper profilers / RDBMS profilers, check whether that query is optimal, uses indexes or has to deal with a lot of data.

Interpret means that you follow the path from begin to end through the data collected and determine where, along the path, the most time is contributed. It also means that you have to check whether this was expected or is totally unexpected. My previous example of the 10-row resultset of a query which groups millions of rows will likely reveal that a long time is spent inside the database and almost no time is spent in the .NET code, meaning the RDBMS part contributes the most to the total time taken; the rest is, compared to that time, irrelevant. Considering the vastness of the source data set, it's expected this will take some time. However, does it need tweaking? Perhaps all possible tweaks are already in place. In the interpret step you then have to decide whether further action in this area is necessary, based on what the analysis results show: if the analysis results were unexpected and there is room for improvement in the area which contributes the most to the total time taken, action should be taken. If not, you can only accept the situation and move on. In all cases, document your decision together with the analysis you've done.
If you decide that the perceived performance problem is actually expected due to the nature of the task performed, it's essential that, when someone else looks at the application in the future and starts asking questions, you can answer them properly, and that new analysis is only necessary if the situation has changed.

Fix

After interpreting the analysis results you've concluded that some areas need adjustment. This is the fix step: you're actively correcting the performance problem with proper action targeted at the real cause. In many cases related to O/R mapper powered applications it means you'll use different features of the O/R mapper to achieve the same goal, or apply optimizations at the RDBMS level. It could also mean you apply caching inside your application (trading memory consumption for performance) to avoid unnecessarily re-querying data and re-consuming the results.

After applying a change, it's key that you re-do the analysis and interpretation steps: compare the results and expectations with what you had before, to see whether your actions had any effect or whether they moved the problem to a different part of the application. Don't fall into the trap of doing partial analysis: do the full analysis again, both .NET profiling and O/R mapper / RDBMS profiling. It might very well be that the changes you've made make one part faster but another part significantly slower, in such a way that the overall problem hasn't changed at all.

Performance tuning is dealing with compromises and making choices: to use one feature over the other, to accept a higher memory footprint, to go away from the strict-OO path and execute queries directly onto the RDBMS. These are choices and compromises which will cross your path if you want to fix performance problems with respect to O/R mappers or data-access and databases in general. In most cases it's not a big issue: alternatives are often good choices too and the compromises aren't that hard to deal with.
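The caching trade-off mentioned in the fix step (memory spent to avoid re-querying) can be sketched in a few lines. This is only an illustration in Python with the standard-library sqlite3 module standing in for the RDBMS; the table, the `fetch_country_name` helper and the query counter are hypothetical names, not part of LLBLGen Pro or any other O/R mapper. Counting the queries before and after the change mirrors the advice to re-do the analysis after every fix.

```python
import sqlite3
from functools import lru_cache

# Hypothetical in-memory database standing in for the real RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (code TEXT PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO country VALUES (?, ?)",
                 [("NO", "Norway"), ("NL", "Netherlands")])
conn.commit()

query_count = 0  # how many times the database was actually hit


def fetch_country_name(code):
    """Uncached lookup: every call is a round trip to the database."""
    global query_count
    query_count += 1
    row = conn.execute("SELECT name FROM country WHERE code = ?", (code,)).fetchone()
    return row[0] if row else None


@lru_cache(maxsize=128)  # memory is traded for fewer round trips
def fetch_country_name_cached(code):
    return fetch_country_name(code)


# Measure before the change...
for _ in range(100):
    fetch_country_name("NO")
uncached_queries = query_count

# ...and measure again after it, instead of assuming it helped.
query_count = 0
for _ in range(100):
    fetch_country_name_cached("NO")
cached_queries = query_count

print(uncached_queries, cached_queries)
```

Note that a cache like this can serve stale data once the underlying row changes, which is exactly the kind of compromise that should be written down together with the measurements.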
What is important is that you document why you made a choice, a compromise: which analysis data, which interpretation led you to the choice made. This is key for good maintainability in the years to come.

Most common performance problems with O/R mappers

Below is an incomplete list of common performance problems related to data-access / O/R mappers / RDBMS code. It will help you with fixing the hotspots you found in the interpretation step.

SELECT N+1 (lazy-loading specific). Performance bottlenecks triggered by lazy loading. Consider a list of Orders bound to a grid. You have a field in Order mapped onto a related field, Customer.CompanyName. Showing this column in the grid will make the grid fetch (indirectly) the Customer row for each row. This means that for the single list you'll get not 1 query (for the orders) but 1+(the number of orders shown) queries. To solve this, use eager loading with a prefetch path to fetch the customers with the orders. SELECT N+1 is easy to spot with an O/R mapper profiler or RDBMS profiler: if you see a lot of identical queries executed at once, you have this problem.

Prefetch paths using many path nodes, sorting or limiting (eager-loading specific). Prefetch paths can help with performance, but as 1 query is executed per node, the amount of data fetched in a child node can be bigger than you think. Also consider that the data in every node is merged on the client with the parent. This is fast, but it can still take some time if you fetch massive amounts of entities. If you keep fetches small, you can use tuning parameters like the ParameterizedPrefetchPathThreshold setting to get more optimal queries.

Deep inheritance hierarchies of type Target Per Entity/Type. If you use inheritance of type Target Per Entity / Type (each type in the inheritance hierarchy is mapped onto its own table/view), fetches will join subtype and supertype tables in many cases, which can lead to a lot of performance problems if the hierarchy has many types. With this problem, keep inheritance to a minimum if possible, or switch to a hierarchy of type Target Per Hierarchy, which means all entities in the inheritance hierarchy are mapped onto the same table/view. Of course this has its own set of drawbacks, but it's a compromise you might want to make.

Fetching massive amounts of data by fetching large lists of entities. LLBLGen Pro supports paging (and limiting the number of rows returned), which is often key to processing large sets of data. Use paging on the RDBMS if possible (so a query is executed which returns only the rows in the page requested). When using paging in a web application, be sure that you switch on server-side paging on the datasourcecontrol used. In this case, paging on the grid alone is not enough: this can lead to fetching a lot of data which is then loaded into the grid and paged there. Note that analyzing queries for paging could lead to the false assumption that paging doesn't occur, e.g. when the query contains a field of type ntext/image/clob/blob and DISTINCT can't be applied while it should have been (e.g. due to a join): the datareader will then do DISTINCT filtering on the client. This is a little slower, but paging is still performed on the datareader, so it won't fetch all rows even if the query suggests it does.

Fetching massive amounts of data because blob/clob/ntext/image fields aren't excluded. LLBLGen Pro supports field exclusion for queries. You can exclude fields (also in prefetch paths) per query to avoid fetching all fields of an entity, e.g. when you don't need them for the logic consuming the resultset. Excluding fields can greatly reduce the amount of time spent on data transport across the network. Use this optimization if you see that there's a big difference between query execution time on the RDBMS and the time reported by the .NET profiler for the ExecuteReader method call.

Doing client-side aggregates/scalar calculations by consuming a lot of data. If possible, try to formulate a scalar query or group by query using the projection system or GetScalar functionality of LLBLGen Pro to do data consumption on the RDBMS server. It's far more efficient to process data on the RDBMS server than to first load it all in memory and then traverse the data in-memory to calculate a value.

Using .ToList() constructs inside Linq queries. It might be that you use .ToList() somewhere in a Linq query, which makes the query run partially in-memory. Example:

    var q = from c in metaData.Customers.ToList() where c.Country=="Norway" select c;

This will actually fetch all customers into memory and do the filtering in-memory, as the Linq query is defined on an IEnumerable<T>, and not on the IQueryable<T>. Linq is nice, but it can often be a bit unclear where some parts of a Linq query might run.

Fetching all entities to delete into memory first. To delete a set of entities it's rather inefficient to first fetch them all into memory and then delete them one by one. It's more efficient to execute a DELETE FROM ... WHERE query on the database directly to delete the entities in one go. LLBLGen Pro supports this feature, and so do some other O/R mappers. It's not always possible to do this operation in the context of an O/R mapper, however: if an O/R mapper relies on a cache, these kinds of operations are likely not supported, because they make it impossible to track whether an entity is actually removed from the DB and thus can be removed from the cache.

Fetching all entities to update with an expression into memory first. Similar to the previous point: it is more efficient to update a set of entities directly with a single UPDATE query using an expression, instead of fetching the entities into memory first, updating them in a loop and saving them afterwards. It might however be a compromise you don't want to make, as it works around the idea of having an object graph in memory which is manipulated, and instead makes the code fully aware that there's an RDBMS somewhere.

Conclusion

Performance tuning is almost always about compromises and making choices. It's also about knowing where to look and how the systems in play behave and should behave. The four steps I provided should help you stay focused on the real problem and lead you towards the solution. Knowing how to optimally use the systems participating in your own code (.NET framework, O/R mapper, RDBMS, network/services) is key for success, as is knowing what's going on inside the application you built. I hope you'll find this guide useful in tracking down performance problems and dealing with them in a useful way.
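The SELECT N+1 item from the list of common problems can be made concrete with a small sketch. It is written in Python against an in-memory SQLite database purely for illustration; the schema, the `run` helper and the two `grid_rows_*` functions are hypothetical and do not represent LLBLGen Pro's prefetch path API. The eager variant imitates the general idea of one query per node whose results are merged on the client.

```python
import sqlite3

# Hypothetical schema: orders shown in a grid, each with a related customer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, company_name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customer VALUES (1, 'Alfa'), (2, 'Beta');
    INSERT INTO orders VALUES (10, 1), (11, 2), (12, 1), (13, 2), (14, 1);
""")

query_log = []  # plays the role of a (very small) O/R mapper profiler


def run(sql, params=()):
    """Execute a query and record it in the log."""
    query_log.append(sql)
    return conn.execute(sql, params).fetchall()


def grid_rows_lazy():
    """Lazy loading: 1 query for the orders plus 1 query per order row."""
    rows = []
    for order_id, customer_id in run("SELECT id, customer_id FROM orders ORDER BY id"):
        (name,) = run("SELECT company_name FROM customer WHERE id = ?",
                      (customer_id,))[0]
        rows.append((order_id, name))
    return rows


def grid_rows_eager():
    """Eager loading: 1 query per node, merged on the client."""
    orders = run("SELECT id, customer_id FROM orders ORDER BY id")
    ids = sorted({customer_id for _, customer_id in orders})
    placeholders = ", ".join("?" for _ in ids)
    customers = dict(run(
        f"SELECT id, company_name FROM customer WHERE id IN ({placeholders})", ids))
    return [(order_id, customers[customer_id]) for order_id, customer_id in orders]


query_log.clear()
lazy_rows = grid_rows_lazy()
lazy_queries = len(query_log)

query_log.clear()
eager_rows = grid_rows_eager()
eager_queries = len(query_log)

print(lazy_queries, eager_queries)
```

Both functions return the same rows; only the number of round trips differs, and the lazy count grows with the number of orders shown while the eager count stays fixed. The run of identical customer queries in the lazy log is the pattern a profiler makes visible.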

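The items on client-side aggregates, in-memory filtering and fetch-then-delete share one idea: let the database do the set-based work. A minimal sketch of that idea follows, again in Python with sqlite3 rather than an O/R mapper; the `order_line` table and its data are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_line (id INTEGER PRIMARY KEY, country TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO order_line (country, amount) VALUES (?, ?)",
    [("Norway", 10), ("Norway", 15), ("Sweden", 7), ("Norway", 5), ("Sweden", 3)],
)

# Client-side: every row crosses the wire, then is filtered and summed in memory.
all_rows = conn.execute("SELECT country, amount FROM order_line").fetchall()
client_total = sum(amount for country, amount in all_rows if country == "Norway")

# Server-side: the database filters and aggregates; a single row comes back.
(server_total,) = conn.execute(
    "SELECT SUM(amount) FROM order_line WHERE country = ?", ("Norway",)
).fetchone()

# Set-based delete: one statement instead of fetching rows and deleting one by one.
deleted = conn.execute(
    "DELETE FROM order_line WHERE country = ?", ("Sweden",)).rowcount
remaining = conn.execute("SELECT COUNT(*) FROM order_line").fetchone()[0]

print(client_total, server_total, deleted, remaining)
```

The totals are identical either way; what changes is how many rows have to be transported and materialized, which is what shows up as the gap between RDBMS execution time and the time seen from the application.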
    Read the article

  • Database Schema for Machine Tags?

    - by Gabriel
    Machine tags are more precise tags: http://www.flickr.com/groups/api/discuss/72157594497877875. They allow a user to basically tag anything as an object in the format object:property=value. Any tips on an RDBMS schema that implements this? Just wondering if anyone has already dabbled with this. I imagine the schema is quite similar to implementing RDF triples in an RDBMS.

    Read the article

  • Looking for the most painless non-RDBMS storage method in C#

    - by NateD
    I'm writing a simple program that will run entirely client-side. (Desktop programming? do people still do that?) and I need a simple way to store trivial amounts of data in a structured form, but really don't see any need to use a database system. What's more, some of the data needs to be serialized and passed around to different users, like some kind of "file" or perhaps a "document". (has anyone ever done that before?) So, I've looked at using .Net DataSets, LINQ, direct XML manipulation, and they all seem like they would get the job done, but I would like to know before I dive into any of them if there's one method that is generally regarded as easier to code than others. As I said, the amount of data to be stored is trivial, even if one hundred people all used the same machine we're not talking about more than 10 MB, so performance is not as large a concern as is codeability/maintainability. Thank you all in advance!

    Read the article

  • How to train yourself to avoid writing “clever” code?

    - by Dan Abramov
    Do you know that feeling when you just need to show off that new trick with Expressions or generalize three different procedures? This does not have to be on Architecture Astronaut scale and in fact may be helpful, but I can't help but notice someone else would implement the same class or package in a clearer, more straightforward (and sometimes boring) manner. I noticed I often design programs by oversolving the problem, sometimes deliberately and sometimes out of boredom. In either case, I usually honestly believe my solution is crystal clear and elegant, until I see evidence to the contrary, but by then it's usually too late. There is also a part of me that prefers undocumented assumptions to code duplication, and cleverness to simplicity. What can I do to resist the urge to write “cleverish” code, and when should the bell ring that I am Doing It Wrong? The problem is getting even more pressing as I'm now working with a team of experienced developers, and sometimes my attempts at writing smart code seem foolish even to myself after time dispels the illusion of elegance.

    Read the article

  • What "bad practice" do you do, and why?

    - by coppro
    Well, "good practice" and "bad practice" are tossed around a lot these days - "Disable assertions in release builds", "Don't disable assertions in release builds", "Don't use goto.", we've got all sorts of guidelines above and beyond simply making your program work. So I ask of you, what coding practices do you violate all the time, and more importantly, why? Do you disagree with the establishment? Do you just not care? Why should everyone else do the same? cross links: What's your favorite abandoned rule? Rule you know you should follow but don't

    Read the article

  • Data validation best practices: how can I better construct user feedback?

    - by Cory Larson
    Data validation, whether it be domain object, form, or any other type of input validation, could theoretically be part of any development effort, no matter its size or complexity. I sometimes find myself writing informational or error messages that might seem harsh or demanding to unsuspecting users, and frankly I feel like there must be a better way to describe the validation problem to the user. I know that this topic is subjective and argumentative. I've migrated this question from StackOverflow where I originally asked it with little response. Basically, I'm looking for good resources on data validation and user feedback that results from it at a theoretical level. Topics and questions I'm interested in are: Content Should I be describing what the user did correctly or incorrectly, or simply what was expected? How much detail can the user read before they get annoyed? (e.g. Is "Username cannot exceed 20 characters." enough, or should it be described more fully, such as "The username cannot be empty, and must be at least 6 characters but cannot exceed 30 characters."?) Grammar How do I decide between phrases like "must not," "may not," or "cannot"? Delivery This can depend on the project, but how should the information be delivered to the user? Should it be obtrusive (e.g. JavaScript alerts) or friendly? Should they be displayed prominently? Immediately (i.e. without confirmation steps, etc.)? Logging Do you bother logging validation errors? Internationalization Some cultures prefer or better understand directness over subtlety and vice-versa (e.g. "Don't do that!" vs. "Please check what you've done."). How do I cater to the majority of users? I may edit this list as I think more about the topic, but I'm genuinely interested in proper user feedback techniques. I'm looking for things like research results, poll results, etc. 
I've developed and refined my own techniques over the years that users seem to be okay with, but I work in an environment where the users prefer to adapt to what you give them over speaking up about things they don't like. I'm interested in hearing your experiences in addition to any resources to which you may be able to point me.

    Read the article

  • Which software development methodologies can be seen as foundations

    - by Bas
    I'm writing a small research paper which involves software development methodologies. I was looking into all the available methodologies and I was wondering: of all the methodologies, are there any that have provided the foundations for the others? For example, looking at the following methodologies: Agile, Prototyping, Cleanroom, Iterative, RAD, RUP, Spiral, Waterfall, XP, Lean, Scrum, V-Model, TDD. Can we say that Prototyping, Iterative, Spiral and Waterfall are the "foundation" for the others? Or is there no such thing as "foundations", and does each methodology have its own unique history? I would of course like to describe all the methodologies in my research paper, but I simply don't have the time to do so, and that is why I would like to know which methodologies can be seen as representatives.

    Read the article

  • Where can I find video resources of people programming?

    - by Corey
    This might be a strange question. I'm looking for videos of people actively coding something while explaining it. However, what I don't want is a beginner video that delves into what variables and objects are. Nick Gravelyn's tile engine tutorial is a great example of what I'm looking for. (He actually used to host the full, unbroken video files in his site's archive, but I guess he took them down...) I tend to learn best by "action" examples; it's difficult for me to learn by reading through documentation and text tutorials, but if I see somebody actively doing a task, I can immediately register it and apply it myself. I'm hard-of-hearing, so I would really prefer that if the video has a lot of talking, it have captioning or subtitling of some sort, or at the very least, a transcript. The tile engine videos did not have captions, but the code he was writing was very self-documenting, so I understood it for the most part. I've gone through most of the relevant GoogleDevelopers and GoogleTechTalks videos on Youtube, so those need not apply. Are there any resources out there, or even websites dedicated to this kind of thing?

    Read the article

  • Why are invariants important in Computer Science

    - by Antony Thomas
    I understand 'invariant' in its literal sense. I also recognize invariants when I type code. But I don't think I understand the importance of this term in the context of computer science. Whenever I read conversations\white papers about language design from famous programmers\computer scientists, the term 'invariant' keeps popping up as jargon, and that is the part I don't understand. What is so special about it?

    Read the article
