trick - Page 79 - Developer IT

A Taxonomy of Numerical Methods v1

- by JoshReuben

Numerical Analysis – When, What, (but not how) Once you understand the Math & know C++, Numerical Methods are basically blocks of iterative & conditional math code. I found the real trick was seeing the forest for the trees – knowing which method to use for which situation. Its pretty easy to get lost in the details – so I’ve tried to organize these methods in a way that I can quickly look this up. I’ve included links to detailed explanations and to C++ code examples. I’ve tried to classify Numerical methods in the following broad categories: Solving Systems of Linear Equations Solving Non-Linear Equations Iteratively Interpolation Curve Fitting Optimization Numerical Differentiation & Integration Solving ODEs Boundary Problems Solving EigenValue problems Enjoy – I did ! Solving Systems of Linear Equations Overview Solve sets of algebraic equations with x unknowns The set is commonly in matrix form Gauss-Jordan Elimination http://en.wikipedia.org/wiki/Gauss%E2%80%93Jordan_elimination C++: http://www.codekeep.net/snippets/623f1923-e03c-4636-8c92-c9dc7aa0d3c0.aspx Produces solution of the equations & the coefficient matrix Efficient, stable 2 steps: · Forward Elimination – matrix decomposition: reduce set to triangular form (0s below the diagonal) or row echelon form. If degenerate, then there is no solution · Backward Elimination –write the original matrix as the product of ints inverse matrix & its reduced row-echelon matrix à reduce set to row canonical form & use back-substitution to find the solution to the set Elementary ops for matrix decomposition: · Row multiplication · Row switching · Add multiples of rows to other rows Use pivoting to ensure rows are ordered for achieving triangular form LU Decomposition http://en.wikipedia.org/wiki/LU_decomposition C++: http://ganeshtiwaridotcomdotnp.blogspot.co.il/2009/12/c-c-code-lu-decomposition-for-solving.html Represent the matrix as a product of lower & upper triangular matrices A modified version of GJ Elimination Advantage – can easily apply forward & backward elimination to solve triangular matrices Techniques: · Doolittle Method – sets the L matrix diagonal to unity · Crout Method - sets the U matrix diagonal to unity Note: both the L & U matrices share the same unity diagonal & can be stored compactly in the same matrix Gauss-Seidel Iteration http://en.wikipedia.org/wiki/Gauss%E2%80%93Seidel_method C++: http://www.nr.com/forum/showthread.php?t=722 Transform the linear set of equations into a single equation & then use numerical integration (as integration formulas have Sums, it is implemented iteratively). an optimization of Gauss-Jacobi: 1.5 times faster, requires 0.25 iterations to achieve the same tolerance Solving Non-Linear Equations Iteratively find roots of polynomials – there may be 0, 1 or n solutions for an n order polynomial use iterative techniques Iterative methods · used when there are no known analytical techniques · Requires set functions to be continuous & differentiable · Requires an initial seed value – choice is critical to convergence à conduct multiple runs with different starting points & then select best result · Systematic - iterate until diminishing returns, tolerance or max iteration conditions are met · bracketing techniques will always yield convergent solutions, non-bracketing methods may fail to converge Incremental method if a nonlinear function has opposite signs at 2 ends of a small interval x1 & x2, then there is likely to be a solution in their interval – solutions are detected by evaluating a function over interval steps, for a change in sign, adjusting the step size dynamically. Limitations – can miss closely spaced solutions in large intervals, cannot detect degenerate (coinciding) solutions, limited to functions that cross the x-axis, gives false positives for singularities Fixed point method http://en.wikipedia.org/wiki/Fixed-point_iteration C++: http://books.google.co.il/books?id=weYj75E_t6MC&pg=PA79&lpg=PA79&dq=fixed+point+method++c%2B%2B&source=bl&ots=LQ-5P_taoC&sig=lENUUIYBK53tZtTwNfHLy5PEWDk&hl=en&sa=X&ei=wezDUPW1J5DptQaMsIHQCw&redir_esc=y#v=onepage&q=fixed%20point%20method%20%20c%2B%2B&f=false Algebraically rearrange a solution to isolate a variable then apply incremental method Bisection method http://en.wikipedia.org/wiki/Bisection_method C++: http://numericalcomputing.wordpress.com/category/algorithms/ Bracketed - Select an initial interval, keep bisecting it ad midpoint into sub-intervals and then apply incremental method on smaller & smaller intervals – zoom in Adv: unaffected by function gradient à reliable Disadv: slow convergence False Position Method http://en.wikipedia.org/wiki/False_position_method C++: http://www.dreamincode.net/forums/topic/126100-bisection-and-false-position-methods/ Bracketed - Select an initial interval , & use the relative value of function at interval end points to select next sub-intervals (estimate how far between the end points the solution might be & subdivide based on this) Newton-Raphson method http://en.wikipedia.org/wiki/Newton's_method C++: http://www-users.cselabs.umn.edu/classes/Summer-2012/csci1113/index.php?page=./newt3 Also known as Newton's method Convenient, efficient Not bracketed – only a single initial guess is required to start iteration – requires an analytical expression for the first derivative of the function as input. Evaluates the function & its derivative at each step. Can be extended to the Newton MutiRoot method for solving multiple roots Can be easily applied to an of n-coupled set of non-linear equations – conduct a Taylor Series expansion of a function, dropping terms of order n, rewrite as a Jacobian matrix of PDs & convert to simultaneous linear equations !!! Secant Method http://en.wikipedia.org/wiki/Secant_method C++: http://forum.vcoderz.com/showthread.php?p=205230 Unlike N-R, can estimate first derivative from an initial interval (does not require root to be bracketed) instead of inputting it Since derivative is approximated, may converge slower. Is fast in practice as it does not have to evaluate the derivative at each step. Similar implementation to False Positive method Birge-Vieta Method http://mat.iitm.ac.in/home/sryedida/public_html/caimna/transcendental/polynomial%20methods/bv%20method.html C++: http://books.google.co.il/books?id=cL1boM2uyQwC&pg=SA3-PA51&lpg=SA3-PA51&dq=Birge-Vieta+Method+c%2B%2B&source=bl&ots=QZmnDTK3rC&sig=BPNcHHbpR_DKVoZXrLi4nVXD-gg&hl=en&sa=X&ei=R-_DUK2iNIjzsgbE5ID4Dg&redir_esc=y#v=onepage&q=Birge-Vieta%20Method%20c%2B%2B&f=false combines Horner's method of polynomial evaluation (transforming into lesser degree polynomials that are more computationally efficient to process) with Newton-Raphson to provide a computational speed-up Interpolation Overview Construct new data points for as close as possible fit within range of a discrete set of known points (that were obtained via sampling, experimentation) Use Taylor Series Expansion of a function f(x) around a specific value for x Linear Interpolation http://en.wikipedia.org/wiki/Linear_interpolation C++: http://www.hamaluik.com/?p=289 Straight line between 2 points à concatenate interpolants between each pair of data points Bilinear Interpolation http://en.wikipedia.org/wiki/Bilinear_interpolation C++: http://supercomputingblog.com/graphics/coding-bilinear-interpolation/2/ Extension of the linear function for interpolating functions of 2 variables – perform linear interpolation first in 1 direction, then in another. Used in image processing – e.g. texture mapping filter. Uses 4 vertices to interpolate a value within a unit cell. Lagrange Interpolation http://en.wikipedia.org/wiki/Lagrange_polynomial C++: http://www.codecogs.com/code/maths/approximation/interpolation/lagrange.php For polynomials Requires recomputation for all terms for each distinct x value – can only be applied for small number of nodes Numerically unstable Barycentric Interpolation http://epubs.siam.org/doi/pdf/10.1137/S0036144502417715 C++: http://www.gamedev.net/topic/621445-barycentric-coordinates-c-code-check/ Rearrange the terms in the equation of the Legrange interpolation by defining weight functions that are independent of the interpolated value of x Newton Divided Difference Interpolation http://en.wikipedia.org/wiki/Newton_polynomial C++: http://jee-appy.blogspot.co.il/2011/12/newton-divided-difference-interpolation.html Hermite Divided Differences: Interpolation polynomial approximation for a given set of data points in the NR form - divided differences are used to approximately calculate the various differences. For a given set of 3 data points , fit a quadratic interpolant through the data Bracketed functions allow Newton divided differences to be calculated recursively Difference table Cubic Spline Interpolation http://en.wikipedia.org/wiki/Spline_interpolation C++: https://www.marcusbannerman.co.uk/index.php/home/latestarticles/42-articles/96-cubic-spline-class.html Spline is a piecewise polynomial Provides smoothness – for interpolations with significantly varying data Use weighted coefficients to bend the function to be smooth & its 1st & 2nd derivatives are continuous through the edge points in the interval Curve Fitting A generalization of interpolating whereby given data points may contain noise à the curve does not necessarily pass through all the points Least Squares Fit http://en.wikipedia.org/wiki/Least_squares C++: http://www.ccas.ru/mmes/educat/lab04k/02/least-squares.c Residual – difference between observed value & expected value Model function is often chosen as a linear combination of the specified functions Determines: A) The model instance in which the sum of squared residuals has the least value B) param values for which model best fits data Straight Line Fit Linear correlation between independent variable and dependent variable Linear Regression http://en.wikipedia.org/wiki/Linear_regression C++: http://www.oocities.org/david_swaim/cpp/linregc.htm Special case of statistically exact extrapolation Leverage least squares Given a basis function, the sum of the residuals is determined and the corresponding gradient equation is expressed as a set of normal linear equations in matrix form that can be solved (e.g. using LU Decomposition) Can be weighted - Drop the assumption that all errors have the same significance –-> confidence of accuracy is different for each data point. Fit the function closer to points with higher weights Polynomial Fit - use a polynomial basis function Moving Average http://en.wikipedia.org/wiki/Moving_average C++: http://www.codeproject.com/Articles/17860/A-Simple-Moving-Average-Algorithm Used for smoothing (cancel fluctuations to highlight longer-term trends & cycles), time series data analysis, signal processing filters Replace each data point with average of neighbors. Can be simple (SMA), weighted (WMA), exponential (EMA). Lags behind latest data points – extra weight can be given to more recent data points. Weights can decrease arithmetically or exponentially according to distance from point. Parameters: smoothing factor, period, weight basis Optimization Overview Given function with multiple variables, find Min (or max by minimizing –f(x)) Iterative approach Efficient, but not necessarily reliable Conditions: noisy data, constraints, non-linear models Detection via sign of first derivative - Derivative of saddle points will be 0 Local minima Bisection method Similar method for finding a root for a non-linear equation Start with an interval that contains a minimum Golden Search method http://en.wikipedia.org/wiki/Golden_section_search C++: http://www.codecogs.com/code/maths/optimization/golden.php Bisect intervals according to golden ratio 0.618.. Achieves reduction by evaluating a single function instead of 2 Newton-Raphson Method Brent method http://en.wikipedia.org/wiki/Brent's_method C++: http://people.sc.fsu.edu/~jburkardt/cpp_src/brent/brent.cpp Based on quadratic or parabolic interpolation – if the function is smooth & parabolic near to the minimum, then a parabola fitted through any 3 points should approximate the minima – fails when the 3 points are collinear , in which case the denominator is 0 Simplex Method http://en.wikipedia.org/wiki/Simplex_algorithm C++: http://www.codeguru.com/cpp/article.php/c17505/Simplex-Optimization-Algorithm-and-Implemetation-in-C-Programming.htm Find the global minima of any multi-variable function Direct search – no derivatives required At each step it maintains a non-degenerative simplex – a convex hull of n+1 vertices. Obtains the minimum for a function with n variables by evaluating the function at n-1 points, iteratively replacing the point of worst result with the point of best result, shrinking the multidimensional simplex around the best point. Point replacement involves expanding & contracting the simplex near the worst value point to determine a better replacement point Oscillation can be avoided by choosing the 2nd worst result Restart if it gets stuck Parameters: contraction & expansion factors Simulated Annealing http://en.wikipedia.org/wiki/Simulated_annealing C++: http://code.google.com/p/cppsimulatedannealing/ Analogy to heating & cooling metal to strengthen its structure Stochastic method – apply random permutation search for global minima - Avoid entrapment in local minima via hill climbing Heating schedule - Annealing schedule params: temperature, iterations at each temp, temperature delta Cooling schedule – can be linear, step-wise or exponential Differential Evolution http://en.wikipedia.org/wiki/Differential_evolution C++: http://www.amichel.com/de/doc/html/ More advanced stochastic methods analogous to biological processes: Genetic algorithms, evolution strategies Parallel direct search method against multiple discrete or continuous variables Initial population of variable vectors chosen randomly – if weighted difference vector of 2 vectors yields a lower objective function value then it replaces the comparison vector Many params: #parents, #variables, step size, crossover constant etc Convergence is slow – many more function evaluations than simulated annealing Numerical Differentiation Overview 2 approaches to finite difference methods: · A) approximate function via polynomial interpolation then differentiate · B) Taylor series approximation – additionally provides error estimate Finite Difference methods http://en.wikipedia.org/wiki/Finite_difference_method C++: http://www.wpi.edu/Pubs/ETD/Available/etd-051807-164436/unrestricted/EAMPADU.pdf Find differences between high order derivative values - Approximate differential equations by finite differences at evenly spaced data points Based on forward & backward Taylor series expansion of f(x) about x plus or minus multiples of delta h. Forward / backward difference - the sums of the series contains even derivatives and the difference of the series contains odd derivatives – coupled equations that can be solved. Provide an approximation of the derivative within a O(h^2) accuracy There is also central difference & extended central difference which has a O(h^4) accuracy Richardson Extrapolation http://en.wikipedia.org/wiki/Richardson_extrapolation C++: http://mathscoding.blogspot.co.il/2012/02/introduction-richardson-extrapolation.html A sequence acceleration method applied to finite differences Fast convergence, high accuracy O(h^4) Derivatives via Interpolation Cannot apply Finite Difference method to discrete data points at uneven intervals – so need to approximate the derivative of f(x) using the derivative of the interpolant via 3 point Lagrange Interpolation Note: the higher the order of the derivative, the lower the approximation precision Numerical Integration Estimate finite & infinite integrals of functions More accurate procedure than numerical differentiation Use when it is not possible to obtain an integral of a function analytically or when the function is not given, only the data points are Newton Cotes Methods http://en.wikipedia.org/wiki/Newton%E2%80%93Cotes_formulas C++: http://www.siafoo.net/snippet/324 For equally spaced data points Computationally easy – based on local interpolation of n rectangular strip areas that is piecewise fitted to a polynomial to get the sum total area Evaluate the integrand at n+1 evenly spaced points – approximate definite integral by Sum Weights are derived from Lagrange Basis polynomials Leverage Trapezoidal Rule for default 2nd formulas, Simpson 1/3 Rule for substituting 3 point formulas, Simpson 3/8 Rule for 4 point formulas. For 4 point formulas use Bodes Rule. Higher orders obtain more accurate results Trapezoidal Rule uses simple area, Simpsons Rule replaces the integrand f(x) with a quadratic polynomial p(x) that uses the same values as f(x) for its end points, but adds a midpoint Romberg Integration http://en.wikipedia.org/wiki/Romberg's_method C++: http://code.google.com/p/romberg-integration/downloads/detail?name=romberg.cpp&can=2&q= Combines trapezoidal rule with Richardson Extrapolation Evaluates the integrand at equally spaced points The integrand must have continuous derivatives Each R(n,m) extrapolation uses a higher order integrand polynomial replacement rule (zeroth starts with trapezoidal) à a lower triangular matrix set of equation coefficients where the bottom right term has the most accurate approximation. The process continues until the difference between 2 successive diagonal terms becomes sufficiently small. Gaussian Quadrature http://en.wikipedia.org/wiki/Gaussian_quadrature C++: http://www.alglib.net/integration/gaussianquadratures.php Data points are chosen to yield best possible accuracy – requires fewer evaluations Ability to handle singularities, functions that are difficult to evaluate The integrand can include a weighting function determined by a set of orthogonal polynomials. Points & weights are selected so that the integrand yields the exact integral if f(x) is a polynomial of degree <= 2n+1 Techniques (basically different weighting functions): · Gauss-Legendre Integration w(x)=1 · Gauss-Laguerre Integration w(x)=e^-x · Gauss-Hermite Integration w(x)=e^-x^2 · Gauss-Chebyshev Integration w(x)= 1 / Sqrt(1-x^2) Solving ODEs Use when high order differential equations cannot be solved analytically Evaluated under boundary conditions RK for systems – a high order differential equation can always be transformed into a coupled first order system of equations Euler method http://en.wikipedia.org/wiki/Euler_method C++: http://rosettacode.org/wiki/Euler_method First order Runge–Kutta method. Simple recursive method – given an initial value, calculate derivative deltas. Unstable & not very accurate (O(h) error) – not used in practice A first-order method - the local error (truncation error per step) is proportional to the square of the step size, and the global error (error at a given time) is proportional to the step size In evolving solution between data points xn & xn+1, only evaluates derivatives at beginning of interval xn à asymmetric at boundaries Higher order Runge Kutta http://en.wikipedia.org/wiki/Runge%E2%80%93Kutta_methods C++: http://www.dreamincode.net/code/snippet1441.htm 2nd & 4th order RK - Introduces parameterized midpoints for more symmetric solutions à accuracy at higher computational cost Adaptive RK – RK-Fehlberg – estimate the truncation at each integration step & automatically adjust the step size to keep error within prescribed limits. At each step 2 approximations are compared – if in disagreement to a specific accuracy, the step size is reduced Boundary Value Problems Where solution of differential equations are located at 2 different values of the independent variable x à more difficult, because cannot just start at point of initial value – there may not be enough starting conditions available at the end points to produce a unique solution An n-order equation will require n boundary conditions – need to determine the missing n-1 conditions which cause the given conditions at the other boundary to be satisfied Shooting Method http://en.wikipedia.org/wiki/Shooting_method C++: http://ganeshtiwaridotcomdotnp.blogspot.co.il/2009/12/c-c-code-shooting-method-for-solving.html Iteratively guess the missing values for one end & integrate, then inspect the discrepancy with the boundary values of the other end to adjust the estimate Given the starting boundary values u1 & u2 which contain the root u, solve u given the false position method (solving the differential equation as an initial value problem via 4th order RK), then use u to solve the differential equations. Finite Difference Method For linear & non-linear systems Higher order derivatives require more computational steps – some combinations for boundary conditions may not work though Improve the accuracy by increasing the number of mesh points Solving EigenValue Problems An eigenvalue can substitute a matrix when doing matrix multiplication à convert matrix multiplication into a polynomial EigenValue For a given set of equations in matrix form, determine what are the solution eigenvalue & eigenvectors Similar Matrices - have same eigenvalues. Use orthogonal similarity transforms to reduce a matrix to diagonal form from which eigenvalue(s) & eigenvectors can be computed iteratively Jacobi method http://en.wikipedia.org/wiki/Jacobi_method C++: http://people.sc.fsu.edu/~jburkardt/classes/acs2_2008/openmp/jacobi/jacobi.html Robust but Computationally intense – use for small matrices < 10x10 Power Iteration http://en.wikipedia.org/wiki/Power_iteration For any given real symmetric matrix, generate the largest single eigenvalue & its eigenvectors Simplest method – does not compute matrix decomposition à suitable for large, sparse matrices Inverse Iteration Variation of power iteration method – generates the smallest eigenvalue from the inverse matrix Rayleigh Method http://en.wikipedia.org/wiki/Rayleigh's_method_of_dimensional_analysis Variation of power iteration method Rayleigh Quotient Method Variation of inverse iteration method Matrix Tri-diagonalization Method Use householder algorithm to reduce an NxN symmetric matrix to a tridiagonal real symmetric matrix vua N-2 orthogonal transforms Whats Next Outside of Numerical Methods there are lots of different types of algorithms that I’ve learned over the decades: Data Mining – (I covered this briefly in a previous post: http://geekswithblogs.net/JoshReuben/archive/2007/12/31/ssas-dm-algorithms.aspx ) Search & Sort Routing Problem Solving Logical Theorem Proving Planning Probabilistic Reasoning Machine Learning Solvers (eg MIP) Bioinformatics (Sequence Alignment, Protein Folding) Quant Finance (I read Wilmott’s books – interesting) Sooner or later, I’ll cover the above topics as well.

Read the article

Quick guide to Oracle IRM 11g: Classification design

- by Simon Thorpe

Quick guide to Oracle IRM 11g indexThis is the final article in the quick guide to Oracle IRM. If you've followed everything prior you will now have a fully functional and tested Information Rights Management service. It doesn't matter if you've been following the 10g or 11g guide as this next article is common to both. ContentsWhy this is the most important part... Understanding the classification and standard rights model Identifying business use cases Creating an effective IRM classification modelOne single classification across the entire businessA context for each and every possible granular use caseWhat makes a good context? Deciding on the use of roles in the context Reviewing the features and security for context roles Summary Why this is the most important part...Now the real work begins, installing and getting an IRM system running is as simple as following instructions. However to actually have an IRM technology easily protecting your most sensitive information without interfering with your users existing daily work flows and be able to scale IRM across the entire business, requires thought into how confidential documents are created, used and distributed. This article is going to give you the information you need to ask the business the right questions so that you can deploy your IRM service successfully. The IRM team here at Oracle have over 10 years of experience in helping customers and it is important you understand the following to be successful in securing access to your most confidential information. Whatever you are trying to secure, be it mergers and acquisitions information, engineering intellectual property, health care documentation or financial reports. No matter what type of user is going to access the information, be they employees, contractors or customers, there are common goals you are always trying to achieve.Securing the content at the earliest point possible and do it automatically. Removing the dependency on the user to decide to secure the content reduces the risk of mistakes significantly and therefore results a more secure deployment. K.I.S.S. (Keep It Simple Stupid) Reduce complexity in the rights/classification model. Oracle IRM lets you make changes to access to documents even after they are secured which allows you to start with a simple model and then introduce complexity once you've understood how the technology is going to be used in the business. After an initial learning period you can review your implementation and start to make informed decisions based on user feedback and administration experience. Clearly communicate to the user, when appropriate, any changes to their existing work practice. You must make every effort to make the transition to sealed content as simple as possible. For external users you must help them understand why you are securing the documents and inform them the value of the technology to both your business and them. Before getting into the detail, I must pay homage to Martin White, Vice President of client services in SealedMedia, the company Oracle acquired and who created Oracle IRM. In the SealedMedia years Martin was involved with every single customer and was key to the design of certain aspects of the IRM technology, specifically the context model we will be discussing here. Listening carefully to customers and understanding the flexibility of the IRM technology, Martin taught me all the skills of helping customers build scalable, effective and simple to use IRM deployments. No matter how well the engineering department designed the software, badly designed and poorly executed projects can result in difficult to use and manage, and ultimately insecure solutions. The advice and information that follows was born with Martin and he's still delivering IRM consulting with customers and can be found at www.thinkers.co.uk. It is from Martin and others that Oracle not only has the most advanced, scalable and usable document security solution on the market, but Oracle and their partners have the most experience in delivering successful document security solutions. Understanding the classification and standard rights model The goal of any successful IRM deployment is to balance the increase in security the technology brings without over complicating the way people use secured content and avoid a significant increase in administration and maintenance. With Oracle it is possible to automate the protection of content, deploy the desktop software transparently and use authentication methods such that users can open newly secured content initially unaware the document is any different to an insecure one. That is until of course they attempt to do something for which they don't have any rights, such as copy and paste to an insecure application or try and print. Central to achieving this objective is creating a classification model that is simple to understand and use but also provides the right level of complexity to meet the business needs. In Oracle IRM the term used for each classification is a "context". A context defines the relationship between.A group of related documents The people that use the documents The roles that these people perform The rights that these people need to perform their role The context is the key to the success of Oracle IRM. It provides the separation of the role and rights of a user from the content itself. Documents are sealed to contexts but none of the rights, user or group information is stored within the content itself. Sealing only places information about the location of the IRM server that sealed it, the context applied to the document and a few other pieces of metadata that pertain only to the document. This important separation of rights from content means that millions of documents can be secured against a single classification and a user needs only one right assigned to be able to access all documents. If you have followed all the previous articles in this guide, you will be ready to start defining contexts to which your sensitive information will be protected. But before you even start with IRM, you need to understand how your own business uses and creates sensitive documents and emails. Identifying business use cases Oracle is able to support multiple classification systems, but usually there is one single initial need for the technology which drives a deployment. This need might be to protect sensitive mergers and acquisitions information, engineering intellectual property, financial documents. For this and every subsequent use case you must understand how users create and work with documents, to who they are distributed and how the recipients should interact with them. A successful IRM deployment should start with one well identified use case (we go through some examples towards the end of this article) and then after letting this use case play out in the business, you learn how your users work with content, how well your communication to the business worked and if the classification system you deployed delivered the right balance. It is at this point you can start rolling the technology out further. Creating an effective IRM classification model Once you have selected the initial use case you will address with IRM, you need to design a classification model that defines the access to secured documents within the use case. In Oracle IRM there is an inbuilt classification system called the "context" model. In Oracle IRM 11g it is possible to extend the server to support any rights classification model, but the majority of users who are not using an application integration (such as Oracle IRM within Oracle Beehive) are likely to be starting out with the built in context model. Before looking at creating a classification system with IRM, it is worth reviewing some recognized standards and methods for creating and implementing security policy. A very useful set of documents are the ISO 17799 guidelines and the SANS security policy templates. First task is to create a context against which documents are to be secured. A context consists of a group of related documents (all top secret engineering research), a list of roles (contributors and readers) which define how users can access documents and a list of users (research engineers) who have been given a role allowing them to interact with sealed content. Before even creating the first context it is wise to decide on a philosophy which will dictate the level of granularity, the question is, where do you start? At a department level? By project? By technology? First consider the two ends of the spectrum... One single classification across the entire business Imagine that instead of having separate contexts, one for engineering intellectual property, one for your financial data, one for human resources personally identifiable information, you create one context for all documents across the entire business. Whilst you may have immediate objections, there are some significant benefits in thinking about considering this. Document security classification decisions are simple. You only have one context to chose from! User provisioning is simple, just make sure everyone has a role in the only context in the business. Administration is very low, if you assign rights to groups from the business user repository you probably never have to touch IRM administration again. There are however some obvious downsides to this model.All users in have access to all IRM secured content. So potentially a sales person could access sensitive mergers and acquisition documents, if they can get their hands on a copy that is. You cannot delegate control of different documents to different parts of the business, this may not satisfy your regulatory requirements for the separation and delegation of duties. Changing a users role affects every single document ever secured. Even though it is very unlikely a business would ever use one single context to secure all their sensitive information, thinking about this scenario raises one very important point. Just having one single context and securing all confidential documents to it, whilst incurring some of the problems detailed above, has one huge value. Once secured, IRM protected content can ONLY be accessed by authorized users. Just think of all the sensitive documents in your business today, imagine if you could ensure that only everyone you trust could open them. Even if an employee lost a laptop or someone accidentally sent an email to the wrong recipient, only the right people could open that file. A context for each and every possible granular use case Now let's think about the total opposite of a single context design. What if you created a context for each and every single defined business need and created multiple contexts within this for each level of granularity? Let's take a use case where we need to protect engineering intellectual property. Imagine we have 6 different engineering groups, and in each we have a research department, a design department and manufacturing. The company information security policy defines 3 levels of information sensitivity... restricted, confidential and top secret. Then let's say that each group and department needs to define access to information from both internal and external users. Finally add into the mix that they want to review the rights model for each context every financial quarter. This would result in a huge amount of contexts. For example, lets just look at the resulting contexts for one engineering group. Q1FY2010 Restricted Internal - Engineering Group 1 - Research Q1FY2010 Restricted Internal - Engineering Group 1 - Design Q1FY2010 Restricted Internal - Engineering Group 1 - Manufacturing Q1FY2010 Restricted External- Engineering Group 1 - Research Q1FY2010 Restricted External - Engineering Group 1 - Design Q1FY2010 Restricted External - Engineering Group 1 - Manufacturing Q1FY2010 Confidential Internal - Engineering Group 1 - Research Q1FY2010 Confidential Internal - Engineering Group 1 - Design Q1FY2010 Confidential Internal - Engineering Group 1 - Manufacturing Q1FY2010 Confidential External - Engineering Group 1 - Research Q1FY2010 Confidential External - Engineering Group 1 - Design Q1FY2010 Confidential External - Engineering Group 1 - Manufacturing Q1FY2010 Top Secret Internal - Engineering Group 1 - Research Q1FY2010 Top Secret Internal - Engineering Group 1 - Design Q1FY2010 Top Secret Internal - Engineering Group 1 - Manufacturing Q1FY2010 Top Secret External - Engineering Group 1 - Research Q1FY2010 Top Secret External - Engineering Group 1 - Design Q1FY2010 Top Secret External - Engineering Group 1 - Manufacturing Now multiply the above by 6 for each engineering group, 18 contexts. You are then creating/reviewing another 18 every 3 months. After a year you've got 72 contexts. What would be the advantages of such a complex classification model? You can satisfy very granular rights requirements, for example only an authorized engineering group 1 researcher can create a top secret report for access internally, and his role will be reviewed on a very frequent basis. Your business may have very complex rights requirements and mapping this directly to IRM may be an obvious exercise. The disadvantages of such a classification model are significant...Huge administrative overhead. Someone in the business must manage, review and administrate each of these contexts. If the engineering group had a single administrator, they would have 72 classifications to reside over each year. From an end users perspective life will be very confusing. Imagine if a user has rights in just 6 of these contexts. They may be able to print content from one but not another, be able to edit content in 2 contexts but not the other 4. Such confusion at the end user level causes frustration and resistance to the use of the technology. Increased synchronization complexity. Imagine a user who after 3 years in the company ends up with over 300 rights in many different contexts across the business. This would result in long synchronization times as the client software updates all your offline rights. Hard to understand who can do what with what. Imagine being the VP of engineering and as part of an internal security audit you are asked the question, "What rights to researchers have to our top secret information?". In this complex model the answer is not simple, it would depend on many roles in many contexts. Of course this example is extreme, but it highlights that trying to build many barriers in your business can result in a nightmare of administration and confusion amongst users. In the real world what we need is a balance of the two. We need to seek an optimum number of contexts. Too many contexts are unmanageable and too few contexts does not give fine enough granularity. What makes a good context? Good context design derives mainly from how well you understand your business requirements to secure access to confidential information. Some customers I have worked with can tell me exactly the documents they wish to secure and know exactly who should be opening them. However there are some customers who know only of the government regulation that requires them to control access to certain types of information, they don't actually know where the documents are, how they are created or understand exactly who should have access. Therefore you need to know how to ask the business the right questions that lead to information which help you define a context. First ask these questions about a set of documentsWhat is the topic? Who are legitimate contributors on this topic? Who are the authorized readership? If the answer to any one of these is significantly different, then it probably merits a separate context. Remember that sealed documents are inherently secure and as such they cannot leak to your competitors, therefore it is better sealed to a broad context than not sealed at all. Simplicity is key here. Always revert to the first extreme example of a single classification, then work towards essential complexity. If there is any doubt, always prefer fewer contexts. Remember, Oracle IRM allows you to change your mind later on. You can implement a design now and continue to change and refine as you learn how the technology is used. It is easy to go from a simple model to a more complex one, it is much harder to take a complex model that is already embedded in the work practice of users and try to simplify it. It is also wise to take a single use case and address this first with the business. Don't try and tackle many different problems from the outset. Do one, learn from the process, refine it and then take what you have learned into the next use case, refine and continue. Once you have a good grasp of the technology and understand how your business will use it, you can then start rolling out the technology wider across the business. Deciding on the use of roles in the context Once you have decided on that first initial use case and a context to create let's look at the details you need to decide upon. For each context, identify; Administrative rolesBusiness owner, the person who makes decisions about who may or may not see content in this context. This is often the person who wanted to use IRM and drove the business purchase. They are the usually the person with the most at risk when sensitive information is lost. Point of contact, the person who will handle requests for access to content. Sometimes the same as the business owner, sometimes a trusted secretary or administrator. Context administrator, the person who will enact the decisions of the Business Owner. Sometimes the point of contact, sometimes a trusted IT person. Document related rolesContributors, the people who create and edit documents in this context. Reviewers, the people who are involved in reviewing documents but are not trusted to secure information to this classification. This role is not always necessary. (See later discussion on Published-work and Work-in-Progress) Readers, the people who read documents from this context. Some people may have several of the roles above, which is fine. What you are trying to do is understand and define how the business interacts with your sensitive information. These roles obviously map directly to roles available in Oracle IRM. Reviewing the features and security for context roles At this point we have decided on a classification of information, understand what roles people in the business will play when administrating this classification and how they will interact with content. The final piece of the puzzle in getting the information for our first context is to look at the permissions people will have to sealed documents. First think why are you protecting the documents in the first place? It is to prevent the loss of leaking of information to the wrong people. To control the information, making sure that people only access the latest versions of documents. You are not using Oracle IRM to prevent unauthorized people from doing legitimate work. This is an important point, with IRM you can erect many barriers to prevent access to content yet too many restrictions and authorized users will often find ways to circumvent using the technology and end up distributing unprotected originals. Because IRM is a security technology, it is easy to get carried away restricting different groups. However I would highly recommend starting with a simple solution with few restrictions. Ensure that everyone who reasonably needs to read documents can do so from the outset. Remember that with Oracle IRM you can change rights to content whenever you wish and tighten security. Always return to the fact that the greatest value IRM brings is that ONLY authorized users can access secured content, remember that simple "one context for the entire business" model. At the start of the deployment you really need to aim for user acceptance and therefore a simple model is more likely to succeed. As time passes and users understand how IRM works you can start to introduce more restrictions and complexity. Another key aspect to focus on is handling exceptions. If you decide on a context model where engineering can only access engineering information, and sales can only access sales data. Act quickly when a sales manager needs legitimate access to a set of engineering documents. Having a quick and effective process for permitting other people with legitimate needs to obtain appropriate access will be rewarded with acceptance from the user community. These use cases can often be satisfied by integrating IRM with a good Identity & Access Management technology which simplifies the process of assigning users the correct business roles. The big print issue... Printing is often an issue of contention, users love to print but the business wants to ensure sensitive information remains in the controlled digital world. There are many cases of physical document loss causing a business pain, it is often overlooked that IRM can help with this issue by limiting the ability to generate physical copies of digital content. However it can be hard to maintain a balance between security and usability when it comes to printing. Consider the following points when deciding about whether to give print rights. Oracle IRM sealed documents can contain watermarks that expose information about the user, time and location of access and the classification of the document. This information would reside in the printed copy making it easier to trace who printed it. Printed documents are slower to distribute in comparison to their digital counterparts, so time sensitive information in printed format may present a lower risk. Print activity is audited, therefore you can monitor and react to users abusing print rights. Summary In summary it is important to think carefully about the way you create your context model. As you ask the business these questions you may get a variety of different requirements. There may be special projects that require a context just for sensitive information created during the lifetime of the project. There may be a department that requires all information in the group is secured and you might have a few senior executives who wish to use IRM to exchange a small number of highly sensitive documents with a very small number of people. Oracle IRM, with its very flexible context classification system, can support all of these use cases. The trick is to introducing the complexity to deliver them at the right level. In another article i'm working on I will go through some examples of how Oracle IRM might map to existing business use cases. But for now, this article covers all the important questions you need to get your IRM service deployed and successfully protecting your most sensitive information.

Read the article

An Honest look at SharePoint Web Services

- by juanlarios

INTRODUCTION If you are a SharePoint developer you know that there are two basic ways to develop against SharePoint. 1) The object Model 2) Web services. SharePoint object model has the advantage of being quite rich. Anything you can do through the SharePoint UI as an administrator or end user, you can do through the object model. In fact everything that is done through the UI is done through the object model behind the scenes. The major disadvantage to getting at SharePoint this way is that the code needs to run on the server. This means that all web parts, event receivers, features, etc… all of this is code that is deployed to the server. The second way to get to SharePoint is through the built in web services. There are many articles on how to manipulate web services, how to authenticate to them and interact with them. The basic idea is that a remote application or process can contact SharePoint through a web service. Lots has been written about how great these web services are. This article is written to document the limitations, some of the issues and frustrations with working with SharePoint built in web services. Ultimately, for the tasks I was given to , SharePoint built in web services did not suffice. My evaluation of SharePoint built in services was compared against creating my own WCF Services to do what I needed. The current project I'm working on right now involved several "integration points". A remote application, installed on a separate server was to contact SharePoint and perform an task or operation. So I decided to start up Visual Studio and built a DLL and basically have 2 layers of logic. An integration layer and a data layer. A good friend of mine pointed me to SOLID principles and referred me to some videos and tutorials about it. I decided to implement the methodology (although a lot of the principles are common sense and I already incorporated in my coding practices). I was to deliver this dll to the application team and they would simply call the methods exposed by this dll and voila! it would do some task or operation in SharePoint. SOLUTION My integration layer implemented an interface that defined some of the basic integration tasks that I was to put together. My data layer was about the same, it implemented an interface with some of the tasks that I was going to develop. This gave me the opportunity to develop different data layers, ultimately different ways to get at SharePoint if I needed to. This is a classic SOLID principle. In this case it proved to be quite helpful because I wrote one data layer completely implementing SharePoint built in Web Services and another implementing my own WCF Service that I wrote. I should mention there is another layer underneath the data layer. In referencing SharePoint or WCF services in my visual studio project I created a class for every web service call. So for example, if I used List.asx. I created a class called "DocumentRetreival" this class would do the grunt work to connect to the correct URL, It would perform the basic operation of contacting the service and so on. If I used a view.asmx, I implemented a class called "ViewRetrieval" with the same idea as the last class but it would now interact with all he operations in view.asmx. This gave my data layer the ability to perform multiple calls without really worrying about some of the grunt work each class performs. This again, is a classic SOLID principle. So, in order to compare them side by side we can look at both data layers and with is involved in each. Lets take a look at the "Create Project" task or operation. The integration point is described as , "dll is to provide a way to create a project in SharePoint". Projects , in this case are basically document libraries. I am to implement a way in which a remote application can create a document library in SharePoint. Easy enough right? Use the list.asmx Web service in SharePoint. So here we go! Lets take a look at the code. I added the List.asmx web service reference to my project and this is the class that contacts it: class DocumentRetrieval { private ListsSoapClient _service; d private bool _impersonation; public DocumentRetrieval(bool impersonation, string endpt) { _service = new ListsSoapClient(); this.SetEndPoint(string.Format("{0}/{1}", endpt, ConfigurationManager.AppSettings["List"])); _impersonation = impersonation; if (_impersonation) { _service.ClientCredentials.Windows.ClientCredential.Password = ConfigurationManager.AppSettings["password"]; _service.ClientCredentials.Windows.ClientCredential.UserName = ConfigurationManager.AppSettings["username"]; _service.ClientCredentials.Windows.AllowedImpersonationLevel = System.Security.Principal.TokenImpersonationLevel.Impersonation; } private void SetEndPoint(string p) { _service.Endpoint.Address = new EndpointAddress(p); } /// <summary> /// Creates a document library with specific name and templateID /// </summary> /// <param name="listName">New list name</param> /// <param name="templateID">Template ID</param> /// <returns></returns> public XmlElement CreateLibrary(string listName, int templateID, ref ExceptionContract exContract) { XmlDocument sample = new XmlDocument(); XmlElement viewCol = sample.CreateElement("Empty"); try { _service.Open(); viewCol = _service.AddList(listName, "", templateID); } catch (Exception ex) { exContract = new ExceptionContract("DocumentRetrieval/CreateLibrary", ex.GetType(), "Connection Error", ex.StackTrace, ExceptionContract.ExceptionCode.error); }finally { _service.Close(); } return viewCol; } } There was a lot more in this class (that I am not including) because i was reusing the grunt work and making other operations with LIst.asmx, For example, updating content types, changing or configuring lists or document libraries. One of the first things I noticed about working with the built in services is that you are really at the mercy of what is available to you. Before creating a document library (Project) I wanted to expose a IsProjectExisting method. This way the integration or data layer could recognize if a library already exists. Well there is no service call or method available to do that check. So this is what I wrote: public bool DocLibExists(string listName, ref ExceptionContract exContract) { try { var allLists = _service.GetListCollection(); return allLists.ChildNodes.OfType<XmlElement>().ToList().Exists(x => x.Attributes["Title"].Value ==listName); } catch (Exception ex) { exContract = new ExceptionContract("DocumentRetrieval/GetList/GetListWSCall", ex.GetType(), "Unable to Retrieve List Collection", ex.StackTrace, ExceptionContract.ExceptionCode.error); } return false; } This really just gets an XMLElement with all the lists. It was then up to me to sift through the clutter and noise and see if Document library already existed. This took a little bit of getting used to. Now instead of working with code, you are working with XMLElement response format from web service. I wrote a LINQ query to go through and find if the attribute "Title" existed and had a value of the listname then it would return True, if not False. I didn't particularly like working this way. Dealing with XMLElement responses and then having to manipulate it to get at the exact data I was looking for. Once the check for the DocLibExists, was done, I would either create the document library or send back an error indicating the document library already existed. Now lets examine the code that actually creates the document library. It does what you are really after, it creates a document library. Notice how the template ID is really an integer. Every document library template in SharePoint has an ID associated with it. Document libraries, Image Library, Custom List, Project Tasks, etc… they all he a unique integer associated with it. Well, that's great but the client came back to me and gave me some specifics that each "project" or document library, should have. They specified they had 3 types of projects. Each project would have unique views, about 10 views for each project. Each Project specified unique configurations (auditing, versioning, content types, etc…) So what turned out to be a simple implementation of creating a document library as a repository for a project, turned out to be quite involved. The first thing I thought of was to create a template for document library. There are other ways you can do this too. Using the web Service call, you could configure views, versioning, even content types, etc… the only catch is, you have to be working quite extensively with CAML. I am not fond of CAML. I can do it and work with it, I just don't like doing it. It is quite touchy and at times it is quite tough to understand where errors were made with CAML statements. Working with Web Services and CAML proved to be quite annoying. The service call would return a generic error message that did not particularly point me to a CAML statement syntax error, or even a CAML error. I was not sure if it was a security , performance or code based issue. It was quite tough to work with. At times it was difficult to work with because of the way SharePoint handles metadata. There are "Names", "Display Name", and "StaticName" fields. It was quite tough to understand at times, which one to use. So it took a lot of trial and error. There are tools that can help with CAML generation. There is also now intellisense for CAML statements in Visual Studio that might help but ultimately I'm not fond of CAML with Web Services. So I decided on the template. So my plan was to create create a document library, configure it accordingly and then use The Template Builder that comes with the SharePoint SDK. This tool allows you to create site templates, list template etc… It is quite interesting because it does not generate an STP file, it actually generates an xml definition and a feature you can activate and make that template available on a site or site collection. The first issue I experienced with this is that one of the specifications to this template was that the "All Documents" view was to have 2 web parts on it. Well, it turns out that using the template builder , it did not include the web parts as part of the list template definition it generated. It backed up the settings, the views, the content types but not the custom web parts. I still decided to try this even without the web parts on the page. This new template defined a new Document library definition with a unique ID. The problem was that the service call accepts an int but it only has access to the built in library int definitions. Any new ones added or created will not be available to create. So this made it impossible for me to approach the problem this way. I should also mention that one of the nice features about SharePoint is the ability to create list templates, back them up and then create lists based on that template. It can all be done by end user administrators. These templates are quite unique because they are saved as an STP file and not an xml definition. I also went this route and tried to see if there was another service call where I could create a document library based no given template name. Nope! none. After some thinking I decide to implement a WCF service to do this creation for me. I was quite certain that the object model would allow me to create document libraries base on a template in which an ID was required and also templates saved as STP files. Now I don't want to bother with posting the code to contact WCF service because it's self explanatory, but I will post the code that I used to create a list with custom template. public ServiceResult CreateProject(string name, string templateName, string projectId) { string siteurl = SPContext.Current.Site.Url; Guid webguid = SPContext.Current.Web.ID; using (SPSite site = new SPSite(siteurl)) { using (SPWeb rootweb = site.RootWeb) { SPListTemplateCollection temps = site.GetCustomListTemplates(rootweb); ProcessWeb(siteurl, webguid, web => Act_CreateProject(web, name, templateName, projectId, temps)); }//SpWeb }//SPSite return _globalResult; } private void Act_CreateProject(SPWeb targetsite, string name, string templateName, string projectId, SPListTemplateCollection temps) { var temp = temps.Cast<SPListTemplate>().FirstOrDefault(x => x.Name.Equals(templateName)); if (temp != null) { try { Guid listGuid = targetsite.Lists.Add(name, "", temp); SPList newList = targetsite.Lists[listGuid]; _globalResult = new ServiceResult(true, "Success", "Success"); } catch (Exception ex) { _globalResult = new ServiceResult(false, (string.IsNullOrEmpty(ex.Message) ? "None" : ex.Message + " " + templateName), ex.StackTrace.ToString()); } } private void ProcessWeb(string siteurl, Guid webguid, Action<SPWeb> action) { using (SPSite sitecollection = new SPSite(siteurl)) { using (SPWeb web = sitecollection.AllWebs[webguid]) { action(web); } } } This code is actually some of the code I implemented for the service. there was a lot more I did on Project Creation which I will cover in my next blog post. I implemented an ACTION method to process the web. This allowed me to properly dispose the SPWEb and SPSite objects and not rewrite this code over and over again. So I implemented a WCF service to create projects for me, this allowed me to do a lot more than just create a document library with a template, it now gave me the flexibility to do just about anything the client wanted at project creation. Once this was implemented , the client came back to me and said, "we reference all our projects with ID's in our application. we want SharePoint to do the same". This has been something I have been doing for a little while now but I do hope that SharePoint 2010 can have more of an answer to this and address it properly. I have been adding metadata to SPWebs through property bag. I believe I have blogged about it before. This time it required metadata added to a document library. No problem!!! I also mentioned these web parts that were to go on the "All Documents" View. I took the opportunity to configure them to the appropriate settings. There were two settings that needed to be set on these web parts. One of them was a Project ID configured in the webpart properties. The following code enhances and replaces the "Act_CreateProject " method above: private void Act_CreateProject(SPWeb targetsite, string name, string templateName, string projectId, SPListTemplateCollection temps) { var temp = temps.Cast<SPListTemplate>().FirstOrDefault(x => x.Name.Equals(templateName)); if (temp != null) { SPLimitedWebPartManager wpmgr = null; try { Guid listGuid = targetsite.Lists.Add(name, "", temp); SPList newList = targetsite.Lists[listGuid]; SPFolder rootFolder = newList.RootFolder; rootFolder.Properties.Add(KEY, projectId); rootFolder.Update(); if (rootFolder.ParentWeb != targetsite) rootFolder.ParentWeb.Dispose(); if (!templateName.Contains("Natural")) { SPView alldocumentsview = newList.Views.Cast<SPView>().FirstOrDefault(x => x.Title.Equals(ALLDOCUMENTS)); SPFile alldocfile = targetsite.GetFile(alldocumentsview.ServerRelativeUrl); wpmgr = alldocfile.GetLimitedWebPartManager(PersonalizationScope.Shared); ConfigureWebPart(wpmgr, projectId, CUSTOMWPNAME); alldocfile.Update(); } if (newList.ParentWeb != targetsite) newList.ParentWeb.Dispose(); _globalResult = new ServiceResult(true, "Success", "Success"); } catch (Exception ex) { _globalResult = new ServiceResult(false, (string.IsNullOrEmpty(ex.Message) ? "None" : ex.Message + " " + templateName), ex.StackTrace.ToString()); } finally { if (wpmgr != null) { wpmgr.Web.Dispose(); wpmgr.Dispose(); } } } } private void ConfigureWebPart(SPLimitedWebPartManager mgr, string prjId, string webpartname) { var wp = mgr.WebParts.Cast<System.Web.UI.WebControls.WebParts.WebPart>().FirstOrDefault(x => x.DisplayTitle.Equals(webpartname)); if (wp != null) { (wp as ListRelationshipWebPart.ListRelationshipWebPart).ProjectID = prjId; mgr.SaveChanges(wp); } } This Shows you how I was able to set metadata on the document library. It has to be added to the RootFolder of the document library, Unfortunately, the SPList does not have a Property bag that I can add a key\value pair to. It has to be done on the root folder. Now everything in the integration will reference projects by ID's and will not care about names. My, "DocLibExists" will now need to be changed because a web service is not set up to look at property bags. I had to write another method on the Service to do the equivalent but with ID's instead of names. The second thing you will notice about the code is the use of the Webpartmanager. I have seen several examples online, and also read a lot about memory leaks, The above code does not produce memory leaks. The web part manager creates an SPWeb, so just dispose it like I did. CONCLUSION This is a long long post so I will stop here for now, I will continue with more comparisons and limitations in my next post. My conclusion for this example is that Web Services will do the trick if you can suffer through CAML and if you are doing some simple operations. For Everything else, there's WCF! **** fireI apologize for the disorganization of this post, I was on a bus on a 12 hour trip to IOWA while I wrote it, I was half asleep and half awake, hopefully it makes enough sense to someone.

Read the article

Grandparent – Parent – Child Reports in SQL Developer

- by thatjeffsmith

You’ll never see one of these family stickers on my car, but I promise not to judge…much. Parent – Child reports are pretty straightforward in Oracle SQL Developer. You have a ‘parent’ report, and then one or more ‘child’ reports which are based off of a value in a selected row or value from the parent. If you need a quick tutorial to get up to speed on the subject, go ahead and take 5 minutes Shortly before I left for vacation 2 weeks agao, I got an interesting question from one of my Twitter Followers: @thatjeffsmith any luck with the #Oracle awr reports in #SQLDeveloper?This is easy with multi generation parent>child Done in #dbvisualizer — Ronald Rood (@Ik_zelf) August 26, 2012 Now that I’m back from vacation, I can tell Ronald and everyone else that the answer is ‘Yes!’ And here’s how Time to Get Out Your XML Editor Don’t have one? That’s OK, SQL Developer can edit XML files. While the Reporting interface doesn’t surface the ability to create multi-generational reports, the underlying code definitely supports it. We just need to hack away at the XML that powers a report. For this example I’m going to start simple. A query that brings back DEPARTMENTs, then EMPLOYEES, then JOBs. We can build the first two parts of the report using the report editor. A Parent-Child report in Oracle SQL Developer (Departments – Employees) Save the Report to XML Once you’ve generated the XML file, open it with your favorite XML editor. For this example I’ll be using the build-it XML editor in SQL Developer. SQL Developer Reports in their raw XML glory! Right after the PDF element in the XML document, we can start a new ‘child’ report by inserting a DISPLAY element. I just copied and pasted the existing ‘display’ down so I wouldn’t have to worry about screwing anything up. Note I also needed to change the ‘master’ name so it wouldn’t confuse SQL Developer when I try to import/open a report that has the same name. Also I needed to update the binds tags to reflect the names from the child versus the original parent report. This is pretty easy to figure out on your own actually – I mean I’m no real developer and I got it pretty quick. <?xml version="1.0" encoding="UTF-8" ?> <displays> <display id="92857fce-0139-1000-8006-7f0000015340" type="" style="Table" enable="true"> <name><![CDATA[Grandparent]]></name> <description><![CDATA[]]></description> <tooltip><![CDATA[]]></tooltip> <drillclass><![CDATA[null]]></drillclass> <CustomValues> <TYPE>horizontal</TYPE> </CustomValues> <query> <sql><![CDATA[select * from hr.departments]]></sql> </query> <pdf version="VERSION_1_7" compression="CONTENT"> <docproperty title="" author="" subject="" keywords="" /> <cell toppadding="2" bottompadding="2" leftpadding="2" rightpadding="2" horizontalalign="LEFT" verticalalign="TOP" wrap="true" /> <column> <heading font="Courier" size="10" style="NORMAL" color="-16777216" rowshading="-1" labeling="FIRST_PAGE" /> <footing font="Courier" size="10" style="NORMAL" color="-16777216" rowshading="-1" labeling="NONE" /> <blob blob="NONE" zip="false" /> </column> <table font="Courier" size="10" style="NORMAL" color="-16777216" userowshading="false" oddrowshading="-1" evenrowshading="-1" showborders="true" spacingbefore="12" spacingafter="12" horizontalalign="LEFT" /> <header enable="false" generatedate="false"> <data> null </data> </header> <footer enable="false" generatedate="false"> <data value="null" /> </footer> <security enable="false" useopenpassword="false" openpassword="" encryption="EXCLUDE_METADATA"> <permission enable="false" permissionpassword="" allowcopying="true" allowprinting="true" allowupdating="false" allowaccessdevices="true" /> </security> <pagesetup papersize="LETTER" orientation="1" measurement="in" margintop="1.0" marginbottom="1.0" marginleft="1.0" marginright="1.0" /> </pdf> <display id="null" type="" style="Table" enable="true"> <name><![CDATA[Parent]]></name> <description><![CDATA[]]></description> <tooltip><![CDATA[]]></tooltip> <drillclass><![CDATA[null]]></drillclass> <CustomValues> <TYPE>horizontal</TYPE> </CustomValues> <query> <sql><![CDATA[select * from hr.employees where department_id = EPARTMENT_ID]]></sql> <binds> <bind id="DEPARTMENT_ID"> <prompt><![CDATA[DEPARTMENT_ID]]></prompt> <tooltip><![CDATA[DEPARTMENT_ID]]></tooltip> <value><![CDATA[NULL_VALUE]]></value> </bind> </binds> </query> <pdf version="VERSION_1_7" compression="CONTENT"> <docproperty title="" author="" subject="" keywords="" /> <cell toppadding="2" bottompadding="2" leftpadding="2" rightpadding="2" horizontalalign="LEFT" verticalalign="TOP" wrap="true" /> <column> <heading font="Courier" size="10" style="NORMAL" color="-16777216" rowshading="-1" labeling="FIRST_PAGE" /> <footing font="Courier" size="10" style="NORMAL" color="-16777216" rowshading="-1" labeling="NONE" /> <blob blob="NONE" zip="false" /> </column> <table font="Courier" size="10" style="NORMAL" color="-16777216" userowshading="false" oddrowshading="-1" evenrowshading="-1" showborders="true" spacingbefore="12" spacingafter="12" horizontalalign="LEFT" /> <header enable="false" generatedate="false"> <data> null </data> </header> <footer enable="false" generatedate="false"> <data value="null" /> </footer> <security enable="false" useopenpassword="false" openpassword="" encryption="EXCLUDE_METADATA"> <permission enable="false" permissionpassword="" allowcopying="true" allowprinting="true" allowupdating="false" allowaccessdevices="true" /> </security> <pagesetup papersize="LETTER" orientation="1" measurement="in" margintop="1.0" marginbottom="1.0" marginleft="1.0" marginright="1.0" /> </pdf> <display id="null" type="" style="Table" enable="true"> <name><![CDATA[Child]]></name> <description><![CDATA[]]></description> <tooltip><![CDATA[]]></tooltip> <drillclass><![CDATA[null]]></drillclass> <CustomValues> <TYPE>horizontal</TYPE> </CustomValues> <query> <sql><![CDATA[select * from hr.jobs where job_id = :JOB_ID]]></sql> <binds> <bind id="JOB_ID"> <prompt><![CDATA[JOB_ID]]></prompt> <tooltip><![CDATA[JOB_ID]]></tooltip> <value><![CDATA[NULL_VALUE]]></value> </bind> </binds> </query> <pdf version="VERSION_1_7" compression="CONTENT"> <docproperty title="" author="" subject="" keywords="" /> <cell toppadding="2" bottompadding="2" leftpadding="2" rightpadding="2" horizontalalign="LEFT" verticalalign="TOP" wrap="true" /> <column> <heading font="Courier" size="10" style="NORMAL" color="-16777216" rowshading="-1" labeling="FIRST_PAGE" /> <footing font="Courier" size="10" style="NORMAL" color="-16777216" rowshading="-1" labeling="NONE" /> <blob blob="NONE" zip="false" /> </column> <table font="Courier" size="10" style="NORMAL" color="-16777216" userowshading="false" oddrowshading="-1" evenrowshading="-1" showborders="true" spacingbefore="12" spacingafter="12" horizontalalign="LEFT" /> <header enable="false" generatedate="false"> <data> null </data> </header> <footer enable="false" generatedate="false"> <data value="null" /> </footer> <security enable="false" useopenpassword="false" openpassword="" encryption="EXCLUDE_METADATA"> <permission enable="false" permissionpassword="" allowcopying="true" allowprinting="true" allowupdating="false" allowaccessdevices="true" /> </security> <pagesetup papersize="LETTER" orientation="1" measurement="in" margintop="1.0" marginbottom="1.0" marginleft="1.0" marginright="1.0" /> </pdf> </display> </display> </display> </displays> Save the file and ‘Open Report…’ You’ll see your new report name in the tree. You just need to double-click it to open it. Here’s what it looks like running A 3 generation family Now Let’s Build an AWR Text Report Ronald wanted to have the ability to query AWR snapshots and generate the AWR reports. That requires a few inputs, including a START and STOP snapshot ID. That basically tells AWR what time period to use for generating the report. And here’s where it gets tricky. We’ll need to use aliases for the SNAP_ID column. Since we’re using the same column name from 2 different queries, we need to use different bind variables. Fortunately for us, SQL Developer’s clever enough to use the column alias as the BIND. Here’s what I mean: Grandparent Query SELECT snap_id start1, begin_interval_time, end_interval_time FROM dba_hist_snapshot ORDER BY 1 asc Parent Query SELECT snap_id stop1, begin_interval_time, end_interval_time, :START1 carry FROM dba_hist_snapshot WHERE snap_id > :START1 ORDER BY 1 asc And here’s where it gets even trickier – you can’t reference a bind from outside the parent query. My grandchild report can’t reference a value from the grandparent report. So I just carry the selected value down to the parent. In my parent query SELECT you see the ‘:START1′ at the end? That’s making that value available to me when I use it in my grandchild query. To complicate things a bit further, I can’t have a column name with a ‘:’ in it, or SQL Developer will get confused when I try to reference the value of the variable with the ‘:’ – and ‘::Name’ doesn’t work. But that’s OK, just alias it. Grandchild Query Select Output From Table(Dbms_Workload_Repository.Awr_Report_Text(1298953802, 1,:CARRY, :STOP1)); Ok, and the last trick – I hard-coded my report to use my database’s DB_ID and INST_ID into the AWR package call. Now a smart person could figure out a way to make that work on any database, but I got lazy and and ran out of time. But this should be far enough for you to take it from here. Here’s what my report looks like now: Caution: don’t run this if you haven’t licensed Enterprise Edition with Diagnostic Pack. The Raw XML for this AWR Report <?xml version="1.0" encoding="UTF-8" ?> <displays> <display id="927ba96c-0139-1000-8001-7f0000015340" type="" style="Table" enable="true"> <name><![CDATA[AWR Start Stop Report Final]]></name> <description><![CDATA[]]></description> <tooltip><![CDATA[]]></tooltip> <drillclass><![CDATA[null]]></drillclass> <CustomValues> <TYPE>horizontal</TYPE> </CustomValues> <query> <sql><![CDATA[SELECT snap_id start1, begin_interval_time, end_interval_time FROM dba_hist_snapshot ORDER BY 1 asc]]></sql> </query> <display id="null" type="" style="Table" enable="true"> <name><![CDATA[Stop SNAP_ID]]></name> <description><![CDATA[]]></description> <tooltip><![CDATA[]]></tooltip> <drillclass><![CDATA[null]]></drillclass> <CustomValues> <TYPE>horizontal</TYPE> </CustomValues> <query> <sql><![CDATA[SELECT snap_id stop1, begin_interval_time, end_interval_time, :START1 carry FROM dba_hist_snapshot WHERE snap_id > :START1 ORDER BY 1 asc]]></sql> </query> <display id="null" type="" style="Table" enable="true"> <name><![CDATA[AWR Report]]></name> <description><![CDATA[]]></description> <tooltip><![CDATA[]]></tooltip> <drillclass><![CDATA[null]]></drillclass> <CustomValues> <TYPE>horizontal</TYPE> </CustomValues> <query> <sql><![CDATA[Select Output From Table(Dbms_Workload_Repository.Awr_Report_Text(1298953802, 1,:CARRY, :STOP1 ))]]></sql> </query> </display> </display> </display> </displays> Should We Build Support for Multiple Levels of Reports into the User Interface? Let us know! A comment here or a suggestion on our SQL Developer Exchange might help your case!

Read the article

Advanced TSQL Tuning: Why Internals Knowledge Matters

- by Paul White

There is much more to query tuning than reducing logical reads and adding covering nonclustered indexes. Query tuning is not complete as soon as the query returns results quickly in the development or test environments. In production, your query will compete for memory, CPU, locks, I/O and other resources on the server. Today’s entry looks at some tuning considerations that are often overlooked, and shows how deep internals knowledge can help you write better TSQL. As always, we’ll need some example data. In fact, we are going to use three tables today, each of which is structured like this: Each table has 50,000 rows made up of an INTEGER id column and a padding column containing 3,999 characters in every row. The only difference between the three tables is in the type of the padding column: the first table uses CHAR(3999), the second uses VARCHAR(MAX), and the third uses the deprecated TEXT type. A script to create a database with the three tables and load the sample data follows: USE master; GO IF DB_ID('SortTest') IS NOT NULL DROP DATABASE SortTest; GO CREATE DATABASE SortTest COLLATE LATIN1_GENERAL_BIN; GO ALTER DATABASE SortTest MODIFY FILE ( NAME = 'SortTest', SIZE = 3GB, MAXSIZE = 3GB ); GO ALTER DATABASE SortTest MODIFY FILE ( NAME = 'SortTest_log', SIZE = 256MB, MAXSIZE = 1GB, FILEGROWTH = 128MB ); GO ALTER DATABASE SortTest SET ALLOW_SNAPSHOT_ISOLATION OFF ; ALTER DATABASE SortTest SET AUTO_CLOSE OFF ; ALTER DATABASE SortTest SET AUTO_CREATE_STATISTICS ON ; ALTER DATABASE SortTest SET AUTO_SHRINK OFF ; ALTER DATABASE SortTest SET AUTO_UPDATE_STATISTICS ON ; ALTER DATABASE SortTest SET AUTO_UPDATE_STATISTICS_ASYNC ON ; ALTER DATABASE SortTest SET PARAMETERIZATION SIMPLE ; ALTER DATABASE SortTest SET READ_COMMITTED_SNAPSHOT OFF ; ALTER DATABASE SortTest SET MULTI_USER ; ALTER DATABASE SortTest SET RECOVERY SIMPLE ; USE SortTest; GO CREATE TABLE dbo.TestCHAR ( id INTEGER IDENTITY (1,1) NOT NULL, padding CHAR(3999) NOT NULL, CONSTRAINT [PK dbo.TestCHAR (id)] PRIMARY KEY CLUSTERED (id), ) ; CREATE TABLE dbo.TestMAX ( id INTEGER IDENTITY (1,1) NOT NULL, padding VARCHAR(MAX) NOT NULL, CONSTRAINT [PK dbo.TestMAX (id)] PRIMARY KEY CLUSTERED (id), ) ; CREATE TABLE dbo.TestTEXT ( id INTEGER IDENTITY (1,1) NOT NULL, padding TEXT NOT NULL, CONSTRAINT [PK dbo.TestTEXT (id)] PRIMARY KEY CLUSTERED (id), ) ; -- ============= -- Load TestCHAR (about 3s) -- ============= INSERT INTO dbo.TestCHAR WITH (TABLOCKX) ( padding ) SELECT padding = REPLICATE(CHAR(65 + (Data.n % 26)), 3999) FROM ( SELECT TOP (50000) n = ROW_NUMBER() OVER (ORDER BY (SELECT 0)) - 1 FROM master.sys.columns C1, master.sys.columns C2, master.sys.columns C3 ORDER BY n ASC ) AS Data ORDER BY Data.n ASC ; -- ============ -- Load TestMAX (about 3s) -- ============ INSERT INTO dbo.TestMAX WITH (TABLOCKX) ( padding ) SELECT CONVERT(VARCHAR(MAX), padding) FROM dbo.TestCHAR ORDER BY id ; -- ============= -- Load TestTEXT (about 5s) -- ============= INSERT INTO dbo.TestTEXT WITH (TABLOCKX) ( padding ) SELECT CONVERT(TEXT, padding) FROM dbo.TestCHAR ORDER BY id ; -- ========== -- Space used -- ========== -- EXECUTE sys.sp_spaceused @objname = 'dbo.TestCHAR'; EXECUTE sys.sp_spaceused @objname = 'dbo.TestMAX'; EXECUTE sys.sp_spaceused @objname = 'dbo.TestTEXT'; ; CHECKPOINT ; That takes around 15 seconds to run, and shows the space allocated to each table in its output: To illustrate the points I want to make today, the example task we are going to set ourselves is to return a random set of 150 rows from each table. The basic shape of the test query is the same for each of the three test tables: SELECT TOP (150) T.id, T.padding FROM dbo.Test AS T ORDER BY NEWID() OPTION (MAXDOP 1) ; Test 1 – CHAR(3999) Running the template query shown above using the TestCHAR table as the target, we find that the query takes around 5 seconds to return its results. This seems slow, considering that the table only has 50,000 rows. Working on the assumption that generating a GUID for each row is a CPU-intensive operation, we might try enabling parallelism to see if that speeds up the response time. Running the query again (but without the MAXDOP 1 hint) on a machine with eight logical processors, the query now takes 10 seconds to execute – twice as long as when run serially. Rather than attempting further guesses at the cause of the slowness, let’s go back to serial execution and add some monitoring. The script below monitors STATISTICS IO output and the amount of tempdb used by the test query. We will also run a Profiler trace to capture any warnings generated during query execution. DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TC.id, TC.padding FROM dbo.TestCHAR AS TC ORDER BY NEWID() OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; Let’s take a closer look at the statistics and query plan generated from this: Following the flow of the data from right to left, we see the expected 50,000 rows emerging from the Clustered Index Scan, with a total estimated size of around 191MB. The Compute Scalar adds a column containing a random GUID (generated from the NEWID() function call) for each row. With this extra column in place, the size of the data arriving at the Sort operator is estimated to be 192MB. Sort is a blocking operator – it has to examine all of the rows on its input before it can produce its first row of output (the last row received might sort first). This characteristic means that Sort requires a memory grant – memory allocated for the query’s use by SQL Server just before execution starts. In this case, the Sort is the only memory-consuming operator in the plan, so it has access to the full 243MB (248,696KB) of memory reserved by SQL Server for this query execution. Notice that the memory grant is significantly larger than the expected size of the data to be sorted. SQL Server uses a number of techniques to speed up sorting, some of which sacrifice size for comparison speed. Sorts typically require a very large number of comparisons, so this is usually a very effective optimization. One of the drawbacks is that it is not possible to exactly predict the sort space needed, as it depends on the data itself. SQL Server takes an educated guess based on data types, sizes, and the number of rows expected, but the algorithm is not perfect. In spite of the large memory grant, the Profiler trace shows a Sort Warning event (indicating that the sort ran out of memory), and the tempdb usage monitor shows that 195MB of tempdb space was used – all of that for system use. The 195MB represents physical write activity on tempdb, because SQL Server strictly enforces memory grants – a query cannot ‘cheat’ and effectively gain extra memory by spilling to tempdb pages that reside in memory. Anyway, the key point here is that it takes a while to write 195MB to disk, and this is the main reason that the query takes 5 seconds overall. If you are wondering why using parallelism made the problem worse, consider that eight threads of execution result in eight concurrent partial sorts, each receiving one eighth of the memory grant. The eight sorts all spilled to tempdb, resulting in inefficiencies as the spilled sorts competed for disk resources. More importantly, there are specific problems at the point where the eight partial results are combined, but I’ll cover that in a future post. CHAR(3999) Performance Summary: 5 seconds elapsed time 243MB memory grant 195MB tempdb usage 192MB estimated sort set 25,043 logical reads Sort Warning Test 2 – VARCHAR(MAX) We’ll now run exactly the same test (with the additional monitoring) on the table using a VARCHAR(MAX) padding column: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TM.id, TM.padding FROM dbo.TestMAX AS TM ORDER BY NEWID() OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; This time the query takes around 8 seconds to complete (3 seconds longer than Test 1). Notice that the estimated row and data sizes are very slightly larger, and the overall memory grant has also increased very slightly to 245MB. The most marked difference is in the amount of tempdb space used – this query wrote almost 391MB of sort run data to the physical tempdb file. Don’t draw any general conclusions about VARCHAR(MAX) versus CHAR from this – I chose the length of the data specifically to expose this edge case. In most cases, VARCHAR(MAX) performs very similarly to CHAR – I just wanted to make test 2 a bit more exciting. MAX Performance Summary: 8 seconds elapsed time 245MB memory grant 391MB tempdb usage 193MB estimated sort set 25,043 logical reads Sort warning Test 3 – TEXT The same test again, but using the deprecated TEXT data type for the padding column: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TT.id, TT.padding FROM dbo.TestTEXT AS TT ORDER BY NEWID() OPTION (MAXDOP 1, RECOMPILE) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; This time the query runs in 500ms. If you look at the metrics we have been checking so far, it’s not hard to understand why: TEXT Performance Summary: 0.5 seconds elapsed time 9MB memory grant 5MB tempdb usage 5MB estimated sort set 207 logical reads 596 LOB logical reads Sort warning SQL Server’s memory grant algorithm still underestimates the memory needed to perform the sorting operation, but the size of the data to sort is so much smaller (5MB versus 193MB previously) that the spilled sort doesn’t matter very much. Why is the data size so much smaller? The query still produces the correct results – including the large amount of data held in the padding column – so what magic is being performed here? TEXT versus MAX Storage The answer lies in how columns of the TEXT data type are stored. By default, TEXT data is stored off-row in separate LOB pages – which explains why this is the first query we have seen that records LOB logical reads in its STATISTICS IO output. You may recall from my last post that LOB data leaves an in-row pointer to the separate storage structure holding the LOB data. SQL Server can see that the full LOB value is not required by the query plan until results are returned, so instead of passing the full LOB value down the plan from the Clustered Index Scan, it passes the small in-row structure instead. SQL Server estimates that each row coming from the scan will be 79 bytes long – 11 bytes for row overhead, 4 bytes for the integer id column, and 64 bytes for the LOB pointer (in fact the pointer is rather smaller – usually 16 bytes – but the details of that don’t really matter right now). OK, so this query is much more efficient because it is sorting a very much smaller data set – SQL Server delays retrieving the LOB data itself until after the Sort starts producing its 150 rows. The question that normally arises at this point is: Why doesn’t SQL Server use the same trick when the padding column is defined as VARCHAR(MAX)? The answer is connected with the fact that if the actual size of the VARCHAR(MAX) data is 8000 bytes or less, it is usually stored in-row in exactly the same way as for a VARCHAR(8000) column – MAX data only moves off-row into LOB storage when it exceeds 8000 bytes. The default behaviour of the TEXT type is to be stored off-row by default, unless the ‘text in row’ table option is set suitably and there is room on the page. There is an analogous (but opposite) setting to control the storage of MAX data – the ‘large value types out of row’ table option. By enabling this option for a table, MAX data will be stored off-row (in a LOB structure) instead of in-row. SQL Server Books Online has good coverage of both options in the topic In Row Data. The MAXOOR Table The essential difference, then, is that MAX defaults to in-row storage, and TEXT defaults to off-row (LOB) storage. You might be thinking that we could get the same benefits seen for the TEXT data type by storing the VARCHAR(MAX) values off row – so let’s look at that option now. This script creates a fourth table, with the VARCHAR(MAX) data stored off-row in LOB pages: CREATE TABLE dbo.TestMAXOOR ( id INTEGER IDENTITY (1,1) NOT NULL, padding VARCHAR(MAX) NOT NULL, CONSTRAINT [PK dbo.TestMAXOOR (id)] PRIMARY KEY CLUSTERED (id), ) ; EXECUTE sys.sp_tableoption @TableNamePattern = N'dbo.TestMAXOOR', @OptionName = 'large value types out of row', @OptionValue = 'true' ; SELECT large_value_types_out_of_row FROM sys.tables WHERE [schema_id] = SCHEMA_ID(N'dbo') AND name = N'TestMAXOOR' ; INSERT INTO dbo.TestMAXOOR WITH (TABLOCKX) ( padding ) SELECT SPACE(0) FROM dbo.TestCHAR ORDER BY id ; UPDATE TM WITH (TABLOCK) SET padding.WRITE (TC.padding, NULL, NULL) FROM dbo.TestMAXOOR AS TM JOIN dbo.TestCHAR AS TC ON TC.id = TM.id ; EXECUTE sys.sp_spaceused @objname = 'dbo.TestMAXOOR' ; CHECKPOINT ; Test 4 – MAXOOR We can now re-run our test on the MAXOOR (MAX out of row) table: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) MO.id, MO.padding FROM dbo.TestMAXOOR AS MO ORDER BY NEWID() OPTION (MAXDOP 1, RECOMPILE) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; TEXT Performance Summary: 0.3 seconds elapsed time 245MB memory grant 0MB tempdb usage 193MB estimated sort set 207 logical reads 446 LOB logical reads No sort warning The query runs very quickly – slightly faster than Test 3, and without spilling the sort to tempdb (there is no sort warning in the trace, and the monitoring query shows zero tempdb usage by this query). SQL Server is passing the in-row pointer structure down the plan and only looking up the LOB value on the output side of the sort. The Hidden Problem There is still a huge problem with this query though – it requires a 245MB memory grant. No wonder the sort doesn’t spill to tempdb now – 245MB is about 20 times more memory than this query actually requires to sort 50,000 records containing LOB data pointers. Notice that the estimated row and data sizes in the plan are the same as in test 2 (where the MAX data was stored in-row). The optimizer assumes that MAX data is stored in-row, regardless of the sp_tableoption setting ‘large value types out of row’. Why? Because this option is dynamic – changing it does not immediately force all MAX data in the table in-row or off-row, only when data is added or actually changed. SQL Server does not keep statistics to show how much MAX or TEXT data is currently in-row, and how much is stored in LOB pages. This is an annoying limitation, and one which I hope will be addressed in a future version of the product. So why should we worry about this? Excessive memory grants reduce concurrency and may result in queries waiting on the RESOURCE_SEMAPHORE wait type while they wait for memory they do not need. 245MB is an awful lot of memory, especially on 32-bit versions where memory grants cannot use AWE-mapped memory. Even on a 64-bit server with plenty of memory, do you really want a single query to consume 0.25GB of memory unnecessarily? That’s 32,000 8KB pages that might be put to much better use. The Solution The answer is not to use the TEXT data type for the padding column. That solution happens to have better performance characteristics for this specific query, but it still results in a spilled sort, and it is hard to recommend the use of a data type which is scheduled for removal. I hope it is clear to you that the fundamental problem here is that SQL Server sorts the whole set arriving at a Sort operator. Clearly, it is not efficient to sort the whole table in memory just to return 150 rows in a random order. The TEXT example was more efficient because it dramatically reduced the size of the set that needed to be sorted. We can do the same thing by selecting 150 unique keys from the table at random (sorting by NEWID() for example) and only then retrieving the large padding column values for just the 150 rows we need. The following script implements that idea for all four tables: SET STATISTICS IO ON ; WITH TestTable AS ( SELECT * FROM dbo.TestCHAR ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id = ANY (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestMAX ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestTEXT ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestMAXOOR ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; All four queries now return results in much less than a second, with memory grants between 6 and 12MB, and without spilling to tempdb. The small remaining inefficiency is in reading the id column values from the clustered primary key index. As a clustered index, it contains all the in-row data at its leaf. The CHAR and VARCHAR(MAX) tables store the padding column in-row, so id values are separated by a 3999-character column, plus row overhead. The TEXT and MAXOOR tables store the padding values off-row, so id values in the clustered index leaf are separated by the much-smaller off-row pointer structure. This difference is reflected in the number of logical page reads performed by the four queries: Table 'TestCHAR' logical reads 25511 lob logical reads 000 Table 'TestMAX'. logical reads 25511 lob logical reads 000 Table 'TestTEXT' logical reads 00412 lob logical reads 597 Table 'TestMAXOOR' logical reads 00413 lob logical reads 446 We can increase the density of the id values by creating a separate nonclustered index on the id column only. This is the same key as the clustered index, of course, but the nonclustered index will not include the rest of the in-row column data. CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestCHAR (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestMAX (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestTEXT (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestMAXOOR (id); The four queries can now use the very dense nonclustered index to quickly scan the id values, sort them by NEWID(), select the 150 ids we want, and then look up the padding data. The logical reads with the new indexes in place are: Table 'TestCHAR' logical reads 835 lob logical reads 0 Table 'TestMAX' logical reads 835 lob logical reads 0 Table 'TestTEXT' logical reads 686 lob logical reads 597 Table 'TestMAXOOR' logical reads 686 lob logical reads 448 With the new index, all four queries use the same query plan (click to enlarge): Performance Summary: 0.3 seconds elapsed time 6MB memory grant 0MB tempdb usage 1MB sort set 835 logical reads (CHAR, MAX) 686 logical reads (TEXT, MAXOOR) 597 LOB logical reads (TEXT) 448 LOB logical reads (MAXOOR) No sort warning I’ll leave it as an exercise for the reader to work out why trying to eliminate the Key Lookup by adding the padding column to the new nonclustered indexes would be a daft idea Conclusion This post is not about tuning queries that access columns containing big strings. It isn’t about the internal differences between TEXT and MAX data types either. It isn’t even about the cool use of UPDATE .WRITE used in the MAXOOR table load. No, this post is about something else: Many developers might not have tuned our starting example query at all – 5 seconds isn’t that bad, and the original query plan looks reasonable at first glance. Perhaps the NEWID() function would have been blamed for ‘just being slow’ – who knows. 5 seconds isn’t awful – unless your users expect sub-second responses – but using 250MB of memory and writing 200MB to tempdb certainly is! If ten sessions ran that query at the same time in production that’s 2.5GB of memory usage and 2GB hitting tempdb. Of course, not all queries can be rewritten to avoid large memory grants and sort spills using the key-lookup technique in this post, but that’s not the point either. The point of this post is that a basic understanding of execution plans is not enough. Tuning for logical reads and adding covering indexes is not enough. If you want to produce high-quality, scalable TSQL that won’t get you paged as soon as it hits production, you need a deep understanding of execution plans, and as much accurate, deep knowledge about SQL Server as you can lay your hands on. The advanced database developer has a wide range of tools to use in writing queries that perform well in a range of circumstances. By the way, the examples in this post were written for SQL Server 2008. They will run on 2005 and demonstrate the same principles, but you won’t get the same figures I did because 2005 had a rather nasty bug in the Top N Sort operator. Fair warning: if you do decide to run the scripts on a 2005 instance (particularly the parallel query) do it before you head out for lunch… This post is dedicated to the people of Christchurch, New Zealand. © 2011 Paul White email: @[email protected] twitter: @SQL_Kiwi

Read the article

Security Issues with Single Page Apps

- by Stephen.Walther

Last week, I was asked to do a code review of a Single Page App built using the ASP.NET Web API, Durandal, and Knockout (good stuff!). In particular, I was asked to investigate whether there any special security issues associated with building a Single Page App which are not present in the case of a traditional server-side ASP.NET application. In this blog entry, I discuss two areas in which you need to exercise extra caution when building a Single Page App. I discuss how Single Page Apps are extra vulnerable to both Cross-Site Scripting (XSS) attacks and Cross-Site Request Forgery (CSRF) attacks. This goal of this blog post is NOT to persuade you to avoid writing Single Page Apps. I’m a big fan of Single Page Apps. Instead, the goal is to ensure that you are fully aware of some of the security issues related to Single Page Apps and ensure that you know how to guard against them. Cross-Site Scripting (XSS) Attacks According to WhiteHat Security, over 65% of public websites are open to XSS attacks. That’s bad. By taking advantage of XSS holes in a website, a hacker can steal your credit cards, passwords, or bank account information. Any website that redisplays untrusted information is open to XSS attacks. Let me give you a simple example. Imagine that you want to display the name of the current user on a page. To do this, you create the following server-side ASP.NET page located at http://MajorBank.com/SomePage.aspx: <%@Page Language="C#" %> <html> <head> <title>Some Page</title> </head> <body> Welcome <%= Request["username"] %> </body> </html> Nothing fancy here. Notice that the page displays the current username by using Request[“username”]. Using Request[“username”] displays the username regardless of whether the username is present in a cookie, a form field, or a query string variable. Unfortunately, by using Request[“username”] to redisplay untrusted information, you have now opened your website to XSS attacks. Here’s how. Imagine that an evil hacker creates the following link on another website (hackers.com): <a href="/SomePage.aspx?username=<script src=Evil.js></script>">Visit MajorBank</a> Notice that the link includes a query string variable named username and the value of the username variable is an HTML <SCRIPT> tag which points to a JavaScript file named Evil.js. When anyone clicks on the link, the <SCRIPT> tag will be injected into SomePage.aspx and the Evil.js script will be loaded and executed. What can a hacker do in the Evil.js script? Anything the hacker wants. For example, the hacker could display a popup dialog on the MajorBank.com site which asks the user to enter their password. The script could then post the password back to hackers.com and now the evil hacker has your secret password. ASP.NET Web Forms and ASP.NET MVC have two automatic safeguards against this type of attack: Request Validation and Automatic HTML Encoding. Protecting Coming In (Request Validation) In a server-side ASP.NET app, you are protected against the XSS attack described above by a feature named Request Validation. If you attempt to submit “potentially dangerous” content — such as a JavaScript <SCRIPT> tag — in a form field or query string variable then you get an exception. Unfortunately, Request Validation only applies to server-side apps. Request Validation does not help in the case of a Single Page App. In particular, the ASP.NET Web API does not pay attention to Request Validation. You can post any content you want – including <SCRIPT> tags – to an ASP.NET Web API action. For example, the following HTML page contains a form. When you submit the form, the form data is submitted to an ASP.NET Web API controller on the server using an Ajax request: <!DOCTYPE html> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title></title> </head> <body> <form data-bind="submit:submit"> <div> <label> User Name: <input data-bind="value:user.userName" /> </label> </div> <div> <label> Email: <input data-bind="value:user.email" /> </label> </div> <div> <input type="submit" value="Submit" /> </div> </form> <script src="Scripts/jquery-1.7.1.js"></script> <script src="Scripts/knockout-2.1.0.js"></script> <script> var viewModel = { user: { userName: ko.observable(), email: ko.observable() }, submit: function () { $.post("/api/users", ko.toJS(this.user)); } }; ko.applyBindings(viewModel); </script> </body> </html> The form above is using Knockout to bind the form fields to a view model. When you submit the form, the view model is submitted to an ASP.NET Web API action on the server. Here’s the server-side ASP.NET Web API controller and model class: public class UsersController : ApiController { public HttpResponseMessage Post(UserViewModel user) { var userName = user.UserName; return Request.CreateResponse(HttpStatusCode.OK); } } public class UserViewModel { public string UserName { get; set; } public string Email { get; set; } } If you submit the HTML form, you don’t get an error. The “potentially dangerous” content is passed to the server without any exception being thrown. In the screenshot below, you can see that I was able to post a username form field with the value “<script>alert(‘boo’)</script”. So what this means is that you do not get automatic Request Validation in the case of a Single Page App. You need to be extra careful in a Single Page App about ensuring that you do not display untrusted content because you don’t have the Request Validation safety net which you have in a traditional server-side ASP.NET app. Protecting Going Out (Automatic HTML Encoding) Server-side ASP.NET also protects you from XSS attacks when you render content. By default, all content rendered by the razor view engine is HTML encoded. For example, the following razor view displays the text “<b>Hello!</b>” instead of the text “Hello!” in bold: @{ var message = "<b>Hello!</b>"; } @message If you don’t want to render content as HTML encoded in razor then you need to take the extra step of using the @Html.Raw() helper. In a Web Form page, if you use <%: %> instead of <%= %> then you get automatic HTML Encoding: <%@ Page Language="C#" %> <% var message = "<b>Hello!</b>"; %> <%: message %> This automatic HTML Encoding will prevent many types of XSS attacks. It prevents <script> tags from being rendered and only allows <script> tags to be rendered which are useless for executing JavaScript. (This automatic HTML encoding does not protect you from all forms of XSS attacks. For example, you can assign the value “javascript:alert(‘evil’)” to the Hyperlink control’s NavigateUrl property and execute the JavaScript). The situation with Knockout is more complicated. If you use the Knockout TEXT binding then you get HTML encoded content. On the other hand, if you use the HTML binding then you do not:  <div data-bind="text:someProp"></div>  <div data-bind="html:someProp"></div> <script src="Scripts/jquery-1.7.1.js"></script> <script src="Scripts/knockout-2.1.0.js"></script> <script> var viewModel = { someProp : "<script>alert('Evil!')<" + "/script>" }; ko.applyBindings(viewModel); </script> So, in the page above, the DIV element which uses the TEXT binding is safe from XSS attacks. According to the Knockout documentation: “Since this binding sets your text value using a text node, it’s safe to set any string value without risking HTML or script injection.” Just like server-side HTML encoding, Knockout does not protect you from all types of XSS attacks. For example, there is nothing in Knockout which prevents you from binding JavaScript to a hyperlink like this: <a data-bind="attr:{href:homePageUrl}">Go</a> <script src="Scripts/jquery-1.7.1.min.js"></script> <script src="Scripts/knockout-2.1.0.js"></script> <script> var viewModel = { homePageUrl: "javascript:alert('evil!')" }; ko.applyBindings(viewModel); </script> In the page above, the value “javascript:alert(‘evil’)” is bound to the HREF attribute using Knockout. When you click the link, the JavaScript executes. Cross-Site Request Forgery (CSRF) Attacks Cross-Site Request Forgery (CSRF) attacks rely on the fact that a session cookie does not expire until you close your browser. In particular, if you visit and login to MajorBank.com and then you navigate to Hackers.com then you will still be authenticated against MajorBank.com even after you navigate to Hackers.com. Because MajorBank.com cannot tell whether a request is coming from MajorBank.com or Hackers.com, Hackers.com can submit requests to MajorBank.com pretending to be you. For example, Hackers.com can post an HTML form from Hackers.com to MajorBank.com and change your email address at MajorBank.com. Hackers.com can post a form to MajorBank.com using your authentication cookie. After your email address has been changed, by using a password reset page at MajorBank.com, a hacker can access your bank account. To prevent CSRF attacks, you need some mechanism for detecting whether a request is coming from a page loaded from your website or whether the request is coming from some other website. The recommended way of preventing Cross-Site Request Forgery attacks is to use the “Synchronizer Token Pattern” as described here: https://www.owasp.org/index.php/Cross-Site_Request_Forgery_%28CSRF%29_Prevention_Cheat_Sheet When using the Synchronizer Token Pattern, you include a hidden input field which contains a random token whenever you display an HTML form. When the user opens the form, you add a cookie to the user’s browser with the same random token. When the user posts the form, you verify that the hidden form token and the cookie token match. Preventing Cross-Site Request Forgery Attacks with ASP.NET MVC ASP.NET gives you a helper and an action filter which you can use to thwart Cross-Site Request Forgery attacks. For example, the following razor form for creating a product shows how you use the @Html.AntiForgeryToken() helper: @model MvcApplication2.Models.Product <h2>Create Product</h2> @using (Html.BeginForm()) { @Html.AntiForgeryToken(); <div> @Html.LabelFor( p => p.Name, "Product Name:") @Html.TextBoxFor( p => p.Name) </div> <div> @Html.LabelFor( p => p.Price, "Product Price:") @Html.TextBoxFor( p => p.Price) </div> <input type="submit" /> } The @Html.AntiForgeryToken() helper generates a random token and assigns a serialized version of the same random token to both a cookie and a hidden form field. (Actually, if you dive into the source code, the AntiForgeryToken() does something a little more complex because it takes advantage of a user’s identity when generating the token). Here’s what the hidden form field looks like: <input name=”__RequestVerificationToken” type=”hidden” value=”NqqZGAmlDHh6fPTNR_mti3nYGUDgpIkCiJHnEEL59S7FNToyyeSo7v4AfzF2i67Cv0qTB1TgmZcqiVtgdkW2NnXgEcBc-iBts0x6WAIShtM1″ /> And here’s what the cookie looks like using the Google Chrome developer toolbar: You use the [ValidateAntiForgeryToken] action filter on the controller action which is the recipient of the form post to validate that the token in the hidden form field matches the token in the cookie. If the tokens don’t match then validation fails and you can’t post the form: public ActionResult Create() { return View(); } [ValidateAntiForgeryToken] [HttpPost] public ActionResult Create(Product productToCreate) { if (ModelState.IsValid) { // save product to db return RedirectToAction("Index"); } return View(); } How does this all work? Let’s imagine that a hacker has copied the Create Product page from MajorBank.com to Hackers.com – the hacker grabs the HTML source and places it at Hackers.com. Now, imagine that the hacker trick you into submitting the Create Product form from Hackers.com to MajorBank.com. You’ll get the following exception: The Cross-Site Request Forgery attack is blocked because the anti-forgery token included in the Create Product form at Hackers.com won’t match the anti-forgery token stored in the cookie in your browser. The tokens were generated at different times for different users so the attack fails. Preventing Cross-Site Request Forgery Attacks with a Single Page App In a Single Page App, you can’t prevent Cross-Site Request Forgery attacks using the same method as a server-side ASP.NET MVC app. In a Single Page App, HTML forms are not generated on the server. Instead, in a Single Page App, forms are loaded dynamically in the browser. Phil Haack has a blog post on this topic where he discusses passing the anti-forgery token in an Ajax header instead of a hidden form field. He also describes how you can create a custom anti-forgery token attribute to compare the token in the Ajax header and the token in the cookie. See: http://haacked.com/archive/2011/10/10/preventing-csrf-with-ajax.aspx Also, take a look at Johan’s update to Phil Haack’s original post: http://johan.driessen.se/posts/Updated-Anti-XSRF-Validation-for-ASP.NET-MVC-4-RC (Other server frameworks such as Rails and Django do something similar. For example, Rails uses an X-CSRF-Token to prevent CSRF attacks which you generate on the server – see http://excid3.com/blog/rails-tip-2-include-csrf-token-with-every-ajax-request/#.UTFtgDDkvL8 ). For example, if you are creating a Durandal app, then you can use the following razor view for your one and only server-side page: @{ Layout = null; } <!DOCTYPE html> <html> <head> <title>Index</title> </head> <body> @Html.AntiForgeryToken() <div id="applicationHost"> Loading app.... </div> @Scripts.Render("~/scripts/vendor") <script type="text/javascript" src="~/App/durandal/amd/require.js" data-main="/App/main"></script> </body> </html> Notice that this page includes a call to @Html.AntiForgeryToken() to generate the anti-forgery token. Then, whenever you make an Ajax request in the Durandal app, you can retrieve the anti-forgery token from the razor view and pass the token as a header: var csrfToken = $("input[name='__RequestVerificationToken']").val(); $.ajax({ headers: { __RequestVerificationToken: csrfToken }, type: "POST", dataType: "json", contentType: 'application/json; charset=utf-8', url: "/api/products", data: JSON.stringify({ name: "Milk", price: 2.33 }), statusCode: { 200: function () { alert("Success!"); } } }); Use the following code to create an action filter which you can use to match the header and cookie tokens: using System.Linq; using System.Net.Http; using System.Web.Helpers; using System.Web.Http.Controllers; namespace MvcApplication2.Infrastructure { public class ValidateAjaxAntiForgeryToken : System.Web.Http.AuthorizeAttribute { protected override bool IsAuthorized(HttpActionContext actionContext) { var headerToken = actionContext .Request .Headers .GetValues("__RequestVerificationToken") .FirstOrDefault(); ; var cookieToken = actionContext .Request .Headers .GetCookies() .Select(c => c[AntiForgeryConfig.CookieName]) .FirstOrDefault(); // check for missing cookie or header if (cookieToken == null || headerToken == null) { return false; } // ensure that the cookie matches the header try { AntiForgery.Validate(cookieToken.Value, headerToken); } catch { return false; } return base.IsAuthorized(actionContext); } } } Notice that the action filter derives from the base AuthorizeAttribute. The ValidateAjaxAntiForgeryToken only works when the user is authenticated and it will not work for anonymous requests. Add the action filter to your ASP.NET Web API controller actions like this: [ValidateAjaxAntiForgeryToken] public HttpResponseMessage PostProduct(Product productToCreate) { // add product to db return Request.CreateResponse(HttpStatusCode.OK); } After you complete these steps, it won’t be possible for a hacker to pretend to be you at Hackers.com and submit a form to MajorBank.com. The header token used in the Ajax request won’t travel to Hackers.com. This approach works, but I am not entirely happy with it. The one thing that I don’t like about this approach is that it creates a hard dependency on using razor. Your single page in your Single Page App must be generated from a server-side razor view. A better solution would be to generate the anti-forgery token in JavaScript. Unfortunately, until all browsers support a way to generate cryptographically strong random numbers – for example, by supporting the window.crypto.getRandomValues() method — there is no good way to generate anti-forgery tokens in JavaScript. So, at least right now, the best solution for generating the tokens is the server-side solution with the (regrettable) dependency on razor. Conclusion The goal of this blog entry was to explore some ways in which you need to handle security differently in the case of a Single Page App than in the case of a traditional server app. In particular, I focused on how to prevent Cross-Site Scripting and Cross-Site Request Forgery attacks in the case of a Single Page App. I want to emphasize that I am not suggesting that Single Page Apps are inherently less secure than server-side apps. Whatever type of web application you build – regardless of whether it is a Single Page App, an ASP.NET MVC app, an ASP.NET Web Forms app, or a Rails app – you must constantly guard against security vulnerabilities.

Read the article

Sorting and Filtering By Model-Based LOV Display Value

- by Steven Davelaar

If you use a model-based LOV and you use display type "choice", then ADF nicely displays the display value, even if the table is read-only. In the screen shot below, you see the RegionName attribute displayed instead of the RegionId. This is accomplished by the model-based LOV, I did not modify the Countries view object to include a join with Regions. Also note the sort icon, the table is sorted by RegionId. This sorting typically results in a bug reported by your test team. Europe really shouldn't come before America when sorting ascending, right? To fix this, we could of course change the Countries view object query and add a join with the Regions table to include the RegionName attribute. If the table is updateable, we still need the choice list, so we need to move the model-based LOV from the RegionId attribute to the RegionName attribute and hide the RegionId attribute in the table. But that is a lot of work for such a simple requirement, in particular if we have lots of model-based choice lists in our view object. Fortunately, there is an easier way to do this, with some generic code in your view object base class that fixes this at once for all model-based choice lists that we have defined in our application. The trick is to override the method getSortCriteria() in the base view object class. By default, this method returns null because the sorting is done in the database through a SQL Order By clause. However, if the getSortCriteria method does return a sort criteria the framework will perform in memory sorting which is what we need to achieve sorting by region name. So, inside this method we need to evaluate the Order By clause, and if the order by column matches an attribute that has a model-based LOV choicelist defined with a display attribute that is different from the value attribute, we need to return a sort criterria. Here is the complete code of this method: public SortCriteria[] getSortCriteria() { String orderBy = getOrderByClause(); if (orderBy!=null ) { boolean descending = false; if (orderBy.endsWith(" DESC")) { descending = true; orderBy = orderBy.substring(0,orderBy.length()-5); } // extract column name, is part after the dot int dotpos = orderBy.lastIndexOf("."); String columnName = orderBy.substring(dotpos+1); // loop over attributes and find matching attribute AttributeDef orderByAttrDef = null; for (AttributeDef attrDef : getAttributeDefs()) { if (columnName.equals(attrDef.getColumnName())) { orderByAttrDef = attrDef; break; } } if (orderByAttrDef!=null && "choice".equals(orderByAttrDef.getProperty("CONTROLTYPE")) && orderByAttrDef.getListBindingDef()!=null) { String orderbyAttr = orderByAttrDef.getName(); String[] displayAttrs = orderByAttrDef.getListBindingDef().getListDisplayAttrNames(); String[] listAttrs = orderByAttrDef.getListBindingDef().getListAttrNames(); // if first list display attributes is not the same as first list attribute, than the value // displayed is different from the value copied back to the order by attribute, in which case we need to // use our custom comparator if (displayAttrs!=null && listAttrs!=null && displayAttrs.length>0 && !displayAttrs[0].equals(listAttrs[0])) { SortCriteriaImpl sc1 = new SortCriteriaImpl(orderbyAttr, descending); SortCriteria[] sc = new SortCriteriaImpl[]{sc1}; return sc; } } } return super.getSortCriteria(); } If this method returns a sort criteria, then the framework will call the sort method on the view object. The sort method uses a Comparator object to determine the sequence in which the rows should be returned. This comparator is retrieved by calling the getRowComparator method on the view object. So, to ensure sorting by our display value, we need to override this method to return our custom comparator: public Comparator getRowComparator() { return new LovDisplayAttributeRowComparator(getSortCriteria()); } The custom comparator class extends the default RowComparator class and overrides the method compareRows and looks up the choice display value to compare the two rows. The complete code of this class is included in the sample application. With this code in place, clicking on the Region sort icon nicely sorts the countries by RegionName, as you can see below. When using the Query-By-Example table filter at the top of the table, you typically want to use the same choice list to filter the rows. One way to do that is documented in ADF code corner sample 16 - How To Customize the ADF Faces Table Filter.The solution in this sample is perfectly fine to use. This sample requires you to define a separate iterator binding and associated tree binding to populate the choice list in the table filter area using the af:iterator tag. You might be able to reuse the same LOV view object instance in this iterator binding that is used as view accessor for the model-bassed LOV. However, I have seen quite a few customers who have a generic LOV view object (mapped to one "refcodes" table) with the bind variable values set in the LOV view accessor. In such a scenario, some duplicate work is needed to get a dedicated view object instance with the correct bind variables that can be used in the iterator binding. Looking for ways to maximize reuse, wouldn't it be nice if we could just reuse our model-based LOV to populate this filter choice list? Well we can. Here are the basic steps: 1. Create an attribute list binding in the page definition that we can use to retrieve the list of SelectItems needed to populate the choice list <list StaticList="false" Uses="LOV_RegionId" IterBinding="CountriesView1Iterator" id="RegionId"/> We need this "current row" list binding because the implicit list binding used by the item in the table is not accessible outside a table row, we cannot use the expression #{row.bindings.RegionId} in the table filter facet. 2. Create a Map-style managed bean with the get method retrieving the list binding as key, and returning the list of SelectItems. To return this list, we take the list of selectItems contained by the list binding and replace the index number that is normally used as key value with the actual attribute value that is set by the choice list. Here is the code of the get method: public Object get(Object key) { if (key instanceof FacesCtrlListBinding) { // we need to cast to internal class FacesCtrlListBinding rather than JUCtrlListBinding to // be able to call getItems method. To prevent this import, we could evaluate an EL expression // to get the list of items FacesCtrlListBinding lb = (FacesCtrlListBinding) key; if (cachedFilterLists.containsKey(lb.getName())) { return cachedFilterLists.get(lb.getName()); } List<SelectItem> items = (List<SelectItem>)lb.getItems(); if (items==null || items.size()==0) { return items; } List<SelectItem> newItems = new ArrayList<SelectItem>(); JUCtrlValueDef def = ((JUCtrlValueDef)lb.getDef()); String valueAttr = def.getFirstAttrName(); // the items list has an index number as value, we need to replace this with the actual // value of the attribute that is copied back by the choice list for (int i = 0; i < items.size(); i++) { SelectItem si = (SelectItem) items.get(i); Object value = lb.getValueFromList(i); if (value instanceof Row) { Row row = (Row) value; si.setValue(row.getAttribute(valueAttr)); } else { // this is the "empty" row, set value to empty string so all rows will be returned // as user no longer wants to filter on this attribute si.setValue(""); } newItems.add(si); } cachedFilterLists.put(lb.getName(), newItems); return newItems; } return null; } Note that we added caching to speed up performance, and to handle the situation where table filters or search criteria are set such that no rows are retrieved in the table. When there are no rows, there is no current row and the getItems method on the list binding will return no items. An alternative approach to create the list of SelectItems would be to retrieve the iterator binding from the list binding and loop over the rows in the iterator binding rowset. Then we wouldn't need the import of the ADF internal oracle.adfinternal.view.faces.model.binding.FacesCtrlListBinding class, but then we need to figure out the display attributes from the list binding definition, and possible separate them with a dash if multiple display attributes are defined in the LOV. Doable but less reuse and more work. 3. Inside the filter facet for the column create an af:selectOneChoice with the value property of the f:selectItems tag referencing the get method of the managed bean: <f:facet name="filter"> <af:selectOneChoice id="soc0" autoSubmit="true" value="#{vs.filterCriteria.RegionId}">  <f:selectItems id="si0" value="#{viewScope.TableFilterChoiceList[bindings.RegionId]}"/> </af:selectOneChoice> </f:facet> Note that the managed bean is defined in viewScope for the caching to take effect. Here is a screen shot of the tabe filter in action: You can download the sample application here.

Read the article

Android WebView not loading a JavaScript file, but Android Browser loads it fine.

- by Justin

I'm writing an application which connects to a back office site. The backoffice site contains a whole slew of JavaScript functions, at least 100 times the average site. Unfortunately it does not load them, and causes much of the functionality to not work properly. So I am running a test. I put a page out on my server which loads the FireBugLite javascript text. Its a lot of javascript and perfect to test and see if the Android WebView will load it. The WebView loads nothing, but the browser loads the Firebug Icon. What on earth would make the difference, why can it run in the browser and not in my WebView? Any suggestions. More background information, in order to get the stinking backoffice application available on a Droid (or any other platform except windows) I needed to trick the bakcoffice application to believe what's accessing the website is Internet Explorer. I do this by modifying the WebView User Agent. Also for this application I've slimmed my landing page, so I could give you the source to offer me aid. package ksc.myKMB; import android.app.Activity; import android.app.AlertDialog; import android.app.Dialog; import android.app.ProgressDialog; import android.content.DialogInterface; import android.graphics.Bitmap; import android.os.Bundle; import android.view.Menu; import android.view.MenuInflater; import android.view.MenuItem; import android.view.Window; import android.webkit.WebChromeClient; import android.webkit.WebView; import android.webkit.WebSettings; import android.webkit.WebViewClient; import android.widget.Toast; public class myKMB extends Activity { /** Called when the activity is first created. */ @Override public void onCreate(Bundle savedInstanceState) { super.onCreate(savedInstanceState); /** Performs base set up */ /** Create a Activity of this Activity, IE myProcess */ myProcess = this; /*** Create global objects and web browsing objects */ HideDialogOnce = true; webview = new WebView(this) { }; webChromeClient = new WebChromeClient() { public void onProgressChanged(WebView view, int progress) { // Activities and WebViews measure progress with different scales. // The progress meter will automatically disappear when we reach 100% myProcess.setProgress((progress * 100)); //CreateMessage("Progress is : " + progress); } }; webViewClient = new WebViewClient() { public void onReceivedError(WebView view, int errorCode, String description, String failingUrl) { Toast.makeText(myProcess, MessageBegText + description + MessageEndText, Toast.LENGTH_SHORT).show(); } public void onPageFinished (WebView view, String url) { /** Hide dialog */ try { // loadingDialog.dismiss(); } finally { } //myProcess.setProgress(1000); /** Fon't show the dialog while I'm performing fixes */ //HideDialogOnce = true; view.loadUrl("javascript:document.getElementById('JTRANS011').style.visibility='visible';"); } public void onPageStarted(WebView view, String url, Bitmap favicon) { if (HideDialogOnce == false) { //loadingDialog = ProgressDialog.show(myProcess, "", // "One moment, the page is laoding...", true); } else { //HideDialogOnce = true; } } }; getWindow().requestFeature(Window.FEATURE_PROGRESS); webview.setWebChromeClient(webChromeClient); webview.setWebViewClient(webViewClient); setContentView(webview); /** Load the Keynote Browser Settings */ LoadSettings(); webview.loadUrl(LandingPage); } /** Get Menu */ @Override public boolean onCreateOptionsMenu(Menu menu) { MenuInflater inflater = getMenuInflater(); inflater.inflate(R.menu.menu, menu); return true; } /** an item gets pushed */ @Override public boolean onOptionsItemSelected(MenuItem item) { switch (item.getItemId()) { // We have only one menu option case R.id.quit: System.exit(0); break; case R.id.back: webview.goBack(); case R.id.refresh: webview.reload(); case R.id.info: //IncludeJavascript(""); } return true; } /** Begin Globals */ public WebView webview; public WebChromeClient webChromeClient; public WebViewClient webViewClient; public ProgressDialog loadingDialog; public Boolean HideDialogOnce; public Activity myProcess; public String OverideUserAgent_IE = "Mozilla/5.0 (Windows; MSIE 6.0; Android 1.6; en-US) AppleWebKit/525.10+ (KHTML, like Gecko) Version/3.0.4 Safari/523.12.2 myKMB/1.0"; public String LandingPage = "http://kscserver.com/main-leap-slim.html"; public String MessageBegText = "Problem making a connection, Details: "; public String MessageEndText = " For Support Call: (xxx) xxx - xxxx."; public void LoadSettings() { webview.getSettings().setUserAgentString(OverideUserAgent_IE); webview.getSettings().setJavaScriptEnabled(true); webview.getSettings().setBuiltInZoomControls(true); webview.getSettings().setSupportZoom(true); } /** Creates a message alert dialog */ public void CreateMessage(String message) { AlertDialog.Builder builder = new AlertDialog.Builder(this); builder.setMessage(message) .setCancelable(true) .setNegativeButton("Close", new DialogInterface.OnClickListener() { public void onClick(DialogInterface dialog, int id) { dialog.cancel(); } }); AlertDialog alert = builder.create(); alert.show(); } } My Application is running in the background, and as you can see no Firebug in the lower right hand corner. However the browser (the emulator on top) has the same page but shows the firebug. What am I doing wrong? I'm assuming its either not enough memory allocated to the application, process power allocation, or a physical memory thing. I can't tell, all I know is the results are strange. I get the same thing form my android device, the application shows no firebug but the browser shows the firebug.

Read the article

Passing XML markers to Google Map

- by djmadscribbler

I've been creating a V3 Google map based on this example from Mike Williams http://www.geocodezip.com/v3_MW_example_map3.html I've run into a bit of a problem though. If I have no parameters in my URL then I get the error "id is undefined idmarkers [id.toLowerCase()] = marker;" in Firebug and only one marker will show up. If I have a parameter (?id=105 for example) then all the sidebar links say 105 (or whatever the parameter in the URL was) instead of their respective label as listed in the XML file and a random infowindow will be opened instead of the window for the id in the URL. Here is my javascript: var map = null; var lastmarker = null; // ========== Read paramaters that have been passed in ========== // Before we go looking for the passed parameters, set some defaults // in case there are no parameters var id; var index = -1; // these set the initial center, zoom and maptype for the map // if it is not specified in the query string var lat = 42.194741; var lng = -121.700301; var zoom = 18; var maptype = google.maps.MapTypeId.HYBRID; function MapTypeId2UrlValue(maptype) { var urlValue = 'm'; switch (maptype) { case google.maps.MapTypeId.HYBRID: urlValue = 'h'; break; case google.maps.MapTypeId.SATELLITE: urlValue = 'k'; break; case google.maps.MapTypeId.TERRAIN: urlValue = 't'; break; default: case google.maps.MapTypeId.ROADMAP: urlValue = 'm'; break; } return urlValue; } // If there are any parameters at eh end of the URL, they will be in location.search // looking something like "?marker=3" // skip the first character, we are not interested in the "?" var query = location.search.substring(1); // split the rest at each "&" character to give a list of "argname=value" pairs var pairs = query.split("&"); for (var i = 0; i < pairs.length; i++) { // break each pair at the first "=" to obtain the argname and value var pos = pairs[i].indexOf("="); var argname = pairs[i].substring(0, pos).toLowerCase(); var value = pairs[i].substring(pos + 1).toLowerCase(); // process each possible argname - use unescape() if theres any chance of spaces if (argname == "id") { id = unescape(value); } if (argname == "marker") { index = parseFloat(value); } if (argname == "lat") { lat = parseFloat(value); } if (argname == "lng") { lng = parseFloat(value); } if (argname == "zoom") { zoom = parseInt(value); } if (argname == "type") { // from the v3 documentation 8/24/2010 // HYBRID This map type displays a transparent layer of major streets on satellite images. // ROADMAP This map type displays a normal street map. // SATELLITE This map type displays satellite images. // TERRAIN This map type displays maps with physical features such as terrain and vegetation. if (value == "m") { maptype = google.maps.MapTypeId.ROADMAP; } if (value == "k") { maptype = google.maps.MapTypeId.SATELLITE; } if (value == "h") { maptype = google.maps.MapTypeId.HYBRID; } if (value == "t") { maptype = google.maps.MapTypeId.TERRAIN; } } } // this variable will collect the html which will eventually be placed in the side_bar var side_bar_html = ""; // arrays to hold copies of the markers and html used by the side_bar // because the function closure trick doesnt work there var gmarkers = []; var idmarkers = []; // global "map" variable var map = null; // A function to create the marker and set up the event window function function createMarker(point, icon, label, html) { var contentString = html; var marker = new google.maps.Marker({ position: point, map: map, title: label, icon: icon, zIndex: Math.round(point.lat() * -100000) << 5 }); marker.id = id; marker.index = gmarkers.length; google.maps.event.addListener(marker, 'click', function () { lastmarker = new Object; lastmarker.id = marker.id; lastmarker.index = marker.index; infowindow.setContent(contentString); infowindow.open(map, marker); }); // save the info we need to use later for the side_bar gmarkers.push(marker); idmarkers[id.toLowerCase()] = marker; // add a line to the side_bar html side_bar_html += '<a href="javascript:myclick(' + (gmarkers.length - 1) + ')">' + id + '<\/a><br>'; } // This function picks up the click and opens the corresponding info window function myclick(i) { google.maps.event.trigger(gmarkers[i], "click"); } function makeLink() { var mapinfo = "lat=" + map.getCenter().lat().toFixed(6) + "&lng=" + map.getCenter().lng().toFixed(6) + "&zoom=" + map.getZoom() + "&type=" + MapTypeId2UrlValue(map.getMapTypeId()); if (lastmarker) { var a = "/about/map/default.aspx?id=" + lastmarker.id + "&" + mapinfo; var b = "/about/map/default.aspx?marker=" + lastmarker.index + "&" + mapinfo; } else { var a = "/about/map/default.aspx?" + mapinfo; var b = a; } document.getElementById("idlink").innerHTML = '<a href="' + a + '" id=url target=_new>- Link directly to this page by id</a> (id in xml file also entry "name" in sidebar menu)'; document.getElementById("indexlink").innerHTML = '<a href="' + b + '" id=url target=_new>- Link directly to this page by index</a> (position in gmarkers array)'; } function initialize() { // create the map var myOptions = { zoom: zoom, center: new google.maps.LatLng(lat, lng), mapTypeId: maptype, mapTypeControlOptions: { style: google.maps.MapTypeControlStyle.DROPDOWN_MENU }, navigationControl: true, mapTypeId: google.maps.MapTypeId.HYBRID }; map = new google.maps.Map(document.getElementById("map_canvas"), myOptions); var stylesarray = [ { featureType: "poi", elementType: "labels", stylers: [ { visibility: "off" } ] }, { featureType: "landscape.man_made", elementType: "labels", stylers: [ { visibility: "off" } ] } ]; var options = map.setOptions({ styles: stylesarray }); // Make the link the first time when the page opens makeLink(); // Make the link again whenever the map changes google.maps.event.addListener(map, 'maptypeid_changed', makeLink); google.maps.event.addListener(map, 'center_changed', makeLink); google.maps.event.addListener(map, 'bounds_changed', makeLink); google.maps.event.addListener(map, 'zoom_changed', makeLink); google.maps.event.addListener(map, 'click', function () { lastmarker = null; makeLink(); infowindow.close(); }); // Read the data from example.xml downloadUrl("example.xml", function (doc) { var xmlDoc = xmlParse(doc); var markers = xmlDoc.documentElement.getElementsByTagName("marker"); for (var i = 0; i < markers.length; i++) { // obtain the attribues of each marker var lat = parseFloat(markers[i].getAttribute("lat")); var lng = parseFloat(markers[i].getAttribute("lng")); var point = new google.maps.LatLng(lat, lng); var html = markers[i].getAttribute("html"); var label = markers[i].getAttribute("label"); var icon = markers[i].getAttribute("icon"); // create the marker var marker = createMarker(point, icon, label, html); } // put the assembled side_bar_html contents into the side_bar div document.getElementById("side_bar").innerHTML = side_bar_html; // ========= If a parameter was passed, open the info window ========== if (id) { if (idmarkers[id]) { google.maps.event.trigger(idmarkers[id], "click"); } else { alert("id " + id + " does not match any marker"); } } if (index > -1) { if (index < gmarkers.length) { google.maps.event.trigger(gmarkers[index], "click"); } else { alert("marker " + index + " does not exist"); } } }); } var infowindow = new google.maps.InfoWindow( { size: new google.maps.Size(150, 50) }); google.maps.event.addDomListener(window, "load", initialize); And here is an example of my XML formatting <marker lat="42.196175" lng="-121.699224" html="This is the information about 104" iconimage="/about/map/images/104.png" label="104" />

Read the article

How to find an entry-level job after you already have a graduate degree?

- by Uri

Note: I asked this question in early 2009. A couple of months later, I found a great job. I've previously updated this question with some tips for whoever ends up in a similar situation, and now cleaned it up a little for the benefit of the fresh batch of graduates. Original post: In my early 20s I abandoned a great C++ development career path in a major company to go to graduate school and get a research masters (3 years). I did another year in industrial research, and then moved to the US to attend graduate school again, getting another masters and a Ph.D in software engineering from a top school (another 6 years down the drain). I was coding the whole way throughout my degrees (core Java and Eclipse plug-ins) and working on research related to software engineering (usability of APIs). I ended up graduating the year of the recession, with a son on the way and the prospects of no healthcare. Academic jobs and industrial research jobs are quite scarce. Initially, I was naive, thinking that with my background, I could easily find a coding job. Big mistake. It turns out that I'm in a complicated position. Entry level positions are usually offered to college undergraduates. I attended my school's career fairs, but you could immediately see signs of Ph.D. aversion and overqualification issues. Some of the recruiters I spoke with explicitly told me that they wanted 20 year olds with clean slates, and some were looking for interns since they are in various forms of hiring freezes. I managed to get a couple of interviews from these career fairs and through recruiters. However, since I've been out of school for a long time and programming primarily in Java, I am also no longer proficient in C/C++ and the usual range of college-level interview questions that everyone uses. I had no problems with this when I was 19 and interviewing for my first job since a lot of what you do in C is manipulate pointers and I was coding C++ for fun and for school. Later I was routinely doing pointer manipulation on the job, and during my first masters taught college courses with data structures and C++. But even though I remember many properties of C++ well, it's been close to ten years since I regularly used C++ and pointers. As a Java developer I rarely had to work at this level, but experience in OOD and in writing good maintainable code is meaningless for C++ interviews. Reading books as a refresh and looking at sample code did not do the trick. I also looked at mid-to-senior level Java positions, but most of them focused on J2EE APIs rather than on core Java and required a certain number of years in industrial positions. Coding research tools and prior C++ experience doesn't count. So that sends me back to entry-level jobs that are posted through job-boards, and these are not common (mostly they are Monster junk), and small companies are even less likely to answer a Ph.D. compared to the giants who participate in top-10 career fairs. Even worse, in many companies initial screening is done by HR folks who really don't want to deal with anything anomalous like a Ph.D. Any tips on how I should approach this intractable position? For example, what should I write in cover letters? Note that while immigration is not an issue for me, I cannot go freelance as I need the benefits (and in particular group health insurance). During my studies I had no time to contribute to open-source projects or maintain a popular blog, so even if I invested in that now there would be no immediate benefit. Updates: In the two months after posting this I received several offers to work as a core Java developer in the financial industry and accepted one from a firm where I am working to this day. For those who find themselves in similar situations, here are my tips: Give up on trying to find an entry level positions. You can't undo time. Accept the fact that there is Ph.D. discrimination in the job market (some might say rightfully so). It is legal to discriminate based on education. No point fighting it. The most important tip is to focus on the language you are comfortable with. The sad truth about programming in a particular language is that it is not like riding a bike. If you haven't used a language in the last few years, and can't actually apply it routinely (not just as a refresher) before you start your search, it is going to be very difficult to do well in an interview. Now that I'm interviewing others, I routinely see it in folks with a mixed C++/Java background. We maintain "a shadow" of the old language but end up with a weird mix that makes it hard to interview on either. Entry-level folks are at an advantage here since they usually have one language. Memory can help you do great in a screening interview, but without recent day-to-day experience, code tests will be difficult. Despite the supposed relation, core Java programming and J2EE programming are two different things with different skillsets. If you come from academia, you likely have very little J2EE experience and may find it hard to get accepted for a J2EE job. J2EE jobs seem to have a larger list of acronyms in their requirements. In addition, from interviewing J2EE developers it seems that for many there is a focus on mastering specific APIs and architectures, whereas core Java development tends to be secondary. In the same way that I can no longer manipulate pointers well, a J2EE developer may have difficulties doing low level Java manipulation. This puts you at a relative advantage in competing for core Java jobs! If you are able to work for startups (in terms of family life and stability) or migrate to startup-rich areas such as the west coast, you can find many exciting opportunities where advanced degrees are a benefit. I've since been approached by several startups, although I had to decline. Work through a recruiter if possible. They have direct contacts with the hiring parties, allowing you to "stand out". It is better to get a clear yes/no confirmation from a recruiter on whether a company might be interested in interviewing you, than it is to send your resume and hope that someone will ever see it. Recruiters are also a great way of bypassing HR. However, also beware of recruiters. They have a vested interest and will go to various shady practices and pressure tactics. To find a good recruiter, talk to a friend who declined a job offer he got through a recruiter. A good recruiter, to me, is measured in how they handle that. Interview for the jobs that require your core strength. If you're rusty or entirely unfamiliar with a technology around which the job revolves, you're probably not a good match. Yes, you probably have the talent to master them, but most companies would want "instant gratification". I got my offers from companies that wanted core Java developer. I didn't do well on places that wanted advance C++ because I am too rusty and not up to date on recent libraries. I also didn't hear from companies that wanted lots of J2EE experience, and that's ok. Finding companies that want core Java without web is harder, but exists in specific industries (e.g., finance, defense). This requires a lot more legwork in terms of search, but these jobs do exist. There are different interview styles. Some companies focus on puzzles, some companies focus on algorithms, and some companies focus on design and coding skills. I had the most success in places where the questions were the most related to the function I would have been performing. Pick companies accordingly as well.

Read the article

Capturing and Transforming ASP.NET Output with Response.Filter

- by Rick Strahl

During one of my Handlers and Modules session at DevConnections this week one of the attendees asked a question that I didn’t have an immediate answer for. Basically he wanted to capture response output completely and then apply some filtering to the output – effectively injecting some additional content into the page AFTER the page had completely rendered. Specifically the output should be captured from anywhere – not just a page and have this code injected into the page. Some time ago I posted some code that allows you to capture ASP.NET Page output by overriding the Render() method, capturing the HtmlTextWriter() and reading its content, modifying the rendered data as text then writing it back out. I’ve actually used this approach on a few occasions and it works fine for ASP.NET pages. But this obviously won’t work outside of the Page class environment and it’s not really generic – you have to create a custom page class in order to handle the output capture. [updated 11/16/2009 – updated ResponseFilterStream implementation and a few additional notes based on comments] Enter Response.Filter However, ASP.NET includes a Response.Filter which can be used – well to filter output. Basically Response.Filter is a stream through which the OutputStream is piped back to the Web Server (indirectly). As content is written into the Response object, the filter stream receives the appropriate Stream commands like Write, Flush and Close as well as read operations although for a Response.Filter that’s uncommon to be hit. The Response.Filter can be programmatically replaced at runtime which allows you to effectively intercept all output generation that runs through ASP.NET. A common Example: Dynamic GZip Encoding A rather common use of Response.Filter hooking up code based, dynamic GZip compression for requests which is dead simple by applying a GZipStream (or DeflateStream) to Response.Filter. The following generic routines can be used very easily to detect GZip capability of the client and compress response output with a single line of code and a couple of library helper routines: WebUtils.GZipEncodePage(); which is handled with a few lines of reusable code and a couple of static helper methods: /// <summary> ///Sets up the current page or handler to use GZip through a Response.Filter ///IMPORTANT: ///You have to call this method before any output is generated! /// </summary> public static void GZipEncodePage() { HttpResponse Response = HttpContext.Current.Response; if(IsGZipSupported()) { stringAcceptEncoding = HttpContext.Current.Request.Headers["Accept-Encoding"]; if(AcceptEncoding.Contains("deflate")) { Response.Filter = newSystem.IO.Compression.DeflateStream(Response.Filter, System.IO.Compression.CompressionMode.Compress); Response.AppendHeader("Content-Encoding", "deflate"); } else { Response.Filter = newSystem.IO.Compression.GZipStream(Response.Filter, System.IO.Compression.CompressionMode.Compress); Response.AppendHeader("Content-Encoding", "gzip"); } } // Allow proxy servers to cache encoded and unencoded versions separately Response.AppendHeader("Vary", "Content-Encoding"); } /// <summary> /// Determines if GZip is supported /// </summary> /// <returns></returns> public static bool IsGZipSupported() { string AcceptEncoding = HttpContext.Current.Request.Headers["Accept-Encoding"]; if (!string.IsNullOrEmpty(AcceptEncoding) && (AcceptEncoding.Contains("gzip") || AcceptEncoding.Contains("deflate"))) return true; return false; } GZipStream and DeflateStream are streams that are assigned to Response.Filter and by doing so apply the appropriate compression on the active Response. Response.Filter content is chunked So to implement a Response.Filter effectively requires only that you implement a custom stream and handle the Write() method to capture Response output as it’s written. At first blush this seems very simple – you capture the output in Write, transform it and write out the transformed content in one pass. And that indeed works for small amounts of content. But you see, the problem is that output is written in small buffer chunks (a little less than 16k it appears) rather than just a single Write() statement into the stream, which makes perfect sense for ASP.NET to stream data back to IIS in smaller chunks to minimize memory usage en route. Unfortunately this also makes it a more difficult to implement any filtering routines since you don’t directly get access to all of the response content which is problematic especially if those filtering routines require you to look at the ENTIRE response in order to transform or capture the output as is needed for the solution the gentleman in my session asked for. So in order to address this a slightly different approach is required that basically captures all the Write() buffers passed into a cached stream and then making the stream available only when it’s complete and ready to be flushed. As I was thinking about the implementation I also started thinking about the few instances when I’ve used Response.Filter implementations. Each time I had to create a new Stream subclass and create my custom functionality but in the end each implementation did the same thing – capturing output and transforming it. I thought there should be an easier way to do this by creating a re-usable Stream class that can handle stream transformations that are common to Response.Filter implementations. Creating a semi-generic Response Filter Stream Class What I ended up with is a ResponseFilterStream class that provides a handful of Events that allow you to capture and/or transform Response content. The class implements a subclass of Stream and then overrides Write() and Flush() to handle capturing and transformation operations. By exposing events it’s easy to hook up capture or transformation operations via single focused methods. ResponseFilterStream exposes the following events: CaptureStream, CaptureString Captures the output only and provides either a MemoryStream or String with the final page output. Capture is hooked to the Flush() operation of the stream. TransformStream, TransformString Allows you to transform the complete response output with events that receive a MemoryStream or String respectively and can you modify the output then return it back as a return value. The transformed output is then written back out in a single chunk to the response output stream. These events capture all output internally first then write the entire buffer into the response. TransformWrite, TransformWriteString Allows you to transform the Response data as it is written in its original chunk size in the Stream’s Write() method. Unlike TransformStream/TransformString which operate on the complete output, these events only see the current chunk of data written. This is more efficient as there’s no caching involved, but can cause problems due to searched content splitting over multiple chunks. Using this implementation, creating a custom Response.Filter transformation becomes as simple as the following code. To hook up the Response.Filter using the MemoryStream version event: ResponseFilterStream filter = new ResponseFilterStream(Response.Filter); filter.TransformStream += filter_TransformStream; Response.Filter = filter; and the event handler to do the transformation: MemoryStream filter_TransformStream(MemoryStream ms) { Encoding encoding = HttpContext.Current.Response.ContentEncoding; string output = encoding.GetString(ms.ToArray()); output = FixPaths(output); ms = new MemoryStream(output.Length); byte[] buffer = encoding.GetBytes(output); ms.Write(buffer,0,buffer.Length); return ms; } private string FixPaths(string output) { string path = HttpContext.Current.Request.ApplicationPath; // override root path wonkiness if (path == "/") path = ""; output = output.Replace("\"~/", "\"" + path + "/").Replace("'~/", "'" + path + "/"); return output; } The idea of the event handler is that you can do whatever you want to the stream and return back a stream – either the same one that’s been modified or a brand new one – which is then sent back to as the final response. The above code can be simplified even more by using the string version events which handle the stream to string conversions for you: ResponseFilterStream filter = new ResponseFilterStream(Response.Filter); filter.TransformString += filter_TransformString; Response.Filter = filter; and the event handler to do the transformation calling the same FixPaths method shown above: string filter_TransformString(string output) { return FixPaths(output); } The events for capturing output and capturing and transforming chunks work in a very similar way. By using events to handle the transformations ResponseFilterStream becomes a reusable component and we don’t have to create a new stream class or subclass an existing Stream based classed. By the way, the example used here is kind of a cool trick which transforms “~/” expressions inside of the final generated HTML output – even in plain HTML controls not HTML controls – and transforms them into the appropriate application relative path in the same way that ResolveUrl would do. So you can write plain old HTML like this: <a href=”~/default.aspx”>Home</a> and have it turned into: <a href=”/myVirtual/default.aspx”>Home</a> without having to use an ASP.NET control like Hyperlink or Image or having to constantly use: <img src=”<%= ResolveUrl(“~/images/home.gif”) %>” /> in MVC applications (which frankly is one of the most annoying things about MVC especially given the path hell that extension-less and endpoint-less URLs impose). I can’t take credit for this idea. While discussing the Response.Filter issues on Twitter a hint from Dylan Beattie who pointed me at one of his examples which does something similar. I thought the idea was cool enough to use an example for future demos of Response.Filter functionality in ASP.NET next I time I do the Modules and Handlers talk (which was great fun BTW). How practical this is is debatable however since there’s definitely some overhead to using a Response.Filter in general and especially on one that caches the output and the re-writes it later. Make sure to test for performance anytime you use Response.Filter hookup and make sure it' doesn’t end up killing perf on you. You’ve been warned :-}. How does ResponseFilterStream work? The big win of this implementation IMHO is that it’s a reusable component – so for implementation there’s no new class, no subclassing – you simply attach to an event to implement an event handler method with a straight forward signature to retrieve the stream or string you’re interested in. The implementation is based on a subclass of Stream as is required in order to handle the Response.Filter requirements. What’s different than other implementations I’ve seen in various places is that it supports capturing output as a whole to allow retrieving the full response output for capture or modification. The exception are the TransformWrite and TransformWrite events which operate only active chunk of data written by the Response. For captured output, the Write() method captures output into an internal MemoryStream that is cached until writing is complete. So Write() is called when ASP.NET writes to the Response stream, but the filter doesn’t pass on the Write immediately to the filter’s internal stream. The data is cached and only when the Flush() method is called to finalize the Stream’s output do we actually send the cached stream off for transformation (if the events are hooked up) and THEN finally write out the returned content in one big chunk. Here’s the implementation of ResponseFilterStream: /// <summary> /// A semi-generic Stream implementation for Response.Filter with /// an event interface for handling Content transformations via /// Stream or String. /// <remarks> /// Use with care for large output as this implementation copies /// the output into a memory stream and so increases memory usage. /// </remarks> /// </summary> public class ResponseFilterStream : Stream { /// <summary> /// The original stream /// </summary> Stream _stream; /// <summary> /// Current position in the original stream /// </summary> long _position; /// <summary> /// Stream that original content is read into /// and then passed to TransformStream function /// </summary> MemoryStream _cacheStream = new MemoryStream(5000); /// <summary> /// Internal pointer that that keeps track of the size /// of the cacheStream /// </summary> int _cachePointer = 0; /// <summary> /// /// </summary> /// <param name="responseStream"></param> public ResponseFilterStream(Stream responseStream) { _stream = responseStream; } /// <summary> /// Determines whether the stream is captured /// </summary> private bool IsCaptured { get { if (CaptureStream != null || CaptureString != null || TransformStream != null || TransformString != null) return true; return false; } } /// <summary> /// Determines whether the Write method is outputting data immediately /// or delaying output until Flush() is fired. /// </summary> private bool IsOutputDelayed { get { if (TransformStream != null || TransformString != null) return true; return false; } } /// <summary> /// Event that captures Response output and makes it available /// as a MemoryStream instance. Output is captured but won't /// affect Response output. /// </summary> public event Action<MemoryStream> CaptureStream; /// <summary> /// Event that captures Response output and makes it available /// as a string. Output is captured but won't affect Response output. /// </summary> public event Action<string> CaptureString; /// <summary> /// Event that allows you transform the stream as each chunk of /// the output is written in the Write() operation of the stream. /// This means that that it's possible/likely that the input /// buffer will not contain the full response output but only /// one of potentially many chunks. /// /// This event is called as part of the filter stream's Write() /// operation. /// </summary> public event Func<byte[], byte[]> TransformWrite; /// <summary> /// Event that allows you to transform the response stream as /// each chunk of bytep[] output is written during the stream's write /// operation. This means it's possibly/likely that the string /// passed to the handler only contains a portion of the full /// output. Typical buffer chunks are around 16k a piece. /// /// This event is called as part of the stream's Write operation. /// </summary> public event Func<string, string> TransformWriteString; /// <summary> /// This event allows capturing and transformation of the entire /// output stream by caching all write operations and delaying final /// response output until Flush() is called on the stream. /// </summary> public event Func<MemoryStream, MemoryStream> TransformStream; /// <summary> /// Event that can be hooked up to handle Response.Filter /// Transformation. Passed a string that you can modify and /// return back as a return value. The modified content /// will become the final output. /// </summary> public event Func<string, string> TransformString; protected virtual void OnCaptureStream(MemoryStream ms) { if (CaptureStream != null) CaptureStream(ms); } private void OnCaptureStringInternal(MemoryStream ms) { if (CaptureString != null) { string content = HttpContext.Current.Response.ContentEncoding.GetString(ms.ToArray()); OnCaptureString(content); } } protected virtual void OnCaptureString(string output) { if (CaptureString != null) CaptureString(output); } protected virtual byte[] OnTransformWrite(byte[] buffer) { if (TransformWrite != null) return TransformWrite(buffer); return buffer; } private byte[] OnTransformWriteStringInternal(byte[] buffer) { Encoding encoding = HttpContext.Current.Response.ContentEncoding; string output = OnTransformWriteString(encoding.GetString(buffer)); return encoding.GetBytes(output); } private string OnTransformWriteString(string value) { if (TransformWriteString != null) return TransformWriteString(value); return value; } protected virtual MemoryStream OnTransformCompleteStream(MemoryStream ms) { if (TransformStream != null) return TransformStream(ms); return ms; } /// <summary> /// Allows transforming of strings /// /// Note this handler is internal and not meant to be overridden /// as the TransformString Event has to be hooked up in order /// for this handler to even fire to avoid the overhead of string /// conversion on every pass through. /// </summary> /// <param name="responseText"></param> /// <returns></returns> private string OnTransformCompleteString(string responseText) { if (TransformString != null) TransformString(responseText); return responseText; } /// <summary> /// Wrapper method form OnTransformString that handles /// stream to string and vice versa conversions /// </summary> /// <param name="ms"></param> /// <returns></returns> internal MemoryStream OnTransformCompleteStringInternal(MemoryStream ms) { if (TransformString == null) return ms; //string content = ms.GetAsString(); string content = HttpContext.Current.Response.ContentEncoding.GetString(ms.ToArray()); content = TransformString(content); byte[] buffer = HttpContext.Current.Response.ContentEncoding.GetBytes(content); ms = new MemoryStream(); ms.Write(buffer, 0, buffer.Length); //ms.WriteString(content); return ms; } /// <summary> /// /// </summary> public override bool CanRead { get { return true; } } public override bool CanSeek { get { return true; } } /// <summary> /// /// </summary> public override bool CanWrite { get { return true; } } /// <summary> /// /// </summary> public override long Length { get { return 0; } } /// <summary> /// /// </summary> public override long Position { get { return _position; } set { _position = value; } } /// <summary> /// /// </summary> /// <param name="offset"></param> /// <param name="direction"></param> /// <returns></returns> public override long Seek(long offset, System.IO.SeekOrigin direction) { return _stream.Seek(offset, direction); } /// <summary> /// /// </summary> /// <param name="length"></param> public override void SetLength(long length) { _stream.SetLength(length); } /// <summary> /// /// </summary> public override void Close() { _stream.Close(); } /// <summary> /// Override flush by writing out the cached stream data /// </summary> public override void Flush() { if (IsCaptured && _cacheStream.Length > 0) { // Check for transform implementations _cacheStream = OnTransformCompleteStream(_cacheStream); _cacheStream = OnTransformCompleteStringInternal(_cacheStream); OnCaptureStream(_cacheStream); OnCaptureStringInternal(_cacheStream); // write the stream back out if output was delayed if (IsOutputDelayed) _stream.Write(_cacheStream.ToArray(), 0, (int)_cacheStream.Length); // Clear the cache once we've written it out _cacheStream.SetLength(0); } // default flush behavior _stream.Flush(); } /// <summary> /// /// </summary> /// <param name="buffer"></param> /// <param name="offset"></param> /// <param name="count"></param> /// <returns></returns> public override int Read(byte[] buffer, int offset, int count) { return _stream.Read(buffer, offset, count); } /// <summary> /// Overriden to capture output written by ASP.NET and captured /// into a cached stream that is written out later when Flush() /// is called. /// </summary> /// <param name="buffer"></param> /// <param name="offset"></param> /// <param name="count"></param> public override void Write(byte[] buffer, int offset, int count) { if ( IsCaptured ) { // copy to holding buffer only - we'll write out later _cacheStream.Write(buffer, 0, count); _cachePointer += count; } // just transform this buffer if (TransformWrite != null) buffer = OnTransformWrite(buffer); if (TransformWriteString != null) buffer = OnTransformWriteStringInternal(buffer); if (!IsOutputDelayed) _stream.Write(buffer, offset, buffer.Length); } } The key features are the events and corresponding OnXXX methods that handle the event hookups, and the Write() and Flush() methods of the stream implementation. All the rest of the members tend to be plain jane passthrough stream implementation code without much consequence. I do love the way Action<t> and Func<T> make it so easy to create the event signatures for the various events – sweet. A few Things to consider Performance Response.Filter is not great for performance in general as it adds another layer of indirection to the ASP.NET output pipeline, and this implementation in particular adds a memory hit as it basically duplicates the response output into the cached memory stream which is necessary since you may have to look at the entire response. If you have large pages in particular this can cause potentially serious memory pressure in your server application. So be careful of wholesale adoption of this (or other) Response.Filters. Make sure to do some performance testing to ensure it’s not killing your app’s performance. Response.Filter works everywhere A few questions came up in comments and discussion as to capturing ALL output hitting the site and – yes you can definitely do that by assigning a Response.Filter inside of a module. If you do this however you’ll want to be very careful and decide which content you actually want to capture especially in IIS 7 which passes ALL content – including static images/CSS etc. through the ASP.NET pipeline. So it is important to filter only on what you’re looking for – like the page extension or maybe more effectively the Response.ContentType. Response.Filter Chaining Originally I thought that filter chaining doesn’t work at all due to a bug in the stream implementation code. But it’s quite possible to assign multiple filters to the Response.Filter property. So the following actually works to both compress the output and apply the transformed content: WebUtils.GZipEncodePage(); ResponseFilterStream filter = new ResponseFilterStream(Response.Filter); filter.TransformString += filter_TransformString; Response.Filter = filter; However the following does not work resulting in invalid content encoding errors: ResponseFilterStream filter = new ResponseFilterStream(Response.Filter); filter.TransformString += filter_TransformString; Response.Filter = filter; WebUtils.GZipEncodePage(); In other words multiple Response filters can work together but it depends entirely on the implementation whether they can be chained or in which order they can be chained. In this case running the GZip/Deflate stream filters apparently relies on the original content length of the output and chokes when the content is modified. But if attaching the compression first it works fine as unintuitive as that may seem. Resources Download example code Capture Output from ASP.NET Pages © Rick Strahl, West Wind Technologies, 2005-2010Posted in ASP.NET

Read the article

An Xml Serializable PropertyBag Dictionary Class for .NET

- by Rick Strahl

I don't know about you but I frequently need property bags in my applications to store and possibly cache arbitrary data. Dictionary<T,V> works well for this although I always seem to be hunting for a more specific generic type that provides a string key based dictionary. There's string dictionary, but it only works with strings. There's Hashset<T> but it uses the actual values as keys. In most key value pair situations for me string is key value to work off. Dictionary<T,V> works well enough, but there are some issues with serialization of dictionaries in .NET. The .NET framework doesn't do well serializing IDictionary objects out of the box. The XmlSerializer doesn't support serialization of IDictionary via it's default serialization, and while the DataContractSerializer does support IDictionary serialization it produces some pretty atrocious XML. What doesn't work? First off Dictionary serialization with the Xml Serializer doesn't work so the following fails: [TestMethod] public void DictionaryXmlSerializerTest() { var bag = new Dictionary<string, object>(); bag.Add("key", "Value"); bag.Add("Key2", 100.10M); bag.Add("Key3", Guid.NewGuid()); bag.Add("Key4", DateTime.Now); bag.Add("Key5", true); bag.Add("Key7", new byte[3] { 42, 45, 66 }); TestContext.WriteLine(this.ToXml(bag)); } public string ToXml(object obj) { if (obj == null) return null; StringWriter sw = new StringWriter(); XmlSerializer ser = new XmlSerializer(obj.GetType()); ser.Serialize(sw, obj); return sw.ToString(); } The error you get with this is: System.NotSupportedException: The type System.Collections.Generic.Dictionary`2[[System.String, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089],[System.Object, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]] is not supported because it implements IDictionary. Got it! BTW, the same is true with binary serialization. Running the same code above against the DataContractSerializer does work: [TestMethod] public void DictionaryDataContextSerializerTest() { var bag = new Dictionary<string, object>(); bag.Add("key", "Value"); bag.Add("Key2", 100.10M); bag.Add("Key3", Guid.NewGuid()); bag.Add("Key4", DateTime.Now); bag.Add("Key5", true); bag.Add("Key7", new byte[3] { 42, 45, 66 }); TestContext.WriteLine(this.ToXmlDcs(bag)); } public string ToXmlDcs(object value, bool throwExceptions = false) { var ser = new DataContractSerializer(value.GetType(), null, int.MaxValue, true, false, null); MemoryStream ms = new MemoryStream(); ser.WriteObject(ms, value); return Encoding.UTF8.GetString(ms.ToArray(), 0, (int)ms.Length); } This DOES work but produces some pretty heinous XML (formatted with line breaks and indentation here): <ArrayOfKeyValueOfstringanyType xmlns="http://schemas.microsoft.com/2003/10/Serialization/Arrays" xmlns:i="http://www.w3.org/2001/XMLSchema-instance"> <KeyValueOfstringanyType> <Key>key</Key> <Value i:type="a:string" xmlns:a="http://www.w3.org/2001/XMLSchema">Value</Value> </KeyValueOfstringanyType> <KeyValueOfstringanyType> <Key>Key2</Key> <Value i:type="a:decimal" xmlns:a="http://www.w3.org/2001/XMLSchema">100.10</Value> </KeyValueOfstringanyType> <KeyValueOfstringanyType> <Key>Key3</Key> <Value i:type="a:guid" xmlns:a="http://schemas.microsoft.com/2003/10/Serialization/">2cd46d2a-a636-4af4-979b-e834d39b6d37</Value> </KeyValueOfstringanyType> <KeyValueOfstringanyType> <Key>Key4</Key> <Value i:type="a:dateTime" xmlns:a="http://www.w3.org/2001/XMLSchema">2011-09-19T17:17:05.4406999-07:00</Value> </KeyValueOfstringanyType> <KeyValueOfstringanyType> <Key>Key5</Key> <Value i:type="a:boolean" xmlns:a="http://www.w3.org/2001/XMLSchema">true</Value> </KeyValueOfstringanyType> <KeyValueOfstringanyType> <Key>Key7</Key> <Value i:type="a:base64Binary" xmlns:a="http://www.w3.org/2001/XMLSchema">Ki1C</Value> </KeyValueOfstringanyType> </ArrayOfKeyValueOfstringanyType> Ouch! That seriously hurts the eye! :-) Worse though it's extremely verbose with all those repetitive namespace declarations. It's good to know that it works in a pinch, but for a human readable/editable solution or something lightweight to store in a database it's not quite ideal. Why should I care? As a little background, in one of my applications I have a need for a flexible property bag that is used on a free form database field on an otherwise static entity. Basically what I have is a standard database record to which arbitrary properties can be added in an XML based string field. I intend to expose those arbitrary properties as a collection from field data stored in XML. The concept is pretty simple: When loading write the data to the collection, when the data is saved serialize the data into an XML string and store it into the database. When reading the data pick up the XML and if the collection on the entity is accessed automatically deserialize the XML into the Dictionary. (I'll talk more about this in another post). While the DataContext Serializer would work, it's verbosity is problematic both for size of the generated XML strings and the fact that users can manually edit this XML based property data in an advanced mode. A clean(er) layout certainly would be preferable and more user friendly. Custom XMLSerialization with a PropertyBag Class So… after a bunch of experimentation with different serialization formats I decided to create a custom PropertyBag class that provides for a serializable Dictionary. It's basically a custom Dictionary<TType,TValue> implementation with the keys always set as string keys. The result are PropertyBag<TValue> and PropertyBag (which defaults to the object type for values). The PropertyBag<TType> and PropertyBag classes provide these features: Subclassed from Dictionary<T,V> Implements IXmlSerializable with a cleanish XML format ToXml() and FromXml() methods to export and import to and from XML strings Static CreateFromXml() method to create an instance It's simple enough as it's merely a Dictionary<string,object> subclass but that supports serialization to a - what I think at least - cleaner XML format. The class is super simple to use: [TestMethod] public void PropertyBagTwoWayObjectSerializationTest() { var bag = new PropertyBag(); bag.Add("key", "Value"); bag.Add("Key2", 100.10M); bag.Add("Key3", Guid.NewGuid()); bag.Add("Key4", DateTime.Now); bag.Add("Key5", true); bag.Add("Key7", new byte[3] { 42,45,66 } ); bag.Add("Key8", null); bag.Add("Key9", new ComplexObject() { Name = "Rick", Entered = DateTime.Now, Count = 10 }); string xml = bag.ToXml(); TestContext.WriteLine(bag.ToXml()); bag.Clear(); bag.FromXml(xml); Assert.IsTrue(bag["key"] as string == "Value"); Assert.IsInstanceOfType( bag["Key3"], typeof(Guid)); Assert.IsNull(bag["Key8"]); //Assert.IsNull(bag["Key10"]); Assert.IsInstanceOfType(bag["Key9"], typeof(ComplexObject)); } This uses the PropertyBag class which uses a PropertyBag<string,object> - which means it returns untyped values of type object. I suspect for me this will be the most common scenario as I'd want to store arbitrary values in the PropertyBag rather than one specific type. The same code with a strongly typed PropertyBag<decimal> looks like this: [TestMethod] public void PropertyBagTwoWayValueTypeSerializationTest() { var bag = new PropertyBag<decimal>(); bag.Add("key", 10M); bag.Add("Key1", 100.10M); bag.Add("Key2", 200.10M); bag.Add("Key3", 300.10M); string xml = bag.ToXml(); TestContext.WriteLine(bag.ToXml()); bag.Clear(); bag.FromXml(xml); Assert.IsTrue(bag.Get("Key1") == 100.10M); Assert.IsTrue(bag.Get("Key3") == 300.10M); } and produces typed results of type decimal. The types can be either value or reference types the combination of which actually proved to be a little more tricky than anticipated due to null and specific string value checks required - getting the generic typing right required use of default(T) and Convert.ChangeType() to trick the compiler into playing nice. Of course the whole raison d'etre for this class is the XML serialization. You can see in the code above that we're doing a .ToXml() and .FromXml() to serialize to and from string. The XML produced for the first example looks like this: <?xml version="1.0" encoding="utf-8"?> <properties> <item> <key>key</key> <value>Value</value> </item> <item> <key>Key2</key> <value type="decimal">100.10</value> </item> <item> <key>Key3</key> <value type="___System.Guid"> <guid>f7a92032-0c6d-4e9d-9950-b15ff7cd207d</guid> </value> </item> <item> <key>Key4</key> <value type="datetime">2011-09-26T17:45:58.5789578-10:00</value> </item> <item> <key>Key5</key> <value type="boolean">true</value> </item> <item> <key>Key7</key> <value type="base64Binary">Ki1C</value> </item> <item> <key>Key8</key> <value type="nil" /> </item> <item> <key>Key9</key> <value type="___Westwind.Tools.Tests.PropertyBagTest+ComplexObject"> <ComplexObject> <Name>Rick</Name> <Entered>2011-09-26T17:45:58.5789578-10:00</Entered> <Count>10</Count> </ComplexObject> </value> </item> </properties> The format is a bit cleaner than the DataContractSerializer. Each item is serialized into <key> <value> pairs. If the value is a string no type information is written. Since string tends to be the most common type this saves space and serialization processing. All other types are attributed. Simple types are mapped to XML types so things like decimal, datetime, boolean and base64Binary are encoded using their Xml type values. All other types are embedded with a hokey format that describes the .NET type preceded by a three underscores and then are encoded using the XmlSerializer. You can see this best above in the ComplexObject encoding. For custom types this isn't pretty either, but it's more concise than the DCS and it works as long as you're serializing back and forth between .NET clients at least. The XML generated from the second example that uses PropertyBag<decimal> looks like this: <?xml version="1.0" encoding="utf-8"?> <properties> <item> <key>key</key> <value type="decimal">10</value> </item> <item> <key>Key1</key> <value type="decimal">100.10</value> </item> <item> <key>Key2</key> <value type="decimal">200.10</value> </item> <item> <key>Key3</key> <value type="decimal">300.10</value> </item> </properties> How does it work As I mentioned there's nothing fancy about this solution - it's little more than a subclass of Dictionary<T,V> that implements custom Xml Serialization and a couple of helper methods that facilitate getting the XML in and out of the class more easily. But it's proven very handy for a number of projects for me where dynamic data storage is required. Here's the code: /// <summary> /// Creates a serializable string/object dictionary that is XML serializable /// Encodes keys as element names and values as simple values with a type /// attribute that contains an XML type name. Complex names encode the type /// name with type='___namespace.classname' format followed by a standard xml /// serialized format. The latter serialization can be slow so it's not recommended /// to pass complex types if performance is critical. /// </summary> [XmlRoot("properties")] public class PropertyBag : PropertyBag<object> { /// <summary> /// Creates an instance of a propertybag from an Xml string /// </summary> /// <param name="xml">Serialize</param> /// <returns></returns> public static PropertyBag CreateFromXml(string xml) { var bag = new PropertyBag(); bag.FromXml(xml); return bag; } } /// <summary> /// Creates a serializable string for generic types that is XML serializable. /// /// Encodes keys as element names and values as simple values with a type /// attribute that contains an XML type name. Complex names encode the type /// name with type='___namespace.classname' format followed by a standard xml /// serialized format. The latter serialization can be slow so it's not recommended /// to pass complex types if performance is critical. /// </summary> /// <typeparam name="TValue">Must be a reference type. For value types use type object</typeparam> [XmlRoot("properties")] public class PropertyBag<TValue> : Dictionary<string, TValue>, IXmlSerializable { /// <summary> /// Not implemented - this means no schema information is passed /// so this won't work with ASMX/WCF services. /// </summary> /// <returns></returns> public System.Xml.Schema.XmlSchema GetSchema() { return null; } /// <summary> /// Serializes the dictionary to XML. Keys are /// serialized to element names and values as /// element values. An xml type attribute is embedded /// for each serialized element - a .NET type /// element is embedded for each complex type and /// prefixed with three underscores. /// </summary> /// <param name="writer"></param> public void WriteXml(System.Xml.XmlWriter writer) { foreach (string key in this.Keys) { TValue value = this[key]; Type type = null; if (value != null) type = value.GetType(); writer.WriteStartElement("item"); writer.WriteStartElement("key"); writer.WriteString(key as string); writer.WriteEndElement(); writer.WriteStartElement("value"); string xmlType = XmlUtils.MapTypeToXmlType(type); bool isCustom = false; // Type information attribute if not string if (value == null) { writer.WriteAttributeString("type", "nil"); } else if (!string.IsNullOrEmpty(xmlType)) { if (xmlType != "string") { writer.WriteStartAttribute("type"); writer.WriteString(xmlType); writer.WriteEndAttribute(); } } else { isCustom = true; xmlType = "___" + value.GetType().FullName; writer.WriteStartAttribute("type"); writer.WriteString(xmlType); writer.WriteEndAttribute(); } // Actual deserialization if (!isCustom) { if (value != null) writer.WriteValue(value); } else { XmlSerializer ser = new XmlSerializer(value.GetType()); ser.Serialize(writer, value); } writer.WriteEndElement(); // value writer.WriteEndElement(); // item } } /// <summary> /// Reads the custom serialized format /// </summary> /// <param name="reader"></param> public void ReadXml(System.Xml.XmlReader reader) { this.Clear(); while (reader.Read()) { if (reader.NodeType == XmlNodeType.Element && reader.Name == "key") { string xmlType = null; string name = reader.ReadElementContentAsString(); // item element reader.ReadToNextSibling("value"); if (reader.MoveToNextAttribute()) xmlType = reader.Value; reader.MoveToContent(); TValue value; if (xmlType == "nil") value = default(TValue); // null else if (string.IsNullOrEmpty(xmlType)) { // value is a string or object and we can assign TValue to value string strval = reader.ReadElementContentAsString(); value = (TValue) Convert.ChangeType(strval, typeof(TValue)); } else if (xmlType.StartsWith("___")) { while (reader.Read() && reader.NodeType != XmlNodeType.Element) { } Type type = ReflectionUtils.GetTypeFromName(xmlType.Substring(3)); //value = reader.ReadElementContentAs(type,null); XmlSerializer ser = new XmlSerializer(type); value = (TValue)ser.Deserialize(reader); } else value = (TValue)reader.ReadElementContentAs(XmlUtils.MapXmlTypeToType(xmlType), null); this.Add(name, value); } } } /// <summary> /// Serializes this dictionary to an XML string /// </summary> /// <returns>XML String or Null if it fails</returns> public string ToXml() { string xml = null; SerializationUtils.SerializeObject(this, out xml); return xml; } /// <summary> /// Deserializes from an XML string /// </summary> /// <param name="xml"></param> /// <returns>true or false</returns> public bool FromXml(string xml) { this.Clear(); // if xml string is empty we return an empty dictionary if (string.IsNullOrEmpty(xml)) return true; var result = SerializationUtils.DeSerializeObject(xml, this.GetType()) as PropertyBag<TValue>; if (result != null) { foreach (var item in result) { this.Add(item.Key, item.Value); } } else // null is a failure return false; return true; } /// <summary> /// Creates an instance of a propertybag from an Xml string /// </summary> /// <param name="xml"></param> /// <returns></returns> public static PropertyBag<TValue> CreateFromXml(string xml) { var bag = new PropertyBag<TValue>(); bag.FromXml(xml); return bag; } } } The code uses a couple of small helper classes SerializationUtils and XmlUtils for mapping Xml types to and from .NET, both of which are from the WestWind,Utilities project (which is the same project where PropertyBag lives) from the West Wind Web Toolkit. The code implements ReadXml and WriteXml for the IXmlSerializable implementation using old school XmlReaders and XmlWriters (because it's pretty simple stuff - no need for XLinq here). Then there are two helper methods .ToXml() and .FromXml() that basically allow your code to easily convert between XML and a PropertyBag object. In my code that's what I use to actually to persist to and from the entity XML property during .Load() and .Save() operations. It's sweet to be able to have a string key dictionary and then be able to turn around with 1 line of code to persist the whole thing to XML and back. Hopefully some of you will find this class as useful as I've found it. It's a simple solution to a common requirement in my applications and I've used the hell out of it in the short time since I created it. Resources You can find the complete code for the two classes plus the helpers in the Subversion repository for Westwind.Utilities. You can grab the source files from there or download the whole project. You can also grab the full Westwind.Utilities assembly from NuGet and add it to your project if that's easier for you. PropertyBag Source Code SerializationUtils and XmlUtils Westwind.Utilities Assembly on NuGet (add from Visual Studio) © Rick Strahl, West Wind Technologies, 2005-2011Posted in .NET CSharp Tweet (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

Read the article

Silverlight Recruiting Application Part 5 - Jobs Module / View

Now we starting getting into a more code-heavy portion of this series, thankfully though this means the groundwork is all set for the most part and after adding the modules we will have a complete application that can be provided with full source. The Jobs module will have two concerns- adding and maintaining jobs that can then be broadcast out to the website. How they are displayed on the site will be handled by our admin system (which will just poll from this common database), so we aren't too concerned with that, but rather with getting the information into the system and allowing the backend administration/HR users to keep things up to date. Since there is a fair bit of information that we want to display, we're going to move editing to a separate view so we can get all that information in an easy-to-use spot. With all the files created for this module, the project looks something like this: And now... on to the code. XAML for the Job Posting View All we really need for the Job Posting View is a RadGridView and a few buttons. This will let us both show off records and perform operations on the records without much hassle. That XAML is going to look something like this: 01.<Grid x:Name="LayoutRoot" 02.Background="White"> 03.<Grid.RowDefinitions> 04.<RowDefinition Height="30" /> 05.<RowDefinition /> 06.</Grid.RowDefinitions> 07.<StackPanel Orientation="Horizontal"> 08.<Button x:Name="xAddRecordButton" 09.Content="Add Job" 10.Width="120" 11.cal:Click.Command="{Binding AddRecord}" 12.telerik:StyleManager.Theme="Windows7" /> 13.<Button x:Name="xEditRecordButton" 14.Content="Edit Job" 15.Width="120" 16.cal:Click.Command="{Binding EditRecord}" 17.telerik:StyleManager.Theme="Windows7" /> 18.</StackPanel> 19.<telerikGrid:RadGridView x:Name="xJobsGrid" 20.Grid.Row="1" 21.IsReadOnly="True" 22.AutoGenerateColumns="False" 23.ColumnWidth="*" 24.RowDetailsVisibilityMode="VisibleWhenSelected" 25.ItemsSource="{Binding MyJobs}" 26.SelectedItem="{Binding SelectedJob, Mode=TwoWay}" 27.command:SelectedItemChangedEventClass.Command="{Binding SelectedItemChanged}"> 28.<telerikGrid:RadGridView.Columns> 29.<telerikGrid:GridViewDataColumn Header="Job Title" 30.DataMemberBinding="{Binding JobTitle}" 31.UniqueName="JobTitle" /> 32.<telerikGrid:GridViewDataColumn Header="Location" 33.DataMemberBinding="{Binding Location}" 34.UniqueName="Location" /> 35.<telerikGrid:GridViewDataColumn Header="Resume Required" 36.DataMemberBinding="{Binding NeedsResume}" 37.UniqueName="NeedsResume" /> 38.<telerikGrid:GridViewDataColumn Header="CV Required" 39.DataMemberBinding="{Binding NeedsCV}" 40.UniqueName="NeedsCV" /> 41.<telerikGrid:GridViewDataColumn Header="Overview Required" 42.DataMemberBinding="{Binding NeedsOverview}" 43.UniqueName="NeedsOverview" /> 44.<telerikGrid:GridViewDataColumn Header="Active" 45.DataMemberBinding="{Binding IsActive}" 46.UniqueName="IsActive" /> 47.</telerikGrid:RadGridView.Columns> 48.</telerikGrid:RadGridView> 49.</Grid> I'll explain what's happening here by line numbers: Lines 11 and 16: Using the same type of click commands as we saw in the Menu module, we tie the button clicks to delegate commands in the viewmodel. Line 25: The source for the jobs will be a collection in the viewmodel. Line 26: We also bind the selected item to a public property from the viewmodel for use in code. Line 27: We've turned the event into a command so we can handle it via code in the viewmodel. So those first three probably make sense to you as far as Silverlight/WPF binding magic is concerned, but for line 27... This actually comes from something I read onDamien Schenkelman's blog back in the day for creating an attached behavior from any event. So, any time you see me using command:Whatever.Command, the backing for it is actually something like this: SelectedItemChangedEventBehavior.cs: 01.public class SelectedItemChangedEventBehavior : CommandBehaviorBase<Telerik.Windows.Controls.DataControl> 02.{ 03.public SelectedItemChangedEventBehavior(DataControl element) 04.: base(element) 05.{ 06.element.SelectionChanged += new EventHandler<SelectionChangeEventArgs>(element_SelectionChanged); 07.} 08.void element_SelectionChanged(object sender, SelectionChangeEventArgs e) 09.{ 10.// We'll only ever allow single selection, so will only need item index 0 11.base.CommandParameter = e.AddedItems[0]; 12.base.ExecuteCommand(); 13.} 14.} SelectedItemChangedEventClass.cs: 01.public class SelectedItemChangedEventClass 02.{ 03.#region The Command Stuff 04.public static ICommand GetCommand(DependencyObject obj) 05.{ 06.return (ICommand)obj.GetValue(CommandProperty); 07.} 08.public static void SetCommand(DependencyObject obj, ICommand value) 09.{ 10.obj.SetValue(CommandProperty, value); 11.} 12.public static readonly DependencyProperty CommandProperty = 13.DependencyProperty.RegisterAttached("Command", typeof(ICommand), 14.typeof(SelectedItemChangedEventClass), new PropertyMetadata(OnSetCommandCallback)); 15.public static void OnSetCommandCallback(DependencyObject dependencyObject, DependencyPropertyChangedEventArgs e) 16.{ 17.DataControl element = dependencyObject as DataControl; 18.if (element != null) 19.{ 20.SelectedItemChangedEventBehavior behavior = GetOrCreateBehavior(element); 21.behavior.Command = e.NewValue as ICommand; 22.} 23.} 24.#endregion 25.public static SelectedItemChangedEventBehavior GetOrCreateBehavior(DataControl element) 26.{ 27.SelectedItemChangedEventBehavior behavior = element.GetValue(SelectedItemChangedEventBehaviorProperty) as SelectedItemChangedEventBehavior; 28.if (behavior == null) 29.{ 30.behavior = new SelectedItemChangedEventBehavior(element); 31.element.SetValue(SelectedItemChangedEventBehaviorProperty, behavior); 32.} 33.return behavior; 34.} 35.public static SelectedItemChangedEventBehavior GetSelectedItemChangedEventBehavior(DependencyObject obj) 36.{ 37.return (SelectedItemChangedEventBehavior)obj.GetValue(SelectedItemChangedEventBehaviorProperty); 38.} 39.public static void SetSelectedItemChangedEventBehavior(DependencyObject obj, SelectedItemChangedEventBehavior value) 40.{ 41.obj.SetValue(SelectedItemChangedEventBehaviorProperty, value); 42.} 43.public static readonly DependencyProperty SelectedItemChangedEventBehaviorProperty = 44.DependencyProperty.RegisterAttached("SelectedItemChangedEventBehavior", 45.typeof(SelectedItemChangedEventBehavior), typeof(SelectedItemChangedEventClass), null); 46.} These end up looking very similar from command to command, but in a nutshell you create a command based on any event, determine what the parameter for it will be, then execute. It attaches via XAML and ties to a DelegateCommand in the viewmodel, so you get the full event experience (since some controls get a bit event-rich for added functionality). Simple enough, right? Viewmodel for the Job Posting View The Viewmodel is going to need to handle all events going back and forth, maintaining interactions with the data we are using, and both publishing and subscribing to events. Rather than breaking this into tons of little pieces, I'll give you a nice view of the entire viewmodel and then hit up the important points line-by-line: 001.public class JobPostingViewModel : ViewModelBase 002.{ 003.private readonly IEventAggregator eventAggregator; 004.private readonly IRegionManager regionManager; 005.public DelegateCommand<object> AddRecord { get; set; } 006.public DelegateCommand<object> EditRecord { get; set; } 007.public DelegateCommand<object> SelectedItemChanged { get; set; } 008.public RecruitingContext context; 009.private QueryableCollectionView _myJobs; 010.public QueryableCollectionView MyJobs 011.{ 012.get { return _myJobs; } 013.} 014.private QueryableCollectionView _selectionJobActionHistory; 015.public QueryableCollectionView SelectedJobActionHistory 016.{ 017.get { return _selectionJobActionHistory; } 018.} 019.private JobPosting _selectedJob; 020.public JobPosting SelectedJob 021.{ 022.get { return _selectedJob; } 023.set 024.{ 025.if (value != _selectedJob) 026.{ 027._selectedJob = value; 028.NotifyChanged("SelectedJob"); 029.} 030.} 031.} 032.public SubscriptionToken editToken = new SubscriptionToken(); 033.public SubscriptionToken addToken = new SubscriptionToken(); 034.public JobPostingViewModel(IEventAggregator eventAgg, IRegionManager regionmanager) 035.{ 036.// set Unity items 037.this.eventAggregator = eventAgg; 038.this.regionManager = regionmanager; 039.// load our context 040.context = new RecruitingContext(); 041.this._myJobs = new QueryableCollectionView(context.JobPostings); 042.context.Load(context.GetJobPostingsQuery()); 043.// set command events 044.this.AddRecord = new DelegateCommand<object>(this.AddNewRecord); 045.this.EditRecord = new DelegateCommand<object>(this.EditExistingRecord); 046.this.SelectedItemChanged = new DelegateCommand<object>(this.SelectedRecordChanged); 047.SetSubscriptions(); 048.} 049.#region DelegateCommands from View 050.public void AddNewRecord(object obj) 051.{ 052.this.eventAggregator.GetEvent<AddJobEvent>().Publish(true); 053.} 054.public void EditExistingRecord(object obj) 055.{ 056.if (_selectedJob == null) 057.{ 058.this.eventAggregator.GetEvent<NotifyUserEvent>().Publish("No job selected."); 059.} 060.else 061.{ 062.this._myJobs.EditItem(this._selectedJob); 063.this.eventAggregator.GetEvent<EditJobEvent>().Publish(this._selectedJob); 064.} 065.} 066.public void SelectedRecordChanged(object obj) 067.{ 068.if (obj.GetType() == typeof(ActionHistory)) 069.{ 070.// event bubbles up so we don't catch items from the ActionHistory grid 071.} 072.else 073.{ 074.JobPosting job = obj as JobPosting; 075.GrabHistory(job.PostingID); 076.} 077.} 078.#endregion 079.#region Subscription Declaration and Events 080.public void SetSubscriptions() 081.{ 082.EditJobCompleteEvent editComplete = eventAggregator.GetEvent<EditJobCompleteEvent>(); 083.if (editToken != null) 084.editComplete.Unsubscribe(editToken); 085.editToken = editComplete.Subscribe(this.EditCompleteEventHandler); 086.AddJobCompleteEvent addComplete = eventAggregator.GetEvent<AddJobCompleteEvent>(); 087.if (addToken != null) 088.addComplete.Unsubscribe(addToken); 089.addToken = addComplete.Subscribe(this.AddCompleteEventHandler); 090.} 091.public void EditCompleteEventHandler(bool complete) 092.{ 093.if (complete) 094.{ 095.JobPosting thisJob = _myJobs.CurrentEditItem as JobPosting; 096.this._myJobs.CommitEdit(); 097.this.context.SubmitChanges((s) => 098.{ 099.ActionHistory myAction = new ActionHistory(); 100.myAction.PostingID = thisJob.PostingID; 101.myAction.Description = String.Format("Job '{0}' has been edited by {1}", thisJob.JobTitle, "default user"); 102.myAction.TimeStamp = DateTime.Now; 103.eventAggregator.GetEvent<AddActionEvent>().Publish(myAction); 104.} 105., null); 106.} 107.else 108.{ 109.this._myJobs.CancelEdit(); 110.} 111.this.MakeMeActive(this.regionManager, "MainRegion", "JobPostingsView"); 112.} 113.public void AddCompleteEventHandler(JobPosting job) 114.{ 115.if (job == null) 116.{ 117.// do nothing, new job add cancelled 118.} 119.else 120.{ 121.this.context.JobPostings.Add(job); 122.this.context.SubmitChanges((s) => 123.{ 124.ActionHistory myAction = new ActionHistory(); 125.myAction.PostingID = job.PostingID; 126.myAction.Description = String.Format("Job '{0}' has been added by {1}", job.JobTitle, "default user"); 127.myAction.TimeStamp = DateTime.Now; 128.eventAggregator.GetEvent<AddActionEvent>().Publish(myAction); 129.} 130., null); 131.} 132.this.MakeMeActive(this.regionManager, "MainRegion", "JobPostingsView"); 133.} 134.#endregion 135.public void GrabHistory(int postID) 136.{ 137.context.ActionHistories.Clear(); 138._selectionJobActionHistory = new QueryableCollectionView(context.ActionHistories); 139.context.Load(context.GetHistoryForJobQuery(postID)); 140.} Taking it from the top, we're injecting an Event Aggregator and Region Manager for use down the road and also have the public DelegateCommands (just like in the Menu module). We also grab a reference to our context, which we'll obviously need for data, then set up a few fields with public properties tied to them. We're also setting subscription tokens, which we have not yet seen but I will get into below. The AddNewRecord (50) and EditExistingRecord (54) methods should speak for themselves for functionality, the one thing of note is we're sending events off to the Event Aggregator which some module, somewhere will take care of. Since these aren't entirely relying on one another, the Jobs View doesn't care if anyone is listening, but it will publish AddJobEvent (52), NotifyUserEvent (58) and EditJobEvent (63)regardless. Don't mind the GrabHistory() method so much, that is just grabbing history items (visibly being created in the SubmitChanges callbacks), and adding them to the database. Every action will trigger a history event, so we'll know who modified what and when, just in case. ;) So where are we at? Well, if we click to Add a job, we publish an event, if we edit a job, we publish an event with the selected record (attained through the magic of binding). Where is this all going though? To the Viewmodel, of course! XAML for the AddEditJobView This is pretty straightforward except for one thing, noted below: 001.<Grid x:Name="LayoutRoot" 002.Background="White"> 003.<Grid x:Name="xEditGrid" 004.Margin="10" 005.validationHelper:ValidationScope.Errors="{Binding Errors}"> 006.<Grid.Background> 007.<LinearGradientBrush EndPoint="0.5,1" 008.StartPoint="0.5,0"> 009.<GradientStop Color="#FFC7C7C7" 010.Offset="0" /> 011.<GradientStop Color="#FFF6F3F3" 012.Offset="1" /> 013.</LinearGradientBrush> 014.</Grid.Background> 015.<Grid.RowDefinitions> 016.<RowDefinition Height="40" /> 017.<RowDefinition Height="40" /> 018.<RowDefinition Height="40" /> 019.<RowDefinition Height="100" /> 020.<RowDefinition Height="100" /> 021.<RowDefinition Height="100" /> 022.<RowDefinition Height="40" /> 023.<RowDefinition Height="40" /> 024.<RowDefinition Height="40" /> 025.</Grid.RowDefinitions> 026.<Grid.ColumnDefinitions> 027.<ColumnDefinition Width="150" /> 028.<ColumnDefinition Width="150" /> 029.<ColumnDefinition Width="300" /> 030.<ColumnDefinition Width="100" /> 031.</Grid.ColumnDefinitions> 032. 033.<TextBlock Margin="8" 034.Text="{Binding AddEditString}" 035.TextWrapping="Wrap" 036.Grid.Column="1" 037.Grid.ColumnSpan="2" 038.FontSize="16" /> 039. 040. 041.<TextBlock Margin="8,0,0,0" 042.Style="{StaticResource LabelTxb}" 043.Grid.Row="1" 044.Text="Job Title" 045.VerticalAlignment="Center" /> 046.<TextBox x:Name="xJobTitleTB" 047.Margin="0,8" 048.Grid.Column="1" 049.Grid.Row="1" 050.Text="{Binding activeJob.JobTitle, Mode=TwoWay, NotifyOnValidationError=True, ValidatesOnExceptions=True}" 051.Grid.ColumnSpan="2" /> 052.<TextBlock Margin="8,0,0,0" 053.Grid.Row="2" 054.Text="Location" 055.d:LayoutOverrides="Height" 056.VerticalAlignment="Center" /> 057.<TextBox x:Name="xLocationTB" 058.Margin="0,8" 059.Grid.Column="1" 060.Grid.Row="2" 061.Text="{Binding activeJob.Location, Mode=TwoWay, NotifyOnValidationError=True, ValidatesOnExceptions=True}" 062.Grid.ColumnSpan="2" /> 063. 064.<TextBlock Margin="8,11,8,0" 065.Grid.Row="3" 066.Text="Description" 067.TextWrapping="Wrap" 068.VerticalAlignment="Top" /> 069. 070.<TextBox x:Name="xDescriptionTB" 071.Height="84" 072.TextWrapping="Wrap" 073.ScrollViewer.VerticalScrollBarVisibility="Auto" 074.Grid.Column="1" 075.Grid.Row="3" 076.Text="{Binding activeJob.Description, Mode=TwoWay, NotifyOnValidationError=True, ValidatesOnExceptions=True}" 077.Grid.ColumnSpan="2" /> 078.<TextBlock Margin="8,11,8,0" 079.Grid.Row="4" 080.Text="Requirements" 081.TextWrapping="Wrap" 082.VerticalAlignment="Top" /> 083. 084.<TextBox x:Name="xRequirementsTB" 085.Height="84" 086.TextWrapping="Wrap" 087.ScrollViewer.VerticalScrollBarVisibility="Auto" 088.Grid.Column="1" 089.Grid.Row="4" 090.Text="{Binding activeJob.Requirements, Mode=TwoWay, NotifyOnValidationError=True, ValidatesOnExceptions=True}" 091.Grid.ColumnSpan="2" /> 092.<TextBlock Margin="8,11,8,0" 093.Grid.Row="5" 094.Text="Qualifications" 095.TextWrapping="Wrap" 096.VerticalAlignment="Top" /> 097. 098.<TextBox x:Name="xQualificationsTB" 099.Height="84" 100.TextWrapping="Wrap" 101.ScrollViewer.VerticalScrollBarVisibility="Auto" 102.Grid.Column="1" 103.Grid.Row="5" 104.Text="{Binding activeJob.Qualifications, Mode=TwoWay, NotifyOnValidationError=True, ValidatesOnExceptions=True}" 105.Grid.ColumnSpan="2" /> 106. 107. 108.<CheckBox x:Name="xResumeRequiredCB" Margin="8,8,8,15" 109.Content="Resume Required" 110.Grid.Row="6" 111.Grid.ColumnSpan="2" 112.IsChecked="{Binding activeJob.NeedsResume, Mode=TwoWay, NotifyOnValidationError=True, ValidatesOnExceptions=True}"/> 113. 114.<CheckBox x:Name="xCoverletterRequiredCB" Margin="8,8,8,15" 115.Content="Cover Letter Required" 116.Grid.Column="2" 117.Grid.Row="6" 118.IsChecked="{Binding activeJob.NeedsCV, Mode=TwoWay, NotifyOnValidationError=True, ValidatesOnExceptions=True}"/> 119. 120.<CheckBox x:Name="xOverviewRequiredCB" Margin="8,8,8,15" 121.Content="Overview Required" 122.Grid.Row="7" 123.Grid.ColumnSpan="2" 124.IsChecked="{Binding activeJob.NeedsOverview, Mode=TwoWay, NotifyOnValidationError=True, ValidatesOnExceptions=True}"/> 125. 126.<CheckBox x:Name="xJobActiveCB" Margin="8,8,8,15" 127.Content="Job is Active" 128.Grid.Column="2" 129.Grid.Row="7" 130.IsChecked="{Binding activeJob.IsActive, Mode=TwoWay, NotifyOnValidationError=True, ValidatesOnExceptions=True}"/> 131. 132. 133. 134.<Button x:Name="xAddEditButton" Margin="8,8,0,10" 135.Content="{Binding AddEditButtonString}" 136.cal:Click.Command="{Binding AddEditCommand}" 137.Grid.Column="2" 138.Grid.Row="8" 139.HorizontalAlignment="Left" 140.Width="125" 141.telerik:StyleManager.Theme="Windows7" /> 142. 143.<Button x:Name="xCancelButton" HorizontalAlignment="Right" 144.Content="Cancel" 145.cal:Click.Command="{Binding CancelCommand}" 146.Margin="0,8,8,10" 147.Width="125" 148.Grid.Column="2" 149.Grid.Row="8" 150.telerik:StyleManager.Theme="Windows7" /> 151.</Grid> 152.</Grid> The 'validationHelper:ValidationScope' line may seem odd. This is a handy little trick for catching current and would-be validation errors when working in this whole setup. This all comes from an approach found on theJoy Of Code blog, although it looks like the story for this will be changing slightly with new advances in SL4/WCF RIA Services, so this section can definitely get an overhaul a little down the road. The code is the fun part of all this, so let us see what's happening under the hood. Viewmodel for the AddEditJobView We are going to see some of the same things happening here, so I'll skip over the repeat info and get right to the good stuff: 001.public class AddEditJobViewModel : ViewModelBase 002.{ 003.private readonly IEventAggregator eventAggregator; 004.private readonly IRegionManager regionManager; 005. 006.public RecruitingContext context; 007. 008.private JobPosting _activeJob; 009.public JobPosting activeJob 010.{ 011.get { return _activeJob; } 012.set 013.{ 014.if (_activeJob != value) 015.{ 016._activeJob = value; 017.NotifyChanged("activeJob"); 018.} 019.} 020.} 021. 022.public bool isNewJob; 023. 024.private string _addEditString; 025.public string AddEditString 026.{ 027.get { return _addEditString; } 028.set 029.{ 030.if (_addEditString != value) 031.{ 032._addEditString = value; 033.NotifyChanged("AddEditString"); 034.} 035.} 036.} 037. 038.private string _addEditButtonString; 039.public string AddEditButtonString 040.{ 041.get { return _addEditButtonString; } 042.set 043.{ 044.if (_addEditButtonString != value) 045.{ 046._addEditButtonString = value; 047.NotifyChanged("AddEditButtonString"); 048.} 049.} 050.} 051. 052.public SubscriptionToken addJobToken = new SubscriptionToken(); 053.public SubscriptionToken editJobToken = new SubscriptionToken(); 054. 055.public DelegateCommand<object> AddEditCommand { get; set; } 056.public DelegateCommand<object> CancelCommand { get; set; } 057. 058.private ObservableCollection<ValidationError> _errors = new ObservableCollection<ValidationError>(); 059.public ObservableCollection<ValidationError> Errors 060.{ 061.get { return _errors; } 062.} 063. 064.private ObservableCollection<ValidationResult> _valResults = new ObservableCollection<ValidationResult>(); 065.public ObservableCollection<ValidationResult> ValResults 066.{ 067.get { return this._valResults; } 068.} 069. 070.public AddEditJobViewModel(IEventAggregator eventAgg, IRegionManager regionmanager) 071.{ 072.// set Unity items 073.this.eventAggregator = eventAgg; 074.this.regionManager = regionmanager; 075. 076.context = new RecruitingContext(); 077. 078.AddEditCommand = new DelegateCommand<object>(this.AddEditJobCommand); 079.CancelCommand = new DelegateCommand<object>(this.CancelAddEditCommand); 080. 081.SetSubscriptions(); 082.} 083. 084.#region Subscription Declaration and Events 085. 086.public void SetSubscriptions() 087.{ 088.AddJobEvent addJob = this.eventAggregator.GetEvent<AddJobEvent>(); 089. 090.if (addJobToken != null) 091.addJob.Unsubscribe(addJobToken); 092. 093.addJobToken = addJob.Subscribe(this.AddJobEventHandler); 094. 095.EditJobEvent editJob = this.eventAggregator.GetEvent<EditJobEvent>(); 096. 097.if (editJobToken != null) 098.editJob.Unsubscribe(editJobToken); 099. 100.editJobToken = editJob.Subscribe(this.EditJobEventHandler); 101.} 102. 103.public void AddJobEventHandler(bool isNew) 104.{ 105.this.activeJob = null; 106.this.activeJob = new JobPosting(); 107.this.activeJob.IsActive = true; // We assume that we want a new job to go up immediately 108.this.isNewJob = true; 109.this.AddEditString = "Add New Job Posting"; 110.this.AddEditButtonString = "Add Job"; 111. 112.MakeMeActive(this.regionManager, "MainRegion", "AddEditJobView"); 113.} 114. 115.public void EditJobEventHandler(JobPosting editJob) 116.{ 117.this.activeJob = null; 118.this.activeJob = editJob; 119.this.isNewJob = false; 120.this.AddEditString = "Edit Job Posting"; 121.this.AddEditButtonString = "Edit Job"; 122. 123.MakeMeActive(this.regionManager, "MainRegion", "AddEditJobView"); 124.} 125. 126.#endregion 127. 128.#region DelegateCommands from View 129. 130.public void AddEditJobCommand(object obj) 131.{ 132.if (this.Errors.Count > 0) 133.{ 134.List<string> errorMessages = new List<string>(); 135. 136.foreach (var valR in this.Errors) 137.{ 138.errorMessages.Add(valR.Exception.Message); 139.} 140. 141.this.eventAggregator.GetEvent<DisplayValidationErrorsEvent>().Publish(errorMessages); 142. 143.} 144.else if (!Validator.TryValidateObject(this.activeJob, new ValidationContext(this.activeJob, null, null), _valResults, true)) 145.{ 146.List<string> errorMessages = new List<string>(); 147. 148.foreach (var valR in this._valResults) 149.{ 150.errorMessages.Add(valR.ErrorMessage); 151.} 152. 153.this._valResults.Clear(); 154. 155.this.eventAggregator.GetEvent<DisplayValidationErrorsEvent>().Publish(errorMessages); 156.} 157.else 158.{ 159.if (this.isNewJob) 160.{ 161.this.eventAggregator.GetEvent<AddJobCompleteEvent>().Publish(this.activeJob); 162.} 163.else 164.{ 165.this.eventAggregator.GetEvent<EditJobCompleteEvent>().Publish(true); 166.} 167.} 168.} 169. 170.public void CancelAddEditCommand(object obj) 171.{ 172.if (this.isNewJob) 173.{ 174.this.eventAggregator.GetEvent<AddJobCompleteEvent>().Publish(null); 175.} 176.else 177.{ 178.this.eventAggregator.GetEvent<EditJobCompleteEvent>().Publish(false); 179.} 180.} 181. 182.#endregion 183.} 184.} We start seeing something new on line 103- the AddJobEventHandler will create a new job and set that to the activeJob item on the ViewModel. When this is all set, the view calls that familiar MakeMeActive method to activate itself. I made a bit of a management call on making views self-activate like this, but I figured it works for one reason. As I create this application, views may not exist that I have in mind, so after a view receives its 'ping' from being subscribed to an event, it prepares whatever it needs to do and then goes active. This way if I don't have 'edit' hooked up, I can click as the day is long on the main view and won't get lost in an empty region. Total personal preference here. :) Everything else should again be pretty straightforward, although I do a bit of validation checking in the AddEditJobCommand, which can either fire off an event back to the main view/viewmodel if everything is a success or sent a list of errors to our notification module, which pops open a RadWindow with the alerts if any exist. As a bonus side note, here's what my WCF RIA Services metadata looks like for handling all of the validation: private JobPostingMetadata() { } [StringLength(2500, ErrorMessage = "Description should be more than one and less than 2500 characters.", MinimumLength = 1)] [Required(ErrorMessage = "Description is required.")] public string Description; [Required(ErrorMessage="Active Status is Required")] public bool IsActive; [StringLength(100, ErrorMessage = "Posting title must be more than 3 but less than 100 characters.", MinimumLength = 3)] [Required(ErrorMessage = "Job Title is required.")] public bool JobTitle; [Required] public string Location; public bool NeedsCV; public bool NeedsOverview; public bool NeedsResume; public int PostingID; [Required(ErrorMessage="Qualifications are required.")] [StringLength(2500, ErrorMessage="Qualifications should be more than one and less than 2500 characters.", MinimumLength=1)] public string Qualifications; [StringLength(2500, ErrorMessage = "Requirements should be more than one and less than 2500 characters.", MinimumLength = 1)] [Required(ErrorMessage="Requirements are required.")] public string Requirements; The RecruitCB Alternative See all that Xaml I pasted above? Those are now two pieces sitting in the JobsView.xaml file now. The only real difference is that the xEditGrid now sits in the same place as xJobsGrid, with visibility swapping out between the two for a quick switch. I also took out all the cal: and command: command references and replaced Button events with clicks and the Grid selection command replaced with a SelectedItemChanged event. Also, at the bottom of the xEditGrid after the last button, I add a ValidationSummary (with Visibility=Collapsed) to catch any errors that are popping up. Simple as can be, and leads to this being the single code-behind file: 001.public partial class JobsView : UserControl 002.{ 003.public RecruitingContext context; 004.public JobPosting activeJob; 005.public bool isNew; 006.private ObservableCollection<ValidationResult> _valResults = new ObservableCollection<ValidationResult>(); 007.public ObservableCollection<ValidationResult> ValResults 008.{ 009.get { return this._valResults; } 010.} 011.public JobsView() 012.{ 013.InitializeComponent(); 014.this.Loaded += new RoutedEventHandler(JobsView_Loaded); 015.} 016.void JobsView_Loaded(object sender, RoutedEventArgs e) 017.{ 018.context = new RecruitingContext(); 019.xJobsGrid.ItemsSource = context.JobPostings; 020.context.Load(context.GetJobPostingsQuery()); 021.} 022.private void xAddRecordButton_Click(object sender, RoutedEventArgs e) 023.{ 024.activeJob = new JobPosting(); 025.isNew = true; 026.xAddEditTitle.Text = "Add a Job Posting"; 027.xAddEditButton.Content = "Add"; 028.xEditGrid.DataContext = activeJob; 029.HideJobsGrid(); 030.} 031.private void xEditRecordButton_Click(object sender, RoutedEventArgs e) 032.{ 033.activeJob = xJobsGrid.SelectedItem as JobPosting; 034.isNew = false; 035.xAddEditTitle.Text = "Edit a Job Posting"; 036.xAddEditButton.Content = "Edit"; 037.xEditGrid.DataContext = activeJob; 038.HideJobsGrid(); 039.} 040.private void xAddEditButton_Click(object sender, RoutedEventArgs e) 041.{ 042.if (!Validator.TryValidateObject(this.activeJob, new ValidationContext(this.activeJob, null, null), _valResults, true)) 043.{ 044.List<string> errorMessages = new List<string>(); 045.foreach (var valR in this._valResults) 046.{ 047.errorMessages.Add(valR.ErrorMessage); 048.} 049.this._valResults.Clear(); 050.ShowErrors(errorMessages); 051.} 052.else if (xSummary.Errors.Count > 0) 053.{ 054.List<string> errorMessages = new List<string>(); 055.foreach (var err in xSummary.Errors) 056.{ 057.errorMessages.Add(err.Message); 058.} 059.ShowErrors(errorMessages); 060.} 061.else 062.{ 063.if (this.isNew) 064.{ 065.context.JobPostings.Add(activeJob); 066.context.SubmitChanges((s) => 067.{ 068.ActionHistory thisAction = new ActionHistory(); 069.thisAction.PostingID = activeJob.PostingID; 070.thisAction.Description = String.Format("Job '{0}' has been edited by {1}", activeJob.JobTitle, "default user"); 071.thisAction.TimeStamp = DateTime.Now; 072.context.ActionHistories.Add(thisAction); 073.context.SubmitChanges(); 074.}, null); 075.} 076.else 077.{ 078.context.SubmitChanges((s) => 079.{ 080.ActionHistory thisAction = new ActionHistory(); 081.thisAction.PostingID = activeJob.PostingID; 082.thisAction.Description = String.Format("Job '{0}' has been added by {1}", activeJob.JobTitle, "default user"); 083.thisAction.TimeStamp = DateTime.Now; 084.context.ActionHistories.Add(thisAction); 085.context.SubmitChanges(); 086.}, null); 087.} 088.ShowJobsGrid(); 089.} 090.} 091.private void xCancelButton_Click(object sender, RoutedEventArgs e) 092.{ 093.ShowJobsGrid(); 094.} 095.private void ShowJobsGrid() 096.{ 097.xAddEditRecordButtonPanel.Visibility = Visibility.Visible; 098.xEditGrid.Visibility = Visibility.Collapsed; 099.xJobsGrid.Visibility = Visibility.Visible; 100.} 101.private void HideJobsGrid() 102.{ 103.xAddEditRecordButtonPanel.Visibility = Visibility.Collapsed; 104.xJobsGrid.Visibility = Visibility.Collapsed; 105.xEditGrid.Visibility = Visibility.Visible; 106.} 107.private void ShowErrors(List<string> errorList) 108.{ 109.string nm = "Errors received: \n"; 110.foreach (string anerror in errorList) 111.nm += anerror + "\n"; 112.RadWindow.Alert(nm); 113.} 114.} The first 39 lines should be pretty familiar, not doing anything too unorthodox to get this up and running. Once we hit the xAddEditButton_Click on line 40, we're still doing pretty much the same things except instead of checking the ValidationHelper errors, we both run a check on the current activeJob object as well as check the ValidationSummary errors list. Once that is set, we again use the callback of context.SubmitChanges (lines 68 and 78) to create an ActionHistory which we will use to track these items down the line. That's all? Essentially... yes. If you look back through this post, most of the code and adventures we have taken were just to get things working in the MVVM/Prism setup. Since I have the whole 'module' self-contained in a single JobView+code-behind setup, I don't have to worry about things like sending events off into space for someone to pick up, communicating through an Infrastructure project, or even re-inventing events to be used with attached behaviors. Everything just kinda works, and again with much less code. Here's a picture of the MVVM and Code-behind versions on the Jobs and AddEdit views, but since the functionality is the same in both apps you still cannot tell them apart (for two-strike): Looking ahead, the Applicants module is effectively the same thing as the Jobs module, so most of the code is being cut-and-pasted back and forth with minor tweaks here and there. So that one is being taken care of by me behind the scenes. Next time, we get into a new world of fun- the interview scheduling module, which will pull from available jobs and applicants for each interview being scheduled, tying everything together with RadScheduler to the rescue. Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

Read the article

PHP contact form font color is stuck to black

- by Richard Hayward

No matter what i try strangely enough my form and thank you page both php files one with embed font coloring and one with an external style sheet refuse to change font color from black. thank you php file: http://www.richiesportfolio.com/contact/thank-you.php contact form php file: http://www.richiesportfolio.com/contact/contactform.php Everything else works purfectly but changing the contact forms font color contactform.php ` <?PHP require_once("./include/fgcontactform.php"); $formproc = new FGContactForm(); if(isset($_POST['submitted'])) { if($formproc->ProcessForm()) { $formproc->RedirectToURL("thank-you.php"); } } ?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US"> <head> <meta http-equiv='Content-Type' content='text/html; charset=utf-8'/> <title>Contact us</title> <link rel="stylesheet" type="text/css" href="http://www.richiesportfolio.com/contact/contact.css" /> <script type='text/javascript' src='scripts/gen_validatorv31.js'></script> </head> <body>  <form id='contactus' action='<?php echo $formproc->GetSelfScript(); ?>' method='post' accept-charset='UTF-8'> <fieldset > <legend>Contact Me</legend> <input type='hidden' name='submitted' id='submitted' value='1'/> <input type='hidden' name='<?php echo $formproc->GetFormIDInputName(); ?>' value='<?php echo $formproc->GetFormIDInputValue(); ?>'/> <input type='text' class='spmhidip' name='<?php echo $formproc->GetSpamTrapInputName(); ?>' /> <div class='short_explanation'>* required fields</div> <div><span class='error'><?php echo $formproc->GetErrorMessage(); ?></span></div> <div class='container'> <label for='name' >Your Full Name*: </label><br/> <input type='text' name='name' id='name' value='<?php echo $formproc->SafeDisplay('name') ?>' maxlength="50" /><br/> <span id='contactus_name_errorloc' class='error'></span> </div> <div class='container'> <label for='email' >Email Address*:</label><br/> <input type='text' name='email' id='email' value='<?php echo $formproc->SafeDisplay('email') ?>' maxlength="50" /><br/> <span id='contactus_email_errorloc' class='error'></span> </div> <div class='container'> <label for='message' >Message:</label><br/> <span id='contactus_message_errorloc' class='error'></span> <textarea rows="10" cols="50" name='message' id='message'><?php echo $formproc->SafeDisplay('message') ?></textarea> </div> <div class='container'> <input type='submit' name='Submit' value='Submit' /> </div> </fieldset> </form>  <script type='text/javascript'> // <![CDATA[ var frmvalidator = new Validator("contactus"); frmvalidator.EnableOnPageErrorDisplay(); frmvalidator.EnableMsgsTogether(); frmvalidator.addValidation("name","req","Please provide your name"); frmvalidator.addValidation("email","req","Please provide your email address"); frmvalidator.addValidation("email","email","Please provide a valid email address"); frmvalidator.addValidation("message","maxlen=2048","The message is too long!(more than 2KB!)"); // ]]> </script> </body> </html>` contact.css body,table,tr,td,a,p,h1,h2,h3,h4,h5,input,h3 a,h4 a,h5 ul, li, ul, a { color:#FFF; } #contactus fieldset { width:320px; padding:20px; border:20px; -moz-border-radius: 10px; -webkit-border-radius: 10px; -khtml-border-radius: 10px; border-radius: 10px; } #contactus legend, h2 { font-family : Arial, sans-serif; font-size:1.3em; font-weight:bold; color:#FFF; } #contactus label { font-family : Arial, sans-serif; font-size:0.8em; font-weight: bold; color:#FFF; } #contactus input[type="text"],textarea { font-family : Arial, Verdana, sans-serif; font-size: 0.8em; line-height:140%; color : #000; padding : 3px; border : 1px solid; #999; -moz-border-radius: 5px; -webkit-border-radius: 5px; -khtml-border-radius: 5px; border-radius: 5px; } #contactus input[type="text"] { height:18px; width:220px; -webkit-border-radius: 5px; -moz-border-radius: 5px; border-radius: 5px; } #contactus #scaptcha { width:60px; height:18px; } #contactus input[type="submit"] { width:100px; height:30px; padding-left:0px; -webkit-border-radius: 5px; -moz-border-radius: 5px; border-radius: 5px; } #contactus textarea { height:120px; width:310px; -webkit-border-radius:8px; -moz-border-radius: 8px; border-radius: 8px; } #contactus input[type="text"]:focus,textarea:focus { color : #009; border : 1px solid #990000; background-color : #ffff99; font-weight:bold; } #contactus .container { margin-top:8px; margin-bottom:10px; color:#FFF; } #contactus .error { font-family: Verdana, Arial, sans-serif; font-size: 0.7em; color: #900; background-color : #111; } #contactus fieldset#antispam { padding:2px; border-top:1px solid #EEE; border-left:0; border-right:0; border-bottom:0; width:350px; } #contactus fieldset#antispam legend { font-family : Arial, sans-serif; font-size: 0.8em; font-weight:bold; color:#FFF; } #contactus .short_explanation { font-family : Arial, sans-serif; font-size: 0.6em; color:#FFF; } /* spam_trap: This input is hidden. This is here to trick the spam bots*/ #contactus .spmhidip { display:none; width:10px; height:3px; } #fg_crdiv { font-family : Arial, sans-serif; font-size: 0.3em; opacity: .2; -moz-opacity: .2; filter: alpha(opacity=20); } #fg_crdiv p { display:none; }

Read the article

jQuery Datatables throws error when dynamically created row headers

- by JM4

I am using the Datatables jquery plugin for one of my projects. For one in particular, the number of columns can vary based on how many children a consumer has (yes I realize normalization and proper technique would insert on another row but it is a client requirement). Datatables must be set up as such: <table> <thead> <tr> <th></th> </tr> </thead> <tbody> <tr> <td></td> </tr> </tbody> </table> my script starts out as: <table cellpadding="0" cellspacing="0" border="0" class="display" id="sortable"> <thead> <tr> <th>parent name</th> <th>parent phone</th> <?php try { $db->beginTransaction(); $stmt = $db->prepare("SELECT max(num_deps) FROM (SELECT count(a.id) as num_deps FROM children a INNER JOIN parents b USING(id) WHERE a.id !=0 GROUP BY a.id) x"); $stmt->execute(); $rows = $stmt->fetchAll(); for($i=1; $i<=$rows[0][0]; $i++) { echo " <th>Child Name ".$i."</th> <th>Date of Birth ".$i."</th> "; } $db->commit(); } catch (PDOException $e) { echo "<p align='center'>There was a system error. Please contact administration.<br>".$e->getMessage()."</p><br />"; } ?> </tr> </thead> In this manner, the final column headers can be 1 or 50 spots long. However, with this dynamic code in place, datatables throws the following error: ""DataTables warning (table id = 'datatable'): Cannot reinitialise DataTable. To retrieve the DataTables object for this table, please pass either no arguments to the dataTable() function, or set bRetrive to true. Alternativly, to destroy old table and create a new one...ETC."' Yes I have set "bRetrieve" : true in the javascript above and that does not do the trick. If I remove the code above, the file "works" fine but it leaves off the necessary columns for my table. Any ideas? Displaying JS <script type="text/javascript" src="//ajax.googleapis.com/ajax/libs/jqueryui/1.8.6/jquery-ui.min.js"></script> <script type="text/javascript" src="../media/js/jquery.dataTables.min.js"></script> <script type="text/javascript" src="../media/js/TableTools/TableTools.js"></script> <script type="text/javascript" src="../media/ZeroClipboard/ZeroClipboard.js"></script> <script type="text/javascript"> $(document).ready(function() { TableToolsInit.sSwfPath = "../media/swf/ZeroClipboard.swf"; oTable = $('#sortable').dataTable({ "bRetrieve": true, "bProcessing": true, "sScrollX": "100%", "sScrollXInner": "110%", "bScrollCollapse": true, "bJQueryUI": true, "sPaginationType": "full_numbers", "sDom": 'T<"clear"><"fg-toolbar ui-widget-header ui-corner-tl ui-corner-tr ui-helper-clearfix"lfr>t<"fg-toolbar ui-widget-header ui-corner-bl ui-corner-br ui-helper-clearfix"ip>' }); }); </script> </head> TOP piece of HTML <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>Home</title> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> <link rel="stylesheet" type="text/css" href="style.css" /> <link rel="stylesheet" type="text/css" href="default.css" /> <script type="text/javascript" src="//ajax.googleapis.com/ajax/libs/jquery/1.4.3/jquery.min.js"></script> <style type="text/css" title="currentStyle"> @import "TableTools.css"; @import "demo_table_jui.css"; @import "jquery-ui-1.8.4.custom.css"; </style> <script type="text/javascript" src="//ajax.googleapis.com/ajax/libs/jqueryui/1.8.6/jquery-ui.min.js"></script> <script type="text/javascript" src="js/jquery.dataTables.min.js"></script> <script type="text/javascript" src="js/TableTools/TableTools.js"></script> <script type="text/javascript" src="ZeroClipboard/ZeroClipboard.js"></script> <script type="text/javascript"> $(document).ready(function() { TableToolsInit.sSwfPath = "ZeroClipboard.swf"; oTable = $('#sortable').dataTable({ "bRetrieve": true, "bProcessing": true, "sScrollX": "100%", "sScrollXInner": "110%", "bScrollCollapse": true, "bJQueryUI": true, "sPaginationType": "full_numbers", "sDom": 'T<"clear"><"fg-toolbar ui-widget-header ui-corner-tl ui-corner-tr ui-helper-clearfix"lfr>t<"fg-toolbar ui-widget-header ui-corner-bl ui-corner-br ui-helper-clearfix"ip>' }); }); </script> </head> <body bgcolor="#e0e0e0"> <div class="main"> <div class="body"> <div class="body_resize"> <div class="liquid-round"> <div class="top"><span><h2>Details</h2></span></div> <div class="center-content"> <div style="overflow-x:hidden; min-height:400px; max-height:600px; overflow-y:auto;"> <div class="demo_jui"><br /> <table cellpadding="0" cellspacing="0" border="0" class="display" width="100%" id="sortable"> <thead> <tr> <th>First Name</th> <th>MI</th> <th>Last Name</th> <th>Street Address</th> <th>City</th> <th>State</th> <th>Zip</th> <th>DOB</th> <th>Gender</th> <th>Spouse Name</th> <th>Spouse Date of Birth</th>  <th>Dependent Child Name 1</th> <th>Dependent Date of Birth 1</th> <th>Dependent Child Name 2</th> <th>Dependent Date of Birth 2</th> <th>Dependent Child Name 3</th> <th>Dependent Date of Birth 3</th> <th>Dependent Child Name 4</th> <th>Dependent Date of Birth 4</th> <th>Dependent Child Name 5</th> <th>Dependent Date of Birth 5</th> <th>Dependent Child Name 6</th> <th>Dependent Date of Birth 6</th> <th>Dependent Child Name 7</th> <th>Dependent Date of Birth 7</th> </tr> </thead> <tbody> <tr> ... UPDATE REGARDING COMMENTS/ANSWERS I have received a number of responses indicating the number of headers may not match the field count in the body. As I mention below, eliminating the php script below altogether would eliminate 5+ fields in the header and without question throw the count match off balance. This DOES NOT however cause an error and in fact "resolves" the issue in that datatables functions properly (even though there is NO header record for 5+ fields in the body.

Read the article

How do you convert a parent-child (adjacency) table to a nested set using PHP and MySQL?

- by mrbinky3000

I've spent the last few hours trying to find the solution to this question online. I've found plenty of examples on how to convert from nested set to adjacency... but few that go the other way around. The examples I have found either don't work or use MySQL procedures. Unfortunately, I can't use procedures for this project. I need a pure PHP solution. I have a table that uses the adjacency model below: id parent_id category 1 0 ROOT_NODE 2 1 Books 3 1 CD's 4 1 Magazines 5 2 Books/Hardcover 6 2 Books/Large Format 7 4 Magazines/Vintage And I would like to convert it to a Nested Set table below: id left right category 1 1 14 Root Node 2 2 7 Books 3 3 4 Books/Hardcover 4 5 6 Books/Large Format 5 8 9 CD's 6 10 13 Magazines 7 11 12 Magazines/Vintage Here is an image of what I need: I have a function, based on the pseudo code from this forum post (http://www.sitepoint.com/forums/showthread.php?t=320444) but it doesn't work. I get multiple rows that have the same value for left. This should not happen. <?php /** -- -- Table structure for table `adjacent_table` -- CREATE TABLE IF NOT EXISTS `adjacent_table` ( `id` int(11) NOT NULL AUTO_INCREMENT, `father_id` int(11) DEFAULT NULL, `category` varchar(128) DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=8 ; -- -- Dumping data for table `adjacent_table` -- INSERT INTO `adjacent_table` (`id`, `father_id`, `category`) VALUES (1, 0, 'ROOT'), (2, 1, 'Books'), (3, 1, 'CD''s'), (4, 1, 'Magazines'), (5, 2, 'Hard Cover'), (6, 2, 'Large Format'), (7, 4, 'Vintage'); -- -- Table structure for table `nested_table` -- CREATE TABLE IF NOT EXISTS `nested_table` ( `id` int(11) NOT NULL AUTO_INCREMENT, `lft` int(11) DEFAULT NULL, `rgt` int(11) DEFAULT NULL, `category` varchar(128) DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ; */ mysql_connect('localhost','USER','PASSWORD') or die(mysql_error()); mysql_select_db('DATABASE') or die(mysql_error()); adjacent_to_nested(0); /** * adjacent_to_nested * * Reads a "adjacent model" table and converts it to a "Nested Set" table. * @param integer $i_id Should be the id of the "root node" in the adjacent table; * @param integer $i_left Should only be used on recursive calls. Holds the current value for lft */ function adjacent_to_nested($i_id, $i_left = 0) { // the right value of this node is the left value + 1 $i_right = $i_left + 1; // get all children of this node $a_children = get_source_children($i_id); foreach ($a_children as $a) { // recursive execution of this function for each child of this node // $i_right is the current right value, which is incremented by the // import_from_dc_link_category method $i_right = adjacent_to_nested($a['id'], $i_right); // insert stuff into the our new "Nested Sets" table $s_query = " INSERT INTO `nested_table` (`id`, `lft`, `rgt`, `category`) VALUES( NULL, '".$i_left."', '".$i_right."', '".mysql_real_escape_string($a['category'])."' ) "; if (!mysql_query($s_query)) { echo "<pre>$s_query</pre>\n"; throw new Exception(mysql_error()); } echo "<p>$s_query</p>\n"; // get the newly created row id $i_new_nested_id = mysql_insert_id(); } return $i_right + 1; } /** * get_source_children * * Examines the "adjacent" table and finds all the immediate children of a node * @param integer $i_id The unique id for a node in the adjacent_table table * @return array Returns an array of results or an empty array if no results. */ function get_source_children($i_id) { $a_return = array(); $s_query = "SELECT * FROM `adjacent_table` WHERE `father_id` = '".$i_id."'"; if (!$i_result = mysql_query($s_query)) { echo "<pre>$s_query</pre>\n"; throw new Exception(mysql_error()); } if (mysql_num_rows($i_result) > 0) { while($a = mysql_fetch_assoc($i_result)) { $a_return[] = $a; } } return $a_return; } ?> This is the output of the above script. INSERT INTO nested_table (id, lft, rgt, category) VALUES( NULL, '2', '5', 'Hard Cover' ) INSERT INTO nested_table (id, lft, rgt, category) VALUES( NULL, '2', '7', 'Large Format' ) INSERT INTO nested_table (id, lft, rgt, category) VALUES( NULL, '1', '8', 'Books' ) INSERT INTO nested_table (id, lft, rgt, category) VALUES( NULL, '1', '10', 'CD\'s' ) INSERT INTO nested_table (id, lft, rgt, category) VALUES( NULL, '10', '13', 'Vintage' ) INSERT INTO nested_table (id, lft, rgt, category) VALUES( NULL, '1', '14', 'Magazines' ) INSERT INTO nested_table (id, lft, rgt, category) VALUES( NULL, '0', '15', 'ROOT' ) As you can see, there are multiple rows sharing the lft value of "1" same goes for "2" In a nested-set, the values for left and right must be unique. Here is an example of how to manually number the left and right ID's in a nested set: UPDATE - PROBLEM SOLVED First off, I had mistakenly believed that the source table (the one in adjacent-lists format) needed to be altered to include a source node. This is not the case. Secondly, I found a cached page on BING (of all places) with a class that does the trick. I've altered it for PHP5 and converted the original author's mysql related bits to basic PHP. He was using some DB class. You can convert them to your own database abstraction class later if you want. Obviously, if your "source table" has other columns that you want to move to the nested set table, you will have to adjust the write method in the class below. Hopefully this will save someone else from the same problems in the future. <?php /** -- -- Table structure for table `adjacent_table` -- DROP TABLE IF EXISTS `adjacent_table`; CREATE TABLE IF NOT EXISTS `adjacent_table` ( `id` int(11) NOT NULL AUTO_INCREMENT, `father_id` int(11) DEFAULT NULL, `category` varchar(128) DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=8 ; -- -- Dumping data for table `adjacent_table` -- INSERT INTO `adjacent_table` (`id`, `father_id`, `category`) VALUES (1, 0, 'Books'), (2, 0, 'CD''s'), (3, 0, 'Magazines'), (4, 1, 'Hard Cover'), (5, 1, 'Large Format'), (6, 3, 'Vintage'); -- -- Table structure for table `nested_table` -- DROP TABLE IF EXISTS `nested_table`; CREATE TABLE IF NOT EXISTS `nested_table` ( `lft` int(11) NOT NULL DEFAULT '0', `rgt` int(11) DEFAULT NULL, `id` int(11) DEFAULT NULL, `category` varchar(128) DEFAULT NULL, PRIMARY KEY (`lft`), UNIQUE KEY `id` (`id`), UNIQUE KEY `rgt` (`rgt`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1; */ /** * @class tree_transformer * @author Paul Houle, Matthew Toledo * @created 2008-11-04 * @url http://gen5.info/q/2008/11/04/nested-sets-php-verb-objects-and-noun-objects/ */ class tree_transformer { private $i_count; private $a_link; public function __construct($a_link) { if(!is_array($a_link)) throw new Exception("First parameter should be an array. Instead, it was type '".gettype($a_link)."'"); $this->i_count = 1; $this->a_link= $a_link; } public function traverse($i_id) { $i_lft = $this->i_count; $this->i_count++; $a_kid = $this->get_children($i_id); if ($a_kid) { foreach($a_kid as $a_child) { $this->traverse($a_child); } } $i_rgt=$this->i_count; $this->i_count++; $this->write($i_lft,$i_rgt,$i_id); } private function get_children($i_id) { return $this->a_link[$i_id]; } private function write($i_lft,$i_rgt,$i_id) { // fetch the source column $s_query = "SELECT * FROM `adjacent_table` WHERE `id` = '".$i_id."'"; if (!$i_result = mysql_query($s_query)) { echo "<pre>$s_query</pre>\n"; throw new Exception(mysql_error()); } $a_source = array(); if (mysql_num_rows($i_result)) { $a_source = mysql_fetch_assoc($i_result); } // root node? label it unless already labeled in source table if (1 == $i_lft && empty($a_source['category'])) { $a_source['category'] = 'ROOT'; } // insert into the new nested tree table // use mysql_real_escape_string because one value "CD's" has a single ' $s_query = " INSERT INTO `nested_table` (`id`,`lft`,`rgt`,`category`) VALUES ( '".$i_id."', '".$i_lft."', '".$i_rgt."', '".mysql_real_escape_string($a_source['category'])."' ) "; if (!$i_result = mysql_query($s_query)) { echo "<pre>$s_query</pre>\n"; throw new Exception(mysql_error()); } else { // success: provide feedback echo "<p>$s_query</p>\n"; } } } mysql_connect('localhost','USER','PASSWORD') or die(mysql_error()); mysql_select_db('DATABASE') or die(mysql_error()); // build a complete copy of the adjacency table in ram $s_query = "SELECT `id`,`father_id` FROM `adjacent_table`"; $i_result = mysql_query($s_query); $a_rows = array(); while ($a_rows[] = mysql_fetch_assoc($i_result)); $a_link = array(); foreach($a_rows as $a_row) { $i_father_id = $a_row['father_id']; $i_child_id = $a_row['id']; if (!array_key_exists($i_father_id,$a_link)) { $a_link[$i_father_id]=array(); } $a_link[$i_father_id][]=$i_child_id; } $o_tree_transformer = new tree_transformer($a_link); $o_tree_transformer->traverse(0); ?>

Read the article

Zen and the Art of File and Folder Organization

- by Mark Virtue

Is your desk a paragon of neatness, or does it look like a paper-bomb has gone off? If you’ve been putting off getting organized because the task is too huge or daunting, or you don’t know where to start, we’ve got 40 tips to get you on the path to zen mastery of your filing system. For all those readers who would like to get their files and folders organized, or, if they’re already organized, better organized—we have compiled a complete guide to getting organized and staying organized, a comprehensive article that will hopefully cover every possible tip you could want. Signs that Your Computer is Poorly Organized If your computer is a mess, you’re probably already aware of it. But just in case you’re not, here are some tell-tale signs: Your Desktop has over 40 icons on it “My Documents” contains over 300 files and 60 folders, including MP3s and digital photos You use the Windows’ built-in search facility whenever you need to find a file You can’t find programs in the out-of-control list of programs in your Start Menu You save all your Word documents in one folder, all your spreadsheets in a second folder, etc Any given file that you’re looking for may be in any one of four different sets of folders But before we start, here are some quick notes: We’re going to assume you know what files and folders are, and how to create, save, rename, copy and delete them The organization principles described in this article apply equally to all computer systems. However, the screenshots here will reflect how things look on Windows (usually Windows 7). We will also mention some useful features of Windows that can help you get organized. Everyone has their own favorite methodology of organizing and filing, and it’s all too easy to get into “My Way is Better than Your Way” arguments. The reality is that there is no perfect way of getting things organized. When I wrote this article, I tried to keep a generalist and objective viewpoint. I consider myself to be unusually well organized (to the point of obsession, truth be told), and I’ve had 25 years experience in collecting and organizing files on computers. So I’ve got a lot to say on the subject. But the tips I have described here are only one way of doing it. Hopefully some of these tips will work for you too, but please don’t read this as any sort of “right” way to do it. At the end of the article we’ll be asking you, the reader, for your own organization tips. Why Bother Organizing At All? For some, the answer to this question is self-evident. And yet, in this era of powerful desktop search software (the search capabilities built into the Windows Vista and Windows 7 Start Menus, and third-party programs like Google Desktop Search), the question does need to be asked, and answered. I have a friend who puts every file he ever creates, receives or downloads into his My Documents folder and doesn’t bother filing them into subfolders at all. He relies on the search functionality built into his Windows operating system to help him find whatever he’s looking for. And he always finds it. He’s a Search Samurai. For him, filing is a waste of valuable time that could be spent enjoying life! It’s tempting to follow suit. On the face of it, why would anyone bother to take the time to organize their hard disk when such excellent search software is available? Well, if all you ever want to do with the files you own is to locate and open them individually (for listening, editing, etc), then there’s no reason to ever bother doing one scrap of organization. But consider these common tasks that are not achievable with desktop search software: Find files manually. Often it’s not convenient, speedy or even possible to utilize your desktop search software to find what you want. It doesn’t work 100% of the time, or you may not even have it installed. Sometimes its just plain faster to go straight to the file you want, if you know it’s in a particular sub-folder, rather than trawling through hundreds of search results. Find groups of similar files (e.g. all your “work” files, all the photos of your Europe holiday in 2008, all your music videos, all the MP3s from Dark Side of the Moon, all your letters you wrote to your wife, all your tax returns). Clever naming of the files will only get you so far. Sometimes it’s the date the file was created that’s important, other times it’s the file format, and other times it’s the purpose of the file. How do you name a collection of files so that they’re easy to isolate based on any of the above criteria? Short answer, you can’t. Move files to a new computer. It’s time to upgrade your computer. How do you quickly grab all the files that are important to you? Or you decide to have two computers now – one for home and one for work. How do you quickly isolate only the work-related files to move them to the work computer? Synchronize files to other computers. If you have more than one computer, and you need to mirror some of your files onto the other computer (e.g. your music collection), then you need a way to quickly determine which files are to be synced and which are not. Surely you don’t want to synchronize everything? Choose which files to back up. If your backup regime calls for multiple backups, or requires speedy backups, then you’ll need to be able to specify which files are to be backed up, and which are not. This is not possible if they’re all in the same folder. Finally, if you’re simply someone who takes pleasure in being organized, tidy and ordered (me! me!), then you don’t even need a reason. Being disorganized is simply unthinkable. Tips on Getting Organized Here we present our 40 best tips on how to get organized. Or, if you’re already organized, to get better organized. Tip #1. Choose Your Organization System Carefully The reason that most people are not organized is that it takes time. And the first thing that takes time is deciding upon a system of organization. This is always a matter of personal preference, and is not something that a geek on a website can tell you. You should always choose your own system, based on how your own brain is organized (which makes the assumption that your brain is, in fact, organized). We can’t instruct you, but we can make suggestions: You may want to start off with a system based on the users of the computer. i.e. “My Files”, “My Wife’s Files”, My Son’s Files”, etc. Inside “My Files”, you might then break it down into “Personal” and “Business”. You may then realize that there are overlaps. For example, everyone may want to share access to the music library, or the photos from the school play. So you may create another folder called “Family”, for the “common” files. You may decide that the highest-level breakdown of your files is based on the “source” of each file. In other words, who created the files. You could have “Files created by ME (business or personal)”, “Files created by people I know (family, friends, etc)”, and finally “Files created by the rest of the world (MP3 music files, downloaded or ripped movies or TV shows, software installation files, gorgeous desktop wallpaper images you’ve collected, etc).” This system happens to be the one I use myself. See below: Mark is for files created by meVC is for files created by my company (Virtual Creations)Others is for files created by my friends and familyData is the rest of the worldAlso, Settings is where I store the configuration files and other program data files for my installed software (more on this in tip #34, below). Each folder will present its own particular set of requirements for further sub-organization. For example, you may decide to organize your music collection into sub-folders based on the artist’s name, while your digital photos might get organized based on the date they were taken. It can be different for every sub-folder! Another strategy would be based on “currentness”. Files you have yet to open and look at live in one folder. Ones that have been looked at but not yet filed live in another place. Current, active projects live in yet another place. All other files (your “archive”, if you like) would live in a fourth folder. (And of course, within that last folder you’d need to create a further sub-system based on one of the previous bullet points). Put some thought into this – changing it when it proves incomplete can be a big hassle! Before you go to the trouble of implementing any system you come up with, examine a wide cross-section of the files you own and see if they will all be able to find a nice logical place to sit within your system. Tip #2. When You Decide on Your System, Stick to It! There’s nothing more pointless than going to all the trouble of creating a system and filing all your files, and then whenever you create, receive or download a new file, you simply dump it onto your Desktop. You need to be disciplined – forever! Every new file you get, spend those extra few seconds to file it where it belongs! Otherwise, in just a month or two, you’ll be worse off than before – half your files will be organized and half will be disorganized – and you won’t know which is which! Tip #3. Choose the Root Folder of Your Structure Carefully Every data file (document, photo, music file, etc) that you create, own or is important to you, no matter where it came from, should be found within one single folder, and that one single folder should be located at the root of your C: drive (as a sub-folder of C:\). In other words, do not base your folder structure in standard folders like “My Documents”. If you do, then you’re leaving it up to the operating system engineers to decide what folder structure is best for you. And every operating system has a different system! In Windows 7 your files are found in C:\Users\YourName, whilst on Windows XP it was C:\Documents and Settings\YourName\My Documents. In UNIX systems it’s often /home/YourName. These standard default folders tend to fill up with junk files and folders that are not at all important to you. “My Documents” is the worst offender. Every second piece of software you install, it seems, likes to create its own folder in the “My Documents” folder. These folders usually don’t fit within your organizational structure, so don’t use them! In fact, don’t even use the “My Documents” folder at all. Allow it to fill up with junk, and then simply ignore it. It sounds heretical, but: Don’t ever visit your “My Documents” folder! Remove your icons/links to “My Documents” and replace them with links to the folders you created and you care about! Create your own file system from scratch! Probably the best place to put it would be on your D: drive – if you have one. This way, all your files live on one drive, while all the operating system and software component files live on the C: drive – simply and elegantly separated. The benefits of that are profound. Not only are there obvious organizational benefits (see tip #10, below), but when it comes to migrate your data to a new computer, you can (sometimes) simply unplug your D: drive and plug it in as the D: drive of your new computer (this implies that the D: drive is actually a separate physical disk, and not a partition on the same disk as C:). You also get a slight speed improvement (again, only if your C: and D: drives are on separate physical disks). Warning: From tip #12, below, you will see that it’s actually a good idea to have exactly the same file system structure – including the drive it’s filed on – on all of the computers you own. So if you decide to use the D: drive as the storage system for your own files, make sure you are able to use the D: drive on all the computers you own. If you can’t ensure that, then you can still use a clever geeky trick to store your files on the D: drive, but still access them all via the C: drive (see tip #17, below). If you only have one hard disk (C:), then create a dedicated folder that will contain all your files – something like C:\Files. The name of the folder is not important, but make it a single, brief word. There are several reasons for this: When creating a backup regime, it’s easy to decide what files should be backed up – they’re all in the one folder! If you ever decide to trade in your computer for a new one, you know exactly which files to migrate You will always know where to begin a search for any file If you synchronize files with other computers, it makes your synchronization routines very simple. It also causes all your shortcuts to continue to work on the other machines (more about this in tip #24, below). Once you’ve decided where your files should go, then put all your files in there – Everything! Completely disregard the standard, default folders that are created for you by the operating system (“My Music”, “My Pictures”, etc). In fact, you can actually relocate many of those folders into your own structure (more about that below, in tip #6). The more completely you get all your data files (documents, photos, music, etc) and all your configuration settings into that one folder, then the easier it will be to perform all of the above tasks. Once this has been done, and all your files live in one folder, all the other folders in C:\ can be thought of as “operating system” folders, and therefore of little day-to-day interest for us. Here’s a screenshot of a nicely organized C: drive, where all user files are located within the \Files folder: Tip #4. Use Sub-Folders This would be our simplest and most obvious tip. It almost goes without saying. Any organizational system you decide upon (see tip #1) will require that you create sub-folders for your files. Get used to creating folders on a regular basis. Tip #5. Don’t be Shy About Depth Create as many levels of sub-folders as you need. Don’t be scared to do so. Every time you notice an opportunity to group a set of related files into a sub-folder, do so. Examples might include: All the MP3s from one music CD, all the photos from one holiday, or all the documents from one client. It’s perfectly okay to put files into a folder called C:\Files\Me\From Others\Services\WestCo Bank\Statements\2009. That’s only seven levels deep. Ten levels is not uncommon. Of course, it’s possible to take this too far. If you notice yourself creating a sub-folder to hold only one file, then you’ve probably become a little over-zealous. On the other hand, if you simply create a structure with only two levels (for example C:\Files\Work) then you really haven’t achieved any level of organization at all (unless you own only six files!). Your “Work” folder will have become a dumping ground, just like your Desktop was, with most likely hundreds of files in it. Tip #6. Move the Standard User Folders into Your Own Folder Structure Most operating systems, including Windows, create a set of standard folders for each of its users. These folders then become the default location for files such as documents, music files, digital photos and downloaded Internet files. In Windows 7, the full list is shown below: Some of these folders you may never use nor care about (for example, the Favorites folder, if you’re not using Internet Explorer as your browser). Those ones you can leave where they are. But you may be using some of the other folders to store files that are important to you. Even if you’re not using them, Windows will still often treat them as the default storage location for many types of files. When you go to save a standard file type, it can become annoying to be automatically prompted to save it in a folder that’s not part of your own file structure. But there’s a simple solution: Move the folders you care about into your own folder structure! If you do, then the next time you go to save a file of the corresponding type, Windows will prompt you to save it in the new, moved location. Moving the folders is easy. Simply drag-and-drop them to the new location. Here’s a screenshot of the default My Music folder being moved to my custom personal folder (Mark): Tip #7. Name Files and Folders Intelligently This is another one that almost goes without saying, but we’ll say it anyway: Do not allow files to be created that have meaningless names like Document1.doc, or folders called New Folder (2). Take that extra 20 seconds and come up with a meaningful name for the file/folder – one that accurately divulges its contents without repeating the entire contents in the name. Tip #8. Watch Out for Long Filenames Another way to tell if you have not yet created enough depth to your folder hierarchy is that your files often require really long names. If you need to call a file Johnson Sales Figures March 2009.xls (which might happen to live in the same folder as Abercrombie Budget Report 2008.xls), then you might want to create some sub-folders so that the first file could be simply called March.xls, and living in the Clients\Johnson\Sales Figures\2009 folder. A well-placed file needs only a brief filename! Tip #9. Use Shortcuts! Everywhere! This is probably the single most useful and important tip we can offer. A shortcut allows a file to be in two places at once. Why would you want that? Well, the file and folder structure of every popular operating system on the market today is hierarchical. This means that all objects (files and folders) always live within exactly one parent folder. It’s a bit like a tree. A tree has branches (folders) and leaves (files). Each leaf, and each branch, is supported by exactly one parent branch, all the way back to the root of the tree (which, incidentally, is exactly why C:\ is called the “root folder” of the C: drive). That hard disks are structured this way may seem obvious and even necessary, but it’s only one way of organizing data. There are others: Relational databases, for example, organize structured data entirely differently. The main limitation of hierarchical filing structures is that a file can only ever be in one branch of the tree – in only one folder – at a time. Why is this a problem? Well, there are two main reasons why this limitation is a problem for computer users: The “correct” place for a file, according to our organizational rationale, is very often a very inconvenient place for that file to be located. Just because it’s correctly filed doesn’t mean it’s easy to get to. Your file may be “correctly” buried six levels deep in your sub-folder structure, but you may need regular and speedy access to this file every day. You could always move it to a more convenient location, but that would mean that you would need to re-file back to its “correct” location it every time you’d finished working on it. Most unsatisfactory. A file may simply “belong” in two or more different locations within your file structure. For example, say you’re an accountant and you have just completed the 2009 tax return for John Smith. It might make sense to you to call this file 2009 Tax Return.doc and file it under Clients\John Smith. But it may also be important to you to have the 2009 tax returns from all your clients together in the one place. So you might also want to call the file John Smith.doc and file it under Tax Returns\2009. The problem is, in a purely hierarchical filing system, you can’t put it in both places. Grrrrr! Fortunately, Windows (and most other operating systems) offers a way for you to do exactly that: It’s called a “shortcut” (also known as an “alias” on Macs and a “symbolic link” on UNIX systems). Shortcuts allow a file to exist in one place, and an icon that represents the file to be created and put anywhere else you please. In fact, you can create a dozen such icons and scatter them all over your hard disk. Double-clicking on one of these icons/shortcuts opens up the original file, just as if you had double-clicked on the original file itself. Consider the following two icons: The one on the left is the actual Word document, while the one on the right is a shortcut that represents the Word document. Double-clicking on either icon will open the same file. There are two main visual differences between the icons: The shortcut will have a small arrow in the lower-left-hand corner (on Windows, anyway) The shortcut is allowed to have a name that does not include the file extension (the “.docx” part, in this case) You can delete the shortcut at any time without losing any actual data. The original is still intact. All you lose is the ability to get to that data from wherever the shortcut was. So why are shortcuts so great? Because they allow us to easily overcome the main limitation of hierarchical file systems, and put a file in two (or more) places at the same time. You will always have files that don’t play nice with your organizational rationale, and can’t be filed in only one place. They demand to exist in two places. Shortcuts allow this! Furthermore, they allow you to collect your most often-opened files and folders together in one spot for convenient access. The cool part is that the original files stay where they are, safe forever in their perfectly organized location. So your collection of most often-opened files can – and should – become a collection of shortcuts! If you’re still not convinced of the utility of shortcuts, consider the following well-known areas of a typical Windows computer: The Start Menu (and all the programs that live within it) The Quick Launch bar (or the Superbar in Windows 7) The “Favorite folders” area in the top-left corner of the Windows Explorer window (in Windows Vista or Windows 7) Your Internet Explorer Favorites or Firefox Bookmarks Each item in each of these areas is a shortcut! Each of those areas exist for one purpose only: For convenience – to provide you with a collection of the files and folders you access most often. It should be easy to see by now that shortcuts are designed for one single purpose: To make accessing your files more convenient. Each time you double-click on a shortcut, you are saved the hassle of locating the file (or folder, or program, or drive, or control panel icon) that it represents. Shortcuts allow us to invent a golden rule of file and folder organization: “Only ever have one copy of a file – never have two copies of the same file. Use a shortcut instead” (this rule doesn’t apply to copies created for backup purposes, of course!) There are also lesser rules, like “don’t move a file into your work area – create a shortcut there instead”, and “any time you find yourself frustrated with how long it takes to locate a file, create a shortcut to it and place that shortcut in a convenient location.” So how to we create these massively useful shortcuts? There are two main ways: “Copy” the original file or folder (click on it and type Ctrl-C, or right-click on it and select Copy): Then right-click in an empty area of the destination folder (the place where you want the shortcut to go) and select Paste shortcut: Right-drag (drag with the right mouse button) the file from the source folder to the destination folder. When you let go of the mouse button at the destination folder, a menu pops up: Select Create shortcuts here. Note that when shortcuts are created, they are often named something like Shortcut to Budget Detail.doc (windows XP) or Budget Detail – Shortcut.doc (Windows 7). If you don’t like those extra words, you can easily rename the shortcuts after they’re created, or you can configure Windows to never insert the extra words in the first place (see our article on how to do this). And of course, you can create shortcuts to folders too, not just to files! Bottom line: Whenever you have a file that you’d like to access from somewhere else (whether it’s convenience you’re after, or because the file simply belongs in two places), create a shortcut to the original file in the new location. Tip #10. Separate Application Files from Data Files Any digital organization guru will drum this rule into you. Application files are the components of the software you’ve installed (e.g. Microsoft Word, Adobe Photoshop or Internet Explorer). Data files are the files that you’ve created for yourself using that software (e.g. Word Documents, digital photos, emails or playlists). Software gets installed, uninstalled and upgraded all the time. Hopefully you always have the original installation media (or downloaded set-up file) kept somewhere safe, and can thus reinstall your software at any time. This means that the software component files are of little importance. Whereas the files you have created with that software is, by definition, important. It’s a good rule to always separate unimportant files from important files. So when your software prompts you to save a file you’ve just created, take a moment and check out where it’s suggesting that you save the file. If it’s suggesting that you save the file into the same folder as the software itself, then definitely don’t follow that suggestion. File it in your own folder! In fact, see if you can find the program’s configuration option that determines where files are saved by default (if it has one), and change it. Tip #11. Organize Files Based on Purpose, Not on File Type If you have, for example a folder called Work\Clients\Johnson, and within that folder you have two sub-folders, Word Documents and Spreadsheets (in other words, you’re separating “.doc” files from “.xls” files), then chances are that you’re not optimally organized. It makes little sense to organize your files based on the program that created them. Instead, create your sub-folders based on the purpose of the file. For example, it would make more sense to create sub-folders called Correspondence and Financials. It may well be that all the files in a given sub-folder are of the same file-type, but this should be more of a coincidence and less of a design feature of your organization system. Tip #12. Maintain the Same Folder Structure on All Your Computers In other words, whatever organizational system you create, apply it to every computer that you can. There are several benefits to this: There’s less to remember. No matter where you are, you always know where to look for your files If you copy or synchronize files from one computer to another, then setting up the synchronization job becomes very simple Shortcuts can be copied or moved from one computer to another with ease (assuming the original files are also copied/moved). There’s no need to find the target of the shortcut all over again on the second computer Ditto for linked files (e.g Word documents that link to data in a separate Excel file), playlists, and any files that reference the exact file locations of other files. This applies even to the drive that your files are stored on. If your files are stored on C: on one computer, make sure they’re stored on C: on all your computers. Otherwise all your shortcuts, playlists and linked files will stop working! Tip #13. Create an “Inbox” Folder Create yourself a folder where you store all files that you’re currently working on, or that you haven’t gotten around to filing yet. You can think of this folder as your “to-do” list. You can call it “Inbox” (making it the same metaphor as your email system), or “Work”, or “To-Do”, or “Scratch”, or whatever name makes sense to you. It doesn’t matter what you call it – just make sure you have one! Once you have finished working on a file, you then move it from the “Inbox” to its correct location within your organizational structure. You may want to use your Desktop as this “Inbox” folder. Rightly or wrongly, most people do. It’s not a bad place to put such files, but be careful: If you do decide that your Desktop represents your “to-do” list, then make sure that no other files find their way there. In other words, make sure that your “Inbox”, wherever it is, Desktop or otherwise, is kept free of junk – stray files that don’t belong there. So where should you put this folder, which, almost by definition, lives outside the structure of the rest of your filing system? Well, first and foremost, it has to be somewhere handy. This will be one of your most-visited folders, so convenience is key. Putting it on the Desktop is a great option – especially if you don’t have any other folders on your Desktop: the folder then becomes supremely easy to find in Windows Explorer: You would then create shortcuts to this folder in convenient spots all over your computer (“Favorite Links”, “Quick Launch”, etc). Tip #14. Ensure You have Only One “Inbox” Folder Once you’ve created your “Inbox” folder, don’t use any other folder location as your “to-do list”. Throw every incoming or created file into the Inbox folder as you create/receive it. This keeps the rest of your computer pristine and free of randomly created or downloaded junk. The last thing you want to be doing is checking multiple folders to see all your current tasks and projects. Gather them all together into one folder. Here are some tips to help ensure you only have one Inbox: Set the default “save” location of all your programs to this folder. Set the default “download” location for your browser to this folder. If this folder is not your desktop (recommended) then also see if you can make a point of not putting “to-do” files on your desktop. This keeps your desktop uncluttered and Zen-like: (the Inbox folder is in the bottom-right corner) Tip #15. Be Vigilant about Clearing Your “Inbox” Folder This is one of the keys to staying organized. If you let your “Inbox” overflow (i.e. allow there to be more than, say, 30 files or folders in there), then you’re probably going to start feeling like you’re overwhelmed: You’re not keeping up with your to-do list. Once your Inbox gets beyond a certain point (around 30 files, studies have shown), then you’ll simply start to avoid it. You may continue to put files in there, but you’ll be scared to look at it, fearing the “out of control” feeling that all overworked, chaotic or just plain disorganized people regularly feel. So, here’s what you can do: Visit your Inbox/to-do folder regularly (at least five times per day). Scan the folder regularly for files that you have completed working on and are ready for filing. File them immediately. Make it a source of pride to keep the number of files in this folder as small as possible. If you value peace of mind, then make the emptiness of this folder one of your highest (computer) priorities If you know that a particular file has been in the folder for more than, say, six weeks, then admit that you’re not actually going to get around to processing it, and move it to its final resting place. Tip #16. File Everything Immediately, and Use Shortcuts for Your Active Projects As soon as you create, receive or download a new file, store it away in its “correct” folder immediately. Then, whenever you need to work on it (possibly straight away), create a shortcut to it in your “Inbox” (“to-do”) folder or your desktop. That way, all your files are always in their “correct” locations, yet you still have immediate, convenient access to your current, active files. When you finish working on a file, simply delete the shortcut. Ideally, your “Inbox” folder – and your Desktop – should contain no actual files or folders. They should simply contain shortcuts. Tip #17. Use Directory Symbolic Links (or Junctions) to Maintain One Unified Folder Structure Using this tip, we can get around a potential hiccup that we can run into when creating our organizational structure – the issue of having more than one drive on our computer (C:, D:, etc). We might have files we need to store on the D: drive for space reasons, and yet want to base our organized folder structure on the C: drive (or vice-versa). Your chosen organizational structure may dictate that all your files must be accessed from the C: drive (for example, the root folder of all your files may be something like C:\Files). And yet you may still have a D: drive and wish to take advantage of the hundreds of spare Gigabytes that it offers. Did you know that it’s actually possible to store your files on the D: drive and yet access them as if they were on the C: drive? And no, we’re not talking about shortcuts here (although the concept is very similar). By using the shell command mklink, you can essentially take a folder that lives on one drive and create an alias for it on a different drive (you can do lots more than that with mklink – for a full rundown on this programs capabilities, see our dedicated article). These aliases are called directory symbolic links (and used to be known as junctions). You can think of them as “virtual” folders. They function exactly like regular folders, except they’re physically located somewhere else. For example, you may decide that your entire D: drive contains your complete organizational file structure, but that you need to reference all those files as if they were on the C: drive, under C:\Files. If that was the case you could create C:\Files as a directory symbolic link – a link to D:, as follows: mklink /d c:\files d:\ Or it may be that the only files you wish to store on the D: drive are your movie collection. You could locate all your movie files in the root of your D: drive, and then link it to C:\Files\Media\Movies, as follows: mklink /d c:\files\media\movies d:\ (Needless to say, you must run these commands from a command prompt – click the Start button, type cmd and press Enter) Tip #18. Customize Your Folder Icons This is not strictly speaking an organizational tip, but having unique icons for each folder does allow you to more quickly visually identify which folder is which, and thus saves you time when you’re finding files. An example is below (from my folder that contains all files downloaded from the Internet): To learn how to change your folder icons, please refer to our dedicated article on the subject. Tip #19. Tidy Your Start Menu The Windows Start Menu is usually one of the messiest parts of any Windows computer. Every program you install seems to adopt a completely different approach to placing icons in this menu. Some simply put a single program icon. Others create a folder based on the name of the software. And others create a folder based on the name of the software manufacturer. It’s chaos, and can make it hard to find the software you want to run. Thankfully we can avoid this chaos with useful operating system features like Quick Launch, the Superbar or pinned start menu items. Even so, it would make a lot of sense to get into the guts of the Start Menu itself and give it a good once-over. All you really need to decide is how you’re going to organize your applications. A structure based on the purpose of the application is an obvious candidate. Below is an example of one such structure: In this structure, Utilities means software whose job it is to keep the computer itself running smoothly (configuration tools, backup software, Zip programs, etc). Applications refers to any productivity software that doesn’t fit under the headings Multimedia, Graphics, Internet, etc. In case you’re not aware, every icon in your Start Menu is a shortcut and can be manipulated like any other shortcut (copied, moved, deleted, etc). With the Windows Start Menu (all version of Windows), Microsoft has decided that there be two parallel folder structures to store your Start Menu shortcuts. One for you (the logged-in user of the computer) and one for all users of the computer. Having two parallel structures can often be redundant: If you are the only user of the computer, then having two parallel structures is totally redundant. Even if you have several users that regularly log into the computer, most of your installed software will need to be made available to all users, and should thus be moved out of the “just you” version of the Start Menu and into the “all users” area. To take control of your Start Menu, so you can start organizing it, you’ll need to know how to access the actual folders and shortcut files that make up the Start Menu (both versions of it). To find these folders and files, click the Start button and then right-click on the All Programs text (Windows XP users should right-click on the Start button itself): The Open option refers to the “just you” version of the Start Menu, while the Open All Users option refers to the “all users” version. Click on the one you want to organize. A Windows Explorer window then opens with your chosen version of the Start Menu selected. From there it’s easy. Double-click on the Programs folder and you’ll see all your folders and shortcuts. Now you can delete/rename/move until it’s just the way you want it. Note: When you’re reorganizing your Start Menu, you may want to have two Explorer windows open at the same time – one showing the “just you” version and one showing the “all users” version. You can drag-and-drop between the windows. Tip #20. Keep Your Start Menu Tidy Once you have a perfectly organized Start Menu, try to be a little vigilant about keeping it that way. Every time you install a new piece of software, the icons that get created will almost certainly violate your organizational structure. So to keep your Start Menu pristine and organized, make sure you do the following whenever you install a new piece of software: Check whether the software was installed into the “just you” area of the Start Menu, or the “all users” area, and then move it to the correct area. Remove all the unnecessary icons (like the “Read me” icon, the “Help” icon (you can always open the help from within the software itself when it’s running), the “Uninstall” icon, the link(s)to the manufacturer’s website, etc) Rename the main icon(s) of the software to something brief that makes sense to you. For example, you might like to rename Microsoft Office Word 2010 to simply Word Move the icon(s) into the correct folder based on your Start Menu organizational structure And don’t forget: when you uninstall a piece of software, the software’s uninstall routine is no longer going to be able to remove the software’s icon from the Start Menu (because you moved and/or renamed it), so you’ll need to remove that icon manually. Tip #21. Tidy C:\ The root of your C: drive (C:\) is a common dumping ground for files and folders – both by the users of your computer and by the software that you install on your computer. It can become a mess. There’s almost no software these days that requires itself to be installed in C:\. 99% of the time it can and should be installed into C:\Program Files. And as for your own files, well, it’s clear that they can (and almost always should) be stored somewhere else. In an ideal world, your C:\ folder should look like this (on Windows 7): Note that there are some system files and folders in C:\ that are usually and deliberately “hidden” (such as the Windows virtual memory file pagefile.sys, the boot loader file bootmgr, and the System Volume Information folder). Hiding these files and folders is a good idea, as they need to stay where they are and are almost never needed to be opened or even seen by you, the user. Hiding them prevents you from accidentally messing with them, and enhances your sense of order and well-being when you look at your C: drive folder. Tip #22. Tidy Your Desktop The Desktop is probably the most abused part of a Windows computer (from an organization point of view). It usually serves as a dumping ground for all incoming files, as well as holding icons to oft-used applications, plus some regularly opened files and folders. It often ends up becoming an uncontrolled mess. See if you can avoid this. Here’s why… Application icons (Word, Internet Explorer, etc) are often found on the Desktop, but it’s unlikely that this is the optimum place for them. The “Quick Launch” bar (or the Superbar in Windows 7) is always visible and so represents a perfect location to put your icons. You’ll only be able to see the icons on your Desktop when all your programs are minimized. It might be time to get your application icons off your desktop… You may have decided that the Inbox/To-do folder on your computer (see tip #13, above) should be your Desktop. If so, then enough said. Simply be vigilant about clearing it and preventing it from being polluted by junk files (see tip #15, above). On the other hand, if your Desktop is not acting as your “Inbox” folder, then there’s no reason for it to have any data files or folders on it at all, except perhaps a couple of shortcuts to often-opened files and folders (either ongoing or current projects). Everything else should be moved to your “Inbox” folder. In an ideal world, it might look like this: Tip #23. Move Permanent Items on Your Desktop Away from the Top-Left Corner When files/folders are dragged onto your desktop in a Windows Explorer window, or when shortcuts are created on your Desktop from Internet Explorer, those icons are always placed in the top-left corner – or as close as they can get. If you have other files, folders or shortcuts that you keep on the Desktop permanently, then it’s a good idea to separate these permanent icons from the transient ones, so that you can quickly identify which ones the transients are. An easy way to do this is to move all your permanent icons to the right-hand side of your Desktop. That should keep them separated from incoming items. Tip #24. Synchronize If you have more than one computer, you’ll almost certainly want to share files between them. If the computers are permanently attached to the same local network, then there’s no need to store multiple copies of any one file or folder – shortcuts will suffice. However, if the computers are not always on the same network, then you will at some point need to copy files between them. For files that need to permanently live on both computers, the ideal way to do this is to synchronize the files, as opposed to simply copying them. We only have room here to write a brief summary of synchronization, not a full article. In short, there are several different types of synchronization: Where the contents of one folder are accessible anywhere, such as with Dropbox Where the contents of any number of folders are accessible anywhere, such as with Windows Live Mesh Where any files or folders from anywhere on your computer are synchronized with exactly one other computer, such as with the Windows “Briefcase”, Microsoft SyncToy, or (much more powerful, yet still free) SyncBack from 2BrightSparks. This only works when both computers are on the same local network, at least temporarily. A great advantage of synchronization solutions is that once you’ve got it configured the way you want it, then the sync process happens automatically, every time. Click a button (or schedule it to happen automatically) and all your files are automagically put where they’re supposed to be. If you maintain the same file and folder structure on both computers, then you can also sync files depend upon the correct location of other files, like shortcuts, playlists and office documents that link to other office documents, and the synchronized files still work on the other computer! Tip #25. Hide Files You Never Need to See If you have your files well organized, you will often be able to tell if a file is out of place just by glancing at the contents of a folder (for example, it should be pretty obvious if you look in a folder that contains all the MP3s from one music CD and see a Word document in there). This is a good thing – it allows you to determine if there are files out of place with a quick glance. Yet sometimes there are files in a folder that seem out of place but actually need to be there, such as the “folder art” JPEGs in music folders, and various files in the root of the C: drive. If such files never need to be opened by you, then a good idea is to simply hide them. Then, the next time you glance at the folder, you won’t have to remember whether that file was supposed to be there or not, because you won’t see it at all! To hide a file, simply right-click on it and choose Properties: Then simply tick the Hidden tick-box: Tip #26. Keep Every Setup File These days most software is downloaded from the Internet. Whenever you download a piece of software, keep it. You’ll never know when you need to reinstall the software. Further, keep with it an Internet shortcut that links back to the website where you originally downloaded it, in case you ever need to check for updates. See tip #33 below for a full description of the excellence of organizing your setup files. Tip #27. Try to Minimize the Number of Folders that Contain Both Files and Sub-folders Some of the folders in your organizational structure will contain only files. Others will contain only sub-folders. And you will also have some folders that contain both files and sub-folders. You will notice slight improvements in how long it takes you to locate a file if you try to avoid this third type of folder. It’s not always possible, of course – you’ll always have some of these folders, but see if you can avoid it. One way of doing this is to take all the leftover files that didn’t end up getting stored in a sub-folder and create a special “Miscellaneous” or “Other” folder for them. Tip #28. Starting a Filename with an Underscore Brings it to the Top of a List Further to the previous tip, if you name that “Miscellaneous” or “Other” folder in such a way that its name begins with an underscore “_”, then it will appear at the top of the list of files/folders. The screenshot below is an example of this. Each folder in the list contains a set of digital photos. The folder at the top of the list, _Misc, contains random photos that didn’t deserve their own dedicated folder: Tip #29. Clean Up those CD-ROMs and (shudder!) Floppy Disks Have you got a pile of CD-ROMs stacked on a shelf of your office? Old photos, or files you archived off onto CD-ROM (or even worse, floppy disks!) because you didn’t have enough disk space at the time? In the meantime have you upgraded your computer and now have 500 Gigabytes of space you don’t know what to do with? If so, isn’t it time you tidied up that stack of disks and filed them into your gorgeous new folder structure? So what are you waiting for? Bite the bullet, copy them all back onto your computer, file them in their appropriate folders, and then back the whole lot up onto a shiny new 1000Gig external hard drive! Useful Folders to Create This next section suggests some useful folders that you might want to create within your folder structure. I’ve personally found them to be indispensable. The first three are all about convenience – handy folders to create and then put somewhere that you can always access instantly. For each one, it’s not so important where the actual folder is located, but it’s very important where you put the shortcut(s) to the folder. You might want to locate the shortcuts: On your Desktop In your “Quick Launch” area (or pinned to your Windows 7 Superbar) In your Windows Explorer “Favorite Links” area Tip #30. Create an “Inbox” (“To-Do”) Folder This has already been mentioned in depth (see tip #13), but we wanted to reiterate its importance here. This folder contains all the recently created, received or downloaded files that you have not yet had a chance to file away properly, and it also may contain files that you have yet to process. In effect, it becomes a sort of “to-do list”. It doesn’t have to be called “Inbox” – you can call it whatever you want. Tip #31. Create a Folder where Your Current Projects are Collected Rather than going hunting for them all the time, or dumping them all on your desktop, create a special folder where you put links (or work folders) for each of the projects you’re currently working on. You can locate this folder in your “Inbox” folder, on your desktop, or anywhere at all – just so long as there’s a way of getting to it quickly, such as putting a link to it in Windows Explorer’s “Favorite Links” area: Tip #32. Create a Folder for Files and Folders that You Regularly Open You will always have a few files that you open regularly, whether it be a spreadsheet of your current accounts, or a favorite playlist. These are not necessarily “current projects”, rather they’re simply files that you always find yourself opening. Typically such files would be located on your desktop (or even better, shortcuts to those files). Why not collect all such shortcuts together and put them in their own special folder? As with the “Current Projects” folder (above), you would want to locate that folder somewhere convenient. Below is an example of a folder called “Quick links”, with about seven files (shortcuts) in it, that is accessible through the Windows Quick Launch bar: See tip #37 below for a full explanation of the power of the Quick Launch bar. Tip #33. Create a “Set-ups” Folder A typical computer has dozens of applications installed on it. For each piece of software, there are often many different pieces of information you need to keep track of, including: The original installation setup file(s). This can be anything from a simple 100Kb setup.exe file you downloaded from a website, all the way up to a 4Gig ISO file that you copied from a DVD-ROM that you purchased. The home page of the software manufacturer (in case you need to look up something on their support pages, their forum or their online help) The page containing the download link for your actual file (in case you need to re-download it, or download an upgraded version) The serial number Your proof-of-purchase documentation Any other template files, plug-ins, themes, etc that also need to get installed For each piece of software, it’s a great idea to gather all of these files together and put them in a single folder. The folder can be the name of the software (plus possibly a very brief description of what it’s for – in case you can’t remember what the software does based in its name). Then you would gather all of these folders together into one place, and call it something like “Software” or “Setups”. If you have enough of these folders (I have several hundred, being a geek, collected over 20 years), then you may want to further categorize them. My own categorization structure is based on “platform” (operating system): The last seven folders each represents one platform/operating system, while _Operating Systems contains set-up files for installing the operating systems themselves. _Hardware contains ROMs for hardware I own, such as routers. Within the Windows folder (above), you can see the beginnings of the vast library of software I’ve compiled over the years: An example of a typical application folder looks like this: Tip #34. Have a “Settings” Folder We all know that our documents are important. So are our photos and music files. We save all of these files into folders, and then locate them afterwards and double-click on them to open them. But there are many files that are important to us that can’t be saved into folders, and then searched for and double-clicked later on. These files certainly contain important information that we need, but are often created internally by an application, and saved wherever that application feels is appropriate. A good example of this is the “PST” file that Outlook creates for us and uses to store all our emails, contacts, appointments and so forth. Another example would be the collection of Bookmarks that Firefox stores on your behalf. And yet another example would be the customized settings and configuration files of our all our software. Granted, most Windows programs store their configuration in the Registry, but there are still many programs that use configuration files to store their settings. Imagine if you lost all of the above files! And yet, when people are backing up their computers, they typically only back up the files they know about – those that are stored in the “My Documents” folder, etc. If they had a hard disk failure or their computer was lost or stolen, their backup files would not include some of the most vital files they owned. Also, when migrating to a new computer, it’s vital to ensure that these files make the journey. It can be a very useful idea to create yourself a folder to store all your “settings” – files that are important to you but which you never actually search for by name and double-click on to open them. Otherwise, next time you go to set up a new computer just the way you want it, you’ll need to spend hours recreating the configuration of your previous computer! So how to we get our important files into this folder? Well, we have a few options: Some programs (such as Outlook and its PST files) allow you to place these files wherever you want. If you delve into the program’s options, you will find a setting somewhere that controls the location of the important settings files (or “personal storage” – PST – when it comes to Outlook) Some programs do not allow you to change such locations in any easy way, but if you get into the Registry, you can sometimes find a registry key that refers to the location of the file(s). Simply move the file into your Settings folder and adjust the registry key to refer to the new location. Some programs stubbornly refuse to allow their settings files to be placed anywhere other then where they stipulate. When faced with programs like these, you have three choices: (1) You can ignore those files, (2) You can copy the files into your Settings folder (let’s face it – settings don’t change very often), or (3) you can use synchronization software, such as the Windows Briefcase, to make synchronized copies of all your files in your Settings folder. All you then have to do is to remember to run your sync software periodically (perhaps just before you run your backup software!). There are some other things you may decide to locate inside this new “Settings” folder: Exports of registry keys (from the many applications that store their configurations in the Registry). This is useful for backup purposes or for migrating to a new computer Notes you’ve made about all the specific customizations you have made to a particular piece of software (so that you’ll know how to do it all again on your next computer) Shortcuts to webpages that detail how to tweak certain aspects of your operating system or applications so they are just the way you like them (such as how to remove the words “Shortcut to” from the beginning of newly created shortcuts). In other words, you’d want to create shortcuts to half the pages on the How-To Geek website! Here’s an example of a “Settings” folder: Windows Features that Help with Organization This section details some of the features of Microsoft Windows that are a boon to anyone hoping to stay optimally organized. Tip #35. Use the “Favorite Links” Area to Access Oft-Used Folders Once you’ve created your great new filing system, work out which folders you access most regularly, or which serve as great starting points for locating the rest of the files in your folder structure, and then put links to those folders in your “Favorite Links” area of the left-hand side of the Windows Explorer window (simply called “Favorites” in Windows 7): Some ideas for folders you might want to add there include: Your “Inbox” folder (or whatever you’ve called it) – most important! The base of your filing structure (e.g. C:\Files) A folder containing shortcuts to often-accessed folders on other computers around the network (shown above as Network Folders) A folder containing shortcuts to your current projects (unless that folder is in your “Inbox” folder) Getting folders into this area is very simple – just locate the folder you’re interested in and drag it there! Tip #36. Customize the Places Bar in the File/Open and File/Save Boxes Consider the screenshot below: The highlighted icons (collectively known as the “Places Bar”) can be customized to refer to any folder location you want, allowing instant access to any part of your organizational structure. Note: These File/Open and File/Save boxes have been superseded by new versions that use the Windows Vista/Windows 7 “Favorite Links”, but the older versions (shown above) are still used by a surprisingly large number of applications. The easiest way to customize these icons is to use the Group Policy Editor, but not everyone has access to this program. If you do, open it up and navigate to: User Configuration > Administrative Templates > Windows Components > Windows Explorer > Common Open File Dialog If you don’t have access to the Group Policy Editor, then you’ll need to get into the Registry. Navigate to: HKEY_CURRENT_USER \ Software \ Microsoft \ Windows \ CurrentVersion \ Policies \ comdlg32 \ Placesbar It should then be easy to make the desired changes. Log off and log on again to allow the changes to take effect. Tip #37. Use the Quick Launch Bar as a Application and File Launcher That Quick Launch bar (to the right of the Start button) is a lot more useful than people give it credit for. Most people simply have half a dozen icons in it, and use it to start just those programs. But it can actually be used to instantly access just about anything in your filing system: For complete instructions on how to set this up, visit our dedicated article on this topic. Tip #38. Put a Shortcut to Windows Explorer into Your Quick Launch Bar This is only necessary in Windows Vista and Windows XP. The Microsoft boffins finally got wise and added it to the Windows 7 Superbar by default. Windows Explorer – the program used for managing your files and folders – is one of the most useful programs in Windows. Anyone who considers themselves serious about being organized needs instant access to this program at any time. A great place to create a shortcut to this program is in the Windows XP and Windows Vista “Quick Launch” bar: To get it there, locate it in your Start Menu (usually under “Accessories”) and then right-drag it down into your Quick Launch bar (and create a copy). Tip #39. Customize the Starting Folder for Your Windows 7 Explorer Superbar Icon If you’re on Windows 7, your Superbar will include a Windows Explorer icon. Clicking on the icon will launch Windows Explorer (of course), and will start you off in your “Libraries” folder. Libraries may be fine as a starting point, but if you have created yourself an “Inbox” folder, then it would probably make more sense to start off in this folder every time you launch Windows Explorer. To change this default/starting folder location, then first right-click the Explorer icon in the Superbar, and then right-click Properties:Then, in Target field of the Windows Explorer Properties box that appears, type %windir%\explorer.exe followed by the path of the folder you wish to start in. For example: %windir%\explorer.exe C:\Files If that folder happened to be on the Desktop (and called, say, “Inbox”), then you would use the following cleverness: %windir%\explorer.exe shell:desktop\Inbox Then click OK and test it out. Tip #40. Ummmmm…. No, that’s it. I can’t think of another one. That’s all of the tips I can come up with. I only created this one because 40 is such a nice round number… Case Study – An Organized PC To finish off the article, I have included a few screenshots of my (main) computer (running Vista). The aim here is twofold: To give you a sense of what it looks like when the above, sometimes abstract, tips are applied to a real-life computer, and To offer some ideas about folders and structure that you may want to steal to use on your own PC. Let’s start with the C: drive itself. Very minimal. All my files are contained within C:\Files. I’ll confine the rest of the case study to this folder: That folder contains the following: Mark: My personal files VC: My business (Virtual Creations, Australia) Others contains files created by friends and family Data contains files from the rest of the world (can be thought of as “public” files, usually downloaded from the Net) Settings is described above in tip #34 The Data folder contains the following sub-folders: Audio: Radio plays, audio books, podcasts, etc Development: Programmer and developer resources, sample source code, etc (see below) Humour: Jokes, funnies (those emails that we all receive) Movies: Downloaded and ripped movies (all legal, of course!), their scripts, DVD covers, etc. Music: (see below) Setups: Installation files for software (explained in full in tip #33) System: (see below) TV: Downloaded TV shows Writings: Books, instruction manuals, etc (see below) The Music folder contains the following sub-folders: Album covers: JPEG scans Guitar tabs: Text files of guitar sheet music Lists: e.g. “Top 1000 songs of all time” Lyrics: Text files MIDI: Electronic music files MP3 (representing 99% of the Music folder): MP3s, either ripped from CDs or downloaded, sorted by artist/album name Music Video: Video clips Sheet Music: usually PDFs The Data\Writings folder contains the following sub-folders: (all pretty self-explanatory) The Data\Development folder contains the following sub-folders: Again, all pretty self-explanatory (if you’re a geek) The Data\System folder contains the following sub-folders: These are usually themes, plug-ins and other downloadable program-specific resources. The Mark folder contains the following sub-folders: From Others: Usually letters that other people (friends, family, etc) have written to me For Others: Letters and other things I have created for other people Green Book: None of your business Playlists: M3U files that I have compiled of my favorite songs (plus one M3U playlist file for every album I own) Writing: Fiction, philosophy and other musings of mine Mark Docs: Shortcut to C:\Users\Mark Settings: Shortcut to C:\Files\Settings\Mark The Others folder contains the following sub-folders: The VC (Virtual Creations, my business – I develop websites) folder contains the following sub-folders: And again, all of those are pretty self-explanatory. Conclusion These tips have saved my sanity and helped keep me a productive geek, but what about you? What tips and tricks do you have to keep your files organized? Please share them with us in the comments. Come on, don’t be shy… Similar Articles Productive Geek Tips Fix For When Windows Explorer in Vista Stops Showing File NamesWhy Did Windows Vista’s Music Folder Icon Turn Yellow?Print or Create a Text File List of the Contents in a Directory the Easy WayCustomize the Windows 7 or Vista Send To MenuAdd Copy To / Move To on Windows 7 or Vista Right-Click Menu TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips Acronis Online Backup DVDFab 6 Revo Uninstaller Pro Registry Mechanic 9 for Windows Track Daily Goals With 42Goals Video Toolbox is a Superb Online Video Editor Fun with 47 charts and graphs Tomorrow is Mother’s Day Check the Average Speed of YouTube Videos You’ve Watched OutlookStatView Scans and Displays General Usage Statistics

Read the article

Sensible Way to Pass Web Data in XML to a SQL Server Database

- by Emtucifor

After exploring several different ways to pass web data to a database for update purposes, I'm wondering if XML might be a good strategy. The database is currently SQL 2000. In a few months it will move to SQL 2005 and I will be able to change things if needed, but I need a SQL 2000 solution now. First of all, the database in question uses the EAV model. I know that this kind of database is generally highly frowned on, so for the purposes of this question, please just accept that this is not going to change. The current update method has the web server inserting values (that have all been converted first to their correct underlying types, then to sql_variant) to a temp table. A stored procedure is then run which expects the temp table to exist and it takes care of updating, inserting, or deleting things as needed. So far, only a single element has needed to be updated at a time. But now, there is a requirement to be able to edit multiple elements at once, and also to support hierarchical elements, each of which can have its own list of attributes. Here's some example XML I hand-typed to demonstrate what I'm thinking of. Note that in this database the Entity is Element and an ID of 0 signifies "create" aka an insert of a new item. <Elements> <Element ID="1234"> <Attr ID="221">Value</Attr> <Attr ID="225">287</Attr> <Attr ID="234"> <Element ID="99825"> <Attr ID="7">Value1</Attr> <Attr ID="8">Value2</Attr> <Attr ID="9" Action="delete" /> </Element> <Element ID="99826" Action="delete" /> <Element ID="0" Type="24"> <Attr ID="7">Value4</Attr> <Attr ID="8">Value5</Attr> <Attr ID="9">Value6</Attr> </Element> <Element ID="0" Type="24"> <Attr ID="7">Value7</Attr> <Attr ID="8">Value8</Attr> <Attr ID="9">Value9</Attr> </Element> </Attr> <Rel ID="3827" Action="delete" /> <Rel ID="2284" Role="parent"> <Element ID="3827" /> <Element ID="3829" /> <Attr ID="665">1</Attr> </Rel> <Rel ID="0" Type="23" Role="child"> <Element ID="3830" /> <Attr ID="67" </Rel> </Element> <Element ID="0" Type="87"> <Attr ID="221">Value</Attr> <Attr ID="225">569</Attr> <Attr ID="234"> <Element ID="0" Type="24"> <Attr ID="7">Value10</Attr> <Attr ID="8">Value11</Attr> <Attr ID="9">Value12</Attr> </Element> </Attr> </Element> <Element ID="1235" Action="delete" /> </Elements> Some Attributes are straight value types, such as AttrID 221. But AttrID 234 is a special "multi-value" type that can have a list of elements underneath it, and each one can have one or more values. Types only need to be presented when a new item is created, since the ElementID fully implies the type if it already exists. I'll probably support only passing in changed items (as detected by javascript). And there may be an Action="Delete" on Attr elements as well, since NULLs are treated as "unselected"--sometimes it's very important to know if a Yes/No question has intentionally been answered No or if no one's bothered to say Yes yet. There is also a different kind of data, a Relationship. At this time, those are updated through individual AJAX calls as things are edited in the UI, but I'd like to include those so that changes to relationships can be canceled (right now, once you change it, it's done). So those are really elements, too, but they are called Rel instead of Element. Relationships are implemented as ElementID1 and ElementID2, so the RelID 2284 in the XML above is in the database as: ElementID 2284 ElementID1 1234 ElementID2 3827 Having multiple children in one relationship isn't currently supported, but it would be nice later. Does this strategy and the example XML make sense? Is there a more sensible way? I'm just looking for some broad critique to help save me from going down a bad path. Any aspect that you'd like to comment on would be helpful. The web language happens to be Classic ASP, but that could change to ASP.Net at some point. A persistence engine like Linq or nHibernate is probably not acceptable right now--I just want to get this already working application enhanced without a huge amount of development time. I'll choose the answer that shows experience and has a balance of good warnings about what not to do, confirmations of what I'm planning to do, and recommendations about something else to do. I'll make it as objective as possible. P.S. I'd like to handle unicode characters as well as very long strings (10k +). UPDATE I have had this working for some time and I used the ADO Recordset Save-To-Stream trick to make creating the XML really easy. The result seems to be fairly fast, though if speed ever becomes a problem I may revisit this. In the meantime, my code works to handle any number of elements and attributes on the page at once, including updating, deleting, and creating new items all in one go. I settled on a scheme like so for all my elements: Existing data elements Example: input name e12345_a678 (element 12345, attribute 678), the input value is the value of the attribute. New elements Javascript copies a hidden template of the set of HTML elements needed for the type into the correct location on the page, increments a counter to get a new ID for this item, and prepends the number to the names of the form items. var newid = 0; function metadataAdd(reference, nameid, value) { var t = document.createElement('input'); t.setAttribute('name', nameid); t.setAttribute('id', nameid); t.setAttribute('type', 'hidden'); t.setAttribute('value', value); reference.appendChild(t); } function multiAdd(target, parentelementid, attrid, elementtypeid) { var proto = document.getElementById('a' + attrid + '_proto'); var instance = document.createElement('p'); target.parentNode.parentNode.insertBefore(instance, target.parentNode); var thisid = ++newid; instance.innerHTML = proto.innerHTML.replace(/{prefix}/g, 'n' + thisid + '_'); instance.id = 'n' + thisid; instance.className += ' new'; metadataAdd(instance, 'n' + thisid + '_p', parentelementid); metadataAdd(instance, 'n' + thisid + '_c', attrid); metadataAdd(instance, 'n' + thisid + '_t', elementtypeid); return false; } Example: Template input name _a678 becomes n1_a678 (a new element, the first one on the page, attribute 678). all attributes of this new element are tagged with the same prefix of n1. The next new item will be n2, and so on. Some hidden form inputs are created: n1_t, value is the elementtype of the element to be created n1_p, value is the parent id of the element (if it is a relationship) n1_c, value is the child id of the element (if it is a relationship) Deleting elements A hidden input is created in the form e12345_t with value set to 0. The existing controls displaying that attribute's values are disabled so they are not included in the form post. So "set type to 0" is treated as delete. With this scheme, every item on the page has a unique name and can be distinguished properly, and every action can be represented properly. When the form is posted, here's a sample of building one of the two recordsets used (classic ASP code): Set Data = Server.CreateObject("ADODB.Recordset") Data.Fields.Append "ElementID", adInteger, 4, adFldKeyColumn Data.Fields.Append "AttrID", adInteger, 4, adFldKeyColumn Data.Fields.Append "Value", adLongVarWChar, 2147483647, adFldIsNullable Or adFldMayBeNull Data.CursorLocation = adUseClient Data.CursorType = adOpenDynamic Data.Open This is the recordset for values, the other is for the elements themselves. I step through the posted form and for the element recordset use a Scripting.Dictionary populated with instances of a custom Class that has the properties I need, so that I can add the values piecemeal, since they don't always come in order. New elements are added as negative to distinguish them from regular elements (rather than requiring a separate column to indicate if it is new or addresses an existing element). I use regular expression to tear apart the form keys: "^(e|n)([0-9]{1,10})_(a|p|t|c)([0-9]{0,10})$" Then, adding an attribute looks like this. Data.AddNew ElementID.Value = DataID AttrID.Value = Integerize(Matches(0).SubMatches(3)) AttrValue.Value = Request.Form(Key) Data.Update ElementID, AttrID, and AttrValue are references to the fields of the recordset. This method is hugely faster than using Data.Fields("ElementID").Value each time. I loop through the Dictionary of element updates and ignore any that don't have all the proper information, adding the good ones to the recordset. Then I call my data-updating stored procedure like so: Set Cmd = Server.CreateObject("ADODB.Command") With Cmd Set .ActiveConnection = MyDBConn .CommandType = adCmdStoredProc .CommandText = "DataPost" .Prepared = False .Parameters.Append .CreateParameter("@ElementMetadata", adLongVarWChar, adParamInput, 2147483647, XMLFromRecordset(Element)) .Parameters.Append .CreateParameter("@ElementData", adLongVarWChar, adParamInput, 2147483647, XMLFromRecordset(Data)) End With Result.Open Cmd ' previously created recordset object with options set Here's the function that does the xml conversion: Private Function XMLFromRecordset(Recordset) Dim Stream Set Stream = Server.CreateObject("ADODB.Stream") Stream.Open Recordset.Save Stream, adPersistXML Stream.Position = 0 XMLFromRecordset = Stream.ReadText End Function Just in case the web page needs to know, the SP returns a recordset of any new elements, showing their page value and their created value (so I can see that n1 is now e12346 for example). Here are some key snippets from the stored procedure. Note this is SQL 2000 for now, though I'll be able to switch to 2005 soon: CREATE PROCEDURE [dbo].[DataPost] @ElementMetaData ntext, @ElementData ntext AS DECLARE @hdoc int --- snip --- EXEC sp_xml_preparedocument @hdoc OUTPUT, @ElementMetaData, '<xml xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" xmlns:rs="urn:schemas-microsoft-com:rowset" xmlns:z="#RowsetSchema" />' INSERT #ElementMetadata (ElementID, ElementTypeID, ElementID1, ElementID2) SELECT * FROM OPENXML(@hdoc, '/xml/rs:data/rs:insert/z:row', 0) WITH ( ElementID int, ElementTypeID int, ElementID1 int, ElementID2 int ) ORDER BY ElementID -- orders negative items (new elements) first so they begin counting at 1 for later ID calculation EXEC sp_xml_removedocument @hdoc --- snip --- UPDATE E SET E.ElementTypeID = M.ElementTypeID FROM Element E INNER JOIN #ElementMetadata M ON E.ElementID = M.ElementID WHERE E.ElementID >= 1 AND M.ElementTypeID >= 1 The following query does the correlation of the negative new element ids to the newly inserted ones: UPDATE #ElementMetadata -- Correlate the new ElementIDs with the input rows SET NewElementID = Scope_Identity() - @@RowCount + DataID WHERE ElementID < 0 Other set-based queries do all the other work of validating that the attributes are allowed, are the correct data type, and inserting, updating, and deleting elements and attributes. I hope this brief run-down is useful to others some day! Converting ADO Recordsets to an XML stream was a huge winner for me as it saved all sorts of time and had a namespace and schema already defined that made the results come out correctly. Using a flatter XML format with 2 inputs was also much easier than sticking to some ideal about having everything in a single XML stream.

Read the article

Sensible Way to Pass Web Data to Sql Server Database

- by Emtucifor

After exploring several different ways to pass web data to a database for update purposes, I'm wondering if XML might be a good strategy. The database is currently SQL 2000. In a few months it will move to SQL 2005 and I will be able to change things if needed, but I need a SQL 2000 solution now. First of all, the database in question uses the EAV model. I know that this kind of database is generally highly frowned on, so for the purposes of this question, please just accept that this is not going to change. The current update method has the web server inserting values (that have all been converted first to their correct underlying types, then to sql_variant) to a temp table. A stored procedure is then run which expects the temp table to exist and it takes care of updating, inserting, or deleting things as needed. So far, only a single element has needed to be updated at a time. But now, there is a requirement to be able to edit multiple elements at once, and also to support hierarchical elements, each of which can have its own list of attributes. Here's some example XML I hand-typed to demonstrate what I'm thinking of. Note that in this database the Entity is Element and an ID of 0 signifies "create" aka an insert of a new item. <Elements> <Element ID="1234"> <Attr ID="221">Value</Attr> <Attr ID="225">287</Attr> <Attr ID="234"> <Element ID="99825"> <Attr ID="7">Value1</Attr> <Attr ID="8">Value2</Attr> <Attr ID="9" Action="delete" /> </Element> <Element ID="99826" Action="delete" /> <Element ID="0" Type="24"> <Attr ID="7">Value4</Attr> <Attr ID="8">Value5</Attr> <Attr ID="9">Value6</Attr> </Element> <Element ID="0" Type="24"> <Attr ID="7">Value7</Attr> <Attr ID="8">Value8</Attr> <Attr ID="9">Value9</Attr> </Element> </Attr> <Rel ID="3827" Action="delete" /> <Rel ID="2284" Role="parent"> <Element ID="3827" /> <Element ID="3829" /> <Attr ID="665">1</Attr> </Rel> <Rel ID="0" Type="23" Role="child"> <Element ID="3830" /> <Attr ID="67" </Rel> </Element> <Element ID="0" Type="87"> <Attr ID="221">Value</Attr> <Attr ID="225">569</Attr> <Attr ID="234"> <Element ID="0" Type="24"> <Attr ID="7">Value10</Attr> <Attr ID="8">Value11</Attr> <Attr ID="9">Value12</Attr> </Element> </Attr> </Element> <Element ID="1235" Action="delete" /> </Elements> Some Attributes are straight value types, such as AttrID 221. But AttrID 234 is a special "multi-value" type that can have a list of elements underneath it, and each one can have one or more values. Types only need to be presented when a new item is created, since the ElementID fully implies the type if it already exists. I'll probably support only passing in changed items (as detected by javascript). And there may be an Action="Delete" on Attr elements as well, since NULLs are treated as "unselected"--sometimes it's very important to know if a Yes/No question has intentionally been answered No or if no one's bothered to say Yes yet. There is also a different kind of data, a Relationship. At this time, those are updated through individual AJAX calls as things are edited in the UI, but I'd like to include those so that changes to relationships can be canceled (right now, once you change it, it's done). So those are really elements, too, but they are called Rel instead of Element. Relationships are implemented as ElementID1 and ElementID2, so the RelID 2284 in the XML above is in the database as: ElementID 2284 ElementID1 1234 ElementID2 3827 Having multiple children in one relationship isn't currently supported, but it would be nice later. Does this strategy and the example XML make sense? Is there a more sensible way? I'm just looking for some broad critique to help save me from going down a bad path. Any aspect that you'd like to comment on would be helpful. The web language happens to be Classic ASP, but that could change to ASP.Net at some point. A persistence engine like Linq or nHibernate is probably not acceptable right now--I just want to get this already working application enhanced without a huge amount of development time. I'll choose the answer that shows experience and has a balance of good warnings about what not to do, confirmations of what I'm planning to do, and recommendations about something else to do. I'll make it as objective as possible. P.S. I'd like to handle unicode characters as well as very long strings (10k +). UPDATE I have had this working for some time and I used the ADO Recordset Save-To-Stream trick to make creating the XML really easy. The result seems to be fairly fast, though if speed ever becomes a problem I may revisit this. In the meantime, my code works to handle any number of elements and attributes on the page at once, including updating, deleting, and creating new items all in one go. I settled on a scheme like so for all my elements: Existing data elements Example: input name e12345_a678 (element 12345, attribute 678), the input value is the value of the attribute. New elements Javascript copies a hidden template of the set of HTML elements needed for the type into the correct location on the page, increments a counter to get a new ID for this item, and prepends the number to the names of the form items. var newid = 0; function metadataAdd(reference, nameid, value) { var t = document.createElement('input'); t.setAttribute('name', nameid); t.setAttribute('id', nameid); t.setAttribute('type', 'hidden'); t.setAttribute('value', value); reference.appendChild(t); } function multiAdd(target, parentelementid, attrid, elementtypeid) { var proto = document.getElementById('a' + attrid + '_proto'); var instance = document.createElement('p'); target.parentNode.parentNode.insertBefore(instance, target.parentNode); var thisid = ++newid; instance.innerHTML = proto.innerHTML.replace(/{prefix}/g, 'n' + thisid + '_'); instance.id = 'n' + thisid; instance.className += ' new'; metadataAdd(instance, 'n' + thisid + '_p', parentelementid); metadataAdd(instance, 'n' + thisid + '_c', attrid); metadataAdd(instance, 'n' + thisid + '_t', elementtypeid); return false; } Example: Template input name _a678 becomes n1_a678 (a new element, the first one on the page, attribute 678). all attributes of this new element are tagged with the same prefix of n1. The next new item will be n2, and so on. Some hidden form inputs are created: n1_t, value is the elementtype of the element to be created n1_p, value is the parent id of the element (if it is a relationship) n1_c, value is the child id of the element (if it is a relationship) Deleting elements A hidden input is created in the form e12345_t with value set to 0. The existing controls displaying that attribute's values are disabled so they are not included in the form post. So "set type to 0" is treated as delete. With this scheme, every item on the page has a unique name and can be distinguished properly, and every action can be represented properly. When the form is posted, here's a sample of building one of the two recordsets used (classic ASP code): Set Data = Server.CreateObject("ADODB.Recordset") Data.Fields.Append "ElementID", adInteger, 4, adFldKeyColumn Data.Fields.Append "AttrID", adInteger, 4, adFldKeyColumn Data.Fields.Append "Value", adLongVarWChar, 2147483647, adFldIsNullable Or adFldMayBeNull Data.CursorLocation = adUseClient Data.CursorType = adOpenDynamic Data.Open This is the recordset for values, the other is for the elements themselves. I step through the posted form and for the element recordset use a Scripting.Dictionary populated with instances of a custom Class that has the properties I need, so that I can add the values piecemeal, since they don't always come in order. New elements are added as negative to distinguish them from regular elements (rather than requiring a separate column to indicate if it is new or addresses an existing element). I use regular expression to tear apart the form keys: "^(e|n)([0-9]{1,10})_(a|p|t|c)([0-9]{0,10})$" Then, adding an attribute looks like this. Data.AddNew ElementID.Value = DataID AttrID.Value = Integerize(Matches(0).SubMatches(3)) AttrValue.Value = Request.Form(Key) Data.Update ElementID, AttrID, and AttrValue are references to the fields of the recordset. This method is hugely faster than using Data.Fields("ElementID").Value each time. I loop through the Dictionary of element updates and ignore any that don't have all the proper information, adding the good ones to the recordset. Then I call my data-updating stored procedure like so: Set Cmd = Server.CreateObject("ADODB.Command") With Cmd Set .ActiveConnection = MyDBConn .CommandType = adCmdStoredProc .CommandText = "DataPost" .Prepared = False .Parameters.Append .CreateParameter("@ElementMetadata", adLongVarWChar, adParamInput, 2147483647, XMLFromRecordset(Element)) .Parameters.Append .CreateParameter("@ElementData", adLongVarWChar, adParamInput, 2147483647, XMLFromRecordset(Data)) End With Result.Open Cmd ' previously created recordset object with options set Here's the function that does the xml conversion: Private Function XMLFromRecordset(Recordset) Dim Stream Set Stream = Server.CreateObject("ADODB.Stream") Stream.Open Recordset.Save Stream, adPersistXML Stream.Position = 0 XMLFromRecordset = Stream.ReadText End Function Just in case the web page needs to know, the SP returns a recordset of any new elements, showing their page value and their created value (so I can see that n1 is now e12346 for example). Here are some key snippets from the stored procedure. Note this is SQL 2000 for now, though I'll be able to switch to 2005 soon: CREATE PROCEDURE [dbo].[DataPost] @ElementMetaData ntext, @ElementData ntext AS DECLARE @hdoc int --- snip --- EXEC sp_xml_preparedocument @hdoc OUTPUT, @ElementMetaData, '<xml xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" xmlns:rs="urn:schemas-microsoft-com:rowset" xmlns:z="#RowsetSchema" />' INSERT #ElementMetadata (ElementID, ElementTypeID, ElementID1, ElementID2) SELECT * FROM OPENXML(@hdoc, '/xml/rs:data/rs:insert/z:row', 0) WITH ( ElementID int, ElementTypeID int, ElementID1 int, ElementID2 int ) ORDER BY ElementID -- orders negative items (new elements) first so they begin counting at 1 for later ID calculation EXEC sp_xml_removedocument @hdoc --- snip --- UPDATE E SET E.ElementTypeID = M.ElementTypeID FROM Element E INNER JOIN #ElementMetadata M ON E.ElementID = M.ElementID WHERE E.ElementID >= 1 AND M.ElementTypeID >= 1 The following query does the correlation of the negative new element ids to the newly inserted ones: UPDATE #ElementMetadata -- Correlate the new ElementIDs with the input rows SET NewElementID = Scope_Identity() - @@RowCount + DataID WHERE ElementID < 0 Other set-based queries do all the other work of validating that the attributes are allowed, are the correct data type, and inserting, updating, and deleting elements and attributes. I hope this brief run-down is useful to others some day! Converting ADO Recordsets to an XML stream was a huge winner for me as it saved all sorts of time and had a namespace and schema already defined that made the results come out correctly. Using a flatter XML format with 2 inputs was also much easier than sticking to some ideal about having everything in a single XML stream.

Read the article

XML\Jquery create listings based on user selection

- by Sirius Mane

Alright, so what I need to try and accomplish is having a static web page that will display information pulled from an XML document and render it to the screen without refreshing. Basic AJAX stuff I guess. The trick is, as I'm trying to think this through I keep coming into 'logical' barriers mentally. Objectives: -Have a chart which displays baseball team names, wins, losses, ties. In my XML doc there is a 'pending' status, so games not completed should not be displayed.(Need help here) -Have a selection list which allows you to select a team which is populated from XML doc. (done) -Upon selecting a particular team from the aforementioned selection list the page should display in a separate area all of the planned games for that team. Including pending. Basically all of the games associated with that team and the dates (which is included in the XML file). (Need help here) What I have so far: HTML\JS <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <link rel="stylesheet" href="batty.css" type="text/css" /> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <title>Little Batty League</title> <script type="text/javascript" src="library.js"></script> <script type="text/javascript" src="jquery.js"></script> <script type="text/javascript"> var IE = window.ActiveXObject ? true: false; var MOZ = document.implementation.createDocument ? true: false; $(document).ready(function(){ $.ajax({ type: "GET", url: "schedule.xml", dataType: "xml", success: function(xml) { var select = $('#mySelect'); $(xml).find('Teams').each(function(){ var title = $(this).find('Team').text(); select.append("<option/><option class='ddheader'>"+title+"</option>"); }); select.children(":first").text("please make a selection").attr("selected",true); } }); }); </script> </script> </head> <body onLoad="init()">  <div id="container">  <div id="banner"> <img src="images/mascot.jpg" width="324" height="112" alt="Mascot" />  <table width="900" border="0" cellpadding="0" cellspacing="0"> <tr> <td><div class="menuButton"><a href="index.html">Home</a></div></td> <td><div class="menuButton"><a href="schedule.html">Schedule</a></div></td> <td><div class="menuButton"><a href="contact.html">Contact</a></div></td> <td><div class="menuButton"><a href="about.html">About</a></div></td> </tr> </table>  </div>   <div id="content"> <br /> <form> <select id="mySelect"> <option>please make a selection</option> </select> </form> </div>   <div id="footer"> © 2012 Batty League </div>  </div>  </body> </html> And the XML is: <?xml version="1.0" encoding="utf-8"?> <Schedule season="1"> <Teams> <Team>Bluejays</Team> </Teams> <Teams> <Team>Chickens</Team> </Teams> <Teams> <Team>Lions</Team> </Teams> <Teams> <Team>Pixies</Team> </Teams> <Teams> <Team>Zombies</Team> </Teams> <Teams> <Team>Wombats</Team> </Teams> <Game status="Played"> <Home_Team>Chickens</Home_Team> <Away_Team>Bluejays</Away_Team> <Date>2012-01-10T09:00:00</Date> </Game> <Game status="Pending"> <Home_Team>Bluejays </Home_Team> <Away_Team>Chickens</Away_Team> <Date>2012-01-11T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Bluejays</Home_Team> <Away_Team>Lions</Away_Team> <Date>2012-01-18T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Lions</Home_Team> <Away_Team>Bluejays</Away_Team> <Date>2012-01-19T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Bluejays</Home_Team> <Away_Team>Pixies</Away_Team> <Date>2012-01-21T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Pixies</Home_Team> <Away_Team>Bluejays</Away_Team> <Date>2012-01-23T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Bluejays</Home_Team> <Away_Team>Zombies</Away_Team> <Date>2012-01-25T09:00:00</Date> </Game> <Game status="Pending"> <Home_Team>Zombies</Home_Team> <Away_Team>Bluejays</Away_Team> <Date>2012-01-27T09:00:00</Date> </Game> <Game status="Pending"> <Home_Team>Bluejays</Home_Team> <Away_Team>Wombats</Away_Team> <Date>2012-01-28T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Wombats</Home_Team> <Away_Team>Bluejays</Away_Team> <Date>2012-01-30T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Chickens</Home_Team> <Away_Team>Lions</Away_Team> <Date>2012-01-31T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Lions</Home_Team> <Away_Team>Chickens</Away_Team> <Date>2012-02-04T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Chickens</Home_Team> <Away_Team>Pixies</Away_Team> <Date>2012-02-05T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Pixies</Home_Team> <Away_Team>Chickens</Away_Team> <Date>2012-02-07T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Chickens</Home_Team> <Away_Team>Zombies</Away_Team> <Date>2012-02-08T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Zombies</Home_Team> <Away_Team>Chickens</Away_Team> <Date>2012-02-10T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Lions</Home_Team> <Away_Team>Pixies</Away_Team> <Date>2012-02-12T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Pixies </Home_Team> <Away_Team>Lions</Away_Team> <Date>2012-02-14T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Lions</Home_Team> <Away_Team>Zombies</Away_Team> <Date>2012-02-15T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Zombies</Home_Team> <Away_Team>Lions</Away_Team> <Date>2012-02-16T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Lions</Home_Team> <Away_Team>Wombats</Away_Team> <Date>2012-01-23T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Wombats</Home_Team> <Away_Team>Lions</Away_Team> <Date>2012-02-24T09:00:00</Date> </Game> <Game status="Pending"> <Home_Team>Pixies</Home_Team> <Away_Team>Zombies</Away_Team> <Date>2012-02-25T09:00:00</Date> </Game> <Game status="Pending"> <Home_Team>Zombies</Home_Team> <Away_Team>Pixies</Away_Team> <Date>2012-02-26T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Pixies</Home_Team> <Away_Team>Wombats</Away_Team> <Date>2012-02-27T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Wombats</Home_Team> <Away_Team>Pixies</Away_Team> <Date>2012-02-28T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Zombies</Home_Team> <Away_Team>Wombats</Away_Team> <Date>2012-02-04T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Wombats</Home_Team> <Away_Team>Zombies</Away_Team> <Date>2012-02-05T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Wombats</Home_Team> <Away_Team>Chickens</Away_Team> <Date>2012-02-07T09:00:00</Date> </Game> <Game status="Played"> <Home_Team>Chickens</Home_Team> <Away_Team>Wombats</Away_Team> <Date>2012-02-08T09:00:00</Date> </Game> </Schedule> If anybody can point me to Jquery code\modules that would greatly help me with this I'd be appreciate. Any help right now would be great, I'm just banging my head against a wall. I'm trying to avoid using XSLT transforms because I absolutely despise XML and I'm not good with it. So I'd -like- to just use Javascript\PHP\etc with only a sprinkling of the necessary XML where possible..

Read the article

C++ HW - defining classes - objects that have objects of other class problem in header file (out of

- by kitfuntastik

This is my first time with much of this code. With this instancepool.h file below I get errors saying I can't use vector (line 14) or have instance& as a return type (line 20). It seems it can't use the instance objects despite the fact that I have included them. #ifndef _INSTANCEPOOL_H #define _INSTANCEPOOL_H #include "instance.h" #include <iostream> #include <string> #include <vector> #include <stdlib.h> using namespace std; class InstancePool { private: unsigned instances;//total number of instance objects vector<instance> ipp;//the collection of instance objects, held in a vector public: InstancePool();//Default constructor. Creates an InstancePool object that contains no Instance objects InstancePool(const InstancePool& original);//Copy constructor. After copying, changes to original should not affect the copy that was created. ~InstancePool();//Destructor unsigned getNumberOfInstances() const;//Returns the number of Instance objects the the InstancePool contains. const instance& operator[](unsigned index) const; InstancePool& operator=(const InstancePool& right);//Overloading the assignment operator for InstancePool. friend istream& operator>>(istream& in, InstancePool& ip);//Overloading of the >> operator. friend ostream& operator<<(ostream& out, const InstancePool& ip);//Overloading of the << operator. }; #endif Here is the instance.h : #ifndef _INSTANCE_H #define _INSTANCE_H ///////////////////////////////#include "instancepool.h" #include <iostream> #include <string> #include <stdlib.h> using namespace std; class Instance { private: string filenamee; bool categoryy; unsigned featuress; unsigned* featureIDD; unsigned* frequencyy; string* featuree; public: Instance (unsigned features = 0);//default constructor unsigned getNumberOfFeatures() const; //Returns the number of the keywords that the calling Instance object can store. Instance(const Instance& original);//Copy constructor. After copying, changes to the original should not affect the copy that was created. ~Instance() { delete []featureIDD; delete []frequencyy; delete []featuree;}//Destructor. void setCategory(bool category){categoryy = category;}//Sets the category of the message. Spam messages are represented with true and and legit messages with false.//easy bool getCategory() const;//Returns the category of the message. void setFileName(const string& filename){filenamee = filename;}//Stores the name of the file (i.e. “spam/spamsga1.txt”, like in 1st assignment) in which the message was initially stored.//const string& trick? string getFileName() const;//Returns the name of the file in which the message was initially stored. void setFeature(unsigned i, const string& feature, unsigned featureID,unsigned frequency) {//i for array positions featuree[i] = feature; featureIDD[i] = featureID; frequencyy[i] = frequency; } string getFeature(unsigned i) const;//Returns the keyword which is located in the ith position.//const string unsigned getFeatureID(unsigned i) const;//Returns the code of the keyword which is located in the ith position. unsigned getFrequency(unsigned i) const;//Returns the frequency Instance& operator=(const Instance& right);//Overloading of the assignment operator for Instance. friend ostream& operator<<(ostream& out, const Instance& inst);//Overloading of the << operator for Instance. friend istream& operator>>(istream& in, Instance& inst);//Overloading of the >> operator for Instance. }; #endif Also, if it is helpful here is instance.cpp: // Here we implement the functions of the class apart from the inline ones #include "instance.h" #include <iostream> #include <string> #include <stdlib.h> using namespace std; Instance::Instance(unsigned features) { //Constructor that can be used as the default constructor. featuress = features; if (features == 0) return; featuree = new string[featuress]; // Dynamic memory allocation. featureIDD = new unsigned[featuress]; frequencyy = new unsigned[featuress]; return; } unsigned Instance::getNumberOfFeatures() const {//Returns the number of the keywords that the calling Instance object can store. return featuress;} Instance::Instance(const Instance& original) {//Copy constructor. filenamee = original.filenamee; categoryy = original.categoryy; featuress = original.featuress; featuree = new string[featuress]; for(unsigned i = 0; i < featuress; i++) { featuree[i] = original.featuree[i]; } featureIDD = new unsigned[featuress]; for(unsigned i = 0; i < featuress; i++) { featureIDD[i] = original.featureIDD[i]; } frequencyy = new unsigned[featuress]; for(unsigned i = 0; i < featuress; i++) { frequencyy[i] = original.frequencyy[i];} } bool Instance::getCategory() const { //Returns the category of the message. return categoryy;} string Instance::getFileName() const { //Returns the name of the file in which the message was initially stored. return filenamee;} string Instance::getFeature(unsigned i) const { //Returns the keyword which is located in the ith position.//const string return featuree[i];} unsigned Instance::getFeatureID(unsigned i) const { //Returns the code of the keyword which is located in the ith position. return featureIDD[i];} unsigned Instance::getFrequency(unsigned i) const { //Returns the frequency return frequencyy[i];} Instance& Instance::operator=(const Instance& right) { //Overloading of the assignment operator for Instance. if(this == &right) return *this; delete []featureIDD; delete []frequencyy; delete []featuree; filenamee = right.filenamee; categoryy = right.categoryy; featuress = right.featuress; featureIDD = new unsigned[featuress]; frequencyy = new unsigned[featuress]; featuree = new string[featuress]; for(unsigned i = 0; i < featuress; i++) { featureIDD[i] = right.featureIDD[i]; } for(unsigned i = 0; i < featuress; i++) { frequencyy[i] = right.frequencyy[i]; } for(unsigned i = 0; i < featuress; i++) { featuree[i] = right.featuree[i]; } return *this; } ostream& operator<<(ostream& out, const Instance& inst) {//Overloading of the << operator for Instance. out << endl << "<message file=" << '"' << inst.filenamee << '"' << " category="; if (inst.categoryy == 0) out << '"' << "legit" << '"'; else out << '"' << "spam" << '"'; out << " features=" << '"' << inst.featuress << '"' << ">" <<endl; for (int i = 0; i < inst.featuress; i++) { out << "<feature id=" << '"' << inst.featureIDD[i] << '"' << " freq=" << '"' << inst.frequencyy[i] << '"' << "> " << inst.featuree[i] << " </feature>"<< endl; } out << "</message>" << endl; return out; } istream& operator>>(istream& in, Instance& inst) { //Overloading of the >> operator for Instance. string word; string numbers = ""; string filenamee2 = ""; bool categoryy2 = 0; unsigned featuress2; string featuree2; unsigned featureIDD2; unsigned frequencyy2; unsigned i; unsigned y; while(in >> word) { if (word == "<message") {//if at beginning of message in >> word;//grab filename word for (y=6; word[y]!='"'; y++) {//pull out filename from between quotes filenamee2 += word[y];} in >> word;//grab category word if (word[10] == 's') categoryy2 = 1; in >> word;//grab features word for (y=10; word[y]!='"'; y++) { numbers += word[y];} featuress2 = atoi(numbers.c_str());//convert string of numbers to integer Instance tempp2(featuress2);//make a temporary Instance object to hold values read in tempp2.setFileName(filenamee2);//set temp object to filename read in tempp2.setCategory(categoryy2); for (i=0; i<featuress2; i++) {//loop reading in feature reports for message in >> word >> word >> word;//skip two words numbers = "";//reset numbers string for (int y=4; word[y]!='"'; y++) {//grab feature ID numbers += word[y];} featureIDD2 = atoi(numbers.c_str()); in >> word;// numbers = ""; for (int y=6; word[y]!='"'; y++) {//grab frequency numbers += word[y];} frequencyy2 = atoi(numbers.c_str()); in >> word;//grab actual feature string featuree2 = word; tempp2.setFeature(i, featuree2, featureIDD2, frequencyy2); }//all done reading in and setting features in >> word;//read in last part of message : </message> inst = tempp2;//set inst (reference) to tempp2 (tempp2 will be destroyed at end of function call) return in; } } } and instancepool.cpp: // Here we implement the functions of the class apart from the inline ones #include "instancepool.h" #include "instance.h" #include <iostream> #include <string> #include <vector> #include <stdlib.h> using namespace std; InstancePool::InstancePool()//Default constructor. Creates an InstancePool object that contains no Instance objects { instances = 0; ipp.clear(); } InstancePool::~InstancePool() { ipp.clear();} InstancePool::InstancePool(const InstancePool& original) {//Copy constructor. instances = original.instances; for (int i = 0; i<instances; i++) { ipp.push_back(original.ipp[i]); } } unsigned InstancePool::getNumberOfInstances() const {//Returns the number of Instance objects the the InstancePool contains. return instances;} const Instance& InstancePool::operator[](unsigned index) const {//Overloading of the [] operator for InstancePool. return ipp[index];} InstancePool& InstancePool::operator=(const InstancePool& right) {//Overloading the assignment operator for InstancePool. if(this == &right) return *this; ipp.clear(); instances = right.instances; for(unsigned i = 0; i < instances; i++) { ipp.push_back(right.ipp[i]); } return *this; } istream& operator>>(istream& in, InstancePool& ip) {//Overloading of the >> operator. ip.ipp.clear(); string word; string numbers; int total;//int to hold total number of messages in collection while(in >> word) { if (word == "<messagecollection"){ in >> word;//reads in total number of all messages for (int y=10; word[y]!='"'; y++){ numbers = ""; numbers += word[y]; } total = atoi(numbers.c_str()); for (int x = 0; x<total; x++) {//do loop for each message in collection in >> ip.ipp[x];//use instance friend function and [] operator to fill in values and create Instance objects and read them intot he vector } } } } ostream& operator<<(ostream& out, const InstancePool& ip) {//Overloading of the << operator. out << "<messagecollection messages=" << '"' << '>' << ip.instances << '"'<< endl << endl; for (int z=0; z<ip.instances; z++) { out << ip[z];} out << endl<<"</messagecollection>\n"; } This code is currently not writing to files correctly either at least, I'm sure it has many problems. I hope my posting of so much is not too much, and any help would be very much appreciated. Thanks!

Read the article

What are good design practices when working with Entity Framework

- by AD

This will apply mostly for an asp.net application where the data is not accessed via soa. Meaning that you get access to the objects loaded from the framework, not Transfer Objects, although some recommendation still apply. This is a community post, so please add to it as you see fit. Applies to: Entity Framework 1.0 shipped with Visual Studio 2008 sp1. Why pick EF in the first place? Considering it is a young technology with plenty of problems (see below), it may be a hard sell to get on the EF bandwagon for your project. However, it is the technology Microsoft is pushing (at the expense of Linq2Sql, which is a subset of EF). In addition, you may not be satisfied with NHibernate or other solutions out there. Whatever the reasons, there are people out there (including me) working with EF and life is not bad.make you think. EF and inheritance The first big subject is inheritance. EF does support mapping for inherited classes that are persisted in 2 ways: table per class and table the hierarchy. The modeling is easy and there are no programming issues with that part. (The following applies to table per class model as I don't have experience with table per hierarchy, which is, anyway, limited.) The real problem comes when you are trying to run queries that include one or many objects that are part of an inheritance tree: the generated sql is incredibly awful, takes a long time to get parsed by the EF and takes a long time to execute as well. This is a real show stopper. Enough that EF should probably not be used with inheritance or as little as possible. Here is an example of how bad it was. My EF model had ~30 classes, ~10 of which were part of an inheritance tree. On running a query to get one item from the Base class, something as simple as Base.Get(id), the generated SQL was over 50,000 characters. Then when you are trying to return some Associations, it degenerates even more, going as far as throwing SQL exceptions about not being able to query more than 256 tables at once. Ok, this is bad, EF concept is to allow you to create your object structure without (or with as little as possible) consideration on the actual database implementation of your table. It completely fails at this. So, recommendations? Avoid inheritance if you can, the performance will be so much better. Use it sparingly where you have to. In my opinion, this makes EF a glorified sql-generation tool for querying, but there are still advantages to using it. And ways to implement mechanism that are similar to inheritance. Bypassing inheritance with Interfaces First thing to know with trying to get some kind of inheritance going with EF is that you cannot assign a non-EF-modeled class a base class. Don't even try it, it will get overwritten by the modeler. So what to do? You can use interfaces to enforce that classes implement some functionality. For example here is a IEntity interface that allow you to define Associations between EF entities where you don't know at design time what the type of the entity would be. public enum EntityTypes{ Unknown = -1, Dog = 0, Cat } public interface IEntity { int EntityID { get; } string Name { get; } Type EntityType { get; } } public partial class Dog : IEntity { // implement EntityID and Name which could actually be fields // from your EF model Type EntityType{ get{ return EntityTypes.Dog; } } } Using this IEntity, you can then work with undefined associations in other classes // lets take a class that you defined in your model. // that class has a mapping to the columns: PetID, PetType public partial class Person { public IEntity GetPet() { return IEntityController.Get(PetID,PetType); } } which makes use of some extension functions: public class IEntityController { static public IEntity Get(int id, EntityTypes type) { switch (type) { case EntityTypes.Dog: return Dog.Get(id); case EntityTypes.Cat: return Cat.Get(id); default: throw new Exception("Invalid EntityType"); } } } Not as neat as having plain inheritance, particularly considering you have to store the PetType in an extra database field, but considering the performance gains, I would not look back. It also cannot model one-to-many, many-to-many relationship, but with creative uses of 'Union' it could be made to work. Finally, it creates the side effet of loading data in a property/function of the object, which you need to be careful about. Using a clear naming convention like GetXYZ() helps in that regards. Compiled Queries Entity Framework performance is not as good as direct database access with ADO (obviously) or Linq2SQL. There are ways to improve it however, one of which is compiling your queries. The performance of a compiled query is similar to Linq2Sql. What is a compiled query? It is simply a query for which you tell the framework to keep the parsed tree in memory so it doesn't need to be regenerated the next time you run it. So the next run, you will save the time it takes to parse the tree. Do not discount that as it is a very costly operation that gets even worse with more complex queries. There are 2 ways to compile a query: creating an ObjectQuery with EntitySQL and using CompiledQuery.Compile() function. (Note that by using an EntityDataSource in your page, you will in fact be using ObjectQuery with EntitySQL, so that gets compiled and cached). An aside here in case you don't know what EntitySQL is. It is a string-based way of writing queries against the EF. Here is an example: "select value dog from Entities.DogSet as dog where dog.ID = @ID". The syntax is pretty similar to SQL syntax. You can also do pretty complex object manipulation, which is well explained [here][1]. Ok, so here is how to do it using ObjectQuery< string query = "select value dog " + "from Entities.DogSet as dog " + "where dog.ID = @ID"; ObjectQuery<Dog> oQuery = new ObjectQuery<Dog>(query, EntityContext.Instance)); oQuery.Parameters.Add(new ObjectParameter("ID", id)); oQuery.EnablePlanCaching = true; return oQuery.FirstOrDefault(); The first time you run this query, the framework will generate the expression tree and keep it in memory. So the next time it gets executed, you will save on that costly step. In that example EnablePlanCaching = true, which is unnecessary since that is the default option. The other way to compile a query for later use is the CompiledQuery.Compile method. This uses a delegate: static readonly Func<Entities, int, Dog> query_GetDog = CompiledQuery.Compile<Entities, int, Dog>((ctx, id) => ctx.DogSet.FirstOrDefault(it => it.ID == id)); or using linq static readonly Func<Entities, int, Dog> query_GetDog = CompiledQuery.Compile<Entities, int, Dog>((ctx, id) => (from dog in ctx.DogSet where dog.ID == id select dog).FirstOrDefault()); to call the query: query_GetDog.Invoke( YourContext, id ); The advantage of CompiledQuery is that the syntax of your query is checked at compile time, where as EntitySQL is not. However, there are other consideration... Includes Lets say you want to have the data for the dog owner to be returned by the query to avoid making 2 calls to the database. Easy to do, right? EntitySQL string query = "select value dog " + "from Entities.DogSet as dog " + "where dog.ID = @ID"; ObjectQuery<Dog> oQuery = new ObjectQuery<Dog>(query, EntityContext.Instance)).Include("Owner"); oQuery.Parameters.Add(new ObjectParameter("ID", id)); oQuery.EnablePlanCaching = true; return oQuery.FirstOrDefault(); CompiledQuery static readonly Func<Entities, int, Dog> query_GetDog = CompiledQuery.Compile<Entities, int, Dog>((ctx, id) => (from dog in ctx.DogSet.Include("Owner") where dog.ID == id select dog).FirstOrDefault()); Now, what if you want to have the Include parametrized? What I mean is that you want to have a single Get() function that is called from different pages that care about different relationships for the dog. One cares about the Owner, another about his FavoriteFood, another about his FavotireToy and so on. Basicly, you want to tell the query which associations to load. It is easy to do with EntitySQL public Dog Get(int id, string include) { string query = "select value dog " + "from Entities.DogSet as dog " + "where dog.ID = @ID"; ObjectQuery<Dog> oQuery = new ObjectQuery<Dog>(query, EntityContext.Instance)) .IncludeMany(include); oQuery.Parameters.Add(new ObjectParameter("ID", id)); oQuery.EnablePlanCaching = true; return oQuery.FirstOrDefault(); } The include simply uses the passed string. Easy enough. Note that it is possible to improve on the Include(string) function (that accepts only a single path) with an IncludeMany(string) that will let you pass a string of comma-separated associations to load. Look further in the extension section for this function. If we try to do it with CompiledQuery however, we run into numerous problems: The obvious static readonly Func<Entities, int, string, Dog> query_GetDog = CompiledQuery.Compile<Entities, int, string, Dog>((ctx, id, include) => (from dog in ctx.DogSet.Include(include) where dog.ID == id select dog).FirstOrDefault()); will choke when called with: query_GetDog.Invoke( YourContext, id, "Owner,FavoriteFood" ); Because, as mentionned above, Include() only wants to see a single path in the string and here we are giving it 2: "Owner" and "FavoriteFood" (which is not to be confused with "Owner.FavoriteFood"!). Then, let's use IncludeMany(), which is an extension function static readonly Func<Entities, int, string, Dog> query_GetDog = CompiledQuery.Compile<Entities, int, string, Dog>((ctx, id, include) => (from dog in ctx.DogSet.IncludeMany(include) where dog.ID == id select dog).FirstOrDefault()); Wrong again, this time it is because the EF cannot parse IncludeMany because it is not part of the functions that is recognizes: it is an extension. Ok, so you want to pass an arbitrary number of paths to your function and Includes() only takes a single one. What to do? You could decide that you will never ever need more than, say 20 Includes, and pass each separated strings in a struct to CompiledQuery. But now the query looks like this: from dog in ctx.DogSet.Include(include1).Include(include2).Include(include3) .Include(include4).Include(include5).Include(include6) .[...].Include(include19).Include(include20) where dog.ID == id select dog which is awful as well. Ok, then, but wait a minute. Can't we return an ObjectQuery< with CompiledQuery? Then set the includes on that? Well, that what I would have thought so as well: static readonly Func<Entities, int, ObjectQuery<Dog>> query_GetDog = CompiledQuery.Compile<Entities, int, string, ObjectQuery<Dog>>((ctx, id) => (ObjectQuery<Dog>)(from dog in ctx.DogSet where dog.ID == id select dog)); public Dog GetDog( int id, string include ) { ObjectQuery<Dog> oQuery = query_GetDog(id); oQuery = oQuery.IncludeMany(include); return oQuery.FirstOrDefault; } That should have worked, except that when you call IncludeMany (or Include, Where, OrderBy...) you invalidate the cached compiled query because it is an entirely new one now! So, the expression tree needs to be reparsed and you get that performance hit again. So what is the solution? You simply cannot use CompiledQueries with parametrized Includes. Use EntitySQL instead. This doesn't mean that there aren't uses for CompiledQueries. It is great for localized queries that will always be called in the same context. Ideally CompiledQuery should always be used because the syntax is checked at compile time, but due to limitation, that's not possible. An example of use would be: you may want to have a page that queries which two dogs have the same favorite food, which is a bit narrow for a BusinessLayer function, so you put it in your page and know exactly what type of includes are required. Passing more than 3 parameters to a CompiledQuery Func is limited to 5 parameters, of which the last one is the return type and the first one is your Entities object from the model. So that leaves you with 3 parameters. A pitance, but it can be improved on very easily. public struct MyParams { public string param1; public int param2; public DateTime param3; } static readonly Func<Entities, MyParams, IEnumerable<Dog>> query_GetDog = CompiledQuery.Compile<Entities, MyParams, IEnumerable<Dog>>((ctx, myParams) => from dog in ctx.DogSet where dog.Age == myParams.param2 && dog.Name == myParams.param1 and dog.BirthDate > myParams.param3 select dog); public List<Dog> GetSomeDogs( int age, string Name, DateTime birthDate ) { MyParams myParams = new MyParams(); myParams.param1 = name; myParams.param2 = age; myParams.param3 = birthDate; return query_GetDog(YourContext,myParams).ToList(); } Return Types (this does not apply to EntitySQL queries as they aren't compiled at the same time during execution as the CompiledQuery method) Working with Linq, you usually don't force the execution of the query until the very last moment, in case some other functions downstream wants to change the query in some way: static readonly Func<Entities, int, string, IEnumerable<Dog>> query_GetDog = CompiledQuery.Compile<Entities, int, string, IEnumerable<Dog>>((ctx, age, name) => from dog in ctx.DogSet where dog.Age == age && dog.Name == name select dog); public IEnumerable<Dog> GetSomeDogs( int age, string name ) { return query_GetDog(YourContext,age,name); } public void DataBindStuff() { IEnumerable<Dog> dogs = GetSomeDogs(4,"Bud"); // but I want the dogs ordered by BirthDate gridView.DataSource = dogs.OrderBy( it => it.BirthDate ); } What is going to happen here? By still playing with the original ObjectQuery (that is the actual return type of the Linq statement, which implements IEnumerable), it will invalidate the compiled query and be force to re-parse. So, the rule of thumb is to return a List< of objects instead. static readonly Func<Entities, int, string, IEnumerable<Dog>> query_GetDog = CompiledQuery.Compile<Entities, int, string, IEnumerable<Dog>>((ctx, age, name) => from dog in ctx.DogSet where dog.Age == age && dog.Name == name select dog); public List<Dog> GetSomeDogs( int age, string name ) { return query_GetDog(YourContext,age,name).ToList(); //<== change here } public void DataBindStuff() { List<Dog> dogs = GetSomeDogs(4,"Bud"); // but I want the dogs ordered by BirthDate gridView.DataSource = dogs.OrderBy( it => it.BirthDate ); } When you call ToList(), the query gets executed as per the compiled query and then, later, the OrderBy is executed against the objects in memory. It may be a little bit slower, but I'm not even sure. One sure thing is that you have no worries about mis-handling the ObjectQuery and invalidating the compiled query plan. Once again, that is not a blanket statement. ToList() is a defensive programming trick, but if you have a valid reason not to use ToList(), go ahead. There are many cases in which you would want to refine the query before executing it. Performance What is the performance impact of compiling a query? It can actually be fairly large. A rule of thumb is that compiling and caching the query for reuse takes at least double the time of simply executing it without caching. For complex queries (read inherirante), I have seen upwards to 10 seconds. So, the first time a pre-compiled query gets called, you get a performance hit. After that first hit, performance is noticeably better than the same non-pre-compiled query. Practically the same as Linq2Sql When you load a page with pre-compiled queries the first time you will get a hit. It will load in maybe 5-15 seconds (obviously more than one pre-compiled queries will end up being called), while subsequent loads will take less than 300ms. Dramatic difference, and it is up to you to decide if it is ok for your first user to take a hit or you want a script to call your pages to force a compilation of the queries. Can this query be cached? { Dog dog = from dog in YourContext.DogSet where dog.ID == id select dog; } No, ad-hoc Linq queries are not cached and you will incur the cost of generating the tree every single time you call it. Parametrized Queries Most search capabilities involve heavily parametrized queries. There are even libraries available that will let you build a parametrized query out of lamba expressions. The problem is that you cannot use pre-compiled queries with those. One way around that is to map out all the possible criteria in the query and flag which one you want to use: public struct MyParams { public string name; public bool checkName; public int age; public bool checkAge; } static readonly Func<Entities, MyParams, IEnumerable<Dog>> query_GetDog = CompiledQuery.Compile<Entities, MyParams, IEnumerable<Dog>>((ctx, myParams) => from dog in ctx.DogSet where (myParams.checkAge == true && dog.Age == myParams.age) && (myParams.checkName == true && dog.Name == myParams.name ) select dog); protected List<Dog> GetSomeDogs() { MyParams myParams = new MyParams(); myParams.name = "Bud"; myParams.checkName = true; myParams.age = 0; myParams.checkAge = false; return query_GetDog(YourContext,myParams).ToList(); } The advantage here is that you get all the benifits of a pre-compiled quert. The disadvantages are that you most likely will end up with a where clause that is pretty difficult to maintain, that you will incur a bigger penalty for pre-compiling the query and that each query you run is not as efficient as it could be (particularly with joins thrown in). Another way is to build an EntitySQL query piece by piece, like we all did with SQL. protected List<Dod> GetSomeDogs( string name, int age) { string query = "select value dog from Entities.DogSet where 1 = 1 "; if( !String.IsNullOrEmpty(name) ) query = query + " and dog.Name == @Name "; if( age > 0 ) query = query + " and dog.Age == @Age "; ObjectQuery<Dog> oQuery = new ObjectQuery<Dog>( query, YourContext ); if( !String.IsNullOrEmpty(name) ) oQuery.Parameters.Add( new ObjectParameter( "Name", name ) ); if( age > 0 ) oQuery.Parameters.Add( new ObjectParameter( "Age", age ) ); return oQuery.ToList(); } Here the problems are: - there is no syntax checking during compilation - each different combination of parameters generate a different query which will need to be pre-compiled when it is first run. In this case, there are only 4 different possible queries (no params, age-only, name-only and both params), but you can see that there can be way more with a normal world search. - Noone likes to concatenate strings! Another option is to query a large subset of the data and then narrow it down in memory. This is particularly useful if you are working with a definite subset of the data, like all the dogs in a city. You know there are a lot but you also know there aren't that many... so your CityDog search page can load all the dogs for the city in memory, which is a single pre-compiled query and then refine the results protected List<Dod> GetSomeDogs( string name, int age, string city) { string query = "select value dog from Entities.DogSet where dog.Owner.Address.City == @City "; ObjectQuery<Dog> oQuery = new ObjectQuery<Dog>( query, YourContext ); oQuery.Parameters.Add( new ObjectParameter( "City", city ) ); List<Dog> dogs = oQuery.ToList(); if( !String.IsNullOrEmpty(name) ) dogs = dogs.Where( it => it.Name == name ); if( age > 0 ) dogs = dogs.Where( it => it.Age == age ); return dogs; } It is particularly useful when you start displaying all the data then allow for filtering. Problems: - Could lead to serious data transfer if you are not careful about your subset. - You can only filter on the data that you returned. It means that if you don't return the Dog.Owner association, you will not be able to filter on the Dog.Owner.Name So what is the best solution? There isn't any. You need to pick the solution that works best for you and your problem: - Use lambda-based query building when you don't care about pre-compiling your queries. - Use fully-defined pre-compiled Linq query when your object structure is not too complex. - Use EntitySQL/string concatenation when the structure could be complex and when the possible number of different resulting queries are small (which means fewer pre-compilation hits). - Use in-memory filtering when you are working with a smallish subset of the data or when you had to fetch all of the data on the data at first anyway (if the performance is fine with all the data, then filtering in memory will not cause any time to be spent in the db). Singleton access The best way to deal with your context and entities accross all your pages is to use the singleton pattern: public sealed class YourContext { private const string instanceKey = "On3GoModelKey"; YourContext(){} public static YourEntities Instance { get { HttpContext context = HttpContext.Current; if( context == null ) return Nested.instance; if (context.Items[instanceKey] == null) { On3GoEntities entity = new On3GoEntities(); context.Items[instanceKey] = entity; } return (YourEntities)context.Items[instanceKey]; } } class Nested { // Explicit static constructor to tell C# compiler // not to mark type as beforefieldinit static Nested() { } internal static readonly YourEntities instance = new YourEntities(); } } NoTracking, is it worth it? When executing a query, you can tell the framework to track the objects it will return or not. What does it mean? With tracking enabled (the default option), the framework will track what is going on with the object (has it been modified? Created? Deleted?) and will also link objects together, when further queries are made from the database, which is what is of interest here. For example, lets assume that Dog with ID == 2 has an owner which ID == 10. Dog dog = (from dog in YourContext.DogSet where dog.ID == 2 select dog).FirstOrDefault(); //dog.OwnerReference.IsLoaded == false; Person owner = (from o in YourContext.PersonSet where o.ID == 10 select dog).FirstOrDefault(); //dog.OwnerReference.IsLoaded == true; If we were to do the same with no tracking, the result would be different. ObjectQuery<Dog> oDogQuery = (ObjectQuery<Dog>) (from dog in YourContext.DogSet where dog.ID == 2 select dog); oDogQuery.MergeOption = MergeOption.NoTracking; Dog dog = oDogQuery.FirstOrDefault(); //dog.OwnerReference.IsLoaded == false; ObjectQuery<Person> oPersonQuery = (ObjectQuery<Person>) (from o in YourContext.PersonSet where o.ID == 10 select o); oPersonQuery.MergeOption = MergeOption.NoTracking; Owner owner = oPersonQuery.FirstOrDefault(); //dog.OwnerReference.IsLoaded == false; Tracking is very useful and in a perfect world without performance issue, it would always be on. But in this world, there is a price for it, in terms of performance. So, should you use NoTracking to speed things up? It depends on what you are planning to use the data for. Is there any chance that the data your query with NoTracking can be used to make update/insert/delete in the database? If so, don't use NoTracking because associations are not tracked and will causes exceptions to be thrown. In a page where there are absolutly no updates to the database, you can use NoTracking. Mixing tracking and NoTracking is possible, but it requires you to be extra careful with updates/inserts/deletes. The problem is that if you mix then you risk having the framework trying to Attach() a NoTracking object to the context where another copy of the same object exist with tracking on. Basicly, what I am saying is that Dog dog1 = (from dog in YourContext.DogSet where dog.ID == 2).FirstOrDefault(); ObjectQuery<Dog> oDogQuery = (ObjectQuery<Dog>) (from dog in YourContext.DogSet where dog.ID == 2 select dog); oDogQuery.MergeOption = MergeOption.NoTracking; Dog dog2 = oDogQuery.FirstOrDefault(); dog1 and dog2 are 2 different objects, one tracked and one not. Using the detached object in an update/insert will force an Attach() that will say "Wait a minute, I do already have an object here with the same database key. Fail". And when you Attach() one object, all of its hierarchy gets attached as well, causing problems everywhere. Be extra careful. How much faster is it with NoTracking It depends on the queries. Some are much more succeptible to tracking than other. I don't have a fast an easy rule for it, but it helps. So I should use NoTracking everywhere then? Not exactly. There are some advantages to tracking object. The first one is that the object is cached, so subsequent call for that object will not hit the database. That cache is only valid for the lifetime of the YourEntities object, which, if you use the singleton code above, is the same as the page lifetime. One page request == one YourEntity object. So for multiple calls for the same object, it will load only once per page request. (Other caching mechanism could extend that). What happens when you are using NoTracking and try to load the same object multiple times? The database will be queried each time, so there is an impact there. How often do/should you call for the same object during a single page request? As little as possible of course, but it does happens. Also remember the piece above about having the associations connected automatically for your? You don't have that with NoTracking, so if you load your data in multiple batches, you will not have a link to between them: ObjectQuery<Dog> oDogQuery = (ObjectQuery<Dog>)(from dog in YourContext.DogSet select dog); oDogQuery.MergeOption = MergeOption.NoTracking; List<Dog> dogs = oDogQuery.ToList(); ObjectQuery<Person> oPersonQuery = (ObjectQuery<Person>)(from o in YourContext.PersonSet select o); oPersonQuery.MergeOption = MergeOption.NoTracking; List<Person> owners = oPersonQuery.ToList(); In this case, no dog will have its .Owner property set. Some things to keep in mind when you are trying to optimize the performance. No lazy loading, what am I to do? This can be seen as a blessing in disguise. Of course it is annoying to load everything manually. However, it decreases the number of calls to the db and forces you to think about when you should load data. The more you can load in one database call the better. That was always true, but it is enforced now with this 'feature' of EF. Of course, you can call if( !ObjectReference.IsLoaded ) ObjectReference.Load(); if you want to, but a better practice is to force the framework to load the objects you know you will need in one shot. This is where the discussion about parametrized Includes begins to make sense. Lets say you have you Dog object public class Dog { public Dog Get(int id) { return YourContext.DogSet.FirstOrDefault(it => it.ID == id ); } } This is the type of function you work with all the time. It gets called from all over the place and once you have that Dog object, you will do very different things to it in different functions. First, it should be pre-compiled, because you will call that very often. Second, each different pages will want to have access to a different subset of the Dog data. Some will want the Owner, some the FavoriteToy, etc. Of course, you could call Load() for each reference you need anytime you need one. But that will generate a call to the database each time. Bad idea. So instead, each page will ask for the data it wants to see when it first request for the Dog object: static public Dog Get(int id) { return GetDog(entity,"");} static public Dog Get(int id, string includePath) { string query = "select value o " + " from YourEntities.DogSet as o " +

Search Results

Search found 1972 results on 79 pages for 'trick'.

Page 79/79 | < Previous Page | 75 76 77 78 79

- by JoshReuben

- by Simon Thorpe

- by juanlarios

- by thatjeffsmith

- by Paul White

- by Stephen.Walther

- by Steven Davelaar

- by Justin

- by djmadscribbler

- by Uri

- by Rick Strahl

- by Rick Strahl

- by Richard Hayward

- by JM4

- by mrbinky3000

- by Mark Virtue

- by Emtucifor

- by Emtucifor

- by Sirius Mane

- by kitfuntastik

- by AD

< Previous Page | 75 76 77 78 79