Search Results

Search found 1894 results on 76 pages for 'phil factor'.


  • Problem implementing Blinn–Phong shading model

    - by Joe Hopfgartner
    I did this very simple, perfectly working implementation of the Phong reflection model (there is no ambient term implemented yet, but that doesn't bother me for now). The functions should be self-explanatory. /** * Implements the classic Phong illumination Model using a reflected light * vector. */ public class PhongIllumination implements IlluminationModel { @RGBParam(r = 0, g = 0, b = 0) public Vec3 ambient; @RGBParam(r = 1, g = 1, b = 1) public Vec3 diffuse; @RGBParam(r = 1, g = 1, b = 1) public Vec3 specular; @FloatParam(value = 20, min = 1, max = 200.0f) public float shininess; /* * Calculate the intensity of light reflected to the viewer . * * @param P = The surface position expressed in world coordinates. * * @param V = Normalized viewing vector from surface to eye in world * coordinates. * * @param N = Normalized normal vector at surface point in world * coordinates. * * @param surfaceColor = surfaceColor Color of the surface at the current * position. * * @param lights = The active light sources in the scene. * * @return Reflected light intensity I. */ public Vec3 shade(Vec3 P, Vec3 V, Vec3 N, Vec3 surfaceColor, Light lights[]) { Vec3 surfaceColordiffused = Vec3.mul(surfaceColor, diffuse); Vec3 totalintensity = new Vec3(0, 0, 0); for (int i = 0; i < lights.length; i++) { Vec3 L = lights[i].calcDirection(P); N = N.normalize(); V = V.normalize(); Vec3 R = Vec3.reflect(L, N); // reflection vector float diffuseLight = Vec3.dot(N, L); float specularLight = Vec3.dot(V, R); if (diffuseLight > 0) { totalintensity = Vec3.add(Vec3.mul(Vec3.mul( surfaceColordiffused, lights[i].calcIntensity(P)), diffuseLight), totalintensity); if (specularLight > 0) { Vec3 Il = lights[i].calcIntensity(P); Vec3 Ilincident = Vec3.mul(Il, Math.max(0.0f, Vec3 .dot(N, L))); Vec3 intensity = Vec3.mul(Vec3.mul(specular, Ilincident), (float) Math.pow(specularLight, shininess)); totalintensity = Vec3.add(totalintensity, intensity); } } } return totalintensity; } } Now I need to adapt it to become a Blinn-Phong illumination model. I used the formulas from Hearn and Baker, followed pseudocode and tried to implement it multiple times according to Wikipedia articles in several languages, but it never worked. I just get no specular reflections, or they are so weak and/or at the wrong place and/or have the wrong color. From my numerous wrong implementations I post a little code that already seems to be wrong. I calculate my halfway vector and my new specular light like so: Vec3 H = Vec3.mul(Vec3.add(L.normalize(), V), Vec3.add(L.normalize(), V).length()); float specularLight = Vec3.dot(H, N); With these little changes it should already work (maybe not with correct intensity, but basically it should be correct). But the result is wrong. Here are two images: left, how it should render correctly, and right, how it actually renders. If I lower the shininess factor you can see a little specular light at the top right. Although I understand the concept of Phong illumination and also the simplified, more performant Blinn-Phong adaptation, I have been trying for days and just can't get it to work. Any help is appreciated. Edit: I was made aware of an error by this answer: I am multiplying by |L+V| instead of dividing by it when calculating H. I changed to dividing like so: Vec3 H = Vec3.mul(Vec3.add(L.normalize(), V), 1/Vec3.add(L.normalize(), V).length()); Unfortunately this doesn't change much.
    The results look like this: and if I raise the specular constant and lower the shininess, you can see the effects more clearly, in a similarly wrong way. However, this division is just the normalisation. I think I am missing one step, because the formulas as written just don't make sense to me. If you look at this picture: http://en.wikipedia.org/wiki/File:Blinn-Phong_vectors.svg the projection of H onto N is far smaller than that of V onto R. And if you imagine changing the vector V in the picture, the angle is the same when the viewing vector is "on the left side" and becomes more and more different when moving to the right. I personally would multiply the whole projection by two to get something similar (and the whole point is to avoid the calculation of R). Although I didn't read anything about that anywhere, I am going to try it out... Result: the intensity of the specular light is far too high (white areas) and the position is still wrong. I think I am messing something else up, because the reflections are just in the wrong place. But what? Edit: Now I read in the notes on Wikipedia that the angle between N and H is in fact approximately half of that between V and R. To compensate for that I should multiply my shininess exponent by 4 rather than my projection. If I do that I end up with this: far too intense, and the projection is still in the wrong place. Where could I be messing up my vectors?
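
    For reference, a minimal sketch of the halfway-vector computation, written in C++ with a small stand-in vector type rather than the poster's Vec3 class (the type and function names here are illustrative, not from the question): H is the normalized sum of L and V, both vectors must point away from the surface, and the exponent is commonly scaled because Blinn-Phong highlights are wider than Phong's for the same shininess.

    #include <cmath>

    // Hypothetical minimal vector type (not the Vec3 class from the question).
    struct Vec3f { float x, y, z; };

    static float dot(const Vec3f& a, const Vec3f& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
    static Vec3f scale(const Vec3f& a, float s)      { return {a.x * s, a.y * s, a.z * s}; }
    static Vec3f add(const Vec3f& a, const Vec3f& b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
    static Vec3f normalize(const Vec3f& a)           { return scale(a, 1.0f / std::sqrt(dot(a, a))); }

    // Scalar Blinn-Phong specular factor for one light.
    // N = surface normal, L = surface-to-light, V = surface-to-eye.
    float blinnPhongSpecular(Vec3f N, Vec3f L, Vec3f V, float shininess) {
        N = normalize(N);
        L = normalize(L);
        V = normalize(V);
        Vec3f H = normalize(add(L, V));           // halfway vector: (L + V) / |L + V|
        float nDotH = std::fmax(0.0f, dot(N, H));
        // The angle between N and H is roughly half the angle between V and R,
        // so the exponent is often scaled by about 4 to match a Phong result.
        return std::pow(nDotH, 4.0f * shininess);
    }

    As in the original Phong code, the specular term should only be added when the diffuse term (dot(N, L)) is positive.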

    Read the article

  • How should I change my Graph structure (very slow insertion)?

    - by Nazgulled
    Hi, this program I'm writing is about a social network, which means there are users and their profiles. The profile structure is UserProfile. Now, there are various possible Graph implementations and I don't think I'm using the best one. I have a Graph structure and, inside it, a pointer to a linked list of type Vertex. Each Vertex element has a value, a pointer to the next Vertex and a pointer to a linked list of type Edge. Each Edge element has a value (so I can define weights and whatever else is needed), a pointer to the next Edge and a pointer to the owning Vertex. I have 2 sample files with data to process (in CSV style) and insert into the Graph. The first one is the user data (one user per line); the second one is the user relations (for the graph). The first file is inserted into the graph quickly because I always insert at the head and there are about ~18000 users. The second file takes ages, even though I still insert the edges at the head. The file has about ~520000 lines of user relations and takes between 13-15 mins to insert into the Graph. I made a quick test and reading the data is pretty quick, instantaneous really. The problem is in the insertion. This problem exists because I have a Graph implemented with linked lists for the vertices. Every time I need to insert a relation, I need to look up 2 vertices so I can link them together. This is the problem... Doing this for ~520000 relations takes a while. How should I solve this? Solution 1) Some people recommended that I implement the Graph (the vertices part) as an array instead of a linked list. This way I have direct access to every vertex and the insertion time will probably drop considerably. But I don't like the idea of allocating an array with 18000 elements. How practical is this? My sample data has ~18000, but what if I need much less or much more? The linked list approach has that flexibility: I can have whatever size I want as long as there's memory for it. But the array doesn't, so how am I going to handle such a situation? What are your suggestions? Using linked lists is good for space complexity but bad for time complexity, and using an array is good for time complexity but bad for space complexity. Any thoughts about this solution? Solution 2) This project also demands that I have some sort of data structure that allows quick lookup based on a name index and an ID index. For this I decided to use Hash Tables. My tables are implemented with separate chaining as collision resolution, and when a load factor of 0.70 is reached I recreate the table. I base the next table size on this: http://planetmath.org/encyclopedia/GoodHashTablePrimes.html. Currently, both Hash Tables hold a pointer to the UserProfile instead of duplicating the user profile itself. That would be silly: changing data would require 3 changes, and it's really dumb to do it that way. So I just save the pointer to the UserProfile. The same user profile pointer is also saved as the value in each Graph Vertex. So, I have 3 data structures, one Graph and two Hash Tables, and every single one of them points to the same exact UserProfile. The Graph structure will serve the purpose of finding the shortest path and things like that, while the Hash Tables serve as quick indexes by name and ID. What I'm thinking of doing to solve my Graph problem is, instead of having the Hash Table values point to the UserProfile, to point them to the corresponding Vertex. It's still a pointer; no more and no less space is used, I just change what I point to.
    Like this, I can easily and quickly look up each Vertex I need and link them together. This will insert the ~520000 relations pretty quickly. I thought of this solution because I already have the Hash Tables and I need to have them, so why not take advantage of them for indexing the Graph vertices instead of the user profiles? It's basically the same thing: I can still access the UserProfile pretty quickly, just go to the Vertex and then to the UserProfile. But do you see any cons of this second solution compared to the first one? Or only pros that outweigh the pros and cons of the first solution? Other Solution) If you have any other solution, I'm all ears. But please explain the pros and cons of that solution over the previous 2. I really don't have much time to waste on this right now, I need to move on with this project, so if I'm going to make such a change I need to understand exactly what to change and whether that's really the way to go. Hopefully no one fell asleep reading this and closed the browser; sorry for the long write-up. But I really need to decide what to do about this and I really need to make a change. P.S.: When answering my proposed solutions, please enumerate them as I did so I know exactly what you are talking about and don't confuse myself more than I already am.
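
    For what it's worth, a minimal sketch of what Solution 2 amounts to, written in C++ with standard containers rather than the hand-rolled C structures in the question (all names here are illustrative): the hash index maps an ID straight to its Vertex, so inserting one relation becomes two constant-time lookups instead of two scans of the vertex list.

    #include <string>
    #include <unordered_map>
    #include <vector>

    // Simplified stand-ins for the question's structures.
    struct UserProfile { std::string name; /* ... */ };

    struct Vertex {
        UserProfile* profile;              // the vertex still reaches the profile
        std::vector<Vertex*> neighbours;   // edge list
    };

    // The index now points at the Vertex, not at the UserProfile.
    std::unordered_map<int, Vertex*> vertexById;

    // One relation = two average O(1) lookups plus two list appends,
    // instead of two linear scans over ~18000 vertices.
    void addRelation(int idA, int idB) {
        Vertex* a = vertexById.at(idA);
        Vertex* b = vertexById.at(idB);
        a->neighbours.push_back(b);
        b->neighbours.push_back(a);
    }

    The UserProfile stays one hop away through the Vertex, so the name and ID indexes lose nothing.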

    Read the article

  • Improving HTML scraper efficiency with pcntl_fork()

    - by Michael Pasqualone
    With the help of two previous questions, I now have a working HTML scraper that feeds product information into a database. What I am now trying to do is improve efficiency by wrapping my brain around getting my scraper working with pcntl_fork. If I split my php5-cli script into 10 separate chunks, I improve total runtime by a large factor, so I know I am not I/O or CPU bound but just limited by the linear nature of my scraping functions. Using code I've cobbled together from multiple sources, I have this working test: <?php libxml_use_internal_errors(true); ini_set('max_execution_time', 0); ini_set('max_input_time', 0); set_time_limit(0); $hrefArray = array("http://slashdot.org", "http://slashdot.org", "http://slashdot.org", "http://slashdot.org"); function doDomStuff($singleHref,$childPid) { $html = new DOMDocument(); $html->loadHtmlFile($singleHref); $xPath = new DOMXPath($html); $domQuery = '//div[@id="slogan"]/h2'; $domReturn = $xPath->query($domQuery); foreach($domReturn as $return) { $slogan = $return->nodeValue; echo "Child PID #" . $childPid . " says: " . $slogan . "\n"; } } $pids = array(); foreach ($hrefArray as $singleHref) { $pid = pcntl_fork(); if ($pid == -1) { die("Couldn't fork, error!"); } elseif ($pid > 0) { // We are the parent $pids[] = $pid; } else { // We are the child $childPid = posix_getpid(); doDomStuff($singleHref,$childPid); exit(0); } } foreach ($pids as $pid) { pcntl_waitpid($pid, $status); } // Clear the libxml buffer so it doesn't fill up libxml_clear_errors(); Which raises the following questions: 1) Given my hrefArray contains 4 URLs - if the array were to contain, say, 1,000 product URLs, would this code spawn 1,000 child processes? If so, what is the best way to limit the number of processes to, say, 10, and with 1,000 URLs as an example, split the child workload into 100 products per child (10 x 100)? 2) I've learned that pcntl_fork creates a copy of the process and all variables, classes, etc. What I would like to do is replace my hrefArray variable with a DOMDocument query that builds the list of products to scrape, and then feed them off to child processes to do the processing - so spreading the load across 10 child workers. My brain is telling me I need to do something like the following (obviously this doesn't work, so don't run it): <?php libxml_use_internal_errors(true); ini_set('max_execution_time', 0); ini_set('max_input_time', 0); set_time_limit(0); $maxChildWorkers = 10; $html = new DOMDocument(); $html->loadHtmlFile('http://xxxx'); $xPath = new DOMXPath($html); $domQuery = '//div[@id=productDetail]/a'; $domReturn = $xPath->query($domQuery); $hrefsArray[] = $domReturn->getAttribute('href'); function doDomStuff($singleHref) { // Do stuff here with each product } // To figure out: Split href array into $maxChilderWorks # of workArray1, workArray2 ... workArray10. $pids = array(); foreach ($workArray(1,2,3 ... 10) as $singleHref) { $pid = pcntl_fork(); if ($pid == -1) { die("Couldn't fork, error!"); } elseif ($pid > 0) { // We are the parent $pids[] = $pid; } else { // We are the child $childPid = posix_getpid(); doDomStuff($singleHref); exit(0); } } foreach ($pids as $pid) { pcntl_waitpid($pid, $status); } // Clear the libxml buffer so it doesn't fill up libxml_clear_errors(); But what I can't figure out is how to build my hrefsArray[] in the master/parent process only and feed it off to the child processes. Currently everything I've tried causes loops in the child processes; i.e.
    my hrefsArray gets built in the master and again in each subsequent child process. I am sure I am going about this all wrong, so I would greatly appreciate a general nudge in the right direction.
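
    The question is PHP, but the chunking pattern itself is language-independent; below is a minimal sketch of it over POSIX fork() in C++ (names and the placeholder work function are illustrative): the URL list is built once in the parent, each child gets one contiguous slice, and every child exits before it can fall back into the forking loop, which is what causes the duplicated work described above.

    #include <sys/wait.h>
    #include <unistd.h>
    #include <algorithm>
    #include <cstdio>
    #include <string>
    #include <vector>

    // Placeholder for the per-URL scraping done by each worker.
    void doDomStuff(const std::string& url) { /* fetch and parse url */ }

    int main() {
        std::vector<std::string> hrefs = { /* built ONCE, in the parent only */ };
        const std::size_t maxWorkers = 10;
        const std::size_t chunk = (hrefs.size() + maxWorkers - 1) / maxWorkers;

        std::vector<pid_t> pids;
        for (std::size_t start = 0; start < hrefs.size(); start += chunk) {
            pid_t pid = fork();
            if (pid < 0) { std::perror("fork"); return 1; }
            if (pid == 0) {                                   // child: process one slice only
                const std::size_t end = std::min(start + chunk, hrefs.size());
                for (std::size_t i = start; i < end; ++i)
                    doDomStuff(hrefs[i]);
                _exit(0);                                     // never return into the forking loop
            }
            pids.push_back(pid);                              // parent: remember the worker, keep forking
        }
        for (pid_t pid : pids)
            waitpid(pid, nullptr, 0);                         // parent reaps all workers
        return 0;
    }

    In PHP the same shape should apply: build $hrefsArray before the foreach that forks, give each child one slice (for example via array_chunk), and make sure every child code path ends in exit().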

    Read the article

  • Qt C++ signals and slots did not fire

    - by Xegara
    I have programmed with Qt a couple of times already and I really like the signals and slots feature. But now I'm having a problem: when a signal is emitted from one thread, the corresponding slot in another thread is not fired. The connection was made in the main program. This is also my first time using Qt for ROS, which uses CMake. The signals fired by the QThreads triggered their corresponding slots, but the signal emitted by my class UserInput did not trigger the slot in tflistener where it was supposed to. I have tried everything I can. Any help? The code is provided below. Main.cpp #include <QCoreApplication> #include <QThread> #include "userinput.h" #include "tfcompleter.h" int main(int argc, char** argv) { QCoreApplication app(argc, argv); QThread *thread1 = new QThread(); QThread *thread2 = new QThread(); UserInput *input1 = new UserInput(); TfCompleter *completer = new TfCompleter(); QObject::connect(input1, SIGNAL(togglePause2()), completer, SLOT(toggle())); QObject::connect(thread1, SIGNAL(started()), completer, SLOT(startCounting())); QObject::connect(thread2, SIGNAL(started()), input1, SLOT(start())); completer->moveToThread(thread1); input1->moveToThread(thread2); thread1->start(); thread2->start(); app.exec(); return 0; } What I want to do is this: there are two separate threads. One thread is for the user input. When the user enters [space], that thread emits a signal to toggle the boolean member field of the other thread. The other thread's task is to just continue its processing while the user wants it to run, and pause otherwise. I wanted to let the user toggle the processing at any time, which is why I decided to put them into separate threads. The following code is the tflistener and userinput. tfcompleter.h #ifndef TFCOMPLETER_H #define TFCOMPLETER_H #include <QObject> #include <QtCore> class TfCompleter : public QObject { Q_OBJECT private: bool isCount; public Q_SLOTS: void toggle(); void startCounting(); }; #endif tflistener.cpp #include "tfcompleter.h" #include <iostream> void TfCompleter::startCounting() { static uint i = 0; while(true) { if(isCount) std::cout << i++ << std::endl; } } void TfCompleter::toggle() { // isCount = ~isCount; std::cout << "isCount " << std::endl; } UserInput.h #ifndef USERINPUT_H #define USERINPUT_H #include <QObject> #include <QtCore> class UserInput : public QObject { Q_OBJECT public Q_SLOTS: void start(); // Waits for the keypress from the user and emits the corresponding signal. public: Q_SIGNALS: void togglePause2(); }; #endif UserInput.cpp #include "userinput.h" #include <iostream> #include <cstdio> // Implementation of getch #include <termios.h> #include <unistd.h> /* reads from keypress, doesn't echo */ int getch(void) { struct termios oldattr, newattr; int ch; tcgetattr( STDIN_FILENO, &oldattr ); newattr = oldattr; newattr.c_lflag &= ~( ICANON | ECHO ); tcsetattr( STDIN_FILENO, TCSANOW, &newattr ); ch = getchar(); tcsetattr( STDIN_FILENO, TCSANOW, &oldattr ); return ch; } void UserInput::start() { char c = 0; while (true) { c = getch(); if (c == ' ') { Q_EMIT togglePause2(); std::cout << "SPACE" << std::endl; } c = 0; } } Here is the CMakeLists.txt as well, in case CMake is also a factor here.
CMakeLists.txt ############################################################################## # CMake ############################################################################## cmake_minimum_required(VERSION 2.4.6) ############################################################################## # Ros Initialisation ############################################################################## include($ENV{ROS_ROOT}/core/rosbuild/rosbuild.cmake) rosbuild_init() set(CMAKE_AUTOMOC ON) #set the default path for built executables to the "bin" directory set(EXECUTABLE_OUTPUT_PATH ${PROJECT_SOURCE_DIR}/bin) #set the default path for built libraries to the "lib" directory set(LIBRARY_OUTPUT_PATH ${PROJECT_SOURCE_DIR}/lib) # Set the build type. Options are: # Coverage : w/ debug symbols, w/o optimization, w/ code-coverage # Debug : w/ debug symbols, w/o optimization # Release : w/o debug symbols, w/ optimization # RelWithDebInfo : w/ debug symbols, w/ optimization # MinSizeRel : w/o debug symbols, w/ optimization, stripped binaries #set(ROS_BUILD_TYPE Debug) ############################################################################## # Qt Environment ############################################################################## # Could use this, but qt-ros would need an updated deb, instead we'll move to catkin # rosbuild_include(qt_build qt-ros) rosbuild_find_ros_package(qt_build) include(${qt_build_PACKAGE_PATH}/qt-ros.cmake) rosbuild_prepare_qt4(QtCore) # Add the appropriate components to the component list here ADD_DEFINITIONS(-DQT_NO_KEYWORDS) ############################################################################## # Sections ############################################################################## #file(GLOB QT_FORMS RELATIVE ${CMAKE_CURRENT_SOURCE_DIR} ui/*.ui) #file(GLOB QT_RESOURCES RELATIVE ${CMAKE_CURRENT_SOURCE_DIR} resources/*.qrc) file(GLOB_RECURSE QT_MOC RELATIVE ${CMAKE_CURRENT_SOURCE_DIR} FOLLOW_SYMLINKS include/rgbdslam_client/*.hpp) #QT4_ADD_RESOURCES(QT_RESOURCES_CPP ${QT_RESOURCES}) #QT4_WRAP_UI(QT_FORMS_HPP ${QT_FORMS}) QT4_WRAP_CPP(QT_MOC_HPP ${QT_MOC}) ############################################################################## # Sources ############################################################################## file(GLOB_RECURSE QT_SOURCES RELATIVE ${CMAKE_CURRENT_SOURCE_DIR} FOLLOW_SYMLINKS src/*.cpp) ############################################################################## # Binaries ############################################################################## rosbuild_add_executable(rgbdslam_client ${QT_SOURCES} ${QT_MOC_HPP}) #rosbuild_add_executable(rgbdslam_client ${QT_SOURCES} ${QT_RESOURCES_CPP} ${QT_FORMS_HPP} ${QT_MOC_HPP}) target_link_libraries(rgbdslam_client ${QT_LIBRARIES})
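
    Without building the project this is only a guess, but one plausible explanation is that startCounting() never returns: it runs while(true) in thread1, so thread1's event loop never spins, and the queued call to toggle() (queued because UserInput lives in thread2) just sits in the event queue. A rough sketch of one way around that, letting a zero-interval QTimer drive the counting so the event loop keeps running (countOnce() is a hypothetical new slot that would have to be declared under public Q_SLOTS in tfcompleter.h; note also that isCount is never initialised and the body of toggle() is commented out):

    #include "tfcompleter.h"
    #include <QTimer>
    #include <iostream>

    void TfCompleter::startCounting()
    {
        QTimer* timer = new QTimer(this);                     // created in thread1, alongside this object
        connect(timer, SIGNAL(timeout()), this, SLOT(countOnce()));
        timer->start(0);                                      // fires whenever the event loop is idle
    }

    void TfCompleter::countOnce()                             // hypothetical slot, declared in the header
    {
        static unsigned int i = 0;
        if (isCount)
            std::cout << i++ << std::endl;
        // Control returns to thread1's event loop here, so queued toggle() calls get delivered.
    }

    toggle() would also need a real body (for example isCount = !isCount;) for the space bar to have any visible effect.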

    Read the article

  • A Taxonomy of Numerical Methods v1

    - by JoshReuben
    Numerical Analysis – When, What, (but not how) Once you understand the Math & know C++, Numerical Methods are basically blocks of iterative & conditional math code. I found the real trick was seeing the forest for the trees – knowing which method to use for which situation. Its pretty easy to get lost in the details – so I’ve tried to organize these methods in a way that I can quickly look this up. I’ve included links to detailed explanations and to C++ code examples. I’ve tried to classify Numerical methods in the following broad categories: Solving Systems of Linear Equations Solving Non-Linear Equations Iteratively Interpolation Curve Fitting Optimization Numerical Differentiation & Integration Solving ODEs Boundary Problems Solving EigenValue problems Enjoy – I did ! Solving Systems of Linear Equations Overview Solve sets of algebraic equations with x unknowns The set is commonly in matrix form Gauss-Jordan Elimination http://en.wikipedia.org/wiki/Gauss%E2%80%93Jordan_elimination C++: http://www.codekeep.net/snippets/623f1923-e03c-4636-8c92-c9dc7aa0d3c0.aspx Produces solution of the equations & the coefficient matrix Efficient, stable 2 steps: · Forward Elimination – matrix decomposition: reduce set to triangular form (0s below the diagonal) or row echelon form. If degenerate, then there is no solution · Backward Elimination –write the original matrix as the product of ints inverse matrix & its reduced row-echelon matrix à reduce set to row canonical form & use back-substitution to find the solution to the set Elementary ops for matrix decomposition: · Row multiplication · Row switching · Add multiples of rows to other rows Use pivoting to ensure rows are ordered for achieving triangular form LU Decomposition http://en.wikipedia.org/wiki/LU_decomposition C++: http://ganeshtiwaridotcomdotnp.blogspot.co.il/2009/12/c-c-code-lu-decomposition-for-solving.html Represent the matrix as a product of lower & upper triangular matrices A modified version of GJ Elimination Advantage – can easily apply forward & backward elimination to solve triangular matrices Techniques: · Doolittle Method – sets the L matrix diagonal to unity · Crout Method - sets the U matrix diagonal to unity Note: both the L & U matrices share the same unity diagonal & can be stored compactly in the same matrix Gauss-Seidel Iteration http://en.wikipedia.org/wiki/Gauss%E2%80%93Seidel_method C++: http://www.nr.com/forum/showthread.php?t=722 Transform the linear set of equations into a single equation & then use numerical integration (as integration formulas have Sums, it is implemented iteratively). 
an optimization of Gauss-Jacobi: 1.5 times faster, requires 0.25 iterations to achieve the same tolerance Solving Non-Linear Equations Iteratively find roots of polynomials – there may be 0, 1 or n solutions for an n order polynomial use iterative techniques Iterative methods · used when there are no known analytical techniques · Requires set functions to be continuous & differentiable · Requires an initial seed value – choice is critical to convergence à conduct multiple runs with different starting points & then select best result · Systematic - iterate until diminishing returns, tolerance or max iteration conditions are met · bracketing techniques will always yield convergent solutions, non-bracketing methods may fail to converge Incremental method if a nonlinear function has opposite signs at 2 ends of a small interval x1 & x2, then there is likely to be a solution in their interval – solutions are detected by evaluating a function over interval steps, for a change in sign, adjusting the step size dynamically. Limitations – can miss closely spaced solutions in large intervals, cannot detect degenerate (coinciding) solutions, limited to functions that cross the x-axis, gives false positives for singularities Fixed point method http://en.wikipedia.org/wiki/Fixed-point_iteration C++: http://books.google.co.il/books?id=weYj75E_t6MC&pg=PA79&lpg=PA79&dq=fixed+point+method++c%2B%2B&source=bl&ots=LQ-5P_taoC&sig=lENUUIYBK53tZtTwNfHLy5PEWDk&hl=en&sa=X&ei=wezDUPW1J5DptQaMsIHQCw&redir_esc=y#v=onepage&q=fixed%20point%20method%20%20c%2B%2B&f=false Algebraically rearrange a solution to isolate a variable then apply incremental method Bisection method http://en.wikipedia.org/wiki/Bisection_method C++: http://numericalcomputing.wordpress.com/category/algorithms/ Bracketed - Select an initial interval, keep bisecting it ad midpoint into sub-intervals and then apply incremental method on smaller & smaller intervals – zoom in Adv: unaffected by function gradient à reliable Disadv: slow convergence False Position Method http://en.wikipedia.org/wiki/False_position_method C++: http://www.dreamincode.net/forums/topic/126100-bisection-and-false-position-methods/ Bracketed - Select an initial interval , & use the relative value of function at interval end points to select next sub-intervals (estimate how far between the end points the solution might be & subdivide based on this) Newton-Raphson method http://en.wikipedia.org/wiki/Newton's_method C++: http://www-users.cselabs.umn.edu/classes/Summer-2012/csci1113/index.php?page=./newt3 Also known as Newton's method Convenient, efficient Not bracketed – only a single initial guess is required to start iteration – requires an analytical expression for the first derivative of the function as input. Evaluates the function & its derivative at each step. Can be extended to the Newton MutiRoot method for solving multiple roots Can be easily applied to an of n-coupled set of non-linear equations – conduct a Taylor Series expansion of a function, dropping terms of order n, rewrite as a Jacobian matrix of PDs & convert to simultaneous linear equations !!! Secant Method http://en.wikipedia.org/wiki/Secant_method C++: http://forum.vcoderz.com/showthread.php?p=205230 Unlike N-R, can estimate first derivative from an initial interval (does not require root to be bracketed) instead of inputting it Since derivative is approximated, may converge slower. Is fast in practice as it does not have to evaluate the derivative at each step. 
Similar implementation to False Positive method Birge-Vieta Method http://mat.iitm.ac.in/home/sryedida/public_html/caimna/transcendental/polynomial%20methods/bv%20method.html C++: http://books.google.co.il/books?id=cL1boM2uyQwC&pg=SA3-PA51&lpg=SA3-PA51&dq=Birge-Vieta+Method+c%2B%2B&source=bl&ots=QZmnDTK3rC&sig=BPNcHHbpR_DKVoZXrLi4nVXD-gg&hl=en&sa=X&ei=R-_DUK2iNIjzsgbE5ID4Dg&redir_esc=y#v=onepage&q=Birge-Vieta%20Method%20c%2B%2B&f=false combines Horner's method of polynomial evaluation (transforming into lesser degree polynomials that are more computationally efficient to process) with Newton-Raphson to provide a computational speed-up Interpolation Overview Construct new data points for as close as possible fit within range of a discrete set of known points (that were obtained via sampling, experimentation) Use Taylor Series Expansion of a function f(x) around a specific value for x Linear Interpolation http://en.wikipedia.org/wiki/Linear_interpolation C++: http://www.hamaluik.com/?p=289 Straight line between 2 points à concatenate interpolants between each pair of data points Bilinear Interpolation http://en.wikipedia.org/wiki/Bilinear_interpolation C++: http://supercomputingblog.com/graphics/coding-bilinear-interpolation/2/ Extension of the linear function for interpolating functions of 2 variables – perform linear interpolation first in 1 direction, then in another. Used in image processing – e.g. texture mapping filter. Uses 4 vertices to interpolate a value within a unit cell. Lagrange Interpolation http://en.wikipedia.org/wiki/Lagrange_polynomial C++: http://www.codecogs.com/code/maths/approximation/interpolation/lagrange.php For polynomials Requires recomputation for all terms for each distinct x value – can only be applied for small number of nodes Numerically unstable Barycentric Interpolation http://epubs.siam.org/doi/pdf/10.1137/S0036144502417715 C++: http://www.gamedev.net/topic/621445-barycentric-coordinates-c-code-check/ Rearrange the terms in the equation of the Legrange interpolation by defining weight functions that are independent of the interpolated value of x Newton Divided Difference Interpolation http://en.wikipedia.org/wiki/Newton_polynomial C++: http://jee-appy.blogspot.co.il/2011/12/newton-divided-difference-interpolation.html Hermite Divided Differences: Interpolation polynomial approximation for a given set of data points in the NR form - divided differences are used to approximately calculate the various differences. 
For a given set of 3 data points , fit a quadratic interpolant through the data Bracketed functions allow Newton divided differences to be calculated recursively Difference table Cubic Spline Interpolation http://en.wikipedia.org/wiki/Spline_interpolation C++: https://www.marcusbannerman.co.uk/index.php/home/latestarticles/42-articles/96-cubic-spline-class.html Spline is a piecewise polynomial Provides smoothness – for interpolations with significantly varying data Use weighted coefficients to bend the function to be smooth & its 1st & 2nd derivatives are continuous through the edge points in the interval Curve Fitting A generalization of interpolating whereby given data points may contain noise à the curve does not necessarily pass through all the points Least Squares Fit http://en.wikipedia.org/wiki/Least_squares C++: http://www.ccas.ru/mmes/educat/lab04k/02/least-squares.c Residual – difference between observed value & expected value Model function is often chosen as a linear combination of the specified functions Determines: A) The model instance in which the sum of squared residuals has the least value B) param values for which model best fits data Straight Line Fit Linear correlation between independent variable and dependent variable Linear Regression http://en.wikipedia.org/wiki/Linear_regression C++: http://www.oocities.org/david_swaim/cpp/linregc.htm Special case of statistically exact extrapolation Leverage least squares Given a basis function, the sum of the residuals is determined and the corresponding gradient equation is expressed as a set of normal linear equations in matrix form that can be solved (e.g. using LU Decomposition) Can be weighted - Drop the assumption that all errors have the same significance –-> confidence of accuracy is different for each data point. Fit the function closer to points with higher weights Polynomial Fit - use a polynomial basis function Moving Average http://en.wikipedia.org/wiki/Moving_average C++: http://www.codeproject.com/Articles/17860/A-Simple-Moving-Average-Algorithm Used for smoothing (cancel fluctuations to highlight longer-term trends & cycles), time series data analysis, signal processing filters Replace each data point with average of neighbors. Can be simple (SMA), weighted (WMA), exponential (EMA). Lags behind latest data points – extra weight can be given to more recent data points. Weights can decrease arithmetically or exponentially according to distance from point. Parameters: smoothing factor, period, weight basis Optimization Overview Given function with multiple variables, find Min (or max by minimizing –f(x)) Iterative approach Efficient, but not necessarily reliable Conditions: noisy data, constraints, non-linear models Detection via sign of first derivative - Derivative of saddle points will be 0 Local minima Bisection method Similar method for finding a root for a non-linear equation Start with an interval that contains a minimum Golden Search method http://en.wikipedia.org/wiki/Golden_section_search C++: http://www.codecogs.com/code/maths/optimization/golden.php Bisect intervals according to golden ratio 0.618.. 
Achieves reduction by evaluating a single function instead of 2 Newton-Raphson Method Brent method http://en.wikipedia.org/wiki/Brent's_method C++: http://people.sc.fsu.edu/~jburkardt/cpp_src/brent/brent.cpp Based on quadratic or parabolic interpolation – if the function is smooth & parabolic near to the minimum, then a parabola fitted through any 3 points should approximate the minima – fails when the 3 points are collinear , in which case the denominator is 0 Simplex Method http://en.wikipedia.org/wiki/Simplex_algorithm C++: http://www.codeguru.com/cpp/article.php/c17505/Simplex-Optimization-Algorithm-and-Implemetation-in-C-Programming.htm Find the global minima of any multi-variable function Direct search – no derivatives required At each step it maintains a non-degenerative simplex – a convex hull of n+1 vertices. Obtains the minimum for a function with n variables by evaluating the function at n-1 points, iteratively replacing the point of worst result with the point of best result, shrinking the multidimensional simplex around the best point. Point replacement involves expanding & contracting the simplex near the worst value point to determine a better replacement point Oscillation can be avoided by choosing the 2nd worst result Restart if it gets stuck Parameters: contraction & expansion factors Simulated Annealing http://en.wikipedia.org/wiki/Simulated_annealing C++: http://code.google.com/p/cppsimulatedannealing/ Analogy to heating & cooling metal to strengthen its structure Stochastic method – apply random permutation search for global minima - Avoid entrapment in local minima via hill climbing Heating schedule - Annealing schedule params: temperature, iterations at each temp, temperature delta Cooling schedule – can be linear, step-wise or exponential Differential Evolution http://en.wikipedia.org/wiki/Differential_evolution C++: http://www.amichel.com/de/doc/html/ More advanced stochastic methods analogous to biological processes: Genetic algorithms, evolution strategies Parallel direct search method against multiple discrete or continuous variables Initial population of variable vectors chosen randomly – if weighted difference vector of 2 vectors yields a lower objective function value then it replaces the comparison vector Many params: #parents, #variables, step size, crossover constant etc Convergence is slow – many more function evaluations than simulated annealing Numerical Differentiation Overview 2 approaches to finite difference methods: · A) approximate function via polynomial interpolation then differentiate · B) Taylor series approximation – additionally provides error estimate Finite Difference methods http://en.wikipedia.org/wiki/Finite_difference_method C++: http://www.wpi.edu/Pubs/ETD/Available/etd-051807-164436/unrestricted/EAMPADU.pdf Find differences between high order derivative values - Approximate differential equations by finite differences at evenly spaced data points Based on forward & backward Taylor series expansion of f(x) about x plus or minus multiples of delta h. Forward / backward difference - the sums of the series contains even derivatives and the difference of the series contains odd derivatives – coupled equations that can be solved. 
Provide an approximation of the derivative within a O(h^2) accuracy There is also central difference & extended central difference which has a O(h^4) accuracy Richardson Extrapolation http://en.wikipedia.org/wiki/Richardson_extrapolation C++: http://mathscoding.blogspot.co.il/2012/02/introduction-richardson-extrapolation.html A sequence acceleration method applied to finite differences Fast convergence, high accuracy O(h^4) Derivatives via Interpolation Cannot apply Finite Difference method to discrete data points at uneven intervals – so need to approximate the derivative of f(x) using the derivative of the interpolant via 3 point Lagrange Interpolation Note: the higher the order of the derivative, the lower the approximation precision Numerical Integration Estimate finite & infinite integrals of functions More accurate procedure than numerical differentiation Use when it is not possible to obtain an integral of a function analytically or when the function is not given, only the data points are Newton Cotes Methods http://en.wikipedia.org/wiki/Newton%E2%80%93Cotes_formulas C++: http://www.siafoo.net/snippet/324 For equally spaced data points Computationally easy – based on local interpolation of n rectangular strip areas that is piecewise fitted to a polynomial to get the sum total area Evaluate the integrand at n+1 evenly spaced points – approximate definite integral by Sum Weights are derived from Lagrange Basis polynomials Leverage Trapezoidal Rule for default 2nd formulas, Simpson 1/3 Rule for substituting 3 point formulas, Simpson 3/8 Rule for 4 point formulas. For 4 point formulas use Bodes Rule. Higher orders obtain more accurate results Trapezoidal Rule uses simple area, Simpsons Rule replaces the integrand f(x) with a quadratic polynomial p(x) that uses the same values as f(x) for its end points, but adds a midpoint Romberg Integration http://en.wikipedia.org/wiki/Romberg's_method C++: http://code.google.com/p/romberg-integration/downloads/detail?name=romberg.cpp&can=2&q= Combines trapezoidal rule with Richardson Extrapolation Evaluates the integrand at equally spaced points The integrand must have continuous derivatives Each R(n,m) extrapolation uses a higher order integrand polynomial replacement rule (zeroth starts with trapezoidal) à a lower triangular matrix set of equation coefficients where the bottom right term has the most accurate approximation. The process continues until the difference between 2 successive diagonal terms becomes sufficiently small. Gaussian Quadrature http://en.wikipedia.org/wiki/Gaussian_quadrature C++: http://www.alglib.net/integration/gaussianquadratures.php Data points are chosen to yield best possible accuracy – requires fewer evaluations Ability to handle singularities, functions that are difficult to evaluate The integrand can include a weighting function determined by a set of orthogonal polynomials. 
Points & weights are selected so that the integrand yields the exact integral if f(x) is a polynomial of degree <= 2n+1 Techniques (basically different weighting functions): · Gauss-Legendre Integration w(x)=1 · Gauss-Laguerre Integration w(x)=e^-x · Gauss-Hermite Integration w(x)=e^-x^2 · Gauss-Chebyshev Integration w(x)= 1 / Sqrt(1-x^2) Solving ODEs Use when high order differential equations cannot be solved analytically Evaluated under boundary conditions RK for systems – a high order differential equation can always be transformed into a coupled first order system of equations Euler method http://en.wikipedia.org/wiki/Euler_method C++: http://rosettacode.org/wiki/Euler_method First order Runge–Kutta method. Simple recursive method – given an initial value, calculate derivative deltas. Unstable & not very accurate (O(h) error) – not used in practice A first-order method - the local error (truncation error per step) is proportional to the square of the step size, and the global error (error at a given time) is proportional to the step size In evolving solution between data points xn & xn+1, only evaluates derivatives at beginning of interval xn à asymmetric at boundaries Higher order Runge Kutta http://en.wikipedia.org/wiki/Runge%E2%80%93Kutta_methods C++: http://www.dreamincode.net/code/snippet1441.htm 2nd & 4th order RK - Introduces parameterized midpoints for more symmetric solutions à accuracy at higher computational cost Adaptive RK – RK-Fehlberg – estimate the truncation at each integration step & automatically adjust the step size to keep error within prescribed limits. At each step 2 approximations are compared – if in disagreement to a specific accuracy, the step size is reduced Boundary Value Problems Where solution of differential equations are located at 2 different values of the independent variable x à more difficult, because cannot just start at point of initial value – there may not be enough starting conditions available at the end points to produce a unique solution An n-order equation will require n boundary conditions – need to determine the missing n-1 conditions which cause the given conditions at the other boundary to be satisfied Shooting Method http://en.wikipedia.org/wiki/Shooting_method C++: http://ganeshtiwaridotcomdotnp.blogspot.co.il/2009/12/c-c-code-shooting-method-for-solving.html Iteratively guess the missing values for one end & integrate, then inspect the discrepancy with the boundary values of the other end to adjust the estimate Given the starting boundary values u1 & u2 which contain the root u, solve u given the false position method (solving the differential equation as an initial value problem via 4th order RK), then use u to solve the differential equations. Finite Difference Method For linear & non-linear systems Higher order derivatives require more computational steps – some combinations for boundary conditions may not work though Improve the accuracy by increasing the number of mesh points Solving EigenValue Problems An eigenvalue can substitute a matrix when doing matrix multiplication à convert matrix multiplication into a polynomial EigenValue For a given set of equations in matrix form, determine what are the solution eigenvalue & eigenvectors Similar Matrices - have same eigenvalues. 
Use orthogonal similarity transforms to reduce a matrix to diagonal form from which eigenvalue(s) & eigenvectors can be computed iteratively Jacobi method http://en.wikipedia.org/wiki/Jacobi_method C++: http://people.sc.fsu.edu/~jburkardt/classes/acs2_2008/openmp/jacobi/jacobi.html Robust but Computationally intense – use for small matrices < 10x10 Power Iteration http://en.wikipedia.org/wiki/Power_iteration For any given real symmetric matrix, generate the largest single eigenvalue & its eigenvectors Simplest method – does not compute matrix decomposition à suitable for large, sparse matrices Inverse Iteration Variation of power iteration method – generates the smallest eigenvalue from the inverse matrix Rayleigh Method http://en.wikipedia.org/wiki/Rayleigh's_method_of_dimensional_analysis Variation of power iteration method Rayleigh Quotient Method Variation of inverse iteration method Matrix Tri-diagonalization Method Use householder algorithm to reduce an NxN symmetric matrix to a tridiagonal real symmetric matrix vua N-2 orthogonal transforms     Whats Next Outside of Numerical Methods there are lots of different types of algorithms that I’ve learned over the decades: Data Mining – (I covered this briefly in a previous post: http://geekswithblogs.net/JoshReuben/archive/2007/12/31/ssas-dm-algorithms.aspx ) Search & Sort Routing Problem Solving Logical Theorem Proving Planning Probabilistic Reasoning Machine Learning Solvers (eg MIP) Bioinformatics (Sequence Alignment, Protein Folding) Quant Finance (I read Wilmott’s books – interesting) Sooner or later, I’ll cover the above topics as well.
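
    The C++ links above point at third-party implementations; as one small self-contained illustration of an entry from the list, here is a hedged sketch of Newton-Raphson (a hypothetical helper, not taken from any of the linked pages), which needs an analytic derivative and a single starting guess:

    #include <cmath>
    #include <cstdio>
    #include <functional>

    // Newton-Raphson: x_{n+1} = x_n - f(x_n) / f'(x_n).
    // Not bracketed, so a poor seed or a near-zero derivative can make it diverge.
    double newtonRaphson(const std::function<double(double)>& f,
                         const std::function<double(double)>& df,
                         double x, double tol = 1e-10, int maxIter = 100)
    {
        for (int i = 0; i < maxIter; ++i) {
            const double fx = f(x);
            if (std::fabs(fx) < tol) break;   // converged
            x -= fx / df(x);                  // Newton step
        }
        return x;
    }

    int main() {
        // Root of f(x) = x^2 - 2 starting from x = 1, i.e. sqrt(2).
        const double r = newtonRaphson([](double x) { return x * x - 2.0; },
                                       [](double x) { return 2.0 * x; },
                                       1.0);
        std::printf("%.12f\n", r);            // ~1.414213562373
        return 0;
    }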

    Read the article

  • Toorcon14

    - by danx
    Toorcon 2012 Information Security Conference San Diego, CA, http://www.toorcon.org/ Dan Anderson, October 2012 It's almost Halloween, and we all know what that means—yes, of course, it's time for another Toorcon Conference! Toorcon is an annual conference for people interested in computer security. This includes the whole range of hackers, computer hobbyists, professionals, security consultants, press, law enforcement, prosecutors, FBI, etc. We're at Toorcon 14—see earlier blogs for some of the previous Toorcon's I've attended (back to 2003). This year's "con" was held at the Westin on Broadway in downtown San Diego, California. The following are not necessarily my views—I'm just the messenger—although I could have misquoted or misparaphrased the speakers. Also, I only reviewed some of the talks, below, which I attended and interested me. MalAndroid—the Crux of Android Infections, Aditya K. Sood Programming Weird Machines with ELF Metadata, Rebecca "bx" Shapiro Privacy at the Handset: New FCC Rules?, Valkyrie Hacking Measured Boot and UEFI, Dan Griffin You Can't Buy Security: Building the Open Source InfoSec Program, Boris Sverdlik What Journalists Want: The Investigative Reporters' Perspective on Hacking, Dave Maas & Jason Leopold Accessibility and Security, Anna Shubina Stop Patching, for Stronger PCI Compliance, Adam Brand McAfee Secure & Trustmarks — a Hacker's Best Friend, Jay James & Shane MacDougall MalAndroid—the Crux of Android Infections Aditya K. Sood, IOActive, Michigan State PhD candidate Aditya talked about Android smartphone malware. There's a lot of old Android software out there—over 50% Gingerbread (2.3.x)—and most have unpatched vulnerabilities. Of 9 Android vulnerabilities, 8 have known exploits (such as the old Gingerbread Global Object Table exploit). Android protection includes sandboxing, security scanner, app permissions, and screened Android app market. The Android permission checker has fine-grain resource control, policy enforcement. Android static analysis also includes a static analysis app checker (bouncer), and a vulnerablity checker. What security problems does Android have? User-centric security, which depends on the user to grant permission and make smart decisions. But users don't care or think about malware (the're not aware, not paranoid). All they want is functionality, extensibility, mobility Android had no "proper" encryption before Android 3.0 No built-in protection against social engineering and web tricks Alternative Android app markets are unsafe. Simply visiting some markets can infect Android Aditya classified Android Malware types as: Type A—Apps. These interact with the Android app framework. For example, a fake Netflix app. Or Android Gold Dream (game), which uploads user files stealthy manner to a remote location. Type K—Kernel. Exploits underlying Linux libraries or kernel Type H—Hybrid. These use multiple layers (app framework, libraries, kernel). These are most commonly used by Android botnets, which are popular with Chinese botnet authors What are the threats from Android malware? These incude leak info (contacts), banking fraud, corporate network attacks, malware advertising, malware "Hackivism" (the promotion of social causes. For example, promiting specific leaders of the Tunisian or Iranian revolutions. Android malware is frequently "masquerated". That is, repackaged inside a legit app with malware. To avoid detection, the hidden malware is not unwrapped until runtime. 
The malware payload can be hidden in, for example, PNG files. Less common are Android bootkits—there's not many around. What they do is hijack the Android init framework—alteering system programs and daemons, then deletes itself. For example, the DKF Bootkit (China). Android App Problems: no code signing! all self-signed native code execution permission sandbox — all or none alternate market places no robust Android malware detection at network level delayed patch process Programming Weird Machines with ELF Metadata Rebecca "bx" Shapiro, Dartmouth College, NH https://github.com/bx/elf-bf-tools @bxsays on twitter Definitions. "ELF" is an executable file format used in linking and loading executables (on UNIX/Linux-class machines). "Weird machine" uses undocumented computation sources (I think of them as unintended virtual machines). Some examples of "weird machines" are those that: return to weird location, does SQL injection, corrupts the heap. Bx then talked about using ELF metadata as (an uintended) "weird machine". Some ELF background: A compiler takes source code and generates a ELF object file (hello.o). A static linker makes an ELF executable from the object file. A runtime linker and loader takes ELF executable and loads and relocates it in memory. The ELF file has symbols to relocate functions and variables. ELF has two relocation tables—one at link time and another one at loading time: .rela.dyn (link time) and .dynsym (dynamic table). GOT: Global Offset Table of addresses for dynamically-linked functions. PLT: Procedure Linkage Tables—works with GOT. The memory layout of a process (not the ELF file) is, in order: program (+ heap), dynamic libraries, libc, ld.so, stack (which includes the dynamic table loaded into memory) For ELF, the "weird machine" is found and exploited in the loader. ELF can be crafted for executing viruses, by tricking runtime into executing interpreted "code" in the ELF symbol table. One can inject parasitic "code" without modifying the actual ELF code portions. Think of the ELF symbol table as an "assembly language" interpreter. It has these elements: instructions: Add, move, jump if not 0 (jnz) Think of symbol table entries as "registers" symbol table value is "contents" immediate values are constants direct values are addresses (e.g., 0xdeadbeef) move instruction: is a relocation table entry add instruction: relocation table "addend" entry jnz instruction: takes multiple relocation table entries The ELF weird machine exploits the loader by relocating relocation table entries. The loader will go on forever until told to stop. It stores state on stack at "end" and uses IFUNC table entries (containing function pointer address). The ELF weird machine, called "Brainfu*k" (BF) has: 8 instructions: pointer inc, dec, inc indirect, dec indirect, jump forward, jump backward, print. Three registers - 3 registers Bx showed example BF source code that implemented a Turing machine printing "hello, world". More interesting was the next demo, where bx modified ping. Ping runs suid as root, but quickly drops privilege. BF modified the loader to disable the library function call dropping privilege, so it remained as root. Then BF modified the ping -t argument to execute the -t filename as root. It's best to show what this modified ping does with an example: $ whoami bx $ ping localhost -t backdoor.sh # executes backdoor $ whoami root $ The modified code increased from 285948 bytes to 290209 bytes. 
A BF tool compiles "executable" by modifying the symbol table in an existing ELF executable. The tool modifies .dynsym and .rela.dyn table, but not code or data. Privacy at the Handset: New FCC Rules? "Valkyrie" (Christie Dudley, Santa Clara Law JD candidate) Valkyrie talked about mobile handset privacy. Some background: Senator Franken (also a comedian) became alarmed about CarrierIQ, where the carriers track their customers. Franken asked the FCC to find out what obligations carriers think they have to protect privacy. The carriers' response was that they are doing just fine with self-regulation—no worries! Carriers need to collect data, such as missed calls, to maintain network quality. But carriers also sell data for marketing. Verizon sells customer data and enables this with a narrow privacy policy (only 1 month to opt out, with difficulties). The data sold is not individually identifiable and is aggregated. But Verizon recommends, as an aggregation workaround to "recollate" data to other databases to identify customers indirectly. The FCC has regulated telephone privacy since 1934 and mobile network privacy since 2007. Also, the carriers say mobile phone privacy is a FTC responsibility (not FCC). FTC is trying to improve mobile app privacy, but FTC has no authority over carrier / customer relationships. As a side note, Apple iPhones are unique as carriers have extra control over iPhones they don't have with other smartphones. As a result iPhones may be more regulated. Who are the consumer advocates? Everyone knows EFF, but EPIC (Electrnic Privacy Info Center), although more obsecure, is more relevant. What to do? Carriers must be accountable. Opt-in and opt-out at any time. Carriers need incentive to grant users control for those who want it, by holding them liable and responsible for breeches on their clock. Location information should be added current CPNI privacy protection, and require "Pen/trap" judicial order to obtain (and would still be a lower standard than 4th Amendment). Politics are on a pro-privacy swing now, with many senators and the Whitehouse. There will probably be new regulation soon, and enforcement will be a problem, but consumers will still have some benefit. Hacking Measured Boot and UEFI Dan Griffin, JWSecure, Inc., Seattle, @JWSdan Dan talked about hacking measured UEFI boot. First some terms: UEFI is a boot technology that is replacing BIOS (has whitelisting and blacklisting). UEFI protects devices against rootkits. TPM - hardware security device to store hashs and hardware-protected keys "secure boot" can control at firmware level what boot images can boot "measured boot" OS feature that tracks hashes (from BIOS, boot loader, krnel, early drivers). "remote attestation" allows remote validation and control based on policy on a remote attestation server. Microsoft pushing TPM (Windows 8 required), but Google is not. Intel TianoCore is the only open source for UEFI. Dan has Measured Boot Tool at http://mbt.codeplex.com/ with a demo where you can also view TPM data. TPM support already on enterprise-class machines. UEFI Weaknesses. 
UEFI toolkits are evolving rapidly, but UEFI has weaknesses: assume the user is an ally; trust the TPM implicitly, and attached to the computer; the hibernate file is unprotected (disk encryption protects against this); protection is migrating from hardware to firmware; delays in patching and whitelist updates; and will UEFI really be adopted by the mainstream (smartphone hardware support, bank support, apathetic consumer support)? You Can't Buy Security: Building the Open Source InfoSec Program Boris Sverdlik, ISDPodcast.com co-host Boris talked about problems typical of current security audits. "IT Security" is an oxymoron—IT exists to enable business, uptime, utilization, reporting, but doesn't care about security—IT has a conflict of interest. There's no Magic Bullet ("blinky box"), no one-size-fits-all solution (e.g., Intrusion Detection Systems (IDSs)). Regulations don't make you secure. The cloud is not secure (because of shared data and admin access). Defense and pen testing is not sexy. Auditors are not the solution (security is not a checklist)—what's needed is experience and adaptability—you need soft skills. Step 1: The first thing is to Google and learn the company end-to-end before you start. Get to know the management team (not the IT team), and meet as many people as you can. Don't use arbitrary values such as CISSP scores. Quantitative risk assessment is a myth (e.g. AV * EF = SLE). Learn the different Business Units and legal/regulatory obligations, learn the business and where the money is made, verify the company is protected from script kiddies (easy), learn the sensitive information (IP, internal use only), and start with low-hanging fruit (customer service reps and social engineering). Step 2: Policies. Keep policies short and relevant. Generic SANS "security" boilerplate policies don't make sense and are not followed. Focus on acceptable use, data usage, communications, and physical security. Step 3: Implementation: keep it simple, stupid. Open source, although useful, is not free (implementation cost). Access controls with authentication & authorization for local and remote access. MS Windows has it; otherwise use OpenLDAP, OpenIAM, etc. Application security: Everyone tries to reinvent the wheel—use existing static analysis tools. Review high-risk apps and major revisions. Don't run apps of different risk levels on the same system. Assume the host/client is compromised and use app-level security controls. Network security: VLAN != segregated, because there are too many workarounds. Use explicit firewall rules, active and passive network monitoring (snort is free), disallow end-user access to the production environment, and have a proxy instead of direct Internet access. Also, SSL certificates are not good two-factor auth, and SSL does not mean "safe." Operational Controls: Have change, patch, asset, & vulnerability management (OSSI is free). For change management, always review code before pushing to production. For logging, have centralized security logging for business-critical systems, separate security logging from administrative/IT logging, and lock down the logs (as they have everything). Monitor with OSSIM (open source). Use intrusion detection, but not just to fulfill a checkbox: build rules from a whitelist perspective (snort). OSSEC has 95% of what you need. Vulnerability management is a QA function when done right: OpenVAS and Seccubus are free. Security awareness: The reality is users will always click everything. Build real awareness, not a compliance-driven checkbox, and have it integrated into the culture.
Pen test by crowd sourcing—test with logging. COSSP http://www.cossp.org/ - Comprehensive Open Source Security Project. What Journalists Want: The Investigative Reporters' Perspective on Hacking Dave Maas, San Diego CityBeat Jason Leopold, Truthout.org The difference between hackers and investigative journalists: For hackers, the motivation varies, but the method is the same, with technological specialties. For investigative journalists, it's about one thing—The Story—and they need broad info-gathering skills. J-School in 60 Seconds: Generic formula: a person or issue of public interest, new info, or a new angle. Generic criteria: proximity, prominence, timeliness, human interest, oddity, or consequence. Media awareness of hackers and trends: journalists are becoming extremely aware of hackers with congressional debates (privacy, data breaches), demand for data-mining journalists, use of coding and web development by journalists, and journalists busted for hacking (Murdoch). Info gathering by investigative journalists includes public records laws. The federal Freedom of Information Act (FOIA) is good, but slow. The California Public Records Act is a lot stronger. FOIA takes forever because of foot-dragging—it helps to be specific. Often they need to sue (especially the FBI). CPRA is faster, and requests can be vague. Dumps and leaks (a la Wikileaks). Journalists want: leads; protecting themselves and their sources; and adapting tools for news gathering (Google hacking). Anonymity is important to whistleblowers. They want no digital footprint left behind (e.g., email, web log). They don't trust encryption; they want to feel safe and secure. Whistleblower laws are very weak—there's no upside for whistleblowers—they have to be very passionate to do it. Accessibility and Security or: How I Learned to Stop Worrying and Love the Halting Problem Anna Shubina, Dartmouth College Anna talked about how accessibility and security are related. Accessibility of digital content (not real-world accessibility) mostly refers to blind users and screen readers, for our purposes. Accessibility is about parsing documents, as are many security issues. "Rich" executable content causes accessibility to fail, and often causes security to fail. For example, MS Word has an executable format—it's not a document exchange format—and is more dangerous than PDF or HTML. Accessibility is often the first and maybe only sanity check with parsing. They have no choice, because someone may want to read what you write. Google, for example, is very particular about the web browser you use and is bad at supporting other browsers. It uses JavaScript instead of links, often requiring mouseover to display content. PDF is a security nightmare: an executable format, embedded Flash, JavaScript, etc.—15 million lines of code. Google Chrome doesn't handle PDF correctly, causing several security bugs. PDF has an accessibility checker and PDF tagging to help with accessibility. But no PDF checker checks for incorrect tags or untagged content, or validates lists or tables. None check executable content at all. The "Halting Problem" is: can one decide whether a program will ever stop? The answer, in general, is no (Rice's theorem). The same holds true for accessibility checkers. Language-theoretic Security says complicated data formats are hard to parse, and the problem cannot be solved in general because of the Halting Problem.
W3C Web Accessibility Guidelines: "Perceivable, Operable, Understandable, Robust." Not much help though, except for "Robust", but here are some gems: * all information should be parsable (paraphrasing) * if not parsable, it cannot be converted to alternate formats * maximize compatibility in new document formats. Executable webpages are bad for security and accessibility. They say it's for a better web experience. But is it necessary to stuff web pages with JavaScript for a better experience? A good example is The Drudge Report—it has hand-written HTML with no JavaScript, yet drives a lot of web traffic due to good content. A bad example is Google News—hidden scrollbars, guessing user input. Solutions: Accessibility and security problems come from the same source. Expose the "better user experience" myth. Keep your corner of the Internet parsable. Remember the "Halting Problem"—recognize false solutions (checking and verifying tools). Stop Patching, for Stronger PCI Compliance Adam Brand, protiviti @adamrbrand, http://www.picfun.com/ Adam talked about PCI compliance for retail sales. Take an example: for PCI compliance, 50% of Brian's time (an IT guy), 960 hours/year, was spent patching POSs in 850 restaurants. Often applying some patches makes no sense (like fixing a browser vulnerability on a server). "Scanner worship" is overuse of vulnerability scanners—it gives a warm and fuzzy feeling and it's simple (red or green results—fix the reds). Scanners give a false sense of security. In reality, breaches from missing patches are uncommon—more common problems are: default passwords, cleartext authentication, and misconfiguration (firewall ports open). Patching Myths: Myth 1: install within 30 days of patch release (but PCI §6.1 allows a "risk-based approach" instead). Myth 2: the vendor decides what's critical (also PCI §6.1). But §6.2 requires user ranking of vulnerabilities instead. Myth 3: scan and rescan until it passes. But PCI §11.2.1b says this applies only to high-risk vulnerabilities. Adam says good recommendations come from NIST 800-40. Instead, use sane patching and focus on what's really important. From NIST 800-40: Proactive: use a proactive vulnerability management process: use change control, configuration management, and monitor file integrity. Monitor: start with NVD and other vulnerability alerts, not scanner results. Evaluate: public-facing system? workstation? internal server? (risk rank). Decide: on action and timeline. Test: pre-test patches (stability, functionality, rollback) for change control. Install: notify, change control, tickets. McAfee Secure & Trustmarks — a Hacker's Best Friend Jay James, Shane MacDougall, Tactical Intelligence Inc., Canada "McAfee Secure Trustmark" is a website seal marketed by McAfee. A website gets this badge if it passes their remote scanning. The problem is that removal of a trustmark acts as a flag that you're vulnerable. It's easy to view status changes by viewing the McAfee list on its website or on Google. "Secure TrustGuard" is similar to McAfee. Jay and Shane wrote Perl scripts to gather sites from McAfee and search engines. If a site's certification image changes to a 1x1 pixel image, then it is no longer certified. Their scripts take deltas of scans to see what changed daily. The bottom line is that a change in TrustGuard status is a flag for hackers to attack your site. The entire idea of seals is silly—you're raising a flag saying whether you're vulnerable.

    Read the article

  • Background Image not showing up in IE8

    - by Davey
    So I have a tiny header image that repeats on the x axis, but for some reason it won't show up in IE8. Anyone know a work around? Thanks in advanced. <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> <meta content='' name='description' /> <meta content='' name='keywords' /> <link rel="stylesheet" type="text/css" href="style.css" media="screen" /> <title>Book Site</title> </head> <body> <div id="wrapper"> <div id="header"> <div id="title"> <span class="maintitle">Site Title Goes Here</span> <br /> <span class="subtitle">Transitional Justice, Post-Conflict Reconstruction & Reconciliation in Rwanda and Beyond Phil Clark and Zachary D. Kaufman, editors</span> </div> <img class="thebook" src="images/thebook.png" /> <span class="bookblurb"> <span class="bookbuy">Buy the book</span> get it online <br /> from Columbia, Hurst or your favorite reseller </span> </div> <div id="navbar"> <ul> <li>HOME</li> <li>ABOUT THE BOOK</li> <li>AUTHORS</li> <li>NEWS & EVENTS</li> <li>KIGALI PUBLIC LIBRARY</li> <li>CONTACT US</li> </ul> </div> <div id="content"> <div id="blockone"> <div id="polaroid"> <img class="polaroid" src="images/polaroid.png" /> <br /> <span class="roidplace">Gisimba Memorial Centre</span> <br /> <span class="roidname">Kigali, Rwanda</span> </div> <div id="textblockone"> <h3>An incisive analysis of genocide and its aftermath</h3> <br /> <span class="description">In After Genocide leading scholars and practitioners analyse the political, legal and regional impact of events in post-genocide Rwanda within the broader themes of transitional justice, reconstruction and reconciliation. Given the forthcoming fifteenth anniversary of the Rwandan genocide, and continued mass violence in Africa, especially in Darfur, the Democratic Republic of Congo (DRC) and northern Uganda, this volume is unquestionably of continuing relevance. </span> </div> </div> <div id="form"> <div id="statement"> This book should be labeled for the mature individual only. But for that mature individual it is of extreme interest. It shows, far from any Manichean stereotyping, the many facets of having to try to live in an impossibly complex social and human situation. Highly recommended. 
<br /><br /> <span class="author">-Grard Prunier</span> <br /><span class="bookname">The Rwanda Crisis: History of a Genocide (Hurst, 1995)</span> </div> <div id="contactform"> <span class="contactus">Contact us for additional information and site updates</span> <br /> <span class="theform"> <form class="forming"> Name: <input type="text" name="firstname" /> <br /> Title: <input type="text" name="title" /> <br /> Institution: <input type="text" name="institution" /> <br /> Email: <input type="text" name="email" /> <br /> Message: <input type="text" name="message" class="message" /> </form> </span> </div> </div> </div> <div id="footer"> <p class="footernav">&copy; 2008 After Genocide <span class="footerlinks">Sitemap | Terms | Privacy | Contact </span> <span class="plug">Web design by <span class="avity">Avity</span> </p> </div> </div> </body> </html> ----------------css------------------- html, body { margin:0; padding:0; background-color:#fdffe3; font-family: Arial, Helvetica, sans-serif; } #wrapper { width:1020px; margin:0 auto; } /*begin header style*/ #header { background:url("images/headback.png")repeat-x; width:1020px; height:120px; font-family:arial; position:relative; } #title { width:565px; height:100px; float:left; margin:20px 0 0 100px; } .maintitle { font-size:40px; } .subtitle { font-size:13px; } .thebook { float:left; margin:10px 0 0 30px; border:2px solid #666666; } .bookblurb { float:left; width:110px; margin:15px 0 0 15px; font-size:13px; } .bookbuy { font-weight:bold; font-size:14px; } /*end header style*/ /*begin navigation style*/ #navbar { margin:5px 0 0 0; height: 30px; width: 1020px; background-color: #3a3e30; } #navbar ul { padding: 0px; font-family: Arial, Helvetica, sans-serif; font-size: 12px; color: #FFF; line-height: 30px; white-space: nowrap; margin:0 0 0 140px; } #navbar ul li { list-style-type: none; display: inline; margin:0 40px 0 0; } /*end navigation style*/ /*begin content style*/ #content { width:775px; margin:0 auto; } #blockone { margin:25px 0 0 0; } #polaroid { float:left; width:230px; } .roidplace { font-weight:bold; font-size:11px; } .roidname { font-size:11px; margin:0 0 0 40px; } #textblockone { width:745px; margin:0 0 0 0; font-family: Arial, Helvetica, sans-serif; } .description { font-size:13px; } #form { background:url("images/formbackround.png") no-repeat; width:758px; height:231px; margin:80px 0 0 10px; } #statement { width:320px; margin:30px 0 0 30px; position:absolute; font-size:15px; font-style:italic; float:left; } .author { font-weight:bold; font-size:14; } .bookname { font-weight:bold; font-size:11px; color:#3f91ad; } #contactform { float:right; width:320px; margin:20px 30px 0 0; } .contactus { font-weight:bold; font-size:12px; } .theform { } .forming { } .message { height:50px; } #footer { width:1020px; height:65px; background-color:#dfdacc; margin:35px 0 0 0; font-size:13px; font-weight:bold; } .footernav { margin:30px 0 0 150px; position:absolute; width:1020px; } .footerlinks { margin:0 10px 0 10px; color:#0f77a9; } .plug { margin:0 0 0 175px; } .avity { color:#0f77a9; } Live site: http://cheapramen.com/testsite/

    Read the article

  • Custom rails route problem with 2.3.8 and Mongrel

    - by CHsurfer
    I have a controller called 'exposures' which I created automatically with the script/generate scaffold call. The scaffold pages work fine. I created a custom action called 'test' in the exposures controller. When I try to call the page (http://127.0.0.1:3000/exposures/test/1) I get a blank, white screen with no text at all in the source. I am using Rails 2.3.8 and mongrel in the development environment. There are no entries in development.log and the console that was used to open mongrel has the following error: You might have expected an instance of Array. The error occurred while evaluating nil.split D:/Rails/ruby/lib/ruby/gems/1.8/gems/actionpack-2.3.8/lib/action_controller/cgi_process.rb:52:in dispatch_cgi' D:/Rails/ruby/lib/ruby/gems/1.8/gems/actionpack-2.3.8/lib/action_controller/dispatcher.rb:101:in dispatch_cgi' D:/Rails/ruby/lib/ruby/gems/1.8/gems/actionpack-2.3.8/lib/action_controller/dispatcher.rb:27:in dispatch' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel/rails.rb:76:in process' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel/rails.rb:74:in synchronize' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel/rails.rb:74:in process' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel.rb:159:in process_client' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel.rb:158:in each' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel.rb:158:in process_client' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel.rb:285:in run' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel.rb:285:in initialize' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel.rb:285:in new' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel.rb:285:in run' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel.rb:268:in initialize' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel.rb:268:in new' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel.rb:268:in run' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel/configurator.rb:282:in run' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel/configurator.rb:281:in each' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel/configurator.rb:281:in run' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/mongrel_rails:128:in run' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/../lib/mongrel/command.rb:212:in run' D:/Rails/ruby/lib/ruby/gems/1.8/gems/mongrel-1.1.2-x86-mswin32/bin/mongrel_rails:281 D:/Rails/ruby/bin/mongrel_rails:19:in load' D:/Rails/ruby/bin/mongrel_rails:19 Here is the exposures_controller code: class ExposuresController < ApplicationController # GET /exposures # GET /exposures.xml def index @exposures = Exposure.all respond_to do |format| format.html # index.html.erb format.xml { render :xml => @exposures } end end #/exposure/graph/1 def graph @exposure = Exposure.find(params[:id]) project_name = @exposure.tender.project.name group_name = @exposure.tender.user.group.name tender_desc = @exposure.tender.description direction = "Cash Out" direction = "Cash In" if @exposure.supply currency_1_and_2 = "#{@exposure.currency_in} = 
#{@exposure.currency_out}" title = "#{project_name}:#{group_name}:#{tender_desc}/n" title += "#{direction}:#{currency_1_and_2}" factors = Array.new carrieds = Array.new days = Array.new @exposure.rates.each do |r| factors << r.factor carrieds << r.carried days << r.day.to_s end max = (factors+carrieds).max min = (factors+carrieds).min g = Graph.new g.title(title, '{font-size: 12px;}') g.set_data(factors) g.line_hollow(2, 4, '0x80a033', 'Bounces', 10) g.set_x_labels(days) g.set_x_label_style( 10, '#CC3399', 2 ); g.set_y_min(min*0.9) g.set_y_max(max*1.1) g.set_y_label_steps(5) render :text = g.render end def test render :text = "this works" end # GET /exposures/1 # GET /exposures/1.xml def show @exposure = Exposure.find(params[:id]) @graph = open_flash_chart_object(700,250, "/exposures/graph/#{@exposure.id}") #@graph = "/exposures/graph/#{@exposure.id}" respond_to do |format| format.html # show.html.erb format.xml { render :xml => @exposure } end end # GET /exposures/new # GET /exposures/new.xml def new @exposure = Exposure.new respond_to do |format| format.html # new.html.erb format.xml { render :xml => @exposure } end end # GET /exposures/1/edit def edit @exposure = Exposure.find(params[:id]) end # POST /exposures # POST /exposures.xml def create @exposure = Exposure.new(params[:exposure]) respond_to do |format| if @exposure.save flash[:notice] = 'Exposure was successfully created.' format.html { redirect_to(@exposure) } format.xml { render :xml => @exposure, :status => :created, :location => @exposure } else format.html { render :action => "new" } format.xml { render :xml => @exposure.errors, :status => :unprocessable_entity } end end end # PUT /exposures/1 # PUT /exposures/1.xml def update @exposure = Exposure.find(params[:id]) respond_to do |format| if @exposure.update_attributes(params[:exposure]) flash[:notice] = 'Exposure was successfully updated.' format.html { redirect_to(@exposure) } format.xml { head :ok } else format.html { render :action => "edit" } format.xml { render :xml => @exposure.errors, :status => :unprocessable_entity } end end end # DELETE /exposures/1 # DELETE /exposures/1.xml def destroy @exposure = Exposure.find(params[:id]) @exposure.destroy respond_to do |format| format.html { redirect_to(exposures_url) } format.xml { head :ok } end end end Clever readers will notice the 'graph' action. This is what I really want to work, but if I can't even get the test action working, then I'm sure I have no chance. Any ideas? I have restarted mongrel a few times with no change. Here is the output of Rake routes (but I don't believe this is the problem. The error would be in the form of and HTML error response). D:\Rails\rails_apps\fxrake routes (in D:/Rails/rails_apps/fx) DEPRECATION WARNING: Rake tasks in vendor/plugins/open_flash_chart/tasks are deprecated. Use lib/tasks instead. 
(called from D:/ by/gems/1.8/gems/rails-2.3.8/lib/tasks/rails.rb:10) rates GET /rates(.:format) {:controller="rates", :action="index"} POST /rates(.:format) {:controller="rates", :action="create"} new_rate GET /rates/new(.:format) {:controller="rates", :action="new"} edit_rate GET /rates/:id/edit(.:format) {:controller="rates", :action="edit"} rate GET /rates/:id(.:format) {:controller="rates", :action="show"} PUT /rates/:id(.:format) {:controller="rates", :action="update"} DELETE /rates/:id(.:format) {:controller="rates", :action="destroy"} tenders GET /tenders(.:format) {:controller="tenders", :action="index"} POST /tenders(.:format) {:controller="tenders", :action="create"} new_tender GET /tenders/new(.:format) {:controller="tenders", :action="new"} edit_tender GET /tenders/:id/edit(.:format) {:controller="tenders", :action="edit"} tender GET /tenders/:id(.:format) {:controller="tenders", :action="show"} PUT /tenders/:id(.:format) {:controller="tenders", :action="update"} DELETE /tenders/:id(.:format) {:controller="tenders", :action="destroy"} exposures GET /exposures(.:format) {:controller="exposures", :action="index"} POST /exposures(.:format) {:controller="exposures", :action="create"} new_exposure GET /exposures/new(.:format) {:controller="exposures", :action="new"} edit_exposure GET /exposures/:id/edit(.:format) {:controller="exposures", :action="edit"} exposure GET /exposures/:id(.:format) {:controller="exposures", :action="show"} PUT /exposures/:id(.:format) {:controller="exposures", :action="update"} DELETE /exposures/:id(.:format) {:controller="exposures", :action="destroy"} currencies GET /currencies(.:format) {:controller="currencies", :action="index"} POST /currencies(.:format) {:controller="currencies", :action="create"} new_currency GET /currencies/new(.:format) {:controller="currencies", :action="new"} edit_currency GET /currencies/:id/edit(.:format) {:controller="currencies", :action="edit"} currency GET /currencies/:id(.:format) {:controller="currencies", :action="show"} PUT /currencies/:id(.:format) {:controller="currencies", :action="update"} DELETE /currencies/:id(.:format) {:controller="currencies", :action="destroy"} projects GET /projects(.:format) {:controller="projects", :action="index"} POST /projects(.:format) {:controller="projects", :action="create"} new_project GET /projects/new(.:format) {:controller="projects", :action="new"} edit_project GET /projects/:id/edit(.:format) {:controller="projects", :action="edit"} project GET /projects/:id(.:format) {:controller="projects", :action="show"} PUT /projects/:id(.:format) {:controller="projects", :action="update"} DELETE /projects/:id(.:format) {:controller="projects", :action="destroy"} groups GET /groups(.:format) {:controller="groups", :action="index"} POST /groups(.:format) {:controller="groups", :action="create"} new_group GET /groups/new(.:format) {:controller="groups", :action="new"} edit_group GET /groups/:id/edit(.:format) {:controller="groups", :action="edit"} group GET /groups/:id(.:format) {:controller="groups", :action="show"} PUT /groups/:id(.:format) {:controller="groups", :action="update"} DELETE /groups/:id(.:format) {:controller="groups", :action="destroy"} users GET /users(.:format) {:controller="users", :action="index"} POST /users(.:format) {:controller="users", :action="create"} new_user GET /users/new(.:format) {:controller="users", :action="new"} edit_user GET /users/:id/edit(.:format) {:controller="users", :action="edit"} user GET /users/:id(.:format) {:controller="users", :action="show"} 
PUT /users/:id(.:format) {:controller="users", :action="update"} DELETE /users/:id(.:format) {:controller="users", :action="destroy"} /:controller/:action/:id /:controller/:action/:id(.:format) D:\Rails\rails_apps\fxrails -v Rails 2.3.8 Thanks in advance for the help -Jon

    Read the article

  • rotating bitmaps. In code.

    - by Marco van de Voort
    Is there a faster way to rotate a large bitmap by 90 or 270 degrees than simply doing a nested loop with inverted coordinates? The bitmaps are 8bpp and typically 2048*2400*8bpp Currently I do this by simply copying with argument inversion, roughly (pseudo code: for x = 0 to 2048-1 for y = 0 to 2048-1 dest[x][y]=src[y][x]; (In reality I do it with pointers, for a bit more speed, but that is roughly the same magnitude) GDI is quite slow with large images, and GPU load/store times for textures (GF7 cards) are in the same magnitude as the current CPU time. Any tips, pointers? An in-place algorithm would even be better, but speed is more important than being in-place. Target is Delphi, but it is more an algorithmic question. SSE(2) vectorization no problem, it is a big enough problem for me to code it in assembler Duplicates How do you rotate a two dimensional array?. Follow up to Nils' answer Image 2048x2700 - 2700x2048 Compiler Turbo Explorer 2006 with optimization on. Windows: Power scheme set to "Always on". (important!!!!) Machine: Core2 6600 (2.4 GHz) time with old routine: 32ms (step 1) time with stepsize 8 : 12ms time with stepsize 16 : 10ms time with stepsize 32+ : 9ms Meanwhile I also tested on a Athlon 64 X2 (5200+ iirc), and the speed up there was slightly more than a factor four (80 to 19 ms). The speed up is well worth it, thanks. Maybe that during the summer months I'll torture myself with a SSE(2) version. However I already thought about how to tackle that, and I think I'll run out of SSE2 registers for an straight implementation: for n:=0 to 7 do begin load r0, <source+n*rowsize> shift byte from r0 into r1 shift byte from r0 into r2 .. shift byte from r0 into r8 end; store r1, <target> store r2, <target+1*<rowsize> .. store r8, <target+7*<rowsize> So 8x8 needs 9 registers, but 32-bits SSE only has 8. Anyway that is something for the summer months :-) Note that the pointer thing is something that I do out of instinct, but it could be there is actually something to it, if your dimensions are not hardcoded, the compiler can't turn the mul into a shift. While muls an sich are cheap nowadays, they also generate more register pressure afaik. The code (validated by subtracting result from the "naieve" rotate1 implementation): const stepsize = 32; procedure rotatealign(Source: tbw8image; Target:tbw8image); var stepsx,stepsy,restx,resty : Integer; RowPitchSource, RowPitchTarget : Integer; pSource, pTarget,ps1,ps2 : pchar; x,y,i,j: integer; rpstep : integer; begin RowPitchSource := source.RowPitch; // bytes to jump to next line. Can be negative (includes alignment) RowPitchTarget := target.RowPitch; rpstep:=RowPitchTarget*stepsize; stepsx:=source.ImageWidth div stepsize; stepsy:=source.ImageHeight div stepsize; // check if mod 16=0 here for both dimensions, if so -> SSE2. for y := 0 to stepsy - 1 do begin psource:=source.GetImagePointer(0,y*stepsize); // gets pointer to pixel x,y ptarget:=Target.GetImagePointer(target.imagewidth-(y+1)*stepsize,0); for x := 0 to stepsx - 1 do begin for i := 0 to stepsize - 1 do begin ps1:=@psource[rowpitchsource*i]; // ( 0,i) ps2:=@ptarget[stepsize-1-i]; // (maxx-i,0); for j := 0 to stepsize - 1 do begin ps2[0]:=ps1[j]; inc(ps2,RowPitchTarget); end; end; inc(psource,stepsize); inc(ptarget,rpstep); end; end; // 3 more areas to do, with dimensions // - stepsy*stepsize * restx // right most column of restx width // - stepsx*stepsize * resty // bottom row with resty height // - restx*resty // bottom-right rectangle. 
restx:=source.ImageWidth mod stepsize; // typically zero because width is // typically 1024 or 2048 resty:=source.Imageheight mod stepsize; if restx>0 then begin // one loop less, since we know this fits in one line of "blocks" psource:=source.GetImagePointer(source.ImageWidth-restx,0); // gets pointer to pixel x,y ptarget:=Target.GetImagePointer(Target.imagewidth-stepsize,Target.imageheight-restx); for y := 0 to stepsy - 1 do begin for i := 0 to stepsize - 1 do begin ps1:=@psource[rowpitchsource*i]; // ( 0,i) ps2:=@ptarget[stepsize-1-i]; // (maxx-i,0); for j := 0 to restx - 1 do begin ps2[0]:=ps1[j]; inc(ps2,RowPitchTarget); end; end; inc(psource,stepsize*RowPitchSource); dec(ptarget,stepsize); end; end; if resty>0 then begin // one loop less, since we know this fits in one line of "blocks" psource:=source.GetImagePointer(0,source.ImageHeight-resty); // gets pointer to pixel x,y ptarget:=Target.GetImagePointer(0,0); for x := 0 to stepsx - 1 do begin for i := 0 to resty- 1 do begin ps1:=@psource[rowpitchsource*i]; // ( 0,i) ps2:=@ptarget[resty-1-i]; // (maxx-i,0); for j := 0 to stepsize - 1 do begin ps2[0]:=ps1[j]; inc(ps2,RowPitchTarget); end; end; inc(psource,stepsize); inc(ptarget,rpstep); end; end; if (resty>0) and (restx>0) then begin // another loop less, since only one block psource:=source.GetImagePointer(source.ImageWidth-restx,source.ImageHeight-resty); // gets pointer to pixel x,y ptarget:=Target.GetImagePointer(0,target.ImageHeight-restx); for i := 0 to resty- 1 do begin ps1:=@psource[rowpitchsource*i]; // ( 0,i) ps2:=@ptarget[resty-1-i]; // (maxx-i,0); for j := 0 to restx - 1 do begin ps2[0]:=ps1[j]; inc(ps2,RowPitchTarget); end; end; end; end;
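    For readers who want the gist of the blocked copy without wading through the Delphi above, here is a minimal sketch in C# (an illustration only, not the author's routine). It assumes a square image stored as a flat 8bpp byte array whose row pitch equals its width; the RotateBlocked name and the blockSize parameter are made up for the example, and it follows the question's dest[x][y] = src[y][x] convention.

    using System;

    static class BitmapRotation
    {
        // Copy src into dst with swapped coordinates, tile by tile, so that both
        // the reads and the writes stay inside a small block that fits in cache.
        static void RotateBlocked(byte[] src, byte[] dst, int size, int blockSize)
        {
            for (int by = 0; by < size; by += blockSize)
            {
                for (int bx = 0; bx < size; bx += blockSize)
                {
                    int yMax = Math.Min(by + blockSize, size);
                    int xMax = Math.Min(bx + blockSize, size);
                    for (int y = by; y < yMax; y++)
                    {
                        for (int x = bx; x < xMax; x++)
                        {
                            dst[x * size + y] = src[y * size + x]; // dest[x][y] = src[y][x]
                        }
                    }
                }
            }
        }
    }

    The timings quoted in the question (roughly 32 ms down to 9-12 ms, with diminishing returns past a 32-pixel block) suggest the win comes almost entirely from this blocking, not from anything clever inside the inner loop.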

    Read the article

  • Xml failing to deserialise

    - by Carnotaurus
    I call a method to get my pages [see GetPages(String xmlFullFilePath)]. The FromXElement method is supposed to deserialise the LitePropertyData elements to strongly type LitePropertyData objects. Instead it fails on the following line: return (T)xmlSerializer.Deserialize(memoryStream); and gives the following error: <LitePropertyData xmlns=''> was not expected. What am I doing wrong? I have included the methods that I call and the xml data: public static T FromXElement<T>(this XElement xElement) { using (var memoryStream = new MemoryStream(Encoding.ASCII.GetBytes(xElement.ToString()))) { var xmlSerializer = new XmlSerializer(typeof(T)); return (T)xmlSerializer.Deserialize(memoryStream); } } public static List<LitePageData> GetPages(String xmlFullFilePath) { XDocument document = XDocument.Load(xmlFullFilePath); List<LitePageData> results = (from record in document.Descendants("row") select new LitePageData { Guid = IsValid(record, "Guid") ? record.Element("Guid").Value : null, ParentID = IsValid(record, "ParentID") ? Convert.ToInt32(record.Element("ParentID").Value) : (Int32?)null, Created = Convert.ToDateTime(record.Element("Created").Value), Changed = Convert.ToDateTime(record.Element("Changed").Value), Name = record.Element("Name").Value, ID = Convert.ToInt32(record.Element("ID").Value), LitePageTypeID = IsValid(record, "ParentID") ? Convert.ToInt32(record.Element("ParentID").Value) : (Int32?)null, Html = record.Element("Html").Value, FriendlyName = record.Element("FriendlyName").Value, Properties = record.Element("Properties") != null ? record.Element("Properties").Element("LitePropertyData").FromXElement<List<LitePropertyData>>() : new List<LitePropertyData>() }).ToList(); return results; } Here is the xml: <?xml version="1.0" encoding="utf-8"?> <root> <rows> <row> <ID>1</ID> <ImageUrl></ImageUrl> <Html>Home page</Html> <Created>01-01-2012</Created> <Changed>01-01-2012</Changed> <Name>Home page</Name> <FriendlyName>home-page</FriendlyName> </row> <row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <Guid>edeaf468-f490-4271-bf4d-be145bc6a1fd</Guid> <ID>8</ID> <Name>Unused</Name> <ParentID>1</ParentID> <Created>2006-03-25T10:57:17</Created> <Changed>2012-07-17T12:24:30.0984747+01:00</Changed> <ChangedBy /> <LitePageTypeID xsi:nil="true" /> <Html> What is the purpose of this option? This option checks the current document for accessibility issues. It uses Bobby to provide details of whether the current web page conforms to W3C's WCAG criteria for web content accessibility. Issues with Bobby and Cynthia Bobby and Cynthia are free services that supposedly allow a user to expose web page accessibility barriers. It is something of a guide but perhaps a blunt instrument. I tested a few of the webpages that I have designed. Sure enough, my pages fall short and for good reason. I am not about to claim that Bobby and Cynthia are useless. Although it is useful and commendable tool, it project appears to be overly ambitious. Nevertheless, let me explain my issues with Bobby and Cynthia: First, certain W3C standards for designing web documents are often too strict and unworkable. For instance, in some versions W3C standards for HTML, certain tags should not include a particular attribute, whereas in others they are requisite if the document is to be ???well-formed???. The standard that a designer chooses is determined usually by the requirements specification document. 
This specifies which browsers and versions of those browsers that the web page is expected to correctly display. Forcing a hypertext document to conform strictly to a specific W3C standard for HTML is often no simple task. In the worst case, it cannot conform without losing some aesthetics or accessibility functionality. Second, the case of HTML documents is not an isolated case. Standards for XML, XSL, JavaScript, VBScript, are analogous. Therefore, you might imagine the problems when you begin to combine these languages and formats in an HTML document. Third, there is always more than one way to skin a cat. For example, Bobby and Cynthia may flag those IMG tags that do not contain a TITLE attribute. There might be good reason that a web developer chooses not to include the title attribute. The title attribute has a limited numbers of characters and does not support carriage returns. This is a major defect in the design of this tag. In fact, before the TITLE attribute was supported, there was the ALT attribute. Most browsers support both, yet they both perform a similar function. However, both attributes share the same deficiencies. In practice, there are instances where neither attribute would be used. Instead, for example, the developer would write some JavaScript or VBScript to circumvent these deficiencies. The concern is that Bobby and Cynthia would not notice this because it does not ???understand??? what the JavaScript does. </Html> <FriendlyName>unused</FriendlyName> <IsDeleted>false</IsDeleted> <Properties> <LitePropertyData> <Description>Image for the page</Description> <DisplayEditUI>true</DisplayEditUI> <OwnerTab>1</OwnerTab> <DisplayName>Image Url</DisplayName> <FieldOrder>1</FieldOrder> <IsRequired>false</IsRequired> <Name>ImageUrl</Name> <IsModified>false</IsModified> <ParentPageID>3</ParentPageID> <Type>String</Type> <Value xsi:type="xsd:string">smarter.jpg</Value> </LitePropertyData> <LitePropertyData> <Description>WebItemApplicationEnum</Description> <DisplayEditUI>true</DisplayEditUI> <OwnerTab>1</OwnerTab> <DisplayName>WebItemApplicationEnum</DisplayName> <FieldOrder>1</FieldOrder> <IsRequired>false</IsRequired> <Name>WebItemApplicationEnum</Name> <IsModified>false</IsModified> <ParentPageID>3</ParentPageID> <Type>Number</Type> <Value xsi:type="xsd:string">1</Value> </LitePropertyData> </Properties> <Seo> <Author>Phil Carney</Author> <Classification /> <Copyright>Carnotaurus</Copyright> <Description> What is the purpose of this option? This option checks the current document for accessibility issues. It uses Bobby to provide details of whether the current web page conforms to W3C's WCAG criteria for web content accessibility. Issues with Bobby and Cynthia Bobby and Cynthia are free services that supposedly allow a user to expose web page accessibility barriers. It is something of a guide but perhaps a blunt instrument. I tested a few of the webpages that I have designed. Sure enough, my pages fall short and for good reason. I am not about to claim that Bobby and Cynthia are useless. Although it is useful and commendable tool, it project appears to be overly ambitious. Nevertheless, let me explain my issues with Bobby and Cynthia: First, certain W3C standards for designing web documents are often too strict and unworkable. For instance, in some versions W3C standards for HTML, certain tags should not include a particular attribute, whereas in others they are requisite if the document is to be ???well-formed???. 
The standard that a designer chooses is determined usually by the requirements specification document. This specifies which browsers and versions of those browsers that the web page is expected to correctly display. Forcing a hypertext document to conform strictly to a specific W3C standard for HTML is often no simple task. In the worst case, it cannot conform without losing some aesthetics or accessibility functionality. Second, the case of HTML documents is not an isolated case. Standards for XML, XSL, JavaScript, VBScript, are analogous. Therefore, you might imagine the problems when you begin to combine these languages and formats in an HTML document. Third, there is always more than one way to skin a cat. For example, Bobby and Cynthia may flag those IMG tags that do not contain a TITLE attribute. There might be good reason that a web developer chooses not to include the title attribute. The title attribute has a limited numbers of characters and does not support carriage returns. This is a major defect in the design of this tag. In fact, before the TITLE attribute was supported, there was the ALT attribute. Most browsers support both, yet they both perform a similar function. However, both attributes share the same deficiencies. In practice, there are instances where neither attribute would be used. Instead, for example, the developer would write some JavaScript or VBScript to circumvent these deficiencies. The concern is that Bobby and Cynthia would not notice this because it does not ???understand??? what the JavaScript does. </Description> <Keywords>unused</Keywords> <Title>unused</Title> </Seo> </row> </rows> </root> EDIT Here are my entities: public class LitePropertyData { public virtual string Description { get; set; } public virtual bool DisplayEditUI { get; set; } public int OwnerTab { get; set; } public virtual string DisplayName { get; set; } public int FieldOrder { get; set; } public bool IsRequired { get; set; } public string Name { get; set; } public virtual bool IsModified { get; set; } public virtual int ParentPageID { get; set; } public LiteDataType Type { get; set; } public object Value { get; set; } } [Serializable] public class LitePageData { public String Guid { get; set; } public Int32 ID { get; set; } public String Name { get; set; } public Int32? ParentID { get; set; } public DateTime Created { get; set; } public String CreatedBy { get; set; } public DateTime Changed { get; set; } public String ChangedBy { get; set; } public Int32? LitePageTypeID { get; set; } public String Html { get; set; } public String FriendlyName { get; set; } public Boolean IsDeleted { get; set; } public List<LitePropertyData> Properties { get; set; } public LiteSeoPageData Seo { get; set; } /// <summary> /// Saves the specified XML full file path. /// </summary> /// <param name="xmlFullFilePath">The XML full file path.</param> public void Save(String xmlFullFilePath) { XDocument doc = XDocument.Load(xmlFullFilePath); XElement demoNode = this.ToXElement<LitePageData>(); demoNode.Name = "row"; doc.Descendants("rows").Single().Add(demoNode); doc.Save(xmlFullFilePath); } }
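    A hedged note on the error (a sketch, not a confirmed answer): "<LitePropertyData xmlns=''> was not expected" is the message XmlSerializer gives when the root element it is handed does not match the type it was built for. FromXElement<List<LitePropertyData>>() builds a serializer that expects a root named ArrayOfLitePropertyData, but the code passes it a single <LitePropertyData> child. One possible fix is to deserialize the whole <Properties> element as the list and tell the serializer to accept that element's name as the root; the helper name ListFromXElement below is made up for illustration.

    using System.Collections.Generic;
    using System.Xml.Linq;
    using System.Xml.Serialization;

    public static class XElementListExtensions
    {
        // Deserializes a container element such as <Properties> into a List<T>,
        // overriding the root element name the serializer would otherwise expect.
        public static List<T> ListFromXElement<T>(this XElement element)
        {
            var root = new XmlRootAttribute(element.Name.LocalName);
            var serializer = new XmlSerializer(typeof(List<T>), root);
            using (var reader = element.CreateReader())
            {
                return (List<T>)serializer.Deserialize(reader);
            }
        }
    }

    The call in GetPages would then become record.Element("Properties").ListFromXElement<LitePropertyData>(). Two caveats: serializers constructed with an XmlRootAttribute override are not cached by the framework, so cache them yourself if this runs in a loop; and the object-typed Value property with its xsi:type attribute may need extra handling (the xsi/xsd namespace declarations have to be in scope on the element being read).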

    Read the article

  • questions regarding the use of A* with the 15-square puzzle

    - by Cheeso
    I'm trying to build an A* solver for a 15-square puzzle. The goal is to re-arrange the tiles so that they appear in their natural positions. You can only slide one tile at a time. Each possible state of the puzzle is a node in the search graph. For the h(x) function, I am using an aggregate sum, across all tiles, of the tile's dislocation from the goal state. In the above image, the 5 is at location 0,0, and it belongs at location 1,0, therefore it contributes 1 to the h(x) function. The next tile is the 11, located at 0,1, and belongs at 2,2, therefore it contributes 3 to h(x). And so on. EDIT: I now understand this is what they call "Manhattan distance", or "taxicab distance". I have been using a step count for g(x). In my implementation, for any node in the state graph, g is just +1 from the prior node's g. To find successive nodes, I just examine where I can possibly move the "hole" in the puzzle. There are 3 neighbors for the puzzle state (aka node) that is displayed: the hole can move north, west, or east. My A* search sometimes converges to a solution in 20s, sometimes 180s, and sometimes doesn't converge at all (waited 10 mins or more). I think h is reasonable. I'm wondering if I've modeled g properly. In other words, is it possible that my A* function is reaching a node in the graph via a path that is not the shortest path? Maybe I have not waited long enough? Maybe 10 minutes is not long enough? For a fully random arrangement (assuming no parity problems), what is the average number of permutations an A* solution will examine? (please show the math) I'm going to look for logic errors in my code, but in the meantime, any tips? (ps: it's done in Javascript). Also, no, this isn't CompSci homework. It's just a personal exploration thing. I'm just trying to learn Javascript. EDIT: I've found that the run-time is highly dependent upon the heuristic. I saw the 10x factor applied to the heuristic from the article someone mentioned, and it made me wonder - why 10x? Why linear? Because this is done in javascript, I could modify the code to dynamically update an html table with the node currently being considered. This allowed me to peek at the algorithm as it was progressing. With a regular taxicab distance heuristic, I watched as it failed to converge. There were 5's and 12's in the top row, and they kept hanging around. I'd see 1,2,3,4 creep into the top row, but then they'd drop out, and other numbers would move up there. What I was hoping to see was 1,2,3,4 sort of creeping up to the top, and then staying there. I thought to myself - this is not the way I solve this personally. Doing this manually, I solve the top row, then the 2nd row, then the 3rd and 4th rows sort of concurrently. So I tweaked the h(x) function to more heavily weight the higher rows and the "lefter" columns. The result was that the A* converged much more quickly. It now runs in 3 minutes instead of "indefinitely". With the "peek" I talked about, I can see the smaller numbers creep up to the higher rows and stay there. Not only does this seem like the right thing, it runs much faster. I'm in the process of trying a bunch of variations. It seems pretty clear that A* runtime is very sensitive to the heuristic. Currently the best heuristic I've found uses the summation of dislocation * ((4-i) + (4-j)) where i and j are the row and column, and dislocation is the taxicab distance. One interesting part of the result I got: with a particular heuristic I find a path very quickly, but it is obviously not the shortest path. 
I think this is because I am weighting the heuristic. In one case I got a path of 178 steps in 10s. My own manual effort produce a solution in 87 moves. (much more than 10s). More investigation warranted. So the result is I am seeing it converge must faster, and the path is definitely not the shortest. I have to think about this more. Code: var stop = false; function Astar(start, goal, callback) { // start and goal are nodes in the graph, represented by // an array of 16 ints. The goal is: [1,2,3,...14,15,0] // Zero represents the hole. // callback is a method to call when finished. This runs a long time, // therefore we need to use setTimeout() to break it up, to avoid // the browser warning like "Stop running this script?" // g is the actual distance traveled from initial node to current node. // h is the heuristic estimate of distance from current to goal. stop = false; start.g = start.dontgo = 0; // calcHeuristic inserts an .h member into the array calcHeuristicDistance(start); // start the stack with one element var closed = []; // set of nodes already evaluated. var open = [ start ]; // set of nodes to evaluate (start with initial node) var iteration = function() { if (open.length==0) { // no more nodes. Fail. callback(null); return; } var current = open.shift(); // get highest priority node // update the browser with a table representation of the // node being evaluated $("#solution").html(stateToString(current)); // check solution returns true if current == goal if (checkSolution(current,goal)) { // reconstructPath just records the position of the hole // through each node var path= reconstructPath(start,current); callback(path); return; } closed.push(current); // get the set of neighbors. This is 3 or fewer nodes. // (nextStates is optimized to NOT turn directly back on itself) var neighbors = nextStates(current, goal); for (var i=0; i<neighbors.length; i++) { var n = neighbors[i]; // skip this one if we've already visited it if (closed.containsNode(n)) continue; // .g, .h, and .previous get assigned implicitly when // calculating neighbors. n.g is nothing more than // current.g+1 ; // add to the open list if (!open.containsNode(n)) { // slot into the list, in priority order (minimum f first) open.priorityPush(n); n.previous = current; } } if (stop) { callback(null); return; } setTimeout(iteration, 1); }; // kick off the first iteration iteration(); return null; }
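    For reference, here is the weighted heuristic described above written out explicitly — a sketch in C# rather than the question's JavaScript, taking i and j to be the tile's current row and column (one reading of the description) and a 16-element, row-major state array with 0 as the hole. The WeightedManhattan name is made up for the example. Note that multiplying the taxicab distance by a positional weight makes the heuristic inadmissible, which is consistent with the observation above that the search gets faster but the returned path is no longer the shortest.

    using System;

    static class FifteenPuzzleHeuristics
    {
        // Sum over all tiles of: taxicab distance to the tile's goal cell,
        // weighted so dislocations in the top rows / left columns cost more.
        static int WeightedManhattan(int[] state)
        {
            int h = 0;
            for (int index = 0; index < 16; index++)
            {
                int tile = state[index];
                if (tile == 0) continue;                      // skip the hole
                int i = index / 4, j = index % 4;             // current row/column
                int gi = (tile - 1) / 4, gj = (tile - 1) % 4; // goal row/column
                int dislocation = Math.Abs(i - gi) + Math.Abs(j - gj);
                h += dislocation * ((4 - i) + (4 - j));       // dislocation * ((4-i)+(4-j))
            }
            return h;
        }
    }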

    Read the article

  • A way of doing real-world test-driven development (and some thoughts about it)

    - by Thomas Weller
    Lately, I exchanged some arguments with Derick Bailey about some details of the red-green-refactor cycle of the Test-driven development process. In short, the issue revolved around the fact that it’s not enough to have a test red or green, but it’s also important to have it red or green for the right reasons. While for me, it’s sufficient to initially have a NotImplementedException in place, Derick argues that this is not totally correct (see these two posts: Red/Green/Refactor, For The Right Reasons and Red For The Right Reason: Fail By Assertion, Not By Anything Else). And he’s right. But on the other hand, I had no idea how his insights could have any practical consequence for my own individual interpretation of the red-green-refactor cycle (which is not really red-green-refactor, at least not in its pure sense, see the rest of this article). This made me think deeply for some days now. In the end I found out that the ‘right reason’ changes in my understanding depending on what development phase I’m in. To make this clear (at least I hope it becomes clear…) I started to describe my way of working in some detail, and then something strange happened: The scope of the article slightly shifted from focusing ‘only’ on the ‘right reason’ issue to something more general, which you might describe as something like  'Doing real-world TDD in .NET , with massive use of third-party add-ins’. This is because I feel that there is a more general statement about Test-driven development to make:  It’s high time to speak about the ‘How’ of TDD, not always only the ‘Why’. Much has been said about this, and me myself also contributed to that (see here: TDD is not about testing, it's about how we develop software). But always justifying what you do is very unsatisfying in the long run, it is inherently defensive, and it costs time and effort that could be used for better and more important things. And frankly: I’m somewhat sick and tired of repeating time and again that the test-driven way of software development is highly preferable for many reasons - I don’t want to spent my time exclusively on stating the obvious… So, again, let’s say it clearly: TDD is programming, and programming is TDD. Other ways of programming (code-first, sometimes called cowboy-coding) are exceptional and need justification. – I know that there are many people out there who will disagree with this radical statement, and I also know that it’s not a description of the real world but more of a mission statement or something. But nevertheless I’m absolutely sure that in some years this statement will be nothing but a platitude. Side note: Some parts of this post read as if I were paid by Jetbrains (the manufacturer of the ReSharper add-in – R#), but I swear I’m not. Rather I think that Visual Studio is just not production-complete without it, and I wouldn’t even consider to do professional work without having this add-in installed... The three parts of a software component Before I go into some details, I first should describe my understanding of what belongs to a software component (assembly, type, or method) during the production process (i.e. the coding phase). Roughly, I come up with the three parts shown below:   First, we need to have some initial sort of requirement. This can be a multi-page formal document, a vague idea in some programmer’s brain of what might be needed, or anything in between. In either way, there has to be some sort of requirement, be it explicit or not. 
– At the C# micro-level, the best way that I found to formulate that is to define interfaces for just about everything, even for internal classes, and to provide them with exhaustive xml comments. The next step then is to re-formulate these requirements in an executable form. This is specific to the respective programming language. - For C#/.NET, the Gallio framework (which includes MbUnit) in conjunction with the ReSharper add-in for Visual Studio is my toolset of choice. The third part then finally is the production code itself. It’s development is entirely driven by the requirements and their executable formulation. This is the delivery, the two other parts are ‘only’ there to make its production possible, to give it a decent quality and reliability, and to significantly reduce related costs down the maintenance timeline. So while the first two parts are not really relevant for the customer, they are very important for the developer. The customer (or in Scrum terms: the Product Owner) is not interested at all in how  the product is developed, he is only interested in the fact that it is developed as cost-effective as possible, and that it meets his functional and non-functional requirements. The rest is solely a matter of the developer’s craftsmanship, and this is what I want to talk about during the remainder of this article… An example To demonstrate my way of doing real-world TDD, I decided to show the development of a (very) simple Calculator component. The example is deliberately trivial and silly, as examples always are. I am totally aware of the fact that real life is never that simple, but I only want to show some development principles here… The requirement As already said above, I start with writing down some words on the initial requirement, and I normally use interfaces for that, even for internal classes - the typical question “intf or not” doesn’t even come to mind. I need them for my usual workflow and using them automatically produces high componentized and testable code anyway. To think about their usage in every single situation would slow down the production process unnecessarily. So this is what I begin with: namespace Calculator {     /// <summary>     /// Defines a very simple calculator component for demo purposes.     /// </summary>     public interface ICalculator     {         /// <summary>         /// Gets the result of the last successful operation.         /// </summary>         /// <value>The last result.</value>         /// <remarks>         /// Will be <see langword="null" /> before the first successful operation.         /// </remarks>         double? LastResult { get; }       } // interface ICalculator   } // namespace Calculator So, I’m not beginning with a test, but with a sort of code declaration - and still I insist on being 100% test-driven. There are three important things here: Starting this way gives me a method signature, which allows to use IntelliSense and AutoCompletion and thus eliminates the danger of typos - one of the most regular, annoying, time-consuming, and therefore expensive sources of error in the development process. In my understanding, the interface definition as a whole is more of a readable requirement document and technical documentation than anything else. So this is at least as much about documentation than about coding. The documentation must completely describe the behavior of the documented element. I normally use an IoC container or some sort of self-written provider-like model in my architecture. 
In either case, I need my components defined via service interfaces anyway. - I will use the LinFu IoC framework here, for no other reason as that is is very simple to use. The ‘Red’ (pt. 1)   First I create a folder for the project’s third-party libraries and put the LinFu.Core dll there. Then I set up a test project (via a Gallio project template), and add references to the Calculator project and the LinFu dll. Finally I’m ready to write the first test, which will look like the following: namespace Calculator.Test {     [TestFixture]     public class CalculatorTest     {         private readonly ServiceContainer container = new ServiceContainer();           [Test]         public void CalculatorLastResultIsInitiallyNull()         {             ICalculator calculator = container.GetService<ICalculator>();               Assert.IsNull(calculator.LastResult);         }       } // class CalculatorTest   } // namespace Calculator.Test       This is basically the executable formulation of what the interface definition states (part of). Side note: There’s one principle of TDD that is just plain wrong in my eyes: I’m talking about the Red is 'does not compile' thing. How could a compiler error ever be interpreted as a valid test outcome? I never understood that, it just makes no sense to me. (Or, in Derick’s terms: this reason is as wrong as a reason ever could be…) A compiler error tells me: Your code is incorrect, but nothing more.  Instead, the ‘Red’ part of the red-green-refactor cycle has a clearly defined meaning to me: It means that the test works as intended and fails only if its assumptions are not met for some reason. Back to our Calculator. When I execute the above test with R#, the Gallio plugin will give me this output: So this tells me that the test is red for the wrong reason: There’s no implementation that the IoC-container could load, of course. So let’s fix that. With R#, this is very easy: First, create an ICalculator - derived type:        Next, implement the interface members: And finally, move the new class to its own file: So far my ‘work’ was six mouse clicks long, the only thing that’s left to do manually here, is to add the Ioc-specific wiring-declaration and also to make the respective class non-public, which I regularly do to force my components to communicate exclusively via interfaces: This is what my Calculator class looks like as of now: using System; using LinFu.IoC.Configuration;   namespace Calculator {     [Implements(typeof(ICalculator))]     internal class Calculator : ICalculator     {         public double? LastResult         {             get             {                 throw new NotImplementedException();             }         }     } } Back to the test fixture, we have to put our IoC container to work: [TestFixture] public class CalculatorTest {     #region Fields       private readonly ServiceContainer container = new ServiceContainer();       #endregion // Fields       #region Setup/TearDown       [FixtureSetUp]     public void FixtureSetUp()     {        container.LoadFrom(AppDomain.CurrentDomain.BaseDirectory, "Calculator.dll");     }       ... Because I have a R# live template defined for the setup/teardown method skeleton as well, the only manual coding here again is the IoC-specific stuff: two lines, not more… The ‘Red’ (pt. 2) Now, the execution of the above test gives the following result: This time, the test outcome tells me that the method under test is called. 
And this is the point, where Derick and I seem to have somewhat different views on the subject: Of course, the test still is worthless regarding the red/green outcome (or: it’s still red for the wrong reasons, in that it gives a false negative). But as far as I am concerned, I’m not really interested in the test outcome at this point of the red-green-refactor cycle. Rather, I only want to assert that my test actually calls the right method. If that’s the case, I will happily go on to the ‘Green’ part… The ‘Green’ Making the test green is quite trivial. Just make LastResult an automatic property:     [Implements(typeof(ICalculator))]     internal class Calculator : ICalculator     {         public double? LastResult { get; private set; }     }         One more round… Now on to something slightly more demanding (cough…). Let’s state that our Calculator exposes an Add() method:         ...   /// <summary>         /// Adds the specified operands.         /// </summary>         /// <param name="operand1">The operand1.</param>         /// <param name="operand2">The operand2.</param>         /// <returns>The result of the additon.</returns>         /// <exception cref="ArgumentException">         /// Argument <paramref name="operand1"/> is &lt; 0.<br/>         /// -- or --<br/>         /// Argument <paramref name="operand2"/> is &lt; 0.         /// </exception>         double Add(double operand1, double operand2);       } // interface ICalculator A remark: I sometimes hear the complaint that xml comment stuff like the above is hard to read. That’s certainly true, but irrelevant to me, because I read xml code comments with the CR_Documentor tool window. And using that, it looks like this:   Apart from that, I’m heavily using xml code comments (see e.g. here for a detailed guide) because there is the possibility of automating help generation with nightly CI builds (using MS Sandcastle and the Sandcastle Help File Builder), and then publishing the results to some intranet location.  This way, a team always has first class, up-to-date technical documentation at hand about the current codebase. (And, also very important for speeding up things and avoiding typos: You have IntelliSense/AutoCompletion and R# support, and the comments are subject to compiler checking…).     Back to our Calculator again: Two more R# – clicks implement the Add() skeleton:         ...           public double Add(double operand1, double operand2)         {             throw new NotImplementedException();         }       } // class Calculator As we have stated in the interface definition (which actually serves as our requirement document!), the operands are not allowed to be negative. So let’s start implementing that. Here’s the test: [Test] [Row(-0.5, 2)] public void AddThrowsOnNegativeOperands(double operand1, double operand2) {     ICalculator calculator = container.GetService<ICalculator>();       Assert.Throws<ArgumentException>(() => calculator.Add(operand1, operand2)); } As you can see, I’m using a data-driven unit test method here, mainly for these two reasons: Because I know that I will have to do the same test for the second operand in a few seconds, I save myself from implementing another test method for this purpose. Rather, I only will have to add another Row attribute to the existing one. From the test report below, you can see that the argument values are explicitly printed out. 
This can be a valuable documentation feature even when everything is green: One can quickly review what values were tested exactly - the complete Gallio HTML-report (as it will be produced by the Continuous Integration runs) shows these values in a quite clear format (see below for an example). Back to our Calculator development again, this is what the test result tells us at the moment: So we’re red again, because there is not yet an implementation… Next we go on and implement the necessary parameter verification to become green again, and then we do the same thing for the second operand. To make a long story short, here’s the test and the method implementation at the end of the second cycle: // in CalculatorTest:   [Test] [Row(-0.5, 2)] [Row(295, -123)] public void AddThrowsOnNegativeOperands(double operand1, double operand2) {     ICalculator calculator = container.GetService<ICalculator>();       Assert.Throws<ArgumentException>(() => calculator.Add(operand1, operand2)); }   // in Calculator: public double Add(double operand1, double operand2) {     if (operand1 < 0.0)     {         throw new ArgumentException("Value must not be negative.", "operand1");     }     if (operand2 < 0.0)     {         throw new ArgumentException("Value must not be negative.", "operand2");     }     throw new NotImplementedException(); } So far, we have sheltered our method from unwanted input, and now we can safely operate on the parameters without further caring about their validity (this is my interpretation of the Fail Fast principle, which is regarded here in more detail). Now we can think about the method’s successful outcomes. First let’s write another test for that: [Test] [Row(1, 1, 2)] public void TestAdd(double operand1, double operand2, double expectedResult) {     ICalculator calculator = container.GetService<ICalculator>();       double result = calculator.Add(operand1, operand2);       Assert.AreEqual(expectedResult, result); } Again, I’m regularly using row based test methods for these kinds of unit tests. The above shown pattern proved to be extremely helpful for my development work, I call it the Defined-Input/Expected-Output test idiom: You define your input arguments together with the expected method result. There are two major benefits from that way of testing: In the course of refining a method, it’s very likely to come up with additional test cases. In our case, we might add tests for some edge cases like ‘one of the operands is zero’ or ‘the sum of the two operands causes an overflow’, or maybe there’s an external test protocol that has to be fulfilled (e.g. an ISO norm for medical software), and this results in the need of testing against additional values. In all these scenarios we only have to add another Row attribute to the test. Remember that the argument values are written to the test report, so as a side-effect this produces valuable documentation. (This can become especially important if the fulfillment of some sort of external requirements has to be proven). 
So your test method might look something like that in the end: [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 2)] [Row(0, 999999999, 999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, double.MaxValue)] [Row(4, double.MaxValue - 2.5, double.MaxValue)] public void TestAdd(double operand1, double operand2, double expectedResult) {     ICalculator calculator = container.GetService<ICalculator>();       double result = calculator.Add(operand1, operand2);       Assert.AreEqual(expectedResult, result); } And this will produce the following HTML report (with Gallio):   Not bad for the amount of work we invested in it, huh? - There might be scenarios where reports like that can be useful for demonstration purposes during a Scrum sprint review… The last requirement to fulfill is that the LastResult property is expected to store the result of the last operation. I don’t show this here, it’s trivial enough and brings nothing new… And finally: Refactor (for the right reasons) To demonstrate my way of going through the refactoring portion of the red-green-refactor cycle, I added another method to our Calculator component, namely Subtract(). Here’s the code (tests and production): // CalculatorTest.cs:   [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 0)] [Row(0, 999999999, -999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, -double.MaxValue)] [Row(4, double.MaxValue - 2.5, -double.MaxValue)] public void TestSubtract(double operand1, double operand2, double expectedResult) {     ICalculator calculator = container.GetService<ICalculator>();       double result = calculator.Subtract(operand1, operand2);       Assert.AreEqual(expectedResult, result); }   [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 0)] [Row(0, 999999999, -999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, -double.MaxValue)] [Row(4, double.MaxValue - 2.5, -double.MaxValue)] public void TestSubtractGivesExpectedLastResult(double operand1, double operand2, double expectedResult) {     ICalculator calculator = container.GetService<ICalculator>();       calculator.Subtract(operand1, operand2);       Assert.AreEqual(expectedResult, calculator.LastResult); }   ...   // ICalculator.cs: /// <summary> /// Subtracts the specified operands. /// </summary> /// <param name="operand1">The operand1.</param> /// <param name="operand2">The operand2.</param> /// <returns>The result of the subtraction.</returns> /// <exception cref="ArgumentException"> /// Argument <paramref name="operand1"/> is &lt; 0.<br/> /// -- or --<br/> /// Argument <paramref name="operand2"/> is &lt; 0. /// </exception> double Subtract(double operand1, double operand2);   ...   // Calculator.cs:   public double Subtract(double operand1, double operand2) {     if (operand1 < 0.0)     {         throw new ArgumentException("Value must not be negative.", "operand1");     }       if (operand2 < 0.0)     {         throw new ArgumentException("Value must not be negative.", "operand2");     }       return (this.LastResult = operand1 - operand2).Value; }   Obviously, the argument validation stuff that was produced during the red-green part of our cycle duplicates the code from the previous Add() method. So, to avoid code duplication and minimize the number of code lines of the production code, we do an Extract Method refactoring. 
One more time, this is only a matter of a few mouse clicks (and giving the new method a name) with R#: Having done that, our production code finally looks like that: using System; using LinFu.IoC.Configuration;   namespace Calculator {     [Implements(typeof(ICalculator))]     internal class Calculator : ICalculator     {         #region ICalculator           public double? LastResult { get; private set; }           public double Add(double operand1, double operand2)         {             ThrowIfOneOperandIsInvalid(operand1, operand2);               return (this.LastResult = operand1 + operand2).Value;         }           public double Subtract(double operand1, double operand2)         {             ThrowIfOneOperandIsInvalid(operand1, operand2);               return (this.LastResult = operand1 - operand2).Value;         }           #endregion // ICalculator           #region Implementation (Helper)           private static void ThrowIfOneOperandIsInvalid(double operand1, double operand2)         {             if (operand1 < 0.0)             {                 throw new ArgumentException("Value must not be negative.", "operand1");             }               if (operand2 < 0.0)             {                 throw new ArgumentException("Value must not be negative.", "operand2");             }         }           #endregion // Implementation (Helper)       } // class Calculator   } // namespace Calculator But is the above worth the effort at all? It’s obviously trivial and not very impressive. All our tests were green (for the right reasons), and refactoring the code did not change anything. It’s not immediately clear how this refactoring work adds value to the project. Derick puts it like this: STOP! Hold on a second… before you go any further and before you even think about refactoring what you just wrote to make your test pass, you need to understand something: if your done with your requirements after making the test green, you are not required to refactor the code. I know… I’m speaking heresy, here. Toss me to the wolves, I’ve gone over to the dark side! Seriously, though… if your test is passing for the right reasons, and you do not need to write any test or any more code for you class at this point, what value does refactoring add? Derick immediately answers his own question: So why should you follow the refactor portion of red/green/refactor? When you have added code that makes the system less readable, less understandable, less expressive of the domain or concern’s intentions, less architecturally sound, less DRY, etc, then you should refactor it. I couldn’t state it more precise. From my personal perspective, I’d add the following: You have to keep in mind that real-world software systems are usually quite large and there are dozens or even hundreds of occasions where micro-refactorings like the above can be applied. It’s the sum of them all that counts. And to have a good overall quality of the system (e.g. in terms of the Code Duplication Percentage metric) you have to be pedantic on the individual, seemingly trivial cases. My job regularly requires the reading and understanding of ‘foreign’ code. 
So code quality/readability really makes a HUGE difference for me – sometimes it can be even the difference between project success and failure… Conclusions The above described development process emerged over the years, and there were mainly two things that guided its evolution (you might call it eternal principles, personal beliefs, or anything in between): Test-driven development is the normal, natural way of writing software, code-first is exceptional. So ‘doing TDD or not’ is not a question. And good, stable code can only reliably be produced by doing TDD (yes, I know: many will strongly disagree here again, but I’ve never seen high-quality code – and high-quality code is code that stood the test of time and causes low maintenance costs – that was produced code-first…) It’s the production code that pays our bills in the end. (Though I have seen customers these days who demand an acceptance test battery as part of the final delivery. Things seem to go into the right direction…). The test code serves ‘only’ to make the production code work. But it’s the number of delivered features which solely counts at the end of the day - no matter how much test code you wrote or how good it is. With these two things in mind, I tried to optimize my coding process for coding speed – or, in business terms: productivity - without sacrificing the principles of TDD (more than I’d do either way…).  As a result, I consider a ratio of about 3-5/1 for test code vs. production code as normal and desirable. In other words: roughly 60-80% of my code is test code (This might sound heavy, but that is mainly due to the fact that software development standards only begin to evolve. The entire software development profession is very young, historically seen; only at the very beginning, and there are no viable standards yet. If you think about software development as a kind of casting process, where the test code is the mold and the resulting production code is the final product, then the above ratio sounds no longer extraordinary…) Although the above might look like very much unnecessary work at first sight, it’s not. With the aid of the mentioned add-ins, doing all the above is a matter of minutes, sometimes seconds (while writing this post took hours and days…). The most important thing is to have the right tools at hand. Slow developer machines or the lack of a tool or something like that - for ‘saving’ a few 100 bucks -  is just not acceptable and a very bad decision in business terms (though I quite some times have seen and heard that…). Production of high-quality products needs the usage of high-quality tools. This is a platitude that every craftsman knows… The here described round-trip will take me about five to ten minutes in my real-world development practice. I guess it’s about 30% more time compared to developing the ‘traditional’ (code-first) way. But the so manufactured ‘product’ is of much higher quality and massively reduces maintenance costs, which is by far the single biggest cost factor, as I showed in this previous post: It's the maintenance, stupid! (or: Something is rotten in developerland.). In the end, this is a highly cost-effective way of software development… But on the other hand, there clearly is a trade-off here: coding speed vs. code quality/later maintenance costs. The here described development method might be a perfect fit for the overwhelming majority of software projects, but there certainly are some scenarios where it’s not - e.g. 
if time-to-market is crucial for a software project. So this is a business decision in the end. It's just that you have to know what you're doing and what consequences it might have…

Some last words

First, I'd like to thank Derick Bailey again. His two aforementioned posts (which I strongly recommend reading) inspired me to think deeply about my own personal way of doing TDD and to clarify my thoughts about it. I wouldn't have done that without this inspiration. I really enjoy that kind of discussion… I agree with him in all respects. But I don't know (yet?) how to bring his insights into the production process described here without slowing things down. The method described above has proved "good enough" in my practical experience. But of course, I'm open to suggestions here… My rationale for now is: if the test is initially red during the red-green-refactor cycle, the 'right reason' is that it actually calls the right method, but this method is not yet operational. Later on, when the cycle is finished and the tests become part of the regular, automated Continuous Integration process, 'red' certainly must occur for the 'right reason': in this phase, 'red' MUST mean nothing but an unfulfilled assertion - Fail By Assertion, Not By Anything Else!
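To make that last point a little more concrete, here is a minimal sketch of what 'Fail By Assertion, Not By Anything Else' can look like in a test like the ones above. It is not code from the original post: the Assert.IsNotNull precondition guard and its message are my own addition, assuming the usual IsNotNull helper is available in the test framework.

[Test]
[Row(2, 3, 5)]
public void TestAdd_FailsOnlyByAssertion(double operand1, double operand2, double expectedResult)
{
    // Precondition: container wiring is not what this test is about. If resolution
    // fails, that should read as a broken precondition, not as the 'red' we care about.
    ICalculator calculator = container.GetService<ICalculator>();
    Assert.IsNotNull(calculator, "Precondition failed: ICalculator is not registered.");

    // The only intended way for this test to go red once it is part of the CI runs:
    // the assertion below is unfulfilled.
    Assert.AreEqual(expectedResult, calculator.Add(operand1, operand2));
}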

    Read the article

  • 256 Windows Azure Worker Roles, Windows Kinect and a 90's Text-Based Ray-Tracer

    - by Alan Smith
For a couple of years I have been demoing a simple render farm hosted in Windows Azure using worker roles and the Azure Storage service. At the start of the presentation I would deploy an Azure application that uses 16 worker roles to render a 1,500 frame 3D ray-traced animation. At the end of the presentation, when the animation was complete, I would play the animation and then delete the Azure deployment. The standing joke with the audience was that it was a "$2 demo", as the compute charges for running the 16 instances for an hour were $1.92; factor in the bandwidth charges and it's a couple of dollars. The point of the demo is that it highlights one of the great benefits of cloud computing: you pay for what you use, and if you need massive compute power for a short period of time, using Windows Azure can work out very cost effective. The "$2 demo" was great for presenting at user groups and conferences in that it could be deployed to Azure, used to render an animation, and then removed in a one-hour session. I have always had the idea of doing something a bit more impressive with the demo, and scaling it from a "$2 demo" to a "$30 demo". The challenge was to create a visually appealing animation in high definition format and keep the demo time down to one hour. This article will take a run through how I achieved this.

Ray Tracing

Ray tracing, a technique for generating high quality photorealistic images, gained popularity in the 90's with companies like Pixar creating feature-length computer animations, and also with the emergence of shareware text-based ray tracers that could run on a home PC. In order to render a ray-traced image, the ray of light that would pass from the view point must be tracked until it intersects with an object. At the intersection, the color, reflectiveness, transparency, and refractive index of the object are used to calculate whether the ray will be reflected or refracted. Each pixel may require thousands of calculations to determine what color it will be in the rendered image.

Pin-Board Toys

Having very little artistic talent and a basic understanding of maths, I decided to focus on an animation that could be modeled fairly easily and would look visually impressive. I've always liked the pin-board desktop toys that became popular in the 80's, and when I was working as a 3D animator back in the 90's I always had the idea of creating a 3D ray-traced animation of a pin-board, but never found the energy to do it. Even if I had had a go at it, the render time to produce an animation that would look respectable on a 486 would have been measured in months.

PolyRay

Back in 1995 I landed my first real job, after spending three years being a beach-ski-climbing-paragliding-bum, and was employed to create 3D ray-traced animations for a CD-ROM that school kids would use to learn physics. I had got into the strange and wonderful world of text-based ray tracing, and was using a shareware ray-tracer called PolyRay. PolyRay takes a text file describing a scene as input and, after a few hours of processing on a 486, produces a high quality ray-traced image. The following is an example of a basic PolyRay scene file.
background Midnight_Blue   static define matte surface { ambient 0.1 diffuse 0.7 } define matte_white texture { matte { color white } } define matte_black texture { matte { color dark_slate_gray } } define position_cylindrical 3 define lookup_sawtooth 1 define light_wood <0.6, 0.24, 0.1> define median_wood <0.3, 0.12, 0.03> define dark_wood <0.05, 0.01, 0.005>     define wooden texture { noise surface { ambient 0.2  diffuse 0.7  specular white, 0.5 microfacet Reitz 10 position_fn position_cylindrical position_scale 1  lookup_fn lookup_sawtooth octaves 1 turbulence 1 color_map( [0.0, 0.2, light_wood, light_wood] [0.2, 0.3, light_wood, median_wood] [0.3, 0.4, median_wood, light_wood] [0.4, 0.7, light_wood, light_wood] [0.7, 0.8, light_wood, median_wood] [0.8, 0.9, median_wood, light_wood] [0.9, 1.0, light_wood, dark_wood]) } } define glass texture { surface { ambient 0 diffuse 0 specular 0.2 reflection white, 0.1 transmission white, 1, 1.5 }} define shiny surface { ambient 0.1 diffuse 0.6 specular white, 0.6 microfacet Phong 7  } define steely_blue texture { shiny { color black } } define chrome texture { surface { color white ambient 0.0 diffuse 0.2 specular 0.4 microfacet Phong 10 reflection 0.8 } }   viewpoint {     from <4.000, -1.000, 1.000> at <0.000, 0.000, 0.000> up <0, 1, 0> angle 60     resolution 640, 480 aspect 1.6 image_format 0 }       light <-10, 30, 20> light <-10, 30, -20>   object { disc <0, -2, 0>, <0, 1, 0>, 30 wooden }   object { sphere <0.000, 0.000, 0.000>, 1.00 chrome } object { cylinder <0.000, 0.000, 0.000>, <0.000, 0.000, -4.000>, 0.50 chrome }   After setting up the background and defining colors and textures, the viewpoint is specified. The “camera” is located at a point in 3D space, and it looks towards another point. The angle, image resolution, and aspect ratio are specified. Two lights are present in the image at defined coordinates. The three objects in the image are a wooden disc to represent a table top, and a sphere and cylinder that intersect to form a pin that will be used for the pin board toy in the final animation. When the image is rendered, the following image is produced. The pins are modeled with a chrome surface, so they reflect the environment around them. Note that the scale of the pin shaft is not correct, this will be fixed later. Modeling the Pin Board The frame of the pin-board is made up of three boxes, and six cylinders, the front box is modeled using a clear, slightly reflective solid, with the same refractive index of glass. The other shapes are modeled as metal. object { box <-5.5, -1.5, 1>, <5.5, 5.5, 1.2> glass } object { box <-5.5, -1.5, -0.04>, <5.5, 5.5, -0.09> steely_blue } object { box <-5.5, -1.5, -0.52>, <5.5, 5.5, -0.59> steely_blue } object { cylinder <-5.2, -1.2, 1.4>, <-5.2, -1.2, -0.74>, 0.2 steely_blue } object { cylinder <5.2, -1.2, 1.4>, <5.2, -1.2, -0.74>, 0.2 steely_blue } object { cylinder <-5.2, 5.2, 1.4>, <-5.2, 5.2, -0.74>, 0.2 steely_blue } object { cylinder <5.2, 5.2, 1.4>, <5.2, 5.2, -0.74>, 0.2 steely_blue } object { cylinder <0, -1.2, 1.4>, <0, -1.2, -0.74>, 0.2 steely_blue } object { cylinder <0, 5.2, 1.4>, <0, 5.2, -0.74>, 0.2 steely_blue }   In order to create the matrix of pins that make up the pin board I used a basic console application with a few nested loops to create two intersecting matrixes of pins, which models the layout used in the pin boards. The resulting image is shown below. The pin board contains 11,481 pins, with the scene file containing 23,709 lines of code. 
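The console application that generates the pin matrix is not shown in the article; a minimal sketch of how such a generator might look is given below. It simply nests a few loops and writes one sphere/cylinder pair per pin, in the same style as the hand-written pin above. The counts, spacing, offsets, radii and file name are illustrative assumptions, not the values used for the real 11,481-pin board.

using System.IO;
using System.Text;

class PinBoardGenerator
{
    static void Main()
    {
        var scene = new StringBuilder();
        const double spacing = 0.1;          // distance between pin centres (assumed)
        const int rows = 76, columns = 76;   // illustrative counts only

        // Two offset, intersecting matrixes of pins approximate the staggered
        // layout of the real pin-board toy.
        for (int layer = 0; layer < 2; layer++)
        {
            double offset = layer * spacing / 2.0;
            for (int row = 0; row < rows; row++)
            {
                for (int col = 0; col < columns; col++)
                {
                    double x = offset + col * spacing - 3.8;
                    double y = offset + row * spacing - 1.0;
                    double z = 0.0; // later set per frame from the depth image

                    // Each pin is a chrome sphere for the head plus a thin cylinder for the shaft.
                    scene.AppendFormat("object {{ sphere <{0:0.000}, {1:0.000}, {2:0.000}>, 0.040 chrome }}\n", x, y, z);
                    scene.AppendFormat("object {{ cylinder <{0:0.000}, {1:0.000}, {2:0.000}>, <{0:0.000}, {1:0.000}, {3:0.000}>, 0.015 chrome }}\n", x, y, z, z - 0.6);
                }
            }
        }

        File.WriteAllText("pinboard.pi", scene.ToString());
    }
}

Because only the Z coordinate of each pin changes from frame to frame, the same generator can be re-run per depth image to produce the scene files for the full animation.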
For the complete animation 2,000 scene files will be created, which is over 47 million lines of code. Each pin in the pin-board will slide out a specific distance when an object is pressed into the back of the board. This is easily modeled by setting the Z coordinate of the pin to a specific value. In order to set all of the pins in the pin-board to the correct position, a bitmap image can be used: the position of each pin is set based on the color of the pixel at the corresponding position in the image. When the Windows Azure logo is used to set the Z coordinates of the pins, the following image is generated. The challenge now was to make a cool animation. The Azure logo is fine, but it is static. Using a normal video to animate the pins would not work; the colors in the video would not correspond to the depth of the objects from the camera. In order to simulate the pin board accurately, a series of frames from a depth camera could be used.

Windows Kinect

The Kinect controllers for the X-Box 360 and Windows feature a depth camera. The Kinect SDK for Windows provides a programming interface for Kinect, providing easy access for .NET developers to the Kinect sensors. The Kinect Explorer provided with the Kinect SDK is a great starting point for exploring Kinect from a developer's perspective. Both the X-Box 360 Kinect and the Windows Kinect will work with the Kinect SDK; the Windows Kinect is required for commercial applications, but the X-Box Kinect can be used for hobby projects. The Windows Kinect has the advantage of providing a mode that allows depth capture with objects closer to the camera, which makes for a more accurate depth image for setting the pin positions.

Creating a Depth Field Animation

The depth field animation used to set the positions of the pins in the pin board was created using a modified version of the Kinect Explorer sample application. In order to simulate the pin board accurately, a small section of the depth range from the depth sensor will be used. Any part of the object in front of the depth range will result in a white pixel; anything behind the depth range will be black. Within the depth range the pixels in the image will be set to RGB values from 0,0,0 to 255,255,255. A screen shot of the modified Kinect Explorer application is shown below. The Kinect Explorer sample application was modified to include slider controls that are used to set the depth range that forms the image from the depth stream. This allows the fine tuning of the depth image that is required for simulating the position of the pins in the pin board. The Kinect Explorer was also modified to record a series of images from the depth camera and save them as a sequence of JPEG files that will be used to animate the pins in the animation; the Start and Stop buttons are used to start and stop the image recording. An example of one of the depth images is shown below. Once a series of 2,000 depth images has been captured, the task of creating the animation can begin.

Rendering a Test Frame

In order to test the creation of frames and get an approximation of the time required to render each frame, a test frame was rendered on-premise using PolyRay. The output of the rendering process is shown below. The test frame contained 23,629 primitive shapes, most of which are the spheres and cylinders that are used for the 11,800 or so pins in the pin board.
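Stepping back to the depth capture for a moment: the near/far mapping described above (white in front of the depth range, black behind it, a gray ramp within it) is simple enough to sketch in a few lines. The helper below is a hypothetical illustration rather than code from the modified Kinect Explorer; it assumes raw depth values in millimetres and slider-controlled near/far bounds.

// Hypothetical sketch: map one raw depth frame to grayscale values that later
// drive the Z coordinates of the pins.
static byte[] DepthFrameToGray(ushort[] depthsMm, int nearMm, int farMm)
{
    var gray = new byte[depthsMm.Length];
    for (int i = 0; i < depthsMm.Length; i++)
    {
        int d = depthsMm[i];
        if (d <= nearMm)
        {
            gray[i] = 255;   // in front of the range -> white (pin pushed fully out)
        }
        else if (d >= farMm)
        {
            gray[i] = 0;     // behind the range -> black (pin fully in)
        }
        else
        {
            // inside the range -> linear ramp from white at nearMm down to black at farMm
            gray[i] = (byte)(255 - 255 * (d - nearMm) / (farMm - nearMm));
        }
    }
    return gray;
}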
The 1280x720 image contains 921,600 pixels, but as anti-aliasing was used the number of rays that were calculated was 4,235,777, with 3,478,754,073 object boundaries checked. The test frame of the pin board with the depth field image applied is shown below. The tracing time for the test frame was 4 minutes 27 seconds, which means rendering the 2,000 frames in the animation would take over 148 hours, or a little over 6 days. Although this is much faster than an old 486, waiting almost a week to see the results of an animation would make it challenging for animators to create, view, and refine their animations. It would be much better if the animation could be rendered in less than one hour.

Windows Azure Worker Roles

The cost of creating an on-premise render farm to render animations increases in proportion to the number of servers. The table below shows the cost of servers for creating a render farm, assuming a cost of $500 per server.

Number of Servers    Cost
1                    $500
16                   $8,000
256                  $128,000

As well as the cost of the servers, there would be additional costs for networking, racks etc. Hosting an environment of 256 servers on-premise would require a server room with cooling, and some pretty hefty power cabling. The Windows Azure compute services provide worker roles, which are ideal for performing processor-intensive compute tasks. With the scalability available in Windows Azure, a job that takes 256 hours to complete could be performed using different numbers of worker roles. The time and cost of using 1, 16 or 256 worker roles is shown below.

Number of Worker Roles    Render Time    Cost
1                         256 hours      $30.72
16                        16 hours       $30.72
256                       1 hour         $30.72

Using worker roles in Windows Azure provides the same cost for the 256-hour job, irrespective of the number of worker roles used. Provided the compute task can be broken down into many small units, and the worker role compute power can be used effectively, it makes sense to scale the application so that the task is completed quickly, making the results available in a timely fashion. The task of rendering 2,000 frames in an animation is one that can easily be broken down into 2,000 individual pieces, which can be performed by a number of worker roles.

Creating a Render Farm in Windows Azure

The architecture of the render farm is shown in the following diagram. The render farm is a hybrid application with the following components:

·         On-Premise
o   Windows Kinect – Used in combination with the Kinect Explorer to create a stream of depth images.
o   Animation Creator – This application uses the depth images from the Kinect sensor to create scene description files for PolyRay. These files are then uploaded to the jobs blob container, and job messages added to the jobs queue.
o   Process Monitor – This application queries the role instance lifecycle table and displays statistics about the render farm environment and render process.
o   Image Downloader – This application polls the image queue and downloads the rendered animation files once they are complete.
·         Windows Azure
o   Azure Storage – Queues and blobs are used for the scene description files and completed frames. A table is used to store the statistics about the rendering environment.

The architecture of each worker role is shown below. The worker role is configured to use local storage, which provides file storage on the worker role instance that can be used by the applications to render the image and transform the format of the image.
The service definition for the worker role with the local storage configuration highlighted is shown below. <?xml version="1.0" encoding="utf-8"?> <ServiceDefinition name="CloudRay" >   <WorkerRole name="CloudRayWorkerRole" vmsize="Small">     <Imports>     </Imports>     <ConfigurationSettings>       <Setting name="DataConnectionString" />     </ConfigurationSettings>     <LocalResources>       <LocalStorage name="RayFolder" cleanOnRoleRecycle="true" />     </LocalResources>   </WorkerRole> </ServiceDefinition>     The two executable programs, PolyRay.exe and DTA.exe are included in the Azure project, with Copy Always set as the property. PolyRay will take the scene description file and render it to a Truevision TGA file. As the TGA format has not seen much use since the mid 90’s it is converted to a JPG image using Dave's Targa Animator, another shareware application from the 90’s. Each worker roll will use the following process to render the animation frames. 1.       The worker process polls the job queue, if a job is available the scene description file is downloaded from blob storage to local storage. 2.       PolyRay.exe is started in a process with the appropriate command line arguments to render the image as a TGA file. 3.       DTA.exe is started in a process with the appropriate command line arguments convert the TGA file to a JPG file. 4.       The JPG file is uploaded from local storage to the images blob container. 5.       A message is placed on the images queue to indicate a new image is available for download. 6.       The job message is deleted from the job queue. 7.       The role instance lifecycle table is updated with statistics on the number of frames rendered by the worker role instance, and the CPU time used. The code for this is shown below. public override void Run() {     // Set environment variables     string polyRayPath = Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), PolyRayLocation);     string dtaPath = Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), DTALocation);       LocalResource rayStorage = RoleEnvironment.GetLocalResource("RayFolder");     string localStorageRootPath = rayStorage.RootPath;       JobQueue jobQueue = new JobQueue("renderjobs");     JobQueue downloadQueue = new JobQueue("renderimagedownloadjobs");     CloudRayBlob sceneBlob = new CloudRayBlob("scenes");     CloudRayBlob imageBlob = new CloudRayBlob("images");     RoleLifecycleDataSource roleLifecycleDataSource = new RoleLifecycleDataSource();       Frames = 0;       while (true)     {         // Get the render job from the queue         CloudQueueMessage jobMsg = jobQueue.Get();           if (jobMsg != null)         {             // Get the file details             string sceneFile = jobMsg.AsString;             string tgaFile = sceneFile.Replace(".pi", ".tga");             string jpgFile = sceneFile.Replace(".pi", ".jpg");               string sceneFilePath = Path.Combine(localStorageRootPath, sceneFile);             string tgaFilePath = Path.Combine(localStorageRootPath, tgaFile);             string jpgFilePath = Path.Combine(localStorageRootPath, jpgFile);               // Copy the scene file to local storage             sceneBlob.DownloadFile(sceneFilePath);               // Run the ray tracer.             
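            // PolyRay.exe is started as a separate process: the scene file just copied to
            // local storage is the input, and the -o argument names the TGA file to write.
            // WaitForExit() blocks this worker role thread until the frame has been rendered.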
string polyrayArguments =                 string.Format("\"{0}\" -o \"{1}\" -a 2", sceneFilePath, tgaFilePath);             Process polyRayProcess = new Process();             polyRayProcess.StartInfo.FileName =                 Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), polyRayPath);             polyRayProcess.StartInfo.Arguments = polyrayArguments;             polyRayProcess.Start();             polyRayProcess.WaitForExit();               // Convert the image             string dtaArguments =                 string.Format(" {0} /FJ /P{1}", tgaFilePath, Path.GetDirectoryName (jpgFilePath));             Process dtaProcess = new Process();             dtaProcess.StartInfo.FileName =                 Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), dtaPath);             dtaProcess.StartInfo.Arguments = dtaArguments;             dtaProcess.Start();             dtaProcess.WaitForExit();               // Upload the image to blob storage             imageBlob.UploadFile(jpgFilePath);               // Add a download job.             downloadQueue.Add(jpgFile);               // Delete the render job message             jobQueue.Delete(jobMsg);               Frames++;         }         else         {             Thread.Sleep(1000);         }           // Log the worker role activity.         roleLifecycleDataSource.Alive             ("CloudRayWorker", RoleLifecycleDataSource.RoleLifecycleId, Frames);     } }     Monitoring Worker Role Instance Lifecycle In order to get more accurate statistics about the lifecycle of the worker role instances used to render the animation data was tracked in an Azure storage table. The following class was used to track the worker role lifecycles in Azure storage.   public class RoleLifecycle : TableServiceEntity {     public string ServerName { get; set; }     public string Status { get; set; }     public DateTime StartTime { get; set; }     public DateTime EndTime { get; set; }     public long SecondsRunning { get; set; }     public DateTime LastActiveTime { get; set; }     public int Frames { get; set; }     public string Comment { get; set; }       public RoleLifecycle()     {     }       public RoleLifecycle(string roleName)     {         PartitionKey = roleName;         RowKey = Utils.GetAscendingRowKey();         Status = "Started";         StartTime = DateTime.UtcNow;         LastActiveTime = StartTime;         EndTime = StartTime;         SecondsRunning = 0;         Frames = 0;     } }     A new instance of this class is created and added to the storage table when the role starts. It is then updated each time the worker renders a frame to record the total number of frames rendered and the total processing time. These statistics are used be the monitoring application to determine the effectiveness of use of resources in the render farm. Rendering the Animation The Azure solution was deployed to Windows Azure with the service configuration set to 16 worker role instances. This allows for the application to be tested in the cloud environment, and the performance of the application determined. When I demo the application at conferences and user groups I often start with 16 instances, and then scale up the application to the full 256 instances. The configuration to run 16 instances is shown below. 
<?xml version="1.0" encoding="utf-8"?> <ServiceConfiguration serviceName="CloudRay" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration" osFamily="1" osVersion="*">   <Role name="CloudRayWorkerRole">     <Instances count="16" />     <ConfigurationSettings>       <Setting name="DataConnectionString"         value="DefaultEndpointsProtocol=https;AccountName=cloudraydata;AccountKey=..." />     </ConfigurationSettings>   </Role> </ServiceConfiguration>     About six minutes after deploying the application the first worker roles become active and start to render the first frames of the animation. The CloudRay Monitor application displays an icon for each worker role instance, with a number indicating the number of frames that the worker role has rendered. The statistics on the left show the number of active worker roles and statistics about the render process. The render time is the time since the first worker role became active; the CPU time is the total amount of processing time used by all worker role instances to render the frames.   Five minutes after the first worker role became active the last of the 16 worker roles activated. By this time the first seven worker roles had each rendered one frame of the animation.   With 16 worker roles u and running it can be seen that one hour and 45 minutes CPU time has been used to render 32 frames with a render time of just under 10 minutes.     At this rate it would take over 10 hours to render the 2,000 frames of the full animation. In order to complete the animation in under an hour more processing power will be required. Scaling the render farm from 16 instances to 256 instances is easy using the new management portal. The slider is set to 256 instances, and the configuration saved. We do not need to re-deploy the application, and the 16 instances that are up and running will not be affected. Alternatively, the configuration file for the Azure service could be modified to specify 256 instances.   <?xml version="1.0" encoding="utf-8"?> <ServiceConfiguration serviceName="CloudRay" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration" osFamily="1" osVersion="*">   <Role name="CloudRayWorkerRole">     <Instances count="256" />     <ConfigurationSettings>       <Setting name="DataConnectionString"         value="DefaultEndpointsProtocol=https;AccountName=cloudraydata;AccountKey=..." />     </ConfigurationSettings>   </Role> </ServiceConfiguration>     Six minutes after the new configuration has been applied 75 new worker roles have activated and are processing their first frames.   Five minutes later the full configuration of 256 worker roles is up and running. We can see that the average rate of frame rendering has increased from 3 to 12 frames per minute, and that over 17 hours of CPU time has been utilized in 23 minutes. In this test the time to provision 140 worker roles was about 11 minutes, which works out at about one every five seconds.   We are now half way through the rendering, with 1,000 frames complete. This has utilized just under three days of CPU time in a little over 35 minutes.   The animation is now complete, with 2,000 frames rendered in a little over 52 minutes. The CPU time used by the 256 worker roles is 6 days, 7 hours and 22 minutes with an average frame rate of 38 frames per minute. The rendering of the last 1,000 frames took 16 minutes 27 seconds, which works out at a rendering rate of 60 frames per minute. 
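As an aside, the aggregate figures quoted above - active instances, total frames, total CPU time and the average frame rate - are exactly the kind of numbers the Process Monitor can derive from the role instance lifecycle table with one pass over the RoleLifecycle entities shown earlier. The monitor's real implementation is not given in the article; a hypothetical sketch of that aggregation might look like this:

// Hypothetical aggregation over the RoleLifecycle entities defined earlier;
// 'entities' would come from a query against the lifecycle table.
static void PrintFarmStatistics(IEnumerable<RoleLifecycle> entities, DateTime firstActiveUtc)
{
    int activeRoles = 0;
    int totalFrames = 0;
    long totalCpuSeconds = 0;

    foreach (var role in entities)
    {
        activeRoles++;
        totalFrames += role.Frames;
        totalCpuSeconds += role.SecondsRunning;
    }

    double renderMinutes = (DateTime.UtcNow - firstActiveUtc).TotalMinutes;
    double framesPerMinute = renderMinutes > 0 ? totalFrames / renderMinutes : 0;

    Console.WriteLine("Active worker roles: {0}", activeRoles);
    Console.WriteLine("Frames rendered:     {0}", totalFrames);
    Console.WriteLine("CPU time used:       {0}", TimeSpan.FromSeconds(totalCpuSeconds));
    Console.WriteLine("Frames per minute:   {0:0.0}", framesPerMinute);
}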
The frame counts in the server instances indicate that the use of a queue to distribute the workload has been very effective in balancing the load across the 256 worker role instances. The 16 instances that were deployed first have rendered between 11 and 13 frames each, whilst the 240 instances that were added when the application was scaled have rendered between 6 and 9 frames each.

Completed Animation

I've uploaded the completed animation to YouTube; a low resolution preview is shown below.
Pin Board Animation Created using Windows Kinect and 256 Windows Azure Worker Roles
The animation can be viewed in 1280x720 resolution at the following link: http://www.youtube.com/watch?v=n5jy6bvSxWc

Effective Use of Resources

According to the CloudRay monitor statistics the animation took 6 days, 7 hours and 22 minutes of CPU time to render; this works out at 152 hours of compute time, rounded up to the nearest hour. As worker role instances are billed for the full hour, it may have been possible to render the animation using fewer than 256 worker roles. When deciding the optimal usage of resources, the time required to provision and start the worker roles must also be considered. In the demo I started with 16 worker roles, and then scaled the application to 256 worker roles. It would have been better to start the application with maybe 200 worker roles, and utilize the full hour that I was being billed for. This would, however, have prevented showing the ease of scalability of the application. The new management portal displays the CPU usage across the worker roles in the deployment. The average CPU usage across all instances is 93.27%, with over 99% used when all the instances are up and running. This shows that the worker role resources are being used very effectively.

Grid Computing Scenarios

Although I am using this scenario for a hobby project, there are many scenarios where a large amount of compute power is required for a short period of time. Windows Azure provides a great platform for developing these types of grid computing applications, and can work out very cost effective.

·         Windows Azure can provide massive compute power, on demand, in a matter of minutes.
·         The use of queues to manage the load balancing of jobs between role instances is a simple and effective solution.
·         Using a cloud-computing platform like Windows Azure allows proof-of-concept scenarios to be tested and evaluated on a very low budget.
·         With no charges for inbound data transfer, uploading large data sets to the Windows Azure Storage services is cost effective. (Transaction charges still apply.)

Tips for using Windows Azure for Grid Computing Scenarios

I found a render farm a fairly simple scenario to implement in Windows Azure. I was impressed by the ease of scalability that Azure provides, and by the short time that the application took to scale from 16 to 256 worker role instances. In this case it was around 13 minutes; in other tests it took between 10 and 20 minutes. The following tips may be useful when implementing a grid computing project in Windows Azure.

·         Using an Azure Storage queue to load-balance the units of work across multiple worker roles is simple and very effective (a distilled sketch of this pattern follows after this list). The design I have used in this scenario could easily scale to many thousands of worker role instances.
·         Windows Azure accounts are typically limited to 20 cores.
If you need to use more than this, a call to support and a credit card check will be required.
·         Be aware of how the billing model works. You will be charged for worker role instances for the full clock hour in which the instance is deployed. Schedule the workload to start just after the clock hour has started.
·         Monitor the utilization of the resources you are provisioning, and ensure that you are not paying for worker roles that are idle.
·         If you are deploying third-party applications to worker roles, you may well run into licensing issues. Purchasing software licenses on a per-processor basis when using hundreds of processors for a short time period would not be cost effective.
·         Third-party software may also require installation onto the worker roles, which can be accomplished using start-up tasks. Bear in mind that adding a start-up task and a possible re-boot will add to the time required for the worker role instance to start and activate. An alternative may be to use a prepared VM and use VM roles.
·         Consider using the Windows Azure Autoscaling Application Block (WASABi) to autoscale the worker roles in your application. When using a large number of worker roles, the utilization must be carefully monitored; if the scaling algorithms are not optimal, it could get very expensive!
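To make the queue tip above concrete, the whole load-balancing pattern can be distilled into a few lines. The sketch below is a framework-agnostic restatement of the Run() loop shown earlier, not code from the article; the IRenderJobQueue interface, the RenderJob class and their members are hypothetical.

using System;

// Hypothetical distillation of the queue-based worker loop used by CloudRay.
public interface IRenderJobQueue
{
    RenderJob TryGetNext();        // returns null when the queue is empty
    void Complete(RenderJob job);  // deletes the message so no other worker repeats it
}

public sealed class RenderJob
{
    public string SceneFileName { get; set; }
}

public static class QueueWorkerLoop
{
    // Each worker instance runs this loop independently; the queue itself provides
    // the load balancing, so adding instances needs no extra coordination.
    public static void Run(IRenderJobQueue queue, Action<RenderJob> renderFrame)
    {
        while (true)
        {
            RenderJob job = queue.TryGetNext();
            if (job == null)
            {
                System.Threading.Thread.Sleep(1000); // back off when there is no work
                continue;
            }

            renderFrame(job);      // download scene, run PolyRay, upload JPG, etc.
            queue.Complete(job);
        }
    }
}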

    Read the article

  • Debian squeeze keyboard and touchpad not working / detected on laptop

    - by Esa
    They work before gdm3 starts. a connected mouse also stops working, but functions after removal and re-plug. no xorg.conf. log doesn't show any loading of drivers for kbd/touchpad [ 33.783] X.Org X Server 1.10.4 Release Date: 2011-08-19 [ 33.783] X Protocol Version 11, Revision 0 [ 33.783] Build Operating System: Linux 3.0.0-1-amd64 x86_64 Debian [ 33.783] Current Operating System: Linux sus 3.2.0-0.bpo.2-amd64 #1 SMP Sun Mar 25 10:33:35 UTC 2012 x86_64 [ 33.783] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.2.0-0.bpo.2-amd64 root=UUID=8686f840-d165-4d1e-b995-2ebbd94aa3d2 ro quiet [ 33.783] Build Date: 28 August 2011 09:39:43PM [ 33.783] xorg-server 2:1.10.4-1~bpo60+1 (Cyril Brulebois <[email protected]>) [ 33.783] Current version of pixman: 0.16.4 [ 33.783] Before reporting problems, check http://wiki.x.org to make sure that you have the latest version. [ 33.783] Markers: (--) probed, (**) from config file, (==) default setting, (++) from command line, (!!) notice, (II) informational, (WW) warning, (EE) error, (NI) not implemented, (??) unknown. [ 33.783] (==) Log file: "/var/log/Xorg.0.log", Time: Wed Mar 28 09:34:04 2012 [ 33.837] (==) Using system config directory "/usr/share/X11/xorg.conf.d" [ 33.936] (==) No Layout section. Using the first Screen section. [ 33.936] (==) No screen section available. Using defaults. [ 33.936] (**) |-->Screen "Default Screen Section" (0) [ 33.936] (**) | |-->Monitor "<default monitor>" [ 33.936] (==) No monitor specified for screen "Default Screen Section". Using a default monitor configuration. [ 33.936] (==) Automatically adding devices [ 33.936] (==) Automatically enabling devices [ 34.164] (WW) The directory "/usr/share/fonts/X11/cyrillic" does not exist. [ 34.164] Entry deleted from font path. [ 34.226] (==) FontPath set to: /usr/share/fonts/X11/misc, /usr/share/fonts/X11/100dpi/:unscaled, /usr/share/fonts/X11/75dpi/:unscaled, /usr/share/fonts/X11/Type1, /usr/share/fonts/X11/100dpi, /usr/share/fonts/X11/75dpi, /var/lib/defoma/x-ttcidfont-conf.d/dirs/TrueType, built-ins [ 34.226] (==) ModulePath set to "/usr/lib/xorg/modules" [ 34.226] (II) The server relies on udev to provide the list of input devices. If no devices become available, reconfigure udev or disable AutoAddDevices. 
[ 34.226] (II) Loader magic: 0x7d3ae0 [ 34.226] (II) Module ABI versions: [ 34.226] X.Org ANSI C Emulation: 0.4 [ 34.226] X.Org Video Driver: 10.0 [ 34.226] X.Org XInput driver : 12.2 [ 34.226] X.Org Server Extension : 5.0 [ 34.227] (--) PCI:*(0:1:5:0) 1002:9712:103c:1661 rev 0, Mem @ 0xd0000000/268435456, 0xf1400000/65536, 0xf1300000/1048576, I/O @ 0x00008000/256 [ 34.227] (--) PCI: (0:2:0:0) 1002:6760:103c:1661 rev 0, Mem @ 0xe0000000/268435456, 0xf0300000/131072, I/O @ 0x00004000/256, BIOS @ 0x????????/131072 [ 34.227] (II) Open ACPI successful (/var/run/acpid.socket) [ 34.227] (II) LoadModule: "extmod" [ 34.249] (II) Loading /usr/lib/xorg/modules/extensions/libextmod.so [ 34.277] (II) Module extmod: vendor="X.Org Foundation" [ 34.277] compiled for 1.10.4, module version = 1.0.0 [ 34.277] Module class: X.Org Server Extension [ 34.277] ABI class: X.Org Server Extension, version 5.0 [ 34.277] (II) Loading extension SELinux [ 34.277] (II) Loading extension MIT-SCREEN-SAVER [ 34.277] (II) Loading extension XFree86-VidModeExtension [ 34.277] (II) Loading extension XFree86-DGA [ 34.277] (II) Loading extension DPMS [ 34.277] (II) Loading extension XVideo [ 34.277] (II) Loading extension XVideo-MotionCompensation [ 34.277] (II) Loading extension X-Resource [ 34.277] (II) LoadModule: "dbe" [ 34.277] (II) Loading /usr/lib/xorg/modules/extensions/libdbe.so [ 34.299] (II) Module dbe: vendor="X.Org Foundation" [ 34.299] compiled for 1.10.4, module version = 1.0.0 [ 34.299] Module class: X.Org Server Extension [ 34.299] ABI class: X.Org Server Extension, version 5.0 [ 34.299] (II) Loading extension DOUBLE-BUFFER [ 34.299] (II) LoadModule: "glx" [ 34.299] (II) Loading /usr/lib/xorg/modules/extensions/libglx.so [ 34.477] (II) Module glx: vendor="X.Org Foundation" [ 34.477] compiled for 1.10.4, module version = 1.0.0 [ 34.477] ABI class: X.Org Server Extension, version 5.0 [ 34.477] (==) AIGLX enabled [ 34.477] (II) Loading extension GLX [ 34.477] (II) LoadModule: "record" [ 34.478] (II) Loading /usr/lib/xorg/modules/extensions/librecord.so [ 34.481] (II) Module record: vendor="X.Org Foundation" [ 34.481] compiled for 1.10.4, module version = 1.13.0 [ 34.481] Module class: X.Org Server Extension [ 34.481] ABI class: X.Org Server Extension, version 5.0 [ 34.481] (II) Loading extension RECORD [ 34.481] (II) LoadModule: "dri" [ 34.481] (II) Loading /usr/lib/xorg/modules/extensions/libdri.so [ 34.512] (II) Module dri: vendor="X.Org Foundation" [ 34.512] compiled for 1.10.4, module version = 1.0.0 [ 34.512] ABI class: X.Org Server Extension, version 5.0 [ 34.512] (II) Loading extension XFree86-DRI [ 34.512] (II) LoadModule: "dri2" [ 34.512] (II) Loading /usr/lib/xorg/modules/extensions/libdri2.so [ 34.515] (II) Module dri2: vendor="X.Org Foundation" [ 34.515] compiled for 1.10.4, module version = 1.2.0 [ 34.515] ABI class: X.Org Server Extension, version 5.0 [ 34.515] (II) Loading extension DRI2 [ 34.515] (==) Matched ati as autoconfigured driver 0 [ 34.515] (==) Matched vesa as autoconfigured driver 1 [ 34.515] (==) Matched fbdev as autoconfigured driver 2 [ 34.515] (==) Assigned the driver to the xf86ConfigLayout [ 34.515] (II) LoadModule: "ati" [ 34.706] (II) Loading /usr/lib/xorg/modules/drivers/ati_drv.so [ 34.724] (II) Module ati: vendor="X.Org Foundation" [ 34.724] compiled for 1.10.3, module version = 6.14.2 [ 34.724] Module class: X.Org Video Driver [ 34.724] ABI class: X.Org Video Driver, version 10.0 [ 34.724] (II) LoadModule: "radeon" [ 34.725] (II) Loading 
/usr/lib/xorg/modules/drivers/radeon_drv.so [ 34.923] (II) Module radeon: vendor="X.Org Foundation" [ 34.923] compiled for 1.10.3, module version = 6.14.2 [ 34.923] Module class: X.Org Video Driver [ 34.923] ABI class: X.Org Video Driver, version 10.0 [ 34.945] (II) LoadModule: "vesa" [ 34.945] (II) Loading /usr/lib/xorg/modules/drivers/vesa_drv.so [ 34.988] (II) Module vesa: vendor="X.Org Foundation" [ 34.988] compiled for 1.10.3, module version = 2.3.0 [ 34.988] Module class: X.Org Video Driver [ 34.988] ABI class: X.Org Video Driver, version 10.0 [ 34.988] (II) LoadModule: "fbdev" [ 34.988] (II) Loading /usr/lib/xorg/modules/drivers/fbdev_drv.so [ 35.020] (II) Module fbdev: vendor="X.Org Foundation" [ 35.020] compiled for 1.10.3, module version = 0.4.2 [ 35.020] ABI class: X.Org Video Driver, version 10.0 [ 35.020] (II) RADEON: Driver for ATI Radeon chipsets: <snip> [ 35.023] (II) VESA: driver for VESA chipsets: vesa [ 35.023] (II) FBDEV: driver for framebuffer: fbdev [ 35.023] (++) using VT number 7 [ 35.033] (II) Loading /usr/lib/xorg/modules/drivers/radeon_drv.so [ 35.033] (II) [KMS] Kernel modesetting enabled. [ 35.033] (WW) Falling back to old probe method for vesa [ 35.034] (WW) Falling back to old probe method for fbdev [ 35.034] (II) Loading sub module "fbdevhw" [ 35.034] (II) LoadModule: "fbdevhw" [ 35.034] (II) Loading /usr/lib/xorg/modules/libfbdevhw.so [ 35.185] (II) Module fbdevhw: vendor="X.Org Foundation" [ 35.185] compiled for 1.10.4, module version = 0.0.2 [ 35.185] ABI class: X.Org Video Driver, version 10.0 [ 35.288] (II) RADEON(0): Creating default Display subsection in Screen section "Default Screen Section" for depth/fbbpp 24/32 [ 35.288] (==) RADEON(0): Depth 24, (--) framebuffer bpp 32 [ 35.288] (II) RADEON(0): Pixel depth = 24 bits stored in 4 bytes (32 bpp pixmaps) [ 35.288] (==) RADEON(0): Default visual is TrueColor [ 35.288] (==) RADEON(0): RGB weight 888 [ 35.288] (II) RADEON(0): Using 8 bits per RGB (8 bit DAC) [ 35.288] (--) RADEON(0): Chipset: "ATI Mobility Radeon HD 4200" (ChipID = 0x9712) [ 35.288] (II) RADEON(0): PCI card detected [ 35.288] drmOpenDevice: node name is /dev/dri/card0 [ 35.288] drmOpenDevice: open result is 9, (OK) [ 35.288] drmOpenByBusid: Searching for BusID pci:0000:01:05.0 [ 35.288] drmOpenDevice: node name is /dev/dri/card0 [ 35.288] drmOpenDevice: open result is 9, (OK) [ 35.288] drmOpenByBusid: drmOpenMinor returns 9 [ 35.288] drmOpenByBusid: drmGetBusid reports pci:0000:01:05.0 [ 35.288] (II) Loading sub module "exa" [ 35.288] (II) LoadModule: "exa" [ 35.288] (II) Loading /usr/lib/xorg/modules/libexa.so [ 35.335] (II) Module exa: vendor="X.Org Foundation" [ 35.335] compiled for 1.10.4, module version = 2.5.0 [ 35.335] ABI class: X.Org Video Driver, version 10.0 [ 35.335] (II) RADEON(0): KMS Color Tiling: disabled [ 35.335] (II) RADEON(0): KMS Pageflipping: enabled [ 35.335] (II) RADEON(0): SwapBuffers wait for vsync: enabled [ 35.360] (II) RADEON(0): Output VGA-0 has no monitor section [ 35.360] (II) RADEON(0): Output LVDS has no monitor section [ 35.364] (II) RADEON(0): Output HDMI-0 has no monitor section [ 35.388] (II) RADEON(0): EDID for output VGA-0 [ 35.388] (II) RADEON(0): EDID for output LVDS [ 35.388] (II) RADEON(0): Manufacturer: LGD Model: 2ac Serial#: 0 [ 35.388] (II) RADEON(0): Year: 2010 Week: 0 [ 35.388] (II) RADEON(0): EDID Version: 1.3 [ 35.388] (II) RADEON(0): Digital Display Input [ 35.388] (II) RADEON(0): Max Image Size [cm]: horiz.: 34 vert.: 19 [ 35.388] (II) RADEON(0): Gamma: 2.20 [ 35.388] (II) RADEON(0): 
No DPMS capabilities specified [ 35.388] (II) RADEON(0): Supported color encodings: RGB 4:4:4 YCrCb 4:4:4 [ 35.388] (II) RADEON(0): First detailed timing is preferred mode [ 35.388] (II) RADEON(0): redX: 0.616 redY: 0.371 greenX: 0.355 greenY: 0.606 [ 35.388] (II) RADEON(0): blueX: 0.152 blueY: 0.100 whiteX: 0.313 whiteY: 0.329 [ 35.388] (II) RADEON(0): Manufacturer's mask: 0 [ 35.388] (II) RADEON(0): Supported detailed timing: [ 35.388] (II) RADEON(0): clock: 69.3 MHz Image Size: 344 x 194 mm [ 35.388] (II) RADEON(0): h_active: 1366 h_sync: 1398 h_sync_end 1430 h_blank_end 1486 h_border: 0 [ 35.388] (II) RADEON(0): v_active: 768 v_sync: 770 v_sync_end 774 v_blanking: 782 v_border: 0 [ 35.388] (II) RADEON(0): LG Display [ 35.388] (II) RADEON(0): LP156WH2-TLQB [ 35.388] (II) RADEON(0): EDID (in hex): [ 35.388] (II) RADEON(0): 00ffffffffffff0030e4ac0200000000 [ 35.388] (II) RADEON(0): 00140103802213780ac1259d5f5b9b27 [ 35.388] (II) RADEON(0): 19505400000001010101010101010101 [ 35.388] (II) RADEON(0): 010101010101121b567850000e302020 [ 35.388] (II) RADEON(0): 240058c2100000190000000000000000 [ 35.388] (II) RADEON(0): 00000000000000000000000000fe004c [ 35.388] (II) RADEON(0): 4720446973706c61790a2020000000fe [ 35.388] (II) RADEON(0): 004c503135365748322d544c514200c1 [ 35.388] (II) RADEON(0): Printing probed modes for output LVDS [ 35.388] (II) RADEON(0): Modeline "1366x768"x59.6 69.30 1366 1398 1430 1486 768 770 774 782 -hsync -vsync (46.6 kHz) [ 35.388] (II) RADEON(0): Modeline "1280x720"x59.9 74.50 1280 1344 1472 1664 720 723 728 748 -hsync +vsync (44.8 kHz) [ 35.388] (II) RADEON(0): Modeline "1152x768"x59.8 71.75 1152 1216 1328 1504 768 771 781 798 -hsync +vsync (47.7 kHz) [ 35.388] (II) RADEON(0): Modeline "1024x768"x59.9 63.50 1024 1072 1176 1328 768 771 775 798 -hsync +vsync (47.8 kHz) [ 35.388] (II) RADEON(0): Modeline "800x600"x59.9 38.25 800 832 912 1024 600 603 607 624 -hsync +vsync (37.4 kHz) [ 35.388] (II) RADEON(0): Modeline "848x480"x59.7 31.50 848 872 952 1056 480 483 493 500 -hsync +vsync (29.8 kHz) [ 35.388] (II) RADEON(0): Modeline "720x480"x59.7 26.75 720 744 808 896 480 483 493 500 -hsync +vsync (29.9 kHz) [ 35.388] (II) RADEON(0): Modeline "640x480"x59.4 23.75 640 664 720 800 480 483 487 500 -hsync +vsync (29.7 kHz) [ 35.392] (II) RADEON(0): EDID for output HDMI-0 [ 35.392] (II) RADEON(0): Output VGA-0 disconnected [ 35.392] (II) RADEON(0): Output LVDS connected [ 35.392] (II) RADEON(0): Output HDMI-0 disconnected [ 35.392] (II) RADEON(0): Using exact sizes for initial modes [ 35.392] (II) RADEON(0): Output LVDS using initial mode 1366x768 [ 35.392] (II) RADEON(0): Using default gamma of (1.0, 1.0, 1.0) unless otherwise stated. 
[ 35.392] (II) RADEON(0): mem size init: gart size :1fdff000 vram size: s:10000000 visible:fba0000 [ 35.392] (II) RADEON(0): EXA: Driver will allow EXA pixmaps in VRAM [ 35.392] (==) RADEON(0): DPI set to (96, 96) [ 35.392] (II) Loading sub module "fb" [ 35.392] (II) LoadModule: "fb" [ 35.392] (II) Loading /usr/lib/xorg/modules/libfb.so [ 35.492] (II) Module fb: vendor="X.Org Foundation" [ 35.492] compiled for 1.10.4, module version = 1.0.0 [ 35.492] ABI class: X.Org ANSI C Emulation, version 0.4 [ 35.492] (II) Loading sub module "ramdac" [ 35.492] (II) LoadModule: "ramdac" [ 35.492] (II) Module "ramdac" already built-in [ 35.492] (II) UnloadModule: "vesa" [ 35.492] (II) Unloading vesa [ 35.492] (II) UnloadModule: "fbdev" [ 35.492] (II) Unloading fbdev [ 35.492] (II) UnloadModule: "fbdevhw" [ 35.492] (II) Unloading fbdevhw [ 35.492] (--) Depth 24 pixmap format is 32 bpp [ 35.492] (II) RADEON(0): [DRI2] Setup complete [ 35.492] (II) RADEON(0): [DRI2] DRI driver: r600 [ 35.492] (II) RADEON(0): Front buffer size: 4224K [ 35.492] (II) RADEON(0): VRAM usage limit set to 228096K [ 35.615] (==) RADEON(0): Backing store disabled [ 35.615] (II) RADEON(0): Direct rendering enabled [ 35.658] (II) RADEON(0): Setting EXA maxPitchBytes [ 35.658] (II) EXA(0): Driver allocated offscreen pixmaps [ 35.658] (II) EXA(0): Driver registered support for the following operations: [ 35.658] (II) Solid [ 35.658] (II) Copy [ 35.658] (II) Composite (RENDER acceleration) [ 35.658] (II) UploadToScreen [ 35.658] (II) DownloadFromScreen [ 35.687] (II) RADEON(0): Acceleration enabled [ 35.687] (==) RADEON(0): DPMS enabled [ 35.687] (==) RADEON(0): Silken mouse enabled [ 35.721] (II) RADEON(0): Set up textured video [ 35.721] (II) RADEON(0): RandR 1.2 enabled, ignore the following RandR disabled message. 
[ 35.721] (--) RandR disabled [ 35.721] (II) Initializing built-in extension Generic Event Extension [ 35.721] (II) Initializing built-in extension SHAPE [ 35.721] (II) Initializing built-in extension MIT-SHM [ 35.721] (II) Initializing built-in extension XInputExtension [ 35.721] (II) Initializing built-in extension XTEST [ 35.721] (II) Initializing built-in extension BIG-REQUESTS [ 35.721] (II) Initializing built-in extension SYNC [ 35.721] (II) Initializing built-in extension XKEYBOARD [ 35.721] (II) Initializing built-in extension XC-MISC [ 35.721] (II) Initializing built-in extension SECURITY [ 35.721] (II) Initializing built-in extension XINERAMA [ 35.721] (II) Initializing built-in extension XFIXES [ 35.721] (II) Initializing built-in extension RENDER [ 35.721] (II) Initializing built-in extension RANDR [ 35.721] (II) Initializing built-in extension COMPOSITE [ 35.721] (II) Initializing built-in extension DAMAGE [ 35.721] (II) SELinux: Disabled on system [ 35.982] (II) AIGLX: enabled GLX_MESA_copy_sub_buffer [ 35.982] (II) AIGLX: enabled GLX_INTEL_swap_event [ 35.982] (II) AIGLX: enabled GLX_SGI_swap_control and GLX_MESA_swap_control [ 35.982] (II) AIGLX: enabled GLX_SGI_make_current_read [ 35.982] (II) AIGLX: GLX_EXT_texture_from_pixmap backed by buffer objects [ 35.982] (II) AIGLX: Loaded and initialized /usr/lib/dri/r600_dri.so [ 35.982] (II) GLX: Initialized DRI2 GL provider for screen 0 [ 35.999] (II) RADEON(0): Setting screen physical size to 361 x 203 [ 43.896] (II) RADEON(0): EDID vendor "LGD", prod id 684 [ 43.896] (II) RADEON(0): Printing DDC gathered Modelines: [ 43.896] (II) RADEON(0): Modeline "1366x768"x0.0 69.30 1366 1398 1430 1486 768 770 774 782 -hsync -vsync (46.6 kHz) [ 43.924] (II) RADEON(0): EDID vendor "LGD", prod id 684 [ 43.924] (II) RADEON(0): Printing DDC gathered Modelines: [ 43.924] (II) RADEON(0): Modeline "1366x768"x0.0 69.30 1366 1398 1430 1486 768 770 774 782 -hsync -vsync (46.6 kHz) [ 43.988] (II) RADEON(0): EDID vendor "LGD", prod id 684 [ 43.988] (II) RADEON(0): Printing DDC gathered Modelines: [ 43.988] (II) RADEON(0): Modeline "1366x768"x0.0 69.30 1366 1398 1430 1486 768 770 774 782 -hsync -vsync (46.6 kHz) [ 67.375] (II) config/udev: Adding input device Logitech USB Optical Mouse (/dev/input/event1) [ 67.376] (**) Logitech USB Optical Mouse: Applying InputClass "evdev pointer catchall" [ 67.376] (II) LoadModule: "evdev" [ 67.376] (II) Loading /usr/lib/xorg/modules/input/evdev_drv.so [ 67.392] (II) Module evdev: vendor="X.Org Foundation" [ 67.392] compiled for 1.10.3, module version = 2.6.0 [ 67.392] Module class: X.Org XInput Driver [ 67.392] ABI class: X.Org XInput driver, version 12.2 [ 67.392] (II) Using input driver 'evdev' for 'Logitech USB Optical Mouse' [ 67.392] (II) Loading /usr/lib/xorg/modules/input/evdev_drv.so [ 67.392] (**) Logitech USB Optical Mouse: always reports core events [ 67.392] (**) Logitech USB Optical Mouse: Device: "/dev/input/event1" [ 67.392] (--) Logitech USB Optical Mouse: Found 12 mouse buttons [ 67.392] (--) Logitech USB Optical Mouse: Found scroll wheel(s) [ 67.392] (--) Logitech USB Optical Mouse: Found relative axes [ 67.392] (--) Logitech USB Optical Mouse: Found x and y relative axes [ 67.392] (II) Logitech USB Optical Mouse: Configuring as mouse [ 67.392] (II) Logitech USB Optical Mouse: Adding scrollwheel support [ 67.392] (**) Logitech USB Optical Mouse: YAxisMapping: buttons 4 and 5 [ 67.392] (**) Logitech USB Optical Mouse: EmulateWheelButton: 4, EmulateWheelInertia: 10, EmulateWheelTimeout: 200 [ 
67.392] (**) Option "config_info" "udev:/sys/devices/pci0000:00/0000:00:13.0/usb5/5-1/5-1:1.0/input/input14/event1" [ 67.392] (II) XINPUT: Adding extended input device "Logitech USB Optical Mouse" (type: MOUSE) [ 67.392] (II) Logitech USB Optical Mouse: initialized for relative axes. [ 67.392] (**) Logitech USB Optical Mouse: (accel) keeping acceleration scheme 1 [ 67.392] (**) Logitech USB Optical Mouse: (accel) acceleration profile 0 [ 67.392] (**) Logitech USB Optical Mouse: (accel) acceleration factor: 2.000 [ 67.392] (**) Logitech USB Optical Mouse: (accel) acceleration threshold: 4 [ 67.392] (II) config/udev: Adding input device Logitech USB Optical Mouse (/dev/input/mouse0) [ 67.392] (II) No input driver/identifier specified (ignoring) [ 78.692] (II) Logitech USB Optical Mouse: Close [ 78.692] (II) UnloadModule: "evdev" [ 78.692] (II) Unloading evdev


  • How John Got 15x Improvement Without Really Trying

    - by rchrd
    The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here.  How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. 
Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 
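To make the cache-order point concrete before the article's Fortran examples, here is a minimal C sketch of my own (not from the article; the array size is arbitrary) contrasting an inner loop that walks along rows, which is cache order for C, with one that walks down columns:

    /* Sketch only: same sum, two traversal orders.  For C's row-major
     * layout the first version is stride 1; the second jumps a whole row
     * between consecutive accesses. */
    #include <stddef.h>

    #define N 1024
    static double a[N][N];

    double sum_row_order(void)      /* cache order: roughly one miss per cache line */
    {
        double s = 0.0;
        for (size_t i = 0; i < N; i++)
            for (size_t j = 0; j < N; j++)
                s += a[i][j];
        return s;
    }

    double sum_column_order(void)   /* cross order: roughly one miss per access for large N */
    {
        double s = 0.0;
        for (size_t j = 0; j < N; j++)
            for (size_t i = 0; i < N; i++)
                s += a[i][j];
        return s;
    }

On a large enough array the second version touches a new cache line on almost every access, which is exactly the pattern the author fixes below by inverting loops (or, in Fortran, by making the innermost loop run over the first index).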
6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. 
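As a rough, self-contained illustration of the stride effect (mine, not the author's; it assumes the 64-byte lines and 8-byte doubles mentioned above, and the function name is invented):

    /* Sketch: summing every stride-th element of a large array.  With
     * stride 1 each cache miss delivers 8 useful doubles; by stride 8
     * or more, essentially every access misses. */
    #include <stddef.h>

    double strided_sum(const double *x, size_t n, size_t stride)
    {
        double s = 0.0;
        for (size_t i = 0; i < n; i += stride)
            s += x[i];
        return s;
    }

This is why the compressed-grid and gather-into-temporaries fixes described next recover most of the lost bandwidth: they turn large, irregular strides back into stride-1 sweeps.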
Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. 
In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. 
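A side note that is not in the article: C99's restrict qualifier is another way to make the same no-overlap promise to the compiler, so that invariant loads such as delta[y] may legally be kept in a register. A minimal sketch with invented names and simplified, flat array types:

    /* Sketch only: restrict asserts that out, scale and in never overlap,
     * so the compiler may hold scale[y] in a register across the inner
     * loop instead of reloading it after every store through out. */
    void axpy_rows(int n, int m,
                   double *restrict out,
                   const double *restrict scale,
                   const double *restrict in)
    {
        for (int y = 0; y < n; y++)
            for (int k = 0; k < m; k++)
                out[y * m + k] += scale[y] * in[y * m + k];
    }

How reliably compilers exploit restrict through several levels of indirection, as in dW[y][u][k], varies, which is why the hand hoist the author shows next remains the portable fix.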
In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. 
Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). 
Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. 
I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. 
These agencies should require graduate schools to offer a course in optimization and parallel programming as a condition of funding.

About the Author

John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory, where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company, where he was project manager for the MTA and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel programming, parallel programming languages, and application performance.


  • Updating the value of a math equation with YUI slider and simple radio buttons.

    - by dj lewis
    I have a form that is used to show a price for a product. I have a YUI slider setup that changes the price, and it works perfectly. Now I'm trying to add in radio buttons that also should update that same price value. The price displayed should take into account all 3 fields, and update dynamically as any are updated. This is the code I have, but I don't have any radio buttons for cpanelPrice yet as I'm still just trying to get the IPs to work. <script type="text/javascript"> (function() { var Event = YAHOO.util.Event, Dom = YAHOO.util.Dom, lang = YAHOO.lang, slider, bg="slider-bg", thumb="slider-thumb", orderlink="order-link", monthlyprice="monthly-price", dram="ram", stor="storage",dcpu="cpu",bandw="bandwidth",slid="sliderbg" // The slider can move 0 pixels up var topConstraint = 0; // The slider can move 200 pixels down var bottomConstraint = 585; // Custom scale factor for converting the pixel offset into a real value var scaleFactor = 1; // The amount the slider moves when the value is changed with the arrow // keys var keyIncrement = 65; var tickSize = 65; Event.onDOMReady(function() { slider = YAHOO.widget.Slider.getHorizSlider(bg, thumb, topConstraint, bottomConstraint, tickSize); slider.setValue(1, true); slider.animate = true; slider.getRealValue = function() { return Math.round(this.getValue() * scaleFactor); } slider.subscribe("change", function(offsetFromStart) { var ordnode = Dom.get(orderlink); var prinode = Dom.get(monthlyprice); var ramnode = Dom.get(dram); var stornode = Dom.get(stor); var cpunode = Dom.get(dcpu); var bwnode = Dom.get(bandw); var slidnode = Dom.get(slid); var actualValue = slider.getRealValue(); if (actualValue < 0) { var actualValue = 0; } if (actualValue > -1 && actualValue < 5) { basePrice = 15; var pid = "7"; var ram = "128 MB"; stornode.innerHTML = "5"; cpunode.innerHTML = ".5"; bwnode.innerHTML = "50"; slidnode.innerHTML = "<img src=\"/images/sliderbg1.png\" alt=\"\" />"; } else if (actualValue > 60 && actualValue < 70) { basePrice = 25; var pid = "8"; var ram = "256 MB"; stornode.innerHTML = "10"; cpunode.innerHTML = ".5"; bwnode.innerHTML = "100"; slidnode.innerHTML = "<img src=\"/images/sliderbg2.png\" alt=\"\" />"; } else if (actualValue > 125 && actualValue < 135) { basePrice = 40; var pid = "9"; var ram = "512 MB"; stornode.innerHTML = "20"; cpunode.innerHTML = "1"; bwnode.innerHTML = "200"; slidnode.innerHTML = "<img src=\"/images/sliderbg3.png\" alt=\"\" />"; } else if (actualValue > 190 && actualValue < 200) { basePrice = 60; var pid = "10"; var ram = "1 GB"; stornode.innerHTML = "40"; cpunode.innerHTML = "1"; bwnode.innerHTML = "400"; slidnode.innerHTML = "<img src=\"/images/sliderbg4.png\" alt=\"\" />"; } else if (actualValue> 255 && actualValue < 265) { basePrice = 80; var pid = "11"; var ram = "1.5 GB"; stornode.innerHTML = "60"; cpunode.innerHTML = "1"; bwnode.innerHTML = "600"; slidnode.innerHTML = "<img src=\"/images/sliderbg5.png\" alt=\"\" />"; } else if (actualValue > 320 && actualValue < 330) { basePrice = 110; var pid = "12"; var ram = "2 GB"; stornode.innerHTML = "80"; cpunode.innerHTML = "2"; bwnode.innerHTML = "800"; slidnode.innerHTML = "<img src=\"/images/sliderbg6.png\" alt=\"\" />"; } else if (actualValue > 385 && actualValue < 395) { basePrice = 140; var pid = "13"; var ram = "2.5 GB"; stornode.innerHTML = "100"; cpunode.innerHTML = "2"; bwnode.innerHTML = "1000"; slidnode.innerHTML = "<img src=\"/images/sliderbg7.png\" alt=\"\" />"; } else if (actualValue > 450 && actualValue < 460) { basePrice = 170; var pid = 
"14"; var ram = "3 GB"; stornode.innerHTML = "120"; cpunode.innerHTML = "3"; bwnode.innerHTML = "1200"; slidnode.innerHTML = "<img src=\"/images/sliderbg8.png\" alt=\"\" />"; } else if (actualValue > 515 && actualValue < 525) { basePrice = 200; var pid = "15"; var ram = "3.5 GB"; stornode.innerHTML = "140"; cpunode.innerHTML = "3"; bwnode.innerHTML = "1400"; slidnode.innerHTML = "<img src=\"/images/sliderbg9.png\" alt=\"\" />"; } else if (actualValue > 580 && actualValue < 590) { basePrice = 240; var pid = "16"; var ram = "4 GB"; stornode.innerHTML = "160"; cpunode.innerHTML = "4"; bwnode.innerHTML = "1600"; slidnode.innerHTML = "<img src=\"/images/sliderbg10.png\" alt=\"\" />"; } // Setup the order link ordnode.innerHTML = "<a href=\"https://account.hostingbeast.com/cart.php?a=add&pid=" + pid + "\"><img src=\"/images/blank.gif\" alt=\"Order VPS Hosting\" height=\"100\" width=\"100\" /></a>"; ramnode.innerHTML = ram; ipPrice = 0; function setIpPrice(ips) { ipPrice = ips.value; } cpanelPrice = 0; prinode.innerHTML = basePrice + ipPrice + cpanelPrice; }); // Use setValue to reset the value to white: Event.on("putval", "click", function(e) { slider.setValue(100, false); //false here means to animate if possible }); setTimeout(function () { slider.setValue(10); },0); }); })(); </script> <div style="width: 649px; margin:auto"> <span id="sliderbg"></span> <div class="yui-skin-sam"> <div id="slider-bg" class="yui-h-slider" tabindex="-1"> <div id="slider-thumb" class="yui-slider-thumb"><img src="/images/thumb-bar.png"></div> </div> </div> </div> <div class="vpsdetails"> <div id="vpsprod"><span id="cpu"></span></div> <div id="vpsram"><span id="ram"></span></div> <div id="vpsstor"><span id="storage"></span> GB</div> <div id="vpsbw"><span id="bandwidth"></span> GB</div> <div id="slideprice">$ <span id="monthly-price"></span></div> </div> <input type="radio" name="ips" value="2" onclick="setIpPrice(this.value - 2 * 2);" checked="checked" /> 2 <input type="radio" name="ips" value="4" onclick="setIpPrice(this.value - 2 * 2);" /> 4


  • Optimized OCR black/white pixel algorithm

    - by eagle
    I am writing a simple OCR solution for a finite set of characters. That is, I know the exact way all 26 letters in the alphabet will look like. I am using C# and am able to easily determine if a given pixel should be treated as black or white. I am generating a matrix of black/white pixels for every single character. So for example, the letter I (capital i), might look like the following: 01110 00100 00100 00100 01110 Note: all points, which I use later in this post, assume that the top left pixel is (0, 0), bottom right pixel is (4, 4). 1's represent black pixels, and 0's represent white pixels. I would create a corresponding matrix in C# like this: CreateLetter("I", new List<List<bool>>() { new List<bool>() { false, true, true, true, false }, new List<bool>() { false, false, true, false, false }, new List<bool>() { false, false, true, false, false }, new List<bool>() { false, false, true, false, false }, new List<bool>() { false, true, true, true, false } }); I know I could probably optimize this part by using a multi-dimensional array instead, but let's ignore that for now, this is for illustrative purposes. Every letter is exactly the same dimensions, 10px by 11px (10px by 11px is the actual dimensions of a character in my real program. I simplified this to 5px by 5px in this posting since it is much easier to "draw" the letters using 0's and 1's on a smaller image). Now when I give it a 10px by 11px part of an image to analyze with OCR, it would need to run on every single letter (26) on every single pixel (10 * 11 = 110) which would mean 2,860 (26 * 110) iterations (in the worst case) for every single character. I was thinking this could be optimized by defining the unique characteristics of every character. So, for example, let's assume that the set of characters only consists of 5 distinct letters: I, A, O, B, and L. These might look like the following: 01110 00100 00100 01100 01000 00100 01010 01010 01010 01000 00100 01110 01010 01100 01000 00100 01010 01010 01010 01000 01110 01010 00100 01100 01110 After analyzing the unique characteristics of every character, I can significantly reduce the number of tests that need to be performed to test for a character. For example, for the "I" character, I could define it's unique characteristics as having a black pixel in the coordinate (3, 0) since no other characters have that pixel as black. So instead of testing 110 pixels for a match on the "I" character, I reduced it to a 1 pixel test. This is what it might look like for all these characters: var LetterI = new OcrLetter() { Name = "I", BlackPixels = new List<Point>() { new Point (3, 0) } } var LetterA = new OcrLetter() { Name = "A", WhitePixels = new List<Point>() { new Point(2, 4) } } var LetterO = new OcrLetter() { Name = "O", BlackPixels = new List<Point>() { new Point(3, 2) }, WhitePixels = new List<Point>() { new Point(2, 2) } } var LetterB = new OcrLetter() { Name = "B", BlackPixels = new List<Point>() { new Point(3, 1) }, WhitePixels = new List<Point>() { new Point(3, 2) } } var LetterL = new OcrLetter() { Name = "L", BlackPixels = new List<Point>() { new Point(1, 1), new Point(3, 4) }, WhitePixels = new List<Point>() { new Point(2, 2) } } This is challenging to do manually for 5 characters and gets much harder the greater the amount of letters that are added. You also want to guarantee that you have the minimum set of unique characteristics of a letter since you want it to be optimized as much as possible. 
I want to create an algorithm that will identify the unique characteristics of all the letters and would generate similar code to that above. I would then use this optimized black/white matrix to identify characters. How do I take the 26 letters that have all their black/white pixels filled in (e.g. the CreateLetter code block) and convert them to an optimized set of unique characteristics that define a letter (e.g. the new OcrLetter() code block)? And how would I guarantee that it is the most efficient definition set of unique characteristics (e.g. instead of defining 6 points as the unique characteristics, there might be a way to do it with 1 or 2 points, as the letter "I" in my example was able to). An alternative solution I've come up with is using a hash table, which will reduce it from 2,860 iterations to 110 iterations, a 26 time reduction. This is how it might work: I would populate it with data similar to the following: Letters["01110 00100 00100 00100 01110"] = "I"; Letters["00100 01010 01110 01010 01010"] = "A"; Letters["00100 01010 01010 01010 00100"] = "O"; Letters["01100 01010 01100 01010 01100"] = "B"; Now when I reach a location in the image to process, I convert it to a string such as: "01110 00100 00100 00100 01110" and simply find it in the hash table. This solution seems very simple, however, this still requires 110 iterations to generate this string for each letter. In big O notation, the algorithm is the same since O(110N) = O(2860N) = O(N) for N letters to process on the page. However, it is still improved by a constant factor of 26, a significant improvement (e.g. instead of it taking 26 minutes, it would take 1 minute). Update: Most of the solutions provided so far have not addressed the issue of identifying the unique characteristics of a character and rather provide alternative solutions. I am still looking for this solution which, as far as I can tell, is the only way to achieve the fastest OCR processing. I just came up with a partial solution: For each pixel, in the grid, store the letters that have it as a black pixel. Using these letters: I A O B L 01110 00100 00100 01100 01000 00100 01010 01010 01010 01000 00100 01110 01010 01100 01000 00100 01010 01010 01010 01000 01110 01010 00100 01100 01110 You would have something like this: CreatePixel(new Point(0, 0), new List<Char>() { }); CreatePixel(new Point(1, 0), new List<Char>() { 'I', 'B', 'L' }); CreatePixel(new Point(2, 0), new List<Char>() { 'I', 'A', 'O', 'B' }); CreatePixel(new Point(3, 0), new List<Char>() { 'I' }); CreatePixel(new Point(4, 0), new List<Char>() { }); CreatePixel(new Point(0, 1), new List<Char>() { }); CreatePixel(new Point(1, 1), new List<Char>() { 'A', 'B', 'L' }); CreatePixel(new Point(2, 1), new List<Char>() { 'I' }); CreatePixel(new Point(3, 1), new List<Char>() { 'A', 'O', 'B' }); // ... CreatePixel(new Point(2, 2), new List<Char>() { 'I', 'A', 'B' }); CreatePixel(new Point(3, 2), new List<Char>() { 'A', 'O' }); // ... CreatePixel(new Point(2, 4), new List<Char>() { 'I', 'O', 'B', 'L' }); CreatePixel(new Point(3, 4), new List<Char>() { 'I', 'A', 'L' }); CreatePixel(new Point(4, 4), new List<Char>() { }); Now for every letter, in order to find the unique characteristics, you need to look at which buckets it belongs to, as well as the amount of other characters in the bucket. So let's take the example of "I". We go to all the buckets it belongs to (1,0; 2,0; 3,0; ...; 3,4) and see that the one with the least amount of other characters is (3,0). 
In fact, it only has 1 character, meaning it must be an "I" in this case, and we found our unique characteristic. You can also do the same for pixels that would be white. Notice that bucket (2,0) contains all the letters except for "L", this means that it could be used as a white pixel test. Similarly, (2,4) doesn't contain an 'A'. Buckets that either contain all the letters or none of the letters can be discarded immediately, since these pixels can't help define a unique characteristic (e.g. 1,1; 4,0; 0,1; 4,4). It gets trickier when you don't have a 1 pixel test for a letter, for example in the case of 'O' and 'B'. Let's walk through the test for 'O'... It's contained in the following buckets: // Bucket Count Letters // 2,0 4 I, A, O, B // 3,1 3 A, O, B // 3,2 2 A, O // 2,4 4 I, O, B, L Additionally, we also have a few white pixel tests that can help: (I only listed those that are missing at most 2). The Missing Count was calculated as (5 - Bucket.Count). // Bucket Missing Count Missing Letters // 1,0 2 A, O // 1,1 2 I, O // 2,2 2 O, L // 3,4 2 O, B So now we can take the shortest black pixel bucket (3,2) and see that when we test for (3,2) we know it is either an 'A' or an 'O'. So we need an easy way to tell the difference between an 'A' and an 'O'. We could either look for a black pixel bucket that contains 'O' but not 'A' (e.g. 2,4) or a white pixel bucket that contains an 'O' but not an 'A' (e.g. 1,1). Either of these could be used in combination with the (3,2) pixel to uniquely identify the letter 'O' with only 2 tests. This seems like a simple algorithm when there are 5 characters, but how would I do this when there are 26 letters and a lot more pixels overlapping? For example, let's say that after the (3,2) pixel test, it found 10 different characters that contain the pixel (and this was the least from all the buckets). Now I need to find differences from 9 other characters instead of only 1 other character. How would I achieve my goal of getting the least amount of checks as possible, and ensure that I am not running extraneous tests?
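One way to mechanize the bucket idea described above is a greedy search: for each letter, repeatedly pick the pixel test (black or white) that eliminates the most remaining candidates, and stop when only that letter survives. Below is a rough C sketch of that heuristic (C rather than the question's C#, with invented names, and with no guarantee that the chosen set is the true minimum); the 5x5 letters are the ones from the example above:

    /* Greedy sketch: letters are 5x5 bitmaps packed into 32-bit masks,
     * bit (y*5 + x) set for a black pixel.  For each target letter we
     * keep choosing the single-pixel test that rules out the most
     * remaining candidates until the target is uniquely identified. */
    #include <stdio.h>
    #include <stdint.h>

    #define W 5
    #define H 5
    #define NLETTERS 5

    static const char *names[NLETTERS] = { "I", "A", "O", "B", "L" };

    static uint32_t pack(const char rows[H][W + 1])
    {
        uint32_t bm = 0;
        for (int y = 0; y < H; y++)
            for (int x = 0; x < W; x++)
                if (rows[y][x] == '1') bm |= 1u << (y * W + x);
        return bm;
    }

    static int is_black(uint32_t bm, int x, int y) { return (bm >> (y * W + x)) & 1; }

    int main(void)
    {
        static const char grids[NLETTERS][H][W + 1] = {
            { "01110", "00100", "00100", "00100", "01110" },   /* I */
            { "00100", "01010", "01110", "01010", "01010" },   /* A */
            { "00100", "01010", "01010", "01010", "00100" },   /* O */
            { "01100", "01010", "01100", "01010", "01100" },   /* B */
            { "01000", "01000", "01000", "01000", "01110" },   /* L */
        };
        uint32_t bitmap[NLETTERS];
        for (int i = 0; i < NLETTERS; i++) bitmap[i] = pack(grids[i]);

        for (int target = 0; target < NLETTERS; target++) {
            int alive[NLETTERS], nalive = 0;        /* candidates not yet ruled out */
            for (int i = 0; i < NLETTERS; i++)
                if (i != target) alive[nalive++] = i;

            printf("%s:", names[target]);
            while (nalive > 0) {
                int bx = 0, by = 0, bcolour = 0, bkill = -1;
                for (int y = 0; y < H; y++)
                    for (int x = 0; x < W; x++) {
                        int colour = is_black(bitmap[target], x, y);
                        int kill = 0;               /* candidates this test would eliminate */
                        for (int k = 0; k < nalive; k++)
                            if (is_black(bitmap[alive[k]], x, y) != colour) kill++;
                        if (kill > bkill) { bkill = kill; bx = x; by = y; bcolour = colour; }
                    }
                printf("  %s(%d,%d)", bcolour ? "black" : "white", bx, by);
                int n = 0;                          /* keep only candidates that pass the test */
                for (int k = 0; k < nalive; k++)
                    if (is_black(bitmap[alive[k]], bx, by) == bcolour) alive[n++] = alive[k];
                nalive = n;
            }
            printf("\n");
        }
        return 0;
    }

Because the bitmaps are distinct, every round eliminates at least one candidate, so the loop terminates. The same approach extends to 26 letters; a 10x11 glyph needs 110 bits, for example a pair of uint64_t masks. Greedy selection is not guaranteed to be minimal, and finding the smallest distinguishing set is essentially a set-cover/decision-tree problem, so a heuristic like this is the usual practical answer.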


  • Referencing movie clips from within an actionscript class

    - by Ant
    Hi all, I have been given the task of adding a scoring system to various flash games. This simply involves taking input, adding functionality such as pausing and replaying and then outputting the score, time left etc. at the end. I've so far successfully edited two games. Both these games used the "actions" code on frames. The latest game I'm trying to do uses an actionscript class which makes it both easier and harder. I'm not very adept at flash at all, but I've worked it out so far. I've added various movie clips that are to be used for displaying the pause screen background, buttons for replaying etc. I've been showing and hiding these using: back._visible = true; //movie clip, instance of back (back.png) I doubt it's best practice, but it's quick and has been working. However, now with the change of coding style to classes, this doesn't seem to work. I kinda understand why, but I'm now unsure how to hide/show these elements. Any help would be greatly appreciated :) I've attached the modified AS. class RivalOrbs extends MovieClip { var infinite_levels, orbs_start, orbs_inc, orbs_per_level, show_timer, _parent, one_time_per_level, speed_start, speed_inc_percent, max_speed, percent_starting_on_wrong_side, colorize, colors, secs_per_level; function RivalOrbs() { super(); mc = this; this.init(); } // End of the function function get_num_orbs() { if (infinite_levels) { return (orbs_start + (level - 1) * orbs_inc); } else if (level > orbs_per_level.length) { return (0); } else { return (orbs_per_level[level - 1]); } // end else if } // End of the function function get_timer_str(secs) { var _loc2 = Math.floor(secs / 60); var _loc1 = secs % 60; return ((_loc2 > 0 ? (_loc2) : ("0")) + ":" + (_loc1 >= 10 ? (_loc1) : ("0" + _loc1))); } // End of the function function frame() { //PLACE PAUSE CODE HERE if (!Key.isDown(80) and !Key.isDown(Key.ESCAPE)) { _root.offKey = true; } else if (Key.isDown(80) or Key.isDown(Key.ESCAPE)) { if (_root.offKey and _root.game_mode == "play") { _root.game_mode = "pause"; /* back._visible = true; btn_resume._visible = true; btn_exit._visible = true; txt_pause._visible = true; */ } else if (_root.offKey and _root.game_mode == "pause") { _root.game_mode = "play"; } _root.offKey = false; } if (_root.game_mode == "pause" or paused) { return; } else { /* back._visible = false; btn_resume._visible = false; btn_exit._visible = false; txt_pause._visible = false; */ } if (show_timer && total_secs != -1 || show_timer && _parent.timesup) { _loc7 = total_secs - Math.ceil((getTimer() - timer) / 1000); var diff = oldSeconds - (_loc7 + additional); if (diff > 1) additional = additional + diff; _loc7 = _loc7 + additional; oldSeconds = _loc7; trace(oldSeconds); mc.timer_field.text = this.get_timer_str(Math.max(0, _loc7)); if (_loc7 <= -1 || _parent.timesup) { if (one_time_per_level) { _root.gotoAndPlay("Lose"); } else { this.show_dialog(false); return; } // end if } // end if } // end else if var _loc9 = _root._xmouse; var _loc8 = _root._ymouse; var _loc6 = {x: _loc9, y: _loc8}; mc.globalToLocal(_loc6); _loc6.y = Math.max(-mc.bg._height / 2 + gap / 2, _loc6.y); _loc6.y = Math.min(mc.bg._height / 2 - gap / 2, _loc6.y); mc.wall1._y = _loc6.y - gap / 2 - mc.wall1._height / 2; mc.wall2._y = _loc6.y + gap / 2 + mc.wall1._height / 2; var _loc5 = true; for (var _loc4 = 0; _loc4 < this.get_num_orbs(); ++_loc4) { var _loc3 = mc.stage["orb" + _loc4]; _loc3.x_last = _loc3._x; _loc3.y_last = _loc3._y; _loc3._x = _loc3._x + _loc3.x_speed; _loc3._y = _loc3._y + _loc3.y_speed; if (_loc3._x < 
l_thresh) {
    _loc3.x_speed = _loc3.x_speed * -1;
    _loc3._x = l_thresh + (l_thresh - _loc3._x);
    _loc3.gotoAndPlay("hit");
}
if (_loc3._x > r_thresh) {
    _loc3.x_speed = _loc3.x_speed * -1;
    _loc3._x = r_thresh - (_loc3._x - r_thresh);
    _loc3.gotoAndPlay("hit");
}
if (_loc3._y < t_thresh) {
    _loc3.y_speed = _loc3.y_speed * -1;
    _loc3._y = t_thresh + (t_thresh - _loc3._y);
    _loc3.gotoAndPlay("hit");
}
if (_loc3._y > b_thresh) {
    _loc3.y_speed = _loc3.y_speed * -1;
    _loc3._y = b_thresh - (_loc3._y - b_thresh);
    _loc3.gotoAndPlay("hit");
}
// Bounce off the center walls, depending on the direction of travel.
if (_loc3.x_speed > 0) {
    if (_loc3._x >= m1_thresh && _loc3.x_last < m1_thresh || _loc3._x >= m1_thresh && _loc3._x <= m2_thresh) {
        if (_loc3._y <= mc.wall1._y + mc.wall1._height / 2 || _loc3._y >= mc.wall2._y - mc.wall2._height / 2) {
            _loc3.x_speed = _loc3.x_speed * -1;
            _loc3._x = m1_thresh - (_loc3._x - m1_thresh);
            _loc3.gotoAndPlay("hit");
        }
    }
} else if (_loc3._x <= m2_thresh && _loc3.x_last > m2_thresh || _loc3._x >= m1_thresh && _loc3._x <= m2_thresh) {
    if (_loc3._y <= mc.wall1._y + mc.wall1._height / 2 || _loc3._y >= mc.wall2._y - mc.wall2._height / 2) {
        _loc3.x_speed = _loc3.x_speed * -1;
        _loc3._x = m2_thresh + (m2_thresh - _loc3._x);
        _loc3.gotoAndPlay("hit");
    }
}
if (_loc3.side == 1 && _loc3._x > 0) {
    _loc5 = false;
}
if (_loc3.side == 2 && _loc3._x < 0) {
    _loc5 = false;
}
} // end of for
if (_loc5) {
    this.end_level();
}
} // End of the function

function colorize_hex(mc, hex) {
    var _loc4 = hex >> 16;                    // red component
    var _loc5 = (hex ^ hex >> 16 << 16) >> 8; // green component
    var _loc3 = hex >> 8 << 8 ^ hex;          // blue component
    var _loc2 = new flash.geom.ColorTransform(0, 0, 0, 1, _loc4, _loc5, _loc3, 0);
    mc.transform.colorTransform = _loc2;
} // End of the function

function tint_hex(mc, hex, amount) {
    var _loc4 = hex >> 16;
    var _loc5 = hex >> 8 & 255;
    var _loc3 = hex & 255;
    this.tint(mc, _loc4, _loc5, _loc3, amount);
} // End of the function

function tint(mc, r, g, b, amount) {
    // ra/ga/ba keep a percentage of the original color; rb/gb/bb add the tint color.
    var _loc4 = 100 - amount;
    var _loc1 = new Object();
    _loc1.ra = _loc1.ga = _loc1.ba = _loc4;
    var _loc2 = amount / 100;
    _loc1.rb = r * _loc2;
    _loc1.gb = g * _loc2;
    _loc1.bb = b * _loc2;
    var _loc3 = new Color(mc);
    _loc3.setTransform(_loc1);
} // End of the function

function get_num_levels() {
    if (infinite_levels) {
        return (Number.MAX_VALUE);
    } else {
        return (orbs_per_level.length);
    }
} // End of the function

function end_level() {
    _global.inputTimeAvailable = _global.inputTimeAvailable - (60 - oldSeconds);
    ++level;
    _parent.levelOver = true;
    if (level <= this.get_num_levels()) {
        this.show_dialog(true);
    } else {
        _root.gotoAndPlay("Win");
    }
} // End of the function

function get_speed() {
    var _loc3 = speed_start;
    for (var _loc2 = 0; _loc2 < level - 1; ++_loc2) {
        _loc3 = _loc3 + _loc3 * (speed_inc_percent / 100);
    }
    return (Math.min(_loc3, Math.max(max_speed, speed_start)));
} // End of the function

function init_orbs() {
    var _loc6 = this.get_speed();
    var _loc7 = Math.max(1, Math.ceil(this.get_num_orbs() * (percent_starting_on_wrong_side / 100)));
    for (var _loc3 = 0; _loc3 < this.get_num_orbs(); ++_loc3) {
        var _loc2 = null;
        if (_loc3 % 2 == 0) {
            _loc2 = mc.stage.attachMovie("Orb1", "orb" + _loc3, _loc3);
            _loc2.side = 1;
            if (colorize && color1 != -1) {
                this.colorize_hex(_loc2.orb.bg, color1);
            }
            _loc2._x = Math.random() * (mc.bg._width * 0.4) - mc.bg._width * 0.2 - mc.bg._width / 4;
        } else {
            _loc2 = mc.stage.attachMovie("Orb2", "orb" + _loc3, _loc3);
            _loc2.side = 2;
            if (colorize && color2 != -1) {
                this.colorize_hex(_loc2.orb.bg, color2);
            }
            _loc2._x = Math.random() * (mc.bg._width * 0.4) - mc.bg._width * 0.2 + mc.bg._width / 4;
        }
        _loc2._width = _loc2._height = orb_w;
        _loc2._y = Math.random() * (mc.bg._height * 0.8) - mc.bg._height * 0.4;
        if (_loc3 < _loc7) {
            _loc2._x = _loc2._x * -1;
        }
        var _loc5 = Math.random() * 60;
        var _loc4 = _loc5 / 180 * 3.141593;
        _loc2.x_speed = Math.cos(_loc4) * _loc6;
        _loc2.y_speed = Math.sin(_loc4) * _loc6;
        if (Math.random() >= 0.5) {
            _loc2.x_speed = _loc2.x_speed * -1;
        }
        if (Math.random() >= 0.5) {
            _loc2.y_speed = _loc2.y_speed * -1;
        }
    }
} // End of the function

function init_colors() {
    if (colorize && colors.length >= 2) {
        color1 = colors[Math.floor(Math.random() * colors.length)];
        // Re-roll until the second color differs from the first.
        for (color2 = colors[Math.floor(Math.random() * colors.length)]; color2 == color1; color2 = colors[Math.floor(Math.random() * colors.length)]) {
        }
        this.tint_hex(mc.side1, color1, 40);
        this.tint_hex(mc.side2, color2, 40);
    } else {
        color1 = -1;
        color2 = -1;
    }
} // End of the function

function get_total_secs() {
    if (show_timer) {
        if (secs_per_level.length > 0) {
            if (level > secs_per_level.length) {
                return (secs_per_level[secs_per_level.length - 1]);
            } else {
                return (secs_per_level[level - 1]);
            }
        }
    }
    return (-1);
} // End of the function

function start_level() {
    trace("start_level");
    _parent.timesup = false;
    _parent.levelOver = false;
    _parent.times_up_comp.start_timer();
    this.init_orbs();
    mc.level_field.text = "LEVEL " + level;
    total_secs = _global.inputTimeAvailable;
    if (total_secs > 60) total_secs = 60;
    timer = getTimer();
    paused = false;
    mc.dialog.gotoAndPlay("off");
} // End of the function

function clear_orbs() {
    for (var _loc2 = 0; mc.stage["orb" + _loc2]; ++_loc2) {
        mc.stage["orb" + _loc2].removeMovieClip();
    }
} // End of the function

function show_dialog(new_level) {
    mc.back._visible = false;
    trace("yes");
    paused = true;
    if (new_level) {
        this.init_colors();
    }
    this.clear_orbs();
    mc.dialog.gotoAndPlay("level");
    if (!new_level || _parent.timesup) {
        mc.dialog.level_top.text = "Time's Up!";
        /*
        dyn_line1.text = "Goodbye " + _global.inputName + "!";
        dyn_line2.text = "You scored " + score;
        // buttons
        if (_global.inputTimeAvailable > 60) btn_replay._visible = true;
        btn_resume._visible = false;
        btn_exit._visible = false;
        txt_pause._visible = false;
        sendInfo = new LoadVars();
        sendLoader = new LoadVars();
        sendInfo.game_name = 'rival_orbs';
        sendInfo.timeavailable = _global.inputTimeAvailable;
        if (sendInfo.timeavailable < 0) sendInfo.timeavailable = 0;
        sendInfo.id = _global.inputId;
        sendInfo.score = level * _global.inputFactor;
        sendInfo.directive = 'record';
        //sendInfo.sendAndLoad('ncc1701e.aspx', sendLoader, "GET");
        sendInfo.sendAndLoad('http://keyload.co.uk/output.php', sendLoader, "POST");
        */
    } else if (level > 1) {
        mc.dialog.level_top.text = "Next Level:";
    } else {
        mc.dialog.level_top.text = "";
    }
    mc.dialog.level_num.text = "LEVEL " + level;
    mc.dialog.level_mid.text = "Number of Orbs: " + this.get_num_orbs();
    _root.max_level = level;
    var _this = this;
    mc.dialog.btn.onRelease = function () {
        _this.start_level();
    };
} // End of the function

function init() {
    var getInfo = new LoadVars();
    var getLoader = new LoadVars();
    getInfo.directive = "read";
    getInfo.sendAndLoad('http://keyload.co.uk/input.php', getLoader, "GET");
    getLoader.onLoad = function (success) {
        if (success) {
            _global.inputId = this.id;
            _global.inputTimeAvailable = this.timeavailable;
            _global.inputFactor = this.factor;
            _global.inputName = this.name;
        } else {
            trace("Failed");
        }
    };
    _root.game_mode = "play";
    /*
    back._visible = false;
    btn_exit._visible = false;
    btn_replay._visible = false;
    btn_resume._visible = false;
    txt_pause._visible = false;
    */
    // Playfield boundaries and center-wall thresholds, measured from the orb's center.
    l_thresh = -mc.bg._width / 2 + orb_w / 2;
    t_thresh = -mc.bg._height / 2 + orb_w / 2;
    r_thresh = mc.bg._width / 2 - orb_w / 2;
    b_thresh = mc.bg._height / 2 - orb_w / 2;
    m1_thresh = -wall_w / 2 - orb_w / 2;
    m2_thresh = wall_w / 2 + orb_w / 2;
    this.show_dialog(true);
    mc.onEnterFrame = frame;
} // End of the function

var mc = null;
var orb_w = 15;
var wall_w = 2;
var l_thresh = 0;
var r_thresh = 0;
var t_thresh = 0;
var b_thresh = 0;
var m1_thresh = 0;
var m2_thresh = 0;
var color1 = -1;
var color2 = -1;
var level = 1;
var total_secs = 30;
var gap = 60;
var timer = 0;
var additional = 0;
var oldSeconds = 0;
var paused = true;
var _loc7 = 0;
} // End of Class


  • Is there a Telecommunications Reference Architecture?

    - by raul.goycoolea
Abstract

Reference architecture provides architectural information that can be supplied in advance to an enterprise to enable consistent architectural best practices. Enterprise Reference Architecture helps business owners to actualize their strategies, vision, objectives, and principles. It evaluates the IT systems based on Reference Architecture goals, principles, and standards, and it helps to reduce IT costs by increasing functionality, availability, scalability, etc. Telecom Reference Architecture provides customers with the flexibility to view bundled service bills online with the provision of multiple services. It provides real-time, flexible billing and charging systems to handle complex promotions, discounts, and settlements with multiple parties. This paper attempts to describe the Reference Architecture for Telecom Enterprises. It lays the foundation for a Telecom Reference Architecture by articulating the requirements, drivers, and pitfalls for telecom service providers. It describes a generic reference architecture for telecom enterprises and moves on to explain how to achieve an Enterprise Reference Architecture by using SOA.

Introduction

A Reference Architecture provides a methodology, a set of practices, templates, and standards based on a set of successful solutions implemented earlier. These solutions have been generalized and structured for the depiction of both a logical and a physical architecture, based on the harvesting of a set of patterns that describe observations in a number of successful implementations. It serves as a reference for the various architectures that an enterprise can implement to solve various problems. It can be used as the starting point or the point of comparison for the various departments/business entities of a company, or for the various companies within an enterprise. It provides multiple views for multiple stakeholders.

Major artifacts of the Enterprise Reference Architecture are methodologies, standards, metadata, documents, design patterns, etc.

Purpose of Reference Architecture

In most cases, architects spend a lot of time researching, investigating, defining, and re-arguing architectural decisions. It is like reinventing the wheel, as their peers in other organizations or even in the same organization have already spent a lot of time and effort defining their own architectural practices. This prevents an organization from learning from its own experiences and applying that knowledge for increased effectiveness.
Reference architecture provides missing architectural information that can be supplied in advance to project team members to enable consistent architectural best practices.

Enterprise Reference Architecture helps an enterprise to achieve the following at the abstract level:

·       Reference architecture is more of a communication channel to the enterprise.
·       Helps the business owners to actualize their strategies, vision, objectives, and principles.
·       Evaluates the IT systems based on Reference Architecture principles.
·       Reduces IT spending through increased functionality, availability, scalability, etc.
·       A real-time integration model helps to reduce the latency of data updates.
·       Is used to define a single source of information.
·       Provides a clear view on how to manage information and security.
·       Defines the policy around data ownership, product boundaries, etc.
·       Helps with cost optimization across project and solution portfolios by eliminating unused or duplicate investments and assets.
·       Shortens implementation time and cost.

Once the reference architecture is in place, the set of architectural principles, standards, reference models, and best practices ensures that the aligned investments have the greatest possible likelihood of success in both the near term and the long term (TCO).

Common pitfalls for Telecom Service Providers

Telecom Reference Architecture serves as the first step towards maturity for a telecom service provider. During the course of our assignments/experiences with telecom players, we have come across the following observations; some of these indicate a lack of maturity of the telecom service provider:

·       In markets that are growing and not so mature, telcos have a significant amount of in-house or home-grown applications. In some of these markets, the growth has been so rapid that IT has been unable to cope with business demands, and telcos have shown a tendency to come up with workarounds in their IT applications so as to meet business needs.
·       Even for core functions like provisioning or mediation, some telcos have tried to manage with home-grown applications.
·       Most of the applications do not have the required scalability or maintainability to sustain growth in volumes or functionality.
·       Applications face interoperability issues with other applications in the operator's landscape. Integrating a new application or network element requires considerable effort on the part of the other applications.
·       Application boundaries are not clear, and functionality that is not in the initial scope of an application gets pushed onto it. This results in the development of multiple small applications without proper boundaries.
·       Usage of legacy OSS/BSS systems and poor integration across multiple COTS products and internal systems; most of the integrations are developed on an ad-hoc, point-to-point basis.
·       Redundancy of business functions in different applications.
·       Fragmented data across the different applications and no integrated view of the strategic data.
·       Many performance issues due to complex integration across OSS and BSS systems.

However, this is where the maturity of the telecom industry as a whole can be of help. The collaborative efforts of telcos to overcome some of these problems have resulted in bodies like the TM Forum.
They have come up with frameworks for business processes, data, applications, and technology for telecom service providers. These could be a good starting point for telcos to clean up their enterprise landscape.

Industry Trends in Telecom Reference Architecture

Telecom reference architectures are evolving rapidly because telcos are facing business and IT challenges.

"The reality is that there probably is no killer application, no silver bullet that the telcos can latch onto to carry them into a 21st Century.... Instead, there are probably hundreds – perhaps thousands – of niche applications.... And the only way to find which of these works for you is to try out lots of them, ramp up the ones that work, and discontinue the ones that fail." – Martin Creaner, President & CTO, TM Forum

The following trends have been observed in telecom reference architecture:

·       Transformation of business structures to align with customer requirements.
·       Adoption of more Internet-like technical architectures; the Web 2.0 concept is increasingly being used.
·       Virtualization of the traditional operations support system (OSS).
·       Adoption of SOA to support the development of IP-based services.
·       Adoption of frameworks like Service Delivery Platforms (SDPs) and the IP Multimedia Subsystem (IMS) to enable seamless deployment of various services over fixed and mobile networks.
·       Replacement of in-house, customized, and stove-piped OSS/BSS with standards-based COTS products.
·       Compliance with industry standards and frameworks like eTOM, SID, and TAM to enable seamless integration with other standards-based products.

Drivers of Reference Architecture

The drivers of the Reference Architecture are the Reference Architecture Goals, Principles, and the Enterprise Vision and Telecom Transformation. The details are depicted in the diagram below.

Figure 1. Drivers for Reference Architecture

Today's telecom reference architectures should seamlessly integrate traditional legacy-based applications and the transition to next-generation network technologies (e.g., IP Multimedia Subsystems).
This has resulted in new requirements for flexible, real-time billing and OSS/BSS systems, with implications for the service provider's organizational requirements and structure.

Telecom reference architectures are today expected to:

·       Integrate voice, messaging, email, and other VAS over fixed and mobile networks and back-end systems.
·       Be able to provision multiple services and service bundles.
·       Deliver converged voice, video, and data services.
·       Leverage the existing network infrastructure.
·       Provide real-time, flexible billing and charging systems to handle complex promotions, discounts, and settlements with multiple parties.
·       Support charging of advanced data services such as VoIP, On-Demand Services (e.g., Video), IMS/SIP Services, Mobile Money, Content Services, and IPTV.
·       Help in faster deployment of new services.
·       Serve as an effective platform for collaboration between network, IT, and business organizations.
·       Harness the potential of converging technology, networks, devices, and content to develop multimedia services and solutions of ever-increasing sophistication on a single Internet Protocol (IP) network.
·       Ensure better service delivery and zero revenue leakage through real-time balance and credit management.
·       Lower operating costs to drive profitability.

Enterprise Reference Architecture

The Enterprise Reference Architecture (RA) fills the gap between the concepts and vocabulary defined by the reference model and the implementation. Reference architecture provides detailed architectural information in a common format such that solutions can be repeatedly designed and deployed in a consistent, high-quality, supportable fashion. This paper describes the Reference Architecture for telecom application usage and how to achieve an Enterprise-level Reference Architecture using SOA, covering:

•   Telecom Reference Architecture
•   Enterprise SOA-based Reference Architecture

Telecom Reference Architecture

The TeleManagement Forum's New Generation Operations Systems and Software (NGOSS) is an architectural framework for organizing, integrating, and implementing telecom systems. NGOSS is a component-based framework consisting of the following elements:

·       The enhanced Telecom Operations Map (eTOM) is a business process framework.
·       The Shared Information/Data (SID) model provides a comprehensive information framework that may be specialized for the needs of a particular organization.
·       The Telecom Application Map (TAM) is an application framework to depict the functional footprint of applications relative to the horizontal processes within eTOM.
·       The Technology Neutral Architecture (TNA) is an integrated framework; TNA is an architecture that is sustainable through technology changes.

NGOSS Architecture Standards are:

·       Centralized data
·       Loosely coupled distributed systems
·       Application components/re-use
·       A technology-neutral system framework with technology-specific implementations
·       Interoperability with service provider data/processes
·       More re-use of business components across multiple business scenarios
·       Workflow automation

The traditional operator systems architecture consists of four layers:

·       Business Support System (BSS) layer, with focus toward customers and business partners. Manages order, subscriber, pricing, rating, and billing information.
·       Operations Support System (OSS) layer, built around product, service, and resource inventories.
·       Networks layer – consists of network elements and 3rd party systems.
·       Integration layer – to maximize application communication and overall solution flexibility.

The reference architecture for telecom enterprises is depicted below.

Figure 2. Telecom Reference Architecture

The major building blocks of any Telecom Service Provider architecture are as follows:

1. Customer Relationship Management

CRM encompasses the end-to-end lifecycle of the customer: customer initiation/acquisition, sales, ordering and service activation, customer care and support, proactive campaigns, cross-sell/up-sell, and retention/loyalty.

CRM also includes the collection of customer information and its application to personalize, customize, and integrate delivery of service to a customer, as well as to identify opportunities for increasing the value of the customer to the enterprise.

The key functionalities related to Customer Relationship Management are:

·       Manage the end-to-end lifecycle of a customer request for products.
·       Create and manage customer profiles.
·       Manage all interactions with customers – inquiries, requests, and responses.
·       Provide updates to Billing and other southbound systems on customer/account-related changes such as customer/account creation, deletion, modification, bill requests, final bills, duplicate bills, and credit limits, through Middleware.
·       Work with the Order Management System and the Product and Service Management components within CRM.
·       Manage customer preferences – involve all the touch points and channels to the customer, including contact center, retail stores, dealers, self-service, and field service, as well as any media (phone, face to face, web, mobile device, chat, email, SMS, mail, the customer's bill, etc.).
·       Support a single interface for customer contact details, preferences, account details, offers, customer premise equipment, bill details, bill cycle details, and customer interactions.

CRM applications interact with customers through customer touch points like portals, point-of-sale terminals, interactive voice response systems, etc. Customer requests are sent via fulfillment/provisioning to the billing system for order processing.
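To make the service-exposure idea concrete, the sketch below shows how a CRM building block might publish customer-profile functions as a reusable contract that order management, billing, and self-service portals consume through the middleware layer. It is a minimal, hedged illustration only; the names (CustomerProfileService, CustomerProfile, the in-memory implementation) are assumptions for this example, not part of any CRM product or TM Forum specification.

import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical customer profile held by the CRM building block.
record CustomerProfile(String customerId, String name, String preferredChannel) {}

// Hypothetical service contract a CRM could expose to portals,
// order management, and billing through the middleware layer.
interface CustomerProfileService {
    CustomerProfile createProfile(String customerId, String name, String preferredChannel);
    Optional<CustomerProfile> findProfile(String customerId);
    CustomerProfile updatePreferredChannel(String customerId, String channel);
}

// Minimal in-memory implementation, standing in for a real CRM back end.
class InMemoryCustomerProfileService implements CustomerProfileService {
    private final Map<String, CustomerProfile> store = new HashMap<>();

    public CustomerProfile createProfile(String customerId, String name, String preferredChannel) {
        CustomerProfile p = new CustomerProfile(customerId, name, preferredChannel);
        store.put(customerId, p);
        return p;
    }

    public Optional<CustomerProfile> findProfile(String customerId) {
        return Optional.ofNullable(store.get(customerId));
    }

    public CustomerProfile updatePreferredChannel(String customerId, String channel) {
        // Assumes the profile already exists; a real CRM would validate and audit this.
        CustomerProfile old = store.get(customerId);
        CustomerProfile updated = new CustomerProfile(old.customerId(), old.name(), channel);
        store.put(customerId, updated);
        return updated;
    }
}

Consumers such as the touch points above would depend only on the CustomerProfileService contract, which is what allows the CRM implementation behind it to change without affecting them.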
2. Billing and Revenue Management

Billing and Revenue Management handles the collection of appropriate usage records and the production of timely and accurate bills – providing pre-bill usage information and billing to customers, processing their payments, and performing payment collections. In addition, it handles customer inquiries about bills, provides billing inquiry status, and is responsible for resolving billing problems to the customer's satisfaction in a timely manner. This process grouping also supports prepayment for services.

The key functionalities provided by these applications are:

·       Ensure that enterprise revenue is billed and invoices are delivered appropriately to customers.
·       Manage customers' billing accounts, process their payments, perform payment collections, and monitor the status of the account balance.
·       Ensure the timely and effective fulfillment of all customer bill inquiries and complaints.
·       Collect the usage records from mediation and ensure appropriate rating and discounting of all usage and pricing.
·       Support revenue sharing and split charging, where usage is guided to an account different from that of the service consumer.
·       Support prepaid and postpaid rating.
·       Send notifications on approaching or exceeding usage thresholds, as enforced by the subscribed offer and/or as set up by the customer.
·       Support prepaid, postpaid, and hybrid customers (where some services are prepaid and the rest postpaid), and conversion from postpaid to prepaid and vice versa.
·       Support different billing function requirements like charge prorating, promotions, discounts, adjustments, waivers, write-offs, accounts receivable, GL interface, late payment fees, credit control, dunning, account or service suspension, re-activation, expiry, termination, contract violation penalties, etc.
·       Initiate direct debit to collect payment against an outstanding invoice.
·       Send notifications to Middleware on different events; for example, payment receipt, pre-suspension, threshold exceeded, etc.

Billing systems typically get usage data from mediation systems for rating and billing. They get provisioning requests from order management systems and inquiries from CRM systems. Convergent and real-time billing systems can directly get usage details from network elements.
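As a toy illustration of the rating, discounting, and proration functions listed above, the following Java sketch rates a batch of usage records, applies a flat promotional discount, and prorates a monthly charge. The class names (SimpleRatingEngine, UsageRecord) and the per-unit rates are assumptions made purely for this example; a real convergent billing product would drive them from a product catalog and also handle balances, taxes, and settlement.

import java.util.List;

// A usage record as it might arrive from mediation (illustrative fields only).
record UsageRecord(String subscriberId, String service, long units) {}

class SimpleRatingEngine {
    // Hypothetical per-unit prices per service; a real system would read a catalog.
    private double ratePerUnit(String service) {
        return switch (service) {
            case "VOICE" -> 0.02;   // per second
            case "DATA"  -> 0.001;  // per kilobyte
            default      -> 0.01;
        };
    }

    // Rate all records for one subscriber and apply a flat percentage discount.
    double rate(List<UsageRecord> records, double discountPercent) {
        double gross = records.stream()
                .mapToDouble(r -> r.units() * ratePerUnit(r.service()))
                .sum();
        return gross * (1.0 - discountPercent / 100.0);
    }

    // Prorate a monthly recurring charge when service was active only part of the cycle.
    double prorateMonthlyCharge(double monthlyCharge, int activeDays, int daysInCycle) {
        return monthlyCharge * activeDays / daysInCycle;
    }
}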
3. Mediation

Mediation systems transform/translate the raw or native Usage Data Records into a general format that is acceptable to billing for rating purposes.

The following lists the high-level roles and responsibilities executed by the Mediation system in the end-to-end solution:

·       Collect Usage Data Records from different data sources – like network elements, routers, and servers – via different protocols and interfaces.
·       Process Usage Data Records – mediation will process Usage Data Records as per the source format.
·       Validate Usage Data Records from each source.
·       Segregate Usage Data Records coming from each source into multiple streams, based on the segregation requirements of the end applications.
·       Aggregate Usage Data Records from different sources based on the aggregation rules, if any.
·       Consolidate multiple Usage Data Records from each source.
·       Deliver formatted Usage Data Records to different end applications like Billing, Interconnect, Fraud Management, etc.
·       Generate an audit trail for incoming Usage Data Records and keep track of all the Usage Data Records at the various stages of the mediation process.
·       Check for duplicate Usage Data Records across files for a given time window.
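The validation and duplicate-check responsibilities above can be pictured with a small sketch. The Java below is an assumption-laden illustration only (the record fields, the MediationStage name, and the duplicate window are invented for this example): it drops invalid records and suppresses records whose id has already been seen within a configurable time window before handing the rest downstream to billing.

import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A raw usage data record as collected from a network element (illustrative fields).
record RawUsageRecord(String recordId, String subscriberId, Instant eventTime, long units) {}

class MediationStage {
    private final Duration duplicateWindow;
    // Last time each record id was seen, used for the duplicate check.
    private final Map<String, Instant> seen = new HashMap<>();

    MediationStage(Duration duplicateWindow) {
        this.duplicateWindow = duplicateWindow;
    }

    // Basic validation: reject records with missing ids or non-positive usage.
    private boolean isValid(RawUsageRecord r) {
        return r.recordId() != null && r.subscriberId() != null && r.units() > 0;
    }

    // Treat a record as a duplicate if the same id was seen inside the window.
    private boolean isDuplicate(RawUsageRecord r) {
        Instant last = seen.get(r.recordId());
        seen.put(r.recordId(), r.eventTime());
        return last != null && Duration.between(last, r.eventTime()).abs().compareTo(duplicateWindow) <= 0;
    }

    // Validate, de-duplicate, and pass surviving records downstream (e.g., to billing).
    List<RawUsageRecord> process(List<RawUsageRecord> incoming) {
        List<RawUsageRecord> out = new ArrayList<>();
        for (RawUsageRecord r : incoming) {
            if (isValid(r) && !isDuplicate(r)) {
                out.add(r);
            }
        }
        return out;
    }
}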
4. Fulfillment

This area is responsible for providing customers with their requested products in a timely and correct manner. It translates the customer's business or personal need into a solution that can be delivered using the specific products in the enterprise's portfolio. This process informs the customers of the status of their purchase order and ensures completion on time, as well as a delighted customer. These processes are responsible for accepting and issuing orders. They deal with pre-order feasibility determination, credit authorization, order issuance, order status and tracking, customer updates on order activities, and customer notification on order completion. Order management and provisioning applications fall into this category.

The key functionalities provided by these applications are:

·       Issuing new customer orders, modifying open customer orders, or canceling open customer orders;
·       Verifying whether specific non-standard offerings sought by customers are feasible and supportable;
·       Checking the creditworthiness of customers as part of the customer order process;
·       Testing the completed offering to ensure it is working correctly;
·       Updating the Customer Inventory Database to reflect that the specific product offering has been allocated, modified, or cancelled;
·       Assigning and tracking customer provisioning activities;
·       Managing customer provisioning jeopardy conditions; and
·       Reporting progress on customer orders and other processes to the customer.

These applications typically get orders from CRM systems. They interact with network elements and billing systems for fulfillment of orders.

5. Enterprise Management

This process area includes those processes that manage enterprise-wide activities and needs, or have application within the enterprise as a whole. They encompass all business management processes that:

·       Are necessary to support the whole of the enterprise, including processes for financial management, legal management, regulatory management, process, cost, and quality management, etc.;
·       Are responsible for setting corporate policies, strategies, and directions, and for providing guidelines and targets for the whole of the business, including strategy development and planning for areas, such as Enterprise Architecture, that are integral to the direction and development of the business;
·       Occur throughout the enterprise, including processes for project management, performance assessments, cost assessments, etc.

(i) Enterprise Risk Management

Enterprise Risk Management focuses on assuring that risks and threats to the enterprise value and/or reputation are identified, and that appropriate controls are in place to minimize or eliminate the identified risks. The identified risks may be physical or logical/virtual. Successful risk management ensures that the enterprise can support its mission-critical operations, processes, applications, and communications in the face of serious incidents such as security threats/violations and fraud attempts.

The two key areas covered in Risk Management by telecom operators are:

·       Revenue Assurance: The revenue assurance system is responsible for identifying revenue loss scenarios across components/systems and helps in rectifying the problems. The following lists the high-level roles and responsibilities executed by the Revenue Assurance system in the end-to-end solution.
o   Identify all usage information dropped when networks are being upgraded.
o   Verify interconnect bills.
o   Identify where services are routinely provisioned but never billed.
o   Identify poor sales policies that are intensifying collections problems.
o   Find leakage where usage is sent to an error bucket and never billed for.
o   Find leakage where field service, CRM, and network build-out are not optimized.

·       Fraud Management: Involves collecting data from different systems to identify abnormalities in traffic patterns, usage patterns, and subscription patterns, and to report suspicious activity that might suggest fraudulent usage of resources, resulting in revenue losses to the operator.

The key roles and responsibilities of this system component are as follows:

o   The fraud management system will capture and monitor high usage (over a certain threshold) in terms of duration, value, and number of calls for each subscriber. The threshold for each subscriber is decided by the system and fixed automatically.
o   Fraud management will be able to detect unauthorized access to services for certain subscribers. These subscribers may have been provided unauthorized services by employees. The component will raise an alert to the operator the very first time such illegal or unbilled calls occur.
o   The solution will include an alarm management system that delivers alarms to the operator/provider whenever it detects fraud, thus minimizing fraud by catching it the first time it occurs.
o   The Fraud Management system will be capable of interfacing with switches, mediation systems, and billing systems.
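To make the high-usage monitoring described above concrete, here is a minimal sketch of threshold-based detection. The thresholds, class names, and the alarm callback are illustrative assumptions; a production fraud system would derive per-subscriber thresholds statistically and correlate many more signals.

import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Accumulated usage for one subscriber over the current monitoring period.
class SubscriberUsage {
    long totalSeconds;
    double totalValue;
    int callCount;
}

class HighUsageMonitor {
    private final long secondsThreshold;
    private final double valueThreshold;
    private final int callCountThreshold;
    private final Consumer<String> alarmSink; // e.g., an alarm management system
    private final Map<String, SubscriberUsage> usage = new HashMap<>();

    HighUsageMonitor(long secondsThreshold, double valueThreshold,
                     int callCountThreshold, Consumer<String> alarmSink) {
        this.secondsThreshold = secondsThreshold;
        this.valueThreshold = valueThreshold;
        this.callCountThreshold = callCountThreshold;
        this.alarmSink = alarmSink;
    }

    // Record one call and raise an alarm the first time any threshold is exceeded.
    void recordCall(String subscriberId, long seconds, double value) {
        SubscriberUsage u = usage.computeIfAbsent(subscriberId, id -> new SubscriberUsage());
        boolean wasOver = isOver(u);
        u.totalSeconds += seconds;
        u.totalValue += value;
        u.callCount++;
        if (!wasOver && isOver(u)) {
            alarmSink.accept("High usage alarm for subscriber " + subscriberId);
        }
    }

    private boolean isOver(SubscriberUsage u) {
        return u.totalSeconds > secondsThreshold
                || u.totalValue > valueThreshold
                || u.callCount > callCountThreshold;
    }
}

For example, new HighUsageMonitor(3600, 100.0, 200, System.out::println) would raise a single alarm the first time a subscriber crosses an hour of call time, 100 units of charge, or 200 calls in the monitoring period.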
(ii) Knowledge Management

This process focuses on knowledge management, technology research within the enterprise, and the evaluation of potential technology acquisitions.

Key responsibilities of knowledge base management are to:

·       Maintain the knowledge base – creation and updating of the knowledge base on an ongoing basis.
·       Search the knowledge base – search on keywords or by category browse.
·       Maintain metadata – management of metadata on the knowledge base to ensure effective management and search.
·       Run the report generator.
·       Provide content – add content to the knowledge base, e.g., user guides, operational manuals, etc.

(iii) Document Management

It focuses on maintaining a repository of all electronic documents, or images of paper documents, relevant to the enterprise, using a system.

(iv) Data Management

It manages data as a valuable resource for any enterprise. For telecom enterprises, the typical areas covered are Master Data Management, Data Warehousing, and Business Intelligence. It is also responsible for data governance, security, quality, and database management.

Key responsibilities of Data Management are:

·       Using ETL, extract data from CRM, Billing, web content, ERP, campaign management, financial, and network operations systems, as well as asset management information, customer contact data, customer measures, benchmarks, and process data (e.g., process inputs, outputs, and measures), into the Enterprise Data Warehouse.
·       Manage data traceability to source, data-related business rules/decisions, data quality, data cleansing, data reconciliation, and competitor data – storage for all the enterprise data (customer profiles, products, offers, revenues, etc.).
·       Receive online updates through nightly replication or physical backup processes at regular frequency.
·       Provide data access to business intelligence and other systems for their analysis, report generation, and use.

(v) Business Intelligence

It uses the enterprise data to provide the various analyses and reports that contain prospects and analytics for customer retention, acquisition of new customers due to the offers, and SLAs. It will generate the right and optimized plans – bolt-ons – for the customers.

The following lists the high-level roles and responsibilities executed by the Business Intelligence system at the enterprise level:

·       It will do pattern analysis and problem reporting.
·       It will do data analysis – statistical analysis, data profiling, affinity analysis of data, and customer-segment-wise usage patterns on offers, products, services, and revenue generation against services and customer segments.
·       It will do performance (business, system, and forecast) analysis, churn propensity, response time, and SLA analysis.
·       It will support online and offline analysis, and report drill-down capability.
·       It will collect, store, and report various SLA data.
·       It will provide the necessary intelligence for marketing and working on campaigns, etc., with cost-benefit analysis and predictions.
·       It will advise on customer promotions with additional services based on loyalty and the credit history of the customer.
·       It will interface with the Enterprise Data Management system for data to run reports and analysis tasks. It will interface with campaign schedules, based on historical success evidence.

(vi) Stakeholder and External Relations Management

It manages the enterprise's relationships with stakeholders and outside entities. Stakeholders include shareholders, employee organizations, etc. Outside entities include regulators, the local community, and unions. Some of the processes within this grouping are Shareholder Relations, External Affairs, Labor Relations, and Public Relations.

(vii) Enterprise Resource Planning

It is used to manage internal and external resources, including tangible assets, financial resources, materials, and human resources. Its purpose is to facilitate the flow of information between all business functions inside the boundaries of the enterprise and to manage the connections to outside stakeholders. ERP systems consolidate all business operations into a uniform and enterprise-wide system environment.

The key roles and responsibilities of the Enterprise System are given below:

·       It will handle responsibilities such as core accounting, financial, and management reporting.
·       It will interface with CRM for capturing customer accounts and details.
·       It will interface with Billing to capture billing revenue and other financial data.
·       It will be responsible for executing the dunning process. Billing will send the required feed to ERP for execution of dunning.
·       It will interface with CRM and Billing through batch interfaces.

Enterprise management systems act as horizontals in the enterprise and typically interact with all major telecom systems. For example, an ERP system interacts with CRM, Fulfillment, and Billing systems for different kinds of data exchanges.
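A tiny sketch can tie the Data Management and Business Intelligence responsibilities described in this section together: usage revenue extracted from billing is joined with the customer segment held in CRM and totalled per segment, which is the kind of aggregation a warehouse and BI layer would run at scale. All record and class names here are assumptions for illustration only.

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Simplified rows as they might be extracted from billing and CRM (illustrative).
record BilledUsage(String customerId, double revenue) {}
record CustomerSegmentRow(String customerId, String segment) {}

class SegmentRevenueReport {
    // Join billing revenue with CRM segment data and total revenue per segment,
    // a tiny stand-in for the warehouse aggregation a BI tool would run.
    // Assumes at most one segment row per customer.
    Map<String, Double> revenueBySegment(List<BilledUsage> usage, List<CustomerSegmentRow> crm) {
        Map<String, String> segmentByCustomer = crm.stream()
                .collect(Collectors.toMap(CustomerSegmentRow::customerId, CustomerSegmentRow::segment));
        return usage.stream()
                .collect(Collectors.groupingBy(
                        u -> segmentByCustomer.getOrDefault(u.customerId(), "UNKNOWN"),
                        Collectors.summingDouble(BilledUsage::revenue)));
    }
}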
6. External Interfaces/Touch Points

The typical external parties are customers, suppliers/partners, employees, shareholders, and other stakeholders. External interactions from/to a service provider to other parties can be achieved by a variety of mechanisms, including:

·       Exchange of emails or faxes
·       Call centers
·       Web portals
·       Business-to-Business (B2B) automated transactions

These applications provide an Internet-technology-driven interface to external parties to undertake a variety of business functions directly for themselves. They can provide fully or partially automated service to external parties through various touch points.

Typical characteristics of these touch points are:

·       A pre-integrated self-service system, including a stand-alone web framework or an integration front end with a portal engine
·       A self-service layer exposing atomic web services/APIs for reuse by multiple systems across the architectural environment
·       Portlet-driven connectivity exposing data and services interoperability through a portal engine or web application

These touch points mostly interact with the CRM systems for requests, inquiries, and responses.

7. Middleware

The Middleware component is primarily responsible for integrating the different system components under a common platform. It should provide a standards-based platform for building Service Oriented Architecture and composite applications. The following lists the high-level roles and responsibilities executed by the Middleware component in the end-to-end solution:

·       Act as an integration framework, covering inbound and outbound interfaces.
·       Provide a web service framework with a service registry.
·       Support an SOA framework with an SOA service registry.
·       Each of the interfaces from/to Middleware to other components would handle data transformation, translation, and mapping of data points.
·       Receive data from the caller/activator and/or forward the data to the recipient system in XML format.
·       Use standard XML for data exchange.
·       Provide the response back to the service/call initiator.
·       Provide tracking until the response completes.
·       Keep a store of transitional data against each call/transaction.
·       Interface through Middleware to get any information that is possible and allowed from the existing systems to enterprise systems; e.g., customer profile and customer history, etc.
·       Provide the data in a common unified format to the SOA calls across systems, and follow the Enterprise Architecture directives.
·       Provide an audit trail for all transactions being handled by the component.
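A skeletal illustration of the middleware responsibilities listed above: the component wraps a payload in a common XML envelope, records it in an audit trail, and forwards it to the recipient. The envelope format and the MiddlewareRouter name are assumptions for this sketch; a real ESB would use schema-validated XML, reliable messaging, and registry-driven routing.

import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// A hypothetical middleware hop: wrap a payload in a common XML envelope,
// keep an audit trail entry, and hand the message to the recipient system.
class MiddlewareRouter {
    private final List<String> auditTrail = new ArrayList<>();

    // Build the common exchange format; a real ESB would use schema-validated XML.
    private String toEnvelope(String source, String target, String payloadXml) {
        return "<message>"
                + "<source>" + source + "</source>"
                + "<target>" + target + "</target>"
                + "<timestamp>" + Instant.now() + "</timestamp>"
                + "<payload>" + payloadXml + "</payload>"
                + "</message>";
    }

    // Route one message and record it for the audit trail.
    void route(String source, String target, String payloadXml, Consumer<String> recipient) {
        String envelope = toEnvelope(source, target, payloadXml);
        auditTrail.add(envelope);
        recipient.accept(envelope);
    }

    List<String> auditTrail() {
        return List.copyOf(auditTrail);
    }
}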
8. Network Elements

The term Network Element means a facility or equipment used in the provision of a telecommunications service. The term also includes features, functions, and capabilities that are provided by means of such facility or equipment, including subscriber numbers, databases, signaling systems, and information sufficient for billing and collection, or used in the transmission, routing, or other provision of a telecommunications service.

Typical network elements in a GSM network are the Home Location Register (HLR), Intelligent Network (IN), Mobile Switching Center (MSC), SMS Center (SMSC), and network elements for other value added services like Push-to-talk (PTT), Ring Back Tone (RBT), etc.

Network elements are invoked when subscribers use their telecom devices for any kind of usage. These elements generate usage data and pass it on to downstream systems like mediation and billing for rating and billing. They also integrate with provisioning systems for order/service fulfillment.

9. 3rd Party Applications

3rd party systems are applications like content providers, payment gateways, point-of-sale terminals, and databases/applications maintained by the Government.

Depending on applicability and the type of functionality provided by the 3rd party applications, integration with different telecom systems like CRM, provisioning, and billing will be done.

10. Service Delivery Platform

A Service Delivery Platform (SDP) provides the architecture for the rapid deployment, provisioning, execution, management, and billing of value added telecom services. SDPs are based on the concept of SOA and layered architecture. They support the delivery of voice, data services, and content in a network- and device-independent fashion. They allow application developers to aggregate network capabilities, services, and sources of content. SDPs typically contain layers for web services exposure, service application development, and network abstraction.

SOA Reference Architecture

The SOA concept is based on the principle of developing reusable business services and building applications by composing those services, instead of building monolithic applications in silos. It is about bridging the gap between business and IT through a set of business-aligned IT services, using a set of design principles, patterns, and techniques.

In an SOA, resources are made available to participants in a value net, enterprise, or line of business (typically spanning multiple applications within an enterprise or across multiple enterprises). It consists of a set of business-aligned IT services that collectively fulfill an organization's business processes and goals. We can choreograph these services into composite applications and invoke them through standard protocols. SOA, apart from agility and reusability, enables:

·       The business to specify processes as orchestrations of reusable services
·       Technology-agnostic business design, with technology hidden behind the service interface
·       A contract-like interaction between business and IT, based on service SLAs
·       Accountability and governance, better aligned to business services
·       Untangling of application interconnections by allowing access only through service interfaces, reducing the daunting side effects of change
·       Reduced pressure to replace legacy applications, and an extended lifetime for them, through encapsulation in services
·       A Cloud Computing paradigm, using web services technologies, that makes possible service outsourcing on an on-demand, utility-like, pay-per-usage basis

The following sections present the Reference Architecture as a logical view for the Telecom Solution. New custom-built applications need to align with this logical architecture in the long run to achieve EA benefits.

Packaged implementation applications, such as ERP and billing applications, need to expose their functions as service providers (which other applications consume) and interact with other applications as service consumers.

COTS applications need to expose services through wrappers, such as adapters, to utilize existing resources and at the same time achieve the Enterprise Architecture goals and objectives.
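The adapter idea in the previous paragraph can be sketched in a few lines of Java: a technology-neutral service contract is defined once, and a thin wrapper maps it onto the proprietary API of a packaged billing product. Every name below (BalanceService, LegacyBillingSystem, fetchBal) is invented for illustration; the point is only that consumers depend on the contract, never on the vendor API.

// Technology-neutral service contract that consumers program against.
interface BalanceService {
    double currentBalance(String accountId);
}

// Stand-in for the proprietary API of a packaged billing product;
// the signature is invented for illustration.
class LegacyBillingSystem {
    double fetchBal(String acct) {
        return 42.0;
    }
}

// Adapter that wraps the proprietary call behind the service interface,
// so consumers never depend on the vendor API directly.
class LegacyBillingAdapter implements BalanceService {
    private final LegacyBillingSystem backend;

    LegacyBillingAdapter(LegacyBillingSystem backend) {
        this.backend = backend;
    }

    public double currentBalance(String accountId) {
        return backend.fetchBal(accountId);
    }
}

A self-care portal written against BalanceService can then be repointed at a different billing product by swapping the adapter, which is exactly the loose coupling the SOA layering described below aims for.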
The following are the various layers for enterprise-level deployment of SOA. The diagram captures the abstract view of the Enterprise SOA layers and the important components of each layer. Layered architecture means decomposition of services such that most interactions occur between adjacent layers; however, there is no strict rule that top layers should not directly communicate with bottom layers.

The diagram below represents the important logical pieces that would result from the overall SOA transformation.

Figure 3. Enterprise SOA Reference Architecture

1. Operational System Layer: This layer consists of all packaged applications like CRM and ERP, custom-built applications, COTS-based applications like Billing, Revenue Management, and Fulfilment, and the enterprise databases that are essential and contribute directly or indirectly to the Enterprise OSS/BSS Transformation.

ERP holds the data for Asset Lifecycle Management, Supply Chain, Advanced Procurement, Human Capital Management, etc.

CRM holds the data related to Order, Sales and Marketing, Customer Care, Partner Relationship Management, Loyalty, etc.

Content Management handles Enterprise Search and Query. The Billing application consists of the following components:

·       Collections Management, Customer Billing Management, Invoices, Real-Time Rating, Discounting, and Applying of Charges
·       Enterprise databases will hold both the application and service data, whether structured or unstructured.

MDM – Master data consists mainly of Customer, Order, Product, and Service data.

2. Enterprise Component Layer: This layer consists of the Application Services and Common Services that are responsible for realizing the functionality and maintaining the QoS of the exposed services. This layer uses container-based technologies such as application servers to implement the components, workload management, high availability, and load balancing.

Application Services: This service layer enables application, technology, and database abstraction so that the complex accessing logic is hidden from the other service layers. This is a basic service layer which exposes application functionalities and data as reusable services.
The three types of Application Access Services are:

·       Application Access Service: This service layer exposes application-level functionalities as reusable services for BSS-to-BSS and BSS-to-OSS integration. This layer is enabled using disparate technologies such as web services, integration servers, adapters, etc.
·       Data Access Service: This service layer exposes application data services as reusable reference data services. This is done via direct interaction with the application data, and it provides federated queries.
·       Network Access Service: This service layer exposes the provisioning layer as a reusable service for OSS-to-OSS integration. This integration service emphasizes the need for high performance, stateless process flows, and distributed design.

Common Services encompass the management of structured, semi-structured, and unstructured data, such as information services, portal services, interaction services, infrastructure services, security services, etc.

3. Integration Layer: This consists of service infrastructure components like the service bus, the service gateway for partner integration, the service registry, the service repository, and the BPEL processor. The service bus carries the service invocation payloads/messages between consumers and providers. The other important functions expected from it are itinerary-based routing, distributed caching of routing information, transformations, and all qualities of service for messaging, such as reliability, scalability, and availability. The service registry holds all contracts (WSDL) of services and helps developers to locate or discover services during design time or runtime.

The BPEL processor is useful in orchestrating the services to compose a complex business scenario or process. Workflow and business rules management are also required to support the manual triggering of certain activities within a business process, based on the rules setup and the state machine information. The application, data, and service mediation layers typically form the overall composite application development framework, or SOA Framework.

4. Business Process Layer: This is typically the intermediate services layer and represents Shared Business Process Services. At the enterprise level, these services come from the Customer Management, Order Management, Billing, Finance, and Asset Management application domains.

5. Access Layer: This layer consists of portals for the enterprise and provides a single view of enterprise information management and dashboard services.

6. Channel Layer: This consists of the various devices, the applications that form part of the extended enterprise, and the browsers through which users access the applications.

7. Client Layer: This designates the different types of users accessing the enterprise applications. The type of user typically is an important factor in determining the level of access to applications.

8. Vertical pieces like management, monitoring, security, and development cut across all horizontal layers. Management and monitoring involve all aspects of SOA, such as services, SLAs, and other QoS lifecycle processes for both applications and services, surrounding SOA governance.

9. EA Governance, Reference Architecture, Roadmap, Principles, and Best Practices: EA Governance is important in terms of providing the overall direction to SOA implementation within the enterprise.
This involves board-level involvement, in addition to business and IT executives. At a high level, it involves managing the implementation of SOA projects, managing the SOA infrastructure, and controlling the entire effort through fine-tuned IT processes in accordance with COBIT (Control Objectives for Information and Related Technologies).

Promoting a reuse culture and the SOA way of doing things requires devising tools and techniques, establishing competency centers, and training the workforce to take up new roles suited to the SOA journey.

Conclusions

Reference Architectures can serve as the basis for disparate architecture efforts throughout the organization, even if they use different tools and technologies. Reference architectures provide best practices and approaches in a way that is independent of any particular vendor's technology and standards. Reference Architectures model the abstract architectural elements for an enterprise independently of the technologies, protocols, and products that are used to implement an SOA. Telecom enterprises today are facing significant business and technology challenges due to growing competition, a multitude of services, and convergence. Adopting architectural best practices could go a long way in meeting these challenges. The use of an SOA-based architecture for communication with each of the external systems, like Billing and CRM, in the OSS/BSS system makes the architecture very loosely coupled, with greater flexibility. Any change in the external systems would be absorbed at the Integration Layer without affecting the rest of the ecosystem. The use of a Business Process Management (BPM) tool makes the management and maintenance of the business processes easy, with better performance in terms of lead time, quality, and cost. Since the architecture is based on standards, it will lower the cost of deploying and managing OSS/BSS applications over their lifecycles.

