Search Results

Search found 9199 results on 368 pages for 'topological sort'.

Page 256/368 | < Previous Page | 252 253 254 255 256 257 258 259 260 261 262 263  | Next Page >

  • Getting from a user-story to code while using TDD (scrum)

    - by Ittai
    I'm getting into scrum and TDD and I think I have some confusion which I'd like to get your feedback about. Let's assume I have a user-story in my backlog, in order for me to start developing it as part of TDD I need to have requirements, right so far? Is it true to say that the product manager and the QA should be responsible for taking the user-story and breaking it down to acceptance tests? I think the above is true since the acceptance tests need to be formal, so they can be used as tests, but also human readable so that the product can approve they are the requirements, right? Is it also true that I later take these acceptance tests and use them as my requirements, i.e. they are a set of use-cases which I implement (through TDD)? I hope I'm not making too much of a mess but that's the current flow I have in mind right now. Update I think my initial intentions were unclear so I'll try to rephrase. I want to know more details about the scrum flow of turning a user-story into code while using TDD. The starting point is obvious, a user surfaces a need (or the user's representative as the product) which is a short 1-2 lines description in the known format and that is added to the product backlog. When there is a spring planning meeting user-stories are taken from the backlog and assigned to developers. In order for a developer to write code they need requirements (especially in TDD since the requirements are what the tests are derived from). When, by whom and to which format are the requirements compiled? What I had in mind was that the product and QA define the requirements via acceptance tests (I'm thinking of automatic using FitNesse or the sort but that's not the core I think) which help to serve 2 purposes at the same time: They define "Done" properly. They give a developer something to derive tests from. I wasn't sure when these were written (before the sprint they're picked then that might be a waste since additional information will arrive or the story won't be picked, during the iteration then the developer might get stuck waiting for them...)

    Read the article

  • Achieving Zero Downtime Deployment

    - by MattW
    I am trying to achieve zero downtime deployments so I can deploy less during off hours and more during "slower" hours - or anytime, in theory. My current setup, somewhat simplified: Web Server A (.NET App) Web Server B (.NET App) Database Server (SQL Server) My current deployment process: "Stop" the sites on both Web Server A and B Upgrade the database schema for the version of the app being deployed Update Web Server A Update Web Server B Bring everything back online Current Problem This leads to a small amount of downtime each month - about 30 mins. I do this during off hours, so it isn't a huge problem - but it is something I'd like to get away from. Also - there is no way to really go 'back'. I don't generally make rollback DB scripts - only upgrade scripts. Leveraging The Load Balancer I'd love to be able to upgrade one Web Server at a time. Take Web Server A out of the load balancer, upgrade it, put it back online, then repeat for Web Server B. The problem is the database. Each version of my software will need to execute against a different version of the database - so I am sort of "stuck". Possible Solution A current solution I am considering is adopting the following rules: Never delete a database table. Never delete a database column. Never rename a database column. Never reorder a column. Every stored procedure must be versioned. Meaning - 'spFindAllThings' will become 'spFindAllThings_2' when it is edited. Then it becomes 'spFindAllThings_3' when edited again. Same rule applies to views. While, this seems a bit extreme - I think it solves the problem. Each version of the application will be hitting the DB in a non breaking way. The code expects certain results from the views/stored procedures - and this keeps that 'contract' valid. The problem is - it just seeps sloppy. I know I can clean up old stored procedures after the app is deployed for awhile, but it just feels dirty. Also - it depends on all of the developers following these rule, which will mostly happen, but I imagine someone will make a mistake. Finally - My Question Is this sloppy or hacky? Is anybody else doing it this way? How are other people solving this problem?

    Read the article

  • Space partitioning when everything is moving

    - by Roy T.
    Background Together with a friend I'm working on a 2D game that is set in space. To make it as immersive and interactive as possible we want there to be thousands of objects freely floating around, some clustered together, others adrift in empty space. Challenge To unburden the rendering and physics engine we need to implement some sort of spatial partitioning. There are two challenges we have to overcome. The first challenge is that everything is moving so reconstructing/updating the data structure has to be extremely cheap since it will have to be done every frame. The second challenge is the distribution of objects, as said before there might be clusters of objects together and vast bits of empty space and to make it even worse there is no boundary to space. Existing technologies I've looked at existing techniques like BSP-Trees, QuadTrees, kd-Trees and even R-Trees but as far as I can tell these data structures aren't a perfect fit since updating a lot of objects that have moved to other cells is relatively expensive. What I've tried I made the decision that I need a data structure that is more geared toward rapid insertion/update than on giving back the least amount of possible hits given a query. For that purpose I made the cells implicit so each object, given it's position, can calculate in which cell(s) it should be. Then I use a HashMap that maps cell-coordinates to an ArrayList (the contents of the cell). This works fairly well since there is no memory lost on 'empty' cells and its easy to calculate which cells to inspect. However creating all those ArrayLists (worst case N) is expensive and so is growing the HashMap a lot of times (although that is slightly mitigated by giving it a large initial capacity). Problem OK so this works but still isn't very fast. Now I can try to micro-optimize the JAVA code. However I'm not expecting too much of that since the profiler tells me that most time is spent in creating all those objects that I use to store the cells. I'm hoping that there are some other tricks/algorithms out there that make this a lot faster so here is what my ideal data structure looks like: The number one priority is fast updating/reconstructing of the entire data structure Its less important to finely divide the objects into equally sized bins, we can draw a few extra objects and do a few extra collision checks if that means that updating is a little bit faster Memory is not really important (PC game)

    Read the article

  • Interviews: Going Beyond the Technical Quiz

    - by Tony Davis
    All developers will be familiar with the basic format of a technical interview. After a bout of CV-trawling to gauge basic experience, strengths and weaknesses, the interview turns technical. The whiteboard takes center stage and the challenge is set to design a function or query, or solve what on the face of it might seem a disarmingly simple programming puzzle. Most developers will have experienced those few panic-stricken moments, when one’s mind goes as blank as the whiteboard, before un-popping the marker pen, and hopefully one’s mental functions, to work through the problem. It is a way to probe the candidate’s knowledge of basic programming structures and techniques and to challenge their critical thinking. However, these challenges or puzzles, often devised by some of the smartest brains in the development team, have a tendency to become unnecessarily ‘tricksy’. They often seem somewhat academic in nature. While the candidate straight out of IT school might breeze through the construction of a Markov chain, a candidate with bags of practical experience but less in the way of formal training could become nonplussed. Also, a whiteboard and a marker pen make up only a very small part of the toolkit that a programmer will use in everyday work. I remember vividly my first job interview, for a position as technical editor. It went well, but after the usual CV grilling and technical questions, I was only halfway there. Later, they sat me alongside a team of editors, in front of a computer loaded with MS Word and copy of SQL Server Query Analyzer, and my task was to edit a real chapter for a real SQL Server book that they planned to publish, including validating and testing all the code. It was a tough challenge but I came away with a sound knowledge of the sort of work I’d do, and its context. It makes perfect sense, yet my impression is that many organizations don’t do this. Indeed, it is only relatively recently that Red Gate started to move over to this model for developer interviews. Now, instead of, or perhaps in addition to, the whiteboard challenges, the candidate can expect to sit with their prospective team, in front of Visual Studio, loaded with all the useful tools in the developer’s kit (ReSharper and so on) and asked to, for example, analyze and improve a real piece of software. The same principles should apply when interviewing for a database positon. In addition to the usual questions challenging the candidate’s knowledge of such things as b-trees, object permissions, database recovery models, and so on, sit the candidate down with the other database developers or DBAs. Arm them with a copy of Management Studio, and a few other tools, then challenge them to discover the flaws in a stored procedure, and improve its performance. Or present them with a corrupt database and ask them to get the database back online, and discover the cause of the corruption.

    Read the article

  • rkhunter 1.4 different results than version before?

    - by dschinn1001
    with rkhunter version before ubuntu-update from 12.04 to 12.10 I had NOT these warnings like listed here: Performing file properties checks Checking for prerequisites [ Warning ] /usr/sbin/adduser [ Warning ] /usr/sbin/chroot [ Warning ] /usr/sbin/cron [ Warning ] /usr/sbin/groupadd [ Warning ] /usr/sbin/groupdel [ Warning ] /usr/sbin/groupmod [ Warning ] /usr/sbin/grpck [ Warning ] /usr/sbin/nologin [ Warning ] /usr/sbin/pwck [ Warning ] /usr/sbin/rsyslogd [ Warning ] /usr/sbin/tcpd [ Warning ] /usr/sbin/useradd [ Warning ] /usr/sbin/userdel [ Warning ] /usr/sbin/usermod [ Warning ] /usr/sbin/vipw [ Warning ] /usr/bin/awk [ Warning ] /usr/bin/basename [ Warning ] /usr/bin/chattr [ Warning ] /usr/bin/curl [ Warning ] /usr/bin/cut [ Warning ] /usr/bin/diff [ Warning ] /usr/bin/dirname [ Warning ] /usr/bin/dpkg [ Warning ] /usr/bin/dpkg-query [ Warning ] /usr/bin/du [ Warning ] /usr/bin/env [ Warning ] /usr/bin/file [ Warning ] /usr/bin/find [ Warning ] /usr/bin/GET [ Warning ] /usr/bin/groups [ Warning ] /usr/bin/head [ Warning ] /usr/bin/id [ Warning ] /usr/bin/killall [ Warning ] /usr/bin/last [ Warning ] /usr/bin/lastlog [ Warning ] /usr/bin/ldd [ Warning ] /usr/bin/less [ Warning ] /usr/bin/locate [ Warning ] /usr/bin/logger [ Warning ] /usr/bin/lsattr [ Warning ] /usr/bin/lsof [ Warning ] /usr/bin/lynx [ Warning ] /usr/bin/mail [ Warning ] /usr/bin/md5sum [ Warning ] /usr/bin/mlocate [ Warning ] /usr/bin/newgrp [ Warning ] /usr/bin/passwd [ Warning ] /usr/bin/perl [ Warning ] /usr/bin/pgrep [ Warning ] /usr/bin/pkill [ Warning ] /usr/bin/pstree [ Warning ] /usr/bin/rkhunter [ Warning ] /usr/bin/rpm [ Warning ] /usr/bin/runcon [ Warning ] /usr/bin/sha1sum [ Warning ] /usr/bin/sha224sum [ Warning ] /usr/bin/sha256sum [ Warning ] /usr/bin/sha384sum [ Warning ] /usr/bin/sha512sum [ Warning ] /usr/bin/size [ Warning ] /usr/bin/sort [ Warning ] /usr/bin/stat [ Warning ] /usr/bin/strace [ Warning ] /usr/bin/strings [ Warning ] /usr/bin/sudo [ Warning ] /usr/bin/tail [ Warning ] /usr/bin/test [ Warning ] /usr/bin/top [ Warning ] /usr/bin/touch [ Warning ] /usr/bin/tr [ Warning ] /usr/bin/uniq [ Warning ] /usr/bin/users [ Warning ] /usr/bin/vmstat [ Warning ] /usr/bin/w [ Warning ] /usr/bin/watch [ Warning ] /usr/bin/wc [ Warning ] /usr/bin/wget [ Warning ] /usr/bin/whatis [ Warning ] /usr/bin/whereis [ Warning ] /usr/bin/which [ Warning ] /usr/bin/who [ Warning ] /usr/bin/whoami [ Warning ] /usr/bin/unhide.rb [ Warning ] /usr/bin/gawk [ Warning ] /usr/bin/lwp-request [ Warning ] /usr/bin/heirloom-mailx [ Warning ] /usr/bin/w.procps [ Warning ] /sbin/depmod [ Warning ] /sbin/fsck [ Warning ] /sbin/ifconfig [ Warning ] /sbin/ifdown [ Warning ] /sbin/ifup [ Warning ] /sbin/init [ Warning ] /sbin/insmod [ Warning ] /sbin/ip [ Warning ] /sbin/lsmod [ Warning ] /sbin/modinfo [ Warning ] /sbin/modprobe [ Warning ] /sbin/rmmod [ Warning ] /sbin/route [ Warning ] /sbin/runlevel [ Warning ] /sbin/sulogin [ Warning ] /sbin/sysctl [ Warning ] /bin/bash [ Warning ] /bin/cat [ Warning ] /bin/chmod [ Warning ] /bin/chown [ Warning ] /bin/cp [ Warning ] /bin/date [ Warning ] /bin/df [ Warning ] /bin/dmesg [ Warning ] /bin/echo [ Warning ] /bin/ed [ Warning ] /bin/egrep [ Warning ] /bin/fgrep [ Warning ] /bin/fuser [ Warning ] /bin/grep [ Warning ] /bin/ip [ Warning ] /bin/kill [ Warning ] /bin/less [ Warning ] /bin/login [ Warning ] /bin/ls [ Warning ] /bin/lsmod [ Warning ] /bin/mktemp [ Warning ] /bin/more [ Warning ] /bin/mount [ Warning ] /bin/mv [ Warning ] /bin/netstat [ Warning ] /bin/ping [ Warning ] /bin/ps [ Warning ] /bin/pwd [ Warning ] /bin/readlink [ Warning ] /bin/sed [ Warning ] /bin/sh [ Warning ] /bin/su [ Warning ] /bin/touch [ Warning ] /bin/uname [ Warning ] /bin/which [ Warning ] /bin/dash [ Warning ] It seems that rkhunter 1.4 is oversensitive somehow about changed bin-files ? chkrootkit finds nothing and no warnings too.

    Read the article

  • Is inline SQL still classed as bad practice now that we have Micro ORMs?

    - by Grofit
    This is a bit of an open ended question but I wanted some opinions, as I grew up in a world where inline SQL scripts were the norm, then we were all made very aware of SQL injection based issues, and how fragile the sql was when doing string manipulations all over the place. Then came the dawn of the ORM where you were explaining the query to the ORM and letting it generate its own SQL, which in a lot of cases was not optimal but was safe and easy. Another good thing about ORMs or database abstraction layers were that the SQL was generated with its database engine in mind, so I could use Hibernate/Nhibernate with MSSQL, MYSQL and my code never changed it was just a configuration detail. Now fast forward to current day, where Micro ORMs seem to be winning over more developers I was wondering why we have seemingly taken a U-Turn on the whole in-line sql subject. I must admit I do like the idea of no ORM config files and being able to write my query in a more optimal manner but it feels like I am opening myself back up to the old vulnerabilities such as SQL injection and I am also tying myself to one database engine so if I want my software to support multiple database engines I would need to do some more string hackery which seems to then start to make code unreadable and more fragile. (Just before someone mentions it I know you can use parameter based arguments with most micro orms which offers protection in most cases from sql injection) So what are peoples opinions on this sort of thing? I am using Dapper as my Micro ORM in this instance and NHibernate as my regular ORM in this scenario, however most in each field are quite similar. What I term as inline sql is SQL strings within source code. There used to be design debates over SQL strings in source code detracting from the fundamental intent of the logic, which is why statically typed linq style queries became so popular its still just 1 language, but with lets say C# and Sql in one page you have 2 languages intermingled in your raw source code now. Just to clarify, the SQL injection is just one of the known issues with using sql strings, I already mention you can stop this from happening with parameter based queries, however I highlight other issues with having SQL queries ingrained in your source code, such as the lack of DB Vendor abstraction as well as losing any level of compile time error capturing on string based queries, these are all issues which we managed to side step with the dawn of ORMs with their higher level querying functionality, such as HQL or LINQ etc (not all of the issues but most of them). So I am less focused on the individual highlighted issues and more the bigger picture of is it now becoming more acceptable to have SQL strings directly in your source code again, as most Micro ORMs use this mechanism. Here is a similar question which has a few different view points, although is more about the inline sql without the micro orm context: http://stackoverflow.com/questions/5303746/is-inline-sql-hard-coding

    Read the article

  • Silverlight Cream for June 15, 2010 - 2 -- #883

    - by Dave Campbell
    In this Issue: Vibor Cipan, Chris Klug, Pete Brown, Kirupa, and Xianzhong Zhu. Shoutouts (thought I gave up on them, didn't you?): Jesse Liberty has the companion video to his WP7 OData post up: New Video: Master/Detail in WinPhone 7 with oData Michael Scherotter who made the first Ball Watch SL1 app back in the day, has a Virtual Event: Creating an Entry for the BALL Watch Silverlight Contest... sounds like the thing to do if you want in on this :) Even if you don't speak Portuguese, you can check this out: MSN Brazil Uses Silverlight to Showcase the 2010 FIFA World Cup South Africa Erik Mork and crew have their latest up: This Week in Silverlight – Teched and Quizes Michael Klucher has a post up to give you some relief if you're having Trouble Installing the Windows Phone Developer Tools Portuguese above and now French... Jeremy Alles has a post up about [WP7] Windows Phone 7 challenge for french readers ! Just a note, not that it makes any difference, but Adam Kinney turned @SilverlightNews over to me today. I am the only one that has ever posted on it, but still having it all to myself feels special :) From SilverlightCream.com: Silverlight 4 tutorial: HOW TO use PathListBox and Sample Data Crank up that new version of Blend and follow along with Vibor Cipan's PathListBox tutorial ... oh, and sample data too. Cool INotifyPropertyChanged implementation Chris Klug shows off some INotifyPropertyChange goodness he is not implementing, and credits a blog by Manuel Felicio for some inspiration. Check out that post as well... I've tagged his blog... I needed *another* one :) Silverlight Tip: Using LINQ to Select the Largest Available Webcam Resolution With no Silverlight Tip of the Day today, Pete Brown stepped up with this tip for finding the largest available webcam resolution using LINQ ... and read the comment from Rene as well. Creating a Master-Detail UI in Blend Kirupa has a very nice Master/Detail UI post up with backrounder info and the code for the project. There's a running example in the post for you to get an idea what you're learning. Get started with Farseer Physics 2.1.3 in Silverlight 3 Xianzhong Zhu has a Silverlight 3 tutorial up for Farseer Physics 2.1.3 ... might track for Silverlight 4, but hey, WP7 is kinda/sort Silverlight 3, right? ... lots of code and external links. Stay in the 'Light! Twitter SilverlightNews | Twitter WynApse | WynApse.com | Tagged Posts | SilverlightCream Join me @ SilverlightCream | Phoenix Silverlight User Group Technorati Tags: Silverlight    Silverlight 3    Silverlight 4    Windows Phone MIX10

    Read the article

  • Source of (programmer) inefficiency

    - by Daniel
    I am interested to gain a better insight about the possible reasons of personal inefficiency as programmers (and only in programming) due to – simply - our own errors (because we are humans – well, almost all of us). I am not interested in how much we are productive or in how many adjustements the customer asks for when the work is done, but where and how each of us spend that part of its time in tasks that are unproductive and there is no one to blame except ourselves. Excluding ego - feeding and / or self – gratification, what I am trying to get (for all of us) is: what are the common issues eating our time; insight on reasons for that issues; identify simple way for us, personally (not delegating actions to other or our organizations), to correct our own problems. Please, do not think in academic terms but aim at the opportunity to compare our daily experiences and understand what are and how we try to fix our personal deficiencies. If you are interested to respond to this post, please: integrate the list if you see something important (or obvious) missing; highlight or name honestly your first issue tellng the way you try to address and solve your issue acting on yourself and yourself only in a sort of "continuous quality improving" My criteria for accepting the answer is: choose the best solution (feasibility and utility) to fix one (or more) of the problems of the list. Of course, selecting an error is not a vote on our skills: maybe we are hyper professional programmers and we lose ten minutes only every year or we are terribly inefficient, losing a couple of days a week: reasons for inefficiency could be really the same - but in a different scale. A possible list: Plain error in the names (variables, functions). Inability to see the obvious in your code. Misreading. Lack of concentration. Trying to use a technology you have not mastered. Errors with data types. Time required to understand your previous code or your documentation. Trying to do something more than requested because you enjoy it Using solutions more complicated than required because you enjoy it. Plain logical errors. Errors due to your fault in communications. Distraction My first personal issue: "Trying to use a technology you do not master." I have to use daily several technologies and I often need to spend significant time correcting code because my assumptions were plainly wrong. Reasons for this: production needs put high pressure and make difficult to find the time to learn. I try to address this reading technical books - as many as I can - even if this actually consumes a lot of time.

    Read the article

  • Canonicalization of single, small pages like reviews or product categories [SEO]

    - by Valorized
    In general I pretty much like the idea of canonicalization. And in most cases, Google explains possible procedures in a clear way. For example: If I have duplicates because of parameters (eg: &sort=desc) it's clear to use the canonical for the site, provided the within the head-tag. However I'm wondering how to handle "small - no to say thin content - sites". What's my definition of a small site? An Example: On one of my main sites, we use a directory based url-structure. Let's see: example.com/ (root) example.com/category-abc/ example.com/category-abc/produkt-xy/ Moreover we provide on page, that includes all products example.com/all-categories/ (lists all products the same way as in the categories) In case of reviews, we use a similar structure: example.com/reviews/product-xy/ shows all review for one certain product example.com/reviews/product-xy/abc-your-product-is-great/ shows one certain review example.com/reviews/ shows all reviews for all products (latest first) Let's make it even more complicated: On every product site, there are the latest 2 reviews at the end of the page. So you see, a lot of potential duplicates. Q1: Should I create canonicals for a: example.com/category-abc/ to example.com/all-categories/ b: example.com/reviews/product-xy/abc-your-product-is-great/ to example.com/reviews/product-xy/ or to example.com/review/ or none of them? Q2: Can I link the collection of categories (all-categories/) and collection of all reviews (reviews/ and reviews/product-xy/) to the single category respectively to the single review. Example: example.com/reviews/ includes - let's say - 100 reviews. Can I somehow use a markup that tells search engines: "Hey, wait, you are now looking at a collection of 100 reviews - do not index this collection, you should rather prefer indexing every single review as a single page!". In HTML it might be something like that (which - of course - does not work, it's only to show you what I mean): <div class="review" rel="canonical" href="http://example.com/reviews/product-xz/abc-your-product-is-great/">HERE GOES THE REVIEW</div> Reason: I don't think it is a great user experience if the user searches for "your product is great" and lands on example.com/reviews/ instead of example.com/reviews/product-xy/abc-your-product-is-great/. On the first site, he will have to search and might stop because of frustration. The second result, however, might lead to a conversion. The same applies for categories. If the user is searching for category-Z, he might land on the all-categories page and he has to scroll down to the (last) category, to find what he searched for (Z). So what's best practice? What should I do? Thank you for your help!

    Read the article

  • Oracle????????????????????????~????????????????????

    - by Yusuke.Yamamoto
    RDBMS ???????·????????????????????????????????????????????????????????????????????????? ????????Oracle ?????????????????????????????????? Oracle Database ???????????????????????????????? ????????????????????? ????Oracle???????????????????????????????????????????????????????????????????????????? ?????????????? Oracle Database ???????????????????????? ??????????????????????????????????2????????????? 1. ??????(Query Transformation) Query Transformation ???????SQL??????????????????SQL????????????????????? Query Transformation ???Predicate Transformation ? Common Sub-expression Elimination (CSE), Order-BY Elimination (OBYE), Outer Join Elimination (OJE), Simple View Meging (SVM), Predicate Move around (PM), Complex View Merging (CVM), Sub-query Unnesting (SU), Join Predicate Push Down (JPPD) ???? OR Expansion, Star Transformation (ST) ????????????? ···???????????????????????????????????????????????????? Predicate Transformation ?????? Transitive Predicate Generation ????????????? ?????????????SQL???deptno ? 10 ????????????????????????????? select e.ename, d.loc from emp e, dept d where e.deptno=d.deptno and e.deptno=10; ???????????????emp ??? deptno=10 ??????????????dept ??? d.deptno=10 ??????????????????? emp ?? deptno=10 ????????????????????emp ?? deptno=10 ??????10???????10? dept ????????????dept ??20???????????????????????10?*20?=200?????(??????????·?????????)? ??SQL?? Transitive Predicate Generation ??????SQL????????????????? select e.ename, d.location from emp e, dept d where e.deptno=d.deptno and e.deptno=10 and d.deptno=10; ^^^^^^^^^^^ ??????dept ?????? deptno=10 ??????????????????????????10?*1?=10(dept.deptno ?unique????)?1/20????????????????1/20????????????????10??????????30???????????????Query Transformation ???????????????????????????? ?:??????????? dept ?? 1-row table ??????dept ?? driving ???(Outer Table)??? emp ?? probe ???(Inner Table)????????????1?*10?=10 ????????????????????????????????????????????????????????1/20????????????? ?????? Query Transformation ??????SQL????????????????????????????????? Transformation ??????????????????????????????????? 2. ????·????(Access Path Analysis) Access Path Analysis ??Query Transformation ??SQL????????????(Access Path)?????????(Join Method)?????(Join Order)?????????? ??????????????????(FTS)?ROWID?????????????????????????????·?????(Nested Loop Join)???????(Hash Join)????/?????(Sort Merge Join)????????????????????????????????????????????????????????????????????????? Oracle Database ????????? Query Transformation ???? Logical Optimizer?Access Path Analysis ???? Physical Optimizer ????????? ??????????????????????????????????????????????????????????????????????????????????????????????????? ????????????????????????????????????????????????????? Oracle Database ????????????????????? "Oracle ????????" ?????????? Sustaining Engineering?? ?(??? ???) ???????????????? Sustaining Engineering ????????????????????????Oracle Database ???????????????????????? ?????????????????????Ruby????????????????????????? Oracle????????????????????????! Oracle????????????? Oracle????????????????????????

    Read the article

  • Fast programmatic compare of "timetable" data

    - by Brendan Green
    Consider train timetable data, where each service (or "run") has a data structure as such: public class TimeTable { public int Id {get;set;} public List<Run> Runs {get;set;} } public class Run { public List<Stop> Stops {get;set;} public int RunId {get;set;} } public class Stop { public int StationId {get;set;} public TimeSpan? StopTime {get;set;} public bool IsStop {get;set;} } We have a list of runs that operate against a particular line (the TimeTable class). Further, whilst we have a set collection of stations that are on a line, not all runs stop at all stations (that is, IsStop would be false, and StopTime would be null). Now, imagine that we have received the initial timetable, processed it, and loaded it into the above data structure. Once the initial load is complete, it is persisted into a database - the data structure is used only to load the timetable from its source and to persist it to the database. We are now receiving an updated timetable. The updated timetable may or may not have any changes to it - we don't know and are not told whether any changes are present. What I would like to do is perform a compare for each run in an efficient manner. I don't want to simply replace each run. Instead, I want to have a background task that runs periodically that downloads the updated timetable dataset, and then compares it to the current timetable. If differences are found, some action (not relevant to the question) will take place. I was initially thinking of some sort of checksum process, where I could, for example, load both runs (that is, the one from the new timetable received and the one that has been persisted to the database) into the data structure and then add up all the hour components of the StopTime, and all the minute components of the StopTime and compare the results (i.e. both the sum of Hours and sum of Minutes would be the same, and differences introduced if a stop time is changed, a stop deleted or a new stop added). Would that be a valid way to check for differences, or is there a better way to approach this problem? I can see a problem that, for example, one stop is changed to be 2 minutes earlier, and another changed to be 2 minutes later would have a net zero change. Or am I over thinking this, and would it just be simpler to brute check all stops to ensure that The updated run stops at the same stations; and Each stop is at the same time

    Read the article

  • Don Knuth and MMIXAL vs. Chuck Moore and Forth -- Algorithms and Ideal Machines -- was there cross-pollination / influence in their ideas / work?

    - by AKE
    Question: To what extent is it known (or believed) that Chuck Moore and Don Knuth had influence on each other's thoughts on ideal machines, or their work on algorithms? I'm interested in citations, interviews, articles, links, or any other sort of evidence. It could also be evidence of the form of A and B here suggest that Moore might have borrowed or influenced C and D from Knuth here, or vice versa. (Opinions are of course welcome, but references / links would be better!) Context: Until fairly recently, I have been primarily familiar with Knuth's work on algorithms and computing models, mostly through TAOCP but also through his interviews and other writings. However, the more I have been using Forth, the more I am struck by both the power of a stack-based machine model, and the way in which the spareness of the model makes fundamental algorithmic improvements more readily apparent. A lot of what Knuth has done in fundamental analysis of algorithms has, it seems to me, a very similar flavour, and I can easily imagine that in a parallel universe, Knuth might perhaps have chosen Forth as his computing model. That's the software / algorithms / programming side of things. When it comes to "ideal computing machines", Knuth in the 70s came up with the MIX computer model, and then, collaborating with designers of state-of-the-art RISC chips through the 90s, updated this with the modern MMIX model and its attendant assembly language MMIXAL. Meanwhile, Moore, having been using and refining Forth as a language, but using it on top of whatever processor happened to be in the computer he was programming, began to imagine a world in which the efficiency and value of stack-based programming were reflected in hardware. So he went on in the 80s to develop his own stack-based hardware chips, defining the term MISC (Minimal Instruction Set Computers) along the way, and ending up eventually with the first Forth chip, the MuP21. Both are brilliant men with keen insight into the art of programming and algorithms, and both work at the intersection between algorithms, programs, and bare metal hardware (i.e. hardware without the clutter of operating systems). Which leads me to the headlined question... Question:To what extent is it known (or believed) that Chuck Moore and Don Knuth had influence on each other's thoughts on ideal machines, or their work on algorithms?

    Read the article

  • Working with Reporting Services Filters – Part 3: The TOP and BOTTOM Operators

    - by smisner
    Thus far in this series, I have described using the IN operator and the LIKE operator. Today, I’ll continue the series by reviewing the TOP and BOTTOM operators. Today, I happened to be working on an example of using the TOP N operator and was not successful on my first try because the behavior is just a bit different than we find when using an “equals” comparison as I described in my first post in this series. In my example, I wanted to display a list of the top 5 resellers in the United States for AdventureWorks, but I wanted it based on a filter. I started with a hard-coded filter like this: Expression Data Type Operator Value [ResellerSalesAmount] Float Top N 5 And received the following error: A filter value in the filter for tablix 'Tablix1' specifies a data type that is not supported by the 'TopN' operator. Verify that the data type for each filter value is Integer. Well, that puzzled me. Did I really have to convert ResellerSalesAmount to an integer to use the Top N operator? Just for kicks, I switched to the Top % operator like this: Expression Data Type Operator Value [ResellerSalesAmount] Float Top % 50 This time, I got exactly the results I expected – I had a total of 10 records in my dataset results, so 50% of that should yield 5 rows in my tablix. So thinking about the problem with Top N some  more, I switched the Value to an expression, like this: Expression Data Type Operator Value [ResellerSalesAmount] Float Top N =5 And it worked! So the value for Top N or Top % must reflect a number to plug into the calculation, such as Top 5 or Top 50%, and the expression is the basis for determining what’s in that group. In other words, Reporting Services will sort the rows by the expression – ResellerSalesAmount in this case – in descending order, and then filter out everything except the topmost rows based on the operator you specify. The curious thing is that, if you’re going to hard-code the value, you must enter the value for Top N with an equal sign in front of the integer, but you can omit the equal sign when entering a hard-coded value for Top %. This experience is why working with Reporting Services filters is not always intuitive! When you use a report parameter to set the value, you won’t have this problem. Just be sure that the data type of the report parameter is set to Integer. Jessica Moss has an example of using a Top N filter in a tablix which you can view here. Working with Bottom N and Bottom % works similarly. You just provide a number for N or for the percentage and Reporting Services works from the bottom up to determine which rows are kept and which are excluded.

    Read the article

  • What&rsquo;s new in RadChart for 2010 Q1 (Silverlight / WPF)

    Greetings, RadChart fans! It is with great pleasure that I present this short highlight of our accomplishments for the Q1 release :). Weve worked very hard to make the best silverlight and WPF charting product even better. Here is some of what we did during the past few months.   1) Zooming&Scrolling and the new sampling engine: Without a doubt one of the most important things we did. This new feature allows you to bind your chart to a very large set of data with blazing performance. Dont take my word for it give it a try!   2) New Smart Label Positioning and Spider-like labels feature: This new feature really helps with very busy graphs. You can play with the different settings we offer in this example.   3) Sorting and Filtering. Much like our RadGridview control the chart now allows you to sort and filter your data out of the box with a single line of code!   4) Legend improvements Weve also been paying attention to those of you who wanted a much improved legend. It is now possible to customize the look and feel of legend items and legend position with a single click.   5) Custom palette brushes. You have told us that you want to easily customize all palette colors using a single clean API from both XAML and code behind. The new custom palette brushes API does exactly that.   There are numerous other improvements as well, as much improved themes, performance optimizations and other features that we did. If you want to dig in further check the release notes and changes and backwards compatibility topics.   Feel free to share the pains and gains of working with RadChart. Our team is always open to receiving constructive feedback and beer :-)Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

    Read the article

  • How to store Role Based Access rights in web application?

    - by JonH
    Currently working on a web based CRM type system that deals with various Modules such as Companies, Contacts, Projects, Sub Projects, etc. A typical CRM type system (asp.net web form, C#, SQL Server backend). We plan to implement role based security so that basically a user can have one or more roles. Roles would be broken down by first the module type such as: -Company -Contact And then by the actions for that module for instance each module would end up with a table such as this: Role1 Example: Module Create Edit Delete View Company Yes Owner Only No Yes Contact Yes Yes Yes Yes In the above case Role1 has two module types (Company, and Contact). For company, the person assigned to this role can create companies, can view companies, can only edit records he/she created and cannot delete. For this same role for the module contact this user can create contacts, edit contacts, delete contacts, and view contacts (full rights basically). I am wondering is it best upon coming into the system to session the user's role with something like a: List<Role> roles; Where the Role class would have some sort of List<Module> modules; (can contain Company, Contact, etc.).? Something to the effect of: class Role{ string name; string desc; List<Module> modules; } And the module action class would have a set of actions (Create, Edit, Delete, etc.) for each module: class ModuleActions{ List<Action> actions; } And the action has a value of whether the user can perform the right: class Action{ string right; } Just a rough idea, I know the action could be an enum and the ModuleAction can probably be eliminated with a List<x, y>. My main question is what would be the best way to store this information in this type of application: Should I store it in the User Session state (I have a session class where I manage things related to the user). I generally load this during the initial loading of the application (global.asax). I can simply tack onto this session. Or should this be loaded at the page load event of each module (page load of company etc..). I eventually need to be able to hide / unhide various buttons / divs based on the user's role and that is what got me thinking to load this via session. Any examples or points would be great.

    Read the article

  • Best practices for logging and tracing in .NET

    - by Levidad
    I've been reading a lot about tracing and logging, trying to find some golden rule for best practices in the matter, but there isn't any. People say that good programmers produce good tracing, but put it that way and it has to come from experience. I've also read similar questions in here and through the internet and they are not really the same thing I am asking or do not have a satisfying answer, maybe because the questions lack some detail. So, folks say that tracing should sort of replicate the experience of debugging the application in cases where you can't attach a debugger. It should provide enough context so that you can see which path is taken at each control point in the application. Going deeper, you can even distinguish between tracing and event logging, in that "event logging is different from tracing in that it captures major states rather than detailed flow of control". Now, say I want to do my tracing and logging using only the standard .NET classes, those in the System.Diagnostics namespace. I figured that the TraceSource class is better for the job than the static Trace class, because I want to differentiate among the trace levels and using the TraceSource class I can pass in a parameter informing the event type, while using the Trace class I must use Trace.WriteLineIf and then verify things like SourceSwitch.TraceInformation and SourceSwitch.TraceErrors, and it doesn't even have properties like TraceVerbose or TraceStart. With all that in mind, would you consider a good practice to do as follows: Trace a "Start" event when begining a method, which should represent a single logical operation or a pipeline, along with a string representation of the parameter values passed in to the method. Trace an "Information" event when inserting an item into the database. Trace an "Information" event when taking one path or another in an important if/else statement. Trace a "Critical" or "Error" in a catch block depending on weather this is a recoverable error. Trace a "Stop" event when finishing the execution of the method. And also, please clarify when best to trace Verbose and Warning event types. If you have examples of code with nice trace/logging and are willing to share, that would be excelent. Note: I've found some good information here, but still not what I am looking for: http://msdn.microsoft.com/en-us/magazine/ff714589.aspx Thanks in advance!

    Read the article

  • Stagnating in programming

    - by Coder
    Time after time this question came up in my mind, but up until today I wasn't thinking about it much. I have been programming for maybe around 8 years now, and for the last two years it seems I'm not as keen to pick up new technologies anymore. Maybe that's a burnout or something, but I'd say it's experience and what I like, that's stopping me from running after the latest and greatest. I'm C++ developer, by this I mean, I love close to metal programming. I have no problems tracing problems through assembly, using tools like WinDbg or HexView. When I use constructs, I think about how they are realized underneath, how the bits are set and unset under the hood. I love battling with complex threading problems and doing everything hardcore way, even by hand if the regular solutions seem half baked. But I also love the C++0x stuff, and use it a lot. And all C++ code as long as it's not cumbersome compared to C counterparts, sometimes I also fall back to sort of "Super C" if the C++ way is ugly. And then there are all other developers who seem to be way more forward looking, .Net 4.0 MVC, WPF, all those Microsoft X#s, LINQ languages, XML and XSLT, mobile devices and so on. I have done a considerable amount of .NET, SQL, ASPX programming, but the further I go, the less I want to try those technologies. Is that bad? Almost every day I hear people saying that managed code is the only way forward, WPF is the way to go. I hear that C++ is godawful, and you can't code anything in it that's somewhat stable. But I don't buy it. With the experience I have, and the knowledge of how native code is compiled and executes, I can say I find it extremely rare that C++ code is unstable, or leaks, or causes crashes that takes more than 30 seconds to identify and fix. And to tell the truth, I've seen enough problems with other "cool" languages that I'd say C++ is even more stable and production proof than the safe languages, at least for me. The only thing that scares me in C++ is new frameworks, I don't trust them, and I use them extra sparingly. STL - yes, ATL - very sparingly, everything else... Well, not very keen on it. Most huge problems I've ran into, all were related to frameworks, not the language itself. Some overrided operator here, bad hierarchy there, poor class design here, mystical castings there. Other than that, C/C++ (yes, I use them together) still seems a very controlled and stable way to develop applications. Am I stagnating? Should I switch a profession, or force myself in all that marketing hype? Are there more developers who feel the same way?

    Read the article

  • Video stutter when using external drive

    - by psion
    When using boxee to play video files off of an external western digital 1TB drive formatted NTFS, I notice a slight stutter in the video every 5-10 seconds. When using mplayer, it doesn't stutter as often, but it still stutters occasionally. If I play the video off of the local sata drive, it plays fine even in boxee. I use this computer as my HTPC and I just switched from windows to linux on it. In windows, I never had any sort of stutter playing movies from the drive. I am using the latest intel graphics drivers (for the intel GMA 950) root@eee-htpc:/home/htpc# grep wd /etc/mtab /dev/sdb1 /mnt/wd2 fuseblk rw,nosuid,nodev,allow_other,blksize=512 0 0 I notice that despite trying to use ntfs or ntfs-3g, ubuntu uses ntfs-fuse which I've heard is slower. /dev/sdb1: Timing buffered disk reads: 80 MB in 3.07 seconds = 26.08 MB/sec root@eee-htpc:/mnt/wd2# dd if=/dev/zero of=./120mb bs=1024 count=120000 root@eee-htpc:/mnt/wd2# time mv ./120mb /home/htpc real 0m2.095s user 0m0.016s sys 0m0.736s Even though fuse has a reputation for being slow, it should easily be fast enough for playing standard definition video files. So why the video stutter? edit: The issue seems to be overhead cpu usage from either playing off of a usb device or ntfs/fuse. Watching CPU usage with top, local files use 10-40% CPU. Watching the same video on the external formatted ntfs, it spikes to 170% (over 100% because of hyperthreading). To me it seems like it must be overhead from the fuse driver, though I don't know if it has more or less overhead than ntfs-3g. It's a EEEBox B202 that has an atom 270, so not exactly the most powerful out there. edit2: I believe the solution would be to use non-fuse drivers or different fuse drivers. so far I have not been able to. edit3: I've probably edited this more times than I should, but as an update I have upgraded ntfs drivers to ntfs-3g 2010.8.8 external FUSE 28 - Third Generation NTFS Driver using the following PPA - ppa:x3lectric/team-iquik-releases. When first opening a video file in boxee that's on ntfs there's still the same amount of lag. After a few minutes of video, the lag seems to go away and the cpu usage comes down to 10-40%. Every so often though, it begins to stutter again. Also, if I skip ahead/back in the file, it begins to stutter a lot.

    Read the article

  • Canonicalization of single, small pages like reviews or product categories

    - by Valorized
    In general I pretty much like the idea of canonicalization. And in most cases, Google explains possible procedures in a clear way. For example: If I have duplicates because of parameters (eg: &sort=desc) it's clear to use the canonical for the site, provided the within the head-tag. However I'm wondering how to handle "small - no to say thin content - sites". What's my definition of a small site? An Example: On one of my main sites, we use a directory based url-structure. Let's see: example.com/ (root) example.com/category-abc/ example.com/category-abc/produkt-xy/ Moreover we provide on page, that includes all products example.com/all-categories/ (lists all products the same way as in the categories) In case of reviews, we use a similar structure: example.com/reviews/product-xy/ shows all review for one certain product example.com/reviews/product-xy/abc-your-product-is-great/ shows one certain review example.com/reviews/ shows all reviews for all products (latest first) Let's make it even more complicated: On every product site, there are the latest 2 reviews at the end of the page. So you see, a lot of potential duplicates. Q1: Should I create canonicals for a: example.com/category-abc/ to example.com/all-categories/ b: example.com/reviews/product-xy/abc-your-product-is-great/ to example.com/reviews/product-xy/ or to example.com/review/ or none of them? Q2: Can I link the collection of categories (all-categories/) and collection of all reviews (reviews/ and reviews/product-xy/) to the single category respectively to the single review. Example: example.com/reviews/ includes - let's say - 100 reviews. Can I somehow use a markup that tells search engines: "Hey, wait, you are now looking at a collection of 100 reviews - do not index this collection, you should rather prefer indexing every single review as a single page!". In HTML it might be something like that (which - of course - does not work, it's only to show you what I mean): <div class="review" rel="canonical" href="http://example.com/reviews/product-xz/abc-your-product-is-great/"> HERE GOES THE REVIEW</div> Reason: I don't think it is a great user experience if the user searches for "your product is great" and lands on example.com/reviews/ instead of example.com/reviews/product-xy/abc-your-product-is-great/. On the first site, he will have to search and might stop because of frustration. The second result, however, might lead to a conversion. The same applies for categories. If the user is searching for category-Z, he might land on the all-categories page and he has to scroll down to the (last) category, to find what he searched for (Z). So what's best practice? What should I do?

    Read the article

  • Monitoring Your Servers

    - by Grant Fritchey
    If you are the DBA in a large scale enterprise, you’re probably already monitoring your servers for up-time and performance. But if you work for a medium-sized business, a small shop, or even a one-man operation, chances are pretty good that you’re not doing that sort of monitoring. You know that you’re supposed to be doing it, but other things, more important at-the-moment things, keep getting in the way. After all, which is more important, some monitoring or backup testing?  Backup testing, of course. Monitoring is frequently one of those things that you do when can get around to it.  Well, as you can see at the right, I have your round tuit ready to go. What if I told you that you could get monitoring on your servers for up-time, job completion, performance, all the standard stuff? And what if I told you that you wouldn’t need to install and configure another server in your environment to get it done? And what if I told you that you’d be able to set up and customize your alerts so you could know if your server was offline or a drive was full? Almost nothing for you to do, and you’ll have a full-blown monitoring process. Sounds to good to be true doesn’t it? Well, it’s coming. We’re creating an online, remote, monitoring system here at Red Gate. You’ll be able to use our SQL Monitor tool (which you can see here, monitoring SQL Server Central in real time) to keep track of your systems, but without having to set up a server and a database for storing the information collected. Instead, we’re taking advantage of services available through the internet to enable collection and storage of this information remotely, off your systems. All you have to do is install a piece of software that will communicate between our service and your servers and you’ll be off and running. It’s that easy. Before you get too excited, let me break the news that this is the near future I’m talking about. We’re setting up the program and there’s a sign-up you can use to get in on the initial tests.

    Read the article

  • Should I be an algorithm developer, or java web frameworks type developer?

    - by Derek
    So - as I see it, there are really two kinds of developers. Those that do frameworks, web services, pretty-making front ends, etc etc. Then there are developers that write the algorithms that solve the problem. That is, unless the problem is "display this raw data in some meaningful way." In that case, the framework/web developer guy might be doing both jobs. So my basic problem is this. I have been an algorithms kind of software developer for a few years now. I double majored in Math and Computer science, and I have a master's in systems engineering. I have never done any web-dev work, with the exception of a couple minor jobs, and some hobby level stuff. I have been job interviewing lately, and this is what happens: Job is listed as "programmer- 5 years of experience with the following: C/C++, Java,Perl, Ruby, ant, blah blah blah" Recruiter calls me, says they want me to come in for interview In the interview, find out they have some webservices development, blah blah blah When asked in the interview, talk about my experience doing algorithms, optimization, blah blah..but very willing to learn new languages, frameworks, etc Get a call back saying "we didn't think you were a fit for the job you interviewed wtih, but our algorithm team got wind of you and wants to bring you on" This has happened to me a couple times now - see a vague-ish job description looking for a "programmer" Go in, find out they are doing some sort of web-based tool, maybe with some hardcore algorithms running in the background. interview with people for the web-based tool, but get an offer from the algorithms people. So the question is - which job is the better job? I basically just want to get a wide berth of experience at this level of my career, but are algorithm developers so much in demand? Even more so than all these supposed hot in demand web developer guys? Will I be ok in the long run if I go into the niche of math based algorithm development, and just little to no, or hobby level web-dev experience? I basically just don't want to pigeon hole myself this early. My salary is already starting to get pretty high - and I can see a company later on saying "we really need a web developer, but we'll hire this 50k/year college guy, instead of this 100k/year experience algorithm guy" Cliffs notes: I have been doing algorithm development. I consider myself to be a "good programmer." I would have no problem picking up web technologies and those sorts of frameworks. During job interviews, I keep getting "we think you've got a good skillset - talk to our algorithm team" instead of wanting me to learn new skills on the job to do their web services or whhatever other new technology they are doing. Edit: Whenever I am talking about algorithm development here - I am talking about the code that produces the answer. Typically I think of more math-based algorithms: solving a financial problem, solving a finite element method, image processing, etc

    Read the article

  • Camera not staying behind model while moving in circle

    - by ChocoMan
    I have a camera behind a model (3rd Person) and I'm having problems KEEPING it behind the model. When I first start my game, you see the back of the model. If the model moves forward, backward or strafe left or right, the camera moves along accordingly. When the model rotates (stationary), the camera rotates accordingly with the model still pointing at the model's back. So far, so good. The problem comes when the player is BOTH moving and rotating at the same time. Take for example a model moving in a circular pattern like running around a track. As the model moves in this motion, the model rotates slightly more with each complete rotation. Eventually, instead of looking at the model's back, eventually you will see the model in a profile view and before you know it, the model's front is facing the camera. And when you stop moving the model, the model stays in that position. So, as long as my model is stationary and rotating in one place, the camera rotates correctly. But as soon as there is any sort movement while rotating, the model is offset by a mysterious increasing amount. How can I keep the camera maintaining the same view no matter how I move AND rotate at the same time? // Rotates model and pitches camera on its own axis public void modelRotMovement(GamePadState pController) { /* For rotating the model left or right. * Camera maintains distance from model * throughout rotation and if model moves * to a new position. */ Yaw = pController.ThumbSticks.Right.X * MathHelper.ToRadians(speedAngleMAX); AddRotation = Quaternion.CreateFromAxisAngle(Vector3.Up, yaw); //AddRotation = Quaternion.CreateFromYawPitchRoll(Yaw, 0, 0); ModelLoad.MRotation *= AddRotation; MOrientation = Matrix.CreateFromQuaternion(ModelLoad.MRotation); Pitch = pController.ThumbSticks.Right.Y * MathHelper.ToRadians(speedAngleMAX); AddPitch = Quaternion.CreateFromAxisAngle(Vector3.Up, pitch); ModelLoad.CRotation *= AddPitch; COrientation = Matrix.CreateFromQuaternion(ModelLoad.CRotation); } // Orbit (yaw) Camera around model public void cameraYaw(float yaw) { Vector3 yawAngle = ModelLoad.CameraPos - ModelLoad.camTarget; Vector3 axisYaw = Vector3.Up; ModelLoad.CameraPos = Vector3.Transform(yawAngle, Matrix.CreateFromAxisAngle(axisYaw, yaw)) + ModelLoad.camTarget; }

    Read the article

  • Understanding Data Science: Recent Studies

    - by Joe Lamantia
    If you need such a deeper understanding of data science than Drew Conway's popular venn diagram model, or Josh Wills' tongue in cheek characterization, "Data Scientist (n.): Person who is better at statistics than any software engineer and better at software engineering than any statistician." two relatively recent studies are worth reading.   'Analyzing the Analyzers,' an O'Reilly e-book by Harlan Harris, Sean Patrick Murphy, and Marck Vaisman, suggests four distinct types of data scientists -- effectively personas, in a design sense -- based on analysis of self-identified skills among practitioners.  The scenario format dramatizes the different personas, making what could be a dry statistical readout of survey data more engaging.  The survey-only nature of the data,  the restriction of scope to just skills, and the suggested models of skill-profiles makes this feel like the sort of exercise that data scientists undertake as an every day task; collecting data, analyzing it using a mix of statistical techniques, and sharing the model that emerges from the data mining exercise.  That's not an indictment, simply an observation about the consistent feel of the effort as a product of data scientists, about data science.  And the paper 'Enterprise Data Analysis and Visualization: An Interview Study' by researchers Sean Kandel, Andreas Paepcke, Joseph Hellerstein, and Jeffery Heer considers data science within the larger context of industrial data analysis, examining analytical workflows, skills, and the challenges common to enterprise analysis efforts, and identifying three archetypes of data scientist.  As an interview-based study, the data the researchers collected is richer, and there's correspondingly greater depth in the synthesis.  The scope of the study included a broader set of roles than data scientist (enterprise analysts) and involved questions of workflow and organizational context for analytical efforts in general.  I'd suggest this is useful as a primer on analytical work and workers in enterprise settings for those who need a baseline understanding; it also offers some genuinely interesting nuggets for those already familiar with discovery work. We've undertaken a considerable amount of research into discovery, analytical work/ers, and data science over the past three years -- part of our programmatic approach to laying a foundation for product strategy and highlighting innovation opportunities -- and both studies complement and confirm much of the direct research into data science that we conducted. There were a few important differences in our findings, which I'll share and discuss in upcoming posts.

    Read the article

  • Challenge: Learn One New Thing Today

    - by BuckWoody
    Most of us know that there's a lot to learn. I'm teaching a class this morning, and even on the subject where I'm the "expert" (that word always makes me nervous!) I still have a lot to learn. To learn, sometimes I take a class, read a book, or carve out a large chunk of time so that I can fully grasp the subject. But since I've been working, I really don't have a lot of opportunities to do that. Like you, I'm really busy. So what I've been able to learn is to take just a few moments each day and learn something new about SQL Server. I thought I would share that process here. First, I started with an outline of the product. You can use Books Online, a college class syllabus, a training class outline, or a comprehensive book table of contents. Then I checked off the things I felt I knew a little about. Sure, I'll come back around to those, but I want to be as efficient as I can. I then trolled various checklists to see what I needed to know about the subjects I didn't have checked off. From there (I'm doing all this in a notepad, and then later in OneNote when that came out) I developed a block of text for that subject. Every time I ran across a book, article, web site or recording on that topic I wrote that reference down. Later I went back and quickly looked over those resources and tried to figure out how I could parcel it out - 10 minutes for this one, a free seminar (like the one I'm teaching today - ironic) takes 4 hours, a web site takes an hour to grok, that sort of thing.  Then all I did was figure out how much time each day I'll give to training. Sure, it literally may be ten minutes, but it adds up. One final thing - as I used something I learned, I came back and made notes in that topic. You learn to play the piano not just from a book, but by playing the piano, after all. If you don't use what you learn, you'll lose it. So if you're interested in getting better at SQL Server, and you're willing to do a little work, try out this method. Leave a note here for others to encourage them.  Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

    Read the article

  • Internal Mutation of Persistent Data Structures

    - by Greg Ros
    To clarify, when I mean use the terms persistent and immutable on a data structure, I mean that: The state of the data structure remains unchanged for its lifetime. It always holds the same data, and the same operations always produce the same results. The data structure allows Add, Remove, and similar methods that return new objects of its kind, modified as instructed, that may or may not share some of the data of the original object. However, while a data structure may seem to the user as persistent, it may do other things under the hood. To be sure, all data structures are, internally, at least somewhere, based on mutable storage. If I were to base a persistent vector on an array, and copy it whenever Add is invoked, it would still be persistent, as long as I modify only locally created arrays. However, sometimes, you can greatly increase performance by mutating a data structure under the hood. In more, say, insidious, dangerous, and destructive ways. Ways that might leave the abstraction untouched, not letting the user know anything has changed about the data structure, but being critical in the implementation level. For example, let's say that we have a class called ArrayVector implemented using an array. Whenever you invoke Add, you get a ArrayVector build on top of a newly allocated array that has an additional item. A sequence of such updates will involve n array copies and allocations. Here is an illustration: However, let's say we implement a lazy mechanism that stores all sorts of updates -- such as Add, Set, and others in a queue. In this case, each update requires constant time (adding an item to a queue), and no array allocation is involved. When a user tries to get an item in the array, all the queued modifications are applied under the hood, requiring a single array allocation and copy (since we know exactly what data the final array will hold, and how big it will be). Future get operations will be performed on an empty cache, so they will take a single operation. But in order to implement this, we need to 'switch' or mutate the internal array to the new one, and empty the cache -- a very dangerous action. However, considering that in many circumstances (most updates are going to occur in sequence, after all), this can save a lot of time and memory, it might be worth it -- you will need to ensure exclusive access to the internal state, of course. This isn't a question about the efficacy of such a data structure. It's a more general question. Is it ever acceptable to mutate the internal state of a supposedly persistent or immutable object in destructive and dangerous ways? Does performance justify it? Would you still be able to call it immutable? Oh, and could you implement this sort of laziness without mutating the data structure in the specified fashion?

    Read the article

< Previous Page | 252 253 254 255 256 257 258 259 260 261 262 263  | Next Page >