Search Results

Search found 1848 results on 74 pages for 'significant'.

  • debugging JBoss 100% CPU usage

    - by NateS
    Originally posted on Server Fault, where it was suggested this question might be better asked here. We are using JBoss to run two of our WARs. One is our web app, the other is our web service. The web app accesses a database on another machine and makes requests to the web service. The web service makes JMS requests to other machines, aggregates the data, and returns it. At our biggest client, about once a month the JBoss Java process takes 100% of all CPUs. The machine running JBoss has 8 CPUs. Our web app is still accessible during this time; however, pages take about 3 minutes to load. Restarting JBoss restores everything to normal. The database machine and all the other machines are fine; only the machine running JBoss is affected. Memory usage is normal. Network utilization is normal. There are no suspect error messages in the JBoss logs. I have set up a test environment as close as possible to the client's production environment and I've done load testing with as many as 2x the number of concurrent users. I have not gotten my test environment to replicate the problem. Where do we go from here? How can we narrow down the problem? Currently the only plan we have is to wait until the problem occurs in production on its own, then do some debugging to determine the cause. So far people have just restarted JBoss when the problem occurred to minimize downtime. Next time it happens they will get a developer to take a look. The question is: next time it happens, what can be done to determine the cause? We could set up a separate JBoss instance on the same box and install the web app separately from the web service. This way, when the problem next occurs, we will know which WAR has the problem (assuming it is our code). This doesn't narrow it down much, though. Should I enable JMX remote? That way, the next time the problem occurs I can connect with VisualVM and see which threads are taking the CPU and what the hell they are doing. However, is there a significant downside to enabling JMX remote in a production environment? Is there another way to see what threads are eating the CPU and to get a stacktrace to see what they are doing? Any other ideas? Thanks!
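
    A low-overhead alternative to enabling JMX remote is to correlate per-thread CPU usage from the OS with a jstack thread dump taken at the same moment: on Linux, the native thread id reported by ps/top -H matches the nid=0x... value in jstack output. The Python 3 sketch below is only illustrative (it assumes a Linux host, the JDK's jstack on the PATH, and Python 3.7+); it is not specific to JBoss.

        # Hypothetical helper: print the stacks of the threads currently burning CPU.
        import re
        import subprocess
        import sys

        def hottest_threads(pid, limit=5):
            # one line per native thread: "tid pcpu"
            out = subprocess.check_output(
                ["ps", "-p", str(pid), "-Lo", "tid,pcpu", "--no-headers"], text=True)
            threads = sorted(((float(cpu), int(tid))
                              for tid, cpu in (line.split() for line in out.splitlines())),
                             reverse=True)
            return threads[:limit]

        def stacks_by_nid(pid):
            dump = subprocess.check_output(["jstack", str(pid)], text=True)
            stacks = {}
            for block in dump.split("\n\n"):          # jstack separates threads with blank lines
                m = re.search(r"nid=0x([0-9a-f]+)", block)
                if m:
                    stacks[int(m.group(1), 16)] = block
            return stacks

        if __name__ == "__main__":
            pid = int(sys.argv[1])                    # the JBoss java process id
            stacks = stacks_by_nid(pid)
            for cpu, tid in hottest_threads(pid):
                print("=== tid %d, %.1f%% cpu ===" % (tid, cpu))
                print(stacks.get(tid, "<no matching Java thread>"))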

    Read the article

  • Why does PostgreSQL query performance drop over time, but get restored when the index is rebuilt?

    - by Jim Rush
    According to this page in the manual, indexes don't need to be maintained. However, we are running with a PostgreSQL table that has a continuous rate of updates, deletes and inserts, and that over time (a few days) sees significant query degradation. If we delete and recreate the index, query performance is restored. We are using out-of-the-box settings. The table in our test currently starts out empty and grows to half a million rows. It has a fairly large row (lots of text fields). Our search is based on an index, not the primary key (I've confirmed the index is being used, at least under normal conditions). The table is being used as a persistent store for a single process. We are using PostgreSQL on Windows with a Java client. I'm willing to give up insert and update performance to keep up the query performance. We are considering rearchitecting the application so that data is spread across various dynamic tables in a manner that allows us to drop and rebuild indexes periodically without impacting the application. However, as always, there is a time crunch to get this to work and I suspect we are missing something basic in our configuration or usage. We have considered forcing vacuuming and rebuild to run at certain times, but I suspect the locking period for such an action would cause our query to block. This may be an option, but there are some real-time (windows of 3-5 seconds) implications that require other changes in our code. Additional information: Table and index: CREATE TABLE icl_contacts ( id bigint NOT NULL, campaignfqname character varying(255) NOT NULL, currentstate character(16) NOT NULL, xmlscheduledtime character(23) NOT NULL, ... 25 or so other fields, most of them fixed or varying character fields ... CONSTRAINT icl_contacts_pkey PRIMARY KEY (id) ) WITH (OIDS=FALSE); ALTER TABLE icl_contacts OWNER TO postgres; CREATE INDEX icl_contacts_idx ON icl_contacts USING btree (xmlscheduledtime, currentstate, campaignfqname); Analyze: Limit (cost=0.00..3792.10 rows=750 width=32) (actual time=48.922..59.601 rows=750 loops=1) - Index Scan using icl_contacts_idx on icl_contacts (cost=0.00..934580.47 rows=184841 width=32) (actual time=48.909..55.961 rows=750 loops=1) Index Cond: ((xmlscheduledtime < '2010-05-20T13:00:00.000'::bpchar) AND (currentstate = 'SCHEDULED'::bpchar) AND ((campaignfqname)::text = '.main.ee45692a-6113-43cb-9257-7b6bf65f0c3e'::text)) And, yes, I am aware there are a variety of things we could do to normalize and improve the design of this table. Some of these options may be available to us. My focus in this question is about understanding how PostgreSQL is managing the index and query over time (understanding why, not just fixing it). If it were to be done over or significantly refactored, there would be a lot of changes.
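
    What is described here usually turns out to be index bloat from the continuous update/delete churn; tuning autovacuum for this table is the usual first step. If a periodic rebuild is still wanted, CREATE INDEX CONCURRENTLY lets you build a fresh copy of the index without the long lock a plain REINDEX takes, then swap it in. A minimal sketch, assuming the psycopg2 driver (not part of the question) and the table/index names above:

        import psycopg2

        conn = psycopg2.connect("dbname=mydb user=postgres")
        conn.autocommit = True   # CREATE INDEX CONCURRENTLY cannot run inside a transaction
        cur = conn.cursor()

        # Build a fresh, un-bloated copy of the index, then swap it in.
        cur.execute("""CREATE INDEX CONCURRENTLY icl_contacts_idx_new
                       ON icl_contacts USING btree (xmlscheduledtime, currentstate, campaignfqname)""")
        cur.execute("DROP INDEX icl_contacts_idx")            # takes only a brief lock
        cur.execute("ALTER INDEX icl_contacts_idx_new RENAME TO icl_contacts_idx")
        cur.execute("ANALYZE icl_contacts")
        conn.close()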

    Read the article

  • Soon to be PhD in Computer Science - Which Path to Follow?

    - by mttr
    I am going to submit my PhD thesis within the next six months. My PhD is on managing the availability of large-scale distributed systems, so I have some experience actually building non-trivial systems (+ I have four years' experience working as a programmer). I am now trying to figure out what I should do following the PhD. I enjoy research (a quick definition: identify a problem, come up with a solution, ask interesting questions, find ways to answer them, build a system, experiment, contribute some new knowledge and publish). I also like teaching and supervising students. It would seem that a career in academia is the ideal thing to do (I can work on non-trivial problems and contribute something of use to at least some people). However, a career in academia has two significant drawbacks. First, it can be difficult to gain access to real systems with real users which then display real problems. This creates the danger that you do work that seems important (to you and maybe to some of your colleagues), but is not really relevant to anything or anyone. Second, the pay is pretty sad. Apparently, you have to sacrifice this for the privilege of doing research. I enjoy programming, but don't just want to hack on some web-based system for the rest of my life. That is, working in IT for a bank is not a future I see myself enjoying. I want to work on interesting problems (that's difficult to define clearly): things where you don't know how to start, that take some time to figure out and attack, that require a rigorous approach to demonstrate that the problem has been solved, and problems that need a solution in the real world. Given the experience of people on Stack Overflow, what do you think suitable options are and why (or alternatively, what gaps in my thinking does the above reveal)? Is industrial research (e.g. IBM Research, Microsoft Research) the only alternative avenue to a career in academia? What other areas, companies, occupations, etc. could provide me with stimulating, inspiring work? In which regions or countries am I most likely to find such work? Please share your experience.

    Read the article

  • What was your most impressive technical programming achievement performed to impress a romantic interest?

    - by DVK
    OK, so the archetypal human story is for a guy to go out and impress the girl with some wonderful achievement like slaying a dragon or building a monument or conquering a neighboring tribe. This being the enlightened 21st century on SO, let's morph this into: a StackOverflower performing a feat of programming to impress a romantic interest. There are two ways to do this: Technical achievement: Impressing a person with a suitable background/understanding of programming with the actual coding powers you displayed. A dumb movie example would be that kid in the movie "Hackers" showing off his hacking skills in front of Angelina Jolie. Artistic achievement: Impressing a person with the result of running said code, whether or not they understand just how incredible the code itself is. An example is the animated ANSI rose (for a guy who actually wrote the ANSI code). This question is only about the first kind (technical achievements) - e.g. the person of interest was presented with impressive code/design that (s)he was able to properly appreciate. Rules (what doesn't qualify): The target audience must have been a person of romantic interest (prospective or present significant other or random hook-up). E.g. showing your program to your sister who's also a software developer doesn't count. The achievement must have been done specifically with the goal of impressing such a person. However, it is OK if the achievement was done to impress a generic qualifying person, not someone specific. Although... if you write code to impress girls in general, I'd say "get a better idea of the opposite sex". The achievement must have been done with the goal of impressing the person. In other words, if you would have done it without the romantic interest's knowledge anyway, it doesn't count. As examples, the following do not count: programming for your job. Programming for a coding contest. An open source program that you'd have done anyway. The precise nature of the awesomeness of the achievement is somewhat irrelevant - from learning the whole of J2EE in 2 days to writing a fancy game engine to implementing a Python compiler in LOGO. As long as it's programming/software development related. The achievement should preferably be something other people would rank highly as well. If your date was impressed with your skill at calculating the Fibonacci sequence without recursive function calls, it doesn't mean most developers will be. But it does mean you need to start finding better things to do on dates ;)

    Read the article

  • Can Hudson branch promotion get based on project stability?

    - by Wayne
    The Hudson CI server displays stability "weather", which is cool. And it allows one project build to kick off based on the successful build of another. However, how can you make that secondary project additionally dependent on the stability of multiple builds of the first project? Specifically, project "stable_deploy" should only kick off to promote a version to "stable" if project "integrate" with version 8.3.4.1233 has built and tested successfully at least 8 times--in a row. Until then, it's still in integration mode. IMPORTANT: A significant caveat to this is that a single set of Hudson projects gets used as a "pipeline" to process each new version through to release. So a project may have built successfully 8 times in a row, but the latest version 8.3.4.1233 may cover only the 2 most recent builds. The builds prior to that may be an earlier version. We're open to completely reorganizing this, but the pipeline idea seemed to greatly reduce the amount of manual project creation and deletion. Is there a better way to track a version release "pipeline"? In particular, we will have multiple versions in this pipeline simultaneously in the future due to fixes or patches to older versions. We don't see how to do that yet, except to create new pipeline projects for each version, which is a real hassle. Here are some background details: The TickZoom application has some very complete unit tests, some of which simulate real-time trading environments. Add to that, TickZoom makes elaborate use of parallelization for leveraging multi-core computers. Needless to say, during development of a new version there can be stability issues during integration testing which get uncovered by running the build and auto tests repeatedly. A version which builds and tests cleanly 8 times in a row without change, plus has undergone some real-world testing by users, can be deemed "stable" and promoted to the stable branch. Our Hudson projects look like this: test - Only for testing a build, zero user visibility. integrate_deploy - Promotes a test project build to the integrate branch and makes it available to the public for UA testing. integrate - Repeatedly builds the integrate branch to determine if it's stable enough to promote to the stable branch. This runs the builds and tests hourly throughout every night. stable_deploy - Promotes an integrate project build to the stable branch and makes it public for users who want the latest and greatest. stable - Builds the stable branch once every night. After 2 weeks of successful builds (14 builds) it can go to "release candidate". And so on... it continues with "release candidate" and then "release".
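
    There is no built-in "promote after N stable builds of the same version" trigger that I know of, but the check is easy to script against the Hudson/Jenkins remote JSON API and run as the first step of stable_deploy. The sketch below is a rough illustration only: the server URL is made up, and it assumes the version string is recorded in each build's description (put it wherever your pipeline actually stores it).

        import json
        import urllib.request

        HUDSON = "http://hudson.example.com"   # assumed server URL

        def consecutive_green_builds(job, version):
            url = "%s/job/%s/api/json?tree=builds[number,result,description]" % (HUDSON, job)
            data = json.loads(urllib.request.urlopen(url).read().decode("utf-8"))
            count = 0
            for b in data["builds"]:                      # newest build first
                if version not in (b.get("description") or ""):
                    break                                 # an older version entered the pipeline
                if b["result"] != "SUCCESS":
                    return 0                              # a red build of this version resets the streak
                count += 1
            return count

        if consecutive_green_builds("integrate", "8.3.4.1233") >= 8:
            print("promote to stable")                    # e.g. trigger stable_deploy remotely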

    Read the article

  • Release Process Improvements

    - by wallismark
    The process of creating a new build and releasing it to production is a critical step in the SDLC, but it is often left as an afterthought and varies greatly from one company to the next. I'm hoping people will share improvements they have made to this process in their organisation so we can all take steps to 'reduce the pain'. So the question is: specify one painful/time-consuming part of your release process and what you did to improve it. My example: at a previous employer all developers made database changes on one common development database. Then, when it came to release time, we used Redgate's SQL Compare to generate a huge script from the differences between the Dev and QA databases. This worked reasonably well, but the problems with this approach were: ALL changes in the Dev database are included, some of which may still be 'works in progress'. Sometimes developers made conflicting changes (that were not noticed until the release was in production). It was a time-consuming and manual process to create and validate the script (by validate I mean try to weed out issues like problems 1 and 2). When there were problems with the script (e.g. the order in which things were run, such as creating a record which relies on a foreign key record which is in the script but not yet run) it took time to 'tweak' it so it ran smoothly. It's not an ideal scenario for Continuous Integration. So the solution was: Enforce a policy that all changes to the database must be scripted. A naming convention was important for ensuring the correct running order of the scripts. Create/use a tool to run the scripts at release time. Developers had their own copy of the database to develop against (so there was no more 'stepping on each other's toes'). The next release after we started this process was much faster with fewer problems; indeed, the only problems found were due to people 'breaking the rules', e.g. not creating a script. Once the issues with releasing to QA were fixed, when it came time to release to production it was very smooth. We applied a few other changes (like introducing CI), but this was the most significant; overall we reduced release time from around 3 hours down to a max of 10-15 minutes.
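
    The "every change is a script" approach is easy to illustrate: run the scripts in filename order and remember what has already been applied in a version table, so the same tool works against Dev, QA, production and the CI server. A minimal sketch, assuming SQL Server via pyodbc and numbered filenames like 0007_add_orders_table.sql (both assumptions; scripts must not rely on GO batch separators with this approach):

        import glob
        import os
        import pyodbc

        conn = pyodbc.connect(os.environ["RELEASE_DB_CONNSTR"])   # assumed environment variable
        cur = conn.cursor()
        cur.execute("""IF OBJECT_ID('schema_version') IS NULL
                       CREATE TABLE schema_version (script NVARCHAR(260) PRIMARY KEY)""")

        applied = {row.script for row in cur.execute("SELECT script FROM schema_version")}
        for path in sorted(glob.glob("migrations/*.sql")):        # naming convention = run order
            name = os.path.basename(path)
            if name in applied:
                continue
            cur.execute(open(path).read())
            cur.execute("INSERT INTO schema_version (script) VALUES (?)", name)
            conn.commit()
            print("applied", name)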

    Read the article

  • Is the Scala 2.8 collections library a case of "the longest suicide note in history" ?

    - by oxbow_lakes
    First, note the inflammatory subject title is a quotation made about the manifesto of a UK political party in the early 1980s. This question is subjective, but it is a genuine question; I've made it CW and I'd like some opinions on the matter. Despite whatever my wife and coworkers keep telling me, I don't think I'm an idiot: I have a good degree in mathematics from the University of Oxford and I've been programming commercially for almost 12 years and in Scala for about a year (also commercially). I have just started to look at the Scala collections library re-implementation which is coming in the imminent 2.8 release. Those familiar with the library from 2.7 will notice that the library, from a usage perspective, has changed little. For example... > List("Paris", "London").map(_.length) res0: List[Int] = List(5, 6) ...would work in either version. The library is eminently usable: in fact it's fantastic. However, those previously unfamiliar with Scala and poking around to get a feel for the language now have to make sense of method signatures like: def map[B, That](f: A => B)(implicit bf: CanBuildFrom[Repr, B, That]): That For such simple functionality, this is a daunting signature and one which I find myself struggling to understand. Not that I think Scala was ever likely to be the next Java (or C/C++/C#) - I don't believe its creators were aiming it at that market - but I think it is/was certainly feasible for Scala to become the next Ruby or Python (i.e. to gain a significant commercial user base). Is this going to put people off coming to Scala? Is this going to give Scala a bad name in the commercial world as an academic plaything that only dedicated PhD students can understand? Are CTOs and heads of software going to get scared off? Was the library re-design a sensible idea? If you're using Scala commercially, are you worried about this? Are you planning to adopt 2.8 immediately or wait to see what happens? Steve Yegge once attacked Scala (mistakenly in my opinion) for what he saw as its overcomplicated type system. I worry that someone is going to have a field day spreading FUD with this API (similarly to how Josh Bloch scared the JCP out of adding closures to Java). Note - I should be clear that, whilst I believe that Josh Bloch was influential in the rejection of the BGGA closures proposal, I don't ascribe this to anything other than his honestly-held belief that the proposal represented a mistake.

    Read the article

  • Opening HTML Table in Excel - numbers are getting changed

    - by nickfranceschina
    so I have this HTML table with a bunch of big numbers in it that I want to open in Excel 2007... you can follow along at home: <table> <tr> <td>this is a big number</td> </tr><tr> <td>1111111</td> </tr><tr> <td>2335322864</td> </tr><tr> <td>23353228641</td> </tr><tr> <td>233532286418</td> </tr><tr> <td>2335322864187</td> </tr><tr> <td>23353228641877</td> </tr><tr> <td>233532286418777</td> </tr><tr> <td>2335322864187774</td> </tr><tr> <td>23353228641877745</td> </tr><tr> <td>233532286418777456</td> </tr><tr> <td>2335322864187774562</td> </tr><tr> <td>23353228641877745623</td> </tr><tr> <td>233532286418777456238</td> </tr> when I open this file in Excel it starts converting those numbers to scientific notation when they get over 10 digits in length... and in doing so, it starts changing the actual number and replaces least significant digits to zeros how can I tell Excel not to do this?
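
    Excel applies its General number format to cells it imports from HTML, and that format keeps only about 15 significant digits, which is why the longer values get rounded and shown in scientific notation. A commonly reported workaround is to mark the cells as text with the Excel-specific mso-number-format style hint; the small generator below is only a sketch of that idea (the style is an Office quirk, not standard HTML):

        # Writes the same kind of table, but asks Excel to treat each value as text.
        numbers = ["1111111", "2335322864187774562", "233532286418777456238"]

        rows = "\n".join(
            '<tr><td style="mso-number-format:\'\\@\'">%s</td></tr>' % n for n in numbers)
        html = "<table>\n<tr><td>this is a big number</td></tr>\n%s\n</table>" % rows

        open("big_numbers.html", "w").write(html)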

    Read the article

  • physics game programming box2d - orientating a turret-like object using torques

    - by egarcia
    This is a problem I hit when trying to implement a game using the LÖVE engine, which wraps box2d with Lua scripting. The objective is simple: a turret-like object (seen from the top, in a 2D environment) needs to orientate itself so it points at a target. The turret is at the x,y coordinates, and the target is at tx,ty. We can consider that x,y are fixed, but tx,ty tend to vary from one instant to the other (i.e. they would be the mouse cursor). The turret has a rotor that can apply a rotational force (torque) at any given moment, clockwise or counter-clockwise. The magnitude of that force has an upper limit called maxTorque. The turret also has a certain rotational inertia, which acts for angular movement the same way mass acts for linear movement. There's no friction of any kind, so the turret will keep spinning if it has an angular velocity. The turret has a small AI function that re-evaluates its orientation to verify that it points in the right direction, and activates the rotor. This happens every dt (~60 times per second). It looks like this right now: function Turret:update(dt) local x,y = self:getPositon() local tx,ty = self:getTarget() local maxTorque = self:getMaxTorque() -- max force of the turret rotor local inertia = self:getInertia() -- the rotational inertia local w = self:getAngularVelocity() -- current angular velocity of the turret local angle = self:getAngle() -- the angle the turret is facing currently -- the angle of the line that links the turret center with the target local targetAngle = math.atan2(ty-y,tx-x) local differenceAngle = _normalizeAngle(targetAngle - angle) if(differenceAngle <= math.pi) then -- counter-clockwise is the shortest path self:applyTorque(maxTorque) else -- clockwise is the shortest path self:applyTorque(-maxTorque) end end ... it fails. Let me explain with two illustrative situations: The turret "oscillates" around the targetAngle. If the target is "right behind the turret, just a little clockwise", the turret will start applying clockwise torques, and keep applying them until the instant in which it surpasses the target angle. At that moment it will start applying torques in the opposite direction. But it will have gained a significant angular velocity, so it will keep going clockwise for some time... until the target is "just behind, but a bit counter-clockwise". And it will start again. So the turret will oscillate or even go round in circles. I think that my turret should start applying torques in the "opposite direction of the shortest path" before it reaches the target angle (like a car braking before stopping). Intuitively, I think the turret should "start applying torques in the opposite direction of the shortest path when it is about half-way to the target objective". My intuition tells me that it has something to do with the angular velocity. And then there's the fact that the target is mobile - I don't know if I should take that into account somehow or just ignore it. How do I calculate when the turret must "start braking"?
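
    The braking point can be computed from the physics rather than guessed at "half-way": the maximum angular acceleration is alpha = maxTorque / inertia, and a turret spinning at angular velocity w needs w^2 / (2 * alpha) radians to come to a stop. Once the remaining signed, shortest-path angle to the target is smaller than that stopping distance, reverse the torque. A plain-Python sketch of the decision follows (treat it as pseudocode for the Lua Turret:update; the moving target is handled simply by recomputing the error every dt):

        import math

        def normalize(a):
            # map an angle into (-pi, pi] so "shortest path" has a sign
            while a > math.pi:
                a -= 2 * math.pi
            while a <= -math.pi:
                a += 2 * math.pi
            return a

        def turret_torque(angle, w, target_angle, max_torque, inertia):
            alpha_max = max_torque / inertia
            error = normalize(target_angle - angle)        # signed shortest-path angle
            stopping_angle = (w * w) / (2.0 * alpha_max)   # angle covered while braking

            # moving toward the target and unable to stop in time: brake
            if w * error > 0 and abs(error) <= stopping_angle:
                return -math.copysign(max_torque, w)
            # otherwise accelerate along the shortest path
            return math.copysign(max_torque, error)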

    Read the article

  • Should I use Python or Assembly for a super fast copy program

    - by PyNEwbie
    As a maintenance issue I need to routinely (3-5 times per year) copy a repository that now has over 20 million files and exceeds 1.5 terabytes in total disk space. I am currently using RICHCOPY, but have tried others. RICHCOPY seems the fastest, but I do not believe I am getting close to the limits of the capabilities of my XP machine. I am toying around with using what I have read in The Art of Assembly Language to write a program to copy my files. My other thought is to start learning how to multi-thread in Python to do the copies. I am toying around with the idea of doing this in Assembly because it seems interesting, but while my time is not incredibly precious it is precious enough that I am trying to get a sense of whether or not I will see significant enough gains in copy speed. I am assuming that I would, but I only started really learning to program 18 months ago and it is still more or less a hobby. Thus I may be missing some fundamental concept of what happens with interpreted languages. Any observations or experiences would be appreciated. Note, I am not looking for any code. I have already written a basic copy program in Python 2.6 that is no slower than RICHCOPY. I am looking for some observations on what will give me more speed. Right now it takes me over 50 hours to make a copy from a disk to a Drobo and then back from the Drobo to a disk. I have a LogicCube for when I am simply duplicating a disk, but sometimes I need to go from a disk to the Drobo or the reverse. I am thinking that given that I can sector-copy a 3/4 full 2 terabyte drive using the LogicCube in under seven hours, I should be able to get close to that using Assembly, but I don't know enough to know if this is valid. (Yes, sometimes ignorance is bliss.) The reason I need to speed it up is that I have had two or three cycles where something has happened during the copy (fifty hours is a long time to expect the world to hold still) that has caused me to have to trash the copy and start over. For example, last week the water main broke under our building and shorted out the power.
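
    The question explicitly isn't asking for code, but a sketch makes the trade-off concrete: copying 20 million files is almost entirely disk/bus bound, so hand-written assembly is unlikely to help, while a handful of Python threads pulling file paths from a queue can at least overlap the per-file open/close latency that dominates with millions of small files. Rough Python 3 spelling below (for the 2.6 setup in the question, the queue module is spelled Queue):

        import os
        import queue
        import shutil
        import threading

        def worker(q):
            while True:
                src, dst = q.get()
                try:
                    try:
                        os.makedirs(os.path.dirname(dst))
                    except OSError:
                        pass                      # directory already exists
                    shutil.copy2(src, dst)        # data + timestamps
                finally:
                    q.task_done()

        def copy_tree(src_root, dst_root, threads=4):
            q = queue.Queue(maxsize=1000)
            for _ in range(threads):
                t = threading.Thread(target=worker, args=(q,))
                t.daemon = True
                t.start()
            for dirpath, _, files in os.walk(src_root):
                for name in files:
                    src = os.path.join(dirpath, name)
                    q.put((src, os.path.join(dst_root, os.path.relpath(src, src_root))))
            q.join()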

    Read the article

  • "Emulating" Application.Run using Application.DoEvents

    - by Luca
    I'm getting into trouble. I'm trying to emulate the call Application.Run using Application.DoEvents... this sounds bad, so I will also accept alternative solutions to my question... I have to handle a message pump like Application.Run does, but I need to execute code before and after the message handling. Here is the main significant snippet of code. // Create barrier (multiple kernels synchronization) sKernelBarrier = new KernelBarrier(sKernels.Count); foreach (RenderKernel k in sKernels) { // Create rendering contexts (one for each kernel) k.CreateRenderContext(); // Start render kernel threads k.mThread = new Thread(RenderKernelMain); k.mThread.Start(k); } while (sKernelBarrier.KernelCount > 0) { // Wait until all kernel loops have finished sKernelBarrier.WaitKernelBarrier(); // Do application events Application.DoEvents(); // Execute shared context services foreach (RenderKernelContextService s in sContextServices) s.Execute(sSharedContext); // Next kernel render loop sKernelBarrier.ReleaseKernelBarrier(); } This snippet of code is executed by the Main routine. Practically, I have a list of Kernel classes which run in separate threads; these threads handle a Form for rendering in OpenGL. I need to synchronize all the Kernel threads using a barrier, and this works perfectly. Of course, I need to handle Form messages in the main thread (Main routine) for every Form created, and indeed I call Application.DoEvents() to do the job. Now I have to modify the snippet above to have a common Form (a simple dialog box) without consuming 100% of the CPU calling Application.DoEvents(), as Application.Run does. The goal should be to have the snippet above handle messages as they arrive, and issue a rendering (releasing the barrier) only when necessary, without trying to get the maximum FPS; there should be the possibility to switch to a strict loop to render as much as possible. How could this be done? Note: the snippet above must be executed in the Main routine, since the OpenGL context is created on the main thread. Moving the snippet to a separate thread and calling Application.Run is quite unstable and buggy...

    Read the article

  • How to make freelance clients understand the costs of developing and maintaining mature products?

    - by John
    I have a freelance web application project where the client requests new features every two weeks or so. I am unable to anticipate the requirements of upcoming features. So when the client requests a new feature, one of several things may happen: I implement the feature with ease because it is compatible with the existing platform I implement the feature with difficulty because I have to rewrite a significant portion of the platform's foundation Client withdraws request because it costs too much to implement against existing platform At the beginning of the project, for about six months, all feature requests fell under category 1) because the system was small and agile. But for the past six months, most feature implementation fell under category 2). The system is mature, forcing me to refactor and test everytime I want to add new modules. Additionally, I find myself breaking things that use to work, and fixing it (I don't get paid for this). The client is starting to express frustration at the time and cost for me to implement new features. To them, many of the feature requests are of the same scale as the features they requested six months ago. For example, a client would ask, "If it took you 1 week to build a ticketing system last year, why does it take you 1 month to build an event registration system today? An event registration system is much simpler than a ticketing system. It should only take you 1 week!" Because of this scenario, I fear feature requests will soon land in category 3). In fact, I'm already eating a lot of the cost myself because I volunteer many hours to support the project. The client is often shocked when I tell him honestly the time it takes to do something. The client always compares my estimates against the early months of a project. I don't think they're prepared for what it really costs to develop, maintain and support a mature web application. When working on a salary for a full time company, managers were more receptive of my estimates and even encouraged me to pad my numbers to prepare for the unexpected. Is there a way to condition my clients to think the same way? Can anyone offer advice on how I can continue to work on this web project without eating too much of the cost myself? Additional info - I've only been freelancing full time for 1 year. I don't yet have the high end clients, but I'm slowly getting there. I'm getting better quality clients as time goes by.

    Read the article

  • R glm standard error estimate differences to SAS PROC GENMOD

    - by Michelle
    I am converting a SAS PROC GENMOD example into R, using glm in R. The SAS code was: proc genmod data=data0 namelen=30; model boxcoxy=boxcoxxy ~ AGEGRP4 + AGEGRP5 + AGEGRP6 + AGEGRP7 + AGEGRP8 + RACE1 + RACE3 + WEEKEND + SEQ/dist=normal; FREQ REPLICATE_VAR; run; My R code is: parmsg2 <- glm(boxcoxxy ~ AGEGRP4 + AGEGRP5 + AGEGRP6 + AGEGRP7 + AGEGRP8 + RACE1 + RACE3 + WEEKEND + SEQ , data=data0, family=gaussian, weights = REPLICATE_VAR) When I use summary(parmsg2) I get the same coefficient estimates as in SAS, but my standard errors are wildly different. The summary output from SAS is: Name df Estimate StdErr LowerWaldCL UpperWaldCL ChiSq ProbChiSq Intercept 1 6.5007436 .00078884 6.4991975 6.5022897 67911982 0 agegrp4 1 .64607262 .00105425 .64400633 .64813891 375556.79 0 agegrp5 1 .4191395 .00089722 .41738099 .42089802 218233.76 0 agegrp6 1 -.22518765 .00083118 -.22681672 -.22355857 73401.113 0 agegrp7 1 -1.7445189 .00087569 -1.7462352 -1.7428026 3968762.2 0 agegrp8 1 -2.2908855 .00109766 -2.2930369 -2.2887342 4355849.4 0 race1 1 -.13454883 .00080672 -.13612997 -.13296769 27817.29 0 race3 1 -.20607036 .00070966 -.20746127 -.20467944 84319.131 0 weekend 1 .0327884 .00044731 .0319117 .03366511 5373.1931 0 seq2 1 -.47509583 .00047337 -.47602363 -.47416804 1007291.3 0 Scale 1 2.9328613 .00015586 2.9325559 2.9331668 -127 The summary output from R is: Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 6.50074 0.10354 62.785 < 2e-16 AGEGRP4 0.64607 0.13838 4.669 3.07e-06 AGEGRP5 0.41914 0.11776 3.559 0.000374 AGEGRP6 -0.22519 0.10910 -2.064 0.039031 AGEGRP7 -1.74452 0.11494 -15.178 < 2e-16 AGEGRP8 -2.29089 0.14407 -15.901 < 2e-16 RACE1 -0.13455 0.10589 -1.271 0.203865 RACE3 -0.20607 0.09315 -2.212 0.026967 WEEKEND 0.03279 0.05871 0.558 0.576535 SEQ -0.47510 0.06213 -7.646 2.25e-14 The importance of the difference in the standard errors is that the SAS coefficients are all statistically significant, but the RACE1 and WEEKEND coefficients in the R output are not. I have found a formula to calculate the Wald confidence intervals in R, but this is pointless given the difference in the standard errors, as I will not get the same results. Apparently SAS uses a ridge-stabilized Newton-Raphson algorithm for its estimates, which are ML. The information I read about the glm function in R is that the results should be equivalent to ML. What can I do to change my estimation procedure in R so that I get the equivalent coefficents and standard error estimates that were produced in SAS? To update, thanks to Spacedman's answer, I used weights because the data are from individuals in a dietary survey, and REPLICATE_VAR is a balanced repeated replication weight, that is an integer (and quite large, in the order of 1000s or 10000s). The website that describes the weight is here. I don't know why the FREQ rather than the WEIGHT command was used in SAS. I will now test by expanding the number of observations using REPLICATE_VAR and rerunning the analysis.
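
    The usual explanation for exactly this pattern (identical coefficients, very different standard errors) is that SAS's FREQ statement treats REPLICATE_VAR as a case-replication count, so each record is counted that many times, whereas weights= in a gaussian glm are precision (inverse-variance) weights; the point estimates coincide, but the implied sample sizes, and therefore the standard errors, do not. Whether frequency weighting is actually the right treatment of a BRR survey weight is a separate statistical question. As a language-neutral illustration (Python/statsmodels, with the column names assumed from the model above), frequency weights can be requested explicitly:

        import pandas as pd
        import statsmodels.api as sm

        data0 = pd.read_csv("data0.csv")            # assumed export of the SAS data set
        predictors = ["AGEGRP4", "AGEGRP5", "AGEGRP6", "AGEGRP7", "AGEGRP8",
                      "RACE1", "RACE3", "WEEKEND", "SEQ"]
        X = sm.add_constant(data0[predictors])

        model = sm.GLM(data0["boxcoxxy"], X, family=sm.families.Gaussian(),
                       freq_weights=data0["REPLICATE_VAR"])
        print(model.fit().summary())                # SEs now reflect the replicated sample size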

    Read the article

  • Grouping geographical shapes

    - by grenade
    I am using Dundas Maps and attempting to draw a map of the world where countries are grouped into regions that are specific to a business implementation. I have shape data (points and segments) for each country in the world. I can combine countries into regions by adding all points and segments for countries within a region to a new region shape. foreach(var region in GetAllRegions()){ var regionShape = new Shape { Name = region.Name }; foreach(var country in GetCountriesInRegion(region.Id)){ var countryShape = GetCountryShape(country.Id); regionShape.AddSegments(countryShape.ShapeData.Points, countryShape.ShapeData.Segments); } map.Shapes.Add(regionShape); } The problem is that the country border lines still show up within a region and I want to remove them so that only regional borders show up. Dundas polygons must start and end at the same point. This is the case for all the country shapes. Now I need an algorithm that can: Determine where country borders intersect at a regional border, so that I can join the regional border segments. Determine which country borders are not regional borders so that I can discard them. Sort the resulting regional points so that they sequentially describe the shape boundaries. Below is where I have gotten to so far with the map. You can see that the country borders still need to be removed. For example, the border between Mongolia and China should be discarded, whereas the border between Mongolia and Russia should be retained. The reason I need to retain a regional border is that the region colors will be significant in conveying information, but adjacent regions may be the same color. The regions can change to include or exclude countries, and this is why the regional shaping must be dynamic. EDIT: I now know that what I am looking for is a UNION of polygons. David Lean explains how to do it using the spatial functions in SQL Server 2008, which might be an option, but my efforts have come to a halt because the resulting polygon union is so complex that SQL truncates it at 43,680 characters. I'm now trying to either find a workaround for that or find a way of doing the union in code.
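
    Since the EDIT already identifies this as a polygon-union problem, here is what the union step looks like outside the database, sketched with the Shapely library (an assumption on my part, not something Dundas provides): union all country polygons of a region so the shared internal borders dissolve, then feed the exterior ring(s) of the result back in as the region shape.

        from shapely.geometry import Polygon
        from shapely.ops import unary_union

        def region_outlines(country_point_lists):
            polygons = [Polygon(pts) for pts in country_point_lists]
            merged = unary_union(polygons)               # dissolves shared internal borders
            parts = getattr(merged, "geoms", [merged])   # a region may still have several parts
            return [list(part.exterior.coords) for part in parts]

        # two "countries" sharing an edge collapse into a single 2x1 outline
        print(region_outlines([[(0, 0), (1, 0), (1, 1), (0, 1)],
                               [(1, 0), (2, 0), (2, 1), (1, 1)]]))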

    Read the article

  • Java map / nio / NFS issue causing a VM fault: "a fault occurred in a recent unsafe memory access operation"

    - by Matthew Bloch
    I have written a parser class for a particular binary format (nfdump if anyone is interested) which uses java.nio's MappedByteBuffer to read through files of a few GB each. The binary format is just a series of headers and mostly fixed-size binary records, which are fed out to the called by calling nextRecord(), which pushes on the state machine, returning null when it's done. It performs well. It works on a development machine. On my production host, it can run for a few minutes or hours, but always seems to throw "java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code", fingering one of the Map.getInt, getShort methods, i.e. a read operation in the map. The uncontroversial (?) code that sets up the map is this: /** Set up the map from the given filename and position */ protected void open() throws IOException { // Set up buffer, is this all the flexibility we'll need? channel = new FileInputStream(file).getChannel(); MappedByteBuffer map1 = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size()); map1.load(); // we want the whole thing, plus seems to reduce frequency of crashes? map = map1; // assumes the host writing the files is little-endian (x86), ought to be configurable map.order(java.nio.ByteOrder.LITTLE_ENDIAN); map.position(position); } and then I use the various map.get* methods to read shorts, ints, longs and other sequences of bytes, before hitting the end of the file and closing the map. I've never seen the exception thrown on my development host. But the significant point of difference between my production host and development is that on the former, I am reading sequences of these files over NFS (probably 6-8TB eventually, still growing). On my dev machine, I have a smaller selection of these files locally (60GB), but when it blows up on the production host it's usually well before it gets to 60GB of data. Both machines are running java 1.6.0_20-b02, though the production host is running Debian/lenny, the dev host is Ubuntu/karmic. I'm not convinced that will make any difference. Both machines have 16GB RAM, and are running with the same java heap settings. I take the view that if there is a bug in my code, there is enough of a bug in the JVM not to throw me a proper exception! But I think it is just a particular JVM implementation bug due to interactions between NFS and mmap, possibly a recurrence of 6244515 which is officially fixed. I already tried adding in a "load" call to force the MappedByteBuffer to load its contents into RAM - this seemed to delay the error in the one test run I've done, but not prevent it. Or it could be coincidence that was the longest it had gone before crashing! If you've read this far and have done this kind of thing with java.nio before, what would your instinct be? Right now mine is to rewrite it without nio :)

    Read the article

  • .Net Dynamically Load DLL

    - by hermiod
    I am trying to write some code that will allow me to dynamically load DLLs into my application, depending on an application setting. The idea is that the database to be accessed is set in the application settings and then this loads the appropriate DLL and assigns it to an instance of an interface for my application to access. This is my code at the moment: Dim SQLDataSource As ICRDataLayer Dim ass As Assembly = Assembly. _ LoadFrom("M:\MyProgs\WebService\DynamicAssemblyLoading\SQLServer\bin\Debug\SQLServer.dll") Dim obj As Object = ass.CreateInstance(GetType(ICRDataLayer).ToString, True) SQLDataSource = DirectCast(obj, ICRDataLayer) MsgBox(SQLDataSource.ModuleName & vbNewLine & SQLDataSource.ModuleDescription) I have my interface (ICRDataLayer) and the SQLServer.dll contains an implementation of this interface. I just want to load the assembly and assign it to the SQLDataSource object. The above code just doesn't work. There are no exceptions thrown, even the Msgbox doesn't appear. I would've expected at least the messagebox appearing with nothing in it, but even this doesn't happen! Is there a way to determine if the loaded assembly implements a specific interface. I tried the below but this also doesn't seem to do anything! For Each loadedType As Type In ass.GetTypes If GetType(ICRDataLayer).IsAssignableFrom(loadedType) Then Dim obj1 As Object = ass.CreateInstance(GetType(ICRDataLayer).ToString, True) SQLDataSource = DirectCast(obj1, ICRDataLayer) End If Next EDIT: New code from Vlad's examples: Module CRDataLayerFactory Sub New() End Sub ' class name is a contract, ' should be the same for all plugins Private Function Create() As ICRDataLayer Return New SQLServer() End Function End Module Above is Module in each DLL, converted from Vlad's C# example. Below is my code to bring in the DLL: Dim SQLDataSource As ICRDataLayer Dim ass As Assembly = Assembly. _ LoadFrom("M:\MyProgs\WebService\DynamicAssemblyLoading\SQLServer\bin\Debug\SQLServer.dll") Dim factory As Object = ass.CreateInstance("CRDataLayerFactory", True) Dim t As Type = factory.GetType Dim method As MethodInfo = t.GetMethod("Create") Dim obj As Object = method.Invoke(factory, Nothing) SQLDataSource = DirectCast(obj, ICRDataLayer) EDIT: Implementation based on Paul Kohler's code Dim file As String For Each file In Directory.GetFiles(baseDir, searchPattern, SearchOption.TopDirectoryOnly) Dim assemblyType As System.Type For Each assemblyType In Assembly.LoadFrom(file).GetTypes Dim s As System.Type() = assemblyType.GetInterfaces For Each ty As System.Type In s If ty.Name.Contains("ICRDataLayer") Then MsgBox(ty.Name) plugin = DirectCast(Activator.CreateInstance(assemblyType), ICRDataLayer) MessageBox.Show(plugin.ModuleName) End If Next I get the following error with this code: Unable to cast object of type 'SQLServer.CRDataSource.SQLServer' to type 'DynamicAssemblyLoading.ICRDataLayer'. The actual DLL is in a different project called SQLServer in the same solution as my implementation code. CRDataSource is a namespace and SQLServer is the actual class name of the DLL. The SQLServer class implements ICRDataLayer, so I don't understand why it wouldn't be able to cast it. Is the naming significant here, I wouldn't have thought it would be.

    Read the article

  • How to force VS 2010 to skip "builds" of projects which haven't changed?

    - by Ladislav Mrnka
    Our product's solution has more than 100+ projects (500+ ksloc of production code). Most of them are C# projects, but we also have a few using C++/CLI to bridge communication with native code. Rebuilding the whole solution takes several minutes. That's fine. If I want to rebuild the solution I expect that it will really take some time. What is not fine is the time needed to build the solution after a full rebuild. Imagine I have done a full rebuild and now, without making any changes to the solution, I press Build (F6 or Ctrl+Shift+B). Why does it take 35s if there was no change? In the output I see that it starts "building" each project - it doesn't perform a real build, but it does something which consumes a significant amount of time. That 35s delay is a pain in the ass. Yes, I can improve the time by not using build solution but only build project (Shift+F6). If I run build project on the particular test project I'm currently working on, it will take "only" 8+s. It requires me to run the project build on the correct project (the test project, to ensure the dependent tested code is built as well). At least the ReSharper test runner correctly recognizes that only this single project must be built, and rerunning a test usually involves only 8+s of compilation. My current coding kata is: don't touch Ctrl+Shift+B. The test project build will take 8s even if I don't make any changes. The reason it takes 8s is that it also "builds" dependencies = in my case it "builds" more than 20 projects, but I made changes only to a unit test or a single dependency! I don't want it to touch other projects. Is there a way to simply tell VS to build only projects where some changes were made and projects which are dependent on the changed ones (preferably this part as another build option)? I worry you will tell me that this is exactly what VS is doing, but in the MS way... I want to improve my TDD experience and reduce the time of compilation (in TDD the compilation can happen twice per minute). To make this even more frustrating, I'm working in a team where most of the developers used to work on Java projects prior to joining this one. So you can imagine how pissed off they are when they must use VS, in contrast to full incremental compilation in Java. I don't require incremental compilation of classes. I expect working incremental compilation of solutions. Especially in a product like VS 2010 Ultimate, which costs several thousand dollars. I really don't want to get answers like: make a separate solution, unload projects you don't need, etc. I can read those answers here. Those are not acceptable solutions. We're not paying for VS to make such compromises.

    Read the article

  • CUDA, more threads for same work = Longer run time despite better occupancy, Why?

    - by zenna
    I encountered a strange problem where increasing my occupancy by increasing the number of threads reduced performance. I created the following program to illustrate the problem: #include <stdio.h> #include <stdlib.h> #include <cuda_runtime.h> __global__ void less_threads(float * d_out) { int num_inliers; for (int j=0;j<800;++j) { //Do 12 computations num_inliers += threadIdx.x*1; num_inliers += threadIdx.x*2; num_inliers += threadIdx.x*3; num_inliers += threadIdx.x*4; num_inliers += threadIdx.x*5; num_inliers += threadIdx.x*6; num_inliers += threadIdx.x*7; num_inliers += threadIdx.x*8; num_inliers += threadIdx.x*9; num_inliers += threadIdx.x*10; num_inliers += threadIdx.x*11; num_inliers += threadIdx.x*12; } if (threadIdx.x == -1) d_out[blockIdx.x*blockDim.x+threadIdx.x] = num_inliers; } __global__ void more_threads(float *d_out) { int num_inliers; for (int j=0;j<800;++j) { // Do 4 computations num_inliers += threadIdx.x*1; num_inliers += threadIdx.x*2; num_inliers += threadIdx.x*3; num_inliers += threadIdx.x*4; } if (threadIdx.x == -1) d_out[blockIdx.x*blockDim.x+threadIdx.x] = num_inliers; } int main(int argc, char* argv[]) { float *d_out = NULL; cudaMalloc((void**)&d_out,sizeof(float)*25000); more_threads<<<780,128>>>(d_out); less_threads<<<780,32>>>(d_out); return 0; } Note both kernels should do the same amount of work in total, the (if threadIdx.x == -1 is a trick to stop the compiler optimising everything out and leaving an empty kernel). The work should be the same as more_threads is using 4 times as many threads but with each thread doing 4 times less work. Significant results form the profiler results are as followsL: more_threads: GPU runtime = 1474 us,reg per thread = 6,occupancy=1,branch=83746,divergent_branch = 26,instructions = 584065,gst request=1084552 less_threads: GPU runtime = 921 us,reg per thread = 14,occupancy=0.25,branch=20956,divergent_branch = 26,instructions = 312663,gst request=677381 As I said previously, the run time of the kernel using more threads is longer, this could be due to the increased number of instructions. Why are there more instructions? Why is there any branching, let alone divergent branching, considering there is no conditional code? Why are there any gst requests when there is no global memory access? What is going on here! Thanks

    Read the article

  • should I ever put a major version number into a C#/Java namespace?

    - by Andrew Patterson
    I am designing a set of 'service' layer objects (data objects and interface definitions) for a WCF web service (that will be consumed by third-party clients, i.e. not in-house, so outside my direct control). I know that I am not going to get the interface definition exactly right - and I want to prepare for the time when I know that I will have to introduce a breaking set of new data objects. However, the reality of the world I am in is that I will also need to run my first version simultaneously for quite a while. The first version of my service will have a URL of http://host/app/v1service.svc and, when the time comes, my new version will live at http://host/app/v2service.svc. However, when it comes to the data objects and interfaces, I am toying with putting the 'major' version number of the interface into the actual namespace of the classes. namespace Company.Product.V1 { [DataContract(Namespace = "company-product-v1")] public class Widget { [DataMember] string widgetName; } public interface IFunction { Widget GetWidgetData(int code); } } When the time comes for a fundamental change to the service, I will introduce some classes like namespace Company.Product.V2 { [DataContract(Namespace = "company-product-v2")] public class Widget { [DataMember] int widgetCode; [DataMember] int widgetExpiry; } public interface IFunction { Widget GetWidgetData(int code); } } The advantage as I see it is that I will be able to have a single set of code serving both interface versions, sharing functionality where possible. This is because I will be able to reference both interface versions as distinct sets of C# objects. Similarly, clients may use both interface versions simultaneously, perhaps using V1.Widget in some legacy code whilst new bits move on to V2.Widget. Can anyone tell me why this is a stupid idea? I have a nagging feeling that this is a bit smelly... Notes: I am obviously not proposing that every single new version of the service would be in a new namespace. Presumably I will do as many non-breaking interface changes as possible, but I know that I will hit a point where all the data modelling will probably need a significant rewrite. I understand assembly versioning etc., but I think this question is tangential to that type of versioning. But I could be wrong.

    Read the article

  • Is there a scheduling algorithm that optimizes for "maker's schedules"?

    - by John Feminella
    You may be familiar with Paul Graham's essay, "Maker's Schedule, Manager's Schedule". The crux of the essay is that for creative and technical professionals, meetings are anathema to productivity, because they tend to lead to "schedule fragmentation", breaking up free time into chunks that are too small to acquire the focus needed to solve difficult problems. In my firm we've seen significant benefits from minimizing the amount of disruption caused, but the brute-force algorithm we use to decide schedules is not sophisticated enough to handle scheduling large groups of people well. (*) What I'm looking for is whether there are any well-known algorithms which minimize this productivity disruption among a group of N makers and managers. In our model: There are N people. Each person pi is either a maker (Mk) or a manager (Mg). Each person has a schedule si. Everyone's schedule is H hours long. A schedule consists of a series of non-overlapping intervals si = [h1, ..., hj]. An interval is either free or busy. Two adjacent free intervals are equivalent to a single free interval that spans both. A maker's productivity is maximized when the number of free intervals is minimized. A manager's productivity is maximized when the total length of free intervals is maximized. Notice that if there are no meetings, both the makers and the managers experience optimum productivity. If meetings must be scheduled, then makers prefer that meetings happen back-to-back, while managers don't care where the meeting goes. Note that because all disruptions are treated as equally harmful to makers, there's no difference between a meeting that lasts 1 second and a meeting that lasts 3 hours if it segments the available free time. The problem is to decide how to schedule M different meetings involving arbitrary numbers of the N people, where each person in a given meeting must place a busy interval into their schedule such that it doesn't overlap with any other busy interval. For each meeting Mt the start time for the busy interval must be the same for all parties. Does an algorithm exist to solve this problem or one similar to it? My first thought was that this looks really similar to defragmentation (minimize the number of distinct chunks), and there are a lot of algorithms about that. But defragmentation doesn't have much to do with scheduling. Thoughts? (*) Practically speaking this is not really a problem, because it's rare that we have meetings with more than ~5 people at once, so the space of possibilities is small.
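
    I'm not aware of a named algorithm for exactly this objective, but a simple greedy heuristic captures the intent: for each meeting, try every feasible common start time and pick the one that leaves the fewest free fragments across the makers involved, which in practice butts meetings up against existing busy blocks or the ends of the day. A toy sketch with hour-granularity schedules (illustrative only, not a known published method):

        H = 8  # working-day length in hours; schedules are lists of busy (start, end) intervals

        def free_fragments(busy):
            # number of maximal free intervals left in the day
            frags, cursor = 0, 0
            for s, e in sorted(busy):
                if s > cursor:
                    frags += 1
                cursor = max(cursor, e)
            return frags + (1 if cursor < H else 0)

        def fits(busy, start, length):
            in_day = 0 <= start <= H - length
            return in_day and all(start + length <= s or start >= e for s, e in busy)

        def schedule_meeting(schedules, attendees, length):
            best = None
            for start in range(H - length + 1):
                if not all(fits(schedules[p], start, length) for p in attendees):
                    continue
                cost = sum(free_fragments(schedules[p] + [(start, start + length)])
                           for p in attendees)
                if best is None or cost < best[0]:
                    best = (cost, start)
            if best is None:
                return None                      # no common slot
            for p in attendees:
                schedules[p].append((best[1], best[1] + length))
            return best[1]

        cal = {"ann": [(0, 1)], "bob": []}
        print(schedule_meeting(cal, ["ann", "bob"], 1))   # 1: right after ann's busy block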

    Read the article

  • Finding N contiguous zero bits in an integer to the left of the MSB position of another integer

    - by James Morris
    The problem is: given an integer val1 find the position of the highest bit set (Most Significant Bit) then, given a second integer val2 find a contiguous region of unset bits, with the minimum number of zero bits given by width to the left of the position (ie, in the higher bits). Here is the C code for my solution: typedef unsigned int t; unsigned const t_bits = sizeof(t) * CHAR_BIT; _Bool test_fit_within_left_of_msb( unsigned width, t val1, t val2, unsigned* offset_result) { unsigned offbit = 0; unsigned msb = 0; t mask; t b; while(val1 >>= 1) ++msb; while(offbit + width < t_bits - msb) { mask = (((t)1 << width) - 1) << (t_bits - width - offbit); b = val2 & mask; if (!b) { *offset_result = offbit; return true; } if (offbit++) /* this conditional bothers me! */ b <<= offbit - 1; while(b <<= 1) offbit++; } return false; } Aside from faster ways of finding the MSB of the first integer, the commented test for a zero offbit seems a bit extraneous, but necessary to skip the highest bit of type t if it is set. I have also implemented similar algorithms but working to the right of the MSB of the first number, so they don't require this seemingly extra condition. How can I get rid of this extra condition, or even, are there far more optimal solutions? Edit: Some background not strictly required. The offset result is a count of bits from the high bit, not from the low bit as maybe expected. This will be part of a wider algorithm which scans a 2D array for a 2D area of zero bits. Here, for testing, the algorithm has been simplified. val1 represents the first integer which does not have all bits set found in a row of the 2D array. From this the 2D version would scan down which is what val2 represents. Here's some output showing success and failure: t_bits:32 t_high: 10000000000000000000000000000000 ( 2147483648 ) --------- ----------------------------------- *** fit within left of msb test *** ----------------------------------- val1: 00000000000000000000000010000000 ( 128 ) val2: 01000001000100000000100100001001 ( 1091569929 ) msb: 7 offbit:0 + width: 8 = 8 mask: 11111111000000000000000000000000 ( 4278190080 ) b: 01000001000000000000000000000000 ( 1090519040 ) offbit:8 + width: 8 = 16 mask: 00000000111111110000000000000000 ( 16711680 ) b: 00000000000100000000000000000000 ( 1048576 ) offbit:12 + width: 8 = 20 mask: 00000000000011111111000000000000 ( 1044480 ) b: 00000000000000000000000000000000 ( 0 ) offbit:12 iters:10 ***** found room for width:8 at offset: 12 ***** ----------------------------------- *** fit within left of msb test *** ----------------------------------- val1: 00000000000000000000000001000000 ( 64 ) val2: 00010000000000001000010001000001 ( 268469313 ) msb: 6 offbit:0 + width: 13 = 13 mask: 11111111111110000000000000000000 ( 4294443008 ) b: 00010000000000000000000000000000 ( 268435456 ) offbit:4 + width: 13 = 17 mask: 00001111111111111000000000000000 ( 268402688 ) b: 00000000000000001000000000000000 ( 32768 ) ***** mask: 00001111111111111000000000000000 ( 268402688 ) offbit:17 iters:15 ***** no room found for width:13 ***** (iters is the count of iterations of the inner while loop)
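
    A small executable reference model is handy for cross-checking the C version against the sample output above; the Python below mirrors the same scan (offsets counted from the high bit, window kept strictly above the MSB of val1) without the early-skip optimisation, so it is the behaviour to match rather than a faster solution:

        T_BITS = 32

        def test_fit_within_left_of_msb(width, val1, val2):
            msb = val1.bit_length() - 1                  # position of the highest set bit of val1
            for offbit in range(T_BITS - msb - width):   # window must stay above the MSB
                mask = ((1 << width) - 1) << (T_BITS - width - offbit)
                if val2 & mask == 0:
                    return offbit                        # found `width` clear bits here
            return None

        # first example from the output above: expects offset 12
        print(test_fit_within_left_of_msb(8, 128, 1091569929))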

    Read the article

  • Code golf - hex to (raw) binary conversion

    - by Alnitak
    In response to this question asking about hex to (raw) binary conversion, a comment suggested that it could be solved in "5-10 lines of C, or any other language." I'm sure that for (some) scripting languages that could be achieved, and would like to see how. Can we prove that comment true, for C, too? NB: this doesn't mean hex to ASCII binary - specifically the output should be a raw octet stream corresponding to the input ASCII hex. Also, the input parser should skip/ignore white space. edit (by Brian Campbell) May I propose the following rules, for consistency? Feel free to edit or delete these if you don't think these are helpful, but I think that since there has been some discussion of how certain cases should work, some clarification would be helpful. The program must read from stdin and write to stdout (we could also allow reading from and writing to files passed in on the command line, but I can't imagine that would be shorter in any language than stdin and stdout) The program must use only packages included with your base, standard language distribution. In the case of C/C++, this means their respective standard libraries, and not POSIX. The program must compile or run without any special options passed to the compiler or interpreter (so, 'gcc myprog.c' or 'python myprog.py' or 'ruby myprog.rb' are OK, while 'ruby -rscanf myprog.rb' is not allowed; requiring/importing modules counts against your character count). The program should read integer bytes represented by pairs of adjacent hexadecimal digits (upper, lower, or mixed case), optionally separated by whitespace, and write the corresponding bytes to output. Each pair of hexadecimal digits is written with most significant nibble first. The behavior of the program on invalid input (characters besides [a-fA-F \t\r\n], spaces separating the two characters in an individual byte, an odd number of hex digits in the input) is undefined; any behavior (other than actively damaging the user's computer or something) on bad input is acceptable (throwing an error, stopping output, ignoring bad characters, treating a single character as the value of one byte, are all OK) The program may write no additional bytes to output. Code is scored by fewest total bytes in the source file. (Or, if we wanted to be more true to the original challenge, the score would be based on lowest number of lines of code; I would impose an 80 character limit per line in that case, since otherwise you'd get a bunch of ties for 1 line).
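
    For what it's worth, the claim is easy to demonstrate for a scripting language: under the rules above (stdin to stdout, whitespace ignored, any error behaviour acceptable on bad input) a complete Python 3 entry fits in a few lines:

        import sys

        hex_text = "".join(sys.stdin.read().split())      # drop any whitespace between digits
        sys.stdout.buffer.write(bytes.fromhex(hex_text))   # raw octets, most significant nibble first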

    Read the article

  • Performing full screen grab in windows

    - by Steven Lu
    I am working on an idea that involves getting a full capture of the screen, including windows and apps, analyzing it, and then drawing items back onto the screen as an overlay. I want to learn image processing techniques, and I could get lots of data to work with if I can directly access the Windows screen. I could use this to build automation tools the likes of which have never been seen before. More on that later.

    I have full screen capture working for the most part:

    HWND hwind = GetDesktopWindow();
    HDC hdc = GetDC(hwind);
    int resx = GetSystemMetrics(SM_CXSCREEN);
    int resy = GetSystemMetrics(SM_CYSCREEN);
    int BitsPerPixel = GetDeviceCaps(hdc, BITSPIXEL);
    HDC hdc2 = CreateCompatibleDC(hdc);

    BITMAPINFO info = {0};  /* zero-init so the header fields left unset are not garbage */
    info.bmiHeader.biSize = sizeof(BITMAPINFOHEADER);
    info.bmiHeader.biWidth = resx;
    info.bmiHeader.biHeight = resy;
    info.bmiHeader.biPlanes = 1;
    info.bmiHeader.biBitCount = BitsPerPixel;
    info.bmiHeader.biCompression = BI_RGB;

    void *data;
    HBITMAP hbitmap = CreateDIBSection(hdc2, &info, DIB_RGB_COLORS, (void**)&data, 0, 0);
    SelectObject(hdc2, hbitmap);

    Once this is done, I can call this repeatedly:

    BitBlt(hdc2, 0, 0, resx, resy, hdc, 0, 0, SRCCOPY);

    The cleanup code (I have no idea if this is correct):

    DeleteObject(hbitmap);
    ReleaseDC(hwind, hdc);
    if (hdc2) {
        DeleteDC(hdc2);
    }

    Every time BitBlt is called it grabs the screen and saves it in memory I can access through data. Performance is somewhat satisfactory: BitBlt executes in 50 milliseconds (sometimes as low as 33 ms) at 1920x1200x32.

    What surprises me is that when I switch the display mode to 16 bit (1920x1200x16), either through my graphics settings beforehand or by using ChangeDisplaySettings, I get a massively improved screen grab time of between 1 ms and 2 ms, which cannot be explained by the factor-of-two reduction in bit depth. Using CreateDIBSection (as above) offers a significant speed-up in 16-bit mode compared to setting up with CreateCompatibleBitmap (6-7 ms per frame).

    Does anybody know why dropping to 16 bit causes such a speed increase? Is there any hope of grabbing 32 bit at such speeds - if not for the color depth, then to avoid forcing a change of screen buffer modes and the awful flickering?
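
    As a side note on the cleanup question (this is an assumption about tidy GDI teardown, not something stated in the post): a bitmap should normally be deselected from the memory DC before it is deleted, so a hypothetical helper could hold on to the HGDIOBJ returned by the first SelectObject call and restore it:

    #include <windows.h>

    /* cleanup_capture is a made-up name; oldbmp is assumed to be the value
       returned by the original SelectObject(hdc2, hbitmap) call. */
    static void cleanup_capture(HWND hwind, HDC hdc, HDC hdc2,
                                HBITMAP hbitmap, HGDIOBJ oldbmp)
    {
        SelectObject(hdc2, oldbmp);  /* deselect the DIB section so it can be deleted */
        DeleteObject(hbitmap);       /* frees the DIB section and its pixel memory    */
        DeleteDC(hdc2);              /* DCs from CreateCompatibleDC are deleted...     */
        ReleaseDC(hwind, hdc);       /* ...while DCs obtained with GetDC are released  */
    }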

    Read the article

  • How to develop a program to minimize errors in human transcription of handwritten surveys

    - by Alex. S.
    I need to develop custom software to do surveys. Questions may be multiple choice, or free text in a very few cases. I was asked to design a subsystem to check whether there are errors in the manual data entry for the multiple choice part. We're trying to speed up the data entry process and to minimize differences between the digital forms and the original questionnaires. The surveys are filled in with handwritten marks and text by human interviewers, so there can be hard-to-read marks, and the operator could also accidentally select a different value for some question; we would like to avoid that. The software must include some automatic control to detect possible typing differences. Each answer of the multiple choice questions has the same probability of being selected.

    This question has two parts:

    1. The GUI. The simplest thing I have in mind is to make the question display as usable as possible: large, readable fonts and generously spaced choices. Is there something else? For faster input, I would like to use drop-down lists (favoring keyboard over mouse). Given that the questions are grouped in sections, I would like to show the answers already selected for the questions of that section, but this could slow down the process. Any other ideas?

    2. The error checking subsystem. What else can I do to minimize or check human typos in the multiple choice questions? Is this a solvable problem? Is there some statistical methodology to check that the values entered by the users are the same as on the hand-filled forms? For example, suppose the survey has 5 questions, each with 4 options, and I have n survey forms filled in on paper by interviewers, ready to be entered into the software. How can I minimize the accidental differences introduced by the manual transcription of the n surveys, without having to double-check all 5 questions on all n surveys? My first suggestion is that, at the end of processing all the hand-filled forms, the software could choose some forms at random for a double check of the responses, but on what criteria should I make this selection? Would this validation be enough to cover everything in a significant way?

    The actual survey is nation level and has 56 pages with over 200 questions in total, so it will be a lot of handwritten pages from many people, and the intention is to reduce the likelihood of errors and to optimize speed in the data entry process. The surveys must be filled in on paper first, given the complications of taking laptops or handhelds along with the interviewers.
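
    Purely as an illustration of the "choose some forms at random for a double check" idea (how large the sample should be is exactly the open question, so the size passed in below is an assumption, and all names are made up), a uniform draw without replacement over the n captured forms could look like this:

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    /* Flag sample_size of the n entered forms for a second, independent entry pass.
       Classic selection sampling: each form is picked with probability
       (still needed) / (still available), which yields a uniform sample. */
    static void pick_forms_to_recheck(unsigned n, unsigned sample_size)
    {
        unsigned chosen = 0;
        srand((unsigned)time(NULL));  /* rand() is good enough for an illustration */
        for (unsigned i = 0; i < n && chosen < sample_size; ++i) {
            if ((unsigned)(rand() % (n - i)) < sample_size - chosen) {
                printf("re-enter form %u\n", i + 1);
                ++chosen;
            }
        }
    }

    For example, pick_forms_to_recheck(n, n / 20) would send roughly 5% of the forms back for re-entry; whether that proportion is statistically sufficient depends on the error rate being tested for, which is the part of the question that still needs an answer.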

    Read the article

  • Supporting multiple instances of a plugin DLL with global data

    - by Bruno De Fraine
    Context: I converted a legacy standalone engine into a plugin component for a composition tool. Technically, this means that I compiled the engine code base to a C DLL which I invoke from a .NET wrapper using P/Invoke; the wrapper implements an interface defined by the composition tool. This works quite well, but now I have received a request to load multiple instances of the engine, for different projects. Since the engine keeps the project data in a set of global variables, and since the DLL with the engine code base is loaded only once, loading multiple projects means that the project data is overwritten.

    I can see a number of solutions, but they all have some disadvantages:

    1. Create multiple DLLs with the same code, which Windows sees as different DLLs, so their code is not shared. This probably already works if you have multiple copies of the engine DLL under different names. However, the engine is invoked from the wrapper using DllImport attributes, and I think the name of the engine DLL needs to be known when the wrapper is compiled. Obviously, if I have to compile a different version of the wrapper for each project, this is quite cumbersome.

    2. Run the engine as a separate process. The wrapper would launch a separate process for the engine when it loads a project and use some form of IPC to communicate with it. While this is a relatively clean solution, it requires some effort to get working, and I don't know which IPC technology would be best to set up this kind of construction. There may also be significant communication overhead: the engine needs to frequently exchange arrays of floating-point numbers.

    3. Adapt the engine to support multiple projects. The global variables would be put into a project structure, and every reference to the globals would be converted to a corresponding reference relative to a particular project. There are about 20-30 global variables, but as you can imagine they are referenced from all over the code base, so this conversion would need to be done in some automatic manner. A related problem is that you need to be able to reference the "current" project structure everywhere, and passing it along as an extra argument in each and every function signature is also cumbersome. Does a technique exist (in C) to inspect the current call stack and find the nearest enclosing instance of a relevant data value there?

    Can the stackoverflow community give some advice on these (or other) solutions?
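
    For what it is worth, here is a bare-bones sketch of the third option (every identifier is hypothetical, not the engine's real API): the globals move into a per-project context struct, the DLL exports create/select/destroy entry points, and the .NET wrapper selects the right context before each P/Invoke call. Whether the "current" pointer also needs to be thread-local depends on how the composition tool drives the plugin.

    #include <stdlib.h>

    /* Former globals, gathered into one context per loaded project. */
    typedef struct engine_project {
        double *samples;       /* e.g. a formerly global floating-point buffer */
        size_t  sample_count;  /* e.g. a formerly global counter               */
        int     flags;         /* ...and so on for the other 20-30 globals     */
    } engine_project;

    static engine_project *g_current;  /* the project the legacy code should touch */

    engine_project *engine_project_create(void)
    {
        return calloc(1, sizeof(engine_project));  /* caller checks for NULL */
    }

    void engine_project_select(engine_project *p)
    {
        g_current = p;  /* called once at the top of every exported entry point */
    }

    void engine_project_destroy(engine_project *p)
    {
        if (p) {
            free(p->samples);
            free(p);
        }
    }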

    Read the article
