Search Results

Search found 75 results on 3 pages for 'deviation'.


Is it normal for a company to have programmers on such a rigid schedule?

- by q303

So I've been working at this job for a couple of months. I'm a little frustrated because I do my best work from 2 to 7. In previous jobs, I've come in at 9:30-10:00 and left at 7. Some companies have been okay with this, others have not. But my current company insists on my being there at 8:30. Any deviation from this is a big deal. Is this typical? I have colleagues who are more 9:30 to 6:30, 10:00-7:00 guys... but maybe that is just startup culture? Given that I don't meet clients, etc., I don't see what the advantage of having things be so rigid could be. I also don't see why, if there is sometimes a 15 to 20 minute variation in when I come in, people don't just assume that I will adjust when I leave... Are these unreasonable expectations as a developer, or am I missing something?

Verification as QA - makes sense?

- by user970696

Preparing my thesis, I found another interesting discrepancy. While some books say verification in terms of static analysis of work products is quality control (looking for defects), others say it is actually quality assurance, because the process of checking decreases the probability of real defects when these deliverables are used for product manufacture. I hesitate, as both seem to be correct: it is a way of checking for defects (deviation from requirements, design flaws, etc.), so it looks like quality control, but it is also a process which does not have to be done and which, if done, can yield better quality.

When profiling a function for time use, what information is desirable?

- by AaronMcSmooth

I'm writing a program similar to Python's timeit module. The idea is to time a function by executing it anywhere from 10 to 100,000 times depending on how long it takes and then report results. I've read that the most important number is the minimum execution time because this is the number that best reflects how fast the machine can run the code in question in the absence of other programs competing for processor time and memory. This argument makes sense to me. Would you be happy with this? Would you want to know the average time or the standard deviation? Is there some other measure that you consider more important?
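As a rough illustration of the numbers being discussed, here is a minimal sketch that reports the minimum alongside the mean, median and standard deviation. It assumes Python's standard `timeit` and `statistics` modules; the `profile` function and its `repeats`/`number` defaults are hypothetical names chosen for this example, not part of the asker's program.

```python
import statistics
import timeit


def profile(func, repeats=5, number=1000):
    """Time func and summarize the per-call durations in seconds."""
    # Each entry of `totals` is the total time for `number` calls.
    totals = timeit.repeat(func, repeat=repeats, number=number)
    per_call = [total / number for total in totals]
    return {
        "min": min(per_call),
        "mean": statistics.mean(per_call),
        "median": statistics.median(per_call),
        "stdev": statistics.stdev(per_call) if len(per_call) > 1 else 0.0,
    }


report = profile(lambda: sum(range(100)), repeats=5, number=1000)
for name, seconds in report.items():
    print(f"{name:>6}: {seconds * 1e6:.3f} us per call")
```

Reporting the minimum first and the spread second keeps the "best case" number prominent while still showing how noisy the machine was during the run.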

Validation and Verification explanation (Boehm) - I cannot understand its point

- by user970696

Hopefully my last thread about V&V, as I found B. Boehm's text, which I just do not understand well (likely my technical English is not that good). http://csse.usc.edu/csse/TECHRPTS/1979/usccse79-501/usccse79-501.pdf Basically he says that verification is about checking that products derived from the requirements baseline correspond to it, and that deviation leads only to changes in these derived products (design, code). But he says it begins with design and ends with acceptance tests (you can check the V model inside). The thing is, I have accepted ISO12207 in terms of all testing being validation, yet it does not make any sense here. In order to be sure the product complies with requirements (acceptance test) I need to test it. Also it says that validation problems mean that the requirements are bad and need to be changed, which does not happen with the testing that testers do, who just check correspondence with requirements.

Is there a C# library that will perform the Excel NORMINV function?

- by Portman

I'm running some Monte Carlo simulations and making extensive use of the Excel function NORM.INV using Office Interop. This function takes three arguments (probability, average, standard deviation) and returns the inverse of the cumulative distribution. I'd like to move my code into a web app, but that will require installing Excel on the server. Does anybody know of a C# statistics library that has an equivalent function to NORM.INV?
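To make concrete what an equivalent function has to compute, here is a small sketch in Python rather than C#. It is not a library recommendation; it only assumes the standard library's `statistics.NormalDist` (Python 3.8+), and `norm_inv` is a hypothetical wrapper name that mirrors the three-argument order described above.

```python
from statistics import NormalDist


def norm_inv(probability, mean, standard_dev):
    """Inverse of the normal cumulative distribution (0 < probability < 1)."""
    return NormalDist(mu=mean, sigma=standard_dev).inv_cdf(probability)


# The median of a normal distribution is its mean.
print(norm_inv(0.5, 10.0, 2.0))
# For the standard normal, the 97.5th percentile is close to 1.96.
print(norm_inv(0.975, 0.0, 1.0))
```

Any C# candidate can be sanity-checked against the same two values before it replaces the Excel call in a simulation.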

Fast Remote PHP Technique To Detect Image 404

- by Volomike

What PHP script technique runs the fastest in detecting if a remote image does not exist before I include the image? I mean, I don't want to download all the bytes of the remote image -- just enough to detect if it exists. And while on the subject but with just a slight deviation, I'd like to download just enough bytes to determine a JPEG's width and height information. Speed is very important in my concern here on this system design I'm working on.
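The two ideas in the question (check existence without downloading the body, and read only enough bytes to find a JPEG's dimensions) can be sketched as follows. This is Python, not PHP, and is only an illustration: `remote_exists` and `jpeg_size` are hypothetical names, the HEAD request assumes the remote server answers HEAD correctly, and the parser assumes a well-formed JPEG whose size-bearing SOF segment falls inside the bytes supplied.

```python
import struct
import urllib.error
import urllib.request

# Start-of-frame markers carry the image size; C4, C8 and CC are not SOF.
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def remote_exists(url, timeout=5.0):
    """Return True if a HEAD request succeeds; no image bytes are downloaded."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except urllib.error.URLError:  # includes HTTPError such as 404
        return False


def jpeg_size(data):
    """Return (width, height) from the leading bytes of a JPEG, else None."""
    if data[:2] != b"\xff\xd8":  # every JPEG starts with the SOI marker
        return None
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return None
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker
            pos += 1
            continue
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            if pos + 9 > len(data):
                return None  # not enough bytes fetched yet
            height, width = struct.unpack(">HH", data[pos + 5:pos + 9])
            return width, height
        pos += 2 + length  # skip marker plus segment payload
    return None


# A hand-built header: SOI, a 16-byte APP0 segment, then SOF0 for 640x480.
sample = (b"\xff\xd8" + b"\xff\xe0" + struct.pack(">H", 16) + b"\x00" * 14
          + b"\xff\xc0" + struct.pack(">H", 17) + b"\x08"
          + struct.pack(">HH", 480, 640))
print(jpeg_size(sample))
```

In practice the bytes passed to `jpeg_size` would come from a partial download (for example an HTTP Range request for the first few kilobytes), retried with a larger range when the function returns None.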

probability and relative frequency

- by Alexandru

If I use relative frequency to estimate the probability of an event, how good is my estimate based on the number of experiments? Is standard deviation a good measure? A paper/link/online book would be perfect. http://en.wikipedia.org/wiki/Frequentist
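One common way to quantify this is the standard error of a sample proportion, sqrt(p(1-p)/n), which shrinks with the square root of the number of experiments. The sketch below assumes independent trials and uses the normal approximation for a rough 95% interval; `estimate_with_error` is a hypothetical helper name.

```python
import math


def estimate_with_error(successes, trials, z=1.96):
    """Relative frequency, its standard error, and an approximate interval."""
    p_hat = successes / trials
    std_err = math.sqrt(p_hat * (1.0 - p_hat) / trials)
    return p_hat, std_err, (p_hat - z * std_err, p_hat + z * std_err)


p_hat, std_err, interval = estimate_with_error(50, 100)
print(p_hat, std_err, interval)

# Four times as many experiments only halves the standard error.
print(estimate_with_error(200, 400)[1])
```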

How to interpret weka classification?

- by gargi2010

How can we interpret the classification result in Weka using naive Bayes? How are the mean, standard deviation, weight sum and precision calculated? How are the kappa statistic, mean absolute error, root mean squared error, etc. calculated? What is the interpretation of the confusion matrix?
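For the confusion matrix and kappa parts of the question, a small sketch may help. It uses the textbook definition of Cohen's kappa (observed agreement corrected for chance agreement); that the tool's reported value follows exactly this definition is an assumption here. Rows are taken to be actual classes and columns predicted classes, and `accuracy_and_kappa` is a hypothetical name.

```python
def accuracy_and_kappa(matrix):
    """Return (accuracy, kappa) for a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    # Observed agreement: the diagonal holds correctly classified instances.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


confusion = [
    [40, 10],  # actual class A: 40 predicted A, 10 predicted B
    [5, 45],   # actual class B: 5 predicted A, 45 predicted B
]
accuracy, kappa = accuracy_and_kappa(confusion)
print(accuracy, kappa)
```

Here 85 of 100 instances are on the diagonal, chance agreement is 0.5, and so kappa works out to 0.35 / 0.5 = 0.7.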

External File Upload Optimizations for Windows Azure

- by rgillen

[Cross posted from here: http://rob.gillenfamily.net/post/External-File-Upload-Optimizations-for-Windows-Azure.aspx]

I’m wrapping up a bit of the work we’ve been doing on data movement optimizations for cloud computing and the latest set of data yielded some interesting points I thought I’d share. The work done here is not really rocket science but may, in some ways, be slightly counter-intuitive and therefore seemed worthy of posting.

Summary: for those who don’t like to read detailed posts or don’t have time, the synopsis is that if you are uploading data to Azure, block your data (even down to 1MB) and upload in parallel. Set your block size based on your source file size, but if you must choose a fixed value, use 1MB. Following the above will result in significant performance gains… upwards of 10x-24x and a reduction in overall file transfer time of upwards of 90% (e.g., uploading a 1GB file averaged 46.37 minutes prior to optimizations and averaged 1.86 minutes afterwards).

Detail: For those of you who want more detail, or think that the claims at the end of the preceding paragraph are over-reaching, what follows is information and code supporting these claims. As the title would indicate, these tests were run from our research facility pointing to the Azure cloud (specifically US North Central as it is physically closest to us) and do not represent intra-cloud results. We have performed intra-cloud tests, and the overall results are similar in notion, but the data rates are significantly different, as are the tipping points for the various block sizes; this will be detailed separately.

We started by building a very simple console application that would loop through a directory and upload each file to Azure storage. This application used the shipping storage client library from the 1.1 version of the Azure tools.
The only real variation from the client library is that we added code to collect and record the duration (in ms) and size (in bytes) for each file transferred. The code is available here.

We then created a directory that had a collection of files for the following sizes: 2KB, 32KB, 64KB, 128KB, 512KB, 1MB, 5MB, 10MB, 25MB, 50MB, 100MB, 250MB, 500MB, 750MB, and 1GB (50 files for each size listed). These files contained randomly-generated binary data and do not benefit from compression (a separate discussion topic). Our file generation tool is available here.

The baseline was established by running the application described above against the directory containing all of the data files. This application uploads the files in a random order so as to avoid transferring all of the files of a given size sequentially, thereby spreading the effects of periodic Internet delays across the collection of results. We then ran some scripts to split the resulting data and generate some reports. The raw data collected for our non-optimized tests is available via the links in the Related Resources section at the bottom of this post.

For each file size, we calculated the average upload time (and standard deviation) and the average transfer rate (and standard deviation). As you likely are aware, transferring data across the Internet is susceptible to many transient delays which can cause anomalies in the resulting data. It is for this reason that we randomized the order of source file processing as well as executed the tests 50x for each file size. We expect that these steps will yield a sufficiently balanced set of results.

Once the baseline was collected and analyzed, we updated the test harness application with some methods to split the source file into user-defined block sizes and then to upload those blocks in parallel (using the PutBlock() method of Azure storage).
The parallelization was handled by simply relying on the Parallel Extensions to .NET to provide a Parallel.For loop (see linked source for specific implementation details in Program.cs, line 173 and following… less than 100 lines total). Once all of the blocks were uploaded, we called PutBlockList() to assemble/commit the file in Azure storage. For each block transferred, the MD5 was calculated and sent, ensuring that the bits that arrived matched what was intended. The timer for the blocked/parallelized transfer method wraps the entire process (source file splitting, block transfer, MD5 validation, file committal). A diagram of the process is as follows:

[Diagram of the process]

We then tested the effects of blocking & parallelizing the transfers by running the updated application against the same source set and did a parameter sweep on the block size including 256KB, 512KB, 1MB, 2MB, and 4MB (our assumption was that anything lower than 256KB wasn’t worth the trouble and 4MB is the maximum size of a block supported by Azure). The raw data for the parallel tests is available via the links in the Related Resources section at the bottom of this post. This data was processed and then compared against the single-threaded / non-optimized transfer numbers and the results were encouraging. The Excel version of the results is available here.

Two semi-obvious points need to be made prior to reviewing the data. The first is that if the block size is larger than the source file size you will end up with a “negative optimization” due to the overhead of attempting to block and parallelize. The second is that as the files get smaller, the clock-time cost of blocking and parallelizing (overhead) is more apparent and can tend towards negative optimizations. For this reason (and as supported by the raw data provided in the linked worksheet) the charts and dialog below ignore source file sizes less than 1MB.
[Chart]

The chart above illustrates some interesting points about the results:

- When the block size is smaller than the source file, performance increases, but as the block size approaches and then passes the source file size, you see decreasing benefit to the point of negative gains (see the values for the 1MB file size).
- For some of the moderately-sized source files, small blocks (256KB) are best.
- As the size of the source file gets larger (see values for 50MB and up), the smallest block size is not the most efficient (presumably due, at least in part, to the increased number of blocks, increased number of individual transfer requests, and reassembly/committal costs).
- Once you pass the 250MB source file size, the difference in rate for 1MB to 4MB blocks is more-or-less constant.
- The 1MB block size gives the best average improvement (~16x) but the optimal approach would be to vary the block size based on the size of the source file.

[Chart]

The above is another view of the same data as the prior chart, just with the axis changed (x-axis represents file size and plotted data shows improvement by block size). It again highlights the fact that the 1MB block size is probably the best overall size but highlights the benefits of some of the other block sizes at different source file sizes.

[Chart]

This last chart shows the change in total duration of the file uploads based on different block sizes for the source file sizes. Nothing really new here other than that this view of the data highlights the negative effects of poorly choosing a block size for smaller files.

Summary

What we have found so far is that blocking your file uploads and uploading them in parallel results in significant performance improvements.
Further, utilizing extension methods and the Task Parallel Library (.NET 4.0) makes short work of altering the shipping client library to provide this functionality while minimizing the amount of change to existing applications that might be using the client library for other interactions.

Related Resources

- Source code for upload test application
- Source code for random file generator
- OData feeds of raw data from non-optimized transfer tests
  - Experiment Metadata
  - Experiment Datasets
  - 2KB Uploads, 32KB Uploads, 64KB Uploads, 128KB Uploads, 256KB Uploads, 512KB Uploads, 1MB Uploads, 5MB Uploads, 10MB Uploads, 25MB Uploads, 50MB Uploads, 100MB Uploads, 250MB Uploads, 500MB Uploads, 750MB Uploads, 1GB Uploads
  - Raw Data
- OData feeds of raw data from blocked/parallelized transfer tests
  - Experiment Metadata
  - Experiment Datasets
  - Raw Data
  - 256KB Blocks, 512KB Blocks, 1MB Blocks, 2MB Blocks, 4MB Blocks
- Excel worksheet showing summarizations and comparisons
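The split / upload in parallel / commit flow described in the post can be sketched roughly as follows. This is Python rather than the post's .NET code and it does not call Azure: `FakeBlobStore` is a hypothetical in-memory stand-in whose `put_block` and `put_block_list` methods only imitate the roles of PutBlock() and PutBlockList(), and the block ID scheme and worker count are assumptions made for illustration.

```python
import base64
import hashlib
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 1024 * 1024  # the 1MB block size the post recommends as a default


class FakeBlobStore:
    """In-memory stand-in for a block blob service (not a real client)."""

    def __init__(self):
        self.staged = {}
        self.committed = b""

    def put_block(self, block_id, data, md5_digest):
        # Reject the block if the bytes that arrived differ from what was sent.
        if hashlib.md5(data).digest() != md5_digest:
            raise ValueError("MD5 mismatch for block " + block_id)
        self.staged[block_id] = data

    def put_block_list(self, block_ids):
        # Assemble the staged blocks, in the given order, into the final blob.
        self.committed = b"".join(self.staged[bid] for bid in block_ids)


def split_blocks(payload, block_size=BLOCK_SIZE):
    """Yield (block_id, bytes) pairs covering the payload in order."""
    for index, offset in enumerate(range(0, len(payload), block_size)):
        block_id = base64.b64encode(f"{index:08d}".encode()).decode()
        yield block_id, payload[offset:offset + block_size]


def upload_in_parallel(store, payload, block_size=BLOCK_SIZE, workers=8):
    blocks = list(split_blocks(payload, block_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(store.put_block, bid, data, hashlib.md5(data).digest())
            for bid, data in blocks
        ]
        for future in futures:
            future.result()  # re-raise any upload error
    store.put_block_list([bid for bid, _ in blocks])


payload = bytes(range(256)) * 10000  # about 2.4MB of sample data
store = FakeBlobStore()
upload_in_parallel(store, payload)
print(len(store.staged), store.committed == payload)
```

Because the commit step takes the ordered list of block IDs, the blocks themselves can finish uploading in any order, which is what makes the parallel loop safe.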

syslogd: Logfile format (not configuration format)

- by chris_l

Hi, I'd like to parse logfiles. Is the logfile format of syslogd the same for all systems? On my system (Debian Lenny), it's: Mar 7 04:22:40 my-host-name ... (I'm not much interested in the ... part). Can I rely on this? And is there maybe some more-or-less official description? The manpage of syslogd describes the config format, but not the logfile format. Ideally, the description would give the fields official names like (date, time, host, entry) or (datetime, hostname, message), and maybe additionally some regular expressions. I'd like to use the names and regexes in my script, to avoid an unnecessary deviation from the standard, and to make sure that the script runs everywhere. Thanks, Chris
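For the layout shown in the question, a regular expression along these lines would pull out the fields. It is a sketch only: the group names are borrowed from the names suggested above rather than from any official specification, and it assumes the traditional "Mmm dd hh:mm:ss host message" layout, which a differently configured syslog daemon may not write.

```python
import re

# Traditional layout: month, day, time, hostname, then the free-form message.
SYSLOG_LINE = re.compile(
    r"^(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<hostname>\S+)\s+"
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = SYSLOG_LINE.match(line)
    return match.groupdict() if match else None


entry = parse_line("Mar  7 04:22:40 my-host-name cron[123]: job started")
print(entry)
```

Returning None for non-matching lines lets a script count and report lines that deviate from the expected layout instead of silently misparsing them.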

The meaning of thermal throttle counters and package power limit notifications in Linux

- by Trustin Lee

Whenever I do some performance testing on my Linux-installed MacBook Pro, I often see the following messages in dmesg:

    Aug 8 09:29:31 infinity kernel: [79791.789404] CPU1: Package power limit notification (total events = 40365)
    Aug 8 09:29:31 infinity kernel: [79791.789408] CPU3: Package power limit notification (total events = 40367)
    Aug 8 09:29:31 infinity kernel: [79791.789411] CPU2: Package power limit notification (total events = 40453)
    Aug 8 09:29:31 infinity kernel: [79791.789414] CPU0: Package power limit notification (total events = 40453)

I also see the throttle counters in sysfs increase over time:

    trustin@infinity:/sys/devices/system/cpu/cpu0/thermal_throttle $ ls
    core_power_limit_count package_power_limit_count core_throttle_count package_throttle_count
    $ cat core_power_limit_count
    0
    $ cat core_throttle_count
    41912
    $ cat package_power_limit_count
    67945
    $ cat package_throttle_count
    67565

What do these counters mean? Do they affect the performance of the CPU or system? Do they result in increased deviation of performance numbers (i.e. do they prevent me from getting reliable performance numbers)? If so, how do I avoid these messages and increasing counters? Would running the performance tests on a well-cooled desktop system help?
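One way to tell whether a particular benchmark run was affected is to snapshot the counters before and after it and compare. The sketch below does not explain what the counters mean; it only assumes that each `*_count` file holds a single integer, as in the output above. `read_counters` and `counter_deltas` are hypothetical helper names, and the demo uses a temporary directory so it runs anywhere.

```python
import tempfile
from pathlib import Path


def read_counters(directory):
    """Read every *_count file in the directory into a name -> int mapping."""
    counters = {}
    for path in sorted(Path(directory).glob("*_count")):
        counters[path.name] = int(path.read_text().strip())
    return counters


def counter_deltas(before, after):
    """How much each counter grew between two snapshots."""
    return {name: after[name] - before.get(name, 0) for name in after}


# Demo against a temporary directory; on the machine in question the
# directory would be /sys/devices/system/cpu/cpu0/thermal_throttle.
with tempfile.TemporaryDirectory() as demo_dir:
    counter_file = Path(demo_dir, "core_throttle_count")
    counter_file.write_text("41912\n")
    before = read_counters(demo_dir)
    counter_file.write_text("41950\n")  # pretend a benchmark ran here
    after = read_counters(demo_dir)

deltas = counter_deltas(before, after)
print(deltas)
```

A non-zero delta for a run is a hint that the timing from that run may deserve less trust than one where the counters stayed flat.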

Reverse Engineer Formula

- by aaronls

Are there any free programs or web services for reverse engineering a formula given a set of inputs and outputs? Consider if I had 3 columns of data, where the first two numbers are inputs and the last one is an output:

3,4,7
1,4,5
4,2,6

The outputs could be produced with simply a+b, but of course there could be many formulas that would give the same result. I am talking about data without any error or deviation, and I think the formula would only need basic operations (divide, multiply, add, subtract) and possibly one of floor/ceiling/round.
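For a search space this small, brute force is feasible. The sketch below is not an existing tool; it is a hypothetical illustration that tries each basic two-input operation, optionally wrapped in floor, ceil or round, and keeps every candidate that reproduces all rows exactly.

```python
import math

# Candidate two-input expressions built from the four basic operations.
BASE = {
    "a + b": lambda a, b: a + b,
    "a - b": lambda a, b: a - b,
    "b - a": lambda a, b: b - a,
    "a * b": lambda a, b: a * b,
    "a / b": lambda a, b: a / b,
    "b / a": lambda a, b: b / a,
}

# Optional wrappers; note that Python's round() rounds halves to even.
ROUNDING = {
    "": lambda value: value,
    "floor": math.floor,
    "ceil": math.ceil,
    "round": round,
}


def find_formulas(rows):
    """Return every candidate that reproduces the output of every row."""
    matches = []
    for expr, func in BASE.items():
        for name, wrap in ROUNDING.items():
            try:
                ok = all(wrap(func(a, b)) == out for a, b, out in rows)
            except ZeroDivisionError:
                ok = False
            if ok:
                matches.append(f"{name}({expr})" if name else expr)
    return matches


rows = [(3, 4, 7), (1, 4, 5), (4, 2, 6)]
matches = find_formulas(rows)
print(matches)
```

As the question anticipates, several formulas fit the same data: on integer inputs the rounded variants of a + b match too, so more rows (ideally with fractional values) are needed to tell them apart.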

Lightweight alternative to R for RHEL?

- by Eric Rath

I want to use R for some statistical analysis of logfile information, but found that even the "limited" R-core RPM has a lot of dependencies not already installed. I don't want to install so many packages for a peripheral need. Are there lightweight alternatives for simple statistical analysis on RHEL 6? I have an R script that accepts on stdin a large set of values -- one value per line -- and prints out the min, max, mean, median, 95th percentile, and standard deviation. For more context, I'm using grep and awk to find GET requests for a particular path in our webserver log files, get the response times, and calculate the metrics listed above in order to measure the impact on performance of changes to a web application. I don't need any graphing capabilities, just simple computation. Is there something I've overlooked?
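The computation itself is small enough to sketch without a statistics package. The following assumes a Python interpreter is available on the host and uses only the `math` module; `summarize` is a hypothetical name. Two choices are assumptions worth checking against the existing R script: the 95th percentile uses the nearest-rank method (R's default `quantile` interpolates instead), and the standard deviation uses the n - 1 divisor as R's `sd()` does.

```python
import math


def summarize(values):
    """Min, max, mean, median, 95th percentile and sample standard deviation."""
    data = sorted(values)
    n = len(data)
    mean = sum(data) / n
    mid = n // 2
    median = data[mid] if n % 2 else (data[mid - 1] + data[mid]) / 2
    # Nearest-rank percentile: smallest value with at least 95% of data at or below it.
    p95 = data[max(0, math.ceil(0.95 * n) - 1)]
    # Sample variance (n - 1 divisor); defined as 0 for a single value.
    variance = sum((x - mean) ** 2 for x in data) / (n - 1) if n > 1 else 0.0
    return {
        "min": data[0],
        "max": data[-1],
        "mean": mean,
        "median": median,
        "p95": p95,
        "stdev": math.sqrt(variance),
    }


summary = summarize([1, 2, 3, 4, 5])
print(summary)

# Usage sketch for one value per line on stdin:
#   import sys
#   print(summarize([float(line) for line in sys.stdin if line.strip()]))
```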

Getting timing consistency in Linux

- by Jim Hunziker

I can't seem to get a simple program (with lots of memory access) to achieve consistent timing in Linux. I'm using a 2.6 kernel, and the program is being run on a dual-core processor with realtime priority. I'm trying to disable cache effects by declaring the memory arrays as volatile. Below are the results and the program. What are some possible sources of the outliers?

Results:

    Number of trials: 100
    Range: 0.021732s to 0.085596s
    Average Time: 0.058094s
    Standard Deviation: 0.006944s
    Extreme Outliers (2 SDs away from mean): 7
    Average Time, excluding extreme outliers: 0.059273s

Program:

    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>
    #include <sched.h>
    #include <sys/time.h>
    #include <unistd.h>   /* getpid() */

    #define NUM_POINTS 5000000
    #define REPS 100

    /* Current wall-clock time in microseconds. */
    unsigned long long getTimestamp(void)
    {
        unsigned long long usecCount;
        struct timeval timeVal;

        gettimeofday(&timeVal, 0);
        usecCount = timeVal.tv_sec * (unsigned long long) 1000000;
        usecCount += timeVal.tv_usec;
        return usecCount;
    }

    double convertTimestampToSecs(unsigned long long timestamp)
    {
        return timestamp / (double) 1000000;
    }

    int main(void)
    {
        unsigned long long start, stop;
        double times[REPS];
        double sum = 0;
        double avg, newavg;
        double stddev = 0;
        double maxval = -1.0, minval = 1000000.0;
        int i, j;
        int outliers = 0;
        struct sched_param sparam;

        /* Request the highest SCHED_FIFO priority; report it if that fails. */
        sched_getparam(getpid(), &sparam);
        sparam.sched_priority = sched_get_priority_max(SCHED_FIFO);
        if (sched_setscheduler(getpid(), SCHED_FIFO, &sparam) != 0)
            perror("sched_setscheduler");

        volatile float* data = calloc(NUM_POINTS, sizeof(float));
        volatile float* results = calloc(NUM_POINTS, sizeof(float));

        /* Time REPS passes of a straight copy from one array to the other. */
        for (i = 0; i < REPS; ++i) {
            start = getTimestamp();
            for (j = 0; j < NUM_POINTS; ++j) {
                results[j] = data[j];
            }
            stop = getTimestamp();
            times[i] = convertTimestampToSecs(stop - start);
        }
        free((void*) data);
        free((void*) results);

        for (i = 0; i < REPS; i++) {
            sum += times[i];
            if (times[i] > maxval) maxval = times[i];
            if (times[i] < minval) minval = times[i];
        }
        avg = sum / REPS;

        /* Population standard deviation of the trial times. */
        for (i = 0; i < REPS; i++)
            stddev += (times[i] - avg) * (times[i] - avg);
        stddev /= REPS;
        stddev = sqrt(stddev);

        /* Exclude trials more than 2 standard deviations from the mean. */
        for (i = 0; i < REPS; i++) {
            if (times[i] > avg + 2 * stddev || times[i] < avg - 2 * stddev) {
                sum -= times[i];
                outliers++;
            }
        }
        newavg = sum / (REPS - outliers);

        printf("Number of trials: %d\n", REPS);
        printf("Range: %fs to %fs\n", minval, maxval);
        printf("Average Time: %fs\n", avg);
        printf("Standard Deviation: %fs\n", stddev);
        printf("Extreme Outliers (2 SDs away from mean): %d\n", outliers);
        printf("Average Time, excluding extreme outliers: %fs\n", newavg);
        return 0;
    }

Investigation: Can different combinations of components affect Dataflow performance?

- by jamiet

Introduction

The Dataflow task is one of the core components (if not the core component) of SQL Server Integration Services (SSIS) and often the most misunderstood. This is not surprising; it's an incredibly complicated beast and we’re abstracted away from that complexity via some boxes that go yellow, red or green and that have some lines drawn between them.

[Image: Example dataflow]

In this blog post I intend to look under that facade and get into some of the nuts and bolts of the Dataflow Task by investigating how the decisions we make when building our packages can affect performance. I will do this by comparing the performance of three dataflows that all have the same input, all produce the same output, but which all operate slightly differently by way of having different transformation components.

I also want to use this blog post to challenge a commonly held opinion that I see perpetuated over and over again on the SSIS forum. That is, that people assume adding components to a dataflow will be detrimental to overall performance. It's not surprising that people think this (it is intuitive to think that more components means more work); however, this is not a view that I share. I have always been of the opinion that there are many factors affecting dataflow duration and the number of components is actually one of the less important ones; having said that, I have never proven that assertion and that is one reason for this investigation. I have actually seen evidence that some people think dataflow duration is simply a function of number of rows and number of components. I’ll happily call that one out as a myth even without any investigation!
The Setup

I have a 2GB datafile which is a list of 4731904 (~4.7million) customer records with various attributes against them and it contains 2 columns that I am going to use for categorisation:

- [YearlyIncome]
- [BirthDate]

The data file is an SSIS raw format file, which I chose to use because it is the quickest way of getting data into a dataflow and, given that I am testing the transformations, not the source or destination adapters, I want to minimise external influences as much as possible. In the test I will split the customers according to month of birth (12 of those) and whether or not their yearly income is above or below 50000 (2 of those); in other words I will be splitting them into 24 discrete categories, and in order to do it I shall be using different combinations of SSIS’ Conditional Split and Derived Column transformation components. The 24 datapaths that occur will each input to a rowcount component, again because this is the least resource intensive means of terminating a datapath.

The test is being carried out on a Dell XPS Studio laptop with a quad core (8 logical procs) Intel Core i7 at 1.73GHz and a Samsung SSD hard drive. It's running SQL Server 2008 R2 on Windows 7.

The Variables

Here are the three combinations of components that I am going to test:

1. One Conditional Split - A single Conditional Split component CSPL Split by Month of Birth and income category that will use expressions on [YearlyIncome] & [BirthDate] to send each row to one of 24 outputs. [Screenshot: expression logic in use]

2. Derived Column & Conditional Split - A Derived Column component DER Income Category that adds a new column [IncomeCategory] which will contain one of two possible text values {“LessThan50000”,“GreaterThan50000”} and uses [YearlyIncome] to determine which value each row should get.
A Conditional Split component CSPL Split by Month of Birth and Income Category then uses that new column in conjunction with [BirthDate] to determine which of the same 24 outputs to send each row to. Put more simply, I am separating the Conditional Split of #1 into a Derived Column and a Conditional Split. [Screenshots: expression logic in use for DER Income Category and CSPL Split by Month of Birth and Income Category]

3. Three Conditional Splits - A Conditional Split component that produces two outputs based on [YearlyIncome], one for each Income Category. Each of those outputs will go to a further Conditional Split that splits the input into 12 outputs, one for each month of birth (identical logic in each). In this case then I am separating the single Conditional Split of #1 into three Conditional Split components. [Screenshots: expression logic in use for CSPL Split by Income Category and CSPL Split by Month of Birth 1 & 2]

Each of these combinations will provide an input to one of the 24 rowcount components, just the same as before. For illustration here is a screenshot of the dataflow containing three Conditional Split components:

[Screenshot: dataflow containing three Conditional Split components]

As you can see, these dataflows have a fair bit of work to do, and remember that they’re doing that work for 4.7million rows. I will execute each dataflow 10 times and use the average for comparison. I foresee three possible outcomes:

- The dataflow containing just one Conditional Split (i.e. #1) will be quicker
- There is no significant difference between any of them
- One of the two dataflows containing multiple transformation components will be quicker

Regardless of which of those outcomes comes to pass, we will have learnt something and that makes this an interesting test to carry out. Note that I will be executing the dataflows using dtexec.exe rather than hitting F5 within BIDS.

The Results and Analysis

The table below shows all of the executions, 10 for each dataflow.
It also shows the average for each along with a standard deviation. All durations are in seconds. I’m pasting a screenshot because I frankly can’t be bothered with the faffing about needed to make a presentable HTML table.

[Screenshot: table of execution durations]

It is plain to see from the average that the dataflow containing three conditional splits is significantly faster, the other two taking 43% and 52% longer respectively. This seems strange though, right? Why does the dataflow containing the most components outperform the other two by such a big margin? The answer is actually quite logical when you put some thought into it and I’ll explain that below.

Before progressing, a side note. The standard deviation for the “Three Conditional Splits” dataflow is orders of magnitude smaller, indicating that performance for this dataflow can be predicted with much greater confidence too.

The Explanation

I refer you to the screenshot above that shows how CSPL Split by Month of Birth and income category in the first dataflow is set up. Observe that there is a case for each combination of Month of Birth and Income Category, 24 in total. These expressions get evaluated in the order that they appear, and hence if we assume that Month of Birth and Income Category are uniformly distributed in the dataset we can deduce that the expected number of expression evaluations for each row is 12.5, i.e. (1 (the minimum) + 24 (the maximum)) divided by 2 = 12.5.

Now take a look at the screenshots for the second dataflow. We are doing one expression evaluation in DER Income Category and we have the same 24 cases in CSPL Split by Month of Birth and Income Category as we had before, only the expression differs slightly. In this case then we have 1 + 12.5 = 13.5 expected evaluations for each row; that would account for the slightly longer average execution time for this dataflow.

Now onto the third dataflow, the quick one.
CSPL Split by Income Category does a maximum of 2 expression evaluations, thus the expected number of evaluations per row is 1.5. CSPL Split by Month of Birth 1 & CSPL Split by Month of Birth 2 both have less work to do than the previous Conditional Split components because they only have 12 cases to test for, thus the expected number of expression evaluations is 6.5. There are two of them, so the total expected number of expression evaluations for this dataflow is 6.5 + 6.5 + 1.5 = 14.5.

14.5 is still more than 12.5 & 13.5 though, so why is the third dataflow so much quicker? Simple: the conditional expressions in the first two dataflows have two boolean predicates to evaluate, one for Income Category and one for Month of Birth; the expressions in the Conditional Splits in the third dataflow however only have one predicate, thus they are doing a lot less work. To sum up, the difference in execution times can be attributed to the difference between:

MONTH(BirthDate) == 1 && YearlyIncome <= 50000

and

MONTH(BirthDate) == 1

In the first two dataflows YearlyIncome <= 50000 gets evaluated an average of 12.5 times for every row whereas in the third dataflow it is evaluated once and once only. Multiply those 11.5 extra operations by 4.7million rows and you get a significant amount of extra CPU cycles; that’s where our duration difference comes from.

The Wrap-up

The obvious point here is that adding new components to a dataflow isn’t necessarily going to make it go any slower; moreover, you may be able to achieve significant improvements by splitting logic over multiple components rather than one. Performance tuning is all about reducing the amount of work that needs to be done and that doesn’t necessarily mean using fewer components; indeed, sometimes you may be able to reduce workload in ways that aren’t immediately obvious, as I think I have proven here. Of course there are many variables in play here and your mileage will most definitely vary.
I encourage you to download the package and see if you get similar results – let me know in the comments. The package contains all three dataflows plus a fourth dataflow that will create the 2GB raw file for you (you will also need the [AdventureWorksDW2008] sample database from which to source the data); simply disable all dataflows except the one you want to test before executing the package and remember, execute using dtexec, not within BIDS.

If you want to explore dataflow performance tuning in more detail then here are some links you might want to check out:

- Inequality joins, Asynchronous transformations and Lookups
- Destination Adapter Comparison
- Don’t turn the dataflow into a cursor
- SSIS Dataflow – Designing for performance (webinar)

Any comments? Let me know!

@Jamiet
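The arithmetic in "The Explanation" can be reproduced with a few lines. This sketch only restates the post's own assumption (cases are tested in order and rows are uniformly distributed across them); `expected_evaluations` is a hypothetical helper name. It also shows a per-row figure for the third dataflow, since any given row passes through only one of the two month-of-birth splits, whereas the post's 14.5 adds both.

```python
def expected_evaluations(cases):
    """Expected number of cases tested before a match, assuming uniform data."""
    return (1 + cases) / 2


# Dataflow 1: one Conditional Split with 24 cases.
one_split = expected_evaluations(24)

# Dataflow 2: one Derived Column expression, then the same 24 cases.
derived_plus_split = 1 + expected_evaluations(24)

# Dataflow 3, as totalled in the post: income split plus both month splits.
three_splits_post_total = (
    expected_evaluations(2) + expected_evaluations(12) + expected_evaluations(12)
)

# Dataflow 3, counted per row: income split plus the one month split the row reaches.
three_splits_per_row = expected_evaluations(2) + expected_evaluations(12)

print(one_split, derived_plus_split, three_splits_post_total, three_splits_per_row)
```

Counted per row, the third dataflow evaluates fewer expressions as well as simpler ones, which points in the same direction as the post's conclusion.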

Dynamic real-time pathfinding with C# and unity

- by Yakri

A buddy and I are working on a simple 2D top down arena combat game similar to OpenGLAD (grew up on ye olde GLADIATOR). Thing is, we want to make some substantial deviation from our source of inspiration, including completely destructible/changeable terrain. Like rivers that can be frozen, walls which can be knocked down, etc. As well as letting players and NPC's build new terrain objects, some of which cannot be moved through or seen through. So I'm tasked with creating the AI, starting with pathfinding. Because of all the changeable terrain, we need something that can check to see if the player/other NPC's are in line of sight, and which can then check to find current paths around existing terrain, without getting completely confused by new terrain popping up, and old terrain vanishing, and even capable of breaking through terrain. A lot of that will just be filling in the framework of the feature, but I really just don't know where to start. What I'm really looking for are relevant websites, books, articles, or keywords to google. I just can't quite find a direction to start in, because most pathfinding types we've googled up just won't give us even the most basic level of robustness we need.
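As one possible starting point, here is a minimal grid-based A* sketch that simply replans whenever the terrain changes. It is written in Python rather than C#/Unity to keep it short, and every name in it (`find_path`, the `blocked` set, the 5x5 demo map) is hypothetical. It assumes a 4-connected grid with uniform move cost, and it ignores line of sight and breaking through terrain.

```python
import heapq


def find_path(blocked, start, goal, width, height):
    """A* over a 4-connected grid; `blocked` is a set of impassable (x, y) cells."""

    def heuristic(cell):
        # Manhattan distance never overestimates on a 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(heuristic(start), 0, start)]
    came_from = {start: None}
    cost = {start: 0}
    while frontier:
        _, steps, current = heapq.heappop(frontier)
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        if steps > cost[current]:
            continue  # stale queue entry; a cheaper route was found already
        x, y = current
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (0 <= nxt[0] < width and 0 <= nxt[1] < height):
                continue
            if nxt in blocked:
                continue
            new_cost = steps + 1
            if new_cost < cost.get(nxt, float("inf")):
                cost[nxt] = new_cost
                came_from[nxt] = current
                heapq.heappush(frontier, (new_cost + heuristic(nxt), new_cost, nxt))
    return None  # no route with the current terrain


# A wall at x == 2 with a gap at the bottom row of a 5x5 map.
walls = {(2, y) for y in range(4)}
path_before = find_path(walls, (0, 0), (4, 0), 5, 5)

walls.discard((2, 0))  # a wall segment is knocked down
path_after = find_path(walls, (0, 0), (4, 0), 5, 5)

walls.update({(2, 0), (2, 4)})  # the gap and the breach are both built over
path_sealed = find_path(walls, (0, 0), (4, 0), 5, 5)

print(len(path_before), len(path_after), path_sealed)
```

Keeping the terrain in a plain set means destroying or building terrain is a single add or discard, and a unit can recompute its path only when a change touches its current route.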

Read the article
Ruby workflow in Windows

- by Rig

I've done some searching and haven't quite come across the answer I am looking for. I do not think this is a duplicate of this question. I believe Windows could be a suitable development environment based on the mix of answers in that question. I have been developing in Ruby (mostly Rails but not entirely) for about a year now for personal projects on a MacBook Pro; however, that machine has faced an untimely death and has been replaced with a nice Windows 7 machine. Ruby development felt almost natural on the Mac after doing some research and setting up the typical stack. My environment then included the standard (Linux-like) stuff built into OS X, TextWrangler, Git, RVM, et al. Not too much of a deviation from what the 'devotees' tend to assume. Now I am setting up my new Windows box for continuing that development. What would my development environment look like? Should I just cave and run Linux in a VM? Ideally I would develop natively in Windows. I am aware of the Windows Ruby installer. It seems decent, but it's not exactly as nice as RVM in the OS X/Linux world. Mercurial/Git are available, so I would assume they play into the stack. Does one develop entirely in Windows? Does one run a webserver in a Linux VM and use it as a test bed while developing in Windows? Do it all in a VM? What does the standard Windows Ruby developer environment look like and what is the workflow? What would a typical step-through be for adding a new feature to an ongoing project and what would the technology stack look like?

Read the article
How to prioritize related game entity components?

- by Paul Manta

I want to make a game where you have to run over a bunch of zombies with your car. When moving around, the zombies have a few things to take into consideration:

- When there's no player around they might just roam about randomly. And even when some other component dictates a specific direction, they should wobble to the left and right randomly (like drunk people). This implies a small, random deviation in their movement.
- They should avoid static obstacles. When they see they are headed towards a wall, they should reorient themselves.
- They should avoid the car. They should try to predict where the car will be based on its velocity and try to move out of the way.
- When they can, they should try to get near the player.

All these types of decisions they have to make seem like they should be implemented in different components. But how should I manage them? How can I give different components different weights that reflect the importance of each decision (in a given situation)? I would need some other component that acts as a manager, but do you have any tips on how I should implement it? Or maybe there's a better solution?...
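One way to picture the "manager" idea is weighted blending: each behaviour returns a desired direction plus a weight that depends on the situation, and a small function sums them. The following is only a sketch in Python under assumed names (`wander`, `seek_player`, `avoid_car`, `blend`) and made-up weights and a one-second lookahead; none of these come from the question.

```python
import math
import random

def normalize(v):
    length = math.hypot(v[0], v[1])
    return (v[0] / length, v[1] / length) if length > 0 else (0.0, 0.0)

def wander(zombie, world, rng):
    # Small random wobble: always active, but with a low weight.
    angle = rng.uniform(0.0, 2.0 * math.pi)
    return (math.cos(angle), math.sin(angle)), 0.2

def seek_player(zombie, world, rng):
    player = world["player"]
    return normalize((player[0] - zombie[0], player[1] - zombie[1])), 1.0

def avoid_car(zombie, world, rng):
    # Steer away from where the car is predicted to be in one second.
    car, vel = world["car"], world["car_velocity"]
    future = (car[0] + vel[0] * 1.0, car[1] + vel[1] * 1.0)
    away = (zombie[0] - future[0], zombie[1] - future[1])
    distance = math.hypot(away[0], away[1])
    weight = 5.0 / (1.0 + distance)  # the closer the car, the more it matters
    return normalize(away), weight

def blend(behaviours, zombie, world, rng):
    """Sum each behaviour's direction scaled by its weight, then normalize."""
    total_x = total_y = 0.0
    for behaviour in behaviours:
        (dx, dy), weight = behaviour(zombie, world, rng)
        total_x += dx * weight
        total_y += dy * weight
    return normalize((total_x, total_y))

world = {"player": (10.0, 0.0), "car": (-1.0, 0.0), "car_velocity": (0.0, 0.0)}
direction = blend([wander, seek_player, avoid_car], (0.0, 0.0), world, random.Random(0))
```

Because the weights are computed per call, a behaviour can effectively switch itself off (weight 0) or dominate (large weight) depending on the situation. A strict priority scheme, where the first behaviour with a non-zero result wins, is a common alternative when blending produces indecisive movement.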

Read the article
How are certain analytics metrics (time on site, etc.) usually distributed?

- by a barking spider

I'm not sure if I've come to the right place to ask this question, but I'm gathering some information for a research project. We're trying to design an experiment that'll heavily involve web analytics, and I'm trying to figure out some sensible values of mean +/- standard deviation for the following visitor-level metrics (e.g., visitor 1 spends 2 minutes on site and visitor 2 spends 1 minute, giving a mean of 1.5 +/- 0.71):

- time spent on site
- page views

If time allowed, we would put up the sites and gather the information ourselves, but we have a grant deadline coming up. I realize that even though the distributions of these quantities are probably going to be heavily skewed towards zero, we'll need some reasonable figures or estimates of these figures in order to do sample size calculations, etc. Anyway, I'm not sure where else I'd turn, and I certainly have had a difficult time finding these values in the prior literature. If someone could direct me to a paper with the right information, or if you have these figures on hand (perhaps taken directly from your logs!), that would be amazing, and I'd love to hear from you. Thanks in advance, and even though I'm not allowed to reveal too much, rest assured that this info'll be applied towards a good cause :)
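For reference, the two-visitor figure quoted above can be reproduced with Python's standard library. This is just the arithmetic from the example, using the sample standard deviation (dividing by n - 1); the variable names are illustrative.

```python
import statistics

minutes_on_site = [2.0, 1.0]  # visitor 1, visitor 2

mean = statistics.mean(minutes_on_site)
sd = statistics.stdev(minutes_on_site)  # sample SD: sqrt(((0.5)**2 + (0.5)**2) / (2 - 1))

summary = f"{mean:.2f} +/- {sd:.2f}"
print(summary)  # 1.50 +/- 0.71
```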

Read the article
iPhone progressive download audio player

- by joynes

Hi! I'm trying to implement a progressive download audio player for the iPhone, i.e. using HTTP and fixed-size MP3 files. I found the AudioStreamer project but it seems very complicated and works best with endless streams. I need to be able to find out the total length of audio files and I also need to be able to seek in the files. I found a hacked deviation from AudioStreamer but it doesn't seem to work very well for me. http://www.saygoodnight.com/?p=14 I'm wondering if there is a simpler way to achieve my goals or if there are some better working samples out there? I found the BASS library but not much documentation about it. /Br Johannes

Read the article
MATLAB fill area between lines

- by dustynrobots

I'm trying to do something similar to what's outlined in this post: MATLAB, Filling in the area between two sets of data, lines in one figure but running into a roadblock. I'm trying to shade the area of a graph that represents the mean +/- standard deviation. The variable definitions are a bit complicated but it boils down to this code, and when plotted without shading, I get the screenshot below:

    x = linspace(0, 100, 101)';
    mean = torqueRnormMean(:,1);
    meanPlusSTD = torqueRnormMean(:,1) + torqueRnormStd(:,1);
    meanMinusSTD = torqueRnormMean(:,1) - torqueRnormStd(:,1);
    plot(x, mean, 'k', 'LineWidth', 2)
    plot(x, meanPlusSTD, 'k--')
    plot(x, meanMinusSTD, 'k--')

But when I try to implement shading just on the lower half of the graph (between mean and meanMinusSTD) by adding the code below, I get a plot that looks like this:

    fill( [x fliplr(x)], [mean fliplr(meanMinusSTD)], 'y', 'LineStyle','--');

It's obviously not shading the correct area of the graph, and new near-horizontal lines are being created close to 0 that are messing with the shading. Any thoughts? I'm stumped.
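A likely cause, offered as an assumption since the data is not shown: `x` and the `(:,1)` slices are column vectors, so `fliplr` leaves them unchanged and `[x fliplr(x)]` builds a 101-by-2 matrix instead of one closed outline. In MATLAB the usual remedy is vertical concatenation with `flipud`, e.g. `fill([x; flipud(x)], [mean; flipud(meanMinusSTD)], 'y')`. The outline that `fill` needs is sketched below in Python with plain lists; `band_polygon` is a hypothetical helper name.

```python
def band_polygon(x, upper, lower):
    """Return (xs, ys) tracing `upper` left to right, then `lower` right to left."""
    xs = list(x) + list(reversed(x))
    ys = list(upper) + list(reversed(lower))
    return xs, ys

x = [0, 1, 2]
mean_curve = [1.0, 2.0, 3.0]
mean_minus_std = [0.5, 1.5, 2.5]

xs, ys = band_polygon(x, mean_curve, mean_minus_std)
# xs runs 0, 1, 2 and back 2, 1, 0, so the vertices form a single closed loop.
```

The key point is that both coordinate lists are one-dimensional and twice the original length: forward along one curve, then backward along the other.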

Read the article
Finding the Formula for a Curve

- by Mystagogue

Is there a program that will take "response curve" values from me, and provide a formula that approximates the response curve? It would be cool if such a program would take a numeric "percent correct" (perhaps with a standard deviation) so that it returns simplified formulas when laxity is permissible, and more precise (viz. complex) formulas when the curve needs to be approximated closely. My interest is to play with the response curve values and "laxity" factor until such a tool spits out a curve-fit formula simple enough that I know it will perform well during machine computations.

Read the article
Function for averages of tuples in a dictionary

- by Billy Mann

I have a (string, dictionary) pair in the form:

    ('the head', {'exploded': (3.5, 1.0), 'the': (5.0, 1.0), "puppy's": (9.0, 1.0), 'head': (6.0, 1.0)})

Each parenthesized pair is a tuple which corresponds to (score, standard deviation). I'm taking the average of just the first number in each tuple. I've tried this:

    def score(string, d):
        for word in d:
            (score, std) = d[word]
            d[word]=float(score),float(std)
            if word in string:
                word = string.lower()
                number = len(string)
                return sum([v[0] for v in d.values()]) / float(len(d))
        if len(string) == 0:
            return 0

When I run:

    print score('the head', {'exploded': (3.5, 1.0), 'the': (5.0, 1.0), "puppy's": (9.0, 1.0), 'head': (6.0, 1.0)})

I should get 5.5 but instead I'm getting 5.875. Can't figure out what in my function is not allowing me to get the correct answer.
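The 5.875 is the average over all four dictionary entries ((3.5 + 5.0 + 9.0 + 6.0) / 4), because the `return` line sums `d.values()` regardless of which words matched. A minimal sketch of one possible fix follows, written for Python 3 and assuming that "words in the string" means whitespace-separated tokens; the parameter names are changed only for readability.

```python
def score(text, scores):
    """Average the score of each dictionary word that appears in `text`."""
    words = text.lower().split()
    matched = [scores[w][0] for w in words if w in scores]
    if not matched:
        return 0.0  # empty string, or no known words
    return sum(matched) / len(matched)

table = {'exploded': (3.5, 1.0), 'the': (5.0, 1.0), "puppy's": (9.0, 1.0), 'head': (6.0, 1.0)}
result = score('the head', table)
print(result)  # 5.5
```

Splitting into tokens also avoids accidental substring matches, which `word in string` would allow (for example, 'he' would match inside 'the head').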

Read the article
Generating a random displacement on the unit sphere

- by becko

Given a unit vector n, I need to generate, as fast as possible, another random unit vector m. The deviation of m from n should be on the order of a positive parameter sigma, and the distribution of m on the unit sphere should be symmetrical around n. I have no specific requirements on the representation of unit vectors, so you can use spherical angles, Cartesian coordinates, or whatever turns out to be convenient. Also, there are no precise requirements on the probability distribution used, as long as it decays when m deviates more than sigma from n. I am working with GSL and C. I have come up with a somewhat convoluted method using Cartesian coordinates. I will post it later if it is useful, but I would like to see people's ideas.
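One simple approach that meets the stated requirements is to add an isotropic Gaussian perturbation of scale sigma to n and renormalize. The isotropic Gaussian is rotation-invariant, so the result is symmetric around n, and for small sigma the angular deviation is on the order of sigma. The sketch below is in Python for brevity rather than C; `perturb_unit_vector` is a hypothetical name, and in C the three normal draws could come from GSL's `gsl_ran_gaussian`.

```python
import math
import random

def perturb_unit_vector(n, sigma, rng=random):
    """Return a random unit vector near n: normalize(n + sigma * g), g ~ N(0, I)."""
    m = [n[i] + sigma * rng.gauss(0.0, 1.0) for i in range(3)]
    length = math.sqrt(sum(c * c for c in m))
    return [c / length for c in m]

rng = random.Random(1)
n = [0.0, 0.0, 1.0]
m = perturb_unit_vector(n, 0.01, rng)
cos_angle = sum(a * b for a, b in zip(m, n))  # close to 1 for small sigma
```

This costs three normal draws and one square root per sample. It is not a von Mises-Fisher distribution, which would be the textbook choice if an exact, named distribution on the sphere were required.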

Read the article
