Search Results

Search found 142 results on 6 pages for 'weighted'.

Page 2/6 | < Previous Page | 1 2 3 4 5 6 | Next Page >

How do I sum up weighted arrays in PHP?

- by christian studer

Hod do I multiply the values of a multi-dimensional array with weigths and sum up the results into a new array in PHP or in general? The boring way looks like this: $weights = array(0.25, 0.4, 0.2, 0.15); $values = array ( array(5,10,15), array(20,25,30), array(35,40,45), array(50,55,60) ); $result = array(); for($i = 0; $i < count($values[0]); ++$i) { $result[$i] = 0; foreach($weights as $index => $thisWeight) $result[$i] += $thisWeight * $values[$index][$i]; } Is there a more elegant solution?

Read the article
Does Python/Scipy have a firls( ) replacement (i.e. a weighted, least squares, FIR filter design)?

- by delicasso

I am porting code from Matlab to Python and am having trouble finding a replacement for the firls( ) routine. It is used for, least-squares linear-phase Finite Impulse Response (FIR) filter design. I looked at scipy.signal and nothing there looked like it would do the trick. Of course I was able to replace my remez and freqz algorithsm, so that's good. On one blog I found an algorithm that implemented this filter without weighting, but I need one with weights. Thanks, David

Read the article
How to build a Weighted Graph with Ruby's RGL or GRATR to perform Dijkstra's algorithm?

- by Andres

I would like to see an example of a Dijkastra search algorithm for a graph built using Ruby's RGL (http://rgl.rubyforge.org/) or GRATR (http://rubyforge.org/projects/gratr/). I know GRATR has Dijkastra support but I'm not really sure how to go about using it, any help would be appreciated.

Read the article
How to correctly do SQL UPDATE with weighted subselect?

- by luminarious

I am probably trying to accomplish too much in a single query, but have I an sqlite database with badly formatted recipes. This returns a sorted list of recipes with relevance added: SELECT *, sum(relevance) FROM ( SELECT *,1 AS relevance FROM recipes WHERE ingredients LIKE '%milk%' UNION ALL SELECT *,1 AS relevance FROM recipes WHERE ingredients LIKE '%flour%' UNION ALL SELECT *,1 AS relevance FROM recipes WHERE ingredients LIKE '%sugar%' ) results GROUP BY recipeID ORDER BY sum(relevance) DESC; But I'm now stuck with a special case where I need to write the relevance value to a field on the same row as the recipe. I figured something along these lines: UPDATE recipes SET relevance=(SELECT sum(relevance) ...) But I have not been able to get this working yet. I will keep trying, but meanwhile please let me know how you would approach this?

Read the article
How can I get weighted random selections from an array in Perl?

- by Chaggster

I need to random some elements from an array. I'm doing that by randomizing the index $array[int(rand(100))]. I want some of the elements to appear more often. How do I do it? I thought of a stupid solution of having those elements duplicated several times in the array, but I'm sure you guys can do better.

Read the article
Solving Big Problems with Oracle R Enterprise, Part I

- by dbayard

Abstract: This blog post will show how we used Oracle R Enterprise to tackle a customer’s big calculation problem across a big data set. Overview: Databases are great for managing large amounts of data in a central place with rigorous enterprise-level controls. R is great for doing advanced computations. Sometimes you need to do advanced computations on large amounts of data, subject to rigorous enterprise-level concerns. This blog post shows how Oracle R Enterprise enables R plus the Oracle Database enabled us to do some pretty sophisticated calculations across 1 million accounts (each with many detailed records) in minutes. The problem: A financial services customer of mine has a need to calculate the historical internal rate of return (IRR) for its customers’ portfolios. This information is needed for customer statements and the online web application. In the past, they had solved this with a home-grown application that pulled trade and account data out of their data warehouse and ran the calculations. But this home-grown application was not able to do this fast enough, plus it was a challenge for them to write and maintain the code that did the IRR calculation. IRR – a problem that R is good at solving: Internal Rate of Return is an interesting calculation in that in most real-world scenarios it is impractical to calculate exactly. Rather, IRR is a calculation where approximation techniques need to be used. In this blog post, we will discuss calculating the “money weighted rate of return” but in the actual customer proof of concept we used R to calculate both money weighted rate of returns and time weighted rate of returns. You can learn more about the money weighted rate of returns here: http://www.wikinvest.com/wiki/Money-weighted_return First Steps- Calculating IRR in R We will start with calculating the IRR in standalone/desktop R. In our second post, we will show how to take this desktop R function, deploy it to an Oracle Database, and make it work at real-world scale. The first step we did was to get some sample data. For a historical IRR calculation, you have a balances and cash flows. In our case, the customer provided us with several accounts worth of sample data in Microsoft Excel. The above figure shows part of the spreadsheet of sample data. The data provides balances and cash flows for a sample account (BMV=beginning market value. FLOW=cash flow in/out of account. EMV=ending market value). Once we had the sample spreadsheet, the next step we did was to read the Excel data into R. This is something that R does well. R offers multiple ways to work with spreadsheet data. For instance, one could save the spreadsheet as a .csv file. In our case, the customer provided a spreadsheet file containing multiple sheets where each sheet provided data for a different sample account. To handle this easily, we took advantage of the RODBC package which allowed us to read the Excel data sheet-by-sheet without having to create individual .csv files. We wrote ourselves a little helper function called getsheet() around the RODBC package. Then we loaded all of the sample accounts into a data.frame called SimpleMWRRData. Writing the IRR function At this point, it was time to write the money weighted rate of return (MWRR) function itself. The definition of MWRR is easily found on the internet or if you are old school you can look in an investment performance text book. In the customer proof, we based our calculations off the ones defined in the The Handbook of Investment Performance: A User’s Guide by David Spaulding since this is the reference book used by the customer. (One of the nice things we found during the course of this proof-of-concept is that by using R to write our IRR functions we could easily incorporate the specific variations and business rules of the customer into the calculation.) The key thing with calculating IRR is the need to solve a complex equation with a numerical approximation technique. For IRR, you need to find the value of the rate of return (r) that sets the Net Present Value of all the flows in and out of the account to zero. With R, we solve this by defining our NPV function: where bmv is the beginning market value, cf is a vector of cash flows, t is a vector of time (relative to the beginning), emv is the ending market value, and tend is the ending time. Since solving for r is a one-dimensional optimization problem, we decided to take advantage of R’s optimize method (http://stat.ethz.ch/R-manual/R-patched/library/stats/html/optimize.html). The optimize method can be used to find a minimum or maximum; to find the value of r where our npv function is closest to zero, we wrapped our npv function inside the abs function and asked optimize to find the minimum. Here is an example of using optimize: where low and high are scalars that indicate the range to search for an answer. To test this out, we need to set values for bmv, cf, t, emv, tend, low, and high. We will set low and high to some reasonable defaults. For example, this account had a negative 2.2% money weighted rate of return. Enhancing and Packaging the IRR function With numerical approximation methods like optimize, sometimes you will not be able to find an answer with your initial set of inputs. To account for this, our approach was to first try to find an answer for r within a narrow range, then if we did not find an answer, try calling optimize() again with a broader range. See the R help page on optimize() for more details about the search range and its algorithm. At this point, we can now write a simplified version of our MWRR function. (Our real-world version is more sophisticated in that it calculates rate of returns for 5 different time periods [since inception, last quarter, year-to-date, last year, year before last year] in a single invocation. In our actual customer proof, we also defined time-weighted rate of return calculations. The beauty of R is that it was very easy to add these enhancements and additional calculations to our IRR package.)To simplify code deployment, we then created a new package of our IRR functions and sample data. For this blog post, we only need to include our SimpleMWRR function and our SimpleMWRRData sample data. We created the shell of the package by calling: To turn this package skeleton into something usable, at a minimum you need to edit the SimpleMWRR.Rd and SimpleMWRRData.Rd files in the \man subdirectory. In those files, you need to at least provide a value for the “title” section. Once that is done, you can change directory to the IRR directory and type at the command-line: The myIRR package for this blog post (which has both SimpleMWRR source and SimpleMWRRData sample data) is downloadable from here: myIRR package Testing the myIRR package Here is an example of testing our IRR function once it was converted to an installable package: Calculating IRR for All the Accounts So far, we have shown how to calculate IRR for a single account. The real-world issue is how do you calculate IRR for all of the accounts?This is the kind of situation where we can leverage the “Split-Apply-Combine” approach (see http://www.cscs.umich.edu/~crshalizi/weblog/815.html). Given that our sample data can fit in memory, one easy approach is to use R’s “by” function. (Other approaches to Split-Apply-Combine such as plyr can also be used. See http://4dpiecharts.com/2011/12/16/a-quick-primer-on-split-apply-combine-problems/). Here is an example showing the use of “by” to calculate the money weighted rate of return for each account in our sample data set. Recap and Next Steps At this point, you’ve seen the power of R being used to calculate IRR. There were several good things: R could easily work with the spreadsheets of sample data we were given R’s optimize() function provided a nice way to solve for IRR- it was both fast and allowed us to avoid having to code our own iterative approximation algorithm R was a convenient language to express the customer-specific variations, business-rules, and exceptions that often occur in real-world calculations- these could be easily added to our IRR functions The Split-Apply-Combine technique can be used to perform calculations of IRR for multiple accounts at once. However, there are several challenges yet to be conquered at this point in our story: The actual data that needs to be used lives in a database, not in a spreadsheet The actual data is much, much bigger- too big to fit into the normal R memory space and too big to want to move across the network The overall process needs to run fast- much faster than a single processor The actual data needs to be kept secured- another reason to not want to move it from the database and across the network And the process of calculating the IRR needs to be integrated together with other database ETL activities, so that IRR’s can be calculated as part of the data warehouse refresh processes In our next blog post in this series, we will show you how Oracle R Enterprise solved these challenges.

Read the article
How does one unit test an algorithm

- by Asa Baylus

I was recently working on a JS slideshow which rotates images using a weighted average algorithm. Thankfully, timgilbert has written a weighted list script which implements the exact algorithm I needed. However in his documentation he's noted under todos: "unit tests!". I'd like to know is how one goes about unit testing an algorithm. In the case of a weighted average how would you create a proof that the averages are accurate when there is the element of randomness? Code samples of similar would be very helpful to my understanding.

Read the article
Are CK Metrics still considered useful? Is there an open source tool to help?

- by DeveloperDon

Chidamber & Kemerer proposed several metrics for object oriented code. Among them, depth of inheritance tree, weighted number of methods, number of member functions, number of children, and coupling between objects. Using a base of code, they tried to correlated these metrics to the defect density and maintenance effort using covariant analysis. Are these metrics actionable in projects? Perhaps they can guide refactoring. For example weighted number of methods might show which God classes needed to be broken into more cohesive classes that address a single concern. Is there approach superseded by a better method, and is there a tool that can identify problem code, particularly in moderately large project being handed off to a new developer or team?

Read the article
Algorithm for scoring user activity

- by ManBugra

I have an application where users can: Write reviews about products Add comments to products Up / Down vote reviews Up / Down vote comments Every Up/Down vote is recorded in a db table. What i want to do now is to create a ranking of the most active users in the last 4 weeks. Of course good reviews should be weighted more than good comments. But also e.g. 10 good comments should be weighted more than just one good review. Example: // reviews created in recent 4 weeks //format: [ upVoteCount, downVoteCount ] var reviews = [ [120,23], [32,12], [12,0], [23,45] ]; // comments created in recent 4 weeks // format: [ upVoteCount, downVoteCount ] var comments = [ [1,2], [322,1], [0,0], [0,45] ]; // create weight vector // format: [ reviewWeight, commentsWeight ] var weight = [0.60, 0.40]; // signature: activties..., activityWeight var userActivityScore = score(reviews, comments, weight); ... update user table ... List<Users> users = "from users u order by u.userActivityScore desc"; How would a fair scoring function look like? How could an implementation of the score() function look like? How to add a weight g to the function so that reviews are weighted heavier? How would such a function look like if, for example, votes for pictures would be added?

Read the article
Even googling not allowed ? Why ?

- by sagar

Please read message bellow. Access has been denied 127.0.0.1! Access to the page: http://www.google.co.in/search?hl=en&client=firefox-a&hs=F58&rls=org.mozilla%3Aen-US%3Aofficial&q=email+us&meta=&aq=f&aqi=g10&aql=&oq=&gs_rfai= ... has been denied for the following reason: Weighted phrase limit exceeded. By reading above message, you can easily understand that - it's a firewall message. I also know that. The problem is "Firewall" is allowing any kind of googling. But when I google "email us" - above message is displayed. My question is why does this happen ? ( means - why googling not allowed on this words ? ) ( Please don't tell that - contact your system administrator. ) What does this mean - Weighted phrase limit exceeded. ? sagar.

Read the article
Java: micro-optimizing array manipulation

- by Martin Wiboe

Hello all, I am trying to make a Java port of a simple feed-forward neural network. This obviously involves lots of numeric calculations, so I am trying to optimize my central loop as much as possible. The results should be correct within the limits of the float data type. My current code looks as follows (error handling & initialization removed): /** * Simple implementation of a feedforward neural network. The network supports * including a bias neuron with a constant output of 1.0 and weighted synapses * to hidden and output layers. * * @author Martin Wiboe */ public class FeedForwardNetwork { private final int outputNeurons; // No of neurons in output layer private final int inputNeurons; // No of neurons in input layer private int largestLayerNeurons; // No of neurons in largest layer private final int numberLayers; // No of layers private final int[] neuronCounts; // Neuron count in each layer, 0 is input // layer. private final float[][][] fWeights; // Weights between neurons. // fWeight[fromLayer][fromNeuron][toNeuron] // is the weight from fromNeuron in // fromLayer to toNeuron in layer // fromLayer+1. private float[][] neuronOutput; // Temporary storage of output from previous layer public float[] compute(float[] input) { // Copy input values to input layer output for (int i = 0; i < inputNeurons; i++) { neuronOutput[0][i] = input[i]; } // Loop through layers for (int layer = 1; layer < numberLayers; layer++) { // Loop over neurons in the layer and determine weighted input sum for (int neuron = 0; neuron < neuronCounts[layer]; neuron++) { // Bias neuron is the last neuron in the previous layer int biasNeuron = neuronCounts[layer - 1]; // Get weighted input from bias neuron - output is always 1.0 float activation = 1.0F * fWeights[layer - 1][biasNeuron][neuron]; // Get weighted inputs from rest of neurons in previous layer for (int inputNeuron = 0; inputNeuron < biasNeuron; inputNeuron++) { activation += neuronOutput[layer-1][inputNeuron] * fWeights[layer - 1][inputNeuron][neuron]; } // Store neuron output for next round of computation neuronOutput[layer][neuron] = sigmoid(activation); } } // Return output from network = output from last layer float[] result = new float[outputNeurons]; for (int i = 0; i < outputNeurons; i++) result[i] = neuronOutput[numberLayers - 1][i]; return result; } private final static float sigmoid(final float input) { return (float) (1.0F / (1.0F + Math.exp(-1.0F * input))); } } I am running the JVM with the -server option, and as of now my code is between 25% and 50% slower than similar C code. What can I do to improve this situation? Thank you, Martin Wiboe

Read the article
How can I make this Java code run faster?

- by Martin Wiboe

Hello all, I am trying to make a Java port of a simple feed-forward neural network. This obviously involves lots of numeric calculations, so I am trying to optimize my central loop as much as possible. The results should be correct within the limits of the float data type. My current code looks as follows (error handling & initialization removed): /** * Simple implementation of a feedforward neural network. The network supports * including a bias neuron with a constant output of 1.0 and weighted synapses * to hidden and output layers. * * @author Martin Wiboe */ public class FeedForwardNetwork { private final int outputNeurons; // No of neurons in output layer private final int inputNeurons; // No of neurons in input layer private int largestLayerNeurons; // No of neurons in largest layer private final int numberLayers; // No of layers private final int[] neuronCounts; // Neuron count in each layer, 0 is input // layer. private final float[][][] fWeights; // Weights between neurons. // fWeight[fromLayer][fromNeuron][toNeuron] // is the weight from fromNeuron in // fromLayer to toNeuron in layer // fromLayer+1. private float[][] neuronOutput; // Temporary storage of output from previous layer public float[] compute(float[] input) { // Copy input values to input layer output for (int i = 0; i < inputNeurons; i++) { neuronOutput[0][i] = input[i]; } // Loop through layers for (int layer = 1; layer < numberLayers; layer++) { // Loop over neurons in the layer and determine weighted input sum for (int neuron = 0; neuron < neuronCounts[layer]; neuron++) { // Bias neuron is the last neuron in the previous layer int biasNeuron = neuronCounts[layer - 1]; // Get weighted input from bias neuron - output is always 1.0 float activation = 1.0F * fWeights[layer - 1][biasNeuron][neuron]; // Get weighted inputs from rest of neurons in previous layer for (int inputNeuron = 0; inputNeuron < biasNeuron; inputNeuron++) { activation += neuronOutput[layer-1][inputNeuron] * fWeights[layer - 1][inputNeuron][neuron]; } // Store neuron output for next round of computation neuronOutput[layer][neuron] = sigmoid(activation); } } // Return output from network = output from last layer float[] result = new float[outputNeurons]; for (int i = 0; i < outputNeurons; i++) result[i] = neuronOutput[numberLayers - 1][i]; return result; } private final static float sigmoid(final float input) { return (float) (1.0F / (1.0F + Math.exp(-1.0F * input))); } } I am running the JVM with the -server option, and as of now my code is between 25% and 50% slower than similar C code. What can I do to improve this situation? Thank you, Martin Wiboe

Read the article
How do you blend multiple colors in HSV (polar) color-space?

- by Toxikman

In RGB color space, you can do a weighted multiple-color blend by just doing: Start with R = G = B = 0. Then we perform a blend at index i using a set of colors C, and a set of normalized weights w like so: R += w[i] * C[i].r G += w[i] * C[i].g B += w[i] * C[i].b But I'd like to interpolate the colors in the HSV color-space instead, so that saturation and brightness are uniform across the interpolation. I know I can blend saturation and brightness in the same way as above, but the HUE component is an angle around a continuous circle, since HSV is essentially a polar coordinate system. Blending only two HSV colors makes sense to me, you just find the shortest arc around the circle and interpolate between the two hues. But when you attempt to blend more than 2 colors, it becomes a bit of a puzzle. You have to handle anomalous cases, like 4 equally-weighted colors with a hue at 0, 90, 180, and 270 degrees. They basically cancel each other out, so any hue will do. Any ideas would be greatly appreciated.

Read the article
Squid + Dans Guardian (simple configuration)

- by The Digital Ninja

I just built a new proxy server and compiled the latest versions of squid and dansguardian. We use basic authentication to select what users are allowed outside of our network. It seems squid is working just fine and accepts my username and password and lets me out. But if i connect to dans guardian, it prompts for username and password and then displays a message saying my username is not allowed to access the internet. Its pulling my username for the error message so i know it knows who i am. The part i get confused on is i thought that part was handled all by squid, and squid is working flawlessly. Can someone please double check my config files and tell me if i'm missing something or there is some new option i must set to get this to work. dansguardian.conf # Web Access Denied Reporting (does not affect logging) # # -1 = log, but do not block - Stealth mode # 0 = just say 'Access Denied' # 1 = report why but not what denied phrase # 2 = report fully # 3 = use HTML template file (accessdeniedaddress ignored) - recommended # reportinglevel = 3 # Language dir where languages are stored for internationalisation. # The HTML template within this dir is only used when reportinglevel # is set to 3. When used, DansGuardian will display the HTML file instead of # using the perl cgi script. This option is faster, cleaner # and easier to customise the access denied page. # The language file is used no matter what setting however. # languagedir = '/etc/dansguardian/languages' # language to use from languagedir. language = 'ukenglish' # Logging Settings # # 0 = none 1 = just denied 2 = all text based 3 = all requests loglevel = 3 # Log Exception Hits # Log if an exception (user, ip, URL, phrase) is matched and so # the page gets let through. Can be useful for diagnosing # why a site gets through the filter. on | off logexceptionhits = on # Log File Format # 1 = DansGuardian format 2 = CSV-style format # 3 = Squid Log File Format 4 = Tab delimited logfileformat = 1 # Log file location # # Defines the log directory and filename. #loglocation = '/var/log/dansguardian/access.log' # Network Settings # # the IP that DansGuardian listens on. If left blank DansGuardian will # listen on all IPs. That would include all NICs, loopback, modem, etc. # Normally you would have your firewall protecting this, but if you want # you can limit it to only 1 IP. Yes only one. filterip = # the port that DansGuardian listens to. filterport = 8080 # the ip of the proxy (default is the loopback - i.e. this server) proxyip = 127.0.0.1 # the port DansGuardian connects to proxy on proxyport = 3128 # accessdeniedaddress is the address of your web server to which the cgi # dansguardian reporting script was copied # Do NOT change from the default if you are not using the cgi. # accessdeniedaddress = 'http://YOURSERVER.YOURDOMAIN/cgi-bin/dansguardian.pl' # Non standard delimiter (only used with accessdeniedaddress) # Default is enabled but to go back to the original standard mode dissable it. nonstandarddelimiter = on # Banned image replacement # Images that are banned due to domain/url/etc reasons including those # in the adverts blacklists can be replaced by an image. This will, # for example, hide images from advert sites and remove broken image # icons from banned domains. # 0 = off # 1 = on (default) usecustombannedimage = 1 custombannedimagefile = '/etc/dansguardian/transparent1x1.gif' # Filter groups options # filtergroups sets the number of filter groups. A filter group is a set of content # filtering options you can apply to a group of users. The value must be 1 or more. # DansGuardian will automatically look for dansguardianfN.conf where N is the filter # group. To assign users to groups use the filtergroupslist option. All users default # to filter group 1. You must have some sort of authentication to be able to map users # to a group. The more filter groups the more copies of the lists will be in RAM so # use as few as possible. filtergroups = 1 filtergroupslist = '/etc/dansguardian/filtergroupslist' # Authentication files location bannediplist = '/etc/dansguardian/bannediplist' exceptioniplist = '/etc/dansguardian/exceptioniplist' banneduserlist = '/etc/dansguardian/banneduserlist' exceptionuserlist = '/etc/dansguardian/exceptionuserlist' # Show weighted phrases found # If enabled then the phrases found that made up the total which excedes # the naughtyness limit will be logged and, if the reporting level is # high enough, reported. on | off showweightedfound = on # Weighted phrase mode # There are 3 possible modes of operation: # 0 = off = do not use the weighted phrase feature. # 1 = on, normal = normal weighted phrase operation. # 2 = on, singular = each weighted phrase found only counts once on a page. # weightedphrasemode = 2 # Positive result caching for text URLs # Caches good pages so they don't need to be scanned again # 0 = off (recommended for ISPs with users with disimilar browsing) # 1000 = recommended for most users # 5000 = suggested max upper limit urlcachenumber = # # Age before they are stale and should be ignored in seconds # 0 = never # 900 = recommended = 15 mins urlcacheage = # Smart and Raw phrase content filtering options # Smart is where the multiple spaces and HTML are removed before phrase filtering # Raw is where the raw HTML including meta tags are phrase filtered # CPU usage can be effectively halved by using setting 0 or 1 # 0 = raw only # 1 = smart only # 2 = both (default) phrasefiltermode = 2 # Lower casing options # When a document is scanned the uppercase letters are converted to lower case # in order to compare them with the phrases. However this can break Big5 and # other 16-bit texts. If needed preserve the case. As of version 2.7.0 accented # characters are supported. # 0 = force lower case (default) # 1 = do not change case preservecase = 0 # Hex decoding options # When a document is scanned it can optionally convert %XX to chars. # If you find documents are getting past the phrase filtering due to encoding # then enable. However this can break Big5 and other 16-bit texts. # 0 = disabled (default) # 1 = enabled hexdecodecontent = 0 # Force Quick Search rather than DFA search algorithm # The current DFA implementation is not totally 16-bit character compatible # but is used by default as it handles large phrase lists much faster. # If you wish to use a large number of 16-bit character phrases then # enable this option. # 0 = off (default) # 1 = on (Big5 compatible) forcequicksearch = 0 # Reverse lookups for banned site and URLs. # If set to on, DansGuardian will look up the forward DNS for an IP URL # address and search for both in the banned site and URL lists. This would # prevent a user from simply entering the IP for a banned address. # It will reduce searching speed somewhat so unless you have a local caching # DNS server, leave it off and use the Blanket IP Block option in the # bannedsitelist file instead. reverseaddresslookups = off # Reverse lookups for banned and exception IP lists. # If set to on, DansGuardian will look up the forward DNS for the IP # of the connecting computer. This means you can put in hostnames in # the exceptioniplist and bannediplist. # It will reduce searching speed somewhat so unless you have a local DNS server, # leave it off. reverseclientiplookups = off # Build bannedsitelist and bannedurllist cache files. # This will compare the date stamp of the list file with the date stamp of # the cache file and will recreate as needed. # If a bsl or bul .processed file exists, then that will be used instead. # It will increase process start speed by 300%. On slow computers this will # be significant. Fast computers do not need this option. on | off createlistcachefiles = on # POST protection (web upload and forms) # does not block forms without any file upload, i.e. this is just for # blocking or limiting uploads # measured in kibibytes after MIME encoding and header bumph # use 0 for a complete block # use higher (e.g. 512 = 512Kbytes) for limiting # use -1 for no blocking #maxuploadsize = 512 #maxuploadsize = 0 maxuploadsize = -1 # Max content filter page size # Sometimes web servers label binary files as text which can be very # large which causes a huge drain on memory and cpu resources. # To counter this, you can limit the size of the document to be # filtered and get it to just pass it straight through. # This setting also applies to content regular expression modification. # The size is in Kibibytes - eg 2048 = 2Mb # use 0 for no limit maxcontentfiltersize = # Username identification methods (used in logging) # You can have as many methods as you want and not just one. The first one # will be used then if no username is found, the next will be used. # * proxyauth is for when basic proxy authentication is used (no good for # transparent proxying). # * ntlm is for when the proxy supports the MS NTLM authentication # protocol. (Only works with IE5.5 sp1 and later). **NOT IMPLEMENTED** # * ident is for when the others don't work. It will contact the computer # that the connection came from and try to connect to an identd server # and query it for the user owner of the connection. usernameidmethodproxyauth = on usernameidmethodntlm = off # **NOT IMPLEMENTED** usernameidmethodident = off # Preemptive banning - this means that if you have proxy auth enabled and a user accesses # a site banned by URL for example they will be denied straight away without a request # for their user and pass. This has the effect of requiring the user to visit a clean # site first before it knows who they are and thus maybe an admin user. # This is how DansGuardian has always worked but in some situations it is less than # ideal. So you can optionally disable it. Default is on. # As a side effect disabling this makes AD image replacement work better as the mime # type is know. preemptivebanning = on # Misc settings # if on it adds an X-Forwarded-For: <clientip> to the HTTP request # header. This may help solve some problem sites that need to know the # source ip. on | off forwardedfor = on # if on it uses the X-Forwarded-For: <clientip> to determine the client # IP. This is for when you have squid between the clients and DansGuardian. # Warning - headers are easily spoofed. on | off usexforwardedfor = off # if on it logs some debug info regarding fork()ing and accept()ing which # can usually be ignored. These are logged by syslog. It is safe to leave # it on or off logconnectionhandlingerrors = on # Fork pool options # sets the maximum number of processes to sporn to handle the incomming # connections. Max value usually 250 depending on OS. # On large sites you might want to try 180. maxchildren = 180 # sets the minimum number of processes to sporn to handle the incomming connections. # On large sites you might want to try 32. minchildren = 32 # sets the minimum number of processes to be kept ready to handle connections. # On large sites you might want to try 8. minsparechildren = 8 # sets the minimum number of processes to sporn when it runs out # On large sites you might want to try 10. preforkchildren = 10 # sets the maximum number of processes to have doing nothing. # When this many are spare it will cull some of them. # On large sites you might want to try 64. maxsparechildren = 64 # sets the maximum age of a child process before it croaks it. # This is the number of connections they handle before exiting. # On large sites you might want to try 10000. maxagechildren = 5000 # Process options # (Change these only if you really know what you are doing). # These options allow you to run multiple instances of DansGuardian on a single machine. # Remember to edit the log file path above also if that is your intention. # IPC filename # # Defines IPC server directory and filename used to communicate with the log process. ipcfilename = '/tmp/.dguardianipc' # URL list IPC filename # # Defines URL list IPC server directory and filename used to communicate with the URL # cache process. urlipcfilename = '/tmp/.dguardianurlipc' # PID filename # # Defines process id directory and filename. #pidfilename = '/var/run/dansguardian.pid' # Disable daemoning # If enabled the process will not fork into the background. # It is not usually advantageous to do this. # on|off ( defaults to off ) nodaemon = off # Disable logging process # on|off ( defaults to off ) nologger = off # Daemon runas user and group # This is the user that DansGuardian runs as. Normally the user/group nobody. # Uncomment to use. Defaults to the user set at compile time. # daemonuser = 'nobody' # daemongroup = 'nobody' # Soft restart # When on this disables the forced killing off all processes in the process group. # This is not to be confused with the -g run time option - they are not related. # on|off ( defaults to off ) softrestart = off maxcontentramcachescansize = 2000 maxcontentfilecachescansize = 20000 downloadmanager = '/etc/dansguardian/downloadmanagers/default.conf' authplugin = '/etc/dansguardian/authplugins/proxy-basic.conf' Squid.conf http_port 3128 hierarchy_stoplist cgi-bin ? acl QUERY urlpath_regex cgi-bin \? cache deny QUERY acl apache rep_header Server ^Apache #broken_vary_encoding allow apache access_log /squid/var/logs/access.log squid hosts_file /etc/hosts auth_param basic program /squid/libexec/ncsa_auth /squid/etc/userbasic.auth auth_param basic children 5 auth_param basic realm proxy auth_param basic credentialsttl 2 hours auth_param basic casesensitive off refresh_pattern ^ftp: 1440 20% 10080 refresh_pattern ^gopher: 1440 0% 1440 refresh_pattern . 0 20% 4320 acl NoAuthNec src <HIDDEN FOR SECURITY> acl BrkRm src <HIDDEN FOR SECURITY> acl Dials src <HIDDEN FOR SECURITY> acl Comps src <HIDDEN FOR SECURITY> acl whsws dstdom_regex -i .opensuse.org .novell.com .suse.com mirror.mcs.an1.gov mirrors.kernerl.org www.suse.de suse.mirrors.tds.net mirrros.usc.edu ftp.ale.org suse.cs.utah.edu mirrors.usc.edu mirror.usc.an1.gov linux.nssl.noaa.gov noaa.gov .kernel.org ftp.ale.org ftp.gwdg.de .medibuntu.org mirrors.xmission.com .canonical.com .ubuntu. acl opensites dstdom_regex -i .mbsbooks.com .bowker.com .usps.com .usps.gov .ups.com .fedex.com go.microsoft.com .microsoft.com .apple.com toolbar.msn.com .contacts.msn.com update.services.openoffice.org fms2.pointroll.speedera.net services.wmdrm.windowsmedia.com windowsupdate.com .adobe.com .symantec.com .vitalbook.com vxn1.datawire.net vxn.datawire.net download.lavasoft.de .download.lavasoft.com .lavasoft.com updates.ls-servers.com .canadapost. .myyellow.com minirick symantecliveupdate.com wm.overdrive.com www.overdrive.com productactivation.one.microsoft.com www.update.microsoft.com testdrive.whoson.com www.columbia.k12.mo.us banners.wunderground.com .kofax.com .gotomeeting.com tools.google.com .dl.google.com .cache.googlevideo.com .gpdl.google.com .clients.google.com cache.pack.google.com kh.google.com maps.google.com auth.keyhole.com .contacts.msn.com .hrblock.com .taxcut.com .merchantadvantage.com .jtv.com .malwarebytes.org www.google-analytics.com dcs.support.xerox.com .dhl.com .webtrendslive.com javadl-esd.sun.com javadl-alt.sun.com .excelsior.edu .dhlglobalmail.com .nessus.org .foxitsoftware.com foxit.vo.llnwd.net installshield.com .mindjet.com .mediascouter.com media.us.elsevierhealth.com .xplana.com .govtrack.us sa.tulsacc.edu .omniture.com fpdownload.macromedia.com webservices.amazon.com acl password proxy_auth REQUIRED acl all src all acl manager proto cache_object acl localhost src 127.0.0.1/255.255.255.255 acl to_localhost dst 127.0.0.0/8 acl SSL_ports port 443 563 631 2001 2005 8731 9001 9080 10000 acl Safe_ports port 80 # http acl Safe_ports port 21 # ftp acl Safe_ports port # https, snews 443 563 acl Safe_ports port 70 # gopher acl Safe_ports port 210 # wais acl Safe_ports port # unregistered ports 1936-65535 acl Safe_ports port 280 # http-mgmt acl Safe_ports port 488 # gss-http acl Safe_ports port 10000 acl Safe_ports port 631 acl Safe_ports port 901 # SWAT acl purge method PURGE acl CONNECT method CONNECT acl UTubeUsers proxy_auth "/squid/etc/utubeusers.list" acl RestrictUTube dstdom_regex -i youtube.com acl RestrictFacebook dstdom_regex -i facebook.com acl FacebookUsers proxy_auth "/squid/etc/facebookusers.list" acl BuemerKEC src 10.10.128.0/24 acl MBSsortnet src 10.10.128.0/26 acl MSNExplorer browser -i MSN acl Printers src <HIDDEN FOR SECURITY> acl SpecialFolks src <HIDDEN FOR SECURITY> # streaming download acl fails rep_mime_type ^.*mms.* acl fails rep_mime_type ^.*ms-hdr.* acl fails rep_mime_type ^.*x-fcs.* acl fails rep_mime_type ^.*x-ms-asf.* acl fails2 urlpath_regex dvrplayer mediastream mms:// acl fails2 urlpath_regex \.asf$ \.afx$ \.flv$ \.swf$ acl deny_rep_mime_flashvideo rep_mime_type -i video/flv acl deny_rep_mime_shockwave rep_mime_type -i ^application/x-shockwave-flash$ acl x-type req_mime_type -i ^application/octet-stream$ acl x-type req_mime_type -i application/octet-stream acl x-type req_mime_type -i ^application/x-mplayer2$ acl x-type req_mime_type -i application/x-mplayer2 acl x-type req_mime_type -i ^application/x-oleobject$ acl x-type req_mime_type -i application/x-oleobject acl x-type req_mime_type -i application/x-pncmd acl x-type req_mime_type -i ^video/x-ms-asf$ acl x-type2 rep_mime_type -i ^application/octet-stream$ acl x-type2 rep_mime_type -i application/octet-stream acl x-type2 rep_mime_type -i ^application/x-mplayer2$ acl x-type2 rep_mime_type -i application/x-mplayer2 acl x-type2 rep_mime_type -i ^application/x-oleobject$ acl x-type2 rep_mime_type -i application/x-oleobject acl x-type2 rep_mime_type -i application/x-pncmd acl x-type2 rep_mime_type -i ^video/x-ms-asf$ acl RestrictHulu dstdom_regex -i hulu.com acl broken dstdomain cms.montgomerycollege.edu events.columbiamochamber.com members.columbiamochamber.com public.genexusserver.com acl RestrictVimeo dstdom_regex -i vimeo.com acl http_port port 80 #http_reply_access deny deny_rep_mime_flashvideo #http_reply_access deny deny_rep_mime_shockwave #streaming files #http_access deny fails #http_reply_access deny fails #http_access deny fails2 #http_reply_access deny fails2 #http_access deny x-type #http_reply_access deny x-type #http_access deny x-type2 #http_reply_access deny x-type2 follow_x_forwarded_for allow localhost acl_uses_indirect_client on log_uses_indirect_client on http_access allow manager localhost http_access deny manager http_access allow purge localhost http_access deny purge http_access allow SpecialFolks http_access deny CONNECT !SSL_ports http_access allow whsws http_access allow opensites http_access deny BuemerKEC !MBSsortnet http_access deny BrkRm RestrictUTube RestrictFacebook RestrictVimeo http_access allow RestrictUTube UTubeUsers http_access deny RestrictUTube http_access allow RestrictFacebook FacebookUsers http_access deny RestrictFacebook http_access deny RestrictHulu http_access allow NoAuthNec http_access allow BrkRm http_access allow FacebookUsers RestrictVimeo http_access deny RestrictVimeo http_access allow Comps http_access allow Dials http_access allow Printers http_access allow password http_access deny !Safe_ports http_access deny SSL_ports !CONNECT http_access allow http_port http_access deny all http_reply_access allow all icp_access allow all access_log /squid/var/logs/access.log squid visible_hostname proxy.site.com forwarded_for off coredump_dir /squid/cache/ #header_access Accept-Encoding deny broken #acl snmppublic snmp_community mysecretcommunity #snmp_port 3401 #snmp_access allow snmppublic all cache_mem 3 GB #acl snmppublic snmp_community mbssquid #snmp_port 3401 #snmp_access allow snmppublic all

Read the article
[C++] Write connected components of a graph using Boost Graph

- by conradlee

I have an file that is a long list of weighted edges, in the following form node1_id node2_id weight node1_id node3_id weight and so on. So one weighted edge per line. I want to load this file into boost graph and find the connected components in the graph. Each of these connected components is a subgraph. For each of these component subgraphs, I want to write the edges in the above-described format. I want to do all this using boost graph. This problem is in principle simple, it's just I'm not sure how to implement it neatly because I don't know my way around Boost Graph. I have already spent some hours and have code that will find the connected components, but my version is surely much longer and more complicated that necessary---I'm hoping there's a boost-graph ninja out there that can show me the right, easy way.

Read the article
A Taxonomy of Numerical Methods v1

- by JoshReuben

Numerical Analysis – When, What, (but not how) Once you understand the Math & know C++, Numerical Methods are basically blocks of iterative & conditional math code. I found the real trick was seeing the forest for the trees – knowing which method to use for which situation. Its pretty easy to get lost in the details – so I’ve tried to organize these methods in a way that I can quickly look this up. I’ve included links to detailed explanations and to C++ code examples. I’ve tried to classify Numerical methods in the following broad categories: Solving Systems of Linear Equations Solving Non-Linear Equations Iteratively Interpolation Curve Fitting Optimization Numerical Differentiation & Integration Solving ODEs Boundary Problems Solving EigenValue problems Enjoy – I did ! Solving Systems of Linear Equations Overview Solve sets of algebraic equations with x unknowns The set is commonly in matrix form Gauss-Jordan Elimination http://en.wikipedia.org/wiki/Gauss%E2%80%93Jordan_elimination C++: http://www.codekeep.net/snippets/623f1923-e03c-4636-8c92-c9dc7aa0d3c0.aspx Produces solution of the equations & the coefficient matrix Efficient, stable 2 steps: · Forward Elimination – matrix decomposition: reduce set to triangular form (0s below the diagonal) or row echelon form. If degenerate, then there is no solution · Backward Elimination –write the original matrix as the product of ints inverse matrix & its reduced row-echelon matrix à reduce set to row canonical form & use back-substitution to find the solution to the set Elementary ops for matrix decomposition: · Row multiplication · Row switching · Add multiples of rows to other rows Use pivoting to ensure rows are ordered for achieving triangular form LU Decomposition http://en.wikipedia.org/wiki/LU_decomposition C++: http://ganeshtiwaridotcomdotnp.blogspot.co.il/2009/12/c-c-code-lu-decomposition-for-solving.html Represent the matrix as a product of lower & upper triangular matrices A modified version of GJ Elimination Advantage – can easily apply forward & backward elimination to solve triangular matrices Techniques: · Doolittle Method – sets the L matrix diagonal to unity · Crout Method - sets the U matrix diagonal to unity Note: both the L & U matrices share the same unity diagonal & can be stored compactly in the same matrix Gauss-Seidel Iteration http://en.wikipedia.org/wiki/Gauss%E2%80%93Seidel_method C++: http://www.nr.com/forum/showthread.php?t=722 Transform the linear set of equations into a single equation & then use numerical integration (as integration formulas have Sums, it is implemented iteratively). an optimization of Gauss-Jacobi: 1.5 times faster, requires 0.25 iterations to achieve the same tolerance Solving Non-Linear Equations Iteratively find roots of polynomials – there may be 0, 1 or n solutions for an n order polynomial use iterative techniques Iterative methods · used when there are no known analytical techniques · Requires set functions to be continuous & differentiable · Requires an initial seed value – choice is critical to convergence à conduct multiple runs with different starting points & then select best result · Systematic - iterate until diminishing returns, tolerance or max iteration conditions are met · bracketing techniques will always yield convergent solutions, non-bracketing methods may fail to converge Incremental method if a nonlinear function has opposite signs at 2 ends of a small interval x1 & x2, then there is likely to be a solution in their interval – solutions are detected by evaluating a function over interval steps, for a change in sign, adjusting the step size dynamically. Limitations – can miss closely spaced solutions in large intervals, cannot detect degenerate (coinciding) solutions, limited to functions that cross the x-axis, gives false positives for singularities Fixed point method http://en.wikipedia.org/wiki/Fixed-point_iteration C++: http://books.google.co.il/books?id=weYj75E_t6MC&pg=PA79&lpg=PA79&dq=fixed+point+method++c%2B%2B&source=bl&ots=LQ-5P_taoC&sig=lENUUIYBK53tZtTwNfHLy5PEWDk&hl=en&sa=X&ei=wezDUPW1J5DptQaMsIHQCw&redir_esc=y#v=onepage&q=fixed%20point%20method%20%20c%2B%2B&f=false Algebraically rearrange a solution to isolate a variable then apply incremental method Bisection method http://en.wikipedia.org/wiki/Bisection_method C++: http://numericalcomputing.wordpress.com/category/algorithms/ Bracketed - Select an initial interval, keep bisecting it ad midpoint into sub-intervals and then apply incremental method on smaller & smaller intervals – zoom in Adv: unaffected by function gradient à reliable Disadv: slow convergence False Position Method http://en.wikipedia.org/wiki/False_position_method C++: http://www.dreamincode.net/forums/topic/126100-bisection-and-false-position-methods/ Bracketed - Select an initial interval , & use the relative value of function at interval end points to select next sub-intervals (estimate how far between the end points the solution might be & subdivide based on this) Newton-Raphson method http://en.wikipedia.org/wiki/Newton's_method C++: http://www-users.cselabs.umn.edu/classes/Summer-2012/csci1113/index.php?page=./newt3 Also known as Newton's method Convenient, efficient Not bracketed – only a single initial guess is required to start iteration – requires an analytical expression for the first derivative of the function as input. Evaluates the function & its derivative at each step. Can be extended to the Newton MutiRoot method for solving multiple roots Can be easily applied to an of n-coupled set of non-linear equations – conduct a Taylor Series expansion of a function, dropping terms of order n, rewrite as a Jacobian matrix of PDs & convert to simultaneous linear equations !!! Secant Method http://en.wikipedia.org/wiki/Secant_method C++: http://forum.vcoderz.com/showthread.php?p=205230 Unlike N-R, can estimate first derivative from an initial interval (does not require root to be bracketed) instead of inputting it Since derivative is approximated, may converge slower. Is fast in practice as it does not have to evaluate the derivative at each step. Similar implementation to False Positive method Birge-Vieta Method http://mat.iitm.ac.in/home/sryedida/public_html/caimna/transcendental/polynomial%20methods/bv%20method.html C++: http://books.google.co.il/books?id=cL1boM2uyQwC&pg=SA3-PA51&lpg=SA3-PA51&dq=Birge-Vieta+Method+c%2B%2B&source=bl&ots=QZmnDTK3rC&sig=BPNcHHbpR_DKVoZXrLi4nVXD-gg&hl=en&sa=X&ei=R-_DUK2iNIjzsgbE5ID4Dg&redir_esc=y#v=onepage&q=Birge-Vieta%20Method%20c%2B%2B&f=false combines Horner's method of polynomial evaluation (transforming into lesser degree polynomials that are more computationally efficient to process) with Newton-Raphson to provide a computational speed-up Interpolation Overview Construct new data points for as close as possible fit within range of a discrete set of known points (that were obtained via sampling, experimentation) Use Taylor Series Expansion of a function f(x) around a specific value for x Linear Interpolation http://en.wikipedia.org/wiki/Linear_interpolation C++: http://www.hamaluik.com/?p=289 Straight line between 2 points à concatenate interpolants between each pair of data points Bilinear Interpolation http://en.wikipedia.org/wiki/Bilinear_interpolation C++: http://supercomputingblog.com/graphics/coding-bilinear-interpolation/2/ Extension of the linear function for interpolating functions of 2 variables – perform linear interpolation first in 1 direction, then in another. Used in image processing – e.g. texture mapping filter. Uses 4 vertices to interpolate a value within a unit cell. Lagrange Interpolation http://en.wikipedia.org/wiki/Lagrange_polynomial C++: http://www.codecogs.com/code/maths/approximation/interpolation/lagrange.php For polynomials Requires recomputation for all terms for each distinct x value – can only be applied for small number of nodes Numerically unstable Barycentric Interpolation http://epubs.siam.org/doi/pdf/10.1137/S0036144502417715 C++: http://www.gamedev.net/topic/621445-barycentric-coordinates-c-code-check/ Rearrange the terms in the equation of the Legrange interpolation by defining weight functions that are independent of the interpolated value of x Newton Divided Difference Interpolation http://en.wikipedia.org/wiki/Newton_polynomial C++: http://jee-appy.blogspot.co.il/2011/12/newton-divided-difference-interpolation.html Hermite Divided Differences: Interpolation polynomial approximation for a given set of data points in the NR form - divided differences are used to approximately calculate the various differences. For a given set of 3 data points , fit a quadratic interpolant through the data Bracketed functions allow Newton divided differences to be calculated recursively Difference table Cubic Spline Interpolation http://en.wikipedia.org/wiki/Spline_interpolation C++: https://www.marcusbannerman.co.uk/index.php/home/latestarticles/42-articles/96-cubic-spline-class.html Spline is a piecewise polynomial Provides smoothness – for interpolations with significantly varying data Use weighted coefficients to bend the function to be smooth & its 1st & 2nd derivatives are continuous through the edge points in the interval Curve Fitting A generalization of interpolating whereby given data points may contain noise à the curve does not necessarily pass through all the points Least Squares Fit http://en.wikipedia.org/wiki/Least_squares C++: http://www.ccas.ru/mmes/educat/lab04k/02/least-squares.c Residual – difference between observed value & expected value Model function is often chosen as a linear combination of the specified functions Determines: A) The model instance in which the sum of squared residuals has the least value B) param values for which model best fits data Straight Line Fit Linear correlation between independent variable and dependent variable Linear Regression http://en.wikipedia.org/wiki/Linear_regression C++: http://www.oocities.org/david_swaim/cpp/linregc.htm Special case of statistically exact extrapolation Leverage least squares Given a basis function, the sum of the residuals is determined and the corresponding gradient equation is expressed as a set of normal linear equations in matrix form that can be solved (e.g. using LU Decomposition) Can be weighted - Drop the assumption that all errors have the same significance –-> confidence of accuracy is different for each data point. Fit the function closer to points with higher weights Polynomial Fit - use a polynomial basis function Moving Average http://en.wikipedia.org/wiki/Moving_average C++: http://www.codeproject.com/Articles/17860/A-Simple-Moving-Average-Algorithm Used for smoothing (cancel fluctuations to highlight longer-term trends & cycles), time series data analysis, signal processing filters Replace each data point with average of neighbors. Can be simple (SMA), weighted (WMA), exponential (EMA). Lags behind latest data points – extra weight can be given to more recent data points. Weights can decrease arithmetically or exponentially according to distance from point. Parameters: smoothing factor, period, weight basis Optimization Overview Given function with multiple variables, find Min (or max by minimizing –f(x)) Iterative approach Efficient, but not necessarily reliable Conditions: noisy data, constraints, non-linear models Detection via sign of first derivative - Derivative of saddle points will be 0 Local minima Bisection method Similar method for finding a root for a non-linear equation Start with an interval that contains a minimum Golden Search method http://en.wikipedia.org/wiki/Golden_section_search C++: http://www.codecogs.com/code/maths/optimization/golden.php Bisect intervals according to golden ratio 0.618.. Achieves reduction by evaluating a single function instead of 2 Newton-Raphson Method Brent method http://en.wikipedia.org/wiki/Brent's_method C++: http://people.sc.fsu.edu/~jburkardt/cpp_src/brent/brent.cpp Based on quadratic or parabolic interpolation – if the function is smooth & parabolic near to the minimum, then a parabola fitted through any 3 points should approximate the minima – fails when the 3 points are collinear , in which case the denominator is 0 Simplex Method http://en.wikipedia.org/wiki/Simplex_algorithm C++: http://www.codeguru.com/cpp/article.php/c17505/Simplex-Optimization-Algorithm-and-Implemetation-in-C-Programming.htm Find the global minima of any multi-variable function Direct search – no derivatives required At each step it maintains a non-degenerative simplex – a convex hull of n+1 vertices. Obtains the minimum for a function with n variables by evaluating the function at n-1 points, iteratively replacing the point of worst result with the point of best result, shrinking the multidimensional simplex around the best point. Point replacement involves expanding & contracting the simplex near the worst value point to determine a better replacement point Oscillation can be avoided by choosing the 2nd worst result Restart if it gets stuck Parameters: contraction & expansion factors Simulated Annealing http://en.wikipedia.org/wiki/Simulated_annealing C++: http://code.google.com/p/cppsimulatedannealing/ Analogy to heating & cooling metal to strengthen its structure Stochastic method – apply random permutation search for global minima - Avoid entrapment in local minima via hill climbing Heating schedule - Annealing schedule params: temperature, iterations at each temp, temperature delta Cooling schedule – can be linear, step-wise or exponential Differential Evolution http://en.wikipedia.org/wiki/Differential_evolution C++: http://www.amichel.com/de/doc/html/ More advanced stochastic methods analogous to biological processes: Genetic algorithms, evolution strategies Parallel direct search method against multiple discrete or continuous variables Initial population of variable vectors chosen randomly – if weighted difference vector of 2 vectors yields a lower objective function value then it replaces the comparison vector Many params: #parents, #variables, step size, crossover constant etc Convergence is slow – many more function evaluations than simulated annealing Numerical Differentiation Overview 2 approaches to finite difference methods: · A) approximate function via polynomial interpolation then differentiate · B) Taylor series approximation – additionally provides error estimate Finite Difference methods http://en.wikipedia.org/wiki/Finite_difference_method C++: http://www.wpi.edu/Pubs/ETD/Available/etd-051807-164436/unrestricted/EAMPADU.pdf Find differences between high order derivative values - Approximate differential equations by finite differences at evenly spaced data points Based on forward & backward Taylor series expansion of f(x) about x plus or minus multiples of delta h. Forward / backward difference - the sums of the series contains even derivatives and the difference of the series contains odd derivatives – coupled equations that can be solved. Provide an approximation of the derivative within a O(h^2) accuracy There is also central difference & extended central difference which has a O(h^4) accuracy Richardson Extrapolation http://en.wikipedia.org/wiki/Richardson_extrapolation C++: http://mathscoding.blogspot.co.il/2012/02/introduction-richardson-extrapolation.html A sequence acceleration method applied to finite differences Fast convergence, high accuracy O(h^4) Derivatives via Interpolation Cannot apply Finite Difference method to discrete data points at uneven intervals – so need to approximate the derivative of f(x) using the derivative of the interpolant via 3 point Lagrange Interpolation Note: the higher the order of the derivative, the lower the approximation precision Numerical Integration Estimate finite & infinite integrals of functions More accurate procedure than numerical differentiation Use when it is not possible to obtain an integral of a function analytically or when the function is not given, only the data points are Newton Cotes Methods http://en.wikipedia.org/wiki/Newton%E2%80%93Cotes_formulas C++: http://www.siafoo.net/snippet/324 For equally spaced data points Computationally easy – based on local interpolation of n rectangular strip areas that is piecewise fitted to a polynomial to get the sum total area Evaluate the integrand at n+1 evenly spaced points – approximate definite integral by Sum Weights are derived from Lagrange Basis polynomials Leverage Trapezoidal Rule for default 2nd formulas, Simpson 1/3 Rule for substituting 3 point formulas, Simpson 3/8 Rule for 4 point formulas. For 4 point formulas use Bodes Rule. Higher orders obtain more accurate results Trapezoidal Rule uses simple area, Simpsons Rule replaces the integrand f(x) with a quadratic polynomial p(x) that uses the same values as f(x) for its end points, but adds a midpoint Romberg Integration http://en.wikipedia.org/wiki/Romberg's_method C++: http://code.google.com/p/romberg-integration/downloads/detail?name=romberg.cpp&can=2&q= Combines trapezoidal rule with Richardson Extrapolation Evaluates the integrand at equally spaced points The integrand must have continuous derivatives Each R(n,m) extrapolation uses a higher order integrand polynomial replacement rule (zeroth starts with trapezoidal) à a lower triangular matrix set of equation coefficients where the bottom right term has the most accurate approximation. The process continues until the difference between 2 successive diagonal terms becomes sufficiently small. Gaussian Quadrature http://en.wikipedia.org/wiki/Gaussian_quadrature C++: http://www.alglib.net/integration/gaussianquadratures.php Data points are chosen to yield best possible accuracy – requires fewer evaluations Ability to handle singularities, functions that are difficult to evaluate The integrand can include a weighting function determined by a set of orthogonal polynomials. Points & weights are selected so that the integrand yields the exact integral if f(x) is a polynomial of degree <= 2n+1 Techniques (basically different weighting functions): · Gauss-Legendre Integration w(x)=1 · Gauss-Laguerre Integration w(x)=e^-x · Gauss-Hermite Integration w(x)=e^-x^2 · Gauss-Chebyshev Integration w(x)= 1 / Sqrt(1-x^2) Solving ODEs Use when high order differential equations cannot be solved analytically Evaluated under boundary conditions RK for systems – a high order differential equation can always be transformed into a coupled first order system of equations Euler method http://en.wikipedia.org/wiki/Euler_method C++: http://rosettacode.org/wiki/Euler_method First order Runge–Kutta method. Simple recursive method – given an initial value, calculate derivative deltas. Unstable & not very accurate (O(h) error) – not used in practice A first-order method - the local error (truncation error per step) is proportional to the square of the step size, and the global error (error at a given time) is proportional to the step size In evolving solution between data points xn & xn+1, only evaluates derivatives at beginning of interval xn à asymmetric at boundaries Higher order Runge Kutta http://en.wikipedia.org/wiki/Runge%E2%80%93Kutta_methods C++: http://www.dreamincode.net/code/snippet1441.htm 2nd & 4th order RK - Introduces parameterized midpoints for more symmetric solutions à accuracy at higher computational cost Adaptive RK – RK-Fehlberg – estimate the truncation at each integration step & automatically adjust the step size to keep error within prescribed limits. At each step 2 approximations are compared – if in disagreement to a specific accuracy, the step size is reduced Boundary Value Problems Where solution of differential equations are located at 2 different values of the independent variable x à more difficult, because cannot just start at point of initial value – there may not be enough starting conditions available at the end points to produce a unique solution An n-order equation will require n boundary conditions – need to determine the missing n-1 conditions which cause the given conditions at the other boundary to be satisfied Shooting Method http://en.wikipedia.org/wiki/Shooting_method C++: http://ganeshtiwaridotcomdotnp.blogspot.co.il/2009/12/c-c-code-shooting-method-for-solving.html Iteratively guess the missing values for one end & integrate, then inspect the discrepancy with the boundary values of the other end to adjust the estimate Given the starting boundary values u1 & u2 which contain the root u, solve u given the false position method (solving the differential equation as an initial value problem via 4th order RK), then use u to solve the differential equations. Finite Difference Method For linear & non-linear systems Higher order derivatives require more computational steps – some combinations for boundary conditions may not work though Improve the accuracy by increasing the number of mesh points Solving EigenValue Problems An eigenvalue can substitute a matrix when doing matrix multiplication à convert matrix multiplication into a polynomial EigenValue For a given set of equations in matrix form, determine what are the solution eigenvalue & eigenvectors Similar Matrices - have same eigenvalues. Use orthogonal similarity transforms to reduce a matrix to diagonal form from which eigenvalue(s) & eigenvectors can be computed iteratively Jacobi method http://en.wikipedia.org/wiki/Jacobi_method C++: http://people.sc.fsu.edu/~jburkardt/classes/acs2_2008/openmp/jacobi/jacobi.html Robust but Computationally intense – use for small matrices < 10x10 Power Iteration http://en.wikipedia.org/wiki/Power_iteration For any given real symmetric matrix, generate the largest single eigenvalue & its eigenvectors Simplest method – does not compute matrix decomposition à suitable for large, sparse matrices Inverse Iteration Variation of power iteration method – generates the smallest eigenvalue from the inverse matrix Rayleigh Method http://en.wikipedia.org/wiki/Rayleigh's_method_of_dimensional_analysis Variation of power iteration method Rayleigh Quotient Method Variation of inverse iteration method Matrix Tri-diagonalization Method Use householder algorithm to reduce an NxN symmetric matrix to a tridiagonal real symmetric matrix vua N-2 orthogonal transforms Whats Next Outside of Numerical Methods there are lots of different types of algorithms that I’ve learned over the decades: Data Mining – (I covered this briefly in a previous post: http://geekswithblogs.net/JoshReuben/archive/2007/12/31/ssas-dm-algorithms.aspx ) Search & Sort Routing Problem Solving Logical Theorem Proving Planning Probabilistic Reasoning Machine Learning Solvers (eg MIP) Bioinformatics (Sequence Alignment, Protein Folding) Quant Finance (I read Wilmott’s books – interesting) Sooner or later, I’ll cover the above topics as well.

Read the article
Generic Adjacency List Graph implementation

- by DmainEvent

I am trying to come up with a decent Adjacency List graph implementation so I can start tooling around with all kinds of graph problems and algorithms like traveling salesman and other problems... But I can't seem to come up with a decent implementation. This is probably because I am trying to dust the cobwebs off my data structures class. But what I have so far... and this is implemented in Java... is basically an edgeNode class that has a generic type and a weight-in the event the graph is indeed weighted. public class edgeNode<E> { private E y; private int weight; //... getters and setters as well as constructors... } I have a graph class that has a list of edges a value for the number of Vertices and and an int value for edges as well as a boolean value for whether or not it is directed. The brings up my first question, if the graph is indeed directed, shouldn't I have a value in my edgeNode class? Or would I just need to add another vertices to my LinkedList? That would imply that a directed graph is 2X as big as an undirected graph wouldn't it? public class graph { private List<edgeNode<?>> edges; private int nVertices; private int nEdges; private boolean directed; //... getters and setters as well as constructors... } Finally does anybody have a standard way of initializing there graph? I was thinking of reading in a pipe-delimited file but that is so 1997. public graph GenereateGraph(boolean directed, String file){ List<edgeNode<?>> edges; graph g; try{ int count = 0; String line; FileReader input = new FileReader("C:\\Users\\derekww\\Documents\\JavaEE Projects\\graphFile"); BufferedReader bufRead = new BufferedReader(input); line = bufRead.readLine(); count++; edges = new ArrayList<edgeNode<?>>(); while(line != null){ line = bufRead.readLine(); Object edgeInfo = line.split("|")[0]; int weight = Integer.parseInt(line.split("|")[1]); edgeNode<String> e = new edgeNode<String>((String) edges.add(e); } return g; } catch(Exception e){ return null; } } I guess when I am adding edges if boolean is true I would be adding a second edge. So far, this all depends on the file I write. So if I wrote a file with the following Vertices and weights... Buffalo | 18 br Pittsburgh | 20 br New York | 15 br D.C | 45 br I would obviously load them into my list of edges, but how can I represent one vertices connected to the other... so on... I would need the opposite vertices? Say I was representing Highways connected to each city weighted and un-directed (each edge is bi-directional with weights in some fictional distance unit)... Would my implementation be the best way to do that? I found this tutorial online Graph Tutorial that has a connector object. This appears to me be a collection of vertices pointing to each other. So you would have A and B each with there weights and so on, and you would add this to a list and this list of connectors to your graph... That strikes me as somewhat cumbersome and a little dismissive of the adjacency list concept? Am I wrong and that is a novel solution? This is all inspired by steve skiena's Algorithm Design Manual. Which I have to say is pretty good so far. Thanks for any help you can provide.

Read the article
Temperature anomaly calculation of time series data

- by neel

I have a time series like following: Data <- structure(list(Year = c(1991L, 1991L, 1991L, 1991L, 1991L, 1991L, 1991L, 1991L, 1991L, 1991L, 1991L, 1991L, 1991L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L, 1992L), Month = c(8L, 9L, 9L, 9L, 10L, 10L, 10L, 11L, 11L, 11L, 12L, 12L, 12L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L, 9L, 9L, 10L, 10L, 10L, 11L, 11L, 11L, 12L, 12L, 12L), Day = c(30L, 10L, 20L, 30L, 10L, 20L, 30L, 10L, 20L, 30L, 10L, 20L, 30L, 10L, 20L, 30L, 10L, 20L, 28L, 10L, 20L, 30L, 10L, 20L, 30L, 10L, 20L, 30L, 10L, 20L, 30L, 10L, 20L, 30L, 10L, 20L, 30L, 10L, 20L, 30L, 10L, 20L, 30L, 10L, 20L, 30L, 10L, 20L, 30L), Hour = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), temperature = c(72.5, 64, 62.5, 64, 64, 53, 52, 52, 45.5, 49, 50, 50, 59, 63.5, 69.5, 61, 61, NaN, NaN, 39.5, 37, 45.5, 45, 39, 43.5, 52, 53, 56, 64, 66, 66.5, 73.5, 81, 85, 89.5, 87.5, 88.5, 83, 84.5, 74, 60.5, 59, 53, 60.5, 62.5, 64.5, 63, 62, 65.5)), .Names = c("Year", "Month", "Day", "Hour", "temperature"), class = "data.frame", row.names = c(NA, -49L)) and I have to calculate standardized anomaly. The steps to calculate the anomalies are following: Monthly premature departures from the long-term (1991-2007) average are obtained. Then standardized by dividing by the standard deviation of monthly temperature. The standardized monthly anomalies are then weighted by multiplying by the fraction of the average temperature for the given month. These weighted anomalies are then summed over 3 month time period. Can you please help me?

Read the article
Mobile app technology choice - popularity trend data?

- by Ryan Weir

I'm familiar with the arguments for HTML5 apps over native, but was looking for some numbers or data to indicate a trend of how popular they are relative to each other for mobile app development. E.g. Surveys among programmers, data collected from the various app stores, number of downloads of development tools for those platforms. Your source could consider new apps, existing apps, categorized by downloads, app downloads weighted by popularity - basically any source you've got I would like to see. In my own personal monkey-sphere of developers, HTML5 seems to be starting to dominate as of about 6 months ago over iOS and Android by a wide margin as the technology stack preference - so I was wondering if this reflects a trend that's been measured globally and if there was objective data to support it.

Read the article
In Social Relationship Management, the Spirit is Willing, but Execution is Weak

- by Mike Stiles

In our final talk in this series with Aberdeen’s Trip Kucera, we wanted to find out if enterprise organizations are actually doing anything about what they’re learning around the importance of communicating via social and using social listening for a deeper understanding of customers and prospects. We found out that if your brand is lagging behind, you’re not alone. Spotlight: How was Aberdeen able to find out if companies are putting their money where their mouth is when it comes to implementing social across the enterprise? Trip: One way to think about the relative challenges a business has in a given area is to look at the gap between “say” and “do.” The first of those words reveals the brand’s priorities, while the second reveals their ability to execute on those priorities. In Aberdeen’s research, we capture this by asking firms to rank the value of a set of activities from one on the low end to five on the high end. We then ask them to rank their ability to execute those same activities, again on a one to five, not effective to highly effective scale. Spotlight: And once you get their self-assessments, what is it you’re looking for? Trip: There are two things we’re looking for in this analysis. The first is we want to be able to identify the widest gaps between perception of value and execution. This suggests impediments to adoption or simply a high level of challenge, be it technical or otherwise. It may also suggest areas where we can expect future investment and innovation. Spotlight: So the biggest potential pain points surface, places where they know something is critical but also know they aren’t doing much about it. What’s the second thing you look for? Trip: The second thing we want to do is look at specific areas in which high-performing companies, the Leaders, are out-executing the Followers. This points to the business impact of these activities since Leaders are defined by a set of business performance metrics. Put another way, we’re correlating adoption of specific business competencies with performance, looking for what high-performers do differently. Spotlight: Ah ha, that tells us what steps the winners are taking that are making them winners. So what did you find out? Trip: Generally speaking, we see something of a glass curtain when it comes to the social relationship management execution gap. There isn’t a single social media activity in which more than 50% of respondents indicated effectiveness, which would be a 4 or 5 on that 1-5 scale. This despite the fact that 70% of firms indicate that generating positive social media mentions is valuable or very valuable, a 4 or 5 on our 1-5 scale. Spotlight: Well at least they get points for being honest. The verdict they’re giving themselves is that they just aren’t cutting it in these highly critical social development areas. Trip: And the widest gap is around directly engaging with customers and/or prospects on social networks, which 69% of firms rated as valuable but only 34% of companies say they are executing well. Perhaps even more interesting is that these two are interdependent since you’re most likely to generate goodwill on social through happy, engaged customers. This data also suggests that social is largely being used as a broadcast channel rather than for one-to-one engagement. As we’ve discussed previously, social is an inherently personal media. Spotlight: And if they’re still using it as a broadcast channel, that shows they still fail to understand the root of social and see it as just another outlet for their ads and push-messaging. That’s depressing. Trip: A second way to evaluate this data is by using Aberdeen’s performance benchmarking. The story is both a bit different, but consistent in its own way. The first thing we notice is that Leaders are more effective in their execution of several key social relationship management capabilities, namely generating positive mentions and engaging with “influencers” and customers. Based on the fact that Aberdeen uses a broad set of performance metrics to rank the respondents as either “Leaders” (top 35% in weighted performance) or “Followers” (bottom 65% in weighted performance), from website conversion to annual revenue growth, we can then correlated high social effectiveness with company performance. We can also connect the specific social capabilities used by Leaders with effectiveness. We spoke about a few of those key capabilities last time and also discuss them in a new report: Social Powers Activate: Engineering Social Engagement to Win the Hidden Sales Cycle. Spotlight: What all that tells me is there are rewards for making the effort and getting it right. That’s how you become a Leader. Trip: But there’s another part of the story, which is that overall effectiveness, even among Leaders, is muted. There’s just one activity in which more than a majority of Leaders cite high effectiveness, effectiveness being the generation of positive buzz. While 80% of Leaders indicate “directly engaging with customers” through social media channels is valuable, the highest rated activity among Leaders, only 42% say they’re effective. This gap even among Leaders shows the challenges still involved in effective social relationship management. @mikestilesPhoto: stock.xchng

Read the article
Automatically zoom out the camera to show all players

- by user36159

I am building a game in XNA that takes place in a rectangular arena. The game is multiplayer and each player may go where they like within the arena. The camera is a perspective camera that looks directly downwards. The camera should be automatically repositioned based on the game state. Currently, the xy position is a weighted sum of the xy positions of important entities. I would like the camera's z position to be calculated from the xy coordinates so that it zooms out to the point where all important entities are visible. My current approach is to: hw = the greatest x distance from the camera to an important entity hh = the greatest y distance from the camera to an important entity Calculate z = max(hw / tan(FoVx), hh / tan(FoVy)) My code seems to almost work as it should, but the resulting z values are always too low by a factor of about 4. Any ideas?

Read the article
Automatically zoom out the camera to show all players (XNA)

- by user36159

I am building a game in XNA that takes place in a rectangular arena. The game is multiplayer and each player may go where they like within the arena. The camera is a persepective camera that looks directly downwards. The camera should be automatically repositioned based on the game state. Currently, the xy position is a weighted sum of the xy positions of important entities. I would like the camera's z position to be calculated from the xy coordinates so that it zooms out to the point where all important entities are visible. My current approach is to: hw = the greatest x distance from the camera to an important entity hh = the greatest y distance from the camera to an important entity Calculate z = max(hw / tan(FoVx), hh / tan(FoVy)) My code seems to almost work as it should, but the resulting z values are always too low by a factor of about 4. Any ideas?

Read the article
Countour Mapping API

- by Crash

I have done some searching on Google, but haven't come up with much as of yet. I am wanting to take a set of point data, which I had previously been using to create weighted points for a heat map through the Google API, and turn them into a contour map to overlay on the Google Maps API. I haven't seen anything in Google's code that would let me do this. Does anyone know of a good API to create such an overlay? Or is there possibly something I have overlooked that Google offers?

Read the article
How to find optimal path visit every node with parallel workers complicated by dynamic edge costs?

- by Aaron Anodide

Say you have an acyclic directed graph with weighted edges and create N workers. My goal is to calculate the optimal way those workers can traverse the entire graph in parralel. However, edge costs may change along the way. Example: A -1-> B A -2-> C B -3-> C (if A has already been visited) B -5-> C (if A has not already been visited) Does what I describe lend itself to a standard algorithmic approach, or alternately can someone suggest if I'm looking at this in an inherently flawed way (i have an intuition I might be)?

Read the article
Clever ways of implementing different data structures in C & data structures that should be used mor

- by Yktula

What are some clever (not ordinary) ways of implementing data structures in C, and what are some data structures that should be used more often? For example, what is the most effective way (generating minimal overhead) to implement a directed and cyclic graph with weighted edges in C? I know that we can store the distances in an array as is done here, but what other ways are there to implement this kind of a graph?

Read the article

Search Results

Search found 142 results on 6 pages for 'weighted'.

Page 2/6 | < Previous Page | 1 2 3 4 5 6 | Next Page >

- by christian studer

- by delicasso

- by Andres

- by luminarious

- by Chaggster

- by dbayard

- by Asa Baylus

- by DeveloperDon

- by ManBugra

- by sagar

- by Martin Wiboe

- by Martin Wiboe

- by Toxikman

- by The Digital Ninja

- by conradlee

- by JoshReuben

- by DmainEvent

- by neel

- by Ryan Weir

- by Mike Stiles

- by user36159

- by user36159

- by Crash

- by Aaron Anodide

- by Yktula

< Previous Page | 1 2 3 4 5 6 | Next Page >