algorithm design - Page 119

SSLCipherSuite - disable weak encryption, cbc cipher and md5 based algorithm

- by John

A developer recently ran a PCI scan with TripWire against our LAMP server. The scan identified several issues and gave the following instructions to correct them:

Problem: SSL Server Supports Weak Encryption for SSLv3, TLSv1
Solution: Add the following rule to httpd.conf:

    SSLCipherSuite ALL:!aNULL:!eNULL:!LOW:!EXP:RC4+RSA:+HIGH:+MEDIUM

Problem: SSL Server Supports CBC Ciphers for SSLv3, TLSv1
Solution: Disable any cipher suites using CBC ciphers

Problem: SSL Server Supports Weak MAC Algorithm for SSLv3, TLSv1
Solution: Disable any cipher suites using MD5 based MAC algorithms

I tried searching Google for a comprehensive tutorial on how to construct an SSLCipherSuite directive that meets my requirements, but I didn't find anything I could understand. I see examples of SSLCipherSuite directives, but I need an explanation of what each component of the directive does. Even in the directive SSLCipherSuite ALL:!aNULL:!eNULL:!LOW:!EXP:RC4+RSA:+HIGH:+MEDIUM, I don't understand, for example, what the !LOW means.

Can someone either a) tell me the SSLCipherSuite directive that will meet my needs, or b) show me a resource that clearly explains what each segment of an SSLCipherSuite directive is and how to construct one?
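A hedged note on the syntax asked about: the directive takes an OpenSSL cipher list, where entries are separated by colons, a leading `!` permanently removes a group (so `!LOW` drops low-strength ciphers, `!EXP` export ciphers, `!aNULL` suites without authentication, `!eNULL` suites without encryption), a leading `+` moves matching ciphers to the end of the list, and `A+B` inside one entry means suites that belong to both groups. One way to see what a given string expands to is to ask a local OpenSSL build. The sketch below does that through Python's `ssl` module, which accepts the same cipher list format. The helper name `expand_cipher_string` and the example string are assumptions for illustration, and the result depends on the local OpenSSL version rather than on the server being scanned.

```python
import ssl


def expand_cipher_string(spec):
    """Return the cipher suite names that the local OpenSSL build
    selects for the given cipher list string."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.set_ciphers(spec)  # raises ssl.SSLError if nothing matches
    return [cipher["name"] for cipher in ctx.get_ciphers()]


# Example string (an assumption, not the scanner's recommendation):
# strong ciphers only, no anonymous or null suites, no MD5-based MACs.
names = expand_cipher_string("HIGH:!aNULL:!eNULL:!MD5")
for name in names:
    print(name)
```

Running the same helper on a candidate directive before putting it into httpd.conf shows quickly whether an exclusion such as `!MD5` had the intended effect.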


What is the difference between these two find algorithms? [migrated]

- by Joe

I have these two Find algorithms, which look the same to me. Can anyone explain why they are actually different?

    Find(x):
        if x.parent = x then
            return x
        else
            return Find(x.parent)

vs.

    Find(x):
        if x.parent = x then
            return x
        else
            x.parent <- Find(x.parent)
            return x.parent

I interpret the first one as

    int i = 0;
    return i++;

and the second one as

    int i = 0;
    int tmp = i++;
    return tmp;

which are exactly the same to me.
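The two versions return the same root, so the difference only shows up in what they leave behind. A minimal Python sketch, assuming the forest is stored as a list where `parent[i]` is the parent of node `i` (the function names are illustrative):

```python
def find_plain(parent, x):
    """First version: walk up to the root, change nothing."""
    if parent[x] == x:
        return x
    return find_plain(parent, parent[x])


def find_compress(parent, x):
    """Second version: same walk, but every node on the path is
    re-pointed directly at the root (path compression)."""
    if parent[x] == x:
        return x
    parent[x] = find_compress(parent, parent[x])
    return parent[x]


# A chain 3 -> 2 -> 1 -> 0, where 0 is the root.
chain = [0, 0, 1, 2]

plain = list(chain)
compressed = list(chain)

root_plain = find_plain(plain, 3)
root_compressed = find_compress(compressed, 3)

print(root_plain, plain)            # the chain is untouched
print(root_compressed, compressed)  # every node now points at the root
```

The assignment `x.parent <- Find(x.parent)` is a side effect on the data structure, not on a local variable, which is why the `i++` analogy does not capture it: later calls on the compressed structure reach the root in one step.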


User Control as container

- by Luca

I'm designing a simple expander control. I've derived from UserControl, drawn the inner controls, built and run it; all OK. Since one of the inner controls is a Panel, I'd like to use it as a container at design time. To that end I've used these attributes:

    [Designer(typeof(ExpanderControlDesigner))]
    [Designer("System.Windows.Forms.Design.ParentControlDesigner, System.Design", typeof(IDesigner))]

Great, I say. But it isn't... The result is that I can use it as a container at design time, but:

- The added controls go behind the inner controls already embedded in the user control.
- Even if I bring a control added at design time to the front, at runtime it is again behind the controls embedded in the user control.
- I cannot restrict the container area at design time to the Panel area.

What am I missing? Here is the code for completeness. Why is this snippet not working?

    [Designer(typeof(ExpanderControlDesigner))]
    [Designer("System.Windows.Forms.Design.ParentControlDesigner, System.Design", typeof(IDesigner))]
    public partial class ExpanderControl : UserControl
    {
        public ExpanderControl()
        {
            InitializeComponent();
            ....

    [System.Security.Permissions.PermissionSet(System.Security.Permissions.SecurityAction.Demand, Name = "FullTrust")]
    internal class ExpanderControlDesigner : ControlDesigner
    {
        private ExpanderControl MyControl;

        public override void Initialize(IComponent component)
        {
            base.Initialize(component);
            MyControl = (ExpanderControl)component;

            // Hook up events
            ISelectionService s = (ISelectionService)GetService(typeof(ISelectionService));
            IComponentChangeService c = (IComponentChangeService)GetService(typeof(IComponentChangeService));
            s.SelectionChanged += new EventHandler(OnSelectionChanged);
            c.ComponentRemoving += new ComponentEventHandler(OnComponentRemoving);
        }

        private void OnSelectionChanged(object sender, System.EventArgs e) { }

        private void OnComponentRemoving(object sender, ComponentEventArgs e) { }

        protected override void Dispose(bool disposing)
        {
            ISelectionService s = (ISelectionService)GetService(typeof(ISelectionService));
            IComponentChangeService c = (IComponentChangeService)GetService(typeof(IComponentChangeService));

            // Unhook events
            s.SelectionChanged -= new EventHandler(OnSelectionChanged);
            c.ComponentRemoving -= new ComponentEventHandler(OnComponentRemoving);
            base.Dispose(disposing);
        }

        public override System.ComponentModel.Design.DesignerVerbCollection Verbs
        {
            get
            {
                DesignerVerbCollection v = new DesignerVerbCollection();
                v.Add(new DesignerVerb("&asd", new EventHandler(null)));
                return v;
            }
        }
    }

I've found many resources (Interaction, designed, limited area), but nothing that was useful in practice...


Plagued by multithreaded bugs

- by koncurrency

On the new team that I manage, the majority of our code is platform, TCP socket, and HTTP networking code. All C++. Most of it originated from developers who have since left the team. The current developers on the team are very smart, but mostly junior in terms of experience.

Our biggest problem: multithreaded concurrency bugs. Most of our class libraries are written to be asynchronous by use of some thread pool classes. Methods on the class libraries often enqueue long-running tasks onto the thread pool from one thread, and then the callback methods of that class get invoked on a different thread. As a result, we have a lot of edge case bugs involving incorrect threading assumptions. This results in subtle bugs that go beyond just having critical sections and locks to guard against concurrency issues.

What makes these problems even harder is that the attempted fixes are often incorrect. Some mistakes I've observed the team making (or found within the legacy code itself) include the following:

Common mistake #1 - Fixing a concurrency issue by just putting a lock around the shared data, but forgetting about what happens when methods don't get called in the expected order. Here's a very simple example:

    void Foo::OnHttpRequestComplete(statuscode status)
    {
        m_pBar->DoSomethingImportant(status);
    }

    void Foo::Shutdown()
    {
        m_pBar->Cleanup();
        delete m_pBar;
        m_pBar=nullptr;
    }

So now we have a bug in which Shutdown could get called while OnHttpRequestComplete is executing. A tester finds the bug, captures the crash dump, and assigns the bug to a developer. He in turn fixes the bug like this:

    void Foo::OnHttpRequestComplete(statuscode status)
    {
        AutoLock lock(m_cs);
        m_pBar->DoSomethingImportant(status);
    }

    void Foo::Shutdown()
    {
        AutoLock lock(m_cs);
        m_pBar->Cleanup();
        delete m_pBar;
        m_pBar=nullptr;
    }

The above fix looks good until you realize there's an even more subtle edge case. What happens if Shutdown gets called before OnHttpRequestComplete gets called back? The callback then dereferences an m_pBar that is already null. The real-world examples my team has are even more complex, and the edge cases are even harder to spot during the code review process.

Common mistake #2 - Fixing deadlock issues by blindly exiting the lock, waiting for the other thread to finish, then re-entering the lock, but without handling the case where the object just got updated by the other thread!

Common mistake #3 - Even though the objects are reference counted, the shutdown sequence "releases" its pointer but forgets to wait for the thread that is still running to release its instance. As such, components are shut down cleanly, then spurious or late callbacks are invoked on an object in a state not expecting any more calls.

There are other edge cases, but the bottom line is this: multithreaded programming is just plain hard, even for smart people.

As I catch these mistakes, I spend time discussing the errors with each developer and working out a more appropriate fix. But I suspect they are often confused about how to solve each issue because of the enormous amount of legacy code that the "right" fix would involve touching.

We're going to be shipping soon, and I'm sure the patches we're applying will hold for the upcoming release. Afterwards, we're going to have some time to improve the code base and refactor where needed. We won't have time to just rewrite everything. And the majority of the code isn't all that bad. But I'm looking to refactor code such that threading issues can be avoided altogether.

One approach I am considering is this: for each significant platform feature, have a dedicated single thread onto which all events and network callbacks get marshalled. This is similar to COM apartment threading in Windows with its use of a message loop. Long blocking operations could still get dispatched to a work pool thread, but the completion callback is invoked on the component's thread. Components could possibly even share the same thread. Then all the class libraries running inside the thread can be written under the assumption of a single-threaded world.

Before I go down that path, I am also very interested in whether there are other standard techniques or design patterns for dealing with multithreaded issues. And I have to emphasize: something beyond a book that describes the basics of mutexes and semaphores. What do you think?

I am also interested in any other approaches to take towards a refactoring process, including any of the following:

- Literature or papers on design patterns around threads. Something beyond an introduction to mutexes and semaphores. We don't need massive parallelism either, just ways to design an object model so as to handle asynchronous events from other threads correctly.
- Ways to diagram the threading of various components, so that it will be easy to study and evolve solutions. (That is, a UML equivalent for discussing threads across objects and classes.)
- Educating your development team on the issues with multithreaded code.

What would you do?
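The dedicated-thread idea described above can be sketched in a few lines. The example below is in Python rather than the team's C++, purely to keep it short, and every name in it (`ComponentThread`, `post`, `demo`) is hypothetical. Blocking work runs on a pool, but completion callbacks are posted to a queue and executed one at a time on the component's own thread, so the callback code needs no locks.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


class ComponentThread:
    """Runs posted callbacks one at a time on a single dedicated thread."""

    def __init__(self):
        self._events = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    @property
    def ident(self):
        return self._thread.ident

    def post(self, fn, *args):
        """Safe to call from any thread: enqueue work for the loop."""
        self._events.put((fn, args))

    def shutdown(self):
        """Process everything already queued, then stop the loop."""
        self._events.put(None)
        self._thread.join()

    def _run(self):
        while True:
            item = self._events.get()
            if item is None:
                break
            fn, args = item
            fn(*args)


def demo():
    loop = ComponentThread()
    seen = []  # only ever touched from the component thread

    def on_complete(status):
        seen.append((status, threading.get_ident()))

    def long_task(n):
        # Runs on a pool thread; hands the result back to the loop.
        loop.post(on_complete, n * 2)

    with ThreadPoolExecutor(max_workers=4) as pool:
        for n in range(5):
            pool.submit(long_task, n)

    loop.shutdown()
    return loop.ident, seen


if __name__ == "__main__":
    print(demo())
```

Because shutdown is itself just another queued event, it is ordered relative to the callbacks instead of racing with them, which is the property the locking fixes in mistake #1 fail to provide. Late callbacks that arrive after shutdown still need a policy (drop or reject), which this sketch does not address.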


How to Correct & Improve the Design of this Code?

- by DaveDev

Hi guys, I've been working on a little experiment to see if I could create a helper method to serialize any of my types to any type of HTML tag I specify. I'm getting a NullReferenceException when _writer = _viewContext.Writer; is called in protected virtual void Dispose(bool disposing) {/*...*/}. I think I'm at a point where it almost works (I've gotten other implementations to work), and I was wondering if somebody could point out what I'm doing wrong. I'd also be interested in hearing suggestions on how I could improve the design.

So basically, I have this code that will generate a select box with a number of options:

    // the idea is I can use one method to create any complete tag of any type
    // and put whatever I want in the content area
    <% using (Html.GenerateTag<SelectTag>(Model, new { href = Url.Action("ActionName") })) { %>
        <% foreach (var fund in Model.Funds) { %>
            <% using (Html.GenerateTag<OptionTag>(fund)) { %>
                <%= fund.Name %>
            <% } %>
        <% } %>
    <% } %>

This Html.GenerateTag helper is defined as:

    public static MMTag GenerateTag<T>(this HtmlHelper htmlHelper, object elementData, object attributes) where T : MMTag
    {
        return (T)Activator.CreateInstance(typeof(T), htmlHelper.ViewContext, elementData, attributes);
    }

Depending on the type of T, it'll create one of the types defined below:

    public class HtmlTypeBase : MMTag
    {
        public HtmlTypeBase() { }

        public HtmlTypeBase(ViewContext viewContext, params object[] elementData)
        {
            base._viewContext = viewContext;
            base.MergeDataToTag(viewContext, elementData);
        }
    }

    public class SelectTag : HtmlTypeBase
    {
        public SelectTag(ViewContext viewContext, params object[] elementData)
        {
            base._tag = new TagBuilder("select");
            //base.MergeDataToTag(viewContext, elementData);
        }
    }

    public class OptionTag : HtmlTypeBase
    {
        public OptionTag(ViewContext viewContext, params object[] elementData)
        {
            base._tag = new TagBuilder("option");
            //base.MergeDataToTag(viewContext, _elementData);
        }
    }

    public class AnchorTag : HtmlTypeBase
    {
        public AnchorTag(ViewContext viewContext, params object[] elementData)
        {
            base._tag = new TagBuilder("a");
            //base.MergeDataToTag(viewContext, elementData);
        }
    }

All of these types (anchor, select, option) inherit from HtmlTypeBase, which is intended to perform base.MergeDataToTag(viewContext, elementData);. This doesn't happen, though. It works if I uncomment the MergeDataToTag calls in the derived classes, but I don't want to repeat that same code for every derived class I create.

This is the definition for MMTag:

    public class MMTag : IDisposable
    {
        internal bool _disposed;
        internal ViewContext _viewContext;
        internal TextWriter _writer;
        internal TagBuilder _tag;
        internal object[] _elementData;

        public MMTag() {}

        public MMTag(ViewContext viewContext, params object[] elementData) { }

        public void Dispose()
        {
            Dispose(true /* disposing */);
            GC.SuppressFinalize(this);
        }

        protected virtual void Dispose(bool disposing)
        {
            if (!_disposed)
            {
                _disposed = true;
                _writer = _viewContext.Writer;
                _writer.Write(_tag.ToString(TagRenderMode.EndTag));
            }
        }

        protected void MergeDataToTag(ViewContext viewContext, object[] elementData)
        {
            Type elementDataType = elementData[0].GetType();
            foreach (PropertyInfo prop in elementDataType.GetProperties())
            {
                if (prop.PropertyType.IsPrimitive || prop.PropertyType == typeof(Decimal) || prop.PropertyType == typeof(String))
                {
                    object propValue = prop.GetValue(elementData[0], null);
                    string stringValue = propValue != null ? propValue.ToString() : String.Empty;
                    _tag.Attributes.Add(prop.Name, stringValue);
                }
            }

            var dic = new Dictionary<string, object>(StringComparer.OrdinalIgnoreCase);
            var attributes = elementData[1];
            if (attributes != null)
            {
                foreach (PropertyDescriptor descriptor in TypeDescriptor.GetProperties(attributes))
                {
                    object value = descriptor.GetValue(attributes);
                    dic.Add(descriptor.Name, value);
                }
            }
            _tag.MergeAttributes<string, object>(dic);

            _viewContext = viewContext;
            _viewContext.Writer.Write(_tag.ToString(TagRenderMode.StartTag));
        }
    }

Thanks,
Dave


What algorithms do "the big ones" use to cluster news?

- by marco92w

I want to cluster texts for a news website. At the moment I use this algorithm to find the related articles. But I found out that PHP's similar_text() gives very good results, too.

What sort of algorithms do "the big ones" (Google News, Topix, Techmeme, Wikio, Megite, etc.) use? Of course, nobody outside knows exactly how the algorithms work. It's secret. But maybe someone knows approximately how they work?

The algorithm I use at the moment is very slow. It only compares two articles at a time, so to get the relations between 5,000 articles you need about 12,500,000 comparisons (5,000 × 4,999 / 2). This is quite a lot. Are there alternatives that reduce the number of necessary comparisons? (I'm not looking for improvements to my algorithm.) What do "the big ones" do? I'm sure they don't compare every article to every other one, 12,500,000 times for 5,000 news items. It would be great if somebody could say something about this topic.


What Algorithm will Find New Longtail Keywords for keyword in PPC

- by Becci

I am looking for an algorithm (or combination of algorithms) that would allow someone to find new long-tail PPC search phrases based on, say, one core keyword.

Eg #1: word word corekeyword
Eg #2: word corekeyword word

The Google search tool allows a limited number vertically, mostly of eg #1 (https://adwords.google.com.au/select/KeywordToolExternal). I also know of other PPC apps that allow more volume than the Google AdWords keyword tool, but I want to find other combos that mention the core keyword and then sort them by highest search volume.

Working example of exact match:

corekeyword: copywriter (40,500 searches a month)
Google will serve up: become a copywriter (480 searches globally/month in English)
But if I specifically look up: how to become a copywriter (720 searches a month)

This exact long-tail keyword phrase has 240 more searches than the 3-word version spat out by Google. I want the algorithm to find any other highly searched exact long-tail phrases like "how to become a copywriter", simply because it would save significant money compared with finding other long-tail keywords after your campaign has been running and has made Google lots of money.

I don't want a concatenation algorithm (I already have one of those) because, hypothetically, I don't know in advance which keywords I want to find. Any gurus out there?

Becci


looking for a license key algorithm.

- by giulio

There are a lot of questions relating to license keys on Stack Overflow, but they don't answer this question. Can anyone provide a simple license key algorithm that is technology independent and doesn't require a diploma in mathematics to understand?

The license key algorithm is similar to public key encryption. I just need something simple that can be implemented on any platform (.NET/Java) and uses simple data like characters. Preferably no byte translations required. So if a person presents a string, a complementary string can be generated that is the authorisation code.

Below is a common scenario that it would be used for:

- Customer downloads the s/w, which generates a unique key upon initial startup/installation.
- S/w runs during the trial period.
- At the end of the trial period an authorisation key is required.
- Customer goes to the designated web site, enters their code and gets an authorisation code to enable the s/w, after paying :)

Don't be afraid to describe your answer as though you're talking to a 5 year old, as I am not a mathematician. I just need a decent basic algorithm; we're not launching nukes...

NB: Please no philosophy on encryption, nor on who Diffie-Hellman is. I just need a basic solution.
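One basic shape for the "string in, complementary string out" scenario above is a keyed hash: the authorisation code is a shortened HMAC of the installation key under a secret only the vendor knows. This is a minimal sketch, not a recommendation; it is symmetric rather than public-key, so if the software verifies the code offline the secret has to be embedded in it and can be extracted. The secret value, the function names, and the 16-character length are all assumptions.

```python
import hashlib
import hmac

# Hypothetical vendor secret; anyone who learns it can mint valid codes.
SECRET = b"replace-with-vendor-secret"
CODE_LENGTH = 16


def make_auth_code(install_key, secret=SECRET):
    """Derive the authorisation code for a customer's installation key."""
    digest = hmac.new(secret, install_key.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:CODE_LENGTH].upper()


def is_valid(install_key, auth_code, secret=SECRET):
    """Check a code entered by the customer against the expected one."""
    expected = make_auth_code(install_key, secret)
    return hmac.compare_digest(expected, auth_code.strip().upper())


code = make_auth_code("ABC-123")
print(code, is_valid("ABC-123", code))
```

HMAC-SHA256 is available in the standard libraries of both .NET and Java, so the same two functions port directly; only plain strings cross the boundary between customer and web site.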


How to keep only duplicates efficiently?

- by Marc Eaddy

Given an STL vector, I'd like an algorithm that outputs only the duplicates in sorted order, e.g.,

    INPUT : { 4, 4, 1, 2, 3, 2, 3 }
    OUTPUT: { 2, 3, 4 }

The algorithm is trivial, but the goal is to make it as efficient as std::unique(). My naive implementation modifies the container in-place:

    void keep_duplicates(vector<int>* pv)
    {
        // Sort (in-place) so we can find duplicates in linear time
        sort(pv->begin(), pv->end());

        vector<int>::iterator it_start = pv->begin();
        while (it_start != pv->end())
        {
            size_t nKeep = 0;

            // Find the next different element
            vector<int>::iterator it_stop = it_start + 1;
            while (it_stop != pv->end() && *it_start == *it_stop)
            {
                nKeep = 1; // This gets set redundantly
                ++it_stop;
            }

            // If the element is a duplicate, keep only the first one (nKeep=1).
            // Otherwise, the element is not duplicated so erase it (nKeep=0).
            it_start = pv->erase(it_start + nKeep, it_stop);
        }
    }

If you can make this more efficient, elegant, or general, please let me know. For example, a custom sorting algorithm, or copying elements in the second loop to eliminate the erase() call.
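The last suggestion, copying elements forward instead of calling erase() inside the loop, can be sketched as follows. The example is in Python for brevity rather than C++, and it assumes in-place modification is wanted, as in the original. A write index trails the read index, so each element is moved at most once and the tail is truncated a single time at the end.

```python
def keep_duplicates(values):
    """Keep one copy of each value that occurs more than once,
    in sorted order, modifying the list in place."""
    values.sort()
    write = 0          # next slot for a kept element
    i = 0              # start of the current run of equal values
    n = len(values)
    while i < n:
        j = i + 1
        while j < n and values[j] == values[i]:
            j += 1     # extend the run
        if j - i > 1:  # run length >= 2 means a duplicate
            values[write] = values[i]
            write += 1
        i = j
    del values[write:]  # single truncation instead of repeated erase
    return values


print(keep_duplicates([4, 4, 1, 2, 3, 2, 3]))
```

The same write-index pattern is how std::unique() avoids repeated erasure, so a C++ port would end with one `pv->erase(write_it, pv->end())` call.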


AStar in a specific case in C#

- by KiTe

Hello. For an internship, I have to use the A* algorithm in the following case: the unit shape is a square with a height and width of 1; we can travel within a zone represented by a rectangle and from one rectangle to another, but we can't travel outside these predefined areas; we go from one rectangle to another through a door, represented by a segment on the corresponding square edge.

Here are the two things I have already tried, neither of which satisfied my boss:

1. I created the following classes:
- a Door class, which contains the location of the 2 separated squares and the door's orientation (top, left, bottom, right)
- a Map class, which contains a door list, a rectangle list representing the walkable areas, and a 2D array representing the ground's squares (for additional information through an enumeration)
- classes for the A* algorithm (Node, AStar)

2. I created the following classes:
- a MapCase class, which contains information about the effect of the case (square) and its doors through an enumeration (with the [Flags] attribute set, to be able to combine several pieces of information on each case)
- a Map class, which only contains a 2D array of MapCase objects
- the classes for the A* algorithm (still Node and AStar)

Although the second version is better than the first (less useless calculation, better map class architecture), my boss is still not satisfied with the architecture of my mapping classes. The A* and Node classes are good and easily maintainable, so I don't think I have to explain them further for now.

So here is my question: does somebody have a good idea for implementing A* given the problem specification (walkable rectangles, but with a square unit area, travelling through doors)? My boss said that a grid view of the problem (so a 2D array) is not the correct way to solve it.

I hope I've been clear in explaining my problem.

Thanks,
KiTe


Bubble sort algorithm implementations (Haskell vs. C)

- by kingping

Hello. I have written two implementations of the bubble sort algorithm, in C and in Haskell. Haskell implementation:

    module Main where

    main = do
        contents <- readFile "./data"
        print "Data loaded. Sorting.."
        let newcontents = bubblesort contents
        writeFile "./data_new_ghc" newcontents
        print "Sorting done"

    bubblesort list = sort list [] False

    rev  = reverse -- separated. To see
    rev2 = reverse -- who calls the routine

    sort (x1:x2:xs) acc _ | x1 > x2 = sort (x1:xs) (x2:acc) True
    sort (x1:xs) acc flag = sort xs (x1:acc) flag
    sort [] acc True = sort (rev acc) [] False
    sort _ acc _ = rev2 acc

I've compared these two implementations by running both on a file 20 KiB in size. The C implementation took about a second, the Haskell one about 1 min 10 sec. I have also profiled the Haskell application.

Compile for profiling:

    C:\Temp> ghc -prof -auto-all -O --make Main

Profile:

    C:\Temp> Main.exe +RTS -p

and got these results. This is the pseudocode of the algorithm:

    procedure bubbleSort( A : list of sortable items ) defined as:
        do
            swapped := false
            for each i in 0 to length(A) - 2 inclusive do:
                if A[i] > A[i+1] then
                    swap( A[i], A[i+1] )
                    swapped := true
                end if
            end for
        while swapped
    end procedure

I wonder if it's possible to make the Haskell implementation work faster without changing the algorithm (there are actually a few tricks to make it work faster, but neither implementation has these optimizations).


Fast, very lightweight algorithm for camera motion detection?

- by Ertebolle

I'm working on an augmented reality app for iPhone that involves a very processor-intensive object recognition algorithm (pushing the CPU at 100% it can get through maybe 5 frames per second), and in an effort to both save battery power and make the whole thing less "jittery" I'm trying to come up with a way to only run that object recognizer when the user is actually moving the camera around. My first thought was to simply use the iPhone's accelerometers / gyroscope, but in testing I found that very often people would move the iPhone at a consistent enough attitude and velocity that there wouldn't be any way to tell that it was still in motion. So that left the option of analyzing the actual video feed and detecting movement in that. I got OpenCV working and tried running their pyramidal Lucas-Kanade optical flow algorithm, which works well but seems to be almost as processor-intensive as my object recognizer - I can get it to an acceptable framerate if I lower the depth levels / downsample the image / track fewer points, but then accuracy suffers and it starts to miss some large movements and trigger on small hand-shaking-y ones. So my question is, is there another optical flow algorithm that's faster than Lucas-Kanade if I just want to detect the overall magnitude of camera movement? I don't need to track individual objects, I don't even need to know which direction the camera is moving, all I really need is a way to feed something two frames of video and have it tell me how far apart they are.
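The cheapest thing that fits "feed it two frames and have it tell me how far apart they are" is plain frame differencing rather than optical flow. The sketch below is written in Python for brevity, not for the iPhone; it assumes frames are already downsampled grayscale grids (lists of rows of 0-255 intensities), and the threshold value is a hypothetical tuning parameter. It reports change in the image, so lighting changes or a moving subject will also trigger it.

```python
def frame_difference(prev, curr):
    """Mean absolute per-pixel difference between two equally sized
    grayscale frames given as lists of rows of intensities."""
    total = 0
    count = 0
    for row_a, row_b in zip(prev, curr):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count if count else 0.0


def camera_moved(prev, curr, threshold=8.0):
    """True when the frames differ by more than the (assumed) threshold."""
    return frame_difference(prev, curr) > threshold


still = [[10, 10], [10, 10]]
shifted = [[20, 20], [20, 20]]
print(frame_difference(still, still), frame_difference(still, shifted))
```

On a heavily downsampled frame (for example 32×32) this is a few thousand subtractions per frame, which leaves the expensive recognizer idle until the score crosses the threshold.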


How to get a list of "fastest miles" from a set of GPS Points

- by santiagobasulto

I'm trying to solve a weird problem. Maybe you guys know of some algorithm that takes care of this. I have data for a cargo freight truck and want to extract some information from it. Suppose I've got a list of sorted points that I get from the GPS. That's the route for that truck:

    [
        { "lng": "-111.5373066", "lat": "40.7231711", "time": "1970-01-01T00:00:04Z", "elev": "1942.1789265256325" },
        { "lng": "-111.5372056", "lat": "40.7228762", "time": "1970-01-01T00:00:07Z", "elev": "1942.109892409177" }
    ]

Now, what I want to get is a list of the "fastest miles". Here's an example. Given the points A, B, C, D, E, F: the distance from point A to point B is 1 mile, and the cargo took 10:32 minutes. From point B to point D I've got another mile, and the cargo took 10 minutes, etc. So I need a list sorted by time, similar to:

    B -> D: 10
    A -> B: 10:32
    D -> F: 11:02

Do you know any efficient algorithm that lets me calculate that? Thank you all.

PS: I'm using Python.

EDIT: I've got the distance. I know how to calculate it and there are plenty of posts on how to do that. What I need is an algorithm to tokenize by mile and get the speed from that. Having a distance function is not helpful enough:

    results = {}
    for point in points:
        aux_points = points.takeWhile(point>n) #This doesn't exist, just trying to be simple
        for aux_point in aux_points:
            d = distance(point, aux_point)
            if d == 1_MILE:
                time_elapsed = time(point, aux_point)
                results[time_elapsed] = (point, aux_point)

I'm still doing some pretty inefficient calculations.
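One way to "tokenize by mile" in a single pass is to work on cumulative distance rather than on pairwise distances. The sketch below assumes two parallel lists have already been derived from the GPS points: `cum_miles[i]`, the distance travelled from the first point to point `i`, and `seconds[i]`, the timestamp of point `i` in seconds. Both names, and the function name, are illustrative. Because GPS points rarely land exactly on a mile boundary, each segment closes at the first point at or past one mile, so segments can be slightly longer than a mile.

```python
def fastest_miles(cum_miles, seconds, mile=1.0):
    """Split the route into consecutive segments of at least one mile
    and return (elapsed_seconds, start_index, end_index) tuples,
    fastest first."""
    segments = []
    start = 0
    for end in range(1, len(cum_miles)):
        if cum_miles[end] - cum_miles[start] >= mile:
            segments.append((seconds[end] - seconds[start], start, end))
            start = end  # the next mile begins where this one ended
    return sorted(segments)


# Points A..F from the question, as indices 0..5, with made-up
# distances and times chosen to reproduce the example.
cum = [0.0, 1.0, 1.5, 2.0, 2.6, 3.0]
secs = [0, 632, 900, 1232, 1500, 1894]
print(fastest_miles(cum, secs))
```

The scan is linear in the number of points and the final sort is over the number of miles, which avoids the nested loop over all point pairs. Interpolating between the two points that straddle each mile boundary would give exact one-mile segments if that precision matters.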


Django: Applying Calculations To A Query Set

- by TheLizardKing

I have a QuerySet that I wish to pass to a generic view for pagination:

    links = Link.objects.annotate(votes=Count('vote')).order_by('-created')[:300]

This is my "hot" page, which lists my 300 latest submissions (10 pages of 30 links each). I now want to sort this QuerySet by an algorithm that HackerNews uses:

    (p - 1) / (t + 2)^1.5

    p = votes minus submitter's initial vote
    t = age of submission in hours

Because applying this algorithm over the entire database would be pretty costly, I am content with just the last 300 submissions. My site is unlikely to be the next digg/reddit, so while scalability is a plus, it is not required.

My question is: how do I iterate over my QuerySet and sort it by the above algorithm? For more information, here are my applicable models:

    class Link(models.Model):
        category = models.ForeignKey(Category, blank=False, default=1)
        user = models.ForeignKey(User)
        created = models.DateTimeField(auto_now_add=True)
        modified = models.DateTimeField(auto_now=True)
        url = models.URLField(max_length=1024, unique=True, verify_exists=True)
        name = models.CharField(max_length=512)

        def __unicode__(self):
            return u'%s (%s)' % (self.name, self.url)

    class Vote(models.Model):
        link = models.ForeignKey(Link)
        user = models.ForeignKey(User)
        created = models.DateTimeField(auto_now_add=True)

        def __unicode__(self):
            return u'%s vote for %s' % (self.user, self.link)

Notes: I don't have "downvotes", so just the presence of a Vote row is an indicator of a vote for a particular link by a particular user.
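Since only 300 rows are involved, one straightforward option is to score and sort them in Python after the query has run. The sketch below is framework-free so it can run on its own: it uses plain dicts with `votes` and `created` keys standing in for the annotated `Link` objects, and the names `hot_score` and `rank` are illustrative. It applies the formula exactly as written in the question, with `p` taken to be the annotated vote count.

```python
from datetime import datetime, timedelta, timezone


def hot_score(votes, created, now, gravity=1.5):
    """(p - 1) / (t + 2)^1.5 with t = age in hours."""
    age_hours = (now - created).total_seconds() / 3600.0
    return (votes - 1) / (age_hours + 2) ** gravity


def rank(links, now):
    """Return the links ordered from hottest to coldest."""
    return sorted(
        links,
        key=lambda link: hot_score(link["votes"], link["created"], now),
        reverse=True,
    )


now = datetime(2010, 1, 1, 12, 0, tzinfo=timezone.utc)
links = [
    {"name": "old-popular", "votes": 51, "created": now - timedelta(hours=23)},
    {"name": "fresh-few", "votes": 3, "created": now},
    {"name": "recent-good", "votes": 11, "created": now - timedelta(hours=1)},
]
print([link["name"] for link in rank(links, now)])
```

With real model instances the key function would read `link.votes` and `link.created` instead of dict keys. The result is a list rather than a QuerySet, so it should be sorted before it is handed to the paginator.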


Need help implementing this algorithm with map Hadoop MapReduce

- by Julia

Hi all! I have an algorithm that goes through a large data set, reads some text files and searches for specific terms in their lines. I have it implemented in Java, but I didn't want to post the code so that it doesn't look like I'm searching for someone to implement it for me. It is true, though, that I really need a lot of help! This was not planned for my project, but the data set turned out to be huge, so my teacher told me I have to do it like this.

EDIT (I did not make this clear in the previous version): The data set I have is on a Hadoop cluster, and I should write a MapReduce implementation of the algorithm.

I was reading about MapReduce and thought that I would first do the standard implementation and that it would then be more or less easy to do it with MapReduce. But that didn't happen: the algorithm is quite simple and nothing special, and yet with MapReduce I can't wrap my mind around it.

So here is the short pseudocode of my algorithm:

    LIST termList   (there is a method that creates this list from a Lucene index)
    FOLDER topFolder

    INPUT topFolder
    IF it is a folder and not empty
        list files (there are 30 sub folders inside)
        FOR EACH sub folder
            GET file "CheckedFile.txt"
            analyze(CheckedFile)
        ENDFOR
    END IF

    Method ANALYZE(CheckedFile)
        read CheckedFile
        WHILE CheckedFile has next line
            GET line
            FOR (loops through termList)
                GET third word from line
                IF third word = term from list
                    append whole line to string buffer
                ENDIF
            ENDFOR
        END WHILE
        OUTPUT string buffer to file

Also, as you can see, each time "analyze" is called a new file has to be created, and I understood that with MapReduce it is difficult to write to many outputs?

I understand the intuition behind MapReduce, and my example seems perfectly suited to it, but when it comes to actually doing it, I obviously do not know enough and I am STUCK! Please, please help.
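In MapReduce terms, the ANALYZE loop above is a filter, which usually maps onto a map step with little or no reduce work: each input line is examined independently and either emitted or dropped. The sketch below only simulates that locally in Python to show the shape of the job; it is not Hadoop code. The names `map_line` and `run_local` are hypothetical, and it assumes the term list is small enough to be available to every map task as a set.

```python
from collections import defaultdict


def map_line(line, terms):
    """Map step: emit (term, line) when the third word is a search term."""
    words = line.split()
    if len(words) >= 3 and words[2] in terms:
        yield words[2], line


def run_local(lines, terms):
    """Stand-in for the framework: feed every line to the mapper and
    group the emitted values by key, as the shuffle phase would."""
    grouped = defaultdict(list)
    for line in lines:
        for key, value in map_line(line, terms):
            grouped[key].append(value)
    return dict(grouped)


sample = ["a b cat d", "x y dog", "short line", "p q cat"]
print(run_local(sample, {"cat"}))
```

Looking the third word up in a set replaces the inner loop over termList. Whether the matches need to end up in one file per input folder or can simply be collected from the job's output directory is a separate decision that this sketch leaves open.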


Geohashing - recursively find neighbors of neighbors

- by itsme

I am looking for an elegant algorithm to recursively find neighbors of neighbors with the geohashing algorithm (http://www.geohash.org). Basically: take a central geohash, then get the first 'ring' of same-size hashes around it (8 elements), then, in the next step, get the next ring around the first, and so on. Have you heard of an elegant way to do so?

Brute force would be to take each neighbor and get its neighbors, simply ignoring the massive overlap. Finding the neighbors around one central geohash has been solved many times (e.g. here in Ruby: http://github.com/masuidrive/pr_geohash/blob/master/lib/pr_geohash.rb).

Edit for clarification: the current solution passes in a center key and a direction, like this (with corresponding lookup tables):

    def adjacent(geohash, dir)
      base, lastChr = geohash[0..-2], geohash[-1,1]
      type = (geohash.length % 2)==1 ? :odd : :even
      if BORDERS[dir][type].include?(lastChr)
        base = adjacent(base, dir)
      end
      base + BASE32[NEIGHBORS[dir][type].index(lastChr),1]
    end

(extract from Yuichiro MASUI's lib)

I say this approach will get ugly soon, because handling directions gets ugly once we are in ring two or three. The algorithm would ideally simply take two parameters, the center area and the distance: 0 being the center geohash only (["u0m"]), 1 being the first ring made of 8 geohashes of the same size around it (= [["u0t", "u0w"], ["u0q", "u0n"], ["u0j", "u0h"], ["u0k", "u0s"]]), 2 being the second ring with 16 areas around the first ring, etc.

Do you see any way to deduce the 'rings' from the bits in an elegant way?
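Same-size geohash cells form a regular grid, so the ring structure itself can be described without any direction lookups: ring r is the set of cells whose larger coordinate offset from the center is exactly r. The sketch below covers only that grid part, in Python, with a hypothetical function name. Turning an offset back into a geohash string (decoding the center to integer column and row indices, adding the offset, re-encoding) is assumed to be handled by whatever geohash library is in use and is not shown.

```python
def ring_offsets(r):
    """Return the (dx, dy) grid offsets of the cells in ring r around
    a center cell, i.e. all cells at Chebyshev distance exactly r."""
    if r == 0:
        return [(0, 0)]
    offsets = []
    for dx in range(-r, r + 1):
        for dy in range(-r, r + 1):
            if max(abs(dx), abs(dy)) == r:
                offsets.append((dx, dy))
    return offsets


for r in range(3):
    print(r, len(ring_offsets(r)))
```

The counts come out as 1, 8, 16, matching the ring sizes in the question, and in general ring r holds 8r cells. Cells near the poles or the date line would need wrapping or clipping of the integer indices, which this sketch ignores.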


Determining the chances of an event occurring when it hasn't occurred yet

- by sanity

A user visits my website at time t, and they may or may not click on a particular link I care about. If they do, I record the fact that they clicked the link, and also how long after t they clicked it; call this d. I need an algorithm that allows me to create a class like this:

    class ClickProbabilityEstimate {
        public void reportImpression(long id);
        public void reportClick(long id);
        public double estimateClickProbability(long id);
    }

Every impression gets a unique id, and this is used when reporting a click to indicate which impression the click belongs to. I need an algorithm that will return a probability, based on how much time has passed since an impression was reported, that the impression will receive a click, based on how long previous clicks took. Clearly one would expect this probability to decrease over time if there is still no click. If necessary, we can set an upper bound beyond which we consider the click probability to be 0 (e.g. if it's been an hour since the impression occurred, we can be pretty sure there won't be a click).

The algorithm should be both space and time efficient, and hopefully make as few assumptions as possible, while being elegant. Ease of implementation would also be nice. Any ideas?
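One low-assumption way to frame this is with Bayes' rule. Let p be the overall fraction of past impressions that were eventually clicked, and F(d) the fraction of past clicks that had already happened by delay d. Then the probability that an impression with no click after d will still be clicked is p·(1 − F(d)) / (1 − p·F(d)). The Python sketch below computes that from raw history; it assumes the history only contains impressions old enough that their outcome is settled, and the function and parameter names are illustrative rather than the class interface from the question.

```python
import bisect


def click_probability(elapsed, click_delays, impressions):
    """Probability that an impression with no click after `elapsed`
    seconds will still be clicked, estimated from past data.

    click_delays: delays (seconds) of past clicks
    impressions:  number of past impressions, clicked or not
    """
    if impressions == 0 or not click_delays:
        return 0.0
    delays = sorted(click_delays)
    p = len(delays) / impressions  # overall click rate
    # Empirical CDF: share of past clicks that happened by `elapsed`.
    f = bisect.bisect_right(delays, elapsed) / len(delays)
    remaining = 1.0 - p * f        # P(no click yet by `elapsed`)
    return p * (1.0 - f) / remaining if remaining > 0 else 0.0


history = [10, 20, 30, 40]  # four clicks out of eight impressions
for elapsed in (0, 20, 40):
    print(elapsed, click_probability(elapsed, history, 8))
```

The estimate starts at the overall click rate, falls as time passes without a click, and reaches zero once the elapsed time exceeds the slowest click seen so far, which gives the upper bound for free. For space efficiency the sorted delays could be replaced by a fixed-size histogram.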


Calculating distance between two X,Y coordinates

- by Umopepisdn

I am writing a tool for a game that involves calculating the distance between two coordinates on a spherical plane 500 units across. That is, [0,0] through [499,499] are valid coordinates, and [0,0] and [499,499] are also right next to each other.

Currently, in my application, I am comparing the distance between a city with an [X,Y] location and the user's own [X,Y] location, which they have configured in advance. To do this, I found this algorithm, which kind of works:

    Math.sqrt ( dx * dx + dy * dy );

Because sorting a paged list by distance is a useful thing to be able to do, I implemented this algorithm in a MySQL query and have made it available to my application using the following part of my SELECT statement:

    SQRT( POW( ( ".strval($sourceX)." - cityX ) , 2 ) + POW( ( ".strval($sourceY)." - cityY ) , 2 ) ) AS distance

This works fine for many calculations, but does not take into account the fact that [0,0] and [499,499] are kitty-corner to one another. Is there any way I can tweak this algorithm to generate an accurate distance, given that 0 and 499 are adjacent?

Thanks,
-Umo
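On a map that wraps around in both directions, the usual adjustment is to measure each axis the short way round before applying the same square-root formula: if the direct difference on an axis is more than half the map, going across the edge is shorter. A minimal Python sketch, assuming a 500-unit map as described (the function name is illustrative):

```python
import math


def wrapped_distance(a, b, size=500):
    """Euclidean distance between (x, y) points on a map whose edges
    wrap around, so that 0 and size - 1 are adjacent on each axis."""
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    dx = min(dx, size - dx)  # take the shorter way round on each axis
    dy = min(dy, size - dy)
    return math.hypot(dx, dy)


print(wrapped_distance((0, 0), (499, 499)))  # diagonal neighbours
print(wrapped_distance((10, 10), (13, 14)))  # no wrapping involved
```

The same per-axis adjustment can be written in the SELECT statement with MySQL's `LEAST()` and `ABS()` functions before squaring, which keeps the sorting in the database.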

Read the article

On counting pairs of words that differ by one letter

- by Quintofron

Let us consider n words, each of length k. Those words consist of letters over an alphabet (whose cardinality is n) with defined order. The task is to derive an O(nk) algorithm to count the number of pairs of words that differ by one position (no matter which one exactly, as long as it's only a single position). For instance, in the following set of words (n = 5, k = 4): abcd, abdd, adcb, adcd, aecd there are 5 such pairs: (abcd, abdd), (abcd, adcd), (abcd, aecd), (adcb, adcd), (adcd, aecd). So far I've managed to find an algorithm that solves a slightly easier problem: counting the number of pairs of words that differ by one GIVEN position (i-th). In order to do this I swap the letter at the ith position with the last letter within each word, perform a Radix sort (ignoring the last position in each word - formerly the ith position), linearly detect words whose letters at the first 1 to k-1 positions are the same, eventually count the number of occurrences of each letter at the last (originally ith) position within each set of duplicates and calculate the desired pairs (the last part is simple). However, the algorithm above doesn't seem to be applicable to the main problem (under the O(nk) constraint) - at least not without some modifications. Any idea how to solve this?
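One direction that may get to expected O(nk) is hashing instead of sorting: compute one polynomial hash per word, then derive the hash of the word "with position i blanked out" in O(1) by subtracting that letter's contribution. Words sharing a blanked hash at position i agree everywhere else. The sketch below assumes all words have the same length; `BASE` and `MOD` are arbitrary choices, and for long words the result is only correct with high probability because hash collisions are possible.

```python
from collections import Counter

MOD = (1 << 61) - 1  # large prime modulus (assumed choice)
BASE = 131           # assumed base, larger than any letter code used below


def count_pairs_differing_by_one(words):
    """Count unordered pairs of equal-length words differing in exactly one position."""
    if not words:
        return 0
    k = len(words[0])
    powers = [1] * k
    for i in range(1, k):
        powers[i] = powers[i - 1] * BASE % MOD

    blanked = Counter()         # (position, hash without that letter) -> count
    blanked_letter = Counter()  # same key plus the letter at that position
    for word in words:
        codes = [ord(ch) - ord("a") + 1 for ch in word]
        full = sum(c * p for c, p in zip(codes, powers)) % MOD
        for i in range(k):
            h = (full - codes[i] * powers[i]) % MOD
            blanked[(i, h)] += 1
            blanked_letter[(i, h, codes[i])] += 1

    def pairs(m):
        return m * (m - 1) // 2

    # Pairs agreeing outside position i, minus pairs that also agree at i
    # (those are identical words, not words differing by one letter).
    return (sum(pairs(m) for m in blanked.values())
            - sum(pairs(m) for m in blanked_letter.values()))
```

For the five example words above this returns 5: three pairs from the group a?cd, one from ab?d and one from adc?.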

Read the article

Need help implementing this algorithm with map reduce(hadoop)

- by Julia

Hi all! I have an algorithm that goes through a large data set, reads some text files and searches for specific terms in their lines. I have it implemented in Java, but I didn't want to post code so that it doesn't look like I am searching for someone to implement it for me, although it is true that I really need a lot of help! This was not planned for my project, but the data set turned out to be huge, so my teacher told me I have to do it like this. I was reading about MapReduce and thought that I would first do the standard implementation and that it would then be more or less easy to do it with MapReduce. But that didn't happen: the algorithm is quite simple and nothing special, and MapReduce... I can't wrap my mind around it. So here is short pseudocode of my algorithm:

LIST termList (there is a method that creates this list from a Lucene index)
FOLDER topFolder
INPUT topFolder
IF it is folder and not empty
  list files (there are 30 sub folders inside)
  FOR EACH sub folder
    GET file "CheckedFile.txt"
    analyze(CheckedFile)
  ENDFOR
END IF

Method ANALYZE(CheckedFile)
  read CheckedFile
  WHILE CheckedFile has next line
    GET line
    FOR(loops through termList)
      GET third word from line
      IF third word = term from list
        append whole line to string buffer
      ENDIF
    ENDFOR
  END WHILE
  OUTPUT string buffer to file

Also, as you can see, each time "analyze" is called a new file has to be created, and I understood that with MapReduce it is difficult to write to many outputs? I understand the MapReduce intuition, and my example seems perfectly suited for MapReduce, but when it comes to actually doing it, I obviously do not know enough and I am STUCK! Please, please help.
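To show how the pseudocode might split into a map step and a reduce step, here is a plain-Python simulation. It does not use Hadoop or any Hadoop API; the function names, the in-memory `inputs` dictionary and the choice of the source file name as the key are all assumptions made for illustration. The map step filters lines by their third word, and the reduce step collects the matching lines per source file, which stands in for the "one output file per CheckedFile" requirement.

```python
from collections import defaultdict


def map_line(source, line, terms):
    """Map step: emit (source, line) when the third word is a search term."""
    words = line.split()
    if len(words) >= 3 and words[2] in terms:
        yield source, line


def reduce_lines(source, lines):
    """Reduce step: gather all matching lines for one source file."""
    return source, "\n".join(lines)


def run_locally(inputs, terms):
    """Simulate map, shuffle and reduce in a single process."""
    grouped = defaultdict(list)
    for source, text in inputs.items():
        for line in text.splitlines():
            for key, value in map_line(source, line, terms):
                grouped[key].append(value)
    return dict(reduce_lines(key, values) for key, values in grouped.items())
```

Holding `terms` in a set makes the inner FOR loop over termList unnecessary, since the membership check is a single hash lookup.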

Read the article

License key pattern detection?

- by Ricket

This is not a real situation; please ignore legal issues that you might think apply, because they don't. Let's say I have a set of 200 known valid license keys for a hypothetical piece of software's licensing algorithm, and a license key consists of 5 sets of 5 alphanumeric case-insensitive (all uppercase) characters. Example: HXDY6-R3DD7-Y8FRT-UNPVT-JSKON Is it possible (or likely) to extrapolate other possible keys for the system? What if the set was known to be consecutive; how do the methods change for this situation, and what kind of advantage does this give? I have heard of "keygens" before, but I believe they are probably made by decompiling the licensing software rather than examining known valid keys. In this case, I am only given the set of keys and I must determine the algorithm. I'm also told it is an industry standard algorithm, so it's probably not something basic, though the chance is always there I suppose. If you think this doesn't belong in Stack Overflow, please at least suggest an alternate place for me to look or ask the question. I honestly don't know where to begin with a problem like this. I don't even know the terminology for this kind of problem.

Read the article

Vacancy Tracking Algorithm implementation in C++

- by Dave

I'm trying to use the vacancy tracking algorithm to perform transposition of multidimensional arrays in C++. The arrays come as void pointers so I'm using address manipulation to perform the copies. Basically, there is an algorithm that starts with an offset and works its way through the whole 1-d representation of the array like swiss cheese, knocking out other offsets until it gets back to the original one. Then, you have to start at the next, untouched offset and do it again. You repeat until all offsets have been touched. Right now, I'm using a std::set to just fill up all possible offsets (0 up to the multiplicative fold of the dimensions of the array). Then, as I go through the algorithm, I erase from the set. I figure this would be fastest because I need to randomly access offsets in the tree/set and delete them. Then I need to quickly find the next untouched/undeleted offset. First of all, filling up the set is very slow and it seems like there must be a better way. It's individually calling new[] for every insert. So if I have 5 million offsets, there's 5 million news, plus re-balancing the tree constantly which as you know is not fast for a pre-sorted list. Second, deleting is slow as well. Third, assuming 4-byte data types like int and float, I'm using up actually the same amount of memory as the array itself to store this list of untouched offsets. Fourth, determining if there are any untouched offsets and getting one of them is fast -- a good thing. Does anyone have suggestions for any of these issues?
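One alternative to the std::set of untouched offsets is a flat array of visited flags: marking an offset is a single write, and the next untouched offset is found by a forward scan that never goes backwards. The sketch below illustrates the cycle-following idea in Python for the 2-D case only (the question concerns general multidimensional arrays in C++), with one byte per flag; a bit vector would cut that to one bit per element.

```python
def transpose_in_place(data, rows, cols):
    """Transpose a rows x cols matrix stored row-major in the flat list `data`."""
    n = rows * cols
    visited = bytearray(n)       # one flag per offset instead of a tree node
    for start in range(n):       # linear scan finds the next untouched offset
        if visited[start]:
            continue
        value = data[start]
        pos = start
        while True:
            visited[pos] = 1
            r, c = divmod(pos, cols)
            dest = c * rows + r  # where the element at `pos` must end up
            # Drop the carried value at its destination and pick up the
            # element that was there, then continue along the cycle.
            value, data[dest] = data[dest], value
            pos = dest
            if pos == start:
                break
    return data
```

For example, the 2 x 3 matrix stored as [1, 2, 3, 4, 5, 6] becomes [1, 4, 2, 5, 3, 6], which is the 3 x 2 transpose in row-major order.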

Read the article

Find existence of number in a sorted list in constant time? (Interview question)

- by Rich

I'm studying for upcoming interviews and have encountered this question several times (written verbatim) Find or determine non existence of a number in a sorted list of N numbers where the numbers range over M, M N and N large enough to span multiple disks. Algorithm to beat O(log n); bonus points for constant time algorithm. First of all, I'm not sure if this is a question with a real solution. My colleagues and I have mused over this problem for weeks and it seems ill formed (of course, just because we can't think of a solution doesn't mean there isn't one). A few questions I would have asked the interviewer are: Are there repeats in the sorted list? What's the relationship to the number of disks and N? One approach I considered was to binary search the min/max of each disk to determine the disk that should hold that number, if it exists, then binary search on the disk itself. Of course this is only an order of magnitude speedup if the number of disks is large and you also have a sorted list of disks. I think this would yield some sort of O(log log n) time. As for the M N hint, perhaps if you know how many numbers are on a disk and what the range is, you could use the pigeonhole principle to rule out some cases some of the time, but I can't figure out an order of magnitude improvement. Also, "bonus points for constant time algorithm" makes me a bit suspicious. Any thoughts, solutions, or relevant history of this problem?
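The O(log log n) figure mentioned above is usually associated with interpolation search, which guesses the position of the target from its value instead of always probing the middle. The sketch below works on an in-memory list and ignores the multi-disk aspect. The O(log log n) expectation only holds if the values are roughly uniformly distributed, which is an assumption the question does not state; the worst case is linear.

```python
def interpolation_search(values, target):
    """Return True if `target` occurs in the sorted list `values`."""
    lo, hi = 0, len(values) - 1
    while lo <= hi and values[lo] <= target <= values[hi]:
        if values[hi] == values[lo]:
            return values[lo] == target
        # Estimate the position assuming values grow roughly linearly.
        mid = lo + (target - values[lo]) * (hi - lo) // (values[hi] - values[lo])
        if values[mid] == target:
            return True
        if values[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return False
```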

Read the article

15 Stylish Navigation Menus For Inspiration

- by Jyoti

A site’s navigation menu is one of the most prominent things that users see when they first visit. There are many ways to design a navigation menu and since almost all websites have some form of navigation designers have to push their creative limits to build one that’s remarkable and outstanding. In this article, you’ll [...]

Read the article

How does I/O work for large graph databases?

- by tjb1982

I should preface this by saying that I'm mostly a front end web developer, trained as a musician, but over the past few years I've been getting more and more into computer science. So one idea I have as a fun toy project to learn about data structures and C programming was to design and implement my own very simple database that would manage an adjacency list of posts. I don't want SQL (maybe I'll do my own query language? I'm just having fun). It should support ACID. It should be capable of storing 1TB let's say. So with that, I was trying to think of how a database even stores data, without regard to data structures necessarily. I'm working on linux, and I've read that in that world "everything is a file," including hardware (like /dev/*), so I think that that obviously has to apply to a database, too, and it clearly does--whether it's MySQL or PostgreSQL or Neo4j, the database itself is a collection of files you can see in the filesystem. That said, there would come a point in scale where loading the entire database into primary memory just wouldn't work, so it doesn't make sense to design it with that mindset (I assume). However, reading from secondary memory would be much slower and regardless some portion of the database has to be in primary memory in order for you to be able to do anything with it. I read this post: Why use a database instead of just saving your data to disk? And I found it difficult to understand how other databases, like SQLite or Neo4j, read and write from secondary memory and are still very fast (faster, it would seem, than simply writing files to the filesystem as the above question suggests). It seems the key is indexing. But even indexes need to be stored in secondary memory. They are inherently smaller than the database itself, but indexes in a very large database might be prohibitively large, too. 
So my question is how is I/O generally done with large databases like the one I described above that would be at least 1TB storing a big adjacency list? If indexing is more or less the answer, how exactly does indexing work--what data structures should be involved?
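As a toy illustration of why an index limits I/O, the sketch below keeps records sorted in a file of fixed-size pages and holds only the first key of each page in memory, so a lookup seeks to one page and reads just that page. Everything here is an assumption made for illustration: the record layout (two 64-bit integers), the tiny page size and the function names. Real databases typically use B-trees or similar structures with a page cache rather than this flat scheme.

```python
import bisect
import struct

RECORD = struct.Struct("<qq")  # assumed layout: (key, value), 64-bit ints
PAGE_RECORDS = 4               # tiny page for illustration only
PAGE_BYTES = RECORD.size * PAGE_RECORDS


def write_sorted(path, items):
    """Write (key, value) pairs to `path` in key order."""
    with open(path, "wb") as f:
        for key, value in sorted(items):
            f.write(RECORD.pack(key, value))


def build_sparse_index(path):
    """Return the first key of every page; only this list stays in memory."""
    first_keys = []
    with open(path, "rb") as f:
        while True:
            page = f.read(PAGE_BYTES)
            if not page:
                break
            first_keys.append(RECORD.unpack_from(page, 0)[0])
    return first_keys


def lookup(path, first_keys, key):
    """Find `key` by reading exactly one page from disk."""
    page_no = bisect.bisect_right(first_keys, key) - 1
    if page_no < 0:
        return None
    with open(path, "rb") as f:
        f.seek(page_no * PAGE_BYTES)
        page = f.read(PAGE_BYTES)
    for k, v in RECORD.iter_unpack(page):
        if k == key:
            return v
    return None
```

With 16-byte records and, say, 4 KiB pages, the in-memory list would hold one key per 256 records; when even that list grows too large, the same trick can be applied to the index itself, which is roughly the idea behind multi-level tree indexes.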

Search Results

Search found 17940 results on 718 pages for 'algorithm design'.
