Efficient most common suffix algorithm?

Posted by taw on Stack Overflow
Published on 2010-06-07T06:50:10Z Indexed on 2010/06/07 6:52 UTC

Filed under:

string

I have a few GB worth of strings, and for any given prefix I want to find the 10 most common suffixes. Is there an efficient algorithm for that?

An obvious solution would be:

Store a sorted list of <string, count> pairs.
Use binary search to find the extent (the contiguous range of entries) that starts with the prefix being queried.
Find the 10 highest counts in this extent.
Possibly precompute the answer for all short prefixes, so that a query never needs to look at a large portion of the data.
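
The first three steps can be sketched roughly as follows. This is only a minimal illustration, assuming Python (the question names no language) and data small enough to fit in memory; build_index and top_suffixes are hypothetical names, not part of any library. A string equal to the prefix is reported with an empty suffix.

```python
import bisect
import heapq
from collections import Counter


def build_index(strings):
    """Step 1: count duplicates and keep the distinct strings in sorted order."""
    pairs = sorted(Counter(strings).items())
    keys = [s for s, _ in pairs]
    counts = [c for _, c in pairs]
    return keys, counts


def top_suffixes(index, prefix, k=10):
    """Return up to k (suffix, count) pairs for strings starting with prefix."""
    keys, counts = index
    # Step 2: strings sharing a prefix form one contiguous run in sorted order.
    lo = bisect.bisect_left(keys, prefix)
    # Binary search for the end of that run: from lo onward, matching
    # entries come first and non-matching entries follow.
    left, right = lo, len(keys)
    while left < right:
        mid = (left + right) // 2
        if keys[mid].startswith(prefix):
            left = mid + 1
        else:
            right = mid
    hi = left
    # Step 3: pick the k highest counts inside [lo, hi).
    best = heapq.nlargest(k, range(lo, hi), key=lambda i: counts[i])
    return [(keys[i][len(prefix):], counts[i]) for i in best]


index = build_index(["apple", "apple", "apply", "app", "apt",
                     "banana", "apple", "apply"])
print(top_suffixes(index, "app", k=2))
```

Note that step 3 is linear in the size of the extent, so a very short prefix still scans a large part of the list. That is the cost the precomputation in the last step is meant to avoid.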

I'm not sure if that would actually be efficient at all. Is there a better way I overlooked?

Queries must be answered in real time, but preprocessing can take as long as necessary.
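
Since preprocessing time is not a constraint, the precomputation for short prefixes mentioned above could look something like this. Again a hedged Python sketch under the assumption that the distinct strings and their counts fit in memory; precompute_top_suffixes and the max_prefix_len cutoff are hypothetical, chosen for illustration only.

```python
import heapq
from collections import Counter, defaultdict


def precompute_top_suffixes(strings, max_prefix_len=3, k=10):
    """Map every prefix of length 0..max_prefix_len to its k most common suffixes."""
    counts = Counter(strings)
    heaps = defaultdict(list)  # prefix -> min-heap of (count, suffix), size <= k
    for s, c in counts.items():
        for n in range(min(len(s), max_prefix_len) + 1):
            heap = heaps[s[:n]]
            item = (c, s[n:])
            if len(heap) < k:
                heapq.heappush(heap, item)
            else:
                # Push the new item, then drop the smallest, keeping the k largest.
                heapq.heappushpop(heap, item)
    # Store each answer sorted by descending count for direct lookup.
    return {p: [(suffix, c) for c, suffix in sorted(h, reverse=True)]
            for p, h in heaps.items()}


table = precompute_top_suffixes(
    ["apple", "apple", "apply", "app", "apt", "banana", "apple", "apply"],
    max_prefix_len=3, k=2)
print(table["ap"])
```

With such a table, a query for a short prefix is a single dictionary lookup, and only prefixes longer than the cutoff fall back to searching the sorted list, where the matching extent is smaller.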
