reserved words - Page 23

Lucene .Net Searching with TermVector

- by Ashish

in Lucene.Net,i am creating the document for searching a word and want to display before 10 words and after 10 words.i have used TermVector. Lucene.Net.Documents.Field fldContent = new Lucene.Net.Documents.Field("content", content, Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.TOKENIZED, Lucene.Net.Documents.Field.TermVector.WITH_POSITIONS_OFFSETS); Can anyone help me how to find out the keyword position and extract nearest 15 words. please send some code. Thanks Ashish

Read the article

crossword algorithm....

- by teddy

I'm making algorithm like crossword, but i dont know how to design d algorith. for example, there are words like 'car', 'apple' in the dictionary. and the 'app' words is given on the board. and there are letters like 'l' 'e' 'c' 'r'....for making words. so the algorithm work is making correct words which are stored in dictionary. app - lapp- leapp- lecapp- .... - lappe - eappc - ... - appl - apple(correct answer) what is the best solution for this algorithm?

Read the article

Sorting a file with 55K rows and varying Columns

- by Prasad

Hi I want to find a programmatic solution using C++. I have a 900 files each of 27MB size. (just to inform about the enormity ). Each file has 55K rows and Varying columns. But the header indicates the columns I want to sort the rows in an order w.r.t to a Column Value. I wrote the sorting algorithm for this (definitely my newbie attempts, you may say). This algorithm is working for few numbers, but fails for larger numbers. Here is the code for the same: basic functions I defined to use inside the main code: int getNumberOfColumns(const string& aline) { int ncols=0; istringstream ss(aline); string s1; while(ss>>s1) ncols++; return ncols; } vector<string> getWordsFromSentence(const string& aline) { vector<string>words; istringstream ss(aline); string tstr; while(ss>>tstr) words.push_back(tstr); return words; } bool findColumnName(vector<string> vs, const string& colName) { vector<string>::iterator it = find(vs.begin(), vs.end(), colName); if ( it != vs.end()) return true; else return false; } int getIndexForColumnName(vector<string> vs, const string& colName) { if ( !findColumnName(vs,colName) ) return -1; else { vector<string>::iterator it = find(vs.begin(), vs.end(), colName); return it - vs.begin(); } } ////////// I like the Recurssive functions - I tried to create a recursive function ///here. This worked for small values , say 20 rows. But for 55K - core dumps void sort2D(vector<string>vn, vector<string> &srt, int columnIndex) { vector<double> pVals; for ( int i = 0; i < vn.size(); i++) { vector<string>meancols = getWordsFromSentence(vn[i]); pVals.push_back(stringToDouble(meancols[columnIndex])); } srt.push_back(vn[max_element(pVals.begin(), pVals.end())-pVals.begin()]); if (vn.size() > 1 ) { vn.erase(vn.begin()+(max_element(pVals.begin(), pVals.end())-pVals.begin()) ); vector<string> vn2 = vn; //cout<<srt[srt.size() -1 ]<<endl; sort2D(vn2 , srt, columnIndex); } } Now the main code: for ( int i = 0; i < TissueNames.size() -1; i++) { for ( int j = i+1; j < TissueNames.size(); j++) { //string fname = path+"/gse7307_Female_rma"+TissueNames[i]+"_"+TissueNames[j]+".txt"; //string fname2 = sortpath2+"/gse7307_Female_rma"+TissueNames[i]+"_"+TissueNames[j]+"Sorted.txt"; string fname = path+"/gse7307_Male_rma"+TissueNames[i]+"_"+TissueNames[j]+".txt"; string fname2 = sortpath2+"/gse7307_Male_rma"+TissueNames[i]+"_"+TissueNames[j]+"4Columns.txt"; //vector<string>AllLinesInFile; BioInputStream fin(fname); string aline; getline(fin,aline); replace (aline.begin(), aline.end(), '"',' '); string headerline = aline; vector<string> header = getWordsFromSentence(aline); int pindex = getIndexForColumnName(header,"p-raw"); int xcindex = getIndexForColumnName(header,"xC"); int xeindex = getIndexForColumnName(header,"xE"); int prbindex = getIndexForColumnName(header,"X"); string newheaderline = "X\txC\txE\tp-raw"; BioOutputStream fsrt(fname2); fsrt<<newheaderline<<endl; int newpindex=3; while ( getline(fin, aline) ){ replace (aline.begin(), aline.end(), '"',' '); istringstream ss2(aline); string tstr; ss2>>tstr; tstr = ss2.str().substr(tstr.length()+1); vector<string> words = getWordsFromSentence(tstr); string values = words[prbindex]+"\t"+words[xcindex]+"\t"+words[xeindex]+"\t"+words[pindex]; AllLinesInFile.push_back(values); } vector<string>SortedLines; sort2D(AllLinesInFile, SortedLines,newpindex); for ( int si = 0; si < SortedLines.size(); si++) fsrt<<SortedLines[si]<<endl; cout<<"["<<i<<","<<j<<"] = "<<SortedLines.size()<<endl; } } can some one suggest me a better way of doing this? why it is failing for larger values. ? The primary function of interest for this query is Sort2D function. thanks for the time and patience. prasad.

Read the article

Clarification on recaptcha

- by luckytaxi

Um so I was in for a little bit of a surprise tonight. I spent a good 20 mins trying to figure out why I was able to submit a form knowing that what I entered into the recaptcha field was invalid. Is it true that you don't need to input the exact words it displays? If it shows me two words and I misspelled one of the words, I still pass validation? Same goes if "hello world" and I input "hell man" it still works.

Read the article

f# iterating over two arrays, using function from a c# library

- by user343550

I have a list of words and a list of associated part of speech tags. I want to iterate over both, simultaneously (matched index) using each indexed tuple as input to a .NET function. Is this the best way (it works, but doesn't feel natural to me): let taggingModel = SeqLabeler.loadModel(lthPath + "models\penn_00_18_split_dict.model"); let lemmatizer = new Lemmatizer(lthPath + "v_n_a.txt") let input = "the rain in spain falls on the plain" let words = Preprocessor.tokenizeSentence( input ) let tags = SeqLabeler.tagSentence( taggingModel, words ) let lemmas = Array.map2 (fun x y -> lemmatizer.lookup(x,y)) words tags

Read the article

recaptcha still submits form when one word invalid

- by luckytaxi

Um so I was in for a little bit of a surprise tonight. I spent a good 20 mins trying to figure out why I was able to submit a form knowing that what I entered into the recaptcha field was invalid. Is it true that you don't need to input the exact words it displays? If it shows me two words and I misspelled one of the words, I still pass validation? Same goes if "hello world" and I input "hell man" it still works.

Read the article

treeview loses data when page is being refreshed in asp.net

- by Greg

Hi, I have a treeview and I written a code for his "treeNodePopulate" event: protected void ycActiveTree_TreeNodePopulate(object sender, TreeNodeEventArgs e) { if (Application["idList"] != null && e.Node.Depth == 0) { string[] words = ((String)Application["idList"]).Split(' '); // Yellow Card details TreeNode child = new TreeNode(""); // Go over all the yellow card details and populate the treeview for (int i = 1; i < words.Length; i++) { child.SelectAction = TreeNodeSelectAction.None; // Same yellow card if (words[i] != "*") { // End of details and start of point ip's if (words[i] == "$") { // Add the yellow card node TreeNode yellowCardNode = new TreeNode(child.Text); yellowCardNode.SelectAction = TreeNodeSelectAction.Expand; e.Node.ChildNodes.Add(yellowCardNode); child.Text = ""; } // yellow card details else { child.Text = child.Text + words[i] + " "; } } // End of yellow card else { child.PopulateOnDemand = false; child.SelectAction = TreeNodeSelectAction.None; // Populate the yellow card node e.Node.ChildNodes[e.Node.ChildNodes.Count - 1].ChildNodes.Add(child); TreeNode moveChild = new TreeNode("Move To Reviewed"); moveChild.PopulateOnDemand = false; moveChild.SelectAction = TreeNodeSelectAction.Select; e.Node.ChildNodes[e.Node.ChildNodes.Count - 1].ChildNodes.Add(moveChild); child = new TreeNode(""); } } Application["idList"] = null; } } I want the treenode to get the data from the Application variable and then nullify the Application variable so that the tree will take data from Applcation only if there is something in it (I put data into the application from another page and then redirect to this page) But when I refresh this page the data in the treenode isnt being saved. I mean after the refresh the Application is null so he isnt doing anything. The question is why is the data that I put in the treenode earlier isnt being saved? The "enableViewState" property is set to "true".. Thanks in advance, Greg

Read the article

html editor properties

- by Ranjana

i have used the html editor to my page <%@ Register Assembly="AjaxControlToolkit" Namespace="AjaxControlToolkit.HTMLEditor" TagPrefix="cc1" % i have sent the values to the database as txtjobdesc.Content.Tostring(); but if i type just a paragraph in the editor it is displaying the same Description. But if i use any Bullets and Highlighted words it is displaying as words above the bulleted words.how to make it display as a html description pls help me out..

Read the article

Is it possible to shorten my main function in this code?

- by AjiPorter

Is it possible for me to shorten my main() by creating a class? If so, what part of my code would most likely be inside the class? Thanks again to those who'll answer. :) #include <iostream> #include <fstream> #include <string> #include <ctime> #include <cstdlib> #define SIZE 20 using namespace std; struct textFile { string word; struct textFile *next; }; textFile *head, *body, *tail, *temp; int main() { ifstream wordFile("WORDS.txt", ios::in); // file object constructor /* stores words in the file into an array */ string words[SIZE]; char pointer; int i; for(i = 0; i < SIZE; i++) { while(wordFile >> pointer) { if(!isalpha(pointer)) { pointer++; break; } words[i] = words[i] + pointer; } } /* stores the words in the array to a randomized linked list */ srand(time(NULL)); int index[SIZE] = {0}; // temporary array of index that will contain randomized indexes of array words int j = 0, ctr; // assigns indexes to array index while(j < SIZE) { i = rand() % SIZE; ctr = 0; for(int k = 0; k < SIZE; k++) { if(!i) break; else if(i == index[k]) { // checks if the random number has previously been stored as index ctr = 1; break; } } if(!ctr) { index[j] = i; // assigns the random number to the current index of array index j++; } } /* makes sure that there are no double zeros on the array */ ctr = 0; for(i = 0; i < SIZE; i++) { if(!index[i]) ctr++; } if(ctr > 1) { int temp[ctr-1]; for(j = 0; j < ctr-1; j++) { for(i = 0; i < SIZE; i++) { if(!index[i]) { int ctr2 = 0; for(int k = 0; k < ctr-1; k++) { if(i == temp[k]) ctr2 = 1; } if(!ctr2) temp[j] = i; } } } j = ctr - 1; while(j > 0) { i = rand() % SIZE; ctr = 0; for(int k = 0; k < SIZE; k++) { if(!i || i == index[k]) { ctr = 1; break; } } if(!ctr) { index[temp[j-1]] = i; j--; } } } head = tail = body = temp = NULL; for(j = 0; j < SIZE; j++) { body = (textFile*) malloc (sizeof(textFile)); body->word = words[index[j]]; if(head == NULL) { head = tail = body; } else { tail->next = body; tail = body; cout << tail->word << endl; } } temp = head; while(temp != NULL) { cout << temp->word << endl; temp = temp->next; } return 0; }

Read the article

Using a Custom Dictionary with Microsoft's MODI

- by ejrichards

I am currently using Microsoft's MODI (Microsoft Office Document Imaging) to read text in an image in C#. Everything is working fine, except some of the words I want to read are not real English words. Is there any way to use a custom dictionary when using MODI or add words to the regular English dictionary that it uses?

Read the article

Python slicing a string using space characters and a maximum length

- by chrism

I'd like to slice a string up in a similar way to .split() (so resulting in a list) but in a more intelligent way: I'd like it to split it into chunks that are up to 15 characters, but are not split mid word so: string = 'A string with words' [splitting process takes place] list = ('A string with','words') The string in this example is split between 'with' and 'words' because that's the last place you can split it and the first bit be 15 characters or less.

Read the article

Split string by newline and space in Bourne shell

- by l0b0

I'm currently using the following to split a file into words - Is there some quicker way? while read -r line do for word in $line do words="${words}\n${word}" done done

Read the article

Ignore duplicates in regex pattern

- by gAMBOOKa

I have a regex pattern that searches for words in a text file. How do I ignore duplicates? For instance, take a look at this code $pattern = '/(lorem|ipsum|daboom|pahwal|ababaga)/i'; $num_found = preg_match_all( $pattern, $string, $matches ); echo "$num_found match(es) found!"; echo "Matched words: " . implode( ',', $matches[0] ); If I have more than one say lorem in the article, the output will be something like this 5 matches found! Matched words: daboom,lorem,lorem,lorem,lorem I want the pattern to only find the first occurrence, and ignore the rest, so the output should be: 2 matches found! Matched words: daboom,lorem

Read the article

Flex wordwrap issue with multiple text instances

- by Craig Myles

Hi, I have a scenario where I want to dynamically add words of text to a container so that it forms a paragraph of text which is wrapped neatly according to the size of the parent container. Each text element will have differing formatting, and will have differing user interaction options. For example, imagine the text " has just spoken out about ". Each word will be added to the container one at a time, at run time. The username in this case would be bold, and if clicked on will trigger an event. Same with the news article. The rest of the text is just plain text which, when clicked on, would do nothing. Now, I'm using Flex 3 so I don't have access to the fancy new text formatting tools. I've implemented a solution where the words are plotted onto a canvas, but this means that the words are wrapped at a particular y position (an arbitrary value I've chosen). When the container is resized, the words still wrap at that position which leaves lots of space. I thought about adding each text element to an Array Collection and using this as a datasource for a Tile List, but Tile Lists don't support variable column widths (in my limited knowledge) so each word would use the same amount of space which isn't ideal. Does anyone know how I can plot words onto a container so that I can retain formatting, events and word wrapping at paragraph level, even if the container is resized?

Read the article

Is there stl and utf8 friendly C++ Wrapper for ICU, or other powerful unicode library

- by artyom

Hello, I need a good Unicode library for C++. I need Transformations in Unicode sensitive way. For example sort all strings in case insensitive way and get their first characters for index. Convert to upper and to lower various Unicode strings. Split text in reasonable position -- words that would work for Chinese and Japanese as well. Formatting numbers, dates in locale sensitive way (should be thread safe). Transparent support of utf8 (primary internal representation). As far as I know the best library is ICU. However, I can't find normal developer friendly API documentation with examples. Also as far as I see, it is not too friendly with modern C++ design, work with STL and so on. Like this std::string msg; unistring umsg.from_utf8(msg); unistring::word_iterator wi; for(wi=umsg.words().begin(),n=0;wi!=usmg.words().wi_end(),n<10;++wi,++n) ; msg=umsg.substr(umsg.words().begin(),wi).to_utf8(); cout<<_("Five 10 words are ")<<msg; Does anybody know good STL friendly ICU wrapper released under Open Source license preferred permissive like MIT or Boost, but others LGPLv2 compatible are ok as well. Is there another high quality library similar to ICU? Platform: UNIX/POSIX, Windows support is not required. Thanks, Artyom Edit: Unfortunatly I wasn't logged in so I can't make asnver accepted... I had attached the ansver by myself.

Read the article

Explain JAVA code

- by MIW

I need some help to explain the meaning from line 5 to line 9. Thanks String words = "Rain Rain go away"; String mutation1, mutation2, mutation3, mutation4; mutation1 = words.toUpperCase(); System.out.println ("** " + mutation1 + " Nursery Rhyme **"); mutation1 = words.concat ("\nCome again another day"); mutation2 = "Johnny Johnny wants to play"; mutation3 = mutation2.replace (mutation2.charAt(5), 'i'); mutation4 = mutation3.substring (7, 27); System.out.print ("\'" + mutation1 + "\n" + mutation4 + "\'\n"); 10.System.out.println ("Title length: " + words.length());

Read the article

L10N: Trusted test data for Locale Specific Sorting

- by Chris Betti

I'm working on an internationalized database application that supports multiple locales in a single instance. When international users sort data in the applications built on top of the database, the database theoretically sorts the data using a collation appropriate to the locale associated with the data the user is viewing. I'm trying to find sorted lists of words that meet two criteria: the sorted order follows the collation rules for the locale the words listed will allow me to exercise most / all of the specific collation rules for the locale I'm having trouble finding such trusted test data. Are such sort-testing datasets currently available, and if so, what / where are they? "words.en.txt" is an example text file containing American English text: Andrew Brian Chris Zachary I am planning on loading the list of words into my database in randomized order, and checking to see if sorting the list conforms to the original input. Because I am not fluent in any language other than English, I do not know how to create sample datasets like the following sample one in French (call it "words.fr.txt"): cote côte coté côté The French prefer diacritical marks to be ordered right to left. If you sorted that using code-point order, it likely comes out like this (which is an incorrect collation): cote coté côte côté Thank you for the help, Chris

Read the article

find word and score based on positions

- by ryder1211212

hey guys i have a textfile i have divided it into 4 parts. i want to search each part for the words that appear in each part and score that word exmaple welcome to the national basketball finals,the basketball teams here today have come a long way. without much delay lets play basketball. i will want to return national = 1 as it appears only in one part etc am working on determining text context using word position. am working with c# and not very good in text processing basically if a word appears in the 4 sections it scores 4 if a word appears in the 3 sections it scores 3 if a word appears in the 2 sections it scores 2 if a word appears in the 1 section it scores 1 thanks in advance so far i have this var s = "welcome to the national basketball finals,the basketball teams here today have come a long way. without much delay lets play basketball. "; var numberOfParts = 4; var eachPartLength = s.Length / numberOfParts; var parts = new List<string>(); var words = Regex.Split(s, @"\W").Where(w => w.Length > 0); // this splits all words, removes empty strings var wordsIndex = 0; for (int i = 0; i < numberOfParts; i++) { var sb = new StringBuilder(); while (sb.Length < eachPartLength && wordsIndex < words.Count()) { sb.AppendFormat("{0} ", words.ElementAt(wordsIndex)); wordsIndex++; } // here you have the part Response.Write("[{0}]"+ sb); parts.Add(sb.ToString()); var allwords = parts.SelectMany(p => p.Split(' ').Distinct()); var wordsInAllParts = allwords.Where(w => parts.All(p => p.Contains(w))).Distinct();

Read the article

Hierarchy of meaning

- by asldkncvas

I am looking for a method to build a hierarchy of words. Background: I am a "amateur" natural language processing enthusiast and right now one of the problems that I am interested in is determining the hierarchy of word semantics from a group of words. For example, if I have the set which contains a "super" representation of others, i.e. [cat, dog, monkey, animal, bird, ... ] I am interested to use any technique which would allow me to extract the word 'animal' which has the most meaningful and accurate representation of the other words inside this set. Note: they are NOT the same in meaning. cat != dog != monkey != animal BUT cat is a subset of animal and dog is a subset of animal. I know by now a lot of you will be telling me to use wordnet. Well, I will try to but I am actually interested in doing a very domain specific area which WordNet doesn't apply because: 1) Most words are not found in Wordnet 2) All the words are in another language; translation is possible but is to limited effect. another example would be: [ noise reduction, focal length, flash, functionality, .. ] so functionality includes everything in this set. I have also tried crawling wikipedia pages and applying some techniques on td-idf etc but wikipedia pages doesn't really do much either. Can someone possibly enlighten me as to what direction my research should go towards? (I could use anything)

Read the article

Python: cleaner list comprehension

- by Jasie

Is there a cleaner way to write this: for w in [w for w in words if w != '']: I want to loop over a dictionary words, but only words that aren't whitespace. Thanks!

Read the article

Hbase schema design -- to make sorting easy?

- by chen

I have 1M words in my dictionary. Whenever a user issue a query on my website, I will see if the query contains the words in my dictionary and increment the counter corresponding to them individually. Here is the example, say if a user type in "Obama is a president" and "Obama" and "president" are in my dictionary, then I should increment the counter by 1 for "Obama" and "president". And from time to time, I want to see the top 100 words (most queried words). If I use Hbase to store the counter, what schema should I use? -- I have not come up an efficient one yet. If I use word in my dictionary as row key, and "counter" as column key, then updating counter(increment) is very efficient. But it's very hard to sort and return the top 100. Anyone can give a good advice? Thanks.

Read the article

How to link a table to a field a in MySQL server

- by Nek

I have this data from a xml file: <?xml version="1.0" encoding="utf-8" ?> <words> <id>...</id> <word>...</word> <meaning>...</meaning> <translation> <ES>...</ES> <PT>...</PT> </translation> </words> This forms the table named "words", which has four fields ("id","word","meaning" and "translation"). On the other hand, the "translation" field can hold several languages like ES,PT,EN,JA,KO,etc... So I create a table ("words.translation", one field is "id" and the others ones are languages ids like "ES","PT",...). I'm sorry for this newby question, but I'd like to know a couple of things about this one-to-many relationship. How to join (or link?) this two tables in MySQL? What information does the "translation" field in the "words" table has to store? How is the sql query to get all the word information (JOIN syntax used?) Thanks for your patience.

Read the article

Algorithm for sentence analysis and tokenization

- by Andrea Nagar

I need to analyze a document and compile statistics as to how many times each a sequence of words is used (so the analysis is not on single words but of batch of recurring words). I read that compression algorithms do something similar to what I want - creating dictionaries of blocks of text with a piece of information reporting its frequency. It should be something similar to http://www.codeproject.com/KB/recipes/Patterns.aspx Do you have anything written in C#?

Read the article

Mysql Performance Question - Essentially about normalizing efficiency

- by freqmode

Hi there. Just a quick question about database performance. I'll outline my site purpose below as background. I'm creating a dictionary site that saves the words users define to a database. What I'm wondering is whether or not to create a words table for each user or to keep one massive words table. This site will be used for entire schools so the single words table would be massive! The database structure is as follows: A user table with: User_ID PRIMARY KEY Username First Last Password Email Country Research Standings SendInfo Donated JoinedOn LastLogin Logins Correct Attempts Admin Active And one word table with: User_ID PRIMARY KEY Word Vocab Spell Defined DefinedAttempted Spelled SpelledAttempted Sentenced SentencedAttempted So what I'm asking is , performance-wise, should I create a new table for each user when they join the site - each user could have hundreds or thousands of words over time? Or is it better to have one massive table with thousands and thousands of records and filter by User_ID. I don't think I'll perform many table joins. My gut feeling is to create a new table for each user, but I thought I'd ask for expert advice! Thanks in advance.

Read the article

Python unicode search not giving correct answer

- by user1318912

I am trying to search hindi words contained one line per file in file-1 and find them in lines in file-2. I have to print the line numbers with the number of words found. This is the code: import codecs hypernyms = codecs.open("hindi_hypernym.txt", "r", "utf-8").readlines() words = codecs.open("hypernyms_en2hi.txt", "r", "utf-8").readlines() count_arr = [] for counter, line in enumerate(hypernyms): count_arr.append(0) for word in words: if line.find(word) >=0: count_arr[counter] +=1 for iterator, count in enumerate(count_arr): if count>0: print iterator, ' ', count This is finding some words, but ignoring some others The input files are: File-1: ???? ??????? File-2: ???????, ????-???? ?????-???, ?????-???, ?????_???, ?????_??? ????_????, ????-????, ???????_???? ????-???? This gives output: 0 1 3 1 Clearly, it is ignoring ??????? and searching for ???? only. I have tried with other inputs as well. It only searches for one word. Any idea how to correct this?

Search Results

Search found 6107 results on 245 pages for 'reserved words'.

Page 23/245 | < Previous Page | 19 20 21 22 23 24 25 26 27 28 29 30 | Next Page >

- by Ashish

- by teddy

- by Prasad

- by luckytaxi

- by user343550

- by luckytaxi

- by Greg

- by Ranjana

- by AjiPorter

- by ejrichards

- by chrism

- by l0b0

- by gAMBOOKa

- by Craig Myles

- by artyom

- by MIW

- by Chris Betti

- by ryder1211212

- by asldkncvas

- by Jasie

- by chen

- by Nek

- by Andrea Nagar

- by freqmode

- by user1318912

< Previous Page | 19 20 21 22 23 24 25 26 27 28 29 30 | Next Page >