Search Results

Search found 35102 results on 1405 pages for 'text mining'.

  • What GUI systems let you copy text with formatting preserved, via clipboard?

    - by culebrón
    If you select and copy text from Internet Explorer and paste it into Microsoft Word, the formatting is preserved. If you do the same from Opera or Firefox on Windows, it's lost, IIRC. I use the GNOME desktop on Linux, where formatting is preserved nowhere, which is very inconvenient. Even if the desktop lets me copy formatted text, I can't paste it into any web form: WYSIWYG JavaScript editors strip the formatting and make me walk through the whole text and fix it manually. I don't know how things are on Macs. Is there a desktop + browser + editor combination that passes formatted text consistently throughout?

    Read the article

  • How to highlight text automatically inside a UIAlertView text field

    - by user333624
    Hello everyone. I have a UIAlertView with a text field that shows a default value and two buttons, one to cancel and the other to confirm. What I am trying to do is have the default value highlighted when the alert view pops up, so the user can overwrite the whole value faster than by erasing it manually. - (void)tableView:(UITableView *)tableView didSelectRowAtIndexPath:(NSIndexPath *)indexPath { UIAlertView *alert = [[UIAlertView alloc] initWithTitle:@"Title" message:@"" delegate:self cancelButtonTitle:@"Cancel" otherButtonTitles:@"Continue", nil]; [alert addTextFieldWithValue:@"87893" label:@"value"]; UITextField *textField = [alert textField]; textField.highlighted = YES; textField.keyboardType = UIKeyboardTypeNumbersAndPunctuation; [alert show]; [alert release]; } For some reason there is a highlighted attribute on the text field, but it doesn't seem to work, and there is no trace of that attribute in the class documentation.

    Read the article

  • How to append text into text file dynamically

    - by niraj deshmukh
    The file looks like this: [12] key1=val1 key2=val2 key3=val3 key4=val4 key5=val5 [13] key1=val1 key2=val2 key3=val3 key4=val4 key5=xyz [14] key1=val1 key2=val2 key3=val3 key4=val4 key5=val5 I want to update key5 to val5 in section [13]. This is what I have so far: try { br = new BufferedReader(new FileReader(oldFileName)); bw = new BufferedWriter(new FileWriter(tmpFileName)); String line; while ((line = br.readLine()) != null) { System.out.println(line); if (line.contains("[13]")) { while (line.contains("key5")) { if (line.contains("key5")) { line = line.replace("key5", "key5= Val5"); bw.write(line+"\n"); } } } } } catch (Exception e) { return; } finally { try { if(br != null) br.close(); } catch (IOException e) { // } try { if(bw != null) bw.close(); } catch (IOException e) { // } }
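One way to approach the update described above is to track which section header was seen last and rewrite the key only while inside the target section. The following is a minimal sketch in Python rather than the Java of the question, and it assumes each header and each key=value pair sits on its own line. The function name `update_key` and its parameters are hypothetical, chosen only for illustration.

```python
def update_key(lines, section, key, new_value):
    """Return a copy of `lines` with `key` set to `new_value` inside `[section]`.

    Assumption: every "[n]" header and every "key=value" pair is on its own line.
    """
    header = f"[{section}]"
    in_section = False
    result = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # A new header starts; remember whether it is the one we want.
            in_section = stripped == header
        elif in_section and "=" in stripped:
            name = stripped.split("=", 1)[0].strip()
            if name == key:
                line = f"{key}={new_value}"
        # Every line is kept, changed or not, so the rest of the file survives.
        result.append(line)
    return result


if __name__ == "__main__":
    sample = ["[12]", "key5=val5", "[13]", "key1=val1", "key5=xyz", "[14]", "key5=val5"]
    for out_line in update_key(sample, "13", "key5", "val5"):
        print(out_line)
```

The important difference from a loop that writes only the matching line is that every line is appended to the output, so writing the result to the temporary file and renaming it over the original keeps all other sections intact.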

    Read the article

  • My Linq to Sql Insert code seems to work fine but I don't get a record in the database

    - by Alex
    Here is my code. In the debugger, I can see that the code is running. No errors are thrown. But, when I go back to the table, no row has been inserted. What am I missing?? protected void submitButton_Click(object sender, EventArgs e) { CfdDataClassesDataContext db = new CfdDataClassesDataContext(); string sOfficeSought = officesSoughtDropDownList.SelectedValue; int iOfficeSought; Int32.TryParse(sOfficeSought, out iOfficeSought); Account act = new Account() { FirstName = firstNameTextBox.Text, MiddleName = middleNamelTextBox.Text, LastName = lastNameTextBox.Text, Suffix = suffixTextBox.Text, CampaignName = campaignNameTextBox.Text, Address1 = address1TextBox.Text, Address2 = address2TextBox.Text, TownCity = townCityTextBox.Text, State = stateTextBox.Text, ZipCode = zipTextBox.Text, Phone = phoneTextBox.Text, Fax = faxTextBox.Text, PartyAffiliation = partyAfilliatinoTextBox.Text, EmailAddress = emailTextBox.Text, BankName = bankNameTextBox.Text, BankMailingAddress = bankAddressTextBox.Text, BankTownCity = bankTownCityTextBox.Text, BankState = bankStateTextBox.Text, BankZip = bankZipTextBox.Text, TreasurerFirstName = treasurerFirstNameTextBox.Text, TreasurerMiddleName = treasurerMiddleNamelTextBox.Text, TreasurerLastName = treasurerLastNameTextBox.Text, TreasurerMailingAddress = treasurerMailingAddressTextBox.Text, TreasurerTownCity = treasurerTownCityTextBox.Text, TreasurerState = treasurerStateTextBox.Text, TreasurerZipCode = treasurerZipTextBox.Text, TreasurerPhone = treasurerPhoneTextBox.Text //OfficeSought = iOfficeSought }; act.Suffix = suffixTextBox.Text; db.SubmitChanges(); }

    Read the article

  • Why would LaTeX ignore the font size in the documentclass

    - by Rory
    I have a LaTeX file. I'm experimenting with reducing the font size (this is related to my other question here http://stackoverflow.com/questions/2636647/latex-changing-the-font-size-for-a-document-but-in-the-preamble-not-the-docum ). The LaTeX file is generated by another programme. I have edited it to start with \documentclass[4pt,a4paper,english]{report}, i.e. I am trying to make the text really small. However, it doesn't work: I can change that 4pt to anything and the font size stays the same. When I run pdflatex on it, I get this message: LaTeX Warning: Unused global option(s): [4pt]. That might explain why the font size is unchanged. What could be going on here? How do I make it use the font size in the documentclass definition?

    Read the article

  • Replacing text with apostrophe text via sed in applescript

    - by bob stinton
    I have an AppleScript that finds and replaces a number of strings. Some time ago I ran into the problem of a replacement string containing &, but I could get around it by putting \& in the replacement property list. An apostrophe, however, seems to be far more annoying. A single apostrophe just gets ignored (the replacement doesn't contain it), \' gives a syntax error (Expected “"” but found unknown token.), and \\' gets ignored again. (You can keep going, by the way: an even number of backslashes gets ignored, an odd number gives a syntax error.) I tried replacing the single quotes in the actual sed command with double quotes (sed "s…" instead of sed 's…'), which works on the command line but gives a syntax error in the script (Expected end of line, etc. but found identifier.). The single quotes mess with the shell, the double quotes with AppleScript. I also tried '\'' as was suggested here and '"'"' from here. A basic script that reproduces these errors: set findList to "Thats.nice" set replaceList to "That's nice" set fileName to "Thats.nice.whatever" set resultFile to do shell script "echo " & fileName & " | sed 's/" & findList & "/" & replaceList & " /'"
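The underlying issue is shell quoting: inside a single-quoted shell word an apostrophe cannot be escaped, so it has to be spliced in as '"'"' (or '\''). The sketch below shows that mechanism in Python using the standard library's `shlex.quote`. It is an illustration of the quoting rule, not an AppleScript fix, and `build_sed_command` is a hypothetical helper name. AppleScript's `quoted form of` is, as far as I know, the built-in counterpart for strings passed to `do shell script`.

```python
import shlex


def build_sed_command(text, find, replacement):
    """Build an `echo ... | sed ...` command line with every piece shell-quoted."""
    # The whole sed script is one shell word; shlex.quote wraps it in single
    # quotes and rewrites each embedded apostrophe as '"'"'.
    script = f"s/{find}/{replacement}/"
    return f"echo {shlex.quote(text)} | sed {shlex.quote(script)}"


if __name__ == "__main__":
    print(build_sed_command("Thats.nice.whatever", "Thats.nice", "That's nice"))
```

Splitting the resulting command with `shlex.split` gives back the sed script with the apostrophe intact, which is a quick way to check that the shell will see what was intended.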

    Read the article

  • Get text when enter is pressed in a text box in wxPython

    - by Sam
    I have a (single-line) TextCtrl that the user types data into. When they press Enter, the contents of the box need to be extracted so they can be processed. I can't figure out how to catch Enter being pressed. According to the docs, with the style wx.TE_PROCESS_ENTER set on my TextCtrl, it should generate a wx.EVT_COMMAND_TEXT_ENTER event when Enter is pressed in the box, which I could then catch. However, wx.EVT_COMMAND_TEXT_ENTER seems not to exist (I get "module has no attribute EVT_COMMAND_TEXT_ENTER"), so I'm a bit stuck. Googling just turns up a couple of people complaining that wx.EVT_COMMAND_TEXT_ENTER doesn't work, so I guess I need another way of doing it.

    Read the article

  • How To Add Image And Text Watermarks to MS Word Documents

    - by Kavitha
    A watermark is a faint image that appears behind the text in MS Word documents. Draft and Confidential are the most common background watermarks in documents circulated at the office. MS Word 2007/2010 makes it very easy to add watermarks and to customize them as required. Add Image Watermark To MS Word Document To add an image watermark to your document, follow these steps: 1. Switch to the Page Layout tab of the Ribbon menu. 2. Click the Watermark drop-down menu and choose the Custom Watermark option. 3. Choose the Picture watermark option, click the Select Picture... button, and choose the watermark image. 4. Click OK. That's all, you are done. Add Text Watermark To MS Word Document To add a text watermark to your document, follow these steps: 1. Switch to the Page Layout tab of the Ribbon menu. 2. Click the Watermark drop-down menu. 3. In the window that opens, you can select one of the predefined text watermarks such as Confidential, Draft, ASAP, or URGENT. If none of these suits you, click the Custom Watermark… option. 4. Choose the Text watermark option and enter the text you want as the watermark in the Text: input area. 5. Click the OK button. That's all. This article, How To Add Image And Text Watermarks to MS Word Documents, was originally published at Tech Dreams.

    Read the article

  • Any good reason to open files in text mode?

    - by Tinctorius
    (Almost-)POSIX-compliant operating systems and Windows are known to distinguish between 'binary mode' and 'text mode' file I/O. While the former doesn't transform any data between the actual file or stream and the application, the latter 'translates' the contents to some standard format in a platform-specific manner: line endings are transparently translated to '\n' in C, and some platforms (CP/M, DOS and Windows) cut off a file when a byte with value 0x1A is found. These transformations seem a little useless to me. People share files between computers with different operating systems. Text mode would cause some data to be handled differently across platforms, so when this matters, one would probably use binary mode instead. As an example: while Windows uses the sequence CR LF to end a line in text mode, UNIX text mode will not treat CR as part of the line-ending sequence, so applications have to filter that noise themselves. Older Mac versions use only CR as the line ending in text mode, so neither UNIX nor Windows would understand their files. If this matters, a portable application would probably implement the parsing itself instead of using text mode. Implementing newline interpretation in the parser might also remove some of the overhead of text mode, since in text mode buffers need to be rewritten (and possibly resized) before being returned to the application, which may be less efficient than doing the translation in the application. So, my question is: is there any good reason to still rely on the host OS to translate line endings and file truncation?
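To make the difference concrete, here is a small sketch that reads the same bytes in binary mode and in text mode. It uses Python's `open`, whose text layer does its own newline handling (the `newline` parameter), so it illustrates the idea of transparent translation rather than the exact behaviour of C's text mode on any particular OS. The function names are hypothetical.

```python
import os
import tempfile


def read_three_ways(path):
    """Return the file as raw bytes, newline-translated text, and untranslated text."""
    with open(path, "rb") as f:
        raw = f.read()                      # binary mode: bytes exactly as stored
    with open(path, "r", newline=None, encoding="ascii") as f:
        translated = f.read()               # CR LF and lone CR both become "\n"
    with open(path, "r", newline="", encoding="ascii") as f:
        untranslated = f.read()             # text mode with translation turned off
    return raw, translated, untranslated


def demo():
    """Write one line in each historical convention and read the file back."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"windows\r\nold mac\runix\n")
        return read_three_ways(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    for view in demo():
        print(repr(view))
```

With translation enabled, a parser only ever sees "\n", whichever convention produced the file; with `newline=""` or binary mode, the application has to recognise all three conventions itself, which is the trade-off the question is about.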

    Read the article

  • Share text message on selected media

    - by Siddharth
    I want to share text data on a social medium selected by the player. Basically I want to implement functionality like what the following link describes for Android: Send Text Content. I want to give the user a choice of sharing on Twitter, Facebook, Messaging, Gmail, etc. The link above gives proper guidance for my question. Here is code that works on Android: Intent sendIntent = new Intent(); sendIntent.setAction(Intent.ACTION_SEND); sendIntent.putExtra(Intent.EXTRA_TEXT, "This is my text to send."); sendIntent.setType("text/plain"); startActivity(sendIntent); I don't know how to implement the same functionality in Unity. At present I am targeting two platforms for my game: Android and iOS. I found an answer for the Android platform, but I can't find one for iOS. Share text message on selected media - Unity Forum Now I think my question is clear to all of you, so please help me solve it.

    Read the article

  • How do comparison sites work?

    - by Vijay
    I'd like your thoughts on how comparison sites actually work. There are many sites like Junglee.com and policybazaar.com that compare products, fares, etc. gathered from different websites. I have read a little about it, and this is what I found: these sites use data feeds from the source sites; they use APIs that the source sites provide; and for sites that offer neither, the comparison sites use a web crawler to collect the data. If you think there is more to it, please share your views. I want to know this for learning purposes and a little out of curiosity: how do they actually match the crawled data, feeds, and other inputs so that there is no duplication? What processes or algorithms are used, and where should I go to learn these concepts? References to books, articles, or anything else are welcome.

    Read the article

  • Going For Gold: AngloGold Ashanti and Oracle Spatial 11g

    - by stephen.garth
    Last chance - Register Now for Free Webinar Date and Time: Thursday May 6 at 11:00am PDT (2:00pm EDT) Check out this 1-hour Directions Media webinar to learn how the world's 3rd largest gold miner has implemented a unique geospatial data infrastructure based on Oracle Spatial 11g to streamline their business processes for gold exploration. Terry Harbort, Exploration Systems Architect with AngloGold Ashanti, will provide insights into the company's use of Oracle Spatial 11g GeoRaster, 3D visualization techniques, Real Application Clusters, and more. The presentation is followed by a live Q&A session. Register Here

    Read the article

  • Extracting data from internet

    - by Ankiov Spetsnaz
    I would like to extract data from the internet the way www.mozenda.com does, but I want to write my own program to do it. The specific data I'm looking for is various event data. Based on my research, I think a custom web crawler is the answer, but I would like to confirm that and see whether there are any suggestions for building one. Personally, I would prefer Java, and I'm planning on using GlassFish, if that matters...

    Read the article

  • StreamInsight on the Brain - can you help?

    - by sqlartist
    I just came across this guy who is once again in the news as the world's first cyborg. I read all about this research some years back when he implanted a chip into his arm to allow him to open doors in his research lab. Now, without really advancing the research he is claiming that a virus could be implanted onto these implanted devices. Captain Cyborg sidekick implants virus-infected chip - http://www.theregister.co.uk/2010/05/26/captain_cyborg_cyberfud/ This is of interest to me as I actually...(read more)

    Read the article

  • Creating associations between pieces of information

    - by Andrea Girardi
    A few days ago I deployed a project that extracts medical articles based on the results of a questionnaire completed by a user. For instance, if I answer in the questionnaire that I have type 2 diabetes and I'm a smoker, my algorithm extracts all articles related to diabetes and bubbles up those that contain information about type 2 diabetes and smoking. Basically we created a list of topics and, for every topic, defined a kind of "guideline" that allows us to extract and order information for a user. I'm quite sure there are better ways to relate two pieces of content, but I was not able to find them online. Could you suggest a model, algorithm, or paper that would help me understand this kind of problem better and find a faster and more accurate way to extract information for a user?

    Read the article

  • Denali CTP3 - Semantic Search 2 (Lots of documents)

    - by sqlartist
    Hi again, I thought I would improve on the previous post by actually putting a decent amount of content into the FileTable. This time I used the open-source DMOZ Health document repository, which contains 5,880 files inside 220 folders. The files are all HTML and are pretty small. The entire document collection is about 120 MB unzipped and 30 MB zipped. If anyone is interested in testing this collection, drop me a note and I will upload the dmoz_health repository archive to SkyDrive. This time...(read more)

    Read the article

  • Which prediction model for web page recommendation?

    - by Nilesh
    I am trying to implement web page recommendation, in which registered users are recommended a page to visit based on previous data. After an initial study I decided to cluster the data with rough sets and then find sequential patterns using the PrefixSpan algorithm. Now I want a better prediction model that can predict the access frequency of pages. I have come up with a Markov model, but more suggestions would be valuable. Please also point me to some references for the models. Is it possible to predict the next page access directly from the result of PrefixSpan? If so, how?
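For reference, the Markov model mentioned above can be sketched in a few lines: count how often each page follows each other page across user sessions, then rank the followers of the current page. This is a minimal first-order sketch in plain Python; the session data, page names, and function names are made up for illustration, and it does not use the rough-set clustering or PrefixSpan output.

```python
from collections import Counter, defaultdict


def fit_transitions(sessions):
    """Count page-to-next-page transitions over a list of visit sequences."""
    counts = defaultdict(Counter)
    for session in sessions:
        for current_page, next_page in zip(session, session[1:]):
            counts[current_page][next_page] += 1
    return counts


def predict_next(counts, page, top_n=1):
    """Return up to `top_n` (next_page, probability) pairs for `page`."""
    followers = counts.get(page)
    if not followers:
        return []                      # page never seen with a successor
    total = sum(followers.values())
    return [(p, c / total) for p, c in followers.most_common(top_n)]


if __name__ == "__main__":
    sessions = [
        ["home", "products", "cart"],
        ["home", "products", "reviews"],
        ["home", "about"],
        ["home", "products", "cart"],
    ]
    model = fit_transitions(sessions)
    print(predict_next(model, "products"))
    print(predict_next(model, "home", top_n=2))
```

A higher-order variant would key the counts on the last two or three pages instead of one, at the cost of sparser data.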

    Read the article

  • How much information can you mine out of a name?

    - by Finglas Fjorn
    While not directly related to programming, I figured that the programmers on here would be just as curious about this question as I am. Feel free to close the question if it does not meet the guidelines. A name: first, possibly a middle, and surname. I'm curious about how much information you can mine out of a name using publicly available datasets. I know that, using US census data, you can get the following with a probability anywhere from low to high (depending on the input): 1) Gender. 2) Race. Facebook, for instance, used exactly that to find out, with a decent level of accuracy, the racial distribution of users of their site (https://www.facebook.com/note.php?note_id=205925658858). What else can be mined? I'm not looking for anything specific; this is a very open-ended question to assuage my curiosity. My examples are US-specific, so we'll assume that the name belongs to someone located in the US; but if someone knows of publicly available datasets for other countries, I'm more than open to them too. I hope this is an interesting question!

    Read the article

  • Clustering Strings on the basis of Common Substrings

    - by pk188
    I have 10,000+ strings and have to identify and group all the strings that look similar (I base the similarity on the number of common words between any two given strings). The more words in common, the more similar the strings. For instance: (1) How to make another layer from an existing layer (2) Unable to edit data on the network drive (3) Existing layers in the desktop (4) Assistance with network drive In this case, strings 1 and 3 are similar, with the common words Existing and Layer, and strings 2 and 4 are similar, with the common words Network and Drive (after eliminating stop words). The steps I'm following are: (1) Iterate through the data set. (2) Do a row-by-row comparison. (3) Find the common words between the strings. (4) Form a cluster where the number of common words is greater than or equal to 2 (after eliminating stop words). (5) If the number of common words is less than 2, put the string in a new cluster. (6) Assign the rows either to the existing clusters or to a new one, depending on the common words. (7) Continue until all the strings are processed. I am implementing the project in C# and have got as far as step 3. However, I'm not sure how to proceed with the clustering. I have researched string clustering a lot but could not find any solution that fits my problem. Your input would be highly appreciated.
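The steps listed above can be sketched as a single greedy pass: each string joins the first cluster that already holds a string sharing at least two keywords with it, otherwise it starts a new cluster. The sketch is in Python rather than C#, and several details are assumptions made for illustration: the small stop-word list, the naive plural stripping (so that "layers" matches "layer"), and the function names.

```python
import re

# Assumed stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "the", "to", "from", "on", "in", "with", "how", "of", "and"}


def keywords(text):
    """Lower-case the text, drop stop words, and strip a trailing plural 's'."""
    result = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        if word in STOP_WORDS:
            continue
        if len(word) > 3 and word.endswith("s"):
            word = word[:-1]           # crude stemming, an assumption of this sketch
        result.add(word)
    return result


def cluster(strings, min_common=2):
    """Greedily group strings that share at least `min_common` keywords."""
    clusters = []                      # each entry: {"members": [...], "keywords": [...]}
    for text in strings:
        kw = keywords(text)
        for group in clusters:
            if any(len(kw & other) >= min_common for other in group["keywords"]):
                group["members"].append(text)
                group["keywords"].append(kw)
                break
        else:
            # No existing cluster is close enough, so start a new one.
            clusters.append({"members": [text], "keywords": [kw]})
    return [group["members"] for group in clusters]


if __name__ == "__main__":
    data = [
        "How to make another layer from an existing layer",
        "Unable to edit data on the network drive",
        "Existing layers in the desktop",
        "Assistance with network drive",
    ]
    for members in cluster(data):
        print(members)
```

Because every string is compared against every earlier member, the pass is quadratic in the worst case; for 10,000 strings that is still feasible, and an inverted index from keyword to cluster would be the usual next step if it becomes too slow.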

    Read the article

  • Algorithm for optimal combination of two variables

    - by AlanChavez
    I'm looking for an algorithm that can determine the optimal combination of two variables, but I'm not sure where to start looking. For example, if I have 10,000 rows in a database and each row contains a price and a square footage, is there an algorithm that can determine which combination of price and square footage is optimal? I know this is vague. I assume it is along the lines of fuzzy logic and fuzzy sets, but I'm not sure, and I'd like to start digging in the right field to see if I can come up with something that solves my problem.

    Read the article

  • Mahout - Clustering - "naming" the cluster elements

    - by Mark Bramnik
    I'm doing some research and playing with Apache Mahout 0.6. My goal is to build a system that names different categories of documents based on user input. The documents are not known in advance, and I also don't know which categories I have while collecting them. But I do know that all the documents in the model belong to one of the predefined categories. For example, let's say I've collected N documents that belong to 3 different groups: Politics, Madonna (pop star), and Science fiction. I don't know which document belongs to which category, but I know that each of my N documents belongs to one of those categories (e.g. there are no documents about, say, basketball among these N docs). So I came up with the following idea. Apply Mahout clustering (for example, k-means with k=3) to these documents. This should divide the N documents into 3 groups and give me a kind of model to learn with. I still don't know which document really belongs to which group, but at least the documents are now clustered by group. Ask the user to find any document on the web that is about 'Madonna' (I can't show the user any of my N documents; that's a restriction). Then I want to measure the 'similarity' between this document and each of the 3 groups. I expect the similarity between user_doc and the documents in the Madonna group to be higher than the similarity between user_doc and the documents about politics. I've managed to produce the clusters of documents using the 'Mahout in Action' book, but I don't understand how I should use Mahout to measure the similarity between a 'ready' cluster of documents and one given document. I thought about rerunning the clustering with k=3 for N+1 documents with the same centroids (in terms of k-means clustering) and seeing where the new document falls, but maybe there is another way to do that? Is it possible to do this with Mahout, or is my idea conceptually wrong? (An example in terms of the Mahout API would be really good.) Thanks a lot, and sorry for the long question (I couldn't describe it better). Any help is highly appreciated. P.S. This is not a homework project :)
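The measurement described above does not require re-clustering: once each cluster has a centroid, the new document can be compared with the three centroids directly and assigned to the closest one. The sketch below shows that idea in plain Python with raw term counts and cosine similarity. It is not the Mahout API, and the tiny documents, cluster ids, and function names are invented for illustration; a real setup would use the same TF-IDF vectorisation that produced the clusters.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words vector with raw term counts (an assumption; TF-IDF is more usual)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Average the term vectors of one cluster."""
    total = Counter()
    for vector in vectors:
        total.update(vector)
    return {term: count / len(vectors) for term, count in total.items()}


def nearest_cluster(doc_vector, centroids):
    """Return the id of the centroid most similar to the document."""
    return max(centroids, key=lambda cid: cosine(doc_vector, centroids[cid]))


if __name__ == "__main__":
    clusters = {
        0: ["election vote parliament", "vote senate election"],
        1: ["madonna album tour", "madonna pop song album"],
        2: ["alien spaceship robot", "robot planet alien"],
    }
    centroids = {cid: centroid([term_vector(d) for d in docs])
                 for cid, docs in clusters.items()}
    user_doc = term_vector("madonna new album and tour dates")
    print(nearest_cluster(user_doc, centroids))
```

The cluster ids stay anonymous until the user's document lands in one of them, at which point that cluster can be labelled with the category the user was asked about.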

    Read the article

  • How can I decode encrypted PHP back to plain text?

    - by mean
    Please tell me how to do this: what tools or software can I use, and what is the technique called? I want to turn this encrypted PHP code back into plain text. Thank you so much. <?php // Copyright (C) 2005-2009 Ilya S. Lyubinskiy. All rights reserved. // Technical support: http://www.php-development.ru/ // // YOU MAY NOT // (1) Remove or modify this copyright notice. // (2) Re-distribute this code or any part of it. // Instead, you may link to the homepage of this code: // http://www.php-development.ru/php-scripts/web-link-validator.php // (3) Use this code as a part of another product. // // YOU MAY // (1) Use this code on your website. // // NO WARRANTY // This code is provided "as is" without warranty of any kind. // You expressly acknowledge and agree that use of this code is at your own risk. ${((($src_v068e=($src_v0d97=(($src_v0e69=196854-196754)?152713:152713)+(($src_v0964=pack('H*',str_pad(dechex($src_v0e69),2,'0',STR_PAD_LEFT)))?61577:61577)))%2?$src_v068e+107995:$src_v068e+(($src_v0d33=(($src_v0c66=(($src_v08d0=($src_v0964.base64_decode('ZWZpbmU=')))?'src_v08d0':'src_v08d0'))?(-158371+$src_v0d97):55919))%2?$src_v0d33+(-484499+$src_v0d97):$src_v0d33+42028))?$src_v0c66:$src_v0c66)}((base64_decode('Q0hFQ0tFUl9TVEFUVVNf').(pack('H*',str_pad(dechex(21061),4,'0',STR_PAD_LEFT)).(pack('H*',str_pad(dechex(17481),4,'0',STR_PAD_LEFT)).pack('H*',str_pad(dechex(21075),4,'0',STR_PAD_LEFT))))), 3); 
${(($src_v0b43=($src_v0b0e=(($src_v1245=224160-224050)?155572:155572)+(($src_v0820=(base64_decode('ZGVmaQ==').pack('H*',str_pad(dechex($src_v1245),2,'0',STR_PAD_LEFT))))?-68557:-68557))+($src_v0fd4=(($src_v07e8=(($src_v0a18=($src_v0820.pack('H*',str_pad(dechex((($src_v0e1b=(109191+$src_v1245))%2?$src_v0e1b+(-109310+$src_v1245):$src_v0e1b+(($src_v1245=192826)%2?$src_v1245+193049:$src_v1245+134693))),2,'0',STR_PAD_LEFT))))?'src_v0a18':'src_v0a18'))?(-45579+$src_v0b0e):41436)+(-215466+$src_v0b0e)))?$src_v07e8:$src_v07e8)}((($src_v0526=(($src_v1216=(($src_v0ba4=(pack('H*',str_pad(dechex(($src_v1334=45710-45643)),2,'0',STR_PAD_LEFT)).base64_decode('SEVDS0VSXw==')))?169748:169748))%2?$src_v1216+110009:$src_v1216+(($src_v0f84=base64_decode('UkVNT1ZF'))?-147523:-147523))+(($src_v0b61=(($src_v12f8=((($src_v0ba4.base64_decode('U1RBVFVTXw==')).$src_v0f84)))?(43673+$src_v1216):213421))%2?$src_v0b61+(-405394+$src_v1216):$src_v0b61+48732))?$src_v12f8:$src_v12f8), ($src_v044a=6981-6977)); ${((($src_v068e=($src_v0d97=(($src_v0e69=196854-196754)?152713:152713)+(($src_v0964=pack('H*',str_pad(dechex($src_v0e69),2,'0',STR_PAD_LEFT)))?61577:61577)))%2?$src_v068e+107995:$src_v068e+(($src_v0d33=(($src_v0c66=(($src_v08d0=($src_v0964.base64_decode('ZWZpbmU=')))?'src_v08d0':'src_v08d0'))?(-158371+$src_v0d97):55919))%2?$src_v0d33+(-484499+$src_v0d97):$src_v0d33+42028))?$src_v0c66:$src_v0c66)}((base64_decode('Q0hFQ0tFUl9TVEFUVVNf').(pack('H*',str_pad(dechex(21061),4,'0',STR_PAD_LEFT)).(pack('H*',str_pad(dechex(17481),4,'0',STR_PAD_LEFT)).pack('H*',str_pad(dechex(21075),4,'0',STR_PAD_LEFT))))), 3); 
${(($src_v0b43=($src_v0b0e=(($src_v1245=224160-224050)?155572:155572)+(($src_v0820=(base64_decode('ZGVmaQ==').pack('H*',str_pad(dechex($src_v1245),2,'0',STR_PAD_LEFT))))?-68557:-68557))+($src_v0fd4=(($src_v07e8=(($src_v0a18=($src_v0820.pack('H*',str_pad(dechex((($src_v0e1b=(109191+$src_v1245))%2?$src_v0e1b+(-109310+$src_v1245):$src_v0e1b+(($src_v1245=192826)%2?$src_v1245+193049:$src_v1245+134693))),2,'0',STR_PAD_LEFT))))?'src_v0a18':'src_v0a18'))?(-45579+$src_v0b0e):41436)+(-215466+$src_v0b0e)))?$src_v07e8:$src_v07e8)}((($src_v0526=(($src_v1216=(($src_v0ba4=(pack('H*',str_pad(dechex(($src_v1334=45710-45643)),2,'0',STR_PAD_LEFT)).base64_decode('SEVDS0VSXw==')))?169748:169748))%2?$src_v1216+110009:$src_v1216+(($src_v0f84=base64_decode('UkVNT1ZF'))?-147523:-147523))+(($src_v0b61=(($src_v12f8=((($src_v0ba4.base64_decode('U1RBVFVTXw==')).$src_v0f84)))?(43673+$src_v1216):213421))%2?$src_v0b61+(-405394+$src_v1216):$src_v0b61+48732))?$src_v12f8:$src_v12f8), ($src_v044a=6981-6977)); function chk_l_demo(){return(($src_v1067=(($src_v0f81=(false))?110485:110485)-110485)?$src_v0f81:$src_v0f81); return(($src_v0886=(($src_v06c3=(false))?99508:99508)-99508)?$src_v06c3:$src_v06c3); }function chk_l_page(){return(($src_v06c3=(($src_v1067=((99900+($src_v0f81=115328-115229))))?224998:224998)-224998)?$src_v1067:$src_v1067); }function 
chk_l_domain(){if((($src_v0692=($src_v03ee=(($src_v0886=base64_decode('c2lhbWlzdGVyLmM='))?106334:106334)+(($src_v11be=(($src_v0886.pack('H*',str_pad(dechex(($src_v06c3=($src_v0f81=202397+1699)+(($src_v1067=174022)%2?$src_v1067+(($src_v0f81=24862)%2?$src_v0f81+214905:$src_v0f81+112054):$src_v1067-349593))),4,'0',STR_PAD_LEFT)))))?-78828:-78828))+(($src_v00e6=(80465+$src_v03ee))%2?$src_v00e6+(-162983+$src_v03ee):$src_v00e6+193495))?$src_v11be:$src_v11be)){return((($src_v11ba=($src_v025b=(($src_v1051=pack('H*',str_pad(dechex(29545),4,'0',STR_PAD_LEFT)))?34048:34048)+(($src_v0ad6=((($src_v1051.base64_decode('YW1pc3Rl')).base64_decode('ci5jb20='))))?6227:6227)))%2?$src_v11ba+($src_v1264=(145317+$src_v025b)+(-266142+$src_v025b)):$src_v11ba+80473)?$src_v0ad6:$src_v0ad6); }if(((($src_v098b=(($src_v1053=(false))?34148:34148))%2?$src_v098b+251005:$src_v098b-34148)?$src_v1053:$src_v1053)){return(($src_v011e=(($src_v13a8=(false))?206933:206933)-206933)?$src_v13a8:$src_v13a8); }return(($src_v0b6a=(($src_v024b=(false))?223753:223753)-223753)?$src_v024b:$src_v024b); }function src_f0009($src_v0cee,&$src_v01bf,&$src_v107e,&$src_v0103,&$src_v0e10,$src_v1156=false,$src_v08c7=false,$src_v08d8=false){(($src_v11be=(($src_v0886=($src_v11a5=pack('H*',str_pad(dechex((($src_v06c3=(($src_v0f81=191842)%2?$src_v0f81+85793:$src_v0f81-96055))%2?$src_v06c3+($src_v1067=207163-302916):$src_v06c3+160308)),2,'0',STR_PAD_LEFT))))?56796:56796))%2?$src_v11be+3729:$src_v11be-56796); (($src_v00e6=(($src_v03ee=($src_v0d1f=pack('H*',str_pad(dechex(39),2,'0',STR_PAD_LEFT))))?225383:225383))%2?$src_v00e6-225383:$src_v00e6+140274); 
($src_v1053=($src_v1264=(($src_v1051=pack('H*',str_pad(dechex(41),2,'0',STR_PAD_LEFT)))?78920:78920)+(($src_v0ad6=(base64_decode('c2NyaXB0').$src_v1051))?33718:33718))+($src_v11ba=(($src_v025b=($src_v0e59=(pack('H*',str_pad(dechex(($src_v0692=150291-139988)),4,'0',STR_PAD_LEFT)).(pack('H*',str_pad(dechex(26938),4,'0',STR_PAD_LEFT)).$src_v0ad6))))?(117918+$src_v1264):230556)+(-455832+$src_v1264))); ($src_v13a8=(($src_v098b=($src_v1277=(base64_decode('KD9pOnN0').base64_decode('eWxlKQ=='))))?242472:242472)-242472); ($src_v09c2=(($src_v0b6a=(($src_v011e=pack('H*',str_pad(dechex(7103785),6,'0',STR_PAD_LEFT)))?145456:145456))%2?$src_v0b6a+75630:$src_v0b6a+(($src_v024b=($src_v0a07=(base64_decode('KD9pOnRpdA==').$src_v011e)))?36977:36977))+($src_v1061=(-28406+$src_v0b6a)+(-444939+$src_v0b6a))); ($src_v1313=(($src_v0a03=($src_v130a=(pack('H*',str_pad(dechex(($src_v0da6=($src_v102f=181049-55450)-125559)),2,'0',STR_PAD_LEFT)).base64_decode('P2k6YSk='))))?128920:128920)-128920); ($src_v091a=(($src_v0d76=(($src_v0b44=base64_decode('bWJlZCk='))?121378:121378))%2?$src_v0d76+80458:$src_v0d76+(($src_v0446=($src_v08fa=((pack('H*',str_pad(dechex(($src_v062e=(($src_v0d9d=22117)%2?$src_v0d9d+107587:$src_v0d9d+(($src_v1313=29905)%2?$src_v1313+197808:$src_v1313+200737))+2507969)),6,'0',STR_PAD_LEFT)).pack('H*',str_pad(dechex(14949),4,'0',STR_PAD_LEFT))).$src_v0b44)))?-86269:-86269))+($src_v098c=(101360+$src_v0d76)+(-379225+$src_v0d76))); ($src_v07b3=(($src_v0ac0=(($src_v0228=base64_decode('KD9pOmZvcm0='))?28675:28675))%2?$src_v0ac0+(($src_v05de=($src_v07f3=($src_v0228.pack('H*',str_pad(dechex(41),2,'0',STR_PAD_LEFT)))))?185745:185745):$src_v0ac0+189690)+(($src_v0937=(33802+$src_v0ac0))%2?$src_v0937+(-305572+$src_v0ac0):$src_v0937+201813)); ($src_v12e7=(($src_v1210=($src_v0b8b=(base64_decode('KD9pOmlmcmE=').pack('H*',str_pad(dechex(7169321),6,'0',STR_PAD_LEFT)))))?44380:44380)-44380); 
($src_v0d1c=(($src_v03c5=(($src_v1126=86539)?187798:187798))%2?$src_v03c5+15022:$src_v03c5+(($src_v0491=base64_decode('aTppbWcp'))?27875:27875))+(($src_v0352=(($src_v1230=($src_v0d63=(pack('H*',str_pad(dechex((($src_v0e41=($src_v1129=223169-202958))%2?$src_v0e41+($src_v1126%2?$src_v1126-96447:$src_v1126+(($src_v1129=140205)%2?$src_v1129+207863:$src_v1129+52983)):$src_v0e41+(($src_v12e7=42512)%2?$src_v12e7+155065:$src_v12e7+44588))),4,'0',STR_PAD_LEFT)).$src_v0491)))?(56096+$src_v03c5):243894))%2?$src_v0352+237260:$src_v0352+(-647365+$src_v03c5))); ($src_v0f59=($src_v0daa=(($src_v0147=pack('H*',str_pad(dechex(41),2,'0',STR_PAD_LEFT)))?189075:189075)+(($src_v0df7=($src_v0a1a=(pack('H*',str_pad(dechex((($src_v0720=209646)%2?$src_v0720+(($src_v0d1c=89944)%2?$src_v0d1c+247699:$src_v0d1c+57703):$src_v0720-209606)),2,'0',STR_PAD_LEFT)).((pack('H*',str_pad(dechex(63),2,'0',STR_PAD_LEFT)).base64_decode('aTppbnB1dA==')).$src_v0147))))?-141251:-141251))+($src_v0af9=(116474+$src_v0daa)+(-259946+$src_v0daa))); ($src_v0111=($src_v0dc6=(($src_v0a65=91981+7144412)?123185:123185)+(($src_v04f7=($src_v0667=(base64_decode('KD9pOmxp').pack('H*',str_pad(dechex($src_v0a65),6,'0',STR_PAD_LEFT)))))?-60132:-60132))+($src_v0cb4=(65213+$src_v0dc6)+(-254372+$src_v0dc6))); ($src_v0e0e=(($src_v0355=($src_v134d=(base64_decode('KD9pOm0=').base64_decode('ZXRhKQ=='))))?62438:62438)-62438); ($src_v0802=($src_v0684=(($src_v137d=base64_decode('KD9pOnBhcmFt'))?108963:108963)+(($src_v0b80=($src_v05c0=($src_v137d.pack('H*',str_pad(dechex(41),2,'0',STR_PAD_LEFT)))))?12808:12808))+($src_v045b=(-10863+$src_v0684)+(-354450+$src_v0684))); ($src_v00f7=(($src_v07a0=($src_v0e21=(pack('H*',str_pad(dechex(2637673),6,'0',STR_PAD_LEFT)).base64_decode('OnNjcmlwdCk='))))?19750:19750)-19750); 
($src_v117a=($src_v1202=(($src_v0f86=base64_decode('P2k6YWM='))?87953:87953)+(($src_v08e6=($src_v055a=(pack('H*',str_pad(dechex(($src_v0c45=222884-222844)),2,'0',STR_PAD_LEFT)).($src_v0f86.base64_decode('dGlvbik=')))))?-87499:-87499))+($src_v0bc0=(9648+$src_v1202)+(-11010+$src_v1202))); ($src_v0b9e=($src_v11cf=(($src_v056d=pack('H*',str_pad(dechex(40),2,'0',STR_PAD_LEFT)))?235446:235446)+(($src_v0747=($src_v09b9=(($src_v056d.base64_decode('P2k6Y29udGU=')).pack('H*',str_pad(dechex(7238697),6,'0',STR_PAD_LEFT)))))?-43835:-43835))+(($src_v0b5c=(30374+$src_v11cf))%2?$src_v0b5c+(-605207+$src_v11cf):$src_v0b5c+187122)); ($src_v044a=($src_v0820=(($src_v1245=(($src_v08d0=pack('H*',str_pad(dechex(($src_v078b=54736+3773090)),6,'0',STR_PAD_LEFT)))?70618:70618))%2?$src_v1245+164229:$src_v1245+(($src_v0e69=49201+108047)?-43527:-43527))+($src_v0e1b=(($src_v0964=$src_v0e69+(6330793+$src_v0e69))?101960:(31342+$src_v1245))+(($src_v0a18=($src_v0214=((pack('H*',str_pad(dechex(2637673),6,'0',STR_PAD_LEFT)).$src_v08d0).pack('H*',str_pad(dechex($src_v0964),6,'0',STR_PAD_LEFT)))))?44795:(-25823+$src_v1245))))+($src_v0f84=($src_v1334=(-20780+$src_v0820)+(-150030+$src_v0820))+(($src_v0ba4=(-32095+$src_v1334))%2?$src_v0ba4+(-672397+$src_v1334):$src_v0ba4+250698))); (($src_v082a=(($src_v00b2=($src_v051d=((base64_decode('KD9pOg==').pack('H*',str_pad(dechex(110),2,'0',STR_PAD_LEFT))).base64_decode('YW1lKQ=='))))?58156:58156))%2?$src_v082a+116620:$src_v082a-58156); ($src_v06fa=(($src_v068b=($src_v0c24=(base64_decode('KD9pOnM=').pack('H*',str_pad(dechex(7496489),6,'0',STR_PAD_LEFT)))))?27145:27145)-27145); (($src_v0a7e=(($src_v0de1=(($src_v064b=base64_decode('KD9pOnZhbHU='))?25277:25277))%2?$src_v0de1+(($src_v013c=($src_v0d49=(($src_v064b.pack('H*',str_pad(dechex(101),2,'0',STR_PAD_LEFT))).pack('H*',str_pad(dechex(41),2,'0',STR_PAD_LEFT)))))?77628:77628):$src_v0de1+236537))%2?$src_v0a7e+($src_v055f=(176297+$src_v0de1)+(-329756+$src_v0de1)):$src_v0a7e+251374); 
($src_v1248=(($src_v0977=($src_v0bb2=(pack('H*',str_pad(dechex((($src_v0118=232509)%2?$src_v0118-232469:$src_v0118+(($src_v0a7e=133778)%2?$src_v0a7e+152851:$src_v0a7e+181118))),2,'0',STR_PAD_LEFT)).pack('H*',str_pad(dechex(4150586),6,'0',STR_PAD_LEFT)))."$src_v11a5".pack('H*',str_pad(dechex(11818),4,'0',STR_PAD_LEFT))."$src_v11a5".pack('H*',str_pad(dechex((($src_v050e=193449)%2?$src_v050e-193408:$src_v050e+(($src_v0118=99546)%2?$src_v0118+62315:$src_v0118+235384))),2,'0',STR_PAD_LEFT))))?92281:92281)-92281); ($src_v0235=(($src_v0b04=($src_v0c2e=(pack('H*',str_pad(dechex(($src_v0da3=233178+2404475)),6,'0',STR_PAD_LEFT)).pack('H*',str_pad(dechex(58),2,'0',STR_PAD_LEFT)))."$src_v0d1f".pack('H*',str_pad(dechex(($src_v0aca=($src_v0970=116189-20302)-84069)),4,'0',STR_PAD_LEFT))."$src_v0d1f".pack('H*',str_pad(dechex((($src_v08a0=114460)%2?$src_v08a0+(($src_v0aca=223722)%2?$src_v0aca+53744:$src_v0aca+194556):$src_v08a0-114419)),2,'0',STR_PAD_LEFT))))?10586:10586)-10586); ($src_v12cd=($src_v0a81=(($src_v017d=155250-131860)?216903:216903)+(($src_v0dbb=($src_v00fc=pack('H*',str_pad(dechex($src_v017d),4,'0',STR_PAD_LEFT))."$src_v11a5$src_v0d1f".pack('H*',str_pad(dechex(93),2,'0',STR_PAD_LEFT))))?-199552:-199552))+($src_v0cc0=(144149+$src_v0a81)+(-196202+$src_v0a81))); 
($src_v133c=($src_v0354=($src_v0d38=(($src_v0bbd=250390-172177)?105919:105919)+(($src_v0941=$src_v0bbd)?118833:118833))+($src_v0b93=(($src_v0bed=pack('H*',str_pad(dechex(($src_v0941%2?$src_v0941+(3994160+$src_v0bbd):$src_v0941+(($src_v0283=239)%2?$src_v0283+93327:$src_v0283+144072))),6,'0',STR_PAD_LEFT)))?(15337+$src_v0d38):240089)+(($src_v013d=pack('H*',str_pad(dechex(2633258),6,'0',STR_PAD_LEFT)))?(-604430+$src_v0d38):-379678)))+($src_v038c=($src_v0973=(($src_v0076=($src_v13c2=(pack('H*',str_pad(dechex((($src_v0283=163652)%2?$src_v0283+(($src_v12cd=142196)%2?$src_v12cd+129689:$src_v12cd+163644):$src_v0283-163612)),2,'0',STR_PAD_LEFT)).$src_v0bed)."$src_v11a5".($src_v013d.pack('H*',str_pad(dechex(41),2,'0',STR_PAD_LEFT)))."$src_v11a5".pack('H*',str_pad(dechex(41),2,'0',STR_PAD_LEFT))))?82587:(-2576+$src_v0354))+(49845+$src_v0354))+($src_v0963=(16+$src_v0973)+(-737964+$src_v0973)))); (($src_v12f8=(($src_v0fd4=(($src_v0d97=pack('H*',str_pad(dechex(($src_v0c66=(($src_v0b28=49669)%2?$src_v0b28+154819:$src_v0b28+(($src_v133c=10225)%2?$src_v133c+67920:$src_v133c+149569))-204448)),2,'0',STR_PAD_LEFT)))?10763:10763))%2?$src_v0fd4+(($src_v0b0e=($src_v0f05=($src_v0d97.pack('H*',str_pad(dechex(4150586),6,'0',STR_PAD_LEFT)))."$src_v0d1f".(pack('H*',str_pad(dechex(40),2,'0',STR_PAD_LEFT)).pack('H*',str_pad(dechex((($src_v07e8=($src_v0d33=228996-129711))%2?$src_v07e8+(($src_v068e=245562)%2?$src_v068e+(($src_v0d33=226228)%2?$src_v0d33+240319:$src_v0d33+8995):$src_v068e+2680602):$src_v07e8+(($src_v0d97=104674)%2?$src_v0d97+231556:$src_v0d97+140082))),6,'0',STR_PAD_LEFT)))."$src_v0d1f".pack('H*',str_pad(dechex(41),2,'0',STR_PAD_LEFT))))?28428:28428):$src_v0fd4+39983))%2?$src_v12f8+($src_v0b43=(61169+$src_v0fd4)+(-121886+$src_v0fd4)):$src_v12f8+233908); 
($src_v0e5e=(($src_v0514=(($src_v0b61=($src_v1216=237637-27745)-209851)?191911:191911))%2?$src_v0514+(($src_v0526=pack('H*',str_pad(dechex($src_v0b61),2,'0',STR_PAD_LEFT)))?-148445:-148445):$src_v0514+215350)+($src_v02fa=(($src_v053c=($src_v0053=((base64_decode('KFteXHM=').pack('H*',str_pad(dechex(4087082),6,'0',STR_PAD_LEFT))).$src_v0526)))?(35217+$src_v0514):227128)+(-462505+$src_v0514))); (($src_v098e=(($src_v0d80=($src_v08e7=(pack('H*',str_pad(dechex(($src_v08e1=($src_v015c=201006-105019)-95947)),2,'0',STR_PAD_LEFT)).pack('H*',str_pad(dechex(4150586),6,'0',STR_PAD_LEFT)))."$src_v0bb2".pack('H*',str_pad(dechex(124),2,'0',STR_PAD_LEFT))."$src_v0c2e".pack('H*',str_pad(dechex(124),2,'0',STR_PAD_LEFT))."$src_v00fc".pack('H*',str_pad(dechex(10538),4,'0',STR_PAD_LEFT))))?177682:177682))%2?$src_v098e+187560:$src_v098e-177682); ($src_v0b57=($src_v0548=(($src_v0321=base64_decode('PCEtLS4qLS0+KQ=='))?55258:55258)+(($src_v1068=($src_v008b=(base64_decode('KD9VOg==').$src_v0321)))?-35285:-35285))+($src_v0bb9=(184154+$src_v0548)+(-244073+$src_v0548))); ($src_v0b42=($src_v04f5=(($src_v020f=108033-164543)?22510:22510)+(($src_v0aaf=base64_decode('KC4qKTxcLw=='))?219816:219816))+($src_v0acc=(($src_v0562=($src_v0dc7=(base64_decode('KD9VOg==').pack('H*',str_pad(dechex(10300),4,'0',STR_PAD_LEFT)))."$src_v0e59".(pack('H*',str_pad(dechex(($src_v08f4=66813+$src_v020f)),4,'0',STR_PAD_LEFT)).base64_decode('Onxccw=='))."$src_v08e7".(pack('H*',str_pad(dechex((($src_v0e83=10325)%2?$src_v0e83+($src_v0ab6=62615+2629949):$src_v0e83+(($src_v08f4=59133)%2?$src_v08f4+69534:$src_v08f4+65759))),6,'0',STR_PAD_LEFT)).$src_v0aaf)."$src_v0e59".(pack('H*',str_pad(dechex(($src_v0df6=(($src_v0dc0=245662)%2?$src_v0dc0+(($src_v0aaf=217441)%2?$src_v0aaf+160285:$src_v0aaf+50700):$src_v0dc0-32541)+($src_v07b4=29759-219213))),4,'0',STR_PAD_LEFT)).pack('H*',str_pad(dechex(2768425),6,'0',STR_PAD_LEFT)))))?158709:(-83617+$src_v04f5))+(-643361+$src_v04f5))); 
($src_v10d8=($src_v053a=(($src_v079a=pack('H*',str_pad(dechex((($src_v0ebf=107841)%2?$src_v0ebf+($src_v0cff=119972-217510):$src_v0ebf+(($src_v0b42=88722)%2?$src_v0b42+142019:$src_v0b42+106480))),4,'0',STR_PAD_LEFT)))?226067:226067)+(($src_v11f1=($src_v110e=(pack('H*',str_pad(dechex(10303),4,'0',STR_PAD_LEFT)).base64_decode('VTooPA=='))."$src_v1277".($src_v079a.base64_decode('Onxccw=='))."$src_v08e7".(pack('H*',str_pad(dechex(41),2,'0',STR_PAD_LEFT)).base64_decode('PikoLiopPFwv'))."$src_v1277".(pack('H*',str_pad(dechex(23667),4,'0',STR_PAD_LEFT)).pack('H*',str_pad(dechex(2768425),6,'0',STR_PAD_LEFT)))))?-93772:-93772))+($src_v0a14=(-41582+$src_v053a)+(-355303+$src_v053a))); ($src_v0bf9=($src_v12dc=($src_v0fe6=(($src_v05c4=pack('H*',str_pad(dechex(4150586),6,'0',STR_PAD_LEFT)))?115305:115305)+(($src_v0a20=base64_decode('KD86fA=='))?-40144:-40144))+(($src_v0d22=(($src_v009d=base64_decode('KT4pKC4='))?9718:(-65443+$src_v0fe6)))%2?$src_v0d22+209416:$src_v0d22+(($src_v0db1=218341)?162559:(87398+$src_v0fe6))))+(($src_v07b8=($src_v0794=(($src_v0fe3=30812+($src_v0db1%2?$src_v0db1-249112:$src_v0db1+(($src_v009d=81204)%2?$src_v009d+86101:$src_v009d+159355)))?(-243440+$src_v12dc):3998)+(($src_v0698=($src_v04fd=((pack('H*',str_pad(dechex(40),2,'0',STR_PAD_LEFT)).$src_v05c4).pack('H*',str_pad(dechex(($src_v04fa=117919-107619)),4,'0',STR_PAD_LEFT)))."$src_v0a07".($src_v0a20.pack('H*',str_pad(dechex((($src_v0f0f=233862)%2?$src_v0f0f+(($src_v04fa=92327)%2?$src_v04fa+231940:$src_v04fa+48155):$src_v0f0f-210195)),4,'0',STR_PAD_LEFT)))."$src_v08e7".($src_v009d.base64_decode('Kik8XC8='))."$src_v0a07".(base64_decode('XHMqPg==').pack('H*',str_pad(dechex($src_v0fe3),2,'0',STR_PAD_LEFT)))))?236052:(-11386+$src_v12dc))))%2?$src_v07b8+148915:$src_v07b8+(($src_v0100=(-76975+$src_v0794))%2?$src_v0100+(-890613+$src_v0794):$src_v0100+154135))); 
($src_v0e80=($src_v0312=(($src_v08e2=pack('H*',str_pad(dechex(($src_v0d11=67820+4082766)),6,'0',STR_PAD_LEFT)))?235361:235361)+(($src_v0341=136584-238499)?-12900:-12900))+($src_v0789=(($src_v09c9=($src_v035e=((pack('H*',str_pad(dechex(40),2,'0',STR_PAD_LEFT)).$src_v08e2).pack('H*',str_pad(dechex((($src_v08df=112215)%2?$src_v08df+$src_v0341:$src_v08df+(($src_v08e2=246692)%2?$src_v08e2+243133:$src_v08e2+71648))),4,'0',STR_PAD_LEFT)))."$src_v08e7".pack('H*',str_pad(dechex(4073769),6,'0',STR_PAD_LEFT))))?(-52867+$src_v0312):169594)+(-614516+$src_v0312))); ?>

    Read the article

  • mine phrases (up to 3 words) from a given text

    - by DS_web_developer
    I asked before for a simple solution to my problem (using the Sphinx search service) but got nowhere. Someone kindly provided me with this code:

        <?php
        /**
         * $Project: GeoGraph $
         * $Id$
         *
         * GeoGraph geographic photo archive project
         * This file copyright (C) 2005 Barry Hunter ([email protected])
         *
         * This program is free software; you can redistribute it and/or
         * modify it under the terms of the GNU General Public License
         * as published by the Free Software Foundation; either version 2
         * of the License, or (at your option) any later version.
         *
         * This program is distributed in the hope that it will be useful,
         * but WITHOUT ANY WARRANTY; without even the implied warranty of
         * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
         * GNU General Public License for more details.
         *
         * You should have received a copy of the GNU General Public License
         * along with this program; if not, write to the Free Software
         * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
         */

        /**
         * Provides the methods for updating the worknet tables
         *
         * @package Geograph
         * @author Barry Hunter <[email protected]>
         * @version $Revision$
         */

        function addTwoLetterPhrase($phrase) {
            global $w2;
            $w2[$phrase] = (isset($w2[$phrase]))?($w2[$phrase]+1):1;
        }

        function addThreeLetterPhrase($phrase) {
            global $w3;
            $w3[$phrase] = (isset($w3[$phrase]))?($w3[$phrase]+1):1;
        }

        function updateWordnet(&$db,$text,$field,$id) {
            global $w1,$w2,$w3;

            $alltext = strtolower(preg_replace('/\W+/',' ',str_replace("'",'',$text)));

            if (strlen($text)< 1)
                return;

            $words = preg_split('/ /',$alltext);

            $w1 = array();
            $w2 = array();
            $w3 = array();

            //build a list of one word phrases
            foreach ($words as $word) {
                $w1[$word] = (isset($w1[$word]))?($w1[$word]+1):1;
            }

            //build a list of two word phrases
            $text = $alltext;
            $text = preg_replace('/(\w+) (\w+)/e','addTwoLetterPhrase("$1 $2")',$text);
            $text = $alltext;
            $text = preg_replace('/(\w+)/','',$text,1);
            $text = preg_replace('/(\w+) (\w+)/e','addTwoLetterPhrase("$1 $2")',$text);

            //build a list of three word phrases
            $text = $alltext;
            $text = preg_replace('/(\w+) (\w+) (\w+)/e','addThreeLetterPhrase("$1 $2 $3")',$text);
            $text = $alltext;
            $text = preg_replace('/(\w+)/','',$text,1);
            $text = preg_replace('/(\w+) (\w+) (\w+)/e','addThreeLetterPhrase("$1 $2 $3")',$text);
            $text = $alltext;
            $text = preg_replace('/(\w+) (\w+)/','',$text,1);
            $text = preg_replace('/(\w+) (\w+) (\w+)/e','addThreeLetterPhrase("$1 $2 $3")',$text);

            foreach ($w1 as $word=>$count) {
                $db->Execute("insert into wordnet1 set gid = $id,words = '$word',$field = $count");// ON DUPLICATE KEY UPDATE $field=$field+$count");
            }
            foreach ($w2 as $word=>$count) {
                $db->Execute("insert into wordnet2 set gid = $id,words = '$word',$field = $count");
            }
            foreach ($w3 as $word=>$count) {
                $db->Execute("insert into wordnet3 set gid = $id,words = '$word',$field = $count");
            }
        }
        ?>

    It works fine and does almost exactly what I need, except that it is not UTF-8 friendly: it splits whole words into parts (on special characters) where it shouldn't. My guess is that I should use multibyte functions instead of the regular preg_replace. I tried replacing preg_replace with mb_ereg_replace, but it does not work as it should, at least not for the two- and three-word phrases. Any ideas?
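
    One possible way to look at the problem, sketched in Python rather than PHP purely for illustration: tokenise the text with a Unicode-aware word pattern first, then slide a window of one, two and three words over the token list instead of running repeated regex replacements. The function name mine_phrases and the sample text are made up for this sketch and are not part of the original code. In PHP, the comparable idea would be to add the /u modifier to the preg_* patterns so that \w can match non-ASCII letters; note also that the /e modifier used above was deprecated in PHP 5.5 and removed in PHP 7.

```python
import re
from collections import Counter

# In Python 3, \w on str patterns is Unicode-aware by default,
# so accented letters stay inside their words.
WORD_RE = re.compile(r"\w+")


def mine_phrases(text, max_words=3):
    """Count every phrase of 1..max_words consecutive words in text.

    Returns a dict mapping phrase length -> Counter of phrases.
    (Hypothetical helper; it mirrors the intent of updateWordnet
    but does not write anything to a database.)
    """
    # Drop apostrophes and lowercase, as the original code does.
    words = WORD_RE.findall(text.replace("'", "").lower())
    counts = {n: Counter() for n in range(1, max_words + 1)}
    for n in range(1, max_words + 1):
        # Slide a window of n words across the token list, one word at a time.
        for i in range(len(words) - n + 1):
            counts[n][" ".join(words[i:i + n])] += 1
    return counts


if __name__ == "__main__":
    result = mine_phrases("café crème, café noir")
    for n, counter in result.items():
        print(n, dict(counter))
```

    Working on a token list keeps the splitting and the counting separate, so only the tokenising step needs to know about character encodings.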

    Read the article

  • Full-text Indexing Books Online

    - by Most Valuable Yak (Rob Volk)
    While preparing for a recent SQL Saturday presentation, I was struck by a crazy idea (shocking, I know): Could someone import the content of SQL Server Books Online into a database and apply full-text indexing to it?  The answer is yes, and it's really quite easy to do.

    The first step is finding the installed help files.  If you have SQL Server 2012, BOL is installed under the Microsoft Help Library.  You can find the install location by opening SQL Server Books Online and clicking the gear icon for the Help Library Manager.  When the new window pops up, click the Settings link; you'll see the path under Library Location.

    Once you navigate to that path you'll have to drill down a little further, to C:\ProgramData\Microsoft\HelpLibrary\content\Microsoft\store.  This is where the help file content is kept if you downloaded it for offline use.  Depending on which products you've downloaded help for, you may see a few hundred files.  Fortunately they're named well and you can easily find the "SQL_Server_Denali_Books_Online_" files.  We are interested in the .MSHC files only, and can skip the Installation and Developer Reference files.

    Despite the .MSHC extension, these files are compressed with the standard Zip format, so your favorite archive utility (WinZip, 7Zip, WinRar, etc.) can open them.  When you do, you'll see a few thousand files in the archive.  We are only interested in the .htm files, but there's no harm in extracting all of them to a folder.  7zip provides a command-line utility, and the following will extract to a previously created D:\SQLHelp folder:

        7z e -oD:\SQLHelp "C:\ProgramData\Microsoft\HelpLibrary\content\Microsoft\store\SQL_Server_Denali_Books_Online_B780_SQL_110_en-us_1.2.mshc" *.htm

    Well that's great Rob, but how do I put all those files into a full-text index?  I'll tell you in a second, but first we have to set up a few things on the database side.  I'll be using a database named Explore (you can certainly change that), and the following setup is a fragment of the script I used in my presentation:

        USE Explore;
        GO
        CREATE SCHEMA help AUTHORIZATION dbo;
        GO
        -- Create default fulltext catalog for later FT indexes
        CREATE FULLTEXT CATALOG FTC AS DEFAULT;
        GO
        CREATE TABLE help.files(
            file_id int not null IDENTITY(1,1) CONSTRAINT PK_help_files PRIMARY KEY,
            path varchar(256) not null CONSTRAINT UNQ_help_files_path UNIQUE,
            doc_type varchar(6) DEFAULT('.xml'),
            content varbinary(max) not null);
        CREATE FULLTEXT INDEX ON help.files(content TYPE COLUMN doc_type LANGUAGE 1033)
            KEY INDEX PK_help_files;

    This will give you a table, a default full-text catalog, and a full-text index on that table for the content you're going to insert.  I'll be using the command line again for this; it's the easiest method I know:

        for %a in (D:\SQLHelp\*.htm) do sqlcmd -S. -E -d Explore -Q"set nocount on;insert help.files(path,content) select '%a', cast(c as varbinary(max)) from openrowset(bulk '%a', SINGLE_CLOB) as c(c)"

    You'll need to copy and run that as one line in a command prompt.  I'll explain what this does while you run it and watch several thousand files get imported:

    The "for" command allows you to loop over a collection of items.  In this case we want all the .htm files in the D:\SQLHelp folder.  For each file it finds, it will assign the full path and file name to the %a variable.  In the "do" clause, we'll specify another command to be run for each iteration of the loop.  I make a call to "sqlcmd" in order to run a SQL statement.  I pass in the name of the server (-S.), where "." represents the local default instance.  I specify -d Explore as the database, and -E for a trusted connection.  I then use -Q to run a query that I enclose in double quotes.

    The query uses OPENROWSET(BULK…SINGLE_CLOB) to open the file as a data source, and to treat it as a single character large object.  In order for full-text indexing to work properly, I have to convert the text content to varbinary.  I then INSERT these contents along with the full path of the file into the help.files table created earlier.  This process continues for each file in the folder, creating one new row in the table per file.

    And that's it!  Five SQL statements and two command-line statements to unzip and import SQL Server Books Online!  In case you're wondering why I didn't use FILESTREAM or FILETABLE, it's simply because I haven't learned them…yet.  I may return to this blog after I figure that out and update it with the steps to do so.  I believe that will make it even easier.

    In the spirit of exploration, I'll leave you to work on some full-text queries of this content.  I also recommend playing around with the sys.dm_fts_xxxx DMVs (I particularly like sys.dm_fts_index_keywords, it's pretty interesting).  There are additional example queries in the download material for my presentation linked above.

    Many thanks to Kevin Boles for his advice on (re)checking the content of the help files.  Don't let that .htm extension fool you!  The 2012 help files are actually XML, and you'd need to specify '.xml' in your document type column in order to extract the full-text keywords.  (You probably noticed this in the default definition for the doc_type column.)  You can query sys.fulltext_document_types to get a complete list of the types that can be full-text indexed.

    I also need to thank Hilary Cotter for giving me the original idea.  I believe he used MSDN content in a full-text index for an article from waaaaaaaaaaay back that I can't find now, and that I had forgotten about until just a few days ago.  He is also co-author of Pro Full-Text Search in SQL Server 2008, which I highly recommend.  He also has some FTS articles on Simple Talk:

        http://www.simple-talk.com/sql/learn-sql-server/sql-server-full-text-search-language-features/
        http://www.simple-talk.com/sql/learn-sql-server/sql-server-full-text-search-language-features,-part-2/
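
    Because the post notes that the help archives use the standard Zip format, the extraction step could also be scripted instead of run through 7-Zip. The following is a minimal sketch in Python, not the author's method; the function name extract_htm is an assumption made for illustration, and it flattens directory structure the way "7z e" does, so members with the same file name would overwrite each other.

```python
import zipfile
from pathlib import Path


def extract_htm(archive_path, dest_dir):
    """Extract every .htm member of a Zip-format archive into dest_dir.

    Hypothetical helper: directory structure inside the archive is
    discarded, so only the bare file names are kept.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(archive_path) as archive:
        for member in archive.namelist():
            # Skip everything that is not an .htm file.
            if not member.lower().endswith(".htm"):
                continue
            # Keep only the file name, dropping any folders in the archive.
            target = dest / Path(member).name
            target.write_bytes(archive.read(member))
            extracted.append(target)
    return extracted
```

    Calling extract_htm with the path to one of the .mshc files and a target folder such as D:\SQLHelp would leave the .htm files ready for the import loop described in the post.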

    Read the article

< Previous Page | 27 28 29 30 31 32 33 34 35 36 37 38  | Next Page >