Search Results

Search found 819 results on 33 pages for 'tagged corpus'.

  • Iterating through String word at a time in Python

    - by AlgoMan
    I have a string buffer holding a huge text file, and I have to search it for a given set of words and phrases. What's an efficient way to do it? I tried using re module matches, but because the corpus I have to search is huge, this takes a large amount of time. Given a dictionary of words and phrases, I iterate through each file, read it into a string, search for all the words and phrases in the dictionary, and increment the count in the dictionary whenever a key is found. One small optimization we thought of was to sort the dictionary of phrases/words from the most words to the fewest, and then, at each word start position in the string buffer, compare against the list of words. If one phrase is found, we don't search for the other phrases (as it matched the longest phrase, which is what we want). Can someone suggest how to go through the string buffer word by word? Also, is there any other optimization that can be done here?
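
The approach described in the question above (walk the text word by word and prefer the longest matching phrase at each position) could be sketched roughly as follows. This is only an illustration, not the poster's code: the names `iter_words` and `count_phrases` are hypothetical, words are assumed to be runs of `\w` characters, and matching is assumed to be case-insensitive.

```python
import re
from collections import Counter, defaultdict

WORD_RE = re.compile(r"\w+")


def iter_words(text):
    """Yield lowercase words one at a time using a lazy regex scan."""
    for match in WORD_RE.finditer(text):
        yield match.group().lower()


def count_phrases(text, phrases):
    """Count phrase hits, preferring the longest phrase at each word position."""
    # Index phrases by their first word so only plausible candidates are tried.
    by_first = defaultdict(list)
    for phrase in phrases:
        tokens = tuple(phrase.lower().split())
        if tokens:
            by_first[tokens[0]].append(tokens)
    # Longest candidates first, so the first hit is the longest match.
    for candidates in by_first.values():
        candidates.sort(key=len, reverse=True)

    words = list(iter_words(text))
    counts = Counter()
    i = 0
    while i < len(words):
        for tokens in by_first.get(words[i], ()):
            if tuple(words[i:i + len(tokens)]) == tokens:
                counts[" ".join(tokens)] += 1
                i += len(tokens)  # skip past the matched phrase
                break
        else:
            i += 1  # no phrase starts here
    return counts
```

Indexing by first word means each position is compared against a handful of candidates instead of the whole dictionary. For very large files, the word list could be replaced by a bounded look-ahead window over `iter_words`.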

  • Performing a SVD on tweets. Memory problem

    - by plotti
    I have generated a huge CSV file as output from my POS tagging and stemming. It looks like this:

                 word1, word2, word3, ..., word14400
        person1  1      2      0      1
        person2  0      0      1      0
        ...
        person650

    It contains the word counts for each person, which gives me a characteristic vector for each person. I want to run an SVD on this beast, but it seems the matrix is too big to be held in memory to perform the operation. My question is: should I reduce the number of columns by removing words which have a column sum of, for example, 1, which means that they have been used only once? Do I bias the data too much with this approach? I tried the RapidMiner approach of loading the CSV into the database and then reading it in sequentially in batches for processing, as RapidMiner proposes. But MySQL can't store that many columns in a table. If I transpose the data and then re-transpose it on import, it also takes ages.

    So in general I am asking for advice on how to perform an SVD on such a corpus.
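
The column-pruning idea raised in the question can be tried without loading the full matrix. The sketch below streams the file twice: once to sum each column, once to emit sparse rows containing only the kept columns. It assumes a regular comma-separated layout with a header row and the person ID in the first column (the excerpt's exact layout is not clear), and the function names are hypothetical. It does not settle whether pruning biases the result.

```python
import csv


def column_sums(lines):
    """First pass: return (word header, per-column totals) without storing rows."""
    reader = csv.reader(lines)
    header = next(reader)[1:]  # drop the ID column
    sums = [0] * len(header)
    for row in reader:
        for j, cell in enumerate(row[1:]):
            sums[j] += int(cell)
    return header, sums


def sparse_rows(lines, keep):
    """Second pass: yield (person, {column index: count}) for kept columns only."""
    reader = csv.reader(lines)
    next(reader)  # skip header
    for row in reader:
        counts = {j: int(row[j + 1]) for j in keep if int(row[j + 1])}
        yield row[0], counts
```

A usage sketch: compute `keep = [j for j, s in enumerate(sums) if s > 1]` after the first pass, then feed the sparse rows to a sparse truncated SVD routine (for example `scipy.sparse.linalg.svds`, assuming SciPy is available) instead of a dense decomposition.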

  • Cannot install g++ on ubuntu

    - by Erel Segal
    I don't have g++:

        erelsgl@ubuntu:/etc/apt$ which g++
        erelsgl@ubuntu:/etc/apt$
        erelsgl@ubuntu:/etc/apt$ g++
        The program 'g++' can be found in the following packages:
         * g++
         * pentium-builder
        Try: sudo apt-get install <selected package>

    So I try to install it:

        erelsgl@ubuntu:~/srilm$ sudo apt-get install g++
        Reading package lists... Done
        Building dependency tree
        Reading state information... Done
        g++ is already the newest version.
        0 upgraded, 0 newly installed, 0 to remove and 5 not upgraded.
        2 not fully installed or removed.
        After this operation, 0B of additional disk space will be used.
        Setting up g++ (4:4.4.3-1ubuntu1) ...
        update-alternatives: error: alternative path /usr/bin/g++ doesn't exist.
        dpkg: error processing g++ (--configure):
         subprocess installed post-installation script returned error exit status 2
        dpkg: dependency problems prevent configuration of build-essential:
         build-essential depends on g++ (>= 4:4.3.1); however:
          Package g++ is not configured yet.
        dpkg: error processing build-essential (--configure):
         dependency problems - leaving unconfigured
        No apport report written because the error message indicates its a followup error from a previous failure.
        Errors were encountered while processing:
         g++
         build-essential
        E: Sub-process /usr/bin/dpkg returned an error code (1)

    I also tried to install build-essential and got the same results. I also tried "sudo apt-get update"; it didn't help.

    This is my apt-cache:

        erelsgl@ubuntu:/etc/apt$ apt-cache policy g++ build-essential
        g++:
          Installed: 4:4.4.3-1ubuntu1
          Candidate: 4:4.4.3-1ubuntu1
          Version table:
         *** 4:4.4.3-1ubuntu1 0
                500 http://il.archive.ubuntu.com/ubuntu/ lucid/main Packages
                100 /var/lib/dpkg/status
        build-essential:
          Installed: 11.4build1
          Candidate: 11.4build1
          Version table:
         *** 11.4build1 0
                500 http://il.archive.ubuntu.com/ubuntu/ lucid/main Packages
                100 /var/lib/dpkg/status
        erelsgl@ubuntu:/etc/apt$

    I also tried this and got the same error:

        erelsgl@ubuntu:~/Ace/Files/corpus$ sudo dpkg --configure -a
        Setting up g++ (4:4.4.3-1ubuntu1) ...
        update-alternatives: error: alternative path /usr/bin/g++ doesn't exist.
        dpkg: error processing g++ (--configure):
         subprocess installed post-installation script returned error exit status 2
        dpkg: dependency problems prevent configuration of build-essential:
         build-essential depends on g++ (>= 4:4.3.1); however:
          Package g++ is not configured yet.
        dpkg: error processing build-essential (--configure):
         dependency problems - leaving unconfigured
        Errors were encountered while processing:
         g++
         build-essential

  • Choosing a Windows Automation script language. Autoit vs Autohotkey.

    - by PA
    I need to choose a Windows automation program. Which one do you recommend? AutoIt, AutoHotkey, others? I have read http://paperlined.org/apps/autohotkey/autoit_and_autohotkey.html , interesting history but without a clear recommendation. Searching on Google leaves a winner (around 312k hits for AutoHotkey Windows vs 482k hits for AutoIt Windows). On StackOverflow there are 15 questions tagged as AutoIt vs 18 for AutoHotkey. I am interested in your opinion as programmers. Which one do you think is easier to use, more deployable and more powerful in terms of functionality? Note: I have already used AutoHotkey for personal use, so my initial preference is for it.

  • How to calculate an angle from three points?

    - by HelloMoon
    Let's say you have this:

        P1 = (x=2, y=50)
        P2 = (x=9, y=40)
        P3 = (x=5, y=20)

    Assume that P1 is the center point of a circle. It is always the same. I want the angle that is made up by P2 and P3, or in other words the angle that is next to P1. The inner angle, to be precise. It will always be an acute angle, so less than 90 degrees. I thought: man, that's the simplest geometry. But I have been looking for a formula for about 6 hours now, and people talk about the most complicated NASA stuff like arccos and vector scalar products. My head feels like it's in a fridge. Are there some math gurus here who think this is a simple problem? I think the programming language doesn't matter here, but for those who think it does: Java and Objective-C. I need it for both, though I haven't tagged it for these.
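
One common way to get the angle at P1 is to form the two vectors P1→P2 and P1→P3 and take `atan2` of their cross and dot products. The sketch below is in Python rather than the Java or Objective-C the poster asked for, and the function name `angle_at` is an illustrative choice.

```python
import math


def angle_at(center, a, b):
    """Return the unsigned angle in degrees at `center` between points a and b."""
    v1 = (a[0] - center[0], a[1] - center[1])
    v2 = (b[0] - center[0], b[1] - center[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]    # |v1||v2| cos(theta)
    cross = v1[0] * v2[1] - v1[1] * v2[0]  # |v1||v2| sin(theta), signed
    return math.degrees(math.atan2(abs(cross), dot))


# The points from the question: P1 is the vertex.
print(angle_at((2, 50), (9, 40), (5, 20)))
```

Using `atan2` avoids the rounding trouble `acos` can have when the vectors are nearly parallel, and the same few lines translate directly to `Math.atan2` in Java or `atan2` in C.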

  • WCF Security Transport Security Questions

    - by shyneman
    I'm writing a set of WCF services that rely on transport security with Windows Authentication using the trusted subsystem model. However, I want to perform authorization based on the original client user that initiated the request (e.g. a user from a website with a username/password). I'm planning to achieve this by adding the original user's credentials to the header before the client sends the message; the service will then use the supplied credentials to authorize the user. So I have a few questions about this implementation:

    1) Using transport security with Windows auth, I do NOT need to worry about encrypting the passed credentials again to ensure their validity... WCF automatically takes care of this. Is this correct?
    2) How does this implementation prevent a malicious service, running under some Windows account within the domain, from sending a message tagged with spoofed credentials? E.g. a malicious service replaces the credentials with those of an Admin user to do something bad.

    Thanks for any help.

  • Quartz.NET Instance Handling Problem

    - by Dhon
    Hi, I have 2 instances that use 2 different instance IDs in 2 different Windows services:

        // windows service 1, instance 1
        properties["quartz.scheduler.instanceName"] = "instanceName1";
        properties["quartz.scheduler.instanceId"] = "instanceID1";

        // windows service 2, instance 2
        properties["quartz.scheduler.instanceName"] = "instanceName2";
        properties["quartz.scheduler.instanceId"] = "instanceID2";

    In the ADOJobstore, I can see that there are two instances. However, when I schedule a simple job in instance1, it gets triggered in instance2 (and vice versa). Looking at the records created in the job store, the scheduled jobs are properly tagged with the expected instanceIDs. Any idea why this is happening?

  • How to embed multiple tags in Rails routes, like Stackoverflow.

    - by Craig
    When one selects a tag on Stack Overflow, it is added to the end of the URL. Add a second tag and it is added to the end of the URL after the first tag, with a '+' delimiter. For example, http://stackoverflow.com/questions/tagged/ruby-on-rails+best-practices. How is this implemented? Is this a routing enhancement or some logic contained in the TagsController? Finally, how does one 'extract' these tags for filtering (assuming that they are not in the params[] array)?

  • SQL: Gather right hand values from a join

    - by Max Williams
    Let's say a question has many tags, via a join table called taggings. I do a join thus:

        SELECT DISTINCT `questions`.id
        FROM `questions`
        LEFT OUTER JOIN `taggings` ON `taggings`.taggable_id = `questions`.id
        LEFT OUTER JOIN `tags` ON `tags`.id = `taggings`.tag_id

    I want to order the results according to a particular tag name, e.g. 'piano', so that piano is at the top, then by all the other tags in alphabetical order. Currently I'm using this order clause:

        ORDER BY (tags.name = 'piano') desc, tags.name

    This is going completely wrong: the first results I get back aren't even tagged with 'piano' at all. I think my problem is that I need to group the tag names somehow and do my ordering test against that. I think that doing it against the straight tags.name isn't working due to the structure of the resultant join table (it does work if I just do a simple select on the tags table), but I can't get my head around how to fix it. Grateful for any advice, max
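
One possible way to do the grouping the poster suspects is needed: collapse the join to one row per question with GROUP BY, then order by an aggregate of the piano test. The sketch below runs the idea against SQLite through Python's `sqlite3` module purely so it is self-contained. The question is about MySQL, where the same GROUP BY and `MAX(...)` shape is assumed to apply but has not been verified here. The function name and table aliases are illustrative.

```python
import sqlite3


def questions_piano_first(conn, tag="piano"):
    """Return question ids: those carrying `tag` first, then by first tag name."""
    sql = """
        SELECT q.id
        FROM questions AS q
        LEFT OUTER JOIN taggings AS tg ON tg.taggable_id = q.id
        LEFT OUTER JOIN tags AS t ON t.id = tg.tag_id
        GROUP BY q.id
        -- MAX(...) is 1 if ANY of the question's tags matches, 0 otherwise,
        -- and NULL for untagged questions (sorted last by DESC in SQLite).
        ORDER BY MAX(t.name = ?) DESC, MIN(t.name)
    """
    return [row[0] for row in conn.execute(sql, (tag,))]
```

The difference from the original clause is that the test is evaluated once per question rather than once per joined row, so a question tagged with both 'piano' and another tag cannot be ordered by the wrong row.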

  • Do you still support Internet Explorer 6?

    - by Marcos Buarque
    Hi, I hope this question isn't tagged as too generic, but this is an honest concern/curiosity I have. Are you people still supporting Internet Explorer 6? Web developers sometimes use the concept of gracefully degrading their projects in older browsers. When I say "older", I mean mainly Internet Explorer 6. When you degrade gracefully, you still have to waste tons of time trying to fit things into IE6. This is time-consuming, and the client usually won't even bother asking or checking how things look in IE6. So what are you doing when your website is completely broken in IE6? Do you bother to completely fix it and eat up your time, or do you simply ignore it (leave broken boxes, opaque PNG files etc.)?

  • ruby 1.9: invalid byte sequence in UTF-8

    - by Marc Seeger
    I'm writing a crawler in ruby (1.9) that consumes lots of HTML from a lot of random sites. When trying to extract links, I decided to just use .scan(/href="(.*?)"/i) instead of nokogiri/hpricot (major speedup). The problem is that I now receive a lot of "invalid byte sequence in UTF-8" errors. From what I understood, the net/http library doesn't have any encoding specific options and the stuff that comes in is basically not properly tagged. What would be the best way to actually work with that incoming data? I tried .encode with the replace and invalid options set, but no success so far...

  • Windows Forms: Enable/Disable WS_CLIPCHILDREN

    - by Agnel Kurian
    How do I turn on/off the WS_CLIPCHILDREN window style in a Windows Forms parent control? I would like to display some text on top of the child control after it has painted. In my parent control, this is what I have:

        class Parent : public Control {
            void Parent::OnPaint(PaintEventArgs ^e) {
                Control::OnPaint(e);
                // parent draws here
                // some drawing should happen over the child windows
                // in other words, do not clip child window regions
            }
        };

    On checking with Spy++ I find that the parent has the WS_CLIPCHILDREN window style enabled by default. What is the Windows Forms way to turn this off? Note: Sample code is in C++/CLI but I have tagged this C# for visibility... language is immaterial here. Feel free to translate the code to C#.

  • Visual studio does not gray out properties with a ReadOnlyAttribute(true)

    - by Fire-Dragon-DoL
    I know it's stupid, but Visual Studio (2010) doesn't gray out my properties tagged with ReadOnlyAttribute. I can't edit their values (if I try, they simply return to the previous value), but they aren't grayed out, which I find really annoying when using the editor. Is there an option or an attribute that I'm forgetting? Thanks for any help.

    Example 1:

        /// <summary>
        /// Inform if the LcdDisplay has been already initiated
        /// </summary>
        [Description("Inform if the LcdDisplay has been already initiated")]
        [DefaultValue(false)]
        [ReadOnly(true)]
        public bool Initialized { get; private set; }

    Initialized is not grayed out.

  • What is the release date for Rakudo Star (perl6)?

    - by kbenson
    If a specific release date is not available (as I suspect it is not), can you provide resources for tracking how close it is to the desired feature set that allows release? I'm not necessarily asking for a percentage gauge, or an "X of Y features completed" list. A list of bugs marked in whichever section of the Perl RT instance is tracking Rakudo bugs would meet my criteria, even more so if the list is dynamic (i.e. it's a list of bugs tagged in some manner, not a static list of ticket numbers). If there are only a few planned features left to be finished/tested before it's considered ready for final testing, listing those would also be sufficient.

  • Creating new tags in StackOverflow [closed]

    - by Biranchi
    Hi, Earlier, users with more than 300 reputation or so (I don't remember exactly) had the ability to create new tags for their questions, so that it was easier to search for similar questions by tag. The tags also helped to identify the context of the questions asked. I had also created a few new tags for the questions that I asked. But now it seems the Stack Overflow team, without prior notice to the users or any intimation, has revoked the permission for creating new tags and has raised the reputation required for creating new tags to more than 1500. Isn't this a bullying act, or did they feel it was unnecessary to inform all the registered users before making the changes? Thoughts....

  • Django model manager didn't work with related object when I do aggregated query

    - by Satoru.Logic
    Hi, all. I'm having trouble doing an aggregation query on a many-to-many related field. Let's begin with my models:

        class SortedTagManager(models.Manager):
            use_for_related_fields = True

            def get_query_set(self):
                orig_query_set = super(SortedTagManager, self).get_query_set()
                # FIXME `used` is wrongly counted
                return orig_query_set.distinct().annotate(
                    used=models.Count('users')).order_by('-used')

        class Tag(models.Model):
            content = models.CharField(max_length=32, unique=True)
            creator = models.ForeignKey(User, related_name='tags_i_created')
            users = models.ManyToManyField(User, through='TaggedNote',
                                           related_name='tags_i_used')
            objects_sorted_by_used = SortedTagManager()

        class TaggedNote(models.Model):
            """Association table of both (Tag , Note) and (Tag, User)"""
            note = models.ForeignKey(Note)  # Note is what's tagged in my app
            tag = models.ForeignKey(Tag)
            tagged_by = models.ForeignKey(User)

            class Meta:
                unique_together = (('note', 'tag'),)

    However, the value of the aggregated field used is only correct when the model is queried directly:

        for t in Tag.objects.all():
            print t.used  # this works correctly

        for t in user.tags_i_used.all():
            print t.used  # prints n^2 when it should give n

    Would you please tell me what's wrong with it? Thanks in advance.

  • Tuple struct constructor complains about private fields

    - by Grubermensch
    I am working on a basic shell interpreter to familiarize myself with Rust. While working on the table for storing suspended jobs in the shell, I have gotten stuck at the following compiler error message:

        tsh.rs:8:18: 8:31 error: cannot invoke tuple struct constructor with private fields
        tsh.rs:8   let mut jobs = job::JobsList(vec![]);
                                  ^~~~~~~~~~~~~

    It's unclear to me what is being seen as private here. As you can see below, both of the structs are tagged with pub in my module file. So, what's the secret sauce?

    tsh.rs

        use std::io;
        mod job;

        fn main() {
            // Initialize jobs list
            let mut jobs = job::JobsList(vec![]);
            loop { /*** Shell runtime loop ***/ }
        }

    job.rs

        use std::fmt;

        pub struct Job {
            jid: int,
            pid: int,
            cmd: String
        }

        impl fmt::Show for Job { /*** Formatter ***/ }

        pub struct JobsList(Vec<Job>);

        impl fmt::Show for JobsList { /*** Formatter ***/ }

  • Searching documents by tag using the Scribd API is no longer returning expected results.

    - by George
    Recently I have encountered an issue with Scribd where searching via the Scribd API (docs.search) for documents by tag is no longer working. This had been working (for over 6 months) to return a number of documents that I have tagged with "fdsafetyandprevention" (accessible here http://www.scribd.com/tag/fdsafetyandprevention). Just recently my search via the API has stopped working. Note that test searches such as @tags "selfhelp" as described in the Scribd documentation DO work. Could my issue be related to caching or the age of my documents and Scribd choosing to not return them in search results? I have been using scribd.php (http://www.scribd.com/developers/libraries) to interface with the API using $scribd->search(@tags "fdsafetyandprevention", 20, 0, "all"). I am following the Scribd documentation for docs.search and advanced help (http://www.scribd.com/developers/search_help). Help greatly appreciated. George.

  • Deploying only changed part of a website with git to ftp (svn2web for git)

    - by Elazar Leibovich
    I have a website with many big image files. The source (as well as the images) is maintained with git. I wish to deploy it via FTP to a Bluehost-like cheap server. I do not wish to deploy the whole website each time (so that I won't have to upload many unchanged files over and over), but to do roughly the following:

    1) In the git repository, mark the last deployed revision with a tag "deployed".
    2) When I say "deploy revision X", find out which files have changed between revision X and the revision tagged "deployed", and upload just those.

    It is similar in spirit to svn2web, but I want that for a DVCS. A Mercurial alternative will be considered. It's a pretty simple script to write, but I'd rather not reinvent the wheel if there's a similar script on the web. Capistrano and fab seem to know only how to push the whole revision in their SCM integration, so I don't think I can currently use them.
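
The "find out which files changed since the deployed tag" step could be sketched as below. It relies on `git diff --name-status <tag> <revision>`, which prints one tab-separated status line per changed path. The function names, the default tag name `deployed`, and the choice to treat a rename as a delete plus an upload are assumptions for illustration; the FTP upload itself is left out.

```python
import subprocess


def parse_name_status(diff_output):
    """Split `git diff --name-status` output into (uploads, deletions)."""
    uploads, deletions = [], []
    for line in diff_output.splitlines():
        if not line.strip():
            continue
        status, *paths = line.split("\t")
        if status.startswith("D"):
            deletions.append(paths[0])
        elif status.startswith("R"):
            # Rename: remove the old path on the server, upload the new one.
            deletions.append(paths[0])
            uploads.append(paths[1])
        else:
            # Added, modified, copied, type-changed: upload the resulting path.
            uploads.append(paths[-1])
    return uploads, deletions


def changed_since_deploy(revision, tag="deployed"):
    """Ask git which paths differ between the deployed tag and `revision`."""
    out = subprocess.run(
        ["git", "diff", "--name-status", tag, revision],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_name_status(out)
```

After a successful upload, moving the marker with `git tag -f deployed <revision>` would keep the next run incremental.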

  • Rails: Polymorphic User Table a good idea with AuthLogic?

    - by sscirrus
    Hi everyone, I have a system where I need to login three user types: customers, companies, and vendors from one login form on the home page. I have created one User table that works according to AuthLogic's example app at http://github.com/binarylogic/authlogic_example. I have added a field called "User Type" that currently contains either 'Customer', 'Company', or 'Vendor'. Note: each user type contains many disparate fields so I'm not sure if Single Table Inheritance is the best way to go (would welcome corrections if this conclusion is invalid). Is this a polymorphic association where each of the three types is 'tagged' with a User record? How should my models look so I have the right relationships between my User table and my user types Customer, Company, Vendor? Thanks very much!

  • Vim OmniCppComplete on vectors of pointers

    - by Alex
    Hi, I might have done something wrong in the set up but is OmniCppComplete supposed to provide the members/functions of classes when doing this? vectorofpointers[0]-> At the moment all I get when trying that are things relating to the vector class itself, which obviously isn't very useful. I think it might have been working before I tagged /usr/include/ but I could be wrong. Also, is it possible to disable the preview window? I find it just clutters up my workspace. And since I enabled ShowPrototypeInAbbr I don't really need it. Thanks, Alex

  • python socket.recv/sendall call blocking

    - by fsm
    Hi everyone. This post is incorrectly tagged 'send' since I cannot create new tags. I have a very basic question about this simple echo server. Here are some code snippets.

    client

        while True:
            data = raw_input("Enter data: ")
            mySock.sendall(data)
            echoedData = mySock.recv(1024)
            if not echoedData: break
            print echoedData

    server

        while True:
            print "Waiting for connection"
            (clientSock, address) = serverSock.accept()
            print "Entering read loop"
            while True:
                print "Waiting for data"
                data = clientSock.recv(1024)
                if not data: break
                clientSock.send(data)
            clientSock.close()

    Now this works alright, except when the client sends an empty string (by hitting the return key in response to "Enter data: "), in which case I see some deadlock-ish behavior. Now, what exactly happens when the user presses return on the client side? I can only imagine that the sendall call blocks waiting for some data to be added to the send buffer, causing the recv call to block in turn. What's going on here? Thanks for reading!
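
A likely explanation for the behavior described above: sending an empty string transmits zero bytes, so the server's `recv` never returns, and the client's own `recv` then waits for an echo that is never sent. One way around this is to frame every message, for example with a trailing newline, so that even an empty line puts at least one byte on the wire. The sketch below uses Python 3 (the question's code is Python 2), and the helper names are hypothetical.

```python
import socket


def send_message(sock, text):
    """Send text terminated by a newline so an empty line still sends one byte."""
    sock.sendall(text.encode("utf-8") + b"\n")


def recv_message(sock):
    """Read bytes until a newline; return None if the peer closed first."""
    chunks = []
    while True:
        byte = sock.recv(1)
        if not byte:
            return None  # connection closed before a full message arrived
        if byte == b"\n":
            return b"".join(chunks).decode("utf-8")
        chunks.append(byte)
```

Reading one byte at a time keeps the sketch short; a buffered reader such as `sock.makefile()` would be the more efficient choice in real code.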

  • Creating a QuerySet based on a ManyToManyField in Django

    - by River Tam
    So I've got two classes, Picture and Tag, that are as follows:

        class Tag(models.Model):
            pics = models.ManyToManyField('Picture', blank=True)
            name = models.CharField(max_length=30)
            # stuff omitted

        class Picture(models.Model):
            name = models.CharField(max_length=100)
            pub_date = models.DateTimeField('date published')
            tags = models.ManyToManyField('Tag', blank=True)
            content = models.ImageField(upload_to='instaton')
            # stuff omitted

    And what I'd like to do is get a queryset (for a ListView), given a tag name, that contains the most recent X Pictures that are tagged as such. I've looked up very similar problems, but none of the responses make any sense to me at all. How would I go about creating this queryset?

  • Cloning just a particular directory with hg?

    - by leeand00
    I come from a Subversion background, but I am slowly migrating to Mercurial. When starting on many of my projects, I would set up a development environment that was configured to a particular starting point in developing an app/webapp/program (much like a Maven 2 archetype, but not necessarily Java/Maven). Later I would check this archetype/template project out of my svn repo by its particular path, and then export the working copy from version control, so that I could import it into another repository without adding the changes I made to the working copy back to the base template/archetype project. I've tried doing the same thing in Mercurial, and I've run into a wall since I can't check out, er..um..no, clone a specific path from the hg repository. If I want to achieve the same sort of functionality using Mercurial, what should I do? Use tagged branches? The archetype/template projects are very different, but I'd like to keep them in the same repository.

  • set Image to Button

    - by Ivan
    Hello all, Could somebody help me a little bit with my issue below? When I call myFunction, the images which I want to set on the buttons all appear simultaneously after 2 sec, not one by one with a delay of 0.5 sec. More info: generatedNumbers is an array with four NSNumber elements (4,1,3,2); the buttons are set in a UIView via IB and are tagged (1,2,3,4).

        -(IBAction) myFunction:(id) sender {
            int i, value;
            for (i = 0; i<[generatedNumbers count]; i++) {
                value = [[generatedNumbers objectAtIndex:i] intValue];
                UIButton *button = (UIButton *)[self.view viewWithTag:i+1];
                UIImage *img = [UIImage imageNamed:[NSString stringWithFormat:@"%d.png",value]];
                [button setImage:img forState:UIControlStateNormal];
                [img release];
                usleep(500000);
            }
        }
