Search Results

Search found 234 results on 10 pages for 'stanford nlp'.

Page 4/10 | < Previous Page | 1 2 3 4 5 6 7 8 9 10  | Next Page >

  • Natural Language Processing in Ruby

    - by Joey Robert
    I'm looking to do some sentence analysis (mostly for twitter apps) and infer some general characteristics. Are there any good natural language processing libraries for this sort of thing in Ruby? Similar to http://stackoverflow.com/questions/870460/java-is-there-a-good-natural-language-processing-library but for Ruby. I'd prefer something very general, but any leads are appreciated!


  • PyParsing: Is this correct use of setParseAction()?

    - by Rosarch
    I have strings like this:

        "MSE 2110, 3030, 4102"

    I would like to output:

        [("MSE", 2110), ("MSE", 3030), ("MSE", 4102)]

    This is my way of going about it, although I haven't quite gotten it yet:

        def makeCourseList(str, location, tokens):
            print "before: %s" % tokens

            for index, course_number in enumerate(tokens[1:]):
                tokens[index + 1] = (tokens[0][0], course_number)

            print "after: %s" % tokens

        course = Group(DEPT_CODE + COURSE_NUMBER)  # .setResultsName("Course")

        course_data = (course + ZeroOrMore(Suppress(',') + COURSE_NUMBER)).setParseAction(makeCourseList)

    This outputs:

        >>> course.parseString("CS 2110")
        ([(['CS', 2110], {})], {})
        >>> course_data.parseString("CS 2110, 4301, 2123, 1110")
        before: [['CS', 2110], 4301, 2123, 1110]
        after: [['CS', 2110], ('CS', 4301), ('CS', 2123), ('CS', 1110)]
        ([(['CS', 2110], {}), ('CS', 4301), ('CS', 2123), ('CS', 1110)], {})

    Is this the right way to do it, or am I totally off? Also, the output isn't quite correct: I want course_data to emit a list of course symbols that are all in the same format. Right now, the first course is different from the others. (It has a {}, whereas the others don't.)


  • Ideas for designing an automated content tagging system needed

    - by Benjamin Smith
    I am currently designing a website that, amongst other things, is required to display and organise small amounts of text content (mainly quotes, article stubs, etc.). I currently have a database with 250,000+ items and need to come up with a method of tagging each item with relevant tags, which will eventually allow users to search and browse the content easily.

    A very simplistic idea I have (and one that I believe is employed by some sites that I have been looking to for inspiration, e.g. http://www.brainyquote.com/quotes/topics.html) is to simply search the database for certain words or phrases and use these words as tags for the content. This can easily be extended so that if, for example, a user wanted to show all items with a theme of love, then I would just return a list of items with words and phrases relating to this theme. This would not be hard to implement but does not provide very good results. For example, if I were to search for the month 'May' in the database with the aim of then classifying the items returned as relating to the topic of Spring, then I would get back all occurrences of the word May, regardless of the semantic meaning. Another shortcoming of this method is that I believe it would be quite hard to automate the process on any large scale.

    What I really require is a library that can take an item, break it down, analyse the semantic meaning and return a list of tags that would correctly classify the item. I know this is a lot to ask and I have a feeling I will end up reverting to the aforementioned method, but I just thought I should ask if anyone knew of any pre-existing solution. I think that as the items in the database are short, it is probably quite a hard task to extract any meaning from them; however, I may be mistaken.

    Another path to possibly go down would be to use something like Amazon Mechanical Turk to outsource the task, which may produce good results but would be expensive. Eventually I would like users to be able to (and want to!) tag content and to vote for the most relevant tags, possibly using a gamification mechanic as motivation; however, this is some way down the line. A temporary fix may be the best thing if this were the route I decided to go down, as I could use the rough results I got as the starting point for a more in-depth solution.

    If you've read this far, thanks for sticking with me. I know I'm spitballing, but any input would be really helpful. Thanks.


  • Algorithm to match natural text in mail

    - by snøreven
    I need to separate natural, coherent text/sentences in emails from lists, signatures, greetings and so on before further processing. Example:

        Hi tom,
        last monday we did bla bla, lore
        Lorem ipsum dolor sit amet, consectetur adipisici elit, sed eiusmod tempor incidunt ut labore et dolore magna aliqua.
        list item 2
        list item 3
        list item 3
        Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquid x ea commodi consequat. Quis aute iure reprehenderit in voluptate velit
        regards, K.
        ---line-of-funny-characters-#######
        example inc.
        33 evil street, london
        mobile: 00 234534/234345

    Ideally the algorithm would match only the bold parts. Is there any recommended approach, or are there even existing algorithms for that problem? Should I try approximate regular expressions, or something more statistical based on the number of punctuation marks, length and so on?


  • How does Amazon's Statistically Improbable Phrases work?

    - by ??iu
    How does something like Statistically Improbable Phrases work? According to Amazon:

        Amazon.com's Statistically Improbable Phrases, or "SIPs", are the most distinctive phrases in the text of books in the Search Inside!™ program. To identify SIPs, our computers scan the text of all books in the Search Inside! program. If they find a phrase that occurs a large number of times in a particular book relative to all Search Inside! books, that phrase is a SIP in that book. SIPs are not necessarily improbable within a particular book, but they are improbable relative to all books in Search Inside!. For example, most SIPs for a book on taxes are tax related. But because we display SIPs in order of their improbability score, the first SIPs will be on tax topics that this book mentions more often than other tax books. For works of fiction, SIPs tend to be distinctive word combinations that often hint at important plot elements.

    For instance, for Joel's first book, the SIPs are: leaky abstractions, antialiased text, own dog food, bug count, daily builds, bug database, software schedules.

    One interesting complication is that these are phrases of either 2 or 3 words. This makes things a little more interesting because these phrases can overlap with or contain each other.
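    The idea in the quoted description (a phrase that is frequent in one book relative to the whole collection) can be sketched roughly as below. This is only an illustration: the scoring formula (count times log-ratio with add-one smoothing) and the names `ngrams` and `improbable_phrases` are assumptions made for the sketch, not Amazon's actual method.

```python
from collections import Counter
import math
import re


def ngrams(text, sizes=(2, 3)):
    """Yield lowercase word n-grams of the given sizes (2- and 3-word phrases)."""
    words = re.findall(r"[a-z']+", text.lower())
    for n in sizes:
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])


def improbable_phrases(book, corpus, top=5):
    """Rank phrases in `book` by how over-represented they are versus `corpus`.

    `corpus` is a list of texts standing in for "all books" (assumption).
    """
    book_counts = Counter(ngrams(book))
    corpus_counts = Counter()
    for text in corpus:
        corpus_counts.update(ngrams(text))
    book_total = sum(book_counts.values())
    corpus_total = sum(corpus_counts.values())

    scores = {}
    for phrase, count in book_counts.items():
        p_book = count / book_total
        # Add-one smoothing so phrases unseen in the corpus do not divide by zero.
        p_corpus = (corpus_counts[phrase] + 1) / (corpus_total + 1)
        scores[phrase] = count * math.log(p_book / p_corpus)
    return sorted(scores, key=scores.get, reverse=True)[:top]
```

    Overlapping 2- and 3-word phrases are scored independently here; collapsing a bigram into the trigram that contains it would need an extra pass.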


  • How to estimate the quality of a web page?

    - by roddik
    Hello, I'm doing a university project that must gather and combine data on a user-provided topic. The problem I've encountered is that Google search results for many terms are polluted with low-quality autogenerated pages, and if I use them, I can end up with wrong facts. How is it possible to estimate the quality/trustworthiness of a page? You may think "nah, Google engineers have been working on the problem for 10 years and he's asking for a solution", but if you think about it, a search engine must provide up-to-date content, and if it marks a good page as a bad one, users will be dissatisfied. I don't have such limitations, so if the algorithm accidentally marks some good pages as bad, that wouldn't be a problem.


  • How to write efficient code for extracting Noun phrases?

    - by Arun Abraham
    I am trying to extract phrases from POS-tagged text using rules such as the ones below:

        1) NNP - NNP (- indicates "followed by")
        2) NNP - CC - NNP
        3) VP - NP
        etc.

    I have written code in this manner. Can someone tell me how I can do this in a better way?

        List<String> nounPhrases = new ArrayList<String>();
        for (List<HasWord> sentence : documentPreprocessor) {
            //System.out.println(sentence.toString());
            System.out.println(Sentence.listToString(sentence, false));
            List<TaggedWord> tSentence = tagger.tagSentence(sentence);
            String lastTag = null, lastWord = null;
            for (TaggedWord taggedWord : tSentence) {
                if (lastTag != null && taggedWord.tag().equalsIgnoreCase("NNP") && lastTag.equalsIgnoreCase("NNP")) {
                    nounPhrases.add(taggedWord.word() + " " + lastWord);
                    //System.out.println(taggedWord.word() + " " + lastWord);
                }
                lastTag = taggedWord.tag();
                lastWord = taggedWord.word();
            }
        }

    In the above code, I have only handled extraction of NNP followed by NNP. How can I generalise it so that I can add other rules too? I know that there are libraries available for doing this, but I wanted to do it manually.
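    One way to generalise the "NNP followed by NNP" check is to treat every rule as a sequence of tags and slide it over the tagged sentence. The sketch below shows that idea in Python rather than Java; `match_rules`, `RULES` and the sample sentence are hypothetical names and data for illustration. It only covers word-level tags, so a phrase-level rule such as VP - NP would still need a chunker or parser.

```python
def match_rules(tagged, rules):
    """Return phrases whose consecutive tags match any rule.

    `tagged` is a list of (word, tag) pairs; `rules` is a list of tag tuples.
    """
    phrases = []
    for start in range(len(tagged)):
        for rule in rules:
            window = tagged[start:start + len(rule)]
            if len(window) == len(rule) and all(
                tag.upper() == want for (_, tag), want in zip(window, rule)
            ):
                # Words are joined in sentence order.
                phrases.append(" ".join(word for word, _ in window))
    return phrases


# Adding a rule means adding a tuple; the matching loop stays unchanged.
RULES = [("NNP", "NNP"), ("NNP", "CC", "NNP")]

sentence = [("Barack", "NNP"), ("Obama", "NNP"), ("and", "CC"),
            ("Joe", "NNP"), ("Biden", "NNP"), ("spoke", "VBD")]

print(match_rules(sentence, RULES))
# ['Barack Obama', 'Obama and Joe', 'Joe Biden']
```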


  • Python/YACC Lexer: Token priority?

    - by Rosarch
    I'm trying to use reserved words in my grammar:

        reserved = {
            'if' : 'IF',
            'then' : 'THEN',
            'else' : 'ELSE',
            'while' : 'WHILE',
        }

        tokens = [
            'DEPT_CODE',
            'COURSE_NUMBER',
            'OR_CONJ',
            'ID',
        ] + list(reserved.values())

        t_DEPT_CODE = r'[A-Z]{2,}'
        t_COURSE_NUMBER = r'[0-9]{4}'
        t_OR_CONJ = r'or'

        t_ignore = ' \t'

        def t_ID(t):
            r'[a-zA-Z_][a-zA-Z_0-9]*'
            if t.value in reserved.values():
                t.type = reserved[t.value]
                return t
            return None

    However, the t_ID rule somehow swallows up DEPT_CODE and OR_CONJ. How can I get around this? I'd like those two to take higher precedence than the reserved words.
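    As far as the PLY documentation describes it, tokens defined by functions are tried first, in definition order, and tokens defined by plain strings are added afterwards (sorted by decreasing regex length), which would explain why a function-based t_ID wins over the string rules. The sketch below does not use PLY; it mimics the "first rule wins" ordering with the standard `re` module. The `\b` word boundaries, `RULES` and `tokenize` are assumptions added for the illustration.

```python
import re

RESERVED = {"if": "IF", "then": "THEN", "else": "ELSE", "while": "WHILE"}

# Rules are tried in this order, so the specific patterns come before the
# catch-all ID. The \b anchors (an addition) stop "or" matching inside "order".
RULES = [
    ("DEPT_CODE", r"[A-Z]{2,}\b"),
    ("COURSE_NUMBER", r"[0-9]{4}\b"),
    ("OR_CONJ", r"or\b"),
    ("ID", r"[a-zA-Z_][a-zA-Z_0-9]*"),
]
MASTER = re.compile("|".join("(?P<%s>%s)" % rule for rule in RULES))


def tokenize(text):
    """Return (type, value) pairs; spaces and tabs are skipped."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos] in " \t":
            pos += 1
            continue
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError("illegal character %r at %d" % (text[pos], pos))
        kind, value = match.lastgroup, match.group()
        if kind == "ID":
            # Look the word up by key (not in the dict's values).
            kind = RESERVED.get(value, "ID")
        tokens.append((kind, value))
        pos = match.end()
    return tokens
```

    In PLY itself, the analogous change would be to write DEPT_CODE and OR_CONJ as functions placed above t_ID, so that they are tried first.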


  • Measuring the performance of classification algorithm

    - by Silver Dragon
    I've got a classification problem on my hands, which I'd like to address with a machine learning algorithm (Bayes, or Markovian probably; the question is independent of the classifier to be used). Given a number of training instances, I'm looking for a way to measure the performance of an implemented classifier while taking the problem of overfitting into account.

    That is: given N[1..100] training samples, if I run the training algorithm on every one of the samples and use these very same samples to measure fitness, it might get stuck in an overfitting problem: the classifier will know the exact answers for the training instances without having much predictive power, rendering the fitness results useless.

    An obvious solution would be separating the hand-tagged samples into training and test samples, and I'd like to learn about methods for selecting the statistically significant samples for training.

    White papers, book pointers, and PDFs much appreciated!
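    A common way to avoid scoring a classifier on the data it was trained on is k-fold cross-validation: split the labelled samples into k parts, train on k-1 of them, test on the remaining one, and rotate. A minimal sketch follows, assuming samples are (features, label) pairs and that `train_fn` (a hypothetical callback) returns a function mapping features to a predicted label.

```python
import random


def k_fold_splits(samples, k=5, seed=0):
    """Yield (train, test) lists so that each sample is tested exactly once."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, test


def cross_validate(samples, train_fn, k=5):
    """Average accuracy over k folds; needs at least k samples."""
    scores = []
    for train, test in k_fold_splits(samples, k):
        classify = train_fn(train)  # the classifier never sees `test` here
        correct = sum(1 for features, label in test if classify(features) == label)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)
```

    Because every sample lands in a test fold exactly once, the averaged score reflects performance on unseen data rather than memorised training answers.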


  • Searching Natural Language Sentence Structure

    - by Cerin
    What's the best way to store and search a database of natural language sentence structure trees? Using OpenNLP's English Treebank Parser, I can get fairly reliable sentence structure parsings for arbitrary sentences. What I'd like to do is create a tool that can extract all the doc strings from my source code, generate these trees for all sentences in the doc strings, store these trees and their associated function name in a database, and then allow a user to search the database using natural language queries. So, given the sentence "This uploads files to a remote machine." for the function upload_files(), I'd have the tree: (TOP (S (NP (DT This)) (VP (VBZ uploads) (NP (NNS files)) (PP (TO to) (NP (DT a) (JJ remote) (NN machine)))) (. .))) If someone entered the query "How can I upload files?", equating to the tree: (TOP (SBARQ (WHADVP (WRB How)) (SQ (MD can) (NP (PRP I)) (VP (VB upload) (NP (NNS files)))) (. ?))) how would I store and query these trees in a SQL database? I've written a simple proof-of-concept script that can perform this search using a mix of regular expressions and network graph parsing, but I'm not sure how I'd implement this in a scalable way. And yes, I realize my example would be trivial to retrieve using a simple keyword search. The idea I'm trying to test is how I might take advantage of grammatical structure, so I can weed-out entries with similar keywords, but a different sentence structure. For example, with the above query, I wouldn't want to retrieve the entry associated with the sentence "Checks a remote machine to find a user that uploads files." which has similar keywords, but is obviously describing a completely different behavior.


  • POS tagger in SharpNLP

    - by C.
    I am using SharpNLP for my POS tagging:

        EnglishMaximumEntropyPosTagger posTagger = new EnglishMaximumEntropyPosTagger(mModelPath);
        String tagSentence = posTagger.TagSentence(question);

    I only have 3 tags. How can I load the Penn Treebank tag set, or some other treebank's tags, to use? Thanks :)


  • Python: How best to parse a simple grammar?

    - by Rosarch
    OK, so I've asked a bunch of smaller questions about this project, but I still don't have much confidence in the designs I'm coming up with, so I'm going to ask a question on a broader scale.

    I am parsing pre-requisite descriptions for a course catalog. The descriptions almost always follow a certain form, which makes me think I can parse most of them. From the text, I would like to generate a graph of course pre-requisite relationships. (That part will be easy, after I have parsed the data.)

    Some sample inputs and outputs:

        "CS 2110" => ("CS", 2110) # 0
        "CS 2110 and INFO 3300" => [("CS", 2110), ("INFO", 3300)] # 1
        "CS 2110, INFO 3300" => [("CS", 2110), ("INFO", 3300)] # 1
        "CS 2110, 3300, 3140" => [("CS", 2110), ("CS", 3300), ("CS", 3140)] # 1
        "CS 2110 or INFO 3300" => [[("CS", 2110)], [("INFO", 3300)]] # 2
        "MATH 2210, 2230, 2310, or 2940" => [[("MATH", 2210), ("MATH", 2230), ("MATH", 2310)], [("MATH", 2940)]] # 3

    # 0: If the entire description is just a course, it is output directly.
    # 1: If the courses are conjoined ("and"), they are all output in the same list.
    # 2: If the courses are disjoined ("or"), they are in separate lists.
    # 3: Here, we have both "and" and "or".

    One caveat that makes it easier: it appears that the nesting of "and"/"or" phrases is never greater than as shown in example 3.

    What is the best way to do this? I started with PLY, but I couldn't figure out how to resolve the reduce/reduce conflicts. The advantage of PLY is that it's easy to manipulate what each parse rule generates:

        def p_course(p):
            'course : DEPT_CODE COURSE_NUMBER'
            p[0] = (p[1], int(p[2]))

    With PyParse, it's less clear how to modify the output of parseString(). I was considering building upon @Alex Martelli's idea of keeping state in an object and building up the output from that, but I'm not sure exactly how that is best done.

        def addCourse(self, str, location, tokens):
            self.result.append((tokens[0][0], tokens[0][1]))

        def makeCourseList(self, str, location, tokens):
            dept = tokens[0][0]
            new_tokens = [(dept, tokens[0][1])]
            new_tokens.extend((dept, tok) for tok in tokens[1:])

            self.result.append(new_tokens)

    For instance, to handle "or" cases:

        def __init__(self):
            self.result = []
            # ...
            self.statement = (course_data + Optional(OR_CONJ + course_data)).setParseAction(self.disjunctionCourses)

        def disjunctionCourses(self, str, location, tokens):
            if len(tokens) == 1:
                return tokens

            print "disjunction tokens: %s" % tokens

    How does disjunctionCourses() know which smaller phrases to disjoin? All it gets is tokens, but what's been parsed so far is stored in result, so how can the function tell which data in result corresponds to which elements of tokens? I guess I could search through the tokens, then find an element of result with the same data, but that feels convoluted...

    What's a better way to approach this problem?
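    Since the nesting never goes deeper than example 3, one possible shortcut is to skip the parser generator and split the text by hand: first on "or", then scan each alternative for courses, letting bare numbers inherit the last department seen. The sketch below takes that swapped-in approach (plain `re`, not PLY or pyparsing); `parse_prereqs` and `COURSE` are hypothetical names, and it assumes 4-digit course numbers and upper-case department codes as in the samples.

```python
import re

# Optional department code followed by a 4-digit course number.
COURSE = re.compile(r"(?:([A-Z]{2,})\s+)?([0-9]{4})")


def parse_prereqs(text):
    """Parse a prerequisite string into the tuple/list shapes from the samples."""
    dept = None
    groups = []
    for part in re.split(r",?\s+or\s+", text):
        courses = []
        for code, number in COURSE.findall(part):
            dept = code or dept  # a bare number inherits the last department
            if dept is None:
                raise ValueError("course number without a department: %r" % text)
            courses.append((dept, int(number)))
        groups.append(courses)
    if len(groups) > 1:
        return groups        # "or": one list per alternative
    if len(groups[0]) == 1:
        return groups[0][0]  # a single course is returned directly
    return groups[0]         # "and" / comma: one flat list
```

    Free text such as "or equivalent experience" is not handled here; an alternative containing no course would come back as an empty list.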


  • Online job-searching is tedious. Help me automate it.

    - by ehsanul
    Many job sites have broken searches that don't let you narrow down jobs by experience level. Even when they do, it's usually wrong. This requires you to wade through hundreds of postings that you can't apply for before finding a relevant one, which is quite tedious. Since I'd rather focus on writing cover letters etc., I want to write a program to look through a large number of postings and save the URLs of just those jobs that don't require years of experience.

    I don't require help writing the scraper to get the HTML bodies of possibly relevant job posts. The issue is accurately detecting the level of experience required for the job. This should not be too difficult, as job posts are usually very explicit about this ("must have 5 years experience in..."), but there may be some issues with overly simple solutions.

    In my case, I'm looking for entry-level positions. Often they don't say "entry-level", but inclusion of the words probably means the job should be saved. Next, I can safely exclude a job that says it requires "5 years" of experience in whatever, so a regex like /\d\syears/ seems reasonable to exclude jobs. But then I realized some jobs say they'll take 0-2 years of experience, which matches the exclusion regex but is clearly a job I want to take a look at. Hmmm, I can handle that with another regex. But some say "less than 2 years" or "fewer than 2 years". I can handle that too, but it makes me wonder what other patterns I'm not thinking of, possibly excluding many jobs.

    That's what brings me here: to find a better way to do this than regexes, if there is one. I'd like to minimize the false negative rate and save all the jobs that seem like they might not require many years of experience. Does excluding anything that matches /[3-9]\syears|1\d\syears/ seem reasonable? Or is there a better way? Training a Bayesian filter, maybe?
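    One way to keep the regex approach manageable is to extract a number of years instead of matching an exclusion pattern directly: handle ranges and "less/fewer than" first, then plain "N years", and keep the posting when the smallest figure is low or when nothing is stated. A rough sketch follows; the names `min_years_required` and `keep_posting` and the `max_years=2` threshold are assumptions, and real postings will contain phrasings this does not cover.

```python
import re

RANGE = re.compile(r"(\d+)\s*(?:-|to)\s*(\d+)\+?\s*years?", re.I)
AT_MOST = re.compile(r"(?:less|fewer)\s+than\s+(\d+)\s*years?", re.I)
PLAIN = re.compile(r"(\d+)\+?\s*years?", re.I)


def min_years_required(posting):
    """Return the smallest experience figure found, or None if none is stated."""
    if re.search(r"entry[\s-]level", posting, re.I):
        return 0
    found = []
    for match in RANGE.finditer(posting):
        found.append(int(match.group(1)))   # "0-2 years" counts as 0
    rest = RANGE.sub(" ", posting)           # remove ranges before the next pass
    for _ in AT_MOST.finditer(rest):
        found.append(0)                      # "less than 2 years" counts as 0
    rest = AT_MOST.sub(" ", rest)
    for match in PLAIN.finditer(rest):
        found.append(int(match.group(1)))   # "5 years" counts as 5
    return min(found) if found else None


def keep_posting(posting, max_years=2):
    """Keep when in doubt, to keep the false negative rate low."""
    years = min_years_required(posting)
    return years is None or years <= max_years
```

    Taking the minimum is deliberately lenient: a posting asking for "5 years Java and 2 years SQL" is kept, which trades some false positives for fewer missed jobs.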


  • Keyword sorting algorithm

    - by Nai
    I have over 1000 surveys, many of which contain open-ended replies. I would like to be able to 'parse' all the words and get a ranking of the most used words (disregarding common words) to spot a trend. How can I do this? Is there a program I can use?

    EDIT: If a third-party solution is not available, it would be great if we could keep the discussion to Microsoft technologies only. Cheers.
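    The ranking itself is a word count with a stop-word filter, and the idea carries over to any platform. A minimal sketch in Python, purely for illustration: the tiny `STOPWORDS` set and the name `top_words` are assumptions, and a real stop-word list would be much longer.

```python
from collections import Counter
import re

# Deliberately tiny stop-word list for the sketch.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "is", "it", "in", "was", "i", "but"}


def top_words(responses, n=10, stopwords=STOPWORDS):
    """Return the n most frequent non-stop-words as (word, count) pairs."""
    counts = Counter()
    for text in responses:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in stopwords)
    return counts.most_common(n)
```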


  • A StringToken Parser which gives Google Search style "Did you mean:" Suggestions

    - by _ande_turner_
    Seeking a method to: Take whitespace separated tokens in a String; return a suggested Word ie: Google Search can take "fonetic wrd nterpreterr", and atop of the result page it shows "Did you mean: phonetic word interpreter" A solution in any of the C* languages or Java would be preferred. Are there any existing Open Libraries which perform such functionality? Or is there a way to Utilise a Google API to request a suggested word?
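    Given a word list, a simple offline approximation is to replace each unknown token with its closest dictionary word by string similarity. The sketch below uses Python's standard `difflib.get_close_matches` for illustration (the question asks for C-family languages or Java, where an edit-distance routine would play the same role); `did_you_mean` and the three-word vocabulary are hypothetical. It has no notion of word frequency or context, unlike a search engine's suggestions.

```python
import difflib


def did_you_mean(query, vocabulary, cutoff=0.6):
    """Replace each unknown token with its closest known word, if any."""
    known = set(vocabulary)
    suggestion = []
    for token in query.split():
        if token in known:
            suggestion.append(token)
            continue
        close = difflib.get_close_matches(token, vocabulary, n=1, cutoff=cutoff)
        suggestion.append(close[0] if close else token)  # keep token if no match
    return " ".join(suggestion)


print(did_you_mean("fonetic wrd nterpreterr", ["phonetic", "word", "interpreter"]))
# phonetic word interpreter
```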


  • Python: Trouble with YACC

    - by Rosarch
    I'm parsing sentences like:

        "CS 2310 or equivalent experience"

    The desired output:

        [[("CS", 2310)], ["equivalent experience"]]

    YACC tokenizer symbols:

        tokens = [
            'DEPT_CODE',
            'COURSE_NUMBER',
            'OR_CONJ',
            'MISC_TEXT',
        ]

        t_DEPT_CODE = r'[A-Z]{2,}'
        t_COURSE_NUMBER = r'[0-9]{4}'
        t_OR_CONJ = r'or'

        t_ignore = ' \t'

        terms = {'DEPT_CODE': t_DEPT_CODE,
                 'COURSE_NUMBER': t_COURSE_NUMBER,
                 'OR_CONJ': t_OR_CONJ}

        for name, regex in terms.items():
            terms[name] = "^%s$" % regex

        def t_MISC_TEXT(t):
            r'\S+'
            for name, regex in terms.items():
                # print "trying to match %s with regex %s" % (t.value, regex)
                if re.match(regex, t.value):
                    t.type = name
                    return t
            return t

    (MISC_TEXT is meant to match anything not caught by the other terms.)

    Some relevant rules from the parser:

        precedence = (
            ('left', 'MISC_TEXT'),
        )

        def p_statement_course_data(p):
            'statement : course_data'
            p[0] = p[1]

        def p_course_data(p):
            'course_data : course'
            p[0] = p[1]

        def p_course(p):
            'course : DEPT_CODE COURSE_NUMBER'
            p[0] = make_course(p[1], int(p[2]))

        def p_or_phrase(p):
            'or_phrase : statement OR_CONJ statement'
            p[0] = [[p[1]], [p[3]]]

        def p_misc_text(p):
            '''text_aggregate : MISC_TEXT MISC_TEXT
                              | MISC_TEXT text_aggregate
                              | text_aggregate MISC_TEXT '''
            p[0] = "%s %s" % (p[0], [1])

        def p_text_aggregate_statement(p):
            'statement : text_aggregate'
            p[0] = p[1]

    Unfortunately, this fails:

        # works as it should
        >>> token_list("CS 2110 or equivalent experience")
        [LexToken(DEPT_CODE,'CS',1,0), LexToken(COURSE_NUMBER,'2110',1,3), LexToken(OR_CONJ,'or',1,8), LexToken(MISC_TEXT,'equivalent',1,11), LexToken(MISC_TEXT,'experience',1,22)]

        # fails. bummer.
        >>> parser.parse("CS 2110 or equivalent experience")
        Syntax error in input: LexToken(MISC_TEXT,'equivalent',1,11)

    What am I doing wrong? I don't fully understand how to set precedence rules. Also, this is my error function:

        def p_error(p):
            print "Syntax error in input: %s" % p

    Is there a way to see which rule the parser was trying when it failed? Or some other way to make the parser print which rules it's trying?


  • Learning Objective-C: Need advice on populating NSMutableDictionary

    - by Zigrivers
    I am teaching myself Objective-C utilizing a number of resources, one of which is the Stanford iPhone Dev class available via iTunes U (past 2010 class). One of the homework assignments asked that I populate a mutable dictionary with a predefined list of keys and values (URLs). I was able to put the code together, but as I look at it, I keep thinking there is probably a much better way for me to approach what I'm trying to do:

    1. Populate a NSMutableDictionary with the predefined keys and values
    2. Enumerate through the keys of the dictionary and check each key to see if it starts with "Stanford"
    3. If it meets the criteria, log both the key and the value

    I would really appreciate any feedback on how I might improve on what I've put together. I'm the very definition of a beginner, but I'm really enjoying the challenge of learning Objective-C.

        void bookmarkDictionary () {
            NSMutableDictionary* bookmarks = [NSMutableDictionary dictionary];

            NSString* one = @"Stanford University",
                    *two = @"Apple",
                    *three = @"CS193P",
                    *four = @"Stanford on iTunes U",
                    *five = @"Stanford Mall";

            NSString* urlOne = @"http://www.stanford.edu",
                    *urlTwo = @"http://www.apple.com",
                    *urlThree = @"http://cs193p.stanford.edu",
                    *urlFour = @"http://itunes.stanford.edu",
                    *urlFive = @"http://stanfordshop.com";

            NSURL* oneURL = [NSURL URLWithString:urlOne];
            NSURL* twoURL = [NSURL URLWithString:urlTwo];
            NSURL* threeURL = [NSURL URLWithString:urlThree];
            NSURL* fourURL = [NSURL URLWithString:urlFour];
            NSURL* fiveURL = [NSURL URLWithString:urlFive];

            [bookmarks setObject:oneURL forKey:one];
            [bookmarks setObject:twoURL forKey:two];
            [bookmarks setObject:threeURL forKey:three];
            [bookmarks setObject:fourURL forKey:four];
            [bookmarks setObject:fiveURL forKey:five];

            NSString* akey;
            NSString* testString = @"Stanford";

            for (akey in bookmarks) {
                if ([akey hasPrefix:testString]) {
                    NSLog(@"Key: %@ URL: %@", akey, [bookmarks objectForKey:akey]);
                }
            }
        }

    Thanks for your help!


  • Agile methodologies. Is it a by-product of mind control techniques as NLP / Scientology?

    - by Bobb
    The more I read about contemporary methods combining Scrum, TDD and XP, the more I feel like I have already seen these methods. I am not disputing that the agile approach is much more progressive than older rigid structures like waterfall; what I am saying is that it seems to me that agile methodologies are ideal to be used as a nest for a brainwashing business. I read a few articles which kept referring to authors whom I checked afterwards, and they call themselves coaches or trainers (a usual thing when NLP specialists are involved) with no apparent software development history. Also, I met a guy who is a Scrum facilitator (a term widely used in relation to Scientology) in a high-profile company. I talked to him for less than 5 minutes, but I got the complete feeling that he is either on drugs or has been programmed by a powerful NLP specialist. The way he talked and his body movements suggested he is not an average normal person (in terms of normal distribution :))... Please don't get me wrong. I am not a fan of conspiracy theories. But I had an experience where a member of the Church of Scientology tried to invade a commercial firm and actually got halfway to the very top in just 3 weeks. I saw his work. For now, my complete impression is that psycho-manipulators are invading the IT industry through the convenient door of agile techniques. Does anyone have the same feeling/thoughts?


  • WhatApp?

    Web and mobile apps come under review on new Stanford site sport - Games - Video Games - Stanford University - Mobile


  • Classnotfound exception while running hadoop

    - by vana
    Hi, I am new to Hadoop. I have a file WordCount.java which refers to hadoop.jar and stanford-parser.jar. I am running the following commands:

        javac -classpath .:hadoop-0.20.1-core.jar:stanford-parser.jar -d ep WordCount.java
        jar cvf ep.jar -C ep .
        bin/hadoop jar ep.jar WordCount gutenburg gutenburg1

    After executing, I am getting the following error:

        lang.ClassNotFoundException: edu.stanford.nlp.parser.lexparser.LexicalizedParser

    The class is in stanford-parser.jar ... What can be the possible problem? Thanks


  • Ubuntu server has slow performance

    - by Rich
    I have a custom-built Ubuntu 11.04 server with a 6-disk software RAID 10 primary drive. On it I'm primarily running a PostgreSQL server and a few other utilities that stream data from the web. I often find that after a few hours of uptime the server starts to lag with all kinds of processes. For example, it may take 10-15 seconds after log-in to get a shell prompt. It might take 5-10 seconds for top to come up. An ls might take a second or two. When I look at top there is almost no CPU usage. There's a fair amount of memory used by the PostgreSQL server but not enough to bleed into swap. I have no idea where to go from here, other than to suspect the RAID 10 (I've only ever had software RAID 1s before).

    Edit: Output from top:

        top - 11:56:03 up 1:46, 3 users, load average: 0.89, 0.73, 0.72
        Tasks: 119 total, 1 running, 118 sleeping, 0 stopped, 0 zombie
        Cpu(s): 0.2%us, 0.0%sy, 0.0%ni, 93.5%id, 6.2%wa, 0.0%hi, 0.0%si, 0.0%st
        Mem: 16325596k total, 3478248k used, 12847348k free, 20880k buffers
        Swap: 19534176k total, 0k used, 19534176k free, 3041992k cached

         PID USER    PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+  COMMAND
        1747 woodsp  20   0  109m  10m 4888 S    1  0.1 0:42.70 python
         357 root    20   0     0    0    0 S    0  0.0 0:00.40 jbd2/sda3-8
           1 root    20   0 24324 2284 1344 S    0  0.0 0:00.84 init
           2 root    20   0     0    0    0 S    0  0.0 0:00.00 kthreadd
           3 root    20   0     0    0    0 S    0  0.0 0:00.24 ksoftirqd/0
           6 root    RT   0     0    0    0 S    0  0.0 0:00.00 migration/0
           7 root    RT   0     0    0    0 S    0  0.0 0:00.01 watchdog/0
           8 root    RT   0     0    0    0 S    0  0.0 0:00.00 migration/1
          10 root    20   0     0    0    0 S    0  0.0 0:00.02 ksoftirqd/1
          12 root    RT   0     0    0    0 S    0  0.0 0:00.01 watchdog/1
          13 root    RT   0     0    0    0 S    0  0.0 0:00.00 migration/2
          14 root    20   0     0    0    0 S    0  0.0 0:00.00 kworker/2:0
          15 root    20   0     0    0    0 S    0  0.0 0:00.00 ksoftirqd/2
          16 root    RT   0     0    0    0 S    0  0.0 0:00.01 watchdog/2
          17 root    RT   0     0    0    0 S    0  0.0 0:00.00 migration/3
          18 root    20   0     0    0    0 S    0  0.0 0:00.00 kworker/3:0
          19 root    20   0     0    0    0 S    0  0.0 0:00.02 ksoftirqd/3
          20 root    RT   0     0    0    0 S    0  0.0 0:00.01 watchdog/3
          21 root     0 -20     0    0    0 S    0  0.0 0:00.00 cpuset
          22 root     0 -20     0    0    0 S    0  0.0 0:00.00 khelper
          23 root    20   0     0    0    0 S    0  0.0 0:00.00 kdevtmpfs
          24 root     0 -20     0    0    0 S    0  0.0 0:00.00 netns
          26 root    20   0     0    0    0 S    0  0.0 0:00.00 sync_supers

    df -h

        rpsharp@ncp-skookum:~$ df -h
        Filesystem  Size  Used Avail Use% Mounted on
        /dev/sda3   1.8T  549G  1.2T  32% /
        udev        7.8G  4.0K  7.8G   1% /dev
        tmpfs       3.2G  492K  3.2G   1% /run
        none        5.0M     0  5.0M   0% /run/lock
        none        7.8G     0  7.8G   0% /run/shm
        /dev/sda2   952M  128K  952M   1% /boot/efi
        /dev/md0    5.5T  562G  4.7T  11% /usr/local

    free -m

        psharp@ncp-skookum:~$ free -m
                     total   used   free shared buffers cached
        Mem:         15942   3409  12533      0      20   2983
        -/+ buffers/cache:    405  15537
        Swap:        19076      0  19076

    tail -50 /var/log/syslog

        Jul 3 06:31:32 ncp-skookum rsyslogd: [origin software="rsyslogd" swVersion="5.8.6" x-pid="1070" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
        Jul 3 06:39:01 ncp-skookum CRON[14211]: (root) CMD ( [ -x /usr/lib/php5/maxlifetime ] && [ -d /var/lib/php5 ] && find /var/lib/php5/ -depth -mindepth 1 -maxdepth 1 -type f -cmin +$(/usr/lib/php5/maxlifetime) ! -execdir fuser -s {} 2>/dev/null \; -delete)
        Jul 3 06:40:01 ncp-skookum CRON[14223]: (smmsp) CMD (test -x /etc/init.d/sendmail && /usr/share/sendmail/sendmail cron-msp)
        Jul 3 07:00:01 ncp-skookum CRON[14328]: (woodsp) CMD (/home/woodsp/bin/mail_tweetupdate # email an update)
        Jul 3 07:00:01 ncp-skookum CRON[14327]: (smmsp) CMD (test -x /etc/init.d/sendmail && /usr/share/sendmail/sendmail cron-msp)
        Jul 3 07:00:28 ncp-skookum sendmail[14356]: q63E0SoZ014356: from=woodsp, size=2328, class=0, nrcpts=2, msgid=<201207031400.q63E0SoZ014356@ncp-skookum.Stanford.EDU>, relay=woodsp@localhost
        Jul 3 07:00:29 ncp-skookum sm-mta[14357]: q63E0Si6014357: from=<woodsp@ncp-skookum.Stanford.EDU>, size=2569, class=0, nrcpts=2, msgid=<201207031400.q63E0SoZ014356@ncp-skookum.Stanford.EDU>, proto=ESMTP, daemon=MTA-v4, relay=localhost [127.0.0.1]
        Jul 3 07:00:29 ncp-skookum sendmail[14356]: q63E0SoZ014356: to=Spencer Wood <[email protected]>,Martin Lacayo <[email protected]>, ctladdr=woodsp (1004/1005), delay=00:00:01, xdelay=00:00:01, mailer=relay, pri=62328, relay=[127.0.0.1] [127.0.0.1], dsn=2.0.0, stat=Sent (q63E0Si6014357 Message accepted for delivery)
        Jul 3 07:00:29 ncp-skookum sm-mta[14359]: STARTTLS=client, relay=mx3.stanford.edu., version=TLSv1/SSLv3, verify=FAIL, cipher=DHE-RSA-AES256-SHA, bits=256/256
        Jul 3 07:00:29 ncp-skookum sm-mta[14359]: q63E0Si6014357: to=<[email protected]>,<[email protected]>, ctladdr=<woodsp@ncp-skookum.Stanford.EDU> (1004/1005), delay=00:00:01, xdelay=00:00:00, mailer=esmtp, pri=152569, relay=mx3.stanford.edu. [171.67.219.73], dsn=2.0.0, stat=Sent (Ok: queued as 8F3505802AC)
        Jul 3 07:09:08 ncp-skookum CRON[14396]: (root) CMD ( [ -x /usr/lib/php5/maxlifetime ] && [ -d /var/lib/php5 ] && find /var/lib/php5/ -depth -mindepth 1 -maxdepth 1 -type f -cmin +$(/usr/lib/php5/maxlifetime) ! -execdir fuser -s {} 2>/dev/null \; -delete)
        Jul 3 07:17:01 ncp-skookum CRON[14438]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
        Jul 3 07:20:01 ncp-skookum CRON[14453]: (smmsp) CMD (test -x /etc/init.d/sendmail && /usr/share/sendmail/sendmail cron-msp)
        Jul 3 07:39:01 ncp-skookum CRON[14551]: (root) CMD ( [ -x /usr/lib/php5/maxlifetime ] && [ -d /var/lib/php5 ] && find /var/lib/php5/ -depth -mindepth 1 -maxdepth 1 -type f -cmin +$(/usr/lib/php5/maxlifetime) ! -execdir fuser -s {} 2>/dev/null \; -delete)
        Jul 3 07:40:01 ncp-skookum CRON[14562]: (smmsp) CMD (test -x /etc/init.d/sendmail && /usr/share/sendmail/sendmail cron-msp)
        Jul 3 08:00:01 ncp-skookum CRON[14668]: (smmsp) CMD (test -x /etc/init.d/sendmail && /usr/share/sendmail/sendmail cron-msp)
        Jul 3 08:09:01 ncp-skookum CRON[14724]: (root) CMD ( [ -x /usr/lib/php5/maxlifetime ] && [ -d /var/lib/php5 ] && find /var/lib/php5/ -depth -mindepth 1 -maxdepth 1 -type f -cmin +$(/usr/lib/php5/maxlifetime) ! -execdir fuser -s {} 2>/dev/null \; -delete)
        Jul 3 08:17:01 ncp-skookum CRON[14766]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
        Jul 3 08:20:01 ncp-skookum CRON[14781]: (smmsp) CMD (test -x /etc/init.d/sendmail && /usr/share/sendmail/sendmail cron-msp)
        Jul 3 08:39:01 ncp-skookum CRON[14881]: (root) CMD ( [ -x /usr/lib/php5/maxlifetime ] && [ -d /var/lib/php5 ] && find /var/lib/php5/ -depth -mindepth 1 -maxdepth 1 -type f -cmin +$(/usr/lib/php5/maxlifetime) ! -execdir fuser -s {} 2>/dev/null \; -delete)
        Jul 3 08:40:01 ncp-skookum CRON[14892]: (smmsp) CMD (test -x /etc/init.d/sendmail && /usr/share/sendmail/sendmail cron-msp)

    Output of hdparm -t /dev/sd{a,b,c,d,e,f}. This looks suspicious?

        /dev/sda: Timing buffered disk reads:   2 MB in 4.84 seconds = 423.39 kB/sec
        /dev/sdb: Timing buffered disk reads: 420 MB in 3.01 seconds = 139.74 MB/sec
        /dev/sdc: Timing buffered disk reads: 390 MB in 3.00 seconds = 129.87 MB/sec
        /dev/sdd: Timing buffered disk reads: 416 MB in 3.00 seconds = 138.51 MB/sec
        /dev/sde: Timing buffered disk reads: 422 MB in 3.00 seconds = 140.50 MB/sec
        /dev/sdf: Timing buffered disk reads: 416 MB in 3.01 seconds = 138.26 MB/sec


  • How to disable log4j logging from Java code

    - by Erel Segal Halevi
    I use a legacy library that writes logs using log4j. My default log4j.properties file directs the log to the console, but in some specific functions of my main program, I would like to disable logging altogether (from all classes). I tried this:

        Logger.getLogger(BasicImplementation.class.getName()).setLevel(Level.OFF);

    where "BasicImplementation" is one of the main classes that does logging, but it didn't work: the logs are still written to the console.

    Here is my log4j.properties:

        log4j.rootLogger=warn, stdout
        log4j.logger.ac.biu.nlp.nlp.engineml=info, logfile
        log4j.logger.org.BIU.utils.logging.ExperimentLogger=warn

        log4j.appender.stdout = org.apache.log4j.ConsoleAppender
        log4j.appender.stdout.layout = org.apache.log4j.PatternLayout
        log4j.appender.stdout.layout.ConversionPattern = %-5p %d{HH:mm:ss} [%t]: %m%n

        log4j.appender.logfile = ac.biu.nlp.nlp.log.BackupOlderFileAppender
        log4j.appender.logfile.append=false
        log4j.appender.logfile.layout = org.apache.log4j.PatternLayout
        log4j.appender.logfile.layout.ConversionPattern = %-5p %d{HH:mm:ss} [%t]: %m%n
        log4j.appender.logfile.File = logfile.log

