Search Results

Search found 168 results on 7 pages for 'computational linguistics'.

Page 1/7 | 1 2 3 4 5 6 7  | Next Page >

  • Extracting ""((Adj|Noun)+|((Adj|Noun)(Noun-Prep)?)(Adj|Noun))Noun"" from Text (Justeson & Katz, 1995)

    - by ssuhan
    I would like to query if it is possible to extract ((Adj|Noun)+|((Adj|Noun)(Noun-Prep)?)(Adj|Noun))Noun proposed by Justeson and Katz (1995) in R package openNLP? That is, I would like to use this linguistic filtering to extract candidate noun phrases. I cannot well understand its meaning. Could you do me a favor to explain it or transform such representation into R language. Many thanks. Maybe we can start the sample code from: library("openNLP") acq <- "This paper describes a novel optical thread plug gauge (OTPG) for internal thread inspection using machine vision. The OTPG is composed of a rigid industrial endoscope, a charge-coupled device camera, and a two degree-of-freedom motion control unit. A sequence of partial wall images of an internal thread are retrieved and reconstructed into a 2D unwrapped image. Then, a digital image processing and classification procedure is used to normalize, segment, and determine the quality of the internal thread." acqTag <- tagPOS(acq) acqTagSplit = strsplit(acqTag," ")

    Read the article

  • understanding semcor corpus structure h

    - by Sharmila
    I'm learning NLP. I currently playing with Word Sense Disambiguation. I'm planning to use the semcor corpus as training data but I have trouble understanding the xml structure. I tried googling but did not get any resource describing the content structure of semcor. <s snum="1"> <wf cmd="ignore" pos="DT">The</wf> <wf cmd="done" lemma="group" lexsn="1:03:00::" pn="group" pos="NNP" rdf="group" wnsn="1">Fulton_County_Grand_Jury</wf> <wf cmd="done" lemma="say" lexsn="2:32:00::" pos="VB" wnsn="1">said</wf> <wf cmd="done" lemma="friday" lexsn="1:28:00::" pos="NN" wnsn="1">Friday</wf> <wf cmd="ignore" pos="DT">an</wf> <wf cmd="done" lemma="investigation" lexsn="1:09:00::" pos="NN" wnsn="1">investigation</wf> <wf cmd="ignore" pos="IN">of</wf> <wf cmd="done" lemma="atlanta" lexsn="1:15:00::" pos="NN" wnsn="1">Atlanta</wf> <wf cmd="ignore" pos="POS">'s</wf> <wf cmd="done" lemma="recent" lexsn="5:00:00:past:00" pos="JJ" wnsn="2">recent</wf> <wf cmd="done" lemma="primary_election" lexsn="1:04:00::" pos="NN" wnsn="1">primary_election</wf> <wf cmd="done" lemma="produce" lexsn="2:39:01::" pos="VB" wnsn="4">produced</wf> <punc>``</punc> <wf cmd="ignore" pos="DT">no</wf> <wf cmd="done" lemma="evidence" lexsn="1:09:00::" pos="NN" wnsn="1">evidence</wf> <punc>''</punc> <wf cmd="ignore" pos="IN">that</wf> <wf cmd="ignore" pos="DT">any</wf> <wf cmd="done" lemma="irregularity" lexsn="1:04:00::" pos="NN" wnsn="1">irregularities</wf> <wf cmd="done" lemma="take_place" lexsn="2:30:00::" pos="VB" wnsn="1">took_place</wf> <punc>.</punc> </s> I'm assuming wnsn is 'word sense'. Is it correct? What does the attribute lexsn mean? How does it map to wordnet? What does the attribute pn refer to? (third line) How is the rdf attribute assigned? (again third line) In general, what are the possible attributes?

    Read the article

  • Computational complexity of Fibonacci Sequence

    - by Juliet
    I understand Big-O notation, but I don't know how to calculate it for many functions. In particular, I've been trying to figure out the computational complexity of the naive version of the Fibonacci sequence: int Fib(int n) { if (n <= 1) return 1; else return Fib(n - 1) + Fib(n - 2); } What is the computational complexity of the Fibonnaci sequence and how is it calculated?

    Read the article

  • Computational geometry: find where the triangle is after rotation, translation or reflection on a mi

    - by newba
    I have a small contest problem in which is given a set of points, in 2D, that form a triangle. This triangle may be subject to an arbitrary rotation, may be subject to an arbitrary translation (both in the 2D plane) and may be subject to a reflection on a mirror, but its dimensions were kept unchanged. Then, they give me a set of points in the plane, and I have to find 3 points that form my triangle after one or more of those geometric operations. Example: 5 15 8 5 20 10 6 5 17 5 20 20 5 10 5 15 20 15 10 I bet that have to apply some known algorithm, but I don't know which. The most common are: convex hull, sweep plane, triangulation, etc. Can someone give a tip? I don't need the code, only a push, please!

    Read the article

  • computational puzzles (brute force)

    - by acidzombie24
    Not that i need it but it was interesting to hear someone speak about their server and protecting it from DOS attack by having a puzzle that the client must solve before the server will do anything (it doesnt do allocations or make a session unless solved). The person also said puzzles can be made to take a quick amount of time or long. And they are easy to check if it is solve correctly but difficult to solve. What are these puzzles? I never heard of one. Can someone give an example (link?)

    Read the article

  • Requiring clients to solve computational puzzles...

    - by acidzombie24
    Not that I need it, but it was interesting to hear someone speak about their server and protecting it from DOS attack by having a puzzle that the client must solve before the server will do anything (it doesnt do allocations or make a session unless solved). The person also said puzzles can be made to take a quick amount of time or long. And they are easy to check for correct solutions but difficult to solve. What are these puzzles? I never heard of one. Can someone give an example (or a link)?

    Read the article

  • OpenGL - have object follow mouse

    - by kevin james
    I want to have an object follow around my mouse on the screen in OpenGL. (I am also using GLEW, GLFW, and GLM). The best idea I've come up with is: Get the coordinates within the window with glfwGetCursorPos. The window was created with window = glfwCreateWindow( 1024, 768, "Test", NULL, NULL); and the code to get coordinates is double xpos, ypos; glfwGetCursorPos(window, &xpos, &ypos); Next, I use GLM unproject, to get the coordinates in "object space" glm::vec4 viewport = glm::vec4(0.0f, 0.0f, 1024.0f, 768.0f); glm::vec3 pos = glm::vec3(xpos, ypos, 0.0f); glm::vec3 un = glm::unProject(pos, View*Model, Projection, viewport); There are two potential problems I can already see. The viewport is fine, as the initial x,y, coordinates of the lower left are indeed 0,0, and it's indeed a 1024*768 window. However, the position vector I create doesn't seem right. The Z coordinate should probably not be zero. However, glfwGetCursorPos returns 2D coordinates, and I don't know how to go from there to the 3D window coordinates, especially since I am not sure what the 3rd dimension of the window coordinates even means (since computer screens are 2D). Then, I am not sure if I am using unproject correctly. Assume the View, Model, Projection matrices are all OK. If I passed in the correct position vector in Window coordinates, does the unproject call give me the coordinates in Object coordinates? I think it does, but the documentation is not clear. Finally, to each vertex of the object I want to follow the mouse around, I just increment the x coordinate by un[0], the y coordinate by -un[1], and the z coordinate by un[2]. However, since my position vector that is being unprojected is likely wrong, this is not giving good results; the object does move as my mouse moves, but it is offset quite a bit (i.e. moving the mouse a lot doesn't move the object that much, and the z coordinate is very large). I actually found that the z coordinate un[2] is always the same value no matter where my mouse is, probably because the position vector I pass into unproject always has a value of 0.0 for z. Edit: The (incorrectly) unprojected x-values range from about -0.552 to 0.552, and the y-values from about -0.411 to 0.411.

    Read the article

  • Sphere-Sphere intersection and Circle-Sphere intersection

    - by cagirici
    I have code for circle-circle intersection. But I need to expand it to 3-D. How do I calculate: Radius and center of the intersection circle of two spheres Points of the intersection of a sphere and a circle? Given two spheres (sc0,sr0) and (sc1,sr1), I need to calculate a circle of intersection whose center is ci and whose radius is ri. Moreover, given a sphere (sc0,sr0) and a circle (cc0, cr0), I need to calulate the two intersection points (pi0, pi1) I have checked this link and this link, but I could not understand the logic behind them and how to code them. I tried ProGAL library for sphere-sphere-sphere intersection, but the resulting coordinates are rounded. I need precise results.

    Read the article

  • Scanline filling of polygons that share edges and vertices

    - by Belgin
    In this picture (a perspective projection of an icosahedron), the scanline (red) intersects that vertex at the top. In an icosahedron each edge belongs to two triangles. From edge a, only one triangle is visible, the other one is in the back. Same for edge d. Also, in order to determine what color the current pixel should be, each polygon has a flag which can either be 'in' or 'out', depending upon where on the scanline we currently are. Flags are flipped according to the intersection of the scanline with the edges. Now, as we go from a to d (because all edges are intersected with the scanline at that vertex), this happens: the triangle behind triangle 1 and triangle 1 itself are set 'in', then 2 is set in and 1 is 'out', then 3 is set 'in', 2 is 'out' and finally 3 is 'out' and the one behind it is set 'in', which is not the desired behavior because we only need the triangles which are facing us to be set 'in', the rest should be 'out'. How do process the edges in the Active Edge List (a list of edges that are currently intersected by the scanline) so the right polys are set 'in'? Also, I should mention that the edges are unique, which means there exists an array of edges in the data structure of the icosahedron which are pointed to by edge pointers in each of the triangles.

    Read the article

  • Annoying flickering of vertices and edges (possible z-fighting)

    - by Belgin
    I'm trying to make a software z-buffer implementation, however, after I generate the z-buffer and proceed with the vertex culling, I get pretty severe discrepancies between the vertex depth and the depth of the buffer at their projected coordinates on the screen (i.e. zbuffer[v.xp][v.yp] != v.z, where xp and yp are the projected x and y coordinates of the vertex v), sometimes by a small fraction of a unit and sometimes by 2 or 3 units. Here's what I think is happening: Each triangle's data structure holds the plane's (that is defined by the triangle) coefficients (a, b, c, d) computed from its three vertices from their normal: void computeNormal(Vertex *v1, Vertex *v2, Vertex *v3, double *a, double *b, double *c) { double a1 = v1 -> x - v2 -> x; double a2 = v1 -> y - v2 -> y; double a3 = v1 -> z - v2 -> z; double b1 = v3 -> x - v2 -> x; double b2 = v3 -> y - v2 -> y; double b3 = v3 -> z - v2 -> z; *a = a2*b3 - a3*b2; *b = -(a1*b3 - a3*b1); *c = a1*b2 - a2*b1; } void computePlane(Poly *p) { double x = p -> verts[0] -> x; double y = p -> verts[0] -> y; double z = p -> verts[0] -> z; computeNormal(p -> verts[0], p -> verts[1], p -> verts[2], &p -> a, &p -> b, &p -> c); p -> d = p -> a * x + p -> b * y + p -> c * z; } The z-buffer just holds the smallest depth at the respective xy coordinate by somewhat casting rays to the polygon (I haven't quite got interpolation right yet so I'm using this slower method until I do) and determining the z coordinate from the reversed perspective projection formulas (which I got from here: double z = -(b*Ez*y + a*Ez*x - d*Ez)/(b*y + a*x + c*Ez - b*Ey - a*Ex); Where x and y are the pixel's coordinates on the screen; a, b, c, and d are the planes coefficients; Ex, Ey, and Ez are the eye's (camera's) coordinates. This last formula does not accurately give the exact vertices' z coordinate at their projected x and y coordinates on the screen, probably because of some floating point inaccuracy (i.e. I've seen it return something like 3.001 when the vertex's z-coordinate was actually 2.998). Here is the portion of code that hides the vertices that shouldn't be visible: for(i = 0; i < shape.nverts; ++i) { double dist = shape.verts[i].z; if(z_buffer[shape.verts[i].yp][shape.verts[i].xp].z < dist) shape.verts[i].visible = 0; else shape.verts[i].visible = 1; } How do I solve this issue? EDIT I've implemented the near and far planes of the frustum, with 24 bit accuracy, and now I have some questions: Is this what I have to do this in order to resolve the flickering? When I compare the z value of the vertex with the z value in the buffer, do I have to convert the z value of the vertex to z' using the formula, or do I convert the value in the buffer back to the original z, and how do I do that? What are some decent values for near and far? Thanks in advance.

    Read the article

  • Fast software color interpolating triangle rasterization technique

    - by Belgin
    I'm implementing a software renderer with this rasterization method, however, I was wondering if there is a possibility to improve it, or if there exists an alternative technique that is much faster. I'm specifically interested in rendering small triangles, like the ones from this 100k poly dragon: As you can see, the method I'm using is not perfect either, as it leaves small gaps from time to time (at least I think that's what's happening). I don't mind using assembly optimizations. Pseudocode or actual code (C/C++ or similar) is appreciated. Thanks in advance.

    Read the article

  • An implementation of Sharir's or Aurenhammer's deterministic algorithm for calculating the intersect

    - by RGrey
    The problem of finding the intersection/union of 'N' discs/circles on a flat plane was first proposed by M. I. Shamos in his 1978 thesis: Shamos, M. I. “Computational Geometry” Ph.D. thesis, Yale Univ., New Haven, CT 1978. Since then, in 1985, Micha Sharir presented an O(n log2n) time and O(n) space deterministic algorithm for the disc intersection/union problem (based on modified Voronoi diagrams): Sharir, M. Intersection and closest-pair problems for a set of planar discs. SIAM .J Comput. 14 (1985), pp. 448-468. In 1988, Franz Aurenhammer presented a more efficient O(n log n) time and O(n) space algorithm for circle intersection/union using power diagrams (generalizations of Voronoi diagrams): Aurenhammer, F. Improved algorithms for discs and balls using power diagrams. Journal of Algorithms 9 (1985), pp. 151-161. Earlier in 1983, Paul G. Spirakis also presented an O(n^2) time deterministic algorithm, and an O(n) probabilistic algorithm: Spirakis, P.G. Very Fast Algorithms for the Area of the Union of Many Circles. Rep. 98, Dept. Comput. Sci., Courant Institute, New York University, 1983. I've been searching for any implementations of the algorithms above, focusing on computational geometry packages, and I haven't found anything yet. As neither appear trivial to put into practice, it would be really neat if someone could point me in the right direction!

    Read the article

  • Getting started with character and text processing (encoding, regular expressions)

    - by TK
    I'd like to learn foundations of encodings, characters and text. Understanding these is important for dealing with a large set of text whether that are log files or text source for building algorithms for collective intelligence. My current knowledge is pretty basic: something like "As long as I use UTF-8, I'm okay." I don't say I need to learn about advanced topics right away. But I need to know: Bit and bytes level knowledge of encodings. Characters and alphabets not used in English. Multi-byte encodings. (I understand some Chinese and Japanese. And parsing them is important.) Regular expressions. Algorithm for text processing. Parsing natural languages. I also need an understanding of mathematics and corpus linguistics. The current and future web (semantic, intelligent, real-time web) needs processing, parsing and analyzing large text. I'm looking for some resources (maybe books?) that get me started with some of the bullets. (I find many helpful discussion on regular expressions here on Stack Overflow. So, you don't need to suggest resources on that topic.)

    Read the article

  • Compose synthetic English phrase that would contain 160 bits of recoverable information

    - by Alexander Gladysh
    I have 160 bits of random data. Just for fun, I want to generate pseudo-English phrase to "store" this information in. I want to be able to recover this information from the phrase. Note: This is not a security question, I don't care if someone else will be able to recover the information or even detect that it is there or not. Criteria for better phrases, from most important to the least: Short Unique Natural-looking The current approach, suggested here: Take three lists of 1024 nouns, verbs and adjectives each (picking most popular ones). Generate a phrase by the following pattern, reading 20 bits for each word: Noun verb adjective verb, Noun verb adjective verb, Noun verb adjective verb, Noun verb adjective verb. Now, this seems to be a good approach, but the phrase is a bit too long and a bit too dull. I have found a corpus of words here (Part of Speech Database). After some ad-hoc filtering, I calculated that this corpus contains, approximately 50690 usable adjectives 123585 nouns 15301 verbs This allows me to use up to 16 bits per adjective (actually 16.9, but I can't figure how to use fractional bits) 15 bits per noun 13 bits per verb For noun-verb-adjective-verb pattern this gives 57 bits per "sentence" in phrase. This means that, if I'll use all words I can get from this corpus, I can generate three sentences instead of four (160 / 57 ˜ 2.8). Noun verb adjective verb, Noun verb adjective verb, Noun verb adjective verb. Still a bit too long and dull. Any hints how can I improve it? What I see that I can try: Try to compress my data somehow before encoding. But since the data is completely random, only some phrases would be shorter (and, I guess, not by much). Improve phrase pattern, so it would look better. Use several patterns, using the first word in phrase to somehow indicate for future decoding which pattern was used. (For example, use the last letter or even the length of the word.) Pick pattern according to the first bytes of the data. ...I'm not that good with English to come up with better phrase patterns. Any suggestions? Use more linguistics in the pattern. Different tenses etc. ...I guess, I would need much better word corpus than I have now for that. Any hints where can I get a suitable one?

    Read the article

  • When can a freely moving sphere escape from a ‘cage’ defined by a set of impassible coordinates?

    - by RGrey
    Hopefully there are some computational geometry folks here who can help me out with the following problem - Please imagine that I take a freely moving ball in 3-space and create a 'cage' around it by defining a set of impassible coordinates, Sc (i.e. points in 3-space that no part of the diffusing ball is allowed to overlap). These points reside within the volume, V(cage), of some larger sphere, where V(cage) V(ball). Provided the set of impassible coordinates, Sc, is there a computationally efficient and/or nice way to determine if the ball can ever escape the cage?

    Read the article

  • Nesting maximum amount of shapes on a surface

    - by Fuu
    In industry, there is often a problem where you need to calculate the most efficient use of material, be it fabric, wood, metal etc. So the starting point is X amount of shapes of given dimensions, made out of polygons and/or curved lines, and target is another polygon of given dimensions. I assume many of the current CAM suites implement this, but having no experience using them or of their internals, what kind of computational algorithm is used to find the most efficient use of space? Can someone point me to a book or other reference that discusses this topic?

    Read the article

  • Fitting maximum amount of shapes on a surface

    - by Fuu
    In industry, there is often a problem where you need to calculate the most efficient use of material, be it fabric, wood, metal etc. So the starting point is X amount of shapes of given dimensions, made out of polygons and/or curved lines, and target is another polygon of given dimensions. I assume many of the current CAM suites implement this, but having no experience using them or of their internals, what kind of computational algorithm is used to find the most efficient use of space? Can someone point me to a book or other reference that discusses this subject?

    Read the article

  • How to make concept representation with the help of bag of words

    - by agazerboy
    Hi All, Thanks for stoping to read my question :) this is very sweet place full of GREAT peoples ! I have a question about "creating sentences with words". NO NO it is not about english grammar :) Let me explain, If I have bag of words like "person apple apple person person a eat person will apple eat hungry apple hungry" and it can generate some kind of following sentence "hungry person eat apple" I don't in which field this topic will relate. Where should I try to find an answer. I tried to search google but I only found english grammar stuff :) Any body there who can tell me which algo can work in this problem? or any program Thanks P.S: It is not an assignment :) if it would be i would ask for source code ! I don't even know in which field I should look for :)

    Read the article

  • RDF of sentences

    - by Lily
    Hi, I need to classify sentences as a RDF format. In other words "John likes coke" would be automatically represented as Subject : John Predicate : Likes Object : Coke does nyone know where I should start? Are there any programs which can do this automatically or would I need to do everything from scratch? Any help would be appreciated thanks!

    Read the article

  • Java: remove-common-words-method in the API?

    - by HH
    Related: Forum post Before reinventing the wheel, I need to know whether such method exists. Stripping words according to a list such as list does not sound challenging but there are linguistic aspects, such as which words to stress the most in stripping, how about context?

    Read the article

  • Natural language grammar and user-entered names

    - by Owen Blacker
    Some languages, particularly Slavic languages, change the endings of people's names according to the grammatical context. (For those of you who know grammar or studied languages that do this to words, such as German or Russian, and to help with search keywords, I'm talking about noun declension.) This is probably easiest with a set of examples (in Polish, to save the whole different-alphabet problem): Dorothy saw the cat — Dorota zobaczyla kota The cat saw Dorothy — Kot zobaczyl Dorote It is Dorothy’s cat — To jest kot Doroty I gave the cat to Dorothy — Dalam kota Dorotie I went for a walk with Dorothy — Poszlam na spacer z Dorota “Hello, Dorothy!” — “Witam, Doroto!” Now, if, in these examples, the name here were to be user-entered, that introduces a world of grammar nightmares. Importantly, if I went for Katie (Kasia), the examples are not directly comparable — 3 and 4 are both Kasi, rather than *Kasy and *Kasie — and male names will be wholly different again. I'm guessing someone has dealt with this situation before, but my Google-fu appears to be weak today. I can find a lot of links about natural-language processing, but I don'think that's quite what I want. To be clear: I'm only ever gonna have one user-entered name per user and I'm gonna need to decline them into known configurations — I'll have a localised text that will have placeholders something like {name nominative} and {name dative}, for the sake of argument. I really don't want to have to do lexical analysis of text to work stuff out, I'll only ever need to decline that one user-entered name. Anyone have any recommendations on how to do this, or do I need to start calling round localisation agencies ;o) Further reading (all on Wikipedia) for the interested: Declension Grammatical case Declension in Polish Declension in Russian Declension in Czech nouns and pronouns Disclaimer: I know this happens in many other languages; highlighting Slavic languages is merely because I have a project that is going to be localised into some Slavic languages.

    Read the article

  • Dual-line bilingual paragraph in LaTeX

    - by D W
    An interlinear gloss can be used to layout a translation of a document. http://en.wikipedia.org/wiki/Interlinear_gloss Usually this is done word-by-word or morpheme-by-morpheme. However, I would like to do this in a different way, translating entire paragraphs at a time. http://www.optimnem.co.uk/learning/spanish/three-little-pigs.php For now I am not interested in taking into account the order of words or phrases that change order between languages. That is, I don't mind if the words in the paragraph are not aligned or if the length of one paragraph is much longer than the other, causing an overhanging line. As far as I can tell, the following packages do not meet my needs: covingtn.sty cgloss4e.sty gb4e.sty lingmacros.sty - shortex

    Read the article

  • Theory: "Lexical Encoding"

    - by _ande_turner_
    I am using the term "Lexical Encoding" for my lack of a better one. A Word is arguably the fundamental unit of communication as opposed to a Letter. Unicode tries to assign a numeric value to each Letter of all known Alphabets. What is a Letter to one language, is a Glyph to another. Unicode 5.1 assigns more than 100,000 values to these Glyphs currently. Out of the approximately 180,000 Words being used in Modern English, it is said that with a vocabulary of about 2,000 Words, you should be able to converse in general terms. A "Lexical Encoding" would encode each Word not each Letter, and encapsulate them within a Sentence. // An simplified example of a "Lexical Encoding" String sentence = "How are you today?"; int[] sentence = { 93, 22, 14, 330, QUERY }; In this example each Token in the String was encoded as an Integer. The Encoding Scheme here simply assigned an int value based on generalised statistical ranking of word usage, and assigned a constant to the question mark. Ultimately, a Word has both a Spelling & Meaning though. Any "Lexical Encoding" would preserve the meaning and intent of the Sentence as a whole, and not be language specific. An English sentence would be encoded into "...language-neutral atomic elements of meaning ..." which could then be reconstituted into any language with a structured Syntactic Form and Grammatical Structure. What are other examples of "Lexical Encoding" techniques? If you were interested in where the word-usage statistics come from : http://www.wordcount.org

    Read the article

1 2 3 4 5 6 7  | Next Page >