Search Results

Search found 18246 results on 730 pages for 'language processing'.

Page 113/730 | < Previous Page | 109 110 111 112 113 114 115 116 117 118 119 120  | Next Page >

  • Replacing all GUIDs in a file with new GUIDs from the command line

    - by Josh Petrie
    I have a file containing a large number of occurrences of the string Guid="GUID HERE" (where GUID HERE is a unique GUID at each occurrence) and I want to replace every existing GUID with a new unique GUID. This is on a Windows development machine, so I can generate unique GUIDs with uuidgen.exe (which produces a GUID on stdout every time it is run). I have sed and such available (but no awk oddly enough). I am basically trying to figure out if it is possible (and if so, how) to use the output of a command-line program as the replacement text in a sed substitution expression so that I can make this replacement with a minimum of effort on my part. I don't need to use sed -- if there's another way to do it, such as some crazy vim-fu or some other program, that would work as well -- but I'd prefer solutions that utilize a minimal set of *nix programs since I'm not really on *nix machines. To be clear, if I have a file like this: etc etc Guid="A" etc etc Guid="B" I would like it to become this: etc etc Guid="C" etc etc Guid="D" where A, B, C, D are actual GUIDs, of course. (for example, I have seen xargs used for things similar to this, but it's not available on the machines I need this to run on, either. I could install it if it's really the only way, although I'd rather not)

    Read the article

  • How to write efficient code for extracting Noun phrases?

    - by Arun Abraham
    I am trying to extract phrases using rules such as the ones mentioned below on text which has been POS tagged 1) NNP - NNP (- indicates followed by) 2) NNP - CC - NNP 3) VP - NP etc.. I have written code in this manner, Can someone tell me how i can do in a better manner. List<String> nounPhrases = new ArrayList<String>(); for (List<HasWord> sentence : documentPreprocessor) { //System.out.println(sentence.toString()); System.out.println(Sentence.listToString(sentence, false)); List<TaggedWord> tSentence = tagger.tagSentence(sentence); String lastTag = null, lastWord = null; for (TaggedWord taggedWord : tSentence) { if (lastTag != null && taggedWord.tag().equalsIgnoreCase("NNP") && lastTag.equalsIgnoreCase("NNP")) { nounPhrases.add(taggedWord.word() + " " + lastWord); //System.out.println(taggedWord.word() + " " + lastWord); } lastTag = taggedWord.tag(); lastWord = taggedWord.word(); } } In the above code, i have done only for NNP followed by NNP extraction, how can i generalise it so that i can add other rules too. I know that there are libraries available for doing this , but wanted to do this manually.

    Read the article

  • How would you go about tackling this problem? [SOLVED in C++]

    - by incrediman
    Intro: EDIT: See solution at the bottom of this question (c++) I have a programming contest coming up in about half a week, and I've been prepping :) I found a bunch of questions from this canadian competition, they're great practice: http://cemc.math.uwaterloo.ca/contests/computing/2009/stage2/day1.pdf I'm looking at problem B ("Dinner"). Any idea where to start? I can't really think of anything besides the naive approach (ie. trying all permutations) which would take too long to be a valid answer. Btw, the language there says c++ and pascal I think, but i don't care what language you use - I mean really all I want is a hint as to the direction I should proceed in, and perhpas a short explanation to go along with it. It feels like I'm missing something obvious... Of course extended speculation is more than welcome, but I just wanted to clarify that I'm not looking for a full solution here :) Short version of the question: You have a binary string N of length 1-100 (in the question they use H's and G's instead of one's and 0's). You must remove all of the digits from it, in the least number of steps possible. In each step you may remove any number of adjacent digits so long as they are the same. That is, in each step you can remove any number of adjacent G's, or any number of adjacent H's, but you can't remove H's and G's in one step. Example: HHHGHHGHH Solution to the example: 1. HHGGHH (remove middle Hs) 2. HHHH (remove middle Gs) 3. Done (remove Hs) -->Would return '3' as the answer. Note that there can also be a limit placed on how large adjacent groups have to be when you remove them. For example it might say '2', and then you can't remove single digits (you'd have to remove pairs or larger groups at a time). Solution I took Mark Harrison's main algorithm, and Paradigm's grouping idea and used them to create the solution below. You can try it out on the official test cases if you want. //B.cpp //include debug messages? #define DEBUG false #include <iostream> #include <stdio.h> #include <vector> using namespace std; #define FOR(i,n) for (int i=0;i<n;i++) #define FROM(i,s,n) for (int i=s;i<n;i++) #define H 'H' #define G 'G' class String{ public: int num; char type; String(){ type=H; num=0; } String(char type){ this->type=type; num=1; } }; //n is the number of bits originally in the line //k is the minimum number of people you can remove at a time //moves is the counter used to determine how many moves we've made so far int n, k, moves; int main(){ /*Input from File*/ scanf("%d %d",&n,&k); char * buffer = new char[200]; scanf("%s",buffer); /*Process input into a vector*/ //the 'line' is a vector of 'String's (essentially contigious groups of identical 'bits') vector<String> line; line.push_back(String()); FOR(i,n){ //if the last String is of the correct type, simply increment its count if (line.back().type==buffer[i]) line.back().num++; //if the last String is of the wrong type but has a 0 count, correct its type and set its count to 1 else if (line.back().num==0){ line.back().type=buffer[i]; line.back().num=1; } //otherwise this is the beginning of a new group, so create the new group at the back with the correct type, and a count of 1 else{ line.push_back(String(buffer[i])); } } /*Geedily remove groups until there are at most two groups left*/ moves=0; int I;//the position of the best group to remove int bestNum;//the size of the newly connected group the removal of group I will create while (line.size()>2){ /*START DEBUG*/ if (DEBUG){ cout<<"\n"<<moves<<"\n----\n"; FOR(i,line.size()) printf("%d %c \n",line[i].num,line[i].type); cout<<"----\n"; } /*END DEBUG*/ I=1; bestNum=-1; FROM(i,1,line.size()-1){ if (line[i-1].num+line[i+1].num>bestNum && line[i].num>=k){ bestNum=line[i-1].num+line[i+1].num; I=i; } } //remove the chosen group, thus merging the two adjacent groups line[I-1].num+=line[I+1].num; line.erase(line.begin()+I);line.erase(line.begin()+I); moves++; } /*START DEBUG*/ if (DEBUG){ cout<<"\n"<<moves<<"\n----\n"; FOR(i,line.size()) printf("%d %c \n",line[i].num,line[i].type); cout<<"----\n"; cout<<"\n\nFinal Answer: "; } /*END DEBUG*/ /*Attempt the removal of the last two groups, and output the final result*/ if (line.size()==2 && line[0].num>=k && line[1].num>=k) cout<<moves+2;//success else if (line.size()==1 && line[0].num>=k) cout<<moves+1;//success else cout<<-1;//not everyone could dine. /*START DEBUG*/ if (DEBUG){ cout<<" moves."; } /*END DEBUG*/ }

    Read the article

  • C++ converting binary(P5) image to ascii(P2) image (.pgm)

    - by tubby
    I am writing a simple program to convert grayscale binary (P5) to grayscale ascii (P2) but am having trouble reading in the binary and converting it to int. #include <iostream> #include <fstream> #include <sstream> using namespace::std; int usage(char* arg) { // exit program cout << arg << ": Error" << endl; return -1; } int main(int argc, char* argv[]) { int rows, cols, size, greylevels; string filetype; // open stream in binary mode ifstream istr(argv[1], ios::in | ios::binary); if(istr.fail()) return usage(argv[1]); // parse header istr >> filetype >> rows >> cols >> greylevels; size = rows * cols; // check data cout << "filetype: " << filetype << endl; cout << "rows: " << rows << endl; cout << "cols: " << cols << endl; cout << "greylevels: " << greylevels << endl; cout << "size: " << size << endl; // parse data values int* data = new int[size]; int fail_tracker = 0; // find which pixel failing on for(int* ptr = data; ptr < data+size; ptr++) { char t_ch; // read in binary char istr.read(&t_ch, sizeof(char)); // convert to integer int t_data = static_cast<int>(t_ch); // check if legal pixel if(t_data < 0 || t_data > greylevels) { cout << "Failed on pixel: " << fail_tracker << endl; cout << "Pixel value: " << t_data << endl; return usage(argv[1]); } // if passes add value to data array *ptr = t_data; fail_tracker++; } // close the stream istr.close(); // write a new P2 binary ascii image ofstream ostr("greyscale_ascii_version.pgm"); // write header ostr << "P2 " << rows << cols << greylevels << endl; // write data int line_ctr = 0; for(int* ptr = data; ptr < data+size; ptr++) { // print pixel value ostr << *ptr << " "; // endl every ~20 pixels for some readability if(++line_ctr % 20 == 0) ostr << endl; } ostr.close(); // clean up delete [] data; return 0; } sample image - Pulled this from an old post. Removed the comment within the image file as I am not worried about this functionality now. When compiled with g++ I get output: $> ./a.out a.pgm filetype: P5 rows: 1024 cols: 768 greylevels: 255 size: 786432 Failed on pixel: 1 Pixel value: -110 a.pgm: Error The image is a little duck and there's no way the pixel value can be -110...where am I going wrong? Thanks.

    Read the article

  • How to printf a time_t variable as a floating point number?

    - by soneangel
    Hi guys, I'm using a time_t variable in C (openMP enviroment) to keep cpu execution time...I define a float value sum_tot_time to sum time for all cpu's...I mean sum_tot_time is the sum of cpu's time_t values. The problem is that printing the value sum_tot_time it appear as an integer or long, by the way without its decimal part! I tried in these ways: to printf sum_tot_time as a double being a double value to printf sum_tot_time as float being a float value to printf sum_tot_time as double being a time_t value to printf sum_tot_time as float being a time_t value Please help me!!

    Read the article

  • What is the fastest way to find duplicates in multiple BIG txt files?

    - by user2950750
    I am really in deep water here and I need a lifeline. I have 10 txt files. Each file has up to 100.000.000 lines of data. Each line is simply a number representing something else. Numbers go up to 9 digits. I need to (somehow) scan these 10 files and find the numbers that appear in all 10 files. And here comes the tricky part. I have to do it in less than 2 seconds. I am not a developer, so I need an explanation for dummies. I have done enough research to learn that hash tables and map reduce might be something that I can make use of. But can it really be used to make it this fast, or do I need more advanced solutions? I have also been thinking about cutting up the files into smaller files. To that 1 file with 100.000.000 lines is transformed into 100 files with 1.000.000 lines. But I do not know what is best: 10 files with 100 million lines or 1000 files with 1 million lines? When I try to open the 100 million line file, it takes forever. So I think, maybe, it is just too big to be used. But I don't know if you can write code that will scan it without opening. Speed is the most important factor in this, and I need to know if it can be done as fast as I need it, or if I have to store my data in another way, for example, in a database like mysql or something. Thank you in advance to anybody that can give some good feedback.

    Read the article

  • Detecting regular expression in content during parse

    - by sonofdelphi
    I am writing a parser for C. I was just running it with some other language files (for fun, to see the extent C-likeness). It breaks down if the code being parsed contains regular expressions... Case 1: For example, while parsing the JavaScript code snippet, var phone="(304)434-5454" phone=phone.replace(/[\(\)-]/g, "") //Returns "3044345454" (removes "(", ")", and "-") The '(', '[' etc get matched as starters of new scopes, which may never be closed. Case 2: And, for the Perl code snippet, # Replace backslashes with two forward slashes # Any character can be used to delimit the regex $FILE_PATH =~ s@\\@//@g; The // gets matched as a comment... How can I detect a regular expression within the content text of a "C-like" program-file?

    Read the article

  • What hash algorithms are paralellizable? Optimizing the hashing of large files utilizing on mult-co

    - by DanO
    I'm interested in optimizing the hashing of some large files (optimizing wall clock time). The I/O has been optimized well enough already and the I/O device (local SSD) is only tapped at about 25% of capacity, while one of the CPU cores is completely maxed-out. I have more cores available, and in the future will likely have even more cores. So far I've only been able to tap into more cores if I happen to need multiple hashes of the same file, say an MD5 AND a SHA256 at the same time. I can use the same I/O stream to feed two or more hash algorithms, and I get the faster algorithms done for free (as far as wall clock time). As I understand most hash algorithms, each new bit changes the entire result, and it is inherently challenging/impossible to do in parallel. Are any of the mainstream hash algorithms parallelizable? Are there any non-mainstream hashes that are parallelizable (and that have at least a sample implementation available)? As future CPUs will trend toward more cores and a leveling off in clock speed, is there any way to improve the performance of file hashing? (other than liquid nitrogen cooled overclocking?) or is it inherently non-parallelizable?

    Read the article

  • Handling Corrupted JPEGs in C#

    - by ddango
    We have a process that pulls images from a remote server. Most of the time, we're good to go, the images are valid, we don't timeout, etc. However, every once and awhile we see this error similar to this: Unhandled Exception: System.Runtime.InteropServices.ExternalException: A generic error occurred in GDI+. at System.Drawing.Image.Save(Stream stream, ImageCodecInfo encoder, EncoderPa rameters encoderParams) at ConsoleApplication1.Program.Main(String[] args) in C:\images\ConsoleApplic ation1\ConsoleApplication1\Program.cs:line 24 After not being able to reproduce it locally, we looked closer at the image, and realized that there were artifacts, making us suspect corruption. Created an ugly little unit test with only the image in question, and was unable to reproduce the error on Windows 7 as was expected. But after running our unit test on Windows Server 2008, we see this error every time. Is there a way to specify non-strictness for jpegs when writing them? Some sort of check/fix we can use? Unit test snippet: var r = ReadFile("C:\\images\\ConsoleApplication1\\test.jpg"); using (var imgStream = new MemoryStream(r)) { using (var ms = new MemoryStream()) { var guid = Guid.NewGuid(); var fileName = "C:\\images\\ConsoleApplication1\\t" + guid + ".jpg"; Image.FromStream(imgStream).Save(ms, ImageFormat.Jpeg); using (FileStream fs = File.Create(fileName)) { fs.Write(ms.GetBuffer(), 0, ms.GetBuffer().Length); } } }

    Read the article

  • Why '.png' files produced by ImageMagick are so much bigger than '.jpg' & '.gif' files?

    - by Nick Gorbikoff
    Hello. I'm using ImageMagick to convert some files from one format to another. I was always under the impression that .png files were supposed to be as big/small as .jpg if not smaller, and definitely smaller than .gif. However when I run convert photo.jpg photo.png The files I'm getting out is about 6 times bigger than the original jpg. Original jpg is a regular photo about 300x500 px, 52 kb. Output is a proper png of the same dimensions, but size is about 307 kb? Does anyoone know what the hack is going on? Am I doing something wrong? P.S.: I tried both on Debian and Windows with the same results.

    Read the article

  • How is the ">" operator implemented (on 32 bit integers)?

    - by Ron Klein
    Let's say that the environment is x86. How do compilers compile the "" operator on 32 bit integers. Logically, I mean. Without any knowledge of Assembly. Let's say that the high level language code is: int32 x, y; x = 123; y = 456; bool z; z = x > y; What does the compiler do for evaluating the expression x > y? Does it perform something like (assuming that x and y are positive integers): w = sign_of(x - y); if (w == 0) // expression is 'false' else if (w == 1) // expression is 'true' else // expression is 'false' Is there any reference for such information?

    Read the article

  • How should images be stored when multiple sizes are needed?

    - by Josh Curren
    What is the best way to store images? Currently when an image is uploaded I resize it to 3 different sizes (a thumbnail, a normal size, and a large size). I save in a database a description of the image, the format, and use the id number from the database as the image name. Each size image has its own directory. Should I be storing the images in the database? Should I only be storing the larger size and generate the thumbnail as needed? Or any other ideas you have?

    Read the article

  • Video encoding by servlet with MEncoder

    - by Andrey
    Hello. I was developing an application for video encoding on the server and got a problem with encoding video with MEncoder. This decoder doesn't work correctly when runned by a command line with Runtime.getRuntime().exec(“D:\mencoder\mnc\mencoder.exe video1.avi -o outvideo1.flv -of lavf -oac mp3lame -lameopts abr:br=64 -srate 22050 -ovc lavc -lavcopts vcodec=flv:vbitrate=300:mbd=2:mv0:trell:v4mv:cbp:last_pred=3 -vf scale=320:240,harddup -quiet”) ; The decoder launches and works in windows console with my parameters, but when it's run from a servlet it just hangs in process list and doesn't do anything before the web-server is stopped. When trying to use decoder from a simple java applcation, it runs correctly. Thanks for help.

    Read the article

  • Working with images in Scala

    - by dbyrne
    I am generating large PNG files from a Scala program. Currently, I am doing it the same way I would do it in java. I am creating a new BufferedImage and setting each pixel to the correct color. This works fine, but I am wondering if there are any good libraries for working with images in Scala? I am looking for something like Ruby's RMagick library.

    Read the article

  • Parallel Haskell in order to find the divisors of a huge number

    - by Dragno
    I have written the following program using Parallel Haskell to find the divisors of 1 billion. import Control.Parallel parfindDivisors :: Integer->[Integer] parfindDivisors n = f1 `par` (f2 `par` (f1 ++ f2)) where f1=filter g [1..(quot n 4)] f2=filter g [(quot n 4)+1..(quot n 2)] g z = n `rem` z == 0 main = print (parfindDivisors 1000000000) I've compiled the program with ghc -rtsopts -threaded findDivisors.hs and I run it with: findDivisors.exe +RTS -s -N2 -RTS I have found a 50% speedup compared to the simple version which is this: findDivisors :: Integer->[Integer] findDivisors n = filter g [1..(quot n 2)] where g z = n `rem` z == 0 My processor is a dual core 2 duo from Intel. I was wondering if there can be any improvement in above code. Because in the statistics that program prints says: Parallel GC work balance: 1.01 (16940708 / 16772868, ideal 2) and SPARKS: 2 (1 converted, 0 overflowed, 0 dud, 0 GC'd, 1 fizzled) What are these converted , overflowed , dud, GC'd, fizzled and how can help to improve the time.

    Read the article

  • Cilk or Cilk++ or OpenMP

    - by Aman Deep Gautam
    I'm creating a multi-threaded application in Linux. here is the scenario: Suppose I am having x instance of a class BloomFilter and I have some y GB of data(greater than memory available). I need to test membership for this y GB of data in each of the bloom filter instance. It is pretty much clear that parallel programming will help to speed up the task moreover since I am only reading the data so it can be shared across all processes or threads. Now I am confused about which one to use Cilk, Cilk++ or OpenMP(which one is better). Also I am confused about which one to go for Multithreading or Multiprocessing

    Read the article

  • Thread management advice - Is TPL a good idea?

    - by Ian
    I'm hoping to get some advice on the use of thread managment and hopefully the task parallel library, because I'm not sure I've been going down the correct route. Probably best is that I give an outline of what I'm trying to do. Given a Problem I need to generate a Solution using a heuristic based algorithm. I start of by calculating a base solution, this operation I don't think can be parallelised so we don't need to worry about. Once the inital solution has been generated, I want to trigger n threads, which attempt to find a better solution. These threads need to do a couple of things: They need to be initalized with a different 'optimization metric'. In other words they are attempting to optimize different things, with a precedence level set within code. This means they all run slightly different calculation engines. I'm not sure if I can do this with the TPL.. If one of the threads finds a better solution that the currently best known solution (which needs to be shared across all threads) then it needs to update the best solution, and force a number of other threads to restart (again this depends on precedence levels of the optimization metrics). I may also wish to combine certain calculations across threads (e.g. keep a union of probabilities for a certain approach to the problem). This is probably more optional though. The whole system needs to be thread safe obviously and I want it to be running as fast as possible. I tried quite an implementation that involved managing my own threads and shutting them down etc, but it started getting quite complicated, and I'm now wondering if the TPL might be better. I'm wondering if anyone can offer any general guidance? Thanks...

    Read the article

  • Finding many local max in an image (using MatLab)

    - by wenh42
    How do you go about figuring our multiple max in a 2D image where the max aren't necessarily all the same height? I have found that the imregionalmax(), imextendedmax(), and findpeaks() functions aren't necessarily that helpful because they give many local max that are really just maxes within the background noise. I tried bw=arrayimdilate(array,[1 1 1; 1 0 1; 1 1 1]) but that also is kind of limited for the same reasons (same thing with expanding the matrix that it uses). I'd definitely appreciate some help..

    Read the article

  • Resizing an image with alpha channel

    - by Hafthor
    I am writing some code to generate images - essentially I have a source image that is large and includes transparent regions. I use GDI+ to open that image and add additional objects. What I want to do next is to save this new image much smaller, so I used the Bitmap constructor that takes a source Image object and a height and width, then saved that. I was expecting the alpha channel to be smoothed like the color channels, but this did not happen -- it did result in a couple of semitransparent pixels, but overall it is very blocky. What gives? Using img As New Bitmap("source100x100.png") ''// Drawing stuff Using simg As New Bitmap(img, 20, 20) simg.Save("target20x20.png") End Using End Using Edit: I think what I want is SuperSampling, like what Paint.NET does when set to "Best Quality"

    Read the article

  • Process xml-like log file queue

    - by Zsolt Botykai
    Hi all, first of all: I'm not a programmer, never was, although had learn a lot during my professional carreer as a support consultant. Now my task is to process - and create some statistics about a constantly written and rapidly growing XML like log file. It's not valid XML, because it does not have a proper <root> element, e.g. the log looks like this: <log itemdate="somedate"> <field id="0" /> ... </log> <log itemdate="somedate+1"> <field id="0" /> ... </log> <log itemdate="somedate+n"> <field id="0" /> ... </log> E.g. I have to count all the items with field id=0. But most of the solutions I had found (e.g. using XPath) reports an error about the garbage after the first closing </log>. Most probably I can use python (2.6, although I can compile 3.x as well), or some really old perl version (5.6.x), and recently compiled xmlstarlet which really looks promising - I was able to create the statistics for a certain period after copying the file, and pre- & appending the opening and closing root element. But this is a huge file and copying takes time as well. Isn't there a better solution? Thanks in advance!

    Read the article

  • Can I process video using flash?

    - by Roman
    I want to have a web page where user can activate his/her web-camera and send video to another user. Additionally to that I want to have a possibility to process video on the client side. In more details, I want to have a program which analyze video on the client side. Is it possible to do it with Flash?

    Read the article

< Previous Page | 109 110 111 112 113 114 115 116 117 118 119 120  | Next Page >