Search Results

Search found 602 results on 25 pages for 'chunks'.


  • Indexing and Searching Over Word Level Annotation Layers in Lucene

    - by dmcer
    I have a data set with multiple layers of annotation over the underlying text, such as part-of-speech tags, chunks from a shallow parser, named entities, and others from various natural language processing (NLP) tools. For a sentence like "The man went to the store", the annotations might look like:

    Word   POS  Chunk  NER
    ====   ===  =====  ========
    The    DT   NP     Person
    man    NN   NP     Person
    went   VBD  VP     -
    to     TO   PP     -
    the    DT   NP     Location
    store  NN   NP     Location

    I'd like to index a bunch of documents with annotations like these using Lucene and then perform searches across the different layers. An example of a simple query would be to retrieve all documents where Washington is tagged as a person. While I'm not absolutely committed to the notation, syntactically end-users might enter the query as follows: Query: Word=Washington,NER=Person I'd also like to do more complex queries involving the sequential order of annotations across different layers, e.g. find all the documents where there's a word tagged person followed by the words "arrived at" followed by a word tagged location. Such a query might look like: Query: "NER=Person Word=arrived Word=at NER=Location" What's a good way to go about approaching this with Lucene? Is there any way to index and search over document fields that contain structured tokens?


  • Fastest Java way to remove the first/top line of a file (like a stack)

    - by christangrant
    I am trying to improve an external sort implementation in Java. I have a bunch of BufferedReader objects open for temporary files. I repeatedly remove the top line from each of these files. This pushes the limits of the Java heap. I would like a more scalable method of doing this without losing speed because of a bunch of constructor calls. One solution is to only open files when they are needed, then read the first line and then delete it. But I am afraid that this will be significantly slower. So, using Java libraries, what is the most efficient method of doing this? --Edit-- For external sort, the usual method is to break a large file up into several chunk files, sort each of the chunks, and then treat the sorted files like buffers: pop the top item from each file, and the smallest of all those is the global minimum. Then continue until all items have been consumed. http://en.wikipedia.org/wiki/External_sorting My temporary files (buffers) are basically BufferedReader objects. The operations performed on these files are the same as stack/queue operations (peek and pop, no push needed). I am trying to make these peek and pop operations more efficient, because using many BufferedReader objects takes up too much space.
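
    The merge step described in the edit above can be sketched with a min-heap that holds one pending line per chunk. This is only an illustration, written in C++ rather than the asker's Java; the function name mergeSortedStreams is hypothetical, and the chunks are assumed to be already sorted, line-oriented streams compared lexicographically.

```cpp
#include <cstddef>
#include <functional>
#include <istream>
#include <queue>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Merge already-sorted line streams into one sorted sequence.
// Only one pending line per stream is held in the heap at any time.
// (Hypothetical helper; a real external sort would write each line to an
// output file instead of collecting the result in a vector.)
std::vector<std::string> mergeSortedStreams(std::vector<std::istream*>& chunks) {
    // Each heap entry is (current line, index of the stream it came from).
    using Entry = std::pair<std::string, std::size_t>;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;

    // Prime the heap with the first line of every non-empty chunk.
    for (std::size_t i = 0; i < chunks.size(); ++i) {
        std::string line;
        if (std::getline(*chunks[i], line)) heap.emplace(std::move(line), i);
    }

    std::vector<std::string> merged;
    while (!heap.empty()) {
        Entry smallest = heap.top();  // global minimum across all chunks
        heap.pop();
        merged.push_back(smallest.first);

        // Refill from the chunk that supplied the minimum, if it has more lines.
        std::string next;
        if (std::getline(*chunks[smallest.second], next))
            heap.emplace(std::move(next), smallest.second);
    }
    return merged;
}
```

    With this shape, memory use grows with the number of chunks (one line each) plus whatever buffering each stream does internally, so the buffer size per stream is the main knob. In Java, java.util.PriorityQueue could play the role of the heap.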


  • Out of memory while iterating through rowset

    - by Phliplip
    Hi All, I have a "small" table of 60400 rows with zipcode data. I want to iterate through them all, update a column value, and then save it. The following is part of my Zipcodes model, which extends My_Db_Table, which has a totalRows function that - you guessed it - returns the total number of rows in the table (60400 rows): public function normalizeTable() { $this->getAdapter()->setProfiler(false); $totalRows = $this->totalRows(); $rowsPerQuery = 5; for($i = 0; $i < $totalRows; $i = $i + $rowsPerQuery) { $select = $this->select()->limit($i, $rowsPerQuery); $rowset = $this->fetchAll($select); foreach ($rowset as $row) { $row->{self::$normalCityColumn} = $row->normalize($row->{self::$cityColumn}); $row->save(); } unset($rowset); } } My rowClass contains a normalize function (basically a metaphone wrapper doing some extra magic). At first I tried a plain old $this->fetchAll(), but got an out of memory error (128MB) right away. Then I tried splitting the rowset into chunks; the only difference is that some rows actually get updated. Any ideas on how I can accomplish this, or should I fall back to ye olde mysql_query()?


  • VB.NET Syntax Coding

    - by Yiu Korochko
    I know many people ask how some of these are done, but I do not understand the context in which to use the answers, so... I'm building a code editor for a subversion of Python language, and I found a very decent way of highlighting keywords in the RichTextBox through this: bluwords.Add(KEYWORDS GO HERE) If scriptt.Text.Length > 0 Then Dim selectStart2 As Integer = scriptt.SelectionStart scriptt.Select(0, scriptt.Text.Length) scriptt.SelectionColor = Color.Black scriptt.DeselectAll() For Each oneWord As String In bluwords Dim pos As Integer = 0 Do While scriptt.Text.ToUpper.IndexOf(oneWord.ToUpper, pos) >= 0 pos = scriptt.Text.ToUpper.IndexOf(oneWord.ToUpper, pos) scriptt.Select(pos, oneWord.Length) scriptt.SelectionColor = Color.Blue pos += 1 Loop Next scriptt.SelectionStart = selectStart2 End If (scriptt is the richtextbox) But when any decent amount of code is typed (or loaded via OpenFileDialog) chunks of the code go missing, the syntax selection falls apart, and it just plain ruins it. I'm looking for a more efficient way of doing this, maybe something more like visual studio itself...because there is NO NEED to highlight all text, set it black, then redo all of the syntaxing, and the text begins to over-right if you go back to insert characters between text. Also, in this version of Python, hash (#) is used for comments on comment only lines and double hash (##) is used for comments on the same line. Now I saw that someone had asked about this exact thing, and the working answer to select to the end of the line was something like: ^\'[^\r\n]+$|''[^\r\n]+$ which I cannot seem to get into practice. I also wanted to select text between quotes and turn it turquoise, such as between the first quotation mark and the second, the text is turquoise, and the same between the 3rd and 4th etcetera... Any help is appreciated!


  • Makefile - Dependency generation

    - by Profetylen
    I am trying to create a makefile that automatically compiles and links my .cpp files into an executable via .o files. What I can't get working is automated (or even manual) dependency generation. When i uncomment the below commented code, nothing is recompiled when i run make build. All i get is make: Nothing to be done for 'build'., even if x.h (or any .h file) has changed. I've been trying to learn from this question: Makefile, header dependencies, dmckee's answer, especially. Why isn't this makefile working? Clarification: I can compile everything, but when I modify any header file, the .cpp files that depend on it aren't updated. So, if I for instance compile my entire source, then I change a #define in the header file, and then run make build, and I get Nothing to be done for 'build'. (when I have uncommented either commented chunks of the below code). CC=gcc CFLAGS=-O2 -Wall LDFLAGS=-lSDL -lstdc++ SOURCES=$(wildcard *.cpp) OBJECTS=$(patsubst %.cpp, obj/%.o,$(SOURCES)) TARGET=bin/test.bin # Nothing happens when i uncomment the following. (automated attempt) #depend: .depend # #.depend: $(SOURCES) # rm -f ./.depend # $(CC) $(CFLAGS) -MM $^ >> ./.depend; # #include .depend # And nothing happens when i uncomment the following. x.cpp and x.h are files in my project. (manual attempt) #x.o: x.cpp x.h clean: rm -f $(TARGET) rm -f $(OBJECTS) run: build ./$(TARGET) debug: build nm $(TARGET) gdb $(TARGET) build: $(TARGET) $(TARGET): $(OBJECTS) @mkdir -p $(@D) $(CC) $(LDFLAGS) $(OBJECTS) -o $@ obj/%.o: %.cpp @mkdir -p $(@D) $(CC) -c $(CFLAGS) $< -o $@ include $(DEPENDENCIES)


  • Downloading a webpage in C# example

    - by Chris
    I am trying to understand some example code on this web page: (http://www.csharp-station.com/HowTo/HttpWebFetch.aspx) that downloads a file from the internet. The piece of code quoted below goes through a loop getting chunks of data and saving them to a string until all the data has been downloaded. As I understand it, "count" contains the size of the downloaded chunk and the loop runs until count is 0 (an empty chunk of data is downloaded). My question is, isn't it possible that count could be 0 without the file being completely downloaded? Say the network connection is interrupted: the stream may not have any data to read on a pass of the loop and count would be 0, ending the download prematurely. Or does resStream.Read block the program until it gets data? Is this the correct way to save a stream? int count = 0; do { // fill the buffer with data count = resStream.Read(buf, 0, buf.Length); // make sure we read some data if (count != 0) { // translate from bytes to ASCII text tempString = Encoding.ASCII.GetString(buf, 0, count); // continue building the string sb.Append(tempString); } } while (count > 0); // any more data to read?


  • using spring, hibernate and scala, is there a better way to load test data than dbunit?

    - by egervari
    Here are some things I really dislike about dbunit: 1) You cannot specify the exact ordering the inserts because dbunit likes to group your inserts by table name, and not by the order you define them in the XML file. This is a problem when you have records depending on other records in other tables, so you have to disable foreign key constraints during your tests... which actually sucks because these foreign key constraints will get fired in production while your tests won't be aware of them! 2) They seem hellbent on forcing you to use an xml namespace to define your xml... and I honestly can't be bothered to do this. I like the data.xml without any namespace. It works. But they are so hellbent on deprecating it. 3) Creating different xml files is hard on a per test basis, so it actually encourages creating data for your entire app. Unfortunately, this process is a little bloated too once the data grows in size and things get inter tangled. There has got to be a better way to split up your test data into chunks without having to copy/paste a lot of the test data across all of your tests. 4) Keeping track of id references in a big xml file is just impossible. If you have 130 domain classes, it just gets bewildering. This model simply does not scale. Is there something less bloated and better in the Spring/Hibernate space? db unit has worn out its welcome and I'm really looking for something better.


  • Can FFT length affect filtering accuracy?

    - by Charles
    Hi, I am designing a fractional delay filter, and my Lagrange coefficients of order 5, h(n), have 6 taps in the time domain. I tested convolving h(n) with x(n), a signal of 5000 samples, using MATLAB, and the result seems OK. When I tried the FFT and IFFT method instead, the output is totally wrong. My FFT is computed with 8192 points in the frequency domain, which is the nearest power of 2 above the 5000 signal samples. For the IFFT portion, I convert the 8192 frequency-domain points back to 5000 samples in the time domain. So the question is: why does this work with convolution, but not with FFT multiplication? Does converting my 6-tap h(n) to 8192 points in the frequency domain cause this problem? I have also tried the overlap-save method, which performs the FFT and multiplication on smaller chunks of x(n), doing it 5 times separately. The result seems slightly better than before, and at least I can see the waveform pattern, but it is still slightly distorted. Any idea what is going wrong, and what the solution is? Thank you.
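
    The lengths involved can be written out as a short worked relation. This is a general statement about DFT-based convolution using the numbers from the question (L = 5000 samples, M = 6 taps, N = 8192 points), not a diagnosis of the asker's code; the symbols x_N and h_N for the zero-padded sequences are introduced here for illustration.

```latex
% Linear convolution of an L-sample signal with an M-tap filter
y[n] = \sum_{k=0}^{M-1} h[k]\, x[n-k], \qquad 0 \le n \le L + M - 2

% Multiplying N-point DFTs yields circular convolution; it equals the
% linear result when both sequences are zero-padded to a length
N \ge L + M - 1 = 5000 + 6 - 1 = 5005 \quad (\text{so } N = 8192 \text{ is long enough})

% FFT route, with x_N and h_N the sequences zero-padded to length N
y = \mathrm{IDFT}_N\{\, \mathrm{DFT}_N\{x_N\} \cdot \mathrm{DFT}_N\{h_N\} \,\}
```

    Under these assumptions the first 5005 output samples are the linear convolution, so keeping only 5000 of them drops the last 5 tail samples but should not by itself distort the rest.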


  • Pre Project Documentation

    - by DeanMc
    I have an issue that I feel many programmers can relate to... I have worked on many small scale projects. After my initial paper brain storm I tend to start coding. What I come up with is usually a rough working model of the actual application. I design in a disconnected fashion so I am talking about underlying code libraries, user interfaces are the last thing as the library usually dictates what is needed in the UI. As my projects get bigger I worry that so should my "spec" or design document. The above paragraph, from my investigations, is echoed all across the internet in one fashion or another. When a UI is concerned there is a bit more information but it is UI specific and does not relate to code libraries. What I am beginning to realise is that maybe code is code is code. It seems from my extensive research that there is no 1:1 mapping between a design document and the code. When I need to research a topic I dump information into OneNote and from there I prioritise features into versions and then into related chunks so that development runs in a fairly linear fashion, my tasks tend to look like so: Implement Binary File Reader Implement Binary File Writer Create Object to encapsulate Data for expression to the caller Now any programmer worth his salt is aware that between those three to do items could be a potential wall of code that could expand out to multiple files. I have tried to map the complete code process for each task but I simply don't think it can be done effectively. By the time one mangles pseudo code it is essentially code anyway so the time investment is negated. So my question is this: Am I right in assuming that the best documentation is the code itself. We are all in agreement that a high level overview is needed. How high should this be? Do you design to statement, class or concept level? What works for you?


  • Split UInt32 (audio frame) into two SInt16s (left and right)?

    - by morgancodes
    Total noob to anything lower-level than Java, diving into iPhone audio, and reeling from all of the casting/pointers/raw memory access. I'm working with some example code which reads a WAV file from disk and returns stereo samples as single UInt32 values. If I understand correctly, this is just a convenient way to return the 32 bits of memory required to create two 16-bit samples. Eventually this data gets written to a buffer, and an audio unit picks it up down the road. Even though the data is written in UInt32-sized chunks, it eventually is interpreted as pairs of 16-bit samples. What I need help with is splitting these UInt32 frames into left and right samples. I'm assuming I'll want to convert each UInt32 into a pair of SInt16s, since an audio sample is a signed value. It seems to me that for efficiency's sake, I ought to be able to simply point to the same blocks in memory, and avoid any copying. So, in pseudo-code, it would be something like this: UInt32 myStereoFrame = getFramefromFilePlayer; SInt16* leftChannel = getFirst16Bits(myStereoFrame); SInt16* rightChannel = getSecond16Bits(myStereoFrame); Can anyone help me turn my pseudo-code into real code?
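
    One way the split in the pseudo-code could look is sketched below with shifts and masks, using standard fixed-width types in place of the asker's UInt32/SInt16. The names StereoSample, splitFrame and joinFrame are hypothetical, and the sketch assumes the left sample occupies the low 16 bits of the frame (as it would for interleaved 16-bit data read on a little-endian CPU); if the source packs the channels the other way round, the two halves swap.

```cpp
#include <cstdint>

// Hypothetical pair of signed 16-bit samples for one stereo frame.
struct StereoSample {
    std::int16_t left;
    std::int16_t right;
};

// Split a packed 32-bit frame into two signed 16-bit samples.
// Assumption: left = low 16 bits, right = high 16 bits.
// The narrowing casts rely on two's-complement wrap-around, which is
// guaranteed since C++20 and is what mainstream compilers do before that.
StereoSample splitFrame(std::uint32_t frame) {
    StereoSample s;
    s.left = static_cast<std::int16_t>(static_cast<std::uint16_t>(frame & 0xFFFFu));
    s.right = static_cast<std::int16_t>(static_cast<std::uint16_t>(frame >> 16));
    return s;
}

// Pack two signed samples back into one 32-bit frame (inverse of splitFrame).
std::uint32_t joinFrame(StereoSample s) {
    const std::uint32_t lo = static_cast<std::uint16_t>(s.left);
    const std::uint32_t hi = static_cast<std::uint16_t>(s.right);
    return lo | (hi << 16);
}
```

    Working on values rather than pointers copies four bytes per frame, which is usually negligible, and it sidesteps the aliasing and alignment questions that come with pointing an SInt16* into the middle of a UInt32.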


  • How to use linux csplit to chop up massive XML file?

    - by Fred
    Hi everyone, I have a gigantic (4GB) XML file that I am currently breaking into chunks with linux "split" function (every 25,000 lines - not by bytes). This usually works great (I end up with about 50 files), except some of the data descriptions have line breaks, and so frequently the chunk files do not have the proper closing tags - and my parser chokes halfway through processing. Example file: (note: normally each "listing" xml node is supposed to be on its own line) <?xml version="1.0" encoding="UTF-8"?> <listings> <listing><date>2009-09-22</date><desc>This is a description WITHOUT line breaks and works fine with split</desc><more_tags>stuff</more_tags></listing> <listing><date>2009-09-22</date><desc>This is a really annoying description field WITH line breaks that screw the split function</desc><more_tags>stuff</more_tags></listing> </listings> Then sometimes my split ends up like <?xml version="1.0" encoding="UTF-8"?> <listings> <listing><date>2009-09-22</date><desc>This is a description WITHOUT line breaks and works fine with split</desc><more_tags>stuff</more_tags></listing> <listing><date>2009-09-22</date><desc>This is a really annoying description field WITH line breaks ... EOF So - I have been reading about "csplit" and it sounds like it might work to solve this issue. I cant seem to get the regular expression right... Basically I want the same output of ~50ish files Something like: *csplit -k myfile.xml '/</listing>/' 25000 {50} Any help would be great Thanks!


  • Continuous build infrastructure recommendations for primarily C++; GreenHills Integrity

    - by andersoj
    I need your recommendations for continuous build products for a large (1-2MLOC) software development project. Characteristics:

      * ClearCase revision control
      * Approx 80% C++; 15% Java; 5% script or low-level
      * Compiles for Green Hills Integrity OS, but also some Windows and JVM chunks
      * Mostly an embedded system; also includes some UI pieces and some development support (simulation tools, config tools, etc...)
      * Each notional "version" of the deliverable includes deployment images for a number of boards, UI machines, etc... (~10 separate images; 5 distinct operating systems)
      * Need to maintain/track many simultaneous versions which, notably, are built for a variety of different board support packages
      * Build cycle time is a major issue on the project, need support for whatever features help address this (mostly need to manage a large farm of build machines, I guess..)
      * Operates in a secure environment (this is a gov't program)

    (Edited to add: This is a classified program; outsourcing the build infrastructure is a non-starter.) Interested in any best practices or peripheral guidance you might offer. Build automation is one of several overlapping best practices that appear to be missing on the program, but try to keep your answers focused on the build infrastructure piece and directly related observations. Cost is no object. Scalability and ease of retrofitting onto an existing infrastructure are key. JA


  • c++ overloading delete, retrieve size

    - by user300713
    Hi, I am currently writing a small custom memory allocator in C++, and want to use it together with operator overloading of new/delete. Anyway, my memory allocator basically checks if the requested memory is over a certain threshold, and if so uses malloc to allocate the requested memory chunk. Otherwise the memory will be provided by some fixedPool allocators. That generally works, but my deallocation function looks like this: void MemoryManager::deallocate(void * _ptr, size_t _size){ if(_size > heapThreshold) deallocHeap(_ptr); else deallocFixedPool(_ptr, _size); } so I need to provide the size of the chunk pointed to, to deallocate from the right place. Now the problem is that the delete keyword does not provide any hint on the size of the deleted chunk, so I would need something like this: void operator delete(void * _ptr, size_t _size){ MemoryManager::deallocate(_ptr, _size); } But as far as I can see, there is no way to determine the size inside the delete operator. If I want to keep things the way they are right now, would I have to save the size of the memory chunks myself? Any ideas on how to solve this are welcome! Thanks!
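
    The "save the size myself" idea from the question can be sketched as a small header stored in front of each block. Everything in the sketch namespace is hypothetical, and std::malloc/std::free stand in for the asker's heap and fixed-pool allocators; the header is padded to the maximum alignment so the memory handed back stays suitably aligned.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

namespace sketch {

// Header size is padded to max alignment so the user block stays aligned.
constexpr std::size_t kHeaderSize = alignof(std::max_align_t);
static_assert(kHeaderSize >= sizeof(std::size_t), "header must fit a size_t");

// Allocate `size` usable bytes and remember `size` just in front of them.
void* allocateWithSize(std::size_t size) {
    void* raw = std::malloc(kHeaderSize + size);
    if (raw == nullptr) throw std::bad_alloc();
    *static_cast<std::size_t*>(raw) = size;        // store the requested size
    return static_cast<char*>(raw) + kHeaderSize;  // hand out memory after the header
}

// Recover the size that was stored when `ptr` was allocated.
std::size_t storedSize(void* ptr) {
    char* raw = static_cast<char*>(ptr) - kHeaderSize;
    return *reinterpret_cast<std::size_t*>(raw);
}

// Release a block obtained from allocateWithSize; null is ignored.
void deallocateWithSize(void* ptr) {
    if (ptr == nullptr) return;
    std::free(static_cast<char*>(ptr) - kHeaderSize);
}

}  // namespace sketch
```

    An overloaded operator delete could then look up storedSize(ptr) and forward it to something like the question's MemoryManager::deallocate. The trade-off is one header's worth of extra memory per block, which matters most for the small fixed-pool allocations.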


  • Getting memory leak at NSURL connection in Iphone sdk.

    - by monish
    Hi guys, Here Im getting leak at the NSURL connection in my libxml parser can anyone tell how to resolve it. The code where leak generates is: - (BOOL)parseWithLibXML2Parser { BOOL success = NO; ZohoAppDelegate *appDelegate = (ZohoAppDelegate*) [ [UIApplication sharedApplication] delegate]; NSString* curl; if ([cName length] == 0) { curl = @"https://invoice.zoho.com/api/view/settings/currencies?ticket="; curl = [curl stringByAppendingString:appDelegate.ticket]; curl = [curl stringByAppendingString:@"&apikey="]; curl = [curl stringByAppendingString:appDelegate.apiKey]; curl = [curl stringByReplacingOccurrencesOfString:@"\n" withString:@""]; } NSURLRequest *theRequest = [NSURLRequest requestWithURL:[NSURL URLWithString:curl]]; NSLog(@"the request parserWithLibXml2Parser %@",theRequest); NSURLConnection *con = [[[NSURLConnection alloc] initWithRequest:theRequest delegate:self] autorelease];//Memory leak generated here at this line of code. //self.connection = con; //[con release]; // This creates a context for "push" parsing in which chunks of data that are // not "well balanced" can be passed to the context for streaming parsing. // The handler structure defined above will be used for all the parsing. The // second argument, self, will be passed as user data to each of the SAX // handlers. The last three arguments are left blank to avoid creating a tree // in memory. _xmlParserContext = xmlCreatePushParserCtxt(&simpleSAXHandlerStruct, self, NULL, 0, NULL); if(con != nil) { do { [[NSRunLoop currentRunLoop] runMode:NSDefaultRunLoopMode beforeDate:[NSDate distantFuture]]; } while (!_done && !self.error); } if(self.error) { //NSLog(@"parsing error"); [self.delegate parser:self encounteredError:nil]; } else { success = YES; } return success; } Anyone's help will be muck appreciated . Thank you, Monish.


  • Which options do I have for Java process communication?

    - by Dmitriy Matveev
    We have a place in a code of such form: void processParam(Object param) { wrapperForComplexNativeObject result = jniCallWhichMayCrash(param); processResult(result); } processParam - method which is called with many different arguments. jniCallWhichMayCrash - a native method which is intended to do some complex processing of it's parameter and to create some complex object. It can crash in some cases. wrapperForComplexNativeObject - wrapper type generated by SWIG processResult - a method written in pure Java which processes it's parameter by creation of several kinds (by the kinds I'm not meaning classes, maybe some like hierarchies) of objects: 1 - Some non-unique objects which are referencing each other (from the same hierarchy), these objects can have duplicates created from the invocations of processParam() method with different parameter values. Since it's costly to keep all the duplicates it's necessary to cache them. 2 - Some unique objects which are referencing each other (from the same hierarchy) and some of the objects of 1st kind. After processParam is executed for each of the arguments from some set the data created in processResult will be processed together. The problem is in fact that jniCallWhichMayCrash method may crash the entire JVM and this will be very bad. The reason of crash may be such that it can happen for one argument value and not for the other. We've decided that it's better to ignore crashes inside of JVM and just skip some chunks of data when such crashes occur. In order to do this we should run processParam function inside of separate process and pass the result somehow (HOW? HOW?! This is a question) to the main process and in case of any crashes we will only lose some part of data (It's ok) without lose of everything else. So for now the main problem is implementation of transport between different processes. Which options do I have? 
I can think about serialization and transmitting of binary data by the streams, but serialization may be not very fast due to object complexity. Maybe I have some other options of implementing this?


  • What's the jquery CSS3 selector for excluding nested descendents?

    - by Danjah
    Per my SO question here, which has turned to jQuery to solve this, but which may be worked back into YUI if I get my thinking straight, I need a selector to exclude descendants. The solution proposed says something like this: $( '.revealer:not(.revealer > .revealer)' ); To fit more accurately with my situation, because I have multiple HTML chunks to perform the same test on, I have updated it to be: $( '#_revealerEl_0 .handle:not(#_revealerEl_0 .reveal .handle)' ); The HTML it's selecting on (imagine there are numerous copies of this same chunk on a page, each needing to be treated alone - an id attribute is assigned to each 'revealer'): <div class="revealer"> <div class="hotspot"> <a class="handle" href="javascript:;">A</a> <div class="reveal"> <p>Content A.</p> </div> <div class="reveal"> <p>Content B.</p> <!-- nested revealer --> <div class="revealer"> <div class="hotspot"> <a class="handle" href="javascript:;">A</a> <div class="reveal"> <p>Sub-content A.</p> </div> <div class="reveal"> <p>Sub-content B.</p> </div> </div> </div> </div> </div> </div> In a nutshell: I need to target 'top level' handles within a 'hotspot', per revealer - and no nested descendants with the same class names. thanks, d


  • Need advice - Developing a flexible documentation system, heavily focused on localization

    - by inkedmn
    I've been charged with building a documentation system/platform. Here's a short list of the major requirements: Easily localized : This will need to support a dozen or so languages out of the gate. (the ability for non-technical personnel to add/update translations would be a big plus, though not 100% required) Flexibility in output formats : At the bare minimum, I need to output the documents (either as a whole or in selected chunks) as PDF and HTML. Bonus points for native formats like Windows Help Files. Managed and deployed via an intuitive user interface (web, ideally). I'm wondering if you folks know of any systems out there that support this type of thing already? I'm not averse to writing this from scratch, but I'd rather not reinvent the wheel if I can help it. The two major candidates I've come across thus far are DocBook and reST. The former seems to have garnered a reputation for, well, sucking. I'm unfamiliar with either, but I'm told that reST would get me a good portion of the way there. Any other suggestions? Would I be better off building this from scratch?


  • Tracking a fragment of a file in two places with git

    - by mabraham
    Hi, I have code such as void myfunc() { introduction(); while(condition()) { complex(); loop(); interior(); code(); } cleanup(); } which I wish to duplicate into two versions, viz: void myfuncA() { introduction(); minorchangeA(); while(condition()) { complex(); loop(); interior(); code(); } cleanup(); } void myfuncB() { introduction(); minorchangeB(); while(condition()) { complex(); modifiedB(); loop(); interior(); code(); } cleanup(); extracleanupB(); } git claims to track content rather than files, so do I need to tell it that there are chunks here that are common to both myfuncA and myfuncB so that when merging with upstream changes to myfunc that those changes should propagate to both myfuncA and myfuncB? If so, how? The code could be written so that myfuncAB did the correct thing at each point by testing for condition A or B, but that could seriously hinder readability or performance.


  • Using the read function to read in a file.

    - by robUK
    Hello, gcc 4.4.1 I am using the read function to read in a wave file. However, when it gets to the read function, execution seems to stop and freeze. I am wondering if I am doing anything wrong with this. The size of the file test-short.wav is 514K. What I am aiming for is to read the file into the memory buffer in chunks. Currently I am just testing this. Many thanks for any suggestions, #include <stdio.h> #include <stdlib.h> #include <errno.h> #include <fcntl.h> #include <string.h> #include <unistd.h> int main(void) { char buff = malloc(10240); int32_t fd = 0; int32_t bytes_read = 0; char *filename = "test-short.wav"; /* open wave file */ if((fd = (open(filename, O_RDWR)) == -1)) { fprintf(stderr, "open [ %s ]\n", strerror(errno)); return 1; } printf("Opened file [ %s ]\n", filename); printf("sizeof(buff) [ %d ]\n", sizeof(buff)); printf("strlen(buff) [ %d ]\n", strlen(buff)); bytes_read = read(fd, buff, sizeof(buff)); printf("Bytes read [ %d ]\n", bytes_read); return 0; }


  • Accessing to Request object will lead to ReadEntityBody to return 0 (in a HttpHandler class)

    - by EBAG
    I created an HTTP handler that successfully implements IHttpHandler for handling file uploads. It works perfectly fine: you send the file with the form, and the class receives it and saves it to the hard disk. It reads chunks of the file with the ReadEntityBody function of the HttpWorkerRequest class. Here is the situation I'm faced with. If at any stage before trying to read the file with ReadEntityBody I try to access the Request object (even Request.InputStream.Length!), ReadEntityBody returns 0, which means it won't read from the file stream. After further testing I found out the reason behind it. Accessing the Context.Current.Request object triggers some sort of functionality that causes ASP.NET to handle file uploads at that moment on its own! I believe this is a bug. For example, exactly after this line of code, ASP.NET will upload the file completely, and so there will be no stream for ReadEntityBody to read from later. int FileSize = context.Request.InputStream.Length; Can anybody tell me how to stop this?


  • Fast serialization/deserialization of structs

    - by user256890
    I have a huge amount of geographic data represented in a simple object structure consisting only of structs. All of my fields are of value type. public struct Child { readonly float X; readonly float Y; readonly int myField; } public struct Parent { readonly int id; readonly int field1; readonly int field2; readonly Child[] children; } The data is chunked up nicely into small portions of Parent[]-s. Each array contains a few thousand Parent instances. I have way too much data to keep it all in memory, so I need to swap these chunks to disk back and forth. (One file would be approx. 2-300KB.) What would be the most efficient way of serializing/deserializing the Parent[] to a byte[] for dumping to disk and reading back? Concerning speed, I am particularly interested in fast deserialization; write speed is not that critical. Would a simple BinarySerializer be good enough? Or should I hack around with StructLayout (see accepted answer)? I am not sure if that would work with the array field Parent.children. UPDATE: Response to comments - Yes, the objects are immutable (code updated) and indeed the children field is not a value type. 300KB does not sound like much, but I have zillions of files like that, so speed does matter.


  • PHP include taking too long

    - by wxiiir
    I have a PHP file of around 100 MB which is full of arrays (only arrays). I've made a script that includes this file (for processing). First it exhausted the default XAMPP 128 MB memory limit; I've raised it to 1024 MB, but it just takes forever and doesn't do anything. I'm sure the problem is caused by the sheer size of the file, because I've tried removing all lines of code and leaving just the include and an echo to let me know when it finishes executing, and it does the same thing (takes forever). I've also tried to run the 100 MB file on its own, and the same thing happens again. A 10 MB file takes forever as well, but a similar 1 MB file is read and executed almost instantly, so the problem must be more than just the file size. I was avoiding C++ for a simple project like this and would rather not use it, as PHP is easier for me and the task doesn't need the added speed it would get from C++, but if I have no luck solving this problem I guess I'll have to. EDIT Reasons for not using a database: 1-Whoever made it didn't use a database, and it will be pretty hard to store this in an organized database if I'm not able to do something with it first, like just reading it, copying parts from it or putting it in memory. 2-I don't have experience working with databases, as pretty much everything I've ever done in PHP didn't need large amounts of stored data, 50 KB at best. If I were planning a big project or huge chunks of data like this one, I definitely would use one, but I didn't make this mess to start with and now I have to undo it. 3-The logic of having to store a small portion of data like 10 MB on the hard drive, when pretty much every computer now has enough RAM to fit the whole OS in it, is incomprehensible to me unless someone gives a good explanation. If I had to access a lot of such files simultaneously I would understand, but like I said, this is a simple project and this is the only file that will be accessed at a given time. It isn't even for some kind of website; it's to run a few times and be done with it.

    Read the article

  • STLifying C++ classes

    - by shambulator
    I'm trying to write a class which contains several std::vectors as data members, and provides a subset of vector's interface to access them: class Mesh { public: private: std::vector<Vector3> positions; std::vector<Vector3> normals; // Several other members along the same lines }; The main thing you can do with a mesh is add positions, normals and other stuff to it. In order to allow an STL-like way of accessing a Mesh (add from arrays, other containers, etc.), I'm toying with the idea of adding methods like this: public: template<class InIter> void AddNormals(InIter first, InIter last); Problem is, from what I understand of templates, these methods will have to be defined in the header file (seems to make sense; without a concrete iterator type, the compiler doesn't know how to generate object code for the obvious implementation of this method). Is this actually a problem? My gut reaction is not to go around sticking huge chunks of code in header files, but my C++ is a little rusty with not much STL experience outside toy examples, and I'm not sure what "acceptable" C++ coding practice is on this. Is there a better way to expose this functionality while retaining an STL-like generic programming flavour? One way would be something like this: template <class T> class RestrictedVector { public: RestrictedVector(std::vector<T> wrapped) : wrapped(wrapped) {} template <class InIter> void Add(InIter first, InIter last) { std::copy(first, last, std::back_inserter(wrapped)); } private: std::vector<T> wrapped; }; and then expose instances of these on Mesh instead, but that's starting to reek a little of overengineering :P Any advice is greatly appreciated!
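
For illustration only, here is a minimal sketch of the member-template idea the question describes, with the template bodies defined inline in the class (as they would be in a header). The Vector3 struct, the AddPositions method, the two count accessors and the DemoAddNormals helper are assumptions made up for this sketch; they are not from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's Vector3 type.
struct Vector3 {
    float x, y, z;
};

class Mesh {
public:
    // Defined inline so that each translation unit calling it can
    // instantiate the template for its own iterator type.
    template <class InIter>
    void AddNormals(InIter first, InIter last) {
        normals.insert(normals.end(), first, last);
    }

    template <class InIter>
    void AddPositions(InIter first, InIter last) {
        positions.insert(positions.end(), first, last);
    }

    // Assumed accessors, added only so the sketch can be checked.
    std::size_t NormalCount() const { return normals.size(); }
    std::size_t PositionCount() const { return positions.size(); }

private:
    std::vector<Vector3> positions;
    std::vector<Vector3> normals;
};

// Usage sketch: fill a Mesh from a raw array and from another container.
inline std::size_t DemoAddNormals() {
    Mesh mesh;
    const Vector3 raw[] = {{0.0f, 0.0f, 1.0f}, {0.0f, 1.0f, 0.0f}};
    mesh.AddNormals(raw, raw + 2);                  // pointers act as iterators

    std::vector<Vector3> more(3, Vector3{1.0f, 0.0f, 0.0f});
    mesh.AddNormals(more.begin(), more.end());      // container iterators

    assert(mesh.PositionCount() == 0);              // positions untouched
    return mesh.NormalCount();
}
```

Each method body here is a single call to std::vector::insert, so the amount of code that ends up in the header stays small; whether that is acceptable for a larger class is a judgment call the sketch does not settle.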

    Read the article

  • Limiting object allocation over multiple threads

    - by John
    I have an application which retrieves and caches the results of a client's query. The client then requests different chunks of data and the application sends the relevant results and removes them from the cache. A new requirement for this application is that there needs to be a run-time configurable maximum number of results which may be cached. I've taken the naive approach and implemented this by using a counter under a lock which is incremented every time a result is cached and decremented whenever a result is removed from the cache. Unfortunately, this has drastically reduced the application's performance when processing a large number of concurrent requests. I have tried both a critical section lock and a spin-lock; the performance improves a bit with a spin-lock, but is still unacceptably slow. Is there a better way to solve this problem which may improve performance? Right now I have a thread pool that services requests and each request is tied to a Request object which stores the cached results for that particular request. Here is a simplified pseudo code version of my current implementation: void ResultCallback( Result result, Request *request ) { lock totalResultsCached lock cachedLimit if( totalResultsCached + 1 > cachedLimit ) { unlock cachedLimit unlock totalResultsCached //cancel the request return; } ++totalResultsCached; unlock cachedLimit unlock totalResultsCached request.add(result) } void SendResults( int resultsToSend, Request *request ) { while ( resultsToSend > 0 ) { send(request.remove()) lock totalResultsCached --totalResultsCached unlock totalResultsCached --resultsToSend; } }
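
One direction that could be tried, sketched here under assumptions, is to replace the two locks with atomic counters and a compare-and-swap loop. The CacheBudget class and its method names are hypothetical and not part of the question's code, and whether this actually helps would need measuring, since a single heavily contended atomic can still be a bottleneck.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical helper: tracks how many results are cached against a
// run-time configurable limit, without taking a lock.
class CacheBudget {
public:
    explicit CacheBudget(std::size_t limit) : limit_(limit), used_(0) {}

    // Change the limit at run time. Slots already claimed are not revoked.
    void SetLimit(std::size_t limit) {
        limit_.store(limit, std::memory_order_relaxed);
    }

    // Try to claim one slot. Returns false when the cache is full,
    // which corresponds to the "cancel the request" branch.
    bool TryReserve() {
        std::size_t current = used_.load(std::memory_order_relaxed);
        for (;;) {
            if (current >= limit_.load(std::memory_order_relaxed)) {
                return false;
            }
            // On failure, compare_exchange_weak reloads `current`,
            // so the limit check above is repeated with the fresh value.
            if (used_.compare_exchange_weak(current, current + 1,
                                            std::memory_order_acq_rel,
                                            std::memory_order_relaxed)) {
                return true;
            }
        }
    }

    // Give one slot back after a result has been sent and removed.
    void Release() {
        std::size_t previous = used_.fetch_sub(1, std::memory_order_acq_rel);
        assert(previous > 0);  // releasing more than was reserved is a bug
        (void)previous;        // silence unused warning when NDEBUG is set
    }

    std::size_t Used() const { return used_.load(std::memory_order_relaxed); }

private:
    std::atomic<std::size_t> limit_;
    std::atomic<std::size_t> used_;
};
```

In terms of the pseudo code, ResultCallback would call TryReserve() before request.add(result), and SendResults would call Release() after each request.remove(). If SendResults removes many results in one go, subtracting the whole batch with a single fetch_sub might reduce contention further, though that too is an untested assumption.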

    Read the article

  • Web pages that take a long time to load keep on reloading, just on Vista on my work n/w...

    - by Ralpharama
    I have a curious problem at work which I've been struggling with since the advent of Windows Vista. We send our own email newsletter out to 40,000+ people once a week. The sending code has been in place for years; it's in classic ASP/VBScript, called through a browser, and simply loops through each email address, sending the newsletter to each one. The page takes 40 mins or more to run, so it has a big timeout value to allow it to do so. All well and good, but suddenly, after Windows Vista was installed on the work PCs, the email sending page behaved oddly: after a period of time it seems to reload the page, endlessly, so the first 20% of our users get multiple copies of the newsletter until we kill the process! If we run the code on an XP machine on the same office network, it works fine. If we run it on Vista outside the office, say on my own ISP, then it also works fine! Note, same effect in IE and FF... So, something about my office network and Vista is causing this... I recently re-wrote the newsletter code so it would split the task into chunks of 100 users at a time, hoping this would fix it, but my most recent test shows that the office n/w Vista machine once again reloads the same page over and over, even though it takes 1/10th of the time to run... Does anyone have any ideas what it might be, how I can prove it, or, better, how I can get round it? Thanks for your advice :)

    Read the article
