Search Results

Search found 9017 results on 361 pages for 'efficient storage'.

Page 327/361 | < Previous Page | 323 324 325 326 327 328 329 330 331 332 333 334  | Next Page >

  • Read huge free text docs in one file for lucene indexing

    - by Jun
    I have heaps of free-text news docs in one big file. Each news doc is structured like this:

    (Header line) Category, Doc1, Date (day, month, year)
    (body text) ... ... ...
    (Header line) Category, Doc2, Date (day, month, year)
    (body text) ... ... ...

    Extracting each doc from the big file first takes too much time and is not efficient. Therefore, I decided to read the file line by line and feed the information to Lucene at the same time. I wrote C# code to index each doc to Lucene like this:

    StreamReader sr = new StreamReader(file);
    string line = "";
    while ((line = sr.ReadLine()) != null)
    {
        // How can I tell a doc header line from a text line, and collect the
        // metadata and all the text lines of a doc for Lucene to index?
        // Also, the text was read by OCR, which cannot give correct line
        // separation; captions are mixed with the content text.
        // Iterate the process till the end of the file.
    }

    With thanks.
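
A minimal sketch of the single-pass idea described above, written in Python rather than the asker's C#. It assumes a header line looks like `<category>, Doc<number>, <date>`; that pattern, the `HEADER` regex and the `iter_docs` name are assumptions for illustration, not something the question confirms. Each document is handed over as soon as the next header is seen, so the file is read only once.

```python
import re
from typing import Iterable, Iterator, List, Tuple

# Assumed header shape: "<category>, Doc<number>, <date>". Adjust to the real files.
HEADER = re.compile(r"^(?P<category>[^,]+),\s*(?P<doc_id>Doc\d+),\s*(?P<date>.+)$")


def iter_docs(lines: Iterable[str]) -> Iterator[Tuple[dict, str]]:
    """Yield (metadata, body) pairs while reading the input only once."""
    meta = None
    body: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = HEADER.match(line)
        if match:
            # A new header closes the previous document, if there was one.
            if meta is not None:
                yield meta, " ".join(body)
            meta, body = match.groupdict(), []
        elif meta is not None and line.strip():
            # Non-empty, non-header lines belong to the current document body.
            body.append(line.strip())
    if meta is not None:
        yield meta, " ".join(body)


sample = [
    "Sports, Doc1, 3 May 2009\n",
    "First line of body\n",
    "second line\n",
    "Politics, Doc2, 4 May 2009\n",
    "Other body\n",
]
docs = list(iter_docs(sample))
```

Joining the body lines with spaces sidesteps the unreliable OCR line breaks; each yielded pair could then be turned into one indexed document.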

    Read the article

  • merging two tables, while applying aggregates on the duplicates (max,min and sum)

    - by cloudraven
    I have a table (let's call it log) with a few millions of records. Among the fields I have Id, Count, FirstHit, LastHit. Id - The record id Count - number of times this Id has been reported FirstHit - earliest timestamp with which this Id was reported LastHit - latest timestamp with which this Id was reported This table only has one record for any given Id Everyday I get into another table (let's call it feed) with around half a million records with these fields among many others: Id Timestamp - Entry date and time. This table can have many records for the same id What I want to do is to update log in the following way. Count - log count value, plus the count() of records for that id found in feed FirstHit - the earliest of the current value in log or the minimum value in feed for that id LastHit - the latest of the current value in log or the maximum value in feed for that id. It should be noticed that many of the ids in feed are already in log. The simple thing that worked is to create a temporary table and insert into it the union of both as in Select Id, Min(Timestamp) As FirstHit, MAX(Timestamp) as LastHit, Count(*) as Count FROM feed GROUP BY Id UNION ALL Select Id, FirstHit,LastHit,Count FROM log; From that temporary table I do a select that aggregates Min(firsthit), max(lasthit) and sum(Count) Select Id, Min(FirstHit),Max(LastHit),Sum(Count) FROM @temp GROUP BY Id; and that gives me the end result. I could then delete everything from log and replace it with everything with temp, or craft an update for the common records and insert the new ones. However, I think both are highly inefficient. Is there a more efficient way of doing this. Perhaps doing the update in place in the log table?
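
The in-place update the question asks about can be sketched as two steps: aggregate feed once per Id, then update the Ids that already exist in log and insert the rest. The Python below only models that logic with dictionaries; the function name and the data layout are assumptions for illustration. On a server that offers a MERGE statement (SQL Server 2008 and later do), the second step can probably be written as a single statement.

```python
from datetime import datetime


def merge_feed_into_log(log, feed):
    """Update `log` in place.

    log  maps id -> [count, first_hit, last_hit]
    feed is an iterable of (id, timestamp) pairs, with repeated ids allowed.
    """
    # Step 1: aggregate the feed once per id (the GROUP BY step).
    agg = {}
    for rec_id, ts in feed:
        if rec_id in agg:
            entry = agg[rec_id]
            entry[0] += 1
            entry[1] = min(entry[1], ts)
            entry[2] = max(entry[2], ts)
        else:
            agg[rec_id] = [1, ts, ts]

    # Step 2: update ids already in the log, insert the new ones.
    for rec_id, (count, first, last) in agg.items():
        if rec_id in log:
            row = log[rec_id]
            row[0] += count
            row[1] = min(row[1], first)
            row[2] = max(row[2], last)
        else:
            log[rec_id] = [count, first, last]
    return log


log = {1: [5, datetime(2010, 1, 1), datetime(2010, 1, 10)]}
feed = [
    (1, datetime(2009, 12, 31)),
    (1, datetime(2010, 1, 15)),
    (2, datetime(2010, 1, 12)),
]
merge_feed_into_log(log, feed)
```

Only the rows whose Id appears in the feed are touched, which is the main saving over rebuilding the whole log table.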

    Read the article

  • Adding new elements into DOM using JavaScript (appendChild)

    - by KatieK
    I sometimes need to add elements (such as a new link and image) to an existing HTML page, but I only have access to a small portion of the page far from where I need to insert elements. I want to use DOM based JavaScript techniques, and I must avoid using document.write(). Thus far, I've been using something like this: // Create new image element var newImg = document.createElement("img"); newImg.src = "images/button.jpg"; newImg.height = "50"; newImg.width = "150"; newImg.alt = "Click Me"; // Create new link element var newLink = document.createElement("a"); newLink.href = "/dir/signup.html"; // Append new image into new link newLink.appendChild(newImg); // Append new link (with image) into its destination on the page document.getElementById("newLinkDestination").appendChild(newLink); Is there a more efficient way that I could use to accomplish the same thing? It all seems necessary, but I'd like to know if there's a better way I could be doing this. Thanks!

    Read the article

  • Utilizing a Queue

    - by Nathan
    I'm trying to store records of transactions, both all together and by category, for the last 1, 7, 30 or 360 days. I've tried a couple of things, but they've failed brutally. I had the idea of using a queue with 360 values, one for each day, but I don't know enough about queues to figure out how that would work. Input will be an instance of this class:

    class Transaction
    {
        public string TotalEarned { get; set; }
        public string TotalHST { get; set; }
        public string TotalCost { get; set; }
        public string Category { get; set; }
    }

    New transactions can occur at any time during the day, and there could be as many as 15 transactions in a day. My program uses a plain text file as external storage, but how I load it depends on how I decide to store this data. What would be the best way to do this?
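
One way the queue idea could work is to keep one bucket per day and let a bounded queue drop the oldest bucket automatically. The sketch below is in Python, not the asker's C#; the `RollingTotals` class, its method names and the use of integer day numbers are all assumptions made for illustration, and only the earned amount is tracked.

```python
from collections import defaultdict, deque
from decimal import Decimal


class RollingTotals:
    """Keep per-day, per-category earned totals for recent days."""

    def __init__(self, window=360):
        # Each entry is (day_number, {category: total}). The deque discards
        # the oldest bucket by itself once `window` buckets are stored.
        # Days without transactions get no bucket, so total() also filters by age.
        self.days = deque(maxlen=window)

    def add(self, day, category, earned):
        """Record one transaction; `day` is an increasing integer such as date.toordinal()."""
        if not self.days or self.days[-1][0] != day:
            self.days.append((day, defaultdict(Decimal)))
        self.days[-1][1][category] += Decimal(earned)

    def total(self, last_n_days, today, category=None):
        """Sum the last `last_n_days` days ending at `today`, optionally for one category."""
        result = Decimal(0)
        for day, buckets in self.days:
            if today - day < last_n_days:
                if category is None:
                    result += sum(buckets.values(), Decimal(0))
                else:
                    result += buckets.get(category, Decimal(0))
        return result


totals = RollingTotals()
totals.add(100, "food", "10.00")
totals.add(100, "fuel", "5.50")
totals.add(106, "food", "2.25")
totals.add(130, "food", "1.00")
```

With at most 360 small buckets, scanning the queue for each report is cheap, and the text file only needs one line per transaction to rebuild the structure at load time.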

    Read the article

  • When to use a foreign key in MySQL

    - by Mel
    Is there official guidance or a threshold to indicate when it is best practice to use a foreign key in a MySQL database? Suppose you created a table for movies. One way to do it is to integrate the producer and director data into the same table. (movieID, movieName, directorName, producerName). However, suppose most directors and producers have worked on many movies. Would it be best to create two other tables for producers and directors, and use a foreign key in the movie table? When does it become best practice to do this? When many of the directors and producers are appearing several times in the column? Or is it best practice to employ a foreign key approach at the start? While it seems more efficient to use a foreign key, it also raises the complexity of the database. So when does the trade off between complexity and normalization become worth it? I'm not sure if there is a threshold or a certain number of cell repetitions that makes it more sensible to use a foreign key. I'm thinking about a database that will be used by hundreds of users, many concurrently. Many thanks!

    Read the article

  • template specialization for static member functions; howto?

    - by Rolle
    I am trying to implement a template function that handles void differently using template specialization. The following code gives me an "Explicit specialization in non-namespace scope" error in gcc:

    template <typename T>
    static T safeGuiCall(boost::function<T ()> _f)
    {
        if (_f.empty())
            throw GuiException("Function pointer empty");
        {
            ThreadGuard g;
            T ret = _f();
            return ret;
        }
    }

    // template specialization for functions with no return value
    template <>
    static void safeGuiCall<void>(boost::function<void ()> _f)
    {
        if (_f.empty())
            throw GuiException("Function pointer empty");
        {
            ThreadGuard g;
            _f();
        }
    }

    I have tried moving it out of the class (the class is not templated) and into the namespace, but then I get the error "Explicit specialization cannot have a storage class". I have read many discussions about this, but people don't seem to agree on how to specialize function templates. Any ideas?

    Read the article

  • model.matrix() with na.action=NULL?

    - by Vincent
    I have a formula and a data frame, and I want to extract the model.matrix(). However, I need the resulting matrix to include the NAs that were found in the original dataset. If I were to use model.frame() to do this, I would simply pass it na.action=NULL. However, the output I need is of the model.matrix() format. Specifically, I need only the right-hand side variables, I need the output to be a matrix (not a data frame), and I need factors to be converted to a series of dummy variables. I'm sure I could hack something together using loops or something, but I was wondering if anyone could suggest a cleaner and more efficient workaround. Thanks a lot for your time! And here's an example: dat <- data.frame(matrix(rnorm(20),5,4), gl(5,2)) dat[3,5] <- NA names(dat) <- c(letters[1:4], 'fact') ff <- a ~ b + fact # This omits the row with a missing observation on the factor model.matrix(ff, dat) # This keeps the NA, but it gives me a data frame and does not dichotomize the factor model.frame(ff, dat, na.action=NULL) Here is what I would like to obtain: (Intercept) b fact2 fact3 fact4 fact5 1 1 0.7266086 0 0 0 0 2 1 -0.6088697 0 0 0 0 3 NA 0.4643360 NA NA NA NA 4 1 -1.1666248 1 0 0 0 5 1 -0.7577394 0 1 0 0 6 1 0.7266086 0 1 0 0 7 1 -0.6088697 0 0 1 0 8 1 0.4643360 0 0 1 0 9 1 -1.1666248 0 0 0 1 10 1 -0.7577394 0 0 0 1

    Read the article

  • Implementing IEnumerable on Non-Listed Items

    - by Stacey
    I have a class that contains a static number of objects. This class needs to be frequently 'compared' to other classes that will be simple List objects. public partial class Sheet { public Item X{ get; set; } public Item Y{ get; set; } public Item Z{ get; set; } } the items are obviously not going to be "X" "Y" "Z", those are just generic names for example. The problem is that due to the nature of what needs to be done, a List won't work; even though everything in here is going to be of type Item. It is like a checklist of very specific things that has to be tested against in both code and runtime. This works all fine and well; it isn't my issue. My issue is iterating it. For instance I want to do the following... List<Item> UncheckedItems = // Repository Logic Here. UncheckedItems contains all available items; and the CheckedItems is the Sheet class instance. CheckedItems will contain items that were moved from Unchecked to Checked; however due to the nature of the storage system, items moved to Checked CANNOT be REMOVED from Unchecked. I simply want to iterate through "Checked" and remove anything from the list in Unchecked that is already in "Checked". So naturally, that would go like this with a normal list. foreach(Item item in Unchecked) { if( Checked.Contains(item) ) Unchecked.Remove( item ); } But since "Sheet" is not a 'List', I cannot do that. So I wanted to implement IEnumerable so that I could. Any suggestions? I've never implemented IEnumerable directly before and I'm pretty confused as to where to begin.
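
The iteration problem can be illustrated with a small analogue in Python rather than C#: the class keeps its named slots but also exposes them as a sequence. In C# the counterpart would be implementing `IEnumerable<Item>` with `yield return` inside `GetEnumerator()`. The `Item` and `Sheet` definitions below are simplified stand-ins, not the asker's real classes.

```python
class Item:
    def __init__(self, name):
        self.name = name


class Sheet:
    """A fixed set of named slots that can still be iterated like a collection."""

    def __init__(self, x=None, y=None, z=None):
        self.x, self.y, self.z = x, y, z

    def __iter__(self):
        # Yield each filled slot in turn; empty slots are skipped.
        for item in (self.x, self.y, self.z):
            if item is not None:
                yield item


a, b, c = Item("a"), Item("b"), Item("c")
checked = Sheet(x=a, z=c)
unchecked = [a, b, c]

# Build a new list instead of removing from `unchecked` while looping over it.
checked_items = list(checked)
remaining = [item for item in unchecked if item not in checked_items]
```

Building a filtered copy also avoids a second problem in the question's loop: removing from a `List<Item>` inside a `foreach` over that same list throws an `InvalidOperationException` in C#.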

    Read the article

  • Does the Java Memory Model (JSR-133) imply that entering a monitor flushes the CPU data cache(s)?

    - by Durandal
    There is something that bugs me about the Java memory model (if I even understand everything correctly). If there are two threads A and B, there are no guarantees that B will ever see a value written by A, unless both A and B synchronize on the same monitor. For any system architecture that guarantees cache coherency between threads, there is no problem. But if the architecture does not support cache coherency in hardware, this essentially means that whenever a thread enters a monitor, all memory changes made before must be committed to main memory, and the cache must be invalidated. And it needs to be the entire data cache, not just a few lines, since the monitor has no information about which variables in memory it guards. But that would surely impact the performance of any application that needs to synchronize frequently (especially things like job queues with short-running jobs). So can Java work reasonably well on architectures without hardware cache coherency? If not, why doesn't the memory model make stronger guarantees about visibility? Wouldn't it be more efficient if the language required information about what is guarded by a monitor? As I see it, the memory model gives us the worst of both worlds: the absolute need to synchronize, even if cache coherency is guaranteed in hardware, and on the other hand bad performance on incoherent architectures (full cache flushes). So shouldn't it be more strict (require information about what is guarded by a monitor) or more loose and restrict potential platforms to cache-coherent architectures? As it is now, it doesn't make too much sense to me. Can somebody clear up why this specific memory model was chosen? EDIT: My use of "strict" and "loose" was a bad choice in retrospect. I used "strict" for the case where fewer guarantees are made and "loose" for the opposite. To avoid confusion, it's probably better to speak in terms of stronger or weaker guarantees.

    Read the article

  • Why use shorter VARCHAR(n) fields?

    - by chryss
    It is frequently advised to choose database field sizes to be as narrow as possible. I am wondering to what degree this applies to SQL Server 2005 VARCHAR columns: storing 10-letter English words in a VARCHAR(255) field will not take up more storage than in a VARCHAR(10) field. Are there other reasons to restrict the size of VARCHAR fields to stick as closely as possible to the size of the data? I'm thinking of: Performance: is there an advantage to using a smaller n when selecting, filtering and sorting on the data? Memory, including on the application side (C++)? Style/validation: how important do you consider restricting column size to force nonsensical data imports to fail (such as 200-character surnames)? Anything else? Background: I help data integrators with the design of data flows into a database-backed system. They have to use an API that restricts their choice of data types. For character data, only VARCHAR(n) with n <= 255 is available; CHAR, NCHAR, NVARCHAR and TEXT are not. We're trying to lay down some "good practices" rules, and the question has come up whether there is a real detriment to using VARCHAR(255) even for data where real maximum sizes will never exceed 30 bytes or so. Typical data volumes for one table are 1-10 million records with up to 150 attributes. Query performance (SELECT, with frequently extensive WHERE clauses) and application-side retrieval performance are paramount.

    Read the article

  • Recursion - Ship Battle

    - by rgorrosini
    I'm trying to write a little ship battle game in Java. It is 100% academic; I made it to practice recursion, so I want to use it instead of iteration, even if iteration is simpler and more efficient in some cases. Let's get down to business. These are the rules: ships are 1, 2 or 3 cells wide and are placed horizontally only. Water is represented with 0, non-hit ship cells are 1, hit ship cells are 2, and sunken ships have all their cells set to 3. With those rules set, I'm using the following array for testing:

    int[][] board = new int[][] {
        {0, 1, 2, 0, 1, 0},
        {0, 0, 1, 1, 1, 0},
        {0, 3, 0, 0, 0, 0},
        {0, 0, 2, 1, 2, 0},
        {0, 0, 0, 1, 1, 1},
    };

    It works pretty well so far, and to make it more user-friendly I would like to add a couple of reports. These are the methods I need for them: a) Given the matrix, return the number of ships in it. b) Same as a), but separating them by state (number of non-hit ships, hit ones and sunken ones). I will need a hand with those reports, and I would like to get some ideas. Remember it must be done using recursion; I want to understand this, and the only way to go is practice! Thanks a lot for your time and patience :).
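
A sketch of how both reports could be built with recursion only, written in Python rather than Java. It assumes that two ships never touch in the same row, so every maximal horizontal run of non-zero cells is exactly one ship; the function names are made up for illustration.

```python
from collections import Counter

board = [
    [0, 1, 2, 0, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 3, 0, 0, 0, 0],
    [0, 0, 2, 1, 2, 0],
    [0, 0, 0, 1, 1, 1],
]


def classify(cells):
    """State of one ship, given its cells: sunk, hit, or unhit."""
    if 3 in cells:
        return "sunk"
    if 2 in cells:
        return "hit"
    return "unhit"


def states_in_row(row, col=0, current=()):
    """Recursively walk one row and return one state per ship found."""
    if col == len(row):
        return [classify(current)] if current else []
    cell = row[col]
    if cell == 0:
        # Water ends the ship collected so far, if any.
        finished = [classify(current)] if current else []
        return finished + states_in_row(row, col + 1, ())
    return states_in_row(row, col + 1, current + (cell,))


def states_in_board(board, r=0):
    """Recursively concatenate the ship states of every row."""
    if r == len(board):
        return []
    return states_in_row(board[r]) + states_in_board(board, r + 1)


def count_ships(board):
    return len(states_in_board(board))          # report a)


def count_by_state(board):
    return dict(Counter(states_in_board(board)))  # report b)
```

The row recursion carries the cells of the ship currently being read as an argument, which replaces the loop variable an iterative version would use.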

    Read the article

  • NServiceBus & MSMQ: How To Change the Default Permissions on the Queue?

    - by Amy T
    My team is on our first attempt at using NServiceBus (v2.0), using MSMQ as the backing storage. We're getting stuck on queue permissions. We're using it in a Web Forms application, where the user account the website runs under is not an administrator on the machine. When NServiceBus creates the MSMQ queue, it gives the local administrators group full control, and the local everyone and anonymous groups permissions to send messages. But then later, as part of initializing the queue, NServiceBus tries to read all of its messages. That's where we run into the permissions error. Since the website isn't running as an administrator, it's not allowed to read messages. How are other people dealing with this? Do your applications run as administrators? Or do you create the MSMQ queue in your code first, giving it the permissions you need, so that NServiceBus doesn't have to create it? Or is there a bit of configuration we're missing? Or are we likely writing our code that uses NServiceBus incorrectly to be running into this?

    Read the article

  • Ruby on Rails How do I access variables of a model inside itself like in this example?

    - by banditKing
    I have a Model like so: # == Schema Information # # Table name: s3_files # # id :integer not null, primary key # owner :string(255) # notes :text # created_at :datetime not null # updated_at :datetime not null # last_accessed_by_user :string(255) # last_accessed_time_stamp :datetime # upload_file_name :string(255) # upload_content_type :string(255) # upload_file_size :integer # upload_updated_at :datetime # class S3File < ActiveRecord::Base #PaperClip methods attr_accessible :upload attr_accessor :owner Paperclip.interpolates :prefix do |attachment, style| I WOULD LIKE TO ACCESS VARIABLE= owner HERE- HOW TO DO THAT? end has_attached_file( :upload, :path => ":prefix/:basename.:extension", :storage => :s3, :s3_credentials => {:access_key_id => "ZXXX", :secret_access_key => "XXX"}, :bucket => "XXX" ) #Used to connect to users through the join table has_many :user_resource_relationships has_many :users, :through => :user_resource_relationships end Im setting this variable in the controller like so: # POST /s3_files # POST /s3_files.json def create @s3_file = S3File.new(params[:s3_file]) @s3_file.owner = current_user.email respond_to do |format| if @s3_file.save format.html { redirect_to @s3_file, notice: 'S3 file was successfully created.' } format.json { render json: @s3_file, status: :created, location: @s3_file } else format.html { render action: "new" } format.json { render json: @s3_file.errors, status: :unprocessable_entity } end end end Thanks, any help would be appreciated.

    Read the article

  • Fast serialization/deserialization of structs

    - by user256890
    I have a huge amount of geographic data represented in a simple object structure consisting only of structs. All of my fields are of value type.

    public struct Child
    {
        readonly float X;
        readonly float Y;
        readonly int myField;
    }

    public struct Parent
    {
        readonly int id;
        readonly int field1;
        readonly int field2;
        readonly Child[] children;
    }

    The data is chunked up nicely into small portions of Parent[]-s. Each array contains a few thousand Parent instances. I have way too much data to keep it all in memory, so I need to swap these chunks to disk and back. (One file would be approx. 2-300KB.) What would be the most efficient way of serializing/deserializing the Parent[] to a byte[] for dumping to disk and reading back? Concerning speed, I am particularly interested in fast deserialization; write speed is not that critical. Would a simple BinarySerializer be good enough? Or should I hack around with StructLayout (see accepted answer)? I am not sure if that would work with the array field Parent.children. UPDATE: Response to comments - Yes, the objects are immutable (code updated) and indeed the children field is not a value type. 300KB does not sound like much, but I have zillions of files like that, so speed does matter.
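
One common answer to this kind of question is a hand-rolled, length-prefixed binary layout: write a count, then fixed-size records. The sketch below shows the idea with Python's `struct` module instead of C#; the byte layout, the tuple representation and the function names are assumptions for illustration. In C# the same layout could be written and read with `BinaryWriter` and `BinaryReader`.

```python
import struct

PARENT_HEAD = struct.Struct("<iiii")  # id, field1, field2, number of children
CHILD = struct.Struct("<ffi")         # X, Y, myField


def dump_parents(parents):
    """Serialize a list of (id, field1, field2, [(x, y, my_field), ...]) to bytes."""
    out = bytearray(struct.pack("<i", len(parents)))
    for pid, f1, f2, children in parents:
        out += PARENT_HEAD.pack(pid, f1, f2, len(children))
        for child in children:
            out += CHILD.pack(*child)
    return bytes(out)


def load_parents(buf):
    """Inverse of dump_parents: read the counts first, then the fixed-size records."""
    (count,) = struct.unpack_from("<i", buf, 0)
    offset = 4
    parents = []
    for _ in range(count):
        pid, f1, f2, n = PARENT_HEAD.unpack_from(buf, offset)
        offset += PARENT_HEAD.size
        children = []
        for _ in range(n):
            children.append(CHILD.unpack_from(buf, offset))
            offset += CHILD.size
        parents.append((pid, f1, f2, children))
    return parents


data = [(1, 10, 20, [(1.5, 2.25, 7), (0.0, -3.0, 8)])]
blob = dump_parents(data)
restored = load_parents(blob)
```

Because every record has a known size, the reader never has to parse type metadata, which is where general-purpose serializers tend to spend their time.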

    Read the article

  • Accessing HTML DOM elements from javascript using `.childNodes`

    - by Martin
    I'm wondering about the .childNodes property, I have the code below, and for some reason I get 18 children, while 6 are HTMLInputElements as expected, and the rest are undefined. What is this about? Is there an efficient way to iterate over the input elements? <html> <head> <script> window.onload = function(e){ form = document.getElementById('myForm'); alert(form.childNodes.length); for(i=0; i<form.childNodes.length; i++){ alert(form[i]); } } </script> </head> <body> <form id='myForm' action="haha" method="post"> Name: <input type="text" id="fnameAdd" name="name" /><br /> Phone1: <input type="text" id="phone1Add" name="phone1" /><br /> Phone2: <input type="text" id="phone2Add" name="phone2" /><br /> E-Mail: <input type="text" id="emailAdd" name="email" /><br /> Address: <input type="text" id="addressAdd" name="address" /><br /> <input type="submit" value="Save" /> </body> </html>

    Read the article

  • How to call shared_ptr<boost::signal> from a vector in a loop?

    - by BTR
    I've got a working callback system that uses boost::signal. I'm extending it into a more flexible and efficient callback manager which uses a vector of shared_ptr's to my signals. I've been able to successfully create and add callbacks to the list, but I'm unclear as to how to actually execute the signals. ... // Signal aliases typedef boost::signal<void (float *, int32_t)> Callback; typedef std::shared_ptr<Callback> CallbackRef; // The callback list std::vector<CallbackRef> mCallbacks; // Adds a callback to the list template<typename T> void addCallback(void (T::* callbackFunction)(float * data, int32_t size), T * callbackObject) { CallbackRef mCallback = CallbackRef(new Callback()); mCallback->connect(boost::function<void (float *, int32_t)>(boost::bind(callbackFunction, callbackObject, _1, _2))); mCallbacks.push_back(mCallback); } // Pass the float array and its size to the callbacks void execute(float * data, int32_t size) { // Iterate through the callback list for (vector<CallbackRef>::iterator i = mCallbacks.begin(); i != mCallbacks.end(); ++i) { // What do I do here? // (* i)(data, size); // <-- Dereferencing doesn't work } } ... All of this code works. I'm just not sure how to run the call from within a shared_ptr from with a vector. Any help would be neat-o. Thanks, in advance.

    Read the article

  • What are provenly scalable data persistence solutions for consumer profiles?

    - by Hubbard
    Consumer profiles with analytical scores [ConsumerID, 1..n demographical variables, 1...n analytical scores e.g. "likely to churn" "likely to buy an item 100$ in worth" etc.] have to be possible to query fast if they are to be used in customizing web-sites, consumer communications etc. Well. If you have: Large number of consumers Large profiles with a huge set of variables (as profiles describing human behaviour are likely to be..) ...you are in trouble. If you really have a physical relational database to which you target a query and then a physical disk starts to rotate someplace to give you an individual profile or a set of profiles, the profile user (a web site customizing a page, a recommendation engine making a recommendation..) has died of boredom before getting any observable results. There is the possibility of having the profiles in memory, which would of course increase the performance hugely. What are the most proven solutions for a fast-response, scalable consumer profile storage? Is there a shootout of these someplace?

    Read the article

  • Which design pattern fits - strategy makes sense ?

    - by user554833
    --Bump *One desperate try to get someone's attention I have a simple database table that stores list of users who have subscribed to folders either by email OR to show up on the site (only on the web UI). In the storage table this is controlled by a number(1 - show on site 2- by email). When I am showing in UI I need to show a checkbox next to each of folders for which the user has subscribed (both email & on site). There is a separate table which stores a set of default subscriptions which would apply to each user if user has not expressed his subscription. This is basically a folder ID and a virtual group name. But, Email subscriptions do not count for applying these default groups. So if no "on site" subscription apply default group. Thats the rule. How about a strategy pattern here (Pseudo code) Interface ISubscription public ArrayList GetSubscriptionData(Pass query object) Public class SubscriptionWithDefaultGroup Implement ArrayList GetSubscriptionData(Pass query object) Public class SubscriptionWithoutDefaultGroup Implement ArrayList GetSubscriptionData(Pass query object) Public class SubscriptionOnlyDefaultGroup Implement ArrayList GetSubscriptionData(Pass query object) does this even make sense? I would be more than glad for receive any criticism / help / notes. I am learning. Cheers

    Read the article

  • Chrome Extension - Console Log not firing

    - by coffeemonitor
    I'm starting to learn to make my own Chrome Extensions, and starting small. At the moment, I'm switching from using the alert() function to console.log() for a cleaner development environment. For some reason, console.log() is not displaying in my chrome console logs. However, the alert() function is working just fine. Can someone review my code below and perhaps tell me why console.log() isn't firing as expected? manifest.json { "manifest_version": 2, "name": "Sandbox", "version": "0.2", "description": "My Chrome Extension Playground", "icons": { "16": "imgs/16x16.png", "24": "imgs/24x24.png", "32": "imgs/32x32.png", "48": "imgs/48x48.png" }, "background": { "scripts": ["js/background.js"] }, "browser_action": { "default_title": "My Fun Sandbox Environment", "default_icon": "imgs/16x16.png" }, "permissions": [ "background", "storage", "tabs", "http://*/*", "https://*/*" ] } js/background.js function click(e) { alert("this alert certainly shows"); console.log("But this does not"); } // Fire a function, when icon is clicked chrome.browserAction.onClicked.addListener(click); As you can see, I kept it very simple. Just the manifest.json and a background.js file with an event listener, if the icon in the toolbar is clicked. As I mentioned, the alert() is popping up nicely, while the console.log() appears to be ignored.

    Read the article

  • Coding the R-ight way - avoiding the for loop

    - by mropa
    I am going through one of my .R files and by cleaning it up a little bit I am trying to get more familiar with writing the code the r-ight way. As a beginner, one of my favorite starting points is to get rid of the for() loops and try to transform the expression into a functional programming form. So here is the scenario: I am assembling a bunch of data.frames into a list for later usage. dataList <- list (dataA, dataB, dataC, dataD, dataE ) Now I like to take a look at each data.frame's column names and substitute certain character strings. Eg I like to substitute each "foo" and "bar" with "baz". At the moment I am getting the job done with a for() loop which looks a bit awkward. colnames(dataList[[1]]) [1] "foo" "code" "lp15" "bar" "lh15" colnames(dataList[[2]]) [1] "a" "code" "lp50" "ls50" "foo" matchVec <- c("foo", "bar") for (i in seq(dataList)) { for (j in seq(matchVec)) { colnames (dataList[[i]])[grep(pattern=matchVec[j], x=colnames (dataList[[i]]))] <- c("baz") } } Since I am working here with a list I thought about the lapply function. My attempts handling the job with the lapply function all seem to look alright but only at first sight. If I write f <- function(i, xList) { gsub(pattern=c("foo"), replacement=c("baz"), x=colnames(xList[[i]])) } lapply(seq(dataList), f, xList=dataList) the last line prints out almost what I am looking for. However, if i take another look at the actual names of the data.frames in dataList: lapply (dataList, colnames) I see that no changes have been made to the initial character strings. So how can I rewrite the for() loop and transform it into a functional programming form? And how do I substitute both strings, "foo" and "bar", in an efficient way? Since the gsub() function takes as its pattern argument only a character vector of length one.

    Read the article

  • What are the linkage of the following functions?

    - by Derui Si
    When I was reading the c++ 03 standard (7.1.1 Storage class specifiers [dcl.stc]), there are some examples as below, I'm not able to tell how the linkage of each successive declarations is determined? Could anyone help here? Thanks in advance! static char* f(); // f() has internal linkage char* f() { /* ... */ } // f() still has internal linkage char* g(); // g() has external linkage static char* g() { /* ... */ } // error: inconsistent linkage void h(); inline void h(); // external linkage inline void l(); void l(); // external linkage inline void m(); extern void m(); // external linkage static void n(); inline void n(); // internal linkage static int a; // a has internal linkage int a; // error: two definitions static int b; // b has internal linkage extern int b; // b still has internal linkage int c; // c has external linkage static int c; // error: inconsistent linkage extern int d; // d has external linkage static int d; // error: inconsistent linkage UPD: Additionally, how can I understand the statement in the standard, " The linkages implied by successive declarations for a given entity shall agree. That is, within a given scope, each declaration declaring the same object name or the same overloading of a function name shall imply the same linkage. Each function in a given set of overloaded functions can have a different linkage, however."

    Read the article

  • PHP Arrays: Pop an array of single-element arrays into one array.

    - by Rob Drimmie
    Using a proprietary framework, I am frequently finding myself in the situation where I get a resultset from the database in the following format: array(5) { [0] => array(1) { ["id"] => int(241) } [1] => array(1) { ["id"] => int(2) } [2] => array(1) { ["id"] => int(81) } [3] => array(1) { ["id"] => int(560) } [4] => array(1) { ["id"] => int(10) } } I'd much rather have a single array of ids, such as: array(5) { [0] => int(241) [1] => int(2) [2] => int(81) [3] => int(560) [4] => int(10) } To get there, I frequently find myself writing: $justIds = array(); foreach( $allIds as $id ) { $justIds[] = $id["id"]; } Is there a more efficient way to do this?

    Read the article

  • Velocity CTP: can we 'search' for objects?

    - by Stato Machino
    It appears that 'tags' allow us to associate a 'search term' with the objects placed into the Velocity cache space. However, these can only be queried within a 'region'. Further, regions somehow limit the locality of objects in the cache to a single server (or maybe something kinda like that). So this appears to make it hard to perform any operation for which the unique Id of the cached item is not persisted or continuously available to the application that stores and retrieves objects to and from the cache. In any case, I can't see an easy way to 'cleanse' the cache of objects or to find objects across the entire cache that may share some prefix, postfix or infix values in the cache key so that i can clear out the cache of object repeatedly created in unit tests, for example. And I am unsure about the consequences of regions being associated with single server cache locations. So I would appreciate any help with the following questions: What is the difference between a 'distributed cache' (called a 'partitioned' cache??) when using regions, and a 'local cache'? 1.a. In particular, are the region-oriented values in a distributed cache visible through a cache factory that is configured to 'see' the entire cache space? Are the operations of creating and removing 'regions' efficient enough that it would be reasonable to create a region and a group of tags for each bundle of objects that need to be cached? 2.a. Or does this just push the problem of scoping the 'search for objects' up the chain because the ability of the DataCache object to query down through regions and tags as limited as querying for the cache keys of objects themselves. Thanks, Stato

    Read the article

  • Top n items in a List ( including duplicates )

    - by Krishnan
    Trying to find an efficient way to obtain the top N items in a very large list, possibly containing duplicates. I first tried sorting and slicing, which works. But this seems unnecessary: you shouldn't need to sort a very large list if you just want the top 20 members. So I wrote a recursive routine which builds the top-n list. This also works, but is very much slower than the non-recursive one! Question: Why is my second routine (elite2) so much slower than elite, and how do I make it faster? My code is attached below. Thanks.

    import scala.collection.SeqView
    import scala.math.min

    object X {
      def elite(s: SeqView[Int, List[Int]], k: Int): List[Int] = {
        s.sorted.reverse.force.slice(0, min(k, s.size))
      }

      def elite2(s: SeqView[Int, List[Int]], k: Int, s2: List[Int] = Nil): List[Int] = {
        if (k == 0 || s.size == 0) s2.reverse
        else {
          val m = s.max
          val parts = s.force.partition(_ == m)
          val whole = if (parts._1.size > 1) parts._1.tail ::: parts._2 else parts._2
          elite2(whole.view, k - 1, m :: s2)
        }
      }

      def main(args: Array[String]) = {
        val N = 1000000 / 3
        val x = List(N to 1 by -1).flatten.map(x => List(x, x, x)).flatten.view
        println(elite2(x, 20))
        println(elite(x, 20))
      }
    }
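
Reading the code above, each step of elite2 scans the whole list for its maximum and then partitions it into two new lists, so k steps cost roughly k full passes plus k rounds of allocation. A bounded heap avoids both: it keeps only the k best values seen so far during a single pass. The sketch below shows that technique in Python (not Scala) using the standard `heapq.nlargest`; the `elite` name is reused only for readability.

```python
import heapq


def elite(values, k):
    """Return the k largest values in descending order, duplicates included."""
    return heapq.nlargest(k, values)


# 1, 1, 1, 2, 2, 2, ..., 10, 10, 10
data = [x for x in range(1, 11) for _ in range(3)]
top = elite(data, 5)
```

Asking for more items than the list holds simply returns the whole list sorted in descending order.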

    Read the article

  • Database (MySQL) structuring: pros and cons of multiple tables

    - by Gideon
    I am collecting data and storing it in MySQL for 75 variables and 55 countries, each year. At this stage, since I am still building the tool, I have created a single table of variables / countries (storing 1 year's worth of data). Next year (and for several years after that) a new set of data will be input for each country. There are therefore 3 parameters controlling the data returned to a user reviewing all collected data. The general form of any query would be: show me these specific variables, for these specific countries, for these specific years. (Show me average age and weight, for USA and Canada, for 2012 and 2009, for example.) My question is this. It seems that I have two options for arranging the data: - Multiple tables, where I create a table of country / variable for each year data is collected. - A single table, where I simply add a column (field) for the year the data relates to. As far as I can tell I could make these database calls with either structure, but is one more powerful / efficient / quicker, and why? Thanks for your consideration. It's a PDO / PHP interface, if that is relevant.
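
To make the single-table option concrete, here is a small sketch using Python's built-in `sqlite3` module in place of MySQL and PDO. It goes one step further than the question and stores one row per (country, variable, year), which is an assumption of this sketch rather than the asker's layout; the table name, column names and all values are invented placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE observation (
        country  TEXT    NOT NULL,
        variable TEXT    NOT NULL,
        year     INTEGER NOT NULL,
        value    REAL,
        PRIMARY KEY (country, variable, year)
    )
    """
)

# Placeholder rows, not real statistics.
rows = [
    ("USA", "avg_age", 2009, 36.5),
    ("USA", "avg_age", 2012, 37.0),
    ("USA", "avg_weight", 2012, 80.0),
    ("Canada", "avg_age", 2012, 40.0),
    ("Canada", "avg_age", 2010, 39.5),
    ("Mexico", "avg_age", 2012, 27.0),
    ("USA", "gdp", 2012, 1.0),
]
conn.executemany("INSERT INTO observation VALUES (?, ?, ?, ?)", rows)


def fetch(conn, variables, countries, years):
    """Return rows for the given variables, countries and years using bound parameters."""

    def marks(seq):
        return ",".join("?" * len(seq))

    sql = (
        "SELECT country, variable, year, value FROM observation "
        f"WHERE variable IN ({marks(variables)}) "
        f"AND country IN ({marks(countries)}) "
        f"AND year IN ({marks(years)}) "
        "ORDER BY country, variable, year"
    )
    return conn.execute(sql, [*variables, *countries, *years]).fetchall()


result = fetch(conn, ["avg_age", "avg_weight"], ["USA", "Canada"], [2009, 2012])
```

With the year as a column, adding a new year is an INSERT rather than a schema change, and a query spanning several years needs no UNION across per-year tables.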

    Read the article
