large objects - Page 44

Send large JSON data to WCF Rest Service

- by Christo Fur

Hi I have a client web page that is sending a large json object to a proxy service on the same domain as the web page. The proxy (an ashx handler) then forwards the request to a WCF Rest Service. Using a WebClient object (standard .net object for making a http request) The JSON successfully arrives at the proxy via a jQuery POST on the client webpage. However, when the proxy forwards this to the WCF service I get a Bad Request - Error 400 This doesn't happen when the size of the json data is small The WCF service contract looks like this [WebInvoke(Method = "POST", BodyStyle = WebMessageBodyStyle.Wrapped, RequestFormat = WebMessageFormat.Json, ResponseFormat = WebMessageFormat.Json)] [OperationContract] CarConfiguration CreateConfiguration(CarConfiguration configuration); And the DataContract like this [DataContract(Namespace = "")] public class CarConfiguration { [DataMember(Order = 1)] public int CarConfigurationId { get; set; } [DataMember(Order = 2)] public int UserId { get; set; } [DataMember(Order = 3)] public string Model { get; set; } [DataMember(Order = 4)] public string Colour { get; set; } [DataMember(Order = 5)] public string Trim { get; set; } [DataMember(Order = 6)] public string ThumbnailByteData { get; set; } [DataMember(Order = 6)] public string Wheel { get; set; } [DataMember(Order = 7)] public DateTime Date { get; set; } [DataMember(Order = 8)] public List<string> Accessories { get; set; } [DataMember(Order = 9)] public string Vehicle { get; set; } [DataMember(Order = 10)] public Decimal Price { get; set; } } When the ThumbnailByteData field is small, all is OK. When it is large I get the 400 error What are my options here? I've tried increasing the MaxBytesRecived config setting but that is not enough Any ideas?

Read the article

Text mining on large database (data mining)

- by yox

Hello, I have a large database of resumes (CV), and a certain table skills grouping all users skills. inside that table there's a field skill_text that describes the skill in full text. I'm looking for an algorithm/software/method to extract significant terms/phrases from that table in order to build a new table with standarized skills.. Here are some examples skills extracted from the DB : Sectoral and competitive analysis Business Development (incl. in international settings) Specific structure and road design software - Microstation, Macao, AutoCAD (basic knowledge) Creative work (Photoshop, In-Design, Illustrator) checking and reporting back on campaign progress organising and attending events and exhibitions Development : Aptana Studio, PHP, HTML, CSS, JavaScript, SQL, AJAX Discipline: One to one marketing, E-marketing (SEO & SEA, display, emailing, affiliate program) Mix marketing, Viral Marketing, Social network marketing. The output shoud be something like : Sectoral and competitive analysis Business Development Specific structure and road design software - Macao AutoCAD Photoshop In-Design Illustrator organising events Development Aptana Studio PHP HTML CSS JavaScript SQL AJAX Mix marketing Viral Marketing Social network marketing emailing SEO One to one marketing As you see only skills remains no other representation text. I know this is possible using text mining technics but how to do it ? the database is realy large.. it's a good thing because we can calculate text frequency and decide if it's a real skill or just meaningless text... The big problem is .. how to determin that "blablabla" is a skill ? thanks

Read the article

PostgreSQL: BYTEA vs OID+Large Object?

- by mlaverd

I started an application with Hibernate 3.2 and PostgreSQL 8.4. I have some byte[] fields that were mapped as @Basic (= PG bytea) and others that got mapped as @Lob (=PG Large Object). Why the inconsistency? Because I was a Hibernate noob. Now, those fields are max 4 Kb (but average is 2-3 kb). The PostgreSQL documentation mentioned that the LOs are good when the fields are big, but I didn't see what 'big' meant. I have upgraded to PostgreSQL 9.0 with Hibernate 3.6 and I was stuck to change the annotation to @Type(type="org.hibernate.type.PrimitiveByteArrayBlobType"). This bug has brought forward a potential compatibility issue, and I eventually found out that Large Objects are a pain to deal with, compared to a normal field. So I am thinking of changing all of it to bytea. But I am concerned that bytea fields are encoded in Hex, so there is some overhead in encoding and decoding, and this would hurt the performance. Are there good benchmarks about the performance of both of these? Anybody has made the switch and saw a difference?

Read the article

How to protect copyright on large web applications?

- by Saif Bechan

Recently I have read about Myows, described as "the universal copyright management and protection app for smart creatives". It is used to protect copyright and more. Currently I am working on a large web application which is in late testing phase. Because of the complexity of the app there are not many versions of it online so copyright will be a huge issue for me as much of the code is in JavaScript and is easy copyable. I was glad to see that there is some company out there that provide such services, and naturally I wanted to know if there were people using it. I did not know that this type of concept was so new. Is protecting copyright a good idea for a large web application? If so, do you think Myows will be worth using, or are there better ways to achieve that? Update: Wow, there is no better person to have answered this question like the creator himself. There were some nice points made in the answer, and I think it will be a good service for people like me. In the next couple of weeks I will be looking further into the subject and start uploading my code and see how it works out. I will leave this question up because I do want some more suggestions on this topic.

Read the article

SqlLite/Fluent NHibernate integration test harness initialization not repeatable after large data se

- by Mark Rogers

In one of my main data integration test harnesses I create and use Fluent NHibernate's SingleConnectionSessionSourceForSQLiteInMemoryTesting, to get a fresh session for each test. After each test, I close the connection, session, and session factory, and throw out the nested StructureMap container they came from. This works for almost any simple data integration test I can think, including ones that utilize Fluent NHib's PersistenceSpecification object. When I test the application's lengthy database bootstrapping process, which creates and saves thousands of domain objects, I start seeing issues. It's not that the setup and tear down fails, in fact, the test successfully bootstraps the in-memory database as the application would bootstrap the real database in the production environment. The problem occurs when the database bootstrapping occurs a second time on a new in-memory database, with a new session and session factory. The error is: NHibernate.StaleStateException : Unexpected row count: 0; expected: 1 The row count is indeed Unexpected, the row that the application under test is looking for should be in the session. You see, it's not that any data from the last integration test is sticking around, it's that for some reason the session just stops working mid-database-boostrap. And I've looked everywhere for a place I might be holding on to an old session and I can't find one. I've searched through the code for static singleton objects, but there are none anywhere near the code in question. I have a couple StructureMap InstanceScope singleton's but they are getting thrown out with each nested container that is lost after every test teardown. I've tried every possible variation on disposing and closing every object involved with each test teardown and it still fails on this lengthy database bootstrap. But non-bootstrap related database tests appear to work fine. I'm starting to run out of options and may have to surrender lengthy database integration tests in favor of WatiN-based acceptance tests. Can anyone give me any clue about how I can figure out why some of my SingleConnectionSessionSourceForSQLiteInMemoryTesting aren't repeatable? Any advice at all, about how to make an NHibernate SqlLite database integration test harness repeatable?

Read the article

Qt/C++, Problems with large QImage

- by David Günzel

I'm pretty new to C++/Qt and I'm trying to create an application with Visual Studio C++ and Qt (4.8.3). The application displays images using a QGraphicsView, I need to change the images at pixel level. The basic code is (simplified): QImage* img = new QImage(img_width,img_height,QImage::Format_RGB32); while(do_some_stuff) { img->setPixel(x,y,color); } QGraphicsPixmapItem* pm = new QGraphicsPixmapItem(QPixmap::fromImage(*img)); QGraphicsScene* sc = new QGraphicsScene; sc->setSceneRect(0,0,img->width(),img->height()); sc->addItem(pm); ui.graphicsView->setScene(sc); This works well for images up to around 12000x6000 pixel. The weird thing happens beyond this size. When I set img_width=16000 and img_height=8000, for example, the line img = new QImage(...) returns a null image. The image data should be around 512,000,000 bytes, so it shouldn't be too large, even on a 32 bit system. Also, my machine (Win 7 64bit, 8 GB RAM) should be capable of holding the data. I've also tried this version: uchar* imgbuf = (uchar*) malloc(img_width*img_height*4); QImage* img = new QImage(imgbuf,img_width,img_height,QImage::Format_RGB32); At first, this works. The img pointer is valid and calling img-width() for example returns the correct image width (instead of 0, in case the image pointer is null). But as soon as I call img-setPixel(), the pointer becomes null and img-width() returns 0. So what am I doing wrong? Or is there a better way of modifying large images on pixel level? Regards, David

Read the article

Looking for advice on importing large dataset in sqlite and Cocoa/Objective-C

- by jluckyiv

I have a fairly large hierarchical dataset I'm importing. The total size of the database after import is about 270MB in sqlite. My current method works, but I know I'm hogging memory as I do it. For instance, if I run with Zombies, my system freezes up (although it will execute just fine if I don't use that Instrument). I was hoping for some algorithm advice. I have three hierarchical tables comprising about 400,000 records. The highest level has about 30 records, the next has about 20,000, the last has the balance. Right now, I'm using nested for loops to import. I know I'm creating an unreasonably large object graph, but I'm also looking to serialize to JSON or XML because I want to break up the records into downloadable chunks for the end user to import a la carte. I have the code written to do the serialization, but I'm wondering if I can serialize the object graph if I only have pieces in memory. Here's pseudocode showing the basic process for sqlite import. I left out the unnecessary detail. [database open]; [database beginTransaction]; NSArray *firstLevels = [[FirstLevel fetchFromURL:url retain]; for (FirstLevel *firstLevel in firstLevels) { [firstLevel save]; int id1 = [firstLevel primaryKey]; NSArray *secondLevels = [[SecondLevel fetchFromURL:url] retain]; for (SecondLevel *secondLevel in secondLevels) { [secondLevel saveWithForeignKey:id1]; int id2 = [secondLevel primaryKey]; NSArray *thirdLevels = [[ThirdLevel fetchFromURL:url] retain]; for (ThirdLevel *thirdLevel in thirdLevels) { [thirdLevel saveWithForeignKey:id2]; } [database commit]; [database beginTransaction]; [thirdLevels release]; } [secondLevels release]; } [database commit]; [database release]; [firstLevels release];

Read the article

Can a large transaction log cause cpu hikes to occur

- by Simon Rigby

Hello all, I have a client with a very large database on Sql Server 2005. The total space allocated to the db is 15Gb with roughly 5Gb to the db and 10 Gb to the transaction log. Just recently a web application that is connecting to that db is timing out. I have traced the actions on the web page and examined the queries that execute whilst these web operation are performed. There is nothing untoward in the execution plan. The query itself used multiple joins but completes very quickly. However, the db server's CPU hikes to 100% for a few seconds. The issue occurs when several simultaneous users are working on the system (when I say multiple .. read about 5). Under this timeouts start to occur. I suppose my question is, can a large transaction log cause issues with CPU performance? There is about 12Gb of free space on the disk currently. The configuration is a little out of my hands but the db and log are both on the same physical disk. I appreciate that the log file is massive and needs attending to, but I'm just looking for a heads up as to whether this may cause CPU spikes (ie trying to find the correlation). The timeouts are a recent thing and this app has been responsive for a few years (ie its a recent manifestation). Many Thanks,

Read the article

Discover periodic patterns in a large data-set

- by Miner

I have a large sequence of tuples on disk in the form (t1, k1) (t2, k2) ... (tn, kn) ti is a monotonically increasing timestamp and ki is a key (assume a fixed length string if needed). Neither ti nor ki are guaranteed to be unique. However, the number of unique tis and kis is huge (millions). n itself is very large (100 Million+) and the size of k (approx 500 bytes) makes it impossible to store everything in memory. I would like to find out periodic occurrences of keys in this sequence. For example, if I have the sequence (1, a) (2, b) (3, c) (4, b) (5, a) (6, b) (7, d) (8, b) (9, a) (10, b) The algorithm should emit (a, 4) and (b, 2). That is a occurs with a period of 4 and b occurs with a period of 2. If I build a hash of all keys and store the average of the difference between consecutive timestamps of each key and a std deviation of the same, I might be able to make a pass, and report only the ones that have an acceptable std deviation(ideally, 0). However, it requires one bucket per unique key, whereas in practice, I might have very few really periodic patterns. Any better ways?

Read the article

Efficient way to create a large number of SharePoint folders

- by BeraCim

Hi all: I'm currently creating a large number of SharePoint folders within a list (e.g. ~800 folders), with each folder containing a different number of items. The way it is currently done is that it programmatically reads off the content types, items, event listeners and the likes off the same folder from another web, then creates the same folder in the current web. That ran reasonably fine and fast on a dev environment. However when it goes to an environment with WFEs and farms, it slowed down a lot. I have checked that there are no leaks in the code, and that the code follows SharePoint coding best practices. At the moment I'm looking at it at the code level. From your experience, are there any efficient ways of creating a large number of SharePoint folders, lists and items? EDIT: I'm currently using SharePoint API, but will be looking at moving to using Web Service in the future. I'm interested in looking at both options though. Code wise, its just the general reading of a folder and its content types plus items and their details, then create the same folder in the same list with the same content types, then copy over the items using patch update. I want to know whether there are more efficient ways of doing the above. Thanks.

Read the article

Windows.Forms RichTextBox Control - Avoid inserting large data.

- by SchlaWiener

I have a Windows Form with a RichTextBox on it. The content of the RichTextBox is written to a database field that ist limited to 64k data. For my purpose that is way more than enough text to store. I have set the MaxLength property to avoid insertng more data than allowed. rtcControl.MaxLength = 65536 Howevery, that only restricts the amount of characters that so is allowed to put in the text. But with the formatting overhead from the Rtf I can type more text than I should be allowed to. It even get's worse if I insert a large image, which dosn't increase the TextLength at all but the Rtf Length grows quite a lot. At the moment I check the Length of the richttextboxes' Rtf property in the FormClosing event and display a message to the user if it's to large. However that is just a workaround because I want to disallow putting more data than allowed into the control (like in a textbox if you exceed the MaxLength property nothing is inserted into the control and you hear the default beep(). Any ideas how to achive this? I already tried: using a custom control which extends the richtextbox and shadows th Rtf property to intercept the insertation. But it seems it isn't executed if I add text. Even the TextChanged Event does not fire if I type smth. in the control.

Read the article

Migrating a Large amount of data from old publishing site to new site

- by tommizzle

Hi, I am currently in the process of creating a new news/publishing site on the Movable Type platform. There are around 20 or so sites with 20,000+ rows of data to be moved/aggregated to ~8 sites (we have a number of location specific sites and are going to aggregate the content from these into 1 single site for each niche). We have discussed how to do this and came to the conclusion that it would probably be better to hire somebody to do it (I could probably do it, but i'm limited on time and am sure that a specialist would be more efficient). So my questions to you guys are: 1) What kind of skill set should we look for in an applicant? 2) There will be a large amount of input from our side... is getting somebody to work remotely out of the question? 3) How long would a task like this traditionally take (I know this question is very subjective, but an estimation would be awesome)? 4) Do you have any recommendations for firms who would be able to take on a large task like this? Thanks in advance, Tom

Read the article

Returning large collections from WCF Serivce

- by Nate Bross

I'm trying to determine the best approach for building a WCF Service, and the area I'm struggling with most is returning lists of objects. The built-in maxMessageSize of 64k seems pretty high, and I really don't want to bump it up (quick googling finds 100s of places bumping the maxMessageSize up to multi-gigabyte range which seems foolish). But, when I'm returning a collection of objects (~150 items) I am exceeding the default 64k. I'm almost to the point of returning my own class which inherits IEnumerable and has properties for hasNext, hasPrevious and PageSize so that I can implement paging on the client side -- this seems like alot of code. The other option is to jackup the maxMessageSize and hope for the best, but that feels wrong. All other aspects of my service are working great, its just returning large collectiosn where I'm having issues. For background, there are two types of consumers of this service, UI applications which will be primarly web and/or wpf applications, and data processing applications, .NET console apps, and maybe some other non-UI apps. For the UI applications, I would like to keep them responsive and keep the messageSize low, on the console apps it doesn't matter as much as they are just pulling data down to do processing and push it back up to the service.

Read the article

Memorystream and Large Object Heap

- by Flo

I have to transfer large files between computers on via unreliable connections using WCF. Because I want to be able to resume the file and I don't want to be limited in my filesize by WCF, I am chunking the files into 1MB pieces. These "chunk" are transported as stream. Which works quite nice, so far. My steps are: open filestream read chunk from file into byet[] and create memorystream transfer chunk back to 2. until the whole file is sent My problem is in step 2. I assume that when I create a memory stream from a byte array, it will end up on the LOH and ultimately cause an outofmemory exception. I could not actually create this error, maybe I am wrong in my assumption. Now, I don't want to send the byte[] in the message, as WCF will tell me the array size is too big. I can change the max allowed array size and/or the size of my chunk, but I hope there is another solution. My actual question(s): Will my current solution create objects on the LOH and will that cause me problem? Is there a better way to solve this? Btw.: On the receiving side I simple read smaller chunks from the arriving stream and write them directly into the file, so no large byte arrays involved.

Read the article

incremental way of counting quantiles for large set of data

- by Gacek

I need to count the quantiles for a large set of data. Let's assume we can get the data only through some portions (i.e. one row of a large matrix). To count the Q3 quantile one need to get all the portions of the data and store it somewhere, then sort it and count the quantile: List<double> allData = new List<double>(); foreach(var row in matrix) // this is only example. In fact the portions of data are not rows of some matrix { allData.AddRange(row); } allData.Sort(); double p = 0.75*allData.Count; int idQ3 = (int)Math.Ceiling(p) - 1; double Q3 = allData[idQ3]; Now, I would like to find a way of counting this without storing the data in some separate variable. The best solution would be to count some parameters od mid-results for first row and then adjust it step by step for next rows. Note: These datasets are really big (ca 5000 elements in each row) The Q3 can be estimated, it doesn't have to be an exact value. I call the portions of data "rows", but they can have different leghts! Usually it varies not so much (+/- few hundred samples) but it varies! This question is similar to this one: http://stackoverflow.com/questions/1058813/on-line-iterator-algorithms-for-estimating-statistical-median-mode-skewness But I need to count quantiles. ALso there are few articles in this topic, i.e.: http://web.cs.wpi.edu/~hofri/medsel.pdf http://portal.acm.org/citation.cfm?id=347195&dl But before I would try to implement these, I wanted to ask you if there are maybe any other, qucker ways of counting the 0.25/0.75 quantiles?

Read the article

Finding ALL positions of a substring in a large string in C#

- by Tommy

Alright, so what i have, is a large string i need to parse, and what i need to happen, is find all the instances of extract"(me,i-have lots. of]punctuation, and store them to a list. So say this piece of string was in the beginning and middle of the larger string, both of them would be found, and their index's would be added to the List. and the List would contain 0 and the other index whatever it would be. Ive been playing around, and the string.IndexOf does almost what i'm looking for, and ive written some code. But i cant seem to get it to work: List<int> inst = new List<int>(); int index = 0; while (index < source.LastIndexOf("extract\"(me,i-have lots. of]punctuation", 0) + 39) { int src = source.IndexOf("extract\"(me,i-have lots. of]punctuation", index); inst.Add(src); index = src + 40; } inst = The list source = The large string Any better ideas?

Read the article

Loading/Displaying large amount of data on webpage.

- by jb

I have a webpage which contains a table for displaying a large amount of data (on average from 2,000 to 10,000 rows). This page takes a long time to load/render. Which is understandable. The problem is, while the page is loading the PCs memory usage skyrockets (500mb on my test system is in use by iexplorer) and the whole PC grinds to a halt until it has finished, which can take a minute or two. IE hangs until it is complete, switching to another running program is the same. I need to fix this - and ideally i want to accomplish 2 things: 1) Load individual parts of the page seperately. So the page can render initially without the large data table. A loading div will be placed there until it is ready. 2) Dont use up so much memory or local resources while rendering - so at least they can use a different tab/application at the same time. How would I go about doing both or either of these? I'm an applications programmer by trade so i am still a little fizzy on the things I can do in a web environment. Cheers all.

Read the article

how to get large pictures from photo album via facebook graph api

- by Nav

I am currently using the following code to retrieve all the photos from a user profile FB.api('/me/albums?fields=id,name', function(response) { //console.log(response.data.length); for (var i = 0; i < response.data.length; i++) { var album = response.data[i]; FB.api('/' + album.id + '/photos', function(photos) { if (photos && photos.data && photos.data.length) { for (var j = 0; j < photos.data.length; j++) { var photo = photos.data[j]; // photo.picture contain the link to picture var image = document.createElement('img'); image.src = photo.picture; document.body.appendChild(image); image.className = "border"; image.onclick = function() { //this.parentNode.removeChild(this); document.getElementById(info.id).src = this.src; document.getElementById(info.id).style.width = "220px"; document.getElementById(info.id).style.height = "126px"; }; } } }); } }); but, the photos that it returns are of poor quality and are basically like thumbnails.How to retrieve larger photos from the albums. Generally for profile picture I use ?type=large which returns a decent profile image but the type=large is not working in the case of photos from photo albums and also is there a way to get the photos in zip format from facebook once I specify the photo url as I want users to be able to download the photos.

Read the article

Including/Organzing HTML in large javascript project

- by Bill Zimmerman

Hi, I've a got a fairly large web app, with several mini applets on each page. These applets are almost always identical jquery apps. I am looking for advice on how I should organize/include smaller parts of these jquery apps within my larger project. For example, each app has several independent tabs. If possible, I would like to store each of the tabs as a seperate .html file because this makes development easier. My requirements are: 1) All of the html 'tabs' are loaded on the clients end when the page loads. I would like to avoid any delays by dynamically requesting the tab html. 2) If possible, I would like to minimize the raw data sent. For example, it would be preferable to send each tab 1 time, instead of sending each tab 10 times if there are ten applets on that page. Questions: 1) What are my options for 'including' the HTML files / javascript code 2) Any tips for keeping my development simple in this situation? Surely there has to be a better way than just editing one massive html file when working with large pages.

Read the article

PHP: problem rendering large images (error 321)

- by JP19

Hi ... its me again with a php problem :) Following is part of my PHP script which is rendering JPEG images. ... $tf=$requested_file; $image_type="jpeg"; header("Content-type: image/${image_type}"); $CMD="\$image=imagecreatefrom${image_type}('$tf'); image${image_type}(\$image);"; eval($CMD); exit; ... There is no syntactical error, because above code is working fine for small images, but for large images, it gives: Error 321 (net::ERR_INVALID_CHUNKED_ENCODING): Unknown error. in the browser. To be sure, I created two images using imagemagick from same source image - one resized to 10% of original and other 90%. http://mostpopularsports.net/images/misc/ttt10.jpg works http://mostpopularsports.net/images/misc/ttt90.jpg gives Error 301 in the browser. There is a related question with solution posted by OP here Error writing content through Apache. but I cannot understand how to make the fix. Can someome help me with it? I have looked at the headers in Chrome. For the first request, everything is fine. For the second request - the request headers are all garbled. Both images are jpeg (as they are created from imagemagick. But still to be sure I checked): misc/ttt10.jpg: JPEG image data, JFIF standard 1.01 misc/ttt90.jpg: JPEG image data, JFIF standard 1.01 Finally, the way I fixed is, remove the Transfer-Encoding: chunked header from the response. [This header was sent by apache only when the data was large enough]. (I had an internal proxy, so did it in the proxy script - otherwise one may need to do it in apache settings). There were some good answers and I have selected the one that helped me solve the problem best. thanks JP

Read the article

Storing large numbers of varying size objects on disk

- by Foredecker

I need to develop a system for storing large numbers (10's to 100's of thousands) of objects. Each object is email-like - there is a main text body, and several ancillary text fields of limited size. A body will be from a few bytes, to several KB in size. Each item will have a single unique ID (probably a GUID) that identifies it. The store will only be written to when an object is added to it. It will be read often. Deletions will be rare. The data is almost all human readable text so it will be readily compressible. A system that lets me issue the I/Os and mange the memory and caching would be ideal. I'm going to keep the indexes in memory, using it to map indexes to the single (and primary) key for the objects. Once I have the key, then I'll load it from disk, or the cache. The data management system needs to be part of my application - I do not want to depend on OS services. Or separately installed packages. Native (C++) would be best, but a manged (C#) thing would be ok. I believe that a database is an obvious choice, but this needs to be super-fast for look up and loading into memory of an object. I am not experienced with data base tech and I'm concerned that general relational systems will not handle all this variable sized data efficiently. (Note, this has nothing to do with my job - its a personal project.) In your experience, what are the viable alternatives to a traditional relational DB? Or would a DB work well for this?

Read the article

Large free block of english non-pronoun text

- by Tom

As part of teaching myself python I've written a script which allows a user to play hangman. At the moment, the hangman word to be guessed is simply entered manually at the start of the script's code. I want instead for the script to choose randomly from a large list of english words. This I know how to do - my problem is finding that list of words to work from in the first place. Does anyone know of a source on the net for, say, 1000 common english words where they can be downloaded as a block of text or something similar that I can work with? (My initial thought was grabbing a chunk of a novel from project gutenburg [this project is only for my own amusement and won't be available anywhere else so copyright etc doesn't matter hugely to me btw], but anything like that is likely to contain too many names or non-standard words that wouldn't be suitable for hangman. I need text that only has words legal for use in scrabble, basically). It's a slightly odd question for here I suppose, but actually I thought the answer might be of use not just to me but anyone else working on a project for a wordgame or similar that needs a large seed list of words to work from. Many thanks for any links or suggestions :)

Read the article

Dealing with large number of text strings

- by Fadrian

My project when it is running, will collect a large number of string text block (about 20K and largest I have seen is about 200K of them) in short span of time and store them in a relational database. Each of the string text is relatively small and the average would be about 15 short lines (about 300 characters). The current implementation is in C# (VS2008), .NET 3.5 and backend DBMS is Ms. SQL Server 2005 Performance and storage are both important concern of the project, but the priority will be performance first, then storage. I am looking for answers to these: Should I compress the text before storing them in DB? or let SQL Server worry about compacting the storage? Do you know what will be the best compression algorithm/library to use for this context that gives me the best performance? Currently I just use the standard GZip in .NET framework Do you know any best practices to deal with this? I welcome outside the box suggestions as long as it is implementable in .NET framework? (it is a big project and this requirements is only a small part of it) EDITED: I will keep adding to this to clarify points raised I don't need text indexing or searching on these text. I just need to be able to retrieve them in later stage for display as a text block using its primary key. I have a working solution implemented as above and SQL Server has no issue at all handling it. This program will run quite often and need to work with large data context so you can imagine the size will grow very rapidly hence every optimization I can do will help.

Read the article

find the difference between two very large list

- by user157195

I have two large list(could be a hundred million items), the source of each list can be either from a database table or a flat file. both lists are of comparable sizes, both unsorted. I need to find the difference between them. so I have 3 scenarios: 1. List1 is a database table(assume each row simply have one item(key) that is a string), List2 is a large file. 2. Both lists are from 2 db tables. 3. both lists are from two files. in case 2, I plan to use: select a.item from MyTable a where a.item not in (select b.item form MyTable b) this clearly is inefficient, is there a better way? Another approach is: I plan to sort each list, and then walk down both of them to find the diff. If the list is from a file, I have to read it into a db table first, then use db sorting to output the list. Is the run time complexity still O(nlogn) in db sorting? either approach is a pain and seems would be very slow when the list involved has hundreds of millions of items. any suggestions?

Read the article

Entity Framework 4.0 and DDD patterns

- by Voice

Hi everybody I use EntityFramework as ORM and I have simple POCO Domain Model with two base classes that represent Value Object and Entity Object Patterns (Evans). These two patterns is all about equality of two objects, so I overrode Equals and GetHashCode methods. Here are these two classes: public abstract class EntityObject<T>{ protected T _ID = default(T); public T ID { get { return _ID; } protected set { _ID = value; } } public sealed override bool Equals(object obj) { EntityObject<T> compareTo = obj as EntityObject<T>; return (compareTo != null) && ((HasSameNonDefaultIdAs(compareTo) || (IsTransient && compareTo.IsTransient)) && HasSameBusinessSignatureAs(compareTo)); } public virtual void MakeTransient() { _ID = default(T); } public bool IsTransient { get { return _ID == null || _ID.Equals(default(T)); } } public override int GetHashCode() { if (default(T).Equals(_ID)) return 0; return _ID.GetHashCode(); } private bool HasSameBusinessSignatureAs(EntityObject<T> compareTo) { return ToString().Equals(compareTo.ToString()); } private bool HasSameNonDefaultIdAs(EntityObject<T> compareTo) { return (_ID != null && !_ID.Equals(default(T))) && (compareTo._ID != null && !compareTo._ID.Equals(default(T))) && _ID.Equals(compareTo._ID); } public override string ToString() { StringBuilder str = new StringBuilder(); str.Append(" Class: ").Append(GetType().FullName); if (!IsTransient) str.Append(" ID: " + _ID); return str.ToString(); } } public abstract class ValueObject<T, U> : IEquatable<T> where T : ValueObject<T, U> { private static List<PropertyInfo> Properties { get; set; } private static Func<ValueObject<T, U>, PropertyInfo, object[], object> _GetPropValue; static ValueObject() { Properties = new List<PropertyInfo>(); var propParam = Expression.Parameter(typeof(PropertyInfo), "propParam"); var target = Expression.Parameter(typeof(ValueObject<T, U>), "target"); var indexPar = Expression.Parameter(typeof(object[]), "indexPar"); var call = Expression.Call(propParam, typeof(PropertyInfo).GetMethod("GetValue", new[] { typeof(object), typeof(object[]) }), new[] { target, indexPar }); var lambda = Expression.Lambda<Func<ValueObject<T, U>, PropertyInfo, object[], object>>(call, target, propParam, indexPar); _GetPropValue = lambda.Compile(); } public U ID { get; protected set; } public override Boolean Equals(Object obj) { if (ReferenceEquals(null, obj)) return false; if (obj.GetType() != GetType()) return false; return Equals(obj as T); } public Boolean Equals(T other) { if (ReferenceEquals(null, other)) return false; if (ReferenceEquals(this, other)) return true; foreach (var property in Properties) { var oneValue = _GetPropValue(this, property, null); var otherValue = _GetPropValue(other, property, null); if (null == oneValue && null == otherValue) return false; if (false == oneValue.Equals(otherValue)) return false; } return true; } public override Int32 GetHashCode() { var hashCode = 36; foreach (var property in Properties) { var propertyValue = _GetPropValue(this, property, null); if (null == propertyValue) continue; hashCode = hashCode ^ propertyValue.GetHashCode(); } return hashCode; } public override String ToString() { var stringBuilder = new StringBuilder(); foreach (var property in Properties) { var propertyValue = _GetPropValue(this, property, null); if (null == propertyValue) continue; stringBuilder.Append(propertyValue.ToString()); } return stringBuilder.ToString(); } protected static void RegisterProperty(Expression<Func<T, Object>> expression) { MemberExpression memberExpression; if (ExpressionType.Convert == expression.Body.NodeType) { var body = (UnaryExpression)expression.Body; memberExpression = body.Operand as MemberExpression; } else memberExpression = expression.Body as MemberExpression; if (null == memberExpression) throw new InvalidOperationException("InvalidMemberExpression"); Properties.Add(memberExpression.Member as PropertyInfo); } } Everything was OK until I tried to delete some related objects (aggregate root object with two dependent objects which was marked for cascade deletion): I've got an exception "The relationship could not be changed because one or more of the foreign-key properties is non-nullable". I googled this and found http://blog.abodit.com/2010/05/the-relationship-could-not-be-changed-because-one-or-more-of-the-foreign-key-properties-is-non-nullable/ I changed GetHashCode to base.GetHashCode() and error disappeared. But now it breaks all my code: I can't override GetHashCode for my POCO objects = I can't override Equals = I can't implement Value Object and Entity Object patters for my POCO objects. So, I appreciate any solutions, workarounds here etc.

Search Results

Search found 21702 results on 869 pages for 'large objects'.

Page 44/869 | < Previous Page | 40 41 42 43 44 45 46 47 48 49 50 51 | Next Page >

- by Christo Fur

- by yox

- by mlaverd

- by Saif Bechan

- by Mark Rogers

- by David Günzel

- by jluckyiv

- by Simon Rigby

- by Miner

- by BeraCim

- by SchlaWiener

- by tommizzle

- by Nate Bross

- by Flo

- by Gacek

- by Tommy

- by jb

- by Nav

- by Bill Zimmerman

- by JP19

- by Foredecker

- by Tom

- by Fadrian

- by user157195

- by Voice

< Previous Page | 40 41 42 43 44 45 46 47 48 49 50 51 | Next Page >