Search Results

Search found 5628 results on 226 pages for 'ram kumar'.

Page 220/226 | < Previous Page | 216 217 218 219 220 221 222 223 224 225 226  | Next Page >

  • Tridion Installation

    - by Kevin Brydon
    I am currently upgrading an installation of Tridion from 5.3 to 2011 starting almost from scratch (aside from migrating the database), brand new virtual servers. I just want to ask for some advice on my current server setup... a sanity check. All servers are running Windows Server 2008. The pages on our website are all classic ASP. Database SQL Server cluster. The 5.3 database has been migrated using the DatabaseManager. This is pretty standard and works well (in test anyway). Content Manager A single server to run the Content Manager and the Publisher. There are around 10 people using it at any one time so not under a particularly heavy load. Content Data Store Filesystem located somewhere on the network. One directory for live and one for staging. Content Delivery Two servers (cd1 and cd2) each with the the following server roles installed. cd1 writes to a filesystem content data store for the live website, cd2 writes to the content data store for the staging website. Presentation Two public facing web servers (web1 and web2) serving both the live and staging websites. The web servers read directly from the content data store as its a filesystem. Each of the web servers have the Content Delivery Server installed so that I can use dynamic linking (and other features?). I've so far set up everything but the web servers. Any thoughts? edit Thanks to Ram S who linked me to a decent walkthrough, upvoted. I suppose I should have posed some questions as I didn't really ask a question. I guess I'm a little confused over the content deliver aspect. I have the Content Delivery split in two separate parts. cd1 and cd2 do the work of shifting information from the Content Manager to the Staging/Live web directories. web1 and web2 should do the work of serving the web pages to the outside world and will interact with the content data store (file system). Is this a correct setup? I need some parts of the Content Delivery on my web servers right? Theoretically I could get rid of the cd1 and cd2 servers and use web1 and web2 to do the deployment right? But I suspect this will put the web servers under unnecessary strain should there ever be a big publish. I've been reading the 2011 Installation Manual, Content Delivery section, and I'm finding it quite hard to get my head around!

    Read the article

  • Fulltext search for django : Mysql not so bad ? (vs sphinx, xapian)

    - by Eric
    I am studying fulltext search engines for django. It must be simple to install, fast indexing, fast index update, not blocking while indexing, fast search. After reading many web pages, I put in short list : Mysql MYISAM fulltext, djapian/python-xapian, and django-sphinx I did not choose lucene because it seems complex, nor haystack as it has less features than djapian/django-sphinx (like fields weighting). Then I made some benchmarks, to do so, I collected many free books on the net to generate a database table with 1 485 000 records (id,title,body), each record is about 600 bytes long. From the database, I also generated a list of 100 000 existing words and shuffled them to create a search list. For the tests, I made 2 runs on my laptop (4Go RAM, Dual core 2.0Ghz): the first one, just after a server reboot to clear all caches, the second is done juste after in order to test how good are cached results. Here are the "home made" benchmark results : 1485000 records with Title (150 bytes) and body (450 bytes) Mysql 5.0.75/Ubuntu 9.04 Fulltext : ========================================================================== Full indexing : 7m14.146s 1 thread, 1000 searchs with single word randomly taken from database : First run : 0:01:11.553524 next run : 0:00:00.168508 Mysql 5.5.4 m3/Ubuntu 9.04 Fulltext : ========================================================================== Full indexing : 6m08.154s 1 thread, 1000 searchs with single word randomly taken from database : First run : 0:01:11.553524 next run : 0:00:00.168508 1 thread, 100000 searchs with single word randomly taken from database : First run : 9m09s next run : 5m38s 1 thread, 10000 random strings (random strings should not be found in database) : just after the 100000 search test : 0:00:15.007353 1 thread, boolean search : 1000 x (+word1 +word2) First run : 0:00:21.205404 next run : 0:00:00.145098 Djapian Fulltext : ========================================================================== Full indexing : 84m7.601s 1 thread, 1000 searchs with single word randomly taken from database with prefetch : First run : 0:02:28.085680 next run : 0:00:14.300236 python-xapian Fulltext : ========================================================================== 1 thread, 1000 searchs with single word randomly taken from database : First run : 0:01:26.402084 next run : 0:00:00.695092 django-sphinx Fulltext : ========================================================================== Full indexing : 1m25.957s 1 thread, 1000 searchs with single word randomly taken from database : First run : 0:01:30.073001 next run : 0:00:05.203294 1 thread, 100000 searchs with single word randomly taken from database : First run : 12m48s next run : 9m45s 1 thread, 10000 random strings (random strings should not be found in database) : just after the 100000 search test : 0:00:23.535319 1 thread, boolean search : 1000 x (word1 word2) First run : 0:00:20.856486 next run : 0:00:03.005416 As you can see, Mysql is not so bad at all for fulltext search. In addition, its query cache is very efficient. Mysql seems to me a good choice as there is nothing to install (I need just to write a small script to synchronize an Innodb production table to a MyISAM search table) and as I do not really need advanced search feature like stemming etc... Here is the question : What do you think about Mysql fulltext search engine vs sphinx and xapian ?

    Read the article

  • Java ExecutorService java heap space ptoblems

    - by Sergey Aganezov jr
    I have a little bit of a problem in a multitasking java department. I have a class, called public class ThreadWorker implements Runnable { //some code in here public void run(){ // invokes some recursion method in the ThreadWorker itself, // which will stop eventually { } all in all, pretty simple "worker" that can work on it's on. To work with threads I'm using public static int THREAD_NUMBER = 4; public static ExecutorServide es = Executors.newFixedThreadPool(THREAD_NUMBER); adding instances of ThreadWroker class happens here: public void recursiveMethod(Arraylist<Integers> elements, MyClass data){ if (elements.size() == 0 && data.qualifies()){ ThreadWorker tw = new ThreadWorker(data); es.execute(tw); return; } for (int i=0; i< elements.size(); i++){ // some code to prevent my problem MyClass data1 = new MyClass(data); MyClass data2 = new MyClass(data); ArrayList<Integer> newElements = (ArrayList<Integer>)elements.clone(); data1.update(elements.get(i)); data2.update(-1 * elements.get(i)); newElements.remove(i); recursiveMethod(newElements, data1); recursiveMethod(newElements, data2); { } and the problem is that the depth of the recursion tree is quite big, so as it's width, so a lot of ThreadWorkers are added to the ExecutorService, so after some time on the big input a get Exception in thread "pool-1-thread-2" java.lang.OutOfMemoryError: Java heap space which is caused, as I think because of a ginormous number of ThreadWorkers i'm adding to ExecutorSirvice to be executed, so it runs out of memory. Every ThreadWorker takes about 40 Mb of RAM for all it needs. Is there a method to get how many threads (instances of classes implementing runnable interface) have been added to ExecutorService? So I can add it in the shown above code (int the " // some code to prevent my problem"), as while ("number of threads in the ExecutorService" > 10){ Thread.sleep(10000); } so I won't go to deep or to broad with my recursion and prevent those exception-throwing situations. Sincerely, Sergey Aganezov jr.

    Read the article

  • Is this asking too much of a browser?

    - by Matt Ball
    I'm embedding a large array in <script> tags in my HTML, like this (nothing surprising): <script> var largeArray = [/* lots of stuff in here */]; </script> In this particular example, the array has 210,000 elements. That's well below the theoretical maximum of 231 - by 4 orders of magnitude. Here's the fun part: if I save JS source for the array to a file, that file is 44 megabytes (46,573,399 bytes, to be exact). If you want to see for yourself, you can download it from my Dropbox. (All the data in there is canned, so much of it is repeated. This will not be the case in production.) Now, I'm really not concerned about serving that much data. My server gzips its responses, so it really doesn't take all that long to get the data over the wire. However, there is a really nasty tendency for the page, once loaded, to crash the browser. I'm not testing at all in IE (this is an internal tool). My primary targets are Chrome 8 and Firefox 3.6. In Firefox, I can see a reasonably useful error in the console: Error: script stack space quota is exhausted In Chrome, I simply get the sad-tab page: Cut to the chase, already Is this really too much data for our modern, "high-performance" browsers to handle? Is there anything I can do* to gracefully handle this much data? Incidentally, I was able to get this to work (read: not crash the tab) on-and-off in Chrome. I really thought that Chrome, at least, was made of tougher stuff, but apparently I was wrong... Edit 1 @Crayon: I wasn't looking to justify why I'd like to dump this much data into the browser at once. Short version: either I solve this one (admittedly not-that-easy) problem, or I have to solve a whole slew of other problems. I'm opting for the simpler approach for now. @various: right now, I'm not especially looking for ways to actually reduce the number of elements in the array. I know I could implement Ajax paging or what-have-you, but that introduces its own set of problems for me in other regards. @Phrogz: each element looks something like this: {dateTime:new Date(1296176400000), terminalId:'terminal999', 'General___BuildVersion':'10.05a_V110119_Beta', 'SSM___ExtId':26680, 'MD_CDMA_NETLOADER_NO_BCAST___Valid':'false', 'MD_CDMA_NETLOADER_NO_BCAST___PngAttempt':0} @Will: but I have a computer with a 4-core processor, 6 gigabytes of RAM, over half a terabyte of disk space ...and I'm not even asking for the browser to do this quickly - I'm just asking for it to work at all! ? *other than the obvious: sending less data to the browser

    Read the article

  • Why does this thumbnail generation code throw OutOfMemoryException on large files?

    - by tsilb
    This code works great for generating thumbnails, but when given a very large (100MB+) TIFF file, it throws OutOfMemoryExceptions. When I do it manually in Paint.NET on the same machine, it works fine. How can I improve this code to stop throwing on very large files? In this case I'm loading a 721MB TIF on a machine with 8GB RAM. The Task Manager shows 2GB used so something is preventing it from using all that memory. Specifically it throws when I load the Image to calculate the size of the original. What gives? /// <summary>Creates a thumbnail of a given image.</summary> /// <param name="inFile">Fully qualified path to file to create a thumbnail of</param> /// <param name="outFile">Fully qualified path to created thumbnail</param> /// <param name="x">Width of thumbnail</param> /// <returns>flag; result = is success</returns> public static bool CreateThumbnail(string inFile, string outFile, int x) { // Validation - assume 16x16 icon is smallest useful size. Smaller than that is just not going to do us any good anyway. I consider that an "Exceptional" case. if (string.IsNullOrEmpty(inFile)) throw new ArgumentNullException("inFile"); if (string.IsNullOrEmpty(outFile)) throw new ArgumentNullException("outFile"); if (x < 16) throw new ArgumentOutOfRangeException("x"); if (!File.Exists(inFile)) throw new ArgumentOutOfRangeException("inFile", "File does not exist: " + inFile); // Mathematically determine Y dimension int y; using (Image img = Image.FromFile(inFile)) { // OutOfMemoryException double xyRatio = (double)x / (double)img.Width; y = (int)((double)img.Height * xyRatio); } // All this crap could have easily been Image.Save(filename, x, y)... but nooooo.... using (Bitmap bmp = new Bitmap(inFile)) using (Bitmap thumb = new Bitmap((Image)bmp, new Size(x, y))) using (Graphics g = Graphics.FromImage(thumb)) { g.SmoothingMode = System.Drawing.Drawing2D.SmoothingMode.HighQuality; g.InterpolationMode = System.Drawing.Drawing2D.InterpolationMode.High; g.CompositingQuality = System.Drawing.Drawing2D.CompositingQuality.HighQuality; System.Drawing.Imaging.ImageCodecInfo codec = System.Drawing.Imaging.ImageCodecInfo.GetImageEncoders()[1]; System.Drawing.Imaging.EncoderParameters ep2 = new System.Drawing.Imaging.EncoderParameters(1); ep2.Param[0] = new System.Drawing.Imaging.EncoderParameter(System.Drawing.Imaging.Encoder.Quality, 100L); g.DrawImage(bmp, new Rectangle(0,0,thumb.Width, thumb.Height)); try { thumb.Save(outFile, codec, ep2); return true; } catch { return false; } } }

    Read the article

  • How to maximize http.sys file upload performance

    - by anelson
    I'm building a tool that transfers very large streaming data sets (possibly on the order of terabytes in a single stream; routinely in the tens of gigabytes) from one server to another. The client portion of the tool will read blocks from the source disk, and send them over the network. The server side will read these blocks off the network and write them to a file on the server disk. Right now I'm trying to decide which transport to use. Options are raw TCP, and HTTP. I really, REALLY want to be able to use HTTP. The HttpListener (or WCF if I want to go that route) make it easy to plug in to the HTTP Server API (http.sys), and I can get things like authentication and SSL for free. The problem right now is performance. I wrote a simple test harness that sends 128K blocks of NULL bytes using the BeginWrite/EndWrite async I/O idiom, with async BeginRead/EndRead on the server side. I've modified this test harness so I can do this with either HTTP PUT operations via HttpWebRequest/HttpListener, or plain old socket writes using TcpClient/TcpListener. To rule out issues with network cards or network pathways, both the client and server are on one machine and communicate over localhost. On my 12-core Windows 2008 R2 test server, the TCP version of this test harness can push bytes at 450MB/s, with minimal CPU usage. On the same box, the HTTP version of the test harness runs between 130MB/s and 200MB/s depending upon how I tweak it. In both cases CPU usage is low, and the vast majority of what CPU usage there is is kernel time, so I'm pretty sure my usage of C# and the .NET runtime is not the bottleneck. The box has two 6-core Xeon X5650 processors, 24GB of single-ranked DDR3 RAM, and is used exclusively by me for my own performance testing. I already know about HTTP client tweaks like ServicePointManager.MaxServicePointIdleTime, ServicePointManager.DefaultConnectionLimit, ServicePointManager.Expect100Continue, and HttpWebRequest.AllowWriteStreamBuffering. Does anyone have any ideas for how I can get HTTP.sys performance beyond 200MB/s? Has anyone seen it perform this well on any environment?

    Read the article

  • Git sh.exe process forking issue on windows XP, slow?

    - by AndyL
    Git is essential to my workflow. I run MSYS Git on Windows XP on my quad core machine with 3GB of RAM, and normally it is responsive and zippy. Suddenly an issue has cropped up whereby it takes 30 seconds to run any command from the Git Bash command prompt, including ls or cd. Interestingly, from the bash prompt it looks likes ls runs fairly quickly, I can then see the output from ls, but it then takes ~30 seconds for the prompt to return. If I switch to the windows command prompt (by running cmd from the start menu) git related commands also take forever, even just to run. For example git status can take close to a minute before anything happens. Sometimes the processes simply don't finish. Note that I have "MSYS Git" installed as well as regular "MSYS" for things like MinGW and make. I believe the problem is related to sh.exe located in C:\Program Files\Git\bin. When I run ls from the bash prompt, or when I invoke git from the windows prompt, task manager shows up to four instances of sh.exe processes that come and go. Here I am waiting for ls to return and you can see the task manager has git.exe running and four instances of sh.exe: If I ctrl-c in the middle of an ls I sometimes get errors that include: sh.exe": fork: Resource temporarily unavailable 0 [main] sh.exe" 1624 proc_subproc: Couldn't duplicate my handle<0x6FC> fo r pid 6052, Win32 error 5 sh.exe": fork: Resource temporarily unavailable Or for git status: $ git status sh.exe": fork: Resource temporarily unavailable sh.exe": fork: Resource temporarily unavailable sh.exe": fork: Resource temporarily unavailable sh.exe": fork: Resource temporarily unavailable Can I fix this so that git runs quickly again, and if so how? Things I have tried: Reboot Upgrade MSYS Git to most recent version & Reboot Upgrade MSYS to most recent version & Reboot Uninstall MSYS & uninstall and reinstall MSYS Git alone & Reboot I'd very much like to not wipe my box and reinstall Windows, but I will if I can't get this fixed. I can no longer code if it takes me 30 s to run git status or cd.

    Read the article

  • Is this slow WPF TextBlock performance expected?

    - by Ben Schoepke
    Hi, I am doing some benchmarking to determine if I can use WPF for a new product. However, early performance results are disappointing. I made a quick app that uses data binding to display a bunch of random text inside of a list box every 100 ms and it was eating up ~15% CPU. So I made another quick app that skipped the data binding/data template scheme and does nothing but update 10 TextBlocks that are inside of a ListBox every 100 ms (the actual product wouldn't require 100 ms updates, more like 500 ms max, but this is a stress test). I'm still seeing ~10-15% CPU usage. Why is this so high? Is it because of all the garbage strings? Here's the XAML: <Grid> <ListBox x:Name="numericsListBox"> <ListBox.Resources> <Style TargetType="TextBlock"> <Setter Property="FontSize" Value="48"/> <Setter Property="Width" Value="300"/> </Style> </ListBox.Resources> <TextBlock/> <TextBlock/> <TextBlock/> <TextBlock/> <TextBlock/> <TextBlock/> <TextBlock/> <TextBlock/> <TextBlock/> <TextBlock/> </ListBox> </Grid> Here's the code behind: public partial class Window1 : Window { private int _count = 0; public Window1() { InitializeComponent(); } private void OnLoad(object sender, RoutedEventArgs e) { var t = new DispatcherTimer(TimeSpan.FromSeconds(0.1), DispatcherPriority.Normal, UpdateNumerics, Dispatcher); t.Start(); } private void UpdateNumerics(object sender, EventArgs e) { ++_count; foreach (object textBlock in numericsListBox.Items) { var t = textBlock as TextBlock; if (t != null) t.Text = _count.ToString(); } } } Any ideas for a better way to quickly render text? My computer: XP SP3, 2.26 GHz Core 2 Duo, 4 GB RAM, Intel 4500 HD integrated graphics. And that is an order of magnitude beefier than the hardware I'd need to develop for in the real product.

    Read the article

  • R Install/Update on Mac OS X (Snow Leopard): where does R put files during install/config?

    - by doug
    In sum, there's a stray preference-like file or two (probably just one) that i can't find. Here's the whole story: I recently attempted to update my R install from 2.10 to 2.11. As i have done before, i installed from source. I know that all of the dependencies are correctly installed and made available to R, because my prior install worked fine. When i upgraded to 2.11, i am unable to install packages (no exception thrown, it just doesn't appear to complete the install and is unresponsive so i have to quit + restart R. Given i install from source, there are any number of points in the process that i could have messed up. What i need to do now is "start over" which requires that i clear out my my prior install. I am having trouble doing that. There is still at least one preference file or something like that i can't find and one of these is causing the problem, so i need to find it and terminate it with extreme prejudice before i do a fresh install. Although i set a number of flags during the install, i have never opted out of the default install locations during the config step. There has to be one or more preference files still in my file structure (and that's also accessible to the new install of R) because after i follow all of the steps below, then do a fresh install, when i start R for the first time, some of my preferences have persisted (e.g., quartz settings, GUI background color, editor selection, etc.). Again, the problem is that i just cannot locate those files. Finally, the problem can't be that during my last install from source, i inadvertently caused a preference file to be sent to an off-spec location--because again, a fresh R install (whether from source or from the OS X binaries) is finding those files Here's what i've done prior to attempting a clean install of R: Removed files from these locations: ~/.RData ~/.RHistory /Applications/R64.app /Applications/R.app /Library/Frameworks/R.framework (i also removed all symlinks from these) Cleared all RAM and disk caches, in particular the directory where i know R caches: ~/Library/Caches/R* (in fact i've cleared this entire directory) Checked for all 'hidden' files in the OS X directories where login/startup files are often placed: /etc/ ~/ In addition, i've checked R-help, and i've also read through the relevant portions of 'R Installation and Administration'--no luck. I've also searched searched my file structure using the various bash utilities, which nearly always solves problems of this sort quite easily, but in this case obviously searching by name or even pattern is more problematic.

    Read the article

  • C#: BackgroundWorker cloning resources?

    - by Dav
    The problem I've been struggling with this partiular problem for two days now and just run out of ideas. A little... background: we have a WinForms app that needs to access a database, construct a list of related in-memory objects from that data, and then display on a DataGridView. Important point is that we first populate an app-wide cache (List), and then create a mirror of the cache local to the form on which the DGV lives (using List constructor param). Because fetching the data takes a good few seconds (DB sits on a LAN server) to load, we decided to use a BackgroundWorker, and only refresh the DGV once the data is loaded. However, it seems that doing the loading via a BGW results in some memory leak... or an error on my part. When loaded using a blocking method call, the app consumes about 30MB of RAM; with a BGW this jumps to 80MB! While it may not seem as much anyway, our clients are not too happy about it. Relevant code Form private void MyForm_Load(object sender, EventArgs e) { MyRepository.Instance.FinishedEvent += RefreshCache; } private void RefreshCache(object sender, EventArgs e) { dgvProducts.DataSource = new List<MyDataObj>(MyRepository.Products); } Repository private static List<MyDataObj> Products { get; set; } public event EventHandler ProductsLoaded; public void GetProductsSync() { List<MyDataObj> p; using (MyL2SDb db = new MyL2SDb(MyConfig.ConnectionString)) { p = db.PRODUCTS .Select(p => new MyDataObj {Id = p.ID, Description = p.DESCR}) .ToList(); } Products = p; // tell the form to refresh UI if (ProductsLoaded != null) ProductsLoaded(this, null); } public void GetProductsAsync() { using (BackgroundWorker myWorker = new BackgroundWorker()) { myWorker.DoWork += delegate { List<MyDataObj> p; using (MyL2SDb db = new MyL2SDb(MyConfig.ConnectionString)) { p = db.PRODUCTS .Select(p => new MyDataObj {Id = p.ID, Description = p.DESCR}) .ToList(); } Products = p; }; // tell the form to refresh UI when finished myWorker.RunWorkerCompleted += GetProductsCompleted; myWorker.RunWorkerAsync(); } } private void GetProductsCompleted(object sender, RunWorkerCompletedEventArgs e) { if (ProductsLoaded != null) ProductsLoaded(this, null); } End! GetProductsSync or GetProductsAsync are called on the main thread, not shown above. Could it be that the GarbageCollector just gets lost with two threads? Or is it the task manager that shows incorrect values? Will be greateful for any responses, suggestions, criticism.

    Read the article

  • I just can't kill Java thread.

    - by Adrian
    I have a thread that downloads some images from internet using different proxies. Sometimes it hangs, and can't be killed by any means. public HttpURLConnection uc; public InputStream in; Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress("server", 8080)); URL url = new URL("http://images.com/image.jpg"); uc = (HttpURLConnection)url.openConnection(proxy); uc.setConnectTimeout(30000); uc.setAllowUserInteraction(false); uc.setDoOutput(true); uc.addRequestProperty("User-Agent","Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"); uc.connect(); in = uc.getInputStream(); When it hangs, it freezes at the uc.getInputStream() method. I made a timer which tries to kill the thread if it's run time exceeds 3 minutes. I tried .terminate() the thread. No effect. I tried uc.disconnect() from the main thread. The method also hangs and with it, the main thread. I tried in.close(). No effect. I tried uc=null, in=null hoping for an exception that will end the thread. It keeps running. It never passes the uc.getInputStream() method. In my last test the thread lasted over 14 hours after receiving all above commands (or various combinations). I had to kill the Java process to stop the thread. If I just ignore the thread, and set it's instance to null, the thread doesn't die and is not cleaned by garbage collector. I know that because if I let the application running for several days, the Java process takes more and more system memory. In 3 days it took 10% of my 8Gb. RAM system. It is impossible to kill a thread whatever?

    Read the article

  • FASM vc MASM trasnlation problem in mov si, offset msg

    - by Ruben Trancoso
    hi folks, just did my first test with MASM and FASM with the same code (almos) and I falled in trouble. The only difference is that to produce just the 104 bytes I need to write to MBR in FASM I put org 7c00h and in MASM 0h. The problem is on the mov si, offset msg that in the first case transletes it to 44 7C (7c44h) and with masm translates to 44 00 (0044h)! but just when I change org 7c00h to org 0h in MASM. Otherwise it will produce the entire segment from 0 to 7dff. how do I solve it? or in short, how to make MASM produce a binary that begins at 7c00h as it first byte and subsequent jumps remain relative to 7c00h? .model TINY .code org 7c00h ; Boot entry point. Address 07c0:0000 on the computer memory xor ax, ax ; Zero out ax mov ds, ax ; Set data segment to base of RAM jmp start ; Jump to the first byte after DOS boot record data ; ---------------------------------------------------------------------- ; DOS boot record data ; ---------------------------------------------------------------------- brINT13Flag db 90h ; 0002h - 0EH for INT13 AH=42 READ brOEM db 'MSDOS5.0' ; 0003h - OEM name & DOS version (8 chars) brBPS dw 512 ; 000Bh - Bytes/sector brSPC db 1 ; 000Dh - Sectors/cluster brResCount dw 1 ; 000Eh - Reserved (boot) sectors brFATs db 2 ; 0010h - FAT copies brRootEntries dw 0E0h ; 0011h - Root directory entries brSectorCount dw 2880 ; 0013h - Sectors in volume, < 32MB brMedia db 240 ; 0015h - Media descriptor brSPF dw 9 ; 0016h - Sectors per FAT brSPH dw 18 ; 0018h - Sectors per track brHPC dw 2 ; 001Ah - Number of Heads brHidden dd 0 ; 001Ch - Hidden sectors brSectors dd 0 ; 0020h - Total number of sectors db 0 ; 0024h - Physical drive no. db 0 ; 0025h - Reserved (FAT32) db 29h ; 0026h - Extended boot record sig brSerialNum dd 404418EAh ; 0027h - Volume serial number (random) brLabel db 'OSAdventure' ; 002Bh - Volume label (11 chars) brFSID db 'FAT12 ' ; 0036h - File System ID (8 chars) ;------------------------------------------------------------------------ ; Boot code ; ---------------------------------------------------------------------- start: mov si, offset msg call showmsg hang: jmp hang msg db 'Loading...',0 showmsg: lodsb cmp al, 0 jz showmsgd push si mov bx, 0007 mov ah, 0eh int 10h pop si jmp showmsg showmsgd: retn ; ---------------------------------------------------------------------- ; Boot record signature ; ---------------------------------------------------------------------- dw 0AA55h ; Boot record signature END

    Read the article

  • Boggling Direct3D9 dynamic vertex buffer Lock crash/post-lock failure on Intel GMA X3100.

    - by nj
    Hi, For starters I'm a fairly seasoned graphics programmer but as wel all know, everyone makes mistakes. Unfortunately the codebase is a bit too large to start throwing sensible snippets here and re-creating the whole situation in an isolated CPP/codebase is too tall an order -- for which I am sorry, do not have the time. I'll do my best to explain. B.t.w, I will of course supply specific pieces of code if someone wonders how I'm handling this-or-that! As with all resources in the D3DPOOL_DEFAULT pool, when the device context is taken away from you you'll sooner or later will have to reset your resources. I've built a mechanism to handle this for all relevant resources that's been working for years; but that fact nothingwithstanding I've of course checked, asserted and doubted any assumption since this bug came to light. What happens is as follows: I have a rather large dynamic vertex buffer, exact size 18874368 bytes. This buffer is locked (and discarded fully using the D3DLOCK_DISCARD flag) each frame prior to generating dynamic geometry (isosurface-related, f.y.i) to it. This works fine, until, of course, I start to reset. It might take 1 time, it might take 2 or it might take 5 resets to set off a bug that causes an access violation either on the pointer returned by the Lock() operation on the renewed resource or a plain crash -- regarding a somewhat similar address, but without the offset that it has tacked on to it in the first case because in that case we're somewhere halfway writing -- iside the D3D9 dll Lock() call. I've tested this on other hardware, upgraded my GMA X3100 drivers (using a MacBook with BootCamp) to the latest ones, but I can't reproduce it on any other machine and I'm at a loss about what's wrong here. I have tried to reproduce a similar situation with a similar buffer (I've got a large scratch pad of the same type I filled with quads) and beyond a certain amount of bytes it started to behave likewise. I'm not asking for a solution here but I'm very interested if there are other developers here who have battled with the same foe or maybe some who can point me in some insightful direction, maybe ask some questions that might shed a light on what I may or may not be overlooking. Another interesting artifact is that the vertex buffer starts to bug if I supply both D3DLOCK_DISCARD and D3DLOCK_NOOVERWRITE together which, even though not very logical (you're not going to overwrite if you've just discarded all), gives graphics glitches. Thanks and any corrections are more than welcome. Niels p.s - A friend of mine raised the valid point that it is a huge buffer for onboard video RAM and it's being at least double or triple buffered internally due to it's dynamic nature. On the other hand, the debug output (D3D9 debug DLL + max. warning output) remains silent. p.s 2 - Had it tested on more machines and still works -- it's probably a matter of circumstance: the huge dynamic, internally double/trippled buffered buffer, not a lot of memory and drivers that don't complain when they should.. Unless someone has a better suggestion; I'd still love to hear it :)

    Read the article

  • Is it too early to start designing for Task Parallel Library?

    - by Joe Erickson
    I have been following the development of the .NET Task Parallel Library (TPL) with great interest since Microsoft first announced it. There is no doubt in my mind that we will eventually take advantage of TPL. What I am questioning is whether it makes sense to start taking advantage of TPL when Visual Studio 2010 and .NET 4.0 are released, or whether it makes sense to wait a while longer. Why Start Now? The .NET 4.0 Task Parallel Library appears to be well designed and some relatively simple tests demonstrate that it works well on today's multi-core CPUs. I have been very interested in the potential advantages of using multiple lightweight threads to speed up our software since buying my first quad processor Dell Poweredge 6400 about seven years ago. Experiments at that time indicated that it was not worth the effort, which I attributed largely to the overhead of moving data between each CPU's cache (there was no shared cache back then) and RAM. Competitive advantage - some of our customers can never get enough performance and there is no doubt that we can build a faster product using TPL today. It sounds fun. Yes, I realize that some developers would rather poke themselves in the eye with a sharp stick, but we really enjoy maximizing performance. Why Wait? Are today's Intel Nehalem CPUs representative of where we are going as multi-core support matures? You can purchase a Nehalem CPU with 4 cores which share a single level 3 cache today, and most likely a 6 core CPU sharing a single level 3 cache by the time Visual Studio 2010 / .NET 4.0 are released. Obviously, the number of cores will go up over time, but what about the architecture? As the number of cores goes up, will they still share a cache? One issue with Nehalem is the fact that, even though there is a very fast interconnect between the cores, they have non-uniform memory access (NUMA) which can lead to lower performance and less predictable results. Will future multi-core architectures be able to do away with NUMA? Similarly, will the .NET Task Parallel Library change as it matures, requiring modifications to code to fully take advantage of it? Limitations Our core engine is 100% C# and has to run without full trust, so we are limited to using .NET APIs.

    Read the article

  • Why is it still so hard to write software?

    - by nornagon
    Writing software, I find, is composed of two parts: the Idea, and the Implementation. The Idea is about thinking: "I have this problem; how do I solve it?" and further, "how do I solve it elegantly?" The answers to these questions are obtainable by thinking about algorithms and architecture. The ideas come partially through analysis and partially through insight and intuition. The Idea is usually the easy part. You talk to your friends and co-workers and you nut it out in a meeting or over coffee. It takes an hour or two, plus revisions as you implement and find new problems. The Implementation phase of software development is so difficult that we joke about it. "Oh," we say, "the rest is a Simple Matter of Code." Because it should be simple, but it never is. We used to write our code on punch cards, and that was hard: mistakes were very difficult to spot, so we had to spend extra effort making sure every line was perfect. Then we had serial terminals: we could see all our code at once, search through it, organise it hierarchically and create things abstracted from raw machine code. First we had assemblers, one level up from machine code. Mnemonics freed us from remembering the machine code. Then we had compilers, which freed us from remembering the instructions. We had virtual machines, which let us step away from machine-specific details. And now we have advanced tools like Eclipse and Xcode that perform analysis on our code to help us write code faster and avoid common pitfalls. But writing code is still hard. Writing code is about understanding large, complex systems, and tools we have today simply don't go very far to help us with that. When I click "find all references" in Eclipse, I get a list of them at the bottom of the window. I click on one, and I'm torn away from what I was looking at, forced to context switch. Java architecture is usually several levels deep, so I have to switch and switch and switch until I find what I'm really looking for -- by which time I've forgotten where I came from. And I do that all day until I've understood a system. It's taxing mentally, and Eclipse doesn't do much that couldn't be done in 1985 with grep, except eat hundreds of megs of RAM. Writing code has barely changed since we were staring at amber on black. We have the theoretical groundwork for much more advanced tools, tools that actually work to help us comprehend and extend the complex systems we work with every day. So why is writing code still so hard?

    Read the article

  • Programming graphics and sound on PC - Total newbie questions, and lots of them!

    - by Russel
    Hello, This isn't exactly specifically a programming question (or is it?) but I was wondering: How are graphics and sound processed from code and output by the PC? My guess for graphics: There is some reserved memory space somewhere that holds exactly enough room for a frame of graphics output for your monitor. IE: 800 x 600, 24 bit color mode == 800x600x3 = ~1.4MB memory space Between each refresh, the program writes video data to this space. This action is completed before the monitor refresh. Assume a simple 2D game: the graphics data is stored in machine code as many bytes representing color values. Depending on what the program(s) being run instruct the PC, the processor reads the appropriate data and writes it to the memory space. When it is time for the monitor to refresh, it reads from each memory space byte-for-byte and activates hardware depending on those values for each color element of each pixel. All of this of course happens crazy-fast, and repeats x times a second, x being the monitor's refresh rate. I've simplified my own likely-incorrect explanation by avoiding talk of double buffering, etc Here are my questions: a) How close is the above guess (the three steps)? b) How could one incorporate graphics in pure C++ code? I assume the practical thing that everyone does is use a graphics library (SDL, OpenGL, etc), but, for example, how do these libraries accomplish what they do? Would manual inclusion of graphics in pure C++ code (say, a 2D spite) involve creating a two-dimensional array of bit values (or three dimensional to include multiple RGB values per pixel)? Is this how it would be done waaay back in the day? c) Also, continuing from above, do libraries such as SDL etc that use bitmaps actual just build the bitmap/etc files into machine code of the executable and use them as though they were build in the same matter mentioned in question b above? d) In my hypothetical step 3 above, is there any registers involved? Like, could you write some byte value to some register to output a single color of one byte on the screen? Or is it purely dedicated memory space (=RAM) + hardware interaction? e) Finally, how is all of this done for sound? (I have no idea :) )

    Read the article

  • Large number of UPDATE queries slowing down page

    - by Bryan Lewis
    I am reading and validating large fixed-width text files (range from 10-50K lines) that are submitted via our ASP.net website (coded in VB.Net). I do an initial scan of the file to check for basic issues (line length, etc). Then I import each row into a MS SQL table. Each DB rows basically consists of a record_ID (Primary, auto-incrementing) and about 50 varchar fields. After the insert is done, I run a validation function on the file that checks each field in each row based on a bunch of criteria (trimmed length, isnumeric, range checks, etc). If it finds an error in any field, it inserts a record into the Errors table, which has an error_ID, the record_ID and an error message. In addition, if the field fails in a particular way, I have to do a "reset" on that field. A reset might consist of blanking the entire field, or simply replacing the value with another value (e.g. replacing the string with a new one that has all illegals chars taken out). I have a 5,000 line test file. The upload, initial check, and import takes about 5-6 seconds. The detailed error check and insert into the Errors table takes about 5-8 seconds (this file has about 1200 errors in it). However, the "resets" part takes about 40-45 seconds for 750 fields that need to be reset. When I comment out the resets function (returning immediately without actually calling the UPDATE stored proc), the process is very fast. With the resets turned on, the pages take 50 seconds to return. My UPDATE stored proc is using some recommended code from http://sommarskog.se/dynamic_sql.html, whereby it uses CASE instead of dynamic SQL: UPDATE dbo.Records SET dbo.Records.file_ID = CASE @field_name WHEN 'file_ID' THEN @field_value ELSE file_ID END, . . (all 50 varchar field CASE statements here) . WHERE dbo.Records.record_ID = @record_ID Is there any way I can help my performance here. Can I somehow group all of these UPDATE calls into a single transaction? Should I be reworking the UPDATE query somehow? Or is it just sheer quantity of 750+ UPDATEs and things are just slow (it's a quad proc server with 8GB ram). Any suggestions appreciated.

    Read the article

  • Mysterious constraints problem with SQL Server 2000

    - by Ramon
    Hi all I'm getting the following error from a VB NET web application written in VS 2003, on framework 1.1. The web app is running on Windows Server 2000, IIS 5, and is reading from a SQL server 2000 database running on the same machine. System.Data.ConstraintException: Failed to enable constraints. One or more rows contain values violating non-null, unique, or foreign-key constraints. at System.Data.DataSet.FailedEnableConstraints() at System.Data.DataSet.EnableConstraints() at System.Data.DataSet.set_EnforceConstraints(Boolean value) at System.Data.DataTable.EndLoadData() at System.Data.Common.DbDataAdapter.FillFromReader(Object data, String srcTable, IDataReader dataReader, Int32 startRecord, Int32 maxRecords, DataColumn parentChapterColumn, Object parentChapterValue) at System.Data.Common.DbDataAdapter.Fill(DataSet dataSet, String srcTable, IDataReader dataReader, Int32 startRecord, Int32 maxRecords) at System.Data.Common.DbDataAdapter.FillFromCommand(Object data, Int32 startRecord, Int32 maxRecords, String srcTable, IDbCommand command, CommandBehavior behavior) at System.Data.Common.DbDataAdapter.Fill(DataSet dataSet, Int32 startRecord, Int32 maxRecords, String srcTable, IDbCommand command, CommandBehavior behavior) at System.Data.Common.DbDataAdapter.Fill(DataSet dataSet) The problem appears when the web app is under a high load. The system runs fine when volume is low, but when the number of requests becomes high, the system starts rejecting incoming requests with the above exception message. Once the problem appears, very few requests actually make it through and get processed normally, about 2 in every 30. The vast majority of requests fail, until a SQL Server restart or IIS reset is performed. The system then start processing requests normally, and after some time it starts throwing the same error. The error occurs when a data adapter runs the Fill() method against a SELECT statement, to populate a strongly-typed dataset. It appears that the dataset does not like the data it is given and throws this exception. This error occurs on various SELECT statements, acting on different tables. I have regenerated the dataset and checked the relevant constraints, as well as the table from which the data is read. Both the dataset definition and the data in the table are fine. Admittedly, the hardware running both the web app and SQL Server 2000 is seriously outdated, considering the numbers of incoming requests it currently receives. The amount of RAM consumed by SQL Server is dynamically allocated, and at peak times SQL Server can consume up to 2.8 GB out of a total of 3.5 GB on the server. At first I suspected some sort of index or database corruption, but after running DBCC CHECKDB, no errors were found in the database. So now I'm wondering whether this error is a result of the hardware limitations of the system. Is it possible for SQL Server to somehow mess up the data it's supposed to pass to the dataset, resulting in constraint violation due to, say, data type/length mismatch? I tried accessing the RowError messages of the data rows in the retrieved dataset tables but I kept getting empty strings. I know that HasErrors = true for the datatables in question. I have not set the EnableConstraints = false, and I don't want to do that. Thanks in advance. Ray

    Read the article

  • Neo4j 1.9.4 (REST Server,CYPHER) performance issue

    - by user2968943
    I have Neo4j 1.9.4 installed on 24 core 24Gb ram (centos) machine and for most queries CPU usage spikes goes to 200% with only few concurrent requests. Domain: some sort of social application where few types of nodes(profiles) with 3-30 text/array properties and 36 relationship types with at least 3 properties. Most of nodes currently has ~300-500 relationships. Current data set footprint(from console): LogicalLogSize=4294907 (32MB) ArrayStoreSize=1675520 (12MB) NodeStoreSize=1342170 (10MB) PropertyStoreSize=1739548 (13MB) RelationshipStoreSize=6395202 (48MB) StringStoreSize=1478400 (11MB) which is IMHO really small. most queries looks like this one(with more or less WITH .. MATCH .. statements and few queries with variable length relations but the often fast): START targetUser=node({id}), currentUser=node({current}) MATCH targetUser-[contact:InContactsRelation]->n, n-[:InLocationRelation]->l, n-[:InCategoryRelation]->c WITH currentUser, targetUser,n, l,c, contact.fav is not null as inFavorites MATCH n<-[followers?:InContactsRelation]-() WITH currentUser, targetUser,n, l,c,inFavorites, COUNT(followers) as numFollowers RETURN id(n) as id, n.name? as name, n.title? as title, n._class as _class, n.avatar? as avatar, n.avatar_type? as avatar_type, l.name as location__name, c.name as category__name, true as isInContacts, inFavorites as isInFavorites, numFollowers it runs in ~1s-3s(for first run) and ~1s-70ms (for consecutive and it depends on query) and there is about 5-10 queries runs for each impression. Another interesting behavior is when i try run query from console(neo4j) on my local machine many consecutive times(just press ctrl+enter for few seconds) it has almost constant execution time but when i do it on server it goes slower exponentially and i guess it somehow related with my problem. Problem: So my problem is that neo4j is very CPU greedy(for 24 core machine its may be not an issue but its obviously overkill for small project). First time i used AWS EC2 m1.large instance but over all performance was bad, during testing, CPU always was over 100%. Some relevant parts of configuration: neostore.nodestore.db.mapped_memory=1280M wrapper.java.maxmemory=8192 note: I already tried configuration where all memory related parameters where HIGH and it didn't worked(no change at all). Question: Where to digg? configuration? scheme? queries? what i'm doing wrong? if need more info(logs, configs) just ask ;)

    Read the article

  • Stored procedure performance randomly plummets; trivial ALTER fixes it. Why?

    - by gWiz
    I have a couple of stored procedures on SQL Server 2005 that I've noticed will suddenly take a significantly long time to complete when invoked from my ASP.NET MVC app running in an IIS6 web farm of four servers. Normal, expected completion time is less than a second; unexpected anomalous completion time is 25-45 seconds. The problem doesn't seem to ever correct itself. However, if I ALTER the stored procedure (even if I don't change anything in the procedure, except to perhaps add a space to the script created by SSMS Modify command), the completion time reverts to expected completion time. IIS and SQL Server are running on separate boxes, both running Windows Server 2003 R2 Enterprise Edition. SQL Server is Standard Edition. All machines have dual Xeon E5450 3GHz CPUs and 4GB RAM. SQL Server is accessed using its TCP/IP protocol over gigabit ethernet (not sure what physical medium). The problem is present from all web servers in the web farm. When I invoke the procedure from a query window in SSMS on my development machine, the procedure completes in normal time. This is strange because I was under the impression that SSMS used the same SqlClient driver as in .NET. When I point my development instance of the web app to the production database, I again get the anomalous long completion time. If my SqlCommand Timeout is too short, I get System.Data.SqlClient.SqlException: Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding. Question: Why would performing ALTER on the stored procedure, without actually changing anything in it, restore the completion time to less than a second, as expected? Edit: To clarify, when the procedure is running slow for the app, it simultaneously runs fine in SSMS with the same parameters. The only difference I can discern is login credentials (next time I notice the behavior, I'll be checking from SSMS with the same creds). The ultimate goal is to get the procs to sustainably run with expected speed without requiring occasional intervention. Resolution: I wanted to to update this question in case others are experiencing this issue. Following the leads of the answers below, I was able to consistently reproduce this behavior. In order to test, I utilize sp_recompile and pass it one of the susceptible sprocs. I then initiate a website request from my browser that will invoke the sproc with atypical parameters. Lastly, I initiate a website request to a page that invokes the sproc with typical parameters, and observe that the request does not complete because of a SQL timeout on the sproc invocation. To resolve this on SQL Server 2005, I've added OPTIMIZE FOR hints to my SELECT. The sprocs that were vulnerable all have the "all-in-one" pattern described in this article. This pattern is certainly not ideal but was a necessary trade-off given the timeframe for the project.

    Read the article

  • User has many computers, computers have many attributes in different tables, best way to JOIN?

    - by krismeld
    I have a table for users: USERS: ID | NAME | ---------------- 1 | JOHN | 2 | STEVE | a table for computers: COMPUTERS: ID | USER_ID | ------------------ 13 | 1 | 14 | 1 | a table for processors: PROCESSORS: ID | NAME | --------------------------- 27 | PROCESSOR TYPE 1 | 28 | PROCESSOR TYPE 2 | and a table for harddrives: HARDDRIVES: ID | NAME | ---------------------------| 35 | HARDDRIVE TYPE 25 | 36 | HARDDRIVE TYPE 90 | Each computer can have many attributes from the different attributes tables (processors, harddrives etc), so I have intersection tables like this, to link the attributes to the computers: COMPUTER_PROCESSORS: C_ID | P_ID | --------------| 13 | 27 | 13 | 28 | 14 | 27 | COMPUTER_HARDDRIVES: C_ID | H_ID | --------------| 13 | 35 | So user JOHN, with id 1 owns computer 13 and 14. Computer 13 has processor 27 and 28, and computer 13 has harddrive 35. Computer 14 has processor 27 and no harddrive. Given a user's id, I would like to retrieve a list of that user's computers with each computers attributes. I have figured out a query that gives me a somewhat of a result: SELECT computers.id, processors.id AS p_id, processors.name AS p_name, harddrives.id AS h_id, harddrives.name AS h_name, FROM computers JOIN computer_processors ON (computer_processors.c_id = computers.id) JOIN processors ON (processors.id = computer_processors.p_id) JOIN computer_harddrives ON (computer_harddrives.c_id = computers.id) JOIN harddrives ON (harddrives.id = computer_harddrives.h_id) WHERE computers.user_id = 1 Result: ID | P_ID | P_NAME | H_ID | H_NAME | ----------------------------------------------------------- 13 | 27 | PROCESSOR TYPE 1 | 35 | HARDDRIVE TYPE 25 | 13 | 28 | PROCESSOR TYPE 2 | 35 | HARDDRIVE TYPE 25 | But this has several problems... Computer 14 doesnt show up, because it has no harddrive. Can I somehow make an OUTER JOIN to make sure that all computers show up, even if there a some attributes they don't have? Computer 13 shows up twice, with the same harddrive listet for both. When more attributes are added to a computer (like 3 blocks of ram), the number of rows returned for that computer gets pretty big, and it makes it had to sort the result out in application code. Can I somehow make a query, that groups the two returned rows together? Or a query that returns NULL in the h_name column in the second row, so that all values returned are unique? EDIT: What I would like to return is something like this: ID | P_ID | P_NAME | H_ID | H_NAME | ----------------------------------------------------------- 13 | 27 | PROCESSOR TYPE 1 | 35 | HARDDRIVE TYPE 25 | 13 | 28 | PROCESSOR TYPE 2 | 35 | NULL | 14 | 27 | PROCESSOR TYPE 1 | NULL | NULL | Or whatever result that make it easy to turn it into an array like this [13] => [P_NAME] => [0] => PROCESSOR TYPE 1 [1] => PROCESSOR TYPE 2 [H_NAME] => [0] => HARDDRIVE TYPE 25 [14] => [P_NAME] => [0] => PROCESSOR TYPE 1

    Read the article

  • BufferedReader no longer buffering after a while?

    - by BobTurbo
    Sorry I can't post code but I have a bufferedreader with 50000000 bytes set as the buffer size. It works as you would expect for half an hour, the HDD light flashing every two minutes or so, reading in the big chunk of data, and then going quiet again as the CPU processes it. But after about half an hour (this is a very big file), the HDD starts thrashing as if it is reading one byte at a time. It is still in the same loop and I think I checked free ram to rule out swapping (heap size is default). Probably won't get any helpful answers, but worth a try. OK I have changed heap size to 768mb and still nothing. There is plenty of free memory and java.exe is only using about 300mb. Now I have profiled it and heap stays at about 200MB, well below what is available. CPU stays at 50%. Yet the HDD starts thrashing like crazy. I have.. no idea. I am going to rewrite the whole thing in c#, that is my solution. Here is the code (it is just a throw-away script, not pretty): BufferedReader s = null; HashMap<String, Integer> allWords = new HashMap<String, Integer>(); HashSet<String> pageWords = new HashSet<String>(); long[] pageCount = new long[78592]; long pages = 0; Scanner wordFile = new Scanner(new BufferedReader(new FileReader("allWords.txt"))); while (wordFile.hasNext()) { allWords.put(wordFile.next(), Integer.parseInt(wordFile.next())); } s = new BufferedReader(new FileReader("wikipedia/enwiki-latest-pages-articles.xml"), 50000000); StringBuilder words = new StringBuilder(); String nextLine = null; while ((nextLine = s.readLine()) != null) { if (a.matcher(nextLine).matches()) { continue; } else if (b.matcher(nextLine).matches()) { continue; } else if (c.matcher(nextLine).matches()) { continue; } else if (d.matcher(nextLine).matches()) { nextLine = s.readLine(); if (e.matcher(nextLine).matches()) { if (f.matcher(s.readLine()).matches()) { pageWords.addAll(Arrays.asList(words.toString().toLowerCase().split("[^a-zA-Z]"))); words.setLength(0); pages++; for (String word : pageWords) { if (allWords.containsKey(word)) { pageCount[allWords.get(word)]++; } else if (!word.isEmpty() && allWords.containsKey(word.substring(0, word.length() - 1))) { pageCount[allWords.get(word.substring(0, word.length() - 1))]++; } } pageWords.clear(); } } } else if (g.matcher(nextLine).matches()) { continue; } words.append(nextLine); words.append(" "); }

    Read the article

  • Combining FileStream and MemoryStream to avoid disk accesses/paging while receiving gigabytes of data?

    - by w128
    I'm receiving a file as a stream of byte[] data packets (total size isn't known in advance) that I need to store somewhere before processing it immediately after it's been received (I can't do the processing on the fly). Total received file size can vary from as small as 10 KB to over 4 GB. One option for storing the received data is to use a MemoryStream, i.e. a sequence of MemoryStream.Write(bufferReceived, 0, count) calls to store the received packets. This is very simple, but obviously will result in out of memory exception for large files. An alternative option is to use a FileStream, i.e. FileStream.Write(bufferReceived, 0, count). This way, no out of memory exceptions will occur, but what I'm unsure about is bad performance due to disk writes (which I don't want to occur as long as plenty of memory is still available) - I'd like to avoid disk access as much as possible, but I don't know of a way to control this. I did some testing and most of the time, there seems to be little performance difference between say 10 000 consecutive calls of MemoryStream.Write() vs FileStream.Write(), but a lot seems to depend on buffer size and the total amount of data in question (i.e the number of writes). Obviously, MemoryStream size reallocation is also a factor. Does it make sense to use a combination of MemoryStream and FileStream, i.e. write to memory stream by default, but once the total amount of data received is over e.g. 500 MB, write it to FileStream; then, read in chunks from both streams for processing the received data (first process 500 MB from the MemoryStream, dispose it, then read from FileStream)? Another solution is to use a custom memory stream implementation that doesn't require continuous address space for internal array allocation (i.e. a linked list of memory streams); this way, at least on 64-bit environments, out of memory exceptions should no longer be an issue. Con: extra work, more room for mistakes. So how do FileStream vs MemoryStream read/writes behave in terms of disk access and memory caching, i.e. data size/performance balance. I would expect that as long as enough RAM is available, FileStream would internally read/write from memory (cache) anyway, and virtual memory would take care of the rest. But I don't know how often FileStream will explicitly access a disk when being written to. Any help would be appreciated.

    Read the article

  • Can MySQL reasonably perform queries on billions of rows?

    - by haxney
    I am planning on storing scans from a mass spectrometer in a MySQL database and would like to know whether storing and analyzing this amount of data is remotely feasible. I know performance varies wildly depending on the environment, but I'm looking for the rough order of magnitude: will queries take 5 days or 5 milliseconds? Input format Each input file contains a single run of the spectrometer; each run is comprised of a set of scans, and each scan has an ordered array of datapoints. There is a bit of metadata, but the majority of the file is comprised of arrays 32- or 64-bit ints or floats. Host system |----------------+-------------------------------| | OS | Windows 2008 64-bit | | MySQL version | 5.5.24 (x86_64) | | CPU | 2x Xeon E5420 (8 cores total) | | RAM | 8GB | | SSD filesystem | 500 GiB | | HDD RAID | 12 TiB | |----------------+-------------------------------| There are some other services running on the server using negligible processor time. File statistics |------------------+--------------| | number of files | ~16,000 | | total size | 1.3 TiB | | min size | 0 bytes | | max size | 12 GiB | | mean | 800 MiB | | median | 500 MiB | | total datapoints | ~200 billion | |------------------+--------------| The total number of datapoints is a very rough estimate. Proposed schema I'm planning on doing things "right" (i.e. normalizing the data like crazy) and so would have a runs table, a spectra table with a foreign key to runs, and a datapoints table with a foreign key to spectra. The 200 Billion datapoint question I am going to be analyzing across multiple spectra and possibly even multiple runs, resulting in queries which could touch millions of rows. Assuming I index everything properly (which is a topic for another question) and am not trying to shuffle hundreds of MiB across the network, is it remotely plausible for MySQL to handle this? UPDATE: additional info The scan data will be coming from files in the XML-based mzML format. The meat of this format is in the <binaryDataArrayList> elements where the data is stored. Each scan produces = 2 <binaryDataArray> elements which, taken together, form a 2-dimensional (or more) array of the form [[123.456, 234.567, ...], ...]. These data are write-once, so update performance and transaction safety are not concerns. My naïve plan for a database schema is: runs table | column name | type | |-------------+-------------| | id | PRIMARY KEY | | start_time | TIMESTAMP | | name | VARCHAR | |-------------+-------------| spectra table | column name | type | |----------------+-------------| | id | PRIMARY KEY | | name | VARCHAR | | index | INT | | spectrum_type | INT | | representation | INT | | run_id | FOREIGN KEY | |----------------+-------------| datapoints table | column name | type | |-------------+-------------| | id | PRIMARY KEY | | spectrum_id | FOREIGN KEY | | mz | DOUBLE | | num_counts | DOUBLE | | index | INT | |-------------+-------------| Is this reasonable?

    Read the article

  • C++ Vector vs Array (Time)

    - by vsha041
    I have got here two programs with me, both are doing exactly the same task. They are just setting an boolean array / vector to the value true. The program using vector takes 27 seconds to run whereas the program involving array with 5 times greater size takes less than 1 s. I would like to know the exact reason as to why there is such a major difference ? Are vectors really that inefficient ? Program using vectors #include <iostream> #include <vector> #include <ctime> using namespace std; int main(){ const int size = 2000; time_t start, end; time(&start); vector<bool> v(size); for(int i = 0; i < size; i++){ for(int j = 0; j < size; j++){ v[i] = true; } } time(&end); cout<<difftime(end, start)<<" seconds."<<endl; } Runtime - 27 seconds Program using Array #include <iostream> #include <ctime> using namespace std; int main(){ const int size = 10000; // 5 times more size time_t start, end; time(&start); bool v[size]; for(int i = 0; i < size; i++){ for(int j = 0; j < size; j++){ v[i] = true; } } time(&end); cout<<difftime(end, start)<<" seconds."<<endl; } Runtime - < 1 seconds Platform - Visual Studio 2008 OS - Windows Vista 32 bit SP 1 Processor Intel(R) Pentium(R) Dual CPU T2370 @ 1.73GHz Memory (RAM) 1.00 GB Thanks Amare

    Read the article

< Previous Page | 216 217 218 219 220 221 222 223 224 225 226  | Next Page >