mb - Page 86 - Developer IT

Delete Duplicate records from large csv file C# .Net

- by Sandhurst

I have created a solution which read a large csv file currently 20-30 mb in size, I have tried to delete the duplicate rows based on certain column values that the user chooses at run time using the usual technique of finding duplicate rows but its so slow that it seems the program is not working at all. What other technique can be applied to remove duplicate records from a csv file Here's the code, definitely I am doing something wrong DataTable dtCSV = ReadCsv(file, columns); //columns is a list of string List column DataTable dt=RemoveDuplicateRecords(dtCSV, columns); private DataTable RemoveDuplicateRecords(DataTable dtCSV, List<string> columns) { DataView dv = dtCSV.DefaultView; string RowFilter=string.Empty; if(dt==null) dt = dv.ToTable().Clone(); DataRow row = dtCSV.Rows[0]; foreach (DataRow row in dtCSV.Rows) { try { RowFilter = string.Empty; foreach (string column in columns) { string col = column; RowFilter += "[" + col + "]" + "='" + row[col].ToString().Replace("'","''") + "' and "; } RowFilter = RowFilter.Substring(0, RowFilter.Length - 4); dv.RowFilter = RowFilter; DataRow dr = dt.NewRow(); bool result = RowExists(dt, RowFilter); if (!result) { dr.ItemArray = dv.ToTable().Rows[0].ItemArray; dt.Rows.Add(dr); } } catch (Exception ex) { } } return dt; }

Read the article

Regex to add CDATA for mal formed XML

- by AntonioCS

Hey guys! I have this huge xml file (13 mb) and it has some malformed values. Here is a sample of the xml: <propertylist> <adprop index="0" proptype="type" value="Ft"/> <adprop index="0" proptype="category" value="Bs"/> <adprop index="0" proptype="subcategory" value="Bsm"/> <adprop index="0" proptype="description" value="MOONEN CUSTOM 58"/> </propertylist> Now this is ok. But I many other nodes that are not encapsulated in CDATA that need to be. The node that gives me problems is the <adprop index="0" proptype="description" value=""/> I created this regular expression: <adprop index="0" proptype="description" value="(.+)"\/> to catch that node and replace it with this: <adprop index="0" proptype="description" value="<![CDATA[\1]]>"\/> I run this in notepad++ and it works. The only problem is when the value="" is multi lined like: <adprop index="0" proptype="description" value="cutter that has demonstrated her offshore capabiliti from there to the Canaries with her current owner. Spacious homely interior with over 2m headroom and heaps of" /> It fails with this one, and there are plenty like this one. Can anyone help me out in the regular expression so that I can catch the value when it's multi lined? Thanks

Read the article

What is optimal hardware configuration for heavy load LAMP application

- by Piotr Kochanski

I need to run Linux-Apache-PHP-MySQL application (Moodle e-learning platform) for a large number of concurrent users - I am aiming 5000 users. By concurrent I mean that 5000 people should be able to work with the application at the same time. "Work" means not only do database reads but writes as well. The application is not very typical, since it is doing a lot of inserts/updates on the database, so caching techniques are not helping to much. We are using InnoDB storage engine. In addition application is not written with performance in mind. For instance one Apache thread usually occupies about 30-50 MB of RAM. I would be greatful for information what hardware is needed to build scalable configuration that is able to handle this kind of load. We are using right now two HP DLG 380 with two 4 core processors which are able to handle much lower load (typically 300-500 concurrent users). Is it reasonable to invest in this kind of boxes and build cluster using them or is it better to go with some more high-end hardware? I am particularly curious how many and how powerful servers are needed (number of processors/cores, size of RAM) what network equipment should be used (what kind of switches, network cards) any other hardware, like particular disc storage solutions, etc, that are needed Another thing is how to put together everything, that is what is the most optimal architecture. Clustering with MySQL is rather hard (people are complaining about MySQL Cluster, even here on Stackoverflow).

Read the article

IO operation taking long time for files in remote server

- by user841311

I have files of size 150 MB each in a remote server in a different domain in the network. I am accessing them thorugh UNC path. I want to read the file content and perform a basic string search. When I try reading the files line by line, the operation just don't finish and takes long time, more than 30 minutes. However when I copy those files to my local machine, the same code reads and performs the string search in less than 5 seconds. I don't have .NET framework installed in the server so I have to do this from my machine. I want to perform all this through C# code in .NET framework 3.5 so I don't want to explictly ftp all the files to my machine before performing this operation. Sample Code DirectoryInfo dir = new DirectoryInfo(@strFilePath); FileInfo[] fiArray = dir.getFiles("*.txt"); foreach (FileInfo fi in fiArray) { //reading file content from server takes long time but fast in local machine //perform string search } Let me know if my requirement is not clear. Thanks in advance!

Read the article

Batch Image compression tool for optimizing thousands of images

- by Daniel Magliola

Hi all, I'm maintaining a site that has thousands of images that have not been compressed nearly enough. The homepage weighs in at 1.5 Mb currently, and it could easily be way less that half that. I'm looking for some kind of tool that'll take a folder full of JPG pictures and will recompress them to their "optimal" compression value. Obviously, "optimal lossy compression setting" is an oxymoron, but I'm thinking maybe a tool that'll try different levels and compare the outputs to the input, and choose a "sweet spot" between size and destruction? Or even try whether PNG is a better option, many times it is, for "drawing" type stuff. Does anyone of you know any such tool? I'd have lots of fun coding one, but I bet someone already did and will save me 2 days. Alternatively, of course, anything that'll take all pictures in a folder and recompress them with a fixed quality level (say, 40) will also work, it'll just not make my inner nerd as happy, but it'll solve my problem just fine. (Ideally something that can run on Windows, ideally from the command line) Thank you!

Read the article

How to output KML by GAE

- by Niklas R

Hi I use KML for a google map where entities have a geopt.db coordinate and soft memory limit was exceeded with 213.465 MB after servicing 1 requests total. The log says /list.kml 200 13130ms 10211cpu_ms 4238api_cpu_ms The file list.kml which outputs about 455,7 KB is a template as follows <?xml version="1.0" encoding="UTF-8"?><kml xmlns="http:// www.opengis.net/kml/2.2" xmlns:gx="http://www.google.com/kml/ext/2.2" xmlns:kml="http://www.opengis.net/kml/2.2" xmlns:atom="http:// www.w3.org/2005/Atom"> <Document>{% for a in list %} <Placemark> <name> </name> <description> <![CDATA[<a href="http://{{host}}/{{a.key.id}}"> {{ a.title }} </a> <br/>{{a.text}}]]> </description> <Style> <IconStyle> <Icon> <href> http://www.google.com/intl/en_us/mapfiles/ms/icons/green-dot.png </href> </Icon> </IconStyle> </Style> <Point> <coordinates> {{a.geopt.lon|floatformat:2}},{{a.geopt.lat|floatformat:2}} </coordinates> </Point> </Placemark> {% endfor %} </Document> </kml> Is there a memory leak in the template or the python that passes the list variable? Can I improve using other template engine or other framework than default? Is kmz compression a good idea in this case? Thanks in advance for any suggestion where or how to change the code.

Read the article

Temporary storage for keeping data between program iterations?

- by mr.b

I am working on an application that works like this: It fetches data from many sources, resulting in pool of about 500,000-1,500,000 records (depends on time/day) Data is parsed Part of data is processed in a way to compare it to pre-existing data (read from database), calculations are made, and stored in database. Resulting dataset that has to be stored in database is, however, much smaller in size (compared to original data set), and ranges from 5,000-50,000 records. This process almost always updates existing data, perhaps adds few more records. Then, data from step 2 should be kept somehow, somewhere, so that next time data is fetched, there is a data set which can be used to perform calculations, without touching pre-existing data in database. I should point out that this data can be lost, it's not irreplaceable (key information can be read from database if needed), but it would speed up the process next time. Application components can (and will be) run off different computers (in the same network), so storage has to be reachable from multiple hosts. I have considered using memcached, but I'm not quite sure should I do so, because one record is usually no smaller than 200 bytes, and if I have 1,500,000 records, I guess that it would amount to over 300 MB of memcached cache... But that doesn't seem scalable to me - what if data was 5x that amount? If it were to consume 1-2 GB of cache only to keep data in between iterations (which could easily happen)? So, the question is: which temporary storage mechanism would be most suitable for this kind of processing? I haven't considered using mysql temporary tables, as I'm not sure if they can persist between sessions, and be used by other hosts in network... Any other suggestion? Something I should consider?

Read the article

Mysql Database Question about Large Columns

- by murat

Hi, I have a table that has 100.000 rows, and soon it will be doubled. The size of the database is currently 5 gb and most of them goes to one particular column, which is a text column for PDF files. We expect to have 20-30 GB or maybe 50 gb database after couple of month and this system will be used frequently. I have couple of questions regarding with this setup 1-) We are using innodb on every table, including users table etc. Is it better to use myisam on this table, where we store text version of the PDF files? (from memory usage /performance perspective) 2-) We use Sphinx for searching, however the data must be retrieved for highlighting. Highlighting is done via sphinx API but still we need to retrieve 10 rows in order to send it to Sphinx again. This 10 rows may allocate 50 mb memory, which is quite large. So I am planning to split these PDF files into chunks of 5 pages in the database, so these 100.000 rows will be around 3-4 million rows and couple of month later, instead of having 300.000-350.000 rows, we'll have 10 million rows to store text version of these PDF files. However, we will retrieve less pages, so again instead of retrieving 400 pages to send Sphinx for highlighting, we can retrieve 5 pages and it will have a big impact on the performance. Currently, when we search a term and retrieve PDF files that have more than 100 pages, the execution time is 0.3-0.35 seconds, however if we retrieve PDF files that have less than 5 pages, the execution time reduces to 0.06 seconds, and it also uses less memory. Do you think, this is a good trade-off? We will have million of rows instead of having 100k-200k rows but it will save memory and improve the performance. Is it a good approach to solve this problem and do you have any ideas how to overcome this problem? The text version of the data is used only for indexing and highlighting. So, we are very flexible. Thanks,

Read the article

Hadoop reduce task gets hung

- by user806098

I set up a hadoop cluster with 4 nodes, When running a map-reduce task, the map task finishes quickly, while the reduce task hangs at 27% percent. I checked the log, it's that the reduce task fails to fetch map output from map nodes. The job tracker log of master shows messages like this: 2011-06-27 19:55:14,748 INFO org.apache.hadoop.mapred.JobTracker: Adding task (REDUCE) 'attempt_201106271953_0001_r_000000_0' to tip task_201106271953_0001_r_000000, for tracker 'tracker_web30.bbn.com.cn:localhost/127.0.0.1:56476' And the name node log of master shows messages like this: 2011-06-27 14:00:52,898 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 54310, call register(DatanodeRegistration(202.106.199.39:50010, storageID=DS-1989397900-202.106.199.39-50010-1308723051262, infoPort=50075, ipcPort=50020)) from 192.168.225.19:16129: error: java.io.IOException: verifyNodeRegistration: unknown datanode 202.106.199.3 9:50010 However, neither the "web30.bbn.com.cn" or 202.106.199.39, 202.106.199.3 is the slave node. I think such ip/domains appear because hadoop fails to resolve a node(first in the Intranet DNS server), then it goes to a higher-level DNS server, later to the top, still fails, then the "junk" ip/domains are returned. But I checked my config, it goes like this: /etc/hosts: 127.0.0.1 localhost.localdomain localhost ::1 localhost6.localdomain6 localhost6 192.168.225.16 master 192.168.225.66 slave1 192.168.225.20 slave5 192.168.225.17 slave17 conf/core-site.xml: hadoop.tmp.dir /root/hadoop_tmp/hadoop_${user.name} fs.default.name hdfs://master:54310 io.sort.mb 1024 hdfs-site.xml: dfs.replication 3 masters: master slaves: master slave1 slave5 slave17 Also, all firewalls(iptables) are turned off, and ssh between each 2 nodes is ok. so I don't know where exact the error comes from. Please help. Thanks a lot.

Read the article

OpenGL Colour Interpolation

- by Will-of-fortune

I'm currently working on an little project in C++ and OpenGL and am trying to implement a colour selection tool similar to that in photoshop, as below. However I am having trouble with interpolation of the large square. Working on my desktop computer with a 8800 GTS the result was similar but the blending wasn't as smooth. This is the code I am using: GLfloat swatch[] = { 0,0,0, 1,1,1, mR,mG,mB, 0,0,0 }; GLint swatchVert[] = { 400,700, 400,500, 600,500, 600,700 }; glVertexPointer(2, GL_INT, 0, swatchVert); glColorPointer(3, GL_FLOAT, 0, swatch); glDrawArrays(GL_QUADS, 0, 4); Moving onto my laptop with Intel Graphics HD 3000, this result was even worse with no change in code. I thought it was OpenGL splitting the quad into two triangles, so I tried rendering using triangles and interpolating the colour in the middle of the square myself but it still doesnt quite match the result I was hoping for. Any help would be appreciated. Thanks.

Read the article

PostgreSQL, Foreign Keys, Insert speed & Django

- by Miles

A few days ago, I ran into an unexpected performance problem with a pretty standard Django setup. For an upcoming feature, we have to regenerate a table hourly, containing about 100k rows of data, 9M on the disk, 10M indexes according to pgAdmin. The problem is that inserting them by whatever method literally takes ages, up to 3 minutes of 100% disk busy time. That's not something you want on a production site. It doesn't matter if the inserts were in a transaction, issued via plain insert, multi-row insert, COPY FROM or even INSERT INTO t1 SELECT * FROM t2. After noticing this isn't Django's fault, I followed a trial and error route, and hey, the problem disappeared after dropping all foreign keys! Instead of 3 minutes, the INSERT INTO SELECT FROM took less than a second to execute, which isn't too surprising for a table <= 20M on the disk. What is weird is that PostgreSQL manages to slow down inserts by 180x just by using 3 foreign keys. Oh, disk activity was pure writing, as everything is cached in RAM; only writes go to the disks. It looks like PostgreSQL is working very hard to touch every row in the referred tables, as 3MB/sec * 180s is way more data than the 20MB this new table takes on disk. No WAL for the 180s case, I was testing in psql directly, in Django, add ~50% overhead for WAL logging. Tried @commit_on_success, same slowness, I had even implemented multi row insert and COPY FROM with psycopg2. That's another weird thing, how can 10M worth of inserts generate 10x 16M log segments? Table layout: id serial primary, a bunch of int32, 3 foreign keys to small table, 198 rows, 16k on disk large table, 1.2M rows, 59 data + 89 index MB on disk large table, 2.2M rows, 198 + 210MB So, am I doomed to either drop the foreign keys manually or use the table in a very un-Django way by defining saving bla_id x3 and skip using models.ForeignKey? I'd love to hear about some magical antidote / pg setting to fix this.

Read the article

Distributing a bundle of files across an extranet

- by John Zwinck

I want to be able to distribute bundles of files, about 500 MB per bundle, to all machines on a corporate "extranet" (which is basically a few LANs connected using various private mechanisms, including leased lines and VPN). The total number of hosts is roughly 100, and the goal is to get a copy of the bundle from one host onto all the other hosts reliably, quickly, and efficiently. One important issue is that some hosts are grouped together on single fast LANs in which case the network I/O should be done once from one group to the next and then within each group between all the peers. This is as opposed to a strict central server system where multiple hosts might each fetch the same bundle over a slow link, rather than once via the slow link and then between each other quickly. A new bundle will be produced every few days, and occasionally old bundles will be deleted (but that problem can be solved separately). The machines in question happen to run recent Linuxes, but bonus points will go to solutions which are at least somewhat cross-platform (in which case the bundle might differ per platform but maybe the same mechanism can be used). That's pretty much it. I'm not opposed to writing some code to handle this, but it would be preferable if it were one of bash, Python, Ruby, Lua, C, or C++.

Read the article

Saving "heavy" image to PDF in MATLAB - rendering problem

- by yuk

I generate a figure in MATLAB with lot amount of points (100000+) and want to save it into a PDF file. With zbuffer or painters renderer I've got very large and slowly opened file (over 4 Mb) - all points are in vector format. Using OpenGL renderer rasterize the figure in PDF, ok for the plot, but not good for text labels. The file size is about 150 Kb. Try this simplified code, for example: x=linspace(1,10,100000); y=sin(x)+randn(size(x)); plot(x,y,'.') set(gcf,'Renderer','zbuffer') print -dpdf -r300 testpdf_zb set(gcf,'Renderer','painters') print -dpdf -r300 testpdf_pa set(gcf,'Renderer','opengl') print -dpdf -r300 testpdf_op The actual figure is much more complex with several axes and different types of plots. Is there a way to rasterize the figure, but keep text labels as vectors? Another problem with OpenGL is that is does not work in terminal mode (-nosplash -nodesktop) under Mac OSX. Looks like OpenGL is not supported. I have to use terminal mode for automation. The MATLAB version I run is 2007b. Mac OSX server 10.4.

Read the article

Hiding <option>s in IE

- by Mark

I wrote this nifty function to filter select boxes when their value is changed... $.fn.cascade = function() { var opts = this.children('option'); var rel = this.attr('rel'); $('[name='+rel+']').change(function() { var val = $(this).val(); var disp = opts.filter('[rel='+val+']'); opts.filter(':visible').hide(); disp.show(); if(!disp.filter(':selected').length) { disp.filter(':first').attr('selected','selected'); } }).trigger('change'); return this; } It looks at the rel property, and if the element indicated by rel changes, then it filters the list to only show the options that have that value... for example, it works on HTML that looks like this: <select id="id-pickup_address-country" name="pickup_address-country"> <option selected="selected" value="CA">Canada </option> <option value="US">United States </option> </select> <select id="id-pickup_address-province" rel="pickup_address-country" name="pickup_address-province"> <option rel="CA" value="AB">Alberta </option> <option selected="selected" rel="CA" value="BC">British Columbia </option> <option rel="CA" value="MB">Manitoba </option>... </select> However, I just discovered it doesn't work properly in IE (of course!) which doesn't seem to allow you to hide options. How can I work around this?

Read the article

Google Spreadsheet API problem: memory exceeded

- by Robbert

Hi guys, Don't know if anyone has experience with the Google Spreadsheets API or the Zend_GData classes but it's worth a go: When I try to insert a value in a 750 row spreadsheet, it takes ages and then throws an error that my memory limit (which is 128 MB!) was exceeded. I also got this when querying all records of this spreadsheet but this I can imaging because it's quite a lot of data. But why does this happen when inserting a row? That's not too complex, is it? Here's the code I used: public function insertIntoSpreadsheet($username, $password, $spreadSheetId, $data = array()) { $service = Zend_Gdata_Spreadsheets::AUTH_SERVICE_NAME; $client = Zend_Gdata_ClientLogin::getHttpClient($username, $password, $service); $client->setConfig(array( 'timeout' => 240 )); $service = new Zend_Gdata_Spreadsheets($client); if (count($data) == 0) { die("No valid data"); } try { $newEntry = $service->insertRow($data, $spreadSheetId); return true; } catch (Exception $e) { return false; } }

Read the article

One big executable or many small DLL's?

- by Patrick

Over the years my application has grown from 1MB to 25MB and I expect it to grow further to 40, 50 MB. I don't use DLL's, but put everything in this one big executable. Having one big executable has certain advantages: Installing my application at the customer is really: copy and run. Upgrades can be easily zipped and sent to the customer There is no risk of having conflicting DLL's (where the customer has version X of the EXE, but version Y of the DLL) The big disadvantage of the big EXE is that linking times seem to grow exponentially. Additional problem is that a part of the code (let's say about 40%) is shared with another application. Again, the advantages are that: There is no risk on having a mix of incorrect DLL versions Every developer can make changes on the common code which speeds up developments. But again, this has a serious impact on compilation times (everyone compiles the common code again on his PC) and on linking times. The question http://stackoverflow.com/questions/2387908/grouping-dlls-for-use-in-executable mentions the possibility of mixing DLL's in one executable, but it looks like this still requires you to link all functions manually in your application (using LoadLibrary, GetProcAddress, ...). What is your opinion on executable sizes, the use of DLL's and the best 'balance' between easy deployment and easy/fast development?

Read the article

SQL Server Express performance issue

- by Developer IT

Hi folks ! I know my questions will sound silly and probably nobody will have perfect answer but since I am in a complete dead-end with the situation it will make me feel better to post it here. So... I have a SQL Server Express database that's 500 Mb. It contains 5 tables and maybe 30 stored procedure. This database is use to store articles and is use for the Developer It web site. Normally the web pages load quickly, let's say 2 ou 3 sec. BUT, sqlserver process uses 100% of the processor for those 2 or 3 sec. I try to find which stored procedure was the problem and I could not find one. It seems like every read into the table dans contains the articles (there are about 155,000 of them and 20 or so gets added every 15 minutes). I added few index but without luck... It is because the table is full text indexed ? Should I have order with the primary key instead of date ? I never had any problems with ordering by dates.... Should I use dynamic SQL ? Should I add the primary key into the url of the articles ? Should I use mutiple indexes for seperate columns or one big index ? I you want more details or code bits, just ask for it. Basicly, every little hint is much apreciated. Thanks.

Read the article

Image processing on bifurcation diagram to get small eps size

- by yCalleecharan

Hello, I'm producing bifurcation diagrams (which are normally used in nonlinear dynamics). These diagrams identify abrupt changes in topologies due to stability changes. These abrupt changes occur as one or more parameters pass through some critical value(s). An example is here: http://en.wikipedia.org/wiki/File:LogisticMap_BifurcationDiagram.png On the above figure, some image processing has been done so as to make the plot more visually pleasant. A bifurcation diagram usually contains hundreds of thousands of points and the resulting eps file can become very big. Journal submission in the LaTeX format require that figures are to be submitted in the eps format. In my case one of such figures can result in about 6 MB in Matlab and even much more in Gnuplot. For the example in the above figure, 100,000 x values are calculated for each r and one can imagine that the resulting eps file would be huge. The site however explains some image processing that makes the plot more visually pleasing. Can anyone explain to me stepwise how go about? I can't understand the explanation provided in the "summary" section. Will the resulting image processing also reduce the figure size? Furthermore, any tips on reducing the file size of such a huge eps figure? Thanks a lot...

Read the article

parsing xml using dom4j

- by D3GAN

My XML structure is like this: <rss> <channel> <yweather:location city="Paris" region="" country="France"/> <yweather:units temperature="C" distance="km" pressure="mb" speed="km/h"/> <yweather:wind chill="-1" direction="40" speed="11.27"/> <yweather:atmosphere humidity="87" visibility="9.99" pressure="1015.92" rising="0"/> <yweather:astronomy sunrise="8:30 am" sunset="4:54 pm"/> </channel> </rss> when I tried to parse it using dom4j SAXReader xmlReader = createXmlReader(); Document doc = null; doc = xmlReader.read( inputStream );//inputStream is input of function log.info(doc.valueOf("/rss/channel/yweather:location/@city")); private SAXReader createXmlReader() { Map<String,String> uris = new HashMap<String,String>(); uris.put( "yweather", "http://xml.weather.yahoo.com/ns/rss/1.0" ); uris.put( "geo", "http://www.w3.org/2003/01/geo/wgs84_pos#" ); DocumentFactory factory = new DocumentFactory(); factory.setXPathNamespaceURIs( uris ); SAXReader xmlReader = new SAXReader(); xmlReader.setDocumentFactory( factory ); return xmlReader; } But I got nothing in cmd but when I print doc.asXML(), my XML structure print correctly!

Read the article

How to know why an animation stutters?

- by Patrick Klug

I have a few fairly simple animations (moving text around, moving ellipses etc.) and running in full screen (1920x1080 minus the task bar) the WPF Performance Suite reports a good framerate around 50 FPS throughout the animation. Dirty Rect Addition is somewhere around 300 rect/s, the SW frames are between 0 and 4 and the HW frames are between 3 and 5. Video memory usage is around 80 MB. Problem is that the animations stutters every other half second. My machine is a new Dell laptop XPS 15 with the GeForce GT 435 with 2GB memory. - The drivers are up to date. (The same behavior occurs on my netbook (in full screen) as well so I don't think it is hardware related.) If I make the window smaller the stutter goes away. The stutter occurs with the simplest of animations - even with just a couple of elements but adding more elements certainly makes it more noticeable. How can I find out what causes this stutter? When I think of it, I have not actually seen any WPF animations which run smoothly in full screen. Is this even possible?

Read the article

Setting the default stack size on Linux globally for the program

- by wowus

So I've noticed that the default stack size for threads on linux is 8MB (if I'm wrong, PLEASE correct me), and, incidentally, 1MB on Windows. This is quite bad for my application, as on a 4-core processor that means 64 MB is space is used JUST for threads! The worst part is, I'm never using more than 100kb of stack per thread (I abuse the heap a LOT ;)). My solution right now is to limit the stack size of threads. However, I have no idea how to do this portably. Just for context, I'm using Boost.Thread for my threading needs. I'm okay with a little bit of #ifdef hell, but I'd like to know how to do it easily first. Basically, I want something like this (where windows_* is linked on windows builds, and posix_* is linked under linux builds) // windows_stack_limiter.c int limit_stack_size() { // Windows impl. return 0; } // posix_stack_limiter.c int limit_stack_size() { // Linux impl. return 0; } // stack_limiter.cpp int limit_stack_size(); static volatile int placeholder = limit_stack_size(); How do I flesh out those functions? Or, alternatively, am I just doing this entirely wrong? Remember I have no control over the actual thread creation (no new params to CreateThread on Windows), as I'm using Boost.Thread.

Read the article

HELP ME my dataset xsd content HAS GONE

- by Mustafa Magdy

I'm working in an erp project using Visual Studio 2008 Sp1, I've a typed dataset it was containg alot of datatable and alot of table adapter the .Designer.cs file was 8 MB, suddenly when i was trying to openit using visual studio designer the following code comes to me <?xml version="1.0" encoding="utf-8"?> <xs:schema id="erpDataSet" targetNamespace="http://tempuri.org/erpDataSet1.xsd" xmlns:mstns="http://tempuri.org/erpDataSet1.xsd" xmlns="http://tempuri.org/erpDataSet1.xsd" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:msprop="urn:schemas-microsoft-com:xml-msprop" attributeFormDefault="qualified" elementFormDefault="qualified"> <xs:annotation> <xs:appinfo source="urn:schemas-microsoft-com:xml-msdatasource"> <DataSource DefaultConnectionIndex="0" FunctionsComponentName="QueriesTableAdapter" Modifier="AutoLayout, AnsiClass, Class, Public" SchemaSerializationMode="IncludeSchema" xmlns="urn:schemas-microsoft-com:xml-msdatasource"> <Connections> <Connection AppSettingsObjectName="Settings" AppSettingsPropertyName="erpConnectionString" IsAppSettingsProperty="true" Modifier="Assembly" Name="erpConnectionString (Settings)" ParameterPrefix="@" PropertyReference="ApplicationSettings.Sbic.Pro My XSD file content has gone, :( :( :( I don't understand why, and how can i recover it. i mad something but i don't know if it is the reason for that or not, My connectionstring was in the settings "app.config" i removed it and add it to the resources of another project that is refernced by the main project. what can i do, pleaze help me.

Read the article

Creating a tar file with checksums included

- by wazoox

Here's my problem : I need to archive to tar files a lot ( up to 60 TB) of big files (usually 30 to 40 GB each). I would like to make checksums ( md5, sha1, whatever) of these files before archiving; however not reading every file twice (once for checksumming, twice for tar'ing) is more or less a necessity to achieve a very high archiving performance (LTO-4 wants 120 MB/s sustained, and the backup window is limited). So I'd need some way to read a file, feeding a checksumming tool on one side, and building a tar to tape on the other side, something along : tar cf - files | tee tarfile.tar | md5sum - Except that I don't want the checksum of the whole archive (this sample shell code does just this) but a checksum for each individual file in the archive. I've studied GNU tar, Pax, Star options. I've looked at the source from Archive::Tar. I see no obvious way to achieve this. It looks like I'll have to hand-build something in C or similar to achieve what I need. Perl/Python/etc simply won't cut it performance-wise, and the various tar programs miss the necessary "plugin architecture". Does anyone know of any existing solution to this before I start code-churning ?

Read the article

How to find an embedded platform?

- by gmagana

I am new to the locating hardware side of embedded programming and so after being completely overwhelmed with all the choices out there (pc104, custom boards, a zillion option for each board, volume discounts, devel kits, ahhh!!) I am asking here for some direction. Basically, I must find a new motherboard and (most likely) re-implement the program logic. Rewriting this in C/C++/Java/C#/Pascal/BASIC is not a problem for me. so my real problem is finding the hardware. This motherboard will have several other devices attached to it. Here is a summary of what I need to do: Required: 2 RS232 serial ports (one used all the time for primary UI, the second one not continuous) 1 modem (9600+ baud ok) [Modem will be in simultaneous use with only one of the serial port devices, so interrupt sharing with one serial port is OK, but not both] Minimum permanent/long term storage: Whatever O/S requires + 1 MB (executable) + 512 KB (Data files) RAM: Minimal, whatever the O/S requires plus maybe 1MB for executable. Nice to have: USB port(s) Ethernet network port Wireless network Implementation languages (any O/S I will adapt to): First choice Java/C# (Mono ok) Second choice is C/Pascal Third is BASIC Ok, given all this, I am having a lot of trouble finding hardware that will support this that is low in cost. Every manufacturer site I visit has a lot of options, and it's difficult to see if their offering will even satisfy my must-have requirements (for example they sometimes list 3 "serial ports", but it appears that only one of the three is RS232, for example, and don't mention what the other two are). The #1 constraint is cost, #2 is size. Can anyone help me with this? This little task has left me thinking I should have gone for EE and not CS :-). EDIT: A bit of background: This is a system currently in production, but the original programmer passed away, and the current hardware manufacturer cannot find hardware to run the (currently) DOS system, so I need to reimplement this in a modern platform. I can only change the programming and the motherboard hardware.

Read the article

SQLDeveloper using over 100MB of PGA+UGA

- by Leigh Riffel

Perhaps this is normal, but in my Oracle 11g database I am seeing programmers using Oracle's SQL Developer regularly consume more than 100MB of combined UGA and PGA memory. I'd like to know if this is normal and what can be done about it. Our database is on the 32 bit version of Windows 2008, so memory limitations are becoming an increasing concern. I am using the following query to show the memory usage: SELECT e.SID, e.username, e.status, b.PGA_MEMORY FROM v$session e LEFT JOIN (select y.SID, y.value pga, TO_CHAR(ROUND(y.value/1024/1024),99999999) || ' MB' PGA_MEMORY from v$sesstat y, v$statname z where y.STATISTIC# = z.STATISTIC# and NAME = 'session pga memory') b ON e.sid=b.sid WHERE (PGA)/1024/1024 > 20 ORDER BY 4 DESC; It seems that the resource usage goes up any time a table is opened in SQLDeveloper, but even when it is closed the memory does not go away. The problem is worse if the table is sorted while it was open as that seems to use even more memory. I understand how this would use memory while it is sorting, and perhaps even while it is still open, but to use memory after it is closed seems wrong to me. Can anyone confirm this? Update: I discovered that my numbers were off due to not understanding that the UGA is stored in the PGA under dedicated server mode. This makes the numbers lower than they were, but the problem still remains that SQL Developer seems to use excessive PGA.

Search Results

Search found 2417 results on 97 pages for 'mb'.

Page 86/97 | < Previous Page | 82 83 84 85 86 87 88 89 90 91 92 93 | Next Page >

- by Sandhurst

- by AntonioCS

- by Piotr Kochanski

- by user841311

- by Daniel Magliola

- by Niklas R

- by mr.b

- by murat

- by user806098

- by Will-of-fortune

- by Miles

- by John Zwinck

- by yuk

- by Mark

- by Robbert

- by Patrick

- by Developer IT

- by yCalleecharan

- by D3GAN

- by Patrick Klug

- by wowus

- by Mustafa Magdy

- by wazoox

- by gmagana

- by Leigh Riffel

< Previous Page | 82 83 84 85 86 87 88 89 90 91 92 93 | Next Page >