Search Results

Search found 14074 results on 563 pages for 'programmers'.

Page 141/563

  • Advice: The first-time interviewer's dilemma

    - by shan23
    I've been working in my first job for about 2 years now, and I've been "asked" to interview a potential teammate (whom I might have to mentor as well) on pretty short notice (2 days from now). Initially, I had been given free rein (or so I thought, and hence agreed), but today I've been told "not to pose bookish questions", implying I can only ask basic programming puzzles and things similar to the "FizzBuzz" question. I strongly believe that not knowing basic algorithmic notation (even the haziest idea of space/time complexity) or having no idea of regular expressions would make working with the guy very difficult for anyone. I know I'm asking for a lot here, but what would be a comprehensive way to test the absolutely basic requirements of a CS guy (he has 2 years of experience) without sounding too pedantic or bookish? It seems it would be legitimate to ask only C questions and simple puzzles, but I really do want something a bit different from "finding loops in linked lists", which has become the opening question of most technical interviews. This is a face-to-face interview lasting an hour or more. I looked at Steve's basic phone-screen questions, and I was wondering whether there is a guide to "basic face-to-face interview questions" that I can use (or compile from the community's answers here).

    EDIT: The position is mostly for a kernel-level C programming job, with a smattering of C++ required for writing the test framework.


  • How to attach WAR file in email from jenkins

    - by birdy
    We have a case where a developer needs to access the last successfully built WAR file from Jenkins. However, they can't access the Jenkins server. I'd like to configure Jenkins so that on every successful build it sends the WAR file to this user. I've installed the ext-email plugin and it seems to be working fine: emails are being received along with the build.log. However, the WAR file isn't being received. The WAR file lives at this path on the server:

        /var/lib/jenkins/workspace/Ourproject/dist/our.war

    So I configured that path under Post build actions. The problem is that emails are sent but the WAR file isn't attached. Do I need to do something else?


  • How is time calculation performed by a computer?

    - by Jorge Mendoza
    I need to add a certain feature to a module in a given project regarding time calculation. For this specific case I'm using Java and reading through the documentation of the Date class I found out the time is calculated in milliseconds starting from January 1, 1970, 00:00:00 GMT. I think it's safe to assume there is a similar "starting date" in other languages so I guess the specific implementation in Java doesn't matter. How is the time calculation performed by the computer? How does it know exactly how many milliseconds have passed from that given "starting date and time" to the current date and time?
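
    To make the "count from a starting date" idea concrete, here is a minimal sketch in Python rather than Java. It only shows that a timestamp is a plain count of milliseconds since 1970-01-01 00:00:00 UTC and that dates are recovered from it by arithmetic. The helper names `millis_since_epoch` and `from_millis` are made up for this illustration, and how the operating system obtains the current count from its clock is platform-specific and not shown.

```python
from datetime import datetime, timedelta, timezone

# The Unix epoch: the "starting date" the question refers to.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def millis_since_epoch(moment: datetime) -> int:
    """Whole milliseconds between the epoch and a timezone-aware datetime."""
    return (moment - EPOCH) // timedelta(milliseconds=1)


def from_millis(ms: int) -> datetime:
    """Inverse operation: turn a millisecond count back into a UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)


# Exactly one day after the epoch is 24 * 60 * 60 * 1000 milliseconds.
one_day = millis_since_epoch(datetime(1970, 1, 2, tzinfo=timezone.utc))
print(one_day)               # 86400000
print(from_millis(one_day))  # 1970-01-02 00:00:00+00:00
```

    Storing a single integer keeps comparisons and differences trivial; calendar fields such as year or weekday are derived only when needed.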


  • creating a google wave clone using php/mysql/jquery

    - by jeansymolanza
    Season's greetings to all. I have a question that has been bugging me of late. Does anyone know how one can create a Google Wave clone using PHP/MySQL/jQuery as the primary technologies? Any ideas on how this might be possible, or recommended starting points? I have some time off work and it would be an interesting project to undertake, as I want to use it in an e-learning framework next year. I will be testing the product on a local XAMPP server. I understand some of the technologies that Google Wave uses, but I am curious how these can be developed to a decent standard using PHP/MySQL/jQuery (I mention these three because I am quite adept at them). Any links to resources suited to an intermediate programmer would be appreciated. Many thanks and God bless. So far I have this: http://konrness.com/javascript/google-wave-style-scroll-bar-jquery-plugin/


  • Apache Commons PropertiesConfiguration escapes characters on Save [migrated]

    - by Anuvrat
    I am using commons-configuration from the Apache Commons library. I have a properties file which has properties like:

        blog_loc=http://my.blog.com
        blog_name="my blog name"

    I open the properties file, change the blog_name property and save the file. These are the lines of code I use:

        PropertiesConfiguration propertyFile = new PropertiesConfiguration(propertyFileName);
        propertyFile.setProperty(blog_name, "blog name");
        propertyFile.save(propertyFileName + ".out");

    Unfortunately, in the output file certain characters get escaped as follows:

        blog_loc=http:\/\/my.blog.com
        blog_name=\"blog name\"

    Is there any way of preventing escaping of the above characters?


  • C/C++: Who uses the logical operator macros from iso646.h and why?

    - by Jaime Soto
    There has been some debate at work about the merits of using the alternative spellings for C/C++ logical operators in iso646.h:

        and     &&
        and_eq  &=
        bitand  &
        bitor   |
        compl   ~
        not     !
        not_eq  !=
        or      ||
        or_eq   |=
        xor     ^
        xor_eq  ^=

    According to Wikipedia, these macros facilitate typing logical operators on international (non-US English?) and non-QWERTY keyboards. All of our development team is in the same office in Orlando, FL, USA, and from what I have seen we all use the US English QWERTY keyboard layout; even Dvorak provides all the necessary characters. Supporters of the iso646.h macros claim we should use them because they are part of the C and C++ standards. I think this argument is moot, since digraphs and trigraphs are also part of these standards and they are not even supported by default in many compilers. My rationale for opposing these macros in our team is that we do not need them since:

      - everybody on our team uses the US English QWERTY keyboard layout;
      - C and C++ programming books from the US barely mention iso646.h, if at all; and
      - new developers may not be familiar with iso646.h (this is expected if they are from the US).

    /rant

    Finally, to my set of questions:

      - Does anyone on this site use the iso646.h logical operator macros? Why?
      - What is your opinion about using the iso646.h logical operator macros in code written and maintained on US English QWERTY keyboards?
      - Is my digraph and trigraph analogy a valid argument against using iso646.h with US English QWERTY keyboard layouts?

    EDIT: I missed two similar questions on Stack Overflow: "Is anybody using the named boolean operators?" and "Which C++ logical operators do you use: and, or, not and the ilk or C style operators? why?"


  • One page using querystring or many folders and pages?

    - by ClarkeyBoy
    I have an application where I have the 'core' code in one folder, for which there is a virtual directory in the root, so that I can include any core files using /myApp/core/bla.asp. I then have two folders outside of this, each with a default.asp which currently uses the querystring to define what page should be displayed. One page is for general users; the other will only be accessible to users who have permission to manage users / usergroups / permissions. The core code checks the querystring and then checks the permissions for that user. An example of this as it is now is default.asp?action=view&viewtype=list&objectid=server. I am not worried about SEO, as this is an internal app and uses Windows Auth.

    My question is: is it better the way it is now, or would it be better to have something like the following?

        /server/view/list/
        /server/view/?id=123
        /server/create/
        /server/edit/?id=123
        /server/remove/?id=123

    In the above folders I would have a home page which defines all the variables that are currently determined by the querystring. In /server/create/, for example, I would define the action as 'create', the object name as 'server' and so on. In terms of future development, I really have no idea which method would be best. I think the second method would be best in terms of following what page does what, but this is such a huge change to make at this stage that I would really like some opinions, preferably based on experience.

    PS Sorry if the tags are wrong. I am new to this forum and thought this was a bit too much of a discussion for StackOverflow, as that is very much right / wrong answer based. I got the idea SE is more discussion based.


  • How relevant is UTF-7 when it comes to parsing emails?

    - by J. Pablo Fernández
    I recently implemented incoming emails for an application and boy, did I open the gates of hell. Since then, every other day an email arrives that makes the app fail in a different way. One of those things is emails encoded as UTF-7. Most emails come as ASCII, one of the Latin encodings or, thankfully, UTF-8. Hotmail error messages (like email address doesn't exist or quota exceeded) seem to come as UTF-7. Unfortunately, UTF-7 is not an encoding Ruby understands:

        > "hello world".encode("utf-8", "utf-7")
        Encoding::ConverterNotFoundError: code converter not found (UTF-7 to UTF-8)
        > Encoding::UTF_7
        => #<Encoding:UTF-7 (dummy)>

    My application doesn't crash; it actually handles the email quite well, but it does send me a notification about the potential error. I spent some time googling and I can't find anyone who has implemented the conversion, at least not as a Ruby 1.9.3 Encoding::Converter. So my question is: since I have never received an email with actual content, from an actual person, in UTF-7, how relevant is that encoding? Can I safely ignore it?
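
    For readers who want to see what the missing converter would have to do, here is a small sketch in Python rather than Ruby, since Python's standard library ships a "utf-7" codec. The helper name `utf7_to_utf8` is invented for this illustration, and the sample bytes are the well-known "Hi Mom" example from the UTF-7 specification, not data from the question.

```python
def utf7_to_utf8(raw: bytes) -> bytes:
    """Transcode UTF-7 bytes to UTF-8 bytes using Python's built-in codecs."""
    return raw.decode("utf-7").encode("utf-8")


# "+Jjo-" is a base64 run carrying U+263A (a smiley); the rest is plain ASCII.
sample = b"Hi Mom -+Jjo--!"
text = sample.decode("utf-7")
print(text)

# Pure-ASCII input is unchanged, which is why many UTF-7 mails look ordinary.
print(utf7_to_utf8(b"hello world"))
```

    In a mail pipeline, the same transcoding step would sit where the declared charset is read, before the body is handed to code that expects UTF-8.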


  • Is it a good practice to wrap all primitives and Strings?

    - by Amogh Talpallikar
    According to Jeff Bay's essay on Object Calisthenics, one of the practices is said to be "Wrap all primitives and Strings". Can anyone elaborate on this? In languages such as C# and Java, where we already have wrappers for primitives, and where collections can use generics so that you are sure of what type goes into the collection, do we need to wrap strings inside their own classes? Does it have any other advantage?
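
    As one possible reading of the practice, here is a minimal sketch in Python (not C# or Java) of wrapping a string in a small domain type. The `EmailAddress` class, its deliberately crude validation and the `send_welcome` function are all hypothetical names chosen for illustration; the essay itself may intend something stricter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmailAddress:
    """A string with a meaning: validated once, then passed around as a type."""

    value: str

    def __post_init__(self) -> None:
        local, sep, domain = self.value.partition("@")
        # Crude check for illustration only; real address validation is harder.
        if not (local and sep and domain):
            raise ValueError(f"not an email address: {self.value!r}")

    @property
    def domain(self) -> str:
        return self.value.partition("@")[2]


def send_welcome(to: EmailAddress) -> str:
    """The signature now says what kind of string is expected."""
    return f"welcome mail queued for {to.value}"


print(send_welcome(EmailAddress("amogh@example.com")))
```

    The generic wrappers and typed collections mentioned in the question guarantee "this is a string"; a wrapper like this one adds "this string is an email address", and gives behaviour such as `domain` a natural home.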


  • How to loop over arrays that each hold different data

    - by Suriani Salleh
    I have this XML file and I need to convert it from XML to MySQL. If it had only one array, I would know how to do it. My question is how to extract these two arrays, since each array has different data values. For example, for the first array, pmIntervalTxEthMaxUtilization data: 0,74,0,0,48; and for the second array, pmIntervalRxPowerLevel data: -79,-68,-52 and pmIntervalTxPowerLevel data: 13,11,-55. Can someone help guide me on how to write PHP code to extract this XML file into MySQL?

        <mi>
          <mts>20130618020000</mts>
          <gp>900</gp>
          <mt>pmIntervalRxUndersizedFrames</mt>    [this is first array]
          <mt>pmIntervalRxUnicastFrames</mt>
          <mt>pmIntervalTxUnicastFrames</mt>
          <mt>pmIntervalRxEthMaxUtilization</mt>
          <mt>pmIntervalTxEthMaxUtilization</mt>
          <mv>
            <moid>port:1:3:23-24</moid>
            <sf>FALSE</sf>
            <r>0</r>    [the data for 1st array i want to insert in DB]
            <r>0</r>
            <r>0</r>
            <r>5</r>
            <r>0</r>
          </mv>
        </mi>
        <mi>
          <mts>20130618020000</mts>
          <gp>900</gp>
          <mt>pmIntervalRxSES</mt>    [this is second array]
          <mt>pmIntervalRxPowerLevel</mt>
          <mt>pmIntervalTxPowerLevel</mt>
          <mv>
            <moid>client:1:3:23-24</moid>
            <sf>FALSE</sf>
            <r>0</r>    [the data for 2nd array i want to insert in DB]
            <r>-79</r>
            <r>13</r>
          </mv>
        </mi>

    This is the code I wrote for one array. I don't know how to write the code for two arrays, because the fields appear twice and have different data values for each array:

        // Loop through the specified xpath
        foreach($xml->mi->mv as $subchild) {
            $port_no = $subchild->moid;
            $rx_ses = $subchild->r[0];
            $rx_es = $subchild->r[1];
            $tx_power = $subchild->r[10];
            // dump into database;
            ...........................

    I have done a little research on it, and this is the outcome:

        $i = 0;
        while( $i < 5) {
            // Loop through the specified xpath
            foreach($xml->md->mi->mv as $subchild) {
                $port_no = $subchild->moid;
                $rx_uni = $subchild->r[10];
                $tx_uni = $subchild->r[11];
                $rx_eth = $subchild->r[16];
                $tx_eth = $subchild->r[17];
                // dump into database;
                ..............................
                $i++;
                if( $i == 5 )break;
            }
        }

        // Loop through the specified xpath
        foreach($xml->mi->mv as $subchild) {
            $port_no = $subchild->moid;
            $rx_ses = $subchild->r[0];
            $rx_es = $subchild->r[1];
            $tx_power = $subchild->r[10];
            // dump into database;
            .......................
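
    One way to avoid hard-coded indexes such as r[10] is to pair each <mt> name with the <r> value at the same position inside its own <mi> block. The sketch below shows that pairing in Python rather than PHP, so it illustrates the idea and is not a drop-in answer. The wrapping <md> root is an assumption taken from the `$xml->md->mi` path in the second snippet, the function name `rows_from_xml` is made up, and the database insert itself is left out.

```python
import xml.etree.ElementTree as ET

# Sample data from the question, wrapped in an assumed <md> root element.
SAMPLE = """<md>
<mi>
  <mts>20130618020000</mts><gp>900</gp>
  <mt>pmIntervalRxUndersizedFrames</mt>
  <mt>pmIntervalRxUnicastFrames</mt>
  <mt>pmIntervalTxUnicastFrames</mt>
  <mt>pmIntervalRxEthMaxUtilization</mt>
  <mt>pmIntervalTxEthMaxUtilization</mt>
  <mv><moid>port:1:3:23-24</moid><sf>FALSE</sf>
    <r>0</r><r>0</r><r>0</r><r>5</r><r>0</r></mv>
</mi>
<mi>
  <mts>20130618020000</mts><gp>900</gp>
  <mt>pmIntervalRxSES</mt>
  <mt>pmIntervalRxPowerLevel</mt>
  <mt>pmIntervalTxPowerLevel</mt>
  <mv><moid>client:1:3:23-24</moid><sf>FALSE</sf>
    <r>0</r><r>-79</r><r>13</r></mv>
</mi>
</md>"""


def rows_from_xml(text):
    """Return one dict per <mv>, keyed by the <mt> names of its own <mi> block."""
    rows = []
    for mi in ET.fromstring(text).iter("mi"):
        names = [mt.text for mt in mi.findall("mt")]  # column names, in order
        for mv in mi.findall("mv"):
            values = [r.text for r in mv.findall("r")]  # values, same order
            row = {"mts": mi.findtext("mts"), "moid": mv.findtext("moid")}
            row.update(zip(names, values))
            rows.append(row)
    return rows


for row in rows_from_xml(SAMPLE):
    print(row["moid"], row)
```

    Because each row carries its own measurement names, the two blocks no longer need separate loops; every dict can be written to the table that matches its keys.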


  • Exception while running an EJB program [closed]

    - by rajesh
    I'm a newbie to JEE. I'm trying a sample EJB3 program given on this website. While trying to run the client program, I get an exception like this:

        [JBossManagedConnectionPool] Throwable while attempting to get a new connection: null
        org.jboss.resource.JBossResourceException: Could not create connection;
        - nested throwable: (org.jboss.resource.JBossResourceException: Failed to register driver for: com.mysql.jdbc.Driver;
        - nested throwable: (java.lang.ClassNotFoundException: com.mysql.jdbc.Driver from BaseClassLoader@4c19dd{VFSClassLoaderPolicy@35fd58{name=vfszip:/home/rajesh/workspace/jboss-5.1.0.GA/server/default/deploy/FirstJPAproject.jar/ domain=ClassLoaderDomain@17380fc{name=DefaultDomain parentPolicy=BEFORE parent=org.jboss.bootstrap.NoAnnotationURLClassLoader@76cbf7} roots=[MemoryContextHandler@9160338[path= context=vfsmemory://3j001-rho3wm-h8v3wfs4-1-h8v5pkwn-9u real=vfsmemory://3j001-rho3wm-h8v3wfs4-1-h8v5pkwn-9u], DelegatingHandler@25335794[path=FirstJPAproject.jar context=file:/home/rajesh/workspace/jboss-5.1.0.GA/server/default/deploy/ real=file:/home/rajesh/workspace/jboss-5.1.0.GA/server/default/deploy/FirstJPAproject.jar]

    I'm using JBoss 5.1.0 and EJB 3.0. Please help me get rid of this.


  • How should I pitch moving to an agile/iterative development cycle with mandated 3-week deployments?

    - by Wayne M
    I'm part of a small team of four, and I'm the unofficial team lead (I'm lead in all but title, basically). We've largely been a "cowboy" environment, with no architecture or structure and everyone doing their own thing. Previously, our production deployments would be every few months without being on a set schedule, as things were added/removed to the task list of each developer. Recently, our CIO (semi-technical but not really a programmer) decided we will do deployments every three weeks; because of this I instantly thought that adopting an iterative development process (not necessarily full-blown Agile/XP, which would be a huge thing to convince everyone else to do) would go a long way towards helping manage expectations properly so there isn't this far-fetched idea that any new feature will be done in three weeks. IMO the biggest hurdle is that we don't have ANY kind of development approach in place right now (among other things like no CI or automated tests whatsoever). We don't even use Waterfall, we use "Tell Developer X to do a task, expect him to do everything and get it done". Are there any pointers that would help me start to ease us towards an iterative approach and A) Get the other developers on board with it and B) Get management to understand how iterative works? 
So far my idea involves trying to set up a CI server and get our build process automated (it takes about 10-20 minutes right now to simply build the application to put it on our development server), since pushing tests and/or TDD will be met with a LOT of resistance at this point, and constantly force us to break larger projects into smaller chunks that could be done iteratively in a three-week cycle; my only concern is that, unless I'm misunderstanding, an agile/iterative process may or may not release the software (depending on the project scope you might have "working" software after three weeks, but there isn't enough of it that works to let users make use of it), while I think the expectation here from management is that there will always be something "ready to go" in three weeks, and that disconnect could cause problems. On that note, is there any literature or references that explains the agile/iterative approach from a business standpoint? Everything I've seen only focuses on the developers, how to do it, but nothing seems to describe it from the perspective of actually getting the buy-in from the businesspeople.


  • A good interpreted language for a small embedded project

    - by Earlz
    I have an mbed, which has a small ARM Cortex-M3 on it. Basically, my effective resources for the project are ~25 KB of RAM and ~400 KB of flash. For I/O I'll have a PS/2 keyboard, a VGA framebuffer (with character output), and an SD card for saving/loading programs (up to a couple of MB, maybe). The reason I ask this here is that I'm trying to figure out what programming language to implement on the thing. I'm looking for an interpreted language that's easy for me to implement and won't break the bank on my resources. I also intend for it to be at least possible to write programs on the device itself, though the editor can be interpreted (yay bootstrapping). Anyway, I've looked at a few simple languages. Some nice candidates:

      - Forth
      - BASIC
      - Scheme?

    Has anyone done something like this, or does anyone know of any languages that fit this bill, or have comments about my three candidates so far?


  • Test case as a function or test case as a class

    - by GodMan
    I am having a design problem in test automation.

    Requirements: I need to test different servers (using a Unix console, not a GUI) through an automation framework.

    Tests I'm going to run: unit, system, integration.

    Question: While designing a test case, I am thinking that a test case should be part of a test suite (the test suite being a class), just as in Python's PyUnit framework. But for a scalable automation framework, should we keep test cases as functions, or should we keep them as separate classes (each having its own setup, run and teardown methods)? From an automation perspective, is a test case as a class more scalable and maintainable than a test case as a function?
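
    To compare the two shapes side by side, here is a small sketch using Python's standard `unittest` module (the PyUnit framework mentioned above). The `ping` helper is a hypothetical stand-in for a real check run over a console session, and `ServerTestCase`, `check_ping_rejects_empty_host`, `build_suite` and `run_suite` are names invented for this example.

```python
import unittest


def ping(host: str) -> bool:
    """Hypothetical stand-in for a real server check; an empty host fails."""
    return bool(host)


class ServerTestCase(unittest.TestCase):
    """Test case as a class: setup and teardown live next to the tests."""

    def setUp(self):
        self.host = "server-under-test"  # e.g. open the console session here

    def tearDown(self):
        self.host = None  # e.g. close the console session here

    def test_ping(self):
        self.assertTrue(ping(self.host))


def check_ping_rejects_empty_host():
    """Test case as a plain function: no shared state, nothing to tear down."""
    assert not ping("")


def build_suite() -> unittest.TestSuite:
    suite = unittest.TestSuite()
    suite.addTest(ServerTestCase("test_ping"))
    # FunctionTestCase lets a plain function join the same suite.
    suite.addTest(unittest.FunctionTestCase(check_ping_rejects_empty_host))
    return suite


def run_suite() -> unittest.TestResult:
    return unittest.TextTestRunner(verbosity=0).run(build_suite())
```

    Calling `run_suite()` runs both tests through the same runner. The class form tends to pay off once several tests share expensive setup, while the function form stays lighter for independent checks; both can coexist in one suite, so the choice can be made per test.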


  • Any valid reason to Nest Master Pages in ASP.Net rather than Inherit?

    - by James P. Wright
    I'm currently in a debate at work, and I cannot fathom why someone would intentionally avoid inheritance with Master Pages. For reference, here is the project setup:

        BaseProject
            MainMasterPage
                SomeEvent
        SiteProject
            SiteMasterPage nested MainMasterPage
        OtherSiteProject
            MainMasterPage (from BaseProject)

    The debate came up because some code in BaseProject needs to know about "SomeEvent". With the setup above, the code in BaseProject needs to call this.Master.Master. That same code in BaseProject also applies to OtherSiteProject, where it is accessed as just this.Master. SiteMasterPage has no code differences, only HTML differences. If SiteMasterPage inherits MainMasterPage rather than nesting it, then all code is valid as this.Master. Can anyone think of a reason to use a nested Master Page here instead of an inherited one?


  • Productivity vs Security [closed]

    - by nerijus
    I really do not know whether this is the right place to ask such a question, but it is about programming seen in a different light. I am currently contracting with a company which pretends to be a big corporation. Everyone is so important that small concerns, like developers, are ignored. To give you an example: the company VPN is configured so that while you are connected to the VPN, HTTP traffic is blocked. Bearing this in mind, can you imagine my workflow?

    Morning. OK, time to get the latest source. Oops, no VPN. Let's connect. Click-click. 3 seconds' wait. OK, getting source. Do I have emails? Oops, VPN is on, can't check my email. Need to wait for the source to come down. Finally, here it is! OK, click-click, VPN is gone. What is in my email? Someone reported a bug. Good, let's track it down. It is in TFS already. Oh, damn, I need the VPN. Click-click. OK, there is the description. Yeah, I have seen this issue on stackoverflow.com. Let's go there. Oops, no internet. Click-click. No internet. What? ipconfig... The DHCP server kicked me out. Damn. Renew IP. 1..2..3. OK, internet is back. Google: site:stackoverflow.com. 3 minutes and I have a solution. Great, I love stackoverflow.com. I don't want to remember the days when there was no stackoverflow.com. OK. Copy-paste this into Studio. Damn, Studio has stalled; it can't reach the files on TFS. Click-click. VPN is back. Get the source out, paste my code. Grand. Let's see what the other comments about the issue on stackoverflow.com say. Hmm, there is a link. Click. Dammit! No internet. Click-click. No internet. DHCP kicked me out. Dammit.

    Now it is even worse: this happens 3-4 times a day. After a certain number of VPN connections opened and closed, my internet goes down solid. The only way to get the internet back is a reboot. All my browser tabs, SQL windows and Studio will be gone. This happened just now while I was typing this. Back to the issue I am solving right now: I am getting frustrated, and I no longer care about finding a better solution. Let's do it somehow and forget it. This click-click barrier between the internet and TFS kills me. Sound familiar?

    You could say there are VPN settings to change. No! This is a company laptop, and I am not allowed to make changes. I am very, very lucky to have admin privileges on my machine; most developers don't. So I have just learned to live with this frustration. It takes away 40-60 minutes daily. I tried emailing company support and the admins. They are too important and too busy with something, so they just ignored my little man's problem. Politely ignored.

    The question is: is this normal in the corporate world? (I have been in the States, Canada and Germany, and have never seen this.)


  • Server Side Developer Prerequisites

    - by Jking
    I am new to server-side development and am currently learning node.js. What sort of networking information should I be familiar with to allow for a smooth learning curve with server-side development? Could anyone provide resources pertaining to the information required to get into server programming? To give you a better idea of where I stand:

      - I do not know how a server interacts with a database. [Q: How does a NoSQL database, or a database in general, communicate with a server?]
      - I am unsure of how a web stack works. [Q: I have heard of LAMP but do not know how Apache, MySQL, and PHP interact. Hopefully this applies to other stacks as well. How do the components of a stack work together? Also, is a MEAN stack an alternative, or is it completely irrelevant to this?]
      - I have only trivial knowledge of internet protocols. [Q: What resources are beneficial when learning about networking, and how much/what knowledge should I acquire to program on the server side?]
      - I am unsure of what I am unsure of concerning the networking information necessary to start development.

    Information on how the client-server model works would be greatly appreciated.


  • What is the right way to process inconsistent data files?

    - by Tahabi
    I'm working at a company that uses Excel files to store product data, specifically, test results from products before they are shipped out. There are a few thousand spreadsheets with anywhere from 50-100 relevant data points per file. Over the years, the schema for the spreadsheets has changed significantly, but not unidirectionally, in the sense that changes often get reverted and then re-added in the space of a few dozen to a few hundred files. My project is to convert about 8000 of these spreadsheets into a database that can be queried. I'm using MongoDB to deal with the inconsistency in the data, and Python.

    My question is, what is the "right" or canonical way to deal with the huge variance in my source files? I've written a data structure which stores the data I want for the latest template, which will be the final template used going forward, but that only helps for a few hundred files historically. Brute-forcing a solution would mean writing similar data structures for each version/template, which means potentially writing hundreds of schemas with dozens of fields each. This seems very inefficient, especially when sometimes a change in the template is as little as moving a single line of data one row down or splitting what used to be one data field into two data fields.

    A slightly more elegant solution I have in mind would be writing schemas for all the variants I can find for pre-defined groups in the source files, and then writing a function to match a particular series of files with a series of variants that matches that set of files. This is because, more often than not, most of the file will remain consistent over a long period, only marred by one or two errant sections, but inside the period, which section is inconsistent is itself inconsistent. For example, say a file has four sections with three data fields, which is represented by four Python dictionaries with three keys each.
      - For files 7000-7250, sections 1-3 will be consistent, but section 4 will be shifted one row down.
      - For files 7251-7500, 1-3 are consistent, section 4 is one row down, but a section five appears.
      - For files 7501-7635, sections 1 and 3 will be consistent, but section 2 will have five data fields instead of three, section five disappears, and section 4 is still shifted down one row.
      - For files 7636-7800, section 1 is consistent, section 4 gets shifted back up, section 2 returns to three cells, but section 3 is removed entirely.
      - Files 7800-8000 have everything in order.

    The proposed function would take the file number and match it to a dictionary representing the data mappings for different variants of each section. For example, a section_four_variants dictionary might have two members, one for the shifted-down version and one for the normal version; a section_two_variants might have three- and five-field members, etc. The script would then read the matchings, load the correct mapping, extract the data, and insert it into the database.

    Is this an accepted/right way to go about solving this problem? Should I structure things differently? I don't know what to search Google for either to see what other solutions might be, though I believe the problem lies in the domain of ETL processing. I also have no formal CS training aside from what I've taught myself over the years. If this is not the right forum for this question, please tell me where to move it, if at all. Any help is most appreciated. Thank you.
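
    The per-section variant lookup described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the field names, the (row, column) cell coordinates and the dict standing in for a spreadsheet are all invented, and only the file-number boundaries for section 4 (shifted down from file 7000, back to normal from file 7636) come from the question.

```python
from bisect import bisect_right

# Hypothetical mappings for section 4: field name -> (row, column) in the sheet.
SECTION_FOUR_VARIANTS = {
    "normal": {"voltage": (40, 2), "current": (41, 2), "result": (42, 2)},
    "shifted_down": {"voltage": (41, 2), "current": (42, 2), "result": (43, 2)},
}

# (first file number where the variant applies, variant name), sorted by number.
SECTION_FOUR_RANGES = [(7000, "shifted_down"), (7636, "normal")]


def variant_for(file_number, ranges):
    """Pick the variant whose range start is the latest one <= file_number."""
    starts = [start for start, _ in ranges]
    index = bisect_right(starts, file_number) - 1
    if index < 0:
        raise LookupError(f"no variant recorded for file {file_number}")
    return ranges[index][1]


def extract_section(sheet, mapping):
    """Read each mapped cell; `sheet` is a {(row, col): value} stand-in."""
    return {field: sheet.get(cell) for field, cell in mapping.items()}


# Usage with a made-up sheet for file 7100 (section 4 shifted one row down).
sheet_7100 = {(41, 2): 3.3, (42, 2): 0.5, (43, 2): "PASS"}
name = variant_for(7100, SECTION_FOUR_RANGES)
print(name, extract_section(sheet_7100, SECTION_FOUR_VARIANTS[name]))
```

    Keeping one small range table per section means a template change that touches one section adds one entry to one table, instead of a whole new schema for the file.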


  • How to write good code with new stuff?

    - by Reza M.
    I always try to write easily readable code that is well structured. I face a particular problem when I am messing around with something new: I keep changing the code, the structure and so many other things. In the end, I look at the code and am annoyed at how complicated it became when I was trying to do something so simple. Once I've completed something, I refactor it heavily so that it's cleaner. This happens after completion most of the time, and it is annoying because the bigger the code, the more annoying it is to rewrite. I am curious to know how people deal with such agony, especially on big projects shared between many people.


  • Designing persistence schema for BigTable on AppEngine

    - by Vitalij Zadneprovskij
    I have tried to design the datastore schema for a very small application. That schema would have been very simple, if not trivial, using a relational database with foreign keys, many-to-many relations, joins, etc. But the problem was that my application was targeted for Google App Engine and I had to design for a database that was not relational. At the end I gave up. Is there a book or an article that describes design principles for applications that are meant for such databases? The books that I have found are about programming for App Engine and they don't spend many words about database design principles.


  • What resources do you recommend for learning more about TCP/IP, networking, and related areas?

    - by mkelley33
    As a relatively-new Python programmer, I'm finding more and more that networking as it relates to the web and web development is becoming increasingly important to understand. When I was an active C# ASP.NET programmer making smaller websites with less responsibility this knowledge seemed less important, since there was often a "networking" guy performing any tasks beyond acquiring a domain name for a client. Which books, websites, presentations, articles, or other resources would you recommend so that I best understand what's happening between the time a user types a URL and receives the rendered HTML? Thanks!


  • Rewrote GNU GPL v2 code in another language: can I change a license?

    - by Anton Gogolev
    I rewrote some parts of Mercurial (which is licensed under GNU GPL v2) in C#. Naturally, I looked a lot at the original Python code, and some parts are direct translations from Python to C#. Is it possible to have "my code" licensed under different terms, or even to make it part of a closed-source commercial application? If not, can I re-license "my code" under the LGPL, open-source it, and then use this open-sourced C# library in my closed-source commercial application?


  • Button position not changing in View Controller. (Xcode)

    - by theCodeKing
    I have a view controller in Xcode 6 (beta 5). I have put 4 buttons in it from the Object library in a .xib. But when I open the app in the iOS Simulator, the buttons are at the right y-position but not the correct x-position (they are on the right edge). No matter where I move them in the xib, only their y-position changes. I even moved them using the Size inspector, but to no avail. How can I actually move them?


  • The use of Test-Driven Development in Non-Greenfield Projects?

    - by JHarley1
    Here is a question for you, having read some great answers to questions such as "Test-Driven Development - Convince Me": can Test-Driven Development be used effectively on non-greenfield projects? To be specific, I would really like to know whether people have experience of using TDD on projects where non-TDD elements were already present, and what problems they then faced.


  • Design review for application facing memory issues

    - by Mr Moose
    I apologise in advance for the length of this post, but I want to paint an accurate picture of the problems my app is facing and then pose some questions below. I am trying to address some self-inflicted design pain that is now leading to my application crashing due to out-of-memory errors. An abridged description of the problem domain is as follows:

      - The application takes in a "dataset" that consists of numerous text files containing related data.
      - An individual text file within the dataset usually contains approx. 20 "headers" that contain metadata about the data it contains. It also contains a large tab-delimited section containing data that is related to data in one of the other text files within the dataset. The number of columns per file is very variable, from 2 to 256+ columns.

    The original application was written to allow users to load a dataset and map certain columns of each of the files, which basically indicates the key information in the files to show how they are related, as well as identifying a few expected column names. Once this is done, a validation process takes place to enforce various rules and ensure that all the relationships between the files are valid. Once that is done, the data is imported into a SQL Server database. The database design is an EAV (Entity-Attribute-Value) model, used to cater for the variable columns per file. I know EAV has its detractors, but in this case I feel it was a reasonable choice given the disparate data and variable number of columns submitted in each dataset.

    The memory problem

    Given that the combined size of all text files was at most about 5 megs, and in an effort to reduce the database transaction time, it was decided to read ALL the data from the files into memory and then perform the following:

      - Perform all the validation whilst the data is in memory.
      - Relate it using an object model.
      - Start a DB transaction and write the key columns row by row, noting the Id of the written row (all tables in the database utilise identity columns); the Id of the newly written row is then applied to all related data.
      - Once all related data has been updated with the key information to which it relates, these records are written using SqlBulkCopy. Due to our EAV model, we essentially have x columns by y rows to write, where x can be 256+ and rows often run into the tens of thousands.
      - Once all the data is written without error (this can take several minutes for large datasets), commit the transaction.

    The problem now comes from the fact that we are receiving individual files containing over 30 megs of data. In a dataset, we can receive any number of files. We've started seeing datasets of around 100 megs coming in, and I expect it is only going to get bigger from here on in. With files of this size, data can't even be read into memory without the app falling over, let alone be validated and imported. I anticipate having to modify large chunks of the code to allow validation to occur by parsing files line by line, and I have not yet decided how to handle the import and transactions.

    Potential improvements

      - I've wondered about using GUIDs to relate the data rather than relying on identity fields. This would allow data to be related prior to writing to the database. It would certainly increase the storage required, though, especially in an EAV design. Would you think this is a reasonable thing to try, or do I simply persist with identity fields (natural keys can't be trusted to be unique across all submitters)?
      - Use of staging tables to get data into the database, only performing the transaction to copy data from the staging area to the actual destination tables.

    Questions

      - For systems like this that import large quantities of data, how do you go about keeping transactions small? I've kept them as small as possible in the current design, but they are still active for several minutes and write hundreds of thousands of records in one transaction. Is there a better solution?
      - The tab-delimited data section is read into a DataTable to be viewed in a grid. I don't need the full functionality of a DataTable, so I suspect it is overkill. Is there any way to turn off various features of DataTables to make them more lightweight?
      - Are there any other obvious things you would do in this situation to minimise the memory footprint of the application described above?

    Thanks for your kind attention.
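
    The "parse files line by line" direction mentioned above can be sketched as a pair of generators, so that only one batch of rows is ever held in memory. This is a Python illustration, not the .NET code the post describes: the function names `eav_rows` and `batches` are invented, the metadata "headers" are assumed to have been stripped already so that the first line holds the column names, and the bulk write to a staging table is left as a comment.

```python
import csv
import io
from itertools import islice


def eav_rows(lines, key_column):
    """Lazily turn tab-delimited lines into (entity, attribute, value) triples."""
    reader = csv.reader(lines, delimiter="\t")
    header = next(reader)
    key_index = header.index(key_column)
    for record in reader:
        entity = record[key_index]
        for name, value in zip(header, record):
            if name != key_column:
                yield (entity, name, value)


def batches(rows, size):
    """Group any iterable into lists of at most `size` items."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Usage with a tiny in-memory file; a real run would pass an open file object.
data = io.StringIO("id\ta\tb\n1\tx\ty\n2\tz\tw\n")
for batch in batches(eav_rows(data, "id"), 3):
    # A real importer would bulk-write each batch to a staging table here.
    print(batch)
```

    Peak memory then depends on the batch size rather than the file size, and each batch can be written in its own short transaction if staging tables are used.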

