Search Results

Search found 356 results on 15 pages for 'datasets'.

Page 6/15 | < Previous Page | 2 3 4 5 6 7 8 9 10 11 12 13  | Next Page >

  • links for 2010-04-07

    - by Bob Rhubart
    James McGovern: Enterprise Architecture and Social CRM "With a few exceptions, the vast majority of enterprise architects I know spend an awful lot of time focused on internal issues whether it is rationalization, the cloud, storage governance, data center consolidation, creation of reference architectures, portfolio management and other considerations that aren’t even visible to customers. One should ask whether IT can be truly successful if we are busy listening to the business but otherwise are blissfully ignorant towards the customers they serve." -- James McGovern (tags: enterprisearchitecture crm socialcomputing) WRF Benchmark: X6275 Beats Power6 - BestPerf "Oracle's Sun Blade X6275 cluster is 28% faster than the IBM POWER6 cluster on Weather Research and Forecasting (WRF) continental United Status (CONUS) benchmark datasets. The Sun Blade X6275 cluster used a Quad Data Rate (QDR) InfiniBand connection along with Intel compilers and MPI." (tags: oracle sun x6275 benchmarks)

    Read the article

  • Exam 71-516: Accessing Data with Microsoft .NET Framework 4

    - by Ricardo Peres
    I had the chance to take the beta version of exam 71-516 today. Here are my thoughts on it: first, I was rather annoyed to discover that I will only know if I passed or not about 8 weeks after the beta period expires (July, 02), which probably means September. It was a difficult exam, especially since I don't have any practice on some of the new Entity Framework options. The items covered, from the most covered to the least covered, were: Entity Framework (50-50 for POCO/Non-POCO) LINQ to SQL WCF Data Services Classic ADO.NET (DataSets, DataTables, DataAdapters, TableAdapters, Connections and Commands LINQ to XML Sync Framework (surprise!) All added up, I think it was a difficult exam. My advise is that you practice a lot! I will post the result as soon as I know it.

    Read the article

  • Design review for application facing memory issues

    - by Mr Moose
    I apologise in advance for the length of this post, but I want to paint an accurate picture of the problems my app is facing and then pose some questions below; I am trying to address some self inflicted design pain that is now leading to my application crashing due to out of memory errors. An abridged description of the problem domain is as follows; The application takes in a “dataset” that consists of numerous text files containing related data An individual text file within the dataset usually contains approx 20 “headers” that contain metadata about the data it contains. It also contains a large tab delimited section containing data that is related to data in one of the other text files contained within the dataset. The number of columns per file is very variable from 2 to 256+ columns. The original application was written to allow users to load a dataset, map certain columns of each of the files which basically indicating key information on the files to show how they are related as well as identify a few expected column names. Once this is done, a validation process takes place to enforce various rules and ensure that all the relationships between the files are valid. Once that is done, the data is imported into a SQL Server database. The database design is an EAV (Entity-Attribute-Value) model used to cater for the variable columns per file. I know EAV has its detractors, but in this case, I feel it was a reasonable choice given the disparate data and variable number of columns submitted in each dataset. The memory problem Given the fact the combined size of all text files was at most about 5 megs, and in an effort to reduce the database transaction time, it was decided to read ALL the data from files into memory and then perform the following; perform all the validation whilst the data was in memory relate it using an object model Start DB transaction and write the key columns row by row, noting the Id of the written row (all tables in the database utilise identity columns), then the Id of the newly written row is applied to all related data Once all related data had been updated with the key information to which it relates, these records are written using SqlBulkCopy. Due to our EAV model, we essentially have; x columns by y rows to write, where x can by 256+ and rows are often into the tens of thousands. Once all the data is written without error (can take several minutes for large datasets), Commit the transaction. The problem now comes from the fact we are now receiving individual files containing over 30 megs of data. In a dataset, we can receive any number of files. We’ve started seen datasets of around 100 megs coming in and I expect it is only going to get bigger from here on in. With files of this size, data can’t even be read into memory without the app falling over, let alone be validated and imported. I anticipate having to modify large chunks of the code to allow validation to occur by parsing files line by line and am not exactly decided on how to handle the import and transactions. Potential improvements I’ve wondered about using GUIDs to relate the data rather than relying on identity fields. This would allow data to be related prior to writing to the database. This would certainly increase the storage required though. Especially in an EAV design. Would you think this is a reasonable thing to try, or do I simply persist with identity fields (natural keys can’t be trusted to be unique across all submitters). Use of staging tables to get data into the database and only performing the transaction to copy data from staging area to actual destination tables. Questions For systems like this that import large quantities of data, how to you go about keeping transactions small. I’ve kept them as small as possible in the current design, but they are still active for several minutes and write hundreds of thousands of records in one transaction. Is there a better solution? The tab delimited data section is read into a DataTable to be viewed in a grid. I don’t need the full functionality of a DataTable, so I suspect it is overkill. Is there anyway to turn off various features of DataTables to make them more lightweight? Are there any other obvious things you would do in this situation to minimise the memory footprint of the application described above? Thanks for your kind attention.

    Read the article

  • Google I/O 2010 - Tips and tricks for Google Earth API and KML

    Google I/O 2010 - Tips and tricks for Google Earth API and KML Google I/O 2010 - Mapping in 3D: Tips and tricks for Google Earth API and KML Geo 201 Josh Livni, Mano Marks Google Earth and the Earth API can handle a tremendous amount of data. But you always have more. We will talk about integrating large datasets efficiently, coding for optimal performance, and taking advantage of advanced features in KML and the Earth API. For all I/O 2010 sessions, please go to code.google.com From: GoogleDevelopers Views: 14 0 ratings Time: 01:01:18 More in Science & Technology

    Read the article

  • Google I/O 2010 - BigQuery and Prediction APIs

    Google I/O 2010 - BigQuery and Prediction APIs Google I/O 2010 - BigQuery and Prediction APIs App Engine 101 Amit Agarwal, Max Lin, Gideon Mann, Siddartha Naidu Google relies heavily on data analysis and has developed many tools to understand large datasets. Two of these tools are now available on a limited sign-up basis to developers: (1) BigQuery: interactive analysis of very large data sets and (2) Prediction API: make informed predictions from your data. We will demonstrate their use and give instructions on how to get access. For all I/O 2010 sessions, please go to code.google.com From: GoogleDevelopers Views: 6 0 ratings Time: 57:48 More in Science & Technology

    Read the article

  • Interactive manifest editing with the Automated Installer Manifest Wizard

    - by Glynn Foster
    Oracle Solaris 11.2 adds a new Automated Installer (AI) Manifest Wizard to allow administrators to more easily create AI manifests for use in provisioning new client systems in the data center. The AI Manifest Wizard is a web web based interface that steps administrators through the basics of the AI manifest - target disks and layout selection, additional ZFS pools and datasets, IPS publisher and package selection, and the creation of any Oracle Solaris Zone virtual environments. The end result is an AI manifest without having to directly edit XML, and this can then be associated with an appropriate AI service. To get started, check out How To Create an Automated Installer Manifest with an Interactive Wizard

    Read the article

  • Is there a modern (eg NoSQL) web analytics solution based on log files?

    - by Martin
    I have been using Awstats for many years to process my log files. But I am missing many possibilities (like cross-domain reports) and I hate being stuck with extra fields I created years ago. Anyway, I am not going to continue to use this script. Is there a modern apache logs analytics solution based on modern storage technologies like NoSQL or at least somehow ready to cope with large datasets efficiently? I am primarily looking for something that generates nice sortable and searchable outputs with the focus on web analytics, before having to write my own frontends. (so graylog2 is not an option) This question is purely about log file based solutions.

    Read the article

  • Which programming language could I use for Natural Language Processing to extract clinical words?

    - by MACEE
    I am going to do entity extraction (like Named Entity Recognition) from clinical free text (unstructured raw text such as discharge summaries) and these entities could be any medical problem, medical tests or treatments. I am going to use one of i2b2 datasets (https://www.i2b2.org/) if case you are familiar with that. I am new to the NLP(Natural Language Processing) field and I need a programming language to support NLP tasks and also easily connect to the available libraries of machine learning algorithms like CRF. I don't know much java and I heard about Python, Perl and Scala but I am not sure which one would be the best option for this task?

    Read the article

  • How do I start correctly in building database classes in c#?

    - by e4rthdog
    I am new in C# programming and in OOP. I need to dive into web applications for my company, and I need to do it fast and correct. So even that I know ASP.NET MVC is the way to go, I want to start with some simple applications with ASP.NET Webforms and then advance to MVC logic. Also regarding my db classes: I plan to create common database classes in order to be able to use them either from WinForms or ASP.NET applications. I also know that the way to go is to learn about ORM and EF. BUT I also want to start from where I am feeling comfortable and that is the traditional ADO.NET way. So about my Data Access Layer classes: Should I return my results in datasets or arraylists/lists? Should my methods do their own connect/disconnect from the db, or have separate methods and let the application maintain the connection?

    Read the article

  • Automatically kill a process if it exceeds a given amount of RAM

    - by chrisamiller
    I work on large-scale datasets. When testing new software, a script will sometimes sneak up on me, quickly grab all available RAM, and render my desktop unusable. I'd like a way to set a RAM limit for a process so that if it exceeds that amount, it will be killed automatically. A language-specific solution probably won't work, as I use all sorts of different tools (R, Perl, Python, Bash, etc). So is there some sort of process-monitor that will let me set a threshold amount of RAM and automatically kill a process if it uses more?

    Read the article

  • ETL Software Research Question

    - by WernerCD
    Where I work, we use an in-house ETL solution that's homegrown and has been around for 5-10 years. I'm still new to my data analysis job, but I was wondering about the ETL tools that are out there. This is a new area for me. My situation, and job, is basically digging in a set of databases (DB2, SQL2005, Citrix, Ancient Cobol Database with a SQL Wrapper on top, MySQL, etc). Gather the desired information. combine the different datasets into one set. output into a file of choice (CSV, Tab Separated, Pipe Separated, XLS, etc). FTP to customer. I guess what my real question is, given my job, what are some good ETL suites that I can look at and compare to my in-house tools? This is more to research some other options. Ultimately, I'd either suggest a new solution or get options/ideas to improve our current app.

    Read the article

  • ASP.NET Combo Box and List Box Performance Improvements - v2010 vol 1

    Check out this great new performance feature of our ASP.NET combo box and list box controls for the DXperience v2010.1 release. You can now manually populate lists with items based on the currently applied filter criteria. This means that you can significantly decrease web server workload by loading only a subset of all items when working with large datasets. For instance, when using a large data source, you can only request a few records to be visible on the screen. The rest of the items can...Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

    Read the article

  • sqlbulkcopy using sql CE

    - by harrisonmeister
    Is it possible to use SqlBulkcopy with Sql Compact Edition e.g. (*.sdf) files? I know it works with SQL Server 200 Up, but wanted to check CE compatibility. If it doesnt does anyone else know the fastest way of getting a CSV type file into SQL Server CE without using DataSets (puke here)?

    Read the article

  • Prevent scrollbars with WPF WebBrowser displaying content

    - by Kevin Montrose
    I'm using the WPF WebBrowser component to display some very simple HTML content. However, since I don't know the content size in advance, I'm currently getting scrollbars on the control when I load certain datasets. Basically, how can I force (or otherwise effect the equivalent of forcing) the WebBrowser to expand in size so that all content is displayed without the need for scrollbars?

    Read the article

  • Test data generators / quickest route to generating solid, non-repetitive, but not-real database sam

    - by Jamo
    I need to build a quick feasibility test / proof-of-concept of a remote database for a client, that will be populated with mostly-typical Company and People data (names, addresses, etc); 150K records or so. The sample databases mentioned here were helpful: http://stackoverflow.com/questions/57068/good-databases-with-sample-data ...but, I'd like to be able to generate sample data like this easily on less-typical datasets as well. Anyone have any recommendations for off-the-shelf (or off-the-web) solutions?

    Read the article

  • Mongoid or MongoMapper?

    - by PanosJee
    I have tried MongoMapper and it is feature complete (offering almost all AR functionality) but i was not very happy with the performance when using large datasets. Has anyone compared with Mongoid? Any performance gains ?

    Read the article

  • How to do wpf datavalidation with Ado.net

    - by biju
    How can i use data validation mechanisms with ado.net datatable or datasets. I have an input form which i am binding to a datatable.Now i want to do input validation how can i do that.I have tried using validationRules but i cant bind parameters to it.I tried using idataerrorinfo but cant get a clue.can someone provide some input..?

    Read the article

  • Merits of .NET ORM data access methods Enity Framework vs. NHibernate vs. Subsonic vs. ADO.NET Datas

    - by Lloyd
    I have recently heard "fanboys" of different .NET ORM methodologies express strong, if not outlandish oppinions of other ORM methodologies. And frankly feel a bit in the dark. Could you please explain the key merits of each of these .NET ORM solutions? Entity Framework NHibernate Subsonic ADO.NET Datasets I have a good understanding of 1&4, and a cursory understanding of 2&3, but apparently not enough to understand the implied cultural perceptions of one towards the other.

    Read the article

  • Use cases for NoSQL

    - by seengee
    NoSQL has been getting a lot of attention in the industry recently. Really interested in peoples thoughts on the best use-cases for its use over relational database storage. What should trigger a developer into thinking that particular datasets are more suited to a NoSQL solution like Redis/CouchDB/MongoDB/Cassandra etc. Would also be really interested to hear what people have ported from relational db's to NoSQL and what improvements they have seen.

    Read the article

  • Multiple Connection Types for one Designer Generated TableAdapter

    - by Tim
    I have a Windows Forms application with a DataSet (.xsd) that is currently set to connect to a Sql Ce database. Compact Edition is being used so the users can use this application in the field without an internet connection, and then sync their data at day's end. I have been given a new project to create a supplemental web interface for displaying some of the same reports as the Windows Forms application so certain users can obtain reports without installing the Windows app. What I've done so far is create a new Web Project and added it to my current Solution. I have split both the reports (.rdlc) and DataSets out of the Windows Forms project into their own projects so they can be accessed by both the Windows and Web applications. So far, this is working fine. Here's my dilemma: As I said before, the DataSets are currently set up to connect to a local Sql Ce database file. This is correct for the Windows app, but for the Web application I would like to use these same TableAdapters and queries to connect to the Sql Server 2005 database. I have found that the designer generated, strongly-typed TableAdapter classes have a ConnectionModifier property that allows you to make the TableAdapter's Connection public. This exposes the Connection property and allows me to set it, however it is strongly-typed as a SqlCeConnection, whereas I would like to set it to a SqlConnection for my Web project. I'm assuming the DataSet Designer strongly-types the Connection, Command, and DataAdapter objects based on the Provider of the ConnectionString as indicated in the app.config file. Is there any way I can use some generic provider so that the DataSet Designer will use object types that can connect to both a Sql Ce database file AND the actual Sql Server 2005 database? I know that SqlCeConnection and SqlConnection both inherit from DbConnection, which implements IDbConnection. Relatively, the same goes for SqlCeCommand/SqlCommand:DbCommand:IDbCommand. It would be nice if I could just figure out a way for the designer to use the Interface types rather than the strong types, but I'm hesitant that that is possible. I hope my problem and question are clear. Any help is much appreciated. Let me know if there's anything I can clarify.

    Read the article

  • Visualize Classifier Error Weka

    - by user1780592
    Hye there i have a have datasets where this data i have test it on weka with J48 classifier It give me an output = 87.2611% Total of instances = 157 Correctly Instances = 137 Incorrectly instance = 20 Then i have do a visualize classifier error on my data. However my result have been decrease to: New result = 85.4015% Correctly Instances = 117 Incorrectly instances = 20 Total of instances = 137 Is there any reason for that? Should my result become much better after i do the visualize classifier error?

    Read the article

  • two UserControls, one page, need to notify each other of updates

    - by jeriley
    I've got a user control thats used twice on the same page, each have the ability to be updated (a dropdown list gets a new item) and I'm not sure what might be the best way to handle this. One concern - this is an older system (~4+ years, datasets, .net2) and it is amazingly brittle. I did manage to have it run on 3.5 with no problems, but I've had a few run-ins with the javascript validation (~300 lines per page) throwing up all over the place when I change/add/modify controls in the parent.

    Read the article

  • Flex Action Script timeout issue

    - by user179056
    Hello, Our flex (flare) application keeps timing out when rendering large datasets. Is there anyway to prevent this? we have tried to increase the timeout in the Application tag and the compiler settings. Not mushc success. Any other thoughts? regards Sameer

    Read the article

< Previous Page | 2 3 4 5 6 7 8 9 10 11 12 13  | Next Page >