Search Results

Search found 58475 results on 2339 pages for 'data mining'.

Page 20/2339 | < Previous Page | 16 17 18 19 20 21 22 23 24 25 26 27  | Next Page >

  • Building dynamic OLAP data marts on-the-fly

    - by DrJohn
    At the forthcoming SQLBits conference, I will be presenting a session on how to dynamically build an OLAP data mart on-the-fly. This blog entry is intended to clarify exactly what I mean by an OLAP data mart, why you may need to build them on-the-fly and finally outline the steps needed to build them dynamically. In subsequent blog entries, I will present exactly how to implement some of the techniques involved. What is an OLAP data mart? In data warehousing parlance, a data mart is a subset of the overall corporate data provided to business users to meet specific business needs. Of course, the term does not specify the technology involved, so I coined the term "OLAP data mart" to identify a subset of data which is delivered in the form of an OLAP cube which may be accompanied by the relational database upon which it was built. To clarify, the relational database is specifically create and loaded with the subset of data and then the OLAP cube is built and processed to make the data available to the end-users via standard OLAP client tools. Why build OLAP data marts? Market research companies sell data to their clients to make money. To gain competitive advantage, market research providers like to "add value" to their data by providing systems that enhance analytics, thereby allowing clients to make best use of the data. As such, OLAP cubes have become a standard way of delivering added value to clients. They can be built on-the-fly to hold specific data sets and meet particular needs and then hosted on a secure intranet site for remote access, or shipped to clients' own infrastructure for hosting. Even better, they support a wide range of different tools for analytical purposes, including the ever popular Microsoft Excel. Extension Attributes: The Challenge One of the key challenges in building multiple OLAP data marts based on the same 'template' is handling extension attributes. These are attributes that meet the client's specific reporting needs, but do not form part of the standard template. Now clearly, these extension attributes have to come into the system via additional files and ultimately be added to relational tables so they can end up in the OLAP cube. However, processing these files and filling dynamically altered tables with SSIS is a challenge as SSIS packages tend to break as soon as the database schema changes. There are two approaches to this: (1) dynamically build an SSIS package in memory to match the new database schema using C#, or (2) have the extension attributes provided as name/value pairs so the file's schema does not change and can easily be loaded using SSIS. The problem with the first approach is the complexity of writing an awful lot of complex C# code. The problem of the second approach is that name/value pairs are useless to an OLAP cube; so they have to be pivoted back into a proper relational table somewhere in the data load process WITHOUT breaking SSIS. How this can be done will be part of future blog entry. What is involved in building an OLAP data mart? There are a great many steps involved in building OLAP data marts on-the-fly. The key point is that all the steps must be automated to allow for the production of multiple OLAP data marts per day (i.e. many thousands, each with its own specific data set and attributes). Now most of these steps have a great deal in common with standard data warehouse practices. The key difference is that the databases are all built to order. The only permanent database is the metadata database (shown in orange) which holds all the metadata needed to build everything else (i.e. client orders, configuration information, connection strings, client specific requirements and attributes etc.). The staging database (shown in red) has a short life: it is built, populated and then ripped down as soon as the OLAP Data Mart has been populated. In the diagram below, the OLAP data mart comprises the two blue components: the Data Mart which is a relational database and the OLAP Cube which is an OLAP database implemented using Microsoft Analysis Services (SSAS). The client may receive just the OLAP cube or both components together depending on their reporting requirements.  So, in broad terms the steps required to fulfil a client order are as follows: Step 1: Prepare metadata Create a set of database names unique to the client's order Modify all package connection strings to be used by SSIS to point to new databases and file locations. Step 2: Create relational databases Create the staging and data mart relational databases using dynamic SQL and set the database recovery mode to SIMPLE as we do not need the overhead of logging anything Execute SQL scripts to build all database objects (tables, views, functions and stored procedures) in the two databases Step 3: Load staging database Use SSIS to load all data files into the staging database in a parallel operation Load extension files containing name/value pairs. These will provide client-specific attributes in the OLAP cube. Step 4: Load data mart relational database Load the data from staging into the data mart relational database, again in parallel where possible Allocate surrogate keys and use SSIS to perform surrogate key lookup during the load of fact tables Step 5: Load extension tables & attributes Pivot the extension attributes from their native name/value pairs into proper relational tables Add the extension attributes to the views used by OLAP cube Step 6: Deploy & Process OLAP cube Deploy the OLAP database directly to the server using a C# script task in SSIS Modify the connection string used by the OLAP cube to point to the data mart relational database Modify the cube structure to add the extension attributes to both the data source view and the relevant dimensions Remove any standard attributes that not required Process the OLAP cube Step 7: Backup and drop databases Drop staging database as it is no longer required Backup data mart relational and OLAP database and ship these to the client's infrastructure Drop data mart relational and OLAP database from the build server Mark order complete Start processing the next order, ad infinitum. So my future blog posts and my forthcoming session at the SQLBits conference will all focus on some of the more interesting aspects of building OLAP data marts on-the-fly such as handling the load of extension attributes and how to dynamically alter the structure of an OLAP cube using C#.

    Read the article

  • The Best Data Integration for Exadata Comes from Oracle

    - by maria costanzo
    Oracle Data Integrator and Oracle GoldenGate offer unique and optimized data integration solutions for Oracle Exadata. For example, customers that choose to feed their data warehouse or reporting database with near real-time throughout the day, can do so without decreasing  performance or availability of source and target systems. And if you ask why real-time, the short answer is: in today’s fast-paced, always-on world, business decisions need to use more relevant, timely data to be able to act fast and seize opportunities. A longer response to "why real-time" question can be found in a related blog post. If we look at the solution architecture, as shown on the diagram below,  Oracle Data Integrator and Oracle GoldenGate are both uniquely designed to take full advantage of the power of the database and to eliminate unnecessary middle-tier components. Oracle Data Integrator (ODI) is the best bulk data loading solution for Exadata. ODI is the only ETL platform that can leverage the full power of Exadata, integrate directly on the Exadata machine without any additional hardware, and by far provides the simplest setup and fastest overall performance on an Exadata system. We regularly see customers achieving a 5-10 times boost when they move their ETL to ODI on Exadata. For  some companies the performance gain is even much higher. For example a large insurance company did a proof of concept comparing ODI vs a traditional ETL tool (one of the market leaders) on Exadata. The same process that was taking 5hrs and 11 minutes to complete using the competing ETL product took 7 minutes and 20 seconds with ODI. Oracle Data Integrator was 42 times faster than the conventional ETL when running on Exadata.This shows that Oracle's own data integration offering helps you to gain the most out of your Exadata investment with a truly optimized solution. GoldenGate is the best solution for streaming data from heterogeneous sources into Exadata in real time. Oracle GoldenGate can also be used together with Data Integrator for hybrid use cases that also demand non-invasive capture, high-speed real time replication. Oracle GoldenGate enables real-time data feeds from heterogeneous sources non-invasively, and delivers to the staging area on the target Exadata system. ODI runs directly on Exadata to use the database engine power to perform in-database transformations. Enterprise Data Quality is integrated with Oracle Data integrator and enables ODI to load trusted data into the data warehouse tables. Only Oracle can offer all these technical benefits wrapped into a single intelligence data warehouse solution that runs on Exadata. Compared to traditional ETL with add-on CDC this solution offers: §  Non-invasive data capture from heterogeneous sources and avoids any performance impact on source §  No mid-tier; set based transformations use database power §  Mini-batches throughout the day –or- bulk processing nightly which means maximum availability for the DW §  Integrated solution with Enterprise Data Quality enables leveraging trusted data in the data warehouse In addition to Starwood Hotels and Resorts, Morrison Supermarkets, United Kingdom’s fourth-largest food retailer, has seen the power of this solution for their new BI platform and shared their story with us. Morrisons needed to analyze data across a large number of manufacturing, warehousing, retail, and financial applications with the goal to achieve single view into operations for improved customer service. The retailer deployed Oracle GoldenGate and Oracle Data Integrator to bring new data into Oracle Exadata in near real-time and replicate the data into reporting structures within the data warehouse—extending visibility into operations. Using Oracle's data integration offering for Exadata, Morrisons produced financial reports in seconds, rather than minutes, and improved staff productivity and agility. You can read more about Morrison’s success story here and hear from Starwood here. From an Irem Radzik article.

    Read the article

  • Data access pattern

    - by andlju
    I need some advice on what kind of pattern(s) I should use for pushing/pulling data into my application. I'm writing a rule-engine that needs to hold quite a large amount of data in-memory in order to be efficient enough. I have some rather conflicting requirements; It is not acceptable for the engine to always have to wait for a full pre-load of all data before it is functional. Only fetching and caching data on-demand will lead to the engine taking too long before it is running quickly enough. An external event can trigger the need for specific parts of the data to be reloaded. Basically, I think I need a combination of pushing and pulling data into the application. A simplified version of my current "pattern" looks like this (in psuedo-C# written in notepad): // This interface is implemented by all classes that needs the data interface IDataSubscriber { void RegisterData(Entity data); } // This interface is implemented by the data access class interface IDataProvider { void EnsureLoaded(Key dataKey); void RegisterSubscriber(IDataSubscriber subscriber); } class MyClassThatNeedsData : IDataSubscriber { IDataProvider _provider; MyClassThatNeedsData(IDataProvider provider) { _provider = provider; _provider.RegisterSubscriber(this); } public void RegisterData(Entity data) { // Save data for later StoreDataInCache(data); } void UseData(Key key) { // Make sure that the data has been stored in cache _provider.EnsureLoaded(key); Entity data = GetDataFromCache(key); } } class MyDataProvider : IDataProvider { List<IDataSubscriber> _subscribers; // Make sure that the data for key has been loaded to all subscribers public void EnsureLoaded(Key key) { if (HasKeyBeenMarkedAsLoaded(key)) return; PublishDataToSubscribers(key); MarkKeyAsLoaded(key); } // Force all subscribers to get a new version of the data for key public void ForceReload(Key key) { PublishDataToSubscribers(key); MarkKeyAsLoaded(key); } void PublishDataToSubscribers(Key key) { Entity data = FetchDataFromStore(key); foreach(var subscriber in _subscribers) { subscriber.RegisterData(data); } } } // This class will be spun off on startup and should make sure that all data is // preloaded as quickly as possible class MyPreloadingThread { IDataProvider _provider; MyPreloadingThread(IDataProvider provider) { _provider = provider; } void RunInBackground() { IEnumerable<Key> allKeys = GetAllKeys(); foreach(var key in allKeys) { _provider.EnsureLoaded(key); } } } I have a feeling though that this is not necessarily the best way of doing this.. Just the fact that explaining it seems to take two pages feels like an indication.. Any ideas? Any patterns out there I should have a look at?

    Read the article

  • Data access pattern, combining push and pull?

    - by andlju
    I need some advice on what kind of pattern(s) I should use for pushing/pulling data into my application. I'm writing a rule-engine that needs to hold quite a large amount of data in-memory in order to be efficient enough. I have some rather conflicting requirements; It is not acceptable for the engine to always have to wait for a full pre-load of all data before it is functional. Only fetching and caching data on-demand will lead to the engine taking too long before it is running quickly enough. An external event can trigger the need for specific parts of the data to be reloaded. Basically, I think I need a combination of pushing and pulling data into the application. A simplified version of my current "pattern" looks like this (in psuedo-C# written in notepad): // This interface is implemented by all classes that needs the data interface IDataSubscriber { void RegisterData(Entity data); } // This interface is implemented by the data access class interface IDataProvider { void EnsureLoaded(Key dataKey); void RegisterSubscriber(IDataSubscriber subscriber); } class MyClassThatNeedsData : IDataSubscriber { IDataProvider _provider; MyClassThatNeedsData(IDataProvider provider) { _provider = provider; _provider.RegisterSubscriber(this); } public void RegisterData(Entity data) { // Save data for later StoreDataInCache(data); } void UseData(Key key) { // Make sure that the data has been stored in cache _provider.EnsureLoaded(key); Entity data = GetDataFromCache(key); } } class MyDataProvider : IDataProvider { List<IDataSubscriber> _subscribers; // Make sure that the data for key has been loaded to all subscribers public void EnsureLoaded(Key key) { if (HasKeyBeenMarkedAsLoaded(key)) return; PublishDataToSubscribers(key); MarkKeyAsLoaded(key); } // Force all subscribers to get a new version of the data for key public void ForceReload(Key key) { PublishDataToSubscribers(key); MarkKeyAsLoaded(key); } void PublishDataToSubscribers(Key key) { Entity data = FetchDataFromStore(key); foreach(var subscriber in _subscribers) { subscriber.RegisterData(data); } } } // This class will be spun off on startup and should make sure that all data is // preloaded as quickly as possible class MyPreloadingThread { IDataProvider _provider; MyPreloadingThread(IDataProvider provider) { _provider = provider; } void RunInBackground() { IEnumerable<Key> allKeys = GetAllKeys(); foreach(var key in allKeys) { _provider.EnsureLoaded(key); } } } I have a feeling though that this is not necessarily the best way of doing this.. Just the fact that explaining it seems to take two pages feels like an indication.. Any ideas? Any patterns out there I should have a look at?

    Read the article

  • Import csv data (SDK iphone)

    - by Ni
    I am new to cocoa. I have been working on these stuff for a few days. For the following code, i can read all the data in the string, and successfully get the data for plot. NSMutableArray *contentArray = [NSMutableArray array]; NSString *filePath = @"995,995,995,995,995,995,995,995,1000,997,995,994,992,993,992,989,988,987,990,993,989"; NSArray *myText = [filePath componentsSeparatedByString:@","]; NSInteger idx; for (idx = 0; idx < myText.count; idx++) { NSString *data =[myText objectAtIndex:idx]; NSLog(@"%@", data); id x = [NSNumber numberWithFloat:0+idx*0.002777778]; id y = [NSDecimalNumber decimalNumberWithString:data]; [contentArray addObject: [NSMutableDictionary dictionaryWithObjectsAndKeys:x, @"x", y, @"y", nil]]; } self.dataForPlot = contentArray; then, i try to load the data from csv file. the data in Data.csv file has the same value and the same format as 995,995,995,995,995,995,995,995,1000,997,995,994,992,993,992,989,988,987,990,993,989. I run the code, it is supposed to give the same graph output. however, it seems that the data is not loaded from csv file successfully. i can not figure out what's wrong with my code. NSMutableArray *contentArray = [NSMutableArray array]; NSString *filePath = [[NSBundle mainBundle] pathForResource:@"Data" ofType:@"csv"]; NSString *Data = [NSString stringWithContentsOfFile:filePath encoding:NSUTF8StringEncoding error:nil ]; if (Data) { NSArray *myText = [Data componentsSeparatedByString:@","]; NSInteger idx; for (idx = 0; idx < myText.count; idx++) { NSString *data =[myText objectAtIndex:idx]; NSLog(@"%@", data); id x = [NSNumber numberWithFloat:0+idx*0.002777778]; id y = [NSDecimalNumber decimalNumberWithString:data]; [contentArray addObject: [NSMutableDictionary dictionaryWithObjectsAndKeys:x, @"x", y, @"y",nil]]; } self.dataForPlot = contentArray; } The only difference is NSString *filePath = [[NSBundle mainBundle] pathForResource:@"Data" ofType:@"csv"]; NSString *Data = [NSString stringWithContentsOfFile:filePath encoding:NSUTF8StringEncoding error:nil ]; if (data){ } did i do anything wrong here?? Thanks for your help!!!!

    Read the article

  • POST data not being received

    - by Alexander
    I've got an iPhone App that is supposed to send POST data to my server to register the device in a MySQL database so we can send notifications etc... to it. It sends it's unique identifier, device name, token, and a few other small things like passwords and usernames as a POST request to our server. The problem is that sometimes the server doesn't receive the data. And by this I mean, its not just receiving blank values for the POST inputs but, its not receiving ANY post data at all. I am logging all POST inputs to my server into some log files and when the script that relies on the POST data from the device fails (detects no data) I notice that its because NO POST data was sent. Is this a problem on the server, like refusing data or something or does this have to be on the client's side? What could be causing this?

    Read the article

  • Oracle Big Data Learning Library - Click on LEARN BY PRODUCT to Open Page

    - by chberger
    Oracle Big Data Learning Library... Learn about Oracle Big Data, Data Science, Learning Analytics, Oracle NoSQL Database, and more! Oracle Big Data Essentials Attend this Oracle University Course! Using Oracle NoSQL Database Attend this Oracle University class! Oracle and Big Data on OTN See the latest resource on OTN. Search Welcome Get Started Learn by Role Learn by Product Latest Additions Additional Resources Oracle Big Data Appliance Oracle Big Data and Data Science Basics Meeting the Challenge of Big Data Oracle Big Data Tutorial Video Series Oracle MoviePlex - a Big Data End-to-End Series of Demonstrations Oracle Big Data Overview Oracle Big Data Essentials Data Mining Oracle NoSQL Database Tutorial Videos Oracle NoSQL Database Tutorial Series Oracle NoSQL Database Release 2 New Features Using Oracle NoSQL Database Exalytics Enterprise Manager 12c R3: Manage Exalytics Setting Up and Running Summary Advisor on an E s Oracle R Enterprise Oracle R Enterprise Tutorial Series Oracle Big Data Connectors Integrate All Your Data with Oracle Big Data Connectors Using Oracle Direct Connector for HDFS to Read the Data from HDSF Using Oracle R Connector for Hadoop to Analyze Data Oracle NoSQL Database Oracle NoSQL Database Tutorial Videos Oracle NoSQL Database Tutorial Series Oracle NoSQL Database Release 2 New Features  Using Oracle NoSQL Database eries Oracle Business Intelligence Enterprise Edition Oracle Business Intelligence Oracle BI 11g R1: Create Analyses and Dashboards - 4 day class Oracle BI Publisher 11g R1: Fundamentals - 3 day class Oracle BI 11g R1: Build Repositories - 5 day class

    Read the article

  • Let's introduce the Oracle Enterprise Data Quality family!

    - by Sarah Zanchetti
    The Oracle Enterprise Data Quality family of products helps you to achieve maximum value from their business applications by delivering fit-­for-­purpose data. OEDQ is a state-of-the-art collaborative data quality profiling, analysis, parsing, standardization, matching and merging product, designed to help you understand, improve, protect and govern the quality of the information your business uses, all from a single integrated environment. Oracle Enterprise Data Quality products are: Oracle Enterprise Data Quality Profile and Audit Oracle Enterprise Data Quality Parsing and Standardization Oracle Enterprise Data Quality Match and Merge Oracle Enterprise Data Quality Address Verification Server Oracle Enterprise Data Quality Product Data Parsing and Standardization Oracle Enterprise Data Quality Product Data Match and Merge Also, the following are some of the key features of OEDQ: Integrated data profiling, auditing, cleansing and matching Browser-based client access Ability to handle all types of data – for example customer, product, asset, financial, operational Connection to any JDBC-compliant data sources and targets Multi-user project support (role-based access, issue tracking, process annotation, and version control) Services Oriented Architecture (SOA) - support for designing processes that may be exposed to external applications as a service Designed to process large data volumes A single repository to hold data along with gathered statistics and project tracking information, with shared access Intuitive graphical user interface designed to help you solve real-world information quality issues quickly Easy, data-led creation and extension of validation and transformation rules Fully extensible architecture allowing the insertion of any required custom processing  If you need to learn more about EDQ, or get assistance for any kind of issue, the Oracle Technology Network offers a huge range of resources on Oracle software. Discuss technical problems and solutions on the Discussion Forums. Get hands-on step-by-step tutorials with Oracle By Example. Download Sample Code. Get the latest news and information on any Oracle product. You can also get further help and information with Oracle software from: My Oracle Support Oracle Support Services An Information Center is available, where you can find technical information and fast solutions to the most common already solved issues: Information Center: Oracle Enterprise Data Quality [ID 1555073.2]

    Read the article

  • 'Similarity' in Data Mining

    - by Shailesh Tainwala
    In the field of Data Mining, is there a specific sub-discipline called 'Similarity'? If yes, what does it deal with. Any examples, links, references will be helpful. Also, being new to the field, I would like the community opinion on how closely related Data Mining and Artificial Intelligence are. Are they synonyms, is one the subset of the other? Thanks in advance for sharing your knowledge.

    Read the article

  • Algorithm detect repeating/similiar strings in a corpus of data -- say email subjects, in Python

    - by RizwanK
    I'm downloading a long list of my email subject lines , with the intent of finding email lists that I was a member of years ago, and would want to purge them from my Gmail account (which is getting pretty slow.) I'm specifically thinking of newsletters that often come from the same address, and repeat the product/service/group's name in the subject. I'm aware that I could search/sort by the common occurrence of items from a particular email address (and I intend to), but I'd like to correlate that data with repeating subject lines.... Now, many subject lines would fail a string match, but "Google Friends : Our latest news" "Google Friends : What we're doing today" are more similar to each other than a random subject line, as is: "Virgin Airlines has a great sale today" "Take a flight with Virgin Airlines" So -- how can I start to automagically extract trends/examples of strings that may be more similar. Approaches I've considered and discarded ('because there must be some better way'): Extracting all the possible substrings and ordering them by how often they show up, and manually selecting relevant ones Stripping off the first word or two and then count the occurrence of each sub string Comparing Levenshtein distance between entries Some sort of string similarity index ... Most of these were rejected for massive inefficiency or likelyhood of a vast amount of manual intervention required. I guess I need some sort of fuzzy string matching..? In the end, I can think of kludgy ways of doing this, but I'm looking for something more generic so I've added to my set of tools rather than special casing for this data set. After this, I'd be matching the occurring of particular subject strings with 'From' addresses - I'm not sure if there's a good way of building a data structure that represents how likely/not two messages are part of the 'same email list' or by filtering all my email subjects/from addresses into pools of likely 'related' emails and not -- but that's a problem to solve after this one. Any guidance would be appreciated.

    Read the article

  • Oracle Enterprise Data Quality - Geared Up and Ready for OpenWorld 2012

    - by Mala Narasimharajan
    10 days and counting till Oracle OpenWorld 2012 is upon us.  Enterprise data quality is key to every information integration and consolidation initiative. At this year's OpenWorld, hear how Oracle Enterprise Data Quality provides the critical piece to achieving trusted, reliable master data and increases the value of data integration initiatives. Here are the different ways you can learn and experience Enterprise Data Quality at OpenWorld:  Conference sessions: Oracle Enterprise Data Quality: Product Overview and Roadmap - Monday 10/1/12, 1:45-2:45 PM - Moscone West - 3006 Data Preparation and Ongoing Governance with the Oracle Enterprise Data Quality Platform - Wednesday 10/3/2012, 1:15-2:15 PM - Moscone West - 3000  Data Acquisition, Migration and Integration with the Oracle Enterprise Data Quality Platform - Thursday 10/4/2012, 12:45-1:45 PM - Moscone West - 3005  Hands on Labs: Introduction to Oracle Enterprise Data Quality Platform -  Monday 10/2/2012, 4:45-5:45 PM - Marriot Marquis - Salon 1/2 Demos:  Trusted Data with Oracle Enterprise Data Quality - Moscone South, Right - S-243 (note: proceed to Middleware Demo grounds) For a list of Master Data Management and Data Quality sessions and other events click here. 

    Read the article

  • Going For Gold: AngloGold Ashanti and Oracle Spatial 11g

    - by stephen.garth
    Save The Date: May 6 at 11:00am Pacific time Attend this free Directions Media live webinar to find out how AngloGold Ashanti is using Oracle Database 11g with a unique geospatial infrastructure based on Oracle Spatial 11g to support worldwide gold exploration and mining operations. Terence Harbort, Exploration Systems Architect at AngloGold Ashanti, will discuss how the company is addressing challenges including management of large volumes of highly varied mapping and image data, 3D visualization, and geospatial analysis. Viewers can paricipate in a live Q&A session at the end of the webinar. Date: May 6, 2010 Time: 11:00am PDT Register here

    Read the article

  • Oracle Technológia Fórum rendezvény, 2010. május 5. szerda

    - by Fekete Zoltán
    Jövo hét szerdán Oracle Technology Fórum napot tartunk, ahol az adatbázis-kezelési és a fejlesztoi szekciókban hallgathatók meg eloadások illetve kaphatók válaszok a kérdésekre. Jelentkezés a rendezvényre. Az adatbázis szekcióban fogok beszélni a Sun Oracle Database Machine / Exadata megoldások technikai gyöngyszemeirol mind a tranzakciós (OLTP) mind az adattárházas (DW) és adatbázis konszolidáció oldaláról. Emellett kiemelem majd az Oracle Data Mining (adatbányászat) és OLAP újdonságait, érdekességeit. Megemlítem majd az Oracle's Data Warehouse Reference Architecture alkalmazási lehetoségeit is.

    Read the article

  • July SQL Server UG Event in Manchester

    I will be speaking at the SQL Server UK User Group event in Manchester on 16.07.2009.  I am going to be talking about data mining again and how it isn’t all statistics and people with PhDs from Oxford.  Come join me and the excellent Chris Testa-O’Neill.  More details and registration can be found here

    Read the article

  • July SQL Server UG Event in Manchester

    I will be speaking at the SQL Server UK User Group event in Manchester on 16.07.2009.  I am going to be talking about data mining again and how it isn’t all statistics and people with PhDs from Oxford.  Come join me and the excellent Chris Testa-O’Neill.  More details and registration can be found here

    Read the article

  • SL3/SL4 - Ado.Net Data Services Error during new DataServiceCollection<T>(queryResponse)

    - by Soulhuntre
    Hey all, I have two functions in a SL project (VS2010) that do almost exactly the same thing, yet one throws an error and the other does not. It seems to be related to the projections, but I am unsure about the best way to resolve. The function that works is... public void LoadAllChunksExpandAll(DataHelperReturnHandler handler, string orderby) { DataServiceCollection<CmsChunk> data = null; DataServiceQuery<CmsChunk> theQuery = _dataservice .CmsChunks .Expand("CmsItemState") .AddQueryOption("$orderby", orderby); theQuery.BeginExecute( delegate(IAsyncResult asyncResult) { _callback_dispatcher.BeginInvoke( () => { try { DataServiceQuery<CmsChunk> query = asyncResult.AsyncState as DataServiceQuery<CmsChunk>; if (query != null) { //create a tracked DataServiceCollection from the result of the asynchronous query. QueryOperationResponse<CmsChunk> queryResponse = query.EndExecute(asyncResult) as QueryOperationResponse<CmsChunk>; data = new DataServiceCollection<CmsChunk>(queryResponse); handler(data); } } catch { handler(data); } } ); }, theQuery ); } This compiles and runs as expected. A very, very similar function (shown below) fails... public void LoadAllPagesExpandAll(DataHelperReturnHandler handler, string orderby) { DataServiceCollection<CmsPage> data = null; DataServiceQuery<CmsPage> theQuery = _dataservice .CmsPages .Expand("CmsChildPages") .Expand("CmsParentPage") .Expand("CmsItemState") .AddQueryOption("$orderby", orderby); theQuery.BeginExecute( delegate(IAsyncResult asyncResult) { _callback_dispatcher.BeginInvoke( () => { try { DataServiceQuery<CmsPage> query = asyncResult.AsyncState as DataServiceQuery<CmsPage>; if (query != null) { //create a tracked DataServiceCollection from the result of the asynchronous query. QueryOperationResponse<CmsPage> queryResponse = query.EndExecute(asyncResult) as QueryOperationResponse<CmsPage>; data = new DataServiceCollection<CmsPage>(queryResponse); handler(data); } } catch { handler(data); } } ); }, theQuery ); } Clearly the issue is the Expand projections that involve a self referencing relationship (pages can contain other pages). This is under SL4 or SL3 using ADONETDataServices SL3 Update CTP3. I am open to any work around or pointers to goo information, a Google search for the error results in two hits, neither particularly helpful that I can decipher. The short error is "An item could not be added to the collection. When items in a DataServiceCollection are tracked by the DataServiceContext, new items cannot be added before items have been loaded into the collection." The full error is... System.Reflection.TargetInvocationException was caught Message=Exception has been thrown by the target of an invocation. StackTrace: at System.RuntimeMethodHandle.InvokeMethodFast(IRuntimeMethodInfo method, Object target, Object[] arguments, SignatureStruct& sig, MethodAttributes methodAttributes, RuntimeType typeOwner) at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture, Boolean skipVisibilityChecks) at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture) at System.Reflection.MethodBase.Invoke(Object obj, Object[] parameters) at System.Data.Services.Client.ClientType.ClientProperty.SetValue(Object instance, Object value, String propertyName, Boolean allowAdd) at System.Data.Services.Client.AtomMaterializer.ApplyItemsToCollection(AtomEntry entry, ClientProperty property, IEnumerable items, Uri nextLink, ProjectionPlan continuationPlan) at System.Data.Services.Client.AtomMaterializer.ApplyFeedToCollection(AtomEntry entry, ClientProperty property, AtomFeed feed, Boolean includeLinks) at System.Data.Services.Client.AtomMaterializer.MaterializeResolvedEntry(AtomEntry entry, Boolean includeLinks) at System.Data.Services.Client.AtomMaterializer.Materialize(AtomEntry entry, Type expectedEntryType, Boolean includeLinks) at System.Data.Services.Client.AtomMaterializer.DirectMaterializePlan(AtomMaterializer materializer, AtomEntry entry, Type expectedEntryType) at System.Data.Services.Client.AtomMaterializerInvoker.DirectMaterializePlan(Object materializer, Object entry, Type expectedEntryType) at System.Data.Services.Client.ProjectionPlan.Run(AtomMaterializer materializer, AtomEntry entry, Type expectedType) at System.Data.Services.Client.AtomMaterializer.Read() at System.Data.Services.Client.MaterializeAtom.MoveNextInternal() at System.Data.Services.Client.MaterializeAtom.MoveNext() at System.Linq.Enumerable.d_b11.MoveNext() at System.Data.Services.Client.DataServiceCollection1.InternalLoadCollection(IEnumerable1 items) at System.Data.Services.Client.DataServiceCollection1.StartTracking(DataServiceContext context, IEnumerable1 items, String entitySet, Func2 entityChanged, Func2 collectionChanged) at System.Data.Services.Client.DataServiceCollection1..ctor(DataServiceContext context, IEnumerable1 items, TrackingMode trackingMode, String entitySetName, Func2 entityChangedCallback, Func2 collectionChangedCallback) at System.Data.Services.Client.DataServiceCollection1..ctor(IEnumerable1 items) at Phinli.Dashboard.Silverlight.Helpers.DataHelper.<>c__DisplayClass44.<>c__DisplayClass46.<LoadAllPagesExpandAll>b__43() InnerException: System.InvalidOperationException Message=An item could not be added to the collection. When items in a DataServiceCollection are tracked by the DataServiceContext, new items cannot be added before items have been loaded into the collection. StackTrace: at System.Data.Services.Client.DataServiceCollection1.InsertItem(Int32 index, T item) at System.Collections.ObjectModel.Collection`1.Add(T item) InnerException: Thanks for any help!

    Read the article

  • 3rd party data - Store in Data Warehouse or Primary database?

    - by brydgesk
    This is mostly a data warehouse philosophy question. My project involves an Oracle forms application, and a Teradata Data Warehouse for reporting and ad-hoc purposes. In addition to the primary data created by the users of our application, we also require data from various other sources. Currently, this 3rd party data comes via FTPd flat files directly to our Data Warehouse. To access the data, our users must use a series of custom BusinessObjects reports. My question is, would it make more sense for this data to be sent to our source Oracle system instead? Is it ever appropriate for a Data Warehouse to be the point of origin for users to access raw data? In short, is it more important that the operational database contain only the data created by your project, or that the data warehouse remain dedicated solely to reporting and analysis?

    Read the article

  • Starting to construct a data access layer. Things to consider?

    - by Phil
    Our organisation uses inline sql. We have been tasked with providing a suitable data access layer and are weighing up the pro's and cons of which way to go... Datasets ADO.net Linq Entity framework Subsonic Other? Some tutorials and articles I have been using for reference: http://www.asp.net/(S(pdfrohu0ajmwt445fanvj2r3))/learn/data-access/tutorial-01-vb.aspx http://www.simple-talk.com/dotnet/.net-framework/designing-a-data-access-layer-in-linq-to-sql/ http://msdn.microsoft.com/en-us/magazine/cc188750.aspx http://msdn.microsoft.com/en-us/library/aa697427(VS.80).aspx http://www.subsonicproject.com/ I'm extremely torn, and finding it very difficult to make a decision on which way to go. Our site is a series of 2 internal portals and a public web site. We are using vs2008 sp1 and framework version 3.5. Please can you give me advise on what factors to consider and any pro's and cons you have faced with your data access layer. Thanks.

    Read the article

  • Using Core Data Concurrently and Reliably

    - by John Topley
    I'm building my first iOS app, which in theory should be pretty straightforward but I'm having difficulty making it sufficiently bulletproof for me to feel confident submitting it to the App Store. Briefly, the main screen has a table view, upon selecting a row it segues to another table view that displays information relevant for the selected row in a master-detail fashion. The underlying data is retrieved as JSON data from a web service once a day and then cached in a Core Data store. The data previous to that day is deleted to stop the SQLite database file from growing indefinitely. All data persistence operations are performed using Core Data, with an NSFetchedResultsController underpinning the detail table view. The problem I am seeing is that if you switch quickly between the master and detail screens several times whilst fresh data is being retrieved, parsed and saved, the app freezes or crashes completely. There seems to be some sort of race condition, maybe due to Core Data importing data in the background whilst the main thread is trying to perform a fetch, but I'm speculating. I've had trouble capturing any meaningful crash information, usually it's a SIGSEGV deep in the Core Data stack. The table below shows the actual order of events that happen when the detail table view controller is loaded: Main Thread Background Thread viewDidLoad Get JSON data (using AFNetworking) Create child NSManagedObjectContext (MOC) Parse JSON data Insert managed objects in child MOC Save child MOC Post import completion notification Receive import completion notification Save parent MOC Perform fetch and reload table view Delete old managed objects in child MOC Save child MOC Post deletion completion notification Receive deletion completion notification Save parent MOC Once the AFNetworking completion block is triggered when the JSON data has arrived, a nested NSManagedObjectContext is created and passed to an "importer" object that parses the JSON data and saves the objects to the Core Data store. The importer executes using the new performBlock method introduced in iOS 5: NSManagedObjectContext *child = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType]; [child setParentContext:self.managedObjectContext]; [child performBlock:^{ // Create importer instance, passing it the child MOC... }]; The importer object observes its own MOC's NSManagedObjectContextDidSaveNotification and then posts its own notification which is observed by the detail table view controller. When this notification is posted the table view controller performs a save on its own (parent) MOC. I use the same basic pattern with a "deleter" object for deleting the old data after the new data for the day has been imported. This occurs asynchronously after the new data has been fetched by the fetched results controller and the detail table view has been reloaded. One thing I am not doing is observing any merge notifications or locking any of the managed object contexts or the persistent store coordinator. Is this something I should be doing? I'm a bit unsure how to architect this all correctly so would appreciate any advice.

    Read the article

  • Free NOSQL database for use with C# client [closed]

    - by Mitten
    I've never used NOSQL databases before, but so far it seems like the best data storage solution for my project. I am going to implement a datamining application. The data I would like to mine is thousands of documents which cannot be imported into datamining applications. To make to import easier and faster (than importing thousands of documents) I am planning to import these documents into a NOSQL database first and when import NOSQL database into datamining software. At the very least once I have all the data in NOSQL database I should be able to code simplest datamining logic myself. Am I correct that NOSQL databases allow to creates records of data, but they don't mandate all the records to adhere to the same data schema (same column names/types in a classic table oriended SQL databases)? I think for each document I would create a row/entry/object (not sure what is the correct term is in use in NOSQL world) which would be a string id, few (columns) with unstructured text data, and a dozens of columns mostly of datetime and integer types. From its name NOSQL does not support SQL query syntax, but it support locating the object(row/entry?) by its unique id. Does NOSQL support qyuering objects using property=value syntax? Unfortunately most of free NOSQL db only support Java/C++ clients, which free NOSQL db would you recommend for a C# programmer?

    Read the article

  • Used HDD/ran DiskSmartView/40,000 Power-on-hours?? should i trust it w/ my data, or take it back and bitch?

    - by David Lindsay
    I just bought a used hard drive from a University Surplus Store. Decided to run DiskSmartView to make sure it wasn't ready to fail. 40,000 power-on-hours I don't know if I feel like trusting my data to something that used. I really dont know if thats unreasonably old, but when i compare it to the POH reading i get when testing my other hdds its more than 3x older (my others have 2110 hours, 6150 hours, etc.. It's a Western Digital, so that gives me a little bit of hope(WDC WD4000KD-00NAB0). I could sure use someone else's opinion here. Thanks, DAVE

    Read the article

  • Windows Azure Recipe: Big Data

    - by Clint Edmonson
    As the name implies, what we’re talking about here is the explosion of electronic data that comes from huge volumes of transactions, devices, and sensors being captured by businesses today. This data often comes in unstructured formats and/or too fast for us to effectively process in real time. Collectively, we call these the 4 big data V’s: Volume, Velocity, Variety, and Variability. These qualities make this type of data best managed by NoSQL systems like Hadoop, rather than by conventional Relational Database Management System (RDBMS). We know that there are patterns hidden inside this data that might provide competitive insight into market trends.  The key is knowing when and how to leverage these “No SQL” tools combined with traditional business such as SQL-based relational databases and warehouses and other business intelligence tools. Drivers Petabyte scale data collection and storage Business intelligence and insight Solution The sketch below shows one of many big data solutions using Hadoop’s unique highly scalable storage and parallel processing capabilities combined with Microsoft Office’s Business Intelligence Components to access the data in the cluster. Ingredients Hadoop – this big data industry heavyweight provides both large scale data storage infrastructure and a highly parallelized map-reduce processing engine to crunch through the data efficiently. Here are the key pieces of the environment: Pig - a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. Mahout - a machine learning library with algorithms for clustering, classification and batch based collaborative filtering that are implemented on top of Apache Hadoop using the map/reduce paradigm. Hive - data warehouse software built on top of Apache Hadoop that facilitates querying and managing large datasets residing in distributed storage. Directly accessible to Microsoft Office and other consumers via add-ins and the Hive ODBC data driver. Pegasus - a Peta-scale graph mining system that runs in parallel, distributed manner on top of Hadoop and that provides algorithms for important graph mining tasks such as Degree, PageRank, Random Walk with Restart (RWR), Radius, and Connected Components. Sqoop - a tool designed for efficiently transferring bulk data between Apache Hadoop and structured data stores such as relational databases. Flume - a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large log data amounts to HDFS. Database – directly accessible to Hadoop via the Sqoop based Microsoft SQL Server Connector for Apache Hadoop, data can be efficiently transferred to traditional relational data stores for replication, reporting, or other needs. Reporting – provides easily consumable reporting when combined with a database being fed from the Hadoop environment. Training These links point to online Windows Azure training labs where you can learn more about the individual ingredients described above. Hadoop Learning Resources (20+ tutorials and labs) Huge collection of resources for learning about all aspects of Apache Hadoop-based development on Windows Azure and the Hadoop and Windows Azure Ecosystems SQL Azure (7 labs) Microsoft SQL Azure delivers on the Microsoft Data Platform vision of extending the SQL Server capabilities to the cloud as web-based services, enabling you to store structured, semi-structured, and unstructured data. See my Windows Azure Resource Guide for more guidance on how to get started, including links web portals, training kits, samples, and blogs related to Windows Azure.

    Read the article

< Previous Page | 16 17 18 19 20 21 22 23 24 25 26 27  | Next Page >