Search Results

Search found 67262 results on 2691 pages for 'data driven testing'.

Page 22/2691 | < Previous Page | 18 19 20 21 22 23 24 25 26 27 28 29  | Next Page >

  • Are there sources of email marketing data available?

    - by Gortron
    Are sources of email marketing data available to the public? I would like to see email marketing data to see what kind of content a business sends out, the frequency of sending, the number of people emailed, especially the resulting open rates and click through rates. Are businesses willing to share data on their previous email marketing campaigns without divulging their contact list? I would like to use this data to create an application to help businesses create better newsletters by using this data as a benchmark, basically sharing what works and what doesn't for each industry.

    Read the article

  • How to Achieve Real-Time Data Protection and Availabilty....For Real

    - by JoeMeeks
    There is a class of business and mission critical applications where downtime or data loss have substantial negative impact on revenue, customer service, reputation, cost, etc. Because the Oracle Database is used extensively to provide reliable performance and availability for this class of application, it also provides an integrated set of capabilities for real-time data protection and availability. Active Data Guard, depicted in the figure below, is the cornerstone for accomplishing these objectives because it provides the absolute best real-time data protection and availability for the Oracle Database. This is a bold statement, but it is supported by the facts. It isn’t so much that alternative solutions are bad, it’s just that their architectures prevent them from achieving the same levels of data protection, availability, simplicity, and asset utilization provided by Active Data Guard. Let’s explore further. Backups are the most popular method used to protect data and are an essential best practice for every database. Not surprisingly, Oracle Recovery Manager (RMAN) is one of the most commonly used features of the Oracle Database. But comparing Active Data Guard to backups is like comparing apples to motorcycles. Active Data Guard uses a hot (open read-only), synchronized copy of the production database to provide real-time data protection and HA. In contrast, a restore from backup takes time and often has many moving parts - people, processes, software and systems – that can create a level of uncertainty during an outage that critical applications can’t afford. This is why backups play a secondary role for your most critical databases by complementing real-time solutions that can provide both data protection and availability. Before Data Guard, enterprises used storage remote-mirroring for real-time data protection and availability. Remote-mirroring is a sophisticated storage technology promoted as a generic infrastructure solution that makes a simple promise – whatever is written to a primary volume will also be written to the mirrored volume at a remote site. Keeping this promise is also what causes data loss and downtime when the data written to primary volumes is corrupt – the same corruption is faithfully mirrored to the remote volume making both copies unusable. This happens because remote-mirroring is a generic process. It has no  intrinsic knowledge of Oracle data structures to enable advanced protection, nor can it perform independent Oracle validation BEFORE changes are applied to the remote copy. There is also nothing to prevent human error (e.g. a storage admin accidentally deleting critical files) from also impacting the remote mirrored copy. Remote-mirroring tricks users by creating a false impression that there are two separate copies of the Oracle Database. In truth; while remote-mirroring maintains two copies of the data on different volumes, both are part of a single closely coupled system. Not only will remote-mirroring propagate corruptions and administrative errors, but the changes applied to the mirrored volume are a result of the same Oracle code path that applied the change to the source volume. There is no isolation, either from a storage mirroring perspective or from an Oracle software perspective.  Bottom line, storage remote-mirroring lacks both the smarts and isolation level necessary to provide true data protection. Active Data Guard offers much more than storage remote-mirroring when your objective is protecting your enterprise from downtime and data loss. Like remote-mirroring, an Active Data Guard replica is an exact block for block copy of the primary. Unlike remote-mirroring, an Active Data Guard replica is NOT a tightly coupled copy of the source volumes - it is a completely independent Oracle Database. Active Data Guard’s inherent knowledge of Oracle data block and redo structures enables a separate Oracle Database using a different Oracle code path than the primary to use the full complement of Oracle data validation methods before changes are applied to the synchronized copy. These include: physical check sum, logical intra-block checking, lost write validation, and automatic block repair. The figure below illustrates the stark difference between the knowledge that remote-mirroring can discern from an Oracle data block and what Active Data Guard can discern. An Active Data Guard standby also provides a range of additional services enabled by the fact that it is a running Oracle Database - not just a mirrored copy of data files. An Active Data Guard standby database can be open read-only while it is synchronizing with the primary. This enables read-only workloads to be offloaded from the primary system and run on the active standby - boosting performance by utilizing all assets. An Active Data Guard standby can also be used to implement many types of system and database maintenance in rolling fashion. Maintenance and upgrades are first implemented on the standby while production runs unaffected at the primary. After the primary and standby are synchronized and all changes have been validated, the production workload is quickly switched to the standby. The only downtime is the time required for user connections to transfer from one system to the next. These capabilities further expand the expectations of availability offered by a data protection solution beyond what is possible to do using storage remote-mirroring. So don’t be fooled by appearances.  Storage remote-mirroring and Active Data Guard replication may look similar on the surface - but the devil is in the details. Only Active Data Guard has the smarts, the isolation, and the simplicity, to provide the best data protection and availability for the Oracle Database. Stay tuned for future blog posts that dive into the many differences between storage remote-mirroring and Active Data Guard along the dimensions of data protection, data availability, cost, asset utilization and return on investment. For additional information on Active Data Guard, see: Active Data Guard Technical White Paper Active Data Guard vs Storage Remote-Mirroring Active Data Guard Home Page on the Oracle Technology Network

    Read the article

  • Extending SSIS with custom Data Flow components (Presentation)

    Download the slides and sample code from my Extending SSIS with custom Data Flow components presentation, first presented at the SQLBits II (The SQL) Community Conference. Abstract Get some real-world insights into developing data flow components for SSIS. This starts with an introduction to the data flow pipeline engine, and explains the real differences between adapters and the three sub-types of transformation. Understanding how the different types of component behave and manage data is key to writing components of your own, and probably should but be required knowledge for anyone building packages at all. Using sample code throughout, I will show you how to write components, as well as highlighting best practice and lessons learned. The sample code includes fully working example projects for source, destination and transformation components. Presentation & Samples (358KB) Extending SSIS with custom Data Flow components.zip

    Read the article

  • How to use OO for data analysis? [closed]

    - by Konsta
    In which ways could object-orientation (OO) make my data analysis more efficient and let me reuse more of my code? The data analysis can be broken up into get data (from db or csv or similar) transform data (filter, group/pivot, ...) display/plot (graph timeseries, create tables, etc.) I mostly use Python and its Pandas and Matplotlib packages for this besides some DB connectivity (SQL). Almost all of my code is a functional/procedural mix. While I have started to create a data object for a certain collection of time series, I wonder if there are OO design patterns/approaches for other parts of the process that might increase efficiency?

    Read the article

  • Winner of the 2012 Government Big Data Solutions Award

    - by Jean-Pierre Dijcks
    Hot off the press: The winner of the 2012 Government Big Data Solutions Aware is the National Cancer Institute!! Read all the details on CTOLabs.com. A short excerpt to wet your appetite: "... This solution, based on the Oracle Big Data Appliance with the Cloudera Distribution of Apache Hadoop (CDH), leverages capabilities available from the Big Data community today in pioneering ways that can serve a broad range of researchers. The promising approach of this solution is repeatable across many other Big Data challenges for bioinfomatics, making this approach worthy of its selection as the 2012 Government Big Data Solution Award." Read the entire post. Congrats to the entire team!!

    Read the article

  • Markup format or script for data files?

    - by Aaron
    The game I'm designing will be mainly written in a high level scripting language (leaning towards either Lua or Squirrel) with a C++ core. In addition to scripts I'm also going to need different data files. Many data files will be for static information such as graphical assets and monster types. I'd also want to create and update data files at runtime for user information like option settings and game saves. Can I get away with using plain script files (i.e. .lua or .nut files) for my data files, or is it better to use dedicated markup formats like XML or YAML? If I use script files, loaded separately from my true scripts, then I wouldn't need an extra library to read those files. Scripting languages like Lua also have table syntax that lend themselves towards data definition. On the other hand I'd have to write my own schema check code. These languages also don't seem to support serialization "out of the box" like the markup format libraries do.

    Read the article

  • SQL Server and the XML Data Type : Data Manipulation

    The introduction of the xml data type, with its own set of methods for processing xml data, made it possible for SQL Server developers to create columns and variables of the type xml. Deanna Dicken examines the modify() method, which provides for data manipulation of the XML data stored in the xml data type via XML DML statements. Too many SQL Servers to keep up with?Download a free trial of SQL Response to monitor your SQL Servers in just one intuitive interface."The monitoringin SQL Response is excellent." Mike Towery.

    Read the article

  • Getting data from a webpage in a stable and efficient way

    - by Mike Heremans
    Recently I've learned that using a regex to parse the HTML of a website to get the data you need isn't the best course of action. So my question is simple: What then, is the best / most efficient and a generally stable way to get this data? I should note that: There are no API's There is no other source where I can get the data from (no databases, feeds and such) There is no access to the source files. (Data from public websites) Let's say the data is normal text, displayed in a table in a html page I'm currently using python for my project but a language independent solution/tips would be nice. As a side question: How would you go about it when the webpage is constructed by Ajax calls?

    Read the article

  • What data structure to use / data persistence

    - by Dave
    I have an app where I need one table of information with the following fields: field 1 - int or char field 2 - string (max 10 char) field 3 - string (max 20 char) field 4 - float I need the program to filter on field 1 based upon a segmented control and select a field 2 from a picker. From this data I need to look up field 4 to use in a calculation. Total records will be about 200. I never see it go above 400 - 500. I am going to use a singleton which I am able to do, I just need help with the structure for this with data persistence. What type of data structure should I use for this and should I use NSNumber, NSString, etc. or old data types like float, Char, etc. I thought about a struct put into an array but there is probably a better way. This is new to me so any help or reference to examples would be great. I also thought about a plist or dictionary but it looks like it is just a lookup and a field which obviously won't work. Core data looked like overkill to me. Also, with any recommendation how should I get initial data into it? I want the user to be able to edit and add to the database. Sorry for the old terms, you can see what generation I am from... Thanks in advance!!!!

    Read the article

  • Domain driven design: Manager and service

    - by ryudice
    I'm creating some business logic in the application but I'm not sure how or where to encapsulate it, I've used the repository pattern for data access, I've seen some projects that use DDD that have some classes with the "Service" suffix and the "manager" suffix, what are each of this clases suppose to take care of in DDD?

    Read the article

  • Using a "white list" for extracting terms for Text Mining

    - by [email protected]
    In Part 1 of my post on "Generating cluster names from a document clustering model" (part 1, part 2, part 3), I showed how to build a clustering model from text documents using Oracle Data Miner, which automates preparing data for text mining. In this process we specified a custom stoplist and lexer and relied on Oracle Text to identify important terms.  However, there is an alternative approach, the white list, which uses a thesaurus object with the Oracle Text CTXRULE index to allow you to specify the important terms. INTRODUCTIONA stoplist is used to exclude, i.e., black list, specific words in your documents from being indexed. For example, words like a, if, and, or, and but normally add no value when text mining. Other words can also be excluded if they do not help to differentiate documents, e.g., the word Oracle is ubiquitous in the Oracle product literature. One problem with stoplists is determining which words to specify. This usually requires inspecting the terms that are extracted, manually identifying which ones you don't want, and then re-indexing the documents to determine if you missed any. Since a corpus of documents could contain thousands of words, this could be a tedious exercise. Moreover, since every word is considered as an individual token, a term excluded in one context may be needed to help identify a term in another context. For example, in our Oracle product literature example, the words "Oracle Data Mining" taken individually are not particular helpful. The term "Oracle" may be found in nearly all documents, as with the term "Data." The term "Mining" is more unique, but could also refer to the Mining industry. If we exclude "Oracle" and "Data" by specifying them in the stoplist, we lose valuable information. But it we include them, they may introduce too much noise. Still, when you have a broad vocabulary or don't have a list of specific terms of interest, you rely on the text engine to identify important terms, often by computing the term frequency - inverse document frequency metric. (This is effectively a weight associated with each term indicating its relative importance in a document within a collection of documents. We'll revisit this later.) The results using this technique is often quite valuable. As noted above, an alternative to the subtractive nature of the stoplist is to specify a white list, or a list of terms--perhaps multi-word--that we want to extract and use for data mining. The obvious downside to this approach is the need to specify the set of terms of interest. However, this may not be as daunting a task as it seems. For example, in a given domain (Oracle product literature), there is often a recognized glossary, or a list of keywords and phrases (Oracle product names, industry names, product categories, etc.). Being able to identify multi-word terms, e.g., "Oracle Data Mining" or "Customer Relationship Management" as a single token can greatly increase the quality of the data mining results. The remainder of this post and subsequent posts will focus on how to produce a dataset that contains white list terms, suitable for mining. CREATING A WHITE LIST We'll leverage the thesaurus capability of Oracle Text. Using a thesaurus, we create a set of rules that are in effect our mapping from single and multi-word terms to the tokens used to represent those terms. For example, "Oracle Data Mining" becomes "ORACLEDATAMINING." First, we'll create and populate a mapping table called my_term_token_map. All text has been converted to upper case and values in the TERM column are intended to be mapped to the token in the TOKEN column. TERM                                TOKEN DATA MINING                         DATAMINING ORACLE DATA MINING                  ORACLEDATAMINING 11G                                 ORACLE11G JAVA                                JAVA CRM                                 CRM CUSTOMER RELATIONSHIP MANAGEMENT    CRM ... Next, we'll create a thesaurus object my_thesaurus and a rules table my_thesaurus_rules: CTX_THES.CREATE_THESAURUS('my_thesaurus', FALSE); CREATE TABLE my_thesaurus_rules (main_term     VARCHAR2(100),                                  query_string  VARCHAR2(400)); We next populate the thesaurus object and rules table using the term token map. A cursor is defined over my_term_token_map. As we iterate over  the rows, we insert a synonym relationship 'SYN' into the thesaurus. We also insert into the table my_thesaurus_rules the main term, and the corresponding query string, which specifies synonyms for the token in the thesaurus. DECLARE   cursor c2 is     select token, term     from my_term_token_map; BEGIN   for r_c2 in c2 loop     CTX_THES.CREATE_RELATION('my_thesaurus',r_c2.token,'SYN',r_c2.term);     EXECUTE IMMEDIATE 'insert into my_thesaurus_rules values                        (:1,''SYN(' || r_c2.token || ', my_thesaurus)'')'     using r_c2.token;   end loop; END; We are effectively inserting the token to return and the corresponding query that will look up synonyms in our thesaurus into the my_thesaurus_rules table, for example:     'ORACLEDATAMINING'        SYN ('ORACLEDATAMINING', my_thesaurus)At this point, we create a CTXRULE index on the my_thesaurus_rules table: create index my_thesaurus_rules_idx on        my_thesaurus_rules(query_string)        indextype is ctxsys.ctxrule; In my next post, this index will be used to extract the tokens that match each of the rules specified. We'll then compute the tf-idf weights for each of the terms and create a nested table suitable for mining.

    Read the article

  • Load Testing Linux Virtual Server

    - by Anubhav Agarwal
    I have configured a Linux virtual network with following configuration 172.17.6.112- VIP 172.17.6.111- Linux Director | |----------172.17.6.113 --- Real Server 1 |----------172.17.6.114 --- Real Server 2 I am using direct routing technique. I am unable to test my LVS network. Are there some good scripts/softwares available for load testing. I am running apache2.0 service on them. I came across with testlvs on the internet but am unable to understand its documentation. Are there more simpler ones I want to test the response time of server using various scheduling algorithms .

    Read the article

  • Developing and implementing a testing plan for a software app deployed on a web server

    - by Abhzoo
    A company in the USA is building a new Web App that will be offered SaaS to customers and the development is being done by a software development team located in a different country(India). They are about to take delivery of a first demo to provide live feedback to the team in India. The overseas team requires a cloud server (Windows + SQL Standard, 8GB Ram, 8 vCPUs, 40GB SSD system disk, 80GB SSD data disk, 1600Mb/s network bandwidth) to serve as a tester server. When the tester is setup the team will install the app on the test server to get live feedback. Q:Explain in detail how you will develop and implement a testing plan for the software App. Be sure to explain the specifics. PLEASE HELP, NEED ANSWER ASAP

    Read the article

  • ADO.NET (WCF) Data Services Query Interceptor Hangs IIS

    - by PreMagination
    I have an ADO.NET Data Service that's supposed to provide read-only access to a somewhat complex database. Logically I have table-per-type (TPT) inheritance in my data model but the EDM doesn't implement inheritance. (Limitation of EF and navigation properties on derived types. STILL not fixed in EF4!) I can query my EDM directly (using a separate project) using a copy of the query I'm trying to run against the web service, results are returned within 10 seconds. Disabling the query interceptors I'm able to make the same query against the web service, results are returned similarly quickly. I can enable some of the query interceptors and the results are returned slowly, up to a minute or so later. Alternatively, I can enable all the query interceptors, expand less of the properties on the main object I'm querying, and results are returned in a similar period of time. (I've increased some of the timeout periods) Up til this point Sql Profiler indicates the slow-down is the database. (That's a post for a different day) But when I enable all my query interceptors and expand all the properties I'd like to have the IIS worker process pegs the CPU for 20 minutes and a query is never even made against the database. This implies to me that yes, my implementation probably sucks but regardless the Data Services "tier" is having an issue it shouldn't. WCF tracing didn't reveal anything interesting to my untrained eye. Details: Data model: Agent-Person-Student Student has a collection of referrals Students and referrals are private, queries against the web service should only return "your" students and referrals. This means Person and Agent need to be filtered too. Other entities (Agent-Organization-School) can be accessed by anyone who has authenticated. The existing security model is poorly suited to perform this type of filtering for this type of data access, the query interceptors are complicated and cause EF to generate some entertaining sql queries. Sample Interceptor [QueryInterceptor("Agents")] public Expression<Func<Agent, Boolean>> OnQueryAgents() { //Agent is a Person(1), Educator(2), Student(3), or Other Person(13); allow if scope permissions exist return ag => (ag.AgentType.AgentTypeId == 1 || ag.AgentType.AgentTypeId == 2 || ag.AgentType.AgentTypeId == 3 || ag.AgentType.AgentTypeId == 13) && ag.Person.OrganizationPersons.Count<OrganizationPerson>(op => op.Organization.ScopePermissions.Any<ScopePermission> (p => p.ApplicationRoleAccount.Account.UserName == HttpContext.Current.User.Identity.Name && p.ApplicationRoleAccount.Application.ApplicationId == 124) || op.Organization.HierarchyDescendents.Any<OrganizationsHierarchy>(oh => oh.AncestorOrganization.ScopePermissions.Any<ScopePermission> (p => p.ApplicationRoleAccount.Account.UserName == HttpContext.Current.User.Identity.Name && p.ApplicationRoleAccount.Application.ApplicationId == 124))) > 0; } The query interceptors for Person, Student, Referral are all very similar, ie they traverse multiple same/similar tables to look for ScopePermissions as above. Sample Query var referrals = (from r in service.Referrals .Expand("Organization/ParentOrganization") .Expand("Educator/Person/Agent") .Expand("Student/Person/Agent") .Expand("Student") .Expand("Grade") .Expand("ProblemBehavior") .Expand("Location") .Expand("Motivation") .Expand("AdminDecision") .Expand("OthersInvolved") where r.DateCreated >= coupledays && r.DateDeleted == null select r); Any suggestions or tips would be greatly associated, for fixing my current implementation or in developing a new one, with the caveat that the database can't be changed and that ultimately I need to expose a large portion of the database via a web service that limits data access to the data authorized for, for the purpose of data integration with multiple outside parties. THANK YOU!!!

    Read the article

  • Oracle Insurance Gets Innovative with Insurance Business Intelligence

    - by nicole.bruns(at)oracle.com
    Oracle Insurance announced yesterday the availability of Oracle Insurance Insight 7.0, an insurance-specific data warehouse and business intelligence (BI) system that transforms the traditional approach to BI by involving business users in the creation and maintenance."Rapid access to business intelligence is essential to compete and thrive in today's insurance industry," said Srini Venkatasantham, vice president, Product Strategy, Oracle Insurance. "The adaptive data modeling approach of Oracle Insurance Insight 7.0, combined with the insurance-specific data model, offers global insurance companies a faster, easier way to get the intelligence they need to make better-informed business decisions." New Features in Oracle Insurance 7.0 include:"Adaptive Data Modeling" via the new warehouse palette: Gives business users the power to configure lines of business via an easy-to-use warehouse palette tool. Oracle Insurance Insight then automatically creates data warehouse elements - such as line-specific database structures and extract-transform-load (ETL) processes -speeding up time-to-value for BI initiatives. Out-of-the-box insurance models or create-from-scratch option: Includes pre-built content and interfaces for six Property and Casualty (P&C) lines. Additionally, insurers can use the warehouse palette to deploy any and all P&C or General Insurance lines of business from scratch, helping insurers support operations in any country.Leverages Oracle technologies: In addition to Oracle Business Intelligence Enterprise Edition, the solution includes Oracle Database 11g as well as Oracle Data Integrator Enterprise Edition 11g, which delivers Extract, Load and Transform (E-L-T) architecture and eliminates the need for a separate transformation server. Additionally, the expanded Oracle technology infrastructure enables support for Oracle Exadata. Martina Conlon, a Principal with Novarica's Insurance practice, and author of Business Intelligence in Insurance: Current State, Challenges, and Expectations says, "The need for continued investment by insurers in business intelligence capabilities is widely understood, and the industry is acting. Arming the business intelligence implementation with predefined insurance specific content, and flexible and configurable technology will get these projects up and running faster."Learn moreTo see a demo of the Oracle Insurance Insight system, click hereTo read the press announcement, click here

    Read the article

  • SQL SERVER – Installing Data Quality Services (DQS) on SQL Server 2012

    - by pinaldave
    Data Quality Services is very interesting enhancements in SQL Server 2012. My friend and SQL Server Expert Govind Kanshi have written an excellent article on this subject earlier on his blog. Yesterday I stumbled upon his blog one more time and decided to experiment myself with DQS. I have basic understanding of DQS and MDS so I knew I need to start with DQS Client. However, when I tried to find DQS Client I was not able to find it under SQL Server 2012 installation. I quickly realized that I needed to separately install the DQS client. You will find the DQS installer under SQL Server 2012 >> Data Quality Services directory. The pre-requisite of DQS is Master Data Services (MDS) and IIS. If you have not installed IIS, you can follow the simple steps and install IIS in your machine. Once the pre-requisites are installed, click on MDS installer once again and it will install DQS just fine. Be patient with the installer as it can take a bit longer time if your machine is low on configurations. Once the installation is over you will be able to expand SQL Server 2012 >> Data Quality Services directory and you will notice that it will have a new item called Data Quality Client.  Click on it and it will open the client. Well, in future blog post we will go over more details about DQS and detailed practical examples. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQL Utility, T SQL, Technology Tagged: Data Quality Services

    Read the article

  • make-like build tools for data?

    - by miku
    Make is a standard tools for building software. But make decides whether a target needs to be regenerated by comparing file modification times. Are there any proven, preferably small tools that handle builds not for software but for data? Something that regenerates targets not only on mod times but on certain other properties (e.g. completeness). (Or alternatively some paper that describes such a tool.) As illustration: I'd like to automate the following process: get data (e.g. a tarball) from some regularly updated source copy somewhere if it's not there (based e.g. on some filename-scheme) convert the files to different format (but only if there aren't successfully converted ones there - e.g. from a previous attempt - custom comparison routine) for each file find a certain data element and fetch some additional file from say an URL, but only if that hasn't been downloaded yet (decide on existence of file and file "freshness") finally compute something (e.g. word count for something identifiable and store it in the database, but only if the DB does not have an entry for that exact ID yet) Observations: there are different stages each stage is usually simple to compute or implement in isolation each stage may be simple, but the data volume may be large each stage may produce a few errors each stage may have different signals, on when (re)processing is needed Requirements: builds should be interruptable and idempotent (== robust) when interrupted, already processed objects should be reused to speedup the next run data paths should be easy to adjust (simple syntax, nothing new to learn, internal dsl would be ok) some form of dependency graph, that describes the process would be nice for later visualizations should leverage existing programs, if possible I've done some research on make alternatives like rake and have worked a lot with ant and maven in the past. All these tools naturally focus on code and software build, not on data builds. A system we have in place now for a task similar to the above is pretty much just shell scripts, which are compact (and are a ok glue for a variety of other programs written in other languages), so I wonder if worse is better?

    Read the article

  • Social Analytics in your current data

    - by Dan McGrath
    By now everyone is aware of the massive boom in social-networking (Twitter, Facebook, LinkedIn) and obviously a big part of its business model revolves around being able to mine this data to create information that can be used to make money for someone. Gartner has identified 'Social Analytics' as one of the top 10 strategic technologies for 2011. Has anyone looked at their existing data structures to determine if they could extract a social graph and then perform further data mining against this? How does it fit in with your other strategic development strategies? What information are you trying to extract from the data? Take for example, a bank. They could conceivably determine a social graph through account relationships and transactions. Obviously there would be open edges on the graph where funds enter/leave the institute, but that shouldn't detract from the usefulness of the data. I'm looking for actual examples with the answers, as well as why/how they did it. References to other sites will be greatly appreciated. Note: I'm not at all referring to mining data out of actual social networks.

    Read the article

  • The Oldest Big Data Problem: Parsing Human Language

    - by dan.mcclary
    There's a new whitepaper up on Oracle Technology Network which details the use of Digital Reasoning Systems' Synthesys software on Oracle Big Data Appliance.  Digital Reasoning's approach is inherently "big data friendly," as it leverages multiple components of the Hadoop ecosystem.  Moreover, the paper addresses the oldest big data problem of them all: extracting knowledge from human text.   You can find the paper here.   From the Executive Summary: There is a wealth of information to be extracted from natural language, but that extraction is challenging. The volume of human language we generate constitutes a natural Big Data problem, while its complexity and nuance requires a particular expertise to model and mine. In this paper we illustrate the impressive combination of Oracle Big Data Appliance and Digital Reasoning Synthesys software. The combination of Synthesys and Big Data Appliance makes it possible to analyze tens of millions of documents in a matter of hours. Moreover, this powerful combination achieves four times greater throughput than conducting the equivalent analysis on a much larger cloud-deployed Hadoop cluster.

    Read the article

  • Does unit testing lead to premature generalization (specifically in the context of C++)?

    - by Martin
    Preliminary notes I'll not go into the distinction of the different kinds of test there are, there are already a few questions on these sites regarding that. I'll take what's there and that says: unit testing in the sense of "testing the smallest isolatable unit of an application" from which this question actually derives The isolation problem What is the smallest isolatable unit of a program. Well, as I see it, it (highly?) depends on what language you are coding in. Micheal Feathers talks about the concept of a seam: [WEwLC, p31] A seam is a place where you can alter behavior in your program without editing in that place. And without going into the details, I understand a seam -- in the context of unit testing -- to be a place in a program where your "test" can interface with your "unit". Examples Unit test -- especially in C++ -- require from the code under test to add more seams that would be strictly called for for a given problem. Example: Adding a virtual interface where non-virtual implementation would have been sufficient Splitting -- generalizing(?) -- a (smallish) class further "just" to facilitate adding a test. Splitting a single-executable project into seemingly "independent" libs, "just" to facilitate compiling them independently for the tests. The question I'll try a few versions that hopefully ask about the same point: Is the way that Unit Tests require one to structure an application's code "only" beneficial for the unit tests or is it actually beneficial to the applications structure. Is the generalization code need to exhibit to be unit-testable useful for anything but the unit tests? Does adding unit tests force one to generalize unnecessarily? Is the shape unit tests force on code "always" also a good shape for the code in general as seen from the problem domain? I remember a rule of thumb that said don't generalize until you need to / until there's a second place that uses the code. With Unit Tests, there's always a second place that uses the code -- namely the unit test. So is this reason enough to generalize?

    Read the article

  • Uses of persistent data structures in non-functional languages

    - by Ray Toal
    Languages that are purely functional or near-purely functional benefit from persistent data structures because they are immutable and fit well with the stateless style of functional programming. But from time to time we see libraries of persistent data structures for (state-based, OOP) languages like Java. A claim often heard in favor of persistent data structures is that because they are immutable, they are thread-safe. However, the reason that persistent data structures are thread-safe is that if one thread were to "add" an element to a persistent collection, the operation returns a new collection like the original but with the element added. Other threads therefore see the original collection. The two collections share a lot of internal state, of course -- that's why these persistent structures are efficient. But since different threads see different states of data, it would seem that persistent data structures are not in themselves sufficient to handle scenarios where one thread makes a change that is visible to other threads. For this, it seems we must use devices such as atoms, references, software transactional memory, or even classic locks and synchronization mechanisms. Why then, is the immutability of PDSs touted as something beneficial for "thread safety"? Are there any real examples where PDSs help in synchronization, or solving concurrency problems? Or are PDSs simply a way to provide a stateless interface to an object in support of a functional programming style?

    Read the article

  • How to do integrated testing?

    - by Enthusiastic Programmer
    So I have been reading up on a lot of books surrounding testing. But all the books I've read have the same flaws. They will all tell you the definitions of testing. But I have not found a single book that will guide you into integration testing (or pretty much anything higher then unit testing). Is integration testing that elusive or am I reading the wrong books? I'm a hands on person, so I would appreciate it if someone could help me with a simple program: Let's say you need to make some sort of calculation program that calculates something (doesn't matter what) and exports it to *.txt file. Let's assume we use the Model View Controller design principle. And one class for the actual calculating which you'll use in the model and one for writing the textfile. So: View = Controller = Model = CalculationClass, FileClass So for unittesting: You'd test the calculationClass, I'd personally focus most of my unit tests there. And less time on unit testing the view/controller/FileClass. I personally wouldn't see the use of unittesting those unless you want a really robust program. Integration testing: Now this is where I run into a wall. What would I have to test to call it an integration test? I could stub the view and feed the controller data which it would pass on to the model and so forth. And then check what the view gets back in the end. But ... Couldn't I just run the (in this case small) program then and test it manually? Would this be considered a integration test too, or does it have to be automated? Also, can I check multiple items to see if they are correct? I cannot seem to find any book that offers a hands on approach to methods of integration testing.

    Read the article

  • During Spring unit test, data written to db but test not seeing the data

    - by richever
    I wrote a test case that extends AbstractTransactionalJUnit4SpringContextTests. The single test case I've written creates an instance of class User and attempts to write it to the database using Hibernate. The test code then uses SimpleJdbcTemplate to execute a simple select count(*) from the user table to determine if the user was persisted to the database or not. The test always fails though. I was suspect because in the Spring controller I wrote, the ability to save an instance of User to the db is successful. So I added the Rollback annotation to the unit test and sure enough, the data is written to the database since I can even see it in the appropriate table -- the transaction isn't rolled back when the test case is finished. Here's my test case: @ContextConfiguration(locations = { "classpath:context-daos.xml", "classpath:context-dataSource.xml", "classpath:context-hibernate.xml"}) public class UserDaoTest extends AbstractTransactionalJUnit4SpringContextTests { @Autowired private UserDao userDao; @Test @Rollback(false) public void teseCreateUser() { try { UserModel user = randomUser(); String username = user.getUserName(); long id = userDao.create(user); String query = "select count(*) from public.usr where usr_name = '%s'"; long count = simpleJdbcTemplate.queryForLong(String.format(query, username)); Assert.assertEquals("User with username should be in the db", 1, count); } catch (Exception e) { e.printStackTrace(); Assert.assertNull("testCreateUser: " + e.getMessage()); } } } I think I was remiss by not adding the configuration files. context-hibernate.xml <?xml version="1.0" encoding="UTF-8"?> <beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation=" http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd> <bean id="namingStrategy" class="org.springframework.beans.factory.config.FieldRetrievingFactoryBean"> <property name="staticField"> <value>org.hibernate.cfg.ImprovedNamingStrategy.INSTANCE</value> </property> </bean> <bean id="sessionFactory" class="org.springframework.orm.hibernate3.LocalSessionFactoryBean" destroy-method="destroy" scope="singleton"> <property name="namingStrategy"> <ref bean="namingStrategy"/> </property> <property name="dataSource" ref="dataSource"/> <property name="mappingResources"> <list> <value>com/company/model/usr.hbm.xml</value> </list> </property> <property name="hibernateProperties"> <props> <prop key="hibernate.dialect">org.hibernate.dialect.PostgreSQLDialect</prop> <prop key="hibernate.show_sql">true</prop> <prop key="hibernate.use_sql_comments">true</prop> <prop key="hibernate.query.substitutions">yes 'Y', no 'N'</prop> <prop key="hibernate.cache.provider_class">org.hibernate.cache.EhCacheProvider</prop> <prop key="hibernate.cache.use_query_cache">true</prop> <prop key="hibernate.cache.use_minimal_puts">false</prop> <prop key="hibernate.cache.use_second_level_cache">true</prop> <prop key="hibernate.current_session_context_class">thread</prop> </props> </property> </bean> <bean id="transactionManager" class="org.springframework.orm.hibernate3.HibernateTransactionManager"> <property name="sessionFactory" ref="sessionFactory"/> <property name="nestedTransactionAllowed" value="false" /> </bean> <bean id="transactionInterceptor" class="org.springframework.transaction.interceptor.TransactionInterceptor"> <property name="transactionManager"> <ref local="transactionManager"/> </property> <property name="transactionAttributes"> <props> <prop key="create">PROPAGATION_REQUIRED</prop> <prop key="delete">PROPAGATION_REQUIRED</prop> <prop key="update">PROPAGATION_REQUIRED</prop> <prop key="*">PROPAGATION_SUPPORTS,readOnly</prop> </props> </property> </bean> </beans> context-dataSource.xml <?xml version="1.0" encoding="UTF-8"?> <beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation=" http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd"> <bean id="dataSource" class="com.mchange.v2.c3p0.ComboPooledDataSource" destroy-method="close"> <property name="driverClass" value="org.postgresql.Driver" /> <property name="jdbcUrl" value="jdbc\:postgresql\://localhost:5432/company_dev" /> <property name="user" value="postgres" /> <property name="password" value="postgres" /> </bean> </beans> context-daos.xml <?xml version="1.0" encoding="UTF-8"?> <beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation=" http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd"> <bean id="extendedFinderNamingStrategy" class="com.company.dao.finder.impl.ExtendedFinderNamingStrategy"/> <bean id="finderIntroductionAdvisor" class="com.company.dao.finder.impl.FinderIntroductionAdvisor"/> <bean id="abstractDaoTarget" class="com.company.dao.impl.GenericDaoHibernateImpl" abstract="true" depends-on="sessionFactory"> <property name="sessionFactory"> <ref bean="sessionFactory"/> </property> <property name="namingStrategy"> <ref bean="extendedFinderNamingStrategy"/> </property> </bean> <bean id="abstractDao" class="org.springframework.aop.framework.ProxyFactoryBean" abstract="true"> <property name="interceptorNames"> <list> <value>transactionInterceptor</value> <value>finderIntroductionAdvisor</value> </list> </property> </bean> <bean id="userDao" parent="abstractDao"> <property name="proxyInterfaces"> <value>com.company.dao.UserDao</value> </property> <property name="target"> <bean parent="abstractDaoTarget"> <constructor-arg> <value>com.company.model.UserModel</value> </constructor-arg> </bean> </property> </bean> </beans> Some of this I've inherited from someone else. I wouldn't have used the proxying that is going on here because I'm not sure it's needed but this is what I'm working with. Any help much appreciated.

    Read the article

  • Translate report data export from RUEI into HTML for import into OpenOffice Calc Spreadsheets

    - by [email protected]
    A common question of users is, How to import the data from the automated data export of Real User Experience Insight (RUEI) into tools for archiving, dashboarding or combination with other sets of data.XML is well-suited for such a translation via the companion Extensible Stylesheet Language Transformations (XSLT). Basically XSLT utilizes XSL, a template on what to read from your input XML data file and where to place it into the target document. The target document can be anything you like, i.e. XHTML, CSV, or even a OpenOffice Spreadsheet, etc. as long as it is a plain text format.XML 2 OpenOffice.org SpreadsheetFor the XSLT to work as an OpenOffice.org Calc Import Filter:How to add an XML Import Filter to OpenOffice CalcStart OpenOffice.org Calc andselect Tools > XML Filter SettingsNew...Fill in the details as follows:Filter name: RUEI Import filterApplication: OpenOffice.org Calc (.ods)Name of file type: Oracle Real User Experience InsightFile extension: xmlSwitch to the transformation tab and enter/select the following leaving the rest untouchedXSLT for import: ruei_report_data_import_filter.xslPlease see at the end of this blog post for a download of the referenced file.Select RUEI Import filter from list and Test XSLTClick on Browse to selectTransform file: export.php.xmlOpenOffice.org Calc will transform and load the XML file you retrieved from RUEI in a human-readable format.You can now select File > Open... and change the filetype to open your RUEI exports directly in OpenOffice.org Calc, just like any other a native Spreadsheet format.Files of type: Oracle Real User Experience Insight (*.xml)File name: export.php.xml XML 2 XHTMLMost XML-powered browsers provides for inherent XSL Transformation capabilities, you only have to reference the XSLT Stylesheet in the head of your XML file. Then open the file in your favourite Web Browser, Firefox, Opera, Safari or Internet Explorer alike.<?xml version="1.0" encoding="ISO-8859-1"?><!-- inserted line below --> <?xml-stylesheet type="text/xsl" href="ruei_report_data_export_2_xhtml.xsl"?><!-- inserted line above --><report>You can find a patched example export from RUEI plus the above referenced XSL-Stylesheets here: export.php.xml - Example report data export from RUEI ruei_report_data_export_2_xhtml.xsl - RUEI to XHTML XSL Transformation Stylesheetruei_report_data_import_filter.xsl - OpenOffice.org XML import filter for RUEI report export data If you would like to do things like this on the command line you can use either Xalan or xsltproc.The basic command syntax for xsltproc is very simple:xsltproc -o output.file stylesheet.xslt inputfile.xmlYou can use this with the above two stylesheets to translate RUEI Data Exports into XHTML and/or OpenOffice.org Calc ODS-Format. Or you could write your own XSLT to transform into Comma separated Value lists.Please let me know what you think or do with this information in the comments below.Kind regards,Stefan ThiemeReferences used:OpenOffice XML Filter - Create XSLT filters for import and export - http://user.services.openoffice.org/en/forum/viewtopic.php?f=45&t=3490SUN OpenOffice.org XML File Format 1.0 - http://xml.openoffice.org/xml_specification.pdf

    Read the article

  • Data Auditor by Example

    - by Jinjin.Wang
    OWB has a node Data Auditors under Oracle Module in Projects Navigator. What is data auditor and how to use it? I will give an introduction to data auditor and show its usage by examples. Data auditor is an important tool in ensuring that data quality levels meet business requirements. Data auditor validates data against a set of data rules to determine which records comply and which do not. It gathers statistical metrics on how well the data in a system complies with a rule by auditing and marking how many errors are occurring against the audited table. Data auditors are typically scheduled for regular execution as part of a process flow, to monitor the quality of the data in an operational environment such as a data warehouse or ERP system, either immediately after updates like data loads, or at regular intervals. How to use data auditor to monitor data quality? Only objects with data rules can be monitored, so the first step is to define data rules according to business requirements and apply them to the objects you want to monitor. The objects can be tables, views, materialized views, and external tables. Secondly create a data auditor containing the objects. You can configure the data auditor and set physical deployment parameters for it as optional, which will be used while running the data auditor. Then deploy and run the data auditor either manually or as part of the process flow. After execution, the data auditor sets several output values, and records that are identified as not complying with the defined data rules contained in the data auditor are written to error tables. Here is an example. We have two tables DEPARTMENTS and EMPLOYEES (see pic-1 and pic-2. Click here for DDL and data) imported into OWB. We want to gather statistical metrics on how well data in these two tables satisfies the following requirements: a. Values of the EMPLOYEES.EMPLOYEE_ID attribute are three-digit numbers. b. Valid values for EMPLOYEES.JOB_ID are IT_PROG, SA_REP, SH_CLERK, PU_CLERK, and ST_CLERK. c. EMPLOYEES.EMPLOYEE_ID is related to DEPARTMENTS.MANAGER_ID. Pic-1 EMPLOYEES Pic-2 DEPARTMENTS 1. To determine legal data within EMPLOYEES or legal relationships between data in different columns of the two tables, firstly we define data rules based on the three requirements and apply them to tables. a. The first requirement is about patterns that an attribute is allowed to conform to. We create a Domain Pattern List data rule EMPLOYEE_PATTERN_RULE here. The pattern is defined in the Oracle Database regular expression syntax as ^([0-9]{3})$ Apply data rule EMPLOYEE_PATTERN_RULE to table EMPLOYEES.

    Read the article

< Previous Page | 18 19 20 21 22 23 24 25 26 27 28 29  | Next Page >