Search Results

Search found 65999 results on 2640 pages for 'large data volumes'.

Page 26/2640 | < Previous Page | 22 23 24 25 26 27 28 29 30 31 32 33 | Next Page >

Large scale application development and MVP - Part II

If you're interested in building large scale applications the MVP way, you may want to checkout the latest addition to our articles section: Large application development and MVP...

Read the article
Shrinking physical volumes in LVM on a Linux Guest in ESXi 5.0

- by Stew

The problem: Linux guest (OpenSuse 12.1), with multiple virtual disks attached. 3 disks are in a logical volume, two of which are exactly 2TB. None of the disks are independent, and due to the backup software we use, cannot be independent. When the two 2TB virtual disks are "dependent", the snapshot fails stating that the file is too large for the datastore. When I put those two disks in independent mode, snapshots work fine (the other disk is 1.8TB). I have therefore concluded that even shrinking the two physical disks by 100GB should solve the problem, however I am having trouble conceptualizing how to go about getting those disks smaller without breaking the LVM entirely. The actual LV has 1.3TB free, so there is plenty of space to shrink with. What I need to accomplish: Deallocate 100GB from the two, 2TB virtual disks within the linux guest. Shrink the two virtual disks by 100GB within vsphere (not as complicated). Are there any vsphere/LVM gurus that can give me a clue?

Read the article
3 Problems Some Large Websites Face With SEO

Thin content, duplicate content, or little or no original content can be killers to your large website SEO strategy. Using great keywords appropriately can really help indicate the subject matter of the individual page and allow for better SEO indexing. Here are a few tips to help optimize your large website and avoid thin content pages.

Read the article
Supporting Large Scale Team Development

With a large-scale development of a database application, the task of supporting a large number of development and test databases, keeping them up to date with different builds can soon become ridiculously complex and costly. Grant Fritchey demonstrates a novel solution that can reduce the storage requirements enormously, and allow individual developers to work on thir own version, using a full set of data.

Read the article
Best 3 Ways to Optimize a Large Website For the Search Engines

A lot of people believe that in order to get good rankings they need to build a large website. As a general rule, we classify a website as large if there are more than 300 pages of content available to the readers. This may seem like a lot of content, but even this can be considered a small website by those who have been in the SEO industry for a long time.

Read the article
How to recover data from a partially overwritten partition

- by shredder12

By mistake, I configured a 900GB partition to be part of a 50GB raid. The sync is complete and my understanding is that only the first 50GB of the bigger partition is overwritten. How do I recover the rest of the data? When I try to mount this partition by identifying it as ext3, it mounts only the 50GB overwritten space. This partition was earlier divided into various logical volumes(all ext3 filesystems) through LVM. Any suggestions?

Read the article
Webbased data modelling and management tool

- by pixeldude

Is there a web-based tool available, where I am able to... ...define data models (like in a database admin tool) ...fill in data (in custom web forms, not too generic) with basic features like completion ...import data from CSV oder Excel Sheets ...export data to CSV or SQL ...create snapshots of my data models (versions, diff, etc.) ...share my data models ...discuss/collaborate with other people about my data models Well, I can develop something like this in PHP or with Ruby or whatever. But this is such a common task, where the application support could be a lot better. And it would be language and database independent. This would help to maintain data models in different versions and you can maybe share your data models with others, extend it with your team members, etc. There is a website called FreeBase, which allows you to define a data entity model and fill in data, which also has export features, but I need to define my own data model with my own granularity and structure. And it should not be shared in public if I don't want to. How do you solve problems like this yourself?

Read the article
Filter large amounts of data from a HTML table w/ jQuery

- by Bry4n

I work for a transit agency and I have large amounts of data (mostly times), and I need a way to filter the data using two textboxes (To and From). I found jQuery quick search, but it seems to only work with one textbox. If anyone has any ideas via jQuery or some other client side library, that would be fantastic. Ideal example: To: [Textbox] From:[Textbox] <table> <tr> <td>69th street</td><td>5:00pm</td><td>5:06pm</td><td>5:10pm</td><td>5:20pm</td> </tr> <tr> <td>Millbourne</td><td>5:09pm</td><td>5:15pm</td><td>5:20pm</td><td>5:25pm</td> </tr> <tr> <td>Spring Garden</td><td>6:00pm</td><td>6:15pm</td><td>6:20pm</td><td>6:25pm</td> </tr> </table> I have an HTML page with a giant table on it listing the station names and each stations times. I want to be able to put my starting location in one box and my ending location in another box and have all the items in the table disappear that don't relate to either of the two locations typed in, leaving only two rows that match what was typed in (even if they don't spell it right or type it all the way) Similar to the jQuery quick search plugin

Read the article
Filter large amounts of data in a table w/ jQuery

- by Bry4n

I work for a transit agency and I have large amounts of data (mostly times), and I need a way to filter the data using two textboxes (To and From). I found jQuery quick search, but it seems to only work with one textbox. If anyone has any ideas via jQuery or some other client side library, that would be fantastic. Ideal example: To: [Textbox] From:[Textbox] <table> <tr> <td>69th street</td><td>5:00pm</td><td>5:06pm</td><td>5:10pm</td><td>5:20pm</td> </tr> <tr> <td>Millbourne</td><td>5:09pm</td><td>5:15pm</td><td>5:20pm</td><td>5:25pm</td> </tr> <tr> <td>Spring Garden</td><td>6:00pm</td><td>6:15pm</td><td>6:20pm</td><td>6:25pm</td> </tr> </table> So If I start typing in one of the stations in the To: textbox it either displays dynamically like the quick search or i have to press a button (either or) and then in the from: textbox. Lastly it shows me to: station and all its times on the left and the from: station and all its times on the right.

Read the article
Way to store a large dictionary with low memory footprint + fast lookups (on Android)

- by BobbyJim

I'm developing an android word game app that needs a large (~250,000 word dictionary) available. I need: reasonably fast look ups e.g. constant time preferable, need to do maybe 200 lookups a second on occasion to solve a word puzzle and maybe 20 lookups within 0.2 second more often to check words the user just spelled. EDIT: Lookups are typically asking "Is in the dictionary?". I'd like to support up to two wildcards in the word as well, but this is easy enough by just generating all possible letters the wildcards could have been and checking the generated words (i.e. 26 * 26 lookups for a word with two wildcards). as it's a mobile app, using as little memory as possible and requiring only a small initial download for the dictionary data is top priority. My first naive attempts used Java's HashMap class, which caused an out of memory exception. I've looked into using the SQL lite databases available on android, but this seems like overkill. What's a good way to do what I need?

Read the article
Handling large (object) datasets with PHP

- by Aron Rotteveel

I am currently working on a project that extensively relies on the EAV model. Both entities as their attributes are individually represented by a model, sometimes extending other models (or at least, base models). This has worked quite well so far since most areas of the application only rely on filtered sets of entities, and not the entire dataset. Now, however, I need to parse the entire dataset (IE: all entities and all their attributes) in order to provide a sorting/filtering algorithm based on the attributes. The application currently consists of aproximately 2200 entities, each with aproximately 100 attributes. Every entity is represented by a single model (for example Client_Model_Entity) and has a protected property called $_attributes, which is an array of Attribute objects. Each entity object is about 500KB, which results in an incredible load on the server. With 2000 entities, this means a single task would take 1GB of RAM (and a lot of CPU time) in order to work, which is unacceptable. Are there any patterns or common approaches to iterating over such large datasets? Paging is not really an option, since everything has to be taken into account in order to provide the sorting algorithm.

Read the article
Advanced Data Source Engine coming to Telerik Reporting Q1 2010

This is the final blog post from the pre-release series. In it we are going to share with you some of the updates coming to our reporting solution in Q1 2010. A new Declarative Data Source Engine will be added to Telerik Reporting, that will allow full control over data management, and deliver significant gains in rendering performance and memory consumption. Some of the engines new features will be: Data source parameters - those parameters will be used to limit data retrieved from the data source to just the data needed for the report. Data source parameters are processed on the data source side, however only queried data is fetched to the reporting engine, rather than the full data source. This leads to lower memory consumption, because data operations are performed on queried data only, rather than on all data. As a result, only the queried data needs to be stored in the memory vs. the whole dataset, which was the case with the old approach Support for stored procedures - they will assist in achieving a consistent implementation of logic across applications, and are especially practical for performing repetitive tasks. A stored procedure stores the SQL statements and logic, which can then be executed in different reports and/or applications. Stored Procedures will not only save development time, but they will also improve performance, because each stored procedure is compiled on the data base server once, and then is reutilized. In Telerik Reporting, the stored procedure will also be parameterized, where elements of the SQL statement will be bound to parameters. These parameterized SQL queries will be handled through the data source parameters, and are evaluated at run time. Using parameterized SQL queries will improve the performance and decrease the memory footprint of your application, because they will be applied directly on the database server and only the necessary data will be downloaded on the middle tier or client machine; Calculated fields through expressions - with the help of the new reporting engine you will be able to use field values in formulas to come up with a calculated field. A calculated field is a user defined field that is computed "on the fly" and does not exist in the data source, but can perform calculations using the data of the data source object it belongs to. Calculated fields are very handy for adding frequently used formulas to your reports; Improved performance and optimized in-memory OLAP engine - the new data source will come with several improvements in how aggregates are calculated, and memory is managed. As a result, you may experience between 30% (for simpler reports) and 400% (for calculation-intensive reports) in rendering performance, and about 50% decrease in memory consumption. Full design time support through wizards - Declarative data sources are a great advance and will save developers countless hours of coding. In Q1 2010, and true to Telerik Reportings essence, using the new data source engine and its features requires little to no coding, because we have extended most of the wizards to support the new functionality. The newly extended wizards are available in VS2005/VS2008/VS2010 design-time. More features will be revealed on the product's what's new page when the new version is officially released in a few days. Also make sure you attend the free webinar on Thursday, March 11th that will be dedicated to the updates in Telerik Reporting Q1 2010. Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

Read the article
BizTalk Cross Reference Data Management Strategy

- by charlie.mott

Article Source: http://geekswithblogs.net/charliemott This article describes an approach to the management of cross reference data for BizTalk. Some articles about the BizTalk Cross Referencing features can be found here: http://home.comcast.net/~sdwoodgate/xrefseed.zip http://geekswithblogs.net/michaelstephenson/archive/2006/12/24/101995.aspx http://geekswithblogs.net/charliemott/archive/2009/04/20/value-vs.id-cross-referencing-in-biztalk.aspx Options Current options to managing this data include: Maintaining xml files in the format that can be used by the out-of-the-box BTSXRefImport.exe utility. Use of user interfaces that have been developed to manage this data: BizTalk Cross Referencing Tool XRef XML Creation Tool However, there are the following issues with the above options: The 'BizTalk Cross Referencing Tool' requires a separate database to manage. The 'XRef XML Creation' tool has no means of persisting the data settings. The 'BizTalk Cross Referencing tool' generates integers in the common id field. I prefer to use a string (e.g. acme.country.uk). This is more readable. (see naming conventions below). Both UI tools continue to use BTSXRefImport.exe. This utility replaces all xref data. This can be a problem in continuous integration environments that support multiple clients or BizTalk target instances. If you upload the data for one client it would destroy the data for another client. Yet in TFS where builds run concurrently, this would break unit tests. Alternative Approach In response to these issues, I instead use simple SQL scripts to directly populate the BizTalkMgmtDb xref tables combined with a data namepacing strategy to isolate client data. Naming Conventions All data keys use namespace prefixing. The pattern will be <companyName>.<data Type>. The naming conventions will be to use lower casing for all items. The data must follow this pattern to isolate it from other company cross-reference data. The table below shows some sample data. (Note: this data uses the 'ID' cross-reference tables. the same principles apply for the 'value' cross-referencing tables). Table.Field Description Sample Data xref_AppType.appType Application Types acme.erp acme.portal acme.assetmanagement xref_AppInstance.appInstance Application Instances (each will have a corresponding application type). acme.dynamics.ax acme.dynamics.crm acme.sharepoint acme.maximo xref_IDXRef.idXRef Holds the cross reference data types. acme.taxcode acme.country xref_IDXRefData.CommonID Holds each cross reference type value used by the canonical schemas. acme.vatcode.exmpt acme.vatcode.std acme.country.usa acme.country.uk xref_IDXRefData.AppID This holds the value for each application instance and each xref type. GBP USD SQL Scripts The data to be stored in the BizTalkMgmtDb xref tables will be managed by SQL scripts stored in a database project in the visual studio solution. File(s) Description Build.cmd A sqlcmd script to deploy data by running the SQL scripts below. (This can be run as part of the MSBuild process). acme.purgexref.sql SQL script to clear acme.* data from the xref tables. As such, this will not impact data for any other company. acme.applicationInstances.sql SQL script to insert application type and application instance data. acme.vatcode.sql acme.country.sql etc ... There will be a separate SQL script to insert each cross-reference data type and application specific values for these types.

Read the article
Master Note for Generic Data Warehousing

- by lajos.varady(at)oracle.com

++++++++++++++++++++++++++++++++++++++++++++++++++++ The complete and the most recent version of this article can be viewed from My Oracle Support Knowledge Section. Master Note for Generic Data Warehousing [ID 1269175.1] ++++++++++++++++++++++++++++++++++++++++++++++++++++In this Document Purpose Master Note for Generic Data Warehousing Components covered Oracle Database Data Warehousing specific documents for recent versions Technology Network Product Homes Master Notes available in My Oracle Support White Papers Technical Presentations Platforms: 1-914CU; This document is being delivered to you via Oracle Support's Rapid Visibility (RaV) process and therefore has not been subject to an independent technical review. Applies to: Oracle Server - Enterprise Edition - Version: 9.2.0.1 to 11.2.0.2 - Release: 9.2 to 11.2Information in this document applies to any platform. Purpose Provide navigation path Master Note for Generic Data Warehousing Components covered Read Only Materialized ViewsQuery RewriteDatabase Object PartitioningParallel Execution and Parallel QueryDatabase CompressionTransportable TablespacesOracle Online Analytical Processing (OLAP)Oracle Data MiningOracle Database Data Warehousing specific documents for recent versions 11g Release 2 (11.2)11g Release 1 (11.1)10g Release 2 (10.2)10g Release 1 (10.1)9i Release 2 (9.2)9i Release 1 (9.0)Technology Network Product HomesOracle Partitioning Advanced CompressionOracle Data MiningOracle OLAPMaster Notes available in My Oracle SupportThese technical articles have been written by Oracle Support Engineers to provide proactive and top level information and knowledge about the components of thedatabase we handle under the "Database Datawarehousing".Note 1166564.1 Master Note: Transportable Tablespaces (TTS) -- Common Questions and IssuesNote 1087507.1 Master Note for MVIEW 'ORA-' error diagnosis. For Materialized View CREATE or REFRESHNote 1102801.1 Master Note: How to Get a 10046 trace for a Parallel QueryNote 1097154.1 Master Note Parallel Execution Wait Events Note 1107593.1 Master Note for the Oracle OLAP OptionNote 1087643.1 Master Note for Oracle Data MiningNote 1215173.1 Master Note for Query RewriteNote 1223705.1 Master Note for OLTP Compression Note 1269175.1 Master Note for Generic Data WarehousingWhite Papers Transportable Tablespaces white papers Database Upgrade Using Transportable Tablespaces:Oracle Database 11g Release 1 (February 2009) Platform Migration Using Transportable Database Oracle Database 11g and 10g Release 2 (August 2008) Database Upgrade using Transportable Tablespaces: Oracle Database 10g Release 2 (April 2007) Platform Migration using Transportable Tablespaces: Oracle Database 10g Release 2 (April 2007)Parallel Execution and Parallel Query white papers Best Practices for Workload Management of a Data Warehouse on the Sun Oracle Database Machine (June 2010) Effective resource utilization by In-Memory Parallel Execution in Oracle Real Application Clusters 11g Release 2 (Feb 2010) Parallel Execution Fundamentals in Oracle Database 11g Release 2 (November 2009) Parallel Execution with Oracle Database 10g Release 2 (June 2005)Oracle Data Mining white paper Oracle Data Mining 11g Release 2 (March 2010)Partitioning white papers Partitioning with Oracle Database 11g Release 2 (September 2009) Partitioning in Oracle Database 11g (June 2007)Materialized Views and Query Rewrite white papers Oracle Materialized Views and Query Rewrite (May 2005) Improving Performance using Query Rewrite in Oracle Database 10g (December 2003)Database Compression white papers Advanced Compression with Oracle Database 11g Release 2 (September 2009) Table Compression in Oracle Database 10g Release 2 (May 2005)Oracle OLAP white papers On-line Analytic Processing with Oracle Database 11g Release 2 (September 2009) Using Oracle Business Intelligence Enterprise Edition with the OLAP Option to Oracle Database 11g (July 2008)Generic white papers Enabling Pervasive BI through a Practical Data Warehouse Reference Architecture (February 2010) Optimizing and Protecting Storage with Oracle Database 11g Release 2 (November 2009) Oracle Database 11g for Data Warehousing and Business Intelligence (August 2009) Best practices for a Data Warehouse on Oracle Database 11g (September 2008)Technical PresentationsA selection of ObE - Oracle by Examples documents: Generic Using Basic Database Functionality for Data Warehousing (10g) Partitioning Manipulating Partitions in Oracle Database (11g Release 1) Using High-Speed Data Loading and Rolling Window Operations with Partitioning (11g Release 1) Using Partitioned Outer Join to Fill Gaps in Sparse Data (10g) Materialized View and Query Rewrite Using Materialized Views and Query Rewrite Capabilities (10g) Using the SQLAccess Advisor to Recommend Materialized Views and Indexes (10g) Oracle OLAP Using Microsoft Excel With Oracle 11g Cubes (how to analyze data in Oracle OLAP Cubes using Excel's native capabilities) Using Oracle OLAP 11g With Oracle BI Enterprise Edition (Creating OBIEE Metadata for OLAP 11g Cubes and querying those in BI Answers) Building OLAP 11g Cubes Querying OLAP 11g Cubes Creating Interactive APEX Reports Over OLAP 11g CubesSelection of presentations from the BIWA website:Extreme Data Warehousing With Exadata by Hermann Baer (July 2010) (slides 2.5MB, recording 54MB)Data Mining Made Easy! Introducing Oracle Data Miner 11g Release 2 New "Work flow" GUI by Charlie Berger (May 2010) (slides 4.8MB, recording 85MB )Best Practices for Deploying a Data Warehouse on Oracle Database 11g by Maria Colgan (December 2009) (slides 3MB, recording 18MB, white paper 3MB )

Read the article
How to give my user permission to add/edit files on local apache server? [duplicate]

- by Logan

Possible Duplicate: How to make Apache run as current user I'm setting up my local test server again, and I seem to have forgotten how to successfully set up the LAMP server. I have installed LAMP server via tasksel command and I have configured the /var/www directory according to a guide I've found: After the lamp server installation you will need write permissions to the /var/www directory. Follow these steps to configure permissions. Add your user to the www-data group sudo usermod -a -G www-data <your user name> now add the /var/www folder to the www-data group sudo chgrp -R www-data /var/www now give write permissions to the www-data group sudo chmod -R g+w /var/www So logan user is now part of www-data group and the file/folder permissions look like the output below: logan@computer:/var/www$ ls -lart total 172 -rw-r--r-- 1 www-data www-data 1997 Oct 23 2010 wp-links-opml.php -rw-r--r-- 1 www-data www-data 3177 Nov 1 2010 wp-config-sample.php -rw-r--r-- 1 www-data www-data 3700 Jan 8 2012 wp-trackback.php -rw-r--r-- 1 www-data www-data 271 Jan 8 2012 wp-blog-header.php -rw-r--r-- 1 www-data www-data 395 Jan 8 2012 index.php -rw-r--r-- 1 www-data www-data 3522 Apr 10 2012 wp-comments-post.php -rw-r--r-- 1 www-data www-data 19929 May 6 2012 license.txt -rw-r--r-- 1 www-data www-data 18219 Sep 11 08:27 wp-signup.php -rw-r--r-- 1 www-data www-data 2719 Sep 11 16:11 xmlrpc.php -rw-r--r-- 1 www-data www-data 2718 Sep 23 12:57 wp-cron.php -rw-r--r-- 1 www-data www-data 7723 Sep 25 01:26 wp-mail.php -rw-r--r-- 1 www-data www-data 2408 Oct 26 15:40 wp-load.php -rw-r--r-- 1 www-data www-data 4663 Nov 17 10:11 wp-activate.php -rw-r--r-- 1 www-data www-data 9899 Nov 22 04:52 wp-settings.php -rw-r--r-- 1 www-data www-data 9175 Nov 29 19:57 readme.html -rw-r--r-- 1 www-data www-data 29310 Nov 30 08:40 wp-login.php drwxr-xr-x 14 root root 4096 Dec 24 17:41 .. drwx------ 9 www-data www-data 4096 Dec 26 16:11 wp-admin drwx------ 9 www-data www-data 4096 Dec 26 16:11 wp-includes -rw-rw-rw- 1 www-data www-data 3448 Dec 26 16:14 wp-config.php drwxrwxr-x 5 www-data www-data 4096 Dec 26 16:14 . drwx------ 6 www-data www-data 4096 Dec 26 16:19 wp-content Things work perfectly at http://localhost, I can view the website fine. The thing with this is that I will be working on a plugin for wordpress and I don't want to deal with separate owners under www directory to create or modify files/folders. When I give my user the ownership of /var/www recursively as logan:www-data I can create/modify files but cannot view the http://localhost. I get a Forbidden error. I'm assuming that this is because of the Apache's configuration? Which one is healthier or easier considering this is just a local test website, configuring apache to give user logan to view website and chmod /var/www logan:logan so that I can create files etc. without any sudo commands; or is it easier to configure user groups to get www-data user to act like my logan user? (Idk how that's possible, maybe putting www-data user under logan group?) Please shed some light to this subject. All I want is to be able to create/modifiy files under my user, and yet to be able to successfully view http://localhost I appreciate the help!

Read the article
Oracle Enterprise Data Quality: Ever Integration-ready

- by Mala Narasimharajan

It is closing in on a year now since Oracle’s acquisition of Datanomic, and the addition of Oracle Enterprise Data Quality (EDQ) to the Oracle software family. The big move has caused some big shifts in emphasis and some very encouraging excitement from the field. To give an illustration, combined with a shameless promotion of how EDQ can help to give quick insights into your data, I did a quick Phrase Profile of the subject field of emails to the Global EDQ mailing list since it was set up last September. The results revealed a very clear theme: Integration, Integration, Integration! As well as the important Siebel and Oracle Data Integrator (ODI) integrations, we have been asked about integration with a huge variety of Oracle applications, including EBS, Peoplesoft, CRM on Demand, Fusion, DRM, Endeca, RightNow, and more - and we have not stood still! While it would not have been possible to develop specific pre-integrations with all of the above within a year, we have developed a package of feature-rich out-of-the-box web services and batch processes that can be plugged into any application or middleware technology with ease. And with Siebel, they work out of the box. Oracle Enterprise Data Quality version 9.0.4 includes the Customer Data Services (CDS) pack – a ready set of standard processes with standard interfaces, to provide integrated: Address verification and cleansing Individual matching Organization matching The services can are suitable for either Batch or Real-Time processing, and are enabled for international data, with simple configuration options driving the set of locale-specific dictionaries that are used. For example, large dictionaries are provided to support international name transcription and variant matching, including highly specialized handling for Arabic, Japanese, Chinese and Korean data. In total across all locales, CDS includes well over a million dictionary entries. Excerpt from EDQ’s CDS Individual Name Standardization Dictionary CDS has been developed to replace the OEM of Informatica Identity Resolution (IIR) for attached Data Quality on the Oracle price list, but does this in a way that creates a ‘best of both worlds’ situation for customers, who can harness not only the out-of-the-box functionality of pre-packaged matching and standardization services, but also the flexibility of OEDQ if they want to customize the interfaces or the process logic, without having to learn more than one product. From a competitive point of view, we believe this stands us in good stead against our key competitors, including Informatica, who have separate ‘Identity Resolution’ and general DQ products, and IBM, who provide limited out-of-the-box capabilities (with a steep learning curve) in both their QualityStage data quality and Initiate matching products. Here is a brief guide to the main services provided in the pack: Address Verification and Standardization EDQ’s CDS Address Cleaning Process The Address Verification and Standardization service uses EDQ Address Verification (an OEM of Loqate software) to verify and clean addresses in either real-time or batch. The Address Verification processor is wrapped in an EDQ process – this adds significant capabilities over calling the underlying Address Verification API directly, specifically: Country-specific thresholds to determine when to accept the verification result (and therefore to change the input address) based on the confidence level of the API Optimization of address verification by pre-standardizing data where required Formatting of output addresses into the input address fields normally used by applications Adding descriptions of the address verification and geocoding return codes The process can then be used to provide real-time and batch address cleansing in any application; such as a simple web page calling address cleaning and geocoding as part of a check on individual data. Duplicate Prevention Unlike Informatica Identity Resolution (IIR), EDQ uses stateless services for duplicate prevention to avoid issues caused by complex replication and synchronization of large volume customer data. When a record is added or updated in an application, the EDQ Cluster Key Generation service is called, and returns a number of key values. These are used to select other records (‘candidates’) that may match in the application data (which has been pre-seeded with keys using the same service). The ‘driving record’ (the new or updated record) is then presented along with all selected candidates to the EDQ Matching Service, which decides which of the candidates are a good match with the driving record, and scores them according to the strength of match. In this model, complex multi-locale EDQ techniques can be used to generate the keys and ensure that the right balance between performance and matching effectiveness is maintained, while ensuring that the application retains control of data integrity and transactional commits. The process is explained below: EDQ Duplicate Prevention Architecture Note that where the integration is with a hub, there may be an additional call to the Cluster Key Generation service if the master record has changed due to merges with other records (and therefore needs to have new key values generated before commit). Batch Matching In order to allow customers to use different match rules in batch to real-time, separate matching templates are provided for batch matching. For example, some customers want to minimize intervention in key user flows (such as adding new customers) in front end applications, but to conduct a more exhaustive match on a regular basis in the back office. The batch matching jobs are also used when migrating data between systems, and in this case normally a more precise (and automated) type of matching is required, in order to minimize the review work performed by Data Stewards. In batch matching, data is captured into EDQ using its standard interfaces, and records are standardized, clustered and matched in an EDQ job before matches are written out. As with all EDQ jobs, batch matching may be called from Oracle Data Integrator (ODI) if required. When working with Siebel CRM (or master data in Siebel UCM), Siebel’s Data Quality Manager is used to instigate batch jobs, and a shared staging database is used to write records for matching and to consume match results. The CDS batch matching processes automatically adjust to Siebel’s ‘Full Match’ (match all records against each other) and ‘Incremental Match’ (match a subset of records against all of their selected candidates) modes. The Future The Customer Data Services Pack is an important part of the Oracle strategy for EDQ, offering a clear path to making Data Quality Assurance an integral part of enterprise applications, and providing a strong value proposition for adopting EDQ. We are planning various additions and improvements, including: An out-of-the-box Data Quality Dashboard Even more comprehensive international data handling Address search (suggesting multiple results) Integrated address matching The EDQ Customer Data Services Pack is part of the Enterprise Data Quality Media Pack, available for download at http://www.oracle.com/technetwork/middleware/oedq/downloads/index.html.

Read the article
Master Data Management Implementation Styles

- by david.butler(at)oracle.com

In any Master Data Management solution deployment, one of the key decisions to be made is the choice of the MDM architecture. Gartner and other analysts describe some different Hub deployment styles, which must be supported by a best of breed MDM solution in order to guarantee the success of the deployment project. Registry Style: In a Registry Style MDM Hub, the various source systems publish their data and a subscribing Hub stores only the source system IDs, the Foreign Keys (record IDs on source systems) and the key data values needed for matching. The Hub runs the cleansing and matching algorithms and assigns unique global identifiers to the matched records, but does not send any data back to the source systems. The Registry Style MDM Hub uses data federation capabilities to build the "virtual" golden view of the master entity from the connected systems. Consolidation Style: The Consolidation Style MDM Hub has a physically instantiated, "golden" record stored in the central Hub. The authoring of the data remains distributed across the spoke systems and the master data can be updated based on events, but is not guaranteed to be up to date. The master data in this case is usually not used for transactions, but rather supports reporting; however, it can also be used for reference operationally. Coexistence Style: The Coexistence Style MDM Hub involves master data that's authored and stored in numerous spoke systems, but includes a physically instantiated golden record in the central Hub and harmonized master data across the application portfolio. The golden record is constructed in the same manner as in the consolidation style, and, in the operational world, Consolidation Style MDM Hubs often evolve into the Coexistence Style. The key difference is that in this architectural style the master data stored in the central MDM system is selectively published out to the subscribing spoke systems. Transaction Style: In this architecture, the Hub stores, enhances and maintains all the relevant (master) data attributes. It becomes the authoritative source of truth and publishes this valuable information back to the respective source systems. The Hub publishes and writes back the various data elements to the source systems after the linking, cleansing, matching and enriching algorithms have done their work. Upstream, transactional applications can read master data from the MDM Hub, and, potentially, all spoke systems subscribe to updates published from the central system in a form of harmonization. The Hub needs to support merging of master records. Security and visibility policies at the data attribute level need to be supported by the Transaction Style hub, as well. Adaptive Transaction Style: This is similar to the Transaction Style, but additionally provides the capability to respond to diverse information and process requests across the enterprise. This style emerged most recently to address the limitations of the above approaches. With the Adaptive Transaction Style, the Hub is built as a platform for consolidating data from disparate third party and internal sources and for serving unified master entity views to operational applications, analytical systems or both. This approach delivers a real-time Hub that has a reliable, persistent foundation of master reference and relationship data, along with all the history and lineage of data changes needed for audit and compliance tracking. On top of this persistent master data foundation, the Hub can dynamically aggregate transaction data on demand from different source systems to deliver the unified golden view to downstream systems. Data can also be accessed through batch interfaces, published to a message bus or served through a real-time services layer. New data sources can be readily added in this approach by extending the data model and by configuring the new source mappings and the survivorship rules, meaning that all legacy data hubs can be leveraged to contribute their records/rules into the new transaction hub. Finally, through rich user interfaces for data stewardship, it allows exception handling by business analysts to keep it current with business rules/practices while maintaining the reliability of best-of-breed master records. Confederation Style: In this architectural style, several Hubs are maintained at departmental and/or agency and/or territorial level, and each of them are connected to the other Hubs either directly or via a central Super-Hub. Each Domain level Hub can be implemented using any of the previously described styles, but normally the Central Super-Hub is a Registry Style one. This is particularly important for Public Sector organizations, where most of the time it is practically or legally impossible to store in a single central hub all the relevant constituent information from all departments. Oracle MDM Solutions can be deployed according to any of the above MDM architectural styles, and have been specifically designed to fully support the Transaction and Adaptive Transaction styles. Oracle MDM Solutions provide strong data federation and integration capabilities which are key to enabling the use of the Confederated Hub as a possible architectural style approach. Don't lock yourself into a solution that cannot evolve with your needs. With Oracle's support for any type of deployment architecture, its ability to leverage the outstanding capabilities of the Oracle technology stack, and its open interfaces for non-Oracle technology stacks, Oracle MDM Solutions provide a low TCO and a quick ROI by enabling a phased implementation strategy.

Read the article
Best practices for upgrading user data when updating versions of software

- by Javy

In my code I check the current version of the software on launch and compare it to the version stored in the user's data file(s). If the version is newer, then I call different methods to update the old data to the newer data version, if necessary. I usually have to make a new method to convert the data with each update that changes user data in some way, and cannot remove the old ones in case there was someone who missed an update. So the app must be able to go through each method call and update their data until they get their data current. With larger data sets, this could be a problem. In addition, I recently had a brief discussion with another StackOverflow user this and he indicated he always appended a date stamp to the filename to manage data versions, although his reasoning as to why this was better than storing the version data in the file itself was unclear. Since I've rarely seen management of user data versions in books I've read, I'm curious what are the best practices for naming user data files and procedures for updating older data to newer versions.

Read the article
How to tackle archived who-is personal data with opt-out?

- by defaye

As far as I understand it, it is possible to opt-out (in the UK at least) of having your address details displayed on who-is information of a domain for non-trading individuals. What I want to know is, after opt-out, how do individuals combat archived data? Is there any enforcement of this? How many who-is websites are there which archive data and what rights do we have to force them to remove that data without paying absurd fees? In the case of capitulating to these scoundrels, what point is it in paying for the removal of archived data if that data can presumably resurface on another who-is repository? In other words, what strategy is one supposed to take, besides being wiser after the fact?

Read the article
how to find maximum frequent item sets from large transactional data file

- by ANIL MANE

Hi, I have the input file contains large amount of transactions like Transaction ID Items T1 Bread, milk, coffee, juice T2 Juice, milk, coffee T3 Bread, juice T4 Coffee, milk T5 Bread, Milk T6 Coffee, Bread T7 Coffee, Bread, Juice T8 Bread, Milk, Juice T9 Milk, Bread, Coffee, T10 Bread T11 Milk T12 Milk, Coffee, Bread, Juice i want the occurrence of every unique item like Item Name Count Bread 9 Milk 8 Coffee 7 Juice 6 and from that i want an a fp-tree now by traversing this tree i want the maximal frequent itemsets as follows The basic idea of method is to dispose nodes in each “layer” from bottom to up. The concept of “layer” is different to the common concept of layer in a tree. Nodes in a “layer” mean the nodes correspond to the same item and be in a linked list from the “Head Table”. For nodes in a “layer” NBN method will be used to dispose the nodes from left to right along the linked list. To use NBN method, two extra fields will be added to each node in the ordered FP-Tree. The field tag of node N stores the information of whether N is maximal frequent itemset, and the field count’ stores the support count information in the nodes at left. In Figure, the first node to be disposed is “juice: 2”. If the min_sup is equal to or less than 2 then “bread, milk, coffee, juice” is a maximal frequent itemset. Firstly output juice:2 and set the field tag of “coffee:3” as “false” (the field tag of each node is “true” initially ). Next check whether the right four itemsets juice:1 be the subset of juice:2. If the itemset one node “juice:1” corresponding to is the subset of juice:2 set the field tag of the node “false”. In the following process when the field tag of the disposed node is FALSE we can omit the node after the same tagging. If the min_sup is more than 2 then check whether the right four juice:1 is the subset of juice:2. If the itemset one node “juice:1” corresponding to is the subset of juice:2 then set the field count’ of the node with the sum of the former count’ and 2 After all the nodes “juice” disposed ,begin to dispose the node “coffee:3”. Any suggestions or available source code, welcome. thanks in advance

Read the article
Chef: nested data bag data to template file returns "can't convert String into Integer"

- by Dalho Park

I'm creating simple test recipe with a template and data bag. What I'm trying to do is creating a config file from data bag that has simple nested information, but I receive error "can't convert String into Integer" Here are my setting file 1) recipe/default.rb data1 = data_bag_item( 'mytest', 'qa' )['test'] data2 = data_bag_item( 'mytest', 'qa' ) template "/opt/env/test.cfg" do source "test.erb" action :create_if_missing mode 0664 owner "root" group "root" variables({ :pepe1 = data1['part.name'], :pepe2 = data2['transport.tcp.ip2'] }) end 2)my data bag named "mytest" $knife data bag show mytest qa id: qa test: part.name: L12 transport.tcp.ip: 111.111.111.111 transport.tcp.port: 9199 transport.tcp.ip2: 222.222.222.222 3)template file test.erb part.name=<%= @pepe1 % transport.tcp.binding=<%= @pepe2 % Error reurns when I run chef-client on my server, [2013-06-24T19:50:38+00:00] DEBUG: filtered backtrace of compile error: /var/chef/cache/cookbooks/config_test/recipes/default.rb:19:in []',/var/chef/cache/cookbooks/config_test/recipes/default.rb:19:inblock in from_file',/var/chef/cache/cookbooks/config_test/recipes/default.rb:12:in from_file' [2013-06-24T19:50:38+00:00] DEBUG: filtered backtrace of compile error: /var/chef/cache/cookbooks/config_test/recipes/default.rb:19:in[]',/var/chef/cache/cookbooks/config_test/recipes/default.rb:19:in block in from_file',/var/chef/cache/cookbooks/config_test/recipes/default.rb:12:infrom_file' [2013-06-24T19:50:38+00:00] DEBUG: backtrace entry for compile error: '/var/chef/cache/cookbooks/config_test/recipes/default.rb:19:in `[]'' [2013-06-24T19:50:38+00:00] DEBUG: Line number of compile error: '19' Recipe Compile Error in /var/chef/cache/cookbooks/config_test/recipes/default.rb TypeError can't convert String into Integer Cookbook Trace: /var/chef/cache/cookbooks/config_test/recipes/default.rb:19:in []' /var/chef/cache/cookbooks/config_test/recipes/default.rb:19:inblock in from_file' /var/chef/cache/cookbooks/config_test/recipes/default.rb:12:in `from_file' Relevant File Content: /var/chef/cache/cookbooks/config_test/recipes/default.rb: 12: template "/opt/env/test.cfg" do 13: source "test.erb" 14: action :create_if_missing 15: mode 0664 16: owner "root" 17: group "root" 18: variables({ 19 :pepe1 = data1['part.name'], 20: :pepe2 = data2['transport.tcp.ip2'] 21: }) 22: end 23: I tried many things and if I comment out "pepe1 = data1['part.name'],", then :pepe2 = data2['transport.tcp.ip2'] works fine. only nested data "part.name" cannot be set to @pepe1. Does anyone knows why I receive the errors? thanks,

Read the article
pure-ftpd debian, can't get www-data user working

- by lynks

I'm trying to add FTP access to the apache web files, in the past I have done this with an ftpuser and group arrangement. This time I would like to make it possible to login directly as www-data (the default apache user on debian) to make things a bit cleaner. I have checked and re-checked all the common issues; MinUID is set to 1 (www-data has uid 33) www-data has shell set to /bin/bash in /etc/passwd PAMAuthentication is off UnixAuthentication is on I have restarted pure-ftpd using /etc/init.d/pure-ftpd restart My resulting pure-ftpd run is; /usr/sbin/pure-ftpd -l unix -A -Y 1 -u 1 -E -O clf:/var/log/pure-ftpd/transfer.log -8 UTF-8 -B My syslog contains; Oct 7 19:46:40 Debian-60-squeeze-64 pure-ftpd: ([email protected]) [WARNING] Can't login as [www-data]: account disabled And my ftp client is giving me; 530 Sorry, but I can't trust you Am I missing something obvious?

Read the article
Data recovery on a corrupted 3TB disk

- by Mark K Cowan

Short version I probably need software to run a deep-scan recovery (ideally on Linux) to find files on NTFS filesystem. The file data is intact, but the references are no longer present. Analogous to recovering data from a "quick-formatted" partition. Hopefully there is a smarter way available than deep-scan, one which would recover filenames and possibly paths. Long version I have a 3TB disk containing a load of backups. Windows 7 SP1 refused to detect the disk when plugged in directly via SATA, so I put it on a USB/SATA adaptor which seemed to work at first. The SATA/USB adaptor probably does not support disks over 2.2TB though. Windows first asked me if I wanted to 'format' the disk, then later showed me most of the contents but some folder were inaccessible. I stupidly decided to run a CHKDSK on my backup disk, which made the folders accessible but also left them empty. I connected this disk via SATA to my main PC (Arch Linux). I tried: testdisk ntfsundelete ntfsfix --no-action (to look for diagnostically relevant faults, disk was "OK" though) to no avail as the files references in the tables had presumably been zeroed out by CHKDSK, rather than using a typical journal'd deletion). If it is useful at all, a majority of the files that I want to recover are JPEG, Photoshop PSD, and MPEG-3/MPEG-4/AVI/MKV files. If worst comes to worst, I'll just design my own sector scanner and use some simple heuristic-driven analysis to recover raw binary blocks of data from the disk which appears to match the structures of the above file types. I am unfamiliar with the exact workings of NTFS but used to be proficient at recovering FAT32 systems with just a hex-editor, so I can provide any useful diagnostic information if you let me know how to find it! My priorities in ascending order of importance for choosing the accepted answer: Restores directory structure Recovers many filenames in addition to the file data Is free / very cheap Runs on Linux Recovers a majority of file data The last point is the most important, but the more of the higher points you match the more rep you'll probably get :)

Read the article
Any "Magic Tricks" For Getting Data Back After Windows 7 Install

- by user163757

My old man installed Windows 7 without making a proper backup, and now realizes he left behind some important data. He did a true "clean install", so there is no Windows.old folder in the root directory. However, I believe the format performed on the hard drive was only a quick format, so I am hoping there is some chance at data recovery. I took his hard drive out, and have spent a majority of the weekend researching data recovery options. I paid $70 for the GetDataBack software, but have had little success with it. I can see all of the files I want to restore, however they appear corrupt when I try to open them. With that all being said, does anyone know of a viable way to recover some of this data, or is it a lost cause all together?

Read the article
raid 0 data recovery?

- by Fred

HI All, I have two identical seagate 7200.9 500Gb drives confiured as a RAID 0 spanned disk in windows. One of the drives has lost power and wont spin up at all. I know this normally means death for the data on both drives but i have a cunning plan.. DISK 1 - NO POWER RAID 0 DISK DISK 2 - FULLY FUNCTIONAL RAID 0 DISK DISK 3 - FULLY FUNCTIONAL SPARE DISK Copy the working drive (disk 2) data to a third 500GB DISK (disk 3), remove the logic board from the working disk (disk 2) and replace it with the non working logic board on the broken drive (disk 1) , then hopefully recreate the RAID 0 with disk 1 and disk 3, just long enough to get the data off it. Hope this makes sense, here are my questions: Windows disk manager atm recognises disk 2 but wont let me access it in anyway, therefore copying the data off it (or getting a disk image) cant be done in windows. Does anyone know of any software (in linux or self booting) that would allow me to access this disk? Anyone know of any software that will recreate the spanned drive off two disk images Am i missing any key information that means i definitely shouldn't even bother starting this, i know its a long shot anyway but its worth a try unless i definitely cant do it. The irritating thing is that i am sure its a logic board failure on disk 1 as it simply wont power up at all, suddenly no signs of life, so i am sure the data is intact! Any help would be really appreciated! Thanks

Read the article

Search Results

Search found 65999 results on 2640 pages for 'large data volumes'.

Page 26/2640 | < Previous Page | 22 23 24 25 26 27 28 29 30 31 32 33 | Next Page >

- by Stew

- by shredder12

- by pixeldude

- by Bry4n

- by Bry4n

- by BobbyJim

- by Aron Rotteveel

- by charlie.mott

- by lajos.varady(at)oracle.com

- by Logan

- by Mala Narasimharajan

- by david.butler(at)oracle.com

- by Javy

- by defaye

- by ANIL MANE

- by Dalho Park

- by lynks

- by Mark K Cowan

- by user163757

- by Fred

< Previous Page | 22 23 24 25 26 27 28 29 30 31 32 33 | Next Page >