Search Results

Search found 59880 results on 2396 pages for 'data recovery'.

Page 26/2396 | < Previous Page | 22 23 24 25 26 27 28 29 30 31 32 33  | Next Page >

  • Big Data – ClustrixDB – Extreme Scale SQL Database with Real-time Analytics, Releases Software Download – NewSQL

    - by Pinal Dave
    There are so many things to learn and there is so little time we all have. As we have little time we need to be selective to learn whatever we learn. I believe I know quite a lot of things in SQL but I still do not know what is around SQL. I have started to learn about NewSQL recently. If you wonder what is NewSQL I encourage all of you to read my blog post about NewSQL over here Big Data – Buzz Words: What is NewSQL – Day 10 of 21. NewSQL databases are quickly becoming popular – providing the scale of NoSQL with the SQL features and transactions. As a part of learning NewSQL database, I have recently started to learn about ClustrixDB. ClustrixDB has been the most mature NewSQL database used by some of the largest internet sites in the world for over 3 years, with extensive SQL support. In addition to scale, it provides fast real-time analytics by bringing massively parallel processing (MPP), available only in warehousing databases, to the transactional database. The reason I am more intrigued about learning ClustrixDB is their recent announcement on Oct 31. ClustrixDB was only available as an appliance, but now with their software release on Oct 31, everyone can use it. It is now available as forever free for up to 12 cores with community support, and there is a 45 day trial for unlimited cluster sizes. With the forever free world, I am indeed interested in ClustrixDB now. I know that few of the leading eCommerce sites in the world uses them for their transactional database. Here are few of the details I have quickly noted for ClustrixDB. ClustrixDB allows user to: Scale by simply adding nodes to the cluster with a single command Run billions of transactions a day Run fast real-time analytics Achieve high-availability with recovery from node failure Manages itself Easily migrate from MySQL as it is nearly plug-and-play compatible, use MySQL drivers, tools and replication. While I was going through the documentation I realized that ClustrixDB also has extensive support for SQL features including complex queries involving joins on a dozen or more tables, aggregates, sorts, sub-queries. It also supports stored procedures, triggers, foreign keys, partitioned and temporary tables, and fully online schema changes. It is indeed a very matured product and SQL solution. Indeed Clusterix sound very promising solution, I decided to dig a bit deeper to understand who are current customers of the Clustrix as they exist in the industry for quite a few years. Their client list is indeed very interesting and here is my quick research about them. Twoo.com – Europe’s largest social discovery (dating) site runs 4.4 Billion Transactions a day with table sizes over a Terabyte, on a 168 core cluster. EngageBDR – Top 3 in the online advertising category uses ClustrixDB to serve 6.9 billion ads a day through real-time bidding platform. Their reports went from 4 hours to 15 seconds. NoMoreRack – Top 2 fastest growing e-commerce company in US used ClustrixDB for high availability and fast growth through Amazon cloud. MakeMyTrip – India’s leading travel site runs on ClustrixDB with two clusters running as multi-master in Chennai and Bangalore. Many enterprises such as AOL, CSC, Rakuten, Symantec use ClustrixDB when their applications need scale. I must accept that I am impressed with the information I have learned so far and now is the time to do some hand’s on experience with their product. I want to learn this technology so in future when it is about NewSQL, I know what I am talking about. Read more why Clustrix explains why you ClustrixDB might be the right database for you. Download ClustrixDB with me today and install it on your machine so in future when we discuss the technical aspects of it, we all are on the same page. The software can be downloaded here. Reference : Pinal Dave (http://blog.SQLAuthority.com)Filed under: Big Data, MySQL, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: Clustrix

    Read the article

  • ASP.NET Dynamic Data Deployment Error

    - by rajbk
    You have an ASP.NET 3.5 dynamic data website that works great on your local box. When you deploy it to your production machine and turn on debug, you get the YSD Server Error in '/MyPath/MyApp' Application. Parser Error Description: An error occurred during the parsing of a resource required to service this request. Please review the following specific parse error details and modify your source file appropriately. Parser Error Message: Unknown server tag 'asp:DynamicDataManager'. Source Error: Line 5:  Line 6:  <asp:Content ID="Content1" ContentPlaceHolderID="ContentPlaceHolder1" Runat="Server"> Line 7:      <asp:DynamicDataManager ID="DynamicDataManager1" runat="server" AutoLoadForeignKeys="true" /> Line 8:  Line 9:      <h2><%= table.DisplayName%></h2> Probable Causes The server does not have .NET 3.5 SP1, which includes ASP.NET Dynamic Data, installed. Download it here. The third tagPrefix shown below is missing from web.config <pages> <controls> <add tagPrefix="asp" namespace="System.Web.UI" assembly="System.Web.Extensions, Version=3.5.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35"/> <add tagPrefix="asp" namespace="System.Web.UI.WebControls" assembly="System.Web.Extensions, Version=3.5.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35"/> <add tagPrefix="asp" namespace="System.Web.DynamicData" assembly="System.Web.DynamicData, Version=3.5.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35"/> </controls></pages>     Hope that helps!

    Read the article

  • SQL Server 2012 : The Data Tools installer is now available

    - by AaronBertrand
    Last week when RC0 was released, the updated installer for "Juneau" (SQL Server Data Tools) was not available. Depending on how you tried to get it, you either ended up on a blank search page, or a page offering the CTP3 bits. Important note: the CTP3 Juneau bits are not compatible with SQL Server 2012 RC0. If you already have Visual Studio 2010 installed (meaning Standard/Pro/Premium/Ultimate), you will need to install Service Pack 1 before continuing. You can get to the installer simply by opening...(read more)

    Read the article

  • SQL Developer Data Modeler v3.3 Early Adopter: Link Model Objects Across Designs

    - by thatjeffsmith
    The third post in our “What’s New in SQL Developer Data Modeler v3.3” series, SQL Developer Data Modeler now allows you to link objects across models. If you need to catch up on the earlier posts, here are the first two: New and Improved Search Collaborative Design via Excel Today’s post is a very simple and straightforward discussion on how to share objects across models and designs. In previous releases you could easily copy and paste objects between models and designs. Simply select your object, right-click and select ‘Copy’ Once copied, paste it into your other designs and then make changes as required. Once you paste the object, it is no longer associated with the source it was copied from. You are free to make any changes you want in the new location without affecting the source material. And it works the other way as well – make any changes to the source material and the new object is also unaffected. However. What if you want to LINK a model object instead of COPYING it? In version 3.3, you can now do this. Simply drag and drop the object instead of copy and pasting it. Select the object, in this case a relational model table, and drag it to your other model. It’s as simple as it sounds, here’s a little animated GIF to show you what I’m talking about. Drag and drop between models/designs to LINK an object Notes The ‘linked’ object cannot be modified from the destination space Updating the source object will propagate the changes forward to wherever it’s been linked You can drag a linked object to another design, so dragging from A - B and then from B - C will work Linked objects are annotated in the model with a ‘Chain’ bitmap, see below This object has been linked from another design/model and cannot be modified. A very simple feature, but I like the flexibility here. Copy and paste = new independent object. Drag and drop = linked object.

    Read the article

  • Data structures in functional programming

    - by pwny
    I'm currently playing with LISP (particularly Scheme and Clojure) and I'm wondering how typical data structures are dealt with in functional programming languages. For example, let's say I would like to solve a problem using a graph pathfinding algorithm. How would one typically go about representing that graph in a functional programming language (primarily interested in pure functional style that can be applied to LISP)? Would I just forget about graphs altogether and solve the problem some other way?

    Read the article

  • PASS Data Architecture VC presents Neil Hambly on Improve Data Quality & Integrity using Constraints

    On Tuesday June 19th 12PM noon Central, Neil Hambly will discuss "Leveraging the power of constraints to improve both data quality and performance of your databases." What are your servers really trying to tell you? Find out with new SQL Monitor 3.0, an easy-to-use tool built for no-nonsense database professionals.For effortless insights into SQL Server, download a free trial today.

    Read the article

  • Fokedvenc BI és DW blogjaim 7: Oracle Data Warehousing

    - by Fekete Zoltán
    A következo tartalmas blogot ajánlom a nyájas olvasó figyelmébe: The Data Warehouse Insider: http://blogs.oracle.com/datawarehousing/ Az adattárház általános fogalmaitól és a bevezetések és tervezés "best practice" legjobb gyakorlati tapasztalatokig. Témák: csillagsémák, particionálás, OLAP, 3NF, párhuzamos feldolgozások, adatbetöltés, ETL-ELT, adatmodellek, rendezvények, Exadata, Database Machine, tömörítés, adatbányászat, ügyféltörténetek,...

    Read the article

  • When does 'optimizing code' == 'structuring data'?

    - by NewAlexandria
    A recent article by ycombinator lists a comment with principles of a great programmer. #7. Good programmer: I optimize code. Better programmer: I structure data. Best programmer: What's the difference? Acknowledging subjective and contentious concepts - does anyone have a position on what this means? I do, but I'd like to edit this question later with my thoughts so-as not to predispose the answers.

    Read the article

  • Single Key Multiple Values Data Structure for one to many mapping

    - by nijhawan.saurabh
    Dictionaries are good, they are great to store Key / Value pairs but what if you want to store multiple values for a single key? Dictionaries would not allow duplicate keys. I came across a nice way to represent such a Data Structure using one of the Extension Method (ToLookup) present in System.Linq Namespace which converts an IEnumerable<T> to an ILookup<TKey, TElement>.   Now, there are two parameters this method expects (The other overload expects 3 parameters): IEnumerable<TSource> - This list would contain the actual data. Func<TSource, TKey> keySelector - The Delegate which which computes the keys   The method returns the following: ILookup<TKey, TElement>   This DS would store Keys and multiple values along those keys.   Let's see a small example:        12  using System;    13     using System.Collections.Generic;    14     using System.Linq;    15     16     /// <summary>    17     /// </summary>    18     internal class Program    19     {    20         #region Methods    21     22         /// <summary>    23         /// </summary>    24         /// <param name="args">    25         /// The args.    26         /// </param>    27         private static void Main(string[] args)    28         {    29             // Create an array of strings.    30             var list = new List<string> { "IceCream1", "Chocolate Moose", "IceCream2" };    31     32             // Generate a lookup Data Structure    33             ILookup<int, string> lookupDs = list.ToLookup(item => item.Length);    34     35           // Enumerate groupings.    36             foreach (var group in lookupDs)    37             {    38                 foreach (string element in group)    39                 {    40                     Console.WriteLine(element);    41                 }    42             }    43         } Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;}

    Read the article

  • Data Education: Great Classes Coming to a City Near You

    - by Adam Machanic
    In case you haven't noticed, Data Education (the training company I started a couple of years ago) has expanded beyond the US northeast; we're currently offering courses with top trainers in both St. Louis and Chicago , as well as the Boston area. The courses are starting to fill up fast—not surprising when you consider we’re talking about experienced instructors like Kalen Delaney , Rob Farley , and Allan Hirt —but we have still have some room. We’re very excited about bringing the highest quality...(read more)

    Read the article

  • Extracting the Layout of all the Data Forms from the Relational Database

    - by RahulS
    Today I came across a question from one of our clients that: "what members are used on each data form WITHOUT having to go through the report generated out of our Planning app". We worked with client on this and reached to a simple query. All the form related information is stored in the following tables: HSP_FORM HSP_FORMOBJ_DEF HSP_FORMOBJ_DEF_MBR HSP_FORM_ATTRIBUTES HSP_FORM_CALCS HSP_FORM_DV_CONDITION HSP_FORM_DV_PM_RULE HSP_FORM_DV_RULE HSP_FORM_DV_USER_IN_PM_RULE HSP_FORM_LAYOUT HSP_FORM_MENUS HSP_FORM_VARIABLES If we want to retrieve just the members included, we can concentrate on: HSP_OBJECT to get the Object_ID for form, Object_Type is 7 for forms. (Ex: Select * from HSP_OBJECT where OBJECT_TYPE = 7) HSP_FORMOBJ_DEF Find the OBJDEF_ID for a particular form HSP_FORMOBJ_DEF_MBR Use the above OBJDEF_ID to find the members: Here the Mbr_ID is the Id of the member and Query_Type is the Function like Idesc, Level0 etc and Sequce is you sequence, And the final table we can use is HSP_FORM_LAYOUT: Layout_Type: 0->Pov 1-> Page, 2->Row, 3->Col, DIM_ID is the dimension ID and Ordinal is position. Here is the Query: SELECT HSP_OBJECT.OBJECT_NAME AS 'Form',  HSP_OBJECT_2.OBJECT_NAME AS 'Dimension',  HSP_OBJECT_1.OBJECT_NAME AS 'Member',  HSP_FORMOBJ_DEF_MBR.QUERY_TYPE FROM  <DatabaseName>.dbo.HSP_FORM_LAYOUT HSP_FORM_LAYOUT,  <DatabaseName>.dbo.HSP_FORMOBJ_DEF HSP_FORMOBJ_DEF,  <DatabaseName>.dbo.HSP_FORMOBJ_DEF_MBR HSP_FORMOBJ_DEF_MBR,  <DatabaseName>.dbo.HSP_MEMBER HSP_MEMBER,  <DatabaseName>.dbo.HSP_OBJECT HSP_OBJECT,  <DatabaseName>.dbo.HSP_OBJECT HSP_OBJECT_1,  <DatabaseName>.dbo.HSP_OBJECT HSP_OBJECT_2 WHERE  HSP_OBJECT.OBJECT_ID = HSP_FORMOBJ_DEF.FORM_ID AND  HSP_FORMOBJ_DEF_MBR.OBJDEF_ID = HSP_FORMOBJ_DEF.OBJDEF_ID AND  HSP_MEMBER.MEMBER_ID = HSP_FORMOBJ_DEF_MBR.MBR_ID AND  HSP_OBJECT_1.OBJECT_ID = HSP_MEMBER.MEMBER_ID AND  HSP_OBJECT_2.OBJECT_ID = HSP_MEMBER.DIM_ID AND  HSP_FORM_LAYOUT.DIM_ID = HSP_MEMBER.DIM_ID AND  HSP_FORM_LAYOUT.FORM_ID = HSP_FORMOBJ_DEF.FORM_ID AND  ((HSP_OBJECT.OBJECT_TYPE=7)) ORDER BY HSP_OBJECT.OBJECT_NAME  Concentrate on Test1 data form and Actual Layout of it as follows: Corresponding Query_type for few of the functions: 9  for Idesc, 3  for Ancestors, -9 for ILvl0Des, 8  for Desc, 4  for IAncestors Its just a basic idea you can do lot on the basis of this. Cheers..!!! Rahul S. http://www.facebook.com/pages/HyperionPlanning/117320818374228

    Read the article

  • Sorting data in the SSIS Pipeline (Video)

    In this post I want to show a couple of ways to order the data that comes into the pipeline.  a number of people have asked me about this primarily because there are a number of ways to do it but also because some components in the pipeline take sorted inputs.  One of the methods I show is visually easy to understand and the other is less visual but potentially more performant.

    Read the article

  • What's beyond c,c++ and data structure?

    - by sagacious
    I have learnt c and c++ programming languages.i have learnt data structure too. Now i'm confused what to do next?my aim is to be a good programmer. i want to go deeper into the field of programming and making the practical applications of what i have learnt. So,the question takes the form-what to do next?Or is there any site where i can see advantage of every language with it's features? sorry,if there's any language error and thanks in advance.

    Read the article

  • Recover data from a Thecus N4100 NAS jbod partition

    - by TimothyP
    I have a Thecus N4100 NAS wich had a 2TB drive in it configured as jbod partition. Later I tried adding a second drive to expand the available space. I added it to the jbod configuration but I did not get any extra space. Then I tried removing the second drive but then the NAS system indicated the jbod configuration was damaged. After a reboot it told me there is no configuration and I need to create a new RAID/jbod configuration and that I would lose all data on the drive. Of course I did not do this, I took the 2TB drive and checked with Linux if the partition was still there and it is... completely intact. I found that linux recognizes it as a /dev/mdX device and that I should be able to mount it but I don't know the file system type. Anyway.... is there a way to recover the data? Since everything has always been on a single drive it should all be there right? I can connect the drive to Windows , Linux or MacOS so whatever gets the job done.

    Read the article

  • Five stars of open data - example and review

    - by Joe
    (there may be a more suited SE site for this question so feel free to shift) I have some data I'd like to make open to the public - It's synatesis of some related data retrived from freedom of infomation requests over the last year. The data itself is at http://www.cs.rhul.ac.uk/home/joseph/domesday/Domesday-Scotland.csv or for fans of Excel, at http://www.cs.rhul.ac.uk/home/joseph/domesday/Domesday-Scotland.xlsx . It's no more than a table with about five columns. I'd like to make this properly open data, so I was looking at the 5 star deployment scheme for Open Data. Much of which is fine but I'm confused towards the end and I could do with an explenation from people who know the answers. So to get achieve the star levels I need: "make your stuff available on the Web (whatever format) under an open license" trival - all I have to do is put the notes up on the page that will give the provance of the data. "make it available as structured data (e.g., Excel instead of image scan of a table)"… done… "use non-proprietary formats (e.g., CSV instead of Excel)" - done… "use URIs to identify things, so that people can point at your stuff" - this is where I start to get a bit hazy - does this mean there should be an URI for every line in the table? "link your data to other data to provide context" - this isn't massively clear to me - does this mean to give the provence of the data? One column of the data I've put out is a link to where the data came from - is that the sort of thing we're looking at? Any and all information and answers welcome… EDIT - or if anyone wants to recommend a place SE or other place to ask the question - that would be cool...

    Read the article

  • saving data from a failing drive

    - by intuited
    An external 3½" HDD seems to be in danger of failing — it's making ticking sounds when idle. I've acquired a replacement drive, and want to know the best strategy to get the data off of the dubious drive with the best chance of saving as much as possible. There are some directories that are more important than others. However, I'm guessing that picking and choosing directories is going to reduce my chances of saving the whole thing. I would also have to mount it, dump a file listing, and then unmount it in order to be able to effectively prioritize directories. Adding in the fact that it's time-consuming to do this, I'm leaning away from this approach. I've considered just using dd, but I'm not sure how it would handle read errors or other problems that might prevent only certain parts of the data from being rescued, or which could be overcome with some retries, but not so many that they endanger other parts of the drive from being saved. I guess ideally it would do a single pass to get as much as possible and then go back to retry anything that was missed due to errors. Is it possible that copying more slowly — e.g. pausing every x MB/GB — would be better than just running the operation full tilt, for example to avoid any overheating issues? For the "where is your backup" crowd: this actually is my backup drive, but it also contains some non-critical and bulky stuff, like music, that aren't backups, i.e. aren't backed up. The drive has not exhibited any clear signs of failure other than this somewhat ominous sound. I did have to fsck a few errors recently — orphaned inodes, incorrect free blocks/inodes counts, inode bitmap differences, zero dtime on deleted inodes; about 20 errors in all. The filesystem of the partition is ext3.

    Read the article

  • Recovering data from an external hard drive

    - by CCallaghan
    I have a WD Elements 2GB hard drive (formatted NTFS). I accidentally kicked out the USB cable while writing data to the disk, and now I can't access most of the data. Although this was ostensibly my backup drive, there is a great deal of important material on there which was only on there. I realise how idiotic this makes me. (So, formatting is not an option.) Things I've tried/information I've gathered: Windows Explorer will recognise the drive itself. However, it will not access most directories therein (and will sometimes crash when exploring). I can access all of the directories through the command line, but the dir command will often report that it can't read any files in most of the directories. The situation was similar when I hooked it up to an Ubuntu machine: the file explorer crashed, but I could access directories - but not files in those directories - via terminal commands. Several files I tried to copy out either resulted in an I/O error being reported or resulted in the command line crashing. The Disk Management utility on Windows reports a healthy disk formatted as NTFS and not RAW. It also indicates the correct amount of space used up and its capacity (so it seems that the files are not deleted). I've tried to run chkdsk, but that hangs on Step 2 (checking indexes) at 74%. Step 1 reported no bad sectors. I tried Recuva, but that didn't seem to work (stalled at 0% for half an hour). I should also note that the disk doesn't seem to be spinning smoothly; it seems to be chopping back, like it's reading the same sector over and over again. I noticed this after I kicked out the cable. Any help would be greatly appreciated. Update: It would seem the problem has taken a turn for the worse. The external hard drive now shows up on my computer as a local disk and is not mountable by Linux.

    Read the article

  • Symantec Protection Suite and System Recovery 2011 Desktop Edition

    - by rihatum
    I am re-posting this as my previous question was being treated as if I am "Shopping or seeking Product Recommendations" even though I was NOT - BTW they have deleted my comments too which were not offensive in nature. anyway - I have re-phrased some parts of my question and I hope SF Admins "Do Not Modify / Edit" this one - will be most grateful for that. I have a lot of respect for the People who visit this SITE and help others ! Just To clarify : Just to go by SF rules - I am not seeking someone to Design this solution, I am simply seeking real world examples, experiences, technical expert opinions / suggestions, any tips or tricks they may have or any problems they may have faced while doing something similar above with these products. I am also not asking for Capacity Planning for Storage, We have done some research and I am seeking Expert Assurance / Suggestions. We (our company) are planning to deploy Symantec Endpoint Protection and Symantec Desktop Recovery 2011 Desktop Edition to our 3000 - 4000 workstations (Windows7 32 and 64) with a few 100s with Windows XP 32/64 Bit. I have read the implementation guide for SEP and have read tech-notes for Desktop Recovery 2011. Our team have planned to deploy this as follows : 1 x dedicated SQL 2008R2 for Symantec Endpoint Protection (Instead of using the Embedded Database) 1 x Dedicated SQL 2008R2 for Symantec Desktop Recovery 2011 (Instead of using the Embedded Database) 1 x Dedicated W2K8 R2 Box for the SEPM (Symantec Endpoint Protection Manager - Mgmt. APP) 1 x Dedicated W2K8 R2 Box for the Symantec Desktop Recovery 2011 Management Application Agent Deployment : As per Symantec Documentation for both of the above, an agent can be pushed via the Mgmt. Application (provided no firewalls are blocking ports required etc. - we have Windows firewall disabled already). Server Hardware : Per SQL Server : 16GB RAM + SAS DISKS + Dual XEON, RAID-10 for the SQL DB or I can always mount a LUN from our existing Hitachi or EMC SAN. SEPM Server : 16GB RAM + SAS DISKS + DUAL XEON System Recovery MGMT SERVER : 16GB RAM + SAS DISKS + DUAL XEON Above is the initial plan we have for 3000 - 4000 client workstation (Windows) Now my Questions :-) a) If we had these users distributed amongst two sites with AD DC / GC in each site, How would I restrict SEPM and Desktop Mgmt. solution to only check for users in their respective site ? b) At present all users are under one building but we are going to move some dept. to a new location (with dedicated connectivity), How would we control which SEPM / MGMT Server is responsible for which site ? c) We have netbackup in our environment backing up other servers, I am planning to protect these 4 (2 x SQL, 1 x SEPM, 1 x System Recovery Mgmt. Server) via netbackup or I can use System recovery 2011 server edition on all 4 of these boxes as well. (License is not an issue as we have the complete symantec portfolio included in our license). d) Now - Saving Desktop backups - What strategies have you implemented ? Any best practice recommendation for a large user base ? I was thinking to either mount a LUN from our Hitachi SAN on the Symantec Recovery Server itself or backup to the users hard drive locally and then copy it over to a network location ? Suggestions welcome :-) If you have anything to add / correct - that will be really helpful before diving into the actual implementation phase. Will be most grateful with your suggestions, recommendations and corrections with above - Many Thanks !

    Read the article

  • Using a "white list" for extracting terms for Text Mining, Part 2

    - by [email protected]
    In my last post, we set the groundwork for extracting specific tokens from a white list using a CTXRULE index. In this post, we will populate a table with the extracted tokens and produce a case table suitable for clustering with Oracle Data Mining. Our corpus of documents will be stored in a database table that is defined as create table documents(id NUMBER, text VARCHAR2(4000)); However, any suitable Oracle Text-accepted data type can be used for the text. We then create a table to contain the extracted tokens. The id column contains the unique identifier (or case id) of the document. The token column contains the extracted token. Note that a given document many have many tokens, so there will be one row per token for a given document. create table extracted_tokens (id NUMBER, token VARCHAR2(4000)); The next step is to iterate over the documents and extract the matching tokens using the index and insert them into our token table. We use the MATCHES function for matching the query_string from my_thesaurus_rules with the text. DECLARE     cursor c2 is       select id, text       from documents; BEGIN     for r_c2 in c2 loop        insert into extracted_tokens          select r_c2.id id, main_term token          from my_thesaurus_rules          where matches(query_string,                        r_c2.text)>0;     end loop; END; Now that we have the tokens, we can compute the term frequency - inverse document frequency (TF-IDF) for each token of each document. create table extracted_tokens_tfidf as   with num_docs as (select count(distinct id) doc_cnt                     from extracted_tokens),        tf       as (select a.id, a.token,                            a.token_cnt/b.num_tokens token_freq                     from                        (select id, token, count(*) token_cnt                        from extracted_tokens                        group by id, token) a,                       (select id, count(*) num_tokens                        from extracted_tokens                        group by id) b                     where a.id=b.id),        doc_freq as (select token, count(*) overall_token_cnt                     from extracted_tokens                     group by token)   select tf.id, tf.token,          token_freq * ln(doc_cnt/df.overall_token_cnt) tf_idf   from num_docs,        tf,        doc_freq df   where df.token=tf.token; From the WITH clause, the num_docs query simply counts the number of documents in the corpus. The tf query computes the term (token) frequency by computing the number of times each token appears in a document and divides that by the number of tokens found in the document. The doc_req query counts the number of times the token appears overall in the corpus. In the SELECT clause, we compute the tf_idf. Next, we create the nested table required to produce one record per case, where a case corresponds to an individual document. Here, we COLLECT all the tokens for a given document into the nested column extracted_tokens_tfidf_1. CREATE TABLE extracted_tokens_tfidf_nt              NESTED TABLE extracted_tokens_tfidf_1                  STORE AS extracted_tokens_tfidf_tab AS              select id,                     cast(collect(DM_NESTED_NUMERICAL(token,tf_idf)) as DM_NESTED_NUMERICALS) extracted_tokens_tfidf_1              from extracted_tokens_tfidf              group by id;   To build the clustering model, we create a settings table and then insert the various settings. Most notable are the number of clusters (20), using cosine distance which is better for text, turning off auto data preparation since the values are ready for mining, the number of iterations (20) to get a better model, and the split criterion of size for clusters that are roughly balanced in number of cases assigned. CREATE TABLE km_settings (setting_name  VARCHAR2(30), setting_value VARCHAR2(30)); BEGIN  INSERT INTO km_settings (setting_name, setting_value) VALUES     VALUES (dbms_data_mining.clus_num_clusters, 20);  INSERT INTO km_settings (setting_name, setting_value)     VALUES (dbms_data_mining.kmns_distance, dbms_data_mining.kmns_cosine);   INSERT INTO km_settings (setting_name, setting_value) VALUES     VALUES (dbms_data_mining.prep_auto,dbms_data_mining.prep_auto_off);   INSERT INTO km_settings (setting_name, setting_value) VALUES     VALUES (dbms_data_mining.kmns_iterations,20);   INSERT INTO km_settings (setting_name, setting_value) VALUES     VALUES (dbms_data_mining.kmns_split_criterion,dbms_data_mining.kmns_size);   COMMIT; END; With this in place, we can now build the clustering model. BEGIN     DBMS_DATA_MINING.CREATE_MODEL(     model_name          => 'TEXT_CLUSTERING_MODEL',     mining_function     => dbms_data_mining.clustering,     data_table_name     => 'extracted_tokens_tfidf_nt',     case_id_column_name => 'id',     settings_table_name => 'km_settings'); END;To generate cluster names from this model, check out my earlier post on that topic.

    Read the article

  • Building vs. Buying a Master Data Management Solution

    - by david.butler(at)oracle.com
    Many organizations prefer to build their own MDM solutions. The argument is that they know their data quality issues and their data better than anyone. Plus a focused solution will cost less in the long run then a vendor supplied general purpose product. This is not unreasonable if you think of MDM as a point solution for a particular data quality problem. But this approach carries significant risk. We now know that organizations achieve significant competitive advantages when they deploy MDM as a strategic enterprise wide solution: with the most common best practice being to deploy a tactical MDM solution and grow it into a full information architecture. A build your own approach most certainly will not scale to a larger architecture unless it is done correctly with the larger solution in mind. It is possible to build a home grown point MDM solution in such a way that it will dovetail into broader MDM architectures. A very good place to start is to use the same basic technologies that Oracle uses to build its own MDM solutions. Start with the Oracle 11g database to create a flexible, extensible and open data model to hold the master data and all needed attributes. The Oracle database is the most flexible, highly available and scalable database system on the market. With its Real Application Clusters (RAC) it can even support the mixed OLTP and BI workloads that represent typical MDM data access profiles. Use Oracle Data Integration (ODI) for batch data movement between applications, MDM data stores, and the BI layer. Use Oracle Golden Gate for more real-time data movement. Use Oracle's SOA Suite for application integration with its: BPEL Process Manager to orchestrate MDM connections to business processes; Identity Management for managing users; WS Manager for managing web services; Business Intelligence Enterprise Edition for analytics; and JDeveloper for creating or extending the MDM management application. Oracle utilizes these technologies to build its MDM Hubs.  Customers who build their own MDM solution using these components will easily migrate to Oracle provided MDM solutions when the home grown solution runs out of gas. But, even with a full stack of open flexible MDM technologies, creating a robust MDM application can be a daunting task. For example, a basic MDM solution will need: a set of data access methods that support master data as a service as well as direct real time access as well as batch loads and extracts; a data migration service for initial loads and periodic updates; a metadata management capability for items such as business entity matrixed relationships and hierarchies; a source system management capability to fully cross-reference business objects and to satisfy seemingly conflicting data ownership requirements; a data quality function that can find and eliminate duplicate data while insuring correct data attribute survivorship; a set of data quality functions that can manage structured and unstructured data; a data quality interface to assist with preventing new errors from entering the system even when data entry is outside the MDM application itself; a continuing data cleansing function to keep the data up to date; an internal triggering mechanism to create and deploy change information to all connected systems; a comprehensive role based data security system to control and monitor data access, update rights, and maintain change history; a flexible business rules engine for managing master data processes such as privacy and data movement; a user interface to support casual users and data stewards; a business intelligence structure to support profiling, compliance, and business performance indicators; and an analytical foundation for directly analyzing master data. Oracle's pre-built MDM Hub solutions are full-featured 3-tier Internet applications designed to participate in the full Oracle technology stack or to run independently in other open IT SOA environments. Building MDM solutions from scratch can take years. Oracle's pre-built MDM solutions can bring quality data to the enterprise in a matter of months. But if you must build, at lease build with the world's best technology stack in a way that simplifies the eventual upgrade to Oracle MDM and to the full enterprise wide information architecture that it enables.

    Read the article

  • How to remove raid 5 array of 5 SAS and use only use one SAS at a time without writing any thing to disc for recovery

    - by murtaza hamid
    I have HP server ML370 g5 with 8 SAS, c drive 1 72 gb raid 0, d drive 2 72 gb raid 0, f drive 5 146 gb raid 5. 2 of 5 sas drive has got bad sectors and raid 5 is showing status failed. now i want to remove all this 5 SAS and put 1 by 1 in any of the bay to make its image for data recovery purpose without writing anything to the drive. how should i proceed. i also want to keep drive c and d intact. also is it possible if i put all this 5 drives in the bay with the same sequense will it recognise the raid 5 array ( i read some where its smart controller..just curious) many thanks in advance.

    Read the article

  • Is ECC mandatory in SSD technology?

    - by Alexander Shcheblikin
    While shopping for an SSD I have noticed that some manufacturers promote their "Pro" models as the ones sporting ECC data protection. Those manufacturers do not mention ECC in their budget models descriptions. However, Wikipedia article on flash memory states that "NAND relies on ECC to compensate for bits that may spontaneously fail during normal device operation." So the question is does any SSD device use ECC behind the scenes for its normal operation and is that ECC "feature" just a marketing ploy?

    Read the article

  • Changing filesystem types "safely"

    - by warren
    Back in Windows 95 OSR2 (I believe), there was a conversion tool that would take your extant FAT16 partition and change it to FAT32 non-destructively (most of the time). Are there any tools like that now for going from one file system type to another in situ without destroying the data? For example, from etx3 to ext4? Or NTFS to XFS?

    Read the article

  • How to recover data from a partially overwritten partition

    - by shredder12
    By mistake, I configured a 900GB partition to be part of a 50GB raid. The sync is complete and my understanding is that only the first 50GB of the bigger partition is overwritten. How do I recover the rest of the data? When I try to mount this partition by identifying it as ext3, it mounts only the 50GB overwritten space. This partition was earlier divided into various logical volumes(all ext3 filesystems) through LVM. Any suggestions?

    Read the article

  • Safely reboot prior to recovering data

    - by ELO
    What is the safest (without additional writing to the disk) way to power down computer whose deleted files you want to recover in order to boot from rescue medium? In case of a desktop computer, plugging off the power cord looks like the most direct solution, but are there possible side-effects, apart from losing unsaved data? More problematic seems the laptop, with removing the battery being the equivalent, but is it a good idea overall?

    Read the article

< Previous Page | 22 23 24 25 26 27 28 29 30 31 32 33  | Next Page >