Search Results

Search found 60688 results on 2428 pages for 'spring data solr'.

Page 29/2428 | < Previous Page | 25 26 27 28 29 30 31 32 33 34 35 36  | Next Page >

  • What data structure to use / data persistence

    - by Dave
    I have an app where I need one table of information with the following fields: field 1 - int or char field 2 - string (max 10 char) field 3 - string (max 20 char) field 4 - float I need the program to filter on field 1 based upon a segmented control and select a field 2 from a picker. From this data I need to look up field 4 to use in a calculation. Total records will be about 200. I never see it go above 400 - 500. I am going to use a singleton which I am able to do, I just need help with the structure for this with data persistence. What type of data structure should I use for this and should I use NSNumber, NSString, etc. or old data types like float, Char, etc. I thought about a struct put into an array but there is probably a better way. This is new to me so any help or reference to examples would be great. I also thought about a plist or dictionary but it looks like it is just a lookup and a field which obviously won't work. Core data looked like overkill to me. Also, with any recommendation how should I get initial data into it? I want the user to be able to edit and add to the database. Sorry for the old terms, you can see what generation I am from... Thanks in advance!!!!

    Read the article

  • Using a "white list" for extracting terms for Text Mining

    - by [email protected]
    In Part 1 of my post on "Generating cluster names from a document clustering model" (part 1, part 2, part 3), I showed how to build a clustering model from text documents using Oracle Data Miner, which automates preparing data for text mining. In this process we specified a custom stoplist and lexer and relied on Oracle Text to identify important terms.  However, there is an alternative approach, the white list, which uses a thesaurus object with the Oracle Text CTXRULE index to allow you to specify the important terms. INTRODUCTIONA stoplist is used to exclude, i.e., black list, specific words in your documents from being indexed. For example, words like a, if, and, or, and but normally add no value when text mining. Other words can also be excluded if they do not help to differentiate documents, e.g., the word Oracle is ubiquitous in the Oracle product literature. One problem with stoplists is determining which words to specify. This usually requires inspecting the terms that are extracted, manually identifying which ones you don't want, and then re-indexing the documents to determine if you missed any. Since a corpus of documents could contain thousands of words, this could be a tedious exercise. Moreover, since every word is considered as an individual token, a term excluded in one context may be needed to help identify a term in another context. For example, in our Oracle product literature example, the words "Oracle Data Mining" taken individually are not particular helpful. The term "Oracle" may be found in nearly all documents, as with the term "Data." The term "Mining" is more unique, but could also refer to the Mining industry. If we exclude "Oracle" and "Data" by specifying them in the stoplist, we lose valuable information. But it we include them, they may introduce too much noise. Still, when you have a broad vocabulary or don't have a list of specific terms of interest, you rely on the text engine to identify important terms, often by computing the term frequency - inverse document frequency metric. (This is effectively a weight associated with each term indicating its relative importance in a document within a collection of documents. We'll revisit this later.) The results using this technique is often quite valuable. As noted above, an alternative to the subtractive nature of the stoplist is to specify a white list, or a list of terms--perhaps multi-word--that we want to extract and use for data mining. The obvious downside to this approach is the need to specify the set of terms of interest. However, this may not be as daunting a task as it seems. For example, in a given domain (Oracle product literature), there is often a recognized glossary, or a list of keywords and phrases (Oracle product names, industry names, product categories, etc.). Being able to identify multi-word terms, e.g., "Oracle Data Mining" or "Customer Relationship Management" as a single token can greatly increase the quality of the data mining results. The remainder of this post and subsequent posts will focus on how to produce a dataset that contains white list terms, suitable for mining. CREATING A WHITE LIST We'll leverage the thesaurus capability of Oracle Text. Using a thesaurus, we create a set of rules that are in effect our mapping from single and multi-word terms to the tokens used to represent those terms. For example, "Oracle Data Mining" becomes "ORACLEDATAMINING." First, we'll create and populate a mapping table called my_term_token_map. All text has been converted to upper case and values in the TERM column are intended to be mapped to the token in the TOKEN column. TERM                                TOKEN DATA MINING                         DATAMINING ORACLE DATA MINING                  ORACLEDATAMINING 11G                                 ORACLE11G JAVA                                JAVA CRM                                 CRM CUSTOMER RELATIONSHIP MANAGEMENT    CRM ... Next, we'll create a thesaurus object my_thesaurus and a rules table my_thesaurus_rules: CTX_THES.CREATE_THESAURUS('my_thesaurus', FALSE); CREATE TABLE my_thesaurus_rules (main_term     VARCHAR2(100),                                  query_string  VARCHAR2(400)); We next populate the thesaurus object and rules table using the term token map. A cursor is defined over my_term_token_map. As we iterate over  the rows, we insert a synonym relationship 'SYN' into the thesaurus. We also insert into the table my_thesaurus_rules the main term, and the corresponding query string, which specifies synonyms for the token in the thesaurus. DECLARE   cursor c2 is     select token, term     from my_term_token_map; BEGIN   for r_c2 in c2 loop     CTX_THES.CREATE_RELATION('my_thesaurus',r_c2.token,'SYN',r_c2.term);     EXECUTE IMMEDIATE 'insert into my_thesaurus_rules values                        (:1,''SYN(' || r_c2.token || ', my_thesaurus)'')'     using r_c2.token;   end loop; END; We are effectively inserting the token to return and the corresponding query that will look up synonyms in our thesaurus into the my_thesaurus_rules table, for example:     'ORACLEDATAMINING'        SYN ('ORACLEDATAMINING', my_thesaurus)At this point, we create a CTXRULE index on the my_thesaurus_rules table: create index my_thesaurus_rules_idx on        my_thesaurus_rules(query_string)        indextype is ctxsys.ctxrule; In my next post, this index will be used to extract the tokens that match each of the rules specified. We'll then compute the tf-idf weights for each of the terms and create a nested table suitable for mining.

    Read the article

  • La recherche Full Text avec Solr, par Guillaume Rossolini

    Bonsoir, Voici un article que je ne parviens pas à finaliser dans le détail, mais qui me semble suffisamment avancé pour vous le présenter sans trop rougir : Configurer un moteur de recherche performant à l'aide d'Apache Lucene/Solr et Apache Tomcat (ou tout autre conteneur de servlets) Citation: Apache Lucene est un moteur d'indexation de texte permettant d'effectuer des recherches en langage naturel à l'aide de diverses manipulations automatiques du ...

    Read the article

  • Spring MVC jQuery remote validation

    - by raulsan
    Hi, I am using Spring MVC on the server side, but in one of the pages I decided to create an AJAX validation with jQuery rather than the default Spring validation. Everything works great, except when I have to do a remote validation to check if a "title" already exists. For the javascript I have the following: var validator = $("form").validate({ rules: { title: { minlength: 6, required: true, remote: { url: location.href.substring(0,location.href.lastIndexOf('/'))+"/checkLocalArticleTitle.do", type: "GET" } }, html: { minlength: 50, required: true } }, messages: { title: { required: "A title is required.", remote: "This title already exists." } } }); Then, I use Spring-Json to make this validation and give a response: @RequestMapping("/checkLocalArticleTitle.do") public ModelAndView checkLocalArticleTitle(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException { Map model = new HashMap(); String result = "FALSE"; try{ String title = request.getParameter("title"); if(!EJBRequester.articleExists(title)){ result = "TRUE"; } }catch(Exception e){ System.err.println("Exception: " + e.getMessage()); } model.put("result",result); return new ModelAndView("jsonView", model); } However, this does not work and the field "title" is never validated. I think the reason for this is that I am returning an answer in the manner: {result:"TRUE"} when in fact, the answer should be: {"TRUE"} I don't know how to return a single response like this one using a ModelAndView answer. Another thing that is not working is the customized message for the "remote" validation: messages: { title: { required: "A title is required.", remote: "This title already exists." } }, The required message works, but not the remote message. I looked around, but I didn't see many people using Spring and jQuery at the same time. I would appreciate some help here.

    Read the article

  • cxf jaxws with spring on gwt 2.0

    - by Karl
    Hi, I'm trying to use an application which uses cxf-jaxws in bean definition: <beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:sec="http://cxf.apache.org/configuration/security" xmlns:jaxws="http://cxf.apache.org/jaxws" xmlns:util="http://www.springframework.org/schema/util" xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.0.xsd http://cxf.apache.org/configuration/security http://cxf.apache.org/schemas/configuration/security.xsd http://cxf.apache.org/jaxws http://cxf.apache.org/schemas/jaxws.xsd http://www.springframework.org/schema/util http://www.springframework.org/schema/util/spring-util-2.0.xsd"> however in combination with the jetty from gwt 2.0 development shell my context doesn't load and I get this exception: org.springframework.web.context.ContextLoader: Context initialization failed org.springframework.beans.factory.parsing.BeanDefinitionParsingException: Configuration problem: Unable to locate Spring NamespaceHandler for XML schema namespace [http://cxf.apache.org/jaxws] enter code hereOffending resource: class path resource [bean_definition.xml] My project is maven-based and I got cxf-rt-frontend-jaxws which contains the namespacehandler and spring.handlers on the classpath. I added cxf-transports-http-jetty.jar as well. Has anyone experienced this kind of problem and found a solution? It seems to be a classpath issue, added the cxf-rt-frontend-jaxws.jar by hand and it works... Somehow the maven dependency doesn't get added to the classpath. Thanks in advance, karl

    Read the article

  • ADO.NET (WCF) Data Services Query Interceptor Hangs IIS

    - by PreMagination
    I have an ADO.NET Data Service that's supposed to provide read-only access to a somewhat complex database. Logically I have table-per-type (TPT) inheritance in my data model but the EDM doesn't implement inheritance. (Limitation of EF and navigation properties on derived types. STILL not fixed in EF4!) I can query my EDM directly (using a separate project) using a copy of the query I'm trying to run against the web service, results are returned within 10 seconds. Disabling the query interceptors I'm able to make the same query against the web service, results are returned similarly quickly. I can enable some of the query interceptors and the results are returned slowly, up to a minute or so later. Alternatively, I can enable all the query interceptors, expand less of the properties on the main object I'm querying, and results are returned in a similar period of time. (I've increased some of the timeout periods) Up til this point Sql Profiler indicates the slow-down is the database. (That's a post for a different day) But when I enable all my query interceptors and expand all the properties I'd like to have the IIS worker process pegs the CPU for 20 minutes and a query is never even made against the database. This implies to me that yes, my implementation probably sucks but regardless the Data Services "tier" is having an issue it shouldn't. WCF tracing didn't reveal anything interesting to my untrained eye. Details: Data model: Agent-Person-Student Student has a collection of referrals Students and referrals are private, queries against the web service should only return "your" students and referrals. This means Person and Agent need to be filtered too. Other entities (Agent-Organization-School) can be accessed by anyone who has authenticated. The existing security model is poorly suited to perform this type of filtering for this type of data access, the query interceptors are complicated and cause EF to generate some entertaining sql queries. Sample Interceptor [QueryInterceptor("Agents")] public Expression<Func<Agent, Boolean>> OnQueryAgents() { //Agent is a Person(1), Educator(2), Student(3), or Other Person(13); allow if scope permissions exist return ag => (ag.AgentType.AgentTypeId == 1 || ag.AgentType.AgentTypeId == 2 || ag.AgentType.AgentTypeId == 3 || ag.AgentType.AgentTypeId == 13) && ag.Person.OrganizationPersons.Count<OrganizationPerson>(op => op.Organization.ScopePermissions.Any<ScopePermission> (p => p.ApplicationRoleAccount.Account.UserName == HttpContext.Current.User.Identity.Name && p.ApplicationRoleAccount.Application.ApplicationId == 124) || op.Organization.HierarchyDescendents.Any<OrganizationsHierarchy>(oh => oh.AncestorOrganization.ScopePermissions.Any<ScopePermission> (p => p.ApplicationRoleAccount.Account.UserName == HttpContext.Current.User.Identity.Name && p.ApplicationRoleAccount.Application.ApplicationId == 124))) > 0; } The query interceptors for Person, Student, Referral are all very similar, ie they traverse multiple same/similar tables to look for ScopePermissions as above. Sample Query var referrals = (from r in service.Referrals .Expand("Organization/ParentOrganization") .Expand("Educator/Person/Agent") .Expand("Student/Person/Agent") .Expand("Student") .Expand("Grade") .Expand("ProblemBehavior") .Expand("Location") .Expand("Motivation") .Expand("AdminDecision") .Expand("OthersInvolved") where r.DateCreated >= coupledays && r.DateDeleted == null select r); Any suggestions or tips would be greatly associated, for fixing my current implementation or in developing a new one, with the caveat that the database can't be changed and that ultimately I need to expose a large portion of the database via a web service that limits data access to the data authorized for, for the purpose of data integration with multiple outside parties. THANK YOU!!!

    Read the article

  • Spring Test / JUnit problem - unable to load application context

    - by HDave
    I am using Spring for the first time and must be doing something wrong. I have a project with several Bean implementations and now I am trying to create a test class with Spring Test and JUnit. I am trying to use Spring Test to inject a customized bean into the test class. Here is my test-applicationContext.xml: <?xml version="1.0" encoding="UTF-8"?> <beans xmlns="............."> <bean id="MyUuidFactory" class="com.myapp.UuidFactory" scope="singleton" > <property name="typeIdentifier" value="CLS" /> </bean> <bean id="ThingyImplTest" class="com.myapp.ThingyImplTest" scope="singleton"> <property name="uuidFactory"> <idref local="MyUuidFactory" /> </property> </bean> </beans> The injection of MyUuidFactory instance goes along with the following code from within the test class: private UuidFactory uuidFactory; public void setUuidFactory(UuidFactory uuidFactory) { this.uuidFactory = uuidFactory; } However, when I go to run the test (in Eclipse or command line) I get the following error (stack trace omitted for brevity): Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'MyImplTest' defined in class path resource [test-applicationContext.xml]: Initialization of bean failed; nested exception is org.springframework.beans.ConversionNotSupportedException: Failed to convert property value of type 'java.lang.String' to required type 'com.myapp.UuidFactory' for property 'uuidFactory'; nested exception is java.lang.IllegalStateException: Cannot convert value of type [java.lang.String] to required type [com.myapp.UuidFactory] for property 'uuidFactory': no matching editors or conversion strategy found Funny thing is, the Eclipse/Spring XML editor shows errors of I misspell any of the types or idrefs. If I leave the bean in, but comment out the dependency injection, everything work until I get a NullPointerException while running the test...which makes sense.

    Read the article

  • Oracle Insurance Gets Innovative with Insurance Business Intelligence

    - by nicole.bruns(at)oracle.com
    Oracle Insurance announced yesterday the availability of Oracle Insurance Insight 7.0, an insurance-specific data warehouse and business intelligence (BI) system that transforms the traditional approach to BI by involving business users in the creation and maintenance."Rapid access to business intelligence is essential to compete and thrive in today's insurance industry," said Srini Venkatasantham, vice president, Product Strategy, Oracle Insurance. "The adaptive data modeling approach of Oracle Insurance Insight 7.0, combined with the insurance-specific data model, offers global insurance companies a faster, easier way to get the intelligence they need to make better-informed business decisions." New Features in Oracle Insurance 7.0 include:"Adaptive Data Modeling" via the new warehouse palette: Gives business users the power to configure lines of business via an easy-to-use warehouse palette tool. Oracle Insurance Insight then automatically creates data warehouse elements - such as line-specific database structures and extract-transform-load (ETL) processes -speeding up time-to-value for BI initiatives. Out-of-the-box insurance models or create-from-scratch option: Includes pre-built content and interfaces for six Property and Casualty (P&C) lines. Additionally, insurers can use the warehouse palette to deploy any and all P&C or General Insurance lines of business from scratch, helping insurers support operations in any country.Leverages Oracle technologies: In addition to Oracle Business Intelligence Enterprise Edition, the solution includes Oracle Database 11g as well as Oracle Data Integrator Enterprise Edition 11g, which delivers Extract, Load and Transform (E-L-T) architecture and eliminates the need for a separate transformation server. Additionally, the expanded Oracle technology infrastructure enables support for Oracle Exadata. Martina Conlon, a Principal with Novarica's Insurance practice, and author of Business Intelligence in Insurance: Current State, Challenges, and Expectations says, "The need for continued investment by insurers in business intelligence capabilities is widely understood, and the industry is acting. Arming the business intelligence implementation with predefined insurance specific content, and flexible and configurable technology will get these projects up and running faster."Learn moreTo see a demo of the Oracle Insurance Insight system, click hereTo read the press announcement, click here

    Read the article

  • SQL SERVER – Installing Data Quality Services (DQS) on SQL Server 2012

    - by pinaldave
    Data Quality Services is very interesting enhancements in SQL Server 2012. My friend and SQL Server Expert Govind Kanshi have written an excellent article on this subject earlier on his blog. Yesterday I stumbled upon his blog one more time and decided to experiment myself with DQS. I have basic understanding of DQS and MDS so I knew I need to start with DQS Client. However, when I tried to find DQS Client I was not able to find it under SQL Server 2012 installation. I quickly realized that I needed to separately install the DQS client. You will find the DQS installer under SQL Server 2012 >> Data Quality Services directory. The pre-requisite of DQS is Master Data Services (MDS) and IIS. If you have not installed IIS, you can follow the simple steps and install IIS in your machine. Once the pre-requisites are installed, click on MDS installer once again and it will install DQS just fine. Be patient with the installer as it can take a bit longer time if your machine is low on configurations. Once the installation is over you will be able to expand SQL Server 2012 >> Data Quality Services directory and you will notice that it will have a new item called Data Quality Client.  Click on it and it will open the client. Well, in future blog post we will go over more details about DQS and detailed practical examples. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQL Utility, T SQL, Technology Tagged: Data Quality Services

    Read the article

  • make-like build tools for data?

    - by miku
    Make is a standard tools for building software. But make decides whether a target needs to be regenerated by comparing file modification times. Are there any proven, preferably small tools that handle builds not for software but for data? Something that regenerates targets not only on mod times but on certain other properties (e.g. completeness). (Or alternatively some paper that describes such a tool.) As illustration: I'd like to automate the following process: get data (e.g. a tarball) from some regularly updated source copy somewhere if it's not there (based e.g. on some filename-scheme) convert the files to different format (but only if there aren't successfully converted ones there - e.g. from a previous attempt - custom comparison routine) for each file find a certain data element and fetch some additional file from say an URL, but only if that hasn't been downloaded yet (decide on existence of file and file "freshness") finally compute something (e.g. word count for something identifiable and store it in the database, but only if the DB does not have an entry for that exact ID yet) Observations: there are different stages each stage is usually simple to compute or implement in isolation each stage may be simple, but the data volume may be large each stage may produce a few errors each stage may have different signals, on when (re)processing is needed Requirements: builds should be interruptable and idempotent (== robust) when interrupted, already processed objects should be reused to speedup the next run data paths should be easy to adjust (simple syntax, nothing new to learn, internal dsl would be ok) some form of dependency graph, that describes the process would be nice for later visualizations should leverage existing programs, if possible I've done some research on make alternatives like rake and have worked a lot with ant and maven in the past. All these tools naturally focus on code and software build, not on data builds. A system we have in place now for a task similar to the above is pretty much just shell scripts, which are compact (and are a ok glue for a variety of other programs written in other languages), so I wonder if worse is better?

    Read the article

  • Social Analytics in your current data

    - by Dan McGrath
    By now everyone is aware of the massive boom in social-networking (Twitter, Facebook, LinkedIn) and obviously a big part of its business model revolves around being able to mine this data to create information that can be used to make money for someone. Gartner has identified 'Social Analytics' as one of the top 10 strategic technologies for 2011. Has anyone looked at their existing data structures to determine if they could extract a social graph and then perform further data mining against this? How does it fit in with your other strategic development strategies? What information are you trying to extract from the data? Take for example, a bank. They could conceivably determine a social graph through account relationships and transactions. Obviously there would be open edges on the graph where funds enter/leave the institute, but that shouldn't detract from the usefulness of the data. I'm looking for actual examples with the answers, as well as why/how they did it. References to other sites will be greatly appreciated. Note: I'm not at all referring to mining data out of actual social networks.

    Read the article

  • Spring - How do you set Enum keys in a Map with annotations

    - by al nik
    Hi all, I've an Enum class public enum MyEnum{ ABC; } than my 'Mick' class has this property private Map<MyEnum, OtherObj> myMap; I've this spring xml configuration. <util:map id="myMap"> <entry key="ABC" value-ref="myObj" /> </util:map> <bean id="mick" class="com.x.Mick"> <property name="myMap" ref="myMap" /> </bean> and this is fine. I'd like to replace this xml configuration with Spring annotations. Do you have any idea on how to autowire the map? The problem here is that if I switch from xml config to the @Autowired annotation (on the myMap attribute of the Mick class) Spring is throwing this exception nested exception is org.springframework.beans.FatalBeanException: Key type [class com.MyEnum] of map [java.util.Map] must be assignable to [java.lang.String] Spring is no more able to recognize the string ABC as a MyEnum.ABC object. Any idea? Thanks

    Read the article

  • Spring and hibernate.cfg.xml

    - by Steve Kuo
    How do I get Spring to load Hibernate's properties from hibernate.cfg.xml? We're using Spring and JPA (with Hibernate as the implementation). Spring's applicationContext.xml specifies the JPA dialect and Hibernate properties: <bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalEntityManagerFactoryBean"> <property name="jpaDialect"> <bean class="org.springframework.orm.jpa.vendor.HibernateJpaDialect" /> </property> <property name="jpaProperties"> <props> <prop key="hibernate.dialect">org.hibernate.dialect.MySQLInnoDBDialect</prop> </props> </property> </bean> In this configuration, Spring is reading all the Hibernate properties via applicationContext.xml . When I create a hibernate.cfg.xml (located at the root of my classpath, the same level as META-INF), Hibernate doesn't read it at all (it's completely ignored). What I'm trying to do is configure Hibernate second level cache by inserting the cache properties in hibernate.cfg.xml: <cache usage="transactional|read-write|nonstrict-read-write|read-only" region="RegionName" include="all|non-lazy" />

    Read the article

  • Spring 2.5 managed servlets: howto?

    - by EugeneP
    Correct me if anything is wrong. As I understand, all Spring functionality, namely DI works when beans are got thru Spring Context, ie getBean() method. Otherwise, none can work, even if my method is marked @Transactional and I will create the owning class with a new operator, no transaction management will be provided. I use Tomcat 6 as a servlet container. So, my question is: how to make Servlet methods managed by Spring framework. The issue here is that I use a framework, and its servlets extend the functionality of basic java Servlets, so they have more methods. Still, web.xml is present in an app as usual. The thing is that I do not control the servlets creation flow, I can only override a few methods of each servlet, the flow is basically written down in some xml file, but I control this process using a graphical gui. So, basically, I only add some code to a few methods of each Servlet. How to make those methods managed by Spring framework? The basic thing I need to do is making these methods transactional (@Transactional).

    Read the article

  • How to instantiate spring bean without being referenced from aop:aspect

    - by XDeveloper
    Using Spring and Java; I have a pointcut which works OK. Now I want to remove the pointcut and AOP from the spring and just trigger the event with an event from inside the java code but I want "myAdvice" bean still called via Spring and its properties set. I want to get ridoff all advice things even in java code, no more advice or any trace of AOP, I already have a nice event system working. I just want to instantiate my bean via Spring. When I remove the second code block (one starting with "aop:config") then I noticed the bean "myAdvice" is not called and instantiated anymore. How can i stil call it set its properties without referencing it from the "aop:aspect" ? in my application context ; <bean id="myAdvice" class="com.myclass"> <property name="name1" ref="ref1" /> <property name="name2" ref="ref2" /> </bean> <aop:config proxy-target-class="true"> <aop:aspect id="myAspect" ref="myAdvice"> <aop:pointcut id="myPointcut" expression="execution(* com.myexcmethod" /> <aop:around pointcut-ref="myPointcut" method="invoke" /> </aop:aspect> </aop:config>

    Read the article

  • The Oldest Big Data Problem: Parsing Human Language

    - by dan.mcclary
    There's a new whitepaper up on Oracle Technology Network which details the use of Digital Reasoning Systems' Synthesys software on Oracle Big Data Appliance.  Digital Reasoning's approach is inherently "big data friendly," as it leverages multiple components of the Hadoop ecosystem.  Moreover, the paper addresses the oldest big data problem of them all: extracting knowledge from human text.   You can find the paper here.   From the Executive Summary: There is a wealth of information to be extracted from natural language, but that extraction is challenging. The volume of human language we generate constitutes a natural Big Data problem, while its complexity and nuance requires a particular expertise to model and mine. In this paper we illustrate the impressive combination of Oracle Big Data Appliance and Digital Reasoning Synthesys software. The combination of Synthesys and Big Data Appliance makes it possible to analyze tens of millions of documents in a matter of hours. Moreover, this powerful combination achieves four times greater throughput than conducting the equivalent analysis on a much larger cloud-deployed Hadoop cluster.

    Read the article

  • Uses of persistent data structures in non-functional languages

    - by Ray Toal
    Languages that are purely functional or near-purely functional benefit from persistent data structures because they are immutable and fit well with the stateless style of functional programming. But from time to time we see libraries of persistent data structures for (state-based, OOP) languages like Java. A claim often heard in favor of persistent data structures is that because they are immutable, they are thread-safe. However, the reason that persistent data structures are thread-safe is that if one thread were to "add" an element to a persistent collection, the operation returns a new collection like the original but with the element added. Other threads therefore see the original collection. The two collections share a lot of internal state, of course -- that's why these persistent structures are efficient. But since different threads see different states of data, it would seem that persistent data structures are not in themselves sufficient to handle scenarios where one thread makes a change that is visible to other threads. For this, it seems we must use devices such as atoms, references, software transactional memory, or even classic locks and synchronization mechanisms. Why then, is the immutability of PDSs touted as something beneficial for "thread safety"? Are there any real examples where PDSs help in synchronization, or solving concurrency problems? Or are PDSs simply a way to provide a stateless interface to an object in support of a functional programming style?

    Read the article

  • Spring Security 3.0- Customise basic http Authentication Dialog

    - by gav
    Rather than reading; A user name and password are being requested by http://localhost:8080. The site says: "Spring Security Application" I want to change the prompt, or at least change what the "site says". Does anyone know how to do this via resources.xml? In my Grails App Spring configuration, my current version is as follows; <?xml version="1.0" encoding="UTF-8"?> <beans:beans xmlns="http://www.springframework.org/schema/security" xmlns:beans="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd http://www.springframework.org/schema/security http://www.springframework.org/schema/security/spring-security-3.0.xsd"> <http auto-config="true" use-expressions="true"> <http-basic/> <intercept-url pattern="/**" access="isAuthenticated()" /> </http> <authentication-manager alias="authenticationManager"> <authentication-provider> <user-service> <user name="admin" password="admin" authorities="ROLE_ADMIN"/> </user-service> </authentication-provider> </authentication-manager> </beans:beans>

    Read the article

  • Spring Integration 1.0 RC2: Streaming file content?

    - by gdm
    I've been trying to find information on this, but due to the immaturity of the Spring Integration framework I haven't had much luck. Here is my desired work flow: New files are placed in an 'Incoming' directory Files are picked up using a file:inbound-channel-adapter The file content is streamed, N lines at a time, to a 'Stage 1' channel, which parses the line into an intermediary (shared) representation. This parsed line is routed to multiple 'Stage 2' channels. Each 'Stage 2' channel does its own processing on the N available lines to convert them to a final representation. This channel must have a queue which ensures no Stage 2 channel is overwhelmed in the event that one channel processes significantly slower than the others. The final representation of the N lines is written to a file. There will be as many output files as there were routing destinations in step 4. *'N' above stands for any reasonable number of lines to read at a time, from [1, whatever I can fit into memory reasonably], but is guaranteed to always be less than the number of lines in the full file. How can I accomplish streaming (steps 3, 4, 5) in Spring Integration? It's fairly easy to do without streaming the files, but my files are large enough that I cannot read the entire file into memory. As a side note, I have a working implementation of this work flow without Spring Integration, but since we're using Spring Integration in other places in our project, I'd like to try it here to see how it performs and how the resulting code compares for length and clarity.

    Read the article

  • implement AOP for Controllers in Spring 3

    - by tommy
    How do I implement AOP with an annotated Controller? I've search and found two previous posts regarding the problem, but can't seem to get the solutions to work. posted solution 1 posted solution 2 Here's what I have: Dispatch Servlet: <beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:p="http://www.springframework.org/schema/p" xmlns:context="http://www.springframework.org/schema/context" xmlns:mvc="http://www.springframework.org/schema/mvc" xmlns:aop="http://www.springframework.org/schema/aop" xsi:schemaLocation=" http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context-3.0.xsd http://www.springframework.org/schema/mvc http://www.springframework.org/schema/mvc/spring-mvc-3.0.xsd http://www.springframework.org/schema/aop http://www.springframework.org/schema/aop/spring-aop-3.0.xsd"> <context:annotation-config/> <context:component-scan base-package="com.foo.controller"/> <bean id="fooAspect" class="com.foo.aop.FooAspect" /> <aop:aspectj-autoproxy> <aop:include name="fooAspect" /> </aop:aspectj-autoproxy> </beans> Controller: @Controller public class FooController { @RequestMapping(value="/index.htm", method=RequestMethod.GET) public String showIndex(Model model){ return "index"; } } Aspect: @Aspect public class FooAspect { @Pointcut("@target(org.springframework.stereotype.Controller)") public void controllerPointcutter() {} @Pointcut("execution(* *(..))") public void methodPointcutter() {} @Before("controllerPointcutter()") public void beforeMethodInController(JoinPoint jp){ System.out.println("### before controller call..."); } @AfterReturning("controllerPointcutter() && methodPointcutter() ") public void afterMethodInController(JoinPoin jp) { System.out.println("### after returning..."); } @Before("methodPointcutter()") public void beforeAnyMethod(JoinPoint jp){ System.out.println("### before any call..."); } } The beforeAnyMethod() works for methods NOT in a controller; I cannot get anything to execute on calls to controllers. Am I missing something?

    Read the article

  • Spring.net customer namespace parser

    - by ListenToRick
    I have a customer parser which looks like this: [NamespaceParser( Namespace = "http://mysite/schema/cache", SchemaLocationAssemblyHint = typeof(CacheNamespaceParser ), SchemaLocation = "/cache.xsd" ) ] public class CacheNamespaceParser : NamespaceParserSupport { public override void Init() { RegisterObjectDefinitionParser("cache", new CacheParser ()); } } public class CacheParser : AbstractSimpleObjectDefinitionParser { protected override Type GetObjectType(XmlElement element) { return typeof(CacheDefinition); } protected override void DoParse(XmlElement element, ObjectDefinitionBuilder builder) { } protected override bool ShouldGenerateIdAsFallback { get { return true; } } } in the web config i have the following configuration.... <spring> <parsers> <parser type="Spring.Data.Config.DatabaseNamespaceParser, Spring.Data"/> <parser type="App.Web.CacheNamespaceParser, WebApp" /> </parsers> When I run the project I get the following error: An error occurred creating the configuration section handler for spring/parsers: Invalid resource name. Name has to be in 'assembly:<assemblyName>/<namespace>/<resourceName>' format. I put a break point in the CacheNamespaceParser init method and it is called. If I remove from the web config all is well! Any ideas whats wrong

    Read the article

  • Translate report data export from RUEI into HTML for import into OpenOffice Calc Spreadsheets

    - by [email protected]
    A common question of users is, How to import the data from the automated data export of Real User Experience Insight (RUEI) into tools for archiving, dashboarding or combination with other sets of data.XML is well-suited for such a translation via the companion Extensible Stylesheet Language Transformations (XSLT). Basically XSLT utilizes XSL, a template on what to read from your input XML data file and where to place it into the target document. The target document can be anything you like, i.e. XHTML, CSV, or even a OpenOffice Spreadsheet, etc. as long as it is a plain text format.XML 2 OpenOffice.org SpreadsheetFor the XSLT to work as an OpenOffice.org Calc Import Filter:How to add an XML Import Filter to OpenOffice CalcStart OpenOffice.org Calc andselect Tools > XML Filter SettingsNew...Fill in the details as follows:Filter name: RUEI Import filterApplication: OpenOffice.org Calc (.ods)Name of file type: Oracle Real User Experience InsightFile extension: xmlSwitch to the transformation tab and enter/select the following leaving the rest untouchedXSLT for import: ruei_report_data_import_filter.xslPlease see at the end of this blog post for a download of the referenced file.Select RUEI Import filter from list and Test XSLTClick on Browse to selectTransform file: export.php.xmlOpenOffice.org Calc will transform and load the XML file you retrieved from RUEI in a human-readable format.You can now select File > Open... and change the filetype to open your RUEI exports directly in OpenOffice.org Calc, just like any other a native Spreadsheet format.Files of type: Oracle Real User Experience Insight (*.xml)File name: export.php.xml XML 2 XHTMLMost XML-powered browsers provides for inherent XSL Transformation capabilities, you only have to reference the XSLT Stylesheet in the head of your XML file. Then open the file in your favourite Web Browser, Firefox, Opera, Safari or Internet Explorer alike.<?xml version="1.0" encoding="ISO-8859-1"?><!-- inserted line below --> <?xml-stylesheet type="text/xsl" href="ruei_report_data_export_2_xhtml.xsl"?><!-- inserted line above --><report>You can find a patched example export from RUEI plus the above referenced XSL-Stylesheets here: export.php.xml - Example report data export from RUEI ruei_report_data_export_2_xhtml.xsl - RUEI to XHTML XSL Transformation Stylesheetruei_report_data_import_filter.xsl - OpenOffice.org XML import filter for RUEI report export data If you would like to do things like this on the command line you can use either Xalan or xsltproc.The basic command syntax for xsltproc is very simple:xsltproc -o output.file stylesheet.xslt inputfile.xmlYou can use this with the above two stylesheets to translate RUEI Data Exports into XHTML and/or OpenOffice.org Calc ODS-Format. Or you could write your own XSLT to transform into Comma separated Value lists.Please let me know what you think or do with this information in the comments below.Kind regards,Stefan ThiemeReferences used:OpenOffice XML Filter - Create XSLT filters for import and export - http://user.services.openoffice.org/en/forum/viewtopic.php?f=45&t=3490SUN OpenOffice.org XML File Format 1.0 - http://xml.openoffice.org/xml_specification.pdf

    Read the article

  • Data Auditor by Example

    - by Jinjin.Wang
    OWB has a node Data Auditors under Oracle Module in Projects Navigator. What is data auditor and how to use it? I will give an introduction to data auditor and show its usage by examples. Data auditor is an important tool in ensuring that data quality levels meet business requirements. Data auditor validates data against a set of data rules to determine which records comply and which do not. It gathers statistical metrics on how well the data in a system complies with a rule by auditing and marking how many errors are occurring against the audited table. Data auditors are typically scheduled for regular execution as part of a process flow, to monitor the quality of the data in an operational environment such as a data warehouse or ERP system, either immediately after updates like data loads, or at regular intervals. How to use data auditor to monitor data quality? Only objects with data rules can be monitored, so the first step is to define data rules according to business requirements and apply them to the objects you want to monitor. The objects can be tables, views, materialized views, and external tables. Secondly create a data auditor containing the objects. You can configure the data auditor and set physical deployment parameters for it as optional, which will be used while running the data auditor. Then deploy and run the data auditor either manually or as part of the process flow. After execution, the data auditor sets several output values, and records that are identified as not complying with the defined data rules contained in the data auditor are written to error tables. Here is an example. We have two tables DEPARTMENTS and EMPLOYEES (see pic-1 and pic-2. Click here for DDL and data) imported into OWB. We want to gather statistical metrics on how well data in these two tables satisfies the following requirements: a. Values of the EMPLOYEES.EMPLOYEE_ID attribute are three-digit numbers. b. Valid values for EMPLOYEES.JOB_ID are IT_PROG, SA_REP, SH_CLERK, PU_CLERK, and ST_CLERK. c. EMPLOYEES.EMPLOYEE_ID is related to DEPARTMENTS.MANAGER_ID. Pic-1 EMPLOYEES Pic-2 DEPARTMENTS 1. To determine legal data within EMPLOYEES or legal relationships between data in different columns of the two tables, firstly we define data rules based on the three requirements and apply them to tables. a. The first requirement is about patterns that an attribute is allowed to conform to. We create a Domain Pattern List data rule EMPLOYEE_PATTERN_RULE here. The pattern is defined in the Oracle Database regular expression syntax as ^([0-9]{3})$ Apply data rule EMPLOYEE_PATTERN_RULE to table EMPLOYEES.

    Read the article

  • java.lang.NullPointerException exception in my controller file (using Spring Hibernate Maven)

    - by mrjayviper
    The problem doesn't seemed to have anything to do with Hibernate. As I've commented the Hibernate stuff but I'm still getting it. If I comment out this line message = staffDAO.searchForStaff(search); in my controller file, it goes through ok. But I don't see anything wrong with searchForStaff function. It's a very simple function that just returns the string "test" and run system.out.println("test"). Can you please help? thanks But this is the error that I'm getting: SEVERE: Servlet.service() for servlet [spring] in context with path [/directorymaven] threw exception [Request processing failed; nested exception is java.lang.NullPointerException] with root cause java.lang.NullPointerException at org.flinders.staffdirectory.controllers.SearchController.showSearchResults(SearchController.java:25) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.springframework.web.method.support.InvocableHandlerMethod.invoke(InvocableHandlerMethod.java:219) at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:132) at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:100) at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:604) at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:565) at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:80) at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:923) at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:852) at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:882) at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:778) at javax.servlet.http.HttpServlet.service(HttpServlet.java:621) at javax.servlet.http.HttpServlet.service(HttpServlet.java:728) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123) at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:931) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407) at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1004) at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589) at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) My spring-servlet xml <?xml version="1.0" encoding="UTF-8"?> <beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:context="http://www.springframework.org/schema/context" xmlns:mvc="http://www.springframework.org/schema/mvc" xmlns:p="http://www.springframework.org/schema/p" xmlns:tx="http://www.springframework.org/schema/tx" xsi:schemaLocation=" http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd http://www.springframework.org/schema/mvc http://www.springframework.org/schema/mvc/spring-mvc.xsd http://www.springframework.org/schema/tx http://www.springframework.org/schema/tx/spring-tx.xsd"> <context:component-scan base-package="org.flinders.staffdirectory.controllers" /> <mvc:annotation-driven /> <mvc:resources mapping="/resources/**" location="/resources/" /> <tx:annotation-driven /> <bean id="propertyConfigurer" class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer" p:location="/WEB-INF/spring.properties" /> <bean id="dataSource" class="org.apache.commons.dbcp.BasicDataSource" destroy-method="close" p:driverClassName="${jdbc.driverClassName}" p:url="${jdbc.databaseurl}" p:username="${jdbc.username}" p:password="${jdbc.password}" /> <bean id="sessionFactory" class="org.springframework.orm.hibernate4.LocalSessionFactoryBean" p:dataSource-ref="dataSource" p:configLocation="${hibernate.config}" p:packagesToScan="org.flinders.staffdirectory"/> <bean id="transactionManager" class="org.springframework.orm.hibernate4.HibernateTransactionManager" p:sessionFactory-ref="sessionFactory" /> <bean id="viewResolver" class="org.springframework.web.servlet.view.UrlBasedViewResolver" p:viewClass="org.springframework.web.servlet.view.tiles2.TilesView" /> <bean id="tilesConfigurer" class="org.springframework.web.servlet.view.tiles2.TilesConfigurer" p:definitions="/WEB-INF/tiles.xml" /> <bean id="staffDAO" class="org.flinders.staffdirectory.dao.StaffDAO" p:sessionFactory-ref="sessionFactory" /> <!-- <bean id="staffService" class="org.flinders.staffdirectory.services.StaffServiceImpl" p:staffDAO-ref="staffDAO" />--> </beans> This is my controller file package org.flinders.staffdirectory.controllers; import java.util.List; //import org.flinders.staffdirectory.models.database.SearchResult; import org.flinders.staffdirectory.models.misc.Search; import org.flinders.staffdirectory.dao.StaffDAO; //import org.springframework.beans.factory.annotation.Autowired; import org.springframework.stereotype.Controller; import org.springframework.web.bind.annotation.ModelAttribute; import org.springframework.web.bind.annotation.RequestMapping; import org.springframework.web.servlet.ModelAndView; @Controller public class SearchController { //@Autowired private StaffDAO staffDAO; private String message; @RequestMapping("/SearchStaff") public ModelAndView showSearchResults(@ModelAttribute Search search) { //List<SearchResult> searchResults = message = staffDAO.searchForStaff(search); //System.out.println(search.getSurname()); return new ModelAndView("search/SearchForm", "Search", new Search()); //return new ModelAndView("search/SearchResults", "searchResults", searchResults); } @RequestMapping("/SearchForm") public ModelAndView showSearchForm() { return new ModelAndView("search/SearchForm", "search", new Search()); } } my dao class package org.flinders.staffdirectory.dao; import java.util.List; import org.hibernate.SessionFactory; //import org.springframework.beans.factory.annotation.Autowired; import org.flinders.staffdirectory.models.database.SearchResult; import org.flinders.staffdirectory.models.misc.Search; public class StaffDAO { //@Autowired private SessionFactory sessionFactory; public void setSessionFactory(SessionFactory sessionFactory) { this.sessionFactory = sessionFactory; } public String searchForStaff(Search search) { /*String SQL = "select distinct telsumm_id as id, telsumm_parent_id as parentId, telsumm_name_title as title, (case when substr(telsumm_surname, length(telsumm_surname) - 1, 1) = ',' then substr(telsumm_surname, 1, length(telsumm_surname) - 1) else telsumm_surname end) as surname, telsumm_preferred_name as firstname, nvl(telsumm_tele_number, '-') as telephoneNumber, nvl(telsumm_role, '-') as role, telsumm_display_department as department, lower(telsumm_entity_type) as entityType from teldirt.teld_summary where (telsumm_search_surname is not null) and not (nvl(telsumm_tele_directory,'xxx') IN ('N','S','D')) and not (telsumm_tele_number IS NULL AND telsumm_alias IS NULL) and (telsumm_alias_list = 'Y' OR (telsumm_tele_directory IN ('A','B'))) and ((nvl(telsumm_system_id_end,sysdate+1) > SYSDATE and telsumm_entity_type = 'P') or (telsumm_entity_type = 'N')) and (telsumm_search_department NOT like 'SPONSOR%')"; if (search.getSurname().length() > 0) { SQL += " and (telsumm_search_surname like '" + search.getSurname().toUpperCase() + "%')"; } if (search.getSurnameLike().length() > 0) { SQL += " and (telsumm_search_soundex like soundex(('%" + search.getSurnameLike().toUpperCase() + "%'))"; } if (search.getFirstname().length() > 0) { SQL += " and (telsumm_search_preferred_name like '" + search.getFirstname().toUpperCase() + "%' or telsumm_search_first_name like '" + search.getFirstname() + "%')"; } if (search.getTelephoneNumber().length() > 0) { SQL += " and (telsumm_tele_number like '" + search.getTelephoneNumber() + "%')"; } if (search.getDepartment().length() > 0) { SQL += " and (telsumm_search_department like '" + search.getDepartment().toUpperCase() + "%')"; } if (search.getRole().length() > 0) { SQL += " and (telsumm_search_role like '" + search.getRole().toUpperCase() + "%')"; } SQL += " order by surname, firstname"; List<Object[]> list = (List<Object[]>) sessionFactory.getCurrentSession().createQuery(SQL).list(); for(int j=0;j<list.size();j++){ Object [] obj= (Object[])list.get(j); for(int i=0;i<obj.length;i++) System.out.println(obj[i]); }*/ System.out.println("test"); return "test"; } }

    Read the article

  • Big Data – Buzz Words: What is Hadoop – Day 6 of 21

    - by Pinal Dave
    In yesterday’s blog post we learned what is NoSQL. In this article we will take a quick look at one of the four most important buzz words which goes around Big Data – Hadoop. What is Hadoop? Apache Hadoop is an open-source, free and Java based software framework offers a powerful distributed platform to store and manage Big Data. It is licensed under an Apache V2 license. It runs applications on large clusters of commodity hardware and it processes thousands of terabytes of data on thousands of the nodes. Hadoop is inspired from Google’s MapReduce and Google File System (GFS) papers. The major advantage of Hadoop framework is that it provides reliability and high availability. What are the core components of Hadoop? There are two major components of the Hadoop framework and both fo them does two of the important task for it. Hadoop MapReduce is the method to split a larger data problem into smaller chunk and distribute it to many different commodity servers. Each server have their own set of resources and they have processed them locally. Once the commodity server has processed the data they send it back collectively to main server. This is effectively a process where we process large data effectively and efficiently. (We will understand this in tomorrow’s blog post). Hadoop Distributed File System (HDFS) is a virtual file system. There is a big difference between any other file system and Hadoop. When we move a file on HDFS, it is automatically split into many small pieces. These small chunks of the file are replicated and stored on other servers (usually 3) for the fault tolerance or high availability. (We will understand this in the day after tomorrow’s blog post). Besides above two core components Hadoop project also contains following modules as well. Hadoop Common: Common utilities for the other Hadoop modules Hadoop Yarn: A framework for job scheduling and cluster resource management There are a few other projects (like Pig, Hive) related to above Hadoop as well which we will gradually explore in later blog posts. A Multi-node Hadoop Cluster Architecture Now let us quickly see the architecture of the a multi-node Hadoop cluster. A small Hadoop cluster includes a single master node and multiple worker or slave node. As discussed earlier, the entire cluster contains two layers. One of the layer of MapReduce Layer and another is of HDFC Layer. Each of these layer have its own relevant component. The master node consists of a JobTracker, TaskTracker, NameNode and DataNode. A slave or worker node consists of a DataNode and TaskTracker. It is also possible that slave node or worker node is only data or compute node. The matter of the fact that is the key feature of the Hadoop. In this introductory blog post we will stop here while describing the architecture of Hadoop. In a future blog post of this 31 day series we will explore various components of Hadoop Architecture in Detail. Why Use Hadoop? There are many advantages of using Hadoop. Let me quickly list them over here: Robust and Scalable – We can add new nodes as needed as well modify them. Affordable and Cost Effective – We do not need any special hardware for running Hadoop. We can just use commodity server. Adaptive and Flexible – Hadoop is built keeping in mind that it will handle structured and unstructured data. Highly Available and Fault Tolerant – When a node fails, the Hadoop framework automatically fails over to another node. Why Hadoop is named as Hadoop? In year 2005 Hadoop was created by Doug Cutting and Mike Cafarella while working at Yahoo. Doug Cutting named Hadoop after his son’s toy elephant. Tomorrow In tomorrow’s blog post we will discuss Buzz Word – MapReduce. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

< Previous Page | 25 26 27 28 29 30 31 32 33 34 35 36  | Next Page >