Search Results

Search found 69357 results on 2775 pages for 'data oriented design'.

  • Object Oriented Design Questions

    - by Robert
    Hello there. I am going to develop a Tic-Tac-Toe game in Java (or maybe another OO language), and I have a picture in my mind of the general design. Interface: Player, so that I can implement several Player classes depending on how I want the opponent to behave, for example a random player or an intelligent player. Classes: a Board class holding a two-dimensional array of integers, where 0 indicates an open square, 1 indicates me, and -1 indicates the opponent; the evaluation function will live here as well, returning the next best move based on the current board arrangement and whose turn it is. A Referee class will create the Board instance and the two Player instances, then start the game. This is a rough idea of my OO design; could anybody offer critiques? I find that really beneficial, thank you very much.
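
    One way to read that design in code, as a minimal hedged sketch in Java (all names and bodies are illustrative, not from the original post):

        interface Player {
            //Return the chosen square as {row, col} for the given side (1 = me, -1 = opponent).
            int[] nextMove(Board board, int side);
        }

        class Board {
            final int[][] cells = new int[3][3];   //0 = open, 1 = me, -1 = opponent

            boolean isOpen(int row, int col) { return cells[row][col] == 0; }

            //The evaluation function from the question would live here,
            //returning the next best move for the current arrangement and turn.
        }

        class Referee {
            void play(Player a, Player b) {
                Board board = new Board();
                //Alternate turns between a and b, apply moves, detect a win or draw.
            }
        }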

  • Repeated properties design pattern

    - by Mark
    I have a DownloadManager class that manages multiple DownloadItem objects. Each DownloadItem has events like ProgressChanged and DownloadCompleted. Usually you want to use the same event handler for all download items, so it's a bit annoying to have to set the event handlers over and over again for each DownloadItem. Thus, I need to decide which pattern to use:

    1. Use one DownloadItem as a template and clone it as necessary:

        var dm = new DownloadManager();
        var di = new DownloadItem();
        di.ProgressChanged += new DownloadProgressChangedEventHandler(di_ProgressChanged);
        di.DownloadCompleted += new DownloadProgressChangedEventHandler(di_DownloadCompleted);

        DownloadItem newDi;
        newDi = di.Clone();
        newDi.Uri = "http://google.com";
        dm.Enqueue(newDi);
        newDi = di.Clone();
        newDi.Uri = "http://yahoo.com";
        dm.Enqueue(newDi);

    2. Set the event handlers on the DownloadManager instead and have it copy the events over to each DownloadItem that is enqueued:

        var dm = new DownloadManager();
        dm.ProgressChanged += new DownloadProgressChangedEventHandler(di_ProgressChanged);
        dm.DownloadCompleted += new DownloadProgressChangedEventHandler(di_DownloadCompleted);
        dm.Enqueue(new DownloadItem("http://google.com"));
        dm.Enqueue(new DownloadItem("http://yahoo.com"));

    3. Or use some kind of factory:

        var dm = new DownloadManager();
        var dif = new DownloadItemFactory();
        dif.ProgressChanged += new DownloadProgressChangedEventHandler(di_ProgressChanged);
        dif.DownloadCompleted += new DownloadProgressChangedEventHandler(di_DownloadCompleted);
        dm.Enqueue(dif.Create("http://google.com"));
        dm.Enqueue(dif.Create("http://yahoo.com"));

    What would you recommend?

  • RESTful Question/Answer design?

    - by Kirschstein
    This is a toy project I'm working on at the moment. My app contains questions with multiple-choice answers. The question URL is in the following format, with GET & POST mapping to different actions on the questions controller:

        GET:  url.com/questions/:category/:difficulty => 'ask'
        POST: url.com/questions/:category/:difficulty => 'answer'

    I'm wondering if it's worth redesigning this into a RESTful style. I know I'd need to introduce answers as a resource, but I'm struggling to think of a URL that would look natural for answering a question. Would a redesign be worthwhile? How would you go about structuring the URLs?
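
    As one hedged illustration (not from the original post) of treating answers as a nested resource, in the question's own URL notation:

        GET:  url.com/questions/:category/:difficulty          => 'show' (present the question)
        POST: url.com/questions/:category/:difficulty/answers  => 'create' (submit an answer)

    Making the answer its own resource keeps POST creating something addressable, which is the usual RESTful reading.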

  • Design Question on when to save

    - by Ben
    Hi, I was just after people's opinions on when the best time to save an object (or collection of objects) is. I appreciate that it can be completely dependent on the situation you are in, but here is mine. I have a collection of objects "MyCollection" in a grid. You can open each object "MyObject" in an editor dialogue by double-clicking on the grid. Selecting "Cancel" on the dialogue will back out any changes you have made, but should selecting "OK" commit those changes back to the database, or should it commit the changes on that object back to the collection, with a save method that iterates through the collection and saves all changed objects? If I had an object "MyParentObject" that contains a collection of children "MyChildObjectCollection", none of the changes made to each "MyChildObject" would be committed to the database until the "MyParentObject" was saved; this makes sense. However, in my current situation none of the objects in the collection are linked, so should the "OK" on the dialogue commit the changes to the database? I appreciate any opinions on this. Thanks
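
    For the batch-save variant, a minimal hedged sketch (Java-flavoured; names are illustrative) of tracking which objects changed so a single save pass touches only those:

        import java.util.ArrayList;
        import java.util.List;

        class MyObject {
            private boolean dirty;
            void markDirty() { dirty = true; }        //called by the editor dialogue on OK
            boolean isDirty() { return dirty; }
            void save() { /* write to DB */ dirty = false; }
        }

        class MyCollection {
            private final List<MyObject> items = new ArrayList<>();

            //A grid-level Save iterates the collection and persists only edited objects.
            void saveAll() {
                for (MyObject o : items) {
                    if (o.isDirty()) o.save();
                }
            }
        }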

  • User account design and security...

    - by espinet
    Before I begin, I am using Ruby on Rails and the Devise gem for user authentication. Hi, I was doing some research about account security and found a blog post on the topic a while ago, but I can no longer find it. I read that when building a login system you should have one model for User, containing the username, encrypted password, and email, and a separate model for the user's Account, containing everything else; a User has an Account. I don't know if I'm explaining this correctly, since I haven't seen the blog post for several months and I lost my bookmark. Could someone explain how and why I should or shouldn't do this? My application deals with money, so I need to cover my bases with security. Thanks.

  • Schema design: many to many plus additional one to many

    - by chrisj
    Hi, I have this scenario and I'm not sure exactly how it should be modeled in the database. The objects I'm trying to model are: teams, players, the team-player membership, and a list of fees due for each player on a given team. So the fees depend on both the team and the player. My current approach is the following:

        teams
            id
            name

        players
            id
            name

        team_players
            id
            player_id
            team_id

        team_player_fees
            id
            team_players_id
            amount
            send_reminder_on

    (Schema layout ERD image omitted.)

    In this schema, team_players is the junction table for teams and players, and the table team_player_fees has records that belong to records of the junction table. For example, playerA is on teamA and has fees of $10 and $20 due in Aug and Feb. PlayerA is also on teamB and has fees of $25 and $25 due in May and June. Each player/team combination can have a different set of fees. Questions:

    - Are there better ways to handle such a scenario?
    - Is there a term for this type of relationship? (so I can google it)
    - Do you know of any references with similar structures?

  • Column-oriented DBMS and JOIN operations

    - by André
    From some of the research I've done on NoSQL, column-oriented databases (like HBase or Cassandra) seem to solve the problem of costly JOIN operations, but I don't get how this approach solves this problem. Can anyone explain it to me and/or link me to interesting documentation regarding this area? Thanks

  • Project design / FS layout for large django projects

    - by rcreswick
    What is the best way to lay out a large Django project? The tutorials provide simple instructions for setting up apps, models, and views, but there is less information about how apps and projects should be broken down, how much sharing is allowable/necessary between apps in a typical project (obviously that is largely dependent on the project), and how/where general templates should be kept. Does anyone have examples, suggestions, and explanations as to why a certain project layout is better than another? I am particularly interested in the incorporation of large numbers of unit tests (2-5x the size of the actual code base) and string externalization / templates.
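
    For reference, one common (by no means mandated) convention looks roughly like this; the names are purely illustrative:

        myproject/
            settings.py
            urls.py
            templates/            # project-wide templates
            apps/
                blog/
                    models.py
                    views.py
                    tests/        # tests as a package so large suites stay organized
                accounts/
                    ...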

  • General ORM design question

    - by Calvin
    Suppose you have 2 classes, Person and Rabbit. A person can do a number of things to a rabbit: s/he can either feed it, buy it and become its owner, or give it away. A rabbit can have no owner or at most 1 owner at a time. And if it is not fed for a while, it may die.

        class Person {
            void feed(Rabbit r) { /* ... */ }
            void buy(Rabbit r) { /* ... */ }
            void giveAway(Person p, Rabbit r) { /* ... */ }
            Rabbit[] rabbits;
        }

        class Rabbit {
            boolean isAlive() { /* ... */ }
            Person owner;
        }

    There are a couple of observations from the domain model:

    - Person and Rabbit can have references to each other.
    - Any action on one object can also change the state of the other object.
    - Even if no explicit actions are invoked, there can still be a change of state in the objects (e.g. a Rabbit can starve to death, which causes it to be removed from the Person.rabbits array).

    As far as DDD is concerned, I think the correct approach is to synchronize all calls that may change the states in the domain model. For instance, if a Person buys a Rabbit, s/he would need to acquire a lock on the Person to make a change to the rabbits array AND another lock on the Rabbit to change its owner before releasing the first one. This would prevent a race condition where 2 Persons claim to be the owner of the little Rabbit.

    The other approach is to let the database handle all these synchronizations. Whoever makes the first call wins, but then the DB needs to have some kind of business logic to figure out whether the transaction is valid (e.g. if a Rabbit already has an owner, it cannot change its owner unless the Person gives it away). There are pros and cons to either approach, and I'd expect the "best" solution to be somewhere in between. How would you do it in real life? What's your take and experience? Also, is it a valid concern that there can be another race condition in the window where the domain model has committed its change but it is not yet fully committed in the database? And for the 3rd observation (i.e. state change due to the time factor), how would you handle it?
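
    On the locking approach, a minimal hedged sketch (Java; illustrative, and assuming rabbits is a collection rather than a raw array) of taking both locks in a fixed order so two concurrent buyers cannot deadlock or both claim the rabbit:

        //Lock discipline: always Person before Rabbit (and by id when two
        //objects of the same type are involved) to avoid deadlock.
        void buy(Person buyer, Rabbit rabbit) {
            synchronized (buyer) {
                synchronized (rabbit) {
                    if (rabbit.owner == null) {   //the loser of the race sees an owner and backs off
                        rabbit.owner = buyer;
                        buyer.rabbits.add(rabbit);
                    }
                }
            }
        }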

  • How to better design it ???

    - by Deepak
    public interface IBasePresenter { }
    public interface IJobViewPresenter : IBasePresenter { }
    public interface IActivityViewPresenter : IBasePresenter { }

    public class BaseView {
        public IBasePresenter Presenter { get; set; }
    }

    public class JobView : BaseView {
        public IJobViewPresenter JobViewPresenter {
            get { return this.Presenter as IJobViewPresenter; }
        }
    }

    public class ActivityView : BaseView {
        public IActivityViewPresenter ActivityViewPresenter {
            get { return this.Presenter as IActivityViewPresenter; }
        }
    }

    Let's assume that I need an IBasePresenter property on BaseView. This property is inherited by JobView and ActivityView, but if I need a reference to the IJobViewPresenter object in those derived classes then I either have to cast the IBasePresenter property to IJobViewPresenter or IActivityViewPresenter (which I want to avoid) or create JobViewPresenter and ActivityViewPresenter properties on the derived classes (as shown above). I want to avoid casting in the derived classes and still have a reference to IJobViewPresenter or IActivityViewPresenter, while keeping IBasePresenter on BaseView. Is there a way I can achieve this?
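
    One way this is often approached is with a generic base class, so each view names its presenter type once and no cast is needed. A hedged sketch of the idea in Java (the question's code is C#, where a BaseView<TPresenter> would play the same role):

        interface BasePresenter { }
        interface JobViewPresenter extends BasePresenter { }

        class BaseView<P extends BasePresenter> {
            private P presenter;
            public P getPresenter() { return presenter; }
            public void setPresenter(P presenter) { this.presenter = presenter; }
        }

        //No cast: getPresenter() is statically typed as JobViewPresenter.
        class JobView extends BaseView<JobViewPresenter> { }

    The trade-off is that code holding a plain BaseView reference now needs a type argument (or a wildcard) to use it.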

  • Exploring the Factory Design Pattern

    - by asksuperuser
    There was an article here: http://msdn.microsoft.com/en-us/library/Ee817667%28pandp.10%29.aspx The first part of the tutorial implements this pattern with abstract classes; the second part shows an example with an interface. But nothing in the article discusses why this pattern would use an abstract class rather than an interface. So what explanation (advantages of one over the other) would you give? Not in general, but for this precise pattern.

  • Domain-Driven-Design question

    - by Michael
    Hello everyone, I have a question about DDD. I'm building an application to learn DDD, and I have a question about layering. The application works like this: UI layer calls => Application Layer => Domain Layer => Database. Here is a small example of how the code looks:

        //**************** UI LAYER ************************

        //Uses IoC to get the service from the factory.
        //This factory would be in MyApp.Infrastructure.dll.
        IImplementationFactory factory = new ImplementationFactory();

        //The interface and implementation for the shopping cart service
        //would be in MyApp.ApplicationLayer.dll.
        IShoppingCartService service = factory.GetImplementationFactory<IShoppingCartService>();

        //This is the UI layer calling into the Application Layer
        //to get the shopping cart for a user.
        //The interface for IShoppingCart would be in MyApp.ApplicationLayer.dll
        //and the implementation for IShoppingCart would be in MyApp.Model.
        IShoppingCart shoppingCart = service.GetShoppingCartByUserName(userName);

        //Show shopping cart information.
        //For example, items bought, price, taxes, etc.
        ...

        //The Purchase button was pressed, so this is the event handler for it.
        //Uses IoC to get the service from the factory again.
        IImplementationFactory factory = new ImplementationFactory();
        IShoppingCartService service = factory.GetImplementationFactory<IShoppingCartService>();
        service.Purchase(shoppingCart);

        //********************** Application Layer **********************

        public class ShoppingCartService : IShoppingCartService
        {
            public IShoppingCart GetShoppingCartByUserName(string userName)
            {
                //Uses IoC to get the repository from the factory.
                //This factory would be in MyApp.Infrastructure.dll.
                IImplementationFactory factory = new ImplementationFactory();

                //The interface for the repository would be in MyApp.Infrastructure.dll,
                //but the implementation would be in MyApp.Model.dll.
                IShoppingCartRepository repository = factory.GetImplementationFactory<IShoppingCartRepository>();
                IShoppingCart shoppingCart = repository.GetShoppingCartByUserName(userName);

                //Do shopping cart logic like calculating taxes and such.
                //I would put these in services, but I am not sure.
                ...

                return shoppingCart;
            }

            public void Purchase(IShoppingCart shoppingCart)
            {
                //Do purchase logic, calling out to the repository.
                ...
            }
        }

    I seem to have put most of my business rules in services rather than in the models, and I'm not sure if this is correct. Also, I'm not completely sure if I have the layering right: do I have the right pieces in the correct places? Should my models ever leave the domain layer? In general, am I doing this correctly according to DDD? Thanks!

  • Database design: one huge table or separate tables?

    - by littlegreen
    Currently I am designing a database for use in our company. We are using SQL Server 2008. The database will hold data gathered from several customers. The goal of the database is to acquire aggregate benchmark numbers over several customers. Recently, I have become worried by the fact that one table in particular will be getting very big. Each customer has approximately 20,000,000 rows of data, and there will soon be 30 customers in the database (if not more). A lot of queries will be done on this table. I am already noticing performance issues and users being temporarily locked out. My question: will we be able to handle this table in the future, or is it better to split it up into smaller tables, one per customer?

  • Looking for an appropriate design pattern

    - by user1066015
    I have a game that tracks user stats after every match, such as how far they travelled, how many times they attacked, how far they fell, etc., and my current implementation looks somewhat as follows (simplified version):

        import java.util.HashMap;

        class Player {
            private final int id;

            public Player() {
                id = (int) (Math.random() * 100000);
                PlayerData.players.put(id, new PlayerData());
            }

            public int getId() { return id; }

            public void jump() {
                //Logic to make the user jump
                //...
                //call the PlayerManager
                PlayerManager.jump(this);
            }

            public void attack(Player target) {
                //logic to attack the player
                //...
                //call the PlayerManager
                PlayerManager.attack(this, target);
            }
        }

        class PlayerData {
            public static HashMap<Integer, PlayerData> players = new HashMap<Integer, PlayerData>();

            int timesJumped;
            int timesAttacked;

            public void incrementJumped() { timesJumped++; }
            public void incrementAttacked() { timesAttacked++; }
        }

        class PlayerManager {
            public static void jump(Player player) {
                PlayerData.players.get(player.getId()).incrementJumped();
            }

            public static void attack(Player player, Player target) {
                PlayerData.players.get(player.getId()).incrementAttacked();
            }
        }

    So I have a PlayerData class which holds all of the statistics and keeps them out of the Player class, because they aren't part of the player logic. Then I have PlayerManager, which would be on the server and controls the interactions between players (a lot of the logic that does that is excluded to keep this simple). I put the calls to the PlayerData class in the manager class because sometimes you have to do certain checks between players; for instance, if the attack actually hits, then you increment "attackHits". The main problem (in my opinion, correct me if I'm wrong) is that this is not very extensible. I will have to touch the PlayerData class if I want to keep track of a new stat, by adding methods and fields, and then I potentially have to add more methods to my PlayerManager, so it isn't very modular. If there is an improvement to this that you would recommend, I would be very appreciative. Thanks.
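
    One hedged way to make the stats side more extensible (an illustrative sketch, not from the original post): replace the per-stat fields and increment methods with a single counter map keyed by stat name, so adding a new stat requires no new code:

        import java.util.HashMap;
        import java.util.Map;

        class PlayerData {
            private final Map<String, Integer> counters = new HashMap<>();

            //One generic mutator instead of incrementJumped(), incrementAttacked(), ...
            void increment(String stat) {
                counters.merge(stat, 1, Integer::sum);
            }

            int get(String stat) {
                return counters.getOrDefault(stat, 0);
            }
        }

    PlayerManager then just calls data.increment("jumped") or data.increment("attackHits"), and a new stat is a matter of choosing a new key.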

  • design an extendible and pluggable business logic flow handler in php

    - by Broncha
    I am working on a project where I need to allow a pluggable way to inject business processes into the normal data flow, e.g. in an ordering system. The standard flow of the application is:

    1. A consumer orders an item.
    2. The consumer pays for it and the card is authorized.
    3. An admin captures the payment.
    4. The order is marked as complete and the item is shipped.

    But this process may vary (with extra steps in between) for different clients. Say a client needs to validate the location of the consumer before the consumer is presented with a credit card form, or the client's policies require some other processes in between. I am thinking of using the State pattern for processing orders: saving the current state of the order in the database, and initializing the state of the order from the saved state. I would also need some mechanism by which a small plugin could inject business-specific states into the state machine. Am I thinking the right way? Are there already implemented patterns for this kind of situation? I am working with CodeIgniter, and basically this would mean redirecting to the proper controller according to the current state of the order. For example, if the state of the order is unconfirmed, then redirect the user to the details page and change the state to pending. If a client needs to do some validation, then register an intermediate state between unconfirmed and pending. Please suggest.
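
    A hedged sketch of that pluggable state machine (in Java for brevity, though the question's context is PHP/CodeIgniter; all names are illustrative):

        import java.util.HashMap;
        import java.util.Map;

        class Order {
            private String state = "unconfirmed";
            String getState() { return state; }
            void setState(String state) { this.state = state; }
        }

        interface OrderState {
            String name();
            //Process this step and return the name of the next state.
            String handle(Order order);
        }

        class OrderStateMachine {
            private final Map<String, OrderState> states = new HashMap<>();

            //Plugins call this to inject business-specific states,
            //e.g. a location-validation state between "unconfirmed" and "pending".
            void register(OrderState state) { states.put(state.name(), state); }

            void step(Order order) {
                order.setState(states.get(order.getState()).handle(order));
                //persist order.getState() to the database here
            }
        }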

  • Data Governance 2010 Conference in San Diego

    - by Tony Ouk
    The Data Governance Annual Conference is one of the world's most authoritative and vendor-neutral events on Data Governance and Data Quality. The conference will focus on the "how-tos", from starting a data governance and stewardship program to attaining data governance maturity, with specific topics on MDM. This year's event will be hosted June 7 through June 10 in San Diego, California. For more information, including registration details, visit the Data Governance 2010 Conference website.

  • How to search for newline or linebreak characters in Excel?

    - by Highly Irregular
    I've imported some data into Excel (from a text file) and it contains some sort of newline characters. It looks like this initially: (screenshot omitted). If I hit F2 (to edit) then Enter (to save changes) on each of the cells with a newline (without actually editing anything), Excel automatically changes the layout to look like this: (screenshot omitted). I don't want these newline characters here, as they mess up data processing further down the track. How can I search for them to detect more of them? The usual search function doesn't accept an Enter character as a search character.
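
    For reference (a hedged note, not part of the original question): in Excel's Find and Replace dialog, pressing Ctrl+J in the "Find what" box typically inserts the linefeed character that in-cell line breaks are made of, which makes these cells searchable. A formula such as =SUBSTITUTE(A1, CHAR(10), "") can also strip them, since CHAR(10) is that same linefeed.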

  • Oracle Data Integration 12c: Simplified, Future-Ready, High-Performance Solutions

    - by Thanos Terentes Printzios
    In today's data-driven business environment, organizations need to cost-effectively manage the ever-growing streams of information originating both inside and outside the firewall and address emerging deployment styles like cloud, big data analytics, and real-time replication. Oracle Data Integration delivers pervasive and continuous access to timely and trusted data across heterogeneous systems.

    Oracle is enhancing its data integration offering by announcing the general availability of the 12c release of its key data integration products: Oracle Data Integrator 12c and Oracle GoldenGate 12c, delivering simplified, high-performance solutions for cloud, big data analytics, and real-time replication. The new release delivers extreme performance, increases IT productivity, and simplifies deployment, while helping IT organizations keep pace with new data-oriented technology trends, including cloud computing, big data analytics, and real-time business intelligence. With the 12c release, Oracle becomes the new leader in data integration and replication technologies, as no other vendor offers such a complete set of data integration capabilities for pervasive, continuous access to trusted data across Oracle platforms as well as third-party systems and applications.

    The Oracle Data Integration 12c release addresses data-driven organizations' critical and evolving data integration requirements under 3 key themes:

    - Future-Ready Solutions: supporting current and emerging initiatives
    - Extreme Performance: even higher performance than ever before
    - Fast Time-to-Value: higher IT productivity and simplified solutions

    With the new capabilities in Oracle Data Integrator 12c, customers can benefit from:

    - Superior developer productivity, ease of use, and rapid time-to-market with the new flow-based mapping model, reusable mappings, and step-by-step debugger.
    - Increased performance when executing data integration processes due to improved parallelism.
    - Improved productivity and monitoring via tighter integration with Oracle GoldenGate 12c and Oracle Enterprise Manager 12c.
    - Improved interoperability with Oracle Warehouse Builder, which enables faster and easier migration to Oracle Data Integrator's strategic data integration offering.
    - Faster implementation of business analytics through Oracle Data Integrator pre-integrated with Oracle BI Applications' latest release. Oracle Data Integrator also integrates simply and easily with Oracle Business Analytics tools, including OBI-EE and Oracle Hyperion.
    - Support for loading and transforming big and fast data, enabled by integration with big data technologies: Hadoop, Hive, HDFS, and Oracle Big Data Appliance.

    Only Oracle GoldenGate provides best-of-breed real-time replication of data in heterogeneous data environments. With the new capabilities in Oracle GoldenGate 12c, customers can benefit from:

    - Simplified setup and management of Oracle GoldenGate 12c when using multiple database delivery processes, via a new Coordinated Delivery feature for non-Oracle databases.
    - Expanded heterogeneity through added support for the latest versions of major databases such as Sybase ASE v15.7, MySQL NDB Clusters 7.2, and MySQL 5.6, as well as integration with Oracle Coherence.
    - Enhanced high availability and data protection via integration with Oracle Data Guard and Fast-Start Failover.
    - Enhanced security for credentials and encryption keys using Oracle Wallet.
    - Real-time replication for databases hosted on public cloud environments supported by third-party clouds.

    Tight integration between Oracle Data Integrator 12c, Oracle GoldenGate 12c, and other Oracle technologies, such as Oracle Database 12c and Oracle Applications, provides a number of benefits for organizations:

    - Tight integration between Oracle Data Integrator 12c and Oracle GoldenGate 12c enables developers to leverage Oracle GoldenGate's low-overhead, real-time change data capture completely within Oracle Data Integrator Studio, without additional training.
    - Integration with Oracle Database 12c provides a strong foundation for seamless private cloud deployments.
    - It delivers real-time data for reporting, zero-downtime migration, and improved performance and availability for Oracle Applications, such as Oracle E-Business Suite and ATG Web Commerce.
    - Oracle's data integration offering is optimized for Oracle Engineered Systems and is an integral part of Oracle's fast data, real-time analytics strategy on Oracle Exadata Database Machine and Oracle Exalytics In-Memory Machine.

    Oracle Data Integrator 12c and Oracle GoldenGate 12c differentiate the new offering with these many new features. This is just a quick glimpse into Oracle Data Integrator 12c and Oracle GoldenGate 12c. Find out much more about the new release in the video webcast "Introducing 12c for Oracle Data Integration", where customer and partner speakers, including SolarWorld, BT, and Rittman Mead, will join us in launching the new release.

    Resource Kits:

    - Meet Oracle Data Integration 12c
    - Discover what's new with Oracle GoldenGate 12c

    The Oracle EMEA DIS (Data Integration Solutions) Partner Community is available for all your questions, and additional partner-focused webcasts will be made available through our blog here, so stay connected. For any questions please contact us at partner.imc-AT-beehiveonline.oracle-DOT-com

    Stay Connected
    Oracle Newsletters

  • Storing large data in HTTP Session (Java Application)

    - by Umesh Awasthi
    I am asking this question in continuation of http-session-or-database-approach. I am planning to follow this approach:

    1. When the user adds a product to the cart, create a Cart model, add items to the cart, and save it to the DB.
    2. Convert the Cart model to cart data and save it in the HTTP session.
    3. On any update/edit, update the underlying cart in the DB and refresh the data snapshot in the session.
    4. When the user clicks on the view-cart page, just pick the cart data from the session and display it to the customer.

    I have the following queries regarding the HTTP session:

    - How good is it to store large data (a shopping cart) in the session?
    - How scalable can this approach be, with respect to the session?
    - Won't my application eat and demand a lot of memory?
    - Is my approach fine, or do I need to consider other points while designing this?

    Though we can control what cart data is stored in the session, we still need to keep certain cart information there.

  • How to program for constraints/rules

    - by Gaurav
    First, the background: during interviews in the past I have often been asked to design some variation of a card game as a programming puzzle, and I have tried to design it in an OO way, but I have never been satisfied with my solutions. It was not until recently that I realized I had been approaching the problem from the wrong direction. Specifically, I was trying to solve the problem by modeling an individual card as an object. The problem with this is that individual cards don't have any non-trivial intrinsic behavior and are therefore not suitable (or primary) candidates to be objects. What is interesting and important about cards are the rules and constraints, such as that there can be only four suits, or only thirteen cards in each suit. And then, of course, there are any number of rules for games. So my questions are:

    1. Are there any idioms/constructs/patterns to program for rules & constraints?
    2. How many of the answers to 1 can be applied in conjunction with the OO paradigm?
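
    As one hedged illustration of treating the rules and constraints themselves as first-class objects (a sketch; the types and names are invented):

        import java.util.EnumMap;
        import java.util.List;
        import java.util.Map;

        enum Suit { CLUBS, DIAMONDS, HEARTS, SPADES }

        class Card {
            final Suit suit;
            final int rank;
            Card(Suit suit, int rank) { this.suit = suit; this.rank = rank; }
        }

        interface Rule<T> {
            boolean isSatisfiedBy(T state);
        }

        //Deck-level constraint: exactly thirteen cards in each of the four suits.
        class ThirteenPerSuit implements Rule<List<Card>> {
            public boolean isSatisfiedBy(List<Card> deck) {
                Map<Suit, Integer> counts = new EnumMap<>(Suit.class);
                for (Card c : deck) counts.merge(c.suit, 1, Integer::sum);
                for (Suit s : Suit.values()) {
                    if (counts.getOrDefault(s, 0) != 13) return false;
                }
                return true;
            }
        }

    A game then becomes largely a set of Rule objects evaluated against the game state, which fits comfortably inside the OO paradigm.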

  • Building a Data Mart with Pentaho Data Integration Video Review by Diethard Steiner, Packt Publishing

    - by Compudicted
    Originally posted on: http://geekswithblogs.net/Compudicted/archive/2014/06/01/building-a-data-mart-with-pentaho-data-integration-video-review-again.aspx

    The Building a Data Mart with Pentaho Data Integration video by Diethard Steiner from Packt Publishing is more than just a course on how to use Pentaho Data Integration; it also implements and uses the principles of Data Warehousing (I even heard the name of Ralph Kimball in the video). Indeed, a viewer should be familiar with concepts such as the Star Schema, Slowly Changing Dimension types, etc., so before watching this course I suggest skimming through the Data Warehouse concepts (if unfamiliar) or, even better, reading Ralph's excellent The Data Warehouse Toolkit. By the way, the author expands beyond Pentaho alone to MySQL and MonetDB, which is real icing on the cake! Indeed, I would even suggest the course be named 'Building a Data Warehouse with Pentaho'.

    To successfully complete the course one needs to know some Linux (Ubuntu is used in the course), the vi editor, and the Bash command shell, though similar requirements would seem to apply on Windows as well. Additionally, knowing some basic SQL would not hurt. As I said, MonetDB is used several times in this course; it seems no more complex than, say, MySQL, but based on what I read it is very well suited to fast querying of big volumes of data thanks to its columnstore (vertical data storage). I don't see what else could be a barrier; the material is very digestible.

    On this note, I must add that the author does not cover how to acquire the software, so here is what I found may help:

    - Pentaho: the free Community Edition should be more than anyone needs to learn it, or even to go into a POC.
    - MonetDB can be downloaded (it exists for both Linux and Windows) from http://goo.gl/FYxMy0 (just see the appropriate link on the left).
    - The author seems to be using Eclipse to run SQL code; one can get it from http://goo.gl/5CcuN.
    - To create or edit database entities and/or schema otherwise, one can use a universal tool called SQuirreL; get it from http://squirrel-sql.sourceforge.net.

    Next, I must confess Diethard is very knowledgeable in what he does and beyond. There is some accent to be heard, especially if one's mother tongue is English, but I got over it within a few chapters. I liked the rate at which the material is presented; it makes me feel I paid for every second.

    Eventually, my impressions are:

    - Pentaho is an awesome ETL offering; it is very much worth learning (I am an ETL fan and a heavy user of SSIS).
    - MonetDB is nice; it tickles my fancy to know it more.
    - Data Warehousing, despite all the Big Data tool offerings (Hive, Sqoop, Pig on Hadoop), still rocks with the traditional tools.
    - Chapters 2 to 6 were the most fun for me, with chapter 8 being the most difficult.

    In closing, I highly recommend this video to anyone who needs to grasp Pentaho concepts quickly; likewise, the course is very well suited for any developer on a "supposed to be done yesterday" type of project. It is for a beginner- to intermediate-level ETL/DW developer. But one would need to learn more on Data Warehousing and Pentaho; for that I recommend the 5-star Pentaho Data Integration 4 Cookbook. Enjoy it!

    Disclaimer: I received this video from the publisher for the purpose of a public review.


  • Internal Mutation of Persistent Data Structures

    - by Greg Ros
    To clarify, when I use the terms persistent and immutable about a data structure, I mean that:

    - The state of the data structure remains unchanged for its lifetime. It always holds the same data, and the same operations always produce the same results.
    - The data structure allows Add, Remove, and similar methods that return new objects of its kind, modified as instructed, that may or may not share some of the data of the original object.

    However, while a data structure may seem persistent to the user, it may do other things under the hood. To be sure, all data structures are, internally, at least somewhere, based on mutable storage. If I were to base a persistent vector on an array, and copy it whenever Add is invoked, it would still be persistent, as long as I modify only locally created arrays.

    However, sometimes you can greatly increase performance by mutating a data structure under the hood, in more, say, insidious, dangerous, and destructive ways; ways that might leave the abstraction untouched, not letting the user know anything has changed about the data structure, but that are critical at the implementation level.

    For example, let's say that we have a class called ArrayVector implemented using an array. Whenever you invoke Add, you get an ArrayVector built on top of a newly allocated array that has one additional item. A sequence of such updates will involve n array copies and allocations. (Illustration omitted.)

    However, let's say we implement a lazy mechanism that stores all sorts of updates, such as Add, Set, and others, in a queue. In this case, each update requires constant time (adding an item to a queue), and no array allocation is involved. When a user tries to get an item, all the queued modifications are applied under the hood, requiring a single array allocation and copy (since we know exactly what data the final array will hold, and how big it will be). Future get operations will be performed on an empty cache, so they will take a single operation. But in order to implement this, we need to 'switch' or mutate the internal array to the new one, and empty the cache, which is a very dangerous action. However, considering that in many circumstances (most updates are going to occur in sequence, after all) this can save a lot of time and memory, it might be worth it; you will need to ensure exclusive access to the internal state, of course.

    This isn't a question about the efficacy of such a data structure. It's a more general question:

    - Is it ever acceptable to mutate the internal state of a supposedly persistent or immutable object in destructive and dangerous ways? Does performance justify it? Would you still be able to call it immutable?
    - And could you implement this sort of laziness without mutating the data structure in the specified fashion?
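
    A minimal, hedged sketch of that queued-update idea (Java; the class and its method names are invented for illustration):

        import java.util.ArrayDeque;

        final class LazyVector {
            private int[] array;   //shared with other instances, never written in place
            private final ArrayDeque<int[]> pending = new ArrayDeque<>(); //queued {index, value} writes

            private LazyVector(int[] array) { this.array = array; }

            static LazyVector of(int... items) { return new LazyVector(items.clone()); }

            //"Persistent" update: no array allocation; the queue copy keeps the sketch simple.
            LazyVector set(int index, int value) {
                LazyVector next = new LazyVector(array);   //shares the backing array
                next.pending.addAll(pending);
                next.pending.add(new int[] { index, value });
                return next;
            }

            //Reads flush the queue once: a single allocation and copy, then the
            //internal "switch" the text describes. Synchronized so the mutation
            //stays invisible under concurrent access.
            synchronized int get(int index) {
                if (!pending.isEmpty()) {
                    int[] fresh = array.clone();
                    for (int[] write : pending) fresh[write[0]] = write[1];
                    array = fresh;
                    pending.clear();
                }
                return array[index];
            }
        }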

  • Design patterns to avoiding breaking the SRP while performing heavy data logging

    - by Kazark
    A class that performs both computations and data logging seems to have at least two responsibilities. Given a system whose specifications require heavy data logging, what kinds of design patterns or architectural patterns can be used to avoid bloating all the classes with logging calls every time they compute something? The decorator pattern could be used (e.g. Interpolator decorated into LoggingInterpolator), but it seems that would result in a situation hardly more desirable, in which almost every major class would need to be decorated with logging.
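
    For concreteness, a hedged sketch of the decorator variant mentioned above (Java-flavoured; Interpolator and its method are illustrative stand-ins):

        import java.util.logging.Logger;

        interface Interpolator {
            double interpolate(double x);
        }

        //All logging lives here; the wrapped computation class stays clean.
        class LoggingInterpolator implements Interpolator {
            private final Interpolator inner;
            private final Logger log;

            LoggingInterpolator(Interpolator inner, Logger log) {
                this.inner = inner;
                this.log = log;
            }

            public double interpolate(double x) {
                double y = inner.interpolate(x);
                log.info("interpolate(" + x + ") = " + y);
                return y;
            }
        }

    The concern in the question still stands, of course: this isolates logging per class but multiplies decorators across the codebase.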
