Search Results

Search found 14501 results on 581 pages for 'pierre julien bringer 2008'.

Page 325/581 | < Previous Page | 321 322 323 324 325 326 327 328 329 330 331 332 | Next Page >

Installing Enterprise Library 5.0 - Enterprise Library 5.0 Tutorial Part 1

Microsoft has released Enterprise Library on April 2010. it’s free you can download and install from “Download Enterprise Library”. you can also find older version of enterprise library 4.1 still if your project needs it for maintenance purpose. but I suggest go for 5.0 as it has great enhancements and improved UI configuration tool. Will it work only with Visual Studio 2008? Yes. Yes, it works with also .NET 3.5 and Visual Studio 2008. you can take advantage of new improved UI configuration tool which comes from enterprise library 5.0 with VS2008. Please find this Enterprise Library resources. I suggest to install it with documentation and hands on labs. you can also find community links. I’ll see you in my next blog serious where I provide introduction to various blocks of Enterprise Library 5.0. span.fullpost {display:none;}

Read the article
Connecting to Microsoft Excel using Oracle Data Integrator

- by julien.testut

The posts in this series assume that you have some level of familiarity with ODI. The concepts of Topology, Data Server, Physical and Logical Architecture are used here assuming that you understand them in the context of ODI. If you need more details on these elements, please refer to the ODI Tutorial for a quick introduction, or to the complete ODI documentation for more details. In this post I will describe how a Microsoft Excel spreadsheet can be used in Oracle Data Integrator. Microsoft Excel is one of the many different technologies you can leverage in ODI as a source or as a target. Prepare your Excel spreadsheet Prior to using a Microsoft Excel spreadsheet in ODI we need to specify a name for the different cell tables we want to use. You can have multiple names in the same spreadsheet. First open up a Microsoft Excel spreadsheet, we will need to define a named range.

Read the article
Big Data’s Killer App…

- by jean-pierre.dijcks

Recently Keith spent some time talking about the cloud on this blog and I will spare you my thoughts on the whole thing. What I do want to write down is something about the Big Data movement and what I think is the killer app for Big Data... Where is this coming from, ok, I confess... I spent 3 days in cloud land at the Cloud Connect conference in Santa Clara and it was quite a lot of fun. One of the nice things at Cloud Connect was that there was a track dedicated to Big Data, which prompted me to some extend to write this post. What is Big Data anyways? The most valuable point made in the Big Data track was that Big Data in itself is not very cool. Doing something with Big Data is what makes all of this cool and interesting to a business user! The other good insight I got was that a lot of people think Big Data means a single gigantic monolithic system holding gazillions of bytes or documents or log files. Well turns out that most people in the Big Data track are talking about a lot of collections of smaller data sets. So rather than thinking "big = monolithic" you should be thinking "big = many data sets". This is more than just theoretical, it is actually relevant when thinking about big data and how to process it. It is important because it means that the platform that stores data will most likely consist out of multiple solutions. You may be storing logs on something like HDFS, you may store your customer information in Oracle and you may store distilled clickstream information in some distilled form in MySQL. The big question you will need to solve is not what lives where, but how to get it all together and get some value out of all that data. NoSQL and MapReduce Nope, sorry, this is not the killer app... and no I'm not saying this because my business card says Oracle and I'm therefore biased. I think language is important, but as with storage I think pragmatic is better. In other words, some questions can be answered with SQL very efficiently, others can be answered with PERL or TCL others with MR. History should teach us that anyone trying to solve a problem will use any and all tools around. For example, most data warehouses (Big Data 1.0?) get a lot of data in flat files. Everyone then runs a bunch of shell scripts to massage or verify those files and then shoves those files into the database. We've even built shell script support into external tables to allow for this. I think the Big Data projects will do the same. Some people will use MapReduce, although I would argue that things like Cascading are more interesting, some people will use Java. Some data is stored on HDFS making Cascading the way to go, some data is stored in Oracle and SQL does do a good job there. As with storage and with history, be pragmatic and use what fits and neither NoSQL nor MR will be the one and only. Also, a language, while important, does in itself not deliver business value. So while cool it is not a killer app... Vertical Behavioral Analytics This is the killer app! And you are now thinking: "what does that mean?" Let's decompose that heading. First of all, analytics. I would think you had guessed by now that this is really what I'm after, and of course you are right. But not just analytics, which has a very large scope and means many things to many people. I'm not just after Business Intelligence (analytics 1.0?) or data mining (analytics 2.0?) but I'm after something more interesting that you can only do after collecting large volumes of specific data. That all important data is about behavior. What do my customers do? More importantly why do they behave like that? If you can figure that out, you can tailor web sites, stores, products etc. to that behavior and figure out how to be successful. Today's behavior that is somewhat easily tracked is web site clicks, search patterns and all of those things that a web site or web server tracks. that is where the Big Data lives and where these patters are now emerging. Other examples however are emerging, and one of the examples used at the conference was about prediction churn for a telco based on the social network its members are a part of. That social network is not about LinkedIn or Facebook, but about who calls whom. I call you a lot, you switch provider, and I might/will switch too. And that just naturally brings me to the next word, vertical. Vertical in this context means per industry, e.g. communications or retail or government or any other vertical. The reason for being more specific than just behavioral analytics is that each industry has its own data sources, has its own quirky logic and has its own demands and priorities. Of course, the methods and some of the software will be common and some will have both retail and service industry analytics in place (your corner coffee store for example). But the gist of it all is that analytics that can predict customer behavior for a specific focused group of people in a specific industry is what makes Big Data interesting. Building a Vertical Behavioral Analysis System Well, that is going to be interesting. I have not seen much going on in that space and if I had to have some criticism on the cloud connect conference it would be the lack of concrete user cases on big data. The telco example, while a step into the vertical behavioral part is not really on big data. It used a sample of data from the customers' data warehouse. One thing I do think, and this is where I think parts of the NoSQL stuff come from, is that we will be doing this analysis where the data is. Over the past 10 years we at Oracle have called this in-database analytics. I guess we were (too) early? Now the entire market is going there including companies like SAS. In-place btw does not mean "no data movement at all", what it means that you will do this on data's permanent home. For SAS that is kind of the current problem. Most of the inputs live in a data warehouse. So why move it into SAS and back? That all worked with 1 TB data warehouses, but when we are looking at 100TB to 500 TB of distilled data... Comments? As it is still early days with these systems, I'm very interested in seeing reactions and thoughts to some of these thoughts...

Read the article
Collaborate10 – THEconference

- by jean-pierre.dijcks

After spending a few days in Mandalay Bay's THEHotel, I guess I now call everything THE... Seriously, they even tag their toilet paper with THEtp... I guess the brand builders in Vegas thought that once you are on to something you keep on doing it, and granted it is a nice hotel with nice rooms. THEanalytics Most of my collab10 experience was in a room called Reef C, where the BIWA bootcamp was held. Two solid days of BI, Warehousing and Analytics organized by the BIWA SIG at IOUG. Didn't get to see all sessions, but what struck me was the high interest in Analytics. Marty Gubar's OLAP session was full and he did some very nice things with the OLAP option. The cool bit was that he actually gets all the advanced calculations in OLAP to show up in OBI EE without any effort. It was nice to see that the idea from OWB where you generate an RPD is now also in AWM. I think it makes life so much simpler to generate these RPD's from your data model. Even if the end RPD needs some tweaking, it is all a lot less effort to get something going. You can see this stuff for yourself in this demo (click here). OBI EE uses just SQL to get to the calculations, and so, if you prefer APEX, you can build you application there and get the same nice calculations in an APEX application. Marty also showed the Simba MDX driver used with Excel. I guess we should call that THEcoolone... and it is very slick and wonderfully useful for all of you who actually know Excel. The nice thing is that you leverage pure Excel for all operations (no plug-ins). That means no new tools to learn, no new controls, all just pure Excel. THEdatabasemachine Got some very good questions in my "what makes Exadata fast" session and overall, the interest in Exadata is overwhelming. One of the things that I did try to do in my session is to get people to think in new patterns rather than in patterns based on Oracle 9i running on some random hardware configuration. We talked a little bit about the often over-indexing and how everyone has to unlearn all of that on Exadata. The main thing however is that everyone needs to get used to the shear size of some of the components in a Database machine V2. 5TB of flash cache is a lot of very fast data storage, half a TB of memory gets quite interesting as well. So what I did there was really focus on some of the content in these earlier posts on Upward ILM and In-Memory processing. In short, I do believe the these newer media point out a trend. In-memory and other fast media will get cheaper and will see more use. Some of that we do automatically by adding new functionality, but in some cases I think the end user of the system needs to start thinking about how to leverage all this new hardware. I think most people got very excited about these new capabilities and opportunities. THEcoolkids One of the cool things about the BIWA track was the hand-on track. Very cool to see big crowds for both OLAP and OWB hands-on. Also quite nice to see that the folks at RittmanMead spent so much time on preparing for that session. While all of them put down cool stuff, none was more cool that seeing Data Mining on an Apple iPAD... it all just looks great on an iPAD! Very disappointing to see that Mark Rittman still wasn't showing OWB on his iPAD ;-) THEend All in all this was a great set of sessions in the BIWA track. Lots of value to our guests (we hope) and we hope they all come again next year!

Read the article
Visual Studio 2010 Professional special launch offer!

- by Etienne Tremblay

Hello everyone, long time no blog… I’ll try to get back in the game soon but with 2 customer and user group and life in general let’s just say I’m busy. In the meantime I’m passing along this great offer. Microsoft Visual Studio 2010 Professional will launch on April 12 but you can beat the rush and secure your copy today by pre-ordering at the affordable estimated retail price of $549, a saving of $250. If you use a previous version of Visual Studio or any other development tool then you are eligible for this upgrade. Along with all the great new features in Visual Studio 2010 (see www.microsoft.com/visualstudio) Visual Studio 2010 Professional includes a 12-month MSDN Essentials subscription which gives you access to core Microsoft platforms: Windows 7 Ultimate, Windows Server 2008 R2 Enterprise, and Microsoft SQL Server 2008 R2 Datacenter. So visit http://www.microsoft.com/visualstudio/en-us/pre-order-visual-studio-2010 to check out all the new features and sign up for this great offer. Cheers, ET Technorati Tags: VS2010

Read the article
Limiting DOPs – Who rules over whom?

- by jean-pierre.dijcks

I've gotten a couple of questions from Dan Morgan and figured I start to answer them in this way. While Dan is running on a big system he is running with Database Resource Manager and he is trying to make sure the system doesn't go crazy (remember end user are never, ever crazy!) on very high DOPs. Q: How do I control statements with very high DOPs driven from user hints in queries? A: The best way to do this is to work with DBRM and impose limits on consumer groups. The Max DOP setting you can set in DBRM allows you to overwrite the hint. Now let's go into some more detail here. Assume my object (and for simplicity we assume there is a single object - and do remember that we always pick the highest DOP when in doubt and when conflicting DOPs are available in a query) has PARALLEL 64 as its setting. Assume that the query that selects something cool from that table lives in a consumer group with a max DOP of 32. Assume no goofy things (like running out of parallel_max_servers) are happening. A query selecting from this table will run at DOP 32 because DBRM caps the DOP. As of 11.2.0.1 we also use the DBRM cap to create the original plan (at compile time) and not just enforce the cap at runtime. Now, my user is smart and writes a query with a parallel hint requesting DOP 128. This query is still capped by DBRM and DBRM overrules the hint in the statement. The statement, despite the hint, runs at DOP 32. Note that in the hinted scenario we do compile the statement with DOP 128 (the optimizer obeys the hint). This is another reason to use table decoration rather than hints. Q: What happens if I set parallel_max_servers higher than processes (e.g. the max number of processes allowed to run on my machine)? A: Processes rules. It is important to understand that processes are fixed at startup time. If you increase parallel_max_servers above the number of processes in the processes parameter you should get a warning in the alert log stating it can not take effect. As a follow up, a hinted query requesting more parallel processes than either parallel_max_servers or processes will not be able to acquire the requested number. Parallel_max_processes will prevent this. And since parallel_max_servers should be lower than max processes you can never go over either...

Read the article
List of Microsoft training kits (2012)

- by DigiMortal

Some years ago I published list of Microsoft training kits for developers. Now it’s time to make a little refresh and list current training kits available. Feel free to refer additional training kits in comments. Sharepoint 2010 Developer Training Kit SharePoint 2010 and Windows Phone 7 Training Kit SharePoint and Windows Azure Development Kit Visual Studio 2010 and .NET Framework 4 Training Kit (December 2011) SQL Server 2012 Developer Training Kit SQL Server 2008 R2 Update for Developers Training Kit (May 2011 Update) SharePoint and Silverlight Training Kit Windows Phone 7 Training Kit for Developers - RTM Refresh Windows Phone 7.5 Training Kit Silverlight 4 Training Web Camps Training Kit Identity Developer Training Kit Internet Explorer 10 Training Kit Visual Studio LightSwitch Training Kit Office 2010 Developer Training Kit - June 2011 Office 365 Developer Training Kit - June 2011 Update Dynamics CRM 2011 Developer Training Kit PHP on Windows and SQL Server Training Kit (March 2011 Update) Windows Server AppFabric Training Kit Windows Server 2008 R2 Developer Training Kit - July 2009

Read the article
SQL SERVER – Weekly Series – Memory Lane – #033

- by Pinal Dave

Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 Spatial Database Definition and Research Documents Here is the definition from Wikipedia about spatial database : A spatial database is a database that is optimized to store and query data related to objects in space, including points, lines and polygons. While typical databases can understand various numeric and character types of data, additional functionality needs to be added for databases to process spatial data types. Select Only Date Part From DateTime – Best Practice A very common question which I receive is how to only get Date or Time part from datetime value. In this blog post I explain the same in very simple words. T-SQL Paging Query Technique Comparison (OVER and ROW_NUMBER()) – CTE vs. Derived Table I have received few emails and comments about my post SQL SERVER – T-SQL Paging Query Technique Comparison – SQL 2000 vs SQL 2005. The main question was is this can be done using CTE? Absolutely! What about Performance? It is identical! Please refer above mentioned article for the history of paging. SQL SERVER – Cannot resolve collation conflict for equal to operation One of the very first error I ever encountered in my career was to resolve this conflict. I have blogged about it and I have realized that many others like me who are facing this error. LEN and DATALENGTH of NULL Simple Example Here is the question for you what is the LEN of NULL value? Well it is very easy – just read the blog. Recovery Models and Selection Very simple and easy explanation of the Database Backup Recovery Model and how to select the best option for you. Explanation SQL SERVER Hash Join Hash join gives best performance when two more join tables are joined and at-least one of them have no index or is not sorted. It is also expected that smaller of the either of table can be read in memory completely (though not necessary). Easy Sequence of SELECT FROM JOIN WHERE GROUP BY HAVING ORDER BY SELECT yourcolumns FROM tablenames JOIN tablenames WHERE condition GROUP BY yourcolumns HAVING aggregatecolumn condition ORDER BY yourcolumns NorthWind Database or AdventureWorks Database – Samples Databases In this blog post we learn how to install Northwind database. I also shared the source where one can download this database as that is used in many examples on MSDN help files. sp_HelpText for sp_HelpText – Puzzle A simple quick puzzle – do you know the answer of it? If not, go ahead and read the blog. 2008 SQL SERVER – 2008 – Step By Step Installation Guide With Images When SQL Server 2008 was newly introduced lots of people had no clue how to install SQL Server 2008 and the amount of the question which I used to receive were so much. I wrote this blog post with the spirit that this will help all the newbies to install SQL Server 2008 with the help of images. Still today this blog post has been bible for all of the people who are confused with SQL Server installation. Inline Variable Assignment I loved this feature. I have always wanted this feature to be present in SQL Server. The last time when I met developers from Microsoft SQL Server, I had talked about this feature. I think this feature saves some time but make the code more readable. Introduction to Policy Management – Enforcing Rules on SQL Server If our company policy is to create all the Stored Procedure with prefix ‘usp’ that developers should be just prevented to create Stored Procedure with any other prefix. Let us see a small tutorial how to create conditions and policy which will prevent any future SP to be created with any other prefix. 2009 Performance Counters from System Views – By Kevin Mckenna Many of you are not aware of this fact that access to performance information is readily available in SQL Server and that too without querying performance counters using a custom application or via perfmon. Till now, this fact has remained undisclosed but through this post I would like to explain you can easily access SQL Server performance counter information. Without putting much effort you will come across the system viewsys.dm_os_performance_counters. As the name suggests, this provides you easy access to the SQL Server performance counter information that is passed on to perfmon, but you can get at it via tsql. Customize Toolbar – Remove Debug Button from Toolbar I was fond of SQL Server Debugger feature in SQL Server 2000. To my utter disappointment, this feature was withdrawn from SQL Server 2005. The button of the debugger is similar to a play button and is used to run debugging commands of Visual Studio. Because of this reason, it gets very much infuriating for developers when they are developing on both – Visual Studio and SSMS. Let us now see how we can remove debugging button from SQL Server Management Studio. Effect of Normalization on Index and Performance A very interesting conversation which started from twitter. If you want to read one link this is the link I encourage you to read it. SSMS Feature – Multi-server Queries Using SQL Server Management Studio (SSMS) DBAs can now query multiple servers from one window. It is quite common for DBAs with large amount of servers to maintain and gather information from multiple SQL Servers and create report. This feature is a blessing for the DBAs, as they can now assemble all the information instantaneously without going anywhere. Query Optimizer Hint ROBUST PLAN – Question to You “ROBUST PLAN” is a kind of query hint which works quite differently than other hints. It does not improve join or force any indexes to use; it just makes sure that a query does not crash due to over the limit size of row. Let me elaborate upon it in the blog post. 2010 Do you really know the difference between various date functions available in SQL Server 2012? Here is a three part story where we explored the same with examples: Fastest Way to Restore the Database Difference Between DATETIME and DATETIME2 Difference Between DATETIME and DATETIME2 – WITH GETDATE Shrinking NDF and MDF Files – Readers’ Opinion Shrinking Database always creates performance degradation and increases fragmentation in the database. I suggest that you keep that in mind before you start reading the following comment. If you are going to say Shrinking Database is bad and evil, here I am saying it first and loud. Now, the comment of Imran is written while keeping in mind only the process showing how the Shrinking Database Operation works. Imran has already explained his understanding and requests further explanation. I have removed the Best Practices section from Imran’s comments, as there are a few corrections. 2011 Solution – Puzzle – SELECT * vs SELECT COUNT(*) This is very interesting question and I am very confident that not every one knows the answer to this question. Let me ask you again – Which will be faster SELECT* or SELECT COUNT (*) or do you think this is apples and oranges comparison. 2012 Service Broker and CAP_CPU_PERCENT – Limiting SQL Server Instances to CPU Usage In SQL Server 2012 there are a few enhancements with regards to SQL Server Resource Governor. One of the enhancement is how the resources are allocated. Let me explain you with examples. Let us understand the entire discussion with the help of three different examples. Finding Size of a Columnstore Index Using DMVs One of the very common question I often see is need of the list of columnstore index along with their size and corresponding table name. I quickly re-wrote a script using DMVs sys.indexes and sys.dm_db_partition_stats. This script gives the size of the columnstore index on disk only. I am sure there will be advanced script to retrieve details related to components associated with the columnstore index. However, I believe following script is sufficient to start getting an idea of columnstore index size. Developer Training Resources and Summary Roundup Developer Training - Importance and Significance - Part 1 In this part we discussed the importance of training in the real world. The most important and valuable resource any company is its employee. Employees who have been well-trained will be better at their jobs and produce a better product. An employee who is well trained obviously knows more about their job and all the technical aspects. I have a very high opinion about training employees and it is the most important task. Developer Training – Employee Morals and Ethics – Part 2 In this part we discussed the most crucial components of training. Often employees are expecting the company to pay for their training and the company expresses no interest in training the employee. Quite often training expenses are the real issue for both the employee and employer. Developer Training – Difficult Questions and Alternative Perspective - Part 3 This part was the most difficult to write as I tried to address a few difficult questions and answers. Training is such a sensitive issue that many developers when not receiving chance for training think about leaving the organization. Developer Training – Various Options for Developer Training – Part 4 In this part I tried to explore a few methods and options for training. The generic feedback I received on this blog post was short and I should have explored each of the subject of the training in details. I believe there are two big buckets of training 1) Instructor Lead Training and 2) Self Lead Training. Developer Training – A Conclusive Summary- Part 5 There is no better motivation than a personal desire to learn new technology. Honestly there is nothing more personal learning. That “change is the only constant” and “adapt & overcome” are the essential lessons of life. One cannot stop the learning and resist the change. In the IT industry “ego of knowing all” and the “resistance to change” are the most challenging issues. A Quick Look at Logging and Ideas around Logging Question: What is the first thing comes to your mind when you hear the word “Logging”? Strange enough I got a different answer every single time. Let me just list what answer I got from my friends. Let us go over them one by one. Beginning Performance Tuning with SQL Server Execution Plan Solution of Puzzle – Swap Value of Column Without Case Statement Earlier this week I asked a question where I asked how to Swap Values of the column without using CASE Statement. Read here: SQL SERVER – A Puzzle – Swap Value of Column Without Case Statement. I have proposed 3 different solutions in the blog posts itself. I had requested the help of the community to come up with alternate solutions and honestly I am stunned and amazed by the qualified entries. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
Role Center Installation for IIS Default Feature XPS-VIEWER

- by ssmantha

While installing Dynamics Ax 2009 Roles Center and Enterprise Portal on Windows Server 2008 R2, there is a prerequisite for IIS Default components which fails to install. The error log file for IIS component installation points to an error while installing feature NET-XPS-VIEWER. This issue can be resolved by editing “ServerManagerCmdInputIIS.xml” file present in the support folder of the DAX 2009 installer. Edit the entry “<Feature Id="NET-XPS-Viewer" />” to “<Feature Id="NET-XPS-Viewer" />” and try reinstalling the installer should now continue uninterrupted. The issue is due to the feature name which is now XPS-VIEWER in windows server 2008 R2. Happy Installing!! :-)

Read the article
Partition Wise Joins

- by jean-pierre.dijcks

Some say they are the holy grail of parallel computing and PWJ is the basis for a shared nothing system and the only join method that is available on a shared nothing system (yes this is oversimplified!). The magic in Oracle is of course that is one of many ways to join data. And yes, this is the old flexibility vs. simplicity discussion all over, so I won't go there... the point is that what you must do in a shared nothing system, you can do in Oracle with the same speed and methods. The Theory A partition wise join is a join between (for simplicity) two tables that are partitioned on the same column with the same partitioning scheme. In shared nothing this is effectively hard partitioning locating data on a specific node / storage combo. In Oracle is is logical partitioning. If you now join the two tables on that partitioned column you can break up the join in smaller joins exactly along the partitions in the data. Since they are partitioned (grouped) into the same buckets, all values required to do the join live in the equivalent bucket on either sides. No need to talk to anyone else, no need to redistribute data to anyone else... in short, the optimal join method for parallel processing of two large data sets. PWJ's in Oracle Since we do not hard partition the data across nodes in Oracle we use the Partitioning option to the database to create the buckets, then set the Degree of Parallelism (or run Auto DOP - see here) and get our PWJs. The main questions always asked are: How many partitions should I create? What should my DOP be? In a shared nothing system the answer is of course, as many partitions as there are nodes which will be your DOP. In Oracle we do want you to look at the workload and concurrency, and once you know that to understand the following rules of thumb. Within Oracle we have more ways of joining of data, so it is important to understand some of the PWJ ideas and what it means if you have an uneven distribution across processes. Assume we have a simple scenario where we partition the data on a hash key resulting in 4 hash partitions (H1 -H4). We have 2 parallel processes that have been tasked with reading these partitions (P1 - P2). The work is evenly divided assuming the partitions are the same size and we can scan this in time t1 as shown below. Now assume that we have changed the system and have a 5th partition but still have our 2 workers P1 and P2. The time it takes is actually 50% more assuming the 5th partition has the same size as the original H1 - H4 partitions. In other words to scan these 5 partitions, the time t2 it takes is not 1/5th more expensive, it is a lot more expensive and some other join plans may now start to look exciting to the optimizer. Just to post the disclaimer, it is not as simple as I state it here, but you get the idea on how much more expensive this plan may now look... Based on this little example there are a few rules of thumb to follow to get the partition wise joins. First, choose a DOP that is a factor of two (2). So always choose something like 2, 4, 8, 16, 32 and so on... Second, choose a number of partitions that is larger or equal to 2* DOP. Third, make sure the number of partitions is divisible through 2 without orphans. This is also known as an even number... Fourth, choose a stable partition count strategy, which is typically hash, which can be a sub partitioning strategy rather than the main strategy (range - hash is a popular one). Fifth, make sure you do this on the join key between the two large tables you want to join (and this should be the obvious one...). Translating this into an example: DOP = 8 (determined based on concurrency or by using Auto DOP with a cap due to concurrency) says that the number of partitions >= 16. Number of hash (sub) partitions = 32, which gives each process four partitions to work on. This number is somewhat arbitrary and depends on your data and system. In this case my main reasoning is that if you get more room on the box you can easily move the DOP for the query to 16 without repartitioning... and of course it makes for no leftovers on the table... And yes, we recommend up-to-date statistics. And before you start complaining, do read this post on a cool way to do stats in 11.

Read the article
Programmaticaly finding the Landau notation (Big O or Theta notation) of an algorithm?

- by Julien L

I'm used to search for the Landau (Big O, Theta...) notation of my algorithms by hand to make sure they are as optimized as they can be, but when the functions are getting really big and complex, it's taking way too much time to do it by hand. it's also prone to human errors. I spent some time on Codility (coding/algo exercises), and noticed they will give you the Landau notation for your submitted solution (both in Time and Memory usage). I was wondering how they do that... How would you do it? Is there another way besides Lexical Analysis or parsing of the code? PS: This question concerns mainly PHP and or JavaScript, but I'm opened to any language and theory.

Read the article
The countdown for ‘In Touch’ has begun!

- by Julien Haye

The Oracle 'In Touch' PartnerCast is just a week away from going live, so if you haven’t registered yet, what are you waiting for?! Registration is quick and easy, so click here to register and ensure you stay informed with the latest from the Oracle PartnerNetwork. 'In Touch' relies on your input, so let David Callaghan, Senior Vice President EMEA Alliances and Channels, know your thoughts and comments via the player consol, by emailing [email protected] or on twitter using the hashtag #DCpickme. The cast will go live on Tuesday 29th October from 10:30am UK / 11:30am CET with studio guests Will O'Brien, VP Alliances & Channels, UK & Ireland, and Markus Reischl, Senior Director and Sales Leader EMEA Strategic Alliances, answering your questions on Oracle Storage and Business Intelligence. To find out more information about the cast, including the full line up, please visit the 'In Touch' webpage here.

Read the article
Edinburgh this Thurs (25th) - Rob Carrol talks about how to build a high performance, scalable repor

- by tonyrogerson

Scottish Area SQL Server User Group Meeting, Edinburgh - Thursday 25th March An evening of SQL Server 2008 Reporting Services Scalability and Performance with Rob Carrol, see how to build a high performance, scalable reporting platform and the tuning techniques required to ensure that report performance remains optimal as your platform grows. Pizza and drinks will be provided! Register at http://www.sqlserverfaq.com/events/221/SQL-Server-2008-Reporting-Services-Scalability-and-Performance.aspx...(read more)

Read the article
I'm a happy camper

- by Paul Nielsen

I’m satisfied with SQL Server 2008. It meets my needs and I can build whatever I want in the database. There’s no feature that it lacks that blocking my development. SQL Server 2008 is what I see when I close my eyes and dream. Maybe I’m just pleased with my current projects, maybe I’m being dense, maybe I lack imagination, tell me if I’m being stupid, but I feel no compelling need for an upgrade. If Microsoft didn’t ship the new version for 3 or 5 more years, I’d be ok with that. I’m sure there...(read more)

Read the article
Big Data Accelerator

- by Jean-Pierre Dijcks

For everyone who does not regularly listen to earnings calls, Oracle's Q4 call was interesting (as it mostly is). One of the announcements in the call was the Big Data Accelerator from Oracle (Seeking Alpha link here - slightly tweaked for correctness shown below): "The big data accelerator includes some of the standard open source software, HDFS, the file system and a number of other pieces, but also some Oracle components that we think can dramatically speed up the entire map-reduce process. And will be particularly attractive to Java programmers [...]. There are some interesting applications they do, ETL is one. Log processing is another. We're going to have a lot of those features, functions and pre-built applications in our big data accelerator." Not much else we can say right now, more on this (and Big Data in general) at Openworld!

Read the article
WCF/ADO.NET Data Services - Could not load type 'System.Data.Services.Providers.IDataServiceUpdatePr

- by Sahil Malik

Ad:: SharePoint 2007 Training in .NET 3.5 technologies (more information). When you try accessing ListData.svc, do you get the following error? Could not load type 'System.Data.Services.Providers.IDataServiceUpdateProvider' from assembly 'System.Data.Services, Version=3.5.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089'. Well, if you followed the instructions in Chapter 1 of my book to build your VM, you wouldn’t run into the above issue. But if you do, you need to install - For Windows Vista and Windows 2008 - http://www.microsoft.com/downloads/details.aspx?familyid=4B710B89-8576-46CF-A4BF-331A9306D555&displaylang=en For Windows 7 and Windows 2008 R2 - http://www.microsoft.com/downloads/details.aspx?familyid=79d7f6f8-d6e9-4b8c-8640-17f89452148e&displaylang=en Remember to: a) Install the x64 version, and b) Do an IISReset before trying again. Comment on the article ....

Read the article
Partition Wise Joins II

- by jean-pierre.dijcks

One of the things that I did not talk about in the initial partition wise join post was the effect it has on resource allocation on the database server. When Oracle applies a different join method - e.g. not PWJ - what you will see in SQL Monitor (in Enterprise Manager) or in an Explain Plan is a set of producers and a set of consumers. The producers scan the tables in the the join. If there are two tables the producers first scan one table, then the other. The producers thus provide data to the consumers, and when the consumers have the data from both scans they do the join and give the data to the query coordinator. Now that behavior means that if you choose a degree of parallelism of 4 to run such query with, Oracle will allocate 8 parallel processes. Of these 8 processes 4 are producers and 4 are consumers. The consumers only actually do work once the producers are fully done with scanning both sides of the join. In the plan above you can see that the producers access table SALES [line 11] and then do a PX SEND [line 9]. That is the producer set of processes working. The consumers receive that data [line 8] and twiddle their thumbs while the producers go on and scan CUSTOMERS. The producers send that data to the consumer indicated by PX SEND [line 5]. After receiving that data [line 4] the consumers do the actual join [line 3] and give the data to the QC [line 2]. BTW, the myth that you see twice the number of processes due to the setting PARALLEL_THREADS_PER_CPU=2 is obviously not true. The above is why you will see 2 times the processes of the DOP. In a PWJ plan the consumers are not present. Instead of producing rows and giving those to different processes, a PWJ only uses a single set of processes. Each process reads its piece of the join across the two tables and performs the join. The plan here is notably different from the initial plan. First of all the hash join is done right on top of both table scans [line 8]. This query is a little more complex than the previous so there is a bit of noise above that bit of info, but for this post, lets ignore that (sort stuff). The important piece here is that the PWJ plan typically will be faster and from a PX process number / resources typically cheaper. You may want to look out for those plans and try to get those to appear a lot... CREDITS: credits for the plans and some of the info on the plans go to Maria, as she actually produced these plans and is the expert on plans in general... You can see her talk about explaining the explain plan and other optimizer stuff over here: ODTUG in Washington DC, June 27 - July 1 On the Optimizer blog At OpenWorld in San Francisco, September 19 - 23 Happy joining and hope to see you all at ODTUG and OOW...

Read the article
MDX using EXISTING, AGGREGATE, CROSSJOIN and WHERE

- by James Rogers

It is a well-published approach to using the EXISTING function to decode AGGREGATE members and nested sub-query filters. Mosha wrote a good blog on it here and a more recent one here. The use of EXISTING in these scenarios is very useful and sometimes the only option when dealing with multi-select filters. However, there are some limitations I have run across when using the EXISTING function against an AGGREGATE member: The AGGREGATE member must be assigned to the Dimension.Hierarchy being detected by the EXISTING function in the calculated measure. The AGGREGATE member cannot contain a crossjoin from any other dimension or hierarchy or EXISTING will not be able to detect the members in the AGGREGATE member. Take the following query (from Adventure Works DW 2008): With member [Week Count] as 'count(existing([Date].[Fiscal Weeks].[Fiscal Week].members))' member [Date].[Fiscal Weeks].[CM] as 'AGGREGATE({[Date].[Fiscal Weeks].[Fiscal Week].&[47]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[48]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[49]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[50]&[2004]})' select {[Week Count]} on columns from [Adventure Works] where [Date].[Fiscal Weeks].[CM] Here we are attempting to count the existing fiscal weeks in slicer. This is useful to get a per-week average for another member. Many applications generate queries in this manner (such as Oracle OBIEE). This query returns the correct result of (4) weeks. Now let's put a twist in it. What if the querying application submits the query in the following manner: With member [Week Count] as 'count(existing([Date].[Fiscal Weeks].[Fiscal Week].members))' member [Customer].[Customer Geography].[CM] as 'AGGREGATE({[Date].[Fiscal Weeks].[Fiscal Week].&[47]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[48]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[49]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[50]&[2004]})' select {[Week Count]} on columns from [Adventure Works] where [Customer].[Customer Geography].[CM] Here we are attempting to count the existing fiscal weeks in slicer. However, the AGGREGATE member is built on a different dimension (in name) than the one EXISTING is trying to detect. In this case the query returns (174) which is the total number of [Date].[Fiscal Weeks].[Fiscal Week].members defined in the dimension. Now another twist, the AGGREGATE member will be named appropriately and contain the hierarchy we are trying to detect with EXISTING but it will be cross-joined with another hierarchy: With member [Week Count] as 'count(existing([Date].[Fiscal Weeks].[Fiscal Week].members))' member [Date].[Fiscal Weeks].[CM] as 'AGGREGATE({[Date].[Fiscal Weeks].[Fiscal Week].&[47]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[48]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[49]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[50]&[2004]}* {[Customer].[Customer Geography].[Country].&[Australia],[Customer].[Customer Geography].[Country].&[United States]})' select {[Week Count]} on columns from [Adventure Works] where [Date].[Fiscal Weeks].[CM] Once again, we are attempting to count the existing fiscal weeks in slicer. Again, in this case the query returns (174) which is the total number of [Date].[Fiscal Weeks].[Fiscal Week].members defined in the dimension. However, in 2008 R2 this query returns the correct result of 4 and additionally , the following will return the count of existing countries as well (2): With member [Week Count] as 'count(existing([Date].[Fiscal Weeks].[Fiscal Week].members))' member [Country Count] as 'count(existing([Customer].[Customer Geography].[Country].members))' member [Date].[Fiscal Weeks].[CM] as 'AGGREGATE({[Date].[Fiscal Weeks].[Fiscal Week].&[47]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[48]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[49]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[50]&[2004]}* {[Customer].[Customer Geography].[Country].&[Australia],[Customer].[Customer Geography].[Country].&[United States]})' select {[Week Count]} on columns from [Adventure Works] where [Date].[Fiscal Weeks].[CM] 2008 R2 seems to work as long as the AGGREGATE member is on at least one of the hierarchies attempting to be detected (i.e. [Date].[Fiscal Weeks] or [Customer].[Customer Geography]). If not, it seems that the engine cannot find a "point of entry" into the aggregate member and ignores it for calculated members. One way around this would be to put the sets from the AGGREGATE member explicitly in the WHERE clause (slicer). I realize this is only supported in SSAS 2005 and 2008. However, after talking with Chris Webb (his blog is here and I highly recommend following his efforts and musings) it is a far more efficient way to filter/slice a query: With member [Week Count] as 'count(existing([Date].[Fiscal Weeks].[Fiscal Week].members))' select {[Week Count]} on columns from [Adventure Works] where ({[Date].[Fiscal Weeks].[Fiscal Week].&[47]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[48]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[49]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[50]&[2004]} ,{[Customer].[Customer Geography].[Country].&[Australia],[Customer].[Customer Geography].[Country].&[United States]}) This query returns the correct result of (4) weeks. Additionally, we can count the cross-join members of the two hierarchies in the slicer: With member [Week Count] as 'count(existing([Date].[Fiscal Weeks].[Fiscal Week].members)*existing([Customer].[Customer Geography].[Country].members))' select {[Week Count]} on columns from [Adventure Works] where ({[Date].[Fiscal Weeks].[Fiscal Week].&[47]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[48]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[49]&[2004],[Date].[Fiscal Weeks].[Fiscal Week].&[50]&[2004]} ,{[Customer].[Customer Geography].[Country].&[Australia],[Customer].[Customer Geography].[Country].&[United States]}) We get the correct number of (8) here.

Read the article
Data Warehouse Best Practices

- by jean-pierre.dijcks

In our quest to share our endless wisdom (ahem…) one of the things we figured might be handy is recording some of the best practices for data warehousing. And so we did. And, we did some more… We now have recreated our websites on Oracle Technology Network and have a separate page for best practices, parallelism and other cool topics related to data warehousing. But the main topic of this post is the set of recorded best practices. Here is what is available (and it is a series that ties together but can be read independently), applicable for almost any database version: Partitioning 3NF schema design for a data warehouse Star schema design Data Loading Parallel Execution Optimizer and Stats management The best practices page has a lot of other useful information so have a look here.

Read the article
Auto DOP and Concurrency

- by jean-pierre.dijcks

After spending some time in the cloud, I figured it is time to come down to earth and start discussing some of the new Auto DOP features some more. As Database Machines (the v2 machine runs Oracle Database 11.2) are effectively selling like hotcakes, it makes some sense to talk about the new parallel features in more detail. For basic understanding make sure you have read the initial post. The focus there is on Auto DOP and queuing, which is to some extend the focus here. But now I want to discuss the concurrency a little and explain some of the relevant parameters and their impact, specifically in a situation with concurrency on the system. The goal of Auto DOP The idea behind calculating the Automatic Degree of Parallelism is to find the highest possible DOP (ideal DOP) that still scales. In other words, if we were to increase the DOP even more above a certain DOP we would see a tailing off of the performance curve and the resource cost / performance would become less optimal. Therefore the ideal DOP is the best resource/performance point for that statement. The goal of Queuing On a normal production system we should see statements running concurrently. On a Database Machine we typically see high concurrency rates, so we need to find a way to deal with both high DOP’s and high concurrency. Queuing is intended to make sure we Don’t throttle down a DOP because other statements are running on the system Stay within the physical limits of a system’s processing power Instead of making statements go at a lower DOP we queue them to make sure they will get all the resources they want to run efficiently without trashing the system. The theory – and hopefully – practice is that by giving a statement the optimal DOP the sum of all statements runs faster with queuing than without queuing. Increasing the Number of Potential Parallel Statements To determine how many statements we will consider running in parallel a single parameter should be looked at. That parameter is called PARALLEL_MIN_TIME_THRESHOLD. The default value is set to 10 seconds. So far there is nothing new here…, but do realize that anything serial (e.g. that stays under the threshold) goes straight into processing as is not considered in the rest of this post. Now, if you have a system where you have two groups of queries, serial short running and potentially parallel long running ones, you may want to worry only about the long running ones with this parallel statement threshold. As an example, lets assume the short running stuff runs on average between 1 and 15 seconds in serial (and the business is quite happy with that). The long running stuff is in the realm of 1 – 5 minutes. It might be a good choice to set the threshold to somewhere north of 30 seconds. That way the short running queries all run serial as they do today (if it ain’t broken, don’t fix it) and allows the long running ones to be evaluated for (higher degrees of) parallelism. This makes sense because the longer running ones are (at least in theory) more interesting to unleash a parallel processing model on and the benefits of running these in parallel are much more significant (again, that is mostly the case). Setting a Maximum DOP for a Statement Now that you know how to control how many of your statements are considered to run in parallel, lets talk about the specific degree of any given statement that will be evaluated. As the initial post describes this is controlled by PARALLEL_DEGREE_LIMIT. This parameter controls the degree on the entire cluster and by default it is CPU (meaning it equals Default DOP). For the sake of an example, let’s say our Default DOP is 32. Looking at our 5 minute queries from the previous paragraph, the limit to 32 means that none of the statements that are evaluated for Auto DOP ever runs at more than DOP of 32. Concurrently Running a High DOP A basic assumption about running high DOP statements at high concurrency is that you at some point in time (and this is true on any parallel processing platform!) will run into a resource limitation. And yes, you can then buy more hardware (e.g. expand the Database Machine in Oracle’s case), but that is not the point of this post… The goal is to find a balance between the highest possible DOP for each statement and the number of statements running concurrently, but with an emphasis on running each statement at that highest efficiency DOP. The PARALLEL_SERVER_TARGET parameter is the all important concurrency slider here. Setting this parameter to a higher number means more statements get to run at their maximum parallel degree before queuing kicks in. PARALLEL_SERVER_TARGET is set per instance (so needs to be set to the same value on all 8 nodes in a full rack Database Machine). Just as a side note, this parameter is set in processes, not in DOP, which equates to 4* Default DOP (2 processes for a DOP, default value is 2 * Default DOP, hence a default of 4 * Default DOP). Let’s say we have PARALLEL_SERVER_TARGET set to 128. With our limit set to 32 (the default) we are able to run 4 statements concurrently at the highest DOP possible on this system before we start queuing. If these 4 statements are running, any next statement will be queued. To run a system at high concurrency the PARALLEL_SERVER_TARGET should be raised from its default to be much closer (start with 60% or so) to PARALLEL_MAX_SERVERS. By using both PARALLEL_SERVER_TARGET and PARALLEL_DEGREE_LIMIT you can control easily how many statements run concurrently at good DOPs without excessive queuing. Because each workload is a little different, it makes sense to plan ahead and look at these parameters and set these based on your requirements.

Read the article
Hiring a Junior Developer, What should I ask?

- by Jeremy

We are currently hiring a junior developer to help me out, as I have more projects than I can currently manage. I have never hired anyone who wasn't a friend or at least an acquaintance. I have a phone interview with the only applicant that actually stood out to me (on paper), but I have never done this before. Our projects are all high scalability, data intensive web applications that process millions of transactions an hour, across multiple servers and clients. To be language/stack specific, we use ASP.Net MVC2, WebForms and C# 4, MSSQL 2008 R2, all running atop Windows Server 2008 R2 What should I ask him? How should I structure the phone call?

Read the article
Finally, I have my HP 6910p laptop running with 8Gb RAM

- by Liam Westley

Today, I received two Corsair Value Select 4Gb DDR SO-DIMMs (from overclock.co.uk) for my aging HP 6910p to give it the extra lease of life to keep it going until the end of 2010. And here is the proof that Windows 7 64-bit happily sees all 8Gb, There are no 4Gb modules are officially supported for the HP 6910p (they didn’t exist when it was first build). I was taking a bit of a gamble, and relying on the UK distance selling regulations which meant that even if they didn’t work I’d be able to send them back, getting a full refund and only paying for the return postage. I’d read Keith Comb’s blog back in 2008, (http://blogs.technet.com/b/keithcombs/archive/2008/07/05/loading-a-hp-6910p-with-8gb-of-ram.aspx) where he mentioned ‘trying’ out 4Gb samples of SO-DIMMs in a HP 6910p laptop, but there still appears to be no mentions of running this configuration in any other blog. Seeing how the 8Gb of memory is used is made easier with the new Resource Monitor available in Windows 7. With two copies of Visual Studio 2008, Outlook, Firefox (with 30+ tabs), TweetDeck (an infamous memory hog) and VMWare workstation running a virtual machine allocated with 2Gb of memory, you might have no ‘free’ memory remaining, but the standby memory is an awesome 2.4Gb, and once the VM is up and running the Hard Faults/sec hovers around zero, It’s the page fault figure which really counts, because reducing that value means that you are preventing the Windows 7 system drive from being used for virtual memory paging operations. Even after only a few hours of use it’s noticeable that disc access has been reduced and applications feel more responsive and ‘snappy’. I did consider the option of purchasing an SSD to replace the main drive, rather than go for 8Gb of RAM, but I think I’ve probably made the correct decision. Given my hobby topic of virtualisation, I take the view that you can never have too much memory. It was also a decision made easier by the price differential between 8Gb of RAM compared to a decent size SSD. In the 18 months since Keith Comb tested the first 4Gb SO-DIMMS they have plummeted in price, at just under £100 per 4Gb, they are around a fifth of the price when launched. So if you ever wondered if a HP 6910p can handle 8Gb, now you know.

Read the article
Cannot set a credential for principal 'sa'

- by hailey

I was trying to change the SA password on my development server this morning and got an error. Msg 15535, Level 16, State 1, Line 1 Cannot set a credential for principal 'sa'. It was a little frustrating to get an error for a seemingly simple task but then agian maybe I screwed something up. After doing a couple of searches i found a Microsoft KB (support.microsoft.com/kb/956177) "You receive an exception in SQL Server 2008 when you try to modify the properties of the SQL Server Administrator account by using SQL Server Management Studio". It was for SQL 2008 but it worked for my SQL 2005 sp3 server just fine. You have to click the Map to Credential check box but you don't have to add any credetials just click the OK button to complete and that's it.

Read the article
New Feature in ODI 11.1.1.6: ODI for Big Data

- by Julien Testut

Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} By Ananth Tirupattur Starting with Oracle Data Integrator 11.1.1.6.0, ODI is offering a solution to process Big Data. This post provides an overview of this feature. With all the buzz around Big Data and before getting into the details of ODI for Big Data, I will provide a brief introduction to Big Data and Oracle Solution for Big Data. So, what is Big Data? Big data includes: structured data (this includes data from relation data stores, xml data stores), semi-structured data (this includes data from weblogs) unstructured data (this includes data from text blob, images) Traditionally, business decisions are based on the information gathered from transactional data. For example, transactional Data from CRM applications is fed to a decision system for analysis and decision making. Products such as ODI play a key role in enabling decision systems. However, with the emergence of massive amounts of semi-structured and unstructured data it is important for decision system to include them in the analysis to achieve better decision making capability. While there is an abundance of opportunities for business for gaining competitive advantages, process of Big Data has challenges. The challenges of processing Big Data include: Volume of data Velocity of data - The high Rate at which data is generated Variety of data In order to address these challenges and convert them into opportunities, we would need an appropriate framework, platform and the right set of tools. Hadoop is an open source framework which is highly scalable, fault tolerant system, for storage and processing large amounts of data. Hadoop provides 2 key services, distributed and reliable storage called Hadoop Distributed File System or HDFS and a framework for parallel data processing called Map-Reduce. Innovations in Hadoop and its related technology continue to rapidly evolve, hence therefore, it is highly recommended to follow information on the web to keep up with latest information. Oracle's vision is to provide a comprehensive solution to address the challenges faced by Big Data. Oracle is providing the necessary Hardware, software and tools for processing Big Data Oracle solution includes: Big Data Appliance Oracle NoSQL Database Cloudera distribution for Hadoop Oracle R Enterprise- R is a statistical package which is very popular among data scientists. ODI solution for Big Data Oracle Loader for Hadoop for loading data from Hadoop to Oracle. Further details can be found here: http://www.oracle.com/us/products/database/big-data-appliance/overview/index.html ODI Solution for Big Data: ODI’s goal is to minimize the need to understand the complexity of Hadoop framework and simplify the adoption of processing Big Data seamlessly in an enterprise. ODI is providing the capabilities for an integrated architecture for processing Big Data. This includes capability to load data in to Hadoop, process data in Hadoop and load data from Hadoop into Oracle. ODI is expanding its support for Big Data by providing the following out of the box Knowledge Modules (KMs). IKM File to Hive (LOAD DATA).Load unstructured data from File (Local file system or HDFS ) into Hive IKM Hive Control AppendTransform and validate structured data on Hive IKM Hive TransformTransform unstructured data on Hive IKM File/Hive to Oracle (OLH)Load processed data in Hive to Oracle RKM HiveReverse engineer Hive tables to generate models Using the Loading KM you can map files (local and HDFS files) to the corresponding Hive tables. For example, you can map weblog files categorized by date into a corresponding partitioned Hive table schema. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} Using the Hive control Append KM you can validate and transform data in Hive. In the below example, two source Hive tables are joined and mapped to a target Hive table. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} The Hive Transform KM facilitates processing of semi-structured data in Hive. In the below example, the data from weblog is processed using a Perl script and mapped to target Hive table. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} Using the Oracle Loader for Hadoop (OLH) KM you can load data from Hive table or HDFS to a corresponding table in Oracle. OLH is available as a standalone product. ODI greatly enhances OLH capability by generating the configuration and mapping files for OLH based on the configuration provided in the interface and KM options. ODI seamlessly invokes OLH when executing the scenario. In the below example, a HDFS file is mapped to a table in Oracle. Development and Deployment:The following diagram illustrates the development and deployment of ODI solution for Big Data. Using the ODI Studio on your development machine create and develop ODI solution for processing Big Data by connecting to a MySQL DB or Oracle database on a BDA machine or Hadoop cluster. Schedule the ODI scenarios to be executed on the ODI agent deployed on the BDA machine or Hadoop cluster. ODI Solution for Big Data provides several exciting new capabilities to facilitate the adoption of Big Data in an enterprise. You can find more information about the Oracle Big Data connectors on OTN. You can find an overview of all the new features introduced in ODI 11.1.1.6 in the following document: ODI 11.1.1.6 New Features Overview

Read the article
Tools of the Trade

- by Ajarn Mark Caldwell

I got pretty excited a couple of days ago when my new laptop arrived. “The new phone books are here! The new phone books are here! I’m a somebody!” - Steve Martin in The Jerk It is a Dell Precision M4500 with an Intel i7 Core 2.8 GHZ running 64-bit Windows 7 with a 15.6” widescreen, 8 GB RAM, 256 GB SSD. For some of you high fliers, this may be nothing to write home about, but compared to the 32–bit Windows XP laptop with 2 GB of RAM and a regular hard disk that I’m coming from, it’s a really nice step forward. I won’t even bore you with the details of the desktop PC I was first given when I started here 5 1/2 years ago. Let’s just say that things have improved. One really nice thing is that while we are definitely running a lean and mean department in terms of staffing, my boss believes in supporting that lean staff with good tools in order to stay lean instead of having to spend even more money on additional employees. Of course, that only goes so far, and at some point you have to add more people in order to get more work done, which is why we are bringing on-board a new employee and a new contract developer next week. But that’s a different story for a different time. But the main topic for this post is to highlight the variety of tools that I use in my job and that you might find useful, too. This is easy to do right now because the process of building up my new laptop from scratch has forced me to assemble a list of software that had to be installed and configured. Keep in mind as you look through this list that I play many roles in our company. My official title is Software Engineering Manager, but in addition to managing the team, I am also an active ASP.NET and SQL developer, the Database Administrator, and 50% of the SAN Administrator team. So, without further ado, here are the tools and some comments about why I use them: Tool Purpose Virtual Clone Drive Easily mount an ISO image as a DVD Drive. This is particularly handy when you are downloading disk images from Microsoft for your tools. SQL Server 2008 R2 Developer Edition We are migrating all of our active systems to SQL 2008 R2. Developer Edition has all the features of Enterprise Edition, but intended for development use. SQL Server 2005 Developer Edition (BIDS ONLY) The migration to SSRS 2008 R2 is just getting started, and in the meantime, maintenance work still has to be done on the reports on our SQL 2005 server. For some reason, you can’t use BIDS from 2008 to write reports for a 2005 server. There is some different format and when you open 2005 reports in 2008 BIDS, it forces you to upgrade, and they can no longer be uploaded to a 2005 server. Hopefully Microsoft will fix this soon in some manner similar to Visual Studio now allows you to pick which version of the .NET Framework you are coding against. Visual Studio 2010 Premium All of our application development is in ASP.NET, and we might as well use the tool designed for it. I’ve used a version of Visual Studio going all the way back to VB 6.0 and Visual Interdev. Vault Professional Client Several years ago we replaced Visual Source Safe with SourceGear Vault (then Fortress, and now Vault Pro), and I love it. It is very reliable with low overhead - perfect for a small to medium size development team. And being a small ISV, their support is exceptional. Red-Gate Developer Bundle with the SQL Source Control update for Vault I first used, and fell in love with, SQL Prompt shortly before Red-Gate bought it, and then Red-Gate’s first release made me love it even more. SQL Refactor (which has since been rolled into the latest version of SQL Prompt) has saved me many hours and migraine’s trying to understand somebody else’s code when their indenting was nonexistant, or worse, irrational. SQL Compare has been awesome for troubleshooting potential schema issues between different instances of system databases. SQL Data Compare helped us identify the cause behind a bug which appeared in PROD but could not be reproduced in a nearly (but not quite exactly) identical copy in UAT. And the newest tool we are embracing: SQL Source Control. I blogged about it here (and here, and here) last December. This is really going to help us keep each developer’s copy of the database in sync with one another. Fiddler Helps you watch the whole traffic stream on web visits. Haven’t used it a lot, but it did help me track down some odd 404 errors we were finding in our own application logs. Has some other JavaScript troubleshooting capabilities, but some of its usefulness has been supplanted by the Developer Tools option in IE8. Funduc Search & Replace Find any string anywhere in a mound of source code really, really fast. Does RegEx searches, if you understand that foreign language. Has really helped with some refactoring work to pinpoint, for example, everywhere a particular stored procedure is referenced, whether in .NET code or other SQL procedures (which we have in script files). Provides in-context preview of the search results. Fantastic tool, and a bargain price. SciTE SciTE is a Scintilla based Text Editor and it is a fantastic, light-weight tool for quickly reviewing (or writing) program code, SQL scripts, and extract files. It has language-specific syntax highlighting. I used it to write several batch and CMD programs a year ago, and to examine data extract files for exchanging information with other systems. Extremely handy are the options to View End of Line and View Whitespace. Ever receive a file that is supposed to use CRLF as an end-of-line marker, but really only has CRs? SciTE will quickly make that visible. Infragistics Controls We do a lot of ASP.NET development, and frequently use the WebGrid, WebTab, and date picker controls. We will likely be implementing the Hierarchical Data Grid soon. Infragistics has control suites for WebForms, WinForms, Silverlight, and coming soon MVC/JQuery. WinZip - WITH Command-Line add-in The classic compression program with a great command-line interface that allows me to build those CMD (and soon PowerShell) programs for automated compression jobs. Our versioned Build packages are zip files. XML Notepad Haven’t used this a lot myself, but one of my team really likes it for examining large XML files. LINQPad Again, haven’t used this one a lot, but it was recommended to me for learning and practicing my LINQ skills which will come in handy as we implement Entity Framework. SQL Sentry Plan Explorer SQL Server Show Plan on steroids. Great for helping you focus on the parts of a large query that are of most importance. Also great for just compressing the graphical plan into more readable layout. Araxis Merge A great DIFF and Merge tool. SourceGear provides a great tool called DiffMerge that we use all the time, but occasionally, I like the cross-edit capabilities of Araxis Merge. For a while, we also produced DIFF reports in HTML that showed all the changes that occurred between two releases. This was most important when we were putting out very small, but very important hot fixes on a very politically hot system. The reports produced by Araxis Merge gave the Director of IS assurance that we were not accidentally introducing ripples throughout the system with our releases. Idera SQL Admin Toolset A great collection of tools including a password checker to help analyze your SQL Server for weak user passwords, a Backup Status tool to quickly scan a large list of servers and databases to identify any that are overdue for backups. Particularly helpful for highlighting new databases that have been deployed without getting included in your backup processing. I also like Space Analyzer to keep an eye on disk space consumed by database files. Idera SQL Job Manager This free tool provides a nice calendar view of SQL Server Job Schedules, but to a degree, you also get what you pay for. We will be purchasing SQL Sentry Event Manager later this year as an even better job schedule reviewer/manager. But in the meantime, this at least gives me a good view on potential resource conflicts across multiple instances of SQL Server. DBFViewer 2000 I inherited a couple of FoxPro databases that I have to keep an eye on occasionally and have not yet been able to migrate them to SQL Server. Balsamiq Mockups We are still in evaluation-mode on this tool, but I really like it as a quick UI mockup tool that does not require Visual Studio, so someone other than a programmer can do UI design. The interface looks hand-drawn which definitely has some psychological benefits when communicating to users, too. FeedDemon I have to stay on top of my WAY TOO MANY blog subscriptions somehow. I may read blogs on a couple of different computers, and FeedDemon’s integration with Google Reader allows me to keep them all in sync. I don’t particularly like the Google Reader interface, or the fact that it always wanted to mark articles as read just because I scrolled past them. FeedDemon solves this problem for me, and provides a multi-tabbed interface which is good because fairly frequently one blog will link to something else I want to read, and I can end up with a half-dozen open tabs all from one article. Synergy+ In my office, I run four monitors across two computers all with one mouse and keyboard. Synergy is the magic software that makes this work. TweetDeck I’m not the most active Tweeter in the world, but when I want to check-in with the Twitterverse, this really helps. I have found the #sqlhelp and #PoshHelp hash tags particularly useful, and I also have columns setup to make it easy to monitor #sqlpass, #PASSProfDev, and short term events like #sqlsat68. Whew! That’s a lot. No wonder it took me a couple of days to get everything setup the way I wanted it. Oh, that and actually getting some work accomplished at the same time. Anyway, I know that is a huge dump of info, and most people never make it here to the end, so for those who did, let me say, CONGRATULATIONS, you made it! I hope you’ll find a new tool or two to make your work life a little easier.

Read the article

Search Results

Search found 14501 results on 581 pages for 'pierre julien bringer 2008'.

Page 325/581 | < Previous Page | 321 322 323 324 325 326 327 328 329 330 331 332 | Next Page >

- by julien.testut

- by jean-pierre.dijcks

- by jean-pierre.dijcks

- by Etienne Tremblay

- by jean-pierre.dijcks

- by DigiMortal

- by Pinal Dave

- by ssmantha

- by jean-pierre.dijcks

- by Julien L

- by Julien Haye

- by tonyrogerson

- by Paul Nielsen

- by Jean-Pierre Dijcks

- by Sahil Malik

- by jean-pierre.dijcks

- by James Rogers

- by jean-pierre.dijcks

- by jean-pierre.dijcks

- by Jeremy

- by Liam Westley

- by hailey

- by Julien Testut

- by Ajarn Mark Caldwell

< Previous Page | 321 322 323 324 325 326 327 328 329 330 331 332 | Next Page >