Search Results

Search found 956 results on 39 pages for 'querying'.

Page 24/39 | < Previous Page | 20 21 22 23 24 25 26 27 28 29 30 31 | Next Page >

Incremental Statistics Maintenance – what statistics will be gathered after DML occurs on the table?

- by Maria Colgan

Incremental statistics maintenance was introduced in Oracle Database 11g to improve the performance of gathering statistics on large partitioned table. When incremental statistics maintenance is enabled for a partitioned table, oracle accurately generated global level statistics by aggregating partition level statistics. As more people begin to adopt this functionality we have gotten more questions around how they expected incremental statistics to behave in a given scenario. For example, last week we got a question around what partitions should have statistics gathered on them after DML has occurred on the table? The person who asked the question assumed that statistics would only be gathered on partitions that had stale statistics (10% of the rows in the partition had changed). However, what they actually saw when they did a DBMS_STATS.GATHER_TABLE_STATS was all of the partitions that had been affected by the DML had statistics re-gathered on them. This is the expected behavior, incremental statistics maintenance is suppose to yield the same statistics as gathering table statistics from scratch, just faster. This means incremental statistics maintenance needs to gather statistics on any partition that will change the global or table level statistics. For instance, the min or max value for a column could change after just one row is inserted or updated in the table. It might easier to demonstrate this using an example. Let’s take the ORDERS2 table, which is partitioned by month on order_date. We will begin by enabling incremental statistics for the table and gathering statistics on the table. After the statistics gather the last_analyzed date for the table and all of the partitions now show 13-Mar-12. And we now have the following column statistics for the ORDERS2 table. We can also confirm that we really did use incremental statistics by querying the dictionary table sys.HIST_HEAD$, which should have an entry for each column in the ORDERS2 table. So, now that we have established a good baseline, let’s move on to the DML. Information is loaded into the latest partition of the ORDERS2 table once a month. Existing orders maybe also be update to reflect changes in their status. Let’s assume the following transactions take place on the ORDERS2 table this month. After these transactions have occurred we need to re-gather statistic since the partition ORDERS_MAR_2012 now has rows in it and the number of distinct values and the maximum value for the STATUS column have also changed. Now if we look at the last_analyzed date for the table and the partitions, we will see that the global statistics and the statistics on the partitions where rows have changed due to the update (ORDERS_FEB_2012) and the data load (ORDERS_MAR_2012) have been updated. The column statistics also reflect the changes with the number of distinct values in the status column increase to reflect the update. So, incremental statistics maintenance will gather statistics on any partition, whose data has changed and that change will impact the global level statistics.

Read the article
Analysis Services Tabular books #ssas #tabular

- by Marco Russo (SQLBI)

Many people are looking for books about Analysis Services Tabular. Today there are two books available and they complement each other: Microsoft SQL Server 2012 Analysis Services: The BISM Tabular Model by Marco Russo, Alberto Ferrari and Chris Webb Applied Microsoft SQL Server 2012 Analysis Services: Tabular Modeling by Teo Lachev The book I wrote with Alberto and Chris is a complete guide to create tabular models and has a good coverage about DAX, including how to use it for enriching a semantic model with calculated columns and measures and how to use it for querying a Tabular model. In my experience, DAX as a query language is a very interesting option for custom analytical applications that requires a fast calculation engine, or simply for standard reports running in Reporting Services and accessing a Tabular model. You can freely preview the table of content and read some excerpts from the book on Safari Books Online. The book is in printing and should be shipped within mid-July, so finally it will be very soon on the shelf of all the people already preordered it! The Teo Lachev’s book, covers the full spectrum of Tabular models provided by Microsoft: starting with self-service BI, you have users creating a model with PowerPivot for Excel, publishing it to PowerPivot for SharePoint and exploring data by using Power View; then, the PowerPivot for Excel model can be imported in a Tabular model and published in Analysis Services, adding more control on the model through row-level security and partitioning, for example. Teo’s book follows a step-by-step approach describing each feature that is very good for a beginner that is new to PowerPivot and/or to BISM Tabular. If you need to get the big picture and to start using the products that are part of the new Microsoft wave of BI products, the Teo’s book is for you. After you read the book from Teo, or if you already have a certain confidence with PowerPivot or BISM Tabular and you want to go deeper about internals, best practices, design patterns in just BISM Tabular, then our book is a suggested read: it contains several chapters about DAX, includes discussions about new opportunities in data model design offered by Tabular models, and also provides examples of optimizations you can obtain in DAX and best practices in data modeling and queries. It might seem strange that an author write a review of a book that might seem to compete with his one, but in reality these two books complement each other and are not alternatives. If you have any doubt, buy both: you will be not disappointed! Moreover, Amazon usually offers you a deal to buy three books, including the Visualizing Data with Microsoft Power View, another good choice for getting all the details about Power View.

Read the article
Overview of getting and setting the URL and parts of the URL using angularjs and/or Javascript

- by Sandy Good

Getting and Setting the URL, and different parts of the URL are a basic part of Application Design. For Page Navigation Deep Linking Providing a link to the user Querying Data Passing information to other pages Both angularjs and javascript provide ways to get/set the URL and parts of the URL. I'm looking for the following information: Situation: Show a simple URL in the browser address bar to the user Provide a more detailed URL with string parameters to the page that the user will not see. In other words, two different URLs will be used, one simple one that the user sees in the browser, a more detailed one available to the page on load. Get URL info with PHP when then page intially loads, both don't reload the PHP page when the user needs more detailed info that is already loaded but not displayed yet. Set the URL with a more detailed URL for deep linking as the user drills down to more specific information. Get URL info in a controller or JavaSript when angularjs detects a change in the URL with routing. Hash or Query String or Both? Should I use a hash # in the URL, a string ?= or both? Here is what I currently know and what I want: A Query String HTTP:\\www.name.com?mykey=itemID will prevent angularjs from reloading the page. So, I can change the URL by adding/changing the string at the end, thereby providing new info to the page, and keep the page from reloading. I can change the URL and force a page reload with: window.location.href = "#Store/" + argUserPubId + "?itemID=home"; If home is the itemID string, I want code to simply load the page, and not display more detailed information. If there is a real itemID in the URL query string, I want the code to display the more detailed information. Code from angularjs will run either from the controller specified in the routing, or a controller specified in the HTML, or both. The angularjs code specified in the routing seems to run first, before the code specified in the HTML. A different URL for the page can be used in angularjs templateURL: than the URL that was sent to the browser address bar. when('/Store/:StoreId', { templateUrl: function(params){return 'Client_Pages/Stores.php?storeID=' + params.StoreId;}, controller: 'storeParseData' }). The above code detects http:\\www.name.com\Store\StoreID in the browser, but SENDS http:\\www.name.com\Client_Pages/Stores.php?storeID=StoreID to the page. In the above code, a function is used for the angularjs routing templateURL: to dynamically set the templateURL. So, when the user clicks something to see details of an item, how should I configure the URL? Should I use angularjs $location or window.location.href ? Should I use a longer URL with more parameters, a hash bang, or a query string? Should I use: http:\\www.name.com\Store\StoreID\ItemID or http:\\www.name.com\Store\StoreID#ItemID or http:\\www.name.com\Store\StoreID?ItemID or http:\\www.name.com\Store#StoreID?ItemID or Something else?

Read the article
Database design and performance impact

- by Craige

I have a database design issue that I'm not quite sure how to approach, nor if the benefits out weigh the costs. I'm hoping some P.SE members can give some feedback on my suggested design, as well as any similar experiences they may have came across. As it goes, I am building an application that has large reporting demands. Speed is an important issue, as there will be peak usages throughout the year. This application/database has a multiple-level, many-to-many relationship. eg object a object b object c object d object b has relationship to object a object c has relationship to object b, a object d has relationship to object c, b, a Theoretically, this could go on for unlimited levels, though logic dictates it could only go so far. My idea here, to speed up reporting, would be to create a syndicate table that acts as a global many-to-many join table. In this table (with the given example), one might see: +----------+-----------+---------+ | child_id | parent_id | type_id | +----------+-----------+---------+ | b | a | 1 | | c | b | 2 | | c | a | 3 | | d | c | 4 | | d | b | 5 | | d | a | 6 | +----------+-----------+---------+ Where a, b, c and d would translate to their respective ID's in their respective tables. So, for ease of reporting all of a which exist on object d, one could query SELECT * FROM `syndicates` ... JOINS TO child and parent tables ... WHERE parent_id=a and type_id=6; rather than having a query with a join to each level up the chain. The Problem This table grows exponentially, and in a given year, could easily grow past 20,000 records for one client. Given multiple clients over multiple years, this table will VERY quickly explode to millions of records and beyond. Now, the database will, in time, be partitioned across multiple servers, but I would like (as most would) to keep the number of servers as low as possible while still offering flexibility. Also writes and updates would be exponentially longer (though possibly not noticeable to the end user) as there would be multiple inserts/updates/scans on this table to keep it in sync. Am I going in the right direction here, or am I way off track. What would you do in a similar situation? This solution seems overly complex, but allows the greatest flexibility and fastest read-operations. Sidenote 1 - This structure allows me to add new levels to the tree easily. Sidenote 2 - The database querying for this database is done through an ORM framework.

Read the article
Entity Framework 4.0: Optimal and horrible SQL

- by DigiMortal

Lately I had Entity Framework 4.0 session where I introduced new features of Entity Framework. During session I found out with audience how Entity Framework 4.0 can generate optimized SQL. After session I also showed guys one horrible example about how awful SQL can be generated by Entity Framework. In this posting I will cover both examples. Optimal SQL Before going to code take a look at following model. There is class called Event and I will use this class in my query. Here is the LINQ To Entities query that uses small anonymous type. var query = from e in _context.Events select new { Id = e.Id, Title = e.Title }; Debug.WriteLine(((ObjectQuery)query).ToTraceString()); Running this code gives us the following SQL. SELECT [Extent1].[event_id] AS [event_id], [Extent1].[title] AS [title] FROM [dbo].[events] AS [Extent1] This is really small – no additional fields in SELECT clause. Nice, isn’t it? Horrible SQL Ayende Rahien blog shows us darker side of Entiry Framework 4.0 queries. You can find comparison betwenn NHibernate, LINQ To SQL and LINQ To Entities from posting What happens behind the scenes: NHibernate, Linq to SQL, Entity Framework scenario analysis. In this posting I will show you the resulting query and let you think how much better it can be done. Well, it is not something we want to see running in our servers. I hope that EF team improves generated SQL to acceptable level before Visual Studio 2010 is released. There is also morale of this example: you should always check out the queries that O/R-mapper generates. Behind the curtains it may silently generate queries that perform badly and in this case you need to optimize you data querying strategy. Conclusion Entity Framework 4.0 is new product with a lot of new features and it is clear that not everything is 100% super in its first release. But it still great step forward and I hope that on 12.04.2010 we have new promising O/R-mapper available to use in our projects. If you want to read more about Entity Framework 4.0 and Visual Studio 2010 then please feel free to follow this link to list of my Visual Studio 2010 and .NET Framework 4.0 postings.

Read the article
Framework 4 Features: User Propogation to the Database

- by Anthony Shorten

Once of the features I mentioned in a previous entry was the ability for Oracle Utilities Application Framework V4 to automatically propogate the end user to the database connection. This bears more explanation. In the past releases of the Oracle Utilities Application Framework, all database connections are pooled and shared within a channel of access. So for example, the online connections on the Business Application Server share a common pool of connections and the batch in a thread pool shares a seperate pool of connections. The connections are pooled for performance reasons (the most expensive part of a typical transaction is opening and closing connections so we save time by having them ready beforehand). The idea is that when a business function needs some SQL to be execute it takes a spare connection from the pool, executes the SQL and then returns the connection back to the pool for reuse. Unfortunelty to support the pool being started and ready before the transactions arrives means that you need to have a shared userid (as you dont know the users who need them beforehand). Therefore each connection uses the same database user to execute the SQL it needs. This is acceptable for executing transactions, generally but does not allow the DBA or other tools to ascertain which end user is actually running the transaction. In Oracle Utilities Application Framework V4, we now set the CLIENT_IDENTIFIER to the end userid (not the Login Id) when the connection is taken from the pool and used and reset it back to blank when returned to the pool. The CLIENT_IDENTIFIER is a feature that is present in the Oracle Database connection information. From a monitoring perspective, when a connection to the database is actively running SQL, the end user is now able to be determined by querying the CLIENT_IDENTIFIER on the session object within the database. This can be done in the DBA's favorite monitoring tool (even just some SQL on the v$session table is enough). This has other implications as well. Oracle sells a lot of other security addons to the database and so do third parties. If a site wants to have additional levels of security or auditing in the database then the CLIENT_IDENTIFIER, if supported, is now available to be recorded or used by those products to provide additional levels of security. This facility was one of the highly "nice to haves" that customers would ask us about so we now allow it to be used to allow finer grained monitoring and additional security facilities. Note: This facility is only available for customers using the Oracle Database versions of our products.

Read the article
How to Name Linked Servers

- by Bill Graziano

I did another SQL Server migration over the weekend that dealt with linked servers. I’ve seen all kinds of odd naming schemes and there are a few I like and a few I suggest you avoid. Don’t name your linked server for its IP address. At some point whatever is on the other end of that IP address will move. You’ll probably need to point your linked server to a new IP address but not change the name of the linked server. And then you’ve completely lost any context around this. Bonus points if a new SQL Server eventually ends up at the old IP address further adding confusion when you’re trying to troubleshoot. Don’t name your linked server based on its instance name. This one is less obvious. It sounds nice to have a linked server named [VSRV1\SQLTRAN01]. You know what it is and it’s easy to use. It’s less nice when you’ve got 200 stored procedures that all reference this linked server but the database they reference has moved to a new instance. Now when you query this you’re actually querying a different instance. (Please note: I’m not saying it’s a good idea to have 200 stored procedures that all reference a linked server. I’m just saying it’s not all that uncommon.) Consider naming your linked server something that you can easily search on. See my note above. You can also get around this by always enclosing the name in brackets. That is harder to enforce unless you use some odd characters in it. Consider naming your linked server based on the function. For example, I’ve had some luck having a linked server named [DW] that points to our data warehouse server. That server can change names or physically move and all I need to do is update the linked server to point to the new destination. The descriptive name of the linked server is still accurate. No code needs to change and people still know what it is just by looking at it. Consider naming your linked server for the database. I’m still thinking through this one. It may mean you have multiple linked servers that point to the same instance. I’ve found that database names rarely change. It also makes it easier to move individual databases to new servers. Consider pointing your linked servers to DNS entries and not IP addresses. I’ve done this for reporting databases and had some success. Especially for read-only snapshots that can get created on the main database or on the mirror. What issues have you had with linked server names? What has worked for you? Where are the holes in my approach?

Read the article
Is inline SQL still classed as bad practice now that we have Micro ORMs?

- by Grofit

This is a bit of an open ended question but I wanted some opinions, as I grew up in a world where inline SQL scripts were the norm, then we were all made very aware of SQL injection based issues, and how fragile the sql was when doing string manipulations all over the place. Then came the dawn of the ORM where you were explaining the query to the ORM and letting it generate its own SQL, which in a lot of cases was not optimal but was safe and easy. Another good thing about ORMs or database abstraction layers were that the SQL was generated with its database engine in mind, so I could use Hibernate/Nhibernate with MSSQL, MYSQL and my code never changed it was just a configuration detail. Now fast forward to current day, where Micro ORMs seem to be winning over more developers I was wondering why we have seemingly taken a U-Turn on the whole in-line sql subject. I must admit I do like the idea of no ORM config files and being able to write my query in a more optimal manner but it feels like I am opening myself back up to the old vulnerabilities such as SQL injection and I am also tying myself to one database engine so if I want my software to support multiple database engines I would need to do some more string hackery which seems to then start to make code unreadable and more fragile. (Just before someone mentions it I know you can use parameter based arguments with most micro orms which offers protection in most cases from sql injection) So what are peoples opinions on this sort of thing? I am using Dapper as my Micro ORM in this instance and NHibernate as my regular ORM in this scenario, however most in each field are quite similar. What I term as inline sql is SQL strings within source code. There used to be design debates over SQL strings in source code detracting from the fundamental intent of the logic, which is why statically typed linq style queries became so popular its still just 1 language, but with lets say C# and Sql in one page you have 2 languages intermingled in your raw source code now. Just to clarify, the SQL injection is just one of the known issues with using sql strings, I already mention you can stop this from happening with parameter based queries, however I highlight other issues with having SQL queries ingrained in your source code, such as the lack of DB Vendor abstraction as well as losing any level of compile time error capturing on string based queries, these are all issues which we managed to side step with the dawn of ORMs with their higher level querying functionality, such as HQL or LINQ etc (not all of the issues but most of them). So I am less focused on the individual highlighted issues and more the bigger picture of is it now becoming more acceptable to have SQL strings directly in your source code again, as most Micro ORMs use this mechanism. Here is a similar question which has a few different view points, although is more about the inline sql without the micro orm context: http://stackoverflow.com/questions/5303746/is-inline-sql-hard-coding

Read the article
Windows Azure Recipe: Big Data

- by Clint Edmonson

As the name implies, what we’re talking about here is the explosion of electronic data that comes from huge volumes of transactions, devices, and sensors being captured by businesses today. This data often comes in unstructured formats and/or too fast for us to effectively process in real time. Collectively, we call these the 4 big data V’s: Volume, Velocity, Variety, and Variability. These qualities make this type of data best managed by NoSQL systems like Hadoop, rather than by conventional Relational Database Management System (RDBMS). We know that there are patterns hidden inside this data that might provide competitive insight into market trends. The key is knowing when and how to leverage these “No SQL” tools combined with traditional business such as SQL-based relational databases and warehouses and other business intelligence tools. Drivers Petabyte scale data collection and storage Business intelligence and insight Solution The sketch below shows one of many big data solutions using Hadoop’s unique highly scalable storage and parallel processing capabilities combined with Microsoft Office’s Business Intelligence Components to access the data in the cluster. Ingredients Hadoop – this big data industry heavyweight provides both large scale data storage infrastructure and a highly parallelized map-reduce processing engine to crunch through the data efficiently. Here are the key pieces of the environment: Pig - a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. Mahout - a machine learning library with algorithms for clustering, classification and batch based collaborative filtering that are implemented on top of Apache Hadoop using the map/reduce paradigm. Hive - data warehouse software built on top of Apache Hadoop that facilitates querying and managing large datasets residing in distributed storage. Directly accessible to Microsoft Office and other consumers via add-ins and the Hive ODBC data driver. Pegasus - a Peta-scale graph mining system that runs in parallel, distributed manner on top of Hadoop and that provides algorithms for important graph mining tasks such as Degree, PageRank, Random Walk with Restart (RWR), Radius, and Connected Components. Sqoop - a tool designed for efficiently transferring bulk data between Apache Hadoop and structured data stores such as relational databases. Flume - a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large log data amounts to HDFS. Database – directly accessible to Hadoop via the Sqoop based Microsoft SQL Server Connector for Apache Hadoop, data can be efficiently transferred to traditional relational data stores for replication, reporting, or other needs. Reporting – provides easily consumable reporting when combined with a database being fed from the Hadoop environment. Training These links point to online Windows Azure training labs where you can learn more about the individual ingredients described above. Hadoop Learning Resources (20+ tutorials and labs) Huge collection of resources for learning about all aspects of Apache Hadoop-based development on Windows Azure and the Hadoop and Windows Azure Ecosystems SQL Azure (7 labs) Microsoft SQL Azure delivers on the Microsoft Data Platform vision of extending the SQL Server capabilities to the cloud as web-based services, enabling you to store structured, semi-structured, and unstructured data. See my Windows Azure Resource Guide for more guidance on how to get started, including links web portals, training kits, samples, and blogs related to Windows Azure.

Read the article
Tellago releases a RESTful API for BizTalk Server business rules

- by Charles Young

Jesus Rodriguez has blogged recently on Tellago Devlabs' release of an open source RESTful API for BizTalk Server Business Rules. This is an excellent addition to the BizTalk ecosystem and I congratulate Tellago on their work. See http://weblogs.asp.net/gsusx/archive/2011/02/08/tellago-devlabs-a-restful-api-for-biztalk-server-business-rules.aspx The Microsoft BRE was originally designed to be used as an embedded library in .NET applications. This is reflected in the implementation of the Rules Engine Update (REU) Service which is a TCP/IP service that is hosted by a Windows service running locally on each BizTalk box. The job of the REU is to distribute rules, managed and held in a central database repository, across the various servers in a BizTalk group. The engine is therefore distributed on each box, rather than exploited behind a central rules service. This model is all very well, but proves quite restrictive in enterprise environments. The problem is that the BRE can only run legally on licensed BizTalk boxes. Increasingly we need to deliver rules capabilities across a more widely distributed environment. For example, in the project I am working on currently, we need to surface decisioning capabilities for use within WF workflow services running under AppFabric on non-BTS boxes. The BRE does not, currently, offer any centralised rule service facilities out of the box, and hence you have to roll your own (and then run your rules services on BTS boxes which has raised a few eyebrows on my current project, as all other WCF services run on a dedicated server farm ). Tellago's API addresses this by providing a RESTful API for querying the rules repository and executing rule sets against XML passed in the request payload. As Jesus points out in his post, using a RESTful approach hugely increases the reach of BRE-based decisioning, allowing simple invocation from code written in dynamic languages, mobile devices, etc. We developed our own SOAP-based general-purpose rules service to handle scenarios such as the one we face on my current project. SOAP is arguably better suited to enterprise service bus environments (please don't 'flame' me - I refuse to engage in the RESTFul vs. SOAP war). For example, on my current project we use claims based authorisation across the entire service bus and use WIF and WS-Federation for this purpose. We have extended this to the rules service. I can't release the code for commercial reasons :-( but this approach allows us to legally extend the reach of BRE far beyond the confines of the BizTalk boxes on which it runs and to provide general purpose decisioning capabilities on the bus. So, well done Tellago. I haven't had a chance to play with the API yet, but am looking forward to doing so.

Read the article
Designing for an algorithm that reports progress

- by Stefano Borini

I have an iterative algorithm and I want to print the progress. However, I may also want it not to print any information, or to print it in a different way, or do other logic. In an object oriented language, I would perform the following solutions: Solution 1: virtual method have the algorithm class MyAlgoClass which implements the algo. The class also implements a virtual reportIteration(iterInfo) method which is empty and can be reimplemented. Subclass the MyAlgoClass and override reportIteration so that it does what it needs to do. This solution allows you to carry additional information (for example, the file unit) in the reimplemented class. I don't like this method because it clumps together two functionalities that may be unrelated, but in GUI apps it may be ok. Solution 2: observer pattern the algorithm class has a register(Observer) method, keeps a list of the registered observers and takes care of calling notify() on each of them. Observer::notify() needs a way to get the information from the Subject, so it either has two parameters, one with the Subject and the other with the data the Subject may pass, or just the Subject and the Observer is now in charge of querying it to fetch the relevant information. Solution 3: callbacks I tend to see the callback method as a lightweight observer. Instead of passing an object, you pass a callback, which may be a plain function, but also an instance method in those languages that allow it (for example, in python you can because passing an instance method will remain bound to the instance). C++ however does not allow it, because if you pass a pointer to an instance method, this will not be defined. Please correct me on this regard, my C++ is quite old. The problem with callbacks is that generally you have to pass them together with the data you want the callback to be invoked with. Callbacks don't store state, so you have to pass both the callback and the state to the Subject in order to find it at callback execution, together with any additional data the Subject may provide about the event is reporting. Question My question is relative to the fact that I need to implement the opening problem in a language that is not object oriented, namely Fortran 95, and I am fighting with my usual reasoning which is based on python assumptions and style. I think that in Fortran the concept is similar to C, with the additional trouble that in C you can store a function pointer, while in Fortran 95 you can only pass it around. Do you have any comments, suggestions, tips, and quirks on this regard (in C, C++, Fortran and python, but also in any other language, so to have a comparison of language features that can be exploited on this regard) on how to design for an algorithm that must report progress to some external entity, using state from both the algorithm and the external entity ?

Read the article
New Source Database Added for EBS 12 + 11gR2 Transportable Tablespaces

- by John Abraham

The Transportable Tablespaces (TTS) process was originally certified for the migration of E-Business Suite R12 databases going from a source database of 11gR1 or 11gR2 to a target of 11gR2. This requirement has now been expanded to include a source database of 10gR2 (10.2.0.5) - this will potentially save time for existing 10gR2 customers as they can remove on a crucial upgrade step prior to performing the platform migration. The migration process requires an updated Controlled patch delivered by the Oracle E-Business Suite Platform Engineering team, i.e. it requires a password obtainable from Oracle Support. We released the patch in this manner to gauge uptake, and help identify and monitor any customer issues due to the nature of this technology. This patch has been updated to now include supporting 10gR2 as a source database. Does it meet your requirements?Note that for migration across platforms of the same "endian" format, users are advised to use the Transportable Database (TDB) migration process instead for large databases. The "endian-ness" target platforms can be verified by querying the view V$DB_TRANSPORTABLE_PLATFORM using SQL*Plus (connected as sysdba) on the source platform:SQL>select platform_name from v$db_transportable_platform;If the intended target platform does not appear in the output, it means that it is of a different endian format from the source. Consequently. database migration will need to be performed via Transportable Tablespaces (for large databases) or export/import.The use of Transportable Tablespaces can greatly speed up the migration of the data portion of the database. However, it does not affect metadata, which must still be migrated using export/import. We recommend that users initially perform a test migration on their database, using export/import with the 'metrics=y' parameter. This will help identify the relative amounts of data and metadata, and provide a basis for assessing likely gains in timing. In general, the larger the amount of data (compared to metadata), the greater the reduction in downtime that can be expected from using TTS as a migration process. For smaller databases or for those that have relatively small data compared to metadata, TTS will not be as beneficial for cross endian migration and the use of export/import (datapump) for the whole database is recommended. Where can I find more information? Using Transportable Tablespaces to Migrate Oracle E-Business Suite Release 12 Using Oracle Database 11g Release 2 Enterprise Edition (My Oracle Support Document 1311487.1) Oracle Database Administrator's Guide 11g Release 2 (11.2) Related Articles Database Migration using 11gR2 Transportable Tablespaces Now Certified for EBS 12 New Source Databases Added for Transportable Tablespaces + EBS 11i 10gR2 Transportable Tablespaces Certified for EBS 11i Migrating E-Business Suite Release 11i Databases Between Platforms Migrating E-Business Suite Release 12 Databases Between Platforms

Read the article
Is it evil to model JSON responses to classes when they are mostly smilar?

- by Aybe

Here's the problem : While implementing a C# wrapper for an online API (Discogs) I've been faced to a dilemma : quite often the responses returned have mostly similar members and while modeling these responses to classes, some questions surfaces on which way to go would be the best. Example : Querying for a 'release' or a 'master' will return an object that contains an array of 'artist', however these 'artists' do not exactly have the same members. Currently I decided to represent these 'artists' as a single 'Artist' class, against having respective 'ReleaseArtist' and 'MasterArtist' classes which soon becomes very confusing even though another problem arises : when a category (master or release) does not return these members, they will be null. Though it might sound confusing as well I find it less confusing than the former situation as I've tackled the problem by simply not showing null members when visualizing these objects. Is this the right approach to follow ? An example of these differences : public class Artist { public List<Alias> Aliases { get; set; } public string DataQuality { get; set; } public List<Image> Images { get; set; } public string Name { get; set; } public List<string> NameVariations { get; set; } public string Profile { get; set; } public string Realname { get; set; } public string ReleasesUrl { get; set; } public string ResourceUrl { get; set; } public string Uri { get; set; } public List<string> Urls { get; set; } } public class ReleaseArtist { public string Join { get; set; } public string Name { get; set; } public string Anv { get; set; } public string Tracks { get; set; } public string Role { get; set; } public string ResourceUrl { get; set; } public int Id { get; set; } }

Read the article
The most dangerous SQL Script in the world!

- by DrJohn

In my last blog entry, I outlined how to automate SQL Server database builds from concatenated SQL Scripts. However, I did not mention how I ensure the database is clean before I rebuild it. Clearly a simple DROP/CREATE DATABASE command would suffice; but you may not have permission to execute such commands, especially in a corporate environment controlled by a centralised DBA team. However, you should at least have database owner permissions on the development database so you can actually do your job! Then you can employ my universal "drop all" script which will clear down your database before you run your SQL Scripts to rebuild all the database objects. Why start with a clean database? During the development process, it is all too easy to leave old objects hanging around in the database which can have unforeseen consequences. For example, when you rename a table you may forget to delete the old table and change all the related views to use the new table. Clearly this will mean an end-user querying the views will get the wrong data and your reputation will take a nose dive as a result! Starting with a clean, empty database and then building all your database objects using SQL Scripts using the technique outlined in my previous blog means you know exactly what you have in your database. The database can then be repopulated using SSIS and bingo; you have a data mart "to go". My universal "drop all" SQL Script To ensure you start with a clean database run my universal "drop all" script which you can download from here: 100_drop_all.zip By using the database catalog views, the script finds and drops all of the following database objects: Foreign key relationships Stored procedures Triggers Database triggers Views Tables Functions Partition schemes Partition functions XML Schema Collections Schemas Types Service broker services Service broker queues Service broker contracts Service broker message types SQLCLR assemblies There are two optional sections to the script: drop users and drop roles. You may use these at your peril, particularly as you may well remove your own permissions! Note that the script has a verbose mode which displays the SQL commands it is executing. This can be switched on by setting @debug=1. Running this script against one of the system databases is certainly not recommended! So I advise you to keep a USE database statement at the top of the file. Good luck and be careful!!

Read the article
Hadoop, NOSQL, and the Relational Model

- by Phil Factor

(Guest Editorial for the IT Pro/SysAdmin Newsletter)Whereas Relational Databases fit the world of commerce like a glove, it is useless to pretend that they are a perfect fit for all human endeavours. Although, with SQL Server, we’ve made great strides with indexing text, in processing spatial data and processing markup, there is still a problem in dealing efficiently with large volumes of ephemeral semi-structured data. Key-value stores such as Cassandra, Project Voldemort, and Riak are of great value for ephemeral data, and seem of equal value as a data-feed that provides aggregations to an RDBMS. However, the Document databases such as MongoDB and CouchDB are ideal for semi-structured data for which no fixed schema exists; analytics and logging are obvious examples. NoSQL products, such as MongoDB, tackle the semi-structured data problem with panache. MongoDB is designed with a simple document-oriented data model that scales horizontally across multiple servers. It doesn’t impose a schema, and relies on the application to enforce the data structure. This is another take on the old ‘EAV’ problem (where you don’t know in advance all the attributes of a particular entity) It uses a clever replica set design that allows automatic failover, and uses journaling for data durability. It allows indexing and ad-hoc querying. However, for SQL Server users, the obvious choice for handling semi-structured data is Apache Hadoop. There will soon be an ODBC Driver for Apache Hive .and an Add-in for Excel. Additionally, there are now two Hadoop-based connectors for SQL Server; the Apache Hadoop connector for SQL Server 2008 R2, and the SQL Server Parallel Data Warehouse (PDW) connector. We can connect to Hadoop process the semi-structured data and then store it in SQL Server. For one steeped in the culture of Relational SQL Databases, I might be expected to throw up my hands in the air in a gesture of contempt for a technology that was, judging by the overblown journalism on the subject, about to make my own profession as archaic as the Saggar makers bottom knocker (a potter’s assistant who helped the saggar maker to make the bottom of the saggar by placing clay in a metal hoop and bashing it). However, on the contrary, I find that I'm delighted with the advances made by the NoSQL databases in the past few years. Having the flow of ideas from the NoSQL providers will knock any trace of complacency out of the providers of Relational Databases and inspire them into back-fitting some features, such as horizontal scaling, with sharding and automatic failover into SQL-based RDBMSs. It will do the breed a power of good to benefit from all this lateral thinking.

Read the article
Are there any concerns with using a static read-only unit of work so that it behaves like a cache?

- by Rowan Freeman

Related question: How do I cache data that rarely changes? I'm making an ASP.NET MVC4 application. On every request the security details about the user will need to be checked with the area/controller/action that they are accessing to see if they are allowed to view it. The security information is stored in the database. For example: User Permission UserPermission Action ActionPermission A "Permission" is a token that is applied to an MVC action to indicate that the token is required in order to access the action. Once a user is given the permission (via the UserPermission table) then they have the token and can therefore access the action. I've been looking in to how to cache this data (since it rarely changes) so that I'm only querying in-memory data and not hitting a database (which is a considerable performance hit at the moment). I've tried storing things in lists, using a caching provider but I either run in to problems or performance doesn't improve. One problem that I constantly run in to is that I'm using lazy loading and dynamic proxies with EntityFramework. This means that even if I ToList() everything and store them somewhere static, the relationships are never populated. For example, User.Permissions is an ICollection but it's always null. I don't want to Include() everything because I'm trying to keep things simple and generic (and easy to modify). One thing I know is that an EntityFramework DbContext is a unit of work that acts with 1st-level caching. That is, for the duration of the unit of work, everything that is accessed is cached in memory. I want to create a read-only DbContext that will exist indefinitely and will only be used to read about permission data. Upon testing this it worked perfectly; my page load times went from 200ms+ to 20ms. I can easily force the data to refresh at certain intervals or simply leave it to refresh when the application pool is recycled. Basically it will behave like a cache. Note that the rest of the application will interact with other contexts that exist per request as normal. Is there any disadvantage to this approach? Could I be doing something different?

Read the article
Sync Google Calendar with SharePoint Calendar

- by dataintegration

The ADO.NET Providers for Google and SharePoint make it easy to retrieve and update data in both Google's web services and SharePoint. This article shows how the SQL interface to data makes it easy to build applications that need to move data from one source to another. The application described here is a demo Windows application that synchronizes calendar events between Google and SharePoint, but the RSSBus Providers can be used to achieve integrations on both the .NET and the Java platforms, including more sophisticated features like full automation. Getting the Events Step 1: Google accounts can have several calendars. Obtain a list of a user's Google Calendars by issuing a query to the Calendars table. For example: SELECT * FROM Calendars. Step 2: In order to get a list of the events from a given Google Calendar, issue a query to the CalendarEvents table while specifying the CalendarId from the Calendars table. The resulting events can be further filtered by using the StartDateTime or EndDateTime columns. For example: SELECT * FROM CalendarEvents WHERE (CalendarId = '[email protected]') AND (StartDateTime >= '1/1/2012') AND (StartDateTime <= '2/1/2012') Step 3: SharePoint stores data in Lists. There are various types of lists, e.g., document lists and calendar lists. A SharePoint account can have several lists of the same type. To find all the calendar lists in SharePoint, use the ListLists stored procedure and inspect the BaseTemplate column. Step 4: The SharePoint data provider models each SharPoint list as a table. Get the events in a particular calendar by querying the table with the same name as the list. The events may be filtered further by specifying the EventDate or EndDate columns. For example: SELECT * FROM Calendar WHERE (EventDate >= '1/1/2012') AND (EventDate <= '2/1/2012') Synchronizing the Events Synchronizing the events is a simple process. Once the events from Google and SharePoint are available they can be compared and synchronized based on user preference. The sample application does this based on user input, but it is easy to create one that does the synchronization automatically. The INSERT, UPDATE, and DELETE statements available in both data providers makes it easy to create, update, or delete events as needed. Pre-Built Demo Application The executable for the demo application can be downloaded here. Note that this demo is built using BETA builds of the ADO.NET Provider for Google V2 and ADO.NET Provider for SharePoint V2, and will expire in 2013. Source Code You can download the full source of the demo application here. You will need the Google ADO.NET Data Provider V2 and the SharePoint ADO.NET Data Provider V2, which can be obtained here.

Read the article
How are Reads Distributed in a Workload

- by Bill Graziano

People have uploaded nearly one millions rows of trace data to TraceTune. That’s enough data to start to look at the results in aggregate. The first thing I want to look at is logical reads. This is the easiest metric to identify and fix. When you upload a trace, I rank each statement based on the total number of logical reads. I also calculate each statement’s percentage of the total logical reads. I do the same thing for CPU, duration and logical writes. When you view a statement you can see all the details like this: This single statement consumed 61.4% of the total logical reads on the system while we were tracing it. I also wanted to see the distribution of reads across statements. That graph looks like this: On average, the highest ranked statement consumed just under 50% of the reads on the system. When I tune a system, I’m usually starting in one of two modes: this “piece” is slow or the whole system is slow. If a given piece (screen, report, query, etc.) is slow you can usually find the specific statements behind it and tune it. You can make that individual piece faster but you may not affect the whole system. When you’re trying to speed up an entire server you need to identity those queries that are using the most disk resources in aggregate. Fixing those will make them faster and it will leave more disk throughput for the rest of the queries. Here are some of the things I’ve learned querying this data: The highest ranked query averages just under 50% of the total reads on the system. The top 3 ranked queries average 73% of the total reads on the system. The top 10 ranked queries average 91% of the total reads on the system. Remember these are averages across all the traces that have been uploaded. And I’m guessing that people mainly upload traces where there are performance problems so your mileage may vary. I also learned that slow queries aren’t the problem. Before I wrote ClearTrace I used to identify queries by filtering on high logical reads using Profiler. That picked out individual queries but those rarely ran often enough to put a large load on the system. If you look at the execution count by rank you’d see that the highest ranked queries also have the highest execution counts. The graph would look very similar to the one above but flatter. These queries don’t look that bad individually but run so often that they hog the disk capacity. The take away from all this is that you really should be tuning the top 10 queries if you want to make your system faster. Tuning individually slow queries will help those specific queries but won’t have much impact on the system as a whole.

Read the article
Assessing Relative Maintainability

- by João Bragança

We (a contractor, actually) are implementing an off the shelf system to replace a legacy homegrown system for the core domain of the company (designing widgets). Unfortunately both systems will have to run concurrently for some time, as the product just isn't ready yet. Also, the decision was made to only migrate some of the widgets from the legacy system, based on date of last sale activity. Later on a new requirement came down: certain people in the company, most of them outside of the widget development context, want to search all widgets. The search results screen has 3 pieces of data: a GUID, a human readable id that is searchable, and a brief description (may need to be searchable in the future). In the widget details, there will be multiple screens. These screens align very well along SOA / bounded context lines - a screen for marketing data, a screen for sales history, etc. UML ahead! I am probably using the wrong kind of arrows here so please forgive me. The current solution - which is not in production yet - is something like the following: Both systems will be queried and the controller will merge the results. The new system has its own proprietary query language (we've alleviated this a bit with a LINQ provider). It also puts a lot of data on the wire. 15 search results typically run about 60k of unintelligible SOAP-wrapped xml. So I would prefer to avoid querying this system directly. These two systems publish events to help us integrate with other systems, mainly an ERP system. One of these events contains all the data necessary for the search screen. I proposed the following alternative: However I am being told that 'adding another database' will create more maintenance down the road. However, I believe this to be false, as I had to add a relatively simple feature that took several hours longer than anticipated because of this merging code. I want to get a feel for which system is more maintainable in the long run. I personally have not had the burden of maintaining any large system. I want something more than my gut. Specifically I'd like to know if having more, specialized physical databases is more or less maintainable than having less larger physical databases.

Read the article
NHibernate Session Load vs Get when using Table per Hierarchy. Always use ISession.Get<T> for TPH to work.

- by Rohit Gupta

Originally posted on: http://geekswithblogs.net/rgupta/archive/2014/06/01/nhibernate-session-load-vs-get-when-using-table-per-hierarchy.aspxNHibernate ISession has two methods on it : Load and Get. Load allows the entity to be loaded lazily, meaning the actual call to the database is made only when properties on the entity being loaded is first accessed. Additionally, if the entity has already been loaded into NHibernate Cache, then the entity is loaded directly from the cache instead of querying the underlying database. ISession.Get<T> instead makes the call to the database, every time it is invoked. With this background, it is obvious that we would prefer ISession.Load<T> over ISession.Get<T> most of the times for performance reasons to avoid making the expensive call to the database. let us consider the impact of using ISession.Load<T> when we are using the Table per Hierarchy implementation of NHibernate. Thus we have base class/ table Animal, there is a derived class named Snake with the Discriminator column being Type which in this case is “Snake”. If we load This Snake entity using the Repository for Animal, we would have a entity loaded, as shown below: public T GetByKey(object key, bool lazy = false) { if (lazy) return CurrentSession.Load<T>(key); return CurrentSession.Get<T>(key); } var tRepo = new NHibernateReadWriteRepository<TPHAnimal>(); var animal = tRepo.GetByKey(new Guid("602DAB56-D1BD-4ECC-B4BB-1C14BF87F47B"), true); var snake = animal as Snake; snake is null As you can see that the animal entity retrieved from the database cannot be cast to Snake even though the entity is actually a snake. The reason being ISession.Load prevents the entity to be cast to Snake and will throw the following exception: System.InvalidCastException : Message=Unable to cast object of type 'TPHAnimalProxy' to type 'NHibernateChecker.Model.Snake'. Thus we can see that if we lazy load the entity using ISession.Load<TPHAnimal> then we get a TPHAnimalProxy and not a snake. =============================================================== However if do not lazy load the same cast works perfectly fine, this is since we are loading the entity from database and the entity being loaded is not a proxy. Thus the following code does not throw any exceptions, infact the snake variable is not null: var tRepo = new NHibernateReadWriteRepository<TPHAnimal>(); var animal = tRepo.GetByKey(new Guid("602DAB56-D1BD-4ECC-B4BB-1C14BF87F47B"), false); var snake = animal as Snake; if (snake == null) { var snake22 = (Snake) animal; }

Read the article
What's the best way to cache a growing database table for html generation?

- by McLeopold

I've got a database table which will grow in size by about 5000 rows a hour. For a key that I would be querying by, the query will grow in size by about 1 row every hour. I would like a web page to show the latest rows for a key, 50 at a time (this is configurable). I would like to try and implement memcache to keep database activity low for reads. If I run a query and create a cache result for each page of 50 results, that would work until a new entry is added. At that time, the page of latest results gets new result and the oldest results drops off. This cascades down the list of cached pages causing me to update every cache result. It seems like a poor design. I could build the cache pages backwards, then for each page requested I should get the latest 2 pages and truncate to the proper length of 50. I'm not sure if this is good or bad? Ideally, the mechanism I use to insert a new row would also know how to invalidate the proper cache results. Has someone already solved this problem in a widely acceptable way? What's the best method of doing this? EDIT: If my understanding of the MYSQL query cache is correct, it has table level granularity in invalidation. Given the fact that I have about 5000 updates before a query on a key should need to be invalidated, it seems that the database query cache would not be used. MS SQL caches execution plans and frequently accessed data pages, so it may do better in this scenario. My query is not against a single table with TOP N. One version has joins to several tables and another has sub-selects. Also, since I want to cache the html generated table, I'm wondering if a cache at the web server level would be appropriate? Is there really no benefit to any type of caching? Is the best advice really to just allow a website site query to go through all the layers and hit the database every request?

Read the article
What is the right way to group this project into classes?

- by sigil

I originally asked this on SO, where it was closed and recommended that I ask it here instead. I'm trying to figure out how to group all the functions necessary for my project into classes. The goal of the project is to execute the following process: Get the user's FTP credentials (username & password). Check to make sure the credentials establish a valid connection to the FTP server. Query several Sharepoint lists and join the results of those queries to create a list of items that need to have action taken on them. Each item in the list has a folder. For each item: Zip the contents of the folder. Upload the folder to the FTP server using SFTP Update the item's Sharepoint data. Email the user an Excel report showing, e.g., Items without folder paths Items that failed to zip or upload Steps 2-5 are performed on a periodic basis; if step 2 returns an invalid connection, the user is alerted and the process returns to step 1. If at any point the user presses a certain key, the process terminates. I've defined the following set of classes, each of which is in its own .cs file: SFTP: file transfer processes DataHandler: Sharepoint data retrieval/querying/updating processes. Also makes and uploads the zip files. Exceptions: Not just one class, this is the .cs file where I have all of my exception classes. Report: Builds and sends the report. Program: The main class for running the program. I recognize that the DataHandler class is a god object, but I don't have a good idea of how to refactor it. I feel like it should be more fine-grained than just breaking it into Sharepoint, Zip, and Upload, but maybe that's it. Also, I haven't yet worked out how to combine the periodic behavior with the "wait for user input at any point in the process" part; I think that involves threads, which means other classes to manage the threads... I'm not that well-versed in design patterns, but is there one that fits this project well? If this is too big of a topic to neatly explain in an SO answer, I'll also accept a link to a good tutorial on what I'm trying to do here.

Read the article
Linked server problem on SQL Server 2005

- by BradyKelly

I have a weird issue and I hope someone can steer me in the right direction for resolving this please. When I execute the following query against a linked server, I get the following error. I can connect to the server in SSMS as a separate server, and execute a similar query against its Deposits table. The nn.nn is my own replacement to avoid broadcasting our server addresses. The query: select td.Batch , td.DateTimeDeposited from Deposits cd left join [172.nn.nn.32\sqlexpress].Terminal.dbo.Deposits td on cd.DateTimeDeposited = td.DateTimeDeposited The error: OLE DB provider "SQLNCLI" for linked server "172.nn.nn.11\sqlexpress" returned message "Login timeout expired". OLE DB provider "SQLNCLI" for linked server "172.nn.nn.11\sqlexpress" returned message "An error has occurred while establishing a connection to the server. When connecting to SQL Server 2005, this failure may be caused by the fact that under the default settings SQL Server does not allow remote connections.". Msg 65535, Level 16, State 1, Line 0 SQL Network Interfaces: Error Locating Server/Instance Specified [xFFFFFFFF]. Notice how the error is about server 172.nn.nn.11 and not 172.nn.nn.32. SOLVED (STUPID ME): Somebody had added an extra bit to my query that was scrolled off-screen and was querying the 17.nn.nn.11 server.

Read the article
Can not join additional domain controllers

- by Hosm

Hi all, I had a dead PDC and another not so synced domain controller for my domain. using comments here link now the so called secondary domain controller has seized domain controls and I can verify it from dsa.msc that it is a domain controller. I set up another domain controller (win2003SRV) and about to promote an AD on it as a domain controller for my domain. When I try to join the new domain controller to the domain I face DNS problem. here is some more detail DNS was successfully queried for the service location (SRV) resource record used to locate a domain controller for domain DOMNAME.A.B: The query was for the SRV record for _ldap._tcp.dc._msdcs.DOMNAME.A.B The following domain controllers were identified by the query: update.DOMNAME.A.B Common causes of this error include: - Host (A) records that map the name of the domain controller to its IP addresses are missing or contain incorrect addresses. - Domain controllers registered in DNS are not connected to the network or are not running. For information about correcting this problem, click Help. it is worth noting that update.DOMNAME.A.B is the current domain controller to which I'd like to add another controller named PDC.DOMNAME.A.B Ip address of update.DOMNAME.A.B is 192.168.200.1 and for pdc.DOMNAME.A.B is 192.168.200.100 querying DNS on both machine return correct results. Any idea?

Read the article
Deploying concrete5 on nginx

- by Nithin

I have a concrete5 site that works 'out of the box' in apache server. However I am having a lot of trouble running it in nginx. The following is the nginx configuration i am using: server { root /home/test/public; index index.php; access_log /home/test/logs/access.log; error_log /home/test/logs/error.log; location / { # First attempt to serve request as file, then # as directory, then fall back to index.html try_files $uri $uri/ index.php; # Uncomment to enable naxsi on this location # include /etc/nginx/naxsi.rules } # pass the PHP scripts to FastCGI server listening on unix socket # location ~ \.php($|/) { fastcgi_pass unix:/tmp/phpfpm.sock; fastcgi_split_path_info ^(.+\.php)(/.+)$; fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name; fastcgi_param PATH_INFO $fastcgi_path_info; include fastcgi_params; } location ~ /\.ht { deny all; } } I am able to get the homepage but am having problem with the inner pages. The inner pages display an "Access denied". Possibly the rewrite is not working, in effect I think its querying and trying to execute php files directly instead of going through the concrete dispatcher. I am totally lost here. Thank you for your help, in advance.

Read the article

Search Results

Search found 956 results on 39 pages for 'querying'.

Page 24/39 | < Previous Page | 20 21 22 23 24 25 26 27 28 29 30 31 | Next Page >

- by Maria Colgan

- by Marco Russo (SQLBI)

- by Sandy Good

- by Craige

- by DigiMortal

- by Anthony Shorten

- by Bill Graziano

- by Grofit

- by Clint Edmonson

- by Charles Young

- by Stefano Borini

- by John Abraham

- by Aybe

- by DrJohn

- by Phil Factor

- by Rowan Freeman

- by dataintegration

- by Bill Graziano

- by João Bragança

- by Rohit Gupta

- by McLeopold

- by sigil

- by BradyKelly

- by Hosm

- by Nithin

< Previous Page | 20 21 22 23 24 25 26 27 28 29 30 31 | Next Page >