Search Results

Search found 6257 results on 251 pages for 'columns'.

Page 75/251 | < Previous Page | 71 72 73 74 75 76 77 78 79 80 81 82  | Next Page >

  • Would this data requirement suit a Document -Oriented database?

    - by codecowboy
    I have a requirement to allow users to fill in journal/diary entries per day. I want to provide a handful of known journal templates with x columns to fill in. An example might be a thought diary; a user has to record a thought in one column, describe the situation, rate how they felt etc. The other requirement is that a user should be able to create their own diary templates. They might have a need for a 10 column diary entry per day and might need to rate some aspect out of 50 instead of 10. In an RDBMS, I can see this getting quite complicated. I could have individual tables for my known templates as the fields will be fixed. But for custom diary templates I imagine I would would need a table storing custom_field_types (the diary columns), a table storing entries referencing their field types (custom_entries) and then a third custom_diary table which would store rows matching custom_entries to diaries. Leaving performance / scaling aside, would it be any simpler or make more sense to use a document oriented database like MongoDB to store this data? This is for a web application which might later need an API for mobile devices.

    Read the article

  • Checking validation of entries in a Sudoku game written in Java

    - by Mico0
    I'm building a simple Sudoku game in Java which is based on a matrix (an array[9][9]) and I need to validate my board state according to these rules: all rows have 1-9 digits all columns have 1-9 digits. each 3x3 grid has 1-9 digits. This function should be efficient as possible for example if first case is not valid I believe there's no need to check other cases and so on (correct me if I'm wrong). When I tried doing this I had a conflict. Should I do one large for loop and inside check columns and row (in two other loops) or should I do each test separately and verify every case by it's own? (Please don't suggest too advanced solutions with other class/object helpers.) This is what I thought about: Main validating function (which I want pretty clean): public boolean testBoard() { boolean isBoardValid = false; if (validRows()) { if (validColumns()) { if (validCube()) { isBoardValid = true; } } } return isBoardValid; } Different methods to do the specific test such as: private boolean validRows() { int rowsDigitsCount = 0; for (int num = 1; num <= 9; num++) { boolean foundDigit = false; for (int row = 0; (row < board.length) && (!foundDigit); row++) { for (int col = 0; col < board[row].length; col++) { if (board[row][col] == num) { rowsDigitsCount++; foundDigit = true; break; } } } } return rowsDigitsCount == 9 ? true : false; } I don't know if I should keep doing tests separately because it looks like I'm duplicating my code.

    Read the article

  • Creating a Predicate Builder extension method

    - by Rippo
    I have a Kendo UI Grid that I am currently allowing filtering on multiple columns. I am wondering if there is a an alternative approach removing the outer switch statement? Basically I want to able to create an extension method so I can filter on a IQueryable<T> and I want to drop the outer case statement so I don't have to switch column names. private static IQueryable<Contact> FilterContactList(FilterDescriptor filter, IQueryable<Contact> contactList) { switch (filter.Member) { case "Name": switch (filter.Operator) { case FilterOperator.StartsWith: contactList = contactList.Where(w => w.Firstname.StartsWith(filter.Value.ToString()) || w.Lastname.StartsWith(filter.Value.ToString()) || (w.Firstname + " " + w.Lastname).StartsWith(filter.Value.ToString())); break; case FilterOperator.Contains: contactList = contactList.Where(w => w.Firstname.Contains(filter.Value.ToString()) || w.Lastname.Contains(filter.Value.ToString()) || (w.Firstname + " " + w.Lastname).Contains( filter.Value.ToString())); break; case FilterOperator.IsEqualTo: contactList = contactList.Where(w => w.Firstname == filter.Value.ToString() || w.Lastname == filter.Value.ToString() || (w.Firstname + " " + w.Lastname) == filter.Value.ToString()); break; } break; case "Company": switch (filter.Operator) { case FilterOperator.StartsWith: contactList = contactList.Where(w => w.Company.StartsWith(filter.Value.ToString())); break; case FilterOperator.Contains: contactList = contactList.Where(w => w.Company.Contains(filter.Value.ToString())); break; case FilterOperator.IsEqualTo: contactList = contactList.Where(w => w.Company == filter.Value.ToString()); break; } break; } return contactList; } Some additional information, I am using NHibernate Linq. Also another problem is that the "Name" column on my grid is actually "Firstname" + " " + "LastName" on my contact entity. We can also assume that all filterable columns will be strings.

    Read the article

  • CSS Positioning

    - by Davey
    Trying to mess with this wordpress theme and can't figure out why the sidebar is stacking underneath the content block. Any help would be very appreciated. http://www.buffalostreetbooks.com/events CSS: body { font-family: Arial, Helvetica, Verdana, Sans-serif; font-size: 10pt; background-color: #692022; background-image:url("http://www.buffalostreetbooks.com/wp-content/themes/autumn-leaves/images/repeatflower.png"); } body,h1#blog-title { margin: 0; padding: 0; } a { color: blue; } a:hover { color: #FF8C00; } a img { border: 0 none; } #wrapper { width: 960px; margin: 0 auto; background-color: #F4FBF4; border-left: 1px solid #ccc; border-right: 1px solid #ccc; } #header { background-image:url("http://www.buffalostreetbooks.com/wp-content/themes/autumn-leaves/images/headertime.png"); width:768px; height: 200px; } #inner-header { padding: 125px 1em 0; } h1#blog-title { font-size: 2em; } h1#blog-title a { color: #800000; } .entry-title a { color: #CD853F; } h1#blog-title a, .entry-title a, #footer a { text-decoration: none; } h1#blog-title a:hover, .entry-title a:hover, #footer a:hover { text-decoration: underline; } div.skip-link { display: none; } #menu { border-bottom: 1px solid #ccc; } #menu a { color: #000; } #menu a:hover { text-decoration: underline; } #menu li.current_page_item a, #menu li.current_page_item a:hover { background-color: #DFC28B; text-decoration: none; } #content { padding: 1em; width:600px; } .entry-title { font-size: 1.5em; margin: 1em 0 0 0; } abbr.published { color: #666; border: 0 none; } .entry-meta, .entry-date { color: #666; } #comments-list .avatar { float: left; margin-right: 1em; } #comments-list .n { font-weight: bold; } .entry-meta, .comment-meta { font-style: italic; } #comments-list p { clear: left; } #primary { padding-left: 1em; font-size: 0.9em; border-left: 1px solid #ccc; border-bottom: 1px solid #ccc; background-color: #FFFACD; } #footer { text-align: center; font-size: 0.8em; border-top: 1px solid #ccc; border-bottom: 1px solid #ccc; margin-bottom: 1em; } #inner-footer { padding: 1em 0; } .entry-meta, .entry-meta a, .comment-meta, .comment-meta a, .sidebar, .sidebar a, #footer, #footer a { color: #666; } /* LAYOUT: Two-Column (Right) DESCRIPTION: Two-column fluid layout with one sidebars right of content */ div#container { margin:0 0 0 0; width:960px; height:100%; } div#content { margin:0 0 0 0; } div.sidebar { overflow:hidden; width:280px; min-height:500px; clear:both; } div#secondary { clear:right; } div#footer { clear:both; width:100%; } /* Just some example content */ div#menu { height:2em; width:100%; } div#menu ul,div#menu ul ul { line-height:2em; list-style:none; margin:0; padding:0; } div#menu ul a { display:block; margin-right:1em; padding:0 0.5em; text-decoration:none; } div#menu ul ul ul a { font-style:italic; } div#menu ul li ul { left:-999em; position:absolute; } div#menu ul li:hover ul { left:auto; } .entry-title,.entry-meta { clear:both; } div#primary { } form#commentform .form-label { margin:1em 0 0; } form#commentform span.required { background:#fff; color:#c30; } form#commentform,form#commentform p { padding:0; } input#author,input#email,input#url,textarea#comme nt { padding:0.2em; } div.comments ol li { margin:0 0 3.5em; } textarea#comment { height:13em; margin:0 0 0.5em; overflow:auto; width:66%; } .alignright,img.alignright{ float:right; margin:1em 0 0 1em; } .alignleft,img.alignleft{ float:left; margin:1em 1em 0 0; } .aligncenter,img.aligncenter{ display:block; margin:1em auto; text-align:center; } div.gallery { clear:both; height:180px; margin:1em 0; width:100%; } p.wp-caption-text{ font-style:italic; } div.gallery dl{ margin:1em auto; overflow:hidden; text-align:center; } div.gallery dl.gallery-columns-1 { width:100%; } div.gallery dl.gallery-columns-2 { width:49%; } div.gallery dl.gallery-columns-3 { width:33%; } div.gallery dl.gallery-columns-4 { width:24%; } div.gallery dl.gallery-columns-5 { width:19%; } div#nav-above { margin-bottom:1em; } div#nav-below { margin-top:1em; } div#nav-images { height:150px; margin:1em 0; } div.navigation { height:1.25em; } div.navigation div.nav-next { float:right; text-align:right; } div.sidebar h3 { font-size:1.2em; } div.sidebar input#s { width:7em; } div.sidebar li { list-style:none; margin:0 0 2em; } div.sidebar li form { margin:0.2em 0 0; padding:0; } div.sidebar ul ul { margin:0 0 0 2em; } div.sidebar ul ul li { list-style:disc; margin:0; } div.sidebar ul ul ul { margin:0 0 0 0.5em; } div.sidebar ul ul ul li { list-style:circle; } div#menu ul li,div.gallery dl,div.navigation div.nav-previous { float:left; } input#author,input#email,input#url,div.navigation div { width:50%; } div.gallery *,div.sidebar div,div.sidebar h3,div.sidebar ul { margin:0; padding:0; }

    Read the article

  • Trace File Source Adapter

    The Trace File Source adapter is a useful addition to your SSIS toolbox.  It allows you to read 2005 and 2008 profiler traces stored as .trc files and read them into the Data Flow.  From there you can perform filtering and analysis using the power of SSIS. There is no need for a SQL Server connection this just uses the trace file. Example Usages Cache warming for SQL Server Analysis Services Reading the flight recorder Find out the longest running queries on a server Analyze statements for CPU, memory by user or some other criteria you choose Properties The Trace File Source adapter has two properties, both of which combine to control the source trace file that is read at runtime. SQL Server 2005 and SQL Server 2008 trace files are supported for both the Database Engine (SQL Server) and Analysis Services. The properties are managed by the Editor form or can be set directly from the Properties Grid in Visual Studio. Property Type Description AccessMode Enumeration This property determines how the Filename property is interpreted. The values available are: DirectInput Variable Filename String This property holds the path for trace file to load (*.trc). The value is either a full path, or the name of a variable which contains the full path to the trace file, depending on the AccessMode property. Trace Column Definition Hopefully the majority of you can skip this section entirely, but if you encounter some problems processing a trace file this may explain it and allow you to fix the problem. The component is built upon the trace management API provided by Microsoft. Unfortunately API methods that expose the schema of a trace file have known issues and are unreliable, put simply the data often differs from what was specified. To overcome these limitations the component uses  some simple XML files. These files enable the trace column data types and sizing attributes to be overridden. For example SQL Server Profiler or TMO generated structures define EventClass as an integer, but the real value is a string. TraceDataColumnsSQL.xml  - SQL Server Database Engine Trace Columns TraceDataColumnsAS.xml    - SQL Server Analysis Services Trace Columns The files can be found in the %ProgramFiles%\Microsoft SQL Server\100\DTS\PipelineComponents folder, e.g. "C:\Program Files\Microsoft SQL Server\100\DTS\PipelineComponents\TraceDataColumnsSQL.xml" "C:\Program Files\Microsoft SQL Server\100\DTS\PipelineComponents\TraceDataColumnsAS.xml" If at runtime the component encounters a type conversion or sizing error it is most likely due to a discrepancy between the column definition as reported by the API and the actual value encountered. Whilst most common issues have already been fixed through these files we have implemented specific exception traps to direct you to the files to enable you to fix any further issues due to different usage or data scenarios that we have not tested. An example error that you can fix through these files is shown below. Buffer exception writing value to column 'Column Name'. The string value is 999 characters in length, the column is only 111. Columns can be overridden by the TraceDataColumns XML files in "C:\Program Files\Microsoft SQL Server\100\DTS\PipelineComponents\TraceDataColumnsAS.xml". Installation The component is provided as an MSI file which you can download and run to install it. This simply places the files on disk in the correct locations and also installs the assemblies in the Global Assembly Cache as per Microsoft’s recommendations. You may need to restart the SQL Server Integration Services service, as this caches information about what components are installed, as well as restarting any open instances of Business Intelligence Development Studio (BIDS) / Visual Studio that you may be using to build your SSIS packages. Finally you will have to add the transformation to the Visual Studio toolbox manually. Right-click the toolbox, and select Choose Items.... Select the SSIS Data Flow Items tab, and then check the Trace File Source transformation in the Choose Toolbox Items window. This process has been described in detail in the related FAQ entry for How do I install a task or transform component? We recommend you follow best practice and apply the current Microsoft SQL Server Service pack to your SQL Server servers and workstations. Please note that the Microsoft Trace classes used in the component are not supported on 64-bit platforms. To use the Trace File Source on a 64-bit host you need to ensure you have the 32-bit (x86) tools available, and the way you execute your package is setup to use them, please see the help topic 64-bit Considerations for Integration Services for more details. Downloads Trace Sources for SQL Server 2005 -- Trace Sources for SQL Server 2008 Version History SQL Server 2008 Version 2.0.0.382 - SQL Sever 2008 public release. (9 Apr 2009) SQL Server 2005 Version 1.0.0.321 - SQL Server 2005 public release. (18 Nov 2008) -- Screenshots

    Read the article

  • Adopting DBVCS

    - by Wes McClure
    Identify early adopters Pick a small project with a small(ish) team.  This can be a legacy application or a green-field application. Strive to find a team of early adopters that will be eager to try something new. Get the team on board! Research Research the tool(s) that you want to use.  Some tools provide all of the features you would need while some only provide a slice of the pie.  DBVCS requires the ability to manage a set of change scripts that update a database from one version to the next.  Ideally a tool can track database versions and automatically apply updates.  The change script generation process can be manual, but having diff tools available to automatically generate it can really reduce the overhead to adoption.  Finally, an automated tool to generate a script file per database object is an added bonus as your version control system can quickly identify what was changed in a commit (add/del/modify), just like with code changes. Don’t settle on just one tool, identify several.  Then work with the team to evaluate the tools.  Have the team do some tests of the following scenarios with each tool: Baseline an existing database: can the migration tool work with legacy databases?  Caution: most migration platforms do not support baselines or have poor support, especially the fad of fluent APIs. Add/drop tables Add/drop procedures/functions/views Alter tables (rename columns, add columns, remove columns) Massage data – migrations sometimes involve changing data types that cannot be implicitly casted and require you to decide how the data is explicitly cast to the new type.  This is a requirement for a migrations platform.  Think about a case where you might want to combine fields, or move a field from one table to another, you wouldn’t want to lose the data. Run the tool via the command line.  If you cannot automate the tool in Continuous Integration what is the point? Create a copy of a database on demand. Backup/restore databases locally. Let the team give feedback and decide together, what tool they would like to try out. My recommendation at this point would be to include TSqlMigrations and RoundHouse as SQL based migration platforms.  In general I would recommend staying away from the fluent platforms as they often lack baseline capabilities and add overhead to learn a new API when SQL is already a very well known DSL.  Code migrations often get messy with procedures/views/functions as these have to be created with SQL and aren’t cross platform anyways.  IMO stick to SQL based migrations. Reconciling Production If your project is a legacy application, you will need to reconcile the current state of production with your development databases.  Find changes in production and bring them down to development, even if they are old and need to be removed.  Once complete, produce a baseline of either dev or prod as they are now in sync.  Commit this to your VCS of choice. Add whatever schema changes tracking mechanism your tool requires to your development database.  This often requires adding a table to track the schema version of that database.  Your tool should support doing this for you.  You can add this table to production when you do your next release. Script out any changes currently in dev.  Remove production artifacts that you brought down during reconciliation.  Add change scripts for any outstanding changes in dev since the last production release.  Commit these to your repository.   Say No to Shared Dev DBs Simply put, you wouldn’t dream of sharing a code checkout, why would you share a development database?  If you have a shared dev database, back it up, distribute the backups and take the shared version offline (including the dev db server once all projects are using DB VCS).  Doing DB VCS with a shared database is bound to cause problems as people won’t be able to easily script out their own changes from those that others are working on.   First prod release Copy prod to your beta/testing environment.  Add the schema changes table (or mechanism) and do a test run of your changes.  If successful you can schedule this to be run on production.   Evaluation After your first release, evaluate the pain points of the process.  Try to find tools or modifications to existing tools to help fix them.  Don’t leave stones unturned, iteratively evolve your tools and practices to make the process as seamless as possible.  This is why I suggest open source alternatives.  Nothing is set in stone, a good example was adding transactional support to TSqlMigrations.  We ran into situations where an update would break a database, so I added a feature to do transactional updates and rollback on errors!  Another good example is generating change scripts.  We have been manually making these for months now.  I found an open source project called Open DB Diff and integrated this with TSqlMigrations.  These were things we just accepted at the time when we began adopting our tool set.  Once we became comfortable with the base functionality, it was time to start automating more of the process.  Just like anything else with development, never be afraid to try to find tools to make your job easier!   Enjoy -Wes

    Read the article

  • Using LogParser - part 2

    - by fatherjack
    PersonAddress.csv SalesOrderDetail.tsv In part 1 of this series we downloaded and installed LogParser and used it to list data from a csv file. That was a good start and in this article we are going to see the different ways we can stream data and choose whether a whole file is selected. We are also going to take a brief look at what file types we can interrogate. If we take the query from part 1 and add a value for the output parameter as -o:datagrid so that the query becomes LOGPARSER "SELECT top 15 * FROM C:\LP\person_address.csv" -o:datagrid and run that we get a different result. A pop-up dialog that lets us view the results in a resizable grid. Notice that because we didn't specify the columns we wanted returned by LogParser (we used SELECT *) is has added two columns to the recordset - filename and rownumber. This behaviour can be very useful as we will see in future parts of this series. You can click Next 10 rows or All rows or close the datagrid once you are finished reviewing the data. You may have noticed that the files that I am working with are different file types - one is a csv (comma separated values) and the other is a tsv (tab separated values). If you want to convert a file from one to another then LogParser makes it incredibly simple. Rather than using 'datagrid' as the value for the output parameter, use 'csv': logparser "SELECT SalesOrderID, SalesOrderDetailID, CarrierTrackingNumber, OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount, LineTotal, rowguid, ModifiedDate into C:\Sales_SalesOrderDetail.csv FROM C:\Sales_SalesOrderDetail.tsv" -i:tsv -o:csv Those familiar with SQL will not have to make a very big leap of faith to making adjustments to the above query to filter in/out records from the source file. Lets get all the records from the same file where the Order Quantity (OrderQty) is more than 25: logparser "SELECT SalesOrderID, SalesOrderDetailID, CarrierTrackingNumber, OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount, LineTotal, rowguid, ModifiedDate into C:\LP\Sales_SalesOrderDetailOver25.csv FROM C:\LP\Sales_SalesOrderDetail.tsv WHERE orderqty > 25" -i:tsv -o:csv Or we could find all those records where the Order Quantity is equal to 25 and output it to an xml file: logparser "SELECT SalesOrderID, SalesOrderDetailID, CarrierTrackingNumber, OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount, LineTotal, rowguid, ModifiedDate into C:\LP\Sales_SalesOrderDetailEq25.xml FROM C:\LP\Sales_SalesOrderDetail.tsv WHERE orderqty = 25" -i:tsv -o:xml All the standard comparison operators are to be found in LogParser; >, <, =, LIKE, BETWEEN, OR, NOT, AND. Input and Output file formats. LogParser has a pretty impressive list of file formats that it can parse and a good selection of output formats that will let you generate output in a format that is useable for whatever process or application you may be using. From any of these To any of these IISW3C: parses IIS log files in the W3C Extended Log File Format.   NAT: formats output records as readable tabulated columns. IIS: parses IIS log files in the Microsoft IIS Log File Format. CSV: formats output records as comma-separated values text. BIN: parses IIS log files in the Centralized Binary Log File Format. TSV: formats output records as tab-separated or space-separated values text. IISODBC: returns database records from the tables logged to by IIS when configured to log in the ODBC Log Format. XML: formats output records as XML documents. HTTPERR: parses HTTP error log files generated by Http.sys. W3C: formats output records in the W3C Extended Log File Format. URLSCAN: parses log files generated by the URLScan IIS filter. TPL: formats output records following user-defined templates. CSV: parses comma-separated values text files. IIS: formats output records in the Microsoft IIS Log File Format. TSV: parses tab-separated and space-separated values text files. SQL: uploads output records to a table in a SQL database. XML: parses XML text files. SYSLOG: sends output records to a Syslog server. W3C: parses text files in the W3C Extended Log File Format. DATAGRID: displays output records in a graphical user interface. NCSA: parses web server log files in the NCSA Common, Combined, and Extended Log File Formats. CHART: creates image files containing charts. TEXTLINE: returns lines from generic text files. TEXTWORD: returns words from generic text files. EVT: returns events from the Windows Event Log and from Event Log backup files (.evt files). FS: returns information on files and directories. REG: returns information on registry values. ADS: returns information on Active Directory objects. NETMON: parses network capture files created by NetMon. ETW: parses Enterprise Tracing for Windows trace log files and live sessions. COM: provides an interface to Custom Input Format COM Plugins. So, you can query data from any of the types on the left and really easily get it into a format where it is ready for analysis by other tools. To a DBA or network Administrator with an enquiring mind this is a treasure trove. In part 3 we will look at working with multiple sources and specifically outputting to SQL format. See you there!

    Read the article

  • ODI 12c's Mapping Designer - Combining Flow Based and Expression Based Mapping

    - by Madhu Nair
    post by David Allan ODI is renowned for its declarative designer and minimal expression based paradigm. The new ODI 12c release has extended this even further to provide an extended declarative mapping designer. The ODI 12c mapper is a fusion of ODI's new declarative designer with the familiar flow based designer while retaining ODI’s key differentiators of: Minimal expression based definition, The ability to incrementally design an interface and to extract/load data from any combination of sources, and most importantly Backed by ODI’s extensible knowledge module framework. The declarative nature of the product has been extended to include an extensible library of common components that can be used to easily build simple to complex data integration solutions. Big usability improvements through consistent interactions of components and concepts all constructed around the familiar knowledge module framework provide the utmost flexibility. Here is a little taster: So what is a mapping? A mapping comprises of a logical design and at least one physical design, it may have many. A mapping can have many targets, of any technology and can be arbitrarily complex. You can build reusable mappings and use them in other mappings or other reusable mappings. In the example below all of the information from an Oracle bonus table and a bonus file are joined with an Oracle employees table before being written to a target. Some things that are cool include the one-click expression cross referencing so you can easily see what's used where within the design. The logical design in a mapping describes what you want to accomplish  (see the animated GIF here illustrating how the above mapping was designed) . The physical design lets you configure how it is to be accomplished. So you could have one logical design that is realized as an initial load in one physical design and as an incremental load in another. In the physical design below we can customize how the mapping is accomplished by picking Knowledge Modules, in ODI 12c you can pick multiple nodes (on logical or physical) and see common properties. This is useful as we can quickly compare property values across objects - below we can see knowledge modules settings on the access points between execution units side by side, in the example one table is retrieved via database links and the other is an external table. In the logical design I had selected an append mode for the integration type, so by default the IKM on the target will choose the most suitable/default IKM - which in this case is an in-built Oracle Insert IKM (see image below). This supports insert and select hints for the Oracle database (the ANSI SQL Insert IKM does not support these), so by default you will get direct path inserts with Oracle on this statement. In ODI 12c, the mapper is just that, a mapper. Design your mapping, write to multiple targets, the targets can be in the same data server, in different data servers or in totally different technologies - it does not matter. ODI 12c will derive and generate a plan that you can use or customize with knowledge modules. Some of the use cases which are greatly simplified include multiple heterogeneous targets, multi target inserts for Oracle and writing of XML. Let's switch it up now and look at a slightly different example to illustrate expression reuse. In ODI you can define reusable expressions using user functions. These can be reused across mappings and the implementations specialized per technology. So you can have common expressions across Oracle, SQL Server, Hive etc. shielding the design from the physical aspects of the generated language. Another way to reuse is within a mapping itself. In ODI 12c expressions can be defined and reused within a mapping. Rather than replicating the expression text in larger expressions you can decompose into smaller snippets, below you can see UNIT_TAX AMOUNT has been defined and is used in two downstream target columns - its used in the TOTAL_TAX_AMOUNT plus its used in the UNIT_TAX_AMOUNT (a recording of the calculation).  You can see the columns that the expressions depend on (upstream) and the columns the expression is used in (downstream) highlighted within the mapper. Also multi selecting attributes is a convenient way to see what's being used where, below I have selected the TOTAL_TAX_AMOUNT in the target datastore and the UNIT_TAX_AMOUNT in UNIT_CALC. You can now see many expressions at once now and understand much more at the once time without needlessly clicking around and memorizing information. Our mantra during development was to keep it simple and make the tool more powerful and do even more for the user. The development team was a fusion of many teams from Oracle Warehouse Builder, Sunopsis and BEA Aqualogic, debating and perfecting the mapper in ODI 12c. This was quite a project from supporting the capabilities of ODI in 11g to building the flow based mapping tool to support the future. I hope this was a useful insight, there is so much more to come on this topic, this is just a preview of much more that you will see of the mapper in ODI 12c.

    Read the article

  • Troubleshooting High-CPU Utilization for SQL Server

    - by Susantha Bathige
    The objective of this FAQ is to outline the basic steps in troubleshooting high CPU utilization on  a server hosting a SQL Server instance. The first and the most common step if you suspect high CPU utilization (or are alerted for it) is to login to the physical server and check the Windows Task Manager. The Performance tab will show the high utilization as shown below: Next, we need to determine which process is responsible for the high CPU consumption. The Processes tab of the Task Manager will show this information: Note that to see all processes you should select Show processes from all user. In this case, SQL Server (sqlserver.exe) is consuming 99% of the CPU (a normal benchmark for max CPU utilization is about 50-60%). Next we examine the scheduler data. Scheduler is a component of SQLOS which evenly distributes load amongst CPUs. The query below returns the important columns for CPU troubleshooting. Note – if your server is under severe stress and you are unable to login to SSMS, you can use another machine’s SSMS to login to the server through DAC – Dedicated Administrator Connection (see http://msdn.microsoft.com/en-us/library/ms189595.aspx for details on using DAC) SELECT scheduler_id ,cpu_id ,status ,runnable_tasks_count ,active_workers_count ,load_factor ,yield_count FROM sys.dm_os_schedulers WHERE scheduler_id See below for the BOL definitions for the above columns. scheduler_id – ID of the scheduler. All schedulers that are used to run regular queries have ID numbers less than 1048576. Those schedulers that have IDs greater than or equal to 1048576 are used internally by SQL Server, such as the dedicated administrator connection scheduler. cpu_id – ID of the CPU with which this scheduler is associated. status – Indicates the status of the scheduler. runnable_tasks_count – Number of workers, with tasks assigned to them that are waiting to be scheduled on the runnable queue. active_workers_count – Number of workers that are active. An active worker is never preemptive, must have an associated task, and is either running, runnable, or suspended. current_tasks_count - Number of current tasks that are associated with this scheduler. load_factor – Internal value that indicates the perceived load on this scheduler. yield_count – Internal value that is used to indicate progress on this scheduler.                                                                 Now to interpret the above data. There are four schedulers and each assigned to a different CPU. All the CPUs are ready to accept user queries as they all are ONLINE. There are 294 active tasks in the output as per the current_tasks_count column. This count indicates how many activities currently associated with the schedulers. When a  task is complete, this number is decremented. The 294 is quite a high figure and indicates all four schedulers are extremely busy. When a task is enqueued, the load_factor  value is incremented. This value is used to determine whether a new task should be put on this scheduler or another scheduler. The new task will be allocated to less loaded scheduler by SQLOS. The very high value of this column indicates all the schedulers have a high load. There are 268 runnable tasks which mean all these tasks are assigned a worker and waiting to be scheduled on the runnable queue.   The next step is  to identify which queries are demanding a lot of CPU time. The below query is useful for this purpose (note, in its current form,  it only shows the top 10 records). SELECT TOP 10 st.text  ,st.dbid  ,st.objectid  ,qs.total_worker_time  ,qs.last_worker_time  ,qp.query_plan FROM sys.dm_exec_query_stats qs CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp ORDER BY qs.total_worker_time DESC This query as total_worker_time as the measure of CPU load and is in descending order of the  total_worker_time to show the most expensive queries and their plans at the top:      Note the BOL definitions for the important columns: total_worker_time - Total amount of CPU time, in microseconds, that was consumed by executions of this plan since it was compiled. last_worker_time - CPU time, in microseconds, that was consumed the last time the plan was executed.   I re-ran the same query again after few seconds and was returned the below output. After few seconds the SP dbo.TestProc1 is shown in fourth place and once again the last_worker_time is the highest. This means the procedure TestProc1 consumes a CPU time continuously each time it executes.      In this case, the primary cause for high CPU utilization was a stored procedure. You can view the execution plan by clicking on query_plan column to investigate why this is causing a high CPU load. I have used SQL Server 2008 (SP1) to test all the queries used in this article.

    Read the article

  • Replication Services in a BI environment

    - by jorg
    In this blog post I will explain the principles of SQL Server Replication Services without too much detail and I will take a look on the BI capabilities that Replication Services could offer in my opinion. SQL Server Replication Services provides tools to copy and distribute database objects from one database system to another and maintain consistency afterwards. These tools basically copy or synchronize data with little or no transformations, they do not offer capabilities to transform data or apply business rules, like ETL tools do. The only “transformations” Replication Services offers is to filter records or columns out of your data set. You can achieve this by selecting the desired columns of a table and/or by using WHERE statements like this: SELECT <published_columns> FROM [Table] WHERE [DateTime] >= getdate() - 60 There are three types of replication: Transactional Replication This type replicates data on a transactional level. The Log Reader Agent reads directly on the transaction log of the source database (Publisher) and clones the transactions to the Distribution Database (Distributor), this database acts as a queue for the destination database (Subscriber). Next, the Distribution Agent moves the cloned transactions that are stored in the Distribution Database to the Subscriber. The Distribution Agent can either run at scheduled intervals or continuously which offers near real-time replication of data! So for example when a user executes an UPDATE statement on one or multiple records in the publisher database, this transaction (not the data itself) is copied to the distribution database and is then also executed on the subscriber. When the Distribution Agent is set to run continuously this process runs all the time and transactions on the publisher are replicated in small batches (near real-time), when it runs on scheduled intervals it executes larger batches of transactions, but the idea is the same. Snapshot Replication This type of replication makes an initial copy of database objects that need to be replicated, this includes the schemas and the data itself. All types of replication must start with a snapshot of the database objects from the Publisher to initialize the Subscriber. Transactional replication need an initial snapshot of the replicated publisher tables/objects to run its cloned transactions on and maintain consistency. The Snapshot Agent copies the schemas of the tables that will be replicated to files that will be stored in the Snapshot Folder which is a normal folder on the file system. When all the schemas are ready, the data itself will be copied from the Publisher to the snapshot folder. The snapshot is generated as a set of bulk copy program (BCP) files. Next, the Distribution Agent moves the snapshot to the Subscriber, if necessary it applies schema changes first and copies the data itself afterwards. The application of schema changes to the Subscriber is a nice feature, when you change the schema of the Publisher with, for example, an ALTER TABLE statement, that change is propagated by default to the Subscriber(s). Merge Replication Merge replication is typically used in server-to-client environments, for example when subscribers need to receive data, make changes offline, and later synchronize changes with the Publisher and other Subscribers, like with mobile devices that need to synchronize one in a while. Because I don’t really see BI capabilities here, I will not explain this type of replication any further. Replication Services in a BI environment Transactional Replication can be very useful in BI environments. In my opinion you never want to see users to run custom (SSRS) reports or PowerPivot solutions directly on your production database, it can slow down the system and can cause deadlocks in the database which can cause errors. Transactional Replication can offer a read-only, near real-time database for reporting purposes with minimal overhead on the source system. Snapshot Replication can also be useful in BI environments, if you don’t need a near real-time copy of the database, you can choose to use this form of replication. Next to an alternative for Transactional Replication it can be used to stage data so it can be transformed and moved into the data warehousing environment afterwards. In many solutions I have seen developers create multiple SSIS packages that simply copies data from one or more source systems to a staging database that figures as source for the ETL process. The creation of these packages takes a lot of (boring) time, while Replication Services can do the same in minutes. It is possible to filter out columns and/or records and it can even apply schema changes automatically so I think it offers enough features here. I don’t know how the performance will be and if it really works as good for this purpose as I expect, but I want to try this out soon!

    Read the article

  • $.fadeTo/fadeOut() operations on Table Rows in IE fail

    - by Rick Strahl
    Here’s a a small problem that one of customers ran into a few days ago: He was playing around with some of the sample code I’ve put out for one of my simple jQuery demos which deals with providing a simple pulse behavior plug-in: $.fn.pulse = function(time) { if (!time) time = 2000; // *** this == jQuery object that contains selections $(this).fadeTo(time, 0.20, function() { $(this).fadeTo(time, 1); }); return this; } it’s a very simplistic plug-in and it works fine for simple pulse animations. However he ran into a problem where it didn’t work when working with tables – specifically pulsing a table row in Internet Explorer. Works fine in FireFox and Chrome, but IE not so much. It also works just fine in IE as long as you don’t try it on tables or table rows specifically. Applying against something like this (an ASP.NET GridView): var sel = $("#gdEntries>tbody>tr") .not(":first-child") // no header .not(":last-child") // no footer .filter(":even") .addClass("gridalternate"); // *** Demonstrate simple plugin sel.pulse(2000); fails in IE. No pulsing happens in any version of IE. After some additional experimentation with single rows and various ways of selecting each and still failing, I’ve come to the conclusion that the various fade operations in jQuery simply won’t work correctly in IE (any version). So even something as ‘elemental’ as this: var el = $("#gdEntries>tbody>tr").get(0);$(el).fadeOut(2000); is not working correctly. The item will stick around for 2 seconds and then magically disappear. Likewise: sel.hide().fadeIn(5000); also doesn’t fade in although the items become immediately visible in IE. Go figure that behavior out. Thanks to a tweet from red_square and a link he provided here is a grid that explains what works and doesn’t in IE (and most last gen browsers) regarding opacity: http://www.quirksmode.org/js/opacity.html It appears from this link that table and row elements can’t be made opaque, but td elements can. This means for the row selections I can force each of the td elements to be selected and then pulse all of those. Once you have the rows it’s easy to explicitly select all the columns in those rows with .find(“td”). Aha the following actually works: var sel = $("#gdEntries>tbody>tr") .not(":first-child") // no header .not(":last-child") // no footer .filter(":even") .addClass("gridalternate"); // *** Demonstrate simple plugin sel.find("td").pulse(2000); A little unintuitive that, but it works. Stay away from <table> and <tr> Fades The moral of the story is – stay away from TR, TH and TABLE fades and opacity. If you have to do it on tables use the columns instead and if necessary use .find(“td”) on your row(s) selector to grab all the columns. I’ve been surprised by this uhm relevation, since I use fadeOut in almost every one of my applications for deletion of items and row deletions from grids are not uncommon especially in older apps. But it turns out that fadeOut actually works in terms of behavior: It removes the item when the timeout’s done and because the fade is relatively short lived and I don’t extensively test IE code any more I just never noticed that the fade wasn’t happening. Note – this behavior or rather lack thereof appears to be specific to table table,tr,th elements. I see no problems with other elements like <div> and <li> items. Chalk this one up to another of IE’s shortcomings. Incidentally I’m not the only one who has failed to address this in my simplistic plug-in: The jquery-ui pulsate effect also fails on the table rows in the same way. sel.effect("pulsate", { times: 3 }, 2000); and it also works with the same workaround. If you’re already using jquery-ui definitely use this version of the plugin which provides a few more options… Bottom line: be careful with table based fade operations and remember that if you do need to fade – fade on columns.© Rick Strahl, West Wind Technologies, 2005-2010Posted in jQuery  

    Read the article

  • SQL SERVER – Solution – Puzzle – SELECT * vs SELECT COUNT(*)

    - by pinaldave
    Earlier I have published Puzzle Why SELECT * throws an error but SELECT COUNT(*) does not. This question have received many interesting comments. Let us go over few of the answers, which are valid. Before I start the same, let me acknowledge Rob Farley who has not only answered correctly very first but also started interesting conversation in the same thread. The usual question will be what is the right answer. I would like to point to official Microsoft Connect Items which discusses the same. RGarvao https://connect.microsoft.com/SQLServer/feedback/details/671475/select-test-where-exists-select tiberiu utan http://connect.microsoft.com/SQLServer/feedback/details/338532/count-returns-a-value-1 Rob Farley count(*) is about counting rows, not a particular column. It doesn’t even look to see what columns are available, it’ll just count the rows, which in the case of a missing FROM clause, is 1. “select *” is designed to return columns, and therefore barfs if there are none available. Even more odd is this one: select ‘blah’ where exists (select *) You might be surprised at the results… Koushik The engine performs a “Constant scan” for Count(*) where as in the case of “SELECT *” the engine is trying to perform either Index/Cluster/Table scans. amikolaj When you query ‘select * from sometable’, SQL replaces * with the current schema of that table. With out a source for the schema, SQL throws an error. so when you query ‘select count(*)’, you are counting the one row. * is just a constant to SQL here. Check out the execution plan. Like the description states – ‘Scan an internal table of constants.’ You could do ‘select COUNT(‘my name is adam and this is my answer’)’ and get the same answer. Netra Acharya SELECT * Here, * represents all columns from a table. So it always looks for a table (As we know, there should be FROM clause before specifying table name). So, it throws an error whenever this condition is not satisfied. SELECT COUNT(*) Here, COUNT is a Function. So it is not mandetory to provide a table. Check it out this: DECLARE @cnt INT SET @cnt = COUNT(*) SELECT @cnt SET @cnt = COUNT(‘x’) SELECT @cnt Naveen Select 1 / Select ‘*’ will return 1/* as expected. Select Count(1)/Count(*) will return the count of result set of select statement. Count(1)/Count(*) will have one 1/* for each row in the result set of select statement. Select 1 or Select ‘*’ result set will contain only 1 result. so count is 1. Where as “Select *” is a sysntax which expects the table or equauivalent to table (table functions, etc..). It is like compilation error for that query. Ramesh Hi Friends, Count is an aggregate function and it expects the rows (list of records) for a specified single column or whole rows for *. So, when we use ‘select *’ it definitely give and error because ‘*’ is meant to have all the fields but there is not any table and without table it can only raise an error. So, in the case of ‘Select Count(*)’, there will be an error as a record in the count function so you will get the result as ’1'. Try using : Select COUNT(‘RAMESH’) and think there is an error ‘Must specify table to select from.’ in place of ‘RAMESH’ Pinal : If i am wrong then please clarify this. Sachin Nandanwar Any aggregate function expects a constant or a column name as an expression. DO NOT be confused with * in an aggregate function.The aggregate function does not treat it as a column name or a set of column names but a constant value, as * is a key word in SQL. You can replace any value instead of * for the COUNT function.Ex Select COUNT(5) will result as 1. The error resulting from select * is obvious it expects an object where it can extract the result set. I sincerely thank you all for wonderful conversation, I personally enjoyed it and I am sure all of you have the same feeling. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: CodeProject, Pinal Dave, PostADay, Readers Contribution, Readers Question, SQL, SQL Authority, SQL Puzzle, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQLServer, T SQL, Technology

    Read the article

  • Oracle SQL Developer v3.2.1 Now Available

    - by thatjeffsmith
    Oracle SQL Developer version 3.2.1 is now available. I recommend that everyone now upgrade to this release. It features more than 200 bug fixes, tweaks, and polish applied to the 3.2 edition. The high profile bug fixes submitted by customers and users on our forums are listed in all their glory for your review. I want to highlight a few of the changes though, as I recognize many of you lack the time and/or patience to ‘read the docs.’ That would include me, which is why I enjoy writing these kinds of blog posts. I’m lazy – just like you! No more artificial line breaks between CREATE OR REPLACE and your PL/SQL In versions 3.2 and older, when you pull up your stored procedural objects in our editor, you would see a line break inserted between the CREATE OR REPLACE and then the body of your code. In version 3.2.1, we have removed the line break. 3.1 3.2.1 Trivia Did You Know? The database doesn’t store the ‘CREATE’ or ‘CREATE OR REPLACE’ bit of your PL/SQL code in the database. If we look at the USER_SOURCE view, we can see that the code begins with the object name. So the CREATE OR REPLACE bit is ‘artificial’ The intent is to give you the code necessary to recreate your object – and have it ‘compile’ into the database. We pretty much HAVE to add the ‘CREATE OR REPLACE.’ From now on it will appear inline with the first line of your code. Exporting Tables & Views When exporting data from your tables or views, previous versions of SQL Developer presented a 3 step wizard. It allows you to choose your columns and apply data filters for what is exported. This was kind of redundant. The grids already allowed you to select your columns and apply filters. Wouldn’t it be more intuitive AND efficient to just make the grids behave in a What You See Is What You Get (WYSIWYG) fashion? In version 3.2.1, that is exactly what will happen. The wizard now only has two steps and the grid will export the data and columns as defined in the visible grid. Let the grid properties define what is actually exported! And here is what is pasted into my worksheet: "BREWERY"|"CITY" "3 Brewers Restaurant Micro-Brewery"|"Toronto" "Amsterdam Brewing Co."|"Toronto" "Ball Brewing Company Ltd."|"Toronto" "Big Ram Brewing Company"|"Toronto" "Black Creek Historic Brewery"|"Toronto" "Black Oak Brewing"|"Toronto" "C'est What?"|"Toronto" "Cool Beer Brewing Company"|"Toronto" "Denison's Brewing"|"Toronto" "Duggan's Brewery"|"Toronto" "Feathers"|"Toronto" "Fermentations! - Danforth"|"Toronto" "Fermentations! - Mount Pleasant"|"Toronto" "Granite Brewery & Restaurant"|"Toronto" "Labatt's Breweries of Canada"|"Toronto" "Mill Street Brew Pub"|"Toronto" "Mill Street Brewery"|"Toronto" "Molson Breweries of Canada"|"Toronto" "Molson Brewery at Air Canada Centre"|"Toronto" "Pioneer Brewery Ltd."|"Toronto" "Post-Production Bistro"|"Toronto" "Rotterdam Brewing"|"Toronto" "Steam Whistle Brewing"|"Toronto" "Strand Brasserie"|"Toronto" "Upper Canada Brewing"|"Toronto" JUST what I wanted And One Last Thing Speaking of export, sometimes I want to send data to Excel. And sometimes I want to send multiple objects to Excel – to a single Excel file that is. In version 3.2.1 you can now do that. Let’s export the bulk of the HR schema to Excel, with each table going to it’s own workbook in the same worksheet. Select many tables, put them in in a single Excel worksheet If you try this in previous versions of SQL Developer it will just write the first table to the Excel file. This is one of the bugs we addressed in v3.2.1. Here is what the output Excel file looks like now: Many tables - Many workbooks in an Excel Worksheet I have a sneaky suspicion that this will be a frequently used feature going forward. Excel seems to be the cornerstone of many of our popular features. Imagine that!

    Read the article

  • SQL SERVER – SmallDateTime and Precision – A Continuous Confusion

    - by pinaldave
    Some kinds of confusion never go away. Here is one of the ancient confusing things in SQL. The precision of the SmallDateTime is one concept that confuses a lot of people, proven by the many messages I receive everyday relating to this subject. Let me start with the question: What is the precision of the SMALLDATETIME datatypes? What is your answer? Write it down on your notepad. Now if you do not want to continue reading the blog post, head to my previous blog post over here: SQL SERVER – Precision of SMALLDATETIME. A Social Media Question Since the increase of social media conversations, I noticed that the amount of the comments I receive on this blog is a bit staggering. I receive lots of questions on facebook, twitter or Google+. One of the very interesting questions yesterday was asked on Facebook by Raghavendra. I am re-organizing his script and asking all of the questions he has asked me. Let us see if we could help him with his question: CREATE TABLE #temp (name VARCHAR(100),registered smalldatetime) GO DECLARE @test smalldatetime SET @test=GETDATE() INSERT INTO #temp VALUES ('Value1',@test) INSERT INTO #temp VALUES ('Value2',@test) GO SELECT * FROM #temp ORDER BY registered DESC GO DROP TABLE #temp GO Now when the above script is ran, we will get the following result: Well, the expectation of the query was to have the following result. The row which was inserted last was expected to return as first row in result set as the ORDER BY descending. Side note: Because the requirement is to get the latest data, we can’t use any  column other than smalldatetime column in order by. If we use name column in the order by, we will get an incorrect result as it can be any name. My Initial Reaction My initial reaction was as follows: 1) DataType DateTime2: If file precision of the column is expected from the column which store date and time, it should not be smalldatetime. The precision of the column smalldatetime is One Minute (Read Here) for finer precision use DateTime or DateTime2 data type. Here is the code which includes above suggestion: CREATE TABLE #temp (name VARCHAR(100), registered datetime2) GO DECLARE @test datetime2 SET @test=GETDATE() INSERT INTO #temp VALUES ('Value1',@test) INSERT INTO #temp VALUES ('Value2',@test) GO SELECT * FROM #temp ORDER BY registered DESC GO DROP TABLE #temp GO 2) Tie Breaker Identity: There are always possibilities that two rows were inserted at the same time. In that case, you may need a tie breaker. If you have an increasing identity column, you can use that as a tie breaker as well. CREATE TABLE #temp (ID INT IDENTITY(1,1), name VARCHAR(100),registered datetime2) GO DECLARE @test datetime2 SET @test=GETDATE() INSERT INTO #temp VALUES ('Value1',@test) INSERT INTO #temp VALUES ('Value2',@test) GO SELECT * FROM #temp ORDER BY ID DESC GO DROP TABLE #temp GO Those two were the quick suggestions I provided. It is not necessary that you should use both advices. It is possible that one can use only DATETIME datatype or Identity column can have datatype of BIGINT or have another tie breaker. An Alternate NO Solution In the facebook thread this was also discussed as one of the solutions: CREATE TABLE #temp (name VARCHAR(100),registered smalldatetime) GO DECLARE @test smalldatetime SET @test=GETDATE() INSERT INTO #temp VALUES ('Value1',@test) INSERT INTO #temp VALUES ('Value2',@test) GO SELECT name, registered, ROW_NUMBER() OVER(ORDER BY registered DESC) AS "Row Number" FROM #temp ORDER BY 3 DESC GO DROP TABLE #temp GO However, I believe it is not the solution and can be further misleading if used in a production server. Here is the example of why it is not a good solution: CREATE TABLE #temp (name VARCHAR(100) NOT NULL,registered smalldatetime) GO DECLARE @test smalldatetime SET @test=GETDATE() INSERT INTO #temp VALUES ('Value1',@test) INSERT INTO #temp VALUES ('Value2',@test) GO -- Before Index SELECT name, registered, ROW_NUMBER() OVER(ORDER BY registered DESC) AS "Row Number" FROM #temp ORDER BY 3 DESC GO -- Create Index ALTER TABLE #temp ADD CONSTRAINT [PK_#temp] PRIMARY KEY CLUSTERED (name DESC) GO -- After Index SELECT name, registered, ROW_NUMBER() OVER(ORDER BY registered DESC) AS "Row Number" FROM #temp ORDER BY 3 DESC GO DROP TABLE #temp GO Now let us examine the resultset. You will notice that an index which is created on the base table which is (indeed) schema change the table but can affect the resultset. As you can see, an index can change the resultset, so this method is not yet perfect to get the latest inserted resultset. No Schema Change Requirement After giving these two suggestions, I was waiting for the feedback of the asker. However, the requirement of the asker was there can’t be any schema change because the application was used by many other applications. I validated again, and of course, the requirement is no schema change at all. No addition of the column of change of datatypes of any other columns. There is no further help as well. This is indeed an interesting question. I personally can’t think of any solution which I could provide him given the requirement of no schema change. Can you think of any other solution to this? Need of Database Designer This question once again brings up another ancient question:  “Do we need a database designer?” I often come across databases which are facing major performance problems or have redundant data. Normalization is often ignored when a database is built fast under a very tight deadline. Often I come across a database which has table with unnecessary columns and performance problems. While working as Developer Lead in my earlier jobs, I have seen developers adding columns to tables without anybody’s consent and retrieving them as SELECT *.  There is a lot to discuss on this subject in detail, but for now, let’s discuss the question first. Do you have any suggestions for the above question? Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: CodeProject, Developer Training, PostADay, SQL, SQL Authority, SQL DateTime, SQL Query, SQL Server, SQL Tips and Tricks, SQLServer, T SQL, Technology

    Read the article

  • DBCC CHECKDB (BatmanDb, REPAIR_ALLOW_DATA_LOSS) &ndash; Are you Feeling Lucky?

    - by David Totzke
    I’m currently working for a client on a PowerBuilder to WPF migration.  It’s one of those “I could tell you, but I’d have to kill you” kind of clients and the quick-lime pits are currently occupied by the EMC tech…but I’ve said too much already. At approximately 3 or 4 pm that day users of the Batman[1] application here in Gotham[1] started to experience problems accessing the application.  Batman[2] is a document management system here that also integrates with the ERP system.  Very little goes on here that doesn’t involve Batman in some way.  The errors being received seemed to point to network issues (TCP protocol error, connection forcibly closed by the remote host etc…) but the real issue was much more insidious. Connecting to the database via SSMS and performing selects on certain tables underlying the application areas that were having problems started to reveal the issue.  You couldn’t do a SELECT * FROM MyTable without it bombing and giving the same error noted above.  A run of DBCC CHECKDB revealed 14 tables with corruption.  One of the tables with issues was the Document table.  Pretty central to a “document management” system.  Information was obtained from IT that a single drive in the SAN went bad in the night.  A new drive was in place and was working fine.  The partition that held the Batman database is configured for RAID Level 5 so a single drive failure shouldn’t have caused any trouble and yet, the database is corrupted.  They do hourly incremental backups here so the first thing done was to try a restore.  A restore of the most recent backup failed so they worked backwards until they hit a good point.  This successful restore was for a backup at 3AM – a full day behind.  This time also roughly corresponds with the time the SAN started to report the drive failure.  The plot thickens… I got my hands on the output from DBCC CHECKDB and noticed a pattern.  What’s sad is that nobody that should have noticed the pattern in the DBCC output did notice.  There was a rush to do things to try and recover the data before anybody really understood what was wrong with it in the first place.  Cooler heads must prevail in these circumstances and some investigation should be done and a plan of action laid out or you could end up making things worse[3].  DBCC CHECKDB also told us that: repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB Yikes.  That means that the database is so messed up that you’re definitely going to lose some stuff when you repair it to get it back to a consistent state.  All the more reason to do a little more investigation into the problem.  Rescuing this database is preferable to having to export all of the data possible from this database into a new one.  This is a fifteen year old application with about seven hundred tables.  There are TRIGGERS everywhere not to mention the referential integrity constraints to deal with.  Only fourteen of the tables have an issue.  We have a good backup that is missing the last 24 hours of business which means we could have a “do-over” of yesterday but that’s not a very palatable option either. All of the affected tables had TEXT columns and all of the errors were about LOB data types and orphaned off-row data which basically means TEXT, IMAGE or NTEXT columns.  If we did a SELECT on an affected table and excluded those columns, we got all of the rows.  We exported that data into a separate database.  Things are looking up.  Working on a copy of the production database we then ran DBCC CHECKDB with REPAIR_ALLOW_DATA_LOSS and that “fixed” everything up.   The allow data loss option will delete the bad rows.  This isn’t too horrible as we have all of those rows minus the text fields from out earlier export.  Now I could LEFT JOIN to the exported data to find the missing rows and INSERT them minus the TEXT column data. We had the restored data from the good 3AM backup that we could now JOIN to and, with fingers crossed, recover the missing TEXT column information.  We got lucky in that all of the affected rows were old and in the end we didn’t lose anything.  :O  All of the row counts along the way worked out and it looks like we dodged a major bullet here. We’ve heard back from EMC and it turns out the SAN firmware that they were running here is apparently buggy.  This thing is only a couple of months old.  Grrr…. They dispatched a technician that night to come and update it .  That explains why RAID didn’t save us. All-in-all this could have been a lot worse.  Given the root cause here, they basically won the lottery in not losing anything. Here are a few links to some helpful posts on the SQL Server Engine blog.  I love the title of the first one: Which part of 'REPAIR_ALLOW_DATA_LOSS' isn't clear? CHECKDB (Part 8): Can repair fix everything? (in fact, read the whole series) Ta da! Emergency mode repair (we didn’t have to resort to this one thank goodness)   Dave Just because I can…   [1] Names have been changed to protect the guilty. [2] I'm Batman. [3] And if I'm the coolest head in the room, you've got even bigger problems...

    Read the article

  • Joins in single-table queries

    - by Rob Farley
    Tables are only metadata. They don’t store data. I’ve written something about this before, but I want to take a viewpoint of this idea around the topic of joins, especially since it’s the topic for T-SQL Tuesday this month. Hosted this time by Sebastian Meine (@sqlity), who has a whole series on joins this month. Good for him – it’s a great topic. In that last post I discussed the fact that we write queries against tables, but that the engine turns it into a plan against indexes. My point wasn’t simply that a table is actually just a Clustered Index (or heap, which I consider just a special type of index), but that data access always happens against indexes – never tables – and we should be thinking about the indexes (specifically the non-clustered ones) when we write our queries. I described the scenario of looking up phone numbers, and how it never really occurs to us that there is a master list of phone numbers, because we think in terms of the useful non-clustered indexes that the phone companies provide us, but anyway – that’s not the point of this post. So a table is metadata. It stores information about the names of columns and their data types. Nullability, default values, constraints, triggers – these are all things that define the table, but the data isn’t stored in the table. The data that a table describes is stored in a heap or clustered index, but it goes further than this. All the useful data is going to live in non-clustered indexes. Remember this. It’s important. Stop thinking about tables, and start thinking about indexes. So let’s think about tables as indexes. This applies even in a world created by someone else, who doesn’t have the best indexes in mind for you. I’m sure you don’t need me to explain Covering Index bit – the fact that if you don’t have sufficient columns “included” in your index, your query plan will either have to do a Lookup, or else it’ll give up using your index and use one that does have everything it needs (even if that means scanning it). If you haven’t seen that before, drop me a line and I’ll run through it with you. Or go and read a post I did a long while ago about the maths involved in that decision. So – what I’m going to tell you is that a Lookup is a join. When I run SELECT CustomerID FROM Sales.SalesOrderHeader WHERE SalesPersonID = 285; against the AdventureWorks2012 get the following plan: I’m sure you can see the join. Don’t look in the query, it’s not there. But you should be able to see the join in the plan. It’s an Inner Join, implemented by a Nested Loop. It’s pulling data in from the Index Seek, and joining that to the results of a Key Lookup. It clearly is – the QO wouldn’t call it that if it wasn’t really one. It behaves exactly like any other Nested Loop (Inner Join) operator, pulling rows from one side and putting a request in from the other. You wouldn’t have a problem accepting it as a join if the query were slightly different, such as SELECT sod.OrderQty FROM Sales.SalesOrderHeader AS soh JOIN Sales.SalesOrderDetail as sod on sod.SalesOrderID = soh.SalesOrderID WHERE soh.SalesPersonID = 285; Amazingly similar, of course. This one is an explicit join, the first example was just as much a join, even thought you didn’t actually ask for one. You need to consider this when you’re thinking about your queries. But it gets more interesting. Consider this query: SELECT SalesOrderID FROM Sales.SalesOrderHeader WHERE SalesPersonID = 276 AND CustomerID = 29522; It doesn’t look like there’s a join here either, but look at the plan. That’s not some Lookup in action – that’s a proper Merge Join. The Query Optimizer has worked out that it can get the data it needs by looking in two separate indexes and then doing a Merge Join on the data that it gets. Both indexes used are ordered by the column that’s indexed (one on SalesPersonID, one on CustomerID), and then by the CIX key SalesOrderID. Just like when you seek in the phone book to Farley, the Farleys you have are ordered by FirstName, these seek operations return the data ordered by the next field. This order is SalesOrderID, even though you didn’t explicitly put that column in the index definition. The result is two datasets that are ordered by SalesOrderID, making them very mergeable. Another example is the simple query SELECT CustomerID FROM Sales.SalesOrderHeader WHERE SalesPersonID = 276; This one prefers a Hash Match to a standard lookup even! This isn’t just ordinary index intersection, this is something else again! Just like before, we could imagine it better with two whole tables, but we shouldn’t try to distinguish between joining two tables and joining two indexes. The Query Optimizer can see (using basic maths) that it’s worth doing these particular operations using these two less-than-ideal indexes (because of course, the best indexese would be on both columns – a composite such as (SalesPersonID, CustomerID – and it would have the SalesOrderID column as part of it as the CIX key still). You need to think like this too. Not in terms of excusing single-column indexes like the ones in AdventureWorks2012, but in terms of having a picture about how you’d like your queries to run. If you start to think about what data you need, where it’s coming from, and how it’s going to be used, then you will almost certainly write better queries. …and yes, this would include when you’re dealing with regular joins across multiples, not just against joins within single table queries.

    Read the article

  • Trace File Source Adapter

    The Trace File Source adapter is a useful addition to your SSIS toolbox.  It allows you to read 2005 and 2008 profiler traces stored as .trc files and read them into the Data Flow.  From there you can perform filtering and analysis using the power of SSIS. There is no need for a SQL Server connection this just uses the trace file. Example Usages Cache warming for SQL Server Analysis Services Reading the flight recorder Find out the longest running queries on a server Analyze statements for CPU, memory by user or some other criteria you choose Properties The Trace File Source adapter has two properties, both of which combine to control the source trace file that is read at runtime. SQL Server 2005 and SQL Server 2008 trace files are supported for both the Database Engine (SQL Server) and Analysis Services. The properties are managed by the Editor form or can be set directly from the Properties Grid in Visual Studio. Property Type Description AccessMode Enumeration This property determines how the Filename property is interpreted. The values available are: DirectInput Variable Filename String This property holds the path for trace file to load (*.trc). The value is either a full path, or the name of a variable which contains the full path to the trace file, depending on the AccessMode property. Trace Column Definition Hopefully the majority of you can skip this section entirely, but if you encounter some problems processing a trace file this may explain it and allow you to fix the problem. The component is built upon the trace management API provided by Microsoft. Unfortunately API methods that expose the schema of a trace file have known issues and are unreliable, put simply the data often differs from what was specified. To overcome these limitations the component uses  some simple XML files. These files enable the trace column data types and sizing attributes to be overridden. For example SQL Server Profiler or TMO generated structures define EventClass as an integer, but the real value is a string. TraceDataColumnsSQL.xml  - SQL Server Database Engine Trace Columns TraceDataColumnsAS.xml    - SQL Server Analysis Services Trace Columns The files can be found in the %ProgramFiles%\Microsoft SQL Server\100\DTS\PipelineComponents folder, e.g. "C:\Program Files\Microsoft SQL Server\100\DTS\PipelineComponents\TraceDataColumnsSQL.xml" "C:\Program Files\Microsoft SQL Server\100\DTS\PipelineComponents\TraceDataColumnsAS.xml" If at runtime the component encounters a type conversion or sizing error it is most likely due to a discrepancy between the column definition as reported by the API and the actual value encountered. Whilst most common issues have already been fixed through these files we have implemented specific exception traps to direct you to the files to enable you to fix any further issues due to different usage or data scenarios that we have not tested. An example error that you can fix through these files is shown below. Buffer exception writing value to column 'Column Name'. The string value is 999 characters in length, the column is only 111. Columns can be overridden by the TraceDataColumns XML files in "C:\Program Files\Microsoft SQL Server\100\DTS\PipelineComponents\TraceDataColumnsAS.xml". Installation The component is provided as an MSI file which you can download and run to install it. This simply places the files on disk in the correct locations and also installs the assemblies in the Global Assembly Cache as per Microsoft’s recommendations. You may need to restart the SQL Server Integration Services service, as this caches information about what components are installed, as well as restarting any open instances of Business Intelligence Development Studio (BIDS) / Visual Studio that you may be using to build your SSIS packages. Finally you will have to add the transformation to the Visual Studio toolbox manually. Right-click the toolbox, and select Choose Items.... Select the SSIS Data Flow Items tab, and then check the Trace File Source transformation in the Choose Toolbox Items window. This process has been described in detail in the related FAQ entry for How do I install a task or transform component? We recommend you follow best practice and apply the current Microsoft SQL Server Service pack to your SQL Server servers and workstations. Please note that the Microsoft Trace classes used in the component are not supported on 64-bit platforms. To use the Trace File Source on a 64-bit host you need to ensure you have the 32-bit (x86) tools available, and the way you execute your package is setup to use them, please see the help topic 64-bit Considerations for Integration Services for more details. Downloads Trace Sources for SQL Server 2005 -- Trace Sources for SQL Server 2008 Version History SQL Server 2008 Version 2.0.0.382 - SQL Sever 2008 public release. (9 Apr 2009) SQL Server 2005 Version 1.0.0.321 - SQL Server 2005 public release. (18 Nov 2008) -- Screenshots

    Read the article

  • Lessons learned from Word 2007 automation with c# 2008

    - by robertphyatt
    My organization has an ongoing project to take documents produced for internal regulations and such, change some of the formatting and then export it as PDF. Our requirements were that only one person would be doing this, but it has been painfully tedious and sometimes error-prone to do by hand. Enter the fearless developer to automate the situation! Since I am one of those guys that just plain does not like VB, I wanted to do the automation in the ever-so-much-more-familiar C#. While Microsoft had made a dll that makes such a task easier, documentation on MSDN is pretty lame and most of the forumns and posts on the internet had little to do with my task. So, I feel like I can give back to the community and make a post here of the things I have learned so far. I hope this is helpful to whoever stumbles upon it. Steps to do this: 1) First of all, make some sort of a project and use some sort of a means to get the filename of the word document you are trying to open. I got the filename the user wanted with an openFileDialog tied to a button that I labeled 'Browse':        private void btnBrowse_Click(object sender, EventArgs e)        {            try            {                DialogResult myResult = openFileDialog1.ShowDialog();                if (myResult.Equals(DialogResult.OK))                {                    if (openFileDialog1.SafeFileName.EndsWith(".doc"))                    {                        txtFileName.Text = openFileDialog1.SafeFileName;                        paramSourceDocPath = openFileDialog1.FileName;                        paramExportFilePath = openFileDialog1.FileName.Replace(".doc", ".pdf");                    }                    else                    {                        txtFileName.Text = "only something that end with .doc, please";                    }                }            }            catch (Exception err)            {                lblError.Text = err.Message;            }        }   2) Add in "using Microsoft.Office.Interop.Word;" after setting your project to reference Microsoft.Office.Core and Microsoft.Office.Interop.Word so that you don't have to add "Microsoft.Office.Interop.Word" to the front of everything. 3) Now you are ready to play. You will need to have a copy of word open and a copy of your word document that you want to modify open to be able to make the changes that are needed. The word interop dll likes using ref on all the parameters passed in, and likes to have them as objects. If you don't want to specify the parameter, you have to give it a "Type.Missing". I suggest creating some objects that you reuse all over the place to maintain sanity. object paramMissing = Type.Missing; ApplicationClass wordApplication = new ApplicationClass(); Document wordDocument = wordApplication.Documents.Open(                ref paramSourceDocPath, ref paramMissing, ref paramMissing,                ref paramMissing, ref paramMissing, ref paramMissing,                ref paramMissing, ref paramMissing, ref paramMissing,                ref paramMissing, ref paramMissing, ref paramMissing,                ref paramMissing, ref paramMissing, ref paramMissing,                ref paramMissing); 4) There are many ways to modify the text of the inside of the word document. One of the ways that was most effective for me was to break it down by paragraph and then do things on each paragraph by what style the particular paragraph had.            foreach (Paragraph thisParagraph in wordDocument.Content.Paragraphs)            {                string strStyleName = ((Style)thisParagraph.get_Style()).NameLocal;                string strText = thisParagraph.Range.Text;                //Do whatever you need to do            } 5) Sometimes you want to insert a new line character somewhere in the text or insert text into the document, etc.  There are a few ways you can do this: you can either modify the text of a paragraph by doing something like this ('\r' makes a new paragraph, '\v' will make a newline without making a new paragraph. If you remove a '\r' from the text, it will eliminate the paragraph you removed it from): thisParagraph.Range.Text = "A\vNew Paragraph!\r" + thisParagraph.Range.Text; OR you could select where you want to insert it and have it act like you were typing in Word like any normal user (note: if you do not collapse the range first, you will overwrite the thing you got the range from) object oCollapseDirectionEnd = WdCollapseDirection.wdCollapseEnd; object oCollapseDirectionStart = WdCollapseDirection.wdCollapseStart; Range rangeInsertAtBeginning = thisParagraph.Range; Range rangeInsertAtEnd = thisParagraph.Range; rangeInsertAtBeginning.Collapse(ref oCollapseDirectionStart); rangeInsertAtEnd.Collapse(ref oCollapseDirectionEnd); rangeInsertAtBeginning.Select(); wordApplication.Selection.TypeText("Blah Blah Blah"); rangeInsertAtEnd.Select(); wordApplication.Selection.TypeParagraph(); 6) If you want to make text columns, like a newspaper or newsletter, you have to modify the page layout of the document or a section of the document to make it happen. In my case, I only wanted a particular section to have that, and I wanted to have a black line before and after the newspaper-like text columns. First you need to do a section break on either side of what you wanted, then you take the section and modify the page layout. Then you can modify the borders of the section (or another object in the word document). I also show here how to modify the alignment of a paragraph.            object oSectionBreak = WdBreakType.wdSectionBreakContinuous;            //These ranges were set while I was going through the paragraphs of my document, like I was showing earlier            rangeHeaderStart.InsertBreak(ref oSectionBreak);            rangeHeaderEnd.InsertBreak(ref oSectionBreak);            //change the alignment to justify            object oRangeHeaderStart = rangeStartJustifiedAlignment.Start;            object oRangeHeaderEnd = rangeHeaderEnd.End;            Range rangeHeader = wordDocument.Range(ref oRangeHeaderStart, ref oRangeHeaderEnd);            rangeHeader.Paragraphs.Alignment = WdParagraphAlignment.wdAlignParagraphJustify;            //find the section break and make it into triple text columns            foreach (Section mySection in wordDocument.Sections)            {                if (mySection.Range.Start == rangeHeaderStart.Start)                {                    mySection.PageSetup.TextColumns.Add(ref paramMissing, ref paramMissing, ref paramMissing);                    mySection.PageSetup.TextColumns.Add(ref paramMissing, ref paramMissing, ref paramMissing);                    //I didn't like the default spacing and column widths. This is how I adjusted them.                    foreach (TextColumn txtc in mySection.PageSetup.TextColumns)                    {                        try                        {                            txtc.SpaceAfter = 151.6f;                            txtc.Width = 7;                        }                        catch (Exception)                        {                            txtc.Width = 151.6f;                        }                    }                }            } That is all  I have time for today! I hope this was helpful to someone!

    Read the article

  • Alternative Grid Layout for Silverlight suggestion

    - by brainbox
    I've proposed a suggestion to create alternative grid layout for Silverlight. Please vote for it if also faced the same problems. As i write before current Silverlight Grid Layout breakes best practices of HTML and Adobe Flex Grid layouts. Current defention based approach have following disadvantages that makes xaml coding very hard: 1. It is very hard to create new row. In that case you should rewriteall Grid.Row and Grid.Columns for all rows inserted below.2. Defenitions are static by their nature and because of it, it isimpossible to use grid for dynamic forms. Currently even in toolkit DataFormMicrosoft is using StackPanel. But StackPanel is not designed for multicolumn layout that have dataform. Here is a sample code of AdobeFlex datagrid, which incorporates bestpractices of HTML. <mx:Grid id="myGrid">        <!-- Define Row 1. -->       <mx:GridRow id="row1">           <!-- Define the first cell of Row 1. -->           <mx:GridItem>               <mx:Button label="Button 1"/>           </mx:GridItem>           <!-- Define the second cell of Row 1. -->           <mx:GridItem>               <mx:Button label="2"/>           </mx:GridItem>           <!-- Define the third cell of Row 1. -->           <mx:GridItem>               <mx:Button label="Button 3"/>           </mx:GridItem>       </mx:GridRow>        <!-- Define Row 2. -->       <mx:GridRow id="row2">           <!-- Define a single cell to span three columns of Row 2. -->           <mx:GridItem colSpan="3" horizontalAlign="center">               <mx:Button label="Long-Named Button 4"/>           </mx:GridItem>       </mx:GridRow>        <!-- Define Row 3. -->       <mx:GridRow id="row3">           <!-- Define an empty first cell of Row 3. -->           <mx:GridItem/>           <!-- Define a cell to span columns 2 and 3 of Row 3. -->           <mx:GridItem colSpan="2" horizontalAlign="center">               <mx:Button label="Button 5"/>           </mx:GridItem>       </mx:GridRow>    </mx:Grid>

    Read the article

  • SQL SERVER – Weekly Series – Memory Lane – #053 – Final Post in Series

    - by Pinal Dave
    It has been a fantastic journey to write memory lane series for an entire year. This series gave me the opportunity to go back and see what I have contributed to this blog throughout the last 7 years. This was indeed fantastic series as this provided me the opportunity to witness how technology has grown throughout the year and how I have progressed in my career while writing this blog post. This series was indeed fantastic experience readers as many joined during the last few years and were not sure what they have missed in recent years. Let us continue with the final episode of the Memory Lane Series. Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 Get Current User – Get Logged In User Here is the straight script which list logged in SQL Server users. Disable All Triggers on a Database – Disable All Triggers on All Servers Question : How to disable all the triggers for a database? Additionally, how to disable all the triggers for all servers? For answer execute the script in the blog post. Importance of Master Database for SQL Server Startup I have received following questions many times. I will list all the questions here and answer them together. What is the purpose of Master database? Should our backup Master database? Which database is must have database for SQL Server for startup? Which are the default system database created when SQL Server 2005 is installed for the first time? What happens if Master database is corrupted? Answers to all of the questions are very much related. 2008 DECLARE Multiple Variables in One Statement SQL Server is a great product and it has many features which are very unique to SQL Server. Regarding feature of SQL Server where multiple variable can be declared in one statement, it is absolutely possible to do. 2009 How to Enable Index – How to Disable Index – Incorrect syntax near ‘ENABLE’ Many times I have seen that the index is disabled when there is a large update operation on the table. Bulk insert of very large file updates in any table using SSIS is usually preceded by disabling the index and followed by enabling the index. I have seen many developers running the following query to disable the index. 2010 List of all the Views from Database Many emails I received suggesting that they have hundreds of the view and now have no clue what is going on and how many of them have indexes and how many does not have an index. Some even asked me if there is any way they can get a list of the views with the property of Index along with it. Here is the quick script which does exactly the same. You can also include many other columns from the same view. Minimum Maximum Memory – Server Memory Options I was recently reading about SQL Server Memory Options over here. While reading this one line really caught my attention is minimum value allowed for maximum memory options. The default setting for min server memory is 0, and the default setting for max server memory is 2147483647. The minimum amount of memory you can specify for max server memory is 16 megabytes (MB). 2011 Fundamentals of Columnstore Index There are two kinds of storage in a database. Row Store and Column Store. Row store does exactly as the name suggests – stores rows of data on a page – and column store stores all the data in a column on the same page. These columns are much easier to search – instead of a query searching all the data in an entire row whether the data are relevant or not, column store queries need only to search a much lesser number of the columns. How to Ignore Columnstore Index Usage in Query In summary the question in simple words “How can we ignore using the column store index in selective queries?” Very interesting question – you can use I can understand there may be the cases when the column store index is not ideal and needs to be ignored the same. You can use the query hint IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX to ignore the column store index. The SQL Server Engine will use any other index which is best after ignoring the column store index. 2012 Storing Variable Values in Temporary Array or Temporary List SQL Server does not support arrays or a dynamic length storage mechanism like list. Absolutely there are some clever workarounds and few extra-ordinary solutions but everybody can;t come up with such solution. Additionally, sometime the requirements are very simple that doing extraordinary coding is not required. Here is the simple case. Move Database Files MDF and LDF to Another Location It is not common to keep the Database on the same location where OS is installed. Usually Database files are in SAN, Separate Disk Array or on SSDs. This is done usually for performance reason and manageability perspective. Now the challenges comes up when database which was installed at not preferred default location and needs to move to a different location. Here is the quick tutorial how you can do it. UNION ALL and ORDER BY – How to Order Table Separately While Using UNION ALL If your requirement is such that you want your top and bottom query of the UNION resultset independently sorted but in the same result set you can add an additional static column and order by that column. Let us re-create the same scenario. Copy Data from One Table to Another Table – SQL in Sixty Seconds #031 – Video http://www.youtube.com/watch?v=FVWIA-ACMNo Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • Using Stored Procedures in SSIS

    - by dataintegration
    The SSIS Data Flow components: the source task and the destination task are the easiest way to transfer data in SSIS. Some data transactions do not fit this model, they are procedural tasks modeled as stored procedures. In this article we show how you can call stored procedures available in RSSBus ADO.NET Providers from SSIS. In this article we will use the CreateJob and the CreateBatch stored procedures available in RSSBus ADO.NET Provider for Salesforce, but the same steps can be used to call a stored procedure in any of our data providers. Step 1: Open Visual Studio and create a new Integration Services Project. Step 2: Add a new Data Flow Task to the Control Flow window. Step 3: Open the Data Flow Task and add a Script Component to the data flow pane. A dialog box will pop-up allowing you to select the Script Component Type: pick the source type as we will be outputting columns from our stored procedure. Step 4: Double click the Script Component to open the editor. Step 5: In the "Inputs and Outputs" settings, enter all the columns you want to output to the data flow. Ensure the correct data type has been set for each output. You can check the data type by selecting the output and then changing the "DataType" property from the property editor. In our example, we'll add the column JobID of type String. Step 6: Select the "Script" option in the left-hand pane and click the "Edit Script" button. This will open a new Visual Studio window with some boiler plate code in it. Step 7: In the CreateOutputRows() function you can add code that executes the stored procedures included with the Salesforce Component. In this example we will be using the CreateJob and CreateBatch stored procedures. You can find a list of the available stored procedures along with their inputs and outputs in the product help. //Configure the connection string to your credentials String connectionString = "Offline=False;user=myusername;password=mypassword;access token=mytoken;"; using (SalesforceConnection conn = new SalesforceConnection(connectionString)) { //Create the command to call the stored procedure CreateJob SalesforceCommand cmd = new SalesforceCommand("CreateJob", conn); cmd.CommandType = CommandType.StoredProcedure; cmd.Parameters.Add(new SalesforceParameter("ObjectName", "Contact")); cmd.Parameters.Add(new SalesforceParameter("Action", "insert")); //Execute CreateJob //CreateBatch requires JobID as input so we store this value for later SalesforceDataReader rdr = cmd.ExecuteReader(); String JobID = ""; while (rdr.Read()) { JobID = (String)rdr["JobID"]; } //Create the command for CreateBatch, for this example we are adding two new rows SalesforceCommand batCmd = new SalesforceCommand("CreateBatch", conn); batCmd.CommandType = CommandType.StoredProcedure; batCmd.Parameters.Add(new SalesforceParameter("JobID", JobID)); batCmd.Parameters.Add(new SalesforceParameter("Aggregate", "<Contact><Row><FirstName>Bill</FirstName>" + "<LastName>White</LastName></Row><Row><FirstName>Bob</FirstName><LastName>Black</LastName></Row></Contact>")); //Execute CreateBatch SalesforceDataReader batRdr = batCmd.ExecuteReader(); } Step 7b: If you had specified output columns earlier, you can now add data into them using the UserComponent Output0Buffer. For example, we had set an output column called JobID of type String so now we can set a value for it. We will modify the DataReader that contains the output of CreateJob like so:. while (rdr.Read()) { Output0Buffer.AddRow(); JobID = (String)rdr["JobID"]; Output0Buffer.JobID = JobID; } Step 8: Note: You will need to modify the connection string to include your credentials. Also ensure that the System.Data.RSSBus.Salesforce assembly is referenced and include the following using statements to the top of the class: using System.Data; using System.Data.RSSBus.Salesforce; Step 9: Once you are done editing your script, save it, and close the window. Click OK in the Script Transformation window to go back to the main pane. Step 10: If had any outputs from the Script Component you can use them in your data flow. For example we will use a Flat File Destination. Configure the Flat File Destination to output the results to a file, and you should see the JobId in the file. Step 11: Your project should be ready to run.

    Read the article

  • Developing Schema Compare for Oracle (Part 6): 9i Query Performance

    - by Simon Cooper
    All throughout the EAP and beta versions of Schema Compare for Oracle, our main request was support for Oracle 9i. After releasing version 1.0 with support for 10g and 11g, our next step was then to get version 1.1 of SCfO out with support for 9i. However, there were some significant problems that we had to overcome first. This post will concentrate on query execution time. When we first tested SCfO on a 9i server, after accounting for various changes to the data dictionary, we found that database registration was taking a long time. And I mean a looooooong time. The same database that on 10g or 11g would take a couple of minutes to register would be taking upwards of 30 mins on 9i. Obviously, this is not ideal, so a poke around the query execution plans was required. As an example, let's take the table population query - the one that reads ALL_TABLES and joins it with a few other dictionary views to get us back our list of tables. On 10g, this query takes 5.6 seconds. On 9i, it takes 89.47 seconds. The difference in execution plan is even more dramatic - here's the (edited) execution plan on 10g: -------------------------------------------------------------------------------| Id | Operation | Name | Bytes | Cost |-------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 108K| 939 || 1 | SORT ORDER BY | | 108K| 939 || 2 | NESTED LOOPS OUTER | | 108K| 938 ||* 3 | HASH JOIN RIGHT OUTER | | 103K| 762 || 4 | VIEW | ALL_EXTERNAL_LOCATIONS | 2058 | 3 ||* 20 | HASH JOIN RIGHT OUTER | | 73472 | 759 || 21 | VIEW | ALL_EXTERNAL_TABLES | 2097 | 3 ||* 34 | HASH JOIN RIGHT OUTER | | 39920 | 755 || 35 | VIEW | ALL_MVIEWS | 51 | 7 || 58 | NESTED LOOPS OUTER | | 39104 | 748 || 59 | VIEW | ALL_TABLES | 6704 | 668 || 89 | VIEW PUSHED PREDICATE | ALL_TAB_COMMENTS | 2025 | 5 || 106 | VIEW | ALL_PART_TABLES | 277 | 11 |------------------------------------------------------------------------------- And the same query on 9i: -------------------------------------------------------------------------------| Id | Operation | Name | Bytes | Cost |-------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 16P| 55G|| 1 | SORT ORDER BY | | 16P| 55G|| 2 | NESTED LOOPS OUTER | | 16P| 862M|| 3 | NESTED LOOPS OUTER | | 5251G| 992K|| 4 | NESTED LOOPS OUTER | | 4243M| 2578 || 5 | NESTED LOOPS OUTER | | 2669K| 1440 ||* 6 | HASH JOIN OUTER | | 398K| 302 || 7 | VIEW | ALL_TABLES | 342K| 276 || 29 | VIEW | ALL_MVIEWS | 51 | 20 ||* 50 | VIEW PUSHED PREDICATE | ALL_TAB_COMMENTS | 2043 | ||* 66 | VIEW PUSHED PREDICATE | ALL_EXTERNAL_TABLES | 1777K| ||* 80 | VIEW PUSHED PREDICATE | ALL_EXTERNAL_LOCATIONS | 1744K| ||* 96 | VIEW | ALL_PART_TABLES | 852K| |------------------------------------------------------------------------------- Have a look at the cost column. 10g's overall query cost is 939, and 9i is 55,000,000,000 (or more precisely, 55,496,472,769). It's also having to process far more data. What on earth could be causing this huge difference in query cost? After trawling through the '10g New Features' documentation, we found item 1.9.2.21. Before 10g, Oracle advised that you do not collect statistics on data dictionary objects. From 10g, it advised that you do collect statistics on the data dictionary; for our queries, Oracle therefore knows what sort of data is in the dictionary tables, and so can generate an efficient execution plan. On 9i, no statistics are present on the system tables, so Oracle has to use the Rule Based Optimizer, which turns most LEFT JOINs into nested loops. If we force 9i to use hash joins, like 10g, we get a much better plan: -------------------------------------------------------------------------------| Id | Operation | Name | Bytes | Cost |-------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 7587K| 3704 || 1 | SORT ORDER BY | | 7587K| 3704 ||* 2 | HASH JOIN OUTER | | 7587K| 822 ||* 3 | HASH JOIN OUTER | | 5262K| 616 ||* 4 | HASH JOIN OUTER | | 2980K| 465 ||* 5 | HASH JOIN OUTER | | 710K| 432 ||* 6 | HASH JOIN OUTER | | 398K| 302 || 7 | VIEW | ALL_TABLES | 342K| 276 || 29 | VIEW | ALL_MVIEWS | 51 | 20 || 50 | VIEW | ALL_PART_TABLES | 852K| 104 || 78 | VIEW | ALL_TAB_COMMENTS | 2043 | 14 || 93 | VIEW | ALL_EXTERNAL_LOCATIONS | 1744K| 31 || 106 | VIEW | ALL_EXTERNAL_TABLES | 1777K| 28 |------------------------------------------------------------------------------- That's much more like it. This drops the execution time down to 24 seconds. Not as good as 10g, but still an improvement. There are still several problems with this, however. 10g introduced a new join method - a right outer hash join (used in the first execution plan). The 9i query optimizer doesn't have this option available, so forcing a hash join means it has to hash the ALL_TABLES table, and furthermore re-hash it for every hash join in the execution plan; this could be thousands and thousands of rows. And although forcing hash joins somewhat alleviates this problem on our test systems, there's no guarantee that this will improve the execution time on customers' systems; it may even increase the time it takes (say, if all their tables are partitioned, or they've got a lot of materialized views). Ideally, we would want a solution that provides a speedup whatever the input. To try and get some ideas, we asked some oracle performance specialists to see if they had any ideas or tips. Their recommendation was to add a hidden hook into the product that allowed users to specify their own query hints, or even rewrite the queries entirely. However, we would prefer not to take that approach; as well as a lot of new infrastructure & a rewrite of the population code, it would have meant that any users of 9i would have to spend some time optimizing it to get it working on their system before they could use the product. Another approach was needed. All our population queries have a very specific pattern - a base table provides most of the information we need (ALL_TABLES for tables, or ALL_TAB_COLS for columns) and we do a left join to extra subsidiary tables that fill in gaps (for instance, ALL_PART_TABLES for partition information). All the left joins use the same set of columns to join on (typically the object owner & name), so we could re-use the hash information for each join, rather than re-hashing the same columns for every join. To allow us to do this, along with various other performance improvements that could be done for the specific query pattern we were using, we read all the tables individually and do a hash join on the client. Fortunately, this 'pure' algorithmic problem is the kind that can be very well optimized for expected real-world situations; as well as storing row data we're not using in the hash key on disk, we use very specific memory-efficient data structures to store all the information we need. This allows us to achieve a database population time that is as fast as on 10g, and even (in some situations) slightly faster, and a memory overhead of roughly 150 bytes per row of data in the result set (for schemas with 10,000 tables in that means an extra 1.4MB memory being used during population). Next: fun with the 9i dictionary views.

    Read the article

  • Joining on NULLs

    - by Dave Ballantyne
    A problem I see on a fairly regular basis is that of dealing with NULL values.  Specifically here, where we are joining two tables on two columns, one of which is ‘optional’ ie is nullable.  So something like this: i.e. Lookup where all the columns are equal, even when NULL.   NULL’s are a tricky thing to initially wrap your mind around.  Statements like “NULL is not equal to NULL and neither is it not not equal to NULL, it’s NULL” can cause a serious brain freeze and leave you a gibbering wreck and needing your mummy. Before we plod on, time to setup some data to demo against. Create table #SourceTable ( Id integer not null, SubId integer null, AnotherCol char(255) not null ) go create unique clustered index idxSourceTable on #SourceTable(id,subID) go with cteNums as ( select top(1000) number from master..spt_values where type ='P' ) insert into #SourceTable select Num1.number,nullif(Num2.number,0),'SomeJunk' from cteNums num1 cross join cteNums num2 go Create table #LookupTable ( Id integer not null, SubID integer null ) go insert into #LookupTable Select top(100) id,subid from #SourceTable where subid is not null order by newid() go insert into #LookupTable Select top(3) id,subid from #SourceTable where subid is null order by newid() If that has run correctly, you will have 1 million rows in #SourceTable and 103 rows in #LookupTable.  We now want to join one to the other. First attempt – Lets just join select * from #SourceTable join #LookupTable on #LookupTable.id = #SourceTable.id and #LookupTable.SubID = #SourceTable.SubID OK, that’s a fail.  We had 100 rows back,  we didn’t correctly account for the 3 rows that have null values.  Remember NULL <> NULL and the join clause specifies SUBID=SUBID, which for those rows is not true. Second attempt – Lets deal with those pesky NULLS select * from #SourceTable join #LookupTable on #LookupTable.id = #SourceTable.id and isnull(#LookupTable.SubID,0) = isnull(#SourceTable.SubID,0) OK, that’s the right result, well done and 99.9% of the time that is where its left. It is a relatively trivial CPU overhead to wrap ISNULL around both columns and compare that result, so no problems.  But, although that’s true, this a relational database we are using here, not a procedural language.  SQL is a declarative language, we are making a request to the engine to get the results we want.  How we ask for them can make a ton of difference. Lets look at the plan for our second attempt, specifically the clustered index seek on the #SourceTable   There are 2 predicates. The ‘seek predicate’ and ‘predicate’.  The ‘seek predicate’ describes how SQLServer has been able to use an Index.  Here, it has been able to navigate the index to resolve where ID=ID.  So far so good, but what about the ‘predicate’ (aka residual probe) ? This is a row-by-row operation.  For each row found in the index matching the Seek Predicate, the leaf level nodes have been scanned and tested using this logical condition.  In this example [Expr1007] is the result of the IsNull operation on #LookupTable and that is tested for equality with the IsNull operation on #SourceTable.  This residual probe is quite a high overhead, if we can express our statement slightly differently to take full advantage of the index and make the test part of the ‘Seek Predicate’. Third attempt – X is null and Y is null So, lets state the query in a slightly manner: select * from #SourceTable join #LookupTable on #LookupTable.id = #SourceTable.id and ( #LookupTable.SubID = #SourceTable.SubID or (#LookupTable.SubID is null and #SourceTable.SubId is null) ) So its slightly wordier and may not be as clear in its intent to the human reader, that is what comments are for, but the key point is that it is now clearer to the query optimizer what our intention is. Let look at the plan for that query, again specifically the index seek operation on #SourceTable No ‘predicate’, just a ‘Seek Predicate’ against the index to resolve both ID and SubID.  A subtle difference that can be easily overlooked.  But has it made a difference to the performance ? Well, yes , a perhaps surprisingly high one. Clever query optimizer well done. If you are using a scalar function on a column, you a pretty much guaranteeing that a residual probe will be used.  By re-wording the query you may well be able to avoid this and use the index completely to resolve lookups. In-terms of performance and scalability your system will be in a much better position if you can.

    Read the article

  • Custom page sizes in paging dropdown in Telerik RadGrid

    Working with Telerik RadControls for ASP.NET AJAX is actually quite easy and the initial effort to get started with the control suite is very low. Meaning that you can easily get good result with little time. But there are usually cases where you have to go a little further and dig a little bit deeper than the standard scenarios. In this article I am going to describe how you can customize the default values (10, 20 and 50) of the drop-down list in the paging element of RadGrid. Get control over the displayed page sizes while using numeric paging... The default page sizes are good but not always good enough The paging feature in RadGrid offers you 3, well actually 4, possible page sizes in the drop-down element out-of-the box, which are 10, 20 or 50 items. You can get a fourth option by specifying a value different than the three standards for the PageSize attribute, ie. 35 or 100. The drawback in that case is that it is the initial page size. Certainly, the available choices could be more flexible or even a little bit more intelligent. For example, by taking the total count of records into consideration. There are some interesting scenarios that would justify a customized page size element: A low number of records, like 14 or similar shouldn't provide a page size of 50, A high total count of records (ie: 300+) should offer more choices, ie: 100, 200, 500, or display of all records regardless of number of records I am sure that you might have your own requirements, and I hope that the following source code snippets might be helpful. Wiring the ItemCreated event In order to adjust and manipulate the existing RadComboBox in the paging element we have to handle the OnItemCreated event of RadGrid. Simply specify your code behind method in the attribute of the RadGrid tag, like so: <telerik:RadGrid ID="RadGridLive" runat="server" AllowPaging="true" PageSize="20"    AllowSorting="true" AutoGenerateColumns="false" OnNeedDataSource="RadGridLive_NeedDataSource"    OnItemDataBound="RadGrid_ItemDataBound" OnItemCreated="RadGrid_ItemCreated">    <ClientSettings EnableRowHoverStyle="true">        <ClientEvents OnRowCreated="RowCreated" OnRowSelected="RowSelected" />        <Resizing AllowColumnResize="True" AllowRowResize="false" ResizeGridOnColumnResize="false"            ClipCellContentOnResize="true" EnableRealTimeResize="false" AllowResizeToFit="true" />        <Scrolling AllowScroll="true" ScrollHeight="360px" UseStaticHeaders="true" SaveScrollPosition="true" />        <Selecting AllowRowSelect="true" />    </ClientSettings>    <MasterTableView DataKeyNames="AdvertID">        <PagerStyle AlwaysVisible="true" Mode="NextPrevAndNumeric" />        <Columns>            <telerik:GridBoundColumn HeaderText="Listing ID" DataField="AdvertID" DataType="System.Int32"                SortExpression="AdvertID" UniqueName="AdvertID">                <HeaderStyle Width="66px" />            </telerik:GridBoundColumn>             <!--//  ... and some more columns ... -->         </Columns>    </MasterTableView></telerik:RadGrid> To provide a consistent experience for your visitors it might be helpful to display the page size selection always. This is done by setting the AlwaysVisible attribute of the PagerStyle element to true, like highlighted above. Customize the values of page size Your delegate method for the ItemCreated event should look like this: protected void RadGrid_ItemCreated(object sender, GridItemEventArgs e){    if (e.Item is GridPagerItem)    {        var dropDown = (RadComboBox)e.Item.FindControl("PageSizeComboBox");        var totalCount = ((GridPagerItem)e.Item).Paging.DataSourceCount;        var sizes = new Dictionary<string, string>() {            {"10", "10"},            {"20", "20"},            {"50", "50"}        };        if (totalCount > 100)        {            sizes.Add("100", "100");        }        if (totalCount > 200)        {            sizes.Add("200", "200");        }        sizes.Add("All", totalCount.ToString());        dropDown.Items.Clear();        foreach (var size in sizes)        {            var cboItem = new RadComboBoxItem() { Text = size.Key, Value = size.Value };            cboItem.Attributes.Add("ownerTableViewId", e.Item.OwnerTableView.ClientID);            dropDown.Items.Add(cboItem);        }        dropDown.FindItemByValue(e.Item.OwnerTableView.PageSize.ToString()).Selected = true;    }} It is important that we explicitly check the event arguments for GridPagerItem as it is the control that contains the PageSizeComboBox control that we want to manipulate. To keep the actual modification and exposure of possible page size values flexible I am filling a Dictionary with the requested 'key/value'-pairs based on the number of total records displayed in the grid. As a final step, ensure that the previously selected value is the active one using the FindItemByValue() method. Of course, there might be different requirements but I hope that the snippet above provide a first insight into customized page size value in Telerik's Grid. The Grid demos describe a more advanced approach to customize the Pager.

    Read the article

  • Oracle BI Server Modeling, Part 1- Designing a Query Factory

    - by bob.ertl(at)oracle.com
      Welcome to Oracle BI Development's BI Foundation blog, focused on helping you get the most value from your Oracle Business Intelligence Enterprise Edition (BI EE) platform deployments.  In my first series of posts, I plan to show developers the concepts and best practices for modeling in the Common Enterprise Information Model (CEIM), the semantic layer of Oracle BI EE.  In this segment, I will lay the groundwork for the modeling concepts.  First, I will cover the big picture of how the BI Server fits into the system, and how the CEIM controls the query processing. Oracle BI EE Query Cycle The purpose of the Oracle BI Server is to bridge the gap between the presentation services and the data sources.  There are typically a variety of data sources in a variety of technologies: relational, normalized transaction systems; relational star-schema data warehouses and marts; multidimensional analytic cubes and financial applications; flat files, Excel files, XML files, and so on. Business datasets can reside in a single type of source, or, most of the time, are spread across various types of sources. Presentation services users are generally business people who need to be able to query that set of sources without any knowledge of technologies, schemas, or how sources are organized in their company. They think of business analysis in terms of measures with specific calculations, hierarchical dimensions for breaking those measures down, and detailed reports of the business transactions themselves.  Most of them create queries without knowing it, by picking a dashboard page and some filters.  Others create their own analysis by selecting metrics and dimensional attributes, and possibly creating additional calculations. The BI Server bridges that gap from simple business terms to technical physical queries by exposing just the business focused measures and dimensional attributes that business people can use in their analyses and dashboards.   After they make their selections and start the analysis, the BI Server plans the best way to query the data sources, writes the optimized sequence of physical queries to those sources, post-processes the results, and presents them to the client as a single result set suitable for tables, pivots and charts. The CEIM is a model that controls the processing of the BI Server.  It provides the subject areas that presentation services exposes for business users to select simplified metrics and dimensional attributes for their analysis.  It models the mappings to the physical data access, the calculations and logical transformations, and the data access security rules.  The CEIM consists of metadata stored in the repository, authored by developers using the Administration Tool client.     Presentation services and other query clients create their queries in BI EE's SQL-92 language, called Logical SQL or LSQL.  The API simply uses ODBC or JDBC to pass the query to the BI Server.  Presentation services writes the LSQL query in terms of the simplified objects presented to the users.  The BI Server creates a query plan, and rewrites the LSQL into fully-detailed SQL or other languages suitable for querying the physical sources.  For example, the LSQL on the left below was rewritten into the physical SQL for an Oracle 11g database on the right. Logical SQL   Physical SQL SELECT "D0 Time"."T02 Per Name Month" saw_0, "D4 Product"."P01  Product" saw_1, "F2 Units"."2-01  Billed Qty  (Sum All)" saw_2 FROM "Sample Sales" ORDER BY saw_0, saw_1       WITH SAWITH0 AS ( select T986.Per_Name_Month as c1, T879.Prod_Dsc as c2,      sum(T835.Units) as c3, T879.Prod_Key as c4 from      Product T879 /* A05 Product */ ,      Time_Mth T986 /* A08 Time Mth */ ,      FactsRev T835 /* A11 Revenue (Billed Time Join) */ where ( T835.Prod_Key = T879.Prod_Key and T835.Bill_Mth = T986.Row_Wid) group by T879.Prod_Dsc, T879.Prod_Key, T986.Per_Name_Month ) select SAWITH0.c1 as c1, SAWITH0.c2 as c2, SAWITH0.c3 as c3 from SAWITH0 order by c1, c2   Probably everybody reading this blog can write SQL or MDX.  However, the trick in designing the CEIM is that you are modeling a query-generation factory.  Rather than hand-crafting individual queries, you model behavior and relationships, thus configuring the BI Server machinery to manufacture millions of different queries in response to random user requests.  This mass production requires a different mindset and approach than when you are designing individual SQL statements in tools such as Oracle SQL Developer, Oracle Hyperion Interactive Reporting (formerly Brio), or Oracle BI Publisher.   The Structure of the Common Enterprise Information Model (CEIM) The CEIM has a unique structure specifically for modeling the relationships and behaviors that fill the gap from logical user requests to physical data source queries and back to the result.  The model divides the functionality into three specialized layers, called Presentation, Business Model and Mapping, and Physical, as shown below. Presentation services clients can generally only see the presentation layer, and the objects in the presentation layer are normally the only ones used in the LSQL request.  When a request comes into the BI Server from presentation services or another client, the relationships and objects in the model allow the BI Server to select the appropriate data sources, create a query plan, and generate the physical queries.  That's the left to right flow in the diagram below.  When the results come back from the data source queries, the right to left relationships in the model show how to transform the results and perform any final calculations and functions that could not be pushed down to the databases.   Business Model Think of the business model as the heart of the CEIM you are designing.  This is where you define the analytic behavior seen by the users, and the superset library of metric and dimension objects available to the user community as a whole.  It also provides the baseline business-friendly names and user-readable dictionary.  For these reasons, it is often called the "logical" model--it is a virtual database schema that persists no data, but can be queried as if it is a database. The business model always has a dimensional shape (more on this in future posts), and its simple shape and terminology hides the complexity of the source data models. Besides hiding complexity and normalizing terminology, this layer adds most of the analytic value, as well.  This is where you define the rich, dimensional behavior of the metrics and complex business calculations, as well as the conformed dimensions and hierarchies.  It contributes to the ease of use for business users, since the dimensional metric definitions apply in any context of filters and drill-downs, and the conformed dimensions enable dashboard-wide filters and guided analysis links that bring context along from one page to the next.  The conformed dimensions also provide a key to hiding the complexity of many sources, including federation of different databases, behind the simple business model. Note that the expression language in this layer is LSQL, so that any expression can be rewritten into any data source's query language at run time.  This is important for federation, where a given logical object can map to several different physical objects in different databases.  It is also important to portability of the CEIM to different database brands, which is a key requirement for Oracle's BI Applications products. Your requirements process with your user community will mostly affect the business model.  This is where you will define most of the things they specifically ask for, such as metric definitions.  For this reason, many of the best-practice methodologies of our consulting partners start with the high-level definition of this layer. Physical Model The physical model connects the business model that meets your users' requirements to the reality of the data sources you have available. In the query factory analogy, think of the physical layer as the bill of materials for generating physical queries.  Every schema, table, column, join, cube, hierarchy, etc., that will appear in any physical query manufactured at run time must be modeled here at design time. Each physical data source will have its own physical model, or "database" object in the CEIM.  The shape of each physical model matches the shape of its physical source.  In other words, if the source is normalized relational, the physical model will mimic that normalized shape.  If it is a hypercube, the physical model will have a hypercube shape.  If it is a flat file, it will have a denormalized tabular shape. To aid in query optimization, the physical layer also tracks the specifics of the database brand and release.  This allows the BI Server to make the most of each physical source's distinct capabilities, writing queries in its syntax, and using its specific functions. This allows the BI Server to push processing work as deep as possible into the physical source, which minimizes data movement and takes full advantage of the database's own optimizer.  For most data sources, native APIs are used to further optimize performance and functionality. The value of having a distinct separation between the logical (business) and physical models is encapsulation of the physical characteristics.  This encapsulation is another enabler of packaged BI applications and federation.  It is also key to hiding the complex shapes and relationships in the physical sources from the end users.  Consider a routine drill-down in the business model: physically, it can require a drill-through where the first query is MDX to a multidimensional cube, followed by the drill-down query in SQL to a normalized relational database.  The only difference from the user's point of view is that the 2nd query added a more detailed dimension level column - everything else was the same. Mappings Within the Business Model and Mapping Layer, the mappings provide the binding from each logical column and join in the dimensional business model, to each of the objects that can provide its data in the physical layer.  When there is more than one option for a physical source, rules in the mappings are applied to the query context to determine which of the data sources should be hit, and how to combine their results if more than one is used.  These rules specify aggregate navigation, vertical partitioning (fragmentation), and horizontal partitioning, any of which can be federated across multiple, heterogeneous sources.  These mappings are usually the most sophisticated part of the CEIM. Presentation You might think of the presentation layer as a set of very simple relational-like views into the business model.  Over ODBC/JDBC, they present a relational catalog consisting of databases, tables and columns.  For business users, presentation services interprets these as subject areas, folders and columns, respectively.  (Note that in 10g, subject areas were called presentation catalogs in the CEIM.  In this blog, I will stick to 11g terminology.)  Generally speaking, presentation services and other clients can query only these objects (there are exceptions for certain clients such as BI Publisher and Essbase Studio). The purpose of the presentation layer is to specialize the business model for different categories of users.  Based on a user's role, they will be restricted to specific subject areas, tables and columns for security.  The breakdown of the model into multiple subject areas organizes the content for users, and subjects superfluous to a particular business role can be hidden from that set of users.  Customized names and descriptions can be used to override the business model names for a specific audience.  Variables in the object names can be used for localization. For these reasons, you are better off thinking of the tables in the presentation layer as folders than as strict relational tables.  The real semantics of tables and how they function is in the business model, and any grouping of columns can be included in any table in the presentation layer.  In 11g, an LSQL query can also span multiple presentation subject areas, as long as they map to the same business model. Other Model Objects There are some objects that apply to multiple layers.  These include security-related objects, such as application roles, users, data filters, and query limits (governors).  There are also variables you can use in parameters and expressions, and initialization blocks for loading their initial values on a static or user session basis.  Finally, there are Multi-User Development (MUD) projects for developers to check out units of work, and objects for the marketing feature used by our packaged customer relationship management (CRM) software.   The Query Factory At this point, you should have a grasp on the query factory concept.  When developing the CEIM model, you are configuring the BI Server to automatically manufacture millions of queries in response to random user requests. You do this by defining the analytic behavior in the business model, mapping that to the physical data sources, and exposing it through the presentation layer's role-based subject areas. While configuring mass production requires a different mindset than when you hand-craft individual SQL or MDX statements, it builds on the modeling and query concepts you already understand. The following posts in this series will walk through the CEIM modeling concepts and best practices in detail.  We will initially review dimensional concepts so you can understand the business model, and then present a pattern-based approach to learning the mappings from a variety of physical schema shapes and deployments to the dimensional model.  Along the way, we will also present the dimensional calculation template, and learn how to configure the many additivity patterns.

    Read the article

< Previous Page | 71 72 73 74 75 76 77 78 79 80 81 82  | Next Page >