Search Results

Search found 60391 results on 2416 pages for 'data generation'.

Page 515/2416

  • how to use sed/awk to remove words with multiple pattern count

    - by user1461112
    I have a file of string records where one of the fields (delimited by ",") can contain one or more "-" characters. The goal is to delete the field value if it contains more than two "-". I am trying to recoup my past knowledge of sed/awk but can't make much headway.
    ==========
    info,whitepaper,Data-Centers,yes-the-6-top-problems-in-your-data-center-lane
    info,whitepaper,Data-Centers,the-evolution-center
    info,whitepaper,Data-Centers,the-evolution-of-lan-technology-lanner
    ==========
    Expected outcome:
    info,whitepaper,Data-Centers
    info,whitepaper,Data-Centers,the-evolution-center
    info,whitepaper,Data-Centers
    Thanks
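    The rule described in the question can be sketched in a few lines. This is only an illustration in Python rather than sed/awk, and it assumes the rule applies to every comma-separated field (not just the last one) and that a dropped field takes its separating comma with it. The function name is made up for this example.

```python
def drop_hyphen_heavy_fields(line: str, max_hyphens: int = 2) -> str:
    """Remove every comma-separated field that contains more than max_hyphens '-'."""
    fields = line.split(",")
    kept = [field for field in fields if field.count("-") <= max_hyphens]
    return ",".join(kept)


# Sample records taken from the question.
records = [
    "info,whitepaper,Data-Centers,yes-the-6-top-problems-in-your-data-center-lane",
    "info,whitepaper,Data-Centers,the-evolution-center",
    "info,whitepaper,Data-Centers,the-evolution-of-lan-technology-lanner",
]

for record in records:
    print(drop_hyphen_heavy_fields(record))
```

    A field with exactly two "-" (such as the-evolution-center) is kept, because the question asks to delete only values with more than two.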

    Read the article

  • Connecting to DB2 from SSIS

    - by Christopher House
    The project I'm currently working on involves moving various pieces of data from a legacy DB2 environment to some SQL Server and flat file locations.  Most of the data flows are real time, so they were a natural fit for the client's MQSeries on their iSeries servers and BizTalk to handle the messaging.  Some of the data flows, however, are daily batch type transmissions.  For the daily batch transmissions, it was decided that we'd use SSIS to pull the data directly from DB2 to either SQL Server or a flat file.  I'm not at all an SSIS guy; I've done a bit here and there, but mainly for situations where we needed to move data from a dev environment to QA, mostly informal stuff like that.  And, as much as I'm not an SSIS guy, I'm even less a DB2/iSeries guy.  Prior to this engagement, my knowledge of DB2 was limited to the fact that it's an IBM product and that it was probably a DBMS platform (that's what the DB in DB2 means, right?).   One of my first goals when I came onto this project was to develop a POC SSIS package to pull some data from DB2 and dump it to a flat file.  It sounded like a pretty straightforward task.  As always, the devil is in the details.  Configuring the DB2 connection manager took a bit of trial and error.  As such, I thought I'd post my experiences here in hopes that they might save someone the effort I went through.  That being said, please keep in mind, as I pointed out, I'm not at all a DB2 guy, so my terminology and explanations may not be 100% spot on. Before you get started, you need to figure out how you're going to connect to DB2.  From the research I did, it looks like there are a few options.  IBM has both an OLE DB and a .Net data provider which can be found here.  I installed their client access tools and tried to use both the .Net and OLE DB providers, but I received an error message from both when attempting to connect to the iSeries that indicated I needed a license for a product called DB2 Connect.  
I inquired with one of my client's iSeries resources about a license for this product and it appears they didn't have one, so that meant the IBM drivers were out.  The other option that I found quite a bit of discussion around was Microsoft's OLE DB Provider for DB2.  This driver is part of the feature pack for SQL Server 2008 Enterprise Edition and can be downloaded here. As it turns out, I already had Microsoft's driver installed on my dev VM, which struck me as odd since I hadn't installed it.  I discovered that the driver is installed with the BizTalk adapter pack for host systems, which was also installed on my VM.  However, it looks like the version used by the adapter pack is newer than the version provided in the SQL Server feature pack.   Once you get the driver installed, create a connection manager in your package just like you normally would and select the Microsoft OLE DB Provider for DB2 from the list of available drivers. After you select the driver, you'll need to enter your host name, login credentials and initial catalog. A couple of things to note here.  First, the initial catalog needs to be the same as your host name.  Not sure why that is, but trust me, it just does.  Second, for credentials, in my environment, we're using what the client's iSeries people refer to as "profiles".  I guess this is similar to SQL auth in the SQL Server world.  In other words, they've given me a username and password for connecting to DB2, so I've entered them here. Next, click the Data Links button.  On the Data Links screen, enter your package collection on the first tab. Package collection is one of those DB2 concepts I'm still trying to figure out.  From the little bit I've read, packages are used to control SQL compilation and each DB2 connection needs one.  The package collection, I believe, controls where your package is created.  
One of the iSeries folks I've been working with told me that I should always use QGPL for my package collection, as QGPL is "general purpose" and doesn't require any additional authority. Next, click the ellipsis next to the Network drop-down.  Here you'll want to enter your host name again. Again, not sure why you need to do this, but trust me, my connection wouldn't work until I entered my host name here. Finally, go to the Advanced tab, select your DBMS platform and check Process binary as character. My environment is DB2 on the iSeries and iSeries is the replacement for AS/400, so I selected DB2/AS400 for my platform.  Process binary as character was necessary to handle some of the DB2 data types.  I had a few columns that showed all their data as "System.Byte[]".  Checking Process binary as character resolved this. At this point, you should be good to go.  You can go back to the Connection tab on the Data Links dialog to perform a couple of tests to validate your configuration.  The Test Connection button is obvious: it just verifies that you can connect to the host using the configuration data you've entered.  The Packages button will attempt to connect to the host and create the packages required to execute queries. This isn't meant to be a comprehensive look at SSIS and DB2; these are just some of the notes I've come up with since I started working with DB2 and SSIS.  I'm sure as I continue developing my packages, I'll find more quirks and will post them here.

    Read the article

  • Setting up a new Silverlight 4 Project with WCF RIA Services

    - by Kevin Grossnicklaus
    Many of my clients are actively using Silverlight 4 and RIA Services to build powerful line of business applications.  Getting things set up correctly is critical to being able to take full advantage of the RIA Services plumbing, and when developers struggle with the setup they tend to shy away from the solution as a whole.  I’m a big proponent of RIA Services and wanted to take the opportunity to share some of my experiences in setting up these types of projects.  In late 2010 I presented a RIA Services Master Class here in St. Louis, MO through my firm (ArchitectNow) and the information shared in this post was promised during that presentation. One other thing I want to mention before diving in is the existence of a number of other great posts on this subject.  I’ve learned a lot from many of them and wanted to call out a few of them.  The purpose of my post is to point out some of the gotchas that people get caught up on in the process, but I would still encourage you to do as much additional research as you can to find the perfect setup for your needs. Here are a few additional blog posts and articles you should check out on the subject: http://msdn.microsoft.com/en-us/library/ee707351(VS.91).aspx http://adam-thompson.com/post/2010/07/03/Getting-Started-with-WCF-RIA-Services-for-Silverlight-4.aspx Technologies I don’t intend for this post to turn into a full WCF RIA Services tutorial but I did want to point out what technologies we will be using: Visual Studio.NET 2010, Silverlight 4.0, WCF RIA Services for Visual Studio 2010, and Entity Framework 4.0. I also wanted to point out that the screenshots came from my personal development box, which has a number of additional plug-ins and frameworks loaded, so a few of the screenshots might not match 100% with what you see on your own machines. If you do not have Visual Studio 2010 you can download the express version from http://www.microsoft.com/express.  
The Silverlight 4.0 tools and the WCF RIA Services components are installed via the Web Platform Installer (http://www.microsoft.com/web/download). Also, the examples given in this post are done in C#…sorry to you VB folks but the concepts are 100% identical. Setting up a new RIA Services Project This section will provide a step-by-step walkthrough of setting up a new RIA Services project using a shared DLL for server side code and a simple Entity Framework model for data access.  All projects are created with the consistent ArchitectNow.RIAServices filename prefix and default namespace.  This would be modified to match your company's standards. First, open Visual Studio and open the new project window via File->New->Project.  In the New Project window, select the Silverlight folder in the Installed Templates section on the left and select “Silverlight Application” as your project type.  Verify your solution name and location are set appropriately.  Note that the project name we specified in the example below ends with .Client.  This indicates the name which will be given to our Silverlight project. I consider Silverlight a client-side technology and thus use this name to reflect that.  Click OK to continue. During the creation of a new Silverlight 4 project you will be prompted with the following dialog to create a new ASP.NET web project to host your Silverlight content.  As we are demonstrating the setup of a WCF RIA Services infrastructure, make sure the “Enable WCF RIA Services” option is checked and click OK.  Obviously, there are some other options here which have an effect on your solution and you are welcome to look around.  For our example we are going to leave the ASP.NET Web Application Project selected.  If you are interested in having your Silverlight project hosted in an MVC 2 application or a Web Site project, these options are available as well.  Also, whichever web project type you select, the name can be modified here as well.  
Note that it defaults to the same name as your Silverlight project with the addition of a .Web suffix. At this point, your full Silverlight 4 project and host ASP.NET Web Application should be created and will now display in your Visual Studio solution explorer as part of a single Visual Studio solution as follows: Now we want to add our WCF RIA Services projects to this same solution.  To do so, right-click on the Solution node in the solution explorer and select Add->New Project.  In the New Project dialog again select the Silverlight folder under the Visual C# node on the left and, in the main area of the screen, select the WCF RIA Services Class Library project template as shown below.  Make sure your project name is set appropriately as well.  For the sample below, we will name the project “ArchitectNow.RIAServices.Server.Entities”.   The .Server.Entities suffix we use is meant to simply indicate that this particular project will contain our WCF RIA Services entity classes (as you will see below).  Click OK to continue. Once you have created the WCF RIA Services Class Library specified above, Visual Studio will automatically add TWO projects to your solution.  The first will be a project called .Server.Entities (using our naming conventions) and the other will have the same name with a .Web suffix.  The full solution (with all 4 projects) is shown in the image below.  The .Entities project will essentially remain empty and is actually a Silverlight 4 class library that will contain generated RIA Services domain objects.  It will be referenced by our front-end Silverlight project and thus allow for simplified sharing of code between the client and the server.   The .Entities.Web project is a .NET 4.0 class library into which we will put our data access code (via Entity Framework).  This is our server side code and business logic, and the RIA Services plumbing will maintain a link between this project and the front end.  
Specific entities such as our domain objects and other code we set to be shared will be copied automatically into the .Entities project to be used in both the front end and the back end. At this point, we want to do a little cleanup of the projects in our solution and we will do so by deleting the “Class1.cs” class from both the .Entities project and the .Entities.Web project.  (Has anyone ever intentionally named a class “Class1”?) Next, we need to configure a few references to make RIA Services work.  THIS IS A KEY STEP THAT CAUSES MANY HEADACHES FOR DEVELOPERS NEW TO THIS INFRASTRUCTURE! Using the Add References dialog in Visual Studio, add a project reference from the *.Client project (our Silverlight 4 client) to the *.Entities project (our RIA Services class library).  Next, again using the Add References dialog in Visual Studio, add a project reference from the *.Client.Web project (our ASP.NET host project) to the *.Entities.Web project (our back-end data services DLL).  To get to the Add References dialog, simply right-click on the project you wish to add a reference to in the Visual Studio solution explorer and select “Add Reference” from the resulting context menu.  You will want to make sure these references are added as “Project” references to simplify your future debugging.  To reiterate the reference direction using the project names we have utilized in this example thus far:  .Client references .Entities and .Client.Web references .Entities.Web.  If you have opted for a different naming convention, then the Silverlight project must reference the RIA Services Silverlight class library and the ASP.NET host project must reference the server-side class library. Next, we are going to add a new Entity Framework data model to our data services project (.Entities.Web).  We will do this by right-clicking on this project (ArchitectNow.Server.Entities.Web in the above diagram) and selecting Add->New Item.  
In the Add New Item dialog we will select ADO.NET Entity Data Model as in the following diagram.  For now we will call this simply SampleDataModel.edmx and click Add. It is worth pointing out that WCF RIA Services is in no way tied to the Entity Framework as a means of accessing data, and any data access technology is supported (as long as the server side implementation maps to the RIA Services pattern, which is a topic beyond the scope of this post).  We are using EF to quickly demonstrate the RIA Services concepts and setup infrastructure.  As such, I am not providing a database schema with this post but am instead connecting to a small sample database on my local machine.  The following diagram shows a simple EF Data Model with two tables that I reverse engineered from a local data store.   If you are putting together your own solution, feel free to reverse engineer a few tables from any local database to which you have access. At this point, once you have an EF data model generated as an EDMX into your .Entities.Web project, YOU MUST BUILD YOUR SOLUTION.  I know it seems strange to call that out but it is important that the solution be built at this point for the next step to be successful.  Obviously, if you have any build errors, these must be addressed at this point. Next, we will add a RIA Services Domain Service to our .Entities.Web project (our server side code).  We will need to right-click on the .Entities.Web project and select Add->New Item.  In the Add New Item dialog, select Domain Service Class and verify the name of your new Domain Service is correct (ours is called SampleService.cs in the image below).  Next, click “Add”. After clicking “Add” to include the Domain Service Class in the selected project, you will be presented with the following dialog.  In it, you can choose which entities from the selected EDMX to include in your services and whether they should be allowed to be edited (i.e. inserted, updated, or deleted) via this service.  
If the “Available DataContext/ObjectContext classes” dropdown is empty, this indicates you have not yet successfully built your project after adding your EDMX.  I would also recommend verifying that the “Generate associated classes for metadata” option is selected.  Once you have selected the appropriate options, click “OK”. Once you have added the domain service class to the .Entities.Web project, the resulting solution should look similar to the following: Note that in the solution you now have a SampleDataModel.edmx which represents your EF data mapping to your database and a SampleService.cs which will contain a large amount of generated RIA Services code which RIA Services utilizes to access this data from the Silverlight front-end.  You will put all your server side data access code and logic into the SampleService.cs class.  The SampleService.metadata.cs class is for decorating the generated domain objects with attributes from the System.ComponentModel.DataAnnotations namespace for validation purposes. FINAL AND KEY CONFIGURATION STEP!  One key step that causes significant headache to developers configuring RIA Services for the first time is the fact that, when we added the EDMX to the .Entities.Web project for our EF data access, a connection string was generated and placed within a newly generated App.Config file within that project.  While we didn’t point it out at the time, you can see it in the image above.  This connection string will be required for the EF data model to successfully locate its data.  Also, when we added the Domain Service class to the .Entities.Web project, a number of RIA Services configuration options were added to the same App.Config file.   Unfortunately, when we ultimately begin to utilize the RIA Services infrastructure, our Silverlight UI will be making RIA Services calls through the ASP.NET host project (i.e. .Client.Web).  
This host project has a reference to the .Entities.Web project which actually contains the code, so all will pass through correctly EXCEPT for the fact that the host project will utilize its own Web.Config for any configuration settings.  For this reason we must now merge all the sections of the App.Config file in the .Entities.Web project into the Web.Config file in the .Client.Web project.  I know this is a bit tedious and I wish there were a simpler solution, but it is required for our RIA Services Domain Service to be made available to the front end Silverlight project.  Much of this manual merge can be achieved by simply cutting and pasting from App.Config into Web.Config.  Unfortunately, the <system.webServer> section will exist in both and the contents of this section will need to be manually merged.  Fortunately, this is a step that needs to be taken only once per solution.  As you add additional data structures and Domain Services methods to the server, no additional changes will be necessary to the Web.Config. Next Steps At this point, we have walked through the basic setup of a simple RIA Services solution.  Unfortunately, there is still a lot to know about RIA Services and we have not even begun to take advantage of the plumbing which we just configured (meaning we haven’t even made a single RIA Services call).  I plan on posting a few more introductory posts over the next few weeks to take us to this step.  If you have any questions on the content in this post feel free to reach out to me via this Blog and I’ll gladly point you in (hopefully) the right direction. Resources Prior to closing out this post, I wanted to share a number of resources to help you get started with RIA Services.  While I plan on posting more on the subject, I didn’t invent any of this stuff and wanted to give credit to the following areas for helping me put a lot of these pieces into place.   
The books and online resources below will go a long way to making you extremely productive with RIA services in the shortest time possible.  The only thing required of you is the dedication to take advantage of the resources available. Books Pro Business Applications with Silverlight 4 http://www.amazon.com/Pro-Business-Applications-Silverlight-4/dp/1430272074/ref=sr_1_2?ie=UTF8&qid=1291048751&sr=8-2 Silverlight 4 in Action http://www.amazon.com/Silverlight-4-Action-Pete-Brown/dp/1935182374/ref=sr_1_1?ie=UTF8&qid=1291048751&sr=8-1 Pro Silverlight for the Enterprise (Books for Professionals by Professionals) http://www.amazon.com/Pro-Silverlight-Enterprise-Books-Professionals/dp/1430218673/ref=sr_1_3?ie=UTF8&qid=1291048751&sr=8-3 Web Content RIA Services http://channel9.msdn.com/Blogs/RobBagby/NET-RIA-Services-in-5-Minutes http://silverlight.net/riaservices/ http://www.silverlight.net/learn/videos/all/net-ria-services-intro/ http://www.silverlight.net/learn/videos/all/ria-services-support-visual-studio-2010/ http://channel9.msdn.com/learn/courses/Silverlight4/SL4BusinessModule2/SL4LOB_02_01_RIAServices http://www.myvbprof.com/MainSite/index.aspx#/zSL4_RIA_01 http://channel9.msdn.com/blogs/egibson/silverlight-firestarter-ria-services http://msdn.microsoft.com/en-us/library/ee707336%28v=VS.91%29.aspx Silverlight www.silverlight.net http://msdn.microsoft.com/en-us/silverlight4trainingcourse.aspx http://channel9.msdn.com/shows/silverlighttv

    Read the article

  • Replication Services in a BI environment

    - by jorg
    In this blog post I will explain the principles of SQL Server Replication Services without too much detail, and I will take a look at the BI capabilities that, in my opinion, Replication Services could offer. SQL Server Replication Services provides tools to copy and distribute database objects from one database system to another and maintain consistency afterwards. These tools basically copy or synchronize data with little or no transformation; they do not offer capabilities to transform data or apply business rules, like ETL tools do. The only “transformations” Replication Services offers are filters that remove records or columns from your data set. You can achieve this by selecting the desired columns of a table and/or by using WHERE clauses like this: SELECT <published_columns> FROM [Table] WHERE [DateTime] >= getdate() - 60 There are three types of replication: Transactional Replication This type replicates data on a transactional level. The Log Reader Agent reads directly from the transaction log of the source database (Publisher) and clones the transactions to the Distribution Database (Distributor); this database acts as a queue for the destination database (Subscriber). Next, the Distribution Agent moves the cloned transactions that are stored in the Distribution Database to the Subscriber. The Distribution Agent can either run at scheduled intervals or continuously, which offers near real-time replication of data! So for example when a user executes an UPDATE statement on one or multiple records in the publisher database, this transaction (not the data itself) is copied to the distribution database and is then also executed on the subscriber. When the Distribution Agent is set to run continuously, this process runs all the time and transactions on the publisher are replicated in small batches (near real-time); when it runs on scheduled intervals it executes larger batches of transactions, but the idea is the same. 
Snapshot Replication This type of replication makes an initial copy of the database objects that need to be replicated; this includes the schemas and the data itself. All types of replication must start with a snapshot of the database objects from the Publisher to initialize the Subscriber. Transactional replication needs an initial snapshot of the replicated publisher tables/objects to run its cloned transactions on and maintain consistency. The Snapshot Agent copies the schemas of the tables that will be replicated to files that will be stored in the Snapshot Folder, which is a normal folder on the file system. When all the schemas are ready, the data itself will be copied from the Publisher to the snapshot folder. The snapshot is generated as a set of bulk copy program (BCP) files. Next, the Distribution Agent moves the snapshot to the Subscriber; if necessary it applies schema changes first and copies the data itself afterwards. The application of schema changes to the Subscriber is a nice feature: when you change the schema of the Publisher with, for example, an ALTER TABLE statement, that change is propagated by default to the Subscriber(s). Merge Replication Merge replication is typically used in server-to-client environments, for example when subscribers need to receive data, make changes offline, and later synchronize changes with the Publisher and other Subscribers, like with mobile devices that need to synchronize once in a while. Because I don’t really see BI capabilities here, I will not explain this type of replication any further. Replication Services in a BI environment Transactional Replication can be very useful in BI environments. In my opinion you never want users to run custom (SSRS) reports or PowerPivot solutions directly on your production database; it can slow down the system and can cause deadlocks in the database, which can cause errors. 
Transactional Replication can offer a read-only, near real-time database for reporting purposes with minimal overhead on the source system. Snapshot Replication can also be useful in BI environments: if you don’t need a near real-time copy of the database, you can choose to use this form of replication. Besides being an alternative to Transactional Replication, it can be used to stage data so it can be transformed and moved into the data warehousing environment afterwards. In many solutions I have seen developers create multiple SSIS packages that simply copy data from one or more source systems to a staging database that serves as the source for the ETL process. The creation of these packages takes a lot of (boring) time, while Replication Services can do the same in minutes. It is possible to filter out columns and/or records and it can even apply schema changes automatically, so I think it offers enough features here. I don’t know how the performance will be and if it really works as well for this purpose as I expect, but I want to try this out soon!
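    The Publisher, Distributor and Subscriber flow described for Transactional Replication can be pictured with a toy model. This is only a conceptual sketch in Python and not how SQL Server implements its agents: the function names, the tuple format for transactions, and the dictionary standing in for the Subscriber database are all assumptions made for illustration.

```python
from collections import deque


def log_reader_agent(transaction_log, distribution_queue, last_read):
    """Copy log entries that have not been read yet into the distribution queue.

    Returns the new read position in the log.
    """
    for entry in transaction_log[last_read:]:
        distribution_queue.append(entry)
    return len(transaction_log)


def distribution_agent(distribution_queue, subscriber, batch_size=None):
    """Replay queued transactions on the subscriber, oldest first.

    batch_size=None drains the whole queue (a scheduled run);
    a small batch_size mimics continuous, near real-time delivery.
    """
    applied = 0
    while distribution_queue and (batch_size is None or applied < batch_size):
        operation, key, value = distribution_queue.popleft()
        if operation == "delete":
            subscriber.pop(key, None)
        else:  # "insert" or "update"
            subscriber[key] = value
        applied += 1
    return applied


# Hypothetical publisher log: (operation, primary key, new value).
publisher_log = [
    ("insert", 1, "a"),
    ("insert", 2, "b"),
    ("update", 1, "c"),
    ("delete", 2, None),
]

queue = deque()      # stands in for the Distribution Database
subscriber = {}      # stands in for the Subscriber table

position = log_reader_agent(publisher_log, queue, last_read=0)
applied = distribution_agent(queue, subscriber)
print(position, applied, subscriber)
```

    The point of the model is that transactions, not table contents, travel through the queue, so the Subscriber ends up consistent with the Publisher by replaying them in order.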

    Read the article

  • The Complementary Roles of PLM and PIM

    - by Ulf Köster
    Oracle Product Value Chain Solutions (aka Enterprise PLM Solutions) are a comprehensive set of product management solutions that work together to provide Oracle customers with a broad array of capabilities to manage all aspects of product life: innovation, design, launch, and supply chain / commercialization processes beyond the capabilities and boundaries of traditional engineering-focused Product Lifecycle Management applications. They support companies with an integrated managed view across the product value chain: From Lab to Launch, From Farm to Fork, From Concept to Product to Customer, From Product Innovation to Product Design and Product Commercialization. Product Lifecycle Management (PLM) represents a broad suite of software solutions to improve product-oriented business processes and data. PLM success stories prove that PLM helps companies improve time to market, increase product-related revenue, reduce product costs, reduce internal costs and improve product quality. As a maturing suite of enterprise solutions, PLM is still evolving to realize the promise it can provide across all facets of a business and all phases of the product lifecycle. The vision for PLM includes everything from gathering early requirements for a product through multiple stages of the product lifecycle from product design, through commercialization and eventual product retirement or replacement. In discrete or process industries, PLM is typically more focused on Product Definition as items with respect to the technical view of a material or part, including specifications, bills of material and manufacturing data. With Agile PLM, this is specifically related to capabilities addressing Product Collaboration, Governance and Compliance, Product Quality Management, Product Cost Management and Engineering Collaboration. 
PLM today is mainly addressing key requirements in the early product lifecycle, in engineering changes or in the “innovation cycle”, and primarily adds value related to product design, development, launch and engineering change process. In short, PLM is the master for Product Definition, wherever manufacturing takes place. Product Information Management (PIM) is a product suite that has evolved in parallel to PLM. Product Information Management (PIM) can extend the value of PLM implementations by providing complementary tools and capabilities. More relevant in the area of Product Commercialization, the vision for PIM is to manage product information throughout an enterprise and supply chain to improve product-related knowledge management, information sharing and synchronization from multiple data sources. PIM success stories have shown the ability to provide multiple benefits, with particular emphasis on reducing information complexity and information management costs. Product Information in PIM is typically treated as the commercial view of a material or part, including sales and marketing information and categorization. PIM collects information from multiple manufacturing sites and multiple suppliers into its repository, but also provides integration tools to push the information back out to the other systems, serving as an active central repository with the aim to provide a holistic view on any product sold by a company (hence the name “Product Hub”). In short, PIM is the master of commercial Product Information. 
So PIM is quickly becoming mandatory because of its value in optimizing multichannel selling processes and relationships with customers, as you can see from the following table:

Viewpoint | PLM Current State | PIM | Key Benefits PIM adds to PLM
Product Lifecycle | Primarily R&D; front end; innovation cycle; change process | Primarily commercial / transactional state of lifecycle | Provides a seamless information flow from design and manufacturing through the ultimate selling and servicing of products
Data | Primarily focused on “item” vs. “product” data; product structures; specifications; technical information | Repository for all product information. Reaches out to entire enterprise and its various silos of product information and descriptions | Provides a “trusted source” of accurate product information to the internal organization and trading partners
Data Lifecycle | Repository for all design iterations; historical information | Released, current information, with version management and time stamping | Provides a single location to track and audit historical product information
Communication | PLM releases finished product to ERP; PLM is the master for Product Definition | Captures information from disparate sources, including in-house data stores; recognizes the reality of today’s data “mess” across information silos | Provides the ability to package product information to its audience in the desired, relevant format to meet their exacting business requirements
Departmental | R&D; Manufacturing; Quality; Compliance; Procurement; Strategic Marketing | Focus on Marketing and Sales; gathering information from other departments, multiple sites, multiple suppliers | A singular enterprise solution that leverages existing information silos and data stores
Supply Chain | Multi-site internal collaboration; supplier collaboration; customer collaboration | Works with customers, exchanges / data pools, and trading partners to provide relevant product information packaged the way the customer desires | Provides ability to provide trading partners and internal customers with information in a manner they desire, continuously
Tools | Data management; collaboration; innovation management | Cleansing; synchronization; hub functions | Consistent, clean and complete commercial product information

The goals of both PLM and PIM, put simply, are to help companies make more profit from their products. PLM and PIM solutions can be easily added as they share some of the same goals, while coming from two different perspectives: the definition of the product and the commercialization of the product. Both can serve as a form of product “system of record”, but take different approaches to delivering value. Oracle Product Value Chain solutions offer rich new strategies for executives to collectively leverage Agile PLM, Product Data Hub, together with Enterprise Data Quality for Products, and other industry leading Oracle applications, like Oracle Innovation Management, to achieve further incremental value. This is unique on the market today.

    Read the article

  • SQL SERVER – Fundamentals of Columnstore Index

    - by pinaldave
    There are two kinds of storage in a database: row store and column store. Row store does exactly as the name suggests: it stores rows of data on a page. Column store stores all the data in a column on the same page. These columns are much easier to search: instead of a query reading all the data in an entire row whether it is relevant or not, a column store query needs to read only the columns it references. This means major gains in search speed and far fewer reads from disk. Additionally, column store indexes are heavily compressed, which translates to even better memory use and faster searches. This all looks very exciting, but it does not mean that you should convert every single index from row store to column store. One has to understand where each kind of index is appropriate. Let us look in this article at what makes the columnstore type of index different.

    Column store indexes are powered by Microsoft’s VertiPaq technology. However, all you really need to know is that storing the data of a column together on the same pages is much faster and more efficient for this kind of query. Creating a column store index is very easy, and you don’t have to learn new syntax to create one. You just need to specify the keyword “COLUMNSTORE” and define the index as you normally would. Keep in mind, though, that once you add a column store index to a table, you cannot delete, insert or update the data: the table is READ ONLY. However, since column store will be mainly used for data warehousing, this should not be a big problem. You can always use partitioning to avoid rebuilding the index.

    A columnstore index stores each column in a separate set of disk pages, rather than storing multiple rows per page as data traditionally has been stored. The difference between column store and row store approaches is illustrated below: In the case of row store indexes, each page contains multiple complete rows, so the values of any one column are spread across all the pages.
    In the case of column store indexes, each page contains values from a single column. As a result, only the columns needed to answer a query are fetched from disk. Additionally, there is a good chance that a single column contains redundant data, which helps to compress it further. This has a positive effect on the buffer hit rate, because most of the data will already be in memory and will not need to be retrieved from disk.

    Let us see a small example of how a columnstore index improves the performance of a query on a large table. As a first step, let us create a dataset that is large enough to show the performance impact of the columnstore index. The time taken to create the sample data may vary between computers depending on their resources.

        USE AdventureWorks
        GO
        -- Create New Table
        CREATE TABLE [dbo].[MySalesOrderDetail](
            [SalesOrderID] [int] NOT NULL,
            [SalesOrderDetailID] [int] NOT NULL,
            [CarrierTrackingNumber] [nvarchar](25) NULL,
            [OrderQty] [smallint] NOT NULL,
            [ProductID] [int] NOT NULL,
            [SpecialOfferID] [int] NOT NULL,
            [UnitPrice] [money] NOT NULL,
            [UnitPriceDiscount] [money] NOT NULL,
            [LineTotal] [numeric](38, 6) NOT NULL,
            [rowguid] [uniqueidentifier] NOT NULL,
            [ModifiedDate] [datetime] NOT NULL
        ) ON [PRIMARY]
        GO
        -- Create clustered index
        CREATE CLUSTERED INDEX [CL_MySalesOrderDetail] ON [dbo].[MySalesOrderDetail]
        ( [SalesOrderDetailID])
        GO
        -- Create Sample Data Table
        -- WARNING: This query may run for up to 2-10 minutes based on your system's resources
        INSERT INTO [dbo].[MySalesOrderDetail]
        SELECT S1.*
        FROM Sales.SalesOrderDetail S1
        GO 100

    Now let us do a quick performance test. I have kept STATISTICS IO ON to measure how much IO the following queries take. In my test I will first run the query using the regular index and note its IO usage. After that we will create the columnstore index and measure the IO of the same query.
        -- Performance Test
        -- Comparing Regular Index with ColumnStore Index
        USE AdventureWorks
        GO
        SET STATISTICS IO ON
        GO
        -- Select Table with regular Index
        SELECT ProductID, SUM(UnitPrice) SumUnitPrice, AVG(UnitPrice) AvgUnitPrice,
            SUM(OrderQty) SumOrderQty, AVG(OrderQty) AvgOrderQty
        FROM [dbo].[MySalesOrderDetail]
        GROUP BY ProductID
        ORDER BY ProductID
        GO
        -- Table 'MySalesOrderDetail'. Scan count 1, logical reads 342261, physical reads 0, read-ahead reads 0.
        -- Create ColumnStore Index
        CREATE NONCLUSTERED COLUMNSTORE INDEX [IX_MySalesOrderDetail_ColumnStore]
        ON [MySalesOrderDetail] (UnitPrice, OrderQty, ProductID)
        GO
        -- Select Table with Columnstore Index
        SELECT ProductID, SUM(UnitPrice) SumUnitPrice, AVG(UnitPrice) AvgUnitPrice,
            SUM(OrderQty) SumOrderQty, AVG(OrderQty) AvgOrderQty
        FROM [dbo].[MySalesOrderDetail]
        GROUP BY ProductID
        ORDER BY ProductID
        GO

    It is very clear from the results that the query performs extremely fast after creating the columnstore index. The number of pages it has to read to run the query is drastically reduced, because the columns needed by the query are stored together on the same pages and the query does not have to go through every single page to read them. If we enable the execution plan and compare, we can see that the column store index performs far better than the regular index in this case. Let us clean up the database.

        -- Cleanup
        DROP INDEX [IX_MySalesOrderDetail_ColumnStore] ON [dbo].[MySalesOrderDetail]
        GO
        TRUNCATE TABLE dbo.MySalesOrderDetail
        GO
        DROP TABLE dbo.MySalesOrderDetail
        GO

    In future posts we will see cases where a columnstore index is not the appropriate solution, as well as a few other tips and tricks for columnstore indexes. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Index, SQL Optimization, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology
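To make the page-count argument concrete, here is a rough back-of-the-envelope sketch in Python. It is not how SQL Server actually lays out data: the 8192-byte page, the fixed per-column byte widths and the function names are all assumptions made for illustration, and compression and page overhead are ignored entirely.

```python
import math

def row_store_pages(n_rows, row_bytes, page_bytes=8192):
    """Pages touched when a scan has to read every complete row."""
    rows_per_page = page_bytes // row_bytes
    return math.ceil(n_rows / rows_per_page)

def column_store_pages(n_rows, column_bytes, needed, page_bytes=8192):
    """Pages touched when only the needed columns are read (no compression)."""
    total = 0
    for name in needed:
        values_per_page = page_bytes // column_bytes[name]
        total += math.ceil(n_rows / values_per_page)
    return total

# Hypothetical fixed widths in bytes, loosely modelled on MySalesOrderDetail.
COLUMN_BYTES = {
    "SalesOrderID": 4, "SalesOrderDetailID": 4, "CarrierTrackingNumber": 50,
    "OrderQty": 2, "ProductID": 4, "SpecialOfferID": 4, "UnitPrice": 8,
    "UnitPriceDiscount": 8, "LineTotal": 17, "rowguid": 16, "ModifiedDate": 8,
}

n_rows = 1_000_000
row_pages = row_store_pages(n_rows, sum(COLUMN_BYTES.values()))
col_pages = column_store_pages(
    n_rows, COLUMN_BYTES, ["UnitPrice", "OrderQty", "ProductID"]
)
print("row store pages:", row_pages)
print("column store pages:", col_pages)
```

Even in this simplified model, a query that touches three narrow columns reads a small fraction of the pages that a full row scan reads, which is the effect the logical reads figure above is measuring.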

    Read the article

  • Seeking on a Heap, and Two Useful DMVs

    - by Paul White
    So far in this mini-series on seeks and scans, we have seen that a simple ‘seek’ operation can be much more complex than it first appears.  A seek can contain one or more seek predicates – each of which can either identify at most one row in a unique index (a singleton lookup) or a range of values (a range scan).  When looking at a query plan, we will often need to look at the details of the seek operator in the Properties window to see how many operations it is performing, and what type of operation each one is.  As you saw in the first post in this series, the number of hidden seeking operations can have an appreciable impact on performance. Measuring Seeks and Scans I mentioned in my last post that there is no way to tell from a graphical query plan whether you are seeing a singleton lookup or a range scan.  You can work it out – if you happen to know that the index is defined as unique and the seek predicate is an equality comparison, but there’s no separate property that says ‘singleton lookup’ or ‘range scan’.  This is a shame, and if I had my way, the query plan would show different icons for range scans and singleton lookups – perhaps also indicating whether the operation was one or more of those operations underneath the covers. In light of all that, you might be wondering if there is another way to measure how many seeks of either type are occurring in your system, or for a particular query.  As is often the case, the answer is yes – we can use a couple of dynamic management views (DMVs): sys.dm_db_index_usage_stats and sys.dm_db_index_operational_stats. Index Usage Stats The index usage stats DMV contains counts of index operations from the perspective of the Query Executor (QE) – the SQL Server component that is responsible for executing the query plan.  
    It has three columns that are of particular interest to us:

      - user_seeks: the number of times an Index Seek operator appears in an executed plan
      - user_scans: the number of times a Table Scan or Index Scan operator appears in an executed plan
      - user_lookups: the number of times an RID or Key Lookup operator appears in an executed plan

    An operator is counted once per execution (generating an estimated plan does not affect the totals), so an Index Seek that executes 10,000 times in a single plan execution adds 1 to the count of user seeks.  Even less intuitively, an operator is also counted once per execution even if it is not executed at all.  I will show you a demonstration of each of these things later in this post.

    Index Operational Stats

    The index operational stats DMV contains counts of index and table operations from the perspective of the Storage Engine (SE).  It contains a wealth of interesting information, but the two columns of interest to us right now are:

      - range_scan_count: the number of range scans (including unrestricted full scans) on a heap or index structure
      - singleton_lookup_count: the number of singleton lookups in a heap or index structure

    This DMV counts each SE operation, so 10,000 singleton lookups will add 10,000 to the singleton lookup count column, and a table scan that is executed 5 times will add 5 to the range scan count.

    The Test Rig

    To explore the behaviour of seeks and scans in detail, we will need to create a test environment.  The scripts presented here are best run on SQL Server 2008 Developer Edition, but the majority of the tests will work just fine on SQL Server 2005.  A couple of tests use partitioning, but these will be skipped if you are not running an Enterprise-equivalent SKU.
    Ok, first up we need a database:

        USE master;
        GO
        IF DB_ID('ScansAndSeeks') IS NOT NULL
            DROP DATABASE ScansAndSeeks;
        GO
        CREATE DATABASE ScansAndSeeks;
        GO
        USE ScansAndSeeks;
        GO
        ALTER DATABASE ScansAndSeeks SET
            ALLOW_SNAPSHOT_ISOLATION OFF ;
        ALTER DATABASE ScansAndSeeks SET
            AUTO_CLOSE OFF,
            AUTO_SHRINK OFF,
            AUTO_CREATE_STATISTICS OFF,
            AUTO_UPDATE_STATISTICS OFF,
            PARAMETERIZATION SIMPLE,
            READ_COMMITTED_SNAPSHOT OFF,
            RESTRICTED_USER ;

    Notice that several database options are set in particular ways to ensure we get meaningful and reproducible results from the DMVs.  In particular, the options to auto-create and update statistics are disabled.  There are also three stored procedures, the first of which creates a test table (which may or may not be partitioned).  The table is pretty much the same one we used yesterday:

    The table has 100 rows, and both the key_col and data columns contain the same values – the integers from 1 to 100 inclusive.  The table is a heap, with a non-clustered primary key on key_col, and a non-clustered non-unique index on the data column.  The only reason I have used a heap here, rather than a clustered table, is so I can demonstrate a seek on a heap later on.  The table has an extra column (not shown because I am too lazy to update the diagram from yesterday) called padding – a CHAR(100) column that just contains 100 spaces in every row.  It’s just there to discourage SQL Server from choosing table scan over an index + RID lookup in one of the tests.
    The first stored procedure is called ResetTest:

        CREATE PROCEDURE dbo.ResetTest
            @Partitioned BIT = 'false'
        AS
        BEGIN
            SET NOCOUNT ON ;

            IF OBJECT_ID(N'dbo.Example', N'U') IS NOT NULL
            BEGIN
                DROP TABLE dbo.Example;
            END ;

            -- Test table is a heap
            -- Non-clustered primary key on 'key_col'
            CREATE TABLE dbo.Example
            (
                key_col INTEGER NOT NULL,
                data INTEGER NOT NULL,
                padding CHAR(100) NOT NULL DEFAULT SPACE(100),
                CONSTRAINT [PK dbo.Example key_col] PRIMARY KEY NONCLUSTERED (key_col)
            ) ;

            IF @Partitioned = 'true'
            BEGIN
                -- Enterprise, Trial, or Developer
                -- required for partitioning tests
                IF SERVERPROPERTY('EngineEdition') = 3
                BEGIN
                    EXECUTE ('
                        DROP TABLE dbo.Example ;
                        IF EXISTS ( SELECT 1 FROM sys.partition_schemes WHERE name = N''PS'' )
                            DROP PARTITION SCHEME PS ;
                        IF EXISTS ( SELECT 1 FROM sys.partition_functions WHERE name = N''PF'' )
                            DROP PARTITION FUNCTION PF ;
                        CREATE PARTITION FUNCTION PF (INTEGER)
                            AS RANGE RIGHT FOR VALUES (20, 40, 60, 80, 100) ;
                        CREATE PARTITION SCHEME PS
                            AS PARTITION PF ALL TO ([PRIMARY]) ;
                        CREATE TABLE dbo.Example
                        (
                            key_col INTEGER NOT NULL,
                            data INTEGER NOT NULL,
                            padding CHAR(100) NOT NULL DEFAULT SPACE(100),
                            CONSTRAINT [PK dbo.Example key_col] PRIMARY KEY NONCLUSTERED (key_col)
                        ) ON PS (key_col);
                    ');
                END
                ELSE
                BEGIN
                    RAISERROR('Invalid SKU for partition test', 16, 1);
                    RETURN;
                END;
            END ;

            -- Non-unique non-clustered index on the 'data' column
            CREATE NONCLUSTERED INDEX [IX dbo.Example data] ON dbo.Example (data) ;

            -- Add 100 rows
            INSERT dbo.Example WITH (TABLOCKX) ( key_col, data )
            SELECT key_col = V.number, data = V.number
            FROM master.dbo.spt_values AS V
            WHERE V.[type] = N'P'
                AND V.number BETWEEN 1 AND 100 ;
        END;
        GO

    The second stored procedure, ShowStats, displays information from the Index Usage Stats and Index Operational Stats DMVs:

        CREATE PROCEDURE dbo.ShowStats
            @Partitioned BIT = 'false'
        AS
        BEGIN
            -- Index Usage Stats DMV (QE)
            SELECT
                index_name = ISNULL(I.name, I.type_desc),
                scans = IUS.user_scans,
                seeks = IUS.user_seeks,
                lookups = IUS.user_lookups
            FROM sys.dm_db_index_usage_stats AS IUS
            JOIN sys.indexes AS I
                ON I.object_id = IUS.object_id
                AND I.index_id = IUS.index_id
            WHERE IUS.database_id = DB_ID(N'ScansAndSeeks')
                AND IUS.object_id = OBJECT_ID(N'dbo.Example', N'U')
            ORDER BY I.index_id ;

            -- Index Operational Stats DMV (SE)
            IF @Partitioned = 'true'
                SELECT
                    index_name = ISNULL(I.name, I.type_desc),
                    partitions = COUNT(IOS.partition_number),
                    range_scans = SUM(IOS.range_scan_count),
                    single_lookups = SUM(IOS.singleton_lookup_count)
                FROM sys.dm_db_index_operational_stats
                    ( DB_ID(N'ScansAndSeeks'), OBJECT_ID(N'dbo.Example', N'U'), NULL, NULL ) AS IOS
                JOIN sys.indexes AS I
                    ON I.object_id = IOS.object_id
                    AND I.index_id = IOS.index_id
                GROUP BY
                    I.index_id, -- Key
                    I.name,
                    I.type_desc
                ORDER BY I.index_id;
            ELSE
                SELECT
                    index_name = ISNULL(I.name, I.type_desc),
                    range_scans = SUM(IOS.range_scan_count),
                    single_lookups = SUM(IOS.singleton_lookup_count)
                FROM sys.dm_db_index_operational_stats
                    ( DB_ID(N'ScansAndSeeks'), OBJECT_ID(N'dbo.Example', N'U'), NULL, NULL ) AS IOS
                JOIN sys.indexes AS I
                    ON I.object_id = IOS.object_id
                    AND I.index_id = IOS.index_id
                GROUP BY
                    I.index_id, -- Key
                    I.name,
                    I.type_desc
                ORDER BY I.index_id;
        END;

    The final stored procedure, RunTest, executes a query written against the example table:

        CREATE PROCEDURE dbo.RunTest
            @SQL VARCHAR(8000),
            @Partitioned BIT = 'false'
        AS
        BEGIN
            -- No execution plan yet
            SET STATISTICS XML OFF ;

            -- Reset the test environment
            EXECUTE dbo.ResetTest @Partitioned ;

            -- Previous call will throw an error if a partitioned
            -- test was requested, but SKU does not support it
            IF @@ERROR = 0
            BEGIN
                -- IO statistics and plan on
                SET STATISTICS XML, IO ON ;

                -- Test statement
                EXECUTE (@SQL) ;

                -- Plan and IO statistics off
                SET STATISTICS XML, IO OFF ;

                EXECUTE dbo.ShowStats @Partitioned;
            END;
        END;

    The Tests

    The first test is a simple scan of the heap table:

        EXECUTE dbo.RunTest @SQL = 'SELECT * FROM Example';

    The top result set comes from the Index Usage Stats DMV, so it is the Query Executor’s (QE) view.
The lower result is from Index Operational Stats, which shows statistics derived from the actions taken by the Storage Engine (SE).  We see that QE performed 1 scan operation on the heap, and SE performed a single range scan.  Let’s try a single-value equality seek on a unique index next: EXECUTE dbo.RunTest @SQL = 'SELECT key_col FROM Example WHERE key_col = 32'; This time we see a single seek on the non-clustered primary key from QE, and one singleton lookup on the same index by the SE.  Now for a single-value seek on the non-unique non-clustered index: EXECUTE dbo.RunTest @SQL = 'SELECT data FROM Example WHERE data = 32'; QE shows a single seek on the non-clustered non-unique index, but SE shows a single range scan on that index – not the singleton lookup we saw in the previous test.  That makes sense because we know that only a single-value seek into a unique index is a singleton seek.  A single-value seek into a non-unique index might retrieve any number of rows, if you think about it.  The next query is equivalent to the IN list example seen in the first post in this series, but it is written using OR (just for variety, you understand): EXECUTE dbo.RunTest @SQL = 'SELECT data FROM Example WHERE data = 32 OR data = 33'; The plan looks the same, and there’s no difference in the stats recorded by QE, but the SE shows two range scans.  Again, these are range scans because we are looking for two values in the data column, which is covered by a non-unique index.  I’ve added a snippet from the Properties window to show that the query plan does show two seek predicates, not just one.  Now let’s rewrite the query using BETWEEN: EXECUTE dbo.RunTest @SQL = 'SELECT data FROM Example WHERE data BETWEEN 32 AND 33'; Notice the seek operator only has one predicate now – it’s just a single range scan from 32 to 33 in the index – as the SE output shows.  
For the next test, we will look up four values in the key_col column: EXECUTE dbo.RunTest @SQL = 'SELECT key_col FROM Example WHERE key_col IN (2,4,6,8)'; Just a single seek on the PK from the Query Executor, but four singleton lookups reported by the Storage Engine – and four seek predicates in the Properties window.  On to a more complex example: EXECUTE dbo.RunTest @SQL = 'SELECT * FROM Example WITH (INDEX([PK dbo.Example key_col])) WHERE key_col BETWEEN 1 AND 8'; This time we are forcing use of the non-clustered primary key to return eight rows.  The index is not covering for this query, so the query plan includes an RID lookup into the heap to fetch the data and padding columns.  The QE reports a seek on the PK and a lookup on the heap.  The SE reports a single range scan on the PK (to find key_col values between 1 and 8), and eight singleton lookups on the heap.  Remember that a bookmark lookup (RID or Key) is a seek to a single value in a ‘unique index’ – it finds a row in the heap or cluster from a unique RID or clustering key – so that’s why lookups are always singleton lookups, not range scans. Our next example shows what happens when a query plan operator is not executed at all: EXECUTE dbo.RunTest @SQL = 'SELECT key_col FROM Example WHERE key_col = 8 AND @@TRANCOUNT < 0'; The Filter has a start-up predicate which is always false (if your @@TRANCOUNT is less than zero, call CSS immediately).  The index seek is never executed, but QE still records a single seek against the PK because the operator appears once in an executed plan.  The SE output shows no activity at all.  This next example is 2008 and above only, I’m afraid: EXECUTE dbo.RunTest @SQL = 'SELECT * FROM Example WHERE key_col BETWEEN 1 AND 30', @Partitioned = 'true'; This is the first example to use a partitioned table.  QE reports a single seek on the heap (yes – a seek on a heap), and the SE reports two range scans on the heap.  
SQL Server knows (from the partitioning definition) that it only needs to look at partitions 1 and 2 to find all the rows where key_col is between 1 and 30 – the engine seeks to find the two partitions, and performs a range scan seek on each partition. The final example for today is another seek on a heap – try to work out the output of the query before running it! EXECUTE dbo.RunTest @SQL = 'SELECT TOP (2) WITH TIES * FROM Example WHERE key_col BETWEEN 1 AND 50 ORDER BY $PARTITION.PF(key_col) DESC', @Partitioned = 'true'; Notice the lack of an explicit Sort operator in the query plan to enforce the ORDER BY clause, and the backward range scan. © 2011 Paul White email: [email protected] twitter: @SQL_Kiwi
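The counting rules demonstrated by the non-partitioned tests above can be summarised in a small toy model. This Python sketch is only an illustration of the rules as described in this post, not anything SQL Server exposes: the `Counters` class, the `run_seek` function and the predicate tuples are hypothetical names, and partitioned heaps and bookmark lookups are left out.

```python
from dataclasses import dataclass

@dataclass
class Counters:
    user_seeks: int = 0              # Query Executor view (index usage stats)
    range_scan_count: int = 0        # Storage Engine view (operational stats)
    singleton_lookup_count: int = 0  # Storage Engine view (operational stats)

def run_seek(predicates, unique_index, counters, executed=True):
    """Record one Index Seek operator appearing in one executed plan.

    predicates is a list of ("eq", value) or ("range", low, high) tuples.
    """
    # QE counts the operator once per plan execution, even if it never runs.
    counters.user_seeks += 1
    if not executed:
        return counters
    for predicate in predicates:
        # Only an equality seek into a unique index is a singleton lookup;
        # everything else is a range scan as far as the SE is concerned.
        if predicate[0] == "eq" and unique_index:
            counters.singleton_lookup_count += 1
        else:
            counters.range_scan_count += 1
    return counters

# key_col IN (2,4,6,8) on the unique primary key
in_list = run_seek([("eq", 2), ("eq", 4), ("eq", 6), ("eq", 8)], True, Counters())
# data = 32 OR data = 33 on the non-unique index
or_list = run_seek([("eq", 32), ("eq", 33)], False, Counters())
# data BETWEEN 32 AND 33 on the non-unique index
between = run_seek([("range", 32, 33)], False, Counters())
# Seek below a start-up filter that is always false
never_run = run_seek([("eq", 8)], True, Counters(), executed=False)

for result in (in_list, or_list, between, never_run):
    print(result)
```

The model makes the asymmetry easy to see: the QE column moves by one per operator per executed plan, while the SE columns move once per seek predicate that is actually evaluated.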

    Read the article

  • BizTalk Envelopes explained

    - by Robert Kokuti
    Recently I've been trying to get some order into an ESB-BizTalk pub/sub scenario, and decided to wrap the payload into standardized envelopes. I have used envelopes before in a 'light weight' fashion, and I found that they can be quite useful and powerful if used systematically. Here is what I learned.

    The Theory

    In my experience, Envelopes are often underutilised in a BizTalk solution, and quite often their full potential is not well understood. Here I try to simplify the theory behind the Envelopes within BizTalk.

    Envelopes can be used to attach additional data to the ‘real’ data (payload). This additional data can contain all routing and processing information, and allows treating the business data as a ‘black box’, possibly compressed and/or encrypted etc. The point here is that the infrastructure does not need to know anything about the business data content, just as a postman does not need to know the letter within the envelope.

    BizTalk has built-in support for envelopes through the XMLDisassembler and XMLAssembler pipeline components (these are part of the XMLReceive and XMLSend default pipelines). These components, among other things, perform the following:

    XMLDisassembler
      - Extracts the payload from the envelope into the Message Body
      - Copies data from the envelope into the message context, as specified by the property schema(s) associated with the envelope schema.

    Typically, once the envelope is through the XMLDisassembler, the payload is submitted into the Messagebox, and the rest of the envelope data are copied into the context of the submitted message. The XMLDisassembler uses the Property Schemas, referenced by the Envelope Schema, to determine the name of the promoted Message Context element.

    XMLAssembler
      - Wraps the Message Body inside the specified envelope schema
      - Populates the envelope values from the message context, as specified by the property schema(s) associated with the envelope schema.
    Notice that there are no requirements to use the receiving envelope schema when sending. The sent message can be wrapped within any suitable envelope, regardless whether the message was originally received within an envelope or not. However, by sharing Property Schemas between Envelopes, it is possible to pass values from the incoming envelope to the outgoing envelope via the Message Context.

    The Practice

    Creating the Envelope

    Add a new Schema to the BizTalk project:

    Envelopes are defined as schemas, with the <Schema> Envelope property set to Yes, and the root node’s Body XPath property pointing to the node which contains the payload. Typically, you’d create an envelope structure similar to this:

    Click on the <Schema> node and set the Envelope property to Yes. Then, click on the Envelope node, and set the Body XPath property pointing to the ‘Body’ node:

    The ‘Body’ node is a Child Element, and its Data Structure Type is set to xs:anyType.  This allows the Body node to carry any payload data. The XMLReceive pipeline will submit the data found in the Body node, while the XMLSend pipeline will copy the message into the Body node, before sending to the destination.

    Promoting Properties

    Once you defined the envelope, you may want to promote the envelope data (anything other than the Body) as Property Fields, in order to preserve their value in the message context. Anything not promoted will be lost when the XMLDisassembler extracts the payload from the Body. Typically, this means you promote everything in the Header node. Property promotion uses associated Property Schemas. These are special BizTalk schemas which have a flat field structure. Property Schemas define the name of the promoted values in the Message Context, by combining the Property Schema’s Namespace and the individual Field names. It is worth being systematic when it comes to naming your schemas, their namespace and type name.
    A coherent method will make your life easier when it comes to referencing the schemas during development, and managing subscriptions (filters) in BizTalk Administration. I developed a fairly coherent naming convention which I’ll probably share in another article. Because the property schema must be flat, I recommend creating one for each level in the envelope header hierarchy.

    Property schemas are very useful in passing data between incoming and outgoing envelopes. As I mentioned earlier, in/out envelopes do not have to be the same, but you can use the same property schema when you promote the outgoing envelope fields as you used for the incoming schema.  As you can reference many property schemas for field promotion, you can pick data from a variety of sources when you define your outgoing envelope. For example, the outgoing envelope can carry some of the incoming envelope’s promoted values, plus some values from the standard BizTalk message context, like the AdapterReceiveCompleteTime property from the BizTalk message-tracking properties. The values you promote for the outgoing envelope will be automatically populated from the Message Context by the XMLAssembler pipeline component.

    Using the Envelope

    Receiving

    Enveloped messages are automatically recognized by the XMLReceive pipeline, or any other custom pipeline which includes the XMLDisassembler component. The Body Path node will become the Message Body, while the rest of the envelope values will be added to the Message Context, as defined by the Property Schemas referenced by the Envelope Schema.

    Sending

    The Send Port’s filter expression can use the promoted properties from the incoming envelope.
    If you want to enclose the sent message within an envelope, the Send Port XMLAssembler component must be configured with the fully qualified envelope name:

    One way of obtaining the fully qualified envelope name is to copy it from the envelope schema property page. The full envelope schema name is constructed as <Name>, <Assembly>.

    The outgoing envelope is populated by the XMLAssembler pipeline component. The Message Body is copied to the specified envelope’s Body Path node, while the rest of the envelope fields are populated from the Message Context, according to the Property Schemas associated with the Envelope Schema. That’s all for now, happy enveloping!
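The disassemble and assemble steps described above can be pictured with a small stand-alone sketch. This is Python using only the standard library, not BizTalk code: the Envelope/Header/Body element names, the `PROPERTY_NS` namespace, the `namespace#field` naming of context entries and both function names are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical property schema namespace; promoted names are modelled as
# "<namespace>#<field name>".
PROPERTY_NS = "http://example.com/properties"

def disassemble(envelope_xml):
    """Split an envelope into (payload XML, message context dictionary)."""
    root = ET.fromstring(envelope_xml)
    payload = list(root.find("Body"))[0]   # first child of the Body node
    payload.tail = None                    # drop whitespace that followed the payload
    context = {}
    header = root.find("Header")
    if header is not None:
        for field in header:               # promote every flat header field
            context[f"{PROPERTY_NS}#{field.tag}"] = field.text
    return ET.tostring(payload, encoding="unicode"), context

def assemble(payload_xml, context):
    """Wrap a payload in an envelope, filling the header from the context."""
    root = ET.Element("Envelope")
    header = ET.SubElement(root, "Header")
    for name, value in context.items():
        namespace, _, field = name.partition("#")
        if namespace == PROPERTY_NS:       # only properties this envelope knows
            ET.SubElement(header, field).text = value
    ET.SubElement(root, "Body").append(ET.fromstring(payload_xml))
    return ET.tostring(root, encoding="unicode")

incoming = (
    "<Envelope><Header><MessageType>Order</MessageType><Sender>Shop1</Sender></Header>"
    "<Body><Order><Id>42</Id></Order></Body></Envelope>"
)
payload, context = disassemble(incoming)
outgoing = assemble(payload, context)
print(payload)
print(context)
print(outgoing)
```

The point of the sketch is the hand-over: the payload travels on its own between the two steps, and the header values survive only because they were copied into the context under names that the outgoing envelope also recognises.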

    Read the article

  • Developing Schema Compare for Oracle (Part 5): Query Snapshots

    - by Simon Cooper
    If you've emailed us about a bug you've encountered with the EAP or beta versions of Schema Compare for Oracle, we probably asked you to send us a query snapshot of your databases. Here, I explain what a query snapshot is, and how it helps us fix your bug.

    Problem 1: Debugging users' bug reports

    When we started the Schema Compare project, we knew we were going to get problems with users' databases: configurations we hadn't considered, features that weren't installed, Unicode issues, weird dependencies... With SQL Compare, users are generally happy to send us a database backup that we can restore using a single RESTORE DATABASE command on our test servers and immediately reproduce the problem. Oracle, on the other hand, would be a lot more tricky. As Oracle generally has a 1-to-1 mapping between instances and databases, any databases users sent would have to be restored to their own instance. Furthermore, the number of steps required to get a properly working database, and the size of most Oracle databases, made it infeasible to ask every customer who came across a bug during our beta program to send us their databases. We also knew that there would be lots of issues with data security that would make it hard to get backups. So we needed an easier way to be able to debug customers' issues and sort out what strange schema data Oracle was returning.

    Problem 2: Test execution time

    Another issue we knew we would have to solve was the execution time of the tests we would produce for the Schema Compare engine. Our initial prototype showed that querying the data dictionary for schema information was going to be slow (at least 15 seconds per database), and this is generally proportional to the size of the database. If you're running thousands of tests on the same databases, each one registering separate schemas, not only would the tests take hours and hours to run, but the test servers would be hammered senseless.
    The solution

    To solve these, we needed to be able to populate the schema of a database without actually connecting to it. Well, the IDataReader interface is the primary way we read data from an Oracle server. The data dictionary queries we use return their data in terms of simple strings and numbers, which we then process and reconstruct into an object model, and the results of these queries are identical for identical schemas. So, we can record the raw results of the queries once, and then replay these results to construct the same object model as many times as required without needing to actually connect to the original database.

    This is what query snapshots do. They are binary files containing the raw unprocessed data we get back from the Oracle server for all the queries we run on the data dictionary to get schema information. The core of the query snapshot generation takes the results of the IDataReader we get from running queries on Oracle, and passes the row data to a BinaryWriter that writes it straight to a file. The query snapshot can then be replayed to create the same object model; when the results of a specific query are needed by the population code, we can simply read the binary data stored in the file on disk and present it through an IDataReader wrapper. This is far faster than querying the server over the network, and allows us to run tests in a reasonable time.

    They also allow us to easily debug a customer's problem; using a simple snapshot generation program, users can generate a query snapshot that could be sent along with a bug report that we can immediately replay on our machines to let us debug the issue, rather than having to obtain database backups and restore databases to test systems. There are also far fewer problems with data security; query snapshots only contain schema information, which is generally less sensitive than table data.
    Query snapshots implementation

    However, actually implementing such a feature did have a couple of 'gotchas' to it. My second blog post detailed the development of the dependencies algorithm we use to ensure we get all the dependencies in the database, and that algorithm uses data from both databases to find all the needed objects: which database you're comparing to affects which objects get populated from both databases. We get information on these additional objects using an appropriate WHERE clause on all the population queries. So, in order to accurately replay the results of querying the live database, the query snapshot needs to be a snapshot of a comparison of two databases, not just of populating a single database.

    Furthermore, although the code population queries (e.g. querying all_tab_cols to get column information) can simply be passed straight from the IDataReader to the BinaryWriter, we need to hook into and run the live dependencies algorithm while we're creating the snapshot to ensure we get the same WHERE clauses, and the same query results, as if we were populating straight from a live system. We also need to store the results of the dependencies queries themselves, as the resulting dependency graph is stored within the OracleDatabase object that is produced, and is later used to help order actions in synchronization scripts. This is significantly helped by the dependencies algorithm being deterministic: given the same input, it will always return the same output. Therefore, when we're replaying a query snapshot, and processing dependency information, we simply have to return the results of the queries in the order we got them from the live database, rather than trying to calculate the contents of all_dependencies on the fly.

    Query snapshots are a significant feature in Schema Compare that really helps us to debug problems with the tool, as well as making our testers happier.
Although not really user-visible, they are very useful to the development team to help us fix bugs in the product much faster than we otherwise would be able to.
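The record-then-replay idea can be sketched in a few lines. This is a Python illustration rather than the product's .NET code: `RecordingSource`, `ReplaySource` and `fake_oracle` are hypothetical names, and `pickle` stands in for the BinaryWriter-based file format, whose real layout is not described here.

```python
import io
import pickle

class RecordingSource:
    """Runs queries against a live source and records the raw rows in call order."""
    def __init__(self, run_query, stream):
        self._run_query = run_query
        self._stream = stream

    def query(self, sql):
        rows = list(self._run_query(sql))
        pickle.dump((sql, rows), self._stream)   # append one record per query
        return rows

class ReplaySource:
    """Returns recorded rows in the original order without touching the live source."""
    def __init__(self, stream):
        self._stream = stream

    def query(self, sql):
        recorded_sql, rows = pickle.load(self._stream)
        if recorded_sql != sql:                  # replay relies on a deterministic query order
            raise ValueError(f"expected {recorded_sql!r}, got {sql!r}")
        return rows

# Stand-in for a live database: maps query text to rows of strings and numbers.
FAKE_RESULTS = {
    "SELECT table_name FROM all_tables": [("EMP",), ("DEPT",)],
    "SELECT column_name, column_id FROM all_tab_cols": [("EMPNO", 1), ("ENAME", 2)],
}

def fake_oracle(sql):
    return FAKE_RESULTS[sql]

snapshot = io.BytesIO()                          # a file on disk in practice
live = RecordingSource(fake_oracle, snapshot)
live_rows = [live.query(sql) for sql in FAKE_RESULTS]

snapshot.seek(0)
replay = ReplaySource(snapshot)
replayed_rows = [replay.query(sql) for sql in FAKE_RESULTS]
print(replayed_rows == live_rows)
```

Because the population code sees the same rows in the same order from either source, it cannot tell a replay from a live connection, which is what makes the snapshot usable both for fast tests and for reproducing a customer's schema.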

    Read the article

  • Why would a WebService return nulls when the actual service returns data?

    - by Jerry
    I have a webservice (out of my control) that I have to talk to. I also have a packet-sniffer on the line, and (SURPRISE!!!) the developers of the webservice aren't lying. They are actually sending back all of the data that I requested. But the web-service code that is auto-generated from the WSDL file is giving me "null" as a value. I used their WSDL file to generate my Web Reference. I checked my data types with the datatypes that the WSDL file has declared. And I used the code as listed below to perform the calls:

        DT_MaterialMaster_LookupRequest req = new DT_MaterialMaster_LookupRequest();
        req.MaterialNumber = "101*";
        req.DocumentNo = "";
        req.Description = "Pipe*";
        req.Plant = "0000";
        MI_MaterialMaster_Lookup_OBService srv = new MI_MaterialMaster_Lookup_OBService();
        DT_MaterialMaster_Response resp = srv.MI_MaterialMaster_Lookup_OB(new DT_MaterialMaster_LookupRequest[] { req });
        // Note that the response here is ALWAYS null!!
        Console.WriteLine(resp.Status);

    The resp object is an actual object. It was generated properly. However, the Status and MaterialData fields are always null.
When I call the web service, I've placed a packet-sniffer on the line, and I can see that I've sent the following (linebreaks and indentions for my own sanity): <?xml version="1.0" encoding="utf-8"?> <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <soap:Body> <MT_MaterialMaster_Lookup xmlns="http://MyCompany.com/SomeCompany/mm/MaterialMasterSearch"> <Request xmlns=""> <MaterialNumber>101*</MaterialNumber> <Description>Pipe*</Description> <DocumentNo /> <Plant>0000</Plant> </Request> </MT_MaterialMaster_Lookup> </soap:Body> </soap:Envelope> The response that they send back SEEMS to be a valid response (linebreaks and indentions for my own sanity): <SOAP:Envelope xmlns:SOAP='http://schemas.xmlsoap.org/soap/envelope/'> <SOAP:Header /> <SOAP:Body> <n0:MT_MaterialMaster_Response xmlns:n0='http://MyCompany.com/SomeCompany/mm/MaterialMasterSearch' xmlns:prx='urn:SomeCompany.com:proxy:BRD:/1SAI/TAS4FE14A2DE960D61219AE:701:2009/02/10'> <Response> <Status>No Rows Found</Status> <MaterialData /> </Response> </n0:MT_MaterialMaster_Response> </SOAP:Body> </SOAP:Envelope> The status shows that it actually received data... but the resp.Status and resp.MaterialData fields are always null. What have I done wrong? 
UPDATE: The WSDL file is defined as: <?xml version="1.0" encoding="utf-8"?> <wsdl:definitions xmlns:p1="http://MyCompany.com/SomeCompany/mm/MaterialMasterSearch" name="MI_MaterialMaster_Lookup_AutoCAD_OB" targetNamespace="http://MyCompany.com/SomeCompany/mm/MaterialMasterSearch" xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"> <wsdl:types> <xsd:schema xmlns="http://MyCompany.com/SomeCompany/mm/MaterialMasterSearch" targetNamespace="http://MyCompany.com/SomeCompany/mm/MaterialMasterSearch" xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <xsd:element name="MT_MaterialMaster_Response" type="p1:DT_MaterialMaster_Response" /> <xsd:element name="MT_MaterialMaster_Lookup" type="p1:DT_MaterialMaster_Lookup" /> <xsd:complexType name="DT_MaterialMaster_Response"> <xsd:sequence> <xsd:element name="Status" type="xsd:string"> <xsd:annotation> <xsd:appinfo source="http://SomeCompany.com/xi/TextID">d48d03b040af11df99e300145eccb24e</xsd:appinfo> </xsd:annotation> </xsd:element> <xsd:element maxOccurs="unbounded" name="MaterialData"> <xsd:annotation> <xsd:appinfo source="http://SomeCompany.com/xi/TextID">64908aa040a511df843700145eccb24e</xsd:appinfo> </xsd:annotation> <xsd:complexType> <xsd:sequence> <xsd:element name="MaterialNumber" type="xsd:string"> <xsd:annotation> <xsd:appinfo source="http://SomeCompany.com/xi/TextID">64908aa140a511df848500145eccb24e</xsd:appinfo> </xsd:annotation> </xsd:element> <xsd:element minOccurs="0" name="Description" type="xsd:string"> <xsd:annotation> <xsd:appinfo source="http://SomeCompany.com/xi/TextID">64908aa240a511df95bf00145eccb24e</xsd:appinfo> </xsd:annotation> </xsd:element> <xsd:element minOccurs="0" name="DocumentNo" type="xsd:string"> <xsd:annotation> <xsd:appinfo source="http://SomeCompany.com/xi/TextID">64908aa340a511dfb23700145eccb24e</xsd:appinfo> </xsd:annotation> </xsd:element> <xsd:element minOccurs="0" name="UOM" type="xsd:string"> <xsd:annotation> <xsd:appinfo 
source="http://SomeCompany.com/xi/TextID">3b5f14c040a611df9fbe00145eccb24e</xsd:appinfo> </xsd:annotation> </xsd:element> <xsd:element minOccurs="0" name="Hierarchy" type="xsd:string"> <xsd:annotation> <xsd:appinfo source="http://SomeCompany.com/xi/TextID">64908aa440a511dfc65b00145eccb24e</xsd:appinfo> </xsd:annotation> </xsd:element> <xsd:element minOccurs="0" name="Plant" type="xsd:string"> <xsd:annotation> <xsd:appinfo source="http://SomeCompany.com/xi/TextID">d48d03b140af11dfb78e00145eccb24e</xsd:appinfo> </xsd:annotation> </xsd:element> <xsd:element minOccurs="0" name="Procurement" type="xsd:string"> <xsd:annotation> <xsd:appinfo source="http://SomeCompany.com/xi/TextID">d48d03b240af11dfb87b00145eccb24e</xsd:appinfo> </xsd:annotation> </xsd:element> </xsd:sequence> </xsd:complexType> </xsd:element> </xsd:sequence> </xsd:complexType> <xsd:complexType name="DT_MaterialMaster_Lookup"> <xsd:sequence> <xsd:element maxOccurs="unbounded" name="Request"> <xsd:annotation> <xsd:appinfo source="http://SomeCompany.com/xi/TextID">64908aa040a511df843700145eccb24e</xsd:appinfo> </xsd:annotation> <xsd:complexType> <xsd:sequence> <xsd:element minOccurs="0" name="MaterialNumber" type="xsd:string"> <xsd:annotation> <xsd:appinfo source="http://SomeCompany.com/xi/TextID">64908aa140a511df848500145eccb24e</xsd:appinfo> </xsd:annotation> </xsd:element> <xsd:element minOccurs="0" name="Description" type="xsd:string"> <xsd:annotation> <xsd:appinfo source="http://SomeCompany.com/xi/TextID">64908aa240a511df95bf00145eccb24e</xsd:appinfo> </xsd:annotation> </xsd:element> <xsd:element minOccurs="0" name="DocumentNo" type="xsd:string"> <xsd:annotation> <xsd:appinfo source="http://SomeCompany.com/xi/TextID">64908aa340a511dfb23700145eccb24e</xsd:appinfo> </xsd:annotation> </xsd:element> <xsd:element minOccurs="0" name="Plant" type="xsd:string"> <xsd:annotation> <xsd:appinfo source="http://SomeCompany.com/xi/TextID">64908aa440a511dfc65b00145eccb24e</xsd:appinfo> </xsd:annotation> </xsd:element> 
</xsd:sequence> </xsd:complexType> </xsd:element> </xsd:sequence> </xsd:complexType> </xsd:schema> </wsdl:types> <wsdl:message name="MT_MaterialMaster_Lookup"> <wsdl:part name="MT_MaterialMaster_Lookup" element="p1:MT_MaterialMaster_Lookup" /> </wsdl:message> <wsdl:message name="MT_MaterialMaster_Response"> <wsdl:part name="MT_MaterialMaster_Response" element="p1:MT_MaterialMaster_Response" /> </wsdl:message> <wsdl:portType name="MI_MaterialMaster_Lookup_AutoCAD_OB"> <wsdl:operation name="MI_MaterialMaster_Lookup_AutoCAD_OB"> <wsdl:input message="p1:MT_MaterialMaster_Lookup" /> <wsdl:output message="p1:MT_MaterialMaster_Response" /> </wsdl:operation> </wsdl:portType> <wsdl:binding name="MI_MaterialMaster_Lookup_AutoCAD_OBBinding" type="p1:MI_MaterialMaster_Lookup_AutoCAD_OB"> <binding transport="http://schemas.xmlsoap.org/soap/http" xmlns="http://schemas.xmlsoap.org/wsdl/soap/" /> <wsdl:operation name="MI_MaterialMaster_Lookup_AutoCAD_OB"> <operation soapAction="http://SomeCompany.com/xi/WebService/soap1.1" xmlns="http://schemas.xmlsoap.org/wsdl/soap/" /> <wsdl:input> <body use="literal" xmlns="http://schemas.xmlsoap.org/wsdl/soap/" /> </wsdl:input> <wsdl:output> <body use="literal" xmlns="http://schemas.xmlsoap.org/wsdl/soap/" /> </wsdl:output> </wsdl:operation> </wsdl:binding> <wsdl:service name="MI_MaterialMaster_Lookup_AutoCAD_OBService"> <wsdl:port name="MI_MaterialMaster_Lookup_AutoCAD_OBPort" binding="p1:MI_MaterialMaster_Lookup_AutoCAD_OBBinding"> <address location="http://bxdwas.MyCompany.com/XISOAPAdapter/MessageServlet?channel=:AutoCAD:SOAP_SND_Material_Lookup" xmlns="http://schemas.xmlsoap.org/wsdl/soap/" /> </wsdl:port> </wsdl:service> </wsdl:definitions>
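
    Comparing the captured response with the WSDL above suggests one possible cause, offered here as a hypothesis rather than a confirmed diagnosis: the schema declares Status and MaterialData as direct children of MT_MaterialMaster_Response, while the response on the wire wraps them in an extra Response element. A deserializer built from the WSDL would then look for Status in a place where it does not exist. The Python sketch below only demonstrates that mismatch using the response text from the question; it is not the .NET proxy code.

```python
import xml.etree.ElementTree as ET

# Response body as captured by the packet sniffer in the question.
RESPONSE = """<SOAP:Envelope xmlns:SOAP='http://schemas.xmlsoap.org/soap/envelope/'>
  <SOAP:Header />
  <SOAP:Body>
    <n0:MT_MaterialMaster_Response
        xmlns:n0='http://MyCompany.com/SomeCompany/mm/MaterialMasterSearch'
        xmlns:prx='urn:SomeCompany.com:proxy:BRD:/1SAI/TAS4FE14A2DE960D61219AE:701:2009/02/10'>
      <Response>
        <Status>No Rows Found</Status>
        <MaterialData />
      </Response>
    </n0:MT_MaterialMaster_Response>
  </SOAP:Body>
</SOAP:Envelope>"""

NS = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "n0": "http://MyCompany.com/SomeCompany/mm/MaterialMasterSearch",
}

root = ET.fromstring(RESPONSE)
message = root.find("soap:Body/n0:MT_MaterialMaster_Response", NS)

# Where the WSDL says Status lives: directly under the message element.
status_per_wsdl = message.find("Status")

# Where Status actually is on the wire: inside the extra <Response> wrapper.
status_on_wire = message.find("Response/Status")

print(status_per_wsdl)      # None: nothing at the location the schema describes
print(status_on_wire.text)  # No Rows Found
```

    If this reading is right, the fix would have to come from making the WSDL and the actual response agree (either side), which is worth raising with the service's developers.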

  • directory with 980MB meta data, millions of files, how to delete it? (ext3)

    - by Alexandre
    Hello, So I'm stuck with this directory: drwxrwxrwx 2 dan users 980M 2010-12-22 18:38 sessions2 The directory's contents are small - just millions of tiny little files. I want to wipe it from the filesystem but have been unable to. My first try was: find sessions2 -type f -delete and find sessions2 -type f -print0 | xargs -0 rm -f but I had to stop because both caused escalating memory usage. At one point it was using 65% of the system's memory. So I thought (no doubt incorrectly) that it had to do with the fact that dir_index was enabled on the system. Perhaps find was trying to read the entire index into memory? So I did this (foolishly): tune2fs -O^dir_index /dev/xxx Alright, so that should do it. I ran the find command above again and... same thing. Crazy memory usage. I hurriedly ran tune2fs -Odir_index /dev/xxx to re-enable dir_index, and ran to Server Fault! Two questions: 1) How do I get rid of this directory on my live system? I don't care how long it takes, as long as it uses little memory and little CPU. By the way, using nice find ... I was able to reduce CPU usage, so my problem right now is only memory usage. 2) I disabled dir_index for about 20 minutes. No doubt new files were written to the filesystem in the meantime. I re-enabled dir_index. Does that mean the system will not find the files that were written before dir_index was re-enabled, since their filenames will be missing from the old indexes? If so, and I know these new files aren't important, can I keep the old indexes? If not, how do I rebuild the indexes? Can it be done on a live system? Thanks!
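
    One approach that may be worth trying for question 1 is to walk the directory as a stream and unlink each file as it is seen, so that the full listing is never held in memory at once. The Python sketch below does this with os.scandir, which yields entries lazily. The function name is made up for illustration, and whether this actually stays within acceptable memory on this particular ext3 system is an untested assumption.

```python
import os


def unlink_files_streaming(directory):
    """Delete regular files in `directory` one at a time; return how many were removed.

    os.scandir yields entries lazily, so the script itself does not build
    a list of every file name before it starts deleting.
    """
    removed = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            # follow_symlinks=False: only plain files, never symlink targets
            if entry.is_file(follow_symlinks=False):
                try:
                    os.unlink(entry.path)
                    removed += 1
                except FileNotFoundError:
                    # Another process removed it first; nothing left to do.
                    pass
    return removed
```

    Called as unlink_files_streaming("sessions2"), it leaves subdirectories untouched; the now-empty directory can be removed afterwards with rmdir. Running it under nice, as with the find command in the question, should keep CPU impact low.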

  • mysql UDF : fopen = permission denied

    - by lindenb
    Hi All, this is a question I already asked on SO, but I wonder if this could be a SysAdmin problem. I'm trying to create a MySQL UDF; this function calls "fopen/fclose" to read a flat file stored in /data. But using errno (yes, I know it is bad in a MT program...) I can see that the function cannot open my file: "Permission denied". I tried a chmod -R 755 /data (as well as 777, chown -R mysql:mysql /data etc...) but it didn't change anything. When I copied the flat file to /tmp, it worked: my UDF was able to 'fopen' the file. I'm puzzled. Currently, I've got: drwxrwxrwx 4 pierre root 4096 2010-05-26 16:51 /data drwxrwxrwx 3 pierre root 4096 2010-05-18 09:41 /data/dir1 drwxrwxrwx 3 pierre root 4096 2010-05-18 09:41 /data/dir1/dir2 drwxrwxrwx 4 pierre root 4096 2010-05-18 10:27 /data/dir1/dir2/dir3 -rw-r--r-- 1 pierre root 50685268 2005-12-10 00:01 /data/dir1/dir2/dir3/myfile.txt Any ideas?

  • What do you use to store all of your personal data?

    - by codeflunky
    I have been on a quest for years to find the perfect tool to store all "my stuff". You know... personal information, code snippets, software keys, people's birthdays, whatever. There are lots of tools out there for this sort of thing, but I've never found any of them quite what I need. Ideally, I would just be able to type some notes, tag them (I don't like the idea of folder organization... too cumbersome) and then easily search and retrieve what I need later. It seems so simple, but for some reason I just can't find it. I currently use Backpack (sometimes), which is OK, but I hate the fact that you always have to create "pages" to store things. I don't want to have to do that. I want to just type some notes, tag it and save. That's it. And Backpack didn't even have search for a long time. What I do like about Backpack is that it's fast and it's web based. I've tried some desktop apps, which probably came closer to the functionality I want, but I just hate being tied to a single machine. I want to be able to get to my stuff anywhere, so the web based thing is a definite requirement. Anyway, I'm thinking about writing my own thing for this if I can't find anything, but before I make the attempt, I was wondering if anyone has any suggestions? I've used Backpack, Zoho Planner, Stikkit and Google Notes so far, and they are not quite to my liking. Anyone? (Sorry if this is off-topic, but I figured you guys might be legitimately into this kind of thing... you know, storing code snippets and such.) UPDATE: I've been using Evernote for a few days, and it is exactly what I've been looking for. It is totally tag based and allows both online and offline usage. The desktop app sits in your system tray and allows you to add whatever you want on the fly either as text notes or clippings from the browser. It also syncs it to the web (if you want) where you can get to it from anywhere using their web client. 
They even have a mobile client which I haven't used, but I will try it soon. Thanks again 18hrs. I wish I could give you 10 upvotes.

  • Why Is ModSecurity Unable to Access the Data Directory?

    - by tommytwoeyes
    Update I think we've solved this; the problem appears to have been a result of the /modsec_storage directory having an incorrect value for its SELinux context type. However, we're still not sure, because although after I changed the SELinux context type value, Apache was able to create files in that directory for the global and ip collections (global.dir/global.pag and ip.dir/ip.pag), the new files still have zero bytes. I'm new to ModSecurity and am not sure if the files are empty because something is wrong with the configuration or if ModSecurity has simply determined it doesn't need to store IP addresses persistently after each transaction ends. Anyone able to offer guidance here? I've recently installed ModSecurity (v2.5.12 / CRS v2.0.8) on our production server, and everything works great, except for these errors that it keeps writing to the Apache error log: Failed to access DBM file "/modsec_storage/global": Permission denied [hostname "www.internationalstudent.com"] [uri "/includes/soc_bookmarks/images/delicious.png"] [unique_id "LZ6jc38AAAEAAFO6408AAABO"] Failed to access DBM file "/modsec_storage/ip": Permission denied [hostname "www.internationalstudent.com"] [uri "/includes/soc_bookmarks/images/delicious.png"] [unique_id "LZ6jc38AAAEAAFO6408AAABO"] After following the instructions for file permission settings in the ModSecurity handbook by Ivan Ristic, with no success, I created a /modsec_storage directory, set the owner & group to apache, and set the permissions for the directory recursively to 777. However, ModSecurity is still reporting the same permission errors, so I am stumped. Can anyone tell me how to fix this?

  • Create mirror software raid with bad blocks hdd. How to check data integrity?

    - by rumburak
    There is an error in the System event log like this one: "The device, \Device\Harddisk1\DR1, has a bad block." Because of this, I created a RAID 1 mirror from this disk and another one. I'm using Windows Server 2008 R2 software RAID volumes. The volume in Disk Manager is marked as "Failed Redundancy" and "At Risk". I can use "Reactivate Disk" and it starts to re-sync, but after a while it stops and returns to the previous state. The re-sync stops at the bad block on the old disk and logs the same error in the System event log. The old disk's status is Errors; the new disk's status is Online. How can I check that the new disk holds an exact copy of the old one? It is a server machine, so I would prefer to keep it running during this check.

  • Macro keeps crashing need to speed it up or rewrite it, excel vba 50,000 lines of data

    - by Joel
    Trying to speed up a macro that runs over 50,000 lines ! I have two ways of performing the same vba macro Sub deleteCommonValue() Dim aRow, bRow As Long Dim colB_MoreFirst, colB_LessFirst, colB_Second, colC_MoreFirst, colC_LessFirst, colC_Second As Integer Dim colD_First, colD_Second As Integer Application.ScreenUpdating = False Application.DisplayStatusBar = False Application.Calculation = xlCalculationManual Application.EnableEvents = False aRow = 2 bRow = 3 colB_MoreFirst = Range("B" & aRow).Value + 0.05 colB_LessFirst = Range("B" & aRow).Value - 0.05 colB_Second = Range("B" & bRow).Value colC_MoreFirst = Range("C" & aRow).Value + 0.05 colC_LessFirst = Range("C" & aRow).Value - 0.05 colC_Second = Range("C" & bRow).Value colD_First = Range("D" & aRow).Value colD_Second = Range("D" & bRow).Value Do If colB_Second <= colB_MoreFirst And colB_Second >= colB_LessFirst Then If colC_Second <= colC_MoreFirst And colC_Second >= colC_LessFirst Then If colD_Second = colD_First Or colD_Second > colD_First Then Range(bRow & ":" & bRow).Delete 'bRow delete, assign new value to bRow colB_Second = Range("B" & bRow).Value colC_Second = Range("C" & bRow).Value colD_Second = Range("D" & bRow).Value '----------------------------------------------------- Else Range(aRow & ":" & aRow).Delete bRow = aRow + 1 'aRow value deleted, assign new value to aRow and bRow colB_MoreFirst = Range("B" & aRow).Value + 0.05 colB_LessFirst = Range("B" & aRow).Value - 0.05 colB_Second = Range("B" & bRow).Value colC_MoreFirst = Range("C" & aRow).Value + 0.05 colC_LessFirst = Range("C" & aRow).Value - 0.05 colC_Second = Range("C" & bRow).Value colD_First = Range("D" & aRow).Value colD_Second = Range("D" & bRow).Value '----------------------------------------------------- End If Else bRow = bRow + 1 'Assign new value to bRow colB_Second = Range("B" & bRow).Value colC_Second = Range("C" & bRow).Value colD_Second = Range("D" & bRow).Value '----------------------------------------------------- End If 
Else bRow = bRow + 1 'Assign new value to bRow colB_Second = Range("B" & bRow).Value colC_Second = Range("C" & bRow).Value colD_Second = Range("D" & bRow).Value '----------------------------------------------------- End If If IsEmpty(Range("D" & bRow).Value) = True Then aRow = aRow + 1 bRow = aRow + 1 'finish compare aRow, assign new value to aRow and bRow colB_MoreFirst = Range("B" & aRow).Value + 0.05 colB_LessFirst = Range("B" & aRow).Value - 0.05 colB_Second = Range("B" & bRow).Value colC_MoreFirst = Range("C" & aRow).Value + 0.05 colC_LessFirst = Range("C" & aRow).Value - 0.05 colC_Second = Range("C" & bRow).Value colD_First = Range("D" & aRow).Value colD_Second = Range("D" & bRow).Value '----------------------------------------------------- End If Loop Until IsEmpty(Range("D" & aRow).Value) = True 'Restore the application settings that were switched off at the start Application.ScreenUpdating = True Application.DisplayStatusBar = True Application.Calculation = xlCalculationAutomatic Application.EnableEvents = True End Sub or Sub deleteCommonValue() Dim aRow, bRow As Long Application.ScreenUpdating = False aRow = 2 bRow = 3 Do If Range("B" & bRow).Value <= (Range("B" & aRow).Value + 0.05) _ And Range("B" & bRow).Value >= (Range("B" & aRow).Value - 0.05) Then If Range("C" & bRow).Value <= (Range("C" & aRow).Value + 0.05) _ And Range("C" & bRow).Value >= (Range("C" & aRow).Value - 0.05) Then If Range("D" & bRow).Value = (Range("D" & aRow).Value) _ Or Range("D" & bRow).Value > (Range("D" & aRow).Value) Then Range(bRow & ":" & bRow).Delete Else Range(aRow & ":" & aRow).Delete bRow = aRow + 1 Range("A" & aRow).Select End If Else bRow = bRow + 1 Range("A" & bRow).Select End If Else bRow = bRow + 1 Range("A" & bRow).Select End If If IsEmpty(Range("D" & bRow).Value) = True Then aRow = aRow + 1 bRow = aRow + 1 End If Loop Until IsEmpty(Range("D" & aRow).Value) = True End Sub I don't know if my best option will be to split the rows into multiple sheets?
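
    Much of the cost in both versions probably comes from touching the worksheet on every comparison and deleting rows one at a time. A common alternative is to read the data once, do all the comparisons in memory, and write the surviving rows back in one go. The sketch below shows the same comparison rules on an in-memory list. It is written in Python purely to illustrate the logic (the function name and the (B, C, D) tuple layout are assumptions); in VBA the equivalent would be loading the range into a Variant array first.

```python
def dedupe_rows(rows, tol=0.05):
    """Apply the macro's rules to a list of (b, c, d) tuples and return the survivors.

    Two rows are "common" when their B and C values are each within `tol`.
    Of such a pair, the later row is dropped when its D value is greater than
    or equal to the earlier row's; otherwise the earlier row is dropped.
    """
    kept = list(rows)
    a = 0
    while a < len(kept):
        b = a + 1
        while b < len(kept):
            row_a, row_b = kept[a], kept[b]
            close = (abs(row_b[0] - row_a[0]) <= tol
                     and abs(row_b[1] - row_a[1]) <= tol)
            if not close:
                b += 1              # not a match, move on to the next row
            elif row_b[2] >= row_a[2]:
                del kept[b]         # drop the later row; b now points at its successor
            else:
                del kept[a]         # drop the earlier row; restart against the new row a
                b = a + 1
        a += 1
    return kept


rows = [(1.00, 2.00, 5), (1.02, 2.01, 7), (3.0, 4.0, 1), (1.03, 1.99, 3)]
print(dedupe_rows(rows))  # [(3.0, 4.0, 1), (1.03, 1.99, 3)]
```

    The pairwise comparison is still quadratic in the worst case, so this does not change the algorithm, only where the work happens. Splitting the rows across multiple sheets would not reduce the number of comparisons either.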

  • Can I cancel a resize operation in GParted without causing data loss?

    - by Anderson Green
    I'm currently waiting for GParted to finish resizing a partition, but the progress bar is currently at 0, and it's been taking much longer than usual (perhaps an hour). Is it safe to cancel the resize operation? I don't want to wait days for the resize operation to complete, but I don't want to lose all of my files either. (Is there any way that I can simply pause the resize operation, attempt to recover files, and then resume the resize operation?) (An update: the operation has finally completed, and my files are still intact!)
