Search Results

Search found 22170 results on 887 pages for 'multiple schema'.

Page 542/887 | < Previous Page | 538 539 540 541 542 543 544 545 546 547 548 549  | Next Page >

  • RC of Entity Framework 4.1 (which includes EF Code First)

    - by ScottGu
    Last week the data team shipped the Release Candidate of Entity Framework 4.1.  You can learn more about it and download it here. EF 4.1 includes the new “EF Code First” option that I’ve blogged about several times in the past.  EF Code First provides a really elegant and clean way to work with data, and enables you to do so without requiring a designer or XML mapping file.  Below are links to some tutorials I’ve written in the past about it: Code First Development with Entity Framework 4.x EF Code First: Custom Database Schema Mapping Using EF Code First with an Existing Database The above tutorials were written against the CTP4 release of EF Code First (and so some APIs might be a little different) – but the concepts and scenarios outlined in them are the same as with the RC. Go Live License Last week’s EF 4.1 RC ships with a “go live” license that enables you to use it in production environments.  The final release of EF 4.1 will ship within the next 4 weeks and will be 100% API compatible with the RC release. Improvements with the RC The RC includes several improvements and enhancements.  The EF team has a good blog post summarizing the RC changes.  Scott Hanselman also has a nice video interview with the data team that talks more about the release. One of my favorite improvements introduced with last week’s RC is its support for medium trust security.  This enables you to use EF 4.1 (and code-first) within low-cost ASP.NET shared hosting web environments – without requiring a hoster to install anything to use it. EF 4.1 also now supports validation with not only code-first scenarios, but also model-first and database-first workflows.  Upgrading from previous releases The RC does include a few API tweaks and changes from the prior CTP builds.  Read the release notes that come with the release to get a more detailed listing of the changes. John Papa also has an excellent Upgrading to EF 4.1 RC blog post that describes the steps he took when upgrading a large project he wrote with the previous CTP5 release.  The work to upgrade is pretty straight forward and easy – use his write-up as a guide on how to quickly update projects of your own. NuGet Package Rename One of the changes that the data team made between the CTP5 and RC releases was to rename the NuGet package name from “EFCodeFirst” to “EntityFramework”. They decided to make this change since the EF 4.1 release now includes several additions above and beyond just code first. If you already have installed the “EFCodeFirst” NuGet package, you’ll want to uninstall it and then install the new “EntityFramework” NuGet package.  John Papa’s blog post details the exact steps on how to do this (it only takes ~20 seconds to do this). More EF Tutorials Julie Lerman has created some nice whitepapers and tutorials for MSDN that show using the new EF4 and EF 4.1 feature set. Click here to find links to read and watch them. Summary I’m really excited about the EF 4.1 release that will be shipping next month.  It significantly improves the Entity Framework, and makes it even easier and cleaner to work with data inside of .NET.  You can take advantage of it within all ASP.NET projects (including both Web Forms and MVC), within client projects using Windows Forms and WPF, and within other project types like WCF, Console and Services.  You can use NuGet to easily install it within all of them. Hope this helps, Scott P.S. I am also now using Twitter for quick updates and to share links. Follow me at: twitter.com/scottgu

    Read the article

  • Sweet and Sour Source Control

    - by Tony Davis
    Most database developers don't use Source Control. A recent anonymous poll on SQL Server Central asked its readers "Which Version Control system do you currently use to store you database scripts?" The winner, with almost 30% of the vote was...none: "We don't use source control for database scripts". In second place with almost 28% of the vote was Microsoft's VSS. VSS? Given its reputation for being buggy, unstable and lacking most of the basic features required of a proper source control system, answering VSS is really just another way of saying "I don't use Source Control". At first glance, it's a surprising thought. You wonder how database developers can work in a team and find out what changed, when the system worked before but is now broken; to work out what happened to their changes that now seem to have vanished; to roll-back a mistake quickly so that the rest of the team have a functioning build; to find instantly whether a suspect change has been deployed to production. Unfortunately, the survey didn't ask about the scale of the database development, and correlate the two questions. If there is only one database developer within a schema, who has an automated approach to regular generation of build scripts, then the need for a formal source control system is questionable. After all, a database stores far more about its metadata than a traditional compiled application. However, what is meat for a small development is poison for a team-based development. Here, we need a form of Source Control that can reconcile simultaneous changes, store the history of changes, derive versions and builds and that can cope with forks and merges. The problem comes when one borrows a solution that was designed for conventional programming. A database is not thought of as a "file", but a vast, interdependent and intricate matrix of tables, indexes, constraints, triggers, enumerations, static data and so on, all subtly interconnected. It is an awkward fit. Subversion with its support for merges and forks, and the tolerance of different work practices, can be made to work well, if used carefully. It has a standards-based architecture that allows it to be used on all platforms such as Windows Mac, and Linux. In the words of Erland Sommerskog, developers should "just do it". What's in a database is akin to a "binary file", and the developer must work only from the file. You check out the file, edit it, and save it to disk to compile it. Dependencies are validated at this point and if you've broken anything (e.g. you renamed a column and broke all the objects that reference the column), you'll find out about it right away, and you'll be forced to fix it. Nevertheless, for many this is an alien way of working with SQL Server. Subversion is the powerhouse, not the GUI. It doesn't work seamlessly with your existing IDE, and that usually means SSMS. So the question then becomes more subtle. Would developers be less reluctant to use a fully-featured source (revision) control system for a team database development if they had a turn-key, reliable system that fitted in with their existing work-practices? I'd love to hear what you think. Cheers, Tony.

    Read the article

  • Multitenant Design for SQL Azure: White Paper Available

    - by Herve Roggero
    Cloud computing is about scaling out all your application tiers, from web application to the database layer. In fact, the whole promise of Azure is to pay for just what you need. You need more IIS servers? No problemo... just spin another web server. You expect to double your storage needs for Azure Tables? No problemo; you are covered there too... just pay for your storage needs. But what about the database tier, SQL Azure? How do you add new databases easily, and transparently, so that your application simply uses more of SQL Azure if its needs to? Without changing a single line of code? And what if you need to scale back down? Welcome to the world of database scalability. There are many terms that describe database scalability, including data federation, multitenant designs, and even NoSQL depending on the technical solution you are implementing.  Because SQL Azure is a transactional database system, NoSQL is not really an option. However data federation and multitenant designs offer some very interesting scalability options that are worth considering. Data federation, a feature of SQL Azure that will be offered in the future, offers very interesting capabilities available natively on the SQL Azure platform. More to come in a few weeks... Multitenant designs on the other hand are design practices and technologies designed to help you reach flexible scalability options not available otherwise. The first incarnation of such a method was made available on CodePlex as an open source project (http://enzosqlshard.codeplex.com).  This project was an attempt to provide a sharding library for educational purposes.  All that sounds really cool... and really esoteric... almost a form of database "voodoo"... However after being on multiple Azure projects I am starting to see a real need. Customers want to be able to free themselves from the database tier, so that if they have 10 new customers tomorrow, all they need to do is add 2 more SQL Azure instances. It's that simple. How you achieve this, and suggested application design guidelines, are available in a white paper I just published.  The white paper offers two primary sections. The first section describes the business and technical problem at hand, and how to classify it according to specific design patterns. For example, I discuss compressed shards through schema separation. The second section offers a method for addressing the needs of a multitenant design using a new library, the big bother of the codeplex project mentioned previously (that I created earlier this year), complete with management interface and such. A Beta of this platform will be made available within weeks; as soon as the documentation will be ready.   I would like to ask you to drop me a quick email at [email protected] if you are going to download the white paper. It's not required, but it would help me get in touch with you for feedback.  You can download this white paper here:   http://www.bluesyntax.net/files/EnzoFramework.pdf . Thank you, and I am looking for feedback, thoughts and implementation opportunities.

    Read the article

  • SQL Constraints &ndash; CHECK and NOCHECK

    - by David Turner
    One performance issue i faced at a recent project was with the way that our constraints were being managed, we were using Subsonic as our ORM, and it has a useful tool for generating your ORM code called SubStage – once configured, you can regenerate your DAL code easily based on your database schema, and it can even be integrated into your build as a pre-build event if you want to do this.  SubStage also offers the useful feature of being able to generate DDL scripts for your entire database, and can script your data for you too. The problem came when we decided to use the generate scripts feature to migrate the database onto a test database instance – it turns out that the DDL scripts that it generates include the WITH NOCHECK option, so when we executed them on the test instance, and performed some testing, we found that performance wasn’t as expected. A constraint can be disabled, enabled but not trusted, or enabled and trusted.  When it is disabled, data can be inserted that violates the constraint because it is not being enforced, this is useful for bulk load scenarios where performance is important.  So what does it mean to say that a constraint is trusted or not trusted?  Well this refers to the SQL Server Query Optimizer, and whether it trusts that the constraint is valid.  If it trusts the constraint then it doesn’t check it is valid when executing a query, so the query can be executed much faster. Here is an example base in this article on TechNet, here we create two tables with a Foreign Key constraint between them, and add a single row to each.  We then query the tables: 1 DROP TABLE t2 2 DROP TABLE t1 3 GO 4 5 CREATE TABLE t1(col1 int NOT NULL PRIMARY KEY) 6 CREATE TABLE t2(col1 int NOT NULL) 7 8 ALTER TABLE t2 WITH CHECK ADD CONSTRAINT fk_t2_t1 FOREIGN KEY(col1) 9 REFERENCES t1(col1) 10 11 INSERT INTO t1 VALUES(1) 12 INSERT INTO t2 VALUES(1) 13 GO14 15 SELECT COUNT(*) FROM t2 16 WHERE EXISTS17 (SELECT *18 FROM t1 19 WHERE t1.col1 = t2.col1) This all works fine, and in this scenario the constraint is enabled and trusted.  We can verify this by executing the following SQL to query the ‘is_disabled’ and ‘is_not_trusted’ properties: 1 select name, is_disabled, is_not_trusted from sys.foreign_keys This gives the following result: We can disable the constraint using this SQL: 1 alter table t2 NOCHECK CONSTRAINT fk_t2_t1 And when we query the constraints again, we see that the constraint is disabled and not trusted: So the constraint won’t be enforced and we can insert data into the table t2 that doesn’t match the data in t1, but we don’t want to do this, so we can enable the constraint again using this SQL: 1 alter table t2 CHECK CONSTRAINT fk_t2_t1 But when we query the constraints again, we see that the constraint is enabled, but it is still not trusted: This means that the optimizer will check the constraint each time a query is executed over it, which will impact the performance of the query, and this is definitely not what we want, so we need to make the constraint trusted by the optimizer again.  First we should check that our constraints haven’t been violated, which we can do by running DBCC: 1 DBCC CHECKCONSTRAINTS (t2) Hopefully you see the following message indicating that DBCC completed without finding any violations of your constraint: Having verified that the constraint was not violated while it was disabled, we can simply execute the following SQL:   1 alter table t2 WITH CHECK CHECK CONSTRAINT fk_t2_t1 At first glance this looks like it must be a typo to have the keyword CHECK repeated twice in succession, but it is the correct syntax and when we query the constraints properties, we find that it is now trusted again: To fix our specific problem, we created a script that checked all constraints on our tables, using the following syntax: 1 ALTER TABLE t2 WITH CHECK CHECK CONSTRAINT ALL

    Read the article

  • DISA Cross Domain Enterprise Solutions on the NetBeans Platform

    - by Geertjan
    Bray 2.0 is a tool based on the NetBeans Platform that assists in creating valid Data Flow Configuration (DFC) files. The DFC Specification was developed to provide a standardized way for defining, validating, and approving data flows for use on cross-domain guarding solutions. A DFC document specifies key entities such as security domains, guards that facilitate data between security domains, data flows that describe how data travels between security domains, filters that transform and validate the data and more. Related info: http://www.disa.mil/Services/Information-Assurance/Cross-Domain-Solutions The Bray product is in development at Fulcrum IT (http://www.fulcrumco.com). The DFC Specification and Bray were developed in support of the US Department of Defense. Bray 2.0 marks the first release of Bray on the NetBeans Platform and utilizes a number of features that are core to the NetBeans Platform: Modular plugability. Bray consumers can integrate their own tools, file types, and more into the product with relative ease. Robust UI. The NetBeans Platform intuitive UI makes it easy to access and manipulate multiple aspects of a DFC. Explorer. The Explorer is a key component that makes the DFC XML easy to traverse, edit, and find errors. Context-sensitive help. JavaHelp can be readily integrated for the product as well as all the UI within. Editors. Any external file can be added to a DFC. Users can register their own editors or use the provided NetBeans editors to edit files. Printing. The NetBeans Platform Print API makes it easy to determine what should be printed and how.   A screenshot: Bray 2.0 provides a lot of key features in developing valid, robust DFC files:  XML validation. A DFC can be validated against the DFC schema specification. DFC Check List. An interactive, minimal guide for creating a complete DFC. Summary Window. The Summary Window functions like the Navigator in NetBeans IDE. The current "item of interest" is checked against various business rules and provides the ability to quickly find and fix errors. Change Log. Bray audits every change to a DFC and places them in a change log for users to peruse. Comments. Users can optionally add comments for other users to see. Digital signatures. DFC files can be digitally signed. A signature history and signature validation is provided in Bray. Pluggable security schemes. Bray ships with plain text and IC-ISM security schemes. If needed, users can integrate additional ones.  ...and more to come! New features for Bray are constantly in development including use of the NetBeans Visual Library, language support, and more. More screenshots:

    Read the article

  • Columnstore Case Study #1: MSIT SONAR Aggregations

    - by aspiringgeek
    Preamble This is the first in a series of posts documenting big wins encountered using columnstore indexes in SQL Server 2012 & 2014.  Many of these can be found in this deck along with details such as internals, best practices, caveats, etc.  The purpose of sharing the case studies in this context is to provide an easy-to-consume quick-reference alternative. Why Columnstore? If we’re looking for a subset of columns from one or a few rows, given the right indexes, SQL Server can do a superlative job of providing an answer. If we’re asking a question which by design needs to hit lots of rows—DW, reporting, aggregations, grouping, scans, etc., SQL Server has never had a good mechanism—until columnstore. Columnstore indexes were introduced in SQL Server 2012. However, they're still largely unknown. Some adoption blockers existed; yet columnstore was nonetheless a game changer for many apps.  In SQL Server 2014, potential blockers have been largely removed & they're going to profoundly change the way we interact with our data.  The purpose of this series is to share the performance benefits of columnstore & documenting columnstore is a compelling reason to upgrade to SQL Server 2014. App: MSIT SONAR Aggregations At MSIT, performance & configuration data is captured by SCOM. We archive much of the data in a partitioned data warehouse table in SQL Server 2012 for reporting via an application called SONAR.  By definition, this is a primary use case for columnstore—report queries requiring aggregation over large numbers of rows.  New data is refreshed each night by an automated table partitioning mechanism—a best practices scenario for columnstore. The Win Compared to performance using classic indexing which resulted in the expected query plan selection including partition elimination vs. SQL Server 2012 nonclustered columnstore, query performance increased significantly.  Logical reads were reduced by over a factor of 50; both CPU & duration improved by factors of 20 or more.  Other than creating the columnstore index, no special modifications or tweaks to the app or databases schema were necessary to achieve the performance improvements.  Existing nonclustered indexes were rendered superfluous & were deleted, thus mitigating maintenance challenges such as defragging as well as conserving disk capacity. Details The table provides the raw data & summarizes the performance deltas. Logical Reads (8K pages) CPU (ms) Durn (ms) Columnstore 160,323 20,360 9,786 Conventional Table & Indexes 9,053,423 549,608 193,903 ? x56 x27 x20 The charts provide additional perspective of this data.  "Conventional vs. Columnstore Metrics" document the raw data.  Note on this linear display the magnitude of the conventional index performance vs. columnstore.  The “Metrics (?)” chart expresses these values as a ratio. Summary For DW, reports, & other BI workloads, columnstore often provides significant performance enhancements relative to conventional indexing.  I have documented here, the first in a series of reports on columnstore implementations, results from an initial implementation at MSIT in which logical reads were reduced by over a factor of 50; both CPU & duration improved by factors of 20 or more.  Subsequent features in this series document performance enhancements that are even more significant. 

    Read the article

  • SSAS: Utility to export SQL code from your cube's Data Source View (DSV)

    - by DrJohn
    When you are working on a cube, particularly in a multi-person team, it is sometimes necessary to review what changes that have been done to the SQL queries in the cube's data source view (DSV). This can be a problem as the SQL editor in the DSV is not the best interface to review code. Now of course you can cut and paste the SQL into SSMS, but you have to do each query one-by-one. What is worse your DBA is unlikely to have BIDS installed, so you will have to manually export all the SQL yourself and send him the files. To make it easy to get hold of the SQL in a Data Source View, I developed a C# utility which connects to an OLAP database and uses Analysis Services Management Objects (AMO) to obtain and export all the SQL to a series of files. The added benefit of this approach is that these SQL files can be placed under source code control which means the DBA can easily compare one version with another. The Trick When I came to implement this utility, I quickly found that the AMO API does not give direct access to anything useful about the tables in the data source view. Iterating through the DSVs and tables is easy, but getting to the SQL proved to be much harder. My Google searches returned little of value, so I took a look at the idea of using the XmlDom to open the DSV’s XML and obtaining the SQL from that. This is when the breakthrough happened. Inspecting the DSV’s XML I saw the things I was interested in were called TableType DbTableName FriendlyName QueryDefinition Searching Google for FriendlyName returned this page: Programming AMO Fundamental Objects which hinted at the fact that I could use something called ExtendedProperties to obtain these XML attributes. This simplified my code tremendously to make the implementation almost trivial. So here is my code with appropriate comments. The full solution can be downloaded from here: ExportCubeDsvSQL.zip   using System;using System.Data;using System.IO;using Microsoft.AnalysisServices; ... class code removed for clarity// connect to the OLAP server Server olapServer = new Server();olapServer.Connect(config.olapServerName);if (olapServer != null){ // connected to server ok, so obtain reference to the OLAP databaseDatabase olapDatabase = olapServer.Databases.FindByName(config.olapDatabaseName);if (olapDatabase != null){ Console.WriteLine(string.Format("Succesfully connected to '{0}' on '{1}'",   config.olapDatabaseName,   config.olapServerName));// export SQL from each data source view (usually only one, but can be many!)foreach (DataSourceView dsv in olapDatabase.DataSourceViews){ Console.WriteLine(string.Format("Exporting SQL from DSV '{0}'", dsv.Name));// for each table in the DSV, export the SQL in a fileforeach (DataTable dt in dsv.Schema.Tables){ Console.WriteLine(string.Format("Exporting SQL from table '{0}'", dt.TableName)); // get name of the table in the DSV// use the FriendlyName as the user inputs this and therefore has control of itstring queryName = dt.ExtendedProperties["FriendlyName"].ToString().Replace(" ", "_");string sqlFilePath = Path.Combine(targetDir.FullName, queryName + ".sql"); // delete the sql file if it exists... file deletion code removed for clarity// write out the SQL to a fileif (dt.ExtendedProperties["TableType"].ToString() == "View"){ File.WriteAllText(sqlFilePath, dt.ExtendedProperties["QueryDefinition"].ToString());}if (dt.ExtendedProperties["TableType"].ToString() == "Table"){ File.WriteAllText(sqlFilePath, dt.ExtendedProperties["DbTableName"].ToString()); } } } Console.WriteLine(string.Format("Successfully written out SQL scripts to '{0}'", targetDir.FullName)); } }   Of course, if you are following industry best practice, you should be basing your cube on a series of views. This will mean that this utility will be of limited practical value unless of course you are inheriting a project and want to check if someone did the implementation correctly.

    Read the article

  • Cloud Backup: Getting the Users' Backs Up

    - by Tony Davis
    On Wednesday last week, Microsoft announced that as of July 1, all data transfers into its Microsoft Azure cloud will be free (though you have to pay for transferring data out). On Thursday last week, SQL Azure in Western Europe went down. It was a relatively short outage, but since SQL Azure currently provides no easy way to take a standard backup of a database and store it locally, many people had no recourse but to wait patiently for their cloud-based app to resume. It seems that Microsoft are very keen encourage developers to move their data onto their cloud, but are developers ready to do it, given that such basic backup capabilities are lacking? Recently on Simple-Talk, Mike Mooney described a perfect use case for the Microsoft Cloud. They had a simple web-based application with a SQL Server backend; they could move the application to Windows Azure, and the data into SQL Azure and in the process free themselves from much of the hassle surrounding management and scaling of the hardware, network and so on. It was a great fit and yet it nearly didn't happen; lack of support for the BACKUP command almost proved a show-stopper. Of course, backups of Azure databases are always and have always been taken automatically, for disaster recovery purposes, but these are strictly on-cloud copies and as of now it is not possible to use them to them to restore a database to a particular point in time. It seems that none of those clever Microsoft people managed to predict the need to perform basic backups of Azure databases so that copies could be stored locally, outside the Azure universe. At the very least, as Mike points out, performing a local backup before a new deployment is more or less mandatory. Microsoft did at least note the sound of gnashing teeth and, as a stop-gap measure, offered SQL Azure Database Copy which basically allows you to create an online clone of your database, but this doesn't allow for storing local archives of the data. To that end MS has provided SQL Azure Import/Export, to package up and export a database and its data, using BACPACs. These BACPACs do not guarantee transactional consistency; for example, if a child table is modified after the parent is copied, then the copied database will be in inconsistent state (meaning, to add to the fun, BACPACs need to be created from a database copy). In any event, widespread problems with BACPAC's evil cousin, the DACPAC have been well-documented, and it seems likely that many will also give BACPAC the bum's rush. Finally, in a TechEd 2011 presentation tagged "SQL Azure Advanced Administration", it was announced that "backup and restore" were coming in the next SQL Azure CTP. And yet this still doesn't mean that we'll get simple backups as DBAs know and love them. What it does mean, at least, is the ability to restore any given database to a point in time within a 2-week window. For the time being, if you want a local copy of your data and don't want to brave the BACPAC, one is left with SSIS or BCP, creative use of schema and data comparison tools, or use of SQL Azure Backup (currently in beta) in order to perform this simple but vital task. Cheers, Tony.

    Read the article

  • SQL SERVER – Finding Different ColumnName From Almost Identitical Tables

    - by pinaldave
    I have mentioned earlier on this blog that I love social media – Facebook and Twitter. I receive so many interesting questions that sometimes I wonder how come I never faced them in my real life scenario. Well, let us see one of the similar situation. Here is one of the questions which I received on my social media handle. “Pinal, I have a large database. I did not develop this database but I have inherited this database. In our database we have many tables but all the tables are in pairs. We have one archive table and one current table. Now here is interesting situation. For a while due to some reason our organization has stopped paying attention to archive data. We did not archive anything for a while. If this was not enough we  even changed the schema of current table but did not change the corresponding archive table. This is now becoming a huge huge problem. We know for sure that in current table we have added few column but we do not know which ones. Is there any way we can figure out what are the new column added in the current table and does not exist in the archive tables? We cannot use any third party tool. Would you please guide us?” Well here is the interesting example of how we can use sys.column catalogue views and get the details of the newly added column. I have previously written about EXCEPT over here which is very similar to MINUS of Oracle. In following example we are going to create two tables. One of the tables has extra column. In our resultset we will get the name of the extra column as we are comparing the catalogue view of the column name. USE AdventureWorks2012 GO CREATE TABLE ArchiveTable (ID INT, Col1 VARCHAR(10), Col2 VARCHAR(100), Col3 VARCHAR(100)); CREATE TABLE CurrentTable (ID INT, Col1 VARCHAR(10), Col2 VARCHAR(100), Col3 VARCHAR(100), ExtraCol INT); GO -- Columns in ArchiveTable but not in CurrentTable SELECT name ColumnName FROM sys.columns WHERE OBJECT_NAME(OBJECT_ID) = 'ArchiveTable' EXCEPT SELECT name ColumnName FROM sys.columns WHERE OBJECT_NAME(OBJECT_ID) = 'CurrentTable' GO -- Columns in CurrentTable but not in ArchiveTable SELECT name ColumnName FROM sys.columns WHERE OBJECT_NAME(OBJECT_ID) = 'CurrentTable' EXCEPT SELECT name ColumnName FROM sys.columns WHERE OBJECT_NAME(OBJECT_ID) = 'ArchiveTable' GO DROP TABLE ArchiveTable; DROP TABLE CurrentTable; GO The above query will return us following result. I hope this solves the problems. It is not the most elegant solution ever possible but it works. Here is the puzzle back to you – what native T-SQL solution would you have provided in this situation? Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL System Table, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • Big Data – ClustrixDB – Extreme Scale SQL Database with Real-time Analytics, Releases Software Download – NewSQL

    - by Pinal Dave
    There are so many things to learn and there is so little time we all have. As we have little time we need to be selective to learn whatever we learn. I believe I know quite a lot of things in SQL but I still do not know what is around SQL. I have started to learn about NewSQL recently. If you wonder what is NewSQL I encourage all of you to read my blog post about NewSQL over here Big Data – Buzz Words: What is NewSQL – Day 10 of 21. NewSQL databases are quickly becoming popular – providing the scale of NoSQL with the SQL features and transactions. As a part of learning NewSQL database, I have recently started to learn about ClustrixDB. ClustrixDB has been the most mature NewSQL database used by some of the largest internet sites in the world for over 3 years, with extensive SQL support. In addition to scale, it provides fast real-time analytics by bringing massively parallel processing (MPP), available only in warehousing databases, to the transactional database. The reason I am more intrigued about learning ClustrixDB is their recent announcement on Oct 31. ClustrixDB was only available as an appliance, but now with their software release on Oct 31, everyone can use it. It is now available as forever free for up to 12 cores with community support, and there is a 45 day trial for unlimited cluster sizes. With the forever free world, I am indeed interested in ClustrixDB now. I know that few of the leading eCommerce sites in the world uses them for their transactional database. Here are few of the details I have quickly noted for ClustrixDB. ClustrixDB allows user to: Scale by simply adding nodes to the cluster with a single command Run billions of transactions a day Run fast real-time analytics Achieve high-availability with recovery from node failure Manages itself Easily migrate from MySQL as it is nearly plug-and-play compatible, use MySQL drivers, tools and replication. While I was going through the documentation I realized that ClustrixDB also has extensive support for SQL features including complex queries involving joins on a dozen or more tables, aggregates, sorts, sub-queries. It also supports stored procedures, triggers, foreign keys, partitioned and temporary tables, and fully online schema changes. It is indeed a very matured product and SQL solution. Indeed Clusterix sound very promising solution, I decided to dig a bit deeper to understand who are current customers of the Clustrix as they exist in the industry for quite a few years. Their client list is indeed very interesting and here is my quick research about them. Twoo.com – Europe’s largest social discovery (dating) site runs 4.4 Billion Transactions a day with table sizes over a Terabyte, on a 168 core cluster. EngageBDR – Top 3 in the online advertising category uses ClustrixDB to serve 6.9 billion ads a day through real-time bidding platform. Their reports went from 4 hours to 15 seconds. NoMoreRack – Top 2 fastest growing e-commerce company in US used ClustrixDB for high availability and fast growth through Amazon cloud. MakeMyTrip – India’s leading travel site runs on ClustrixDB with two clusters running as multi-master in Chennai and Bangalore. Many enterprises such as AOL, CSC, Rakuten, Symantec use ClustrixDB when their applications need scale. I must accept that I am impressed with the information I have learned so far and now is the time to do some hand’s on experience with their product. I want to learn this technology so in future when it is about NewSQL, I know what I am talking about. Read more why Clustrix explains why you ClustrixDB might be the right database for you. Download ClustrixDB with me today and install it on your machine so in future when we discuss the technical aspects of it, we all are on the same page. The software can be downloaded here. Reference : Pinal Dave (http://blog.SQLAuthority.com)Filed under: Big Data, MySQL, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: Clustrix

    Read the article

  • Columnstore Case Study #1: MSIT SONAR Aggregations

    - by aspiringgeek
    Preamble This is the first in a series of posts documenting big wins encountered using columnstore indexes in SQL Server 2012 & 2014.  Many of these can be found in this deck along with details such as internals, best practices, caveats, etc.  The purpose of sharing the case studies in this context is to provide an easy-to-consume quick-reference alternative. Why Columnstore? If we’re looking for a subset of columns from one or a few rows, given the right indexes, SQL Server can do a superlative job of providing an answer. If we’re asking a question which by design needs to hit lots of rows—DW, reporting, aggregations, grouping, scans, etc., SQL Server has never had a good mechanism—until columnstore. Columnstore indexes were introduced in SQL Server 2012. However, they're still largely unknown. Some adoption blockers existed; yet columnstore was nonetheless a game changer for many apps.  In SQL Server 2014, potential blockers have been largely removed & they're going to profoundly change the way we interact with our data.  The purpose of this series is to share the performance benefits of columnstore & documenting columnstore is a compelling reason to upgrade to SQL Server 2014. App: MSIT SONAR Aggregations At MSIT, performance & configuration data is captured by SCOM. We archive much of the data in a partitioned data warehouse table in SQL Server 2012 for reporting via an application called SONAR.  By definition, this is a primary use case for columnstore—report queries requiring aggregation over large numbers of rows.  New data is refreshed each night by an automated table partitioning mechanism—a best practices scenario for columnstore. The Win Compared to performance using classic indexing which resulted in the expected query plan selection including partition elimination vs. SQL Server 2012 nonclustered columnstore, query performance increased significantly.  Logical reads were reduced by over a factor of 50; both CPU & duration improved by factors of 20 or more.  Other than creating the columnstore index, no special modifications or tweaks to the app or databases schema were necessary to achieve the performance improvements.  Existing nonclustered indexes were rendered superfluous & were deleted, thus mitigating maintenance challenges such as defragging as well as conserving disk capacity. Details The table provides the raw data & summarizes the performance deltas. Logical Reads (8K pages) CPU (ms) Durn (ms) Columnstore 160,323 20,360 9,786 Conventional Table & Indexes 9,053,423 549,608 193,903 ? x56 x27 x20 The charts provide additional perspective of this data.  "Conventional vs. Columnstore Metrics" document the raw data.  Note on this linear display the magnitude of the conventional index performance vs. columnstore.  The “Metrics (?)” chart expresses these values as a ratio. Summary For DW, reports, & other BI workloads, columnstore often provides significant performance enhancements relative to conventional indexing.  I have documented here, the first in a series of reports on columnstore implementations, results from an initial implementation at MSIT in which logical reads were reduced by over a factor of 50; both CPU & duration improved by factors of 20 or more.  Subsequent features in this series document performance enhancements that are even more significant. 

    Read the article

  • Java EE 7 Roadmap

    - by Linda DeMichiel
    The Java EE 6 Platform, released in December 2009, has seen great uptake from the community with its POJO-based programming model, lightweight Web Profile, and extension points. There are now 13 Java EE 6 compliant appserver implementations today! When we announced the Java EE 7 JSR back in early 2011, our plans were that we would release it by Q4 2012. This target date was slightly over three years after the release of Java EE 6, but at the same time it meant that we had less than two years to complete a fairly comprehensive agenda — to continue to invest in significant enhancements in simplification, usability, and functionality in updated versions of the JSRs that are currently part of the platform; to introduce new JSRs that reflect emerging needs in the community; and to add support for use in cloud environments. We have since announced a minor adjustment in our dates (to the spring of 2013) in order to accommodate the inclusion of JSRs of importance to the community, such as Web Sockets and JSON-P. At this point, however, we have to make a choice. Despite our best intentions, our progress has been slow on the cloud side of our agenda. Partially this has been due to a lack of maturity in the space for provisioning, multi-tenancy, elasticity, and the deployment of applications in the cloud. And partially it is due to our conservative approach in trying to get things "right" in view of limited industry experience in the cloud area when we started this work. Because of this, we believe that providing solid support for standardized PaaS-based programming and multi-tenancy would delay the release of Java EE 7 until the spring of 2014 — that is, two years from now and over a year behind schedule. In our opinion, that is way too long. We have therefore proposed to the Java EE 7 Expert Group that we adjust our course of action — namely, stick to our current target release dates, and defer the remaining aspects of our agenda for PaaS enablement and multi-tenancy support to Java EE 8. Of course, we continue to believe that Java EE is well-suited for use in the cloud, although such use might not be quite ready for full standardization. Even today, without Java EE 7, Java EE vendors such as Oracle, Red Hat, IBM, and CloudBees have begun to offer the ability to run Java EE applications in the cloud. Deferring the remaining cloud-oriented aspects of our agenda has several important advantages: It allows Java EE Platform vendors to gain more experience with their implementations in this area and thus helps us avoid risks entailed by trying to standardize prematurely in an emerging area. It means that the community won't need to wait longer for those features that are ready at the cost of those features that need more time. Because we have already laid some of the infrastructure for cloud support in Java EE 7, including resource definition metadata, improved security configuration, JPA schema generation, etc., it will allow us to expedite a Java EE 8 release. We therefore plan to target the Java EE 8 Platform release for the spring of 2015. This shift in the scope of Java EE 7 allows us to better retain our focus on enhancements in simplification and usability and to deliver on schedule those features that have been most requested by developers. These include the support for HTML 5 in the form of Web Sockets and JSON-P; the simplified JMS 2.0 APIs; improved Managed Bean alignment, including transactional interceptors; the JAX-RS 2.0 client API; support for method-level validation; a much more comprehensive expression language; and more. We feel strongly that this is the right thing to do, and we hope that you will support us in this proposed direction.

    Read the article

  • ODEE Green Field (Windows) Part 3 - SOA Suite

    - by AndyL-Oracle
     So you're still here, are you? I'm sure you're probably overjoyed at the prospect of continuing with our green field installation of ODEE. In my previous post, I covered the installation of WebLogic - you probably noticed, like I did, that it's a pretty quick install. I'm pretty certain this had everything to do with how quickly the next post made it to the internet! So let's dig in. Make sure you've followed the steps from the initial post to obtain the necessary software and prerequisites! Unpack the RCU (Repository Creation Utility). This ZIP file contains a directory (rcuHome) that should be extracted into your ORACLE_HOME. Run the RCU – execute rcuHome/bin/rcu.bat. Click Next. Select Create and click Next. Enter the database connection details and click Next – any failure to connection will show in the Messages box. Click Ok Expand and select the SOA Infrastructure item. This will automatically select additional required components. You can change the prefix used, but DEV is recommended. If you are creating a sandbox that includes additional components like WebCenter Content and UMS, you may select those schemas as well but they are not required for a basic ODEE installation. Click Next. Click OK. Specify the password for the schema(s). Then click Next. Click Next. Click OK. Click OK. Click Create. Click Close. Unpack the SOA Suite installation files into a single directory e.g. SOA. Run the installer – navigate and execute SOA/Disk1/setup.exe. If you receive a JDK error, switch to a command line to start the installer. To start the installer via command line, do Start?Run?cmd and cd into the SOA\Disk1 directory. Run setup.exe –jreLoc < pathtoJRE >. Ensure you do not use a path with spaces – use the ~1 notation as necessary (your directory must not exceed 8 characters so “Program Files” becomes “Progra~1” and “Program Files (x86)” becomes “Progra~2” in this notation). Click Next. Select Skip and click Next. Resolve any issues shown and click Next. Verify your oracle home locations. Defaults are recommended. Click Next. Select your application server. If you’ve already installed WebLogic, this should be automatically selected for you. Click Next. Click Install. Allow the installation to progress… Click Next. Click Finish. You can save the installation details if you want. That should keep you satisfied for the moment. Get ready, because the next posts are going to be meaty! 

    Read the article

  • New Sample Demonstrating the Traversing of Tree Bindings

    - by Duncan Mills
    A technique that I seem to use a fair amount, particularly in the construction of dynamic UIs is the use of a ADF Tree Binding to encode a multi-level master-detail relationship which is then expressed in the UI in some kind of looping form – usually a series of nested af:iterators, rather than the conventional tree or treetable. This technique exploits two features of the treebinding. First the fact that an treebinding can return both a collectionModel as well as a treeModel, this collectionModel can be used directly by an iterator. Secondly that the “rows” returned by the collectionModel themselves contain an attribute called .children. This attribute in turn gives access to a collection of all the children of that node which can also be iterated over. Putting this together you can represent the data encoded into a tree binding in all sorts of ways. As an example I’ve put together a very simple sample based on the HT schema and uploaded it to the ADF Sample project. It produces this UI: The important code is shown here for a Region -> Country -> Location Hierachy: <af:iterator id="i1" value="#{bindings.AllRegions.collectionModel}" var="rgn"> <af:showDetailHeader text="#{rgn.RegionName}" disclosed="true" id="sdh1"> <af:iterator id="i2" value="#{rgn.children}" var="cnty">     <af:showDetailHeader text="#{cnty.CountryName}" disclosed="true" id="sdh2">       <af:iterator id="i3" value="#{cnty.children}" var="loc">         <af:panelList id="pl1">         <af:outputText value="#{loc.City}" id="ot3"/>           </af:panelList>         </af:iterator>       </af:showDetailHeader>     </af:iterator>   </af:showDetailHeader> </af:iterator>  You can download the entire sample from here:

    Read the article

  • MySQL Server 5.6 defaults changes

    - by user12626240
    We're improving the MySQL Server defaults, as announced by Tomas Ulin at MySQL Connect. Here's what we're changing:  Setting  Old  New  Notes back_log  50  50 + ( max_connections / 5 ) capped at 900 binlog_checksum  off  CRC32  New variable in 5.6 binlog_row_event_max_size  1k  8k flush_time  1800  Windows changes from 1800 to 0  Was already 0 on other platforms host_cache_size  128  128 + 1 for each of the first 500 max_connections + 1 for every 20 max_connections over 500, capped at 2000  New variable in 5.6 innodb_autoextend_increment  8  64  Now affects *.ibd files. 64 is 64 megabytes innodb_buffer_pool_instances  0  8. On 32 bit Windows only, if innodb_buffer_pool_size is greater than 1300M, default is innodb_buffer_pool_size / 128M innodb_concurrency_tickets  500  5000 innodb_file_per_table  off  on innodb_log_file_size  5M  48M  InnoDB will always change size to match my.cnf value. Also see innodb_log_compressed_pages and binlog_row_image innodb_old_blocks_time 0  1000 1 second innodb_open_files  300  300; if innodb_file_per_table is ON, higher of table_open_cache or 300 innodb_purge_batch_size  20  300 innodb_purge_threads  0  1 innodb_stats_on_metadata  on  off join_buffer_size 128k  256k max_allowed_packet  1M  4M max_connect_errors  10  100 open_files_limit  0  5000  See note 1 query_cache_size  0  1M query_cache_type  on/1  off/0 sort_buffer_size  2M  256k sql_mode  none  NO_ENGINE_SUBSTITUTION  See later post about default my.cnf for STRICT_TRANS_TABLES sync_master_info  0  10000  Recommend: master_info_repository=table sync_relay_log  0  10000 sync_relay_log_info  0  10000  Recommend: relay_log_info_repository=table. Also see Replication Relay and Status Logs table_definition_cache  400  400 + table_open_cache / 2, capped at 2000 table_open_cache  400  2000   Also see table_open_cache_instances thread_cache_size  0  8 + max_connections/100, capped at 100 Note 1: In 5.5 there was already a rule to make open_files_limit 10 + max_connections + table_cache_size * 2 if that was higher than the user-specified value. Now uses the higher of that and (5000 or what you specify). We are also adding a new default my.cnf file and guided instructions on the key settings to adjust. More on this in a later post. We're also providing a page with suggestions for settings to improve backwards compatibility. The old example files like my-huge.cnf are obsolete. Some of the improvements are present from 5.6.6 and the rest are coming. These are ideas, and until they are in an official GA release, they are subject to change. As part of this work I reviewed every old server setting plus many hundreds of emails of feedback and testing results from inside and outside Oracle's MySQL Support team and the many excellent blog entries and comments from others over the years, including from many MySQL Gurus out there, like Baron, Sheeri, Ronald, Schlomi, Giuseppe and Mark Callaghan. With these changes we're trying to make it easier to set up the server by adjusting only a few settings that will cause others to be set. This happens only at server startup and only applies to variables where you haven't set a value. You'll see a similar approach used for the Performance Schema. The Gurus don't need this but for many newcomers the defaults will be very useful. Possibly the most unusual change is the way we vary the setting for innodb_buffer_pool_instances for 32-bit Windows. This is because we've found that DLLs with specified load addresses often fragment the limited four gigabyte 32-bit address space and make it impossible to allocate more than about 1300 megabytes of contiguous address space for the InnoDB buffer pool. The smaller requests for many pools are more likely to succeed. If you change the value of innodb_log_file_size in my.cnf you will see a message like this in the error log file at the next restart, instead of the old error message: [Warning] InnoDB: Resizing redo log from 2*64 to 5*128 pages, LSN=5735153 One of the biggest challenges for the defaults is the millions of installations on a huge range of systems, from point of sale terminals and routers though shared hosting or end user systems and on to major servers with lots of CPU cores, hundreds of gigabytes of RAM and terabytes of fast disk space. Our past defaults were for the smaller systems and these change that to larger shared hosting or shared end user systems, still with a bias towards the smaller end. There is a bias in favour of OLTP workloads, so reporting systems may need more changes. Where there is a conflict between the best settings for benchmarks and normal use, we've favoured production, not benchmarks. We're very interested in your feedback, comments and suggestions.

    Read the article

  • What's new in Servlet 3.1 ? - Java EE 7 moving forward

    - by arungupta
    Servlet 3.0 was released as part of Java EE 6 and made huge changes focused at ease-of-use. The idea was to leverage the latest language features such as annotations and generics and modernize how Servlets can be written. The web.xml was made as optional as possible. Servet 3.1 (JSR 340), scheduled to be part of Java EE 7, is an incremental release focusing on couple of key features and some clarifications in the specification. The main features of Servlet 3.1 are explained below: Non-blocking I/O - Servlet 3.0 allowed asynchronous request processing but only traditional I/O was permitted. This can restrict scalability of your applications. Non-blocking I/O allow to build scalable applications. TOTD #188 provide more details about how non-blocking I/O can be done using Servlet 3.1. HTTP protocol upgrade mechanism - Section 14.42 in the HTTP 1.1 specification (RFC 2616) defines an upgrade mechanism that allows to transition from HTTP 1.1 to some other, incompatible protocol. The capabilities and nature of the application-layer communication after the protocol change is entirely dependent upon the new protocol chosen. After an upgrade is negotiated between the client and the server, the subsequent requests use the new chosen protocol for message exchanges. A typical example is how WebSocket protocol is upgraded from HTTP as described in Opening Handshake section of RFC 6455. The decision to upgrade is made in Servlet.service method. This is achieved by adding a new method: HttpServletRequest.upgrade and two new interfaces: javax.servlet.http.HttpUpgradeHandler and javax.servlet.http.WebConnection. TyrusHttpUpgradeHandler shows how WebSocket protocol upgrade is done in Tyrus (Reference Implementation for Java API for WebSocket). Security enhancements Applying run-as security roles to #init and #destroy methods Session fixation attack by adding HttpServletRequest.changeSessionId and a new interface HttpSessionIdListener. You can listen for any session id changes using these methods. Default security semantic for non-specified HTTP method in <security-constraint> Clarifying the semantics if a parameter is specified in the URI and payload Miscellaneous ServletResponse.reset clears any data that exists in the buffer as well as the status code, headers. In addition, Servlet 3.1 will also clears the state of calling getServletOutputStream or getWriter. ServletResponse.setCharacterEncoding: Sets the character encoding (MIME charset) of the response being sent to the client, for example, to UTF-8. Relative protocol URL can be specified in HttpServletResponse.sendRedirect. This will allow a URL to be specified without a scheme. That means instead of specifying "http://anotherhost.com/foo/bar.jsp" as a redirect address, "//anotherhost.com/foo/bar.jsp" can be specified. In this case the scheme of the corresponding request will be used. Clarification in HttpServletRequest.getPart and .getParts without multipart configuration. Clarification that ServletContainerInitializer is independent of metadata-complete and is instantiated per web application. A complete replay of What's New in Servlet 3.1: An Overview from JavaOne 2012 can be seen here (click on CON6793_mp4_6793_001 in Media). Each feature will be added to the JSR subject to EG approval. You can share your feedback to [email protected]. Here are some more references for you: Servlet 3.1 Public Review Candidate Downloads Servlet 3.1 PR Candidate Spec Servlet 3.1 PR Candidate Javadocs Servlet Specification Project JSR Expert Group Discussion Archive Java EE 7 Specification Status Several features have already been integrated in GlassFish 4 Promoted Builds. Have you tried any of them ? Here are some other Java EE 7 primers published so far: Concurrency Utilities for Java EE (JSR 236) Collaborative Whiteboard using WebSocket in GlassFish 4 (TOTD #189) Non-blocking I/O using Servlet 3.1 (TOTD #188) What's New in EJB 3.2 ? JPA 2.1 Schema Generation (TOTD #187) WebSocket Applications using Java (JSR 356) Jersey 2 in GlassFish 4 (TOTD #182) WebSocket and Java EE 7 (TOTD #181) Java API for JSON Processing (JSR 353) JMS 2.0 Early Draft (JSR 343) And of course, more on their way! Do you want to see any particular one first ?

    Read the article

  • Broken Views

    - by Ajarn Mark Caldwell
    “SELECT *” isn’t just hazardous to performance, it can actually return blatantly wrong information. There are a number of blog posts and articles out there that actively discourage the use of the SELECT * FROM …syntax.  The two most common explanations that I have seen are: Performance:  The SELECT * syntax will return every column in the table, but frequently you really only need a few of the columns, and so by using SELECT * your are retrieving large volumes of data that you don’t need, but the system has to process, marshal across tiers, and so on.  It would be much more efficient to only select the specific columns that you need. Future-proof:  If you are taking other shortcuts in your code, along with using SELECT *, you are setting yourself up for trouble down the road when enhancements are made to the system.  For example, if you use SELECT * to return results from a table into a DataTable in .NET, and then reference columns positionally (e.g. myDataRow[5]) you could end up with bad data if someone happens to add a column into position 3 and skewing all the remaining columns’ ordinal position.  Or if you use INSERT…SELECT * then you will likely run into errors when a new column is added to the source table in any position. And if you use SELECT * in the definition of a view, you will run into a variation of the future-proof problem mentioned above.  One of the guys on my team, Mike Byther, ran across this in a project we were doing, but fortunately he caught it while we were still in development.  I asked him to put together a test to prove that this was related to the use of SELECT * and not some other anomaly.  I’ll walk you through the test script so you can see for yourself what happens. We are going to create a table and two views that are based on that table, one of them uses SELECT * and the other explicitly lists the column names.  The script to create these objects is listed below. IF OBJECT_ID('testtab') IS NOT NULL DROP TABLE testtabgoIF OBJECT_ID('testtab_vw') IS NOT NULL DROP VIEW testtab_vwgo IF OBJECT_ID('testtab_vw_named') IS NOT NULL DROP VIEW testtab_vw_namedgo CREATE TABLE testtab (col1 NVARCHAR(5) null, col2 NVARCHAR(5) null)INSERT INTO testtab(col1, col2)VALUES ('A','B'), ('A','B')GOCREATE VIEW testtab_vw AS SELECT * FROM testtabGOCREATE VIEW testtab_vw_named AS SELECT col1, col2 FROM testtabgo Now, to prove that the two views currently return equivalent results, select from them. SELECT 'star', col1, col2 FROM testtab_vwSELECT 'named', col1, col2 FROM testtab_vw_named OK, so far, so good.  Now, what happens if someone makes a change to the definition of the underlying table, and that change results in a new column being inserted between the two existing columns?  (Side note, I normally prefer to append new columns to the end of the table definition, but some people like to keep their columns alphabetized, and for clarity for later people reviewing the schema, it may make sense to group certain columns together.  Whatever the reason, it sometimes happens, and you need to protect yourself and your code from the repercussions.) DROP TABLE testtabgoCREATE TABLE testtab (col1 NVARCHAR(5) null, col3 NVARCHAR(5) NULL, col2 NVARCHAR(5) null)INSERT INTO testtab(col1, col3, col2)VALUES ('A','C','B'), ('A','C','B')goSELECT 'star', col1, col2 FROM testtab_vwSELECT 'named', col1, col2 FROM testtab_vw_named I would have expected that the view using SELECT * in its definition would essentially pass-through the column name and still retrieve the correct data, but that is not what happens.  When you run our two select statements again, you see that the View that is based on SELECT * actually retrieves the data based on the ordinal position of the columns at the time that the view was created.  Sure, one work-around is to recreate the View, but you can’t really count on other developers to know the dependencies you have built-in, and they won’t necessarily recreate the view when they refactor the table. I am sure that there are reasons and justifications for why Views behave this way, but I find it particularly disturbing that you can have code asking for col2, but actually be receiving data from col3.  By the way, for the record, this entire scenario and accompanying test script apply to SQL Server 2008 R2 with Service Pack 1. So, let the developer beware…know what assumptions are in effect around your code, and keep on discouraging people from using SELECT * syntax in anything but the simplest of ad-hoc queries. And of course, let’s clean up after ourselves.  To eliminate the database objects created during this test, run the following commands. DROP TABLE testtabDROP VIEW testtab_vwDROP VIEW testtab_vw_named

    Read the article

  • Apache 2.2 and FastCGI stops responding, warnings, crashes

    - by Brett
    I've seen this question posted a few times using a Google search, with no real answers. I have a multi-threaded FastCGI application running with Apache 2.2 on FreeBSD 7.2. There are a few issues with it, and I am unable to really figure out the source of the problem even after poking through a bunch of the mod_fastcgi source code. My FastCGI application gets anywhere from 2 to 15 or so hits per second, and mostly services a back-end API (the majority of web server usage is for this, and not actually serving content). Everything seems to work ok under normal conditions, but recently this problem has been becoming worse. It starts out with the FastCGI process manager apparently trying to close unneeded processes, sending them a SIGTERM signal. I catch the signal, clean up some stuff, and exit (by calling exit()) with status code 0. This process seems to result in three log messages in my httpd error log: [Tue Jun 01 14:03:31 2010] [warn] FastCGI: (dynamic) server "/home/program/wwwroot/domains/www.mydomain.com/cgi-bin/program.cgi" (pid 98182) termination signaled [Tue Jun 01 14:03:31 2010] [warn] FastCGI: (dynamic) server "/home/program/wwwroot/domains/www.mydomain.com/cgi-bin/program.cgi" (pid 98182) terminated by calling exit with status '0' [Tue Jun 01 14:03:31 2010] [warn] FastCGI: (dynamic) server "/home/program/wwwroot/domains/www.mydomain.com/cgi-bin/program.cgi" restarted (pid 98294) I am not sure why it says it is restarting the process, but in any case no core dump is ever generated so I do believe it is the FastCGI process manager doing it's thing. This makes sense because it begins to happen after the initial load increase from restarting Apache. Since it's down for a few seconds, it gets hit with a couple of hundred requests over the first few seconds it's running again (sometimes even hitting the upper limit of MAXCLIENTS in Apache), and this seems to be the process manager doing the work of spawning more processes to handle the increased load. So this all seems fine, but here is where things get weird. There are really two problems that I see. First, my multithreaded FastCGI process spawns 25 worker threads, and all seem to be used according to my internal log files (multiple processes are clearly using multiple threads to do work). However it seems that 3 or 4 FastCGI processes is not enough to handle the 5 to 15 hit per second load, even though the requests take about .02s or so to process internally. In order to be at all responsive, it seems I need 50 or more FastCGI processes, leading me to believe that FastCGI does not realize that my program is multithreaded. I've read the documentation at http://www.fastcgi.com/mod_fastcgi/docs/mod_fastcgi.html and do not see any option pertaining to multithreaded-ness, and my internal code is more or less set up just like the examples provided by the FastCGI library. The second problem I am having is that once process termination has happened a bunch of times as above (and seemingly at random), I begin getting a lot of these messages in my error log: [Tue Jun 01 14:06:22 2010] [warn] (32)Broken pipe: FastCGI: write() to PM failed (ignore if a restart or shutdown is pending) The messages occur for about half the hits I get to the server, and it completely kills the responsiveness of my application - it seems FastCGI will look for a working "pipe" until it finds one, and fail to realize that whatever application it is trying to contact is dead. It does still work though, it's just incredibly unresponsive - sometimes taking up to 40 or so seconds to process a request. I recompiled mod_fastcgi with some extra debugging around the point of the error message, and it appears that the error happens when it tries to write() to the application. The call to write() fails with a -1 return code, and sets errno to EPIPE. I am noticing that the issue happens mostly when either a crash occurs in one of the FastCGI processes, or a bunch of them are seemingly terminated by the process manager. I haven't had any core dumps though, except for one, where the backtrace outputted by gdb is just a single call to free() at address 0x0000000000000000 with nothing else in the stack trace, so I don't really know what to make of that. I'm thinking it happens sometime after the SIGTERM signal is caught, maybe some global variable not being cleaned up properly or something.

    Read the article

  • Apache outputs all urls of a second domain as a subfolder of the primary domain name

    - by s_rathbone
    Hi all, would anyone be able to possibly give me some guidance.. Basically, i have a 'shared hosting' account with a large internet hosting provider, and my account lets me have multiple seperate domains within this folder structure.(note: not aliased domains and not sub domains). so, my goal is to have 2 domains set up. i have already purchased the two domain names i need: The first domain is the 'primary' domain name for the root folder(eg. www.example1.com) and the second domain name is set for one of its sub folders(eg. www.example2.com is set to the folder www.example1.com/sites/music). The problem is that when apache returns a page of the second domain back to the browser, apache writes the hyperlinks as if it's a sub folder of the first domain ( eg. www.example2.com/index.html. comes out as http://www.example1.com/sites/music/index.html). Now, I have done some reading on this, looking though "Apache: the definitive guide"(o'reilly), and although it was useful, couldn't really find the answer. i'm guessing this issue is most likely an apache setup issue in http.conf, rather than an issue with the hosting company itself (which is why im posting it here) and I have also been to the official documentation for apache site, and i am guessing i might need to use something like the rewritebase directive in htaccess files.. but im really not sure, im more of a java programmer guy, and have been struggling with this for a couple of days. Any guidance would be REALLY appreciated. If it helps, my hosting company is godaddy, and my sites are hosted on linux. My problem was originally with wordpress which i reinstalled a number of times in various ways to correct the problem, but ive just done a test with a very simple static html, and it still has the same issue with relative urls like this: <html> <head></head><body><a href="images/dog.html">Pictures of Dogs</a></body> </html> However, it is fine if i hardcode the urls like this: <html> <head></head><body><a href="http://www.example2.com/images/dog.html">Pictures of Dogs</a></body> </html> Thanks heaps, Steve R NOW FIXED Ok, the problem has now been fixed, and i didn't need to modify any .conf or .htaccess files. The problem was, that when I went to install the second application into a second domain from the godaddy site, one of the setup questions is that it asks you which site you want it installed to. after that it asks for the desired folder path. However, the problem was that the second domain name was already pointing to the correct subfolder of the primary domain. So when I started installing wordpress again and came to the menu to select which site it was for, and it listed only the primary domain as an option, i assumed that this was like a label of "which hosting account?", or "which primary domain will your application will be installed under?" because I already knew that in the next step i was specifiying the folder. In order to correct this, you must make sure that your second domain is added to your domain list so that it will be listed as an option during the installation process. For further details please read tystips.com/archives/52/how2-save-money-host-multiple-wordpress-blogs-on-a-single-godaddy-hosting-account/

    Read the article

  • disk-to-disk backup without costly backup redundancy?

    - by AaronLS
    A good backup strategy involves a combination of 1) disconnected backups/snapshots that will not be affected by bugs, viruses, and/or security breaches 2) geographically distributed backups to protect against local disasters 3) testing backups to ensure that they can be restored as needed Generally I take an onsite backup daily, and an offsite backup weekly, and do test restores periodically. In the rare circumstance that I need to restore files, I do some from the local backup. Should a catastrophic event destroy the servers and local backups, then the offsite weekly tape backup would be used to restore the files. I don't need multiple offsite backups with redundancy. I ALREADY HAVE REDUNDANCY THROUGH THE USE OF BOTH LOCAL AND REMOTE BACKUPS. I have recovery blocks and par files with the backups, so I already have protection against a small percentage of corrupt bits. I perform test restores to ensure the backups function properly. Should the remote backups experience a dataloss, I can replace them with one of the local backups. There are historical offsite backups as well, so if a dataloss was not noticed for a few weeks(such as a bug/security breach/virus), the data could be restored from an older backup. By doing this, the only scenario that poses a risk to complete data loss would be one where both the local, remote, and servers all experienced a data loss in the same time period. I'm willing to risk that happening since the odds of that trifecta negligibly small, and the data isn't THAT valuable to me. So I hope I have emphasized that I don't need redundancy in my offsite backups because I have covered all the bases. I know this exact technique is employed by numerous businesses. Of course there are some that take multiple offsite backups, because the data is so incredibly valuable that they don't even want to risk that trifecta disaster, but in the majority of cases the trifecta disaster is an accepted risk. I HAD TO COVER ALL THIS BECAUSE SOME PEOPLE DON'T READ!!! I think I have justified my backup strategy and the majority of businesses who use offsite tape backups do not have any additional redundancy beyond what is mentioned above(recovery blocks, par files, historical snapshots). Now I would like to eliminate the use of tapes for offsite backups, and instead use a backup service. Most however are extremely costly for $/gb/month storage. I don't mind paying for transfer bandwidth, but the cost of storage is way to high. All of them advertise that they maintain backups of the data, and I imagine they use RAID as well. Obviously if you were using them to host servers this would all be necessary, but for my scenario, I am simply replacing my offsite backups with such a service. So there is no need for RAID, and absolutely no value in another layer of backups of backups. My one and only question: "Are there online data-storage/backup services that do not use redundancy or offer backups(backups of my backups) as part of their packages, and thus are more reasonably priced?" NOT my question: "Is this a flawed strategy?" I don't care if you think this is a good strategy or not. I know it pretty standard. Very few people make an extra copy of their offsite backups. They already have local backups that they can use to replace the remote backups if something catastrophic happens at the remote site. Please limit your responses to the question posed. Sorry if I seem a little abrasive, but I had some trolls in my last post who didn't read my requirements nor my question, and were trying to go off answering a totally different question. I made it pretty clear, but didn't try to justify my strategy, because I didn't ask about whether my strategy was justifyable. So I apologize if this was lengthy, as it really didn't need to be, but since there are so many trolls here who try to sidetrack questions by responding without addressing the question at hand.

    Read the article

  • httpd.conf configuration - for internal/external access

    - by tom smith
    hey. after a lot of trail/error/research, i've decided to post here in the hopes that i can get clarification on what i've screwed up... i've got a situation where i have multiple servers behind a router/firewall. i want to be able to access the sites i have from an internal and external url/address, and get the same site. i have to use portforwarding on the router, so i need to be able to use proxyreverse to redirect the user to the approriate server, running the apache/web app... my setup the external urls joomla.gotdns.com forge.gotdns.com both of these point to my router's external ip address (67.168.2.2) (not really) the router forwards port 80 to my server lserver6 192.168.1.56 lserver6 - 192.168.1.56 lserver9 - 192.168.1.59 lserver6 - joomla app lserver9 - forge app i want to be able to have the httpd process (httpd.conf) configured on lserver6 to be able to allow external users accessing the system (foo.gotdns.com) be able to access the joomla app on lserver6 and the same for the forge app running on lserver9 at the same time, i would also like to be able to access the apps from the internal servers, so i'd need to be able to somehow configure the vhost setup/proxyreverse setup to handle the internal access... i've tried setting up multiple vhosts with no luck.. i've looked at the different examples online.. so there must be something subtle that i'm missing... the section of my httpd.conf file that deals with the vhost is below... if there's something else that's needed, let me know and i can post it as well.. thanks -tom ##joomla - file /etc/httpd/conf.d/joomla.conf Alias /joomla /var/www/html/joomla <Directory /var/www/html/joomla> </Directory> # Use name-based virtual hosting. #NameVirtualHost *:80 # NOTE: NameVirtualHost cannot be used without a port specifier # (e.g. :80) if mod_ssl is being used, due to the nature of the # SSL protocol. # VirtualHost example: # Almost any Apache directive may go into a VirtualHost container. # The first VirtualHost section is used for requests without a known # server name. #<VirtualHost *:80> # ServerAdmin [email protected] # DocumentRoot /www/docs/dummy-host.example.com # ServerName dummy-host.example.com # ErrorLog logs/dummy-host.example.com-error_log # CustomLog logs/dummy-host.example.com-access_log common #</VirtualHost> NameVirtualHost 192.168.1.56:80 <VirtualHost 192.168.1.56:80> #ServerAdmin [email protected] #DocumentRoot /var/www/html #ServerName lserver6.tmesa.com #ServerName fforge.tmesa.com ServerName fforge.gotdns.com:80 #ErrorLog logs/dummy-host.example.com-error_log #CustomLog logs/dummy-host.example.com-access_log common #ProxyRequests Off ProxyPass / http://192.168.1.81:80/ ProxyPassReverse / http://192.168.1.81:80/ </VirtualHost> <VirtualHost 192.168.1.56:80> #ServerAdmin [email protected] DocumentRoot /var/www/html/joomla #ServerName lserver6.tmesa.com #ServerName fforge.tmesa.com ServerName 192.168.1.56:80 #ErrorLog logs/dummy-host.example.com-error_log #CustomLog logs/dummy-host.example.com-access_log common #ProxyRequests Off </VirtualHost>

    Read the article

  • Network Restructure Method for Double-NAT network

    - by Adrian
    Due to a series of poor network design decisions (mostly) made many years ago in order to save a few bucks here and there, I have a network that is decidedly sub-optimally architected. I'm looking for suggestions to improve this less-than-pleasant situation. We're a non-profit with a Linux-based IT department and a limited budget. (Note: None of the Windows equipment we have runs does anything that talks to the Internet nor do we have any Windows admins on staff.) Key points: We have a main office and about 12 remote sites that essentially double NAT their subnets with physically-segregated switches. (No VLANing and limited ability to do so with current switches) These locations have a "DMZ" subnet that are NAT'd on an identically assigned 10.0.0/24 subnet at each site. These subnets cannot talk to DMZs at any other location because we don't route them anywhere except between server and adjacent "firewall". Some of these locations have multiple ISP connections (T1, Cable, and/or DSLs) that we manually route using IP Tools in Linux. These firewalls all run on the (10.0.0/24) network and are mostly "pro-sumer" grade firewalls (Linksys, Netgear, etc.) or ISP-provided DSL modems. Connecting these firewalls (via simple unmanaged switches) is one or more servers that must be publically-accessible. Connected to the main office's 10.0.0/24 subnet are servers for email, tele-commuter VPN, remote office VPN server, primary router to the internal 192.168/24 subnets. These have to be access from specific ISP connections based on traffic type and connection source. All our routing is done manually or with OpenVPN route statements Inter-office traffic goes through the OpenVPN service in the main 'Router' server which has it's own NAT'ing involved. Remote sites only have one server installed at each site and cannot afford multiple servers due to budget constraints. These servers are all LTSP servers several 5-20 terminals. The 192.168.2/24 and 192.168.3/24 subnets are mostly but NOT entirely on Cisco 2960 switches that can do VLAN. The remainder are DLink DGS-1248 switches that I am not sure I trust well enough to use with VLANs. There is also some remaining internal concern about VLANs since only the senior networking staff person understands how it works. All regular internet traffic goes through the CentOS 5 router server which in turns NATs the 192.168/24 subnets to the 10.0.0.0/24 subnets according to the manually-configured routing rules that we use to point outbound traffic to the proper internet connection based on '-host' routing statements. I want to simplify this and ready All Of The Things for ESXi virtualization, including these public-facing services. Is there a no- or low-cost solution that would get rid of the Double-NAT and restore a little sanity to this mess so that my future replacement doesn't hunt me down? Basic Diagram for the main office: These are my goals: Public-facing Servers with interfaces on that middle 10.0.0/24 network to be moved in to 192.168.2/24 subnet on ESXi servers. Get rid of the double NAT and get our entire network on one single subnet. My understanding is that this is something we'll need to do under IPv6 anyway, but I think this mess is standing in the way.

    Read the article

  • Windows 7: How to place SuperFetch cache on an SSD?

    - by Ian Boyd
    I'm thinking of adding a solid state drive (SSD) to my existing Windows 7 installation. I know I can (and should) move my paging file to the SSD: Should the pagefile be placed on SSDs? Yes. Most pagefile operations are small random reads or larger sequential writes, both of which are types of operations that SSDs handle well. In looking at telemetry data from thousands of traces and focusing on pagefile reads and writes, we find that Pagefile.sys reads outnumber pagefile.sys writes by about 40 to 1, Pagefile.sys read sizes are typically quite small, with 67% less than or equal to 4 KB, and 88% less than 16 KB. Pagefile.sys writes are relatively large, with 62% greater than or equal to 128 KB and 45% being exactly 1 MB in size. In fact, given typical pagefile reference patterns and the favorable performance characteristics SSDs have on those patterns, there are few files better than the pagefile to place on an SSD. What I don't know is if I even can put a SuperFetch cache (i.e. ReadyBoost cache) on the solid state drive. I want to get the benefit of Windows being able to cache gigabytes of frequently accessed data on a relativly small (e.g. 30GB) solid state drive. This is exactly what SuperFetch+ReadyBoost (or SuperFetch+ReadyDrive) was designed for. Will Windows offer (or let) me place a ReadyBoost cache on a solid state flash drive connected via SATA? A problem with the ReadyBoost cache over the ReadyDrive cache is that the ReadyBoost cache does not survive between reboots. The cache is encrypted with a per-session key, making its existing contents unusable during boot and SuperFetch pre-fetching during login. Update One I know that Windows Vista limited you to only one ReadyBoost.sfcache file (I do not know if Windows 7 removed that limitation): Q: Can use use multiple devices for EMDs? A: Nope. We've limited Vista to one ReadyBoost per machine Q: Why just one device? A: Time and quality. Since this is the first revision of the feature, we decided to focus on making the single device exceptional, without the difficulties of managing multiple caches. We like the idea, though, and it's under consideration for future versions. I also know that the 4GB limit on the cache file was a limitation of the FAT filesystem used on most USB sticks - an SSD drive would be formatted with NTFS: Q: What's the largest amount of flash that I can use for ReadyBoost? A: You can use up to 4GB of flash for ReadyBoost (which turns out to be 8GB of cache w/ the compression) Q: Why can't I use more than 4GB of flash? A: The FAT32 filesystem limits our ReadyBoost.sfcache file to 4GB Can a ReadyBoost cache on an NTFS volume be larger than 4GB? Update Two The ReadyBoost cache is encrypted with a per-boot session key. This means that the cache has to be re-built after each boot, and cannot be used to help speed boot times, or latency from login to usable. Windows ReadyDrive technology takes advantage of non-volatile (NV) memory (i.e. flash) that is incorporated with some hybrid hard drives. This flash cache can be used to help Windows boot, or resume from hibernate faster. Will Windows 7 use an internal SSD drive as a ReadyBoost/*ReadyDrive*/SuperFetch cache? Is it possible to make Windows store a SuperFetch cache (i.e. ReadyBoost) on a non-removable SSD? Is it possible to not encrypt the ReadyBoost cache, and if so will Windows 7 use the cache at boot time? See also SuperUser.com: ReadyBoost + SSD = ? Windows 7 - ReadyBoost & SSD drives? Support and Q&A for Solid-State Drives Using SDD as a cache for HDD, is there a solution? Performance increase using SSD for paging/fetch/cache or ReadyBoost? (Win7) Windows 7 To Boost SSD Performance How to Disable Nonvolatile Caching

    Read the article

  • Does this prove a network bandwidth bottleneck?

    - by Yuji Tomita
    I've incorrectly assumed that my internal AB testing means my server can handle 1k concurrency @3k hits per second. My theory at at the moment is that the network is the bottleneck. The server can't send enough data fast enough. External testing from blitz.io at 1k concurrency shows my hits/s capping off at 180, with pages taking longer and longer to respond as the server is only able to return 180 per second. I've served a blank file from nginx and benched it: it scales 1:1 with concurrency. Now to rule out IO / memcached bottlenecks (nginx normally pulls from memcached), I serve up a static version of the cached page from the filesystem. The results are very similar to my original test; I'm capped at around 180 RPS. Splitting the HTML page in half gives me double the RPS, so it's definitely limited by the size of the page. If I internally ApacheBench from the local server, I get consistent results of around 4k RPS on both the Full Page and the Half Page, at high transfer rates. Transfer rate: 62586.14 [Kbytes/sec] received If I AB from an external server, I get around 180RPS - same as the blitz.io results. How do I know it's not intentional throttling? If I benchmark from multiple external servers, all results become poor which leads me to believe the problem is in MY servers outbound traffic, not a download speed issue with my benchmarking servers / blitz.io. So I'm back to my conclusion that my server can't send data fast enough. Am I right? Are there other ways to interpret this data? Is the solution/optimization to set up multiple servers + load balancing that can each serve 180 hits per second? I'm quite new to server optimization, so I'd appreciate any confirmation interpreting this data. Outbound traffic Here's more information about the outbound bandwidth: The network graph shows a maximum output of 16 Mb/s: 16 megabits per second. Doesn't sound like much at all. Due to a suggestion about throttling, I looked into this and found that linode has a 50mbps cap (which I'm not even close to hitting, apparently). I had it raised to 100mbps. Since linode caps my traffic, and I'm not even hitting it, does this mean that my server should indeed be capable of outputting up to 100mbps but is limited by some other internal bottleneck? I just don't understand how networks at this large of a scale work; can they literally send data as fast as they can read from the HDD? Is the network pipe that big? In conclusion 1: Based on the above, I'm thinking I can definitely raise my 180RPS by adding an nginx load balancer on top of a multi nginx server setup at exactly 180RPS per server behind the LB. 2: If linode has a 50/100mbit limit that I'm not hitting at all, there must be something I can do to hit that limit with my single server setup. If I can read / transmit data fast enough locally, and linode even bothers to have a 50mbit/100mbit cap, there must be an internal bottleneck that's not allowing me to hit those caps that I'm not sure how to detect. Correct? I realize the question is huge and vague now, but I'm not sure how to condense it. Any input is appreciated on any conclusion I've made.

    Read the article

  • How to place SuperFetch cache on an SSD?

    - by Ian Boyd
    I'm thinking of adding a solid state drive (SSD) to my existing Windows 7 installation. I know I can (and should) move my paging file to the SSD: Should the pagefile be placed on SSDs? Yes. Most pagefile operations are small random reads or larger sequential writes, both of which are types of operations that SSDs handle well. In looking at telemetry data from thousands of traces and focusing on pagefile reads and writes, we find that Pagefile.sys reads outnumber pagefile.sys writes by about 40 to 1, Pagefile.sys read sizes are typically quite small, with 67% less than or equal to 4 KB, and 88% less than 16 KB. Pagefile.sys writes are relatively large, with 62% greater than or equal to 128 KB and 45% being exactly 1 MB in size. In fact, given typical pagefile reference patterns and the favorable performance characteristics SSDs have on those patterns, there are few files better than the pagefile to place on an SSD. What I don't know is if I even can put a SuperFetch cache (i.e. ReadyBoost cache) on the solid state drive. I want to get the benefit of Windows being able to cache gigabytes of frequently accessed data on a relativly small (e.g. 30GB) solid state drive. This is exactly what SuperFetch+ReadyBoost (or SuperFetch+ReadyDrive) was designed for. Will Windows offer (or let) me place a ReadyBoost cache on a solid state flash drive connected via SATA? A problem with the ReadyBoost cache over the ReadyDrive cache is that the ReadyBoost cache does not survive between reboots. The cache is encrypted with a per-session key, making its existing contents unusable during boot and SuperFetch pre-fetching during login. Update One I know that Windows Vista limited you to only one ReadyBoost.sfcache file (I do not know if Windows 7 removed that limitation): Q: Can use use multiple devices for EMDs? A: Nope. We've limited Vista to one ReadyBoost per machine Q: Why just one device? A: Time and quality. Since this is the first revision of the feature, we decided to focus on making the single device exceptional, without the difficulties of managing multiple caches. We like the idea, though, and it's under consideration for future versions. I also know that the 4GB limit on the cache file was a limitation of the FAT filesystem used on most USB sticks - an SSD drive would be formatted with NTFS: Q: What's the largest amount of flash that I can use for ReadyBoost? A: You can use up to 4GB of flash for ReadyBoost (which turns out to be 8GB of cache w/ the compression) Q: Why can't I use more than 4GB of flash? A: The FAT32 filesystem limits our ReadyBoost.sfcache file to 4GB Can a ReadyBoost cache on an NTFS volume be larger than 4GB? Update Two The ReadyBoost cache is encrypted with a per-boot session key. This means that the cache has to be re-built after each boot, and cannot be used to help speed boot times, or latency from login to usable. Windows ReadyDrive technology takes advantage of non-volatile (NV) memory (i.e. flash) that is incorporated with some hybrid hard drives. This flash cache can be used to help Windows boot, or resume from hibernate faster. Will Windows 7 use an internal SSD drive as a ReadyBoost/*ReadyDrive*/SuperFetch cache? Is it possible to make Windows store a SuperFetch cache (i.e. ReadyBoost) on a non-removable SSD? Is it possible to not encrypt the ReadyBoost cache, and if so will Windows 7 use the cache at boot time? See also SuperUser.com: ReadyBoost + SSD = ? Windows 7 - ReadyBoost & SSD drives? Support and Q&A for Solid-State Drives Using SDD as a cache for HDD, is there a solution? Performance increase using SSD for paging/fetch/cache or ReadyBoost? (Win7) Windows 7 To Boost SSD Performance How to Disable Nonvolatile Caching

    Read the article

< Previous Page | 538 539 540 541 542 543 544 545 546 547 548 549  | Next Page >