Search Results

Search found 1555 results on 63 pages for 'filtering'.

Page 58/63 | < Previous Page | 54 55 56 57 58 59 60 61 62 63  | Next Page >

  • Using SQL Execution Plans to discover the Swedish alphabet

    - by Rob Farley
    SQL Server is quite remarkable in a bunch of ways. In this post, I’m using the way that the Query Optimizer handles LIKE to keep it SARGable, the Execution Plans that result, Collations, and PowerShell to come up with the Swedish alphabet. SARGability is the ability to seek for items in an index according to a particular set of criteria. If you don’t have SARGability in play, you need to scan the whole index (or table if you don’t have an index). For example, I can find myself in the phonebook easily, because it’s sorted by LastName and I can find Farley in there by moving to the Fs, and so on. I can’t find everyone in my suburb easily, because the phonebook isn’t sorted that way. I can’t even find people who have six letters in their last name, because also the book is sorted by LastName, it’s not sorted by LEN(LastName). This is all stuff I’ve looked at before, including in the talk I gave at SQLBits in October 2010. If I try to find everyone who’s names start with F, I can do that using a query a bit like: SELECT LastName FROM dbo.PhoneBook WHERE LEFT(LastName,1) = 'F'; Unfortunately, the Query Optimizer doesn’t realise that all the entries that satisfy LEFT(LastName,1) = 'F' will be together, and it has to scan the whole table to find them. But if I write: SELECT LastName FROM dbo.PhoneBook WHERE LastName LIKE 'F%'; then SQL is smart enough to understand this, and performs an Index Seek instead. To see why, I look further into the plan, in particular, the properties of the Index Seek operator. The ToolTip shows me what I’m after: You’ll see that it does a Seek to find any entries that are at least F, but not yet G. There’s an extra Predicate in there (a Residual Predicate if you like), which checks that each LastName is really LIKE F% – I suppose it doesn’t consider that the Seek Predicate is quite enough – but most of the benefit is seen by its working out the Seek Predicate, filtering to just the “at least F but not yet G” section of the data. This got me curious though, particularly about where the G comes from, and whether I could leverage it to create the Swedish alphabet. I know that in the Swedish language, there are three extra letters that appear at the end of the alphabet. One of them is ä that appears in the word Västerås. It turns out that Västerås is quite hard to find in an index when you’re looking it up in a Swedish map. I talked about this briefly in my five-minute talk on Collation from SQLPASS (the one which was slightly less than serious). So by looking at the plan, I can work out what the next letter is in the alphabet of the collation used by the column. In other words, if my alphabet were Swedish, I’d be able to tell what the next letter after F is – just in case it’s not G. It turns out it is… Yes, the Swedish letter after F is G. But I worked this out by using a copy of my PhoneBook table that used the Finnish_Swedish_CI_AI collation. I couldn’t find how the Query Optimizer calculates the G, and my friend Paul White (@SQL_Kiwi) tells me that it’s frustratingly internal to the QO. He’s particularly smart, even if he is from New Zealand. To investigate further, I decided to do some PowerShell, leveraging the Get-SqlPlan function that I blogged about recently (make sure you also have the SqlServerCmdletSnapin100 snap-in added). I started by indicating that I was going to use Finnish_Swedish_CI_AI as my collation of choice, and that I’d start whichever letter cam straight after the number 9. I figure that this is a cheat’s way of guessing the first letter of the alphabet (but it doesn’t actually work in Unicode – luckily I’m using varchar not nvarchar. Actually, there are a few aspects of this code that only work using ASCII, so apologies if you were wanting to apply it to Greek, Japanese, etc). I also initialised my $alphabet variable. $collation = 'Finnish_Swedish_CI_AI'; $firstletter = '9'; $alphabet = ''; Now I created the table for my test. A single field would do, and putting a Clustered Index on it would suffice for the Seeks. Invoke-Sqlcmd -server . -data tempdb -query "create table dbo.collation_test (col varchar(10) collate $collation primary key);" Now I get into the looping. $c = $firstletter; $stillgoing = $true; while ($stillgoing) { I construct the query I want, seeking for entries which start with whatever $c has reached, and get the plan for it: $query = "select col from dbo.collation_test where col like '$($c)%';"; [xml] $pl = get-sqlplan $query "." "tempdb"; At this point, my $pl variable is a scary piece of XML, representing the execution plan. A bit of hunting through it showed me that the EndRange element contained what I was after, and that if it contained NULL, then I was done. $stillgoing = ($pl.ShowPlanXML.BatchSequence.Batch.Statements.StmtSimple.QueryPlan.RelOp.IndexScan.SeekPredicates.SeekPredicateNew.SeekKeys.EndRange -ne $null); Now I could grab the value out of it (which came with apostrophes that needed stripping), and append that to my $alphabet variable.   if ($stillgoing)   {  $c=$pl.ShowPlanXML.BatchSequence.Batch.Statements.StmtSimple.QueryPlan.RelOp.IndexScan.SeekPredicates.SeekPredicateNew.SeekKeys.EndRange.RangeExpressions.ScalarOperator.ScalarString.Replace("'","");     $alphabet += $c;   } Finally, finishing the loop, dropping the table, and showing my alphabet! } Invoke-Sqlcmd -server . -data tempdb -query "drop table dbo.collation_test;"; $alphabet; When I run all this, I see that the Swedish alphabet is ABCDEFGHIJKLMNOPQRSTUVXYZÅÄÖ, which matches what I see at Wikipedia. Interesting to see that the letters on the end are still there, even with Case Insensitivity. Turns out they’re not just “letters with accents”, they’re letters in their own right. I’m sure you gave up reading long ago, and really aren’t that fazed about the idea of doing this using PowerShell. I chose PowerShell because I’d already come up with an easy way of grabbing the estimated plan for a query, and PowerShell does allow for easy navigation of XML. I find the most interesting aspect of this as the fact that the Query Optimizer uses the next letter of the alphabet to maintain the SARGability of LIKE. I’m hoping they do something similar for a whole bunch of operations. Oh, and the fact that you know how to find stuff in the IKEA catalogue. Footnote: If you are interested in whether this works in other languages, you might want to consider the following screenshot, which shows that in principle, it should work with Japanese. It might be a bit harder to run this in PowerShell though, as I’m not sure how it translates. In Hiragana, the Japanese alphabet starts ?, ?, ?, ?, ?, ...

    Read the article

  • Broken cups installation on a ubuntu server 64

    - by user67046
    Hi, I am having trouble with an cups installation. It seems to be in a broken state. When i try to reinstall it it stalls, the same if i try to remove it completely. I am running the server version 64 bit of Ubuntu 10.10 with kernel Linux version 2.6.35-22-server. When i try to start the cups daemon with the following command sudo service cups start It just stays there and nothing happens. I have tried to remove it, to be able to reinstall it, with the following command sudo apt-get purge cups It finally stalls with the following message Removing cups ... After that nothing happens. The process tree for the apt-get command looks like this. 1404 1404 1404 ? 00:00:00 sshd 26495 26495 26495 ? 00:00:00 sshd 26581 26495 26495 ? 00:00:00 sshd 26582 26582 26582 pts/4 00:00:00 bash 27158 27158 26582 pts/4 00:00:00 apt-get 27172 27172 27172 pts/2 00:00:00 dpkg 27176 27172 27172 pts/2 00:00:00 cups.prerm 27178 27172 27172 pts/2 00:00:00 stop I have tried to leave the process running for a while to see if i get any error messages but without success. To get out of it I have to kill the processes. sudo dpkg --configure cups dpkg: error processing cups (--configure): package cups is already installed and configured Errors were encountered while processing: cups sudo dpkg --status cups Package: cups Status: purge ok installed Priority: optional Section: net Installed-Size: 8292 Maintainer: Ubuntu Developers <[email protected]> Architecture: amd64 Version: 1.4.4-6ubuntu2.3 Replaces: cupsddk-drivers (<< 1.4.0) Provides: cupsddk-drivers Depends: libavahi-client3 (>= 0.6.16), libavahi-common3 (>= 0.6.16), libc6 (>= 2.7), libcups2 (>= 1.4.4-3~), libcupscgi1 (>= 1.4.2), libcupsdriver1 (>= 1.4.0), libcupsimage2 (>= 1.4.0), libcupsmime1 (>= 1.4.0), libcupsppdc1 (>= 1.4.0), libdbus-1-3 (>= 1.0.2), libgcc1 (>= 1:4.1.1), libgnutls26 (>= 2.7.14-0), libgssapi-krb5-2 (>= 1.8+dfsg), libijs-0.35, libkrb5-3 (>= 1.6.dfsg.2), libldap-2.4-2 (>= 2.4.7), libpam0g (>= 0.99.7.1), libpaper1, libpoppler7, libslp1, libstdc++6 (>= 4.1.1), libusb-0.1-4 (>= 2:0.1.12), zlib1g (>= 1:1.1.4), debconf (>= 1.2.9) | debconf-2.0, upstart-job, poppler-utils (>= 0.12), procps, ghostscript, lsb-base (>= 3), cups-common (>= 1.4.4), cups-client (>= 1.4.4-6ubuntu2.3), ssl-cert (>= 1.0.11), adduser, bc, ttf-freefont, cups-ppdc Recommends: foomatic-filters (>= 4.0), cups-driver-gutenprint, ghostscript-cups Suggests: cups-bsd, foomatic-db-compressed-ppds | foomatic-db, hplip, xpdf-korean | xpdf-japanese | xpdf-chinese-traditional | xpdf-chinese-simplified, cups-pdf, smbclient (>= 3.0.9), udev Breaks: foomatic-filters (<< 4.0) Conflicts: cupsddk-drivers (<< 1.4.0) Conffiles: /etc/fonts/conf.d/99pdftoopvp.conf a5221cfad70a981c80864229ef56586d /etc/logrotate.d/cups 5bb41fa9900f0d1c565954405a2bd7c4 /etc/default/cups 2b436fbb1a32b82b6aba45a76a1d7e40 /etc/pam.d/cups ff2488324854f7b1e892bb0df062d5f0 /etc/init/cups.conf 1a3cd022e8474e3d2b44640f33ce68e3 /etc/ufw/applications.d/cups 29e98a6d850da251e180c3d68dec2bd3 /etc/apparmor.d/usr.sbin.cupsd 60c4b26bfd5c033baa3dd48a3b2e9911 /etc/cups/cupsd.conf e2c7ec15835ea0939e5e86f7c6efcc03 /etc/cups/snmp.conf 2326a8af1e112676d55245bc5eb459ca /etc/cups/cupsd.conf.default a68d54d76021e857dd1d64edf57d36c5 Description: Common UNIX Printing System(tm) - server The Common UNIX Printing System (or CUPS(tm)) is a printing system and general replacement for lpd and the like. It supports the Internet Printing Protocol (IPP), and has its own filtering driver model for handling various document types. . This package provides the CUPS scheduler/daemon and related files. Original-Maintainer: Debian CUPS Maintainers <[email protected]> Would be greatful if someone could provide some help on how to solve this issue.

    Read the article

  • Using the BAM Interceptor with Continuation

    - by Charles Young
    Originally posted on: http://geekswithblogs.net/cyoung/archive/2014/06/02/using-the-bam-interceptor-with-continuation.aspxI’ve recently been resurrecting some code written several years ago that makes extensive use of the BAM Interceptor provided as part of BizTalk Server’s BAM event observation library.  In doing this, I noticed an issue with continuations.  Essentially, whenever I tried to configure one or more continuations for an activity, the BAM Interceptor failed to complete the activity correctly.   Careful inspection of my code confirmed that I was initializing and invoking the BAM interceptor correctly, so I was mystified.  However, I eventually found the problem.  It is a logical error in the BAM Interceptor code itself. The BAM Interceptor provides a useful mechanism for implementing dynamic tracking.  It supports configurable ‘track points’.  These are grouped into named ‘locations’.  BAM uses the term ‘step’ as a synonym for ‘location’.   Each track point defines a BAM action such as starting an activity, extracting a data item, enabling a continuation, etc.  Each step defines a collection of track points. Understanding Steps The BAM Interceptor provides an abstract model for handling configuration of steps.  It doesn’t, however, define any specific configuration mechanism (e.g., config files, SSO, etc.)  It is up to the developer to decide how to store, manage and retrieve configuration data.  At run time, this configuration is used to register track points which then drive the BAM Interceptor. The full semantics of a step are not immediately clear from Microsoft’s documentation.  They represent a point in a business activity where BAM tracking occurs.  They are named locations in the code.  What is less obvious is that they always represent either the full tracking work for a given activity or a discrete fragment of that work which commences with the start of a new activity or the continuation of an existing activity.  The BAM Interceptor enforces this by throwing an error if no ‘start new’ or ‘continue’ track point is registered for a named location. This constraint implies that each step must marked with an ‘end activity’ track point.  One of the peculiarities of BAM semantics is that when an activity is continued under a correlated ID, you must first mark the current activity as ‘ended’ in order to ensure the right housekeeping is done in the database.  If you re-start an ended activity under the same ID, you will leave the BAM import tables in an inconsistent state.  A step, therefore, always represents an entire unit of work for a given activity or continuation ID.  For activities with continuation, each unit of work is termed a ‘fragment’. Instance and Fragment State Internally, the BAM Interceptor maintains state data at two levels.  First, it represents the overall state of the activity using a ‘trace instance’ token.  This token contains the name and ID of the activity together with a couple of state flags.  The second level of state represents a ‘trace fragment’.   As we have seen, a fragment of an activity corresponds directly to the notion of a ‘step’.  It is the unit of work done at a named location, and it must be bounded by start and end, or continue and end, actions. When handling continuations, the BAM Interceptor differentiates between ‘root’ fragments and other fragments.  Very simply, a root fragment represents the start of an activity.  Other fragments represent continuations.  This is where the logic breaks down.  The BAM Interceptor loses state integrity for root fragments when continuations are defined. Initialization Microsoft’s BAM Interceptor code supports the initialization of BAM Interceptors from track point configuration data.  The process starts by populating an Activity Interceptor Configuration object with an array of track points.  These can belong to different steps (aka ‘locations’) and can be registered in any order.  Once it is populated with track points, the Activity Interceptor Configuration is used to initialise the BAM Interceptor.  The BAM Interceptor sets up a hash table of array lists.  Each step is represented by an array list, and each array list contains an ordered set of track points.  The BAM Interceptor represents track points as ‘executable’ components.  When the OnStep method of the BAM Interceptor is called for a given step, the corresponding list of track points is retrieved and each track point is executed in turn.  Each track point retrieves any required data using a call back mechanism and then serializes a BAM trace fragment object representing a specific action (e.g., start, update, enable continuation, stop, etc.).  The serialised trace fragment is then handed off to a BAM event stream (buffered or direct) which takes the appropriate action. The Root of the Problem The logic breaks down in the Activity Interceptor Configuration.  Each Activity Interceptor Configuration is initialised with an instance of a ‘trace instance’ token.  This provides the basic metadata for the activity as a whole.  It contains the activity name and ID together with state flags indicating if the activity ID is a root (i.e., not a continuation fragment) and if it is completed.  This single token is then shared by all trace actions for all steps registered with the Activity Interceptor Configuration. Each trace instance token is automatically initialised to represent a root fragment.  However, if you subsequently register a ‘continuation’ step with the Activity Interceptor Configuration, the ‘root’ flag is set to false at the point the ‘continue’ track point is registered for that step.   If you use a ‘reflector’ tool to inspect the code for the ActivityInterceptorConfiguration class, you can see the flag being set in one of the overloads of the RegisterContinue method.    This makes no sense.  The trace instance token is shared across all the track points registered with the Activity Interceptor Configuration.  The Activity Interceptor Configuration is designed to hold track points for multiple steps.  The ‘root’ flag is clearly meant to be initialised to ‘true’ for the preliminary root fragment and then subsequently set to false at the point that a continuation step is processed.  Instead, if the Activity Interceptor Configuration contains a continuation step, it is changed to ‘false’ before the root fragment is processed.  This is clearly an error in logic. The problem causes havoc when the BAM Interceptor is used with continuation.  Effectively the root step is no longer processed correctly, and the ultimate effect is that the continued activity never completes!   This has nothing to do with the root and the continuation being in the same process.  It is due to a fundamental mistake of setting the ‘root’ flag to false for a continuation before the root fragment is processed. The Workaround Fortunately, it is easy to work around the bug.  The trick is to ensure that you create a new Activity Interceptor Configuration object for each individual step.  This may mean filtering your configuration data to extract the track points for a single step or grouping the configured track points into individual steps and the creating a separate Activity Interceptor Configuration for each group.  In my case, the first approach was required.  Here is what the amended code looks like: // Because of a logic error in Microsoft's code, a separate ActivityInterceptorConfiguration must be used // for each location. The following code extracts only those track points for a given step name (location). var trackPointGroup = from ResolutionService.TrackPoint tp in bamActivity.TrackPoints                       where (string)tp.Location == bamStepName                       select tp; var bamActivityInterceptorConfig =     new Microsoft.BizTalk.Bam.EventObservation.ActivityInterceptorConfiguration(activityName); foreach (var trackPoint in trackPointGroup) {     switch (trackPoint.Type)     {         case TrackPointType.Start:             bamActivityInterceptorConfig.RegisterStartNew(trackPoint.Location, trackPoint.ExtractionInfo);             break; etc… I’m using LINQ to filter a list of track points for those entries that correspond to a given step and then registering only those track points on a new instance of the ActivityInterceptorConfiguration class.   As soon as I re-wrote the code to do this, activities with continuations started to complete correctly.

    Read the article

  • Fun with Aggregates

    - by Paul White
    There are interesting things to be learned from even the simplest queries.  For example, imagine you are given the task of writing a query to list AdventureWorks product names where the product has at least one entry in the transaction history table, but fewer than ten. One possible query to meet that specification is: SELECT p.Name FROM Production.Product AS p JOIN Production.TransactionHistory AS th ON p.ProductID = th.ProductID GROUP BY p.ProductID, p.Name HAVING COUNT_BIG(*) < 10; That query correctly returns 23 rows (execution plan and data sample shown below): The execution plan looks a bit different from the written form of the query: the base tables are accessed in reverse order, and the aggregation is performed before the join.  The general idea is to read all rows from the history table, compute the count of rows grouped by ProductID, merge join the results to the Product table on ProductID, and finally filter to only return rows where the count is less than ten. This ‘fully-optimized’ plan has an estimated cost of around 0.33 units.  The reason for the quote marks there is that this plan is not quite as optimal as it could be – surely it would make sense to push the Filter down past the join too?  To answer that, let’s look at some other ways to formulate this query.  This being SQL, there are any number of ways to write logically-equivalent query specifications, so we’ll just look at a couple of interesting ones.  The first query is an attempt to reverse-engineer T-SQL from the optimized query plan shown above.  It joins the result of pre-aggregating the history table to the Product table before filtering: SELECT p.Name FROM ( SELECT th.ProductID, cnt = COUNT_BIG(*) FROM Production.TransactionHistory AS th GROUP BY th.ProductID ) AS q1 JOIN Production.Product AS p ON p.ProductID = q1.ProductID WHERE q1.cnt < 10; Perhaps a little surprisingly, we get a slightly different execution plan: The results are the same (23 rows) but this time the Filter is pushed below the join!  The optimizer chooses nested loops for the join, because the cardinality estimate for rows passing the Filter is a bit low (estimate 1 versus 23 actual), though you can force a merge join with a hint and the Filter still appears below the join.  In yet another variation, the < 10 predicate can be ‘manually pushed’ by specifying it in a HAVING clause in the “q1” sub-query instead of in the WHERE clause as written above. The reason this predicate can be pushed past the join in this query form, but not in the original formulation is simply an optimizer limitation – it does make efforts (primarily during the simplification phase) to encourage logically-equivalent query specifications to produce the same execution plan, but the implementation is not completely comprehensive. Moving on to a second example, the following query specification results from phrasing the requirement as “list the products where there exists fewer than ten correlated rows in the history table”: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) < 10 ); Unfortunately, this query produces an incorrect result (86 rows): The problem is that it lists products with no history rows, though the reasons are interesting.  The COUNT_BIG(*) in the EXISTS clause is a scalar aggregate (meaning there is no GROUP BY clause) and scalar aggregates always produce a value, even when the input is an empty set.  In the case of the COUNT aggregate, the result of aggregating the empty set is zero (the other standard aggregates produce a NULL).  To make the point really clear, let’s look at product 709, which happens to be one for which no history rows exist: -- Scalar aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709;   -- Vector aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709 GROUP BY th.ProductID; The estimated execution plans for these two statements are almost identical: You might expect the Stream Aggregate to have a Group By for the second statement, but this is not the case.  The query includes an equality comparison to a constant value (709), so all qualified rows are guaranteed to have the same value for ProductID and the Group By is optimized away. In fact there are some minor differences between the two plans (the first is auto-parameterized and qualifies for trivial plan, whereas the second is not auto-parameterized and requires cost-based optimization), but there is nothing to indicate that one is a scalar aggregate and the other is a vector aggregate.  This is something I would like to see exposed in show plan so I suggested it on Connect.  Anyway, the results of running the two queries show the difference at runtime: The scalar aggregate (no GROUP BY) returns a result of zero, whereas the vector aggregate (with a GROUP BY clause) returns nothing at all.  Returning to our EXISTS query, we could ‘fix’ it by changing the HAVING clause to reject rows where the scalar aggregate returns zero: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) BETWEEN 1 AND 9 ); The query now returns the correct 23 rows: Unfortunately, the execution plan is less efficient now – it has an estimated cost of 0.78 compared to 0.33 for the earlier plans.  Let’s try adding a redundant GROUP BY instead of changing the HAVING clause: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY th.ProductID HAVING COUNT_BIG(*) < 10 ); Not only do we now get correct results (23 rows), this is the execution plan: I like to compare that plan to quantum physics: if you don’t find it shocking, you haven’t understood it properly :)  The simple addition of a redundant GROUP BY has resulted in the EXISTS form of the query being transformed into exactly the same optimal plan we found earlier.  What’s more, in SQL Server 2008 and later, we can replace the odd-looking GROUP BY with an explicit GROUP BY on the empty set: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ); I offer that as an alternative because some people find it more intuitive (and it perhaps has more geek value too).  Whichever way you prefer, it’s rather satisfying to note that the result of the sub-query does not exist for a particular correlated value where a vector aggregate is used (the scalar COUNT aggregate always returns a value, even if zero, so it always ‘EXISTS’ regardless which ProductID is logically being evaluated). The following query forms also produce the optimal plan and correct results, so long as a vector aggregate is used (you can probably find more equivalent query forms): WHERE Clause SELECT p.Name FROM Production.Product AS p WHERE ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) < 10; APPLY SELECT p.Name FROM Production.Product AS p CROSS APPLY ( SELECT NULL FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ) AS ca (dummy); FROM Clause SELECT q1.Name FROM ( SELECT p.Name, cnt = ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) FROM Production.Product AS p ) AS q1 WHERE q1.cnt < 10; This last example uses SUM(1) instead of COUNT and does not require a vector aggregate…you should be able to work out why :) SELECT q.Name FROM ( SELECT p.Name, cnt = ( SELECT SUM(1) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID ) FROM Production.Product AS p ) AS q WHERE q.cnt < 10; The semantics of SQL aggregates are rather odd in places.  It definitely pays to get to know the rules, and to be careful to check whether your queries are using scalar or vector aggregates.  As we have seen, query plans do not show in which ‘mode’ an aggregate is running and getting it wrong can cause poor performance, wrong results, or both. © 2012 Paul White Twitter: @SQL_Kiwi email: [email protected]

    Read the article

  • General monitoring for SQL Server Analysis Services using Performance Monitor

    - by Testas
    A recent customer engagement required a setup of a monitoring solution for SSAS, due to the time restrictions placed upon this, native Windows Performance Monitor (Perfmon) and SQL Server Profiler Monitoring Tools was used as using a third party tool would have meant the customer providing an additional monitoring server that was not available.I wanted to outline the performance monitoring counters that was used to monitor the system on which SSAS was running. Due to the slow query performance that was occurring during certain scenarios, perfmon was used to establish if any pressure was being placed on the Disk, CPU or Memory subsystem when concurrent connections access the same query, and Profiler to pinpoint how the query was being managed within SSAS, profiler I will leave for another blogThis guide is not designed to provide a definitive list of what should be used when monitoring SSAS, different situations may require the addition or removal of counters as presented by the situation. However I hope that it serves as a good basis for starting your monitoring of SSAS. I would also like to acknowledge Chris Webb’s awesome chapters from “Expert Cube Development” that also helped shape my monitoring strategy:http://cwebbbi.spaces.live.com/blog/cns!7B84B0F2C239489A!6657.entrySimulating ConnectionsTo simulate the additional connections to the SSAS server whilst monitoring, I used ascmd to simulate multiple connections to the typical and worse performing queries that were identified by the customer. A similar sript can be downloaded from codeplex at http://www.codeplex.com/SQLSrvAnalysisSrvcs.     File name: ASCMD_StressTestingScripts.zip. Performance MonitorWithin performance monitor,  a counter log was created that contained the list of counters below. The important point to note when running the counter log is that the RUN AS property within the counter log properties should be changed to an account that has rights to the SSAS instance when monitoring MSAS counters. Failure to do so means that the counter log runs under the system account, no errors or warning are given while running the counter log, and it is not until you need to view the MSAS counters that they will not be displayed if run under the default account that has no right to SSAS. If your connection simulation takes hours, this could prove quite frustrating if not done beforehand JThe counters used……  Object Counter Instance Justification System Processor Queue legnth N/A Indicates how many threads are waiting for execution against the processor. If this counter is consistently higher than around 5 when processor utilization approaches 100%, then this is a good indication that there is more work (active threads) available (ready for execution) than the machine's processors are able to handle. System Context Switches/sec N/A Measures how frequently the processor has to switch from user- to kernel-mode to handle a request from a thread running in user mode. The heavier the workload running on your machine, the higher this counter will generally be, but over long term the value of this counter should remain fairly constant. If this counter suddenly starts increasing however, it may be an indicating of a malfunctioning device, especially if the Processor\Interrupts/sec\(_Total) counter on your machine shows a similar unexplained increase Process % Processor Time sqlservr Definately should be used if Processor\% Processor Time\(_Total) is maxing at 100% to assess the effect of the SQL Server process on the processor Process % Processor Time msmdsrv Definately should be used if Processor\% Processor Time\(_Total) is maxing at 100% to assess the effect of the SQL Server process on the processor Process Working Set sqlservr If the Memory\Available bytes counter is decreaing this counter can be run to indicate if the process is consuming larger and larger amounts of RAM. Process(instance)\Working Set measures the size of the working set for each process, which indicates the number of allocated pages the process can address without generating a page fault. Process Working Set msmdsrv If the Memory\Available bytes counter is decreaing this counter can be run to indicate if the process is consuming larger and larger amounts of RAM. Process(instance)\Working Set measures the size of the working set for each process, which indicates the number of allocated pages the process can address without generating a page fault. Processor % Processor Time _Total and individual cores measures the total utilization of your processor by all running processes. If multi-proc then be mindful only an average is provided Processor % Privileged Time _Total To see how the OS is handling basic IO requests. If kernel mode utilization is high, your machine is likely underpowered as it's too busy handling basic OS housekeeping functions to be able to effectively run other applications. Processor % User Time _Total To see how the applications is interacting from a processor perspective, a high percentage utilisation determine that the server is dealing with too many apps and may require increasing thje hardware or scaling out Processor Interrupts/sec _Total  The average rate, in incidents per second, at which the processor received and serviced hardware interrupts. Shoulr be consistant over time but a sudden unexplained increase could indicate a device malfunction which can be confirmed using the System\Context Switches/sec counter Memory Pages/sec N/A Indicates the rate at which pages are read from or written to disk to resolve hard page faults. This counter is a primary indicator of the kinds of faults that cause system-wide delays, this is the primary counter to watch for indication of possible insufficient RAM to meet your server's needs. A good idea here is to configure a perfmon alert that triggers when the number of pages per second exceeds 50 per paging disk on your system. May also want to see the configuration of the page file on the Server Memory Available Mbytes N/A is the amount of physical memory, in bytes, available to processes running on the computer. if this counter is greater than 10% of the actual RAM in your machine then you probably have more than enough RAM. monitor it regularly to see if any downward trend develops, and set an alert to trigger if it drops below 2% of the installed RAM. Physical Disk Disk Transfers/sec for each physical disk If it goes above 10 disk I/Os per second then you've got poor response time for your disk. Physical Disk Idle Time _total If Disk Transfers/sec is above  25 disk I/Os per second use this counter. which measures the percent time that your hard disk is idle during the measurement interval, and if you see this counter fall below 20% then you've likely got read/write requests queuing up for your disk which is unable to service these requests in a timely fashion. Physical Disk Disk queue legnth For the OLAP and SQL physical disk A value that is consistently less than 2 means that the disk system is handling the IO requests against the physical disk Network Interface Bytes Total/sec For the NIC Should be monitored over a period of time to see if there is anb increase/decrease in network utilisation Network Interface Current Bandwidth For the NIC is an estimate of the current bandwidth of the network interface in bits per second (BPS). MSAS 2005: Memory Memory Limit High KB N/A Shows (as a percentage) the high memory limit configured for SSAS in C:\Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Config\msmdsrv.ini MSAS 2005: Memory Memory Limit Low KB N/A Shows (as a percentage) the low memory limit configured for SSAS in C:\Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Config\msmdsrv.ini MSAS 2005: Memory Memory Usage KB N/A Displays the memory usage of the server process. MSAS 2005: Memory File Store KB N/A Displays the amount of memory that is reserved for the Cache. Note if total memory limit in the msmdsrv.ini is set to 0, no memory is reserved for the cache MSAS 2005: Storage Engine Query Queries from Cache Direct / sec N/A Displays the rate of queries answered from the cache directly MSAS 2005: Storage Engine Query Queries from Cache Filtered / Sec N/A Displays the Rate of queries answered by filtering existing cache entry. MSAS 2005: Storage Engine Query Queries from File / Sec N/A Displays the Rate of queries answered from files. MSAS 2005: Storage Engine Query Average time /query N/A Displays the average time of a query MSAS 2005: Connection Current connections N/A Displays the number of connections against the SSAS instance MSAS 2005: Connection Requests / sec N/A Displays the rate of query requests per second MSAS 2005: Locks Current Lock Waits N/A Displays thhe number of connections waiting on a lock MSAS 2005: Threads Query Pool job queue Length N/A The number of queries in the job queue MSAS 2005:Proc Aggregations Temp file bytes written/sec N/A Shows the number of bytes of data processed in a temporary file MSAS 2005:Proc Aggregations Temp file rows written/sec N/A Shows the number of bytes of data processed in a temporary file 

    Read the article

  • Exchange Server 2007 Setup

    - by AlamedaDad
    Hi, I'm working on a upgrade to Exchange 2007 and I wanted to get some advise on hardware choices. We currently have an Exchange 2003 STD server with 400 users split between 6 AD Sites, that is housed on a single server. We need to move to a redundant, fault tolerant system to support our users. I'm planning on installing 2 Dell 1950 servers with W2k8-std to act as CAS and Hub servers, with NLB to allow abstraction of the actual server name to the users. There won't be an edge system since we have a Barracuda box already that will handle in/out spam/virus filtering. Backend I'm planning on 2 mailbox servers which will be Dell 2950s with 16GB RAM, 2 either dual-core or quad-core CPUs and 6 300GB SAS drives in some RAID config. These systems will be clustered using W2k8 Ent clustering and running CCR in Exchange. My questions are as follows: Is 16GB enough RAM for serving that many mailboxes along with the windows clustering and ccr? I'm trying to figure out disk layouts and I'm unsure of whether to use all local disk or some local and some SAN, via an OpenFiler iSCSI server. The SAN would be a Dell 2850 with 6 - 300GB SCSI drives and a PERC controller to slice as I want, with 8GB RAM. Option 1: 2 drives, RAID 1 - OS 2 drives, RAID 1 - Logs 2 drives, RAID 1 - Mail stores Option 2: 2 drives, RAID 1 - OS and logs 4 drives, RAID 5 - Mail Stores and scratch space for eseutil. Option 3: 2 drives, RAID 1 - OS 2 drives, RAID 1 - Logs 2 drives, RAID 0 - scratch space ~300GB iSCSI volume for mail stores Option 4: 2 drives, RAID 1 - OS 4 drives, RAID 5 - scratch space ~300GB iSCSI volume for mail stores ~300GB iSCSI volume for logs I have 2 sockets for CPUs and need to chose between dual and quad cores. The dual core have faster clocks but less cache and I'm thinking older architecture. Am I better off with more cores and cache while sacraficing clock speed? I am planning on adding the new E2K7 cluster to the E2K3 server and then move each mailbox over, all at once, then remove the old server. This seems more complicated than simply getting rid of the 2003 server and then adding the 2007 cluster and restoring the mailboxes using PowerControls or exmerge. The migration option lets me do this on my time, where a cutover means it all needs to work at once. If I go with the cutover method, how can I prebuild the servers and add them to the domain right after removing the 2003 server, or can't I? I think the answer is no and the migration is my only real option if I want to prebuild. I need to also migrate about 30GB of Public Folders. Is there anything special about this, other than specifying in the E2K7 install that I want older Outlook clients and PF's setup? I guess I could even keep the E2K3 server to host just the PFs? Lastly, if I have a mix of Outlook 200, 2003 and 2007 what do I need to do to make sure they all have access to the GAL and OAB? At time of cutover, we'll be at like 90% 2007, but we will have some older stuff around. My plan is to use Outlook Anywhere on laptops that are used outside the physical network. Are there any gotchas involved in that? I'm even thinking about using is for all Outlook clients, does anyone do that? The reason I'm considering it is that our WAN is really VPN tunnels over internet connections, so not a fully messhed, stable WAN. Thank you all very much for the assistance in advance and I look forward to discussion of these points! Regards...Michael

    Read the article

  • Of C# Iterators and Performance

    - by James Michael Hare
    Some of you reading this will be wondering, "what is an iterator" and think I'm locked in the world of C++.  Nope, I'm talking C# iterators.  No, not enumerators, iterators.   So, for those of you who do not know what iterators are in C#, I will explain it in summary, and for those of you who know what iterators are but are curious of the performance impacts, I will explore that as well.   Iterators have been around for a bit now, and there are still a bunch of people who don't know what they are or what they do.  I don't know how many times at work I've had a code review on my code and have someone ask me, "what's that yield word do?"   Basically, this post came to me as I was writing some extension methods to extend IEnumerable<T> -- I'll post some of the fun ones in a later post.  Since I was filtering the resulting list down, I was using the standard C# iterator concept; but that got me wondering: what are the performance implications of using an iterator versus returning a new enumeration?   So, to begin, let's look at a couple of methods.  This is a new (albeit contrived) method called Every(...).  The goal of this method is to access and enumeration and return every nth item in the enumeration (including the first).  So Every(2) would return items 0, 2, 4, 6, etc.   Now, if you wanted to write this in the traditional way, you may come up with something like this:       public static IEnumerable<T> Every<T>(this IEnumerable<T> list, int interval)     {         List<T> newList = new List<T>();         int count = 0;           foreach (var i in list)         {             if ((count++ % interval) == 0)             {                 newList.Add(i);             }         }           return newList;     }     So basically this method takes any IEnumerable<T> and returns a new IEnumerable<T> that contains every nth item.  Pretty straight forward.   The problem?  Well, Every<T>(...) will construct a list containing every nth item whether or not you care.  What happens if you were searching this result for a certain item and find that item after five tries?  You would have generated the rest of the list for nothing.   Enter iterators.  This C# construct uses the yield keyword to effectively defer evaluation of the next item until it is asked for.  This can be very handy if the evaluation itself is expensive or if there's a fair chance you'll never want to fully evaluate a list.   We see this all the time in Linq, where many expressions are chained together to do complex processing on a list.  This would be very expensive if each of these expressions evaluated their entire possible result set on call.    Let's look at the same example function, this time using an iterator:       public static IEnumerable<T> Every<T>(this IEnumerable<T> list, int interval)     {         int count = 0;         foreach (var i in list)         {             if ((count++ % interval) == 0)             {                 yield return i;             }         }     }   Notice it does not create a new return value explicitly, the only evidence of a return is the "yield return" statement.  What this means is that when an item is requested from the enumeration, it will enter this method and evaluate until it either hits a yield return (in which case that item is returned) or until it exits the method or hits a yield break (in which case the iteration ends.   Behind the scenes, this is all done with a class that the CLR creates behind the scenes that keeps track of the state of the iteration, so that every time the next item is asked for, it finds that item and then updates the current position so it knows where to start at next time.   It doesn't seem like a big deal, does it?  But keep in mind the key point here: it only returns items as they are requested. Thus if there's a good chance you will only process a portion of the return list and/or if the evaluation of each item is expensive, an iterator may be of benefit.   This is especially true if you intend your methods to be chainable similar to the way Linq methods can be chained.    For example, perhaps you have a List<int> and you want to take every tenth one until you find one greater than 10.  We could write that as:       List<int> someList = new List<int>();         // fill list here         someList.Every(10).TakeWhile(i => i <= 10);     Now is the difference more apparent?  If we use the first form of Every that makes a copy of the list.  It's going to copy the entire list whether we will need those items or not, that can be costly!    With the iterator version, however, it will only take items from the list until it finds one that is > 10, at which point no further items in the list are evaluated.   So, sounds neat eh?  But what's the cost is what you're probably wondering.  So I ran some tests using the two forms of Every above on lists varying from 5 to 500,000 integers and tried various things.    Now, iteration isn't free.  If you are more likely than not to iterate the entire collection every time, iterator has some very slight overhead:   Copy vs Iterator on 100% of Collection (10,000 iterations) Collection Size Num Iterated Type Total ms 5 5 Copy 5 5 5 Iterator 5 50 50 Copy 28 50 50 Iterator 27 500 500 Copy 227 500 500 Iterator 247 5000 5000 Copy 2266 5000 5000 Iterator 2444 50,000 50,000 Copy 24,443 50,000 50,000 Iterator 24,719 500,000 500,000 Copy 250,024 500,000 500,000 Iterator 251,521   Notice that when iterating over the entire produced list, the times for the iterator are a little better for smaller lists, then getting just a slight bit worse for larger lists.  In reality, given the number of items and iterations, the result is near negligible, but just to show that iterators come at a price.  However, it should also be noted that the form of Every that returns a copy will have a left-over collection to garbage collect.   However, if we only partially evaluate less and less through the list, the savings start to show and make it well worth the overhead.  Let's look at what happens if you stop looking after 80% of the list:   Copy vs Iterator on 80% of Collection (10,000 iterations) Collection Size Num Iterated Type Total ms 5 4 Copy 5 5 4 Iterator 5 50 40 Copy 27 50 40 Iterator 23 500 400 Copy 215 500 400 Iterator 200 5000 4000 Copy 2099 5000 4000 Iterator 1962 50,000 40,000 Copy 22,385 50,000 40,000 Iterator 19,599 500,000 400,000 Copy 236,427 500,000 400,000 Iterator 196,010       Notice that the iterator form is now operating quite a bit faster.  But the savings really add up if you stop on average at 50% (which most searches would typically do):     Copy vs Iterator on 50% of Collection (10,000 iterations) Collection Size Num Iterated Type Total ms 5 2 Copy 5 5 2 Iterator 4 50 25 Copy 25 50 25 Iterator 16 500 250 Copy 188 500 250 Iterator 126 5000 2500 Copy 1854 5000 2500 Iterator 1226 50,000 25,000 Copy 19,839 50,000 25,000 Iterator 12,233 500,000 250,000 Copy 208,667 500,000 250,000 Iterator 122,336   Now we see that if we only expect to go on average 50% into the results, we tend to shave off around 40% of the time.  And this is only for one level deep.  If we are using this in a chain of query expressions it only adds to the savings.   So my recommendation?  If you have a resonable expectation that someone may only want to partially consume your enumerable result, I would always tend to favor an iterator.  The cost if they iterate the whole thing does not add much at all -- and if they consume only partially, you reap some really good performance gains.   Next time I'll discuss some of my favorite extensions I've created to make development life a little easier and maintainability a little better.

    Read the article

  • Deferred rendering with VSM - Scaling light depth loses moments

    - by user1423893
    I'm calculating my shadow term using a VSM method. This works correctly when using forward rendered lights but fails with deferred lights. // Shadow term (1 = no shadow) float shadow = 1; // [Light Space -> Shadow Map Space] // Transform the surface into light space and project // NB: Could be done in the vertex shader, but doing it here keeps the // "light shader" abstraction and doesn't limit the number of shadowed lights float4x4 LightViewProjection = mul(LightView, LightProjection); float4 surf_tex = mul(position, LightViewProjection); // Re-homogenize // 'w' component is not used in later calculations so no need to homogenize (it will equal '1' if homogenized) surf_tex.xyz /= surf_tex.w; // Rescale viewport to be [0,1] (texture coordinate system) float2 shadow_tex; shadow_tex.x = surf_tex.x * 0.5f + 0.5f; shadow_tex.y = -surf_tex.y * 0.5f + 0.5f; // Half texel offset //shadow_tex += (0.5 / 512); // Scaled distance to light (instead of 'surf_tex.z') float rescaled_dist_to_light = dist_to_light / LightAttenuation.y; //float rescaled_dist_to_light = surf_tex.z; // [Variance Shadow Map Depth Calculation] // No filtering float2 moments = tex2D(ShadowSampler, shadow_tex).xy; // Flip the moments values to bring them back to their original values moments.x = 1.0 - moments.x; moments.y = 1.0 - moments.y; // Compute variance float E_x2 = moments.y; float Ex_2 = moments.x * moments.x; float variance = E_x2 - Ex_2; variance = max(variance, Bias.y); // Surface is fully lit if the current pixel is before the light occluder (lit_factor == 1) // One-tailed inequality valid if float lit_factor = (rescaled_dist_to_light <= moments.x - Bias.x); // Compute probabilistic upper bound (mean distance) float m_d = moments.x - rescaled_dist_to_light; // Chebychev's inequality float p = variance / (variance + m_d * m_d); p = ReduceLightBleeding(p, Bias.z); // Adjust the light color based on the shadow attenuation shadow *= max(lit_factor, p); This is what I know for certain so far: The lighting is correct if I do not try and calculate the shadow term. (No shadows) The shadow term is correct when calculated using forward rendered lighting. (VSM works with forward rendered lights) With the current rescaled light distance (lightAttenuation.y is the far plane value): float rescaled_dist_to_light = dist_to_light / LightAttenuation.y; The light is correct and the shadow appears to be zoomed in and misses the blurring: When I do not rescale the light and use the homogenized 'surf_tex': float rescaled_dist_to_light = surf_tex.z; the shadows are blurred correctly but the lighting is incorrect and the cube model is no longer lit Why is scaling by the far plane value (LightAttenuation.y) zooming in too far? The only other factor involved is my world pixel position, which is calculated as follows: // [Position] float4 position; // [Screen Position] position.xy = input.PositionClone.xy; // Use 'x' and 'y' components already homogenized for uv coordinates above position.z = tex2D(DepthSampler, texCoord).r; // No need to homogenize 'z' component position.z = 1.0 - position.z; position.w = 1.0; // 1.0 = position.w / position.w // [World Position] position = mul(position, CameraViewProjectionInverse); // Re-homogenize position (xyz AND w, otherwise shadows will bend when camera is close) position.xyz /= position.w; position.w = 1.0; Using the inverse matrix of the camera's view x projection matrix does work for lighting but maybe it is incorrect for shadow calculation? EDIT: Light calculations for shadow including 'dist_to_light' // Work out the light position and direction in world space float3 light_position = float3(LightViewInverse._41, LightViewInverse._42, LightViewInverse._43); // Direction might need to be negated float3 light_direction = float3(-LightViewInverse._31, -LightViewInverse._32, -LightViewInverse._33); // Unnormalized light vector float3 dir_to_light = light_position - position; // Direction from vertex float dist_to_light = length(dir_to_light); // Normalise 'toLight' vector for lighting calculations dir_to_light = normalize(dir_to_light); EDIT2: These are the calculations for the moments (depth) //============================================= //---[Vertex Shaders]-------------------------- //============================================= DepthVSOutput depth_VS( float4 Position : POSITION, uniform float4x4 shadow_view, uniform float4x4 shadow_view_projection) { DepthVSOutput output = (DepthVSOutput)0; // First transform position into world space float4 position_world = mul(Position, World); output.position_screen = mul(position_world, shadow_view_projection); output.light_vec = mul(position_world, shadow_view).xyz; return output; } //============================================= //---[Pixel Shaders]--------------------------- //============================================= DepthPSOutput depth_PS(DepthVSOutput input) { DepthPSOutput output = (DepthPSOutput)0; // Work out the depth of this fragment from the light, normalized to [0, 1] float2 depth; depth.x = length(input.light_vec) / FarPlane; depth.y = depth.x * depth.x; // Flip depth values to avoid floating point inaccuracies depth.x = 1.0f - depth.x; depth.y = 1.0f - depth.y; output.depth = depth.xyxy; return output; } EDIT 3: I have tried the folloiwng: float4 pp; pp.xy = input.PositionClone.xy; // Use 'x' and 'y' components already homogenized for uv coordinates above pp.z = tex2D(DepthSampler, texCoord).r; // No need to homogenize 'z' component pp.z = 1.0 - pp.z; pp.w = 1.0; // 1.0 = position.w / position.w // Determine the depth of the pixel with respect to the light float4x4 LightViewProjection = mul(LightView, LightProjection); float4x4 matViewToLightViewProj = mul(CameraViewProjectionInverse, LightViewProjection); float4 vPositionLightCS = mul(pp, matViewToLightViewProj); float fLightDepth = vPositionLightCS.z / vPositionLightCS.w; // Transform from light space to shadow map texture space. float2 vShadowTexCoord = 0.5 * vPositionLightCS.xy / vPositionLightCS.w + float2(0.5f, 0.5f); vShadowTexCoord.y = 1.0f - vShadowTexCoord.y; // Offset the coordinate by half a texel so we sample it correctly vShadowTexCoord += (0.5f / 512); //g_vShadowMapSize This suffers the same problem as the second picture. I have tried storing the depth based on the view x projection matrix: output.position_screen = mul(position_world, shadow_view_projection); //output.light_vec = mul(position_world, shadow_view); output.light_vec = output.position_screen; depth.x = input.light_vec.z / input.light_vec.w; This gives a shadow that has lots surface acne due to horrible floating point precision errors. Everything is lit correctly though. EDIT 4: Found an OpenGL based tutorial here I have followed it to the letter and it would seem that the uv coordinates for looking up the shadow map are incorrect. The source uses a scaled matrix to get the uv coordinates for the shadow map sampler /// <summary> /// The scale matrix is used to push the projected vertex into the 0.0 - 1.0 region. /// Similar in role to a * 0.5 + 0.5, where -1.0 < a < 1.0. /// <summary> const float4x4 ScaleMatrix = float4x4 ( 0.5, 0.0, 0.0, 0.0, 0.0, -0.5, 0.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.5, 0.5, 0.5, 1.0 ); I had to negate the 0.5 for the y scaling (M22) in order for it to work but the shadowing is still not correct. Is this really the correct way to scale? float2 shadow_tex; shadow_tex.x = surf_tex.x * 0.5f + 0.5f; shadow_tex.y = surf_tex.y * -0.5f + 0.5f; The depth calculations are exactly the same as the source code yet they still do not work, which makes me believe something about the uv calculation above is incorrect.

    Read the article

  • Try a sample: Using the counter predicate for event sampling

    - by extended_events
    Extended Events offers a rich filtering mechanism, called predicates, that allows you to reduce the number of events you collect by specifying criteria that will be applied during event collection. (You can find more information about predicates in Using SQL Server 2008 Extended Events (by Jonathan Kehayias)) By evaluating predicates early in the event firing sequence we can reduce the performance impact of collecting events by stopping event collection when the criteria are not met. You can specify predicates on both event fields and on a special object called a predicate source. Predicate sources are similar to action in that they typically are related to some type of global information available from the server. You will find that many of the actions available in Extended Events have equivalent predicate sources, but actions and predicates sources are not the same thing. Applying predicates, whether on a field or predicate source, is very similar to what you are used to in T-SQL in terms of how they work; you pick some field/source and compare it to a value, for example, session_id = 52. There is one predicate source that merits special attention though, not just for its special use, but for how the order of predicate evaluation impacts the behavior you see. I’m referring to the counter predicate source. The counter predicate source gives you a way to sample a subset of events that otherwise meet the criteria of the predicate; for example you could collect every other event, or only every tenth event. Simple CountingThe counter predicate source works by creating an in memory counter that increments every time the predicate statement is evaluated. Here is a simple example with my favorite event, sql_statement_completed, that only collects the second statement that is run. (OK, that’s not much of a sample, but this is for demonstration purposes. Here is the session definition: CREATE EVENT SESSION counter_test ON SERVERADD EVENT sqlserver.sql_statement_completed    (ACTION (sqlserver.sql_text)    WHERE package0.counter = 2)ADD TARGET package0.ring_bufferWITH (MAX_DISPATCH_LATENCY = 1 SECONDS) You can find general information about the session DDL syntax in BOL and from Pedro’s post Introduction to Extended Events. The important part here is the WHERE statement that defines that I only what the event where package0.count = 2; in other words, only the second instance of the event. Notice that I need to provide the package name along with the predicate source. You don’t need to provide the package name if you’re using event fields, only for predicate sources. Let’s say I run the following test queries: -- Run three statements to test the sessionSELECT 'This is the first statement'GOSELECT 'This is the second statement'GOSELECT 'This is the third statement';GO Once you return the event data from the ring buffer and parse the XML (see my earlier post on reading event data) you should see something like this: event_name sql_text sql_statement_completed SELECT ‘This is the second statement’ You can see that only the second statement from the test was actually collected. (Feel free to try this yourself. Check out what happens if you remove the WHERE statement from your session. Go ahead, I’ll wait.) Percentage Sampling OK, so that wasn’t particularly interesting, but you can probably see that this could be interesting, for example, lets say I need a 25% sample of the statements executed on my server for some type of QA analysis, that might be more interesting than just the second statement. All comparisons of predicates are handled using an object called a predicate comparator; the simple comparisons such as equals, greater than, etc. are mapped to the common mathematical symbols you know and love (eg. = and >), but to do the less common comparisons you will need to use the predicate comparators directly. You would probably look to the MOD operation to do this type sampling; we would too, but we don’t call it MOD, we call it divides_by_uint64. This comparator evaluates whether one number is divisible by another with no remainder. The general syntax for using a predicate comparator is pred_comp(field, value), field is always first and value is always second. So lets take a look at how the session changes to answer our new question of 25% sampling: CREATE EVENT SESSION counter_test_25 ON SERVERADD EVENT sqlserver.sql_statement_completed    (ACTION (sqlserver.sql_text)    WHERE package0.divides_by_uint64(package0.counter,4))ADD TARGET package0.ring_bufferWITH (MAX_DISPATCH_LATENCY = 1 SECONDS)GO Here I’ve replaced the simple equivalency check with the divides_by_uint64 comparator to check if the counter is evenly divisible by 4, which gives us back every fourth record. I’ll leave it as an exercise for the reader to test this session. Why order matters I indicated at the start of this post that order matters when it comes to the counter predicate – it does. Like most other predicate systems, Extended Events evaluates the predicate statement from left to right; as soon as the predicate statement is proven false we abandon evaluation of the remainder of the statement. The counter predicate source is only incremented when it is evaluated so whether or not the counter is incremented will depend on where it is in the predicate statement and whether a previous criteria made the predicate false or not. Here is a generic example: Pred1: (WHERE statement_1 AND package0.counter = 2)Pred2: (WHERE package0.counter = 2 AND statement_1) Let’s say I cause a number of events as follows and examine what happens to the counter predicate source. Iteration Statement Pred1 Counter Pred2 Counter A Not statement_1 0 1 B statement_1 1 2 C Not statement_1 1 3 D statement_1 2 4 As you can see, in the case of Pred1, statement_1 is evaluated first, when it fails (A & C) predicate evaluation is stopped and the counter is not incremented. With Pred2 the counter is evaluated first, so it is incremented on every iteration of the event and the remaining parts of the predicate are then evaluated. In this example, Pred1 would return an event for D while Pred2 would return an event for B. But wait, there is an interesting side-effect here; consider Pred2 if I had run my statements in the following order: Not statement_1 Not statement_1 statement_1 statement_1 In this case I would never get an event back from the system because the point at which counter=2, the rest of the predicate evaluates as false so the event is not returned. If you’re using the counter target for sampling and you’re not getting the expected events, or any events, check the order of the predicate criteria. As a general rule I’d suggest that the counter criteria should be the last element of your predicate statement since that will assure that your sampling rate will apply to the set of event records defined by the rest of your predicate. Aside: I’m interested in hearing about uses for putting the counter predicate criteria earlier in the predicate statement. If you have one, post it in a comment to share with the class. - Mike Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

    Read the article

  • C# Extension Methods - To Extend or Not To Extend...

    - by James Michael Hare
    I've been thinking a lot about extension methods lately, and I must admit I both love them and hate them. They are a lot like sugar, they taste so nice and sweet, but they'll rot your teeth if you eat them too much.   I can't deny that they aren't useful and very handy. One of the major components of the Shared Component library where I work is a set of useful extension methods. But, I also can't deny that they tend to be overused and abused to willy-nilly extend every living type.   So what constitutes a good extension method? Obviously, you can write an extension method for nearly anything whether it is a good idea or not. Many times, in fact, an idea seems like a good extension method but in retrospect really doesn't fit.   So what's the litmus test? To me, an extension method should be like in the movies when a person runs into their twin, separated at birth. You just know you're related. Obviously, that's hard to quantify, so let's try to put a few rules-of-thumb around them.   A good extension method should:     Apply to any possible instance of the type it extends.     Simplify logic and improve readability/maintainability.     Apply to the most specific type or interface applicable.     Be isolated in a namespace so that it does not pollute IntelliSense.     So let's look at a few examples in relation to these rules.   The first rule, to me, is the most important of all. Once again, it bears repeating, a good extension method should apply to all possible instances of the type it extends. It should feel like the long lost relative that should have been included in the original class but somehow was missing from the family tree.    Take this nifty little int extension, I saw this once in a blog and at first I really thought it was pretty cool, but then I started noticing a code smell I couldn't quite put my finger on. So let's look:       public static class IntExtensinos     {         public static int Seconds(int num)         {             return num * 1000;         }           public static int Minutes(int num)         {             return num * 60000;         }     }     This is so you could do things like:       ...     Thread.Sleep(5.Seconds());     ...     proxy.Timeout = 1.Minutes();     ...     Awww, you say, that's cute! Well, that's the problem, it's kitschy and it doesn't always apply (and incidentally you could achieve the same thing with TimeStamp.FromSeconds(5)). It's syntactical candy that looks cool, but tends to rot and pollute the code. It would allow things like:       total += numberOfTodaysOrders.Seconds();     which makes no sense and should never be allowed. The problem is you're applying an extension method to a logical domain, not a type domain. That is, the extension method Seconds() doesn't really apply to ALL ints, it applies to ints that are representative of time that you want to convert to milliseconds.    Do you see what I mean? The two problems, in a nutshell, are that a) Seconds() called off a non-time value makes no sense and b) calling Seconds() off something to pass to something that does not take milliseconds will be off by a factor of 1000 or worse.   Thus, in my mind, you should only ever have an extension method that applies to the whole domain of that type.   For example, this is one of my personal favorites:       public static bool IsBetween<T>(this T value, T low, T high)         where T : IComparable<T>     {         return value.CompareTo(low) >= 0 && value.CompareTo(high) <= 0;     }   This allows you to check if any IComparable<T> is within an upper and lower bound. Think of how many times you type something like:       if (response.Employee.Address.YearsAt >= 2         && response.Employee.Address.YearsAt <= 10)     {     ...     }     Now, you can instead type:       if(response.Employee.Address.YearsAt.IsBetween(2, 10))     {     ...     }     Note that this applies to all IComparable<T> -- that's ints, chars, strings, DateTime, etc -- and does not depend on any logical domain. In addition, it satisfies the second point and actually makes the code more readable and maintainable.   Let's look at the third point. In it we said that an extension method should fit the most specific interface or type possible. Now, I'm not saying if you have something that applies to enumerables, you create an extension for List, Array, Dictionary, etc (though you may have reasons for doing so), but that you should beware of making things TOO general.   For example, let's say we had an extension method like this:       public static T ConvertTo<T>(this object value)     {         return (T)Convert.ChangeType(value, typeof(T));     }         This lets you do more fluent conversions like:       double d = "5.0".ConvertTo<double>();     However, if you dig into Reflector (LOVE that tool) you will see that if the type you are calling on does not implement IConvertible, what you convert to MUST be the exact type or it will throw an InvalidCastException. Now this may or may not be what you want in this situation, and I leave that up to you. Things like this would fail:       object value = new Employee();     ...     // class cast exception because typeof(IEmployee) != typeof(Employee)     IEmployee emp = value.ConvertTo<IEmployee>();       Yes, that's a downfall of working with Convertible in general, but if you wanted your fluent interface to be more type-safe so that ConvertTo were only callable on IConvertibles (and let casting be a manual task), you could easily make it:         public static T ConvertTo<T>(this IConvertible value)     {         return (T)Convert.ChangeType(value, typeof(T));     }         This is what I mean by choosing the best type to extend. Consider that if we used the previous (object) version, every time we typed a dot ('.') on an instance we'd pull up ConvertTo() whether it was applicable or not. By filtering our extension method down to only valid types (those that implement IConvertible) we greatly reduce our IntelliSense pollution and apply a good level of compile-time correctness.   Now my fourth rule is just my general rule-of-thumb. Obviously, you can make extension methods as in-your-face as you want. I included all mine in my work libraries in its own sub-namespace, something akin to:       namespace Shared.Core.Extensions { ... }     This is in a library called Shared.Core, so just referencing the Core library doesn't pollute your IntelliSense, you have to actually do a using on Shared.Core.Extensions to bring the methods in. This is very similar to the way Microsoft puts its extension methods in System.Linq. This way, if you want 'em, you use the appropriate namespace. If you don't want 'em, they won't pollute your namespace.   To really make this work, however, that namespace should only include extension methods and subordinate types those extensions themselves may use. If you plant other useful classes in those namespaces, once a user includes it, they get all the extensions too.   Also, just as a personal preference, extension methods that aren't simply syntactical shortcuts, I like to put in a static utility class and then have extension methods for syntactical candy. For instance, I think it imaginable that any object could be converted to XML:       namespace Shared.Core     {         // A collection of XML Utility classes         public static class XmlUtility         {             ...             // Serialize an object into an xml string             public static string ToXml(object input)             {                 var xs = new XmlSerializer(input.GetType());                   // use new UTF8Encoding here, not Encoding.UTF8. The later includes                 // the BOM which screws up subsequent reads, the former does not.                 using (var memoryStream = new MemoryStream())                 using (var xmlTextWriter = new XmlTextWriter(memoryStream, new UTF8Encoding()))                 {                     xs.Serialize(xmlTextWriter, input);                     return Encoding.UTF8.GetString(memoryStream.ToArray());                 }             }             ...         }     }   I also wanted to be able to call this from an object like:       value.ToXml();     But here's the problem, if i made this an extension method from the start with that one little keyword "this", it would pop into IntelliSense for all objects which could be very polluting. Instead, I put the logic into a utility class so that users have the choice of whether or not they want to use it as just a class and not pollute IntelliSense, then in my extensions namespace, I add the syntactical candy:       namespace Shared.Core.Extensions     {         public static class XmlExtensions         {             public static string ToXml(this object value)             {                 return XmlUtility.ToXml(value);             }         }     }   So now it's the best of both worlds. On one hand, they can use the utility class if they don't want to pollute IntelliSense, and on the other hand they can include the Extensions namespace and use as an extension if they want. The neat thing is it also adheres to the Single Responsibility Principle. The XmlUtility is responsible for converting objects to XML, and the XmlExtensions is responsible for extending object's interface for ToXml().

    Read the article

  • 14+ WordPress Portfolio Themes

    - by Edward
    There are various portfolio themes for WordPress out there, with this collection we are trying to help you choose the best one. These themes can be used to create any type of personal, photography, art or corporate portfolio. Display 3 in 1 Display 3 in 1 – Business & Portfolio WordPress Theme. Features a fantastic 3D Image slideshow that can be controlled from your backend with a custom tool. The Theme has a huge wordpress custom backend (8 additional Admin Pages) that make customization of the Theme easy for those who dont know much about coding or wordpress. Price: $40 View Demo Download DeepFocus Tempting features such as automatic separation of blog and portfolio content by template, publishing of most important information on homepage, styles to choose from and many more such features. It also provides for page templates for blog, portfolio, blog archive, tags etc. It has the best feature that helps you to manage everything from one place. Price: $39 (Package includes more than 55 themes) View Demo Download SimplePress Simple, yet awesome. One of the best portfolio theme. Price: $39 (Package includes more than 55 themes) View Demo Download Graphix Graphix is one of best word press portfolio themes. It is most suited to aspiring designers, developers, artists and photographers who’d like a framework theme, which has a great-looking portfolio with a feature-rich blog. It has theme option page, 5-color style, SEO option, featured content blocks, drop down multi-level menu, social profile link custom widgets, custom post, custom page template etc. Price: $69 Single & $149 Developer Package View Demo Download Bizznizz It boasts of many features such as custom homepage, custom post types, custom widgets, portfolio templates, alternative styles and many more. View Demo Download Showtime Ultimate WordPress Theme for you to create your web portfolio, It has 3 different styles for you to choose from. Price: $40 View Demo Download Montana WP Horizontal Portfolio Theme Montana Theme – WP Horizontal Portfolio Theme, best suited for creative studios to showcase design, photography, illustration, paintings and art. Price: $30 View Demo Download OverALL OverALL Premium WordPress Blog & Portfolio Theme, is low priced & has amazing tons of features. Price: $17 View Demo Download Habitat Habitat – Blog and Portfolio Theme. Unique Portfolio Sorting/Filtering with a custom jQuery script (each entry supports multiple images or a video) Multiple Featured Images for each post to generate individual Slideshows per Post, or the option to directly embed video content from youtube, vimeo, hulu etc. Price: $35 View Demo Download Fresh Folio Fresh Folio from WooThemes, can be used as both portfolio and a premium WordPress theme. The theme is a remix of the Fresh News Theme and Proud Folio Theme which combines all the best elements of the respective blog and portfolio style themes. View Demo Download Fresh Folio Features: Can be used to create an impressive portfolio. 7 diverse theme styles to choose from (default, blue, red, grunge light, grunge floral, antique, blue creamer, nightlife) The template will automatically (visually) separate your blog & portfolio content, making this an amazing theme for aspiring designers, developers, artists, photographers etc. Unique page templates types for the portfolio, blog, blog archives, tags & search results. Integrated Theme Options (for WordPress) to tweak the layout, colour scheme etc. for the theme Optional Automatic Image Resize, which is used to dynamically create the thumbnails and featured images Includes Widget enabled Sidebars. eGallery eGallery is a theme made to transform your wordpress blog into a fully functional online portfolio. Theme is perfectly designed to emphasize the artwork you choose to showcase. The design has been greatly enhanced using javascript, and is easy to implement. Price: $39 (Package includes more than 55 themes) View Demo Download ProudFolio ProudFolio is a portfolio premium WordPress theme from Woo Themes. The theme is for designers, developers, artists and photographers who would like a showcase theme which would depict as a portfolio and also serves a purpose of blog. ProudFolio puts a strong emphasis on the portfolio pieces, allowing for decent-sized thumbnails, huge fullscreen views via Lightbox, and full details on the single page. The theme file also contains a choice of three different background images and color schemes. Price: $70 Single $150 Developer License View Demo Download Features: The template will automatically (visually) separate your blog & portfolio content. An unique homepage layout, which publishes only the most important information; Unique page templates for the portfolio, blog, blog archives, tags & search results. Integrated Theme Options (for WordPress) to tweak the layout, colour scheme etc. for the theme; Built-in video panel, which you can use to publish any web-based Flash videos; Automatic Image Resize, which is used to dynamically create the thumbnails and featured images; Custom Page Templates for Archives, Sitemap & Image Gallery; Built-in Gravatar Support for Authors & Comments; Integrated Banner Management script to display randomized banner ads of your choice site-wide; Pretty drop down navigation everywhere; and Widget Enabled Sidebars. Porftolio WordPress Theme A FREE wordpress theme designed for web portfolios and (for now) just for web portfolios. It is coming with an Administrative Panel from where you can edit the head quote text, you can edit all theme colors, font families, font sizes and you can fill a curriculum vitae and display it into a special page. Theme demo and download can be found here Viz | Biz Viz | Biz is a premium WordPress photo gallery and portfolio theme designed specifically for photographers, graphic designers and web designers who want to display their creative work online, market their services, as well as have a typical text blog, using the power and flexibility of WordPress. It is priced for $79.95. Theme Features: Premium quality portfolio template Custom logo uploader to replace the standard graphic with your own unique look from the WP Dashboard Integrated blog component (front images are custom fields and thumbnails, but you can also have a typical blog) Four tabbed feature areas (About Me, Services, Recent Posts, and Tags) Two home page feature photos (You choose which photos to feature using a WP category) Manage your online portfolio through the WordPress CMS Crop two sizes of your work: One for the front page thumbnails and another full size version and upload to WP Search engine optimized. Related posts:14 WordPress Photo Blog & Portfolio Themes 6 PhotoBlog Portfolio WordPress Themes Professional WordPress Business Themes

    Read the article

  • More Fun with C# Iterators and Generators

    - by James Michael Hare
    In my last post, I talked quite a bit about iterators and how they can be really powerful tools for filtering a list of items down to a subset of items.  This had both pros and cons over returning a full collection, which, in summary, were:   Pros: If traversal is only partial, does not have to visit rest of collection. If evaluation method is costly, only incurs that cost on elements visited. Adds little to no garbage collection pressure.    Cons: Very slight performance impact if you know caller will always consume all items in collection. And as we saw in the last post, that con for the cost was very, very small and only really became evident on very tight loops consuming very large lists completely.    One of the key items to note, though, is the garbage!  In the traditional (return a new collection) method, if you have a 1,000,000 element collection, and wish to transform or filter it in some way, you have to allocate space for that copy of the collection.  That is, say you have a collection of 1,000,000 items and you want to double every item in the collection.  Well, that means you have to allocate a collection to hold those 1,000,000 items to return, which is a lot especially if you are just going to use it once and toss it.   Iterators, though, don't have this problem.  Each time you visit the node, it would return the doubled value of the node (in this example) and not allocate a second collection of 1,000,000 doubled items.  Do you see the distinction?  In both cases, we're consuming 1,000,000 items.  But in one case we pass back each doubled item which is just an int (for example's sake) on the stack and in the other case, we allocate a list containing 1,000,000 items which then must be garbage collected.   So iterators in C# are pretty cool, eh?  Well, here's one more thing a C# iterator can do that a traditional "return a new collection" transformation can't!   It can return **unbounded** collections!   I know, I know, that smells a lot like an infinite loop, eh?  Yes and no.  Basically, you're relying on the caller to put the bounds on the list, and as long as the caller doesn't you keep going.  Consider this example:   public static class Fibonacci {     // returns the infinite fibonacci sequence     public static IEnumerable<int> Sequence()     {         int iteration = 0;         int first = 1;         int second = 1;         int current = 0;         while (true)         {             if (iteration++ < 2)             {                 current = 1;             }             else             {                 current = first + second;                 second = first;                 first = current;             }             yield return current;         }     } }   Whoa, you say!  Yes, that's an infinite loop!  What the heck is going on there?  Yes, that was intentional.  Would it be better to have a fibonacci sequence that returns only a specific number of items?  Perhaps, but that wouldn't give you the power to defer the execution to the caller.   The beauty of this function is it is as infinite as the sequence itself!  The fibonacci sequence is unbounded, and so is this method.  It will continue to return fibonacci numbers for as long as you ask for them.  Now that's not something you can do with a traditional method that would return a collection of ints representing each number.  In that case you would eventually run out of memory as you got to higher and higher numbers.  This method, though, never runs out of memory.   Now, that said, you do have to know when you use it that it is an infinite collection and bound it appropriately.  Fortunately, Linq provides a lot of these extension methods for you!   Let's say you only want the first 10 fibonacci numbers:       foreach(var fib in Fibonacci.Sequence().Take(10))     {         Console.WriteLine(fib);     }   Or let's say you only want the fibonacci numbers that are less than 100:       foreach(var fib in Fibonacci.Sequence().TakeWhile(f => f < 100))     {         Console.WriteLine(fib);     }   So, you see, one of the nice things about iterators is their power to work with virtually any size (even infinite) collections without adding the garbage collection overhead of making new collections.    You can also do fun things like this to make a more "fluent" interface for for loops:   // A set of integer generator extension methods public static class IntExtensions {     // Begins counting to inifity, use To() to range this.     public static IEnumerable<int> Every(this int start)     {         // deliberately avoiding condition because keeps going         // to infinity for as long as values are pulled.         for (var i = start; ; ++i)         {             yield return i;         }     }     // Begins counting to infinity by the given step value, use To() to     public static IEnumerable<int> Every(this int start, int byEvery)     {         // deliberately avoiding condition because keeps going         // to infinity for as long as values are pulled.         for (var i = start; ; i += byEvery)         {             yield return i;         }     }     // Begins counting to inifity, use To() to range this.     public static IEnumerable<int> To(this int start, int end)     {         for (var i = start; i <= end; ++i)         {             yield return i;         }     }     // Ranges the count by specifying the upper range of the count.     public static IEnumerable<int> To(this IEnumerable<int> collection, int end)     {         return collection.TakeWhile(item => item <= end);     } }   Note that there are two versions of each method.  One that starts with an int and one that starts with an IEnumerable<int>.  This is to allow more power in chaining from either an existing collection or from an int.  This lets you do things like:   // count from 1 to 30 foreach(var i in 1.To(30)) {     Console.WriteLine(i); }     // count from 1 to 10 by 2s foreach(var i in 0.Every(2).To(10)) {     Console.WriteLine(i); }     // or, if you want an infinite sequence counting by 5s until something inside breaks you out... foreach(var i in 0.Every(5)) {     if (someCondition)     {         break;     }     ... }     Yes, those are kinda play functions and not particularly useful, but they show some of the power of generators and extension methods to form a fluid interface.   So what do you think?  What are some of your favorite generators and iterators?

    Read the article

  • simple and reliable centralized logging inside Amazon VPC

    - by Nakedible
    I need to set up centralized logging for a set of servers (10-20) in an Amazon VPC. The logging should be as to not lose any log messages in case any single server goes offline - or in the case that an entire availability zone goes offline. It should also tolerate packet loss and other normal network conditions without losing or duplicating messages. It should store the messages durably, at the minimum on two different EBS volumes in two availability zones, but S3 is a good place as well. It should also be realtime so that the messages arrive within seconds of their generation to two different availability zones. I also need to sync logfiles not generated via syslog, so a syslog-only centralized logging solution would not fulfill all the needs, although I guess that limitation could be worked around. I have already reviewed a few solutions, and I will list them here: Flume to Flume to S3: I could set up two logservers as Flume hosts which would store log messages either locally or in S3, and configure all the servers with Flume to send all messages to both servers, using the end-to-end reliability options. That way the loss of a single server shouldn't cause lost messages and all messages would arrive in two availability zones in realtime. However, there would need to be some way to join the logs of the two servers, deduplicating all the messages delivered to both. This could be done by adding a unique id on the sending side to each message and then write some manual deduplication runs on the logfiles. I haven't found an easy solution to the duplication problem. Logstash to Logstash to ElasticSearch: I could install Logstash on the servers and have them deliver to a central server via AMQP, with the durability options turned on. However, for this to work I would need to use some of the clustering capable AMQP implementations, or fan out the deliver just as in the Flume case. AMQP seems to be a yet another moving part with several implementations and no real guidance on what works best this sort of setup. And I'm not entirely convinced that I could get actual end-to-end durability from logstash to elasticsearch, assuming crashing servers in between. The fan-out solutions run in to the deduplication problem again. The best solution that would seem to handle all the cases, would be Beetle, which seems to provide high availability and deduplication via a redis store. However, I haven't seen any guidance on how to set this up with Logstash and Redis is one more moving part again for something that shouldn't be terribly difficult. Logstash to ElasticSearch: I could run Logstash on all the servers, have all the filtering and processing rules in the servers themselves and just have them log directly to a removet ElasticSearch server. I think this should bring me reliable logging and I can use the ElasticSearch clustering features to share the database transparently. However, I am not sure if the setup actually survives Logstash restarts and intermittent network problems without duplicating messages in a failover case or similar. But this approach sounds pretty promising. rsync: I could just rsync all the relevant log files to two different servers. The reliability aspect should be perfect here, as the files should be identical to the source files after a sync is done. However, doing an rsync several times per second doesn't sound fun. Also, I need the logs to be untamperable after they have been sent, so the rsyncs would need to be in append-only mode. And log rotations mess things up unless I'm careful. rsyslog with RELP: I could set up rsyslog to send messages to two remote hosts via RELP and have a local queue to store the messages. There is the deduplication problem again, and RELP itself might also duplicate some messages. However, this would only handle the things that log via syslog. None of these solutions seem terribly good, and they have many unknowns still, so I am asking for more information here from people who have set up centralized reliable logging as to what are the best tools to achieve that goal.

    Read the article

  • What's new in Solaris 11.1?

    - by Karoly Vegh
    Solaris 11.1 is released. This is the first release update since Solaris 11 11/11, the versioning has been changed from MM/YY style to 11.1 highlighting that this is Solaris 11 Update 1.  Solaris 11 itself has been great. What's new in Solaris 11.1? Allow me to pick some new features from the What's New PDF that can be found in the official Oracle Solaris 11.1 Documentation. The updates are very numerous, I really can't include all.  I. New AI Automated Installer RBAC profiles have been introduced to enable delegation of installation tasks. II. The interactive installer now supports installing the OS to iSCSI targets. III. ASR (Auto Service Request) and OCM (Oracle Configuration Manager) have been enabled by default to proactively provide support information and create service requests to speed up support processes. This is optional and can be disabled but helps a lot in supportcases. For further information, see: http://oracle.com/goto/solarisautoreg IV. The new command svcbundle helps you to create SMF manifests without having to struggle with XML editing. (btw, do you know the interactive editprop subcommand in svccfg? The listprop/setprop subcommands are great for scripting and automating, but for an interactive property editing session try, for example, this: svccfg -s svc:/application/pkg/system-repository:default editprop )  V. pfedit: Ever wondered how to delegate editing permissions to certain files? It is well known "sudo /usr/bin/vi /etc/hosts" is not the right way, for sudo elevates the complete vi process to admin levels, and the user can "break" out of the session as root with simply starting a shell from that vi. Now, the new pfedit command provides a solution exactly to this challenge - an auditable, secure, per-user configurable editing possibility. See the pfedit man page for examples.   VI. rsyslog, the popular logging daemon (filters, SSL, formattable output, SQL collect...) has been included in Solaris 11.1 as an alternative to syslog.  VII: Zones: Solaris Zones - as a major Solaris differentiator - got lots of love in terms of new features: ZOSS - Zones on Shared Storage: Placing your zones to shared storage (FC, iSCSI) has never been this easy - via zonecfg.  parallell updates - with S11's bootenvironments updating zones was no problem and meant no downtime anyway, but still, now you can update them parallelly, a way faster update action if you are running a large number of zones. This is like parallell patching in Solaris 10, but with all the IPS/ZFS/S11 goodness.  per-zone fstype statistics: Running zones on a shared filesystems complicate the I/O debugging, since ZFS collects all the random writes and delivers them sequentially to boost performance. Now, over kstat you can find out which zone's I/O has an impact on the other ones, see the examples in the documentation: http://docs.oracle.com/cd/E26502_01/html/E29024/gmheh.html#scrolltoc Zones got RDSv3 protocol support for InfiniBand, and IPoIB support with Crossbow's anet (automatic vnic creation) feature.  NUMA I/O support for Zones: customers can now determine the NUMA I/O topology of the system from within zones.  VIII: Security got a lot of attention too:  Automated security/audit reporting, with builtin reporting templates e.g. for PCI (payment card industry) audits.  PAM is now configureable on a per-user basis instead of system wide, allowing different authentication requirements for different users  SSH in Solaris 11.1 now supports running in FIPS 140-2 mode, that is, in a U.S. government security accredited fashion.  SHA512/224 and SHA512/256 cryptographic hash functions are implemented in a FIPS-compliant way - and on a T4 implemented in silicon! That is, goverment-approved cryptography at HW-speed.  Generally, Solaris is currently under evaluation to be both FIPS and Common Criteria certified.  IX. Networking, as one of the core strengths of Solaris 11, has been extended with:  Data Center Bridging (DCB) - not only setups where network and storage share the same fabric (FCoE, anyone?) can have Quality-of-Service requirements. DCB enables peers to distinguish traffic based on priorities. Your NICs have to support DCB, see the documentation, and additional information on Wikipedia. DataLink MultiPathing, DLMP, enables link aggregation to span across multiple switches, even between those of different vendors. But there are essential differences to the good old bandwidth-aggregating LACP, see the documentation: http://docs.oracle.com/cd/E26502_01/html/E28993/gmdlu.html#scrolltoc VNIC live migration is now supported from one physical NIC to another on-the-fly  X. Data management:  FedFS, (Federated FileSystem) is new, it relies on Solaris 11's NFS referring mechanism to join separate shares of different NFS servers into a single filesystem namespace. The referring system has been there since S11 11/11, in Solaris 11.1 FedFS uses a LDAP - as the one global nameservice to bind them all.  The iSCSI initiator now uses the T4 CPU's HW-implemented CRC32 algorithm - thus improving iSCSI throughput while reducing CPU utilization on a T4 Storage locking improvements are now RAC aware, speeding up throughput with better locking-communication between nodes up to 20%!  XI: Kernel performance optimizations: The new Virtual Memory subsystem ("VM2") scales now to 100+ TB Memory ranges.  The memory predictor monitors large memory page usage, and adjust memory page sizes to applications' needs OSM, the Optimized Shared Memory allows Oracle DBs' SGA to be resized online XII: The Power Aware Dispatcher in now by default enabled, reducing power consumption of idle CPUs. Also, the LDoms' Power Management policies and the poweradm settings in Solaris 11 OS will cooperate. XIII: x86 boot: upgrade to the (Grand Unified Bootloader) GRUB2. Because grub2 differs in the configuration syntactically from grub1, one shall not edit the new grub configuration (grub.cfg) but use the new bootadm features to update it. GRUB2 adds UEFI support and also support for disks over 2TB. XIV: Improved viewing of per-CPU statistics of mpstat. This one might seem of less importance at first, but nowadays having better sorting/filtering possibilities on a periodically updated mpstat output of 256+ vCPUs can be a blessing. XV: Support for Solaris Cluster 4.1: The What's New document doesn't actually mention this one, since OSC 4.1 has not been released at the time 11.1 was. But since then it is available, and it requires Solaris 11.1. And it's only a "pkg update" away. ...aand I seriously need to stop here. There's a lot I missed, Edge Virtual Bridging, lofi tuning, ZFS sharing and crypto enhancements, USB3.0, pulseaudio, trusted extensions updates, etc - but if I mention all those then I effectively copy the What's New document. Which I recommend reading now anyway, it is a great extract of the 300+ new projects and RFE-followups in S11.1. And this blogpost is a summary of that extract.  For closing words, allow me to come back to Request For Enhancements, RFEs. Any customer can request features. Open up a Support Request, explain that this is an RFE, describe the feature you/your company desires to have in S11 implemented. The more SRs are collected for an RFE, the more chance it's got to get implemented. Feel free to provide feedback about the product, as well as about the Solaris 11.1 Documentation using the "Feedback" button there. Both the Solaris engineers and the documentation writers are eager to hear your input.Feel free to comment about this post too. Except that it's too long ;)  wbr,charlie

    Read the article

  • Open GL stars are not rendering

    - by Darestium
    I doing Nehe's Open GL Lesson 9. I'm using SFML for windowing, the strange thing is no stars are rendering. #include <SFML/System.hpp> #include <SFML/Window.hpp> #include <SFML/Graphics.hpp> #include <iostream> void processEvents(sf::Window *app); void processInput(sf::Window *app); void renderGlScene(sf::Window *app); void init(); int loadResources(); const int NUM_OF_STARS = 50; float triRot = 0.0f; float quadRot = 0.0f; bool twinkle = false; bool tKey = false; float zoom = 15.0f; float tilt = 90.0f; float spin = 0.0f; unsigned int loop; unsigned int texture_handle[1]; typedef struct { int r, g, b; float distance; float angle; } stars; stars star[NUM_OF_STARS]; int main() { sf::Window app(sf::VideoMode(800, 600, 32), "Nehe Lesson 9"); app.UseVerticalSync(false); init(); if (loadResources() == -1) { return EXIT_FAILURE; } while (app.IsOpened()) { processEvents(&app); processInput(&app); renderGlScene(&app); app.Display(); } return EXIT_SUCCESS; } int loadResources() { sf::Image img_data; // Load Texture if (!img_data.LoadFromFile("data/images/star.bmp")) { std::cout << "Could not load data/images/star.bmp"; return -1; } // Generate 1 texture glGenTextures(1, &texture_handle[0]); // Linear filtering glBindTexture(GL_TEXTURE_2D, texture_handle[0]); glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR); glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR); glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, img_data.GetWidth(), img_data.GetHeight(), 0, GL_RGBA, GL_UNSIGNED_BYTE, img_data.GetPixelsPtr()); return 0; } void processInput(sf::Window *app) { const sf::Input& input = app->GetInput(); if (input.IsKeyDown(sf::Key::T) && !tKey) { tKey = true; twinkle = !twinkle; } if (!input.IsKeyDown(sf::Key::T)) { tKey = false; } if (input.IsKeyDown(sf::Key::Up)) { tilt -= 0.05f; } if (input.IsKeyDown(sf::Key::Down)) { tilt += 0.05f; } if (input.IsKeyDown(sf::Key::PageUp)) { zoom -= 0.02f; } if (input.IsKeyDown(sf::Key::Up)) { zoom += 0.02f; } } void init() { glClearDepth(1.f); glClearColor(0.f, 0.f, 0.f, 0.f); // Enable texturing glEnable(GL_TEXTURE_2D); //glDepthMask(GL_TRUE); // Setup a perpective projection glMatrixMode(GL_PROJECTION); glLoadIdentity(); gluPerspective(45.f, 1.f, 1.f, 500.f); glShadeModel(GL_SMOOTH); glBlendFunc(GL_SRC_ALPHA, GL_ONE); glEnable(GL_BLEND); for (loop = 0; loop < NUM_OF_STARS; loop++) { star[loop].distance = (float)loop / NUM_OF_STARS * 5.0f; // Calculate distance from the centre // Give stars random rgb value star[loop].r = rand() % 256; star[loop].g = rand() % 256; star[loop].b = rand() % 256; } } void processEvents(sf::Window *app) { sf::Event event; while (app->GetEvent(event)) { if (event.Type == sf::Event::Closed) { app->Close(); } if (event.Type == sf::Event::KeyPressed && event.Key.Code == sf::Key::Escape) { app->Close(); } } } void renderGlScene(sf::Window *app) { app->SetActive(); // Clear color depth buffer glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT); // Apply some transformations glMatrixMode(GL_MODELVIEW); glLoadIdentity(); // Select texture glBindTexture(GL_TEXTURE_2D, texture_handle[0]); for (loop = 0; loop < NUM_OF_STARS; loop++) { glLoadIdentity(); // Reset The View Before We Draw Each Star glTranslatef(0.0f, 0.0f, zoom); // Zoom Into The Screen (Using The Value In 'zoom') glRotatef(tilt, 1.0f, 0.0f, 0.0f); // Tilt The View (Using The Value In 'tilt') glRotatef(star[loop].angle, 0.0f, 1.0f, 0.0f); // Rotate To The Current Stars Angle glTranslatef(star[loop].distance, 0.0f, 0.0f); // Move Forward On The X Plane glRotatef(-star[loop].angle,0.0f,1.0f,0.0f); // Cancel The Current Stars Angle glRotatef(-tilt,1.0f,0.0f,0.0f); // Cancel The Screen Tilt if (twinkle) { glColor4ub(star[(NUM_OF_STARS - loop) - 1].r, star[(NUM_OF_STARS - loop)-1].g, star[(NUM_OF_STARS - loop) - 1].b, 255); glBegin(GL_QUADS); // Begin Drawing The Textured Quad glTexCoord2f(0.0f, 0.0f); glVertex3f(-1.0f, -1.0f, 0.0f); glTexCoord2f(1.0f, 0.0f); glVertex3f( 1.0f, -1.0f, 0.0f); glTexCoord2f(1.0f, 1.0f); glVertex3f( 1.0f, 1.0f, 0.0f); glTexCoord2f(0.0f, 1.0f); glVertex3f(-1.0f, 1.0f, 0.0f); glEnd(); // Done Drawing The Textured Quad } glRotatef(spin,0.0f,0.0f,1.0f); // Rotate The Star On The Z Axis // Assign A Color Using Bytes glColor4ub(star[loop].r, star[loop].g, star[loop].b, 255); glBegin(GL_QUADS); // Begin Drawing The Textured Quad glTexCoord2f(0.0f, 0.0f); glVertex3f(-1.0f,-1.0f, 0.0f); glTexCoord2f(1.0f, 0.0f); glVertex3f( 1.0f,-1.0f, 0.0f); glTexCoord2f(1.0f, 1.0f); glVertex3f( 1.0f, 1.0f, 0.0f); glTexCoord2f(0.0f, 1.0f); glVertex3f(-1.0f, 1.0f, 0.0f); glEnd(); // Done Drawing The Textured Quad spin += 0.01f; // Used To Spin The Stars star[loop].angle += (float)loop / NUM_OF_STARS; // Changes The Angle Of A Star star[loop].distance -= 0.01f; // Changes The Distance Of A Star if (star[loop].distance < 0.0f) { star[loop].distance += 5.0f; // Move The Star 5 Units From The Center star[loop].r = rand() % 256; // Give It A New Red Value star[loop].g = rand() % 256; // Give It A New Green Value star[loop].b = rand() % 256; // Give It A New Blue Value } } } I've looked over the code atleast 10 times now and I can't figure out the problem. Any help would be much appreciated.

    Read the article

  • Fun with Aggregates

    - by Paul White
    There are interesting things to be learned from even the simplest queries.  For example, imagine you are given the task of writing a query to list AdventureWorks product names where the product has at least one entry in the transaction history table, but fewer than ten. One possible query to meet that specification is: SELECT p.Name FROM Production.Product AS p JOIN Production.TransactionHistory AS th ON p.ProductID = th.ProductID GROUP BY p.ProductID, p.Name HAVING COUNT_BIG(*) < 10; That query correctly returns 23 rows (execution plan and data sample shown below): The execution plan looks a bit different from the written form of the query: the base tables are accessed in reverse order, and the aggregation is performed before the join.  The general idea is to read all rows from the history table, compute the count of rows grouped by ProductID, merge join the results to the Product table on ProductID, and finally filter to only return rows where the count is less than ten. This ‘fully-optimized’ plan has an estimated cost of around 0.33 units.  The reason for the quote marks there is that this plan is not quite as optimal as it could be – surely it would make sense to push the Filter down past the join too?  To answer that, let’s look at some other ways to formulate this query.  This being SQL, there are any number of ways to write logically-equivalent query specifications, so we’ll just look at a couple of interesting ones.  The first query is an attempt to reverse-engineer T-SQL from the optimized query plan shown above.  It joins the result of pre-aggregating the history table to the Product table before filtering: SELECT p.Name FROM ( SELECT th.ProductID, cnt = COUNT_BIG(*) FROM Production.TransactionHistory AS th GROUP BY th.ProductID ) AS q1 JOIN Production.Product AS p ON p.ProductID = q1.ProductID WHERE q1.cnt < 10; Perhaps a little surprisingly, we get a slightly different execution plan: The results are the same (23 rows) but this time the Filter is pushed below the join!  The optimizer chooses nested loops for the join, because the cardinality estimate for rows passing the Filter is a bit low (estimate 1 versus 23 actual), though you can force a merge join with a hint and the Filter still appears below the join.  In yet another variation, the < 10 predicate can be ‘manually pushed’ by specifying it in a HAVING clause in the “q1” sub-query instead of in the WHERE clause as written above. The reason this predicate can be pushed past the join in this query form, but not in the original formulation is simply an optimizer limitation – it does make efforts (primarily during the simplification phase) to encourage logically-equivalent query specifications to produce the same execution plan, but the implementation is not completely comprehensive. Moving on to a second example, the following query specification results from phrasing the requirement as “list the products where there exists fewer than ten correlated rows in the history table”: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) < 10 ); Unfortunately, this query produces an incorrect result (86 rows): The problem is that it lists products with no history rows, though the reasons are interesting.  The COUNT_BIG(*) in the EXISTS clause is a scalar aggregate (meaning there is no GROUP BY clause) and scalar aggregates always produce a value, even when the input is an empty set.  In the case of the COUNT aggregate, the result of aggregating the empty set is zero (the other standard aggregates produce a NULL).  To make the point really clear, let’s look at product 709, which happens to be one for which no history rows exist: -- Scalar aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709;   -- Vector aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709 GROUP BY th.ProductID; The estimated execution plans for these two statements are almost identical: You might expect the Stream Aggregate to have a Group By for the second statement, but this is not the case.  The query includes an equality comparison to a constant value (709), so all qualified rows are guaranteed to have the same value for ProductID and the Group By is optimized away. In fact there are some minor differences between the two plans (the first is auto-parameterized and qualifies for trivial plan, whereas the second is not auto-parameterized and requires cost-based optimization), but there is nothing to indicate that one is a scalar aggregate and the other is a vector aggregate.  This is something I would like to see exposed in show plan so I suggested it on Connect.  Anyway, the results of running the two queries show the difference at runtime: The scalar aggregate (no GROUP BY) returns a result of zero, whereas the vector aggregate (with a GROUP BY clause) returns nothing at all.  Returning to our EXISTS query, we could ‘fix’ it by changing the HAVING clause to reject rows where the scalar aggregate returns zero: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) BETWEEN 1 AND 9 ); The query now returns the correct 23 rows: Unfortunately, the execution plan is less efficient now – it has an estimated cost of 0.78 compared to 0.33 for the earlier plans.  Let’s try adding a redundant GROUP BY instead of changing the HAVING clause: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY th.ProductID HAVING COUNT_BIG(*) < 10 ); Not only do we now get correct results (23 rows), this is the execution plan: I like to compare that plan to quantum physics: if you don’t find it shocking, you haven’t understood it properly :)  The simple addition of a redundant GROUP BY has resulted in the EXISTS form of the query being transformed into exactly the same optimal plan we found earlier.  What’s more, in SQL Server 2008 and later, we can replace the odd-looking GROUP BY with an explicit GROUP BY on the empty set: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ); I offer that as an alternative because some people find it more intuitive (and it perhaps has more geek value too).  Whichever way you prefer, it’s rather satisfying to note that the result of the sub-query does not exist for a particular correlated value where a vector aggregate is used (the scalar COUNT aggregate always returns a value, even if zero, so it always ‘EXISTS’ regardless which ProductID is logically being evaluated). The following query forms also produce the optimal plan and correct results, so long as a vector aggregate is used (you can probably find more equivalent query forms): WHERE Clause SELECT p.Name FROM Production.Product AS p WHERE ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) < 10; APPLY SELECT p.Name FROM Production.Product AS p CROSS APPLY ( SELECT NULL FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ) AS ca (dummy); FROM Clause SELECT q1.Name FROM ( SELECT p.Name, cnt = ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) FROM Production.Product AS p ) AS q1 WHERE q1.cnt < 10; This last example uses SUM(1) instead of COUNT and does not require a vector aggregate…you should be able to work out why :) SELECT q.Name FROM ( SELECT p.Name, cnt = ( SELECT SUM(1) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID ) FROM Production.Product AS p ) AS q WHERE q.cnt < 10; The semantics of SQL aggregates are rather odd in places.  It definitely pays to get to know the rules, and to be careful to check whether your queries are using scalar or vector aggregates.  As we have seen, query plans do not show in which ‘mode’ an aggregate is running and getting it wrong can cause poor performance, wrong results, or both. © 2012 Paul White Twitter: @SQL_Kiwi email: [email protected]

    Read the article

  • Troubleshooting Website problems within the local network

    - by HaydnWVN
    Have an external website which opens fine on some PC's, yet seems to time out (or symptoms of timing out, but never actually does) on others. Seems to only affect (some) of our newer HP Pro 3305 MT Workstations. All of which are running Win7 32bit SP1 with all updates. Older PC's (Win7 32bit SP1 & WinXP) are unaffected. Using Google Chrome & Firefox makes no difference. Opening the website in IE9 Compatibility Mode has exactly the same symptoms. All PC's are on the same local network (Workgroup) using the same DNS server & gateway (inhouse) on the same internet connection, on the same subnet. There is no proxy server, no content filtering, no load balancing etc etc. Only group policy in effect (locally) is for Update scheduling. Local firewalls are all the same (Kaspersky WP4) and our external facing firewall has no IP specific settings. I have no control over the external website, traceroute shows the same destination on all PC's. It is a fairly popular website in our industry (Horticulture) and i'm not aware of any other people (even other sites within our sister companies) with the same problem. Update: Used Fiddler2 to monitor the HTTP request, seems its not getting fulfilled for some reason?! Request sent: GET http://www.rhs.org.uk/ HTTP/1.1 Host: www.rhs.org.uk Connection: keep-alive User-Agent: Mozilla/5.0 (Windows NT 6.1) AppleWebKit/536.11 (KHTML, like Gecko) Chrome/20.0.1132.47 Safari/536.11 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 Accept-Encoding: gzip,deflate,sdch Accept-Language: en-GB,en-US;q=0.8,en;q=0.6 Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3 Log from Fiddler 2 of the request: This session is not yet complete. Press F5 to refresh when session is complete for updated statistics. Request Count: 1 Bytes Sent: 567 (headers:567; body:0) Bytes Received: 0 (headers:0; body:0) ACTUAL PERFORMANCE -------------- ClientConnected: 17:02:33.720 ClientBeginRequest: 17:02:39.118 GotRequestHeaders: 17:02:39.118 ClientDoneRequest: 17:02:39.118 Determine Gateway: 0ms DNS Lookup: 0ms TCP/IP Connect: 46ms HTTPS Handshake: 0ms ServerConnected: 17:02:39.165 FiddlerBeginRequest: 17:02:39.165 ServerGotRequest: 17:02:39.165 ServerBeginResponse: 00:00:00.000 GotResponseHeaders: 00:00:00.000 ServerDoneResponse: 00:00:00.000 ClientBeginResponse: 00:00:00.000 ClientDoneResponse: 00:00:00.000 RESPONSE BYTES (by Content-Type) -------------- ~headers~: 0 Log of a successful request from a working PC (done this morning, excuse the timestamps being different from above): Request Count: 1 Bytes Sent: 493 (headers:493; body:0) Bytes Received: 20,413 (headers:525; body:19,888) ACTUAL PERFORMANCE -------------- ClientConnected: 08:22:47.766 ClientBeginRequest: 08:22:47.766 GotRequestHeaders: 08:22:47.766 ClientDoneRequest: 08:22:47.766 Determine Gateway: 0ms DNS Lookup: 26ms TCP/IP Connect: 30ms HTTPS Handshake: 0ms ServerConnected: 08:22:47.828 FiddlerBeginRequest: 08:22:47.828 ServerGotRequest: 08:22:47.828 ServerBeginResponse: 08:22:48.905 GotResponseHeaders: 08:22:48.905 ServerDoneResponse: 08:22:48.905 ClientBeginResponse: 08:22:48.905 ClientDoneResponse: 08:22:48.905 Overall Elapsed: 00:00:01.1388020 RESPONSE BYTES (by Content-Type) -------------- text/html: 19,888 ~headers~: 525 So my question has evolved into: What is the difference between the 2 requests and how do I determine why 1 PC is not getting a reply to it's GET request?

    Read the article

  • CPU Utilization LAMP stack

    - by Max
    We've got an ec2 m2.4xlarge running Magento (centos 5.6, httpd 2.2, php 5.2.17 with eaccelerator 0.9.5.3, mysql 5.1.52). Right now we're getting a large traffic spike, and our top looks like this: top - 09:41:29 up 31 days, 1:12, 1 user, load average: 120.01, 129.03, 113.23 Tasks: 1190 total, 18 running, 1172 sleeping, 0 stopped, 0 zombie Cpu(s): 97.3%us, 1.8%sy, 0.0%ni, 0.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.4%st Mem: 71687720k total, 36898928k used, 34788792k free, 49692k buffers Swap: 880737784k total, 0k used, 880737784k free, 1586524k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 2433 mysql 15 0 23.6g 4.5g 7112 S 564.7 6.6 33607:34 mysqld 24046 apache 16 0 411m 65m 28m S 26.4 0.1 0:09.05 httpd 24360 apache 15 0 410m 60m 25m S 26.4 0.1 0:03.65 httpd 24993 apache 16 0 410m 57m 21m S 26.1 0.1 0:01.41 httpd 24838 apache 16 0 428m 74m 20m S 24.8 0.1 0:02.37 httpd 24359 apache 16 0 411m 62m 26m R 22.3 0.1 0:08.12 httpd 23850 apache 15 0 411m 64m 27m S 16.8 0.1 0:14.54 httpd 25229 apache 16 0 404m 46m 17m R 10.2 0.1 0:00.71 httpd 14594 apache 15 0 404m 63m 34m S 8.4 0.1 1:10.26 httpd 24955 apache 16 0 404m 50m 21m R 8.4 0.1 0:01.66 httpd 24313 apache 16 0 399m 46m 22m R 8.1 0.1 0:02.30 httpd 25119 apache 16 0 411m 59m 23m S 6.8 0.1 0:01.45 httpd Questions: Would giving msyqld more memory help it cache queries and react faster? If so, how? Other than splitting mysql and php to separate servers (which we're about to do) is there anything else we could/should be doing? Thanks! UPDATE: Here's our my.cnf along with the output of mysqltuner. It looks like a cache problem. Thanks again! # cat /etc/my.cnf [client] port = **** socket = /var/lib/mysql/mysql.sock [mysqld] datadir=/mnt/persistent/mysql port=**** socket=/var/lib/mysql/mysql.sock key_buffer = 512M max_allowed_packet = 64M table_cache = 1024 sort_buffer_size = 8M read_buffer_size = 4M read_rnd_buffer_size = 2M myisam_sort_buffer_size = 64M thread_cache_size = 128M tmp_table_size = 128M join_buffer_size = 1M query_cache_limit = 2M query_cache_size= 64M query_cache_type = 1 max_connections = 1000 thread_stack = 128K thread_concurrency = 48 log-bin=mysql-bin server-id = 1 wait_timeout = 300 innodb_data_home_dir = /mnt/persistent/mysql/ innodb_data_file_path = ibdata1:10M:autoextend innodb_buffer_pool_size = 20G innodb_additional_mem_pool_size = 20M innodb_log_file_size = 64M innodb_log_buffer_size = 8M innodb_flush_log_at_trx_commit = 1 innodb_lock_wait_timeout = 50 innodb_thread_concurrency = 48 ft_min_word_len=3 [myisamchk] ft_min_word_len=3 key_buffer = 128M sort_buffer_size = 128M read_buffer = 2M write_buffer = 2M # ./mysqltuner.pl >> MySQLTuner 1.2.0 - Major Hayden <[email protected]> >> Bug reports, feature requests, and downloads at http://mysqltuner.com/ >> Run with '--help' for additional options and output filtering -------- General Statistics -------------------------------------------------- [--] Skipped version check for MySQLTuner script [OK] Currently running supported MySQL version 5.1.52-log [OK] Operating on 64-bit architecture -------- Storage Engine Statistics ------------------------------------------- [--] Status: +Archive -BDB +Federated +InnoDB -ISAM -NDBCluster [--] Data in MyISAM tables: 2G (Tables: 26) [--] Data in InnoDB tables: 749M (Tables: 250) [!!] Total fragmented tables: 262 -------- Security Recommendations ------------------------------------------- -------- Performance Metrics ------------------------------------------------- [--] Up for: 31d 2h 30m 38s (680M q [253.371 qps], 2M conn, TX: 4825B, RX: 236B) [--] Reads / Writes: 89% / 11% [--] Total buffers: 20.6G global + 15.1M per thread (1000 max threads) [OK] Maximum possible memory usage: 35.4G (51% of installed RAM) [OK] Slow queries: 0% (35K/680M) [OK] Highest usage of available connections: 53% (537/1000) [OK] Key buffer size / total MyISAM indexes: 512.0M/457.2M [OK] Key buffer hit rate: 100.0% (9B cached / 264K reads) [OK] Query cache efficiency: 42.3% (260M cached / 615M selects) [!!] Query cache prunes per day: 4384652 [OK] Sorts requiring temporary tables: 0% (1K temp sorts / 38M sorts) [!!] Joins performed without indexes: 100404 [OK] Temporary tables created on disk: 17% (7M on disk / 45M total) [OK] Thread cache hit rate: 99% (537 created / 2M connections) [!!] Table cache hit rate: 0% (1K open / 946K opened) [OK] Open file limit used: 9% (453/5K) [OK] Table locks acquired immediately: 99% (758M immediate / 758M locks) [OK] InnoDB data size / buffer pool: 749.3M/20.0G -------- Recommendations ----------------------------------------------------- General recommendations: Run OPTIMIZE TABLE to defragment tables for better performance Enable the slow query log to troubleshoot bad queries Adjust your join queries to always utilize indexes Increase table_cache gradually to avoid file descriptor limits Variables to adjust: query_cache_size (> 64M) join_buffer_size (> 1.0M, or always use indexes with joins) table_cache (> 1024)

    Read the article

  • Metrics - A little knowledge can be a dangerous thing (or 'Why you're not clever enough to interpret metrics data')

    - by Jason Crease
    At RedGate Software, I work on a .NET obfuscator  called SmartAssembly.  Various features of it use a database to store various things (exception reports, name-mappings, etc.) The user is given the option of using either a SQL-Server database (which requires them to have Microsoft SQL Server), or a Microsoft Access MDB file (which requires nothing). MDB is the default option, but power-users soon switch to using a SQL Server database because it offers better performance and data-sharing. In the fashionable spirit of optimization and metrics, an obvious product-management question is 'Which is the most popular? SQL Server or MDB?' We've collected data about this fact, using our 'Feature-Usage-Reporting' technology (available as part of SmartAssembly) and more recently our 'Application Metrics' technology: Parameter Number of users % of total users Number of sessions Number of usages SQL Server 28 19.0 8115 8115 MDB 114 77.6 1449 1449 (As a disclaimer, please note than SmartAssembly has far more than 132 users . This data is just a selection of one build) So, it would appear that SQL-Server is used by fewer users, but more often. Great. But here's why these numbers are useless to me: Only the original developers understand the data What does a single 'usage' of 'MDB' mean? Does this happen once per run? Once per option change? On clicking the 'Obfuscate Now' button? When running the command-line version or just from the UI version? Each question could skew the data 10-fold either way, and the answers only known by the developer that instrumented the application in the first place. In other words, only the original developer can interpret the data - product-managers cannot interpret the data unaided. Most of the data is from uninterested users About half of people who download and run a free-trial from the internet quit it almost immediately. Only a small fraction use it sufficiently to make informed choices. Since the MDB option is the default one, we don't know how many of those 114 were people CHOOSING to use the MDB, or how many were JUST HAPPENING to use this MDB default for their 20-second trial. This is a problem we see across all our metrics: Are people are using X because it's the default or are they using X because they want to use X? We need to segment the data further - asking what percentage of each percentage meet our criteria for an 'established user' or 'informed user'. You end up spending hours writing sophisticated and dubious SQL queries to segment the data further. Not fun. You can't find out why they used this feature Metrics can answer the when and what, but not the why. Why did people use feature X? If you're anything like me, you often click on random buttons in unfamiliar applications just to explore the feature-set. If we listened uncritically to metrics at RedGate, we would eliminate the most-important and more-complex features which people actually buy the software for, leaving just big buttons on the main page and the About-Box. "Ah, that's interesting!" rather than "Ah, that's actionable!" People do love data. Did you know you eat 1201 chickens in a lifetime? But just 4 cows? Interesting, but useless. Often metrics give you a nice number: '5.8% of users have 3 or more monitors' . But unless the statistic is both SUPRISING and ACTIONABLE, it's useless. Most metrics are collected, reviewed with lots of cooing. and then forgotten. Unless a piece-of-data could change things, it's useless collecting it. People get obsessed with significance levels The first things that lots of people do with this data is do a t-test to get a significance level ("Hey! We know with 99.64% confidence that people prefer SQL Server to MDBs!") Believe me: other causes of error/misinterpretation in your data are FAR more significant than your t-test could ever comprehend. Confirmation bias prevents objectivity If the data appears to match our instinct, we feel satisfied and move on. If it doesn't, we suspect the data and dig deeper, plummeting down a rabbit-hole of segmentation and filtering until we give-up and move-on. Data is only useful if it can change our preconceptions. Do you trust this dodgy data more than your own understanding, knowledge and intelligence?  I don't. There's always multiple plausible ways to interpret/action any data Let's say we segment the above data, and get this data: Post-trial users (i.e. those using a paid version after the 14-day free-trial is over): Parameter Number of users % of total users Number of sessions Number of usages SQL Server 13 9.0 1115 1115 MDB 5 4.2 449 449 Trial users: Parameter Number of users % of total users Number of sessions Number of usages SQL Server 15 10.0 7000 7000 MDB 114 77.6 1000 1000 How do you interpret this data? It's one of: Mostly SQL Server users buy our software. People who can't afford SQL Server tend to be unable to afford or unwilling to buy our software. Therefore, ditch MDB-support. Our MDB support is so poor and buggy that our massive MDB user-base doesn't buy it.  Therefore, spend loads of money improving it, and think about ditching SQL-Server support. People 'graduate' naturally from MDB to SQL Server as they use the software more. Things are fine the way they are. We're marketing the tool wrong. The large number of MDB users represent uninformed downloaders. Tell marketing to aggressively target SQL Server users. To choose an interpretation you need to segment again. And again. And again, and again. Opting-out is correlated with feature-usage Metrics tends to be opt-in. This skews the data even further. Between 5% and 30% of people choose to opt-in to metrics (often called 'customer improvement program' or something like that). Casual trial-users who are uninterested in your product or company are less likely to opt-in. This group is probably also likely to be MDB users. How much does this skew your data by? Who knows? It's not all doom and gloom. There are some things metrics can answer well. Environment facts. How many people have 3 monitors? Have Windows 7? Have .NET 4 installed? Have Japanese Windows? Minor optimizations.  Is the text-box big enough for average user-input? Performance data. How long does our app take to start? How many databases does the average user have on their server? As you can see, questions about who-the-user-is rather than what-the-user-does are easier to answer and action. Conclusion Use SmartAssembly. If not for the metrics (called 'Feature-Usage-Reporting'), then at least for the obfuscation/error-reporting. Data raises more questions than it answers. Questions about environment are the easiest to answer.

    Read the article

  • Introduction to Human Workflow 11g

    - by agiovannetti
    Human Workflow is a component of SOA Suite just like BPEL, Mediator, Business Rules, etc. The Human Workflow component allows you to incorporate human intervention in a business process. You can use Human Workflow to create a business process that requires a manager to approve purchase orders greater than $10,000; or a business process that handles article reviews in which a group of reviewers need to vote/approve an article before it gets published. Human Workflow can handle the task assignment and routing as well as the generation of notifications to the participants. There are three common patterns or usages of Human Workflow: 1) Approval Scenarios: manage documents and other transactional data through approval chains . For example: approve expense report, vacation approval, hiring approval, etc. 2) Reviews by multiple users or groups: group collaboration and review of documents or proposals. For example, processing a sales quote which is subject to review by multiple people. 3) Case Management: workflows around work management or case management. For example, processing a service request. This could be routed to various people who all need to modify the task. It may also incorporate ad hoc routing which is unknown at design time. SOA 11g Human Workflow includes the following features: Assignment and routing of tasks to the correct users or groups. Deadlines, escalations, notifications, and other features required for ensuring the timely performance of a task. Presentation of tasks to end users through a variety of mechanisms, including a Worklist application. Organization, filtering, prioritization and other features required for end users to productively perform their tasks. Reports, reassignments, load balancing and other features required by supervisors and business owners to manage the performance of tasks. Human Workflow Architecture The Human Workflow component is divided into 3 modules: the service interface, the task definition and the client interface module. The Service Interface handles the interaction with BPEL and other components. The Client Interface handles the presentation of task data through clients like the Worklist application, portals and notification channels. The task definition module is in charge of managing the lifecycle of a task. Who should get the task assigned? What should happen next with the task? When must the task be completed? Should the task be escalated?, etc Stages and Participants When you create a Human Task you need to specify how the task is assigned and routed. The first step is to define the stages and participants. A stage is just a logical group. A participant can be a user, a group of users or an application role. The participants indicate the type of assignment and routing that will be performed. Stages can be sequential or in parallel. You can combine them to create any usage you require. See diagram below: Assignment and Routing There are different ways a task can be assigned and routed: Single Approver: task is assigned to a single user, group or role. For example, a vacation request is assigned to a manager. If the manager approves or rejects the request, the employee is notified with the decision. If the task is assigned to a group then once one of managers acts on it, the task is completed. Parallel : task is assigned to a set of people that must work in parallel. This is commonly used for voting. For example, a task gets approved once 50% of the participants approve it. You can also set it up to be a unanimous vote. Serial : participants must work in sequence. The most common scenario for this is management chain escalation. FYI (For Your Information) : task is assigned to participants who can view it, add comments and attachments, but can not modify or complete the task. Task Actions The following is the list of actions that can be performed on a task: Claim : if a task is assigned to a group or multiple users, then the task must be claimed first to be able to act on it. Escalate : if the participant is not able to complete a task, he/she can escalate it. The task is reassigned to his/her manager (up one level in a hierarchy). Pushback : the task is sent back to the previous assignee. Reassign :if the participant is a manager, he/she can delegate a task to his/her reports. Release : if a task is assigned to a group or multiple users, it can be released if the user who claimed the task cannot complete the task. Any of the other assignees can claim and complete the task. Request Information and Submit Information : use when the participant needs to supply more information or to request more information from the task creator or any of the previous assignees. Suspend and Resume :if a task is not relevant, it can be suspended. A suspension is indefinite. It does not expire until Resume is used to resume working on the task. Withdraw : if the creator of a task does not want to continue with it, for example, he wants to cancel a vacation request, he can withdraw the task. The business process determines what happens next. Renew : if a task is about to expire, the participant can renew it. The task expiration date is extended one week. Notifications Human Workflow provides a mechanism for sending notifications to participants to alert them of changes on a task. Notifications can be sent via email, telephone voice message, instant messaging (IM) or short message service (SMS). Notifications can be sent when the task status changes to any of the following: Assigned/renewed/delegated/reassigned/escalated Completed Error Expired Request Info Resume Suspended Added/Updated comments and/or attachments Updated Outcome Withdraw Other Actions (e.g. acquiring a task) Here is an example of an email notification: Worklist Application Oracle BPM Worklist application is the default user interface included in SOA Suite. It allows users to access and act on tasks that have been assigned to them. For example, from the Worklist application, a loan agent can review loan applications or a manager can approve employee vacation requests. Through the Worklist Application users can: Perform authorized actions on tasks, acquire and check out shared tasks, define personal to-do tasks and define subtasks. Filter tasks view based on various criteria. Work with standard work queues, such as high priority tasks, tasks due soon and so on. Work queues allow users to create a custom view to group a subset of tasks in the worklist, for example, high priority tasks, tasks due in 24 hours, expense approval tasks and more. Define custom work queues. Gain proxy access to part of another user's tasks. Define custom vacation rules and delegation rules. Enable group owners to define task dispatching rules for shared tasks. Collect a complete workflow history and audit trail. Use digital signatures for tasks. Run reports like Unattended tasks, Tasks productivity, etc. Here is a screenshoot of what the Worklist Application looks like. On the right hand side you can see the tasks that have been assigned to the user and the task's detail. References Introduction to SOA Suite 11g Human Workflow Webcast Note 1452937.2 Human Workflow Information Center Using the Human Workflow Service Component 11.1.1.6 Human Workflow Samples Human Workflow APIs Java Docs

    Read the article

  • SharePoint logging to a list

    - by Norgean
    I recently worked in an environment with several servers. Locating the correct SharePoint log file for error messages, or development trace calls, is cumbersome. And once the solution hit the cloud, it got even worse, as we had no access to the log files at all. Obviously we are not the only ones with this problem, and the current trend seems to be to log to a list. This had become an off-hour project, so rather than do the sensible thing and find a ready-made solution, I decided to do it the hard way. So! Fire up Visual Studio, create yet another empty SharePoint solution, and start to think of some requirements. Easy on/offI want to be able to turn list-logging on and off.Easy loggingFor me, this means being able to use string.Format.Easy filteringLet's have the possibility to add some filtering columns; category and severity, where severity can be "verbose", "warning" or "error". Easy on/off Well, that's easy. Create a new web feature. Add an event receiver, and create the list on activation of the feature. Tear the list down on de-activation. I chose not to create a new content type; I did not feel that it would give me anything extra. I based the list on the generic list - I think a better choice would have been the announcement type. Approximately: public void CreateLog(SPWeb web)         {             var list = web.Lists.TryGetList(LogListName);             if (list == null)             {                 var listGuid = web.Lists.Add(LogListName, "Logging for the masses", SPListTemplateType.GenericList);                 list = web.Lists[listGuid];                 list.Title = LogListTitle;                 list.Update();                 list.Fields.Add(Category, SPFieldType.Text, false);                 var stringColl = new StringCollection();                 stringColl.AddRange(new[]{Error, Information, Verbose});                 list.Fields.Add(Severity, SPFieldType.Choice, true, false, stringColl);                 ModifyDefaultView(list);             }         }Should be self explanatory, but: only create the list if it does not already exist (d'oh). Best practice: create it with a Url-friendly name, and, if necessary, give it a better title. ...because otherwise you'll have to look for a list with a name like "Simple_x0020_Log". I've added a couple of fields; a field for category, and a 'severity'. Both to make it easier to find relevant log messages. Notice that I don't have to call list.Update() after adding the fields - this would cause a nasty error (something along the lines of "List locked by another user"). The function for deleting the log is exactly as onerous as you'd expect:         public void DeleteLog(SPWeb web)         {             var list = web.Lists.TryGetList(LogListTitle);             if (list != null)             {                 list.Delete();             }         } So! "All" that remains is to log. Also known as adding items to a list. Lots of different methods with different signatures end up calling the same function. For example, LogVerbose(web, message) calls LogVerbose(web, null, message) which again calls another method which calls: private static void Log(SPWeb web, string category, string severity, string textformat, params object[] texts)         {             if (web != null)             {                 var list = web.Lists.TryGetList(LogListTitle);                 if (list != null)                 {                     var item = list.AddItem(); // NOTE! NOT list.Items.Add… just don't, mkay?                     var text = string.Format(textformat, texts);                     if (text.Length > 255) // because the title field only holds so many chars. Sigh.                         text = text.Substring(0, 254);                     item[SPBuiltInFieldId.Title] = text;                     item[Degree] = severity;                     item[Category] = category;                     item.Update();                 }             } // omitted: Also log to SharePoint log.         } By adding a params parameter I can call it as if I was doing a Console.WriteLine: LogVerbose(web, "demo", "{0} {1}{2}", "hello", "world", '!'); Ok, that was a silly example, a better one might be: LogError(web, LogCategory, "Exception caught when updating {0}. exception: {1}", listItem.Title, ex); For performance reasons I use list.AddItem rather than list.Items.Add. For completeness' sake, let us include the "ModifyDefaultView" function that I deliberately skipped earlier.         private void ModifyDefaultView(SPList list)         {             // Add fields to default view             var defaultView = list.DefaultView;             var exists = defaultView.ViewFields.Cast<string>().Any(field => String.CompareOrdinal(field, Severity) == 0);               if (!exists)             {                 var field = list.Fields.GetFieldByInternalName(Severity);                 if (field != null)                     defaultView.ViewFields.Add(field);                 field = list.Fields.GetFieldByInternalName(Category);                 if (field != null)                     defaultView.ViewFields.Add(field);                 defaultView.Update();                   var sortDoc = new XmlDocument();                 sortDoc.LoadXml(string.Format("<Query>{0}</Query>", defaultView.Query));                 var orderBy = (XmlElement) sortDoc.SelectSingleNode("//OrderBy");                 if (orderBy != null && sortDoc.DocumentElement != null)                     sortDoc.DocumentElement.RemoveChild(orderBy);                 orderBy = sortDoc.CreateElement("OrderBy");                 sortDoc.DocumentElement.AppendChild(orderBy);                 field = list.Fields[SPBuiltInFieldId.Modified];                 var fieldRef = sortDoc.CreateElement("FieldRef");                 fieldRef.SetAttribute("Name", field.InternalName);                 fieldRef.SetAttribute("Ascending", "FALSE");                 orderBy.AppendChild(fieldRef);                   fieldRef = sortDoc.CreateElement("FieldRef");                 field = list.Fields[SPBuiltInFieldId.ID];                 fieldRef.SetAttribute("Name", field.InternalName);                 fieldRef.SetAttribute("Ascending", "FALSE");                 orderBy.AppendChild(fieldRef);                 defaultView.Query = sortDoc.DocumentElement.InnerXml;                 //defaultView.Query = "<OrderBy><FieldRef Name='Modified' Ascending='FALSE' /><FieldRef Name='ID' Ascending='FALSE' /></OrderBy>";                 defaultView.Update();             }         } First two lines are easy - see if the default view includes the "Severity" column. If it does - quit; our job here is done.Adding "severity" and "Category" to the view is not exactly rocket science. But then? Then we build the sort order query. Through XML. The lines are numerous, but boring. All to achieve the CAML query which is commented out. The major benefit of using the dom to build XML, is that you may get compile time errors for spelling mistakes. I say 'may', because although the compiler will not let you forget to close a tag, it will cheerfully let you spell "Name" as "Naem". Whichever you prefer, at the end of the day the view will sort by modified date and ID, both descending. I added the ID as there may be several items with the same time stamp. So! Simple logging to a list, with sensible a view, and with normal functionality for creating your own filterings. I should probably have added some more views in code, ready filtered for "only errors", "errors and warnings" etc. And it would be nice to block verbose logging completely, but I'm not happy with the alternatives. (yetanotherfeature or an admin page seem like overkill - perhaps just removing it as one of the choices, and not log if it isn't there?) Before you comment - yes, try-catches have been removed for clarity. There is nothing worse than having a logging function that breaks your site!

    Read the article

  • MySQL reserves too much RAM

    - by Buddy
    I have a cheap VPS with 128Mb RAM and 256Mb burst. MySQL starts and reserves about 110Mb, but uses not more than 20Mb of them. My VPS Control Panel shows, that I use 127Mb (I also running nginx and sphinx), I know, that it shows reserved RAM, but when I reach over 128Mb, my VPS reboots automatically every 4 hours. So I want to force MySQL to reserve less RAM. How can i do that? I did some tweaks with my.conf but it helped not so much. top output: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 1 root 15 0 2156 668 572 S 0.0 0.3 0:00.03 init 11311 root 15 0 11212 356 228 S 0.0 0.1 0:00.00 vzctl 11312 root 18 0 3712 1484 1248 S 0.0 0.6 0:00.01 bash 11347 root 18 0 2284 916 732 R 0.0 0.3 0:00.00 top 13978 root 17 -4 2248 552 344 S 0.0 0.2 0:00.00 udevd 14262 root 15 0 1812 564 472 S 0.0 0.2 0:00.03 syslogd 14293 sphinx 15 0 11816 1172 672 S 0.0 0.4 0:00.07 searchd 14305 root 25 0 7192 1036 636 S 0.0 0.4 0:00.00 sshd 14321 root 25 0 2832 836 668 S 0.0 0.3 0:00.00 xinetd 15389 root 18 0 3708 1300 1132 S 0.0 0.5 0:00.00 mysqld_safe 15441 mysql 15 0 113m 16m 4440 S 0.0 6.4 0:00.15 mysqld 15489 root 21 0 13056 1456 340 S 0.0 0.6 0:00.00 nginx 15490 nginx 18 0 13328 2388 992 S 0.0 0.9 0:00.06 nginx 15507 nginx 25 0 19520 5888 4244 S 0.0 2.2 0:00.00 php-cgi 15508 nginx 18 0 19636 4876 2748 S 0.0 1.9 0:00.12 php-cgi 15509 nginx 15 0 19668 4872 2716 S 0.0 1.9 0:00.11 php-cgi 15518 root 18 0 4492 1116 568 S 0.0 0.4 0:00.01 crond MySQL tuner: >> MySQLTuner 1.0.1 - Major Hayden <[email protected]> >> Bug reports, feature requests, and downloads at http://mysqltuner.com/ >> Run with '--help' for additional options and output filtering Please enter your MySQL administrative login: root Please enter your MySQL administrative password: -------- General Statistics -------------------------------------------------- [--] Skipped version check for MySQLTuner script [OK] Currently running supported MySQL version 5.0.77 [OK] Operating on 32-bit architecture with less than 2GB RAM -------- Storage Engine Statistics ------------------------------------------- [--] Status: -Archive -BDB -Federated +InnoDB -ISAM -NDBCluster [--] Data in InnoDB tables: 1M (Tables: 1) [OK] Total fragmented tables: 0 -------- Performance Metrics ------------------------------------------------- [--] Up for: 38m 43s (37 q [0.016 qps], 20 conn, TX: 4M, RX: 3K) [--] Reads / Writes: 100% / 0% [--] Total buffers: 28.1M global + 832.0K per thread (100 max threads) [OK] Maximum possible memory usage: 109.4M (42% of installed RAM) [OK] Slow queries: 0% (0/37) [OK] Highest usage of available connections: 1% (1/100) [OK] Key buffer size / total MyISAM indexes: 128.0K/64.0K [OK] Query cache efficiency: 42.1% (8 cached / 19 selects) [OK] Query cache prunes per day: 0 [!!] Temporary tables created on disk: 27% (3 on disk / 11 total) [!!] Thread cache is disabled [OK] Table cache hit rate: 57% (8 open / 14 opened) [OK] Open file limit used: 1% (12/1K) [OK] Table locks acquired immediately: 100% (22 immediate / 22 locks) [!!] Connections aborted: 10% [OK] InnoDB data size / buffer pool: 1.5M/8.0M -------- Recommendations ----------------------------------------------------- General recommendations: MySQL started within last 24 hours - recommendations may be inaccurate Enable the slow query log to troubleshoot bad queries When making adjustments, make tmp_table_size/max_heap_table_size equal Reduce your SELECT DISTINCT queries without LIMIT clauses Set thread_cache_size to 4 as a starting value Your applications are not closing MySQL connections properly Variables to adjust: tmp_table_size (> 32M) max_heap_table_size (> 16M) thread_cache_size (start at 4) I think if I do what MySQLtuner says, MySQL will use more RAM.

    Read the article

  • mySQL Optimization Suggestions

    - by Brian Schroeter
    I'm trying to optimize our mySQL configuration for our large Magento website. The reason I believe that mySQL needs to be configured further is because New Relic has shown that our SELECT queries are taking a long time (20,000+ ms) in some categories. I ran MySQLTuner 1.3.0 and got the following results... (Disclaimer: I restarted mySQL earlier after tweaking some settings, and so the results here may not be 100% accurate): >> MySQLTuner 1.3.0 - Major Hayden <[email protected]> >> Bug reports, feature requests, and downloads at http://mysqltuner.com/ >> Run with '--help' for additional options and output filtering [OK] Currently running supported MySQL version 5.5.37-35.0 [OK] Operating on 64-bit architecture -------- Storage Engine Statistics ------------------------------------------- [--] Status: +ARCHIVE +BLACKHOLE +CSV -FEDERATED +InnoDB +MRG_MYISAM [--] Data in MyISAM tables: 7G (Tables: 332) [--] Data in InnoDB tables: 213G (Tables: 8714) [--] Data in PERFORMANCE_SCHEMA tables: 0B (Tables: 17) [--] Data in MEMORY tables: 0B (Tables: 353) [!!] Total fragmented tables: 5492 -------- Security Recommendations ------------------------------------------- [!!] User '@host5.server1.autopartsnetwork.com' has no password set. [!!] User '@localhost' has no password set. [!!] User 'root@%' has no password set. -------- Performance Metrics ------------------------------------------------- [--] Up for: 5h 3m 4s (5M q [317.443 qps], 42K conn, TX: 18B, RX: 2B) [--] Reads / Writes: 95% / 5% [--] Total buffers: 35.5G global + 184.5M per thread (1024 max threads) [!!] Maximum possible memory usage: 220.0G (174% of installed RAM) [OK] Slow queries: 0% (6K/5M) [OK] Highest usage of available connections: 5% (61/1024) [OK] Key buffer size / total MyISAM indexes: 512.0M/3.1G [OK] Key buffer hit rate: 100.0% (102M cached / 45K reads) [OK] Query cache efficiency: 66.9% (3M cached / 5M selects) [!!] Query cache prunes per day: 3486361 [OK] Sorts requiring temporary tables: 0% (0 temp sorts / 812K sorts) [!!] Joins performed without indexes: 1328 [OK] Temporary tables created on disk: 11% (126K on disk / 1M total) [OK] Thread cache hit rate: 99% (61 created / 42K connections) [!!] Table cache hit rate: 19% (9K open / 49K opened) [OK] Open file limit used: 2% (712/25K) [OK] Table locks acquired immediately: 100% (5M immediate / 5M locks) [!!] InnoDB buffer pool / data size: 32.0G/213.4G [OK] InnoDB log waits: 0 -------- Recommendations ----------------------------------------------------- General recommendations: Run OPTIMIZE TABLE to defragment tables for better performance MySQL started within last 24 hours - recommendations may be inaccurate Reduce your overall MySQL memory footprint for system stability Enable the slow query log to troubleshoot bad queries Increasing the query_cache size over 128M may reduce performance Adjust your join queries to always utilize indexes Increase table_cache gradually to avoid file descriptor limits Read this before increasing table_cache over 64: http://bit.ly/1mi7c4C Variables to adjust: *** MySQL's maximum memory usage is dangerously high *** *** Add RAM before increasing MySQL buffer variables *** query_cache_size (> 512M) [see warning above] join_buffer_size (> 128.0M, or always use indexes with joins) table_cache (> 12288) innodb_buffer_pool_size (>= 213G) My my.cnf configuration is as follows... [client] port = 3306 [mysqld_safe] nice = 0 [mysqld] tmpdir = /var/lib/mysql/tmp user = mysql port = 3306 skip-external-locking character-set-server = utf8 collation-server = utf8_general_ci event_scheduler = 0 key_buffer = 512M max_allowed_packet = 64M thread_stack = 512K thread_cache_size = 512 sort_buffer_size = 24M read_buffer_size = 8M read_rnd_buffer_size = 24M join_buffer_size = 128M # for some nightly processes client sessions set the join buffer to 8 GB auto-increment-increment = 1 auto-increment-offset = 1 myisam-recover = BACKUP max_connections = 1024 # max connect errors artificially high to support behaviors of NetScaler monitors max_connect_errors = 999999 concurrent_insert = 2 connect_timeout = 5 wait_timeout = 180 net_read_timeout = 120 net_write_timeout = 120 back_log = 128 # this table_open_cache might be too low because of MySQL bugs #16244691 and #65384) table_open_cache = 12288 tmp_table_size = 512M max_heap_table_size = 512M bulk_insert_buffer_size = 512M open-files-limit = 8192 open-files = 1024 query_cache_type = 1 # large query limit supports SOAP and REST API integrations query_cache_limit = 4M # larger than 512 MB query cache size is problematic; this is typically ~60% full query_cache_size = 512M # set to true on read slaves read_only = false slow_query_log_file = /var/log/mysql/slow.log slow_query_log = 0 long_query_time = 0.2 expire_logs_days = 10 max_binlog_size = 1024M binlog_cache_size = 32K sync_binlog = 0 # SSD RAID10 technically has a write capacity of 10000 IOPS innodb_io_capacity = 400 innodb_file_per_table innodb_table_locks = true innodb_lock_wait_timeout = 30 # These servers have 80 CPU threads; match 1:1 innodb_thread_concurrency = 48 innodb_commit_concurrency = 2 innodb_support_xa = true innodb_buffer_pool_size = 32G innodb_file_per_table innodb_flush_log_at_trx_commit = 1 innodb_log_buffer_size = 2G skip-federated [mysqldump] quick quote-names single-transaction max_allowed_packet = 64M I have a monster of a server here to power our site because our catalog is very large (300,000 simple SKUs), and I'm just wondering if I'm missing anything that I can configure further. :-) Thanks!

    Read the article

  • need assistance with my.cnf - 1500% CPU usage

    - by Alan Long
    I'm running into a few issues with our new database server. It is a HP G8 with 2 INTEL XEON E5-2650 processors and 32GB of ram. This server is dedicated as a MySQL server (5.1.69) for our intranet portal. I have been having issues with this server staying alive - I notice high CPU usage during certain times of day (8% ~ 1500%+) and see very low memory usage (7 ~ 15%) based on using the 'top' command. When the CPU usage passes 1000%, that is when the app usually dies. I'm trying to see what I'm doing wrong with the config file, hopefully one of the experts can chime in and let me know what they think. See below for my.cnf file: [mysqld] default-storage-engine=InnoDB datadir=/var/lib/mysql socket=/var/lib/mysql/mysql.sock #user=mysql large-pages # Disabling symbolic-links is recommended to prevent assorted security risks symbolic-links=0 max_connections=275 tmp_table_size=1G key_buffer_size=384M key_buffer=384M thread_cache_size=1024 long_query_time=5 low_priority_updates=1 max_heap_table_size=1G myisam_sort_buffer_size=8M concurrent_insert=2 table_cache=1024 sort_buffer_size=8M read_buffer_size=5M read_rnd_buffer_size=6M join_buffer_size=16M table_definition_cache=6k open_files_limit=8k slow_query_log #skip-name-resolve # Innodb Settings innodb_buffer_pool_size=18G innodb_thread_concurrency=0 innodb_log_file_size=1G innodb_log_buffer_size=16M innodb_flush_log_at_trx_commit=2 innodb_lock_wait_timeout=50 innodb_file_per_table #innodb_buffer_pool_instances=4 #eliminating double buffering innodb_flush_method = O_DIRECT flush_time=86400 innodb_additional_mem_pool_size=40M #innodb_io_capacity = 5000 #innodb_read_io_threads = 64 #innodb_write_io_threads = 64 # increase until threads_created doesnt grow anymore thread_cache=1024 query_cache_type=1 query_cache_limit=4M query_cache_size=256M # Try number of CPU's*2 for thread_concurrency thread_concurrency = 0 wait_timeout = 1800 connect_timeout = 10 interactive_timeout = 60 [mysqldump] max_allowed_packet=32M [mysqld_safe] log-error=/var/log/mysqld.log pid-file=/var/run/mysqld/mysqld.pid log-slow-queries=/var/log/mysql/slow-queries.log long_query_time = 1 log-queries-not-using-indexes we connect to one database with 75 tables, the largest table has 1,150,000 entries and the second largest has 128,036 entries. I have also verified that our PHP queries are optimized as best as possible. Reference - MySQLtuner: >> MySQLTuner 1.2.0 - Major Hayden <[email protected]> >> Bug reports, feature requests, and downloads at http://mysqltuner.com/ >> Run with '--help' for additional options and output filtering -------- General Statistics -------------------------------------------------- [--] Skipped version check for MySQLTuner script [OK] Currently running supported MySQL version 5.1.69-log [OK] Operating on 64-bit architecture -------- Storage Engine Statistics ------------------------------------------- [--] Status: -Archive -BDB -Federated +InnoDB -ISAM -NDBCluster [--] Data in InnoDB tables: 420M (Tables: 75) [!!] Total fragmented tables: 75 -------- Security Recommendations ------------------------------------------- [!!] User '[email protected]' has no password set. -------- Performance Metrics ------------------------------------------------- [--] Up for: 1h 14m 50s (8M q [1K qps], 705 conn, TX: 6B, RX: 892M) [--] Reads / Writes: 68% / 32% [--] Total buffers: 19.7G global + 35.2M per thread (275 max threads) [!!] Maximum possible memory usage: 29.1G (93% of installed RAM) [OK] Slow queries: 0% (472/8M) [OK] Highest usage of available connections: 66% (183/275) [OK] Key buffer size / total MyISAM indexes: 384.0M/91.0K [OK] Key buffer hit rate: 100.0% (173 cached / 0 reads) [OK] Query cache efficiency: 96.2% (7M cached / 7M selects) [!!] Query cache prunes per day: 553614 [OK] Sorts requiring temporary tables: 0% (3 temp sorts / 1K sorts) [!!] Temporary tables created on disk: 49% (3K on disk / 7K total) [OK] Thread cache hit rate: 74% (183 created / 705 connections) [OK] Table cache hit rate: 97% (231 open / 238 opened) [OK] Open file limit used: 0% (17/8K) [OK] Table locks acquired immediately: 100% (432K immediate / 432K locks) [OK] InnoDB data size / buffer pool: 420.9M/18.0G -------- Recommendations ----------------------------------------------------- General recommendations: Run OPTIMIZE TABLE to defragment tables for better performance MySQL started within last 24 hours - recommendations may be inaccurate Reduce your overall MySQL memory footprint for system stability Increasing the query_cache size over 128M may reduce performance Temporary table size is already large - reduce result set size Reduce your SELECT DISTINCT queries without LIMIT clauses Variables to adjust: *** MySQL's maximum memory usage is dangerously high *** *** Add RAM before increasing MySQL buffer variables *** query_cache_size (> 256M) [see warning above] Thanks in advanced for your help!

    Read the article

  • Two network interfaces and two IP addresses on the same subnet in Linux

    - by Scott Duckworth
    I recently ran into a situation where I needed two IP addresses on the same subnet assigned to one Linux host so that we could run two SSL/TLS sites. My first approach was to use IP aliasing, e.g. using eth0:0, eth0:1, etc, but our network admins have some fairly strict settings in place for security that squashed this idea: They use DHCP snooping and normally don't allow static IP addresses. Static addressing is accomplished by using static DHCP entries, so the same MAC address always gets the same IP assignment. This feature can be disabled per switchport if you ask and you have a reason for it (thankfully I have a good relationship with the network guys and this isn't hard to do). With the DHCP snooping disabled on the switchport, they had to put in a rule on the switch that said MAC address X is allowed to have IP address Y. Unfortunately this had the side effect of also saying that MAC address X is ONLY allowed to have IP address Y. IP aliasing required that MAC address X was assigned two IP addresses, so this didn't work. There may have been a way around these issues on the switch configuration, but in an attempt to preserve good relations with the network admins I tried to find another way. Having two network interfaces seemed like the next logical step. Thankfully this Linux system is a virtual machine, so I was able to easily add a second network interface (without rebooting, I might add - pretty cool). A few keystrokes later I had two network interfaces up and running and both pulled IP addresses from DHCP. But then the problem came in: the network admins could see (on the switch) the ARP entry for both interfaces, but only the first network interface that I brought up would respond to pings or any sort of TCP or UDP traffic. After lots of digging and poking, here's what I came up with. It seems to work, but it also seems to be a lot of work for something that seems like it should be simple. Any alternate ideas out there? Step 1: Enable ARP filtering on all interfaces: # sysctl -w net.ipv4.conf.all.arp_filter=1 # echo "net.ipv4.conf.all.arp_filter = 1" >> /etc/sysctl.conf From the file networking/ip-sysctl.txt in the Linux kernel docs: arp_filter - BOOLEAN 1 - Allows you to have multiple network interfaces on the same subnet, and have the ARPs for each interface be answered based on whether or not the kernel would route a packet from the ARP'd IP out that interface (therefore you must use source based routing for this to work). In other words it allows control of which cards (usually 1) will respond to an arp request. 0 - (default) The kernel can respond to arp requests with addresses from other interfaces. This may seem wrong but it usually makes sense, because it increases the chance of successful communication. IP addresses are owned by the complete host on Linux, not by particular interfaces. Only for more complex setups like load- balancing, does this behaviour cause problems. arp_filter for the interface will be enabled if at least one of conf/{all,interface}/arp_filter is set to TRUE, it will be disabled otherwise Step 2: Implement source-based routing I basically just followed directions from http://lartc.org/howto/lartc.rpdb.multiple-links.html, although that page was written with a different goal in mind (dealing with two ISPs). Assume that the subnet is 10.0.0.0/24, the gateway is 10.0.0.1, the IP address for eth0 is 10.0.0.100, and the IP address for eth1 is 10.0.0.101. Define two new routing tables named eth0 and eth1 in /etc/iproute2/rt_tables: ... top of file omitted ... 1 eth0 2 eth1 Define the routes for these two tables: # ip route add default via 10.0.0.1 table eth0 # ip route add default via 10.0.0.1 table eth1 # ip route add 10.0.0.0/24 dev eth0 src 10.0.0.100 table eth0 # ip route add 10.0.0.0/24 dev eth1 src 10.0.0.101 table eth1 Define the rules for when to use the new routing tables: # ip rule add from 10.0.0.100 table eth0 # ip rule add from 10.0.0.101 table eth1 The main routing table was already taken care of by DHCP (and it's not even clear that its strictly necessary in this case), but it basically equates to this: # ip route add default via 10.0.0.1 dev eth0 # ip route add 130.127.48.0/23 dev eth0 src 10.0.0.100 # ip route add 130.127.48.0/23 dev eth1 src 10.0.0.101 And voila! Everything seems to work just fine. Sending pings to both IP addresses works fine. Sending pings from this system to other systems and forcing the ping to use a specific interface works fine (ping -I eth0 10.0.0.1, ping -I eth1 10.0.0.1). And most importantly, all TCP and UDP traffic to/from either IP address works as expected. So again, my question is: is there a better way to do this? This seems like a lot of work for a seemingly simple problem.

    Read the article

< Previous Page | 54 55 56 57 58 59 60 61 62 63  | Next Page >