Search Results

Search found 8021 results on 321 pages for 'resource monitoring'.

Page 23/321 | < Previous Page | 19 20 21 22 23 24 25 26 27 28 29 30  | Next Page >

  • What is best way to raise application events for use with admin network monitoring tools

    - by J Cooper
    In a semi-critical .NET service application, what would be a good strategy for raising application events that could be monitored by network administration tools? The events would be for errors, status changes, and possibly other notifications. My company is planning to use some kind of tool down the road to monitor all critical machines and services. As of right now they are using Spice Works to do some monitoring but it is not known if they will keep this down the road. By strategy I mean, perhaps using some sort of protocol ( my network admin has mentioned SNMP), perhaps a service such as windows event log. I have no idea what is available, so I'm leaving the options open. With that in mind, here is a list of preferences I came up with: Somewhat easy to use with .NET. Reliability Should work well with a variety of admin monitoring tools Works with non - windows monitoring tools Works with Spice Works

    Read the article

  • Munin Mongodb Plugin Not Showing. . . ?

    - by alfredo
    I have installed munin and munin-node on my monitoring server and installed munin-node on my mongodb server, I have set them both up and all is working great. But, the mongodb plugins aren't showing on my monitoring server. I see the node listed and "Disk, Network, Processes, System", but not the mongo stuff. If I execute one of the plugins directly on the mongo server "python /usr/share/munin/plugins/mongo_btree" it returns output, but nothing shows on the monitoring server.

    Read the article

  • Removing resource limits on Solaris 10

    - by mikeydonkey
    How should one remove all potential artificial resource limitations for a process? I just saw a case where a server application consumed resources so that some limitation was hit. The other shells into the same server etc were all extremely slow (waiting for something to free up for them; ie. prstat starting 5 minutes). It wasn't CPU/memory related problem so I think it has got something to do with ulimits / projects. Already managed to set the maximum open files to 500 000 and it helped a little bit. However there is something else and I can not figure out what resource is maxed out. I can get some in-house administrator probably to check this but I would like to understand how I could make sure there shouldn't be any limitations! If you think I am going the wrong way (would be better to figure out what limitation should be specfically tuned etc) please feel free to point me to the correct way. I know technical stuff - it's just Solaris 10 that is giving me headache :/

    Read the article

  • Performance data collection for short-running, ephemeral servers

    - by ErikA
    We're building a medical image processing software stack, currently hosted on various AWS resources. As part of this application, we have a handful of long-running servers (database, load balancers, web application, etc.). Collecting performance data on those servers is quite simple - my go-to- recipe of Nagios (for monitoring/notifications) and Munin (for collection of performance data and displaying trends) will work just fine. However - as part of this application, we are constantly starting up and terminating compute instances on EC2. In typical usage, these compute instances start up, configure themselves, receive a job from a message queue, and then get to work processing that job, which takes anywhere from 15 minutes to over 8 hours. After job completion, these instances get terminated, never to be heard from again. What is a decent strategy for collecting performance data on these short-lived instances? I don't necessarily need monitoring on them - if they fail for whatever reason, our application will detect this and handle re-starting the job on another instance or raising the flag so an administrator can take a look at things. However, it still would be useful to collect information like CPU (user, idle, iowait, etc.), memory usage, network traffic, disk read/write data, etc. In our internal database, we track the instance ID of the machine that runs each job, and it would be quite helpful to be able to look up performance data for a specific instance ID for troubleshooting and profiling. Munin doesn't seem like a great candidate, as it requires maintaining a list of munin nodes in a text file - far from ideal for an environment with a high amount of churn, and for the short amount of time each node will be running, I'd rather keep the full-resolution data indefinitely than have RRD water down the data over time. In the end, my guess is that this will require a monitoring engine that: uses a database (MySQL, SQLite, etc.) for configuration and data storage exposes an API for adding/removing hosts and services Are there other things I should be thinking about when evaluating options? Perhaps I'm over-thinking this, though, and just ought to run sar at 1-minute intervals on these short-lived instances and collect the sar db files prior to termination.

    Read the article

  • "one-off" use of http_proxy in a Chef remote_file resource

    - by user169200
    I have a use case where most of my remote_file resources and yum resources download files directly from an internal server. However, there is a need to download one or two files with remote_file that is outside our firewall and which must go through a HTTP proxy. If I set the http_proxy setting in /etc/chef/client.rb, it adversely affects the recipe's ability to download yum and other files from internal resources. Is there a way to have a remote_file resource download a remote URL through a proxy without setting the http_proxy value in /etc/chef/client.rb? In my sample code, below, I'm downloading a redmine bundle from rubyforge.org, which requires my servers to go through a corporate proxy. I came up with a ruby_block before and after the remote_file resource that sets the http_proxy and "unsets" it. I'm looking for a cleaner way to do this. ruby_block "setenv-http_proxy" do block do Chef::Config.http_proxy = node['redmine']['http_proxy'] ENV['http_proxy'] = node['redmine']['http_proxy'] ENV['HTTP_PROXY'] = node['redmine']['http_proxy'] end action node['redmine']['rubyforge_use_proxy'] ? :create : :nothing notifies :create_if_missing, "remote_file[redmine-bundle.zip]", :immediately end remote_file "redmine-bundle.zip" do path "#{Dir.tmpdir}/redmine-#{attrs['version']}-bundle.zip" source attrs['download_url'] mode "0644" action :create_if_missing notifies :decompress, "zipp[redmine-bundle.zip]", :immediately notifies :create, "ruby_block[unsetenv-http_proxy]", :immediately end ruby_block "unsetenv-http_proxy" do block do Chef::Config.http_proxy = nil ENV['http_proxy'] = nil ENV['HTTP_PROXY'] = nil end action node['redmine']['rubyforge_use_proxy'] ? :create : :nothing end

    Read the article

  • Random "not accessible" "you might not have permission to use this network resource"

    - by Jim Fred
    A couple of computers, both Win7-64 can connect to shares on a NAS server, at least most of the time. At random intervals, these Win7-64 computers cannot access some shares but can access others on the same NAS. When access is denied, a dialog box appears saying "\\myServer\MyShare02 not accessible...you might not have permission to use this network resource..." Other shares, say \\myServer\MyShare01, ARE accessible from the affected computers and yet other computers CAN access the affected shares. Reboots of the affected computers seem to allow the affected computer to connect to the affected shares - but then, getting a cup of coffee seems to help too. When the problem appears, the network seems to be ok e.g. the affected computers can access other shares on the affected server and can ping etc. Also Other computers can access the affected shares. The NAS server is a NetGear ReadyNas Pro. The problem might be on the NAS side such as a resource limitation but since only 2 Win7-64 PCs seem to be affected the most, the problem could be on the PC side - I'm not sure yet. I of course searched for solutions and found several tips addressing initial connection problems (use correct workgroup name, use IP address instead of server name, remove security restrictions etc) but none of those remedies address the random nature of this problem.

    Read the article

  • Microsoft Security Essentials Not Monitoring

    - by nateify
    When I boot into Windows Vista, Microsoft Security Essentials is set to run when the system starts. When I open the program, it says Microsoft Security Essentials isn't monitoring your computer because the program's service stopped. It tells me that it can't update definitions or enable real time protection unless I do it manually (every time I boot). Is there a way I can fix this so I always have real time protection and updating?

    Read the article

  • Open source server monitoring

    - by Webnet
    I'm running Ubuntu server and am looking or a server monitoring utility that is open source/free. We don't need any thing too fancy. Mainly we want to know when the server is offline or if any core services have issues. Preferably something that can send us text messages or emails would be great.

    Read the article

  • Smokeping monitoring IP address and port

    - by bob
    Hi I have successfully install smokeping on my Ubuntu Karmic machine and am monitoring servers. I need to be able to monitor an IP address and a port, can someone tell me how I do that? I looked into Smokeping::probes::TCPPing but I cannot find how to install TCPPing Any help would be much appreciated Thanks

    Read the article

  • SCOM 2007 - SQL Server Monitoring

    - by Gary Hines
    Hi All, I've been tasked with comparing the monitoring capabilities in SCOM 2007 with the established SQL products such as Spotlight, SQLSentry, etc. I'm pretty sure that SCOM can't do some of the more in-depth stuff that those products do, but can it be configured/programmed to alert on the most often encountered problems via VBScript and TSQL. Does anyone have an idea of what type of development effort would be needed for this? Thanks for any and all help.

    Read the article

  • How to handle failure to release a resource which is contained in a smart pointer?

    - by cj
    How should an error during resource deallocation be handled, when the object representing the resource is contained in a shared pointer? Smart pointers are a useful tool to manage resources safely. Examples of such resources are memory, disk files, database connections, or network connections. // open a connection to the local HTTP port boost::shared_ptr<Socket> socket = Socket::connect("localhost:80"); In a typical scenario, the class encapsulating the resource should be noncopyable and polymorphic. A good way to support this is to provide a factory method returning a shared pointer, and declare all constructors non-public. The shared pointers can now be copied from and assigned to freely. The object is automatically destroyed when no reference to it remains, and the destructor then releases the resource. /** A TCP/IP connection. */ class Socket { public: static boost::shared_ptr<Socket> connect(const std::string& address); virtual ~Socket(); protected: Socket(const std::string& address); private: // not implemented Socket(const Socket&); Socket& operator=(const Socket&); }; But there is a problem with this approach. The destructor must not throw, so a failure to release the resource will remain undetected. A common way out of this problem is to add a public method to release the resource. class Socket { public: virtual void close(); // may throw // ... }; Unfortunately, this approach introduces another problem: Our objects may now contain resources which have already been released. This complicates the implementation of the resource class. Even worse, it makes it possible for clients of the class to use it incorrectly. The following example may seem far-fetched, but it is a common pitfall in multi-threaded code. socket->close(); // ... size_t nread = socket->read(&buffer[0], buffer.size()); // wrong use! Either we ensure that the resource is not released before the object is destroyed, thereby losing any way to deal with a failed resource deallocation. Or we provide a way to release the resource explicitly during the object's lifetime, thereby making it possible to use the resource class incorrectly. There is a way out of this dilemma. But the solution involves using a modified shared pointer class. These modifications are likely to be controversial. Typical shared pointer implementations, such as boost::shared_ptr, require that no exception be thrown when their object's destructor is called. Generally, no destructor should ever throw, so this is a reasonable requirement. These implementations also allow a custom deleter function to be specified, which is called in lieu of the destructor when no reference to the object remains. The no-throw requirement is extended to this custom deleter function. The rationale for this requirement is clear: The shared pointer's destructor must not throw. If the deleter function does not throw, nor will the shared pointer's destructor. However, the same holds for other member functions of the shared pointer which lead to resource deallocation, e.g. reset(): If resource deallocation fails, no exception can be thrown. The solution proposed here is to allow custom deleter functions to throw. This means that the modified shared pointer's destructor must catch exceptions thrown by the deleter function. On the other hand, member functions other than the destructor, e.g. reset(), shall not catch exceptions of the deleter function (and their implementation becomes somewhat more complicated). Here is the original example, using a throwing deleter function: /** A TCP/IP connection. */ class Socket { public: static SharedPtr<Socket> connect(const std::string& address); protected: Socket(const std::string& address); virtual Socket() { } private: struct Deleter; // not implemented Socket(const Socket&); Socket& operator=(const Socket&); }; struct Socket::Deleter { void operator()(Socket* socket) { // Close the connection. If an error occurs, delete the socket // and throw an exception. delete socket; } }; SharedPtr<Socket> Socket::connect(const std::string& address) { return SharedPtr<Socket>(new Socket(address), Deleter()); } We can now use reset() to free the resource explicitly. If there is still a reference to the resource in another thread or another part of the program, calling reset() will only decrement the reference count. If this is the last reference to the resource, the resource is released. If resource deallocation fails, an exception is thrown. SharedPtr<Socket> socket = Socket::connect("localhost:80"); // ... socket.reset();

    Read the article

  • Developer Dashboard in SharePoint 2010

    - by jcortez
    Introducing the Developer Dashboard As a SharePoint developer (or IT Professional), how many times have you had the pleasure of figuring out why a particular page on your site is taking too long to render? I'm sure one of the techniques you have employed in troubleshooting is the process of elimination - removing individual web parts from the page hoping to identify which web part is misbehaving. One of the new features of SharePoint 2010 is the Developer Dashboard. This dashboard provides tracing and performance information that can be useful when you are trying to troubleshoot pages that are loading too slow. The Developer Dashboard is turned off by default and I'll go over 3 different ways to display it. Here is a screenshot of what the Developer Dashboard looks like when displayed at the bottom of the page:   You can see on the left side the different events that fired during the page processing pipeline and how long these events took. This is where you will see individual web parts being processed and how long it took to complete (obviously the kind of processing depends on what the web part does). On the right side you would see the different database calls issued through the SharePoint Object Model to process the page. You will notice that each of these database queries are actually a hyperlink and clicking on it displays a pop-up window that shows the actual SQL Query Text, the Call Stack that triggered the database call, and the IO statistics of that query. Enabling the Developer Dashboard Option 1: Managed Code   The Developer Dashboard is a farm-wide setting and the code above won't work if it is used within a web part hosted on any non-Central Admin site. The SPDeveloperDashboardLevel enum has three possible values: On, Off, and OnDemand. Setting it to On will always display the Developer Dashboard at the bottom of the page. Setting it Off will hide the Developer Dashboard. Setting it to OnDemand will add an icon at the top right corner of the page (see screenshot below) where a Site Collection Admin can toggle the display of the Developer Dashboard for a particular site collection. In my opinion, OnDemand is the best setting when troubleshooting a page or during development since a Site Collection Admin can turn it on or off and for a particular site only. The first cool thing about this is that the Site Collection Admin that turned it on will be the only one to see the Developer Dashboard output. Everyday users won't see the Developer Dashboard output even if it was turned on by a Site Collection Admin. If you need more flexibility on who gets to see the Developer Dashboard output, you can set the SPDeveloperDashboardSettings.RequiredPermissions to control which group of users will have the permission to see the output. Option 2: Using stsadm Using stsadm, you can run the following command to configure the Developer Dashboard: STSADM –o setproperty –pn developer-dashboard –pv OnDemand To successfully execute this command, be sure you that are running as a Farm Admin. Option 3: Using PowerShell For all scripts in SharePoint 2010, I prefer writing them as PowerShell scripts. Though the stsadm command is less verbose, the PowerShell equivalent is pretty straightforward and uses the SharePoint Object Model: You can of course parameterized the value that gets assigned to the DisplayLevel property so you can turn it On, Off or OnDemand depending on the parameter. Events and the Developer Dashboard  Now, don't assume that all the code inside your web part or page will show up in the Developer Dashboard complete with all the great troubleshooting information. Only a finite set of events are monitored by default (for a web part it will events in the base web part class). Let's say you have a click event that could take some time, for example a web service call. And you want to include troubleshooting information for this event in the Developer Dashboard. Enter SPMonitoredScope which is also a new feature in SharePoint 2010. In SharePoint 2010, everything is executed within a "Monitored Scope". And each scope has a set of "Monitors" that measures and counts calls and timings which appears in the Developer Dashboard. Below is an example on how to get your custom code to get included in the Developer Dashboard by wrapping it inside a new monitored scope: The code above would include your new scope "My long web service call" into the Developer Dashboard and would log the time it took to complete processing. In my opinion, wrapping your custom code in a SPMonitoredScope is a SharePoint development best practice since it provides you visibility and a better understanding on the performance of your components.

    Read the article

  • Ask How-To Geek: How Can I Monitor My Bandwidth Usage?

    - by Jason Fitzpatrick
    If you’re lucky you enjoy wide open internet access with out restriction (or restrictions so high you would have to work all month to meet them). If you’re not so lucky, you’ve got an ISP with heavy caps. Today we help out a reader working under such a cap. Latest Features How-To Geek ETC Internet Explorer 9 RC Now Available: Here’s the Most Interesting New Stuff Here’s a Super Simple Trick to Defeating Fake Anti-Virus Malware How to Change the Default Application for Android Tasks Stop Believing TV’s Lies: The Real Truth About "Enhancing" Images The How-To Geek Valentine’s Day Gift Guide Inspire Geek Love with These Hilarious Geek Valentines Can the Birds and Pigs Really Be Friends in the End? [Angry Birds Video] Add the 2D Version of the New Unity Interface to Ubuntu 10.10 and 11.04 MightyMintyBoost Is a 3-in-1 Gadget Charger Watson Ties Against Human Jeopardy Opponents Peaceful Tropical Cavern Wallpaper SnapBird Supercharges Your Twitter Searches

    Read the article

  • How Can I Track the Modifications a Program’s Installer Makes?

    - by Jason Fitzpatrick
    What exactly are those installation apps doing as the progress bar whizzes by? If you want to keep a close eye on things, you’ll need the right tools. Today’s Question & Answer session comes to us courtesy of SuperUser—a subdivision of Stack Exchange, a community-drive grouping of Q&A web sites. How To Delete, Move, or Rename Locked Files in Windows HTG Explains: Why Screen Savers Are No Longer Necessary 6 Ways Windows 8 Is More Secure Than Windows 7

    Read the article

  • Cannot start tor with vidalia, failed to bind listening port because of tor-socks running

    - by ganjan
    I get these errors trying to run tor with vidalia Apr 19 21:55:15.371 [Notice] Tor v0.2.1.30. This is experimental software. Do not rely on it for strong anonymity. (Running on Linux i686) Apr 19 21:55:15.372 [Notice] Initialized libevent version 1.4.13-stable using method epoll. Good. Apr 19 21:55:15.373 [Notice] Opening Socks listener on 127.0.0.1:9050 Apr 19 21:55:15.373 [Warning] Could not bind to 127.0.0.1:9050: Address already in use. Is Tor already running? Apr 19 21:55:15.373 [Warning] Failed to parse/validate config: Failed to bind one of the listener ports. Apr 19 21:55:15.373 [Error] Reading config failed--see warnings above. I don't think tor is running. Here is a nmap scan of my localhost Starting Nmap 5.21 ( http://nmap.org ) at 2011-04-19 21:59 CEST Nmap scan report for localhost (127.0.0.1) Host is up (0.0000050s latency). Hostname localhost resolves to 2 IPs. Only scanned 127.0.0.1 rDNS record for 127.0.0.1: localhost.localdomain Not shown: 989 closed ports PORT STATE SERVICE 22/tcp open ssh 53/tcp open domain 80/tcp open http 139/tcp open netbios-ssn 445/tcp open microsoft-ds 631/tcp open ipp 3128/tcp open squid-http 3306/tcp open mysql 9000/tcp open cslistener 9050/tcp open tor-socks 10000/tcp open snet-sensor-mgmt I see tor-socks is running here, probably be the cause of the problem. How do I stop this from starting up? I want to use vidalia so I can monitor whats going on.

    Read the article

  • Write TSQL, win a Kindle.

    - by Fatherjack
    So recently Red Gate launched sqlmonitormetrics.red-gate.com and showed the world how to embed your own scripts harmoniously in a third party tool to get the details that you want about your SQL Server performance. The site has a way to submit your own metrics and take a copy of the ones that other people have submitted to build a library of code to keep track of key metrics of your servers performance. There have been several submissions already but they have now launched a competition to provide an incentive for you to get creative and show us what you can do with a bit of TSQL and the SQL Monitor framework*. What’s it worth? Well, if you are one of the 3 winners then you get to choose either a Kindle Fire or $199. How do you win? Simply write the T-SQL for a SQL Monitor custom metric and the relevant description and introduction for it and submit it via  sqlmonitormetrics.red-gate.com before 14th Sept 2012 and then sit back and wait while the judges review your code and your aims in writing the metric. Who are the judges and how will they judge the metrics? There are two judges for this competition, Steve Jones (Microsoft SQL Server MVP, co-founder of SQLServerCentral.com, author, blogger etc) and Jonathan Allen (um, yeah, Steve has done all the good stuff, I’m here by good fortune). We will be looking to rate the metrics on each of 3 criteria: how the metric can help with performance tuning SQL Server. how having the metric running enables DBA’s to meet best practice. how interesting /original the idea for the metric is. Our combined decision will be final etc etc **  What happens to my metric? Any metrics submitted to the competition will be automatically entered into the site library and become available for sharing once the competition is over. You’ll get full credit for metrics you submit regardless of the competition results. You can enter as many metrics as you like. How long does it take? Honestly? Once you have the T-SQL sorted then so long as you can type your name and your email address you are done : http://sqlmonitormetrics.red-gate.com/share-a-metric/ What can I monitor? If you really really want a Kindle or $199 (and let’s face it, who doesn’t? ) and are momentarily stuck for inspiration, take a look at these example custom metrics that have been written by Stuart Ainsworth, Fabiano Amorim, TJay Belt, Louis Davidson, Grant Fritchey, Brad McGehee and me  to start the library off. There are some great pieces of TSQL in those metrics gathering important stats about how SQL Server is performing.   * – framework may not be the best word here but I was under pressure and couldnt think of a better one. If you prefer try ‘engine’, or ‘application’? I don’t know, pick something that makes sense to you. ** – for the full (legal) version of the rules check the details on sqlmonitormetrics.red-gate.com or send us an email if you want any point clarified. Disclaimer – Jonathan is a Friend of Red Gate and as such, whenever they are discussed, will have a generally positive disposition towards Red Gate tools. Other tools are often available and you should always try others before you come back and buy the Red Gate ones. All code in this blog is provided “as is” and no guarantee, warranty or accuracy is applicable or inferred, run the code on a test server and be sure to understand it before you run it on a server that means a lot to you or your manager.

    Read the article

  • Tuning Red Gate: #3 of Lots

    - by Grant Fritchey
    I'm drilling down into the metrics about SQL Server itself available to me in the Analysis tab of SQL Monitor to see what's up with our two problematic servers. In the previous post I'd noticed that rg-sql01 had quite a few CPU spikes. So one of the first things I want to check there is how much CPU is getting used by SQL Server itself. It's possible we're looking at some other process using up all the CPU Nope, It's SQL Server. I compared this to the rg-sql02 server: You can see that there is a more, consistently low set of CPU counters there. I clearly need to look at rg-sql01 and capture more specific data around the queries running on it to identify which ones are causing these CPU spikes. I always like to look at the Batch Requests/sec on a server, not because it's an indication of a problem, but because it gives you some idea of the load. Just how much is this server getting hit? Here are rg-sql01 and rg-sql02: Of the two, clearly rg-sql01 has a lot of activity. Remember though, that's all this is a measure of, activity. It doesn't suggest anything other than what it says, the number of requests coming in. But it's the kind of thing you want to know in order to understand how the system is used. Are you seeing a correlation between the number of requests and the CPU usage, or a reverse correlation, the number of requests drops as the CPU spikes? See, it's useful. Some of the details you can look at are Compilations/sec, Compilations/Batch and Recompilations/sec. These give you some idea of how the cache is getting used within the system. None of these showed anything interesting on either server. One metric that I like (even though I know it can be controversial) is the Page Life Expectancy. On the average server I expect see a series of mountains as the PLE climbs then drops due to a data load or something along those lines. That's not the case here: Those spikes back in January suggest that the servers weren't really being used much. The PLE on the rg-sql01 seems to be somewhat consistent growing to 3 hours or so then dropping, but the rg-sql02 PLE looks like it might be all over the map. Instead of continuing to look at this high level gathering data view, I'm going to drill down on rg-sql02 and see what it's done for the last week: And now we begin to see where we might have an issue. Memory on this system is getting flushed every 1/2 hour or so. I'm going to check another metric, scans: Whoa! I'm going back to the system real quick to look at some disk information again for rg-sql02. Here is the average disk queue length on the server: and the transfers Right, I think I have a guess as to what's up here. We're seeing memory get flushed constantly and we're seeing lots of scans. The disks are queuing, especially that F drive, and there are lots of requests that correspond to the scans and the memory flushes. In short, we've got queries that are scanning the data, a lot, so we either have bad queries or bad indexes. I'm going back to the server overview for rg-sql02 and check the Top 10 expensive queries. I'm modifying it to show me the last 3 days and the totals, so I'm not looking at some maintenance routine that ran 10 minutes ago and is skewing the results: OK. I need to look into these queries that are getting executed this much. They're generating a lot of reads, but which queries are generating the most reads: Ow, all still going against the same database. This is where I'm going to temporarily leave SQL Monitor. What I want to do is connect up to the server, validate that the Warehouse database is using the F:\ drive (which I'll put money down it is) and then start seeing what's up with these queries. Part 1 of the Series Part 2 of the Series

    Read the article

  • Log & monitor mysql databases on servers

    - by user3215
    How MySQL databases logged and monitored on ubuntu servers in real time?. I checked /var/log/mysql.log and found it empty. EDIT 1: The log was not enabled in the mysql configuration file. Now it logs and I could see the logs in the file /var/log/mysql/mysql.log But this could not be sufficient to gather additional information about the database logs. Is there any other way or any popular open source tool?

    Read the article

  • Tuning Red Gate: #4 of Some

    - by Grant Fritchey
    First time connecting to these servers directly (keys to the kingdom, bwa-ha-ha-ha. oh, excuse me), so I'm going to take a look at the server properties, just to see if there are any issues there. Max memory is set, cool, first possible silly mistake clear. In fact, these look to be nicely set up. Oh, I'd like to see the ANSI Standards set by default, but it's not a big deal. The default location for database data is the F:\ drive, where I saw all the activity last time. Cool, the people maintaining the servers in our company listen, parallelism threshold is set to 35 and optimize for ad hoc is enabled. No shocks, no surprises. The basic setup is appropriate. On to the problem database. Nothing wrong in the properties. The database is in SIMPLE recovery, but I think it's a reporting system, so no worries there. Again, I'd prefer to see the ANSI settings for connections, but that's the worst thing I can see. Time to look at the queries, tables, indexes and statistics because all the information I've collected over the last several days suggests that we're not looking at a systemic problem (except possibly not enough memory), but at the traditional tuning issues. I just want to note that, I started looking at the system, not the queries. So should you when tuning your environment. I know, from the data collected through SQL Monitor, what my top poor performing queries are, and the most frequently called, etc. I'm starting with the most frequently called. I'm going to get the execution plan for this thing out of the cache (although, with the cache dumping constantly, I might not get it). And it's not there. Called 1.3 million times over the last 3 days, but it's not in cache. Wow. OK. I'll see what's in cache for this database: SELECT  deqs.creation_time,         deqs.execution_count,         deqs.max_logical_reads,         deqs.max_elapsed_time,         deqs.total_logical_reads,         deqs.total_elapsed_time,         deqp.query_plan,         SUBSTRING(dest.text, (deqs.statement_start_offset / 2) + 1,                   (deqs.statement_end_offset - deqs.statement_start_offset) / 2                   + 1) AS QueryStatement FROM    sys.dm_exec_query_stats AS deqs         CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest         CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp WHERE   dest.dbid = DB_ID('Warehouse') AND deqs.statement_end_offset > 0 AND deqs.statement_start_offset > 0 ORDER BY deqs.max_logical_reads DESC ; And looking at the most expensive operation, we have our first bad boy: Multiple table scans against very large sets of data and a sort operation. a sort operation? It's an insert. Oh, I see, the table is a heap, so it's doing an insert, then sorting the data and then inserting into the primary key. First question, why isn't this a clustered index? Let's look at some more of the queries. The next one is deceiving. Here's the query plan: You're thinking to yourself, what's the big deal? Well, what if I told you that this thing had 8036318 reads? I know, you're looking at skinny little pipes. Know why? Table variable. Estimated number of rows = 1. Actual number of rows. well, I'm betting several more than one considering it's read 8 MILLION pages off the disk in a single execution. We have a serious and real tuning candidate. Oh, and I missed this, it's loading the table variable from a user defined function. Let me check, let me check. YES! A multi-statement table valued user defined function. And another tuning opportunity. This one's a beauty, seriously. Did I also mention that they're doing a hash against all the columns in the physical table. I'm sure that won't lead to scans of a 500,000 row table, no, not at all. OK. I lied. Of course it is. At least it's on the top part of the Loop which means the scan is only executed once. I just did a cursory check on the next several poor performers. all calling the UDF. I think I found a big tuning opportunity. At this point, I'm typing up internal emails for the company. Someone just had their baby called ugly. In addition to a series of suggested changes that we need to implement, I'm also apologizing for being such an unkind monster as to question whether that third eye & those flippers belong on such an otherwise lovely child.

    Read the article

  • Tuning Red Gate: #2 of Many

    - by Grant Fritchey
    In the last installment, I used the SQL Monitor tool to get a snapshot view of the current state of the servers at Red Gate that are giving us trouble. That snapshot suggested some areas where I should focus some time, primarily in which queries were being called most frequently or were running the longest. But, you don't want to just run off & start tuning queries. Remember, the foundation for query tuning is the server itself. So, I want to be sure I'm not looking at some major hardware or configuration issues that I need to address first. Rather than look at the current status of the server, I'm going to look at historical data. Clicking on the Analysis tab of SQL Monitor I get a whole list of counters that I can look at. More importantly, I can look at them over a period of time. Even more importantly, I can compare past periods with current periods to see if we're looking at a progressive issue or not. There are counters here that will give me an indication of load, and there are counters here that will tell me specifics about that load. First, I want to just look at the load to understand where the pain points might be. Trying to drill down before you have detailed information is just bad planning. First thing I'm going to check is the CPU, just to see what's up there. I have two servers I'm interested in, so I'll show you both: Looking at the last 30 days for both servers, well, let's just say that the first server is about what I would expect. It has an average baseline behavior with occasional, regular, peaks. This looks like a system with a fairly steady & predictable load that probably has a nightly batch process that spikes the processor. In short, normal stuff. The points there where the CPU drops radically. that might be worth investigating further because something changed the processing on this system a lot. But the first server. It's all over the place. There's no steady CPU behavior at all. It's spike high for long periods of time. It's up, it's down. I'm really going to have to spend time looking at CPU issues on this server to try to figure out what's up. It might be other processes being shared on the server, it might be something else. Either way, I'm going to have to spend time evaluating this CPU, especially those peeks about a week ago. Looking at the Pages/sec, again, just a measure of load, I see that there are some peaks on the rg-sql02 server, but over all, it looks like a fairly standard load. Plus, the peaks are only up to 550 pages/sec. Remember, this isn't a performance measure, but just a load measurement, but from this, I don't think we're looking at major memory issues, but I may want to correlate these counters with the CPU counters. Again, the other server looks like there's stuff going on. The load is not at all consistent. In fact there was a point earlier in the year that looks pretty severe. Plus the spikes here are twice the size of the other system. We've got a lot more load going on here and I will probably need to drill down on memory usage on this server. Taking a look at the disk transfers/sec the load on both systems seems to roughly correspond to the other load indicators. Notice that drop right in the middle of the graph for rg-sql02. I wonder if the office was closed over that period or a system was down for maintenance. If I saw spikes in memory or disk that corresponded to the drip in CPU, you can assume something was using those other resources and causing a drop, but when everything goes down, it just means that the system isn't gettting used. The disk on the rg-sql01 system isn't spiking exactly the same way as the memory & cpu, so there's a good chance (chance mind you) that any performance issues might not be disk related. However, notice that huge jump at the beginning of the month. Several disks were used more than they were for the rest of the month. That's the load on the server. What about the load on SQL Server itself? Next time.

    Read the article

  • Manage and Monitor Identity Ranges in SQL Server Transactional Replication

    - by Yaniv Etrogi
    Problem When using transactional replication to replicate data in a one way topology from a publisher to a read-only subscriber(s) there is no need to manage identity ranges. However, when using  transactional replication to replicate data in a two way replication topology - between two or more servers there is a need to manage identity ranges in order to prevent a situation where an INSERT commands fails on a PRIMARY KEY violation error  due to the replicated row being inserted having a value for the identity column which already exists at the destination database. Solution There are two ways to address this situation: Assign a range of identity values per each server. Work with parallel identity values. The first method requires some maintenance while the second method does not and so the scripts provided with this article are very useful for anyone using the first method. I will explore this in more detail later in the article. In the first solution set server1 to work in the range of 1 to 1,000,000,000 and server2 to work in the range of 1,000,000,001 to 2,000,000,000.  The ranges are set and defined using the DBCC CHECKIDENT command and when the ranges in this example are well maintained you meet the goal of preventing the INSERT commands to fall due to a PRIMARY KEY violation. The first insert at server1 will get the identity value of 1, the second insert will get the value of 2 and so on while on server2 the first insert will get the identity value of 1000000001, the second insert 1000000002 and so on thus avoiding a conflict. Be aware that when a row is inserted the identity value (seed) is generated as part of the insert command at each server and the inserted row is replicated. The replicated row includes the identity column’s value so the data remains consistent across all servers but you will be able to tell on what server the original insert took place due the range that  the identity value belongs to. In the second solution you do not manage ranges but enforce a situation in which identity values can never get overlapped by setting the first identity value (seed) and the increment property one time only during the CREATE TABLE command of each table. So a table on server1 looks like this: CREATE TABLE T1 (  c1 int NOT NULL IDENTITY(1, 5) PRIMARY KEY CLUSTERED ,c2 int NOT NULL ); And a table on server2 looks like this: CREATE TABLE T1(  c1 int NOT NULL IDENTITY(2, 5) PRIMARY KEY CLUSTERED ,c2 int NOT NULL ); When these two tables are inserted the results of the identity values look like this: Server1:  1, 6, 11, 16, 21, 26… Server2:  2, 7, 12, 17, 22, 27… This assures no identity values conflicts while leaving a room for 3 additional servers to participate in this same environment. You can go up to 9 servers using this method by setting an increment value of 9 instead of 5 as I used in this example. Continues…

    Read the article

  • SCOM, 90 Days In, II. Noise.

    - by merrillaldrich
    Once you get past the basic architecture of a SCOM implementation, and build the servers, and so on, the first real problem is … well, noise. Suddenly (depending on how you deploy) the system will reach out, like marching army ants or a some very clever cybernetic spider and find, and then proceed to yell at you about, every single problem on every server you didn’t know you had. That, of course, is the point. Still, a tool like this is not useful if it doesn’t surface the real problems from the...(read more)

    Read the article

< Previous Page | 19 20 21 22 23 24 25 26 27 28 29 30  | Next Page >