Search Results

Search found 6387 results on 256 pages for 'cpu allocation'.

Page 201/256 | < Previous Page | 197 198 199 200 201 202 203 204 205 206 207 208  | Next Page >

  • optimizing oracle query

    - by deming
    I'm having a hard time wrapping my head around this query. it is taking almost 200+ seconds to execute. I've pasted the execution plan as well. SELECT user_id , ROLE_ID , effective_from_date , effective_to_date , participant_code , ACTIVE FROM CMP_USER_ROLE E WHERE ACTIVE = 0 AND (SYSDATE BETWEEN effective_from_date AND effective_to_date OR TO_CHAR(effective_to_date,'YYYY-Q') = '2010-2') AND participant_code = 'NY005' AND NOT EXISTS ( SELECT 1 FROM CMP_USER_ROLE r WHERE r.USER_ID= E.USER_ID AND r.role_id = E.role_id AND r.ACTIVE = 4 AND E.effective_to_date <= (SELECT MAX(last_update_date) FROM CMP_USER_ROLE S WHERE S.role_id = r.role_id AND S.role_id = r.role_id AND S.ACTIVE = 4 )) Explain plan ----------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 37 | 154 (2)| 00:00:02 | |* 1 | FILTER | | | | | | |* 2 | TABLE ACCESS BY INDEX ROWID | USER_ROLE | 1 | 37 | 30 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | N_USER_ROLE_IDX6 | 27 | | 3 (0)| 00:00:01 | |* 4 | FILTER | | | | | | | 5 | HASH GROUP BY | | 1 | 47 | 124 (2)| 00:00:02 | |* 6 | TABLE ACCESS BY INDEX ROWID | USER_ROLE | 159 | 3339 | 119 (1)| 00:00:02 | | 7 | NESTED LOOPS | | 11 | 517 | 123 (1)| 00:00:02 | |* 8 | TABLE ACCESS BY INDEX ROWID| USER_ROLE | 1 | 26 | 4 (0)| 00:00:01 | |* 9 | INDEX RANGE SCAN | N_USER_ROLE_IDX5 | 1 | | 3 (0)| 00:00:01 | |* 10 | INDEX RANGE SCAN | N_USER_ROLE_IDX2 | 957 | | 74 (2)| 00:00:01 | -----------------------------------------------------------------------------------------------------

    Read the article

  • Why use threading data race will occur, but will not use gevent

    - by onlytiancai
    My test code is as follows, using threading, count is not 5,000,000 , so there has been data race, but using gevent, count is 5,000,000, there was no data race . Is not gevent coroutine execution will atom "count + = 1", rather than split into a one CPU instruction to execute? # -*- coding: utf-8 -*- import threading use_gevent = True use_debug = False cycles_count = 100*10000 if use_gevent: from gevent import monkey monkey.patch_thread() count = 0 class Counter(threading.Thread): def __init__(self, name): self.thread_name = name super(Counter, self).__init__(name=name) def run(self): global count for i in xrange(cycles_count): if use_debug: print '%s:%s' % (self.thread_name, count) count = count + 1 counters = [Counter('thread:%s' % i) for i in range(5)] for counter in counters: counter.start() for counter in counters: counter.join() print 'count=%s' % count

    Read the article

  • "Inlining" (kind of) functions at runtime in C

    - by fortran
    Hi, I was thinking about a typical problem that is very JIT-able, but hard to approach with raw C. The scenario is setting up a series of function pointers that are going to be "composed" (as in maths function composition) once at runtime and then called lots and lots of times. Doing it the obvious way involves many virtual calls, that are expensive, and if there are enough nested functions to fill the CPU branch prediction table completely, then the performance with drop considerably. In a language like Lisp, I could probably process the code and substitute the "virtual" call by the actual contents of the functions and then call compile to have an optimized version, but that seems very hacky and error prone to do in C, and using C is a requirement for this problem ;-) So, do you know if there's a standard, portable and safe way to achieve this in C? Cheers

    Read the article

  • Does SetThreadPriority cause thread reschedulling?

    - by Suma
    Consider following situation, assuming single CPU system: thread A is running with a priority THREAD_PRIORITY_NORMAL, signals event E thread B with a priority THREAD_PRIORITY_LOWEST is waiting for an event E (Note: at this point the thread is not scheduled because it is runnable, but A is higher priority and runnable as well) thread A calls SetThreadPriority(B, THREAD_PRIORITY_ABOVE_NORMAL) Is thread B re-scheduled immediately to run, or is thread A allowed to continue until current time-slice is over, and B is scheduled only once a new time-slice has begun? I would be interested to know the answer for WinXP, Vista and Win7, if possible. Note: the scenario above is simplified from my real world code, where multiple threads are running on multiple cores, but the main object of the question stays: does SetThreadPriority cause thread scheduling to happen?

    Read the article

  • CUDA Global Memory, Where is it?

    - by gamerx
    I understand that in CUDA's memory hierachy, we have things like shared memory, texture memory, constant memory, registers and of course the global memory which we allocate using cudaMalloc(). I've been searching through whatever documentations I can find but I have yet to come across any that explicitly explains what is the global memory. I believe that the global memory allocated is on the GDDR of graphics card itself and not the RAM that is shared with the CPU since one of the documentations did state that the pointer cannot be dereferenced by the host side. Am I right?

    Read the article

  • Java concurrency - Should block or yield?

    - by teto
    Hi, I have multiple threads each one with its own private concurrent queue and all they do is run an infinite loop retrieving messages from it. It could happen that one of the queues doesn't receive messages for a period of time (maybe a couple seconds), and also they could come in big bursts and fast processing is necessary. I would like to know what would be the most appropriate to do in the first case: use a blocking queue and block the thread until I have more input or do a Thread.yield()? I want to have as much CPU resources available as possible at a given time, as the number of concurrent threads may increase with time, but also I don't want the message processing to fall behind, as there is no guarantee of when the thread will be reescheduled for execution when doing a yield(). I know that hardware, operating system and other factors play an important role here, but setting that aside and looking at it from a Java (JVM?) point of view, what would be the most optimal?

    Read the article

  • How can I optimize this code?

    - by loop0
    Hi, I'm developing a logger daemon to squid to grab the logs on a mongodb database. But I'm experiencing too much cpu utilization. How can I optimize this code? from sys import stdin from pymongo import Connection connection = Connection() db = connection.squid logs = db.logs buffer = [] a = 'timestamp' b = 'resp_time' c = 'src_ip' d = 'cache_status' e = 'reply_size' f = 'req_method' g = 'req_url' h = 'username' i = 'dst_ip' j = 'mime_type' L = 'L' while True: l = stdin.readline() if l[0] == L: l = l[1:].split() buffer.append({ a: float(l[0]), b: int(l[1]), c: l[2], d: l[3], e: int(l[4]), f: l[5], g: l[6], h: l[7], i: l[8], j: l[9] } ) if len(buffer) == 1000: logs.insert(buffer) buffer = [] if not l: break connection.disconnect()

    Read the article

  • When is a program limited by the memory bandwidth?

    - by hanno
    I want to know if a program that I am using and which requires a lot of memory is limited by the memory bandwidth. When do you expect this to happen? Did it ever happen to you in a real life scenario? I found several articles discussing this issue, including http://www.cs.virginia.edu/~mccalpin/papers/bandwidth/node12.html http://www.cs.virginia.edu/~mccalpin/papers/bandwidth/node13.html http://ispass.org/ucas5/session2_3_ibm.pdf The first link is a bit old, but suggests that you need to perform less than about 1-40 floating point operations per floating point variable in order to see this effect (correct me if I'm wrong). How can I measure the memory bandwidth that a given program is using and how do I measure the (peak) bandwidth that my system can offer? I don't want to discuss any complicated cache issues here. I'm only interested in the communication between the CPU and the memory.

    Read the article

  • VS2008 is very slow on a specific large C++ solution

    - by VioletRose
    I have a solution with 21 C++ projects and 1 VB.NET project. The IDE responds very slowly when I simply move the carret in a file or try to open the menu. The process seems to take 50% of CPU for each movement. It only happens with this solution and only on my machine. The solution has total of 2380 source and header files, of which 1280 are header files. I tried to remove all connection to the source control (Perforce) but it didn't help. Also, I have Visual Assist installed but even after removing it (uninstall), the same behavior continued. Any idea?

    Read the article

  • How to determine bandwidth used by cron job?

    - by Lost_in_code
    I'm not a unix guy. CPanel does a good job of managing cronjobs and that is what I used to run dozens of cronjobs. All of them combined run more than 5000 times every day. Every cron makes a call to an external API. How can I check how much bandwidth are all the cron jobs eating? For my website I use awstats and that shows bandwidth usage et al. Another thing is that I dont want the admins to ban the cron jobs because they are using too much bandwidth (and CPU), more than what is allocated in my web hosting package.

    Read the article

  • Import external dll based on 64bit or 32bit OS

    - by Mike_G
    I have a dll that comes in both 32bit and 64bit version. My .NET WinForm is configured for "Any CPU" and my boss will not let us have separate installs for the different OS versions. So I am wondering: if I package both dlls in the install, then is there a way to have the WinForm determine if its 64bit/32bit and load the proper dll. I found this article for determining version. But i am not sure how to inject the proper way to define the DLLImport attribute on the methods i wish to use. Any ideas?

    Read the article

  • java virtual machine - how does it allocate resources?

    - by Will
    I am testing the performance of a data streaming system that supports continuous queries. This is how it works: - There is a polling service which sends data to my system. - As data passes into the system, each query evaluates based on a window of the stream at the current time. - The window slides as data passes in. My problem is this, when I add more queries to the system, I should expect the throughput to decrease because it can't cope the data rate. However, I actually observe an increase in throughput. I can't understand why this is the case and I am guessing that it's something to do with the way the JVM allocates CPU, memory etc. Can anyone shed any light to my problem?

    Read the article

  • Loop function works first time, not second time

    - by user1483101
    I'm creating a parsing program to look for certain strings in a a text file and count them. However, I'm having some trouble with one spot. def callbrowse(): filename = tkFileDialog.askopenfilename(filetypes = (("Text files", "*.txt"),("HTML files", ".html;*.htm"),("All files", "*.*"))) print filename try: global filex global writefile filex = open(filename, 'r') print "Success!!" print filename except: print "Failed to open file" ######This returns the correct count only the first time it is run. The next time it ######returns 0. If the browse button is clicked again, then this function returns the ######correct count again. def count_errors(error_name): count = 0 for line in filex: if error_name == "CPU > 79%": stringparse = "Utilization is above" elif error_name == "Stuck touchscreen": stringparse = "Stuck touchscreen" if re.match("(.*)" + "Utilization is above" + "(.*)",line): count = count + 1 return count Thanks for any help. I can't seem to get this to work right.

    Read the article

  • Mesos slave not 'Running' multiple executors simultaneously

    - by user3084164
    I am using Mesos to distribute a bunch of tasks to different machines (mesos-slaves). Here is what happens: 1. My scheduler gets resource offers and accepts it. 2. Mesos stages multiple executors on the same mesos-slaves (each slave has 4 cpus) 3. Only ONE executor enters the 'Running' state on each of the slaves while the others are shown in 'Staging' state. 4. Only after the current executor finishes execution the other executor starts running. Given that I have 4 CPUs on each machine, shouldn't each slave be running 4 executors simultaneously? Each executor requires 1 CPU.

    Read the article

  • How to read system information in C++ on Windows and Linux?

    - by f4
    I need to read system information like CPU/RAM/disks usage in C++. Maybe swap, network and process too but that's less important. It has probably been done thousand of times before so I first tried to search for a library. Someone here suggested SIGAR, which seems to fit my needs but it has a GPL license and it is for inclusion in a proprietary product. So it's not an option here. I feel like it's something not that easy to implement, as it'll need testing on several platforms. So a library would be welcome. If you don't know of any library, could you point me in the right direction for both platforms?

    Read the article

  • Best way to handle different panels occupying same area

    - by gmnnn
    I have this application: I want to change the marked area when the user is clicked one of the navBarItems (like Microsoft OUTLOOK). I've been doing some research and a lot of people said that I can add several panels and show/hide them when user is clicked a navBarItem. But the area will contain a lot of gridviews and a lot of other controls. I don't know if I want to initialize all of them when application starts because it's gonna be hard on the cpu and memory to keep all the controls running at the same time. And I don't think it's an elegant solution for this kind of situation. But if I choose to initialize controls when user is clicked to corresponding navBarItem, it's gonna be laggy for the user. What is the best design approach for this situation? PS: I can use commercial libraries too. Thank you.

    Read the article

  • Count of Distinct Int32 Values in .NET

    - by Eric J.
    I am receiving a stream of unordered Int32 values and need to track the count of distinct values that I receive. My thought is to add the Int32 values into a HashSet<Int32>. Duplicate entries will simply not be added per the behavior of HashSet. Do I understand correctly that set membership is based on GetHashCode() and that the hash code of an Int32 is the number itself? Is there an approach that is either more CPU or more memory efficient? UPDATE The data stream is rather large. Simply using Linq to iterate the stream to get the distinct count is not what I'm after, since that would involve iterating the stream a second time.

    Read the article

  • pass parameter from jsp to struts 2 action

    - by andrey_groza
    I have an e-store application and I want to pass item id to the action every time the button for that item is pressed. my jsp : <s:submit value="addToCart" action="addToCart" type="submit"> <s:param name="id" value="%{#cpu.id}" /> </s:submit> action: public class ProductsCPU extends BaseAction implements Preparable, SessionAware { private static final long serialVersionUID = 2124421844550008773L; private List colors = new ArrayList<>(); private List cpus; private String id; public String getId() { return id; } public void setId(String id) { this.id = id; } When I print id to console, it has the NULL value. What is the problem?

    Read the article

  • Backing up my data causes my server to crash using Symantec Backup Exec 12, or How I Came to Loathe

    - by Kyle Noland
    I have a Dell PowerEdge 2850 running Windows Server 2003. It is the primary file server for one of my clients. I have another server also running Windows Server 2003 that acts as the core media server for Symantec Backup Exec 12. I recently upgraded from Backup Exec 11d to 12. This upgrade was necessary because we also just upgraded from Exchange 2003 to Exchange 2007. After the upgrade I had to push-install the new version 12 Backup Exec Remote Agents to each of the servers I am backing up (about 6 total). 5 of my servers are doing just fine, faithfully completing backups every night. My file server routinely crashes. Observations: When the server crashes, it does not blue screen, it just locks up completely. Even the mouse is unresponsive. If you leave the server locked up long enough, it will eventually reboot itself and hang on the Windows splash screen. There is absolutely zero useful Event Viewer evidence of a problem. The logs go from routine logging to an Unexplained Shutdown Event the next morning when I have to hard reset the server to get it to boot. 90% of the time the server does not boot cleanly, it hangs on the Windows splash screen. I don't have any light to shed here. When the server hangs all I can do is hard reset it and try again. Even after a successful boot and chkdsk /r operation, if you reboot the machine, you have a 90% chance it won't back up again cleanly. The back story: This server started crashing during nightly backups about a month ago. I tried everything I could think of to troubleshoot the problem and eventually had to give up because I could not keep coming to the office at 4 AM to try to get the server back online. One Friday I got lucky and the server stayed up for its entire full backup. I took this opportunity to restore the full backup to a temporary server I set up and switched all my users to the temporary. Then I reloaded the ailing file server. I kept all my users on the temporary file server for about 3 weeks. I installed the same Backup Exec Remote Agent and Trend Micro A/V client on the temporary server that I was using on the regular file server. During this time, I had absolutely no problems backing up the temporary server. I tested the reloaded file server extensively. I rebooted the server once an hour every day for 3 weeks trying to make it fail. It never did. I felt confident that the reload was the answer to my problems. I moved all of the data from the temporary server back to the regular server. I got 3 nightly backups out of it before it locked up again and started the familiar failure to boot cleanly behavior. This weekend I decided to monitor the file server through the entire backup job. I RDPd into the file server and also into the server running Backup Exec. On the file server I opened the Task Manager so I could view the processes and watch CPU and memory usage. Everything was running smoothly for about 60GB worth of backup. Then I noticed that the byte count of the backup job in Backup Exec had stopped progressing. I looked back over at my RDP session into the file server, and I was getting real time updates about CPU and memory usage still - both nearly 0%, which is unusual. Backups usually hover around 40% usage for the duration of the backup job. Let me reiterate this point: The screen was refreshing and I was getting real time Task Manager updates - until I clicked on the Start menu. The screen went black and the server locked up. In truth, I think the server had already locked up, the video card just hadn't figured it out yet. I went back into my bag of trick: driving to the office and hard reseting the server over and over again when it hangs up at the Windows splash screen. I did this for 2 hours without getting a successful boot. I started panicking because I did not have a decent backup to use to get everything back onto the working temporary file server. Once I exhausted everything I knew to do, I took a deep breath, booted to the Windows Server 2003 CD and performed a repair installation of Windows. The server came back up fine, with all of my data intact. I can now reboot the server at will and it will come back up cleanly. The problem is that I'm afraid as soon as I try to back that data up again I will back at square one. So let me sum things up: Here is what I've done so far to troubleshoot this server: Deleted and recreated the RAID 5 sets. Initialized the drives. Reloaded the server with a fresh Server 2003 install. Confirmed with Dell that I have installed the latest, Dell approved BIOS and NIC drivers. Uninstalled / reinstalled the Backup Exec Remote Agent. Uninstalled the Trend Micro A/V client. Configured the server not to reboot itself after a blue screen so I can see any stop error. I used to think the server was blue screening, but since I enabled this setting I now know that the server just completely locks up. Run chkdsk /r from the Windows Recovery Console. Several errors were found and corrected, but did not help my problem. Help confirm or deny the following assumptions: There are two problems at work here. Why the server is locking up in the first place, and why the server won't boot cleanly after a lockup. This is ultimately a software problem. The server works fine and can be rebooted cleanly all day long - until the first lockup - following a fresh OS load or even a Repair installation. This is not a problem with Backup Exec in general. All of my other servers back up just fine. For the record, all of the other servers run Server 2003, and some of them house more data than the file server in question here. Any help is appreciated. The irony is almost too much to bear. Backing up my data is what is jeopardizing it.

    Read the article

  • How to stop a ICMP attack?

    - by cumhur onat
    We are under a heavy icmp flood attack. Tcpdump shows the result below. Altough we have blocked ICMP with iptables tcpdump still prints icmp packets. I've also attached iptables configuration and "top" result. Is there any thing I can do to completely stop icmp packets? [root@server downloads]# tcpdump icmp -v -n -nn tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes 03:02:47.810957 IP (tos 0x0, ttl 49, id 16007, offset 0, flags [none], proto: ICMP (1), length: 56) 80.227.64.183 > 77.92.136.196: ICMP redirect 94.201.175.188 to host 80.227.64.129, length 36 IP (tos 0x0, ttl 124, id 31864, offset 0, flags [none], proto: ICMP (1), length: 76) 77.92.136.196 > 94.201.175.188: [|icmp] 03:02:47.811559 IP (tos 0x0, ttl 49, id 16010, offset 0, flags [none], proto: ICMP (1), length: 56) 80.227.64.183 > 77.92.136.196: ICMP redirect 94.201.175.188 to host 80.227.64.129, length 36 IP (tos 0x0, ttl 52, id 31864, offset 0, flags [none], proto: ICMP (1), length: 76) 77.92.136.196 > 94.201.175.188: [|icmp] 03:02:47.811922 IP (tos 0x0, ttl 49, id 16012, offset 0, flags [none], proto: ICMP (1), length: 56) 80.227.64.183 > 77.92.136.196: ICMP redirect 94.201.175.188 to host 80.227.64.129, length 36 IP (tos 0x0, ttl 122, id 31864, offset 0, flags [none], proto: ICMP (1), length: 76) 77.92.136.196 > 94.201.175.188: [|icmp] 03:02:47.812485 IP (tos 0x0, ttl 49, id 16015, offset 0, flags [none], proto: ICMP (1), length: 56) 80.227.64.183 > 77.92.136.196: ICMP redirect 94.201.175.188 to host 80.227.64.129, length 36 IP (tos 0x0, ttl 126, id 31864, offset 0, flags [none], proto: ICMP (1), length: 76) 77.92.136.196 > 94.201.175.188: [|icmp] 03:02:47.812613 IP (tos 0x0, ttl 49, id 16016, offset 0, flags [none], proto: ICMP (1), length: 56) 80.227.64.183 > 77.92.136.196: ICMP redirect 94.201.175.188 to host 80.227.64.129, length 36 IP (tos 0x0, ttl 122, id 31864, offset 0, flags [none], proto: ICMP (1), length: 76) 77.92.136.196 > 94.201.175.188: [|icmp] 03:02:47.812992 IP (tos 0x0, ttl 49, id 16018, offset 0, flags [none], proto: ICMP (1), length: 56) 80.227.64.183 > 77.92.136.196: ICMP redirect 94.201.175.188 to host 80.227.64.129, length 36 IP (tos 0x0, ttl 122, id 31864, offset 0, flags [none], proto: ICMP (1), length: 76) 77.92.136.196 > 94.201.175.188: [|icmp] 03:02:47.813582 IP (tos 0x0, ttl 49, id 16020, offset 0, flags [none], proto: ICMP (1), length: 56) 80.227.64.183 > 77.92.136.196: ICMP redirect 94.201.175.188 to host 80.227.64.129, length 36 IP (tos 0x0, ttl 52, id 31864, offset 0, flags [none], proto: ICMP (1), length: 76) 77.92.136.196 > 94.201.175.188: [|icmp] 03:02:47.814092 IP (tos 0x0, ttl 49, id 16023, offset 0, flags [none], proto: ICMP (1), length: 56) 80.227.64.183 > 77.92.136.196: ICMP redirect 94.201.175.188 to host 80.227.64.129, length 36 IP (tos 0x0, ttl 120, id 31864, offset 0, flags [none], proto: ICMP (1), length: 76) 77.92.136.196 > 94.201.175.188: [|icmp] 03:02:47.814233 IP (tos 0x0, ttl 49, id 16024, offset 0, flags [none], proto: ICMP (1), length: 56) 80.227.64.183 > 77.92.136.196: ICMP redirect 94.201.175.188 to host 80.227.64.129, length 36 IP (tos 0x0, ttl 120, id 31864, offset 0, flags [none], proto: ICMP (1), length: 76) 77.92.136.196 > 94.201.175.188: [|icmp] 03:02:47.815579 IP (tos 0x0, ttl 49, id 16025, offset 0, flags [none], proto: ICMP (1), length: 56) 80.227.64.183 > 77.92.136.196: ICMP redirect 94.201.175.188 to host 80.227.64.129, length 36 IP (tos 0x0, ttl 50, id 31864, offset 0, flags [none], proto: ICMP (1), length: 76) 77.92.136.196 > 94.201.175.188: [|icmp] 03:02:47.815726 IP (tos 0x0, ttl 49, id 16026, offset 0, flags [none], proto: ICMP (1), length: 56) 80.227.64.183 > 77.92.136.196: ICMP redirect 94.201.175.188 to host 80.227.64.129, length 36 IP (tos 0x0, ttl 50, id 31864, offset 0, flags [none], proto: ICMP (1), length: 76) 77.92.136.196 > 94.201.175.188: [|icmp] 03:02:47.815890 IP (tos 0x0, ttl 49, id 16027, offset 0, flags [none], proto: ICMP (1), length: 56) 80.227.64.183 > 77.92.136.196: ICMP redirect 94.201.175.188 to host 80.227.64.129, length 36 iptables configuration: [root@server etc]# iptables -L Chain INPUT (policy ACCEPT) target prot opt source destination ofis tcp -- anywhere anywhere tcp dpt:mysql ofis tcp -- anywhere anywhere tcp dpt:ftp DROP icmp -- anywhere anywhere Chain FORWARD (policy ACCEPT) target prot opt source destination Chain OUTPUT (policy ACCEPT) target prot opt source destination DROP icmp -- anywhere anywhere Chain ofis (2 references) target prot opt source destination ACCEPT all -- OUR_OFFICE_IP anywhere DROP all -- anywhere anywhere top: top - 03:12:19 up 400 days, 15:43, 3 users, load average: 1.49, 1.67, 2.61 Tasks: 751 total, 3 running, 748 sleeping, 0 stopped, 0 zombie Cpu(s): 8.2%us, 1.0%sy, 0.0%ni, 87.9%id, 2.1%wa, 0.1%hi, 0.7%si, 0.0%st Mem: 32949948k total, 26906844k used, 6043104k free, 4707676k buffers Swap: 10223608k total, 0k used, 10223608k free, 14255584k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 36 root 39 19 0 0 0 R 100.8 0.0 17:03.56 ksoftirqd/11 10552 root 15 0 11408 1460 676 R 5.7 0.0 0:00.04 top 7475 lighttpd 15 0 304m 22m 15m S 3.8 0.1 0:05.37 php-cgi 1294 root 10 -5 0 0 0 S 1.9 0.0 380:54.73 kjournald 3574 root 15 0 631m 11m 5464 S 1.9 0.0 0:00.65 node 7766 lighttpd 16 0 302m 19m 14m S 1.9 0.1 0:05.70 php-cgi 10237 postfix 15 0 52572 2216 1692 S 1.9 0.0 0:00.02 scache 1 root 15 0 10372 680 572 S 0.0 0.0 0:07.99 init 2 root RT -5 0 0 0 S 0.0 0.0 0:16.72 migration/0 3 root 34 19 0 0 0 S 0.0 0.0 0:00.06 ksoftirqd/0 4 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/0 5 root RT -5 0 0 0 S 0.0 0.0 1:10.46 migration/1 6 root 34 19 0 0 0 S 0.0 0.0 0:01.11 ksoftirqd/1 7 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/1 8 root RT -5 0 0 0 S 0.0 0.0 2:36.15 migration/2 9 root 34 19 0 0 0 S 0.0 0.0 0:00.19 ksoftirqd/2 10 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/2 11 root RT -5 0 0 0 S 0.0 0.0 3:48.91 migration/3 12 root 34 19 0 0 0 S 0.0 0.0 0:00.20 ksoftirqd/3 13 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/3 uname -a [root@server etc]# uname -a Linux thisis.oursite.com 2.6.18-238.19.1.el5 #1 SMP Fri Jul 15 07:31:24 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux arp -an [root@server downloads]# arp -an ? (77.92.136.194) at 00:25:90:04:F0:90 [ether] on eth0 ? (192.168.0.2) at 00:25:90:04:F0:91 [ether] on eth1 ? (77.92.136.193) at 00:23:9C:0B:CD:01 [ether] on eth0

    Read the article

  • How can I get FreeNAS to respond to libvirt shutdown requests

    - by ptomli
    I have a KVM VM of FreeNAS 0.7.1 Shere (revision 5127) running on Ubuntu Server 10.04 and I'm unable to convince the VM to shutdown from the host virsh shutdown freenas I would expect this to send some ACPI? trigger to the VM and FreeNAS then do what it's told. I'm not a FreeBSD fundi so I don't really know what packages or processes to poke to get this running. I have tried to convince powerd to run, but the VM cpus don't have the required freq entry Sysctl HW $ sysctl hw hw.machine: amd64 hw.model: QEMU Virtual CPU version 0.12.3 hw.ncpu: 1 hw.byteorder: 1234 hw.physmem: 523116544 hw.usermem: 463806464 hw.pagesize: 4096 hw.floatingpoint: 1 hw.machine_arch: amd64 hw.realmem: 536850432 hw.aac.iosize_max: 65536 hw.amr.force_sg32: 0 hw.an.an_cache_iponly: 1 hw.an.an_cache_mcastonly: 0 hw.an.an_cache_mode: dbm hw.an.an_dump: off hw.ata.to: 15 hw.ata.wc: 1 hw.ata.atapi_dma: 1 hw.ata.ata_dma_check_80pin: 1 hw.ata.ata_dma: 1 hw.ath.txbuf: 200 hw.ath.rxbuf: 40 hw.ath.regdomain: 0 hw.ath.countrycode: 0 hw.ath.xchanmode: 1 hw.ath.outdoor: 1 hw.ath.calibrate: 30 hw.ath.hal.swba_backoff: 0 hw.ath.hal.sw_brt: 10 hw.ath.hal.dma_brt: 2 hw.bce.msi_enable: 1 hw.bce.tso_enable: 1 hw.bge.allow_asf: 0 hw.cardbus.cis_debug: 0 hw.cardbus.debug: 0 hw.cs.recv_delay: 570 hw.cs.ignore_checksum_failure: 0 hw.cs.debug: 0 hw.cxgb.snd_queue_len: 50 hw.cxgb.use_16k_clusters: 1 hw.cxgb.force_fw_update: 0 hw.cxgb.singleq: 0 hw.cxgb.ofld_disable: 0 hw.cxgb.msi_allowed: 2 hw.cxgb.txq_mr_size: 1024 hw.cxgb.sleep_ticks: 1 hw.cxgb.tx_coalesce: 0 hw.firewire.hold_count: 3 hw.firewire.try_bmr: 1 hw.firewire.fwmem.speed: 2 hw.firewire.fwmem.eui64_lo: 0 hw.firewire.fwmem.eui64_hi: 0 hw.firewire.phydma_enable: 1 hw.firewire.nocyclemaster: 0 hw.firewire.fwe.rx_queue_len: 128 hw.firewire.fwe.tx_speed: 2 hw.firewire.fwe.stream_ch: 1 hw.firewire.fwip.rx_queue_len: 128 hw.firewire.sbp.tags: 0 hw.firewire.sbp.use_doorbell: 0 hw.firewire.sbp.scan_delay: 500 hw.firewire.sbp.login_delay: 1000 hw.firewire.sbp.exclusive_login: 1 hw.firewire.sbp.max_speed: -1 hw.firewire.sbp.auto_login: 1 hw.mfi.max_cmds: 128 hw.mfi.event_class: 0 hw.mfi.event_locale: 65535 hw.pccard.cis_debug: 0 hw.pccard.debug: 0 hw.cbb.debug: 0 hw.cbb.start_32_io: 4096 hw.cbb.start_16_io: 256 hw.cbb.start_memory: 2281701376 hw.pcic.pd6722_vsense: 1 hw.pcic.intr_mask: 57016 hw.pci.honor_msi_blacklist: 1 hw.pci.enable_msix: 1 hw.pci.enable_msi: 1 hw.pci.do_power_resume: 1 hw.pci.do_power_nodriver: 0 hw.pci.enable_io_modes: 1 hw.pci.host_mem_start: 2147483648 hw.syscons.kbd_debug: 1 hw.syscons.kbd_reboot: 1 hw.syscons.bell: 1 hw.syscons.saver.keybonly: 1 hw.syscons.sc_no_suspend_vtswitch: 0 hw.usb.uplcom.interval: 100 hw.usb.uvscom.interval: 100 hw.usb.uvscom.opktsize: 8 hw.wi.debug: 0 hw.wi.txerate: 0 hw.xe.debug: 0 hw.intr_storm_threshold: 1000 hw.availpages: 127714 hw.bus.devctl_disable: 0 hw.ste.rxsyncs: 0 hw.busdma.total_bpages: 32 hw.busdma.zone0.total_bpages: 32 hw.busdma.zone0.free_bpages: 32 hw.busdma.zone0.reserved_bpages: 0 hw.busdma.zone0.active_bpages: 0 hw.busdma.zone0.total_bounced: 0 hw.busdma.zone0.total_deferred: 0 hw.busdma.zone0.lowaddr: 0xffffffff hw.busdma.zone0.alignment: 2 hw.busdma.zone0.boundary: 65536 hw.clockrate: 2808 hw.instruction_sse: 1 hw.apic.enable_extint: 0 hw.kbd.keymap_restrict_change: 0 hw.acpi.supported_sleep_state: S3 S4 S5 hw.acpi.power_button_state: S5 hw.acpi.sleep_button_state: S3 hw.acpi.lid_switch_state: NONE hw.acpi.standby_state: S1 hw.acpi.suspend_state: S3 hw.acpi.sleep_delay: 1 hw.acpi.s4bios: 0 hw.acpi.verbose: 0 hw.acpi.disable_on_reboot: 0 hw.acpi.handle_reboot: 0 hw.acpi.cpu.cx_lowest: C1 Processes $ ps ax PID TT STAT TIME COMMAND 0 ?? DLs 0:00.00 [swapper] 1 ?? ILs 0:00.00 /sbin/init -- 2 ?? DL 0:00.08 [g_event] 3 ?? DL 0:00.29 [g_up] 4 ?? DL 0:00.33 [g_down] 5 ?? DL 0:00.00 [crypto] 6 ?? DL 0:00.00 [crypto returns] 7 ?? DL 0:00.00 [xpt_thrd] 8 ?? DL 0:00.00 [kqueue taskq] 9 ?? DL 0:00.00 [acpi_task_0] 10 ?? RL 34:12.42 [idle: cpu0] 11 ?? WL 0:01.13 [swi4: clock sio] 12 ?? WL 0:00.00 [swi3: vm] 13 ?? WL 0:00.00 [swi1: net] 14 ?? DL 0:00.04 [yarrow] 15 ?? WL 0:00.00 [swi6: task queue] 16 ?? WL 0:00.00 [swi2: cambio] 17 ?? DL 0:00.00 [acpi_task_1] 18 ?? DL 0:00.00 [acpi_task_2] 19 ?? WL 0:00.00 [swi5: +] 20 ?? DL 0:00.01 [thread taskq] 21 ?? WL 0:00.00 [swi6: Giant taskq] 22 ?? WL 0:00.00 [irq9: acpi0] 23 ?? WL 0:00.09 [irq14: ata0] 24 ?? WL 0:00.11 [irq15: ata1] 25 ?? WL 0:00.57 [irq11: ed0 uhci0] 26 ?? DL 0:00.00 [usb0] 27 ?? DL 0:00.00 [usbtask-hc] 28 ?? DL 0:00.00 [usbtask-dr] 29 ?? WL 0:00.01 [irq1: atkbd0] 30 ?? WL 0:00.00 [swi0: sio] 31 ?? DL 0:00.00 [sctp_iterator] 32 ?? DL 0:00.00 [pagedaemon] 33 ?? DL 0:00.00 [vmdaemon] 34 ?? DL 0:00.00 [idlepoll] 35 ?? DL 0:00.00 [pagezero] 36 ?? DL 0:00.01 [bufdaemon] 37 ?? DL 0:00.00 [vnlru] 38 ?? DL 0:00.14 [syncer] 39 ?? DL 0:00.01 [softdepflush] 1221 ?? Is 0:00.00 /sbin/devd 1289 ?? Is 0:00.01 /usr/sbin/syslogd -ss -f /var/etc/syslog.conf 1608 ?? Is 0:00.00 /usr/sbin/cron -s 1692 ?? Ss 0:00.03 /usr/local/sbin/mDNSResponderPosix -b -f /var/etc/mdn 1730 ?? S 0:00.43 /usr/local/sbin/lighttpd -f /var/etc/lighttpd.conf -m 1882 ?? DL 0:00.00 [system_taskq] 1883 ?? DL 0:00.00 [arc_reclaim_thread] 4139 ?? S 0:00.03 /usr/local/bin/php /usr/local/www/exec.php 4144 ?? S 0:00.00 sh -c ps ax 4145 ?? R 0:00.00 ps ax 1816 v0 Is 0:00.01 login [pam] (login) 1818 v0 I+ 0:00.03 -tcsh (csh) 1817 v1 Is+ 0:00.00 /usr/libexec/getty Pc ttyv1 1402 con- I 0:00.00 /usr/local/sbin/afpd -F /var/etc/afpd.conf 1404 con- S 0:00.00 /usr/local/sbin/cnid_metad 1682 con- I 0:02.78 /usr/local/sbin/mt-daapd -m -c /var/etc/mt-daapd.conf 1789 con- S 0:00.18 /usr/local/bin/fuppesd --config-dir /var/etc --config Libvert snippet <domain type='kvm'> <name>freenas</name> <uuid>********-****-****-****-************</uuid> <memory>524288</memory> <currentMemory>524288</currentMemory> <vcpu>1</vcpu> <os> <type arch='x86_64' machine='pc-0.12'>hvm</type> <boot dev='hd'/> </os> <features> <acpi/> <apic/> <pae/> </features> <clock offset='utc'/> <on_poweroff>destroy</on_poweroff> <on_reboot>restart</on_reboot> <on_crash>restart</on_crash> <devices> <emulator>/usr/bin/kvm</emulator> Is this possible? Ideally I'd like to be able to stop the host without having to manually deal with shutting down the VM.

    Read the article

  • Strange Recurrent Excessive I/O Wait

    - by Chris
    I know quite well that I/O wait has been discussed multiple times on this site, but all the other topics seem to cover constant I/O latency, while the I/O problem we need to solve on our server occurs at irregular (short) intervals, but is ever-present with massive spikes of up to 20k ms a-wait and service times of 2 seconds. The disk affected is /dev/sdb (Seagate Barracuda, for details see below). A typical iostat -x output would at times look like this, which is an extreme sample but by no means rare: iostat (Oct 6, 2013) tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 16.00 0.00 156.00 9.75 21.89 288.12 36.00 57.60 5.50 0.00 44.00 8.00 48.79 2194.18 181.82 100.00 2.00 0.00 16.00 8.00 46.49 3397.00 500.00 100.00 4.50 0.00 40.00 8.89 43.73 5581.78 222.22 100.00 14.50 0.00 148.00 10.21 13.76 5909.24 68.97 100.00 1.50 0.00 12.00 8.00 8.57 7150.67 666.67 100.00 0.50 0.00 4.00 8.00 6.31 10168.00 2000.00 100.00 2.00 0.00 16.00 8.00 5.27 11001.00 500.00 100.00 0.50 0.00 4.00 8.00 2.96 17080.00 2000.00 100.00 34.00 0.00 1324.00 9.88 1.32 137.84 4.45 59.60 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 22.00 44.00 204.00 11.27 0.01 0.27 0.27 0.60 Let me provide you with some more information regarding the hardware. It's a Dell 1950 III box with Debian as OS where uname -a reports the following: Linux xx 2.6.32-5-amd64 #1 SMP Fri Feb 15 15:39:52 UTC 2013 x86_64 GNU/Linux The machine is a dedicated server that hosts an online game without any databases or I/O heavy applications running. The core application consumes about 0.8 of the 8 GBytes RAM, and the average CPU load is relatively low. The game itself, however, reacts rather sensitive towards I/O latency and thus our players experience massive ingame lag, which we would like to address as soon as possible. iostat: avg-cpu: %user %nice %system %iowait %steal %idle 1.77 0.01 1.05 1.59 0.00 95.58 Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sdb 13.16 25.42 135.12 504701011 2682640656 sda 1.52 0.74 20.63 14644533 409684488 Uptime is: 19:26:26 up 229 days, 17:26, 4 users, load average: 0.36, 0.37, 0.32 Harddisk controller: 01:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04) Harddisks: Array 1, RAID-1, 2x Seagate Cheetah 15K.5 73 GB SAS Array 2, RAID-1, 2x Seagate ST3500620SS Barracuda ES.2 500GB 16MB 7200RPM SAS Partition information from df: Filesystem 1K-blocks Used Available Use% Mounted on /dev/sdb1 480191156 30715200 425083668 7% /home /dev/sda2 7692908 437436 6864692 6% / /dev/sda5 15377820 1398916 13197748 10% /usr /dev/sda6 39159724 19158340 18012140 52% /var Some more data samples generated with iostat -dx sdb 1 (Oct 11, 2013) Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util sdb 0.00 15.00 0.00 70.00 0.00 656.00 9.37 4.50 1.83 4.80 33.60 sdb 0.00 0.00 0.00 2.00 0.00 16.00 8.00 12.00 836.00 500.00 100.00 sdb 0.00 0.00 0.00 3.00 0.00 32.00 10.67 9.96 1990.67 333.33 100.00 sdb 0.00 0.00 0.00 4.00 0.00 40.00 10.00 6.96 3075.00 250.00 100.00 sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 100.00 sdb 0.00 0.00 0.00 2.00 0.00 16.00 8.00 2.62 4648.00 500.00 100.00 sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 2.00 0.00 0.00 100.00 sdb 0.00 0.00 0.00 1.00 0.00 16.00 16.00 1.69 7024.00 1000.00 100.00 sdb 0.00 74.00 0.00 124.00 0.00 1584.00 12.77 1.09 67.94 6.94 86.00 Characteristic charts generated with rrdtool can be found here: iostat plot 1, 24 min interval: http://imageshack.us/photo/my-images/600/yqm3.png/ iostat plot 2, 120 min interval: http://imageshack.us/photo/my-images/407/griw.png/ As we have a rather large cache of 5.5 GBytes, we thought it might be a good idea to test if the I/O wait spikes would perhaps be caused by cache miss events. Therefore, we did a sync and then this to flush the cache and buffers: echo 3 > /proc/sys/vm/drop_caches and directly afterwards the I/O wait and service times virtually went through the roof, and everything on the machine felt like slow motion. During the next few hours the latency recovered and everything was as before - small to medium lags in short, unpredictable intervals. Now my question is: does anybody have any idea what might cause this annoying behaviour? Is it the first indication of the disk array or the raid controller dying, or something that can be easily mended by rebooting? (At the moment we're very reluctant to do this, however, because we're afraid that the disks might not come back up again.) Any help is greatly appreciated. Thanks in advance, Chris. Edited to add: we do see one or two processes go to 'D' state in top, one of which seems to be kjournald rather frequently. If I'm not mistaken, however, this does not indicate the processes causing the latency, but rather those affected by it - correct me if I'm wrong. Does the information about uninterruptibly sleeping processes help us in any way to address the problem? @Andy Shinn requested smartctl data, here it is: smartctl -a -d megaraid,2 /dev/sdb yields: smartctl 5.40 2010-07-12 r3124 [x86_64-unknown-linux-gnu] (local build) Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net Device: SEAGATE ST3500620SS Version: MS05 Serial number: Device type: disk Transport protocol: SAS Local Time is: Mon Oct 14 20:37:13 2013 CEST Device supports SMART and is Enabled Temperature Warning Disabled or Not Supported SMART Health Status: OK Current Drive Temperature: 20 C Drive Trip Temperature: 68 C Elements in grown defect list: 0 Vendor (Seagate) cache information Blocks sent to initiator = 1236631092 Blocks received from initiator = 1097862364 Blocks read from cache and sent to initiator = 1383620256 Number of read and write commands whose size <= segment size = 531295338 Number of read and write commands whose size > segment size = 51986460 Vendor (Seagate/Hitachi) factory information number of hours powered up = 36556.93 number of minutes until next internal SMART test = 32 Error counter log: Errors Corrected by Total Correction Gigabytes Total ECC rereads/ errors algorithm processed uncorrected fast | delayed rewrites corrected invocations [10^9 bytes] errors read: 509271032 47 0 509271079 509271079 20981.423 0 write: 0 0 0 0 0 5022.039 0 verify: 1870931090 196 0 1870931286 1870931286 100558.708 0 Non-medium error count: 0 SMART Self-test log Num Test Status segment LifeTime LBA_first_err [SK ASC ASQ] Description number (hours) # 1 Background short Completed 16 36538 - [- - -] # 2 Background short Completed 16 36514 - [- - -] # 3 Background short Completed 16 36490 - [- - -] # 4 Background short Completed 16 36466 - [- - -] # 5 Background short Completed 16 36442 - [- - -] # 6 Background long Completed 16 36420 - [- - -] # 7 Background short Completed 16 36394 - [- - -] # 8 Background short Completed 16 36370 - [- - -] # 9 Background long Completed 16 36364 - [- - -] #10 Background short Completed 16 36361 - [- - -] #11 Background long Completed 16 2 - [- - -] #12 Background short Completed 16 0 - [- - -] Long (extended) Self Test duration: 6798 seconds [113.3 minutes] smartctl -a -d megaraid,3 /dev/sdb yields: smartctl 5.40 2010-07-12 r3124 [x86_64-unknown-linux-gnu] (local build) Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net Device: SEAGATE ST3500620SS Version: MS05 Serial number: Device type: disk Transport protocol: SAS Local Time is: Mon Oct 14 20:37:26 2013 CEST Device supports SMART and is Enabled Temperature Warning Disabled or Not Supported SMART Health Status: OK Current Drive Temperature: 19 C Drive Trip Temperature: 68 C Elements in grown defect list: 0 Vendor (Seagate) cache information Blocks sent to initiator = 288745640 Blocks received from initiator = 1097848399 Blocks read from cache and sent to initiator = 1304149705 Number of read and write commands whose size <= segment size = 527414694 Number of read and write commands whose size > segment size = 51986460 Vendor (Seagate/Hitachi) factory information number of hours powered up = 36596.83 number of minutes until next internal SMART test = 28 Error counter log: Errors Corrected by Total Correction Gigabytes Total ECC rereads/ errors algorithm processed uncorrected fast | delayed rewrites corrected invocations [10^9 bytes] errors read: 610862490 44 0 610862534 610862534 20470.133 0 write: 0 0 0 0 0 5022.480 0 verify: 2861227413 203 0 2861227616 2861227616 100872.443 0 Non-medium error count: 1 SMART Self-test log Num Test Status segment LifeTime LBA_first_err [SK ASC ASQ] Description number (hours) # 1 Background short Completed 16 36580 - [- - -] # 2 Background short Completed 16 36556 - [- - -] # 3 Background short Completed 16 36532 - [- - -] # 4 Background short Completed 16 36508 - [- - -] # 5 Background short Completed 16 36484 - [- - -] # 6 Background long Completed 16 36462 - [- - -] # 7 Background short Completed 16 36436 - [- - -] # 8 Background short Completed 16 36412 - [- - -] # 9 Background long Completed 16 36404 - [- - -] #10 Background short Completed 16 36401 - [- - -] #11 Background long Completed 16 2 - [- - -] #12 Background short Completed 16 0 - [- - -] Long (extended) Self Test duration: 6798 seconds [113.3 minutes]

    Read the article

  • Why are my hard drives failing?

    - by WishCow
    I have a small Ubuntu server running at home, with 2 HDDs. There are two software raids (raid1) on the disks, managed by mdadm, which I believe is irrelevant, but mentioning it anyway. Both of the HDDs are Western Digital, and have been used for around 2 years, when one of them started making clicking noises, and died. I figured that maybe it's natural after 2 years, so I bought a new one, and resynced the raid arrays. After about a month, the other drive also died. I didn't get suspicious, since both drives have been bought at the same time, it's not that surprising to see both of them near each other, so I bought another one. So far, 2 old drives failed, and 2 brand new in the system. After one month, one of the new drives died. This is when it started getting suspicious. Since the PC was put together from some really old parts (think AthlonXP), I figured that maybe the motherboard's SATA controller is the culprit. Of course you can't switch parts easily in an old PC like this, so I bought a whole system, new MB, new CPU, new RAM. Took the just failed drive back, since it was under warranty, and got it replaced. So it is up to 2 failed drives from the old ones, and 1 failed drive from the new ones. No problems, for 1 month. After that errors were creeping up again in /var/log/messages, and mdadm was reporting raid array failures. I started tearing my hair out. Everything is new in the system, it's up to the third brand new HDD, it's simply not possible that all of the new drives that I bought were faulty. Let's see what is still common... the cables. Okay, long shot, let's replace the SATA cables. Take HDD back, smile to the guy at the counter and say that I'm really unlucky. He replaces the HDD. I come home, one month passes and one of HDDs fails, again. I'm not joking. Two of the brand new HDDs have failed. Maybe it's a bug in the OS. Let's see what the manufacturer's testing tool says. Download testing tool, burn it to a CD, reboot, leave HDD testing overnight. Test says that the drive is faulty, and I should back up everything, if I still can. I don't know what's happening, but it does not look like a software problem, something is definitely thrashing the HDDs. I should mention now, that the whole system is in a shoebox. Since there are a load of "build your own ikea case" stuff, I thought there shouldn't be any problems throwing the thing in a box, and stuffing it away somewhere. The box is well ventilated, but I thought that just maybe the drives were overheating. There is no other possible answer to this. So I took the HDD back, and got it replaced (for the 3rd time), and bought HDD coolers. And just now, I have heard the sound of doom. click click whizzzzzzzzz. SSH into the box: You have new mail! mail r 1 DegradedArrayEvent on /dev/md0 ... dmesg output: [47128.000051] ata3: lost interrupt (Status 0x50) [47128.000097] end_request: I/O error, dev sda, sector 58588863 [47128.000134] md: super_written gets error=-5, uptodate=0 [48043.976054] ata3: lost interrupt (Status 0x50) [48043.976086] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen [48043.976132] ata3.00: cmd c8/00:18:bf:40:52/00:00:00:00:00/e1 tag 0 dma 12288 in [48043.976135] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) [48043.976208] ata3.00: status: { DRDY } [48043.976241] ata3: soft resetting link [48044.148446] ata3.00: configured for UDMA/133 [48044.148457] ata3.00: device reported invalid CHS sector 0 [48044.148477] ata3: EH complete Recap: No possibility of overheating 6 drives have failed, 4 of those have been brand new. I'm not sure now that the original two have been faulty, or suffered the same thing that the new ones. There is nothing common in the system, apart from the OS which is Ubuntu Karmic now (started with Jaunty). New MB, new CPU, new RAM, new SATA cables. No, the little holes on the HDD are not covered I'm crying. Really. I don't have the face to return to the store now, it's not possible for 4 drives to fail under 4 months. A few ideas that I have been thinking: Is it possible that I fuck up something when I partition and resync the drives? Can it be so bad that it physicaly wrecks the drive? (since the vendor supplied tool says that the drive is damaged) I do the partitoning with fdisk, and use the same block size for the raid1 partitions (I check the exact block sizes with fdisk -lu) Is it possible that the linux kernel or mdadm, or something is not compatible with this exact brand of HDDs, and thrashes them? Is it possible that it may be the shoebox? Try placing it somewhere else? It's under a shelf now, so humidity is not a problem either. Is it possible that a normal PC case will solve my problem (I'm going to shoot myself then)? I will get a picture tomorrow. Am I just simply cursed? Any help or speculation is greatly appreciated. Edit: The power strip is guarded against overvoltage. Edit2: I have moved inbetween these 4 months, so the possibility of the cause being "dirty" electricity in both places, is very low. Edit3: I have checked the voltages in the BIOS (couldn't borrow a multimeter), and they are all seem correct, the biggest discrepancy is in the 12V, because it's supplying 11.3. Should I be worried about that? Edit4: I put my desktop PC's PSU into the server. The BIOS reported much more accurate voltage readings, and also it has successfully rebuilt the raid1 array, which took some 3-4 hours, so I feel a little positive now. Will get a new PSU tomorrow to test with that. Also, attaching the picture about the box: (disregard the 3rd drive)

    Read the article

  • Optimize php-fpm and varnish for a powerfull server

    - by Jim
    My setup is: Intel® Core™ i7-2600 and RAM 16 GB DDR3 RAM varnish+nginx+php-fpm+apc for a not very heavy WordPress blog with W3 Total Cache and CDN My problem is that after 55 hits per second according to blitz.io varnish starts giving out timeouts. CPU usage at this time is hardly 1%. Free memory at all time remains 10GB+. I tried benchmarking php-fpm directly with result of 150hits/s without any timeouts. But after that the CPU usage goes 100% and it stops responding. Can you help me optimize it to handle more? As i understand nginx has nothing to do over here so i dont put its config. php-fpm config listen = /tmp/php5-fpm.sock listen.allowed_clients = 127.0.0.1 user = nginx group = nginx pm = dynamic pm.max_children = 150 pm.start_servers = 7 pm.min_spare_servers = 2 pm.max_spare_servers = 15 pm.max_requests = 500 slowlog = /var/log/php-fpm/www-slow.log php_admin_value[error_log] = /var/log/php-fpm/www-error.log php_admin_flag[log_errors] = on apc extension = apc.so apc.enabled=1 apc.shm_size=512MB apc.num_files_hint=0 apc.user_entries_hint=0 apc.ttl=7200 apc.use_request_time=1 apc.user_ttl=7200 apc.gc_ttl=3600 apc.cache_by_default=1 apc.filters apc.mmap_file_mask=/tmp/apc.XXXXXX apc.file_update_protection=2 apc.enable_cli=0 apc.max_file_size=1M apc.stat=1 apc.stat_ctime=0 apc.canonicalize=0 apc.write_lock=1 apc.report_autofilter=0 apc.rfc1867=0 apc.rfc1867_prefix =upload_ apc.rfc1867_name=APC_UPLOAD_PROGRESS apc.rfc1867_freq=0 apc.rfc1867_ttl=3600 apc.include_once_override=0 apc.lazy_classes=0 apc.lazy_functions=0 apc.coredump_unmap=0 apc.file_md5=0 apc.preload_path Varnish VCL backend default { .host = "127.0.0.1"; .port = "8080"; .connect_timeout = 6s; .first_byte_timeout = 6s; .between_bytes_timeout = 60s; } acl purgehosts { "localhost"; "127.0.0.1"; } # Called after a document has been successfully retrieved from the backend. sub vcl_fetch { # Uncomment to make the default cache "time to live" is 5 minutes, handy # but it may cache stale pages unless purged. (TODO) # By default Varnish will use the headers sent to it by Apache (the backend server) # to figure out the correct TTL. # WP Super Cache sends a TTL of 3 seconds, set in wp-content/cache/.htaccess set beresp.ttl = 24h; # Strip cookies for static files and set a long cache expiry time. if (req.url ~ "\.(jpg|jpeg|gif|png|ico|css|zip|tgz|gz|rar|bz2|pdf|txt|tar|wav|bmp|rtf|js|flv|swf|html|htm)$") { unset beresp.http.set-cookie; set beresp.ttl = 24h; } # If WordPress cookies found then page is not cacheable if (req.http.Cookie ~"(wp-postpass|wordpress_logged_in|comment_author_)") { # set beresp.cacheable = false;#versions less than 3 #beresp.ttl>0 is cacheable so 0 will not be cached set beresp.ttl = 0s; } else { #set beresp.cacheable = true; set beresp.ttl=24h;#cache for 24hrs } # Varnish determined the object was not cacheable #if ttl is not > 0 seconds then it is cachebale if (!beresp.ttl > 0s) { # set beresp.http.X-Cacheable = "NO:Not Cacheable"; } else if ( req.http.Cookie ~"(wp-postpass|wordpress_logged_in|comment_author_)" ) { # You don't wish to cache content for logged in users set beresp.http.X-Cacheable = "NO:Got Session"; return(hit_for_pass); #previously just pass but changed in v3+ } else if ( beresp.http.Cache-Control ~ "private") { # You are respecting the Cache-Control=private header from the backend set beresp.http.X-Cacheable = "NO:Cache-Control=private"; return(hit_for_pass); } else if ( beresp.ttl < 1s ) { # You are extending the lifetime of the object artificially set beresp.ttl = 300s; set beresp.grace = 300s; set beresp.http.X-Cacheable = "YES:Forced"; } else { # Varnish determined the object was cacheable set beresp.http.X-Cacheable = "YES"; if (beresp.status == 404 || beresp.status >= 500) { set beresp.ttl = 0s; } # Deliver the content return(deliver); } sub vcl_hash { # Each cached page has to be identified by a key that unlocks it. # Add the browser cookie only if a WordPress cookie found. if ( req.http.Cookie ~"(wp-postpass|wordpress_logged_in|comment_author_)" ) { #set req.hash += req.http.Cookie; hash_data(req.http.Cookie); } } # vcl_recv is called whenever a request is received sub vcl_recv { # remove ?ver=xxxxx strings from urls so css and js files are cached. # Watch out when upgrading WordPress, need to restart Varnish or flush cache. set req.url = regsub(req.url, "\?ver=.*$", ""); # Remove "replytocom" from requests to make caching better. set req.url = regsub(req.url, "\?replytocom=.*$", ""); remove req.http.X-Forwarded-For; set req.http.X-Forwarded-For = client.ip; # Exclude this site because it breaks if cached if ( req.http.host == "sr.ituts.gr" ) { return( pass ); } # Serve objects up to 2 minutes past their expiry if the backend is slow to respond. set req.grace = 120s; # Strip cookies for static files: if (req.url ~ "\.(jpg|jpeg|gif|png|ico|css|zip|tgz|gz|rar|bz2|pdf|txt|tar|wav|bmp|rtf|js|flv|swf|html|htm)$") { unset req.http.Cookie; return(lookup); } # Remove has_js and Google Analytics __* cookies. set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(__[a-z]+|has_js)=[^;]*", ""); # Remove a ";" prefix, if present. set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", ""); # Remove empty cookies. if (req.http.Cookie ~ "^\s*$") { unset req.http.Cookie; } if (req.request == "PURGE") { if (!client.ip ~ purgehosts) { error 405 "Not allowed."; } #previous version ban() was purge() ban("req.url ~ " + req.url + " && req.http.host == " + req.http.host); error 200 "Purged."; } # Pass anything other than GET and HEAD directly. if (req.request != "GET" && req.request != "HEAD") { return( pass ); } /* We only deal with GET and HEAD by default */ # remove cookies for comments cookie to make caching better. set req.http.cookie = regsub(req.http.cookie, "1231111111111111122222222333333=[^;]+(; )?", ""); # never cache the admin pages, or the server-status page, or your feed? you may want to..i don't if (req.request == "GET" && (req.url ~ "(wp-admin|bb-admin|server-status|feed)")) { return(pipe); } # don't cache authenticated sessions if (req.http.Cookie && req.http.Cookie ~ "(wordpress_|PHPSESSID)") { return(lookup); } # don't cache ajax requests if(req.http.X-Requested-With == "XMLHttpRequest" || req.url ~ "nocache" || req.url ~ "(control.php|wp-comments-post.php|wp-login.php|bb-login.php|bb-reset-password.php|register.php)") { return (pass); } return( lookup ); } Varnish Daemon options DAEMON_OPTS="-a :80 \ -T 127.0.0.1:6082 \ -f /etc/varnish/ituts.vcl \ -u varnish -g varnish \ -S /etc/varnish/secret \ -p thread_pool_add_delay=2 \ -p thread_pools=8 \ -p thread_pool_min=100 \ -p thread_pool_max=1000 \ -p session_linger=50 \ -p session_max=150000 \ -p sess_workspace=262144 \ -s malloc,5G" Im not sure where to start, should i for start optimize php-fpm and then go to varnish or php-fpm is at its max right now so i should start looking for the problem in varnish?

    Read the article

  • How to set up linux watchdog daemon with Intel 6300esb

    - by ACiD GRiM
    I've been searching for this on Google for sometime now and I have yet to find proper documentation on how to connect the kernel driver for my 6300esb watchdog timer to /dev/watchdog and ensure that watchdog daemon is keeping it alive. I am using RHEL compatible Scientific Linux 6.3 in a KVM virtual machine by the way Below is everything I've tried so far: dmesg|grep 6300 i6300ESB timer: Intel 6300ESB WatchDog Timer Driver v0.04 i6300ESB timer: initialized (0xffffc900008b8000). heartbeat=30 sec (nowayout=0) | ll /dev/watchdog crw-rw----. 1 root root 10, 130 Sep 22 22:25 /dev/watchdog | /etc/watchdog.conf #ping = 172.31.14.1 #ping = 172.26.1.255 #interface = eth0 file = /var/log/messages #change = 1407 # Uncomment to enable test. Setting one of these values to '0' disables it. # These values will hopefully never reboot your machine during normal use # (if your machine is really hung, the loadavg will go much higher than 25) max-load-1 = 24 max-load-5 = 18 max-load-15 = 12 # Note that this is the number of pages! # To get the real size, check how large the pagesize is on your machine. #min-memory = 1 #repair-binary = /usr/sbin/repair #test-binary = #test-timeout = watchdog-device = /dev/watchdog # Defaults compiled into the binary #temperature-device = #max-temperature = 120 # Defaults compiled into the binary #admin = root interval = 10 #logtick = 1 # This greatly decreases the chance that watchdog won't be scheduled before # your machine is really loaded realtime = yes priority = 1 # Check if syslogd is still running by enabling the following line #pidfile = /var/run/syslogd.pid Now maybe I'm not testing it correctly, but I would expecting that stopping the watchdog service would cause the /dev/watchdog to time out after 30 seconds and I should see the host reboot, however this does not happen. Also, here is my config for the KVM vm <!-- WARNING: THIS IS AN AUTO-GENERATED FILE. CHANGES TO IT ARE LIKELY TO BE OVERWRITTEN AND LOST. Changes to this xml configuration should be made using: virsh edit sl6template or other application using the libvirt API. --> <domain type='kvm'> <name>sl6template</name> <uuid>960d0ac2-2e6a-5efa-87a3-6bb779e15b6a</uuid> <memory unit='KiB'>262144</memory> <currentMemory unit='KiB'>262144</currentMemory> <vcpu placement='static'>1</vcpu> <os> <type arch='x86_64' machine='rhel6.3.0'>hvm</type> <boot dev='hd'/> </os> <features> <acpi/> <apic/> <pae/> </features> <cpu mode='custom' match='exact'> <model fallback='allow'>Westmere</model> <vendor>Intel</vendor> <feature policy='require' name='tm2'/> <feature policy='require' name='est'/> <feature policy='require' name='vmx'/> <feature policy='require' name='ds'/> <feature policy='require' name='smx'/> <feature policy='require' name='ss'/> <feature policy='require' name='vme'/> <feature policy='require' name='dtes64'/> <feature policy='require' name='rdtscp'/> <feature policy='require' name='ht'/> <feature policy='require' name='dca'/> <feature policy='require' name='pbe'/> <feature policy='require' name='tm'/> <feature policy='require' name='pdcm'/> <feature policy='require' name='pdpe1gb'/> <feature policy='require' name='ds_cpl'/> <feature policy='require' name='pclmuldq'/> <feature policy='require' name='xtpr'/> <feature policy='require' name='acpi'/> <feature policy='require' name='monitor'/> <feature policy='require' name='aes'/> </cpu> <clock offset='utc'/> <on_poweroff>destroy</on_poweroff> <on_reboot>restart</on_reboot> <on_crash>restart</on_crash> <devices> <emulator>/usr/libexec/qemu-kvm</emulator> <disk type='file' device='disk'> <driver name='qemu' type='raw'/> <source file='/mnt/data/vms/sl6template.img'/> <target dev='vda' bus='virtio'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> </disk> <controller type='usb' index='0'> <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> </controller> <interface type='bridge'> <mac address='52:54:00:44:57:f6'/> <source bridge='br0.2'/> <model type='virtio'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> </interface> <interface type='bridge'> <mac address='52:54:00:88:0f:42'/> <source bridge='br1'/> <model type='virtio'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/> </interface> <serial type='pty'> <target port='0'/> </serial> <console type='pty'> <target type='serial' port='0'/> </console> <watchdog model='i6300esb' action='reset'> <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/> </watchdog> <memballoon model='virtio'> <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/> </memballoon> </devices> </domain> Any help is appreciated as the most I've found are patches to kvm and general softdog documentation or IPMI watchdog answers.

    Read the article

< Previous Page | 197 198 199 200 201 202 203 204 205 206 207 208  | Next Page >