Search Results

Search found 1914 results on 77 pages for 'mongrel cluster'.


hdfs configuration

- by Ananymous

I am a newbie trying to set up an HDFS system to serve my data at my lab (I don't plan to use MapReduce). So far I have read "cluster setup", but I am still confused. Several questions:

- Do I need to have a secondary namenode?
- There are 2 files, masters and slaves. Do I really need these 2 files even though I just want HDFS? If I need them, what should go in there? I assume my namenode goes in masters and my datanodes in slaves?
- Do I need slave nodes?
- What configuration files are needed for the namenode, secondary namenode, datanode and client? (I assume core-site.xml is needed for all 4.)

In addition, can someone suggest a good configuration model? A sample configuration for the namenode, secondary namenode, datanode and client would be very helpful. I am getting confused because most of the documentation seems to assume I want to use MapReduce, which isn't the case.

Backing up SQL NetApp Snapshots using TSM

- by WerkkreW

In our environment we have a 3 node SQL 2005 Cluster which is on NetApp storage. We are currently using SMSQL (NetApp SnapManager for SQL) to take Snapshot backups of the data. This works great, but due to some audit requirements we are also forced to maintain some copies on tape. We have used NDMP in other places across the enterprise but we do not want to use it in this specific instance.

Basically, what I need to do is get the most recent snapshot copy of the databases onto tape via Tivoli Storage Manager (TSM). I have obtained a basic Windows Server 2003 VM with SnapDrive installed, which is SAN attached and zoned to the NetApp, and I have written a batch file to do the following:

- Mount the latest __RECENT snapshot LUN to the host, using a specific drive letter
- Perform a TSM based incremental backup
- Dismount the LUN

This seems to work fine, except that sometimes the LUNs do not mount due to some sort of timeout. Also, due to my limited knowledge of Windows batch scripting, I have no way to monitor the success or failure of these backups, since I do not know how to send a valid return code back to the TSM scheduling service. Is there a more efficient/elegant way to accomplish this without NDMP?
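The mount / backup / dismount sequence above, with a retry for the mount timeout and an exit code for the scheduler, could be sketched roughly as follows. This is only an illustration in Python rather than batch; the command lists passed in are hypothetical placeholders for the real SnapDrive and TSM command lines, and it assumes the scheduler treats a non-zero exit code from the wrapper as a failed run.

```python
import subprocess
import time


def run(cmd):
    """Run one command (a list of arguments) and return its exit code."""
    return subprocess.call(cmd)


def backup_with_retries(mount_cmd, backup_cmd, unmount_cmd,
                        runner=run, retries=3, delay=30, sleep=time.sleep):
    """Mount the snapshot LUN, back it up, dismount it; return an exit code.

    The three commands are placeholders supplied by the caller; nothing here
    knows the real SnapDrive or TSM syntax.
    """
    # Retry the mount, since the LUN sometimes fails to attach in time.
    for attempt in range(1, retries + 1):
        if runner(mount_cmd) == 0:
            break
        if attempt < retries:
            sleep(delay)
    else:
        return 1  # the mount never succeeded, so report failure

    try:
        backup_rc = runner(backup_cmd)
    finally:
        # Always try to dismount, even if the backup step raised an error.
        unmount_rc = runner(unmount_cmd)

    # Report the backup's failure first, otherwise the dismount's result.
    return backup_rc if backup_rc != 0 else unmount_rc
```

A wrapper script would end with something like `sys.exit(backup_with_retries([...], [...], [...]))`, so that the process exit code reflects whether every step succeeded.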

AWS VPC ELB vs. Custom Load Balancing

- by CP510

So I'm wondering if this is a good idea. I have an Amazon AWS VPC set up with public and private subnets, so I already have the Internet Gateway and NAT. I was going to set up all my web servers (Apache2 instances) and DB servers in the private subnet and use a load balancer/reverse proxy to pick up requests and send them into the private subnet's cluster of servers. My question, then: is Amazon's ELB a good fit for this, or is it better to set up my own custom instance to handle the public requests and run them through the NAT using nginx or pound? I like the second option just for the sake of having an instance I can log into and check, as well as taking advantage of caching and fail2ban DDoS prevention, and possibly using failsafes to redirect traffic. But I have no experience with ELBs, so I thought I'd ask your opinions. Also, if you have an opinion on this as well: would the second option allow me to have only 1 public IP address and route SSH connections by port number to the respective instances? Thanks in advance!

Need advice for choosing software\hardware for virtualization.

- by Anatoly

Currently we have these servers:

- Windows SBS 2003 Premium on an IBM X266 with dual Xeon F43 and 2GB RAM. DC, Exchange (70 users), MSSQL.
- Windows 2003 R2 32-bit on an IBM x3400 with dual Xeon E5310 and 4GB RAM. Terminal server (40+ users), ERP application based on the uniPaaS platform from Magicsoftware, and Pervasive SQL.
- Ubuntu 8.04 (simple PC box) with squid proxy, GLPI system and a PHPBB3 forum for internal use.

Recently the number of concurrent users on the terminal server passed 40 in rush hours and it frequently gets stuck, so we need an upgrade. I am thinking about moving all physical servers to virtual servers on a cluster of 2 physical servers to reduce downtime. I think we will grow to 50-60 concurrent terminal users in rush hours. I also plan to virtualize 10-15 Win XP/7 workstations (office, ERP etc.), and there is a small probability of adding Asterisk\Hylafax for 100 users (if that is possible on the same VM). We also need NAS storage of 2-3TB.

What hardware upgrade/purchase do we need to complete this task? Which VM solution is preferable, VMware or Hyper-V? What backup software should we choose: Acronis or something else? Thank you in advance.

memory tuning with rails/unicorn running on ubuntu

- by user970193

I am running unicorn on Ubuntu 11, Rails 3.0, and Ruby 1.8.7. It is an 8 core ec2 box, and I am running 15 workers. CPU never seems to get pinned, and I seem to be handling requests pretty nicely. My question concerns memory usage, and what concerns (if any) I should have with what I am seeing.

Here is the scenario: under constant load (about 15 reqs/sec coming in from nginx), each server in the 3 server cluster loses about 100MB of free memory per hour. This is a linear slope for about 6 hours, then it appears to level out, but still appears to lose maybe 10MB/hour. If I drop my page caches using the linux command echo 1 > /proc/sys/vm/drop_caches, the available free memory shoots back up to what it was when I started the unicorns, and the memory loss pattern begins again over the hours.

Before:

                 total       used       free     shared    buffers     cached
    Mem:       7130244    5005376    2124868          0     113628     422856
    -/+ buffers/cache:    4468892    2661352
    Swap:     33554428          0   33554428

After:

                 total       used       free     shared    buffers     cached
    Mem:       7130244    4467144    2663100          0        228      11172
    -/+ buffers/cache:    4455744    2674500
    Swap:     33554428          0   33554428

My Ruby code does use memoizations and I'm assuming Ruby/Rails/Unicorn is keeping its own caches... what I'm wondering is, should I be worried about this behaviour? FWIW, my Unicorn config:

    worker_processes 15
    listen "#{CAPISTRANO_ROOT}/shared/pids/unicorn_socket", :backlog => 1024
    listen 8080, :tcp_nopush => true
    timeout 180
    pid "#{CAPISTRANO_ROOT}/shared/pids/unicorn.pid"

    GC.respond_to?(:copy_on_write_friendly=) and GC.copy_on_write_friendly = true

    before_fork do |server, worker|
      STDERR.puts "XXXXXXXXXXXXXXXXXXX BEFORE FORK"
      print_gemfile_location
      defined?(ActiveRecord::Base) and ActiveRecord::Base.connection.disconnect!
      defined?(Resque) and Resque.redis.client.disconnect
      old_pid = "#{CAPISTRANO_ROOT}/shared/pids/unicorn.pid.oldbin"
      if File.exists?(old_pid) && server.pid != old_pid
        begin
          Process.kill("QUIT", File.read(old_pid).to_i)
        rescue Errno::ENOENT, Errno::ESRCH
          # already killed
        end
      end
      File.open("#{CAPISTRANO_ROOT}/shared/pids/unicorn.pid.ok", "w"){|f| f.print($$.to_s)}
    end

    after_fork do |server, worker|
      defined?(ActiveRecord::Base) and ActiveRecord::Base.establish_connection
      defined?(Resque) and Resque.redis.client.connect
    end

Is there a need to experiment with enforcing more stringent garbage collection using OobGC (http://unicorn.bogomips.org/Unicorn/OobGC.html)? Or is this just normal behaviour, and when/as the system needs more memory, it will empty the caches by itself, without me manually running that cache command? Basically, is this normal, expected behaviour? tia
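The two `free` snapshots in the question can be compared with a little arithmetic. The sketch below (Python, purely illustrative) uses only the numbers quoted above and the usual relation behind the "-/+ buffers/cache" row, namely used minus buffers minus cached; the variable names are made up for the example.

```python
def app_used_kb(used, buffers, cached):
    """Memory used by processes, excluding kernel buffers and page cache (KB)."""
    return used - buffers - cached


# Values copied from the "Before" and "After" snapshots in the question (KB).
before = {"used": 5005376, "free": 2124868, "buffers": 113628, "cached": 422856}
after = {"used": 4467144, "free": 2663100, "buffers": 228, "cached": 11172}

before_app = app_used_kb(before["used"], before["buffers"], before["cached"])
after_app = app_used_kb(after["used"], after["buffers"], after["cached"])

# How much "free" went up after dropping caches, and how much cache went away.
freed = after["free"] - before["free"]
cache_dropped = (before["buffers"] + before["cached"]) - (after["buffers"] + after["cached"])

print("process memory before:", before_app)   # 4468892
print("process memory after: ", after_app)    # 4455744
print("free memory regained: ", freed)         # 538232
print("buffers+cache dropped:", cache_dropped) # 525084
```

On these figures, roughly 525 MB of the 538 MB regained came out of buffers and page cache, while the memory attributed to processes moved by only about 13 MB.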

Best practices for thin-provisioning Linux servers (on VMware)

- by nbr

I have a setup of about 20 Linux machines, each with about 30-150 gigabytes of customer data. The size of the data will probably grow significantly faster on some machines than on others. These are virtual machines on a VMware vSphere cluster. The disk images are stored on a SAN system.

I'm trying to find a solution that would use disk space sparingly, while still allowing for easy growing of individual machines. In theory, I would just create big disks for each machine and use thin provisioning. Each disk would grow as needed. However, it seems that a 500 GB ext3 filesystem with only 50 GB of data and quite a low number of writes still easily grows the disk image to e.g. 250 GB over time. Or maybe I'm doing something wrong here? (I was surprised how little I found on the subject with Google. BTW, there's not even a thin-provisioning tag on serverfault.com.)

Currently I'm planning to create big, thin-provisioned disks, but with a small LVM volume on them. For example: a 100 GB volume on a 500 GB disk. That way I could more easily grow the LVM volume and the filesystem size as needed, even online.

Now for the actual question: are there better ways to do this (that is, to grow data size as needed without downtime)? Possible solutions include:

- Using a thin-provisioning friendly filesystem that tries to occupy the same spots over and over again, thus not growing the image size.
- Finding an easy method of reclaiming free space on the partition (re-thinning?)
- Something else?

A bonus question: if I go with my current plan, would you recommend creating partitions on the disks (pvcreate /dev/sdX1 vs pvcreate /dev/sdX)? I think it's against conventions to use raw disks without partitions, but it would make it a bit easier to grow the disks, if that is ever needed. This is all just a matter of taste, right?

Default /server-status location not inheriting in Apache

- by rmalayter

I'm having a problem getting /server-status to work on Apache 2.2.14 on Ubuntu Server 10.04.1. The default symlinks for status.load and status.conf are present in /etc/apache2/mods-enabled. The status.conf does include the location /server-status and appropriate allow/deny directives. However, the only vhost I have in sites-enabled looks like the one below. The idea is to proxy anything with a Tomcat URL to a cluster of tomcats, and anything else to an IIS box. However, this seems to result in requests to /server-status being sent to IIS. Copying the /server-status location explicitly into the vhost configuration doesn't seem to help, no matter what order I use. Is it possible to include /server-status within a vhost configuration that has a "default" proxy rule?

    <VirtualHost *:80>
        ServerAdmin webmaster@localhost
        DocumentRoot /var/www
        Header add Set-Cookie "ROUTEID=.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED
        <Proxy balancer://tomcatCluster>
            BalancerMember ajp://qa-app1:8009 route=1
            BalancerMember ajp://qa-app2:8009 route=2
            ProxySet stickysession=ROUTEID
        </Proxy>
        <ProxyMatch "^/(mytomcatappA|mytomcatappB)/(.*)" >
            ProxyPassMatch balancer://tomcatCluster/$1/$2
        </ProxyMatch>
        #proxy anything that's not a tomcat URL to IIS on port 80
        <Proxy />
            ProxyPass http://qa-web1/
        </Proxy>

Server Clustering (Django, Apache, Nginx, Postgres)

- by system-matrix

I have a project deployed with Django, Apache, Nginx and Postgres. The project requires live data to be viewable by customers. The project's main points are:

1. Devices in the field send data to the server after login (devices are also like website users).
2. There is a background import process which imports the uploaded data into Postgres.
3. The web users of the system use this data and can send commands to the devices, which the devices read when they log in.
4. There are also background analysis routines running on the data.

All of the above is deployed on one Amazon EC2 cloud machine. The project currently supports over 600 devices and 400 users, but as the number of devices increases over time, the performance of the server is going down. We want to extend this project so that it can support more and more devices. My initial thinking is that we will create one more server like the current one and divide the devices between these two servers. But again, we need a central user and device management point through the Django admin. Any ideas? What are the best possible ways to create a scalable architecture? How can I create a Postgres cluster and use it with Django, if possible?

What can cause a kernel hang on redhat 4?

- by Ivan Buttinoni

I have to solve a nasty problem on a ten-machine "cluster": at random, one of these machines hangs during a heavy computation; sometimes it still answers ping, sometimes not. The problem was described to me over the phone and I have not yet touched or seen these machines, so I can't be more precise. It seems there is no (real) keyboard or monitor attached to them, so I have no information about keyboard LEDs or messages on the monitor. Don't worry: what I really need are suggestions on where to look for the problem, and on what can cause a kernel hang on a working machine. I also saw this post, but it seems to cover the same need in a different situation. My ideas so far:

- HW problem (RAM, CPU, fan etc.)
- bad autofs configuration
- bad nfs(?) configuration
- presence of a trojan/hacker/etc
- /dev/"swap" linked to /dev/zero
- kernel out of memory(??)
- kernel bug

In other words, I am trying to imagine what kind of event can crash the kernel instead of the application that generated the event. What hangs have YOU experienced before? Write them to me! TIA

IPC between multiple processes on multiple servers

- by z8000

Let's say you have 2 servers with 8 CPU cores each. The servers each run 8 network services that each host an arbitrary number of long-lived TCP/IP client connections. Clients send messages to the services. The services do something based on the messages, and potentially notify N1 of the clients of state changes. Sure, it sounds like a botnet but it isn't. Consider how IRC works with c2s and s2s connections and s2s message relaying.

The servers are in the same data center. The servers can communicate over a private VLAN @1GigE. Messages are < 1KB in size.

How would you coordinate which services on which host should receive and relay messages to connected clients for state change messages? There's an infinite number of ways to solve this problem efficiently:

- AMQP (RabbitMQ, ZeroMQ, etc.)
- Spread Toolkit
- N^2 connections between all services (bad)
- Heck, even run IRC!
- ...

I'm looking for a solution that:

- perhaps exploits the fact that there's only a small closed cluster
- is easy to admin
- scales well
- is "dumb" (no weird edge cases)

What are your experiences? What do you recommend? Thanks!

Problems configuring logstash for email output

- by user2099762

I'm trying to configure logstash to send email alerts and log output in elasticsearch / kibana. I have the logs successfully syncing via rsyslog, but I get the following error when I run

    /opt/logstash-1.4.1/bin/logstash agent -f /opt/logstash-1.4.1/logstash.conf --configtest

    Error: Expected one of #, {, ,, ] at line 23, column 12 (byte 387) after filter { if [program] == "nginx-access" { grok { match = [ "message" , "%{IPORHOST:remote_addr} - %{USERNAME:remote_user} [%{HTTPDATE:time_local}] %{QS:request} %{INT:status} %{INT:body_bytes_sent} %{QS:http_referer} %{QS:http_user_agent}” ] } } } output { stdout { } elasticsearch { embedded = false host = "

Here is my logstash config file:

    input {
      syslog {
        type => syslog
        port => 5544
      }
    }
    filter {
      if [program] == "nginx-access" {
        grok {
          match => [ "message" , "%{IPORHOST:remote_addr} - %{USERNAME:remote_user} \[% {HTTPDATE:time_local}\] %{QS:request} %{INT:status} %{INT:body_bytes_sent} %{QS:http_referer} %{QS:http_user_agent}” ]
        }
      }
    }
    output {
      stdout { }
      elasticsearch {
        embedded => false
        host => "localhost"
        cluster => "cluster01"
      }
      email {
        from => "[email protected]"
        match => [ "Error 504 Gateway Timeout", "status,504", "Error 404 Not Found", "status,404" ]
        subject => "%{matchName}"
        to => "[email protected]"
        via => "smtp"
        body => "Here is the event line that occured: %{@message}"
        htmlbody => "<h2>%{matchName}</h2><br/><br/><h3>Full Event</h3><br/><br/><div align='center'>%{@message}</div>"
      }
    }

I've checked line 23, which is referenced in the error, and it looks fine. I've tried taking out the filter, and everything works without changing that line. Please help
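One detail visible in the config as pasted is that the grok pattern ends with a typographic closing quote (”) instead of a plain ASCII double quote. Whether that is what the parser trips over is an assumption, not something the question confirms, but it is cheap to check. A minimal Python sketch that reports the position of any such "smart" quotes in a config text (the function name is made up for the example):

```python
# Typographic quote characters that are easy to paste in by accident.
SMART_QUOTES = {"\u201c", "\u201d", "\u2018", "\u2019"}


def find_smart_quotes(text):
    """Return (line, column, character) for every typographic quote in text.

    Lines and columns are 1-based, counted in characters.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in SMART_QUOTES:
                hits.append((lineno, col, ch))
    return hits


# Example with a shortened, made-up line in the same style as the config.
sample = 'match => [ "message", "%{QS:http_user_agent}\u201d ]'
for lineno, col, ch in find_smart_quotes(sample):
    print("line %d, column %d: %r" % (lineno, col, ch))
```

Running it over the real file would mean reading the file as UTF-8 and passing its contents to `find_smart_quotes`; an empty result rules this particular cause out.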

Best Practice - SQL 2012 & IIS in VMWare

- by Dan Ribar

We are pretty new to VMWare and looking for some thoughts on our environment. We have a VMWare cluster that has on one host:

- VM#1: MS Windows 2008 R2 Enterprise & SQL Server 2012
- VM#2: MS Windows 2008 R2 Standard & IIS

The IIS asp.net app talks directly to the SQL Server. We had a similar environment on physical servers a few months ago and just recently moved to the virtualized environment. Regarding the setup, we have not tweaked any of the VM resource parameters -- all is set as standard and all is working. What we observe is that the VMs seem to spool down and we get lags in response. Of course this isn't as fast as the old physical environment, but I am wondering:

- Is it a good idea to run the SQL server and the IIS server on the same host? They are the only two VMs on it. The host is a new Dell R620 with 192 GB of memory.
- Does it make sense to change any CPU or memory reservations when it doesn't seem like there is any contention?
- Is there a way to keep the VMs spooled up to eliminate delays?

This is a brand new squeaky clean vanilla install. What are your thoughts?

Building vs buying a server for an academic lab [closed]

- by Roy

I'm looking for advice on the classic build vs buy question. We need a new Linux server to run Matlab computations on in our (academic) lab. The Matlab Parallel Computing Toolbox licence allows up to 12 local workers, so we are aiming at a 12 core server with 4GB memory per core (48GB in total). The system will have an SSD for the OS and a RAID-5 array (4x2TB) for data.

I looked around and found a (relatively) cheap vendor, Silicon Mechanics, that offers a system to our liking (specs below) for $6732. However, buying the components from Newegg costs only $4464! The difference is $2268, which is 50% of the base cost. If buying from a company can be thought of as a sort of insurance, my premium is 50% of the base cost, which to me sounds like a lot.

Of course any downtime is bad, but the work is not "mission critical", i.e. if it takes a few days to fix it when it breaks, it's not the end of the world. If it takes weeks to months, then it's a problem. If it breaks 2-3 times in 3 years, not too bad. If it breaks every month, not good. In terms of build experience, I set up a Linux cluster in grad school (from existing computers) and I build my home PCs, but I have never built a server before.

The server components I'm thinking about:

- 1 x SUPERMICRO SYS-7046T-6F 4U Tower Server Barebone Dual LGA 1366 Intel 5520 DDR3 1333/1066/800 ($1,050)
- 12 x Kingston 4GB 240-Pin DDR3 SDRAM DDR3 1333 (PC3 10600) ECC Unbuffered Server Memory ($420)
- 2 x Intel Xeon E5645 Westmere-EP 2.4GHz LGA 1366 80W Six-Core ($1,116)
- 4 x Seagate Constellation ES 2TB 7200 RPM SATA 6.0Gb/s 3.5" ($1,040)
- 1 x SAMSUNG Internal DVD Writer Black SATA ($20)
- 1 x Intel 520 Series 2.5" 180GB SATA III MLC SSD ($300)
- 1 x LSI LSI00281 PCI-Express 2.0 x8 MD2 Low profile SATA / SAS MegaRAID SAS 9260CV-4i Controller Card ($695)
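The "insurance premium" reasoning in the question comes down to a few lines of arithmetic. The Python sketch below only restates the two totals quoted above; the three-year horizon is taken from the question's own "2-3 times in 3 years" remark and is otherwise an assumption.

```python
vendor_price = 6732   # quoted Silicon Mechanics price, USD
parts_price = 4464    # quoted total for the self-built system, USD

premium = vendor_price - parts_price   # extra paid for the vendor system
ratio = premium / parts_price          # premium relative to the parts cost

# Spread over an assumed 3 years of service life.
years = 3
premium_per_year = premium / years

print("premium: $%d (%.1f%% of parts cost)" % (premium, ratio * 100))
print("premium per year over %d years: $%.0f" % (years, premium_per_year))
```

Seen this way, the question is whether vendor assembly, testing and warranty are worth roughly that yearly amount compared with the cost of the asker's own time during an outage.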

WOL doesn't work if set to anything other than `a` but this setting makes it boot all the time

- by Elton Carvalho

I manage a small "cluster" of 4 Xeon machines with Intel boards in my lab. They are all plugged into a 5-port 3Com switch with static IP addresses like 10.0.0.x. They are all running OpenSuse 11.4 and their /home/ is served by one of the machines (node00) via NFS. They are plugged into a UPS that can keep them on for ca. 15 minutes, but there are lots of power outages due to "unscheduled maintenance" that last longer than this, so they end up being powered down without notice.

If I set the BIOS to turn them on after a power outage, the issue is that they all boot at the same time and, if node00 decides to run fsck on the /home/ partition, it does not finish booting before the others try to NFS mount their /home/. I am trying to make wake on LAN work, so I can choose to boot the NFS clients only after the server has successfully booted.

The problem is that when I run ethtool I get an output like this:

    Supports Wake-on: pumbag
    Wake-on: g

Theoretically, it is set to wake on MagicPacket(tm), according to the manual. But sending the WOL packet using wol -i 10.0.0.255 $MACADDR does not wake up the box after I shut it down with halt. The ethernet link LED blinks after I send the packet, so it appears to be getting to the machine.

However, if I set it up with ethtool -s eth1 wol bag, the machine always wakes up right after halting, even if I don't send the magic packet. This means that the device can wake up on LAN activity, but seems to be ignoring the magic packet. Setting wol ag does not wake the box with the MagicPacket.

Does setting wol a mean that it should boot on any broadcast message? How can I diagnose why the machine does not wake up with the MagicPacket even though I am sending it and the card is set up to wake on it? Thanks in advance!
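For diagnosing this, it can help to generate the magic packet independently of the wol tool and watch for it with a packet capture on another machine. A magic packet is 6 bytes of 0xFF followed by the target MAC address repeated 16 times, usually sent as a UDP broadcast. The Python sketch below builds and sends one; the broadcast address default is taken from the question, while UDP port 9 and the function names are assumptions for the example.

```python
import socket


def build_magic_packet(mac):
    """Build a Wake-on-LAN magic packet for a MAC such as '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError("MAC address must contain 12 hex digits: %r" % mac)
    mac_bytes = bytes.fromhex(hex_digits)
    # 6 x 0xFF synchronisation stream, then the MAC repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac, broadcast="10.0.0.255", port=9):
    """Send the magic packet as a UDP broadcast (port 9 is a common choice)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

Calling `send_magic_packet("00:11:22:33:44:55")` with the real MAC of the halted node, while capturing on a neighbouring node, shows whether a well-formed 102-byte payload actually reaches the segment.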

Trying to install datastax opscenter - Failed to load application: cannot import name _parse

- by gansbrest

I'm not familiar with python, maybe someone could explain what's going on here?

    ec2-user@prod-opscenter-01:~ % java -version
    java version "1.7.0_45"
    Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
    Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)
    ec2-user@prod-opscenter-01:~ % python -V
    Python 2.6.8
    ec2-user@prod-opscenter-01:~ % openssl version
    OpenSSL 1.0.1e-fips 11 Feb 2013

And now the error:

    ec2-user@prod-opscenter-01:~ % sudo /etc/init.d/opscenterd start
    Starting Cassandra cluster manager opscenterd
    Starting opscenterdUnhandled Error
    Traceback (most recent call last):
      File "/usr/lib64/python2.6/site-packages/twisted/application/app.py", line 652, in run
        runApp(config)
      File "/usr/lib64/python2.6/site-packages/twisted/scripts/twistd.py", line 23, in runApp
        _SomeApplicationRunner(config).run()
      File "/usr/lib64/python2.6/site-packages/twisted/application/app.py", line 386, in run
        self.application = self.createOrGetApplication()
      File "/usr/lib64/python2.6/site-packages/twisted/application/app.py", line 451, in createOrGetApplication
        application = getApplication(self.config, passphrase)
    --- <exception caught here> ---
      File "/usr/lib64/python2.6/site-packages/twisted/application/app.py", line 462, in getApplication
        application = service.loadApplication(filename, style, passphrase)
      File "/usr/lib64/python2.6/site-packages/twisted/application/service.py", line 405, in loadApplication
        application = sob.loadValueFromFile(filename, 'application', passphrase)
      File "/usr/lib64/python2.6/site-packages/twisted/persisted/sob.py", line 210, in loadValueFromFile
        exec fileObj in d, d
      File "bin/start_opscenter.py", line 1, in <module>
        from opscenterd import opscenterd_tap
      File "/usr/lib/python2.6/site-packages/opscenterd/opscenterd_tap.py", line 37, in <module>
      File "/usr/lib/python2.6/site-packages/opscenterd/OpsCenterdService.py", line 13, in <module>
      File "/usr/lib/python2.6/site-packages/opscenterd/ClusterServices.py", line 22, in <module>
      File "/usr/lib/python2.6/site-packages/opscenterd/WebServer.py", line 40, in <module>
      File "/usr/lib/python2.6/site-packages/opscenterd/Agents.py", line 18, in <module>
    exceptions.ImportError: cannot import name _parse
    Failed to load application: cannot import name _parse

Maybe there are open source alternatives for monitoring Cassandra that I should look at? Thanks a lot

Server 2008 R2 domain windows update strategy

- by Joost Verdaasdonk

Let me explain my question a bit. We are a small company that has now made the first move to a bigger network. For now the network consists of 5 Server 2008 R2 machines (DC, SQL, web, etc.). Everything we need is in place, but for now we cannot afford to finish the network by implementing redundant systems (secondary DC, DNS, SQL cluster, etc.). For some people this is hard to understand, but this is the current situation (and we are aware of it and will fix this when we can).

Because we want to keep our systems secure and up to date, I've made sure that all systems are updated regularly. The problem, of course, is that the number of updates Microsoft rolls out that need a system reboot seems to grow. (Maybe I'm wrong and it just feels like this.) ;-)

In our domain, servers depend on each other for services (like SQL, web, or whatever), so just rebooting a server at will is NOT a good idea! For now I update all of them at once without rebooting. After all are up to date, I bring them down in the order in which they depend on each other. After this I boot all of them in the inverse order.

I understand, of course, that if I DID have redundancy in my system, updating and rebooting would not be such a problem, because the server task could be taken over by another node, but this is something we need to add when we can.

So my question is: having read my situation above, can you suggest other update strategies or general ideas that could help me do this process in a better / faster way? Thanks for your thoughts!
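The "shut down in dependency order, boot in the inverse order" routine described above can be derived mechanically from a dependency map with a topological sort. The Python sketch below uses the standard library's `graphlib` (Python 3.9 or later); the server names and the dependencies between them are made up for illustration and are not taken from the question.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each server lists the servers it depends on.
depends_on = {
    "web": {"sql"},
    "sql": {"dc"},
    "dc": set(),
}


def boot_order(dependencies):
    """Servers ordered so that every dependency boots before its dependents."""
    return list(TopologicalSorter(dependencies).static_order())


def shutdown_order(dependencies):
    """Reverse of the boot order: dependents go down before what they rely on."""
    return list(reversed(boot_order(dependencies)))


print("boot:    ", boot_order(depends_on))      # ['dc', 'sql', 'web']
print("shutdown:", shutdown_order(depends_on))  # ['web', 'sql', 'dc']
```

`TopologicalSorter` raises `graphlib.CycleError` if two servers depend on each other, which is itself a useful thing to find out before a maintenance window.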

Home Server: storage virtualisation, what to choose?

- by Huygens

I'm looking for virtualisation solutions for storage and OS for a home server: a sort of private cloud where I manage the storage space independently of the VM space. This question focuses on storage management. (I have another question related to VM/compute instance management.) Here are my environment and wishes:

- Server: HP Proliant MicroServer with 8 GB RAM (AMD Turion dual core with AMD-V technology), with one 250GB system disk and up to 4 HDDs (2 TB) for "data".
- OS types: only Linux (perhaps a *BSD VM in the future). The Linux distribution does not matter; I'm familiar with RHEL, Fedora, Suse and Ubuntu, but any other recommendation will be fine.
- The 4 HDDs are going to be a software RAID array, probably RAID 5.
- Storage should be "virtualised/cloudified" and easy to extend: if I add a NAS on the network, I can include the NAS capacity within this storage space as one virtual disk. This can be a NAS, an external HDD or another server.
- Cluster FS, S3 style space or OpenStack block storage? Whatever is easier to manage/maintain and easy to integrate/plug into a VM/compute instance.
- I would prefer free (libre, as in free speech) and open source tools, but it does not have to be free as in free beer.

Note: the VMs I intend to run on top of this server are one dedicated to backup, one for an "owncloud/dropbox"-like service and perhaps one for a media server (hosting video and photos). I'm not sure whether traditional VMs or compute instances are the most suitable for this.

Missing drive space in Server 2003

- by Tim Brigham

I have two drives used for SQL backups which for the last week have been acting strange: the free space indicated by Windows is far off from what windirstat etc. indicate. There should only be about 60 GB of drive space used, but about 160 GB is reported. This would match the utilization if the last two backup files were still residing on disk. The SQL Server is 2000, the OS is Server 2003 x64, running on a VMware 5.0 cluster. OSSEC and McAfee show this system as clean. My current plan is to temporarily attach one of these drives to another VM for analysis. Is there anything more I should be looking at? There were a lot of pages on the net when I was looking for documentation on this issue, but I haven't found this case described. EDIT: Unfortunately even a full reboot did not clear this behavior. I also used Process Explorer to look for open file handles. No dice.

central apache log analysis of many hosts

- by Jason Antman

We have 30+ apache httpd servers, and are looking to perform analysis on the logs both for historical trending and near "real time" monitoring/alerting. I'm mainly interested in things like error rates (4xx/5xx), response time, overall request rate, etc., but it would also be very useful to pull out more compute-intensive statistics like unique client IPs and user agents per unit of time. I'm leaning towards building this as a centralized collector/server/storage, and am also considering the possibility of storing non-apache logs (i.e. general syslog, firewall logs, etc.) in the same system.

Obviously a large part of this will probably have to be custom (at least the connection between pieces and the parsing/analysis we do), but I haven't been able to find much information on people who have done stuff like this, at least at shops smaller than Google/Facebook/etc. who can throw their log data into a hundred-node compute cluster and run Map/Reduce on it. The main things I'm looking for are:

- All open source
- Some way of collecting logs from apache machines that isn't too resource-intensive, and transports them relatively quickly over the network
- Some way of storing them (NoSQL? key-value store?) on the backend, for a given amount of time (and then rolling them up into historical averages)
- In the middle of this, a way of graphing in near-real-time (probably also with some statistical analysis on it) and hopefully alerting off of those graphs.

Any suggestions/pointers/ideas, to either "products"/projects or descriptions of how other people do this, would be greatly helpful. Unfortunately, we're not exactly a new-age-y devops shop: lots of old stuff, homogeneous infrastructure, and strained boxes.
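The per-interval statistics mentioned above (error rates by status class, unique client IPs) are straightforward to compute once lines are parsed. The Python sketch below is a minimal illustration, not a collector design: it assumes the servers write the standard Common Log Format prefix (client, identity, user, timestamp, request, status, size), and the function and field names are invented for the example.

```python
import re
from collections import Counter

# Matches the Common Log Format prefix; extra "combined" fields are ignored.
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)


def summarize(lines):
    """Summarize one batch of access-log lines (e.g. one minute's worth)."""
    status_classes = Counter()
    clients = set()
    for line in lines:
        match = LOG_RE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        status_classes[match.group("status")[0] + "xx"] += 1
        clients.add(match.group("ip"))
    total = sum(status_classes.values())
    errors = status_classes["4xx"] + status_classes["5xx"]
    return {
        "total": total,
        "error_rate": errors / total if total else 0.0,
        "unique_clients": len(clients),
        "by_class": dict(status_classes),
    }
```

Feeding it batches cut by timestamp gives one small summary record per interval, which is the kind of rolled-up data that can be kept long after the raw lines are expired.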

Restoring MBR, partition table, and boot sector of memory card without data loss ("USBC")

- by Synetech

Abstract

I have a FAT32 memory card that, when inserted into a computer, causes Windows to prompt to format it. The card is definitely not supposed to be blank and has a bunch of files on it.

Symptoms

Using a hex-editor/disk-viewer, I examined the card and found that several sectors/clusters have been overwritten with something that has a signature of USBC at the start of the sector. Specifically, the master boot record (and partition table) is gone (hence Windows thinking the card is blank and needing to be formatted), as are the boot sectors (they have the USBC signature and a volume label of NO NAME and partition type of FAT32). Fortunately, it looks like both copies of the FAT are almost entirely intact (a few FAT entries at the start of a cluster here and there seem to be overwritten by USBC). The root directory is also nearly intact: I can see the volume label entry and subdirectory listings, but one sector is overwritten. (There are no more instances of USBC after the last one in FAT2.)

Hypothesis

These observations seem to indicate some sort of virus that erases a few key filesystem structures and then overwrites a few extra sectors here and there. Googling seems to corroborate the idea of a virus, except that others report a file called USBC, which does not apply here and in fact could not be possible, since there is no filesystem to even see files. I cannot find any information about a virus with these symptoms, nor a removal tool. (I can't help but wonder if it is actually due to an autorun virus prevention tool.)

Question

I can likely fix the FAT corruption, since the files are mostly contiguous chains, and maybe even the lost sector of the root directory, but does anyone know of a convenient way to restore or (re)create the MBR/partition table and boot sectors (without formatting or overwriting the data)?
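Before attempting any repair, it may be worth listing exactly which sectors carry the USBC marker, working on a raw image copy of the card rather than the card itself. The Python sketch below does that for an in-memory image; the 512-byte sector size is an assumption, and the function names are made up for the example.

```python
SECTOR_SIZE = 512  # assumed sector size of the card


def find_marked_sectors(image, marker=b"USBC"):
    """Return the numbers of all sectors whose first bytes equal the marker."""
    return [
        offset // SECTOR_SIZE
        for offset in range(0, len(image), SECTOR_SIZE)
        if image[offset:offset + len(marker)] == marker
    ]


def has_boot_signature(sector):
    """True if a sector ends with the 0x55 0xAA signature used by MBRs and FAT boot sectors."""
    return len(sector) >= SECTOR_SIZE and bytes(sector[510:512]) == b"\x55\xaa"


# Tiny synthetic image: 4 sectors, with sectors 0 and 2 overwritten.
image = bytearray(SECTOR_SIZE * 4)
image[0:4] = b"USBC"
image[2 * SECTOR_SIZE:2 * SECTOR_SIZE + 4] = b"USBC"
print(find_marked_sectors(bytes(image)))  # [0, 2]
```

On a real image the data would be read in chunks from the image file opened in binary mode; the resulting sector list shows which structures (MBR, boot sector, backup boot sector, FAT entries) need rebuilding and which can be left alone.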

VMWare use of Gratuitous ARP REPLY

- by trs80

I have an ESXi cluster that hosts several Windows Server VMs and around 30 Windows workstation VMs. Packet captures show a high number of ARP replies of the form:

- sender_ip: VM IP
- sender_mac: VM virtual MAC
- target_ip: 0.0.0.0
- target_mac: Switch interface MAC

The specific addresses aren't really a concern -- they're all legitimate and we're not having any problems with communications (most of the questions surrounding GARP and VMWare have to do with ping issues, a problem we don't have). I'm looking for an explanation of the traffic pattern in an environment that functions as expected. So the question is: why would I see a high number of unsolicited ARP replies? Is this a mechanism VMWare uses for some purpose? What is it? Is there an alternative?

EDIT: Quick diagram:

    [esxi]--[switch vlan]--[inline IDS]--[fw]--(rest of network)

The IDS is complaining about these unsolicited ARPs. Several IDS vendors trigger on ARP replies without a prior request, or on ARP replies that have a target IP of 0.0.0.0. The target MAC in these replies is the VLAN interface on the switch. Capture points:

- The IDS grabs the offending packets
- The FW can see the same ones
- A VM on the ESXi host does not see these, although there is an ARP request for a specific IP on the ESXi host that has source_ip=0.0.0.0 and source_mac=[switch vlan interface].

I can't share the captures, unfortunately. Really I'm interested in finding out if this is normal for an ESXi deployment.

Read the article
Can Octopussy use messages other than syslog style?

- by Lee Lowder

I am currently exploring different options for a centralized log server. We use both Linux (Ubuntu 10.04 / 12.04, LTS for both) and Windows, though for this specific issue only Linux is relevant. I like the interface that Octopussy has and its feature list, but I am hesitant due to a few things. One of the biggest concerns I have is that it seems to be syslog only. The end goal is to have a centralized place for our devs and admins to be able to search through the logs generated by Apache, Tomcat and 70+ web apps spread out among a cluster, for both our prod and test environments. While I did see that Octopussy has support for plugins, I haven't been able to find any sort of plugin repo or in-depth guides as to what can be done with them. Does anyone know if plugins can be used to allow Octopussy to accept non-syslog messages? Specifically, log4j-type log messages that may include multi-line stack traces and such. Also, is there a user community for this software, such as a mailing list or forum? I've been unable to locate any so far. Thank you.
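The multi-line problem mentioned above is that a log4j stack trace spans several physical lines but belongs to one logical event. A minimal sketch of regrouping such lines before forwarding them anywhere is shown below. It assumes every event starts with a timestamp like `2012-08-01 12:00:00,123`, which depends on the application's log4j pattern; the function name and sample lines are illustrative, and nothing here is specific to Octopussy.

```python
import re

# Assumption: each event begins with a "YYYY-MM-DD HH:MM:SS" timestamp.
# Adjust the pattern to match the actual log4j layout in use.
EVENT_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def group_events(lines):
    """Merge continuation lines (e.g. stack traces) into the event they follow."""
    events = []
    current = []
    for line in lines:
        line = line.rstrip("\n")
        # A new timestamp closes the event collected so far.
        if EVENT_START.match(line) and current:
            events.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        events.append("\n".join(current))
    return events


SAMPLE = [
    "2012-08-01 12:00:00,123 ERROR [app] Request failed",
    "java.lang.IllegalStateException: boom",
    "\tat com.example.Foo.bar(Foo.java:42)",
    "2012-08-01 12:00:01,456 INFO [app] Recovered",
]
```

Calling `group_events(SAMPLE)` yields two events, the first carrying its stack trace. Lines that appear before the first timestamp are kept together as a leading event rather than dropped.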

Read the article
Exceptional slowdown of robocopy copying from VM to DFS array

- by user1588867

I've got an old Windows 2003 VM (VMware) on a blade cluster of VMs, from which I'm moving a considerable number of files to our new DFS array. There are two main folders with about 1.7 million and half a million smaller files (letters, memos, and other smaller files) respectively. Total size is ~420 GB and ~100 GB. We're using the GUI version of robocopy on the server to copy the files. We had initiated a file copy about a month ago to test the process and found that it was taking around 4 hours for the large file. Now that I'm in the process of actually switching the files over, it has been taking 18-20 hours. Nothing has changed on the server side and nothing has changed in the settings of the copy (no logs, 1 retry with a wait of 1 second). Our intent is to shut off the share and force the copy over again to get all the files that have been left out of the copy due to being locked by users. I can't take a 20 hour outage to do that, though. Does anyone have any theories about what could be causing such a delay for robocopy compared to the previously shorter runs?

Read the article
DRBD as a block device for XEN VM (Centos 5.3)

- by SaberTooth

Hi all,

I have set up a DRBD resource between 2 server nodes, and everything works correctly when doing sync tests between the two. (I want to create an HA cluster using DRBD, Xen and Heartbeat.) However, when I try to create a Xen VM with CentOS as the guest operating system, I get through to the partitioning screen of the installer, but when I select a partitioning type the next screen gives me the following error:

"An error has occurred - no valid devices were found on which to create new file systems. Please check your hardware for the cause of this problem."

This is my first time attempting to create a setup like this, and searching Google does not help much. My config files for DRBD and Xen follow.

DRBD (just the section that is pertinent)

    on xennode0 {
        device /dev/drbd0;
        disk /dev/sda5;
        address X.X.X.X:7788;
        flexible-meta-disk internal;
    }
    on xennode1 {
        device /dev/drbd0;
        disk /dev/sda5;
        address X.X.X.X:7788;
        meta-disk internal;
    }

XEN

    kernel = "/boot/xeninstall/vmlinuz"
    ramdisk = "/boot/xeninstall/initrd.img"
    extra = "text"
    name = "VM"
    maxmem = 3000
    memory = 3000
    vcpus = 4
    on_poweroff = "destroy"
    on_reboot = "restart"
    on_crash = "restart"
    vfb = [ ]
    disk = [ "phy:/dev/drbd0,sda1,w", "tap:aio:/srv/xen/xenswap.img,sda2,w" ]
    vif = [ "mac=00:16:3e:11:67:ae,bridge=xenbr0" ]
    root = "/dev/sda1 ro"

Thanks in advance!

Read the article
150 TB and growing, but how to grow?

- by seandavi

My group currently has two largish storage servers, both NAS running Debian Linux. The first is an all-in-one 24-disk (SATA) server that is several years old. We have two hardware RAIDs set up on it with LVM over those. The second server is 64 disks divided over 4 enclosures, each a hardware RAID 6, connected via external SAS. We use XFS with LVM over that to create 100TB of usable storage. All of this works pretty well, but we are outgrowing these systems. Having built two such servers and still growing, we want to build something that allows us more flexibility in terms of future growth and backup options, that behaves better under disk failure (checking the larger filesystem can take a day or more), and that can stand up in a heavily concurrent environment (think small computer cluster). We do not have system administration support, so we administer all of this ourselves (we are a genomics lab). So, what we seek is a relatively low-cost storage solution with acceptable performance that will allow future growth and flexible configuration (think ZFS with different pools having different operating characteristics). We are probably outside the realm of a single NAS. If we do it ourselves, we have been thinking about a combination of ZFS (on OpenIndiana, for example) or btrfs per server, with GlusterFS running on top of that. What we are weighing that against is simply biting the bullet and investing in Isilon or 3Par storage solutions. Any suggestions or experiences are appreciated.

Read the article
