I'm setting a company up on Amazon Web Services: defining our IaaS standards, solutions, and a standardized implementation, while also acting as the sysadmin for individual systems, application environments, and day-to-day uptime.
One of my biggest problems is tracking the various system and application logs, and logging, monitoring, and archiving system metrics such as memory and CPU usage in a centralized fashion, e.g. with something like Nagios + Urchin.
The BIGGEST impediment to my endeavors is the following:
The company application is deployed as a Java *.war file uploaded to an Elastic Beanstalk application environment. The environment is load balanced and auto-scales between 3 (min) and 10 (max) servers, and the EC2 instances that run the application are launched and terminated ad hoc.
That is to say, I can't monitor individual EC2 instances for very long, because so many are terminated and then auto-provisioned on the fly. I'd constantly have to "monitor what I'm monitoring" and keep adding and removing EC2 instance addresses from my monitoring lists.
Is there a way to use monitoring tools like Zabbix or Nagios to monitor the Elastic Beanstalk environment, so that new EC2 instances are added to the monitoring list automatically and terminated or failed instances are removed from it automatically?
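To make the behavior I'm after concrete, here is a minimal sketch in Python of the reconciliation step I would otherwise be doing by hand. The function name reconcile_hosts and the instance IDs are made up for illustration. I'm assuming the set of running instances would come from the AWS API and that the add/remove steps would call whatever interface the monitoring tool exposes; both of those are left out here.

```python
def reconcile_hosts(monitored, running):
    """Compare the hosts currently monitored with the instances currently running.

    Returns (to_add, to_remove): instances that are running but not yet
    monitored, and monitored hosts whose instance no longer exists.
    Both lists are sorted so the result is deterministic.
    """
    monitored = set(monitored)
    running = set(running)
    to_add = sorted(running - monitored)
    to_remove = sorted(monitored - running)
    return to_add, to_remove


if __name__ == "__main__":
    # Hypothetical instance IDs, for illustration only.
    monitored = {"i-aaa", "i-bbb", "i-ccc"}
    running = {"i-bbb", "i-ccc", "i-ddd", "i-eee"}

    to_add, to_remove = reconcile_hosts(monitored, running)
    print("add to monitoring:", to_add)
    print("remove from monitoring:", to_remove)
```

What I'm looking for is a tool, or a supported way of wiring one up, that runs the equivalent of this continuously, so the monitoring list follows the auto-scaling group without me touching it.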
Furthermore, is there anything I can do with Graylog to get similar results for my application logs, aggregating them from multiple EC2 instances into ONE consolidated set of logs/events? If not Graylog, is there ANYTHING LIKE Graylog that can automatically detect which EC2 instances are being added to or removed from the environment, and collect their logs automatically?
Any and all advice or direction is appreciated.
Thanks much, and cheers!!