AWS: CloudWatch unified agent - collecting metrics and logs from EC2, and an Ansible role for it

Author: | 07/06/2018

This is a follow-up to the post AWS: CloudWatch logs - collecting and monitoring logs, where logs were shipped by the old agent. Here is an example of using the new agent, which collects both instance metrics and logs.

For the agent to work, the EC2 instance needs an attached IAM role with the CloudWatchAgentServerPolicy policy; creating it is described here>>>.
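Before installing anything, it can help to confirm that a role is attached to the instance at all. A minimal sketch, assuming the instance metadata service is reachable without a session token (IMDSv1); the function name is hypothetical, and the check only shows that some role is attached, not that it contains CloudWatchAgentServerPolicy:

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the name of the IAM role attached to the
# instance and fails if no role is attached.
# Assumption: IMDSv1 is enabled, so no session token is required.
instance_role_name() {
  local role
  # -f makes curl fail on HTTP errors (for example 404 when no role is attached)
  role="$(curl -sf --max-time 2 http://169.254.169.254/latest/meta-data/iam/security-credentials/)" || return 1
  [ -n "$role" ] && echo "$role"
}

# Usage on the instance (commented out so the sketch has no side effects):
# instance_role_name || echo "no IAM role attached to this instance"
```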

The post is a short one: just an example of installation and startup.

Installing the CloudWatch unified agent

Install unzip:

[simterm]

root@ip-172-31-45-128:/home/admin# apt install unzip

[/simterm]

Download the archive with the agent:

[simterm]

root@ip-172-31-45-128:/home/admin# wget https://s3.amazonaws.com/amazoncloudwatch-agent/linux/amd64/latest/AmazonCloudWatchAgent.zip

[/simterm]

Unpack it:

[simterm]

root@ip-172-31-45-128:/home/admin# unzip AmazonCloudWatchAgent.zip 
Archive:  AmazonCloudWatchAgent.zip
  inflating: amazon-cloudwatch-agent.deb  
  inflating: detect-system.sh        
  inflating: uninstall.sh            
  inflating: manifest.json           
  inflating: install.sh              
  inflating: amazon-cloudwatch-agent.rpm

[/simterm]

Run the installer:

[simterm]

root@ip-172-31-45-128:/home/admin# bash install.sh 
Selecting previously unselected package amazon-cloudwatch-agent.
(Reading database ... 33701 files and directories currently installed.)
Preparing to unpack ./amazon-cloudwatch-agent.deb ...
Unpacking amazon-cloudwatch-agent (1.200763.0-1) ...
Setting up amazon-cloudwatch-agent (1.200763.0-1) ...

[/simterm]
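The manual steps above can be collected into one function that is safe to re-run. This is only a sketch for a Debian-based host: the function name is hypothetical, unzip is assumed to be installed already, and the "install ok installed" check is the usual dpkg-query idiom rather than anything specific to the agent.

```shell
#!/usr/bin/env bash
# Archive URL as used in the post.
CW_AGENT_URL="https://s3.amazonaws.com/amazoncloudwatch-agent/linux/amd64/latest/AmazonCloudWatchAgent.zip"

# Hypothetical helper: download, unpack and install the agent,
# skipping all of that if the package is already installed.
install_cw_agent() {
  if dpkg-query -W -f='${Status}' amazon-cloudwatch-agent 2>/dev/null \
      | grep -q "install ok installed"; then
    echo "amazon-cloudwatch-agent is already installed, skipping"
    return 0
  fi

  local workdir
  workdir="$(mktemp -d)" || return 1
  (
    # install.sh expects the packages next to it, so run it from the unpack dir
    cd "$workdir" &&
    wget -q "$CW_AGENT_URL" &&
    unzip -q AmazonCloudWatchAgent.zip &&
    bash install.sh
  )
  local status=$?
  rm -rf "$workdir"
  return "$status"
}

# Run as root on the instance (commented out so the sketch has no side effects):
# install_cw_agent
```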

Configuring the agent

After installation the agent creates a default settings file:

[simterm]

root@ip-172-31-45-128:/home/admin# ls -l /opt/aws/amazon-cloudwatch-agent/etc
total 4
-rw-r--r-- 1 root root 825 May 12 02:40 common-config.toml

[/simterm]

It is used both by the SSM agent, if one is in use, and by the CloudWatch agent, to look up AWS credentials (not needed when the EC2 instance has an IAM role) and proxy settings, if there is a proxy.

Generate the /opt/aws/amazon-cloudwatch-agent/bin/config.json settings file for the CloudWatch agent (in Ansible it will be copied from the role's files):

[simterm]

root@ip-172-31-45-128:/home/admin# /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard      
=============================================================
= Welcome to the AWS CloudWatch Agent Configuration Manager =
=============================================================
On which OS are you planning to use the agent?
1. linux
2. windows
default choice: [1]:
1
Trying to fetch the default region based on ec2 metadata...
Are you using EC2 or On-Premises hosts?
1. EC2
2. On-Premises
default choice: [1]:

Do you want to monitor any host metrics? e.g. CPU, memory, etc.
1. yes
2. no
default choice: [1]:

Do you want to monitor cpu metrics per core? Additional CloudWatch charges may apply.
1. yes
2. no
default choice: [1]:

Do you want to add ec2 dimensions (ImageId, InstanceId, InstanceType, AutoScalingGroupName) into all of your metrics if the info is available?
1. yes
2. no
default choice: [1]:

Would you like to collect your metrics at high resolution (sub-minute resolution)? This enables sub-minute resolution for all metrics, but you can customize for specific metrics in the output json file.
1. 1s
2. 10s
3. 30s
4. 60s
default choice: [4]:

Which default metrics config do you want?
1. Basic
2. Standard
3. Advanced
4. None
default choice: [1]:
3
Current config as follows:
{
        "metrics": {
                "append_dimensions": {
                        "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
                        "ImageId": "${aws:ImageId}",
                        "InstanceId": "${aws:InstanceId}",
                        "InstanceType": "${aws:InstanceType}"
                },
                "metrics_collected": {
                        "cpu": {
                                "measurement": [
                                        "cpu_usage_idle",
                                        "cpu_usage_iowait",
                                        "cpu_usage_user",
                                        "cpu_usage_system"
                                ],
...
Are you satisfied with the above config? Note: it can be manually customized after the wizard completes to add additional items.
1. yes
2. no
default choice: [1]:

Do you have any existing CloudWatch Log Agent (http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AgentReference.html) configuration file to import for migration?
1. yes
2. no
default choice: [2]:

Do you want to monitor any log files?
1. yes
2. no
default choice: [1]:

Log file path:
/var/log/syslog
Log group name:
default choice: [syslog]

Do you want to specify any additional log files to monitor?
1. yes
2. no
default choice: [1]:
2
Saved config file to /opt/aws/amazon-cloudwatch-agent/bin/config.json successfully.
Current config as follows:
{
        "logs": {
                "logs_collected": {
                        "files": {
                                "collect_list": [
                                        {
                                                "file_path": "/var/log/syslog",
                                                "log_group_name": "syslog"
                                        }
                                ]
                        }
                }
        },
        "metrics": {
                "append_dimensions": {
                        "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
                        "ImageId": "${aws:ImageId}",
                        "InstanceId": "${aws:InstanceId}",
                        "InstanceType": "${aws:InstanceType}"
                },
                "metrics_collected": {
                        "cpu": {
                                "measurement": [
                                        "cpu_usage_idle",
                                        "cpu_usage_iowait",
                                        "cpu_usage_user",
                                        "cpu_usage_system"
                                ],
...
Please check the above content of the config.
The config file is also located at /opt/aws/amazon-cloudwatch-agent/bin/config.json.
Edit it manually if needed.
Do you want to store the config in the SSM parameter store?
1. yes
2. no
default choice: [1]:
2
Program exits now.

[/simterm]

Start the agent (in Ansible it is started via systemd). The fetch-config action loads the file passed in -c, and -s starts the agent afterwards:

[simterm]

root@ip-172-31-45-128:/home/admin# /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json -s
/opt/aws/amazon-cloudwatch-agent/bin/config-downloader --output-file /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json --download-source file:/opt/aws/amazon-cloudwatch-agent/bin/config.json --mode ec2 --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml
Start configuration validation...
/opt/aws/amazon-cloudwatch-agent/bin/config-translator --input /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json --output /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml --mode ec2 --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml
Valid Json input schema.
Configuration validation first phase succeeded
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent -schematest -config /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml
Configuration validation second phase succeeded
Configuration validation succeeded
Created symlink /etc/systemd/system/multi-user.target.wants/amazon-cloudwatch-agent.service → /etc/systemd/system/amazon-cloudwatch-agent.service.

[/simterm]

Check the status:

[simterm]

root@bm-backed-app-dev:/home/admin# /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status
{
  "status": "running",
  "starttime": "2018-06-06T08:35:36+00:00",
  "version": "1.201116.0"
}

[/simterm]
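For scripted health checks the status can be tested instead of read by eye. A small sketch that relies only on the "status": "running" field seen in the output above; the function name is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the status JSON on stdin
# reports "status": "running".
agent_is_running() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"running"'
}

# Usage on the instance (commented out so the sketch has no side effects):
# /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status \
#   | agent_is_running && echo "agent is running" || echo "agent is NOT running"
```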

Check the metrics:

Logs:
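The same check can also be done from the command line. A sketch, assuming the AWS CLI is installed and configured with a region and read access to CloudWatch and CloudWatch Logs, and that the agent publishes to its default CWAgent namespace (the config in this post does not override it); the function names are hypothetical:

```shell
#!/usr/bin/env bash
# List the metrics the agent has published.
# Assumption: the default CWAgent namespace is in use.
list_agent_metrics() {
  aws cloudwatch list-metrics --namespace CWAgent
}

# List the streams in a log group; "syslog" is the group name
# chosen in the wizard earlier in the post.
list_log_streams() {
  local group="${1:-syslog}"
  aws logs describe-log-streams --log-group-name "$group"
}

# Example calls (commented out so the sketch has no side effects):
# list_agent_metrics
# list_log_streams syslog
```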

systemd

The service file is created automatically when the agent is installed:

[simterm]

root@ip-172-31-45-128:/etc/systemd/system# cat /etc/systemd/system/amazon-cloudwatch-agent.service
# Copyright 2017 Amazon.com, Inc. and its affiliates. All Rights Reserved.
#
# Licensed under the Amazon Software License (the "License").
# You may not use this file except in compliance with the License.
# A copy of the License is located at
#
#   http://aws.amazon.com/asl/
#
# or in the "license" file accompanying this file. This file is distributed
# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
# express or implied. See the License for the specific language governing
# permissions and limitations under the License.

# Location: /etc/systemd/system/amazon-cloudwatch-agent.service
# systemctl enable amazon-cloudwatch-agent
# systemctl start amazon-cloudwatch-agent
# systemctl | grep amazon-cloudwatch-agent
# https://www.freedesktop.org/software/systemd/man/systemd.unit.html

[Unit]
Description=Amazon CloudWatch Agent
After=network.target

[Service]
ExecStart=/opt/aws/amazon-cloudwatch-agent/bin/start-amazon-cloudwatch-agent
Restart=always

[Install]
WantedBy=multi-user.target

[/simterm]

It is also enabled at boot:

[simterm]

root@ip-172-31-45-128:/etc/systemd/system# systemctl list-unit-files | grep enabled
amazon-cloudwatch-agent.service        enabled  
...

[/simterm]

Ansible role

Create the role:

[simterm]

$ mkdir -p roles/cloudwatch/{tasks,templates,defaults}

[/simterm]

Create roles/cloudwatch/tasks/main.yml and describe the installation:

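# Download the archive and unpack it into /tmp on the remote host
- name: Download CloudWatch archive
  unarchive:
    src: "{{ cw_agent_s3_url }}"
    dest: /tmp
    remote_src: yes

# Run the installer from the unpacked archive
- name: Install CloudWatch agent
  command: /bin/bash /tmp/install.sh
  args:
    chdir: /tmp

# Copy the agent config from the role to the host
- name: Copy config file
  copy:
    src: templates/amazon-cloudwatch-agent-config.json
    dest: "{{ cw_config_path }}"

# Restart the agent via systemd
- name: Restart CloudWatch agent
  systemd:
    state: restarted
    name: amazon-cloudwatch-agent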

Create roles/cloudwatch/templates/amazon-cloudwatch-agent-config.json:

{
        "logs": {
                "logs_collected": {
                        "files": {
                                "collect_list": [
                                        {
                                                "file_path": "/var/log/syslog",
                                                "log_group_name": "syslog"
                                        }
                                ]
                        }
                }
        },
        "metrics": {
                "append_dimensions": {
                        "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
                        "ImageId": "${aws:ImageId}",
                        "InstanceId": "${aws:InstanceId}",
                        "InstanceType": "${aws:InstanceType}"
                },
                "metrics_collected": {
                        "cpu": {
                                "measurement": [
                                        "cpu_usage_idle",
                                        "cpu_usage_iowait",
                                        "cpu_usage_user",
                                        "cpu_usage_system"
                                ],
                                "metrics_collection_interval": 60,
                                "resources": [
                                        "*"
                                ],
                                "totalcpu": false
                        },
                        "disk": {
                                "measurement": [
                                        "used_percent",
                                        "inodes_free"
                                ],
                                "metrics_collection_interval": 60,
                                "resources": [
                                        "*"
                                ]
                        },
                        "diskio": {
                                "measurement": [
                                        "io_time",
                                        "write_bytes",
                                        "read_bytes",
                                        "writes",
                                        "reads"
                                ],
                                "metrics_collection_interval": 60,
                                "resources": [
                                        "*"
                                ]
                        },
                        "mem": {
                                "measurement": [
                                        "mem_used_percent"
                                ],
                                "metrics_collection_interval": 60
                        },
                        "netstat": {
                                "measurement": [
                                        "tcp_established",
                                        "tcp_time_wait"
                                ],
                                "metrics_collection_interval": 60
                        },
                        "swap": {
                                "measurement": [
                                        "swap_used_percent"
                                ],
                                "metrics_collection_interval": 60
                        }
                }
        }
}

The full list of metrics for EC2 is here>>>.

The CloudWatch agent configuration file is described here>>>.

Create roles/cloudwatch/defaults/main.yml:

cw_agent_s3_url: https://s3.amazonaws.com/amazoncloudwatch-agent/linux/amd64/latest/AmazonCloudWatchAgent.zip
cw_config_path: /opt/aws/amazon-cloudwatch-agent/bin/config.json

Run it (I have my own script for running Ansible here):

[simterm]

$ ./ansible_exec.sh -a -S

Tags: app
Env: backend-dev
Vault: /home/setevoy/.ssh/mobilebackend_aws_credentials.yml
RSA: /home/setevoy/Work/aws-credentials/bm-backend-dev.pem

Are you sure to proceed? [y/n] y

Installing dependencies...

 [WARNING]: - manala.logrotate (1.0.1) is already installed - use --force to change version to unspecified

 [WARNING]: - jnv.unattended-upgrades (v1.6.0) is already installed - use --force to change version to unspecified


Done.

Executing syntax-check...

playbook: backend.yml
Syntax check passed.

Skipping dry-run.

Applying roles...

PLAY [all] ****

TASK [Gathering Facts] ****
ok: [bm-mb-dev-ssh.domain.world]

TASK [cloudwatch : Download CloudWatch archive] ****
changed: [bm-mb-dev-ssh.domain.world]

TASK [cloudwatch : Install CloudWatch agent] ****
changed: [bm-mb-dev-ssh.domain.world]

TASK [cloudwatch : Copy config file] ****
changed: [bm-mb-dev-ssh.domain.world]

TASK [cloudwatch : Start CloudWatch agent] ****
changed: [bm-mb-dev-ssh.domain.world]

PLAY RECAP ****
bm-mb-dev-ssh.domain.world : ok=5    changed=4    unreachable=0    failed=0   

Provisioning done.

[/simterm]

Check:

[simterm]

root@bm-backed-app-dev:~# ps aux | grep aws
root     30165  0.7  1.1 151696 23548 ?        Ssl  17:56   0:00 /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent -pidfile /opt/aws/amazon-cloudwatch-agent/var/amazon-cloudwatch-agent.pid -config /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml

[/simterm]

Removing the agent

To remove the agent, use dpkg:

[simterm]

# dpkg -r amazon-cloudwatch-agent

[/simterm]

If needed, also delete the /opt/aws/amazon-cloudwatch-agent/ directory.

Done.

UPD: the following were used during the final setup:

Amazon CloudWatch Concepts

Metrics Collected by the CloudWatch Agent

Manually Create or Edit the CloudWatch Agent Configuration File

Common Scenarios with CloudWatch Agent