Monitoring a Fluentd container with Datadog

About

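I tried monitoring a Fluentd container from a Datadog container.

Goals

Start a Datadog container
Start a Fluentd container
Monitor the Fluentd container from the Datadog container
Start an application server and confirm that its logs accumulate in Fluentd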

Preparing the Datadog container

Datadog publishes an official container image, so once you have an API_KEY it is easy to start.

GitHub - DataDog/docker-dd-agent: Datadog Agent Dockerfile for Trusted Builds.

The API_KEY is passed in as an environment variable.

The Datadog agent ships with a large number of customizable sample config files under /etc/dd-agent/conf.d.

# ls /etc/dd-agent/conf.d/
activemq.yaml.example       docker_daemon.yaml.example   kong.yaml.example               ntp.yaml.default                statsd.yaml.example
activemq_xml.yaml.example   elastic.yaml.example         kube_dns.yaml.example           openstack.yaml.example          supervisord.yaml.example
agent_metrics.yaml.default  etcd.yaml.example            kubernetes.yaml.example         pgbouncer.yaml.example          system_core.yaml.example
apache.yaml.example         fluentd.yaml                 kubernetes_state.yaml.example   php_fpm.yaml.example            system_swap.yaml.example
auto_conf                   fluentd.yaml.example         kyototycoon.yaml.example        postfix.yaml.example            tcp_check.yaml.example
btrfs.yaml.example          gearmand.yaml.example        lighttpd.yaml.example           postgres.yaml.example           teamcity.yaml.example
cacti.yaml.example          go_expvar.yaml.example       linux_proc_extras.yaml.example  powerdns_recursor.yaml.example  tokumx.yaml.example
cassandra.yaml.example      gunicorn.yaml.example        mapreduce.yaml.example          process.yaml.example            tomcat.yaml.example
ceph.yaml.example           haproxy.yaml.example         marathon.yaml.example           rabbitmq.yaml.example           twemproxy.yaml.example
consul.yaml.example         hdfs.yaml.example            mcache.yaml.example             redisdb.yaml.example            varnish.yaml.example
couch.yaml.example          hdfs_datanode.yaml.example   mesos.yaml.example              riak.yaml.example               vsphere.yaml.example
couchbase.yaml.example      hdfs_namenode.yaml.example   mesos_master.yaml.example       riakcs.yaml.example             yarn.yaml.example
directory.yaml.example      http_check.yaml.example      mesos_slave.yaml.example        snmp.yaml.example               zk.yaml.example
disk.yaml.default           jenkins.yaml.example         mongo.yaml.example              solr.yaml.example
dns_check.yaml.example      jmx.yaml.example             mysql.yaml.example              spark.yaml.example
docker.yaml.example         kafka.yaml.example           nagios.yaml.example             sqlserver.yaml.example
docker_daemon.yaml          kafka_consumer.yaml.example  nginx.yaml.example              ssh_check.yaml.example

Of these, this post uses fluentd.yaml.

$ cat dd-agent/fluentd.yaml
init_config:

instances:
    # Every instance requires a `monitor_agent_url`
    # Optional, set `plugin_ids` to monitor a specific scope of plugins.
    -  monitor_agent_url: http://example.com:24220/api/plugins.json

Bake this file into an image (called wrapper-dd-agent here) and start it:

$ docker run -d --name dd-agent -v /var/run/docker.sock:/var/run/docker.sock:ro -v /proc/:/host/proc/:ro -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro -e API_KEY=xxx -e SD_BACKEND=docker wrapper-dd-agent

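This starts the agent on its own. In this state, however, it cannot reach Fluentd yet, so I expect the fluentd check to be reporting an error.

Preparing the Fluentd container

An official image is available for this as well.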

GitHub - fluent/fluentd-docker-image: Docker image for Fluentd

With the default configuration monitor_agent is not running, so add it to the config:

$ cat fluent/fluentd.conf
<source>
  @type monitor_agent
  bind 0.0.0.0
  port 24220
</source>

<source>
  @type  forward
  @id    input1
  @label @mainstream
  port  24224
</source>

<filter **>
  @type stdout
</filter>

<label @mainstream>
  <match docker.**>
    @type file
    @id   output_docker1
    path         /fluentd/log/docker.*.log
    symlink_path /fluentd/log/docker.log
    append       true
    time_slice_format %Y%m%d
    time_slice_wait   1m
    time_format       %Y%m%dT%H%M%S%z
  </match>
  <match **>
    @type file
    @id   output1
    path         /fluentd/log/data.*.log
    symlink_path /fluentd/log/data.log
    append       true
    time_slice_format %Y%m%d
    time_slice_wait   10m
    time_format       %Y%m%dT%H%M%S%z
  </match>
</label>

Mounting the directory that contains this file into the container replaces the image's configuration.

Start the container in this state:

$ docker run -d -p 24224:24224 -p 24220:24220 --name test -v ~/work/dd-agent/fluent:/fluentd/etc -e FLUENTD_CONF=fluentd.conf fluent/fluentd

You can then fetch the plugin information with curl from the host machine.

$ curl localhost:24220/api/plugins
plugin_id:object:2b0941885390   plugin_category:input   type:monitor_agent      output_plugin:false     retry_count:
plugin_id:input1        plugin_category:input   type:forward    output_plugin:false     retry_count:
plugin_id:object:2b094116ed68   plugin_category:filter  type:stdout     output_plugin:false     retry_count:
plugin_id:output_docker1        plugin_category:output  type:file       output_plugin:true      buffer_queue_length:0   buffer_total_queued_size:0      retry_count:0
plugin_id:output1       plugin_category:output  type:file       output_plugin:true      buffer_queue_length:0   buffer_total_queued_size:0      retry_count:0
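
The output above is made of `label:value` pairs, so it is easy to post-process. As a rough sketch, assuming awk is available on the host, the following prints only the output plugins with their retry count and buffer queue length. The function name `summarize_plugins` is made up for illustration and is not part of Fluentd or Datadog.

```shell
# Print "plugin_id<TAB>retry_count<TAB>buffer_queue_length" for output plugins.
# Reads the label:value records from stdin, one plugin per line.
# Assumption: values contain no whitespace, as in the sample output above.
summarize_plugins() {
  awk '
    {
      id = ""; retry = ""; queue = ""; is_output = 0
      for (i = 1; i <= NF; i++) {
        # Split at the first colon only: ids such as "object:2b09..." contain colons.
        sep = index($i, ":")
        if (sep == 0) continue
        key = substr($i, 1, sep - 1)
        val = substr($i, sep + 1)
        if (key == "plugin_id") id = val
        else if (key == "retry_count") retry = val
        else if (key == "buffer_queue_length") queue = val
        else if (key == "output_plugin" && val == "true") is_output = 1
      }
      if (is_output) printf "%s\t%s\t%s\n", id, retry, queue
    }'
}
```

It would be used as `curl -s localhost:24220/api/plugins | summarize_plugins`.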

Starting everything with docker-compose

I wrote a docker-compose.yml for testing, so I use that here.

$ cat docker-compose.yml
version: '3'
services:
  fluentd:
    image: fluent/fluentd
    volumes:
      - ~/work/dd-agent/fluent:/fluentd/etc
    env_file:
      - .env
  datadog:
    build:
      context: .
      dockerfile: Dockerfile.datadog
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /proc/:/host/proc/:ro
      - /sys/fs/cgroup/:/host/sys/fs/cgroup:ro
    env_file:
      - .env
    links:
      - fluentd:fluentd.local

Environment variables are passed in through the .env file. The datadog service is built from a Dockerfile, but all that Dockerfile does is bake the config file into the image.
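
Neither .env nor Dockerfile.datadog is shown in this post, so here is a sketch of what they might look like, reconstructed from the `docker run` commands earlier. The variable list, the base image name `datadog/docker-dd-agent`, and the helper name `scaffold_files` are all assumptions, not the actual files used.

```shell
# Write the two files that docker-compose.yml refers to into the given directory.
# $1: target directory, $2: Datadog API key.
# The contents are assumptions based on the docker run commands in this post.
scaffold_files() {
  dir="$1"
  api_key="$2"

  # .env is loaded by both services through env_file.
  cat > "$dir/.env" <<EOF
API_KEY=$api_key
SD_BACKEND=docker
FLUENTD_CONF=fluentd.conf
EOF

  # Dockerfile.datadog only copies the check config into the official image.
  cat > "$dir/Dockerfile.datadog" <<'EOF'
FROM datadog/docker-dd-agent
COPY dd-agent/fluentd.yaml /etc/dd-agent/conf.d/fluentd.yaml
EOF
}
```

Running `scaffold_files . xxx` next to docker-compose.yml would create both files. Keep .env out of version control, since it holds the API key.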

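The other key point is container-to-container communication. The datadog container has to reach the fluentd container, so the compose file defines links. With links in place the datadog container can reach the fluentd container as fluentd.local, so the check config is changed to use that host name.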

$ cat dd-agent/fluentd.yaml
init_config:

instances:
    # Every instance requires a `monitor_agent_url`
    # Optional, set `plugin_ids` to monitor a specific scope of plugins.
    -  monitor_agent_url: http://fluentd.local:24220/api/plugins.json

After starting everything with this setup, the datadog container began monitoring the fluentd container.
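
The comment in fluentd.yaml mentions an optional `plugin_ids` setting that narrows monitoring to specific plugins. As a sketch, the helper below generates a check config limited to the given ids. The function name `write_fluentd_check` is hypothetical, and the list layout under `plugin_ids` is my assumption about the format, so compare it with the shipped fluentd.yaml.example before relying on it.

```shell
# Emit a fluentd check config on stdout.
# $1: monitor_agent URL; remaining arguments: plugin ids to monitor (optional).
# The plugin_ids list layout is an assumption; check fluentd.yaml.example.
write_fluentd_check() {
  url="$1"
  shift
  printf 'init_config:\n\ninstances:\n'
  printf '    -  monitor_agent_url: %s\n' "$url"
  if [ "$#" -gt 0 ]; then
    printf '       plugin_ids:\n'
    for id in "$@"; do
      printf '         - %s\n' "$id"
    done
  fi
}
```

For the two file outputs defined in fluentd.conf this would be `write_fluentd_check http://fluentd.local:24220/api/plugins.json output_docker1 output1 > dd-agent/fluentd.yaml`.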

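Checking that logs actually arrive in the Fluentd container

To check that logs are really being collected, I start a few containers and set their logging driver to fluentd. The fluentd service now publishes port 24224 because the logging driver connects from the Docker daemon on the host, which sends to localhost:24224 by default.

I use a sample app I built earlier.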

$ cat docker-compose.yml
version: '3'
services:
  fluentd:
    image: fluent/fluentd
    ports:
      - "24224:24224"
    volumes:
      - ~/work/dd-agent/fluent:/fluentd/etc
    env_file:
      - .env
  datadog:
    build:
      context: .
      dockerfile: Dockerfile.datadog
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /proc/:/host/proc/:ro
      - /sys/fs/cgroup/:/host/sys/fs/cgroup:ro
    env_file:
      - .env
    logging:
      driver: fluentd
    links:
      - fluentd:fluentd.local
    depends_on:
      - fluentd
  db:
    image: postgres
    env_file: .db.env
    logging:
      driver: fluentd
    depends_on:
      - fluentd
  web:
    image: tjinjin/dockerized-rails
    command: bash -c "rm -f tmp/pids/server.pid && bundle exec rails db:setup && bundle exec rails s -p 3000 -b '0.0.0.0'"
    env_file: .db.env
    ports:
      - "3000:3000"
    logging:
      driver: fluentd
    depends_on:
      - fluentd
      - db

Start this and look at the logs that have accumulated in the fluentd container.

$ docker exec -ti ddagent_fluentd_1 sh
/ # tail -5 /fluentd/log/data.log
20170624T042554+0000    d6da01eb034a    {"log":"  \u001B[1m\u001B[36mMember Load (1.6ms)\u001B[0m  \u001B[1m\u001B[34mSELECT \"members\".* FROM \"members\"\u001B[0m","container_id":"d6da01eb034af320d1855f2df3e61b04bd632907d92300487324bf176f390b38","container_name":"/ddagent_web_1","source":"stdout"}
20170624T042554+0000    d6da01eb034a    {"container_name":"/ddagent_web_1","source":"stdout","log":"  Rendered members/index.html.erb within layouts/application (13.6ms)","container_id":"d6da01eb034af320d1855f2df3e61b04bd632907d92300487324bf176f390b38"}
20170624T042554+0000    d6da01eb034a    {"source":"stdout","log":"Completed 200 OK in 501ms (Views: 485.2ms | ActiveRecord: 3.8ms)","container_id":"d6da01eb034af320d1855f2df3e61b04bd632907d92300487324bf176f390b38","container_name":"/ddagent_web_1"}
20170624T042554+0000    d6da01eb034a    {"container_id":"d6da01eb034af320d1855f2df3e61b04bd632907d92300487324bf176f390b38","container_name":"/ddagent_web_1","source":"stdout","log":""}
20170624T042554+0000    d6da01eb034a    {"container_name":"/ddagent_web_1","source":"stdout","log":"","container_id":"d6da01eb034af320d1855f2df3e61b04bd632907d92300487324bf176f390b38"}

It is hard to read, but this confirms that the web server's logs are being written.
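
Because the raw records are hard to scan, a quick per-container count can make it easier to see which services are logging. This is a minimal sketch, assuming awk is available where the log is read. The function name `count_by_container` is made up for illustration.

```shell
# Count log records per container in a Fluentd file-output log read from stdin.
# Crude by design: it pattern-matches the JSON text instead of parsing it.
count_by_container() {
  awk '
    match($0, /"container_name":"[^"]*"/) {
      # Skip the 18-character key prefix and drop the closing quote.
      name = substr($0, RSTART + 18, RLENGTH - 19)
      count[name]++
    }
    END { for (name in count) printf "%s\t%d\n", name, count[name] }'
}
```

For example: `docker exec ddagent_fluentd_1 cat /fluentd/log/data.log | count_by_container`. The order of the output lines is not guaranteed.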

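Summary

I have been working with ECS lately, so I tried this with Compose first. ECS also seems to allow container-to-container communication when links are defined, so I will keep exploring whether to put the fluentd and datadog containers in the same task or to monitor the fluentd container some other way.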

tjinjin's blog

Personal notes, mostly about infrastructure