Loading ALB logs into MySQL with Embulk
About
I wanted to analyze ALB logs, so I tried Embulk, which looked like an easy way to do it.
Environment
The installed plugins are as follows.
$ embulk gem list
2017-07-21 09:46:57.920 +0900: Embulk v0.8.23

*** LOCAL GEMS ***

did_you_mean (default: 1.0.1)
embulk-input-s3 (0.2.11)
embulk-output-mysql (0.7.8)
jar-dependencies (default: 0.3.5)
jruby-openssl (0.9.17 java)
json (1.8.3 java)
minitest (default: 5.4.1)
net-telnet (default: 0.1.1)
power_assert (default: 0.2.3)
psych (2.0.17 java)
racc (1.4.14 java)
rake (default: 10.4.2)
rdoc (default: 4.2.0)
test-unit (default: 3.1.1)
Configuration file
# s3_to_mysql.yml.liquid
in:
  type: s3
  bucket: <s3_bucket> # modify
  path_prefix: AWSLogs/<account_id>/elasticloadbalancing/ap-northeast-1/2017/07/20/ # modify
  auth_method: session
  access_key_id: {{ env.AWS_ACCESS_KEY_ID }}
  secret_access_key: {{ env.AWS_SECRET_ACCESS_KEY }}
  session_token: {{ env.AWS_SESSION_TOKEN }}
  parser:
    charset: UTF-8
    newline: LF
    type: csv
    delimiter: ' '
    quote: ""
    trim_if_not_quoted: false
    skip_header_lines: 1
    allow_extra_columns: false
    allow_optional_columns: false
    columns:
    - {name: protocol, type: string}
    - {name: timestamp, type: string}
    - {name: elb, type: string}
    - {name: client_port, type: string}
    - {name: backend_port, type: string}
    - {name: request_processing_time, type: string}
    - {name: backend_processing_time, type: string}
    - {name: response_processing_time, type: string}
    - {name: elb_status_code, type: string}
    - {name: backend_status_code, type: string}
    - {name: received_bytes, type: string}
    - {name: send_bytes, type: string}
    - {name: request, type: string}
    - {name: user_agent, type: string}
    - {name: ssl_cipher, type: string}
    - {name: ssl_protocol, type: string}
    - {name: target_group_arn, type: string}
    - {name: trace_id, type: string}
  decoders:
  - {type: gzip}
out:
  type: mysql
  host: localhost
  user: root
  password: ""
  database: alb_log
  table: alb_log
  mode: replace
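After that, create the alb_log database in MySQL and run:
$ embulk run s3_to_mysql.yml.liquid
That's it! Everything is loaded as strings, so it may be worth changing the types of the time-related columns.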
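As a rough sketch of that type change, the entries below could replace the matching lines under in.parser.columns. I have not run this against real logs, so it rests on a few assumptions: that the ALB timestamp looks like 2017-07-20T00:00:00.123456Z (hence the format string), that the three processing-time fields are always numeric, and that the byte counts are always integers. Since the output uses mode: replace, the table should be recreated with column types derived from these settings.

```yaml
# Sketch only: swap these in for the matching entries under in.parser.columns.
# Every other column stays {type: string}, and the column order must not change.
# The timestamp format is an assumption; check it against your own log lines.
- {name: timestamp, type: timestamp, format: '%Y-%m-%dT%H:%M:%S.%NZ'}
- {name: request_processing_time, type: double}
- {name: backend_processing_time, type: double}
- {name: response_processing_time, type: double}
- {name: received_bytes, type: long}
- {name: send_bytes, type: long}
```

Running embulk preview s3_to_mysql.yml.liquid before the real run is a cheap way to confirm that the timestamp format and the numeric types parse as expected.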