Shipping Apache access logs to Elasticsearch
Elasticsearch runs in Docker containers. For the Apache side, I installed td-agent into a WordPress container I had lying around and tested with that.
- td-agent 0.12.12
- Elasticsearch 1.7.1
Installing td-agent
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-trusty-td-agent2.sh | sh
Checking on standard output
The apache2 template below ships by default, so I use it as-is.
format /^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$/
time_format %d/%b/%Y:%H:%M:%S %z
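To get a feel for what the apache2 template extracts, here is a rough Python sketch of the same pattern. This is not Fluentd's parser: the named groups are rewritten from Ruby's (?<name>...) to Python's (?P<name>...), the sample log line and the parse_line helper are made up for illustration, and the "-" to None and integer conversions are assumptions chosen to match the stdout records shown later in this post.

```python
import re
from datetime import datetime

# The apache2 template, with Ruby-style named groups rewritten for Python.
APACHE2 = re.compile(
    r'^(?P<host>[^ ]*) [^ ]* (?P<user>[^ ]*) \[(?P<time>[^\]]*)\] '
    r'"(?P<method>\S+)(?: +(?P<path>[^ ]*) +\S*)?" '
    r'(?P<code>[^ ]*) (?P<size>[^ ]*)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?$'
)
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"  # same as time_format above


def parse_line(line):
    """Hypothetical helper: return (event_time, record), or None if no match."""
    m = APACHE2.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    # The time field becomes the event timestamp, not a record field.
    event_time = datetime.strptime(rec.pop("time"), TIME_FORMAT)
    # Assumption: "-" means "no value"; code and size are integers.
    for key in ("user", "referer", "agent"):
        if rec[key] == "-":
            rec[key] = None
    rec["code"] = None if rec["code"] == "-" else int(rec["code"])
    rec["size"] = None if rec["size"] == "-" else int(rec["size"])
    return event_time, rec


# Made-up sample line in Apache combined log format.
SAMPLE = (
    '192.168.1.1 - - [06/Oct/2015:22:25:17 +0900] '
    '"POST /wp-admin/admin-ajax.php HTTP/1.1" 200 580 '
    '"http://192.168.1.10/wp-admin/index.php" "Mozilla/5.0"'
)

if __name__ == "__main__":
    ts, record = parse_line(SAMPLE)
    print(ts.isoformat(), record)
```

The timestamp comes from the log line itself rather than from the time of ingestion, which is why @timestamp ends up matching the request time later on.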
/etc/td-agent/td-agent.conf
<source>
  type tail
  format apache2
  path /var/log/apache2/access.log
  pos_file /var/log/td-agent/apache_access.pos
  tag apache.access
</source>

<match *.**>
  type copy
  <store>
    type stdout
  </store>
</match>
Start td-agent
/etc/init.d/td-agent start
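Because everything in the Docker container runs as root, the Apache log is root-owned, and td-agent, running as its own user, can't read it...
2015-10-04 17:16:11 +0000 [error]: Permission denied @ rb_sysopen - /var/log/apache2/access.log
2015-10-04 17:16:11 +0000 [error]: suppressed same stacktrace
Quick-and-dirty fix: run td-agent as root too.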
/etc/init.d/td-agent
:
TD_AGENT_USER=root
TD_AGENT_GROUP=root
:
Here's what came out:
2015-10-06 22:25:17 +0900 raw.apache.access: {"host":"192.168.1.1","user":null,"method":"POST","path":"/wp-admin/admin-ajax.php","code":200,"size":580,"referer":"http://192.168.1.10/wp-admin/index.php","agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36"}
2015-10-06 22:25:35 +0900 raw.apache.access: {"host":"192.168.1.1","user":null,"method":"GET","path":"/wp-admin/index.php","code":200,"size":14048,"referer":"http://192.168.1.10/wp-admin/index.php","agent":"Mozilla/5.0 (iPhone; CPU iPhone OS 5_0 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9A334 Safari/7534.48.3"}
2015-10-06 22:25:55 +0900 raw.apache.access: {"host":"192.168.1.1","user":null,"method":"GET","path":"/wp-admin/index.php","code":200,"size":14046,"referer":"http://192.168.1.10/wp-admin/index.php","agent":"Mozilla/5.0 (iPad; CPU OS 5_0 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9A334 Safari/7534.48.3"}
Parsing the user agent
tagomoris/fluent-plugin-woothee · GitHub
# td-agent-gem install fluent-plugin-woothee
WARN: Unresolved specs during Gem::Specification.reset:
      json (>= 1.4.3)
WARN: Clearing out unresolved specs.
Please report a bug if this causes problems.
ERROR: While executing gem ... (Gem::RemoteFetcher::FetchError)
    Errno::ECONNREFUSED: Connection refused - connect(2) for "your-dns-needs-immediate-attention.dev" port 443 (https://your-dns-needs-immediate-attention.dev/quick/Marshal.4.8/woothee-1.2.0.gemspec.rz)
Got a weird error. Apparently it's because the containers are on a .dev domain.
your-dns-needs-immediate-attention | Triple-networks
# echo "search home.local" >> /etc/resolv.conf
/etc/td-agent/td-agent.conf
<source>
  type tail
  format apache2
  path /var/log/apache2/access.log
  pos_file /var/log/td-agent/apache_access.pos
  tag raw.apache.access
</source>

<match raw.**>
  type woothee
  key_name agent
  remove_prefix raw
  add_prefix parsed
  merge_agent_info yes
  out_key_name agent_name
  out_key_category agent_category
  out_key_os agent_os
  out_key_os_version agent_os_version
  out_key_version agent_version
  out_key_vendor agent_vendor
</match>

<match *.**>
  type copy
  <store>
    type stdout
  </store>
</match>
The result:
2015-10-06 23:17:10 +0900 parsed.apache.access: {"host":"192.168.1.1","user":null,"method":"GET","path":"/wp-admin/index.php","code":200,"size":14014,"referer":"http://192.168.1.10/wp-admin/plugins.php","agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36","agent_name":"Chrome","agent_category":"pc","agent_os":"Mac OSX","agent_os_version":"10.10.5","agent_version":"45.0.2454.101","agent_vendor":"Google"}
2015-10-06 23:18:17 +0900 parsed.apache.access: {"host":"192.168.1.1","user":null,"method":"GET","path":"/wp-admin/index.php","code":200,"size":14051,"referer":"http://192.168.1.10/wp-admin/index.php","agent":"Mozilla/5.0 (iPhone; CPU iPhone OS 5_0 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9A334 Safari/7534.48.3","agent_name":"Safari","agent_category":"smartphone","agent_os":"iPhone","agent_os_version":"5.0","agent_version":"5.1","agent_vendor":"Apple"}
2015-10-06 23:19:11 +0900 parsed.apache.access: {"host":"192.168.1.1","user":null,"method":"GET","path":"/wp-admin/index.php","code":200,"size":14049,"referer":"http://192.168.1.10/wp-admin/index.php","agent":"Mozilla/5.0 (Linux; U; Android 4.0.4; en-gb; GT-I9300 Build/IMM76D) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30","agent_name":"Safari","agent_category":"smartphone","agent_os":"Android","agent_os_version":"4.0.4","agent_version":"4.0","agent_vendor":"Apple"}
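Beyond that, filter_categories and drop_categories let you output only, or discard, records in specific categories.
To drop crawler requests efficiently but roughly, use 'woothee_fast_crawler_filter'. To drop them thoroughly, use 'woothee' together with 'drop_categories crawler'.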
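As a way to picture what those two options do, here is a small Python sketch of the keep/discard logic applied to already-parsed records. It only models the idea and is not the plugin's code: the function name apply_category_rules, its parameters, and the sample records are all hypothetical, and it assumes the category has already been written to agent_category as in the config above.

```python
def apply_category_rules(records, filter_categories=None, drop_categories=None,
                         key="agent_category"):
    """Hypothetical model of category-based filtering.

    filter_categories: if given, keep only records whose category is listed.
    drop_categories:   if given, discard records whose category is listed.
    """
    kept = []
    for record in records:
        category = record.get(key)
        if filter_categories is not None and category not in filter_categories:
            continue  # not one of the categories we want to output
        if drop_categories is not None and category in drop_categories:
            continue  # explicitly discarded category
        kept.append(record)
    return kept


# Made-up records; only the fields needed for the illustration.
RECORDS = [
    {"path": "/", "agent_name": "Chrome", "agent_category": "pc"},
    {"path": "/", "agent_name": "Safari", "agent_category": "smartphone"},
    {"path": "/robots.txt", "agent_name": "Googlebot", "agent_category": "crawler"},
]

if __name__ == "__main__":
    # Roughly what 'drop_categories crawler' is meant to achieve.
    for r in apply_category_rules(RECORDS, drop_categories=["crawler"]):
        print(r)
```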
Trying out geoip too
y-ken/fluent-plugin-geoip · GitHub
# apt-get install build-essential
# apt-get install libgeoip-dev
# td-agent-gem install fluent-plugin-geoip
The free database bundled by default resolves to country level (latitude and longitude included).
/etc/td-agent/td-agent.conf
<source>
  type tail
  format apache2
  path /var/log/apache2/access.log
  pos_file /var/log/td-agent/apache_access.pos
  tag raw.apache.access
</source>

<match raw.**>
  type woothee
  key_name agent
  remove_prefix raw
  add_prefix ua_parsed
  merge_agent_info yes
  out_key_name agent_name
  out_key_category agent_category
  out_key_os agent_os
  out_key_os_version agent_os_version
  out_key_version agent_version
  out_key_vendor agent_vendor
</match>

<match ua_parsed.**>
  type geoip

  # Specify one or more geoip lookup field which has ip address (default: host)
  # in the case of accessing nested value, delimit keys by dot like 'host.ip'.
  geoip_lookup_key host

  # Specify optional geoip database (using bundled GeoLiteCity database by default)
  #geoip_database "/path/to/your/GeoIPCity.dat"
  #enable_key_country_code geoip_country

  # Set adding field with placeholder (more than one settings are required.)
  <record>
    #city ${city["host"]}
    geoip_latitude ${latitude["host"]}
    geoip_longitude ${longitude["host"]}
    geoip_country_code3 ${country_code3["host"]}
    geoip_country ${country_code["host"]}
    country_name ${country_name["host"]}
    #dma ${dma_code["host"]}
    #area ${area_code["host"]}
    #region ${region["host"]}
    geoip_location_properties '{ "lat" : ${latitude["host"]}, "lon" : ${longitude["host"]} }'
    geoip_location_string ${latitude["host"]},${longitude["host"]}
    geoip_location_array '[${longitude["host"]},${latitude["host"]}]'
  </record>

  # Settings for tag
  remove_tag_prefix ua_parsed.
  tag parsed.${tag}

  # To avoid get stacktrace error with `[null, null]` array for elasticsearch.
  skip_adding_null_record true

  # Set log_level for fluentd-v0.10.43 or earlier (default: warn)
  log_level info

  # Set buffering time (default: 0s)
  flush_interval 1s
</match>

<match *.**>
  type copy
  <store>
    type stdout
  </store>
</match>
Any of the three geoip_location formats works. I fed in some arbitrary IPs to check.
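The three geoip_location_* fields in the config are the same point written three ways, and the ordering is easy to get wrong: the string form is "lat,lon" while the array form is [lon, lat]. A minimal Python sketch of that, where geo_point_forms is a hypothetical helper name and the coordinates are the ones the lookup returned for 1.0.0.0:

```python
import json


def geo_point_forms(lat, lon):
    """Hypothetical helper: one coordinate pair in the three shapes used above."""
    return {
        # Object form: explicit keys, so order cannot be confused.
        "geoip_location_properties": {"lat": lat, "lon": lon},
        # String form: latitude first, then longitude.
        "geoip_location_string": f"{lat},{lon}",
        # Array form: longitude first, then latitude (GeoJSON order).
        "geoip_location_array": [lon, lat],
    }


if __name__ == "__main__":
    print(json.dumps(geo_point_forms(-27.0, 133.0), indent=2))
```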
2015-10-07 01:35:08 +0900 geoip.apache.access: {"host":"1.0.0.0","user":null,"method":"POST","path":"/wp-admin/admin-ajax.php","code":200,"size":580,"referer":"http://192.168.1.10/wp-admin/index.php","agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36","agent_name":"Chrome","agent_category":"pc","agent_os":"Mac OSX","agent_os_version":"10.10.5","agent_version":"45.0.2454.101","agent_vendor":"Google","geoip_latitude":-27.0,"geoip_longitude":133.0,"geoip_country_code3":"AUS","geoip_country":"AU","country_name":"Australia","geoip_location_properties":{"lat":-27.0,"lon":133.0},"geoip_location_string":"-27.0,133.0","geoip_location_array":[133.0,-27.0]}
2015-10-07 01:36:08 +0900 geoip.apache.access: {"host":"128.0.0.0","user":null,"method":"POST","path":"/wp-admin/admin-ajax.php","code":200,"size":580,"referer":"http://192.168.1.10/wp-admin/index.php","agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36","agent_name":"Chrome","agent_category":"pc","agent_os":"Mac OSX","agent_os_version":"10.10.5","agent_version":"45.0.2454.101","agent_vendor":"Google","geoip_latitude":46.0,"geoip_longitude":25.0,"geoip_country_code3":"ROU","geoip_country":"RO","country_name":"Romania","geoip_location_properties":{"lat":46.0,"lon":25.0},"geoip_location_string":"46.0,25.0","geoip_location_array":[25.0,46.0]}
2015-10-07 01:37:08 +0900 geoip.apache.access: {"host":"114.170.237.217","user":null,"method":"POST","path":"/wp-admin/admin-ajax.php","code":200,"size":580,"referer":"http://192.168.1.10/wp-admin/index.php","agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36","agent_name":"Chrome","agent_category":"pc","agent_os":"Mac OSX","agent_os_version":"10.10.5","agent_version":"45.0.2454.101","agent_vendor":"Google","geoip_latitude":35.689998626708984,"geoip_longitude":139.69000244140625,"geoip_country_code3":"JPN","geoip_country":"JP","country_name":"Japan","geoip_location_properties":{"lat":35.689998626708984,"lon":139.69000244140625},"geoip_location_string":"35.689998626708984,139.69000244140625","geoip_location_array":[139.69000244140625,35.689998626708984]}
Shipping to Elasticsearch
uken/fluent-plugin-elasticsearch · GitHub
# td-agent-gem install fluent-plugin-elasticsearch
/etc/td-agent/td-agent.conf
:
<match parsed.**>
  type elasticsearch
  hosts es1.containers.dev:9200,es2.containers.dev:9200
  type_name access
  logstash_format true
  logstash_prefix apache_log_wordpress
  logstash_dateformat %Y.%m
  flush_interval 10s
</match>
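With logstash_format enabled, the index name is put together from logstash_prefix and the event time formatted with logstash_dateformat, so this config yields one index per month. A rough Python sketch of that naming, under the assumption that the date is taken from the event time converted to UTC (which I believe is the plugin's default behaviour, controlled by utc_index); logstash_index is a hypothetical helper, not part of the plugin:

```python
from datetime import datetime, timezone


def logstash_index(prefix, dateformat, event_time, utc_index=True):
    """Hypothetical helper: build '<prefix>-<formatted date>'.

    utc_index=True is an assumption about the plugin default: the date part
    is derived from the event time in UTC, not in the log's own time zone.
    """
    t = event_time.astimezone(timezone.utc) if utc_index else event_time
    return f"{prefix}-{t.strftime(dateformat)}"


if __name__ == "__main__":
    # Event time taken from one of the records in this post.
    when = datetime.fromisoformat("2015-10-07T01:35:08+09:00")
    print(logstash_index("apache_log_wordpress", "%Y.%m", when))

    # Around a month boundary the UTC conversion can change the index.
    edge = datetime.fromisoformat("2015-10-01T01:00:00+09:00")
    print(logstash_index("apache_log_wordpress", "%Y.%m", edge))
    print(logstash_index("apache_log_wordpress", "%Y.%m", edge, utc_index=False))
```

A monthly index keeps the index count low for a small test setup like this one. A daily format such as %Y.%m.%d would make it easier to drop old data.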
Left as-is, the geoip latitude/longitude fields don't get mapped as geo_point, so I map the type explicitly with an index template.
curl -XPUT 'es1.containers.dev:9200/_template/apache_log/?pretty' -d '
{
  "template": "apache_log*",
  "mappings": {
    "access": {
      "properties": {
        "geoip_location_properties": { "type": "geo_point" },
        "geoip_location_string": { "type": "geo_point" },
        "geoip_location_array": { "type": "geo_point" }
      }
    }
  }
}
'
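The template only takes effect for indices whose name matches its "template" pattern, and only when the index is created, so it has to be registered before the first log arrives. Here is a small Python sketch that builds the same body and checks the pattern against an index name. The helper template_applies is hypothetical, and fnmatchcase is used as an approximation of Elasticsearch's simple "*" wildcard matching, not as its actual implementation.

```python
import json
from fnmatch import fnmatchcase

GEO_FIELDS = (
    "geoip_location_properties",
    "geoip_location_string",
    "geoip_location_array",
)

# Same body as the curl command above, built programmatically.
TEMPLATE = {
    "template": "apache_log*",
    "mappings": {
        "access": {
            "properties": {name: {"type": "geo_point"} for name in GEO_FIELDS}
        }
    },
}


def template_applies(index_name, template_body):
    """Hypothetical check: does the template's pattern cover this index name?

    Approximation: Elasticsearch treats '*' as a wildcard; fnmatchcase also
    gives '?' and '[...]' special meaning, which is ignored here.
    """
    return fnmatchcase(index_name, template_body["template"])


if __name__ == "__main__":
    print(json.dumps(TEMPLATE, indent=2))
    print(template_applies("apache_log_wordpress-2015.10", TEMPLATE))
```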
After feeding in some logs, a document looks like this:
{
  "_index": "apache_log_wordpress-2015.10",
  "_type": "access",
  "_id": "AVBOD5OpAEC4whOefbLc",
  "_score": 1,
  "_source": {
    "host": "1.0.0.0",
    "user": null,
    "method": "POST",
    "path": "/wp-admin/admin-ajax.php",
    "code": 200,
    "size": 580,
    "referer": "http://192.168.1.10/wp-admin/index.php",
    "agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36",
    "agent_name": "Chrome",
    "agent_category": "pc",
    "agent_os": "Mac OSX",
    "agent_os_version": "10.10.5",
    "agent_version": "45.0.2454.101",
    "agent_vendor": "Google",
    "geoip_latitude": -27,
    "geoip_longitude": 133,
    "geoip_country_code3": "AUS",
    "geoip_country": "AU",
    "country_name": "Australia",
    "geoip_location_properties": {
      "lat": -27,
      "lon": 133
    },
    "geoip_location_string": "-27.0,133.0",
    "geoip_location_array": [133, -27],
    "@timestamp": "2015-10-07T01:35:08+09:00"
  }
}
Checking the mapping as well:
# curl -XGET 'es1.containers.dev:9200/apache_log_wordpress-2015.10/_mapping/?pretty'
{
  "apache_log_wordpress-2015.10" : {
    "mappings" : {
      "access" : {
        "properties" : {
          "@timestamp" : { "type" : "date", "format" : "dateOptionalTime" },
          "agent" : { "type" : "string" },
          "agent_category" : { "type" : "string" },
          "agent_name" : { "type" : "string" },
          "agent_os" : { "type" : "string" },
          "agent_os_version" : { "type" : "string" },
          "agent_vendor" : { "type" : "string" },
          "agent_version" : { "type" : "string" },
          "code" : { "type" : "long" },
          "country_name" : { "type" : "string" },
          "geoip_country" : { "type" : "string" },
          "geoip_country_code3" : { "type" : "string" },
          "geoip_latitude" : { "type" : "double" },
          "geoip_location_array" : { "type" : "geo_point" },
          "geoip_location_properties" : { "type" : "geo_point" },
          "geoip_location_string" : { "type" : "geo_point" },
          "geoip_longitude" : { "type" : "double" },
          "host" : { "type" : "string" },
          "method" : { "type" : "string" },
          "path" : { "type" : "string" },
          "referer" : { "type" : "string" },
          "size" : { "type" : "long" }
        }
      }
    }
  }
}
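Wrapping up
A quick look in Kibana shows charts and maps working without problems, and @timestamp is correctly set to the request time from the log.
That'll do as an introduction for now.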