System architecture
Nginx config
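For the Nginx log configuration, see the article "基于ELK Nginx日志分析" (ELK-based Nginx log analysis) in the ELK column of the WeChat official account.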
Filebeat config
[root@elk-node1 conf.d]# egrep -v '^\s*#|^$' /etc/filebeat/filebeat.yml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/nginx/*.access.log
  tags: ["nginx.access"]
- type: log
  paths:
    - /var/log/nginx/*.error.log
  tags: ["nginx.error"]
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
setup.template.settings:
  index.number_of_shards: 3
setup.kibana:
output.logstash:
  hosts: ["192.168.99.186:6044"]
processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
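As a rough illustration of how the two inputs above split files by name, the following Python sketch approximates the path globs with `fnmatchcase`. The file names are hypothetical, and Filebeat uses its own Go-based matcher, so this is only an approximation (here, for instance, `*` may also match `/`).

```python
from fnmatch import fnmatchcase

# (glob, tag) pairs copied from the filebeat.inputs section above.
INPUTS = [
    ("/var/log/nginx/*.access.log", "nginx.access"),
    ("/var/log/nginx/*.error.log", "nginx.error"),
]


def tags_for(path):
    """Return the tags of every input whose glob matches the given path."""
    return [tag for pattern, tag in INPUTS if fnmatchcase(path, pattern)]


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    for name in ("www.access.log", "www.error.log", "access.log"):
        print(name, tags_for("/var/log/nginx/" + name))
```

Under this approximation, a file named plain `access.log` matches neither glob, so the Nginx logs are expected to be named like `<site>.access.log` and `<site>.error.log`.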
Logstash config
[root@elk-node2 conf.d]# cat nginx.conf
input {
  beats {
    port => 6044
  }
}
filter {
  if "nginx.access" in [tags] {
    json {
      source => "message"
      remove_field => "message"
    }
    date {
      match => ["timestamp", "dd/MMM/YYYY:HH:mm:ss Z"]
    }
    useragent {
      target => "agent"
      source => "http_user_agent"
    }
    geoip {
      target => "geoip"
      source => "remote_add"
      #fields => ["city_name", "country_code2", "country_name", "region_name","longitude","latitude","ip"]
      add_field => ["[geoip][coordinates]", "%{[geoip][longitude]}"]
      add_field => ["[geoip][coordinates]", "%{[geoip][latitude]}"]
    }
    mutate {
      convert => ["[geoip][coordinates]", "float"]
      add_field => ["[zabbix_key]", "nginxstatus"]
      add_field => ["[zabbix_host]", "192.168.99.186"]
      add_field => ["nginxstatus", "%{hostname}@%{server_addr}-%{status}"]
    }
  }
  else if "nginx.error" in [tags] {
    mutate {
      remove_field => ["@timestamp"]
    }
    grok {
      match => {"message" => "(?<datetime>%{YEAR}[./-]%{MONTHNUM}[./-]%{MONTHDAY}[- ]%{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER}: %{GREEDYDATA:errormessage}(?:, client: (?<real_ip>%{IP}|%{HOSTNAME}))(?:, server: %{IPORHOST:domain}?)(?:, request: %{QS:request})?(?:, upstream: (?<upstream>\"%{URI}\"|%{QS}))?(?:, host: %{QS:request_host})?(?:, referrer: \"%{URI:referrer}\")?"}
    }
    date {
      match => ["datetime", "yyyy/MM/dd HH:mm:ss"]
      target => "@timestamp"
    }
    mutate {
      remove_field => ["message"]
    }
  }
}
output {
  #stdout{codec => rubydebug}
  if "nginx.access" in [tags] {
    elasticsearch {
      index => "logstash-nginx.access-%{+YYYY.MM.dd}"
      hosts => ["192.168.99.186:9200"]
      user => "elastic"
      password => "qZXo7E"
    }
  }
  if [nginxstatus] =~ /(502|504|404|302|200|401)/ {
    zabbix {
      zabbix_host => "[zabbix_host]"
      zabbix_key => "[zabbix_key]"
      zabbix_server_host => "192.168.99.200"
      zabbix_server_port => "10051"
      zabbix_value => "nginxstatus"
    }
  }
  else if "nginx.error" in [tags] {
    elasticsearch {
      index => "nginx.error-%{+YYYY.MM.dd}"
      hosts => ["192.168.99.186:9200"]
      user => "elastic"
      password => "qZXo7E"
    }
  }
}
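For an introduction to the Logstash Zabbix configuration parameters, see the article "ELK 联动 ZABBIX 实现异常日志告警" (Linking ELK with Zabbix for abnormal-log alerting) in the ELK column of the WeChat official account.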
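To make the data flow of the Logstash pipeline above easier to follow, here is a minimal Python sketch of its two filter branches. It is not how Logstash works internally: the error-line regular expression is a simplified stand-in for the grok pattern, and the sample host name, addresses and log line in the usage part are hypothetical. The field names (`hostname`, `server_addr`, `status`, `nginxstatus`, `real_ip`, and so on) are taken from the configuration.

```python
import json
import re
from datetime import datetime

# Same alternation as the Logstash output conditional.
ALERT_STATUS = re.compile(r"(502|504|404|302|200|401)")

# Simplified approximation of the grok pattern for Nginx error lines.
ERROR_LINE = re.compile(
    r'(?P<datetime>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) '
    r'\[(?P<severity>\w+)\] '
    r'(?P<pid>\d+)#\d+: '
    r'(?P<errormessage>.*?)'
    r'(?:, client: (?P<real_ip>[^,]+))'
    r'(?:, server: (?P<domain>[^,]*))'
    r'(?:, request: "(?P<request>[^"]*)")?'
    r'(?:, upstream: "(?P<upstream>[^"]*)")?'
    r'(?:, host: "(?P<request_host>[^"]*)")?'
    r'(?:, referrer: "(?P<referrer>[^"]*)")?$'
)


def enrich_access(message):
    """Mimic the json + mutate steps of the nginx.access branch."""
    event = json.loads(message)
    event["zabbix_key"] = "nginxstatus"
    event["zabbix_host"] = "192.168.99.186"
    event["nginxstatus"] = "{hostname}@{server_addr}-{status}".format(**event)
    return event


def should_send_to_zabbix(event):
    """Mimic the `[nginxstatus] =~ /.../` output conditional."""
    return bool(ALERT_STATUS.search(event.get("nginxstatus", "")))


def parse_error(message):
    """Mimic the grok + date steps of the nginx.error branch."""
    m = ERROR_LINE.match(message)
    if m is None:
        return None
    event = {k: v for k, v in m.groupdict().items() if v is not None}
    event["@timestamp"] = datetime.strptime(event["datetime"], "%Y/%m/%d %H:%M:%S")
    return event


if __name__ == "__main__":
    # Hypothetical sample events.
    access = enrich_access('{"hostname": "web01", "server_addr": "10.0.0.5", "status": "502"}')
    print(access["nginxstatus"], should_send_to_zabbix(access))
    line = ('2023/05/01 12:00:00 [error] 1234#0: *5 connect() failed, '
            'client: 10.0.0.1, server: example.com, request: "GET / HTTP/1.1"')
    print(parse_error(line))
```

The status check is an unanchored regular expression, so digits in the host name or server address can also satisfy it; a server address ending in `.200`, for example, would match whatever the actual status is.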
Viewing the index fields in Kibana
logstash-output-zabbix config
The logstash-output-zabbix plugin filters abnormal log entries out of the data collected by Logstash and sends them to Zabbix, which then pushes the alerts.
/usr/share/logstash/bin/logstash-plugin install logstash-output-zabbix
/usr/share/logstash/bin/logstash-plugin list
/usr/share/logstash/bin/logstash-plugin list --verbose
/usr/share/logstash/bin/logstash-plugin list |grep output
/usr/share/logstash/bin/logstash-plugin update logstash-output-zabbix
Zabbix config
Item
Trigger
The count function
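Supported value types: float, int, str, text, log
Purpose: returns the number of values collected within the given evaluation period
Examples:
count(600)                  number of values in the last 10 minutes
count(600,12)               number of values equal to 12 in the last 10 minutes
count(600,12,"gt")          number of values greater than 12 in the last 10 minutes
count(#10,12,"gt")          number of values greater than 12 among the last 10 values
count(600,12,"gt",86400)    number of values greater than 12 in the 10-minute window 24 hours ago
count(600,,,86400)          number of values in the 10-minute window 24 hours ago
Syntax: count (sec|#num,<pattern>,<operator>,<time_shift>)
First parameter: evaluation period in seconds, or the number of most recent values (#num)
Second parameter: pattern to compare the values against
Third parameter: comparison operator
Fourth parameter: time shift
Supported operators
eq: equal  ne: not equal  gt: greater than  ge: greater than or equal  lt: less than  le: less than or equal  like: content match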
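The semantics described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not Zabbix code: `history` is a hypothetical list of `(timestamp, value)` pairs, `now` is passed in explicitly, and the default operator is simplified to `eq` (in Zabbix itself the default depends on the item type).

```python
import operator as op

# Operators listed above; "like" is modelled as a substring test.
_OPS = {
    "eq": op.eq, "ne": op.ne,
    "gt": op.gt, "ge": op.ge,
    "lt": op.lt, "le": op.le,
    "like": lambda value, pattern: str(pattern) in str(value),
}


def count(history, now, period, pattern=None, operator="eq", time_shift=0):
    """Count values in the evaluation window, optionally filtered by pattern.

    period is either a number of seconds or a string such as "#10",
    meaning the 10 most recent values.
    """
    end = now - time_shift
    # Only values collected at or before the (shifted) end of the window.
    older = sorted((t, v) for t, v in history if t <= end)
    if isinstance(period, str) and period.startswith("#"):
        window = older[-int(period[1:]):]
    else:
        window = [(t, v) for t, v in older if t > end - period]
    if pattern is None:
        return len(window)
    matches = _OPS[operator]
    return sum(1 for _, v in window if matches(v, pattern))


if __name__ == "__main__":
    # Hypothetical sample history: (unix timestamp, value).
    history = [(990, 15), (900, 12), (700, 20), (450, 5), (300, 13)]
    print(count(history, 1000, 600))            # all values in the last 600 s
    print(count(history, 1000, 600, 12, "gt"))  # values > 12 in the last 600 s
```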
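A normal response returns status code 200. The trigger raises an alert when more than 10 non-200 values are received within 5 minutes, and it recovers when fewer than 5 values of 200 are received within 3 minutes or when no data arrives for 5 minutes.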
install zabbix_sender
yum install zabbix-sender
Testing zabbix_sender
[root@elk-node2 conf.d]# zabbix_sender -s 192.168.99.186 -z 192.168.99.200 -k "nginxstatus" -o 1 -vv
zabbix_sender [27919]: DEBUG: answer [{"response":"success","info":"processed: 1; failed: 0; total: 1; seconds spent: 0.000074"}]
Response from "192.168.99.200:10051": "processed: 1; failed: 0; total: 1; seconds spent: 0.000074"
sent: 1; skipped: 0; total: 1
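For readers curious about what travels over the wire in the test above, the following Python sketch builds a sender request the way the Zabbix sender protocol is commonly described (a `ZBXD` header, a flags byte, a little-endian length, then a JSON body) and parses the `info` string from the response. The helper names are hypothetical and nothing is sent over the network, so treat it as an illustration rather than a replacement for `zabbix_sender`.

```python
import json
import struct


def build_sender_packet(host, key, value):
    """Build a 'sender data' request: 13-byte header followed by JSON."""
    payload = json.dumps({
        "request": "sender data",
        "data": [{"host": host, "key": key, "value": str(value)}],
    }).encode("utf-8")
    # "ZBXD" + flags 0x01 + 4-byte length + 4 reserved bytes (little-endian).
    return b"ZBXD\x01" + struct.pack("<Q", len(payload)) + payload


def parse_info(info):
    """Parse an info string such as 'processed: 1; failed: 0; total: 1; ...'."""
    fields = dict(part.split(": ", 1) for part in info.split("; "))
    return {
        "processed": int(fields["processed"]),
        "failed": int(fields["failed"]),
        "total": int(fields["total"]),
        "seconds_spent": float(fields["seconds spent"]),
    }


if __name__ == "__main__":
    # Same host, key and value as the zabbix_sender test command.
    packet = build_sender_packet("192.168.99.186", "nginxstatus", 1)
    print(packet[:13], packet[13:].decode("utf-8"))
    print(parse_info("processed: 1; failed: 0; total: 1; seconds spent: 0.000074"))
```

A `failed` count above zero usually means the host name or item key in the request does not match a trapper item configured in Zabbix, which is worth checking before debugging the Logstash side.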
Latest data
Alert events