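Introduction
Zabbix is capable enough to cover most enterprise monitoring needs. With a custom template you can define the items you want to monitor, attach threshold triggers to them, and configure alerting, all in one place.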
Configuration reference
Auto-discovery
Create a template -> Discovery -> configure the rule
Network interface discovery

    Name: Network interface discovery
    Key: net.if.discovery
    Filters
      Label Macro: {#IFNAME}
      Regular expression: @Network interfaces for discovery

Disk (filesystem) discovery

    Name: Mounted filesystem discovery
    Key: vfs.fs.discovery
    Filters
      A
        Label Macro: {#FSNAME}
        Regular expression: @File name for discovery
      B
        Label Macro: {#FSTYPE}
        Regular expression: @File systems for discovery

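Discovery rules such as net.if.discovery and vfs.fs.discovery work because the agent returns a JSON list of macro/value pairs, which Zabbix filters with the regular expressions above and then uses to instantiate the item prototypes. The following is a minimal sketch of that output shape, assuming the older {"data": [...]} wrapper; the function name lld_json is hypothetical and is not part of Zabbix.

```shell
#!/bin/bash
# Hypothetical helper: print discovery-style JSON for one macro name
# and a list of values. Assumes the values contain no quotes or
# backslashes, because no JSON escaping is performed here.
lld_json() {
    local macro="$1"; shift
    local sep="" value
    printf '{"data":['
    for value in "$@"; do
        printf '%s{"{#%s}":"%s"}' "$sep" "$macro" "$value"
        sep=","
    done
    printf ']}\n'
}
```

Calling lld_json IFNAME eth0 lo prints one object per interface, each carrying the {#IFNAME} macro that the item prototypes refer to.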
Network

    Outgoing traffic:
      Name: Outgoing network traffic on $1
      Key: net.if.out[{#IFNAME}]
      Type of information: Numeric(unsigned)
      Data type: Decimal
      Units: bps
      Use custom multiplier: 8

    Incoming traffic:
      Name: Incoming network traffic on $1
      Key: net.if.in[{#IFNAME}]
      Type of information: Numeric(unsigned)
      Data type: Decimal
      Units: bps
      Use custom multiplier: 8

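The multiplier of 8 is there because net.if.in and net.if.out report byte counters, while the unit shown is bits per second. As a rough sketch of the arithmetic, assuming the item is also configured to store the per-second change of the counter (which the listing above does not show), the value works out as follows. The function name bps_from_counters is hypothetical, and counter wrap-around is ignored.

```shell
#!/bin/bash
# Hypothetical helper: bits per second from two byte-counter samples.
#   $1 = previous counter value (bytes)
#   $2 = current counter value (bytes)
#   $3 = seconds elapsed between the two samples
# Integer arithmetic only; counter resets are not handled.
bps_from_counters() {
    echo $(( ($2 - $1) * 8 / $3 ))
}
```

For example, a counter that grows from 1000 to 2000 bytes over 10 seconds corresponds to 100 bytes per second, or 800 bps.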
Disk

    Free disk space:
      Name: Free disk space on $1 (percentage)
      Key: vfs.fs.size[{#FSNAME},pfree]
      Type of information: Numeric(float)
      Units: %

    Free inodes:
      Name: Free inodes on $1 (percentage)
      Key: vfs.fs.inode[{#FSNAME},pfree]
      Type of information: Numeric(float)
      Units: %

    Total disk space:
      Name: Total disk space on $1
      Key: vfs.fs.size[{#FSNAME},total]
      Type of information: Numeric(unsigned)
      Data type: Decimal
      Units: B

    Used disk space:
      Name: Used disk space on $1
      Key: vfs.fs.size[{#FSNAME},used]
      Type of information: Numeric(unsigned)
      Data type: Decimal
      Units: B

Disk threshold trigger rules

    Name: {HOST.NAME} Free disk space is less than 5% on volume {#FSNAME}
    Expression: {Template_base_xs:vfs.fs.size[{#FSNAME},pfree].last(0)}<5

    Name: {HOST.NAME} Free disk space is less than 15% on volume {#FSNAME}
    Expression: {Template_base_xs:vfs.fs.size[{#FSNAME},pfree].last(0)}<15

    Name: {HOST.NAME} Free inodes is less than 15% on volume {#FSNAME}
    Expression: {Template_base_xs:vfs.fs.inode[{#FSNAME},pfree].last(0)}<15

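The two free-space triggers overlap: any value below 5% is also below 15%, so both expressions are true at the same time. The sketch below only illustrates that comparison logic outside Zabbix; the function name thresholds_crossed is hypothetical, and awk is used because the pfree value is a float.

```shell
#!/bin/bash
# Hypothetical helper: print which free-space thresholds (in percent)
# the given value is below, mirroring the "<5" and "<15" expressions.
thresholds_crossed() {
    awk -v p="$1" 'BEGIN {
        out = ""
        if (p < 5)  out = "5"
        if (p < 15) out = (out == "" ? "15" : out " 15")
        print out
    }'
}
```

With 4% free both thresholds are reported, with 10% only the 15% one, and with 20% neither. In Zabbix itself, a trigger dependency is one common way to keep the 15% trigger quiet while the 5% trigger is active.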
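The rules above cover auto-discovery of network interfaces and disks. Discovery is used because you often do not know in advance how many network interfaces a host has, what they are called, or how many disks are attached. The following are item configurations for common system checks that Zabbix supports out of the box.
Adding other items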
Host availability

    Name: Agent ping
    Key: agent.ping
    Type of information: Numeric(unsigned)
    Data type: Decimal
    Units:

CPU

    CPU steal time:
      Name: CPU $2 time
      Key: system.cpu.util[,steal]
      Type of information: Numeric(float)
      Units: %

    CPU user time:
      Name: CPU $2 time
      Key: system.cpu.util[,user]
      Type of information: Numeric(float)
      Units: %

    CPU softirq time:
      Name: CPU $2 time
      Key: system.cpu.util[,softirq]
      Type of information: Numeric(float)
      Units: %

    CPU system time:
      Name: CPU $2 time
      Key: system.cpu.util[,system]
      Type of information: Numeric(float)
      Units: %

    CPU nice time:
      Name: CPU $2 time
      Key: system.cpu.util[,nice]
      Type of information: Numeric(float)
      Units: %

    CPU iowait time:
      Name: CPU $2 time
      Key: system.cpu.util[,iowait]
      Type of information: Numeric(float)
      Units: %

    CPU idle time:
      Name: CPU $2 time
      Key: system.cpu.util[,idle]
      Type of information: Numeric(float)
      Units: %

    CPU interrupt time:
      Name: CPU $2 time
      Key: system.cpu.util[,interrupt]
      Type of information: Numeric(float)
      Units: %

    CPU context switches per second:
      Name: CPU context switches per second
      Key: system.cpu.switches
      Type of information: Numeric(unsigned)
      Units: sps

    CPU interrupts per second:
      Name: CPU interrupts per second
      Key: system.cpu.intr
      Type of information: Numeric(unsigned)
      Units: sps

Process count

    Name: Processes number of running
    Key: proc.num[,,run]
    Type of information: Numeric(unsigned)
    Units:

    Name: Processes total number
    Key: proc.num[]
    Type of information: Numeric(unsigned)
    Units:

Load

    Name: Processor load (1 min average)
    Key: system.cpu.load[,avg1]
    Type of information: Numeric(float)
    Units:

    Name: Processor load (5 min average)
    Key: system.cpu.load[,avg5]
    Type of information: Numeric(float)
    Units:

    Name: Processor load (15 min average)
    Key: system.cpu.load[,avg15]
    Type of information: Numeric(float)
    Units:

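Custom key: getting the value of the item to monitor
Reference script:

    #!/bin/bash
    # Port that the Redis check looks for in the netstat output.
    port="9998"

    # Memory usage as a percentage of total memory.
    Mem() {
        free -m | grep Mem | awk '{printf "%.2f\n", $3/$2*100}'
    }

    # Use% of the first /dev/vd* filesystem, without the percent sign.
    Disk() {
        df -h | grep "/dev/vd" | awk '{print $5}' | awk -F '%' '{print $1}' | head -1
    }

    # Number of sockets owned by nginx.
    Nginx() {
        sudo netstat -antplu | grep nginx | wc -l
    }

    # Number of php-fpm processes.
    Php() {
        ps auxf | grep php-fpm | grep -v grep | wc -l
    }

    # Number of sockets owned by mysql.
    Mysql() {
        sudo netstat -antplu | grep mysql | wc -l
    }

    # Number of redis sockets on the configured port.
    Redis() {
        sudo netstat -antplu | grep redis | grep "$port" | wc -l
    }

    # Run the check named by the first argument, for example: Mem
    "$1"
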
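The reference script runs whatever its first argument names, so any command passed in would be executed. One possible hardening, sketched below, is to dispatch only to known check names. The Mem and Disk bodies here are stand-ins that return fixed values so the sketch runs anywhere, and the function name run_check is hypothetical.

```shell
#!/bin/bash
# Stand-in checks returning fixed values; in practice these would be
# the real functions from the reference script.
Mem()  { echo "42.00"; }
Disk() { echo "17"; }

# Dispatch only to known check names instead of executing "$1" directly.
run_check() {
    case "$1" in
        Mem|Disk) "$1" ;;
        *) echo "unknown check: $1" >&2; return 1 ;;
    esac
}
```

To expose such a script as a custom key, the agent configuration would need a UserParameter line. Assuming a hypothetical path and key name, something like UserParameter=custom.check[*],/etc/zabbix/scripts/check.sh $1 would make custom.check[Mem] run the Mem check.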
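Next, configure a Zabbix network discovery rule, which finds machines automatically, adds them to a host group, links a template to them, and so on.
1. Configuration -> Discovery -> Create discovery rule
2. Set the rule name, the check interval, and the other fields
Note: the IP range can be an IP segment or a comma-separated list of IP addresses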
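Name
The name of the discovery rule; it must be unique
Discovery by proxy
Which component runs the rule ("No proxy" means the Zabbix server itself)
Delay
How often the discovery runs (a longer interval is better in production; a short one is used here for testing)
Checks
Supported checks are SSH, LDAP, SMTP, FTP, HTTP, HTTPS, POP, NNTP, IMAP, TCP, Telnet, Zabbix agent, SNMPv1 agent, SNMPv2 agent, SNMPv3 agent, and ICMP ping
Port
A single port, a port range such as 22-45, or a comma-separated list of ports and ranges such as 22-45,50-100,465
Device uniqueness criteria
The IP address can be used as the unique identifier of a device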
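A port specification such as 22-45,50-100,465 can cover many ports, and each one is probed on every discovered address, which is one reason to keep the discovery interval long. The sketch below counts how many ports a specification expands to. The function name count_ports is hypothetical, and the input is assumed to be well formed (no spaces, no leading zeros).

```shell
#!/bin/bash
# Hypothetical helper: count the ports covered by a comma-separated
# list of single ports and lo-hi ranges, e.g. "22-45,50-100,465".
count_ports() {
    local spec="$1" total=0 part lo hi
    local IFS=','
    for part in $spec; do
        lo="${part%-*}"   # text before the dash (whole value if no dash)
        hi="${part#*-}"   # text after the dash (whole value if no dash)
        total=$(( total + hi - lo + 1 ))
    done
    echo "$total"
}
```

For the example in the Port field, 22-45 covers 24 ports, 50-100 covers 51, and 465 is a single port, for a total of 76.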
3. Create an action for the discovered hosts
Note: select Discovery as the Event source first, then click create
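The overall process:
- Create a discovery rule (Discovery)
- Create an action (Actions)
- Add conditions (IP range, service type, discovery status)
- Add operations (add host, add to host group, link the custom template)
- Verify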
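The action conditions in the list above are combined to decide whether the operations run for a discovered host. The sketch below mimics that decision outside Zabbix. The range 192.168.1.1-254, the service "Zabbix agent", and the status "Up" are example values chosen for illustration, and both function names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: is the IP inside prefix.lo through prefix.hi?
#   $1 = IP address, $2 = first three octets, $3 = low, $4 = high
ip_in_last_octet_range() {
    local ip="$1" prefix="$2" lo="$3" hi="$4"
    [[ "$ip" == "$prefix".* ]] || return 1
    local last="${ip##*.}"
    [[ "$last" =~ ^[0-9]+$ ]] || return 1
    (( last >= lo && last <= hi ))
}

# Hypothetical condition set: all three conditions must hold.
action_matches() {
    local ip="$1" service="$2" status="$3"
    ip_in_last_octet_range "$ip" "192.168.1" 1 254 &&
        [[ "$service" == "Zabbix agent" && "$status" == "Up" ]]
}
```

A host at 192.168.1.50 that answers the Zabbix agent check and is Up matches, while the same host reported as Down, or a host in 192.168.2.x, does not.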