Zabbix Custom LLD

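Some of our production real-time jobs are built on Storm. To monitor data latency, Storm writes each log's timestamp into Redis while processing it, and Zabbix then monitors the delay from those values. Because new jobs go live frequently, configuring the monitoring items by hand becomes tedious, so the process needs to be automated.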

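When I added NIC and partition monitoring earlier, I used Zabbix's LLD (low-level discovery) feature with its built-in macros. Newer versions of Zabbix also support custom LLD. The steps are as follows: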

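1. In the template, define a discovery rule (a UserParameter key) that calls a script. The script returns JSON in the format Zabbix requires (carrying the custom macros), and the discovery rule itself must be configured correctly (for example, its filter).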

The required data format can be seen from the official documentation together with the agent log on a production host:

143085:20141127:000548.967 Requested [vfs.fs.discovery]
143085:20141127:000548.967 Sending back [{ "data":[ { "{#FSNAME}":"/", "{#FSTYPE}":"rootfs"}, { "{#FSNAME}":"/proc/sys/fs/binfmt_misc", "{#FSTYPE}":"binfmt_misc"}, { "{#FSNAME}":"/data", "{#FSTYPE}":"ext4"}]}]
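The envelope in that log line is a single JSON object with a "data" array, where each element maps macro names of the form {#NAME} to values. A minimal sketch of producing it in Python follows; the helper name build_lld_payload and the lowercase input keys are illustrative assumptions, not part of Zabbix.

```python
import json


def build_lld_payload(rows):
    """Wrap a list of {name: value} dicts in the {"data": [...]} envelope.

    build_lld_payload is a hypothetical helper; each input key is
    upper-cased and wrapped as an LLD macro name such as {#FSNAME}.
    """
    data = []
    for row in rows:
        # Each discovered entity becomes one object keyed by {#MACRO} names.
        data.append(dict(("{#%s}" % name.upper(), value)
                         for name, value in row.items()))
    return json.dumps({"data": data})


payload = build_lld_payload([
    {"fsname": "/", "fstype": "rootfs"},
    {"fsname": "/data", "fstype": "ext4"},
])
print(payload)
```

Serializing with json.dumps keeps the output valid JSON even when a value contains quotes or other special characters.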

For example, the key that returns the JSON data in production:

UserParameter=storm.delay.discovery,python2.6 /apps/sh/zabbix_scripts/storm/storm_delay_discovery.py

Then run

zabbix_get -s 127.0.0.1 -k storm.delay.discovery

to verify that the returned data is correct.
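Beyond eyeballing the zabbix_get output, the text it returns can be checked mechanically. The sketch below is one possible check, assuming the rules described in the Zabbix documentation: a top-level object with a "data" array, and macro names made of uppercase letters, digits, underscores and dots. The function name validate_lld and the sample hash and field names are hypothetical.

```python
import json
import re

# LLD macro names look like {#NAME}; allowed characters are A-Z, 0-9, _ and .
MACRO_RE = re.compile(r"^\{#[A-Z0-9_.]+\}$")


def validate_lld(text):
    """Return a list of problems found in a discovery result (empty if OK)."""
    try:
        doc = json.loads(text)
    except ValueError as exc:
        return ["not valid JSON: %s" % exc]
    if not isinstance(doc, dict) or not isinstance(doc.get("data"), list):
        return ['top level must be an object with a "data" array']
    problems = []
    for index, row in enumerate(doc["data"]):
        if not isinstance(row, dict):
            problems.append("entry %d is not an object" % index)
            continue
        for key in row:
            if not MACRO_RE.match(key):
                problems.append("entry %d has a bad macro name: %r" % (index, key))
    return problems


sample = '{"data":[{"{#STORMHASHNAME}":"stormdelay_demo","{#STORMHASHFIELD}":"topic_a"}]}'
print(validate_lld(sample))   # an empty list means the structure looks right
print(validate_lld("-1"))     # the error value printed by the discovery script
```

A check like this catches the case where the discovery script fails and prints -1, which is valid JSON but not a valid discovery result.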

The contents of storm_delay_discovery.py are as follows:

#!/usr/bin/python
import json

import redis

_hashtables = []
_alllist = []


class RedisException(Exception):
    def __init__(self, errorlog):
        self.errorlog = errorlog

    def __str__(self):
        return "error log is %s" % (self.errorlog)


def scan_one(cursor, conn):
    # Run one SCAN step, keep the storm delay hashes, and return the next cursor.
    try:
        cursor_next, keys = conn.scan(cursor)
        for line in keys:
            if (line.startswith("com-vip-storm") or line.startswith("stormdelay_")) \
                    and str(line) != "stormdelay_riskcontroll":
                _hashtables.append(line)
        return cursor_next
    except Exception as e:
        raise RedisException(str(e))


def scan_all(conn):
    # Follow the cursor until Redis returns 0, which marks the end of the keyspace.
    cursor = scan_one(0, conn)
    while int(cursor) != 0:
        cursor = scan_one(cursor, conn)


def hget_fields(conn, hashname):
    # One LLD entry per hash field. A fresh dict is built for every field;
    # reusing a single dict would make every entry show the last field.
    for field in conn.hkeys(hashname):
        _alllist.append({
            "{#STORMHASHNAME}": hashname,
            "{#STORMHASHFIELD}": field,
        })


if __name__ == '__main__':
    try:
        # host and port are redacted in the original post
        r = redis.StrictRedis(host='xxxx', port=xxx, db=0)
        scan_all(r)
        for hashtable in _hashtables:
            hget_fields(r, hashtable)
        # json.dumps produces valid JSON even when a name contains quotes
        print(json.dumps({"data": _alllist}))
    except Exception as e:
        print(-1)

2. Set up the item/graph/trigger prototypes:

Taking items as an example: define item prototypes (each also needs a key), with the macros as the key's parameters.

For example, Free inodes on {#FSNAME} (percentage) ---> vfs.fs.inode[{#FSNAME},pfree]

In this case, simply use the macros returned above in the item key:

storm_delay[hget,{#STORMHASHNAME},{#STORMHASHFIELD}]
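The post does not show the script behind the storm_delay key. As a rough sketch only: it could be exposed through a flexible user parameter (the storm_delay[*] form, with $1, $2 and $3 passed on to a script) and compute the delay from the timestamp stored in Redis. Everything below is an assumption made for illustration: the function name compute_delay, the stored value being Unix epoch seconds, the -1 return for a missing field, and the FakeRedis stand-in that replaces a real connection so the example runs without a server.

```python
import time


def compute_delay(conn, command, hashname, field, now=None):
    """Return the delay in seconds for one hash field, or -1 if it is missing.

    Assumes the stored value is a Unix timestamp in seconds (hypothetical).
    """
    if command != "hget":
        raise ValueError("unsupported command: %s" % command)
    raw = conn.hget(hashname, field)
    if raw is None:
        return -1
    if isinstance(raw, bytes):
        # Redis clients may return bytes rather than text
        raw = raw.decode("utf-8")
    now = time.time() if now is None else now
    return max(0, int(now - float(raw)))


class FakeRedis(object):
    """In-memory stand-in exposing only hget, used for this example."""

    def __init__(self, hashes):
        self.hashes = hashes

    def hget(self, name, key):
        return self.hashes.get(name, {}).get(key)


fake = FakeRedis({"stormdelay_demo": {"topic_a": "1417017900"}})
print(compute_delay(fake, "hget", "stormdelay_demo", "topic_a", now=1417017948))  # prints 48
```

Passing the connection and the current time as arguments keeps the calculation testable without a live Redis instance.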

Finally, link the template that contains the LLD rule to the host.

Combined with the screen.create/screenitem.update API, this automates both adding the monitoring and adding and updating the screens.
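To illustrate the screen side, the sketch below builds a JSON-RPC request body for screen.create that lays graphs out in a grid. It only constructs the payload and does not send it. The helper name build_screen_create, the graph IDs and the token are placeholders; the field names (hsize, vsize, screenitems, resourcetype, resourceid) follow the screen API of the older Zabbix releases that still provide screens, so check them against the API reference for the version in use.

```python
import json


def build_screen_create(name, graph_ids, columns, auth_token, request_id=1):
    """Build a screen.create request that places one graph per grid cell."""
    rows = max(1, (len(graph_ids) + columns - 1) // columns)
    items = []
    for index, graph_id in enumerate(graph_ids):
        items.append({
            "resourcetype": 0,            # 0 means the cell holds a graph
            "resourceid": str(graph_id),
            "x": index % columns,         # column of the cell
            "y": index // columns,        # row of the cell
            "colspan": 1,
            "rowspan": 1,
        })
    return {
        "jsonrpc": "2.0",
        "method": "screen.create",
        "params": {
            "name": name,
            "hsize": columns,
            "vsize": rows,
            "screenitems": items,
        },
        "auth": auth_token,               # token obtained from user.login
        "id": request_id,
    }


# Graph IDs and token below are made-up placeholders.
request = build_screen_create("storm delay", [101, 102, 103], 2, "TOKEN")
print(json.dumps(request, indent=2))
```

The resulting dictionary would be sent as the body of an HTTP POST to the frontend's api_jsonrpc.php endpoint with a JSON content type.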
