[Didi's Open-Source Operations Monitoring System] Deploying Nightingale V5 in Practice

2022-03-31 20:44:34

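Nightingale: Didi's open-source operations monitoring system

Nightingale is a new-generation monitoring system developed in China. It supports both cloud-native environments and traditional physical-machine and virtual-machine environments. According to the project, it can be set up in 10 minutes and learned in about an hour, and it has been validated against massive data volumes in Didi's production environment. Its goal is to become a benchmark among Chinese-developed monitoring systems.

The new Nightingale released v1 on 2020-03-20 and is now at v5.0. Starting with this version, it integrates with the Prometheus, VictoriaMetrics, Grafana, and Telegraf ecosystems, with the aim of becoming the most usable open-source operations monitoring system in China.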

This article was put together with reference to the following links:

https://n9e.gitee.io/quickstart/standalone/
https://n9e.gitee.io/quickstart/telegraf/
https://blog.csdn.net/smallbird108/article/details/122497200

Preparing the installation packages for the required components

1. https://downloads.mysql.com/archives/community/
2. https://github.com/prometheus/prometheus/releases/download/v2.33.1/prometheus-2.33.1.linux-amd64.tar.gz
3. https://dl.influxdata.com/telegraf/releases/telegraf-1.21.3-1.x86_64.rpm
4. https://github.com/n9e/fe-v5/releases
   n9e-5.3.3.tar.gz

I. Install MySQL

rpm -ivh mysql-community-common-5.7.36-1.el7.x86_64.rpm
rpm -ivh mysql-community-libs-5.7.36-1.el7.x86_64.rpm
rpm -ivh mysql-community-client-5.7.36-1.el7.x86_64.rpm
rpm -ivh mysql-community-server-5.7.36-1.el7.x86_64.rpm

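systemctl start mysqld
# Confirm that mysqld is listening on port 3306
netstat -anp | grep 3306
systemctl enable mysqld
# Look up the temporary root password generated on first start
grep 'temporary password' /var/log/mysqld.log
# Log in with the temporary password (mysql -uroot -p), then change it at the mysql> prompt.
# Note: the grant below allows root to log in from any host; restrict it outside a test setup.
set password for root@localhost=password('MySQL_2022');
grant all privileges on *.* to root@'%' identified by 'MySQL_2022';
flush privileges;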
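To avoid copying the temporary password out of the log by hand, the grep step above can be wrapped in a small helper. This is only a minimal sketch: it assumes the MySQL 5.7 log format, in which the password is the last whitespace-separated field on the "temporary password" line, and the function name `mysql_temp_password` is made up for illustration.

```shell
# Print the most recent temporary root password recorded in a mysqld log.
# Assumption: the password is the last field of the matching line, as in
# "... A temporary password is generated for root@localhost: <password>".
mysql_temp_password() (
  log_file="${1:-/var/log/mysqld.log}"
  # Keep only the newest matching line, then print its last field.
  grep 'temporary password' "$log_file" | tail -n 1 | awk '{print $NF}'
)
```

Calling `mysql_temp_password` with no argument reads /var/log/mysqld.log, the same file used in the steps above.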

II. Install Prometheus

mkdir -p /opt/prometheus
tar xf prometheus-2.33.1.linux-amd64.tar.gz
cp -far prometheus-2.33.1.linux-amd64/*  /opt/prometheus/
cd /opt/prometheus
chown -R root:root *

# Create the systemd service unit for Prometheus
cat <<EOF >/etc/systemd/system/prometheus.service
[Unit]
Description="prometheus"
Documentation=https://prometheus.io/
After=network.target

[Service]
Type=simple

ExecStart=/opt/prometheus/prometheus  --config.file=/opt/prometheus/prometheus.yml --storage.tsdb.path=/opt/prometheus/data --web.enable-lifecycle --enable-feature=remote-write-receiver --query.lookback-delta=2m 

Restart=on-failure
SuccessExitStatus=0
LimitNOFILE=65536
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=prometheus


[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable prometheus
systemctl restart prometheus
systemctl status prometheus

Note that Prometheus must be started with --enable-feature=remote-write-receiver, so that it accepts the data n9e-server pushes to it via remote write.
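Because a missing flag is easy to overlook, a quick check against the unit file can help. The following is a rough sketch, not part of the original procedure; the function name `unit_has_flag` is hypothetical, and it does a plain substring match on the ExecStart line.

```shell
# Succeed if the ExecStart line of a systemd unit file contains the given flag.
# Note: this is a fixed-string substring match, so a longer flag that merely
# starts with the same text would also match.
unit_has_flag() (
  unit_file="$1"
  flag="$2"
  # "--" stops option parsing so a flag beginning with "--" is read as the pattern.
  grep '^ExecStart=' "$unit_file" | grep -q -F -- "$flag"
)

# Example (paths as used in this article):
# unit_has_flag /etc/systemd/system/prometheus.service \
#   --enable-feature=remote-write-receiver && echo "remote write receiver enabled"
```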

III. Install Redis

Setting a password for Redis is recommended.

curl -o /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
yum install -y redis
systemctl enable redis
vim /etc/redis.conf    # set a password here (the requirepass directive)
systemctl restart redis
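For a scripted install, the manual edit of redis.conf could be replaced by something like the sketch below. It is illustrative only: the function name `set_redis_password` and the example password are hypothetical, GNU sed (as shipped with CentOS 7) is assumed, and the password must not contain `|`, `&`, or backslashes because it is spliced into a sed expression.

```shell
# Set (or add) the requirepass directive in a Redis config file.
# Assumptions: GNU sed; the password contains no '|', '&' or backslash.
set_redis_password() (
  conf="$1"
  pass="$2"
  if grep -Eq '^[[:space:]]*#?[[:space:]]*requirepass[[:space:]]' "$conf"; then
    # Replace an existing directive, including the commented-out sample line.
    sed -i -E "s|^[[:space:]]*#?[[:space:]]*requirepass[[:space:]].*|requirepass $pass|" "$conf"
  else
    # No directive present yet: append one.
    printf 'requirepass %s\n' "$pass" >> "$conf"
  fi
)

# Example: set_redis_password /etc/redis.conf 'Redis_2022' && systemctl restart redis
```

Whatever password is chosen here must also be entered in the n9e configuration files when they are edited later.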

IV. Deploy n9e

mkdir /usr/local/n9e
tar -zxvf n9e-5.3.3.tar.gz -C /usr/local/n9e/

# In both config files, update the MySQL and Redis passwords and the IP addresses of the services n9e connects to
vim /usr/local/n9e/etc/server.conf
vim /usr/local/n9e/etc/webapi.conf

mysql -uroot -p'MySQL_2022' < /usr/local/n9e/docker/initsql/a-n9e.sql

mkdir /opt/n9e
cat <<EOF >/etc/systemd/system/n9e-server.service
[Unit]
Description="n9e-server"
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/n9e/n9e server
WorkingDirectory=/usr/local/n9e
Restart=on-failure
SuccessExitStatus=0
LimitNOFILE=65536
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=n9e-server
[Install]
WantedBy=multi-user.target
EOF

cat <<EOF >/etc/systemd/system/n9e-webapi.service
[Unit]
Description="n9e-webapi"
After=network.target
[Service]
Type=simple
ExecStart=/usr/local/n9e/n9e webapi
WorkingDirectory=/usr/local/n9e
Restart=on-failure
SuccessExitStatus=0
LimitNOFILE=65536
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=n9e-webapi
[Install]
WantedBy=multi-user.target
EOF
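Since the two unit files differ only in the subcommand name, they could also be generated from a single template, which avoids copy-and-paste drift between them. This is a sketch modeled on the units above, not the author's method; the function name `render_n9e_unit` is hypothetical, and the Description value is written without quotes here.

```shell
# Print a systemd unit for one n9e component ("server" or "webapi").
# Assumption: n9e is unpacked in /usr/local/n9e, as in this article.
render_n9e_unit() (
  component="$1"
  cat <<EOF
[Unit]
Description=n9e-${component}
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/n9e/n9e ${component}
WorkingDirectory=/usr/local/n9e
Restart=on-failure
SuccessExitStatus=0
LimitNOFILE=65536
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=n9e-${component}

[Install]
WantedBy=multi-user.target
EOF
)

# Example:
# for c in server webapi; do
#   render_n9e_unit "$c" > "/etc/systemd/system/n9e-$c.service"
# done
```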

systemctl enable n9e-server.service
systemctl enable n9e-webapi.service 
systemctl restart n9e-server.service  n9e-webapi.service
systemctl status n9e-server.service
systemctl status n9e-webapi.service 
firewall-cmd --permanent --zone=public --add-port=18000/tcp
firewall-cmd --permanent --zone=public --add-port=19000/tcp
firewall-cmd --reload

V. Install the Telegraf collector on the monitored host

For example, pick one host to be monitored and use it as the test client.

rpm -ivh telegraf-1.21.3-1.x86_64.rpm

cat <<EOF > /etc/telegraf/telegraf.conf
[global_tags]

[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  hostname = ""
  omit_hostname = false

[[outputs.opentsdb]]
  host = "http://192.168.31.127"
  port = 19000
  http_batch_size = 50
  http_path = "/opentsdb/put"
  debug = false
  separator = "_"

[[inputs.cpu]]
  percpu = true
  totalcpu = true
  collect_cpu_time = false
  report_active = true

[[inputs.disk]]
  ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]

[[inputs.diskio]]

[[inputs.kernel]]

[[inputs.mem]]

[[inputs.processes]]

[[inputs.system]]
  fielddrop = ["uptime_format"]

[[inputs.net]]
  ignore_protocol_stats = true

EOF

systemctl restart telegraf.service
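The [[outputs.opentsdb]] section above makes Telegraf post data points to /opentsdb/put on port 19000. To illustrate what such a data point looks like, the sketch below builds a single-point body in the standard OpenTSDB JSON "put" format. The function name `opentsdb_point`, the metric name, and the host value are all hypothetical, and the values are not JSON-escaped, so only plain names and numbers should be passed.

```shell
# Build a one-point OpenTSDB "put" JSON body.
# Arguments: metric name, numeric value, host tag, Unix timestamp (default: now).
# Assumption: arguments contain no characters that need JSON escaping.
opentsdb_point() (
  metric="$1"
  value="$2"
  host="$3"
  ts="${4:-$(date +%s)}"
  printf '[{"metric":"%s","timestamp":%s,"value":%s,"tags":{"host":"%s"}}]\n' \
    "$metric" "$ts" "$value" "$host"
)

# Example: opentsdb_point demo_metric 1 host01
```

Piping the output to `curl -X POST -H 'Content-Type: application/json' --data-binary @- http://192.168.31.127:19000/opentsdb/put` would exercise the same endpoint Telegraf is configured for. Whether n9e-server accepts an uncompressed body sent this way is an assumption to verify against your version.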

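VI. Log in to the n9e web UI to view the monitoring metrics

The web UI is served by n9e-webapi on port 18000, which was opened in the firewall earlier. The default username and password are root/root.2000.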

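Telegraf is used as the collector here. This article covers only a basic getting-started deployment; more features remain to be explored and put into practice.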
