Hands-On: Connecting a Self-Managed Filebeat to Tencent Cloud Big Data ES Serverless to Collect Apache Logs

2023-11-30 23:07:10

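0x00. Preface

The previous article covered the out-of-the-box method for collecting CVM logs: https://cloud.tencent.com/developer/article/2365751

Tencent Cloud Big Data ES Serverless also supports a self-managed Filebeat, which applies to a wider range of environments. This article walks through that setup in detail.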

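0x01. Limitations

1. ES exposes only an intranet access address, so the self-managed Filebeat and ES must be in the same VPC.

2. In the Beijing region, ES can currently be created only in Beijing Zone 3 through Beijing Zone 7.

For this walkthrough, the self-managed Filebeat therefore runs on a CVM that is also in Beijing Zone 3.

3. Because Filebeat is self-managed, it can run on multiple operating systems, not only 64-bit Linux.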
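
Since ES is reachable only over the intranet, it can help to confirm basic TCP reachability from the Filebeat host before installing anything. The following is a minimal sketch using only the Python standard library; the function name `can_connect` is made up for illustration, and it only checks that a TCP connection opens, not that ES accepts the credentials.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Calling `can_connect` with the index intranet access address and port 80 from the CVM should return True when both are in the same VPC; a False result suggests a network or zone mismatch rather than a Filebeat problem.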

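0x02. Installing Filebeat

This article collects Apache logs, which requires Filebeat.

The following shows how to install it on a Windows server.

Step 1

Download filebeat-7.14.2-windows-x86_64.msi. Note that version 7.14.2 is required (the official documentation also states this: https://cloud.tencent.com/document/product/845/90416#.E8.87.AA.E5.BB.BA-filebeat-.E6.95.B0.E6.8D.AE.E9.87.87.E9.9B.86)

This walkthrough uses the .msi installer. Alternatively, download the .zip, extract it, and register the service yourself with the following command: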

PS D:\filebeat-7.14.2-windows-x86_64> .\install-service-filebeat.ps1


__GENUS          : 2
__CLASS          : __PARAMETERS
__SUPERCLASS     :
__DYNASTY        : __PARAMETERS
__RELPATH        :
__PROPERTY_COUNT : 1
__DERIVATION     : {}
__SERVER         :
__NAMESPACE      :
__PATH           :
ReturnValue      : 5
PSComputerName   :

__GENUS          : 2
__CLASS          : __PARAMETERS
__SUPERCLASS     :
__DYNASTY        : __PARAMETERS
__RELPATH        :
__PROPERTY_COUNT : 1
__DERIVATION     : {}
__SERVER         :
__NAMESPACE      :
__PATH           :
ReturnValue      : 0
PSComputerName   :

Status      : Stopped
Name        : filebeat
DisplayName : filebeat

Installation is complete.

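0x03. Creating an Empty Index

Fill in the basic information. Because I could not find a sample of Filebeat's output, I set the mappings in the index configuration to "dynamically generated".

I did not want to enter every field by hand; the full list of exported fields is here: https://www.elastic.co/guide/en/beats/filebeat/7.10/exported-fields-apache.html

Confirm the creation. Once it completes, you get the "index intranet access address".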

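0x04. Configuring Filebeat

Step 2

Open the configuration folder.

Create filebeat.yml.

Configure it as shown below, adjusting values as needed.

Note that the ES intranet port (HTTP) is 80, not 9200.

The intranet Kibana port is 443, not 5601.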

# ============================== Filebeat inputs ===============================

filebeat.inputs:
- type: log
  # Change to true to enable this input configuration.
  enabled: true
  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - C:\Apache24\logs\*
# ============================== Filebeat modules ==============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

  # Period on which files under path should be checked for changes
  #reload.period: 10s

# ======================= Elasticsearch template setting =======================
setup.template.enabled: false
setup.ilm.enabled: false
  #template setting's value is set to false by default. If you set it to true, an error will be reported when the configuration is submitted


# ================================== General ===================================

# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:

# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]

# Optional fields that you can specify to add additional information to the
# output.
#fields:
#  env: staging

# ================================= Processors =================================
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded

# ================================== Logging ===================================

# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
logging.level: debug

# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publisher", "service".
#logging.selectors: ["*"]
############################# output ######################################
output.elasticsearch:
  # Array of hosts to connect to.
  allow_older_versions: true
  protocol: "http"
  hosts: ["http://index-<rm>.ap-beijing.qcloudes.com:80"]

  # Authentication credentials - either API key or username/password.
  username: "elastic"
  password: "<rm>"
  indices:
  - index: apache_logs-<rm>
    when.equals:
      fields.type: log

setup.dashboards.enabled: true
setup.kibana:
  host: "https://index-<rm>-internal.kibana.qcloudes.com:443"
  space.id: "index-<rm>"

Step 3

Enable the apache module.

D:\filebeat-7.14.2-windows-x86_64>filebeat.exe modules enable apache
Module apache doesn't exist!

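Because the command reports that the module does not exist, enable it manually instead: in the modules.d folder, rename apache.yml.disabled to apache.yml.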
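
The manual rename can also be scripted. This is a minimal sketch with a hypothetical helper, `enable_module`; it assumes the module files live in the modules.d folder referenced by `filebeat.config.modules` in the configuration, and it does nothing beyond renaming the file.

```python
from pathlib import Path


def enable_module(modules_dir: Path, module: str) -> Path:
    """Rename <module>.yml.disabled to <module>.yml and return the new path."""
    disabled = modules_dir / f"{module}.yml.disabled"
    enabled = modules_dir / f"{module}.yml"
    if enabled.exists():
        # Already enabled; nothing to do.
        return enabled
    if not disabled.exists():
        raise FileNotFoundError(f"module file not found: {disabled}")
    disabled.rename(enabled)
    return enabled
```

For example, `enable_module(Path("modules.d"), "apache")` run from the Filebeat directory would perform the same rename as the manual step.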

0x05. Starting Filebeat

Run setup first.

D:\filebeat-7.14.2-windows-x86_64>.\filebeat.exe setup
ILM policy and write alias loading not enabled.
Template loading not enabled.

Index setup finished.
Loading dashboards (Kibana must be running and reachable)
Loaded dashboards
Setting up ML using setup --machine-learning is going to be removed in 8.0.0. Please use the ML app instead.
See more: https://www.elastic.co/guide/en/machine-learning/current/index.html
Loaded machine learning job configurations
Loaded Ingest pipelines

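Then start the service.

However, no data showed up, even after waiting.

I noticed that Kibana had gained a filebeat-* index pattern, and it occurred to me that the index should probably be named filebeat.

So I created a new index.

Still no data arrived.

Once data does arrive, the charts can be viewed on the dashboards.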

0x06. Postscript

The logs revealed the cause: a Serverless index does not support the PUT operation on _ingest/pipeline, so the pipeline cannot be loaded and the writes fail.

2023-11-30T22:58:14.842+0800 ERROR [modules] fileset/setup.go:80 Error loading pipeline: 1 error: error loading pipeline for fileset apache/error: the Filebeat modules require Elasticsearch >= 5.0. This is the response I got from Elasticsearch: {"error":"Serverless index does not support uri [/_ingest/pipeline/filebeat-7.14.2-apache-error-pipeline] and method [PUT]"}

It is understandable that the vendor trimmed APIs it considers unnecessary for security reasons, but this prevents a self-managed Filebeat from connecting. I hope the vendor fixes this in the future.
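
When scanning a long Filebeat log for this kind of rejection, a small filter helps pull out which URI and HTTP method were refused. The sketch below is based only on the wording of the error line quoted above; the pattern and the function name `find_rejected_request` are illustrative, and other error messages may be phrased differently.

```python
import re
from typing import Optional

# Matches the rejection text seen in the quoted error response.
REJECTED = re.compile(
    r"does not support uri \[(?P<uri>[^\]]+)\] and method \[(?P<method>[A-Z]+)\]"
)


def find_rejected_request(line: str) -> Optional[dict]:
    """Return the rejected URI and method from a log line, or None if absent."""
    match = REJECTED.search(line)
    if match is None:
        return None
    return {"uri": match.group("uri"), "method": match.group("method")}


# Tail of the error line from the Filebeat log in this article.
sample = (
    'This is the response I got from Elasticsearch: '
    '{"error":"Serverless index does not support uri '
    '[/_ingest/pipeline/filebeat-7.14.2-apache-error-pipeline] and method [PUT]"}'
)
```

Applied to `sample`, the function reports the `_ingest/pipeline` URI and the PUT method, which is the API the module's setup step depends on.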
