EdgeX Foundry (Ireland 2.0) Security-Mode Startup Analysis: security-bootstrapper

2022-09-09 14:10:56

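EdgeX Foundry is an open-source, vendor-neutral, flexible, customizable and interoperable software platform that runs at the edge. It interacts with physical devices such as sensors, actuators and other IoT objects. Put simply, EdgeX is an edge middleware platform that sits between information systems and the sensing and actuation of IoT devices (for more background, see the official EdgeX documentation: Introduction – EdgeX Foundry Documentation).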

The EdgeX project provides Docker Compose files that bring the platform up quickly on Docker, so users can try out its features right away: edgexfoundry/edgex-compose: EdgeX Foundry Docker Compose release compose files and tools for building EdgeX compose files (github.com).

If you only want to try out or use EdgeX, the Docker Compose setup is sufficient.

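EdgeX consists of many modules and is deployed and run as a microservice architecture. If you want to understand the EdgeX runtime architecture in depth, do custom development on top of EdgeX, or deploy EdgeX for high availability, you need to understand how the modules interact and depend on each other, especially in security mode.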

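The figure below shows the inter-module dependencies defined in EdgeX's docker-compose.yml. (These are dependencies, not a startup order: depends_on in Docker Compose does not guarantee the order in which the microservices inside the containers start.)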

    command: /app-service-configurable -cp=consul.http://edgex-core-consul:8500 --registry
      --confdir=/res
    container_name: edgex-app-rules-engine
    depends_on:
    - consul
    - data
    - security-bootstrapper
    entrypoint:
    - /edgex-init/ready_to_run_wait_install.sh

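This series of articles covers the security-related modules of EdgeX in security mode. It does not focus heavily on source-code analysis; it is mostly about the dependencies between modules.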

security-bootstrapper

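Note that the dependencies in the figure above use Docker's declarative depends_on. That dependency works at container granularity and cannot guarantee the startup order of the microservices inside the EdgeX modules, for example whether an HTTP service that is depended on has finished starting, or whether a required file has been created. To check these dependencies EdgeX provides a verification tool, security-bootstrapper. The EdgeX ADR describes the design of the startup process in more detail.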

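This module is one of the utility modules EdgeX provides. As the figure above shows, EdgeX microservices depend on one another. EdgeX uses the helper commands of this module to make each microservice wait until the services it depends on have started before it starts itself.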

The corresponding source repository (unless stated otherwise, file paths in the rest of this article are relative to this repository):

edgexfoundry

Source location:

cmd/security-bootstrapper/main.go

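security-bootstrapper can also run as a service (the gate command), but it is not one of the services EdgeX exposes externally. It mainly acts as a gatekeeper for the other microservices: when they start, they check against it to confirm that security-bootstrapper has finished starting.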
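The check that other services perform against the gate can be pictured as "keep dialing a TCP port until it accepts". The following is a minimal sketch of that idea only. The helper name waitForPort, its parameters and the local stand-in listener are assumptions for illustration; this is not the EdgeX tcp.DialTcp implementation.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// waitForPort dials addr repeatedly until a TCP connection succeeds or the
// timeout elapses. Hypothetical helper, not part of EdgeX.
func waitForPort(addr string, timeout, interval time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		conn, err := net.DialTimeout("tcp", addr, interval)
		if err == nil {
			return conn.Close() // port is open: the gate has been raised
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("gave up waiting for %s: %w", addr, err)
		}
		time.Sleep(interval)
	}
}

// run starts a local listener that stands in for the gate port and waits on it.
func run() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // ephemeral port, assumption for the demo
	if err != nil {
		return err
	}
	defer ln.Close()
	return waitForPort(ln.Addr().String(), 2*time.Second, 100*time.Millisecond)
}

func main() {
	if err := run(); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("gate port is open")
}
```

A successful connection is the whole signal here: no payload is exchanged, so the open port itself behaves like a semaphore.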

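All commands other than gate are mainly used in the EdgeX startup scripts, the entrypoint-scripts. In security mode every EdgeX microservice is started through one of the entrypoint-scripts. The scripts themselves are covered in the article on each module.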

The security-bootstrapper Dockerfile (cmd/security-bootstrapper/Dockerfile) defines gate as its startup command.

For reference, this is the usage text that security-bootstrapper prints when no command is given:

Please specify command for security-bootstrapper.exe
Usage: security-bootstrapper.exe [options] <command> [arg...]
Options:
    -h, --help    Show this message
    --confdir     Specify local configuration directory

Commands:
    gate              Do security bootstrapper gating on stages while starting services
    genPassword       Generate a random password
    getHttpStatus     Do an HTTP GET call to get the status code
    help              Show available commands (this text)
    listenTcp         Start up a TCP listener
    pingPgDb          Test Postgres database readiness
    setupRegistryACL  Set up registry's ACL and configure the access
    waitFor           Wait for the other services with specified URI(s) to connect:
                      the URI(s) can be communication protocols like tcp/tcp4/tcp6/http/https or files

Source location of each command option:

internal/security/bootstrapper/command/{option}/command.go

Command options

gate:

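The gate option runs in three stages; configuration.toml defines the behavior of each stage.

  1. Publish the start signal (listen on StartPort: 54324)
  2. Wait for the base services to finish starting (Redis, Postgres, Consul)
  3. Publish a Ready signal (listen on ToRunPort: 54329)

The corresponding Execute method (excerpt):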
func (c *cmd) Execute() (statusCode int, err error) {
	c.loggingClient.Infof("Security bootstrapper running %s", CommandName)

	bootstrapServer := tcp.NewTcpServer()
	c.loggingClient.Debugf("init phase: attempts to start up the listener on bootstrap host: %s, port: %d",
		c.config.StageGate.BootStrapper.Host, c.config.StageGate.BootStrapper.StartPort)

	// in a separate go-routine so it won't block the main thread execution
	// Stage 1: open the start port (StartPort)
	go openGatingSemaphorePort(bootstrapServer, c.config.StageGate.BootStrapper.StartPort, c.loggingClient,
		"Raising bootstrap semaphore for secure bootstrapping")

	// wait on for others to be done: each of tcp dialers is a blocking call
	// Stage 2.1: wait for Consul (the registry)
	c.loggingClient.Debug("Waiting on dependent semaphores required to raise the ready-to-run semaphore ...")
	if err := tcp.DialTcp(
		c.config.StageGate.Registry.Host,
		c.config.StageGate.Registry.ReadyPort,
		c.loggingClient); err != nil {
		retErr := fmt.Errorf("found error while waiting for readiness of Registry at %s:%d, err: %v",
			c.config.StageGate.Registry.Host, c.config.StageGate.Registry.ReadyPort, err)
		return interfaces.StatusCodeExitWithError, retErr
	}
	c.loggingClient.Info("Registry is ready")
	// Stage 2.2: wait for Postgres (Kong's database) to start
	if err := tcp.DialTcp(
		c.config.StageGate.KongDB.Host,
		c.config.StageGate.KongDB.ReadyPort,
		c.loggingClient); err != nil {
		retErr := fmt.Errorf("found error while waiting for readiness of KongDB at %s:%d, err: %v",
			c.config.StageGate.KongDB.Host, c.config.StageGate.KongDB.ReadyPort, err)
		return interfaces.StatusCodeExitWithError, retErr
	}
	c.loggingClient.Info("KongDB is ready")
	// Stage 2.3: wait for Redis to start
	if err := tcp.DialTcp(
		c.config.StageGate.Database.Host,
		c.config.StageGate.Database.ReadyPort,
		c.loggingClient); err != nil {
		retErr := fmt.Errorf("found error while waiting for readiness of Database at %s:%d, err: %v",
			c.config.StageGate.Database.Host, c.config.StageGate.Database.ReadyPort, err)
		return interfaces.StatusCodeExitWithError, retErr
	}
	c.loggingClient.Info("Database is ready")

	// Reached ready-to-run phase
	c.loggingClient.Debugf("ready-to-run phase: attempts to start up the listener on ready-to-run port: %d",
		c.config.StageGate.Ready.ToRunPort)

	readyToRunServer := tcp.NewTcpServer()
	// Stage 3: publish the startup-complete signal by listening on ToRunPort
	go openGatingSemaphorePort(readyToRunServer, c.config.StageGate.Ready.ToRunPort, c.loggingClient,
		"Raising ready-to-run semaphore for secure bootstrapping")

	// ... (rest of the function omitted)
}
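The three stages above can be sketched in isolation as follows. This is a simplified illustration, not the EdgeX code: openSemaphore and gate are hypothetical names, the addresses are local ephemeral ports, and each dependency is dialed once, whereas the real command reads hosts and ports from its configuration.

```go
package main

import (
	"fmt"
	"net"
)

// openSemaphore starts a TCP listener that accepts and immediately closes
// connections. A successful dial by a client means "this stage is reached".
func openSemaphore(addr string) (net.Listener, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, err
	}
	go func() {
		for {
			conn, err := ln.Accept()
			if err != nil {
				return // listener was closed
			}
			conn.Close()
		}
	}()
	return ln, nil
}

// gate runs the three stages: open the start port, check each dependency's
// ready port, then open the ready-to-run port.
func gate(startAddr, readyAddr string, deps []string) (net.Listener, net.Listener, error) {
	start, err := openSemaphore(startAddr) // stage 1
	if err != nil {
		return nil, nil, err
	}
	for _, dep := range deps { // stage 2
		conn, err := net.Dial("tcp", dep)
		if err != nil {
			start.Close()
			return nil, nil, fmt.Errorf("dependency %s not ready: %w", dep, err)
		}
		conn.Close()
	}
	ready, err := openSemaphore(readyAddr) // stage 3
	if err != nil {
		start.Close()
		return nil, nil, err
	}
	return start, ready, nil
}

// run uses three local listeners as stand-ins for Consul, Postgres and Redis.
func run() error {
	var deps []string
	for i := 0; i < 3; i++ {
		ln, err := openSemaphore("127.0.0.1:0")
		if err != nil {
			return err
		}
		defer ln.Close()
		deps = append(deps, ln.Addr().String())
	}
	start, ready, err := gate("127.0.0.1:0", "127.0.0.1:0", deps)
	if err != nil {
		return err
	}
	defer start.Close()
	defer ready.Close()
	// A service waiting on the ready-to-run port can now connect.
	conn, err := net.Dial("tcp", ready.Addr().String())
	if err != nil {
		return err
	}
	return conn.Close()
}

func main() {
	if err := run(); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("ready-to-run semaphore raised")
}
```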

genPassword:

func (c *cmd) Execute() (int, error) {
	c.loggingClient.Infof("Security bootstrapper running %s", CommandName)

	randomBytes := make([]byte, randomBytesLength)
	_, err := rand.Read(randomBytes) // all of salt guaranteed to be filled if err==nil
	if err != nil {
		return interfaces.StatusCodeExitWithError, err
	}

	randPass := base64.StdEncoding.EncodeToString(randomBytes)
	// output the randPass to stdout
	fmt.Fprintln(os.Stdout, randPass)

	return interfaces.StatusCodeExitNormal, nil
}

This command generates a random password. The logic is simple: read 33 random bytes, then Base64-encode them.
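As a standalone illustration of the same logic, the sketch below reads random bytes and Base64-encodes them. The function name genPassword and its length parameter are assumptions for the example; the length 33 follows the description above.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

// genPassword returns n cryptographically random bytes, Base64-encoded.
// Hypothetical helper mirroring the logic described in the article.
func genPassword(n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf), nil
}

func main() {
	pass, err := genPassword(33)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// 33 bytes are 11 groups of 3, so the encoding is 44 characters with no padding.
	fmt.Println(len(pass))
}
```

Because 33 is a multiple of 3, the encoded password never ends in "=" padding characters.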

getHttpStatus

Gets the HTTP status code of the specified HTTP service. It is simple, so it is skipped here.
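Since the source is skipped, here is a minimal sketch of what "do an HTTP GET and report the status code" amounts to. The function name httpStatus, the 5-second timeout and the local test server are assumptions for illustration; the actual command may differ.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// httpStatus performs a GET request and returns only the status code.
func httpStatus(url string) (int, error) {
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// run queries a local test server that always answers 204 No Content.
func run() (int, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusNoContent)
	}))
	defer srv.Close()
	return httpStatus(srv.URL)
}

func main() {
	code, err := run()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(code) // prints 204
}
```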

listenTcp

Starts a TCP listener.

func (c *cmd) Execute() (int, error) {
	c.loggingClient.Infof("Security bootstrapper running %s", CommandName)

	tcpServer := tcp.NewTcpServer()

	// block and listening forever until internal error
	if err := tcpServer.StartListener(c.tcpPort, c.loggingClient, c.tcpHost); err != nil {
		return interfaces.StatusCodeExitWithError, err
	}

	return interfaces.StatusCodeExitNormal, nil
}

pingPgDb

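This command exists specifically to check the state of Kong's database (Kong supports databases other than Postgres). It is simple, so it is skipped here.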

setupRegistryACL

This is the most complex of the command options. It is used in Consul's startup script.

An excerpt from consul_wait_install.sh (cmd/security-bootstrapper/entrypoint-scripts/consul_wait_install.sh):

docker-entrypoint.sh agent \
  -ui \
  -bootstrap \
  -server \
  -config-file=/edgex-init/consul-bootstrapper/config_consul_acl.json \
  -client 0.0.0.0 &
# wait for the secretstore tokens ready as we need the token for bootstrapping
echo "$(date) Executing waitFor on Consul with waiting on TokensReadyPort \
  tcp://${STAGEGATE_SECRETSTORESETUP_HOST}:${STAGEGATE_SECRETSTORESETUP_TOKENS_READYPORT}"
/edgex-init/security-bootstrapper --confdir=/edgex-init/res waitFor \
  -uri tcp://"${STAGEGATE_SECRETSTORESETUP_HOST}":"${STAGEGATE_SECRETSTORESETUP_TOKENS_READYPORT}" \
  -timeout "${STAGEGATE_WAITFOR_TIMEOUT}"

# we don't want to exit out the whole Consul process when ACL bootstrapping failed, just that
# Consul won't have ACL to be used
set +e
# call setupRegistryACL bootstrapping command, containing both ACL bootstrapping and re-configure consul access steps
/edgex-init/security-bootstrapper --confdir=/edgex-init/res setupRegistryACL

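Source location:

Don't rush into the source code here. First understand Vault's Consul secrets engine; links are given below. Once the Consul secrets engine makes sense, this code is not hard to follow. The code is fairly long, so it is not reproduced here.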

internal/security/bootstrapper/command/setupacl/command.go

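This command option mainly does the following:

1. Because EdgeX enables ACL control, a token with the highest ACL privileges (the bootstrap token) must be generated once Consul has started. So the first step is to generate Consul's bootstrap token. EdgeX generates this token through the REST endpoint: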

http://edgex-core-consul:8500/v1/acl/bootstrap
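A minimal sketch of calling this endpoint: Consul's ACL bootstrap is a PUT to /v1/acl/bootstrap, and the response carries the token in its SecretID field. To stay runnable without a Consul agent, the example talks to a local fake server; the function name bootstrapACL and the fake values are assumptions, and the EdgeX implementation is more involved.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// bootstrapToken holds the two response fields this sketch cares about.
type bootstrapToken struct {
	AccessorID string `json:"AccessorID"`
	SecretID   string `json:"SecretID"`
}

// bootstrapACL sends PUT /v1/acl/bootstrap and decodes the returned token.
func bootstrapACL(baseURL string) (bootstrapToken, error) {
	var tok bootstrapToken
	req, err := http.NewRequest(http.MethodPut, baseURL+"/v1/acl/bootstrap", nil)
	if err != nil {
		return tok, err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return tok, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return tok, fmt.Errorf("bootstrap failed: %s", resp.Status)
	}
	err = json.NewDecoder(resp.Body).Decode(&tok)
	return tok, err
}

// run points bootstrapACL at a fake server standing in for Consul.
func run() (bootstrapToken, error) {
	fake := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPut || r.URL.Path != "/v1/acl/bootstrap" {
			http.NotFound(w, r)
			return
		}
		fmt.Fprint(w, `{"AccessorID":"fake-accessor","SecretID":"fake-secret"}`)
	}))
	defer fake.Close()
	return bootstrapACL(fake.URL)
}

func main() {
	tok, err := run()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("bootstrap token:", tok.SecretID)
}
```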

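2. Enable Vault's Consul secrets engine (this topic deserves an article of its own).

Note that before this command runs, Consul checks that Vault has finished starting; this is covered later in the analysis of the Consul startup process.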

Vault official documentation

Command-line usage

Consul – Secrets Engines | Vault by HashiCorp (vaultproject.io)

REST API

Consul – Secrets Engines – HTTP API | Vault by HashiCorp (vaultproject.io)

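Consul official documentation

Consul ACL

Vault hosts the policy for generating Consul access tokens, along with the ACL permissions those tokens receive in Consul.

For how Vault manages Consul access credentials, see the official documentation.

In short: once ACLs are enabled, accessing Consul also requires an access token, and requesting an ACL token from Consul requires a token with management privileges (the management token). Because the management token is highly privileged, exposing it would create a serious security risk.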

curl \
    --header "X-Consul-Token: my-management-token" \
    --request PUT \
    --data '{"Name": "sample", "Type": "management"}' \
    https://consul.rocks/v1/acl/create

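So Vault provides the Consul secrets engine: the Consul management token is managed inside Vault, together with the rules for generating Consul access tokens (ACL). In Vault these rules are managed as roles.

When an application needs to access Consul, it calls the Vault API to obtain a Consul access token.

In this way Vault manages the Consul access tokens.

Once configuration is complete, the Consul-related settings are visible in the Vault UI under Secrets Engines.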
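From the application's side, obtaining a Consul token through Vault comes down to one authenticated GET. The sketch below assumes the secrets engine is mounted at the default path consul, so the request goes to /v1/consul/creds/<role> with an X-Vault-Token header and the token comes back under data.token. The role name edgex-demo-role, the token values and the fake server are hypothetical; EdgeX's own mount path and role names may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// credsResponse models only the part of the Vault response used here.
type credsResponse struct {
	Data struct {
		Token string `json:"token"`
	} `json:"data"`
}

// consulTokenFromVault asks Vault for a Consul token generated from a role.
func consulTokenFromVault(vaultURL, vaultToken, role string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, vaultURL+"/v1/consul/creds/"+role, nil)
	if err != nil {
		return "", err
	}
	req.Header.Set("X-Vault-Token", vaultToken)
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("vault returned %s", resp.Status)
	}
	var out credsResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	return out.Data.Token, nil
}

// run uses a fake server in place of Vault so the sketch is self-contained.
func run() (string, error) {
	fakeVault := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("X-Vault-Token") != "fake-vault-token" {
			http.Error(w, "permission denied", http.StatusForbidden)
			return
		}
		if r.URL.Path != "/v1/consul/creds/edgex-demo-role" {
			http.NotFound(w, r)
			return
		}
		fmt.Fprint(w, `{"data":{"token":"fake-consul-token"}}`)
	}))
	defer fakeVault.Close()
	return consulTokenFromVault(fakeVault.URL, "fake-vault-token", "edgex-demo-role")
}

func main() {
	tok, err := run()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("consul access token:", tok)
}
```

The application never sees the management token; it only holds a Vault token that is allowed to read from the role.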

3. Save Consul's bootstrap token to a file:

BootstrapTokenPath

4. EdgeX keeps the Consul ACL rules in code:

internal/security/bootstrapper/command/setupacl/aclpolicies.go

const (
	// edgeXPolicyRules are rules for edgex services
	// in Phase 2, we will use the same policy for all EdgeX services
	// TODO: phase 3 will have more finer grained policies for each service
	edgeXPolicyRules = `
	# HCL definition of server agent policy for EdgeX
	agent "" {
		policy = "read"
	}
	agent_prefix "edgex" {
		policy = "write"
	}
	node "" {
  		policy = "read"
	}
	node_prefix "edgex" {
		policy = "write"
	}
	service "" {
		policy = "write"
	}
	service_prefix "" {
		policy = "write"
	}
	# allow key value store put
	# once the default_policy is switched to "deny",
	# this is needed if wants to allow updating Key/Value configuration
	key "" {
		policy = "write"
	}
	key_prefix "" {
		policy = "write"
	}
	`

	// edgeXServicePolicyName is the name of the agent policy for edgex
	edgeXServicePolicyName = "edgex-service-policy"

	consulCreatePolicyAPI     = "/v1/acl/policy"
	consulReadPolicyByNameAPI = "/v1/acl/policy/name/%s"

	aclNotFoundMessage = "ACL not found"
)
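The constants above suggest that the policy is created through Consul's /v1/acl/policy endpoint. A minimal sketch of such a call follows: a PUT with a JSON body containing Name and Rules, authenticated with an X-Consul-Token header. The function createPolicy, the shortened rule text and the fake server are assumptions for illustration and do not reproduce the EdgeX code.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// aclPolicy models the policy fields used in this sketch.
type aclPolicy struct {
	ID    string `json:"ID,omitempty"`
	Name  string `json:"Name"`
	Rules string `json:"Rules"`
}

// createPolicy sends PUT /v1/acl/policy and returns the created policy.
func createPolicy(baseURL, token string, p aclPolicy) (aclPolicy, error) {
	var created aclPolicy
	body, err := json.Marshal(p)
	if err != nil {
		return created, err
	}
	req, err := http.NewRequest(http.MethodPut, baseURL+"/v1/acl/policy", bytes.NewReader(body))
	if err != nil {
		return created, err
	}
	req.Header.Set("X-Consul-Token", token)
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return created, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return created, fmt.Errorf("create policy failed: %s", resp.Status)
	}
	err = json.NewDecoder(resp.Body).Decode(&created)
	return created, err
}

// run sends the request to a fake server that echoes the policy with an ID.
func run() (aclPolicy, error) {
	fake := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var p aclPolicy
		if r.Method != http.MethodPut || r.Header.Get("X-Consul-Token") == "" ||
			json.NewDecoder(r.Body).Decode(&p) != nil {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		p.ID = "fake-policy-id"
		json.NewEncoder(w).Encode(p)
	}))
	defer fake.Close()
	rules := `key_prefix "" { policy = "write" }` // shortened stand-in for edgeXPolicyRules
	return createPolicy(fake.URL, "fake-bootstrap-token",
		aclPolicy{Name: "edgex-service-policy", Rules: rules})
}

func main() {
	p, err := run()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("created policy:", p.Name, p.ID)
}
```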

waitFor

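This command blocks until all of the specified resources are available and then returns. It is also the command option used most often in the entrypoint-scripts.

The resources it can wait on are TCP ports, files and HTTP services, and it accepts a wait-timeout parameter.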

func (c *cmd) waitForDependencies() error {
	dependencyChan := make(chan struct{})
	waitErr := make(chan error, 1)

	go func() {
		for _, uri := range c.parsedURIs {
			c.loggingClient.Infof("Waiting for: [%s] with timeout: [%s]", uri.String(), c.timeout.String())

			switch uri.Scheme {
			case "file":
				c.waitForFile(uri)
			case "tcp", "tcp4", "tcp6":
				c.waitForSocket(uri.Scheme, uri.Host)
			case "unix":
				c.waitForSocket(uri.Scheme, uri.Path)
			case "http", "https":
				c.waitForHTTP(uri)
			default:
				waitErr <- fmt.Errorf("invalid host protocol provided: %s. supported protocols are: file, tcp, tcp4, "+
					"tcp6, unix and http", uri.Scheme)
				return
			}
		}

		c.waitGroup.Wait()
		close(dependencyChan)
	}()

	select {
	case err := <-waitErr:
		return err
	case <-dependencyChan:
		break
	case <-time.After(c.timeout):
		return fmt.Errorf("Timeout after %s waiting on dependencies to become available: %v", c.timeout, c.uris)
	}

	return nil
}
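The function above delegates to helpers such as waitForFile that are not shown. As a rough idea of how a single wait with a timeout can be built from the same select pattern, here is a standalone sketch for the file case. The signature, the polling interval and the temporary file are assumptions; this is not the EdgeX helper.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// waitForFile polls until path exists or the timeout elapses.
func waitForFile(path string, timeout, interval time.Duration) error {
	found := make(chan struct{})
	stop := make(chan struct{})
	defer close(stop) // tells the poller to exit once we return

	go func() {
		for {
			if _, err := os.Stat(path); err == nil {
				close(found)
				return
			}
			select {
			case <-stop:
				return
			case <-time.After(interval):
			}
		}
	}()

	select {
	case <-found:
		return nil
	case <-time.After(timeout):
		return fmt.Errorf("timeout after %s waiting for %s", timeout, path)
	}
}

// run creates the awaited file after a short delay, as a dependency would.
func run() error {
	dir, err := os.MkdirTemp("", "waitfor")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	target := filepath.Join(dir, "ready")
	go func() {
		time.Sleep(100 * time.Millisecond)
		_ = os.WriteFile(target, []byte("ok"), 0o600)
	}()
	return waitForFile(target, 2*time.Second, 20*time.Millisecond)
}

func main() {
	if err := run(); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("file is available")
}
```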

Upcoming articles

  1. Vault
  2. security-secretstore-setup
  3. secrets-config & proxy-setup
  4. kong db
  5. kong
  6. consul
  7. Redis db
  8. Core modules
  9. EdgeX development environment on Windows

Publisher: 全栈程序员栈长. Please credit the source when reposting: https://javaforall.cn/162200.html Original link: https://javaforall.cn
