Enabling Local Log Tracing in Terraform and Troubleshooting Issues Yourself

2021-02-26 15:44:47

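Using the TencentCloud Terraform provider as an example, this article shows how to enable local log tracing when working with the Terraform CLI, and how to self-diagnose some common problems.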

Enabling local log tracing

Before running terraform apply in the CLI, enable local log tracing with the following commands:

export TF_LOG=DEBUG
export TF_LOG_PATH=./terraform.log

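With these variables set, running terraform apply or terraform destroy again creates a file named terraform.log in the local Terraform project folder. The file includes the log output defined by the TencentCloud Terraform provider, as shown in the figure.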

[Figure: log output after logging is enabled]
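As a rough sketch of a per-command variant: instead of exporting the variables for the whole shell session, a small wrapper can set them for a single invocation. The function name with_tf_log and the LOG_LEVEL override are hypothetical and not part of Terraform; TF_LOG and TF_LOG_PATH are the same variables used above.

```shell
# Hypothetical helper (not part of Terraform): run one command with
# TF_LOG and TF_LOG_PATH set only in that command's environment.
with_tf_log() {
  # LOG_LEVEL is an assumed override variable; it defaults to DEBUG as above.
  env TF_LOG="${LOG_LEVEL:-DEBUG}" TF_LOG_PATH="${PWD}/terraform.log" "$@"
}

# Example usage (requires terraform on PATH):
#   with_tf_log terraform apply
#   LOG_LEVEL=TRACE with_tf_log terraform destroy
```

Because the variables are passed through env, the surrounding shell session is left untouched, so later commands run without logging unless they are wrapped as well.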

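You can also export your SecretId and SecretKey as environment variables (both can be looked up in the console under your personal account), so that they do not have to be written into the .tf files: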

export TENCENTCLOUD_SECRET_ID=YourSecretId
export TENCENTCLOUD_SECRET_KEY=YourSecretKey
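Since a missing credential only surfaces once Terraform calls the API, it can help to check the two variables first. The following is a minimal sketch; the function name check_tencentcloud_credentials is hypothetical, and only the two variable names come from the commands above.

```shell
# Hypothetical helper: fail early if either credential variable is missing.
check_tencentcloud_credentials() {
  for name in TENCENTCLOUD_SECRET_ID TENCENTCLOUD_SECRET_KEY; do
    # Indirect lookup of the variable named by $name (portable via eval).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      return 1
    fi
  done
  echo "credentials found in environment"
}

# Example usage:
#   check_tencentcloud_credentials && terraform apply
```

The check only confirms that the variables are non-empty; it does not verify that the values are valid.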

The following failed run shows how to track down a problem.

In this example, a K8S cluster is created and an existing CVM is attached to it as a node (for the corresponding .tf files, see the official example).

$ terraform apply
2021/02/25 17:53:02 [WARN] Log levels other than TRACE are currently unreliable, and are supported only for backward compatibility.
  Use TF_LOG=TRACE to see Terraform's internal logs.
  ----
data.tencentcloud_instance_types.default: Refreshing state...
data.tencentcloud_cbs_storages.storages: Refreshing state...
data.tencentcloud_vpc_subnets.vpc2: Refreshing state...
data.tencentcloud_images.default: Refreshing state...
data.tencentcloud_vpc_subnets.vpc: Refreshing state...
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
    create

Terraform will perform the following actions:

  # tencentcloud_kubernetes_cluster.managed_cluster will be created
    resource "tencentcloud_kubernetes_cluster" "managed_cluster" {
        certification_authority      = (known after apply)
        claim_expired_seconds        = 300
        cluster_as_enabled           = false
        cluster_cidr                 = "10.1.0.0/16"
        cluster_deploy_type          = "MANAGED_CLUSTER"
        cluster_desc                 = "test cluster desc"
        cluster_external_endpoint    = (known after apply)
        cluster_internet             = false
        cluster_intranet             = false
        cluster_ipvs                 = true
        cluster_max_pod_num          = 32
        cluster_max_service_num      = 32
        cluster_name                 = "keep"
        cluster_node_num             = (known after apply)
        cluster_os                   = "ubuntu16.04.1 LTSx86_64"
        cluster_os_type              = "GENERAL"
        cluster_version              = "1.10.5"
        container_runtime            = "docker"
        deletion_protection          = false
        domain                       = (known after apply)
        id                           = (known after apply)
        ignore_cluster_cidr_conflict = false
        is_non_static_ip_mode        = false
        kube_config                  = (known after apply)
        network_type                 = "GR"
        node_name_type               = "lan-ip"
        password                     = (known after apply)
        pgw_endpoint                 = (known after apply)
        security_policy              = (known after apply)
        user_name                    = (known after apply)
        vpc_id                       = "vpc-h70b6b49"
        worker_instances_list        = (known after apply)

        worker_config {
            availability_zone                       = "ap-guangzhou-3"
            count                                   = 1
            enhanced_monitor_service                = false
            enhanced_security_service               = false
            instance_charge_type                    = "POSTPAID_BY_HOUR"
            instance_charge_type_prepaid_period     = 1
            instance_charge_type_prepaid_renew_flag = "NOTIFY_AND_MANUAL_RENEW"
            instance_name                           = "sub machine of tke"
            instance_type                           = "S1.SMALL1"
            internet_charge_type                    = "TRAFFIC_POSTPAID_BY_HOUR"
            internet_max_bandwidth_out              = 100
            password                                = (sensitive value)
            public_ip_assigned                      = true
            subnet_id                               = "subnet-1uwh63so"
            system_disk_size                        = 60
            system_disk_type                        = "CLOUD_SSD"
            user_data                               = "dGVzdA=="

            data_disk {
                disk_size = 50
                disk_type = "CLOUD_PREMIUM"
            }
        }
    }

  # tencentcloud_kubernetes_cluster_attachment.test_attach will be created
    resource "tencentcloud_kubernetes_cluster_attachment" "test_attach" {
        cluster_id      = (known after apply)
        hostname        = "user"
        id              = (known after apply)
        instance_id     = "ins-lmnl6t1g"
        labels          = {
            "test1" = "test1"
            "test2" = "test2"
        }
        password        = (sensitive value)
        security_groups = (known after apply)
        state           = (known after apply)

        worker_config {
            docker_graph_path = "/var/lib/docker"
            is_schedule       = true

            data_disk {
                auto_format_and_mount = false
                disk_size             = 50
                disk_type             = "CLOUD_PREMIUM"
            }
        }
    }

Plan: 2 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

tencentcloud_kubernetes_cluster.managed_cluster: Creating...

Error: [TencentCloudSDKError] Code=InternalError.CidrConflictWithOtherCluster, Message=DashboardError,Code : -10013 , Msg : CIDR_CONFLICT_WITH_OTHER_CLUSTER[cidr 10.1.0.0/16 is conflict with cluster id: cls-1zc0kpyo], err : CheckCIDRWithVPCClusters failed,CIDR(10.1.0.0/16) conflict with clusterCIDR,ClusterID:cls-1zc0kpyo,clusterCIDR:10.1.0.0/16,err:CIDR1:10.1.0.0/16,firstIP:10.1.0.0,conflict with CIDR2:10.1.0.0/16, RequestId=40d3ee5d-f723-4ef9-8f01-32d725464d51

  on main.tf line 424, in resource "tencentcloud_kubernetes_cluster" "managed_cluster":
 424: resource "tencentcloud_kubernetes_cluster" "managed_cluster" {

The CLI reports the following error:

[TencentCloudSDKError] Code=InternalError.CidrConflictWithOtherCluster, Message=DashboardError,Code : -10013 , Msg : CIDR_CONFLICT_WITH_OTHER_CLUSTER[cidr 10.1.0.0/16 is conflict with cluster id: cls-1zc0kpyo], err : CheckCIDRWithVPCClusters failed,CIDR(10.1.0.0/16) conflict with clusterCIDR,ClusterID:cls-1zc0kpyo,clusterCIDR:10.1.0.0/16,err:CIDR1:10.1.0.0/16,firstIP:10.1.0.0,conflict with CIDR2:10.1.0.0/16, RequestId=40d3ee5d-f723-4ef9-8f01-32d725464d51

How to locate the problem

1. Find the requestId in the error message: 40d3ee5d-f723-4ef9-8f01-32d725464d51

2. Open the terraform.log file mentioned above, search for that requestId, and read the surrounding context:

2021-02-25T17:53:20.222+0800 [DEBUG] plugin.terraform-provider-tencentcloud.exe: 2021/02/25 17:53:20 [DEBUG] setting computed for "worker_instances_list" from ComputedKeys

_CONFLICT_WITH_OTHER_CLUSTER[cidr 10.1.0.0/16 is conflict with cluster id: cls-1zc0kpyo], err : CheckCIDRWithVPCClusters failed,CIDR(10.1.0.0/16) conflict with clusterCIDR,ClusterID:cls-1zc0kpyo,clusterCIDR:10.1.0.0/16,err:CIDR1:10.1.0.0/16,firstIP:10.1.0.0,conflict with CIDR2:10.1.0.0/16"},"RequestId":"40d3ee5d-f723-4ef9-8f01-32d725464d51"}},cost 370.8109ms

6 is conflict with cluster id: cls-1zc0kpyo], err : CheckCIDRWithVPCClusters failed,CIDR(10.1.0.0/16) conflict with clusterCIDR,ClusterID:cls-1zc0kpyo,clusterCIDR:10.1.0.0/16,err:CIDR1:10.1.0.0/16,firstIP:10.1.0.0,conflict with CIDR2:10.1.0.0/16, RequestId=40d3ee5d-f723-4ef9-8f01-32d725464d51

2021-02-25T17:53:20.593+0800 [DEBUG] plugin.terraform-provider-tencentcloud.exe: 2021/02/25 17:53:20 common.go:79: [DEBUG] [ELAPSED] resource.tencentcloud_kubernetes_cluster.create elapsed 371 ms

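3. Analyze the log. It shows that the failure occurred while creating the K8S cluster; in this example the cause is that the CIDR conflicts with that of another existing K8S cluster. In most other cases, however, the error shown by the CLI is not clear enough, or the error carries no requestId, which makes the problem hard to locate. You can then submit a ticket to request assistance, attaching the tf project files, the CLI output, and the generated terraform.log file.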
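The lookup in steps 1 and 2 can be sketched with standard grep. The helper names extract_request_id and find_in_log are hypothetical. The sketch assumes the requestId has the 36-character UUID form seen in this example and that your grep supports the -o and -C options (GNU and BSD grep both do).

```shell
# Hypothetical helper: pull the first RequestId out of CLI error text on stdin.
extract_request_id() {
  grep -oE 'RequestId=[0-9a-f-]{36}' | head -n 1 | cut -d= -f2
}

# Hypothetical helper: print matching log lines with line numbers and
# two lines of context on either side.
find_in_log() {
  # $1 = request id, $2 = log file (defaults to ./terraform.log)
  grep -n -C 2 -- "$1" "${2:-./terraform.log}"
}

# Example usage, assuming the CLI output was saved to a file named apply.out:
#   id="$(extract_request_id < apply.out)"
#   find_in_log "$id" ./terraform.log
```

Searching for the bare ID rather than the full `RequestId=...` string matters here, because the log records it in JSON form ("RequestId":"...") while the CLI prints it with an equals sign.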
