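This article uses the tencentcloud Terraform provider as an example to show how to enable local log tracing when working with the Terraform CLI, and how to self-diagnose some common problems.
Enabling local log tracing
Before running terraform apply in the CLI, you can enable local log tracing with the following commands: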
export TF_LOG=DEBUG
export TF_LOG_PATH=./terraform.log
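With these set, running terraform apply or terraform destroy again creates a file named terraform.log in the local Terraform working directory. It records the log output defined by the tencentcloud Terraform provider.
You can also use export to set secretId and secretKey directly (both can be looked up in the console under your personal account), so that they do not have to be written into the .tf files: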
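A minimal sketch for confirming that the logging variables are actually exported before running Terraform. It uses only the two variables from the text above; the sample run later in this article also prints a warning that, in that Terraform version, levels other than TRACE are unreliable, so TF_LOG=TRACE may be the safer choice there.

```shell
# Enable logging for the current shell session (values taken from the text above).
export TF_LOG=DEBUG
export TF_LOG_PATH=./terraform.log

# Show what child processes such as terraform will inherit.
env | grep '^TF_LOG'

# Terraform appends to the file named by TF_LOG_PATH, so delete a stale
# terraform.log first if you only want the next run's output.
# To turn logging off again:
# unset TF_LOG TF_LOG_PATH
```

The variables only affect the shell session in which they were exported, so a new terminal needs them set again.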
export TENCENTCLOUD_SECRET_ID=YourSecretId
export TENCENTCLOUD_SECRET_KEY=YourSecretKey
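When the credentials come from the environment rather than from the .tf files, a missing variable is easy to overlook. The following is a small sketch of a pre-flight check; the function name check_tencentcloud_env is hypothetical and not part of Terraform or the provider.

```shell
# Hypothetical helper: fail early if either credential variable is missing.
# It never prints the secret values themselves.
check_tencentcloud_env() {
  for name in TENCENTCLOUD_SECRET_ID TENCENTCLOUD_SECRET_KEY; do
    # Indirect lookup via eval keeps this compatible with plain POSIX sh.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      return 1
    fi
  done
  echo "credentials present"
}

# Typical use:
# check_tencentcloud_env && terraform apply
```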
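The following failed run illustrates how to locate a problem.
This example creates a K8S cluster and attaches an existing CVM instance to it as a node (see the official examples for the relevant .tf files).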
$ terraform apply
2021/02/25 17:53:02 [WARN] Log levels other than TRACE are currently unreliable, and are supported only for backward compatibility.
Use TF_LOG=TRACE to see Terraform's internal logs.
----
data.tencentcloud_instance_types.default: Refreshing state...
data.tencentcloud_cbs_storages.storages: Refreshing state...
data.tencentcloud_vpc_subnets.vpc2: Refreshing state...
data.tencentcloud_images.default: Refreshing state...
data.tencentcloud_vpc_subnets.vpc: Refreshing state...
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# tencentcloud_kubernetes_cluster.managed_cluster will be created
resource "tencentcloud_kubernetes_cluster" "managed_cluster" {
certification_authority = (known after apply)
claim_expired_seconds = 300
cluster_as_enabled = false
cluster_cidr = "10.1.0.0/16"
cluster_deploy_type = "MANAGED_CLUSTER"
cluster_desc = "test cluster desc"
cluster_external_endpoint = (known after apply)
cluster_internet = false
cluster_intranet = false
cluster_ipvs = true
cluster_max_pod_num = 32
cluster_max_service_num = 32
cluster_name = "keep"
cluster_node_num = (known after apply)
cluster_os = "ubuntu16.04.1 LTSx86_64"
cluster_os_type = "GENERAL"
cluster_version = "1.10.5"
container_runtime = "docker"
deletion_protection = false
domain = (known after apply)
id = (known after apply)
ignore_cluster_cidr_conflict = false
is_non_static_ip_mode = false
kube_config = (known after apply)
network_type = "GR"
node_name_type = "lan-ip"
password = (known after apply)
pgw_endpoint = (known after apply)
security_policy = (known after apply)
user_name = (known after apply)
vpc_id = "vpc-h70b6b49"
worker_instances_list = (known after apply)
worker_config {
availability_zone = "ap-guangzhou-3"
count = 1
enhanced_monitor_service = false
enhanced_security_service = false
instance_charge_type = "POSTPAID_BY_HOUR"
instance_charge_type_prepaid_period = 1
instance_charge_type_prepaid_renew_flag = "NOTIFY_AND_MANUAL_RENEW"
instance_name = "sub machine of tke"
instance_type = "S1.SMALL1"
internet_charge_type = "TRAFFIC_POSTPAID_BY_HOUR"
internet_max_bandwidth_out = 100
password = (sensitive value)
public_ip_assigned = true
subnet_id = "subnet-1uwh63so"
system_disk_size = 60
system_disk_type = "CLOUD_SSD"
user_data = "dGVzdA=="
data_disk {
disk_size = 50
disk_type = "CLOUD_PREMIUM"
}
}
}
# tencentcloud_kubernetes_cluster_attachment.test_attach will be created
resource "tencentcloud_kubernetes_cluster_attachment" "test_attach" {
cluster_id = (known after apply)
hostname = "user"
id = (known after apply)
instance_id = "ins-lmnl6t1g"
labels = {
"test1" = "test1"
"test2" = "test2"
}
password = (sensitive value)
security_groups = (known after apply)
state = (known after apply)
worker_config {
docker_graph_path = "/var/lib/docker"
is_schedule = true
data_disk {
auto_format_and_mount = false
disk_size = 50
disk_type = "CLOUD_PREMIUM"
}
}
}
Plan: 2 to add, 0 to change, 0 to destroy.
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
tencentcloud_kubernetes_cluster.managed_cluster: Creating...
Error: [TencentCloudSDKError] Code=InternalError.CidrConflictWithOtherCluster, Message=DashboardError,Code : -10013 , Msg : CIDR_CONFLICT_WITH_OTHER_CLUSTER[cidr 10.1.0.0/16 is conflict with cluster id: cls-1zc0kpyo], err : CheckCIDRWithVPCClusters failed,CIDR(10.1.0.0/16) conflict with clusterCIDR,ClusterID:cls-1zc0kpyo,clusterCIDR:10.1.0.0/16,err:CIDR1:10.1.0.0/16,firstIP:10.1.0.0,conflict with CIDR2:10.1.0.0/16, RequestId=40d3ee5d-f723-4ef9-8f01-32d725464d51
on main.tf line 424, in resource "tencentcloud_kubernetes_cluster" "managed_cluster":
424: resource "tencentcloud_kubernetes_cluster" "managed_cluster" {
The CLI reports the following error:
[TencentCloudSDKError] Code=InternalError.CidrConflictWithOtherCluster, Message=DashboardError,Code : -10013 , Msg : CIDR_CONFLICT_WITH_OTHER_CLUSTER[cidr 10.1.0.0/16 is conflict with cluster id: cls-1zc0kpyo], err : CheckCIDRWithVPCClusters failed,CIDR(10.1.0.0/16) conflict with clusterCIDR,ClusterID:cls-1zc0kpyo,clusterCIDR:10.1.0.0/16,err:CIDR1:10.1.0.0/16,firstIP:10.1.0.0,conflict with CIDR2:10.1.0.0/16, RequestId=40d3ee5d-f723-4ef9-8f01-32d725464d51
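The RequestId at the end of such a message is what ties the CLI error to the provider log. As a rough sketch, it can be extracted with sed instead of being copied by hand; the err value below is an abbreviated copy of the message above, and the pattern assumes the ID is a 36-character UUID as in this example.

```shell
# Abbreviated copy of the CLI error above (the middle of the message is omitted).
err='[TencentCloudSDKError] Code=InternalError.CidrConflictWithOtherCluster, Message=DashboardError, RequestId=40d3ee5d-f723-4ef9-8f01-32d725464d51'

# Keep only the 36-character UUID that follows "RequestId=".
request_id=$(printf '%s\n' "$err" | sed -n 's/.*RequestId=\([0-9a-fA-F-]\{36\}\).*/\1/p')

echo "$request_id"
```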
How to locate the problem
1. Find the requestId: 40d3ee5d-f723-4ef9-8f01-32d725464d51
2. Open the terraform.log file described above, search for that requestId, and read the surrounding context:
2021-02-25T17:53:20.222+0800 [DEBUG] plugin.terraform-provider-tencentcloud.exe: 2021/02/25 17:53:20 [DEBUG] setting computed for "worker_instances_list" from ComputedKeys
_CONFLICT_WITH_OTHER_CLUSTER[cidr 10.1.0.0/16 is conflict with cluster id: cls-1zc0kpyo], err : CheckCIDRWithVPCClusters failed,CIDR(10.1.0.0/16) conflict with clusterCIDR,ClusterID:cls-1zc0kpyo,clusterCIDR:10.1.0.0/16,err:CIDR1:10.1.0.0/16,firstIP:10.1.0.0,conflict with CIDR2:10.1.0.0/16"},"RequestId":"40d3ee5d-f723-4ef9-8f01-32d725464d51"}},cost 370.8109ms
6 is conflict with cluster id: cls-1zc0kpyo], err : CheckCIDRWithVPCClusters failed,CIDR(10.1.0.0/16) conflict with clusterCIDR,ClusterID:cls-1zc0kpyo,clusterCIDR:10.1.0.0/16,err:CIDR1:10.1.0.0/16,firstIP:10.1.0.0,conflict with CIDR2:10.1.0.0/16, RequestId=40d3ee5d-f723-4ef9-8f01-32d725464d51
2021-02-25T17:53:20.593+0800 [DEBUG] plugin.terraform-provider-tencentcloud.exe: 2021/02/25 17:53:20 common.go:79: [DEBUG] [ELAPSED] resource.tencentcloud_kubernetes_cluster.create elapsed 371 ms
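3. Analyze the log. Here it shows that the failure occurred while creating the K8S cluster: the cluster CIDR conflicts with that of another existing K8S cluster. In many other cases, however, the error reported by the CLI is not clear enough, or the error carries no requestId, which makes the problem hard to locate. You can then submit a ticket to request assistance, attaching the .tf project files, the CLI output, and the generated terraform.log file.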
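The search in step 2 and the ticket attachment in step 3 can be sketched as two small shell helpers. The function names find_request and bundle_for_ticket, and the file name apply-output.txt for the saved CLI output, are hypothetical choices made for this illustration; only terraform.log comes from the setup described earlier.

```shell
# Print every log line that mentions the request ID, with line numbers
# and a few lines of surrounding context.
# Usage: find_request <request-id> [log-file] [context-lines]
find_request() {
  grep -n -C "${3:-5}" -- "$1" "${2:-terraform.log}"
}

# Pack the .tf files, the saved CLI output, and the log into one archive.
# Usage: bundle_for_ticket <project-dir> <archive.tar.gz>
# Assumes the CLI output was saved as apply-output.txt, for example with:
#   terraform apply 2>&1 | tee apply-output.txt
bundle_for_ticket() {
  ( cd "$1" && tar -czf - *.tf terraform.log apply-output.txt ) > "$2"
}

# Typical use:
# find_request 40d3ee5d-f723-4ef9-8f01-32d725464d51
# bundle_for_ticket . ../ticket.tar.gz
```

Before attaching the archive, it is worth checking that none of the .tf files contain a secretId or secretKey; keeping the credentials in environment variables, as described earlier, avoids that.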