weixin_39717598
2020-11-27 18:49

Perma-diff when using self_link for network or subnetwork in google_container_cluster

This issue was originally opened as hashicorp/terraform#17919. It was migrated here as a result of the provider split. The original body of the issue is below.

Terraform Version


```
terraform -v
Terraform v0.11.7
+ provider.google v1.9.0
+ provider.random v1.2.0
```

Terraform Configuration Files

```hcl
# k8s-cluster.tf

# -----------OUTPUT OMITTED------------------
# Network
resource "google_compute_network" "cluster-net" {
  name                    = "cluster-net"
  project                 = "${google_project.gke-proj.project_id}"
  auto_create_subnetworks = "false"
}

# Subnet for cluster nodes
resource "google_compute_subnetwork" "nodes-subnet" {
  name          = "nodes-subnet"
  project       = "${google_project.gke-proj.project_id}"
  ip_cidr_range = "10.101.0.0/24"
  network       = "${google_compute_network.cluster-net.self_link}"
  region        = "us-east4"

  secondary_ip_range {
    range_name    = "container-range-1"
    ip_cidr_range = "172.20.0.0/16"
  }

  secondary_ip_range {
    range_name    = "service-range-1"
    ip_cidr_range = "10.200.0.0/16"
  }
}

resource "google_container_cluster" "primary" {
  project            = "${google_project.gke-proj.project_id}"
  name               = "semrush-test"
  zone               = "us-east4-a"
  initial_node_count = 3

  network    = "${google_compute_network.cluster-net.self_link}"
  subnetwork = "${google_compute_subnetwork.nodes-subnet.self_link}"

  ip_allocation_policy {
    cluster_secondary_range_name  = "${google_compute_subnetwork.nodes-subnet.secondary_ip_range.0.range_name}"
    services_secondary_range_name = "${google_compute_subnetwork.nodes-subnet.secondary_ip_range.1.range_name}"
  }

}
```

Expected Behavior

terraform plan should show the plan, terraform apply should apply it, and running terraform plan again on the SAME configuration files should then report "No changes. Infrastructure is up-to-date."

Actual Behavior

terraform plan shows the plan:

```
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  + google_container_cluster.kek
      id:                      <computed>
      additional_zones.#:      <computed>
      addons_config.#:         <computed>
      cluster_ipv4_cidr:       <computed>
      enable_kubernetes_alpha: "false"
      enable_legacy_abac:      "false"
      endpoint:                <computed>
      initial_node_count:      "3"
      instance_group_urls.#:   <computed>
      logging_service:         <computed>
      master_auth.#:           <computed>
      master_version:          <computed>
      monitoring_service:      <computed>
      name:                    "semrush-test"
      network:                 "cluster-net"
      network_policy.#:        <computed>
      node_config.#:           <computed>
      node_pool.#:             <computed>
      node_version:            <computed>
      private_cluster:         "false"
      project:                 "project-id"
      region:                  <computed>
      subnetwork:              "https://www.googleapis.com/compute/v1/projects/project-id/regions/us-east4/subnetworks/nodes-subnet"
      zone:                    "us-east4-a"


Plan: 1 to add, 0 to change, 0 to destroy.

------------------------------------------------------------------------

Note: You didn't specify an "-out" parameter to save this plan, so Terraform
can't guarantee that exactly these actions will be performed if
"terraform apply" is subsequently run.
```

terraform apply applies the plan:

```
Terraform will perform the following actions:

  + google_container_cluster.kek
      id:                      <computed>
      additional_zones.#:      <computed>
      addons_config.#:         <computed>
      cluster_ipv4_cidr:       <computed>
      enable_kubernetes_alpha: "false"
      enable_legacy_abac:      "false"
      endpoint:                <computed>
      initial_node_count:      "3"
      instance_group_urls.#:   <computed>
      logging_service:         <computed>
      master_auth.#:           <computed>
      master_version:          <computed>
      monitoring_service:      <computed>
      name:                    "semrush-test"
      network:                 "cluster-net"
      network_policy.#:        <computed>
      node_config.#:           <computed>
      node_pool.#:             <computed>
      node_version:            <computed>
      private_cluster:         "false"
      project:                 "project-id"
      region:                  <computed>
      subnetwork:              "https://www.googleapis.com/compute/v1/projects/project-id/regions/us-east4/subnetworks/nodes-subnet"
      zone:                    "us-east4-a"


Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

google_container_cluster.kek: Creating...
  additional_zones.#:      "" => "<computed>"
  addons_config.#:         "" => "<computed>"
  cluster_ipv4_cidr:       "" => "<computed>"
  enable_kubernetes_alpha: "" => "false"
  enable_legacy_abac:      "" => "false"
  endpoint:                "" => "<computed>"
  initial_node_count:      "" => "3"
  instance_group_urls.#:   "" => "<computed>"
  logging_service:         "" => "<computed>"
  master_auth.#:           "" => "<computed>"
  master_version:          "" => "<computed>"
  monitoring_service:      "" => "<computed>"
  name:                    "" => "semrush-test"
  network:                 "" => "cluster-net"
  network_policy.#:        "" => "<computed>"
  node_config.#:           "" => "<computed>"
  node_pool.#:             "" => "<computed>"
  node_version:            "" => "<computed>"
  private_cluster:         "" => "false"
  project:                 "" => "project-id"
  region:                  "" => "<computed>"
  subnetwork:              "" => "https://www.googleapis.com/compute/v1/projects/project-id/regions/us-east4/subnetworks/nodes-subnet"
  zone:                    "" => "us-east4-a"
google_container_cluster.kek: Still creating... (10s elapsed)
google_container_cluster.kek: Still creating... (20s elapsed)
google_container_cluster.kek: Still creating... (30s elapsed)
google_container_cluster.kek: Still creating... (40s elapsed)
google_container_cluster.kek: Still creating... (50s elapsed)
google_container_cluster.kek: Still creating... (1m0s elapsed)
google_container_cluster.kek: Still creating... (1m10s elapsed)
google_container_cluster.kek: Still creating... (1m20s elapsed)
google_container_cluster.kek: Still creating... (1m30s elapsed)
google_container_cluster.kek: Still creating... (1m40s elapsed)
google_container_cluster.kek: Still creating... (1m50s elapsed)
google_container_cluster.kek: Still creating... (2m0s elapsed)
google_container_cluster.kek: Creation complete after 2m4s (ID: semrush-test)

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
```

Running terraform plan again on the SAME configuration shows that it wants to recreate the cluster:

```
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
-/+ destroy and then create replacement

Terraform will perform the following actions:

-/+ google_container_cluster.kek (new resource required)
      id:                      "semrush-test" => <computed> (forces new resource)
      additional_zones.#:      "0" => <computed>
      addons_config.#:         "1" => <computed>
      cluster_ipv4_cidr:       "10.24.0.0/14" => <computed>
      enable_kubernetes_alpha: "false" => "false"
      enable_legacy_abac:      "false" => "false"
      endpoint:                "35.186.170.157" => <computed>
      initial_node_count:      "3" => "3"
      instance_group_urls.#:   "1" => <computed>
      logging_service:         "logging.googleapis.com" => <computed>
      master_auth.#:           "1" => <computed>
      master_version:          "1.8.8-gke.0" => <computed>
      monitoring_service:      "monitoring.googleapis.com" => <computed>
      name:                    "semrush-test" => "semrush-test"
      network:                 "cluster-net" => "cluster-net"
      network_policy.#:        "0" => <computed>
      node_config.#:           "1" => <computed>
      node_pool.#:             "1" => <computed>
      node_version:            "1.8.8-gke.0" => <computed>
      private_cluster:         "false" => "false"
      project:                 "project-id" => "project-id"
      region:                  "" => <computed>
      subnetwork:              "nodes-subnet" => "https://www.googleapis.com/compute/v1/projects/project-id/regions/us-east4/subnetworks/nodes-subnet" (forces new resource)
      zone:                    "us-east4-a" => "us-east4-a"


Plan: 1 to add, 0 to change, 1 to destroy.
```

So after creating the GKE cluster, Terraform mis-reads some properties (id, subnetwork) and wants to recreate the cluster because it thinks something has changed.

Steps to Reproduce

  1. terraform init
  2. terraform apply
  3. terraform plan

Additional Context

While experimenting with workarounds for this bug, we found that the cause is referencing the network and subnetwork properties of the GKE cluster by self_link. Referencing them by name fixes this behavior, so this configuration works fine:

```hcl
resource "google_container_cluster" "primary" {
  project            = "${google_project.gke-proj.project_id}"
  name               = "super-cluster-new"
  zone               = "us-east4-a"
  initial_node_count = 3

  network    = "${google_compute_network.cluster-net.name}"
  subnetwork = "${google_compute_subnetwork.nodes-subnet.name}"

  ip_allocation_policy {
    cluster_secondary_range_name  = "${google_compute_subnetwork.nodes-subnet.secondary_ip_range.0.range_name}"
    services_secondary_range_name = "${google_compute_subnetwork.nodes-subnet.secondary_ip_range.1.range_name}"
  }
}
```

This question originates from the open-source project: hashicorp/terraform-provider-google


9 replies

  • weixin_39646695 5 months ago

    As you identified, it looks like we're storing the name of the network/subnetwork while you're specifying a self_link, so as a workaround, specifying the name should make the diff go away. A real solution would probably be to either validate this and throw an error if it's not just a name, accept either a name or a self_link, or suppress the diff when the self_link refers to a resource with the right name.
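
    A minimal sketch of the diff-suppression option, assuming a helper that compares the trailing path segment of a self_link against a bare name; the function names and wiring below are illustrative, not the provider's actual implementation:

    ```go
    package main

    import (
    	"fmt"
    	"strings"
    )

    // lastSegment extracts the trailing component of a URL-like path,
    // e.g. ".../subnetworks/nodes-subnet" -> "nodes-subnet".
    // A bare name contains no "/" and is returned unchanged.
    func lastSegment(s string) string {
    	if i := strings.LastIndex(s, "/"); i >= 0 {
    		return s[i+1:]
    	}
    	return s
    }

    // suppressSelfLinkVsNameDiff mimics the shape of a Terraform
    // DiffSuppressFunc: return true to suppress the diff when the stored
    // name and the configured self_link refer to the same resource.
    func suppressSelfLinkVsNameDiff(old, new string) bool {
    	return lastSegment(old) == lastSegment(new)
    }

    func main() {
    	stored := "nodes-subnet" // name-only value the GKE API returns
    	configured := "https://www.googleapis.com/compute/v1/projects/project-id/regions/us-east4/subnetworks/nodes-subnet"
    	fmt.Println(suppressSelfLinkVsNameDiff(stored, configured)) // true -> no perma-diff
    }
    ```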

  • weixin_39519769 5 months ago

    We already have a DiffSuppressFunc: compareSelfLinkOrResourceName. However, this function expects the value stored in state to be a self-link.

    All the other APIs return a self-link for the network and subnetwork fields; only the GKE API does not.

    In the Read method, we should use ParseNetworkFieldValue and ParseSubnetworkFieldValue and save the self-link:

    ```go
    d.Set("network", cluster.Network)       // wrong: unlike other APIs, cluster.Network is the name only
    d.Set("subnetwork", cluster.Subnetwork) // same
    ```
    
  • weixin_39646695 5 months ago

    I have feelings about changing this on people mid-major-version, because it's technically a breaking change, though one could argue it probably doesn't matter. But this would change the value returned for every interpolation of network or subnetwork, which seems like a value you'd want to interpolate?

    I'm not saying let's leave it weird, but it seems like we could work around it for the moment and put it on the list of changes to make in 2.0.0 to address it at the root. /2¢

  • weixin_39594895 5 months ago

    Fixed as part of #1528

  • weixin_39649478 5 months ago

    Hi,

    Not sure if I need to open a new issue or use this one... I'm using v1.13.0 of the Terraform Google provider and I'm still facing this problem:

    
    google_service_account.node: Refreshing state... (ID: projects/xxx/serviceAc....iam.gserviceaccount.com)
    data.google_compute_zones.available: Refreshing state...
    data.google_compute_subnetwork.gke: Refreshing state...
    data.google_compute_network.gke: Refreshing state...
    google_container_cluster.gke: Refreshing state... (ID: test-gke-cluster)
    
    An execution plan has been generated and is shown below.
    Resource actions are indicated with the following symbols:
      ~ update in-place
    
    Terraform will perform the following actions:
    
      ~ google_container_cluster.gke
          network:    "projects/xxx/global/networks/test-gke-network" => "https://www.googleapis.com/compute/v1/projects/xxx/global/networks/test-gke-network"
          subnetwork: "projects/xxx/regions/europe-west1/subnetworks/test-gke-network-subnet-1" => "test-gke-network-subnet-1"
    
    
    Plan: 0 to add, 1 to change, 0 to destroy.
    

    I'm using datasources to retrieve network and subnetwork resources but I don't think it's related.

    My 2¢:

    I saw you use relative links, so it works with neither the name nor the self_link attribute and always triggers a change (see the sketch at the end of this comment).

    Thanks :)
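
    To illustrate the relative-link point: normalizing both sides to the relative form makes a full self_link compare equal to a partial link, while a bare name still differs, matching the behavior reported above. This is a minimal sketch; the prefix constant and function are illustrative assumptions, not provider code.

    ```go
    package main

    import (
    	"fmt"
    	"strings"
    )

    // computePrefix is the part of a full self_link that a partial
    // ("relative") link omits.
    const computePrefix = "https://www.googleapis.com/compute/v1/"

    // normalizeLink reduces a full self_link to its relative form so that
    // "https://www.googleapis.com/compute/v1/projects/xxx/global/networks/n"
    // and "projects/xxx/global/networks/n" compare equal. A bare name is
    // returned unchanged, so name-vs-link comparisons still differ without
    // extra project/region context.
    func normalizeLink(s string) string {
    	return strings.TrimPrefix(s, computePrefix)
    }

    func main() {
    	state := "projects/xxx/global/networks/test-gke-network"
    	config := "https://www.googleapis.com/compute/v1/projects/xxx/global/networks/test-gke-network"
    	fmt.Println(normalizeLink(state) == normalizeLink(config)) // true: diff could be suppressed
    }
    ```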

  • weixin_39594895 5 months ago

    Hey, I think the issue you're experiencing has the same root cause as #988 and #1566, which we haven't been able to figure out a fix for yet.

  • weixin_39654352 5 months ago

    I am having this issue with the following attributes:

        id:               "example-cluster" => (forces new resource)
        node_pool.0.name: "primary-pool" => "default-pool" (forces new resource)

  • weixin_39646695 5 months ago

    Hi ,

    I'm not sure that's an issue related to this one. Do you mind opening a new issue and filling out the issue template? We'll need a bunch more information before we can help you with that, unfortunately.

  • weixin_39717598 5 months ago

    I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

    If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 hashibot-feedback@hashicorp.com. Thanks!

