weixin_39683598
2020-11-22 16:45

Fix strongSwan logging configuration

The strongSwan logging configuration is reported as invalid on some OSes (e.g., Ubuntu 16.04) when the strongSwan services start. This is probably due to the strongSwan version change that came with updating the Antrea Docker image to Ubuntu 20.04. This commit fixes the strongSwan logging configuration.

This question comes from the open-source project: vmware-tanzu/antrea


17 answers

  • weixin_39683598 4 months ago

    This PR is meant to fix the ovs.conf error: https://github.com/vmware-tanzu/antrea/pull/1191

  • weixin_39837124 4 months ago

    I have some IPv6 e2e test failures to resolve, so it would be great if you could help with this. Thank you! I noticed that after antrea-ipsec.yml is applied, the antrea-agent Pods repeatedly switch between crashing and running as they keep restarting, and eventually they stay crashed. I am not sure that waiting for just a few seconds will work: the Pods might crash again after the check.

  • weixin_39837124 4 months ago

    I found this issue yesterday as well. I changed /var/log/strongswan/charon.log to charon, without the path field you just added. It still failed, and the logs said that /etc/strongswan.d/ovs.conf contains an unexpected ".". I am not sure whether this solution fixes that as well.
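
For context on the path field mentioned here: as far as I understand the strongSwan changelog (an assumption, not something stated in this thread), releases from 5.7.0 on expect each filelog subsection to carry an arbitrary name plus a `path` key, while older releases use the log file path itself as the subsection name. A minimal Python sketch that renders both forms follows; `render_filelog` and its parameters are hypothetical names for illustration, not part of Antrea or strongSwan.

```python
def render_filelog(log_path, version, section="charon"):
    """Render a charon filelog block in the syntax matching a strongSwan version.

    `version` is a (major, minor, patch) tuple. The 5.7.0 cut-over is an
    assumption taken from the strongSwan changelog, not from this thread.
    """
    if version >= (5, 7, 0):
        # Newer syntax: a named subsection with an explicit `path` key.
        inner = [f"        {section} {{", f"            path = {log_path}"]
    else:
        # Older syntax: the file path itself names the subsection.
        inner = [f"        {log_path} {{"]
    inner += ["            default = 1", "        }"]
    return "\n".join(["charon {", "    filelog {", *inner, "    }", "}"]) + "\n"


if __name__ == "__main__":
    # Print the form that a 5.8.x daemon would be expected to accept.
    print(render_filelog("/var/log/strongswan/charon.log", (5, 8, 2)))
```

Rendering the block from the detected daemon version is only one option; shipping a single file that matches the strongSwan version in the image is simpler if only one version has to be supported.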

  • weixin_39761558 4 months ago

    I think the solution does not have to be tied to IPsec (or should not). So yes, maybe the solution is to spin for a couple seconds and make sure the Pods stay ready.

  • weixin_39683598 4 months ago

    It seems to me that we can either check the container status or run an IPsec command inside the antrea-ipsec container.
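
The container-status option could be sketched roughly as follows, in Python for brevity. The input is assumed to be one Pod object as returned by `kubectl get pod -o json`; `ready`, `restartCount` and `state.waiting.reason` are standard Kubernetes container status fields, while `unhealthy_containers` and the `max_restarts` threshold are hypothetical names chosen for this illustration.

```python
def unhealthy_containers(pod, max_restarts=0):
    """Return the names of containers in a Pod that look unhealthy.

    A container is flagged if it is not ready, has restarted more than
    `max_restarts` times, or is waiting in CrashLoopBackOff.
    """
    flagged = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        waiting = status.get("state", {}).get("waiting") or {}
        if (
            not status.get("ready", False)
            or status.get("restartCount", 0) > max_restarts
            or waiting.get("reason") == "CrashLoopBackOff"
        ):
            flagged.append(status["name"])
    return flagged


if __name__ == "__main__":
    sample = {"status": {"containerStatuses": [
        {"name": "antrea-agent", "ready": True, "restartCount": 0,
         "state": {"running": {}}},
        {"name": "antrea-ipsec", "ready": False, "restartCount": 4,
         "state": {"waiting": {"reason": "CrashLoopBackOff"}}},
    ]}}
    print(unhealthy_containers(sample))
```

Looking at `restartCount` rather than only at `ready` matters here, because a container that restarts in a loop can still be observed as ready at the moment of the check.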

  • weixin_39761558 4 months ago

    We need a generic solution to make sure that the Antrea Agent Pods run correctly after the update to enable IPsec. Currently we do:

    1) apply the new YAML
    2) run kubectl rollout status -w
    3) make sure we have the correct number of Pods available

    Apparently this is not enough to detect this case where the Pod crashes after a few seconds, so we need to improve it. We don't use the MinReadySeconds field in the DaemonSet spec, so we have to come up with something else.
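
One possible "something else" is to require readiness to hold for a whole observation window instead of sampling it once. The sketch below is in Python for brevity and makes several assumptions: `is_ready` is a caller-supplied probe (for example, one that lists the Agent Pods and checks their conditions), and `stays_ready`, `window` and `interval` are hypothetical names. The clock and sleep functions are injectable so the logic can be exercised without real waiting.

```python
import time


def stays_ready(is_ready, window=10.0, interval=1.0,
                clock=time.monotonic, sleep=time.sleep):
    """Return True only if is_ready() holds on every poll during `window` seconds.

    A single failed poll returns False immediately, which catches Pods that
    become ready briefly and then crash a few seconds later.
    """
    deadline = clock() + window
    while True:
        if not is_ready():
            return False
        if clock() >= deadline:
            return True
        sleep(interval)


if __name__ == "__main__":
    # Trivial probe that is always ready; a real probe would query the cluster.
    print(stays_ready(lambda: True, window=2.0, interval=0.5))
```

The window has to be longer than the time the Pods need to crash after starting, so its value is a judgment call for the test environment.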

  • weixin_39683598 4 months ago

    /skip-all

  • weixin_39683598 4 months ago

    You mean to check IPsec started correctly or not, or a better solution for Python3? I can look at the former.

  • weixin_39761558 4 months ago

    My guess is this needs to be changed: https://github.com/vmware-tanzu/antrea/blob/88a8df32b482ae27252bdee3a1496a34cbcf5b0d/test/e2e/connectivity_test.go#L211-L214. I think it returns immediately because it thinks all Pods are ready, but it must be looking at pre-update Pods, and it does not realize the new Pods are crashing.
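
If the check really is reading pre-update state, one way to guard against that is to compare the DaemonSet's observed generation and updated Pod count before trusting availability. The following is a sketch in Python, not the code in connectivity_test.go; the input is assumed to be a DaemonSet object as returned by `kubectl get daemonset -o json`, the status fields are standard DaemonSet API fields, and `rollout_complete` is a hypothetical name.

```python
def rollout_complete(daemonset):
    """Return True if the DaemonSet status reflects the latest spec
    and every desired Pod is both updated and available."""
    generation = daemonset.get("metadata", {}).get("generation", 0)
    status = daemonset.get("status", {})
    desired = status.get("desiredNumberScheduled", 0)
    return (
        # The controller has processed the latest spec, so counts are not stale.
        status.get("observedGeneration", -1) >= generation
        # Every desired Pod runs the updated template...
        and status.get("updatedNumberScheduled", 0) == desired
        # ...and is currently available.
        and status.get("numberAvailable", 0) == desired
    )


if __name__ == "__main__":
    stale = {"metadata": {"generation": 2},
             "status": {"observedGeneration": 1, "desiredNumberScheduled": 3,
                        "updatedNumberScheduled": 3, "numberAvailable": 3}}
    print(rollout_complete(stale))
```

This only rules out stale counts. It does not by itself detect Pods that become available and crash afterwards, so it would need to be combined with a check that readiness persists.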

  • weixin_39761558 4 months ago

    We know that IPsec coverage is broken: https://github.com/vmware-tanzu/antrea/issues/1043. The ovs-monitor-ipsec daemon was actually not starting correctly for a very long time (2 releases), and this did not get caught by the e2e tests. My guess is that the tests do not detect that the Agents do not get started properly after updating the manifest to enable IPsec. The issue is still open and assigned because we need to fix the tests. Do you have cycles to work on it this week? If not, we can re-assign.

  • weixin_39683598 4 months ago

    Should only depend on the Ubuntu version we use to build the Antrea Docker image. But the person who reported the issue mentioned it only happened for certain Node configurations...

    Right. But I do not understand why our CI e2e tests passed with no failure. Do you have an idea?

    OK, my guess is that the IPsec services failed to start, but traffic still goes through without IPsec. Let me think about how to check this.
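
One way to tell the two cases apart is to inspect the inter-Node packets themselves: ESP is IP protocol 50, and ESP encapsulated for NAT traversal travels over UDP port 4500. The Python sketch below is a heuristic under stated assumptions, not an existing Antrea test: it handles raw IPv4 packets only (captured by some means not shown here), the function names are hypothetical, and because UDP port 4500 also carries IKE messages, a match on that port is only a hint.

```python
import struct

ESP_PROTO = 50    # IP protocol number for ESP
UDP_PROTO = 17    # IP protocol number for UDP
NAT_T_PORT = 4500 # UDP port used for ESP-in-UDP (NAT traversal)


def ip_protocol(packet):
    """Return the protocol field of a raw IPv4 packet."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    return packet[9]


def looks_encrypted(packet):
    """Heuristically decide whether an IPv4 packet is IPsec-protected."""
    proto = ip_protocol(packet)
    if proto == ESP_PROTO:
        return True
    if proto == UDP_PROTO:
        header_len = (packet[0] & 0x0F) * 4  # IHL is counted in 32-bit words
        if len(packet) < header_len + 4:
            return False
        src_port, dst_port = struct.unpack("!HH", packet[header_len:header_len + 4])
        return NAT_T_PORT in (src_port, dst_port)
    return False


if __name__ == "__main__":
    # Minimal 20-byte IPv4 header with protocol 47 (GRE), i.e. plaintext tunnel traffic.
    gre = bytes([0x45, 0, 0, 20, 0, 0, 0, 0, 64, 47, 0, 0, 10, 0, 0, 1, 10, 0, 0, 2])
    print(looks_encrypted(gre))
```

A test built on this idea would fail if any tunnel packet between two Nodes is classified as unencrypted while IPsec is supposed to be enabled.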

  • weixin_39683598 4 months ago

    Should only depend on the Ubuntu version we use to build the Antrea Docker image. But the person who reported the issue mentioned it only happened for certain Node configurations...

    Right. But I do not understand why our CI e2e tests passed with no failure. Do you have an idea?

  • weixin_39761558 4 months ago

    Should only depend on the Ubuntu version we use to build the Antrea Docker image. But the person who reported the issue mentioned it only happened for certain Node configurations...

  • weixin_39683858 4 months ago

    I think it is not about the Antrea release, but about the strongSwan version used in our image, which depends on the Ubuntu version.

    Sorry, I meant an earlier Ubuntu release: 16.04 :D

    I did not try, but I remember not (must be the reason I used the previous format).

  • weixin_39683598 4 months ago

    works

    I think it is not about the Antrea release, but about the strongSwan version used in our image, which depends on the Ubuntu version.

  • weixin_39717152 4 months ago

    Codecov Report

    Merging #1184 into master will decrease coverage by 0.07%. The diff coverage is n/a.


    diff
    @@            Coverage Diff             @@
    ##           master    #1184      +/-   ##
    ==========================================
    - Coverage   56.10%   56.02%   -0.08%     
    ==========================================
      Files         106      106              
      Lines       11535    11535              
    ==========================================
    - Hits         6472     6463       -9     
    - Misses       4496     4502       +6     
    - Partials      567      570       +3     
    

    | Flag | Coverage Δ | |
    |---|---|---|
    | #integration-tests | 47.37% <ø> (ø) | |
    | #unit-tests | 41.44% <ø> (-0.09%) | :arrow_down: |

    Flags with carried forward coverage won't be shown.

    | Impacted Files | Coverage Δ | |
    |---|---|---|
    | pkg/apiserver/certificate/certificate.go | 72.83% <0.00%> (-6.18%) | :arrow_down: |
    | pkg/apiserver/storage/ram/watch.go | 85.71% <0.00%> (-3.18%) | :arrow_down: |
    | pkg/apiserver/storage/ram/store.go | 80.39% <0.00%> (-1.31%) | :arrow_down: |

  • weixin_39884074 4 months ago

    Thanks for your PR. Unit tests and code linters are run automatically every time the PR is updated. E2e, conformance and network policy tests can only be triggered by a member of the vmware-tanzu organization. Regular contributors to the project should join the org.

    The following commands are available:

    * /test-e2e: to trigger e2e tests.
    * /skip-e2e: to skip e2e tests.
    * /test-conformance: to trigger conformance tests.
    * /skip-conformance: to skip conformance tests.
    * /test-whole-conformance: to trigger all conformance tests on linux.
    * /skip-whole-conformance: to skip all conformance tests on linux.
    * /test-networkpolicy: to trigger networkpolicy tests.
    * /skip-networkpolicy: to skip networkpolicy tests.
    * /test-windows-conformance: to trigger windows conformance tests.
    * /skip-windows-conformance: to skip windows conformance tests.
    * /test-windows-networkpolicy: to trigger windows networkpolicy tests.
    * /skip-windows-networkpolicy: to skip windows networkpolicy tests.
    * /test-hw-offload: to trigger ovs hardware offload test.
    * /skip-hw-offload: to skip ovs hardware offload test.
    * /test-all: to trigger all tests (except whole conformance).
    * /skip-all: to skip all tests (except whole conformance).
