Skip to content

fix(provider): aws-sd provider null pointer #5404

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 29, 2025

Conversation

ivankatliarchuk
Copy link
Contributor

@ivankatliarchuk ivankatliarchuk commented May 15, 2025

Description

Partially Fixes #5202

Relates #5167

^ In the issue is missing HOW to reproduce, so rely mainly on test fixtures

The issue fixes a case, when record exist, but not managed by current external-dns, aka missing label aws-sd-description. In that case we do a safe skip. Current behaviour - null pointer and crash.

For update,create,delete etc, required labels managed here

sdr.updateLabels(filteredChanges.Create)
sdr.updateLabels(filteredChanges.UpdateNew)
sdr.updateLabels(filteredChanges.UpdateOld)
sdr.updateLabels(filteredChanges.Delete)
return sdr.provider.ApplyChanges(ctx, filteredChanges)
}
func (sdr *AWSSDRegistry) updateLabels(endpoints []*endpoint.Endpoint) {
for _, ep := range endpoints {
if ep.Labels == nil {
ep.Labels = make(map[string]string)
}
ep.Labels[endpoint.OwnerLabelKey] = sdr.ownerID
ep.Labels[endpoint.AWSSDDescriptionLabel] = ep.Labels.SerializePlain(false)
}
}

managed to capture null pointers with test fixtures.

Screenshot 2025-05-15 at 09 42 57
Screenshot 2025-05-15 at 09 42 01

  • Split tests and test fixtures. Refactoring of tests, I could add them in pre-requsit pull request
  • Added early returns
  • Added tests for dry-run
  • Skip record if not managed by external-dns
if len(resp.Instances) == 0 {
  if err := p.DeleteService(ctx, srv); err != nil {
	  log.Errorf("Failed to delete service %q, error: %s", *srv.Name, err)
  }
  continue
}
// 3rd party created a record, external-dns could not manage it
if srv.Description == nil {
  log.Warnf("Skipping service %q as owner id not configured", *srv.Name)
  continue
}

Another potential case for nil pointer is here

Description: aws.String(ep.Labels[endpoint.AWSSDDescriptionLabel]),
, as ep.Labels could be null at this stage. Not changed, will see if there going to be a bug raised.

Screenshot 2025-05-15 at 11 53 50

Checklist

  • Unit tests updated
  • End user documentation updated

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 15, 2025
@k8s-ci-robot k8s-ci-robot requested review from mloiseleur and szuecs May 15, 2025 09:51
@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label May 15, 2025
@k8s-ci-robot k8s-ci-robot changed the title fix(provider): aws-sd provider null pointer feat(aws): add tls ca provision May 15, 2025
@ivankatliarchuk ivankatliarchuk force-pushed the awssd-5202 branch 2 times, most recently from 0a67905 to bab1fe2 Compare May 15, 2025 10:57
@ivankatliarchuk
Copy link
Contributor Author

/retitle fix(provider): aws-sd provider null pointer

@k8s-ci-robot k8s-ci-robot changed the title feat(aws): add tls ca provision fix(provider): aws-sd provider null pointer May 15, 2025
@ivankatliarchuk
Copy link
Contributor Author

/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label May 15, 2025
@mloiseleur
Copy link
Collaborator

@redbaron Do you think you can do a first review on this PR ?

@mloiseleur
Copy link
Collaborator

@JyothsnaGuduri @restlesswind Can you confirm this PR fixes your issue ?

To test from a PR:

gh repo clone kubernetes-sigs/external-dns
cd external-dns
gh pr checkout 5404
kubectl config use-context <context>
go run main.go \
  --kubeconfig=<kubeconfig_path> \
  --xxx=yyy

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 22, 2025
@JyothsnaGuduri
Copy link

I can confirm this PR changes work fine,

I used following command after cloning the repo and PR branch

AWS_PROFILE=<aws-profile> go run main.go --kubeconfig=$HOME/.kube/config --namespace=external-dns --log-level=info --log-format=text --interval=10m --source=service --source=ingress --policy=upsert-only --registry=aws-sd --txt-owner-id=<owner-id> --domain-filter=<domain-filter> --provider=aws-sd

find the logs below

INFO[0000] config: {APIServerURL: KubeConfig:/Users/<user>/.kube/config RequestTimeout:30s DefaultTargets:[] ContourLoadBalancerService:heptio-contour/contour GlooNamespace:gloo-system SkipperRouteGroupVersion:zalando.org/v1 Sources:[service ingress] Namespace:external-dns AnnotationFilter: LabelFilter: FQDNTemplate: CombineFQDNAndAnnotation:false IgnoreHostnameAnnotation:false IgnoreIngressTLSSpec:false IgnoreIngressRulesSpec:false GatewayNamespace: GatewayLabelFilter: Compatibility: PublishInternal:false PublishHostIP:false AlwaysPublishNotReadyAddresses:false ConnectorSourceServer:localhost:8080 Provider:aws-sd GoogleProject: GoogleBatchChangeSize:1000 GoogleBatchChangeInterval:1s GoogleZoneVisibility: DomainFilter:[<domainFilter>] ExcludeDomains:[] RegexDomainFilter: RegexDomainExclusion: ZoneNameFilter:[] ZoneIDFilter:[] TargetNetFilter:[] ExcludeTargetNets:[] AlibabaCloudConfigFile:/etc/kubernetes/alibaba-cloud.json AlibabaCloudZoneType: AWSZoneType: AWSZoneTagFilter:[] AWSAssumeRole: AWSAssumeRoleExternalID: AWSBatchChangeSize:1000 AWSBatchChangeInterval:1s AWSEvaluateTargetHealth:true AWSAPIRetries:3 AWSPreferCNAME:false AWSZoneCacheDuration:0s AWSSDServiceCleanup:false AzureConfigFile:/etc/kubernetes/azure.json AzureResourceGroup: AzureSubscriptionID: AzureUserAssignedIdentityClientID: BluecatDNSConfiguration: BluecatConfigFile:/etc/kubernetes/bluecat.json BluecatDNSView: BluecatGatewayHost: BluecatRootZone: BluecatDNSServerName: BluecatDNSDeployType:no-deploy BluecatSkipTLSVerify:false CloudflareProxied:false CloudflareDNSRecordsPerPage:100 CoreDNSPrefix:/skydns/ RcodezeroTXTEncrypt:false AkamaiServiceConsumerDomain: AkamaiClientToken: AkamaiClientSecret: AkamaiAccessToken: AkamaiEdgercPath: AkamaiEdgercSection: InfobloxGridHost: InfobloxWapiPort:443 InfobloxWapiUsername:admin InfobloxWapiPassword: InfobloxWapiVersion:2.3.1 InfobloxSSLVerify:true InfobloxView: InfobloxMaxResults:0 InfobloxFQDNRegEx: InfobloxNameRegEx: InfobloxCreatePTR:false InfobloxCacheDuration:0 DynCustomerName: DynUsername: DynPassword: DynMinTTLSeconds:0 OCIConfigFile:/etc/kubernetes/oci.yaml InMemoryZones:[] OVHEndpoint:ovh-eu OVHApiRateLimit:20 PDNSServer:http://localhost:8081 PDNSAPIKey: PDNSTLSEnabled:false TLSCA: TLSClientCert: TLSClientCertKey: Policy:upsert-only Registry:aws-sd TXTOwnerID:eks-sandbox TXTPrefix: TXTSuffix: Interval:10m0s MinEventSyncInterval:5s Once:false DryRun:false UpdateEvents:false LogFormat:text MetricsAddress::7979 LogLevel:info TXTCacheInterval:0s TXTWildcardReplacement: ExoscaleEndpoint:https://api.exoscale.ch/dns ExoscaleAPIKey: ExoscaleAPISecret: CRDSourceAPIVersion:externaldns.k8s.io/v1alpha1 CRDSourceKind:DNSEndpoint ServiceTypeFilter:[] CFAPIEndpoint: CFUsername: CFPassword: RFC2136Host: RFC2136Port:0 RFC2136Zone: RFC2136Insecure:false RFC2136GSSTSIG:false RFC2136KerberosRealm: RFC2136KerberosUsername: RFC2136KerberosPassword: RFC2136TSIGKeyName: RFC2136TSIGSecret: RFC2136TSIGSecretAlg: RFC2136TAXFR:false RFC2136MinTTL:0s RFC2136BatchChangeSize:50 NS1Endpoint: NS1IgnoreSSL:false NS1MinTTLSeconds:0 TransIPAccountName: TransIPPrivateKeyFile: DigitalOceanAPIPageSize:50 ManagedDNSRecordTypes:[A CNAME] GoDaddyAPIKey: GoDaddySecretKey: GoDaddyTTL:0 GoDaddyOTE:false OCPRouterName: IBMCloudProxied:false IBMCloudConfigFile:/etc/kubernetes/ibmcloud.json TencentCloudConfigFile:/etc/kubernetes/tencent-cloud.json TencentCloudZoneType: PiholeServer: PiholePassword: PiholeTLSInsecureSkipVerify:false PluralCluster: PluralProvider:} 
INFO[0000] Instantiating new Kubernetes client          
INFO[0000] Using kubeConfig                             
INFO[0000] Created Kubernetes client https://<k8s-cluster>.gr7.us-west-2.eks.amazonaws.com 
INFO[0004] All records are already up to date  

Hope this helps

@mloiseleur
Copy link
Collaborator

Thanks 👍.
@ivankatliarchuk Once you have rebased, I think we can merge it.

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 29, 2025
@mloiseleur
Copy link
Collaborator

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 29, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mloiseleur

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 29, 2025
@k8s-ci-robot k8s-ci-robot merged commit 95c2c72 into kubernetes-sigs:master May 29, 2025
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

External-dns >0.15.1 failing with nil pointer dereference on startup
4 participants