[OTAGENT-512] Add OTel Agent Gateway feature implementation #2483
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #2483 +/- ##
==========================================
+ Coverage 38.09% 38.16% +0.06%
==========================================
Files 299 300 +1
Lines 25182 25461 +279
==========================================
+ Hits 9594 9718 +124
- Misses 14853 15002 +149
- Partials 735 741 +6
b0aeb8a to 262ecb3
262ecb3 to e33ef74
internalTrafficPolicy := corev1.ServiceInternalTrafficPolicyLocal
if err := managers.ServiceManager().AddService(
	f.localServiceName,
	f.owner.GetNamespace(),
	common.GetOtelAgentGatewayServiceSelector(f.owner),
	[]corev1.ServicePort{*otlpGrpcPort, *otlpHttpPort},
	&internalTrafficPolicy,
); err != nil {
	return err
}
Pretty sure you do NOT want to set the internal traffic policy to Local here. That's intended for DaemonSet components. With it, pods would not see any endpoints if they're not on the same node as the gateway deployment. You wouldn't notice on a 1-node kind cluster since everything is on the same node: https://kubernetes.io/docs/concepts/services-networking/service-traffic-policy/#how-it-works
So it should remain nil, and you'd need to update the QA instructions to test this scenario properly (2-node cluster, deploy the client on a non-gateway node, etc.).
Good call. Yes, the gateway should use cluster traffic (the default). I've fixed this and verified that it works in a 3-node kind cluster.
Updated QA too
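The concern raised in the review thread above can be illustrated with a small, self-contained model. This is only a sketch of the routing rule described in the linked Kubernetes documentation, not operator or kube-proxy code: the `Endpoint` type, the `ReachableEndpoints` function, and the pod and node names are all hypothetical and exist purely for illustration.

```go
package main

import "fmt"

// Endpoint is a hypothetical, simplified stand-in for a Service endpoint:
// a backing pod and the node it runs on.
type Endpoint struct {
	Pod  string
	Node string
}

// TrafficPolicy mirrors the two values of a Service's internalTrafficPolicy.
type TrafficPolicy string

const (
	PolicyCluster TrafficPolicy = "Cluster" // default: any ready endpoint in the cluster
	PolicyLocal   TrafficPolicy = "Local"   // only endpoints on the client's own node
)

// ReachableEndpoints returns the endpoints that in-cluster traffic from
// clientNode can be routed to under the given policy (illustrative model only).
func ReachableEndpoints(policy TrafficPolicy, clientNode string, all []Endpoint) []Endpoint {
	if policy != PolicyLocal {
		return all
	}
	var local []Endpoint
	for _, ep := range all {
		if ep.Node == clientNode {
			local = append(local, ep)
		}
	}
	return local
}

func main() {
	// Assumed layout: a single gateway pod on node-a, clients on node-a and node-b.
	gateway := []Endpoint{{Pod: "gateway-0", Node: "node-a"}}
	for _, policy := range []TrafficPolicy{PolicyLocal, PolicyCluster} {
		for _, node := range []string{"node-a", "node-b"} {
			n := len(ReachableEndpoints(policy, node, gateway))
			fmt.Printf("policy=%s client=%s reachable=%d\n", policy, node, n)
		}
	}
}
```

In this model a client on node-b sees no gateway endpoints under the Local policy, which is why a single-node kind cluster hides the problem and a multi-node cluster is needed for QA.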
What does this PR do?
Add OTel Agent Gateway feature implementation. Reference implementation in helm charts: http://github.com/DataDog/helm-charts/blob/main/charts/datadog/templates/otel-agent-gateway-deployment.yaml
Motivation
Support OTel Agent Gateway in operator. See how it currently works in helm charts: https://docs.datadoghq.com/opentelemetry/setup/ddot_collector/install/kubernetes_gateway/
Additional Notes
HPA is currently not natively supported in the operator, so we may require users to define the HPA as a separate resource.
Replicas, node selector, affinity and other configs will be supported in the next PR.
Minimum Agent Versions
Are there minimum versions of the Datadog Agent and/or Cluster Agent required?
Describe your test plan
Built the operator locally and deployed to a kind cluster with OTel Agent Gateway enabled.
For QA, use the following config:
Verify:
datadog-otel-test-otel-agent-gateway is created
datadog-otel-test-otel-agent-gateway is created and listening on 4317/TCP, 4318/TCP
datadog-otel-test-otel-agent-gateway-<pod> is created and running
datadog-otel-test-otel-agent-gateway binds to the pod above
otel-agent container in the node agent pod has a config that sends to the k8s service above
datadog-otel-test-otel-agent-gateway and its binding are created
Additionally, deploy the OTel test client below. Ideally you have a multi-node cluster and deploy the client to a different node than the one the gateway runs on, to test cross-node routing.
The client sends OTLP data to the otel-agent in the node agent pod (NOT the OTel Agent Gateway), at datadog-otel-test-agent on port 4317. The otel-agent in the node agent pod then forwards the data to the OTel Agent Gateway, and the gateway eventually sends the data to Datadog.
Checklist
bug, enhancement, refactoring, documentation, tooling, and/or dependencies
qa/skip-qa label