Add per db metrics #4183
Changes from all commits
@@ -0,0 +1,161 @@
# This list of queries configures an OTel SQL Query Receiver to read pgMonitor
# metrics from Postgres.
#
# https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/-/receiver/sqlqueryreceiver#metrics-queries
# https://github.com/CrunchyData/pgmonitor/blob/v5.2.1/sql_exporter/common/crunchy_per_db_collector.yml
#
# Note: Several metrics in the `crunchy_per_db_collector` track the materialized
# views and the pgMonitor-extension version, neither of which is meaningful in
# the CPK environment. The metrics that fall into this category include:
# * ccp_metric_matview_refresh_last_run_fail_count
# * ccp_metric_matview_refresh_longest_runtime_seconds
# * ccp_metric_matview_refresh_longest_runtime
# * ccp_metric_table_refresh_longest_runtime
# * ccp_pgmonitor_extension_per_db
- sql: >
    SELECT current_database() as dbname
    , n.nspname as schemaname
    , c.relname
    , pg_catalog.pg_total_relation_size(c.oid) as bytes
    FROM pg_catalog.pg_class c
    JOIN pg_catalog.pg_namespace n ON c.relnamespace = n.oid
    WHERE NOT pg_is_other_temp_schema(n.oid)
    AND relkind IN ('r', 'm', 'f');
  metrics:
    - metric_name: ccp_table_size_bytes
      value_type: double
      value_column: bytes
      description: "Table size in bytes including indexes"
      attribute_columns: ["dbname", "schemaname", "relname"]
      static_attributes:
        server: "localhost:5432"
- sql: >
    SELECT current_database() as dbname
    , p.schemaname
    , p.relname
    , p.seq_scan
    , p.seq_tup_read
    , COALESCE(p.idx_scan, 0) AS idx_scan
    , COALESCE(p.idx_tup_fetch, 0) as idx_tup_fetch
    , p.n_tup_ins
    , p.n_tup_upd
    , p.n_tup_del
    , p.n_tup_hot_upd
    , CASE
        WHEN current_setting('server_version_num')::int >= 160000
        THEN p.n_tup_newpage_upd
        ELSE 0::bigint
      END AS n_tup_newpage_upd
    , p.n_live_tup
    , p.n_dead_tup
    , p.vacuum_count
    , p.autovacuum_count
    , p.analyze_count
    , p.autoanalyze_count
    FROM pg_catalog.pg_stat_user_tables p;
  metrics:
    - metric_name: ccp_stat_user_tables_seq_scan
      data_type: sum
      value_column: seq_scan
      description: "Number of sequential scans initiated on this table"
      attribute_columns: ["dbname", "schemaname", "relname"]
      static_attributes:
        server: "localhost:5432"
    - metric_name: ccp_stat_user_tables_seq_tup_read
      data_type: sum
      value_column: seq_tup_read
      description: "Number of live rows fetched by sequential scans"
      attribute_columns: ["dbname", "schemaname", "relname"]
      static_attributes:
        server: "localhost:5432"
    - metric_name: ccp_stat_user_tables_idx_scan
      data_type: sum
      description: "Number of index scans initiated on this table"
      value_column: idx_scan
      static_attributes:
        server: "localhost:5432"
      attribute_columns: ["dbname", "schemaname", "relname"]
    - metric_name: ccp_stat_user_tables_idx_tup_fetch
      data_type: sum
      description: "Number of live rows fetched by index scans"
      value_column: idx_tup_fetch
      static_attributes:
        server: "localhost:5432"
      attribute_columns: ["dbname", "schemaname", "relname"]
    - metric_name: ccp_stat_user_tables_n_tup_ins
      data_type: sum
      description: "Number of rows inserted"
      value_column: n_tup_ins
      static_attributes:
        server: "localhost:5432"
      attribute_columns: ["dbname", "schemaname", "relname"]
    - metric_name: ccp_stat_user_tables_n_tup_upd
      data_type: sum
      description: "Number of rows updated"
      value_column: n_tup_upd
      static_attributes:
        server: "localhost:5432"
      attribute_columns: ["dbname", "schemaname", "relname"]
    - metric_name: ccp_stat_user_tables_n_tup_del
      data_type: sum
      description: "Number of rows deleted"
      value_column: n_tup_del
      static_attributes:
        server: "localhost:5432"
      attribute_columns: ["dbname", "schemaname", "relname"]
    - metric_name: ccp_stat_user_tables_n_tup_hot_upd
      data_type: sum
      description: "Number of rows HOT updated (i.e., with no separate index update required)"
      value_column: n_tup_hot_upd
      static_attributes:
        server: "localhost:5432"
      attribute_columns: ["dbname", "schemaname", "relname"]
    - metric_name: ccp_stat_user_tables_n_tup_newpage_upd
      data_type: sum
      description: "Number of rows updated where the successor version goes onto a new heap page, leaving behind an original version with a t_ctid field that points to a different heap page. These are always non-HOT updates."
      value_column: n_tup_newpage_upd
      static_attributes:
        server: "localhost:5432"
      attribute_columns: ["dbname", "schemaname", "relname"]
    - metric_name: ccp_stat_user_tables_n_live_tup
      description: "Estimated number of live rows"
      value_column: n_live_tup
      static_attributes:
        server: "localhost:5432"
      attribute_columns: ["dbname", "schemaname", "relname"]
    - metric_name: ccp_stat_user_tables_n_dead_tup
      description: "Estimated number of dead rows"
      value_column: n_dead_tup
      static_attributes:
        server: "localhost:5432"
      attribute_columns: ["dbname", "schemaname", "relname"]
    - metric_name: ccp_stat_user_tables_vacuum_count
      data_type: sum
      description: "Number of times this table has been manually vacuumed (not counting VACUUM FULL)"
      value_column: vacuum_count
      static_attributes:
        server: "localhost:5432"
      attribute_columns: ["dbname", "schemaname", "relname"]
    - metric_name: ccp_stat_user_tables_autovacuum_count
      data_type: sum
      description: "Number of times this table has been vacuumed by the autovacuum daemon"
      value_column: autovacuum_count
      static_attributes:
        server: "localhost:5432"
      attribute_columns: ["dbname", "schemaname", "relname"]
    - metric_name: ccp_stat_user_tables_analyze_count
      data_type: sum
      description: "Number of times this table has been manually analyzed"
      value_column: analyze_count
      static_attributes:
        server: "localhost:5432"
      attribute_columns: ["dbname", "schemaname", "relname"]
    - metric_name: ccp_stat_user_tables_autoanalyze_count
      data_type: sum
      description: "Number of times this table has been analyzed by the autovacuum daemon"
      value_column: autoanalyze_count
      static_attributes:
        server: "localhost:5432"
      attribute_columns: ["dbname", "schemaname", "relname"]
@@ -21,6 +21,9 @@ import (
 //go:embed "generated/postgres_5s_metrics.json"
 var fiveSecondMetrics json.RawMessage

+//go:embed "generated/postgres_5m_per_db_metrics.json"
+var fiveMinutePerDBMetrics json.RawMessage
+
 //go:embed "generated/postgres_5m_metrics.json"
 var fiveMinuteMetrics json.RawMessage
@@ -33,15 +36,9 @@ var ltPG17Fast json.RawMessage
 //go:embed "generated/eq_pg16_fast_metrics.json"
 var eqPG16Fast json.RawMessage

-//go:embed "generated/gte_pg16_slow_metrics.json"
-var gtePG16Slow json.RawMessage
-
 //go:embed "generated/lt_pg16_fast_metrics.json"
 var ltPG16Fast json.RawMessage

 //go:embed "generated/lt_pg16_slow_metrics.json"
 var ltPG16Slow json.RawMessage

 type queryMetrics struct {
 	Metrics []*metric `json:"metrics"`
 	Query   string    `json:"sql"`
@@ -71,6 +68,7 @@ func EnablePostgresMetrics(ctx context.Context, inCluster *v1beta1.PostgresClust
 	// will continually append to it and blow up our ConfigMap
 	fiveSecondMetricsClone := slices.Clone(fiveSecondMetrics)
 	fiveMinuteMetricsClone := slices.Clone(fiveMinuteMetrics)
+	fiveMinutePerDBMetricsClone := slices.Clone(fiveMinutePerDBMetrics)

 	if inCluster.Spec.PostgresVersion >= 17 {
 		fiveSecondMetricsClone, err = appendToJSONArray(fiveSecondMetricsClone, gtePG17Fast)
@@ -91,20 +89,11 @@ func EnablePostgresMetrics(ctx context.Context, inCluster *v1beta1.PostgresClust
 			log.Error(err, "error compiling metrics for postgres 16")
 		}

-	if inCluster.Spec.PostgresVersion >= 16 {
-		fiveMinuteMetricsClone, err = appendToJSONArray(fiveMinuteMetricsClone, gtePG16Slow)
-		if err != nil {
-			log.Error(err, "error compiling metrics for postgres 16 and greater")
-		}
-	} else {
+	if inCluster.Spec.PostgresVersion < 16 {
 		fiveSecondMetricsClone, err = appendToJSONArray(fiveSecondMetricsClone, ltPG16Fast)
 		if err != nil {
 			log.Error(err, "error compiling fast metrics for postgres versions less than 16")
 		}
 		fiveMinuteMetricsClone, err = appendToJSONArray(fiveMinuteMetricsClone, ltPG16Slow)
 		if err != nil {
 			log.Error(err, "error compiling slow metrics for postgres versions less than 16")
 		}
 	}

 	// Remove any queries that user has specified in the spec
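The helper `appendToJSONArray` used above is not part of this diff. A plausible sketch, under the assumption that it simply concatenates two JSON arrays of query definitions; the operator's actual implementation may differ:

```go
import "encoding/json"

// appendToJSONArray concatenates two JSON arrays into one.
// Hypothetical reconstruction; the real helper is defined elsewhere.
func appendToJSONArray(a, b json.RawMessage) (json.RawMessage, error) {
	var arrA, arrB []json.RawMessage
	if err := json.Unmarshal(a, &arrA); err != nil {
		return nil, err
	}
	if err := json.Unmarshal(b, &arrB); err != nil {
		return nil, err
	}
	return json.Marshal(append(arrA, arrB...))
}
```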
@@ -117,7 +106,7 @@ func EnablePostgresMetrics(ctx context.Context, inCluster *v1beta1.PostgresClust
 		var fiveSecondMetricsArr []queryMetrics
 		err := json.Unmarshal(fiveSecondMetricsClone, &fiveSecondMetricsArr)
 		if err != nil {
-			log.Error(err, "error compiling postgres metrics")
+			log.Error(err, "error compiling five second postgres metrics")
 		}

 		// Remove any specified metrics from the five second metrics

@@ -128,19 +117,31 @@ func EnablePostgresMetrics(ctx context.Context, inCluster *v1beta1.PostgresClust
 		var fiveMinuteMetricsArr []queryMetrics
 		err = json.Unmarshal(fiveMinuteMetricsClone, &fiveMinuteMetricsArr)
 		if err != nil {
-			log.Error(err, "error compiling postgres metrics")
+			log.Error(err, "error compiling five minute postgres metrics")
 		}

 		// Remove any specified metrics from the five minute metrics
 		fiveMinuteMetricsArr = removeMetricsFromQueries(
 			inCluster.Spec.Instrumentation.Metrics.CustomQueries.Remove, fiveMinuteMetricsArr)

+		// Convert json to array of queryMetrics objects
+		var fiveMinutePerDBMetricsArr []queryMetrics
+		err = json.Unmarshal(fiveMinutePerDBMetricsClone, &fiveMinutePerDBMetricsArr)
+		if err != nil {
+			log.Error(err, "error compiling per-db postgres metrics")
+		}
+
+		// Remove any specified metrics from the five minute per-db metrics
+		fiveMinutePerDBMetricsArr = removeMetricsFromQueries(
+			inCluster.Spec.Instrumentation.Metrics.CustomQueries.Remove, fiveMinutePerDBMetricsArr)
+
 		// Convert back to json data
 		// The error return value can be ignored as the errchkjson linter
 		// deems the []queryMetrics to be a safe argument:
 		// https://github.com/breml/errchkjson
 		fiveSecondMetricsClone, _ = json.Marshal(fiveSecondMetricsArr)
 		fiveMinuteMetricsClone, _ = json.Marshal(fiveMinuteMetricsArr)
+		fiveMinutePerDBMetricsClone, _ = json.Marshal(fiveMinutePerDBMetricsArr)
 	}

 	// Add Prometheus exporter
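`removeMetricsFromQueries` is likewise defined outside this diff. Judging from the call sites, it presumably filters out any metric whose name appears in the remove list and drops queries left with no metrics; a sketch under that assumption, reusing the trimmed types from the earlier example:

```go
// removeMetricsFromQueries drops metrics named in removeList, then
// discards any query that no longer defines metrics. Sketch only; the
// operator's real function may behave differently.
func removeMetricsFromQueries(removeList []string, queries []queryMetrics) []queryMetrics {
	remove := make(map[string]bool, len(removeList))
	for _, name := range removeList {
		remove[name] = true
	}

	kept := make([]queryMetrics, 0, len(queries))
	for _, q := range queries {
		metrics := make([]*metric, 0, len(q.Metrics))
		for _, m := range q.Metrics {
			if !remove[m.MetricName] {
				metrics = append(metrics, m)
			}
		}
		if len(metrics) > 0 {
			q.Metrics = metrics
			kept = append(kept, q)
		}
	}
	return kept
}
```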
@@ -180,31 +181,65 @@ func EnablePostgresMetrics(ctx context.Context, inCluster *v1beta1.PostgresClust
 		Exporters: []ComponentID{Prometheus},
 	}

-	// Add custom queries if they are defined in the spec
+	// Add custom queries and per-db metrics if they are defined in the spec
 	if inCluster.Spec.Instrumentation != nil &&
-		inCluster.Spec.Instrumentation.Metrics != nil &&
-		inCluster.Spec.Instrumentation.Metrics.CustomQueries != nil &&
-		inCluster.Spec.Instrumentation.Metrics.CustomQueries.Add != nil {
-
-		for _, querySet := range inCluster.Spec.Instrumentation.Metrics.CustomQueries.Add {
-			// Create a receiver for the query set
-			receiverName := "sqlquery/" + querySet.Name
-			config.Receivers[receiverName] = map[string]any{
-				"driver": "postgres",
-				"datasource": fmt.Sprintf(
-					`host=localhost dbname=postgres port=5432 user=%s password=${env:PGPASSWORD}`,
-					MonitoringUser),
-				"collection_interval": querySet.CollectionInterval,
-				// Give Postgres time to finish setup.
-				"initial_delay": "15s",
-				"queries": "${file:/etc/otel-collector/" +
-					querySet.Name + "/" + querySet.Queries.Key + "}",
-			}
-
-			// Add the receiver to the pipeline
-			pipeline := config.Pipelines[PostgresMetrics]
-			pipeline.Receivers = append(pipeline.Receivers, receiverName)
-			config.Pipelines[PostgresMetrics] = pipeline
-		}
-	}
+		inCluster.Spec.Instrumentation.Metrics != nil {
+
+		if inCluster.Spec.Instrumentation.Metrics.CustomQueries != nil &&
+			inCluster.Spec.Instrumentation.Metrics.CustomQueries.Add != nil {
+
+			for _, querySet := range inCluster.Spec.Instrumentation.Metrics.CustomQueries.Add {
+				// Create a receiver for the query set
+
+				dbs := []string{"postgres"}
+				if len(querySet.Databases) != 0 {
+					dbs = querySet.Databases
+				}

Comment on lines +194 to +197:

  Reviewer: Definitely need to make a note in the documentation that, if the user provides any databases for a custom query set, the default "postgres" database will not be included (unless they include it themselves).

  Author: For sure, working on the documentation now.

+
+				for _, db := range dbs {
+					receiverName := fmt.Sprintf(
+						"sqlquery/%s-%s", querySet.Name, db)

Comment on lines +199 to +200:

  Reviewer: 👍

+
+					config.Receivers[receiverName] = map[string]any{
+						"driver": "postgres",
+						"datasource": fmt.Sprintf(
+							`host=localhost dbname=%s port=5432 user=%s password=${env:PGPASSWORD}`,
+							db,
+							MonitoringUser),
+						"collection_interval": querySet.CollectionInterval,
+						// Give Postgres time to finish setup.
+						"initial_delay": "15s",
+						"queries": "${file:/etc/otel-collector/" +
+							querySet.Name + "/" + querySet.Queries.Key + "}",
+					}
+
+					// Add the receiver to the pipeline
+					pipeline := config.Pipelines[PostgresMetrics]
+					pipeline.Receivers = append(pipeline.Receivers, receiverName)
+					config.Pipelines[PostgresMetrics] = pipeline
+				}
+			}
+		}
+		if inCluster.Spec.Instrumentation.Metrics.PerDBMetricTargets != nil {
+
+			for _, db := range inCluster.Spec.Instrumentation.Metrics.PerDBMetricTargets {
+				// Create a receiver for the query set for the db
+				receiverName := "sqlquery/" + db
+				config.Receivers[receiverName] = map[string]any{
+					"driver": "postgres",
+					"datasource": fmt.Sprintf(
+						`host=localhost dbname=%s port=5432 user=%s password=${env:PGPASSWORD}`,
+						db,
+						MonitoringUser),
+					"collection_interval": "5m",
+					// Give Postgres time to finish setup.
+					"initial_delay": "15s",
+					"queries": slices.Clone(fiveMinutePerDBMetricsClone),
+				}
+
+				// Add the receiver to the pipeline
+				pipeline := config.Pipelines[PostgresMetrics]
+				pipeline.Receivers = append(pipeline.Receivers, receiverName)
+				config.Pipelines[PostgresMetrics] = pipeline
+			}
+		}
+	}
 }
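To make the fan-out concrete: each entry in `perDBMetricTargets` becomes its own `sqlquery/<db>` receiver whose datasource points at that database, and each receiver is appended to the Postgres metrics pipeline. A standalone illustration of the naming and DSN construction; the `ccp_monitoring` value is a stand-in for the operator's `MonitoringUser` constant:

```go
package main

import "fmt"

// Stand-in for the operator's MonitoringUser constant.
const MonitoringUser = "ccp_monitoring"

func main() {
	perDBMetricTargets := []string{"pikachu", "onix"}

	// One receiver per target database, named sqlquery/<db>.
	receivers := map[string]string{}
	for _, db := range perDBMetricTargets {
		receivers["sqlquery/"+db] = fmt.Sprintf(
			`host=localhost dbname=%s port=5432 user=%s password=${env:PGPASSWORD}`,
			db, MonitoringUser)
	}

	for name, dsn := range receivers {
		fmt.Println(name, "=>", dsn)
	}
}
```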
@@ -0,0 +1,4 @@
apiVersion: kuttl.dev/v1beta1
kind: TestStep
apply:
- files/11--add-per-db-metrics.yaml
@@ -0,0 +1,32 @@
apiVersion: kuttl.dev/v1beta1
kind: TestAssert
commands:
# First, check that all containers in the instance pod are ready.
# Then, grab the collector metrics output and check that the per-db metrics
# are present for the single added target.
- script: |
    retry() { bash -ceu 'printf "$1\nSleeping...\n" && sleep 5' - "$@"; }
    check_containers_ready() { bash -ceu 'echo "$1" | jq -e ".[] | select(.type==\"ContainersReady\") | .status==\"True\""' - "$@"; }
    contains() { bash -ceu '[[ "$1" == *"$2"* ]]' - "$@"; }
    pod=$(kubectl get pods -o name -n "${NAMESPACE}" \
      -l postgres-operator.crunchydata.com/cluster=otel-cluster,postgres-operator.crunchydata.com/data=postgres)
    [ "$pod" = "" ] && retry "Pod not found" && exit 1
    condition_json=$(kubectl get "${pod}" -n "${NAMESPACE}" -o jsonpath="{.status.conditions}")
    [ "$condition_json" = "" ] && retry "conditions not found" && exit 1
    { check_containers_ready "$condition_json"; } || {
      retry "containers not ready"
      exit 1
    }
    scrape_metrics=$(kubectl exec "${pod}" -c collector -n "${NAMESPACE}" -- \
      curl --insecure --silent http://localhost:9187/metrics)
    { contains "${scrape_metrics}" 'ccp_table_size_bytes{dbname="pikachu"'; } || {
      retry "ccp_table_size_bytes not found for pikachu"
      exit 1
    }
    { ! contains "${scrape_metrics}" 'ccp_table_size_bytes{dbname="onix"'; } || {
      retry "ccp_table_size_bytes found for onix"
      exit 1
    }
@@ -0,0 +1,4 @@
apiVersion: kuttl.dev/v1beta1
kind: TestStep
apply:
- files/13--add-per-db-metrics.yaml
@@ -0,0 +1,32 @@
apiVersion: kuttl.dev/v1beta1
kind: TestAssert
commands:
# First, check that all containers in the instance pod are ready.
# Then, grab the collector metrics output and check that the per-db metrics
# are present for both added targets.
- script: |
    retry() { bash -ceu 'printf "$1\nSleeping...\n" && sleep 5' - "$@"; }
    check_containers_ready() { bash -ceu 'echo "$1" | jq -e ".[] | select(.type==\"ContainersReady\") | .status==\"True\""' - "$@"; }
    contains() { bash -ceu '[[ "$1" == *"$2"* ]]' - "$@"; }
    pod=$(kubectl get pods -o name -n "${NAMESPACE}" \
      -l postgres-operator.crunchydata.com/cluster=otel-cluster,postgres-operator.crunchydata.com/data=postgres)
    [ "$pod" = "" ] && retry "Pod not found" && exit 1
    condition_json=$(kubectl get "${pod}" -n "${NAMESPACE}" -o jsonpath="{.status.conditions}")
    [ "$condition_json" = "" ] && retry "conditions not found" && exit 1
    { check_containers_ready "$condition_json"; } || {
      retry "containers not ready"
      exit 1
    }
    scrape_metrics=$(kubectl exec "${pod}" -c collector -n "${NAMESPACE}" -- \
      curl --insecure --silent http://localhost:9187/metrics)
    { contains "${scrape_metrics}" 'ccp_table_size_bytes{dbname="pikachu"'; } || {
      retry "ccp_table_size_bytes not found for pikachu"
      exit 1
    }
    { contains "${scrape_metrics}" 'ccp_table_size_bytes{dbname="onix"'; } || {
      retry "ccp_table_size_bytes not found for onix"
      exit 1
    }
@@ -0,0 +1,4 @@
apiVersion: kuttl.dev/v1beta1
kind: TestStep
apply:
- files/15--remove-per-db-metrics.yaml
@@ -0,0 +1,32 @@
apiVersion: kuttl.dev/v1beta1
kind: TestAssert
commands:
# First, check that all containers in the instance pod are ready.
# Then, grab the collector metrics output and check that the per-db metrics
# are absent from the targets since they've been removed.
- script: |
    retry() { bash -ceu 'printf "$1\nSleeping...\n" && sleep 5' - "$@"; }
    check_containers_ready() { bash -ceu 'echo "$1" | jq -e ".[] | select(.type==\"ContainersReady\") | .status==\"True\""' - "$@"; }
    contains() { bash -ceu '[[ "$1" == *"$2"* ]]' - "$@"; }
    pod=$(kubectl get pods -o name -n "${NAMESPACE}" \
      -l postgres-operator.crunchydata.com/cluster=otel-cluster,postgres-operator.crunchydata.com/data=postgres)
    [ "$pod" = "" ] && retry "Pod not found" && exit 1
    condition_json=$(kubectl get "${pod}" -n "${NAMESPACE}" -o jsonpath="{.status.conditions}")
    [ "$condition_json" = "" ] && retry "conditions not found" && exit 1
    { check_containers_ready "$condition_json"; } || {
      retry "containers not ready"
      exit 1
    }
    scrape_metrics=$(kubectl exec "${pod}" -c collector -n "${NAMESPACE}" -- \
      curl --insecure --silent http://localhost:9187/metrics)
    { ! contains "${scrape_metrics}" 'ccp_table_size_bytes{dbname="pikachu"'; } || {
      retry "ccp_table_size_bytes found for pikachu"
      exit 1
    }
    { ! contains "${scrape_metrics}" 'ccp_table_size_bytes{dbname="onix"'; } || {
      retry "ccp_table_size_bytes found for onix"
      exit 1
    }
@@ -0,0 +1,6 @@
apiVersion: kuttl.dev/v1beta1
kind: TestStep
apply:
- files/17--add-custom-queries-per-db.yaml
assert:
- files/17-custom-queries-per-db-added.yaml
@@ -0,0 +1,42 @@
apiVersion: kuttl.dev/v1beta1
kind: TestAssert
commands:
# First, check that all containers in the instance pod are ready.
# Then, grab the collector metrics output and check that the two custom
# metrics that we added are present only for the targets that were specified.
- script: |
    retry() { bash -ceu 'printf "$1\nSleeping...\n" && sleep 5' - "$@"; }
    check_containers_ready() { bash -ceu 'echo "$1" | jq -e ".[] | select(.type==\"ContainersReady\") | .status==\"True\""' - "$@"; }
    contains() { bash -ceu '[[ "$1" == *"$2"* ]]' - "$@"; }
    pod=$(kubectl get pods -o name -n "${NAMESPACE}" \
      -l postgres-operator.crunchydata.com/cluster=otel-cluster,postgres-operator.crunchydata.com/data=postgres)
    [ "$pod" = "" ] && retry "Pod not found" && exit 1
    condition_json=$(kubectl get "${pod}" -n "${NAMESPACE}" -o jsonpath="{.status.conditions}")
    [ "$condition_json" = "" ] && retry "conditions not found" && exit 1
    { check_containers_ready "$condition_json"; } || {
      retry "containers not ready"
      exit 1
    }
    scrape_metrics=$(kubectl exec "${pod}" -c collector -n "${NAMESPACE}" -- \
      curl --insecure --silent http://localhost:9187/metrics)
    { contains "${scrape_metrics}" 'ccp_table_size_bytes_1{dbname="pikachu"'; } || {
      retry "custom metric 1 not found for pikachu db"
      exit 1
    }
    { contains "${scrape_metrics}" 'ccp_table_size_bytes_1{dbname="onix"'; } || {
      retry "custom metric 1 not found for onix db"
      exit 1
    }
    { contains "${scrape_metrics}" 'ccp_table_size_bytes_2{dbname="onix"'; } || {
      retry "custom metric 2 not found for onix db"
      exit 1
    }
    { ! contains "${scrape_metrics}" 'ccp_table_size_bytes_2{dbname="pikachu"'; } || {
      retry "custom metric 2 found for pikachu db"
      exit 1
    }
@@ -0,0 +1,6 @@
apiVersion: kuttl.dev/v1beta1
kind: TestStep
apply:
- files/19--add-logs-exporter.yaml
assert:
- files/19-logs-exporter-added.yaml
@@ -0,0 +1,6 @@
apiVersion: kuttl.dev/v1beta1
kind: TestStep
apply:
- files/21--create-cluster.yaml
assert:
- files/21-cluster-created.yaml
@@ -1,6 +1,6 @@
 apiVersion: kuttl.dev/v1beta1
 kind: TestStep
 apply:
-- files/15--add-backups.yaml
+- files/23--add-backups.yaml
 assert:
-- files/15-backups-added.yaml
+- files/23-backups-added.yaml
@@ -0,0 +1,17 @@
---
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: otel-cluster
spec:
  users:
    - name: ash
      databases:
        - pikachu
    - name: brock
      databases:
        - onix
  instrumentation:
    metrics:
      perDBMetricTargets:
        - pikachu
@@ -0,0 +1,11 @@
---
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: otel-cluster
spec:
  instrumentation:
    metrics:
      perDBMetricTargets:
        - pikachu
        - onix
@@ -0,0 +1,13 @@
---
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: otel-cluster
spec:
  instrumentation:
    metrics:
      customQueries:
        remove:
          - ccp_connection_stats_active
          - ccp_database_size_bytes
          - ccp_table_size_bytes
@@ -0,0 +1,62 @@
---
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: otel-cluster
spec:
  instrumentation:
    metrics:
      customQueries:
        add:
          - name: custom1
            databases: [pikachu, onix]
            queries:
              name: my-custom-queries2
              key: custom1.yaml
          - name: custom2
            databases: [onix]
            queries:
              name: my-custom-queries2
              key: custom2.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-custom-queries2
data:
  custom1.yaml: |
    - sql: >
        SELECT current_database() as dbname
        , n.nspname as schemaname
        , c.relname
        , pg_catalog.pg_total_relation_size(c.oid) as bytes
        FROM pg_catalog.pg_class c
        JOIN pg_catalog.pg_namespace n ON c.relnamespace = n.oid
        WHERE NOT pg_is_other_temp_schema(n.oid)
        AND relkind IN ('r', 'm', 'f');
      metrics:
        - metric_name: ccp_table_size_bytes_1
          value_type: double
          value_column: bytes
          description: "Table size in bytes including indexes"
          attribute_columns: ["dbname", "schemaname", "relname"]
          static_attributes:
            server: "localhost:5432"
  custom2.yaml: |
    - sql: >
        SELECT current_database() as dbname
        , n.nspname as schemaname
        , c.relname
        , pg_catalog.pg_total_relation_size(c.oid) as bytes
        FROM pg_catalog.pg_class c
        JOIN pg_catalog.pg_namespace n ON c.relnamespace = n.oid
        WHERE NOT pg_is_other_temp_schema(n.oid)
        AND relkind IN ('r', 'm', 'f');
      metrics:
        - metric_name: ccp_table_size_bytes_2
          value_type: double
          value_column: bytes
          description: "Table size in bytes including indexes"
          attribute_columns: ["dbname", "schemaname", "relname"]
          static_attributes:
            server: "localhost:5432"
@@ -0,0 +1,124 @@
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: otel-cluster
status:
  instances:
    - name: instance1
      readyReplicas: 1
      replicas: 1
      updatedReplicas: 1
  proxy:
    pgBouncer:
      readyReplicas: 1
      replicas: 1
---
apiVersion: v1
kind: Pod
metadata:
  labels:
    postgres-operator.crunchydata.com/data: postgres
    postgres-operator.crunchydata.com/role: master
    postgres-operator.crunchydata.com/cluster: otel-cluster
    postgres-operator.crunchydata.com/crunchy-otel-collector: "true"
status:
  containerStatuses:
    - name: collector
      ready: true
      started: true
    - name: database
      ready: true
      started: true
    - name: pgbackrest
      ready: true
      started: true
    - name: pgbackrest-config
      ready: true
      started: true
    - name: replication-cert-copy
      ready: true
      started: true
  phase: Running
---
apiVersion: v1
kind: Pod
metadata:
  labels:
    postgres-operator.crunchydata.com/data: pgbackrest
    postgres-operator.crunchydata.com/cluster: otel-cluster
    postgres-operator.crunchydata.com/crunchy-otel-collector: "true"
status:
  containerStatuses:
    - name: collector
      ready: true
      started: true
    - name: pgbackrest
      ready: true
      started: true
    - name: pgbackrest-config
      ready: true
      started: true
  phase: Running
---
apiVersion: v1
kind: Pod
metadata:
  labels:
    postgres-operator.crunchydata.com/role: pgbouncer
    postgres-operator.crunchydata.com/cluster: otel-cluster
    postgres-operator.crunchydata.com/crunchy-otel-collector: "true"
status:
  containerStatuses:
    - name: collector
      ready: true
      started: true
    - name: pgbouncer
      ready: true
      started: true
    - name: pgbouncer-config
      ready: true
      started: true
  phase: Running
---
apiVersion: v1
kind: Service
metadata:
  name: otel-cluster-primary
---
apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    postgres-operator.crunchydata.com/role: pgadmin
    postgres-operator.crunchydata.com/pgadmin: otel-pgadmin
---
apiVersion: v1
kind: Pod
metadata:
  labels:
    postgres-operator.crunchydata.com/data: pgadmin
    postgres-operator.crunchydata.com/role: pgadmin
    postgres-operator.crunchydata.com/pgadmin: otel-pgadmin
    postgres-operator.crunchydata.com/crunchy-otel-collector: "true"
status:
  containerStatuses:
    - name: collector
      ready: true
      started: true
    - name: pgadmin
      ready: true
      started: true
  phase: Running
---
apiVersion: v1
kind: Secret
metadata:
  labels:
    postgres-operator.crunchydata.com/role: pgadmin
    postgres-operator.crunchydata.com/pgadmin: otel-pgadmin
type: Opaque
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-custom-queries2
Review thread:

Reviewer: 🤔 IIUC, this and `customQueries` only make sense at/on Postgres. Is there a way to arrange the structs so the PGAdmin API doesn't have these fields? Not a blocker; v1beta1.

Author: I didn't want to make any overly large changes to the spec, but I wonder if the pgadmin and postgrescluster versions will diverge even more (or if we'll eventually want to slice up instrumentation by target type, e.g., postgres-instrumentation, pgbouncer-instrumentation, pgadmin-instrumentation).

Reviewer: Hmm, yeah, good point... Probably makes sense to at least have a pgadmin-instrumentation and a postgrescluster-instrumentation and then pick and choose which structs make sense in each.

Author: Should I add "taking apart the instrumentation struct" to this PR?

Reviewer: Might make sense if you have the bandwidth...

Author: Given the other work due this sprint, I'm going to make a ticket for this topic and not include it in this PR.

Author: Made that other ticket.
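For what it's worth, one possible arrangement along the lines discussed, keeping Postgres-only fields off the pgAdmin API; every name here is illustrative, not the eventual spec:

```go
// Package sketch outlines one way to split the Instrumentation API so
// the pgAdmin spec never carries Postgres-only fields. Illustrative only.
package sketch

// InstrumentationSpec holds the knobs shared by every collector target.
type InstrumentationSpec struct {
	// ... shared config/logs settings elided ...
}

// PostgresInstrumentationSpec embeds the shared spec and adds the
// Postgres-only metrics fields (customQueries, perDBMetricTargets).
type PostgresInstrumentationSpec struct {
	InstrumentationSpec
	Metrics *PostgresMetricsSpec `json:"metrics,omitempty"`
}

type PostgresMetricsSpec struct {
	CustomQueries      []CustomQuerySet `json:"customQueries,omitempty"`
	PerDBMetricTargets []string         `json:"perDBMetricTargets,omitempty"`
}

type CustomQuerySet struct {
	Name      string   `json:"name"`
	Databases []string `json:"databases,omitempty"`
}

// The pgAdmin API would then use just the shared spec, with no
// Postgres-only fields to ignore.
type PGAdminInstrumentationSpec = InstrumentationSpec
```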