GCP PubSub scaler produces large amount of errors when subscription has no messages. #5896

Caislear · 2024-06-19T08:37:04Z

Report

The GCP Pub/Sub scaler produces large numbers of "error getting metric" errors when the Pub/Sub subscription is empty. In our case it produces tens of thousands of error messages per day. Besides the log spam, this also appears to interfere with Flux deployment reconciliations: the scaler is sometimes not marked as healthy during scaler changes, which delays deployments.

Expected Behavior

There should be some mechanism that accepts that a subscription may have no messages for an extended period of time, such as the valueIfNull default value feature on the older Stackdriver scaler. This would allow us to have a default fallback value when the backing GCP metric returns no data.

Actual Behavior

KEDA logs large numbers of "error getting metric" and "error getting scale decision" errors because there is no default fallback value. From testing, this not only causes excessive error spam but sometimes interferes with our Flux deployments: the scaler fails to be marked as healthy because it cannot resolve a scale decision, which delays our deployments.

Steps to Reproduce the Problem

Create a Pub/Sub scaler.
Point it at a Pub/Sub subscription that only receives sporadic messages throughout a given day (so the subscription is empty the majority of the time).
Monitor over the course of the day and note the repeated log messages about failing to get the metric and failing to make a scale decision.

(The more scalers there are, the more noticeable this issue is. In our case we have 40+ Pub/Sub scalers.)

Logs from KEDA operator

could not find stackdriver metric with filter fetch pubsub_subscription | metric 'pubsub.googleapis.com/subscription/oldest_unacked_message_age' | filter (resource.project_id == 'xxx' && resource.subscription_id == 'xxx') | within 2m

KEDA Version

2.14.0

Kubernetes Version

1.28

Platform

Google Cloud

Scaler Details

Google Cloud Platform Pub/Sub

Anything else?

The most straightforward solution I can see is to let Pub/Sub scalers optionally use a configured default value when the Google Cloud Platform metric returns no value/null. This functionality already exists in the default Stackdriver scaler implementation, so it is easy to port.

I have created a PR that adds this functionality to the Pub/Sub scaler, and it resolves our issue with KEDA continuously logging errors. Found here
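The fallback idea described above can be sketched in a few lines of Go. This is only an illustration of the behaviour, not the code from the PR or from KEDA itself: `resolveMetric`, `errNoData`, and the `query` callback are hypothetical names, and the sketch assumes the "no data" case can be told apart from other failures.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoData is a hypothetical sentinel meaning "the metric query returned
// no time series", which is what happens when the subscription is empty.
var errNoData = errors.New("could not find stackdriver metric")

// resolveMetric returns the queried value, or the configured fallback when
// the backend reports no data. valueIfNull is nil when no fallback is set,
// in which case the error is surfaced as before.
func resolveMetric(query func() (float64, error), valueIfNull *float64) (float64, error) {
	v, err := query()
	if err == nil {
		return v, nil
	}
	// Only the "no data" case falls back; other errors are still reported.
	if errors.Is(err, errNoData) && valueIfNull != nil {
		return *valueIfNull, nil
	}
	return 0, fmt.Errorf("error getting metric: %w", err)
}

func main() {
	// Simulate an empty subscription: the query finds no data points.
	empty := func() (float64, error) { return 0, errNoData }

	fallback := 0.0
	v, err := resolveMetric(empty, &fallback)
	fmt.Println(v, err) // fallback configured: value is used, no error

	_, err = resolveMetric(empty, nil)
	fmt.Println(err) // no fallback configured: the error is logged
}
```

Keeping the fallback optional means existing scalers behave exactly as before unless a default is explicitly configured.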


stale · 2024-08-18T11:21:04Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

Caislear · 2024-08-19T14:50:10Z

This has been resolved with the recent changes I made in version 2.15.0. Closing issue out.

Caislear added the "bug" (Something isn't working) label Jun 19, 2024

This was referenced Jun 19, 2024

Could not find stackdriver metric with query fetch pubsub_subscription - Google Cloud Platform Pub/Sub #5855

Closed

feat: GCP Pub/Sub scaler add configurable fallback value when no metric value found #5897

Merged

stale bot added the "stale" (All issues that are marked as stale due to inactivity) label Aug 18, 2024

Caislear closed this as completed Aug 19, 2024
