Features wishlist! #2354

Open
calebcall opened this issue May 27, 2025 · 0 comments

Wishlist

  1. Notifications: Notifications are one of the most important parts of any monitoring stack. It doesn't matter how well the actual monitoring works if you can't be properly notified. So I would love to see the following in a proper notification module:
  • Central notification channel management. I shouldn't need to go to every single monitor to change a notification channel, and I shouldn't have to make the same update multiple times (e.g. replacing one Discord webhook with a new one).
  • Pluggable notification channels. The Bluewave team may include Slack, Discord, Telegram, and email out of the box, but some users will want a more obscure notification channel (NTFY, for example). Make it simple to add a new notification channel module without having to modify core code. This keeps the system adaptable and expandable without the Bluewave team becoming the bottleneck. The team can choose to bring any quality, useful channels the community builds into the core product if they so desire.
  • Ability to add multiple notification channels of the same type. I should be able to have multiple Discord channels, for example one for prod and one for non-prod.
  • Ability to send notifications to multiple channels for the same monitor. For example, if disk runs low on serverA, I want to notify the support Discord channel so that an engineer who has a minute can grab it. At the same time, or maybe after a configurable delay, I want to notify the on-call pager via another means (PagerDuty, Opsgenie, Grafana OnCall, etc.) so they are paged and can act on it.
  • Notification escalations. If a notification has been sent but the alert hasn't been acknowledged and is still firing after a configurable amount of time (minutes, hours, etc.), it escalates to a different notification channel.
  • Custom service monitoring. The ability to monitor a service on a server is paramount to a monitoring solution. Sometimes a service doesn't expose a port or endpoint, or that endpoint is restricted, so being able to monitor for a process on a server where Capture is already running would be useful.
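To make the notification items above a bit more concrete, here is a minimal sketch of what I have in mind. Every name in it (`channel_type`, `Notifier`, `Step`, `OUTBOX`) is made up for illustration and is not Checkmate's actual API, and delivery is faked by appending to a list instead of calling a real webhook. It shows channel types registered as plugins, named channels defined once in a central place, several channels of the same type, and a per-monitor policy that fans out and escalates after a delay.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical sketch: none of these names exist in Checkmate.
Sender = Callable[[str], None]
CHANNEL_TYPES: Dict[str, Callable[[dict], Sender]] = {}
OUTBOX: List[Tuple[str, str]] = []  # (destination, message); stands in for real delivery


def channel_type(name: str):
    """Decorator that registers a channel plugin under a type name."""
    def register(factory):
        CHANNEL_TYPES[name] = factory
        return factory
    return register


@channel_type("discord")
def make_discord(config: dict) -> Sender:
    # A real plugin would POST to the webhook; this one only records the call.
    return lambda message: OUTBOX.append((config["webhook_url"], message))


@channel_type("ntfy")
def make_ntfy(config: dict) -> Sender:
    return lambda message: OUTBOX.append((config["topic"], message))


@dataclass
class Step:
    channel: str        # name of a centrally defined channel
    after_minutes: int  # fire once the alert has gone unacknowledged this long


class Notifier:
    def __init__(self) -> None:
        self.channels: Dict[str, Sender] = {}

    def define_channel(self, name: str, type_name: str, config: dict) -> None:
        """Create or update a named channel once; monitors refer to it by name."""
        self.channels[name] = CHANNEL_TYPES[type_name](config)

    def notify(self, policy: List[Step], message: str,
               minutes_unacked: int, fired: Set[int]) -> None:
        """Send every step that is due and has not fired yet for this alert."""
        for index, step in enumerate(policy):
            if index not in fired and minutes_unacked >= step.after_minutes:
                self.channels[step.channel](message)
                fired.add(index)


notifier = Notifier()
notifier.define_channel("discord-prod", "discord", {"webhook_url": "https://example.invalid/prod"})
notifier.define_channel("discord-nonprod", "discord", {"webhook_url": "https://example.invalid/nonprod"})
notifier.define_channel("oncall", "ntfy", {"topic": "oncall-pager"})

policy = [Step("discord-prod", 0), Step("oncall", 15)]
fired: Set[int] = set()
notifier.notify(policy, "disk low on serverA", 0, fired)   # support channel only
notifier.notify(policy, "disk low on serverA", 20, fired)  # still unacked: escalate to on-call
```

Because monitors only hold channel names, changing a webhook means calling `define_channel` once, and a new channel type is one decorated function with no change to the core.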
  2. Other
  • Tags on monitors that allow me to search and filter based on them, including in notifications. For example, I would like to be able to notify a customer when their service is having an issue. If I could set up the monitor that's monitoring their service with a tag, I could then add a filter for tag = XXX that sends to this channel but also sends to this other channel (a production support team, for example). An example of this concept is Grafana Alerting: https://grafana.com/docs/grafana/latest/alerting/configure-notifications/
  • Searchable monitors. Currently, with ~150 monitors in Checkmate, I have to browse through all the pages to find the monitor I want to look at.
  • Page size preference remembered across sessions (at the user level would be ideal, along with a globally configurable default value). Related to the above: if I change it to 25 per page and then refresh, it defaults back to 5. Super annoying.
  • Custom script execution. Similar to custom service monitoring, being able to create scripts or plugins that run, do something more complex than up/down, and report back healthy or not makes a monitoring system extremely expandable and adaptable. A good example is the Icinga/Nagios/Checkmk ecosystem. Parsing is simplified because the return code of the plugin has to be 0, 1, 2, or 3 for OK, Warning, Critical, or Unknown, so there is no need to parse all sorts of different outputs. https://www.monitoring-plugins.org/doc/guidelines.html#AEN74
  • WebSocket and gRPC monitoring. Many of the endpoints I need to monitor are one of the two. Being able to query them and check for successful replies is crucial to a complete monitoring solution.
  • Event handlers. Being able to trigger event handlers based on certain criteria. For example, if a service dies, restart it; however, if that service dies more than once in a given time period, send a notification.
  • Ability to export metrics in Prometheus format. Monitoring and observability are two different things, so even though I have monitoring in place I'm still going to have an observability platform. Since the main metrics are already being gathered by Capture/Checkmate, it would be nice if at a minimum those could be exposed as Prometheus metrics, and even better if I could provide a Prometheus-compatible URL and have it do the remote write.
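For the Prometheus item, a rough sketch of what a scrape endpoint could return, using the Prometheus text exposition format. The metric name `checkmate_monitor_up`, the label names, and the input shape are my own assumptions and not anything Checkmate exposes today. The sketch covers only the pull/scrape side, not remote write.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote, and newline as the text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_prometheus(samples: list) -> str:
    """Render monitor statuses as Prometheus text-format gauge samples.

    Each sample is a dict like {"labels": {"monitor": "api"}, "up": True};
    this input shape is hypothetical.
    """
    lines = [
        "# HELP checkmate_monitor_up Whether the last check succeeded (1) or failed (0).",
        "# TYPE checkmate_monitor_up gauge",
    ]
    for sample in samples:
        labels = ",".join(
            f'{key}="{escape_label_value(value)}"'
            for key, value in sorted(sample["labels"].items())
        )
        lines.append(f'checkmate_monitor_up{{{labels}}} {1 if sample["up"] else 0}')
    return "\n".join(lines) + "\n"


print(render_prometheus([
    {"labels": {"monitor": "api", "env": "prod"}, "up": True},
    {"labels": {"monitor": "db", "env": "prod"}, "up": False},
]))
```

Serving that string from an HTTP endpoint would be enough for an existing Prometheus server to scrape the data.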

All of these are features in other monitoring systems. In all the cases where I propose the ability to use a custom plugin, module, etc., I would find it perfectly acceptable to require that the plugin be written in a certain language and meet other requirements (for example, the return code of the custom script/plugin) to make it easier to plug in to Checkmate/Capture and to keep the system running efficiently.
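As an illustration of how small the return-code contract can be, here is a sketch of a runner that executes any script and maps its exit code to a state, following the 0/1/2/3 convention from the Monitoring Plugins guidelines linked above. The function name `run_check` and the choice to keep only the first line of output are my assumptions, not an existing Checkmate or Capture interface.

```python
import subprocess
import sys

# Exit-code convention from the Monitoring Plugins guidelines.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_check(argv, timeout=10.0):
    """Run a check script and return (state, first line of its output).

    Anything outside the contract (missing binary, timeout, unexpected
    exit code, death by signal) is reported as UNKNOWN.
    """
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return "UNKNOWN", str(exc)
    state = STATES.get(proc.returncode, "UNKNOWN")
    output = proc.stdout.strip()
    return state, output.splitlines()[0] if output else ""


# Stand-in for a real plugin: prints one status line and exits with 1 (Warning).
fake_plugin = [sys.executable, "-c", "import sys; print('disk 91% used'); sys.exit(1)"]
print(run_check(fake_plugin))
```

The runner never has to understand what the script checks; it only needs the exit code and, optionally, a line of text to show in the UI or a notification.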
