Google Cloud Monitoring Integration
Google Cloud Monitoring helps you gain visibility into the performance, availability, and health of applications and infrastructure running on Google Cloud.
This guide shows you how to configure Google Cloud alerting to send alerts to your Ready Five account through a webhook notification channel, so that alerts create incidents and escalate to your team appropriately.
Create the Ready Five integration
In your web browser, navigate to the "Integrations" tab in the team that should own this integration and click the "Add Integration" button.
For the integration type, click the "Add" button in the Google Cloud box.
Give the integration a name (or keep the default) and an optional description and click "Add".
The integration is now created, and this screen shows a URL that you'll need in a minute. Keep this tab open and open another.
Add a Google Cloud notification channel
Open the Google Cloud Monitoring console, go to the "Alerting" navigation item in the sidebar and click "Edit Notification Channels" in the top bar.
In the middle of the page, find the "Webhooks" section and click "Add new".
Paste the Ready Five integration URL into the "Endpoint URL" field and specify a name for this notification channel. We recommend including the Ready Five team name in this display name to make alert destinations obvious later, in case you target more than one team from Google Cloud. Click "Test Connection", which sends a request and creates an incident in Ready Five, verifying that the webhook is properly configured. You can resolve that incident and then continue with setup.
After you test the connection, the "Save" button becomes clickable. Click "Save".
The webhook connecting Google Cloud to Ready Five is now configured as a "notification channel" and can be used in Google Cloud Monitoring alerting policies.
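If you'd rather script this step, the same webhook notification channel can be created with the Cloud Monitoring API. Below is a minimal sketch using the google-cloud-monitoring Python client; the project ID, display name, and endpoint URL are placeholders, and "webhook_tokenauth" is our assumption for the channel type that corresponds to the console's "Webhooks" section.

```python
# Sketch: create the Ready Five webhook notification channel via the
# Cloud Monitoring API. Replace the placeholder project ID, display name,
# and endpoint URL with your own values.
from google.cloud import monitoring_v3

PROJECT_ID = "my-gcp-project"  # placeholder
READY_FIVE_URL = "https://example.invalid/your-ready-five-integration-url"  # placeholder

client = monitoring_v3.NotificationChannelServiceClient()

channel = {
    # "webhook_tokenauth" is assumed to be the channel type behind the
    # console's "Webhooks" section; the endpoint URL is passed as a label.
    "type": "webhook_tokenauth",
    "display_name": "Ready Five - Platform Team",  # include the team name
    "labels": {"url": READY_FIVE_URL},
}

created = client.create_notification_channel(
    name=f"projects/{PROJECT_ID}",
    notification_channel=channel,
)
print(f"Created notification channel: {created.name}")
```

The console's "Test Connection" step has no direct equivalent in this sketch, so you may still want to confirm from the console that the channel reaches Ready Five.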
Add a Google Cloud alerting policy
Alerting policies allow you to specify the conditions that must occur for alerts to trigger. You'll often configure many alerting policies for various elements of your infrastructure, ensuring that you're notified about important situations that happen in your account. This could be based on elevated CPU for compute instances, memory usage on your database servers, or any number of other possibilities.
There is excellent documentation elsewhere for selecting actionable metrics for alerting and for fine-tuning complex alerting policies, so this guide focuses only on wiring those alert policies up to notify Ready Five.
For the purposes of this guide, we'll configure an alerting policy that triggers when an uptime check fails.
Return to the "Alerting" navigation item in the sidebar and click "Create Policy" in the top bar.
Click "Select a metric" to choose the metric for this condition.
We'll select Uptime Check URL => Uptime_check => Check passed, then click "Apply". Again, this is just an example; the metric you select may differ from the one we're choosing in this guide.
There are multiple checks in our account, and we only want to alert if the check called "ready-five-web" is failing, so we're filtering on check_id where the comparator is "= equals" and the value is "ready-five-web" (1).
If there are one or more failures during any minute, we want to alert (2).
This metric is a "check passed" metric that returns true when successful. We only care about failures, so we're effectively inverting the metric by aggregating to count false (3).
Click "Next" to continue.
We want to trigger this alert when any check fails, so we'll set the Condition Type to "Threshold" and the Threshold value to "0". Whenever the count of failures goes above 0 during a one-minute period, an alert will trigger.
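For reference, here is how we'd assume the condition configured above maps onto the Monitoring API's threshold-condition representation. This is a sketch only: the metric type is the uptime check "check passed" metric, the check_id value is the example from this guide, and the resource type assumes an HTTP(S) uptime check.

```python
# Sketch of the condition configured above, expressed in the API's
# threshold-condition shape. Values here mirror this guide's example check.
condition = {
    "display_name": "Uptime check failing for ready-five-web",
    "condition_threshold": {
        # Filter to the boolean "check passed" metric for the one check we
        # care about. resource.type "uptime_url" assumes an HTTP(S) check.
        "filter": (
            'metric.type = "monitoring.googleapis.com/uptime_check/check_passed" '
            'AND resource.type = "uptime_url" '
            'AND metric.labels.check_id = "ready-five-web"'
        ),
        "aggregations": [
            {
                # Count failures per minute: ALIGN_COUNT_FALSE effectively
                # inverts the boolean metric into a failure count.
                "alignment_period": {"seconds": 60},
                "per_series_aligner": "ALIGN_COUNT_FALSE",
            }
        ],
        # Alert whenever the per-minute failure count rises above 0.
        "comparison": "COMPARISON_GT",
        "threshold_value": 0,
        "duration": {"seconds": 0},
    },
}
```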
Click "Next" to continue.
Now we're asked to configure notifications for this policy. Choose the notification channel you created earlier (1).
If you want the Ready Five incident to be closed when the alerting conditions return to non-breaching levels, check the box for "Notify on incident closure". If you prefer to require someone from your team to explicitly close the incident on Ready Five, uncheck this box (2).
For the Incident autoclose duration, the default of 7 days generally works well. This only applies when no further data is available to determine the status of the incident; in that case, Google Cloud closes the incident automatically after this duration (3).
Take advantage of the "Documentation" field! Since you're creating the alert policy, you probably have some insight into what a team member should do when this alert triggers. Include links to runbooks, dashboards, chat rooms, and anything else that may be helpful in the context of this specific failure. This field supports markdown formatting and dynamic variables, and it displays a live preview as you type. If provided, the Documentation content is displayed prominently on Ready Five incident pages (4).
Give the alert policy a name to help you identify it later (5).
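Continuing the condition sketch above, the whole policy, including the Ready Five notification channel, the 7-day autoclose, and markdown documentation, can also be created through the API. In the sketch below, the project ID, channel resource name, and runbook/dashboard links are placeholders, and the `condition` variable is the dict from the earlier sketch.

```python
# Sketch: create the alerting policy and attach the Ready Five notification
# channel. Assumes `condition` from the previous sketch; the project ID,
# channel name, and documentation links are placeholders.
from google.cloud import monitoring_v3

PROJECT_ID = "my-gcp-project"  # placeholder
READY_FIVE_CHANNEL = "projects/my-gcp-project/notificationChannels/1234567890"  # placeholder

client = monitoring_v3.AlertPolicyServiceClient()

policy = {
    "display_name": "ready-five-web uptime check failing",
    "combiner": "OR",
    "conditions": [condition],  # the threshold condition sketched earlier
    "notification_channels": [READY_FIVE_CHANNEL],
    # Mirror the console's default 7-day incident autoclose duration.
    "alert_strategy": {"auto_close": {"seconds": 7 * 24 * 60 * 60}},
    # Markdown shown prominently on the resulting Ready Five incident.
    "documentation": {
        "mime_type": "text/markdown",
        "content": (
            "## ready-five-web uptime check is failing\n\n"
            "- Runbook: https://example.invalid/runbooks/uptime\n"
            "- Dashboard: https://example.invalid/dashboards/ready-five-web\n"
        ),
    },
}

created = client.create_alert_policy(
    name=f"projects/{PROJECT_ID}",
    alert_policy=policy,
)
print(f"Created alert policy: {created.name}")
```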
Click "Next" to continue to alert policy review.
Review your alert settings and if everything looks good, click "Create policy".
This policy is now in effect. Whenever the threshold is breached, Ready Five will be notified. When the threshold returns to normal levels, a follow-up alert will be sent to Ready Five to close the incident (assuming you left "Notify on incident closure" checked).
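If you'd like to double-check the wiring, a minimal sketch with the same Python client (and the same placeholder project ID) lists each alert policy along with the notification channels attached to it, so you can confirm the Ready Five channel appears on the new policy.

```python
# Sketch: list alert policies and the notification channels attached to each,
# to confirm the Ready Five channel is wired up. PROJECT_ID is a placeholder.
from google.cloud import monitoring_v3

PROJECT_ID = "my-gcp-project"  # placeholder

client = monitoring_v3.AlertPolicyServiceClient()
for policy in client.list_alert_policies(name=f"projects/{PROJECT_ID}"):
    print(policy.display_name)
    for channel in policy.notification_channels:
        print(f"  -> {channel}")
```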