Metric Filters
📋 Table of Contents
- Overview
- Metric Filter Structure
- Error Metric Filter
- Error Alarm
- Log Subscription
- IAM Roles and Permissions
- Best Practices
- Useful Links
🌟 Overview
This document outlines the CloudWatch Metric Filters configured in our infrastructure. Metric filters allow us to extract specific data from log events and transform it into CloudWatch metrics, enabling more advanced monitoring and alerting capabilities.
Note: The metric filter and associated resources are defined in the metric-filter.ts file in our infrastructure code.
🏗️ Metric Filter Structure
A metric filter typically consists of the following components:
| Component | Description |
|---|---|
| Filter Pattern | Defines what to look for in the log events |
| Metric Name | The name of the metric to create or update |
| Metric Namespace | The namespace for the metric |
| Metric Value | The value to publish for the metric when a matched log event is found |
🚨 Error Metric Filter
We have configured a metric filter to track error logs across our application.
Error Metric Filter Configuration
const errorMetricFilter = new aws.cloudwatch.LogMetricFilter(`gh-errorfilter-${stack}-cw-${region}-metric-filter`, {
  name: `gh-errorfilter-${stack}-cw-${region}-metric-filter`,
  logGroupName: logGroup.name,
  metricTransformation: {
    name: "ErrorCount",
    namespace: "CustomMetrics",
    value: "1", // each matching log event adds 1 to ErrorCount
  },
  // Matches 500 responses and error-level logs, except two known-noise messages.
  pattern: '{(($.event.status = "500 Internal Server Error") || ($.event.code = 500) || ($.level = "error" && $.event != "Invalid HTTP_HOST header: *" && $.event != "*was sent SIGTERM!") || ($.event.level = "error" && $.event != "Invalid HTTP_HOST header: *" && $.event != "*was sent SIGTERM!"))}',
});
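This filter counts log events that report a 500 status or code, plus any event logged at the error level (either $.level or $.event.level). Error-level events are excluded when their message starts with "Invalid HTTP_HOST header:" or ends with "was sent SIGTERM!", since these are known, non-actionable issues.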
Important: Regularly review and update the filter pattern to ensure it captures all relevant error scenarios while excluding false positives.
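One way to make that review repeatable is to keep a small local predicate that mirrors the pattern's boolean structure and run sample log events through it before changing the real filter. The sketch below is an approximation, not CloudWatch's matching engine: the helper names (`wildcardToRegExp`, `get`, `matchesErrorFilter`) are hypothetical, and the handling of missing fields or non-string values is an assumption that may differ from the service.

```typescript
// Local approximation of the error filter pattern, for testing sample events.
// Assumption: only the pattern's boolean structure is mirrored here; real
// CloudWatch matching may treat missing or non-string fields differently.

type LogEvent = Record<string, unknown>;

// Turn a wildcard string ("*" = any run of characters) into an anchored RegExp.
function wildcardToRegExp(pattern: string): RegExp {
  const body = pattern
    .split("*")
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join("[\\s\\S]*");
  return new RegExp(`^${body}$`);
}

// The two messages the pattern excludes from error-level matches.
const EXCLUDED_EVENTS: RegExp[] = [
  "Invalid HTTP_HOST header: *",
  "*was sent SIGTERM!",
].map(wildcardToRegExp);

// Read a nested property, e.g. get(log, "event", "status").
function get(obj: unknown, ...path: string[]): unknown {
  let current: unknown = obj;
  for (const key of path) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// True when `event` is a string matching one of the excluded messages.
function isExcluded(event: unknown): boolean {
  if (typeof event !== "string") return false;
  const text: string = event;
  return EXCLUDED_EVENTS.some((re) => re.test(text));
}

function matchesErrorFilter(log: LogEvent): boolean {
  const is500 =
    get(log, "event", "status") === "500 Internal Server Error" ||
    get(log, "event", "code") === 500;
  const isErrorLevel =
    get(log, "level") === "error" || get(log, "event", "level") === "error";
  return is500 || (isErrorLevel && !isExcluded(get(log, "event")));
}
```

For instance, `matchesErrorFilter({ level: "error", event: "database unavailable" })` returns true, while an error-level event whose message ends in "was sent SIGTERM!" returns false.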
🔔 Error Alarm
An alarm is set up to trigger when the error count exceeds a specified threshold.
Error Alarm Configuration
const errorAlarm = new aws.cloudwatch.MetricAlarm(`gh-errorfilter-${stack}-cw-${region}-alarm`, {
  name: `gh-errorfilter-${stack}-cw-${region}-alarm`,
  comparisonOperator: "GreaterThanThreshold",
  evaluationPeriods: 1,
  metricName: errorMetricFilter.metricTransformation.name,
  namespace: errorMetricFilter.metricTransformation.namespace,
  period: 10, // seconds
  statistic: "Sum",
  threshold: 1, // with GreaterThanThreshold, the sum must exceed 1
  alarmDescription: "This alarm is triggered when there are any error logs",
  alarmActions: [snsTopicErrorAlerts.arn],
  treatMissingData: "notBreaching", // periods without data do not trigger the alarm
});
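With comparisonOperator set to GreaterThanThreshold and threshold set to 1, this alarm triggers only when the sum of ErrorCount over a 10-second period is greater than 1, that is, when at least two matching log events occur. A single error in the period does not trigger it, even though alarmDescription says "any error logs"; lower the threshold to 0 if a single error should alert.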
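To make the threshold semantics concrete, the sketch below mimics how one evaluation period maps to an alarm state under these settings (Sum statistic, GreaterThanThreshold, threshold 1, missing data treated as not breaching). It is illustrative only: CloudWatch performs this evaluation itself, and `evaluatePeriod` is a hypothetical helper, not part of our infrastructure code.

```typescript
// Illustrative model of a single alarm evaluation period.
// Assumption: one evaluation period, matching evaluationPeriods: 1.

type AlarmState = "ALARM" | "OK";

const THRESHOLD = 1;

// `errorCounts` holds the ErrorCount data points published in one period;
// an empty array stands for a period with no data.
function evaluatePeriod(errorCounts: number[]): AlarmState {
  if (errorCounts.length === 0) {
    return "OK"; // treatMissingData: "notBreaching"
  }
  const sum = errorCounts.reduce((total, value) => total + value, 0);
  // GreaterThanThreshold is a strict comparison: a sum equal to 1 stays OK.
  return sum > THRESHOLD ? "ALARM" : "OK";
}
```

Under this model a period containing one matching log event stays in OK, and a period containing two moves to ALARM.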
📡 Log Subscription
A log subscription filter is set up to send matching log events to a Lambda function for further processing.
Log Subscription Configuration
const logSubscription = new aws.cloudwatch.LogSubscriptionFilter(`gh-errorfilter-${stack}-cw-${region}-subscription`, {
  logGroup: logGroup.name,
  filterPattern: errorMetricFilter.pattern,
  destinationArn: errorFilterLambdaFunction.arn,
}, {
  dependsOn: [lambdaPermission, logRolePolicy, logGroup],
});
This subscription sends matching log events to the errorFilterLambdaFunction for additional processing or alerting.
Note: For details on how the Lambda function processes and sends these logs to Discord, see the Error Filter Discord Lambda Function documentation.
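For orientation, CloudWatch Logs delivers subscription events to Lambda as base64-encoded, gzip-compressed JSON under `awslogs.data`. The sketch below shows one way to decode that payload in a Node.js Lambda. It is not the code of errorFilterLambdaFunction: the function names are hypothetical, the interfaces list only a subset of the payload fields, and the Discord delivery step is deliberately left out. `encodeLogsPayload` exists only to build test events locally.

```typescript
import { gunzipSync, gzipSync } from "zlib";

// Subset of the decoded payload that CloudWatch Logs sends to a subscriber.
interface CloudWatchLogsPayload {
  logGroup: string;
  logStream: string;
  logEvents: { id: string; timestamp: number; message: string }[];
}

// Raw Lambda event: base64-encoded, gzip-compressed JSON.
interface CloudWatchLogsEvent {
  awslogs: { data: string };
}

// Decode the raw subscription event into its JSON payload.
function decodeLogsEvent(event: CloudWatchLogsEvent): CloudWatchLogsPayload {
  const compressed = Buffer.from(event.awslogs.data, "base64");
  const json = gunzipSync(compressed).toString("utf8");
  return JSON.parse(json) as CloudWatchLogsPayload;
}

// Test helper (hypothetical): wrap a payload the way CloudWatch Logs would.
function encodeLogsPayload(payload: CloudWatchLogsPayload): CloudWatchLogsEvent {
  const compressed = gzipSync(Buffer.from(JSON.stringify(payload), "utf8"));
  return { awslogs: { data: compressed.toString("base64") } };
}

// Hypothetical handler: returns the matched log messages. The real function's
// downstream processing is documented separately.
async function handler(event: CloudWatchLogsEvent): Promise<string[]> {
  const payload = decodeLogsEvent(event);
  return payload.logEvents.map((logEvent) => logEvent.message);
}
```

Because the subscription reuses the metric filter's pattern, every message the handler receives has already matched the error filter.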
🔐 IAM Roles and Permissions
Appropriate IAM roles and permissions are set up to allow CloudWatch Logs to invoke the Lambda function.
IAM Role and Policy Configuration
const logsRole = new aws.iam.Role(`gh-errorfilter-${stack}-iam-${region}-role`, {
  // ... role configuration ...
});

const logRolePolicy = new aws.iam.RolePolicy(`gh-errorfilter-${stack}-iam-${region}-role-policy`, {
  // ... policy configuration ...
});

const lambdaPermission = new aws.lambda.Permission(`gh-errorfilter-${stack}-lambda-${region}-permission`, {
  // ... lambda permission configuration ...
});
These configurations ensure that CloudWatch Logs has the necessary permissions to interact with the Lambda function.
📝 Best Practices
- Optimize Filter Patterns: Regularly review and refine filter patterns to ensure accuracy and efficiency.
- Monitor Costs: Be aware that extracting metrics from logs can increase CloudWatch costs. Monitor usage and adjust as needed.
- Use Meaningful Metric Names: Choose clear and descriptive names for your metrics to aid in monitoring and troubleshooting.
- Set Appropriate Thresholds: Regularly review and adjust alarm thresholds based on application behavior and requirements.
- Leverage Lambda for Complex Processing: Use Lambda functions for more complex log processing that can't be handled by metric filters alone.
🔗 Useful Links
- CloudWatch Logs Metric Filter Documentation
- CloudWatch Alarms Documentation
- Lambda Function Error Handling
- IAM Roles for CloudWatch Logs
Remember to keep this document up-to-date as you modify the metric filter configuration or add new filters. Regular reviews will ensure that your log monitoring strategy remains effective and aligned with your operational needs.