Services

CloudWatch metrics supports the following services and lets you choose from a wide array of available metrics and statistics. Metrics in bold text are included in the default configuration. The statistics available for all metrics are Average, Maximum, Minimum, Sum, SampleCount, p50, p75, p90, p95, and p99.
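
To make the statistic names concrete, the following minimal sketch computes each of them over one period of samples. The sample values and the `summarize` helper are hypothetical, and the nearest-rank percentile method is an assumption chosen for simplicity; CloudWatch's own percentile calculation may produce slightly different values.

```python
import math


def summarize(samples):
    """Compute the listed statistics over one period's worth of samples."""
    ordered = sorted(samples)
    n = len(ordered)

    def percentile(p):
        # Nearest-rank method (an assumption): the smallest sample such that
        # at least p percent of all samples are less than or equal to it.
        rank = max(1, math.ceil(p / 100 * n))
        return ordered[rank - 1]

    stats = {
        "Average": sum(ordered) / n,
        "Maximum": ordered[-1],
        "Minimum": ordered[0],
        "Sum": sum(ordered),
        "SampleCount": n,
    }
    for p in (50, 75, 90, 95, 99):
        stats[f"p{p}"] = percentile(p)
    return stats


# Hypothetical latency samples in milliseconds, with one slow outlier.
latencies_ms = [12, 15, 11, 240, 14, 13, 16, 12, 18, 15]
stats = summarize(latencies_ms)
```

With a single outlier in the data, Average is pulled well above p50, which is why percentiles are often the better choice for latency-style metrics.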

AWS/ACMPrivateCA

Function: Provides a private certificate authority for managing SSL/TLS certificates

Scrape interval: 5 minutes

Metric	CloudWatch metric	Purpose
aws_acmprivateca_info
aws_acmprivateca_crlgenerated	CRLGenerated	Monitors the number of Certificate Revocation Lists (CRLs) generated. Used to ensure the regular creation of revocation lists for certificate management.
aws_acmprivateca_failure	Failure	Tracks the number of failures in Private CA operations. Useful for identifying issues in certificate issuance or other operations.
aws_acmprivateca_misconfigured_crlbucket	MisconfiguredCRLBucket	Monitors the number of instances where the CRL bucket is misconfigured. Useful for ensuring proper configuration and access to the CRL storage bucket.
aws_acmprivateca_success	Success	Tracks the number of successful operations within the ACM Private CA. Useful for monitoring operational efficiency and successful certificate issuances.
aws_acmprivateca_time	Time	Measures the time taken for various operations in ACM Private CA, helping to monitor performance and identify any slowdowns in certificate processing.

AWS/AmazonMQ

Function: Managed message broker service for Apache ActiveMQ and RabbitMQ

Scrape interval: 5 minutes

Metric	CloudWatch metric	Purpose
aws_amazonmq_info
aws_amazonmq_ack_rate	AckRate	Monitors the acknowledgment rate of messages, ensuring efficient message processing and acknowledgment.
aws_amazonmq_burst_balance	BurstBalance	Tracks the balance of burst credits, monitoring if the broker can handle sudden spikes in traffic.
aws_amazonmq_channel_count	ChannelCount	Monitors the number of active channels, indicating resource usage and load on the broker.
aws_amazonmq_confirm_rate	ConfirmRate	Measures the rate at which messages are confirmed, ensuring message delivery guarantees.
aws_amazonmq_connection_count	ConnectionCount	Tracks the number of active connections, helping monitor broker usage and possible overloading.
aws_amazonmq_consumer_count	ConsumerCount	Monitors the number of consumers connected, useful for understanding broker demand and throughput.
aws_amazonmq_cpu_credit_balance	CpuCreditBalance	Tracks the remaining CPU credits, important for ensuring the broker has enough processing power to handle workload.
aws_amazonmq_cpu_utilization	CpuUtilization	Measures the percentage of CPU usage, helping identify potential performance bottlenecks.
aws_amazonmq_current_connections_count	CurrentConnectionsCount	Shows the number of currently connected clients, useful for tracking session loads.
aws_amazonmq_dequeue_count	DequeueCount	Monitors the number of messages dequeued, which helps gauge message consumption activity.
aws_amazonmq_dispatch_count	DispatchCount	Measures the number of messages dispatched to consumers, helping monitor message flow.
aws_amazonmq_enqueue_count	EnqueueCount	Tracks the number of messages enqueued, giving insights into the volume of messages entering the system.
aws_amazonmq_enqueue_time	EnqueueTime	Measures the time taken to enqueue messages, used to monitor latency and performance.
aws_amazonmq_established_connections_count	EstablishedConnectionsCount	Tracks the number of successfully established connections, used to monitor system stability.
aws_amazonmq_exchange_count	ExchangeCount	Monitors the number of exchanges, useful for analyzing message routing activity.
aws_amazonmq_expired_count	ExpiredCount	Tracks the number of messages that have expired without being consumed, useful for monitoring failed message deliveries.
aws_amazonmq_heap_usage	HeapUsage	Measures the heap memory usage of the broker, useful for detecting memory-related performance issues.
aws_amazonmq_in_flight_count	InFlightCount	Monitors the number of messages currently in transit, helping to ensure the broker isn’t overwhelmed by unacknowledged messages.
aws_amazonmq_inactive_durable_topic_subscribers_count	InactiveDurableTopicSubscribersCount	Monitors inactive durable subscribers, useful for tracking unused resources or inefficient topic subscriptions.
aws_amazonmq_job_scheduler_store_percent_usage	JobSchedulerStorePercentUsage	Measures the percentage of the job scheduler store usage, important for capacity planning and performance.
aws_amazonmq_journal_files_for_fast_recovery	JournalFilesForFastRecovery	Monitors the number of journal files available for fast recovery, ensuring quick system recovery.
aws_amazonmq_journal_files_for_full_recovery	JournalFilesForFullRecovery	Tracks journal files required for full recovery, ensuring data durability and integrity during failures.
aws_amazonmq_memory_usage	MemoryUsage	Measures the memory usage of the broker, ensuring the broker has adequate memory for message processing.
aws_amazonmq_message_count	MessageCount	Tracks the total number of messages in the broker, providing insights into message load and storage.
aws_amazonmq_message_ready_count	MessageReadyCount	Monitors the number of messages ready for delivery, helping gauge the efficiency of message consumption.
aws_amazonmq_message_unacknowledged_count	MessageUnacknowledgedCount	Tracks unacknowledged messages, useful for detecting potential message delivery problems.
aws_amazonmq_network_in	NetworkIn	Measures the incoming network traffic, useful for tracking data ingestion and throughput.
aws_amazonmq_network_out	NetworkOut	Measures the outgoing network traffic, helping monitor data egress and bandwidth usage.
aws_amazonmq_open_transaction_count	OpenTransactionCount	Tracks the number of open transactions, useful for identifying resource contention or potential system stalls.
aws_amazonmq_producer_count	ProducerCount	Monitors the number of producers, useful for understanding message production activity in the system.
aws_amazonmq_publish_rate	PublishRate	Measures the rate at which messages are being published, providing insights into message inflow.
aws_amazonmq_queue_count	QueueCount	Tracks the number of active queues, useful for analyzing message distribution across queues.
aws_amazonmq_queue_size	QueueSize	Monitors the size of the message queues, helping gauge message backlog and system load.
aws_amazonmq_rabbit_mqdisk_free	RabbitMQDiskFree	Tracks the available disk space for RabbitMQ, ensuring that there’s enough storage for message persistence.
aws_amazonmq_rabbit_mqdisk_free_limit	RabbitMQDiskFreeLimit	Monitors the disk free space threshold, alerting when approaching critical limits to avoid disruptions.
aws_amazonmq_rabbit_mqfd_used	RabbitMQFdUsed	Tracks the number of file descriptors used by RabbitMQ, ensuring system resources are not exhausted.
aws_amazonmq_rabbit_mqmem_limit	RabbitMQMemLimit	Monitors the memory usage limit for RabbitMQ, ensuring the broker doesn’t run out of memory.
aws_amazonmq_rabbit_mqmem_used	RabbitMQMemUsed	Measures the memory currently in use by RabbitMQ, useful for monitoring resource efficiency.
aws_amazonmq_receive_count	ReceiveCount	Tracks the number of received messages, helping monitor message inflow and processing rates.
aws_amazonmq_store_percent_usage	StorePercentUsage	Monitors the percentage of the store usage, ensuring sufficient capacity for message persistence.
aws_amazonmq_system_cpu_utilization	SystemCpuUtilization	Measures the CPU usage of the underlying system, helping to detect potential CPU bottlenecks.
aws_amazonmq_temp_percent_usage	TempPercentUsage	Monitors the percentage usage of temporary storage, useful for avoiding storage exhaustion during peak loads.
aws_amazonmq_total_consumer_count	TotalConsumerCount	Tracks the total number of consumers, helping assess the overall load and activity on the broker.
aws_amazonmq_total_dequeue_count	TotalDequeueCount	Monitors the total number of dequeued messages, useful for analyzing message consumption rates.
aws_amazonmq_total_enqueue_count	TotalEnqueueCount	Tracks the total number of enqueued messages, providing insights into message production volumes.
aws_amazonmq_total_message_count	TotalMessageCount	Monitors the total count of messages in the system, giving an overview of the message load.
aws_amazonmq_total_producer_count	TotalProducerCount	Tracks the total number of producers, useful for understanding message inflow activity.
aws_amazonmq_volume_read_ops	VolumeReadOps	Measures the number of read operations on the broker’s volume, helping monitor disk I/O performance.
aws_amazonmq_volume_write_ops	VolumeWriteOps	Measures the number of write operations on the broker’s volume, useful for detecting disk I/O bottlenecks.

AWS/ApiGateway

Function: Enables developers to create and manage APIs for accessing data and services

Scrape interval: 5 minutes

Includes: Out-of-the-box dashboard

Metric	CloudWatch metric	Purpose
aws_apigateway_info
aws_apigateway_4xx	4xx	Monitors the number of 4xx client errors, used to track issues related to invalid requests from clients.
aws_apigateway_5xx	5xx	Tracks the number of 5xx server errors, used to monitor API Gateway or backend server issues.
aws_apigateway_count	Count	Measures the total number of API requests, providing insights into traffic volume.
aws_apigateway_integration_latency	IntegrationLatency	Monitors the latency between API Gateway and the backend integration, useful for diagnosing performance issues in backend services.
aws_apigateway_latency	Latency	Tracks overall API latency, including both API Gateway processing and backend integration latency, helping to monitor user experience.
aws_apigateway_4_xxerror	4XXError	Measures the occurrence of 4xx errors (client errors), useful for understanding the rate of client-related issues.
aws_apigateway_5_xxerror	5XXError	Monitors 5xx errors (server errors), used to detect server-side failures in the API Gateway or its backend.
aws_apigateway_cache_hit_count	CacheHitCount	Tracks the number of times API requests were served from the cache, helping to monitor the efficiency of cache usage.
aws_apigateway_cache_miss_count	CacheMissCount	Monitors the number of cache misses, useful for optimizing cache configuration and reducing backend load.
aws_apigateway_client_error	ClientError	Measures errors originating from the client (4xx), used to monitor the rate of invalid requests sent by clients.
aws_apigateway_connect_count	ConnectCount	Tracks the number of successful WebSocket connection requests, providing insights into the usage of WebSocket APIs.
aws_apigateway_data_processed	DataProcessed	Monitors the amount of data processed by the API Gateway, useful for analyzing API data transfer and throughput.
aws_apigateway_execution_error	ExecutionError	Tracks execution errors during the API request process, useful for identifying failures in API execution logic.
aws_apigateway_integration_error	IntegrationError	Monitors errors that occur during integration with backend services, useful for detecting issues in backend communication.
aws_apigateway_message_count	MessageCount	Tracks the number of messages sent and received in WebSocket APIs, useful for monitoring message flow in real-time communication APIs.
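
Several of the counters above are most useful as ratios. The sketch below derives a cache hit ratio from CacheHitCount and CacheMissCount, and a server error rate from 5XXError and Count. The function names and input values are hypothetical, and the inputs are assumed to be Sum statistics taken over the same period.

```python
def cache_hit_ratio(hits, misses):
    """Fraction of cacheable requests served from the cache, or None if idle."""
    total = hits + misses
    if total == 0:
        return None  # No cache lookups in this period, so the ratio is undefined.
    return hits / total


def server_error_rate(server_errors, requests):
    """Fraction of requests that ended in a 5xx error, or None if there was no traffic."""
    if requests == 0:
        return None
    return server_errors / requests


# Hypothetical sums for one 5-minute scrape interval.
hit_ratio = cache_hit_ratio(hits=900, misses=100)
error_rate = server_error_rate(server_errors=5, requests=1000)
```

Returning `None` for an idle period keeps a quiet API from being reported as either 0% or 100%.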

AWS/AppStream

Function: Delivers cloud-based desktops and applications to end-users on any device

Scrape interval: 5 minutes

Metric	CloudWatch metric	Purpose
aws_appstream_info
aws_appstream_actual_capacity	ActualCapacity	Monitors the actual number of available instances for streaming, used to ensure enough resources are deployed.
aws_appstream_available_capacity	AvailableCapacity	Tracks the number of instances available for use but not currently in use, helping to gauge spare capacity for handling future demand.
aws_appstream_capacity_utilization	CapacityUtilization	Measures the percentage of capacity utilization, useful for optimizing resource allocation and ensuring cost-effective usage.
aws_appstream_desired_capacity	DesiredCapacity	Represents the desired number of instances based on scaling policies, helping to monitor scaling efficiency and capacity planning.
aws_appstream_in_use_capacity	InUseCapacity	Tracks the number of instances currently in use, helping to monitor active workload and resource consumption.
aws_appstream_insufficient_capacity_error	InsufficientCapacityError	Measures the number of times a capacity request failed due to insufficient resources, indicating capacity shortages or bottlenecks.
aws_appstream_pending_capacity	PendingCapacity	Monitors instances that are in the process of being provisioned, helping to track the status of scaling events.
aws_appstream_running_capacity	RunningCapacity	Tracks the total number of running instances, providing insights into the active resources currently being used to support users.
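
The capacity metrics above relate to one another. As a rough sketch, and assuming capacity utilization is defined as in-use capacity divided by running capacity (the definition AWS publishes for this metric, not something stated on this page), the percentage can be recomputed from the two raw counts. The function names and the 80% threshold are hypothetical.

```python
def capacity_utilization(in_use, running):
    """Percentage of running instances that are currently in use."""
    if running == 0:
        return 0.0  # Nothing is running, so nothing can be in use.
    return in_use / running * 100


def needs_more_capacity(in_use, running, threshold_percent=80.0):
    """Return True when utilization reaches a hypothetical scaling threshold."""
    return capacity_utilization(in_use, running) >= threshold_percent


# Hypothetical fleet: 20 running instances, 15 of them streaming sessions.
utilization = capacity_utilization(in_use=15, running=20)
```

A check like `needs_more_capacity` is only illustrative; in practice the threshold would come from your own scaling policy.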

AWS/AppSync

Function: Managed service for building GraphQL APIs that connect to data sources such as DynamoDB

Scrape interval: 5 minutes

Metric	CloudWatch metric	Purpose
aws_appsync_info
aws_appsync_4_xxerror	4XXError	Monitors client-side (4xx) errors in requests, useful for tracking invalid requests made by clients.
aws_appsync_5_xxerror	5XXError	Tracks server-side (5xx) errors, helping to detect issues in the API or the server infrastructure.
aws_appsync_active_connections	ActiveConnections	Measures the number of active WebSocket connections, useful for understanding the real-time activity on the AppSync API.
aws_appsync_active_subscriptions	ActiveSubscriptions	Tracks the number of active subscriptions, helping to monitor usage and engagement with subscription-based real-time data services.
aws_appsync_connect_client_error	ConnectClientError	Monitors errors encountered by clients while trying to establish connections, indicating issues in the client-side configuration or request.
aws_appsync_connect_server_error	ConnectServerError	Tracks server-side errors during the connection process, helping to identify server-side failures or misconfigurations during connection attempts.
aws_appsync_connect_success	ConnectSuccess	Measures the successful WebSocket connection attempts, useful for monitoring overall connection success rates.
aws_appsync_connection_duration	ConnectionDuration	Monitors the duration of WebSocket connections, helping to gauge session longevity and user engagement.
aws_appsync_disconnect_client_error	DisconnectClientError	Tracks errors that occur when clients try to disconnect, useful for monitoring client-side disconnection issues.
aws_appsync_disconnect_server_error	DisconnectServerError	Monitors server-side errors during disconnection, helping to detect issues in properly closing WebSocket connections.
aws_appsync_disconnect_success	DisconnectSuccess	Measures successful disconnections from WebSocket connections, useful for ensuring smooth session terminations.
aws_appsync_latency	Latency	Tracks the time taken to process requests, useful for monitoring API performance and identifying latency issues.
aws_appsync_publish_data_message_client_error	PublishDataMessageClientError	Monitors client-side errors during data message publishing, used to detect issues with client-side data transmission.
aws_appsync_publish_data_message_server_error	PublishDataMessageServerError	Tracks server-side errors during data message publishing, helping to identify issues in server-side message handling or transmission.
aws_appsync_publish_data_message_size	PublishDataMessageSize	Measures the size of data messages being published, useful for tracking payload sizes and ensuring efficient message transmission.
aws_appsync_publish_data_message_success	PublishDataMessageSuccess	Tracks successful data message publications, helping to monitor overall message delivery success.
aws_appsync_requests	Requests	Measures the total number of requests processed by AppSync, providing insights into traffic and API usage.
aws_appsync_subscribe_client_error	SubscribeClientError	Monitors client-side errors during subscription attempts, useful for tracking issues in subscribing to real-time data feeds.
aws_appsync_subscribe_server_error	SubscribeServerError	Tracks server-side errors during subscription attempts, helping to identify server failures when clients try to subscribe.
aws_appsync_subscribe_success	SubscribeSuccess	Measures successful subscription attempts, useful for monitoring subscription adoption and engagement rates.
aws_appsync_tokens_consumed	TokensConsumed	Tracks the number of tokens consumed by requests, useful for managing API rate limits and monitoring user activity.
aws_appsync_unsubscribe_client_error	UnsubscribeClientError	Monitors client-side errors during unsubscription attempts, used to detect issues when clients try to unsubscribe from data feeds.
aws_appsync_unsubscribe_server_error	UnsubscribeServerError	Tracks server-side errors during unsubscription attempts, useful for identifying server-side issues when clients try to unsubscribe.
aws_appsync_unsubscribe_success	UnsubscribeSuccess	Measures successful unsubscription attempts, ensuring smooth termination of real-time data subscriptions.

AWS/ApplicationELB

Function: Distributes incoming traffic to targets like EC2 instances, containers, and IP addresses

Scrape interval: 5 minutes

Includes: Out-of-the-box dashboard

Metric	CloudWatch metric	Purpose
aws_applicationelb_info
aws_applicationelb_active_connection_count	ActiveConnectionCount	Monitors the number of active connections, useful for understanding current load on the load balancer.
aws_applicationelb_client_tlsnegotiation_error_count	ClientTLSNegotiationErrorCount	Tracks the number of failed TLS negotiations between clients and the load balancer, used to detect TLS handshake issues.
aws_applicationelb_consumed_lcus	ConsumedLCUs	Measures the number of Load Balancer Capacity Units (LCUs) used, helping to track resource consumption and cost.
aws_applicationelb_elbauth_error	ELBAuthError	Tracks errors during authentication processes, useful for monitoring failures in authentication workflows.
aws_applicationelb_elbauth_failure	ELBAuthFailure	Monitors failed authentication attempts, helping detect potential security issues or configuration problems.
aws_applicationelb_elbauth_latency	ELBAuthLatency	Measures the latency of authentication requests, useful for identifying delays in authentication workflows.
aws_applicationelb_elbauth_refresh_token_success	ELBAuthRefreshTokenSuccess	Tracks successful refresh token requests, useful for monitoring token refresh operations.
aws_applicationelb_elbauth_success	ELBAuthSuccess	Measures successful authentication requests, useful for monitoring authentication performance.
aws_applicationelb_elbauth_user_claims_size_exceeded	ELBAuthUserClaimsSizeExceeded	Monitors instances where user claims exceed the allowed size, which can help in tuning authentication configurations.
aws_applicationelb_httpcode_elb_3_xx_count	HTTPCode_ELB_3XX_Count	Tracks the number of 3xx redirection responses that originate from the load balancer, useful for monitoring redirects issued by the load balancer itself.
aws_applicationelb_httpcode_elb_4_xx_count	HTTPCode_ELB_4XX_Count	Monitors the number of 4xx client error responses that originate from the load balancer, useful for detecting malformed or incomplete client requests.
aws_applicationelb_httpcode_elb_5_xx_count	HTTPCode_ELB_5XX_Count	Tracks the number of 5xx server error responses that originate from the load balancer (not from targets), helping identify load balancer or target connectivity issues.
aws_applicationelb_httpcode_target_2_xx_count	HTTPCode_Target_2XX_Count	Measures the number of successful 2xx responses from targets, useful for tracking successful request handling.
aws_applicationelb_httpcode_target_3_xx_count	HTTPCode_Target_3XX_Count	Monitors the number of 3xx redirects from target servers, useful for understanding traffic redirection by targets.
aws_applicationelb_httpcode_target_4_xx_count	HTTPCode_Target_4XX_Count	Tracks 4xx client errors returned by target servers, helping identify configuration or client-side issues.
aws_applicationelb_httpcode_target_5_xx_count	HTTPCode_Target_5XX_Count	Monitors the number of 5xx errors returned by target servers, useful for identifying server-side issues.
aws_applicationelb_ipv6_processed_bytes	IPv6ProcessedBytes	Measures the number of bytes processed over IPv6, useful for tracking IPv6 traffic volume.
aws_applicationelb_ipv6_request_count	IPv6RequestCount	Tracks the number of IPv6 requests, providing insights into IPv6 usage and adoption.
aws_applicationelb_new_connection_count	NewConnectionCount	Monitors the number of new connections established, helping understand connection initiation patterns.
aws_applicationelb_processed_bytes	ProcessedBytes	Measures the total amount of data processed by the load balancer, useful for tracking overall throughput.
aws_applicationelb_rejected_connection_count	RejectedConnectionCount	Tracks the number of connections rejected by the load balancer, useful for identifying capacity or configuration issues.
aws_applicationelb_request_count	RequestCount	Measures the total number of requests handled by the load balancer, useful for monitoring traffic volume.
aws_applicationelb_rule_evaluations	RuleEvaluations	Tracks the number of rule evaluations on the load balancer, helping to monitor rule complexity and processing time.
aws_applicationelb_target_connection_error_count	TargetConnectionErrorCount	Monitors the number of connection errors to target servers, useful for identifying connectivity issues between the load balancer and targets.
aws_applicationelb_target_response_time	TargetResponseTime	Measures the response time of target servers, helping to track backend performance and latency.
aws_applicationelb_target_tlsnegotiation_error_count	TargetTLSNegotiationErrorCount	Tracks failed TLS negotiations between the load balancer and target servers, useful for detecting SSL/TLS issues with backend services.
aws_applicationelb_anomalous_host_count	AnomalousHostCount	Monitors the number of hosts showing anomalous behavior, helping detect potential security issues or performance outliers.
aws_applicationelb_desync_mitigation_mode_non_compliant_request_count	DesyncMitigationMode_NonCompliant_Request_Count	Tracks non-compliant requests under desync mitigation mode, useful for monitoring and securing application traffic.
aws_applicationelb_dropped_invalid_header_request_count	DroppedInvalidHeaderRequestCount	Monitors requests dropped due to invalid headers, helping identify and fix misconfigurations or potential security risks.
aws_applicationelb_forwarded_invalid_header_request_count	ForwardedInvalidHeaderRequestCount	Tracks invalid header requests that were forwarded, helping detect improper traffic that bypassed filtering.
aws_applicationelb_grpc_request_count	GrpcRequestCount	Measures the number of gRPC requests handled, useful for tracking gRPC-based API traffic.
aws_applicationelb_httpcode_elb_500_count	HTTPCode_ELB_500_Count	Tracks the number of 500 Internal Server Errors from the load balancer, useful for detecting backend or load balancer failures.
aws_applicationelb_httpcode_elb_502_count	HTTPCode_ELB_502_Count	Monitors the number of 502 Bad Gateway errors, indicating backend communication failures.
aws_applicationelb_httpcode_elb_503_count	HTTPCode_ELB_503_Count	Tracks the number of 503 Service Unavailable errors, helping detect capacity or service availability issues.
aws_applicationelb_httpcode_elb_504_count	HTTPCode_ELB_504_Count	Measures the number of 504 Gateway Timeout errors, indicating backend timeouts.
aws_applicationelb_http_fixed_response_count	HTTP_Fixed_Response_Count	Tracks the number of fixed responses sent by the load balancer, useful for monitoring traffic directed to predefined responses.
aws_applicationelb_http_redirect_count	HTTP_Redirect_Count	Monitors the number of HTTP redirects sent by the load balancer, useful for tracking traffic redirection.
aws_applicationelb_http_redirect_url_limit_exceeded_count	HTTP_Redirect_Url_Limit_Exceeded_Count	Tracks instances where the redirect URL limit was exceeded, indicating potential configuration issues.
aws_applicationelb_healthy_host_count	HealthyHostCount	Measures the number of healthy hosts behind the load balancer, helping monitor service availability.
aws_applicationelb_healthy_state_dns	HealthyStateDNS	Monitors DNS health state, useful for ensuring DNS routing functionality.
aws_applicationelb_healthy_state_routing	HealthyStateRouting	Tracks the health of routing decisions by the load balancer, ensuring smooth traffic distribution.
aws_applicationelb_lambda_internal_error	LambdaInternalError	Monitors internal errors in AWS Lambda functions invoked by the load balancer, useful for debugging serverless application issues.
aws_applicationelb_lambda_target_processed_bytes	LambdaTargetProcessedBytes	Measures the bytes processed by Lambda targets, providing insights into data throughput for serverless applications.
aws_applicationelb_lambda_user_error	LambdaUserError	Tracks user-triggered errors in Lambda functions, helping to identify issues in function logic or inputs.
aws_applicationelb_mitigated_host_count	MitigatedHostCount	Monitors the number of hosts mitigated due to anomalies, useful for tracking security incidents.
aws_applicationelb_non_sticky_request_count	NonStickyRequestCount	Measures the number of non-sticky requests handled, helping to monitor session persistence performance.
aws_applicationelb_request_count_per_target	RequestCountPerTarget	Tracks the number of requests processed per target, useful for understanding traffic distribution and load balancing efficiency.
aws_applicationelb_standard_processed_bytes	StandardProcessedBytes	Measures the total amount of bytes processed, useful for tracking data throughput on standard targets.
aws_applicationelb_un_healthy_host_count	UnHealthyHostCount	Monitors the number of unhealthy hosts behind the load balancer, helping to identify availability issues.
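aws_applicationelb_unhealthy_routing_request_count	UnhealthyRoutingRequestCount	Tracks the number of requests routed using the routing failover action (fail open), useful for detecting traffic that is sent despite failing health checks.
aws_applicationelb_unhealthy_state_dns	UnhealthyStateDNS	Monitors the number of zones that do not meet the DNS healthy state requirements, useful for detecting zones marked unhealthy in DNS.
aws_applicationelb_unhealthy_state_routing	UnhealthyStateRouting	Tracks the number of zones that do not meet the routing healthy state requirements, useful for detecting zones that no longer receive traffic.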
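
The metric names in the first column follow a regular pattern relative to the CloudWatch names in the second. The sketch below reproduces that pattern as inferred from the rows in these tables: the namespace is lowercased with `/` replaced by `_`, and an underscore is inserted wherever a lowercase letter or digit is followed by an uppercase letter. This is an assumption based on the visible names, not a confirmed description of how the names are generated, and `prometheus_name` is a hypothetical helper.

```python
import re

# A lowercase letter or digit immediately followed by an uppercase letter.
_CAMEL_BOUNDARY = re.compile(r"([a-z0-9])([A-Z])")


def prometheus_name(namespace, cloudwatch_metric):
    """Guess the metric name for a CloudWatch namespace and metric (inferred rule)."""
    prefix = namespace.lower().replace("/", "_")
    snake = _CAMEL_BOUNDARY.sub(r"\1_\2", cloudwatch_metric).lower()
    return f"{prefix}_{snake}"


# Rows taken from the tables on this page.
elb_tls = prometheus_name("AWS/ApplicationELB", "ClientTLSNegotiationErrorCount")
mq_disk = prometheus_name("AWS/AmazonMQ", "RabbitMQDiskFree")
api_4xx = prometheus_name("AWS/ApiGateway", "4XXError")
```

Because runs of capitals are never split, acronyms merge with the word that follows them, which explains names such as `tlsnegotiation`, `rabbit_mqdisk_free`, and `4_xxerror`.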

AWS/Athena

Function: Interactive query service to analyze data in S3 using SQL

Scrape interval: 5 minutes

Metric	CloudWatch metric	Purpose
aws_athena_info
aws_athena_engine_execution_time	EngineExecutionTime	Measures the time taken by the query engine to execute a query, helping to monitor query performance and identify execution bottlenecks.
aws_athena_processed_bytes	ProcessedBytes	Tracks the amount of data processed by the query engine, useful for understanding query cost and efficiency.
aws_athena_query_planning_time	QueryPlanningTime	Monitors the time taken to plan and prepare the query for execution, helping identify delays during the query planning phase.
aws_athena_query_queue_time	QueryQueueTime	Measures the time a query spends in the queue before execution, useful for monitoring system load and query prioritization issues.
aws_athena_service_processing_time	ServiceProcessingTime	Tracks the time taken by Athena’s internal services to process a query, helping to identify processing delays within the service.
aws_athena_total_execution_time	TotalExecutionTime	Measures the total time from query submission to completion, providing a comprehensive view of query performance and potential bottlenecks.

AWS/AutoScaling

Function: Automatically adjusts capacity to maintain performance and cost efficiency

Scrape interval: 5 minutes

Metric	CloudWatch metric	Purpose
aws_autoscaling_info
aws_autoscaling_group_and_warm_pool_desired_capacity	GroupAndWarmPoolDesiredCapacity	Monitors the desired capacity of both the Auto Scaling group and the warm pool, used to ensure adequate resources are provisioned.
aws_autoscaling_group_and_warm_pool_total_capacity	GroupAndWarmPoolTotalCapacity	Tracks the total capacity of the Auto Scaling group and warm pool, providing an overview of the available resources.
aws_autoscaling_group_desired_capacity	GroupDesiredCapacity	Measures the desired number of instances in the Auto Scaling group, useful for capacity planning and scaling decisions.
aws_autoscaling_group_in_service_capacity	GroupInServiceCapacity	Tracks the number of instances currently in service, helping to monitor the active workload.
aws_autoscaling_group_in_service_instances	GroupInServiceInstances	Monitors the actual number of instances currently running in the group, useful for managing resource availability.
aws_autoscaling_group_max_size	GroupMaxSize	Measures the maximum size of the Auto Scaling group, helping ensure the group does not exceed the defined limit.
aws_autoscaling_group_min_size	GroupMinSize	Tracks the minimum size of the Auto Scaling group, ensuring a baseline level of capacity is maintained.
aws_autoscaling_group_pending_capacity	GroupPendingCapacity	Monitors the capacity of instances that are pending launch, useful for understanding the state of scaling events.
aws_autoscaling_group_pending_instances	GroupPendingInstances	Tracks the number of instances that are pending launch, helping monitor scaling processes in progress.
aws_autoscaling_group_standby_capacity	GroupStandbyCapacity	Measures the capacity of instances in standby mode, useful for tracking inactive but available resources.
aws_autoscaling_group_standby_instances	GroupStandbyInstances	Monitors the number of instances in standby mode, helping assess resource availability for scaling.
aws_autoscaling_group_terminating_capacity	GroupTerminatingCapacity	Tracks the capacity of instances being terminated, helping to monitor scaling down activities.
aws_autoscaling_group_terminating_instances	GroupTerminatingInstances	Monitors the number of instances being terminated, useful for understanding scaling down operations.
aws_autoscaling_group_total_capacity	GroupTotalCapacity	Measures the total capacity of the Auto Scaling group, providing a complete view of resources available for scaling.
aws_autoscaling_group_total_instances	GroupTotalInstances	Tracks the total number of instances in the Auto Scaling group, helping to monitor overall resource allocation.
aws_autoscaling_predictive_scaling_capacity_forecast	PredictiveScalingCapacityForecast	Provides forecasted capacity based on predictive scaling, helping to plan for future resource needs.
aws_autoscaling_predictive_scaling_load_forecast	PredictiveScalingLoadForecast	Tracks forecasted load on the Auto Scaling group, helping to ensure capacity meets future demand.
aws_autoscaling_predictive_scaling_metric_pair_correlation	PredictiveScalingMetricPairCorrelation	Measures the correlation between metric pairs for predictive scaling, useful for improving prediction accuracy.
aws_autoscaling_warm_pool_desired_capacity	WarmPoolDesiredCapacity	Monitors the desired capacity of the warm pool, helping to ensure the pool has sufficient resources for quick scaling.
aws_autoscaling_warm_pool_min_size	WarmPoolMinSize	Tracks the minimum size of the warm pool, ensuring a baseline level of resources for rapid scaling.
aws_autoscaling_warm_pool_pending_capacity	WarmPoolPendingCapacity	Measures the capacity of instances pending in the warm pool, useful for understanding warm pool availability.
aws_autoscaling_warm_pool_terminating_capacity	WarmPoolTerminatingCapacity	Monitors the capacity of instances being terminated in the warm pool, helping to track scaling down activities.
aws_autoscaling_warm_pool_total_capacity	WarmPoolTotalCapacity	Tracks the total capacity of the warm pool, providing a complete view of available resources for quick scaling.
aws_autoscaling_warm_pool_warmed_capacity	WarmPoolWarmedCapacity	Measures the capacity of warmed instances in the warm pool, useful for tracking resources that are ready for immediate use.

AWS/Backup

Function: Centralized backup service to automate and manage backups across AWS services

Scrape interval: 5 minutes

Metric	CloudWatch metric	Purpose
aws_backup_info
aws_backup_number_of_backup_jobs_aborted	NumberOfBackupJobsAborted	Tracks the number of backup jobs that were aborted, useful for monitoring failed or incomplete backup operations.
aws_backup_number_of_backup_jobs_completed	NumberOfBackupJobsCompleted	Measures the number of backup jobs successfully completed, useful for tracking the effectiveness of backup operations.
aws_backup_number_of_backup_jobs_created	NumberOfBackupJobsCreated	Tracks the total number of backup jobs initiated, helping to monitor backup frequency and schedule adherence.
aws_backup_number_of_backup_jobs_expired	NumberOfBackupJobsExpired	Monitors the number of backup jobs that have expired, useful for ensuring data retention policies are followed.
aws_backup_number_of_backup_jobs_failed	NumberOfBackupJobsFailed	Measures the number of backup jobs that have failed, useful for identifying errors in the backup process.
aws_backup_number_of_backup_jobs_pending	NumberOfBackupJobsPending	Tracks the number of backup jobs currently in a pending state, helping monitor delays or scheduling issues.
aws_backup_number_of_backup_jobs_running	NumberOfBackupJobsRunning	Monitors the number of backup jobs that are currently running, useful for tracking ongoing backup processes.
aws_backup_number_of_copy_jobs_completed	NumberOfCopyJobsCompleted	Measures the number of copy jobs successfully completed, helping track backup data replication across regions or storage tiers.
aws_backup_number_of_copy_jobs_created	NumberOfCopyJobsCreated	Tracks the number of initiated copy jobs, useful for monitoring data replication schedules.
aws_backup_number_of_copy_jobs_failed	NumberOfCopyJobsFailed	Monitors the number of failed copy jobs, helping to detect issues with backup replication processes.
aws_backup_number_of_copy_jobs_running	NumberOfCopyJobsRunning	Tracks the number of copy jobs currently in progress, useful for monitoring ongoing replication activities.
aws_backup_number_of_recovery_points_cold	NumberOfRecoveryPointsCold	Measures the number of cold (archived) recovery points, useful for tracking long-term storage of backup data.
aws_backup_number_of_recovery_points_completed	NumberOfRecoveryPointsCompleted	Tracks the total number of recovery points successfully created, helping to ensure that data can be restored when needed.
aws_backup_number_of_recovery_points_deleting	NumberOfRecoveryPointsDeleting	Monitors the number of recovery points being deleted, useful for tracking clean-up or retention policy actions.
aws_backup_number_of_recovery_points_expired	NumberOfRecoveryPointsExpired	Measures the number of expired recovery points, useful for ensuring compliance with retention policies.
aws_backup_number_of_recovery_points_partial	NumberOfRecoveryPointsPartial	Tracks the number of incomplete (partial) recovery points, helping to identify issues with backup integrity or storage capacity.
aws_backup_number_of_restore_jobs_completed	NumberOfRestoreJobsCompleted	Measures the number of successful restore jobs, useful for tracking data recovery operations.
aws_backup_number_of_restore_jobs_failed	NumberOfRestoreJobsFailed	Monitors the number of restore jobs that have failed, useful for identifying problems in the recovery process.
aws_backup_number_of_restore_jobs_pending	NumberOfRestoreJobsPending	Tracks the number of restore jobs that are pending, useful for monitoring delays in data recovery.
aws_backup_number_of_restore_jobs_running	NumberOfRestoreJobsRunning	Monitors the number of restore jobs currently in progress, helping to track ongoing recovery processes.

AWS/Billing

Function: Provides detailed usage and cost data for AWS services. AWS publishes billing metrics only in a specific region, so any job configured with this service gathers data only from the us-east-1 region.

Scrape interval: 5 minutes

Includes: Out-of-the-box dashboard

Metric	Cloudwatch metric	Purpose
aws_billing_estimated_charges	EstimatedCharges	Tracks the estimated charges for your AWS account, providing insights into overall AWS cost and usage. This is useful for budget monitoring and cost management over time, helping to identify cost spikes or unusual charges.

AWS/Cassandra

Function: Managed Apache Cassandra-compatible database service

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_cassandra_info
aws_cassandra_account_max_reads	AccountMaxReads	Tracks the maximum number of read requests for the account, helping monitor and manage read activity and limits.
aws_cassandra_account_max_table_level_reads	AccountMaxTableLevelReads	Measures the maximum number of reads at the table level, useful for understanding read distribution across tables.
aws_cassandra_account_max_table_level_writes	AccountMaxTableLevelWrites	Tracks the maximum number of write operations at the table level, helping identify write-heavy tables.
aws_cassandra_account_max_writes	AccountMaxWrites	Measures the maximum number of write requests for the account, useful for managing overall write throughput.
aws_cassandra_account_provisioned_read_capacity_utilization	AccountProvisionedReadCapacityUtilization	Monitors the utilization of provisioned read capacity, helping ensure optimal read capacity allocation.
aws_cassandra_account_provisioned_write_capacity_utilization	AccountProvisionedWriteCapacityUtilization	Tracks the utilization of provisioned write capacity, ensuring efficient use of write resources.
aws_cassandra_conditional_check_failed_requests	ConditionalCheckFailedRequests	Measures the number of failed conditional checks, useful for monitoring logical errors during write operations.
aws_cassandra_consumed_read_capacity_units	ConsumedReadCapacityUnits	Tracks the number of read capacity units consumed, helping monitor read activity and optimize capacity.
aws_cassandra_consumed_write_capacity_units	ConsumedWriteCapacityUnits	Monitors the number of write capacity units consumed, providing insights into write operations and capacity optimization.
aws_cassandra_max_provisioned_table_read_capacity_utilization	MaxProvisionedTableReadCapacityUtilization	Tracks the maximum utilization of provisioned read capacity at the table level, helping manage read resources per table.
aws_cassandra_max_provisioned_table_write_capacity_utilization	MaxProvisionedTableWriteCapacityUtilization	Monitors the maximum utilization of provisioned write capacity at the table level, ensuring efficient use of write resources per table.
aws_cassandra_returned_item_count	ReturnedItemCount	Measures the total number of items returned by read operations, useful for understanding query efficiency.
aws_cassandra_returned_item_count_by_select	ReturnedItemCountBySelect	Tracks the number of items returned by select queries, helping optimize query results and performance.
aws_cassandra_successful_request_count	SuccessfulRequestCount	Monitors the number of successful requests, providing insights into the operational success rate of read and write operations.
aws_cassandra_successful_request_latency	SuccessfulRequestLatency	Measures the latency of successful requests, helping to optimize performance and identify bottlenecks.
aws_cassandra_system_errors	SystemErrors	Tracks the number of system-related errors, useful for identifying and addressing infrastructure or service issues.
aws_cassandra_user_errors	UserErrors	Monitors the number of user-related errors, helping identify application-level issues or misconfigurations.

AWS/CertificateManager

Function: Manages the provisioning, renewal, and deployment of SSL/TLS certificates

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_certificatemanager_info
aws_certificatemanager_days_to_expiry	DaysToExpiry	Tracks the number of days remaining until an SSL/TLS certificate expires. This metric is useful for monitoring certificate lifecycles and ensuring that certificates are renewed before expiration to avoid service disruptions.
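
As a minimal sketch of how the DaysToExpiry values might be used, the following Python function flags certificates that are close to expiring. The certificate identifiers, the sample values, and the 30-day threshold are hypothetical and chosen only for illustration; they are not defaults of this service.

```python
def expiring_certificates(days_to_expiry: dict[str, float],
                          threshold_days: float = 30) -> list[str]:
    """Return certificate identifiers at or below the threshold, soonest first."""
    # Keep only certificates whose remaining days are within the threshold.
    soon = [(days, cert) for cert, days in days_to_expiry.items()
            if days <= threshold_days]
    # Sort by remaining days so the most urgent certificate comes first.
    return [cert for days, cert in sorted(soon)]


# Hypothetical sample values of aws_certificatemanager_days_to_expiry.
sample = {"cert-a": 12, "cert-b": 200, "cert-c": 3}
print(expiring_certificates(sample))  # ['cert-c', 'cert-a']
```

In practice the same comparison would more likely live in an alert rule; the function only shows the threshold logic.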

AWS/CloudFront

Function: Content delivery network that delivers data, videos, and applications globally

Scrape interval: 5 minutes

Includes: Out-of-the-box dashboard

Metric	Cloudwatch metric	Purpose
aws_cloudfront_info
aws_cloudfront_4xx_error_rate	4xxErrorRate	Tracks the rate of 4xx client-side errors, helping to monitor user request issues.
aws_cloudfront_5xx_error_rate	5xxErrorRate	Tracks the rate of 5xx server-side errors, useful for detecting backend or CloudFront issues.
aws_cloudfront_bytes_downloaded	BytesDownloaded	Measures the total bytes downloaded via CloudFront, useful for monitoring bandwidth usage.
aws_cloudfront_bytes_uploaded	BytesUploaded	Monitors the amount of data uploaded to CloudFront, helping track upload activity.
aws_cloudfront_requests	Requests	Tracks the total number of requests processed by CloudFront, providing insight into traffic volume.
aws_cloudfront_total_error_rate	TotalErrorRate	Measures the combined rate of all error responses (both 4xx and 5xx), helping monitor service reliability.
aws_cloudfront_401_error_rate	401ErrorRate	Tracks the rate of 401 Unauthorized errors, useful for monitoring authentication issues.
aws_cloudfront_403_error_rate	403ErrorRate	Monitors the rate of 403 Forbidden errors, helping to detect access control issues.
aws_cloudfront_404_error_rate	404ErrorRate	Measures the rate of 404 Not Found errors, useful for tracking invalid requests or missing resources.
aws_cloudfront_502_error_rate	502ErrorRate	Tracks the rate of 502 Bad Gateway errors, indicating backend server or network issues.
aws_cloudfront_503_error_rate	503ErrorRate	Monitors the rate of 503 Service Unavailable errors, helping to detect capacity or availability issues.
aws_cloudfront_504_error_rate	504ErrorRate	Tracks the rate of 504 Gateway Timeout errors, indicating backend server delays.
aws_cloudfront_cache_hit_rate	CacheHitRate	Measures the percentage of requests served from CloudFront’s cache, useful for optimizing content delivery efficiency.
aws_cloudfront_function_compute_utilization	FunctionComputeUtilization	Tracks the compute utilization of CloudFront Functions, helping to monitor resource usage for custom code execution.
aws_cloudfront_function_execution_errors	FunctionExecutionErrors	Monitors the number of execution errors in CloudFront Functions, helping to identify failures in custom logic.
aws_cloudfront_function_invocations	FunctionInvocations	Tracks the total number of CloudFront Function invocations, useful for monitoring function usage.
aws_cloudfront_function_throttles	FunctionThrottles	Measures throttled CloudFront Function invocations, indicating capacity or rate-limiting issues.
aws_cloudfront_function_validation_errors	FunctionValidationErrors	Tracks validation errors for CloudFront Functions, useful for debugging incorrect function configurations.
aws_cloudfront_lambda_execution_error	LambdaExecutionError	Monitors errors during Lambda@Edge function execution, useful for identifying issues with serverless logic.
aws_cloudfront_lambda_limit_exceeded_errors	LambdaLimitExceededErrors	Tracks instances where Lambda@Edge functions exceed their resource limits, helping detect performance bottlenecks.
aws_cloudfront_lambda_validation_error	LambdaValidationError	Measures Lambda@Edge validation errors, useful for ensuring proper configuration.
aws_cloudfront_origin_latency	OriginLatency	Tracks the latency from CloudFront to the origin server, helping to identify performance bottlenecks in origin server communication.

AWS/Cognito

Function: Provides authentication, authorization, and user management for web and mobile apps

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_cognito_info
aws_cognito_account_take_over_risk	AccountTakeOverRisk	Tracks the risk of account takeover attempts, useful for detecting malicious login attempts.
aws_cognito_compromised_credentials_risk	CompromisedCredentialsRisk	Monitors the risk of compromised credentials, helping to detect and mitigate security threats.
aws_cognito_federation_successes	FederationSuccesses	Tracks the number of successful federated sign-ins, useful for monitoring third-party identity provider usage.
aws_cognito_federation_throttles	FederationThrottles	Measures the number of throttled federation sign-in attempts, useful for identifying rate-limiting issues.
aws_cognito_no_risk	NoRisk	Tracks the number of no-risk sign-ins, indicating successful and secure login attempts.
aws_cognito_override_block	OverrideBlock	Tracks requests that Amazon Cognito blocked because of the risk configuration set by the developer, useful for auditing how configured block rules affect sign-in traffic.
aws_cognito_risk	Risk	Tracks general login risk events, helping to monitor suspicious activity.
aws_cognito_sign_in_successes	SignInSuccesses	Tracks the number of successful sign-ins, helping to monitor user authentication success.
aws_cognito_sign_in_throttles	SignInThrottles	Measures the number of throttled sign-in attempts, useful for detecting excessive login activity or rate-limiting.
aws_cognito_sign_up_successes	SignUpSuccesses	Tracks successful user sign-ups, providing insight into account creation trends.
aws_cognito_sign_up_throttles	SignUpThrottles	Measures throttled sign-up attempts, useful for identifying potential rate-limiting or abuse during account creation.
aws_cognito_token_refresh_successes	TokenRefreshSuccesses	Tracks the number of successful token refreshes, useful for monitoring user session continuity.
aws_cognito_token_refresh_throttles	TokenRefreshThrottles	Monitors the number of throttled token refresh requests, helping identify rate-limiting or session issues.

AWS/DDoSProtection

Function: Protects against distributed denial of service attacks with AWS Shield

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_ddosprotection_info
aws_ddosprotection_ddo_sattack_bits_per_second	DDoSAttackBitsPerSecond	Monitors the volume of a DDoS attack in terms of data transfer per second, useful for detecting bandwidth-based attacks.
aws_ddosprotection_ddo_sattack_packets_per_second	DDoSAttackPacketsPerSecond	Tracks the number of packets involved in a DDoS attack per second, helping to identify packet flood attacks.
aws_ddosprotection_ddo_sattack_requests_per_second	DDoSAttackRequestsPerSecond	Monitors the number of requests in a DDoS attack per second, useful for identifying application-layer DDoS attacks.
aws_ddosprotection_ddo_sdetected	DDoSDetected	Tracks the detection of DDoS attacks, providing alerts when a potential attack is detected.
aws_ddosprotection_volume_bits_per_second	VolumeBitsPerSecond	Monitors the data transfer volume per second during a DDoS attack, helping to understand the scale of the attack.
aws_ddosprotection_volume_packets_per_second	VolumePacketsPerSecond	Measures the volume of packets per second, useful for tracking the size of DDoS attacks in terms of packet rate.
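
Names such as aws_ddosprotection_ddo_sattack_bits_per_second look unusual because of how the CloudWatch metric name is converted. The following Python sketch shows one rule that reproduces the names listed in these tables. It is inferred from the table rows, not taken from a documented algorithm, and the function name to_metric_name is hypothetical.

```python
import re

# Inferred rule: insert "_" between a lowercase letter or digit and the
# uppercase letter that follows it, then lowercase the whole name.
_BOUNDARY = re.compile(r"([a-z0-9])([A-Z])")


def to_metric_name(namespace: str, cloudwatch_metric: str) -> str:
    """Build a metric name from a CloudWatch namespace and metric name."""
    # The namespace is only lowercased, with "/" replaced by "_".
    prefix = namespace.lower().replace("/", "_")
    suffix = _BOUNDARY.sub(r"\1_\2", cloudwatch_metric).lower()
    return f"{prefix}_{suffix}"


print(to_metric_name("AWS/DDoSProtection", "DDoSAttackBitsPerSecond"))
# aws_ddosprotection_ddo_sattack_bits_per_second
print(to_metric_name("AWS/DMS", "CDCChangesDiskSource"))
# aws_dms_cdcchanges_disk_source
```

Under this rule, runs of capitals such as CDC or CPU stay joined to the word that follows them, which is why the tables show cdcchanges and cpuutilization rather than cdc_changes and cpu_utilization.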

AWS/DMS

Function: Migrates databases to AWS with minimal downtime

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_dms_info
aws_dms_cdcchanges_disk_source	CDCChangesDiskSource	Tracks the number of rows accumulating on disk and waiting to be committed from the source during Change Data Capture (CDC) operations, useful for spotting a backlog that has spilled from memory to disk.
aws_dms_cdcchanges_disk_target	CDCChangesDiskTarget	Monitors the number of rows accumulating on disk and waiting to be committed to the target during CDC, useful for detecting a target that cannot keep up with incoming changes.
aws_dms_cdcchanges_memory_source	CDCChangesMemorySource	Tracks the number of rows accumulating in memory and waiting to be committed from the source during CDC operations, helping monitor the in-memory change backlog.
aws_dms_cdcchanges_memory_target	CDCChangesMemoryTarget	Monitors the number of rows accumulating in memory and waiting to be committed to the target during CDC operations, useful for tracking the pending change backlog on the target side.
aws_dms_cdcincoming_changes	CDCIncomingChanges	Measures the number of incoming changes during CDC operations, helping to monitor the rate of data changes.
aws_dms_cdclatency_source	CDCLatencySource	Tracks latency on the source side during CDC operations, helping to identify performance issues with data changes.
aws_dms_cdclatency_target	CDCLatencyTarget	Monitors the latency on the target side during CDC operations, useful for tracking potential bottlenecks.
aws_dms_cdcthroughput_bandwidth_source	CDCThroughputBandwidthSource	Measures the source bandwidth usage during CDC operations, helping to monitor network usage.
aws_dms_cdcthroughput_bandwidth_target	CDCThroughputBandwidthTarget	Monitors the target bandwidth usage during CDC, useful for tracking data transfer rates.
aws_dms_cdcthroughput_rows_source	CDCThroughputRowsSource	Tracks the number of rows processed from the source during CDC operations, useful for monitoring data throughput.
aws_dms_cdcthroughput_rows_target	CDCThroughputRowsTarget	Monitors the number of rows written to the target during CDC, helping to ensure data is migrated efficiently.
aws_dms_cpuutilization	CPUUtilization	Measures the CPU usage of DMS instances, helping to ensure that the system has enough resources to perform migrations.
aws_dms_free_storage_space	FreeStorageSpace	Tracks the amount of free storage available on the DMS instance, useful for preventing storage exhaustion during migrations.
aws_dms_freeable_memory	FreeableMemory	Monitors the available memory on the DMS instance, useful for ensuring that enough memory is available for operations.
aws_dms_full_load_throughput_bandwidth_source	FullLoadThroughputBandwidthSource	Tracks bandwidth usage during full load operations on the source, useful for monitoring network utilization.
aws_dms_full_load_throughput_bandwidth_target	FullLoadThroughputBandwidthTarget	Monitors bandwidth usage during full load operations on the target, helping track data transfer efficiency.
aws_dms_full_load_throughput_rows_source	FullLoadThroughputRowsSource	Tracks the number of rows processed from the source during full load migrations, helping to monitor data throughput.
aws_dms_full_load_throughput_rows_target	FullLoadThroughputRowsTarget	Monitors the number of rows loaded to the target during full load operations, helping to ensure migration progress.
aws_dms_network_receive_throughput	NetworkReceiveThroughput	Tracks the network receive rate, helping to monitor inbound network performance during migrations.
aws_dms_network_transmit_throughput	NetworkTransmitThroughput	Measures the network transmit rate, useful for monitoring outbound network performance.
aws_dms_read_iops	ReadIOPS	Tracks the number of read operations per second, helping to monitor disk read performance.
aws_dms_read_latency	ReadLatency	Measures the latency of read operations, helping to identify performance issues in disk reads.
aws_dms_read_throughput	ReadThroughput	Monitors the throughput of read operations, useful for tracking how much data is being read during migrations.
aws_dms_swap_usage	SwapUsage	Tracks the amount of swap space used, helping monitor memory performance.
aws_dms_write_iops	WriteIOPS	Measures the number of write operations per second, useful for monitoring disk write performance.
aws_dms_write_latency	WriteLatency	Tracks the latency of write operations, helping identify performance issues during data writes.
aws_dms_write_throughput	WriteThroughput	Monitors the throughput of write operations, helping to understand the speed of data writes during migration operations.

AWS/DX

Function: Provides a dedicated network connection to AWS (AWS Direct Connect)

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_dx_info
aws_dx_connection_bps_egress	ConnectionBpsEgress	Measures the egress bandwidth (bits per second) for Direct Connect connections, helping monitor outbound data transfer.
aws_dx_connection_bps_ingress	ConnectionBpsIngress	Monitors the ingress bandwidth (bits per second), providing insights into inbound data transfer rates.
aws_dx_connection_crcerror_count	ConnectionCRCErrorCount	Tracks CRC errors on the connection, useful for identifying data integrity issues or hardware problems.
aws_dx_connection_encryption_state	ConnectionEncryptionState	Monitors the encryption state of Direct Connect connections, helping ensure secure data transfer.
aws_dx_connection_error_count	ConnectionErrorCount	Tracks the number of errors on the Direct Connect connection, useful for diagnosing connectivity issues.
aws_dx_connection_light_level_rx	ConnectionLightLevelRx	Measures the received light level, helping monitor the health of fiber optic connections.
aws_dx_connection_light_level_tx	ConnectionLightLevelTx	Tracks the transmitted light level, helping ensure proper signal strength in fiber optic connections.
aws_dx_connection_pps_egress	ConnectionPpsEgress	Monitors the number of packets per second being transmitted (egress), useful for tracking network traffic patterns.
aws_dx_connection_pps_ingress	ConnectionPpsIngress	Tracks the number of packets per second being received (ingress), useful for understanding inbound traffic load.
aws_dx_connection_state	ConnectionState	Monitors the operational state of Direct Connect connections, helping to detect connection status changes.
aws_dx_virtual_interface_bps_egress	VirtualInterfaceBpsEgress	Measures the outbound bandwidth usage for virtual interfaces, helping track the data flow from virtual interfaces.
aws_dx_virtual_interface_bps_ingress	VirtualInterfaceBpsIngress	Monitors inbound bandwidth usage for virtual interfaces, providing insight into data ingress through virtual interfaces.
aws_dx_virtual_interface_pps_egress	VirtualInterfacePpsEgress	Tracks the number of outbound packets per second for virtual interfaces, helping monitor packet-based traffic.
aws_dx_virtual_interface_pps_ingress	VirtualInterfacePpsIngress	Measures the number of inbound packets per second for virtual interfaces, useful for monitoring packet-level ingress.

AWS/DocDB

Function: Managed document database service that supports MongoDB workloads

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_docdb_info
aws_docdb_backup_retention_period_storage_used	BackupRetentionPeriodStorageUsed	Tracks the amount of storage used for backup retention, helping manage backup costs and storage.
aws_docdb_buffer_cache_hit_ratio	BufferCacheHitRatio	Monitors the cache hit ratio, helping to ensure data is being effectively cached.
aws_docdb_cpuutilization	CPUUtilization	Measures the CPU usage of the database, useful for monitoring resource consumption.
aws_docdb_change_stream_log_size	ChangeStreamLogSize	Tracks the size of the change stream log, helping monitor the volume of changes being processed.
aws_docdb_dbcluster_replica_lag_maximum	DBClusterReplicaLagMaximum	Monitors the maximum replication lag between the primary and replica nodes in the cluster.
aws_docdb_dbcluster_replica_lag_minimum	DBClusterReplicaLagMinimum	Tracks the minimum replication lag, helping ensure data replication is kept in sync.
aws_docdb_dbinstance_replica_lag	DBInstanceReplicaLag	Monitors replication lag at the instance level, useful for tracking data consistency across instances.
aws_docdb_database_connections	DatabaseConnections	Tracks the number of active connections to the database, helping monitor connection load.
aws_docdb_database_connections_max	DatabaseConnectionsMax	Monitors the maximum number of open database connections on an instance within a one-minute period, helping to spot connection spikes before they exhaust the connection limit.
aws_docdb_database_cursors	DatabaseCursors	Tracks the number of database cursors in use, helping monitor query processing.
aws_docdb_database_cursors_max	DatabaseCursorsMax	Monitors the maximum number of open cursors on an instance within a one-minute period, useful for spotting cursor spikes before they reach resource limits.
aws_docdb_database_cursors_timed_out	DatabaseCursorsTimedOut	Tracks cursors that have timed out, helping identify performance issues.
aws_docdb_disk_queue_depth	DiskQueueDepth	Measures the depth of the disk I/O queue, useful for monitoring disk performance.
aws_docdb_documents_deleted	DocumentsDeleted	Tracks the number of documents deleted, helping to monitor data deletion operations.
aws_docdb_documents_inserted	DocumentsInserted	Measures the number of documents inserted, helping to track data growth in the database.
aws_docdb_documents_returned	DocumentsReturned	Tracks the number of documents returned by queries, useful for monitoring query performance.
aws_docdb_documents_updated	DocumentsUpdated	Measures the number of documents updated, helping track changes in the database.
aws_docdb_engine_uptime	EngineUptime	Monitors the total uptime of the database engine, useful for tracking availability.
aws_docdb_free_local_storage	FreeLocalStorage	Tracks the amount of free storage on the database node, helping to prevent storage exhaustion.
aws_docdb_freeable_memory	FreeableMemory	Monitors the amount of free memory, useful for ensuring sufficient memory availability.
aws_docdb_network_receive_throughput	NetworkReceiveThroughput	Measures the amount of data being received by the database, useful for tracking inbound network usage.
aws_docdb_network_throughput	NetworkThroughput	Monitors overall network throughput, helping track both inbound and outbound traffic.
aws_docdb_network_transmit_throughput	NetworkTransmitThroughput	Measures the amount of data being transmitted from the database, helping track outbound traffic.
aws_docdb_opcounters_command	OpcountersCommand	Tracks the number of database commands executed, useful for monitoring operational throughput.
aws_docdb_opcounters_delete	OpcountersDelete	Monitors the number of delete operations, useful for tracking data modifications.
aws_docdb_opcounters_getmore	OpcountersGetmore	Measures the number of getMore operations, useful for monitoring pagination in queries.
aws_docdb_opcounters_insert	OpcountersInsert	Tracks the number of insert operations, helping monitor data insert performance.
aws_docdb_opcounters_query	OpcountersQuery	Monitors the number of queries executed, useful for tracking query load.
aws_docdb_opcounters_update	OpcountersUpdate	Measures the number of update operations, helping monitor data modifications in the database.
aws_docdb_read_iops	ReadIOPS	Tracks the number of input/output operations per second for reads, helping to monitor read performance.
aws_docdb_read_latency	ReadLatency	Measures the latency of read operations, helping to identify performance issues with data retrieval.
aws_docdb_read_throughput	ReadThroughput	Monitors the rate of data being read from the database, useful for tracking read performance.
aws_docdb_snapshot_storage_used	SnapshotStorageUsed	Tracks the amount of storage used for database snapshots, helping manage backup storage costs.
aws_docdb_swap_usage	SwapUsage	Monitors the amount of swap space used, helping track memory efficiency.
aws_docdb_total_backup_storage_billed	TotalBackupStorageBilled	Tracks the amount of backup storage billed, useful for understanding backup costs.
aws_docdb_volume_bytes_used	VolumeBytesUsed	Measures the amount of storage volume in use, helping track database storage usage.
aws_docdb_volume_read_iops	VolumeReadIOPs	Tracks the number of read input/output operations per second on the storage volume, useful for monitoring storage performance.
aws_docdb_volume_write_iops	VolumeWriteIOPs	Measures the number of write I/O operations per second, helping monitor write performance on the storage volume.
aws_docdb_write_iops	WriteIOPS	Tracks the number of write operations per second, useful for tracking write throughput.
aws_docdb_write_latency	WriteLatency	Measures the latency of write operations, helping to identify performance bottlenecks during data insertion or updates.
aws_docdb_write_throughput	WriteThroughput	Monitors the rate at which data is written to the database, useful for understanding write performance.

AWS/DynamoDB

Function: Fully managed NoSQL database service for low-latency applications at scale

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_dynamodb_info
aws_dynamodb_account_max_reads	AccountMaxReads	Monitors the maximum number of reads across all tables in the account, helping track overall read activity.
aws_dynamodb_account_max_table_level_reads	AccountMaxTableLevelReads	Tracks the maximum reads at the table level, helping to identify read-heavy tables.
aws_dynamodb_account_max_table_level_writes	AccountMaxTableLevelWrites	Measures the maximum number of writes at the table level, useful for identifying write-intensive tables.
aws_dynamodb_account_max_writes	AccountMaxWrites	Tracks the maximum number of writes across all tables in the account, helping monitor write throughput.
aws_dynamodb_account_provisioned_read_capacity_utilization	AccountProvisionedReadCapacityUtilization	Monitors the utilization of the provisioned read capacity, helping ensure sufficient read capacity allocation.
aws_dynamodb_account_provisioned_write_capacity_utilization	AccountProvisionedWriteCapacityUtilization	Tracks the utilization of the provisioned write capacity, useful for efficient capacity management.
aws_dynamodb_age_of_oldest_unreplicated_record	AgeOfOldestUnreplicatedRecord	Measures the age of the oldest unreplicated record, helping track replication lag.
aws_dynamodb_conditional_check_failed_requests	ConditionalCheckFailedRequests	Tracks the number of failed conditional checks, useful for identifying logical issues during write operations.
aws_dynamodb_consumed_change_data_capture_units	ConsumedChangeDataCaptureUnits	Measures the number of consumed Change Data Capture units, helping monitor CDC-based operations.
aws_dynamodb_consumed_read_capacity_units	ConsumedReadCapacityUnits	Monitors the total read capacity units consumed, helping track and optimize read operations.
aws_dynamodb_consumed_write_capacity_units	ConsumedWriteCapacityUnits	Measures the total write capacity units consumed, useful for monitoring and optimizing write operations.
aws_dynamodb_failed_to_replicate_record_count	FailedToReplicateRecordCount	Tracks the number of records that failed to replicate, useful for identifying replication issues.
aws_dynamodb_max_provisioned_table_read_capacity_utilization	MaxProvisionedTableReadCapacityUtilization	Measures the maximum utilization of the provisioned read capacity at the table level, useful for understanding table-specific read activity.
aws_dynamodb_max_provisioned_table_write_capacity_utilization	MaxProvisionedTableWriteCapacityUtilization	Tracks the maximum utilization of provisioned write capacity at the table level, helping optimize write capacity.
aws_dynamodb_on_demand_max_read_request_units	OnDemandMaxReadRequestUnits	Monitors the maximum number of read request units in on-demand mode, useful for managing scaling costs.
aws_dynamodb_on_demand_max_write_request_units	OnDemandMaxWriteRequestUnits	Tracks the maximum number of write request units in on-demand mode, helping optimize scaling and cost management.
aws_dynamodb_online_index_consumed_write_capacity	OnlineIndexConsumedWriteCapacity	Measures the write capacity consumed by online index builds, useful for tracking index creation overhead.
aws_dynamodb_online_index_percentage_progress	OnlineIndexPercentageProgress	Monitors the progress of online index creation, useful for understanding index build status.
aws_dynamodb_online_index_throttle_events	OnlineIndexThrottleEvents	Tracks throttle events during online index creation, useful for detecting capacity constraints.
aws_dynamodb_pending_replication_count	PendingReplicationCount	Monitors the number of records pending replication, useful for tracking replication progress.
aws_dynamodb_provisioned_read_capacity_units	ProvisionedReadCapacityUnits	Tracks the total provisioned read capacity units, useful for managing resource allocation.
aws_dynamodb_provisioned_write_capacity_units	ProvisionedWriteCapacityUnits	Monitors the total provisioned write capacity units, helping ensure proper capacity allocation.
aws_dynamodb_read_throttle_events	ReadThrottleEvents	Measures the number of throttled read requests, useful for identifying capacity limitations.
aws_dynamodb_replication_latency	ReplicationLatency	Tracks the replication latency, helping ensure timely data consistency across replicas.
aws_dynamodb_returned_bytes	ReturnedBytes	Monitors the number of bytes returned by GetRecords operations on DynamoDB Streams, useful for tracking stream read volume.
aws_dynamodb_returned_item_count	ReturnedItemCount	Measures the total number of items returned by read operations, useful for monitoring query performance.
aws_dynamodb_returned_records_count	ReturnedRecordsCount	Tracks the number of stream records returned by GetRecords operations on DynamoDB Streams, useful for understanding stream consumer load.
aws_dynamodb_successful_request_latency	SuccessfulRequestLatency	Monitors the latency of successful requests, useful for optimizing request performance.
aws_dynamodb_system_errors	SystemErrors	Tracks system-level errors, helping identify infrastructure or platform issues.
aws_dynamodb_throttled_put_record_count	ThrottledPutRecordCount	Monitors the number of records throttled by the Kinesis data stream because of insufficient Kinesis Data Streams capacity, useful for sizing the stream that receives change data.
aws_dynamodb_throttled_requests	ThrottledRequests	Tracks the total number of throttled requests, helping to identify capacity limitations or traffic spikes.
aws_dynamodb_time_to_live_deleted_item_count	TimeToLiveDeletedItemCount	Measures the number of items deleted due to Time to Live (TTL) expiration, useful for managing automatic data deletion.
aws_dynamodb_transaction_conflict	TransactionConflict	Monitors the number of transaction conflicts, helping to optimize transaction performance.
aws_dynamodb_user_errors	UserErrors	Tracks user-level errors, helping identify application issues.
aws_dynamodb_write_throttle_events	WriteThrottleEvents	Monitors the number of throttled write requests, useful for identifying capacity constraints during write operations.

AWS/EBS

Function: Block storage for use with EC2 instances

Scrape interval: 5 minutes

Includes: Out-of-the-box dashboard

Metric	Cloudwatch metric	Purpose
aws_ebs_info
aws_ebs_volume_read_bytes	VolumeReadBytes	Measures the total bytes read from the EBS volume, useful for monitoring data retrieval activity.
aws_ebs_volume_write_bytes	VolumeWriteBytes	Tracks the total bytes written to the EBS volume, helping monitor data write operations.
aws_ebs_volume_read_ops	VolumeReadOps	Monitors the number of read operations on the EBS volume, useful for tracking read performance.
aws_ebs_volume_write_ops	VolumeWriteOps	Measures the number of write operations on the EBS volume, helping to monitor write throughput.
aws_ebs_volume_total_read_time	VolumeTotalReadTime	Tracks the total time spent on read operations, useful for understanding read latency.
aws_ebs_volume_total_write_time	VolumeTotalWriteTime	Monitors the total time spent on write operations, helping to understand write latency.
aws_ebs_volume_idle_time	VolumeIdleTime	Measures the amount of idle time for the EBS volume, useful for understanding periods of inactivity.
aws_ebs_volume_queue_length	VolumeQueueLength	Tracks the length of the queue for I/O requests on the EBS volume, helping to identify potential performance bottlenecks.
aws_ebs_volume_throughput_percentage	VolumeThroughputPercentage	Monitors the throughput percentage of the EBS volume, useful for ensuring optimal performance.
aws_ebs_volume_consumed_read_write_ops	VolumeConsumedReadWriteOps	Measures the number of read and write operations consumed, helping track IOPS utilization.
aws_ebs_burst_balance	BurstBalance	Tracks the balance of burst credits available for burstable performance EBS volumes, helping manage performance spikes.
aws_ebs_enable_copied_image_deprecation_completed	EnableCopiedImageDeprecationCompleted	Measures the completion of copied image deprecation operations, useful for lifecycle management.
aws_ebs_enable_copied_image_deprecation_failed	EnableCopiedImageDeprecationFailed	Tracks the failure of copied image deprecation operations, helping identify issues with deprecation.
aws_ebs_enable_image_deprecation_completed	EnableImageDeprecationCompleted	Measures the completion of image deprecation operations, helping monitor deprecation success.
aws_ebs_enable_image_deprecation_failed	EnableImageDeprecationFailed	Tracks the failure of image deprecation operations, useful for identifying deprecation issues.
aws_ebs_images_copied_region_completed	ImagesCopiedRegionCompleted	Monitors the completion of image copy operations across regions, helping manage multi-region image availability.
aws_ebs_images_copied_region_deregister_completed	ImagesCopiedRegionDeregisterCompleted	Tracks the completion of deregistration of copied images across regions, useful for lifecycle management.
aws_ebs_images_copied_region_deregistered_failed	ImagesCopiedRegionDeregisteredFailed	Measures failures during the deregistration of copied images, helping identify operational issues.
aws_ebs_images_copied_region_failed	ImagesCopiedRegionFailed	Tracks failures in region-to-region image copy operations, useful for identifying cross-region availability issues.
aws_ebs_images_copied_region_started	ImagesCopiedRegionStarted
aws_ebs_images_create_completed	ImagesCreateCompleted
aws_ebs_images_create_failed	ImagesCreateFailed
aws_ebs_images_create_started	ImagesCreateStarted
aws_ebs_images_deregister_completed	ImagesDeregisterCompleted
aws_ebs_images_deregister_failed	ImagesDeregisterFailed
aws_ebs_resources_targeted	ResourcesTargeted
aws_ebs_snapshots_copied_account_completed	SnapshotsCopiedAccountCompleted
aws_ebs_snapshots_copied_account_delete_completed	SnapshotsCopiedAccountDeleteCompleted
aws_ebs_snapshots_copied_account_delete_failed	SnapshotsCopiedAccountDeleteFailed
aws_ebs_snapshots_copied_account_failed	SnapshotsCopiedAccountFailed
aws_ebs_snapshots_copied_account_started	SnapshotsCopiedAccountStarted
aws_ebs_snapshots_copied_region_completed	SnapshotsCopiedRegionCompleted
aws_ebs_snapshots_copied_region_delete_completed	SnapshotsCopiedRegionDeleteCompleted
aws_ebs_snapshots_copied_region_delete_failed	SnapshotsCopiedRegionDeleteFailed
aws_ebs_snapshots_copied_region_failed	SnapshotsCopiedRegionFailed
aws_ebs_snapshots_copied_region_started	SnapshotsCopiedRegionStarted
aws_ebs_snapshots_create_completed	SnapshotsCreateCompleted	Tracks the successful completion of snapshot creation, helping monitor backup operations.
aws_ebs_snapshots_create_failed	SnapshotsCreateFailed	Measures the number of failed snapshot creation attempts, useful for detecting backup failures.
aws_ebs_snapshots_create_started	SnapshotsCreateStarted
aws_ebs_snapshots_delete_completed	SnapshotsDeleteCompleted	Tracks the completion of snapshot deletion, useful for storage management.
aws_ebs_snapshots_delete_failed	SnapshotsDeleteFailed	Measures the number of failed snapshot deletion attempts, helping track operational issues with snapshot management.
aws_ebs_snapshots_shared_completed	SnapshotsSharedCompleted

AWS/EC2

Function: Virtual servers in the cloud for running applications

Scrape interval: 5 minutes

Includes: Out-of-the-box dashboard

Metric	Cloudwatch metric	Purpose
aws_ec2_info
aws_ec2_cpuutilization	CPUUtilization	Measures the percentage of allocated compute capacity in use on the EC2 instance, useful for monitoring processing load.
aws_ec2_network_in	NetworkIn	Measures the amount of data received by the EC2 instance, useful for monitoring inbound traffic.
aws_ec2_network_out	NetworkOut	Monitors the amount of data sent from the EC2 instance, helping track outbound traffic.
aws_ec2_network_packets_in	NetworkPacketsIn	Tracks the number of network packets received, useful for understanding inbound network traffic patterns.
aws_ec2_network_packets_out	NetworkPacketsOut	Measures the number of network packets sent, helping monitor outbound network activity.
aws_ec2_disk_read_bytes	DiskReadBytes	Monitors the number of bytes read from the instance’s storage, useful for tracking data retrieval performance.
aws_ec2_disk_write_bytes	DiskWriteBytes	Measures the number of bytes written to the instance’s storage, helping to track storage write operations.
aws_ec2_disk_read_ops	DiskReadOps	Tracks the number of read operations on the instance’s storage, useful for monitoring storage performance.
aws_ec2_disk_write_ops	DiskWriteOps	Measures the number of write operations on the instance’s storage, helping track write activity.
aws_ec2_status_check_failed	StatusCheckFailed	Tracks whether the EC2 instance has failed the instance or system status checks, useful for identifying potential issues.
aws_ec2_status_check_failed_instance	StatusCheckFailed_Instance	Monitors whether the instance has failed the instance-level status checks, helping to detect internal instance issues.
aws_ec2_status_check_failed_system	StatusCheckFailed_System	Tracks failures in the system-level status checks, useful for identifying infrastructure issues impacting the instance.
aws_ec2_ebsiobalance_percent	EBSIOBalance%	Measures the I/O balance of attached EBS volumes, helping to ensure that the instance has adequate I/O capacity.
aws_ec2_ebsbyte_balance_percent	EBSByteBalance%	Tracks the byte balance of attached EBS volumes, useful for managing storage throughput.
aws_ec2_ebsread_ops	EBSReadOps	Monitors the number of read operations on attached EBS volumes, useful for tracking storage read performance.
aws_ec2_ebswrite_ops	EBSWriteOps	Tracks the number of write operations on attached EBS volumes, helping to monitor storage write activity.
aws_ec2_ebsread_bytes	EBSReadBytes	Measures the number of bytes read from attached EBS volumes, useful for monitoring data retrieval performance.
aws_ec2_ebswrite_bytes	EBSWriteBytes	Tracks the number of bytes written to attached EBS volumes, helping to monitor data write performance.
aws_ec2_cpucredit_balance	CPUCreditBalance	Monitors the remaining CPU credits for burstable instances, helping ensure that sufficient CPU credits are available for performance.
aws_ec2_cpucredit_usage	CPUCreditUsage	Tracks the number of CPU credits used, useful for monitoring the consumption of burstable instances.
aws_ec2_cpusurplus_credit_balance	CPUSurplusCreditBalance	Measures the surplus CPU credits available for burstable instances, useful for tracking instance performance capacity.
aws_ec2_cpusurplus_credits_charged	CPUSurplusCreditsCharged	Tracks the number of surplus CPU credits charged, helping manage costs associated with overutilization.
aws_ec2_dedicated_host_cpuutilization	DedicatedHostCPUUtilization	Measures the CPU usage of dedicated EC2 hosts, helping to optimize host-level resource allocation.
aws_ec2_metadata_no_token	MetadataNoToken	Monitors the number of times the instance metadata service was successfully accessed without a token, useful for finding processes that still use Instance Metadata Service Version 1 (IMDSv1).
aws_ec2_status_check_failed_attached_ebs	StatusCheckFailed_AttachedEBS	Tracks status check failures related to attached EBS volumes, helping monitor storage health and performance.
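
The names in the Metric column appear to follow a regular pattern derived from the namespace and the CloudWatch metric name. The sketch below is inferred from the rows in these tables, not taken from a documented algorithm, and `exported_name` is a hypothetical helper used only for illustration. It assumes the namespace is lowercased with `/` replaced by `_`, an underscore is inserted wherever a lowercase letter or digit is followed by an uppercase letter, `%` becomes `_percent`, and `.` becomes `_`.

```python
import re

# A lowercase letter or digit followed by an uppercase letter marks a word
# boundary (assumption inferred from the tables, for example NetworkIn).
_WORD_BOUNDARY = re.compile(r"([a-z0-9])([A-Z])")


def exported_name(namespace: str, cloudwatch_metric: str) -> str:
    """Guess the Metric column value for a CloudWatch metric (illustrative only)."""
    # "AWS/EC2" -> "aws_ec2"
    prefix = namespace.lower().replace("/", "_")
    # "NetworkIn" -> "Network_In"; runs of capitals such as "CPU" stay together
    name = _WORD_BOUNDARY.sub(r"\1_\2", cloudwatch_metric)
    # "EBSIOBalance%" -> "EBSIOBalance_percent"; "ClusterStatus.green" -> "Cluster_Status_green"
    name = name.replace("%", "_percent").replace(".", "_")
    return f"{prefix}_{name.lower()}"


if __name__ == "__main__":
    for ns, metric in [
        ("AWS/EC2", "NetworkIn"),
        ("AWS/EC2", "CPUUtilization"),
        ("AWS/EC2", "EBSIOBalance%"),
        ("AWS/ELB", "HTTPCode_Backend_2XX"),
        ("AWS/ES", "ClusterStatus.green"),
    ]:
        print(ns, metric, "->", exported_name(ns, metric))
```

This sketch does not cover every row. For example, in the AWS/ES table the exported name for `ESReportingRequestCount` drops the leading `ES`, which the helper does not reproduce, so treat the tables as the source of truth.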

AWS/EC2Spot

Function: Uses spare EC2 capacity at reduced prices for workloads with flexible start times

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_ec2spot_info
aws_ec2spot_available_instance_pools_count	AvailableInstancePoolsCount	Monitors the number of instance pools available for Spot requests, useful for tracking availability.
aws_ec2spot_bids_submitted_for_capacity	BidsSubmittedForCapacity	Tracks the number of bids submitted for capacity in Spot instances, helping monitor the Spot instance bidding process.
aws_ec2spot_eligible_instance_pool_count	EligibleInstancePoolCount	Measures the number of eligible instance pools for Spot requests, useful for understanding Spot market options.
aws_ec2spot_fulfilled_capacity	FulfilledCapacity	Tracks the capacity fulfilled by Spot instances, helping monitor the success rate of Spot requests.
aws_ec2spot_max_percent_capacity_allocation	MaxPercentCapacityAllocation	Measures the maximum percent of capacity allocated, useful for understanding the allocation of Spot instances.
aws_ec2spot_pending_capacity	PendingCapacity	Tracks the pending Spot instance capacity, helping monitor Spot instance provisioning.
aws_ec2spot_percent_capacity_allocation	PercentCapacityAllocation	Monitors the percentage of capacity allocated to Spot instances, useful for managing resource allocation.
aws_ec2spot_target_capacity	TargetCapacity	Tracks the target capacity for Spot instances, useful for monitoring Spot instance request goals.
aws_ec2spot_terminating_capacity	TerminatingCapacity	Measures the capacity being terminated in Spot instances, helping track Spot instance lifecycle management.

AWS/ECR

Function: Managed container image registry for storing Docker images

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_ecr_repository_pull_count	RepositoryPullCount	Monitors the number of pulls from an ECR repository, useful for tracking container image usage.

AWS/ECS

Function: Fully managed container orchestration service for running Docker containers

Scrape interval: 5 minutes

Includes: Out-of-the-box dashboard

Metric	Cloudwatch metric	Purpose
aws_ecs_info
aws_ecs_cpureservation	CPUReservation	Tracks the CPU reserved for ECS tasks, helping monitor resource reservation.
aws_ecs_cpuutilization	CPUUtilization	Monitors the CPU utilization of ECS tasks, useful for tracking resource usage.
aws_ecs_gpureservation	GPUReservation	Tracks GPU reservation for ECS tasks, helping manage GPU resources.
aws_ecs_memory_reservation	MemoryReservation	Monitors the memory reserved for ECS tasks, helping track memory resource allocation.
aws_ecs_memory_utilization	MemoryUtilization	Tracks the memory utilization of ECS tasks, useful for monitoring memory resource consumption.

AWS/EFS

Function: Scalable and fully managed file storage for use with EC2 instances

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_efs_info
aws_efs_burst_credit_balance	BurstCreditBalance	Monitors the balance of burst credits for EFS, useful for managing performance bursts.
aws_efs_client_connections	ClientConnections	Tracks the number of client connections to EFS, useful for understanding file system usage.
aws_efs_data_read_iobytes	DataReadIOBytes	Measures the amount of data read from EFS, helping track read performance.
aws_efs_data_write_iobytes	DataWriteIOBytes	Tracks the amount of data written to EFS, helping monitor write performance.
aws_efs_metadata_iobytes	MetadataIOBytes	Monitors the number of bytes for metadata operations on EFS, useful for tracking metadata-related I/O.
aws_efs_metered_iobytes	MeteredIOBytes	Tracks the number of metered I/O bytes, helping manage performance limits.
aws_efs_percent_iolimit	PercentIOLimit	Monitors the percentage of the I/O limit reached, useful for performance management.
aws_efs_permitted_throughput	PermittedThroughput	Measures the allowed throughput for EFS, helping monitor throughput limits.
aws_efs_storage_bytes	StorageBytes	Tracks the total storage used by EFS, useful for managing storage capacity.
aws_efs_total_iobytes	TotalIOBytes	Measures the total number of bytes across all I/O operations, helping monitor overall file system performance.

AWS/ELB

Function: Distributes traffic across multiple targets like EC2 instances and containers

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_elb_info
aws_elb_backend_connection_errors	BackendConnectionErrors	Tracks the number of connection errors between ELB and the backend instances, useful for identifying connection issues.
aws_elb_healthy_host_count	HealthyHostCount	Monitors the number of healthy backend instances, helping track instance health.
aws_elb_httpcode_backend_2_xx	HTTPCode_Backend_2XX	Tracks successful responses (2XX) from the backend, useful for monitoring backend application performance.
aws_elb_httpcode_backend_3_xx	HTTPCode_Backend_3XX	Measures redirection responses (3XX) from the backend, helping monitor routing performance.
aws_elb_httpcode_backend_4_xx	HTTPCode_Backend_4XX	Tracks client errors (4XX) from the backend, useful for identifying issues with client requests.
aws_elb_httpcode_backend_5_xx	HTTPCode_Backend_5XX	Monitors server errors (5XX) from the backend, helping track server-side issues.
aws_elb_httpcode_elb_4_xx	HTTPCode_ELB_4XX	Measures client errors (4XX) at the ELB level, useful for tracking errors handled by the ELB.
aws_elb_httpcode_elb_5_xx	HTTPCode_ELB_5XX	Tracks server errors (5XX) at the ELB level, helping monitor ELB server-side performance.
aws_elb_latency	Latency	Monitors the latency of requests through the ELB, useful for tracking response times.
aws_elb_request_count	RequestCount	Tracks the number of requests handled by the ELB, useful for monitoring traffic levels.
aws_elb_spillover_count	SpilloverCount	Measures the number of requests that were rejected because the surge queue was full, helping track capacity limitations.
aws_elb_surge_queue_length	SurgeQueueLength	Tracks the length of the request queue, useful for monitoring traffic surges.
aws_elb_un_healthy_host_count	UnHealthyHostCount	Monitors the number of unhealthy backend instances, helping identify infrastructure issues.
aws_elb_estimated_albactive_connection_count	EstimatedALBActiveConnectionCount	Tracks the number of active connections to the ALB, useful for monitoring load balancer usage.
aws_elb_estimated_albconsumed_lcus	EstimatedALBConsumedLCUs	Measures the load balancer capacity units (LCUs) consumed by the ALB, helping monitor resource usage.
aws_elb_estimated_albnew_connection_count	EstimatedALBNewConnectionCount	Tracks the number of new connections established with the ALB, useful for monitoring connection traffic.
aws_elb_estimated_processed_bytes	EstimatedProcessedBytes	Monitors the total bytes processed by the ALB, helping to track data flow through the load balancer.

AWS/ES

Function: Managed Elasticsearch service for real-time search and analytics

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_es_info		Provides general information about the Elasticsearch service
aws_es_2xx	2xx	Tracks successful requests to the Elasticsearch service
aws_es_3xx	3xx	Tracks redirection requests to the Elasticsearch service
aws_es_4xx	4xx	Tracks client error responses from the Elasticsearch service
aws_es_5xx	5xx	Tracks server error responses from the Elasticsearch service
aws_es_adanomaly_detectors_index_status_red	ADAnomalyDetectorsIndexStatus.red	Indicates if the anomaly detection index is in a red (critical) state
aws_es_adanomaly_detectors_index_status_index_exists	ADAnomalyDetectorsIndexStatusIndexExists	Tracks whether the anomaly detection index exists or not
aws_es_adanomaly_results_index_status_red	ADAnomalyResultsIndexStatus.red	Indicates if the anomaly results index is in a red (critical) state
aws_es_adanomaly_results_index_status_index_exists	ADAnomalyResultsIndexStatusIndexExists	Tracks whether the anomaly results index exists or not
aws_es_adexecute_failure_count	ADExecuteFailureCount	Tracks the number of times anomaly detection execution has failed
aws_es_adexecute_request_count	ADExecuteRequestCount	Tracks the number of anomaly detection execution requests
aws_es_adhcexecute_failure_count	ADHCExecuteFailureCount	Tracks the number of high cardinality anomaly detection execution failures
aws_es_adhcexecute_request_count	ADHCExecuteRequestCount	Tracks the number of high cardinality anomaly detection execution requests
aws_es_admodels_checkpoint_index_status_red	ADModelsCheckpointIndexStatus.red	Indicates if the model checkpoint index is in a red (critical) state
aws_es_admodels_checkpoint_index_status_index_exists	ADModelsCheckpointIndexStatusIndexExists	Tracks whether the model checkpoint index exists
aws_es_adplugin_unhealthy	ADPluginUnhealthy	Indicates if the anomaly detection plugin is in an unhealthy state
aws_es_alerting_degraded	AlertingDegraded	Indicates if the alerting feature is in a degraded state
aws_es_alerting_index_exists	AlertingIndexExists	Tracks whether the alerting index exists
aws_es_alerting_index_status_green	AlertingIndexStatus.green	Indicates if the alerting index is in a green (healthy) state
aws_es_alerting_index_status_red	AlertingIndexStatus.red	Indicates if the alerting index is in a red (critical) state
aws_es_alerting_index_status_yellow	AlertingIndexStatus.yellow	Indicates if the alerting index is in a yellow (warning) state
aws_es_alerting_nodes_not_on_schedule	AlertingNodesNotOnSchedule	Tracks the number of nodes not on schedule for alerting
aws_es_alerting_nodes_on_schedule	AlertingNodesOnSchedule	Tracks the number of nodes on schedule for alerting
aws_es_alerting_scheduled_job_enabled	AlertingScheduledJobEnabled	Indicates if alerting scheduled jobs are enabled
aws_es_asynchronous_search_cancelled	AsynchronousSearchCancelled	Tracks the number of asynchronous search requests that were canceled
aws_es_asynchronous_search_completion_rate	AsynchronousSearchCompletionRate	Tracks the rate of successful asynchronous search completions
aws_es_asynchronous_search_failure_rate	AsynchronousSearchFailureRate	Tracks the rate of failed asynchronous search requests
aws_es_asynchronous_search_initialized_rate	AsynchronousSearchInitializedRate	Tracks the rate of initialized asynchronous search requests
aws_es_asynchronous_search_max_running_time	AsynchronousSearchMaxRunningTime	Tracks the maximum time taken by asynchronous search requests
aws_es_asynchronous_search_persist_failed_rate	AsynchronousSearchPersistFailedRate	Tracks the rate of failed attempts to persist asynchronous search results
aws_es_asynchronous_search_persist_rate	AsynchronousSearchPersistRate	Tracks the rate of successful attempts to persist asynchronous search results
aws_es_asynchronous_search_rejected	AsynchronousSearchRejected	Tracks the number of asynchronous search requests that were rejected
aws_es_asynchronous_search_running_current	AsynchronousSearchRunningCurrent	Tracks the number of currently running asynchronous search requests
aws_es_asynchronous_search_store_health	AsynchronousSearchStoreHealth	Tracks the health of the store for asynchronous search
aws_es_asynchronous_search_store_size	AsynchronousSearchStoreSize	Tracks the size of the asynchronous search store
aws_es_asynchronous_search_stored_response_count	AsynchronousSearchStoredResponseCount	Tracks the number of responses stored for asynchronous search
aws_es_asynchronous_search_submission_rate	AsynchronousSearchSubmissionRate	Tracks the rate of submitted asynchronous search requests
aws_es_auto_follow_leader_call_failure	AutoFollowLeaderCallFailure	Tracks the number of failures when trying to call the leader for cross-cluster replication
aws_es_auto_follow_num_failed_start_replication	AutoFollowNumFailedStartReplication	Tracks the number of failed attempts to start cross-cluster replication
aws_es_auto_follow_num_success_start_replication	AutoFollowNumSuccessStartReplication	Tracks the number of successful attempts to start cross-cluster replication
aws_es_auto_tune_changes_history_heap_size	AutoTuneChangesHistoryHeapSize	Tracks the heap size usage history for auto-tune changes
aws_es_auto_tune_changes_history_jvmyoung_gen_args	AutoTuneChangesHistoryJVMYoungGenArgs	Tracks JVM young generation arguments for auto-tune changes
aws_es_auto_tune_failed	AutoTuneFailed	Tracks the number of failed auto-tune attempts
aws_es_auto_tune_succeeded	AutoTuneSucceeded	Tracks the number of successful auto-tune attempts
aws_es_auto_tune_value	AutoTuneValue	Tracks the value of auto-tune changes
aws_es_automated_snapshot_failure	AutomatedSnapshotFailure	Tracks the number of failures in automated snapshots
aws_es_avg_point_in_time_alive_time	AvgPointInTimeAliveTime	Tracks the average lifetime of point-in-time snapshots
aws_es_burst_balance	BurstBalance	Tracks the burst balance for the service
aws_es_cpucredit_balance	CPUCreditBalance	Tracks the balance of CPU credits for the nodes
aws_es_cpuutilization	CPUUtilization	Tracks the CPU utilization of the nodes
aws_es_cluster_index_writes_blocked	ClusterIndexWritesBlocked	Tracks whether index writes are blocked at the cluster level
aws_es_cluster_status_green	ClusterStatus.green	Indicates if the cluster is in a green (healthy) state
aws_es_cluster_status_red	ClusterStatus.red	Indicates if the cluster is in a red (critical) state
aws_es_cluster_status_yellow	ClusterStatus.yellow	Indicates if the cluster is in a yellow (warning) state
aws_es_cluster_used_space	ClusterUsedSpace	Tracks the amount of used storage space in the cluster
aws_es_cold_storage_space_utilization	ColdStorageSpaceUtilization	Tracks the storage utilization of cold data
aws_es_cold_to_warm_migration_failure_count	ColdToWarmMigrationFailureCount	Tracks the number of failures during migration from cold to warm storage
aws_es_cold_to_warm_migration_latency	ColdToWarmMigrationLatency	Tracks the latency of migration from cold to warm storage
aws_es_cold_to_warm_migration_queue_size	ColdToWarmMigrationQueueSize	Tracks the queue size for migration from cold to warm storage
aws_es_cold_to_warm_migration_success_count	ColdToWarmMigrationSuccessCount	Tracks the number of successful migrations from cold to warm storage
aws_es_coordinating_write_rejected	CoordinatingWriteRejected	Tracks the number of rejected coordinating node write requests
aws_es_cross_cluster_inbound_replication_requests	CrossClusterInboundReplicationRequests	Tracks the number of inbound replication requests for cross-cluster replication
aws_es_cross_cluster_inbound_requests	CrossClusterInboundRequests	Tracks the number of inbound requests for cross-cluster replication
aws_es_cross_cluster_outbound_connections	CrossClusterOutboundConnections	Tracks the number of outbound connections for cross-cluster replication
aws_es_cross_cluster_outbound_replication_requests	CrossClusterOutboundReplicationRequests	Tracks the number of outbound replication requests for cross-cluster replication
aws_es_cross_cluster_outbound_requests	CrossClusterOutboundRequests	Tracks the number of outbound requests for cross-cluster replication
aws_es_current_point_in_time	CurrentPointInTime	Tracks the current point in time (snapshot) available in Elasticsearch
aws_es_data_nodes	DataNodes	Tracks the number of data nodes in the Elasticsearch cluster
aws_es_data_nodes_shards_active	DataNodesShards.active	Tracks the number of active shards on data nodes
aws_es_data_nodes_shards_initializing	DataNodesShards.initializing	Tracks the number of shards that are initializing on data nodes
aws_es_data_nodes_shards_relocating	DataNodesShards.relocating	Tracks the number of shards that are relocating on data nodes
aws_es_data_nodes_shards_unassigned	DataNodesShards.unassigned	Tracks the number of unassigned shards on data nodes
aws_es_deleted_documents	DeletedDocuments	Tracks the number of deleted documents from the Elasticsearch cluster
aws_es_disk_queue_depth	DiskQueueDepth	Tracks the depth of the disk queue
aws_es_reporting_failed_request_sys_err_count	ESReportingFailedRequestSysErrCount	Tracks the number of failed reporting requests due to system errors
aws_es_reporting_failed_request_user_err_count	ESReportingFailedRequestUserErrCount	Tracks the number of failed reporting requests due to user errors
aws_es_reporting_request_count	ESReportingRequestCount	Tracks the number of reporting requests submitted to Elasticsearch
aws_es_reporting_success_count	ESReportingSuccessCount	Tracks the number of successful reporting requests
aws_es_elasticsearch_requests	ElasticsearchRequests	Tracks the number of requests to Elasticsearch
aws_es_follower_check_point	FollowerCheckPoint	Tracks the checkpoint of a follower node in cross-cluster replication
aws_es_free_storage_space	FreeStorageSpace	Tracks the available storage space in the Elasticsearch cluster
aws_es_has_active_point_in_time	HasActivePointInTime	Indicates whether there is an active point-in-time snapshot
aws_es_has_used_point_in_time	HasUsedPointInTime	Indicates whether the point-in-time snapshot has been used
aws_es_hot_storage_space_utilization	HotStorageSpaceUtilization	Tracks the storage utilization of hot data
aws_es_hot_to_warm_migration_failure_count	HotToWarmMigrationFailureCount	Tracks the number of failures during migration from hot to warm storage
aws_es_hot_to_warm_migration_force_merge_latency	HotToWarmMigrationForceMergeLatency	Tracks the latency of force merging during migration from hot to warm storage
aws_es_hot_to_warm_migration_processing_latency	HotToWarmMigrationProcessingLatency	Tracks the latency of processing migration from hot to warm storage
aws_es_hot_to_warm_migration_queue_size	HotToWarmMigrationQueueSize	Tracks the queue size for migration from hot to warm storage
aws_es_hot_to_warm_migration_snapshot_latency	HotToWarmMigrationSnapshotLatency	Tracks the latency of snapshotting during migration from hot to warm storage
aws_es_hot_to_warm_migration_success_count	HotToWarmMigrationSuccessCount	Tracks the number of successful migrations from hot to warm storage
aws_es_hot_to_warm_migration_success_latency	HotToWarmMigrationSuccessLatency	Tracks the latency of successful migrations from hot to warm storage
aws_es_indexing_latency	IndexingLatency	Tracks the latency of indexing documents in the Elasticsearch cluster
aws_es_indexing_rate	IndexingRate	Tracks the rate of indexing documents in the Elasticsearch cluster
aws_es_invalid_host_header_requests	InvalidHostHeaderRequests	Tracks the number of requests with invalid host headers
aws_es_iops_throttle	IopsThrottle	Tracks throttling of input/output operations
aws_es_jvmgcold_collection_count	JVMGCOldCollectionCount	Tracks the number of garbage collection events in the old generation of JVM
aws_es_jvmgcold_collection_time	JVMGCOldCollectionTime	Tracks the time spent in garbage collection in the old generation of JVM
aws_es_jvmgcyoung_collection_count	JVMGCYoungCollectionCount	Tracks the number of garbage collection events in the young generation of JVM
aws_es_jvmgcyoung_collection_time	JVMGCYoungCollectionTime	Tracks the time spent in garbage collection in the young generation of JVM
aws_es_jvmmemory_pressure	JVMMemoryPressure	Tracks memory pressure on the JVM used by Elasticsearch
aws_es_kmskey_error	KMSKeyError	Tracks the number of errors related to KMS keys used by the Elasticsearch cluster
aws_es_kmskey_inaccessible	KMSKeyInaccessible	Tracks the number of times a KMS key is inaccessible for the Elasticsearch cluster
aws_es_knncache_capacity_reached	KNNCacheCapacityReached	Tracks when the KNN cache capacity is reached
aws_es_knncircuit_breaker_triggered	KNNCircuitBreakerTriggered	Tracks when the KNN circuit breaker is triggered
aws_es_knneviction_count	KNNEvictionCount	Tracks the number of evictions from the KNN cache
aws_es_knngraph_index_errors	KNNGraphIndexErrors	Tracks errors during KNN graph indexing
aws_es_knngraph_index_requests	KNNGraphIndexRequests	Tracks the number of KNN graph index requests
aws_es_knngraph_memory_usage	KNNGraphMemoryUsage	Tracks memory usage by the KNN graph
aws_es_knngraph_query_errors	KNNGraphQueryErrors	Tracks errors during KNN graph queries
aws_es_knngraph_query_requests	KNNGraphQueryRequests	Tracks the number of KNN graph query requests
aws_es_knnhit_count	KNNHitCount	Tracks the number of hits returned by KNN queries
aws_es_knnload_exception_count	KNNLoadExceptionCount	Tracks the number of exceptions during KNN data loading
aws_es_knnload_success_count	KNNLoadSuccessCount	Tracks the number of successful KNN data load operations
aws_es_knnmiss_count	KNNMissCount	Tracks the number of KNN cache misses
aws_es_knnquery_requests	KNNQueryRequests	Tracks the number of KNN queries
aws_es_knnscript_compilation_errors	KNNScriptCompilationErrors	Tracks the number of errors during KNN script compilation
aws_es_knnscript_compilations	KNNScriptCompilations	Tracks the number of KNN script compilations
aws_es_knnscript_query_errors	KNNScriptQueryErrors	Tracks errors during KNN script queries
aws_es_knnscript_query_requests	KNNScriptQueryRequests	Tracks the number of KNN script queries
aws_es_knntotal_load_time	KNNTotalLoadTime	Tracks the total load time for KNN operations
aws_es_kibana_concurrent_connections	KibanaConcurrentConnections	Tracks the number of concurrent Kibana connections
aws_es_kibana_healthy_nodes	KibanaHealthyNodes	Tracks the number of healthy Kibana nodes
aws_es_kibana_heap_total	KibanaHeapTotal	Tracks the total heap size of Kibana
aws_es_kibana_heap_used	KibanaHeapUsed	Tracks the heap size used by Kibana
aws_es_kibana_heap_utilization	KibanaHeapUtilization	Tracks the heap utilization of Kibana
aws_es_kibana_os1_minute_load	KibanaOS1MinuteLoad	Tracks the 1-minute load average of the Kibana node’s operating system
aws_es_kibana_reporting_failed_request_sys_err_count	KibanaReportingFailedRequestSysErrCount	Tracks the number of failed Kibana reporting requests due to system errors
aws_es_kibana_reporting_failed_request_user_err_count	KibanaReportingFailedRequestUserErrCount	Tracks the number of failed Kibana reporting requests due to user errors
aws_es_kibana_reporting_request_count	KibanaReportingRequestCount	Tracks the number of Kibana reporting requests
aws_es_kibana_reporting_success_count	KibanaReportingSuccessCount	Tracks the number of successful Kibana reporting requests
aws_es_kibana_request_total	KibanaRequestTotal	Tracks the total number of requests sent to Kibana
aws_es_kibana_response_times_max_in_millis	KibanaResponseTimesMaxInMillis	Tracks the maximum response time of Kibana requests in milliseconds
aws_es_ltrfeature_memory_usage_in_bytes	LTRFeatureMemoryUsageInBytes	Tracks memory usage by LTR features in bytes
aws_es_ltrfeatureset_memory_usage_in_bytes	LTRFeaturesetMemoryUsageInBytes	Tracks memory usage by LTR feature sets in bytes
aws_es_ltrmemory_usage	LTRMemoryUsage	Tracks overall memory usage by LTR features
aws_es_ltrmodel_memory_usage_in_bytes	LTRModelMemoryUsageInBytes	Tracks memory usage by LTR models in bytes
aws_es_ltrrequest_error_count	LTRRequestErrorCount	Tracks the number of errors in LTR requests
aws_es_ltrrequest_total_count	LTRRequestTotalCount	Tracks the total number of LTR requests
aws_es_ltrstatus_red	LTRStatus.red	Indicates if the LTR status is in a red (critical) state
aws_es_leader_check_point	LeaderCheckPoint	Tracks the checkpoint of the leader node in cross-cluster replication
aws_es_master_cpucredit_balance	MasterCPUCreditBalance	Tracks the balance of CPU credits for the master node
aws_es_master_cpuutilization	MasterCPUUtilization	Tracks the CPU utilization of the master node
aws_es_master_free_storage_space	MasterFreeStorageSpace	Tracks the free storage space available on the master node
aws_es_master_jvmmemory_pressure	MasterJVMMemoryPressure	Tracks JVM memory pressure on the master node
aws_es_master_old_gen_jvmmemory_pressure	MasterOldGenJVMMemoryPressure	Tracks old generation JVM memory pressure on the master node
aws_es_master_reachable_from_node	MasterReachableFromNode	Tracks whether the master node is reachable from the data nodes
aws_es_master_sys_memory_utilization	MasterSysMemoryUtilization	Tracks system memory utilization of the master node
aws_es_max_provisioned_throughput	MaxProvisionedThroughput	Tracks the maximum provisioned throughput for Elasticsearch
aws_es_nodes	Nodes	Tracks the number of nodes in the Elasticsearch cluster
aws_es_old_gen_jvmmemory_pressure	OldGenJVMMemoryPressure	Tracks old generation JVM memory pressure on the nodes
aws_es_open_search_dashboards_concurrent_connections	OpenSearchDashboardsConcurrentConnections	Tracks the number of concurrent connections to OpenSearch Dashboards
aws_es_open_search_dashboards_healthy_node	OpenSearchDashboardsHealthyNode	Tracks the number of healthy OpenSearch Dashboards nodes
aws_es_open_search_dashboards_healthy_nodes	OpenSearchDashboardsHealthyNodes	Tracks the number of healthy OpenSearch Dashboards nodes
aws_es_open_search_dashboards_heap_total	OpenSearchDashboardsHeapTotal	Tracks the total heap size of OpenSearch Dashboards
aws_es_open_search_dashboards_heap_used	OpenSearchDashboardsHeapUsed	Tracks the heap size used by OpenSearch Dashboards
aws_es_open_search_dashboards_heap_utilization	OpenSearchDashboardsHeapUtilization	Tracks the heap utilization of OpenSearch Dashboards
aws_es_open_search_dashboards_os1_minute_load	OpenSearchDashboardsOS1MinuteLoad	Tracks the 1-minute load average of the OpenSearch Dashboards node’s operating system
aws_es_open_search_dashboards_request_total	OpenSearchDashboardsRequestTotal	Tracks the total number of requests sent to OpenSearch Dashboards
aws_es_open_search_dashboards_response_times_max_in_millis	OpenSearchDashboardsResponseTimesMaxInMillis	Tracks the maximum response time of OpenSearch Dashboards requests in milliseconds
aws_es_open_search_requests	OpenSearchRequests	Tracks the number of requests to OpenSearch
aws_es_opensearch_dashboards_reporting_failed_request_sys_err_count	OpensearchDashboardsReportingFailedRequestSysErrCount	Tracks the number of failed OpenSearch Dashboards reporting requests due to system errors
aws_es_opensearch_dashboards_reporting_failed_request_user_err_count	OpensearchDashboardsReportingFailedRequestUserErrCount	Tracks the number of failed OpenSearch Dashboards reporting requests due to user errors
aws_es_opensearch_dashboards_reporting_request_count	OpensearchDashboardsReportingRequestCount	Tracks the number of OpenSearch Dashboards reporting requests
aws_es_opensearch_dashboards_reporting_success_count	OpensearchDashboardsReportingSuccessCount	Tracks the number of successful OpenSearch Dashboards reporting requests
aws_es_pplfailed_request_count_by_cus_err	PPLFailedRequestCountByCusErr	Tracks the number of PPL failed requests due to customer errors
aws_es_pplfailed_request_count_by_sys_err	PPLFailedRequestCountBySysErr	Tracks the number of PPL failed requests due to system errors
aws_es_pplrequest_count	PPLRequestCount	Tracks the total number of PPL requests
aws_es_primary_write_rejected	PrimaryWriteRejected	Tracks the number of rejected primary write requests
aws_es_read_iops	ReadIOPS	Tracks input/output operations per second for reads
aws_es_read_iopsmicro_bursting	ReadIOPSMicroBursting	Tracks micro-bursting of input/output operations for reads
aws_es_read_latency	ReadLatency	Tracks the latency of read operations in the Elasticsearch cluster
aws_es_read_throughput	ReadThroughput	Tracks the throughput of read operations
aws_es_read_throughput_micro_bursting	ReadThroughputMicroBursting	Tracks micro-bursting of read throughput
aws_es_remote_storage_used_space	RemoteStorageUsedSpace	Tracks the amount of used space in remote storage
aws_es_remote_storage_write_rejected	RemoteStorageWriteRejected	Tracks the number of rejected write operations in remote storage
aws_es_replica_write_rejected	ReplicaWriteRejected	Tracks the number of rejected replica write requests
aws_es_replication_num_bootstrapping_indices	ReplicationNumBootstrappingIndices	Tracks the number of indices in the bootstrapping state for replication
aws_es_replication_num_failed_indices	ReplicationNumFailedIndices	Tracks the number of failed replication indices
aws_es_replication_num_paused_indices	ReplicationNumPausedIndices	Tracks the number of paused replication indices
aws_es_replication_num_syncing_indices	ReplicationNumSyncingIndices	Tracks the number of replication indices currently syncing
aws_es_replication_rate	ReplicationRate	Tracks the rate of replication in Elasticsearch
aws_es_sqldefault_cursor_request_count	SQLDefaultCursorRequestCount	Tracks the number of default SQL cursor requests
aws_es_sqlfailed_request_count_by_cus_err	SQLFailedRequestCountByCusErr	Tracks the number of SQL failed requests due to customer errors
aws_es_sqlfailed_request_count_by_sys_err	SQLFailedRequestCountBySysErr	Tracks the number of SQL failed requests due to system errors
aws_es_sqlrequest_count	SQLRequestCount	Tracks the total number of SQL requests
aws_es_sqlunhealthy	SQLUnhealthy	Tracks whether the SQL plugin is in an unhealthy state
aws_es_search_latency	SearchLatency	Tracks the latency of search operations in the Elasticsearch cluster
aws_es_search_rate	SearchRate	Tracks the rate of search operations
aws_es_search_shard_task_cancelled	SearchShardTaskCancelled	Tracks the number of search shard tasks that were canceled
aws_es_search_task_cancelled	SearchTaskCancelled	Tracks the number of canceled search tasks
aws_es_searchable_documents	SearchableDocuments	Tracks the number of searchable documents
aws_es_segment_count	SegmentCount	Tracks the number of segments in the Elasticsearch cluster
aws_es_shards_active	Shards.active	Tracks the number of active shards
aws_es_shards_active_primary	Shards.activePrimary	Tracks the number of active primary shards
aws_es_shards_delayed_unassigned	Shards.delayedUnassigned	Tracks the number of delayed unassigned shards
aws_es_shards_initializing	Shards.initializing	Tracks the number of initializing shards
aws_es_shards_relocating	Shards.relocating	Tracks the number of relocating shards
aws_es_shards_unassigned	Shards.unassigned	Tracks the number of unassigned shards
aws_es_sys_memory_utilization	SysMemoryUtilization	Tracks system memory utilization
aws_es_threadpool_bulk_queue	ThreadpoolBulkQueue	Tracks the size of the bulk thread pool queue
aws_es_threadpool_bulk_rejected	ThreadpoolBulkRejected	Tracks the number of bulk thread pool tasks that were rejected
aws_es_threadpool_bulk_threads	ThreadpoolBulkThreads	Tracks the number of active threads in the bulk thread pool
aws_es_threadpool_force_merge_queue	ThreadpoolForce_mergeQueue	Tracks the size of the force merge thread pool queue
aws_es_threadpool_force_merge_rejected	ThreadpoolForce_mergeRejected	Tracks the number of force merge thread pool tasks that were rejected
aws_es_threadpool_force_merge_threads	ThreadpoolForce_mergeThreads	Tracks the number of active threads in the force merge thread pool
aws_es_threadpool_index_queue	ThreadpoolIndexQueue	Tracks the size of the index thread pool queue
aws_es_threadpool_index_rejected	ThreadpoolIndexRejected	Tracks the number of index thread pool tasks that were rejected
aws_es_threadpool_index_threads	ThreadpoolIndexThreads	Tracks the number of active threads in the index thread pool
aws_es_threadpool_search_queue	ThreadpoolSearchQueue	Tracks the size of the search thread pool queue
aws_es_threadpool_search_rejected	ThreadpoolSearchRejected	Tracks the number of search thread pool tasks that were rejected
aws_es_threadpool_search_threads	ThreadpoolSearchThreads	Tracks the number of active threads in the search thread pool
aws_es_threadpool_write_queue	ThreadpoolWriteQueue	Tracks the size of the write thread pool queue
aws_es_threadpool_write_rejected	ThreadpoolWriteRejected	Tracks the number of write thread pool tasks that were rejected
aws_es_threadpool_write_threads	ThreadpoolWriteThreads	Tracks the number of active threads in the write thread pool
aws_es_threadpoolsql_worker_queue	Threadpoolsql-workerQueue	Tracks the size of the SQL worker thread pool queue
aws_es_threadpoolsql_worker_rejected	Threadpoolsql-workerRejected	Tracks the number of SQL worker thread pool tasks that were rejected
aws_es_threadpoolsql_worker_threads	Threadpoolsql-workerThreads	Tracks the number of active threads in the SQL worker thread pool
aws_es_throughput_throttle	ThroughputThrottle	Tracks throttling of throughput in the Elasticsearch cluster
aws_es_total_point_in_time	TotalPointInTime	Tracks the total number of point-in-time snapshots
aws_es_warm_cpuutilization	WarmCPUUtilization	Tracks the CPU utilization of warm data nodes
aws_es_warm_free_storage_space	WarmFreeStorageSpace	Tracks the available storage space in warm data nodes
aws_es_warm_jvmgcold_collection_count	WarmJVMGCOldCollectionCount	Tracks the number of garbage collection events in the old generation of JVM on warm data nodes
aws_es_warm_jvmgcyoung_collection_count	WarmJVMGCYoungCollectionCount	Tracks the number of garbage collection events in the young generation of JVM on warm data nodes
aws_es_warm_jvmgcyoung_collection_time	WarmJVMGCYoungCollectionTime	Tracks the time spent in garbage collection in the young generation of JVM on warm data nodes
aws_es_warm_jvmmemory_pressure	WarmJVMMemoryPressure	Tracks memory pressure on warm data nodes
aws_es_warm_old_gen_jvmmemory_pressure	WarmOldGenJVMMemoryPressure	Tracks old generation JVM memory pressure on warm data nodes
aws_es_warm_search_latency	WarmSearchLatency	Tracks the latency of search operations on warm data nodes
aws_es_warm_search_rate	WarmSearchRate	Tracks the rate of search operations on warm data nodes
aws_es_warm_searchable_documents	WarmSearchableDocuments	Tracks the number of searchable documents on warm data nodes
aws_es_warm_storage_space_utilization	WarmStorageSpaceUtilization	Tracks storage space utilization on warm data nodes
aws_es_warm_sys_memory_utilization	WarmSysMemoryUtilization	Tracks system memory utilization on warm data nodes
aws_es_warm_threadpool_search_queue	WarmThreadpoolSearchQueue	Tracks the size of the search thread pool queue on warm data nodes
aws_es_warm_threadpool_search_rejected	WarmThreadpoolSearchRejected	Tracks the number of search thread pool tasks that were rejected on warm data nodes
aws_es_warm_threadpool_search_threads	WarmThreadpoolSearchThreads	Tracks the number of active threads in the search thread pool on warm data nodes
aws_es_warm_to_cold_migration_failure_count	WarmToColdMigrationFailureCount	Tracks the number of failures during migration from warm to cold storage
aws_es_warm_to_cold_migration_latency	WarmToColdMigrationLatency	Tracks the latency of migration from warm to cold storage
aws_es_warm_to_cold_migration_queue_size	WarmToColdMigrationQueueSize	Tracks the queue size for migration from warm to cold storage
aws_es_warm_to_cold_migration_success_count	WarmToColdMigrationSuccessCount	Tracks the number of successful migrations from warm to cold storage
aws_es_warm_to_hot_migration_queue_size	WarmToHotMigrationQueueSize	Tracks the queue size for migration from warm to hot storage
aws_es_write_iops	WriteIOPS	Tracks input/output operations per second for writes
aws_es_write_iopsmicro_bursting	WriteIOPSMicroBursting	Tracks micro-bursting of input/output operations for writes
aws_es_write_latency	WriteLatency	Tracks the latency of write operations in the Elasticsearch cluster
aws_es_write_throughput	WriteThroughput	Tracks the throughput of write operations
aws_es_write_throughput_micro_bursting	WriteThroughputMicroBursting	Tracks micro-bursting of write throughput
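
The first two columns of the tables on this page appear to follow a regular naming pattern: the namespace is lowercased with `/` replaced by `_`, and the CloudWatch metric name is split wherever a lowercase letter or digit is followed by an uppercase letter, with any other punctuation replaced by `_`. The following Python sketch reproduces that pattern. It is inferred from the rows shown here rather than taken from the exporter's implementation, and the function name `cloudwatch_to_metric_name` is hypothetical.

```python
import re


def cloudwatch_to_metric_name(namespace: str, cloudwatch_metric: str) -> str:
    """Derive the exported metric name from a CloudWatch namespace and metric.

    Assumption: this rule is inferred from the tables on this page and is
    not a documented API of any exporter.
    """
    # Namespace: replace non-alphanumerics with "_" and lowercase,
    # for example "AWS/ES" becomes "aws_es".
    prefix = re.sub(r"[^0-9a-zA-Z]+", "_", namespace).lower()
    # Metric: replace punctuation such as ".", "-", and "/" with "_".
    name = re.sub(r"[^0-9a-zA-Z]+", "_", cloudwatch_metric)
    # Insert "_" where a lowercase letter or digit precedes an uppercase letter.
    # Runs of uppercase letters (KNN, IOPS, JVM) stay joined to the next word.
    name = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name)
    return f"{prefix}_{name.lower()}"


if __name__ == "__main__":
    print(cloudwatch_to_metric_name("AWS/ES", "KNNCacheCapacityReached"))
    print(cloudwatch_to_metric_name("AWS/ES", "Shards.activePrimary"))
```

Because uppercase runs are not split, acronyms merge with the following word, which is why `ReadIOPSMicroBursting` appears as `aws_es_read_iopsmicro_bursting` rather than `aws_es_read_iops_micro_bursting`.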

AWS/ElastiCache

Function: Managed Redis and Memcached for real-time caching

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_elasticache_info
aws_elasticache_active_defrag_hits	ActiveDefragHits	Tracks the number of active defragmentation hits in ElastiCache
aws_elasticache_authentication_failures	AuthenticationFailures	Monitors the number of failed authentication attempts
aws_elasticache_bytes_read_from_disk	BytesReadFromDisk	Measures the number of bytes read from disk in the ElastiCache cluster
aws_elasticache_bytes_read_into_memcached	BytesReadIntoMemcached	Tracks the number of bytes read into Memcached
aws_elasticache_bytes_used_for_cache	BytesUsedForCache	Monitors the total amount of memory used for cache
aws_elasticache_bytes_used_for_cache_items	BytesUsedForCacheItems	Measures the memory used by items in cache
aws_elasticache_bytes_used_for_hash	BytesUsedForHash	Tracks memory used for hash tables in the cache
aws_elasticache_bytes_used_for_memory_db	BytesUsedForMemoryDB	Monitors memory usage for MemoryDB in ElastiCache
aws_elasticache_bytes_written_out_from_memcached	BytesWrittenOutFromMemcached	Tracks the number of bytes written out from Memcached
aws_elasticache_bytes_written_to_disk	BytesWrittenToDisk	Measures the number of bytes written to disk in the ElastiCache cluster
aws_elasticache_cpucredit_balance	CPUCreditBalance	Tracks the balance of CPU credits for burstable instance types
aws_elasticache_cpucredit_usage	CPUCreditUsage	Monitors CPU credit usage for burstable instance types
aws_elasticache_cpuutilization	CPUUtilization	Measures the CPU utilization of the ElastiCache instance
aws_elasticache_cache_hit_rate	CacheHitRate	Tracks the cache hit rate, indicating how often requested data is found in cache
aws_elasticache_cache_hits	CacheHits	Measures the total number of cache hits
aws_elasticache_cache_misses	CacheMisses	Tracks the number of cache misses, when requested data is not found in cache
aws_elasticache_cas_badval	CasBadval	Monitors the number of CAS operations that failed due to bad values
aws_elasticache_cas_hits	CasHits	Tracks the number of successful CAS operations
aws_elasticache_cas_misses	CasMisses	Measures the number of CAS operations that failed due to missing data
aws_elasticache_channel_authorization_failures	ChannelAuthorizationFailures	Tracks the number of channel authorization failures
aws_elasticache_cluster_based_cmds	ClusterBasedCmds	Monitors the number of cluster-based commands executed
aws_elasticache_cluster_based_cmds_latency	ClusterBasedCmdsLatency	Tracks the latency of cluster-based commands
aws_elasticache_cmd_config_get	CmdConfigGet	Measures the number of configuration GET commands executed
aws_elasticache_cmd_config_set	CmdConfigSet	Tracks the number of configuration SET commands executed
aws_elasticache_cmd_flush	CmdFlush	Monitors the number of flush commands executed in the ElastiCache cluster
aws_elasticache_cmd_get	CmdGet	Tracks the number of GET commands executed in the cache
aws_elasticache_cmd_set	CmdSet	Measures the number of SET commands executed in the cache
aws_elasticache_cmd_touch	CmdTouch	Tracks the number of touch commands executed in the cache
aws_elasticache_command_authorization_failures	CommandAuthorizationFailures	Monitors the number of command authorization failures in the ElastiCache cluster
aws_elasticache_curr_config	CurrConfig	Tracks the current configuration state of the ElastiCache instance
aws_elasticache_curr_connections	CurrConnections	Measures the current number of open connections to the ElastiCache instance
aws_elasticache_curr_items	CurrItems	Tracks the current number of items in the cache
aws_elasticache_curr_volatile_items	CurrVolatileItems	Monitors the number of volatile items in the cache
aws_elasticache_db0_average_ttl	DB0AverageTTL	Measures the average time-to-live (TTL) of items in the cache
**aws_elasticache_database_capacity_usage_counted_for_evict_percentage	DatabaseCapacityUsageCountedForEvictPercentage**	Tracks the percentage of database capacity usage considered for eviction
aws_elasticache_database_capacity_usage_percentage	DatabaseCapacityUsagePercentage	Monitors the overall percentage of database capacity usage
aws_elasticache_database_memory_usage_counted_for_evict_percentage	DatabaseMemoryUsageCountedForEvictPercentage	Tracks the percentage of database memory usage considered for eviction
aws_elasticache_database_memory_usage_percentage	DatabaseMemoryUsagePercentage	Measures the overall memory usage percentage in the ElastiCache cluster
aws_elasticache_decr_hits	DecrHits	Monitors the number of successful DECR (decrement) operations
aws_elasticache_decr_misses	DecrMisses	Tracks the number of DECR operations that failed
aws_elasticache_delete_hits	DeleteHits	Measures the number of successful DELETE operations
aws_elasticache_delete_misses	DeleteMisses	Tracks the number of DELETE operations that failed
aws_elasticache_engine_cpuutilization	EngineCPUUtilization	Monitors the CPU utilization of the ElastiCache engine
aws_elasticache_eval_based_cmds	EvalBasedCmds	Tracks the number of EVAL-based commands executed in the cache
aws_elasticache_eval_based_cmds_latency	EvalBasedCmdsLatency	Measures the latency of EVAL-based commands in the cache
aws_elasticache_evicted_unfetched	EvictedUnfetched	Monitors the number of items evicted before being fetched
aws_elasticache_evictions	Evictions	Tracks the total number of evictions in the cache
aws_elasticache_expired_unfetched	ExpiredUnfetched	Measures the number of items that expired before being fetched
aws_elasticache_freeable_memory	FreeableMemory	Tracks the amount of free memory available in the ElastiCache cluster
aws_elasticache_geo_spatial_based_cmds	GeoSpatialBasedCmds	Monitors the number of geospatial commands executed
aws_elasticache_geo_spatial_based_cmds_latency	GeoSpatialBasedCmdsLatency	Measures the latency of geospatial commands
aws_elasticache_get_hits	GetHits	Tracks the number of successful GET operations in the cache
aws_elasticache_get_misses	GetMisses	Measures the number of GET operations that failed
aws_elasticache_get_type_cmds	GetTypeCmds	Monitors the number of GET-type commands executed
aws_elasticache_get_type_cmds_latency	GetTypeCmdsLatency	Measures the latency of GET-type commands executed
aws_elasticache_global_datastore_replication_lag	GlobalDatastoreReplicationLag
aws_elasticache_hash_based_cmds	HashBasedCmds
aws_elasticache_hash_based_cmds_latency	HashBasedCmdsLatency
aws_elasticache_hyper_log_log_based_cmds	HyperLogLogBasedCmds
aws_elasticache_hyper_log_log_based_cmds_latency	HyperLogLogBasedCmdsLatency
aws_elasticache_iam_authentication_expirations	IamAuthenticationExpirations
aws_elasticache_iam_authentication_throttling	IamAuthenticationThrottling
aws_elasticache_incr_hits	IncrHits
aws_elasticache_incr_misses	IncrMisses
aws_elasticache_is_master	IsMaster
aws_elasticache_is_primary	IsPrimary
aws_elasticache_json_based_cmds	JsonBasedCmds
aws_elasticache_json_based_cmds_latency	JsonBasedCmdsLatency
aws_elasticache_json_based_get_cmds	JsonBasedGetCmds
aws_elasticache_key_authorization_failures	KeyAuthorizationFailures
aws_elasticache_key_based_cmds	KeyBasedCmds
aws_elasticache_key_based_cmds_latency	KeyBasedCmdsLatency
aws_elasticache_keys_tracked	KeysTracked
aws_elasticache_keyspace_hits	KeyspaceHits
aws_elasticache_keyspace_misses	KeyspaceMisses
aws_elasticache_list_based_cmds	ListBasedCmds
aws_elasticache_list_based_cmds_latency	ListBasedCmdsLatency
aws_elasticache_master_link_health_status	MasterLinkHealthStatus
aws_elasticache_max_replication_throughput	MaxReplicationThroughput
aws_elasticache_memory_fragmentation_ratio	MemoryFragmentationRatio
aws_elasticache_network_bandwidth_in_allowance_exceeded	NetworkBandwidthInAllowanceExceeded
aws_elasticache_network_bandwidth_out_allowance_exceeded	NetworkBandwidthOutAllowanceExceeded
aws_elasticache_network_bytes_in	NetworkBytesIn
aws_elasticache_network_bytes_out	NetworkBytesOut
aws_elasticache_network_conntrack_allowance_exceeded	NetworkConntrackAllowanceExceeded
aws_elasticache_network_link_local_allowance_exceeded	NetworkLinkLocalAllowanceExceeded
aws_elasticache_network_max_bytes_in	NetworkMaxBytesIn
aws_elasticache_network_max_bytes_out	NetworkMaxBytesOut
aws_elasticache_network_max_packets_in	NetworkMaxPacketsIn
aws_elasticache_network_max_packets_out	NetworkMaxPacketsOut
aws_elasticache_network_packets_in	NetworkPacketsIn
aws_elasticache_network_packets_out	NetworkPacketsOut
aws_elasticache_network_packets_per_second_allowance_exceeded	NetworkPacketsPerSecondAllowanceExceeded
aws_elasticache_new_connections	NewConnections
aws_elasticache_new_items	NewItems
aws_elasticache_num_items_read_from_disk	NumItemsReadFromDisk
aws_elasticache_num_items_written_to_disk	NumItemsWrittenToDisk
aws_elasticache_primary_link_health_status	PrimaryLinkHealthStatus
aws_elasticache_pub_sub_based_cmds	PubSubBasedCmds
aws_elasticache_pub_sub_based_cmds_latency	PubSubBasedCmdsLatency
aws_elasticache_reclaimed	Reclaimed
aws_elasticache_replication_bytes	ReplicationBytes
aws_elasticache_replication_delayed_write_commands	ReplicationDelayedWriteCommands
aws_elasticache_replication_lag	ReplicationLag
aws_elasticache_save_in_progress	SaveInProgress
aws_elasticache_search_based_cmds	SearchBasedCmds
aws_elasticache_search_based_get_cmds	SearchBasedGetCmds
aws_elasticache_search_based_set_cmds	SearchBasedSetCmds
aws_elasticache_search_number_of_indexed_keys	SearchNumberOfIndexedKeys
aws_elasticache_search_number_of_indexes	SearchNumberOfIndexes
aws_elasticache_search_total_index_size	SearchTotalIndexSize
aws_elasticache_set_based_cmds	SetBasedCmds
aws_elasticache_set_based_cmds_latency	SetBasedCmdsLatency
aws_elasticache_set_type_cmds	SetTypeCmds
aws_elasticache_set_type_cmds_latency	SetTypeCmdsLatency
aws_elasticache_slabs_moved	SlabsMoved
aws_elasticache_sorted_set_based_cmds	SortedSetBasedCmds
aws_elasticache_sorted_set_based_cmds_latency	SortedSetBasedCmdsLatency
aws_elasticache_stream_based_cmds	StreamBasedCmds
aws_elasticache_stream_based_cmds_latency	StreamBasedCmdsLatency
aws_elasticache_string_based_cmds	StringBasedCmds
aws_elasticache_string_based_cmds_latency	StringBasedCmdsLatency
aws_elasticache_swap_usage	SwapUsage
aws_elasticache_touch_hits	TouchHits
aws_elasticache_touch_misses	TouchMisses
aws_elasticache_traffic_management_active	TrafficManagementActive
aws_elasticache_unused_memory	UnusedMemory
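
The CacheHits and CacheMisses rows above can be combined into a hit ratio when CacheHitRate is not collected. The following Python sketch assumes the ratio is hits divided by the sum of hits and misses over the same interval; that relationship is an assumption and is not stated in the table. The function name `cache_hit_ratio` is hypothetical.

```python
from typing import Optional


def cache_hit_ratio(cache_hits: float, cache_misses: float) -> Optional[float]:
    """Return hits / (hits + misses) as a fraction between 0 and 1.

    Assumption: both inputs are Sum values for the same scrape interval,
    for example from aws_elasticache_cache_hits and
    aws_elasticache_cache_misses.
    """
    total = cache_hits + cache_misses
    if total == 0:
        # No lookups in the interval, so the ratio is undefined.
        return None
    return cache_hits / total


if __name__ == "__main__":
    print(cache_hit_ratio(900, 100))
```

Multiply the result by 100 to express it as a percentage. Returning `None` for an idle interval avoids reporting a misleading 0% hit ratio when there were no requests.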

AWS/ElasticBeanstalk

Function: Service to quickly deploy and manage applications in the cloud without provisioning resources

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_elasticbeanstalk_info	ElasticBeanstalk Info	General information about the AWS Elastic Beanstalk environment
aws_elasticbeanstalk_application_latency_p10	ApplicationLatencyP10	Tracks the 10th percentile application latency for Elastic Beanstalk
aws_elasticbeanstalk_application_latency_p50	ApplicationLatencyP50	Measures the median (50th percentile) application latency
aws_elasticbeanstalk_application_latency_p75	ApplicationLatencyP75	Tracks the 75th percentile latency of requests in Elastic Beanstalk
aws_elasticbeanstalk_application_latency_p85	ApplicationLatencyP85	Measures the 85th percentile latency for Elastic Beanstalk applications
aws_elasticbeanstalk_application_latency_p90	ApplicationLatencyP90	Tracks the 90th percentile application latency
aws_elasticbeanstalk_application_latency_p95	ApplicationLatencyP95	Measures the 95th percentile latency for Elastic Beanstalk applications
aws_elasticbeanstalk_application_latency_p99	ApplicationLatencyP99	Tracks the 99th percentile application latency
aws_elasticbeanstalk_application_latency_p99_9	ApplicationLatencyP99.9	Measures the 99.9th percentile application latency in Elastic Beanstalk
aws_elasticbeanstalk_application_requests2xx	ApplicationRequests2xx	Tracks the number of successful application requests with 2xx status codes
aws_elasticbeanstalk_application_requests3xx	ApplicationRequests3xx	Measures the number of application requests with 3xx (redirection) status codes
aws_elasticbeanstalk_application_requests4xx	ApplicationRequests4xx	Tracks the number of client error requests with 4xx status codes
aws_elasticbeanstalk_application_requests5xx	ApplicationRequests5xx	Measures the number of server error requests with 5xx status codes
aws_elasticbeanstalk_application_requests_total	ApplicationRequestsTotal	Tracks the total number of application requests received
aws_elasticbeanstalk_cpuidle	CPUIdle	Measures the idle CPU time of instances within Elastic Beanstalk
aws_elasticbeanstalk_cpuiowait	CPUIowait	Tracks the CPU time spent waiting for I/O operations to complete
aws_elasticbeanstalk_cpuirq	CPUIrq	Measures the time spent on interrupt requests (IRQ) on the CPU
aws_elasticbeanstalk_cpunice	CPUNice	Tracks the CPU time spent on user processes that have been “niced”
aws_elasticbeanstalk_cpusoftirq	CPUSoftirq	Monitors CPU time used for soft interrupt requests
aws_elasticbeanstalk_cpusystem	CPUSystem	Tracks the amount of CPU time spent executing system-level tasks
aws_elasticbeanstalk_cpuuser	CPUUser	Measures the amount of CPU time spent executing user processes
aws_elasticbeanstalk_environment_health	EnvironmentHealth	Monitors the overall health status of the Elastic Beanstalk environment
aws_elasticbeanstalk_instance_health	InstanceHealth	Tracks the health status of individual instances in Elastic Beanstalk
aws_elasticbeanstalk_instances_degraded	InstancesDegraded	Monitors the number of instances with degraded health
aws_elasticbeanstalk_instances_info	InstancesInfo	Provides general information about the state of instances in Elastic Beanstalk
aws_elasticbeanstalk_instances_no_data	InstancesNoData	Tracks the number of instances reporting no data
aws_elasticbeanstalk_instances_ok	InstancesOk	Monitors the number of healthy instances in the environment
aws_elasticbeanstalk_instances_pending	InstancesPending	Measures the number of instances in a pending state
aws_elasticbeanstalk_instances_severe	InstancesSevere	Tracks the number of instances with severe health problems
aws_elasticbeanstalk_instances_unknown	InstancesUnknown	Monitors the number of instances with unknown health status
aws_elasticbeanstalk_instances_warning	InstancesWarning	Tracks the number of instances in warning status
aws_elasticbeanstalk_load_average1min	LoadAverage1min	Measures the system load average over the last 1 minute
aws_elasticbeanstalk_load_average5min	LoadAverage5min	Tracks the system load average over the last 5 minutes
aws_elasticbeanstalk_root_filesystem_util	RootFilesystemUtil	Monitors the usage of the root file system

AWS/ElasticMapReduce

Function: Managed big data platform for processing large amounts of data using Hadoop

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_elasticmapreduce_info	ElasticMapReduce Info	General information about the state of the AWS Elastic MapReduce cluster
aws_elasticmapreduce_apps_completed	AppsCompleted	Tracks the number of applications that have successfully completed
aws_elasticmapreduce_apps_failed	AppsFailed	Monitors the number of applications that have failed
aws_elasticmapreduce_apps_killed	AppsKilled	Tracks the number of applications that were terminated or killed
aws_elasticmapreduce_apps_pending	AppsPending	Measures the number of applications that are in the pending state
aws_elasticmapreduce_apps_running	AppsRunning	Tracks the number of applications currently running
aws_elasticmapreduce_apps_submitted	AppsSubmitted	Measures the total number of applications that have been submitted
aws_elasticmapreduce_backup_failed	BackupFailed	Tracks the number of backup attempts that failed
aws_elasticmapreduce_capacity_remaining_gb	CapacityRemainingGB	Measures the remaining storage capacity in gigabytes within the cluster
aws_elasticmapreduce_cluster_status	ClusterStatus	Monitors the overall status of the Elastic MapReduce cluster
aws_elasticmapreduce_container_allocated	ContainerAllocated	Tracks the number of containers allocated for running tasks
aws_elasticmapreduce_container_pending	ContainerPending	Measures the number of containers pending allocation
aws_elasticmapreduce_container_pending_ratio	ContainerPendingRatio	Tracks the ratio of pending containers to total containers
aws_elasticmapreduce_container_reserved	ContainerReserved	Monitors the number of containers reserved for future tasks
aws_elasticmapreduce_core_nodes_pending	CoreNodesPending	Tracks the number of core nodes that are pending
aws_elasticmapreduce_core_nodes_running	CoreNodesRunning	Measures the number of core nodes that are currently running
aws_elasticmapreduce_corrupt_blocks	CorruptBlocks	Monitors the number of blocks that are identified as corrupt
aws_elasticmapreduce_dfs_pending_replication_blocks	DfsPendingReplicationBlocks	Tracks the number of HDFS blocks that are pending replication
aws_elasticmapreduce_hbase	HBase	Monitors the health and activity of the HBase database in the cluster
aws_elasticmapreduce_hdfsbytes_read	HDFSBytesRead	Measures the number of bytes read from HDFS in the cluster
aws_elasticmapreduce_hdfsbytes_written	HDFSBytesWritten	Tracks the number of bytes written to HDFS
aws_elasticmapreduce_hdfsutilization	HDFSUtilization	Monitors the utilization of HDFS in the cluster
aws_elasticmapreduce_hbase_backup_failed	HbaseBackupFailed	Tracks the number of failed backups for HBase in the cluster
aws_elasticmapreduce_io	IO	Monitors input/output (I/O) operations in the cluster
aws_elasticmapreduce_is_idle	IsIdle	Tracks if the cluster or a node is currently idle
aws_elasticmapreduce_jobs_failed	JobsFailed	Measures the number of failed jobs in the cluster
aws_elasticmapreduce_jobs_running	JobsRunning	Tracks the number of currently running jobs
aws_elasticmapreduce_live_data_nodes	LiveDataNodes	Monitors the number of live data nodes in the cluster
aws_elasticmapreduce_live_task_trackers	LiveTaskTrackers	Tracks the number of live task trackers
aws_elasticmapreduce_mractive_nodes	MRActiveNodes	Measures the number of active MapReduce nodes in the cluster
aws_elasticmapreduce_mrdecommissioned_nodes	MRDecommissionedNodes	Tracks the number of decommissioned MapReduce nodes
aws_elasticmapreduce_mrlost_nodes	MRLostNodes	Monitors the number of lost MapReduce nodes in the cluster
aws_elasticmapreduce_mrrebooted_nodes	MRRebootedNodes	Measures the number of rebooted MapReduce nodes
aws_elasticmapreduce_mrtotal_nodes	MRTotalNodes	Tracks the total number of MapReduce nodes
aws_elasticmapreduce_mrunhealthy_nodes	MRUnhealthyNodes	Monitors the number of unhealthy MapReduce nodes
aws_elasticmapreduce_map_reduce	Map/Reduce	General metric for MapReduce activity in the cluster
aws_elasticmapreduce_map_slots_open	MapSlotsOpen	Tracks the number of open Map slots in the cluster
aws_elasticmapreduce_map_tasks_remaining	MapTasksRemaining	Monitors the number of remaining Map tasks
aws_elasticmapreduce_map_tasks_running	MapTasksRunning	Tracks the number of Map tasks currently running
aws_elasticmapreduce_memory_allocated_mb	MemoryAllocatedMB	Measures the memory allocated in MB in the cluster
aws_elasticmapreduce_memory_available_mb	MemoryAvailableMB	Tracks the available memory in MB in the cluster
aws_elasticmapreduce_memory_reserved_mb	MemoryReservedMB	Monitors the memory reserved for future tasks in MB
aws_elasticmapreduce_memory_total_mb	MemoryTotalMB	Tracks the total memory available in MB in the cluster
aws_elasticmapreduce_missing_blocks	MissingBlocks	Measures the number of missing HDFS blocks in the cluster
aws_elasticmapreduce_most_recent_backup_duration	MostRecentBackupDuration	Tracks the duration of the most recent backup
aws_elasticmapreduce_node_status	NodeStatus	Monitors the overall status of the nodes in the cluster
aws_elasticmapreduce_pending_deletion_blocks	PendingDeletionBlocks	Tracks the number of HDFS blocks pending deletion
aws_elasticmapreduce_reduce_slots_open	ReduceSlotsOpen	Measures the number of open Reduce slots in the cluster
aws_elasticmapreduce_reduce_tasks_remaining	ReduceTasksRemaining	Monitors the number of remaining Reduce tasks
aws_elasticmapreduce_reduce_tasks_running	ReduceTasksRunning	Tracks the number of currently running Reduce tasks
aws_elasticmapreduce_remaining_map_tasks_per_slot	RemainingMapTasksPerSlot	Measures the remaining Map tasks per slot
aws_elasticmapreduce_s3_bytes_read	S3BytesRead	Tracks the number of bytes read from S3 during the cluster operation
aws_elasticmapreduce_s3_bytes_written	S3BytesWritten	Measures the number of bytes written to S3 during the cluster operation
aws_elasticmapreduce_task_nodes_pending	TaskNodesPending	Tracks the number of task nodes that are pending allocation
aws_elasticmapreduce_task_nodes_running	TaskNodesRunning	Monitors the number of running task nodes in the cluster
aws_elasticmapreduce_time_since_last_successful_backup	TimeSinceLastSuccessfulBackup	Measures the time elapsed since the last successful backup
aws_elasticmapreduce_total_load	TotalLoad	Tracks the total computational load on the cluster
aws_elasticmapreduce_under_replicated_blocks	UnderReplicatedBlocks	Monitors the number of under-replicated HDFS blocks in the cluster
aws_elasticmapreduce_yarnmemory_available_percentage	YARNMemoryAvailablePercentage	Tracks the percentage of available YARN memory in the cluster
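
The metric names in the first column of these tables appear to follow a consistent pattern derived from the namespace and the CloudWatch metric name. The following is a minimal sketch of that pattern, inferred only from the rows on this page. It is not the exporter's actual implementation, and the function name `prom_name` is hypothetical. It can help when you need to predict the name of a metric from its CloudWatch name.

```python
import re


def prom_name(namespace: str, cloudwatch_metric: str) -> str:
    """Approximate the metric name shown in the tables (inferred rule, not official)."""
    # Namespace: lowercase, non-alphanumerics become underscores (AWS/FSx -> aws_fsx).
    prefix = re.sub(r"[^a-z0-9]+", "_", namespace.lower()).strip("_")
    # Metric: separators such as ".", "/", "-" and "_" collapse to a single underscore.
    name = re.sub(r"[^A-Za-z0-9]+", "_", cloudwatch_metric)
    # Insert an underscore where a lowercase letter or digit is followed by an
    # uppercase letter. Runs of capitals stay together (HDFSBytesRead -> hdfsbytes_read).
    name = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name).lower().strip("_")
    # Assumption: a leading copy of the service name is dropped, as in the Glue
    # rows (glue.ALL.disk.available_GB -> all_disk_available_gb).
    service = prefix.split("_", 1)[-1]
    if name.startswith(service + "_"):
        name = name[len(service) + 1:]
    return f"{prefix}_{name}"


if __name__ == "__main__":
    print(prom_name("AWS/ElasticMapReduce", "HDFSBytesRead"))
    print(prom_name("AWS/ElasticMapReduce", "Map/Reduce"))
    print(prom_name("AWS/Glue", "glue.ALL.disk.available_GB"))
```

Because capitals are only split after a lowercase letter or digit, acronyms merge with the word that follows them, which is why `MRActiveNodes` appears as `mractive_nodes` rather than `mr_active_nodes`.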

AWS/Events

Function: Delivers a near real-time stream of system events for building reactive applications

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_events_info		General information about AWS Events
aws_events_dead_letter_invocations	DeadLetterInvocations	Tracks the number of times a message is sent to the dead letter queue
aws_events_events	Events	Monitors the total number of events received by AWS Events
aws_events_failed_invocations	FailedInvocations	Tracks the number of invocation failures
aws_events_ingestionto_invocation_complete_latency	IngestiontoInvocationCompleteLatency	Measures the latency from event ingestion to invocation completion
aws_events_ingestionto_invocation_start_latency	IngestiontoInvocationStartLatency	Measures the latency from event ingestion to invocation start
aws_events_invocation_attempts	InvocationAttempts	Tracks the total number of invocation attempts
aws_events_invocations	Invocations	Tracks the total number of invocations
aws_events_invocations_created	InvocationsCreated	Monitors the number of invocations created
aws_events_invocations_failed_to_be_sent_to_dlq	InvocationsFailedToBeSentToDlq	Tracks the number of invocations that failed to be sent to the dead letter queue
aws_events_invocations_sent_to_dlq	InvocationsSentToDlq	Tracks the number of invocations successfully sent to the dead letter queue
aws_events_matched_events	MatchedEvents	Monitors the number of events that matched event rules
aws_events_put_events_approximate_call_count	PutEventsApproximateCallCount	Measures the approximate number of PutEvents API call requests
aws_events_put_events_approximate_failed_count	PutEventsApproximateFailedCount	Tracks the approximate number of PutEvents API call failures
aws_events_put_events_approximate_success_count	PutEventsApproximateSuccessCount	Monitors the approximate number of successful PutEvents API call requests
aws_events_put_events_approximate_throttled_count	PutEventsApproximateThrottledCount	Tracks the approximate number of throttled PutEvents API call requests
aws_events_put_events_entries_count	PutEventsEntriesCount	Measures the number of event entries in PutEvents requests
aws_events_put_events_failed_entries_count	PutEventsFailedEntriesCount	Tracks the number of failed event entries in PutEvents requests
aws_events_put_events_latency	PutEventsLatency	Monitors the latency of PutEvents API requests
aws_events_put_events_request_size	PutEventsRequestSize	Measures the size of PutEvents API requests
aws_events_put_partner_events_approximate_call_count	PutPartnerEventsApproximateCallCount	Monitors the approximate number of PutPartnerEvents API call requests
aws_events_put_partner_events_approximate_failed_count	PutPartnerEventsApproximateFailedCount	Tracks the approximate number of failed PutPartnerEvents API call requests
aws_events_put_partner_events_approximate_success_count	PutPartnerEventsApproximateSuccessCount	Measures the approximate number of successful PutPartnerEvents API call requests
aws_events_put_partner_events_approximate_throttled_count	PutPartnerEventsApproximateThrottledCount	Tracks the approximate number of throttled PutPartnerEvents API call requests
aws_events_put_partner_events_entries_count	PutPartnerEventsEntriesCount	Measures the number of event entries in PutPartnerEvents requests
aws_events_put_partner_events_failed_entries_count	PutPartnerEventsFailedEntriesCount	Monitors the number of failed event entries in PutPartnerEvents requests
aws_events_put_partner_events_latency	PutPartnerEventsLatency	Tracks the latency of PutPartnerEvents API requests
aws_events_retry_invocation_attempts	RetryInvocationAttempts	Measures the number of retry invocation attempts
aws_events_successful_invocation_attempts	SuccessfulInvocationAttempts	Tracks the number of successful invocation attempts
aws_events_throttled_rules	ThrottledRules	Monitors the number of rules that were throttled
aws_events_triggered_rules	TriggeredRules	Tracks the number of event rules that were triggered

AWS/FSx

Function: Managed file systems optimized for specific workloads like Windows and Lustre

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_fsx_info		General information about FSx
aws_fsx_cpuutilization	CPUUtilization	Measures the percentage of CPU utilization on the FSx file system
aws_fsx_client_connections	ClientConnections	Tracks the number of active client connections to the FSx file system
aws_fsx_data_read_bytes	DataReadBytes	Monitors the total bytes read from the file system
aws_fsx_data_read_operations	DataReadOperations	Measures the number of data read operations
aws_fsx_data_write_bytes	DataWriteBytes	Tracks the total bytes written to the file system
aws_fsx_data_write_operations	DataWriteOperations	Monitors the number of data write operations
aws_fsx_deduplication_saved_storage	DeduplicationSavedStorage	Measures the amount of storage saved through data deduplication
aws_fsx_disk_iops_utilization	DiskIopsUtilization	Tracks the percentage of disk IOPS (Input/Output Operations Per Second) utilization
aws_fsx_disk_read_bytes	DiskReadBytes	Monitors the total bytes read from the disk
aws_fsx_disk_read_operations	DiskReadOperations	Measures the number of disk read operations
aws_fsx_disk_throughput_balance	DiskThroughputBalance	Tracks the balance of disk throughput usage
aws_fsx_disk_throughput_utilization	DiskThroughputUtilization	Measures the percentage of disk throughput utilization
aws_fsx_disk_write_bytes	DiskWriteBytes	Tracks the total bytes written to the disk
aws_fsx_disk_write_operations	DiskWriteOperations	Monitors the number of disk write operations
aws_fsx_file_server_disk_iops_balance	FileServerDiskIopsBalance	Measures the balance of IOPS utilization on the file server
aws_fsx_file_server_disk_iops_utilization	FileServerDiskIopsUtilization	Tracks the percentage of IOPS utilization on the file server
aws_fsx_file_server_disk_throughput_balance	FileServerDiskThroughputBalance	Measures the balance of disk throughput on the file server
aws_fsx_file_server_disk_throughput_utilization	FileServerDiskThroughputUtilization	Monitors the percentage of disk throughput utilization on the file server
aws_fsx_free_data_storage_capacity	FreeDataStorageCapacity	Tracks the amount of free data storage capacity available
aws_fsx_free_storage_capacity	FreeStorageCapacity	Measures the total amount of free storage capacity available
aws_fsx_memory_utilization	MemoryUtilization	Monitors the percentage of memory utilization on the file system
aws_fsx_metadata_operations	MetadataOperations	Tracks the number of metadata operations (like file system metadata lookups)
aws_fsx_network_throughput_utilization	NetworkThroughputUtilization	Measures the percentage of network throughput utilization
aws_fsx_storage_capacity_utilization	StorageCapacityUtilization	Tracks the percentage of storage capacity utilization

AWS/Firehose

Function: Service to reliably load streaming data into AWS data stores like S3 and Redshift

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_firehose_info		General information about Firehose
aws_firehose_active_partitions_limit	ActivePartitionsLimit	Tracks the limit of active partitions
aws_firehose_backup_to_s3_bytes	BackupToS3.Bytes	Measures the amount of data backed up to S3 in bytes
aws_firehose_backup_to_s3_data_freshness	BackupToS3.DataFreshness	Monitors the data freshness of backups to S3
aws_firehose_backup_to_s3_records	BackupToS3.Records	Tracks the number of records backed up to S3
aws_firehose_backup_to_s3_success	BackupToS3.Success	Measures the success rate of data backup to S3
aws_firehose_bytes_per_second_limit	BytesPerSecondLimit	Monitors the bytes per second limit for data delivery
aws_firehose_data_read_from_kinesis_stream_bytes	DataReadFromKinesisStream.Bytes	Tracks the amount of data read from a Kinesis stream in bytes
aws_firehose_data_read_from_kinesis_stream_records	DataReadFromKinesisStream.Records	Tracks the number of records read from a Kinesis stream
aws_firehose_data_read_from_source_backpressured	DataReadFromSource.Backpressured	Indicates whether reads from the data source are backpressured
aws_firehose_data_read_from_source_bytes	DataReadFromSource.Bytes	Monitors the amount of data read from the source in bytes
aws_firehose_data_read_from_source_records	DataReadFromSource.Records	Tracks the number of records read from the source
aws_firehose_delivery_to_amazon_open_search_serverless_auth_failure	DeliveryToAmazonOpenSearchServerless.AuthFailure	Tracks authorization failures during delivery to Amazon OpenSearch Serverless
aws_firehose_delivery_to_amazon_open_search_serverless_bytes	DeliveryToAmazonOpenSearchServerless.Bytes	Measures the amount of data delivered to Amazon OpenSearch Serverless in bytes
aws_firehose_delivery_to_amazon_open_search_serverless_data_freshness	DeliveryToAmazonOpenSearchServerless.DataFreshness	Monitors the data freshness during delivery to Amazon OpenSearch Serverless
aws_firehose_delivery_to_amazon_open_search_serverless_delivery_rejected	DeliveryToAmazonOpenSearchServerless.DeliveryRejected	Tracks the number of rejected deliveries to Amazon OpenSearch Serverless
aws_firehose_delivery_to_amazon_open_search_serverless_records	DeliveryToAmazonOpenSearchServerless.Records	Measures the number of records delivered to Amazon OpenSearch Serverless
aws_firehose_delivery_to_amazon_open_search_serverless_success	DeliveryToAmazonOpenSearchServerless.Success	Tracks the success rate of delivery to Amazon OpenSearch Serverless
aws_firehose_delivery_to_amazon_open_search_service_auth_failure	DeliveryToAmazonOpenSearchService.AuthFailure	Monitors authorization failures during delivery to Amazon OpenSearch Service
aws_firehose_delivery_to_amazon_open_search_service_bytes	DeliveryToAmazonOpenSearchService.Bytes	Tracks the amount of data delivered to Amazon OpenSearch Service in bytes
aws_firehose_delivery_to_amazon_open_search_service_data_freshness	DeliveryToAmazonOpenSearchService.DataFreshness	Monitors the data freshness during delivery to Amazon OpenSearch Service
aws_firehose_delivery_to_amazon_open_search_service_delivery_rejected	DeliveryToAmazonOpenSearchService.DeliveryRejected	Tracks the number of rejected deliveries to Amazon OpenSearch Service
aws_firehose_delivery_to_amazon_open_search_service_records	DeliveryToAmazonOpenSearchService.Records	Measures the number of records delivered to Amazon OpenSearch Service
aws_firehose_delivery_to_amazon_open_search_service_success	DeliveryToAmazonOpenSearchService.Success	Tracks the success rate of delivery to Amazon OpenSearch Service
aws_firehose_delivery_to_elasticsearch_bytes	DeliveryToElasticsearch.Bytes	Measures the amount of data delivered to Elasticsearch in bytes
aws_firehose_delivery_to_elasticsearch_records	DeliveryToElasticsearch.Records	Tracks the number of records delivered to Elasticsearch
aws_firehose_delivery_to_elasticsearch_success	DeliveryToElasticsearch.Success	Monitors the success rate of delivery to Elasticsearch
aws_firehose_delivery_to_http_endpoint_bytes	DeliveryToHttpEndpoint.Bytes	Measures the amount of data delivered to an HTTP endpoint in bytes
aws_firehose_delivery_to_http_endpoint_data_freshness	DeliveryToHttpEndpoint.DataFreshness	Monitors the data freshness during delivery to an HTTP endpoint
aws_firehose_delivery_to_http_endpoint_processed_bytes	DeliveryToHttpEndpoint.ProcessedBytes	Tracks the amount of data processed at an HTTP endpoint
aws_firehose_delivery_to_http_endpoint_processed_records	DeliveryToHttpEndpoint.ProcessedRecords	Monitors the number of records processed at an HTTP endpoint
aws_firehose_delivery_to_http_endpoint_records	DeliveryToHttpEndpoint.Records	Tracks the number of records delivered to an HTTP endpoint
aws_firehose_delivery_to_http_endpoint_success	DeliveryToHttpEndpoint.Success	Measures the success rate of delivery to an HTTP endpoint
aws_firehose_delivery_to_redshift_bytes	DeliveryToRedshift.Bytes	Tracks the amount of data delivered to Redshift in bytes
aws_firehose_delivery_to_redshift_records	DeliveryToRedshift.Records	Monitors the number of records delivered to Redshift
aws_firehose_delivery_to_redshift_success	DeliveryToRedshift.Success	Measures the success rate of delivery to Redshift
aws_firehose_delivery_to_s3_bytes	DeliveryToS3.Bytes	Tracks the amount of data delivered to S3 in bytes
aws_firehose_delivery_to_s3_data_freshness	DeliveryToS3.DataFreshness	Monitors the data freshness during delivery to S3
aws_firehose_delivery_to_s3_object_count	DeliveryToS3.ObjectCount	Tracks the number of objects delivered to S3
aws_firehose_delivery_to_s3_records	DeliveryToS3.Records	Monitors the number of records delivered to S3
aws_firehose_delivery_to_s3_success	DeliveryToS3.Success	Measures the success rate of delivery to S3
aws_firehose_delivery_to_snowflake_bytes	DeliveryToSnowflake.Bytes	Tracks the amount of data delivered to Snowflake in bytes
aws_firehose_delivery_to_snowflake_data_commit_latency	DeliveryToSnowflake.DataCommitLatency	Measures the latency for data commit during delivery to Snowflake
aws_firehose_delivery_to_snowflake_data_freshness	DeliveryToSnowflake.DataFreshness	Monitors the data freshness during delivery to Snowflake
aws_firehose_delivery_to_snowflake_records	DeliveryToSnowflake.Records	Tracks the number of records delivered to Snowflake
aws_firehose_delivery_to_snowflake_success	DeliveryToSnowflake.Success	Measures the success rate of delivery to Snowflake
aws_firehose_delivery_to_splunk_bytes	DeliveryToSplunk.Bytes	Tracks the amount of data delivered to Splunk in bytes
aws_firehose_delivery_to_splunk_data_ack_latency	DeliveryToSplunk.DataAckLatency	Measures the acknowledgment latency during delivery to Splunk
aws_firehose_delivery_to_splunk_data_freshness	DeliveryToSplunk.DataFreshness	Monitors the data freshness during delivery to Splunk
aws_firehose_delivery_to_splunk_records	DeliveryToSplunk.Records	Tracks the number of records delivered to Splunk
aws_firehose_delivery_to_splunk_success	DeliveryToSplunk.Success	Measures the success rate of delivery to Splunk
aws_firehose_describe_delivery_stream_latency	DescribeDeliveryStream.Latency	Tracks the latency for describing a delivery stream
aws_firehose_describe_delivery_stream_requests	DescribeDeliveryStream.Requests	Measures the number of requests to describe a delivery stream
aws_firehose_execute_processing_duration	ExecuteProcessing.Duration	Tracks the duration of data processing during delivery
aws_firehose_execute_processing_success	ExecuteProcessing.Success	Measures the success rate of data processing during delivery
aws_firehose_failed_conversion_bytes	FailedConversion.Bytes	Tracks the number of bytes that failed during conversion
aws_firehose_failed_conversion_records	FailedConversion.Records	Monitors the number of records that failed during conversion
aws_firehose_failed_validation_bytes	FailedValidation.Bytes	Tracks the number of bytes that failed during validation
aws_firehose_failed_validation_records	FailedValidation.Records	Monitors the number of records that failed during validation
aws_firehose_incoming_bytes	IncomingBytes	Tracks the amount of incoming data in bytes
aws_firehose_incoming_put_requests	IncomingPutRequests	Measures the number of incoming put requests
aws_firehose_incoming_records	IncomingRecords	Monitors the number of incoming records
aws_firehose_jqprocessing_duration	JQProcessing.Duration	Tracks the duration of JQ (JSON Query) processing
aws_firehose_kmskey_access_denied	KMSKeyAccessDenied	Monitors instances where access to the KMS (Key Management Service) key is denied
aws_firehose_kmskey_disabled	KMSKeyDisabled	Tracks the instances where the KMS key is disabled
aws_firehose_kmskey_invalid_state	KMSKeyInvalidState	Monitors the instances where the KMS key is in an invalid state
aws_firehose_kmskey_not_found	KMSKeyNotFound	Tracks the instances where the KMS key is not found
aws_firehose_kafka_offset_lag	KafkaOffsetLag	Monitors the lag in Kafka offset
aws_firehose_kinesis_millis_behind_latest	KinesisMillisBehindLatest	Tracks the time lag (in milliseconds) behind the latest record in Kinesis
aws_firehose_list_delivery_streams_latency	ListDeliveryStreams.Latency	Measures the latency in listing delivery streams
aws_firehose_list_delivery_streams_requests	ListDeliveryStreams.Requests	Tracks the number of requests for listing delivery streams
aws_firehose_output_decompressed_bytes_failed	OutputDecompressedBytes.Failed	Measures the number of decompressed bytes that failed
aws_firehose_output_decompressed_bytes_success	OutputDecompressedBytes.Success	Tracks the number of decompressed bytes that succeeded
aws_firehose_output_decompressed_records_failed	OutputDecompressedRecords.Failed	Monitors the number of decompressed records that failed
aws_firehose_output_decompressed_records_success	OutputDecompressedRecords.Success	Tracks the number of decompressed records that succeeded
aws_firehose_partition_count	PartitionCount	Measures the count of partitions during data delivery
aws_firehose_partition_count_exceeded	PartitionCountExceeded	Monitors instances where partition count exceeds limits
aws_firehose_per_partition_throughput	PerPartitionThroughput	Measures the throughput per partition during data delivery
aws_firehose_put_record_bytes	PutRecord.Bytes	Tracks the number of bytes delivered via PutRecord API
aws_firehose_put_record_latency	PutRecord.Latency	Measures the latency in PutRecord API calls
aws_firehose_put_record_requests	PutRecord.Requests	Monitors the number of requests via PutRecord API
aws_firehose_put_record_batch_bytes	PutRecordBatch.Bytes	Tracks the number of bytes delivered via PutRecordBatch API
aws_firehose_put_record_batch_latency	PutRecordBatch.Latency	Measures the latency in PutRecordBatch API calls
aws_firehose_put_record_batch_records	PutRecordBatch.Records	Monitors the number of records delivered via PutRecordBatch API
aws_firehose_put_record_batch_requests	PutRecordBatch.Requests	Measures the number of requests via PutRecordBatch API
aws_firehose_put_requests_per_second_limit	PutRequestsPerSecondLimit	Monitors the limit on PutRecord requests per second
aws_firehose_records_per_second_limit	RecordsPerSecondLimit	Tracks the limit on records processed per second
aws_firehose_resource_count	ResourceCount	Monitors the count of resources in the data delivery stream
aws_firehose_source_throttled_delay	SourceThrottled.Delay	Measures the delay caused by throttling on the data source
aws_firehose_succeed_conversion_bytes	SucceedConversion.Bytes	Tracks the number of bytes successfully converted
aws_firehose_succeed_conversion_records	SucceedConversion.Records	Monitors the number of records successfully converted
aws_firehose_succeed_processing_bytes	SucceedProcessing.Bytes	Measures the number of bytes successfully processed
aws_firehose_succeed_processing_records	SucceedProcessing.Records	Tracks the number of records successfully processed
aws_firehose_throttled_describe_stream	ThrottledDescribeStream	Monitors instances of throttled DescribeStream API calls
aws_firehose_throttled_get_records	ThrottledGetRecords	Measures instances of throttled GetRecords API calls
aws_firehose_throttled_get_shard_iterator	ThrottledGetShardIterator	Tracks instances of throttled GetShardIterator API calls
aws_firehose_throttled_records	ThrottledRecords	Measures instances where records are throttled
aws_firehose_update_delivery_stream_latency	UpdateDeliveryStream.Latency	Measures the latency in updating delivery streams
aws_firehose_update_delivery_stream_requests	UpdateDeliveryStream.Requests	Tracks the number of requests for updating delivery streams

AWS/GameLift

Function: Managed service for deploying, operating, and scaling dedicated game servers

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_gamelift_info		General information about GameLift
aws_gamelift_activating_game_sessions	ActivatingGameSessions	Tracks the number of game sessions currently being activated
aws_gamelift_active_game_sessions	ActiveGameSessions	Monitors the number of active game sessions
aws_gamelift_active_instances	ActiveInstances	Tracks the number of active GameLift instances
aws_gamelift_active_server_processes	ActiveServerProcesses	Monitors the number of active server processes
aws_gamelift_available_game_servers	AvailableGameServers	Tracks the number of available game servers
aws_gamelift_available_game_sessions	AvailableGameSessions	Monitors the number of available game sessions
aws_gamelift_average_wait_time	AverageWaitTime	Tracks the average wait time for players
aws_gamelift_current_player_sessions	CurrentPlayerSessions	Monitors the number of current active player sessions
aws_gamelift_current_tickets	CurrentTickets	Tracks the number of current active matchmaking tickets
aws_gamelift_desired_instances	DesiredInstances	Tracks the number of desired instances for the fleet
aws_gamelift_draining_available_game_servers	DrainingAvailableGameServers	Monitors the number of available game servers that are draining
aws_gamelift_draining_utilized_game_servers	DrainingUtilizedGameServers	Tracks the number of utilized game servers that are draining
aws_gamelift_first_choice_not_viable	FirstChoiceNotViable	Monitors the number of times the first placement choice was not viable
aws_gamelift_first_choice_out_of_capacity	FirstChoiceOutOfCapacity	Tracks the number of times the first placement choice ran out of capacity
aws_gamelift_game_session_interruptions	GameSessionInterruptions	Monitors the number of game session interruptions
aws_gamelift_healthy_server_processes	HealthyServerProcesses	Tracks the number of healthy server processes
aws_gamelift_idle_instances	IdleInstances	Monitors the number of idle instances in the fleet
aws_gamelift_instance_interruptions	InstanceInterruptions	Tracks the number of GameLift instance interruptions
aws_gamelift_lowest_latency_placement	LowestLatencyPlacement	Monitors placements based on the lowest latency
aws_gamelift_lowest_price_placement	LowestPricePlacement	Tracks placements based on the lowest price
aws_gamelift_match_acceptances_timed_out	MatchAcceptancesTimedOut	Monitors the number of match acceptance timeouts
aws_gamelift_matches_accepted	MatchesAccepted	Tracks the number of matches that have been accepted
aws_gamelift_matches_created	MatchesCreated	Monitors the number of matches that have been created
aws_gamelift_matches_placed	MatchesPlaced	Tracks the number of matches successfully placed
aws_gamelift_matches_rejected	MatchesRejected	Monitors the number of rejected matches
aws_gamelift_max_instances	MaxInstances	Tracks the maximum number of instances
aws_gamelift_min_instances	MinInstances	Monitors the minimum number of instances
aws_gamelift_percent_available_game_sessions	PercentAvailableGameSessions	Tracks the percentage of available game sessions
aws_gamelift_percent_healthy_server_processes	PercentHealthyServerProcesses	Monitors the percentage of healthy server processes
aws_gamelift_percent_idle_instances	PercentIdleInstances	Tracks the percentage of idle instances
aws_gamelift_placement	Placement	Monitors the match placement process
aws_gamelift_placements_canceled	PlacementsCanceled	Tracks the number of canceled placements
aws_gamelift_placements_failed	PlacementsFailed	Monitors the number of failed placements
aws_gamelift_placements_started	PlacementsStarted	Tracks the number of placement processes started
aws_gamelift_placements_succeeded	PlacementsSucceeded	Monitors the number of successful placements
aws_gamelift_placements_timed_out	PlacementsTimedOut	Tracks the number of timed-out placements
aws_gamelift_player_session_activations	PlayerSessionActivations	Monitors the number of activated player sessions
aws_gamelift_players_started	PlayersStarted	Tracks the number of players who have started their sessions
aws_gamelift_queue_depth	QueueDepth	Monitors the depth of the matchmaking queue
aws_gamelift_rule_evaluations_failed	RuleEvaluationsFailed	Tracks the number of failed rule evaluations during matchmaking
aws_gamelift_rule_evaluations_passed	RuleEvaluationsPassed	Monitors the number of passed rule evaluations during matchmaking
aws_gamelift_server_process_abnormal_terminations	ServerProcessAbnormalTerminations	Tracks the number of abnormal terminations of server processes
aws_gamelift_server_process_activations	ServerProcessActivations	Monitors the number of server process activations
aws_gamelift_server_process_terminations	ServerProcessTerminations	Tracks the number of server process terminations
aws_gamelift_tickets_failed	TicketsFailed	Monitors the number of failed matchmaking tickets
aws_gamelift_tickets_started	TicketsStarted	Tracks the number of matchmaking tickets that have started
aws_gamelift_tickets_timed_out	TicketsTimedOut	Monitors the number of matchmaking tickets that have timed out
aws_gamelift_time_to_match	TimeToMatch	Tracks the average time taken to find a match
aws_gamelift_time_to_ticket_success	TimeToTicketSuccess	Monitors the time taken to successfully complete a matchmaking ticket
aws_gamelift_utilized_game_servers	UtilizedGameServers	Tracks the number of utilized game servers

AWS/GlobalAccelerator

Function: Provides static IP addresses to improve availability and performance for global applications

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_globalaccelerator_info		General information about Global Accelerator
aws_globalaccelerator_healthy_endpoint_count	HealthyEndpointCount	Monitors the number of healthy endpoints in the accelerator
aws_globalaccelerator_new_flow_count	NewFlowCount	Tracks the number of new network flows being processed
aws_globalaccelerator_processed_bytes_in	ProcessedBytesIn	Monitors the volume of incoming traffic processed by the accelerator
aws_globalaccelerator_processed_bytes_out	ProcessedBytesOut	Tracks the volume of outgoing traffic processed by the accelerator
aws_globalaccelerator_unhealthy_endpoint_count	UnhealthyEndpointCount	Tracks the number of unhealthy endpoints in the accelerator

AWS/Glue

Function: Managed ETL service that prepares and loads data for analytics

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_glue_info		General information about AWS Glue
aws_glue_all_disk_available_gb	glue.ALL.disk.available_GB	Tracks the available disk space in gigabytes for all Glue resources
aws_glue_all_disk_used_percentage	glue.ALL.disk.used.percentage	Measures the percentage of disk space used across all Glue resources
aws_glue_all_disk_used_gb	glue.ALL.disk.used_GB	Tracks the used disk space in gigabytes for all Glue resources
aws_glue_all_jvm_heap_usage	glue.ALL.jvm.heap.usage	Monitors the JVM heap usage for all Glue resources
aws_glue_all_jvm_heap_used	glue.ALL.jvm.heap.used	Measures the amount of JVM heap used across all Glue resources
aws_glue_all_memory_heap_available	glue.ALL.memory.heap.available	Tracks the available memory heap for all Glue resources
aws_glue_all_memory_heap_used	glue.ALL.memory.heap.used	Measures the used memory heap for all Glue resources
aws_glue_all_memory_heap_used_percentage	glue.ALL.memory.heap.used.percentage	Measures the percentage of memory heap used across all Glue resources
aws_glue_all_memory_non_heap_available	glue.ALL.memory.non-heap.available	Monitors the available non-heap memory for all Glue resources
aws_glue_all_memory_non_heap_percentage	glue.ALL.memory.non-heap.percentage	Tracks the percentage of non-heap memory used
aws_glue_all_memory_non_heap_used	glue.ALL.memory.non-heap.used	Measures the used non-heap memory across all Glue resources
aws_glue_all_memory_total_available	glue.ALL.memory.total.available	Tracks the total available memory for all Glue resources
aws_glue_all_memory_total_used	glue.ALL.memory.total.used	Measures the total used memory for all Glue resources
aws_glue_all_memory_total_used_percentage	glue.ALL.memory.total.used.percentage	Measures the total percentage of memory used
aws_glue_all_s3_filesystem_read_bytes	glue.ALL.s3.filesystem.read_bytes	Tracks the total number of bytes read from S3 filesystems
aws_glue_all_s3_filesystem_write_bytes	glue.ALL.s3.filesystem.write_bytes	Tracks the total number of bytes written to S3 filesystems
aws_glue_all_system_cpu_system_load	glue.ALL.system.cpuSystemLoad	Monitors the system CPU load across all Glue resources
aws_glue_driver_block_manager_disk_disk_space_used_mb	glue.driver.BlockManager.disk.diskSpaceUsed_MB	Measures the disk space used by the block manager in megabytes
aws_glue_driver_executor_allocation_manager_executors_number_all_executors	glue.driver.ExecutorAllocationManager.executors.numberAllExecutors	Tracks the number of executors across all Glue drivers
aws_glue_driver_executor_allocation_manager_executors_number_max_needed_executors	glue.driver.ExecutorAllocationManager.executors.numberMaxNeededExecutors	Tracks the maximum number of executors needed
aws_glue_driver_aggregate_bytes_read	glue.driver.aggregate.bytesRead	Tracks the total bytes read across all Glue driver instances
aws_glue_driver_aggregate_elapsed_time	glue.driver.aggregate.elapsedTime	Measures the total elapsed time for tasks
aws_glue_driver_aggregate_num_completed_stages	glue.driver.aggregate.numCompletedStages	Tracks the total number of completed stages
aws_glue_driver_aggregate_num_completed_tasks	glue.driver.aggregate.numCompletedTasks	Tracks the total number of completed tasks
aws_glue_driver_aggregate_num_failed_tasks	glue.driver.aggregate.numFailedTasks	Measures the number of failed tasks
aws_glue_driver_aggregate_num_killed_tasks	glue.driver.aggregate.numKilledTasks	Tracks the number of killed tasks
aws_glue_driver_aggregate_records_read	glue.driver.aggregate.recordsRead	Tracks the total number of records read by drivers
aws_glue_driver_aggregate_shuffle_bytes_written	glue.driver.aggregate.shuffleBytesWritten	Measures the number of shuffle bytes written
aws_glue_driver_aggregate_shuffle_local_bytes_read	glue.driver.aggregate.shuffleLocalBytesRead	Tracks the number of shuffle bytes read locally
aws_glue_driver_bytes_read	glue.driver.bytesRead	Measures the total bytes read by drivers
aws_glue_driver_bytes_written	glue.driver.bytesWritten	Measures the total bytes written by drivers
aws_glue_driver_disk_available_gb	glue.driver.disk.available_GB	Tracks the available disk space for Glue drivers
aws_glue_driver_disk_used_percentage	glue.driver.disk.used.percentage	Measures the percentage of disk space used by Glue drivers
aws_glue_driver_disk_used_gb	glue.driver.disk.used_GB	Measures the used disk space in gigabytes for Glue drivers
aws_glue_driver_files_read	glue.driver.filesRead	Tracks the total number of files read
aws_glue_driver_files_written	glue.driver.filesWritten	Measures the total number of files written
aws_glue_driver_jvm_heap_usage	glue.driver.jvm.heap.usage	Monitors the JVM heap usage of Glue drivers
aws_glue_driver_jvm_heap_used	glue.driver.jvm.heap.used	Measures the used JVM heap for Glue drivers
aws_glue_driver_memory_heap_available	glue.driver.memory.heap.available	Tracks the available heap memory for Glue drivers
aws_glue_driver_memory_heap_used	glue.driver.memory.heap.used	Measures the used heap memory for Glue drivers
aws_glue_driver_memory_heap_used_percentage	glue.driver.memory.heap.used.percentage	Measures the percentage of heap memory used
aws_glue_driver_memory_non_heap_available	glue.driver.memory.non-heap.available	Tracks the available non-heap memory for Glue drivers
aws_glue_driver_memory_non_heap_percentage	glue.driver.memory.non-heap.percentage	Measures the percentage of non-heap memory used
aws_glue_driver_memory_non_heap_used	glue.driver.memory.non-heap.used	Tracks the non-heap memory used by Glue drivers
aws_glue_driver_memory_total_available	glue.driver.memory.total.available	Tracks the total available memory for Glue drivers
aws_glue_driver_memory_total_used	glue.driver.memory.total.used	Measures the total memory used by Glue drivers
aws_glue_driver_memory_total_used_percentage	glue.driver.memory.total.used.percentage	Tracks the percentage of total memory used
aws_glue_driver_partitions_read	glue.driver.partitionsRead	Tracks the number of partitions read by drivers
aws_glue_driver_records_read	glue.driver.recordsRead	Tracks the number of records read by Glue drivers
aws_glue_driver_records_written	glue.driver.recordsWritten	Measures the number of records written by Glue drivers
aws_glue_driver_s3_filesystem_read_bytes	glue.driver.s3.filesystem.read_bytes	Measures the bytes read from S3 filesystem by drivers
aws_glue_driver_s3_filesystem_write_bytes	glue.driver.s3.filesystem.write_bytes	Tracks the bytes written to S3 filesystem by drivers
aws_glue_driver_skewness_job	glue.driver.skewness.job	Tracks skewness in job execution
aws_glue_driver_skewness_stage	glue.driver.skewness.stage	Tracks skewness in stages of execution
aws_glue_driver_streaming_batch_processing_time_in_ms	glue.driver.streaming.batchProcessingTimeInMs	Measures the batch processing time in milliseconds for streaming jobs
aws_glue_driver_streaming_num_records	glue.driver.streaming.numRecords	Tracks the number of records processed in streaming jobs
aws_glue_driver_system_cpu_system_load	glue.driver.system.cpuSystemLoad	Monitors the CPU system load on Glue drivers
aws_glue_driver_worker_utilization	glue.driver.workerUtilization	Tracks the worker utilization rate
aws_glue_error_all	glue.error.ALL	Tracks all errors occurring in Glue
aws_glue_succeed_all	glue.succeed.ALL	Measures the success rate of all Glue jobs
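
The Glue rows above pair each dotted CloudWatch name with a snake_case metric name. The following sketch approximates that naming pattern with two regular expressions. It is inferred only from the rows in this table and is not taken from the exporter's source; the function name `exported_metric_name` and its `service` argument are hypothetical.

```python
import re


def exported_metric_name(service: str, cloudwatch_metric: str) -> str:
    """Approximate the snake_case name shown in the first table column."""
    # Insert an underscore at camelCase boundaries
    # (a lowercase letter or digit followed by an uppercase letter).
    name = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", cloudwatch_metric)
    # Collapse dots, hyphens, and any other non-alphanumeric runs into one underscore.
    name = re.sub(r"[^A-Za-z0-9]+", "_", name).strip("_").lower()
    # Drop a leading service prefix so it is not repeated (glue.driver... -> driver...).
    prefix = service.lower() + "_"
    if name.startswith(prefix):
        name = name[len(prefix):]
    return f"aws_{service.lower()}_{name}"


print(exported_metric_name("glue", "glue.driver.BlockManager.disk.diskSpaceUsed_MB"))
print(exported_metric_name("glue", "glue.ALL.memory.non-heap.used"))
```

Treat the result as a lookup aid only. Where a generated name and the table disagree, the table is authoritative.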

AWS/IoT

Function: Provides cloud services to connect IoT devices to the cloud and manage IoT workloads

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_iot_info		General information about AWS IoT
aws_iot_canceled_job_execution_count	CanceledJobExecutionCount	Tracks the count of canceled job executions
aws_iot_canceled_job_execution_total_count	CanceledJobExecutionTotalCount	Tracks the total count of canceled job executions
aws_iot_client_error	ClientError	Monitors the client error count
aws_iot_connect_auth_error	Connect.AuthError	Tracks authentication errors during connection attempts
aws_iot_connect_client_error	Connect.ClientError	Measures client-side errors during connection attempts
aws_iot_connect_server_error	Connect.ServerError	Tracks server-side errors during connection attempts
aws_iot_connect_success	Connect.Success	Measures successful connection attempts
aws_iot_connect_throttle	Connect.Throttle	Monitors throttled connection attempts
aws_iot_delete_thing_shadow_accepted	DeleteThingShadow.Accepted	Tracks successful shadow deletions
aws_iot_failed_job_execution_count	FailedJobExecutionCount	Tracks the count of failed job executions
aws_iot_failed_job_execution_total_count	FailedJobExecutionTotalCount	Measures the total count of failed job executions
aws_iot_failure	Failure	Tracks overall failure events
aws_iot_get_thing_shadow_accepted	GetThingShadow.Accepted	Measures the number of successful shadow retrievals
aws_iot_in_progress_job_execution_count	InProgressJobExecutionCount	Tracks the count of in-progress job executions
aws_iot_in_progress_job_execution_total_count	InProgressJobExecutionTotalCount	Measures the total count of in-progress job executions
aws_iot_non_compliant_resources	NonCompliantResources	Tracks the count of non-compliant resources
aws_iot_num_log_batches_failed_to_publish_throttled	NumLogBatchesFailedToPublishThrottled	Monitors log batches that failed to publish due to throttling
aws_iot_num_log_events_failed_to_publish_throttled	NumLogEventsFailedToPublishThrottled	Measures log events that failed to publish due to throttling
aws_iot_parse_error	ParseError	Tracks the number of message parse errors
aws_iot_ping_success	Ping.Success	Measures successful ping operations
aws_iot_publish_in_auth_error	PublishIn.AuthError	Tracks authentication errors during inbound publish operations
aws_iot_publish_in_client_error	PublishIn.ClientError	Monitors client-side errors during inbound publish operations
aws_iot_publish_in_server_error	PublishIn.ServerError	Tracks server-side errors during inbound publish operations
aws_iot_publish_in_success	PublishIn.Success	Measures successful inbound publish operations
aws_iot_publish_in_throttle	PublishIn.Throttle	Tracks throttled inbound publish operations
aws_iot_publish_out_auth_error	PublishOut.AuthError	Tracks authentication errors during outbound publish operations
aws_iot_publish_out_client_error	PublishOut.ClientError	Monitors client-side errors during outbound publish operations
aws_iot_publish_out_success	PublishOut.Success	Measures successful outbound publish operations
aws_iot_queued_job_execution_count	QueuedJobExecutionCount	Tracks the count of job executions in the queue
aws_iot_queued_job_execution_total_count	QueuedJobExecutionTotalCount	Measures the total count of queued job executions
aws_iot_rejected_job_execution_count	RejectedJobExecutionCount	Tracks the count of rejected job executions
aws_iot_rejected_job_execution_total_count	RejectedJobExecutionTotalCount	Measures the total count of rejected job executions
aws_iot_removed_job_execution_count	RemovedJobExecutionCount	Tracks the count of removed job executions
aws_iot_removed_job_execution_total_count	RemovedJobExecutionTotalCount	Measures the total count of removed job executions
aws_iot_resources_evaluated	ResourcesEvaluated	Measures the number of resources evaluated
aws_iot_rule_message_throttled	RuleMessageThrottled	Tracks the number of rule messages throttled
aws_iot_rule_not_found	RuleNotFound	Measures instances where rules were not found
aws_iot_rules_executed	RulesExecuted	Tracks the number of executed rules
aws_iot_server_error	ServerError	Monitors server-side errors
aws_iot_subscribe_auth_error	Subscribe.AuthError	Tracks authentication errors during subscription attempts
aws_iot_subscribe_client_error	Subscribe.ClientError	Measures client-side errors during subscription attempts
aws_iot_subscribe_server_error	Subscribe.ServerError	Tracks server-side errors during subscription attempts
aws_iot_subscribe_success	Subscribe.Success	Measures successful subscription attempts
aws_iot_subscribe_throttle	Subscribe.Throttle	Monitors throttled subscription attempts
aws_iot_succeeded_job_execution_count	SucceededJobExecutionCount	Tracks the count of successful job executions
aws_iot_succeeded_job_execution_total_count	SucceededJobExecutionTotalCount	Measures the total count of successful job executions
aws_iot_success	Success	Tracks overall successful operations
aws_iot_topic_match	TopicMatch	Measures the number of successful topic matches
aws_iot_unsubscribe_client_error	Unsubscribe.ClientError	Monitors client-side errors during unsubscribe operations
aws_iot_unsubscribe_server_error	Unsubscribe.ServerError	Tracks server-side errors during unsubscribe operations
aws_iot_unsubscribe_success	Unsubscribe.Success	Measures successful unsubscribe operations
aws_iot_unsubscribe_throttle	Unsubscribe.Throttle	Monitors throttled unsubscribe operations
aws_iot_update_thing_shadow_accepted	UpdateThingShadow.Accepted	Measures successful shadow update operations
aws_iot_violations	Violations	Tracks policy violations
aws_iot_violations_cleared	ViolationsCleared	Measures cleared violations
aws_iot_violations_invalidated	ViolationsInvalidated	Tracks invalidated violations
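
Many of the AWS IoT metrics above follow an `Operation.Outcome` shape (for example `Connect.Success` and `Connect.AuthError`), so a success ratio per operation can be derived from their Sum statistics. The sketch below is one possible way to do that. The sample values are made up, the function name is hypothetical, and counting throttled requests as attempts is an assumption rather than documented behavior.

```python
from collections import defaultdict

# Outcomes treated as attempts; counting "Throttle" as an attempt is an assumption.
OUTCOMES = {"Success", "AuthError", "ClientError", "ServerError", "Throttle"}


def success_ratio_by_operation(sums: dict) -> dict:
    """Return {operation: successes / attempts} from CloudWatch Sum values."""
    totals = defaultdict(lambda: {"ok": 0.0, "all": 0.0})
    for metric, value in sums.items():
        operation, _, outcome = metric.partition(".")
        if outcome not in OUTCOMES:
            continue  # skip metrics that are not Operation.Outcome counters
        totals[operation]["all"] += value
        if outcome == "Success":
            totals[operation]["ok"] += value
    return {op: t["ok"] / t["all"] for op, t in totals.items() if t["all"] > 0}


# Hypothetical Sum values for one scrape interval.
sample = {
    "Connect.Success": 90,
    "Connect.AuthError": 5,
    "Connect.ClientError": 3,
    "Connect.Throttle": 2,
    "PublishIn.Success": 50,
    "RulesExecuted": 10,
}
print(success_ratio_by_operation(sample))
```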

AWS/Kafka

Function: Managed Apache Kafka service for building real-time streaming applications

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_kafka_info		General information about AWS Kafka cluster
aws_kafka_active_controller_count	ActiveControllerCount	Indicates how many active controllers are in the Kafka cluster
aws_kafka_burst_balance	BurstBalance	Measures the burst balance remaining for the Kafka broker instances
aws_kafka_bw_in_allowance_exceeded	BwInAllowanceExceeded	Tracks the instances where incoming bandwidth allowance has been exceeded
aws_kafka_bw_out_allowance_exceeded	BwOutAllowanceExceeded	Tracks the instances where outgoing bandwidth allowance has been exceeded
aws_kafka_bytes_in_per_sec	BytesInPerSec	Measures the rate of incoming bytes per second into the Kafka cluster
aws_kafka_bytes_out_per_sec	BytesOutPerSec	Measures the rate of outgoing bytes per second from the Kafka cluster
aws_kafka_cpucredit_balance	CPUCreditBalance	Shows the remaining CPU credits for instances running in burstable performance mode
aws_kafka_client_connection_count	ClientConnectionCount	Indicates the total number of client connections to the Kafka brokers
aws_kafka_conn_track_allowance_exceeded	ConnTrackAllowanceExceeded	Tracks instances where the connection tracking allowance is exceeded
aws_kafka_connection_close_rate	ConnectionCloseRate	Monitors the rate at which connections are being closed
aws_kafka_connection_count	ConnectionCount	Displays the number of open connections to the Kafka brokers
aws_kafka_connection_creation_rate	ConnectionCreationRate	Tracks the rate of new connections being created to the Kafka brokers
aws_kafka_cpu_credit_usage	CPUCreditUsage	Shows the CPU credits consumed by the Kafka instances running in burstable mode
aws_kafka_cpu_idle	CPUIdle	Indicates the percentage of idle CPU resources on Kafka instances
aws_kafka_cpu_io_wait	CpuIoWait	Measures the time instances spend waiting for I/O operations to complete
aws_kafka_cpu_system	CpuSystem	Tracks CPU usage by the system processes on Kafka instances
aws_kafka_cpu_user	CpuUser	Shows CPU usage by user processes on Kafka instances
aws_kafka_estimated_max_time_lag	EstimatedMaxTimeLag	Estimates the time a consumer group needs to drain its maximum offset lag
aws_kafka_estimated_time_lag	EstimatedTimeLag	Estimates the time a consumer group needs to drain its partition offset lag
aws_kafka_fetch_consumer_local_time_ms_mean	FetchConsumerLocalTimeMsMean	Measures the average time it takes to fetch messages locally by the consumer
aws_kafka_fetch_consumer_request_queue_time_ms_mean	FetchConsumerRequestQueueTimeMsMean	Indicates the average time messages spend in the consumer request queue
aws_kafka_fetch_consumer_response_queue_time_ms_mean	FetchConsumerResponseQueueTimeMsMean	Tracks the average time it takes for a consumer to queue a response
aws_kafka_fetch_consumer_response_send_time_ms_mean	FetchConsumerResponseSendTimeMsMean	Measures the average time taken to send a consumer response
aws_kafka_fetch_consumer_total_time_ms_mean	FetchConsumerTotalTimeMsMean	Tracks the total time spent processing a consumer fetch request
aws_kafka_fetch_follower_local_time_ms_mean	FetchFollowerLocalTimeMsMean	Measures the average time it takes for a Kafka broker follower to fetch messages locally
aws_kafka_fetch_follower_request_queue_time_ms_mean	FetchFollowerRequestQueueTimeMsMean	Measures the time follower fetch requests spend in the queue
aws_kafka_fetch_follower_response_queue_time_ms_mean	FetchFollowerResponseQueueTimeMsMean	Tracks the time follower fetch responses spend in the response queue
aws_kafka_fetch_follower_response_send_time_ms_mean	FetchFollowerResponseSendTimeMsMean	Measures the time it takes for a Kafka broker follower to send a fetch response
aws_kafka_fetch_follower_total_time_ms_mean	FetchFollowerTotalTimeMsMean	Tracks the total time for a Kafka broker follower to fetch messages
aws_kafka_fetch_message_conversions_per_sec	FetchMessageConversionsPerSec	Monitors the rate of message format conversions during fetching
aws_kafka_fetch_throttle_byte_rate	FetchThrottleByteRate	Measures the rate at which fetching is throttled due to byte rate limits
aws_kafka_fetch_throttle_queue_size	FetchThrottleQueueSize	Indicates the number of messages in the fetch throttle queue
aws_kafka_fetch_throttle_time	FetchThrottleTime	Tracks the total time Kafka throttles fetch requests
aws_kafka_global_partition_count	GlobalPartitionCount	Displays the total number of partitions in the Kafka cluster
aws_kafka_global_topic_count	GlobalTopicCount	Shows the total number of topics in the Kafka cluster
aws_kafka_heap_memory_after_gc	HeapMemoryAfterGC	Tracks the amount of heap memory remaining after garbage collection
aws_kafka_app_logs_disk_used	KafkaAppLogsDiskUsed	Measures the amount of disk space used by Kafka application logs
aws_kafka_data_logs_disk_used	KafkaDataLogsDiskUsed	Measures the disk space used by Kafka data logs
aws_kafka_leader_count	LeaderCount	Shows the number of partition leaders in the Kafka cluster
aws_kafka_max_offset_lag	MaxOffsetLag	Measures the maximum consumer offset lag across all partitions in a topic
aws_kafka_memory_buffered	MemoryBuffered	Indicates the amount of memory currently buffered by Kafka
aws_kafka_memory_cached	MemoryCached	Shows the amount of memory cached by Kafka
aws_kafka_memory_free	MemoryFree	Displays the amount of free memory on Kafka brokers
aws_kafka_memory_used	MemoryUsed	Measures the total amount of memory being used by Kafka brokers
aws_kafka_messages_in_per_sec	MessagesInPerSec	Tracks the number of messages produced per second in the Kafka cluster
aws_kafka_network_processor_avg_idle_percent	NetworkProcessorAvgIdlePercent	Measures the idle percentage of the network processors
aws_kafka_network_rx_dropped	NetworkRxDropped	Shows the number of dropped incoming network packets
aws_kafka_network_rx_errors	NetworkRxErrors	Tracks the number of errors on received network packets
aws_kafka_network_rx_packets	NetworkRxPackets	Measures the number of network packets received
aws_kafka_network_tx_dropped	NetworkTxDropped	Tracks the number of dropped outgoing network packets
aws_kafka_network_tx_errors	NetworkTxErrors	Shows the number of errors on transmitted network packets
aws_kafka_network_tx_packets	NetworkTxPackets	Tracks the number of network packets transmitted
aws_kafka_offline_partitions_count	OfflinePartitionsCount	Monitors the number of Kafka partitions that are offline
aws_kafka_offset_lag	OffsetLag	Measures the partition-level consumer lag, in number of offsets
aws_kafka_partition_count	PartitionCount	Displays the number of topic partitions per broker, including replicas
aws_kafka_pps_allowance_exceeded	PpsAllowanceExceeded	Tracks instances where the packets-per-second allowance has been exceeded
aws_kafka_produce_local_time_ms_mean	ProduceLocalTimeMsMean	Measures the average time taken to produce messages locally
aws_kafka_produce_message_conversions_per_sec	ProduceMessageConversionsPerSec	Monitors the rate of message conversions during production
aws_kafka_produce_message_conversions_time_ms_mean	ProduceMessageConversionsTimeMsMean	Tracks the time taken to convert messages during production
aws_kafka_produce_request_queue_time_ms_mean	ProduceRequestQueueTimeMsMean	Measures the time produce requests spend in the queue
aws_kafka_produce_response_queue_time_ms_mean	ProduceResponseQueueTimeMsMean	Monitors the time produce responses spend in the queue
aws_kafka_produce_response_send_time_ms_mean	ProduceResponseSendTimeMsMean	Tracks the time it takes to send produce responses
aws_kafka_produce_throttle_byte_rate	ProduceThrottleByteRate	Measures the rate at which production is throttled due to byte rate limits
aws_kafka_produce_throttle_queue_size	ProduceThrottleQueueSize	Tracks the size of the production throttle queue
aws_kafka_produce_throttle_time	ProduceThrottleTime	Measures the total time Kafka throttles produce requests
aws_kafka_produce_total_time_ms_mean	ProduceTotalTimeMsMean	Tracks the total time spent on producing messages
aws_kafka_remote_copy_bytes_per_sec	RemoteCopyBytesPerSec	Measures the rate of bytes copied remotely
aws_kafka_remote_copy_errors_per_sec	RemoteCopyErrorsPerSec	Tracks the rate of errors during remote copying
aws_kafka_remote_copy_lag_bytes	RemoteCopyLagBytes	Monitors the lag in bytes during remote copying
aws_kafka_remote_fetch_bytes_per_sec	RemoteFetchBytesPerSec	Tracks the rate of bytes fetched remotely
aws_kafka_remote_fetch_errors_per_sec	RemoteFetchErrorsPerSec	Measures the rate of errors during remote fetching
aws_kafka_remote_fetch_requests_per_sec	RemoteFetchRequestsPerSec	Tracks the number of remote fetch requests per second
aws_kafka_remote_log_manager_tasks_avg_idle_percent	RemoteLogManagerTasksAvgIdlePercent	Monitors the idle percentage of remote log manager tasks
aws_kafka_remote_log_reader_avg_idle_percent	RemoteLogReaderAvgIdlePercent	Tracks the idle percentage of remote log reader tasks
aws_kafka_remote_log_reader_task_queue_size	RemoteLogReaderTaskQueueSize	Measures the size of the remote log reader task queue
aws_kafka_replication_bytes_in_per_sec	ReplicationBytesInPerSec	Tracks the rate of incoming replication bytes
aws_kafka_replication_bytes_out_per_sec	ReplicationBytesOutPerSec	Measures the rate of outgoing replication bytes
aws_kafka_request_bytes_mean	RequestBytesMean	Tracks the average size of Kafka requests
aws_kafka_request_exempt_from_throttle_time	RequestExemptFromThrottleTime	Tracks the time requests are exempt from throttling
aws_kafka_request_handler_avg_idle_percent	RequestHandlerAvgIdlePercent	Measures the idle percentage of request handlers
aws_kafka_request_throttle_queue_size	RequestThrottleQueueSize	Tracks the size of the request throttle queue
aws_kafka_request_throttle_time	RequestThrottleTime	Measures the time requests are throttled in Kafka
aws_kafka_request_time	RequestTime	Monitors the overall time spent handling requests in Kafka
aws_kafka_root_disk_used	RootDiskUsed	Tracks the amount of disk space used by the root partition
aws_kafka_sum_offset_lag	SumOffsetLag	Measures the total offset lag across all partitions
aws_kafka_swap_free	SwapFree	Tracks the amount of free swap memory available on Kafka brokers
aws_kafka_swap_used	SwapUsed	Measures the amount of swap memory used by Kafka brokers
aws_kafka_tcpconnections	TCPConnections	Tracks the total number of TCP connections on the Kafka cluster
aws_kafka_tcp_connections	TcpConnections	Monitors the active TCP connections in the Kafka cluster
aws_kafka_traffic_bytes	TrafficBytes	Measures the total traffic in bytes on Kafka brokers
aws_kafka_traffic_shaping	TrafficShaping	Tracks instances where traffic shaping is applied to Kafka brokers
aws_kafka_under_min_isr_partition_count	UnderMinIsrPartitionCount	Tracks the number of partitions below the minimum in-sync replicas
aws_kafka_under_replicated_partitions	UnderReplicatedPartitions	Measures the number of under-replicated partitions in the Kafka cluster
aws_kafka_volume_queue_length	VolumeQueueLength	Tracks the queue length for disk I/O operations
aws_kafka_volume_read_bytes	VolumeReadBytes	Measures the number of bytes read from disk
aws_kafka_volume_read_ops	VolumeReadOps	Tracks the number of read operations on the disk
aws_kafka_volume_total_read_time	VolumeTotalReadTime	Measures the total time spent on disk read operations
aws_kafka_volume_total_write_time	VolumeTotalWriteTime	Measures the total time spent on disk write operations
aws_kafka_volume_write_bytes	VolumeWriteBytes	Tracks the number of bytes written to disk
aws_kafka_volume_write_ops	VolumeWriteOps	Measures the number of write operations on the disk
aws_kafka_zoo_keeper_request_latency_ms_mean	ZooKeeperRequestLatencyMsMean	Measures the average latency of requests to ZooKeeper
aws_kafka_zoo_keeper_session_state	ZooKeeperSessionState	Tracks the current session state of ZooKeeper
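
A few of the Kafka metrics above are often read as simple health signals: one active controller for the cluster, and no offline, under-replicated, or under-min-ISR partitions. The sketch below encodes those checks over a dictionary of sample values. The thresholds reflect common Kafka operating guidance rather than anything stated on this page, and the function name and sample values are hypothetical.

```python
# Metrics that are expected to stay at zero in a healthy cluster (assumed guidance).
ZERO_EXPECTED = (
    "OfflinePartitionsCount",
    "UnderReplicatedPartitions",
    "UnderMinIsrPartitionCount",
)


def kafka_health_warnings(samples: dict) -> list:
    """Return human-readable warnings for a dict of CloudWatch metric values."""
    warnings = []
    controllers = samples.get("ActiveControllerCount")
    # Only check the controller count when a value was actually supplied.
    if controllers is not None and controllers != 1:
        warnings.append(f"ActiveControllerCount is {controllers}, expected 1")
    for metric in ZERO_EXPECTED:
        if samples.get(metric, 0) > 0:
            warnings.append(f"{metric} is above zero")
    return warnings


# Hypothetical cluster-wide values.
print(kafka_health_warnings({"ActiveControllerCount": 1, "UnderReplicatedPartitions": 3}))
```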

AWS/Kinesis

Function: Managed service for real-time data processing and analytics

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_kinesis_info
aws_kinesis_get_records_bytes	GetRecords.Bytes	Measures the total number of bytes retrieved by the GetRecords call.
aws_kinesis_get_records_iterator_age	GetRecords.IteratorAge	Measures the age of the last record retrieved using the iterator.
aws_kinesis_get_records_iterator_age_milliseconds	GetRecords.IteratorAgeMilliseconds	Measures the age of the iterator in milliseconds for the GetRecords call.
aws_kinesis_get_records_latency	GetRecords.Latency	Tracks the latency of the GetRecords call to retrieve data from a stream.
aws_kinesis_get_records_records	GetRecords.Records	Tracks the total number of records retrieved by the GetRecords call.
aws_kinesis_get_records_success	GetRecords.Success	Measures the success rate of the GetRecords call.
aws_kinesis_incoming_bytes	IncomingBytes	Tracks the number of incoming bytes written to the stream.
aws_kinesis_incoming_records	IncomingRecords	Measures the total number of records being written to the stream.
aws_kinesis_iterator_age_milliseconds	IteratorAgeMilliseconds	Tracks the age of the iterator used in GetRecords, measured in milliseconds.
aws_kinesis_outgoing_bytes	OutgoingBytes	Tracks the total number of outgoing bytes from the stream.
aws_kinesis_outgoing_records	OutgoingRecords	Measures the total number of outgoing records from the stream.
aws_kinesis_put_record_bytes	PutRecord.Bytes	Measures the total number of bytes in the PutRecord call.
aws_kinesis_put_record_latency	PutRecord.Latency	Tracks the latency of PutRecord requests to write data to the stream.
aws_kinesis_put_record_success	PutRecord.Success	Measures the success rate of the PutRecord call.
aws_kinesis_put_records_bytes	PutRecords.Bytes	Measures the total number of bytes written using the PutRecords call.
aws_kinesis_put_records_failed_records	PutRecords.FailedRecords	Tracks the number of failed records in the PutRecords call.
aws_kinesis_put_records_latency	PutRecords.Latency	Measures the latency of PutRecords requests to the stream.
aws_kinesis_put_records_records	PutRecords.Records	Tracks the total number of records written using the PutRecords call.
aws_kinesis_put_records_success	PutRecords.Success	Measures the success rate of the PutRecords call.
aws_kinesis_put_records_successful_records	PutRecords.SuccessfulRecords	Measures the total number of successful records in the PutRecords call.
aws_kinesis_put_records_throttled_records	PutRecords.ThrottledRecords	Tracks the number of throttled records in the PutRecords call due to exceeding throughput limits.
aws_kinesis_put_records_total_records	PutRecords.TotalRecords	Measures the total number of records submitted via PutRecords.
aws_kinesis_read_provisioned_throughput_exceeded	ReadProvisionedThroughputExceeded	Tracks the number of times read requests exceeded the provisioned throughput.
aws_kinesis_subscribe_to_shard_rate_exceeded	SubscribeToShard.RateExceeded	Tracks the number of times the rate for SubscribeToShard exceeded limits.
aws_kinesis_subscribe_to_shard_success	SubscribeToShard.Success	Measures the success rate of SubscribeToShard operations.
aws_kinesis_subscribe_to_shard_event_bytes	SubscribeToShardEvent.Bytes	Tracks the number of bytes received in shard events during SubscribeToShard operations.
aws_kinesis_subscribe_to_shard_event_millis_behind_latest	SubscribeToShardEvent.MillisBehindLatest	Tracks how far behind the latest event the shard event is during SubscribeToShard operations.
aws_kinesis_subscribe_to_shard_event_records	SubscribeToShardEvent.Records	Measures the number of records received in shard events during SubscribeToShard operations.
aws_kinesis_subscribe_to_shard_event_success	SubscribeToShardEvent.Success	Tracks the success rate of SubscribeToShard events.
aws_kinesis_write_provisioned_throughput_exceeded	WriteProvisionedThroughputExceeded	Measures the number of times write operations exceeded the provisioned throughput limits.
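
The iterator age metrics above are reported in milliseconds, which is easier to reason about when compared with the stream's retention period. The sketch below expresses the age as a fraction of retention. It assumes the Kinesis Data Streams default retention of 24 hours unless another value is passed, and the function name is hypothetical.

```python
def iterator_age_fraction(iterator_age_ms: float, retention_hours: float = 24.0) -> float:
    """Return iterator age as a fraction of the stream retention period."""
    retention_ms = retention_hours * 60 * 60 * 1000
    return iterator_age_ms / retention_ms


# A consumer that is six hours behind on a stream with 24-hour retention.
six_hours_ms = 6 * 60 * 60 * 1000
print(iterator_age_fraction(six_hours_ms))
```

A value approaching 1.0 suggests that consumers are at risk of missing records before they expire.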

AWS/KinesisAnalytics

Function: Processes streaming data in real time using SQL

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_kinesisanalytics_bytes	Bytes	Tracks the total amount of data processed by Kinesis Analytics.
aws_kinesisanalytics_input_processing_dropped_records	InputProcessing.DroppedRecords	Measures the number of dropped records during input processing.
aws_kinesisanalytics_input_processing_duration	InputProcessing.Duration	Tracks the duration of input processing.
aws_kinesisanalytics_input_processing_ok_bytes	InputProcessing.OkBytes	Measures the number of bytes successfully processed during input.
aws_kinesisanalytics_input_processing_ok_records	InputProcessing.OkRecords	Tracks the number of records successfully processed during input.
aws_kinesisanalytics_input_processing_processing_failed_records	InputProcessing.ProcessingFailedRecords	Measures the number of records that failed during input processing.
aws_kinesisanalytics_input_processing_success	InputProcessing.Success	Tracks the success rate of input processing operations.
aws_kinesisanalytics_kpus	KPUs	Monitors the number of Kinesis Processing Units (KPUs) used.
aws_kinesisanalytics_lambda_delivery_delivery_failed_records	LambdaDelivery.DeliveryFailedRecords	Measures the number of failed records delivered to AWS Lambda by Kinesis Analytics.
aws_kinesisanalytics_lambda_delivery_duration	LambdaDelivery.Duration	Tracks the duration of record delivery to AWS Lambda.
aws_kinesisanalytics_lambda_delivery_ok_records	LambdaDelivery.OkRecords	Measures the number of records successfully delivered to AWS Lambda.
aws_kinesisanalytics_millis_behind_latest	MillisBehindLatest	Tracks the time Kinesis Analytics is behind the latest record in milliseconds.
aws_kinesisanalytics_records	Records	Measures the total number of records processed by Kinesis Analytics.
aws_kinesisanalytics_success	Success	Tracks the success rate of all Kinesis Analytics operations.
aws_kinesisanalytics_back_pressured_time_ms_per_second	backPressuredTimeMsPerSecond	Measures the amount of time in milliseconds Kinesis Analytics was back-pressured.
aws_kinesisanalytics_busy_time_ms_per_second	busyTimeMsPerSecond	Tracks the time Kinesis Analytics spent in a busy state, processing data.
aws_kinesisanalytics_bytes_requested_per_fetch	bytesRequestedPerFetch	Measures the number of bytes requested in each fetch operation.
aws_kinesisanalytics_bytes_consumed_rate	bytes_consumed_rate	Tracks the rate at which bytes are consumed from the stream.
aws_kinesisanalytics_commits_failed	commitsFailed	Measures the number of failed commit operations.
aws_kinesisanalytics_commits_succeeded	commitsSucceeded	Tracks the number of successful commit operations.
aws_kinesisanalytics_committedoffsets	committedoffsets	Monitors the committed offsets of records processed.
aws_kinesisanalytics_container_cpuutilization	containerCPUUtilization	Tracks the CPU utilization of the Kinesis Analytics container.
aws_kinesisanalytics_container_disk_utilization	containerDiskUtilization	Monitors the disk utilization of the Kinesis Analytics container.
aws_kinesisanalytics_container_memory_utilization	containerMemoryUtilization	Measures the memory utilization of the Kinesis Analytics container.
aws_kinesisanalytics_cpu_utilization	cpuUtilization	Tracks the overall CPU utilization of Kinesis Analytics.
aws_kinesisanalytics_current_input_watermark	currentInputWatermark	Monitors the current watermark for input data.
aws_kinesisanalytics_current_output_watermark	currentOutputWatermark	Tracks the current watermark for output data.
aws_kinesisanalytics_currentoffsets	currentoffsets	Measures the current offsets for processed records.
aws_kinesisanalytics_downtime	downtime	Tracks the total downtime of the Kinesis Analytics application.
aws_kinesisanalytics_full_restarts	fullRestarts	Measures the number of full restarts of the Kinesis Analytics application.
aws_kinesisanalytics_heap_memory_utilization	heapMemoryUtilization	Monitors the heap memory utilization.
aws_kinesisanalytics_idle_time_ms_per_second	idleTimeMsPerSecond	Tracks the idle time of Kinesis Analytics in milliseconds per second.
aws_kinesisanalytics_last_checkpoint_duration	lastCheckpointDuration	Measures the duration of the last checkpoint process.
aws_kinesisanalytics_last_checkpoint_size	lastCheckpointSize	Monitors the size of the last checkpoint.
aws_kinesisanalytics_managed_memory_total	managedMemoryTotal	Tracks the total managed memory available.
aws_kinesisanalytics_managed_memory_used	managedMemoryUsed	Measures the amount of managed memory currently in use.
aws_kinesisanalytics_managed_memory_utilization	managedMemoryUtilization	Tracks the utilization of managed memory.
aws_kinesisanalytics_num_late_records_dropped	numLateRecordsDropped	Measures the number of late records dropped by Kinesis Analytics.
aws_kinesisanalytics_num_records_in	numRecordsIn	Tracks the number of records ingested by Kinesis Analytics.
aws_kinesisanalytics_num_records_in_per_second	numRecordsInPerSecond	Monitors the rate of incoming records per second.
aws_kinesisanalytics_num_records_out	numRecordsOut	Measures the number of records output by Kinesis Analytics.
aws_kinesisanalytics_num_records_out_per_second	numRecordsOutPerSecond	Tracks the rate of outgoing records per second.
aws_kinesisanalytics_number_of_failed_checkpoints	numberOfFailedCheckpoints	Measures the number of failed checkpoints in Kinesis Analytics.
aws_kinesisanalytics_old_generation_gccount	oldGenerationGCCount	Tracks the count of garbage collection events in the old generation heap space.
aws_kinesisanalytics_old_generation_gctime	oldGenerationGCTime	Measures the time spent in garbage collection for the old generation heap.
aws_kinesisanalytics_records_lag_max	records_lag_max	Tracks the maximum lag of records being processed by Kinesis Analytics.
aws_kinesisanalytics_thread_count	threadCount	Monitors the number of active threads in the Kinesis Analytics application.
aws_kinesisanalytics_uptime	uptime	Measures the uptime of the Kinesis Analytics application.
aws_kinesisanalytics_zeppelin_cpu_utilization	zeppelinCpuUtilization	Tracks the CPU utilization of the Zeppelin server used by Kinesis Analytics.
aws_kinesisanalytics_zeppelin_heap_memory_utilization	zeppelinHeapMemoryUtilization	Monitors the heap memory utilization of the Zeppelin server.
aws_kinesisanalytics_zeppelin_server_uptime	zeppelinServerUptime	Tracks the uptime of the Zeppelin server.
aws_kinesisanalytics_zeppelin_thread_count	zeppelinThreadCount	Monitors the number of active threads in the Zeppelin server.
aws_kinesisanalytics_zeppelin_waiting_jobs	zeppelinWaitingJobs	Measures the number of jobs waiting to be processed by the Zeppelin server.

AWS/Lambda

Function: Serverless compute service that runs code in response to events

Scrape interval: 5 minutes

Includes: Out-of-the-box dashboard

Metric	Cloudwatch metric	Purpose
aws_lambda_info
aws_lambda_invocations	Invocations	Tracks the number of times your AWS Lambda function is invoked.
aws_lambda_errors	Errors	Monitors the number of invocations that result in an error.
aws_lambda_throttles	Throttles	Measures the number of times your Lambda function is throttled due to exceeding the concurrency limit.
aws_lambda_duration	Duration	Tracks the amount of time a Lambda function takes to execute.
aws_lambda_async_event_age	AsyncEventAge	Measures the age of an asynchronous event when Lambda begins executing the associated function.
aws_lambda_async_events_dropped	AsyncEventsDropped	Monitors the number of asynchronous events dropped due to Lambda service errors or throttling.
aws_lambda_async_events_received	AsyncEventsReceived	Tracks the number of asynchronous events received by the Lambda function.
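aws_lambda_claimed_account_concurrency	ClaimedAccountConcurrency	Monitors the amount of concurrency in a Region that is unavailable for on-demand invocations: unreserved concurrent executions plus allocated reserved and provisioned concurrency.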
aws_lambda_concurrent_executions	ConcurrentExecutions	Tracks the number of concurrent executions across all Lambda functions in your account.
aws_lambda_dead_letter_errors	DeadLetterErrors	Measures the number of failed invocations that couldn’t be sent to the Dead Letter Queue.
aws_lambda_destination_delivery_failures	DestinationDeliveryFailures	Tracks the number of failures when delivering function results to a destination service.
aws_lambda_iterator_age	IteratorAge	Measures the age of the last record in the event source before Lambda starts processing.
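aws_lambda_offset_lag	OffsetLag	Tracks the offset lag for self-managed Apache Kafka and Amazon MSK event sources: the difference between the last record written to a topic and the last record processed by the function's consumer group.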
aws_lambda_oversized_record_count	OversizedRecordCount	Measures the number of records that exceeded the maximum size supported by Lambda.
aws_lambda_post_runtime_extensions_duration	PostRuntimeExtensionsDuration	Tracks the time taken by post-runtime extensions after Lambda function execution.
aws_lambda_provisioned_concurrency_invocations	ProvisionedConcurrencyInvocations	Measures the number of invocations served by functions with provisioned concurrency.
aws_lambda_provisioned_concurrency_spillover_invocations	ProvisionedConcurrencySpilloverInvocations	Tracks the number of invocations that were served by standard concurrency when provisioned concurrency was exhausted.
aws_lambda_provisioned_concurrency_utilization	ProvisionedConcurrencyUtilization	Measures the percentage of provisioned concurrency that is being used by your Lambda function.
aws_lambda_provisioned_concurrent_executions	ProvisionedConcurrentExecutions	Tracks the number of concurrent executions using provisioned concurrency.
aws_lambda_recursive_invocations_dropped	RecursiveInvocationsDropped	Measures the number of recursive invocations that were dropped.
aws_lambda_unreserved_concurrent_executions	UnreservedConcurrentExecutions	Tracks the number of concurrent executions by functions that do not have reserved concurrency.
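
The invocation, error, and throttle counters above are most useful when combined into rates. The following Python sketch shows one way to do that arithmetic. The sample values are hypothetical and the dictionary keys only reuse the metric names from the table; it is not a Grafana or CloudWatch API. It assumes that throttled requests are not counted in Invocations, so the throttle share is computed against the sum of both counters.

```python
def ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    if denominator == 0:
        return 0.0
    return numerator / denominator


# Hypothetical Sum values for a single 5-minute scrape window.
sample = {
    "aws_lambda_invocations": 1200.0,
    "aws_lambda_errors": 18.0,
    "aws_lambda_throttles": 6.0,
}

# Share of invocations that ended in a function error.
error_rate = ratio(sample["aws_lambda_errors"], sample["aws_lambda_invocations"])

# Share of all requests (invoked + throttled) that were throttled.
total_requests = sample["aws_lambda_invocations"] + sample["aws_lambda_throttles"]
throttle_rate = ratio(sample["aws_lambda_throttles"], total_requests)

print(f"error rate: {error_rate:.2%}")
print(f"throttle rate: {throttle_rate:.2%}")
```

The zero guard matters for idle functions, where a window with no invocations would otherwise raise a division error.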

AWS/Logs

Function: Centralized logging service for monitoring and troubleshooting applications

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_logs_info
aws_logs_delivery_errors	DeliveryErrors	Tracks the number of log events for which CloudWatch Logs received an error when forwarding data to the subscription destination.
aws_logs_delivery_throttling	DeliveryThrottling	Measures the number of log events for which CloudWatch Logs was throttled when forwarding data to the subscription destination.
aws_logs_forwarded_bytes	ForwardedBytes	Monitors the total volume of log data in bytes that was forwarded to the subscription destination.
aws_logs_forwarded_log_events	ForwardedLogEvents	Tracks the number of log events forwarded to the subscription destination.
aws_logs_incoming_bytes	IncomingBytes	Measures the total volume of incoming log data in bytes received by CloudWatch Logs.
aws_logs_incoming_log_events	IncomingLogEvents	Tracks the number of log events received by CloudWatch Logs.

AWS/MWAA

Function: Managed service for Apache Airflow to manage workflows and orchestration

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_mwaa_active_connection_count	ActiveConnectionCount	Tracks the number of active connections to the Managed Workflows for Apache Airflow (MWAA) environment.
aws_mwaa_approximate_age_of_oldest_task	ApproximateAgeOfOldestTask	Measures the approximate age of the oldest task waiting in the queue in the MWAA environment.
aws_mwaa_cpuutilization	CPUUtilization	Monitors the percentage of CPU utilization in the MWAA environment.
aws_mwaa_database_connections	DatabaseConnections	Tracks the number of connections to the database used by MWAA.
aws_mwaa_disk_queue_depth	DiskQueueDepth	Measures the depth of the disk queue, indicating the number of IO operations waiting to be processed.
aws_mwaa_freeable_memory	FreeableMemory	Monitors the amount of free memory available in the MWAA environment.
aws_mwaa_memory_utilization	MemoryUtilization	Tracks the percentage of memory utilized in the MWAA environment.
aws_mwaa_queued_tasks	QueuedTasks	Measures the number of tasks waiting to be executed in the MWAA environment.
aws_mwaa_running_tasks	RunningTasks	Tracks the number of tasks currently running in the MWAA environment.
aws_mwaa_volume_write_iops	VolumeWriteIOPS	Monitors the input/output operations per second (IOPS) for write operations on the volume.
aws_mwaa_write_iops	WriteIOPS	Tracks the number of write operations per second in the MWAA environment.
aws_mwaa_write_latency	WriteLatency	Measures the latency of write operations in the MWAA environment.
aws_mwaa_write_throughput	WriteThroughput	Monitors the amount of data written per second in the MWAA environment.

AWS/MediaConnect

Function: Secure and reliable transport of live video streams

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_mediaconnect_info
aws_mediaconnect_arqrecovered	ARQRecovered	Monitors the number of Automatic Repeat reQuest (ARQ) packets successfully recovered in the MediaConnect flow.
aws_mediaconnect_arqrequests	ARQRequests	Tracks the number of ARQ requests made by MediaConnect flows.
aws_mediaconnect_bit_rate	BitRate	Measures the bitrate of the MediaConnect stream.
aws_mediaconnect_caterror	CATError	Detects Conditional Access Table (CAT) errors in the MediaConnect stream.
aws_mediaconnect_crcerror	CRCError	Tracks the number of cyclic redundancy check (CRC) errors in the stream.
aws_mediaconnect_connected	Connected	Monitors the connection status of the MediaConnect flow.
aws_mediaconnect_connected_outputs	ConnectedOutputs	Tracks the number of outputs connected to the MediaConnect flow.
aws_mediaconnect_connection_attempts	ConnectionAttempts	Measures the number of attempts made to establish a connection for the flow.
aws_mediaconnect_consecutive_drops	ConsecutiveDrops	Monitors the number of consecutive dropped packets in the MediaConnect flow.
aws_mediaconnect_consecutive_not_recovered	ConsecutiveNotRecovered	Tracks the number of consecutive packets that were not successfully recovered.
aws_mediaconnect_continuity_counter	ContinuityCounter	Monitors the continuity counter of the stream to detect missing packets.
aws_mediaconnect_disconnections	Disconnections	Tracks the number of times the MediaConnect flow was disconnected.
aws_mediaconnect_dropped_packets	DroppedPackets	Monitors the number of packets dropped in the MediaConnect flow.
aws_mediaconnect_egress_bridge_bit_rate	EgressBridgeBitRate	Tracks the bitrate for egress bridge flows.
aws_mediaconnect_egress_bridge_caterror	EgressBridgeCATError	Detects CAT errors in egress bridge flows.
aws_mediaconnect_egress_bridge_crcerror	EgressBridgeCRCError	Monitors the CRC errors in egress bridge flows.
aws_mediaconnect_egress_bridge_continuity_counter	EgressBridgeContinuityCounter	Measures the continuity of the egress bridge stream to detect missing packets.
aws_mediaconnect_egress_bridge_dropped_packets	EgressBridgeDroppedPackets	Tracks the number of packets dropped in the egress bridge flows.
aws_mediaconnect_egress_bridge_failover_switches	EgressBridgeFailoverSwitches	Monitors failover switches in the egress bridge flows.
aws_mediaconnect_egress_bridge_merge_active	EgressBridgeMergeActive	Indicates if an egress bridge merge is active.
aws_mediaconnect_egress_bridge_not_recovered_packets	EgressBridgeNotRecoveredPackets	Tracks the number of packets that were not recovered in the egress bridge.
aws_mediaconnect_egress_bridge_paterror	EgressBridgePATError	Detects Program Association Table (PAT) errors in the egress bridge.
aws_mediaconnect_egress_bridge_pcraccuracy_error	EgressBridgePCRAccuracyError	Monitors errors related to the accuracy of Program Clock Reference (PCR) in the egress bridge.
aws_mediaconnect_egress_bridge_pcrerror	EgressBridgePCRError	Tracks PCR errors in the egress bridge.
aws_mediaconnect_egress_bridge_piderror	EgressBridgePIDError	Monitors Packet Identifier (PID) errors in the egress bridge stream.
aws_mediaconnect_egress_bridge_pmterror	EgressBridgePMTError	Detects errors in the Program Map Table (PMT) in the egress bridge.
aws_mediaconnect_egress_bridge_ptserror	EgressBridgePTSError	Tracks Presentation Time Stamp (PTS) errors in the egress bridge stream.
aws_mediaconnect_egress_bridge_packet_loss_percent	EgressBridgePacketLossPercent	Measures the percentage of packet loss in the egress bridge.
aws_mediaconnect_egress_bridge_recovered_packets	EgressBridgeRecoveredPackets	Tracks the number of recovered packets in the egress bridge stream.
aws_mediaconnect_egress_bridge_source_bit_rate	EgressBridgeSourceBitRate	Monitors the bitrate of the source in the egress bridge.
aws_mediaconnect_egress_bridge_source_caterror	EgressBridgeSourceCATError	Detects CAT errors in the source of the egress bridge.
aws_mediaconnect_egress_bridge_source_crcerror	EgressBridgeSourceCRCError	Tracks CRC errors in the source of the egress bridge.
aws_mediaconnect_egress_bridge_source_continuity_counter	EgressBridgeSourceContinuityCounter	Measures the continuity of the source stream in the egress bridge to detect missing packets.
aws_mediaconnect_egress_bridge_source_dropped_packets	EgressBridgeSourceDroppedPackets	Monitors the number of dropped packets in the source stream of the egress bridge.
aws_mediaconnect_egress_bridge_source_merge_active	EgressBridgeSourceMergeActive	Indicates if the source merge is active in the egress bridge.
aws_mediaconnect_egress_bridge_source_merge_latency	EgressBridgeSourceMergeLatency	Measures latency during source merge in the egress bridge.
aws_mediaconnect_egress_bridge_source_not_recovered_packets	EgressBridgeSourceNotRecoveredPackets	Tracks the number of packets not recovered in the source of the egress bridge.
aws_mediaconnect_egress_bridge_source_paterror	EgressBridgeSourcePATError	Detects PAT errors in the source of the egress bridge.
aws_mediaconnect_egress_bridge_source_pcraccuracy_error	EgressBridgeSourcePCRAccuracyError	Monitors errors in the accuracy of the PCR in the source of the egress bridge.
aws_mediaconnect_egress_bridge_source_pcrerror	EgressBridgeSourcePCRError	Tracks PCR errors in the source stream of the egress bridge.
aws_mediaconnect_egress_bridge_source_piderror	EgressBridgeSourcePIDError
aws_mediaconnect_egress_bridge_source_pmterror	EgressBridgeSourcePMTError
aws_mediaconnect_egress_bridge_source_ptserror	EgressBridgeSourcePTSError
aws_mediaconnect_egress_bridge_source_packet_loss_percent	EgressBridgeSourcePacketLossPercent
aws_mediaconnect_egress_bridge_source_recovered_packets	EgressBridgeSourceRecoveredPackets
aws_mediaconnect_egress_bridge_source_tsbyte_error	EgressBridgeSourceTSByteError
aws_mediaconnect_egress_bridge_source_tssync_loss	EgressBridgeSourceTSSyncLoss
aws_mediaconnect_egress_bridge_source_total_packets	EgressBridgeSourceTotalPackets
aws_mediaconnect_egress_bridge_source_transport_error	EgressBridgeSourceTransportError
aws_mediaconnect_egress_bridge_tsbyte_error	EgressBridgeTSByteError
aws_mediaconnect_egress_bridge_tssync_loss	EgressBridgeTSSyncLoss
aws_mediaconnect_egress_bridge_total_packets	EgressBridgeTotalPackets
aws_mediaconnect_egress_bridge_transport_error	EgressBridgeTransportError
aws_mediaconnect_failover_switches	FailoverSwitches
aws_mediaconnect_ingress_bridge_bit_rate	IngressBridgeBitRate
aws_mediaconnect_ingress_bridge_caterror	IngressBridgeCATError
aws_mediaconnect_ingress_bridge_crcerror	IngressBridgeCRCError
aws_mediaconnect_ingress_bridge_continuity_counter	IngressBridgeContinuityCounter
aws_mediaconnect_ingress_bridge_dropped_packets	IngressBridgeDroppedPackets
aws_mediaconnect_ingress_bridge_failover_switches	IngressBridgeFailoverSwitches
aws_mediaconnect_ingress_bridge_merge_active	IngressBridgeMergeActive
aws_mediaconnect_ingress_bridge_not_recovered_packets	IngressBridgeNotRecoveredPackets
aws_mediaconnect_ingress_bridge_paterror	IngressBridgePATError
aws_mediaconnect_ingress_bridge_pcraccuracy_error	IngressBridgePCRAccuracyError
aws_mediaconnect_ingress_bridge_pcrerror	IngressBridgePCRError
aws_mediaconnect_ingress_bridge_piderror	IngressBridgePIDError
aws_mediaconnect_ingress_bridge_pmterror	IngressBridgePMTError
aws_mediaconnect_ingress_bridge_ptserror	IngressBridgePTSError
aws_mediaconnect_ingress_bridge_packet_loss_percent	IngressBridgePacketLossPercent
aws_mediaconnect_ingress_bridge_recovered_packets	IngressBridgeRecoveredPackets
aws_mediaconnect_ingress_bridge_source_arqrecovered	IngressBridgeSourceARQRecovered
aws_mediaconnect_ingress_bridge_source_arqrequests	IngressBridgeSourceARQRequests
aws_mediaconnect_ingress_bridge_source_bit_rate	IngressBridgeSourceBitRate
aws_mediaconnect_ingress_bridge_source_caterror	IngressBridgeSourceCATError
aws_mediaconnect_ingress_bridge_source_crcerror	IngressBridgeSourceCRCError
aws_mediaconnect_ingress_bridge_source_continuity_counter	IngressBridgeSourceContinuityCounter
aws_mediaconnect_ingress_bridge_source_dropped_packets	IngressBridgeSourceDroppedPackets
aws_mediaconnect_ingress_bridge_source_fecpackets	IngressBridgeSourceFECPackets
aws_mediaconnect_ingress_bridge_source_fecrecovered	IngressBridgeSourceFECRecovered
aws_mediaconnect_ingress_bridge_source_merge_active	IngressBridgeSourceMergeActive
aws_mediaconnect_ingress_bridge_source_merge_latency	IngressBridgeSourceMergeLatency
aws_mediaconnect_ingress_bridge_source_not_recovered_packets	IngressBridgeSourceNotRecoveredPackets
aws_mediaconnect_ingress_bridge_source_overflow_packets	IngressBridgeSourceOverflowPackets
aws_mediaconnect_ingress_bridge_source_paterror	IngressBridgeSourcePATError
aws_mediaconnect_ingress_bridge_source_pcraccuracy_error	IngressBridgeSourcePCRAccuracyError
aws_mediaconnect_ingress_bridge_source_pcrerror	IngressBridgeSourcePCRError
aws_mediaconnect_ingress_bridge_source_piderror	IngressBridgeSourcePIDError
aws_mediaconnect_ingress_bridge_source_pmterror	IngressBridgeSourcePMTError
aws_mediaconnect_ingress_bridge_source_ptserror	IngressBridgeSourcePTSError
aws_mediaconnect_ingress_bridge_source_packet_loss_percent	IngressBridgeSourcePacketLossPercent
aws_mediaconnect_ingress_bridge_source_recovered_packets	IngressBridgeSourceRecoveredPackets
aws_mediaconnect_ingress_bridge_source_round_trip_time	IngressBridgeSourceRoundTripTime
aws_mediaconnect_ingress_bridge_source_tsbyte_error	IngressBridgeSourceTSByteError
aws_mediaconnect_ingress_bridge_source_tssync_loss	IngressBridgeSourceTSSyncLoss
aws_mediaconnect_ingress_bridge_source_total_packets	IngressBridgeSourceTotalPackets
aws_mediaconnect_ingress_bridge_source_transport_error	IngressBridgeSourceTransportError
aws_mediaconnect_ingress_bridge_tsbyte_error	IngressBridgeTSByteError
aws_mediaconnect_ingress_bridge_tssync_loss	IngressBridgeTSSyncLoss
aws_mediaconnect_ingress_bridge_total_packets	IngressBridgeTotalPackets
aws_mediaconnect_ingress_bridge_transport_error	IngressBridgeTransportError
aws_mediaconnect_jitter	Jitter
aws_mediaconnect_latency	Latency
aws_mediaconnect_maintenance_canceled	MaintenanceCanceled
aws_mediaconnect_maintenance_failed	MaintenanceFailed
aws_mediaconnect_maintenance_rescheduled	MaintenanceRescheduled
aws_mediaconnect_maintenance_scheduled	MaintenanceScheduled
aws_mediaconnect_maintenance_started	MaintenanceStarted
aws_mediaconnect_maintenance_succeeded	MaintenanceSucceeded
aws_mediaconnect_merge_active	MergeActive
aws_mediaconnect_merge_latency	MergeLatency
aws_mediaconnect_not_recovered_packets	NotRecoveredPackets
aws_mediaconnect_output_connected	OutputConnected
aws_mediaconnect_output_disconnections	OutputDisconnections
aws_mediaconnect_output_dropped_payloads	OutputDroppedPayloads
aws_mediaconnect_output_late_payloads	OutputLatePayloads
aws_mediaconnect_output_total_bytes	OutputTotalBytes
aws_mediaconnect_output_total_payloads	OutputTotalPayloads
aws_mediaconnect_overflow_packets	OverflowPackets
aws_mediaconnect_paterror	PATError
aws_mediaconnect_pcraccuracy_error	PCRAccuracyError
aws_mediaconnect_pcrerror	PCRError
aws_mediaconnect_piderror	PIDError
aws_mediaconnect_pmterror	PMTError
aws_mediaconnect_ptserror	PTSError
aws_mediaconnect_packet_loss_percent	PacketLossPercent
aws_mediaconnect_recovered_packets	RecoveredPackets
aws_mediaconnect_round_trip_time	RoundTripTime
aws_mediaconnect_source_arqrecovered	SourceARQRecovered
aws_mediaconnect_source_arqrequests	SourceARQRequests
aws_mediaconnect_source_bit_rate	SourceBitRate
aws_mediaconnect_source_caterror	SourceCATError
aws_mediaconnect_source_crcerror	SourceCRCError
aws_mediaconnect_source_connected	SourceConnected
aws_mediaconnect_source_continuity_counter	SourceContinuityCounter
aws_mediaconnect_source_disconnections	SourceDisconnections
aws_mediaconnect_source_dropped_packets	SourceDroppedPackets
aws_mediaconnect_source_dropped_payloads	SourceDroppedPayloads
aws_mediaconnect_source_fecpackets	SourceFECPackets
aws_mediaconnect_source_fecrecovered	SourceFECRecovered
aws_mediaconnect_source_late_payloads	SourceLatePayloads
aws_mediaconnect_source_merge_active	SourceMergeActive
aws_mediaconnect_source_merge_latency	SourceMergeLatency
aws_mediaconnect_source_merge_status_warn_mismatch	SourceMergeStatusWarnMismatch
aws_mediaconnect_source_merge_status_warn_solo	SourceMergeStatusWarnSolo
aws_mediaconnect_source_missing_packets	SourceMissingPackets
aws_mediaconnect_source_not_recovered_packets	SourceNotRecoveredPackets
aws_mediaconnect_source_overflow_packets	SourceOverflowPackets
aws_mediaconnect_source_paterror	SourcePATError
aws_mediaconnect_source_pcraccuracy_error	SourcePCRAccuracyError
aws_mediaconnect_source_pcrerror	SourcePCRError
aws_mediaconnect_source_piderror	SourcePIDError
aws_mediaconnect_source_pmterror	SourcePMTError
aws_mediaconnect_source_ptserror	SourcePTSError
aws_mediaconnect_source_packet_loss_percent	SourcePacketLossPercent
aws_mediaconnect_source_recovered_packets	SourceRecoveredPackets
aws_mediaconnect_source_round_trip_time	SourceRoundTripTime
aws_mediaconnect_source_selected	SourceSelected
aws_mediaconnect_source_tsbyte_error	SourceTSByteError
aws_mediaconnect_source_tssync_loss	SourceTSSyncLoss
aws_mediaconnect_source_total_bytes	SourceTotalBytes
aws_mediaconnect_source_total_packets	SourceTotalPackets
aws_mediaconnect_source_total_payloads	SourceTotalPayloads
aws_mediaconnect_source_transport_error	SourceTransportError
aws_mediaconnect_tsbyte_error	TSByteError
aws_mediaconnect_tssync_loss	TSSyncLoss
aws_mediaconnect_total_packets	TotalPackets
aws_mediaconnect_transport_error	TransportError
aws_mediaconnect_uptime	Uptime

AWS/MediaTailor

Function: Personalizes advertisement insertion in video streams for a seamless experience

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_mediatailor_info
aws_mediatailor_ad_decision_server_ads	AdDecisionServer.Ads	Tracks the number of ads provided by the Ad Decision Server (ADS).
aws_mediatailor_ad_decision_server_duration	AdDecisionServer.Duration	Measures the duration of requests made to the Ad Decision Server.
aws_mediatailor_ad_decision_server_errors	AdDecisionServer.Errors	Monitors the number of errors returned by the Ad Decision Server.
aws_mediatailor_ad_decision_server_fill_rate	AdDecisionServer.FillRate	Tracks the rate at which ad slots are successfully filled by the Ad Decision Server.
aws_mediatailor_ad_decision_server_timeouts	AdDecisionServer.Timeouts	Tracks the number of timeouts during requests to the Ad Decision Server.
aws_mediatailor_ad_not_ready	AdNotReady	Indicates the number of instances where ads were not ready to be served.
aws_mediatailor_avails_duration	Avails.Duration	Measures the duration of available ad opportunities (avails).
aws_mediatailor_avails_fill_rate	Avails.FillRate	Tracks the rate at which avails are filled with ads.
aws_mediatailor_avails_filled_duration	Avails.FilledDuration	Measures the total filled duration of ad avails.
aws_mediatailor_get_manifest_errors	GetManifest.Errors	Monitors the number of errors encountered while retrieving the manifest.
aws_mediatailor_origin_errors	Origin.Errors	Tracks the number of errors originating from the content origin server.
aws_mediatailor_origin_timeouts	Origin.Timeouts	Monitors the number of timeouts from requests to the content origin server.
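
The Metric and Cloudwatch metric columns in these tables appear to follow a consistent naming pattern: the namespace is lowercased with `/` replaced by `_`, and the CloudWatch name is split wherever a lowercase letter or digit is followed by an uppercase letter, with dots turned into underscores. The Python sketch below reproduces that pattern. It is inferred from the rows on this page rather than taken from a documented API, and the function name `to_metric_name` is hypothetical.

```python
import re

# A lowercase letter or digit followed by an uppercase letter marks a word boundary.
_BOUNDARY = re.compile(r"([a-z0-9])([A-Z])")


def to_metric_name(namespace: str, cloudwatch_metric: str) -> str:
    """Derive the snake_case metric name from a namespace and CloudWatch metric name.

    Assumption: this mirrors the pattern seen in the tables; it is not an
    official conversion routine.
    """
    prefix = namespace.replace("/", "_").lower()
    snake = _BOUNDARY.sub(r"\1_\2", cloudwatch_metric)
    snake = snake.replace(".", "_").lower()
    return f"{prefix}_{snake}"


print(to_metric_name("AWS/MediaTailor", "AdDecisionServer.FillRate"))
print(to_metric_name("AWS/NetworkELB", "ClientTLSNegotiationErrorCount"))
print(to_metric_name("AWS/Neptune", "VolumeReadIOPs"))
```

Runs of capitals such as `TLS` or `CPU` are not split from the word that follows, which is why names like `cpuutilization` and `tlsnegotiation` have no inner underscore.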

AWS/NATGateway

Function: Manages network address translation to securely connect instances to the internet

Scrape interval: 5 minutes

Includes: Out-of-the-box dashboard

Metric	Cloudwatch metric	Purpose
aws_natgateway_info
aws_natgateway_active_connection_count	ActiveConnectionCount	Tracks the number of active connections to the NAT Gateway.
aws_natgateway_bytes_in_from_destination	BytesInFromDestination	Measures the amount of data received by the NAT Gateway from the destination (in bytes).
aws_natgateway_bytes_in_from_source	BytesInFromSource	Measures the amount of data received by the NAT Gateway from the source (in bytes).
aws_natgateway_bytes_out_to_destination	BytesOutToDestination	Tracks the data sent from the NAT Gateway to the destination (in bytes).
aws_natgateway_bytes_out_to_source	BytesOutToSource	Measures the data sent from the NAT Gateway to the source (in bytes).
aws_natgateway_connection_attempt_count	ConnectionAttemptCount	Counts the number of attempts to establish a connection via the NAT Gateway.
aws_natgateway_connection_established_count	ConnectionEstablishedCount	Measures the successful establishment of connections through the NAT Gateway.
aws_natgateway_error_port_allocation	ErrorPortAllocation	Tracks errors related to port allocation failures in the NAT Gateway.
aws_natgateway_idle_timeout_count	IdleTimeoutCount	Counts the number of times connections are closed due to idle timeouts on the NAT Gateway.
aws_natgateway_packets_drop_count	PacketsDropCount	Measures the number of packets dropped by the NAT Gateway.
aws_natgateway_packets_in_from_destination	PacketsInFromDestination	Tracks the number of packets received by the NAT Gateway from the destination.
aws_natgateway_packets_in_from_source	PacketsInFromSource	Measures the number of packets received by the NAT Gateway from the source.
aws_natgateway_packets_out_to_destination	PacketsOutToDestination	Tracks the number of packets sent from the NAT Gateway to the destination.
aws_natgateway_packets_out_to_source	PacketsOutToSource	Measures the number of packets sent from the NAT Gateway to the source.
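
Comparing the connection attempt and connection established counters is a common way to spot connections that never received a response. The Python sketch below illustrates the arithmetic with hypothetical Sum values for one scrape window; the function names are illustrative only and are not part of any AWS or Grafana API.

```python
def unanswered_attempts(attempts: float, established: float) -> float:
    """Connection attempts that did not result in an established connection."""
    return max(attempts - established, 0.0)


def establishment_ratio(attempts: float, established: float) -> float:
    """Fraction of attempts that were established; 1.0 when there were no attempts."""
    if attempts == 0:
        return 1.0
    return established / attempts


# Hypothetical values for aws_natgateway_connection_attempt_count and
# aws_natgateway_connection_established_count over the same window.
attempts = 5000.0
established = 4900.0

print(f"unanswered attempts: {unanswered_attempts(attempts, established):.0f}")
print(f"establishment ratio: {establishment_ratio(attempts, established):.2%}")
```

A ratio that stays well below 1.0 alongside a rising aws_natgateway_error_port_allocation value would suggest port exhaustion rather than unresponsive destinations.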

AWS/Neptune

Function: Managed graph database service for building and running graph applications

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_neptune_info
aws_neptune_cpuutilization	CPUUtilization	Monitors the percentage of CPU resources used by the Neptune database instance.
aws_neptune_cluster_replica_lag	ClusterReplicaLag	Measures the replication lag between the Neptune writer and reader nodes in milliseconds.
aws_neptune_cluster_replica_lag_maximum	ClusterReplicaLagMaximum	Tracks the maximum replica lag during the monitored period.
aws_neptune_cluster_replica_lag_minimum	ClusterReplicaLagMinimum	Tracks the minimum replica lag during the monitored period.
aws_neptune_engine_uptime	EngineUptime	Monitors the total uptime of the Neptune engine instance.
aws_neptune_free_local_storage	FreeLocalStorage	Monitors the amount of local storage available on the Neptune instance.
aws_neptune_freeable_memory	FreeableMemory	Tracks the amount of available memory on the Neptune instance.
aws_neptune_gremlin_errors	GremlinErrors	Counts the errors encountered in Gremlin queries.
aws_neptune_gremlin_http1xx	GremlinHttp1xx	Tracks HTTP 1xx responses for Gremlin queries.
aws_neptune_gremlin_http2xx	GremlinHttp2xx	Tracks HTTP 2xx (successful) responses for Gremlin queries.
aws_neptune_gremlin_http4xx	GremlinHttp4xx	Monitors HTTP 4xx (client error) responses for Gremlin queries.
aws_neptune_gremlin_http5xx	GremlinHttp5xx	Tracks HTTP 5xx (server error) responses for Gremlin queries.
aws_neptune_gremlin_requests	GremlinRequests	Monitors the total number of Gremlin requests made.
aws_neptune_gremlin_requests_per_sec	GremlinRequestsPerSec	Measures the rate of Gremlin requests per second.
aws_neptune_gremlin_web_socket_available_connections	GremlinWebSocketAvailableConnections	Tracks available WebSocket connections for Gremlin.
aws_neptune_gremlin_web_socket_client_errors	GremlinWebSocketClientErrors	Monitors WebSocket client errors for Gremlin.
aws_neptune_gremlin_web_socket_server_errors	GremlinWebSocketServerErrors	Monitors WebSocket server errors for Gremlin.
aws_neptune_gremlin_web_socket_success	GremlinWebSocketSuccess	Counts successful WebSocket connections for Gremlin.
aws_neptune_http100	Http100	Monitors HTTP 100 responses from the Neptune instance.
aws_neptune_http101	Http101	Tracks HTTP 101 responses (Switching Protocols).
aws_neptune_http1xx	Http1xx	Tracks all HTTP 1xx responses for requests made to the Neptune instance.
aws_neptune_http200	Http200	Tracks HTTP 200 (OK) responses.
aws_neptune_http2xx	Http2xx	Monitors all HTTP 2xx responses (successful requests).
aws_neptune_http400	Http400	Tracks HTTP 400 (bad request) responses.
aws_neptune_http403	Http403	Monitors HTTP 403 (forbidden) responses.
aws_neptune_http405	Http405	Tracks HTTP 405 (method not allowed) responses.
aws_neptune_http413	Http413	Tracks HTTP 413 (request entity too large) responses.
aws_neptune_http429	Http429	Monitors HTTP 429 (too many requests) responses.
aws_neptune_http4xx	Http4xx	Tracks all HTTP 4xx (client error) responses.
aws_neptune_http500	Http500	Monitors HTTP 500 (internal server error) responses.
aws_neptune_http501	Http501	Tracks HTTP 501 (not implemented) responses.
aws_neptune_http5xx	Http5xx	Monitors all HTTP 5xx (server error) responses.
aws_neptune_loader_errors	LoaderErrors	Counts errors encountered during bulk loader operations.
aws_neptune_loader_requests	LoaderRequests	Tracks requests made to the bulk loader.
aws_neptune_network_receive_throughput	NetworkReceiveThroughput	Monitors the network throughput for data received by the Neptune instance.
aws_neptune_network_throughput	NetworkThroughput	Measures the total network throughput (incoming and outgoing) of the Neptune instance.
aws_neptune_network_transmit_throughput	NetworkTransmitThroughput	Tracks the network throughput for data transmitted by the Neptune instance.
aws_neptune_sparql_errors	SparqlErrors	Monitors errors encountered in SPARQL queries.
aws_neptune_sparql_http1xx	SparqlHttp1xx	Tracks HTTP 1xx responses for SPARQL queries.
aws_neptune_sparql_http2xx	SparqlHttp2xx	Tracks HTTP 2xx responses for SPARQL queries.
aws_neptune_sparql_http4xx	SparqlHttp4xx	Monitors HTTP 4xx responses for SPARQL queries.
aws_neptune_sparql_http5xx	SparqlHttp5xx	Tracks HTTP 5xx responses for SPARQL queries.
aws_neptune_sparql_requests	SparqlRequests	Measures the number of SPARQL requests made to the Neptune instance.
aws_neptune_sparql_requests_per_sec	SparqlRequestsPerSec	Tracks the rate of SPARQL requests per second.
aws_neptune_status_errors	StatusErrors	Monitors the number of status errors reported by the Neptune instance.
aws_neptune_status_requests	StatusRequests	Tracks the number of status requests made to the Neptune instance.
aws_neptune_volume_bytes_used	VolumeBytesUsed	Measures the amount of storage used by the Neptune instance.
aws_neptune_volume_read_iops	VolumeReadIOPs	Monitors the read input/output operations per second on the Neptune instance’s volume.
aws_neptune_volume_write_iops	VolumeWriteIOPs	Tracks the write input/output operations per second on the Neptune instance’s volume.

AWS/NetworkELB

Function: Provides highly scalable and fault-tolerant network load balancing for traffic distribution

Scrape interval: 5 minutes

Includes: Out-of-the-box dashboard

Metric	Cloudwatch metric	Purpose
aws_networkelb_info
aws_networkelb_active_flow_count	ActiveFlowCount	Monitors the total number of active flow connections through the Network Load Balancer.
aws_networkelb_active_flow_count_tls	ActiveFlowCount_TLS	Tracks the number of active flow connections through the Network Load Balancer that are using TLS.
aws_networkelb_client_tlsnegotiation_error_count	ClientTLSNegotiationErrorCount	Monitors the number of client TLS negotiation errors, indicating issues with SSL/TLS handshakes.
aws_networkelb_consumed_lcus	ConsumedLCUs	Measures Load Balancer Capacity Units (LCUs) consumed by the Network Load Balancer.
aws_networkelb_healthy_host_count	HealthyHostCount	Tracks the number of healthy targets available to receive traffic.
aws_networkelb_new_flow_count	NewFlowCount	Measures the number of new flow connections established with the Network Load Balancer.
aws_networkelb_new_flow_count_tls	NewFlowCount_TLS	Tracks the number of new flow connections using TLS.
aws_networkelb_processed_bytes	ProcessedBytes	Measures the total amount of data processed by the Network Load Balancer.
aws_networkelb_target_tlsnegotiation_error_count	TargetTLSNegotiationErrorCount	Monitors TLS negotiation errors on the target side, indicating failed handshakes.
aws_networkelb_tcp_client_reset_count	TCP_Client_Reset_Count	Tracks the number of TCP client resets, indicating client-initiated connection terminations.
aws_networkelb_tcp_target_reset_count	TCP_Target_Reset_Count	Monitors TCP resets initiated by the target, indicating failed connections.
aws_networkelb_un_healthy_host_count	UnHealthyHostCount	Measures the number of targets marked as unhealthy by the load balancer.
aws_networkelb_active_flow_count_tcp	ActiveFlowCount_TCP	Monitors the number of active TCP flows through the Network Load Balancer.
aws_networkelb_active_flow_count_udp	ActiveFlowCount_UDP	Tracks the number of active UDP flows through the Network Load Balancer.
aws_networkelb_consumed_lcus_tcp	ConsumedLCUs_TCP	Measures LCUs consumed by TCP traffic.
aws_networkelb_consumed_lcus_tls	ConsumedLCUs_TLS	Measures LCUs consumed by TLS traffic.
aws_networkelb_consumed_lcus_udp	ConsumedLCUs_UDP	Measures LCUs consumed by UDP traffic.
aws_networkelb_new_flow_count_tcp	NewFlowCount_TCP	Tracks the number of new TCP flow connections established.
aws_networkelb_new_flow_count_udp	NewFlowCount_UDP	Measures the number of new UDP flow connections established.
aws_networkelb_peak_packets_per_second	PeakPacketsPerSecond	Monitors the highest rate of packets processed by the Network Load Balancer per second.
aws_networkelb_port_allocation_error_count	PortAllocationErrorCount	Tracks the number of errors due to port allocation failures.
aws_networkelb_processed_bytes_tcp	ProcessedBytes_TCP	Measures the total data processed over TCP connections.
aws_networkelb_processed_bytes_tls	ProcessedBytes_TLS	Tracks the total data processed over TLS connections.
aws_networkelb_processed_bytes_udp	ProcessedBytes_UDP	Monitors the total data processed over UDP connections.
aws_networkelb_processed_packets	ProcessedPackets	Tracks the total number of packets processed by the Network Load Balancer.
aws_networkelb_security_group_blocked_flow_count_inbound_icmp	SecurityGroupBlockedFlowCount_Inbound_ICMP	Measures the number of inbound ICMP flows blocked by security groups.
aws_networkelb_security_group_blocked_flow_count_inbound_tcp	SecurityGroupBlockedFlowCount_Inbound_TCP	Tracks the number of inbound TCP flows blocked by security groups.
aws_networkelb_security_group_blocked_flow_count_inbound_udp	SecurityGroupBlockedFlowCount_Inbound_UDP	Monitors the number of inbound UDP flows blocked by security groups.
aws_networkelb_security_group_blocked_flow_count_outbound_icmp	SecurityGroupBlockedFlowCount_Outbound_ICMP	Measures the number of outbound ICMP flows blocked by security groups.
aws_networkelb_security_group_blocked_flow_count_outbound_tcp	SecurityGroupBlockedFlowCount_Outbound_TCP	Tracks the number of outbound TCP flows blocked by security groups.
aws_networkelb_security_group_blocked_flow_count_outbound_udp	SecurityGroupBlockedFlowCount_Outbound_UDP	Monitors the number of outbound UDP flows blocked by security groups.
aws_networkelb_tcp_elb_reset_count	TCP_ELB_Reset_Count	Tracks the number of TCP resets initiated by the Network Load Balancer itself.
aws_networkelb_unhealthy_routing_flow_count	UnhealthyRoutingFlowCount	Monitors the number of routing flows directed to unhealthy targets.
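
The two host-count metrics above are easier to alert on when combined into a single ratio. The following is a minimal sketch, assuming you have already fetched one sample of each metric for the same target group; the function name and the sample values are hypothetical and not part of any Grafana or AWS API.

```python
from typing import Optional


def healthy_fraction(healthy_hosts: float, unhealthy_hosts: float) -> Optional[float]:
    """Return the share of registered targets that are healthy.

    healthy_hosts   -- a sample of aws_networkelb_healthy_host_count
    unhealthy_hosts -- a sample of aws_networkelb_un_healthy_host_count
    Returns None when no targets are registered, so callers can tell
    "no targets" apart from "all targets unhealthy".
    """
    total = healthy_hosts + unhealthy_hosts
    if total == 0:
        return None
    return healthy_hosts / total


if __name__ == "__main__":
    # Hypothetical samples: 3 healthy targets, 1 unhealthy target.
    print(healthy_fraction(3, 1))
```

A ratio is usually a more stable alert condition than a raw unhealthy count, because it does not need to be retuned when the target group is resized.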

AWS/NetworkFirewall

Function: Managed network firewall service to secure VPCs

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_networkfirewall_info
aws_networkfirewall_dropped_packets	DroppedPackets	Tracks the number of packets dropped by the Network Firewall, indicating blocked or failed traffic.
aws_networkfirewall_packets	Packets	Monitors the total number of packets inspected by the Network Firewall.
aws_networkfirewall_passed_packets	PassedPackets	Measures the number of packets allowed through the Network Firewall, indicating successful traffic.
aws_networkfirewall_received_packet_count	ReceivedPacketCount	Tracks the total number of packets received by the Network Firewall for inspection.

AWS/PrivateLinkEndpoints

Function: Provides private connectivity between VPCs and AWS services or third-party services

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_privatelinkendpoints_info
aws_privatelinkendpoints_active_connections	ActiveConnections	Tracks the number of active connections through the PrivateLink endpoints.
aws_privatelinkendpoints_bytes_processed	BytesProcessed	Measures the amount of data processed by the PrivateLink endpoints in bytes.
aws_privatelinkendpoints_new_connections	NewConnections	Monitors the number of new connections established through the PrivateLink endpoints.
aws_privatelinkendpoints_packets_dropped	PacketsDropped	Tracks the number of packets dropped by the PrivateLink endpoints, which could indicate errors or network issues.
aws_privatelinkendpoints_rst_packets_received	RstPacketsReceived	Measures the number of reset (RST) packets received, which can indicate connection terminations.

AWS/PrivateLinkServices

Function: Service for building services accessible over AWS PrivateLink

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_privatelinkservices_info
aws_privatelinkservices_active_connections	ActiveConnections	Monitors the number of active connections managed by the PrivateLink services.
aws_privatelinkservices_bytes_processed	BytesProcessed	Measures the total amount of data processed by the PrivateLink services in bytes.
aws_privatelinkservices_endpoints_count	EndpointsCount	Tracks the number of PrivateLink service endpoints currently connected.
aws_privatelinkservices_new_connections	NewConnections	Monitors the number of new connections established via the PrivateLink services.
aws_privatelinkservices_rst_packets_received	RstPacketsReceived	Measures the number of reset (RST) packets received, indicating terminated connections.

AWS/Prometheus

Function: Managed Prometheus service for monitoring and alerting metrics

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_prometheus_info
aws_prometheus_alert_manager_alerts_received	AlertManagerAlertsReceived	Tracks the number of alerts received by the Prometheus Alert Manager.
aws_prometheus_alert_manager_notifications_failed	AlertManagerNotificationsFailed	Monitors the number of failed alert notifications sent by the Prometheus Alert Manager.
aws_prometheus_alert_manager_notifications_throttled	AlertManagerNotificationsThrottled	Measures the number of alert notifications throttled due to rate limits or other constraints.
aws_prometheus_discarded_samples	DiscardedSamples	Tracks the number of discarded samples due to errors or incorrect data.
aws_prometheus_rule_evaluation_failures	RuleEvaluationFailures	Monitors the number of failed rule evaluations in Prometheus.
aws_prometheus_rule_evaluations	RuleEvaluations	Measures the total number of rule evaluations performed by Prometheus.
aws_prometheus_rule_group_iterations_missed	RuleGroupIterationsMissed	Tracks the number of rule group evaluation iterations that were missed due to processing delays.

AWS/RDS

Function: Managed relational database service for databases like MySQL, PostgreSQL, and Oracle

Scrape interval: 5 minutes

Includes: Out-of-the-box dashboard

Metric	Cloudwatch metric	Purpose
aws_rds_info
aws_rds_cpuutilization	CPUUtilization	Tracks the utilization of CPU resources by RDS instances.
aws_rds_database_connections	DatabaseConnections	Measures the number of active database connections to RDS instances.
aws_rds_replica_lag	ReplicaLag	Monitors how far a read replica lags behind its source DB instance.
aws_rds_freeable_memory	FreeableMemory	Indicates the available memory that can be used by the RDS instance.
aws_rds_free_storage_space	FreeStorageSpace	Shows the remaining storage space available on the RDS instance.
aws_rds_free_storage_space_log_volume	FreeStorageSpaceLogVolume	Shows the remaining storage space available on the log volume.
aws_rds_swap_usage	SwapUsage	Monitors the amount of swap space used by the RDS instance.
aws_rds_read_throughput	ReadThroughput	Measures the throughput for read operations from the database.
aws_rds_read_latency	ReadLatency	Indicates the latency for read operations on the database.
aws_rds_read_iops	ReadIOPS	Tracks the input/output operations per second for reads on the RDS instance.
aws_rds_write_throughput	WriteThroughput	Measures the throughput for write operations to the database.
aws_rds_write_latency	WriteLatency	Indicates the latency for write operations on the database.
aws_rds_write_iops	WriteIOPS	Tracks the input/output operations per second for writes on the RDS instance.
aws_rds_burst_balance	BurstBalance	Monitors the percentage of General Purpose SSD (gp2) burst-bucket I/O credits still available.
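aws_rds_ebsbyte_balance_percent	EBSByteBalance%	Monitors the percentage of throughput credits remaining in the burst bucket.
aws_rds_ebsiobalance_percent	EBSIOBalance%	Monitors the percentage of I/O credits remaining in the burst bucket.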
aws_rds_dbload	DBLoad	Measures the database load on the instance.
aws_rds_dbload_cpu	DBLoadCPU	Tracks the portion of database load related to CPU usage.
aws_rds_dbload_non_cpu	DBLoadNonCPU	Measures the portion of database load unrelated to CPU usage.
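aws_rds_cpucredit_usage	CPUCreditUsage	Tracks the number of CPU credits spent by burstable performance (T-class) instances.
aws_rds_cpucredit_balance	CPUCreditBalance	Shows the number of earned CPU credits that an instance has accrued and not yet spent.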
aws_rds_acuutilization	ACUUtilization	Monitors the utilization of Aurora Capacity Units (ACUs).
aws_rds_aborted_clients	AbortedClients	Tracks the number of aborted client connections to the database.
aws_rds_active_transactions	ActiveTransactions	Shows the number of active transactions on the database.
aws_rds_aurora_binlog_replica_lag	AuroraBinlogReplicaLag	Monitors how far a binary log replica Aurora cluster lags behind its binary log replication source.
aws_rds_aurora_dmlrejected_master_full	AuroraDMLRejectedMasterFull
aws_rds_aurora_dmlrejected_writer_full	AuroraDMLRejectedWriterFull
aws_rds_aurora_estimated_shared_memory_bytes	AuroraEstimatedSharedMemoryBytes
aws_rds_aurora_global_dbdata_transfer_bytes	AuroraGlobalDBDataTransferBytes
aws_rds_aurora_global_dbprogress_lag	AuroraGlobalDBProgressLag
aws_rds_aurora_global_dbrpolag	AuroraGlobalDBRPOLag
aws_rds_aurora_global_dbreplicated_write_io	AuroraGlobalDBReplicatedWriteIO
aws_rds_aurora_global_dbreplication_lag	AuroraGlobalDBReplicationLag
aws_rds_aurora_memory_health_state	AuroraMemoryHealthState	Indicates the health state of memory in Aurora instances.
aws_rds_aurora_memory_num_declined_sql_total	AuroraMemoryNumDeclinedSqlTotal
aws_rds_aurora_memory_num_kill_conn_total	AuroraMemoryNumKillConnTotal
aws_rds_aurora_memory_num_kill_query_total	AuroraMemoryNumKillQueryTotal
aws_rds_aurora_optimized_reads_cache_hit_ratio	AuroraOptimizedReadsCacheHitRatio
aws_rds_aurora_replica_lag	AuroraReplicaLag
aws_rds_aurora_replica_lag_maximum	AuroraReplicaLagMaximum
aws_rds_aurora_replica_lag_minimum	AuroraReplicaLagMinimum
aws_rds_aurora_slow_connection_handle_count	AuroraSlowConnectionHandleCount
aws_rds_aurora_slow_handshake_count	AuroraSlowHandshakeCount
aws_rds_aurora_volume_bytes_left_total	AuroraVolumeBytesLeftTotal
aws_rds_availability_percentage	AvailabilityPercentage	Measures the availability of the RDS instance in terms of percentage uptime.
aws_rds_backtrack_change_records_creation_rate	BacktrackChangeRecordsCreationRate
aws_rds_backtrack_change_records_stored	BacktrackChangeRecordsStored
aws_rds_backtrack_window_actual	BacktrackWindowActual
aws_rds_backtrack_window_alert	BacktrackWindowAlert
aws_rds_backup_retention_period_storage_used	BackupRetentionPeriodStorageUsed
aws_rds_bin_log_disk_usage	BinLogDiskUsage
aws_rds_blocked_transactions	BlockedTransactions
aws_rds_buffer_cache_hit_ratio	BufferCacheHitRatio
aws_rds_cpusurplus_credit_balance	CPUSurplusCreditBalance
aws_rds_cpusurplus_credits_charged	CPUSurplusCreditsCharged
aws_rds_checkpoint_lag	CheckpointLag
aws_rds_client_connections	ClientConnections
aws_rds_client_connections_closed	ClientConnectionsClosed
aws_rds_client_connections_no_tls	ClientConnectionsNoTLS
aws_rds_client_connections_received	ClientConnectionsReceived
aws_rds_client_connections_setup_failed_auth	ClientConnectionsSetupFailedAuth
aws_rds_client_connections_setup_succeeded	ClientConnectionsSetupSucceeded
aws_rds_client_connections_tls	ClientConnectionsTLS
aws_rds_commit_latency	CommitLatency
aws_rds_commit_throughput	CommitThroughput
aws_rds_connection_attempts	ConnectionAttempts
aws_rds_ddllatency	DDLLatency
aws_rds_ddlthroughput	DDLThroughput
aws_rds_dmllatency	DMLLatency
aws_rds_dmlthroughput	DMLThroughput
aws_rds_database_connection_requests	DatabaseConnectionRequests
aws_rds_database_connection_requests_with_tls	DatabaseConnectionRequestsWithTLS
aws_rds_database_connections_borrow_latency	DatabaseConnectionsBorrowLatency
aws_rds_database_connections_currently_borrowed	DatabaseConnectionsCurrentlyBorrowed
aws_rds_database_connections_currently_in_transaction	DatabaseConnectionsCurrentlyInTransaction
aws_rds_database_connections_currently_session_pinned	DatabaseConnectionsCurrentlySessionPinned
aws_rds_database_connections_setup_failed	DatabaseConnectionsSetupFailed
aws_rds_database_connections_setup_succeeded	DatabaseConnectionsSetupSucceeded
aws_rds_database_connections_with_tls	DatabaseConnectionsWithTLS
aws_rds_deadlocks	Deadlocks
aws_rds_delete_latency	DeleteLatency
aws_rds_delete_throughput	DeleteThroughput
aws_rds_disk_queue_depth	DiskQueueDepth
aws_rds_disk_queue_depth_log_volume	DiskQueueDepthLogVolume
aws_rds_engine_uptime	EngineUptime
aws_rds_failed_sqlserver_agent_jobs_count	FailedSQLServerAgentJobsCount
aws_rds_free_ephemeral_storage	FreeEphemeralStorage
aws_rds_free_local_storage	FreeLocalStorage
aws_rds_insert_latency	InsertLatency
aws_rds_insert_throughput	InsertThroughput
aws_rds_login_failures	LoginFailures
aws_rds_max_database_connections_allowed	MaxDatabaseConnectionsAllowed
aws_rds_maximum_used_transaction_ids	MaximumUsedTransactionIDs
aws_rds_network_receive_throughput	NetworkReceiveThroughput
aws_rds_network_throughput	NetworkThroughput
aws_rds_network_transmit_throughput	NetworkTransmitThroughput
aws_rds_num_binary_log_files	NumBinaryLogFiles
aws_rds_oldest_replication_slot_lag	OldestReplicationSlotLag
aws_rds_purge_boundary	PurgeBoundary
aws_rds_purge_finished_point	PurgeFinishedPoint
aws_rds_queries	Queries	Counts the number of queries executed on the RDS instance.
aws_rds_query_database_response_latency	QueryDatabaseResponseLatency
aws_rds_query_requests	QueryRequests
aws_rds_query_requests_no_tls	QueryRequestsNoTLS
aws_rds_query_requests_tls	QueryRequestsTLS
aws_rds_query_response_latency	QueryResponseLatency
aws_rds_to_aurora_postgre_sqlreplica_lag	RDSToAuroraPostgreSQLReplicaLag
aws_rds_read_iopsephemeral_storage	ReadIOPSEphemeralStorage
aws_rds_read_iopslog_volume	ReadIOPSLogVolume
aws_rds_read_latency_ephemeral_storage	ReadLatencyEphemeralStorage
aws_rds_read_latency_log_volume	ReadLatencyLogVolume
aws_rds_read_throughput_ephemeral_storage	ReadThroughputEphemeralStorage
aws_rds_read_throughput_log_volume	ReadThroughputLogVolume
aws_rds_replication_channel_lag	ReplicationChannelLag
aws_rds_replication_slot_disk_usage	ReplicationSlotDiskUsage
aws_rds_result_set_cache_hit_ratio	ResultSetCacheHitRatio
aws_rds_rollback_segment_history_list_length	RollbackSegmentHistoryListLength
aws_rds_row_lock_time	RowLockTime
aws_rds_select_latency	SelectLatency
aws_rds_select_throughput	SelectThroughput
aws_rds_serverless_database_capacity	ServerlessDatabaseCapacity
aws_rds_snapshot_storage_used	SnapshotStorageUsed
aws_rds_storage_network_receive_throughput	StorageNetworkReceiveThroughput
aws_rds_storage_network_throughput	StorageNetworkThroughput	Measures the combined network throughput sent to and received from the Aurora storage subsystem by each instance in the cluster.
aws_rds_storage_network_transmit_throughput	StorageNetworkTransmitThroughput
aws_rds_sum_binary_log_size	SumBinaryLogSize
aws_rds_temp_storage_iops	TempStorageIOPS
aws_rds_temp_storage_throughput	TempStorageThroughput
aws_rds_total_backup_storage_billed	TotalBackupStorageBilled
aws_rds_transaction_logs_disk_usage	TransactionLogsDiskUsage	Tracks the amount of disk space used by transaction logs.
aws_rds_transaction_logs_generation	TransactionLogsGeneration
aws_rds_truncate_finished_point	TruncateFinishedPoint
aws_rds_update_latency	UpdateLatency
aws_rds_update_throughput	UpdateThroughput
aws_rds_volume_bytes_used	VolumeBytesUsed	Shows the total amount of storage used by the Aurora DB cluster volume.
aws_rds_volume_read_iops	VolumeReadIOPs
aws_rds_volume_write_iops	VolumeWriteIOPs
aws_rds_write_iopsephemeral_storage	WriteIOPSEphemeralStorage
aws_rds_write_iopslog_volume	WriteIOPSLogVolume
aws_rds_write_latency_ephemeral_storage	WriteLatencyEphemeralStorage
aws_rds_write_latency_log_volume	WriteLatencyLogVolume	Monitors the latency for write operations on the log volume.
aws_rds_write_throughput_ephemeral_storage	WriteThroughputEphemeralStorage
aws_rds_write_throughput_log_volume	WriteThroughputLogVolume
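
The Metric column on this page follows a regular pattern relative to the service namespace and the CloudWatch metric name. The sketch below reproduces that pattern as inferred from the table rows only; it is an illustration, not the exporter's actual implementation, and the helper names `_snake` and `prometheus_name` are hypothetical. The rule that drops a repeated service name is an assumption based on the single `RDSToAuroraPostgreSQLReplicaLag` row.

```python
import re

# Boundary between a lowercase letter or digit and the uppercase letter after it.
_WORD_BOUNDARY = re.compile(r"([a-z0-9])([A-Z])")


def _snake(text: str) -> str:
    """Split camelCase words with underscores, spell out '%', and lowercase."""
    text = _WORD_BOUNDARY.sub(r"\1_\2", text)
    text = text.replace("%", "_percent")
    # Any other character that is not a letter, digit, or underscore becomes "_".
    return re.sub(r"[^A-Za-z0-9_]", "_", text).lower()


def prometheus_name(namespace: str, cloudwatch_metric: str) -> str:
    """Build the metric name shown in the tables from a namespace and metric."""
    # The namespace is lowercased first, so "SageMaker" stays one word.
    prefix = _snake(namespace.lower())
    metric = _snake(cloudwatch_metric)
    # Assumption: a metric name that starts with the service name drops it,
    # which is how RDSToAuroraPostgreSQLReplicaLag appears in the RDS table.
    service = prefix.rsplit("_", 1)[-1]
    if metric.startswith(service):
        metric = metric[len(service):].lstrip("_")
    return f"{prefix}_{metric}"


if __name__ == "__main__":
    for ns, name in [
        ("AWS/RDS", "ReadIOPS"),
        ("AWS/RDS", "EBSByteBalance%"),
        ("AWS/NetworkELB", "UnHealthyHostCount"),
    ]:
        print(name, "->", prometheus_name(ns, name))
```

Runs of capital letters such as `CPU` or `WLM` are not split, which is why `CPUUtilization` becomes `cpuutilization` rather than `cpu_utilization`.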

AWS/Redshift

Function: Fully managed data warehouse for large-scale data analytics

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_redshift_info
aws_redshift_cpuutilization	CPUUtilization	Tracks CPU utilization across Redshift clusters.
aws_redshift_commit_queue_length	CommitQueueLength	Measures the length of the commit queue for query execution.
aws_redshift_concurrency_scaling_active_clusters	ConcurrencyScalingActiveClusters	Monitors the number of active concurrency scaling clusters.
aws_redshift_concurrency_scaling_seconds	ConcurrencyScalingSeconds	Measures the number of seconds that concurrency scaling clusters spent actively processing queries.
aws_redshift_database_connections	DatabaseConnections	Tracks the number of database connections to the Redshift cluster.
aws_redshift_health_status	HealthStatus	Provides health status of Redshift clusters.
aws_redshift_maintenance_mode	MaintenanceMode	Indicates if the cluster is in maintenance mode.
aws_redshift_max_configured_concurrency_scaling_clusters	MaxConfiguredConcurrencyScalingClusters	Tracks the maximum number of concurrency scaling clusters configured.
aws_redshift_network_receive_throughput	NetworkReceiveThroughput	Measures the network throughput for receiving data.
aws_redshift_network_transmit_throughput	NetworkTransmitThroughput	Measures the network throughput for transmitting data.
aws_redshift_num_exceeded_schema_quotas	NumExceededSchemaQuotas	Tracks how often schema quotas have been exceeded.
aws_redshift_percentage_disk_space_used	PercentageDiskSpaceUsed	Shows the percentage of disk space used by the cluster.
aws_redshift_percentage_quota_used	PercentageQuotaUsed	Monitors the percentage of quota used.
aws_redshift_queries_completed_per_second	QueriesCompletedPerSecond	Measures the number of queries completed per second.
aws_redshift_query_duration	QueryDuration	Tracks the duration of queries.
aws_redshift_query_runtime_breakdown	QueryRuntimeBreakdown	Provides a breakdown of the time spent on query execution.
aws_redshift_read_iops	ReadIOPS	Measures input/output operations per second for reads.
aws_redshift_read_latency	ReadLatency	Tracks latency for read operations.
aws_redshift_read_throughput	ReadThroughput	Measures throughput for read operations.
aws_redshift_schema_quota	SchemaQuota	Monitors schema quota usage.
aws_redshift_storage_used	StorageUsed	Shows the amount of storage used by the Redshift cluster.
aws_redshift_total_table_count	TotalTableCount	Measures the total number of tables in the cluster.
aws_redshift_wlmqueries_completed_per_second	WLMQueriesCompletedPerSecond	Tracks the number of queries completed per second in the Workload Management (WLM) queue.
aws_redshift_wlmquery_duration	WLMQueryDuration	Measures the duration of queries in the WLM queue.
aws_redshift_wlmqueue_length	WLMQueueLength	Tracks the length of the WLM queue.
aws_redshift_wlmqueue_wait_time	WLMQueueWaitTime	Measures the wait time for queries in the WLM queue.
aws_redshift_wlmrunning_queries	WLMRunningQueries	Shows the number of queries currently running in the WLM queue.
aws_redshift_write_iops	WriteIOPS	Measures input/output operations per second for writes.
aws_redshift_write_latency	WriteLatency	Tracks latency for write operations.
aws_redshift_write_throughput	WriteThroughput	Measures throughput for write operations.

AWS/Route53

Function: Scalable DNS and domain registration service

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_route53_info
aws_route53_child_health_check_healthy_count	ChildHealthCheckHealthyCount	Tracks the count of healthy child health checks.
aws_route53_connection_time	ConnectionTime	Measures the time it takes to establish a connection.
aws_route53_dnsqueries	DNSQueries	Monitors the number of DNS queries handled by Route 53.
aws_route53_health_check_percentage_healthy	HealthCheckPercentageHealthy	Displays the percentage of healthy Route 53 health checks.
aws_route53_health_check_status	HealthCheckStatus	Indicates the status of health checks, showing whether they are passing or failing.
aws_route53_sslhandshake_time	SSLHandshakeTime	Measures the time it takes to complete the SSL handshake.
aws_route53_time_to_first_byte	TimeToFirstByte	Tracks the time taken to receive the first byte of the response after a request is sent.

AWS/Route53Resolver

Function: DNS resolution service that routes queries between VPCs and other networks through inbound and outbound endpoints

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_route53resolver_info
aws_route53resolver_inbound_query_volume	InboundQueryVolume	Measures the volume of DNS queries received by the Route 53 Resolver inbound endpoint.
aws_route53resolver_outbound_query_aggregated_volume	OutboundQueryAggregatedVolume	Tracks the total volume of outbound DNS queries across all outbound endpoints.
aws_route53resolver_outbound_query_volume	OutboundQueryVolume	Monitors the volume of DNS queries sent by the Route 53 Resolver outbound endpoint.

AWS/S3

Function: Scalable object storage service for a wide range of data types

Scrape interval: 5 minutes

Includes: Out-of-the-box dashboard

Metric	Cloudwatch metric	Purpose
aws_s3_info
aws_s3_number_of_objects	NumberOfObjects	Tracks the total number of objects stored in an S3 bucket.
aws_s3_bucket_size_bytes	BucketSizeBytes	Measures the total size of an S3 bucket in bytes.
aws_s3_all_requests	AllRequests	Measures the total number of all requests made to an S3 bucket.
aws_s3_4xx_errors	4xxErrors	Counts the number of 4xx HTTP status code errors encountered during S3 requests.
aws_s3_total_request_latency	TotalRequestLatency	Measures the total latency for S3 requests.
aws_s3_5xx_errors	5xxErrors	Counts the number of 5xx HTTP status code errors encountered during S3 requests.
aws_s3_bytes_downloaded	BytesDownloaded	Tracks the total bytes downloaded from an S3 bucket.
aws_s3_bytes_pending_replication	BytesPendingReplication	Measures the bytes pending replication in S3 cross-region replication scenarios.
aws_s3_bytes_uploaded	BytesUploaded	Tracks the total bytes uploaded to an S3 bucket.
aws_s3_delete_requests	DeleteRequests	Measures the number of delete requests made to an S3 bucket.
aws_s3_first_byte_latency	FirstByteLatency	Tracks the latency until the first byte is sent in an S3 request.
aws_s3_get_requests	GetRequests	Measures the number of GET requests made to an S3 bucket.
aws_s3_head_requests	HeadRequests	Counts the number of HEAD requests made to an S3 bucket.
aws_s3_list_requests	ListRequests	Tracks the number of LIST requests made to an S3 bucket.
aws_s3_operations_failed_replication	OperationsFailedReplication	Counts the number of replication operations that have failed.
aws_s3_operations_pending_replication	OperationsPendingReplication	Tracks the number of pending replication operations in an S3 bucket.
aws_s3_post_requests	PostRequests	Counts the number of POST requests made to an S3 bucket.
aws_s3_put_requests	PutRequests	Tracks the number of PUT requests made to an S3 bucket.
aws_s3_replication_latency	ReplicationLatency	Measures the latency of replication operations.
aws_s3_select_requests	SelectRequests	Measures the number of select requests made to an S3 bucket.
aws_s3_select_returned_bytes	SelectReturnedBytes	Tracks the bytes returned by S3 Select queries.
aws_s3_select_scanned_bytes	SelectScannedBytes	Measures the bytes scanned by S3 Select queries.
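
The request and error counters above can be combined into an error rate per bucket. The following is a minimal sketch, assuming the three values are Sum statistics taken over the same time window; the function name and sample numbers are hypothetical.

```python
from typing import Optional


def s3_error_rate(all_requests: float, errors_4xx: float, errors_5xx: float) -> Optional[float]:
    """Return the fraction of S3 requests that ended in a 4xx or 5xx status.

    all_requests -- Sum of aws_s3_all_requests for the window
    errors_4xx   -- Sum of aws_s3_4xx_errors for the same window
    errors_5xx   -- Sum of aws_s3_5xx_errors for the same window
    Returns None when the bucket received no requests in the window.
    """
    if all_requests == 0:
        return None
    return (errors_4xx + errors_5xx) / all_requests


if __name__ == "__main__":
    # Hypothetical window: 200 requests, 6 client errors, 4 server errors.
    print(s3_error_rate(200, 6, 4))
```

Keeping the 4xx and 5xx counts as separate arguments makes it easy to report client-side and server-side rates individually if you need to distinguish caller mistakes from service problems.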

AWS/SES

Function: Email service for sending marketing, notification, and transactional emails

Scrape interval: 5 minutes

Includes: Out-of-the-box dashboard

Metric	Cloudwatch metric	Purpose
aws_ses_bounce	Bounce
aws_ses_complaint	Complaint
aws_ses_delivery	Delivery
aws_ses_reject	Reject
aws_ses_send	Send
aws_ses_clicks	Clicks
aws_ses_opens	Opens
aws_ses_rendering_failures	Rendering Failures
aws_ses_reputation_bounce_rate	Reputation.BounceRate
aws_ses_reputation_complaint_rate	Reputation.ComplaintRate

AWS/SNS

Function: Managed messaging service for sending notifications to mobile devices or other services

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_sns_info
aws_sns_number_of_messages_published	NumberOfMessagesPublished	Tracks the number of messages published to SNS topics.
aws_sns_number_of_notifications_delivered	NumberOfNotificationsDelivered	Measures the number of successfully delivered notifications.
aws_sns_number_of_notifications_failed	NumberOfNotificationsFailed	Tracks the number of failed notifications.
aws_sns_number_of_notifications_filtered_out	NumberOfNotificationsFilteredOut	Measures the notifications that were filtered out based on the subscription’s filter policies.
aws_sns_number_of_notifications_filtered_out_invalid_attributes	NumberOfNotificationsFilteredOut-InvalidAttributes	Tracks the notifications filtered out due to invalid message attributes.
aws_sns_number_of_notifications_filtered_out_message_body	NumberOfNotificationsFilteredOut-MessageBody	Measures notifications filtered out because of the message body content.
aws_sns_number_of_notifications_filtered_out_no_message_attributes	NumberOfNotificationsFilteredOut-NoMessageAttributes	Tracks notifications filtered out due to missing message attributes.
aws_sns_publish_size	PublishSize	Measures the size of messages published to SNS topics.
aws_sns_smsmonth_to_date_spent_usd	SMSMonthToDateSpentUSD	Tracks the month-to-date costs incurred for sending SMS messages.
aws_sns_smssuccess_rate	SMSSuccessRate	Measures the success rate of sending SMS messages via SNS.

AWS/SQS

Function: Fully managed message queuing service for decoupling and scaling microservices

Scrape interval: 5 minutes

Includes: Out-of-the-box dashboard

Metric	Cloudwatch metric	Purpose
aws_sqs_info
aws_sqs_approximate_age_of_oldest_message	ApproximateAgeOfOldestMessage	Tracks the approximate age of the oldest message in the queue.
aws_sqs_approximate_number_of_messages_delayed	ApproximateNumberOfMessagesDelayed	Measures the approximate number of messages currently delayed.
aws_sqs_approximate_number_of_messages_not_visible	ApproximateNumberOfMessagesNotVisible	Tracks the approximate number of messages that are not visible to consumers due to being in flight.
aws_sqs_approximate_number_of_messages_visible	ApproximateNumberOfMessagesVisible	Measures the approximate number of messages currently visible to consumers.
aws_sqs_number_of_empty_receives	NumberOfEmptyReceives	Tracks the number of receive requests that did not return any messages.
aws_sqs_number_of_messages_deleted	NumberOfMessagesDeleted	Measures the number of messages successfully deleted from the queue.
aws_sqs_number_of_messages_received	NumberOfMessagesReceived	Tracks the number of messages received from the queue.
aws_sqs_number_of_messages_sent	NumberOfMessagesSent	Measures the number of messages successfully sent to the queue.
aws_sqs_sent_message_size	SentMessageSize	Tracks the size of messages sent to the queue.
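
The three approximate message counts above add up to the approximate total queue depth, and the sent and deleted counters give a rough view of whether consumers are keeping up. The sketch below assumes all samples come from the same queue and the same scrape window; the function names and sample values are hypothetical.

```python
from typing import Optional


def queue_depth(visible: float, not_visible: float, delayed: float) -> float:
    """Approximate total messages in the queue: visible + in flight + delayed."""
    return visible + not_visible + delayed


def windows_to_drain(backlog: float, sent: float, deleted: float) -> Optional[float]:
    """Estimate how many scrape windows are needed to clear the backlog.

    sent    -- aws_sqs_number_of_messages_sent for one window
    deleted -- aws_sqs_number_of_messages_deleted for the same window
    Returns None when consumers delete no faster than producers send,
    because the backlog is then not shrinking.
    """
    net_drain = deleted - sent
    if net_drain <= 0:
        return None
    return backlog / net_drain


if __name__ == "__main__":
    # Hypothetical samples for one queue.
    depth = queue_depth(visible=120, not_visible=30, delayed=10)
    print(depth, windows_to_drain(depth, sent=50, deleted=90))
```

This estimate is deliberately rough: it treats the most recent window as representative and ignores messages that are received but never deleted.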

AWS/SageMaker

Function: Managed service for building, training, and deploying machine learning models

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_sagemaker_info
aws_sagemaker_invocation4_xxerrors	Invocation4XXErrors	Tracks the count of 4XX errors (client-side errors) during model invocations.
aws_sagemaker_invocation5_xxerrors	Invocation5XXErrors	Tracks the count of 5XX errors (server-side errors) during model invocations.
aws_sagemaker_invocation_model_errors	InvocationModelErrors	Measures the errors specific to model invocations.
aws_sagemaker_invocations	Invocations	Counts the number of invocation requests sent to a model endpoint.
aws_sagemaker_invocations_per_copy	InvocationsPerCopy	Tracks the number of invocations per copy of the model.
aws_sagemaker_invocations_per_instance	InvocationsPerInstance	Measures the number of invocations per instance.
aws_sagemaker_model_cache_hit	ModelCacheHit	Tracks invocation requests for which the model was already loaded, so no load time was incurred.
aws_sagemaker_model_downloading_time	ModelDownloadingTime	Measures the time taken to download the model to the instance.
aws_sagemaker_model_latency	ModelLatency	Tracks the latency of model invocations.
aws_sagemaker_model_loading_time	ModelLoadingTime	Measures the time taken to load the model on the instance.
aws_sagemaker_model_loading_wait_time	ModelLoadingWaitTime	Measures the wait time during the model loading process.
aws_sagemaker_model_setup_time	ModelSetupTime	Tracks the time taken to set up the model environment.
aws_sagemaker_model_unloading_time	ModelUnloadingTime	Measures the time taken to unload the model from the instance.
aws_sagemaker_overhead_latency	OverheadLatency	Tracks additional latency incurred due to overheads during the invocation process.
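
The two latency metrics above are complementary: model latency covers the time spent in the model container, and overhead latency covers the time added by the service around it. The following is a minimal sketch of combining them, assuming both samples are reported in microseconds (as CloudWatch does for these metrics) and use the same statistic; the function name and sample values are hypothetical.

```python
def end_to_end_latency_ms(model_latency_us: float, overhead_latency_us: float) -> float:
    """Return the approximate request latency in milliseconds.

    model_latency_us    -- a sample of aws_sagemaker_model_latency, in microseconds
    overhead_latency_us -- a sample of aws_sagemaker_overhead_latency, in microseconds
    """
    return (model_latency_us + overhead_latency_us) / 1000.0


if __name__ == "__main__":
    # Hypothetical samples: 45 ms in the model, 5 ms of overhead.
    print(end_to_end_latency_ms(45_000, 5_000))
```

Summing averages gives a meaningful average, but summing percentiles (for example p99 of each metric) only gives an upper-bound style approximation, since the slowest model calls and the slowest overhead do not necessarily occur on the same requests.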

AWS/SageMaker/Endpoints

Function: Provides real-time and batch inference capabilities for deployed machine learning models

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_sagemaker_endpoints_info
aws_sagemaker_endpoints_cpureservation	CPUReservation	Tracks the amount of reserved CPU resources for SageMaker endpoints.
aws_sagemaker_endpoints_cpuutilization	CPUUtilization	Monitors the actual CPU utilization by the SageMaker endpoint.
aws_sagemaker_endpoints_cpuutilization_normalized	CPUUtilizationNormalized	Measures normalized CPU utilization based on instance type and capacity.
aws_sagemaker_endpoints_disk_utilization	DiskUtilization	Tracks the disk space utilization for SageMaker endpoints.
aws_sagemaker_endpoints_gpumemory_utilization	GPUMemoryUtilization	Monitors the actual GPU memory utilization for endpoints using GPU instances.
aws_sagemaker_endpoints_gpumemory_utilization_normalized	GPUMemoryUtilizationNormalized	Measures normalized GPU memory utilization.
aws_sagemaker_endpoints_gpureservation	GPUReservation	Tracks the amount of reserved GPU resources for endpoints using GPU instances.
aws_sagemaker_endpoints_gpuutilization	GPUUtilization	Monitors the actual GPU utilization by the SageMaker endpoint.
aws_sagemaker_endpoints_gpuutilization_normalized	GPUUtilizationNormalized	Measures normalized GPU utilization.
aws_sagemaker_endpoints_loaded_model_count	LoadedModelCount	Tracks the number of models currently loaded on the SageMaker endpoint.
aws_sagemaker_endpoints_memory_reservation	MemoryReservation	Tracks the amount of reserved memory for the SageMaker endpoint.
aws_sagemaker_endpoints_memory_utilization	MemoryUtilization	Monitors the actual memory utilization by the SageMaker endpoint.

AWS/SageMaker/InferenceRecommendationsJobs

Function: Offers guidance on optimizing inference workloads for ML models

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_sagemaker_inferencerecommendationsjobs_info
aws_sagemaker_inferencerecommendationsjobs_client_invocation_errors	ClientInvocationErrors	Tracks the number of errors encountered during client invocations for inference recommendations.
aws_sagemaker_inferencerecommendationsjobs_client_invocations	ClientInvocations	Monitors the number of client invocations of the inference recommendations job.
aws_sagemaker_inferencerecommendationsjobs_client_latency	ClientLatency	Measures the latency of client invocations during the inference recommendations job.
aws_sagemaker_inferencerecommendationsjobs_number_of_users	NumberOfUsers	Tracks the number of users interacting with the inference recommendations job.

AWS/SageMaker/ModelBuildingPipeline

Function: Managed pipelines to automate model training and deployment processes

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_sagemaker_modelbuildingpipeline_info
aws_sagemaker_modelbuildingpipeline_execution_duration	ExecutionDuration	Tracks the duration of pipeline executions.
aws_sagemaker_modelbuildingpipeline_execution_failed	ExecutionFailed	Monitors the number of failed pipeline executions.
aws_sagemaker_modelbuildingpipeline_execution_started	ExecutionStarted	Counts the number of started pipeline executions.
aws_sagemaker_modelbuildingpipeline_execution_stopped	ExecutionStopped	Tracks pipeline executions that were stopped.
aws_sagemaker_modelbuildingpipeline_execution_succeeded	ExecutionSucceeded	Monitors the number of successfully completed pipeline executions.
aws_sagemaker_modelbuildingpipeline_step_duration	StepDuration	Tracks the duration of individual steps within the pipeline.
aws_sagemaker_modelbuildingpipeline_step_failed	StepFailed	Monitors the number of failed steps within the pipeline.
aws_sagemaker_modelbuildingpipeline_step_started	StepStarted	Counts the number of steps started in the pipeline.
aws_sagemaker_modelbuildingpipeline_step_stopped	StepStopped	Tracks the steps that were stopped within the pipeline.
aws_sagemaker_modelbuildingpipeline_step_succeeded	StepSucceeded	Monitors the number of successfully completed steps within the pipeline.

AWS/SageMaker/ProcessingJobs

Function: Managed service for processing and transforming data at scale for machine learning

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_sagemaker_processingjobs_info
aws_sagemaker_processingjobs_cpureservation	CPUReservation	Monitors the amount of CPU resources reserved for processing jobs.
aws_sagemaker_processingjobs_cpuutilization	CPUUtilization	Tracks the utilization of CPU resources during processing jobs.
aws_sagemaker_processingjobs_cpuutilization_normalized	CPUUtilizationNormalized	Provides normalized CPU utilization for easier comparison across different instance types.
aws_sagemaker_processingjobs_disk_utilization	DiskUtilization	Monitors the disk utilization during the processing jobs.
aws_sagemaker_processingjobs_gpumemory_utilization	GPUMemoryUtilization	Tracks GPU memory usage during processing jobs.
aws_sagemaker_processingjobs_gpumemory_utilization_normalized	GPUMemoryUtilizationNormalized	Provides normalized GPU memory utilization for comparison across different instances.
aws_sagemaker_processingjobs_gpureservation	GPUReservation	Monitors the amount of GPU resources reserved for processing jobs.
aws_sagemaker_processingjobs_gpuutilization	GPUUtilization	Tracks the utilization of GPU resources during processing jobs.
aws_sagemaker_processingjobs_gpuutilization_normalized	GPUUtilizationNormalized	Provides normalized GPU utilization for easier cross-instance comparison.
aws_sagemaker_processingjobs_memory_reservation	MemoryReservation	Tracks memory resources reserved for processing jobs.
aws_sagemaker_processingjobs_memory_utilization	MemoryUtilization	Monitors the utilization of memory resources during processing jobs.

AWS/SageMaker/TrainingJobs

Function: Managed service for training ML models on large datasets

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_sagemaker_trainingjobs_info
aws_sagemaker_trainingjobs_cpureservation	CPUReservation	Tracks the amount of CPU resources reserved for training jobs.
aws_sagemaker_trainingjobs_cpuutilization	CPUUtilization	Monitors the CPU utilization during training jobs.
aws_sagemaker_trainingjobs_cpuutilization_normalized	CPUUtilizationNormalized	Provides normalized CPU utilization across different instance types.
aws_sagemaker_trainingjobs_disk_utilization	DiskUtilization	Monitors the disk utilization during training jobs.
aws_sagemaker_trainingjobs_gpumemory_utilization	GPUMemoryUtilization	Tracks GPU memory utilization during training jobs.
aws_sagemaker_trainingjobs_gpumemory_utilization_normalized	GPUMemoryUtilizationNormalized	Provides normalized GPU memory utilization for comparison across different instances.
aws_sagemaker_trainingjobs_gpureservation	GPUReservation	Tracks the amount of GPU resources reserved for training jobs.
aws_sagemaker_trainingjobs_gpuutilization	GPUUtilization	Monitors GPU utilization during training jobs.
aws_sagemaker_trainingjobs_gpuutilization_normalized	GPUUtilizationNormalized	Provides normalized GPU utilization across different instances.
aws_sagemaker_trainingjobs_memory_reservation	MemoryReservation	Monitors the amount of memory reserved for training jobs.
aws_sagemaker_trainingjobs_memory_utilization	MemoryUtilization	Tracks the memory usage during training jobs.

AWS/SageMaker/TransformJobs

Function: Enables large-scale, batch ML model inferences for data transformations

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_sagemaker_transformjobs_info
aws_sagemaker_transformjobs_cpureservation	CPUReservation	Tracks the CPU resources reserved for transform jobs.
aws_sagemaker_transformjobs_cpuutilization	CPUUtilization	Monitors the CPU utilization during transform jobs.
aws_sagemaker_transformjobs_cpuutilization_normalized	CPUUtilizationNormalized	Provides normalized CPU utilization across different instance types during transform jobs.
aws_sagemaker_transformjobs_disk_utilization	DiskUtilization	Monitors disk utilization during transform jobs.
aws_sagemaker_transformjobs_gpumemory_utilization	GPUMemoryUtilization	Tracks GPU memory utilization during transform jobs.
aws_sagemaker_transformjobs_gpumemory_utilization_normalized	GPUMemoryUtilizationNormalized	Provides normalized GPU memory utilization for comparison across different instances during transform jobs.
aws_sagemaker_transformjobs_gpureservation	GPUReservation	Tracks the GPU resources reserved for transform jobs.
aws_sagemaker_transformjobs_gpuutilization	GPUUtilization	Monitors GPU utilization during transform jobs.
aws_sagemaker_transformjobs_gpuutilization_normalized	GPUUtilizationNormalized	Provides normalized GPU utilization across different instances during transform jobs.
aws_sagemaker_transformjobs_memory_reservation	MemoryReservation	Monitors memory resources reserved for transform jobs.
aws_sagemaker_transformjobs_memory_utilization	MemoryUtilization	Tracks memory usage during transform jobs.

AWS/Scheduler

Function: Managed service to trigger events or workflows at a scheduled time

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_scheduler_invocation_attempt_count	InvocationAttemptCount	Tracks the number of attempts made for invocations.
aws_scheduler_invocation_dropped_count	InvocationDroppedCount	Monitors the count of invocations that were dropped.
aws_scheduler_invocation_throttle_count	InvocationThrottleCount	Counts the number of invocations that were throttled due to exceeding limits.
aws_scheduler_invocations_failed_to_be_sent_to_dead_letter_count	InvocationsFailedToBeSentToDeadLetterCount	Tracks the number of invocations that failed to be sent to the dead letter queue.
aws_scheduler_invocations_sent_to_dead_letter_count	InvocationsSentToDeadLetterCount	Counts the number of invocations successfully sent to the dead letter queue.
aws_scheduler_invocations_sent_to_dead_letter_count_truncated_message_size_exceeded	InvocationsSentToDeadLetterCount_Truncated_MessageSizeExceeded	Monitors the number of invocations sent to the dead letter queue with a truncated payload because the message size limit was exceeded.
aws_scheduler_target_error_count	TargetErrorCount	Tracks the count of errors encountered by the target.
aws_scheduler_target_error_throttled_count	TargetErrorThrottledCount	Counts the number of target errors caused by throttling.
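
The metric names in these tables appear to follow a regular pattern relative to their CloudWatch names. The sketch below is inferred from the table rows only, not taken from the exporter's source, and the function name to_metric_name is hypothetical. It assumes the namespace is lowercased with slashes turned into underscores (and given an aws_ prefix if it lacks one), and that the CloudWatch metric name gets an underscore only where a lowercase letter or digit is followed by an uppercase letter.

```python
import re


def to_metric_name(namespace: str, cloudwatch_metric: str) -> str:
    """Derive a table-style metric name from a CloudWatch namespace and metric.

    Assumed rule, inferred from the tables on this page (hypothetical helper).
    """
    # Namespace: lowercase, "/" becomes "_", add "aws_" when it is missing.
    prefix = namespace.lower().replace("/", "_")
    if not prefix.startswith("aws_"):
        prefix = "aws_" + prefix
    # Metric: insert "_" only at a lowercase-or-digit -> uppercase boundary,
    # so acronym runs such as "CPU" or "DAG" stay fused with the next word.
    snake = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", cloudwatch_metric).lower()
    return f"{prefix}_{snake}"


print(to_metric_name("AWS/Scheduler", "TargetErrorThrottledCount"))
# aws_scheduler_target_error_throttled_count
print(to_metric_name("AWS/SageMaker/TransformJobs", "CPUUtilizationNormalized"))
# aws_sagemaker_transformjobs_cpuutilization_normalized
```

Under this assumed rule, acronym runs stay fused with the following word, which is why CPUUtilization maps to cpuutilization rather than cpu_utilization.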

AWS/States

Function: AWS Step Functions for orchestrating workflows and coordinating services

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_states_info
aws_states_activities_failed	ActivitiesFailed	Tracks the number of failed activities.
aws_states_activities_heartbeat_timed_out	ActivitiesHeartbeatTimedOut	Monitors activities whose heartbeat timed out.
aws_states_activities_scheduled	ActivitiesScheduled	Tracks the number of activities that have been scheduled.
aws_states_activities_started	ActivitiesStarted	Measures the number of activities that have started.
aws_states_activities_succeeded	ActivitiesSucceeded	Tracks successful activities.
aws_states_activities_timed_out	ActivitiesTimedOut	Tracks the number of activities that timed out.
aws_states_activity_run_time	ActivityRunTime	Monitors the runtime of activities.
aws_states_activity_schedule_time	ActivityScheduleTime	Tracks the schedule time for activities.
aws_states_activity_time	ActivityTime	Tracks the total time taken by an activity.
aws_states_consumed_capacity	ConsumedCapacity	Measures the consumed capacity for Step Functions.
aws_states_execution_throttled	ExecutionThrottled	Monitors throttled execution attempts.
aws_states_execution_time	ExecutionTime	Tracks the total time taken by an execution.
aws_states_executions_aborted	ExecutionsAborted	Tracks the number of executions that were aborted.
aws_states_executions_failed	ExecutionsFailed	Measures the number of failed executions.
aws_states_executions_started	ExecutionsStarted	Tracks the number of executions that started.
aws_states_executions_succeeded	ExecutionsSucceeded	Tracks successful executions.
aws_states_executions_timed_out	ExecutionsTimedOut	Monitors executions that timed out.
aws_states_express_execution_billed_duration	ExpressExecutionBilledDuration	Measures the billed duration for Express Workflows.
aws_states_express_execution_billed_memory	ExpressExecutionBilledMemory	Measures the billed memory for Express Workflows.
aws_states_express_execution_memory	ExpressExecutionMemory	Monitors the memory consumed by Express Workflows.
aws_states_lambda_function_run_time	LambdaFunctionRunTime	Measures the runtime of Lambda functions.
aws_states_lambda_function_schedule_time	LambdaFunctionScheduleTime	Tracks the schedule time for Lambda functions.
aws_states_lambda_function_time	LambdaFunctionTime	Tracks the total time taken by Lambda functions.
aws_states_lambda_functions_failed	LambdaFunctionsFailed	Monitors Lambda functions that failed.
aws_states_lambda_functions_scheduled	LambdaFunctionsScheduled	Tracks the number of Lambda functions that were scheduled.
aws_states_lambda_functions_started	LambdaFunctionsStarted	Tracks Lambda functions that have started.
aws_states_lambda_functions_succeeded	LambdaFunctionsSucceeded	Measures successful Lambda function executions.
aws_states_lambda_functions_timed_out	LambdaFunctionsTimedOut	Monitors Lambda functions that timed out.
aws_states_provisioned_bucket_size	ProvisionedBucketSize	Tracks the provisioned bucket size for Step Functions.
aws_states_provisioned_refill_rate	ProvisionedRefillRate	Measures the rate at which provisioned capacity is refilled.
aws_states_service_integration_run_time	ServiceIntegrationRunTime	Measures the runtime of service integrations.
aws_states_service_integration_schedule_time	ServiceIntegrationScheduleTime	Tracks the schedule time for service integrations.
aws_states_service_integration_time	ServiceIntegrationTime	Monitors the total time taken by service integrations.
aws_states_service_integrations_failed	ServiceIntegrationsFailed	Tracks failed service integrations.
aws_states_service_integrations_scheduled	ServiceIntegrationsScheduled	Measures the number of service integrations that were scheduled.
aws_states_service_integrations_started	ServiceIntegrationsStarted	Tracks service integrations that have started.
aws_states_service_integrations_succeeded	ServiceIntegrationsSucceeded	Monitors successful service integrations.
aws_states_service_integrations_timed_out	ServiceIntegrationsTimedOut	Measures service integrations that timed out.
aws_states_throttled_events	ThrottledEvents	Tracks the number of events that were throttled.

AWS/StorageGateway

Function: Hybrid cloud storage service connecting on-premises software appliances to AWS

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_storagegateway_info
aws_storagegateway_cache_free	CacheFree	Tracks the amount of free cache space in the gateway.
aws_storagegateway_cache_hit_percent	CacheHitPercent	Monitors the percentage of read operations served by the cache.
aws_storagegateway_cache_percent_dirty	CachePercentDirty	Measures the percentage of cache space that contains data that hasn’t been uploaded yet.
aws_storagegateway_cache_percent_used	CachePercentUsed	Tracks the percentage of used cache space.
aws_storagegateway_cache_used	CacheUsed	Measures the amount of cache space used.
aws_storagegateway_cloud_bytes_downloaded	CloudBytesDownloaded	Tracks the amount of data downloaded from AWS to the gateway.
aws_storagegateway_cloud_bytes_uploaded	CloudBytesUploaded	Measures the amount of data uploaded from the gateway to AWS.
aws_storagegateway_cloud_download_latency	CloudDownloadLatency	Tracks the latency experienced during downloads from AWS.
aws_storagegateway_queued_writes	QueuedWrites	Monitors the number of write operations queued in the gateway.
aws_storagegateway_read_bytes	ReadBytes	Tracks the amount of data read by the gateway.
aws_storagegateway_read_time	ReadTime	Measures the time spent on read operations.
aws_storagegateway_time_since_last_recovery_point	TimeSinceLastRecoveryPoint	Tracks the time since the last recovery point was created.
aws_storagegateway_total_cache_size	TotalCacheSize	Measures the total size of the cache.
aws_storagegateway_upload_buffer_free	UploadBufferFree	Tracks the amount of free space in the upload buffer.
aws_storagegateway_upload_buffer_percent_used	UploadBufferPercentUsed	Measures the percentage of the upload buffer that is used.
aws_storagegateway_upload_buffer_used	UploadBufferUsed	Monitors the amount of upload buffer space used.
aws_storagegateway_working_storage_free	WorkingStorageFree	Measures the amount of free working storage in the gateway.
aws_storagegateway_working_storage_percent_used	WorkingStoragePercentUsed	Tracks the percentage of working storage used.
aws_storagegateway_working_storage_used	WorkingStorageUsed	Monitors the amount of working storage used.
aws_storagegateway_write_bytes	WriteBytes	Tracks the amount of data written by the gateway.
aws_storagegateway_write_time	WriteTime	Tracks the time spent on write operations.

AWS/Timestream

Function: Managed time series database for IoT and operational applications

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_timestream_data_scanned_bytes	DataScannedBytes	Tracks the total amount of data scanned by Amazon Timestream during queries.
aws_timestream_successful_request_latency	SuccessfulRequestLatency	Measures the latency of successful requests sent to Amazon Timestream.
aws_timestream_system_errors	SystemErrors	Monitors the number of system errors occurring in Amazon Timestream.
aws_timestream_user_errors	UserErrors	Tracks the number of user-generated errors in Amazon Timestream, such as invalid queries.

AWS/TransitGateway

Function: Service for connecting VPCs and on-premises networks through a central hub

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_transitgateway_info
aws_transitgateway_bytes_in	BytesIn	Tracks the total number of bytes received by the Transit Gateway.
aws_transitgateway_bytes_out	BytesOut	Measures the total number of bytes sent from the Transit Gateway.
aws_transitgateway_packet_drop_count_blackhole	PacketDropCountBlackhole	Monitors the number of packets dropped due to blackholing (unreachable routes).
aws_transitgateway_packet_drop_count_no_route	PacketDropCountNoRoute	Tracks the number of packets dropped due to no matching route found.
aws_transitgateway_packets_in	PacketsIn	Measures the total number of packets received by the Transit Gateway.
aws_transitgateway_packets_out	PacketsOut	Tracks the total number of packets sent from the Transit Gateway.

AWS/TrustedAdvisor

Function: Provides real-time recommendations to improve AWS resource optimization and security. This service publishes metrics only in specific AWS regions, so any job configured with this service gathers data only from the us-east-1 region.

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_trustedadvisor_green_checks	GreenChecks	Tracks the number of Trusted Advisor checks in the green (optimal) status.
aws_trustedadvisor_red_checks	RedChecks	Measures the number of Trusted Advisor checks that indicate critical issues (red status).
aws_trustedadvisor_red_resources	RedResources	Tracks the number of resources flagged as critical or failing (red status).
aws_trustedadvisor_service_limit_usage	ServiceLimitUsage	Monitors the usage of service limits based on Trusted Advisor service limit checks.
aws_trustedadvisor_yellow_checks	YellowChecks	Measures the number of checks that show warnings (yellow status).
aws_trustedadvisor_yellow_resources	YellowResources	Tracks the number of resources flagged as warnings or requiring attention (yellow status).

AWS/Usage

Function: Tracks AWS service usage for cost monitoring and optimization

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_usage_call_count	CallCount	Tracks the number of API or service calls made.
aws_usage_resource_count	ResourceCount	Measures the number of resources in use or allocated in the AWS environment.

AWS/VPN

Function: Managed VPN service to securely connect on-premises networks to AWS

Scrape interval: 5 minutes

Includes: Out-of-the-box dashboard

Metric	Cloudwatch metric	Purpose
aws_vpn_info
aws_vpn_tunnel_data_in	TunnelDataIn	Monitors the amount of inbound data being transferred through the VPN tunnel. Helps track network traffic.
aws_vpn_tunnel_data_out	TunnelDataOut	Tracks the amount of outbound data being transferred through the VPN tunnel. Useful for bandwidth monitoring.
aws_vpn_tunnel_state	TunnelState	Monitors the current status of the VPN tunnel (e.g., up or down). Helps in identifying tunnel connectivity issues.

AWS/WAFV2

Function: Web application firewall to protect applications from common web exploits

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_wafv2_info
aws_wafv2_allowed_requests	AllowedRequests	Tracks the number of requests that are allowed by the WAF rules. Useful for monitoring legitimate traffic.
aws_wafv2_blocked_requests	BlockedRequests	Monitors the number of requests that are blocked by the WAF rules. Helps detect and prevent malicious traffic.
aws_wafv2_captcha_requests	CaptchaRequests	Tracks the number of requests that triggered a CAPTCHA challenge. Useful for tracking potential bot traffic.
aws_wafv2_captchas_attempted	CaptchasAttempted	Monitors the number of CAPTCHA challenges that were attempted by users. Indicates user engagement with challenges.
aws_wafv2_captchas_solved	CaptchasSolved	Tracks the number of CAPTCHA challenges successfully solved. Helps assess CAPTCHA effectiveness.
aws_wafv2_challenge_requests	ChallengeRequests	Monitors the number of requests that triggered additional security challenges. Useful for advanced threat detection.
aws_wafv2_counted_requests	CountedRequests	Tracks the number of requests counted for rule evaluation but not necessarily blocked or allowed.
aws_wafv2_passed_requests	PassedRequests	Monitors requests that passed through the challenge phase and were allowed access.
aws_wafv2_requests_with_valid_captcha_token	RequestsWithValidCaptchaToken	Tracks the number of requests with a valid CAPTCHA token. Useful for validating CAPTCHA implementation.
aws_wafv2_requests_with_valid_challenge_token	RequestsWithValidChallengeToken	Monitors the number of requests with valid security challenge tokens. Helps track successful security checks.

AWS/WorkSpaces

Function: Managed desktop virtualization service for delivering cloud-based desktops

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_workspaces_info
aws_workspaces_available	Available	Monitors the number of available WorkSpaces. Useful for tracking the availability of WorkSpaces for users.
aws_workspaces_connection_attempt	ConnectionAttempt	Tracks the number of connection attempts to WorkSpaces. Helps monitor user access and demand.
aws_workspaces_connection_failure	ConnectionFailure	Monitors the number of failed connection attempts. Useful for identifying connectivity issues or failures.
aws_workspaces_connection_success	ConnectionSuccess	Tracks the number of successful connections to WorkSpaces. Compare with ConnectionAttempt to gauge the success rate of user connections.
aws_workspaces_in_session_latency	InSessionLatency	Monitors the latency experienced by users during WorkSpaces sessions. Helps assess user experience quality.
aws_workspaces_maintenance	Maintenance	Tracks the number of WorkSpaces under maintenance. Useful for understanding maintenance impact on availability.
aws_workspaces_session_disconnect	SessionDisconnect	Monitors the number of session disconnections. Helps detect connectivity issues or user-initiated disconnects.
aws_workspaces_session_launch_time	SessionLaunchTime	Tracks the time taken to launch a WorkSpaces session. Useful for assessing the performance of WorkSpaces launches.
aws_workspaces_stopped	Stopped	Monitors the number of WorkSpaces that are in the stopped state. Helps track WorkSpaces that are not running.
aws_workspaces_unhealthy	Unhealthy	Tracks the number of unhealthy WorkSpaces. Useful for identifying potential issues with WorkSpaces health.
aws_workspaces_user_connected	UserConnected	Monitors the number of users currently connected to WorkSpaces. Helps measure active user engagement.
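
ConnectionSuccess and ConnectionAttempt are counts, so a success rate has to be derived from the two. A minimal sketch, assuming you have already retrieved the Sum statistic of both metrics over the same time window; the function name connection_success_rate is hypothetical:

```python
from typing import Optional


def connection_success_rate(successes: float, attempts: float) -> Optional[float]:
    """Return successes / attempts as a percentage.

    Returns None when there were no attempts, so that an idle window is not
    reported as a 0% success rate.
    """
    if attempts <= 0:
        return None
    return 100.0 * successes / attempts


# Example: 47 successful connections out of 50 attempts in the window.
print(connection_success_rate(47, 50))  # 94.0
```

Returning None for an empty window is a design choice in this sketch; it keeps periods without connection attempts from looking like outages.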

AmazonMWAA

Function: Managed service for Apache Airflow workflows in the cloud

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_amazonmwaa_info
aws_amazonmwaa_collect_dbdags	CollectDBDags	Monitors how often database DAGs are collected.
aws_amazonmwaa_critical_section_busy	CriticalSectionBusy	Counts the times a scheduler process tried to acquire the critical section lock and found it held by another process.
aws_amazonmwaa_critical_section_duration	CriticalSectionDuration	Measures the duration for which critical sections remain busy.
aws_amazonmwaa_critical_section_query_duration	CriticalSectionQueryDuration	Monitors the time spent querying within critical sections.
aws_amazonmwaa_dagdependency_check	DAGDependencyCheck	Monitors dependency checks between DAGs.
aws_amazonmwaa_dagduration_failed	DAGDurationFailed	Tracks the duration of failed DAG runs.
aws_amazonmwaa_dagduration_success	DAGDurationSuccess	Tracks the duration of successful DAG runs.
aws_amazonmwaa_dagfile_processing_last_duration	DAGFileProcessingLastDuration	Measures the last processing time for DAG files.
aws_amazonmwaa_dagfile_processing_last_run_seconds_ago	DAGFileProcessingLastRunSecondsAgo	Tracks the time since the last DAG file processing run.
aws_amazonmwaa_dagfile_refresh_error	DAGFileRefreshError	Monitors errors in refreshing DAG files.
aws_amazonmwaa_dagschedule_delay	DAGScheduleDelay	Monitors delays in DAG scheduling.
aws_amazonmwaa_dag_bag_size	DagBagSize	Tracks the size of the DAG bag.
aws_amazonmwaa_dag_callback_exceptions	DagCallbackExceptions	Monitors exceptions occurring in DAG callbacks.
aws_amazonmwaa_exception_failures	ExceptionFailures	Tracks the number of exception failures.
aws_amazonmwaa_executed_tasks	ExecutedTasks	Tracks the total number of executed tasks.
aws_amazonmwaa_failed_celery_task_execution	FailedCeleryTaskExecution	Monitors failed task executions in Celery.
aws_amazonmwaa_failed_slacallback	FailedSLACallback	Tracks failures in SLA callbacks.
aws_amazonmwaa_failed_slaemail_attempts	FailedSLAEmailAttempts	Monitors failed attempts to send SLA emails.
aws_amazonmwaa_file_path_queue_update_count	FilePathQueueUpdateCount	Tracks the number of file path queue updates.
aws_amazonmwaa_first_task_scheduling_delay	FirstTaskSchedulingDelay	Measures the delay in scheduling the first task.
aws_amazonmwaa_import_errors	ImportErrors	Monitors errors encountered during imports.
aws_amazonmwaa_infra_failures	InfraFailures	Tracks infrastructure failures in the environment.
aws_amazonmwaa_job_end	JobEnd	Monitors the number of jobs completed.
aws_amazonmwaa_job_heartbeat_failure	JobHeartbeatFailure	Tracks heartbeat failures for jobs.
aws_amazonmwaa_job_start	JobStart	Monitors the number of jobs started.
aws_amazonmwaa_loaded_tasks	LoadedTasks	Tracks the number of tasks loaded in the environment.
aws_amazonmwaa_manager_stalls	ManagerStalls	Monitors the number of times the manager process stalls.
aws_amazonmwaa_open_slots	OpenSlots	Tracks the number of open task slots.
aws_amazonmwaa_operator_failures	OperatorFailures	Tracks the number of operator task failures.
aws_amazonmwaa_operator_successes	OperatorSuccesses	Tracks the number of operator task successes.
aws_amazonmwaa_orphaned	Orphaned	Monitors orphaned task instances.
aws_amazonmwaa_orphaned_tasks_adopted	OrphanedTasksAdopted	Tracks the number of orphaned tasks adopted.
aws_amazonmwaa_orphaned_tasks_cleared	OrphanedTasksCleared	Tracks the number of orphaned tasks cleared.
aws_amazonmwaa_other_callback_count	OtherCallbackCount	Tracks the number of other callbacks occurring in the environment.
aws_amazonmwaa_poked_exceptions	PokedExceptions	Monitors the number of exceptions in poked tasks.
aws_amazonmwaa_poked_success	PokedSuccess	Tracks successful pokes in tasks.
aws_amazonmwaa_poked_tasks	PokedTasks	Tracks the number of poked tasks.
aws_amazonmwaa_pool_deferred_slots	PoolDeferredSlots	Tracks deferred slots in task pools.
aws_amazonmwaa_pool_failures	PoolFailures	Monitors the number of task pool failures.
aws_amazonmwaa_pool_open_slots	PoolOpenSlots	Tracks the number of open slots in the task pool.
aws_amazonmwaa_pool_queued_slots	PoolQueuedSlots	Tracks the number of queued slots in the task pool.
aws_amazonmwaa_pool_running_slots	PoolRunningSlots	Tracks the number of running slots in the task pool.
aws_amazonmwaa_pool_starving_tasks	PoolStarvingTasks	Tracks tasks that are starving for resources in the task pool.
aws_amazonmwaa_processes	Processes	Tracks the number of processes running in the environment.
aws_amazonmwaa_processor_timeouts	ProcessorTimeouts	Monitors timeouts in processors.
aws_amazonmwaa_queued_tasks	QueuedTasks	Tracks the number of tasks in the queue.
aws_amazonmwaa_running_tasks	RunningTasks	Tracks the number of running tasks in the environment.
aws_amazonmwaa_slamissed	SLAMissed	Tracks the number of SLA misses in tasks.
aws_amazonmwaa_scheduler_heartbeat	SchedulerHeartbeat	Monitors the health of the scheduler through its heartbeat.
aws_amazonmwaa_scheduler_loop_duration	SchedulerLoopDuration	Measures the duration of scheduler loops.
aws_amazonmwaa_sla_callback_count	SlaCallbackCount	Tracks the number of SLA callbacks made.
aws_amazonmwaa_started_task_instances	StartedTaskInstances	Monitors the number of started task instances.
aws_amazonmwaa_task_instance_created_using_operator	TaskInstanceCreatedUsingOperator	Tracks the number of task instances created using an operator.
aws_amazonmwaa_task_instance_duration	TaskInstanceDuration	Monitors the duration of task instances.
aws_amazonmwaa_task_instance_failures	TaskInstanceFailures	Tracks the number of task instance failures.
aws_amazonmwaa_task_instance_finished	TaskInstanceFinished	Monitors the number of task instances that have finished.
aws_amazonmwaa_task_instance_previously_succeeded	TaskInstancePreviouslySucceeded	Tracks the number of task instances that have previously succeeded.
aws_amazonmwaa_task_instance_queued_duration	TaskInstanceQueuedDuration	Measures the time task instances spend in the queue before execution.
aws_amazonmwaa_task_instance_scheduled_duration	TaskInstanceScheduledDuration	Tracks how long task instances spend in the scheduled state.
aws_amazonmwaa_task_instance_successes	TaskInstanceSuccesses	Tracks the number of successful task instances.
aws_amazonmwaa_task_removed_from_dag	TaskRemovedFromDAG	Monitors tasks that were removed from the DAG.
aws_amazonmwaa_task_restored_to_dag	TaskRestoredToDAG	Tracks tasks that were restored to the DAG.
aws_amazonmwaa_task_timeout_error	TaskTimeoutError	Monitors timeout errors in tasks.
aws_amazonmwaa_tasks_executable	TasksExecutable	Tracks the number of executable tasks.
aws_amazonmwaa_tasks_killed_externally	TasksKilledExternally	Tracks tasks that were killed externally.
aws_amazonmwaa_tasks_pending	TasksPending	Monitors pending tasks.
aws_amazonmwaa_tasks_running	TasksRunning	Tracks the number of tasks currently running.
aws_amazonmwaa_tasks_starving	TasksStarving	Tracks the number of tasks starving for resources.
aws_amazonmwaa_tasks_without_dag_run	TasksWithoutDagRun	Tracks tasks that are not associated with any DAG run.
aws_amazonmwaa_total_parse_time	TotalParseTime	Measures the total time spent parsing DAG files.
aws_amazonmwaa_trigger_heartbeat	TriggerHeartbeat	Tracks the heartbeat of task triggers.
aws_amazonmwaa_triggered_dag_runs	TriggeredDagRuns	Monitors the number of DAG runs triggered.
aws_amazonmwaa_triggers_blocked_main_thread	TriggersBlockedMainThread	Tracks the number of triggers that block the main thread.
aws_amazonmwaa_triggers_failed	TriggersFailed	Monitors failed task triggers.
aws_amazonmwaa_triggers_running	TriggersRunning	Tracks the number of running task triggers.
aws_amazonmwaa_triggers_succeeded	TriggersSucceeded	Monitors successful task triggers.
aws_amazonmwaa_updates	Updates	Tracks the number of updates made to DAGs and other configurations.
aws_amazonmwaa_zombies_killed	ZombiesKilled	Monitors the number of zombie tasks killed in the environment.

ECS/ContainerInsights

Function: Provides monitoring and insights for ECS clusters, tasks, and containers

Scrape interval: 5 minutes

Metric	Cloudwatch metric	Purpose
aws_ecs_containerinsights_info
aws_ecs_containerinsights_container_instance_count	ContainerInstanceCount	Tracks the number of container instances in a cluster.
aws_ecs_containerinsights_cpu_reserved	CpuReserved	Monitors the amount of CPU reserved for tasks.
aws_ecs_containerinsights_cpu_utilized	CpuUtilized	Tracks the CPU utilization of running tasks.
aws_ecs_containerinsights_deployment_count	DeploymentCount	Measures the number of service deployments.
aws_ecs_containerinsights_desired_task_count	DesiredTaskCount	Monitors the desired number of running tasks in a service.
aws_ecs_containerinsights_ebsfilesystem_size	EBSFilesystemSize	Tracks the size of the EBS filesystem attached to the ECS instance.
aws_ecs_containerinsights_ebsfilesystem_utilized	EBSFilesystemUtilized	Monitors the utilized space in the EBS filesystem.
aws_ecs_containerinsights_ephemeral_storage_reserved	EphemeralStorageReserved	Measures the amount of reserved ephemeral storage for tasks.
aws_ecs_containerinsights_ephemeral_storage_utilized	EphemeralStorageUtilized	Tracks the ephemeral storage utilized by tasks.
aws_ecs_containerinsights_memory_reserved	MemoryReserved	Monitors the amount of memory reserved for tasks in ECS.
aws_ecs_containerinsights_memory_utilized	MemoryUtilized	Measures the memory utilized by tasks.
aws_ecs_containerinsights_network_rx_bytes	NetworkRxBytes	Tracks the number of bytes received by the network interfaces on the instance.
aws_ecs_containerinsights_network_tx_bytes	NetworkTxBytes	Monitors the number of bytes transmitted from the network interfaces on the instance.
aws_ecs_containerinsights_pending_task_count	PendingTaskCount	Monitors the number of tasks that are in the pending state in the service.
aws_ecs_containerinsights_running_task_count	RunningTaskCount	Tracks the number of running tasks in the service.
aws_ecs_containerinsights_service_count	ServiceCount	Monitors the number of services running in the cluster.
aws_ecs_containerinsights_storage_read_bytes	StorageReadBytes	Tracks the number of bytes read from the storage attached to the ECS instance.
aws_ecs_containerinsights_storage_write_bytes	StorageWriteBytes	Measures the number of bytes written to storage.
aws_ecs_containerinsights_task_count	TaskCount	Monitors the total number of tasks running in the ECS cluster.
aws_ecs_containerinsights_task_set_count	TaskSetCount	Measures the number of task sets in a service.
aws_ecs_containerinsights_instance_cpu_limit	instance_cpu_limit	Tracks the total CPU limit configured for the instance.
aws_ecs_containerinsights_instance_cpu_reserved_capacity	instance_cpu_reserved_capacity	Measures the reserved CPU capacity on the instance.
aws_ecs_containerinsights_instance_cpu_usage_total	instance_cpu_usage_total	Tracks the total CPU usage across all tasks on the instance.
aws_ecs_containerinsights_instance_cpu_utilization	instance_cpu_utilization	Monitors the percentage of CPU utilization on the ECS instance.
aws_ecs_containerinsights_instance_filesystem_utilization	instance_filesystem_utilization	Tracks the utilization of the filesystem attached to the ECS instance.
aws_ecs_containerinsights_instance_memory_limit	instance_memory_limit	Measures the total memory limit configured for the instance.
aws_ecs_containerinsights_instance_memory_reserved_capacity	instance_memory_reserved_capacity	Tracks the reserved memory capacity on the instance.
aws_ecs_containerinsights_instance_memory_utilization	instance_memory_utilization	Monitors the percentage of memory utilization on the ECS instance.
aws_ecs_containerinsights_instance_memory_working_set	instance_memory_working_set	Measures the working set memory on the instance, which is the amount of memory actively used.
aws_ecs_containerinsights_instance_network_total_bytes	instance_network_total_bytes	Tracks the total number of bytes transferred (both received and transmitted) by the network interfaces.
aws_ecs_containerinsights_instance_number_of_running_tasks	instance_number_of_running_tasks	Monitors the total number of running tasks on the instance.
aws_ecs_containerinsights_instance_memory_utliization	instance_memory_utliization	Measures the memory utilization of the instance.
