AWS Monitoring, Audit and Performance
AWS CloudWatch Metrics
• CloudWatch provides metrics for every service in AWS
• A metric is a variable to monitor (CPUUtilization, NetworkIn…)
• Metrics belong to namespaces
• A dimension is an attribute of a metric (instance ID, environment, etc.)
• Up to 30 dimensions per metric
• Metrics have timestamps
• You can create CloudWatch dashboards of metrics
EC2 Detailed monitoring
• By default, EC2 instance metrics are published every 5 minutes
• With detailed monitoring (for a cost), you get data every 1 minute
• Use detailed monitoring if you want your Auto Scaling group (ASG) to scale faster
• The AWS Free Tier allows 10 detailed monitoring metrics
• Note: EC2 memory usage is not pushed by default (it must be pushed from inside the instance as a custom metric)
CloudWatch Custom Metrics
• You can define and send your own custom metrics to CloudWatch
• Examples: memory (RAM) usage, disk space, number of logged-in users…
• Use the PutMetricData API call
• You can use dimensions (attributes) to segment metrics
  • Instance.id
  • Environment.name
• Metric resolution (StorageResolution API parameter, two possible values):
  • Standard: 1 minute (60 seconds)
  • High resolution: 1/5/10/30 second(s), at a higher cost
• Important: CloudWatch accepts metric data points up to two weeks in the past and two hours in the future (make sure your EC2 instance clock is configured correctly)
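A minimal sketch of what a PutMetricData request could look like, built as a plain Python dictionary so it runs without AWS credentials. The namespace, metric name and dimension values are hypothetical; the timestamp check mirrors the two-weeks / two-hours window noted above.

```python
from datetime import datetime, timedelta, timezone

# Window described in the notes: two weeks in the past, two hours in the future.
MAX_PAST = timedelta(weeks=2)
MAX_FUTURE = timedelta(hours=2)


def timestamp_is_accepted(ts, now):
    """Return True if `ts` falls inside the accepted window around `now`."""
    return now - MAX_PAST <= ts <= now + MAX_FUTURE


def build_metric_datum(name, value, dimensions, ts, high_resolution=False):
    """Build one entry for the MetricData list of a PutMetricData request."""
    return {
        "MetricName": name,
        "Dimensions": [{"Name": k, "Value": v} for k, v in dimensions.items()],
        "Timestamp": ts,
        "Value": float(value),
        "Unit": "Percent",
        # StorageResolution: 1 = high resolution, 60 = standard.
        "StorageResolution": 1 if high_resolution else 60,
    }


now = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
datum = build_metric_datum(
    "MemoryUtilization",  # hypothetical metric name
    63.5,
    {"InstanceId": "i-0123456789abcdef0", "Environment": "dev"},  # placeholder values
    now,
)
request = {"Namespace": "MyApp/Custom", "MetricData": [datum]}  # hypothetical namespace
```

With boto3 installed and credentials configured, the dictionary could be sent with `boto3.client("cloudwatch").put_metric_data(**request)`.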
CloudWatch Dashboards
• Great way to set up custom dashboards for quick access to key metrics and alarms
• Dashboards are global
• Dashboards can include graphs from different AWS accounts and regions
• You can change the time zone and time range of the dashboards
• You can set up automatic refresh (10s, 1m, 2m, 5m, 15m)
• Dashboards can be shared with people who don't have an AWS account (public, email address, third-party SSO provider through Amazon Cognito)
• Pricing:
  • 3 dashboards (up to 50 metrics) for free
  • $3/dashboard/month afterwards
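The dashboard pricing can be worked through with a few lines of arithmetic. This is a sketch using only the figures quoted in these notes; it ignores the 50-metric cap on the free dashboards, and actual prices may differ.

```python
FREE_DASHBOARDS = 3         # free allowance quoted in the notes
PRICE_PER_DASHBOARD = 3.00  # USD per dashboard per month, as quoted in the notes


def monthly_dashboard_cost(dashboard_count):
    """Estimated monthly cost in USD for a given number of dashboards."""
    billable = max(0, dashboard_count - FREE_DASHBOARDS)
    return billable * PRICE_PER_DASHBOARD
```

For example, 10 dashboards would leave 7 billable dashboards, or $21 per month under these assumptions.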

CloudWatch Logs
• Log groups: arbitrary name, usually representing an application
• Log streams: instances within an application / log files / containers
• Can define log expiration policies (never expire, 30 days, etc.)
• CloudWatch Logs can send logs to:
  • Amazon S3 (exports)
  • Kinesis Data Streams
  • Kinesis Data Firehose
  • AWS Lambda
  • Amazon OpenSearch Service (formerly Amazon Elasticsearch Service)
CloudWatch Logs - Sources
• SDK, CloudWatch Logs Agent, CloudWatch Unified Agent
• Elastic Beanstalk: collection of logs from the application
• ECS: collection from containers
• AWS Lambda: collection from function logs
• VPC Flow Logs: VPC-specific logs
• API Gateway
• CloudTrail, based on a filter
• Route 53: logs DNS queries
CloudWatch Logs Metric Filter & Insights
• CloudWatch Logs can use filter expressions
  • For example, find a specific IP inside a log
  • Or count occurrences of "ERROR" in your logs
• Metric filters can be used to trigger CloudWatch alarms
• CloudWatch Logs Insights can be used to query logs and add queries to CloudWatch dashboards
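To make the idea of a metric filter concrete, here is a small local emulation in plain Python. It is only an illustration of turning matching log lines into a number; it does not use the CloudWatch filter pattern syntax, and the sample log lines are made up.

```python
from collections import Counter


def count_matches(log_lines, term):
    """Count log lines containing `term` (case-sensitive substring match)."""
    return sum(1 for line in log_lines if term in line)


def errors_per_minute(events, term="ERROR"):
    """Turn (minute, message) log events into a per-minute count, i.e. a metric."""
    counts = Counter()
    for minute, message in events:
        if term in message:
            counts[minute] += 1
    return dict(counts)


# Made-up sample events: (minute, message)
sample = [
    ("12:00", "INFO request from 10.0.0.7 served"),
    ("12:00", "ERROR upstream timeout"),
    ("12:01", "ERROR upstream timeout"),
    ("12:01", "ERROR database unavailable"),
]
```

The per-minute counts play the role of the metric that an alarm could then watch.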

CloudWatch Logs - S3 Export
• Log data can take up to 12 hours to become available for export
• The API call is CreateExportTask
• Not near-real-time or real-time… use Logs Subscriptions instead



CloudWatch Logs for EC2
• By default, no logs from your EC2 machine go to CloudWatch
• You need to run a CloudWatch agent on EC2 to push the log files you want
• Make sure the IAM permissions are correct
• The CloudWatch log agent can be set up on-premises too
CloudWatch Logs Agent & Unified Agent
• For virtual servers (EC2 instances, on-premises servers…)
• CloudWatch Logs Agent
  • Old version of the agent
  • Can only send to CloudWatch Logs
• CloudWatch Unified Agent
  • Collects additional system-level metrics such as RAM, processes, etc.
  • Collects logs to send to CloudWatch Logs
  • Centralized configuration using SSM Parameter Store
CloudWatch Unified Agent - Metrics
• Collected directly on your Linux server / EC2 instance
  • CPU (active, guest, idle, system, user, steal)
  • Disk metrics (free, used, total), disk IO (writes, reads, bytes, IOPS)
  • RAM (free, inactive, used, total, cached)
  • Netstat (number of TCP and UDP connections, net packets, bytes)
  • Processes (total, dead, blocked, idle, running, sleep)
  • Swap space (free, used, used %)
• Reminder: out-of-the-box metrics for EC2 are disk, CPU, network (high level)
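A rough sketch of a Unified Agent configuration that collects a few of the metrics listed above plus one log file, generated from Python as JSON. The section and measurement names are written from memory of the agent's JSON configuration format and should be checked against the agent documentation; the namespace, file path and log group name are hypothetical.

```python
import json

agent_config = {
    "agent": {"metrics_collection_interval": 60},  # seconds
    "metrics": {
        "namespace": "MyApp/Agent",  # hypothetical namespace
        "metrics_collected": {
            "cpu": {"measurement": ["cpu_usage_idle", "cpu_usage_user"], "totalcpu": True},
            "mem": {"measurement": ["mem_used_percent"]},
            "disk": {"measurement": ["used_percent"], "resources": ["*"]},
            "swap": {"measurement": ["swap_used_percent"]},
        },
    },
    "logs": {
        "logs_collected": {
            "files": {
                "collect_list": [
                    {
                        "file_path": "/var/log/myapp/app.log",  # hypothetical path
                        "log_group_name": "myapp-logs",         # hypothetical log group
                        "log_stream_name": "{instance_id}",
                    }
                ]
            }
        }
    },
}

# Serialized form, e.g. for storing centrally in SSM Parameter Store.
config_json = json.dumps(agent_config, indent=2)
```

Keeping the configuration as one JSON document is what makes the centralized SSM Parameter Store setup practical: every server can load the same document.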
CloudWatch Alarms
• Alarms are used to trigger notifications for any metric
• Various options (sampling, %, max, min, etc.)
• Alarm states:
  • OK
  • INSUFFICIENT_DATA
  • ALARM
• Period:
  • Length of time in seconds to evaluate the metric
  • High-resolution custom metrics: 10 sec, 30 sec or multiples of 60 sec
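The three alarm states and the period rule can be illustrated with a simplified evaluator. This is a sketch, not how CloudWatch evaluates alarms internally: it assumes a "greater than threshold" comparison, requires every recent datapoint to breach, and treats any missing datapoint as insufficient data.

```python
def is_valid_period(seconds):
    """Periods named in the notes: 10 s, 30 s, or any multiple of 60 s."""
    return seconds in (10, 30) or (seconds > 0 and seconds % 60 == 0)


def alarm_state(datapoints, threshold, evaluation_periods):
    """Classify the most recent `evaluation_periods` datapoints.

    Simplified rule (an assumption of this sketch): ALARM only when every
    recent datapoint is above the threshold; a missing datapoint (None)
    or too little history gives INSUFFICIENT_DATA.
    """
    recent = datapoints[-evaluation_periods:]
    if len(recent) < evaluation_periods or any(p is None for p in recent):
        return "INSUFFICIENT_DATA"
    if all(p > threshold for p in recent):
        return "ALARM"
    return "OK"
```

For instance, with a CPU threshold of 80 and three evaluation periods, the series 40, 85, 90, 95 ends in ALARM, while 40, 85, 70, 95 stays OK because one of the last three datapoints is below the threshold.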
CloudWatch Alarm Targets
• Stop, terminate, reboot, or recover an EC2 instance
• Trigger an Auto Scaling action
• Send a notification to SNS (from which you can do pretty much anything)


CloudWatch Events
• Event pattern: intercept events from AWS services (sources)
  • Example sources: EC2 Instance Start, CodeBuild Failure, S3, Trusted Advisor
  • Can intercept any API call with CloudTrail integration
• Schedule or cron (example: create an event every 4 hours)
• A JSON payload is created from the event and passed to a target…
  • Compute: Lambda, Batch, ECS task
  • Integration: SQS, SNS, Kinesis Data Streams, Kinesis Data Firehose
  • Orchestration: Step Functions, CodePipeline, CodeBuild
  • Maintenance: SSM, EC2 Actions
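As an illustration of how an event pattern selects events, the sketch below implements a deliberately simplified matcher: each pattern field lists the allowed values, nested objects are matched recursively, and fields not named in the pattern are ignored. The real service supports more operators than this; the instance ID is a placeholder.

```python
def matches(pattern, event):
    """Return True if `event` satisfies the simplified `pattern`."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested object: match its fields recursively.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf: the event value must be one of the listed values.
            return False
    return True


# Pattern for "an EC2 instance entered the running state".
pattern = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["running"]},
}

# JSON payload as it might be passed to a target (values are placeholders).
event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "region": "us-east-1",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "running"},
}

# Schedule expression for the "every 4 hours" example.
SCHEDULE_EVERY_4_HOURS = "rate(4 hours)"
```

Here `matches(pattern, event)` is true; changing the event's state to "stopped" makes it false, so the target would not be invoked.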
Amazon EventBridge
• EventBridge is the next evolution of CloudWatch Events
• Default event bus: events generated by AWS services (CloudWatch Events)
• Partner event bus: receives events from SaaS services or applications (Zendesk, Datadog, Segment, Auth0…)
• Custom event buses: for your own applications
• Event buses can be accessed by other AWS accounts
• You can archive events (all or filtered) sent to an event bus (indefinitely or for a set period)
• Ability to replay archived events
• Rules: how to process the events (like CloudWatch Events)
Amazon EventBridge - Schema Registry
• EventBridge can analyze the events in your bus and infer the schema
• The Schema Registry lets you generate code for your application that knows in advance how data is structured in the event bus
• Schemas can be versioned

Amazon EventBridge - Resource-based Policy
• Manage permissions for a specific event bus
• Example: allow/deny events from another AWS account or AWS region
• Use case: aggregate all events from your AWS Organization in a single AWS account or AWS region
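A resource-based policy for the cross-account case could look roughly like the document built below. This is a sketch: the account IDs, region, bus name and statement ID are placeholders, and a real setup for an AWS Organization would likely use a condition rather than listing accounts one by one.

```python
import json


def allow_put_events(bus_arn, source_account_id):
    """Policy document letting one other account send events to the bus."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowAccountToPutEvents",  # hypothetical statement ID
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{source_account_id}:root"},
                "Action": "events:PutEvents",
                "Resource": bus_arn,
            }
        ],
    }


policy = allow_put_events(
    "arn:aws:events:us-east-1:123456789012:event-bus/central-event-bus",  # placeholder ARN
    "111122223333",  # placeholder source account
)
policy_json = json.dumps(policy, indent=2)
```

The policy is attached to the central event bus, so the decision about who may send events stays with the receiving account.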

Amazon EventBridge vs CloudWatch Events
• Amazon EventBridge builds upon and extends CloudWatch Events
• It uses the same service API and endpoint, and the same underlying service infrastructure
• EventBridge adds event buses for your custom applications and your third-party SaaS apps
• EventBridge has the Schema Registry capability
• EventBridge has a different name to mark the new capabilities
• Over time, the CloudWatch Events name will be replaced with EventBridge
AWS CloudTrail
• Provides governance, compliance and audit for your AWS account
• CloudTrail is enabled by default!
• Get a history of events / API calls made within your AWS account by:
  • Console
  • SDK
  • CLI
  • AWS services
• Can put logs from CloudTrail into CloudWatch Logs or S3
• A trail can be applied to all regions (default) or a single region
• If a resource is deleted in AWS, investigate CloudTrail first!

CloudTrail Events
• Management Events:
  • Operations that are performed on resources in your AWS account
  • Examples:
    • Configuring security (IAM AttachRolePolicy)
    • Configuring rules for routing data (Amazon EC2 CreateSubnet)
    • Setting up logging (AWS CloudTrail CreateTrail)
  • By default, trails are configured to log management events
  • Can separate Read Events (that don't modify resources) from Write Events (that may modify resources)
• Data Events:
  • By default, data events are not logged (because they are high-volume operations)
  • Amazon S3 object-level activity (e.g. GetObject, DeleteObject, PutObject): can separate Read and Write Events
  • AWS Lambda function execution activity (the Invoke API)
• CloudTrail Insights Events:
  • Covered in the CloudTrail Insights section
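The management/data and read/write splits can be illustrated on a few hand-written records. This sketch assumes each record carries `eventCategory` and `readOnly` fields alongside `eventSource` and `eventName`; the records are made up and heavily trimmed compared with real CloudTrail log entries.

```python
def classify(record):
    """Return (category, access) for one trimmed CloudTrail-style record."""
    category = record.get("eventCategory", "Management")
    access = "Read" if record.get("readOnly") else "Write"
    return category, access


def deletions(records):
    """Names of write events whose eventName starts with 'Delete'."""
    return [
        r["eventName"]
        for r in records
        if not r.get("readOnly") and r["eventName"].startswith("Delete")
    ]


# Made-up sample records
records = [
    {"eventSource": "iam.amazonaws.com", "eventName": "AttachRolePolicy",
     "readOnly": False, "eventCategory": "Management"},
    {"eventSource": "ec2.amazonaws.com", "eventName": "DescribeSubnets",
     "readOnly": True, "eventCategory": "Management"},
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject",
     "readOnly": True, "eventCategory": "Data"},
    {"eventSource": "s3.amazonaws.com", "eventName": "DeleteObject",
     "readOnly": False, "eventCategory": "Data"},
]
```

Filtering for `Delete*` write events is the kind of first pass one might do when investigating a resource that has disappeared.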
CloudTrail Insights
• Enable CloudTrail Insights to detect unusual activity in your account:
  • Inaccurate resource provisioning
  • Hitting service limits
  • Bursts of AWS IAM actions
  • Gaps in periodic maintenance activity
• CloudTrail Insights analyzes normal management events to create a baseline
• It then continuously analyzes write events to detect unusual patterns
  • Anomalies appear in the CloudTrail console
  • An event is sent to Amazon S3
  • An EventBridge event is generated (for automation needs)
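The baseline-then-detect idea can be sketched with basic statistics. The notes do not describe the model CloudTrail Insights actually uses, so the mean and standard deviation check below is a stand-in chosen for illustration; the hourly counts and the factor `k` are made up.

```python
from statistics import mean, pstdev


def build_baseline(counts):
    """Summarize normal activity as (average, spread) of write calls per hour."""
    return mean(counts), pstdev(counts)


def is_unusual(count, baseline, k=3.0):
    """Flag counts far from the baseline in either direction (bursts or gaps)."""
    avg, spread = baseline
    # Floor the spread at 1.0 so a very steady baseline does not flag tiny changes.
    return abs(count - avg) > k * max(spread, 1.0)


# Made-up hourly counts of write API calls during normal operation
normal_hours = [10, 12, 11, 9, 10, 11, 12, 10]
baseline = build_baseline(normal_hours)
```

Under these assumptions, an hour with 60 write calls (a burst) and an hour with 0 (a gap in periodic activity) are both flagged, while an hour with 11 is not.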


AWS Config
• Helps with auditing and recording compliance of your AWS resources
• Helps record configurations and changes over time
• Questions that can be solved by AWS Config:
  • Is there unrestricted SSH access to my security groups?
  • Do my buckets have any public access?
  • How has my ALB configuration changed over time?
• You can receive alerts (SNS notifications) for any changes
• AWS Config is a per-region service
• Can be aggregated across regions and accounts
• Possibility of storing the configuration data in S3 (analyzed by Athena)
Config Rules
• Can use AWS managed config rules (over 75)
• Can make custom config rules (must be defined in AWS Lambda)
  • Ex: evaluate if each EBS disk is of type gp2
  • Ex: evaluate if each EC2 instance is t2.micro
• Rules can be evaluated / triggered:
  • For each config change
  • And / or: at regular time intervals
• AWS Config Rules do not prevent actions from happening (no deny)
• Pricing: no free tier, $0.003 per configuration item recorded per region, $0.001 per config rule evaluation per region
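The gp2 example and the pricing line can be sketched as two small functions. The evaluation logic is the part a custom rule's Lambda function would contain; the shape of the configuration item (a `resourceType` plus a `configuration.volumeType` field) is an assumption of this sketch, and the prices are the ones quoted in these notes.

```python
def evaluate_volume(configuration_item, required_type="gp2"):
    """Return a compliance result for one EBS volume configuration item."""
    if configuration_item.get("resourceType") != "AWS::EC2::Volume":
        return "NOT_APPLICABLE"
    # Assumed location of the volume type inside the configuration item.
    volume_type = configuration_item.get("configuration", {}).get("volumeType")
    return "COMPLIANT" if volume_type == required_type else "NON_COMPLIANT"


def monthly_config_cost(items_recorded, rule_evaluations):
    """Estimated cost in USD for one region, using the prices quoted in the notes."""
    return items_recorded * 0.003 + rule_evaluations * 0.001
```

The rule only reports NON_COMPLIANT; it does not block the creation of a non-gp2 volume. As a cost example, 1,000 recorded configuration items and 5,000 rule evaluations in one region come to roughly $8.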



CloudWatch vs CloudTrail vs Config
• CloudWatch
  • Performance monitoring (metrics, CPU, network, etc.) & dashboards
  • Events & alerting
  • Log aggregation & analysis
• CloudTrail
  • Record API calls made within your account by everyone
  • Can define trails for specific resources
  • Global service
• Config
  • Record configuration changes
  • Evaluate resources against compliance rules
  • Get a timeline of changes and compliance
For an Elastic Load Balancer
• CloudWatch:
  • Monitor the incoming connections metric
  • Visualize error codes as a % over time
  • Make a dashboard to get an idea of your load balancer performance
• Config:
  • Track security group rules for the load balancer
  • Track configuration changes for the load balancer
  • Ensure an SSL certificate is always assigned to the load balancer (compliance)
• CloudTrail:
  • Track who made any changes to the load balancer with API calls