Decoupling Applications: SQS, SNS, Kinesis, ActiveMQ

Section Introduction

• When we start deploying multiple applications, they will inevitably need to communicate with one another
• There are two patterns of application communication: synchronous (application to application) and asynchronous / event-based (application to queue to application)
• Synchronous communication between applications can be problematic if there are sudden spikes of traffic
• What if you need to suddenly encode 1,000 videos but usually it's 10?
• In that case, it's better to decouple your applications:
  • using SQS: queue model
  • using SNS: pub/sub model
  • using Kinesis: real-time streaming model
• These services can scale independently from our application!

(vocabulary note: a queue is a waiting line of messages; to poll is to ask the queue for messages)

SQS = Simple Queue Service

Amazon SQS – What's a queue?

Amazon SQS – Standard Queue

• Oldest offering (over 10 years old)
• Fully managed service, used to decouple applications
• Attributes:
  • Unlimited throughput, unlimited number of messages in queue
  • Default retention of messages: 4 days, maximum of 14 days
  • Low latency (<10 ms on publish and receive)
  • Limitation of 256 KB per message sent
• Can have duplicate messages (at least once delivery, occasionally)
• Can have out of order messages (best effort ordering)

SQS – Producing Messages

• Produced to SQS using the SDK (SendMessage API)
• The message is persisted in SQS until a consumer deletes it
• Message retention: default 4 days, up to 14 days
• Example: send an order to be processed
  • Order id
  • Customer id
  • Any attributes you want
• SQS standard: unlimited throughput
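The order example above can be sketched as a SendMessage request body. This is a minimal illustration of the request shape; the helper name and the order/customer fields are hypothetical, not part of any SDK:

```python
import json

def build_order_message(order_id, customer_id, **attrs):
    """Build a payload in the shape of the SQS SendMessage API.

    The message body carries the order itself; message attributes
    carry typed metadata alongside it. (Illustrative helper only.)
    """
    return {
        "MessageBody": json.dumps({"order_id": order_id, "customer_id": customer_id}),
        "MessageAttributes": {
            # SQS attributes are typed; strings use DataType "String"
            k: {"DataType": "String", "StringValue": str(v)}
            for k, v in attrs.items()
        },
    }

msg = build_order_message("o-123", "c-456", priority="high")
```

With the real SDK, this dict would be passed to the SendMessage call along with the queue URL.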

SQS – Consuming Messages

• Consumers (running on EC2 instances, servers, or AWS Lambda)…
• Poll SQS for messages (receive up to 10 messages at a time)
• Process the messages (example: insert the message into an RDS database)
• Delete the messages using the DeleteMessage API
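The receive/process/delete loop can be sketched with a toy in-memory queue (not real SQS; the class and its receipt handles are illustrative):

```python
class SimpleQueue:
    """Toy in-memory stand-in for an SQS queue (illustration only)."""

    def __init__(self):
        self.messages = {}   # receipt_handle -> message body
        self._next = 0

    def send(self, body):
        self._next += 1
        self.messages[f"rh-{self._next}"] = body

    def receive(self, max_messages=10):
        # Like SQS, a single poll returns at most 10 messages
        return list(self.messages.items())[:max_messages]

    def delete(self, receipt_handle):
        # A message stays in the queue until explicitly deleted
        del self.messages[receipt_handle]

q = SimpleQueue()
for i in range(15):
    q.send(f"order-{i}")

batch = q.receive()            # at most 10 messages per poll
for handle, body in batch:
    # process the message (e.g. insert into a database), then delete it
    q.delete(handle)
```

After one poll-and-delete cycle, 5 of the 15 messages remain for the next poll.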

SQS – Multiple EC2 Instances Consumers

• Consumers receive and process messages in parallel
• At least once delivery
• Best-effort message ordering
• Consumers delete messages after processing them
• We can scale consumers horizontally to improve throughput of processing

Amazon SQS – Security

• Encryption:
  • In-flight encryption using HTTPS API
  • At-rest encryption using KMS keys
  • Client-side encryption if the client wants to perform encryption/decryption itself
• Access Controls: IAM policies to regulate access to the SQS API
• SQS Access Policies (similar to S3 bucket policies)
  • Useful for cross-account access to SQS queues
  • Useful for allowing other services (SNS, S3…) to write to an SQS queue

SQS Queue Access Policy

SQS – Message Visibility Timeout

• After a message is polled by a consumer, it becomes invisible to other consumers
• By default, the "message visibility timeout" is 30 seconds
• That means the message has 30 seconds to be processed
• After the message visibility timeout is over, the message becomes "visible" in SQS again

(note: once a consumer has received a message, other consumers cannot see it for the duration of the visibility timeout; after the timeout expires they can see it again, provided the first consumer has not deleted it)

• If a message is not processed within the visibility timeout, it will be processed twice
• A consumer could call the ChangeMessageVisibility API to get more time
• If the visibility timeout is high (hours) and the consumer crashes, re-processing will take time
• If the visibility timeout is too low (seconds), we may get duplicates
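The visibility behavior described above can be sketched with a toy model (not real SQS; timestamps are passed in explicitly so the example is deterministic):

```python
class VisibilityQueue:
    """Toy illustration of the SQS visibility timeout (not real SQS)."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._invisible_until = {}   # message -> timestamp when it reappears
        self._messages = []

    def send(self, body):
        self._messages.append(body)

    def receive(self, now):
        for m in self._messages:
            if self._invisible_until.get(m, 0) <= now:
                # Receiving a message hides it from other consumers
                # for the duration of the visibility timeout
                self._invisible_until[m] = now + self.visibility_timeout
                return m
        return None   # everything is currently invisible

q = VisibilityQueue(visibility_timeout=30)
q.send("msg-1")
first = q.receive(now=0)     # consumer A gets the message
second = q.receive(now=10)   # consumer B sees nothing: still invisible
third = q.receive(now=31)    # timeout expired, the message reappears
```

This is exactly why a message that isn't deleted in time gets processed twice.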

Amazon SQS – Dead Letter Queue

• If a consumer fails to process a message within the Visibility Timeout… the message goes back to the queue!
• We can set a threshold for how many times a message can go back to the queue
• After the MaximumReceives threshold is exceeded, the message goes into a dead letter queue (DLQ)

(note: we decide how many times a message can be retried; if, say, it has been received 3 times without being processed, it goes to the dead letter queue; you must enable this on the queue you create, by designating another queue as its DLQ)

• Useful for debugging!
• Make sure to process the messages in the DLQ before they expire:
  • Good to set a retention of 14 days in the DLQ
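The MaximumReceives threshold can be sketched with a toy model (illustration only, not the SQS implementation):

```python
class DLQSimulator:
    """Toy model of the MaximumReceives threshold for a dead letter queue."""

    def __init__(self, max_receives):
        self.max_receives = max_receives
        self.receive_count = {}
        self.dlq = []

    def on_processing_failure(self, message):
        """The message returns to the queue; past the threshold it is
        moved to the DLQ instead."""
        self.receive_count[message] = self.receive_count.get(message, 0) + 1
        if self.receive_count[message] >= self.max_receives:
            self.dlq.append(message)
            return "moved-to-dlq"
        return "back-to-queue"

sim = DLQSimulator(max_receives=3)
# a "poison pill" message that fails processing three times in a row
results = [sim.on_processing_failure("poison-pill") for _ in range(3)]
```

The first two failures put the message back in the queue; the third moves it to the DLQ for later inspection.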

SQS DLQ – Redrive to Source

• Feature to help consume messages in the DLQ to understand what is wrong with them
• When our code is fixed, we can redrive the messages from the DLQ back into the source queue (or any other queue) in batches without writing custom code

Amazon SQS – Delay Queue

(note: used to make a message wait a number of seconds or minutes in SQS after the producer sends it, before consumers can see it)

• Delay a message (consumers don't see it immediately) up to 15 minutes
• Default is 0 seconds (message is available right away)
• Can set a default at the queue level
• Can override the default on send using the DelaySeconds parameter
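The queue-level default and the per-message DelaySeconds override can be sketched as follows (toy model, deterministic timestamps passed in by hand):

```python
class DelayQueue:
    """Toy model of SQS DelaySeconds (per-message override of a queue default)."""

    def __init__(self, default_delay=0):
        self.default_delay = default_delay
        self._items = []   # list of (visible_at, body)

    def send(self, body, now, delay_seconds=None):
        # DelaySeconds on the message overrides the queue default
        delay = self.default_delay if delay_seconds is None else delay_seconds
        self._items.append((now + delay, body))

    def receive(self, now):
        # Only messages whose delay has elapsed are visible
        return [body for visible_at, body in self._items if visible_at <= now]

q = DelayQueue(default_delay=60)
q.send("a", now=0)                    # uses the queue default (60 s delay)
q.send("b", now=0, delay_seconds=0)   # per-message override: visible at once
```

Immediately after sending, only "b" is visible; "a" appears 60 seconds later.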

Amazon SQS – Long Polling

(note: used to make a poll request wait for messages; a single request can last up to 20 seconds)

• When a consumer requests messages from the queue, it can optionally "wait" for messages to arrive if there are none in the queue
• This is called Long Polling
• Long Polling decreases the number of API calls made to SQS while increasing the efficiency and reducing the latency of your application
• The wait time can be between 1 and 20 seconds (20 seconds preferable)
• Long Polling is preferable to Short Polling
• Long Polling can be enabled at the queue level or at the API level using WaitTimeSeconds
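At the API level, long polling is just the WaitTimeSeconds parameter on the ReceiveMessage call. A minimal sketch of the request shape (the helper and the queue URL are illustrative):

```python
def build_receive_request(queue_url, long_polling=True):
    """Sketch of ReceiveMessage parameters.

    WaitTimeSeconds > 0 enables long polling; 0 means short polling
    (the call returns immediately even when the queue is empty).
    """
    req = {"QueueUrl": queue_url, "MaxNumberOfMessages": 10}
    req["WaitTimeSeconds"] = 20 if long_polling else 0
    return req

# hypothetical queue URL for illustration
req = build_receive_request("https://sqs.us-east-1.amazonaws.com/123456789012/my-queue")
```

With the real SDK, this dict would be the keyword arguments to the ReceiveMessage call.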

SQS – Request-Response Systems

• To implement this pattern: use the SQS Temporary Queue Client
• It leverages virtual queues instead of creating / deleting SQS queues (cost-effective)

Amazon SQS – FIFO Queue

• FIFO = First In First Out (ordering of messages in the queue)
• Limited throughput: 300 msg/s without batching, 3,000 msg/s with batching
• Exactly-once send capability (by removing duplicates)
• Messages are processed in order by the consumer
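The two FIFO guarantees, ordering and deduplication, can be sketched with a toy queue (illustration only; real SQS FIFO deduplicates within a 5-minute window):

```python
class FifoQueue:
    """Toy FIFO queue: strict ordering plus deduplication by ID."""

    def __init__(self):
        self._seen = set()      # deduplication IDs already accepted
        self._messages = []

    def send(self, body, dedup_id):
        if dedup_id in self._seen:
            return False        # duplicate send is silently dropped
        self._seen.add(dedup_id)
        self._messages.append(body)
        return True

    def receive_all(self):
        # Messages come out in exactly the order they went in
        return list(self._messages)

q = FifoQueue()
q.send("first", dedup_id="1")
q.send("second", dedup_id="2")
q.send("first-retry", dedup_id="1")   # same dedup ID: rejected as a duplicate
```

The retried send is dropped, so the consumer sees each message exactly once and in order.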

Amazon SNS

• What if you want to send one message to many receivers?
• The "event producer" only sends messages to one SNS topic
• As many "event receivers" (subscriptions) as we want listen to the SNS topic notifications
• Each subscriber to the topic will get all the messages (note: new feature to filter messages)
• Up to 12,500,000 subscriptions per topic
• 100,000 topics limit

SNS integrates with a lot of AWS services

• Many AWS services can send data directly to SNS for notifications

Amazon SNS – How to publish

• Topic Publish (using the SDK)
  • Create a topic
  • Create a subscription (or many)
  • Publish to the topic
• Direct Publish (for mobile apps SDK)
  • Create a platform application
  • Create a platform endpoint
  • Publish to the platform endpoint
  • Works with Google GCM, Apple APNS, Amazon ADM…
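The pub/sub model behind Topic Publish can be sketched with a toy topic (illustration only; real SNS subscribers are endpoints such as SQS queues, Lambda functions, or emails, not Python callbacks):

```python
class Topic:
    """Toy SNS-style topic: every subscriber gets every published message."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        # One publish fans out to all current subscriptions
        for deliver in self.subscribers:
            deliver(message)

received_a, received_b = [], []
topic = Topic()
topic.subscribe(received_a.append)
topic.subscribe(received_b.append)
topic.publish("order-created")
```

A single publish call reaches both subscribers, which is the core difference from an SQS queue, where each message goes to only one consumer.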

Amazon SNS – Security

• Encryption:
  • In-flight encryption using HTTPS API
  • At-rest encryption using KMS keys
  • Client-side encryption if the client wants to perform encryption/decryption itself
• Access Controls: IAM policies to regulate access to the SNS API
• SNS Access Policies (similar to S3 bucket policies)
  • Useful for cross-account access to SNS topics
  • Useful for allowing other services (S3…) to write to an SNS topic

SNS + SQS: Fan Out

(note: a message is published once to SNS and fanned out to multiple SQS queues, since SQS adds broader capabilities such as persistence and retries)

• Push once in SNS, receive in all SQS queues that are subscribers
• Fully decoupled, no data loss
• SQS allows for: data persistence, delayed processing and retries of work
• Ability to add more SQS subscribers over time
• Make sure your SQS queue access policy allows SNS to write
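The access-policy requirement in the last bullet typically looks like the following statement, built here as a Python dict. The ARNs are placeholders; the policy shape (allowing the SNS service principal to SendMessage, conditioned on the source topic) is the standard pattern:

```python
def sns_to_sqs_policy(queue_arn, topic_arn):
    """Sketch of an SQS access policy letting one SNS topic write to a queue."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "sns.amazonaws.com"},
            "Action": "sqs:SendMessage",
            "Resource": queue_arn,
            # Restrict the grant to messages coming from this specific topic
            "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
        }],
    }

policy = sns_to_sqs_policy(
    "arn:aws:sqs:us-east-1:123456789012:orders-queue",   # placeholder ARN
    "arn:aws:sns:us-east-1:123456789012:orders-topic",   # placeholder ARN
)
```

Without a statement like this, SNS deliveries to the queue are silently rejected and the fan-out appears to "lose" messages.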

Application: S3 Events to multiple queues

• For the same combination of event type (e.g. object create) and prefix (e.g. images/) you can only have one S3 Event rule
• If you want to send the same S3 event to many SQS queues, use fan-out

Application: SNS to Amazon S3 through Kinesis Data Firehose

• SNS can send to Kinesis, so we can have the following solution architecture: SNS topic → Kinesis Data Firehose → Amazon S3 (or any supported Firehose destination)

Amazon SNS – FIFO Topic

• FIFO = First In First Out (ordering of messages in the topic)
• Similar features as SQS FIFO:
  • Ordering by Message Group ID (all messages in the same group are ordered)
  • Deduplication using a Deduplication ID or Content Based Deduplication
• Can only have SQS FIFO queues as subscribers
• Limited throughput (same throughput as SQS FIFO)

SNS FIFO + SQS FIFO: Fan Out

• In case you need fan out + ordering + deduplication

SNS – Message Filtering

• JSON policy used to filter messages sent to an SNS topic's subscriptions
• If a subscription doesn't have a filter policy, it receives every message
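A filter policy match can be sketched as follows. This is a deliberately simplified evaluator (exact string matching only; real SNS filter policies also support prefixes, numeric ranges, and anything-but clauses):

```python
def matches_filter(filter_policy, message_attributes):
    """Toy evaluation of an SNS filter policy: every key in the policy
    must be present on the message with one of the allowed values."""
    return all(
        message_attributes.get(key) in allowed
        for key, allowed in filter_policy.items()
    )

# Hypothetical subscription filter: only order-placed / order-cancelled events
policy = {"order_state": ["placed", "cancelled"]}
```

A subscription with this policy receives a message with attribute `order_state=placed`, but not one with `order_state=shipped` or no `order_state` attribute at all.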

Kinesis Overview

• Makes it easy to collect, process, and analyze streaming data in real-time
• Ingest real-time data such as: application logs, metrics, website clickstreams, IoT telemetry data…
• Kinesis Data Streams: capture, process, and store data streams
• Kinesis Data Firehose: load data streams into AWS data stores
• Kinesis Data Analytics: analyze data streams with SQL or Apache Flink
• Kinesis Video Streams: capture, process, and store video streams

Kinesis Data Streams

• Retention between 1 day and 365 days
• Ability to reprocess (replay) data
• Once data is inserted in Kinesis, it can't be deleted (immutability)
• Data that shares the same partition key goes to the same shard (ordering)
• Producers: AWS SDK, Kinesis Producer Library (KPL), Kinesis Agent
• Consumers:
  • Write your own: Kinesis Client Library (KCL), AWS SDK
  • Managed: AWS Lambda, Kinesis Data Firehose, Kinesis Data Analytics

Kinesis Data Streams – Capacity Modes

• Provisioned mode:
  • You choose the number of shards provisioned, scale manually or using the API
  • Each shard gets 1 MB/s in (or 1,000 records per second)
  • Each shard gets 2 MB/s out (classic or enhanced fan-out consumer)
  • You pay per shard provisioned per hour
• On-demand mode:
  • No need to provision or manage the capacity
  • Default capacity provisioned (4 MB/s in or 4,000 records per second)
  • Scales automatically based on the observed throughput peak during the last 30 days
  • Pay per stream per hour & data in/out per GB
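The provisioned-mode arithmetic above is worth making concrete: aggregate capacity scales linearly with the shard count. A small sketch (the helper is illustrative, but the per-shard figures are the ones stated above):

```python
def provisioned_capacity(shards):
    """Aggregate Kinesis provisioned-mode capacity:
    1 MB/s in (or 1,000 records/s) and 2 MB/s out per shard."""
    return {
        "in_mb_per_s": shards * 1,
        "records_in_per_s": shards * 1000,
        "out_mb_per_s": shards * 2,
    }

cap = provisioned_capacity(5)   # e.g. a 5-shard stream
```

A 5-shard stream therefore ingests up to 5 MB/s (5,000 records/s) and serves up to 10 MB/s to classic consumers; scaling throughput means splitting or merging shards.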

Kinesis Data Firehose

• Fully managed service, no administration, automatic scaling, serverless
• Destinations:
  • AWS: Amazon Redshift / Amazon S3 / Amazon OpenSearch (formerly ElasticSearch)
  • 3rd-party partners: Splunk / MongoDB / Datadog / New Relic / …
  • Custom: send to any HTTP endpoint
• Pay for data going through Firehose
• Near real time:
  • 60 seconds latency minimum for non-full batches
  • Or minimum 32 MB of data at a time
• Supports many data formats, conversions, transformations, compression
• Supports custom data transformations using AWS Lambda
• Can send failed or all data to a backup S3 bucket

Kinesis Data Streams vs Firehose

Kinesis Data Streams:
• Streaming service for ingest at scale
• Write custom code (producer / consumer)
• Real-time (~200 ms)
• Manage scaling (shard splitting / merging)
• Data storage for 1 to 365 days
• Supports replay capability

Kinesis Data Firehose:
• Load streaming data into S3 / Redshift / ES / 3rd party / custom HTTP
• Fully managed
• Near real-time (buffer time min. 60 sec)
• Automatic scaling
• No data storage
• Doesn't support replay capability

Kinesis Data Analytics (SQL application)

• Perform real-time analytics on Kinesis streams using SQL
• Fully managed, no servers to provision
• Automatic scaling
• Real-time analytics
• Pay for actual consumption rate
• Can create streams out of the real-time queries
• Use cases:
  • Time-series analytics
  • Real-time dashboards
  • Real-time metrics

Ordering data into Kinesis

• Imagine you have 100 trucks (truck_1, truck_2, … truck_100) on the road sending their GPS positions regularly into AWS
• You want to consume the data in order for each truck, so that you can track their movement accurately
• How should you send that data into Kinesis?
• Answer: send using a "Partition Key" value of the "truck_id"
• The same key will always go to the same shard
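The key property can be sketched as a hash of the partition key. This toy version maps the hash to a shard with a modulo (real Kinesis takes the MD5 of the key and compares it against each shard's hash-key range, but the guarantee is the same: identical keys always land on the same shard):

```python
import hashlib

def shard_for(partition_key, num_shards):
    """Toy mapping of a partition key to a shard index.

    Deterministic hashing means the same key is always routed to the
    same shard, which is what preserves per-key ordering."""
    digest = int(hashlib.md5(partition_key.encode()).hexdigest(), 16)
    return digest % num_shards

# every GPS record from truck_1 lands on the same shard
s1 = shard_for("truck_1", 5)
s2 = shard_for("truck_1", 5)
```

Because all of truck_1's records go to one shard, a consumer reading that shard sees them in order.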

Ordering data into SQS

• For SQS standard, there is no ordering
• For SQS FIFO, if you don't use a Group ID, messages are consumed in the order they are sent, with only one consumer
• You want to scale the number of consumers, but you want messages to be "grouped" when they are related to each other
• Then you use a Group ID (similar to the Partition Key in Kinesis)
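The Group ID idea can be sketched as follows (toy model; in real SQS FIFO the queue itself guarantees that only one consumer at a time processes a given group):

```python
from collections import defaultdict

def assign_to_groups(messages):
    """Toy model of SQS FIFO Group IDs: messages sharing a group ID stay
    in order within that group; different groups can be consumed in
    parallel by different consumers."""
    per_group = defaultdict(list)
    for group_id, body in messages:
        per_group[group_id].append(body)   # append preserves send order
    return dict(per_group)

# interleaved sends from two trucks, using truck_id as the Group ID
msgs = [("truck_1", "pos-a"), ("truck_2", "pos-x"), ("truck_1", "pos-b")]
groups = assign_to_groups(msgs)
```

Each truck's positions stay ordered within its group, while the two groups can be handled by two consumers in parallel.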

Kinesis vs SQS ordering

• Let's assume 100 trucks, 5 Kinesis shards, 1 SQS FIFO queue
• Kinesis Data Streams:
  • On average you'll have 20 trucks per shard
  • Trucks will have their data ordered within each shard
  • The maximum number of consumers in parallel we can have is 5
  • Can receive up to 5 MB/s of data
• SQS FIFO:
  • You only have one SQS FIFO queue
  • You will have 100 Group IDs
  • You can have up to 100 consumers (due to the 100 Group IDs)
  • You can have up to 300 messages per second (or 3,000 if using batching)

SQS vs SNS vs Kinesis

SQS:
• Consumers "pull data"
• Data is deleted after being consumed
• Can have as many workers (consumers) as we want
• No need to provision throughput
• Ordering guarantees only on FIFO queues
• Individual message delay capability

SNS:
• Push data to many subscribers
• Up to 12,500,000 subscribers
• Data is not persisted (lost if not delivered)
• Pub/Sub
• Up to 100,000 topics
• No need to provision throughput
• Integrates with SQS for the fan-out architecture pattern
• FIFO capability for SQS FIFO

Kinesis:
• Standard: pull data
  • 2 MB/s per shard
• Enhanced fan-out: push data
  • 2 MB/s per shard per consumer
• Possibility to replay data
• Meant for real-time big data, analytics and ETL
• Ordering at the shard level
• Data expires after X days
• Provisioned mode or on-demand capacity mode

Amazon MQ

• SQS and SNS are "cloud-native" services, and they use proprietary protocols from AWS
• Traditional applications running on-premises may use open protocols such as: MQTT, AMQP, STOMP, OpenWire, WSS
• When migrating to the cloud, instead of re-engineering the application to use SQS and SNS, we can use Amazon MQ
• Amazon MQ = managed Apache ActiveMQ
• Amazon MQ doesn't "scale" as much as SQS / SNS
• Amazon MQ runs on a dedicated machine, can run in HA with failover
• Amazon MQ has both queue features (~SQS) and topic features (~SNS)
