AWS Internals

The architecture, tradeoffs, and internal mechanics of Amazon Web Services

AWS is a collection of tightly engineered distributed systems, and understanding how they work underneath is the difference between using them and mastering them. The hypervisor that isolates your Lambda from someone else's, the Paxos-based replication that keeps DynamoDB replicas consistent, the consensus protocols inside control planes: these internals determine the behavior you actually experience, including latency tails, consistency guarantees, cold-start penalties, and availability boundaries.

How AWS Is Structured

Regions & Availability Zones

An AWS region is a geographic area containing multiple Availability Zones (AZs), typically three or more. Each AZ is one or more physically isolated data centers with independent power, cooling, and networking, connected to the other AZs in the region by low-latency links (single-digit milliseconds). You design for AZ failure by running across at least two; features like RDS Multi-AZ and multi-AZ EKS node groups do this for you once enabled. Regions are independent: data doesn't leave a region unless you explicitly move it (with some exceptions for global services like IAM and Route 53).

Edge Locations & Wavelength

CloudFront operates ~600 edge locations globally for content delivery, with Lambda@Edge and CloudFront Functions providing compute at the edge. Wavelength embeds AWS compute inside telecom carriers' facilities for ultra-low-latency mobile workloads. Local Zones place AWS resources (EC2, ECS, RDS) near population centers, extending a region's footprint without a full AZ. For data rather than compute, Aurora Global Database replicates across regions with lag that is typically under 1 second.

Shared Responsibility Model

AWS is responsible for the hypervisor, hardware, network fabric, and physical security of its infrastructure. You are responsible for everything above the hypervisor: AMI hardening, IAM policies, encryption key management, and application-level availability design. The "security of the cloud" vs "security in the cloud" distinction shapes every security decision on AWS, and the failures that make headlines are almost always in the latter.

AWS Architecture Diagram

Core Services Map

AWS has over 200 services. The ones that matter most for systems engineers form a layered stack:

Edge / Global

CloudFront: CDN + compute at edge (Lambda@Edge, CloudFront Functions)

Route 53: Managed DNS, latency-based routing, health checks, failover

Route 53 Resolver: Hybrid DNS, VPC DNS forwarding, resolver endpoints for on-prem

ACM: Managed TLS certificates, integrated with ALB, CloudFront, API Gateway

Application Tier

API Gateway: REST/HTTP/WebSocket APIs, throttling, request validation, authorizers

ELB / ALB: Layer 7 load balancing, path-based routing, target groups, TLS termination

Lambda: Event-driven functions, Firecracker isolation, per-invocation billing

SQS: Fully managed message queues, standard and FIFO, dead-letter queues

SNS: Pub/sub notifications, fan-out to multiple subscribers, SMS/email/SQS/mobile

EventBridge: Event bus, schema registry, scheduled rules, SaaS integrations

Compute

EC2: Bare-metal and virtual machines, Auto Scaling groups, Spot/Reserved/On-Demand pricing

ECS / Fargate: Container orchestration, Fargate = serverless containers (no EC2 management)

EKS: Managed Kubernetes, control plane HA, node groups, Karpenter autoscaling

Lambda: Serverless functions, not for long-running workloads (15 min maximum timeout)

AWS Batch: Batch job scheduling, managed compute environments, spot integration

Lightsail: Fixed-price VPS for simple workloads, not for production at scale

Storage

S3: Object storage, 11 9s durability, tiered storage classes, lifecycle policies

EBS: Block storage attached to EC2, gp3/io2 volumes, snapshots to S3

EFS: Elastic file system, NFS mount, multi-AZ, mountable from EC2, ECS, and Lambda

FSx: Managed Windows (SMB), Lustre, NetApp ONTAP, OpenZFS file systems

Instance Store: Ephemeral SSD directly attached, no redundancy, highest throughput

Data & Databases

RDS: Managed relational DBs: PostgreSQL, MySQL, MariaDB, Oracle, SQL Server, Aurora

Aurora: MySQL/PostgreSQL-compatible, distributed storage (quorum writes), 6 copies across 3 AZs

DynamoDB: Managed NoSQL, single-digit ms latency, on-demand or provisioned RCU/WCU

ElastiCache: Managed Redis and Memcached, read replicas, cluster mode, serverless Redis

OpenSearch: Managed Elasticsearch fork, full-text search, analytics dashboards

Neptune: Managed graph database, Gremlin/SPARQL/openCypher, multi-AZ

Timestream: Time-series database, serverless, ingestion from IoT/event sources

DocumentDB: MongoDB-compatible, managed, TTL on documents, change streams

Networking

VPC: Virtual Private Cloud, CIDR blocks, subnets (public/private), route tables, IGW/NAT GW

Security Groups: Stateful firewall at ENI level, allow rules only, default deny

NACLs: Stateless subnet-level rules, explicit allow + deny, evaluated before SG on inbound traffic

Direct Connect: Dedicated private connection from on-prem to AWS, 1/10/100 Gbps ports

VPN: Site-to-Site IPsec VPN, NAT traversal support, terminates on a virtual private gateway or Transit Gateway for hub-and-spoke

Transit Gateway: Hub for VPC-to-VPC and VPC-to-on-prem routing, scales to thousands of VPCs

PrivateLink: Private connectivity to AWS services and SaaS, no internet traversal

Security & Identity

IAM: Identity and access management, users/roles/policies, resource-based policies

KMS: Key management, envelope encryption, CMK aliases, BYOK, FIPS 140-2

Secrets Manager: Secure storage for DB credentials, API keys, rotation via Lambda or managed rotation

SSM Parameter Store: Hierarchical parameter storage, Standard (free) and Advanced (paid) tiers

CloudTrail: API call logging across all AWS services, S3 log delivery, CloudWatch Logs

GuardDuty: Threat detection, continuous monitoring of CloudTrail, VPC Flow Logs, DNS query logs, and EKS audit logs

Networking Fundamentals

AWS networking follows the classic hybrid cloud model — you get a virtual network that behaves like an on-premises data center, with some important differences.

VPCs, CIDR, and Subnets

A VPC is an isolated virtual network in an AWS region. You define the CIDR block (e.g. 10.0.0.0/16), which determines the private IP address space available. Subnets are subdivisions of that CIDR, associated with a single AZ. A common pattern:

# CIDR: 10.0.0.0/16 — gives you 65,536 addresses
# Public subnet (has route to Internet Gateway):
#   10.0.1.0/24 — AZ us-east-1a, for ALBs, NAT Gateways, Bastion hosts
#   10.0.2.0/24 — AZ us-east-1b
# Private subnet (no direct internet route):
#   10.0.10.0/24 — AZ us-east-1a, for application servers, databases
#   10.0.20.0/24 — AZ us-east-1b, for application servers, databases
#   10.0.30.0/24 — AZ us-east-1a, for data processing, ML training
#
# Rules:
#   - Never assign a public IP to a private subnet instance
#   - Private instances reach the internet via NAT Gateway in a public subnet
#   - RDS, ElastiCache, etc. live in private subnets with no direct internet access
#   - Split your CIDR so app, data, and DMZ tiers are separate
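The layout above can be sanity-checked before anything is created. The following is a minimal sketch using Python's standard `ipaddress` module; the subnet names and the `validate` helper are illustrative assumptions, not AWS APIs. It confirms the address count of the /16, checks that every subnet sits inside the VPC without overlapping another, and accounts for the five addresses AWS reserves in each subnet.

```python
import ipaddress

# The VPC CIDR and subnet layout from the plan above (names are illustrative).
vpc = ipaddress.ip_network("10.0.0.0/16")
subnets = {
    "public-1a": ipaddress.ip_network("10.0.1.0/24"),
    "public-1b": ipaddress.ip_network("10.0.2.0/24"),
    "private-1a": ipaddress.ip_network("10.0.10.0/24"),
    "private-1b": ipaddress.ip_network("10.0.20.0/24"),
    "data-1a": ipaddress.ip_network("10.0.30.0/24"),
}

# AWS reserves 5 addresses per subnet: the first four and the last one.
AWS_RESERVED_PER_SUBNET = 5


def usable_hosts(subnet):
    """Addresses that can actually be assigned to ENIs in an AWS subnet."""
    return subnet.num_addresses - AWS_RESERVED_PER_SUBNET


def validate(vpc, subnets):
    """Return a list of problems: subnets outside the VPC or overlapping each other."""
    problems = []
    items = list(subnets.items())
    for name, net in items:
        if not net.subnet_of(vpc):
            problems.append(f"{name} is outside {vpc}")
    for i, (name_a, net_a) in enumerate(items):
        for name_b, net_b in items[i + 1:]:
            if net_a.overlaps(net_b):
                problems.append(f"{name_a} overlaps {name_b}")
    return problems


if __name__ == "__main__":
    print(vpc.num_addresses)
    for name, net in subnets.items():
        print(name, net, usable_hosts(net))
    print(validate(vpc, subnets) or "layout OK")
```

Running a check like this in CI against your infrastructure-as-code inputs catches overlapping ranges early, which matters because a subnet's CIDR cannot be changed after creation.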

Route Tables, Internet Gateway, NAT Gateway

Component	Role	Placement	Cost
Internet Gateway (IGW)	Routes traffic between your VPC and the internet. Handles NAT from public IPs to instance ENIs. Horizontally scaled, redundant.	Attached to VPC (one per VPC)	Free
NAT Gateway	Allows private subnet instances to initiate outbound internet traffic (e.g. pulling packages, calling external APIs) without allowing inbound connections. Stateful.	Inside a public subnet, one per AZ	$0.045/GB processed + hourly fee
Egress-only Internet Gateway	The IPv6 counterpart to a NAT gateway (no address translation). Stateful; allows only outbound-initiated traffic from VPC to internet.	Attached to VPC	Free
Route Table	Determines where traffic goes. Each subnet must be associated with a route table. Local traffic (within VPC CIDR) is always local.	Associated with subnets, can have multiple subnets per table	Free

Security Groups vs NACLs

These are two distinct layers of the networking firewall stack. Understanding the difference is fundamental to debugging connectivity issues.

Property	Security Group (SG)	Network ACL (NACL)
Operates at	ENI level (instance)	Subnet level
Stateful?	Yes — return traffic auto-allowed	No — must explicitly allow return
Rule evaluation	All rules evaluated, most permissive wins	Processed in order, first match wins
Default behavior	Denies all inbound, allows all outbound	Default NACL allows all inbound/outbound; a custom NACL denies all until you add rules
Use case	Instance-level firewall (DB accepts app SG traffic)	Subnet-level guard rails (block entire subnet's SSH)

# Security Group: app-servers-sg
# Inbound: allow HTTP from ALB SG only
Type: HTTP | Source: sg-0abc123 (alb-sg) | Port: 80
Type: HTTPS | Source: sg-0abc123 (alb-sg) | Port: 443
Type: SSH | Source: 10.0.1.10/32 (bastion) | Port: 22
# Outbound: allow all (default) -- or lock down to specific destinations

# NACL: app-tier-nacl
# Inbound: deny SSH from anywhere, allow HTTP/HTTPS from VPC CIDR
Rule #100: ALLOW | 10.0.0.0/16 | TCP 80  | All
Rule #110: ALLOW | 10.0.0.0/16 | TCP 443 | All
Rule #200: DENY  | 0.0.0.0/0   | TCP 22  | All
# Outbound: NACLs are stateless, so return traffic to clients must be allowed explicitly
Rule #100: ALLOW | 10.0.0.0/16 | All | All
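The difference in rule evaluation is easier to see in code. The following is a simplified sketch, not how AWS implements it: the `NaclRule` and `SgRule` types are hypothetical, rules match a single TCP port, and sources are CIDR blocks only (real security groups can also reference other security groups). It models first-match-wins for NACLs and any-allow-wins for security groups.

```python
from dataclasses import dataclass
import ipaddress


@dataclass(frozen=True)
class NaclRule:
    number: int   # lower numbers are evaluated first
    action: str   # "ALLOW" or "DENY"
    cidr: str
    port: int     # single TCP port, kept simple for the sketch


@dataclass(frozen=True)
class SgRule:
    cidr: str
    port: int     # security groups only contain allow rules


def _matches(cidr, port, src_ip, dst_port):
    in_range = ipaddress.ip_address(src_ip) in ipaddress.ip_network(cidr)
    return in_range and port == dst_port


def nacl_allows(rules, src_ip, dst_port):
    """First matching rule by ascending number decides; no match is an implicit deny."""
    for rule in sorted(rules, key=lambda r: r.number):
        if _matches(rule.cidr, rule.port, src_ip, dst_port):
            return rule.action == "ALLOW"
    return False


def sg_allows(rules, src_ip, dst_port):
    """Any matching allow rule is enough; rule order is irrelevant."""
    return any(_matches(r.cidr, r.port, src_ip, dst_port) for r in rules)


# Inbound rules modeled on the app-tier-nacl example.
app_tier_nacl = [
    NaclRule(100, "ALLOW", "10.0.0.0/16", 80),
    NaclRule(110, "ALLOW", "10.0.0.0/16", 443),
    NaclRule(200, "DENY", "0.0.0.0/0", 22),
]

if __name__ == "__main__":
    print(nacl_allows(app_tier_nacl, "10.0.5.9", 443))
    print(nacl_allows(app_tier_nacl, "10.0.5.9", 22))
```

Because NACL order matters, a low-numbered ALLOW can shadow a later DENY for the same traffic. That is a common cause of rules that look correct but never take effect.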

VPC Endpoints and PrivateLink

By default, traffic from your VPC to AWS services (S3, DynamoDB, SQS, etc.) targets the services' public endpoints, so private subnets need a NAT gateway (or instances need public IPs) to reach them. VPC Endpoints give you a private path that stays on the AWS network:

Gateway Endpoint (free): S3 and DynamoDB. You add a route table entry in your private subnet with the service's prefix list as the destination and the vpce-id as the target. Traffic never leaves the AWS backbone.
Interface Endpoint (hourly fee per AZ + ~$0.01/GB): Most other services. Deployed as ENIs with private IPs in your subnets, backed by PrivateLink. Requires a security group that allows inbound traffic on the service's port (usually 443).

# Gateway endpoint for S3 — add route to private subnet route table
# Route table: destination 0.0.0.0/0 → nat-xxxxx (for internet)
#             destination pl-xxxxxxxx → vpce-xxxxxxx (S3 endpoint)
# Result: S3 traffic from private subnet routes via AWS backbone, not internet

# Interface endpoint for Secrets Manager — private IP in your subnet
# Endpoint: com.amazonaws.us-east-1.secretsmanager
# Endpoint security group must allow inbound HTTPS (443) from your clients
# Lambda in private subnet can call Secrets Manager without NAT GW overhead

Storage Deep Dive

S3 — The Bedrock of AWS

S3 stores objects (key-value, no hierarchy) in buckets (containers). The storage classes control cost vs. latency:

Class	Use case	Latency	Cost vs S3 Standard
S3 Standard	Frequently accessed data, hot storage	ms	1x
S3 Intelligent-Tiering	Unknown access patterns, auto-moves based on usage	ms	~0.9x + monitoring fee
S3 Standard-IA	Long-lived, infrequently accessed (>30 days)	ms (extra retrieval)	~0.5x
S3 Glacier Instant Retrieval	Archive, data lake, >90 days	<1 s retrieval	~0.25x
S3 Glacier Flexible Retrieval	Long-term archive, bulk retrieval acceptable	1 min–12 hr	~0.1x
S3 Glacier Deep Archive	Regulatory compliance, >7 year retention	12–48 hr	~0.02x

S3 provides strong read-after-write consistency for all operations: since December 2020, a successful PUT (new object or overwrite) or DELETE is immediately visible to subsequent GET and LIST requests. Older material describing eventual consistency for overwrites and deletes is out of date, so "stale read" bugs today usually come from caches or from concurrent writers, where the last write to a key wins. S3 Select and Glacier Select let you query CSV/JSON/Parquet data in-place without full object retrieval.

# S3 replication: cross-region (CRR) and same-region (SRR)
# Versioning must be enabled on source bucket

# Cross-region replication (CRR) — for DR, latency
aws s3api put-bucket-replication \
  --bucket my-source-bucket \
  --replication-configuration '{
    "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
    "Rules": [{
      "ID": "crr-rule",
      "Priority": 1,
      "Status": "Enabled",
      "Filter": {},
      "DeleteMarkerReplication": { "Status": "Disabled" },
      "Destination": {
        "Bucket": "arn:aws:s3:::my-dest-bucket",
        "StorageClass": "STANDARD"
      }
    }]
  }'

# S3 Transfer Acceleration — uses CloudFront edge for uploads
aws s3api put-bucket-acceleration-configuration \
  --bucket my-bucket \
  --accelerate-configuration Status=Enabled
# Upload to: my-bucket.s3-accelerate.amazonaws.com

# S3 Object Lambda — run your own code on GET requests
# Use case: redact PII, compress, resize images, without separate proxy layer
# Lambda runs inside S3's own network, so no data leaves AWS

EBS — Elastic Block Store

EBS provides persistent block storage attached to EC2 instances. The volume lives independently of the instance, so you can detach and reattach it. The main volume types:

Type	Performance	Use case	Max IOPS/Volume	Max Throughput
gp3	Baseline 3,000 IOPS + 125 MB/s, configurable independently	General purpose, most workloads	16,000	1,000 MB/s
gp2	3,000 IOPS burst, scales with volume size	Legacy, gp3 preferred	16,000	250 MB/s
io2 Block Express	256,000 IOPS, sub-ms latency, PIOPS independent of throughput	Oracle, SAP, PostgreSQL OLTP	256,000	4,000 MB/s
io2	64,000 IOPS max, up to 500 IOPS per GiB	Moderate DB workloads	64,000	1,000 MB/s
st1 (Throughput Optimized)	HDD; 40 MB/s per TiB baseline, bursts to 250 MB/s per TiB	Data pipelines, log processing, MapReduce	500	500 MB/s
sc1 (Cold Storage)	HDD; 12 MB/s per TiB baseline, bursts to 80 MB/s per TiB	Rarely accessed, cheapest per GB	250	250 MB/s

gp3 vs io2: gp3 lets you provision IOPS and throughput independently of volume size. For a Postgres DB needing 10,000 IOPS and 300 MB/s, gp3 is cheaper than io2. io2 Block Express (requires a Nitro-based instance type) supports up to 256 K IOPS with sub-millisecond latency.
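To put numbers on the gp3 versus io2 comparison, here is a rough cost sketch for a 500 GB volume at 10,000 IOPS and 300 MB/s. Only the io2 per-IOPS figure comes from this article; the other prices are assumed illustrative us-east-1 list prices and should be checked against current pricing before use.

```python
# Assumed illustrative list prices in USD per month; verify before relying on them.
GP3_PER_GB = 0.08
GP3_PER_IOPS_OVER_BASELINE = 0.005   # first 3,000 IOPS are included
GP3_PER_MBPS_OVER_BASELINE = 0.04    # first 125 MB/s are included
IO2_PER_GB = 0.125
IO2_PER_IOPS = 0.065                 # first pricing tier only

GP3_BASELINE_IOPS = 3000
GP3_BASELINE_MBPS = 125


def gp3_monthly_cost(size_gb, iops, throughput_mbps):
    """Storage plus whatever IOPS and throughput exceed the free baseline."""
    extra_iops = max(0, iops - GP3_BASELINE_IOPS)
    extra_mbps = max(0, throughput_mbps - GP3_BASELINE_MBPS)
    return (size_gb * GP3_PER_GB
            + extra_iops * GP3_PER_IOPS_OVER_BASELINE
            + extra_mbps * GP3_PER_MBPS_OVER_BASELINE)


def io2_monthly_cost(size_gb, iops):
    """Storage plus every provisioned IOPS (no free baseline)."""
    return size_gb * IO2_PER_GB + iops * IO2_PER_IOPS


if __name__ == "__main__":
    print(round(gp3_monthly_cost(500, 10000, 300), 2))
    print(round(io2_monthly_cost(500, 10000), 2))
```

Under these assumed prices the gap is driven almost entirely by the per-IOPS charge, so io2 is mainly worth paying for when you need its higher durability or IOPS beyond what gp3 offers.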

# Provision gp3 volume with specific IOPS/throughput
aws ec2 create-volume \
  --volume-type gp3 \
  --size 500 \
  --iops 16000 \
  --throughput 1000 \
  --availability-zone us-east-1a
# Billed by: GB-month + IOPS provisioned (if > baseline) + throughput (if > baseline)

# io2 volume with provisioned IOPS
aws ec2 create-volume \
  --volume-type io2 \
  --size 100 \
  --iops 64000 \
  --availability-zone us-east-1a
# PIOPS billed at ~$0.065 per provisioned IOPS-month

# Attach to instance, then mount
aws ec2 attach-volume --volume-id vol-xxxx --instance-id i-xxxx --device /dev/sdf
# On the instance (on Nitro instances the device may appear as /dev/nvme1n1; check with lsblk):
sudo mkfs -t xfs /dev/sdf
sudo mkdir -p /data
sudo mount -o noatime /dev/sdf /data  # noatime improves performance on high-IO DBs

Instance Store — Ephemeral, Fast, Dangerous

Instance store volumes are physically attached to the host that runs your EC2 instance. They offer the highest throughput (hundreds of thousands of IOPS, GB/s) at the lowest latency (microseconds, no network hop). But they are lost when the instance stops, hibernates, or terminates, or when the underlying hardware fails. This makes them suitable only for scratch space, temp caches, or databases that replicate their data elsewhere. AWS recommends against using them for anything that can't be reconstructed from S3 or a database.

# Instance store on Im4gn (storage optimized, local NVMe)
# 7.5 TB NVMe, ~1.5 million IOPS, ~16 GB/s throughput

# Important: instance store data survives a reboot, but not stop, hibernate, or terminate
# Stop/start usually lands the instance on a different host, so the data is gone

# For a Spark job with shuffle ephemeral storage:
#  - Use instance store for /tmp and shuffle space
#  - Fail-fast if the volume fills (Spark retries on other nodes)
#  - Never write checkpoint or output data to instance store

Compute Patterns — EC2, Lambda, Fargate

The right compute choice depends on three variables: runtime duration, concurrency pattern, and operational complexity tolerance.

Criteria	EC2	ECS/Fargate	EKS	Lambda
Best for	Long-running, stateful, predictable baseline	Container workloads, short-lived tasks, microservices	Teams already running Kubernetes on-prem	Event-driven, sporadic, under 15 min
Duration	Unlimited	Unlimited	Unlimited	15 min max
Cold start	0 (always running)	20–60 s (new container)	60–120 s (new pod)	100–3000 ms
Scale model	ASG (minutes)	Service auto-scaling (seconds)	Karpenter/HPA (seconds)	Automatic, up to +1,000 concurrent executions every 10 s per function
Networking	ENI with SG, VPC-native	awsvpc mode (1 ENI per task)	awsvpc mode or host	Via Hyperplane ENI
OS control	Full	None (container image)	Full (pod spec)	None
Billing granularity	Per second (60 s minimum)	vCPU-second, GB-second	Node pricing (EC2 or Fargate) + hourly cluster fee	1 ms of duration + per request

EC2 Instance Types

EC2 instances are categorized by family based on their resource profile. The naming pattern is [family][generation][attributes].[size] (e.g. m6i.4xlarge: family m, generation 6, attribute i for Intel, size 4xlarge):

Family	Focus	Typical use case
t (Burstable)	CPU bursting, baseline + credits	Dev machines, small DBs, build servers. T2/T3 unlimited mode removes credit cap.
m (General Purpose)	Balanced, 1 vCPU per 4 GB RAM	Most application servers, containers, mid-tier DBs
c (Compute Optimized)	More CPU per GB RAM (1 vCPU per 2 GB)	High-performance web servers, batch processing, ML inference
r (Memory Optimized)	More RAM per vCPU (1 vCPU per 8 GB)	In-memory DBs (Redis, SAP HANA), big data analytics
x (Extra Memory)	1 vCPU per 16 GB RAM	Ultra-high-memory: SAP HANA, Spark executors
i (Storage Optimized)	NVMe instance store, high disk IOPS	NoSQL (Cassandra, MongoDB), Elasticsearch, data warehousing
d (Dense Storage)	High HDD capacity (up to 48 TB on d2, more on d3en)	Massive file servers, distributed filesystems, Hadoop
z (High Frequency)	Up to 4.0 GHz sustained all-core turbo, best single-threaded	High-frequency trading, real-time gaming servers, video encoding
p (GPU)	NVIDIA V100/A100/H100, training and inference	ML training, inference, HPC, deep learning
inf (Inferentia)	AWS Inferentia chips (custom NN inference)	Cost-efficient ML inference, large model serving
mac (Mac Mini)	Apple Mac mini hardware (Intel x86 or Apple silicon)	iOS/macOS build agents, Xcode CI

Savings Plans vs On-Demand vs Spot: On-Demand is pay-as-you-go. Savings Plans commit you to a fixed hourly spend ($/hour) for 1 or 3 years in exchange for discounts of up to 72%. Spot instances use spare capacity at a market-driven price (up to 90% off, no bidding required) but can be reclaimed with 2 minutes warning: fine for batch jobs, terrible for stateful databases.

Databases on AWS

AWS's managed database portfolio spans relational, NoSQL, in-memory, graph, time-series, and document stores. The choice determines your consistency model, operational overhead, and scalability ceiling.

RDS and Aurora

RDS (Relational Database Service) is a managed PostgreSQL/MySQL/MariaDB/Oracle/SQL Server with automated backups, failover, and read replicas. Aurora is AWS's custom engine that is MySQL- and PostgreSQL-compatible but replaces the traditional storage engine with a distributed, quorum-based storage system:

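6 copies of data across 3 AZs (2 per AZ): tolerates the loss of an entire AZ without losing write availability, and the loss of an AZ plus one more copy without losing data or read availability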
Write quorum: 4 of 6. Read quorum: 3 of 6. You can serve reads from any AZ.
Storage auto-scales in 10 GB increments, up to 128 TB
Global Database: cross-region replication with <1 second lag, secondary region serves reads
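Aurora Serverless v2: scales capacity (in ACUs) automatically between a configured minimum and maximum, good for unpredictable workloads; idle cost depends on the minimum you set
Backtrack (Aurora MySQL only): rewind the database to any point in time within the backtrack window (up to 72 hours)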
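The quorum numbers above follow from two standard rules: a read quorum must overlap every write quorum, and any two write quorums must overlap each other. A short sketch of that arithmetic (the function names are illustrative, and this models only the counting, not Aurora's storage protocol):

```python
COPIES = 6          # V: total copies of each storage segment
WRITE_QUORUM = 4    # V_w
READ_QUORUM = 3     # V_r
COPIES_PER_AZ = 2   # 6 copies spread evenly over 3 AZs


def quorums_are_consistent(v, v_w, v_r):
    """Reads must overlap writes (V_r + V_w > V) and writes must overlap writes (2 * V_w > V)."""
    return v_r + v_w > v and 2 * v_w > v


def availability_after_failures(failed_copies):
    """Return (can_write, can_read) once `failed_copies` copies are unreachable."""
    alive = COPIES - failed_copies
    return alive >= WRITE_QUORUM, alive >= READ_QUORUM


if __name__ == "__main__":
    print(quorums_are_consistent(COPIES, WRITE_QUORUM, READ_QUORUM))
    print(availability_after_failures(COPIES_PER_AZ))          # one AZ down
    print(availability_after_failures(COPIES_PER_AZ + 1))      # one AZ plus one copy
    print(availability_after_failures(2 * COPIES_PER_AZ))      # two AZs down
```

Losing one AZ removes two copies and leaves both quorums intact. Losing an AZ plus one more copy leaves three, which is enough to read and repair but not to write. Losing two AZs leaves two copies, below both quorums.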

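# Aurora cluster (Serverless v2): the cluster is "provisioned" with a scaling range
aws rds create-db-cluster \
  --engine aurora-postgresql \
  --db-cluster-identifier my-aurora-serverless \
  --serverless-v2-scaling-configuration MinCapacity=2,MaxCapacity=40 \
  --master-username dbadmin \
  --manage-master-user-password \
  --vpc-security-group-ids sg-xxxx \
  --db-subnet-group-name my-subnets

# Serverless v2 capacity comes from instances of class db.serverless
aws rds create-db-instance \
  --db-cluster-identifier my-aurora-serverless \
  --db-instance-identifier my-aurora-serverless-1 \
  --db-instance-class db.serverless \
  --engine aurora-postgresql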

# Aurora Global Database (cross-region DR)
aws rds create-global-cluster \
  --global-cluster-identifier my-global \
  --source-db-cluster-identifier arn:aws:rds:us-east-1:123456789012:cluster:my-primary-aurora

aws rds create-db-cluster \
  --engine aurora-postgresql \
  --engine-mode provisioned \
  --global-cluster-identifier my-global \
  --db-cluster-identifier my-secondary-aurora \
  --region eu-west-1

DynamoDB — The Scalable NoSQL Workhorse

DynamoDB is a fully managed NoSQL key-value and document database. It is the AWS service most engineers have experience with but least understand deeply. Key concepts:

Tables, Items, Attributes: Items are JSON-like documents, up to 400 KB each.
Partition Key (PK): Required. Determines data distribution across partitions. High-cardinality PKs spread load.
Sort Key (SK): Optional. Enables composite primary keys — allows range queries within a partition (e.g. UserID + Timestamp for a user's activity feed).
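RCU / WCU: Read Capacity Units and Write Capacity Units. One RCU = one strongly consistent read per second of an item up to 4 KB, or two eventually consistent reads per second. One WCU = one write per second of an item up to 1 KB. Larger items consume additional units, rounded up.
GSI (Global Secondary Index): Alternate PK/SK to query data in different patterns. A GSI has its own RCU/WCU and supports only eventually consistent reads; if you need strongly consistent reads on an alternate sort key, use a Local Secondary Index or read from the base table.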
On-demand: Pay per request. Scales instantly. Good for unpredictable, spiky workloads.
Provisioned: Reserve RCU/WCU. Usually substantially cheaper than on-demand at steady, well-utilized load. Supports auto-scaling.
DynamoDB Accelerator (DAX): In-memory cache in front of DynamoDB, reduces read latency for cache hits from single-digit milliseconds to microseconds. DAX is a cluster of nodes in your VPC. Use for repeated reads of hot items.
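Capacity planning with the RCU/WCU definitions is simple arithmetic, sketched below. The function names are illustrative; the rounding rules assumed here are the ones stated above (4 KB per read unit, 1 KB per write unit, eventually consistent reads at half cost).

```python
import math

READ_UNIT_BYTES = 4 * 1024    # one RCU covers a read of up to 4 KB
WRITE_UNIT_BYTES = 1 * 1024   # one WCU covers a write of up to 1 KB


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """RCUs needed to sustain `reads_per_second` reads of items of the given size."""
    units_per_read = math.ceil(item_size_bytes / READ_UNIT_BYTES)
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        total = total / 2     # eventually consistent reads cost half
    return math.ceil(total)


def write_capacity_units(item_size_bytes, writes_per_second):
    """WCUs needed to sustain `writes_per_second` writes of items of the given size."""
    return math.ceil(item_size_bytes / WRITE_UNIT_BYTES) * writes_per_second


if __name__ == "__main__":
    # 6 KB items, 100 reads/s: each read spans two 4 KB units.
    print(read_capacity_units(6 * 1024, 100))
    print(read_capacity_units(6 * 1024, 100, strongly_consistent=False))
    # 2.5 KB items, 10 writes/s: each write rounds up to three 1 KB units.
    print(write_capacity_units(2560, 10))
```

The rounding is per item, so many small items just over a 4 KB or 1 KB boundary cost noticeably more than their total byte size suggests.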

# DynamoDB table with GSI for secondary access pattern
aws dynamodb create-table \
  --table-name Orders \
  --attribute-definitions \
    AttributeName=PK,AttributeType=S \
    AttributeName=SK,AttributeType=S \
    AttributeName=CustomerEmail,AttributeType=S \
  --key-schema \
    AttributeName=PK,KeyType=HASH \
    AttributeName=SK,KeyType=RANGE \
  --global-secondary-indexes '[
    {
      "IndexName": "CustomerEmailIndex",
      "KeySchema": [{"AttributeName":"CustomerEmail","KeyType":"HASH"}],
      "Projection": {"ProjectionType":"ALL"},
      "ProvisionedThroughput": {"ReadCapacityUnits":50,"WriteCapacityUnits":25}
    }
  ]' \
  --provisioned-throughput ReadCapacityUnits=100,WriteCapacityUnits=50

# On-demand capacity (pay per request, no capacity planning)
aws dynamodb update-table \
  --table-name Orders \
  --billing-mode PAY_PER_REQUEST

ElastiCache — Redis and Memcached

ElastiCache is a managed service for Redis (open-source compatible) and Memcached. For Redis specifically:

Cluster mode disabled: Single primary, up to 5 read replicas. Replicas have async replication. Good for read-heavy, where some staleness is acceptable.
Cluster mode enabled: Data partitioned across shards (1-500). Each shard has 1-5 replicas. Supports slot-based partitioning automatically. Use for write-heavy or very large datasets.
Serverless Redis: AWS manages shard count and replication. You pay for data stored (GB-hours) and for requests, measured in ElastiCache Processing Units (ECPUs). No cluster management. Good for variable, growing workloads.
Global Datastore: Cross-region replication. Promotes secondary region for DR with RPO <1 second.
Memcached: Pure caching, multithreaded, no replication and no automatic failover (nodes can be spread across AZs, but a lost node loses its data). Use when you only need cache, not a data store.

When NOT to use Redis: If your cache hit rate is low, consider whether Redis is actually helping or just adding latency. A cold Redis (cache miss rate above ~80%) adds ~2 ms per request for the cache lookup, plus the DynamoDB call on every miss, so you're worse off than calling DynamoDB directly.
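The break-even point can be worked out directly. In this sketch every request pays the cache lookup and only misses pay the backend call. The ~2 ms lookup comes from the paragraph above; the 5 ms backend latency is an assumed figure for a DynamoDB read.

```python
def expected_latency_ms(hit_rate, cache_lookup_ms, backend_ms):
    """Average latency when every request checks the cache and misses fall through."""
    return cache_lookup_ms + (1.0 - hit_rate) * backend_ms


def break_even_hit_rate(cache_lookup_ms, backend_ms):
    """Hit rate above which the cache beats calling the backend directly."""
    return cache_lookup_ms / backend_ms


if __name__ == "__main__":
    lookup_ms, backend_ms = 2.0, 5.0    # backend_ms is an assumption
    print(expected_latency_ms(0.2, lookup_ms, backend_ms))   # cold cache, 80% misses
    print(expected_latency_ms(0.9, lookup_ms, backend_ms))   # warm cache
    print(break_even_hit_rate(lookup_ms, backend_ms))
```

With these numbers the cache only pays for itself above a 40% hit rate; a faster cache lookup or a slower backend moves that threshold down.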

Serverless Architecture

AWS's serverless stack chains together managed services to form complete event-driven applications with no persistent servers. The core pattern: a trigger fires, a function runs, a queue manages backpressure.

The Serverless Primitives

Lambda

Your compute unit. Stateless, auto-scales from 0 to thousands of concurrent executions. Cold start is the enemy. Keep packages small, avoid heavy imports at the top of your handler. Use SnapStart for Java, provisioned concurrency for predictable latency floors.

EventBridge

The event bus. Ingests events from 200+ AWS services and SaaS partners via "event sources." Rules route events to targets (Lambda, SQS, SNS, ECS). The schema registry lets you validate event payloads. Scheduler triggers rules on cron. Use EventBridge over SNS when you need fan-out to multiple targets with filtering.

SQS

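Decouple producers from consumers. Standard queues: at-least-once delivery, best-effort ordering, nearly unlimited throughput. FIFO queues: exactly-once processing, ordered within a message group, 300 API calls/s per action by default (3,000 msg/s with batching, more in high-throughput mode). Lambda can poll SQS directly (event source mapping). SQS dead-letter queues handle poison messages. Use SQS over EventBridge when you need a durable buffer that consumers drain at their own pace, plus explicit queue depth monitoring.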

SNS

Pub/sub push notifications. Subscribers are Lambda functions, SQS queues, HTTP endpoints, mobile push, email, SMS. Fan-out pattern: one SNS topic → multiple SQS queues (one per microservice) → Lambda consumers. SNS message size: 256 KB. Use SNS when you need push-style fan-out; use EventBridge when you need filtering-first fan-out.

API Gateway

Managed REST/HTTP/WebSocket API in front of Lambda or any HTTP backend. Handles auth (IAM, Cognito, JWT), throttling (burst and sustained), request validation, OpenAPI import. HTTP API is 70% cheaper than REST API but lacks request validation and caching.

Step Functions

State machine orchestrator. Coordinates Lambda + other AWS services into long-running workflows. Standard workflow: full execution history, exactly-once semantics, up to 1 year. Express workflow: high-throughput, at-least-once, up to 5 minutes. Use Step Functions over ad-hoc Lambda chaining when workflows need visibility, retries, human approval steps, or branching.

The Serverless Application Pattern

# A typical serverless order processing pipeline
#
#  API Gateway (REST) → Lambda (validate & persist) → DynamoDB
#  DynamoDB Stream → Lambda (trigger) → SNS (notify)
#  SNS → SQS (per-service queues) → Lambda (fulfillment) → SQS DLQ
#
# Flow:
# 1. Client POSTs to API Gateway → Lambda validates schema + auth
# 2. Lambda writes order to DynamoDB (PK: OrderId, SK: userId)
# 3. DynamoDB Streams trigger Lambda for cross-cutting concerns (analytics)
# 4. SNS fans out to per-service SQS queues (fulfillment, notification, accounting)
# 5. Each service's Lambda consumes from its SQS queue, with DLQ on failure
# 6. Step Functions workflow coordinates multi-step fulfillment (check inventory → charge → ship)
#
# Error handling:
#  - SQS queue has a visibility timeout (say, 5 min) during which only one Lambda sees the message
#  - Lambda fails → message reappears after visibility timeout (implicit retry)
#  - After MaxReceiveCount (e.g. 3), move to DLQ → CloudWatch alert
#  - Lambda idempotency key (orderId) prevents duplicate processing on retries
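The error-handling rules above can be simulated in memory. The sketch below is an illustration only: `drain` is a hypothetical stand-in for the SQS-to-Lambda event source mapping, and the `processed` set stands in for an idempotency table (in practice, for example, a conditional write keyed by orderId). It shows why duplicates are harmless and how a poison message ends up in the DLQ.

```python
from collections import deque

MAX_RECEIVE_COUNT = 3


def drain(messages, handler, max_receive_count=MAX_RECEIVE_COUNT):
    """Deliver messages at-least-once; return (processed_ids, dead_letter_queue)."""
    queue = deque({"body": m, "receives": 0} for m in messages)
    processed = set()   # stand-in for an idempotency table keyed by orderId
    dlq = []
    while queue:
        msg = queue.popleft()
        msg["receives"] += 1
        order_id = msg["body"]["orderId"]
        if order_id in processed:
            continue    # duplicate delivery: acknowledge without repeating side effects
        try:
            handler(msg["body"])
            processed.add(order_id)
        except Exception:
            if msg["receives"] >= max_receive_count:
                dlq.append(msg["body"])     # retries exhausted: park for inspection
            else:
                queue.append(msg)           # visible again after the visibility timeout
    return processed, dlq


if __name__ == "__main__":
    def fulfil(body):
        if body["orderId"] == "bad":
            raise ValueError("poison message")

    done, dead = drain([{"orderId": "o1"}, {"orderId": "o1"}, {"orderId": "bad"}], fulfil)
    print(sorted(done), dead)
```

The idempotency check has to be atomic with the side effect in a real system; a check-then-act sequence across two concurrent Lambda invocations can still process the same order twice.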

AWS Observability Stack

AWS's observability is built on three pillars: metrics, logs, and traces — unified by X-Ray for distributed tracing and CloudWatch for metrics/logs.

CloudWatch

CloudWatch collects metrics from every AWS service. Key concepts:

Metrics: Time-series data points (CPUUtilization, RequestCount, Latency). Custom metrics from your app via PutMetricData API. Resolution: standard (1 min), high-resolution (1 sec, costs more).
Alarms: Trigger actions when a metric crosses a threshold. Can notify via SNS, auto-scale, or stop/terminate EC2 instances.
Logs: CloudWatch Logs ingests from CloudTrail (API calls), Route 53 (DNS), VPC Flow Logs (network), and your applications (via agent or SDK). Log groups are arbitrary namespaces. Retention is configurable (1 day to 10 years, or never expire). Query with CloudWatch Logs Insights (fields @timestamp, @message | filter @message like /ERROR/).
Dashboards: Custom metric dashboards, widget-based, can include graphs from multiple regions and services.
Anomaly Detection: CloudWatch automatically learns your metric's baseline and raises alarms on deviations, without you specifying static thresholds.

# CloudWatch Logs Insights query: find error logs across all Lambda functions
fields @timestamp, @message, @log
| filter @message like /ERROR/ or @message like /Exception/
| sort @timestamp desc
| limit 50

# Put custom metric from Lambda (Node.js)
const { CloudWatchClient, PutMetricDataCommand } = require('@aws-sdk/client-cloudwatch');
const cw = new CloudWatchClient({});

// recordProcessingTime is an illustrative helper; call it from your async handler
async function recordProcessingTime(durationMs) {
  await cw.send(new PutMetricDataCommand({
    Namespace: 'MyApp/Performance',
    MetricData: [{
      MetricName: 'OrderProcessingTime',
      Value: durationMs,
      Unit: 'Milliseconds',
      Dimensions: [{ Name: 'Service', Value: 'OrderService' }]
    }]
  }));
}

X-Ray — Distributed Tracing

X-Ray records traces: end-to-end latency across all services in a request. Each trace consists of segments (services) and subsegments (calls to DynamoDB, external HTTP, Lambda invocations). The X-Ray SDK auto-instruments Lambda, API Gateway, and makes it easy to add custom segments.

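# Lambda with X-Ray active tracing enabled (function setting: TracingConfig Mode=Active)
# X-Ray SDK in Node.js:
const AWSXRay = require('aws-xray-sdk-core');
const { DynamoDBClient } = require('@aws-sdk/client-dynamodb');
// Instrument AWS SDK v3 clients (DynamoDB, S3, etc.) so their calls become subsegments
const ddb = AWSXRay.captureAWSv3Client(new DynamoDBClient({}));

exports.handler = async (event) => {
  // Manual subsegment for custom processing
  return AWSXRay.captureAsyncFunc('orderProcessing', async (subsegment) => {
    try {
      const result = await processOrder(event);  // processOrder: your application logic
      subsegment.addMetadata('orderId', result.orderId);
      return result;
    } catch (err) {
      subsegment.addError(err);  // record the failure on the trace
      throw err;
    } finally {
      subsegment.close();        // subsegments must be closed explicitly
    }
  });
};
// Trace shows: API GW → Lambda → DynamoDB (GetItem) → SNS (Publish)
// Each segment has: start/end time, metadata, errors/exceptions
// Subsegments show downstream call duration: DynamoDB 4ms, SNS 12ms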

CloudTrail — Audit Trail

CloudTrail records API calls made in your account: who (IAM principal), what (service + action), when (timestamp), where (source IP). Management events are recorded by default and kept in Event history for 90 days; data events (such as S3 object reads) are opt-in. For long-term retention you create a trail that delivers log files to an S3 bucket. You can aggregate events from multiple accounts into a single S3 bucket using an organization trail.

# CloudTrail event structure (S3 object JSON)
{
  "eventVersion": "1.08",
  "userIdentity": {
    "type": "IAMUser",
    "arn": "arn:aws:iam::123456789012:user/jane",
    "principalId": "AIDA...",
    "accountId": "123456789012"
  },
  "eventTime": "2025-01-15T10:30:00Z",
  "eventSource": "ec2.amazonaws.com",
  "eventName": "DescribeInstances",
  "awsRegion": "us-east-1",
  "sourceIPAddress": "203.0.113.42",
  "userAgent": "aws-cli/2.x",
  "requestParameters": {...},
  "responseElements": {...},
  "requestID": "a1b2c3d4-...",
  "eventID": "abc123"
}

# CloudTrail Insights: detects anomalous API activity
# E.g., many DescribeInstances calls followed by TerminateInstances — might indicate compromise
# Insights events written to separate S3 prefix: AWSLogs/AccountID/CloudTrail-Insight/us-east-1/2025/01/15/

ServiceLens — Metrics + Traces + Logs in One Place

ServiceLens is the integration layer: it connects CloudWatch metrics and logs with X-Ray traces so you can see a service map with latency percentiles, error rates, and trace drill-down from the same interface. It's the recommended way to monitor a distributed application on AWS.

Security — IAM, KMS, and the Principle of Least Privilege

IAM (Identity and Access Management) is the foundation of AWS security. Everything else builds on it.

IAM Core Concepts

IAM governs who (principal) can do what (action) to which resource under what conditions. The evaluation logic:

Deny by default — if no policy explicitly allows, the request is denied.
Explicit Deny in any attached policy overrides any Allow.
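Identity-based and resource-based policies grant permissions, and those grants are additive. SCPs, permissions boundaries, and session policies only limit: a request must be allowed by every one of them that applies.
Principal can be a user, role, federated identity, or AWS service. Groups are not principals; they are a way to attach policies to users.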
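The evaluation order can be summarized in a few lines. This is a deliberately simplified sketch for a single account: `is_allowed` is a hypothetical function that matches exact action names only and ignores wildcards, resources, conditions, resource-based policies, and session policies. It captures three ideas: explicit deny wins, guardrails such as SCPs and permissions boundaries can only restrict, and without an explicit allow the answer is deny.

```python
def is_allowed(action, identity_allows, identity_denies=(), scp_allows=None, boundary_allows=None):
    """Simplified single-account evaluation; each argument is a collection of action names.

    scp_allows / boundary_allows of None means that guardrail does not apply.
    """
    # 1. An explicit deny always wins.
    if action in identity_denies:
        return False
    # 2. Every guardrail that applies must allow the action; guardrails never grant.
    for guardrail in (scp_allows, boundary_allows):
        if guardrail is not None and action not in guardrail:
            return False
    # 3. An identity-based policy must explicitly allow it; otherwise implicit deny.
    return action in identity_allows


if __name__ == "__main__":
    print(is_allowed("s3:GetObject", {"s3:GetObject"}))
    print(is_allowed("s3:GetObject", set(), scp_allows={"s3:GetObject"}))
```

The second call returns False because an SCP that allows an action grants nothing on its own; an identity-based or resource-based policy still has to allow it.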

# IAM policy — allow Lambda to read from S3 bucket "my-data-bucket"
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": [
      "s3:GetObject",
      "s3:GetObjectVersion"
    ],
    "Resource": "arn:aws:s3:::my-data-bucket/*",
    "Condition": {
      "StringEquals": {
        "aws:RequestedRegion": ["us-east-1", "eu-west-1"]
      }
    }
  }]
}

# Service Control Policy (SCP) — organization-level restriction
# Forces all accounts in OU to use TLS 1.2+ for S3
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Deny",
    "Action": "s3:*",
    "Resource": "*",
    "Condition": {
      "NumericLessThan": {
        "s3:TlsVersion": 1.2
      }
    }
  }]
}

# Cross-account access via role
# Account B trusts Account A to assume a role
# Account A principal can call sts:AssumeRole to get temporary credentials
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "AWS": "arn:aws:iam::111122223333:root" },
    "Action": "sts:AssumeRole",
    "Condition": {
      "Bool": { "aws:SecureTransport": "true" }
    }
  }]
}

KMS — Envelope Encryption

AWS KMS (Key Management Service) manages encryption keys. Most AWS services that encrypt data at rest integrate with KMS. The key hierarchy:

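AWS managed keys (e.g. aws/s3): created and managed by AWS on your behalf. You can view them and audit their use in CloudTrail, but you cannot change their key policy, control their rotation, or delete them. AWS rotates them automatically every year. Used automatically by services.
Customer managed keys (CMK; current AWS documentation calls them "KMS keys"): you create and manage them, and control access via key policy + IAM policy. Automatic rotation is optional and off by default; once enabled, the default period is one year. $1-3/month per CMK + per-use fees.
Custom key stores: key material is generated and kept in a CloudHSM cluster that you own and manage (FIPS 140-validated hardware security modules) rather than in the shared KMS HSM fleet. This is different from BYOK, which imports your own key material into a KMS key. Intended for the highest compliance requirements.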

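Envelope encryption: KMS generates a data key (DEK) and encrypts it under a CMK; the DEK encrypts your data. S3 with SSE-KMS does this automatically: each object is encrypted with its own DEK, and the encrypted DEK is stored with the object. The CMK never leaves KMS and never touches the object bytes; KMS is called only to generate or decrypt the DEK. Without S3 Bucket Keys that is still one KMS request per object read or write, so enable Bucket Keys on high-request-rate buckets to reduce KMS calls and cost.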

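# Create a CMK, give it an alias, then enable automatic rotation
# (create-key has no rotation flag; rotation is a separate API call)
KEY_ID=$(aws kms create-key \
  --description "Data encryption key for PII" \
  --key-usage ENCRYPT_DECRYPT \
  --origin AWS_KMS \
  --query KeyMetadata.KeyId --output text)
aws kms create-alias --alias-name alias/my-key --target-key-id "$KEY_ID"
aws kms enable-key-rotation --key-id "$KEY_ID"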

# Encrypt a small payload directly under the KMS key
# (limit is 4 KB of plaintext; this is NOT envelope encryption)
aws kms encrypt \
  --key-id alias/my-key \
  --plaintext fileb://data.json \
  --output text --query CiphertextBlob | base64 -d > data.encrypted

# Generate data key (DEK) — returns plaintext + encrypted copy
aws kms generate-data-key \
  --key-id alias/my-key \
  --key-spec AES_256 \
  --output text --query '[Plaintext, CiphertextBlob]'

# Use the plaintext DEK to encrypt your data (local operation)
# Use the encrypted DEK to store alongside the ciphertext
# Decrypt: call KMS with encrypted DEK → get plaintext DEK → decrypt data (local)
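
The three steps above can be sketched locally. This is a structural illustration only, under loud assumptions: `FakeKMS` is a hypothetical in-memory stand-in for KMS, and `_keystream_xor` is a toy cipher built from SHA-256 that stands in for AES-GCM and must not be used on real data. What it shows is the shape of the envelope: the master key stays inside the KMS object, and only the encrypted DEK is stored with the ciphertext.

```python
import hashlib
import secrets


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (XOR with SHA-256 blocks). Stand-in for AES-GCM; not secure."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hashlib.sha256(key + nonce + counter).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


class FakeKMS:
    """Hypothetical local stand-in for KMS: the master key never leaves this object."""

    def __init__(self):
        self._master_key = secrets.token_bytes(32)

    def generate_data_key(self):
        """Return (plaintext DEK, DEK encrypted under the master key)."""
        dek = secrets.token_bytes(32)
        nonce = secrets.token_bytes(16)
        return dek, nonce + _keystream_xor(self._master_key, nonce, dek)

    def decrypt(self, encrypted_dek: bytes) -> bytes:
        """Unwrap an encrypted DEK produced by generate_data_key."""
        nonce, body = encrypted_dek[:16], encrypted_dek[16:]
        return _keystream_xor(self._master_key, nonce, body)


def envelope_encrypt(kms, data: bytes) -> dict:
    dek, encrypted_dek = kms.generate_data_key()
    nonce = secrets.token_bytes(16)
    ciphertext = _keystream_xor(dek, nonce, data)  # local operation, no KMS call
    # Only the encrypted DEK is kept; the plaintext DEK is dropped when this returns.
    return {"encrypted_dek": encrypted_dek, "nonce": nonce, "ciphertext": ciphertext}


def envelope_decrypt(kms, envelope: dict) -> bytes:
    dek = kms.decrypt(envelope["encrypted_dek"])   # the only KMS call on the read path
    return _keystream_xor(dek, envelope["nonce"], envelope["ciphertext"])


kms = FakeKMS()
envelope = envelope_encrypt(kms, b'{"name": "example", "ssn": "000-00-0000"}')
print(envelope_decrypt(kms, envelope))
```

In a real implementation the local cipher would be an authenticated mode such as AES-GCM from a vetted library, and `FakeKMS` would be replaced by calls to KMS GenerateDataKey and Decrypt.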

Security Hub and GuardDuty

GuardDuty is a continuous threat detection service. It analyzes CloudTrail management events (API calls), VPC Flow Logs (network traffic), EKS audit logs, and DNS query logs. It produces findings, each with a severity (Low, Medium, High, Critical) and remediation guidance. Findings can trigger EventBridge rules → SNS → PagerDuty.
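
As a sketch of the first hop in that chain, the snippet below builds an EventBridge event pattern that matches GuardDuty findings with a numeric severity of 7 or higher, plus a local approximation of the match for testing. The threshold of 7 and the severity bands in `severity_label` are assumptions to check against the GuardDuty documentation; the function names are hypothetical.

```python
import json

# EventBridge pattern: GuardDuty findings whose numeric severity is >= 7.
HIGH_SEVERITY_PATTERN = {
    "source": ["aws.guardduty"],
    "detail-type": ["GuardDuty Finding"],
    "detail": {"severity": [{"numeric": [">=", 7]}]},
}


def severity_label(score: float) -> str:
    """Map a numeric severity to a label (bands are an assumption, verify before use)."""
    if score >= 9.0:
        return "Critical"
    if score >= 7.0:
        return "High"
    if score >= 4.0:
        return "Medium"
    return "Low"


def should_page(event: dict, threshold: float = 7.0) -> bool:
    """Local approximation of what the pattern above would match."""
    return (
        event.get("source") == "aws.guardduty"
        and event.get("detail-type") == "GuardDuty Finding"
        and event.get("detail", {}).get("severity", 0) >= threshold
    )


# JSON string suitable for the --event-pattern argument of `aws events put-rule`.
print(json.dumps(HIGH_SEVERITY_PATTERN))
```

Filtering by severity in the rule itself, rather than in the subscriber, keeps low-severity findings from ever reaching the paging path.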

Security Hub aggregates findings from GuardDuty, Inspector, Config, Macie, and third-party tools into a single dashboard. It checks your accounts against the AWS Foundational Security Best Practices standard and the CIS AWS Foundations Benchmark, giving you a security score for each enabled standard.

FAQ

Should I put my Lambda in a VPC?

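Only if it needs to reach resources inside the VPC: RDS instances, ElastiCache, or internal APIs behind PrivateLink. A VPC-attached Lambda connects through Hyperplane ENIs, which are created when the function is created or its VPC configuration changes and are shared by all execution environments that use the same subnet and security group combination. The ENI is therefore not attached during the cold start, so the extra cold-start latency is small, and IP address consumption scales with the number of subnet and security group combinations rather than with peak concurrency. The tradeoff: a VPC-attached function has no default internet access, so reaching the public internet requires a NAT gateway, and reaching AWS services requires either a NAT gateway or VPC endpoints.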

What is the difference between S3 eventual consistency and DynamoDB eventual consistency?

S3 no longer has an eventual-consistency mode. Since December 2020 it provides strong read-after-write consistency for all PUTs (new keys and overwrites), DELETEs, and list operations, at no extra cost: once a write succeeds, the next read returns the new data. S3 does not serialize concurrent writers, though, so simultaneous PUTs to the same key resolve as last-writer-wins; enable versioning if you need to keep every write. DynamoDB reads are eventually consistent by default, with strongly consistent reads available at 2x the RCU cost. If you update a DynamoDB item and immediately read it with the default setting, you might see the old value. The practical mitigation: set ConsistentRead=true on reads that must see the latest write, and use conditional writes (ConditionExpression) for updates that must be idempotent or sequential.
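
The 2x figure follows from how read capacity is metered. A minimal sketch of the arithmetic for provisioned capacity, assuming one RCU covers one strongly consistent read per second of an item up to 4 KB (taken here as 4096 bytes) and an eventually consistent read costs half of that; the function name is hypothetical:

```python
import math

RCU_BLOCK_BYTES = 4096  # one RCU covers a strongly consistent read of up to 4 KB


def read_capacity_units(item_size_bytes: int, reads_per_second: int,
                        strongly_consistent: bool) -> int:
    """RCUs needed to sustain the given read rate for items of the given size."""
    blocks_per_read = math.ceil(item_size_bytes / RCU_BLOCK_BYTES)
    strong_units = blocks_per_read * reads_per_second
    # Eventually consistent reads consume half the capacity, rounded up.
    return strong_units if strongly_consistent else math.ceil(strong_units / 2)


# 6 KB items rounded up to two 4 KB blocks, read 100 times per second:
print(read_capacity_units(6 * 1024, 100, strongly_consistent=True))   # 200
print(read_capacity_units(6 * 1024, 100, strongly_consistent=False))  # 100
```

Because item size rounds up to the next 4 KB block, keeping hot items just under a block boundary can matter more to cost than the consistency mode does.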

How do I choose between Aurora, DynamoDB, and ElastiCache?

Aurora: structured relational data, complex queries (joins, aggregations), PostgreSQL/MySQL compatibility, ACID transactions, team SQL expertise. DynamoDB: semi-structured or key-value data, very high throughput (millions of ops/s), predictable single-digit ms latency, no JOINs, scales to zero request cost in on-demand mode. ElastiCache: caching layer in front of either (or standalone), data that expires or is rebuildable, session storage. The decision tree: do you need SQL or joins? → Aurora. Is your access pattern "get by known key" or "query within a known partition"? → DynamoDB (full-table scans are slow and expensive there, so a scan-heavy workload is a warning sign, not a fit). Do you have hot keys that cause DynamoDB throttling? → ElastiCache in front.

What's the real difference between VPC Endpoint (gateway) and Interface Endpoint?

A gateway endpoint is a route-table target for a specific AWS service (S3 or DynamoDB only). It's free, applies to every subnet associated with that route table, and keeps traffic on the AWS network. An interface endpoint (PrivateLink) is an ENI with a private IP in your subnet that proxies traffic to an AWS service (Secrets Manager, Systems Manager, CloudWatch, etc.). It is secured by a security group and is billed per hour per AZ plus a per-GB processing charge. Use gateway endpoints for S3/DynamoDB traffic from private subnets; use interface endpoints for the other services that need private access from the VPC.

How does AWS charge for data transfer, and what are the common mistakes?

Data transfer pricing is the most commonly misunderstood AWS cost. The basics: data transfer within a single AZ over private IPs is free. Data transfer between AZs is $0.01/GB in each direction, so a cross-AZ exchange is effectively billed at $0.02/GB. Data transfer between regions is roughly $0.02-$0.12/GB depending on the region pair. Data transfer out to the internet from a region has tiered pricing (first 10 TB/month at ~$0.09/GB in us-east-1). Mistakes: (1) reaching AWS services in the same region over public endpoints instead of PrivateLink; (2) not using gateway endpoints for S3 in private subnets, and so paying NAT gateway data processing charges for S3 traffic; (3) inter-AZ traffic from Lambda to RDS when a read replica in the same AZ would suffice.
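
A back-of-the-envelope calculator makes the mistakes concrete. The rates below are illustrative us-east-1 figures and should be treated as assumptions: the cross-AZ rate assumes $0.01/GB is billed on both the sending and the receiving side, and the NAT gateway data processing rate of $0.045/GB does not come from the text above. The path names and function name are hypothetical.

```python
# Illustrative USD-per-GB rates; assumptions to verify against current AWS pricing.
PRICE_PER_GB = {
    "same_az": 0.00,
    "cross_az": 0.01 * 2,         # $0.01 leaving one AZ + $0.01 entering the other
    "internet_out": 0.09,         # first 10 TB/month tier
    "nat_gateway": 0.045,         # NAT data processing (assumed rate)
    "s3_gateway_endpoint": 0.00,  # gateway endpoints carry no data charge
}


def monthly_cost(gb_per_month: float, path: str) -> float:
    """Estimated monthly data transfer cost in USD for one traffic path."""
    return round(gb_per_month * PRICE_PER_GB[path], 2)


# 10 TB/month of S3 traffic from a private subnet:
print(monthly_cost(10_000, "nat_gateway"))          # 450.0
print(monthly_cost(10_000, "s3_gateway_endpoint"))  # 0.0
# 5 TB/month of application-to-database traffic across AZs:
print(monthly_cost(5_000, "cross_az"))              # 100.0
```

The NAT comparison is the one that usually pays for itself first, since adding a gateway endpoint is a route-table change with no per-GB cost.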

When should I use AWS Batch vs Lambda vs ECS Fargate for a long-running job?

Lambda: job duration <15 minutes (a hard limit), demand is event-driven and sporadic (few invocations or bursty). Lambda scales out quickly within your account's concurrency limits and you pay only for actual execution time. AWS Batch: job duration >15 minutes or requires GPUs, with a queue of work and variable demand. Batch manages a compute environment (EC2 or Fargate), holds work in its own job queues (not SQS), and can use Spot instances for cost savings that are often 60-70% and can reach about 90%. Fargate: long-running services that need containers but don't fit Lambda's model. Use Fargate when you have a steady baseline of traffic, need more control over the runtime environment than Lambda allows, or have jobs that run for hours. The key question: is your job triggered by an event with variable demand? → Lambda. Is it queued with predictable but bursty demand? → AWS Batch. Is it a steady-state service? → Fargate.
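
The same decision rules can be written down as a small function. This is only one way to order the questions, and it is a sketch rather than an official selection guide: the parameter names are hypothetical, and real choices also depend on cost, packaging, and team constraints that the function ignores.

```python
LAMBDA_MAX_SECONDS = 15 * 60  # Lambda's maximum execution time


def choose_compute(duration_seconds: int, needs_gpu: bool = False,
                   steady_state_service: bool = False, queued: bool = False) -> str:
    """Pick Lambda, AWS Batch, or Fargate using the rules described above."""
    if steady_state_service:
        return "Fargate"    # steady baseline of traffic, long-lived containers
    if needs_gpu or duration_seconds > LAMBDA_MAX_SECONDS:
        return "AWS Batch"  # outside what Lambda can run
    if queued:
        return "AWS Batch"  # a backlog of work with bursty demand
    return "Lambda"         # short, event-driven, sporadic


print(choose_compute(30))                               # Lambda
print(choose_compute(2 * 60 * 60))                      # AWS Batch
print(choose_compute(45, needs_gpu=True))               # AWS Batch
print(choose_compute(0, steady_state_service=True))     # Fargate
```

Checking the steady-state case first reflects the text's framing: a service that is always on is a Fargate workload regardless of how long any single request takes.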