AWS Lambda
General
AWS Lambda is a serverless, highly available (no scheduled maintenance), event-driven compute service that lets you run code in any programming language for virtually any type of application, backend service, or for integrating different AWS services. Functions execute on demand and are billed by the millisecond.
Multiple instances of the same function can run simultaneously, but each invocation can run for up to 15 minutes.
Functions can have from 128 MB up to 10 GB of RAM. The amount of RAM is directly related to the CPU and network resources allocated: more RAM means better performance.
EFS file systems can be mounted to Lambda functions.
Main integrations
- API Gateway: for REST APIs;
- Kinesis: for real-time data processing;
- DynamoDB (Streams): for reacting to events in real time. For example, sending a welcome message when a new user is added to the database. DynamoDB can also be used to store state for stateful applications;
- S3: with bucket notifications. For example, for removing PII data (see the handler sketch after this list);
- CloudFront: Lambda@Edge;
- CloudWatch Events / EventBridge: actions triggered by CloudWatch alarms or on a schedule using a cron-like expression;
- SNS and SQS: to react to notifications and messages;
- Cognito: to authenticate users.
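To illustrate the S3 integration, a minimal handler sketch could look like the following; the destination bucket naming and the `strip_pii` helper are hypothetical:

```python
import json
import urllib.parse

import boto3

s3 = boto3.client("s3")


def lambda_handler(event, context):
    """Triggered by an S3 bucket notification (e.g. ObjectCreated events)."""
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        # Download the new object, scrub it, and write the clean copy back.
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        clean = strip_pii(body)  # hypothetical helper that removes PII
        s3.put_object(Bucket=f"{bucket}-clean", Key=key, Body=clean)

    return {"statusCode": 200, "body": json.dumps("ok")}
```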
Lambda Limits
- From 128 MB up to 10 GB of memory;
- Execution time of up to 15 minutes;
- Max of 4 KB for all environment variables;
- 512 MB of temporary storage in the /tmp directory (configurable up to 10 GB), which can also be used for initialization files;
- Up to 1,000 concurrent executions (can be increased via a Support request);
- The deployment package can have up to 50 MB zipped; unzipped, it can have up to 250 MB;
Permissions
- User policies: define whether a user can invoke the function or not;
- Execution Role: an IAM role specifying which AWS services/resources the function is allowed to interact with. The default role grants access to publish to CloudWatch Logs;
- Resource-based policies: allow other services to invoke the Lambda function, as in the sketch after this list.
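For example, a resource-based policy statement allowing S3 to invoke a function could be added with boto3 along these lines (function name, statement ID, and bucket ARN are placeholders):

```python
import boto3

lambda_client = boto3.client("lambda")

# Resource-based policy: allow S3 to invoke the function for a given bucket.
lambda_client.add_permission(
    FunctionName="my-function",            # placeholder
    StatementId="allow-s3-invoke",
    Action="lambda:InvokeFunction",
    Principal="s3.amazonaws.com",
    SourceArn="arn:aws:s3:::my-bucket",    # placeholder bucket ARN
)
```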
Anti-patterns
- Long-running processes;
- Dynamic websites;
- Stateful Applications.
Storage options
| | Ephemeral Storage /tmp | Lambda Layers | Amazon S3 | Amazon EFS |
|---|---|---|---|---|
| Max. Size | 10 GB (512 MB by default) | Up to 5 layers per function, 250 MB total | Elastic | Elastic |
| Persistence | Ephemeral | Durable | Durable | Durable |
| Content | Dynamic | Static | Dynamic | Dynamic |
| Storage Type | File System | Archive | Object | File System |
| Pricing | Included | Included | Storage + Requests + Data Transfer | Storage + Data Transfer + Throughput |
| Permissions | N/A (Function only) | IAM | IAM | IAM + NFS |
| Shared Across Concurrent Invocations | No | Yes | Yes | Yes |
| Use cases | Temporary files | Code dependencies/libraries | Input files | Shared files across multiple Lambda instances |
| Relative Speed | Fastest | Fastest | Slowest | Intermediate |
Concurrency
Concurrency is the number of simultaneous requests (instances of the Lambda function) each function supports. If a function's concurrency is set to zero or the maximum limit is reached, the function is throttled and cannot handle any new requests.
The maximum concurrency is a per-AWS-Region limit, and it is shared by all Lambda functions in that Region.
The ideal concurrency for a function can be calculated by:
\[ \text{Concurrency} = \text{average requests per second} \times \text{average request duration in seconds} \]
This means concurrency is not the same as requests per second. For example, 100 requests per second with an average duration of 0.5 seconds requires a concurrency of 50.
There are two types of concurrency:
- Reserved Concurrency: concurrency reserved exclusively for a single Lambda function (including all of its versions and aliases). This means the pool of shared concurrency is reduced. It is useful both for ensuring other functions do not block this function's executions and for keeping the function from scaling without control.
- Provisioned Concurrency: ensures the configured number of instances is always initialized, minimizing the latency caused by cold starts. The Provisioned Concurrency cannot be bigger than the Reserved Concurrency, and it is set at the version/alias level. The downside is that it incurs costs even if the function is not being used (the instances are active nonetheless).
It is possible to use Application Auto Scaling to change the provisioned concurrency according to time or demand.
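A minimal boto3 sketch for setting both types of concurrency (function name, alias, and numbers are placeholders):

```python
import boto3

lambda_client = boto3.client("lambda")

# Reserve part of the Regional concurrency pool for this function.
lambda_client.put_function_concurrency(
    FunctionName="my-function",            # placeholder
    ReservedConcurrentExecutions=100,
)

# Keep 20 instances initialized for the "prod" alias to avoid cold starts.
lambda_client.put_provisioned_concurrency_config(
    FunctionName="my-function",
    Qualifier="prod",                      # version or alias
    ProvisionedConcurrentExecutions=20,
)
```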
Scaling
Lambda tries to reuse an instance whenever it becomes idle. However, if new requests arrive while every instance is busy, a new instance is created, and every new instance incurs a cold-start delay.
Invocation types
Direct Trigger
Lambdas can be triggered via direct trigger, synchronously or asynchronously, from:
- CLI
- S3
- SNS
- API Gateway
This type of trigger is recommended for discrete events and real-time processing.
Synchronous
The process that started the execution waits until processing has completed. One event is processed at a time.
Synchronous mode is triggered by the `invoke` command.
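A sketch of a synchronous invocation using boto3 (function name and payload are placeholders):

```python
import json

import boto3

lambda_client = boto3.client("lambda")

# RequestResponse = synchronous: the call blocks until the function returns.
response = lambda_client.invoke(
    FunctionName="my-function",                 # placeholder
    InvocationType="RequestResponse",
    Payload=json.dumps({"user_id": 42}),        # placeholder payload
)

result = json.loads(response["Payload"].read())
print(result)
```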
Asynchronous
The event is added to a queue and only the status of this action is returned. The process that triggered the execution does not receive the function's return value.
Several services, such as S3 and SNS, can start an asynchronous execution.
In case of failure in the Lambda function, by default it will retry the event two more times. It is possible to configure how Lambda handles the error, for example, by sending the failed events to an SQS queue, to EventBridge, to another Lambda function, or to an SNS topic, or by modifying the `MaximumEventAgeInSeconds` and `MaximumRetryAttempts` parameters.
For dead-letter queues, Lambda only sends the content of the event, without details about the response.
Be aware that the Lambda Async queue is eventually consistent, so the same message may be processed more than once. Therefore, it is recommended to ensure the function is idempotent.
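A sketch combining an asynchronous invocation with the retry/age/destination configuration mentioned above (function name, values, and the SQS ARN are placeholders):

```python
import json

import boto3

lambda_client = boto3.client("lambda")

# "Event" = asynchronous: only the enqueue status (HTTP 202) is returned.
lambda_client.invoke(
    FunctionName="my-function",              # placeholder
    InvocationType="Event",
    Payload=json.dumps({"user_id": 42}),
)

# Configure retries, maximum event age, and a failure destination.
lambda_client.put_function_event_invoke_config(
    FunctionName="my-function",
    MaximumRetryAttempts=1,
    MaximumEventAgeInSeconds=3600,
    DestinationConfig={
        "OnFailure": {"Destination": "arn:aws:sqs:us-east-1:123456789012:failed-events"}
    },
)
```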
Event Source Mapping
An Event Source Mapping is a Lambda resource that reads items from streams or queues and processes them in batches. It is recommended for processing high-volume data. Supported services are:
- DocumentDB (with MongoDB compatibility)
- DynamoDB
- Kinesis
- Amazon MQ
- Amazon MSK
- Self Managed Apache Kafka
- Amazon SQS
Each event is processed at least once, so it is recommended to have an idempotent function.
It is possible to configure `MaximumBatchingWindowInSeconds` and `BatchSize`, and it is also possible to filter events.
A batch is sent to the Lambda function when one of these three conditions is met:
- The batching window reaches its maximum value;
- The batch size is met.
- The payload size reaches 6 MB. You cannot modify this limit.
In case of failure, the whole batch is retried and, to ensure in-order processing, the event source mapping pauses processing for the affected shard until the error is resolved.
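A sketch of creating an Event Source Mapping for an SQS queue with boto3 (the queue ARN, function name, and filter pattern are placeholders):

```python
import boto3

lambda_client = boto3.client("lambda")

# Poll the queue and deliver records to the function in batches.
lambda_client.create_event_source_mapping(
    EventSourceArn="arn:aws:sqs:us-east-1:123456789012:orders",  # placeholder
    FunctionName="my-function",                                   # placeholder
    BatchSize=10,
    MaximumBatchingWindowInSeconds=5,
    # Optional: only deliver events whose body matches this pattern.
    FilterCriteria={"Filters": [{"Pattern": '{"body": {"type": ["order_created"]}}'}]},
)
```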
Function URL
A dedicated HTTP(S) endpoint for the Lambda function, in the form of https://<url-id>.lambda-url.<region>.on.aws.
It is possible to control access to the Lambda Function URL using the `AuthType` parameter and resource-based policies, but Function URLs are accessible via the public internet only.
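A sketch of creating a Function URL with boto3 (the function name is a placeholder):

```python
import boto3

lambda_client = boto3.client("lambda")

# Create a public HTTPS endpoint; callers must sign requests with IAM (SigV4).
config = lambda_client.create_function_url_config(
    FunctionName="my-function",   # placeholder
    AuthType="AWS_IAM",           # use "NONE" for an unauthenticated URL
)
print(config["FunctionUrl"])      # https://<url-id>.lambda-url.<region>.on.aws/
```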
Lambda Layers
A Lambda Layer is a package that can contain code dependencies and customized libraries, helping to reduce the size of the deployment package and to keep better control over which packages are being used.
Each Lambda function can have a maximum of 5 layers attached to it, and their contents are loaded under `/opt`.
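A sketch of publishing a layer and attaching it to a function (layer name, zip file, function name, and runtime are placeholders):

```python
import boto3

lambda_client = boto3.client("lambda")

# Publish a new layer version from a local zip of dependencies.
with open("dependencies.zip", "rb") as f:            # placeholder zip file
    layer = lambda_client.publish_layer_version(
        LayerName="shared-deps",                      # placeholder
        Content={"ZipFile": f.read()},
        CompatibleRuntimes=["python3.12"],
    )

# Attach the layer to the function; its content becomes available under /opt.
lambda_client.update_function_configuration(
    FunctionName="my-function",                       # placeholder
    Layers=[layer["LayerVersionArn"]],
)
```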
Network
When a function is connected to a VPC, the service creates an ENI for each security group/subnet combination configured for the function. This process can take a couple of minutes, during which the function cannot be changed.
If an ENI stays idle for too long, it is removed from the function and a new one is created when the function is called again. This means the function will probably fail while the ENI is being recreated.
Each ENI can be attached to multiple Lambda functions.
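A sketch of connecting an existing function to a VPC (subnet and security group IDs are placeholders):

```python
import boto3

lambda_client = boto3.client("lambda")

# Connecting the function to a VPC triggers ENI creation for each
# subnet/security group combination; this can take a few minutes.
lambda_client.update_function_configuration(
    FunctionName="my-function",                                  # placeholder
    VpcConfig={
        "SubnetIds": ["subnet-0abc1234", "subnet-0def5678"],     # placeholders
        "SecurityGroupIds": ["sg-0123456789abcdef0"],            # placeholder
    },
)
```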
Lambda + SQS
Supports Standard and FIFO queues, but messages are ordered only on FIFO queues, by `Message Group ID`.
Lambda + Kinesis
When integrated with Kinesis, the function receives a batch of records as an event. The records are processed synchronously and in order. The batch size is configurable up to 10,000 records, but you need to be aware of the function's time limit to avoid timeouts. Additionally, if a batch is greater than 6 MB, it will automatically be split into smaller batches.
The Lambda function will automatically retry any failed batch until it succeeds or the data expires. If errors are not handled properly, they may block the Kinesis shard, which expects a response from the Lambda function.
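A minimal handler sketch for consuming a Kinesis batch (the `process` helper is hypothetical):

```python
import base64
import json


def lambda_handler(event, context):
    """Receives a batch of Kinesis records and processes them in order."""
    for record in event["Records"]:
        # Kinesis data arrives base64-encoded inside the event.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        process(payload)  # hypothetical business logic

    # Raising an exception here causes the whole batch to be retried,
    # blocking the shard until the error is resolved or the data expires.
    return {"batchSize": len(event["Records"])}
```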
Lambda@Edge
Lambda@Edge allows the deployment of Lambda functions together with the CloudFront CDN, on the Edge Locations, for:
- More responsive apps;
- CDN customization.
It is possible to execute the Lambda function at four distinct stages:
- User Request (Viewer Request);
- CloudFront Request (Origin Request);
- Origin Response;
- CloudFront Response (Viewer Response).
sequenceDiagram
    participant u as User
    participant c as CloudFront
    participant o as Origin
    u ->> c: User Request
    c ->> o: CloudFront Request
    o ->> c: Origin Response
    c ->> u: CloudFront Response
It is possible to generate the response to the user without sending the request to the Origin, using only the User Request and CloudFront Response stages, as in the sketch below.
Use cases include SEO, A/B tests, user tracking, analytics, security, etc.
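A minimal sketch of a handler running at the User (Viewer) Request stage that answers directly from the edge, assuming a Python runtime (the path and body are illustrative):

```python
def lambda_handler(event, context):
    """Runs at the User (Viewer) Request stage of a CloudFront distribution."""
    request = event["Records"][0]["cf"]["request"]

    # Answer directly from the edge, never reaching the Origin.
    if request["uri"] == "/ping":
        return {
            "status": "200",
            "statusDescription": "OK",
            "headers": {
                "content-type": [{"key": "Content-Type", "value": "text/plain"}]
            },
            "body": "pong",
        }

    # Otherwise, forward the (possibly modified) request to the Origin.
    return request
```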