What is AWS Lambda?

AWS Lambda lets you upload and run code without managing any servers. You upload your code, and it runs on demand when a request is made by an app or service, or when it is triggered automatically, for example by CloudWatch Events rules. AWS charges only for the compute resources actually used, with no upfront fee. You can write your application in a variety of runtimes (Node.js, Python, Go, Ruby, Java and .NET Core). AWS takes care of administering, managing and scaling the backend infrastructure, so you can focus on the code itself.

AWS Lambda cold boot start time and concurrent executions

What is cold boot time?

When the first request arrives after an AWS Lambda function is deployed, AWS fires up a new micro-container and loads your code by bootstrapping the runtime. This incurs a delay on top of the normal execution time. It is called the cold boot time or the cold start time.

A cold boot happens on the first request after deployment, and again after the Lambda function has been idle for a decent amount of time. The exact idle threshold is internal to Lambda and unpublished, but it is commonly reported to be somewhere between 5 and 30 minutes. Until its container is reclaimed, the Lambda function is in the ‘warm state’ and does not require a cold start.
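You can observe this behaviour from inside the function itself. The sketch below (assuming a Python runtime; the handler name is arbitrary) uses the fact that module-level code runs once per container, while the handler runs on every invocation, to report whether a given invocation was a cold start:

```python
import time

# Module-level code executes once, when the container boots
# (i.e. during the cold start); the handler runs per invocation.
CONTAINER_STARTED_AT = time.time()
_invocation_count = 0

def handler(event, context=None):
    global _invocation_count
    _invocation_count += 1
    # Only the first invocation in this container is a cold start;
    # subsequent invocations reuse the warm container.
    return {
        "cold_start": _invocation_count == 1,
        "container_age_s": round(time.time() - CONTAINER_STARTED_AT, 3),
    }
```

Logging this flag (or a timestamp) from a deployed function is a simple way to measure how long your containers actually stay warm.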

The cold start time depends on factors such as the runtime in which the code is written, the package size, whether or not an AWS VPC (Virtual Private Cloud) is involved, and the memory allocated to the Lambda function.
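One factor you control directly is how much initialization work your own code does. A common idiom is to move expensive setup (SDK clients, configuration, model loading) to module level, so it is paid once per container as part of the cold start instead of on every request. A minimal sketch, with a `time.sleep` standing in for slow I/O:

```python
import time

def _load_config():
    # Stand-in for expensive one-time setup (SDK clients, config
    # files, ML model weights). Runs once, at container start.
    time.sleep(0.05)  # simulate slow I/O
    return {"version": 1}

# Paid once per container, during the cold start.
CONFIG = _load_config()

def handler(event, context=None):
    # Warm invocations reuse CONFIG and skip the setup cost entirely.
    return {"config_version": CONFIG["version"]}
```

This makes the cold start itself slightly longer but keeps every warm invocation fast, which is usually the right trade-off.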

Concurrent Executions

The above explains cold starts in the simplest of scenarios, but to go a level deeper we need to be aware of another phenomenon: concurrent executions.

Concurrent executions are the requests being handled by an AWS account at roughly the same moment, in parallel. An AWS account has a default concurrent execution quota of 1,000 per region, and all the Lambda functions in the account share this limit. Lambda sends only one request to a container at any given moment and waits for it to finish executing before sending that container another request.

The AWS Lambda cold start penalty does not apply only to the very first request: it applies to each concurrent request, because Lambda has to fire up a separate container for each one. So if, after an idle period, you receive 10 concurrent requests, all 10 of them suffer the cold start time.
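The arithmetic can be captured in a tiny helper. This is a simplified model of Lambda's routing, not an AWS API: one request per container at a time, so any burst beyond the warm pool forces fresh (cold) containers:

```python
def cold_starts_for_burst(concurrent_requests: int, warm_containers: int) -> int:
    """How many requests in a simultaneous burst pay a cold start.

    Lambda sends one request to a container at a time, so a burst of
    N truly simultaneous requests needs N containers; any shortfall
    against the warm pool is covered by freshly booted containers,
    each of which incurs the cold start delay.
    """
    return max(0, concurrent_requests - warm_containers)
```

With no warm containers, a burst of 10 means 10 cold starts; with 4 containers kept warm, only 6 requests pay the penalty.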

This is why we may want to keep our AWS Lambda functions warm. The way to do that is to send periodic, regular requests to the functions so that they don’t remain idle for long and don’t have their containers reclaimed.
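On the function side, the handler can short-circuit these warming pings so they cost almost nothing. A minimal sketch, assuming the scheduled event (e.g. from a CloudWatch Events rule) carries a marker field; the field name "warmer" is an arbitrary convention of this sketch, not a Lambda built-in:

```python
def handler(event, context=None):
    # Scheduled warming pings carry a marker field that real
    # requests never set; detect it and return immediately.
    if event.get("warmer"):
        return {"warmed": True}  # exit fast; the container stays warm

    # ... normal request handling goes here ...
    return {"result": "handled real request"}
```

Returning early keeps the billed duration of each ping to a few milliseconds while still resetting the container's idle timer.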

Suppose you have a customer-facing app. You can analyse the peak times at which requests hit your resources the hardest. Say the peak is between 5 and 11 PM; you can then set up a trigger that warms the Lambda function at around 4:50 PM with the number of concurrent requests you expect. That way your customers don’t have to bear the latency involved with cold starts.
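Warming for expected concurrency means the pings themselves must overlap in time: sequential pings would just reuse one container instead of booting several. A sketch of that fan-out, with the invoke call injected as a parameter (in production it would wrap boto3's `lambda` client, e.g. `boto3.client("lambda").invoke(...)`; the function name and payload shape here are hypothetical):

```python
from concurrent.futures import ThreadPoolExecutor

def prewarm(invoke, function_name: str, concurrency: int):
    """Fire `concurrency` simultaneous warming invocations.

    `invoke` is a callable (function_name, payload) -> result,
    injected so the sketch stays self-contained. The invocations
    must run in parallel: each in-flight request occupies one
    container, forcing Lambda to boot `concurrency` of them.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [
            pool.submit(invoke, function_name, {"warmer": True})
            for _ in range(concurrency)
        ]
        return [f.result() for f in futures]
```

A CloudWatch Events / EventBridge schedule can then run this pre-warmer shortly before the daily peak.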

Similar tactics can be put in place to keep our functions warm when traffic is less predictable. One open-source tool that can come in handy is X-Lambda.