
Using Puppeteer on AWS Lambda for Headless Automation

Written by Goutham
Feb 13, 2026
6 Min Read

Web automation often reaches a point where managing servers becomes more effort than the automation itself. I’m writing this guide for teams who want browser-level automation without maintaining long-running infrastructure.

This article explains how Puppeteer fits into a serverless model using AWS Lambda, why that combination works well for headless automation, and how to structure it correctly for reliability, scalability, and cost control.

Key features of Puppeteer

Puppeteer is a Node.js library that controls Chrome or Chromium through the DevTools Protocol. Its key features include:

  • Programmatic control over a headless Chrome or Chromium instance
  • Realistic user interaction with DOM elements, navigation, and events
  • Native support for screenshots and PDF generation

These capabilities make Puppeteer suitable for automation tasks that require real browser behavior rather than HTTP-level simulation.

Common use cases for Puppeteer range from data extraction and market research to automated testing of web applications and generating pre-rendered content for static websites.
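As a rough illustration of these capabilities, the sketch below types into a form field, clicks a button, and renders the resulting page to PDF. It assumes an already-launched Puppeteer browser is passed in; the function name renderReportPdf and the selectors #query and #submit are hypothetical placeholders, not part of any real page.

```javascript
// Sketch: drive a page the way a user would, then export it as a PDF.
// `browser` is assumed to be an already-launched Puppeteer Browser instance.
// The selectors '#query' and '#submit' are placeholders for a real page.
async function renderReportPdf(browser, url, searchTerm) {
  const page = await browser.newPage();
  try {
    // Wait until network activity settles so client-rendered content exists.
    await page.goto(url, { waitUntil: 'networkidle2' });

    // Real keyboard and mouse events, not a raw HTTP request.
    await page.type('#query', searchTerm);
    await Promise.all([
      page.waitForNavigation({ waitUntil: 'networkidle2' }),
      page.click('#submit'),
    ]);

    // Resolves with the PDF as binary data.
    return await page.pdf({ format: 'A4', printBackground: true });
  } finally {
    // Always release the tab, even when a step above throws.
    await page.close();
  }
}
```

The browser is injected rather than launched inside the function, which keeps the interaction logic independent of how Chromium is started.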

Benefits of Serverless Computing

Before we dive into combining Puppeteer with serverless architecture, let's briefly explore the benefits of serverless computing:

  • Scalability: Serverless platforms scale function execution automatically, so automation workloads can grow or shrink with demand, including during traffic spikes, without manual capacity planning.
  • Cost Efficiency: Execution-based billing charges only for the time a function actually runs, which suits sporadic or bursty automation tasks such as scraping, testing, or report generation.
  • Reduced management overhead: Serverless eliminates the need for server provisioning, patching, and maintenance, allowing developers to focus on writing code.
  • Event-driven execution: Serverless functions can be triggered by various events, enabling responsive and efficient application architectures.
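To make the event-driven point concrete, here is a minimal sketch of how one automation function might accept work from several triggers. It normalises three common event shapes (an SQS batch, an API Gateway proxy request, and a direct invocation) into a list of URLs. The url field name and the function name extractUrls are assumptions of this example, not AWS conventions.

```javascript
// Sketch: normalise a few common Lambda event shapes into a list of URLs.
// The `url` field name is an assumption of this example.
function extractUrls(event) {
  // SQS batch: each record carries a JSON string in `body`.
  // Malformed JSON in a body will throw; records without a string body are skipped.
  if (Array.isArray(event?.Records)) {
    return event.Records
      .filter((record) => typeof record.body === 'string')
      .map((record) => JSON.parse(record.body).url)
      .filter(Boolean);
  }
  // API Gateway proxy request: URL passed as ?url=... in the query string.
  if (event?.queryStringParameters?.url) {
    return [event.queryStringParameters.url];
  }
  // Direct invocation or scheduled rule with a custom JSON payload.
  if (event?.url) {
    return [event.url];
  }
  return [];
}
```

With this in place, the same browser logic can sit behind a queue, an HTTP endpoint, or a schedule without changes.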

Maximizing Serverless Puppeteer Automation

Running Puppeteer on AWS Lambda enables on-demand browser automation without persistent or dedicated servers. This model works best for short-lived, event-driven workloads where startup cost is acceptable and parallel execution is beneficial.
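One way to take advantage of parallel execution is to split a large job into small batches and hand each batch to its own invocation. The helper below is only a sketch of that idea; the name toBatches and the { urls: [...] } payload shape are hypothetical.

```javascript
// Sketch: split a URL list into batches, one batch per Lambda invocation.
// The `{ urls: [...] }` payload shape is an assumption of this example.
function toBatches(urls, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < urls.length; i += batchSize) {
    batches.push({ urls: urls.slice(i, i + batchSize) });
  }
  return batches;
}
```

Each resulting object could be sent as the payload of a separate asynchronous invocation or as one queue message, so every batch stays short enough to finish well inside the function timeout.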

Puppeteer on AWS Lambda: when it works best

The benefits include:

  • Reduced infrastructure costs.
  • Automatic scaling to handle varying workloads.
  • Simplified deployment and management.

Moreover, by utilizing layers in serverless platforms, we can enhance the reusability and modularity of our Puppeteer-based functions, making it easier to maintain and update our automation scripts.

Setting up Puppeteer on AWS Lambda

Setting up Puppeteer on AWS Lambda requires aligning runtime, dependencies, and execution limits to support headless browser startup and execution within Lambda constraints.

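Step 1. Install Puppeteer: The handler in this guide imports puppeteer-core and @sparticuz/chromium, so install both with npm (npm install puppeteer-core @sparticuz/chromium).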

Step 2. Choose a serverless platform: Popular options include AWS Lambda, Azure Functions, and Google Cloud Functions. For this guide, we are going to use AWS Lambda.

Step 3. Set up the AWS CLI and configure your credentials.

Step 4. Create a new Lambda function and configure it to use the Node.js runtime.


Creating Serverless Functions with Puppeteer

This example demonstrates a minimal Lambda handler that launches Chromium, navigates to a target URL, and returns a screenshot. The focus is on correct browser initialization, executable path resolution, and graceful shutdown.

import chromium from '@sparticuz/chromium';
import puppeteer from 'puppeteer-core';

export const handler = async (event) => {
    let browser = null;
    let result = null;

    try {
        // Launch the Lambda-compatible Chromium build provided by the layer.
        browser = await puppeteer.launch({
            args: chromium.args,
            defaultViewport: chromium.defaultViewport,
            // Layers are extracted under /opt, so point at the layer's bin directory.
            executablePath: await chromium.executablePath('/opt/nodejs/node_modules/@sparticuz/chromium/bin'),
            headless: chromium.headless,
            ignoreHTTPSErrors: true,
        });

        const page = await browser.newPage();

        // Use the URL from the event, or fall back to a default page.
        await page.goto(event.url || 'https://example.com');

        // Capture the page as a base64 string so it can travel in a JSON response.
        const screenshot = await page.screenshot({ encoding: 'base64' });

        result = {
            statusCode: 200,
            headers: {
                'Content-Type': 'image/png',
            },
            body: screenshot,
            isBase64Encoded: true,
        };
    } catch (error) {
        console.error(error);
        result = {
            statusCode: 500,
            body: JSON.stringify({ error: error.message }),
        };
    } finally {
        // Always close the browser so no Chromium process outlives the invocation.
        if (browser) {
            await browser.close();
        }
    }

    return result;
};

This function takes a URL as input, navigates to the web page, and returns a base64-encoded screenshot. Note that it imports @sparticuz/chromium because the Chromium binary comes from a prebuilt Lambda layer that packages it, attached to the function by ARN as described in the next section.
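Because the URL arrives in the event, it may be worth checking it before passing it to page.goto. The sketch below is one possible guard, not part of the handler above: it accepts only absolute http or https URLs and falls back to the same default the handler uses. The function name resolveTargetUrl is hypothetical.

```javascript
// Sketch: accept only absolute http(s) URLs before handing them to page.goto.
// `fallback` mirrors the handler's default of https://example.com.
function resolveTargetUrl(input, fallback = 'https://example.com') {
  if (input === undefined || input === null || input === '') {
    return fallback;
  }
  let parsed;
  try {
    parsed = new URL(input); // throws on relative or malformed input
  } catch {
    throw new Error(`Invalid URL: ${input}`);
  }
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  return parsed.href;
}
```

In the handler, this would replace the event.url || 'https://example.com' expression, with the thrown error surfacing through the existing catch block.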

Leveraging Layers for Modularity and Reusability

Lambda layers enable the separation of Puppeteer and Chromium dependencies from function logic. This approach reduces deployment size, simplifies updates, and improves maintainability, which is especially useful when multiple functions require Puppeteer. For Puppeteer, we can create a layer containing Puppeteer and its dependencies:

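1. Create a directory for your layer

Lambda extracts layers under /opt and, for Node.js, resolves packages from nodejs/node_modules, so the dependencies need to sit inside a nodejs folder.

mkdir -p puppeteer-layer/nodejs && cd puppeteer-layer/nodejs

2. Initialize a new Node.js project and install Puppeteer

The handler imports puppeteer-core, which does not download its own browser, so that is the package to install.

npm init -y
npm install puppeteer-core

3. Zip the nodejs folder

cd .. && zip -r puppeteer-layer.zip nodejs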

4. Upload this zip file as a new Lambda layer

In the AWS Lambda console, open the Layers page, create a new layer, and upload the zip file.

5. Attach the layers to the Lambda functions that require Puppeteer

Open the function, click Add a layer, choose Specify an ARN, paste the ARN below, then click Verify and Add. This attaches the prebuilt Chromium layer.

Note: Replace the region and the layer version number in the ARN to match your configuration.

 arn:aws:lambda:ap-south-1:764866452798:layer:chrome-aws-lambda:46

Ref: https://github.com/shelfio/chrome-aws-lambda-layer?tab=readme-ov-file
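Since the ARN differs only by region and layer version, it can be assembled in code, for example in a deployment script. This is a small sketch under the assumption that the account ID and layer name stay as in the ARN above; the function name chromiumLayerArn is hypothetical, and the versions actually published for each region should be checked in the referenced repository.

```javascript
// Sketch: assemble the Chromium layer ARN from region and layer version.
// The account ID and layer name are copied from the ARN shown above;
// which versions exist in which region is not verified here.
function chromiumLayerArn(region, version) {
  // Loose shape check for region codes such as ap-south-1 or us-east-1.
  if (!/^[a-z]{2}(-[a-z]+)+-\d+$/.test(region)) {
    throw new Error(`Unexpected region format: ${region}`);
  }
  if (!Number.isInteger(version) || version < 1) {
    throw new Error('version must be a positive integer');
  }
  return `arn:aws:lambda:${region}:764866452798:layer:chrome-aws-lambda:${version}`;
}
```

Calling chromiumLayerArn('ap-south-1', 46) reproduces the ARN listed above.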

Repeat the process to add the custom layer created earlier. Layers are extracted under /opt at runtime, so the executable path in the function must start with /opt/ for the function to find the files the layers provide.

By using layers, you can keep your function code lean and easily update Puppeteer across all your functions by updating the layer.

Now, if you check the function's layers, both layers should be listed: the Chromium layer and the custom Puppeteer layer.


Headless Chromium requires sufficient memory and startup time. The function has to launch a browser and then perform the automation, so Lambda’s default timeout of 3 seconds is not enough. Allocating at least 2 GB of RAM and extending the timeout keeps browser initialization and execution stable within Lambda’s runtime limits. To change both values, go to the Configuration tab of the same Lambda function and edit the general configuration.

After the configuration, the function should show the updated memory and timeout values.

After saving the changes, click the Deploy button and run the function with the Test button.
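The same settings can also be applied from code instead of the console. The sketch below only builds the parameter object for Lambda's UpdateFunctionConfiguration API; the 60-second timeout is an assumed starting point (the right value depends on the pages being automated), and the function name browserFunctionConfig is hypothetical.

```javascript
// Sketch: parameters for Lambda's UpdateFunctionConfiguration API call.
// MemorySize is in MB and Timeout is in seconds.
// The 60-second default is an assumption, not a recommendation from AWS.
function browserFunctionConfig(functionName, { memoryMb = 2048, timeoutSeconds = 60 } = {}) {
  // Lambda allows 128 to 10,240 MB of memory and a timeout of up to 900 seconds.
  if (!Number.isInteger(memoryMb) || memoryMb < 128 || memoryMb > 10240) {
    throw new RangeError('memoryMb must be an integer between 128 and 10240');
  }
  if (!Number.isInteger(timeoutSeconds) || timeoutSeconds < 1 || timeoutSeconds > 900) {
    throw new RangeError('timeoutSeconds must be an integer between 1 and 900');
  }
  return {
    FunctionName: functionName,
    MemorySize: memoryMb,
    Timeout: timeoutSeconds,
  };
}
```

The returned object can be passed to UpdateFunctionConfigurationCommand from @aws-sdk/client-lambda, or the same values can be given to the aws lambda update-function-configuration CLI command.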

FAQs

1. Is Puppeteer suitable for AWS Lambda?

Yes. Puppeteer works well on Lambda when paired with a compatible Chromium build and sufficient memory allocation.

2. Why use Puppeteer instead of HTTP-based scraping?

Puppeteer is preferred when JavaScript execution, dynamic rendering, or real user interaction is required.

3. What are Lambda layers used for in Puppeteer setups?

Layers separate heavy dependencies like Chromium from function code, simplifying deployment and updates.

4. What are the main limitations of Puppeteer on Lambda?

Cold starts, memory constraints, and execution time limits must be managed carefully.

5. Is Puppeteer on Lambda suitable for long-running automation?

No. Lambda is best for short-lived, event-driven automation rather than continuous browser sessions.

Conclusion

Puppeteer on AWS Lambda is a strong choice for browser-based automation when workloads are event-driven, execution is short-lived, and infrastructure management needs to stay minimal. With correct layering and sensible memory and timeout settings, it delivers scalable headless automation without long-running servers, whether you're scraping data, generating reports, or running automated tests.

Ready to get started? Set up your first serverless Puppeteer function today and unlock the potential of scalable web automation!

Goutham

Hey, I’m Goutham - a techie who loves simplifying complex ideas. I design systems by day and break down tech jargon by night, always excited to share how awesome tech can be.
