AWS Glue

Discover, prepare, and integrate all your data at any scale

1 million objects stored free with the AWS Free Tier

Build and manage a modern data pipeline with a single data integration service.

Keep costs low and focus more on your data at any scale with serverless data integration.

Use your favorite method: drag and drop, write code, or connect using your notebook.

Support various data processing methods and workloads, including ETL, ELT, batch, and streaming.

How it works

AWS Glue is a serverless data integration service that makes it easier to discover, prepare, move, and integrate data from multiple sources for analytics, machine learning (ML), and application development.

Event-driven ETL
AWS Glue can run your extract, transform, and load (ETL) jobs as new data arrives. For example, you can configure AWS Glue to start your ETL jobs as soon as new data becomes available in Amazon Simple Storage Service (Amazon S3).
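
The page does not say how the trigger is wired up. One common pattern is an S3 event notification that invokes an AWS Lambda function, which then starts the job through the AWS Glue API. The sketch below assumes that pattern. The job name `example-etl-job` and the `--source_path` job argument are hypothetical names chosen for illustration, and the Glue client is passed in so the logic can be tried without an AWS account.

```python
from urllib.parse import unquote_plus

# Hypothetical job name; replace with a job defined in your account.
JOB_NAME = "example-etl-job"


def start_jobs_for_new_objects(event, glue_client, job_name=JOB_NAME):
    """Start one Glue job run per object listed in an S3 event notification."""
    run_ids = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        response = glue_client.start_job_run(
            JobName=job_name,
            # Job arguments are "--name": "value" strings;
            # "--source_path" is an illustrative name, not a built-in.
            Arguments={"--source_path": f"s3://{bucket}/{key}"},
        )
        run_ids.append(response["JobRunId"])
    return run_ids


def lambda_handler(event, context):
    """AWS Lambda entry point; boto3 ships with the Lambda Python runtime."""
    import boto3

    return start_jobs_for_new_objects(event, boto3.client("glue"))
```

Inside the job script, the argument would typically be read back with the job's own argument parsing, so each run processes only the object that triggered it.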
AWS Glue Data Catalog
You can use the Data Catalog to quickly discover and search multiple AWS datasets without moving the data. Once the data is cataloged, it is immediately available for search and query using Amazon Athena, Amazon EMR, and Amazon Redshift Spectrum.
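
As a rough illustration of browsing the catalog from code, the sketch below lists the tables in one catalog database with the boto3 Glue client's `get_tables` call and returns each table's name, storage location, and column names. The helper name `list_catalog_tables` is an assumption made for this example, and the client is passed in as a parameter so the function can be exercised with a stand-in object.

```python
def list_catalog_tables(glue_client, database_name):
    """Return name, storage location, and column names for each table in a database."""
    tables = []
    kwargs = {"DatabaseName": database_name}
    while True:
        page = glue_client.get_tables(**kwargs)
        for table in page.get("TableList", []):
            descriptor = table.get("StorageDescriptor", {})
            tables.append(
                {
                    "name": table["Name"],
                    "location": descriptor.get("Location"),
                    "columns": [col["Name"] for col in descriptor.get("Columns", [])],
                }
            )
        # get_tables is paginated; keep going until no token is returned.
        token = page.get("NextToken")
        if not token:
            return tables
        kwargs["NextToken"] = token
```

With real credentials, a call might look like `list_catalog_tables(boto3.client("glue"), "sales_db")`, where `sales_db` is a placeholder database name.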
No-code ETL jobs
AWS Glue Studio makes it easier to visually create, run, and monitor AWS Glue ETL jobs. You can build ETL jobs that move and transform data using a drag-and-drop editor, and AWS Glue automatically generates the code.
Self-service data preparation
With AWS Glue DataBrew, you can explore and experiment with data directly from your data lake, data warehouses, and databases, including Amazon S3, Amazon Redshift, AWS Lake Formation, Amazon Aurora, and Amazon Relational Database Service (RDS). You can choose from over 250 prebuilt transformations in DataBrew to automate data preparation tasks such as filtering anomalies, standardizing formats, and correcting invalid values.

Why AWS Glue?

Preparing your data to obtain quality results is the first step in an analytics or ML project. AWS Glue is a serverless data integration service that makes data preparation simpler, faster, and cheaper. You can discover and connect to over 70 diverse data sources, manage your data in a centralized data catalog, and visually create, run, and monitor ETL pipelines to load data into your data lakes.

Use cases

Simplify ETL pipeline development

Remove infrastructure management with automatic provisioning and worker management, and consolidate all your data integration needs into a single service.

Learn more about AWS Glue Auto Scaling »
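
To make automatic worker management concrete, here is a minimal sketch of a job definition with Auto Scaling turned on, written as keyword arguments for the boto3 Glue client's `create_job` call. It assumes a Spark ETL job on AWS Glue 4.0 with `G.1X` workers. The helper name and its parameters are hypothetical, and the values are examples rather than recommendations.

```python
def autoscaling_job_definition(name, role_arn, script_location, max_workers=10):
    """Build keyword arguments for the boto3 Glue client's create_job call."""
    return {
        "Name": name,
        "Role": role_arn,
        "Command": {
            "Name": "glueetl",  # Spark ETL job type
            "ScriptLocation": script_location,
            "PythonVersion": "3",
        },
        "GlueVersion": "4.0",
        "WorkerType": "G.1X",
        # With Auto Scaling enabled, this acts as the upper bound on workers.
        "NumberOfWorkers": max_workers,
        "DefaultArguments": {"--enable-auto-scaling": "true"},
    }
```

The resulting dictionary could be passed along as `boto3.client("glue").create_job(**autoscaling_job_definition(...))`, with the role ARN and script location filled in from your own account.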

Discover data efficiently

Quickly identify data across multiple AWS datasets, and then make it instantly available for querying and transforming.

Learn more about AWS Glue Data Catalog »

Interactively explore, experiment on, and process data

With AWS Glue interactive sessions, data engineers can explore and prepare data interactively in the integrated development environment (IDE) or notebook of their choice.

Learn more about AWS Glue Interactive Sessions »