
AWS Glue — Serverless ETL

AWS's serverless data integration service: discover, prepare and move data between your sources and your data lake without managing infrastructure.

What is AWS Glue?

AWS Glue is a serverless ETL (Extract, Transform, Load) service: it connects your data sources, automatically catalogs their structure and runs the transformation jobs that feed your data lake or data warehouse, without you provisioning or maintaining servers.

Its Data Catalog works as a central index of all your datasets, queryable by services like Athena and Redshift. Because Glue is serverless, compute scales with each job's volume and you are charged for execution time rather than for idle servers, which makes it cost-efficient for pipelines that do not run around the clock.
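To make the pay-for-execution-time idea concrete, here is a minimal cost sketch in Python. It assumes jobs are metered in DPU-hours; the rate of 0.44 USD per DPU-hour and the 60-second billing minimum are illustrative assumptions, not figures from this page, so check the current AWS Glue pricing for your region before relying on them. The function name `estimate_job_cost` is hypothetical.

```python
def estimate_job_cost(dpus: float, runtime_seconds: float,
                      usd_per_dpu_hour: float = 0.44,       # assumed rate, varies by region
                      minimum_billed_seconds: float = 60.0  # assumed billing minimum
                      ) -> float:
    """Estimate the cost in USD of one job run billed by execution time."""
    # Runs shorter than the minimum are billed as if they lasted the minimum.
    billed_seconds = max(runtime_seconds, minimum_billed_seconds)
    dpu_hours = dpus * billed_seconds / 3600.0
    return round(dpu_hours * usd_per_dpu_hour, 4)


# Hypothetical nightly job: 10 DPUs for 15 minutes, once a day.
per_run = estimate_job_cost(dpus=10, runtime_seconds=15 * 60)
print(f"per run: ${per_run:.2f}, per 30 days: ${per_run * 30:.2f}")
```

Under these assumptions, a job that runs a few minutes a day costs a small fraction of what an always-on cluster of the same size would.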

Use cases

What AWS Glue is used for

Feed a data lake

Ingest and transform data from multiple sources into S3 in an analytics-ready format.
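"Analytics-ready" often means a columnar format laid out in date partitions, so query engines can skip data they do not need. The sketch below builds a Hive-style partitioned S3 prefix for one day of data. The bucket name, dataset name and the `year=/month=/day=` layout are assumptions for illustration, not something this page prescribes.

```python
from datetime import date


def partitioned_prefix(bucket: str, dataset: str, day: date) -> str:
    """Return a Hive-style partitioned S3 prefix for one day of a dataset."""
    return (
        f"s3://{bucket}/{dataset}/"
        f"year={day.year:04d}/month={day.month:02d}/day={day.day:02d}/"
    )


# Hypothetical bucket and dataset names.
print(partitioned_prefix("example-data-lake", "orders", date(2024, 3, 7)))
```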

Data catalog

Centralize the schema and metadata of all datasets so analysts can discover and query them.
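As a rough illustration of what "schema and metadata" means in practice, the sketch below assembles a table definition for a Parquet dataset on S3, in the dictionary shape that the Glue `create_table` API accepts as `TableInput`. The helper `build_table_input`, the table name, the columns and the S3 location are all hypothetical. The code only builds the dictionary and does not call AWS.

```python
def build_table_input(name, location, columns, partition_keys=()):
    """Build a Data Catalog table definition for a Parquet dataset.

    `columns` and `partition_keys` are sequences of (name, type) pairs.
    """
    return {
        "Name": name,
        "TableType": "EXTERNAL_TABLE",
        "Parameters": {"classification": "parquet"},
        "PartitionKeys": [{"Name": n, "Type": t} for n, t in partition_keys],
        "StorageDescriptor": {
            "Columns": [{"Name": n, "Type": t} for n, t in columns],
            "Location": location,
            "InputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe",
            },
        },
    }


# Hypothetical dataset. With boto3 installed and credentials configured,
# this could be registered with something like:
#   boto3.client("glue").create_table(DatabaseName="analytics", TableInput=table)
table = build_table_input(
    name="orders",
    location="s3://example-data-lake/orders/",
    columns=[("order_id", "string"), ("amount", "double")],
    partition_keys=[("year", "string"), ("month", "string")],
)
print([c["Name"] for c in table["StorageDescriptor"]["Columns"]])
```

In many projects a Glue crawler infers this definition automatically; writing it by hand is mainly useful when you want the schema under version control.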

Prep for analytics and AI

Clean and structure data before loading it into Redshift or using it in machine learning models.
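The kind of cleaning meant here can be as simple as normalizing values, dropping unusable rows and removing duplicates. The sketch below shows that logic in plain Python on hypothetical customer records; the field names `email` and `country` are assumptions. In a real Glue job the same rules would usually be expressed on a Spark DataFrame so they scale to large datasets.

```python
def clean_records(records):
    """Normalize, filter and de-duplicate records keyed by email."""
    seen = set()
    cleaned = []
    for record in records:
        # Normalize the key: trim whitespace and lowercase.
        email = (record.get("email") or "").strip().lower()
        # Drop rows without a usable key, and repeated keys.
        if not email or email in seen:
            continue
        seen.add(email)
        country = (record.get("country") or "unknown").strip().upper()
        cleaned.append({"email": email, "country": country})
    return cleaned


# Hypothetical raw input with a duplicate, a missing key and a missing field.
rows = [
    {"email": " Ana@Example.com ", "country": "pe"},
    {"email": "ana@example.com", "country": "PE"},
    {"email": None, "country": "cl"},
    {"email": "luis@example.com"},
]
for row in clean_records(rows):
    print(row)
```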

System integration

Move data between operational databases and analytics platforms on a schedule.
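Running "on a schedule" in Glue is typically done with a scheduled trigger that uses an AWS cron expression. The sketch below builds the request for a daily trigger in the shape the Glue `create_trigger` API accepts. The helper `daily_trigger_request` and the trigger and job names are hypothetical, and the code only builds the request without calling AWS.

```python
def daily_trigger_request(trigger_name, job_name, hour_utc, minute=0):
    """Build the request for a trigger that starts a job once a day (UTC)."""
    if not (0 <= hour_utc <= 23 and 0 <= minute <= 59):
        raise ValueError("hour_utc must be 0-23 and minute must be 0-59")
    return {
        "Name": trigger_name,
        "Type": "SCHEDULED",
        # AWS cron fields: minute hour day-of-month month day-of-week year
        "Schedule": f"cron({minute} {hour_utc} * * ? *)",
        "Actions": [{"JobName": job_name}],
        "StartOnCreation": True,
    }


# Hypothetical names. With boto3 installed and credentials configured:
#   boto3.client("glue").create_trigger(**request)
request = daily_trigger_request("nightly-orders", "orders-etl", hour_utc=2)
print(request["Schedule"])
```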

How Caleidos implements it

AWS Glue with an AWS partner

At Caleidos we build data platforms on AWS using Glue for ingestion and transformation pipelines, integrated with a data lake on S3 that analytics and AI workloads can consume directly. It is the heart of our data engineering projects.

Explore Data Engineering →

Frequently asked questions

What does it mean that AWS Glue is serverless?
You do not provision or manage servers: AWS allocates compute capacity when the job runs and releases it afterward. You pay for execution time, not for infrastructure running 24/7.

What is the Glue Data Catalog for?
It is a central index of your datasets and their structure. It lets services like Athena and Redshift query the data without manually defining schemas each time.

Does AWS Glue replace a data team?
No: it is a tool that makes the team more efficient. Designing good pipelines, modeling the data and ensuring quality still require data engineering judgment.

Evaluating AWS Glue for your project?

Tell us what you want to achieve. In 30 minutes we will give you a concrete recommendation.
