DEV Community

# dataengineering

Posts

Running Apache Airflow + Docker for Free Using GitHub Codespaces

1 min read
Why Big Tech is Migrating from Traditional Databases to NewSQL

1 min read
Organizing How to Use AWS Lake Formation

9 min read
ETL Pipeline: Fetching Real-Time News Data with Python and Postgres

4 min read
Data Engineering Pipeline: Understanding ETL vs ELT

2 min read
I built a mini columnar storage engine in pure Python — and finally understood why Parquet is so fast

6 min read
One command from your laptop to Kubernetes — no CI pipeline

7 min read
Why I Don’t Let the LLM Decide Issue State

5 min read
Your Scraper Collected 50 Rows. There Were 4,000.

7 min read
Deeper into Dataform 3: Auditing Dataform

2 min read
How I Broke Down My ETL Pipeline Project Into Smaller Engineering Exercises

2 min read
I built a data-contract validator in pure Python (no pandas, no PyYAML) and it caught a 30% revenue ghost

6 min read
A practical pipeline for turning messy business documents into spreadsheets

2 min read
Day 9 of 100 Days of ClickHouse®: Mastering Data Aggregation from GROUP BY to CUBE

4 min read
Your Scraper Died at Row 12,000. The Rerun Pattern.

13 min read