1. Cloud, AWS & Data Engineering Fundamentals
• Cloud computing concepts and AWS overview
• Introduction to Data Engineering and modern data pipelines
• Role of DevOps in Data Engineering projects
• AWS account setup and IAM basics
2. AWS Core Services for Data Platforms
• Amazon EC2 (Linux) – compute basics
• Amazon VPC – networking overview
• Amazon S3 – data lake concepts
• Static application deployment (overview)
3. Linux & Python Basics for Data Engineers
• Linux file system and common commands
• Process management and permissions
• SSH and troubleshooting basics
• Python basics for data pipelines
4. Git & GitLab for Data Projects
• Version control concepts
• Git commands and workflow
• GitLab repositories and branching
• Code collaboration and reviews
5. AWS Data Engineering Core Services
• AWS Glue – ETL concepts, Crawlers, Data Catalog, Glue Jobs
• Amazon Athena – Querying S3 data using SQL, Glue integration
• AWS Lambda – Serverless data processing, S3 & SQS triggers
• Messaging – Amazon SQS and SNS
6. Data Pipeline Architecture on AWS
• Batch vs event-driven pipelines
• End-to-end data workflow design
• Data ingestion, processing and querying
• Error handling and monitoring overview
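As an illustration of the event-driven and error-handling topics in this module, here is a minimal sketch of a Lambda-style handler for an SQS-triggered function. It assumes the standard SQS event shape (a "Records" list whose items carry "messageId" and "body") and assumes partial batch responses (ReportBatchItemFailures) are enabled on the event source mapping. The process_record transform and its "id" rule are hypothetical placeholders, not part of any AWS API.

```python
import json


def process_record(payload):
    """Hypothetical transform: require an 'id' field and tag the record."""
    if not isinstance(payload, dict) or "id" not in payload:
        raise ValueError("payload must be an object with an 'id' field")
    return {**payload, "processed": True}


def handler(event, context=None):
    """Process each SQS message and report failures per message.

    Returning 'batchItemFailures' lets SQS retry only the failed
    messages (assumes ReportBatchItemFailures is enabled).
    """
    failures = []
    for record in event.get("Records", []):
        try:
            # json.JSONDecodeError is a subclass of ValueError.
            process_record(json.loads(record["body"]))
        except ValueError as exc:
            # Printed output from a Lambda function ends up in CloudWatch Logs.
            print(f"failed {record['messageId']}: {exc}")
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Because the handler is a plain function, it can be called locally with a hand-built event dictionary before it is deployed.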
7. Data Quality, Testing & Automation
• Data quality concepts
• Python code quality using Pylint
• Unit testing using Pytest
• CI/CD automation for data pipelines
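As an illustration of how the data quality and unit testing topics fit together, here is a minimal sketch of a row-level quality check with a pytest-style test. The column names (order_id, amount) and the three rules are assumptions chosen for the example, not a prescribed standard.

```python
def find_quality_issues(rows, required=("order_id", "amount")):
    """Return (row_index, problem) tuples for a batch of dict rows."""
    issues = []
    seen_ids = set()
    for i, row in enumerate(rows):
        # Rule 1: required columns must be present and not None.
        for col in required:
            if row.get(col) is None:
                issues.append((i, f"missing {col}"))
        # Rule 2: order_id must be unique within the batch.
        order_id = row.get("order_id")
        if order_id is not None:
            if order_id in seen_ids:
                issues.append((i, "duplicate order_id"))
            seen_ids.add(order_id)
        # Rule 3: numeric amounts must not be negative.
        amount = row.get("amount")
        if isinstance(amount, (int, float)) and amount < 0:
            issues.append((i, "negative amount"))
    return issues


def test_find_quality_issues():
    rows = [
        {"order_id": 1, "amount": 10.0},
        {"order_id": 1, "amount": -5},
        {"order_id": 2, "amount": None},
    ]
    assert find_quality_issues(rows) == [
        (1, "duplicate order_id"),
        (1, "negative amount"),
        (2, "missing amount"),
    ]
```

Saved in a file whose name starts with test_, the test function is collected automatically when pytest runs, which is what a CI job would typically invoke.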
8. Infrastructure as Code – Terraform
• IaC concepts for data platforms
• Terraform workflow and state management
• Provisioning data infrastructure on AWS
• Environment management using Terraform
9. CI/CD for Data Engineering (GitLab)
• CI/CD concepts for data pipelines
• GitLab CI/CD workflow
• Automating ETL and Lambda deployments
10. Containerization for Data Workloads
• Docker basics for data applications
• Dockerfile for Python ETL jobs
• Running containerized data services
11. Kubernetes Basics (Supporting Topic)
• Kubernetes overview
• Running data workloads on Kubernetes
• Basic deployment concepts
12. Application & Data Deployment on Amazon EKS
• Amazon EKS overview
• Deploying data-related services on EKS
• High-level monitoring concepts
13. Real-World Data Engineering Projects
• Data Lake using S3, Glue and Athena
• Event-driven pipeline using Lambda and SQS
• CI/CD automation for data pipelines
• End-to-end AWS Data Engineering workflow