Opportunity Description
Role Overview
We are looking for a hands-on Data Engineer to design, build, and optimize scalable data pipelines and platforms. You will be responsible for creating robust batch and streaming data processing frameworks that enable advanced analytics and AI solutions.
Key Responsibilities
- Data Pipeline Development: Design and maintain scalable ETL/ELT pipelines using Scala and Spark (Core, SQL, Streaming).
- Real-time Streaming: Implement and manage real-time data ingestion using Apache Kafka or GCP Pub/Sub.
- Cloud Infrastructure: Build and optimize data solutions on Google Cloud Platform using services like BigQuery, Dataflow, Dataproc, and Cloud Storage.
- Performance Tuning: Optimize Spark jobs for speed and scalability through partitioning, caching, and shuffle tuning.
- Data Modeling: Design efficien...