Data Engineer
Extia
As a consulting company specializing in the IT and digital sectors, Extia has prioritized an approach that combines performance and well-being at work since its creation in 2007. This vision is shared today by more than 2,500 Extians across France and internationally, and has been recognized by the Great Place to Work certification for 15 years!
We believe in equal opportunity and offer every candidate the chance to reveal their potential, without distinction of any kind. At Extia, it's "First who, then what", so let's do it!
First who
Strong problem-solving skills with the ability to break down complex challenges into practical solutions
Excellent communication skills, able to explain technical concepts to both technical and non-technical stakeholders
Business-oriented mindset with an understanding of how data supports organizational goals
Then what
What You’ll Do
- Design, build, and maintain scalable and reliable data platforms
- Develop and optimize pipelines for data ingestion, processing, orchestration, and analysis
- Collect, process, and analyze large and complex datasets from diverse sources
- Implement data processing workflows using frameworks such as Apache Spark and Apache Beam
- Integrate data from multiple sources, including databases, file systems, and APIs
- Ensure high data quality, accuracy, and integrity through collaboration with cross-functional teams
- Implement data security and privacy best practices, including access controls and encryption
- Monitor system performance and optimize platforms for availability, scalability, and performance
What You’ll Bring
- Experience with cloud platforms and services for data engineering, especially Google Cloud Platform (GCP)
- Proficiency in Python, Java, or Scala
- Hands-on experience with big data technologies such as Spark, Flink, Kafka (including Kafka Streams and Kafka Connect), Elasticsearch, Hadoop, Hive, Sqoop, Flume, Impala, and Druid
- Strong understanding of data modeling and database design principles
- Experience with data integration and ETL tools (e.g., Apache Kafka, Talend)
- Solid knowledge of distributed systems and modern data processing architectures
- Strong SQL skills and experience with both relational and NoSQL databases
- Familiarity with services from other cloud providers, such as AWS (S3) or Azure (Data Factory)
- Experience using version control systems such as Git