Convz
Full-Time
Dubai, Hybrid, Remote
Posted 4 months ago
What you will do
- Design end-to-end data solutions, including solution architecture, data modeling, and data flow diagrams, to address complex business requirements.
- Design, develop, and maintain scalable data pipeline architecture to support the company’s and clients’ data processing needs.
- Implement best practices for data ingestion, storage, and processing to ensure data integrity, reliability, and security.
- Architect and implement Big Data processing workflows to handle large volumes of data with high throughput and low latency, ensuring timely and accurate data delivery (a minimal sketch of such a workflow follows this list).
- Drive the adoption of data engineering best practices and standards across the organization, including data governance, data security, and data quality.
- Lead and mentor junior team members, providing guidance on best practices and technical solutions.
- Maintain consistent communication with the Project Team Lead and Project Manager, providing clear updates on task status and promptly escalating any issues or potential delays.
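For illustration, here is a minimal sketch of the kind of batch processing workflow the role involves, written in PySpark; the bucket paths, column names, and aggregation logic are hypothetical placeholders, not a prescribed implementation:

```python
# Minimal PySpark batch job: read raw Parquet, cleanse, aggregate, write curated output.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("daily_orders_batch").getOrCreate()

# Read raw event data (hypothetical path and schema).
orders = spark.read.parquet("s3://example-bucket/raw/orders/")

# Basic cleansing and aggregation: dedupe, filter, roll up to daily totals.
daily_totals = (
    orders
    .dropDuplicates(["order_id"])
    .filter(F.col("status") == "COMPLETED")
    .groupBy(F.to_date("created_at").alias("order_date"), "country")
    .agg(F.sum("amount").alias("total_amount"), F.count("*").alias("order_count"))
)

# Write partitioned output for downstream analytics.
daily_totals.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-bucket/curated/daily_order_totals/"
)
```

In practice, schema enforcement, partition strategy, and data-quality checks would be layered on top of a skeleton like this.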
Good to have:
- Design and develop real-time streaming data pipelines using technologies such as Apache Kafka, Apache Pulsar, or cloud-native streaming services (e.g., AWS Kinesis, Google Cloud Pub/Sub); see the sketch after this list.
- Lead data architecture reviews and provide recommendations for optimizing performance, scalability, and cost efficiency.
- Collaborate with data scientists and analysts to understand data requirements and provide technical expertise in data modeling and analytics.
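As a rough sketch of the streaming work mentioned above, below is a minimal consume-transform-produce loop using the kafka-python client; the broker address, topic names, and enrichment step are hypothetical:

```python
# Minimal Kafka streaming sketch: consume raw events, enrich, forward downstream.
import json
from kafka import KafkaConsumer, KafkaProducer

consumer = KafkaConsumer(
    "raw-click-events",                      # hypothetical source topic
    bootstrap_servers="localhost:9092",      # hypothetical broker
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="earliest",
    group_id="click-enricher",
)
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

for message in consumer:
    event = message.value
    # Light transformation before forwarding to the downstream topic.
    event["processed"] = True
    producer.send("enriched-click-events", event)
```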
What we are looking for
- Proficiency in designing, building, and maintaining scalable ETL data processing systems using Python, Java, Scala, or similar programming languages.
- Expertise in SQL: a strong command of SQL to process and analyze various forms of data.
- Experience working with cloud deployment environments, preferably GCP, Azure, or AWS.
- Experience in creating and managing CI/CD pipelines using Azure DevOps, Jenkins, GoCD, or similar technologies.
- Proficiency in orchestration tools like Apache Airflow, Azure Data Factory, AWS Data Pipeline, or similar (a minimal Airflow example is sketched after this list).
- Familiarity with version control systems such as Git (e.g., on GitHub) and with CI/CD concepts.
- Understanding of Data Modeling and Data Warehousing techniques, including star schema or raw data vault.
- Experience working with large-scale data sets, including structured and unstructured data formats like Parquet, Avro, JSON, etc.
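The following is a minimal Apache Airflow orchestration sketch of the kind referenced above, assuming Airflow 2.4+; the DAG id, schedule, and task bodies are placeholders:

```python
# Minimal Airflow DAG: two Python tasks run daily, extract before load.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    # Pull raw data from a source system (placeholder).
    ...

def load():
    # Load transformed data into the warehouse (placeholder).
    ...

with DAG(
    dag_id="daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task
```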
Good to have:
- Knowledge of big data processing technologies such as Hadoop, Apache HBase, Apache Hive, or similar.
- Experience designing and implementing data governance policies covering data accessibility, security, management, quality, etc.