About the Role
We are seeking a talented Data Engineer with extensive experience in Python, SQL, NoSQL, and API creation and integration for near-live data streaming. The Data Engineer will play a vital role in designing, building, and maintaining data pipelines, ensuring the efficient flow of information within the organization. The ideal candidate will have a strong understanding of data architecture, a passion for working with large-scale data, and the ability to deliver insights that drive business success in the retail sector.
Responsibilities
Design and Develop Data Pipelines: Create and maintain scalable data pipelines using Python, ensuring that data is accessible, consistent, and reliable. Develop, construct, test, and maintain data architectures such as data lakes, databases, and large-scale processing systems.
Implement ETL Processes: Build, schedule, and maintain ETL workflows using AWS Glue and PySpark.
Create and Integrate APIs for Near-Live Data Streaming: Develop and manage APIs to facilitate near-live data streaming, integrating with various systems and platforms.
Collaborate with Data Architects: Work closely with data architects to implement data models, data lakes, and data warehouses, aligning with organizational goals and industry standards, and deliver complex data projects end to end.
Collaborate with Data Scientists: Assist data scientists with data-related technical issues and support their data infrastructure needs.
Implement Data Integration Solutions: Develop and manage data integration strategies, including pub/sub and data streaming services, to support various platforms and systems.
Optimize Performance: Monitor and optimize data systems' performance, ensuring smooth operations and optimal resource utilization.
Ensure Data Compliance: Establish data governance policies and adhere to regulatory requirements, maintaining data integrity and security.
Support AI and Machine Learning Initiatives: Collaborate with AI teams to provide data support for machine learning models and algorithms.
Evaluate New Technologies: Assess and implement new technologies and tools that align with the company's vision, including cloud platforms such as AWS, GCP, and Azure.
Skills
Python, PySpark
AWS Glue, S3, Data Lake
ETL processes
Terraform (plus)
Qualifications
Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field.
3+ years of experience in data engineering, data modeling, or related areas.
Extensive experience in Python, with the ability to develop efficient data pipelines and processes.
Knowledge and experience with SQL and NoSQL databases, understanding how to manipulate and analyze data effectively.
Experience in API creation and integration for near-live data streaming, with a deep understanding of data synchronization and real-time processing.
Familiarity with data lake and data warehouse technologies, and experience with at least one large-scale data implementation.
Knowledge of major cloud platforms such as AWS, GCP, and Azure, and experience with pub/sub and data streaming services.
Strong understanding of the retail industry, with a focus on technology-driven solutions.
About the Company
Retailogists is your comprehensive retail partner dedicated to optimizing your ROI. Leveraging cutting-edge technology and data-centric digital marketing, we fuel your business growth and help you achieve your objectives. Our seasoned experts simplify the complexities of eCommerce, from devising effective digital strategies to crafting high-performance transactional platforms and executing intelligent marketing plans.
We excel in streamlining your business operations through automation and synchronization, covering areas such as transactional solutions for both B2C and B2B, merchandising, logistics optimization, and systems integrations, including ERP, inventory, and CRM.
We also specialize in building state-of-the-art data infrastructure that empowers you with actionable business insights, ranging from straightforward reporting to AI-driven decision-making.