Design, implement, and optimize Elasticsearch clusters for high-performance querying and data retrieval.
Build and manage Elasticsearch indexes, ensuring data is stored, indexed, and queried efficiently.
Build and optimize data storage solutions like data lakes and warehouses.
Integrate structured and unstructured data from various internal and external systems to create a unified view for analysis.
Ensure data accuracy, consistency, and completeness through rigorous validation, cleansing, and transformation processes.
Maintain comprehensive documentation for data processes, tools, and systems while promoting best practices for efficient workflows.
Collaborate with product managers and other stakeholders to gather requirements and translate them into technical solutions.
Participate in requirement analysis sessions to understand business needs and user requirements.
Provide technical insights and recommendations during the requirements-gathering process.
Participate in Agile development processes, including sprint planning, daily stand-ups, and sprint reviews.
Work closely with Agile teams to deliver software solutions on time and within scope.
Adapt to changing priorities and requirements in a fast-paced Agile environment.
Conduct thorough testing and debugging to ensure the reliability, security, and performance of applications.
Write unit tests and validate the functionality of developed features and individual elements.
Write integration tests to ensure that the different elements of an application work together as intended and meet requirements.
Identify and resolve software defects, code smells, and performance bottlenecks.
Stay updated with the latest technologies and trends in full-stack development.
Propose innovative solutions to improve the performance, security, scalability, and maintainability of applications.
Continuously seek opportunities to optimize and refactor the existing codebase for better efficiency.
Stay up to date with cloud platforms such as AWS, Azure, or Google Cloud Platform.
Collaborate effectively with cross-functional teams, including testers and product managers.
Requirements
Bachelor's degree in Computer Science, Engineering, or a related field.
At least 3 years of proven experience as a Data Engineer.
Proficiency in Elasticsearch and the Python programming language is a must.
Experience with SQL (e.g., MySQL, PostgreSQL) and NoSQL (e.g., MongoDB) databases.
Strong understanding of programming libraries, frameworks, and technologies such as Flask, API frameworks, data warehousing/lakehouse principles, databases, ORMs, data analysis, Databricks, Pandas, Spark, PySpark, machine learning, OpenCV, and Scikit-learn.
Experience using Java to build and enhance backend systems, particularly for integration with Elasticsearch and databases.
Ability to develop APIs, microservices, and automation scripts as needed.
Strong understanding of data warehousing/lakehouse principles and concurrent/parallel processing concepts.
Familiarity with at least one cloud data engineering stack (Azure, AWS, or GCP) and the ability to quickly learn and adapt to new ETL/ELT tools across various cloud providers.
Familiarity with version control systems like Git and collaborative development workflows.
Competence in working on Linux and writing shell scripts.
Solid understanding of software engineering principles, design patterns, and best practices.
Excellent problem-solving and analytical skills, with keen attention to detail.
Effective communication skills, both written and verbal, and the ability to collaborate in a team environment.
Adaptability and willingness to learn new technologies and tools as needed.