PointClickCare is a leading health tech company dedicated to improving healthcare delivery. The company is seeking a Principal Data Engineer to design and implement scalable streaming data pipelines, optimize real-time data solutions, and mentor team members to foster technical excellence.
Responsibilities:
- Lead and guide the design and implementation of scalable streaming data pipelines
- Engineer and optimize real-time data solutions using frameworks such as Apache Kafka, Flink, and Spark Streaming
- Collaborate cross-functionally with product, analytics, and AI teams to ensure data is a strategic asset
- Advance ongoing modernization efforts, deepening adoption of event-driven architectures and cloud-native technologies
- Drive adoption of best practices in data governance, observability, and performance tuning for streaming workloads
- Embed data quality in processing pipelines by defining schema contracts, implementing transformation tests and data assertions, enforcing backward-compatible schema evolution, and automating checks for freshness, completeness, and accuracy across batch and streaming paths before production deployment
- Establish robust observability for data pipelines by implementing metrics, logging, and distributed tracing for streaming jobs, defining SLAs and SLOs for latency and throughput, and integrating alerting and dashboards to enable proactive monitoring and rapid incident response
- Foster a culture of quality through peer reviews, providing constructive feedback and seeking input on your own work
Requirements:
- At least 10 years of professional experience in software or data engineering, including a minimum of 4 years focused on streaming and real-time data systems
- Proven experience driving technical direction and mentoring engineers while delivering complex, high-scale solutions as a hands-on contributor
- Deep expertise in streaming and real-time data technologies, including frameworks such as Apache Kafka, Flink, and Spark Streaming
- Strong understanding of event-driven architectures and distributed systems, with hands-on experience implementing resilient, low-latency pipelines
- Practical experience with cloud platforms (AWS, Azure, or GCP) and containerized deployments for data workloads
- Fluency in data quality practices and CI/CD integration, including schema management, automated testing, and validation frameworks (e.g., dbt, Great Expectations)
- Operational excellence in observability, with experience implementing metrics, logging, tracing, and alerting for data pipelines using modern tools
- Solid foundation in data governance and performance optimization, ensuring reliability and scalability across batch and streaming environments
- Experience with Lakehouse architectures and related technologies, including Databricks, Azure Data Lake Storage (ADLS) Gen2, and Apache Hudi
- Strong collaboration and communication skills, with the ability to influence stakeholders and evangelize modern data practices within your team and across the organization
- Strong analytical and problem-solving mindset
- Ability to learn quickly and adapt to new technologies, even outside your comfort zone
- Self-starter who thrives with minimal supervision and collaborates effectively as a team player
- Excellent organizational and critical-thinking skills
- Comfortable leveraging AI tools to accelerate development