Mindlance is a consulting and engineering solutions company. It is seeking a Lead Data Engineering Platform professional to consult on complex initiatives and contribute to data engineering and platform development for financial crimes use cases by building scalable data pipelines and improving data quality.
Responsibilities:
- Build and maintain batch and/or streaming data pipelines supporting financial crimes initiatives
- Develop data transformations using Python + PySpark and optimize performance for large datasets
- Apply strong understanding of Apache Spark architecture (executors, partitions, shuffles, joins, caching) to improve performance
- Partner with business and technical stakeholders to translate requirements into data models, mappings, and curated datasets
- Support ingestion from multiple sources (transactional systems, case management, reference data, etc.)
- Implement data quality checks, reconciliation, and controls to ensure auditability and reliability
- Contribute to modernization efforts (legacy → in-house build) including migration planning and redesign
- Create documentation for pipelines, logic, and operational runbooks
- Work within Agile delivery (Jira), supporting sprint execution and delivery timelines
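As a rough illustration of the data-quality, reconciliation, and controls work described in the responsibilities above, a pipeline step might validate control totals and required fields before publishing a curated dataset. This is a hypothetical sketch, not taken from the posting; in a real PySpark job these checks would run on DataFrames, but plain Python records stand in here to keep the example self-contained.

```python
# Hypothetical sketch of a reconciliation / data-quality gate for a pipeline
# step. Field names and counts are illustrative only.

def run_quality_checks(records, required_fields, expected_count):
    """Return a list of failure messages; an empty list means all checks pass."""
    failures = []
    # Reconciliation: row count must match the source system's control total.
    if len(records) != expected_count:
        failures.append(
            f"count mismatch: got {len(records)}, expected {expected_count}"
        )
    # Completeness: every required field must be present and non-empty,
    # supporting auditability of the curated output.
    for i, rec in enumerate(records):
        for field in required_fields:
            if rec.get(field) in (None, ""):
                failures.append(f"record {i}: missing required field '{field}'")
    return failures

records = [
    {"txn_id": "T1", "amount": 100.0, "customer_id": "C9"},
    {"txn_id": "T2", "amount": 55.5, "customer_id": None},
]
failures = run_quality_checks(records, ["txn_id", "amount", "customer_id"], 2)
# One failure: record 1 is missing customer_id.
```

A failing check would typically halt the pipeline and raise an alert rather than let incomplete data reach downstream financial-crimes investigations.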
Requirements:
- 5+ years of Software Engineering experience, or equivalent demonstrated through one or a combination of the following: work or consulting experience, training, military experience, education
- 5+ years of experience in data engineering / ETL / data platform development
- Strong hands-on development in Python, PySpark / Apache Spark, and advanced SQL
- Experience working with large-scale data sets and performance tuning
- Strong understanding of data concepts: data modeling, lineage, metadata, governance
- Experience supporting regulated environments with emphasis on controls and audit readiness
- Strong communication skills; able to work with both engineering and business partners
- Experience running PySpark workloads on Google Cloud Platform (GCP): Dataproc, BigQuery, Google Cloud Storage (GCS), etc.
- Experience with cloud-native Big Data platforms
- Knowledge of data governance, security, and compliance practices
- Experience with CI/CD pipelines for data engineering workloads
- Orchestration: Airflow (or similar scheduling tools)
- Streaming: Kafka
- Lakehouse/Warehouse: Databricks / Snowflake / BigQuery
- CI/CD + DevOps: Git, pipelines, automation, release management
- Data governance/security: encryption, access controls, data masking, PII handling
- Prior Financial Crimes domain experience: AML / sanctions / fraud / investigations / KYC
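To give context for the data masking / PII handling item in the requirements above, one common pattern is deterministic pseudonymization via salted hashing, so masked values can still be joined across datasets without exposing the original data. This is a minimal hypothetical sketch with made-up field names, not a description of Mindlance's actual controls.

```python
import hashlib

def mask_pii(value: str, salt: str) -> str:
    """Deterministically pseudonymize a PII value with a salted SHA-256 hash.

    The same input always maps to the same token (so joins and lineage
    still work), but the original value cannot be read back from the token.
    """
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:16]  # truncated token; length is an illustrative choice

# Illustrative record: mask the sensitive field, leave the rest untouched.
record = {"customer_id": "C123", "ssn": "123-45-6789"}
masked = {**record, "ssn": mask_pii(record["ssn"], salt="demo-salt")}
```

In production the salt would come from a secrets manager, and reversible use cases (e.g. authorized investigations) would instead use tokenization or encryption with managed keys.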