itD is a consulting and software development company seeking a Software Engineer to design and scale data pipelines for machine-generated data. The role involves building distributed data pipelines, optimizing performance, and collaborating with machine learning engineers to support model training workflows.

Responsibilities:

Build and scale distributed data pipelines for large-scale time series, log data, and high-volume event streams
Design and maintain reliable, high-performance Spark and Python workflows to support model training datasets
Analyze and resolve performance bottlenecks related to latency, memory utilization, data skew, and throughput
Improve data quality, validation processes, and reproducibility for machine learning workloads
Partner with machine learning engineers and researchers to accelerate foundation model development
Measure and optimize application and transaction performance in production data systems
Collaborate cross-functionally to ensure data infrastructure aligns with evolving research and product needs
Attend regular internal practice community meetings
Collaborate with your itD practice team on industry thought leadership
Complete client case studies and learning materials (blogs, media content)
Build out material to contribute to the Digital Transformation practice
Attend internal itD networking events (in person and virtual)
Work with leadership on career fast-track opportunities

Requirements:

5+ years of software engineering experience
Strong proficiency in Python
Hands-on experience with Apache Spark (PySpark or Scala)
Experience building large-scale data pipelines in distributed environments
Experience working with time series data, logs, or high-volume event streams
Strong debugging skills and experience with performance optimization in distributed systems
Bachelor's degree in a relevant field or equivalent work experience required
Experience supporting machine learning or large model training workflows
Familiarity with sequence modeling or time series data systems
Experience with streaming systems such as Kafka or Spark Streaming
Experience with cloud-native or Kubernetes-based platforms
Cisco experience is a plus

Software Engineer - Remote (6070)
