InvoiceCloud is a fast-growing fintech leader recognized for its innovative solutions and commitment to customer service. The Associate Product Reliability Engineer will support production operations for InvoiceCloud’s Payment Service Network while building foundational technical skills and contributing to system reliability through automation and monitoring.

Responsibilities:

Supports issue triage and debugging across production systems, using logs, metrics, and traces to identify symptoms and narrow hypotheses
Writes clean, functional, and well-tested code (primarily .NET/C#) to deliver small reliability improvements, automation, and fixes with defined scope and guidance
Assists in building and maintaining monitoring dashboards and alerting to improve visibility into platform health
Participates in incident response activities and post-incident reviews, demonstrating attention to detail and follow-through
Owns assigned incident tickets or operational work items through resolution, communicating progress, impact, and blockers clearly
Documents recurring issues, troubleshooting steps, and runbooks so others can respond consistently and efficiently
Partners with senior engineers and product support teams to reproduce issues, validate fixes, and confirm service restoration
Follows InvoiceCloud’s development, security, and change-management standards, taking accountability for safe and reliable production outcomes
Uses Git and standard branching/review practices to streamline collaboration and ensure operational changes are traceable
Creates or improves automation scripts (PowerShell and/or Python) to reduce repetitive operational work and speed up diagnostics
Learns to prioritize reliability work by impact and urgency (e.g., incident severity, customer impact, and SLO risk) while meeting sprint goals
Helps improve incident response processes by identifying gaps (monitoring, runbooks, alerts) and proposing actionable remediations
Explores reliability and observability tools and techniques that improve detection, diagnosis, and recovery (e.g., better logging, actionable alerts, and dashboards)
Leverages AI-assisted development tools (e.g., GitHub Copilot, Cursor, Windsurf) for debugging, code generation, and documentation, while learning to validate AI output critically
Contributes ideas for improving incident response, post-mortem practices, and production readiness during feature delivery
Demonstrates curiosity, a willingness to experiment, and a learning mindset, seeking feedback to adopt reliability engineering best practices

Requirements:

Bachelor's degree in Computer Science, Engineering, or related technical discipline
0–2 years of experience in software engineering, DevOps, production support, or technical support (internship, co-op, or professional)
Understanding of object-oriented programming, basic data structures, and algorithms
Familiarity with .NET/.NET Framework, C#, SQL, and version control systems (Git)
Exposure to cloud environments (Azure preferred) and basic concepts like deployments, configuration, and networking fundamentals
Exposure to scripting for automation and troubleshooting (Python and/or PowerShell)
Familiarity with monitoring/observability tools or concepts (dashboards, alerts, log queries); experience with New Relic or similar is a plus
Strong debugging/troubleshooting, problem-solving, collaboration, and written communication skills
