Bayview Asset Management, LLC is seeking a Data Engineer to join its Nebula team. This role is critical in building and evolving the data foundation that supports analytics, reporting, AI development, and operational decision-making across the organization. It requires strong technical skills in modern data tooling and cloud-based environments.
Responsibilities:
- Design, build, and maintain robust data pipelines for a wide variety of input and output sources, including internal systems, third-party platforms, files, APIs, event streams, and databases (a minimal illustrative sketch follows this list)
- Develop scalable ETL and ELT workflows for both batch and real-time processing
- Ensure pipelines are reliable, testable, observable, and easy to extend as business needs evolve
- Build reusable data integration patterns that support growing volumes, new source systems, and downstream consumers across analytics, applications, and AI initiatives
- Design and manage data architectures that support OLTP, OLAP, and reporting workloads across operational and analytical environments
- Build and optimize data models, warehouse schemas, and curated datasets for analytics and BI use cases
- Contribute to the design and operation of modern data platforms, including warehouses, lakehouses, streaming systems, and supporting orchestration frameworks
- Help define patterns for data storage, partitioning, performance optimization, retention, and lifecycle management
- Deploy, operate, and improve data pipelines and data stores on major cloud platforms such as AWS, GCP, or Azure
- Use infrastructure-as-code, CI/CD, and automation practices to improve deployment speed, consistency, and reliability
- Monitor production data systems using logging, alerting, and observability tooling to proactively identify and resolve issues
- Support secure, resilient, and cost-conscious operation of cloud-based data infrastructure
- Implement data quality checks, validation rules, reconciliation processes, and monitoring to ensure trustworthy data across systems
- Establish and maintain standards for lineage, documentation, metadata, schema evolution, and operational runbooks
- Partner with stakeholders to improve data accessibility, consistency, and usability while maintaining appropriate controls and governance
- Contribute to practices that support security, privacy, auditability, and compliance in a regulated environment
- Partner closely with Product, Engineering, and business stakeholders to understand data needs, workflows, and constraints
- Translate business and operational requirements into clean, scalable, and maintainable data solutions
- Support downstream consumers of data, including analysts, researchers, product teams, and operational users
- Communicate clearly with both technical and non-technical stakeholders about data availability, quality, tradeoffs, and delivery timelines
- Continuously improve pipeline performance, reliability, scalability, and developer productivity
- Identify opportunities to simplify architecture, reduce operational toil, and improve data platform leverage across teams
- Operate with a strong bias toward action and iterative delivery, moving quickly from problem definition to implementation and improvement
- Help raise the bar on engineering quality through thoughtful design, testing, documentation, and operational discipline
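To give a concrete sense of the pipeline and data-quality work described above, here is a minimal Python sketch of a batch extract-validate-load step. The file, table, and column names (loans_extract.csv, warehouse.db, loan_id, balance) are hypothetical placeholders, not references to Bayview systems, and a production pipeline would add the logging, alerting, and reconciliation described in the responsibilities.

```python
import csv
import sqlite3
from pathlib import Path

# Hypothetical source file and warehouse -- placeholders for the internal
# systems, files, and APIs a real pipeline would integrate.
SOURCE_CSV = Path("loans_extract.csv")
WAREHOUSE_DB = Path("warehouse.db")

def extract(path: Path) -> list[dict]:
    """Read raw rows from a file-based source."""
    with path.open(newline="") as f:
        return list(csv.DictReader(f))

def validate(rows: list[dict]) -> list[dict]:
    """Quality gate: drop rows with a missing key or unparseable amount.
    A production system would log and reconcile rejects, not drop them."""
    clean = []
    for row in rows:
        loan_id = row.get("loan_id")
        try:
            balance = float(row["balance"])
        except (KeyError, ValueError):
            continue
        if loan_id:
            clean.append({"loan_id": loan_id, "balance": balance})
    return clean

def load(rows: list[dict], db: Path) -> None:
    """Idempotent load: upsert on the primary key so reruns are safe."""
    con = sqlite3.connect(db)
    con.execute(
        "CREATE TABLE IF NOT EXISTS loans (loan_id TEXT PRIMARY KEY, balance REAL)"
    )
    con.executemany(
        "INSERT INTO loans (loan_id, balance) VALUES (:loan_id, :balance) "
        "ON CONFLICT(loan_id) DO UPDATE SET balance = excluded.balance",
        rows,
    )
    con.commit()
    con.close()

if __name__ == "__main__":
    load(validate(extract(SOURCE_CSV)), WAREHOUSE_DB)
```

The upsert keyed on loan_id keeps the load idempotent, so rerunning the pipeline after a failure does not duplicate rows, which is the reliability property the responsibilities above emphasize.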
Requirements:
- 2-4+ years of experience building and operating production-grade data pipelines and data systems
- Strong experience with industry-standard tools and platforms for ETL/ELT, orchestration, data warehousing, streaming, and BI
- Experience working with both OLTP and OLAP systems, with a strong understanding of the tradeoffs between transactional and analytical workloads
- Experience building flexible data pipelines that integrate with many different source and destination types, including databases, APIs, files, message queues, SaaS platforms, and event streams
- Experience supporting both batch and real-time data processing patterns
- Experience deploying and operating data infrastructure on major cloud platforms such as AWS, GCP, or Azure
- Strong SQL skills and experience with data modeling, transformation frameworks, and performance optimization
- Experience building AI-powered capabilities on top of LLMs, including orchestration, evaluation, and data integration patterns
- Experience with modern programming languages commonly used in data engineering, such as Python, Java, Scala, or Go
- Comfort working with CI/CD, infrastructure-as-code, observability, and production operations for data systems
- Strong judgment in ambiguous environments where requirements evolve and systems must balance speed, reliability, and flexibility
- Clear communication skills with both technical and non-technical teammates
- Experience with modern orchestration and transformation tools such as Airflow, Dagster, dbt, or similar platforms (see the DAG sketch at the end of this posting)
- Experience with cloud-native data warehouses or lakehouse platforms such as Snowflake, BigQuery, Redshift, Databricks, or equivalent technologies
- Experience with streaming and message-queue platforms such as Kafka, Kinesis, SQS, or similar systems
- Experience enabling BI and self-service analytics through curated datasets, semantic layers, and reporting platforms such as Looker, Power BI, Tableau, or similar tools
- Experience in fintech, mortgage, lending, payments, insurance, or other regulated domains
- Experience building data platforms that support AI, machine learning, or decisioning workflows
- Experience improving data quality, reliability, cost efficiency, and platform scalability as a system grows
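As an illustration of the orchestration experience named above, the following is a minimal sketch of an Airflow DAG, assuming Airflow 2.x; the dag_id, task names, and callables are hypothetical placeholders. Dagster and dbt express the same dependency-graph idea through different APIs.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_source_data():
    """Placeholder: pull rows from a source system."""

def run_quality_checks():
    """Placeholder: validate row counts and schema before publishing."""

with DAG(
    dag_id="example_daily_load",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",            # argument name used in Airflow 2.4+
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract", python_callable=extract_source_data)
    checks = PythonOperator(task_id="quality_checks", python_callable=run_quality_checks)
    extract >> checks  # run checks only after extraction succeeds
```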