Payscale is a leading compensation innovator that helps organizations scale their businesses through data-driven insights. The company is seeking a Data Engineer I to join its Data Engineering Team, which is responsible for managing data pipelines and data warehouse operations and for supporting data productization efforts.
Responsibilities:
- Help manage the data warehouse by completing well-scoped tasks with guidance
- Maintain and extend existing data pipelines and ingestion jobs (small changes, fixes)
- Write clean SQL and basic Python scripts following team standards
- Support operations: monitor jobs, validate loads, and escalate issues with context
- Implement changes within established modeling/warehouse patterns (star schema, layered data)
- Collaborate with partner teams on straightforward data access and data quality requests
- Apply basic best practices for data hygiene and query performance
- Contribute small improvements to internal tooling, runbooks, and documentation
- Build fluency in team tools and platforms (e.g., Snowflake, CI/orchestration, ingestion tools)
- Participate in an on-call rotation to help monitor and support data pipelines and the data warehouse, escalating issues as needed
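To give candidates a flavor of the operations work above (monitoring jobs, validating loads, escalating with context), here is a minimal sketch of a load-validation check in Python. The function name, counts, and tolerance are illustrative only, not part of Payscale's actual tooling:

```python
# Hypothetical post-ingestion check: compare row counts between a
# source extract and the warehouse load, and flag drift for escalation.

def validate_load(source_count: int, target_count: int,
                  tolerance: float = 0.01) -> tuple[bool, str]:
    """Return (ok, message) for a row-count comparison.

    A mismatch beyond `tolerance` should be escalated with this
    message as context rather than silently retried.
    """
    if source_count == 0:
        return (target_count == 0,
                f"source empty, target has {target_count} rows")
    drift = abs(source_count - target_count) / source_count
    if drift <= tolerance:
        return (True, f"counts within {tolerance:.0%} "
                      f"({source_count} vs {target_count})")
    return (False, f"row-count drift {drift:.1%} exceeds {tolerance:.0%}: "
                   f"source={source_count}, target={target_count}")


# Example: a load that dropped a handful of rows, within tolerance
ok, msg = validate_load(100_000, 99_950)
```

In practice a check like this would run inside the orchestration layer after each load and raise an alert (rather than return a tuple) on failure.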
Requirements:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field
- 2+ years of experience in data warehousing and data engineering
- Hands-on experience with Snowflake data warehouse platform for data storage, retrieval, and analysis
- Advanced programming skills in Python and SQL
- Experience in building AI/ML retraining pipelines
- Experience in designing and deploying APIs in cloud environments (e.g., AWS, Azure, GCP) while considering scalability and elasticity requirements
- Proficiency in machine learning development, including experience with training, fine-tuning, and validating ML models using frameworks like TensorFlow, PyTorch, or similar
- In-depth knowledge of AI/ML model evaluation, optimization, and deployment strategies
- Strong understanding of vector databases and their application in AI/ML model storage and retrieval
- Experience using Large Language Models (LLMs) such as GPT (Generative Pre-trained Transformer), BERT (Bidirectional Encoder Representations from Transformers), LLaMA (Meta AI), or similar models in real-world applications
- Proficiency in fine-tuning and deploying LLMs for natural language processing (NLP) tasks such as text generation, summarization, sentiment analysis, or language translation