Lead the aggregation of external document collections into CPR’s systems, ensuring they are structured, consistent and usable
Define, apply and maintain metadata standards, schema requirements, taxonomies, and controlled vocabularies
Evaluate and onboard new sources and datasets
Work alongside the partnerships team to support highly respected external document collection curators
Anticipate and manage schema evolution as external providers update or expand their data
Create and carry out data quality processes, including identifying duplication, improving metadata completeness, and maintaining consistency across collections
Ensure that content gaps raised by user feedback or analysis feed back into collection priorities and schema development
Document processes and standards so workflows are repeatable and scalable
Track and communicate the impact of data ingestion efforts, including metrics on database coverage, data quality and update frequency
Requirements
At least 10 years of professional experience, with at least 5 years of experience managing large or complex digital document collections
Strong experience designing and governing metadata schemas, controlled vocabularies, and taxonomies
Demonstrated proficiency in structuring and maintaining aggregated datasets from multiple external sources
Experience evaluating external datasets for structure, completeness and long-term maintainability
A track record of improving processes by designing workflows that are reproducible, well-documented, and resilient to change
Strong communication and stakeholder management skills, with the ability to engage comfortably with both technical and non-technical audiences
Benefits
A deep commitment to employee wellbeing, including policies such as a four-day workweek (same pay, Fridays off)
Generous leave
A wellbeing allowance
A vibrant, collaborative, empathetic work culture that thrives on innovation and the impact of our work
A hybrid work model that encourages collaboration while providing flexibility