Cofense is a cybersecurity platform focused on stopping phishing threats. The Data Engineer will integrate enterprise-wide data from various cybersecurity applications and develop data pipelines using cloud technologies while ensuring data privacy and compliance.
Responsibilities:
- Work with architects and cloud systems engineers to design the data platform and architecture
- Work hands-on with SQL and NoSQL OLTP databases and OLAP data warehousing technologies, especially AWS Aurora
- Model data and build data pipelines for multi-product and/or multi-department organizations
- Develop data pipelines on cloud technologies such as Azure/AWS using well-defined tool frameworks
- Develop ETL code to stream data from disparate (structured and semi-structured) SaaS product data stores to the data lake/data warehouse using Python and Azure/AWS data lake services
- Write complex SQL scripts and automate them using Python
- Develop test cases and unit tests for key implementations of the data platform, adhering to software engineering best practices and standards
- Secure data end to end by complying with data privacy rules, both when developing processes to move data between applications/data sources and the data lake/data warehouse and when delivering data through SQL clients and BI tools
- Build ad hoc reporting APIs and service layers on top of the underlying OLTP and OLAP databases
- Help integrate the data platform with BI tools such as Power BI, Tableau, and Splunk
- Develop and interpret Entity Relationship Diagrams (ERDs) across data sets in relational database systems as well as non-relational data stores
- Mine data to identify trends, patterns, and anomalies in complex data sets across multiple data sources/systems, and present results without ambiguity
- Develop data transformations that generate facts, summaries, and key metrics by applying business rulesets and aggregations using Python, SQL, and other transformation tools
- Review current processes for data ingestion, transformation, and statistical analysis, and re-engineer them
- Collaborate with business users across Cofense's departments to define requirements, prioritize project work, and deliver on time
- Other duties as assigned
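To illustrate the kind of SQL-plus-Python automation these responsibilities describe, here is a minimal sketch of a Python-driven transformation that applies a business rule and aggregates raw events into a summary table. The table names, columns, and rule are hypothetical, and SQLite stands in for a production warehouse such as Aurora or Redshift:

```python
import sqlite3

def run_daily_summary(conn):
    """Apply a business ruleset and aggregate raw events into a summary table."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS daily_summary (
            event_date     TEXT PRIMARY KEY,
            total_events   INTEGER,
            distinct_users INTEGER
        );
        DELETE FROM daily_summary;  -- idempotent full refresh
        INSERT INTO daily_summary
        SELECT event_date, COUNT(*), COUNT(DISTINCT user_id)
        FROM raw_events
        WHERE status = 'valid'      -- hypothetical business rule: only valid events
        GROUP BY event_date;
    """)
    return conn.execute(
        "SELECT * FROM daily_summary ORDER BY event_date"
    ).fetchall()

# Demo with in-memory data
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_events (event_date TEXT, user_id TEXT, status TEXT)")
conn.executemany("INSERT INTO raw_events VALUES (?,?,?)", [
    ("2024-01-01", "a", "valid"),
    ("2024-01-01", "b", "valid"),
    ("2024-01-01", "a", "invalid"),
    ("2024-01-02", "a", "valid"),
])
rows = run_daily_summary(conn)  # [("2024-01-01", 2, 2), ("2024-01-02", 1, 1)]
```

In a production pipeline, the same pattern would typically be wrapped in an orchestration tool and run against the cloud database's own driver rather than SQLite.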
Requirements:
- US citizenship (required to support FedRAMP compliance)
- Expert SQL skills for data transformations, statistical analysis, and troubleshooting across multiple database platforms (MySQL, PostgreSQL, Redshift, Azure SQL Data Warehouse, etc.)
- Expert at writing complex SQL scripts and automating them using Python
- Knowledge of data management on NoSQL databases such as DynamoDB and MongoDB; familiarity with big data tools (Hadoop, Spark) and messaging tools (Kafka, Kinesis, SQS, Azure Queues) is a huge plus
- Strong analytical skills: good at finding trends, outliers, and anomalies in data, and able to articulate complex information or data points clearly to business users, management, and individual contributors
- Enthusiasm for working with large volumes of data across disparate data sources and databases
- Has a strong sense of engineering craftsmanship and takes pride in the code they write
- Is intellectually curious, with a burning desire to learn; is self-driven, actively looks for ways to contribute, and knows how to get things done
- Is relentlessly customer-focused, toward both internal and external customers
- Sees big-picture impact and relationships among and across work units
- Identifies complex technical problems and resolves them with minimal help
- Over 5 years of proven experience in data architecture, data modeling, and lifecycle management
- Hands-on experience with relational and cloud databases, including Azure SQL Database, Microsoft SQL Server, Amazon Aurora, and Amazon Redshift
- Strong background in developing Python and data technologies
- Practical experience designing and developing ETL data pipelines and applications using SQL and Python
- Strong expertise in writing complex SQL queries for data transformation and automating processes using Python
- Experience building and consuming RESTful APIs using Python libraries
- Proficient in integrating data platforms with BI tools such as Power BI and Splunk, including dashboard and report development
- Solid experience working with Unix/Linux environments, including SSH tunneling and writing/interpreting Bash scripts
- Advanced proficiency in Python 3, with hands-on experience using NumPy, SciPy, Scikit-learn, and Pandas
- Experience leveraging APIs to extract data and load it into databases using Python
- Bachelor's degree in Computer Science or Math, Data Analytics, Data Sciences, BI or demonstrated industry experience
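As a small illustration of the analytical skills listed above (finding outliers and anomalies with pandas), here is a minimal z-score outlier-flagging sketch. The data, column names, and threshold are hypothetical, not from the posting:

```python
import pandas as pd

def flag_outliers(series: pd.Series, z_thresh: float) -> pd.Series:
    """Flag points more than z_thresh standard deviations from the mean."""
    z = (series - series.mean()) / series.std(ddof=0)  # population std
    return z.abs() > z_thresh

# Hypothetical daily event counts; the last value is an anomaly.
events = pd.DataFrame({"count": [10, 12, 11, 9, 10, 11, 95]})
# A lower threshold than the usual 3 is used because the sample is tiny.
events["outlier"] = flag_outliers(events["count"], z_thresh=2.0)
# Only the final row (95) is flagged.
```

In practice the same flags would feed a report or dashboard so the anomaly can be presented to business users without ambiguity.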