POSITION DESCRIPTION
Job Title: IT Developer II (Deidentification Data Engineer)
Primary Role: This position reports to the DHTS-Data Partnerships Director of Data and Analytics Platforms. The position is responsible for managing and executing the deidentification processes applied to the Duke University Health System data assets to be included in the Federated Clinical Applications Platform (FCAP), and for managing and administering the FCAP deidentification cloud environment. The position will be a member of the Data Partnerships Data Engineering team and will additionally provide expertise in developing data integration and delivery pipelines that bring new data modalities into the FCAP and Duke's Data Lake. These solutions will capitalize on technologies to improve the value of analytical data, improve the effectiveness of information stewardship, and streamline the flow of data in the organization. Solutions will focus on using state-of-the-art data and analytics tools, including traditional and near real-time data warehousing, big data, and relational and document-based databases, using extract, load, and transform (ELT) toolsets as well as REST APIs and FHIR. The ideal candidate will be comfortable with data science platforms and have proven experience leveraging DevOps and Automation/Orchestration tools.
Essential Tasks/Responsibilities
Create and follow defined procedures in the deidentification of patient medical information
Maintain and tune the deidentification environment to perform optimally and comply with DUHS and DHTS policies and standards
Collaborate with the Duke partner on improving the deidentification programs and processes, and work with the partner and the Duke Cloud Team on troubleshooting issues
Assemble large, complex data sets that meet business requirements
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Recommend designs for analytics solutions that improve data integration, data quality, and data delivery with an eye toward reusable components
Create and maintain an optimal data pipeline architecture
Articulate differences, advantages, and disadvantages between architectural solution methods
Work with Agile team members to document and execute test plans for data loading and data validation scripts. Support the code promotion process from development through production, as required, using standard CI/CD processes
Develop, implement, and maintain schedule/dependency logic for automated ETL processing
Develop monitoring, logging, and error notification processes to ensure data is updated as expected and processing metrics are reported
Participate in the creation and maintenance of standards for coding, documentation, error handling, error notification, logging, etc.
Accountable for conforming to established architectural, developmental, and operational standards and practices including the creation of metadata
Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics
Work with stakeholders including the Executive, Product, Data, and Design teams to assist with data-related technical issues and support their data infrastructure needs
Education: Bachelor's degree in a related field, or four years of equivalent technical experience required
Required Experience: We are looking for a candidate with 5+ years of experience in a Data Engineer role and hands-on, professional experience with the following:
Relational SQL and NoSQL databases
Writing and executing Python programs and shell scripts on Linux
Intermediate Linux administration
Data Engineering on Microsoft Azure
Data pipeline and orchestration tools such as Azure Data Factory and SQL Server Integration Services
Developing on cloud-based analytic platforms such as Azure Synapse
Performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement
Supporting and working with cross-functional teams in a dynamic environment
A successful history of manipulating, processing, and extracting value from large, disconnected datasets
Required Skills:
Intermediate to advanced skills in Python programming
Intermediate to advanced skills in the Azure Cloud Data Engineering Stack
Intermediate Linux administration and shell scripting
Advanced working knowledge of SQL and experience working with relational and non-relational database systems
Strong analytic, documentation, and organizational skills
Desired Skills:
Prior experience in health care IT
Working knowledge of Azure DevOps & Automation/Orchestration
Knowledge of open-source software solutions and open-source as a business model
Technical breadth across application development, enterprise architecture, or application integration
Understanding of Agile methodology
Knowledge of APIs, API Integration, and API Management
The information above describes the general nature and level of work assigned to this position. It is not intended to be an exhaustive list of all duties and responsibilities required of position incumbents.
Duke is an Affirmative Action/Equal Opportunity Employer committed to providing employment opportunity without regard to an individual's age, color, disability, gender, gender expression, gender identity, genetic information, national origin, race, religion, sex, sexual orientation, or veteran status.
Duke aspires to create a community built on collaboration, innovation, creativity, and belonging. Our collective success depends on the robust exchange of ideas, an exchange that is best when the rich diversity of our perspectives, backgrounds, and experiences flourishes. To achieve this exchange, it is essential that all members of the community feel secure and welcome, that the contributions of all individuals are respected, and that all voices are heard. All members of our community have a responsibility to uphold these values.
Essential Physical Job Functions: Certain jobs at Duke University and Duke University Health System may include essential job functions that require specific physical and/or mental abilities. Additional information and provision for requests for reasonable accommodation will be provided by each hiring department.
As a world-class academic and health care system, Duke Health strives to transform medicine and health locally and globally through innovative scientific research, rapid translation of breakthrough discoveries, educating future clinical and scientific leaders, advocating and practicing evidence-based medicine to improve community health, and leading efforts to eliminate health inequalities.