Computercraft Corporation

Genetic Sequence Database Product Owner and Data Wrangler

Posted 2 Days Ago

Bethesda, MD

110K-150K Annually

Senior level

Healthtech • Biotech

The Role

The role involves managing the GenBank database, coordinating data exchange, optimizing user workflows, ensuring compliance, and generating downloadable data. The Product Owner will lead data gathering efforts, define product roadmaps, and provide expertise in genetic data curation while collaborating with stakeholders and analyzing incoming sequence data.

Computercraft is looking for a Genetic Sequence Database Product Owner and Data Wrangler to support our work for the National Center for Biotechnology Information (NCBI), part of the National Library of Medicine (NLM) at the National Institutes of Health (NIH).

NCBI, one of the 400 most-visited sites in the world, is the premier biomedical center, hosting over four million daily users in search of clinical, genetic, and other information. NCBI’s wide range of applications (e.g., PubMed, ClinicalTrials.gov), platforms, and environments (e.g., big data [petabytes], machine learning, multiple clouds) serve more users with more data than any other U.S. Government agency. Working on NCBI products, you can help to accelerate the development of cures for diseases like cancer.

The Sequence Archives and Submissions (SeqArch) program needs a Product Owner and Data Wrangler for the GenBank sequence database, a unique scientific resource of human health and genetic data at NCBI. This person will be responsible for coordinating data exchange with the International Nucleotide Sequence Database Collaboration, generating downloadable data for external users, and coordinating targeted updates to the database based on systematic changes in taxonomic information.

In this position you will help manage GenBank’s data-access-related products, tools, and protocols. You will make decisions about the direction of the product and prioritize tasks. You will also work to define development tasks, establish delivery schedules, and ensure compliance with the organization’s policies and procedures.

Job Responsibilities

Develop product vision, goals, and strategic roadmaps
Lead data-gathering efforts (market research, data analysis, and user research) to make balanced, objective decisions, and give delivery teams clear guidance for creating incremental value in an Agile environment
Synthesize data-gathering efforts into a logical organization of epics and user stories for the development team
Collaborate with users and lead cross-functional teams to define and optimize user workflows to improve user experience
Understand customer segments and identify targeted solutions that exceed their needs
Lead teams through a complete product lifecycle of discovery to delivery
Nurture partnerships with various stakeholders who wish to participate in the sharing of genomic data for research in cloud and conventional environments, using secure cross-agency protocols
Participate in external collaborations and work with senior stakeholders
Analyze incoming genetic sequence data for trends
Prioritize the actions of the product team
Critically evaluate datasets and functional annotations to assess quality
Monitor automated dataflows for loading data to production databases
Provide critical expertise to NCBI in biological data curation of genetic sequences
Analyze log files, error files, or test-case “diffs” that can total hundreds of megabytes using tools such as sed, grep, awk, and Perl to confirm known/expected outcomes and identify outlier/problematic outcomes

Required Skills/Experience

B.S. in bioinformatics, molecular biology, data science, computer science, information technology, or a similar field
Excellent verbal and written communication skills
Genomics/bioinformatics experience
Strong understanding of molecular biology concepts
Scientific ETL data model experience/skills
The ability to troubleshoot technical and staffing roadblocks and mitigate resource risks
Experience managing large and cross-functional projects in a complex, policy-driven environment
Strong customer engagement, networking, presentation, and collaboration skills
Ability to incorporate and diplomatically resolve conflicting priorities from multiple user groups and technical stakeholders
Data processing experience in a Linux environment (5+ years)
Experience coaching team members and eliminating knowledge silos

Desired Skills/Experience

Experience working with GenBank or other sequence databases at NIH or other organizations
Experience with data interoperability and sharing standards and policies
Experience working with cloud data storage and processing platforms (e.g., AWS, GCP)
Proficiency in at least one scripting language (e.g., Bash, Python)
Experience working with large SQL databases involving many tables and billions of data rows
Experience with CI/CD pipelines, unit tests, integration, and regression testing
Expertise in bioinformatic sequence analysis and related tools, including BLAST and multiple sequence aligners
Solid understanding of key molecular biology concepts, such as the central dogma that describes the flow of genetic information from gene (DNA) to mRNA to protein
Experience working in Product Owner or Product Manager positions in an Agile environment (e.g., developing vision, strategic plan, roadmap, requirements; applying user testing methodologies; prioritizing features based on value and effort)

The compensation for this position will be based on the experience of the successful candidate. The expected pay range for this position is $110,000 to $150,000.

Top Skills

Bash

Python

The Company

Washington, DC

53 Employees

On-site Workplace

Year Founded: 1981

What We Do

Computercraft, an American Indian– and Woman-owned small business, provides the public with user-friendly access to reliable and current genetic sequence, genomic, chemical, and scientific information.

Our technical and scientific staff work with customers to build and refine high-profile information resources that get accurate health and biomedical research data into the hands of researchers and other stakeholders all over the globe, helping them solve our world’s greatest health challenges.

In some of our other work, we provide program management support and health communication and outreach that directly and indirectly facilitate and sustain our nation’s public health efforts.