Eli Lilly and Company

Statistical Genetics Platform Engineer

Posted 10 Days Ago
In-Office
2 Locations
167K-266K Annually
Senior level
The Statistical Genetics Platform Engineer will develop scalable computational pipelines and tools for statistical genetics analyses, collaborating with researchers to enhance data-driven decision-making in therapeutic target evaluation.
The summary above was generated by AI

At Lilly, we unite caring with discovery to make life better for people around the world. We are a global healthcare leader headquartered in Indianapolis, Indiana. Our employees around the world work to discover and bring life-changing medicines to those who need them, improve the understanding and management of disease, and give back to our communities through philanthropy and volunteerism. We give our best effort to our work, and we put people first. We’re looking for people who are determined to make life better for people around the world.

The Lilly research environment is evolving to centralize access to, and analysis of, human genetic data. This new initiative will define the data, tools, and processes that give therapy area teams key evidence for target evaluation and target discovery.

Therapy areas across Eli Lilly pursue new therapeutic approaches for the treatment of a wide range of diseases. Starting from an idea, we work with partners across Lilly to discover and develop novel biologic, small molecule and nucleic acid-based therapeutics. Our focus is the patient: by understanding the biology and pathophysiology underlying disease states, we aim to address the root cause of disease and develop breakthrough therapies. We have one of the strongest pipelines in the industry and a track record of delivering impactful medicines that improve people’s lives.

In this hands-on role, the Statistical Genetics Platform Engineer will join a team that enables statistical geneticists to derive scientific insights from internal and external human genetic data, with the ultimate purpose of supporting data-driven decision-making within the organization. The successful candidate will collaborate with team members as well as with data engineers and platform architects across the Lilly research environment. The goals of the collaboration include identifying genetically based disease targets, finding potential expanded clinical indications for existing assets, classifying and validating patient subpopulations, and understanding disease mechanisms. The role will support these goals by developing robust computational pipelines that leverage harmonized clinical datasets. This role is a great opportunity to be at the forefront of scientific exploration in a dynamic research field.

Interested in working on an innovative team focusing on providing clear evidence for therapeutic targets? Apply today!

Key Responsibilities:

  • Design and implement robust, scalable computational pipelines for statistical genetics analyses, including workflows for GWAS, polygenic risk scores, fine-mapping, colocalization and variant annotation
  • Develop and maintain platform tools and APIs that enable researchers to efficiently process genomic data at scale (biobanks, population cohorts, multi-omics datasets)
  • Build infrastructure for reproducible research, including containerization, workflow orchestration, and version control for analytical pipelines
  • Optimize computational performance of statistical genetics algorithms and implement distributed computing solutions for large-scale analyses
  • Collaborate with statistical geneticists and computational biologists to translate methodological innovations into production-ready software
  • Establish best practices for data access, quality control, validation, and documentation across genomic analysis pipelines
  • Maintain and improve existing codebases, ensuring code quality, testing coverage, and comprehensive documentation
  • Monitor platform performance, solve issues, and implement improvements based on user feedback and evolving research needs
  • Support the integration of AI-based tools and required MLOps infrastructure

Basic Requirements:

  • Master’s in Computer Science, Statistical Genetics, Bioinformatics or related field and 6+ years post-Master’s experience (in industry or large-scale non-academic institutions, e.g. Broad, NIH)
  • OR PhD in Computer Science, Statistical Genetics, Bioinformatics or related field and 3+ years post-PhD experience (in industry or large-scale non-academic institutions, e.g. Broad, NIH)

Key Requirements:

  • Strong programming skills in languages commonly used in genomics research (Python, R)
  • Demonstrable understanding of statistical genetics concepts including GWAS, heritability estimation, genetic correlation, rare variant analysis, and population structure
  • Experience using standard tools and formats for genetic data (VCF, BGEN, PLINK, BAM/CRAM) and genomic databases
  • Proficiency with workflow management systems (Nextflow, Cromwell/WDL) and containerization technologies
  • Experience with high-performance computing environments, cloud platforms (AWS, GCP, Azure), or distributed computing frameworks
  • Strong problem-solving abilities and attention to detail in handling complex biological datasets
  • Ability to prioritize and manage multiple competing priorities within a fast-paced environment

Additional Skills/Preferences:

  • Demonstrated track record performing end-to-end analysis of human genetic data
  • Familiarity with operationalizing statistical genetics tools like plink, ADMIXTURE, regenie, VEP, LDSC, FINEMAP, SuSiE, coloc, METAL, LDpred2, MAGMA, rvtest (rare variants), SNP-int GPU (epistasis)
  • Experience performing complex analyses in cloud-based environments required; prior experience with DNANexus and/or DataBricks is preferred
  • Experience with large-scale biobanks and their trusted research environments is preferred
  • Experience querying data for analysis through SQL (e.g. PostgreSQL), NoSQL (e.g. Elasticsearch), data stores (e.g. Hail), graph databases (e.g. Neo4j) and file storage (e.g. S3)
  • Experience working with additional data formats, including RNA-seq, metabolomic, and proteomic data 
  • Experience with protein language models and/or sequence language models
  • Experience working with clinical data 
  • Experience working with electronic health record data 
  • Knowledge of ADaM and OMOP formats
  • Strongly team-oriented with a customer focused design thinking approach
  • Knowledge of drug development process and how genomics data is used to impact these areas

Lilly is dedicated to helping individuals with disabilities to actively engage in the workforce, ensuring equal opportunities when vying for positions. If you require accommodation to submit a resume for a position at Lilly, please complete the accommodation request form (https://careers.lilly.com/us/en/workplace-accommodation) for further assistance. Please note this is for individuals to request an accommodation as part of the application process and any other correspondence will not receive a response.

Lilly is proud to be an EEO Employer and does not discriminate on the basis of age, race, color, religion, gender identity, sex, gender expression, sexual orientation, genetic information, ancestry, national origin, protected veteran status, disability, or any other legally protected status.


Our employee resource groups (ERGs) offer strong support networks for their members and are open to all employees. Our current groups include: Africa, Middle East, Central Asia Network, Black Employees at Lilly, Chinese Culture Network, Japanese International Leadership Network (JILN), Lilly India Network, Organization of Latinx at Lilly (OLA), PRIDE (LGBTQ+ Allies), Veterans Leadership Network (VLN), Women’s Initiative for Leading at Lilly (WILL), enAble (for people with disabilities). Learn more about all of our groups.

Actual compensation will depend on a candidate’s education, experience, skills, and geographic location. The anticipated wage for this position is $166,500 - $266,200.

Full-time equivalent employees also will be eligible for a company bonus (depending, in part, on company and individual performance). In addition, Lilly offers a comprehensive benefit program to eligible employees, including eligibility to participate in a company-sponsored 401(k); pension; vacation benefits; eligibility for medical, dental, vision and prescription drug benefits; flexible benefits (e.g., healthcare and/or dependent day care flexible spending accounts); life insurance and death benefits; certain time off and leave of absence benefits; and well-being benefits (e.g., employee assistance program, fitness benefits, and employee clubs and activities). Lilly reserves the right to amend, modify, or terminate its compensation and benefit programs in its sole discretion and Lilly’s compensation practices and guidelines will apply regarding the details of any promotion or transfer of Lilly employees.

#WeAreLilly

Top Skills

AWS
Azure
Cromwell/WDL
Elasticsearch
GCP
Nextflow
NoSQL
Postgres
Python
R
S3
SQL
