David Arndt - Bioinformatician, Software Developer

Work Experience

The Metabolomics Innovation Centre/Wishart Research Group

Edmonton

Bioinformatician, Research Assistant

May 2011–Apr 2014,
Aug 2014–Dec 2020,
Feb 2021–Present

Development work on several leading metabolomics databases:

These include the Human Metabolome Database (HMDB), FooDB, the Toxic Exposome Database, the Small Molecule Pathway Database, and ContaminantDB.
Collected and curated data from third-party sources and published literature, including chemical structures, chemical identifiers (SMILES, InChI, InChIKey), NMR and MS spectra, gene regulation data, health effects, compound descriptions, mechanisms of toxicity, and biochemical pathways. Used Ruby on Rails, MySQL, and JChem.
Used Python, RDKit, and the SMARTS language to generate large sets of chemical structures.

Metagenomics and genomics projects:

Lead developer of PHASTER, a leading web server for prophage prediction in bacterial genomes and metagenomic contigs that processes ~60,000 submissions per month (1.7 million to date). Optimized job processing on the high-performance computing cluster, increasing throughput 3x.
Lead developer of METAGENassist, a web app for metagenomic data analysis. Adapted R functions for univariate analysis, multivariate analysis (PCA, PLS-DA), heatmaps, clustering and supervised learning.
Analyzed metagenomic samples.

Cheminformatics projects:

Contributed to ClassyFire, a 1.5 TB database of chemical classifications. Used Docker to prepare a portable, microservices-based version that scales for high-throughput automatic chemical classification on high-performance computing clusters.
Contributed to CFM-ID (v4.0 release forthcoming).

Contributed to Heatmapper, a web app for creating gene expression and other heatmaps. Used R and Shiny.

>5 years as primary manager of IT infrastructure in a bioinformatics lab, for >70 servers and >110 web applications and internal services.

Regent College

Vancouver

TA, Introductory New Testament Greek

Jun–Aug 2009

Marked exams and quizzes, prepared exercises, and answered students' questions.

Wishart Research Group, University of Alberta

Edmonton

Programmer/Analyst - Bioinformatics

May 2006–May 2008

Worked on several protein structural biology projects, including:

Developed and implemented algorithms for a protein structure refinement program that incorporates NMR chemical shifts (written in C).
Contributed to development of protein secondary and 3D structure prediction pipelines, using C, Perl, Java.

Wishart Research Group, University of Alberta

Edmonton

Programmer/Analyst - Bioinformatics

May–Aug 2004

Worked on an interactive graphical cellular metabolism simulator (Java).

Education

BSc Honors in Molecular Genetics

University of Alberta

1997–2001

Including laboratory methods and a research project.

BSc in Computing Science with Specialization in Bioinformatics

University of Alberta

2002–2005

Including courses in databases, AI, algorithms and bioinformatics.

MA in Theological Studies

Regent College

2008–2015

Historical research thesis.
Ancient languages: Hebrew, Greek.

Professional Skills

Top Skills

90%

Bioinformatics

>10 years

Experience preparing several metabolomics databases, using cheminformatics tools, analyzing metagenomic data and developing structural biology tools.

90%

Ruby on Rails

Advanced, 6 years

Contributed to development of numerous websites, including PHASTER, CFM-ID, and HMDB. Wrote how-to guides and led tutorials for new staff.

80%

R/Shiny

Experienced, 8 years

Adapted statistical and machine learning methods for METAGENassist and used Shiny to develop Heatmapper.

85%

Docker

Experienced, 3 years

Containerized several applications, adopting a microservices architecture. Prepared a containerized application to run on a computing cluster. Led tutorials for staff.

85%

Cloud Infrastructure

Experienced, 6 years

Primary Cloud/IT infrastructure manager at TMIC for >110 web applications and internal services. Experience with AWS, Google Cloud, OpenStack. System administration, network configuration, backup solution implementation.

Other Skills

Python, SQL, Perl/CGI, C, Singularity, Vagrant, HPC Clusters, Bash, Git, Java, JSP, HTML, CSS, French

Portfolio Highlights

Led or worked on teams developing the following projects

PHASTER

HPC Cluster
Rails/Perl/Bash

Heatmapper

R/Shiny
Docker

METAGENassist

R/Python/Perl
Java/JSP

TEDB

Ruby on Rails

CFM-ID

Ruby on Rails
Docker

ClassyFire

Ruby on Rails
Docker

HMDB

Ruby on Rails

ContaminantDB

Rails/Python/RDKit
(in development)

Publications

Arndt, D., Marcu, A., Liang, Y. and Wishart, D.S. (2019) PHAST, PHASTER and PHASTEST: Tools for finding prophage in bacterial genomes. Brief. Bioinformatics, 20(4), 1560–1567.

Djoumbou-Feunang, Y., Pon, A., Karu, N., Zheng, J., Li, C., Arndt, D., Gautam, M., Allen, F. and Wishart, D.S. (2019) CFM-ID 3.0: Significantly Improved ESI-MS/MS Prediction and Compound Identification. Metabolites, 9(4), 72.

Wishart, D.S., Feunang, Y.D., Marcu, A., Guo, A.C., Liang, K., Vázquez-Fresno, R., Sajed, T., Johnson, D., Li, C., Karu, N., et al. (2018) HMDB 4.0: the human metabolome database for 2018. Nucleic Acids Res., 46(D1), D608–D617.

Hafsa, N.E., Berjanskii, M.V., Arndt, D. and Wishart, D.S. (2018) Rapid and reliable protein structure determination via chemical shift threading. J. Biomol. NMR, 70, 33–51.

Ramirez-Gaona, M., Marcu, A., Pon, A., Guo, A.C., Sajed, T., Wishart, N.A., Karu, N., Djoumbou Feunang, Y., Arndt, D. and Wishart, D.S. (2017) YMDB 2.0: a significantly expanded version of the yeast metabolome database. Nucleic Acids Res., 45(D1), D440–D445.

Arndt, D., Grant, J.R., Marcu, A., Sajed, T., Pon, A., Liang, Y. and Wishart, D.S. (2016) PHASTER: a better, faster version of the PHAST phage search tool. Nucleic Acids Res., 44(W1), W16–21.

Babicki, S., Arndt, D., Marcu, A., Liang, Y., Grant, J.R., Maciejewski, A. and Wishart, D.S. (2016) Heatmapper: web-enabled heat mapping for all. Nucleic Acids Res., 44(W1), W147–153.

Berjanskii, M., Arndt, D., Liang, Y. and Wishart, D.S. (2015) A robust algorithm for optimizing protein structures with NMR chemical shifts. J. Biomol. NMR, 63(3), 255–264.

Hafsa, N.E., Arndt, D. and Wishart, D.S. (2015) CSI 3.0: a web server for identifying secondary and super-secondary structure in proteins using NMR chemical shifts. Nucleic Acids Res., 43(W1), W370–377.

Hafsa, N.E., Arndt, D. and Wishart, D.S. (2015) Accessible surface area from NMR chemical shifts. J. Biomol. NMR, 62(3), 387–401.

Wishart, D., Arndt, D., Pon, A., Sajed, T., Guo, A.C., Djoumbou, Y., Knox, C., Wilson, M., Liang, Y., Grant, J., et al. (2015) T3DB: the toxic exposome database. Nucleic Acids Res., 43(D1), D928–934.

Law, V., Knox, C., Djoumbou, Y., Jewison, T., Guo, A.C., Liu, Y., Maciejewski, A., Arndt, D., Wilson, M., Neveu, V., et al. (2014) DrugBank 4.0: shedding new light on drug metabolism. Nucleic Acids Res., 42(D1), D1091–1097.

Jewison, T., Su, Y., Disfany, F.M., Liang, Y., Knox, C., Maciejewski, A., Poelzer, J., Huynh, J., Zhou, Y., Arndt, D., et al. (2014) SMPDB 2.0: big improvements to the Small Molecule Pathway Database. Nucleic Acids Res., 42(D1), D478–484.

Wishart, D.S., Jewison, T., Guo, A.C., Wilson, M., Knox, C., Liu, Y., Djoumbou, Y., Mandal, R., Aziat, F., Dong, E., Bouatra, S., Sinelnikov, I., Arndt, D., et al. (2013) HMDB 3.0—The Human Metabolome Database in 2013. Nucleic Acids Res., 41(D1), D801–807.

Arndt, D., Xia, J., Liu, Y., Zhou, Y., Guo, A.C., Cruz, J.A., Sinelnikov, I., Budwill, K., Nesbø, C.L. and Wishart, D.S. (2012) METAGENassist: a comprehensive web server for comparative metagenomics. Nucleic Acids Res., 40(Web Server issue), W88–95.

Wishart, D.S., Arndt, D., Berjanskii, M., Tang, P., Zhou, J. and Lin, G. (2008) CS23D: a web server for rapid protein structure generation using NMR chemical shifts and sequence data. Nucleic Acids Res., 36(Web Server issue), W496–502.

Montgomerie, S., Cruz, J.A., Shrivastava, S., Arndt, D., Berjanskii, M. and Wishart, D.S. (2008) PROTEUS2: a web server for comprehensive protein structure prediction and structure-based annotation. Nucleic Acids Res., 36(Web Server issue), W202–209.

Wishart, D.S., Arndt, D., Berjanskii, M., Guo, A.C., Shi, Y., Shrivastava, S., Zhou, J., Zhou, Y. and Lin, G. (2008) PPT-DB: the protein property prediction and testing database. Nucleic Acids Res., 36(Database issue), D222–229.

Shi, Y., Zhou, J., Arndt, D., Wishart, D.S. and Lin, G. (2008) Protein contact order prediction from primary sequences. BMC Bioinformatics, 9, 255.

Wishart, D.S., Tzur, D., Knox, C., Eisner, R., Guo, A.C., Young, N., Cheng, D., Jewell, K., Arndt, D., Sawhney, S., et al. (2007) HMDB: the Human Metabolome Database. Nucleic Acids Res., 35(Database issue), D521–526.

Wishart, D.S., Yang, R., Arndt, D., Tang, P. and Cruz, J. (2005) Dynamic cellular automata: an alternative approach to cellular simulation. In Silico Biol., 5(2), 139–161.

Get in Touch

I'm always interested in getting involved in new projects and in applying and expanding my skills in new areas.

I have strengths in the following areas:

Working with bioinformatics data from a range of fields
Web app development (Ruby on Rails, Shiny, etc.)
Big data processing on cluster/cloud environments
Database design and implementation
Working collaboratively with partners and on small teams

I can be reached at [email protected]
