Tutorial in Translational Informatics

ICITCS2017 Tutorial in Translational Informatics: Balancing Functionality and Security

Introduction to Translational Informatics (Facelli)
In this introductory lecture, we will discuss the concept of Translational Science, the endeavor that
facilitates the translation of basic science discoveries in the biomedical sciences into improvements in
the health of individuals and the public. As the complexity of the biomedical sciences increases,
informatics plays a key role in facilitating this knowledge transfer. While many informatics
techniques transfer to the biomedical sciences, substantial adaptations are needed in view of the
criticality of human health and the privacy concerns related to health data. In this lecture, we introduce
the basic concepts of translational informatics and compare the informatics
requirements of translational science with those of other informatics applications.

Open Source Software for Translational Informatics (LaSalle)
This lecture will build on the previous introduction, leading towards its main topic: A Current
Survey of Open Source Tools and Applications for Translational Research. This will be a
comprehensive review of tools for data management, bio-specimen repositories, imaging data, and
data management issues specific to ‘omics’ domains. The objective of the lecture is to describe
the resources available across multiple domains within translational
research, how they are used, and where to find them.

Building a Compliant/Secure Environment for Translational Science (Cheatham)
High-performance computing (HPC) centers traditionally have far less restrictive privacy
management policies than those encountered in healthcare. We show how an HPC environment
can be re-engineered to accommodate biomedical data while retaining its utility in computationally
intensive tasks such as data mining, machine learning, and statistics for translational science. We
also discuss deploying protected virtual machines and the critical planning steps needed
to engage the university’s information security operations and the information security and privacy

Data Integration Methods (Gouripeddi)
Central to integration of any data is the need to manage (1) identities of the different players
(patients, research participants, healthcare providers, and organizations) involved, (2) semantics
(ontologies and metadata) within data, and (3) persistence of integrated data. Distributed
computing methods have low penetration in the biomedical domain due to its semantic complexity. In
this lecture, we will discuss different distributed computing approaches for data integration, with
a special focus on dynamic federation approaches using the OpenFurther (OF) platform as an
example. OF is an open-source informatics platform that integrates heterogeneous and disparate
data sources. It empowers translational researchers with the ability to assess feasibility of particular
research studies, export biomedical datasets for analysis, and create aggregate databases. With the
added abilities of on-the-fly probabilistic resolution of identities of unique individuals and its
federated query engine’s ability to grant query-based data for only permitted individuals, OF
ensures privacy and security without limiting the performance of translational research. In
addition, it systematically supports federated (retaining data at their sources with query-based
ephemeral or persisted stores) and centralized (comprehensive source aggregation and persisted
stores) data governance models. We will also discuss and demonstrate OF’s components and its
use in different translational research including exposomic research that utilizes d methodologies
for metadata discovery and semantically consistent data integration, as well as to support temporal
reasoning and research reproducibility.
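
As a rough illustration of two ideas named above, federated querying restricted to permitted individuals and probabilistic identity resolution, the following minimal Python sketch may help. It is not OpenFurther code and does not use its API. The record fields, the agreement probabilities, the threshold, and the function names are all hypothetical, and "permitted individuals" is read here as the set of people whose data the requester is allowed to see. The scoring follows the general log-likelihood-ratio style of probabilistic record linkage.

```python
from dataclasses import dataclass
from math import log2


@dataclass(frozen=True)
class Record:
    source: str      # name of the data source holding the record
    local_id: str    # identifier that is only meaningful inside that source
    name: str
    birth_date: str
    zip_code: str


# Hypothetical per-field probabilities (illustrative values, not estimates):
# m = P(field agrees | same person), u = P(field agrees | different people).
FIELD_WEIGHTS = {
    "name": (0.95, 0.01),
    "birth_date": (0.98, 0.003),
    "zip_code": (0.90, 0.05),
}


def match_score(a: Record, b: Record) -> float:
    """Sum of log-likelihood ratios over fields; higher means more likely the same person."""
    score = 0.0
    for field, (m, u) in FIELD_WEIGHTS.items():
        if getattr(a, field) == getattr(b, field):
            score += log2(m / u)
        else:
            score += log2((1 - m) / (1 - u))
    return score


def federated_query(sources, permitted, predicate):
    """Evaluate the predicate at each source, keeping only records of permitted individuals."""
    results = []
    for records in sources.values():
        results.extend(
            r for r in records
            if (r.source, r.local_id) in permitted and predicate(r)
        )
    return results


def link(records, threshold=5.0):
    """Greedy linkage: add a record to the first cluster whose first member scores above threshold."""
    clusters = []
    for r in records:
        for cluster in clusters:
            if match_score(cluster[0], r) >= threshold:
                cluster.append(r)
                break
        else:
            clusters.append([r])
    return clusters
```

In this sketch the data stay with their sources and only the filtered query results are brought together for linkage, which loosely mirrors the federated model described above. A centralized model would instead aggregate all sources into one persisted store before querying.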

Genomics for Translational Science (Lee)
High-throughput technologies successfully capture diverse genome-wide sequence information,
quantitative gene expression, and regulatory information. The huge volumes of ‘omics’ data
generated by these technologies have made remarkable contributions to building a comprehensive
list of functional elements in the human genome. This lecture will present the state of the art
in translating these data into biological and clinical knowledge. It will give
a summary of research that contributes to this endeavor, focusing on characterizing
the systems-level properties and genetic/molecular basis of human disease by integrating and
interpreting heterogeneous multi-omics data, including genome, transcriptome, and epigenome.

Predictive Modeling in Translational Science (Abdelrahman)
In this tutorial, we will survey state-of-the-art methods for analyzing data from clinical
trials using modern computational techniques such as artificial intelligence and machine learning.
The tutorial will place special emphasis on methods for overcoming recruitment challenges by
better targeting potential participants, and on classes of predictive modeling as promising
clinical trial solutions. The tutorial will be illustrated with state-of-the-art research
examples. The necessary background in mathematics, statistics, and predictive modeling will be
introduced first.

Tutorial Schedule

00:00 Introduction to Translational Informatics: Computational and Security Challenges
Professor Julio C. Facelli (30 minutes)

00:30 Data Integration Methods
Professor Ramkiran Gouripeddi (45 minutes)

15-minute break and Q&A

01:30 Genomics for Translational Science
Professor Younghee Lee (30 minutes)

02:00 Predictive Modeling in Translational Science
Professor Samir Abdelrahman (30 minutes)

15-minute break and Q&A

02:45 Building a Compliant/Secure Environment for Translational Science
Professor Thomas Cheatham III (30 minutes)

03:15 Open Source Software for Translational Informatics
Professor Bernard LaSalle (45 minutes)