OrbiTox – A Translational Discovery Platform with Concerted View and Analysis of Big Data
OrbiTox presents Big Data in an intuitive, immersive, and fluid 3D environment. It is a one-of-a-kind computational translational discovery platform for interactive 3D visualization of large (about a million objects), high-dimensional (about 1,000 features), multi-domain toxicology and drug discovery data. It enables translational discovery by letting users deduce knowledge embedded in connections across multiple data domains. OrbiTox has a rich, user-friendly interface whose visualizations refresh almost instantly.
OrbiTox is also a data gap-filling platform. It contains cheminformatics tools capable of discovering relationships across diverse data domains, such as the activities of chemicals on gene targets and organisms and the gene membership of pathways, thus enabling read-across analysis with a simultaneous view of chemical and biological similarity. One of the unique capabilities of OrbiTox is that it offers validated, interpretable in silico models that provide chemistry-backed reasoning for each prediction. Based on the structural features and motifs responsible for a prediction, a toxicologist can hypothesize mechanistic steps for a given chemical, or a drug discovery scientist can identify favorable and unfavorable chemotypes for a desired property profile and prioritize experimental testing.
In addition to housing clean, curated, publicly available and predicted data, OrbiTox has provisions for adding proprietary data so that users can investigate its relationships with the layers of available data. This allows users to analyze proprietary data alongside public data without having to share it.
OrbiTox houses large amounts of data organized in concentric orbits, each containing data from a specific domain: chemistry, genes, pathways, and species. Objects in each orbit are clustered according to how closely they are related. The orbits are ordered by the complexity of their objects. Currently, the Chemical orbit contains approximately 900,000 chemicals from the US EPA DSSTox database along with their structures, names, and other properties and identifiers. The Gene orbit has gene and corresponding protein information for about 25,000 human genes. Around 2,000 annotated pathways with names and member genes constitute the Pathway orbit. Finally, the Species orbit, the smallest and most complex, has nearly 200 test organisms with genus, family, life span, and some biometrics. Through this data organization, OrbiTox fulfills a growing need for a concerted view of multi-domain data from which users can interactively deduce knowledge embedded in the connections among these data. The technology underlying OrbiTox refreshes the 3D display almost instantaneously and provides extremely fast searching and filtering on numeric and textual fields.
Interactions, both experimental and predicted, between objects from different orbits are displayed as connections (also referred to as ‘links’ or ‘edges’). Typical links run from the Chemical orbit to the Gene and Species orbits and from the Gene orbit to the Pathway orbit. For example, an experimental or predicted Ames test outcome in the TA100 strain is displayed as a link between the chemical and the TA100 strain in the Species orbit. Similarly, the fructose metabolism pathway shows links between the Pathway and Gene orbits, whereas the aryl hydrocarbon receptor (AHR) activity of hydramethylnon (DTXSID6023868) produces a link between the Chemical and Gene orbits.
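One way to picture the orbit-and-link organization described above is as a small graph whose nodes carry an orbit label. The following is a minimal sketch under stated assumptions, not OrbiTox's actual data model: the class names `Node` and `OrbitGraph`, the `evidence` labels, and all object names in the usage lines are hypothetical placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Orbit names follow the text; everything else is illustrative.
ORBITS = ("Chemical", "Gene", "Pathway", "Species")


@dataclass(frozen=True)
class Node:
    orbit: str  # one of ORBITS
    name: str   # placeholder identifier for a chemical, gene, pathway, or strain


@dataclass
class OrbitGraph:
    # Maps each node to a set of (neighbor, evidence) pairs.
    links: dict = field(default_factory=lambda: defaultdict(set))

    def add_link(self, a: Node, b: Node, evidence: str) -> None:
        """Store an undirected link between objects in two different orbits."""
        if a.orbit not in ORBITS or b.orbit not in ORBITS:
            raise ValueError("unknown orbit")
        if a.orbit == b.orbit:
            raise ValueError("links connect objects from different orbits")
        self.links[a].add((b, evidence))
        self.links[b].add((a, evidence))

    def neighbors(self, node: Node, orbit: str) -> list:
        """Sorted names of objects in `orbit` that are linked to `node`."""
        return sorted({n.name for n, _ in self.links.get(node, ()) if n.orbit == orbit})


# Toy data only: these names and evidence labels are made up for illustration.
graph = OrbitGraph()
chem = Node("Chemical", "chemical_A")
gene = Node("Gene", "GENE_1")
strain = Node("Species", "strain_S")
pathway = Node("Pathway", "pathway_P")
graph.add_link(chem, gene, "experimental")
graph.add_link(chem, strain, "predicted")
graph.add_link(gene, pathway, "membership")

print(graph.neighbors(chem, "Gene"))     # genes linked to the chemical
print(graph.neighbors(gene, "Pathway"))  # pathways the gene belongs to
```

Because every link is stored in both directions, a query can start from any orbit, for example from a gene to the chemicals linked to it.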
Data connections are currently drawn from various public data repositories, after additional curation, mapping, and cleaning by our data experts. The data sources include qHTS data generated through the Tox21 program, mutagenicity data from ToxNet, and rodent carcinogenicity data from ToxNet, NTP technical reports, CPDB, and IARC.
Through this interactive, concerted view of multi-domain data and the connections between them, refreshed almost instantaneously in 3D, OrbiTox fulfills a growing need for translational discovery. Because the orbits are connected, a query can be initiated from any domain, with extremely fast searching and filtering on numeric and textual fields.
OrbiTox can readily incorporate large volumes of additional data, integrate them intelligently, and visualize them intuitively. Users can therefore add proprietary data and investigate its relationships with public data in a secure environment without sharing it publicly.
As a data gap-filling platform, OrbiTox contains validated, interpretable predictive in silico models developed with recursive partitioning (RP) methodology. The models use as descriptors our recently developed set of 834 chemistry-aware, chemically viable functional groups and moieties, called Saagar [1,2]. Because Saagar substructures and moieties are interpretable by design, the predictive models in OrbiTox provide chemistry-based reasoning for each prediction. Please explore https://www.sciome.com/saagar/ for further details on interpretable Saagar-RP models.
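To illustrate why recursive partitioning on substructure descriptors yields interpretable predictions, here is a minimal sketch of a generic RP classifier over presence/absence features. It is not the Saagar-RP implementation: the Gini impurity criterion, the depth limit, the function names, and the fragment names and labels in the toy data are all assumptions made for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, features):
    """Feature whose present/absent split has the lowest weighted impurity."""
    best = None
    for f in features:
        present = [y for x, y in zip(rows, labels) if f in x]
        absent = [y for x, y in zip(rows, labels) if f not in x]
        if not present or not absent:
            continue  # this feature does not separate anything
        score = (len(present) * gini(present) + len(absent) * gini(absent)) / len(labels)
        if best is None or score < best[0]:
            best = (score, f)
    return best[1] if best else None


def grow(rows, labels, features, depth=2):
    """Recursively partition non-empty data; leaves hold the majority label."""
    majority = Counter(labels).most_common(1)[0][0]
    f = best_split(rows, labels, features) if depth > 0 and gini(labels) > 0 else None
    if f is None:
        return {"leaf": majority}
    yes = [(x, y) for x, y in zip(rows, labels) if f in x]
    no = [(x, y) for x, y in zip(rows, labels) if f not in x]
    rest = [g for g in features if g != f]
    return {
        "feature": f,
        "present": grow([x for x, _ in yes], [y for _, y in yes], rest, depth - 1),
        "absent": grow([x for x, _ in no], [y for _, y in no], rest, depth - 1),
    }


def predict(tree, x):
    """Return (label, reasons); reasons lists each substructure test applied."""
    reasons = []
    while "leaf" not in tree:
        has = tree["feature"] in x
        reasons.append(f"{tree['feature']} {'present' if has else 'absent'}")
        tree = tree["present" if has else "absent"]
    return tree["leaf"], reasons


# Toy data only: fragment names and activity labels are invented, not real results.
rows = [{"frag_A", "frag_B"}, {"frag_A"}, {"frag_B"}, {"frag_C"},
        {"frag_A", "frag_C"}, {"frag_B", "frag_C"}]
labels = ["active", "active", "inactive", "inactive", "active", "inactive"]
tree = grow(rows, labels, ["frag_A", "frag_B", "frag_C"])
print(predict(tree, {"frag_A", "frag_D"}))
```

Each prediction comes back with the list of substructure tests on the path to its leaf, which is the kind of chemistry-based reasoning the text attributes to interpretable descriptors.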
Saagar descriptors have been shown to retrieve similar compounds with better scaffold similarity and more accurate substituents and substituent positions than the well-known fingerprints MACCS, ECFP4, PubChem, and ToxPrint. The example here shows structures and similarity coefficients between the query chloroanthraquinone and AhR-active compounds retrieved from 1,223 compounds assayed in the Tox21 AhR assay. This capability makes OrbiTox a unique tool for selecting analogues for read-across assessments.
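Analogue retrieval of this kind amounts to ranking library compounds by a similarity coefficient computed on their substructure keys. The sketch below assumes the Tanimoto coefficient, a common choice for fingerprint similarity; the text does not state which coefficient OrbiTox uses. The function names, key names, and candidate names are hypothetical.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient between two sets of substructure keys."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_analogues(query: set, library: dict, top_n: int = 3) -> list:
    """Return up to top_n (name, similarity) pairs, most similar first."""
    scored = [(name, tanimoto(query, keys)) for name, keys in library.items()]
    # Sort by descending similarity, then by name so that ties are stable.
    return sorted(scored, key=lambda item: (-item[1], item[0]))[:top_n]


# Toy data only: keys and candidates are placeholders, not real descriptors.
query = {"k1", "k2", "k3", "k4"}
library = {
    "cand_A": {"k1", "k2", "k3"},
    "cand_B": {"k1", "k5"},
    "cand_C": {"k1", "k2", "k3", "k4", "k6"},
}
for name, score in rank_analogues(query, library):
    print(name, round(score, 2))
```

The quality of the ranking depends on the keys themselves, which is where the choice of descriptor set matters.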
The current release provides 41 in silico models that predict outcomes in Tox21 assays at two concentration thresholds. Predictive models for the outcome of the Ames test, with and without metabolic activation, in OECD-recommended strains can also be accessed through OrbiTox.
Publications and Presentations
-  Sedykh AY; Shah RR; Kleinstreuer NC; Auerbach SS; Gombar VK (2021). “Saagar-A New, Extensible Set of Molecular Substructures for QSAR/QSPR and Read-Across Predictions.” Chemical Research in Toxicology 34(2):634-640.
-  Sedykh A; Shah R; Kleinstreuer N; Auerbach S; Gombar V (2020). “Interpretable QSAR and read-across predictions using Saagar – A new, extensible set of molecular substructures.” Poster presentation at the 19th International Workshop on (Q)SAR in Environmental and Health Sciences, Durham, NC.