The LINCS project is based on the premise that disrupting any one of the many steps of a given biological process will cause related changes in the molecular and cellular characteristics, behavior, and/or function of the cell – the observable composite of which is known as the cellular phenotype. Observing how and when a cell’s phenotype is altered by specific stressors can provide clues about the underlying mechanisms involved in perturbation and, ultimately, disease.
LINCS data are being made openly available as a community resource through a series of data releases, to enable scientists to address a broad range of basic research questions and to facilitate the identification of biological targets for new disease therapies. LINCS datasets consist of assay results from cultured and primary human cells treated with bioactive small molecules, ligands such as growth factors and cytokines, or genetic perturbations. Many different assays are used to monitor cell responses, including assays measuring transcript and protein expression; cell phenotype data are captured by biochemical and imaging readouts. Assays are typically carried out on multiple cell types and at multiple timepoints; perturbagen activity is monitored at multiple doses.
The LINCS program is an NIH Common Fund program and has been implemented in two phases. The pilot phase of the program was completed in FY 2013 and focused on the following activities:
- Large-scale production of perturbation-induced molecular and cellular signatures
- Creation of databases, common data standards, and public user interfaces for accessing the data
- Computational tool development and integrative data analyses
- Development of new cost-effective molecular and cellular phenotypic assays
- Integration of existing datasets into LINCS
Phase 2, which began in FY 2014, supports six LINCS Data and Signature Generation Centers. Phase 2 also synergizes with the efforts of the NIH Big Data to Knowledge (BD2K) program through the BD2K-LINCS Data Coordination and Integration Center (DCIC), which focuses on the following activities:
- Work with data/signature generators to ensure common annotation of data
- Coordinate data and signature accessibility across data generators so that data can be easily accessed
- Ensure that tools/algorithms can be found through the portal and that they are annotated
- Coordinate outreach activities across the LINCS consortium
- Develop and implement tools/approaches for integrative queries across multiple LINCS data/signature types
Q & A
What is the overall goal of this program?
- LINCS is working to establish a new understanding of health and disease through an integrative approach that identifies patterns of common networks and cellular responses (called cellular signatures) across different types of tissues and cells in response to a broad range of perturbations.
- The underlying premise of the LINCS program is that disrupting any one of the many steps of a given biological process will cause related changes in the molecular and cellular characteristics, behavior, and/or function of the cell – also known as the cellular phenotype. A cellular phenotype, in turn, can be reflected by signatures derived from comparable assays of clinical states. Observing how and when a cell phenotype is altered by specific stressors can provide clues about the mechanisms involved in perturbation and, ultimately, disease.
What are cellular signatures?
- A cellular signature of a perturbagen response is the set of reduced-dimensionality descriptors of the underlying data that provide insight into mechanism and serve as predictors. Meaningful signatures therefore depend on the assay and on how diverse assays are integrated, either into predictive patterns or into signaling networks that could lead to mechanistic interpretations.
- To develop meaningful signatures, data from diverse assays must be scaled and normalized. Integrating, normalizing, and scaling heterogeneous, multi-parameter dose-response data is a non-trivial task involving conceptual and practical hurdles.
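As a rough illustration of why scaling matters, the sketch below z-scores readouts from two assays so that they share a common, unitless scale before being combined per condition. The readout values, the variable names, and the choice of z-scoring are assumptions made for illustration; this is not the normalization procedure used by any LINCS center.

```python
from statistics import mean, stdev

def zscore(values):
    """Rescale one assay's readouts to zero mean and unit sample standard deviation."""
    mu = mean(values)
    sigma = stdev(values)  # requires at least two readouts
    if sigma == 0:
        # A constant readout carries no information about differences between conditions.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical readouts for the same four treatment conditions, in assay-specific units.
transcript_signal = [120.0, 340.0, 560.0, 780.0]  # arbitrary intensity units
viability_fraction = [0.95, 0.80, 0.40, 0.10]     # fraction of live cells

normalized = {
    "transcript": zscore(transcript_signal),
    "viability": zscore(viability_fraction),
}

# One combined, unitless profile per condition: (transcript z-score, viability z-score).
profiles = list(zip(normalized["transcript"], normalized["viability"]))
for condition, profile in enumerate(profiles, start=1):
    print(condition, [round(z, 2) for z in profile])
```

Without a step like this, the assay with the largest raw numbers would dominate any combined profile. Real dose-response data raise further questions, such as which control to normalize against, that this sketch does not address.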
What assays are being run?
- The program has six Data and Signature Generation Centers: Drug Toxicity Signature Generation Center, HMS LINCS Center, LINCS Center for Transcriptomics, LINCS Proteomic Characterization Center for Signaling and Epigenetics, MEP LINCS Center, and NeuroLINCS Center.
- The Drug Toxicity Signature Generation Center monitors gene and protein expression and applies phenotypic assays to understand the response of differentiated iPSCs to FDA-approved drugs, given singly and in combination.
- The HMS LINCS Center monitors cell responses using multiple biochemical, imaging and cell biological assays. They range from direct assays of drug-kinase interaction in cell extracts, to multiplex biochemical assays of cell signaling proteins, to imaging assays, to assays of transcriptional response (in collaboration with the LINCS Center for Transcriptomics) and cell viability assays. More details about HMS LINCS Center assays can be found here.
- The LINCS Center for Transcriptomics uses the L1000 assay, a gene-expression profiling assay based on the direct measurement of a reduced representation of the transcriptome and computational inference of the portion of the transcriptome not explicitly measured. Measurements are (a) of endogenous mRNA (i.e., not from a reporter-based system) and (b) taken from treated whole-cell lysates. Detection currently uses the Luminex system, which is based on optically addressed microspheres.
- The LINCS Proteomic Characterization Center for Signaling and Epigenetics performs two assays: P100 reduced-representation phosphoprofiling (3-hour time point) and GCP global chromatin profiling (24-hour time point). The P100 assay is a mass spectrometry-based targeted proteomics assay that detects and quantifies a representative set of ~100 phosphopeptide probes that are present in a wide range of cell types and have been shown to be modulated by perturbations. The Global Chromatin Profiling assay is a mass spectrometry-based targeted proteomics assay that detects and quantifies an extensive set of chromatin modifications (specifically, post-translational modifications on histone proteins).
- The MEP LINCS Center starts with a novel assay that images cancer cell lines placed in a micro-environment array. Selected conditions are then followed up with transcriptomics (L1000) and proteomics assays.
- The NeuroLINCS Center studies the properties of iPSCs derived from healthy donors and from patients with familial or sporadic ALS. These cells are differentiated into motor neurons and assayed using targeted proteomics, transcriptomics (RNA-seq), and imaging assays.
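The L1000 idea described above, measuring a reduced set of transcripts directly and inferring the rest computationally, can be illustrated with a toy example. The sketch below fits a one-variable least-squares line from a measured gene to an unmeasured gene using reference profiles, then applies it to a new sample. The numbers, the gene roles, and the use of simple linear regression are assumptions for illustration only; this is not the inference model used by the LINCS Center for Transcriptomics.

```python
def fit_line(x, y):
    """Ordinary least squares for y ≈ slope * x + intercept with a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx  # assumes the measured gene varies across the reference profiles
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical reference profiles in which both genes were measured.
measured_gene = [1.0, 2.0, 3.0, 4.0]
unmeasured_gene = [2.1, 3.9, 6.2, 7.8]

slope, intercept = fit_line(measured_gene, unmeasured_gene)

def infer(measured_value):
    """Predict the unmeasured gene's expression from the measured gene alone."""
    return slope * measured_value + intercept

# In a new sample only the measured gene is assayed; the other is inferred.
new_sample_measured = 2.5
print(round(infer(new_sample_measured), 3))
```

The appeal of this design is cost: only the reduced set needs to be assayed for each sample, while the inferred portion comes from a model fitted once on reference data. The quality of the inferred values depends entirely on how well that model generalizes.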
What perturbations are being used?
- The program uses small molecules, ligands, micro-environments, and CRISPR gene over-expression and knockdown perturbations.
- Each Data and Signature Generation Center has its own set of perturbations, determined primarily by the assay technology used to measure the response.
Is there a timeline available for release of data?
- New releases of data will become available every quarter, and the release schedule is summarized here.
- Metadata annotations of LINCS data will be available along with each data release. We are working diligently to include sufficient metadata with each release of data.
- In addition to the data generated by the LINCS Data and Signature Generation Centers, the BD2K-LINCS Data Coordination and Integration Center is extracting signatures from the public domain, and these can be accessed here. APIs, attribute tables, and metadata are expected to be provided soon.
So where are the signatures?
- The signatures are available in two ways: (a) APIs for programmatically searching and downloading the signatures; (b) tools that display the signatures and provide integrative analyses.
What are your future plans?
- In the coming months you will see more user interfaces to query LINCS data and signatures, and publications demonstrating the utility of the LINCS approach.
- We are also focusing on generating data in primary cells and iPS cells, and data pertaining to these will be available soon.
What data integration challenges are you taking on?
- Data are being collected through a joint project called the “Dense Cube” that involves all the Data and Signature Generation Centers. This project will focus on five common cell lines to explore the relationships between immediate early cell signaling events, transcription, and phenotypes. This will constitute the largest such public dataset, generated in a coherent manner, available for download and querying.
- One of the most significant challenges we have is to establish standards for metadata for all LINCS perturbations, assays, and experiments. We want to annotate these consistently, and to make them widely available so that outside groups can perform their own analysis and develop better methods to extract signatures from the LINCS data.
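To make the idea of consistent annotation concrete, the sketch below defines one possible metadata record and a small check that flags missing or inconsistent values. The field names, the allowed perturbagen types, and the example values are hypothetical and chosen for illustration; they are not the LINCS metadata standard.

```python
from dataclasses import dataclass, fields

# Hypothetical controlled vocabulary; a real standard would define its own terms.
ALLOWED_PERTURBAGEN_TYPES = {"small molecule", "ligand", "micro-environment", "genetic"}

@dataclass(frozen=True)
class ExperimentMetadata:
    center: str
    assay: str
    cell_line: str
    perturbagen: str
    perturbagen_type: str
    timepoint_hours: float

def validate(record):
    """Return a list of human-readable problems; an empty list means the record passes."""
    problems = []
    for field in fields(record):
        value = getattr(record, field.name)
        if isinstance(value, str) and not value.strip():
            problems.append(f"{field.name} is empty")
    if record.perturbagen_type not in ALLOWED_PERTURBAGEN_TYPES:
        problems.append(f"unknown perturbagen_type: {record.perturbagen_type!r}")
    if record.timepoint_hours < 0:
        problems.append("timepoint_hours must not be negative")
    return problems

# Example record; the cell line and perturbagen names are placeholders.
record = ExperimentMetadata(
    center="LINCS Proteomic Characterization Center for Signaling and Epigenetics",
    assay="P100",
    cell_line="example-cell-line",
    perturbagen="example-compound",
    perturbagen_type="small molecule",
    timepoint_hours=3.0,
)
print(validate(record))
```

Checking every record against one shared schema before release is what lets outside groups combine experiments from different centers without guessing what each field means.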
Any opportunities to collaborate?
- Funding opportunities often exist to enable collaborations with the BD2K-LINCS DCIC and the LINCS Data and Signature Generation Centers.
- There are also LINCS Data Science Research Webinars that can be used to learn about the various LINCS datasets and research projects.
- Details about outreach opportunities related to LINCS can be found in the Community section of this website.