LINCS Workflow

Find the best place to obtain the LINCS L1000 data

The LINCS L1000 project has collected gene expression profiles for thousands of perturbagens across a variety of time points, doses, and cell lines. A full list of the chemical and genetic perturbations used, along with their descriptions, can be found on the CLUE website.

The LINCS L1000 data is separated into five data levels at different points in the analysis pipeline.

L1000 Pipeline

The five levels are:

  • Level 1: Raw unprocessed flow cytometry data from Luminex (LXB)
  • Level 2: Gene expression values for the approximately 1,000 landmark genes after deconvolution (GEX)
  • Level 3: Quantile-normalized gene expression profiles of landmark genes and imputed transcripts (Q2NORM or INF)
  • Level 4: Gene signatures computed using z-scores relative to the plate population as control (ZSPCINF) or relative to the plate vehicle control (ZSVCINF)
  • Level 5: Differential gene expression signatures
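
To make the Level 4 idea concrete, the following is a minimal sketch of z-scoring each gene against the plate population. It uses a plain mean and standard deviation; the production pipeline may use a robust variant, so treat this as an illustration only. The function name, sample identifiers, and gene names are hypothetical.

```python
from statistics import mean, stdev


def plate_zscores(plate):
    """Z-score each gene across all samples on one plate.

    `plate` maps a sample id to a {gene: expression value} dict.
    Assumption: plain mean/standard deviation, not a robust estimator.
    """
    genes = list(next(iter(plate.values())).keys())
    scores = {sample: {} for sample in plate}
    for gene in genes:
        values = [profile[gene] for profile in plate.values()]
        mu, sigma = mean(values), stdev(values)
        for sample, profile in plate.items():
            # A gene that is constant across the plate carries no signal.
            scores[sample][gene] = 0.0 if sigma == 0 else (profile[gene] - mu) / sigma
    return scores


# Hypothetical three-well plate with two genes.
plate = {
    "well_A01": {"GENE1": 8.0, "GENE2": 5.0},
    "well_A02": {"GENE1": 10.0, "GENE2": 5.0},
    "well_A03": {"GENE1": 12.0, "GENE2": 5.0},
}
z = plate_zscores(plate)
```

Using the vehicle control as the reference instead (ZSVCINF) would amount to computing the mean and standard deviation from the control wells only, rather than from every well on the plate.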

You can obtain the L1000 data directly from the LINCS Center for Transcriptomics by signing up for an account here.

Downloading annotated data packages from the LINCS Data Portal

You can search for L1000 data packages on the LINCS Data Portal.

L1000 Search Result

Each result provides a link to the annotated data package. Each data package also includes cell line and small molecule metadata, along with descriptions of the assay and of the data generation center.

Downloading lower-level LINCS L1000 data from GEO

Compressed files of Level 1-4 L1000 data are also available on GEO.

GEO Screenshot

The Level 2-4 data are collected into a data series that can be found on the GEO accession pages. You can begin exploring the data on the super series web page at: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE70138. This page contains links to twelve subseries, which also contain the raw, unprocessed Level 1 data.
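
If you want to open accession pages from a script, the URL pattern quoted above can be built programmatically. This is a small sketch based only on that URL; the helper name is hypothetical, and subseries accessions should be taken from the super series page rather than guessed.

```python
from urllib.parse import urlencode

# Endpoint taken from the super series URL quoted in the text.
GEO_ACCESSION_ENDPOINT = "http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"


def geo_accession_url(accession):
    """Return the GEO accession page URL for a series such as GSE70138."""
    return f"{GEO_ACCESSION_ENDPOINT}?{urlencode({'acc': accession})}"


url = geo_accession_url("GSE70138")
print(url)
```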

Accessing annotations for cell lines, genes, perturbagens, and other L1000 assay information

The LINCS L1000 data is stored on Amazon S3. You can access the CLUE website API usage guidelines and demos here. The API provides programmatic access to annotations and perturbational signatures in the L1000 data via HTTP-based RESTful web services.
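
As a rough illustration of what a RESTful query might look like, the sketch below only assembles a request URL and does not contact any server. The base URL, the `perts` resource, the `pert_iname` field, the JSON `filter` parameter, and the `user_key` parameter are all assumptions made for this example; check the API usage guidelines for the actual names before relying on them.

```python
import json
from urllib.parse import urlencode

# Assumed base URL; confirm against the API usage guidelines.
API_BASE = "https://api.clue.io/api"


def build_query_url(resource, where, user_key, limit=10):
    """Assemble a query URL with a JSON-encoded filter (assumed convention)."""
    filter_json = json.dumps({"where": where, "limit": limit})
    query = urlencode({"filter": filter_json, "user_key": user_key})
    return f"{API_BASE}/{resource}?{query}"


# Hypothetical query: annotations for one perturbagen, looked up by name.
url = build_query_url("perts", {"pert_iname": "vorinostat"}, user_key="YOUR_API_KEY")
```

The resulting URL could then be fetched with any HTTP client once you have a valid key.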

lincscloud API Example

Downloading precomputed gene expression signatures for a particular condition

You can also access gene expression signatures using Slicr, a tool developed by the BD2K-LINCS DCIC. Slicr lets users query the LINCS L1000 corpus available on GEO for a perturbation of interest in the desired cell lines and at the desired doses and time points by submitting requests to the search bar below:

LINCS L1000 Slicr [ GSE70138 data only ]

Users can choose between Level 3 data and Level 5 data. Level 3 data contains quantile-normalized expression values. The Level 5 data contains differential expression signatures computed across three replicates using the characteristic direction (CD) method (http://www.maayanlab.net/CD/).

Slicr Screenshot

Add your conditions of interest to the shopping cart. Your selected signatures are then available for download as tab-separated values. Individual entries can also be downloaded as JSON objects.
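
Once downloaded, both formats can be read with the Python standard library. The sketch below uses small in-memory stand-ins for the downloaded files; the column names (`sig_id`, `pert_name`, `cell_line`) and values are illustrative assumptions and may not match the actual Slicr export.

```python
import csv
import io
import json


def read_signatures_tsv(handle):
    """Return one dict per row of a tab-separated signature table."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Stand-in for a downloaded TSV file; columns are illustrative only.
example_tsv = (
    "sig_id\tpert_name\tcell_line\n"
    "SIG_001\tdrugA\tMCF7\n"
    "SIG_002\tdrugB\tPC3\n"
)
rows = read_signatures_tsv(io.StringIO(example_tsv))

# Stand-in for a single entry downloaded as a JSON object.
example_json = '{"sig_id": "SIG_001", "pert_name": "drugA", "cell_line": "MCF7"}'
entry = json.loads(example_json)
```

For a real download, replace `io.StringIO(example_tsv)` with `open(path, newline="")`, where `path` points to the saved file.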

Below is a YouTube lecture with more details about the L1000 data and additional ways to access it: