Transcriptomics data can come from mRNA (messenger RNA), microRNA, lncRNA (long non-coding RNA), and other RNA types. We can use microarray or RNA-Seq gene expression platforms to perform experiments and generate transcriptomics data.
DepMap Portal
The goal of the Dependency Map (DepMap) portal is to empower the research community to make discoveries related to cancer vulnerabilities by providing open access to key cancer dependencies and to analytical and visualization tools.
DepMap Expression (mRNA) data
To process DepMap expression data, we need to download the following datasets from the DepMap website.
Cell Line Sample Info
DepMap_ID: Static primary key assigned by DepMap to each cell line
cell_line_name
stripped_cell_line_name: Cell line name with alphanumeric characters only
CCLE_Name: Previous naming system that used the stripped cell line name followed by the lineage; no longer assigned to new cell lines
alias: Additional cell line identifiers (not a comprehensive list)
COSMIC_ID: Cell line ID used in Cosmic cancer database
sex: Sex of tissue donor if known
source: Source of cell line vial used by DepMap
Achilles_n_replicates: Number of replicates used in Achilles CRISPR screen passing QC
cell_line_NNMD: Difference in the means of positive and negative controls normalized by the standard deviation of the negative control distribution
culture_type: Growth pattern of cell line (Adherent, Suspension, Mixed adherent and suspension, 3D, or Adherent (requires laminin coating))
culture_medium: Medium used to grow cell line
cas9_activity: Percentage of cells remaining GFP negative on days 12-14 of cas9 activity assay as measured by FACs
RRID: Cellosaurus research resource identifier
WTSI_Master_Cell_ID
sample_collection_site: Tissue collection site
primary_or_metastasis: Indicates whether tissue sample is from primary or metastatic site
primary_disease: General cancer lineage category
Subtype: Subtype of disease; specific disease name
age: If known, age of tissue donor at time of sample collection
Sanger_model_ID: Sanger Institute Cell Model Passport ID
depmap_public_comments
lineage: Cancer type classifications in a standardized form
lineage_subtype
lineage_sub_subtype
lineage_molecular_subtype
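As a quick way to get a feel for these fields, the sketch below tallies cell lines per value of one column, such as lineage or primary_disease. It is only an illustration: the helper name `count_by` is hypothetical, the rows are made-up stand-ins for the sample info file, and only the column names come from the list above.

```python
import csv
import io
from collections import Counter


def count_by(sample_file, column):
    """Count cell lines per value of one sample-info column.

    Blank or missing values are grouped under "unknown".
    """
    reader = csv.DictReader(sample_file)
    return Counter((row[column] or "unknown") for row in reader)


# Made-up rows standing in for the sample info file; the real file
# has many more columns and rows.
demo = io.StringIO(
    "DepMap_ID,primary_disease,lineage\n"
    "ACH-000001,Ovarian Cancer,ovary\n"
    "ACH-000002,Leukemia,blood\n"
    "ACH-000003,Leukemia,blood\n"
)

lineage_counts = count_by(demo, "lineage")
for lineage, n in lineage_counts.most_common():
    print(lineage, n)
```

With a downloaded file, the same function can be given a handle from `open(path, newline="")` instead of the in-memory demo.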
Expression
RNA-Seq TPM gene expression data (Log2 transformed) for just protein-coding genes using RSEM (RNA-Seq by Expectation Maximization).
Rows: cell lines (Broad IDs)
Columns: genes (HGNC symbol and Entrez ID)
19177 Genes
1389 Cell Lines
33 Primary Diseases
37 Lineages
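Since each column label carries both the HGNC symbol and the Entrez ID, it is often handy to split them. The sketch below assumes labels of the form `SYMBOL (ENTREZ_ID)`, for example `TSPAN6 (7105)`; that pattern and the helper names are assumptions, so check them against your download. It also shows how a Log2 value could be converted back to TPM, assuming the transform was log2(TPM + 1); the pseudocount of 1 is likewise an assumption not stated above.

```python
import re

# Assumed label pattern: "SYMBOL (ENTREZ_ID)", e.g. "TSPAN6 (7105)".
GENE_LABEL = re.compile(r"^(?P<symbol>.+?)\s*\((?P<entrez>\d+)\)$")


def split_gene_label(label):
    """Split a column label into (HGNC symbol, Entrez ID as int)."""
    match = GENE_LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"Unexpected gene label: {label!r}")
    return match.group("symbol"), int(match.group("entrez"))


def log2_to_tpm(value, pseudocount=1.0):
    """Invert log2(TPM + pseudocount); the pseudocount is an assumption."""
    return 2.0 ** value - pseudocount


symbol, entrez_id = split_gene_label("TSPAN6 (7105)")
tpm = log2_to_tpm(3.0)
```

Raising an error on labels that do not match keeps a format change in a future release from passing silently.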
Data Processing
Not all DepMap_IDs in the "sample_info.csv" file are present in the "CCLE_expression.csv" file. Moreover, it is better to keep the features/genes/probes in a separate file, based on the following data model. You can download a file by clicking on its file name.
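The two processing steps described here can be sketched as follows: keep only the sample rows whose DepMap_ID also appears in the expression matrix, and pull the gene labels out of the expression header so they can be stored separately. This is a minimal sketch, not the site's own pipeline. It assumes the first column of the expression file holds the DepMap_ID and the remaining header cells are gene labels; the function names and the tiny in-memory tables are hypothetical.

```python
import csv
import io


def expression_ids(expression_file):
    """Collect cell line IDs from the first column of the expression matrix."""
    reader = csv.reader(expression_file)
    next(reader)  # skip the header row of gene labels
    return {row[0] for row in reader if row}


def filter_sample_info(sample_file, keep_ids, id_column="DepMap_ID"):
    """Keep only sample rows whose ID also appears in the expression matrix."""
    return [row for row in csv.DictReader(sample_file) if row[id_column] in keep_ids]


def feature_labels(expression_file):
    """Return the gene labels from the header, skipping the ID column."""
    header = next(csv.reader(expression_file))
    return header[1:]


# Made-up stand-ins for sample_info.csv and CCLE_expression.csv.
sample_csv = "DepMap_ID,cell_line_name\nACH-000001,A\nACH-000002,B\nACH-000003,C\n"
expression_csv = ",GENE_A (1),GENE_B (2)\nACH-000001,1.5,0.0\nACH-000003,2.0,3.1\n"

ids = expression_ids(io.StringIO(expression_csv))
samples = filter_sample_info(io.StringIO(sample_csv), ids)
features = feature_labels(io.StringIO(expression_csv))
```

For the real files, pass handles opened with `open(path, newline="")`; the features list can then be written to its own CSV with `csv.writer`.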
This video shows how you can upload the CCLE_gene_cn files to Bioada SmartArray and then explore, analyze, visualize, and build predictive models significantly faster and more easily.