Dr.Paso takes as input a text file containing the expression of 47 genes in different samples.
The file must be in tab-delimited format with genes in rows and samples in columns
(see file format). The file can be easily loaded by clicking the
"Browse" button (figure 1a) and
selecting the right file.
Note that two datasets are provided here as examples:
CCLE dataset: A subset of CCLE mRNA expression data covering 455 cell lines and 47 genes. The original expression data (see the link above) were transformed: probe expression
values were averaged when several probes target the same gene, and gene scaling was then applied.
GDSC dataset: mRNA expression of 953 cell lines from GDSC. Raw CEL files
were RMA-normalized and probe expression values were collapsed into a single value per gene using the arithmetic mean.
An example dataset can be selected by clicking the radio button attached to it (figure 1c).
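The probe-to-gene collapsing described for the example datasets can be sketched as follows. This is a minimal illustration and not the code used to prepare the datasets; the function name `collapse_probes` and the gene labels `GENE_A` and `GENE_B` are placeholders.

```python
from collections import defaultdict
from statistics import fmean


def collapse_probes(probe_rows):
    """Average probe-level rows into one row per gene.

    probe_rows: iterable of (gene_symbol, [expression per sample]) pairs.
    Returns a dict mapping gene_symbol -> list of per-sample means.
    """
    grouped = defaultdict(list)
    for gene, values in probe_rows:
        grouped[gene].append(values)
    # zip(*rows) walks the probes of one gene sample by sample.
    return {
        gene: [fmean(column) for column in zip(*rows)]
        for gene, rows in grouped.items()
    }


# Hypothetical probe-level values for two samples.
probes = [
    ("GENE_A", [2.0, 4.0]),
    ("GENE_A", [4.0, 8.0]),
    ("GENE_B", [1.0, 1.5]),
]
gene_matrix = collapse_probes(probes)
```

Genes targeted by a single probe pass through unchanged, since the mean of one value is that value.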
Expression data are expected to be normalized and free of noise and unwanted signal. However, Dr.Paso can re-scale the data
to harmonize the input dataset with the feature representation used in our model. This operation is optional;
to enable it, check the box "scale (gene-scores)" (figure 1b).
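For readers who want to see what a gene-wise scaling looks like, here is a minimal sketch of a per-gene Z-score computed across samples. It is an assumption that the "scale (gene-scores)" option works this way; in particular, the choice of the sample standard deviation (n - 1 denominator) and the names `zscore_genes`, `GENE_A` and `GENE_B` are illustrative only.

```python
from statistics import mean, stdev


def zscore_genes(matrix):
    """Scale each gene (row) to mean 0 and standard deviation 1 across samples.

    matrix: dict mapping gene symbol -> list of expression values, one per sample.
    Uses the sample standard deviation (n - 1 denominator); this is an assumption.
    """
    scaled = {}
    for gene, values in matrix.items():
        mu = mean(values)
        sd = stdev(values)
        if sd == 0:
            raise ValueError(f"gene {gene} is constant across samples")
        scaled[gene] = [(value - mu) / sd for value in values]
    return scaled


# Hypothetical expression values for three samples.
expression = {"GENE_A": [1.0, 2.0, 3.0], "GENE_B": [10.0, 10.0, 40.0]}
scaled = zscore_genes(expression)
```

Because the scaling is computed across the samples of the uploaded file, the result depends on which samples are included.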
Dr.Paso was designed to predict chemo-sensitivity to 24 drugs. Users can select all drugs
(by clicking the link "select all" - figure 1d) or at least one of them
(by checking the box corresponding to the drug - figure 1e).
The "unselect all" link clears all selected drugs at once (figure 1d).
Once at least steps (1) and (3) are completed, users can ask Dr.Paso to predict the chemo-sensitivity of samples by clicking the button
"Predict" (figure 1f). If everything works well (see reported errors section), figures and a table with the results will appear in the interface.
Chemo-sensitivities reported by Dr.Paso are estimates of activity area (AA) as defined in the CCLE article (Barretina et al., 2012).
Predicted chemo-sensitivities for the given samples are presented with figures and a table in the panel labelled "Prediction". This panel is organized in
two sections stacked vertically. The section on top is used for all figures, while the section on the bottom shows a single table.
The figure section shows one figure at a time. Users can focus either on drugs ("By drugs" tab) or on samples ("By samples" tab) and switch between them by clicking the corresponding tab label.
View 1 - Overview of all predicted sensitivities grouped by drug:
The first figure focuses on drugs. It summarizes the predicted sensitivities for all drugs as a series of boxplots (figure 2e). Each boxplot represents the predicted sensitivities of all samples to one drug. By default, boxplots are sorted by their median (to reflect the relative efficacy of each drug). To sort the boxplots by drug name instead, select the "Name" item in the "Order x-axis by" list (figure 2f). Selecting the "Median" item in the same list restores median sorting.
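The two x-axis orderings can be illustrated with a short sketch. The function `order_drugs`, the drug labels other than TAE684, and all values are hypothetical; whether the interface sorts medians in ascending or descending order is not stated here, so ascending order is an assumption.

```python
from statistics import median


def order_drugs(predictions, by="Median"):
    """Return drug names in the order they would appear on the x-axis.

    predictions: dict mapping drug name -> list of predicted sensitivities.
    by: "Median" sorts by the median prediction, "Name" sorts alphabetically.
    """
    if by == "Name":
        return sorted(predictions)
    if by == "Median":
        # Ascending order is an assumption; the interface may sort descending.
        return sorted(predictions, key=lambda drug: median(predictions[drug]))
    raise ValueError(f"unknown ordering: {by}")


# Hypothetical predicted activity areas for three samples.
predicted = {
    "TAE684": [0.5, 2.5, 1.0],
    "DrugB": [3.0, 3.5, 2.0],
    "DrugA": [0.2, 0.4, 0.3],
}
by_median = order_drugs(predicted, by="Median")
by_name = order_drugs(predicted, by="Name")
```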
View 2 - Drug sensitivities per drug
It is also possible to focus on one drug. The web interface provides a list ("Select" - figure 2g) with the set of drugs requested by the user. Selecting one of them will display a plot representing the predicted sensitivity of all samples for the drug selected. On that plot, each dot represents a sample and the x-axis is sorted according to drug sensitivity.
View 3 - Drug sensitivities per sample (figure 2d):
Users can also focus their visualizations on a specific sample. In that mode, sensitivities of one sample to all requested drugs are shown. Each dot represents the predicted value (y-axis) for a particular drug (x-axis). Drugs are sorted according to sample sensitivity.
The "previous" and "next" buttons can be used to move between samples.
The "index" field can be used to jump directly to a sample using its index (see Table section).
The table represents drug sensitivities of all samples and all requested drugs. Samples are shown in rows and drugs in columns. The first column contains sample indexes, the second column sample names and remaining columns refer to drugs.
All columns (except the first) are sortable.
The number of rows displayed can be set to 10 (default), 25, 50 or 100. Depending on this limit, the table is split across several pages. A set of buttons is provided to browse all results (figure 2j).
The "search" field (figure 2i) can be used to select rows matching a string. All columns (except index) are considered.
The table can be exported in csv or txt format by clicking the corresponding export button (figure 2l).
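The search and export behaviour of the table can be mimicked offline with a few lines of code. This is a sketch under assumptions: the helper names `filter_rows` and `to_csv`, the sample names and the values are hypothetical, and case-insensitive matching is assumed rather than documented.

```python
import csv
import io


def filter_rows(rows, query):
    """Keep rows where any column except "index" contains the query string."""
    needle = query.lower()  # case-insensitive matching is an assumption
    return [
        row
        for row in rows
        if any(
            needle in str(value).lower()
            for key, value in row.items()
            if key != "index"
        )
    ]


def to_csv(rows):
    """Serialize rows (a list of dicts sharing the same keys) as CSV text."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical table with one drug column.
table = [
    {"index": 1, "sample": "SAMPLE1_BREAST", "TAE684": 1.2},
    {"index": 2, "sample": "SAMPLE2_LUNG", "TAE684": 0.4},
]
hits = filter_rows(table, "breast")
exported = to_csv(hits)
```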
Expression of 47 genes in 455 cell lines. Original expression data (CCLE_Expression_Entrez_2012-09-29.gct) was slightly transformed (average of probe expression when several probes target the same gene + gene scaling).
Click the radio button (figure 1c) to select this dataset.
Select at least one drug (here, we select all drugs by clicking the link "select all", figure 1d).
Click the button "Predict".
Drug sensitivities are calculated for all samples and selected drugs. This should take only a few milliseconds. The interface then switches to the "Prediction" view. The first figure shows several boxplots that represent predictions for all samples by drug. Boxplots are sorted according to their median.
Users could be interested, for example, in seeing how samples respond to one drug:
Select "TAE684" in the "Select" list (figure 2g). A figure representing the sensitivity (y-axis) of all samples to TAE684 is now displayed. In the figure, each point represents a sample. Points are ordered according to the y-axis.
Sample names can be displayed by clicking on the point of interest.
Select high responders: the figure shows a group of samples with very high predicted values (activity area) compared to the other, i.e. insensitive, samples. To get information about those samples, brush the region with the mouse; points will be colored in red and the table will be filtered to keep only this set of samples. Users can export the table.
Select samples by name: Another way to select a set of samples is by using the "search" field of the table. For example, users may wish to see how breast cancer cell lines respond to TAE684. Enter "breast" in the field. Now the table is filtered and breast cancer cell lines are shown in red. Note that "breast" must be in the sample name.
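The brushing step above amounts to keeping the samples whose predicted value falls inside the selected region. A minimal sketch, assuming the selection is defined only by a y-axis range; the function names, sample names and values are hypothetical, and the ascending sort order is an assumption.

```python
def brush_select(predictions, y_min, y_max):
    """Return names of samples whose predicted value lies in [y_min, y_max].

    predictions: dict mapping sample name -> predicted activity area for one drug.
    """
    return [name for name, value in predictions.items() if y_min <= value <= y_max]


def sort_by_sensitivity(predictions):
    """Order sample names by predicted value, as on the x-axis of the per-drug plot."""
    # Ascending order is an assumption.
    return sorted(predictions, key=predictions.get)


# Hypothetical predictions for one drug.
tae684 = {"SAMPLE_A": 0.3, "SAMPLE_B": 4.1, "SAMPLE_C": 3.8, "SAMPLE_D": 0.5}
high_responders = brush_select(tae684, 3.0, 5.0)
plot_order = sort_by_sensitivity(tae684)
```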
It is also possible to focus on one sample and see how it responds to all requested drugs.
Open the tab labelled "By sample" (figure 2d). All drug sensitivities for the first sample (ordered according to the input file) are shown. Each point represents a drug (x-axis) and points are sorted according to sample response.
To navigate between samples:
Use the "previous" or "next" button. Samples are ordered according to the input file.
Use the sample index in the table. Entering 10 in the "index" field will display "A172_CENTRAL_NERVOUS_SYSTEM".
Dr.Paso takes as input a text file containing the expression of 47 genes in different samples. The file must be in tab-delimited format with genes in rows and samples in columns.
The file should contain 48 rows:
In the first row, the name of each column should be included. This row should start with "Gene" since symbols of the 47 genes are expected to be in the first column. The remaining columns are labelled with the sample names. Note that gene symbols must be written exactly as here.
The remaining 47 rows include the expression values of each gene per sample: One row per gene and one sample per column. The number of elements in each row must match the number of entries of the first row. Missing values are not allowed.
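The format rules above can be checked before uploading a file. The following is a minimal sketch, not part of Dr.Paso; the function name `validate_expression_file` and the placeholder gene and sample names are hypothetical. It does not verify the gene symbols themselves, since the expected list is not reproduced here.

```python
import csv
import io

EXPECTED_GENES = 47  # from the format description: one row per gene


def validate_expression_file(handle):
    """Check a tab-delimited expression file against the format rules above.

    Returns (sample_names, matrix) where matrix maps gene symbol -> list of floats.
    Raises ValueError describing the first problem found.
    """
    rows = list(csv.reader(handle, delimiter="\t"))
    if len(rows) != EXPECTED_GENES + 1:
        raise ValueError(f"expected {EXPECTED_GENES + 1} rows, found {len(rows)}")
    header = rows[0]
    if not header or header[0] != "Gene":
        raise ValueError('header row must start with "Gene"')
    matrix = {}
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            raise ValueError(
                f"row {line_no}: expected {len(header)} fields, found {len(row)}"
            )
        if any(cell.strip() == "" for cell in row):
            raise ValueError(f"row {line_no}: missing value")
        # float() raises ValueError for non-numeric cells such as "NA".
        matrix[row[0]] = [float(cell) for cell in row[1:]]
    return header[1:], matrix


# Hypothetical file content: a header plus 47 placeholder genes and two samples.
lines = ["Gene\tS1\tS2"] + [f"GENE{i}\t{i}.0\t{i}.5" for i in range(1, 48)]
samples, matrix = validate_expression_file(io.StringIO("\n".join(lines)))
```

To check a real file, open it in text mode and pass the file object to the function in place of the `io.StringIO` buffer.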
Gene expression values are expected to be normalized and corrected for any noise or unwanted signal. No data transformation is implemented in Dr.Paso except the gene Z-score.
The Dr.Paso web app is an RStudio/Shiny web application. The tool was developed using the R Shiny package (https://shiny.rstudio.com/)
and the DT package (http://rstudio.github.io/DT) for data table manipulation. The app was developed by Tony Kaoma under the supervision of
Francisco Azuaje at the Bioinformatics and Modeling Research Group (BIOMOD) of LIH. Feedback about this tool can be provided to Tony Kaoma.
Barretina et al. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature, 2012 Mar 28;483(7391):603-7.