Load required R packages

Sys.setenv(JAVA_HOME='C:/Program Files/Java/jdk-9.0.1') # for 64-bit version
library(xlsx)
## Loading required package: rJava
## Loading required package: xlsxjars
library(animation)
library(readr)
library(biomaRt)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:data.table':
## 
##     between, first, last
## The following object is masked from 'package:biomaRt':
## 
##     select
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
library(pubmed.mineR)

Load the gene of interest

gene<-"BRCA1"

Create a excel workbook project with xlsx package

  wb<-createWorkbook()
  ## create a bold font for column names of data tables
  cs<-CellStyle(wb) + Font(wb, isBold = TRUE) + Border() + Alignment(h="ALIGN_CENTER")

Create a excel sheet within the workbook, we will name this sheet “Gene Info”, because we will put general gene info in this first “cover sheet”

  sheet0<-createSheet(wb, sheetName = "Gene Info")

Now, we will use the biomaRt package to retrive basic gene info, such as genome coordinates

  mart<-useMart("ensembl", dataset="hsapiens_gene_ensembl")
  a<-mart@attributes$name
  ##the info we retrieve with biomaRt is fully customization, see biomaRt package manual for a full list of more than 800 gene features available
  a.useful<-c(a[1], a[9:12], a[61], a[97])
  gene.info<-getBM(a.useful, a[60], gene, mart)

Now we load our gene info onto the excel sheet we created

  addDataFrame(gene.info, sheet0, colnamesStyle = cs)

On the same work sheet, we can add additional info, such as phenotype info

  gene.phenotype<-getBM("phenotype_description", a[60], gene, mart)
  addDataFrame(gene.phenotype, sheet0, startRow = 6, colnamesStyle = cs)

Let say this is all we are going to put on our first sheet. We are going to create a new work sheet for a new analysis, within the same workbook

  sheet1<-createSheet(wb, sheetName = "Expression")
  
  ##Generate an expression figure according to stage
  download.file("https://github.com/hackseq/2017_project_5/raw/master/data/exprData.csv.gz", "exprData.csv.gz")
  exprData<-read.csv("exprData.csv.gz")
  expr.gene<-subset(exprData, exprData$GeneId==gene)
  expr.gene.graph<-as.data.frame(t(expr.gene))
  expr.gene.graph<-mutate(expr.gene.graph, GENE=as.character(expr.gene.graph[1,1]))
  expr.gene.graph<-expr.gene.graph[2:nrow(expr.gene.graph),]

We can save the expression table in worksheet1

  addDataFrame(expr.gene.graph, sheet1, startRow = 1, colnamesStyle = cs)

Here our expression analysis can be virsualized with a figure

  ##Make an expression figure
  gene.boxplot<-ggplot(expr.gene.graph, aes(x=GENE, y=as.numeric(as.character(expr.gene.graph$GENE)))) +
    geom_boxplot() +
    labs(x = "Gene",
         y = "Expression (Log2 transformed)",
         title = paste0(("Expression of "),gene))+
    theme(plot.title = element_text(hjust = 0.5),
          axis.text.x = element_text(size=6))

One way of storing the figure in our work sheet is to save it to an external file first, then import the figure to our data Here, we are saving our expression figure on worksheet1, to avoid overlapping with our data table we load on this sheet earlier, we will load it onto column 6

  ggsave(gene.boxplot, filename="Expression.figure.png")
## Saving 7 x 5 in image
## Warning in fun(x, ...): NAs introduits lors de la conversion automatique
## Warning in FUN(X[[i]], ...): NAs introduits lors de la conversion
## automatique
## Warning: Removed 350 rows containing non-finite values (stat_boxplot).
  addPicture("Expression.figure.png", sheet1, scale=1, startRow=1, startColumn=6)

We can store unlimited number of data sheets or figure in a single workbook, on multiple worksheets. Create as many new sheets as needed and load the results data table or figure onto them.

  sheet2<-createSheet(wb, sheetName = "CNV")
  ##import your CNV data results as table
  CNV.info2<-"CNV.data"
  addDataFrame(CNV.info2, sheet2, colnamesStyle = cs)

Sometimes, functions from packages will save images as .pdf as default format and .pdf can not be directly imported onto our work sheet. In such case, we can use the function from ‘animation’ package to convert pdf into .png file

  im.convert("survival.pdf", output = "survival.png",extra.opts="-density 150")
  addPicture("survival.png", sheet2, scale=0.5, startRow=12, startColumn=6)

Some functions do not have a built-in file output. We can use R default device image capture to store images

dev.off()
##some script making a figure to R imaging device
dev.copy2pdf()
file.rename("Rplot.pdf")
 im.convert("Rplot.pdf", output = "Rplot.png",extra.opts="-density 150")
  addPicture("Rplot.png", sheet2, scale=0.5, startRow=12, startColumn=6)

We can also store text results, such as text-mining results to our workbook Example of storing results of PubMed text-mining results on a worksheet

  sheet3<-createSheet(wb, sheetName = "PubMed")
  ##grab abstracts for gene
  ##compile a abs database regarding "1000 genome breast cancer" search in pub-med
  ABSdata<-readabs("pubmed_result_1000genomescancerbreast.txt")
## Warning in initialize(value, ...): NAs introduits lors de la conversion
## automatique
  gene.abs<-getabs(ABSdata, gene, FALSE)
## [1] "14 abstracts BRCA1"
  gene.PMID<-get_PMCIDS(gene.abs)
  ##extract sentence from abs for gene
  gene.sentences<-Give_Sentences(gene, gene.abs)
  addDataFrame(gene.PMID, sheet3, colnamesStyle = cs)
  addDataFrame(unlist(gene.sentences), sheet3, startRow = 6, colnamesStyle = cs)

After all results are loaded onto our multi-sheet workbook, we can save our workbook as an excel sheet

  saveWorkbook(wb, file = paste0("ALL_RESULTS.xlsx"))