git repo stats


Bitbucket does not provide any repository statistics.
A workaround is gitstats.

On Ubuntu, install it with:
sudo apt-get install gitstats

And run as:
gitstats /home/bioinformin/exomeMPS/ /home/bioinformin/exomeMPS_stats
firefox /home/bioinformin/exomeMPS_stats/index.html



I made an R package to annotate ggplot2 boxplots with significance clips and text. It lives in the sigAnnotateBoxplots Bitbucket repository.

shell history


history … history of shell commands
<Ctrl+R> –> search_term –> <Enter> … search history and execute
<Ctrl+R> –> search_term –> <Right Arrow> … search history and edit command

vim survival commands


<ESC> … get to COMMAND mode
i … get to INSERT mode (for editing)

:w … write (=save) (in COMMAND mode)
:x … save and exit (in COMMAND mode)
:q … quit (it won't let you if changes not saved) (in COMMAND mode)
:q! … quit discarding changes (you will not be prompted if changes not saved) (in COMMAND mode)

shell tips


I have updated my Linux tips with these:

find . -name "._*" -type f -delete … find and delete files starting with “._” recursively in the current folder
find . ! -name "*.txt" -type f … inverse find (all files not ending with “.txt”)
ls -laFR … list files recursively (including the contents of subfolders)
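
Since -delete cannot be undone, it can help to preview the matches first. Below is a minimal sketch of that habit, run in a throwaway directory; the directory and the file names (demo_dir, ._a, ._b, keep.txt) are made up for the demonstration and are not from the original tips.

```shell
# Hypothetical scratch directory, created only for this demonstration.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub"
touch "$demo_dir/._a" "$demo_dir/sub/._b" "$demo_dir/keep.txt"

# 1. Preview: same tests as the delete command, but only print the matches.
find "$demo_dir" -name "._*" -type f -print

# 2. Delete once the preview looks right.
find "$demo_dir" -name "._*" -type f -delete

# 3. Count what is left (only keep.txt should remain).
remaining=$(find "$demo_dir" -type f | wc -l | tr -d ' ')
echo "files left: $remaining"
```

Swapping -delete for -print is the only difference between the preview and the real run, so the two commands always match the same files.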

Generate random FASTQ file


For printable ASCII characters, see e.g. the Wikipedia entry.

# generate random 'DNA sequence':
dna_seq <- paste(replicate(20, sample(c("A", "T", "G", "C"), 1)), sep = "", 
    collapse = "")

# generate random Phred quality scores:
dna_qual_num <- replicate(20, sample(5:45, 1))  # numeric values
dna_qual_chr <- rawToChar(as.raw(dna_qual_num + 64))  # add 64 for Phred+64 flavor

# print FASTQ:
cat("@SEQ_ID\n", dna_seq, "\n+\n", dna_qual_chr, sep = "")
## @SEQ_ID
## +
## I^]^h_EN`Za[PjPgPOLN

It takes more space to write numbers for quality values compared to ASCII characters:

cat(paste(dna_qual_num, sep = "", collapse = ","), "\n", dna_qual_chr, sep = "")
## 9,30,29,30,40,31,5,14,32,26,33,27,16,42,16,39,16,15,12,14
## I^]^h_EN`Za[PjPgPOLN
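
The conversion also works in the other direction. Here is a rough shell sketch that decodes a Phred+64 quality string back into numbers, assuming od and awk are available; the function name phred64_to_scores is made up for this example.

```shell
# Hypothetical helper: decode a Phred+64 quality string into comma-separated scores.
phred64_to_scores() {
    printf '%s' "$1" |
        od -An -v -tu1 |
        awk '{ for (i = 1; i <= NF; i++) { printf "%s%d", sep, $i - 64; sep = "," } }
             END { print "" }'
}

# Decode the quality string from the example above.
phred64_to_scores 'I^]^h_EN`Za[PjPgPOLN'
```

od prints each character as its decimal byte value and awk subtracts the offset of 64, so the call should reproduce the numeric line shown above.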

Markdown to produce HTML guts to be inserted on the web


I wanted a way to generate bits of HTML without writing HTML tags. Some sort of markdown would serve this well. It should work as a mini content management system. This is how I have done it:

  • I used RStudio to make an R Markdown document. The markdown document gets knitted into html.
  • As part of the markdown document I also have a call to the system command sed, which extracts the <body> part of the html and saves it into a plain text file. So when the markdown is knitted, the text file is also generated.
  • I uploaded the text file to the server and added a php include command to the web page that should include this html “chunk”.

That's it.


  1. Create a bloq_post.Rmd file in RStudio
  2. In that file include this R code chunk (it gets executed when bloq_post.Rmd is knitted):
    setwd("/path/to/your/blog_post")  # set working directory
    system("sed -e '1,/<body>/d' -e'/^<\\/body>/,$d' bloq_post.html > bloq_post.txt")  # extract body part of the html
  3. Knit the bloq_post.Rmd to get bloq_post.html and also bloq_post.txt
  4. Upload bloq_post.txt to your web server.
  5. In the web page that should include bloq_post.txt (e.g. index.php), put something like:
    <?php $INC_DIR = $_SERVER["DOCUMENT_ROOT"]. "/www/"; include ($INC_DIR. "blog/bloq_post.txt"); ?>
  6. Done.


The <body> part of html extraction with sed was stolen from Jeromy Anglim (and adapted to exclude the <body> tag itself).
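
To see what the two sed expressions do, here is a small sketch run directly in the shell on a stand-in file. The html content and the temporary directory are made up for the demonstration; a real knitted bloq_post.html is much longer.

```shell
# Hypothetical stand-in for the knitted bloq_post.html.
work_dir=$(mktemp -d)
cat > "$work_dir/bloq_post.html" <<'EOF'
<html>
<head><title>demo</title></head>
<body>
<p>Hello from the post.</p>
</body>
</html>
EOF

# Drop line 1 through the <body> line, then drop </body> through the last line.
sed -e '1,/<body>/d' -e '/^<\/body>/,$d' "$work_dir/bloq_post.html" > "$work_dir/bloq_post.txt"

# Only the paragraph between the two body tags is left.
cat "$work_dir/bloq_post.txt"
```

Note that in the shell the slash is escaped with a single backslash; the doubled backslash in the R chunk is only there because the command sits inside an R string.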

The markdown needs to be knitted twice: during the first knit, sed runs before the updated html has been written, so bloq_post.txt only picks up the changes on the second knit.

The bloq_post.txt can also be shared via Dropbox, so no ftp is needed.