## Archive for May 2015

Today somebody asked me about building a progress bar into a for loop. This can be really useful if you are running lots of bootstrapping or Monte Carlo simulations and want some peace of mind that the loop is still running as the computer chugs away in the background. It’s good to know whether it’s worth hanging around for the code to finish, whether it’s better to go climbing or skiing for the weekend, or whether there is just time for a cup of tea.

I’ve written a dummy script here to show how this can be done. Fortunately the built-in `utils` package contains a function, `txtProgressBar()`, for putting a progress bar on the R console. I thought it would be fun for the for loop to actually do something, so I made it write out a message (stored in the `cdRclb` object) in an empty plot window. Here is the script:

    # Write a message to plot and make an empty plot window
    cdRclb <- c("c", "o", "d", "e", "R", "c", "l", "u", "b")
    plot(x = 1:9, y = 1:9, type = "n", axes = FALSE, frame.plot = TRUE, ann = FALSE)

So that the progress bar takes at least some time to finish, I decided to run the for loop for 900000 iterations (100000 × the `length()` of the object `cdRclb`).

    # Set the number of iterations for the loop
    nIts <- length(cdRclb) * 100000

Then make the progress bar and write the loop. You call `txtProgressBar()` first, outside the loop; everything within the `{}` is then repeated `nIts` times. For each iteration we find a new value, `k`, which is the iteration number divided by 100000.

Still within the for loop, we use an `if` statement to decide whether we should write out some of the message stored in `cdRclb`. When `k` is a whole number, we write the corresponding item of `cdRclb` using `text()`. At the end of the iteration we update the progress bar using `setTxtProgressBar()` and the loop starts over. Check what happens in the plot window as the loop progresses, and also check the R console. Fun and extremely satisfying!

    # Create the progress bar
    progBar <- txtProgressBar(min = 0, max = nIts, style = 3)

    # Start the for loop
    for (i in 1:nIts) {
      # Find k. NB, lots of other functions/commands could go here.
      k <- i / 100000

      # Use an if statement to decide whether to plot part of the message.
      # Only do this when k is a whole number.
      if (floor(k) == k) text(x = k, y = 5, labels = cdRclb[k], col = k + 1, cex = 4)

      # Update the progress bar
      setTxtProgressBar(progBar, i)
    }

    # Close the progress bar
    close(progBar)
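The same pattern carries over to the bootstrapping case mentioned at the start. What follows is only a sketch under assumptions: the data (`dat`), the number of resamples (`nBoot`) and the choice to redraw the bar every 100th iteration are all made up for illustration and are not part of the script above.

```r
# Hypothetical example data: 200 draws from a normal distribution
set.seed(42)
dat <- rnorm(200, mean = 10, sd = 2)

nBoot <- 5000                  # number of bootstrap resamples (illustrative)
bootMeans <- numeric(nBoot)    # pre-allocate the results vector

progBar <- txtProgressBar(min = 0, max = nBoot, style = 3)
for (i in seq_len(nBoot)) {
  # Resample with replacement and store the mean of the resample
  bootMeans[i] <- mean(sample(dat, replace = TRUE))

  # Redrawing the bar takes time, so only update it every 100th iteration
  if (i %% 100 == 0) setTxtProgressBar(progBar, i)
}
close(progBar)

# 95% percentile interval for the mean
quantile(bootMeans, probs = c(0.025, 0.975))
```

Updating the bar less often is optional; for loops where each iteration is slow, calling `setTxtProgressBar()` every time is fine.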



### Finding the row and column names of a matrix

Posted by *alistairseddon* in Data manipulation

In Friday’s codeRclub, we had a problem that involved finding the row and column names for items in a matrix greater than a specified value (e.g. finding the names of the pairs of samples in a correlation matrix with a correlation coefficient greater than 0.5). The problem is that standard subsetting methods give you the locations or values of the cells within the matrix, but not the row or column names. We solved the problem using the `arr.ind` argument of the `which()` function in R. We wrote a function to do this, returning the row and column names and the correlation coefficients in a `data.frame`.

First simulate a correlation matrix. The correlation cut-off value is set later, through the `cutVal` argument of the function:

    # Simulate the 3x3 matrix and give it the row and column names of the samples
    x <- matrix(c(1, .8, .2,
                  .8, 1, .7,
                  .2, .7, 1),
                nrow = 3,
                dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

Then make a function, `which.names.matrix`, to return the row and column names of interest. `x` is a correlation matrix, `cutVal` is your correlation cut-off value.

    which.names.matrix <- function(x, cutVal = 0.5) {
      # Because it's a correlation matrix, we are only interested in one half
      # of it, so set the lower triangle to NA
      x[lower.tri(x)] <- NA
      # Set the diagonals to NA
      diag(x) <- NA
      # Find the locations of the cells in the matrix > cutVal
      locs <- which(x > cutVal, arr.ind = TRUE)
      # Get the scores of the cells > cutVal, in the same order as locs
      scores <- x[locs]
      # Return the data.frame with the row and column names, plus the scores
      data.frame(row = rownames(x)[locs[, 1]],
                 col = colnames(x)[locs[, 2]],
                 value = scores)
    }

    which.names.matrix(x)
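To see what `arr.ind = TRUE` changes, here is a small sketch on the same kind of 3x3 matrix. Note one difference from the function above: instead of setting cells to `NA`, this version combines the test with `upper.tri()`, which is simply another way to look at only one half of the matrix. The object name `m` is just for this example.

```r
m <- matrix(c(1, .8, .2,
              .8, 1, .7,
              .2, .7, 1),
            nrow = 3,
            dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# Without arr.ind, which() returns positions counted down the columns
# of the matrix, as if it were one long vector
which(m > 0.5 & upper.tri(m))

# With arr.ind = TRUE it returns a matrix with one row per hit
# and the columns "row" and "col"
locs <- which(m > 0.5 & upper.tri(m), arr.ind = TRUE)
locs

# A matrix can be indexed by a two-column matrix of (row, col) pairs
m[locs]

# The row and column names of the hits
data.frame(row = rownames(m)[locs[, "row"]],
           col = colnames(m)[locs[, "col"]],
           value = m[locs])
```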

`expression()` and related functions including `bquote()` are powerful tools for annotating figures with mathematical notation in R. This functionality is not obvious from their respective help files. `demo(plotmath)` nicely shows the huge potential of `expression()`, but does not help that much with getting the code needed for many real cases.

I tend to get my expressions to work by trial and lots of errors (although having put this together, I now understand them at least temporarily). I’ve just searched through my code library and extracted and annotated some examples of `expression()` being used. I hope someone finds it useful.

I’m going to use `expression()` with `title()`, but the same expressions can be used with any of the functions (`text()`, `title()`, `mtext()`, `legend()`, etc.) used for putting text on plots.

    x11(width=4, height=5, pointsize=14)
    par(mar=rep(0,4), cex.main=.8)
    plot(1, type="n", axes=FALSE, ann=FALSE)

The simplest use of `expression()` is to take a character or string of characters, which will be added to the plot. If the string contains spaces, it must be enclosed in quotes (alternatively, the space can be replaced by a tilde `~`, which probably gives better code).

    title(line=-1, main=expression(fish))

This use of expression is entirely pointless, but is a useful starting point. Some strings have special meanings, for example infinity will draw the infinity symbol. If for some reason you want to have “infinity” written on your plot, it must be in quotes. Greek letters can be used by giving their name in lower-case or with the first letter capitalised to get the lower or upper case character respectively.

    title(line=-2, main=expression(infinity))
    title(line=-3, main=expression(pi))
    title(line=-4, main=expression(Delta))

Superscript or subscript can be added to a string using `^` and `[]` notation respectively.

    title(line=-5, main=expression(r^2))
    title(line=-6, main=expression(beta[1]))

If the string we want to have as sub- or superscript contains a space, the string must be in quotes. Braces can be used to force multiple elements to all be superscript.
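A minimal sketch of quoting and bracing in sub- and superscripts. The labels are invented for illustration, and the plot is a fresh empty one rather than the window used for the other examples.

```r
labs <- expression(
  N["high tide"],  # a subscript containing a space must be quoted
  x^2*n,           # without braces only the 2 is raised
  x^{2*n},         # braces make the whole of 2n superscript
  beta[i*j]        # everything inside [] is already grouped
)

# Draw each label on its own line of an empty plot
plot(1, type = "n", axes = FALSE, ann = FALSE)
for (i in seq_along(labs)) {
  title(line = -i, main = labs[i])
}
```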

Strings can be separated by mathematical operators.

    title(line=-7, main=expression(N[high]-N[low]))
    title(line=-8, main=expression(N[2]==5))

To make more complicated expressions, build them up from separate parts by either using `*` or `paste` to join them together (if you want a multiplication symbol, use `%*%`). The `*` notation gives nicer code.

    title(line=-9, main=expression(Delta*"R yr"))
    title(line=-10, main=expression(paste(Delta,"R yr")))
    title(line=-11, main=expression(paste("Two Year Minimum ",O[2])))
    #title(line=-11, main=expression(Two~Year~Minimum~O[2]))
    title(line=-12, main=expression(paste("Coefficient ", beta[1])))
    #title(line=-12, main=expression(Coefficient~beta[1]))
    title(line=-13, main=expression(paste("TP ", mu,"g l"^-1)))
    #title(line=-13, main=expression(TP~mu*g~l^-1))
    title(line=-14, main=expression(paste(delta^18,"O")))
    #title(line=-14, main=expression(delta^18*O))
    title(line=-15, main=expression(paste("Foram ", exp(H*minute[bc]))))
    #title(line=-15, main=expression(Foram~exp(H*minute[bc])))
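The three joining operators are easy to mix up, so here is a small side-by-side sketch. The labels are arbitrary and the plot is a separate empty one, not the window used above.

```r
joins <- expression(
  Delta*R,      # * joins the parts with no gap
  Delta~R,      # ~ joins the parts with a space
  Delta%*%R     # %*% draws a multiplication sign between the parts
)

# Draw the three variants underneath each other for comparison
plot(1, type = "n", axes = FALSE, ann = FALSE)
for (i in seq_along(joins)) {
  title(line = -i, main = joins[i])
}
```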

To start an `expression()` with a superscript (or subscript), I use an empty string (you can also use `phantom()`).

    title(line=-16, main= expression(""^14*C*" years BP"))
    #title(line=-16, main= expression(phantom()^14*C~years~BP))

So far so good. But sometimes, you want to use the value of an R-object in plot annotation.

For example, if we wanted to label a point with its x value, this will not work, because `expression()` does not evaluate `x` and the label is drawn literally as x = x.

    x<-5
    title(line=-17, main= expression(x==x))

Instead of using `expression()`, we have to use `bquote()`, with the object we want written out inside `.()`:

    title(line=-18, main= bquote(x==.(x)))
    title(line=-19, main= bquote(x==.(x)~mu*g~l^-1))
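A typical real case is writing a computed statistic onto a plot. The sketch below labels a scatterplot with the R² of a linear model; the data and the names `conc`, `resp` and `fit` are made up for illustration.

```r
# Hypothetical data: a noisy linear relationship
set.seed(1)
conc <- 1:20
resp <- 2 * conc + rnorm(20, sd = 4)

fit <- lm(resp ~ conc)
r2 <- summary(fit)$r.squared

# Whatever is inside .() is evaluated when bquote() is called;
# everything outside it is kept as plotmath notation
lab <- bquote(R^2 == .(round(r2, 2)))

plot(conc, resp)
abline(fit)
title(main = lab)
```

Rounding inside `.()` keeps the label short; without it the full-precision value would be printed.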

If you understand these examples, you should be able to use the remainder of the functionality demonstrated by `demo(plotmath)` and at `?plotmath`.
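All the examples in this post use `title()`. As a rough sketch of the same kind of expressions in other annotation functions, the following uses axis labels, `text()`, `mtext()` and `legend()`. The data and label text are invented for illustration.

```r
# Labels for the legend, stored as one expression vector
leg <- expression(alpha, beta[1])

# Axis labels take expressions directly
plot(1:10, (1:10)^2, type = "b",
     xlab = expression(Depth~(cm)),
     ylab = expression(Concentration~(mu*g~l^-1)))

# Text inside the plot region
text(x = 3, y = 80, labels = expression(y == x^2))

# Text in the margin
mtext(expression(delta^18*O), side = 3, line = 0.5)

# Each element of the expression vector becomes one legend entry
legend("bottomright", legend = leg, lty = 1, pch = 1, bty = "n")
```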


To show formatted code within a paragraph, for example a function name, use `<code>plot</code>`, which will appear as `plot`.

To show a block of code, use something like this but using **square brackets []** rather than braces {}.

{code language="r"}

x<-rnorm(100)

hist(x)

#don't forget comments

{/code}

Setting the language to R lets the WordPress plugin use appropriate syntax highlighting.

When formatted, the code will look like this:

    x<-rnorm(100)
    hist(x)
    #don't forget comments

There are more options to set line numbers and highlighting by adding extra parameters.

Tips:

- Keep the lines of code short – long lines will force the user to scroll.
- Use the text editor, not the visual editor (which may garble your code).
- Check that the code works (it is very tempting to edit it and break it).
