Universitat de Barcelona

R Code


Summation space to test the degree of combination

Suppose you have two sources of information that people can use to estimate some physical attribute, be it spatial or temporal. You might want to know whether the precision of subjects' estimates results from combining the two sources in specific ways. One nice way to check the extent of combination or integration is to use a summation space or pooling space. Here we will simulate different degrees of integration by applying Quick's pooling equation (Quick, 1974):

\[\frac{1}{\sigma_{c}} = \left[ \left(\frac{1}{\sigma_{1}}\right)^{k} + \left(\frac{1}{\sigma_{2}}\right)^{k} \right]^{1/k}\]

where \(\sigma_{c}\) is the SD of the combined estimate and \(\sigma_{1}\) and \(\sigma_{2}\) are the SDs associated with each single cue. Note that precision can be defined as the reciprocal of the SD. The exponent \(k\) sets the degree of combination: \(k=1\) corresponds to linear summation, \(k=2\) to MLE integration and \(k\) between 3 and 4 to probability summation.
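To get a feel for the equation, we can plug in some illustrative numbers. The SDs and the helper quick_sd below are made up just for this example; larger values of \(k\) yield less benefit from combining, and as \(k\) grows the combined precision approaches that of the most reliable cue alone.

quick_sd <- function(sd1, sd2, k) # predicted combined SD from Quick's pooling equation
  1 / ( (1/sd1)^k + (1/sd2)^k )^(1/k)
quick_sd(0.1, 0.05, k = 1)   # linear summation: largest reduction in SD
quick_sd(0.1, 0.05, k = 2)   # MLE integration
quick_sd(0.1, 0.05, k = 3.5) # probability summation: close to the best cue alone (0.05)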

We first load the packages that we will need and define a function to draw the predicted locus of SD ratios for a given \(k\) (for the MLE prediction, \(k=2\), this locus is a quarter of a circle):

library(dplyr) # we will need it later
library(ggplot2)
draw.pred <- function(k, n = 500) # predicted locus of SD ratios for a given k
  {
    x <- seq(0, 1, len = n)
    y <- (1 - x^k)^(1/k) # the curve x^k + y^k = 1
    return(data.frame(x = x, y = y))
  }
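Why is this the predicted locus? Raising both sides of the pooling equation to the power \(k\) and multiplying by \(\sigma_{c}^{k}\) gives

\[\left(\frac{\sigma_{c}}{\sigma_{1}}\right)^{k} + \left(\frac{\sigma_{c}}{\sigma_{2}}\right)^{k} = 1\]

so the two ratios of the combined SD to each single-cue SD (the quantities we will plot later) must fall on the curve \(y = (1-x^{k})^{1/k}\) that draw.pred returns.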

The function to simulate the combined response; k is the parameter that determines the degree of combination:

simul_comb <- function(cue1,cue2,k)
{
  sd1 <- sd(cue1)
  sd2 <- sd(cue2)
  precision.comb <- ( (1/sd1)^k + (1/sd2)^k )^(1/k) # Quick's pooling equation
  rnorm(length(cue1), 0.3, 1/precision.comb) # 0.3 is the common mean of both cues
}

Finally, a function to simulate responses to the individual cues:

simul_resp <- function(n,cue=1)
{
  if(cue==1){ # parameters for cue 1
    mu <- 0.3
    sd <- runif(1,0.072,0.13) # less reliable than cue 2    
  }
  else{ # parameters for cue 2
    mu <- 0.3
    sd <- runif(1,0.022,0.064) # more reliable than cue 1
  }  
  rnorm(n,mu,sd)
}
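As a quick sanity check, the SD of responses simulated for cue 2 should come out smaller than for cue 1 (set.seed is used here only to make the illustration reproducible; the value is arbitrary):

set.seed(1) # any integer; just pins down the random numbers
sd(simul_resp(500, 1)) # cue 1: SD drawn from 0.072-0.13
sd(simul_resp(500, 2)) # cue 2: SD drawn from 0.022-0.064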

We are now ready to simulate the responses. We will simulate 500 responses for 8 hypothetical subjects. We create a data frame with one column (subj), with as many rows as the number of responses times the number of subjects:

  data1 <- data.frame(subj=rep(seq(1,8),each=500))

We use mutate from dplyr to add columns consisting of 500 simulated responses to each cue for each subject:

data1 <- mutate(group_by(data1, subj), resp_cue1 = simul_resp(500, 1), resp_cue2 = simul_resp(500, 2))

We add different kinds of combination (ls=linear summation, mle=maximum likelihood, ps=probability summation):

data1 <- mutate(group_by(data1, subj),
                resp_ls = simul_comb(resp_cue1, resp_cue2, k = 1),
                resp_mle = simul_comb(resp_cue1, resp_cue2, k = 2),
                resp_ps = simul_comb(resp_cue1, resp_cue2, k = 3.5))

We obtain SD summaries for each subject, cue and type of combined response:

data.summary <- summarize(group_by(data1, subj),
                          cue1.sd = sd(resp_cue1), cue2.sd = sd(resp_cue2),
                          ls.sd = sd(resp_ls), mle.sd = sd(resp_mle), ps.sd = sd(resp_ps))

We compute the ratios of the different SDs. For example, to check whether responses in the linear summation condition really follow linear summation (k=1), we compute two ratios: ratio1 = ls.sd/cue1.sd and ratio2 = ls.sd/cue2.sd. We do the same for the other two types of combination.

data.summary$ratio1.ls <- data.summary$ls.sd/data.summary$cue1.sd # linear summation
data.summary$ratio2.ls <- data.summary$ls.sd/data.summary$cue2.sd # linear summation
data.summary$ratio1.mle <- data.summary$mle.sd/data.summary$cue1.sd # mle 
data.summary$ratio2.mle <- data.summary$mle.sd/data.summary$cue2.sd # mle
data.summary$ratio1.ps <- data.summary$ps.sd/data.summary$cue1.sd # probability summation
data.summary$ratio2.ps <- data.summary$ps.sd/data.summary$cue2.sd # probability summation
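As an aside, the six assignments above can be collapsed into a single mutate call; this is just an equivalent, more dplyr-flavoured sketch of the same computation:

data.summary <- mutate(data.summary,
                       ratio1.ls = ls.sd/cue1.sd, ratio2.ls = ls.sd/cue2.sd,
                       ratio1.mle = mle.sd/cue1.sd, ratio2.mle = mle.sd/cue2.sd,
                       ratio1.ps = ps.sd/cue1.sd, ratio2.ps = ps.sd/cue2.sd)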

Finally, we plot the different subjects and the different types of combination in the summation space: the combinations generated by linear summation are plotted in red, MLE in blue and probability summation in green.

keq2 <- draw.pred(2,500) # predictions for mle (k=2)
keq35 <- draw.pred(3.5,500) # predictions for prob. summation (k = 3.5)
df <- data.frame(x=c(0,1),xend=c(1,1),y=c(1,1),yend=c(1,0)) # segments for the no-combination prediction
ggplot(data.summary) +
  geom_point(aes(ratio1.ls, ratio2.ls), colour = "red", size = 4) +
  geom_point(aes(ratio1.mle, ratio2.mle), colour = "blue", size = 4) +
  geom_point(aes(ratio1.ps, ratio2.ps), colour = "green", size = 4) +
  geom_segment(aes(x = x, xend = xend, y = y, yend = yend),
               linetype = 3, colour = "grey30", data = df) +
  geom_abline(intercept = 1, slope = -1, colour = "grey30") +
  geom_path(data = keq2, aes(x = x, y = y), linetype = 2, colour = "grey30") +
  geom_path(data = keq35, aes(x = x, y = y), linetype = 3, colour = "grey30") +
  xlab("combined_cue/cue 1") + ylab("combined_cue/cue 2") +
  theme_grey(18) + xlim(0, 1.2) + ylim(0, 1.2)

The solid line denotes the linear summation prediction (k=1), the dashed curve corresponds to the MLE prediction (k=2), the dotted curve denotes probability summation (k between 3 and 4) and the straight dotted lines represent no combination at all (\(k=\infty\)).

We have used this approach (de la Malla and López-Moliner, 2015) to analyse the degree of combination of different cues across time. One has to be careful, however, when interpreting the exponents of the models, as pointed out in Pannunzi et al. (2015).

References:

  1. Quick, R. F. (1974). A vector-magnitude model of contrast detection. Kybernetik, 16, 65–67.
  2. de la Malla, C. and López-Moliner, J. (2015). Predictive plus online visual information optimizes temporal precision in interception. J Exp Psychol Hum Percept Perform, 41(5):1271–1280.
  3. Pannunzi, M., Pérez-Bellido, A., Pereda-Baños, A., López-Moliner, J., Deco, G., and Soto-Faraco, S. (2015). Deconstructing multisensory enhancement in detection. J Neurophysiol, 113(6):1800–1818.


Diffusion process

Diffusion models in psychology (see the references below) are very relevant in the domain of modelling response times. They are the continuous version of a random walk. What is usually modelled is the situation in which subjects make two-choice decisions based on a single process, for which response times do not usually take longer than one second. The basic assumption of the models is that a stimulus provides evidence for either response and that this evidence is accumulated over time towards one of two response thresholds, each threshold representing one of the two response alternatives. A response is initiated when one of the response thresholds is reached.

Usually the diffusion process has a starting point or position at time = 0 (the time at which the stimulus is presented). The stimulus starts the process of accumulation of evidence until a response threshold is reached. Here we will simulate a discrete version of the diffusion process, that is, we will use a random walk.
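In the discrete version, each time step simply adds a constant drift plus Gaussian noise to the current state:

\[y_{t} = y_{t-1} + m + \sigma\,\epsilon_{t}, \qquad \epsilon_{t} \sim \mathcal{N}(0,1)\]

where \(m\) is the drift (the constant input to the process) and \(\sigma\) scales the noise; this is exactly the update that the cumsum call below implements.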

A random walk can be easily implemented in R: see the function go_for_a_walk below and how we use it to plot a simple random walk.

library(ggplot2)
go_for_a_walk <-function(m=0.2,sig=0.75,th=c(5,-5),y0=0)
{
  #m: constant input to the process
  #sig: amplitude of Gaussian noise
  #th: response threshold values for correct responses and errors
  #y0: initial state
  y <- y0 + cumsum(m + sig*rnorm(10000)) # accumulate drift plus Gaussian noise for up to 10000 steps
  id <- which(y>th[1] | y<th[2])[1] # index of the time step at which either threshold is first reached
  y[1:id] # return the path up to the first threshold crossing
}
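If you want the walk below to be reproducible, you can fix the random seed first (set.seed is base R; the particular value is arbitrary):

set.seed(123) # any integer; it only pins down the sequence of random numbers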

y <- go_for_a_walk()
steps <- 1:length(y)
#we plot the random walk and one line denoting the threshold
qplot(steps,y,geom = "line",lwd=2,col="red") + geom_hline(yintercept=5) + theme(legend.position="none")

However, the interest of using diffusion processes lies in simulating many of these random walks in order to obtain response time distributions. The following code simulates N random walks, with the drift pushing towards the positive (correct) threshold, very efficiently thanks to dplyr functions.

diffusion1Dv <- function(th=5,m=0.2,sig=0.75,y=0,N=1000)
{
  require(dplyr)
  
  df <- data.frame(trial=1:N) # one row per repetition or trial
  df <- group_by(df,trial) %>% do(data.frame(y=go_for_a_walk(m=m,sig=sig,th=c(th,-th),y0=y)))
  df <- mutate(df,step=1:n()) # we add a column with the step number
  df
}

rws <- diffusion1Dv(N=500) # N should be much larger in practice
ggplot(rws, aes(step, y, colour = trial, group = trial)) +
  geom_point(alpha = 0.02) + geom_line(alpha = 0.2) +
  geom_hline(yintercept = c(-5, 5)) +
  theme_classic() + theme(legend.position = "none") + scale_x_log10()

As you can see, there are some errors, that is, random walks that end up reaching the threshold for the alternative response. Once we have the full set of simulated processes, we can obtain histograms of the time steps. We then need to obtain the number of iterations or steps for each of the random walks. The data frame returned by diffusion1Dv is already grouped by trial, so we obtain summaries per trial: first whether the response is correct or not (type), and second the number of steps (n). In order to get a nicer histogram we call diffusion1Dv again, but this time with a large N.

rws <- diffusion1Dv(N=10000) # this time with a much larger N
rws.s <- summarise(rws,type=sign(last(y)),n=n())#n() returns the length (# iterations) of each trial
rws.s$type <- factor(rws.s$type,labels = c("errors","correct responses"))
ggplot(rws.s,aes(n,fill=type))+geom_histogram(position="dodge")+theme_classic()+scale_fill_brewer(type="qual",palette = "Set1")
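As a rough sanity check (a textbook approximation, not part of the original analysis): for a diffusion with drift m towards a threshold at th, and when errors are rare, the mean first-passage time is approximately th/m, here 5/0.2 = 25 steps. We can compare that with the simulated correct responses:

mean(rws.s$n[rws.s$type == "correct responses"]) # should be in the vicinity of th/m = 25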

You can see the apparent simplicity of the R code needed to simulate diffusion processes. You will also notice that the distributions are skewed (not symmetrical).

References

  1. Ratcliff, R. (1978). A theory of memory retrieval. Psychol Rev, 85:59–108.
  2. Ratcliff, R. (1979). Group reaction time distributions and an analysis of distribution statistics. Psychological Bulletin, 86:446–461.
  3. Ratcliff, R. (1980). A note on modeling accumulation of information when the rate of accumulation changes over time. Journal of Mathematical Psychology, 21:178–184.
  4. Ratcliff, R., Van Zandt, T., and McKoon, G. (1999). Connectionist and diffusion models of reaction time. Psychol Rev, 106(2):261–300.


Quickpsy

by Daniel Linares and Joan López-Moliner

quickpsy is a new R package. With quickpsy you can fit psychometric functions to different conditions with very little R code. It is designed to be user-friendly but also provides great flexibility (e.g. support for user-defined functions), speed and nice graphical output based on ggplot2.

Below you have a simple example:

library(quickpsy)
library(MPDiR) # contains the Vernier data; use ?Vernier for the reference
fit <- quickpsy(Vernier, Phaseshift, NumUpward, N,
                grouping = .(Direction, WaveForm, TempFreq))
plot(fit) + theme_classic()+ scale_color_brewer(type="qual",palette="Set1")
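The fitted object is a list of data frames and can be inspected directly; a minimal sketch, assuming the fit above (par holds the parameter estimates and thresholds the estimated thresholds for each condition):

fit$par        # parameter estimates per condition
fit$thresholds # estimated thresholds per condition
plotthresholds(fit) # quickpsy also ships plotting helpers such as plotthresholds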

But you can do a lot, lot more in no time …

To get quickpsy and quickly learn how to use it, you can go to CRAN or visit Dani's web page and follow the directions for installation.

and … Enjoy it!!!

Web page made with rmarkdown using R and RStudio. Content created by Joan López-Moliner.