We want to calibrate the parameters that will be used to model the simulations of coverage variations. Specifically, we are analyzing the coverage variation within individuals. Such variation will consist on a multiplier selection, coming from a Gamma distribution with a hyper prior for the alpha shape, sampled from a Log. Normal distribution. Mean and standard deviation parameters of the Log. Normal distribution will be modified to study, afterwards, the resulting coverage distributions.
I’ll start with Log. Normal Distribution parameters of:
Also, they will be sampled within the (1,9)
quantiles so we wouldn’t have too low/high values.
probs=seq(0,1,length.out = 1000)
lims= qlnorm(probs,meanLN,sdLN) # To avoid Inf
plot(probs,lims,
xlab="Probabilities",ylab="Values",pch=16,
main=paste0("Quantile function\nLog. Normal distribution (",meanLN,",",sdLN,")"),
col=ifelse(((probs> 0.1 & probs <= 0.9)),"red", "black"), axes=F)
axis(1); axis(2)
Sample the alpha-shapes for all the replicates.
for (i in 1: nReplicates){
a=rlnorm(1,meanlog = meanLN,sdlog = sdLN)
lims= qlnorm(c(0.1,0.9),meanLN,sdLN) # To avoid Inf
while ( a < lims[1] && a > lims[2]){
a=rlnorm(1,meanlog = meanLN,sdlog = sdLN)
}
alphaShapes[i]=a
}
x=seq(0,5,length.out = 100000)
plot(x,
dgamma(
x,
alphaShapes[1],
alphaShapes[1]),
type="l", col="darkred",lwd=2,
ylim=c(0,5),
main="Gamma distribution sampled",
ylab="dgamma(x, alpha,alpha)", xlab="Value", axes=F)
for (i in 2:nReplicates){
points(x,dgamma(
x,
alphaShapes[i],
alphaShapes[i]),
type="l", col=cols[i],
lwd=2)
}
axis(1);axis(2)
for (i in 1:nReplicates){
multipliers[[i]]=rgamma(nInds, alphaShapes[i],alphaShapes[i])
for (j in 1:nInds){
multiplier=multipliers[[i]][j]
data[[i]][j,]=data[[i]][j,]*multiplier
}
}
maxX=max(unlist(lapply(data, max)))
histData=list(data.frame())
densityData=list(data.frame())
for (i in 1:nReplicates){
h=hist(data[[i]], breaks = 100, plot=F)
histData[[i]]=h
d=density(data[[i]])
densityData[[i]]=d
}
maxY=max(unlist(lapply(densityData,function(x){max(x$y)})))
plot(densityData[[1]]$x,densityData[[1]]$y,
col=cols[1], lwd=2, type="l",
main="Coverage distribution",
xlab="Depth of coverage", ylab="Density",
xlim=c(0,maxX), ylim=c(0,maxY),axes=F)
for (i in 2:nReplicates){
points(densityData[[i]]$x,densityData[[i]]$y,col=cols[i], lwd=2, pch=16, type="l")
}
axis(1);axis(2)