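[ The following text is reposted from the Statistics board ]
Posted by: xzxz0000 (凶涨凶涨), Board: Statistics
Subject: R programming interview questions that wrecked me; asking for solutions here. I don't have much money, but I will give out what I can
Posted from: BBS 未名空间站 (Sat Apr 14 20:21:09 2012, US Eastern)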
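I interviewed for a software position and thought I had a good shot, but
nearly all of the questions were R programming that was not in the job
description. I hope the experts here can help me work through them; I would
be very grateful.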
1)
An analytical technique used in a molecular biology lab involves dispensing
solutions of DNA in very low concentration into 384-well plates. Consider the
perfectly random distribution of N=30 molecules onto this device.
a) What is the probability that two molecules fall in the same well? Derive
the closed-form equation for this probability.
b) Plot the above expression for the probability as a function of the number
of molecules dispensed.
c) Solve the same problem by means of a simulation. Include your R code and
provide appropriate statements to ensure exact reproducibility of your
simulation results.
d) Consider the case where the above-described device is affected by
so-called “edge effects”, that is, the probability of a molecule landing in
the wells located on the edge of the plate is smaller than the probability of
a molecule landing in any other well. Assuming that the probability ratio is
1/3, revise the simulation above to calculate the probability that half of
the molecules are found in the center wells of the plate.
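
One possible way to approach problem 1 in R is sketched below. It is not a
vetted answer, and it rests on several assumptions that are not in the
problem statement: part a) is read as "at least two molecules share a well",
the plate is taken to be the usual 16 x 24 layout with the outermost rows and
columns as the edge, the ratio 1/3 is read as edge-well probability over
center-well probability, and "half of the molecules" is reported both as
"at least half" and as "exactly half" because the wording is ambiguous. The
seed, the number of replicates and all variable names are arbitrary choices.

```r
set.seed(20120414)   # arbitrary fixed seed, so reruns give identical results

wells <- 384
n_mol <- 30

# a) Closed form, reading the question as "at least two molecules share a well":
#    P(n) = 1 - 384! / ((384 - n)! * 384^n), evaluated on the log scale.
p_shared <- function(n, w = wells) {
  1 - exp(lgamma(w + 1) - lgamma(w - n + 1) - n * log(w))
}
p_exact <- p_shared(n_mol)

# b) Probability as a function of the number of molecules dispensed.
n_grid <- 1:100
plot(n_grid, p_shared(n_grid), type = "l",
     xlab = "molecules dispensed", ylab = "P(some well holds >= 2)")

# c) Simulation: drop 30 molecules uniformly at random, look for a repeated well.
n_sim <- 20000
shared <- replicate(n_sim,
                    anyDuplicated(sample.int(wells, n_mol, replace = TRUE)) > 0)
p_sim <- mean(shared)

# d) Edge effects. ASSUMPTIONS: 16 x 24 layout, outermost rows/columns are
#    the edge, each edge well is 1/3 as likely as each center well.
n_row <- 16; n_col <- 24
row_id <- rep(seq_len(n_row), times = n_col)
col_id <- rep(seq_len(n_col), each = n_row)
is_edge <- row_id == 1 | row_id == n_row | col_id == 1 | col_id == n_col
weight <- ifelse(is_edge, 1, 3)          # relative (unnormalised) weights
n_center <- replicate(n_sim, {
  hit <- sample.int(wells, n_mol, replace = TRUE, prob = weight)
  sum(!is_edge[hit])                     # molecules that landed in center wells
})
p_half_or_more <- mean(n_center >= n_mol / 2)
p_exactly_half <- mean(n_center == n_mol / 2)

print(c(exact = p_exact, simulated = p_sim))
print(c(at_least_half = p_half_or_more, exactly_half = p_exactly_half))
```

Under these assumptions there are 76 edge wells and 308 center wells, so a
single molecule lands in a center well with probability
3*308 / (3*308 + 76) = 0.924, and the number of center molecules is binomial
with that success probability. That gives an independent check on the
simulated values in d).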
2)
For a given SNP with alleles a and A, the minor allele frequency is p.
Assume that this frequency is the same for both males and females, that there
is no migration in or out of the population, and that there is no selective
advantage for either allele. The proportions of these alleles are stable in
the population over time. Denote the possible genotype states by aa=1, Aa=2,
and AA=3. The evolution of a population over time considering this SNP alone
can be described, e.g., as follows: in the first step, a female is of
genotype aa, so it is in state 1 (X_0 = 1). In the next step, a mate is
selected at random and one or more daughters are produced, the eldest of whom
has genotype X_1. In the following step, this daughter selects a mate at
random and produces an eldest daughter with genotype X_2, and so on.
a) Calculate the transition matrix for the above Markov chain.
b) Show that this chain is ergodic. What is the smallest number of
iterations, N, for which the power N of the transition matrix is strictly
positive?
c) According to the Hardy-Weinberg law, this chain is supposed to have a
steady-state distribution σ = [p^2, 2p(1-p), (1-p)^2]. Does this match the
calculated steady state?
d) For p = [value missing from the post], simulate this chain for n=100,000
iterations and compare the sampling distribution of the simulated states with
the one from the Hardy-Weinberg vector. Show your code and include statements
to ensure exact reproducibility of your simulation.
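
A minimal R sketch of problem 2 follows, offered as one reading rather than a
definitive solution. Assumptions: p denotes the frequency of allele a, the
mate contributes allele a with probability p independently of the mother, and
the mother passes on each of her two alleles with probability 1/2. The value
p = 0.3 is hypothetical, since the post does not give a value for part d);
the seed and variable names are likewise arbitrary.

```r
set.seed(20120414)   # arbitrary fixed seed, so reruns give identical results

p <- 0.3             # HYPOTHETICAL value; the post does not give one
q <- 1 - p

# a) Transition matrix over states 1 = aa, 2 = Aa, 3 = AA (rows: mother,
#    columns: eldest daughter), under the assumptions stated above.
P <- matrix(c(p,     q,     0,
              p / 2, 1 / 2, q / 2,
              0,     p,     q),
            nrow = 3, byrow = TRUE)
stopifnot(all(abs(rowSums(P) - 1) < 1e-12))

# b) P itself has zero entries, so check whether its square is strictly positive.
P2 <- P %*% P
print(c(P_positive = all(P > 0), P2_positive = all(P2 > 0)))

# c) The Hardy-Weinberg vector should be a left fixed point of P.
hw <- c(p^2, 2 * p * q, q^2)
hw_gap <- max(abs(hw %*% P - hw))
print(hw_gap)

# d) Simulate the chain, starting in state 1 (genotype aa).
n <- 100000
x <- integer(n)
x[1] <- 1L
for (t in 2:n) {
  x[t] <- sample.int(3, 1, prob = P[x[t - 1], ])
}
emp <- tabulate(x, nbins = 3) / n
print(rbind(simulated = emp, hardy_weinberg = hw))
```

If the matrix in a) is right, the check in b) suggests N = 2, and the
fixed-point gap in c) should be zero up to rounding for any p strictly
between 0 and 1.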
3)
A certain molecular analyte, comprised of long quasi-linear macromolecules
with approximately constant length l, is analyzed with the help of a
specialized detector. This detector is assembled in the form of many parallel
long strips, each strip of width L. If the macromolecules are randomly
distributed across the surface of the detector, what is the probability that
such a macromolecule would cross the boundary between two strips? (Assume
that l is smaller than L and ignore any “edge effect”, i.e., assume the
detector has a large surface.)
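
Problem 3 looks like the classical Buffon's needle setup, provided each
macromolecule is modelled as a rigid straight segment of length l with
uniformly random position and orientation (an assumption; the problem only
says "quasi-linear"). Under that model the crossing probability for l <= L is
2l / (pi * L). The R sketch below compares that formula with a Monte Carlo
estimate; the values l = 1 and L = 2, the seed and the variable names are
hypothetical choices for illustration.

```r
set.seed(3)          # arbitrary fixed seed, so reruns give identical results

len   <- 1           # macromolecule length l (hypothetical value)
width <- 2           # strip width L (hypothetical value), with len <= width
n     <- 200000

# Distance from the segment's midpoint to the nearest strip boundary.
dist_mid <- runif(n, 0, width / 2)
# Acute angle between the segment and the direction of the strips.
theta <- runif(n, 0, pi / 2)

# The segment reaches the boundary when its half-length, projected
# perpendicular to the strips, covers the distance to that boundary.
crosses <- dist_mid <= (len / 2) * sin(theta)

p_sim    <- mean(crosses)
p_closed <- 2 * len / (pi * width)
print(c(simulated = p_sim, closed_form = p_closed))
```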