A question about genomics and bioinformatics # Biology
avatar
m*n
1
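I prepared all evening and thought I would handle the questions easily, but things still went wrong: my resume had a spelling mistake and I had never noticed it.
The interviewer's questions left me mortified; they said they wanted to hear my explanation...
Then they asked what I had been doing in the months since graduation.
Sigh, what a depressing interview. I'm so down; I just came here to vent...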
avatar
x*a
2
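A man walks into a bar with his pet crocodile. He sets the crocodile on the bar, turns to the astonished patrons and says: "Let me make you all a deal. I'll open the crocodile's mouth and put my you-know-what inside, and it will close its mouth. A few minutes later it will open again and I'll take it out without a scratch. In return for witnessing this marvel, each of you buys me a drink."
The crowd murmured its agreement. The man stood at the bar, dropped his trousers and placed it in the crocodile's open mouth; as the audience held its breath, the crocodile closed its jaws. After a minute, the man picked up a beer bottle and struck the crocodile hard on the head. The crocodile opened its mouth and the man did indeed remove it unharmed. The crowd cheered and sent drinks over to him.
Before long the man stood up again with another offer: "I'll pay a thousand dollars to anyone who dares to give it a try!" The crowd fell silent. After a while a hand went up at the back of the bar, and a blonde woman said shyly: "I can give it a try, but you have to promise not to hit me on the head with a beer bottle."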
avatar
b*s
3
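I don't know genomics very well and would like to ask a few questions.
After doing a transcriptional profile with microarray or RNA-seq, can one analyze the promoter elements of the genes whose expression changes to get high-hit candidates, and then use those candidate elements in a yeast one-hybrid screen to find the transcription factors that bind them?
Is this infeasible in practice? Is it because it is hard to end up with reliable elements to screen with? These elements would not be tied to one or two specific genes. Rather, for a treatment group that shows a phenotype, the idea is to use the changes in gene expression to find the transcriptional regulators associated with the phenotypic change.
My model system is a plant. Thanks!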
avatar
g*s
4
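Think positively: at least now you know your resume has spelling issues.
You should thank the interviewer for being careful and serious.

[Quoting m******n's post]
: I prepared all evening and thought I would handle the questions easily, but things still went wrong: my resume had a spelling mistake and I had never noticed it.
: The interviewer's questions left me mortified; they said they wanted to hear my explanation...
: Then they asked what I had been doing in the months since graduation.
: Sigh, what a depressing interview. I'm so down; I just came here to vent...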

avatar
c*r
5
Could it be...
that a blonde stole your boyfriend?
avatar
K*4
6
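Not feasible!
You can do it, but the false positive rate is too high for the result to be of practical value.
The main problem is that existing mRNA data are incomplete, so the real transcription start site (TSS) cannot be located correctly.
By convention people extract 500-4000 bp upstream of the TSS for promoter analysis, but the real site may well lie 50-100 kb away.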
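
To make the windowing convention above concrete, here is a minimal Python sketch of pulling a fixed window upstream of an annotated TSS. The function names (`upstream_window`, `promoter_sequence`), the 0-based half-open coordinates and the default window size are assumptions made for illustration, not something taken from the posts. The output is only as trustworthy as the TSS annotation, which is exactly the caveat raised here.

```python
# Complement table used to report minus-strand promoters 5'->3'.
_COMP = str.maketrans("ACGTacgt", "TGCAtgca")


def upstream_window(tss, strand, chrom_len, upstream=1000):
    """Return the 0-based half-open (start, end) window upstream of a TSS.

    tss is the 0-based position of the annotated start site; the TSS base
    itself is not included. The window is clipped to the chromosome ends.
    """
    if strand == "+":
        start, end = tss - upstream, tss
    elif strand == "-":
        # On the minus strand, "upstream" means higher coordinates.
        start, end = tss + 1, tss + 1 + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    return max(0, start), min(chrom_len, end)


def promoter_sequence(chrom_seq, tss, strand, upstream=1000):
    """Extract the upstream window, reverse-complemented for '-' genes."""
    start, end = upstream_window(tss, strand, len(chrom_seq), upstream)
    seq = chrom_seq[start:end]
    return seq if strand == "+" else seq.translate(_COMP)[::-1]
```

Widening `upstream` from 500 to 4000 changes only how much sequence is scanned; it does nothing about a mis-annotated TSS or a distal enhancer tens of kb away.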

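[Quoting b******s's post]
: I don't know genomics very well and would like to ask a few questions.
: After doing a transcriptional profile with microarray or RNA-seq, can one analyze the promoter elements of the genes whose expression changes to get high-hit candidates, and then use those candidate elements in a yeast one-hybrid screen to find the transcription factors that bind them?
: Is this infeasible in practice? Is it because it is hard to end up with reliable elements to screen with? These elements would not be tied to one or two specific genes. Rather, for a treatment group that shows a phenotype, the idea is to use the changes in gene expression to find the transcriptional regulators associated with the phenotypic change.
: My model system is a plant. Thanks!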

avatar
m*n
7
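Yeah, I also think the HR person was nice; she even said I should go to my school's career center to get my resume revised. Sigh, I'm upset because I feel stupid for not catching the mistake, and also because I've been job hunting for so long with nothing to show for it. I'm close to a breakdown.

[Quoting g*********s's post]
: Think positively: at least now you know your resume has spelling issues.
: You should thank the interviewer for being careful and serious.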

avatar
x*a
8
I don't have a boyfriend.

[Quoting c*******r's post]
: Could it be...
: that a blonde stole your boyfriend?

avatar
x*m
9
In that case you would be better off doing ChIP-seq.
avatar
b*s
10
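Isn't ChIP-seq for finding downstream targets?
What I want to find now are the upstream TFs that bind.

[Quoting x******m's post]
: In that case you would be better off doing ChIP-seq.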
avatar
b*s
11
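Could this be used to find some likely elements, then fuse them to fluorescent proteins for further in vivo analysis, and only run the screen once they are confirmed?

[Quoting K**4's post]
: Not feasible!
: You can do it, but the false positive rate is too high for the result to be of practical value.
: The main problem is that existing mRNA data are incomplete, so the real transcription start site (TSS) cannot be located correctly.
: By convention people extract 500-4000 bp upstream of the TSS for promoter analysis, but the real site may well lie 50-100 kb away.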

avatar
l*1
12
If it is related to stress response, you can try
RAD-seq: Restriction site Associated DNA Sequencing
http://www.molbio.uoregon.edu/facres/johnson.html
https://www.wiki.ed.ac.uk/display/RADSequencing/Home
If it is related to epigenetic modification (BS-Seq profiles DNA methylation rather than histone marks), you can try
BS-Seq: Bisulphite Sequencing
http://seqanswers.com/wiki/BS-Seq
The original hint came from a Nature job posting:
http://www.nature.com/naturejobs/science/jobs/344164-postdoctor
>Postdoctoral Fellow in Evolutionary Bioinformatics: Vienna, Austria
>A postdoctoral position in bioinformatics is immediately available in the research group of Ovidiu Paun at the University of Vienna (see http://www.botanik.univie.ac.at/systematik/personnel/Paun.htm).
(text omitted)
>The candidate will play a lead role in analysing next generation sequencing data including RNA-seq, smRNA-seq, BS-seq and RAD-seq. The fellow will be also involved in identifying outliers and performing environmental correlations.
>We are looking for a highly self-motivated and independent candidate, yet willing to work in a team-effort. The fellow should hold a relevant PhD degree in bioinformatics or related fields before starting this position. Fluency in a major programming language such as perl or python and a strong publication record are expected. The successful candidate should also be able to demonstrate experience with computational analyses of high-throughput genomic data.
>To be considered please send your application per email to ovidiu.paun’@‘univie.ac.at including your CV, ........
>The latest preferred start date is March 1st, 2014.
or
http://evol.mcmaster.ca/~brian/evoldir/PostDocs/Vienna.Evolutio
http://evol.mcmaster.ca/cgi-bin/my_wrap/brian/evoldir/PostDocs/

[Quoting b******s's post]
: Isn't ChIP-seq for finding downstream targets?
: What I want to find now are the upstream TFs that bind.

avatar
u*1
13
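We need to keep two things clearly apart:
a. RNA-seq shows that the transcription level of some genes changes.
b. Yeast hybrid assays, ChIP-seq and so on give us evidence that a particular TF binds to certain elements of a gene.
All I want to say is that the regulation of transcription level is extremely complex, far more than just the promoter. You can look directly at the ENCODE project's annotation of noncoding regions: many TFs bind within introns, and genes are also regulated by many distant enhancers (through chromatin structure such as looping). So a change in transcription level absolutely cannot be taken to mean that regulation at the promoter caused the change in expression.
Of course I don't fully trust the ENCODE data either: 1. I believe many other, rarer TFs also bind a given site but are not covered by the database; 2. even with evidence that a TF binds a locus, that does not mean the binding has a biological function; this needs downstream validation.
And all of the above is still the simplest case, ignoring tissue specificity, epigenetics, splicing, ... In short, too many factors can change the transcription level.
My point is that a and b have no necessary relationship, even though they look connected.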

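[Quoting b******s's post]
: I don't know genomics very well and would like to ask a few questions.
: After doing a transcriptional profile with microarray or RNA-seq, can one analyze the promoter elements of the genes whose expression changes to get high-hit candidates, and then use those candidate elements in a yeast one-hybrid screen to find the transcription factors that bind them?
: Is this infeasible in practice? Is it because it is hard to end up with reliable elements to screen with? These elements would not be tied to one or two specific genes. Rather, for a treatment group that shows a phenotype, the idea is to use the changes in gene expression to find the transcriptional regulators associated with the phenotypic change.
: My model system is a plant. Thanks!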

avatar
u*1
14
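For the genes you are interested in, just go and look them up on UCSC. There is a lot of ChIP-seq-based data now, so for any gene of interest (for example one whose transcription changes) you can see which TFs bind along the whole gene... If you find a stretch of sequence that is bound by many TFs and is also conserved, that element is very likely functional, and you can then test it in a luciferase system.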
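
As a rough illustration of the "bound by many TFs" heuristic above, the following Python sketch counts how many distinct TFs have a ChIP-seq peak overlapping each candidate region. It assumes the peaks have already been exported as plain tuples; the function name `tfs_per_region`, the tuple layout and the half-open coordinates are all hypothetical choices for this example, and this is not how the UCSC browser itself works. Conservation would have to be checked separately.

```python
def tfs_per_region(regions, peaks):
    """Rank candidate regions by the number of distinct TFs bound.

    regions: list of (name, chrom, start, end)
    peaks:   list of (tf, chrom, start, end)
    All intervals are 0-based and half-open.
    Returns [(region_name, sorted_tf_list), ...], most-bound first.
    """
    bound = {name: set() for name, _, _, _ in regions}
    for name, r_chrom, r_start, r_end in regions:
        for tf, p_chrom, p_start, p_end in peaks:
            # Two half-open intervals overlap when each starts
            # before the other one ends.
            if p_chrom == r_chrom and p_start < r_end and r_start < p_end:
                bound[name].add(tf)
    ranked = [(name, sorted(tfs)) for name, tfs in bound.items()]
    ranked.sort(key=lambda item: len(item[1]), reverse=True)
    return ranked
```

The nested loop is quadratic, which is fine for a handful of genes; a genome-wide version would normally sort the intervals or use a dedicated interval tool instead.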

[Quoting b******s's post]
: Could this be used to find some likely elements, then fuse them to fluorescent proteins for further in vivo analysis, and only run the screen once they are confirmed?

avatar
c*y
15
Do those transcription factor binding site prediction programs make sense? I mean, if ChIP data are not available, what can we do about regulatory elements on the basis of sequence alone?

[Quoting u*********1's post]
: For the genes you are interested in, just go and look them up on UCSC. There is a lot of ChIP-seq-based data now, so for any gene of interest (for example one whose transcription changes) you can see which TFs bind along the whole gene... If you find a stretch of sequence that is bound by many TFs and is also conserved, that element is very likely functional, and you can then test it in a luciferase system.

avatar
u*1
16
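It makes NO sense at all from my perspective.
You can find plenty of prediction software and websites, and the results predicted by different sites are completely different.
TF binding motifs are, I think, all highly variable (http://en.wikipedia.org/wiki/Position-specific_scoring_matrix). Of course I'm an outsider here, so I'd like to ask the people who actually work on TF binding: can we by now say precisely that the binding site of, say, MEF2A must be, say, ATGGCC (a sequence I just made up)?
In my experience, even if MEF2A bound exclusively to ATGGCC, that would not mean every ATGGCC is targeted by MEF2A; you still have to do the experiment.
What I currently find more convincing is this: if ChIP-seq data show that a TF binds a certain locus of a gene, and software predicts that a SNP at that locus alters the binding motif, then I believe that SNP regulates the gene's expression through the binding of this TF.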
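
For readers unfamiliar with the position-specific scoring matrix linked above, here is a minimal Python sketch of how such a scan works. The aligned "sites" are toy data built around the made-up ATGGCC example from this post, not a real MEF2A motif, and the function names, pseudocount, uniform background and threshold are all assumptions. It also shows why the choice of threshold, which differs between tools, changes which sites get reported.

```python
import math


def pwm_from_sites(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from aligned sites."""
    pwm = []
    for i in range(len(sites[0])):
        counts = {b: pseudocount for b in "ACGT"}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        # Log-odds of the observed frequency against the background.
        pwm.append({b: math.log2(counts[b] / total / background)
                    for b in "ACGT"})
    return pwm


def scan(seq, pwm, threshold):
    """Return (position, kmer, score) for windows scoring >= threshold."""
    width = len(pwm)
    hits = []
    for i in range(len(seq) - width + 1):
        kmer = seq[i:i + width]
        if set(kmer) - set("ACGT"):
            continue  # skip windows containing N or other symbols
        score = sum(col[base] for col, base in zip(pwm, kmer))
        if score >= threshold:
            hits.append((i, kmer, score))
    return hits


# Toy aligned sites (hypothetical) and a toy sequence to scan.
toy_sites = ["ATGGCC", "ATGGCA", "ATGGCC", "TTGGCC"]
toy_pwm = pwm_from_sites(toy_sites)
toy_hits = scan("GGATGGCCTTATGGCAAA", toy_pwm, threshold=4.0)
```

A hit here only means the sequence resembles the motif; as the post says, whether the TF actually occupies that site in vivo still has to be shown experimentally.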

[Quoting c***y's post]
: Do those transcription factor binding site prediction programs make sense? I mean, if ChIP data are not available, what can we do about regulatory elements on the basis of sequence alone?

avatar
b*s
17
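Thank you very much. What I actually have in mind is doing this at the genome-wide level: analyze the promoter elements of the transcripts that change to find some candidates, then link them to reporters for a further round of screening. The confirmed elements would then go into a Y1H screen. So even with many false negatives and false positives, it would be good enough to find some genes that change, together with their regulatory sequences and regulators.
Also, my system is a plant, and I hear that many plant databases are poor and basically unusable.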
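
One common way to shortlist candidate elements before building reporters is an over-representation test: is a motif found in the promoters of the changed genes more often than expected from all promoters? The Python sketch below uses a hypergeometric upper tail for this. It is only one possible choice and is not claimed to be the method of any paper cited in this thread; the function names, the exact-match search on both strands and the example motif are assumptions.

```python
from math import comb

_COMP = str.maketrans("ACGT", "TGCA")


def contains_motif(seq, motif):
    """True if the motif occurs on either strand (exact match only)."""
    seq, motif = seq.upper(), motif.upper()
    return motif in seq or motif.translate(_COMP)[::-1] in seq


def enrichment_p(N, K, n, k):
    """Hypergeometric upper tail P(X >= k).

    N: all promoters, K: promoters with the motif,
    n: promoters of changed genes, k: changed genes with the motif.
    """
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


def motif_enrichment(promoters, changed, motif):
    """promoters: dict gene_id -> sequence; changed: set of gene ids.

    Returns (k, n, K, N, p_value).
    """
    with_motif = {g for g, s in promoters.items() if contains_motif(s, motif)}
    N, K = len(promoters), len(with_motif)
    n = sum(1 for g in changed if g in promoters)
    k = sum(1 for g in changed if g in with_motif)
    return k, n, K, N, enrichment_p(N, K, n, k)
```

When many motifs are tested genome-wide, the p-values need multiple-testing correction, and an enriched motif is still only a candidate for the reporter and Y1H steps described above.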

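[Quoting u*********1's post]
: We need to keep two things clearly apart:
: a. RNA-seq shows that the transcription level of some genes changes.
: b. Yeast hybrid assays, ChIP-seq and so on give us evidence that a particular TF binds to certain elements of a gene.
: All I want to say is that the regulation of transcription level is extremely complex, far more than just the promoter. You can look directly at the ENCODE project's annotation of noncoding regions: many TFs bind within introns, and genes are also regulated by many distant enhancers (through chromatin structure such as looping). So a change in transcription level absolutely cannot be taken to mean that regulation at the promoter caused the change in expression.
: Of course I don't fully trust the ENCODE data either: 1. I believe many other, rarer TFs also bind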

avatar
l*1
18
Plant databases are not fundamentally different from non-plant or mammalian ones; they are being improved in the same way.
What matters is whether the OP has a background in NGS and HMMs (Hidden Markov Models), or can find a collaborator with that background.
Please refer to
PMID 23435661
by Van der Does D et al., (2013).
Salicylic acid suppresses jasmonic acid signaling downstream of SCFCOI1-JAZ
by targeting GCC promoter motifs via transcription factor ORA59.
Plant Cell. 25: 744-61.
Abstract:
(text omitted)
In silico promoter analysis of the SA/JA crosstalk transcriptome revealed
that the 1-kb promoter regions of JA-responsive genes that are suppressed by
SA are significantly enriched in the JA-responsive GCC-box motifs
(text omitted)
http://www.ncbi.nlm.nih.gov/pubmed/23435661
full pdf link:
http://www.plantcell.org/content/25/2/744.full.pdf
or
Wong KC et al., (2013).
DNA motif elucidation using belief propagation.
Nucleic Acids Res. 41: e153.
http://www.ncbi.nlm.nih.gov/pubmed/23814189
Abstract
Protein-binding microarray (PBM) is a high-throughput platform that can
measure the DNA-binding preference of a protein in a comprehensive and
unbiased manner. A typical PBM experiment can measure binding signal
intensities of a protein to all the possible DNA k-mers (k = 8 ~10); such
comprehensive binding affinity data usually need to be reduced and
represented as motif models before they can be further analyzed and applied.
Since proteins can often bind to DNA in multiple modes, one of the major
challenges is to decompose the comprehensive affinity data into multimodal
motif representations. Here, we describe a new algorithm that uses Hidden
Markov Models (HMMs) and can derive precise and multimodal motifs using
belief propagations. We describe an HMM-based approach using belief
propagations (kmerHMM), which accepts and preprocesses PBM probe raw data
into median-binding intensities of individual k-mers.
(text omitted)
http://www.cs.toronto.edu/~wkc/kmerHMM/downloads.html
or
http://www.cs.utoronto.ca/~wkc/
http://www.utoronto.ca/zhanglab/people.html

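[Quoting b******s's post]
: Thank you very much. What I actually have in mind is doing this at the genome-wide level: analyze the promoter elements of the transcripts that change to find some candidates, then link them to reporters for a further round of screening. The confirmed elements would then go into a Y1H screen. So even with many false negatives and false positives, it would be good enough to find some genes that change, together with their regulatory sequences and regulators.
: Also, my system is a plant, and I hear that many plant databases are poor and basically unusable.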

avatar
l*1
19
Please refer to these two recent papers:
a,
by Morozov VY and Ioshikhes IP. (2013).
Optimized Position Weight Matrices in Prediction of Novel Putative Binding
Sites for Transcription Factors in the Drosophila melanogaster Genome.
PLoS One. 8: e68712.
Abstract
Position weight matrices (PWMs) have become a tool of choice for the
identification of transcription factor binding sites in DNA sequences.
(text omitted)
In the present study, we extended this technique originally tested on
single examples of transcription factors (TFs) and showed its capability to
optimize PWM performance to predict new binding sites in the fruit fly
genome. We propose refined PWMs in mono- and dinucleotide versions similarly
computed for a large variety of transcription factors of Drosophila
melanogaster. Along with the addition of many auxiliary sites the
optimization includes variation of the PWM motif length, the binding sites
location on the promoters and the PWM score threshold. To assess the
predictive performance of the refined PWMs we compared them to conventional
TRANSFAC and JASPAR sources.
(text omitted)
http://www.ncbi.nlm.nih.gov/pubmed/23936309
or
b,
by Radivojac P et al., (2013).
A large-scale evaluation of computational protein function prediction.
Nat Methods. 10: 221-7.
Abstract
(text omitted)
Fifty-four methods representing the state of the art for protein function
prediction were evaluated on a target set of 866 proteins from 11 organisms.
Two findings stand out: (i) today's best protein function prediction
algorithms substantially outperform widely used first-generation methods,
with large gains on all types of targets; and (ii) although the top methods
perform well enough to guide experiments, there is considerable need for
improvement of currently available tools.
http://www.ncbi.nlm.nih.gov/pubmed/23353650

[Quoting c***y's post]
: Do those transcription factor binding site prediction programs make sense? I mean, if ChIP data are not available, what can we do about regulatory elements on the basis of sequence alone?

avatar
l*1
20
Please refer to this review:
By Madrigal P and Krajewski P. (2012).
Current bioinformatic approaches to identify DNase I hypersensitive sites
and genomic footprints from DNase-seq data.
Front Genet. 3:230.
Quoted from its p. 2:
>With sufficiently deep sequencing, the so-called “digital genomic footprinting”
>technique can reveal single protein-binding events (Hesselberth et al., 2009).
>Unlike ChIP-seq, which is specific for the protein under study, footprints identify
>narrow DNA regions that can be bound by any factor (Hager, 2009), showing
>significant enrichment for known motifs upstream of the transcription start sites
>(TSSs).
http://www.ncbi.nlm.nih.gov/pubmed/23118738
or
Zhang W et al., (2012),
Genome-wide identification of regulatory DNA elements and protein-binding
footprints using signatures of open chromatin in Arabidopsis.
Plant Cell. 24: 2719-31.
http://www.ncbi.nlm.nih.gov/pubmed/22773751

[Quoting b******s's post]
: Isn't ChIP-seq for finding downstream targets?
: What I want to find now are the upstream TFs that bind.
