Session Introduction | The 15th China R Conference (Beijing) - Renmin University Session: Statistical Inference Theory and Applications

2022-11-14 05:11

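The 15th China R Conference (Beijing) will be held at Renmin University of China from November 19 to 25, 2022. The conference is hosted by 统计之都 (Capital of Statistics), the School of Statistics of Renmin University of China, and the Center for Applied Statistics of Renmin University of China, with sponsorship from Posit, and will be held in a hybrid online and in-person format. Visit the R Conference website for more information!

Link: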

https://china-r.org/bj2022/index.html

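Below are the talks in the Renmin University session of this R Conference, Statistical Inference Theory and Applications. The session chair is Lü Xiaoling (吕晓玲):

Renmin University Session: Statistical Inference Theory and Applications

Time: November 19, 2022, 14:00-16:45

Tencent Meeting ID: 581200034

Tencent Meeting link: https://meeting.tencent.com/dm/g6FZSd4HQNhM

On-site venue: Room 1031, Mingde Main Building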

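Li Jie (李杰)

Statistical Inference for Mean Function of Longitudinal Imaging Data over Complicated Domains

Biography

Li Jie is a faculty-track postdoctoral fellow at the School of Statistics, Renmin University of China, and received a Ph.D. in statistics from Tsinghua University in 2022. Main research interests are functional data analysis, time series, and nonparametric statistics. Honors include the first prize of the International Statistical Institute's 2021 Jan Tinbergen Award and the Institute of Mathematical Statistics' 2020 Hannan Graduate Student Travel Award, along with multiple papers in journals such as Statistica Sinica.

Abstract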

Motivated by longitudinal imaging data possessing inherent spatial and temporal correlation, we propose a novel procedure to estimate its mean function. A functional moving average is applied to depict the dependence among temporally ordered images, and flexible bivariate splines over triangulations are utilized to handle the irregular image domains that are common in imaging studies. Both global and local asymptotic properties of the bivariate spline estimator for the mean function are established, with simultaneous confidence corridors (SCCs) as a theoretical byproduct. Under some mild conditions, the proposed estimator and its accompanying SCCs are shown to be consistent and oracle efficient, as if all images were entirely observed without errors. The finite-sample performance of the proposed method in Monte Carlo simulation experiments strongly corroborates the asymptotic theory. The proposed method is further illustrated by analyzing two sea water potential temperature data sets.

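Zhou Feng (周峰)

Generalized Bayesian Spatio-Temporal Point Process Model and Its Application

Biography

Zhou Feng is a lecturer at the School of Statistics, Renmin University of China, and a Distinguished Young Scholar of Renmin University of China. Zhou is the principal investigator of a National Natural Science Foundation of China Young Scientists Fund project and of special and general grants from the China Postdoctoral Science Foundation, and was selected for the incoming-talent track of the Postdoctoral International Exchange Program. Main research interests include statistical machine learning, Bayesian methods, stochastic processes, and neural spike trains. Main papers have appeared in journals and conferences such as the Journal of Machine Learning Research, Statistics and Computing, the International Conference on Learning Representations (ICLR), and the Conference on Neural Information Processing Systems (NeurIPS).

Abstract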

The spatio-temporal point process is a common stochastic process model used to describe the pattern of events occurring in time or space. Its applications cover a wide range of domains, including seismology, epidemiology, neuroscience, and high-frequency financial engineering. The traditional spatio-temporal point process model has limitations in flexibility, time-variability, multi-taskability, uncertainty, and efficiency. To relieve these limitations, in the first part of our work we propose the flexible time-varying nonlinear Hawkes process, which extends the traditional Hawkes process in terms of both flexibility and time-variability; in the second part we propose the heterogeneous multi-task nonparametric Cox process, which extends the traditional nonhomogeneous Poisson process in terms of both flexibility and multi-taskability. For each model, we convert the non-conjugate problem to a conditionally conjugate one by using the data augmentation technique, so as to derive efficient inference algorithms with analytical expressions. This work lays a solid foundation for the application of Bayesian spatio-temporal point processes in big data scenarios.
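For orientation, the traditional Hawkes process that the first part of this work extends is usually described through its conditional intensity. With an exponential kernel this reads lambda(t) = mu + sum over past events t_i < t of alpha * exp(-beta * (t - t_i)). The minimal Python sketch below evaluates that intensity. The exponential kernel, the function name `hawkes_intensity`, and the parameter values `mu`, `alpha`, `beta` are illustrative assumptions, not taken from the talk; the flexible time-varying nonlinear model of the abstract is not implemented here.

```python
import math
from typing import Sequence


def hawkes_intensity(t: float, events: Sequence[float],
                     mu: float = 0.5, alpha: float = 0.8,
                     beta: float = 1.2) -> float:
    """Conditional intensity of a linear Hawkes process, exponential kernel.

    lambda(t) = mu + sum_{t_i < t} alpha * exp(-beta * (t - t_i))

    mu, alpha, beta are hypothetical values chosen only for illustration.
    """
    # Each strictly earlier event adds an excitation that decays over time.
    excitation = sum(alpha * math.exp(-beta * (t - ti))
                     for ti in events if ti < t)
    return mu + excitation


if __name__ == "__main__":
    past_events = [1.0, 2.5, 2.7]  # made-up event times
    for t in (0.5, 2.6, 10.0):
        print(f"lambda({t}) = {hawkes_intensity(t, past_events):.4f}")
```

Before any event the intensity equals the baseline `mu`; right after a cluster of events it is elevated, and it decays back toward `mu` as time passes.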

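Wu Ben (吴奔)

Blind source separation for multimodal brain networks

Biography

Wu Ben is a lecturer at the School of Statistics, Renmin University of China, and previously held postdoctoral positions in the Department of Biostatistics and Bioinformatics at Emory University and the Department of Biostatistics at the University of Michigan. Main research interests are Bayesian statistics, independent component analysis, and neuroimaging data analysis. Wu has published in JASA, Biometrics, 中国科学（数学）, 统计研究, 系统工程理论与实践, and other journals, and is working hard on more journals to lengthen this biography.

Abstract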

There has been strong interest in analyzing multimodal brain networks in recent years. Integrating information from multimodal connections can potentially help us better understand the formation and alteration of brain connectors due to neurodevelopment and disease progression. Investigating the interplay among multimodal brain networks is challenging for several reasons, such as the high noise of the imaging data and the different measures of connectivity across modalities. In this talk, we will introduce a new blind source separation method that can be applied to decompose discrete representations of brain networks and achieve joint analysis of multimodal connections. We demonstrate our method with comprehensive simulations and present our findings on functional and structural brain connectivity from a real data study.

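Chen Ze (陈泽)

Multifold Cross-Validation Model Averaging for Generalized Additive Partial Linear Models

Biography

Chen Ze is a Ph.D. student at the School of Statistics, Renmin University of China. Main research interests include variable importance and model averaging.

Abstract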

Generalized additive partial linear models (GAPLMs) are appealing for model interpretation and prediction. However, for GAPLMs, the covariates and the degree of smoothing in the nonparametric parts are often difficult to determine in practice. To address this model selection uncertainty, we develop a computationally feasible model averaging (MA) procedure. The model weights are data-driven and selected by multifold cross-validation (CV), instead of leave-one-out, to save computation. When all the candidate models are misspecified, we show that the proposed MA estimator for GAPLMs is asymptotically optimal in the sense of achieving the lowest possible Kullback-Leibler loss. In the other scenario, where the candidate model set contains at least one quasi-correct model, the weights chosen by the multifold CV are asymptotically concentrated on the quasi-correct models. As a by-product, we propose a variable importance measure, based on the MA weights, to quantify the importance of the predictors in GAPLMs. It is shown to asymptotically identify the variables in the true model. Moreover, when the number of candidate models is very large, a model screening method is provided. Numerical experiments show the superiority of the proposed MA method over some existing model averaging and selection methods.
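The general idea of choosing model averaging weights by multifold CV can be sketched in a much simpler setting than the one in the talk. The Python sketch below is an illustration under stated assumptions, not the authors' procedure: it swaps in two simple linear regression candidates for the GAPLMs and squared-error loss for the Kullback-Leibler loss. The data are simulated, and all function names are hypothetical.

```python
import random


def fit_simple_ols(x, y):
    """Closed-form intercept and slope for the regression y ~ a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def kfold_predictions(x, y, n_folds=5):
    """Out-of-fold predictions: each point is predicted without its own fold."""
    n = len(y)
    preds = [0.0] * n
    for k in range(n_folds):
        test = set(range(k, n, n_folds))  # interleaved folds
        train = [i for i in range(n) if i not in test]
        a, b = fit_simple_ols([x[i] for i in train], [y[i] for i in train])
        for i in test:
            preds[i] = a + b * x[i]
    return preds


def cv_weight_two_models(p1, p2, y):
    """Weight w in [0, 1] on model 1 minimizing the CV squared error
    of the averaged prediction w * p1 + (1 - w) * p2."""
    d = [u - v for u, v in zip(p1, p2)]
    denom = sum(di * di for di in d)
    if denom == 0.0:
        return 0.5  # identical predictions: every weight is equally good
    num = sum(di * (yi - vi) for di, yi, vi in zip(d, y, p2))
    # The objective is a convex quadratic in w, so clipping the
    # unconstrained minimizer gives the minimizer on [0, 1].
    return min(1.0, max(0.0, num / denom))


def cv_error(pred, y):
    """Mean squared out-of-fold prediction error."""
    return sum((yi - pi) ** 2 for yi, pi in zip(y, pred)) / len(y)


# Simulated data (hypothetical): y depends strongly on x1, weakly on x2.
random.seed(0)
n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
y = [1.0 + 2.0 * a + 0.5 * b + random.gauss(0, 0.5) for a, b in zip(x1, x2)]

p_a = kfold_predictions(x1, y)  # candidate A uses x1 only
p_b = kfold_predictions(x2, y)  # candidate B uses x2 only
w = cv_weight_two_models(p_a, p_b, y)
p_avg = [w * u + (1 - w) * v for u, v in zip(p_a, p_b)]
print(f"weight on A: {w:.3f}, weight on B: {1 - w:.3f}")
print(f"CV error A: {cv_error(p_a, y):.3f}, B: {cv_error(p_b, y):.3f}, "
      f"averaged: {cv_error(p_avg, y):.3f}")
```

Because weights 0 and 1 are both admissible, the averaged prediction can never have a larger CV error than the better single candidate. With more than two candidates the weights would be chosen over the whole simplex, which needs a constrained optimizer instead of this closed form.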

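Lai Jizheng (赖基正)

Learning conditional dependence graph for concepts via matrix normal graphical model

Biography

Lai Jizheng is a Ph.D. student at the School of Statistics, Renmin University of China. Main research interests include text mining and concept graph models.

Abstract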

Conditional dependence relationships for random vectors are extensively studied and broadly applied. However, it is not very clear how to construct the dependence graph for unstructured data such as concept words or phrases in a text corpus, where the variables (concepts) are not jointly observed under an i.i.d. assumption. We assume that all the concept vectors learned from GloVe jointly follow a matrix normal distribution with sparse precision matrices. Unlike knowledge graph methods, the conditional dependence graph describes the conditional dependence structure between concepts given all other concepts, which means that the concepts (nodes) linked by edges cannot be separated by other concepts. It represents an essential semantic relationship. A penalized matrix normal graphical model (MNGM) is then employed to learn the conditional dependence graph for both the concepts and the embedding 'dimensions'. Since the concept words are nodes in our graph, which has a huge dimension, we employ the MDMC optimization method to speed up the glasso algorithm. In addition, we propose a sentence-granularity bootstrap to obtain 'independent' repeats of samples to enhance the penalized MNGM algorithm. We name the proposed method Matrix-GloVe. In simulation studies, we check that the graph learned by Matrix-GloVe is more suitable for Graph Convolutional Networks (GCN) than a correlation graph. We apply the proposed method to two real-data scenarios and obtain good results.