Invited Session (13) | The First Conference on Machine Learning and Statistics

Deep Learning in the Era of Large-scale Models


August 26, 2023, 13:30-15:00


Room 201, Wenshi Building, Putuo Campus, East China Normal University

Organizer:

Ling Zhou (周岭), Southwestern University of Finance and Economics

Bin Liu (刘斌), Southwestern University of Finance and Economics

Title: Customizing personal large-scale language model using co-occurrence statistic information

Abstract: In text generation, a large language model (LLM) chooses each new word based only on the previously selected words in its context, using the softmax function. However, the co-occurrence statistics of words in a scene-specific corpus are also valuable for choosing the next word, because they help match the topic of the generated text to the current task. To fully exploit this information, we propose a graphsoftmax function for task-specific text generation. The final word choice is then determined by both the global knowledge from the LLM and the local knowledge from the scene-specific corpus. To achieve this, we regularize the traditional softmax function with a graph total variation (GTV) term, which incorporates the local knowledge into the LLM. The proposed graphsoftmax can be plugged into a large pre-trained LLM for text generation and machine translation. Through experiments, we demonstrate that the new GTV-based regularization yields better performance than existing methods. Human testers can also easily distinguish text generated with graphsoftmax from text generated with softmax.
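As a rough illustration of the idea in this abstract, the sketch below regularizes softmax with a graph total variation penalty. It is not the speaker's implementation. It assumes the variational form of softmax (maximize <p, z> + H(p) over the probability simplex), subtracts a hypothetical penalty lam * sum_ij w_ij |p_i - p_j| over co-occurrence edges, and solves the problem with exponentiated-gradient steps. The names `graph_tv_softmax`, `edges`, `lam`, `step`, and `iters` are illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sign(x):
    return (x > 0) - (x < 0)


def graph_tv_softmax(logits, edges, lam, step=0.5, iters=200):
    """Approximately maximize <p, z> + H(p) - lam * sum_ij w_ij |p_i - p_j|.

    `edges` maps (i, j) index pairs to nonnegative co-occurrence weights.
    This data format and the objective itself are assumptions made for
    illustration, not the formulation from the talk.
    """
    n = len(logits)
    p = [1.0 / n] * n
    for _ in range(iters):
        # Subgradient of the graph total variation term.
        tv_grad = [0.0] * n
        for (i, j), w in edges.items():
            s = sign(p[i] - p[j])
            tv_grad[i] += w * s
            tv_grad[j] -= w * s
        # Gradient of the objective; the constant -1 from the entropy term
        # is dropped because the normalization below removes it.
        grad = [logits[k] - math.log(p[k]) - lam * tv_grad[k] for k in range(n)]
        # Exponentiated-gradient step: the update stays on the simplex.
        p = softmax([math.log(p[k]) + step * grad[k] for k in range(n)])
    return p


logits = [2.0, 1.0, 0.5]   # hypothetical LLM scores for three words
edges = {(0, 1): 1.0}      # words 0 and 1 co-occur in the task corpus
plain = softmax(logits)
smoothed = graph_tv_softmax(logits, edges, lam=0.3)
```

With `lam=0` the iteration converges to the ordinary softmax. With `lam > 0` the probabilities of linked words are pulled toward each other, which is one way local corpus knowledge could reshape the LLM's distribution.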

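Bio: Bin Liu received his bachelor's degree in Information and Computing Science from Liaoning University of Technology, and his master's degree in Software Engineering and doctoral degree in Computer Software and Theory from the University of Electronic Science and Technology of China. During his doctoral studies he was a jointly supervised PhD student at the University of British Columbia, and he was later a postdoctoral researcher at the University of Hong Kong. He joined Southwestern University of Finance and Economics in 2018. His research interests are machine learning and data mining.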

Shaogao Lv (吕绍高), Nanjing Audit University

Title: Robust Structure Learning And L_p-Regularization For Graph Neural Networks

Abstract: Graph neural networks (GNNs) have become one of the most important branches of deep learning, owing to their remarkable power in learning from graph-structured data. This report has two parts. First, we provide a lower bound on the Rademacher complexity of two-layer graph convolutional networks (GCNs), which motivates the proposed robust algorithm for recovering graph structure and performing learning tasks in GCNs. Second, we quantify the trade-off between smoothness and sparsity in GCNs, with the help of a new L_p-regularized (1 < p ≤ 2) stochastic learning scheme proposed in this work. For a single-layer GCN, we develop an explicit theoretical understanding of L_p-regularized stochastic learning by analyzing the stability of the regularized stochastic algorithm. Finally, several empirical experiments validate our theoretical findings.
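To make the L_p-regularized single-layer setting concrete, here is a minimal sketch. It is not the algorithm from the talk. It assumes the common symmetric normalization D^{-1/2}(A + I)D^{-1/2} for the graph, a linear layer without a nonlinearity, and the penalty sum |w|^p with 1 < p ≤ 2. All function names are illustrative.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a dense adjacency matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain dense matrix product for lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gcn_layer(adj_norm, features, weights):
    """Single linear GCN layer: A_hat @ X @ W (nonlinearity omitted)."""
    return matmul(matmul(adj_norm, features), weights)


def lp_penalty(weights, p):
    """Assumed penalty: sum of |w|^p over all weights, with 1 < p <= 2."""
    return sum(abs(w) ** p for row in weights for w in row)


def lp_penalty_grad(weights, p):
    """Gradient of the penalty; it is differentiable at 0 because p > 1."""
    return [[p * abs(w) ** (p - 1) * math.copysign(1.0, w) if w != 0 else 0.0
             for w in row] for row in weights]


adj = [[0.0, 1.0], [1.0, 0.0]]        # two nodes joined by one edge
a_norm = normalize_adjacency(adj)
out = gcn_layer(a_norm, [[1.0], [3.0]], [[2.0]])
```

Values of p near 1 make the penalty behave more like an L_1 term that favors sparse weights, while p = 2 gives the usual ridge-style smoothness. This is the trade-off the abstract refers to.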


Fengmao Lv (吕凤毛), Southwest Jiaotong University



Bio: Fengmao Lv is currently an Associate Professor at the School of Computing and Artificial Intelligence, Southwest Jiaotong University, China. His research interests are in multimodal deep learning, transfer learning, and their applications in computer vision, natural language processing, and social network analysis. He has openings for self-motivated undergraduate, master's, and PhD (co-supervised) students. Feel free to contact him if you are interested in his research area.

Jingran Zhou (周井然), Southwestern University of Finance and Economics

Title: Supervised Random Feature Regression via Projection Pursuit

Abstract: Random feature methods and neural network models are two popular nonparametric modeling approaches, regarded as representatives of shallow learning and neural networks, respectively. In practice, random feature methods lack the capacity for feature learning, while neural network methods are computationally heavy. This paper proposes a flexible yet computationally efficient method for general nonparametric problems. Specifically, the proposed method is a feed-forward two-layer nonparametric estimator. The first layer learns a series of univariate basis functions for each projection variable and then searches for the optimal linear combination within each group of learned functions. Based on all the features derived in the first layer, the second layer learns a single-index function with an unknown activation function. Our estimator takes advantage of both random features and neural networks and can be seen as a bridge between them.
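For readers unfamiliar with the random feature baseline mentioned in this abstract, the sketch below shows the classical random Fourier feature construction for the Gaussian kernel. It illustrates the unsupervised baseline only, not the supervised projection pursuit method of the talk. The function name, the unit kernel bandwidth, and the sample points are assumptions.

```python
import math
import random


def make_random_fourier_features(dim, num_features, seed=0):
    """Build a map z(x) with z(x) . z(y) ~ exp(-||x - y||^2 / 2).

    Frequencies are drawn once at random from N(0, I) and never trained,
    which is why plain random features do not learn features from data.
    """
    rng = random.Random(seed)
    freqs = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
             for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def feature_map(x):
        return [scale * math.cos(sum(w * xi for w, xi in zip(ws, x)) + b)
                for ws, b in zip(freqs, phases)]

    return feature_map


feature_map = make_random_fourier_features(dim=2, num_features=4000)
x, y = [0.3, -0.2], [0.1, 0.4]
approx = sum(a * b for a, b in zip(feature_map(x), feature_map(y)))
exact = math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / 2.0)
```

A linear or ridge regression fitted on `feature_map(x)` then approximates kernel regression at a much lower cost. The abstract's proposal can be read as replacing these fixed random directions with learned projections.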






