Redian News

[Medical/Health] Vertical Domains and Language Models: ChatDoctor (Part 1)

From the WeChat official account: 看个通俗理解吧

Vertical Domains and Language Models

Using Language Models in Specific Domains (2)
[Medical/Health] ChatDoctor (Part 1)

This series sticks to a plain-language style, describing each topic in terms that are as short, simple, and easy to understand as possible. It focuses on work that applies language models to vertical (specific) domains.

[Download] PDF slides: https://github.com/createmomo/Open-Source-Language-Model-Pocket

Table of Contents:

1 Introduction

  • 1.1 Power of Language Models
  • 1.2 Questioning: Are You Sure About Specific Domains?

2 Essential: Domain-specific Training Data

[Medical/Health]

  • 2.1 ChatDoctor (Part 1) (←)
  • 2.1 ChatDoctor (Part 2)
  • 2.1 ChatDoctor (Part 3)
  • 2.2 MedicalGPT-zh
  • 2.3 SoulChat
  • 2.4 DoctorGLM
  • 2.5 BenTsao
  • 2.6 QiZhenGPT
  • 2.7 HuaTuoGPT
  • 2.8 BianQue
  • 2.9 MedicalGPT
  • More (to be confirmed)

2 Essential: Domain-specific Training Data

Applying a language model to a vertical domain requires thinking through at least two things:

  • What specific tasks we want the language model to perform, and in what specific application scenarios it will be used
  • How to obtain and use training data that serves those tasks and scenarios

Following this thread, we will walk through how some existing work addresses these two questions.

2.1 ChatDoctor (Part 1)

The authors of ChatDoctor found that ChatGPT's answers to healthcare questions were sometimes inaccurate, leaving it a long way from being a proper AI doctor.

(Beyond that, I also found that ChatGPT has a certain chance of simply refusing to answer medical or health questions.)

ChatDoctor's scenario: turn a language model into a reasonably qualified AI doctor that can carry out patient-doctor conversations:

  • Patients → state their needs;
  • ChatDoctor → provides decent-quality advice, diagnoses, medication suggestions, etc.

Model Training

1) Original LLaMA → Fine-tuned LLaMA

LLaMA is essentially a plain language model; conversational chat and instruction following are not among its strengths.

To strengthen these skills, ChatDoctor does not rush straight into fine-tuning on doctor-patient dialogue. It first fine-tunes on general-domain instruction-following data, so that LLaMA acquires better abilities to hold a conversation and follow instructions.

2) Fine-tuned LLaMA → Final Fine-tuned LLaMA

Once step 1) is complete, the model is fine-tuned again on the prepared domain data (doctor-patient dialogues).
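The two stages differ only in the data fed to the trainer, so both kinds of examples can be rendered through one prompt template. A minimal sketch is below; the template and example texts are common instruction-tuning conventions assumed for illustration, not taken from the ChatDoctor repository:

```python
# Render (instruction, input, response) triples into training strings.
# The "### Instruction / ### Input / ### Response" layout is a widely used
# instruction-tuning format, assumed here for illustration.

def build_prompt(instruction: str, user_input: str, response: str) -> str:
    """Render one training example as a single prompt string."""
    return (
        "Below is an instruction that describes a task, paired with an input.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        f"### Input:\n{user_input}\n\n"
        f"### Response:\n{response}"
    )

# Stage 1: a generic instruction-following example
stage1 = build_prompt(
    "Summarize the following sentence.",
    "Large language models can follow natural-language instructions.",
    "LLMs can follow instructions.",
)

# Stage 2: a doctor-patient dialogue cast into the same format
stage2 = build_prompt(
    "You are a doctor. Answer the patient's question.",
    "I have had a sore throat and mild fever for two days. What should I do?",
    "This sounds like a possible viral infection; rest and drink fluids.",
)
```

The point of reusing one template is that stage 2 only shifts the data distribution, while the model keeps the instruction-following format it learned in stage 1.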

Data

So where does this training data come from, and how?

1) Find readily available data

Ready-made doctor-patient dialogue data is available from the online medical consultation site HealthCareMagic; the conversations are real, not synthetic.

2) Clean the data with manual and automated processes

  • Remove personally identifiable information of doctors and patients
  • Use an auto-correction tool to fix grammatical errors
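The automated part of such cleaning can be as simple as pattern-based scrubbing. Here is a minimal sketch; the regexes and placeholder tokens are illustrative assumptions, since the paper does not publish its cleaning scripts:

```python
import re

# Replace obvious identifiers in dialogue transcripts with placeholders.
# These two patterns (emails, US-style phone numbers) are only examples;
# a real pipeline would cover names, addresses, IDs, etc.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE = re.compile(r"\b\d{3}[-. ]?\d{3}[-. ]?\d{4}\b")

def scrub(text: str) -> str:
    """Replace obvious identifiers with placeholder tokens."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text

example = "Contact Dr. Smith at smith@example.com or 555-123-4567."
print(scrub(example))
# -> Contact Dr. Smith at [EMAIL] or [PHONE].
```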

In the end, 100k examples were obtained for fine-tuning the model.

3) Test set for performance evaluation

To demonstrate that ChatDoctor is indeed an improvement over ChatGPT at providing medical advice, the work also prepares a dataset that was not seen during training.

The core idea: feed the same questions from this dataset to both ChatDoctor and ChatGPT, then compare which model's answers are better. We will go into the evaluation process in a later part.

Training Settings

  • 6 x A100 GPUs
  • 3 hours
  • Batch Size 192
  • 3 epochs
  • Max Sequence Length 512 tokens
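For a sanity check, these settings imply a fairly small optimization budget (assuming all 100k fine-tuning examples are used each epoch):

```python
# Back-of-the-envelope training budget implied by the settings above.
dataset_size = 100_000
batch_size = 192
epochs = 3

steps_per_epoch = -(-dataset_size // batch_size)  # ceiling division
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)
# -> 521 1563
```

Roughly 1.5k optimizer steps in 3 hours is consistent with this being a light fine-tune rather than pre-training.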

Inference

Relying solely on the fine-tuned model's parameters to memorize medical knowledge and the style of dialogue would probably not be enough.

During inference, the model responds better (with more accurate, reliable content) if it can access external resources and extract knowledge from them that is closely related to the user's question.

ChatDoctor prepares two such external resources: a disease knowledge base and Wikipedia.

The format of the disease knowledge base can be seen in the figure below; it roughly contains the disease name, symptoms, further tests and measures that can be taken, available medications, and so on.

So how does ChatDoctor interact with these two kinds of external knowledge?

In this work, the interaction is fairly direct and simple, without using text-embedding techniques, but it is still a useful reference for us.
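Embedding-free retrieval of this kind can be sketched as simple word overlap between the question and each knowledge-base entry. The tiny KB and scoring rule below are illustrative assumptions, not the authors' implementation:

```python
import re

# A toy disease knowledge base: name -> symptoms / tests / medications.
KB = {
    "influenza": "Symptoms: fever, cough, sore throat. Tests: rapid flu test. "
                 "Medications: oseltamivir.",
    "migraine": "Symptoms: throbbing headache, nausea, light sensitivity. "
                "Medications: triptans.",
}

def words(text: str) -> set:
    """Lowercase word set, ignoring punctuation."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(question: str, top_k: int = 1) -> list:
    """Rank KB entries by word overlap with the question, no embeddings."""
    q = words(question)
    scored = sorted(
        KB.items(),
        key=lambda kv: len(q & words(kv[0] + " " + kv[1])),
        reverse=True,
    )
    return [name for name, _ in scored[:top_k]]

print(retrieve("I have a fever and a bad cough"))
# -> ['influenza']
```

The retrieved entry would then be placed into the model's prompt alongside the user's question, so the answer can be grounded in the knowledge base rather than in the model's parameters alone.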

The next post will describe the interaction with these external knowledge resources in more detail.

(To be continued)

