<html> <head>
<meta http-equiv="Content-Type" content="text/html">
<meta name="description" content="***">
<meta name="keywords" content="Ruihui Zhao, 赵瑞辉, Douyin Comments, natural language processing, information security">
<meta charset="UTF-8">
<body font-size=8px>
<title> Ruihui Zhao, 赵瑞辉, Douyin Comments, natural language processing, information security</title>
</head>
<body>
<table>
<tr><th></th><th></th><th></th>
<tr>
<td><img src="ruihui.jpeg" width=200 height=200 alt="a photo"></td>
<td> </td>
<td><h1> Ruihui Zhao (赵瑞辉) </h1>
</p>
<a style="text-decoration:none" href="xxx" target="_blank">Douyin Comments, Shenzhen Team</a></br>
<strong>Algorithm Lead</strong></br>
</br>
Email: [email protected]</br>
Phone: +86-156-8088-2195
</td>
</table>
<hr>
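<h3>About Me</h3>
I am the algorithm lead of the Douyin Comments team in Shenzhen, responsible for content safety and for improving the tone of the community across comments, live streams, bullet comments (danmaku), and similar content in the company's products. I manage the team and its projects, and I lead research collaborations with industry partners and the productionization of recent research papers. I have published more than <b>20</b> papers at top-tier conferences and journals (e.g., ACL, WWW, AAAI, IJCAI, CIKM, NAACL, EMNLP, ICDE, TII, TOIT),
with <b>191</b> citations and an h-index of 9, have filed more than <b>100</b> invention patent applications, serve as a reviewer for several top-tier conferences and journals, and have won first place in several industry competitions.<br>
Previously, I was a senior researcher at <a class="grey" href="https://jarvislab.tencent.com">Tencent Jarvis Lab</a> (2018.05-2022.09), where I worked on NLP and machine learning and on their business applications in information security and medical big data. Before that, I was an algorithm engineer in the NLP group of the Sinovation Ventures AI Institute (2017-2018).
I received my master's degree (research-based, 2017) from the <a href="http://www.iwaihara-lab.org/pub/">Data Engineering Laboratory</a> at Waseda University, Japan, and my bachelor's degree in information security from the University of Electronic Science and Technology of China (2015).<br>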
<!-- <hr>
<h4>Research Interests</h4>
<li>Natural Language Processing: Medical Pre-training Language Modeling, Knowledge Distillation, Text Matching, Taxonomy Construction and Expansion, Question Generation and Answering, etc. </li>
<li>Machine Learning Techniques: Time Series Anomaly Detection, Community Detection, Graph Embeddings, Model Interpretability, etc. </li>
<li>Information Security: Federated Learning, Searchable Encryption </li>
-->
<hr>
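<h3>Work Experience</h3>
<!-- <table class="imgtable"><tr><td> -->
<td align="left"><h4>Algorithm Lead, Shenzhen Jinri Toutiao Technology Co., Ltd., Data - Douyin - Comments (2022.9 ~ present)</h4>
Main responsibilities:
<ul>
<li> Responsible for content safety and community-tone improvement across comments, live streams, and bullet comments in the company's products, e.g., the tone of the Douyin comment ecosystem, the Douyin cyberbullying initiative, and the Douyin multimodal depression and suicide-risk detection initiative; </li>
<li> Apply state-of-the-art machine learning algorithms (NLP, CV, text mining, etc.) across many products to solve real business problems; </li>
<li> Lead the team in working with cross-functional departments through the full project life cycle, including requirements discussion, system design, algorithm development, NLP/CV model development and iteration, and the launch of model applications and services; </li>
<li> Manage the team and projects, help team members grow, and bring out the team's potential; </li>
<li> Lead research collaborations with industry partners and the productionization of recent research papers. </li>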
</ul>
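<td align="left"><h4>T10 Senior Researcher, Tencent Technology (Shenzhen) Co., Ltd., Jarvis Lab, Medical Big Data Group (2018.5 ~ 2022.9)</h4>
<!-- <ul> -->
As a <b>core founding member</b> of Jarvis Lab, I led five full-time employees on a dotted-line basis, exploring <b>business breakthroughs</b>, project deployment, and research for NLP and machine learning in information security and healthcare: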
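<li><b>Smart medical insurance anti-fraud, technical lead</b></a>
<ul>
<li><b>Background and challenges</b></li>
<ul>
<li> Existing supervision systems cannot directly detect insurance-fraud rings; enforcement teams rely mainly on public tip-offs to find them </li>
<li> Existing big-data solutions in the industry cannot directly establish that a case is fraud, and the requirements for model interpretability are high </li>
<li> Medical insurance electronic certificates have been activated <b>1 billion</b> times across all channels, giving fraud rings a new channel; fraudulent-use cases began to appear late last year </li>
</ul>
<li><b>Approach and main work</b></li>
<ul>
<li> <b>Industry-first</b> solutions for fraud-ring detection and for real-time risk control of electronic certificates. The <b>innovations</b> are: 1) proactive detection, 2) ring ranking, 3) ring expansion, 4) discovery of new ring-fraud patterns </li>
<li> Designed the architecture and solution, which includes graph community detection (Attractor, bipartite graphs, etc.), time-series anomaly detection, model interpretability (ECOD, etc.), and label propagation </li>
<li> Designed from scratch 230 business profile indicators covering visit profiles, individual profiles, and ring profiles, used to rank suspicious rings and individuals </li>
<li> Built an <b>in-house accelerated clustering algorithm</b> on top of faiss, DBSCAN, etc., with a <b>~10x</b> speedup, and resolved out-of-memory problems </li>
</ul>
<li><b>Results and highlights</b></li>
<ul>
<li> The solution won <b>first place nationally (1/156, the industry's highest award)</b> in the 2022 National Healthcare Security Administration solution competition, the Leading Scientific and Technological Achievement Award at the 2021 Big Data Expo (49/560, <b>national social science and technology award</b>), and the gold award at the 2020 Beijing bureau Digital-Intelligent Medical Insurance competition (the industry's first such competition), establishing a leading position in smart medical insurance </li>
<li> Deployed in production at the Beijing bureau, where it uncovered <b>more than 10</b> fraud rings; as verified by the enforcement team, <b>accuracy was 100% and all were high-value rings</b>. It also found a <b>new type of very large ring</b> involving an amount in the <b>tens of millions</b>, and the client sent a verification of results and a letter of thanks </li>
<li> Found a breakthrough for Tencent in the smart medical insurance market; with the support of the vice president of Tencent Healthcare, the Zhifatong (执法通) product was approved as a project and has been deployed in Guangdong Province, Beijing, Zhengzhou, and elsewhere </li>
<li> The project received 4.2 million impressions in total, reached 1.83 million people in the medical insurance and healthcare sectors, and was covered in 178 media reports, including core Party and central media such as Guangming Online, China News Service, and CRI Online, with a repost rate above 356%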
</ul>
</ul>
</li>
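<li><b>Tencent Health: health information search & medical literature search, technical lead</b></a>
<ul>
<li><b>Background and challenges</b></li>
<ul>
<li> General-domain search engines such as Google and Bing all use the pre-trained language model BERT </li>
<li> Medical search faces several challenges: 1) interpretability of ranking results, 2) high ranking-quality requirements because content sources are few, 3) ranking latency must be kept within 80 ms to 100 ms </li>
<li> Existing academic search tools such as Google Scholar, Microsoft Academic, AMiner, and SciBERT cannot automatically generate literature reading paths, and their search results have low precision when compared against the reference lists in survey papers </li>
</ul>
<li><b>Approach and main work</b></li>
<ul>
<li> Used LTD-BERT, a language model based on knowledge distillation, together with continual learning to provide lightweight sentence-level semantic features, and used semantic hashing to speed up retrieval </li>
<li> Used self-attention to provide word-level interaction features, and contrastive learning (e.g., ConSERT) to address the semantic collapse problem </li>
<li> Provided ranking interpretability with a modified ELI5 and learning-to-rank (L2R) algorithms such as LambdaMART </li>
<li> Proposed the RePaGer system to generate reading paths automatically, selecting paths on the citation graph with a modified NEWST algorithm </li>
</ul>
<li><b>Results and highlights</b></li>
<ul>
<li> Average DAU of 490,000; compared with Tencent AI Lab's NSE algorithm, online click-through rate <b>improved by 3.8%</b> in A/B testing </li>
<li> Developed LTD-BERT in-house and added interaction features; it matches BERT in quality, with an F1 of 92.5%, while being <b>~25x faster</b> </li>
<li> Independently built and launched an interpretability tool with Streamlit that shows feature-importance scores and rankings, feature names, feature values, and related information, which makes online debugging easier </li>
<li> Open-sourced the BERT component and the LTD-BERT algorithm on Venus, the company-wide compute platform; more than <b>17 teams</b> have used them to date, and the work won the <b>gold award</b> in Tencent's AI algorithm selection </li>
<li> The solution automatically generates literature reading paths and introduces prerequisite-knowledge chains that tell users which background material to read first; compared with Google Scholar, Microsoft Academic, and others, it improves precision by more than <b>10%</b> </li>
<li> Two semantic hashing papers were <b>accepted at the top NLP conferences ACL and EMNLP</b>, the reading-path paper was <b>accepted at the top data mining conference ICDE</b>, and the SurveyBank dataset was open-sourced </li>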
</ul>
</ul>
</li>
</li>
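<li><b>Tencent Health: COVID-19 vaccine Q&A, project initiator & technical lead</b></a>
<ul>
<li><b>Background and challenges</b></li>
<ul>
<li> In early 2021 the industry had no intelligent Q&A system for COVID-19 vaccines; I initiated the project and launched the system as technical lead </li>
<li> Official vaccination information is updated continually at irregular intervals, so medical experts must spend time regularly looking up changes and updating the vaccine knowledge tree </li>
<li> The project started from scratch with no corpus of user question phrasings, so varied phrasings had to be generated algorithmically </li>
</ul>
<li><b>Approach and main work</b></li>
<ul>
<li> Automatically expanded and completed the nodes of the vaccine knowledge tree with the in-house taxonomy expansion (HEF) and taxonomy completion (QEN) algorithms </li>
<li> Automatically generated questions of controllable difficulty through step-by-step rewriting, using an in-house question generation algorithm </li>
<li> Classified user queries with the in-house LTD-BERT and the taxonomy tree, and returned the corresponding standard answer to the user </li>
</ul>
<li><b>Results and highlights</b></li>
<ul>
<li> The industry's first <a class="grey" href="https://mp.weixin.qq.com/s/kKjfV0bMGOx_Co-BhNdqrg">intelligent COVID-19 vaccine Q&A system</a>, launched on Tencent Health and WeChat Search, with a peak DAU of <b>800,000</b> </li>
<li> The in-house HEF model improved accuracy by <b>46.7%</b> and MRR by <b>32.3%</b> over the best existing approach </li>
<li> The taxonomy algorithms were <b>accepted at the top data mining conference WWW</b>, and the question generation algorithm was <b>accepted at the top NLP conference ACL</b> </li>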
</ul>
</ul>
</li>
</li>
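<li><b>Other representative projects</b></a>
<ul>
<li><b>Tencent Health: insurance recommendation for medical insurance electronic certificates, technical lead</b></li>
<ul>
<li> More than <b>500 million</b> users have activated the medical insurance electronic certificate on WeChat. The project aims to select high-quality users to whom the insurance page entry point is shown and to rank insurance products in a personalized way, in order to raise the purchase rate </li>
<li> Worked with product and engineering from zero to one: delivered the overall technical solution and architecture, defined business goals and processes, and broke the work down into a delivery system, a recommendation system, and a tagging system </li>
<li> Built a tag and profile system from data of different sources, including basic features, behavioral features, cross features, payment features, and SaaS medicine-purchase features</b> </li>
<li> Developed and launched the delivery and recommendation systems using Lookalike, L2R, multi-task learning, and related techniques </li>
</ul>
<li><b>Medical OCR, core algorithm team member</b></li>
<ul>
<li> Experimented with introducing a Transformer; the VGG + Transformer + CTC model reached <b>91.66% word accuracy, 97.81% character accuracy, and 98.36% character recall</b> </li>
<li> Owned the post-processing error-correction model, which mimics how people use context to first detect and then correct errors; <b>detection F1 reached 87% and correction F1 reached 85.7%</b>. Also open-sourced a medical spelling-correction dataset, which was accepted to the CIKM'22 resource track </li>
</ul>
<li><b>Medical federated learning, project initiator & technical lead</b></li>
<ul>
<li> <a class="grey" href="https://tech.qq.com/a/20200413/006866.htm">The industry's first medical federated learning case</a>, predicting stroke from the real hospital data of a city, with an AUC of 80% </li>
<li> Co-founded a joint laboratory with the AI department of WeBank </li>
<li> Combined the approach with Transformer, One-Versus-All, and other methods, resulting in papers in TII (<b>SCI Q1, IF=10.2</b>), TOIT, IJCNN, IJCAI'FML, and others </li>
</ul>
<li><b>Influenza forecasting, algorithm lead</b></li>
<ul>
<li> In collaboration with the respiratory research institute of <b>Academician Zhong Nanshan</b>, forecast the influenza virus positivity rate for the coming week to inform disease-control policy </li>
<li> Using weekly influenza data from Hong Kong, applied XGBoost, SARIMA, SIR, and other models, and engineered features related to time series, slope, weather, and season </li>
<li> Evaluation metrics were MAE=1.1% and MAPE=18.03%, improvements of <b>11.2%</b> and <b>1.72%</b>, respectively, over an existing approach (based on Chongqing data and published in The Lancet) </li>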
</ul>
</ul>
</li>
<!-- </ul> -->
<!-- </td></tr></table> -->
<hr>
<!-- <table class="imgtable"><tr><td> -->
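<td align="left"><h4>NLP Algorithm Engineer, Sinovation Ventures, AI Institute, NLP Group (2017.9 ~ 2018.5)</h4>
</li>
<li><b>Hit-product prediction, algorithm lead</b></a>
<ul>
<li><b>Background and challenges</b></li>
<ul>
<li> In 2017 the viral dessert "dirty bun" (脏脏包) suddenly became a hit. This system automatically discovers new terms in the dessert domain from social media and predicts whether they will become hits, providing merchants with product sentiment monitoring and hit-product recommendations </li>
<li> The domain dictionary had to be built from scratch and regularly maintained and updated, with new terms discovered automatically </li>
<li> Hits are hard to attribute to causes, and there is very little literature or deployed industry work on burst prediction </li>
</ul>
<li><b>Approach and main work</b></li>
<ul>
<li> As <b>algorithm lead</b>, proposed a two-stage approach based on social network text: 1) automatic domain-dictionary construction and new-term discovery, 2) burst prediction </li>
<li> Automatic domain-dictionary construction and new-term discovery: built a co-occurrence graph from hashtag co-occurrence and word boundaries, ran a community detection algorithm on it to find communities and new terms, and filtered the candidates with text classifiers (HAN, etc.) </li>
<li> Burst prediction: defined five life-cycle stages (Trigger, Burst, Off-Burst, Off-Trigger, Death), then built features and classifiers with time-series algorithms </li>
</ul>
<li><b>Results and highlights</b></li>
<ul>
<li> New-term recall reached <b>97%</b>, with 25 desserts discovered, e.g., magic fondant, snow-skin mooncakes, and fondant figurines </li>
<li> Text classification precision reached 96.7%, with the HAN model far outperforming WordCNN, DPCNN, Bi-GRU, and other models </li>
<li> Burst prediction accuracy reached <b>85%</b> </li>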
</ul>
</ul>
</li>
<!-- <li>Participated in a Task-oriented Dialogue System, designed the algorithm of NER/ID/DST;</li> -->
<!-- <li>Owned a burst prediction system, designed the algorithm of constructing domain dictionary and hashtag-based burst prediction.</li> -->
<!-- </td></tr></table> -->
<hr>
<h3> Selected Publications <a class="grey" href="https://scholar.google.com/citations?user=0okAFQMAAAAJ&hl=en">[Google Scholar]</a></p></h3>
<h4>2022</h4>
<a class="grey" href="xxx"><i>Identify Event Causality with Knowledge and Analogy</i></a><img src="new.jpeg" width="25" height="19"><br/>
Sifan Wu, <strong>Ruihui Zhao</strong>, etc.<br/>
The 37th AAAI Conference on Artificial Intelligence (<strong>AAAI-2023, CCF-A</strong>), Washington, DC, USA<br/></p>
<!-- <a class="grey" href="https://github.com/ha-lins/GEML-MDG">[Codes]</a></p> -->
<a class="grey" href="xxx"><i>Semi-supervised Credit Card Fraud Detection via Attribute-driven Graph Representation</i></a><img src="new.jpeg" width="25" height="19"><br/>
Sheng Xiang, Mingzhi Zhu, Dawei Cheng, Enxia Li, <strong>Ruihui Zhao</strong>, etc.<br/>
The 37th AAAI Conference on Artificial Intelligence (<strong>AAAI-2023, CCF-A</strong>), Washington, DC, USA<br/></p>
<!-- <a class="grey" href="https://github.com/ha-lins/GEML-MDG">[Codes]</a></p> -->
<a class="grey" href="https://arxiv.org/abs/2210.04242"><i>Improving Multi-turn Emotional Support Dialogue Generation with Lookahead Strategy Planning</i></a><br/>
Yi Cheng, Wenge Liu, Wenjie Li, Jiashuo Wang, <strong>Ruihui Zhao</strong>, etc.<br/>
The 2022 Conference on Empirical Methods in Natural Language Processing (<strong>EMNLP-2022, CCF-B</strong>), Abu Dhabi<br/></p>
<!-- <a class="grey" href="https://github.com/J-zin/DHIM">[Codes]</a></p> -->
<a class="grey" href="https://j-zin.github.io/files/MCSCSet.pdf"><i>MCSCSet: A Specialist-annotated Dataset for Medical-domain Chinese Spelling Correction</i></a><br/>
Wangjie Jiang, Zhihao Ye, Zijing Ou, <strong>Ruihui Zhao</strong>, etc.<br/>
ACM International Conference on Information and Knowledge Management 2022 (<strong>CIKM-2022 Resource Track, CCF-B</strong>), Atlanta, Georgia, USA.<br/></p>
<a class="grey" href="https://arxiv.org/abs/2106.09645"><i>Prototypical Graph Contrastive Learning</i></a><br/>
Shuai Lin, Chen Liu, Ziyuan Hu, Pan Zhou, Shuojia Wang, <strong>Ruihui Zhao</strong>, Yefeng Zheng, Liang Lin, Eric Xing, Xiaodan Liang<br/>
IEEE Transactions on Neural Networks and Learning Systems (<strong>TNNLS-2022, CCF-B, IF=10.47</strong>)<br/></p>
<a class="grey" href="https://arxiv.org/abs/2110.06354"><i>Tell Me How to Survey: Literature Review Made Simple with Automatic Reading Path Generation</i></a><br/>
Jiayuan Ding, Tong Xiang, Zijing Ou, Wangyang Zuo, <strong>Ruihui Zhao</strong>, Chenghua Lin, Yefeng Zheng and Bang Liu<br/>
IEEE International Conference on Data Engineering 2022 (<strong>ICDE-2022 Industry Track, CCF-A</strong>), Virtual Conference, hosted by Kuala Lumpur, Malaysia<br/></p>
<!-- <a class="grey" href="xxx">[Codes]</a></p> -->
<a class="grey" href="https://dl.acm.org/doi/pdf/10.1145/3485447.3511943"><i>QEN: Applicable Taxonomy Completion via Evaluating Full Taxonomic Relations</i></a><br/>
Suyuchen Wang, <strong>Ruihui Zhao</strong>, Yefeng Zheng and Bang Liu<br/>
The Web Conference (<strong>WWW-2022, CCF-A</strong>), Virtual Conference, hosted by Lyon, France<br/></p>
<!-- <a class="grey" href="xxx">[Codes]</a></p> -->
<a class="grey" href="https://arxiv.org/pdf/2204.13953.pdf"><i>"My nose is running." "Are you also coughing?": Building A Medical Diagnosis Agent with Interpretable Inquiry Logics</i></a><br/>
Wenge Liu, Jianheng Tang, Hao Wang, Yi Cheng, Yafei Liu, <strong>Ruihui Zhao</strong>, Xi Chen, Yefeng Zheng, Xiaodan Liang<br/>
International Joint Conference on Artificial Intelligence (<strong>IJCAI-2022, CCF-A</strong>), Vienna, Austria<br/></p>
<a class="grey" href="https://arxiv.org/pdf/2204.00843.pdf"><i>Privacy-preserving Anomaly Detection in Cloud Manufacturing via Federated Transformer</i></a><br/>
Shiyao Ma, Jiangtian Nie, Jiawen Kang, Lingjuan Lyu, Ryan Wen Liu, <strong>Ruihui Zhao</strong>, Ziyao Liu, Dusit Niyato<br/>
IEEE Transactions on Industrial Informatics (<strong>TII-2022, CCF-A, IF=10.215</strong>)<br/></p>
<h4>2021</h4>
<a class="grey" href="https://arxiv.org/abs/2109.02867"><i>Refining BERT Embeddings for Document Hashing via Mutual Information Maximization</i></a><br/>
Zijing Ou, Qinliang Su, Jianxing Yu, <strong>Ruihui Zhao</strong>, Yefeng Zheng and Bang Liu<br/>
The 2021 Conference on Empirical Methods in Natural Language Processing (<strong>EMNLP-2021 findings, CCF-B</strong>), Punta Cana, Dominican Republic<br/>
<a class="grey" href="https://github.com/J-zin/DHIM">[Codes]</a></p>
<a class="grey" href="https://arxiv.org/pdf/2105.13066.pdf"><i>Integrating Semantics and Neighborhood Information with Graph-Driven Generative Models for Document Retrieval</i></a><br/>
Zijing Ou, Qinliang Su, Jianxing Yu, Bang Liu, Jingwen Wang, <strong>Ruihui Zhao</strong>, Changyou Chen and Yefeng Zheng<br/>
The 59th Annual Meeting of the Association for Computational Linguistics (<strong>ACL-2021, CCF-A</strong>), Bangkok, Thailand<br/>
<a class="grey" href="https://github.com/J-zin/SNUH">[Codes]</a></p>
<a class="grey" href="https://arxiv.org/pdf/2105.11698.pdf"><i>Guiding the Growth: Difficulty-Controllable Question Generation through Step-by-Step Rewriting</i></a><br/>
Yi Cheng, Siyao Li, Bang Liu, <strong>Ruihui Zhao</strong>, Sujian Li, Chenghua Lin and Yefeng Zheng<br/>
The 59th Annual Meeting of the Association for Computational Linguistics (<strong>ACL-2021, CCF-A</strong>), Bangkok, Thailand<br/></p>
<!-- <a class="grey" href="xxx">[Codes]</a></p> -->
<a class="grey" href="https://aclanthology.org/2021.naacl-main.238.pdf"><i>Imperfect also Deserves Reward: Multi-Level and Sequential Reward Modeling for Better Dialog Management</i></a><br/>
Zhengxu Hou, Bang Liu, <strong>Ruihui Zhao</strong>, Zijing Ou, Yafei Liu, Xi Chen, Yefeng Zheng<br/>
2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (<strong>NAACL-2021, CCF-B, NLP top-tier Conf.</strong>), Virtual Conference<br/></p>
<!-- <a class="grey" href="xxx">[Codes]</a></p> -->
<a class="grey" href="https://arxiv.org/abs/2101.11268"><i>Enquire One's Parent and Child Before Decision: Fully Exploit Hierarchical Structure for Self-Supervised Taxonomy Expansion</i></a><br/>
Suyuchen Wang, <strong>Ruihui Zhao</strong>, Xi Chen, Yefeng Zheng, Bang Liu<br/>
The Web Conference (<strong>WWW-2021, CCF-A</strong>), Virtual Conference<br/>
<a class="grey" href="https://github.com/sheryc/HEF">[Codes]</a></p>
<a class="grey" href="https://ieeexplore.ieee.org/document/9533409"><i>FedOVA: One-vs-All Training Method for Federated Learning with Non-IID Data</i></a><br/>
Yuanshao Zhu, Christos Markos, <strong>Ruihui Zhao</strong>, Yefeng Zheng, James Yu<br/>
International Joint Conference on Neural Networks (<strong>IJCNN-2021, CCF-C</strong>), Virtual Conference<br/></p>
<!-- <a class="grey" href="xxx">[Codes]</a></p> -->
<a class="grey" href="https://arxiv.org/abs/2012.11988"><i>Graph-Evolving Meta-Learning for Low-Resource Medical Dialogue Generation</i></a><br/>
Shuai Lin, Pan Zhou, Xiaodan Liang, Jianheng Tang, <strong>Ruihui Zhao</strong>, Ziliang Chen, Liang Lin<br/>
The Thirty-Fifth AAAI Conference on Artificial Intelligence (<strong>AAAI-2021, CCF-A</strong>), Virtual Conference<br/>
<a class="grey" href="https://github.com/ha-lins/GEML-MDG">[Codes]</a></p>
<a class="grey" href="https://dl.acm.org/doi/pdf/10.1145/3453169"><i>Towards Communication-efficient and Attack-resistant Federated Edge Learning for Industrial Internet of Things</i></a><br/>
Yi Liu, <strong>Ruihui Zhao</strong>, Jiawen Kang, Abdulsalam Yassine, Dusit Niyato, Jialiang Peng*<br/>
Transactions on Internet Technology (<strong>TOIT-2021, CCF-B, IF=4.79</strong>)<br/></p>
<!-- <a class="grey" href="xxx">[Codes]</a></p> -->
<h4>2020</h4>
<a class="grey" href="https://arxiv.org/abs/2006.10517"><i>Privacy-Preserving Technology to Help Millions of People: Federated Prediction Model for Stroke Prevention</i></a><br/>
Ce Ju*, <strong>Ruihui Zhao*</strong>, Jichao Sun, Xiguang Wei, Bo Zhao, Yang Liu, Hongshan Li, Tianjian Chen, Xinwei Zhang, Dashan Gao, Ben Tan, Han Yu and Yuan Jin<br/>
The 29th International Joint Conference on Artificial Intelligence (<strong>FML'IJCAI-2020</strong>), Virtual Conference<br/></p>
<!-- <a class="grey" href="xxx">[Codes]</a></p> -->
<a class="grey" href="https://www.nature.com/articles/s41598-020-78084-w"><i>Forecasting the long-term trend of COVID-19 epidemic using a dynamic model</i></a><br/>
Jichao Sun,† Xi Chen,† Ziheng Zhang,† Shengzhang Lai, Bo Zhao, Hualuo Liu, Shuojia Wang, Wenjing Huan, <strong>Ruihui Zhao</strong>, Man Tat Alexander Ng*, Yefeng Zheng*<br/>
Scientific Reports (<strong>Scientific Reports, IF=4.525</strong>)<br/></p>
<!-- <a class="grey" href="https://github.com/jichaosun001/covid_forecast">[Codes]</a></p> -->
<h4>2019 and Before</h4>
<a class="grey" href="https://arxiv.org/abs/1705.11056"><i>A Lightweight Efficient Searchable Encryption Scheme using Supervised Sentence Representations</i></a><br/>
<strong>Ruihui Zhao</strong>, Mizuho Iwaihara, etc.<br/>
The 28th International Joint Conference on Artificial Intelligence (<strong>FL'IJCAI-2019</strong>), Macau, China<br/></p>
<a href="https://ieeexplore.ieee.org/document/7752502"><i>A Community-based P2P OSNs Using Broadcast Encryption Supporting Cross-platform with High-Security</i></a><br/>
Mingjie Ding, <strong>Ruihui Zhao</strong>, Keiichi Koyanagi, Takeshi Tsuchiya, Hiroaki Sawano<br/>
8th International Conference on Wireless Communications & Signal Processing (<strong>IEEE WCSP-2016, EI</strong>), Yangzhou, China</p>
<a href="https://ieeexplore.ieee.org/document/6992161?arnumber=6992161"><i>Privacy-preserving personalized search over encrypted cloud data supporting multi-keyword ranking</i></a><br/>
<strong>Ruihui Zhao</strong>, Hongwei Li, Yi Yang, Yu Liang<br/>
6th International Conference on Wireless Communications & Signal Processing (<strong>IEEE WCSP-2014, EI</strong>), Hefei, China</p>
<hr>
<h3>Academic Service</h3>
<li>Industry chair: AAAI 2021 RSEML workshop, <a class="grey" href="http://federated-learning.org/rseml2021/">[Link]</a></li>
<li>Program committee member: IJCAI (IJCAI 2021, IJCAI-ECAI 2022, IJCAI 2023, and IJCAI 2024)</li>
<li>Reviewer: EMNLP 2021 & 2022, TPDS 2018 (CCF-A journal), AACL 2020, etc.</li></p>
<hr>
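<h3>Granted National Invention Patents</h3>
<font color="red"><i>11 national invention patents granted; more than 90 further national invention patent applications filed</i></font><br/>
<li>2022.01 A disease prediction and diagnosis method based on time-series medical records and external medical knowledge, <strong>Ruihui Zhao</strong>, CN 113257412 A</a></li>
<li>2021.07 A data augmentation system based on colloquial paraphrase generation, <strong>Ruihui Zhao</strong>/Qianqian Qiao/Wei Wei, CN 110110045 A</a></li>
<li>2021.06 A scheme for building fine-grained tree-structured classifiers for question answering systems, <strong>Ruihui Zhao</strong>/Wei Wei/Qianqian Qiao/Wenwen Tan, CN 110032631 A</a></li>
<li>2021.04 A collaborative task prediction method and computer-readable storage medium, Hongshan Li/<strong>Ruihui Zhao</strong>/Bo Zhao, HK 40022283</a></li>
<li>2021.03 A lightweight retrieval scheme for encrypted cloud data based on a neural network language model, <strong>Ruihui Zhao</strong>/Qianqian Qiao/Shunnan Xu, CN 109992978 A</a></li>
<li>2021.02 A disease prediction and diagnosis method based on causal graphs and deep learning, <strong>Ruihui Zhao</strong>/Jingwen Wang, CN 112035671 A</a></li>
<li>2021.01 A communication-efficient federated learning framework based on asynchronous gradient compression, Yi Liu/<strong>Ruihui Zhao</strong>, CN 111784002 A</a></li>
<li>2020.11 A language model compression method using knowledge distillation based on multi-head self-attention, Zhanpeng Huang/Bo Zhao/<strong>Ruihui Zhao</strong>/Kuojian Lu, CN 111554268 A</a></li>
<li>2020.10 A controllable medical entity recommendation scheme based on a pre-built ontology and knowledge graph, <strong>Ruihui Zhao</strong>/Kuojian Lu/Bo Zhao/Zhanpeng Huang, CN 111538894 A</a></li>
<li>2020.10 An interpretable, lightweight search ranking scheme based on knowledge distillation, <strong>Ruihui Zhao</strong>/Kuojian Lu/Bo Zhao/Zhanpeng Huang, CN 111538908 A</a></li>
<li>2020.06 A collaborative training scheme for decentralized data that combines vertical and horizontal federated learning, Hongshan Li/<strong>Ruihui Zhao</strong>/Bo Zhao, CN 111081337 A</a></li></p>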
<hr>
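<h3>Awards and Honors</h3>
<li>2022.06 As team lead, developed a Chinese medical pre-trained language model that took <a class="grey" href="https://tianchi.aliyun.com/cblue">first place on CBLUE2.0</a>, the first Chinese medical information processing evaluation benchmark (14 NLP tasks in total, including text information extraction, classification, relation determination, and term normalization; participating teams came from the Institute of Automation of the Chinese Academy of Sciences, Tsinghua University, AISpeech, KingMed Diagnostics, and Xi'an Jiaotong University)</li>
<li>2022.05 Led the team to win the Leading Scientific and Technological Achievement Award at the 2022 Big Data Expo (<strong>author & team lead, national social science and technology award,</strong> award ratio: 55/437), <a class="grey" href="https://mp.weixin.qq.com/s/5oS1PfYp8hY4YaNN79Xl8w">[Link]</a></li>
<li>2022.03 As team lead, led the "Goose Factory" (Tencent) Track B team to <a class="grey" href="http://www.nhsa.gov.cn/art/2022/3/3/art_14_7918.html">first place in the finals of the National Healthcare Security Administration solution competition</a>, <a class="grey" href="http://www.gd.chinanews.com.cn/2022/2022-03-09/419708.shtml">[Link]</a> (1/156; 156 teams from 133 companies, including Tencent, Alibaba, Baidu, Ping An, and Taikang)</li>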
<!-- <li>2022.01 Received the Shenzhen Industrial Development and Innovation Talent Award</a></li> -->
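<li>2021.11 Led the university collaboration between Jarvis Lab and Tongji University assistant professor <a class="grey" href="http://cs1.tongji.edu.cn/~dawei/">Dawei Cheng</a> on medical insurance fund anti-fraud</a></li>
<li>2021.10 Won first prize in the intelligent medical diagnosis track of the intelligent medical dialogue diagnosis and treatment evaluation at the China National Conference on Computational Linguistics (CCL 2021) <a class="grey" href="https://mp.weixin.qq.com/s/DnQb3AOvajiMG7kf6aUUlQ">[Link] </a></li>
<li>2021.10 Topic 3.3, Medical Natural Language Understanding, carried out with Sun Yat-sen University associate professor <a class="grey" href="https://lemondan.github.io/">Xiaodan Liang</a>, received the Excellence Award of the 2020 CCF-Tencent Rhino-Bird Fund (6/25)</a></li>
<li>2021.10 Led Topic 4.3, Medical Natural Language Understanding, of the 2021 CCF-Tencent Rhino-Bird Fund, and led the research collaboration with Shanghai Jiao Tong University assistant professor <a class="grey" href="https://coai-sjtu.github.io">Lu Chen</a></a></li>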
<li>2021.07 Co-organized the international ICLR MLPCP challenge and contributed a dataset to <a class="grey" href="https://tianchi.aliyun.com/specials/promotion/2021chinesemedicalnlpleaderboardchallenge">CBLUE2.0</a>, supporting industry exchange and progress in medical dialogue generation and assisted diagnosis, <a class="grey" href="https://mp.weixin.qq.com/s?__biz=MzI0NzE1MTg2Ng==&mid=2649907523&idx=1&sn=d5c471e288f95f968c34683981625938&chksm=f1b2acd6c6c525c091fb2709fa05a38a8cfe6ca3a2bdb18d7c607830392a3c8dcd483892d975&mpshare=1&scene=1&srcid=0705GaniaipKOu5OAiPELYfc&sharer_sharetime=1625457310048&sharer_shareid=b0cf66fe604877ee3ba75189cecc2110&version=3.1.8.90238&platform=mac#rd">[Link1]</a>, <a class="grey" href="https://mlpcp21.github.io/pages/challenge">[Link2]</a>, <a class="grey" href="https://mp.weixin.qq.com/s/yOB0VTKND8nmGAeaPO-Nzg">[Link3]</a></li>
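<li>2021.05 Leading Scientific and Technological Achievement Award at the Big Data Expo (<strong>author & second contributor, national social science and technology award,</strong> award ratio: 49/560, roadshow ratio: 5/49), <a class="grey" href="https://jingji.cctv.com/2021/05/26/ARTIRHsVCF4vw7csSIlWroIJ210526.shtml">[Link1]</a>, <a class="grey" href="https://mp.weixin.qq.com/s/7PO8UPW1tWdNJYbmatMqrw">[Link2]</a></li>
<li>2021.05 News coverage of the ACL acceptance by Synced (机器之心), <a class="grey" href="https://mp.weixin.qq.com/s/hncJr2sjULaKnshaWAUQgw">[Link]</a></li>
<li>2021.05 Co-organized and sponsored a CCKS evaluation task, <a class="grey" href="https://www.biendata.xyz/competition/ccks_2021_mdg/">[Link]</a></li>
<li>2021.03 Named Tencent "Patent Liaison of the Year" (top 10, and the only one from the Cloud and Smart Industries Group; led Jarvis Lab to file 146 invention patent applications in 2020, up 53.7%)</a></li>
<li>2021.01 As project sponsor and technical lead, developed China's first intelligent COVID-19 vaccine Q&A system (launched on Tencent Health and WeChat Search), <a class="grey" href="https://mp.weixin.qq.com/s/kKjfV0bMGOx_Co-BhNdqrg">[Link]</a></li>
<li>2020.12 Received Tencent's "Business Breakthrough Award" and Tencent's "Tongxin Campaign Special Contribution Award" </a></li>
<li>2020.11 <strong>Nomination award for the Nanshan District "Top Ten Innovative Craftsmen"</strong> (top 20; competed alongside founders, CEOs, and chief scientists from more than 100 Nanshan District companies such as DJI, ZTE, Han's Laser, and China State Construction), <a class="grey" href="https://www.163.com/dy/article/FOJFPRU70511PVPC.html">[Link1]</a>, <a class="grey" href="http://www.sznews.com/news/content/2020-10/17/content_23641159.htm">[Link2]</a></li>
<li>2020.10 Led Topic 3.3, Medical Natural Language Understanding, of the 2020 CCF-Tencent Rhino-Bird Fund, and led the research collaboration with Sun Yat-sen University associate professor <a class="grey" href="https://lemondan.github.io/">Xiaodan Liang</a>, <a class="grey" href="https://mp.weixin.qq.com/s/19BTZdVVC3x69BlhqgkJyQ">[Link]</a></li>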
<li>2020.09 Gold award at the first Beijing Digital-Intelligent Medical Insurance Innovation Competition (technical lead; participants included Baidu and Alibaba, Ping An and Taikang, Tsinghua and Peking University, the four major state-owned banks, traditional IT companies, etc.), <a class="grey" href="https://mp.weixin.qq.com/s?__biz=MzI0NzE1MTg2Ng==&mid=2649904037&idx=2&sn=96a9abb49728757836e0f08690e18027&chksm=f1b29e30c6c51726db28971535e35db7c42a6833985d4f32507fdf41793db8f18537a3651715&mpshare=1&scene=1&srcid=01057pMtb1h3dWajwItpLSE8&sharer_sharetime=1609826139916&sharer_shareid=03ceeec3a81474a058c2259020e5ab73&version=3.1.0.2353&platform=mac#rd">[Link]</a></li>
<li>2020.08 As sponsor, led the founding of the Tencent Jarvis Lab-WeBank joint federated learning laboratory together with <a class="grey" href="http://home.cse.ust.hk/~qyang/">Qiang Yang</a>, <a class="grey" href="https://air.tsinghua.edu.cn/team-detail.html?id=86&classid=17">Yang Liu</a>, and others, <a class="grey" href="https://mp.weixin.qq.com/s?__biz=MzI0NzE1MTg2Ng==&mid=2649903750&idx=2&sn=a6663dc3f68767441dd44923886c8da3&chksm=f1b29f13c6c51605f8709949f107abee6fe50bc2dfff02d863241e8145978105a5b2b00b8e09&mpshare=1&scene=1&srcid=0105TQS39d3G6XnGu4RE1GxS&sharer_sharetime=1609826154864&sharer_shareid=03ceeec3a81474a058c2259020e5ab73&version=3.1.0.2353&platform=mac#rd">[Link]</a></li>
<li>2020.04 Jointly developed and released the industry's first medical federated learning application framework with the AI department of WeBank, <a class="grey" href="https://mp.weixin.qq.com/s?__biz=MzI0NzE1MTg2Ng==&mid=2649902474&idx=1&sn=ed74ee6de05892ab92a506112d416a55&chksm=f1b2801fc6c50909f4ea1ed11287f62ce726b34635132e97b0863976280a0a3847f7c84566ec&scene=21#wechat_redirect">[Link]</a></li>
<li>2020.03 Led the research advisory collaboration between Jarvis Lab and <a class="grey" href="http://www-labs.iro.umontreal.ca/~liubang/">Bang Liu</a>, assistant professor at the Université de Montréal, Canada</a></li>
<li>2020.03 Represented Tencent at the standardization meetings for IEEE Standard P3652.1 (Federated Machine Learning), the first international standard for federated learning, <a class="grey" href="https://new.qq.com/rain/a/20210402A05RPW00">[Link]</a></li>
<li>2020.03 "Explainable Anomaly Detection Using Spark", <strong>Ruihui Zhao</strong>, Ting Chen, presentation accepted by <a class="grey" href="https://databricks.com/sparkaisummit/north-america-2020">Spark AI Summit 2020</a></li>
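<li>2020.02 Served as a core researcher at the <a class="grey" href="https://www.163.com/tech/article/F6D5NOD300097U7R.html">joint laboratory for big data and artificial intelligence</a> founded by Tencent Healthcare and Academician Zhong Nanshan's team, conducting research on predicting severe COVID-19 cases and on influenza forecasting</a></li></p>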
<hr>
<div class="content footer">
Last updated on <b><font color=#008B8B>Dec 10, 2022</font></b>.
<!-- Visitor number: <a href="https://www.hitwebcounter.com" target="_blank"> -->
<!-- <img src="https://hitwebcounter.com/counter/counter.php?page=7804457&style=0027&nbdigits=8&type=page&initCount=0" title="Free Counter" Alt="web counter" border="0" /></a> -->
</div>
</body>
</html>