mtuann/llm-updated-papers

Table of Contents

  1. Large Language Models Papers
  2. Other Topics
  3. Large Language Models Papers with Code

Large Language Models Papers

This GitHub repository contains an updated list of Large Language Model papers as of November 29, 2024.

  • The resources are collected from various sources, including arXiv, NeurIPS, ICML, ICLR, ACL, EMNLP, AAAI, IJCAI, KDD, CVPR, ICCV, ECCV, IEEE, ACM, Springer, ScienceDirect, Wiley, Nature, Science, and other top AI/ML conferences and journals.
  • For a better reading experience, visit the Shinyapps website.

Other Topics

Explore additional research papers on the following topics:


For contributions, inquiries, or suggestions, feel free to reach out via email.


If you find this application helpful and would like to support its development, you can buy me a coffee using one of the following methods:


Large Language Models Papers with Code

Due to GitHub repository limitations, this section lists only papers that provide accompanying code, sorted by publication date. For the full list of papers, please visit the Shinyapps website.


No. Title Authors Publish Date Venue Code URL
1 A Survey on Large Language Model based Autonomous Agents Lei Wang, Chen Ma, Xueyang Feng, Zeyu Zhang, Hao Yang, Jingsen Zhang, Zhiyuan Chen, Jiakai Tang, Xu Chen, Yankai Lin, Wayne Xin Zhao, Zhewei Wei, Ji-Rong Wen 2024-12-01 arXiv https://github.com/Paitesanshi/LLM-Agent-Survey https://doi.org/10.48550/arXiv.2308.11432
2 The Two Word Test: A Semantic Benchmark for Large Language Models Nicholas Riccardi, Xuan Yang, Rutvik H. Desai 2024-12-01 arXiv https://github.com/NickRiccardi/two-word-test https://doi.org/10.48550/arXiv.2306.04610
3 Large Language Models as Surrogate Models in Evolutionary Algorithms: A Preliminary Study Hao Hao, Xiaoqun Zhang, Aimin Zhou 2024-12-01 arXiv https://github.com/hhyqhh/LAEA https://doi.org/10.48550/arXiv.2406.10675
4 DRPruning: Efficient Large Language Model Pruning through Distributionally Robust Optimization Hexuan Deng, Wenxiang Jiao, Xuebo Liu, Min Zhang, Zhaopeng Tu 2024-11-22 arXiv:2411.14055, 2024 https://github.com/hexuandeng/DRPruning http://arxiv.org/abs/2411.14055v1
5 SemiKong: Curating, Training, and Evaluating A Semiconductor Industry-Specific Large Language Model Christopher Nguyen, William Nguyen, Atsushi Suzuki, Daisuke Oku, Hong An Phan, Sang Dinh, Zooey Nguyen, Anh Ha, Shruti Raghavan, Huy Vo, Thang Nguyen, Lan Nguyen, Yoshikuni Hirayama 2024-11-22 arXiv …, 2024 https://github.com/aitomatic/semikong http://arxiv.org/abs/2411.13802v2
6 UnifiedCrawl: Aggregated Common Crawl for Affordable Adaptation of LLMs on Low-Resource Languages Bethel Melesse Tessema, Akhil Kedia, Tae-Sun Chung 2024-11-22 arXiv:2411.14343, 2024 https://github.com/bethelmelesse/unifiedcrawl http://arxiv.org/abs/2411.14343v1
7 Disentangling Memory and Reasoning Ability in Large Language Models Mingyu Jin, Weidi Luo, Sitao Cheng, Xinyi Wang, Wenyue Hua, Ruixiang Tang, William Yang Wang, Yongfeng Zhang 2024-11-21 arXiv …, 2024 https://github.com/MingyuJ666/Disentangling-Memory-and-Reasoning http://arxiv.org/abs/2411.13504v2
8 DriveMLLM: A Benchmark for Spatial Understanding with Multimodal Large Language Models in Autonomous Driving Xianda Guo, Ruijun Zhang, Yiqun Duan, Yuhang He, Chenming Zhang, Shuai Liu, Long Chen 2024-11-21 arXiv …, 2024 https://github.com/XiandaGuo/Drive-MLLM http://arxiv.org/abs/2411.13112v1
9 On the Consistency of Video Large Language Models in Temporal Comprehension Minjoon Jung, Junbin Xiao, Byoung-Tak Zhang, Angela Yao 2024-11-21 arXiv:2411.12951, 2024 https://github.com/minjoong507/Consistency-of-Video-LLM http://arxiv.org/abs/2411.12951v1
10 Does Unlearning Truly Unlearn? A Black Box Evaluation of LLM Unlearning Methods Jai Doshi, Asa Cooper Stickland 2024-11-18 arXiv https://github.com/JaiDoshi/Knowledge-Erasure http://arxiv.org/abs/2411.12103v2
11 FLAME: Frozen Large Language Models Enable Data-Efficient Language-Image Pre-training Anjia Cao, Xing Wei, Zhiheng Ma 2024-11-18 arXiv https://github.com/MIV-XJTU/FLAME http://arxiv.org/abs/2411.11927v1
12 BianCang: A Traditional Chinese Medicine Large Language Model Sibo Wei, Xueping Peng, Yi-fei Wang, Jiasheng Si, Weiyu Zhang, Wenpeng Lu, Xiaoming Wu, Yinglong Wang 2024-11-18 arXiv …, 2024 https://github.com/QLU-NLP/BianCang http://arxiv.org/abs/2411.11027v1
13 Multilingual Large Language Models: A Systematic Survey Shaolin Zhu, Supryadi, Shaoyang Xu, Haoran Sun, Leiyu Pan, Menglong Cui, Jiangcun Du, Renren Jin, António Branco, Deyi Xiong 2024-11-17 arXiv https://github.com/tjunlp-lab/Awesome-Multilingual-LLMs-Papers http://arxiv.org/abs/2411.11072v1
14 DART-LLM: Dependency-Aware Multi-Robot Task Decomposition and Execution using Large Language Models Yongdong Wang, Runze Xiao, Jun Younes Louhi Kasahara, Ryosuke Yajima, Keiji Nagatani, Atsushi Yamashita, Hajime Asama 2024-11-17 arXiv e …, 2024 https://wyd0817.github.io/project-dart-llm/ http://arxiv.org/abs/2411.09022v1
15 LHRS-Bot-Nova: Improved Multimodal Large Language Model for Remote Sensing Vision-Language Interpretation Zhenshi Li, Dilxat Muhtar, Feng Gu, Xueliang Zhang, Pengfeng Xiao, Guangjun He, Xiaoxiang Zhu 2024-11-17 arXiv e …, 2024 https://github.com/NJU-LHRS/LHRS-Bot http://arxiv.org/abs/2411.09301v1
16 TS-LLaVA: Constructing Visual Tokens through Thumbnail-and-Sampling for Training-Free Video Large Language Models Tingyu Qu, Mingxiao Li, Tinne Tuytelaars, Marie-Francine Moens 2024-11-17 arXiv https://github.com/tingyu215/TS-LLaVA http://arxiv.org/abs/2411.11066v1
17 Understanding Multimodal LLMs: the Mechanistic Interpretability of Llava in Visual Question Answering Zeping Yu, Sophia Ananiadou 2024-11-17 arXiv https://github.com/zepingyu0512/llava-mechanism http://arxiv.org/abs/2411.10950v1
18 Multi-Stage Vision Token Dropping: Towards Efficient Multimodal Large Language Model Ting Liu, Liangtao Shi, Richang Hong, Yue Hu, Quanjun Yin, Linfeng Zhang 2024-11-16 arXiv https://github.com/liuting20/MustDrop http://arxiv.org/abs/2411.10803v1
19 Large Language Models Can Self-Improve in Long-context Reasoning Siheng Li, Cheng Yang, Zesen Cheng, Lemao Liu, Mo Yu, Yujiu Yang, Wai Lam 2024-11-16 arXiv e …, 2024 https://github.com/SihengLi99/SEALONG http://arxiv.org/abs/2411.08147v1
20 Evaluating Creativity and Deception in Large Language Models: A Simulation Framework for Multi-Agent Balderdash Parsa Hejabi, Elnaz Rahmati, Alireza S. Ziabari, Preni Golazizian, Jesse Thomason, Morteza Dehghani 2024-11-15 arXiv https://github.com/ParsaHejabi/Simulation-Framework-for-Multi-Agent-Balderdash http://arxiv.org/abs/2411.10422v1
21 Orca: Enhancing Role-Playing Abilities of Large Language Models by Integrating Personality Traits Yuxuan Huang 2024-11-15 arXiv https://github.com/Aipura/Orca http://arxiv.org/abs/2411.10006v1
22 MM-Eval: A Hierarchical Benchmark for Modern Mongolian Evaluation in LLMs Mengyuan Zhang, Ruihui Wang, Bo Xia, Yuan Sun, Xiaobing Zhao 2024-11-15 arXiv:2411.09492, 2024 https://github.com/joenahm/MM-Eval http://arxiv.org/abs/2411.09492v1
23 Instruction-Guided Editing Controls for Images and Multimedia: A Survey in LLM era Thanh Tam Nguyen, Zhao Ren, Trinh Pham, Thanh Trung Huynh, Phi Le Nguyen, Hongzhi Yin, Quoc Viet Hung Nguyen 2024-11-15 arXiv https://github.com/tamlhp/awesome-instruction-editing http://arxiv.org/abs/2411.09955v1
24 CorrectBench: Automatic Testbench Generation with Functional Self-Correction using LLMs for HDL Design Ruidi Qiu, Grace Li Zhang, Rolf Drechsler, Ulf Schlichtmann, Bing Li 2024-11-15 arXiv e …, 2024 https://github.com/AutoBench/CorrectBench http://arxiv.org/abs/2411.08510v1
25 Thinking Before Looking: Improving Multimodal LLM Reasoning via Mitigating Visual Hallucination Haojie Zheng, Tianyang Xu, Hanchi Sun, Shu Pu, Ruoxi Chen, Lichao Sun 2024-11-15 arXiv https://github.com/Terry-Xu-666/visual_inference_chain http://arxiv.org/abs/2411.12591v1
26 InfiCoder-Eval: Systematically Evaluating the Question-Answering Capabilities of Code Large Language Models Linyi Li, Shijie Geng, Zhenwen Li, Yibo He, Hao Yu, Ziyue Hua, Guanghan Ning, Siwei Wang, Tao Xie, Hongxia Yang 2024-11-14 arXiv https://infi-coder.github.io/infibench https://doi.org/10.48550/arXiv.2404.07940
27 MedSafetyBench: Evaluating and Improving the Medical Safety of Large Language Models Tessa Han, Aounon Kumar, Chirag Agarwal, Himabindu Lakkaraju 2024-11-14 The Thirty-eight …, 2024 https://github.com/AI4LIFE-GROUP/med-safety-bench http://arxiv.org/abs/2403.03744v4
28 TourSynbio-Search: A Large Language Model Driven Agent Framework for Unified Search Method for Protein Engineering Yungeng Liu, Zan Chen, Yu Guang Wang, Yiqing Shen 2024-11-14 arXiv e-prints, 2024 https://github.com/tsynbio/Toursynbio-Search http://arxiv.org/abs/2411.06024v1
29 When LLMs Meet Cunning Questions: A Fallacy Understanding Benchmark for Large Language Models Yinghui Li, Qingyu Zhou, Yuanzhen Luo, Shirong Ma, Yangning Li, Hai-Tao Zheng, Xuming Hu, Philip S. Yu 2024-11-14 arXiv https://github.com/THUKElab/FLUB https://doi.org/10.48550/arXiv.2402.11100
30 DROJ: A Prompt-Driven Attack against Large Language Models Leyang Hu, Boran Wang 2024-11-14 arXiv https://github.com/Leon-Leyang/LLM-Safeguard http://arxiv.org/abs/2411.09125v1
31 Verbosity $\neq$ Veracity: Demystify Verbosity Compensation Behavior of Large Language Models Yusen Zhang, Sarkar Snigdha Sarathi Das, Rui Zhang 2024-11-12 arXiv https://github.com/psunlpgroup/VerbosityLLM http://arxiv.org/abs/2411.07858v1
32 ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction? Canyu Chen, Jian Yu, Shan Chen, Che Liu, Zhongwei Wan, Danielle Bitterman, Fei Wang, Kai Shu 2024-11-10 GoogleScholar https://clinicalbench.github.io http://arxiv.org/abs/2411.06469v1
33 AutoProteinEngine: A Large Language Model Driven Agent Framework for Multimodal AutoML in Protein Engineering Yungeng Liu, Zan Chen, Yu Guang Wang, Yiqing Shen 2024-11-10 arXiv e-prints, 2024 https://github.com/tsynbio/AutoPE http://arxiv.org/abs/2411.04440v1
34 Robust and Efficient Fine-tuning of LLMs with Bayesian Reparameterization of Low-Rank Adaptation Ayan Sengupta, Vaibhav Seth, Arinjay Pathak, Natraj Raman, Sriram Gopalakrishnan, Tanmoy Chakraborty 2024-11-10 arXiv e …, 2024 https://github.com/LCS2-IIITD/MonteCLoRA http://arxiv.org/abs/2411.04358v2
35 Thanos: Enhancing Conversational Agents with Skill-of-Mind-Infused Large Language Model Young-Jun Lee, Dokyong Lee, Junyoung Youn, Kyeongjin Oh, Ho-Jin Choi 2024-11-10 arXiv e-prints, 2024 https://github.com/passing2961/Thanos http://arxiv.org/abs/2411.04496v1
36 LogicAsker: Evaluating and Improving the Logical Reasoning Ability of Large Language Models Yuxuan Wan, Wenxuan Wang, Yiliu Yang, Youliang Yuan, Jen-tse Huang, Pinjia He, Wenxiang Jiao, Michael R. Lyu 2024-11-09 EMNLP https://github.com/yxwan123/LogicAsker https://aclanthology.org/2024.emnlp-main.128
37 On Fake News Detection with LLM Enhanced Semantics Mining X Ma, Y Zhang, K Ding, J Yang, J Wu… 2024-11-09 OpenReview https://github.com/LEG4FD/LEG4FD https://openreview.net/pdf/ee5cbff327c79c172ad3b32ba4e6aaf163010f2f.pdf
38 Not Everything is All You Need: Toward Low-Redundant Optimization for Large Language Model Alignment Zhipeng Chen, Kun Zhou, Wayne Xin Zhao, Jingyuan Wang, Ji-Rong Wen 2024-11-09 EMNLP https://github.com/RUCAIBox/ALLO https://aclanthology.org/2024.emnlp-main.857
39 A User-Centric Benchmark for Evaluating Large Language Models Jiayin Wang, Fengran Mo, Weizhi Ma, Peijie Sun, Min Zhang, Jian-Yun Nie 2024-11-09 arXiv https://github.com/Alice1998/URS https://doi.org/10.48550/arXiv.2404.13940
40 Golden Touchstone: A Comprehensive Bilingual Benchmark for Evaluating Financial Large Language Models Xiaojun Wu, Junxi Liu, Huanyi Su, Zhouchi Lin, Yiyan Qi, Chengjin Xu, Jiajun Su, Jiajie Zhong, Fuwei Wang, Saizhuo Wang, Fengrui Hua, Jia Li, Jian Guo 2024-11-09 arXiv https://github.com/IDEA-FinAI/Golden-Touchstone http://arxiv.org/abs/2411.06272v1
41 LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit Ruihao Gong, Yang Yong, Shiqiao Gu, Yushi Huang, Chengtao Lv, Yunchen Zhang, Dacheng Tao, Xianglong Liu 2024-11-09 EMNLP https://github.com/ModelTC/llmc https://aclanthology.org/2024.emnlp-industry.12
42 Game-theoretic LLM: Agent Workflow for Negotiation Games Wenyue Hua, Ollie Liu, Lingyao Li, Alfonso Amayuelas, Julie Chen, Lucas Jiang, Mingyu Jin, Lizhou Fan, Fei Sun, William Wang, Xintong Wang, Yongfeng Zhang 2024-11-08 arXiv https://github.com/Wenyueh/game_theory http://arxiv.org/abs/2411.05990v2
43 Exploring the Alignment Landscape: LLMs and Geometric Deep Models in Protein Representation Dong Shu, Bingbing Duan, Kai Guo, Kaixiong Zhou, Jiliang Tang, Mengnan Du 2024-11-08 arXiv https://github.com/Tizzzzy/LLM-GDM-alignment http://arxiv.org/abs/2411.05316v1
44 WorkflowLLM: Enhancing Workflow Orchestration Capability of Large Language Models Shengda Fan, Xin Cong, Yuepeng Fu, Zhong Zhang, Shuyan Zhang, Yuanwei Liu, Yesai Wu, Yankai Lin, Zhiyuan Liu, Maosong Sun 2024-11-08 arXiv https://github.com/OpenBMB/WorkflowLLM http://arxiv.org/abs/2411.05451v1
45 FineTuneBench: How well do commercial fine-tuning APIs infuse knowledge into LLMs? Eric Wu, Kevin Wu, James Zou 2024-11-07 arXiv https://github.com/kevinwu23/StanfordFineTuneBench http://arxiv.org/abs/2411.05059v1
46 Abstract2Appendix: Academic Reviews Enhance LLM Long-Context Capabilities Shengzhi Li, Kittipat Kampa, Rongyu Lin, Bohang Li, Shichao Pei 2024-11-07 arXiv https://github.com/findalexli/Abstract2Appendix http://arxiv.org/abs/2411.05232v1
47 Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models Zhijian Zhuo, Ya Wang, Yutao Zeng, Xiaoqing Li, Xun Zhou, Jinwen Ma 2024-11-06 arXiv https://github.com/BryceZhuo/PolyCom http://arxiv.org/abs/2411.03884v1
48 QUILL: Quotation Generation Enhancement of Large Language Models Jin Xiao, Bowei Zhang, Qianyu He, Jiaqing Liang, Feng Wei, Jinglei Chen, Zujie Liang, Deqing Yang, Yanghua Xiao 2024-11-06 arXiv https://github.com/GraceXiaoo/QUILL http://arxiv.org/abs/2411.03675v1
49 FlexCAD: Unified and Versatile Controllable CAD Generation with Fine-tuned Large Language Models Zhanwei Zhang, Shizhao Sun, Wenxiao Wang, Deng Cai, Jiang Bian 2024-11-05 arXiv https://github.com/microsoft/CADGeneration/FlexCAD http://arxiv.org/abs/2411.05823v1
50 Change Is the Only Constant: Dynamic LLM Slicing based on Layer Redundancy Razvan-Gabriel Dumitru, Paul-Ioan Clotan, Vikas Yadav, Darius Peteleaza, Mihai Surdeanu 2024-11-05 arXiv https://github.com/RazvanDu/DynamicSlicing http://arxiv.org/abs/2411.03513v1
51 Leveraging Large Language Models in Code Question Answering: Baselines and Issues Georgy Andryushchenko, Vladimir Ivanov, Vladimir Makharev, Elizaveta Tukhtina, Aidar Valeev 2024-11-05 arXiv https://github.com/IU-AES-AI4Code/CodeQuestionAnswering http://arxiv.org/abs/2411.03012v1
52 SMoA: Improving Multi-agent Large Language Models with Sparse Mixture-of-Agents Dawei Li, Zhen Tan, Peijia Qian, Yifan Li, Kumar Satvik Chaudhary, Lijie Hu, Jiayi Shen 2024-11-05 arXiv https://github.com/David-Li0406/SMoA http://arxiv.org/abs/2411.03284v1
53 Culinary Class Wars: Evaluating LLMs using ASH in Cuisine Transfer Task Hoonick Lee, Mogan Gim, Donghyeon Park, Donghee Choi, Jaewoo Kang 2024-11-04 arXiv http://github.com/dmis-lab/CulinaryASH http://arxiv.org/abs/2411.01996v1
54 Eurekaverse: Environment Curriculum Generation via Large Language Models William Liang, Sam Wang, Hung-Ju Wang, Osbert Bastani, Dinesh Jayaraman, Yecheng Jason Ma 2024-11-04 arXiv https://eureka-research.github.io/eurekaverse http://arxiv.org/abs/2411.01775v1
55 SQL Injection Jailbreak: a structural disaster of large language models Jiawei Zhao, Kejiang Chen, Weiming Zhang, Nenghai Yu 2024-11-03 arXiv https://github.com/weiyezhimeng/SQL-Injection-Jailbreak http://arxiv.org/abs/2411.01565v1
56 TODO: Enhancing LLM Alignment with Ternary Preferences Yuxiang Guo, Lu Yin, Bo Jiang, Jiaqi Zhang 2024-11-02 arXiv https://github.com/XXares/TODO http://arxiv.org/abs/2411.02442v1
57 Fish-Speech: Leveraging Large Language Models for Advanced Multilingual Text-to-Speech Synthesis Shijia Liao, Yuxuan Wang, Tianyu Li, Yifan Cheng, Ruoyi Zhang, Rongzhi Zhou, Yijin Xing 2024-11-02 arXiv https://github.com/fishaudio/fish-speech http://arxiv.org/abs/2411.01156v1
58 Leveraging LLM and Text-Queried Separation for Noise-Robust Sound Event Detection Han Yin, Yang Xiao, Jisheng Bai, Rohan Kumar Das 2024-11-02 arXiv https://github.com/apple-yinhan/Noise-robust-SED http://arxiv.org/abs/2411.01174v1
59 Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM Xiong Wang, Yangze Li, Chaoyou Fu, Yunhang Shen, Lei Xie, Ke Li, Xing Sun, Long Ma 2024-11-01 arXiv https://freeze-omni.github.io/ http://arxiv.org/abs/2411.00774v1
60 Beyond Utility: Evaluating LLM as Recommender Chumeng Jiang, Jiayin Wang, Weizhi Ma, Charles L. A. Clarke, Shuai Wang, Chuhan Wu, Min Zhang 2024-11-01 arXiv https://github.com/JiangDeccc/EvaLLMasRecommender http://arxiv.org/abs/2411.00331v1
61 LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in Large Language Models Nam V. Nguyen, Thong T. Doan, Luong Tran, Van Nguyen, Quang Pham 2024-11-01 arXiv https://fsoft-aic.github.io/fsoft-LibMoE.github.io http://arxiv.org/abs/2411.00918v1
62 Mitigating Tail Narrowing in LLM Self-Improvement via Socratic-Guided Sampling Yiwen Ding, Zhiheng Xi, Wei He, Zhuoyuan Li, Yitao Zhai, Xiaowei Shi, Xunliang Cai, Tao Gui, Qi Zhang, Xuanjing Huang 2024-11-01 arXiv https://github.com/Yiwen-Ding/Guided-Self-Improvement http://arxiv.org/abs/2411.00750v1
63 MoD: A Distribution-Based Approach for Merging Large Language Models Quy-Anh Dang, Chris Ngo 2024-11-01 arXiv https://github.com/knovel-eng/mod http://arxiv.org/abs/2411.00406v1
64 PILL: Plug Into LLM with Adapter Expert and Attention Gate Fangyuan Zhang, Tingting Liang, Zhengyuan Wu, Yuyu Yin 2024-11-01 Applied Soft Computing https://github.com/DsaltYfish/PILL http://arxiv.org/abs/2311.02126v1
65 Have You Merged My Model? On The Robustness of Large Language Model IP Protection Methods Against Model Merging Tianshuo Cong, Delong Ran, Zesen Liu, Xinlei He, Jinyuan Liu, Yichen Gong, Qi Li, Anyu Wang, Xiaoyun Wang 2024-11 LAMPS '24: Proceedings of the 1st ACM Workshop on Large AI Systems and Models with Privacy and Safety Analysis https://github.com/ThuCCSLab/MergeGuard https://dl.acm.org/doi/10.1145/3689217.3690614
66 SGEdit: Bridging LLM with Text2Image Generative Model for Scene Graph-based Image Editing Zhiyuan Zhang, DongDong Chen, Jing Liao 2024-11 ACM Transactions on Graphics (TOG), Volume 43, Issue 6 https://bestzzhang.github.io/SGEdit https://dl.acm.org/doi/10.1145/3687957
67 Large Language Models for Anomaly Detection in Computational Workflows: From Supervised Fine-Tuning to In-Context Learning Hongwei Jin, George Papadimitriou, Krishnan Raghavan, Pawel Zuk, Prasanna Balaprakash, Cong Wang, Anirban Mandal, Ewa Deelman 2024-11 SC '24: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis https://github.com/PoSeiDon-Workflows/LLM_AD https://dl.acm.org/doi/10.1109/SC41406.2024.00098
68 Demonstration of DB-GPT: Next Generation Data Interaction System Empowered by Large Language Models Siqiao Xue, Danrui Qi, Caigao Jiang, Wenhui Shi, Fangyin Cheng, Keting Chen, Hongjun Yang, Zhiping Zhang, Jianshan He, Hongyang Zhang, Ganglin Wei, Wang Zhao, Fan Zhou, Hong Yi, Shaodong Liu, Hongjun Yang, Faqiang Chen 2024-11 Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12 https://github.com/eosphoros-ai/DB-GPT https://dl.acm.org/doi/10.14778/3685800.3685876
69 EDGE-LLM: Enabling Efficient Large Language Model Adaptation on Edge Devices via Unified Compression and Adaptive Layer Voting Zhongzhi Yu, Zheng Wang, Yuhan Li, Haoran You, Ruijie Gao, Xiaoya Zhou, Sreenidhi Reedy Bommu, Yang Katie Zhao, Yingyan Celine Lin 2024-11 DAC '24: Proceedings of the 61st ACM/IEEE Design Automation Conference https://github.com/GATECH-EIC/Edge-LLM https://dl.acm.org/doi/10.1145/3649329.3658473
70 End-to-End Ontology Learning with Large Language Models Andy Lo, Albert Q. Jiang, Wenda Li, Mateja Jamnik 2024-10-31 arXiv https://github.com/andylolu2/ollm http://arxiv.org/abs/2410.23584v1
71 What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective Ming Li, Yanhong Li, Tianyi Zhou 2024-10-31 arXiv https://github.com/MingLiiii/Layer_Gradient http://arxiv.org/abs/2410.23743v1
72 Scaling Up Membership Inference: When and How Attacks Succeed on Large Language Models Haritz Puerto, Martin Gubri, Sangdoo Yun, Seong Joon Oh 2024-10-31 arXiv https://github.com/parameterlab/mia-scaling http://arxiv.org/abs/2411.00154v1
73 LLaMo: Large Language Model-based Molecular Graph Assistant Jinyoung Park, Minseong Bae, Dohwan Ko, Hyunwoo J. Kim 2024-10-31 arXiv https://github.com/mlvlab/LLaMo http://arxiv.org/abs/2411.00871v1
74 LLM4Mat-Bench: Benchmarking Large Language Models for Materials Property Prediction Andre Niyongabo Rubungo, Kangming Li, Jason Hattrick-Simpers, Adji Bousso Dieng 2024-10-31 arXiv https://github.com/vertaix/LLM4Mat-Bench http://arxiv.org/abs/2411.00177v1
75 BitStack: Fine-Grained Size Control for Compressed Large Language Models in Variable Memory Environments Xinghao Wang, Pengyu Wang, Bo Wang, Dong Zhang, Yunhua Zhou, Xipeng Qiu 2024-10-31 arXiv https://github.com/xinghaow99/BitStack http://arxiv.org/abs/2410.23918v1
76 DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios Junchao Wu, Runzhe Zhan, Derek F. Wong, Shu Yang, Xinyi Yang, Yulin Yuan, Lidia S. Chao 2024-10-31 arXiv https://github.com/NLP2CT/DetectRL http://arxiv.org/abs/2410.23746v1
77 BUZZ: Beehive-structured Sparse KV Cache with Segmented Heavy Hitters for Efficient LLM Inference Junqi Zhao, Zhijin Fang, Shu Li, Shaohui Yang, Shichao He 2024-10-30 arXiv https://github.com/JunqiZhao888/buzz-llm http://arxiv.org/abs/2410.23079v1
78 Causality-Enhanced Behavior Sequence Modeling in LLMs for Personalized Recommendation Yang Zhang, Juntao You, Yimeng Bai, Jizhi Zhang, Keqin Bao, Wenjie Wang, Tat-Seng Chua 2024-10-30 arXiv https://github.com/itsmeyjt/CFT http://arxiv.org/abs/2410.22809v1
79 Comparative Analysis of Demonstration Selection Algorithms for LLM In-Context Learning Dong Shu, Mengnan Du 2024-10-30 arXiv https://github.com/Tizzzzy/Demonstration_Selection_Overview http://arxiv.org/abs/2410.23099v1
80 On Memorization of Large Language Models in Logical Reasoning Chulin Xie, Yangsibo Huang, Chiyuan Zhang, Da Yu, Xinyun Chen, Bill Yuchen Lin, Bo Li, Badih Ghazi, Ravi Kumar 2024-10-30 arXiv https://memkklogic.github.io http://arxiv.org/abs/2410.23123v1
81 Real-Time Personalization for LLM-based Recommendation with Customized In-Context Learning Keqin Bao, Ming Yan, Yang Zhang, Jizhi Zhang, Wenjie Wang, Fuli Feng, Xiangnan He 2024-10-30 arXiv https://github.com/ym689/rec_icl http://arxiv.org/abs/2410.23136v1
82 SciPIP: An LLM-based Scientific Paper Idea Proposer Wenxiao Wang, Lihui Gu, Liye Zhang, Yunxiang Luo, Yi Dai, Chen Shen, Liang Xie, Binbin Lin, Xiaofei He, Jieping Ye 2024-10-30 arXiv https://github.com/cheerss/SciPIP http://arxiv.org/abs/2410.23166v1
83 ReasoningRec: Bridging Personalized Recommendations and Human-Interpretable Explanations through LLM Reasoning Millennium Bismay, Xiangjue Dong, James Caverlee 2024-10-30 arXiv https://github.com/millenniumbismay/reasoningrec http://arxiv.org/abs/2410.23180v1
84 Distinguishing Ignorance from Error in LLM Hallucinations Adi Simhi, Jonathan Herzig, Idan Szpektor, Yonatan Belinkov 2024-10-29 arXiv https://github.com/technion-cs-nlp/hallucination-mitigation http://arxiv.org/abs/2410.22071v1
85 Leveraging LLMs for Hypothetical Deduction in Logical Inference: A Neuro-Symbolic Approach Qingchuan Li, Jiatong Li, Tongxuan Liu, Yuting Zeng, Mingyue Cheng, Weizhe Huang, Qi Liu 2024-10-29 arXiv https://github.com/wufeiwuwoshihua/nshy http://arxiv.org/abs/2410.21779v1
86 Rare-to-Frequent: Unlocking Compositional Generation Power of Diffusion Models on Rare Concepts with LLM Guidance Dongmin Park, Sebin Kim, Taehong Moon, Minkyu Kim, Kangwook Lee, Jaewoong Cho 2024-10-29 arXiv https://github.com/krafton-ai/Rare2Frequent http://arxiv.org/abs/2410.22376v1
87 Scaling LLM Inference with Optimized Sample Compute Allocation Kexun Zhang, Shang Zhou, Danqing Wang, William Yang Wang, Lei Li 2024-10-29 arXiv https://github.com/LeiLiLab/OSCA http://arxiv.org/abs/2410.22480v1
88 LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment Ge Yang, Changyi He, Jinyang Guo, Jianyu Wu, Yifu Ding, Aishan Liu, Haotong Qin, Pengliang Ji, Xianglong Liu 2024-10-28 arXiv https://github.com/AboveParadise/LLMCBench http://arxiv.org/abs/2410.21352v2
89 SocialGPT: Prompting LLMs for Social Relation Reasoning via Greedy Segment Optimization Wanhua Li, Zibin Meng, Jiawei Zhou, Donglai Wei, Chuang Gan, Hanspeter Pfister 2024-10-28 arXiv https://mengzibin.github.io/SocialGPT.github.io/ http://arxiv.org/abs/2410.21411v1
90 Shopping MMLU: A Massive Multi-Task Online Shopping Benchmark for Large Language Models Yilun Jin, Zheng Li, Chenwei Zhang, Tianyu Cao, Yifan Gao, Pratik Jayarao, Mao Li, Xin Liu, Ritesh Sarkhel, Xianfeng Tang, Haodong Wang, Zhengyang Wang, Wenju Xu, Jingfeng Yang, Qingyu Yin, Xian Li, Priyanka Nigam, Yi Xu, Kai Chen, Qiang Yang, Meng Jiang, Bing Yin 2024-10-28 arXiv https://github.com/KL4805/ShoppingMMLU http://arxiv.org/abs/2410.20745v2
91 ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference Hanshi Sun, Li-Wen Chang, Wenlei Bao, Size Zheng, Ningxin Zheng, Xin Liu, Harry Dong, Yuejie Chi, Beidi Chen 2024-10-28 arXiv https://github.com/bytedance/ShadowKV http://arxiv.org/abs/2410.21465v1
92 NewTerm: Benchmarking Real-Time New Terms for Large Language Models with Annual Updates Hexuan Deng, Wenxiang Jiao, Xuebo Liu, Min Zhang, Zhaopeng Tu 2024-10-28 arXiv https://github.com/hexuandeng/NewTerm http://arxiv.org/abs/2410.20814v1
93 Instruction-Tuned LLMs Succeed in Document-Level MT Without Fine-Tuning -- But BLEU Turns a Blind Eye Yirong Sun, Dawei Zhu, Yanjun Chen, Erjia Xiao, Xinghao Chen, Xiaoyu Shen 2024-10-28 arXiv https://github.com/EIT-NLP/BLEUless_DocMT http://arxiv.org/abs/2410.20941v2
94 Simple is Effective: The Roles of Graphs and Large Language Models in Knowledge-Graph-Based Retrieval-Augmented Generation Mufei Li, Siqi Miao, Pan Li 2024-10-28 arXiv https://github.com/Graph-COM/SubgraphRAG http://arxiv.org/abs/2410.20724v1
95 Hacking Back the AI-Hacker: Prompt Injection as a Defense Against LLM-driven Cyberattacks Dario Pasquini, Evgenios M. Kornaropoulos, Giuseppe Ateniese 2024-10-28 arXiv https://github.com/pasquini-dario/project_mantis http://arxiv.org/abs/2410.20911v1
96 Learning from Response not Preference: A Stackelberg Approach for LLM Detoxification using Non-parallel Data Xinhong Xie, Tao Li, Quanyan Zhu 2024-10-27 arXiv https://github.com/XXXinhong/Detoxification_LLM http://arxiv.org/abs/2410.20298v1
97 LLMs Can Evolve Continually on Modality for X-Modal Reasoning Jiazuo Yu, Haomiao Xiong, Lu Zhang, Haiwen Diao, Yunzhi Zhuge, Lanqing Hong, Dong Wang, Huchuan Lu, You He, Long Chen 2024-10-26 arXiv https://github.com/JiazuoYu/PathWeave http://arxiv.org/abs/2410.20178v1
98 Developing Retrieval Augmented Generation (RAG) based LLM Systems from PDFs: An Experience Report Ayman Asad Khan, Md Toufique Hasan, Kai Kristian Kemell, Jussi Rasku, Pekka Abrahamsson 2024-10-26 arXiv e …, 2024 https://github.com/GPT-Laboratory/RAG-LLM-Development-Guidebook-from-PDFs http://arxiv.org/abs/2410.15944v1
99 Enhancing Inflation Nowcasting with LLM: Sentiment Analysis on News Marc-Antoine Allard, Paul Teiletche, Adam Zinebi 2024-10-26 arXiv https://github.com/paultltc/InflaBERT http://arxiv.org/abs/2410.20198v1
100 Language Agents Meet Causality -- Bridging LLMs and Causal World Models John Gkountouras, Matthias Lindemann, Phillip Lippe, Efstratios Gavves, Ivan Titov 2024-10-25 arXiv https://j0hngou.github.io/LLMCWM/ http://arxiv.org/abs/2410.19923v1
101 Iterative Self-Tuning LLMs for Enhanced Jailbreaking Capabilities Chung-En Sun, Xiaodong Liu, Weiwei Yang, Tsui-Wei Weng, Hao Cheng, Aidan San, Michel Galley, Jianfeng Gao 2024-10-25 arXiv …, 2024 https://github.com/SunChungEn/ADV-LLM http://arxiv.org/abs/2410.18469v1
102 APRICOT: Active Preference Learning and Constraint-Aware Task Planning with LLMs Huaxiaoyue Wang, Nathaniel Chin, Gonzalo Gonzalez-Pumariega, Xiangwan Sun, Neha Sunkara, Maximus Adrian Pace, Jeannette Bohg, Sanjiban Choudhury 2024-10-25 arXiv https://portal-cornell.github.io/apricot/ http://arxiv.org/abs/2410.19656v1
103 Distill Visual Chart Reasoning Ability from LLMs to MLLMs Wei He, Zhiheng Xi, Wanxu Zhao, Xiaoran Fan, Yiwen Ding, Zifei Shan, Tao Gui, Qi Zhang, Xuanjing Huang 2024-10-24 arXiv https://github.com/hewei2001/ReachQA http://arxiv.org/abs/2410.18798v1
104 GCoder: Improving Large Language Model for Generalized Graph Problem Solving Qifan Zhang, Xiaobin Hong, Jianheng Tang, Nuo Chen, Yuhan Li, Wenzhong Li, Jing Tang, Jia Li 2024-10-24 arXiv https://github.com/Bklight999/WWW25-GCoder/tree/master http://arxiv.org/abs/2410.19084v1
105 Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design Ruisi Cai, Yeonju Ro, Geon-Woo Kim, Peihao Wang, Babak Ehteshami Bejnordi, Aditya Akella, Zhangyang Wang 2024-10-24 arXiv https://github.com/VITA-Group/READ-ME http://arxiv.org/abs/2410.19123v1
106 AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models Kim Sung-Bin, Oh Hyun-Bin, JungMok Lee, Arda Senocak, Joon Son Chung, Tae-Hyun Oh 2024-10-23 arXiv https://github.com/AVHBench/AVHBench http://arxiv.org/abs/2410.18325v1
107 CoreInfer: Accelerating Large Language Model Inference with Semantics-Inspired Adaptive Sparse Activation Qinsi Wang, Saeed Vahidian, Hancheng Ye, Jianyang Gu, Jianyi Zhang, Yiran Chen 2024-10-23 arXiv https://wangqinsi1.github.io/coreinfer_page/ http://arxiv.org/abs/2410.18311v1
108 Cross-model Control: Improving Multiple Large Language Models in One-time Training Jiayi Wu, Hao Sun, Hengyi Cai, Lixin Su, Shuaiqiang Wang, Dawei Yin, Xiang Li, Ming Gao 2024-10-23 arXiv https://github.com/wujwyi/CMC http://arxiv.org/abs/2410.17599v1
109 VoiceBench: Benchmarking LLM-Based Voice Assistants Yiming Chen, Xianghu Yue, Chen Zhang, Xiaoxue Gao, Robby T. Tan, Haizhou Li 2024-10-22 arXiv https://github.com/MatthewCYM/VoiceBench http://arxiv.org/abs/2410.17196v1
110 CoPS: Empowering LLM Agents with Provable Cross-Task Experience Sharing Chen Yang, Chenyang Zhao, Quanquan Gu, Dongruo Zhou 2024-10-22 arXiv https://github.com/uclaml/COPS http://arxiv.org/abs/2410.16670v1
111 Improving Causal Reasoning in Large Language Models: A Survey Longxuan Yu, Delin Chen, Siheng Xiong, Qingyang Wu, Qingzhen Liu, Dawei Li, Zhikai Chen, Xiaoze Liu, Liangming Pan 2024-10-22 arXiv https://github.com/chendl02/Awesome-LLM-causal-reasoning http://arxiv.org/abs/2410.16676v3
112 Large Language Models Empowered Personalized Web Agents Hongru Cai, Yongqi Li, Wenjie Wang, Fengbin Zhu, Xiaoyu Shen, Wenjie Li, Tat-Seng Chua 2024-10-22 arXiv https://hongrucai.github.io/PersonalWAB/ http://arxiv.org/abs/2410.17236v1
113 AMUSD: Asynchronous Multi-Device Speculative Decoding for LLM Acceleration Bradley McDanel 2024-10-22 arXiv https://github.com/BradMcDanel/AMUSD/ http://arxiv.org/abs/2410.17375v1
114 ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage Taewhoo Lee, Chanwoong Yoon, Kyochul Jang, Donghyeon Lee, Minju Song, Hyunjae Kim, Jaewoo Kang 2024-10-22 arXiv https://github.com/dmis-lab/ETHIC http://arxiv.org/abs/2410.16848v1
115 Automated Spinal MRI Labelling from Reports Using a Large Language Model Robin Y. Park, Rhydian Windsor, Amir Jamaludin, Andrew Zisserman 2024-10-22 MICCAI https://github.com/robinyjpark/AutoLabelClassifier https://doi.org/10.1007/978-3-031-72086-4_10
116 DEAN: Deactivating the Coupled Neurons to Mitigate Fairness-Privacy Conflicts in Large Language Models Chen Qian, Dongrui Liu, Jie Zhang, Yong Liu, Jing Shao 2024-10-22 arXiv https://github.com/ChnQ/DEAN http://arxiv.org/abs/2410.16672v1
117 Boosting Jailbreak Transferability for Large Language Models Hanqing Liu, Lifeng Zhou, Huanqian Yan 2024-10-21 arXiv https://github.com/HqingLiu/SI-GCG http://arxiv.org/abs/2410.15645v1
118 CausalGraph2LLM: Evaluating LLMs for Causal Queries Ivaxi Sheth, Bahare Fatemi, Mario Fritz 2024-10-21 arXiv https://github.com/ivaxi0s/CausalGraph2LLM http://arxiv.org/abs/2410.15939v1
119 LLaVA-KD: A Framework of Distilling Multimodal Large Language Models Yuxuan Cai, Jiangning Zhang, Haoyang He, Xinwei He, Ao Tong, Zhenye Gan, Chengjie Wang, Xiang Bai 2024-10-21 arXiv https://github.com/Fantasyele/LLaVA-KD http://arxiv.org/abs/2410.16236v2
120 MagicPIG: LSH Sampling for Efficient LLM Generation Zhuoming Chen, Ranajoy Sadhukhan, Zihao Ye, Yang Zhou, Jianyu Zhang, Niklas Nolte, Yuandong Tian, Matthijs Douze, Leon Bottou, Zhihao Jia, Beidi Chen 2024-10-21 arXiv https://github.com/Infini-AI-Lab/MagicPIG http://arxiv.org/abs/2410.16179v1
121 Mesa-Extrapolation: A Weave Position Encoding Method for Enhanced Extrapolation in LLMs Xin Ma, Yang Liu, Jingjing Liu, Xiaoxu Ma 2024-10-21 arXiv https://github.com/soacker/Mesa-Extrapolation http://arxiv.org/abs/2410.15859v3
122 RAC: Efficient LLM Factuality Correction with Retrieval Augmentation Changmao Li, Jeffrey Flanigan 2024-10-21 arXiv https://github.com/jlab-nlp/Retrieval-Augmented-Correction http://arxiv.org/abs/2410.15667v1
123 A Comprehensive Evaluation of Cognitive Biases in LLMs Simon Malberg, Roman Poletukhin, Carolin M. Schuster, Georg Groh 2024-10-20 arXiv https://github.com/simonmalberg/cognitive-biases-in-llms http://arxiv.org/abs/2410.15413v1
124 Explaining Graph Neural Networks with Large Language Models: A Counterfactual Perspective for Molecular Property Prediction Yinhan He, Zaiyi Zheng, Patrick Soga, Yaozhen Zhu, Yushun Dong, Jundong Li 2024-10-19 EMNLP 2024 (Findings) https://github.com/YinhanHe123/new_LLM4GNNExplanation http://arxiv.org/abs/2410.15165v1
125 Mining Glitch Tokens in Large Language Models via Gradient-based Discrete Optimization Zihui Wu, Haichang Gao, Ping Wang, Shudong Zhang, Zhaoxiang Liu, Shiguo Lian 2024-10-19 arXiv https://github.com/wooozihui/GlitchMiner http://arxiv.org/abs/2410.15052v2
126 Imprompter: Tricking LLM Agents into Improper Tool Use Xiaohan Fu, Shuheng Li, Zihan Wang, Yihao Liu, Rajesh K. Gupta, Taylor Berg-Kirkpatrick, Earlence Fernandes 2024-10-19 arXiv https://github.com/Reapor-Yurnero/imprompter http://arxiv.org/abs/2410.14923v2
127 MCCoder: Streamlining Motion Control with LLM-Assisted Code Generation and Rigorous Verification Yin Li, Liangwei Wang, Shiyuan Piao, Boo-Ho Yang, Ziyue Li, Wei Zeng, Fugee Tsung 2024-10-19 arXiv https://github.com/MCCodeAI/MCCoder http://arxiv.org/abs/2410.15154v1
128 Are LLMs Good Zero-Shot Fallacy Classifiers? Fengjun Pan, Xiaobao Wu, Zongrui Li, Anh Tuan Luu 2024-10-19 arXiv https://github.com/panFJCharlotte98/Fallacy_Detection http://arxiv.org/abs/2410.15050v1
129 Evaluating Deep Unlearning in Large Language Models Ruihan Wu, Chhavi Yadav, Russ Salakhutdinov, Kamalika Chaudhuri 2024-10-19 arXiv https://github.com/wrh14/deep_unlearning http://arxiv.org/abs/2410.15153v1
130 GlitchMiner: Mining Glitch Tokens in Large Language Models via Gradient-based Discrete Optimization Zihui Wu, Haichang Gao, Ping Wang, Shudong Zhang, Zhaoxiang Liu, Shiguo Lian 2024-10-19 arXiv https://github.com/wooozihui/GlitchMiner http://arxiv.org/abs/2410.15052v4
131 SRAP-Agent: Simulating and Optimizing Scarce Resource Allocation Policy with LLM-based Agent Jiarui Ji, Yang Li, Hongtao Liu, Zhicheng Du, Zhewei Wei, Weiran Shen, Qi Qi, Yankai Lin 2024-10-18 arXiv https://github.com/jijiarui-cather/SRAPAgent_Framework http://arxiv.org/abs/2410.14152v1
132 Synthesizing Post-Training Data for LLMs through Multi-Agent Simulation Shuo Tang, Xianghe Pang, Zexi Liu, Bohan Tang, Rui Ye, Xiaowen Dong, Yanfeng Wang, Siheng Chen 2024-10-18 arXiv https://github.com/ShuoTang123/MATRIX-Gen http://arxiv.org/abs/2410.14251v1
133 Towards Faithful Natural Language Explanations: A Study Using Activation Patching in Large Language Models Wei Jie Yeo, Ranjan Satapathy, Erik Cambria 2024-10-18 arXiv https://github.com/wj210/Causal-Faithfulness http://arxiv.org/abs/2410.14155v2
134 REEF: Representation Encoding Fingerprints for Large Language Models Jie Zhang, Dongrui Liu, Chen Qian, Linfeng Zhang, Yong Liu, Yu Qiao, Jing Shao 2024-10-18 arXiv https://github.com/tmylla/REEF http://arxiv.org/abs/2410.14273v1
135 Enabling Scalable Evaluation of Bias Patterns in Medical LLMs Hamed Fayyaz, Raphael Poulain, Rahmatollah Beheshti 2024-10-18 arXiv https://github.com/healthylaife/autofair http://arxiv.org/abs/2410.14763v1
136 CoMAL: Collaborative Multi-Agent Large Language Models for Mixed-Autonomy Traffic Huaiyuan Yao, Longchao Da, Vishnu Nandam, Justin Turnau, Zhiwei Liu, Linsey Pang, Hua Wei 2024-10-18 arXiv https://github.com/Hyan-Yao/CoMAL http://arxiv.org/abs/2410.14368v1
137 Retrieval-Augmented Personalization for Multimodal Large Language Models Haoran Hao, Jiaming Han, Changsheng Li, Yu-Feng Li, Xiangyu Yue 2024-10-17 arXiv https://github.com/Hoar012/RAP-MLLM http://arxiv.org/abs/2410.13360v2
138 Do LLMs Overcome Shortcut Learning? An Evaluation of Shortcut Challenges in Large Language Models Yu Yuan, Lili Zhao, Kai Zhang, Guangting Zheng, Qi Liu 2024-10-17 EMNLP https://github.com/yyhappier/ShortcutSuite https://aclanthology.org/2024.emnlp-main.679
139 Data Defenses Against Large Language Models William Agnew, Harry H. Jiang, Cella Sum, Maarten Sap, Sauvik Das 2024-10-17 arXiv https://github.com/wagnew3/LLMDataDefenses http://arxiv.org/abs/2410.13138v1
140 FaithBench: A Diverse Hallucination Benchmark for Summarization by Modern LLMs Forrest Sheng Bao, Miaoran Li, Renyi Qu, Ge Luo, Erana Wan, Yujia Tang, Weisi Fan, Manveer Singh Tamber, Suleman Kazi, Vivek Sourabh, Mike Qi, Ruixuan Tu, Chenyu Xu, Matthew Gonzales, Ofer Mendelevitch, Amin Ahmad 2024-10-17 arXiv https://github.com/vectara/FaithBench http://arxiv.org/abs/2410.13210v1
141 SLM-Mod: Small Language Models Surpass LLMs at Content Moderation Xianyang Zhan, Agam Goyal, Yilun Chen, Eshwar Chandrasekharan, Koustuv Saha 2024-10-17 arXiv https://github.com/AGoyal0512/SLM-Mod http://arxiv.org/abs/2410.13155v1
142 aiXcoder-7B: A Lightweight and Effective Large Language Model for Code Completion Siyuan Jiang, Jia Li, He Zong, Huanyu Liu, Hao Zhu, Shukai Hu, Erlu Li, Jiazheng Ding, Yu Han, Wei Ning, Gen Wang, Yihong Dong, Kechi Zhang, Ge Li 2024-10-17 arXiv https://github.com/aixcoder-plugin/aiXcoder-7B/tree/main http://arxiv.org/abs/2410.13187v1
143 Self-Pluralising Culture Alignment for Large Language Models Shaoyang Xu, Yongqi Leng, Linhao Yu, Deyi Xiong 2024-10-16 arXiv https://github.com/shaoyangxu/CultureSPA http://arxiv.org/abs/2410.12971v1
144 Qtok: A Comprehensive Framework for Evaluating Multilingual Tokenizer Quality in Large Language Models Iaroslav Chelombitko, Egor Safronov, Aleksey Komissarov 2024-10-16 arXiv https://github.com/nup-csai/Qtok/ http://arxiv.org/abs/2410.12989v1
145 Neuron-based Personality Trait Induction in Large Language Models Jia Deng, Tianyi Tang, Yanbin Yin, Wenhao Yang, Wayne Xin Zhao, Ji-Rong Wen 2024-10-16 arXiv https://github.com/RUCAIBox/NPTI http://arxiv.org/abs/2410.12327v1
146 Semantics-Adaptive Activation Intervention for LLMs via Dynamic Steering Vectors Weixuan Wang, Jingyuan Yang, Wei Peng 2024-10-16 arXiv https://github.com/weixuan-wang123/SADI http://arxiv.org/abs/2410.12299v1
147 POROver: Improving Safety and Reducing Overrefusal in Large Language Models with Overgeneration and Preference Optimization Batuhan K. Karaman, Ishmam Zabir, Alon Benhaim, Vishrav Chaudhary, Mert R. Sabuncu, Xia Song 2024-10-16 arXiv https://github.com/batuhankmkaraman/POROver http://arxiv.org/abs/2410.12999v1
148 ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs Jingming Zhuo, Songyang Zhang, Xinyu Fang, Haodong Duan, Dahua Lin, Kai Chen 2024-10-16 arXiv https://github.com/open-compass/ProSA http://arxiv.org/abs/2410.12405v1
149 Hypothesis Testing the Circuit Hypothesis in LLMs Claudia Shi, Nicolas Beltran-Velez, Achille Nazaret, Carolina Zheng, Adrià Garriga-Alonso, Andrew Jesson, Maggie Makar, David M. Blei 2024-10-16 arXiv https://github.com/blei-lab/circuitry http://arxiv.org/abs/2410.13032v1
150 DAQ: Density-Aware Post-Training Weight-Only Quantization For LLMs Yingsong Luo, Ling Chen 2024-10-16 arXiv https://github.com/LuoYingSong/DAQ http://arxiv.org/abs/2410.12187v2
151 Codellm-Devkit: A Framework for Contextualizing Code LLMs with Program Analysis Insights Rahul Krishna, Rangeet Pan, Raju Pavuluri, Srikanth Tamilselvam, Maja Vukovic, Saurabh Sinha 2024-10-16 arXiv https://github.com/IBM/codellm-devkit http://arxiv.org/abs/2410.13007v1
152 HerO at AVeriTeC: The Herd of Open Large Language Models for Verifying Real-World Claims Yejun Yoon, Jaeyoon Jung, Seunghyun Yoon, Kunwoo Park 2024-10-16 arXiv https://github.com/ssu-humane/HerO http://arxiv.org/abs/2410.12377v1
153 Exploring Model Kinship for Merging Large Language Models Yedi Hu, Yunzhi Yao, Ningyu Zhang, Shumin Deng, Huajun Chen 2024-10-16 arXiv https://github.com/zjunlp/ModelKinship http://arxiv.org/abs/2410.12613v1
154 Bridging the Language Gaps in Large Language Models with Inference-Time Cross-Lingual Intervention Weixuan Wang, Minghao Wu, Barry Haddow, Alexandra Birch 2024-10-16 arXiv https://github.com/weixuan-wang123/INCLINE http://arxiv.org/abs/2410.12462v1
155 Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models Kai Yao, Penglei Gao, Lichun Li, Yuan Zhao, Xiaofeng Wang, Wei Wang, Jianke Zhu 2024-10-15 EMNLP https://github.com/Kaiseem/IST https://aclanthology.org/2024.findings-emnlp.109
156 Automatically Generating Visual Hallucination Test Cases for Multimodal Large Language Models Zhongye Liu, Hongbin Liu, Yuepeng Hu, Zedian Shao, Neil Zhenqiang Gong 2024-10-15 arXiv https://github.com/lycheeefish/VHExpansion http://arxiv.org/abs/2410.11242v1
157 LLM2Swarm: Robot Swarms that Responsively Reason, Plan, and Collaborate through LLMs Volker Strobel, Marco Dorigo, Mario Fritz 2024-10-15 arXiv https://github.com/Pold87/LLM2Swarm http://arxiv.org/abs/2410.11387v2
158 Subspace Optimization for Large Language Models with Convergence Guarantees Yutong He, Pengrui Li, Yipeng Hu, Chuyan Chen, Kun Yuan 2024-10-15 arXiv https://github.com/pkumelon/Golore http://arxiv.org/abs/2410.11289v1
159 Zero-shot Model-based Reinforcement Learning using Large Language Models Abdelhakim Benechehab, Youssef Attia El Hili, Ambroise Odonnat, Oussama Zekri, Albert Thomas, Giuseppe Paolo, Maurizio Filippone, Ievgen Redko, Balázs Kégl 2024-10-15 arXiv https://github.com/abenechehab/dicl http://arxiv.org/abs/2410.11711v1
160 DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads Guangxuan Xiao, Jiaming Tang, Jingwei Zuo, Junxian Guo, Shang Yang, Haotian Tang, Yao Fu, Song Han 2024-10-14 arXiv https://github.com/mit-han-lab/duo-attention http://arxiv.org/abs/2410.10819v1
161 MentalGLM Series: Explainable Large Language Models for Mental Health Analysis on Chinese Social Media Wei Zhai, Nan Bai, Qing Zhao, Jianqiang Li, Fan Wang, Hongzhi Qi, Meng Jiang, Xiaoqin Wang, Bing Xiang Yang, Guanghui Fu 2024-10-14 arXiv https://github.com/zwzzzQAQ/MentalGLM http://arxiv.org/abs/2410.10323v1
162 Locking Down the Finetuned LLMs Safety Minjun Zhu, Linyi Yang, Yifan Wei, Ningyu Zhang, Yue Zhang 2024-10-14 arXiv https://github.com/zhu-minjun/SafetyLock http://arxiv.org/abs/2410.10343v1
163 Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free Ziyue Li, Tianyi Zhou 2024-10-14 arXiv https://github.com/tianyi-lab/MoE-Embedding http://arxiv.org/abs/2410.10814v2
164 Derail Yourself: Multi-turn LLM Jailbreak Attack through Self-discovered Clues Qibing Ren, Hao Li, Dongrui Liu, Zhanxu Xie, Xiaoya Lu, Yu Qiao, Lei Sha, Junchi Yan, Lizhuang Ma, Jing Shao 2024-10-14 arXiv https://github.com/renqibing/ActorAttack http://arxiv.org/abs/2410.10700v1
165 AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models Haiquan Lu, Yefan Zhou, Shiwei Liu, Zhangyang Wang, Michael W. Mahoney, Yaoqing Yang 2024-10-14 arXiv https://github.com/haiquanlu/AlphaPruning https://doi.org/10.48550/arXiv.2410.10912
166 Large Language Model Evaluation via Matrix Nuclear-Norm Yahan Li, Tingyu Xia, Yi Chang, Yuan Wu 2024-10-14 arXiv https://github.com/MLGroupJLU/MatrixNuclearNorm https://doi.org/10.48550/arXiv.2410.10672
167 One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks Fangru Lin, Shaoguang Mao, Emanuele La Malfa, Valentin Hofmann, Adrian de Wynter, Jing Yao, Si-Qing Chen, Michael Wooldridge, Furu Wei 2024-10-14 arXiv https://github.com/fangru-lin/redial_dialect_robustness_fairness https://doi.org/10.48550/arXiv.2410.11005
168 RMB: Comprehensively Benchmarking Reward Models in LLM Alignment Enyu Zhou, Guodong Zheng, Binghai Wang, Zhiheng Xi, Shihan Dou, Rong Bao, Wei Shen, Limao Xiong, Jessica Fan, Yurong Mou, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang 2024-10-13 arXiv https://github.com/Zhou-Zoey/RMB-Reward-Model-Benchmark http://arxiv.org/abs/2410.09893v1
169 LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models Han Qiu, Jiaxing Huang, Peng Gao, Qin Qi, Xiaoqin Zhang, Ling Shao, Shijian Lu 2024-10-13 arXiv https://github.com/hanqiu-hq/LongHalQA http://arxiv.org/abs/2410.09962v2
170 LLM$\times$MapReduce: Simplified Long-Sequence Processing using Large Language Models Zihan Zhou, Chong Li, Xinyi Chen, Shuo Wang, Yu Chao, Zhili Li, Haoyu Wang, Rongqiao An, Qi Shi, Zhixing Tan, Xu Han, Xiaodong Shi, Zhiyuan Liu, Maosong Sun 2024-10-12 arXiv https://github.com/thunlp/LLMxMapReduce http://arxiv.org/abs/2410.09342v1
171 ReLU's Revival: On the Entropic Overload in Normalization-Free Large Language Models Nandan Kumar Jha, Brandon Reagen 2024-10-12 arXiv https://github.com/Nandan91/relu-revival-normfree http://arxiv.org/abs/2410.09637v1
172 OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models Jun Wang, Meng Fang, Ziyu Wan, Muning Wen, Jiachen Zhu, Anjie Liu, Ziqin Gong, Yan Song, Lei Chen, Lionel M. Ni, Linyi Yang, Ying Wen, Weinan Zhang 2024-10-12 arXiv https://openreasoner.github.io http://arxiv.org/abs/2410.09671v1
173 Skipping Computations in Multimodal LLMs Mustafa Shukor, Matthieu Cord 2024-10-12 arXiv https://github.com/mshukor/ima-lmms http://arxiv.org/abs/2410.09454v1
174 FlatQuant: Flatness Matters for LLM Quantization Yuxuan Sun, Ruikang Liu, Haoli Bai, Han Bao, Kang Zhao, Yuening Li, Jiaxin Hu, Xianzhi Yu, Lu Hou, Chun Yuan, Xin Jiang, Wulong Liu, Jun Yao 2024-10-12 arXiv https://github.com/ruikangliu/FlatQuant http://arxiv.org/abs/2410.09426v1
175 FB-Bench: A Fine-Grained Multi-Task Benchmark for Evaluating LLMs' Responsiveness to Human Feedback Youquan Li, Miao Zheng, Fan Yang, Guosheng Dong, Bin Cui, Weipeng Chen, Zenan Zhou, Wentao Zhang 2024-10-12 arXiv https://github.com/PKU-Baichuan-MLSystemLab/FB-Bench http://arxiv.org/abs/2410.09412v1
176 ELICIT: LLM Augmentation via External In-Context Capability Futing Wang, Jianhao Yan, Yue Zhang, Tao Lin 2024-10-12 arXiv https://github.com/LINs-lab/ELICIT http://arxiv.org/abs/2410.09343v1
177 MMAD: The First-Ever Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection Xi Jiang, Jian Li, Hanqiu Deng, Yong Liu, Bin-Bin Gao, Yifeng Zhou, Jialin Li, Chengjie Wang, Feng Zheng 2024-10-12 arXiv https://github.com/jam-cc/MMAD http://arxiv.org/abs/2410.09453v1
178 Dual-AEB: Synergizing Rule-Based and Multimodal Large Language Models for Effective Emergency Braking Wei Zhang, Pengfei Li, Junli Wang, Bingchuan Sun, Qihao Jin, Guangjun Bao, Shibo Rui, Yang Yu, Wenchao Ding, Peng Li, Yilun Chen 2024-10-11 arXiv https://github.com/ChipsICU/Dual-AEB https://doi.org/10.48550/arXiv.2410.08616
179 AttnGCG: Enhancing Jailbreaking Attacks on LLMs with Attention Manipulation Zijun Wang, Haoqin Tu, Jieru Mei, Bingchen Zhao, Yisen Wang, Cihang Xie 2024-10-11 arXiv https://github.com/UCSC-VLAA/AttnGCG-attack http://arxiv.org/abs/2410.09040v1
180 QEFT: Quantization for Efficient Fine-Tuning of LLMs Changhun Lee, Jun-gyu Jin, Younghyun Cho, Eunhyeok Park 2024-10-11 arXiv https://github.com/xvyaward/qeft http://arxiv.org/abs/2410.08661v1
181 Understanding the Interplay between Parametric and Contextual Knowledge for Large Language Models Sitao Cheng, Liangming Pan, Xunjian Yin, Xinyi Wang, William Yang Wang 2024-10-10 arXiv https://github.com/sitaocheng/Knowledge_Interplay https://doi.org/10.48550/arXiv.2410.08414
182 VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models Lisa Dunlap, Krishna Mandal, Trevor Darrell, Jacob Steinhardt, Joseph E Gonzalez 2024-10-10 arXiv https://github.com/lisadunlap/VibeCheck http://arxiv.org/abs/2410.12851v1
183 Towards Next-Generation LLM-based Recommender Systems: A Survey and Beyond Qi Wang, Jindong Li, Shiqi Wang, Qianli Xing, Runliang Niu, He Kong, Rui Li, Guodong Long, Yi Chang, Chengqi Zhang 2024-10-10 arXiv https://github.com/jindongli-Ai/Next-Generation-LLM-based-Recommender-Systems-Survey http://arxiv.org/abs/2410.19744v1
184 StepTool: A Step-grained Reinforcement Learning Framework for Tool Learning in LLMs Yuanqing Yu, Zhefan Wang, Weizhi Ma, Zhicheng Guo, Jingtao Zhan, Shuai Wang, Chuhan Wu, Zhiqiang Guo, Min Zhang 2024-10-10 arXiv https://github.com/yuyq18/StepTool http://arxiv.org/abs/2410.07745v1
185 Reward-Augmented Data Enhances Direct Preference Alignment of LLMs Shenao Zhang, Zhihan Liu, Boyi Liu, Yufeng Zhang, Yingxiang Yang, Yongfei Liu, Liyu Chen, Tao Sun, Zhaoran Wang 2024-10-10 arXiv https://github.com/shenao-zhang/reward-augmented-preference http://arxiv.org/abs/2410.08067v1
186 Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System Weize Chen, Jiarui Yuan, Chen Qian, Cheng Yang, Zhiyuan Liu, Maosong Sun 2024-10-10 arXiv https://chenweize1998.github.io/optima-project-page http://arxiv.org/abs/2410.08115v1
187 Teaching-Inspired Integrated Prompting Framework: A Novel Approach for Enhancing Reasoning in Large Language Models Wenting Tan, Dongxiao Chen, Jieting Xue, Zihao Wang, Taijie Chen 2024-10-10 arXiv https://github.com/SallyTan13/Teaching-Inspired-Prompting https://doi.org/10.48550/arXiv.2410.08068
188 GameTraversalBenchmark: Evaluating Planning Abilities Of Large Language Models Through Traversing 2D Game Maps Muhammad Umair Nasir, Steven James, Julian Togelius 2024-10-10 arXiv https://github.com/umair-nasir14/Game-Traversal-Benchmark https://doi.org/10.48550/arXiv.2410.07765
189 Extracting and Transferring Abilities For Building Multi-lingual Ability-enhanced Large Language Models Zhipeng Chen, Liang Song, Kun Zhou, Wayne Xin Zhao, Bingning Wang, Weipeng Chen, Ji-Rong Wen 2024-10-10 arXiv https://github.com/RUCAIBox/MAET https://doi.org/10.48550/arXiv.2410.07825
190 A Closer Look at Machine Unlearning for Large Language Models Xiaojian Yuan, Tianyu Pang, Chao Du, Kejiang Chen, Weiming Zhang, Min Lin 2024-10-10 arXiv https://github.com/sail-sg/closer-look-LLM-unlearning https://doi.org/10.48550/arXiv.2410.08109
191 Privately Learning from Graphs with Applications in Fine-tuning Large Language Models Haoteng Yin, Rongzhe Wei, Eli Chien, Pan Li 2024-10-10 arXiv https://github.com/Graph-COM/PvGaLM https://doi.org/10.48550/arXiv.2410.08299
192 IterGen: Iterative Structured LLM Generation Shubham Ugare, Rohan Gumaste, Tarun Suresh, Gagandeep Singh, Sasa Misailovic 2024-10-09 arXiv https://github.com/uiuc-arc/itergen http://arxiv.org/abs/2410.07295v1
193 WALL-E: World Alignment by Rule Learning Improves World Model-based LLM Agents Siyu Zhou, Tianyi Zhou, Yijun Yang, Guodong Long, Deheng Ye, Jing Jiang, Chengqi Zhang 2024-10-09 arXiv https://github.com/elated-sawyer/WALL-E http://arxiv.org/abs/2410.07484v2
194 Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning Chongyu Fan, Jiancheng Liu, Licong Lin, Jinghan Jia, Ruiqi Zhang, Song Mei, Sijia Liu 2024-10-09 arXiv https://github.com/OPTML-Group/Unlearn-Simple http://arxiv.org/abs/2410.07163v1
195 Weak-eval-Strong: Evaluating and Eliciting Lateral Thinking of LLMs with Situation Puzzles Qi Chen, Bowen Zhang, Gang Wang, Qi Wu 2024-10-09 arXiv https://github.com/chenqi008/LateralThinking http://arxiv.org/abs/2410.06733v1
196 Enhancing Multimodal LLM for Detailed and Accurate Video Captioning using Multi-Round Preference Optimization Changli Tang, Yixuan Li, Yudong Yang, Jimin Zhuang, Guangzhi Sun, Wei Li, Zujun Ma, Chao Zhang 2024-10-09 arXiv https://video-salmonn-2.github.io http://arxiv.org/abs/2410.06682v1
197 Dissecting Fine-Tuning Unlearning in Large Language Models Yihuai Hong, Yuelin Zou, Lijie Hu, Ziqian Zeng, Di Wang, Haiqin Yang 2024-10-09 EMNLP https://github.com/yihuaihong/Dissecting-FT-Unlearning https://aclanthology.org/2024.emnlp-main.228
198 CoBa: Convergence Balancer for Multitask Finetuning of Large Language Models Zi Gong, Hang Yu, Cong Liao, Bingchang Liu, Chaoyu Chen, Jianguo Li 2024-10-09 EMNLP https://github.com/codefuse-ai/MFTCoder https://aclanthology.org/2024.emnlp-main.459
199 AgentSquare: Automatic LLM Agent Search in Modular Design Space Yu Shang, Yu Li, Keyu Zhao, Likai Ma, Jiahe Liu, Fengli Xu, Yong Li 2024-10-08 arXiv https://github.com/tsinghua-fib-lab/AgentSquare http://arxiv.org/abs/2410.06153v1
200 GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models Muhammad Jehanzeb Mirza, Mengjie Zhao, Zhuoyuan Mao, Sivan Doveh, Wei Lin, Paul Gavrikov, Michael Dorkenwald, Shiqi Yang, Saurav Jha, Hiromi Wakaki, Yuki Mitsufuji, Horst Possegger, Rogério Feris, Leonid Karlinsky, James R. Glass 2024-10-08 arXiv https://github.com/jmiemirza/GLOV https://doi.org/10.48550/arXiv.2410.06154
201 Enhancing Temporal Modeling of Video LLMs via Time Gating Zi-Yuan Hu, Yiwu Zhong, Shijia Huang, Michael R. Lyu, Liwei Wang 2024-10-08 arXiv https://github.com/LaVi-Lab/TG-Vid http://arxiv.org/abs/2410.05714v1
202 MEXA: Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment Amir Hossein Kargaran, Ali Modarressi, Nafiseh Nikeghbal, Jana Diesner, François Yvon, Hinrich Schütze 2024-10-08 arXiv https://github.com/cisnlp/Mexa http://arxiv.org/abs/2410.05873v1
203 ToolBridge: An Open-Source Dataset to Equip LLMs with External Tool Capabilities Zhenchao Jin, Mengchen Liu, Dongdong Chen, Lingting Zhu, Yunsheng Li, Lequan Yu 2024-10-08 arXiv https://github.com/CharlesPikachu/ToolBridge http://arxiv.org/abs/2410.10872v1
204 Aligning LLMs to Be Robust Against Prompt Injection Sizhe Chen, Arman Zharmagambetov, Saeed Mahloujifar, Kamalika Chaudhuri, Chuan Guo 2024-10-07 arXiv https://github.com/facebookresearch/SecAlign http://arxiv.org/abs/2410.05451v1
205 PrefixQuant: Static Quantization Beats Dynamic through Prefixed Outliers in LLMs Mengzhao Chen, Yi Liu, Jiahao Wang, Yi Bin, Wenqi Shao, Ping Luo 2024-10-07 arXiv https://github.com/ChenMnZ/PrefixQuant http://arxiv.org/abs/2410.05265v1
206 Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild Xinyu Zhao, Guoheng Sun, Ruisi Cai, Yukun Zhou, Pingzhi Li, Peihao Wang, Bowen Tan, Yexiao He, Li Chen, Yi Liang, Beidi Chen, Binhang Yuan, Hongyi Wang, Ang Li, Zhangyang Wang, Tianlong Chen 2024-10-07 arXiv https://github.com/Model-GLUE/Model-GLUE http://arxiv.org/abs/2410.05357v1
207 Can LLMs Understand Time Series Anomalies? Zihao Zhou, Rose Yu 2024-10-07 arXiv https://github.com/Rose-STL-Lab/AnomLLM/ http://arxiv.org/abs/2410.05440v2
208 Better than Your Teacher: LLM Agents that learn from Privileged AI Feedback Sanjiban Choudhury, Paloma Sodhi 2024-10-07 arXiv https://leap-llm.github.io http://arxiv.org/abs/2410.05434v1
209 Synthesizing Interpretable Control Policies through Large Language Model Guided Search Carlo Bosio, Mark W. Mueller 2024-10-07 arXiv https://github.com/muellerlab/synthesizing_interpretable_control_policies https://doi.org/10.48550/arXiv.2410.05406
210 Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality Guanyu Zhou, Yibo Yan, Xin Zou, Kun Wang, Aiwei Liu, Xuming Hu 2024-10-07 arXiv https://github.com/The-Martyr/CausalMM https://doi.org/10.48550/arXiv.2410.04780
211 Intriguing Properties of Large Language and Vision Models Young-Jun Lee, Byungsoo Ko, Han-Gyu Kim, Yechan Hwang, Ho-Jin Choi 2024-10-07 arXiv https://github.com/passing2961/IP-LLVM https://doi.org/10.48550/arXiv.2410.04751
212 Narrative-of-Thought: Improving Temporal Reasoning of Large Language Models via Recounted Narratives Xinliang Frederick Zhang, Nicholas Beauchamp, Lu Wang 2024-10-07 EMNLP https://github.com/launchnlp/NoT https://aclanthology.org/2024.findings-emnlp.963
213 Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models Fei Wang, Ninareh Mehrabi, Palash Goyal, Rahul Gupta, Kai-Wei Chang, Aram Galstyan 2024-10-07 EMNLP https://feiwang96.github.io/DataAdvisor/ https://aclanthology.org/2024.emnlp-main.461
214 CogDevelop2K: Reversed Cognitive Development in Multimodal Large Language Models Yijiang Li, Qingying Gao, Haoran Sun, Haiyun Lyu, Dezhi Luo, Hokin Deng 2024-10-06 arXiv https://growing-ai-like-a-child.github.io/ https://doi.org/10.48550/arXiv.2410.10855
215 Leveraging Large Language Models for Suicide Detection on Social Media with Limited Labels Vy Nguyen, Chau Pham 2024-10-06 arXiv https://github.com/khanhvynguyen/Suicide_Detection_LLMs https://doi.org/10.48550/arXiv.2410.04501
216 MindScope: Exploring Cognitive Biases in Large Language Models Through Multi-Agent Systems Zhentao Xie, Jiabao Zhao, Yilei Wang, Jinxin Shi, Yanhong Bai, Xingjiao Wu, Liang He 2024-10-06 ECAI https://github.com/2279072142/MindScope https://doi.org/10.3233/FAIA240879
217 CS4: Measuring the Creativity of Large Language Models Automatically by Controlling the Number of Story-Writing Constraints Anirudh Atmakuru, Jatin Nainani, Rohith Siddhartha Reddy Bheemreddy, Anirudh Lakkaraju, Zonghai Yao, Hamed Zamani, Haw-Shiuan Chang 2024-10-05 arXiv https://github.com/anirudhlakkaraju/cs4_benchmark https://doi.org/10.48550/arXiv.2410.04197
218 Neuron-Level Sequential Editing for Large Language Models Houcheng Jiang, Junfeng Fang, Tianyu Zhang, An Zhang, Ruipeng Wang, Tao Liang, Xiang Wang 2024-10-05 arXiv https://github.com/jianghoucheng/NSE https://doi.org/10.48550/arXiv.2410.04045
219 Self-Powered LLM Modality Expansion for Large Speech-Text Models Tengfei Yu, Xuebo Liu, Zhiyi Hou, Liang Ding, Dacheng Tao, Min Zhang 2024-10-04 arXiv https://github.com/ytf-philp/Self-powered-LSM http://arxiv.org/abs/2410.03798v2
220 Leveraging Social Determinants of Health in Alzheimer's Research Using LLM-Augmented Literature Mining and Knowledge Graphs Tianqi Shang, Shu Yang, Weiqing He, Tianhua Zhai, Dawei Li, Bojian Hou, Tianlong Chen, Jason H. Moore, Marylyn D. Ritchie, Li Shen 2024-10-04 arXiv https://github.com/hwq0726/SDoHenPKG http://arxiv.org/abs/2410.09080v1
221 LLM-TOPLA: Efficient LLM Ensemble by Maximising Diversity Selim Furkan Tekin, Fatih Ilhan, Tiansheng Huang, Sihao Hu, Ling Liu 2024-10-04 arXiv https://github.com/git-disl/llm-topla http://arxiv.org/abs/2410.03953v1
222 GraphRouter: A Graph-based Router for LLM Selections Tao Feng, Yanzhen Shen, Jiaxuan You 2024-10-04 arXiv https://github.com/ulab-uiuc/GraphRouter http://arxiv.org/abs/2410.03834v1
223 Aligning LLMs with Individual Preferences via Interaction Shujin Wu, May Fung, Cheng Qian, Jeonghwan Kim, Dilek Hakkani-Tur, Heng Ji 2024-10-04 arXiv https://github.com/ShujinWu-0814/ALOE http://arxiv.org/abs/2410.03642v1
224 Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents Hanrong Zhang, Jingyuan Huang, Kai Mei, Yifei Yao, Zhenting Wang, Chenlu Zhan, Hongwei Wang, Yongfeng Zhang 2024-10-04 arXiv …, 2024 https://github.com/agiresearch/ASB http://arxiv.org/abs/2410.02644v1
225 ARB-LLM: Alternating Refined Binarizations for Large Language Models Zhiteng Li, Xianglong Yan, Tianao Zhang, Haotong Qin, Dong Xie, Jiang Tian, Zhongchao Shi, Linghe Kong, Yulun Zhang, Xiaokang Yang 2024-10-04 arXiv https://github.com/ZHITENGLI/ARB-LLM https://doi.org/10.48550/arXiv.2410.03129
226 Steering Large Language Models between Code Execution and Textual Reasoning Yongchao Chen, Harsh Jhamtani, Srinagesh Sharma, Chuchu Fan, Chi Wang 2024-10-04 arXiv https://yongchao98.github.io/CodeSteer/ https://doi.org/10.48550/arXiv.2410.03524
227 A Probabilistic Perspective on Unlearning and Alignment for Large Language Models Yan Scholten, Stephan Günnemann, Leo Schwinn 2024-10-04 arXiv https://github.com/yascho/probabilistic-unlearning https://doi.org/10.48550/arXiv.2410.03523
228 Output Scouting: Auditing Large Language Models for Catastrophic Responses Andrew Bell, João Fonseca 2024-10-04 arXiv https://github.com/joaopfonseca/outputscouting https://doi.org/10.48550/arXiv.2410.05305
229 PersonalSum: A User-Subjective Guided Personalized Summarization Dataset for Large Language Models Lemei Zhang, Peng Liu, Marcus Tiedemann Oekland Henriksboe, Even W. Lauvrak, Jon Atle Gulla, Heri Ramampiaro 2024-10-04 arXiv https://github.com/SmartmediaAI/PersonalSum https://doi.org/10.48550/arXiv.2410.03905
230 CommonIT: Commonality-Aware Instruction Tuning for Large Language Models via Data Partitions Jun Rao, Xuebo Liu, Lian Lian, Shengjun Cheng, Yunjie Liao, Min Zhang 2024-10-04 arXiv https://github.com/raojay7/CommonIT https://doi.org/10.48550/arXiv.2410.03077
231 PersoBench: Benchmarking Personalized Response Generation in Large Language Models Saleh Afzoon, Usman Naseem, Amin Beheshti, Zahra Jamali 2024-10-04 arXiv https://github.com/salehafzoon/PersoBench https://doi.org/10.48550/arXiv.2410.03198
232 FactAlign: Long-form Factuality Alignment of Large Language Models Chao-Wei Huang, Yun-Nung Chen 2024-10-03 arXiv https://github.com/MiuLab/FactAlign https://doi.org/10.48550/arXiv.2410.01691
233 POSIX: A Prompt Sensitivity Index For Large Language Models Anwoy Chatterjee, H. S. V. N. S. Kowndinya Renduchintala, Sumit Bhatia, Tanmoy Chakraborty 2024-10-03 arXiv https://github.com/kowndinya-renduchintala/POSIX https://doi.org/10.48550/arXiv.2410.02185
234 Traffic Light or Light Traffic? Investigating Phrasal Semantics in Large Language Models Rui Meng, Ye Liu, Lifu Tu, Daqing He, Yingbo Zhou, Semih Yavuz 2024-10-03 arXiv https://github.com/memray/llm_phrase_semantics https://doi.org/10.48550/arXiv.2410.02308
235 Meta-Models: An Architecture for Decoding LLM Behaviors Through Interpreted Embeddings and Natural Language Anthony Costarelli, Mat Allen, Severin Field 2024-10-03 arXiv https://github.com/acostarelli/meta-models-public http://arxiv.org/abs/2410.02472v2
236 Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective Zeyu Gan, Yong Liu 2024-10-03 arXiv https://github.com/ZyGan1999/Towards-a-Theoretical-Understanding-of-Synthetic-Data-in-LLM-Post-Training http://arxiv.org/abs/2410.01720v2
237 Open-RAG: Enhanced Retrieval Augmented Reasoning with Open-Source Large Language Models Shayekh Bin Islam, Md Asib Rahman, K. S. M. Tozammel Hossain, Enamul Hoque, Shafiq Joty, Md Rizwan Parvez 2024-10-02 EMNLP https://openragmoe.github.io/ https://aclanthology.org/2024.findings-emnlp.831
238 Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression Jingcun Wang, Yu-Guang Chen, Ing-Chao Lin, Bing Li, Grace Li Zhang 2024-10-02 arXiv https://github.com/TUDa-HWAI/Basis_Sharing https://doi.org/10.48550/arXiv.2410.03765
239 DLP-LoRA: Efficient Task-Specific LoRA Fusion with a Dynamic, Lightweight Plugin for Large Language Models Yuxuan Zhang, Ruizhe Li 2024-10-02 arXiv https://github.com/MeCuping/DLP-LoRA https://doi.org/10.48550/arXiv.2410.01497
240 StringLLM: Understanding the String Processing Capability of Large Language Models Xilong Wang, Hao Fu, Jindong Wang, Neil Zhenqiang Gong 2024-10-02 arXiv https://github.com/wxl-lxw/StringLLM https://doi.org/10.48550/arXiv.2410.01208
241 TypedThinker: Typed Thinking Improves Large Language Model Reasoning Danqing Wang, Jianxin Ma, Fei Fang, Lei Li 2024-10-02 arXiv https://github.com/dqwang122/ThinkHub https://doi.org/10.48550/arXiv.2410.01952
242 EMMA: Efficient Visual Alignment in Multi-Modal LLMs Sara Ghazanfari, Alexandre Araujo, Prashanth Krishnamurthy, Siddharth Garg, Farshad Khorrami 2024-10-02 arXiv https://github.com/SaraGhazanfari/EMMA http://arxiv.org/abs/2410.02080v1
243 Fira: Can We Achieve Full-rank Training of LLMs Under Low-rank Constraint? Xi Chen, Kaituo Feng, Changsheng Li, Xunhao Lai, Xiangyu Yue, Ye Yuan, Guoren Wang 2024-10-02 arXiv https://github.com/xichen-fy/Fira http://arxiv.org/abs/2410.01623v2
244 PneumoLLM: Harnessing the power of large language model for pneumoconiosis diagnosis Meiyue Song, Jiarui Wang, Zhihua Yu, Jiaxin Wang, Le Yang, Yuting Lu, Baicun Li, Xue Wang, Xiaoxu Wang, Qinghua Huang, Zhijun Li, Nikolaos I. Kanellakis, Jiangfeng Liu, Jing Wang, Binglu Wang, Juntao Yang 2024-10-01 Medical Image Anal. https://github.com/CodeMonsterPHD/PneumoLLM/tree/main https://doi.org/10.1016/j.media.2024.103248
245 Style-Specific Neurons for Steering LLMs in Text Style Transfer Wen Lai, Viktor Hangya, Alexander Fraser 2024-10-01 arXiv https://github.com/wenlai-lavine/sNeuron-TST http://arxiv.org/abs/2410.00593v1
246 Insight: A Multi-Modal Diagnostic Pipeline using LLMs for Ocular Surface Disease Diagnosis Chun-Hsiao Yeh, Jiayun Wang, Andrew D. Graham, Andrea J. Liu, Bo Tan, Yubei Chen, Yi Ma, Meng C. Lin 2024-10-01 arXiv https://danielchyeh.github.io/MDPipe/ http://arxiv.org/abs/2410.00292v1
247 Dynamic Planning for LLM-based Graphical User Interface Automation Shaoqing Zhang, Zhuosheng Zhang, Kehai Chen, Xinbei Ma, Muyun Yang, Tiejun Zhao, Min Zhang 2024-10-01 OpenReview https://github.com/sqzhang-lazy/D-PoT http://arxiv.org/abs/2410.00467v1
248 mPLUG-PaperOwl: Scientific Diagram Analysis with the Multimodal Large Language Model Anwen Hu, Yaya Shi, Haiyang Xu, Jiabo Ye, Qinghao Ye, Ming Yan, Chenliang Li, Qi Qian, Ji Zhang, Fei Huang 2024-10 MM '24: Proceedings of the 32nd ACM International Conference on Multimedia https://github.com/X-PLUG/mPLUG-DocOwl/tree/main/PaperOwl https://dl.acm.org/doi/10.1145/3664647.3681294
249 WorldGPT: Empowering LLM as Multimodal World Model Zhiqi Ge, Hongzhe Huang, Mingze Zhou, Juncheng Li, Guoming Wang, Siliang Tang, Yueting Zhuang 2024-10 MM '24: Proceedings of the 32nd ACM International Conference on Multimedia https://github.com/DCDmllm/WorldGPT https://dl.acm.org/doi/10.1145/3664647.3681488
250 Semantic Alignment for Multimodal Large Language Models Tao Wu, Mengze Li, Jingyuan Chen, Wei Ji, Wang Lin, Jinyang Gao, Kun Kuang, Zhou Zhao, Fei Wu 2024-10 MM '24: Proceedings of the 32nd ACM International Conference on Multimedia https://mccartney01.github.io/SAM https://dl.acm.org/doi/10.1145/3664647.3681014
251 Prior Knowledge Integration via LLM Encoding and Pseudo Event Regulation for Video Moment Retrieval Yiyang Jiang, Wengyu Zhang, Xulu Zhang, Xiaoyong Wei, Chang Wen Chen, Qing Li 2024-10 MM '24: Proceedings of the 32nd ACM International Conference on Multimedia https://github.com/fletcherjiang/LLMEPET https://dl.acm.org/doi/10.1145/3664647.3681115
252 Multimodal LLM Enhanced Cross-lingual Cross-modal Retrieval Yabing Wang, Le Wang, Qiang Zhou, Zhibin Wang, Hao Li, Gang Hua, Wei Tang 2024-10 MM '24: Proceedings of the 32nd ACM International Conference on Multimedia https://github.com/LiJiaBei-7/leccr https://dl.acm.org/doi/10.1145/3664647.3680886
253 MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors Yuan Tang, Xu Han, Xianzhi Li, Qiao Yu, Yixue Hao, Long Hu, Min Chen 2024-10 MM '24: Proceedings of the 32nd ACM International Conference on Multimedia https://github.com/TangYuan96/MiniGPT-3D https://dl.acm.org/doi/10.1145/3664647.3681257
254 Making Large Language Models Perform Better in Knowledge Graph Completion Yichi Zhang, Zhuo Chen, Lingbing Guo, Yajing Xu, Wen Zhang, Huajun Chen 2024-10 MM '24: Proceedings of the 32nd ACM International Conference on Multimedia https://github.com/zjukg/KoPA https://dl.acm.org/doi/10.1145/3664647.3681327
255 MM-Forecast: A Multimodal Approach to Temporal Event Forecasting with Large Language Models Haoxuan Li, Zhengmao Yang, Yunshan Ma, Yi Bin, Yang Yang, Tat-Seng Chua 2024-10 MM '24: Proceedings of the 32nd ACM International Conference on Multimedia https://github.com/LuminosityX/MM-Forecast https://dl.acm.org/doi/10.1145/3664647.3681593
256 Hypergraph Multi-modal Large Language Model: Exploiting EEG and Eye-tracking Modalities to Evaluate Heterogeneous Responses for Video Understanding Minghui Wu, Chenxu Zhao, Anyang Su, Donglin Di, Tianyu Fu, Da An, Min He, Ya Gao, Meng Ma, Kun Yan, Ping Wang 2024-10 MM '24: Proceedings of the 32nd ACM International Conference on Multimedia https://github.com/suay1113/HMLLM https://dl.acm.org/doi/10.1145/3664647.3680810
257 Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed Inputs Peng Ding, Jingyu Wu, Jun Kuang, Dan Ma, Xuezhi Cao, Xunliang Cai, Shi Chen, Jiajun Chen, Shujian Huang 2024-10 MM '24: Proceedings of the 32nd ACM International Conference on Multimedia https://github.com/NJUNLP/Hallu-PI https://dl.acm.org/doi/10.1145/3664647.3681251
258 Advancing Multimodal Large Language Models with Quantization-Aware Scale Learning for Efficient Adaptation Jingjing Xie, Yuxin Zhang, Mingbao Lin, Liujuan Cao, Rongrong Ji 2024-10 MM '24: Proceedings of the 32nd ACM International Conference on Multimedia https://github.com/xjjxmu/QSLAW https://dl.acm.org/doi/10.1145/3664647.3680838
259 UniMEL: A Unified Framework for Multimodal Entity Linking with Large Language Models Qi Liu, Yongyi He, Defu Lian, Zhi Zheng, Tong Xu, Liu Che, Enhong Chen 2024-10 CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management https://github.com/Javkonline/UniMEL https://dl.acm.org/doi/10.1145/3627673.3679793
260 Fairness in Large Language Models in Three Hours Thang Viet Doan, Zichong Wang, Nhat Nguyen Minh Hoang, Wenbin Zhang 2024-10 CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management https://github.com/LavinWong/Fairness-in-Large-Language-Models https://dl.acm.org/doi/10.1145/3627673.3679090
261 Boosting Large Language Models with Socratic Method for Conversational Mathematics Teaching Yuyang Ding, Hanglei Hu, Jie Zhou, Qin Chen, Bo Jiang, Liang He 2024-10 CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management https://github.com/ECNU-ICALK/SocraticMath https://dl.acm.org/doi/10.1145/3627673.3679881
262 VideoINSTA: Zero-shot Long Video Understanding via Informative Spatial-Temporal Reasoning with LLMs Ruotong Liao, Max Erler, Huiyu Wang, Guangyao Zhai, Gengyuan Zhang, Yunpu Ma, Volker Tresp 2024-09-30 arXiv https://github.com/mayhugotong/VideoINSTA http://arxiv.org/abs/2409.20365v2
263 RouterDC: Query-Based Router by Dual Contrastive Learning for Assembling Large Language Models Shuhao Chen, Weisen Jiang, Baijiong Lin, James T. Kwok, Yu Zhang 2024-09-30 arXiv https://github.com/shuhao02/RouterDC https://doi.org/10.48550/arXiv.2409.19886
264 LexEval: A Comprehensive Chinese Legal Benchmark for Evaluating Large Language Models Haitao Li, You Chen, Qingyao Ai, Yueyue Wu, Ruizhe Zhang, Yiqun Liu 2024-09-30 arXiv https://github.com/CSHaitao/LexEval https://doi.org/10.48550/arXiv.2409.20288
265 LLM Hallucinations in Practical Code Generation: Phenomena, Mechanism, and Mitigation Ziyao Zhang, Yanlin Wang, Chong Wang, Jiachi Chen, Zibin Zheng 2024-09-30 arXiv https://github.com/DeepSoftwareAnalytics/LLMCodingHallucination http://arxiv.org/abs/2409.20550v1
266 Do Influence Functions Work on Large Language Models? Zhe Li, Wei Zhao, Yige Li, Jun Sun 2024-09-30 arXiv https://github.com/plumprc/Failures-of-Influence-Functions-in-LLMs https://doi.org/10.48550/arXiv.2409.19998
267 Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models Luohe Shi, Yao Yao, Zuchao Li, Lefei Zhang, Hai Zhao 2024-09-30 arXiv https://github.com/ShiLuohe/ReferenceTrustableDecoding https://doi.org/10.48550/arXiv.2409.20181
268 A multimodal LLM for the non-invasive decoding of spoken text from brain recordings Youssef Hmamouche, Ismail Chihab, Lahoucine Kdouri, Amal El Fallah Seghrouchni 2024-09-29 arXiv https://github.com/Hmamouche/brain_decode http://arxiv.org/abs/2409.19710v1
269 BuildingView: Constructing Urban Building Exteriors Databases with Street View Imagery and Multimodal Large Language Mode Zongrong Li, Yunlei Su, Chenyuan Zhu, Wufan Zhao 2024-09-29 arXiv https://github.com/Jasper0122/BuildingView https://doi.org/10.48550/arXiv.2409.19527
270 Can Large Language Models Analyze Graphs like Professionals? A Benchmark, Datasets and Models Xin Li, Weize Chen, Qizhi Chu, Haopeng Li, Zhaojun Sun, Ran Li, Chen Qian, Yiwei Wei, Zhiyuan Liu, Chuan Shi, Maosong Sun, Cheng Yang 2024-09-29 arXiv https://github.com/BUPT-GAMMA/ProGraph https://doi.org/10.48550/arXiv.2409.19667
271 Identifying Knowledge Editing Types in Large Language Models Xiaopeng Li, Shangwen Wang, Shezheng Song, Bin Ji, Huijun Liu, Shasha Li, Jun Ma, Jie Yu 2024-09-29 arXiv https://github.com/xpq-tech/KETI https://doi.org/10.48550/arXiv.2409.19663
272 OpenSep: Leveraging Large Language Models with Textual Inversion for Open World Audio Separation Tanvir Mahmud, Diana Marculescu 2024-09-28 arXiv https://github.com/tanvir-utexas/OpenSep https://doi.org/10.48550/arXiv.2409.19270
273 Enhancing text-based knowledge graph completion with zero-shot large language models: A focus on semantic enhancement Rui Yang, Jiahao Zhu, Jianping Man, Li Fang, Yi Zhou 2024-09-27 Knowl. Based Syst. https://github.com/sjlmg/CP-KGC https://doi.org/10.1016/j.knosys.2024.112155
274 MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models Gongfan Fang, Hongxu Yin, Saurav Muralidharan, Greg Heinrich, Jeff Pool, Jan Kautz, Pavlo Molchanov, Xinchao Wang 2024-09-27 arXiv https://github.com/NVlabs/MaskLLM https://doi.org/10.48550/arXiv.2409.17481
275 HaloScope: Harnessing Unlabeled LLM Generations for Hallucination Detection Xuefeng Du, Chaowei Xiao, Yixuan Li 2024-09-27 arXiv:2409.17504, 2024 https://github.com/deeplearningwisc/haloscope http://arxiv.org/abs/2409.17504v1
276 From News to Forecast: Integrating Event Analysis in LLM-Based Time Series Forecasting with Reflection Xinlei Wang, Maike Feng, Jing Qiu, Jinjin Gu, Junhua Zhao 2024-09-27 arXiv:2409.17515, 2024 https://github.com/ameliawong1996/From_News_to_Forecast http://arxiv.org/abs/2409.17515v2
277 AssistantX: An LLM-Powered Proactive Assistant in Collaborative Human-Populated Environment Nan Sun, Bo Mao, Yongchang Li, Lumeng Ma, Di Guo, Huaping Liu 2024-09-27 arXiv:2409.17655, 2024 https://assistantx-agent.github.io/AssistantX/ http://arxiv.org/abs/2409.17655v1
278 CurricuLLM: Automatic Task Curricula Design for Learning Complex Robot Skills using Large Language Models Kanghyun Ryu, Qiayuan Liao, Zhongyu Li, Koushil Sreenath, Negar Mehr 2024-09-27 arXiv https://github.com/labicon/CurricuLLM https://doi.org/10.48550/arXiv.2409.18382
279 Align2LLaVA: Cascaded Human and Large Language Model Preference Alignment for Multi-modal Instruction Curation Hongzhe Huang, Zhewen Yu, Jiang Liu, Li Cai, Dian Jiao, Wenqiao Zhang, Siliang Tang, Juncheng Li, Hao Jiang, Haoyuan Li, Yueting Zhuang 2024-09-27 arXiv https://github.com/DCDmllm/Align2LLaVA https://doi.org/10.48550/arXiv.2409.18541
280 A Survey on the Honesty of Large Language Models Siheng Li, Cheng Yang, Taiqiang Wu, Chufan Shi, Yuji Zhang, Xinyu Zhu, Zesen Cheng, Deng Cai, Mo Yu, Lemao Liu, Jie Zhou, Yujiu Yang, Ngai Wong, Xixin Wu, Wai Lam 2024-09-27 arXiv https://github.com/SihengLi99/LLM-Honesty-Survey https://doi.org/10.48550/arXiv.2409.18786
281 Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models Jiaming Li, Lei Zhang, Yunshui Li, Ziqiang Liu, Yuelin Bai, Run Luo, Longze Chen, Min Yang 2024-09-27 arXiv https://github.com/Geaming2002/Ruler https://doi.org/10.48550/arXiv.2409.18943
282 Extracting Affect Aggregates from Longitudinal Social Media Data with Temporal Adapters for Large Language Models Georg Ahnert, Max Pellert, David Garcia, Markus Strohmaier 2024-09-26 arXiv https://github.com/dess-mannheim/temporal-adapters http://arxiv.org/abs/2409.17990v1
283 RED QUEEN: Safeguarding Large Language Models against Concealed Multi-Turn Jailbreaking Yifan Jiang, Kriti Aggarwal, Tanmay Laud, Kashif Munir, Jay Pujara, Subhabrata Mukherjee 2024-09-26 arXiv https://github.com/kriti-hippo/red_queen https://doi.org/10.48550/arXiv.2409.17458
284 LLM-CARD: Towards a Description and Landscape of Large Language Models Shengwei Tian, Lifeng Han, Erick Mendez Guzman, Goran Nenadic 2024-09-25 arXiv https://github.com/shengwei-tian/dependency-parser-visualization https://doi.org/10.48550/arXiv.2409.17011
285 Zero-Shot Detection of LLM-Generated Text using Token Cohesiveness Shixuan Ma, Quan Wang 2024-09-25 arXiv https://github.com/Shixuan-Ma/TOCSIN http://arxiv.org/abs/2409.16914v1
286 Search for Efficient Large Language Models Xuan Shen, Pu Zhao, Yifan Gong, Zhenglun Kong, Zheng Zhan, Yushu Wu, Ming Lin, Chao Wu, Xue Lin, Yanzhi Wang 2024-09-25 arXiv https://github.com/shawnricecake/search-llm https://doi.org/10.48550/arXiv.2409.17372
287 DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling Kyuheon Jung, Yongdeuk Seo, Seongwoo Cho, Jaeyoung Kim, Hyun-seok Min, Sungchul Choi 2024-09-25 arXiv https://github.com/kkyuhun94/dalda http://arxiv.org/abs/2409.16949v1
288 HDFlow: Enhancing LLM Complex Problem-Solving with Hybrid Thinking and Dynamic Workflows Wenlin Yao, Haitao Mi, Dong Yu 2024-09-25 arXiv https://github.com/wenlinyao/HDFlow http://arxiv.org/abs/2409.17433v1
289 EventHallusion: Diagnosing Event Hallucinations in Video LLMs Jiacheng Zhang, Yang Jiao, Shaoxiang Chen, Jingjing Chen, Yu-Gang Jiang 2024-09-25 arXiv https://github.com/Stevetich/EventHallusion http://arxiv.org/abs/2409.16597v1
290 Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction Zhenmei Shi, Yifei Ming, Xuan-Phi Nguyen, Yingyu Liang, Shafiq Joty 2024-09-25 arXiv https://github.com/SalesforceAIResearch/GemFilter http://arxiv.org/abs/2409.17422v1
291 CHBench: A Chinese Dataset for Evaluating Health in Large Language Models Chenlu Guo, Nuo Xu, Yi Chang, Yuan Wu 2024-09-24 arXiv https://github.com/TracyGuo2001/CHBench https://doi.org/10.48550/arXiv.2409.15766
292 HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models Haoran Que, Feiyu Duan, Liqun He, Yutao Mou, Wangchunshu Zhou, Jiaheng Liu, Wenge Rong, Zekun Moore Wang, Jian Yang, Ge Zhang, Junran Peng, Zhaoxiang Zhang, Songyang Zhang, Kai Chen 2024-09-24 arXiv https://github.com/Quehry/HelloBench https://doi.org/10.48550/arXiv.2409.16191
293 XTRUST: On the Multilingual Trustworthiness of Large Language Models Yahan Li, Yi Wang, Yi Chang, Yuan Wu 2024-09-24 arXiv https://github.com/LluckyYH/XTRUST https://doi.org/10.48550/arXiv.2409.15762
294 Pretraining Data Detection for Large Language Models: A Divergence-based Calibration Method Weichao Zhang, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Yixing Fan, Xueqi Cheng 2024-09-23 arXiv https://github.com/zhang-wei-chao/DC-PDD https://doi.org/10.48550/arXiv.2409.14781
295 COHERENT: Collaboration of Heterogeneous Multi-Robot System with Large Language Models Kehui Liu, Zixin Tang, Dong Wang, Zhigang Wang, Bin Zhao, Xuelong Li 2024-09-23 arXiv https://github.com/MrKeee/COHERENT https://doi.org/10.48550/arXiv.2409.15146
296 Phantom of Latent for Large Language and Vision Models Byung-Kwan Lee, Sangyun Chung, Chae Won Kim, Beomchan Park, Yong Man Ro 2024-09-23 arXiv https://github.com/ByungKwanLee/Phantom https://doi.org/10.48550/arXiv.2409.14713
297 Unveiling Narrative Reasoning Limits of Large Language Models with Trope in Movie Synopses Hung-Ting Su, Ya-Ching Hsu, Xudong Lin, Xiang Qian Shi, Yulei Niu, Han-Yuan Hsu, Hung-yi Lee, Winston H. Hsu 2024-09-22 arXiv https://github.com/Shelley1214/Trope https://doi.org/10.48550/arXiv.2409.14324
298 PTD-SQL: Partitioning and Targeted Drilling with LLMs in Text-to-SQL Ruilin Luo, Liyuan Wang, Binghuai Lin, Zicheng Lin, Yujiu Yang 2024-09-21 arXiv https://github.com/lrlbbzl/PTD-SQL http://arxiv.org/abs/2409.14082v1
299 StateAct: State Tracking and Reasoning for Acting and Planning with Large Language Models Nikolai Rozanov, Marek Rei 2024-09-21 arXiv https://github.com/ai-nikolai/StateAct https://doi.org/10.48550/arXiv.2410.02810
300 ProcessTBench: An LLM Plan Generation Dataset for Process Mining Andrei Cosmin Redis, Mohammadreza Fani Sani, Bahram Zarrin, Andrea Burattin 2024-09-20 arXiv e …, 2024 https://github.com/microsoft/ProcessTBench http://arxiv.org/abs/2409.09191v2
301 ShizishanGPT: An Agricultural Large Language Model Integrating Tools and Resources Shuting Yang, Zehui Liu, Wolfgang Mayer 2024-09-20 arXiv https://github.com/Zaiwen/CropGPT https://doi.org/10.48550/arXiv.2409.13537
302 Edu-Values: Towards Evaluating the Chinese Education Values of Large Language Models Peiyi Zhang, Yazhou Zhang, Bo Wang, Lu Rong, Jing Qin 2024-09-20 arXiv https://github.com/zhangpeii/Edu-Values https://doi.org/10.48550/arXiv.2409.12739
303 CFSP: An Efficient Structured Pruning Framework for LLMs with Coarse-to-Fine Activation Information Yuxin Wang, Minghua Ma, Zekun Wang, Jingchang Chen, Huiming Fan, Liping Shan, Qing Yang, Dongliang Xu, Ming Liu, Bing Qin 2024-09-20 arXiv https://github.com/wyxscir/CFSP http://arxiv.org/abs/2409.13199v1
304 HLLM: Enhancing Sequential Recommendations via Hierarchical Large Language Models for Item and User Modeling Junyi Chen, Lu Chi, Bingyue Peng, Zehuan Yuan 2024-09-19 arXiv https://github.com/bytedance/HLLM https://doi.org/10.48550/arXiv.2409.12740
305 Linguistic Minimal Pairs Elicit Linguistic Similarity in Large Language Models Xinyu Zhou, Delong Chen, Samuel Cahyawijaya, Xufeng Duan, Zhenguang G. Cai 2024-09-19 arXiv https://github.com/ChenDelong1999/Linguistic-Similarity https://doi.org/10.48550/arXiv.2409.12435
306 CLAIR-A: Leveraging Large Language Models to Judge Audio Captions Tsung-Han Wu, Joseph E. Gonzalez, Trevor Darrell, David M. Chan 2024-09-19 arXiv https://github.com/DavidMChan/clair-a https://doi.org/10.48550/arXiv.2409.12962
307 Development and bilingual evaluation of Japanese medical large language model within reasonably low computational resources Issey Sukeda 2024-09-19 arXiv https://github.com/stardust-coder/japanese-lm-med-harness https://doi.org/10.48550/arXiv.2409.11783
308 BodyShapeGPT: SMPL Body Shape Manipulation with LLMs Baldomero R. Árbol, Dan Casas 2024-09-18 arXiv https://github.com/baldoarbol/BodyShapeGPT http://arxiv.org/abs/2410.03556v1
309 Large Language Models Are Strong Audio-Visual Speech Recognition Learners Umberto Cappellazzo, Minsu Kim, Honglie Chen, Pingchuan Ma, Stavros Petridis, Daniele Falavigna, Alessio Brutti, Maja Pantic 2024-09-18 arXiv https://github.com/umbertocappellazzo/AVSR-LLMs https://doi.org/10.48550/arXiv.2409.12319
310 Improving LLM Reasoning with Multi-Agent Tree-of-Thought Validator Agent Fatemeh Haji, Mazal Bethany, Maryam Tabar, Jason Chiang, Anthony Rios, Peyman Najafirad 2024-09-17 arXiv https://github.com/SecureAIAutonomyLab/MA-ToT http://arxiv.org/abs/2409.11527v1
311 Less is More: A Simple yet Effective Token Reduction Method for Efficient Multi-modal LLMs Dingjie Song, Wenjun Wang, Shunian Chen, Xidong Wang, Michael Guan, Benyou Wang 2024-09-17 arXiv https://github.com/FreedomIntelligence/TRIM http://arxiv.org/abs/2409.10994v1
312 NVLM: Open Frontier-Class Multimodal LLMs Wenliang Dai, Nayeon Lee, Boxin Wang, Zhuoling Yang, Zihan Liu, Jon Barker, Tuomas Rintamaki, Mohammad Shoeybi, Bryan Catanzaro, Wei Ping 2024-09-17 arXiv https://nvlm-project.github.io/ http://arxiv.org/abs/2409.11402v2
313 Towards Data Contamination Detection for Modern Large Language Models: Limitations, Inconsistencies, and Oracle Challenges Vinay Samuel, Yue Zhou, Henry Peng Zou 2024-09-16 arXiv https://github.com/vsamuel2003/data-contamination https://doi.org/10.48550/arXiv.2409.09927
314 HALO: Hallucination Analysis and Learning Optimization to Empower LLMs with Retrieval-Augmented Context for Guided Clinical Decision Making Sumera Anjum, Hanzhi Zhang, Wenjun Zhou, Eun Jin Paek, Xiaopeng Zhao, Yunhe Feng 2024-09-16 arXiv https://github.com/ResponsibleAILab/HALO http://arxiv.org/abs/2409.10011v2
315 Fit and Prune: Fast and Training-free Visual Token Pruning for Multi-modal Large Language Models Weihao Ye, Qiong Wu, Wenhao Lin, Yiyi Zhou 2024-09-16 arXiv https://github.com/ywh187/FitPrune https://doi.org/10.48550/arXiv.2409.10197
316 Do Large Language Models Need a Content Delivery Network? Yihua Cheng, Kuntai Du, Jiayi Yao, Junchen Jiang 2024-09-16 arXiv https://github.com/LMCache/LMCache https://doi.org/10.48550/arXiv.2409.13761
317 Benchmarking Large Language Model Uncertainty for Prompt Optimization Pei-Fu Guo, Yun-Da Tsai, Shou-De Lin 2024-09-16 arXiv https://github.com/0Frett/PO-Uncertainty-Benchmarking https://doi.org/10.48550/arXiv.2409.10044
318 AlpaPICO: Extraction of PICO Frames from Clinical Trial Documents Using LLMs Madhusudan Ghosh, Shrimon Mukherjee, Asmit Ganguly, Partha Basuchowdhuri, Sudip Kumar Naskar, Debasis Ganguly 2024-09-15 arXiv https://github.com/shrimonmuke0202/AlpaPICO http://arxiv.org/abs/2409.09704v1
319 Can Large Language Models Grasp Event Signals? Exploring Pure Zero-Shot Event-based Recognition Zongyou Yu, Qiang Qu, Xiaoming Chen, Chen Wang 2024-09-15 arXiv https://github.com/ChrisYu-Zz/Pure-event-based-recognition-based-LLM https://doi.org/10.48550/arXiv.2409.09628
320 Traffic Scene Generation from Natural Language Description for Autonomous Vehicles with Large Language Model Bo-Kai Ruan, Hao-Tang Tsui, Yung-Hui Li, Hong-Han Shuai 2024-09-15 arXiv https://basiclab.github.io/TTSG https://doi.org/10.48550/arXiv.2409.09575
321 LLM-Powered Ensemble Learning for Paper Source Tracing: A GPU-Free Approach Kunlong Chen, Junjun Wang, Zhaoqun Chen, Kunjin Chen, Yitian Chen 2024-09-14 arXiv https://github.com/Cklwanfifa/KDDCUP2024-PST http://arxiv.org/abs/2409.09383v2
322 PeriGuru: A Peripheral Robotic Mobile App Operation Assistant based on GUI Image Understanding and Prompting with LLM Kelin Fu, Yang Tian, Kaigui Bian 2024-09-14 arXiv https://github.com/Z2sJ4t/PeriGuru http://arxiv.org/abs/2409.09354v1
323 L3Cube-IndicQuest: A Benchmark Questing Answering Dataset for Evaluating Knowledge of LLMs in Indic Context Pritika Rohera, Chaitrali Ginimav, Akanksha Salunke, Gayatri Sawant, Raviraj Joshi 2024-09-13 arXiv https://github.com/l3cube-pune/indic-nlp http://arxiv.org/abs/2409.08706v1
324 FP-VEC: Fingerprinting Large Language Models via Efficient Vector Addition Zhenhua Xu, Wenpeng Xing, Zhebo Wang, Chang Hu, Chen Jie, Meng Han 2024-09-13 arXiv https://fingerprintvector.github.io https://doi.org/10.48550/arXiv.2409.08846
325 Intelligent LiDAR Navigation: Leveraging External Information and Semantic Maps with LLM as Copilot Fujing Xie, Jiajie Zhang, Sören Schwertfeger 2024-09-13 arXiv https://github.com/xiexiexiaoxiexie/Intelligent-LiDAR-Navigation-LLM-as-Copilot http://arxiv.org/abs/2409.08493v1
326 TAIiST CPS-UAV at the SBFT Tool Competition 2024 T. Zhu, W. Newton, S. Embury, Y. Sun 2024-09-12 2024 IEEE/ACM International Workshop on Search-Based and Fuzz Testing (SBFT) https://github.com/Trusted-AI-in-System-Test https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10699540
327 Fine-tuning Large Language Models for Entity Matching Aaron Steiner, Ralph Peeters, Christian Bizer 2024-09-12 arXiv https://github.com/wbsg-uni-mannheim/TailorMatch https://doi.org/10.48550/arXiv.2409.08185
328 LLM Detectors Still Fall Short of Real World: Case of LLM-Generated Short News-Like Posts Henrique Da Silva Gameiro, Andrei Kucharavy, Ljiljana Dolamic 2024-09-11 arXiv e-prints, 2024 https://github.com/Reliable-Information-Lab-HEVS/dynamic_llm_detector_benchmark http://arxiv.org/abs/2409.03291v1
329 Think Together and Work Better: Combining Humans' and LLMs' Think-Aloud Outcomes for Effective Text Evaluation SeongYeub Chu, JongWoo Kim, MunYong Yi 2024-09-11 arXiv https://github.com/BBeeChu/InteractEval http://arxiv.org/abs/2409.07355v1
330 LLaMA-Omni: Seamless Speech Interaction with Large Language Models Qingkai Fang, Shoutao Guo, Yan Zhou, Zhengrui Ma, Shaolei Zhang, Yang Feng 2024-09-11 arXiv https://github.com/ictnlp/LLaMA-Omni https://doi.org/10.48550/arXiv.2409.06666
331 Understanding Knowledge Drift in LLMs through Misinformation Alina Fastowski, Gjergji Kasneci 2024-09-11 arXiv https://github.com/afastowski/knowledge_drift http://arxiv.org/abs/2409.07085v1
332 Ferret: Federated Full-Parameter Tuning at Scale for Large Language Models Yao Shu, Wenyang Hu, See-Kiong Ng, Bryan Kian Hsiang Low, Fei Richard Yu 2024-09-11 arXiv https://github.com/allen4747/Ferret https://doi.org/10.48550/arXiv.2409.06277
333 DrLLM: Prompt-Enhanced Distributed Denial-of-Service Resistance Method with Large Language Models Zhenyu Yin, Shang Liu, Guangyuan Xu 2024-09-11 arXiv https://github.com/liuup/DrLLM https://doi.org/10.48550/arXiv.2409.10561
334 AdaPPA: Adaptive Position Pre-Fill Jailbreak Attack Approach Targeting LLMs Lijia Lv, Weigang Zhang, Xuehai Tang, Jie Wen, Feng Liu, Jizhong Han, Songlin Hu 2024-09-11 arXiv https://github.com/Yummy416/AdaPPA http://arxiv.org/abs/2409.07503v1
335 Towards Democratizing Multilingual Large Language Models For Medicine Through A Two-Stage Instruction Fine-tuning Approach Meng Zhou, Surajsinh Parmar, Anubhav Bhatti 2024-09-10 arXiv https://github.com/SpassMed/Med-Llama3 https://doi.org/10.48550/arXiv.2409.05732
336 What is the Role of Small Models in the LLM Era: A Survey Lihu Chen, Gaël Varoquaux 2024-09-10 arXiv https://github.com/tigerchen52/role_of_small_models http://arxiv.org/abs/2409.06857v2
337 Benchmarking Chinese Knowledge Rectification in Large Language Models Tianhe Lu, Jizhan Fang, Yunzhi Yao, Xin Xu, Ningyu Zhang, Huajun Chen 2024-09-09 arXiv https://github.com/zjunlp/EasyEdit https://doi.org/10.48550/arXiv.2409.05806
338 FLoRA: Federated Fine-Tuning Large Language Models with Heterogeneous Low-Rank Adaptations Ziyao Wang, Zheyu Shen, Yexiao He, Guoheng Sun, Hongyi Wang, Lingjuan Lyu, Ang Li 2024-09-09 arXiv https://github.com/ATP-1010/FederatedLLM https://doi.org/10.48550/arXiv.2409.05976
339 Rome was Not Built in a Single Step: Hierarchical Prompting for LLM-based Chip Design Andre Nakkab, Sai Qian Zhang, Ramesh Karri, Siddharth Garg 2024-09-09 MLCAD '24: Proceedings of the 2024 ACM/IEEE International Symposium on Machine Learning for CAD https://github.com/ajn313/ROME-LLM https://dl.acm.org/doi/10.1145/3670474.3685964
340 OneGen: Efficient One-Pass Unified Generation and Retrieval for LLMs Jintian Zhang, Cheng Peng, Mengshu Sun, Xiang Chen, Lei Liang, Zhiqiang Zhang, Jun Zhou, Huajun Chen, Ningyu Zhang 2024-09-08 arXiv https://github.com/zjunlp/OneGen http://arxiv.org/abs/2409.05152v1
341 Multi-Programming Language Ensemble for Code Generation in Large Language Model Tengfei Xue, Xuefeng Li, Tahir Azim, Roman Smirnov, Jianhui Yu, Arash Sadrieh, Babak Pahlavan 2024-09-06 arXiv https://github.com/NinjaTech-AI/MPLE https://doi.org/10.48550/arXiv.2409.04114
342 Sirius: Contextual Sparsity with Correction for Efficient LLMs Yang Zhou, Zhuoming Chen, Zhaozhuo Xu, Victoria Lin, Beidi Chen 2024-09-05 arXiv https://github.com/Infini-AI-Lab/Sirius http://arxiv.org/abs/2409.03856v1
343 Sketch: A Toolkit for Streamlining LLM Operations Xin Jiang, Xiang Li, Wenjia Ma, Xuezhi Fang, Yiqun Yao, Naitong Yu, Xuying Meng, Peng Han, Jing Li, Aixin Sun, Yequan Wang 2024-09-05 arXiv https://github.com/cofe-ai/Sketch http://arxiv.org/abs/2409.03346v1
344 Attention Heads of Large Language Models: A Survey Zifan Zheng, Yezhaohui Wang, Yuxin Huang, Shichao Song, Mingchuan Yang, Bo Tang, Feiyu Xiong, Zhiyu Li 2024-09-05 arXiv https://github.com/IAAR-Shanghai/Awesome-Attention-Heads https://doi.org/10.48550/arXiv.2409.03752
345 Planning In Natural Language Improves LLM Search For Code Generation Evan Wang, Federico Cassano, Catherine Wu, Yunfeng Bai, Will Song, Vaskar Nath, Ziwen Han, Sean Hendryx, Summer Yue, Hugh Zhang 2024-09-05 arXiv https://github.com/scaleapi/plansearch http://arxiv.org/abs/2409.03733v1
346 Debate on Graph: a Flexible and Reliable Reasoning Framework for Large Language Models Jie Ma, Zhitao Gao, Qi Chai, Wangchun Sun, Pinghui Wang, Hongbin Pei, Jing Tao, Lingyun Song, Jun Liu, Chen Zhang, Lizhen Cui 2024-09-05 arXiv https://github.com/reml-group/DoG https://doi.org/10.48550/arXiv.2409.03155
347 Alignment-Aware Model Extraction Attacks on Large Language Models Zi Liang, Qingqing Ye, Yanyun Wang, Sen Zhang, Yaxin Xiao, Ronghua Li, Jianliang Xu, Haibo Hu 2024-09-04 arXiv https://github.com/liangzid/alignmentExtraction https://doi.org/10.48550/arXiv.2409.02718
348 Hypothesizing Missing Causal Variables with LLMs Ivaxi Sheth, Sahar Abdelnabi, Mario Fritz 2024-09-04 arXiv https://github.com/ivaxi0s/hypothesizing-causal-variable-llm http://arxiv.org/abs/2409.02604v1
349 Large Language Model-Based Agents for Software Engineering: A Survey Junwei Liu, Kaixin Wang, Yixuan Chen, Xin Peng, Zhenpeng Chen, Lingming Zhang, Yiling Lou 2024-09-04 arXiv https://github.com/FudanSELab/Agent4SE-Paper-List https://doi.org/10.48550/arXiv.2409.02977
350 Pooling And Attention: What Are Effective Designs For LLM-Based Embedding Models? Yixuan Tang, Yi Yang 2024-09-04 arXiv https://github.com/yixuantt/PoolingAndAttn http://arxiv.org/abs/2409.02727v1
351 MMLU-Pro+: Evaluating Higher-Order Reasoning and Shortcut Learning in LLMs Saeid Asgari Taghanaki, Aliasghar Khani, Amir Khasahmadi 2024-09-03 arXiv https://github.com/asgsaeid/mmlu-pro-plus http://arxiv.org/abs/2409.02257v1
352 Foundations of Large Language Model Compression - Part 1: Weight Quantization Sean I. Young 2024-09-03 arXiv https://github.com/seannz/cvxq https://doi.org/10.48550/arXiv.2409.02026
353 Exploiting the Vulnerability of Large Language Models via Defense-Aware Architectural Backdoor Abdullah Arafat Miah, Yu Bi 2024-09-03 arXiv https://github.com/SiSL-URI/Arch_Backdoor_LLM https://doi.org/10.48550/arXiv.2409.01952
354 Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation Tiansheng Huang, Sihao Hu, Fatih Ilhan, Selim Furkan Tekin, Ling Liu 2024-09-03 arXiv https://github.com/git-disl/Booster https://doi.org/10.48550/arXiv.2409.01586
355 Agentic Society: Merging skeleton from real world and texture from Large Language Model Yuqi Bai, Kun Sun, Huishi Yin 2024-09-02 arXiv https://github.com/baiyuqi/agentic-society https://doi.org/10.48550/arXiv.2409.10550
356 FlashFlex: Accommodating Large Language Model Training over Heterogeneous Environment Ran Yan, Youhe Jiang, Wangcheng Tao, Xiaonan Nie, Bin Cui, Binhang Yuan 2024-09-02 arXiv https://github.com/Relaxed-System-Lab/FlashFlex https://doi.org/10.48550/arXiv.2409.01143
357 Large Language Models versus Classical Machine Learning: Performance in COVID-19 Mortality Prediction Using High-Dimensional Tabular Data Mohammadreza Ghaffarzadeh-Esfahani, Mahdi Ghaffarzadeh-Esfahani, Arian Salahi-Niri, Hossein Toreyhi, Zahra Atf, Amirali Mohsenzadeh-Kermani, Mahshad Sarikhani, Zohreh Tajabadi, Fatemeh Shojaeian, Mohammad Hassan Bagheri, Aydin Feyzi, Mohammadamin Tarighatpayma, Narges Gazmeh, Fateme Heydari, Hossein Afshar, Amirreza Allahgholipour, Farid Alimardani, Ameneh Salehi, Naghmeh Asadimanesh, Mohammad Amin Khalafi, Hadis Shabanipour, Ali Moradi, Sajjad Hossein Zadeh, Omid Yazdani, Romina Esbati, Moozhan Maleki, Danial Samiei Nasr, Amirali Soheili, Hossein Majlesi, Saba Shahsavan, Alireza Soheilipour, Nooshin Goudarzi, Erfan Taherifard, Hamidreza Hatamabadi, Jamil S. Samaan, Thomas Savage, Ankit Sakhuja, Ali Soroush, Girish N. Nadkarni, Ilad Alavi Darazam, Mohamad Amin Pourhoseingholi, Seyed Amir Ahmad Safavi-Naini 2024-09-02 arXiv https://github.com/mohammad-gh009/Large-Language-Models-vs-Classical-Machine-learning https://doi.org/10.48550/arXiv.2409.02136
358 Prompt Compression with Context-Aware Sentence Encoding for Fast and Improved LLM Inference Barys Liskavets, Maxim Ushakov, Shuvendu Roy, Mark Klibanov, Ali Etemad, Shane Luke 2024-09-02 arXiv https://github.com/Workday/cpc http://arxiv.org/abs/2409.01227v2
359 Automatic Pseudo-Harmful Prompt Generation for Evaluating False Refusals in Large Language Models Bang An, Sicheng Zhu, Ruiyi Zhang, Michael-Andrei Panaitescu-Liess, Yuancheng Xu, Furong Huang 2024-09-01 arXiv https://github.com/umd-huang-lab/FalseRefusal https://doi.org/10.48550/arXiv.2409.00598
360 Harnessing the Power of Semi-Structured Knowledge and LLMs with Triplet-Based Prefiltering for Question Answering Derian Boer, Fabian Koch, Stefan Kramer 2024-09-01 arXiv https://github.com/kramerlab/4StepFocus http://arxiv.org/abs/2409.00861v1
361 Visual Reasoning and Multi-Agent Approach in Multimodal Large Language Models (MLLMs): Solving TSP and mTSP Combinatorial Challenges Mohammed Elhenawy, Ahmad Abutahoun, Taqwa I. Alhadidi, Ahmed Jaber, Huthaifa I. Ashqar, Shadi Jaradat, Ahmed Abdelhay, Sebastien Glaser, Andry Rakotonirainy 2024-09-01 arXiv https://github.com/ahmed-abdulhuy/Solving-TSP-and-mTSP-Combinatorial-Challenges-using-Visual-Reasoning-and-Multi-Agent-Approach-MLLMs- https://doi.org/10.48550/arXiv.2407.00092
362 Large Language Models for Software Engineering: A Systematic Literature Review Xinyi Hou, Yanjie Zhao, Yue Liu, Zhou Yang, Kailong Wang, Li Li, Xiapu Luo, David Lo, John C. Grundy, Haoyu Wang 2024-09 ACM Transactions on Software Engineering and Methodology (TOSEM), Just Accepted https://github.com/xinyi-hou/LLM4SE_SLR https://dl.acm.org/doi/10.1145/3695988
363 AskIt: Unified Programming Interface for Programming with Large Language Models Katsumi Okuda, Saman P. Amarasinghe 2024-09 2024 IEEE/ACM International Symposium on Code Generation and Optimization (CGO) https://github.com/katsumiok/ts-askit https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10444830
364 LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models Zhiyuan Hu, Yuliang Liu, Jinman Zhao, Suyuchen Wang, Yan Wang, Wei Shen, Qing Gu, Anh Tuan Luu, See-Kiong Ng, Zhiwei Jiang, Bryan Hooi 2024-08-31 arXiv https://github.com/zhiyuanhubj/LongRecipe https://doi.org/10.48550/arXiv.2409.00509
365 MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models Shuai Peng, Di Fu, Liangcai Gao, Xiuqin Zhong, Hongguang Fu, Zhi Tang 2024-08-30 arXiv https://github.com/pengshuai-rin/MultiMath https://doi.org/10.48550/arXiv.2409.00147
366 A Survey on Evaluation of Large Language Models Yupeng Chang, Xu Wang, Jindong Wang, Yuan Wu, Kaijie Zhu, Hao Chen, Linyi Yang, Xiaoyuan Yi, Cunxiang Wang, Yidong Wang, Wei Ye, Yue Zhang, Yi Chang, Philip S. Yu, Qiang Yang, Xing Xie 2024-08-28 ACM Transactions on Intelligent Systems and Technology (TIST), Volume 15, Issue 3 https://llm-eval.github.io/ https://dl.acm.org/doi/10.1145/3641289
367 CBF-LLM: Safe Control for LLM Alignment Yuya Miyaoka, Masaki Inoue 2024-08-28 arXiv https://github.com/Mya-Mya/CBF-LLM http://arxiv.org/abs/2408.15625v1
368 Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders Min Shi, Fuxiao Liu, Shihao Wang, Shijia Liao, Subhashree Radhakrishnan, De-An Huang, Hongxu Yin, Karan Sapra, Yaser Yacoob, Humphrey Shi, Bryan Catanzaro, Andrew Tao, Jan Kautz, Zhiding Yu, Guilin Liu 2024-08-28 arXiv https://github.com/NVlabs/Eagle http://arxiv.org/abs/2408.15998v1
369 Efficient LLM Scheduling by Learning to Rank Yichao Fu, Siqi Zhu, Runlong Su, Aurick Qiao, Ion Stoica, Hao Zhang 2024-08-28 arXiv https://github.com/hao-ai-lab/vllm-ltr http://arxiv.org/abs/2408.15792v1
370 Leveraging Open Knowledge for Advancing Task Expertise in Large Language Models Yuncheng Yang, Yulei Qin, Tong Wu, Zihan Xu, Gang Li, Pengcheng Guo, Hang Shao, Yucheng Shi, Ke Li, Xing Sun, Jie Yang, Yun Gu 2024-08-28 arXiv https://github.com/Yaphabates/Rocket https://doi.org/10.48550/arXiv.2408.15915
371 RSTeller: Scaling Up Visual Language Modeling in Remote Sensing with Rich Linguistic Semantics from Openly Available Data and Large Language Models Junyao Ge, Yang Zheng, Kaitai Guo, Jimin Liang 2024-08-27 arXiv https://github.com/SlytherinGe/RSTeller https://doi.org/10.48550/arXiv.2408.14744
372 PAT: Pruning-Aware Tuning for Large Language Models Yijiang Liu, Huanrui Yang, Youxin Chen, Rongyu Zhang, Miao Wang, Yuan Du, Li Du 2024-08-27 arXiv https://github.com/kriskrisliu/PAT_Pruning-Aware-Tuning https://doi.org/10.48550/arXiv.2408.14721
373 LyCon: Lyrics Reconstruction from the Bag-of-Words Using Large Language Models Haven Kim, Kahyun Choi 2024-08-27 arXiv https://github.com/havenpersona/lycon https://doi.org/10.48550/arXiv.2408.14750
374 AgentMove: Predicting Human Mobility Anywhere Using Large Language Model based Agentic Framework Jie Feng, Yuwei Du, Jie Zhao, Yong Li 2024-08-26 arXiv https://github.com/tsinghua-fib-lab/AgentMove https://doi.org/10.48550/arXiv.2408.13986
375 CURLoRA: Stable LLM Continual Fine-Tuning and Catastrophic Forgetting Mitigation Muhammad Fawi 2024-08-26 arXiv https://github.com/MNoorFawi/curlora http://arxiv.org/abs/2408.14572v1
376 Large Language Models meet Collaborative Filtering: An Efficient All-round LLM-based Recommender System Sein Kim, Hongseok Kang, Seungyoon Choi, Donghyun Kim, Min-Chul Yang, Chanyoung Park 2024-08-25 KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining https://github.com/ghdtjr/A-LLMRec https://dl.acm.org/doi/10.1145/3637528.3671931
377 Vision-Language and Large Language Model Performance in Gastroenterology: GPT, Claude, Llama, Phi, Mistral, Gemma, and Quantized Models Seyed Amir Ahmad Safavi-Naini, Shuhaib Ali, Omer Shahab, Zahra Shahhoseini, Thomas Savage, Sara Rafiee, Jamil S. Samaan, Reem Al Shabeeb, Farah Ladak, Jamie O. Yang, Juan Echavarria, Sumbal Babar, Aasma Shaukat, Samuel Margolis, Nicholas P. Tatonetti, Girish N. Nadkarni, Bara El Kurdi, Ali Soroush 2024-08-25 arXiv https://github.com/Sdamirsa/LLM-VLM-in-Gastroenterology https://doi.org/10.48550/arXiv.2409.00084
378 Understanding the Weakness of Large Language Model Agents within a Complex Android Environment Mingzhe Xing, Rongkai Zhang, Hui Xue, Qi Chen, Fan Yang, Zhen Xiao 2024-08-25 KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining https://github.com/AndroidArenaAgent/AndroidArena https://dl.acm.org/doi/10.1145/3637528.3671650
379 RecExplainer: Aligning Large Language Models for Explaining Recommendation Models Yuxuan Lei, Jianxun Lian, Jing Yao, Xu Huang, Defu Lian, Xing Xie 2024-08-25 KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining https://github.com/microsoft/RecAI https://dl.acm.org/doi/10.1145/3637528.3671802
380 R-Eval: A Unified Toolkit for Evaluating Domain Knowledge of Retrieval Augmented Large Language Models Shangqing Tu, Yuanchun Wang, Jifan Yu, Yuyang Xie, Yaran Shi, Xiaozhi Wang, Jing Zhang, Lei Hou, Juanzi Li 2024-08-25 KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining https://github.com/THU-KEG/R-Eval https://dl.acm.org/doi/10.1145/3637528.3671564
381 OpenFedLLM: Training Large Language Models on Decentralized Private Data via Federated Learning Rui Ye, Wenhao Wang, Jingyi Chai, Dihan Li, Zexi Li, Yinda Xu, Yaxin Du, Yanfeng Wang, Siheng Chen 2024-08-25 KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining https://github.com/rui-ye/OpenFedLLM https://dl.acm.org/doi/10.1145/3637528.3671582
382 A Survey of Large Language Models for Graphs Xubin Ren, Jiabin Tang, Dawei Yin, Nitesh V. Chawla, Chao Huang 2024-08-25 KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining https://github.com/HKUDS/Awesome-LLM4Graph-Papers https://dl.acm.org/doi/10.1145/3637528.3671460
383 Large Language Model-driven Meta-structure Discovery in Heterogeneous Information Network Lin Chen, Fengli Xu, Nian Li, Zhenyu Han, Meng Wang, Yong Li, Pan Hui 2024-08-25 KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining https://github.com/LinChen-65/ReStruct https://dl.acm.org/doi/10.1145/3637528.3671965
384 Bias and Unfairness in Information Retrieval Systems: New Challenges in the LLM Era Sunhao Dai, Chen Xu, Shicheng Xu, Liang Pang, Zhenhua Dong, Jun Xu 2024-08-25 KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining https://llm-ir-bias-fairness.github.io/ https://dl.acm.org/doi/10.1145/3637528.3671458
385 AutoWebGLM: A Large Language Model-based Web Navigating Agent Hanyu Lai, Xiao Liu, Iat Long Iong, Shuntian Yao, Yuxuan Chen, Pengbo Shen, Hao Yu, Hanchen Zhang, Xiaohan Zhang, Yuxiao Dong, Jie Tang 2024-08-25 KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining https://github.com/THUDM/AutoWebGLM https://dl.acm.org/doi/10.1145/3637528.3671620
386 A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models Wenqi Fan, Yujuan Ding, Liangbo Ning, Shijie Wang, Hengyun Li, Dawei Yin, Tat-Seng Chua, Qing Li 2024-08-25 KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining https://advanced-recommender-systems.github.io/RAG-Meets-LLMs/ https://dl.acm.org/doi/10.1145/3637528.3671470
387 ConVis: Contrastive Decoding with Hallucination Visualization for Mitigating Hallucinations in Multimodal Large Language Models Yeji Park, Deokyeong Lee, Junsuk Choe, Buru Chang 2024-08-25 arXiv https://github.com/yejipark-m/ConVis https://doi.org/10.48550/arXiv.2408.13906
388 HRGraph: Leveraging LLMs for HR Data Knowledge Graphs with Information Propagation-based Job Recommendation Azmine Toushik Wasi 2024-08-24 Proceedings of the 1st Workshop on Knowledge Graphs and Large Language Models (KaLLM 2024), Association for Computational Linguistics 2024 https://github.com/azminewasi/HRGraph http://arxiv.org/abs/2408.13521v1
389 LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs Chansung Park, Juyong Jiang, Fan Wang, Sayak Paul, Jing Tang, Sunghun Kim 2024-08-24 arXiv https://github.com/deep-diver/llamaduo http://arxiv.org/abs/2408.13467v2
390 vitaLITy 2: Reviewing Academic Literature Using Large Language Models Hongye An, Arpit Narechania, Emily Wall, Kai Xu 2024-08-24 arXiv https://vitality-vis.github.io https://doi.org/10.48550/arXiv.2408.13450
391 LIMP: Large Language Model Enhanced Intent-aware Mobility Prediction Songwei Li, Jie Feng, Jiawei Chi, Xinyuan Hu, Xiaomeng Zhao, Fengli Xu 2024-08-23 arXiv https://github.com/tsinghua-fib-lab/LIMP https://doi.org/10.48550/arXiv.2408.12832
392 MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans? Yi-Fan Zhang, Huanyu Zhang, Haochen Tian, Chaoyou Fu, Shuangqing Zhang, Junfei Wu, Feng Li, Kun Wang, Qingsong Wen, Zhang Zhang, Liang Wang, Rong Jin, Tieniu Tan 2024-08-23 arXiv https://mme-realworld.github.io/ http://arxiv.org/abs/2408.13257v1
393 LLM-PBE: Assessing Data Privacy in Large Language Models Qinbin Li, Junyuan Hong, Chulin Xie, Jeffrey Tan, Rachel Xin, Junyi Hou, Xavier Yin, Zhun Wang, Dan Hendrycks, Zhangyang Wang, Bo Li, Bingsheng He, Dawn Song 2024-08-23 Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 11 https://llm-pbe.github.io/ https://dl.acm.org/doi/10.14778/3681954.3681994
394 Generating Analytic Specifications for Data Visualization from Natural Language Queries using Large Language Models Subham Sah, Rishab Mitra, Arpit Narechania, Alex Endert, John T. Stasko, Wenwen Dou 2024-08-23 arXiv https://nl4dv.github.io https://doi.org/10.48550/arXiv.2408.13391
395 IAA: Inner-Adaptor Architecture Empowers Frozen Large Language Model with Multimodal Capabilities Bin Wang, Chunyu Xie, Dawei Leng, Yuhui Yin 2024-08-23 arXiv https://github.com/360CVGroup/Inner-Adaptor-Architecture https://doi.org/10.48550/arXiv.2408.12902
396 BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks on Large Language Models Yige Li, Hanxun Huang, Yunhan Zhao, Xingjun Ma, Jun Sun 2024-08-23 arXiv https://github.com/bboylyg/BackdoorLLM https://doi.org/10.48550/arXiv.2408.12798
397 GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models Kunsheng Tang, Wenbo Zhou, Jie Zhang, Aishan Liu, Gelei Deng, Shuai Li, Peigui Qi, Weiming Zhang, Tianwei Zhang, Nenghai Yu 2024-08-22 arXiv https://github.com/kstanghere/GenderCARE-ccs24 https://doi.org/10.48550/arXiv.2408.12494
398 Towards Evaluating and Building Versatile Large Language Models for Medicine Chaoyi Wu, Pengcheng Qiu, Jinxin Liu, Hongfei Gu, Na Li, Ya Zhang, Yanfeng Wang, Weidi Xie 2024-08-22 arXiv https://henrychur.github.io/MedS-Bench/ https://doi.org/10.48550/arXiv.2408.12547
399 Reasoning Factual Knowledge in Structured Data with Large Language Models Sirui Huang, Yanggan Gu, Xuming Hu, Zhonghao Li, Qing Li, Guandong Xu 2024-08-22 arXiv https://github.com/EganGu/StructFact https://doi.org/10.48550/arXiv.2408.12188
400 MDD-5k: A New Diagnostic Conversation Dataset for Mental Disorders Synthesized via Neuro-Symbolic LLM Agents Congchi Yin, Feng Li, Shu Zhang, Zike Wang, Jun Shao, Piji Li, Jianhua Chen, Xun Jiang 2024-08-22 arXiv https://github.com/lemonsis/MDD-5k http://arxiv.org/abs/2408.12142v1
401 Controllable Text Generation for Large Language Models: A Survey Xun Liang, Hanyu Wang, Yezhaohui Wang, Shichao Song, Jiawei Yang, Simin Niu, Jie Hu, Dan Liu, Shunyu Yao, Feiyu Xiong, Zhiyu Li 2024-08-22 arXiv https://github.com/IAAR-Shanghai/CTGSurvey https://doi.org/10.48550/arXiv.2408.12599
402 Evidence-backed Fact Checking using RAG and Few-Shot In-Context Learning with LLMs Ronit Singhal, Pransh Patwa, Parth Patwa, Aman Chadha, Amitava Das 2024-08-22 arXiv https://github.com/ronit-singhal/evidence-backed-fact-checking-using-rag-and-few-shot-in-context-learning-with-llms http://arxiv.org/abs/2408.12060v1
403 Enhanced Fine-Tuning of Lightweight Domain-Specific Q&A Model Based on Large Language Models Shenglin Zhang, Pengtian Zhu, Minghua Ma, Jiagang Wang, Yongqian Sun, Dongwen Li, Jingyu Wang, Qianying Guo, Xiaolei Hua, Lin Zhu, Dan Pei 2024-08-22 arXiv https://github.com/Zero-Pointer/Self-Evolution https://doi.org/10.48550/arXiv.2408.12247
404 Aligning (Medical) LLMs for (Counterfactual) Fairness Raphael Poulain, Hamed Fayyaz, Rahmatollah Beheshti 2024-08-22 arXiv https://github.com/healthylaife/FairAlignmentLLM http://arxiv.org/abs/2408.12055v1
405 MoE-LPR: Multilingual Extension of Large Language Models through Mixture-of-Experts with Language Priors Routing Hao Zhou, Zhijun Wang, Shujian Huang, Xin Huang, Xue Han, Junlan Feng, Chao Deng, Weihua Luo, Jiajun Chen 2024-08-21 arXiv https://github.com/zjwang21/MoE-LPR https://doi.org/10.48550/arXiv.2408.11396
406 Personality Alignment of Large Language Models Minjun Zhu, Linyi Yang, Yue Zhang 2024-08-21 arXiv https://github.com/zhu-minjun/PAlign https://doi.org/10.48550/arXiv.2408.11779
407 SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models Yang Cao 2024-08-21 arXiv https://github.com/Gunale0926/SORSA https://doi.org/10.48550/arXiv.2409.00055
408 SimBench: A Rule-Based Multi-Turn Interaction Benchmark for Evaluating an LLM's Ability to Generate Digital Twins Jingquan Wang, Harry Zhang, Huzaifa Mustafa Unjhawala, Peter Negrut, Shu Wang, Khailanii Slaton, Radu Serban, Jin-Long Wu, Dan Negrut 2024-08-21 arXiv https://github.com/uwsbel/SimBench http://arxiv.org/abs/2408.11987v1
409 Story3D-Agent: Exploring 3D Storytelling Visualization with Large Language Models Yuzhou Huang, Yiran Qin, Shunlin Lu, Xintao Wang, Rui Huang, Ying Shan, Ruimao Zhang 2024-08-21 arXiv https://yuzhou914.github.io/Story3D-Agent/ https://doi.org/10.48550/arXiv.2408.11801
410 LLM-Barber: Block-Aware Rebuilder for Sparsity Mask in One-Shot for Large Language Models Yupeng Su, Ziyi Guan, Xiaoqun Liu, Tianlai Jin, Dongkuan Wu, Graziano Chesi, Ngai Wong, Hao Yu 2024-08-20 arXiv https://github.com/YupengSu/LLM-Barber https://doi.org/10.48550/arXiv.2408.10631
411 SysBench: Can Large Language Models Follow System Messages? Yanzhao Qin, Tao Zhang, Tao Zhang, Yanjun Shen, Wenjing Luo, Haoze Sun, Yan Zhang, Yujing Qiao, Weipeng Chen, Zenan Zhou, Wentao Zhang, Bin Cui 2024-08-20 arXiv https://github.com/PKU-Baichuan-MLSystemLab/SysBench https://doi.org/10.48550/arXiv.2408.10943
412 Large Language Models for Multimodal Deformable Image Registration Mingrui Ma, Weijie Wang, Jie Ning, Jianfeng He, Nicu Sebe, Bruno Lepri 2024-08-20 arXiv https://github.com/ninjannn/LLM-Morph https://doi.org/10.48550/arXiv.2408.10703
413 Putting People in LLMs' Shoes: Generating Better Answers via Question Rewriter Junhao Chen, Bowen Wang, Zhouqiang Jiang, Yuta Nakashima 2024-08-20 arXiv https://github.com/3244we/Question-Rewriter http://arxiv.org/abs/2408.10573v1
414 FLAME: Learning to Navigate with Multimodal LLM in Urban Environments Yunzhe Xu, Yiyuan Pan, Zhe Liu, Hesheng Wang 2024-08-20 arXiv https://flame-sjtu.github.io http://arxiv.org/abs/2408.11051v1
415 Beyond Labels: Aligning Large Language Models with Human-like Reasoning Muhammad Rafsan Kabir, Rafeed Mohammad Sultan, Ihsanul Haque Asif, Jawad Ibn Ahad, Fuad Rahman, Mohammad Ruhul Amin, Nabeel Mohammed, Shafin Rahman 2024-08-20 arXiv https://github.com/apurba-nsu-rnd-lab/DFAR https://doi.org/10.48550/arXiv.2408.11879
416 Predicting Rewards Alongside Tokens: Non-disruptive Parameter Insertion for Efficient Inference Intervention in Large Language Model Chenhan Yuan, Fei Huang, Ru Peng, Keming Lu, Bowen Yu, Chang Zhou, Jingren Zhou 2024-08-20 arXiv https://github.com/chenhan97/Otter https://doi.org/10.48550/arXiv.2408.10764
417 CodeJudge-Eval: Can Large Language Models be Good Judges in Code Understanding? Yuwei Zhao, Ziyang Luo, Yuchen Tian, Hongzhan Lin, Weixiang Yan, Annan Li, Jing Ma 2024-08-20 arXiv https://github.com/CodeLLM-Research/CodeJudge-Eval https://doi.org/10.48550/arXiv.2408.10718
418 FFAA: Multimodal Large Language Model based Explainable Open-World Face Forgery Analysis Assistant Zhengchao Huang, Bin Xia, Zicheng Lin, Zhun Mou, Wenming Yang, Jiaya Jia 2024-08-19 arXiv https://ffaa-vl.github.io https://doi.org/10.48550/arXiv.2408.10072
419 AutoML-guided Fusion of Entity and LLM-based representations Boshko Koloski, Senja Pollak, Roberto Navigli, Blaž Škrlj 2024-08-19 arXiv https://github.com/bkolosk1/bablfusion http://arxiv.org/abs/2408.09794v1
420 CMoralEval: A Moral Evaluation Benchmark for Chinese Large Language Models Linhao Yu, Yongqi Leng, Yufei Huang, Shang Wu, Haixin Liu, Xinmeng Ji, Jiahui Zhao, Jinwang Song, Tingting Cui, Xiaoqing Cheng, Liutao Liutao, Deyi Xiong 2024-08-19 ACL https://github.com/tjunlp-lab/CMoralEval https://aclanthology.org/2024.findings-acl.703
421 Pedestrian Attribute Recognition: A New Benchmark Dataset and A Large Language Model Augmented Framework Jiandong Jin, Xiao Wang, Qian Zhu, Haiyang Wang, Chenglong Li 2024-08-19 arXiv https://github.com/Event-AHU/OpenPAR https://doi.org/10.48550/arXiv.2408.09720
422 R2GenCSR: Retrieving Context Samples for Large Language Model based X-ray Medical Report Generation Xiao Wang, Yuehang Li, Fuling Wang, Shiao Wang, Chuanfu Li, Bo Jiang 2024-08-19 arXiv https://github.com/Event-AHU/Medical_Image_Analysis https://doi.org/10.48550/arXiv.2408.09743
423 PA-LLaVA: A Large Language-Vision Assistant for Human Pathology Image Understanding Dawei Dai, Yuanhui Zhang, Long Xu, Qianlan Yang, Xiaojing Shen, Shuyin Xia, Guoyin Wang 2024-08-18 arXiv https://github.com/ddw2AIGROUP2CQUPT/PA-LLaVA https://doi.org/10.48550/arXiv.2408.09530
424 Antidote: Post-fine-tuning Safety Alignment for Large Language Models against Harmful Fine-tuning Tiansheng Huang, Gautam Bhattacharya, Pratik Joshi, Josh Kimball, Ling Liu 2024-08-18 arXiv https://huangtiansheng.github.io/Antidote_gh_page/ https://doi.org/10.48550/arXiv.2408.09600
425 HiAgent: Hierarchical Working Memory Management for Solving Long-Horizon Agent Tasks with Large Language Model Mengkang Hu, Tianxing Chen, Qiguang Chen, Yao Mu, Wenqi Shao, Ping Luo 2024-08-18 arXiv https://github.com/HiAgent2024/HiAgent https://doi.org/10.48550/arXiv.2408.09559
426 TC-RAG:Turing-Complete RAG's Case study on Medical LLM Systems Xinke Jiang, Yue Fang, Rihong Qiu, Haoyu Zhang, Yongxin Xu, Hao Chen, Wentao Zhang, Ruizhe Zhang, Yuchen Fang, Xu Chu, Junfeng Zhao, Yasha Wang 2024-08-17 arXiv https://github.com/Artessay/SAMA http://arxiv.org/abs/2408.09199v1
427 Authorship Attribution in the Era of LLMs: Problems, Methodologies, and Challenges Baixiang Huang, Canyu Chen, Kai Shu 2024-08-16 arXiv https://llm-authorship.github.io http://arxiv.org/abs/2408.08946v1
428 Fine-tuning LLMs for Autonomous Spacecraft Control: A Case Study Using Kerbal Space Program Alejandro Carrasco, Victor Rodriguez-Fernandez, Richard Linares 2024-08-16 arXiv https://github.com/ARCLab-MIT/kspdg http://arxiv.org/abs/2408.08676v1
429 MIA-Tuner: Adapting Large Language Models as Pre-training Text Detector Wenjie Fu, Huandong Wang, Chen Gao, Guanghua Liu, Yong Li, Tao Jiang 2024-08-16 arXiv https://github.com/wjfu99/MIA-Tuner https://doi.org/10.48550/arXiv.2408.08661
430 Prefix Guidance: A Steering Wheel for Large Language Models to Defend Against Jailbreak Attacks Jiawei Zhao, Kejiang Chen, Xiaojian Yuan, Weiming Zhang 2024-08-15 arXiv https://github.com/weiyezhimeng/Prefix-Guidance https://doi.org/10.48550/arXiv.2408.08924
431 Multimodal Causal Reasoning Benchmark: Challenging Vision Large Language Models to Infer Causal Links Between Siamese Images Zhiyuan Li, Heng Wang, Dongnan Liu, Chaoyi Zhang, Ao Ma, Jieting Long, Tom Weidong Cai 2024-08-15 arXiv https://github.com/Zhiyuan-Li-John/MuCR https://doi.org/10.48550/arXiv.2408.08105
432 Polaris: Open-ended Interactive Robotic Manipulation via Syn2Real Visual Grounding and Large Language Models Tianyu Wang, Haitao Lin, Junqiu Yu, Yanwei Fu 2024-08-15 arXiv https://star-uu-wang.github.io/Polaris/ https://doi.org/10.48550/arXiv.2408.07975
433 Can Large Language Models Understand Symbolic Graphics Programs? Zeju Qiu, Weiyang Liu, Haiwen Feng, Zhen Liu, Tim Z. Xiao, Katherine M. Collins, Joshua B. Tenenbaum, Adrian Weller, Michael J. Black, Bernhard Schölkopf 2024-08-15 arXiv https://sgp-bench.github.io/ https://doi.org/10.48550/arXiv.2408.08313
434 FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models Zhongyu Zhao, Menghang Dong, Rongyu Zhang, Wenzhao Zheng, Yunpeng Zhang, Huanrui Yang, Dalong Du, Kurt Keutzer, Shanghang Zhang 2024-08-15 arXiv https://github.com/zhenwuweihe/FactorLLM https://doi.org/10.48550/arXiv.2408.11855
435 ArabLegalEval: A Multitask Benchmark for Assessing Arabic Legal Knowledge in Large Language Models Faris Hijazi, Somayah AlHarbi, Abdulaziz AlHussein, Harethah Abu Shairah, Reem Alzahrani, Hebah AlShamlan, George Turkiyyah, Omar Knio 2024-08-15 ArabicNLP https://github.com/Thiqah/ArabLegalEval https://aclanthology.org/2024.arabicnlp-1.20
436 Evaluating Large Language Model based Personal Information Extraction and Countermeasures Yupei Liu, Yuqi Jia, Jinyuan Jia, Neil Zhenqiang Gong 2024-08-14 arXiv https://github.com/liu00222/LLM-Based-Personal-Profile-Extraction https://doi.org/10.48550/arXiv.2408.07291
437 Knowledge in Superposition: Unveiling the Failures of Lifelong Knowledge Editing for Large Language Models Chenhui Hu, Pengfei Cao, Yubo Chen, Kang Liu, Jun Zhao 2024-08-14 arXiv https://github.com/ChenhuiHu/knowledge_in_superposition https://doi.org/10.48550/arXiv.2408.07413
438 Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities Enneng Yang, Li Shen, Guibing Guo, Xingwei Wang, Xiaochun Cao, Jie Zhang, Dacheng Tao 2024-08-14 arXiv https://github.com/EnnengYang/Awesome-Model-Merging-Methods-Theories-Applications http://arxiv.org/abs/2408.07666v3
439 LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs Yushi Bai, Jiajie Zhang, Xin Lv, Linzhi Zheng, Siqi Zhu, Lei Hou, Yuxiao Dong, Jie Tang, Juanzi Li 2024-08-13 arXiv https://github.com/THUDM/LongWriter http://arxiv.org/abs/2408.07055v1
440 Kov: Transferable and Naturalistic Black-Box LLM Attacks using Markov Decision Processes and Tree Search Robert J. Moss 2024-08-11 arXiv https://github.com/sisl/Kov.jl http://arxiv.org/abs/2408.08899v1
441 Bias-Aware Low-Rank Adaptation: Mitigating Catastrophic Inheritance of Large Language Models Yupeng Chang, Yi Chang, Yuan Wu 2024-08-09 arXiv https://github.com/cyp-jlu-ai/BA-LoRA https://doi.org/10.48550/arXiv.2408.04556
442 Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models Qirui Jiao, Daoyuan Chen, Yilun Huang, Yaliang Li, Ying Shen 2024-08-09 arXiv https://github.com/modelscope/data-juicer/tree/ImgDiff https://doi.org/10.48550/arXiv.2408.04594
443 Medical Graph RAG: Towards Safe Medical Large Language Model via Graph Retrieval-Augmented Generation Junde Wu, Jiayuan Zhu, Yunli Qi, Jingkun Chen, Min Xu, Filippo Menolascina, Vicente Grau 2024-08-09 arXiv https://github.com/MedicineToken/Medical-Graph-RAG/tree/main https://doi.org/10.48550/arXiv.2408.04187
444 Open-domain Implicit Format Control for Large Language Model Generation Yiqun Yao, Wenjia Ma, Xuezhi Fang, Xin Jiang, Xiang Li, Xuying Meng, Peng Han, Jing Li, Aixin Sun, Yequan Wang 2024-08-09 arXiv https://github.com/cofe-ai/OIFC https://doi.org/10.48550/arXiv.2408.04392
445 Revisiting Multi-Modal LLM Evaluation Jian Lu, Shikhar Srivastava, Junyu Chen, Robik Shrestha, Manoj Acharya, Kushal Kafle, Christopher Kanan 2024-08-09 arXiv https://kevinlujian.github.io/MLLM_Evaluations/ http://arxiv.org/abs/2408.05334v1
446 SHIELD: LLM-Driven Schema Induction for Predictive Analytics in EV Battery Supply Chain Disruptions Zhi-Qi Cheng, Yifei Dong, Aike Shi, Wei Liu, Yuzhi Hu, Jason O'Connor, Alexander G. Hauptmann, Kate S. Whitefoot 2024-08-09 arXiv https://fly1113.github.io/MFI/ http://arxiv.org/abs/2408.05357v2
447 Tabular Transfer Learning via Prompting LLMs Jaehyun Nam, Woomin Song, Seong Hyeon Park, Jihoon Tack, Sukmin Yun, Jaehyung Kim, Kyu Hwan Oh, Jinwoo Shin 2024-08-09 arXiv https://github.com/jaehyun513/P2T http://arxiv.org/abs/2408.11063v1
448 VITA: Towards Open-Source Interactive Omni Multimodal LLM Chaoyou Fu, Haojia Lin, Zuwei Long, Yunhang Shen, Meng Zhao, Yifan Zhang, Shaoqi Dong, Xiong Wang, Di Yin, Long Ma, Xiawu Zheng, Ran He, Rongrong Ji, Yunsheng Wu, Caifeng Shan, Xing Sun 2024-08-09 arXiv https://vita-home.github.io http://arxiv.org/abs/2408.05211v2
449 ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities Jiarui Lu, Thomas Holleis, Yizhe Zhang, Bernhard Aumayer, Feng Nan, Felix Bai, Shuang Ma, Shen Ma, Mengyu Li, Guoli Yin, Zirui Wang, Ruoming Pang 2024-08-08 arXiv https://github.com/apple/ToolSandbox http://arxiv.org/abs/2408.04682v1
450 BA-LoRA: Bias-Alleviating Low-Rank Adaptation to Mitigate Catastrophic Inheritance in Large Language Models Yupeng Chang, Yi Chang, Yuan Wu 2024-08-08 arXiv https://github.com/cyp-jlu-ai/BA-LoRA http://arxiv.org/abs/2408.04556v2
451 NACL: A General and Effective KV Cache Eviction Framework for LLMs at Inference Time Yilong Chen, Guoxia Wang, Junyuan Shang, Shiyao Cui, Zhenyu Zhang, Tingwen Liu, Shuohuan Wang, Yu Sun, Dianhai Yu, Hua Wu 2024-08-07 arXiv https://github.com/PaddlePaddle/Research/tree/master/NLP/ACL2024-NACL http://arxiv.org/abs/2408.03675v2
452 WalledEval: A Comprehensive Safety Evaluation Toolkit for Large Language Models Prannaya Gupta, Le Qi Yau, Hao Han Low, I-Shiang Lee, Hugo Maximus Lim, Yu Xin Teoh, Jia Hng Koh, Dar Win Liew, Rishabh Bhardwaj, Rajat Bhardwaj, Soujanya Poria 2024-08-07 arXiv https://github.com/walledai/walledeval https://doi.org/10.48550/arXiv.2408.03837
453 CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases Xiangyan Liu, Bo Lan, Zhiyuan Hu, Yang Liu, Zhicheng Zhang, Fei Wang, Michael Shieh, Wenmeng Zhou 2024-08-07 arXiv https://github.com/modelscope/modelscope-agent/tree/master/apps/codexgraph_agent https://doi.org/10.48550/arXiv.2408.03910
454 Citekit: A Modular Toolkit for Large Language Model Citation Generation Jiajun Shen, Tong Zhou, Suifeng Zhao, Yubo Chen, Kang Liu 2024-08-06 arXiv https://github.com/SjJ1017/Citekit https://doi.org/10.48550/arXiv.2408.04662
455 OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs Hasan Iqbal, Yuxia Wang, Minghan Wang, Georgi Georgiev, Jiahui Geng, Iryna Gurevych, Preslav Nakov 2024-08-06 arXiv https://github.com/hasaniqbal777/openfactcheck http://arxiv.org/abs/2408.11832v1
456 StructEval: Deepen and Broaden Large Language Model Assessment via Structured Evaluation Boxi Cao, Mengjie Ren, Hongyu Lin, Xianpei Han, Feng Zhang, Junfeng Zhan, Le Sun 2024-08-06 ACL https://github.com/c-box/StructEval https://aclanthology.org/2024.findings-acl.314
457 Topic Modeling with Fine-tuning LLMs and Bag of Sentences Johannes Schneider 2024-08-06 arXiv https://github.com/JohnTailor/FT-Topic http://arxiv.org/abs/2408.03099v1
458 ULLME: A Unified Framework for Large Language Model Embeddings with Generation-Augmented Learning Hieu Man, Nghia Trung Ngo, Franck Dernoncourt, Thien Huu Nguyen 2024-08-06 arXiv https://github.com/nlp-uoregon/ullme https://doi.org/10.48550/arXiv.2408.03402
459 RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation Daniel Fleischer, Moshe Berchansky, Moshe Wasserblat, Peter Izsak 2024-08-05 arXiv https://github.com/IntelLabs/RAGFoundry http://arxiv.org/abs/2408.02545v1
460 ReDel: A Toolkit for LLM-Powered Recursive Multi-Agent Systems Andrew Zhu, Liam Dugan, Chris Callison-Burch 2024-08-05 arXiv https://github.com/zhudotexe/redel http://arxiv.org/abs/2408.02248v1
461 UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model Zhaowei Li, Wei Wang, Yiqing Cai, Qi Xu, Pengyu Wang, Dong Zhang, Hang Song, Botian Jiang, Zhida Huang, Tao Wang 2024-08-05 arXiv https://github.com/lzw-lzw/UnifiedMLLM https://doi.org/10.48550/arXiv.2408.02503
462 Why Are My Prompts Leaked? Unraveling Prompt Extraction Threats in Customized Large Language Models Zi Liang, Haibo Hu, Qingqing Ye, Yaxin Xiao, Haoyang Li 2024-08-05 arXiv https://github.com/liangzid/PromptExtractionEval https://doi.org/10.48550/arXiv.2408.02416
463 Mini-Monkey: Multi-Scale Adaptive Cropping for Multimodal Large Language Models Mingxin Huang, Yuliang Liu, Dingkang Liang, Lianwen Jin, Xiang Bai 2024-08-04 arXiv https://github.com/Yuliang-Liu/Monkey https://doi.org/10.48550/arXiv.2408.02034
464 MALADE: Orchestration of LLM-powered Agents with Retrieval Augmented Generation for Pharmacovigilance Jihye Choi, Nils Palumbo, Prasad Chalasani, Matthew M. Engelhard, Somesh Jha, Anivarya Kumar, David Page 2024-08-03 arXiv https://github.com/jihyechoi77/malade http://arxiv.org/abs/2408.01869v1
465 PLUGH: A Benchmark for Spatial Understanding and Reasoning in Large Language Models Alexey Tikhonov 2024-08-03 arXiv https://github.com/altsoph/PLUGH https://doi.org/10.48550/arXiv.2408.04648
466 Non Verbis, Sed Rebus: Large Language Models are Weak Solvers of Italian Rebuses Gabriele Sarti, Tommaso Caselli, Malvina Nissim, Arianna Bisazza 2024-08-02 arXiv https://github.com/gsarti/verbalized-rebus https://doi.org/10.48550/arXiv.2408.00584
467 Talk Less, Interact Better: Evaluating In-context Conversational Adaptation in Multimodal LLMs Yilun Hua, Yoav Artzi 2024-08-02 arXiv https://github.com/lil-lab/ICCA http://arxiv.org/abs/2408.01417v1
468 CFBench: A Comprehensive Constraints-Following Benchmark for LLMs Tao Zhang, Yanjun Shen, Wenjing Luo, Yan Zhang, Hao Liang, Tao Zhang, Fan Yang, Mingan Lin, Yujing Qiao, Weipeng Chen, Bin Cui, Wentao Zhang, Zenan Zhou 2024-08-02 arXiv https://github.com/PKU-Baichuan-MLSystemLab/CFBench http://arxiv.org/abs/2408.01122v1
469 Agentic LLM Workflows for Generating Patient-Friendly Medical Reports Malavikha Sudarshan, Sophie Shih, Estella Yee, Alina Yang, John Zou, Cathy Chen, Quan Zhou, Leon Chen, Chinmay Singhal, George Shih 2024-08-02 arXiv http://github.com/malavikhasudarshan/Multi-Agent-Patient-Letter-Generation http://arxiv.org/abs/2408.01112v2
470 ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models Mingrui Wu, Xinyue Cai, Jiayi Ji, Jiale Li, Oucheng Huang, Gen Luo, Hao Fei, Guannan Jiang, Xiaoshuai Sun, Rongrong Ji 2024-08-01 arXiv https://github.com/mrwu-mac/ControlMLLM https://doi.org/10.48550/arXiv.2407.21534
471 ArcheType: A Novel Framework for Open-Source Column Type Annotation Using Large Language Models Benjamin Feuer, Yurong Liu, Chinmay Hegde, Juliana Freire 2024-08 Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 9 https://github.com/penfever/ArcheType https://dl.acm.org/doi/10.14778/3665844.3665857
472 Advancing Multimodal Large Language Models in Chart Question Answering with Visualization-Referenced Instruction Tuning Xingchen Zeng, Haichuan Lin, Yilin Ye, Wei Zeng 2024-07-29 arXiv https://github.com/zengxingchen/ChartQA-MLLM https://doi.org/10.48550/arXiv.2407.20174
473 Can Editing LLMs Inject Harm? Canyu Chen, Baixiang Huang, Zekun Li, Zhaorun Chen, Shiyang Lai, Xiongxiao Xu, Jia-Chen Gu, Jindong Gu, Huaxiu Yao, Chaowei Xiao, Xifeng Yan, William Yang Wang, Philip Torr, Dawn Song, Kai Shu 2024-07-29 arXiv https://llm-editing.github.io http://arxiv.org/abs/2407.20224v2
474 CollectiveSFT: Scaling Large Language Models for Chinese Medical Benchmark with Collective Instructions in Healthcare Jingwei Zhu, Minghuan Tan, Min Yang, Ruixue Li, Hamid Alinejad-Rokny 2024-07-29 arXiv https://github.com/CAS-SIAT-XinHai/CollectiveSFT https://doi.org/10.48550/arXiv.2407.19705
475 rLLM: Relational Table Learning with LLMs Weichen Li, Xiaotong Huang, Jianwu Zheng, Zheng Wang, Chaokun Wang, Li Pan, Jianhua Li 2024-07-29 arXiv https://github.com/rllm-project/rllm http://arxiv.org/abs/2407.20157v1
476 A Role-specific Guided Large Language Model for Ophthalmic Consultation Based on Stylistic Differentiation Laiyi Fu, Binbin Fan, Hongkai Du, Yanxiang Feng, Chunhua Li, Huping Song 2024-07-26 arXiv https://github.com/sperfu/EyeDoc https://doi.org/10.48550/arXiv.2407.18483
477 The Dark Side of Function Calling: Pathways to Jailbreaking Large Language Models Zihui Wu, Haichang Gao, Jianping He, Ping Wang 2024-07-25 arXiv https://github.com/wooozihui/jailbreakfunction https://doi.org/10.48550/arXiv.2407.17915
478 Exploring Bengali Religious Dialect Biases in Large Language Models with Evaluation Perspectives Azmine Toushik Wasi, Raima Islam, Mst Rafia Islam, Taki Hasan Rafi, Dong-Kyu Chae 2024-07-25 arXiv https://heal-workshop.github.io/#:~:text=Exploring%20Bengali%20Religious%20Dialect%20Biases%20in%20Large%20Language%20Models%20with%20Evaluation%20Perspectives https://doi.org/10.48550/arXiv.2407.18376
479 Accurate and Efficient Fine-Tuning of Quantized Large Language Models Through Optimal Balance Ao Shen, Qiang Wang, Zhiquan Lai, Xionglve Li, Dong-sheng Li 2024-07-24 arXiv https://github.com/xiaocaigou/qbaraqahira https://doi.org/10.48550/arXiv.2407.17029
480 Scalify: scale propagation for efficient low-precision LLM training Paul Balança, Sam Hosegood, Carlo Luschi, Andrew Fitzgibbon 2024-07-24 arXiv https://github.com/graphcore-research/jax-scalify http://arxiv.org/abs/2407.17353v1
481 Enhancing LLM's Cognition via Structurization Kai Liu, Zhihang Fu, Chao Chen, Wei Zhang, Rongxin Jiang, Fan Zhou, Yaowu Chen, Yue Wu, Jieping Ye 2024-07-23 arXiv https://github.com/alibaba/struxgpt http://arxiv.org/abs/2407.16434v1
482 Figure it Out: Analyzing-based Jailbreak Attack on Large Language Models Shi Lin, Rongchang Li, Xun Wang, Changting Lin, Wenpeng Xing, Meng Han 2024-07-23 arXiv https://github.com/theshi-1128/ABJ-Attack https://doi.org/10.48550/arXiv.2407.16205
483 INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model Yiwei Ma, Zhibin Wang, Xiaoshuai Sun, Weihuang Lin, Qiang Zhou, Jiayi Ji, Rongrong Ji 2024-07-23 arXiv https://github.com/WeihuangLin/INF-LLaVA https://doi.org/10.48550/arXiv.2407.16198
484 LawLuo: A Chinese Law Firm Co-run by LLM Agents Jingyun Sun, Chengxiao Dai, Zhongze Luo, Yangbo Chang, Yang Li 2024-07-23 arXiv https://github.com/NEFUJing/LawLuo http://arxiv.org/abs/2407.16252v1
485 Structure-aware Domain Knowledge Injection for Large Language Models Kai Liu, Ze Chen, Zhihang Fu, Rongxin Jiang, Fan Zhou, Yaowu Chen, Yue Wu, Jieping Ye 2024-07-23 arXiv https://github.com/alibaba/struxgpt http://arxiv.org/abs/2407.16724v2
486 LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models Xi Chen, Songyang Zhang, Qibing Bai, Kai Chen, Satoshi Nakamura 2024-07-22 ACL https://github.com/openaudiolab/LLaST https://aclanthology.org/2024.findings-acl.416
487 SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models Mingze Xu, Mingfei Gao, Zhe Gan, Hong-You Chen, Zhengfeng Lai, Haiming Gang, Kai Kang, Afshin Dehghan 2024-07-22 arXiv https://github.com/apple/ml-slowfast-llava https://doi.org/10.48550/arXiv.2407.15841
488 Counter Turing Test ($CT^2$): Investigating AI-Generated Text Detection for Hindi -- Ranking LLMs based on Hindi AI Detectability Index ($ADI_{hi}$) Ishan Kavathekar, Anku Rani, Ashmit Chamoli, Ponnurangam Kumaraguru, Amit Sheth, Amitava Das 2024-07-22 OpenReview https://github.com/ishank31/Counter_Turing_Test http://arxiv.org/abs/2407.15694v1
489 Knowledge Acquisition Disentanglement for Knowledge-based Visual Question Answering with Large Language Models Wenbin An, Feng Tian, Jiahao Nie, Wenkai Shi, Haonan Lin, Yan Chen, Qianying Wang, Yaqiang Wu, Guang Dai, Ping Chen 2024-07-22 arXiv https://github.com/Lackel/DKA https://doi.org/10.48550/arXiv.2407.15346
490 Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability Zhuoyan Xu, Zhenmei Shi, Yingyu Liang 2024-07-22 arXiv https://github.com/OliverXUZY/LLM_Compose https://doi.org/10.48550/arXiv.2407.15720
491 BIGbench: A Unified Benchmark for Social Bias in Text-to-Image Generative Models Based on Multi-modal LLM Hanjun Luo, Haoyu Huang, Ziye Deng, Xuecheng Liu, Ruizhe Chen, Zuozhu Liu 2024-07-21 arXiv https://github.com/BIGbench2024/BIGbench2024/ http://arxiv.org/abs/2407.15240v2
492 Large Language Model for Verilog Generation with Golden Code Feedback Ning Wang, Bingkun Yao, Jie Zhou, Xi Wang, Zhe Jiang, Nan Guan 2024-07-21 arXiv https://github.com/CatIIIIIIII/veriseek https://doi.org/10.48550/arXiv.2407.18271
493 Navigation Instruction Generation with BEV Perception and Large Language Models Sheng Fan, Rui Liu, Wenguan Wang, Yi Yang 2024-07-21 arXiv https://github.com/FanScy/BEVInstructor https://doi.org/10.48550/arXiv.2407.15087
494 SynCPKL: Harnessing LLMs to Generate Synthetic Data for Commonsense Persona Knowledge Linking Kuan-Yen Lin 2024-07-21 arXiv https://github.com/irislin1006/CPKL http://arxiv.org/abs/2407.15281v1
495 On the Design and Analysis of LLM-Based Algorithms Yanxi Chen, Yaliang Li, Bolin Ding, Jingren Zhou 2024-07-20 arXiv https://github.com/modelscope/agentscope/tree/main/examples/paper_llm_based_algorithm http://arxiv.org/abs/2407.14788v1
496 Beyond Code Generation: Assessing Code LLM Maturity with Postconditions Fusen He, Juan Zhai, Minxue Pan 2024-07-19 arXiv https://github.com/MatureModel/PostcondGen http://arxiv.org/abs/2407.14118v1
497 Enhancing Zero-shot Audio Classification using Sound Attribute Knowledge from Large Language Models Xuenan Xu, Pingyue Zhang, Ming Yan, Ji Zhang, Mengyue Wu 2024-07-19 arXiv https://www.github.com/wsntxxn/AttrEnhZsAc https://doi.org/10.48550/arXiv.2407.14355
498 Internal Consistency and Self-Feedback in Large Language Models: A Survey Xun Liang, Shichao Song, Zifan Zheng, Hanyu Wang, Qingchen Yu, Xunkai Li, Rong-Hua Li, Peng Cheng, Zhonghao Wang, Feiyu Xiong, Zhiyu Li 2024-07-19 arXiv https://github.com/IAAR-Shanghai/ICSFSurvey https://doi.org/10.48550/arXiv.2407.14507
499 SegPoint: Segment Any Point Cloud via Large Language Model Shuting He, Henghui Ding, Xudong Jiang, Bihan Wen 2024-07-18 arXiv https://heshuting555.github.io/SegPoint https://doi.org/10.48550/arXiv.2407.13761
500 ViLLa: Video Reasoning Segmentation with Large Language Model Rongkun Zheng, Lu Qi, Xi Chen, Yi Wang, Kun Wang, Yu Qiao, Hengshuang Zhao 2024-07-18 arXiv https://github.com/rkzheng99/ViLLa https://doi.org/10.48550/arXiv.2407.14500
501 Leveraging Environment Interaction for Automated PDDL Generation and Planning with Large Language Models Sadegh Mahdavi, Raquel Aoki, Keyi Tang, Yanshuai Cao 2024-07-17 arXiv https://github.com/BorealisAI/llm-pddl-planning https://doi.org/10.48550/arXiv.2407.12979
502 E5-V: Universal Embeddings with Multimodal Large Language Models Ting Jiang, Minghui Song, Zihan Zhang, Haizhen Huang, Weiwei Deng, Feng Sun, Qi Zhang, Deqing Wang, Fuzhen Zhuang 2024-07-17 arXiv https://github.com/kongds/E5-V https://doi.org/10.48550/arXiv.2407.12580
503 MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models Leyang Shen, Gongwei Chen, Rui Shao, Weili Guan, Liqiang Nie 2024-07-17 arXiv https://github.com/JiuTian-VL/MoME https://doi.org/10.48550/arXiv.2407.12709
504 Patch-Level Training for Large Language Models Chenze Shao, Fandong Meng, Jie Zhou 2024-07-17 arXiv https://github.com/shaochenze/PatchTrain https://doi.org/10.48550/arXiv.2407.12665
505 Robust Utility-Preserving Text Anonymization Based on Large Language Models Tianyu Yang, Xiaodan Zhu, Iryna Gurevych 2024-07-16 arXiv https://github.com/UKPLab/arxiv2024-rupta https://doi.org/10.48550/arXiv.2407.11770
506 VISA: Reasoning Video Object Segmentation via Large Language Models Cilin Yan, Haochen Wang, Shilin Yan, Xiaolong Jiang, Yao Hu, Guoliang Kang, Weidi Xie, Efstratios Gavves 2024-07-16 arXiv https://github.com/cilinyan/VISA https://doi.org/10.48550/arXiv.2407.11325
507 LRQ: Optimizing Post-Training Quantization for Large Language Models by Learning Low-Rank Weight-Scaling Matrices Jung Hyun Lee, Jeonghoon Kim, June Yong Yang, Se Jung Kwon, Eunho Yang, Kang Min Yoo, Dongsoo Lee 2024-07-16 arXiv https://github.com/onliwad101/FlexRound_LRQ https://doi.org/10.48550/arXiv.2407.11534
508 NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window? Mo Li, Songyang Zhang, Yunxin Liu, Kai Chen 2024-07-16 arXiv https://github.com/open-compass/opencompass http://arxiv.org/abs/2407.11963v1
509 Beyond Correctness: Benchmarking Multi-dimensional Code Generation for Large Language Models Jiasheng Zheng, Boxi Cao, Zhengzhao Ma, Ruotong Pan, Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun 2024-07-16 arXiv https://github.com/jszheng21/RACE https://doi.org/10.48550/arXiv.2407.11470
510 By My Eyes: Grounding Multimodal Large Language Models with Sensor Data via Visual Prompting Hyungjun Yoon, Biniyam Aschalew Tolera, Taesik Gong, Kimin Lee, Sung-Ju Lee 2024-07-15 arXiv https://github.com/diamond264/ByMyEyes https://doi.org/10.48550/arXiv.2407.10385
511 Evaluating Large Language Models with fmeval Pola Schwöbel, Luca Franceschi, Muhammad Bilal Zafar, Keerthan Vasist, Aman Malhotra, Tomer Shenhar, Pinal Tailor, Pinar Yilmaz, Michael Diamond, Michele Donini 2024-07-15 arXiv https://github.com/aws/fmeval https://doi.org/10.48550/arXiv.2407.12872
512 IDEAL: Leveraging Infinite and Dynamic Characterizations of Large Language Models for Query-focused Summarization Jie Cao, Dian Jiao, Qiang Yan, Wenqiao Zhang, Siliang Tang, Yueting Zhuang 2024-07-15 arXiv https://github.com/DCDmllm/IDEAL_Summary https://doi.org/10.48550/arXiv.2407.10486
513 Learning Dynamics of LLM Finetuning Yi Ren, Danica J. Sutherland 2024-07-15 arXiv https://github.com/Joshua-Ren/Learning_dynamics_LLM http://arxiv.org/abs/2407.10490v1
514 Prompt Selection Matters: Enhancing Text Annotations for Social Sciences with Large Language Models Louis Abraham, Charles Arnal, Antoine Marie 2024-07-15 arXiv https://prompt-ultra.github.io/ https://doi.org/10.48550/arXiv.2407.10645
515 Uncertainty is Fragile: Manipulating Uncertainty in Large Language Models Qingcheng Zeng, Mingyu Jin, Qinkai Yu, Zhenting Wang, Wenyue Hua, Zihao Zhou, Guangyan Sun, Yanda Meng, Shiqing Ma, Qifan Wang, Felix Juefei-Xu, Kaize Ding, Fan Yang, Ruixiang Tang, Yongfeng Zhang 2024-07-15 arXiv https://github.com/qcznlp/uncertainty_attack https://doi.org/10.48550/arXiv.2407.11282
516 VGBench: Evaluating Large Language Models on Vector Graphics Understanding and Generation Bocheng Zou, Mu Cai, Jianrui Zhang, Yong Jae Lee 2024-07-15 arXiv https://vgbench.github.io https://doi.org/10.48550/arXiv.2407.10972
517 When AI Meets Finance (StockAgent): Large Language Model-based Stock Trading in Simulated Real-world Environments Chong Zhang, Xinyi Liu, Mingyu Jin, Zhongmou Zhang, Lingyao Li, Zhenting Wang, Wenyue Hua, Dong Shu, Suiyuan Zhu, Xiaobo Jin, Sujian Li, Mengnan Du, Yongfeng Zhang 2024-07-15 arXiv https://github.com/MingyuJ666/Stockagent https://doi.org/10.48550/arXiv.2407.18957
518 Follow the Rules: Reasoning for Video Anomaly Detection with Large Language Models Yuchen Yang, Kwonjoon Lee, Behzad Dariush, Yinzhi Cao, Shao-Yuan Lo 2024-07-14 arXiv https://github.com/Yuchen413/AnomalyRuler https://doi.org/10.48550/arXiv.2407.10299
519 ChatLogic: Integrating Logic Programming with Large Language Models for Multi-Step Reasoning Zhongsheng Wang, Jiamou Liu, Qiming Bao, Hongfei Rong, Jingfeng Zhang 2024-07-14 arXiv https://github.com/Strong-AI-Lab/ChatLogic https://doi.org/10.48550/arXiv.2407.10162
520 LLMatic: Neural Architecture Search Via Large Language Models And Quality Diversity Optimization Muhammad Umair Nasir, Sam Earle, Julian Togelius, Steven James, Christopher W. Cleghorn 2024-07-14 GECCO '24: Proceedings of the Genetic and Evolutionary Computation Conference https://github.com/umair-nasir14/LLMatic https://dl.acm.org/doi/10.1145/3638529.3654017
521 Refusing Safe Prompts for Multi-modal Large Language Models Zedian Shao, Hongbin Liu, Yuepeng Hu, Neil Zhenqiang Gong 2024-07-12 arXiv https://github.com/Sadcardation/MLLM-Refusal https://doi.org/10.48550/arXiv.2407.09050
522 Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training Youliang Yuan, Wenxiang Jiao, Wenxuan Wang, Jen-tse Huang, Jiahao Xu, Tian Liang, Pinjia He, Zhaopeng Tu 2024-07-12 arXiv https://github.com/RobustNLP/DeRTa http://arxiv.org/abs/2407.09121v1
523 Mitigating Entity-Level Hallucination in Large Language Models Weihang Su, Yichen Tang, Qingyao Ai, Changyue Wang, Zhijing Wu, Yiqun Liu 2024-07-12 arXiv https://github.com/oneal2000/EntityHallucination https://doi.org/10.48550/arXiv.2407.09417
524 Global-Local Collaborative Inference with LLM for Lidar-Based Open-Vocabulary Detection Xingyu Peng, Yan Bai, Chen Gao, Lirong Yang, Fei Xia, Beipeng Mu, Xiaofei Wang, Si Liu 2024-07-12 arXiv https://github.com/GradiusTwinbee/GLIS http://arxiv.org/abs/2407.08931v1
525 Stepwise Verification and Remediation of Student Reasoning Errors with Large Language Model Tutors Nico Daheim, Jakub Macina, Manu Kapur, Iryna Gurevych, Mrinmaya Sachan 2024-07-12 arXiv https://github.com/eth-lre/verify-then-generate https://doi.org/10.48550/arXiv.2407.09136
526 GLBench: A Comprehensive Benchmark for Graph with Large Language Models Yuhan Li, Peisong Wang, Xiao Zhu, Aochuan Chen, Haiyun Jiang, Deng Cai, Victor Wai Kin Chan, Jia Li 2024-07-12 arXiv https://github.com/NineAbyss/GLBench https://doi.org/10.48550/arXiv.2407.07457
527 Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps Yung-Sung Chuang, Linlu Qiu, Cheng-Yu Hsieh, Ranjay Krishna, Yoon Kim, James R. Glass 2024-07-11 arXiv https://github.com/voidism/Lookback-Lens https://doi.org/10.48550/arXiv.2407.07071
528 FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation Liqun Ma, Mingjie Sun, Zhiqiang Shen 2024-07-11 arXiv https://github.com/LiqunMa/FBI-LLM http://arxiv.org/abs/2407.07093v1
529 Incorporating Large Language Models into Production Systems for Enhanced Task Automation and Flexibility Yuchen Xia, Jize Zhang, Nasser Jazdi, Michael Weyrich 2024-07-11 arXiv https://github.com/YuchenXia/GPT4IndustrialAutomation https://doi.org/10.48550/arXiv.2407.08550
530 Metron: Holistic Performance Evaluation Framework for LLM Inference Systems Amey Agrawal, Anmol Agarwal, Nitin Kedia, Jayashree Mohan, Souvik Kundu, Nipun Kwatra, Ramachandran Ramjee, Alexey Tumanov 2024-07-11 arXiv https://github.com/project-metron/metron http://arxiv.org/abs/2407.07000v1
531 Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing Huanqian Wang, Yang Yue, Rui Lu, Jingxin Shi, Andrew Zhao, Shenzhi Wang, Shiji Song, Gao Huang 2024-07-11 arXiv https://github.com/lucywang720/model-surgery http://arxiv.org/abs/2407.08770v1
532 SEED-Story: Multimodal Long Story Generation with Large Language Model Shuai Yang, Yuying Ge, Yang Li, Yukang Chen, Yixiao Ge, Ying Shan, Yingcong Chen 2024-07-11 arXiv https://github.com/TencentARC/SEED-Story https://doi.org/10.48550/arXiv.2407.08683
533 The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective Zhen Qin, Daoyuan Chen, Wenhao Zhang, Liuyi Yao, Yilun Huang, Bolin Ding, Yaliang Li, Shuiguang Deng 2024-07-11 arXiv https://github.com/modelscope/data-juicer/blob/main/docs/awesome_llm_data.md https://doi.org/10.48550/arXiv.2407.08583
534 RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization Xijie Huang, Zechun Liu, Shih-Yang Liu, Kwang-Ting Cheng 2024-07-10 arXiv https://github.com/HuangOwen/RoLoRA http://arxiv.org/abs/2407.08044v1
535 Large Language Models are Learnable Planners for Long-Term Recommendation Wentao Shi, Xiangnan He, Yang Zhang, Chongming Gao, Xinyue Li, Jizhi Zhang, Qifan Wang, Fuli Feng 2024-07-10 SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval https://github.com/jizhi-zhang/BiLLP https://dl.acm.org/doi/10.1145/3626772.3657683
536 OpenP5: An Open-Source Platform for Developing, Training, and Evaluating LLM-based Recommender Systems Shuyuan Xu, Wenyue Hua, Yongfeng Zhang 2024-07-10 SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval https://github.com/agiresearch/OpenP5 https://dl.acm.org/doi/10.1145/3626772.3657883
537 PromptLink: Leveraging Large Language Models for Cross-Source Biomedical Concept Linking Yuzhang Xie, Jiaying Lu, Joyce Ho, Fadi B. Nahab, Xiao Hu, Carl Yang 2024-07-10 SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval https://github.com/constantjxyz/PromptLink https://dl.acm.org/doi/10.1145/3626772.3657904
538 iLLM-TSC: Integration reinforcement learning and large language model for traffic signal control policy improvement Aoyu Pang, Maonan Wang, Man-On Pun, Chung Shue Chen, Xi Xiong 2024-07-10 arXiv https://github.com/Traffic-Alpha/iLLM-TSC https://doi.org/10.48550/arXiv.2407.06025
539 TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision Ruiwen Zhou, Yingxuan Yang, Muning Wen, Ying Wen, Wenhao Wang, Chunling Xi, Guoqiang Xu, Yong Yu, Weinan Zhang 2024-07-10 SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval https://github.com/skyriver-2000/TRAD-Official https://dl.acm.org/doi/10.1145/3626772.3657788
540 USimAgent: Large Language Models for Simulating Search Users Erhan Zhang, Xingzhu Wang, Peiyuan Gong, Yankai Lin, Jiaxin Mao 2024-07-10 SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval https://github.com/Meow-E/USimAgent https://dl.acm.org/doi/10.1145/3626772.3657963
541 Waterfall: Framework for Robust and Scalable Text Watermarking and Provenance for LLMs Gregory Kang Ruey Lau, Xinyuan Niu, Hieu Dao, Jiangwei Chen, Chuan-Sheng Foo, Bryan Kian Hsiang Low 2024-07-10 arXiv https://github.com/aoi3142/Waterfall http://arxiv.org/abs/2407.04411v2
542 LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages Yinquan Lu, Wenhao Zhu, Lei Li, Yu Qiao, Fei Yuan 2024-07-10 arXiv https://github.com/CONE-MT/LLaMAX/ http://arxiv.org/abs/2407.05975v1
543 LLaRA: Large Language-Recommendation Assistant Jiayi Liao, Sihang Li, Zhengyi Yang, Jiancan Wu, Yancheng Yuan, Xiang Wang, Xiangnan He 2024-07-10 SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval https://github.com/ljy0ustc/LLaRA https://dl.acm.org/doi/10.1145/3626772.3657690
544 Inference Performance Optimization for Large Language Models on CPUs Pujiang He, Shan Zhou, Wenhuan Huang, Changqing Li, Duyi Wang, Bin Guo, Chen Meng, Sheng Gui, Weifei Yu, Yi Xie 2024-07-10 arXiv https://github.com/intel/xFasterTransformer https://doi.org/10.48550/arXiv.2407.07304
545 LLMBox: A Comprehensive Library for Large Language Models Tianyi Tang, Yiwen Hu, Bingqian Li, Wenyang Luo, Zijing Qin, Haoxiang Sun, Jiapeng Wang, Shiyi Xu, Xiaoxue Cheng, Geyang Guo, Han Peng, Bowen Zheng, Yiru Tang, Yingqian Min, Yushuo Chen, Jie Chen, Yuanqian Zhao, Luran Ding, Yuhao Wang, Zican Dong, Chunxuan Xia, Junyi Li, Kun Zhou, Wayne Xin Zhao, Ji-Rong Wen 2024-07-10 arXiv https://github.com/RUCAIBox/LLMBox https://doi.org/10.48550/arXiv.2407.05563
546 IDGenRec: LLM-RecSys Alignment with Textual ID Learning Juntao Tan, Shuyuan Xu, Wenyue Hua, Yingqiang Ge, Zelong Li, Yongfeng Zhang 2024-07-10 SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval https://github.com/agiresearch/IDGenRec https://dl.acm.org/doi/10.1145/3626772.3657821
547 GraphGPT: Graph Instruction Tuning for Large Language Models Jiabin Tang, Yuhao Yang, Wei Wei, Lei Shi, Lixin Su, Suqi Cheng, Dawei Yin, Chao Huang 2024-07-10 SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval https://github.com/HKUDS/GraphGPT https://dl.acm.org/doi/10.1145/3626772.3657775
548 GenArtist: Multimodal LLM as an Agent for Unified Image Generation and Editing Zhenyu Wang, Aoxue Li, Zhenguo Li, Xihui Liu 2024-07-10 arXiv https://zhenyuw16.github.io/GenArtist_page http://arxiv.org/abs/2407.05600v1
549 EfficientQAT: Efficient Quantization-Aware Training for Large Language Models Mengzhao Chen, Wenqi Shao, Peng Xu, Jiahao Wang, Peng Gao, Kaipeng Zhang, Yu Qiao, Ping Luo 2024-07-10 arXiv https://github.com/OpenGVLab/EfficientQAT https://doi.org/10.48550/arXiv.2407.11062
550 ChatUniTest: A Framework for LLM-Based Test Generation Yinghao Chen, Zehao Hu, Chen Zhi, Junxiao Han, Shuiguang Deng, Jianwei Yin 2024-07-10 FSE 2024: Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering https://github.com/ZJU-ACES-ISE/ChatUniTest https://dl.acm.org/doi/10.1145/3663529.3663801
551 Can LLMs Master Math? Investigating Large Language Models on Math Stack Exchange Ankit Satpute, Noah Giessing, Andre Greiner-Petter, Moritz Schubotz, Olaf Teschke, Akiko Aizawa, Bela Gipp 2024-07-10 SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval https://github.com/gipplab/LLM-Investig-MathStackExchange https://dl.acm.org/doi/10.1145/3626772.3657945
552 Are Large Language Models Good at Utility Judgments? Hengran Zhang, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Yixing Fan, Xueqi Cheng 2024-07-10 SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval https://github.com/ict-bigdatalab/utility_judgments https://dl.acm.org/doi/10.1145/3626772.3657784
553 A Setwise Approach for Effective and Highly Efficient Zero-shot Ranking with Large Language Models Shengyao Zhuang, Honglei Zhuang, Bevan Koopman, Guido Zuccon 2024-07-10 SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval https://github.com/ielab/llm-rankers https://dl.acm.org/doi/10.1145/3626772.3657813
554 Etalon: Holistic Performance Evaluation Framework for LLM Inference Systems Amey Agrawal, Anmol Agarwal, Nitin Kedia, Jayashree Mohan, Souvik Kundu, Nipun Kwatra, Ramachandran Ramjee, Alexey Tumanov 2024-07-09 arXiv https://github.com/project-etalon/etalon http://arxiv.org/abs/2407.07000v2
555 DebUnc: Mitigating Hallucinations in Large Language Model Agent Communication with Uncertainty Estimations Luke Yoffe, Alfonso Amayuelas, William Yang Wang 2024-07-08 arXiv https://github.com/lukeyoffe/debunc https://doi.org/10.48550/arXiv.2407.06426
556 KG-FPQ: Evaluating Factuality Hallucination in LLMs with Knowledge Graph-based False Premise Questions Yanxu Zhu, Jinlin Xiao, Yuhang Wang, Jitao Sang 2024-07-08 arXiv https://github.com/yanxuzhu/KG-FPQ http://arxiv.org/abs/2407.05868v1
557 Beyond Perplexity: Multi-dimensional Safety Evaluation of LLM Compression Zhichao Xu, Ashim Gupta, Tao Li, Oliver Bentham, Vivek Srikumar 2024-07-06 arXiv https://github.com/zhichaoxu-shufe/Beyond-Perplexity-Compression-Safety-Eval http://arxiv.org/abs/2407.04965v2
558 LogicVista: Multimodal LLM Logical Reasoning Benchmark in Visual Contexts Yijia Xiao, Edward Sun, Tianyu Liu, Wei Wang 2024-07-06 arXiv https://github.com/Yijia-Xiao/LogicVista http://arxiv.org/abs/2407.04973v1
559 When LLMs Play the Telephone Game: Cumulative Changes and Attractors in Iterated Cultural Transmissions Jérémy Perez, Corentin Léger, Grgur Kovač, Cédric Colas, Gaia Molinaro, Maxime Derex, Pierre-Yves Oudeyer, Clément Moulin-Frier 2024-07-05 arXiv https://github.com/jeremyperez2/TelephoneGameLLM http://arxiv.org/abs/2407.04503v1
560 Towards Enhancing Coherence in Extractive Summarization: Dataset and Experiments with LLMs Mihir Parmar, Hanieh Deilamsalehy, Franck Dernoncourt, Seunghyun Yoon, Ryan A. Rossi, Trung Bui 2024-07-05 arXiv https://github.com/Mihir3009/Extract-AI http://arxiv.org/abs/2407.04855v1
561 BiosERC: Integrating Biography Speakers Supported by LLMs for ERC Tasks Jieying Xue, Minh Phuong Nguyen, Blake Matheny, Le Minh Nguyen 2024-07-05 arXiv https://github.com/yingjie7/BiosERC http://arxiv.org/abs/2407.04279v1
562 Automating Venture Capital: Founder assessment using LLM-powered segmentation, feature engineering and automated labeling techniques Ekin Ozince, Yiğit Ihlamur 2024-07-05 arXiv https://github.com/velapartners/moneyball-LLM-based-founder-features http://arxiv.org/abs/2407.04885v1
563 AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents Petr Anokhin, Nikita Semenov, Artyom Sorokin, Dmitry Evseev, Mikhail Burtsev, Evgeny Burnaev 2024-07-05 arXiv https://github.com/AIRI-Institute/AriGraph http://arxiv.org/abs/2407.04363v1
564 TongGu: Mastering Classical Chinese Understanding with Knowledge-Grounded Large Language Models Jiahuan Cao, Dezhi Peng, Peirong Zhang, Yongxin Shi, Yang Liu, Kai Ding, Lianwen Jin 2024-07-04 arXiv https://github.com/SCUT-DLVCLab/TongGu-LLM https://doi.org/10.48550/arXiv.2407.03937
565 NutriBench: A Dataset for Evaluating Large Language Models in Carbohydrate Estimation from Meal Descriptions Andong Hua, Mehak Preet Dhaliwal, Ryan Burke, Laya Pullela, Yao Qin 2024-07-04 arXiv https://mehak126.github.io/nutribench.html https://doi.org/10.48550/arXiv.2407.12843
566 AutoBench: Automatic Testbench Generation and Evaluation Using LLMs for HDL Design Ruidi Qiu, Grace Li Zhang, Rolf Drechsler, Ulf Schlichtmann, Bing Li 2024-07-04 arXiv https://github.com/AutoBench/AutoBench http://arxiv.org/abs/2407.03891v1
567 Future Events as Backdoor Triggers: Investigating Temporal Vulnerabilities in LLMs Sara Price, Arjun Panickssery, Sam Bowman, Asa Cooper Stickland 2024-07-04 arXiv https://github.com/sbp354/Future_triggered_backdoors http://arxiv.org/abs/2407.04108v1
568 Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation Yi-Chen Li, Fuxiang Zhang, Wenjie Qiu, Lei Yuan, Chengxing Jia, Zongzhang Zhang, Yang Yu, Bo An 2024-07-04 arXiv https://github.com/mansicer/Q-Adapter http://arxiv.org/abs/2407.03856v2
569 The Price of Prompting: Profiling Energy Use in Large Language Models Inference Erik Johannes Husom, Arda Goknil, Lwin Khin Shar, Sagar Sen 2024-07-04 arXiv https://github.com/ejhusom/MELODI https://doi.org/10.48550/arXiv.2407.16893
570 GraCoRe: Benchmarking Graph Comprehension and Complex Reasoning in Large Language Models Zike Yuan, Ming Liu, Hui Wang, Bing Qin 2024-07-03 arXiv https://github.com/ZIKEYUAN/GraCoRe https://doi.org/10.48550/arXiv.2407.02936
571 Improving LLM Abilities in Idiomatic Translation Sundesh Donthi, Maximilian Spencer, Om Patel, Joon Doh, Eid Rodan 2024-07-03 arXiv https://github.com/ANON13222/ITR http://arxiv.org/abs/2407.03518v1
572 Enabling Discriminative Reasoning in LLMs for Legal Judgment Prediction Chenlong Deng, Kelong Mao, Yuyao Zhang, Zhicheng Dou 2024-07-02 arXiv https://github.com/ChenlongDeng/ADAPT http://arxiv.org/abs/2407.01964v3
573 TokenPacker: Efficient Visual Projector for Multimodal LLM Wentong Li, Yuqian Yuan, Jian Liu, Dongqi Tang, Song Wang, Jie Qin, Jianke Zhu, Lei Zhang 2024-07-02 arXiv https://github.com/CircleRadon/TokenPacker http://arxiv.org/abs/2407.02392v1
574 Extracting and Encoding: Leveraging Large Language Models and Medical Knowledge to Enhance Radiological Text Representation Pablo Messina, René Vidal, Denis Parra, Alvaro Soto, Vladimir Araujo 2024-07-02 ACL https://github.com/PabloMessina/CXR-Fact-Encoder https://aclanthology.org/2024.findings-acl.236
575 To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models Bozhong Tian, Xiaozhuan Liang, Siyuan Cheng, Qingbin Liu, Mengru Wang, Dianbo Sui, Xi Chen, Huajun Chen, Ningyu Zhang 2024-07-02 arXiv https://github.com/zjunlp/KnowUnDo https://doi.org/10.48550/arXiv.2407.01920
576 CFinBench: A Comprehensive Chinese Financial Benchmark for Large Language Models Ying Nie, Binwei Yan, Tianyu Guo, Hao Liu, Haoyu Wang, Wei He, Binfan Zheng, Weihao Wang, Qiang Li, Weijian Sun, Yunhe Wang, Dacheng Tao 2024-07-02 arXiv https://cfinbench.github.io/ https://doi.org/10.48550/arXiv.2407.02301
577 Breaking Bias, Building Bridges: Evaluation and Mitigation of Social Biases in LLMs via Contact Hypothesis Chahat Raj, Anjishnu Mukherjee, Aylin Caliskan, Antonios Anastasopoulos, Ziwei Zhu 2024-07-02 arXiv https://github.com/chahatraj/breakingbias http://arxiv.org/abs/2407.02030v1
578 Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models Zihan Wang, Deli Chen, Damai Dai, Runxin Xu, Zhuoshu Li, Y. Wu 2024-07-02 arXiv https://github.com/deepseek-ai/ESFT https://doi.org/10.48550/arXiv.2407.01906
579 Fine-grained, Multi-dimensional Summarization Evaluation with LLMs Hwanjun Song, Hang Su, Igor Shalyminov, Jason Cai, Saab Mansour 2024-07-01 arXiv https://github.com/DISL-Lab/FineSurE-ACL24 http://arxiv.org/abs/2407.00908v2
580 SplitLoRA: A Split Parameter-Efficient Fine-Tuning Framework for Large Language Models Zheng Lin, Xuanjie Hu, Yuxin Zhang, Zhe Chen, Zihan Fang, Xianhao Chen, Ang Li, Praneeth Vepakomma, Yue Gao 2024-07-01 arXiv https://fduinc.github.io/splitlora/ https://doi.org/10.48550/arXiv.2407.00952
581 RLingua: Improving Reinforcement Learning Sample Efficiency in Robotic Manipulations With Large Language Models Liangliang Chen, Yutian Lei, Shiyu Jin, Ying Zhang, Liangjun Zhang 2024-07-01 IEEE Robotics and Automation Letters https://rlingua.github.io https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10529514
582 MIRAI: Evaluating LLM Agents for Event Forecasting Chenchen Ye, Ziniu Hu, Yihe Deng, Zijie Huang, Mingyu Derek Ma, Yanqiao Zhu, Wei Wang 2024-07-01 arXiv https://mirai-llm.github.io/ http://arxiv.org/abs/2407.01231v1
583 FineSurE: Fine-grained Summarization Evaluation using LLMs Hwanjun Song, Hang Su, Igor Shalyminov, Jason Cai, Saab Mansour 2024-07-01 arXiv https://github.com/DISL-Lab/FineSurE-ACL24 http://arxiv.org/abs/2407.00908v1
584 MalAlgoQA: Pedagogical Evaluation of Counterfactual Reasoning in Large Language Models and Implications for AI in Education Shashank Sonkar, Naiming Liu, MyCo Le, Richard G. Baraniuk 2024-07-01 EMNLP https://github.com/luffycodes/MalAlgoQA-Dataset https://aclanthology.org/2024.findings-emnlp.913
585 Enhancing the Capability and Robustness of Large Language Models through Reinforcement Learning-Driven Query Refinement Zisu Huang, Xiaohua Wang, Feiran Zhang, Zhibo Xu, Cenyuan Zhang, Xiaoqing Zheng, Xuanjing Huang 2024-07-01 arXiv https://github.com/Huangzisu/query-refinement https://doi.org/10.48550/arXiv.2407.01461
586 EconNLI: Evaluating Large Language Models on Economics Reasoning Yue Guo, Yi Yang 2024-07-01 arXiv https://github.com/Irenehere/EconNLI https://doi.org/10.48550/arXiv.2407.01212
587 DiscoveryBench: Towards Data-Driven Discovery with Large Language Models Bodhisattwa Prasad Majumder, Harshit Surana, Dhruv Agarwal, Bhavana Dalvi Mishra, Abhijeetsingh Meena, Aryan Prakhar, Tirth Vora, Tushar Khot, Ashish Sabharwal, Peter Clark 2024-07-01 arXiv https://github.com/allenai/discoverybench https://doi.org/10.48550/arXiv.2407.01725
588 AutoFlow: Automated Workflow Generation for Large Language Model Agents Zelong Li, Shuyuan Xu, Kai Mei, Wenyue Hua, Balaji Rama, Om Raheja, Hao Wang, He Zhu, Yongfeng Zhang 2024-07-01 arXiv https://github.com/agiresearch/AutoFlow https://doi.org/10.48550/arXiv.2407.12821
589 Exploring Advanced Large Language Models with LLMsuite Giorgio Roffo 2024-07-01 arXiv https://github.com/giorgioroffo/large_language_models_open_suite https://doi.org/10.48550/arXiv.2407.12036
590 LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation Mushui Liu, Yuhang Ma, Yang Zhen, Jun Dan, Yunlong Yu, Zeng Zhao, Zhipeng Hu, Bai Liu, Changjie Fan 2024-06-30 arXiv https://xiaobul.github.io/LLM4GEN/ http://arxiv.org/abs/2407.00737v1
591 GraphArena: Benchmarking Large Language Models on Graph Computational Problems Jianheng Tang, Qifan Zhang, Yuhan Li, Jia Li 2024-06-29 arXiv https://github.com/squareRoot3/GraphArena https://doi.org/10.48550/arXiv.2407.00379
592 LLMs-as-Instructors: Learning from Errors Toward Automating Model Improvement Jiahao Ying, Mingbao Lin, Yixin Cao, Wei Tang, Bo Wang, Qianru Sun, Xuanjing Huang, Shuicheng Yan 2024-06-29 arXiv https://yingjiahao14.github.io/LLMs-as-Instructors-pages/ http://arxiv.org/abs/2407.00497v1
593 Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs Sukmin Yun, Haokun Lin, Rusiru Thushara, Mohammad Qazim Bhat, Yongxin Wang, Zutao Jiang, Mingkai Deng, Jinhong Wang, Tianhua Tao, Junbo Li, Haonan Li, Preslav Nakov, Timothy Baldwin, Zhengzhong Liu, Eric P. Xing, Xiaodan Liang, Zhiqiang Shen 2024-06-28 arXiv https://mbzuai-llm.github.io/webpage2code/ http://arxiv.org/abs/2406.20098v1
594 Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring Jiazheng Li, Hainiu Xu, Zhaoyue Sun, Yuxiang Zhou, David West, Cesare Aloisi, Yulan He 2024-06-28 arXiv https://github.com/lijiazheng99/thought_tree_assessment http://arxiv.org/abs/2406.19949v1
595 MMRo: Are Multimodal LLMs Eligible as the Brain for In-Home Robotics? Jinming Li, Yichen Zhu, Zhiyuan Xu, Jindong Gu, Minjie Zhu, Xin Liu, Ning Liu, Yaxin Peng, Feifei Feng, Jian Tang 2024-06-28 arXiv https://mm-robobench.github.io/ http://arxiv.org/abs/2406.19693v1
596 YuLan: An Open-source Large Language Model Yutao Zhu, Kun Zhou, Kelong Mao, Wentong Chen, Yiding Sun, Zhipeng Chen, Qian Cao, Yihan Wu, Yushuo Chen, Feng Wang, Lei Zhang, Junyi Li, Xiaolei Wang, Lei Wang, Beichen Zhang, Zican Dong, Xiaoxue Cheng, Yuhan Chen, Xinyu Tang, Yupeng Hou, Qiangqiang Ren, Xincheng Pang, Shufang Xie, Wayne Xin Zhao, Zhicheng Dou, Jiaxin Mao, Yankai Lin, Ruihua Song, Jun Xu, Xu Chen, Rui Yan, Zhewei Wei, Di Hu, Wenbing Huang, Ze-Feng Gao, Yueguo Chen, Weizheng Lu, Ji-Rong Wen 2024-06-28 arXiv https://github.com/RUC-GSAI/YuLan-Chat https://doi.org/10.48550/arXiv.2406.19853
597 Predicting the Big Five Personality Traits in Chinese Counselling Dialogues Using Large Language Models Yang Yan, Lizhi Ma, Anqi Li, Jingsong Ma, Zhenzhong Lan 2024-06-27 arXiv https://github.com/kuri-leo/BigFive-LLM-Predictor https://doi.org/10.48550/arXiv.2406.17287
598 Large Language Models are Interpretable Learners Ruochen Wang, Si Si, Felix Yu, Dorothea Wiesmann, Cho-Jui Hsieh, Inderjit S. Dhillon 2024-06-27 arXiv https://github.com/ruocwang/llm-symbolic-program https://doi.org/10.48550/arXiv.2406.17224
599 STBench: Assessing the Ability of Large Language Models in Spatio-Temporal Analysis Wenbin Li, Di Yao, Ruibo Zhao, Wenjie Chen, Zijie Xu, Chengxue Luo, Chang Gong, Quanliang Jing, Haining Tan, Jingping Bi 2024-06-27 arXiv https://github.com/LwbXc/STBench https://doi.org/10.48550/arXiv.2406.19065
600 Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA Minzheng Wang, Longze Chen, Cheng Fu, Shengyi Liao, Xinghua Zhang, Bingli Wu, Haiyang Yu, Nan Xu, Lei Zhang, Run Luo, Yunshui Li, Min Yang, Fei Huang, Yongbin Li 2024-06-27 arXiv …, 2024 https://github.com/MozerWang/Loong http://arxiv.org/abs/2406.17419v1
601 DIM: Dynamic Integration of Multimodal Entity Linking with Large Language Model Shezheng Song, Shasha Li, Jie Yu, Shan Zhao, Xiaopeng Li, Jun Ma, Xiaodong Liu, Zhuo Li, Xiaoguang Mao 2024-06-27 arXiv https://github.com/season1blue/DIM https://doi.org/10.48550/arXiv.2407.12019
602 Investigating How Large Language Models Leverage Internal Knowledge to Perform Complex Reasoning Miyoung Ko, Sue Hyun Park, Joonsuk Park, Minjoon Seo 2024-06-27 arXiv https://github.com/kaistAI/knowledge-reasoning https://doi.org/10.48550/arXiv.2406.19502
603 Dual-Space Knowledge Distillation for Large Language Models Songming Zhang, Xue Zhang, Zengkui Sun, Yufeng Chen, Jinan Xu 2024-06-27 arXiv https://github.com/songmzhang/DSKD https://doi.org/10.48550/arXiv.2406.17328
604 Hierarchical Deconstruction of LLM Reasoning: A Graph-Based Framework for Analyzing Knowledge Utilization Miyoung Ko, Sue Hyun Park, Joonsuk Park, Minjoon Seo 2024-06-27 arXiv https://github.com/kaistAI/knowledge-reasoning http://arxiv.org/abs/2406.19502v2
605 A Review of Large Language Models and Autonomous Agents in Chemistry Mayk Caldas Ramos, Christopher J. Collison, Andrew D. White 2024-06-26 arXiv https://github.com/ur-whitelab/LLMs-in-science https://doi.org/10.48550/arXiv.2407.01603
606 Hierarchical Context Pruning: Optimizing Real-World Code Completion with Repository-Level Pretrained Code LLMs Lei Zhang, Yunshui Li, Jiaming Li, Xiaobo Xia, Jiaxi Yang, Run Luo, Minzheng Wang, Longze Chen, Junhao Liu, Min Yang 2024-06-26 arXiv https://github.com/Hambaobao/HCP-Coder http://arxiv.org/abs/2406.18294v2
607 Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation Guanting Dong, Yutao Zhu, Chenghao Zhang, Zechen Wang, Zhicheng Dou, Ji-Rong Wen 2024-06-26 arXiv https://github.com/dongguanting/DPA-RAG http://arxiv.org/abs/2406.18676v1
608 The Surprising Effectiveness of Multimodal Large Language Models for Video Moment Retrieval Boris Meinardus, Anil Batra, Anna Rohrbach, Marcus Rohrbach 2024-06-26 arXiv https://github.com/sudo-Boris/mr-Blip https://doi.org/10.48550/arXiv.2406.18113
609 Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs Xin Lai, Zhuotao Tian, Yukang Chen, Senqiao Yang, Xiangru Peng, Jiaya Jia 2024-06-26 arXiv https://github.com/dvlab-research/Step-DPO http://arxiv.org/abs/2406.18629v1
610 Selective Prompting Tuning for Personalized Conversations with LLMs Qiushi Huang, Xubo Liu, Tom Ko, Bo Wu, Wenwu Wang, Yu Zhang, Lilian Tang 2024-06-26 OpenReview https://github.com/hqsiswiliam/SPT http://arxiv.org/abs/2406.18187v1
611 IRCAN: Mitigating Knowledge Conflicts in LLM Generation via Identifying and Reweighting Context-Aware Neurons Dan Shi, Renren Jin, Tianhao Shen, Weilong Dong, Xinwei Wu, Deyi Xiong 2024-06-26 arXiv https://github.com/danshi777/IRCAN http://arxiv.org/abs/2406.18406v1
612 CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs Zirui Wang, Mengzhou Xia, Luxi He, Howard Chen, Yitao Liu, Richard Zhu, Kaiqu Liang, Xindi Wu, Haotian Liu, Sadhika Malladi, Alexis Chevalier, Sanjeev Arora, Danqi Chen 2024-06-26 arXiv https://charxiv.github.io/ http://arxiv.org/abs/2406.18521v1
613 ArzEn-LLM: Code-Switched Egyptian Arabic-English Translation and Speech Recognition Using LLMs Ahmed Heakl, Youssef Zaghloul, Mennatullah Ali, Rania Hossam, Walid Gomaa 2024-06-26 arXiv http://github.com/ahmedheakl/arazn-llm http://arxiv.org/abs/2406.18120v1
614 A Closer Look into Mixture-of-Experts in Large Language Models Ka Man Lo, Zeyu Huang, Zihan Qiu, Zili Wang, Jie Fu 2024-06-26 arXiv https://github.com/kamanphoebe/Look-into-MoEs https://doi.org/10.48550/arXiv.2406.18219
615 BADGE: BADminton report Generation and Evaluation with LLM Shang-Hsuan Chiang, Lin-Wei Chao, Kuang-Da Wang, Chih-Chuan Wang, Wen-Chih Peng 2024-06-26 arXiv https://github.com/AndyChiangSH/BADGE http://arxiv.org/abs/2406.18116v1
616 From Distributional to Overton Pluralism: Investigating Large Language Model Alignment Thom Lake, Eunsol Choi, Greg Durrett 2024-06-25 arXiv https://github.com/thomlake/investigating-alignment https://doi.org/10.48550/arXiv.2406.17692
617 TALEC: Teach Your LLM to Evaluate in Specific Domain with In-house Criteria by Criteria Division and Zero-shot Plus Few-shot Kaiqi Zhang, Shuai Yuan, Honghan Zhao 2024-06-25 arXiv https://github.com/zlkqz/auto_eval http://arxiv.org/abs/2407.10999v1
618 T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge Jianyu Wei, Shijie Cao, Ting Cao, Lingxiao Ma, Lei Wang, Yanyong Zhang, Mao Yang 2024-06-25 arXiv https://github.com/microsoft/T-MAC http://arxiv.org/abs/2407.00088v1
619 Retrieval Augmented Instruction Tuning for Open NER with Large Language Models Tingyu Xie, Jian Zhang, Yan Zhang, Yuanyuan Liang, Qi Li, Hongwei Wang 2024-06-25 arXiv https://github.com/Emma1066/Retrieval-Augmented-IT-OpenNER https://doi.org/10.48550/arXiv.2406.17305
620 M2Lingual: Enhancing Multilingual, Multi-Turn Instruction Alignment in Large Language Models Rishabh Maheshwary, Vikas Yadav, Hoang Nguyen, Khyati Mahajan, Sathwik Tejaswi Madhusudhan 2024-06-25 arXiv https://github.com/ServiceNow/M2Lingual https://doi.org/10.48550/arXiv.2406.16783
621 Large Language Models Are Cross-Lingual Knowledge-Free Reasoners Peng Hu, Sizhe Liu, Changjiang Gao, Xin Huang, Xue Han, Junlan Feng, Chao Deng, Shujian Huang 2024-06-25 arXiv https://github.com/NJUNLP/Knowledge-Free-Reasoning https://doi.org/10.48550/arXiv.2406.16655
622 Improving Arithmetic Reasoning Ability of Large Language Models through Relation Tuples, Verification and Dynamic Feedback Zhongtao Miao, Kaiyan Zhao, Yoshimasa Tsuruoka 2024-06-25 arXiv https://github.com/gpgg/art https://doi.org/10.48550/arXiv.2406.17873
623 Layer-Wise Quantization: A Pragmatic and Effective Method for Quantizing LLMs Beyond Integer Bit-Levels Razvan-Gabriel Dumitru, Vikas Yadav, Rishabh Maheshwary, Paul-Ioan Clotan, Sathwik Tejaswi Madhusudhan, Mihai Surdeanu 2024-06-25 arXiv https://github.com/RazvanDu/LayerwiseQuant http://arxiv.org/abs/2406.17415v2
624 DemoRank: Selecting Effective Demonstrations for Large Language Models in Ranking Task Wenhan Liu, Yutao Zhu, Zhicheng Dou 2024-06-25 arXiv https://github.com/8421BCD/DemoRank https://doi.org/10.48550/arXiv.2406.16332
625 ShadowLLM: Predictor-based Contextual Sparsity for Large Language Models Yash Akhauri, Ahmed F. AbouElhamayed, Jordan Dotzel, Zhiru Zhang, Alexander M. Rush, Safeen Huda, Mohamed S. Abdelfattah 2024-06-25 arXiv https://github.com/abdelfattah-lab/shadow_llm/ https://doi.org/10.48550/arXiv.2406.16635
626 AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models Jiale Cheng, Yida Lu, Xiaotao Gu, Pei Ke, Xiao Liu, Yuxiao Dong, Hongning Wang, Jie Tang, Minlie Huang 2024-06-25 arXiv https://github.com/thu-coai/AutoDetect https://doi.org/10.48550/arXiv.2406.16714
627 Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models Wenhao Shi, Zhiqiang Hu, Yi Bin, Junhua Liu, Yang Yang, See-Kiong Ng, Lidong Bing, Roy Ka-Wei Lee 2024-06-25 arXiv https://github.com/HZQ950419/Math-LLaVA https://doi.org/10.48550/arXiv.2406.17294
628 Multi-LogiEval: Towards Evaluating Multi-Step Logical Reasoning Ability of Large Language Models Nisarg Patel, Mohith Kulkarni, Mihir Parmar, Aashna Budhiraja, Mutsumi Nakamura, Neeraj Varshney, Chitta Baral 2024-06-25 arXiv https://github.com/Mihir3009/Multi-LogiEval https://doi.org/10.48550/arXiv.2406.17169
629 Grass: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients Aashiq Muhamed, Oscar Li, David Woodruff, Mona Diab, Virginia Smith 2024-06-25 arXiv https://github.com/aashiqmuhamed/GRASS http://arxiv.org/abs/2406.17660v1
630 Building on Efficient Foundations: Effectively Training LLMs with Structured Feedforward Layers Xiuying Wei, Skander Moalla, Razvan Pascanu, Caglar Gulcehre 2024-06-25 arXiv https://github.com/CLAIRE-Labo/StructuredFFN/tree/main http://arxiv.org/abs/2406.16450v1
631 Crafting Customisable Characters with LLMs: Introducing SimsChat, a Persona-Driven Role-Playing Agent Framework Bohao Yang, Dong Liu, Chen Tang, Chenghao Xiao, Kun Zhao, Chao Li, Lin Yuan, Guang Yang, Lanxiao Huang, Chenghua Lin 2024-06-25 arXiv https://github.com/Bernard-Yang/SimsChat http://arxiv.org/abs/2406.17962v3
632 DARG: Dynamic Evaluation of Large Language Models via Adaptive Reasoning Graph Zhehao Zhang, Jiaao Chen, Diyi Yang 2024-06-25 arXiv https://github.com/SALT-NLP/DARG https://doi.org/10.48550/arXiv.2406.17271
633 FastMem: Fast Memorization of Prompt Improves Context Awareness of Large Language Models Junyi Zhu, Shuochen Liu, Yu Yu, Bo Tang, Yibo Yan, Zhiyu Li, Feiyu Xiong, Tong Xu, Matthew B. Blaschko 2024-06-24 arXiv https://github.com/IAAR-Shanghai/FastMem https://doi.org/10.48550/arXiv.2406.16069
634 Can LLM Graph Reasoning Generalize beyond Pattern Memorization? Yizhuo Zhang, Heng Wang, Shangbin Feng, Zhaoxuan Tan, Xiaochuang Han, Tianxing He, Yulia Tsvetkov 2024-06-24 arXiv …, 2024 https://github.com/MatthewYZhang/NLGift http://arxiv.org/abs/2406.15992v1
635 Can LLM be a Personalized Judge? Yijiang River Dong, Tiancheng Hu, Nigel Collier 2024-06-24 arXiv e-prints, 2024 https://github.com/dong-river/Personalized-Judge http://arxiv.org/abs/2406.11657v1
636 ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools Team GLM, Aohan Zeng, Bin Xu, Bowen Wang, Chenhui Zhang, Da Yin, Dan Zhang, Diego Rojas, Guanyu Feng, Hanlin Zhao, Hanyu Lai, Hao Yu, Hongning Wang, Jiadai Sun, Jiajie Zhang, Jiale Cheng, Jiayi Gui, Jie Tang, Jing Zhang, Jingyu Sun, Juanzi Li, Lei Zhao, Lindong Wu, Lucen Zhong, Mingdao Liu, Minlie Huang, Peng Zhang, Qinkai Zheng, Rui Lu, Shuaiqi Duan, Shudan Zhang, Shulin Cao, Shuxun Yang, Weng Lam Tam, Wenyi Zhao, Xiao Liu, Xiao Xia, Xiaohan Zhang, Xiaotao Gu, Xin Lv, Xinghan Liu, Xinyi Liu, Xinyue Yang, Xixuan Song, Xunkai Zhang, Yifan An, Yifan Xu, Yilin Niu, Yuantao Yang, Yueyan Li, Yushi Bai, Yuxiao Dong, Zehan Qi, Zhaoyu Wang, Zhen Yang, Zhengxiao Du, Zhenyu Hou, Zihan Wang 2024-06-24 arXiv https://github.com/THUDM https://doi.org/10.48550/arXiv.2406.12793
637 Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models Lynn Chua, Badih Ghazi, Yangsibo Huang, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Amer Sinha, Chulin Xie, Chiyuan Zhang 2024-06-24 arXiv https://github.com/google-research/crosslingual-knowledge-barriers https://doi.org/10.48550/arXiv.2406.16135
638 EVALALIGN: Supervised Fine-Tuning Multimodal LLMs with Human-Aligned Data for Evaluating Text-to-Image Models Zhiyu Tan, Xiaomeng Yang, Luozheng Qin, Mengping Yang, Cheng Zhang, Hao Li 2024-06-24 arXiv https://sais-fuxi.github.io/projects/evalalign/ http://arxiv.org/abs/2406.16562v2
639 Efficient Evolutionary Search Over Chemical Space with Large Language Models Haorui Wang, Marta Skreta, Cher-Tian Ser, Wenhao Gao, Lingkai Kong, Felix Streith-Kalthoff, Chenru Duan, Yuchen Zhuang, Yue Yu, Yanqiao Zhu, Yuanqi Du, Alán Aspuru-Guzik, Kirill Neklyudov, Chao Zhang 2024-06-24 arXiv http://github.com/zoom-wang112358/MOLLEO https://doi.org/10.48550/arXiv.2406.16976
640 FS-RAG: A Frame Semantics Based Approach for Improved Factual Accuracy in Large Language Models Harish Tayyar Madabushi 2024-06-24 arXiv https://github.com/H-TayyarMadabushi/A-Frame-Semantics-based-approach-for-Improved-Factual-Accuracy-in-Large-Language-Models https://doi.org/10.48550/arXiv.2406.16167
641 Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs Ashwinee Panda, Berivan Isik, Xiangyu Qi, Sanmi Koyejo, Tsachy Weissman, Prateek Mittal 2024-06-24 arXiv https://github.com/kiddyboots216/lottery-ticket-adaptation http://arxiv.org/abs/2406.16797v2
642 Prompt-Consistency Image Generation (PCIG): A Unified Framework Integrating LLMs, Knowledge Graphs, and Controllable Diffusion Models Yichen Sun, Zhixuan Chu, Zhan Qin, Kui Ren 2024-06-24 arXiv https://github.com/TruthAI-Lab/PCIG http://arxiv.org/abs/2406.16333v1
643 AudioBench: A Universal Benchmark for Audio Large Language Models Bin Wang, Xunlong Zou, Geyu Lin, Shuo Sun, Zhuohan Liu, Wenyu Zhang, Zhengyuan Liu, AiTi Aw, Nancy F. Chen 2024-06-24 arXiv https://github.com/AudioLLMs/AudioBench https://doi.org/10.48550/arXiv.2406.16020
644 The Music Maestro or The Musically Challenged, A Massive Music Evaluation Benchmark for Large Language Models Jiajia Li, Lu Yang, Mingni Tang, Chenchong Chenchong, Zuchao Li, Ping Wang, Hai Zhao 2024-06-23 ACL https://github.com/zcli-charlie/ZIQI-Eval https://aclanthology.org/2024.findings-acl.194
645 Ladder: A Model-Agnostic Framework Boosting LLM-based Machine Translation to the Next Level Zhaopeng Feng, Ruizhe Chen, Yan Zhang, Zijie Meng, Zuozhu Liu 2024-06-23 arXiv https://github.com/fzp0424/Ladder http://arxiv.org/abs/2406.15741v1
646 RuleR: Improving LLM Controllability by Rule-based Data Recycling Ming Li, Han Chen, Chenguang Wang, Dang Nguyen, Dianqi Li, Tianyi Zhou 2024-06-23 arXiv …, 2024 https://github.com/MingLiiii/RuleR http://arxiv.org/abs/2406.15938v1
647 Leveraging Passage Embeddings for Efficient Listwise Reranking with Large Language Models Qi Liu, Bo Wang, Nan Wang, Jiaxin Mao 2024-06-22 arXiv https://github.com/liuqi6777/pe_rank https://doi.org/10.48550/arXiv.2406.14848
648 Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph Roman Vashurin, Ekaterina Fadeeva, Artem Vazhentsev, Lyudmila Rvanova, Akim Tsvigun, Daniil Vasilev, Rui Xing, Abdelrahman Boda Sadallah, Kirill Grishchenkov, Sergey Petrakov, Alexander Panchenko, Timothy Baldwin, Preslav Nakov, Maxim Panov, Artem Shelmanov 2024-06-22 arXiv https://github.com/IINemo/lm-polygraph https://doi.org/10.48550/arXiv.2406.15627
649 video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models Guangzhi Sun, Wenyi Yu, Changli Tang, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Yuxuan Wang, Chao Zhang 2024-06-22 arXiv https://github.com/bytedance/SALMONN/ https://doi.org/10.48550/arXiv.2406.15704
650 SS-GEN: A Social Story Generation Framework with Large Language Models Yi Feng, Mingyang Song, Jiaqi Wang, Zhuang Chen, Guanqun Bi, Minlie Huang, Liping Jing, Jian Yu 2024-06-22 arXiv https://github.com/MIMIFY/SS-GEN http://arxiv.org/abs/2406.15695v2
651 MT-Ladder: A Model-Agnostic Framework Boosting LLM-based Machine Translation to the Next Level Zhaopeng Feng, Yan Zhang, Ruizhe Chen, Zijie Meng, Zuozhu Liu 2024-06-22 arXiv https://github.com/fzp0424/Ladder http://arxiv.org/abs/2406.15741v2
652 Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration Zhongzhi Yu, Zheng Wang, Yonggan Fu, Huihong Shi, Khalid Shaikh, Yingyan Celine Lin 2024-06-22 arXiv https://github.com/GATECH-EIC/ACT https://doi.org/10.48550/arXiv.2406.15765
653 InternLM-Law: An Open Source Chinese Legal Large Language Model Zhiwei Fei, Songyang Zhang, Xiaoyu Shen, Dawei Zhu, Xiao Wang, Maosong Cao, Fengzhe Zhou, Yining Li, Wenwei Zhang, Dahua Lin, Kai Chen, Jidong Ge 2024-06-22 arXiv https://github.com/InternLM/InternLM-Law https://doi.org/10.48550/arXiv.2406.14887
654 Identifying Inaccurate Descriptions in LLM-generated Code Comments via Test Execution Sungmin Kang, Louis Milliken, Shin Yoo 2024-06-22 arXiv https://smkang96.github.io/assets/pdf/doctest_supplementary_arxiv.pdf http://arxiv.org/abs/2406.14836v1
655 ICLEval: Evaluating In-Context Learning Ability of Large Language Models Wentong Chen, Yankai Lin, ZhenHao Zhou, HongYun Huang, Yantao Jia, Zhao Cao, Ji-Rong Wen 2024-06-22 arXiv https://github.com/yiye3/ICLEval https://doi.org/10.48550/arXiv.2406.14955
656 GIEBench: Towards Holistic Evaluation of Group Identity-based Empathy for Large Language Models Leyan Wang, Yonggang Jin, Tianhao Shen, Tianyu Zheng, Xinrun Du, Chenchen Zhang, Wenhao Huang, Jiaheng Liu, Shi Wang, Ge Zhang, Liuyu Xiang, Zhaofeng He 2024-06-22 arXiv https://github.com/GIEBench/GIEBench https://doi.org/10.48550/arXiv.2406.14903
657 Decoding Matters: Addressing Amplification Bias and Homogeneity Issue for LLM-based Recommendation Keqin Bao, Jizhi Zhang, Yang Zhang, Xinyue Huo, Chong Chen, Fuli Feng 2024-06-22 arXiv …, 2024 https://github.com/SAI990323/DecodingMatters http://arxiv.org/abs/2406.14900v1
658 ESC-Eval: Evaluating Emotion Support Conversations in Large Language Models Haiquan Zhao, Lingyu Li, Shisong Chen, Shuqi Kong, Jiaan Wang, Kexin Huang, Tianle Gu, Yixu Wang, Jian Wang, Dandan Liang, Zhixu Li, Yan Teng, Yanghua Xiao, Yingchun Wang 2024-06-21 arXiv https://github.com/haidequanbu/ESC-Eval https://doi.org/10.48550/arXiv.2406.14952
659 GenoTEX: A Benchmark for Evaluating LLM-Based Exploration of Gene Expression Data in Alignment with Bioinformaticians Haoyang Liu, Haohan Wang 2024-06-21 arXiv https://github.com/Liu-Hy/GenoTex http://arxiv.org/abs/2406.15341v1
660 MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression Tianyu Fu, Haofeng Huang, Xuefei Ning, Genghan Zhang, Boju Chen, Tianqi Wu, Hongyi Wang, Zixiao Huang, Shiyao Li, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang 2024-06-21 arXiv https://github.com/thu-nics/MoA https://doi.org/10.48550/arXiv.2406.14909
661 OATH-Frames: Characterizing Online Attitudes Towards Homelessness with LLM Assistants Jaspreet Ranjit, Brihi Joshi, Rebecca Dorn, Laura Petry, Olga Koumoundouros, Jayne Bottarini, Peichen Liu, Eric Rice, Swabha Swayamdipta 2024-06-21 arXiv https://dill-lab.github.io/oath-frames/ http://arxiv.org/abs/2406.14883v1
662 FVEL: Interactive Formal Verification Environment with Large Language Models via Theorem Proving Xiaohan Lin, Qingxing Cao, Yinya Huang, Haiming Wang, Jianqiao Lu, Zhengying Liu, Linqi Song, Xiaodan Liang 2024-06-20 arXiv https://fveler.github.io/ https://doi.org/10.48550/arXiv.2406.14408
663 Taxonomy-Guided Zero-Shot Recommendations with LLMs Yueqing Liang, Liangwei Yang, Chen Wang, Xiongxiao Xu, Philip S. Yu, Kai Shu 2024-06-20 arXiv https://github.com/yueqingliang1/TaxRec http://arxiv.org/abs/2406.14043v1
664 ReaLHF: Optimized RLHF Training for Large Language Models through Parameter Reallocation Zhiyu Mei, Wei Fu, Kaiwei Li, Guangju Wang, Huanchen Zhang, Yi Wu 2024-06-20 arXiv https://github.com/openpsi-project/ReaLHF https://doi.org/10.48550/arXiv.2406.14088
665 QPaug: Question and Passage Augmentation for Open-Domain Question Answering of LLMs Minsang Kim, Cheoneum Park, Seungjun Baek 2024-06-20 arXiv https://github.com/kmswin1/QPaug http://arxiv.org/abs/2406.14277v2
666 MR-BEN: A Comprehensive Meta-Reasoning Benchmark for Large Language Models Zhongshen Zeng, Yinhong Liu, Yingjia Wan, Jingyao Li, Pengguang Chen, Jianbo Dai, Yuxuan Yao, Rongwu Xu, Zehan Qi, Wanru Zhao, Linling Shen, Jianqiao Lu, Haochen Tan, Yukang Chen, Hao Zhang, Zhan Shi, Bailin Wang, Zhijiang Guo, Jiaya Jia 2024-06-20 arXiv https://randolph-zeng.github.io/Mr-Ben.github.io/ https://doi.org/10.48550/arXiv.2406.13975
667 CityBench: Evaluating the Capabilities of Large Language Model as World Model Jie Feng, Jun Zhang, Junbo Yan, Xin Zhang, Tianjian Ouyang, Tianhui Liu, Yuwei Du, Siqi Guo, Yong Li 2024-06-20 arXiv https://github.com/tsinghua-fib-lab/CityBench https://doi.org/10.48550/arXiv.2406.13945
668 Evaluating Implicit Bias in Large Language Models by Attacking From a Psychometric Perspective Yuchen Wen, Keping Bi, Wei Chen, Jiafeng Guo, Xueqi Cheng 2024-06-20 arXiv https://github.com/wen112358/ImplicitBiasPsychometricEvaluation https://doi.org/10.48550/arXiv.2406.14023
669 CityGPT: Empowering Urban Spatial Cognition of Large Language Models Jie Feng, Yuwei Du, Tianhui Liu, Siqi Guo, Yuming Lin, Yong Li 2024-06-20 arXiv https://github.com/tsinghua-fib-lab/CityGPT https://doi.org/10.48550/arXiv.2406.13948
670 CEBench: A Benchmarking Toolkit for the Cost-Effectiveness of LLM Pipelines Wenbo Sun, Jiaqi Wang, Qiming Guo, Ziyu Li, Wenlu Wang, Rihan Hai 2024-06-20 arXiv https://github.com/amademicnoboday12/CEBench http://arxiv.org/abs/2407.12797v1
671 APEER: Automatic Prompt Engineering Enhances Large Language Model Reranking Can Jin, Hongwu Peng, Shiyu Zhao, Zhenting Wang, Wujiang Xu, Ligong Han, Jiahui Zhao, Kai Zhong, Sanguthevar Rajasekaran, Dimitris N. Metaxas 2024-06-20 arXiv https://github.com/jincan333/APEER https://doi.org/10.48550/arXiv.2406.14449
672 BeHonest: Benchmarking Honesty of Large Language Models Steffi Chern, Zhulin Hu, Yuqing Yang, Ethan Chern, Yuan Guo, Jiahe Jin, Binjie Wang, Pengfei Liu 2024-06-19 arXiv https://github.com/GAIR-NLP/BeHonest https://doi.org/10.48550/arXiv.2406.13261
673 Factual Confidence of LLMs: on Reliability and Robustness of Current Estimators Matéo Mahaut, Laura Aina, Paula Czarnowska, Momchil Hardalov, Thomas Müller, Lluís Màrquez 2024-06-19 OpenReview https://github.com/amazon-science/factual-confidence-of-llms http://arxiv.org/abs/2406.13415v1
674 Finding Blind Spots in Evaluator LLMs with Interpretable Checklists Sumanth Doddapaneni, Mohammed Safi Ur Rahman Khan, Sshubam Verma, Mitesh M. Khapra 2024-06-19 arXiv https://github.com/AI4Bharat/FBI http://arxiv.org/abs/2406.13439v1
675 Multi-Meta-RAG: Improving RAG for Multi-Hop Queries using Database Filtering with LLM-Extracted Metadata Mykhailo Poliakov, Nadiya Shvai 2024-06-19 arXiv https://github.com/mxpoliakov/Multi-Meta-RAG http://arxiv.org/abs/2406.13213v1
676 Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models Guanting Dong, Keming Lu, Chengpeng Li, Tingyu Xia, Bowen Yu, Chang Zhou, Jingren Zhou 2024-06-19 arXiv https://github.com/QwenLM/AutoIF https://doi.org/10.48550/arXiv.2406.13542
677 Low-Redundant Optimization for Large Language Model Alignment Zhipeng Chen, Kun Zhou, Wayne Xin Zhao, Jingyuan Wang, Ji-Rong Wen 2024-06-18 arXiv https://github.com/RUCAIBox/ALLO https://doi.org/10.48550/arXiv.2406.12606
678 TourLLM: Enhancing LLMs with Tourism Knowledge Qikai Wei, Mingzhi Yang, Jinqiang Wang, Wenwei Mao, Jiabo Xu, Huansheng Ning 2024-06-18 arXiv https://github.com/mrweiqk/Cultour http://arxiv.org/abs/2407.12791v1
679 Stealth edits to large language models Oliver J. Sutton, Qinghua Zhou, Wei Wang, Desmond J. Higham, Alexander N. Gorban, Alexander Bastounis, Ivan Y. Tyukin 2024-06-18 arXiv https://github.com/qinghua-zhou/stealth-edits http://arxiv.org/abs/2406.12670v2
680 Stealth edits for provably fixing or attacking large language models Oliver J. Sutton, Qinghua Zhou, Wei Wang, Desmond J. Higham, Alexander N. Gorban, Alexander Bastounis, Ivan Yu. Tyukin 2024-06-18 arXiv https://github.com/qinghua-zhou/stealth-edits https://doi.org/10.48550/arXiv.2406.12670
681 SHIELD: Evaluation and Defense Strategies for Copyright Compliance in LLM Text Generation Xiaoze Liu, Ting Sun, Tianyang Xu, Feijie Wu, Cunxiang Wang, Xiaoqian Wang, Jing Gao 2024-06-18 arXiv https://github.com/xz-liu/SHIELD http://arxiv.org/abs/2406.12975v1
682 MolecularGPT: Open Large Language Model (LLM) for Few-Shot Molecular Property Prediction Yuyan Liu, Sirui Ding, Sheng Zhou, Wenqi Fan, Qiaoyu Tan 2024-06-18 arXiv https://github.com/NYUSHCS/MolecularGPT https://doi.org/10.48550/arXiv.2406.12950
683 CollabStory: Multi-LLM Collaborative Story Generation and Authorship Analysis Saranya Venkatraman, Nafis Irtiza Tripto, Dongwon Lee 2024-06-18 arXiv https://github.com/saranya-venkatraman/multi_llm_story_writing http://arxiv.org/abs/2406.12665v1
684 IPEval: A Bilingual Intellectual Property Agency Consultation Evaluation Benchmark for Large Language Models Qiyao Wang, Jianguo Huang, Shule Lu, Yuan Lin, Kan Xu, Liang Yang, Hongfei Lin 2024-06-18 arXiv https://ipeval.github.io/ https://doi.org/10.48550/arXiv.2406.12386
685 Mathador-LM: A Dynamic Benchmark for Mathematical Reasoning on Large Language Models Eldar Kurtic, Amir Moeini, Dan Alistarh 2024-06-18 arXiv https://github.com/IST-DASLab/Mathador-LM https://doi.org/10.48550/arXiv.2406.12572
686 CherryRec: Enhancing News Recommendation Quality via LLM-driven Framework Shaohuang Wang, Lun Wang, Yunhan Bu, Tianwei Huang 2024-06-18 arXiv https://github.com/xxxxxx http://arxiv.org/abs/2406.12243v1
687 AgentReview: Exploring Peer Review Dynamics with LLM Agents Yiqiao Jin, Qinlin Zhao, Yiyang Wang, Hao Chen, Kaijie Zhu, Yijia Xiao, Jindong Wang 2024-06-18 arXiv https://agentreview.github.io/ http://arxiv.org/abs/2406.12708v1
688 TroL: Traversal of Layers for Large Language and Vision Models Byung-Kwan Lee, Sangyun Chung, Chae Won Kim, Beomchan Park, Yong Man Ro 2024-06-18 arXiv https://github.com/ByungKwanLee/TroL https://doi.org/10.48550/arXiv.2406.12246
689 Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM Huaxin Zhang, Xiaohao Xu, Xiang Wang, Jialong Zuo, Chuchu Han, Xiaonan Huang, Changxin Gao, Yuehuan Wang, Nong Sang 2024-06-18 arXiv https://github.com/pipixin321/HolmesVAD http://arxiv.org/abs/2406.12235v1
690 DB-GPT-Hub: Towards Open Benchmarking Text-to-SQL Empowered by Large Language Models Fan Zhou, Siqiao Xue, Danrui Qi, Wenhui Shi, Wang Zhao, Ganglin Wei, Hongyang Zhang, Caigai Jiang, Gangwei Jiang, Zhixuan Chu, Faqiang Chen 2024-06-17 arXiv https://github.com/eosphoros-ai/DB-GPT-Hub https://doi.org/10.48550/arXiv.2406.11434
691 VideoLLM-online: Online Video Large Language Model for Streaming Video Joya Chen, Zhaoyang Lv, Shiwei Wu, Kevin Qinghong Lin, Chenan Song, Difei Gao, Jia-Wei Liu, Ziteng Gao, Dongxing Mao, Mike Zheng Shou 2024-06-17 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) https://showlab.github.io/videollm-online https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10657274
692 Unveiling and Mitigating Bias in Mental Health Analysis with Large Language Models Yuqing Wang, Yun Zhao, Sara Alessandra Keller, Anne A. H. de Hond, Marieke M. van Buchem, Malvika Pillai, Tina Hernandez-Boussard 2024-06-17 arXiv https://github.com/EternityYW/BiasEval-LLM-MentalHealth https://doi.org/10.48550/arXiv.2406.12033
693 Towards an End-to-End Framework for Invasive Brain Signal Decoding with Large Language Models Sheng Feng, Heyang Liu, Yu Wang, Yanfeng Wang 2024-06-17 arXiv https://github.com/FsFrancis15/BrainLLM https://doi.org/10.48550/arXiv.2406.11568
694 Soft Prompting for Unlearning in Large Language Models Karuna Bhaila, Minh-Hao Van, Xintao Wu 2024-06-17 arXiv https://github.com/karuna-bhaila/llm_unlearning https://doi.org/10.48550/arXiv.2406.12038
695 Probing the Decision Boundaries of In-context Learning in Large Language Models Siyan Zhao, Tung Nguyen, Aditya Grover 2024-06-17 arXiv https://github.com/siyan-zhao/ICL_decision_boundary https://doi.org/10.48550/arXiv.2406.11233
696 Multimodal Needle in a Haystack: Benchmarking Long-Context Capability of Multimodal Large Language Models Hengyi Wang, Haizhou Shi, Shiwei Tan, Weiyi Qin, Wenyuan Wang, Tunyu Zhang, Akshay Nambi, Tanuja Ganu, Hao Wang 2024-06-17 arXiv https://github.com/Wang-ML-Lab/multimodal-needle-in-a-haystack https://doi.org/10.48550/arXiv.2406.11230
697 Interactive Evolution: A Neural-Symbolic Self-Training Framework For Large Language Models Fangzhi Xu, Qiushi Sun, Kanzhi Cheng, Jun Liu, Yu Qiao, Zhiyong Wu 2024-06-17 arXiv https://github.com/xufangzhi/ENVISIONS https://doi.org/10.48550/arXiv.2406.11736
698 ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking Wenshuo Li, Xinghao Chen, Han Shu, Yehui Tang, Yunhe Wang 2024-06-17 Proceedings of Machine Learning Research https://github.com/Gaffey/ExCP http://arxiv.org/abs/2406.11257v1
699 Instruct, Not Assist: LLM-based Multi-Turn Planning and Hierarchical Questioning for Socratic Code Debugging Priyanka Kargupta, Ishika Agarwal, Dilek Hakkani-Tur, Jiawei Han 2024-06-17 arXiv http://github.com/agarwalishika/TreeInstruct http://arxiv.org/abs/2406.11709v2
700 AvaTaR: Optimizing LLM Agents for Tool-Assisted Knowledge Retrieval Shirley Wu, Shiyu Zhao, Qian Huang, Kexin Huang, Michihiro Yasunaga, Kaidi Cao, Vassilis N. Ioannidis, Karthik Subbian, Jure Leskovec, James Zou 2024-06-17 arXiv https://github.com/zou-group/avatar http://arxiv.org/abs/2406.11200v2
701 mDPO: Conditional Preference Optimization for Multimodal Large Language Models Fei Wang, Wenxuan Zhou, James Y. Huang, Nan Xu, Sheng Zhang, Hoifung Poon, Muhao Chen 2024-06-17 arXiv https://feiwang96.github.io/mDPO https://doi.org/10.48550/arXiv.2406.11839
702 Tokenization Falling Short: On Subword Robustness in Large Language Models Yekun Chai, Yewei Fang, Qiwei Peng, Xuhong Li 2024-06-17 EMNLP https://github.com/FloatAI/TKEval https://aclanthology.org/2024.findings-emnlp.86
703 MMNeuron: Discovering Neuron-Level Domain-Specific Interpretation in Multimodal Large Language Model Jiahao Huo, Yibo Yan, Boren Hu, Yutao Yue, Xuming Hu 2024-06-17 arXiv https://github.com/Z1zs/MMNeuron https://doi.org/10.48550/arXiv.2406.11193
704 GeoGPT4V: Towards Geometric Multi-modal Large Language Models with Geometric Image Generation Shihao Cai, Keqin Bao, Hangyu Guo, Jizhi Zhang, Jun Song, Bo Zheng 2024-06-17 arXiv https://github.com/Lanyu0303/GeoGPT4V_Project https://doi.org/10.48550/arXiv.2406.11503
705 Problematic Tokens: Tokenizer Bias in Large Language Models Jin Yang, Zhiqiang Wang, Yanbin Lin, Zunduo Zhao 2024-06-17 arXiv https://github.com/yeyimilk/LLMGPT4o http://arxiv.org/abs/2406.11214v3
706 Investigating Annotator Bias in Large Language Models for Hate Speech Detection Amit Das, Zheng Zhang, Najib Hasan, Souvika Sarkar, Fatemeh Jamshidi, Tathagata Bhattacharya, Mostafa Rahgouy, Nilanjana Raychawdhary, Dongji Feng, Vinija Jain, Aman Chadha, Mary Sandage, Lauramarie Pope, Gerry Dozier, Cheryl Seals 2024-06-17 arXiv https://github.com/AmitDasRup123/HateSpeechCorpus https://doi.org/10.48550/arXiv.2406.11109
707 LLaNA: Large Language and NeRF Assistant Andrea Amaduzzi, Pierluigi Zama Ramirez, Giuseppe Lisanti, Samuele Salti, Luigi Di Stefano 2024-06-17 arXiv https://andreamaduzzi.github.io/llana/ https://doi.org/10.48550/arXiv.2406.11840
708 AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning Shirley Wu, Shiyu Zhao, Qian Huang, Kexin Huang, Michihiro Yasunaga, Kaidi Cao, Vassilis N. Ioannidis, Karthik Subbian, Jure Leskovec, James Zou 2024-06-17 arXiv https://github.com/zou-group/avatar http://arxiv.org/abs/2406.11200v3
709 RUPBench: Benchmarking Reasoning Under Perturbations for Robustness Evaluation in Large Language Models Yuqing Wang, Yun Zhao 2024-06-16 arXiv https://github.com/EternityYW/RUPBench https://doi.org/10.48550/arXiv.2406.11020
710 Toward Optimal LLM Alignments Using Two-Player Games Rui Zheng, Hongyi Guo, Zhihan Liu, Xiaoying Zhang, Yuanshun Yao, Xiaojun Xu, Zhaoran Wang, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang, Hang Li, Yang Liu 2024-06-16 arXiv https://github.com/ruizheng20/gpo http://arxiv.org/abs/2406.10977v1
711 SCAR: Efficient Instruction-Tuning for Large Language Models via Style Consistency-Aware Response Ranking Zhuang Li, Yuncheng Hua, Thuy-Trang Vu, Haolan Zhan, Lizhen Qu, Gholamreza Haffari 2024-06-16 arXiv https://github.com/zhuang-li/SCAR https://doi.org/10.48550/arXiv.2406.10882
712 RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models Zhuoran Jin, Pengfei Cao, Chenhao Wang, Zhitao He, Hongbang Yuan, Jiachun Li, Yubo Chen, Kang Liu, Jun Zhao 2024-06-16 arXiv http://rwku-bench.github.io https://doi.org/10.48550/arXiv.2406.10890
713 A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery Yu Zhang, Xiusi Chen, Bowen Jin, Sheng Wang, Shuiwang Ji, Wei Wang, Jiawei Han 2024-06-16 arXiv https://github.com/yuzhimanhua/Awesome-Scientific-Language-Models https://doi.org/10.48550/arXiv.2406.10833
714 Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference Jiaming Tang, Yilong Zhao, Kan Zhu, Guangxuan Xiao, Baris Kasikci, Song Han 2024-06-16 Proceedings of Machine Learning Research http://github.com/mit-han-lab/Quest http://arxiv.org/abs/2406.10774v1
715 Investigating Video Reasoning Capability of Large Language Models with Tropes in Movies Hung-Ting Su, Chun-Tong Chao, Ya-Ching Hsu, Xudong Lin, Yulei Niu, Hung-Yi Lee, Winston H. Hsu 2024-06-16 arXiv https://ander1119.github.io/TiM https://doi.org/10.48550/arXiv.2406.10923
716 GUI-WORLD: A Dataset for GUI-oriented Multimodal LLM-based Agents Dongping Chen, Yue Huang, Siyuan Wu, Jingyu Tang, Liuyi Chen, Yilin Bai, Zhigang He, Chenlong Wang, Huichi Zhou, Yiqiang Li, Tianshuo Zhou, Yue Yu, Chujie Gao, Qihui Zhang, Yi Gui, Zhen Li, Yao Wan, Pan Zhou, Jianfeng Gao, Lichao Sun 2024-06-16 arXiv https://gui-world.github.io/ http://arxiv.org/abs/2406.10819v1
717 A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners Bowen Jiang, Yangxinyu Xie, Zhuoqun Hao, Xiaomeng Wang, Tanwi Mallick, Weijie J. Su, Camillo J. Taylor, Dan Roth 2024-06-16 arXiv https://github.com/bowen-upenn/llm_token_bias https://doi.org/10.48550/arXiv.2406.11050
718 StructBench: An Autogenerated Benchmark for Evaluating Large Language Model's Ability in Structure-Rich Text Understanding Zhouhong Gu, Haoning Ye, Zeyang Zhou, Hongwei Feng, Yanghua Xiao 2024-06-15 arXiv https://github.com/MikeGu721/StructBench http://arxiv.org/abs/2406.10621v1
719 StrucText-Eval: Evaluating Large Language Model's Reasoning Ability in Structure-Rich Text Zhouhong Gu, Haoning Ye, Xingzhou Chen, Zeyang Zhou, Hongwei Feng, Yanghua Xiao 2024-06-15 arXiv https://github.com/MikeGu721/StrucText-Eval http://arxiv.org/abs/2406.10621v3
720 StrucText-Eval: An Autogenerated Benchmark for Evaluating Large Language Model's Ability in Structure-Rich Text Understanding Zhouhong Gu, Haoning Ye, Zeyang Zhou, Hongwei Feng, Yanghua Xiao 2024-06-15 arXiv https://github.com/MikeGu721/StrucText-Eval https://doi.org/10.48550/arXiv.2406.10621
721 Evaluating the Generalization Ability of Quantized LLMs: Benchmark, Analysis, and Toolbox Yijun Liu, Yuan Meng, Fang Wu, Shenhao Peng, Hang Yao, Chaoyu Guan, Chen Tang, Xinzhu Ma, Zhi Wang, Wenwu Zhu 2024-06-15 arXiv https://github.com/TsingmaoAI/MI-optimize http://arxiv.org/abs/2406.12928v1
722 What is the best model? Application-driven Evaluation for Large Language Models Shiguo Lian, Kaikai Zhao, Xinhui Liu, Xuejiao Lei, Bikun Yang, Wenjing Zhang, Kai Wang, Zhaoxiang Liu 2024-06-14 arXiv https://github.com/UnicomAI/DataSet/tree/main/TestData/GeneralAbility https://doi.org/10.48550/arXiv.2406.10307
723 BLEnD: A Benchmark for LLMs on Everyday Knowledge in Diverse Cultures and Languages Junho Myung, Nayeon Lee, Yi Zhou, Jiho Jin, Rifki Afina Putri, Dimosthenis Antypas, Hsuvas Borkakoty, Eunsu Kim, Carla Perez-Almendros, Abinew Ali Ayele, Víctor Gutiérrez-Basulto, Yazmín Ibáñez-García, Hwaran Lee, Shamsuddeen Hassan Muhammad, Kiwoong Park, Anar Sabuhi Rzayev, Nina White, Seid Muhie Yimam, Mohammad Taher Pilehvar, Nedjma Ousidhoum, Jose Camacho-Collados, Alice Oh 2024-06-14 arXiv https://github.com/nlee0212/BLEnD http://arxiv.org/abs/2406.09948v1
724 Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs Abhimanyu Hans, Yuxin Wen, Neel Jain, John Kirchenbauer, Hamid Kazemi, Prajwal Singhania, Siddharth Singh, Gowthami Somepalli, Jonas Geiping, Abhinav Bhatele, Tom Goldstein 2024-06-14 arXiv https://github.com/ahans30/goldfish-loss http://arxiv.org/abs/2406.10209v1
725 CHiSafetyBench: A Chinese Hierarchical Safety Benchmark for Large Language Models Wenjing Zhang, Xuejiao Lei, Zhaoxiang Liu, Meijuan An, Bikun Yang, Kaikai Zhao, Kai Wang, Shiguo Lian 2024-06-14 arXiv https://github.com/UnicomAI/DataSet/tree/main/TestData/Safety https://doi.org/10.48550/arXiv.2406.10311
726 CliBench: A Multifaceted and Multigranular Evaluation of Large Language Models for Clinical Decision Making Mingyu Derek Ma, Chenchen Ye, Yu Yan, Xiaoxuan Wang, Peipei Ping, Timothy S. Chang, Wei Wang 2024-06-14 arXiv https://clibench.github.io http://arxiv.org/abs/2406.09923v2
727 CliBench: Multifaceted Evaluation of Large Language Models in Clinical Decisions on Diagnoses, Procedures, Lab Tests Orders and Prescriptions Mingyu Derek Ma, Chenchen Ye, Yu Yan, Xiaoxuan Wang, Peipei Ping, Timothy S. Chang, Wei Wang 2024-06-14 arXiv https://clibench.github.io https://doi.org/10.48550/arXiv.2406.09923
728 First Multi-Dimensional Evaluation of Flowchart Comprehension for Multimodal Large Language Models Enming Zhang, Ruobing Yao, Huanyong Liu, Junhui Yu, Jiale Wang 2024-06-14 arXiv https://github.com/360AILAB-NLP/FlowCE https://doi.org/10.48550/arXiv.2406.10057
729 JailbreakEval: An Integrated Toolkit for Evaluating Jailbreak Attempts Against Large Language Models Delong Ran, Jinyuan Liu, Yichen Gong, Jingyi Zheng, Xinlei He, Tianshuo Cong, Anyu Wang 2024-06-13 arXiv https://github.com/ThuCCSLab/JailbreakEval https://doi.org/10.48550/arXiv.2406.09321
730 Sharing Matters: Analysing Neurons Across Languages and Tasks in LLMs Weixuan Wang, Barry Haddow, Minghao Wu, Wei Peng, Alexandra Birch 2024-06-13 arXiv https://github.com/weixuan-wang123/multilingual-neurons http://arxiv.org/abs/2406.09265v2
731 Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning? Zhaochen Su, Juntao Li, Jun Zhang, Tong Zhu, Xiaoye Qu, Pan Zhou, Yan Bowen, Yu Cheng, Min Zhang 2024-06-13 arXiv https://github.com/zhaochen0110/Cotempqa https://doi.org/10.48550/arXiv.2406.09072
732 LLM Reading Tea Leaves: Automatically Evaluating Topic Models with Large Language Models Xiaohao Yang, He Zhao, Dinh Q. Phung, Wray L. Buntine, Lan Du 2024-06-13 arXiv https://github.com/Xiaohao-Yang/Topic_Model_Evaluation https://doi.org/10.48550/arXiv.2406.09008
733 LLAVIDAL: Benchmarking Large Language Vision Models for Daily Activities of Living Rajatsubhra Chakraborty, Arkaprava Sinha, Dominick Reilly, Manish Kumar Govind, Pu Wang, François Brémond, Srijan Das 2024-06-13 arXiv https://adl-x.github.io/ https://doi.org/10.48550/arXiv.2406.09390
734 SciKnowEval: Evaluating Multi-level Scientific Knowledge of Large Language Models Kehua Feng, Keyan Ding, Weijie Wang, Xiang Zhuang, Zeyuan Wang, Ming Qin, Yu Zhao, Jianhua Yao, Qiang Zhang, Huajun Chen 2024-06-13 arXiv https://github.com/hicai-zju/sciknoweval https://doi.org/10.48550/arXiv.2406.09098
735 Investigating the translation capabilities of Large Language Models trained on parallel data only Javier García Gilabert, Carlos Escolano, Aleix Sant Savall, Francesca de Luca Fornaciari, Audrey Mash, Xixian Liao, Maite Melero 2024-06-13 arXiv https://github.com/projecte-aina/Plume https://doi.org/10.48550/arXiv.2406.09140
736 DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation A B M Ashikur Rahman, Saeed Anwar, Muhammad Usman, Ajmal Mian 2024-06-13 arXiv https://github.com/ashikiut/DefAn http://arxiv.org/abs/2406.09155v1
737 Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs Xuan Zhang, Chao Du, Tianyu Pang, Qian Liu, Wei Gao, Min Lin 2024-06-13 arXiv https://github.com/sail-sg/CPO http://arxiv.org/abs/2406.09136v1
738 Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs Zhao Xu, Fan Liu, Hao Liu 2024-06-13 arXiv https://github.com/usail-hkust/Bag_of_Tricks_for_LLM_Jailbreaking http://arxiv.org/abs/2406.09324v1
739 Enhancing Psychotherapy Counseling: A Data Augmentation Pipeline Leveraging Large Language Models for Counseling Conversations Jun-Woo Kim, Ji-Eun Han, Jun-Seok Koh, Hyeon-Tae Seo, Du-Seong Chang 2024-06-13 arXiv https://github.com/jwkim-chat/A-Data-Augmentation-Pipeline-Leveraging-Large-Language-Models-for-Counseling-Conversations https://doi.org/10.48550/arXiv.2406.08718
740 Large Language Models Must Be Taught to Know What They Don't Know Sanyam Kapoor, Nate Gruver, Manley Roberts, Katherine M. Collins, Arka Pal, Umang Bhatt, Adrian Weller, Samuel Dooley, Micah Goldblum, Andrew Gordon Wilson 2024-06-12 arXiv https://github.com/activatedgeek/calibration-tuning https://doi.org/10.48550/arXiv.2406.08391
741 TasTe: Teaching Large Language Models to Translate through Self-Reflection Yutong Wang, Jiali Zeng, Xuebo Liu, Fandong Meng, Jie Zhou, Min Zhang 2024-06-12 arXiv https://github.com/YutongWang1216/ReflectionLLMMT https://doi.org/10.48550/arXiv.2406.08434
742 Reversing the Forget-Retain Objectives: An Efficient LLM Unlearning Framework from Logit Difference Jiabao Ji, Yujian Liu, Yang Zhang, Gaowen Liu, Ramana Rao Kompella, Sijia Liu, Shiyu Chang 2024-06-12 arXiv https://github.com/UCSB-NLP-Chang/ULD http://arxiv.org/abs/2406.08607v1
743 Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing Zhangchen Xu, Fengqing Jiang, Luyao Niu, Yuntian Deng, Radha Poovendran, Yejin Choi, Bill Yuchen Lin 2024-06-12 arXiv https://magpie-align.github.io/ http://arxiv.org/abs/2406.08464v1
744 MobileAgentBench: An Efficient and User-Friendly Benchmark for Mobile LLM Agents Luyuan Wang, Yongyu Deng, Yiwei Zha, Guodong Mao, Qinmin Wang, Tianchen Min, Wei Chen, Shoufa Chen 2024-06-12 arXiv https://MobileAgentBench.github.io http://arxiv.org/abs/2406.08184v1
745 Large Language Model Unlearning via Embedding-Corrupted Prompts Chris Yuhao Liu, Yaxuan Wang, Jeffrey Flanigan, Yang Liu 2024-06-12 arXiv https://github.com/chrisliu298/llm-unlearn-eco https://doi.org/10.48550/arXiv.2406.07933
746 CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery Xiaoshuai Song, Muxi Diao, Guanting Dong, Zhengyang Wang, Yujia Fu, Runqi Qiao, Zhexu Wang, Dayuan Fu, Huangxuan Wu, Bin Liang, Weihao Zeng, Yejie Wang, Zhuoma Gongque, Jianing Yu, Qiuna Tan, Weiran Xu 2024-06-12 arXiv https://github.com/csbench/csbench https://doi.org/10.48550/arXiv.2406.08587
747 Are Large Language Models Good Statisticians? Yizhang Zhu, Shiyin Du, Boyan Li, Yuyu Luo, Nan Tang 2024-06-12 arXiv https://statqa.github.io/ https://doi.org/10.48550/arXiv.2406.07815
748 Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning Jaehyun Nam, Kyuyoung Kim, Seunghyuk Oh, Jihoon Tack, Jaehyung Kim, Jinwoo Shin 2024-06-12 arXiv https://github.com/jaehyun513/OCTree http://arxiv.org/abs/2406.08527v1
749 When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models Haoran You, Yichao Fu, Zheng Wang, Amir Yazdanbakhsh, Yingyan Celine Lin 2024-06-11 ICML https://github.com/GATECH-EIC/Linearized-LLM https://openreview.net/forum?id=7mFSaP6IiN
750 VulDetectBench: Evaluating the Deep Capability of Vulnerability Detection with Large Language Models Yu Liu, Lang Gao, Mingxin Yang, Yu Xie, Ping Chen, Xiaojin Zhang, Wei Chen 2024-06-11 arXiv https://github.com/Sweetaroo/VulDetectBench https://doi.org/10.48550/arXiv.2406.07595
751 VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs Zesen Cheng, Sicong Leng, Hang Zhang, Yifei Xin, Xin Li, Guanzheng Chen, Yongxin Zhu, Wenqi Zhang, Ziyang Luo, Deli Zhao, Lidong Bing 2024-06-11 arXiv https://github.com/DAMO-NLP-SG/VideoLLaMA2 http://arxiv.org/abs/2406.07476v2
752 MoreauPruner: Robust Pruning of Large Language Models against Weight Perturbations Zixiao Wang, Jingwei Zhang, Wenqian Zhao, Farzan Farnia, Bei Yu 2024-06-11 arXiv https://github.com/ShiningSord/MoreauPruner https://doi.org/10.48550/arXiv.2406.07017
753 Towards more realistic evaluation of LLM-based code generation: an experimental study and beyond Dewu Zheng, Yanlin Wang, Ensheng Shi, Ruikai Zhang, Yuchi Ma, Hongyu Zhang, Zibin Zheng 2024-06-11 arXiv https://github.com/DeepSoftwareAnalytics/EvoEval http://arxiv.org/abs/2406.06918v1
754 Scaling Large-Language-Model-based Multi-Agent Collaboration Chen Qian, Zihao Xie, Yifei Wang, Wei Liu, Yufan Dang, Zhuoyun Du, Weize Chen, Cheng Yang, Zhiyuan Liu, Maosong Sun 2024-06-11 arXiv https://github.com/OpenBMB/ChatDev https://doi.org/10.48550/arXiv.2406.07155
755 QuickLLaMA: Query-aware Inference Acceleration for Large Language Models Jingyao Li, Han Shi, Xin Jiang, Zhenguo Li, Hong Xu, Jiaya Jia 2024-06-11 arXiv https://github.com/dvlab-research/Q-LLM https://doi.org/10.48550/arXiv.2406.07528
756 Open-LLM-Leaderboard: From Multi-choice to Open-style Questions for LLMs Evaluation, Benchmark, and Arena Aidar Myrzakhan, Sondos Mahmoud Bsharat, Zhiqiang Shen 2024-06-11 arXiv https://github.com/VILA-Lab/Open-LLM-Leaderboard http://arxiv.org/abs/2406.07545v1
757 Instruct Large Language Models to Drive like Humans Ruijun Zhang, Xianda Guo, Wenzhao Zheng, Chenming Zhang, Kurt Keutzer, Long Chen 2024-06-11 arXiv https://github.com/bonbon-rj/InstructDriver https://doi.org/10.48550/arXiv.2406.07296
758 Mitigating Boundary Ambiguity and Inherent Bias for Text Classification in the Era of Large Language Models Zhenyi Lu, Jie Tian, Wei Wei, Xiaoye Qu, Yu Cheng, Wenfeng Xie, Dangyang Chen 2024-06-11 arXiv https://github.com/Chuge0335/PC-CoT https://doi.org/10.48550/arXiv.2406.07001
759 Entropy-Reinforced Planning with Large Language Models for Drug Discovery Xuefeng Liu, Chih-chan Tien, Peng Ding, Songhao Jiang, Rick L. Stevens 2024-06-11 arXiv https://github.com/xuefeng-cs/ERP https://doi.org/10.48550/arXiv.2406.07025
760 MBBQ: A Dataset for Cross-Lingual Comparison of Stereotypes in Generative LLMs Vera Neplenbroek, Arianna Bisazza, Raquel Fernández 2024-06-11 arXiv https://github.com/Veranep/MBBQ http://arxiv.org/abs/2406.07243v2
761 Benchmarking Trustworthiness of Multimodal Large Language Models: A Comprehensive Study Yichi Zhang, Yao Huang, Yitong Sun, Chang Liu, Zhe Zhao, Zhengwei Fang, Yifan Wang, Huanran Chen, Xiao Yang, Xingxing Wei, Hang Su, Yinpeng Dong, Jun Zhu 2024-06-11 arXiv https://multi-trust.github.io/ https://doi.org/10.48550/arXiv.2406.07057
762 Evolving Subnetwork Training for Large Language Models Hanqi Li, Lu Chen, Da Ma, Zijian Wu, Su Zhu, Kai Yu 2024-06-11 arXiv https://github.com/OpenDFM/EST https://doi.org/10.48550/arXiv.2406.06962
763 Limited Out-of-Context Knowledge Reasoning in Large Language Models Peng Hu, Changjiang Gao, Ruiqi Gao, Jiajun Chen, Shujian Huang 2024-06-11 arXiv https://github.com/NJUNLP/ID-OCKR https://doi.org/10.48550/arXiv.2406.07393
764 LUNAR: Unsupervised LLM-based Log Parsing Junjie Huang, Zhihan Jiang, Zhuangbin Chen, Michael R. Lyu 2024-06-11 arXiv https://github.com/Jun-jie-Huang/LUNAR http://arxiv.org/abs/2406.07174v2
765 ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization Haoran You, Yipin Guo, Yichao Fu, Wei Zhou, Huihong Shi, Xiaofan Zhang, Souvik Kundu, Amir Yazdanbakhsh, Yingyan Celine Lin 2024-06-10 arXiv https://github.com/GATECH-EIC/ShiftAddLLM http://arxiv.org/abs/2406.05981v3
766 Aligning Large Language Models with Representation Editing: A Control Perspective Lingkai Kong, Haorui Wang, Wenhao Mu, Yuanqi Du, Yuchen Zhuang, Yifei Zhou, Yue Song, Rongzhi Zhang, Kai Wang, Chao Zhang 2024-06-10 arXiv https://github.com/Lingkai-Kong/RE-Control https://doi.org/10.48550/arXiv.2406.05954
767 AutoSurvey: Large Language Models Can Automatically Write Surveys Yidong Wang, Qi Guo, Wenjin Yao, Hongbo Zhang, Xin Zhang, Zhen Wu, Meishan Zhang, Xinyu Dai, Min Zhang, Qingsong Wen, Wei Ye, Shikun Zhang, Yue Zhang 2024-06-10 arXiv https://github.com/AutoSurveys/AutoSurvey https://doi.org/10.48550/arXiv.2406.10252
768 How Efficient is LLM-Generated Code? A Rigorous & High-Standard Benchmark Ruizhong Qiu, Weiliang Will Zeng, Hanghang Tong, James Ezick, Christopher Lott 2024-06-10 arXiv https://github.com/q-rz/enamel http://arxiv.org/abs/2406.06647v2
769 LLM Dataset Inference: Did you train on my dataset? Pratyush Maini, Hengrui Jia, Nicolas Papernot, Adam Dziedzic 2024-06-10 arXiv https://github.com/pratyushmaini/llm_dataset_inference/ http://arxiv.org/abs/2406.06443v1
770 Low-Rank Quantization-Aware Training for LLMs Yelysei Bondarenko, Riccardo Del Chiaro, Markus Nagel 2024-06-10 arXiv https://github.com/qualcomm-ai-research/LR-QAT http://arxiv.org/abs/2406.06385v2
771 Recurrent Context Compression: Efficiently Expanding the Context Window of LLM Chensen Huang, Guibo Zhu, Xuepeng Wang, Yifei Luo, Guojing Ge, Haoran Chen, Dong Yi, Jinqiao Wang 2024-06-10 arXiv https://github.com/WUHU-G/RCC_Transformer http://arxiv.org/abs/2406.06110v1
772 Data-Juicer: A One-Stop Data Processing System for Large Language Models Daoyuan Chen, Yilun Huang, Zhijian Ma, Hesen Chen, Xuchen Pan, Ce Ge, Dawei Gao, Yuexiang Xie, Zhaoyang Liu, Jinyang Gao, Yaliang Li, Bolin Ding, Jingren Zhou 2024-06-09 SIGMOD/PODS '24: Companion of the 2024 International Conference on Management of Data https://github.com/alibaba/data-juicer https://dl.acm.org/doi/10.1145/3626246.3653385
773 Flow of Reasoning: Efficient Training of LLM Policy with Divergent Thinking Fangxu Yu, Lai Jiang, Haoqiang Kang, Shibo Hao, Lianhui Qin 2024-06-09 arXiv https://github.com/Yu-Fangxu/FoR http://arxiv.org/abs/2406.05673v2
774 Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples Fangxu Yu, Lai Jiang, Haoqiang Kang, Shibo Hao, Lianhui Qin 2024-06-09 arXiv https://github.com/Yu-Fangxu/FoR http://arxiv.org/abs/2406.05673v3
775 Hello Again! LLM-powered Personalized Agent for Long-term Dialogue Hao Li, Chenghao Yang, An Zhang, Yang Deng, Xiang Wang, Tat-Seng Chua 2024-06-09 arXiv https://github.com/leolee99/LD-Agent http://arxiv.org/abs/2406.05925v1
776 How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States Zhenhong Zhou, Haiyang Yu, Xinghua Zhang, Rongwu Xu, Fei Huang, Yongbin Li 2024-06-09 arXiv https://github.com/ydyjya/LLM-IHS-Explanation http://arxiv.org/abs/2406.05644v2
777 Soundscape Captioning using Sound Affective Quality Network and Large Language Model Yuanbo Hou, Qiaoqiao Ren, Andrew Mitchell, Wenwu Wang, Jian Kang, Tony Belpaeme, Dick Botteldooren 2024-06-09 arXiv https://github.com/Yuanbo2020/SoundSCaper https://doi.org/10.48550/arXiv.2406.05914
778 On the Worst Prompt Performance of Large Language Models Bowen Cao, Deng Cai, Zhisong Zhang, Yuexian Zou, Wai Lam 2024-06-08 arXiv https://github.com/cbwbuaa/On-the-Worst-Prompt- https://doi.org/10.48550/arXiv.2406.10248
779 Large Language Model Assisted Adversarial Robustness Neural Architecture Search Rui Zhong, Yang Cao, Jun Yu, Masaharu Munetomo 2024-06-08 arXiv https://github.com/RuiZhong961230/LLMO https://doi.org/10.48550/arXiv.2406.05433
780 NYU CTF Dataset: A Scalable Open-Source Benchmark Dataset for Evaluating LLMs in Offensive Security Minghao Shao, Sofija Jancheska, Meet Udeshi, Brendan Dolan-Gavitt, Haoran Xi, Kimberly Milner, Boyuan Chen, Max Yin, Siddharth Garg, Prashanth Krishnamurthy, Farshad Khorrami, Ramesh Karri, Muhammad Shafique 2024-06-08 arXiv https://github.com/NYU-LLM-CTF/LLM_CTF_Database http://arxiv.org/abs/2406.05590v1
781 3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination Jianing Yang, Xuweiyi Chen, Nikhil Madaan, Madhavan Iyengar, Shengyi Qian, David F. Fouhey, Joyce Chai 2024-06-07 arXiv https://3d-grand.github.io http://arxiv.org/abs/2406.05132v2
782 An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models Xiongtao Zhou, Jie He, Yuhua Ke, Guangyao Zhu, Víctor Gutiérrez-Basulto, Jeff Z. Pan 2024-06-07 arXiv https://github.com/alenai97/PEFT-MLLM https://doi.org/10.48550/arXiv.2406.05130
783 CRiskEval: A Chinese Multi-Level Risk Evaluation Benchmark Dataset for Large Language Models Ling Shi, Deyi Xiong 2024-06-07 arXiv https://github.com/lingshi6565/Risk_eval https://doi.org/10.48550/arXiv.2406.04752
784 FedLLM-Bench: Realistic Benchmarks for Federated Learning of Large Language Models Rui Ye, Rui Ge, Xinyu Zhu, Jingyi Chai, Yaxin Du, Yang Liu, Yanfeng Wang, Siheng Chen 2024-06-07 arXiv https://github.com/rui-ye/FedLLM-Bench https://doi.org/10.48550/arXiv.2406.04845
785 LLM-Enhanced Bayesian Optimization for Efficient Analog Layout Constraint Generation Guojin Chen, Keren Zhu, Seunggeun Kim, Hanqing Zhu, Yao Lai, Bei Yu, David Z. Pan 2024-06-07 arXiv https://github.com/dekura/LLANA http://arxiv.org/abs/2406.05250v2
786 LawGPT: A Chinese Legal Knowledge-Enhanced Large Language Model Zhi Zhou, Jiang-Xin Shi, Peng-Xiao Song, Xiao-Wen Yang, Yi-Xuan Jin, Lan-Zhe Guo, Yu-Feng Li 2024-06-07 arXiv https://github.com/pengxiao-song/LaWGPT https://doi.org/10.48550/arXiv.2406.04614
787 LogiCode: An LLM-Driven Framework for Logical Anomaly Detection Yiheng Zhang, Yunkang Cao, Xiaohao Xu, Weiming Shen 2024-06-07 IEEE Transactions on Automation Science and Engineering https://github.com/22strongestme/LOCO-Annotations https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10710633
788 Towards Semantic Equivalence of Tokenization in Multimodal LLM Shengqiong Wu, Hao Fei, Xiangtai Li, Jiayi Ji, Hanwang Zhang, Tat-Seng Chua, Shuicheng Yan 2024-06-07 arXiv https://chocowu.github.io/SeTok-web/ http://arxiv.org/abs/2406.05127v2
789 MoralBench: Moral Evaluation of LLMs Jianchao Ji, Yutong Chen, Mingyu Jin, Wujiang Xu, Wenyue Hua, Yongfeng Zhang 2024-06-06 arXiv https://github.com/agiresearch/MoralBench http://arxiv.org/abs/2406.04428v1
790 ValueBench: Towards Comprehensively Evaluating Value Orientations and Understanding of Large Language Models Yuanyi Ren, Haoran Ye, Hanjun Fang, Xin Zhang, Guojie Song 2024-06-06 arXiv https://github.com/Value4AI/ValueBench https://doi.org/10.48550/arXiv.2406.04214
791 Text-to-Drive: Diverse Driving Behavior Synthesis via Large Language Models Phat Nguyen, Tsun-Hsuan Wang, Zhang-Wei Hong, Sertac Karaman, Daniela Rus 2024-06-06 arXiv https://text-to-drive.github.io/ https://doi.org/10.48550/arXiv.2406.04300
792 TESTEVAL: Benchmarking Large Language Models for Test Case Generation Wenhan Wang, Chenyuan Yang, Zhijie Wang, Yuheng Huang, Zhaoyang Chu, Da Song, Lingming Zhang, An Ran Chen, Lei Ma 2024-06-06 arXiv https://llm4softwaretesting.github.io https://doi.org/10.48550/arXiv.2406.04531
793 PaCE: Parsimonious Concept Engineering for Large Language Models Jinqi Luo, Tianjiao Ding, Kwan Ho Ryan Chan, Darshan Thaker, Aditya Chattopadhyay, Chris Callison-Burch, René Vidal 2024-06-06 arXiv https://github.com/peterljq/Parsimonious-Concept-Engineering https://doi.org/10.48550/arXiv.2406.04331
794 Evaluating the Smooth Control of Attribute Intensity in Text Generation with LLMs Shang Zhou, Feng Yao, Chengyu Dong, Zihan Wang, Jingbo Shang 2024-06-06 arXiv https://github.com/ShangDataLab/Smooth-Control http://arxiv.org/abs/2406.04460v1
795 LLMEmbed: Rethinking Lightweight LLM's Genuine Function in Text Classification Chun Liu, Hongguang Zhang, Kainan Zhao, Xinghai Ju, Lin Yang 2024-06-06 Proceedings of the Annual Meeting of the Association for Computational Linguistics https://github.com/ChunLiu-cs/LLMEmbed-ACL2024 http://arxiv.org/abs/2406.03725v1
796 Aligning Agents like Large Language Models Adam Jelley, Yuhan Cao, David Bignell, Sam Devlin, Tabish Rashid 2024-06-06 arXiv https://adamjelley.github.io/aligning-agents-like-llms https://doi.org/10.48550/arXiv.2406.04208
797 AgentGym: Evolving Large Language Model-based Agents across Diverse Environments Zhiheng Xi, Yiwen Ding, Wenxiang Chen, Boyang Hong, Honglin Guo, Junzhe Wang, Dingwen Yang, Chenyang Liao, Xin Guo, Wei He, Songyang Gao, Lu Chen, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang 2024-06-06 arXiv https://agentgym.github.io https://doi.org/10.48550/arXiv.2406.04151
798 Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models Ling Yang, Zhaochen Yu, Tianjun Zhang, Shiyi Cao, Minkai Xu, Wentao Zhang, Joseph E. Gonzalez, Bin Cui 2024-06-06 arXiv https://github.com/YangLing0818/buffer-of-thought-llm https://doi.org/10.48550/arXiv.2406.04271
799 DICE: Detecting In-distribution Contamination in LLM's Fine-tuning Phase for Math Reasoning Shangqing Tu, Kejian Zhu, Yushi Bai, Zijun Yao, Lei Hou, Juanzi Li 2024-06-06 arXiv https://github.com/THU-KEG/DICE http://arxiv.org/abs/2406.04197v1
800 Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for Large Language Models Peijie Dong, Lujun Li, Zhenheng Tang, Xiang Liu, Xinglin Pan, Qiang Wang, Xiaowen Chu 2024-06-05 arXiv https://github.com/pprp/Pruner-Zero https://doi.org/10.48550/arXiv.2406.02924
801 MultifacetEval: Multifaceted Evaluation to Probe LLMs in Mastering Medical Knowledge Yuxuan Zhou, Xien Liu, Chen Ning, Ji Wu 2024-06-05 arXiv https://github.com/THUMLP/MultifacetEval http://arxiv.org/abs/2406.02919v1
802 Text-like Encoding of Collaborative Information in Large Language Models for Recommendation Yang Zhang, Keqin Bao, Ming Yang, Wenjie Wang, Fuli Feng, Xiangnan He 2024-06-05 ACL https://github.com/zyang1580/BinLLM https://aclanthology.org/2024.acl-long.497
803 Seq1F1B: Efficient Sequence-Level Pipeline Parallelism for Large Language Model Training Ao Sun, Weilin Zhao, Xu Han, Cheng Yang, Xinrong Zhang, Zhiyuan Liu, Chuan Shi, Maosong Sun 2024-06-05 arXiv https://github.com/MayDomine/Seq1F1B https://doi.org/10.48550/arXiv.2406.03488
804 PrE-Text: Training Language Models on Private Federated Data in the Age of LLMs Charlie Hou, Akshat Shrivastava, Hongyuan Zhan, Rylan Conway, Trang Le, Adithya Sagar, Giulia Fanti, Daniel Lazar 2024-06-05 arXiv https://github.com/houcharlie/PrE-Text http://arxiv.org/abs/2406.02958v1
805 PosterLLaVa: Constructing a Unified Multi-modal Layout Generator with LLM Tao Yang, Yingmin Luo, Zhongang Qi, Yang Wu, Ying Shan, Chang Wen Chen 2024-06-05 arXiv https://github.com/posterllava/PosterLLaVA http://arxiv.org/abs/2406.02884v1
806 Llumnix: Dynamic Scheduling for Large Language Model Serving Biao Sun, Ziming Huang, Hanyu Zhao, Wencong Xiao, Xinyi Zhang, Yong Li, Wei Lin 2024-06-05 arXiv https://github.com/AlibabaPAI/llumnix https://doi.org/10.48550/arXiv.2406.03243
807 HYDRA: Model Factorization Framework for Black-Box LLM Personalization Yuchen Zhuang, Haotian Sun, Yue Yu, Rushi Qiang, Qifan Wang, Chao Zhang, Bo Dai 2024-06-05 arXiv https://github.com/night-chen/HYDRA http://arxiv.org/abs/2406.02888v2
808 Exploring User Retrieval Integration towards Large Language Models for Cross-Domain Sequential Recommendation Tingjia Shen, Hao Wang, Jiaqing Zhang, Sirui Zhao, Liangyue Li, Zulong Chen, Defu Lian, Enhong Chen 2024-06-05 arXiv https://github.com/TingJShen/URLLM https://doi.org/10.48550/arXiv.2406.03085
809 CSS: Contrastive Semantic Similarity for Uncertainty Quantification of LLMs Shuang Ao, Stefan Rueger, Advaith Siddharthan 2024-06-05 arXiv https://github.com/AoShuang92/css_uq_llms http://arxiv.org/abs/2406.03158v1
810 BadAgent: Inserting and Activating Backdoor Attacks in LLM Agents Yifei Wang, Dizhan Xue, Shengjie Zhang, Shengsheng Qian 2024-06-05 Proceedings of the Annual Meeting of the Association for Computational Linguistics https://github.com/DPamK/BadAgent http://arxiv.org/abs/2406.03007v1
811 XRec: Large Language Models for Explainable Recommendation Qiyao Ma, Xubin Ren, Chao Huang 2024-06-04 arXiv https://github.com/HKUDS/XRec https://doi.org/10.48550/arXiv.2406.02377
812 Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models Marianna Nezhurina, Lucia Cipolina-Kun, Mehdi Cherti, Jenia Jitsev 2024-06-04 arXiv https://github.com/LAION-AI/AIW https://doi.org/10.48550/arXiv.2406.02061
813 Bileve: Securing Text Provenance in Large Language Models Against Spoofing with Bi-level Signature Tong Zhou, Xuandong Zhao, Xiaolin Xu, Shaolei Ren 2024-06-04 arXiv https://github.com/Tongzhou0101/Bileve-official https://doi.org/10.48550/arXiv.2406.01946
814 Hiding Text in Large Language Models: Introducing Unconditional Token Forcing Confusion Jakub Hoscilowicz, Pawel Popiolek, Jan Rudkowski, Jedrzej Bieniasz, Artur Janicki 2024-06-04 arXiv https://github.com/j-hoscilowic/zurek-stegano https://doi.org/10.48550/arXiv.2406.02481
815 Large Language Models as Carriers of Hidden Messages Jakub Hoscilowicz, Pawel Popiolek, Jan Rudkowski, Jedrzej Bieniasz, Artur Janicki 2024-06-04 arXiv https://github.com/j-hoscilowic/zurek-stegano http://arxiv.org/abs/2406.02481v2
816 Self-Control of LLM Behaviors by Compressing Suffix Gradient into Prefix Controller Min Cai, Yuchen Zhang, Shichang Zhang, Fan Yin, Dan Zhang, Difan Zou, Yisong Yue, Ziniu Hu 2024-06-04 arXiv https://llm-self-control.github.io/ http://arxiv.org/abs/2406.02721v3
817 SUBLLM: A Novel Efficient Architecture with Token Sequence Subsampling for LLM Quandong Wang, Yuxuan Yuan, Xiaoyu Yang, Ruike Zhang, Kang Zhao, Wei Liu, Jian Luan, Daniel Povey, Bin Wang 2024-06-03 arXiv https://github.com/XiaoMi/subllm http://arxiv.org/abs/2406.06571v2
818 Sparsity-Accelerated Training for Large Language Models Da Ma, Lu Chen, Pengyu Wang, Hongshen Xu, Hanqi Li, Liangtai Sun, Su Zhu, Shuai Fan, Kai Yu 2024-06-03 arXiv https://github.com/OpenDFM/SAT https://doi.org/10.48550/arXiv.2406.01392
819 The Geometry of Categorical and Hierarchical Concepts in Large Language Models Kiho Park, Yo Joong Choe, Yibo Jiang, Victor Veitch 2024-06-03 arXiv https://github.com/KihoPark/LLM_Categorical_Hierarchical_Representations https://doi.org/10.48550/arXiv.2406.01506
820 VIP: Versatile Image Outpainting Empowered by Multimodal Large Language Model Jinze Yang, Haoran Wang, Zining Zhu, Chenglong Liu, Meng Wymond Wu, Zeke Xie, Zhong Ji, Jungong Han, Mingming Sun 2024-06-03 arXiv https://github.com/ucasyjz/VIP https://doi.org/10.48550/arXiv.2406.01059
821 Two Tales of Persona in LLMs: A Survey of Role-Playing and Personalization Yu-Min Tseng, Yu-Chao Huang, Teng-Yun Hsiao, Wei-Lin Chen, Chao-Wei Huang, Yu Meng, Yun-Nung Chen 2024-06-03 arXiv https://github.com/MiuLab/PersonaLLM-Survey http://arxiv.org/abs/2406.01171v2
822 REvolve: Reward Evolution with Large Language Models using Human Feedback Rishi Hazra, Alkis Sygkounas, Andreas Persson, Amy Loutfi, Pedro Zuidberg Dos Martires 2024-06-03 arXiv https://rishihazra.github.io/REvolve http://arxiv.org/abs/2406.01309v2
823 Rotation and Permutation for Advanced Outlier Management and Efficient Quantization of LLMs Haokun Lin, Haobo Xu, Yichen Wu, Jingzhi Cui, Yingtao Zhang, Linzhan Mou, Linqi Song, Zhenan Sun, Ying Wei 2024-06-03 arXiv https://github.com/Hsu1023/DuQuant http://arxiv.org/abs/2406.01721v1
824 Towards Scalable Automated Alignment of LLMs: A Survey Boxi Cao, Keming Lu, Xinyu Lu, Jiawei Chen, Mengjie Ren, Hao Xiang, Peilin Liu, Yaojie Lu, Ben He, Xianpei Han, Le Sun, Hongyu Lin, Bowen Yu 2024-06-03 arXiv https://github.com/cascip/awesome-auto-alignment http://arxiv.org/abs/2406.01252v1
825 REvolve: Reward Evolution with Large Language Models for Autonomous Driving Rishi Hazra, Alkis Sygkounas, Andreas Persson, Amy Loutfi, Pedro Zuidberg Dos Martires 2024-06-03 arXiv https://rishihazra.github.io/REvolve https://doi.org/10.48550/arXiv.2406.01309
826 LLMs Beyond English: Scaling the Multilingual Capability of LLMs with Cross-Lingual Feedback Wen Lai, Mohsen Mesgar, Alexander Fraser 2024-06-03 arXiv https://github.com/boschresearch/ACL24-MLLM http://arxiv.org/abs/2406.01771v1
827 Knowledge Graph in Astronomical Research with Large Language Models: Quantifying Driving Forces in Interdisciplinary Scientific Discovery Zechang Sun, Yuan-Sen Ting, Yaobo Liang, Nan Duan, Song Huang, Zheng Cai 2024-06-03 arXiv https://astrokg.github.io/ https://doi.org/10.48550/arXiv.2406.01391
828 DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs Haokun Lin, Haobo Xu, Yichen Wu, Jingzhi Cui, Yingtao Zhang, Linzhan Mou, Linqi Song, Zhenan Sun, Ying Wei 2024-06-03 arXiv https://duquant.github.io http://arxiv.org/abs/2406.01721v2
829 Demystifying Platform Requirements for Diverse LLM Inference Use Cases Abhimanyu Bambhaniya, Ritik Raj, Geonhwa Jeong, Souvik Kundu, Sudarshan Srinivasan, Midhilesh Elavazhagan, Madhu Kumar, Tushar Krishna 2024-06-03 arXiv https://github.com/abhibambhaniya/GenZ-LLM-Analyzer http://arxiv.org/abs/2406.01698v1
830 Editing the Mind of Giants: An In-Depth Exploration of Pitfalls of Knowledge Editing in Large Language Models Cheng-Hsun Hsueh, Paul Kuo-Ming Huang, Tzu-Han Lin, Che-Wei Liao, Hung-Chieh Fang, Chao-Wei Huang, Yun-Nung Chen 2024-06-03 arXiv https://github.com/MiuLab/EditLLM-Survey https://doi.org/10.48550/arXiv.2406.01436
831 LexMatcher: Dictionary-centric Data Collection for LLM-based Machine Translation Yongjing Yin, Jiali Zeng, Yafu Li, Fandong Meng, Yue Zhang 2024-06-03 arXiv https://github.com/ARIES-LM/Lexmatcher-MT http://arxiv.org/abs/2406.01441v1
832 Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection Chentao Cao, Zhun Zhong, Zhanke Zhou, Yang Liu, Tongliang Liu, Bo Han 2024-06-02 arXiv https://github.com/tmlr-group/EOE https://doi.org/10.48550/arXiv.2406.00806
833 Evaluating Mathematical Reasoning of Large Language Models: A Focus on Error Identification and Correction Xiaoyuan Li, Wenjie Wang, Moxin Li, Junrong Guo, Yang Zhang, Fuli Feng 2024-06-02 arXiv https://github.com/LittleCirc1e/EIC https://doi.org/10.48550/arXiv.2406.00755
834 A Closer Look at Logical Reasoning with LLMs: The Choice of Tool Matters Long Hei Matthew Lam, Ramya Keerthy Thatikonda, Ehsan Shareghi 2024-06-01 arXiv https://github.com/Mattylam/Logic_Symbolic_Solvers_Experiment http://arxiv.org/abs/2406.00284v1
835 Phased Instruction Fine-Tuning for Large Language Models Wei Pang, Chuan Zhou, Xiao-Hua Zhou, Xiaojie Wang 2024-06-01 arXiv https://github.com/xubuvd/PhasedSFT https://doi.org/10.48550/arXiv.2406.04371
836 Knowledge-Aware Code Generation with Large Language Models Tao Huang, Zhihong Sun, Zhi Jin, Ge Li, Chen Lyu 2024-06 2024 IEEE/ACM 32nd International Conference on Program Comprehension (ICPC) https://github.com/CodeGeneration3/KareCoder.CCS https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10556459
837 Investigating the Efficacy of Large Language Models for Code Clone Detection Mohamad Khajezade, Jie JW Wu, Fatemeh Hendijani Fard, Gema Rodríguez-Pérez, Mohamed Sami Shehata 2024-06 2024 IEEE/ACM 32nd International Conference on Program Comprehension (ICPC) https://github.com/mkhfring/llm-for-ccd https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10556419
838 Interpretable Online Log Analysis Using Large Language Models with Prompt Strategies Yilun Liu, Shimin Tao, Weibin Meng, Jingyu Wang, Wenbing Ma, Yanqing Zhao, Yuhang Chen, Hao Yang, Yanfei Jiang, Xun Chen 2024-06 2024 IEEE/ACM 32nd International Conference on Program Comprehension (ICPC) https://github.com/lunyiliu/LogPrompt.CCS https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10556497
839 Improved Techniques for Optimization-Based Jailbreaking on Large Language Models Xiaojun Jia, Tianyu Pang, Chao Du, Yihao Huang, Jindong Gu, Yang Liu, Xiaochun Cao, Min Lin 2024-05-31 arXiv https://github.com/jiaxiaojunQAQ/I-GCG https://doi.org/10.48550/arXiv.2405.21018
840 LLM-ESR: Large Language Models Enhancement for Long-tailed Sequential Recommendation Qidong Liu, Xian Wu, Yejing Wang, Zijian Zhang, Feng Tian, Yefeng Zheng, Xiangyu Zhao 2024-05-31 arXiv https://github.com/Applied-Machine-Learning-Lab/LLM-ESR http://arxiv.org/abs/2405.20646v2
841 Ovis: Structural Embedding Alignment for Multimodal Large Language Model Shiyin Lu, Yang Li, Qing-Guo Chen, Zhao Xu, Weihua Luo, Kaifu Zhang, Han-Jia Ye 2024-05-31 arXiv https://github.com/AIDC-AI/Ovis https://doi.org/10.48550/arXiv.2405.20797
842 SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales Tianyang Xu, Shujin Wu, Shizhe Diao, Xiaoze Liu, Xingyao Wang, Yangyi Chen, Jing Gao 2024-05-31 arXiv https://github.com/xu1868/SaySelf http://arxiv.org/abs/2405.20974v2
843 Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis Chaoyou Fu, Yuhan Dai, Yongdong Luo, Lei Li, Shuhuai Ren, Renrui Zhang, Zihan Wang, Chenyu Zhou, Yunhang Shen, Mengdan Zhang, Peixian Chen, Yanwei Li, Shaohui Lin, Sirui Zhao, Ke Li, Tong Xu, Xiawu Zheng, Enhong Chen, Rongrong Ji, Xing Sun 2024-05-31 arXiv https://video-mme.github.io http://arxiv.org/abs/2405.21075v2
844 One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models Yutao Zhu, Zhaoheng Huang, Zhicheng Dou, Ji-Rong Wen 2024-05-30 arXiv https://github.com/DaoD/SPRING/ https://doi.org/10.48550/arXiv.2405.19670
845 Xwin-LM: Strong and Scalable Alignment Practice for LLMs Bolin Ni, JingCheng Hu, Yixuan Wei, Houwen Peng, Zheng Zhang, Gaofeng Meng, Han Hu 2024-05-30 arXiv https://github.com/Xwin-LM/Xwin-LM http://arxiv.org/abs/2405.20335v1
846 Two Optimizers Are Better Than One: LLM Catalyst Empowers Gradient-Based Optimization for Prompt Tuning Zixian Guo, Ming Liu, Zhilong Ji, Jinfeng Bai, Yiwen Guo, Wangmeng Zuo 2024-05-30 arXiv https://github.com/guozix/LLM-catalyst http://arxiv.org/abs/2405.19732v3
847 TAIA: Large Language Models are Out-of-Distribution Data Learners Shuyang Jiang, Yusheng Liao, Ya Zhang, Yu Wang, Yanfeng Wang 2024-05-30 arXiv https://github.com/pixas/TAIA_LLM https://doi.org/10.48550/arXiv.2405.20192
848 PertEval: Unveiling Real Knowledge Capacity of LLMs with Knowledge-Invariant Perturbations Jiatong Li, Renjun Hu, Kunzhe Huang, Yan Zhuang, Qi Liu, Mengxiao Zhu, Xing Shi, Wei Lin 2024-05-30 arXiv https://github.com/aigc-apps/PertEval http://arxiv.org/abs/2405.19740v1
849 Designing an Evaluation Framework for Large Language Models in Astronomy Research John F. Wu, Alina Hyk, Kiera McCormick, Christine Ye, Simone Astarita, Elina Baral, Jo Ciuca, Jesse Cranney, Anjalie Field, Kartheik Iyer, Philipp Koehn, Jenn Kotler, Sandor Kruk, Michelle Ntampaka, Charles O'Neill, Joshua E. G. Peek, Sanjib Sharma, Mikaeel Yunus 2024-05-30 arXiv https://github.com/jsalt2024-evaluating-llms-for-astronomy/astro-arxiv-bot https://doi.org/10.48550/arXiv.2405.20389
850 NoiseBoost: Alleviating Hallucination with Noise Perturbation for Multimodal Large Language Models Kai Wu, Boyuan Jiang, Zhengkai Jiang, Qingdong He, Donghao Luo, Shengzhi Wang, Qingwen Liu, Chengjie Wang 2024-05-30 arXiv https://kaiwu5.github.io/noiseboost https://doi.org/10.48550/arXiv.2405.20081
851 Large Language Models as Planning Domain Generators (Student Abstract) James T. Oswald, Kavitha Srinivas, Harsha Kokel, Junkyu Lee, Michael Katz, Shirin Sohrabi 2024-05-30 AAAI https://github.com/IBM/NL2PDDL https://doi.org/10.1609/aaai.v38i21.30491
852 Evaluating Large Language Model Biases in Persona-Steered Generation Andy Liu, Mona Diab, Daniel Fried 2024-05-30 arXiv https://github.com/andyjliu/persona-steered-generation-bias https://doi.org/10.48550/arXiv.2405.20253
853 Detecting Hallucinations in Large Language Model Generation: A Token Probability Approach Ernesto Quevedo, Jorge Yero, Rachel Koerner, Pablo Rivas, Tomás Cerný 2024-05-30 arXiv https://github.com/Baylor-AI/HalluDetect https://doi.org/10.48550/arXiv.2405.19648
854 PATIENT-ψ: Using Large Language Models to Simulate Patients for Training Mental Health Professionals Ruiyi Wang, Stephanie Milani, Jamie C. Chiu, Jiayin Zhi, Shaun M. Eack, Travis Labrum, Samuel M. Murphy, Nev Jones, Kate Hardy, Hong Shen, Fei Fang, Zhiyu Zoey Chen 2024-05-30 EMNLP https://github.com/ruiyiw/patient-psi https://aclanthology.org/2024.emnlp-main.711
855 AutoDroid: LLM-powered Task Automation in Android Hao Wen, Yuanchun Li, Guohong Liu, Shanhui Zhao, Tao Yu, Toby Jia-Jun Li, Shiqi Jiang, Yunhao Liu, Yaqin Zhang, Yunxin Liu 2024-05-29 ACM MobiCom '24: Proceedings of the 30th Annual International Conference on Mobile Computing and Networking https://autodroid-sys.github.io/ https://dl.acm.org/doi/10.1145/3636534.3649379
856 Compressing Large Language Models using Low Rank and Low Precision Decomposition Rajarshi Saha, Naomi Sagan, Varun Srivastava, Andrea J. Goldsmith, Mert Pilanci 2024-05-29 arXiv https://github.com/pilancilab/caldera https://doi.org/10.48550/arXiv.2405.18886
857 Enhancing Zero-Shot Facial Expression Recognition by LLM Knowledge Transfer Zengqun Zhao, Yu Cao, Shaogang Gong, Ioannis Patras 2024-05-29 arXiv https://github.com/zengqunzhao/Exp-CLIP http://arxiv.org/abs/2405.19100v2
858 LLMs Meet Multimodal Generation and Editing: A Survey Yingqing He, Zhaoyang Liu, Jingye Chen, Zeyue Tian, Hongyu Liu, Xiaowei Chi, Runtao Liu, Ruibin Yuan, Yazhou Xing, Wenhai Wang, Jifeng Dai, Yong Zhang, Wei Xue, Qifeng Liu, Yike Guo, Qifeng Chen 2024-05-29 arXiv https://github.com/YingqingHe/Awesome-LLMs-meet-Multimodal-Generation http://arxiv.org/abs/2405.19334v2
859 MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series Ge Zhang, Scott Qu, Jiaheng Liu, Chenchen Zhang, Chenghua Lin, Chou Leuang Yu, Danny Pan, Esther Cheng, Jie Liu, Qunshu Lin, Raven Yuan, Tuney Zheng, Wei Pang, Xinrun Du, Yiming Liang, Yinghao Ma, Yizhi Li, Ziyang Ma, Bill Y. Lin, Emmanouil Benetos, Huan Yang, Junting Zhou, Kaijing Ma, Minghao Liu, Morry Niu, Noah Wang, Quehry Que, Ruibo Liu, Sine Liu, Shawn Guo, Soren Gao, Wangchunshu Zhou, Xinyue Zhang, Yizhi Zhou, Yubo Wang, Yuelin Bai, Yuhan Zhang, Yuxiang Zhang, Zenith Wang, Zhenzhu Yang, Zijian Zhao, Jiajun Zhang, Wanli Ouyang, Wenhao Huang, Wenhu Chen 2024-05-29 arXiv https://map-neo.github.io/ https://doi.org/10.48550/arXiv.2405.19327
860 VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos Ziyang Wang, Shoubin Yu, Elias Stengel-Eskin, Jaehong Yoon, Feng Cheng, Gedas Bertasius, Mohit Bansal 2024-05-29 arXiv https://videotree2024.github.io/ http://arxiv.org/abs/2405.19209v1
861 ORLM: Training Large Language Models for Optimization Modeling Zhengyang Tang, Chenyu Huang, Xin Zheng, Shixi Hu, Zizhuo Wang, Dongdong Ge, Benyou Wang 2024-05-28 arXiv https://github.com/Cardinal-Operations/ORLM https://doi.org/10.48550/arXiv.2405.17743
862 Visual Anchors Are Strong Information Aggregators For Multimodal Large Language Model Haogeng Liu, Quanzeng You, Xiaotian Han, Yongfei Liu, Huaibo Huang, Ran He, Hongxia Yang 2024-05-28 arXiv https://github.com/liuhaogeng/Anchor-Former https://doi.org/10.48550/arXiv.2405.17815
863 TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models Jaewoo Ahn, Taehyun Lee, Junyoung Lim, Jin-Hwa Kim, Sangdoo Yun, Hwaran Lee, Gunhee Kim 2024-05-28 arXiv https://ahnjaewoo.github.io/timechara https://doi.org/10.48550/arXiv.2405.18027
864 Pipette: Automatic Fine-Grained Large Language Model Training Configurator for Real-World Clusters Jinkyu Yim, Jaeyong Song, Yerim Choi, Jaebeen Lee, Jaewon Jung, Hongsun Jang, Jinho Lee 2024-05-28 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE) https://github.com/yimjinkyu1/date2024_pipette https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10546826
865 OwLore: Outlier-weighed Layerwise Sampled Low-Rank Projection for Memory-Efficient LLM Fine-tuning Pengxiang Li, Lu Yin, Xiaowei Gao, Shiwei Liu 2024-05-28 arXiv https://github.com/pixeli99/OwLore http://arxiv.org/abs/2405.18380v1
866 Defending Large Language Models Against Jailbreak Attacks via Layer-specific Editing Wei Zhao, Zhe Li, Yige Li, Ye Zhang, Jun Sun 2024-05-28 arXiv https://github.com/ledllm/ledllm https://doi.org/10.48550/arXiv.2405.18166
867 Lisa: Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning Attack Tiansheng Huang, Sihao Hu, Fatih Ilhan, Selim Furkan Tekin, Ling Liu 2024-05-28 arXiv https://github.com/git-disl/Lisa http://arxiv.org/abs/2405.18641v5
868 Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning Tiansheng Huang, Sihao Hu, Fatih Ilhan, Selim Furkan Tekin, Ling Liu 2024-05-28 arXiv https://github.com/git-disl/Lisa https://doi.org/10.48550/arXiv.2405.18641
869 Hardware-Aware Parallel Prompt Decoding for Memory-Efficient Acceleration of LLM Inference Hao Mark Chen, Wayne Luk, Ka Fai Cedric Yiu, Rui Li, Konstantin Mishchenko, Stylianos I. Venieris, Hongxiang Fan 2024-05-28 arXiv https://github.com/hmarkc/parallel-prompt-decoding http://arxiv.org/abs/2405.18628v2
870 C$^3$Bench: A Comprehensive Classical Chinese Understanding Benchmark for Large Language Models Jiahuan Cao, Yongxin Shi, Dezhi Peng, Yang Liu, Lianwen Jin 2024-05-28 arXiv https://github.com/SCUT-DLVCLab/C3bench http://arxiv.org/abs/2405.17732v2
871 LLMs and Memorization: On Quality and Specificity of Copyright Compliance Felix B Mueller, Rebekka Görge, Anna K Bernzen, Janna C Pirk, Maximilian Poretschkin 2024-05-28 Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 7(1), 984-996, 2024 https://github.com/felixbmuller/llms-memorization-copyright http://arxiv.org/abs/2405.18492v1
872 CLAQ: Pushing the Limits of Low-Bit Post-Training Quantization for LLMs Haoyu Wang, Bei Liu, Hang Shao, Bo Xiao, Ke Zeng, Guanglu Wan, Yanmin Qian 2024-05-27 arXiv https://github.com/fayuge/CLAQ http://arxiv.org/abs/2405.17233v2
873 Empowering Large Language Models to Set up a Knowledge Retrieval Indexer via Self-Learning Xun Liang, Simin Niu, Zhiyu Li, Sensen Zhang, Shichao Song, Hanyu Wang, Jiawei Yang, Feiyu Xiong, Bo Tang, Chenyang Xi 2024-05-27 arXiv https://github.com/IAAR-Shanghai/PGRAG https://doi.org/10.48550/arXiv.2405.16933
874 Entity Alignment with Noisy Annotations from Large Language Models Shengyuan Chen, Qinggang Zhang, Junnan Dong, Wen Hua, Qing Li, Xiao Huang 2024-05-27 arXiv https://github.com/chensyCN/llm4ea_official https://doi.org/10.48550/arXiv.2405.16806
875 LLM-Optic: Unveiling the Capabilities of Large Language Models for Universal Visual Grounding Haoyu Zhao, Wenhang Ge, Ying-Cong Chen 2024-05-27 arXiv https://haoyu-zhao.github.io/LLM-Optic.github.io/ https://doi.org/10.48550/arXiv.2405.17104
876 Match, Compare, or Select? An Investigation of Large Language Models for Entity Matching Tianshu Wang, Xiaoyang Chen, Hongyu Lin, Xuanang Chen, Xianpei Han, Hao Wang, Zhenyu Zeng, Le Sun 2024-05-27 arXiv https://github.com/tshu-w/LLM4EM https://doi.org/10.48550/arXiv.2405.16884
877 MotionLLM: Multimodal Motion-Language Learning with Large Language Models Qi Wu, Yubo Zhao, Yifan Wang, Yu-Wing Tai, Chi-Keung Tang 2024-05-27 arXiv https://knoxzhao.github.io/MotionLLM https://doi.org/10.48550/arXiv.2405.17013
878 Navigating the Safety Landscape: Measuring Risks in Finetuning Large Language Models Shengyun Peng, Pin-Yu Chen, Matthew Hull, Duen Horng Chau 2024-05-27 arXiv https://github.com/ShengYun-Peng/llm-landscape https://doi.org/10.48550/arXiv.2405.17374
879 ReMoDetect: Reward Models Recognize Aligned LLM's Generations Hyunseok Lee, Jihoon Tack, Jinwoo Shin 2024-05-27 arXiv https://github.com/hyunseoklee-ai/reward_llm_detect http://arxiv.org/abs/2405.17382v1
880 Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model Kuan-Chih Huang, Xiangtai Li, Lu Qi, Shuicheng Yan, Ming-Hsuan Yang 2024-05-27 arXiv https://KuanchihHuang.github.io/project/reason3d https://doi.org/10.48550/arXiv.2405.17427
881 Medical MLLM is Vulnerable: Cross-Modality Jailbreak and Mismatched Attacks on Medical Multimodal Large Language Models Xijie Huang, Xinyuan Wang, Hantao Zhang, Yinghao Zhu, Jiawen Xi, Jingkun An, Hao Wang, Hao Liang, Chengwei Pan 2024-05-26 arXiv https://github.com/dirtycomputer/O2M_attack http://arxiv.org/abs/2405.20775v2
882 Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs Mustafa Shukor, Matthieu Cord 2024-05-26 arXiv https://ima-lmms.github.io/ http://arxiv.org/abs/2405.16700v1
883 Cross-Modality Jailbreak and Mismatched Attacks on Medical Multimodal Large Language Models Xijie Huang, Xinyuan Wang, Hantao Zhang, Jiawen Xi, Jingkun An, Hao Wang, Chengwei Pan 2024-05-26 arXiv https://github.com/dirtycomputer/O2M_attack https://doi.org/10.48550/arXiv.2405.20775
884 Cocktail: A Comprehensive Information Retrieval Benchmark with LLM-Generated Documents Integration Sunhao Dai, Weihao Liu, Yuqi Zhou, Liang Pang, Rongju Ruan, Gang Wang, Zhenhua Dong, Jun Xu, Ji-Rong Wen 2024-05-26 Proceedings of the Annual Meeting of the Association for Computational Linguistics https://github.com/KID-22/Cocktail http://arxiv.org/abs/2405.16546v1
885 Automatically Generating Numerous Context-Driven SFT Data for LLMs across Diverse Granularity Shanghaoran Quan 2024-05-26 arXiv https://github.com/quanshr/AugCon http://arxiv.org/abs/2405.16579v1
886 AutoManual: Constructing Instruction Manuals by LLM Agents via Interactive Environmental Learning Minghao Chen, Yihang Li, Yanting Yang, Shiyu Yu, Binbin Lin, Xiaofei He 2024-05-25 arXiv https://github.com/minghchen/automanual http://arxiv.org/abs/2405.16247v4
887 SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models Xudong Lu, Aojun Zhou, Yuhui Xu, Renrui Zhang, Peng Gao, Hongsheng Li 2024-05-25 arXiv https://github.com/Lucky-Lance/SPP https://doi.org/10.48550/arXiv.2405.16057
888 CulturePark: Boosting Cross-cultural Understanding in Large Language Models Cheng Li, Damien Teney, Linyi Yang, Qingsong Wen, Xing Xie, Jindong Wang 2024-05-24 arXiv https://github.com/Scarelette/CulturePark https://doi.org/10.48550/arXiv.2405.15145
889 A Solution-based LLM API-using Methodology for Academic Information Seeking Yuanchun Wang, Jifan Yu, Zijun Yao, Jing Zhang, Yuyang Xie, Shangqing Tu, Yiyang Fu, Youhe Feng, Jinkai Zhang, Jingyao Zhang, Bowen Huang, Yuanyao Li, Huihui Yuan, Lei Hou, Juanzi Li, Jie Tang 2024-05-24 arXiv https://github.com/RUCKBReasoning/SoAy http://arxiv.org/abs/2405.15165v1
890 Decoding at the Speed of Thought: Harnessing Parallel Decoding of Lexical Units for LLMs Chenxi Sun, Hongzhi Zhang, Zijia Lin, Jingyuan Zhang, Fuzheng Zhang, Zhongyuan Wang, Bin Chen, Chengru Song, Di Zhang, Kun Gai, Deyi Xiong 2024-05-24 arXiv https://github.com/tjunlp-lab/Lexical-Unit-Decoding-LUD- http://arxiv.org/abs/2405.15208v1
891 Evaluating the Adversarial Robustness of Retrieval-Based In-Context Learning for Large Language Models Simon Chi Lok Yu, Jie He, Pasquale Minervini, Jeff Z. Pan 2024-05-24 arXiv https://github.com/simonucl/adv-retreival-icl https://doi.org/10.48550/arXiv.2405.15984
892 LM4LV: A Frozen Large Language Model for Low-level Vision Tasks Boyang Zheng, Jinjin Gu, Shijun Li, Chao Dong 2024-05-24 arXiv https://github.com/bytetriper/LM4LV https://doi.org/10.48550/arXiv.2405.15734
893 Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models Byung-Kwan Lee, Chae Won Kim, Beomchan Park, Yong Man Ro 2024-05-24 arXiv https://github.com/ByungKwanLee/Meteor https://doi.org/10.48550/arXiv.2405.15574
894 WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models Peng Wang, Zexi Li, Ningyu Zhang, Ziwen Xu, Yunzhi Yao, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen 2024-05-23 arXiv https://github.com/zjunlp/EasyEdit https://doi.org/10.48550/arXiv.2405.14768
895 Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration Yang Zhang, Shixin Yang, Chenjia Bai, Fei Wu, Xiu Li, Zhen Wang, Xuelong Li 2024-05-23 arXiv https://read-llm.github.io/ http://arxiv.org/abs/2405.14314v2
896 RefChecker: Reference-based Fine-grained Hallucination Checker and Benchmark for Large Language Models Xiangkun Hu, Dongyu Ru, Lin Qiu, Qipeng Guo, Tianhang Zhang, Yang Xu, Yun Luo, Pengfei Liu, Yue Zhang, Zheng Zhang 2024-05-23 arXiv https://github.com/amazon-science/RefChecker https://doi.org/10.48550/arXiv.2405.14486
897 Mitigating Quantization Errors Due to Activation Spikes in GLU-Based LLMs Jaewoo Yang, Hayun Kim, Younghoon Kim 2024-05-23 arXiv https://github.com/onnoo/activation-spikes http://arxiv.org/abs/2405.14428v1
898 HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models Bernal Jiménez Gutiérrez, Yiheng Shu, Yu Gu, Michihiro Yasunaga, Yu Su 2024-05-23 arXiv https://github.com/OSU-NLP-Group/HippoRAG https://doi.org/10.48550/arXiv.2405.14831
899 FinRobot: An Open-Source AI Agent Platform for Financial Applications using Large Language Models Hongyang Yang, Boyu Zhang, Neng Wang, Cheng Guo, Xiaoli Zhang, Likun Lin, Junlin Wang, Tianyu Zhou, Mao Guan, Runjia Zhang, Christina Dan Wang 2024-05-23 arXiv https://github.com/AI4Finance-Foundation/FinRobot https://doi.org/10.48550/arXiv.2405.14767
900 DuanzAI: Slang-Enhanced LLM with Prompt for Humor Understanding Yesian Rohn 2024-05-23 arXiv https://github.com/YesianRohn/DuanzAI http://arxiv.org/abs/2405.15818v1
901 Dissociation of Faithful and Unfaithful Reasoning in LLMs Evelyn Yee, Alice Li, Chenyu Tang, Yeon Ho Jung, Ramamohan Paturi, Leon Bergen 2024-05-23 arXiv https://github.com/CoTErrorRecovery/CoTErrorRecovery http://arxiv.org/abs/2405.15092v1
902 ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evaluation Jingnan Zheng, Han Wang, An Zhang, Tai D. Nguyen, Jun Sun, Tat-Seng Chua 2024-05-23 NeurIPS 2024 https://github.com/SophieZheng998/ALI-Agent http://arxiv.org/abs/2405.14125v2
903 Large Language Models Can Self-Correct with Key Condition Verification Zhenyu Wu, Qingkai Zeng, Zhihan Zhang, Zhaoxuan Tan, Chao Shen, Meng Jiang 2024-05-23 EMNLP https://wzy6642.github.io/proco.github.io/ https://aclanthology.org/2024.emnlp-main.714
904 AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct Bin Lei, Yuchen Li, Qiuwu Chen 2024-05-23 arXiv https://github.com/bin123apple/AutoCoder https://doi.org/10.48550/arXiv.2405.14906
905 AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability Fei Zhao, Taotian Pang, Chunhui Li, Zhen Wu, Junjie Guo, Shangyu Xing, Xinyu Dai 2024-05-23 arXiv https://aligngpt-vl.github.io/ https://doi.org/10.48550/arXiv.2405.14129
906 PitVQA: Image-grounded Text Embedding LLM for Visual Question Answering in Pituitary Surgery Runlong He, Mengya Xu, Adrito Das, Danyal Z. Khan, Sophia Bano, Hani J. Marcus, Danail Stoyanov, Matthew J. Clarkson, Mobarakol Islam 2024-05-22 arXiv https://github.com/mobarakol/PitVQA http://arxiv.org/abs/2405.13949v1
907 VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding Yongxin Guo, Jingyu Liu, Mingda Li, Xiaoying Tang, Xi Chen, Bo Zhao 2024-05-22 arXiv https://github.com/gyxxyg/VTG-LLM http://arxiv.org/abs/2405.13382v1
908 Prompt-Time Ontology-Driven Symbolic Knowledge Capture with Large Language Models Tolga Çöplü, Arto Bendiken, Andrii Skomorokhov, Eduard Bateiko, Stephen Cobb, Joshua J. Bouw 2024-05-22 arXiv https://github.com/HaltiaAI/paper-PTODSKC https://doi.org/10.48550/arXiv.2405.14012
909 Adapting Multi-modal Large Language Model to Concept Drift From Pre-training Onwards Xiaoyu Yang, Jie Lu, En Yu 2024-05-22 arXiv https://github.com/Anonymous0Knight/ConceptDriftMLLMs http://arxiv.org/abs/2405.13459v2
910 LOGIN: A Large Language Model Consulted Graph Neural Network Training Framework Yiran Qiao, Xiang Ao, Yang Liu, Jiarong Xu, Xiaoqian Sun, Qing He 2024-05-22 arXiv https://github.com/QiaoYRan/LOGIN https://doi.org/10.48550/arXiv.2405.13902
911 Adapting Multi-modal Large Language Model to Concept Drift in the Long-tailed Open World Xiaoyu Yang, Jie Lu, En Yu 2024-05-22 arXiv https://github.com/Anonymous0Knight/ConceptDriftMLLMs https://doi.org/10.48550/arXiv.2405.13459
912 An Empirical Study and Analysis of Text-to-Image Generation Using Large Language Model-Powered Textual Representation Zhiyu Tan, Mengping Yang, Luozheng Qin, Hao Yang, Ye Qian, Qiang Zhou, Cheng Zhang, Hao Li 2024-05-21 arXiv https://github.com/llm-conditioned-diffusion/llm-conditioned-diffusion https://doi.org/10.48550/arXiv.2405.12914
913 SirLLM: Streaming Infinite Retentive LLM Yao Yao, Zuchao Li, Hai Zhao 2024-05-21 Proceedings of the Annual Meeting of the Association for Computational Linguistics https://github.com/Zoeyyao27/SirLLM http://arxiv.org/abs/2405.12528v1
914 Large Language Models Meet NL2Code: A Survey Libo Qin, Qiguang Chen, Xiachong Feng, Yang Wu, Yongheng Zhang, Yinghui Li, Min Li, Wanxiang Che, Philip S. Yu 2024-05-21 ACL https://nl2code.github.io https://doi.org/10.18653/v1/2023.acl-long.411
915 CLAMBER: A Benchmark of Identifying and Clarifying Ambiguous Information Needs in Large Language Models Tong Zhang, Peixin Qin, Yang Deng, Chen Huang, Wenqiang Lei, Junhong Liu, Dingnan Jin, Hongru Liang, Tat-Seng Chua 2024-05-20 arXiv https://github.com/zt991211/CLAMBER https://doi.org/10.48550/arXiv.2405.12063
916 DOP: Diagnostic-Oriented Prompting for Large Language Models in Mathematical Correction Hao Chen, Biaojie Zeng, Xin Lin, Liang He, Aimin Zhou 2024-05-20 arXiv https://github.com/ChenhaoEcnuCS/Reason-Correct https://doi.org/10.48550/arXiv.2405.12100
917 MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark Hongwei Liu, Zilong Zheng, Yuxuan Qiao, Haodong Duan, Zhiwei Fei, Fengzhe Zhou, Wenwei Zhang, Songyang Zhang, Dahua Lin, Kai Chen 2024-05-20 OpenReview https://github.com/open-compass/MathBench http://arxiv.org/abs/2405.12209v1
918 MBIAS: Mitigating Bias in Large Language Models While Retaining Context Shaina Raza, Ananya Raval, Veronica Chatrath 2024-05-18 arXiv https://github.com/shainarazavi/MBIAS/tree/main https://doi.org/10.48550/arXiv.2405.11290
919 Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts Yunxin Li, Shenyuan Jiang, Baotian Hu, Longyue Wang, Wanqi Zhong, Wenhan Luo, Lin Ma, Min Zhang 2024-05-18 arXiv https://uni-moe.github.io/ http://arxiv.org/abs/2405.11273v1
920 RDRec: Rationale Distillation for LLM-based Recommendation Xinfeng Wang, Jin Cui, Yoshimi Suzuki, Fumiyo Fukumoto 2024-05-17 Proceedings of the Annual Meeting of the Association for Computational Linguistics https://github.com/WangXFng/RDRec http://arxiv.org/abs/2405.10587v2
921 Surgical Feature-Space Decomposition of LLMs: Why, When and How? Arnav Chavan, Nahush Lele, Deepak Gupta 2024-05-17 OpenReview https://github.com/nyunAI/SFSD-LLM http://arxiv.org/abs/2405.13039v1
922 Layer-Condensed KV Cache for Efficient Inference of Large Language Models Haoyi Wu, Kewei Tu 2024-05-17 arXiv https://github.com/whyNLP/LCKV https://doi.org/10.48550/arXiv.2405.10637
923 Benchmarking Large Language Models on CFLUE - A Chinese Financial Language Understanding Evaluation Dataset Jie Zhu, Junhui Li, Yalong Wen, Lifan Guo 2024-05-17 arXiv https://github.com/aliyun/cflue https://doi.org/10.48550/arXiv.2405.10542
924 DuetSim: Building User Simulator with Dual Large Language Models for Task-Oriented Dialogues Xiang Luo, Zhiwen Tang, Jin Wang, Xuejie Zhang 2024-05-16 LREC/COLING https://github.com/suntea233/DuetSim https://aclanthology.org/2024.lrec-main.481
925 LFED: A Literary Fiction Evaluation Dataset for Large Language Models Linhao Yu, Qun Liu, Deyi Xiong 2024-05-16 LREC/COLING https://github.com/tjunlp-lab/LFED https://aclanthology.org/2024.lrec-main.915
926 Libra: Building Decoupled Vision System on Large Language Models Yifan Xu, Xiaoshan Yang, Yaguang Song, Changsheng Xu 2024-05-16 arXiv https://github.com/YifanXu74/Libra https://doi.org/10.48550/arXiv.2405.10140
927 MarkLLM: An Open-Source Toolkit for LLM Watermarking Leyi Pan, Aiwei Liu, Zhiwei He, Zitian Gao, Xuandong Zhao, Yijian Lu, Binglin Zhou, Shuliang Liu, Hanlin Zhang, Xuming Hu, Lijie Wen, Irwin King 2024-05-16 arXiv https://github.com/THU-BPM/MarkLLM http://arxiv.org/abs/2405.10051v3
928 When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models Xianzheng Ma, Yash Bhalgat, Brandon Smart, Shuai Chen, Xinghui Li, Jian Ding, Jindong Gu, Dave Zhenyu Chen, Songyou Peng, Jia-Wang Bian, Philip H. S. Torr, Marc Pollefeys, Matthias Nießner, Ian D. Reid, Angel X. Chang, Iro Laina, Victor Adrian Prisacariu 2024-05-16 arXiv https://github.com/ActiveVisionLab/Awesome-LLM-3D https://doi.org/10.48550/arXiv.2405.10255
929 Towards Next-Generation Steganalysis: LLMs Unleash the Power of Detecting Steganography Minhao Bai, Jinshuai Yang, Kaiyi Pang, Huili Wang, Yongfeng Huang 2024-05-15 arXiv https://github.com/ba0z1/Linguistic-Steganalysis-with-LLMs http://arxiv.org/abs/2405.09090v1
930 Evaluating LLMs at Evaluating Temporal Generalization Chenghao Zhu, Nuo Chen, Yufei Gao, Benyou Wang 2024-05-14 arXiv https://github.com/FreedomIntelligence/FreshBench http://arxiv.org/abs/2405.08460v1
931 Incorporating Clinical Guidelines through Adapting Multi-modal Large Language Model for Prostate Cancer PI-RADS Scoring Tiantian Zhang, Manxi Lin, Hongda Guo, Xiaofan Zhang, Ka Fung Peter Chiu, Aasa Feragen, Qi Dou 2024-05-14 arXiv https://github.com/med-air/PICG2scoring https://doi.org/10.48550/arXiv.2405.08786
932 Is Your LLM Outdated? Evaluating LLMs at Temporal Generalization Chenghao Zhu, Nuo Chen, Yufei Gao, Yunyi Zhang, Prayag Tiwari, Benyou Wang 2024-05-14 arXiv https://github.com/FreedomIntelligence/FreshBench http://arxiv.org/abs/2405.08460v2
933 Towards Personalized Evaluation of Large Language Models with An Anonymous Crowd-Sourcing Platform Mingyue Cheng, Hao Zhang, Jiqian Yang, Qi Liu, Li Li, Xin Huang, Liwei Song, Zhi Li, Zhenya Huang, Enhong Chen 2024-05-13 WWW '24: Companion Proceedings of the ACM on Web Conference 2024 https://github.com/Mingyue-Cheng/Bingjian https://dl.acm.org/doi/10.1145/3589335.3651243
934 Representation Learning with Large Language Models for Recommendation Xubin Ren, Wei Wei, Lianghao Xia, Lixin Su, Suqi Cheng, Junfeng Wang, Dawei Yin, Chao Huang 2024-05-13 WWW '24: Proceedings of the ACM on Web Conference 2024 https://github.com/HKUDS/RLMRec https://dl.acm.org/doi/10.1145/3589334.3645458
935 RecAI: Leveraging Large Language Models for Next-Generation Recommender Systems Jianxun Lian, Yuxuan Lei, Xu Huang, Jing Yao, Wei Xu, Xing Xie 2024-05-13 WWW '24: Companion Proceedings of the ACM on Web Conference 2024 https://github.com/microsoft/RecAI https://dl.acm.org/doi/10.1145/3589335.3651242
936 ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation Jianghao Lin, Rong Shan, Chenxu Zhu, Kounianhua Du, Bo Chen, Shigang Quan, Ruiming Tang, Yong Yu, Weinan Zhang 2024-05-13 WWW '24: Proceedings of the ACM on Web Conference 2024 https://github.com/LaVieEnRose365/ReLLa https://dl.acm.org/doi/10.1145/3589334.3645467
937 News Recommendation with Category Description by a Large Language Model Yuki Yada, Hayato Yamana 2024-05-13 arXiv https://github.com/yamanalab/gpt-augmented-news-recommendation https://doi.org/10.48550/arXiv.2405.13007
938 Item-side Fairness of Large Language Model-based Recommendation System Meng Jiang, Keqin Bao, Jizhi Zhang, Wenjie Wang, Zhengyi Yang, Fuli Feng, Xiangnan He 2024-05-13 WWW '24: Proceedings of the ACM on Web Conference 2024 https://github.com/JiangM-C/IFairLRS https://dl.acm.org/doi/10.1145/3589334.3648158
939 Harnessing Multi-Role Capabilities of Large Language Models for Open-Domain Question Answering Hongda Sun, Yuxuan Liu, Chengwei Wu, Haiyu Yan, Cheng Tai, Xin Gao, Shuo Shang, Rui Yan 2024-05-13 WWW '24: Proceedings of the ACM on Web Conference 2024 https://github.com/EthanLeo-LYX/LLMQA https://dl.acm.org/doi/10.1145/3589334.3645670
940 Collaborative Large Language Model for Recommender Systems Yaochen Zhu, Liang Wu, Qi Guo, Liangjie Hong, Jundong Li 2024-05-13 WWW '24: Proceedings of the ACM on Web Conference 2024 https://github.com/yaochenzhu/llm4rec https://dl.acm.org/doi/10.1145/3589334.3645347
941 A Systematic Investigation of Distilling Large Language Models into Cross-Encoders for Passage Re-ranking Ferdinand Schlatt, Maik Fröbe, Harrisen Scells, Shengyao Zhuang, Bevan Koopman, Guido Zuccon, Benno Stein, Martin Potthast, Matthias Hagen 2024-05-13 arXiv https://github.com/webis-de/msmarco-llm-distillation https://doi.org/10.48550/arXiv.2405.07920
942 GraphTranslator: Aligning Graph Model to Large Language Model for Open-ended Tasks Mengmei Zhang, Mingwei Sun, Peng Wang, Shen Fan, Yanhu Mo, Xiaoxiao Xu, Hong Liu, Cheng Yang, Chuan Shi 2024-05-13 WWW '24: Proceedings of the ACM on Web Conference 2024 https://github.com/alibaba/GraphTranslator https://dl.acm.org/doi/10.1145/3589334.3645682
943 EMS-SD: Efficient Multi-sample Speculative Decoding for Accelerating Large Language Models Yunsheng Ni, Chuanjian Liu, Yehui Tang, Kai Han, Yunhe Wang 2024-05-13 arXiv https://github.com/niyunsheng/EMS-SD https://doi.org/10.48550/arXiv.2405.07542
944 FashionReGen: LLM-Empowered Fashion Report Generation Yujuan Ding, Yunshan Ma, Wenqi Fan, Yige Yao, Tat-Seng Chua, Qing Li 2024-05-13 WWW '24: Companion Proceedings of the ACM on Web Conference 2024 https://github.com/CompFashion/FashionReGen https://dl.acm.org/doi/10.1145/3589335.3651232
945 AIOS Compiler: LLM as Interpreter for Natural Language Programming and Flow Programming of AI Agents Shuyuan Xu, Zelong Li, Kai Mei, Yongfeng Zhang 2024-05-11 arXiv https://github.com/agiresearch/CoRE http://arxiv.org/abs/2405.06907v2
946 ChaCha: Leveraging Large Language Models to Prompt Children to Share Their Emotions about Personal Events Woosuk Seo, Chanmo Yang, Young-Ho Kim 2024-05-11 CHI '24: Proceedings of the CHI Conference on Human Factors in Computing Systems https://naver-ai.github.io/chacha/ https://dl.acm.org/doi/10.1145/3613904.3642152
947 LaMI: Large Language Models for Multi-Modal Human-Robot Interaction Chao Wang, Stephan Hasler, Daniel Tanneberg, Felix Ocker, Frank Joublin, Antonello Ceravola, Joerg Deigmoeller, Michael Gienger 2024-05-11 CHI EA '24: Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems https://hri-eu.github.io/Lami/ https://dl.acm.org/doi/10.1145/3613905.3651029
948 Natural Language Dataset Generation Framework for Visualizations Powered by Large Language Models Hyung-Kwon Ko, Hyeon Jeon, Gwanmo Park, Dae Hyun Kim, Nam Wook Kim, Juho Kim, Jinwook Seo 2024-05-11 CHI '24: Proceedings of the CHI Conference on Human Factors in Computing Systems https://github.com/hyungkwonko/chart-llm https://dl.acm.org/doi/10.1145/3613904.3642943
949 LLM Discussion: Enhancing the Creativity of Large Language Models via Discussion Framework and Role-Play Li-Chun Lu, Shou-Jen Chen, Tsung-Min Pai, Chan-Hung Yu, Hung-yi Lee, Shao-Hua Sun 2024-05-10 arXiv https://github.com/lawraa/LLM-Discussion https://doi.org/10.48550/arXiv.2405.06373
950 Linearizing Large Language Models Jean Mercat, Igor Vasiljevic, Sedrick Keh, Kushal Arora, Achal Dave, Adrien Gaidon, Thomas Kollar 2024-05-10 arXiv https://github.com/TRI-ML/linear_open_lm https://doi.org/10.48550/arXiv.2405.06640
951 PLeak: Prompt Leaking Attacks against Large Language Model Applications Bo Hui, Haolin Yuan, Neil Zhenqiang Gong, Philippe Burlina, Yinzhi Cao 2024-05-10 arXiv https://github.com/BHui97/PLeak https://doi.org/10.48550/arXiv.2405.06823
952 Pruning as a Domain-specific LLM Extractor Nan Zhang, Yanchi Liu, Xujiang Zhao, Wei Cheng, Runxue Bao, Rui Zhang, Prasenjit Mitra, Haifeng Chen 2024-05-10 Findings of the Association for Computational Linguistics: NAACL 2024 https://github.com/psunlpgroup/D-Pruner http://arxiv.org/abs/2405.06275v1
953 LLM-QBench: A Benchmark Towards the Best Practice for Post-training Quantization of Large Language Models Ruihao Gong, Yang Yong, Shiqiao Gu, Yushi Huang, Yunchen Zhang, Xianglong Liu, Dacheng Tao 2024-05-09 arXiv https://github.com/ModelTC/llmc https://doi.org/10.48550/arXiv.2405.06001
954 Robots Can Feel: LLM-based Framework for Robot Ethical Reasoning Artem Lykov, Miguel Altamirano Cabrera, Koffivi Fidèle Gbagbe, Dzmitry Tsetserukou 2024-05-09 arXiv https://github.com/TemaLykov/robots_can_feel http://arxiv.org/abs/2405.05824v1
955 Probing Multimodal LLMs as World Models for Driving Shiva Sreeram, Tsun-Hsuan Wang, Alaa Maalouf, Guy Rosman, Sertac Karaman, Daniela Rus 2024-05-09 arXiv https://github.com/sreeramsa/DriveSim http://arxiv.org/abs/2405.05956v1
956 Boosting Multimodal Large Language Models with Visual Tokens Withdrawal for Rapid Inference Zhihang Lin, Mingbao Lin, Luxi Lin, Rongrong Ji 2024-05-09 arXiv https://github.com/lzhxmu/VTW https://doi.org/10.48550/arXiv.2405.05803
957 CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts Jiachen Li, Xinyao Wang, Sijie Zhu, Chia-Wen Kuo, Lu Xu, Fan Chen, Jitesh Jain, Humphrey Shi, Longyin Wen 2024-05-09 arXiv https://github.com/SHI-Labs/CuMo http://arxiv.org/abs/2405.05949v1
958 Fishing for Magikarp: Automatically Detecting Under-trained Tokens in Large Language Models Sander Land, Max Bartolo 2024-05-08 arXiv https://github.com/cohere-ai/magikarp/ https://doi.org/10.48550/arXiv.2405.05417
959 DALK: Dynamic Co-Augmentation of LLMs and KG to answer Alzheimer's Disease Questions with Scientific Literature Dawei Li, Shu Yang, Zhen Tan, Jae Young Baik, Sukwon Yun, Joseph Lee, Aaron Chacko, Bojian Hou, Duy Duong-Tran, Ying Ding, Huan Liu, Li Shen, Tianlong Chen 2024-05-08 arXiv https://github.com/David-Li0406/DALK http://arxiv.org/abs/2405.04819v2
960 Vidur: A Large-Scale Simulation Framework For LLM Inference Amey Agrawal, Nitin Kedia, Jayashree Mohan, Ashish Panwar, Nipun Kwatra, Bhargav Gulavani, Ramachandran Ramjee, Alexey Tumanov 2024-05-08 arXiv https://github.com/microsoft/vidur http://arxiv.org/abs/2405.05465v2
961 QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving Yujun Lin, Haotian Tang, Shang Yang, Zhekai Zhang, Guangxuan Xiao, Chuang Gan, Song Han 2024-05-07 arXiv https://github.com/mit-han-lab/qserve http://arxiv.org/abs/2405.04532v2
962 MedDoc-Bot: A Chat Tool for Comparative Analysis of Large Language Models in the Context of the Pediatric Hypertension Guideline Mohamed Yaseen Jabarulla, Steffen Oeltze-Jafra, Philipp Beerbaum, Theodor Uden 2024-05-06 arXiv https://github.com/yaseen28/MedDoc-Bot https://doi.org/10.48550/arXiv.2405.03359
963 Word2World: Generating Stories and Worlds through Large Language Models Muhammad Umair Nasir, Steven James, Julian Togelius 2024-05-06 arXiv https://github.com/umair-nasir14/Word2World https://doi.org/10.48550/arXiv.2405.06686
964 When LLMs Meet Cybersecurity: A Systematic Literature Review Jie Zhang, Haoyu Bu, Hui Wen, Yu Chen, Lun Li, Hongsong Zhu 2024-05-06 arXiv https://github.com/tmylla/Awesome-LLM4Cybersecurity http://arxiv.org/abs/2405.03644v1
965 Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs Jordan Dotzel, Yuzong Chen, Bahaa Kotb, Sushma Prasad, Gang Wu, Sheng Li, Mohamed S. Abdelfattah, Zhiru Zhang 2024-05-06 arXiv https://github.com/cornell-zhang/llm-datatypes http://arxiv.org/abs/2405.03103v2
966 Large Language Models Synergize with Automated Machine Learning Jinglue Xu, Jialong Li, Zhen Liu, Nagar Anthel Venkatesh Suryanarayanan, Guoyuan Zhou, Jia Guo, Hitoshi Iba, Kenji Tei 2024-05-06 arXiv https://github.com/JLX0/llm-automl https://doi.org/10.48550/arXiv.2405.03727
967 Language Evolution for Evading Social Media Regulation via LLM-Based Multi-Agent Simulation Jinyu Cai, Jialong Li, Mingyue Zhang, Munan Li, Chen-Shu Wang, Kenji Tei 2024-05-05 2024 IEEE Congress on Evolutionary Computation (CEC) https://github.com/BlueLinkX/GA-MAS https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10612015
968 NegativePrompt: Leveraging Psychology for Large Language Models Enhancement via Negative Emotional Stimuli Xu Wang, Cheng Li, Yi Chang, Jindong Wang, Yuan Wu 2024-05-05 arXiv https://github.com/wangxu0820/NegativePrompt https://doi.org/10.48550/arXiv.2405.02814
969 EDA Corpus: A Large Language Model Dataset for Enhanced Interaction with OpenROAD Bing-Yue Wu, Utsav Sharma, Sai Rahul Dhanvi Kankipati, Ajay Yadav, Bintu Kappil George, Sai Ritish Guntupalli, Austin Rovinski, Vidya A. Chhabria 2024-05-04 2024 IEEE LLM Aided Design Workshop (LAD) https://github.com/OpenROAD-Assistant/EDA-Corpus https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10691774
970 Enhancing Contextual Understanding in Large Language Models through Contrastive Decoding Zheng Zhao, Emilio Monti, Jens Lehmann, Haytham Assem 2024-05-04 arXiv https://github.com/amazon-science/ContextualUnderstanding-ContrastiveDecoding https://doi.org/10.48550/arXiv.2405.02750
971 PICLe: Eliciting Diverse Behaviors from Large Language Models with Persona In-Context Learning Hyeong Kyu Choi, Yixuan Li 2024-05-03 arXiv https://github.com/deeplearning-wisc/picle https://doi.org/10.48550/arXiv.2405.02501
972 ProFLingo: A Fingerprinting-based Copyright Protection Scheme for Large Language Models Heng Jin, Chaoyu Zhang, Shanghao Shi, Wenjing Lou, Y. Thomas Hou 2024-05-03 arXiv https://github.com/hengvt/ProFLingo https://doi.org/10.48550/arXiv.2405.02466
973 Learning Multiple Object States from Actions via Large Language Models Masatoshi Tateno, Takuma Yagi, Ryosuke Furuta, Yoichi Sato 2024-05-02 arXiv https://masatate.github.io/ObjStatefromAction.github.io/ http://arxiv.org/abs/2405.01090v2
974 Analyzing the Role of Semantic Representations in the Era of Large Language Models Zhijing Jin, Yuen Chen, Fernando Gonzalez Adauto, Jiarui Liu, Jiayi Zhang, Julian Michael, Bernhard Schölkopf, Mona T. Diab 2024-05-02 NAACL-HLT https://github.com/causalNLP/amr_llm https://doi.org/10.18653/v1/2024.naacl-long.209
975 Creative Problem Solving in Large Language and Vision Models - What Would it Take? Lakshmi Nair, Evana Gizzi, Jivko Sinapov 2024-05-02 arXiv https://github.com/lnairGT/creative-problem-solving-LLMs https://doi.org/10.48550/arXiv.2405.01453
976 A Survey on Large Language Models for Critical Societal Domains: Finance, Healthcare, and Law Zhiyu Zoey Chen, Jing Ma, Xinlu Zhang, Nan Hao, An Yan, Armineh Nourbakhsh, Xianjun Yang, Julian J. McAuley, Linda R. Petzold, William Yang Wang 2024-05-02 arXiv https://github.com/czyssrs/LLM_X_papers https://doi.org/10.48550/arXiv.2405.01769
977 Characterising the Creative Process in Humans and Large Language Models Surabhi S. Nath, Peter Dayan, Claire Stevenson 2024-05-01 arXiv https://github.com/surabhisnath/Creative_Process https://doi.org/10.48550/arXiv.2405.00899
978 Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation Dawei Gao, Haibin Wang, Yaliang Li, Xiuyu Sun, Yichen Qian, Bolin Ding, Jingren Zhou 2024-05 Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 5 https://github.com/BeachWang/DAIL-SQL https://dl.acm.org/doi/10.14778/3641204.3641221
979 CodeHalu: Code Hallucinations in LLMs Driven by Execution-based Verification Yuchen Tian, Weixiang Yan, Qian Yang, Qian Chen, Wen Wang, Ziyang Luo, Lei Ma 2024-04-30 arXiv https://github.com/yuchen814/CodeHalu http://arxiv.org/abs/2405.00253v2
980 CodeHalu: Investigating Code Hallucinations in LLMs via Execution-based Verification Yuchen Tian, Weixiang Yan, Qian Yang, Xuandong Zhao, Qian Chen, Wen Wang, Ziyang Luo, Lei Ma, Dawn Song 2024-04-30 arXiv https://github.com/yuchen814/CodeHalu http://arxiv.org/abs/2405.00253v3
981 Do Large Language Models Understand Conversational Implicature - A case study with a chinese sitcom Shisen Yue, Siyuan Song, Xinyuan Cheng, Hai Hu 2024-04-30 arXiv https://github.com/sjtu-compling/llm-pragmatics https://doi.org/10.48550/arXiv.2404.19509
982 Transcrib3D: 3D Referring Expression Resolution through Large Language Models Jiading Fang, Xiangshan Tan, Shengjie Lin, Igor Vasiljevic, Vitor Guizilini, Hongyuan Mei, Rares Ambrus, Gregory Shakhnarovich, Matthew R. Walter 2024-04-30 arXiv https://ripl.github.io/Transcrib3D https://doi.org/10.48550/arXiv.2404.19221
983 LLM-SR: Scientific Equation Discovery via Programming with Large Language Models Parshin Shojaee, Kazem Meidani, Shashank Gupta, Amir Barati Farimani, Chandan K. Reddy 2024-04-29 arXiv https://github.com/deep-symbolic-mathematics/LLM-SR https://doi.org/10.48550/arXiv.2404.18400
984 Benchmarking Benchmark Leakage in Large Language Models Ruijie Xu, Zengzhi Wang, Run-Ze Fan, Pengfei Liu 2024-04-29 arXiv https://gair-nlp.github.io/benbench https://doi.org/10.48550/arXiv.2404.18824
985 Hallucination of Multimodal Large Language Models: A Survey Zechen Bai, Pichao Wang, Tianjun Xiao, Tong He, Zongbo Han, Zheng Zhang, Mike Zheng Shou 2024-04-29 arXiv https://github.com/showlab/Awesome-MLLM-Hallucination https://doi.org/10.48550/arXiv.2404.18930
986 SOUL: Unlocking the Power of Second-Order Optimization for LLM Unlearning Jinghan Jia, Yihua Zhang, Yimeng Zhang, Jiancheng Liu, Bharat Runwal, James Diffenderfer, Bhavya Kailkhura, Sijia Liu 2024-04-28 arXiv https://github.com/OPTML-Group/SOUL http://arxiv.org/abs/2404.18239v4
987 SpecInfer: Accelerating Large Language Model Serving with Tree-based Speculative Inference and Verification Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Zeyu Wang, Zhengxin Zhang, Rae Ying Yee Wong, Alan Zhu, Lijie Yang, Xiaoxiang Shi, Chunan Shi, Zhuoming Chen, Daiyaan Arfeen, Reyna Abhyankar, Zhihao Jia 2024-04-27 ASPLOS '24: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3 https://github.com/flexflow/FlexFlow/ https://dl.acm.org/doi/10.1145/3620666.3651335
988 A Unified Debugging Approach via LLM-Based Multi-Agent Synergy Cheryl Lee, Chunqiu Steven Xia, Longji Yang, Jen-tse Huang, Zhouruixin Zhu, Lingming Zhang, Michael R. Lyu 2024-04-26 arXiv https://github.com/AcceptePapier/UniDebugger http://arxiv.org/abs/2404.17153v1
989 When to Trust LLMs: Aligning Confidence with Response Quality Shuchang Tao, Liuyi Yao, Hanxing Ding, Yuexiang Xie, Qi Cao, Fei Sun, Jinyang Gao, Huawei Shen, Bolin Ding 2024-04-26 arXiv https://github.com/TaoShuchang/CONQORD http://arxiv.org/abs/2404.17287v2
990 SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension Bohao Li, Yuying Ge, Yi Chen, Yixiao Ge, Ruimao Zhang, Ying Shan 2024-04-25 arXiv https://github.com/AILab-CVC/SEED-Bench https://doi.org/10.48550/arXiv.2404.16790
991 List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs An Yan, Zhengyuan Yang, Junda Wu, Wanrong Zhu, Jianwei Yang, Linjie Li, Kevin Lin, Jianfeng Wang, Julian McAuley, Jianfeng Gao, Lijuan Wang 2024-04-25 arXiv https://github.com/zzxslp/SoM-LLaVA http://arxiv.org/abs/2404.16375v1
992 Large Language Models in the Clinic: A Comprehensive Benchmark Fenglin Liu, Zheng Li, Hongjian Zhou, Qingyu Yin, Jingfeng Yang, Xianfeng Tang, Chen Luo, Ming Zeng, Haoming Jiang, Yifan Gao, Priyanka Nigam, Sreyashi Nag, Bing Yin, Yining Hua, Xuan Zhou, Omid Rohanian, Anshul Thakur, Lei Clifton, David A. Clifton 2024-04-25 arXiv https://github.com/AI-in-Health/ClinicBench http://arxiv.org/abs/2405.00716v3
993 Evaluating Class Membership Relations in Knowledge Graphs using Large Language Models Bradley P. Allen, Paul T. Groth 2024-04-25 arXiv https://github.com/bradleypallen/evaluating-kg-class-memberships-using-llms https://doi.org/10.48550/arXiv.2404.17000
994 Can't say cant? Measuring and Reasoning of Dark Jargons in Large Language Models Xu Ji, Jianyi Zhang, Ziyin Zhou, Zhangchi Zhao, Qianqian Qiao, Kaiying Han, Md Imran Hossen, Xiali Hei 2024-04-25 arXiv https://github.com/cistineup/CantCounter http://arxiv.org/abs/2405.00718v1
995 Continual Learning of Large Language Models: A Comprehensive Survey Haizhou Shi, Zihao Xu, Hengyi Wang, Weiyi Qin, Wenyuan Wang, Yibin Wang, Zifeng Wang, Sayna Ebrahimi, Hao Wang 2024-04-25 arXiv https://github.com/Wang-ML-Lab/llm-continual-learning-survey https://doi.org/10.48550/arXiv.2404.16789
996 Attacks on Third-Party APIs of Large Language Models Wanru Zhao, Vidit Khazanchi, Haodi Xing, Xuanli He, Qiongkai Xu, Nicholas Donald Lane 2024-04-24 arXiv https://github.com/vk0812/Third-Party-Attacks-on-LLMs https://doi.org/10.48550/arXiv.2404.16891
997 Chat2Scenario: Scenario Extraction From Dataset Through Utilization of Large Language Model Yongqi Zhao, Wenbo Xiao, Tomislav Mihalj, Jia Hu, Arno Eichberger 2024-04-24 arXiv https://github.com/ftgTUGraz/Chat2Scenario https://doi.org/10.48550/arXiv.2404.16147
998 ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction Henry Peng Zou, Vinay Samuel, Yue Zhou, Weizhi Zhang, Liancheng Fang, Zihe Song, Philip S. Yu, Cornelia Caragea 2024-04-24 arXiv https://github.com/HenryPengZou/ImplicitAVE http://arxiv.org/abs/2404.15592v1
999 Rethinking LLM Memorization through the Lens of Adversarial Compression Avi Schwarzschild, Zhili Feng, Pratyush Maini, Zachary C. Lipton, J. Zico Kolter 2024-04-23 arXiv https://locuslab.github.io/acr-memorization http://arxiv.org/abs/2404.15146v1
1000 LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models Mihir Parmar, Nisarg Patel, Neeraj Varshney, Mutsumi Nakamura, Man Luo, Santosh Mashetty, Arindam Mitra, Chitta Baral 2024-04-23 ACL https://github.com/Mihir3009/LogicBench https://aclanthology.org/2024.acl-long.739
1001 Think-Program-reCtify: 3D Situated Reasoning with Large Language Models Qingrong He, Kejun Lin, Shizhe Chen, Anwen Hu, Qin Jin 2024-04-23 arXiv https://qingrongh.github.io/LLM-TPC/ https://doi.org/10.48550/arXiv.2404.14705
1002 MARIO Eval: Evaluate Your Math LLM with your Math LLM--A mathematical dataset evaluation toolkit Boning Zhang, Chengxi Li, Kai Fan 2024-04-22 arXiv https://github.com/MARIO-Math-Reasoning/math_evaluation http://arxiv.org/abs/2404.13925v1