CP-KGC

The paper is available at: Can Text-based Knowledge Graph Completion Benefit From Zero-Shot Large Language Models?

This paper makes three contributions: (1) We validate the effectiveness of CP-KGC on three publicly available datasets. CP-KGC enhances the semantic integrity of text and further improves the performance of text-based approaches. (2) By testing LLMs at several parameter scales, we show that a quantized model (Qwen-7B-Chat-int4) is effective, which makes further research on integrating KGC and LLMs more feasible. (3) CP-KGC's contextual constraint strategy accurately identifies polysemous entities in the datasets and also makes text generation by LLMs more stable.

CP-KGC semantic enhancement framework.

Requirements

The models mentioned in the paper (SimKGC, KG-S2S, and KG-BERT) publish their code with their original papers, along with instructions on how to run them. You can also enhance these public datasets, or your own domain-specific datasets, in whatever way suits your needs.

python>=3.8
torch>=1.8 (for mixed precision training)
transformers>=4.15
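The version floors above can be checked before launching a run. The following is a minimal sketch, not part of the repository: the helper names `parse_version` and `check_requirements` are hypothetical, and the parser only handles plain numeric versions such as `2.1.0+cu118`.

```python
import sys

# Minimum versions copied from the requirements list above.
MINIMUMS = {"torch": (1, 8), "transformers": (4, 15)}


def parse_version(text):
    """Turn '2.1.0+cu118' into (2, 1, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_requirements():
    """Return a list of human-readable problems; an empty list means all checks passed."""
    if sys.version_info < (3, 8):
        return ["python>=3.8 is required"]
    # importlib.metadata is in the standard library from Python 3.8 onwards.
    from importlib import metadata

    problems = []
    for name, minimum in MINIMUMS.items():
        try:
            installed = parse_version(metadata.version(name))
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if installed < minimum:
            wanted = ".".join(str(n) for n in minimum)
            problems.append(f"{name}>={wanted} is required")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

An empty result only confirms the versions listed here; the base model you run may have additional requirements of its own.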

All experiments were run on a single A800 (80GB) GPU.

Reproducing the experiments requires about 78GB of GPU memory (SimKGC: max length=50, batch size=1024).

CP-KGC uses SimKGC as the base model in the paper.

pip install transformers

Step 1: Replace the original datasets with the datasets we provide.

Step 2: Adjust the max tokens value, for example, set it to 30 for Table 2, 130 for Table 3, and 20, 30, 40, and 50 for Figure 2. Please keep other parameters unchanged.

Note: With max length = 130 and batch size = 1024, SimKGC needs about 240GB of GPU memory. To run the UMLS dataset on SimKGC, set max length = 30, batch size = 256, and epochs = 20.
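The settings from Step 2 and the UMLS note can be kept in one place so that each run is reproducible. This is only a sketch: the names `MAX_TOKENS`, `UMLS_SETTINGS`, and `planned_runs` are hypothetical, and how each value is passed to the base model depends on that model's own command-line options, which are not covered here.

```python
# Max tokens values per experiment, as listed in Step 2.
MAX_TOKENS = {
    "Table 2": [30],
    "Table 3": [130],
    "Figure 2": [20, 30, 40, 50],
}

# Settings stated above for running the UMLS dataset on SimKGC.
UMLS_SETTINGS = {"max_length": 30, "batch_size": 256, "epochs": 20}


def planned_runs(experiment):
    """Return one settings dict per run needed for the given experiment."""
    return [
        {"experiment": experiment, "max_tokens": value}
        for value in MAX_TOKENS[experiment]
    ]


if __name__ == "__main__":
    for name in MAX_TOKENS:
        for run in planned_runs(name):
            print(run)
```

All other parameters stay at the base model's defaults, as Step 2 requires.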

If you don't have enough computing resources, you can use the Qwen-7B-Chat and LLaMA2-7B/13B-Chat inference tests here. ModelScope provides sufficient computing resources for inference testing of the 13B model. You can also use quantized models for inference.
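For local inference with a quantized model, a sketch along the following lines may help. Several things here are assumptions rather than the repository's code: the prompt wording in `build_constrained_prompt` is a hypothetical illustration and not the template used in the paper, the model identifier `Qwen/Qwen-7B-Chat-Int4` refers to the Hugging Face Hub checkpoint, and the `chat` method comes from Qwen's own remote code (hence `trust_remote_code=True`). The Int4 checkpoint also needs extra quantization packages that are not listed in the requirements above.

```python
def build_constrained_prompt(entity, context, max_words=30):
    """Hypothetical prompt: constrain the description to the sense given by the context."""
    return (
        f"Entity: {entity}\n"
        f"Known context: {context}\n"
        "Using only the sense of the entity implied by the context, "
        f"describe it in at most {max_words} words."
    )


def generate_description(entity, context, model_name="Qwen/Qwen-7B-Chat-Int4"):
    """Load the chat model and return its answer to the constrained prompt."""
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, device_map="auto", trust_remote_code=True
    ).eval()
    # `chat` is defined by Qwen's remote code, not by transformers itself.
    response, _history = model.chat(
        tokenizer, build_constrained_prompt(entity, context), history=None
    )
    return response


if __name__ == "__main__":
    # Building the prompt needs no GPU; call generate_description to query the model.
    print(build_constrained_prompt("bank", "land alongside a river"))
```

Keeping prompt construction separate from model loading makes it easy to inspect prompts for polysemous entities before spending GPU time on generation.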

Citation

If you find our paper or code repository helpful, please consider citing as follows:

@article{yang2023cp,
  title={CP-KGC: constrained-prompt knowledge graph completion with large language models},
  author={Yang, Rui and Fang, Li and Zhou, Yi},
  journal={arXiv preprint arXiv:2310.08279},
  year={2023}
}
The paper is now published online in Knowledge-Based Systems (KBS):
@article{yang2024enhancing,
  title={Enhancing text-based knowledge graph completion with zero-shot large language models: A focus on semantic enhancement},
  author={Yang, Rui and Zhu, Jiahao and Man, Jianping and Fang, Li and Zhou, Yi},
  journal={Knowledge-Based Systems},
  pages={112155},
  year={2024},
  publisher={Elsevier}
}
