LogBench

LogBench is a benchmark for evaluating logging statement generation.

Logging statements are imperative in modern software. They play an important role in reflecting developers' intentions, recording system behavior, and guiding failure diagnosis. LogBench provides a benchmark and toolkit that let you measure your own models and conveniently compare them with existing baseline models.

If you find our work beneficial to your research, please cite the following paper:

Study overview

(Figure: study overview; see img/overview.png)

The study is fully described in this paper. LogBench comprises two subsets for evaluating a model's effectiveness and generalizability, respectively:

  1. Effectiveness: LogBench-O contains a collection of high-quality logging statements and their associated code contexts.
  2. Generalizability: LogBench-T is an unseen-code dataset obtained by applying semantically-equivalent code transformations to LogBench-O (see the illustrative sketch below).
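
The exact transformation rules are described in the paper; purely as an illustration of the idea (not a reproduction of the actual transformers, and with hypothetical class and method names), a semantically-equivalent rewrite might change a loop's form while leaving the behavior and the logging statement untouched:

```java
import java.util.List;
import java.util.logging.Logger;

class FlushExample {
    private static final Logger logger = Logger.getLogger(FlushExample.class.getName());

    // Original form (LogBench-O style context; illustrative only).
    void flushOriginal(List<String> buffer) {
        for (int i = 0; i < buffer.size(); i++) {
            process(buffer.get(i));
        }
        logger.fine("Flushed " + buffer.size() + " events");
    }

    // Semantically-equivalent rewrite (LogBench-T style; illustrative only):
    // the loop form changes, but the behavior and the logging statement do not.
    void flushTransformed(List<String> buffer) {
        int index = 0;
        while (index < buffer.size()) {
            process(buffer.get(index));
            index++;
        }
        logger.fine("Flushed " + buffer.size() + " events");
    }

    private void process(String event) {
        // stand-in for real per-event work
    }
}
```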

Additionally, LogBench offers several variants to support different settings in logging statement generation (an illustrative sample follows the list):

  • Method-level
  • File-level
  • Comment-included
  • Comment-free
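
For illustration only (the released archives define the actual sample format; the class and method below are hypothetical), a method-level sample in the comment-included setting keeps developer comments that the comment-free setting strips, while the logging statement a model is asked to produce stays the same:

```java
import java.io.IOException;
import java.net.Socket;
import java.util.logging.Logger;

class ConnectExample {
    private static final Logger logger = Logger.getLogger(ConnectExample.class.getName());
    private Socket socket;

    // Method-level, comment-included variant: the surrounding comments are part
    // of the context; the comment-free variant provides the same method without
    // them. The logging statement below is the generation target.
    void connect(String host, int port) throws IOException {
        // open a socket to the remote service and remember the endpoint
        socket = new Socket(host, port);
        String endpoint = host + ":" + port;
        logger.info("Connected to " + endpoint);   // <-- generation target
    }
}
```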

Repository organization

We currently provide part of the code in the /src folder and will release the full source code once the paper is accepted.

  • LogBench-O: The /LogBench-O folder contains the files for LogBench-O.
  • LogBench-T: The /LogBench-T folder contains the files for LogBench-T.
  • Cases: Please refer to the cases folder for the generated cases (generated_cases.csv); a small sketch for inspecting it follows the directory tree below.

├── LICENSE
├── LogBench-O
│   ├── LogBench-O_prefix_1point.zip
│   ├── LogBench-O_prefix_1point_file_level.zip
│   └── LogBench-O_prefix_1point_wo_comments.zip
├── LogBench-T
│   ├── LogBench-T_prefix_1point.zip
│   └── LogBench-T_prefix_1point_file_level.zip
├── README.md
├── build
│   └── code-transformer.jar
├── cases
│   └── generated_cases.csv
├── img
│   ├── overview.pdf
│   └── overview.png
└── src
    ├── Baselines
    │   ├── DeepLV
    │   ├── WhichVar
    │   ├── LogenText-Plus
    │   ├── StarCoder
    │   ├── Lance
    │   ├── InCoder
    │   └── ...
    ├── CodeTransformer
    │   └── README.md
    └── DataCollector
        └── ...
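
The column layout of cases/generated_cases.csv is not documented in this README, so the snippet below is only a hedged sketch (with a hypothetical class name) for peeking at the file once you have it locally; adapt the parsing to the actual header.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class PeekCases {
    public static void main(String[] args) throws IOException {
        // Print the header and a handful of rows from the generated-cases CSV.
        // Note: the naive comma split does not handle quoted fields that
        // contain commas; use a real CSV parser for any serious analysis.
        try (BufferedReader reader = Files.newBufferedReader(Paths.get("cases/generated_cases.csv"))) {
            String line;
            int shown = 0;
            while ((line = reader.readLine()) != null && shown < 5) {
                String[] fields = line.split(",", -1);
                if (shown == 0) {
                    System.out.println("header: " + line);
                } else {
                    System.out.println("row " + shown + ": " + fields.length + " fields");
                }
                shown++;
            }
        }
    }
}
```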

Study subjects

| Baseline | Access | Paper reference |
| --- | --- | --- |
| **LLMs (11)** | | |
| Davinci | API | Project |
| ChatGPT | API | Project |
| LANCE | Model | [ICSE'22] Using deep learning to generate complete log statements |
| InCoder | Model | [ICLR'23] InCoder: A Generative Model for Code Infilling and Synthesis |
| Llama2 | Model | Llama 2: Open Foundation and Fine-Tuned Chat Models |
| StarCoder | Model | StarCoder: may the source be with you! |
| CodeLlama | Model | Code Llama: Open Foundation Models for Code |
| CodeGeex | Plugin | CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Evaluations on HumanEval-X |
| TabNine | Plugin | - |
| Copilot | Plugin | - |
| Code Whisperer | Plugin | - |
| **Non-LLMs** | | |
| DeepLV | Model | [ICSE'21] DeepLV: Suggesting Log Levels Using Ordinal Based Neural Networks |
| WhichVar | Model | [TSE'21] Which Variables Should I Log? |
| LoGenText-Plus | Model | [TOSEM'23] LoGenText-Plus: Improving Neural Machine Translation Based Logging Texts Generation with Syntactic Templates |

If you use the code of any baseline, please make sure to cite the corresponding paper.

Download the original crawled logging dataset

To support further logging-related research, and since GitHub does not host large datasets, the full-size collected logging dataset can be downloaded here (zip: 252 MB; unzipped: 786 MB).

Code transformation tool

The /build folder contains the built transformation tool, which conducts the code transformation automatically with its eight code transformers.

  • To conduct the code transformation in batch:
java -jar code-transformer.jar -f ./javafiles/
