This is the repository for the TACL 2022 paper "Ultra-fine Entity Typing with Indirect Supervision from Natural Language Inference". It contains the source code and a link to a pre-trained model checkpoint.
The task of ultra-fine entity typing (UFET) seeks to predict diverse, free-form words or phrases that describe the appropriate types of entities mentioned in sentences. A key challenge for this task lies in the large number of types and the scarcity of annotated data per type. Existing systems formulate the task as a multi-way classification problem and train directly or distantly supervised classifiers. This causes two issues: (i) the classifiers do not capture type semantics, since types are often converted into indices; (ii) systems developed this way are limited to predicting within a pre-defined type set, and often fall short of generalizing to types that are rarely seen or unseen in training.
This work presents LITE :beers:, a new approach that formulates entity typing as a natural language inference (NLI) problem. It makes use of (i) indirect supervision from NLI, which represents type information meaningfully as textual hypotheses and alleviates the data scarcity issue, and (ii) a learning-to-rank objective that avoids pre-defining a fixed type set. Experiments show that, with limited training data, LITE obtains state-of-the-art performance on the UFET task. LITE also demonstrates strong generalizability: it not only yields the best results on other fine-grained entity typing benchmarks but, more importantly, a pre-trained LITE system also works well on new data containing unseen types.
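To illustrate the NLI formulation, the sketch below scores candidate types for a mention with an off-the-shelf MNLI model. This is a minimal illustration, not the authors' code: the `roberta-large-mnli` checkpoint and the "{mention} is a {type}." hypothesis template are assumptions for demonstration, whereas LITE trains its own NLI model with a learning-to-rank objective.

```python
# Minimal sketch of NLI-based type scoring (illustration only, not LITE itself).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "roberta-large-mnli"  # assumption: any MNLI-finetuned model works for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

def type_score(sentence: str, mention: str, type_label: str) -> float:
    """Score how well `type_label` fits `mention`: the premise is the sentence,
    the hypothesis verbalizes the candidate type."""
    hypothesis = f"{mention} is a {type_label}."  # assumed template
    inputs = tokenizer(sentence, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = logits.softmax(dim=-1)[0]
    # For roberta-large-mnli the label order is (contradiction, neutral, entailment).
    return probs[2].item()

sent = "He was elected mayor of the city in 1998."
for t in ["politician", "person", "musician"]:
    print(t, round(type_score(sent, "He", t), 3))
```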
- Python 3.7
- Hugging Face Transformers 4.6.1 (important: this exact version)
- PyTorch with CUDA support
- cudatoolkit 10.0
- CUDA 11.1
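A quick sanity check for the pinned versions above (a minimal helper script, not part of the repository):

```python
# Verify the environment matches the versions listed above.
import sys
import torch
import transformers

print("python      :", sys.version.split()[0])   # expect 3.7.x
print("transformers:", transformers.__version__)  # expect 4.6.1
print("torch       :", torch.__version__)
print("cuda ok     :", torch.cuda.is_available())
```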
The Ultra-fine Entity Typing (UFET) dataset is available at https://www.cs.utexas.edu/~eunsol/html_pages/open_entity.html.
- Download and unzip the UFET data, then move its `release/` directory into `data/`.
- Run `data/process_ultrafine.py` to pre-process the UFET data.
- Edit and run `lite.sh` to train the model.
- Edit and run `eval.sh` to rank the type vocabulary by confidence score for each example.
- Edit and run `result.sh` to find the best threshold on the dev set and generate typing results on the test set with that threshold (see the sketch after this list for what the search looks like).
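For reference, the dev-set threshold search performed by `result.sh` looks roughly like the sketch below. The input format here (gold type sets paired with per-type scores) is a hypothetical stand-in for the actual ranking files produced by `eval.sh`, and micro-averaged F1 is used only for simplicity; the actual script may use a different metric.

```python
# Sketch of picking a prediction threshold on the dev set.
def f1_at_threshold(examples, tau):
    """Micro-averaged F1 when every type scoring >= `tau` is predicted.
    `examples` is a list of (gold_types: set, scores: dict[type -> float])."""
    tp = pred = gold = 0
    for gold_types, scores in examples:
        predicted = {t for t, s in scores.items() if s >= tau}
        tp += len(predicted & gold_types)
        pred += len(predicted)
        gold += len(gold_types)
    p = tp / pred if pred else 0.0
    r = tp / gold if gold else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

def best_threshold(dev_examples, candidates):
    """Return the candidate threshold with the highest dev-set F1."""
    return max(candidates, key=lambda tau: f1_at_threshold(dev_examples, tau))
```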
A pre-trained LITE checkpoint is available at https://drive.google.com/file/d/1gICYx_UzHGcRNg3k-DPNx9w0JJKHg4AR/view?usp=sharing so that users can run inference on their own data. The output of our pre-trained LITE model is available at https://drive.google.com/file/d/1c02mOh_dozJq7afeUwGfXXn-cXcDV9vq/view?usp=sharing.
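A hedged inference sketch with the released checkpoint is below. It assumes the download unpacks into a standard Hugging Face model directory; the local path, the hypothesis template, and the threshold value are all placeholders to adapt to the actual checkpoint format.

```python
# Sketch: rank a type vocabulary for one mention with the downloaded checkpoint.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

ckpt_dir = "lite_checkpoint/"  # hypothetical path to the unzipped checkpoint
tokenizer = AutoTokenizer.from_pretrained(ckpt_dir)
model = AutoModelForSequenceClassification.from_pretrained(ckpt_dir)
model.eval()

def rank_types(sentence, mention, type_vocab):
    """Return (type, entailment score) pairs sorted by score, descending."""
    scored = []
    for t in type_vocab:
        inputs = tokenizer(sentence, f"{mention} is a {t}.",  # assumed template
                           return_tensors="pt", truncation=True)
        with torch.no_grad():
            probs = model(**inputs).logits.softmax(dim=-1)[0]
        scored.append((t, probs[-1].item()))  # assumes entailment is the last label
    return sorted(scored, key=lambda x: x[1], reverse=True)

ranking = rank_types("He was elected mayor of the city in 1998.",
                     "He", ["politician", "person", "musician"])
threshold = 0.5  # placeholder; use the dev-tuned value from result.sh
print([t for t, s in ranking if s >= threshold])
```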
An out-of-the-box Colab version is in progress.
BibTeX:
@article{li-etal-2022-lite,
  title     = {Ultra-fine Entity Typing with Indirect Supervision from Natural Language Inference},
  author    = {Li, Bangzheng and Yin, Wenpeng and Chen, Muhao},
  journal   = {Transactions of the Association for Computational Linguistics},
  volume    = {10},
  year      = {2022},
  publisher = {MIT Press}
}
This project is supported by the United States National Science Foundation under Grant IIS-2105329.