FM-Matcher

FM-Matcher demonstrates the use of Large Language Models for Schema Matching. Find more information in our publication. This tool uses the OpenAI Python SDK to communicate with the OpenAI API. Other models are currently not supported out-of-the-box.

Installation

We use, and thus recommend, poetry to install FM-Matcher:

poetry install

You may also choose to inspect the requirements.txt file to install the tool manually via pip.

Container usage

You can also run FM-Matcher in a container. We provide a Dockerfile in this repository, based on a Python slim image. You can build an image with podman, for example, like this:

podman build -t fm_matcher .

Configuration

Under Linux and in the containerized setting, you can use environment variables to configure FM-Matcher. Other OSes are not tested, but you can change the default configuration in utils/config.py if needed.

OPENAI_API_KEY: REQUIRED. The OpenAI API key to use. There is no default; you have to create an OpenAI API key yourself.
QUERY_OPENAI: Set this to False to generate a random result instead of prompting the LLM. Useful for testing and developing. Default: True
OPENAI_MODEL: The OpenAI model that is used. Default: gpt-4o-mini-2024-07-18
OPENAI_N: The number of answers that is requested from a model per prompt. Default: 3
OPENAI_TEMPERATURE: The model's temperature setting. Default: 1.0
OPENAI_TIMEOUT: Timeout of the OpenAI API calls. The API is queried with retries (via tenacity), but we still recommend testing before setting this significantly lower. Default: 60
TEMPLATE_DIR: Directory where the prompt templates are stored. The templates are filled with the schema information from FM-Matcher and sent to OpenAI. Default: resources/prompt_templates
PARALLEL_OPENAI_REQUESTS: Maximum number of parallel requests that will be sent asynchronously to OpenAI. Lower this to fix RateLimitErrors. Default: 5
SQLITE_PATH: Path to an SQLite database file, used for caching results. You may set this to "" to disable. Default: dev.sqlite3
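
To illustrate how these variables and their defaults fit together, here is a minimal sketch of reading them from the environment in Python. It is not the code in utils/config.py: the Settings class, the load_settings function, and the rule for parsing boolean values are assumptions made for this example, and the actual implementation may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented configuration values."""
    openai_api_key: str
    query_openai: bool
    openai_model: str
    openai_n: int
    openai_temperature: float
    openai_timeout: float
    template_dir: str
    parallel_openai_requests: int
    sqlite_path: str


def _as_bool(raw: str) -> bool:
    # Assumption: common truthy spellings count as True, anything else as False.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (os.environ by default), using the README defaults."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        # The README marks this variable as required and without default.
        raise ValueError("OPENAI_API_KEY must be set")
    return Settings(
        openai_api_key=api_key,
        query_openai=_as_bool(env.get("QUERY_OPENAI", "True")),
        openai_model=env.get("OPENAI_MODEL", "gpt-4o-mini-2024-07-18"),
        openai_n=int(env.get("OPENAI_N", "3")),
        openai_temperature=float(env.get("OPENAI_TEMPERATURE", "1.0")),
        openai_timeout=float(env.get("OPENAI_TIMEOUT", "60")),
        template_dir=env.get("TEMPLATE_DIR", "resources/prompt_templates"),
        parallel_openai_requests=int(env.get("PARALLEL_OPENAI_REQUESTS", "5")),
        # An empty string is kept as-is; the README uses it to disable caching.
        sqlite_path=env.get("SQLITE_PATH", "dev.sqlite3"),
    )
```

Passing a plain dictionary instead of os.environ, for example load_settings({"OPENAI_API_KEY": "k", "QUERY_OPENAI": "False"}), makes it easy to try out a configuration without touching the shell environment.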

Running

Run FM-Matcher as you would run any Streamlit application:

poetry run streamlit run main.py

Container usage

Assuming you have built the container as shown above, you can start a container like this:

podman run -d --name fm_matcher -e OPENAI_API_KEY="mySuperSecretApiKey" -p 8501:8501 fm_matcher
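
Because the container is started detached (-d), it can be handy to check that the published port is accepting connections before opening the browser. The following is a small sketch using only the Python standard library. The helper name wait_for_port and the 30-second default are assumptions for this example. It only checks that a TCP connection to the port succeeds, not that the Streamlit app itself is healthy.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Try to open (and immediately close) a connection to the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Nothing is listening yet; wait briefly and retry.
            time.sleep(0.5)
    return False


# Example call for the port mapping used above (-p 8501:8501):
# wait_for_port("localhost", 8501)
```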
