Commit
Fix minor issues for CRAN submission; RAG added to the readme
atomashevic committed Aug 20, 2024
1 parent 41bb99f commit 11ad5be
Showing 4 changed files with 30 additions and 5 deletions.
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -6,9 +6,9 @@ Authors@R: c(person("Alexander", "Christensen", email = "alexpaulchristensen@gma
role = "aut", comment = c(ORCID = "0000-0002-9798-7037")),
person("Hudson", "Golino", email = "[email protected]", role = "aut",
comment = c(ORCID = "0000-0002-1601-1447")),
- person("Aleksandar", "Tomasevic", email = "[email protected]", role = c("aut", "cre"),
+ person("Aleksandar", "Tomašević", email = "[email protected]", role = c("aut", "cre"),
comment = c(ORCID = "0000-0003-4863-6051")))
- Maintainer: Aleksandar Tomasevic <[email protected]>
+ Maintainer: Aleksandar Tomašević <[email protected]>
Description: Implements sentiment analysis using huggingface <https://huggingface.co> transformer zero-shot classification model pipelines for text and image data. The default text pipeline is Cross-Encoder's DistilRoBERTa <https://huggingface.co/cross-encoder/nli-distilroberta-base> and default image/video pipeline is Open AI's CLIP <https://huggingface.co/openai/clip-vit-base-patch32>. All other zero-shot classification model pipelines can be implemented using their model name from <https://huggingface.co/models?pipeline_tag=zero-shot-classification>.
License: GPL (>= 3.0)
Encoding: UTF-8
2 changes: 1 addition & 1 deletion R/transformer_scores.R
@@ -33,7 +33,7 @@
#' \href{https://huggingface.co/datasets/multi_nli}{MultiNLI} datasets. The DistilRoBERTa
#' is intended to be a smaller, more lightweight version of \code{"cross-encoder-roberta"},
#' that sacrifices some accuracy for much faster speed (see
- #' \href{https://www.sbert.net/docs/pretrained_cross-encoders.html#nli}{https://www.sbert.net/docs/pretrained_cross-encoders.html#nli})}
+ #' \href{https://www.sbert.net/docs/cross_encoder/pretrained_models.html#nli}{https://www.sbert.net/docs/cross_encoder/pretrained_models.html#nli})}
#'
#' \item{\code{"facebook-bart"}}{Uses \href{https://huggingface.co/facebook/bart-large-mnli}{Facebook's BART Large}
#' zero-shot classification model trained on the
27 changes: 26 additions & 1 deletion README.md
@@ -1,4 +1,4 @@
- ### CRAN 0.1.4 | GitHub 0.1.5
+ ### CRAN 0.1.5 | GitHub 0.1.5

[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active) [![R-CMD-check](https://github.com/atomashevic/transforEmotion/actions/workflows/r.yml/badge.svg)](https://github.com/atomashevic/transforEmotion/actions/workflows/r.yml) [![Downloads Total](https://cranlogs.r-pkg.org/badges/grand-total/transforEmotion?color=brightgreen)](https://cran.r-project.org/package=transforEmotion)

@@ -120,6 +120,31 @@ transformer_scores(
)
```

## RAG

The `rag` function generates text using Retrieval-Augmented Generation (RAG). You can pass text data directly or specify a path to local PDF files; the function retrieves relevant passages from that input and uses them when generating a response.

The `rag` function supports several large language models (LLMs), including TinyLLAMA, LLAMA-2, Mistral-7B, Orca-2, and Phi-2, each with a different trade-off between computational efficiency and output quality. The default model is TinyLLAMA, the fastest of these.

Here's an example based on the description of this package. First, we specify the text data.

```R
text <- "With `transforEmotion` you can use cutting-edge transformer models for zero-shot emotion classification of text, image, and video in R, *all without the need for a GPU, subscriptions, paid services, or using Python. Implements sentiment analysis using [huggingface](https://huggingface.co/) transformer zero-shot classification model pipelines. The default pipeline for text is [Cross-Encoder's DistilRoBERTa](https://huggingface.co/cross-encoder/nli-distilroberta-base) trained on the [Stanford Natural Language Inference](https://huggingface.co/datasets/snli) (SNLI) and [Multi-Genre Natural Language Inference](https://huggingface.co/datasets/multi_nli) (MultiNLI) datasets. Using similar models, zero-shot classification transformers have demonstrated superior performance relative to other natural language processing models (Yin, Hay, & Roth, [2019](https://arxiv.org/abs/1909.00161)). All other zero-shot classification model pipelines can be implemented using their model name from https://huggingface.co/models?pipeline_tag=zero-shot-classification."
```

And then we run the `rag` function.

```R
rag(text, query = "What is the use case for transforEmotion package?")
```

This code produces output similar to the following.

```
The use case for transforEmotion package is to use cutting-edge transformer models for zero-shot emotion classification of text, image, and video in R, without the need for a GPU, subscriptions, paid services, or using Python. This package implements sentiment analysis using the Cross-Encoder's DistilRoBERTa model trained on the Stanford Natural Language Inference (SNLI) and MultiNLI datasets. Using similar models, zero-shot classification transformers have demonstrated superior performance relative to other natural language processing models (Yin, Hay, & Roth, [2019](https://arxiv.org/abs/1909.00161)). The transforEmotion package can be used to implement these models and other zero-shot classification model pipelines from the HuggingFace library.
```
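
To make the retrieval step in RAG more concrete, here is a minimal base-R sketch that ranks sentences by how many words they share with a query. It is a toy illustration only and does not reflect how `rag` retrieves passages internally; the helper names `tokenize` and `retrieve`, the `top_k` argument, and the sample sentences are all hypothetical and not part of the package.

```R
# Toy retrieval step: rank sentences by how many query words they contain.
# Illustration only; this is NOT how `rag` works internally.

# Lowercase a string and split it into unique word tokens.
tokenize <- function(x) {
  words <- strsplit(tolower(x), "[^a-z0-9]+")[[1]]
  unique(words[nzchar(words)])
}

# Return the `top_k` sentences sharing the most words with `query`.
retrieve <- function(sentences, query, top_k = 1) {
  query_words <- tokenize(query)
  overlap <- vapply(
    sentences,
    function(s) length(intersect(tokenize(s), query_words)),
    integer(1),
    USE.NAMES = FALSE
  )
  ranked <- order(overlap, decreasing = TRUE)
  sentences[head(ranked, top_k)]
}

# Hypothetical sample sentences, loosely based on the package description.
docs <- c(
  "The default text pipeline is DistilRoBERTa.",
  "CLIP is the default pipeline for image and video data.",
  "The package runs in R without a GPU."
)

retrieve(docs, query = "Which pipeline handles image data?")
```

With these three sentences, the call returns the CLIP sentence, because it shares the most words with the query. A real RAG pipeline would typically rank passages by embedding similarity rather than word overlap and would then pass the retrieved passages to the LLM as context.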

## Image Example

For the Facial Expression Recognition (FER) task on images, we use OpenAI's [CLIP](https://huggingface.co/openai/clip-vit-base-patch32) transformer model. Two input arguments are needed: the path to the image and a list of emotion labels.
2 changes: 1 addition & 1 deletion man/transformer_scores.Rd

Some generated files are not rendered by default.